Linear Matrix Inequalities in Control · 2006-06-06


Linear Matrix Inequalities in Control

Carsten Scherer and Siep Weiland

Delft Center for Systems and ControlDelft University of TechnologyThe Netherlands

Department of Electrical EngineeringEindhoven University of TechnologyThe Netherlands


Contents

Preface

1 Convex optimization and linear matrix inequalities
1.1 Introduction
1.2 Facts from convex analysis
1.3 Convex optimization
1.3.1 Local and global minima
1.3.2 Uniform bounds
1.3.3 Duality and convex programs
1.3.4 Subgradients
1.4 Linear matrix inequalities
1.4.1 What are they?
1.4.2 Why are they interesting?
1.4.3 What are they good for?
1.4.4 How are they solved?
1.4.5 What are dual LMI problems?
1.4.6 When were they invented?
1.5 Further reading
1.6 Exercises

2 Dissipative dynamical systems
2.1 Introduction
2.2 Dissipative dynamical systems
2.2.1 Definitions and examples
2.2.2 A classification of storage functions
2.2.3 Dissipation functions
2.3 Linear dissipative systems with quadratic supply rates
2.3.1 Main results
2.3.2 Dissipation functions and spectral factors
2.3.3 The Kalman-Yakubovich-Popov lemma
2.3.4 The positive real lemma
2.3.5 The bounded real lemma
2.4 Interconnected dissipative systems
2.5 Further reading
2.6 Exercises

3 Stability and nominal performance
3.1 Lyapunov stability
3.2 Generalized stability regions for LTI systems
3.3 The generalized plant concept
3.4 Nominal performance and LMI’s
3.4.1 Quadratic nominal performance
3.4.2 H∞ nominal performance
3.4.3 H2 nominal performance

Compilation: November 16, 2004

3.4.4 Generalized H2 nominal performance
3.4.5 L1 or peak-to-peak nominal performance
3.5 Further reading
3.6 Exercises

4 Controller synthesis
4.1 The setup
4.2 From analysis to synthesis – a general procedure
4.3 Other performance specifications
4.3.1 H∞ design
4.3.2 Positive real design
4.3.3 H2-problems
4.3.4 Upper bound on peak-to-peak norm
4.4 Multi-objective and mixed controller design
4.5 Elimination of parameters
4.5.1 Dualization
4.5.2 Special linear matrix inequalities
4.5.3 The quadratic performance problem
4.5.4 H2-problems
4.6 State-feedback problems
4.7 Discrete-time systems
4.8 Exercises

5 Systems with parametric uncertainty
5.1 Parametric uncertainties
5.1.1 Affine parameter dependent systems
5.1.2 Polytopic parameter dependent systems
5.2 Robust stability for autonomous systems
5.2.1 Quadratic stability
5.2.2 Quadratic stability of affine models
5.2.3 Quadratic stability of polytopic models
5.3 Parameter dependent Lyapunov functions
5.3.1 Time-invariant uncertainties
5.3.2 Time varying uncertainties
5.4 Exercises

List of Figures

1.1 Joseph-Louis Lagrange
1.2 Aleksandr Mikhailovich Lyapunov
2.1 System interconnections
2.2 Model for suspension system
2.3 Feedback configuration
3.1 A system interconnection
3.2 General closed-loop interconnection
3.3 Binary distillation column
4.1 Multi-channel closed-loop system
4.2 Active suspension system

Preface

In recent years linear matrix inequalities (LMI’s) have emerged as a powerful tool to approach control problems that appear hard, if not impossible, to solve in an analytic fashion. Although the history of linear matrix inequalities goes back to the forties, with a major emphasis on their role in control in the sixties through the work of Kalman, Yakubovich, Popov and Willems, only during the last decades have powerful numerical interior-point techniques been developed to solve LMI’s in a practically efficient manner (Nesterov, Nemirovskii 1994). Currently, several commercial and non-commercial software packages are available that allow a simple coding of general LMI problems and provide efficient tools to solve typical control problems in a numerically efficient manner.

Boosted by the availability of fast LMI solvers, research in robust control has experienced a significant paradigm shift. Instead of arriving at an analytical solution of an optimal control problem and implementing such a solution in software so as to synthesize optimal controllers, the intention is now to reformulate a given control problem as the question of whether a specific linear matrix inequality is solvable or, alternatively, as the optimization of a functional over linear matrix inequality constraints. This book aims at providing a state-of-the-art treatment of the theory, the usage and the applications of linear matrix inequalities in control. The main emphasis of this book is to reveal the basic principles and background for formulating desired properties of a control system in the form of linear matrix inequalities, and to demonstrate the techniques to reduce the corresponding controller synthesis problem to an LMI problem. The power of this approach is illustrated by several fundamental robustness and performance problems in analysis and design of linear control systems.

This book has been written for a graduate course on the subject of LMI’s in systems and control. Within the graduate program of the Dutch Institute of Systems and Control (DISC), this course is intended to provide up-to-date information on the topic for students involved in either the practical or theoretical aspects of control system design. DISC courses have the format of two class hours once per week during a period of eight weeks. Within DISC, the first course on LMI’s in control was given in 1997, when the first draft of this book was distributed as lecture notes to the students. The lecture notes have since evolved into the present book, thanks to the many suggestions, criticism and help of the many students who followed this course.

Readers of this book are assumed to have an academic background in linear algebra, basic calculus, and possibly in systems and control theory.


Chapter 1

Convex optimization and linear matrix inequalities

1.1 Introduction

Optimization questions and decision making processes are abundant in daily life and invariably involve the selection of the best decision from a number of options or a set of candidate decisions. Many examples of this theme can be found in technical sciences such as electrical, mechanical and chemical engineering, in architecture and in economics, but also in the social sciences, in biological and ecological processes and organizational questions. For example, production processes in industry are becoming more and more market driven and require an ever increasing flexibility of product changes and product specifications due to customer demands in quality, price and specification. Products need to be manufactured within strict product specifications, with large variations of input quality, against competitive prices, with minimal waste of resources, energy and valuable production time, with a minimal time-to-market and, of course, with maximal economical profit. Important economical benefits can therefore only be realized by making proper decisions in the operating conditions of production processes. Due to increasing requirements on the safety and flexibility of production processes, there is a constant need for further optimization, for increased efficiency and a better control of processes.

Casting an optimization problem in mathematics involves the specification of the candidate decisions and, most importantly, the formalization of the concept of best or optimal decision. Let the universum of all possible decisions in an optimization problem be denoted by a set X, and suppose that the set of feasible (or candidate) decisions is a subset S of X. Then one approach to quantify the performance of a feasible decision x ∈ S is to express its value in terms of a single real quantity f(x), where f is some real valued function f : S → R called the objective function or cost function. The value of decision x ∈ S is then given by f(x). Depending on the interpretation of the objective function, we may wish to minimize or maximize f over all feasible candidates in S. An optimal decision is then simply an element of S that minimizes or maximizes f over all feasible alternatives.

The optimization problem to minimize the objective function f over a set of feasible decisions S involves various specific questions:

(a) What is the least possible cost? That is, determine the optimal value

Vopt := inf_{x∈S} f(x) = inf{ f(x) | x ∈ S }.

By convention, the optimal value Vopt = +∞ if S is empty, while the problem is said to be unbounded if Vopt = −∞.

(b) How to determine an almost optimal solution, i.e., for arbitrary ε > 0, how to determine xε ∈ S such that

Vopt ≤ f(xε) ≤ Vopt + ε.

(c) Does there exist an optimal solution xopt ∈ S with f(xopt) = Vopt? If so, we say that the minimum is attained and we write f(xopt) = min_{x∈S} f(x). The set of all optimal solutions is denoted by argmin_{x∈S} f(x).

(d) If an optimal solution xopt exists, is it also unique?

We will address each of these questions in the sequel.
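Questions (a) and (b) can be made concrete numerically. The sketch below is not from the text — the objective, the interval S and the tolerance are illustrative assumptions — but it shows how an almost optimal solution xε in the sense of question (b) can be computed for a convex function on an interval by ternary search:

```python
def ternary_search_min(f, lo, hi, eps):
    """Return a point x_eps whose value is close to V_opt for a convex f
    on [lo, hi] (question (b) above).

    Ternary search: for a convex function, the third of the bracket lying
    beyond the larger of the two probe values cannot contain a minimizer,
    so the bracket shrinks by a factor 2/3 per step.
    """
    while hi - lo > eps:
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if f(m1) <= f(m2):
            hi = m2          # no minimizer in (m2, hi]
        else:
            lo = m1          # no minimizer in [lo, m1)
    return 0.5 * (lo + hi)

# Illustrative convex objective on S = [0, 3]: V_opt = 1, attained at x = 1.
f = lambda x: (x - 1.0) ** 2 + 1.0
x_eps = ternary_search_min(f, 0.0, 3.0, 1e-6)
```

Since the final bracket has width below 1e-6 and contains the minimizer, the returned point satisfies Vopt ≤ f(xε) ≤ Vopt + ε for ε = 1e-6.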

1.2 Facts from convex analysis

In view of the optimization problems just formulated, we are interested in finding conditions for optimal solutions to exist. It is therefore natural to resort to a branch of analysis which provides such conditions: convex analysis. The results and definitions in this subsection are mainly basic, but they have very important implications and applications as we will see later.

We start with summarizing some definitions and elementary properties from linear algebra and functional analysis. We assume the reader to be familiar with the basic concepts of vector spaces, norms and normed linear spaces.

Suppose that X and Y are two normed linear spaces. A function f which maps X to Y is said to be continuous at x0 ∈ X if, for every ε > 0, there exists a δ = δ(x0, ε) such that

‖f(x) − f(x0)‖ < ε whenever ‖x − x0‖ < δ. (1.2.1)

The function f is called continuous if it is continuous at all x0 ∈ X. Finally, f is said to be uniformly continuous if, for every ε > 0, there exists δ = δ(ε), not depending on x0, such that (1.2.1) holds. Obviously, continuity depends on the definition of the norm in the normed spaces X and Y. We remark that a function f : X → Y is continuous at x0 ∈ X if and only if for every sequence {xn}_{n=1}^∞, xn ∈ X, which converges to x0 as n → ∞, there holds that f(xn) → f(x0).

Now let S be a subset of the normed linear space X. Then S is called compact if for every sequence {xn}_{n=1}^∞ in S there exists a subsequence {x_{n_m}}_{m=1}^∞ which converges to an element x0 ∈ S. Compact sets in finite dimensional vector spaces are easily characterized. Indeed, if X is finite dimensional then a subset S of X is compact if and only if S is closed and bounded¹.

The well-known Weierstrass theorem provides a useful tool to determine whether an optimization problem admits a solution. It provides an answer to the third question raised in the previous subsection for special sets S and special performance functions f.

Proposition 1.1 (Weierstrass) If f : S → R is a continuous function defined on a compact subset S of a normed linear space X, then there exist xmin, xmax ∈ S such that

f(xmin) = inf_{x∈S} f(x) ≤ f(x) ≤ sup_{x∈S} f(x) = f(xmax)

for all x ∈ S.

Proof. Define Vmin := inf_{x∈S} f(x). Then there exists a sequence {xn}_{n=1}^∞ in S such that f(xn) → Vmin as n → ∞. As S is compact, there must exist a subsequence {x_{n_m}}_{m=1}^∞ of {xn} which converges to an element, say xmin, which lies in S. Obviously, f(x_{n_m}) → Vmin and the continuity of f implies that f(x_{n_m}) → f(xmin) as n_m → ∞. We claim that Vmin = f(xmin). By definition of Vmin, we have Vmin ≤ f(xmin). Now suppose that the latter inequality is strict, i.e., suppose that Vmin < f(xmin). Then 0 < f(xmin) − Vmin = lim_{n_m→∞} f(x_{n_m}) − lim_{n_m→∞} f(x_{n_m}) = 0, which yields a contradiction. The proof of the existence of a maximizing element is similar.

Following his father’s wishes, Karl Theodor Wilhelm Weierstrass (1815-1897) studied law, finance and economics at the university of Bonn. His primary interest, however, was in mathematics, which led to a serious conflict with his father. He started his career as a teacher of mathematics. After various positions and invitations, he accepted a chair at the ‘Industry Institute’ in Berlin in 1855. Weierstrass contributed to the foundations of analytic functions, elliptic functions, Abelian functions, converging infinite products, and the calculus of variations. Hurwitz and Frobenius were among his students.

Note that Proposition 1.1 does not give a constructive method to find the extremal solutions xmin and xmax. It only guarantees the existence of these elements for continuous functions defined on compact sets. For many optimization problems these conditions (continuity and compactness) turn out to be overly restrictive. We will therefore resort to convex sets.

Definition 1.2 (Convex sets) A set S in a linear vector space is said to be convex if

{x1, x2 ∈ S} =⇒ {x := αx1 + (1 − α)x2 ∈ S for all α ∈ (0, 1)}.

¹ A set S is bounded if there exists a number B such that for all x ∈ S, ‖x‖ ≤ B; it is closed if xn → x implies that x ∈ S.

Compilation: February 8, 2005


In geometric terms, this states that for any two points of a convex set also the line segment connecting these two points belongs to the set. In general, the empty set and singletons (sets that consist of one point only) are considered to be convex. The point αx1 + (1 − α)x2 with α ∈ (0, 1) is called a convex combination of the two points x1 and x2. More generally, convex combinations are defined for any finite set of points as follows.

Definition 1.3 (Convex combinations) Let S be a subset of a vector space. The point

x := ∑_{i=1}^n αi xi

is called a convex combination of x1, . . . , xn ∈ S if αi ≥ 0 for i = 1, . . . , n and ∑_{i=1}^n αi = 1.

It is easy to see that the set of all convex combinations of n points x1, . . . , xn in S is itself convex, i.e.,

C := {x | x is a convex combination of x1, . . . , xn}

is convex.
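As a quick numerical illustration (a hypothetical helper, not part of the text), the following sketch forms a convex combination in the sense of Definition 1.3 and checks that a combination of the vertices of a triangle stays inside the triangle:

```python
import random

def convex_combination(points, weights):
    """Form sum_i alpha_i * x_i for points x_i in R^d, with weights alpha_i
    non-negative and summing to one (Definition 1.3)."""
    assert all(w >= 0 for w in weights)
    assert abs(sum(weights) - 1.0) < 1e-9
    dim = len(points[0])
    return tuple(
        sum(w * p[k] for w, p in zip(weights, points)) for k in range(dim)
    )

# Convex combinations of the vertices of the triangle with corners
# (0,0), (1,0), (0,1) must stay inside it: x, y >= 0 and x + y <= 1.
vertices = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
random.seed(0)
raw = [random.random() for _ in vertices]
weights = [r / sum(raw) for r in raw]   # normalize to sum to one
x = convex_combination(vertices, weights)
```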

We next define the notion of interior points and closure points of sets. Let S be a subset of a normed space X. The point x ∈ S is called an interior point of S if there exists an ε > 0 such that all points y ∈ X with ‖x − y‖ < ε also belong to S. The interior of S is the collection of all interior points of S. S is said to be open if it is equal to its interior. The point x ∈ X is called a closure point of S if, for all ε > 0, there exists a point y ∈ S with ‖x − y‖ < ε. The closure of S is the collection of all closure points of S. S is said to be closed if it is equal to its closure.

We summarize some elementary properties pertaining to convex sets in the following proposition.

Proposition 1.4 Let S and T be convex sets in a normed vector space X. Then

(a) the set αS := {x | x = αs, s ∈ S} is convex for any scalar α.

(b) the sum S + T := {x | x = s + t, s ∈ S, t ∈ T} is convex.

(c) for any α1 ≥ 0 and α2 ≥ 0, (α1 + α2)S = α1S + α2S.

(d) the closure and the interior of S (and T) are convex.

(e) for any linear transformation T : X → X, the image TS := {x | x = Ts, s ∈ S} and the inverse image T^{-1}S := {x | Tx ∈ S} are convex.

(f) the intersection S ∩ T := {x | x ∈ S and x ∈ T} is convex.

The distributive property in the third item is non-trivial and depends on the convexity of S. The last property actually holds for the intersection of an arbitrary collection of convex sets, i.e., if Sα, with α ∈ A, A an arbitrary index set, is a family of convex sets then the intersection ∩_{α∈A} Sα is also convex. This property turns out to be very useful in constructing the smallest convex set that contains a given set. It is defined as follows.

As an example, with a ∈ R^n \ {0} and b ∈ R, the hyperplane {x ∈ R^n | a^T x = b} and the half-space {x ∈ R^n | a^T x ≤ b} are convex. A polyhedron is the intersection of finitely many hyperplanes and half-spaces and is convex by the last item of Proposition 1.4. A polytope is a compact polyhedron.

Definition 1.5 (Convex hull) The convex hull conv S of any subset S ⊂ X is the intersection of all convex sets containing S. If S consists of a finite number of elements, then these elements are referred to as the vertices of conv S.

It is easily seen that the convex hull of a finite set is a polytope. The converse is also true: any polytope is the convex hull of a finite set. Since convexity is a property that is closed under intersection, the following proposition is immediate.

Proposition 1.6 (Convex hulls) For any subset S of a linear vector space X, the convex hull conv(S) is convex and consists precisely of all convex combinations of the elements of S.

At a few occasions we will need the concept of a cone. A subset S of a vector space X is called a cone if αx ∈ S for all x ∈ S and α > 0. A convex cone is a cone which is a convex set. Like in Proposition 1.4, if S and T are convex cones then the sets αS, S + T, S ∩ T, TS and T^{-1}S are convex cones for all scalars α and all linear transformations T. Likewise, the intersection of an arbitrary collection of convex cones is a convex cone again. Important examples of convex cones are defined in terms of inequalities as follows. If X is a Hilbert space equipped with the inner product 〈·, ·〉, then

S := {x ∈ X | 〈x, ki〉 ≤ 0, i ∈ A}

is a (closed) convex cone for any index set A and any collection of elements ki ∈ X. Thus, solution sets of systems of linear inequalities define convex cones.

Definition 1.7 (Convex functions) A function f : S → R is called convex if S is a non-empty convex set and if for all x1, x2 ∈ S and α ∈ (0, 1) there holds that

f(αx1 + (1 − α)x2) ≤ α f(x1) + (1 − α) f(x2). (1.2.2)

f is called strictly convex if the inequality (1.2.2) is strict for all x1, x2 ∈ S, x1 ≠ x2, and all α ∈ (0, 1).

Note that the domain of a convex function is by definition a convex set. Simple examples of convex functions are f(x) = x^2 on R, f(x) = sin x on [π, 2π] and f(x) = − log x on x > 0. A function f : S → R is called concave if −f is convex. Many operations on convex functions naturally preserve convexity. For example, if f1 and f2 are convex functions with domain S then the function f1 + f2 : x ↦ f1(x) + f2(x) is convex. More generally, α1 f1 + α2 f2 and the composite function g(f1) are convex for any non-negative numbers α1, α2 and any non-decreasing convex function g : R → R.
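The defining inequality (1.2.2) and the fact that sums of convex functions remain convex can be spot-checked numerically. The helper below is an illustrative sketch (the function names and sample points are assumptions, not from the text); note that a finite sample can only refute convexity, never prove it:

```python
import random

def satisfies_convexity(f, x1, x2, alphas, tol=1e-12):
    """Check inequality (1.2.2),
    f(a*x1 + (1-a)*x2) <= a*f(x1) + (1-a)*f(x2),
    at the sampled values of a. Returns False on any violation."""
    return all(
        f(a * x1 + (1 - a) * x2) <= a * f(x1) + (1 - a) * f(x2) + tol
        for a in alphas
    )

random.seed(1)
alphas = [random.random() for _ in range(100)]   # sample points in [0, 1)

f1 = lambda x: x * x        # convex on R
f2 = lambda x: abs(x)       # convex on R
ok_f1 = satisfies_convexity(f1, -2.0, 3.0, alphas)
ok_sum = satisfies_convexity(lambda x: f1(x) + f2(x), -2.0, 3.0, alphas)
```

A non-convex function such as −x^2 fails the same check at essentially every sampled α.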


Instead of minimizing the function f : S → R we can set our aims a little lower and be satisfied with considering all possible x ∈ S that give a guaranteed upper bound of f. For this, we introduce, for any number γ ∈ R, the sublevel sets associated with f as follows:

Sγ := {x ∈ S | f(x) ≤ γ}.

Obviously, Sγ = ∅ if γ < inf_{x∈S} f(x) and Sγ coincides with the set of global minimizers of f if γ = inf_{x∈S} f(x). Note also that Sγ′ ⊆ Sγ″ whenever γ′ ≤ γ″; that is, sublevel sets are non-decreasing (in a set theoretic sense) when viewed as a function of γ. As one can guess, convex functions and convex sublevel sets are closely related to each other:

Proposition 1.8 If f : S → R is convex then the sublevel set Sγ is convex for all γ ∈ R.

Proof. Suppose f is convex. Let γ ∈ R and consider Sγ. If Sγ is empty then the statement is trivial. Suppose therefore that Sγ ≠ ∅ and let x1, x2 ∈ Sγ, α ∈ [0, 1]. Then f(x1) ≤ γ, f(x2) ≤ γ, and the convexity of S implies that αx1 + (1 − α)x2 ∈ S. Convexity of f now yields that

f(αx1 + (1 − α)x2) ≤ α f(x1) + (1 − α) f(x2) ≤ αγ + (1 − α)γ = γ,

i.e., αx1 + (1 − α)x2 ∈ Sγ.
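Proposition 1.8 can be illustrated numerically: for the convex function f(x) = x^2, every convex combination of points of a sublevel set Sγ stays in Sγ. The check below is a sketch with illustrative sample points (not from the text):

```python
def in_sublevel(f, gamma, x, tol=1e-12):
    """Membership test for the sublevel set S_gamma = {x | f(x) <= gamma}."""
    return f(x) <= gamma + tol

f = lambda x: x * x        # convex, so S_4 = [-2, 2] is a convex set
gamma = 4.0
members = [-2.0, -0.5, 1.0, 2.0]         # sample points in S_4
alphas = [0.1 * k for k in range(11)]    # includes the endpoints 0 and 1

# Every convex combination of members of S_4 must again lie in S_4.
sublevel_convex = all(
    in_sublevel(f, gamma, a * x1 + (1 - a) * x2)
    for x1 in members
    for x2 in members
    for a in alphas
)
```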

Sublevel sets are commonly used in specifying desired behavior of multi-objective control problems. As an example, suppose that S denotes a class of (closed-loop) transfer functions and let, for k = 1, . . . , K, fk : S → R be the kth objective function on S. For a multi-index γ = (γ1, . . . , γK) with γk ∈ R, the kth sublevel set S^k_{γk} := {x ∈ S | fk(x) ≤ γk} then expresses the kth design objective. The multi-objective specification amounts to characterizing

Sγ := S^1_{γ1} ∩ . . . ∩ S^K_{γK}

and the design question will be to decide for which multi-index γ this set is non-empty. Proposition 1.8 together with Proposition 1.4 promises the convexity of Sγ whenever fk is convex for all k. See Exercise 9.

We emphasize that it is not true that convexity of the sublevel sets Sγ, γ ∈ R, implies convexity of f (see Exercise 2 and Exercise 6 in this chapter). However, the class of functions for which all sublevel sets are convex is so important that it deserves its own name.

Definition 1.9 (Quasi-convex functions) A function f : S → R is quasi-convex if its sublevel sets Sγ are convex for all γ ∈ R.

It is easy to verify that f is quasi-convex if and only if

f(αx1 + (1 − α)x2) ≤ max[ f(x1), f(x2) ]

for all α ∈ (0, 1) and for all x1, x2 ∈ S. In particular, every convex function is quasi-convex.
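A standard example separating the two notions is g(x) = √|x|: its sublevel sets are the intervals [−γ^2, γ^2], so it is quasi-convex, yet it is not convex. The sketch below (illustrative sample points; a finite check, not a proof) verifies the max-inequality above and exhibits a violation of the convexity inequality (1.2.2):

```python
import math
import random

def quasi_convex_ok(f, pairs, alphas, tol=1e-12):
    """Check f(a*x1 + (1-a)*x2) <= max(f(x1), f(x2)) on sampled points."""
    return all(
        f(a * x1 + (1 - a) * x2) <= max(f(x1), f(x2)) + tol
        for (x1, x2) in pairs
        for a in alphas
    )

g = lambda x: math.sqrt(abs(x))   # quasi-convex on R, but not convex

random.seed(2)
pairs = [(random.uniform(-5, 5), random.uniform(-5, 5)) for _ in range(50)]
alphas = [0.05 * k for k in range(1, 20)]        # alphas strictly in (0, 1)
is_quasi_convex = quasi_convex_ok(g, pairs, alphas)

# The convexity inequality (1.2.2) fails, e.g. at x1 = 0, x2 = 1, alpha = 1/2:
# g(1/2) is about 0.707 while 0.5*g(0) + 0.5*g(1) = 0.5.
convexity_violated = g(0.5) > 0.5 * g(0.0) + 0.5 * g(1.0)
```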


We conclude this section with the introduction of affine sets and affine functions. A subset S of a linear vector space is called an affine set if the point x := αx1 + (1 − α)x2 belongs to S for every x1 ∈ S, x2 ∈ S and α ∈ R. The geometric idea is that for any two points of an affine set, also the line through these two points belongs to the set. From Definition 1.2 it is evident that every affine set is convex. The empty set and singletons are generally considered to be affine. Each non-empty affine set S in a finite dimensional vector space X can be written as

S = {x ∈ X | x = x0 + s, s ∈ S0}

where x0 ∈ X is a vector and S0 is a linear subspace of X. That is, affine sets are translates of linear subspaces. For any such representation, the linear subspace S0 ⊆ X is uniquely defined. However, the vector x0 is not.

Definition 1.10 (Affine functions) A function f : S → T is affine if

f(αx1 + (1 − α)x2) = α f(x1) + (1 − α) f(x2)

for all x1, x2 ∈ S and α ∈ R.

If S and T are finite dimensional then any affine function f : S → T can be represented as f(x) = f0 + T(x) where f0 ∈ T and T : S → T is a linear map. Indeed, setting f0 = f(0) and T(x) = f(x) − f0 establishes this representation. In particular, f : R^n → R^m is affine if and only if there exists x0 ∈ R^n such that f(x) = f(x0) + T(x − x0) where T is a matrix of dimension m × n. Note that all affine functions are convex and concave.

1.3 Convex optimization

After the previous section with definitions and elementary properties of convex sets and convex functions, we hope that this section will convince the most skeptical reader why convexity of sets and functions is such a desirable property for optimization.

1.3.1 Local and global minima

Anyone who has gained experience with numerical optimization methods is familiar with the pitfalls of local minima and local maxima. One crucial reason for studying convex functions is related to the absence of local minima.

Definition 1.11 (Local and global optimality) Let S be a subset of a normed space X. An element x0 ∈ S is said to be a local optimal solution of f : S → R if there exists ε > 0 such that

f(x0) ≤ f(x) (1.3.1)

for all x ∈ S with ‖x − x0‖ < ε. It is called a global optimal solution if (1.3.1) holds for all x ∈ S.


In words, x0 ∈ S is a local optimal solution if there exists a neighborhood of x0 such that f(x0) ≤ f(x) for all feasible points nearby x0. According to this definition, a global optimal solution is also locally optimal. Here is a simple and nice result which provides one of our main interests in convex functions.

Proposition 1.12 Suppose that f : S → R is convex. Every local optimal solution of f is a global optimal solution. Moreover, if f is strictly convex, then the global optimal solution is unique.

Proof. Let f be convex and suppose that x0 ∈ S is a local optimal solution of f . Then for all x ∈ S and α ∈ (0, 1) sufficiently small,

f (x0) ≤ f (x0 + α(x − x0)) = f ((1 − α)x0 + αx) ≤ (1 − α) f (x0) + α f (x). (1.3.2)

This implies that

0 ≤ α( f (x) − f (x0)) (1.3.3)

or f (x0) ≤ f (x). Hence, x0 is a global optimal solution of f . If f is strictly convex, then the second inequality in (1.3.2) is strict, so that (1.3.3) becomes strict for all x ∈ S with x ≠ x0. Hence, x0 must be unique.

It is very important to emphasize that Proposition 1.12 does not make any statement about the existence of optimal solutions x0 ∈ S that minimize f . It merely says that all locally optimal solutions are globally optimal. For convex functions it therefore suffices to compute locally optimal solutions in order to determine the globally optimal solution.
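To see this at work numerically, the sketch below runs plain gradient descent on a strictly convex function from two very different starting points; both runs stop at the same global minimizer. The function, step size and tolerance are illustrative choices, not taken from the text.

```python
# Illustration of Proposition 1.12: for a convex function, a local search
# that stops at a local minimum has in fact found the global minimum.

def grad_descent(df, x, step=0.1, tol=1e-10, max_iter=10000):
    """Plain gradient descent; stops when the step becomes negligible."""
    for _ in range(max_iter):
        x_new = x - step * df(x)
        if abs(x_new - x) < tol:
            break
        x = x_new
    return x

# f(x) = x^2 + 2x is strictly convex with unique global minimizer x = -1.
df = lambda x: 2 * x + 2

# Starting points far apart converge to the same (global) minimizer.
x_left = grad_descent(df, -50.0)
x_right = grad_descent(df, 40.0)
print(x_left, x_right)  # both close to -1
```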

Remark 1.13 Proposition 1.12 does not hold for quasi-convex functions.

1.3.2 Uniform bounds

The second reason to investigate convex functions comes from the fact that uniform upper bounds on convex functions can be verified on subsets of their domain. Here are the details: let S0 be a set and suppose that f : S → R is a function with domain

S = conv(S0).

As we have seen in Proposition 1.6, S is convex and we have the following property which is both simple and powerful.

Proposition 1.14 Let f : S → R be a convex function where S = conv(S0). Then f (x) ≤ γ for all x ∈ S if and only if f (x) ≤ γ for all x ∈ S0.

Proof. The ‘only if’ part is trivial. To see the ‘if’ part, Proposition 1.6 implies that every x ∈ S can be written as a convex combination x = ∑_{i=1}^n αi xi where n > 0, αi ≥ 0, xi ∈ S0, i = 1, . . . , n and ∑_{i=1}^n αi = 1. Using convexity of f and non-negativity of the αi ’s, we infer

f (x) = f ( ∑_{i=1}^n αi xi ) ≤ ∑_{i=1}^n αi f (xi ) ≤ ∑_{i=1}^n αi γ = γ,

which yields the result.

Proposition 1.14 states that the uniform bound f (x) ≤ γ on S can equivalently be verified on the set S0. This is of great practical relevance especially when S0 contains only a finite number of elements, i.e., when S is a polytope. It then requires a finite number of tests to conclude whether or not f (x) ≤ γ for all x ∈ S. In addition, since

γ0 := sup_{x∈S} f (x) = max_{x∈S0} f (x),

the supremum of f is attained and can be determined by considering S0 only.
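When S0 is finite, the proposition really does reduce an infinite family of tests to finitely many. The following sketch (with an ad-hoc convex f and the vertex set of the square [−1, 1]² as S0) verifies the bound γ on the vertices only, and then checks it on random convex combinations, i.e., on points of S.

```python
import random

# Proposition 1.14 in numbers: a convex f is bounded by gamma on conv(S0)
# iff it is bounded by gamma on S0.  Here S0 has only four elements, so
# only four function evaluations are needed to certify the bound.

f = lambda x, y: x * x + y * y          # convex on R^2
S0 = [(-1, -1), (-1, 1), (1, -1), (1, 1)]

gamma = max(f(x, y) for x, y in S0)     # bound verified on the vertices only

random.seed(0)
for _ in range(1000):
    # random convex combination of the vertices: a point of conv(S0)
    w = [random.random() for _ in S0]
    s = sum(w)
    x = sum(wi * v[0] for wi, v in zip(w, S0)) / s
    y = sum(wi * v[1] for wi, v in zip(w, S0)) / s
    assert f(x, y) <= gamma + 1e-12     # the vertex bound holds on all of S

print(gamma)  # 2, attained at every vertex
```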

1.3.3 Duality and convex programs

In many optimization problems, the universum of all possible decisions is a real-valued finite dimensional vector space X = Rn and the space of feasible decisions typically consists of those x ∈ X which satisfy a finite number of inequalities and equations of the form

gi (x) ≤ 0, i = 1, . . . , k

hi (x) = 0, i = 1, . . . , l .

Indeed, saturation constraints, safety margins, physically meaningful variables, and a large number of constitutive and balance equations can be written in this way. The space of feasible decisions S ⊂ X is then expressed as

S := {x ∈ X | g(x) ≤ 0, h(x) = 0} (1.3.4)

where g : X → Rk and h : X → Rl are the vector-valued functions which consist of the component functions gi and hi and where the inequality g(x) ≤ 0 is interpreted component-wise. Hence, the feasibility set S is defined in terms of k inequality constraints and l equality constraints. With sets of this form, we consider the optimization problem to find the optimal value

Popt := inf_{x∈S} f (x),

and possibly optimal solutions xopt ∈ S such that f (xopt) = Popt. Here, f : X → R is a given objective function. In this section, we will refer to this constrained optimization problem as a primal optimization problem and to Popt as the primal optimal value. To make the problem non-trivial, we will assume that Popt > −∞ and that S is non-empty.


Remark 1.15 If X, f and g are convex and h is affine, then it is easily seen that S is convex, in which case this problem is commonly referred to as a convex program. This is probably the only tractable instance of this problem and its study certainly belongs to the most sophisticated areas of nonlinear optimization theory. The Karush-Kuhn-Tucker Theorem, presented below, is the key to understanding convex programs. The special instance where f , g and h are all affine functions makes the problem to determine Popt a linear programming problem. If f is quadratic (i.e., f is of the form f (x) = (x − x0)>Q(x − x0) for some Q = Q> ∈ Rn×n and fixed x0 ∈ Rn) and g and h are affine, this is a quadratic programming problem.

Obviously, for any x0 ∈ S we have that Popt ≤ f (x0), i.e., an upper bound of Popt is obtained from any feasible point x0 ∈ S. On the other hand, if x ∈ X satisfies g(x) ≤ 0 and h(x) = 0, then for arbitrary vectors y ≥ 0 (meaning that each component yi of y is non-negative) and z we have that

L(x, y, z) := f (x) + 〈y, g(x)〉 + 〈z, h(x)〉 ≤ f (x).

Here, L(·, ·, ·) is called a Lagrangian, which is a function of n + k + l variables, and 〈·, ·〉 denotes the standard inner product in Rn, that is, 〈x1, x2〉 := x2> x1. It is immediate that for all y ∈ Rk with y ≥ 0 and z ∈ Rl we have that

ℓ(y, z) := inf_{x∈X} L(x, y, z) ≤ inf_{x∈S} f (x) = Popt.

The function ℓ(·, ·) is the Lagrange dual cost. A pair of vectors (y, z) with y ≥ 0 is said to be feasible for the dual problem if ℓ(y, z) > −∞. Suppose that there exists at least one such feasible pair (y, z). Since ℓ is independent of x, we may conclude that

Dopt := sup_{y≥0, z} ℓ(y, z) = sup_{y≥0, z} inf_{x∈X} L(x, y, z) ≤ Popt

provides a lower bound of Popt. Since ℓ is the pointwise infimum of the concave functions L(x, ·, ·), ℓ(y, z) is a concave function (no matter whether or not the primal problem is a convex program) and the dual optimization problem to determine Dopt is therefore a concave optimization problem. The main reason to consider this problem is that the constraints in the dual problem are much simpler to deal with than the ones in the primal problem.

Of course, the question arises when Dopt = Popt. To answer this question, suppose that X, f and g are convex and h is affine. As noted before, this implies that S is convex. We will say that S satisfies the constraint qualification if there exists a point x0 in the interior of X with g(x0) ≤ 0, h(x0) = 0 and such that g j (x0) < 0 for all component functions g j that are not affine2. In particular, S satisfies the constraint qualification if g is affine. We have the following central result.

Theorem 1.16 (Karush-Kuhn-Tucker) Suppose that X, f and g are convex and h is affine. Assume that Popt > −∞ and that S defined in (1.3.4) satisfies the constraint qualification. Then

Dopt = Popt

2 Some authors call S superconsistent and the point x0 a Slater point in that case.


and there exist vectors yopt ∈ Rk, yopt ≥ 0, and zopt ∈ Rl such that Dopt = ℓ(yopt, zopt), i.e., the dual optimization problem admits an optimal solution. Moreover, xopt is an optimal solution of the primal optimization problem and (yopt, zopt) is an optimal solution of the dual optimization problem if and only if

(a) g(xopt) ≤ 0, h(xopt) = 0,

(b) yopt ≥ 0 and xopt minimizes L(x, yopt, zopt) over all x ∈ X, and

(c) 〈yopt, g(xopt)〉 = 0.

The result of Theorem 1.16 is very general and provides a strong tool in convex optimization, because the dual optimization problem is, in general, simpler and, under the stated assumptions, is guaranteed to be solvable. The optimal solutions (yopt, zopt) of the dual optimization problem are generally called Kuhn-Tucker points. The conditions (a), (b) and (c) in Theorem 1.16 are called the primal feasibility, the dual feasibility and the alignment (or complementary slackness) conditions, respectively.

Theorem 1.16 provides a conceptual solution of the primal optimization problem as follows. First, construct the dual optimization problem to maximize ℓ(y, z). Second, calculate a Kuhn-Tucker point (yopt, zopt) which defines an optimal solution to the dual problem (existence is guaranteed). Third, determine (if any) the set P′opt of points which minimize L(x, yopt, zopt) over all x ∈ X. Fourth, let Popt be the set of points xopt ∈ P′opt such that g(xopt) ≤ 0, h(xopt) = 0 and 〈yopt, g(xopt)〉 = 0. Then Popt is the set of optimal solutions of the primal optimization problem. We emphasize that while optimal solutions of the dual problem are guaranteed to exist, optimal solutions of the primal problem may not exist.
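The recipe above can be traced on a one-dimensional convex program. The instance below (minimize x² subject to 1 − x ≤ 0) is an illustrative choice; the dual function is maximized on a grid and the three Karush-Kuhn-Tucker conditions are checked at the resulting pair.

```python
# Worked instance of Theorem 1.16 for the convex program
#   minimize f(x) = x^2  subject to  g(x) = 1 - x <= 0.
# The Lagrangian is L(x, y) = x^2 + y (1 - x) with y >= 0; minimizing
# over x (set dL/dx = 2x - y = 0) gives the dual cost l(y) = y - y^2/4.

def lagrangian(x, y):
    return x * x + y * (1 - x)

def dual(y):
    x_star = y / 2.0                 # unconstrained minimizer of L(., y)
    return lagrangian(x_star, y)

# Maximize the (concave) dual over a grid of multipliers y >= 0.
ys = [i / 1000.0 for i in range(4001)]
y_opt = max(ys, key=dual)
D_opt = dual(y_opt)

x_opt, P_opt = 1.0, 1.0              # primal optimum, read off directly

print(y_opt, D_opt)                  # y_opt = 2, D_opt = 1 = P_opt
assert 1 - x_opt <= 0                          # (a) primal feasibility
assert y_opt >= 0                              # (b) dual feasibility
assert abs(y_opt * (1 - x_opt)) < 1e-9         # (c) complementary slackness
assert abs(D_opt - P_opt) < 1e-9               # strong duality
```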

Remark 1.17 In order that the triple (xopt, yopt, zopt) defined in Theorem 1.16 exists, it is necessary and sufficient that (xopt, yopt, zopt) be a saddle point of the Lagrangian L in the sense that

L(xopt, y, z) ≤ L(xopt, yopt, zopt) ≤ L(x, yopt, zopt)

for all x, y ≥ 0 and z. In that case,

Popt = L(xopt, yopt, zopt) = inf_x sup_{y≥0, z} L(x, y, z) = sup_{y≥0, z} inf_x L(x, y, z) = Dopt.

That is, the optimal values of the primal and dual optimization problems coincide with the saddle point value L(xopt, yopt, zopt). Under the given conditions, Theorem 1.16 therefore states that xopt is an optimal solution of the primal optimization problem if and only if there exist (yopt, zopt) such that yopt ≥ 0 and (xopt, yopt, zopt) is a saddle point of L.

Remark 1.18 A few generalizations of Theorem 1.16 are worth mentioning. The inequality constraints g(x) ≤ 0 in (1.3.4) can be replaced by the more general constraint g(x) ∈ K, where K ⊂ H is a closed convex cone in a Hilbert space H. In the definition of the Lagrangian L, the vectors y and z define linear functionals 〈y, ·〉 and 〈z, ·〉 on the Hilbert spaces Rk and Rl, respectively. For more general Hilbert spaces (H, 〈·, ·〉), the constraint y ≥ 0 needs to be replaced by the requirement that 〈y, g(x)〉 ≤ 0 for all x ∈ X with g(x) ∈ K. This is equivalent to saying that the linear functional 〈y, ·〉 is non-positive on K.

Figure 1.1: Joseph-Louis Lagrange

Joseph-Louis Lagrange (1736-1813) studied at the College of Turin and became interested in mathematics when he read a copy of Halley’s work on the use of algebra in optics. Although he decided to devote himself to mathematics, he did not have the benefit of studying under the supervision of a leading mathematician. Before writing his first paper, he sent his results to Euler, who at that time was working in Berlin. Lagrange worked on the calculus of variations and regularly corresponded on this topic with Euler. Among his many contributions to mathematics, Lagrange worked on the calculus of differential equations and applications in fluid mechanics, where he first introduced the Lagrangian function.

1.3.4 Subgradients

Our fourth reason of interest in convex functions comes from the geometric idea that through any point on the graph of a convex function we can draw a line such that the entire graph lies above or on the line. For functions f : S → R with S ⊆ R, this idea is quite intuitive from a geometric point of view. The general result is stated in the next proposition and its proof is a surprisingly simple application of Theorem 1.16.


Proposition 1.19 Let S ⊆ Rn. If f : S → R is convex then for all x0 in the interior of S there exists a vector g = g(x0) ∈ Rn such that

f (x) ≥ f (x0) + 〈g, x − x0〉 (1.3.5)

for all x ∈ S.

Proof. The set S′ := {x ∈ S | x − x0 = 0} has the form (1.3.4) and we note that the primal optimal value Popt := inf_{x∈S′} f (x) − f (x0) = 0. Define the Lagrangian L(x, z) := f (x) − f (x0) + 〈z, x − x0〉 and the corresponding dual optimal value Dopt := sup_z inf_{x∈Rn} L(x, z). Then Dopt ≤ Popt = 0 and since S′ trivially satisfies the constraint qualification, we infer from Theorem 1.16 that there exists zopt ∈ Rn such that

Dopt = 0 = inf_{x∈S} f (x) − f (x0) + 〈zopt, x − x0〉.

Consequently, f (x) − f (x0) + 〈zopt, x − x0〉 ≥ 0 for all x ∈ S, which yields (1.3.5) by setting g := −zopt.

A vector g satisfying (1.3.5) is called a subgradient of f at the point x0, and the affine function defined by the right-hand side of (1.3.5) is called a support functional for f at x0. Inequality (1.3.5) is generally referred to as the subgradient inequality. We emphasize that the subgradient of a convex function f at a point is in general non-unique. Indeed, the real-valued function f (x) = |x| is convex on R and has any real number g ∈ [−1, 1] as its subgradient at x = 0. The set of all subgradients of f at x0 is the subdifferential of f at x0 and is denoted by ∂ f (x0) (or ∂x f (x0) if the independent variable needs to be displayed explicitly). From the subgradient inequality (1.3.5) it is immediate that ∂(α f )(x) = α∂ f (x) for all x and α > 0. Also, x0 ∈ S is a global optimal solution of f if and only if 0 ∈ ∂ f (x0). We remark that for a convex function f , ∂ f (x) is a closed convex set for any x in the interior of its domain. As a more striking property, let f1 and f2 be convex functions with domains S1 and S2, respectively; then

∂( f1 + f2)(x) = ∂ f1(x) + ∂ f2(x)

for all x in the interior of S1 ∩ S2.
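The example of f (x) = |x| mentioned above can be checked numerically: the sketch below tests the subgradient inequality (1.3.5) on a grid of points, confirming that every g ∈ [−1, 1] is a subgradient at x0 = 0 while values outside this interval are not. The grid and tolerance are ad-hoc choices.

```python
# Numerical check of the subgradient inequality (1.3.5) for f(x) = |x|
# at x0 = 0: the subdifferential there is exactly the interval [-1, 1].

f = lambda x: abs(x)
xs = [i / 10.0 for i in range(-50, 51)]       # test points in [-5, 5]

def is_subgradient(g, x0=0.0):
    # g is a subgradient at x0 iff f(x) >= f(x0) + g * (x - x0) for all x
    return all(f(x) >= f(x0) + g * (x - x0) - 1e-12 for x in xs)

inside = [g / 10.0 for g in range(-10, 11)]   # candidates in [-1, 1]
outside = [-1.5, 1.5, 2.0]                    # candidates outside [-1, 1]

assert all(is_subgradient(g) for g in inside)
assert not any(is_subgradient(g) for g in outside)
print("subdifferential of |x| at 0 contains all of [-1, 1]")
```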

Remark 1.20 Proposition 1.19 gives a necessary condition for convexity of a function f . It can be shown that if the gradient

∇ f = ( ∂f/∂x1 . . . ∂f/∂xn )

exists and is continuous at x ∈ S then ∂ f (x) = {∇ f (x)}. Conversely, if f has a unique subgradient at x, then the gradient of f exists at x and ∂ f (x) = {∇ f (x)}.

The geometric interpretation of Proposition 1.19 is that the graphs of the affine functions x ↦ f (x0) + 〈g, x − x0〉, with g ∈ ∂ f (x0), range over the collection of all hyperplanes which are tangent to the graph of f at the point (x0, f (x0)). That is, the graph of f lies on or above these hyperplanes, each of which contains the point (x0, f (x0)). If we consider the right-hand side of (1.3.5), then trivially 〈g, x − x0〉 > 0 implies that f (x) > f (x0). Thus all points in the half-space {x ∈ S | 〈g, x − x0〉 > 0} lead to larger values of f than f (x0). In particular, in searching for the global minimum of f we can disregard this entire half-space.

This last observation is at the basis of the ellipsoid algorithm: a simple, numerically robust and straightforward iterative algorithm for the computation of optimal values.

Algorithm 1.21 (Ellipsoid algorithm) Aim: determine the optimal value of a convex function f : S → R.

Input: A convex function f : S → R with S ⊂ Rn and an ellipsoid

E0 := {x ∈ Rn | (x − x0)> P0^{-1} (x − x0) ≤ 1}

centered at x0 ∈ Rn and oriented by a positive definite matrix P0 = P0> such that it contains an optimal solution of the problem to minimize f . Let ε > 0 be an accuracy level. Let k = 0.

Step 1: Compute a subgradient gk ∈ ∂ f (xk) and set

Lk := max_{ℓ≤k} ( f (xℓ) − √(gℓ> Pℓ gℓ) ), Uk := min_{ℓ≤k} f (xℓ).

If gk = 0 or Uk − Lk < ε, then set x∗ = xk and stop. Otherwise proceed to Step 2.

Step 2: Put Hk := Ek ∩ {x ∈ Rn | 〈gk, x − xk〉 ≤ 0}.

Step 3: Set

xk+1 := xk − Pk gk / ( (n + 1) √(gk> Pk gk) )

Pk+1 := n²/(n² − 1) ( Pk − 2/((n + 1) gk> Pk gk) Pk gk gk> Pk )

and define the ellipsoid

Ek+1 := {x ∈ Rn | (x − xk+1)> Pk+1^{-1} (x − xk+1) ≤ 1}

with center xk+1 and orientation Pk+1.

Step 4: Set k to k + 1 and return to Step 1.

Output: The point x∗ with the property that | f (x∗) − inf_{x∈S} f (x)| ≤ ε.


The algorithm therefore determines the optimal value of f with arbitrary accuracy. We emphasize that the point x∗ is generally not an optimal or almost optimal solution unless gk = 0 upon termination of the algorithm. Only in that case is x∗ an optimal solution. Hence, the algorithm does not necessarily calculate a solution, but only the optimal value Vopt = inf_{x∈S} f (x).

The idea behind the algorithm is as follows. The algorithm is initialized by a ‘non-automated’ choice of x0 and P0 such that there exists an optimal solution xopt in the ellipsoid E0. If S is bounded then the safest choice would be such that S ⊆ E0. The subgradient gk ∈ ∂ f (xk) divides Rn into the two half-spaces

{x | 〈gk, x − xk〉 < 0} and {x | 〈gk, x − xk〉 > 0}

while the cutting plane {x | 〈gk, x − xk〉 = 0} passes through the center of the ellipsoid Ek for each k. Since f (x) > f (xk) whenever 〈gk, x − xk〉 > 0, the optimal solution xopt is guaranteed to be located in Hk. The ellipsoid defined in Step 3 contains Hk and is the smallest-volume ellipsoid with this property. Iterating over k, the algorithm produces a sequence of ellipsoids E0, E1, E2, . . . whose volumes decrease according to

vol(Ek+1) = det(Pk+1) ≤ e^{−1/(2n)} det(Pk) = e^{−1/(2n)} vol(Ek)

and where each ellipsoid is guaranteed to contain xopt. The sequence of centers x0, x1, x2, . . . of the ellipsoids generates a sequence of function evaluations f (xk) which converges to the optimal value f (xopt). Convergence of the algorithm is in ‘polynomial time’ due to the fact that the volume of the ellipsoids decreases geometrically. Since xopt ∈ Ek for all k, we have

f (xk) ≥ f (xopt) ≥ f (xk) + 〈gk, xopt − xk〉 ≥ f (xk) + inf_{ξ∈Ek} 〈gk, ξ − xk〉 = f (xk) − √(gk> Pk gk)

so that Lk ≤ f (xopt) ≤ Uk provide lower and upper bounds on the optimal value.

The algorithm is easy to implement, is very robust from a numerical point of view and has low memory requirements. However, convergence may be rather slow, which may be a disadvantage for large optimization problems.

1.4 Linear matrix inequalities

1.4.1 What are they?

A linear matrix inequality is an expression of the form

F(x) := F0 + x1F1 + . . . + xnFn ≺ 0 (1.4.1)

where


• x = (x1, . . . , xn) is a vector of n real numbers called the decision variables.

• F0, . . . , Fn are real symmetric matrices, i.e., Fj = Fj> for j = 0, . . . , n.

• the inequality ≺ 0 in (1.4.1) means ‘negative definite’. That is, u>F(x)u < 0 for all non-zero real vectors u. Because all eigenvalues of a real symmetric matrix are real, (1.4.1) is equivalent to saying that all eigenvalues λ(F(x)) are negative. Equivalently, the maximal eigenvalue satisfies λmax(F(x)) < 0.

It is convenient to introduce some notation for symmetric and Hermitian matrices. A matrix A is Hermitian if it is square and A = A∗ = Ā>, where the bar denotes taking the complex conjugate of each entry in A. If A is real then this amounts to saying that A = A>, and we call A symmetric. The sets of all m × m Hermitian and symmetric matrices will be denoted by Hm and Sm, respectively, and we will omit the superscript m if the dimension is not relevant for the context.

Definition 1.22 (Linear Matrix Inequality) A linear matrix inequality (LMI) is an inequality

F(x) ≺ 0 (1.4.2)

where F is an affine function mapping a finite dimensional vector space X to either H or S.

Remark 1.23 Recall from Definition 1.10 that an affine mapping F : X → S necessarily takes the form F(x) = F0 + T(x) where F0 ∈ S (i.e., F0 is real symmetric) and T : X → S is a linear transformation. Thus if X is finite dimensional, say of dimension n, and {e1, . . . , en} constitutes a basis for X, then every x ∈ X can be represented as x = ∑_{j=1}^n xj ej and we can write

T(x) = T( ∑_{j=1}^n xj ej ) = ∑_{j=1}^n xj Fj

where Fj = T(ej ) ∈ S. Hence we obtain (1.4.1) as a special case.

Remark 1.24 In most control applications, LMI’s arise as functions of matrix variables rather than scalar-valued decision variables. This means that we consider inequalities of the form (1.4.2) where X = Rm1×m2 is the set of real matrices of dimension m1 × m2. A simple example with m1 = m2 = m is the Lyapunov inequality F(X) = A>X + XA + Q ≺ 0 where A, Q ∈ Rm×m are assumed to be given and X is the unknown matrix variable of dimension m × m. Note that this defines an LMI only if Q ∈ Sm. We can view this LMI as a special case of (1.4.1) by defining an arbitrary basis e1, . . . , en of X and expanding X ∈ X as X = ∑_{j=1}^n xj ej. Then

F(X) = F( ∑_{j=1}^n xj ej ) = F0 + ∑_{j=1}^n xj T(ej ) = F0 + ∑_{j=1}^n xj Fj

which is of the form (1.4.1). The coefficients xj in the expansion of X define the decision variables. The number of (independent) decision variables n corresponds to the dimension of X. The number n is at most m² (or m1 × m2 for non-square matrix variables) and will depend on the structure imposed on the matrix variable X. For example, if the matrix variable X is required to be symmetric, X = Sm, which has a basis of n = m(m + 1)/2 matrix-valued elements. If X is required to be diagonal then n = m.

Remark 1.25 A non-strict LMI is a linear matrix inequality where ≺ in (1.4.1) and (1.4.2) is replaced by ⪯. The matrix inequalities F(x) ≻ 0 and F(x) ≺ G(x), with F and G affine functions, are obtained as special cases of Definition 1.22 as they can be rewritten as the linear matrix inequalities −F(x) ≺ 0 and F(x) − G(x) ≺ 0.

1.4.2 Why are they interesting?

The linear matrix inequality (1.4.2) defines a convex constraint on x. That is, the set

S := {x | F(x) ≺ 0}

of solutions of the LMI F(x) ≺ 0 is convex. Indeed, if x1, x2 ∈ S and α ∈ (0, 1) then

F(αx1 + (1 − α)x2) = αF(x1) + (1 − α)F(x2) ≺ 0

where we used that F is affine and where the inequality follows from the fact that α > 0 and (1 − α) > 0.

Although the convex constraint F(x) ≺ 0 on x may seem rather special, it turns out that many convex sets can be represented in this way and that these sets have more attractive properties than general convex sets. In this subsection we discuss some seemingly trivial properties of linear matrix inequalities which turn out to be of eminent help in reducing multiple constraints on an unknown variable to an equivalent constraint involving a single linear matrix inequality.

Definition 1.26 (System of LMI’s) A system of linear matrix inequalities is a finite set of linear matrix inequalities

F1(x) ≺ 0, . . . , Fk(x) ≺ 0. (1.4.3)

From Proposition 1.4 we infer that the intersection of the feasible sets of each of the inequalities (1.4.3) is convex. In other words, the set of all x that satisfy (1.4.3) is convex. The question now arises whether or not this set can be represented as the feasibility set of another LMI. The answer is yes. Indeed, F1(x) ≺ 0, . . . , Fk(x) ≺ 0 if and only if

F(x) := diag( F1(x), F2(x), . . . , Fk(x) ) ≺ 0,

where diag(·) denotes the block-diagonal matrix with the indicated blocks on its diagonal. The last inequality indeed makes sense as F(x) is symmetric (or Hermitian) for any x. Further, since the set of eigenvalues of F(x) is simply the union of the eigenvalues of F1(x), . . . , Fk(x), any x that satisfies F(x) ≺ 0 also satisfies the system of LMI’s (1.4.3) and vice versa. We conclude that multiple LMI constraints can always be converted to a single LMI constraint.
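The eigenvalue argument can be illustrated with two small LMIs in one variable x. In the ad-hoc example below, F1(x) = x − 1 (a 1 × 1 block) and F2(x) = [[−1, x], [x, −1]], and the sketch checks that the stacked block-diagonal LMI is feasible exactly where both individual LMIs are, namely on the interval (−1, 1).

```python
# Stacking check: x solves both F1(x) < 0 and F2(x) < 0 exactly when the
# block-diagonal F(x) = diag(F1(x), F2(x)) is negative definite.

def feasible_separately(x):
    f1_neg = (x - 1) < 0
    # [[-1, x], [x, -1]] < 0 iff its determinant 1 - x^2 is positive
    # (the (1,1) entry -1 is already negative)
    f2_neg = 1 - x * x > 0
    return f1_neg and f2_neg

def feasible_stacked(x):
    # eigenvalues of diag(x - 1, [[-1, x], [x, -1]]) are the union
    # of the block eigenvalues: x - 1 and -1 +/- |x|
    eigs = [x - 1, -1 - abs(x), -1 + abs(x)]
    return max(eigs) < 0

for x in [i / 100.0 for i in range(-300, 301)]:
    assert feasible_separately(x) == feasible_stacked(x)
print("stacked LMI and system of LMIs have the same feasible set")
```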

A second important property concerns the incorporation of affine constraints in linear matrix inequalities. By this, we mean that combined constraints (in the unknown x) of the form

{ F(x) ≺ 0 and Ax = a }    or    { F(x) ≺ 0 and x = Bu + b for some u }

where the affine function F : Rn → S, the matrices A and B and the vectors a and b are given, can be lumped into one linear matrix inequality F̃(x̃) ≺ 0. More generally, the combined equations

{ F(x) ≺ 0 and x ∈ M } (1.4.4)

where M is an affine set in Rn, can be rewritten in the form of one single linear matrix inequality so as to eliminate the affine constraint. To do this, recall that affine sets M can be written as

M = {x | x = x0 + m, m ∈ M0}

with x0 ∈ Rn and M0 a linear subspace of Rn. Suppose that ñ = dim(M0) and let e1, . . . , eñ ∈ Rn be a basis of M0. Let F(x) = F0 + T(x) be decomposed as in Remark 1.23. Then (1.4.4) can be rewritten as

0 ≻ F(x) = F0 + T( x0 + ∑_{j=1}^ñ x̃j ej ) = F0 + T(x0) + ∑_{j=1}^ñ x̃j T(ej ) = F̃0 + x̃1 F̃1 + . . . + x̃ñ F̃ñ =: F̃(x̃)

where F̃0 = F0 + T(x0), F̃j = T(ej ) and x̃ = col(x̃1, . . . , x̃ñ) collects the coefficients of x − x0 in the basis of M0. This implies that x ∈ Rn satisfies (1.4.4) if and only if F̃(x̃) ≺ 0. Note that the dimension ñ of x̃ is at most equal to the dimension n of x.

A third property of LMI’s is obtained from a simple algebraic observation. If M is a square matrix and T is non-singular, then the product T∗MT is called a congruence transformation of M. For Hermitian matrices M such a transformation does not change the number of positive and negative eigenvalues of M. Indeed, if the vectors u and v are related according to u = Tv with T non-singular, then u∗Mu < 0 for all nonzero u is equivalent to saying that v∗T∗MTv < 0 for all nonzero v. Hence M ≺ 0 if and only if T∗MT ≺ 0. Apply this insight to a partitioned Hermitian matrix

M = ( M11 M12 ; M21 M22 )

with M11 square and non-singular, by computing the congruence transformation T∗MT with the


non-singular matrix T = ( I  −M11^{-1}M12 ; 0  I ). This yields that

M ≺ 0 ⇐⇒ ( M11 0 ; 0 S ) ≺ 0 ⇐⇒ { M11 ≺ 0 and S ≺ 0 }

where

S := M22 − M21 M11^{-1} M12

is the Schur complement of M11 in M. A similar result is obtained with T = ( I 0 ; −M22^{-1}M21  I ). This observation can be exploited to derive a powerful result which transforms some non-linear inequalities into linear inequalities:

Proposition 1.27 (Schur complement) Let F : X → H be an affine function which is partitioned according to

F(x) = ( F11(x) F12(x) ; F21(x) F22(x) )

where F11(x) is square. Then F(x) ≺ 0 if and only if

{ F11(x) ≺ 0 and F22(x) − F21(x) [F11(x)]^{-1} F12(x) ≺ 0 } (1.4.5)

if and only if

{ F22(x) ≺ 0 and F11(x) − F12(x) [F22(x)]^{-1} F21(x) ≺ 0 } (1.4.6)

The second inequalities in (1.4.5) and (1.4.6) are non-linear constraints in x. Using this result, it follows that non-linear matrix inequalities of the form (1.4.5) or (1.4.6) can be converted to linear matrix inequalities. In particular, non-linear inequalities of the form (1.4.5) or (1.4.6) define convex constraints on the variable x in the sense that the solution set of these inequalities is convex.
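A numerical spot-check of the proposition on a fixed symmetric 2 × 2 matrix, partitioned into four scalar blocks M11, M12, M21, M22 (an ad-hoc example):

```python
# Spot-check of the Schur complement criterion on M = [[-2, 1], [1, -2]]:
# M < 0 should hold exactly when M11 < 0 and M22 - M21 M11^{-1} M12 < 0.

def eigs_2x2_sym(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]]."""
    tr, det = a + c, a * c - b * b
    r = ((tr * tr - 4 * det) ** 0.5) / 2
    return (tr / 2 - r, tr / 2 + r)

M11, M12, M21, M22 = -2.0, 1.0, 1.0, -2.0

# Direct test: both eigenvalues of M are negative.
lo, hi = eigs_2x2_sym(M11, M12, M22)
assert lo < 0 and hi < 0

# Schur complement test: M11 < 0 and M22 - M21 M11^{-1} M12 < 0.
S = M22 - M21 * (1 / M11) * M12
assert M11 < 0 and S < 0
print(lo, hi, S)  # eigenvalues -3, -1; Schur complement -1.5
```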

1.4.3 What are they good for?

As we will see, many optimization problems in control, identification and signal processing can be formulated (or reformulated) using linear matrix inequalities. Clearly, it only makes sense to cast these problems in an LMI setting if these inequalities can be solved in an efficient and reliable way. Since the linear matrix inequality F(x) ≺ 0 defines a convex constraint on the variable x, optimization problems involving the minimization (or maximization) of a performance function f : S → R with S := {x | F(x) ≺ 0} belong to the class of convex optimization problems. Casting this in the setting of the previous section, it may be apparent that the full power of convex optimization theory can be employed if the performance function f is known to be convex.


Suppose that F : X → S is affine. There are two generic problems related to the study of linear matrix inequalities:

(a) Feasibility: The question whether or not there exist elements x ∈ X such that F(x) ≺ 0 is called a feasibility problem. The LMI F(x) ≺ 0 is called feasible if such x exists; otherwise it is said to be infeasible.

(b) Optimization: Let an objective function f : S → R be given, where S = {x | F(x) ≺ 0}. The problem to determine

Vopt = inf_{x∈S} f (x)

is called an optimization problem with an LMI constraint. This problem involves the determination of Vopt, the calculation of an almost optimal solution x (i.e., for arbitrary ε > 0 the calculation of an x ∈ S such that Vopt ≤ f (x) ≤ Vopt + ε), or the calculation of optimal solutions xopt (elements xopt ∈ S such that Vopt = f (xopt)).

Let us give some simple examples to motivate the study of these problems.

Example 1: stability

Consider the problem to determine exponential stability of the linear autonomous system

ẋ = Ax (1.4.7)

where A ∈ Rn×n. By this, we mean the problem to decide whether or not there exist positive constants M and α such that for any initial condition x0 the solution x(t) of (1.4.7) with x(t0) = x0 satisfies the bound

‖x(t)‖ ≤ M ‖x(t0)‖ e^{−α(t−t0)}, for all t ≥ t0. (1.4.8)

Lyapunov taught us that the system (1.4.7) is exponentially stable if and only if there exists X = X> such that X ≻ 0 and A>X + XA ≺ 0. Indeed, in that case the function V(x) := x>Xx qualifies as a Lyapunov function in that it is positive for all non-zero x and strictly decaying along solutions x of (1.4.7). In Chapter 3 we show that (1.4.8) holds with M² = λmax(X)/λmin(X) and α > 0 such that A>X + XA + αX ≺ 0. Thus, exponential stability of the system (1.4.7) is equivalent to feasibility of the LMI

( −X 0 ; 0 A>X + XA ) ≺ 0.
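As a sanity check of Lyapunov's criterion, one can take a stable matrix and exhibit a feasible X explicitly. In the ad-hoc instance below, A = [[0, 1], [−2, −3]] (eigenvalues −1 and −2) and X solves A>X + XA = −I; the sketch verifies X ≻ 0 and A>X + XA ≺ 0 via Sylvester's criterion.

```python
# Lyapunov LMI check for the stable matrix A = [[0, 1], [-2, -3]].
# Solving A^T X + X A = -I by hand gives X = [[1.25, 0.25], [0.25, 0.25]].

A = [[0.0, 1.0], [-2.0, -3.0]]
X = [[1.25, 0.25], [0.25, 0.25]]

def matmul(P, Q):
    return [[sum(P[i][k] * Q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def transpose(P):
    return [[P[j][i] for j in range(2)] for i in range(2)]

# L = A^T X + X A should equal -I for this particular X.
AtX, XA = matmul(transpose(A), X), matmul(X, A)
L = [[AtX[i][j] + XA[i][j] for j in range(2)] for i in range(2)]
print(L)  # [[-1.0, 0.0], [0.0, -1.0]]

# X > 0: positive leading principal minors (Sylvester's criterion).
assert X[0][0] > 0 and X[0][0] * X[1][1] - X[0][1] * X[1][0] > 0
# L < 0: -L has positive leading principal minors.
assert -L[0][0] > 0 and L[0][0] * L[1][1] - L[0][1] * L[1][0] > 0
```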


Example 2: µ-analysis

Experts in µ-analysis (but other people as well!) regularly face the problem of determining a diagonal matrix D such that ‖DMD^{-1}‖ < 1, where M is some given matrix. Since

‖DMD^{-1}‖ < 1 ⇐⇒ D^{->}M>D>DMD^{-1} ≺ I ⇐⇒ M>D>DM ≺ D>D ⇐⇒ M>XM − X ≺ 0

where X := D>D ≻ 0, we see that the existence of such a matrix is an LMI feasibility problem in which the unknown X needs to be taken in the set of (positive definite) diagonal matrices.
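The chain of equivalences can be checked on numbers. In the ad-hoc instance below, M = [[0, 2], [0.25, 0]] has spectral norm 2, but the scaling D = diag(1, √8) gives ‖DMD^{-1}‖ ≈ 0.707 < 1; correspondingly, X = D>D = diag(1, 8) satisfies M>XM − X ≺ 0.

```python
# mu-analysis example in numbers: for M = [[0, 2], [0.25, 0]] the scaled
# matrix D M D^{-1} with D = diag(1, sqrt(8)) has norm ~ 0.707 < 1, and
# X = D^T D = diag(1, 8) satisfies M^T X M - X < 0.  Ad-hoc data.

M = [[0.0, 2.0], [0.25, 0.0]]
X = [1.0, 8.0]                       # diagonal entries of X = D^T D

# W = M^T X M - X, written out entry-by-entry for diagonal X (2x2 case):
W = [[M[0][0] ** 2 * X[0] + M[1][0] ** 2 * X[1] - X[0],
      M[0][0] * M[0][1] * X[0] + M[1][0] * M[1][1] * X[1]],
     [M[0][0] * M[0][1] * X[0] + M[1][0] * M[1][1] * X[1],
      M[0][1] ** 2 * X[0] + M[1][1] ** 2 * X[1] - X[1]]]

# W < 0 iff -W has positive leading principal minors (Sylvester).
assert -W[0][0] > 0 and W[0][0] * W[1][1] - W[0][1] * W[1][0] > 0
print(W)  # [[-0.5, 0.0], [0.0, -4.0]]
```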

Example 3: singular value minimization

Let F : X → S be an affine function and let σmax(·) denote the maximal singular value of a real symmetric matrix. Consider the problem to minimize f (x) := σmax(F(x)) over x. Clearly,

f (x) < γ ⇐⇒ λmax( F>(x)F(x) ) < γ² ⇐⇒ (1/γ) F>(x)F(x) − γ I ≺ 0 ⇐⇒ ( γ I  F(x) ; F>(x)  γ I ) ≻ 0

where the latter inequality is obtained by taking a Schur complement. If we define

y := col(γ, x), G(y) := −( γ I  F(x) ; F>(x)  γ I ), g(y) := γ,

then G is an affine function of y and the problem to minimize f over x is equivalent to the problem to minimize g over all y such that G(y) ≺ 0. Hence, this is an optimization problem with an LMI constraint and a linear objective function g.

Example 4: simultaneous stabilization

Consider k linear time-invariant systems of the form

ẋ = Ai x + Bi u

where Ai ∈ Rn×n and Bi ∈ Rn×m, i = 1, …, k. The question of simultaneous stabilization amounts to finding a state feedback law u = Fx with F ∈ Rm×n such that each of the k autonomous systems ẋ = (Ai + Bi F)x, i = 1, …, k, is asymptotically stable. Using Example 1 above, this problem is solved when we can find matrices F and Xi, i = 1, …, k, such that for each of these i's

{ Xi ≻ 0
  (Ai + Bi F)>Xi + Xi(Ai + Bi F) ≺ 0. (1.4.9)


Since both Xi and F are unknown, this is not a system of LMI's in the variables Xi and F. One way out of this inconvenience is to require that X1 = … = Xk =: X. If we introduce new variables Y := X⁻¹ and K := FY and multiply the second inequality in (1.4.9) on the left and right by Y, then (1.4.9) reads

{ Y ≻ 0
  Ai Y + Y Ai> + Bi K + K>Bi> ≺ 0

for i = 1, …, k. The latter is a system of LMI's in the variables Y and K. The joint stabilization problem therefore has a solution F = KY⁻¹ if the latter system of LMI's is feasible. We will see in Chapter 3 that the quadratic function V(x) := x>Xx serves as a joint Lyapunov function for the k autonomous systems.
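The change of variables Y = X⁻¹, K = FY can be verified numerically for a single system (a sketch; the data A, B and the gain F below are invented for illustration):

```python
import numpy as np

def lyapunov_solve(A, Q):
    # Solve A^T X + X A = -Q by vectorization.
    n = A.shape[0]
    M = np.kron(np.eye(n), A.T) + np.kron(A.T, np.eye(n))
    return np.linalg.solve(M, -Q.flatten(order="F")).reshape((n, n), order="F")

# Invented data: one pair (A, B) and a gain F placing both closed-loop poles at -1.
A = np.array([[1.0, 0.0], [1.0, 1.0]])
B = np.array([[1.0], [0.0]])
F = np.array([[-4.0, -4.0]])
Acl = A + B @ F                       # closed-loop matrix, eigenvalues -1, -1

X = lyapunov_solve(Acl, np.eye(2))    # (A + BF)^T X + X (A + BF) = -I, X > 0
Y = np.linalg.inv(X)
K = F @ Y

lhs = A @ Y + Y @ A.T + B @ K + K.T @ B.T   # equals (A + BF) Y + Y (A + BF)^T
assert np.all(np.linalg.eigvalsh(X) > 0)
assert np.all(np.linalg.eigvalsh(0.5 * (lhs + lhs.T)) < 0)
```

Indeed, A Y + Y A> + B K + K>B> = (A + BF)Y + Y(A + BF)>, which is the original Lyapunov inequality congruence-transformed by Y.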

Example 5: evaluation of quadratic cost

Consider the linear autonomous system

ẋ = Ax (1.4.10)

together with an arbitrary (but fixed) initial value x(0) = x0 and the criterion function J := ∫0∞ x>(t)Qx(t) dt where Q = Q> ⪰ 0. Assume that the system is asymptotically stable. Then all solutions x of (1.4.10) are square integrable so that J < ∞. Now consider the non-strict linear matrix inequality A>X + X A + Q ⪯ 0. For any solution X = X> of this LMI we can differentiate x>(t)Xx(t) along solutions x of (1.4.10) to get

d/dt [x>(t)Xx(t)] = x>(t)[A>X + X A]x(t) ≤ −x>(t)Qx(t).

If we assume that X ⪰ 0, then integrating the latter inequality from t = 0 till ∞ yields the upper bound

J = ∫0∞ x>(t)Qx(t) dt ≤ x0>X x0,

where we used that lim t→∞ x(t) = 0. Moreover, the smallest upper bound on J is obtained by minimizing the function f(X) := x0>X x0 over all X = X> which satisfy X ⪰ 0 and A>X + X A + Q ⪯ 0. Again, this is an optimization problem with an LMI constraint.
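In the equality case A>X + X A + Q = 0 the bound is attained, so the cost can be computed exactly from the Lyapunov solution. A numerical sketch (the data A, Q and x0 are invented for illustration):

```python
import numpy as np

def lyapunov_solve(A, Q):
    # Solve A^T X + X A = -Q by vectorization.
    n = A.shape[0]
    M = np.kron(np.eye(n), A.T) + np.kron(A.T, np.eye(n))
    return np.linalg.solve(M, -Q.flatten(order="F")).reshape((n, n), order="F")

A = np.array([[0.0, 1.0], [-2.0, -3.0]])  # asymptotically stable
Q = np.eye(2)
x0 = np.array([1.0, 0.0])

X = lyapunov_solve(A, Q)   # equality case A^T X + X A + Q = 0
J_lmi = x0 @ X @ x0        # = 1.25 for this data

# Check against direct numerical integration of the criterion (forward Euler).
dt, T = 1e-3, 20.0
x, J_num = x0.copy(), 0.0
for _ in range(int(T / dt)):
    J_num += (x @ Q @ x) * dt
    x = x + dt * (A @ x)
assert abs(J_num - J_lmi) < 1e-2
```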

Example 6: a Leontief economy

A manufacturer is able to produce n different products from m different resources. Assume that the selling price of product j is pj and that it takes the manufacturer aij units of resource i to produce one unit of product j. Let xj denote the amount of product j that is to be produced and let ai denote the number of available units of resource i, i = 1, …, m. A smart manager advised the manufacturer to maximize the profit

p(x1, …, xn) := p1x1 + p2x2 + … + pnxn,


but the manager can do this only subject to the production constraints

a11x1 + a12x2 + . . . + a1nxn ≤ a1

a21x1 + a22x2 + . . . + a2nxn ≤ a2

...

am1x1 + am2x2 + . . . + amnxn ≤ am

and xj ≥ 0, j = 1, …, n. Note that the manufacturer faces an optimization problem with a linear objective, subject to a system of non-strict linear matrix inequalities of dimension 1 × 1, that is, ordinary linear inequalities.
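Assuming SciPy is available, such a problem can be sketched with scipy.optimize.linprog (all numbers below are invented for illustration):

```python
import numpy as np
from scipy.optimize import linprog

# Invented data: 2 products, 3 resources.
p = np.array([3.0, 5.0])            # selling prices p_j
A = np.array([[1.0, 0.0],           # a_ij: units of resource i per unit of product j
              [0.0, 2.0],
              [3.0, 2.0]])
a = np.array([4.0, 12.0, 18.0])     # available units a_i

# linprog minimizes, so maximize the profit p^T x by minimizing -p^T x.
res = linprog(c=-p, A_ub=A, b_ub=a, bounds=(0, None))
assert res.success
print("optimal production:", res.x, "maximal profit:", -res.fun)  # x = (2, 6), profit 36
```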

Wassily Leontief was born in 1906 in St. Petersburg and won the 1973 Nobel Prize in Economics. Among many other things, he used input-output analysis to study the characteristics of trade flows between the U.S. and other countries.

1.4.4 How are they solved?

The problems defined in the previous subsection can be solved with efficient numerical tools. In this section we discuss the basic ideas behind the 'LMI solvers'.

Ellipsoid method

We first give a solution which is based on the Ellipsoid Algorithm 1.21 discussed in Section 1.3. As mentioned before, this algorithm is simple, numerically robust and easy to implement, but may be slow for larger optimization problems.

We will apply this algorithm to the feasibility problem defined in Subsection 1.4.3. Let F : Rn → S be affine. Recall that F(x) ≺ 0 if and only if λmax(F(x)) < 0. Define, for x ∈ Rn, the function f(x) := λmax(F(x)) and consider the optimal value Vopt := inf x∈Rn f(x). Clearly, the LMI F(x) ≺ 0 is feasible if and only if Vopt < 0, and infeasible if and only if Vopt ≥ 0.

There are a few observations to make in order to apply Proposition 1.19 and the ellipsoid algorithm to this optimization problem. The first one is to establish that f is a convex function. Indeed, S is convex and for all 0 < α < 1 and x1, x2 ∈ S we have that

f(αx1 + (1 − α)x2) = λmax(F(αx1 + (1 − α)x2))
= λmax(αF(x1) + (1 − α)F(x2))
≤ α λmax(F(x1)) + (1 − α) λmax(F(x2))
= α f(x1) + (1 − α) f(x2)


which shows that f is convex. Second, to apply Step 1 of the algorithm, for any x0 we need to determine a subgradient g of f at the point x0. To do this, we will use the fact that

f(x) = λmax(F(x)) = max{u>F(x)u | u>u = 1}.

This means that for an arbitrary x0 ∈ S we can determine a vector u0 ∈ RN, depending on x0, with u0>u0 = 1 such that λmax(F(x0)) = u0>F(x0)u0. But then

f(x) − f(x0) = max{u>F(x)u | u>u = 1} − u0>F(x0)u0
≥ u0>F(x)u0 − u0>F(x0)u0
= u0>(F(x) − F(x0))u0.

The last expression is an affine functional which vanishes at x0. This means that the right-hand side of this expression must be of the form 〈g, x − x0〉 for some vector g ∈ Rn. To obtain g, we can write

u0>F(x)u0 = u0>F0u0 + Σ_{j=1}^n xj (u0>Fj u0) = g0 + 〈g, x〉

where g0 := u0>F0u0 and the components of g are given by gj := u0>Fj u0. In particular, we obtain that f(x) − f(x0) ≥ 〈g, x − x0〉. The remaining steps of the ellipsoid algorithm can be applied in a straightforward way.
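The subgradient construction can be implemented in a few lines. The sketch below (an invented affine F(x) = F0 + x1 F1 + x2 F2 with random symmetric coefficients, not from the original text) computes f(x) = λmax(F(x)) together with the subgradient g and checks the subgradient inequality at random points:

```python
import numpy as np

rng = np.random.default_rng(1)

def sym(M):
    return 0.5 * (M + M.T)

# Invented affine map F(x) = F0 + x1*F1 + x2*F2 with random symmetric coefficients.
F0, F1, F2 = (sym(rng.standard_normal((4, 4))) for _ in range(3))
Fs = [F1, F2]

def F(x):
    return F0 + x[0] * F1 + x[1] * F2

def f_and_subgrad(x):
    # f(x) = lambda_max(F(x)) = max over unit u of u^T F(x) u; a maximizing unit
    # eigenvector u0 yields the subgradient components g_j = u0^T F_j u0.
    w, V = np.linalg.eigh(F(x))
    u0 = V[:, -1]                       # eigenvector of the largest eigenvalue
    g = np.array([u0 @ Fj @ u0 for Fj in Fs])
    return w[-1], g

x0 = np.array([0.3, -0.7])
f0, g = f_and_subgrad(x0)
for _ in range(100):                    # subgradient inequality f(x) - f(x0) >= <g, x - x0>
    x = x0 + rng.standard_normal(2)
    fx, _ = f_and_subgrad(x)
    assert fx - f0 >= g @ (x - x0) - 1e-10
```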

Interior point methods

A major breakthrough in convex optimization was the introduction of interior point methods. These methods were developed in a series of papers [11] and became of real interest in the context of LMI problems through the work of Yurii Nesterov and Arkadii Nemirovskii [20].

The main idea is as follows. Let F be an affine function and let S := {x | F(x) ≺ 0} be the domain of a convex function f : S → R which we wish to minimize. That is, we consider the convex optimization problem

Vopt = inf x∈S f(x).

To solve this problem (that is, to determine optimal or almost optimal solutions), it is first necessary to introduce a barrier function. This is a smooth function φ which is required to

(a) be strictly convex on the interior of S, and

(b) approach +∞ along each sequence of points {xn}, n = 1, 2, …, in the interior of S that converges to a boundary point of S.


Given such a barrier function φ, the constrained optimization problem to minimize f(x) over all x ∈ S is replaced by the unconstrained optimization problem to minimize the functional

ft(x) := t f(x) + φ(x) (1.4.11)

where t > 0 is a so-called penalty parameter. Note that ft is strictly convex on Rn. The main idea is to determine a mapping t 7→ x(t) that associates with any t > 0 a minimizer x(t) of ft. Subsequently, we consider the behavior of this mapping as the penalty parameter t varies. In almost all interior point methods, the latter unconstrained optimization problem is solved with the classical Newton-Raphson iteration technique to approximate the minimum of ft. Under mild assumptions and for a suitably defined sequence of penalty parameters tn with tn → ∞ as n → ∞, the sequence x(tn) with n ∈ Z+ will converge to a point xopt which is a solution of the original convex optimization problem. That is, the limit xopt := lim t→∞ x(t) exists and Vopt = f(xopt).

A small modification of this theme is obtained by replacing the original constrained optimization problem by the unconstrained optimization problem to minimize

gt(x) := φ0(t − f(x)) + φ(x) (1.4.12)

where t > t0 := Vopt and φ0 is a barrier function for the non-negative real half-axis. Again, the idea is to determine, for every t > t0, a minimizer x(t) of gt (typically using the classical Newton-Raphson algorithm) and to consider the 'path' t 7→ x(t) as a function of the penalty parameter t. The curve t 7→ x(t) with t > t0 is called the path of centers for the optimization problem. Under suitable conditions the solutions x(t) are analytic and have a limit as t ↓ t0, say xopt. The point xopt := lim t↓t0 x(t) is optimal in the sense that Vopt = f(xopt) since for t > t0, x(t) is feasible and satisfies f(x(t)) < t.
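A one-dimensional toy problem (invented for illustration, not from the text) makes the penalty formulation (1.4.11) and the resulting central path tangible: minimize f(x) = x subject to F(x) = diag(x − 2, −x − 2) ≺ 0, i.e. over the interval (−2, 2), with optimal value −2. For each t the minimizer x(t) of ft(x) = t x + φ(x) is found by bisection on the strictly increasing derivative:

```python
# Toy LMI: F(x) = diag(x - 2, -x - 2) < 0  <=>  -2 < x < 2; minimize f(x) = x.
# Barrier: phi(x) = log det(-F(x))^{-1} = -log(2 - x) - log(x + 2).
def ft_prime(x, t):
    # derivative of f_t(x) = t*x + phi(x)
    return t + 1.0 / (2.0 - x) - 1.0 / (x + 2.0)

def central_point(t):
    # f_t is strictly convex, so f_t' is increasing: bisect for its unique root.
    lo, hi = -2.0 + 1e-12, 2.0 - 1e-12
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if ft_prime(mid, t) < 0.0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

path = [(t, central_point(t)) for t in (0.1, 1.0, 10.0, 1e3, 1e6)]
for t, xt in path:
    print(f"t = {t}: x(t) = {xt:.6f}")            # x(t) moves toward the optimum -2
assert all(-2.0 < x < 2.0 for _, x in path)       # every x(t) is strictly feasible
assert abs(path[-1][1] - (-2.0)) < 1e-3           # x(t) -> -2 as t grows
```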

Interior point methods can be applied to either of the two LMI problems defined in the previous subsection. If we consider the feasibility problem associated with the LMI F(x) ≺ 0 then (f does not play a role and) one candidate barrier function is the logarithmic function

φ(x) := { log det(−F(x))⁻¹   if x ∈ S
        { ∞                  otherwise.

Under the assumption that the feasible set S is bounded and non-empty, it follows that φ is strictly convex and hence it defines a barrier function for the feasibility set S. By invoking Proposition 1.12, we know that there exists a unique xopt such that φ(xopt) is the global minimum of φ. The point xopt obviously belongs to S and is called the analytic center of the feasibility set S. It is usually obtained in a very efficient way from the classical Newton iteration

xk+1 = xk − (φ′′(xk))⁻¹ φ′(xk). (1.4.13)

Here φ′ and φ′′ denote the gradient and the Hessian of φ, respectively.

The convergence of this algorithm can be analyzed as follows. Since φ is strongly convex and sufficiently smooth, there exist numbers L and M such that for all vectors u with norm ‖u‖ = 1


there holds

u>φ′′(x)u ≥ M,
‖φ′′(x)u − φ′′(y)u‖ ≤ L‖x − y‖.

In that case,

‖φ′(xk+1)‖ ≤ (L / 2M²) ‖φ′(xk)‖²

so that whenever the initial value x0 is such that (L / 2M²) ‖φ′(x0)‖ < 1 the method is guaranteed to converge quadratically.

The idea will be to implement this algorithm in such a way that quadratic convergence can be guaranteed for the largest possible set of initial values x0. For this reason the iteration (1.4.13) is modified as follows:

xk+1 = xk − αk(λ(xk)) (φ′′(xk))⁻¹ φ′(xk)

where

αk(λ) := { 1            if λ < 2 − √3
         { 1/(1 + λ)    if λ ≥ 2 − √3

and λ(x) := √( φ′(x)> φ′′(x)⁻¹ φ′(x) ) is the so-called Newton decrement associated with φ. It is this damping factor that guarantees that xk will converge to the analytic center xopt, the unique minimizer of φ. It is important to note that the step size is variable in magnitude. The algorithm guarantees that xk is always feasible in the sense that xk ∈ S and that xk converges globally to a minimizer xopt of φ. It can be shown that φ(xk) − φ(xopt) ≤ ε whenever

k ≥ c1 + c2 log log(1/ε) + c3 (φ(x0) − φ(xopt))

where c1, c2 and c3 are constants. The first and second terms on the right-hand side do not depend on the optimization criterion and the specific LMI constraint. The second term can almost be neglected for small values of ε.
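The damped iteration can be demonstrated on a scalar toy barrier (a sketch; the set S = (−1, 1) with barrier φ(x) = −log(1 − x) − log(1 + x), whose analytic center is x = 0, is invented for illustration). Started very close to the boundary, the iterates remain feasible throughout and converge to the analytic center:

```python
import math

# Toy feasibility set S = (-1, 1); barrier phi(x) = -log(1 - x) - log(1 + x).
def phi_p(x):   # gradient of phi
    return 1.0 / (1.0 - x) - 1.0 / (1.0 + x)

def phi_pp(x):  # Hessian of phi
    return 1.0 / (1.0 - x) ** 2 + 1.0 / (1.0 + x) ** 2

x = 0.999                                          # start very close to the boundary
for _ in range(100):
    lam = math.sqrt(phi_p(x) ** 2 / phi_pp(x))     # Newton decrement (scalar case)
    alpha = 1.0 if lam < 2.0 - math.sqrt(3.0) else 1.0 / (1.0 + lam)
    x = x - alpha * phi_p(x) / phi_pp(x)           # damped Newton step
    assert -1.0 < x < 1.0                          # iterates remain feasible

assert abs(x) < 1e-8                               # converged to the analytic center
```

In the first iterations λ is large, so short damped steps are taken; once λ drops below 2 − √3 the full Newton steps kick in and convergence becomes quadratic.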

The optimization problem to minimize f(x) subject to the LMI F(x) ≺ 0 can be viewed as a feasibility problem for the LMI

Ft(x) := ( f(x) − t     0
              0       F(x) ) ≺ 0

where t > t0 := inf x∈S f(x) is a penalty parameter. Using the same barrier function for this linear matrix inequality yields the unconstrained optimization problem to minimize

gt(x) := log det(−Ft(x))⁻¹ = log( 1/(t − f(x)) ) + log det(−F(x))⁻¹,

in which the first term is φ0(t − f(x)) and the second term is φ(x), so that gt is of the form (1.4.12). Due to the strict convexity of gt the minimizer x(t) of gt is unique for all t > t0. It can be shown that the sequence x(t) is feasible for all t > t0 and approaches the infimum inf x∈S f(x) as t ↓ t0.


1.4.5 What are dual LMI problems?

Let X be a finite dimensional vector space and define a linear objective function f : X → R by setting f(x) = c>x = 〈x, c〉 where c ∈ X. We consider the optimization problem to determine

Popt = inf x∈S f(x)

where the space of feasible decisions S is defined as

S = {x ∈ X | G(x) ⪯ 0, H(x) = 0}.

Here, G : X → Y and H : X → Z are affine functions, and Y and Z are finite dimensional spaces. To properly interpret the inequality G(x) ⪯ 0, the space Y is taken to be the set S of real symmetric matrices. Since G and H are affine, we can write G(x) = G0 + G1(x) and H(x) = H0 + H1(x) where G1 and H1 are linear transformations. This optimization problem is similar to the 'primal problem' which we discussed in Section 1.3.3, except for the fact that the constraints G(x) ⪯ 0 and H(x) = 0 are now matrix valued. Following the terminology of Section 1.3.3, this is an example of a linear convex program. The aim of this subsection is to generalize the duality results of Section 1.3.3 to the matrix valued case and to establish a precise formulation of the dual optimization problem.

The set Y = S of real symmetric matrices becomes a Hilbert space when equipped with the inner product 〈y1, y2〉 := trace(y1y2). Here, y1 and y2 are matrices in Y. Similarly, we equip Z with an inner product 〈·, ·〉, so as to make Z a Hilbert space as well.

With reference to Remark 1.18, define the closed convex cone K := {Y ∈ Y | Y ⪯ 0} and note that the inequality G(x) ⪯ 0 is equivalent to G(x) ∈ K. Obviously, every y ∈ Y defines a linear functional g(·) := 〈y, ·〉 on Y. Conversely, every linear functional g : Y → R uniquely defines an element y ∈ Y such that g(·) = 〈y, ·〉. In particular, for all k ∈ K the linear functional g(k) = 〈y, k〉 is non-positive if and only if y ⪰ 0 (in the sense that the smallest eigenvalue of y is non-negative). Hence, the Lagrangian L : X × Y × Z → R defined by

L(x, y, z) := 〈x, c〉 + 〈y, G(x)〉 + 〈z, H(x)〉

satisfies

ℓ(y, z) := inf x∈X L(x, y, z) ≤ inf x∈S 〈x, c〉 = Popt

for all y ∈ Y with y ⪰ 0 and all z ∈ Z, and the dual optimization problem amounts to determining

Dopt := sup y⪰0, z ℓ(y, z) = sup y⪰0, z inf x∈X L(x, y, z).

We then obtain the following generalization of Theorem 1.16.

Theorem 1.28 Under the conditions given in this subsection, if there exists (x0, y0, z0) ∈ X × Y × Z such that G(x0) ≺ 0, H(x0) = 0, y0 ≻ 0 and c + G1∗(y0) + H1∗(z0) = 0, then both the primal and the dual optimization problem admit optimal solutions and

Popt = min over G(x) ⪯ 0, H(x) = 0 of 〈x, c〉 = max over y ⪰ 0, c + G1∗(y) + H1∗(z) = 0 of 〈G0, y〉 + 〈H0, z〉 = Dopt.


Moreover, the triple (xopt, yopt, zopt) is optimal for both the primal and the dual problem if and only if

(a) G(xopt) ⪯ 0, H(xopt) = 0,

(b) yopt ⪰ 0, c + G1∗(yopt) + H1∗(zopt) = 0, and

(c) 〈yopt, G(xopt)〉 = 0.

Proof. Under the given feasibility conditions, the dual optimal value is

Dopt = max over y ⪰ 0, z of inf x [ 〈c, x〉 + 〈y, G0〉 + 〈z, H0〉 + 〈G1∗(y), x〉 + 〈H1∗(z), x〉 ]
= max over y ⪰ 0, z of inf x [ 〈c + G1∗(y) + H1∗(z), x〉 + 〈G0, y〉 + 〈H0, z〉 ]
= max over y ⪰ 0, c + G1∗(y) + H1∗(z) = 0 of 〈G0, y〉 + 〈H0, z〉

where the last equality follows by dualization of the dual problem. Since the dual problem satisfies the constraint qualification by assumption, the primal problem admits an optimal solution. Let (xopt, yopt, zopt) be optimal for both the primal and the dual problem. Then items (a) and (b) are immediate, and item (b) implies that 〈xopt, c〉 ≤ L(x, yopt, zopt) for all x. With x = xopt this yields 〈yopt, G(xopt)〉 ≥ 0. On the other hand, items (a) and (b) imply 〈yopt, G(xopt)〉 ≤ 0, so that we may conclude item (c). Conversely, if items (a), (b) and (c) hold, then it is easily verified that L satisfies the saddle-point property

L(xopt, y, z) ≤ L(xopt, yopt, zopt) ≤ L(x, yopt, zopt) for all x, all y ⪰ 0 and all z.

The first inequality shows that (yopt, zopt) is an optimal solution of the dual problem; likewise, the second inequality shows that xopt is optimal for the primal problem.

1.4.6 When were they invented?

Contrary to what many authors nowadays seem to suggest, the study of linear matrix inequalities in the context of dynamical systems and control goes back a long way in history and probably starts with the fundamental work of Aleksandr Mikhailovich Lyapunov on the stability of motion. Lyapunov was a school friend of Markov (yes, the one of the Markov parameters) and later a student of Chebyshev. Around 1890, Lyapunov made a systematic study of the local expansion and contraction properties of motions of dynamical systems around an attractor. He worked out the idea that an invariant set of a differential equation is stable in the sense that it attracts all solutions if one can find a function that is bounded from below and decreases along all solutions outside the invariant set.

Figure 1.2: Aleksandr Mikhailovich Lyapunov

Aleksandr Mikhailovich Lyapunov was born on May 25, 1857 and published in 1892 his work 'The General Problem of the Stability of Motion', in which he analyzed the question of stability of equilibrium motions of mechanical systems. This work served as his doctoral dissertation and was defended in September 1892 at Moscow University. Put into modern jargon, he studied the stability of differential equations of the form

ẋ = A(x)

where A : Rn → Rn is some analytic function and x is a vector of positions and velocities of material points taking values in a finite dimensional state space X = Rn. As Theorem I in Chapter 1, Section 16, it contains the statement3 that

if the differential equation of the disturbed motion is such that it is possible to finda definite function V of which the derivative V′ is a function of fixed sign which isopposite to that of V , or reduces identically to zero, the undisturbed motion is stable.

The intuitive idea behind this result is that the so-called Lyapunov function V can be viewed as a generalized 'energy function' (in the context of mechanical systems the kinetic and potential energies always served as typical Lyapunov functions). A system is then stable if it is 'dissipative' in the sense that the Lyapunov function decreases. Actually, this intuitive idea turns out to be extremely fruitful in understanding the role of linear matrix inequalities in many problems related to analysis and synthesis of systems. This is why we devote the next chapter to dissipative dynamical systems. We will consider stability issues in much more detail in a later chapter.

3Translation by A.T. Fuller as published in the special issue of the International Journal of Control in March 1992 andin [16].


1.5 Further reading

Foundations of convex sets and convex functions were developed around 1900, mainly through the work of Minkowski in [19]. Detailed modern treatments of the theory of convex functions and convex set analysis can be found in [1, 22, 24, 41]. The theory of convex programs and its relation with Lagrange multipliers and saddle points originates with the work by Kuhn and Tucker in [14]. The constraint qualification assumption is due to Slater. For details on the theory of subgradients we refer to [1, 25]. A standard work on optimization methods with applications in systems theory is the classical book by Luenberger [15]. Interior point methods were developed in a series of papers [11] and have led to major breakthroughs in LMI solvers in [20]. For general sources on linear algebra we refer to the classics [4] and [8].

1.6 Exercises

Exercise 1
Which of the following statements are true?

(a) ( 1  i
      i  1 ) ⪰ ( 0  1
                 1  0 ).

(b) A ≻ B implies that λmax(A) > λmax(B).

(c) λmax(A + B) ≤ λmax(A) + λmax(B).

(d) Write A ∈ H as A = X + iY with X and Y real. Then A ⪰ 0 if and only if

( X  Y
 −Y  X ) ⪰ 0.

(e) Let T ∈ Cn×m and A ∈ Hn. If A ⪰ 0 then T∗AT ⪰ 0.

(f) Let T ∈ Cn×m and A ∈ Hn. If T∗AT ⪰ 0 then A ⪰ 0.

Exercise 2
Give an example of a non-convex function f : S → R whose sublevel sets Sα are convex for all α ∈ R.

Exercise 3
Let f : S → R be a convex function.

(a) Show the so-called Jensen's inequality, which states that for a convex combination x = Σ_{i=1}^n αi xi of x1, …, xn ∈ S there holds that

f( Σ_{i=1}^n αi xi ) ≤ Σ_{i=1}^n αi f(xi).


Hint: A proof by induction on n may be the easiest.

(b) Show that conv(S) is equal to the set of all convex combinations of S.

Exercise 4
Let S be a subset of a finite dimensional vector space. The affine hull aff(S) of S is the intersection of all affine sets containing S (cf. Definition 1.5). An affine combination of x1, …, xn ∈ S is a point

x := Σ_{i=1}^n αi xi

where Σ_{i=1}^n αi = 1 (cf. Definition 1.3). Show that for any set S in Rn the affine hull aff(S) is affine and consists of all affine combinations of the elements of S (cf. Proposition 1.6).

Exercise 5
Let S and T be finite dimensional vector spaces and let f : S → T be an affine function. Show that

(a) f⁻¹ is affine whenever the inverse function exists.

(b) the graph {(x, y) | x ∈ S, y = f(x)} of f is affine.

(c) f(S′) is affine (as a subset of T) for every affine subset S′ ⊆ S.

Exercise 6
It should not be surprising that the notions of a convex set and a convex function are related. Let S ⊆ Rn. Show that a function f : S → R is convex if and only if its epigraph {(x, y) | x ∈ S, y ∈ R, f(x) ≤ y} is a convex set.

Exercise 7
Perform a feasibility test to verify the asymptotic stability of the system ẋ = Ax, where

A = (  0   1   0
       0   0   1
      −2  −3  −4 ).

Exercise 8
Show that

(a) the function f : Rn → R defined by the quadratic form f(x) = x>Rx + 2s>x + q is convex if and only if the n × n matrix R = R> ⪰ 0.

(b) the intersection of the sets Sj := {x ∈ Rn | x>Rj x + 2sj>x + qj ≤ 0}, where j = 1, …, k and Rj ⪰ 0, is convex.


What does Sj look like if Rj = 0? And what if sj = 0?

Exercise 9
Let S denote the vector space of (closed-loop) single-input single-output rational transfer functions. Let fmin : R+ → R and fmax : R+ → R be two functions such that fmin(·) ≤ fmax(·). Consider the following rather typical time and frequency domain specifications:

(a) Frequency response shaping: for γ ≥ 0,

Sγ = { S ∈ S | sup ω≥0 |fmax(ω) − S(iω)| ≤ γ }.

(b) Bandwidth shaping: for γ ≥ 0,

Sγ = { S ∈ S | fmin(ω) ≤ |S(iω)| ≤ fmax(ω) for all 0 ≤ ω ≤ 1/γ }.

(c) Step response shaping:

Sγ = { S ∈ S | fmin(t) ≤ s(t) ≤ fmax(t) for all t ≥ 0 }.

Here, s(t) = (1/2π) ∫ from −∞ to ∞ of (S(iω)/(iω)) e^iωt dω is the step response of the system S ∈ S.

(d) Over- and undershoot: for γ > 0,

Sγ = { S ∈ S | sup t≥0 max( s(t) − fmax(t), fmin(t) − s(t), 0 ) ≤ γ }

with s the step response as defined in item (c).

Verify for each of these specifications whether or not Sγ defines a convex subset of S.

Exercise 10
Consider the systems ẋ = Ai x + Bi u where i = 1, …, 4 and

A1 = ( 1  0
       1  1 ),  B1 = ( 1
                       0 ),  A2 = ( −1  2
                                     1  2 ),  B2 = ( 1
                                                     1 ),

A3 = ( 0  1
       1  0 ),  B3 = ( 0
                       1 ),  A4 = ( 0   2
                                    1  −1 ),  B4 = ( 2
                                                     1 ).

Find a state feedback law u = Fx such that each of the 4 autonomous systems ẋ = (Ai + Bi F)x, i = 1, …, 4, is asymptotically stable. (See Example 4 in Subsection 1.4.3.)

Exercise 11
In this exercise we investigate the stability of the linear time-varying system

ẋ = A(t)x (1.6.1)


where for all t ∈ R+ the matrix A(t) is a convex combination of the triple

A1 := ( −1     1
        −1  −0.2 ),  A2 := ( −1     1
                             −2  −0.7 ),  A3 := (   −2    1
                                                  −1.2  0.4 ).

That is,

A(t) ∈ conv(A1, A2, A3)

for all values of t ∈ R+. This is a polytopic model. It is an interesting fact that the time-varying system (1.6.1) is asymptotically stable, in the sense that for any initial condition x0 ∈ Rn the solution x(·) of (1.6.1) satisfies lim t→∞ x(t) = 0, whenever there exists a matrix X = X> ≻ 0 such that

A1>X + X A1 ≺ 0
A2>X + X A2 ≺ 0
A3>X + X A3 ≺ 0.

(We will come to this fact later!) If such an X exists then (1.6.1) is asymptotically stable irrespective of how fast the time variations of A(t) take place! Reformulate the question of asymptotic stability of (1.6.1) as a feasibility problem and find, if possible, a feasible solution X to this problem.

Exercise 12
Consider the dynamical system

ẋ = Ax + Bu

where x is an n-dimensional state and u is a scalar-valued input which is supposed to belong to U = {u : R → R | −1 ≤ u(t) ≤ 1 for all t ≥ 0}. Define the null controllable subspace of this system as the set

C := { x0 ∈ Rn | ∃ T ≥ 0 and u ∈ U such that x(T) = 0 },

i.e., the set of initial states that can be steered to the origin of the state space in finite time with constrained inputs. Show that C is a convex set.

Exercise 13
Let F : X → H be affine and suppose that the LMI F(x) ≺ 0 is feasible. Prove that there exists ε > 0 such that the LMI F(x) + εI ≺ 0 is also feasible. Does this statement hold with I replaced by an arbitrary Hermitian matrix?


Chapter 2

Dissipative dynamical systems

2.1 Introduction

The notion of dissipativity is a most important concept in systems theory, both for theoretical considerations as well as from a practical point of view. Especially in the physical sciences, dissipativity is closely related to the notion of energy. Roughly speaking, a dissipative system is characterized by the property that at any time the amount of energy which the system can conceivably supply to its environment cannot exceed the amount of energy that has been supplied to it. Stated otherwise, when time evolves, a dissipative system absorbs a fraction of its supplied energy and transforms it, for example, into heat, an increase of entropy, mass, electromagnetic radiation, or other kinds of energy 'losses'. In many applications, the question whether a system is dissipative or not can be answered from physical considerations on the way the system interacts with its environment. For example, by observing that the system is an interconnection of dissipative components, or by considering systems in which a loss of energy is inherent to the behavior of the system due to friction, optical dispersion, evaporation losses, etc.

In this chapter we will formalize the notion of a dissipative dynamical system for a very general class of systems. It will be shown that linear matrix inequalities occur in a very natural way in the study of linear dissipative systems. Perhaps the most appealing setting for studying LMI's in system and control theory is within the framework of dissipative dynamical systems. It will be shown that solutions of LMI's have a natural interpretation as storage functions associated with a dissipative system. This interpretation will play a key role in understanding the importance of LMI's in questions related to stability, performance, robustness, and a large variety of controller design problems.


2.2 Dissipative dynamical systems

2.2.1 Definitions and examples

Consider a continuous time, time-invariant dynamical system Σ described by the equations

ẋ = f(x, w) (2.2.1a)
z = g(x, w) (2.2.1b)

Here, x is the state, which takes its values in a state space X, w is the input taking its values in an input space W, and z denotes the output of the system, which assumes its values in the output space Z. Throughout this section, the precise representation of the system will not be relevant. What we need, though, is that for any initial condition x0 ∈ X and for any input w belonging to an input class W, there exist uniquely defined signals x : R+ → X and z : R+ → Z which satisfy (2.2.1) subject to x(0) = x0. Here, R+ = [0, ∞) is the time set. In addition, the output z is assumed to depend on w in a causal way; that is, if w1 ∈ W and w2 ∈ W are two input signals that are identical on [0, T], then the outputs z1 and z2 of (2.2.1) corresponding to the inputs w1 and w2 and the same (but arbitrary) initial condition x(0) = x0 are also identical on [0, T]. The system (2.2.1) therefore generates outputs from inputs and initial conditions, while future values of the inputs do not have an effect on past outputs. Let

s : W × Z → R

be a mapping and assume that for all t0, t1 ∈ R and for all input-output pairs (w, z) satisfying (2.2.1) the composite function s(w(t), z(t)) is locally absolutely integrable, i.e., ∫ from t0 to t1 of |s(w(t), z(t))| dt < ∞. The mapping s will be referred to as the supply function.

Definition 2.1 (Dissipativity) The system Σ with supply function s is said to be dissipative if there exists a function V : X → R such that

V(x(t0)) + ∫ from t0 to t1 of s(w(t), z(t)) dt ≥ V(x(t1)) (2.2.2)

for all t0 ≤ t1 and all signals (w, x, z) which satisfy (2.2.1). The pair (Σ, s) is said to be conservative if equality holds in (2.2.2) for all t0 ≤ t1 and all (w, x, z) satisfying (2.2.1).

Interpretation 2.2 The supply function (or supply rate) s should be interpreted as the supply delivered to the system. This means that s(w(·), z(·)) represents the rate at which supply flows into the system if the system generates the input-output pair (w(·), z(·)). In other words, in the time interval [0, T] work has been done on the system whenever ∫ from 0 to T of s(w(t), z(t)) dt is positive, while work is done by the system if this integral is negative. The function V is called a storage function and generalizes the notion of an energy function for a dissipative system. With this interpretation, inequality (2.2.2) formalizes the idea that a dissipative system is characterized by the property that the change of internal storage V(x(t1)) − V(x(t0)) in any time interval [t0, t1] will never exceed the amount of supply that flows into the system. This means that part of what is supplied to the system is stored, while the remaining part is dissipated. Inequality (2.2.2) is known as the dissipation inequality.


We stress that, contrary to the definition in the classical papers [45, 46], we do not require the storage function $V$ in (2.2.2) to be non-negative. This difference is an important one and stems mainly from applications in mechanical and thermodynamical systems, where energy or entropy functions need not be bounded from below. See Example 2.3.

If the composite function $S_V(t) := V(x(t))$ is differentiable, then (2.2.2) is equivalent to

$$\frac{d}{dt} S_V(t) = V_x(x(t)) f(x(t), w(t)) \le s(w(t), z(t))$$

for all $t$ and all solutions $(w, x, z)$ of (2.2.1). Here, $V_x$ denotes the gradient of $V$. This observation makes dissipativity of a dynamical system a local property in the sense that $(\Sigma, s)$ is dissipative if and only if

$$V_x(x) f(x, w) \le s(w, g(x, w)) \qquad (2.2.3)$$

holds for all points $x \in \mathbb{R}^n$ and $w \in \mathbb{R}^m$. In words, (2.2.3) states that the rate of change of storage along trajectories of the system will never exceed the rate of supply. We will refer to (2.2.3) as the differential dissipation inequality.

The classical motivation for the study of dissipativity comes from circuit theory. In the analysis of electrical networks the product of voltages and currents at the external branches of a network, i.e. the power, is an obvious supply function. Similarly, the product of forces and velocities of masses is a candidate supply function in mechanical systems. For those familiar with the theory of bond-graphs we remark that every bond-graph can be viewed as a representation of a dissipative dynamical system where inputs and outputs are taken to be effort and flow variables and the supply function is the product of these two variables. A bond-graph is therefore a special case of a dissipative system.

Example 2.3 Consider a thermodynamic system at uniform temperature $T$ on which mechanical work is being done at rate $W$ and which is being heated at rate $Q$. Let $(T, Q, W)$ be the external variables of such a system and assume that, either by physical or chemical principles or through experimentation, the mathematical model of the thermodynamic system has been decided upon and is given by the time-invariant system (2.2.1). The first and second law of thermodynamics may then be formulated in the sense of Definition 2.1 by saying that the system $\Sigma$ is conservative with respect to the supply function $s_1 := W + Q$ and dissipative with respect to the supply function $s_2 := -Q/T$. Indeed, the first law of thermodynamics states that for all system trajectories $(T, Q, W)$ and all time instants $t_0 \le t_1$,

$$E(x(t_0)) + \int_{t_0}^{t_1} \bigl( Q(t) + W(t) \bigr) \, dt = E(x(t_1))$$

(conservation of thermodynamic energy). The second law of thermodynamics states that all system trajectories satisfy

$$S(x(t_0)) + \int_{t_0}^{t_1} -\frac{Q(t)}{T(t)} \, dt \ge S(x(t_1))$$

for all $t_0 \le t_1$. Here, $E$ is called the internal energy and $S$ the entropy. The first law promises that the change of internal energy is equal to the heat absorbed by the system and the mechanical work which is done on the system. The second law states that the entropy decreases at a higher


rate than the quotient of absorbed heat and temperature. It follows that thermodynamic systems are dissipative with respect to two supply functions. Nernst's third law of thermodynamics (the entropy of any object at zero temperature is zero) is only a matter of scaling of the entropy function $S$ and does not further constrain the trajectories of the system.

Example 2.4 Other examples of supply functions $s : W \times Z \to \mathbb{R}$ are the quadratic forms

$$s(w, z) = w^\top z, \qquad s(w, z) = \|w\|^2 - \|z\|^2, \qquad s(w, z) = \|w\|^2 + \|z\|^2, \qquad s(w, z) = \|w\|^2,$$

which arise in network theory, bond-graph theory, scattering theory, $H_\infty$ theory, game theory, LQ-optimal control and $H_2$-optimal control theory. We will come across these examples in more detail below.
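All of these supply rates can be written in the common quadratic form $s(w, z) = \operatorname{col}(w, z)^\top P \operatorname{col}(w, z)$ used later in (2.3.2). A small numerical sketch (the identification of the $P$ blocks below is our own, not stated at this point in the text):

```python
import numpy as np

# Two of the supply rates of Example 2.4 written as quadratic forms
# [w; z]^T P [w; z] with P = [[Q, S], [S^T, R]]; the choices of Q, S, R
# below are our own identification of the blocks.

def supply(P, w, z):
    v = np.concatenate([w, z])
    return v @ P @ v

I = np.eye(2)
Z = np.zeros((2, 2))
P_passivity = np.block([[Z, 0.5 * I], [0.5 * I, Z]])   # s(w, z) = w^T z
P_gain      = np.block([[I, Z], [Z, -I]])              # s(w, z) = |w|^2 - |z|^2

w = np.array([1.0, 2.0])
z = np.array([3.0, -1.0])
assert np.isclose(supply(P_passivity, w, z), w @ z)
assert np.isclose(supply(P_gain, w, z), w @ w - z @ z)
```

The factor $1/2$ in the passivity block accounts for $w^\top z$ appearing twice (as $w^\top S z$ and $z^\top S^\top w$) in the symmetric form.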

There are a few refinements to Definition 2.1 which are worth mentioning. Definition 2.1 can be generalized to time-varying systems by letting the supply rate $s$ depend explicitly on time. Many authors have proposed a definition of dissipativity for discrete-time systems, but since we cannot think of any physical example of such a system, there seems little practical point in doing this. Another refinement consists of the idea that a system may be dissipative with respect to more than one supply function; see Example 2.3 for an example. Also, a notion of robust dissipativity may be developed in which the system description (2.2.1) is not assumed to be perfectly known, but uncertain to some well-defined extent. An uncertain system is then called robustly dissipative if (2.2.2) holds for all $t_0 \le t_1$ and all trajectories $(w, x, z)$ that can conceivably be generated by the uncertain system. See Section ?? in Chapter ?? for more details. The notion of strict dissipativity is a refinement of Definition 2.1 which will prove useful in the sequel. It is defined as follows.

Definition 2.5 (Strict dissipativity) The system $\Sigma$ with supply rate $s$ is said to be strictly dissipative if there exists a storage function $V : X \to \mathbb{R}$ and an $\varepsilon > 0$ such that

$$V(x(t_0)) + \int_{t_0}^{t_1} s(w(t), z(t)) \, dt - \varepsilon^2 \int_{t_0}^{t_1} \|w(t)\|^2 \, dt \ge V(x(t_1)) \qquad (2.2.4)$$

for all $t_0 \le t_1$ and all trajectories $(w, x, z)$ which satisfy (2.2.1).

A strictly dissipative system satisfies (2.2.2) with strict inequality, which justifies its name. As a final comment we mention the notion of cyclo dissipativity, which has been introduced in (??). For $T > 0$, the function $w : \mathbb{R} \to W$ is said to be $T$-periodic if for all $t \in \mathbb{R}$ we have that $w(t) = w(t + T)$. A system $\Sigma$ with supply function $s$ is called cyclo dissipative if for all $T > 0$ there holds

$$\int_0^T s(w(t), z(t)) \, dt \ge 0$$

for all $T$-periodic trajectories $(w(\cdot), z(\cdot))$ which satisfy (2.2.1). Cyclo dissipativity is therefore a system property defined in terms of $T$-periodic trajectories only. The importance of this notion lies in the fact that it avoids reference to the internal state space structure of the system and requires a condition on signals in the external (input-output) behavior of the system only. It is easily seen that a dissipative system is cyclo dissipative whenever the state $x$ is observable from $(w, z)$, that is, whenever $x$ is uniquely defined by any $(w, z)$ which satisfies (2.2.1). Conversely, under some mild minimality and connectability conditions on the state space $X$, a cyclo dissipative system is also dissipative. We refer to [44] for details.

2.2.2 A classification of storage functions

Suppose that $(\Sigma, s)$ is dissipative and let $x^* \in X$ be a fixed reference point in the state space of $\Sigma$. Instead of considering the set of all possible storage functions associated with $(\Sigma, s)$, we will restrict attention to the set of normalized storage functions defined by

$$\mathcal{V}(x^*) := \{ V : X \to \mathbb{R} \mid V(x^*) = 0 \text{ and (2.2.2) holds} \}.$$

Hence, $x^*$ is a reference point of neutral storage. Clearly, if $V$ is a storage function satisfying (2.2.2), then $\tilde V(x) := V(x) - V(x^*)$ defines a normalized storage function, since $\tilde V \in \mathcal{V}(x^*)$.

Two mappings $V_{av} : X \to \mathbb{R} \cup \{+\infty\}$ and $V_{req} : X \to \mathbb{R} \cup \{-\infty\}$ will play a crucial role in the sequel. They are defined by

$$V_{av}(x_0) := \sup \Bigl\{ -\int_0^{t_1} s(w(t), z(t)) \, dt \;\Bigm|\; t_1 \ge 0; \; (w, x, z) \text{ satisfies (2.2.1) with } x(0) = x_0 \text{ and } x(t_1) = x^* \Bigr\} \qquad (2.2.5a)$$

$$V_{req}(x_0) := \inf \Bigl\{ \int_{t_{-1}}^{0} s(w(t), z(t)) \, dt \;\Bigm|\; t_{-1} \le 0; \; (w, x, z) \text{ satisfies (2.2.1) with } x(0) = x_0 \text{ and } x(t_{-1}) = x^* \Bigr\} \qquad (2.2.5b)$$

Here, $V_{av}(x)$ denotes the maximal amount of internal storage that may be recovered from the system over all state trajectories starting in $x$ and eventually ending in $x^*$. Similarly, $V_{req}(x)$ reflects the minimal supply which needs to be delivered to the system in order to steer the state to $x$ via any trajectory originating in $x^*$. We refer to $V_{av}$ and $V_{req}$ as the available storage and the required supply (measured with respect to $x^*$). In (2.2.5) it is silently assumed that for $x_0 \in X$ there exists an input $w \in \mathcal{W}$ which steers the state from $x^*$ at some time instant $t_{-1} < 0$ to $x_0$ at time $t = 0$ and back to $x^*$ at time $t_1 > 0$. We call $x_0$ connectable with $x^*$ if this property holds. If such a loop can be run in finite time for any $x_0 \in X$, then we say that every state is connectable with $x^*$. The following characterization is the main result of this section.

Proposition 2.6 Let the system $\Sigma$ be represented by (2.2.1) and let $s$ be a supply function. Suppose that every state is connectable with $x^*$ for some $x^* \in X$. Then the following statements are equivalent:

(a) $(\Sigma, s)$ is dissipative.

(b) $-\infty < V_{av}(x) < \infty$ for all $x \in X$.

(c) $-\infty < V_{req}(x) < \infty$ for all $x \in X$.

Moreover, if one of these equivalent statements holds, then

(a) $V_{av}, V_{req} \in \mathcal{V}(x^*)$.

(b) $\{V \in \mathcal{V}(x^*)\} \Rightarrow \{$for all $x \in X$ there holds $V_{av}(x) \le V(x) \le V_{req}(x)\}$.

(c) $\mathcal{V}(x^*)$ is a convex set. In particular, $V_\alpha := \alpha V_{av} + (1 - \alpha) V_{req} \in \mathcal{V}(x^*)$ for all $\alpha \in (0, 1)$.

Interpretation 2.7 Proposition 2.6 confirms the intuitive idea that a dissipative system can neither supply nor store an infinite amount of energy during any experiment that starts or ends in a state of neutral storage. Proposition 2.6 shows that a system is dissipative if and only if the available storage and the required supply are real (finite) valued functions. Moreover, both the available storage and the required supply are possible storage functions of a dissipative system; these functions are normalized and define extremal storage functions in $\mathcal{V}(x^*)$ in the sense that $V_{av}$ is the smallest and $V_{req}$ is the largest element of $\mathcal{V}(x^*)$. In particular, for any state of a dissipative system, the available storage cannot exceed the required supply. In addition, convex combinations of the available storage and the required supply are candidate storage functions.

Proof. Let $(\Sigma, s)$ be dissipative, and let $V$ be a storage function. Since $V(x) - V(x^*)$ defines an element of $\mathcal{V}(x^*)$, it follows that $\mathcal{V}(x^*) \ne \emptyset$, so that we may as well assume that $V \in \mathcal{V}(x^*)$. Let $x_0 \in X$, $t_{-1} \le 0 \le t_1$ and let $(w, x, z)$ satisfy (2.2.1) with $x(t_{-1}) = x(t_1) = x^*$ and $x(0) = x_0$. Since $\Sigma$ is $x^*$-connectable, such trajectories exist. From (2.2.2) we then infer that

$$-\infty < -\int_0^{t_1} s(t) \, dt \le \int_{t_{-1}}^{0} s(t) \, dt < +\infty.$$

First take in this inequality the supremum over all $t_1 \ge 0$ and $(w, x, z)|_{[0, t_1]}$ which satisfy (2.2.1) with $x(0) = x_0$ and $x(t_1) = x^*$. This yields that $-\infty < V_{av}(x_0) < \infty$. Second, by taking the infimum over all $t_{-1} \le 0$ and $(w, x, z)|_{[t_{-1}, 0]}$ with $x(t_{-1}) = x^*$ and $x(0) = x_0$, we infer that $-\infty < V_{req}(x_0) < \infty$. Since $x_0$ is arbitrary, we obtain (b) and (c). To prove the converse implication, it suffices to show that $V_{av}$ and $V_{req}$ define storage functions. To see this, let $t_0 \le t_1 \le t_2$ and let $(w, x, z)$ satisfy (2.2.1) with $x(t_2) = x^*$. Then

$$V_{av}(x(t_0)) \ge -\int_{t_0}^{t_1} s(w(t), z(t)) \, dt - \int_{t_1}^{t_2} s(w(t), z(t)) \, dt.$$

Since this inequality holds for arbitrary $t_2 \ge t_1$ and arbitrary $(w, x, z)|_{[t_1, t_2]}$ (with $x(t_1)$ fixed and $x(t_2) = x^*$), we can take the supremum over all such trajectories to conclude that

$$V_{av}(x(t_0)) \ge -\int_{t_0}^{t_1} s(w(t), z(t)) \, dt + V_{av}(x(t_1)),$$

which shows that $V_{av}$ satisfies (2.2.2). In a similar manner it is seen that $V_{req}$ satisfies (2.2.2).

We next prove the remaining claims.

(a) We already proved that $V_{av}$ and $V_{req}$ are storage functions. It thus remains to show that $V_{av}(x^*) = V_{req}(x^*) = 0$. Obviously, $V_{av}(x^*) \ge 0$ and $V_{req}(x^*) \le 0$ (take $t_1 = t_{-1} = 0$ in (2.2.5)). Suppose that the latter inequalities are strict. Then, since the system is $x^*$-connectable, there exist $t_{-1} \le 0 \le t_1$ and a state trajectory $x$ with $x(t_{-1}) = x(0) = x(t_1) = x^*$ such that $-\int_0^{t_1} s(t) \, dt > 0$ and $\int_{t_{-1}}^{0} s(t) \, dt < 0$. But this contradicts the dissipation inequality (2.2.2), which requires both $\int_0^{t_1} s(t) \, dt \ge 0$ and $\int_{t_{-1}}^{0} s(t) \, dt \ge 0$. Thus, $V_{av}(x^*) = V_{req}(x^*) = 0$.

(b) If $V \in \mathcal{V}(x^*)$ then

$$-\int_0^{t_1} s(w(t), z(t)) \, dt \le V(x_0) \le \int_{t_{-1}}^{0} s(w(t), z(t)) \, dt$$

for all $t_{-1} \le 0 \le t_1$ and $(w, x, z)$ satisfying (2.2.1) with $x(t_{-1}) = x^* = x(t_1)$ and $x(0) = x_0$. Now take the supremum and infimum over all such trajectories to obtain that $V_{av}(x_0) \le V(x_0) \le V_{req}(x_0)$.

(c) Follows trivially from the dissipation inequality (2.2.2).

2.3 Linear dissipative systems with quadratic supply rates

In the previous section we analyzed the notion of dissipativity at a fairly high level of generality. In this section we will apply the above theory to linear input-output systems $\Sigma$ described by

$$\begin{pmatrix} \dot x \\ z \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} x \\ w \end{pmatrix} \qquad (2.3.1)$$

with state space $X = \mathbb{R}^n$, input space $W = \mathbb{R}^m$ and output space $Z = \mathbb{R}^p$. Let $x^* = 0$ be the point of neutral storage and consider supply functions that are general quadratic functions $s : W \times Z \to \mathbb{R}$ defined by

$$s(w, z) = \begin{pmatrix} w \\ z \end{pmatrix}^\top \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} w \\ z \end{pmatrix} = w^\top Q w + w^\top S z + z^\top S^\top w + z^\top R z. \qquad (2.3.2)$$

Here, the matrix

$$P := \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix}$$

is a real symmetric matrix (that is, $P \in \mathbb{S}^{m+p}$) which is partitioned conformably with $w$ and $z$. No a priori definiteness assumptions are made on $P$.


Substituting the output equation $z = Cx + Dw$ in (2.3.2) shows that (2.3.2) can equivalently be viewed as a quadratic function in the variables $x$ and $w$. Indeed,

$$s(w, z) = s(w, Cx + Dw) = \begin{pmatrix} x \\ w \end{pmatrix}^\top \begin{pmatrix} 0 & I \\ C & D \end{pmatrix}^\top \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} 0 & I \\ C & D \end{pmatrix} \begin{pmatrix} x \\ w \end{pmatrix}.$$

This quadratic form in $x$ and $w$ will be used frequently in the sequel.

2.3.1 Main results

The following theorem is the main result of this chapter. It provides necessary and sufficient conditions for the pair $(\Sigma, s)$ to be dissipative. In addition, it provides a complete parametrization of all normalized storage functions, together with a useful frequency domain characterization of dissipativity.

Theorem 2.8 Suppose that the system $\Sigma$ described by (2.3.1) is controllable and let the supply function $s$ be defined by (2.3.2). Then the following statements are equivalent.

(a) $(\Sigma, s)$ is dissipative.

(b) $(\Sigma, s)$ admits a quadratic storage function $V(x) := x^\top K x$ with $K = K^\top$.

(c) There exists $K = K^\top$ such that

$$F(K) := \begin{pmatrix} A^\top K + K A & K B \\ B^\top K & 0 \end{pmatrix} - \begin{pmatrix} 0 & I \\ C & D \end{pmatrix}^\top \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} 0 & I \\ C & D \end{pmatrix} \preceq 0. \qquad (2.3.3)$$

(d) There exists $K_- = K_-^\top$ such that $V_{av}(x) = x^\top K_- x$.

(e) There exists $K_+ = K_+^\top$ such that $V_{req}(x) = x^\top K_+ x$.

(f) For all $\omega \in \mathbb{R}$ with $\det(i\omega I - A) \ne 0$, the transfer function $T(s) := C(sI - A)^{-1} B + D$ satisfies

$$\begin{pmatrix} I \\ T(i\omega) \end{pmatrix}^* \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} I \\ T(i\omega) \end{pmatrix} \succeq 0. \qquad (2.3.4)$$

Moreover, if one of the above equivalent statements holds, then $V(x) := x^\top K x$ is a quadratic storage function in $\mathcal{V}(0)$ if and only if $F(K) \preceq 0$.

Proof. (a⇔d, e). If $(\Sigma, s)$ is dissipative then, by Proposition 2.6, $V_{av}$ and $V_{req}$ are storage functions. We claim that both $V_{av}$ and $V_{req}$ are quadratic functions of $x$. This follows from [?] upon noting that both $V_{av}$ and $V_{req}$ are defined as optimal values of a linear quadratic optimization problem. Hence, with $x^* = 0$, $V_{av}(x)$ is of the form $x^\top K_- x$ and $V_{req}(x)$ takes the form $x^\top K_+ x$ for some matrices $K_- = K_-^\top$ and $K_+ = K_+^\top$. The converse implication is obvious from Proposition 2.6.

(a⇔b). Using the previous argument, statement (a) implies statement (d). But (d) implies (b) by Proposition 2.6. Hence, (a) ⇒ (b). The reverse implication is trivial.

(b⇒c). If $V(x) = x^\top K x$ with $K \in \mathbb{S}^n$ is a storage function, then the differential dissipation inequality (2.2.3) can be rewritten as

$$2 x^\top K (A x + B w) \le s(w, C x + D w)$$

for all $x \in X$ and all $w \in W$. This is equivalent to the algebraic condition

$$\begin{pmatrix} x \\ w \end{pmatrix}^\top \underbrace{\left\{ \begin{pmatrix} A^\top K + K A & K B \\ B^\top K & 0 \end{pmatrix} - \begin{pmatrix} 0 & I \\ C & D \end{pmatrix}^\top \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} 0 & I \\ C & D \end{pmatrix} \right\}}_{F(K)} \begin{pmatrix} x \\ w \end{pmatrix} \le 0 \qquad (2.3.5)$$

for all $w$ and $x$. Controllability of the system implies that through every $x$ we can pass a state trajectory. Hence, (2.3.5) reduces to the condition that $K \in \mathbb{S}^n$ satisfies $F(K) \preceq 0$.

(c⇒b). If there exists $K \in \mathbb{S}^n$ such that $F(K) \preceq 0$, then (2.3.5) holds, from which it follows that $V(x) = x^\top K x$ is a storage function satisfying the dissipation inequality.

The equivalence (a⇔f) is an application of Lemma 2.12, which we present below. An alternative proof for the implication (a⇒f) can be given as follows. Let $\omega > 0$ be such that $\det(i\omega I - A) \ne 0$ and consider the (complex) input $w(t) = \exp(i\omega t) w_0$ with $w_0 \in \mathbb{R}^m$. Define $x(t) := \exp(i\omega t)(i\omega I - A)^{-1} B w_0$ and $z(t) := C x(t) + D w(t)$. Then $z(t) = \exp(i\omega t) T(i\omega) w_0$ and the (complex valued) triple $(w, x, z)$ is a $\tau$-periodic harmonic solution of (2.3.1) with $\tau = 2\pi/\omega$. Moreover,

$$s(w(t), z(t)) = w_0^* \begin{pmatrix} I \\ T(i\omega) \end{pmatrix}^* \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} I \\ T(i\omega) \end{pmatrix} w_0,$$

which is constant for all $t \in \mathbb{R}$. Now suppose that $(\Sigma, s)$ is dissipative. Then for all $k \in \mathbb{Z}$, $x(t_0) = x(t_0 + k\tau)$ and hence $V(x(t_0)) = V(x(t_0 + k\tau))$. For $t_1 = t_0 + k\tau$, the dissipation inequality (2.2.2) thus reads

$$\int_{t_0}^{t_1} s(w(t), z(t)) \, dt = \int_{t_0}^{t_1} w_0^* \begin{pmatrix} I \\ T(i\omega) \end{pmatrix}^* \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} I \\ T(i\omega) \end{pmatrix} w_0 \, dt = k\tau \, w_0^* \begin{pmatrix} I \\ T(i\omega) \end{pmatrix}^* \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} I \\ T(i\omega) \end{pmatrix} w_0 \ge 0.$$

Since $k\tau > 0$ and $w_0$ is arbitrary, this yields statement (f).

We recognize in (2.3.3) a non-strict linear matrix inequality. The matrix $F(K)$ is usually called the dissipation matrix. Observe that in the above theorem the set of quadratic storage functions in $\mathcal{V}(0)$ is completely characterized by the linear matrix inequality $F(K) \preceq 0$. In other words, the set of normalized quadratic storage functions associated with $(\Sigma, s)$ coincides with the feasibility set of the LMI $F(K) \preceq 0$. In particular, the available storage and the required supply are quadratic storage functions, and hence $K_-$ and $K_+$ also satisfy $F(K_-) \preceq 0$ and $F(K_+) \preceq 0$. Using Proposition 2.6, it moreover follows that any solution $K \in \mathbb{S}^n$ of $F(K) \preceq 0$ has the property that

$$K_- \preceq K \preceq K_+.$$

In other words, among the set of symmetric solutions $K$ of the LMI $F(K) \preceq 0$ there exists a smallest and a largest element. The inequality (2.3.4) is called the frequency domain inequality. The equivalence between statement (a) and the frequency domain characterization in statement (f) has a long history in system theory. The result goes back to V.M. Popov (1962), V.A. Yakubovich (1962) and R.E. Kalman (1963) and is an application of the Kalman-Yakubovich-Popov lemma, which we present and discuss in subsection 2.3.3 below.
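For a concrete system, checking whether a given symmetric $K$ is a storage-function certificate amounts to testing the eigenvalues of $F(K)$. The following sketch uses a scalar system and supply rate of our own choosing (the candidate $K = 1$ is our guess, not taken from the text):

```python
import numpy as np

# Scalar example (our own): x' = -x + w, z = x, supply s(w, z) = g^2 w^2 - z^2
# with g = 2, i.e. Q = g^2, S = 0, R = -1 in (2.3.2). We build the
# dissipation matrix F(K) of Theorem 2.8 and test F(K) <= 0 for K = 1.

A, B, C, D = -1.0, 1.0, 1.0, 0.0
g = 2.0
Q, S, R = g ** 2, 0.0, -1.0

def F(K):
    lyap = np.array([[2 * A * K, K * B],
                     [B * K,     0.0]])
    M = np.array([[0.0, 1.0], [C, D]])     # maps (x, w) to (w, z)
    P = np.array([[Q, S], [S, R]])
    return lyap - M.T @ P @ M

K = 1.0
# F(K) <= 0 iff all eigenvalues are non-positive; K = 1 certifies that
# V(x) = x^2 is a quadratic storage function for this (Sigma, s).
assert np.all(np.linalg.eigvalsh(F(K)) <= 1e-9)
```

In practice the feasibility set $\{K : F(K) \preceq 0\}$ is searched with a semidefinite-programming solver rather than by guessing $K$; the eigenvalue test above only verifies a given candidate.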

For conservative systems with quadratic supply functions a similar characterization can be given. The precise formulation is evident from Theorem 2.8 and is left to the reader. Strictly dissipative systems are characterized as follows.

Theorem 2.9 Suppose that the system $\Sigma$ is described by (2.3.1) and let $s$ be a supply function defined by (2.3.2). Then the following statements are equivalent.

(a) $(\Sigma, s)$ is strictly dissipative.

(b) There exists $K = K^\top$ such that

$$F(K) := \begin{pmatrix} A^\top K + K A & K B \\ B^\top K & 0 \end{pmatrix} - \begin{pmatrix} 0 & I \\ C & D \end{pmatrix}^\top \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} 0 & I \\ C & D \end{pmatrix} \prec 0. \qquad (2.3.6)$$

(c) For all $\omega \in \mathbb{R}$ with $\det(i\omega I - A) \ne 0$, the transfer function $T(s) := C(sI - A)^{-1} B + D$ satisfies

$$\begin{pmatrix} I \\ T(i\omega) \end{pmatrix}^* \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} I \\ T(i\omega) \end{pmatrix} \succ 0. \qquad (2.3.7)$$

Moreover, if one of the above equivalent statements holds, then $V(x) := x^\top K x$ is a quadratic storage function satisfying (2.2.4) for some $\varepsilon > 0$ if and only if $F(K) \prec 0$.

Proof. (a⇒c). By definition, statement (a) implies that for some $\varepsilon > 0$ the pair $(\Sigma, s')$ is dissipative with $s'(w, z) := s(w, z) - \varepsilon^2 \|w\|^2$. If $\Sigma$ is controllable, Theorem 2.8 yields that for all $\omega \in \mathbb{R}$ with $\det(i\omega I - A) \ne 0$,

$$\begin{pmatrix} I \\ T(i\omega) \end{pmatrix}^* \begin{pmatrix} Q - \varepsilon^2 I & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} I \\ T(i\omega) \end{pmatrix} = \begin{pmatrix} I \\ T(i\omega) \end{pmatrix}^* \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} I \\ T(i\omega) \end{pmatrix} - \varepsilon^2 I \succeq 0. \qquad (2.3.8)$$

The strict inequality (2.3.7) then follows. If $\Sigma$ is not controllable, we use a perturbation argument to arrive at the inequality (2.3.7). Indeed, for $\delta > 0$ let $B_\delta$ be such that $B_\delta \to B$ as $\delta \to 0$ and $(A, B_\delta)$ is controllable. Obviously, such $B_\delta$ exist. Define $T_\delta(s) := C(sI - A)^{-1} B_\delta + D$ and let $(w, x_\delta, z_\delta)$ satisfy $\dot x_\delta = A x_\delta + B_\delta w$, $z_\delta = C x_\delta + D w$. It follows from $B_\delta \to B$ that for every $w(t)$ which is bounded on the interval $[t_0, t_1]$, $x_\delta(t) \to x(t)$ and $z_\delta(t) \to z(t)$ pointwise in $t$ as $\delta \to 0$. Here, $(w, x, z)$ satisfy (2.3.1) and (2.2.4). Since $V$ and $s$ are continuous functions, the dissipation inequality (2.2.4) also holds for the perturbed system trajectories $(w, x_\delta, z_\delta)$. This means that the perturbed system is strictly dissipative which, by Theorem 2.8, implies that (2.3.8) holds with $T(i\omega)$ replaced by $T_\delta(i\omega) := C(i\omega I - A)^{-1} B_\delta + D$. Since $T(i\omega) = \lim_{\delta \to 0} T_\delta(i\omega)$ for every $\omega \in \mathbb{R}$ with $\det(i\omega I - A) \ne 0$, $T(i\omega)$ satisfies (2.3.8), which in turn yields (2.3.7).

(c⇒b) is a consequence of Lemma 2.12, presented below.

(b⇒a). If $K = K^\top$ satisfies $F(K) \prec 0$, then there exists $\varepsilon > 0$ such that

$$F'(K) := F(K) + \begin{pmatrix} 0 & 0 \\ 0 & \varepsilon^2 I \end{pmatrix} = F(K) + \begin{pmatrix} 0 & I \\ C & D \end{pmatrix}^\top \begin{pmatrix} \varepsilon^2 I & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & I \\ C & D \end{pmatrix} \preceq 0.$$

By Theorem 2.8, this implies that $(\Sigma, s')$ is dissipative with $s'(w, z) = s(w, z) - \varepsilon^2 \|w\|^2$. Inequality (2.2.4) therefore holds and we conclude that $(\Sigma, s)$ is strictly dissipative.

Remark 2.10 Contrary to Theorem 2.8, the system $\Sigma$ is not assumed to be controllable in Theorem 2.9.

The dissipation matrix $F(K)$ in (2.3.3) and (2.3.6) can be written in various equivalent and sometimes more convenient forms. Indeed,

$$F(K) = \begin{pmatrix} I & 0 \\ A & B \end{pmatrix}^\top \begin{pmatrix} 0 & K \\ K & 0 \end{pmatrix} \begin{pmatrix} I & 0 \\ A & B \end{pmatrix} - \begin{pmatrix} 0 & I \\ C & D \end{pmatrix}^\top \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} 0 & I \\ C & D \end{pmatrix}$$

$$= \begin{pmatrix} I & 0 \\ A & B \\ 0 & I \\ C & D \end{pmatrix}^\top \begin{pmatrix} 0 & K & 0 & 0 \\ K & 0 & 0 & 0 \\ 0 & 0 & -Q & -S \\ 0 & 0 & -S^\top & -R \end{pmatrix} \begin{pmatrix} I & 0 \\ A & B \\ 0 & I \\ C & D \end{pmatrix} = \begin{pmatrix} A^\top K + K A - C^\top R C & K B - (S C)^\top - C^\top R D \\ B^\top K - S C - D^\top R C & -Q - S D - (S D)^\top - D^\top R D \end{pmatrix}.$$

If we set $T := Q + S D + (S D)^\top + D^\top R D$ and use a Schur complement, then the LMI $F(K) \prec 0$ is equivalent to $T \succ 0$ and

$$A^\top K + K A - C^\top R C + \bigl( K B - (S C)^\top - C^\top R D \bigr) T^{-1} \bigl( B^\top K - S C - D^\top R C \bigr) \prec 0.$$

The latter is a quadratic inequality in the unknown $K$.
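The Schur-complement step is a purely linear-algebraic fact: for a symmetric block matrix $\bigl(\begin{smallmatrix} X & Y \\ Y^\top & -T \end{smallmatrix}\bigr)$ with $T \succ 0$, negative definiteness of the whole matrix is equivalent to $X + Y T^{-1} Y^\top \prec 0$. A minimal numerical illustration, with the blocks $X$, $Y$, $T$ chosen arbitrarily by us:

```python
import numpy as np

# Schur complement: M = [[X, Y], [Y^T, -T]] with T > 0 satisfies M < 0
# if and only if X + Y T^{-1} Y^T < 0. Both conditions are verified on a
# small hand-picked example (our own numbers, not from the text).

Y = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
T = np.eye(2)                        # T > 0
X = -4.0 * np.eye(3)                 # chosen so the Schur complement is < 0

M = np.block([[X, Y], [Y.T, -T]])
schur = X + Y @ np.linalg.inv(T) @ Y.T

# Both tests agree: the full block matrix and its Schur complement are
# simultaneously negative definite.
assert np.all(np.linalg.eigvalsh(M) < 0)
assert np.all(np.linalg.eigvalsh(schur) < 0)
```

This is exactly the step that turns the LMI $F(K) \prec 0$ into the quadratic (Riccati-type) inequality in $K$ displayed above.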

2.3.2 Dissipation functions and spectral factors

If the system $\Sigma$ is dissipative with respect to the supply function $s$, then for any storage function $V$ the inequality (2.2.3) implies that

$$d(x, w) := s(w, g(x, w)) - V_x(x) f(x, w) \qquad (2.3.9)$$

is non-negative for all $x$ and $w$. Conversely, if there exists a non-negative function $d$ and a differentiable $V : X \to \mathbb{R}$ for which (2.3.9) holds, then the pair $(\Sigma, s)$ is dissipative. The function $d$ quantifies the amount of supply that is dissipated in the system when it finds itself in state $x$ while the input $w$ is exerted. We will call $d : X \times W \to \mathbb{R}$ a dissipation function for $(\Sigma, s)$ if (2.3.9) is satisfied for a differentiable storage function $V : X \to \mathbb{R}$.

For linear systems with quadratic supply functions, (2.3.9) reads

$$d(x, w) = \begin{pmatrix} x \\ w \end{pmatrix}^\top \begin{pmatrix} 0 & I \\ C & D \end{pmatrix}^\top \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} 0 & I \\ C & D \end{pmatrix} \begin{pmatrix} x \\ w \end{pmatrix} - \begin{pmatrix} x \\ w \end{pmatrix}^\top \begin{pmatrix} A^\top K + K A & K B \\ B^\top K & 0 \end{pmatrix} \begin{pmatrix} x \\ w \end{pmatrix} = -\begin{pmatrix} x \\ w \end{pmatrix}^\top F(K) \begin{pmatrix} x \\ w \end{pmatrix}.$$

If $K = K^\top$ is such that $F(K) \preceq 0$ (or $F(K) \prec 0$), then the dissipation matrix can be factorized as

$$-F(K) = \begin{pmatrix} M_K & N_K \end{pmatrix}^\top \begin{pmatrix} M_K & N_K \end{pmatrix}, \qquad (2.3.10)$$

where $\begin{pmatrix} M_K & N_K \end{pmatrix}$ is a real matrix with $n + m$ columns and at least $r_K := \operatorname{rank}(F(K))$ rows. For any such factorization, the function

$$d(x, w) := (M_K x + N_K w)^\top (M_K x + N_K w) = \|M_K x + N_K w\|^2$$

is therefore a dissipation function. If we extend the system equations (2.3.1) with the output equation $v = M_K x + N_K w$, then the output $v$ incorporates the dissipated supply at each time instant, and we infer from (2.3.9) that the extended system

$$\begin{pmatrix} \dot x \\ z \\ v \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \\ M_K & N_K \end{pmatrix} \begin{pmatrix} x \\ w \end{pmatrix} \qquad (2.3.11)$$

becomes conservative with respect to the quadratic supply function $s'(w, z, v) := s(w, z) - v^\top v$.

This observation leads to an interesting connection between dissipation functions and spectral factorizations of rational functions. A complex valued rational function $\Phi : \mathbb{C} \to \mathbb{H}$ is called a spectral density if $\Phi(s) = \Phi^*(s)$ and $\Phi$ is analytic on the imaginary axis. A rational function $V$ is a spectral factor of $\Phi$ if $\Phi(s) = V^*(s) V(s)$ for all but finitely many $s \in \mathbb{C}$.

Theorem 2.11 Consider the spectral density

$$\Phi(s) = \begin{pmatrix} I \\ T(s) \end{pmatrix}^* \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} I \\ T(s) \end{pmatrix},$$

where $T(s) = C(sI - A)^{-1} B + D$, $A$ has no eigenvalues on the imaginary axis and $(A, B)$ is controllable. Then there exists a spectral factor of $\Phi$ if and only if there exists $K = K^\top$ such that $F(K) \preceq 0$. In that case, $V(s) := M_K (sI - A)^{-1} B + N_K$ is a spectral factor of $\Phi$, where $M_K$ and $N_K$ are defined by the factorization (2.3.10).


Proof. If $F(K) \preceq 0$ then $F(K)$ can be factorized as in (2.3.10), and for any such factorization the system (2.3.11) is conservative with respect to the supply function $s'(w, z, v) := s(w, z) - v^\top v$. Applying Theorem 2.8 for conservative systems, this means that

$$\begin{pmatrix} I \\ T(i\omega) \\ V(i\omega) \end{pmatrix}^* \begin{pmatrix} Q & S & 0 \\ S^\top & R & 0 \\ 0 & 0 & -I \end{pmatrix} \begin{pmatrix} I \\ T(i\omega) \\ V(i\omega) \end{pmatrix} = \Phi(i\omega) - V^*(i\omega) V(i\omega) = 0$$

for all $\omega \in \mathbb{R}$. But a rational function that vanishes identically on the imaginary axis vanishes for all $s \in \mathbb{C}$. Hence, we infer that $\Phi(s) = V^*(s) V(s)$ for all but finitely many $s \in \mathbb{C}$. Conversely, if no $K = K^\top$ exists with $F(K) \preceq 0$, it follows from Theorem 2.8 that $\Phi(i\omega) \succeq 0$ fails for some $\omega \in \mathbb{R}$, and hence $\Phi$ admits no spectral factorization on $\mathbb{C}^0$.

2.3.3 The Kalman-Yakubovich-Popov lemma

As mentioned in the proofs of Theorem 2.8 and Theorem 2.9, the Kalman-Yakubovich-Popov lemma is at the basis of the relation between frequency dependent matrix inequalities and the algebraic feasibility of a linear matrix inequality. We will use this important result at various instances. The lemma originates from a stability criterion for nonlinear feedback systems given by Popov in 1962 ([?]). Yakubovich and Kalman introduced the lemma by showing that the frequency condition of Popov is equivalent to the existence of a Lyapunov function.

We present a very general statement of the lemma which is free of any hypothesis on the system parameters. The proof which we present here is an elementary one, based on the dualization result stated in Theorem 1.16 of Chapter 1.

Lemma 2.12 (Kalman-Yakubovich-Popov) For any triple of matrices $A \in \mathbb{R}^{n \times n}$, $B \in \mathbb{R}^{n \times m}$ and $M = \begin{pmatrix} M_{11} & M_{12} \\ M_{21} & M_{22} \end{pmatrix} \in \mathbb{S}^{n+m}$, the following statements are equivalent:

(a) There exists a symmetric $K = K^\top$ such that

$$\begin{pmatrix} I & 0 \\ A & B \end{pmatrix}^\top \begin{pmatrix} 0 & K \\ K & 0 \end{pmatrix} \begin{pmatrix} I & 0 \\ A & B \end{pmatrix} + M \prec 0. \qquad (2.3.12)$$

(b) $M_{22} \prec 0$ and for all $\omega \in \mathbb{R}$ and complex vectors $\operatorname{col}(x, w) \ne 0$,

$$\begin{pmatrix} A - i\omega I & B \end{pmatrix} \begin{pmatrix} x \\ w \end{pmatrix} = 0 \quad \text{implies} \quad \begin{pmatrix} x \\ w \end{pmatrix}^* M \begin{pmatrix} x \\ w \end{pmatrix} < 0. \qquad (2.3.13)$$

If $(A, B)$ is controllable, the corresponding equivalence also holds for non-strict inequalities.

If

$$M = -\begin{pmatrix} 0 & I \\ C & D \end{pmatrix}^\top \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} 0 & I \\ C & D \end{pmatrix},$$

then statement (b) is equivalent to the condition that for all $\omega \in \mathbb{R}$ with $\det(i\omega I - A) \ne 0$,

$$\begin{pmatrix} I \\ C(i\omega I - A)^{-1} B + D \end{pmatrix}^* \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} I \\ C(i\omega I - A)^{-1} B + D \end{pmatrix} \succ 0.$$

Lemma 2.12 therefore reduces to the equivalence between the linear matrix inequality and the frequency domain inequality in Theorem 2.8 and Theorem 2.9, and thus completes the proofs of these theorems. We proceed with an elegant proof of Lemma 2.12.
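One direction of the strict lemma is easy to check numerically. The sketch below uses a scalar example entirely of our own choosing ($A = -1$, $B = 1$ and a hand-picked $M$): a $K$ satisfying (2.3.12) is exhibited, and the frequency condition (2.3.13) is then confirmed on a grid of sampled frequencies.

```python
import numpy as np

# KYP lemma, direction (a) => (b), for a scalar example (our own data):
# A = -1, B = 1, M = [[1, -1], [-1, -1]], K = 1.

A, B = -1.0, 1.0
M = np.array([[1.0, -1.0], [-1.0, -1.0]])
K = 1.0

# (2.3.12): the strict LMI holds for K = 1 (it evaluates to -I here).
lmi = np.array([[2 * A * K, K * B], [B * K, 0.0]]) + M
assert np.all(np.linalg.eigvalsh(lmi) < 0)

# (2.3.13): on the kernel of (A - iwI  B), i.e. x = (iwI - A)^{-1} B w
# (take w = 1), the quadratic form [x; w]* M [x; w] must be negative.
for om in np.logspace(-2, 2, 100):
    x = B / (1j * om - A)
    v = np.array([x, 1.0])
    assert (np.conj(v) @ M @ v).real < 0
```

The grid check is of course no proof of (b) for all $\omega$; the lemma guarantees it, and the sample merely illustrates the implication.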

Proof. (a⇒b). Let $w \in \mathbb{C}^m$ and let $(i\omega I - A) x = B w$ with $\omega \in \mathbb{R}$. Then $Ax + Bw = i\omega x$, so the first term in

$$\begin{pmatrix} x \\ w \end{pmatrix}^* \left[ \begin{pmatrix} I & 0 \\ A & B \end{pmatrix}^\top \begin{pmatrix} 0 & K \\ K & 0 \end{pmatrix} \begin{pmatrix} I & 0 \\ A & B \end{pmatrix} + M \right] \begin{pmatrix} x \\ w \end{pmatrix} = \begin{pmatrix} x \\ w \end{pmatrix}^* M \begin{pmatrix} x \\ w \end{pmatrix}$$

evaluates to $x^* K (i\omega x) + (i\omega x)^* K x = 0$, and the implication follows from (2.3.12).

(a⇐b). Suppose that (2.3.12) has no solution $K \in \mathbb{S}^n$. This means that the optimal value

$$P_{opt} := \inf \left\{ \gamma \;\middle|\; \exists K = K^\top \text{ such that } G(K, \gamma) := \begin{pmatrix} A^\top K + K A & K B \\ B^\top K & 0 \end{pmatrix} + M - \gamma I \preceq 0 \right\}$$

is non-negative. Note that this is a convex optimization problem with a linear objective function. As in subsection 1.4.5, let $\langle Y, X \rangle := \operatorname{trace}(Y X)$ be the natural inner product of the Hilbert space $\mathbb{S}^{n+m}$. Infer from Theorem 1.28 of Chapter 1 that

$$D_{opt} := \max_{Y \succeq 0} \inf_{K \in \mathbb{S}, \gamma \in \mathbb{R}} \gamma + \langle Y, G(K, \gamma) \rangle = \max_{Y \succeq 0} \inf_{K \in \mathbb{S}, \gamma \in \mathbb{R}} \left\{ \gamma + \langle Y, M \rangle + \left\langle Y, \begin{pmatrix} A & B \end{pmatrix}^\top K \begin{pmatrix} I & 0 \end{pmatrix} + \begin{pmatrix} I & 0 \end{pmatrix}^\top K \begin{pmatrix} A & B \end{pmatrix} - \gamma I \right\rangle \right\}$$

$$= \max \left\{ \langle Y, M \rangle \;\middle|\; Y \succeq 0, \; \begin{pmatrix} A & B \end{pmatrix} Y \begin{pmatrix} I & 0 \end{pmatrix}^\top + \begin{pmatrix} I & 0 \end{pmatrix} Y \begin{pmatrix} A & B \end{pmatrix}^\top = 0 \right\} = P_{opt}$$

is also non-negative. Moreover, by Theorem 1.16, there exists $Y = Y^\top$ such that

$$\langle Y, M \rangle = \operatorname{trace}(Y M) \ge 0, \qquad \begin{pmatrix} A & B \end{pmatrix} Y \begin{pmatrix} I & 0 \end{pmatrix}^\top + \begin{pmatrix} I & 0 \end{pmatrix} Y \begin{pmatrix} A & B \end{pmatrix}^\top = 0, \qquad Y \succeq 0. \qquad (2.3.14)$$

Partition $Y$ as the $(n + m) \times (n + m)$ matrix

$$Y = \begin{pmatrix} Y_{11} & Y_{12} \\ Y_{21} & Y_{22} \end{pmatrix}.$$

We claim that $Y \succeq 0$ implies that $Y_{21} = Y_{12}^\top = F Y_{11}$ for some $F \in \mathbb{R}^{m \times n}$. To see this, define $F$ on $\operatorname{im} Y_{11}$ by $F x := u$, where $u = Y_{21} v$ and $v$ is such that $Y_{11} v = x$, and let $F$ vanish on any complement of $\operatorname{im} Y_{11}$ in $\mathbb{R}^n$. Then $F$ is well defined unless there exist $v_1, v_2$ with $Y_{11} v_1 = Y_{11} v_2$ and $Y_{21} v_1 \ne Y_{21} v_2$; that is, $F$ is well defined if (and also only if) $\ker Y_{11} \subseteq \ker Y_{21}$. To see this inclusion, let $x \in \ker Y_{11}$ and $u \in \mathbb{R}^m$. Then $Y \succeq 0$ implies that for all $\alpha \in \mathbb{R}$, $\alpha x^\top Y_{12} u + \alpha u^\top Y_{21} x + u^\top Y_{22} u \ge 0$. Since $\alpha$ is arbitrary, this means that $x \in \ker Y_{21}$, so that $\ker Y_{11} \subseteq \ker Y_{21}$. Conclude that $Y_{21} = F Y_{11}$. The second expression in (2.3.14) now reads $(A + B F) Y_{11} + Y_{11} (A + B F)^\top = 0$, which shows that $\operatorname{im} Y_{11}$ is $(A + B F)$-invariant and $\ker Y_{11}$ is $(A + B F)^\top$-invariant. Hence there exists a basis of eigenvectors $x_j \ne 0$ of $\operatorname{im} Y_{11}$ for which $(A + B F) x_j = \lambda_j x_j$ and $x_j = Y_{11} z_j$ with $z_j \ne 0$. But then

$$0 = z_j^* \bigl[ (A + B F) Y_{11} + Y_{11} (A + B F)^\top \bigr] z_j = (\lambda_j + \bar\lambda_j) \underbrace{z_j^* Y_{11} z_j}_{> 0},$$

which yields that $\lambda_j \in \mathbb{C}^0$. Since

$$Y = \begin{pmatrix} I & 0 \\ F & I \end{pmatrix} \begin{pmatrix} Y_{11} & 0 \\ 0 & Y_{22} - F Y_{11} F^\top \end{pmatrix} \begin{pmatrix} I & 0 \\ F & I \end{pmatrix}^\top,$$

we may factorize $Y_{11} = U U^\top$ and $Y_{22} - F Y_{11} F^\top = V V^\top$ with $U$ and $V$ full rank matrices. Then

$$0 \le \operatorname{trace}(Y M) = \operatorname{trace}\left( \begin{pmatrix} U & 0 \\ 0 & V \end{pmatrix}^\top \begin{pmatrix} I & 0 \\ F & I \end{pmatrix}^\top M \begin{pmatrix} I & 0 \\ F & I \end{pmatrix} \begin{pmatrix} U & 0 \\ 0 & V \end{pmatrix} \right) = \operatorname{trace}(U^\top \tilde M_{11} U) + \operatorname{trace}(V^\top M_{22} V)

2.3.4 The positive real lemma

Consider the system (2.3.1) together with the quadratic supply function $s(w, z) = z^\top w + w^\top z$. Then the following result is worth mentioning as a special case of Theorem 2.8.

Corollary 2.13 Suppose that the system $\Sigma$ described by (2.3.1) is controllable and has transfer function $T$. Let $s(w, z) = z^\top w + w^\top z$ be a supply function. Then the following statements are equivalent:

(a) $(\Sigma, s)$ is dissipative.

(b) The LMI

$$\begin{pmatrix} I & 0 \\ A & B \\ 0 & I \\ C & D \end{pmatrix}^\top \begin{pmatrix} 0 & K & 0 & 0 \\ K & 0 & 0 & 0 \\ 0 & 0 & 0 & -I \\ 0 & 0 & -I & 0 \end{pmatrix} \begin{pmatrix} I & 0 \\ A & B \\ 0 & I \\ C & D \end{pmatrix} \preceq 0$$

is feasible.

(c) For all $\omega \in \mathbb{R}$ with $\det(i\omega I - A) \ne 0$ one has $T(i\omega)^* + T(i\omega) \succeq 0$.

Moreover, $V(x) = x^\top K x$ defines a quadratic storage function if and only if $K$ satisfies the above LMI.


Corollary 2.13 is known as the positive real lemma and has played a crucial role in questions related to the stability of control systems and the synthesis of passive electrical networks. Transfer functions which satisfy the third statement are generally called positive real. Note that for single-input single-output systems, positive realness of a transfer function is graphically verified by the condition that the Nyquist plot of the system lies entirely in the right half of the complex plane.
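The frequency-domain condition (c) can be checked numerically on a grid of frequencies. The sketch below (Python with NumPy; the function name and the sampled-grid approach are our own illustration, not part of the text) tests whether T(iω)∗ + T(iω) ⪰ 0 along the imaginary axis:

```python
import numpy as np

def is_positive_real(T, omegas):
    """Sampled check of condition (c): T(iw)* + T(iw) >= 0 at every
    tested frequency. A grid test, not a proof."""
    for w in omegas:
        M = np.atleast_2d(T(1j * w))
        H = M.conj().T + M               # the Hermitian expression in (c)
        if np.linalg.eigvalsh(H).min() < 0:
            return False
    return True

omegas = np.logspace(-3, 3, 500)
T1 = lambda s: 1.0 / (s + 1.0)           # Nyquist plot in the right half-plane
T2 = lambda s: (s - 1.0) / (s + 1.0)     # Re T2(iw) < 0 for |w| < 1

print(is_positive_real(T1, omegas))      # True
print(is_positive_real(T2, omegas))      # False
```

For single-input single-output systems this is exactly the graphical Nyquist-plot test mentioned above.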

2.3.5 The bounded real lemma

Consider the quadratic supply function

s(w, z) = γ²w>w − z>z    (2.3.15)

where γ ≥ 0. We obtain the following result as an immediate consequence of Theorem 2.8.

Corollary 2.14 Suppose that the system Σ described by (2.3.1) is controllable and has transfer function T. Let s(w, z) = γ²w>w − z>z be a supply function where γ ≥ 0. Then the following statements are equivalent:

(a) (Σ, s) is dissipative.

(b) The LMI

[ I 0 ; A B ; 0 I ; C D ]> [ 0 K 0 0 ; K 0 0 0 ; 0 0 −γ²I 0 ; 0 0 0 I ] [ I 0 ; A B ; 0 I ; C D ] ⪯ 0

is feasible.

(c) For all ω ∈ R with det(iωI − A) ≠ 0 one has T(iω)∗T(iω) ⪯ γ²I.

Moreover, V(x) = x>K x defines a quadratic storage function if and only if K satisfies the above LMI.
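For a concrete instance of statement (b), the block matrices of the LMI can be assembled directly. The sketch below (our own illustration in Python/NumPy; the scalar example system is an assumption, not taken from the text) builds the left-hand side for ẋ = −x + w, z = x, i.e. T(s) = 1/(s + 1), and verifies that K = 1 satisfies the LMI with γ = 1:

```python
import numpy as np

def bounded_real_lmi(A, B, C, D, K, gamma):
    """Left-hand side of the LMI in Corollary 2.14 (b); dissipativity
    requires this symmetric matrix to be negative semi-definite."""
    n, m = B.shape
    p = C.shape[0]
    outer = np.block([[np.eye(n), np.zeros((n, m))],
                      [A, B],
                      [np.zeros((m, n)), np.eye(m)],
                      [C, D]])
    mid = np.block([
        [np.zeros((n, n)), K, np.zeros((n, m)), np.zeros((n, p))],
        [K, np.zeros((n, n)), np.zeros((n, m)), np.zeros((n, p))],
        [np.zeros((m, n)), np.zeros((m, n)), -gamma**2 * np.eye(m), np.zeros((m, p))],
        [np.zeros((p, n)), np.zeros((p, n)), np.zeros((p, m)), np.eye(p)]])
    return outer.T @ mid @ outer

A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]);  D = np.array([[0.0]])
K = np.array([[1.0]])                      # candidate storage matrix

L = bounded_real_lmi(A, B, C, D, K, gamma=1.0)
print(np.linalg.eigvalsh(L).max() <= 1e-9)   # True: K = 1 is feasible
```

Here L evaluates to [[1 − 2K, K], [K, −γ²]]; with K = 1 and γ = 1 its eigenvalues are 0 and −2, so the LMI holds with equality on the boundary.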

Let us analyze the importance of this result. If the transfer function T of a system satisfies statement (c) of Corollary 2.14, then for all ω ∈ R for which iω is not an eigenvalue of A and all complex vectors ŵ(ω) ∈ Cm we have

ŵ(ω)∗T(iω)∗T(iω)ŵ(ω) ≤ γ²‖ŵ(ω)‖².

Now suppose that ŵ, viewed as a function of ω ∈ R, is square integrable in the sense that

‖ŵ‖₂² := (1/2π) ∫_{−∞}^{∞} ŵ(ω)∗ŵ(ω) dω < ∞.


Using Parseval's theorem, ŵ is the Fourier transform of a function w : R+ → W in that ŵ(ω) = ∫₀^∞ w(t)e^{−iωt} dt, and we have ‖ŵ‖₂ = ‖w‖₂ provided that the latter norm is defined as

‖w‖₂² = ∫₀^∞ w(t)>w(t) dt.

Let ẑ(ω) = T(iω)ŵ(ω). Then ẑ is the Fourier transform of the output z of (2.3.1) due to the input w and initial condition x(0) = 0. Consequently, statement (c) is equivalent to saying that

‖z‖₂² ≤ γ²‖w‖₂²

for all inputs w with ‖w‖₂ < ∞. That is, with initial condition x(0) = 0, the squared 2-norm of the output of (2.3.1) is uniformly bounded by γ² times the squared 2-norm of the input; equivalently, the L2-gain of the system is at most γ. This crucial observation is at the basis of H∞ optimal control theory and will be further exploited in the next chapter.
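The smallest such γ is the peak gain of T over the imaginary axis, and a lower bound on it can be estimated by a frequency sweep. A minimal sketch (Python/NumPy; the helper name and the sampled grid are our own illustration):

```python
import numpy as np

def gain_bound(A, B, C, D, omegas):
    """Peak of sigma_max(T(iw)) over a frequency grid, for
    T(s) = C (sI - A)^{-1} B + D. A lower bound on the smallest
    gamma for which statement (c) of Corollary 2.14 holds."""
    n = A.shape[0]
    peak = 0.0
    for w in omegas:
        T = C @ np.linalg.solve(1j * w * np.eye(n) - A, B) + D
        peak = max(peak, np.linalg.svd(T, compute_uv=False)[0])
    return peak

# T(s) = 1/(s + 1): the gain peaks at w = 0 with value 1.
A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]);  D = np.array([[0.0]])
print(gain_bound(A, B, C, D, np.logspace(-3, 3, 400)))  # close to 1.0
```

A sweep like this only samples the axis; certified bounds are obtained from the LMI of Corollary 2.14 itself.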

2.4 Interconnected dissipative systems

In this section we consider the question whether the interconnection of a number of dissipative dynamical systems is dissipative. To answer this question, we restrict attention to the case where just two dynamical systems are interconnected; generalizations to interconnections of more than two dynamical systems follow immediately from the same ideas. Consider the two dynamical systems

Σ1 : { ẋ1 = f1(x1, w1), z1 = g1(x1, w1) },    Σ2 : { ẋ2 = f2(x2, w2), z2 = g2(x2, w2) }.    (2.4.1)

We will first need to formalize what we mean by an interconnection of these systems. For this, assume that the input and output variables wi and zi (i = 1, 2) of each of these systems are partitioned as

wi = col(wi′, wi″),    zi = col(zi′, zi″).

The input space Wi and the output space Zi can therefore be written as the Cartesian products Wi = Wi′ × Wi″ and Zi = Zi′ × Zi″. We will think of the variables (wi′, zi′) as the external variables and (wi″, zi″) as the interconnection variables of the interconnected system. It will be assumed that dim Z1″ = dim W2″ and dim W1″ = dim Z2″. The interconnection constraint is then defined by the algebraic relations

z1″ = w2″,    w1″ = z2″    (2.4.2)

and the interconnected system is defined by the laws (2.4.1) of Σ1 and Σ2 together with the interconnection constraint (2.4.2). The idea behind this concept is displayed in Figure 2.1.

It is not evident that the joint equations (2.4.1)–(2.4.2) have a unique solution for every pair of input variables (w1′, w2′). This property is referred to as well posedness of the interconnection. We have decided to disappoint the reader and avoid a thorough discussion of this issue here. For the time being, we will assume that the interconnected system is well defined and takes (w1′, w2′) as its


[Figure: Σ1 and Σ2 with external signals (w1′, z1′) and (w2′, z2′) and interconnection signals z1″ = w2″ and w1″ = z2″.]

Figure 2.1: System interconnections

input variables and (z1′, z2′) as its outputs. Whenever well defined, the interconnected system will be denoted by Σ = Σ1 ⊓ Σ2.

We now turn to the question whether or not the interconnection is dissipative. Suppose that (Σi, si) is dissipative for i = 1, 2. Assume that the supply functions si admit an additive structure in the sense that there exist functions si′ : Wi′ × Zi′ → R and si″ : Wi″ × Zi″ → R such that

si(wi, zi) = si′(wi′, zi′) + si″(wi″, zi″)

for all wi ∈ Wi and zi ∈ Zi. The interconnection of Σ1 and Σ2 is then said to be neutral if

s1″(w1″, z1″) + s2″(w2″, z2″) = 0

for all w1″ ∈ W1″, z1″ ∈ Z1″, w2″ ∈ W2″ and z2″ ∈ Z2″. In words, this means that there is no dissipation in the interconnection variables, i.e., the interconnected system is conservative with respect to the supply function s″ : W1″ × W2″ → R defined as s″(w1″, w2″) := s1″(w1″, w2″) + s2″(w2″, w1″). Neutrality seems a rather natural requirement on many interconnected systems. The following result states that a neutral interconnection of dissipative systems is again dissipative.

Theorem 2.15 Let (Σi, si), i = 1, 2, be dissipative dynamical systems and suppose that the interconnection Σ = Σ1 ⊓ Σ2 is well defined and neutral. Then Σ is dissipative with respect to the supply function s : W1′ × Z1′ × W2′ × Z2′ → R defined as

s(w1′, z1′, w2′, z2′) := s1′(w1′, z1′) + s2′(w2′, z2′).

Moreover, if Vi : Xi → R is a storage function of (Σi, si), then V : X1 × X2 → R defined by V(x1, x2) := V1(x1) + V2(x2) is a storage function for (Σ, s).

Proof. Since (Σi, si), i = 1, 2, is dissipative, there exist V1 : X1 → R and V2 : X2 → R such that for all t0 ≤ t1,

V1(x1(t0)) + ∫_{t0}^{t1} [ s1′(w1′, z1′) + s1″(w1″, z1″) ] dt ≥ V1(x1(t1))

V2(x2(t0)) + ∫_{t0}^{t1} [ s2′(w2′, z2′) + s2″(w2″, z2″) ] dt ≥ V2(x2(t1))


Adding these two inequalities and applying the neutrality condition of the interconnection yields

V(x(t0)) + ∫_{t0}^{t1} [ s1′(w1′, z1′) + s2′(w2′, z2′) ] dt ≥ V(x(t1)),

i.e., (Σ, s) is dissipative with V as its storage function.

We finally remark that the above ideas can be easily generalized to interconnections of any finitenumber of dynamical systems.

2.5 Further reading

Many of the ideas on dissipative dynamical systems originate from the work of Willems in [45, 46]. The definitions of dissipative dynamical systems and their characterizations have been reported in [43, 44] and originate from the behavioral approach to dynamical systems. See also similar work in [39]. Extensions to nonlinear systems are discussed in [40]. The proof of the Kalman-Yakubovich-Popov lemma presented here is very much based on ideas in [42]. For alternative proofs of this important lemma, see [?, 23].

2.6 Exercises

Exercise 1
Show that for conservative controllable systems the set of normalized storage functions (normalized such that V(x∗) = 0) consists of one element only.

Conclude that storage functions of conservative systems are unique up to normalization!

Exercise 2
Show that the set of dissipation functions associated with a dissipative system is convex.

Exercise 3
Consider the suspension system Σ of a transport vehicle as depicted in Figure 2.2. The system is modeled by the equations

m2 q̈2 + b2(q̇2 − q̇1) + k2(q2 − q1) − F = 0
m1 q̈1 + b2(q̇1 − q̇2) + k2(q1 − q2) + k1(q1 − q0) + F = 0

where F (resp. −F) is a force acting on the chassis mass m2 (the axle mass m1). Here, q2 − q1 is the distance between chassis and axle, and q̈2 denotes the acceleration of the chassis mass m2. b2 is a damping coefficient and k1 and k2 are spring coefficients (b1 = 0). The variable q0 represents the road profile. A ‘real life’ set of system parameters is given in Table 2.1.


Figure 2.2: Model for suspension system

m1          m2          k1          k2          b2
1.5 × 10³   1.0 × 10⁴   5.0 × 10⁶   5.0 × 10⁵   50 × 10³

Table 2.1: Physical parameters

(a) Derive a state space model of the form (2.3.1) of the system which takes w = col(q0, F) and z = col(q1, q̇1, q2, q̇2) as its input and output, respectively.

(b) Define a supply function s : W × Z → R such that (Σ, s) is dissipative. (Base your definition on physical insight.)

(c) Characterize the set of all quadratic storage functions of the system as the feasibility set of a linear matrix inequality.

(d) Compute a quadratic storage function V(x) = x>K x for this system.

(e) Determine a dissipation function d : X × W → R for this system.

Exercise 4
Consider the transfer functions

(a) T1(s) = 1/(s + 1)

(b) T2(s) = (s − 1)/(s + 1)

(c) T3(s) = [ (s + 2)(s − 1)/(s + 1)²    (s + 3)/(s + 4) ; (s − 1)/(s + 0.5)    (s + 1)/(s + 2) ].

Determine for each of these transfer functions

(a) whether or not they are positive real and

(b) the smallest value of γ > 0 for which T(iω)∗T(iω) ⪯ γ²I for all ω ∈ R.


Reformulate these problems as feasibility tests involving suitably defined LMIs (see Corollaries 2.13 and 2.14).

Exercise 5
Consider the electrical circuit depicted below. We will be interested in modeling the relation between the external voltage V and the current I through the circuit. Assume that the resistors RC = 1 and RL = 1, the capacitor C = 2 and the inductance L = 1.

[Figure: circuit with resistors RC and RL, capacitor C and inductor L, driven by the external voltage V with current I.]

(a) Derive a linear, time-invariant system Σ that models the relation between the voltage V and the current I.

(b) Find a state space representation of the form (2.3.1) which represents Σ. Is the choice of input and output variable unique?

(c) Define a supply function s : W × Z → R such that (Σ, s) is dissipative.

(d) Characterize the set of all quadratic storage functions of the system as the feasibility set of a linear matrix inequality.

(e) Compute a quadratic storage function V(x) = x>K x of this system.

(f) Does dissipativity of (Σ, s) depend on whether the voltage V or the current I is taken as the input of your system?

Exercise 6
Consider the first-order unstable system P(s) = 1/(−3s + 1). It is desirable to design a feedback compensator C so that the feedback system is dissipative. Assume that the compensator is a simple gain C(s) = k, k ∈ R. Find the range of gains k that make the system depicted in Figure 2.3 dissipative with respect to the supply function s(w, z) = wz.

[Figure: feedback loop in which w enters a summing junction, followed by C and P in series, producing the output z.]

Figure 2.3: Feedback configuration


Exercise 7
Consider the example of an orbit of a mass m moving in a central conservative force field, the force being given by Newton's inverse square law: at time t, the mass m is at position z(t) ∈ R³ and moves under the influence of the Newtonian gravitational field of a second mass M, which is assumed to be at the origin of R³. Newton's inverse square law states that

d²z/dt² + (Mg/‖z‖³) z = 0.    (2.6.1)

Let v = ż denote the velocity vector of m and let x = col(z, v) be a state vector.

(a) Represent this (autonomous) system in the form ẋ = f(x), z = g(x).

(b) Consider the function

V(z, v) := (1/2) m‖v‖² − Mg/‖z‖

and show that V is a storage function of the system with supply function s(z) = 0.

(c) Prove that the orbit of the mass m is a hyperbola, parabola or ellipse depending on whether V > 0, V = 0 or V < 0 along solutions z of (2.6.1).

Conclude from the last item that Kepler's first law of planetary motion tells us that the solar system is an autonomous dissipative system with a negative storage function.


Chapter 3

Nominal stability and nominalperformance

3.1 Lyapunov stability

As mentioned in Chapter 1, Aleksandr Mikhailovich Lyapunov studied contraction and expansion phenomena of the motions of a mechanical system around an equilibrium. Translated into modern jargon, the study of Lyapunov stability concerns the asymptotic behavior of the state of an autonomous dynamical system. The main contribution of Lyapunov has been to define the concepts of stability, asymptotic stability and instability of such systems and to give a method for verifying these concepts in terms of the existence of functions, called Lyapunov functions. Both his definition and his verification method characterize, in a local way, the stability properties of an autonomous dynamical system. Unfortunately, for the general class of nonlinear systems there are no systematic procedures for finding Lyapunov functions. However, we will see that for linear systems the problem of finding Lyapunov functions can be solved adequately as a feasibility test of a linear matrix inequality.

3.1.1 Nominal stability of nonlinear systems

Let X be a set and T ⊆ R. A flow is a mapping φ : T × T × X → X which satisfies

(a) the consistency property: φ(t, t, x) = x, and

(b) the semi-group property: φ(t1, t−1, x0) = φ(t1, t0, φ(t0, t−1, x0)) for all t−1 ≤ t0 ≤ t1 and x0 ∈ X.


The set X is called the state space (or the phase space) and we will think of a flow as a state evolution map. A flow defines an unforced or autonomous dynamical system in the sense that the evolution of a flow is completely determined by an initial state and not by any kind of external input. A typical example of a flow is the solution of a differential equation of the form

ẋ(t) = f(x(t), t)    (3.1.1)

with finite dimensional state space X = Rn and where f : X × T → X is a function. By a solution of (3.1.1) over a (finite) time interval T ⊂ R we will mean a function x : T → X which is differentiable everywhere and which satisfies (3.1.1) for all t ∈ T. By a solution over the unbounded intervals R+, R− or R we will mean a solution over all finite intervals T contained in either of these sets. Let us denote by φ(t, t0, x0) the solution of (3.1.1) at time t with initial condition x(t0) = x0. If for all pairs (t0, x0) ∈ T × X one can guarantee existence and uniqueness of a solution φ(t, t0, x0) of (3.1.1) with t ∈ T, then the mapping φ satisfies the consistency and semi-group properties and therefore defines a flow. In that case, φ is said to be the flow associated with the differential equation (3.1.1). This flow is called time-invariant if

φ(t + τ, t0 + τ, x0) = φ(t, t0, x0)

for any τ ∈ T. It is said to be linear if X is a vector space and φ(t, t0, ·) is a linear mapping for all time instances t and t0. For time-invariant flows we usually take (without loss of generality) t0 = 0 as initial time. We remark that the condition on existence and uniqueness of solutions of (3.1.1) holds whenever f satisfies a global Lipschitz condition. That is, if X has the structure of a normed vector space with norm ‖ · ‖, and if for some L > 0, the inequality

‖f(x, t) − f(y, t)‖ ≤ L‖x − y‖

holds for all x, y ∈ X and t ∈ T.

An element x∗ is a fixed point or an equilibrium point of (3.1.1) if f(x∗, t) = 0 for all t ∈ T. It is easy to see that x∗ is a fixed point if and only if φ(t, t0, x∗) = x∗ is a solution of (3.1.1) for all t and t0. In other words, solutions of (3.1.1) remain in x∗ once they start there.

There exists a wealth of concepts to define the stability of a flow φ. The various notions of Lyapunov stability pertain to a distinguished fixed point x∗ of a flow φ and express to what extent another trajectory φ(t, t0, x0), whose initial state x0 lies in a neighborhood of x∗ at time t0, remains or gets close to x∗ for all time instances t ≥ t0.

Definition 3.1 (Lyapunov stability) Let φ : T × T × X → X be a flow and suppose that T = R and X is a normed vector space. The fixed point x∗ is said to be

(a) stable (in the sense of Lyapunov) if, given any ε > 0 and t0 ∈ T, there exists δ = δ(ε, t0) > 0 such that

‖x0 − x∗‖ ≤ δ  ⟹  ‖φ(t, t0, x0) − x∗‖ ≤ ε for all t ≥ t0.    (3.1.2)

(b) attractive if for all t0 ∈ T there exists δ = δ(t0) > 0 with the property that

‖x0 − x∗‖ ≤ δ  ⟹  lim_{t→∞} ‖φ(t, t0, x0) − x∗‖ = 0.    (3.1.3)

(c) exponentially stable if for all t0 ∈ T there exist δ = δ(t0) > 0, α = α(t0) > 0 and β = β(t0) > 0 such that

‖x0 − x∗‖ ≤ δ  ⟹  ‖φ(t, t0, x0) − x∗‖ ≤ β‖x0 − x∗‖e^{−α(t−t0)} for all t ≥ t0.    (3.1.4)

(d) asymptotically stable (in the sense of Lyapunov) if it is both stable (in the sense of Lyapunov) and attractive.

(e) unstable if it is not stable (in the sense of Lyapunov).

(f) uniformly stable (in the sense of Lyapunov) if, given any ε > 0, there exists δ = δ(ε) > 0 (not depending on t0) such that (3.1.2) holds for all t0 ∈ T.

(g) uniformly attractive if there exists δ > 0 (not depending on t0) such that (3.1.3) holds for all t0 ∈ T.

(h) uniformly exponentially stable if there exist δ > 0, α > 0 and β > 0 (not depending on t0) such that (3.1.4) holds for all t0 ∈ T.

(i) uniformly asymptotically stable (in the sense of Lyapunov) if it is both uniformly stable (in the sense of Lyapunov) and uniformly attractive.

In words, a fixed point is stable if all flows that initiate sufficiently close to x∗ at time t0 remain as close as desired to x∗ for all time t ≥ t0. Stated otherwise, a fixed point is stable if the mapping φ(t, t0, ·) is continuous at x∗, uniformly in t ≥ t0. The region of attraction associated with a fixed point x∗ is defined as the set of all initial states x0 ∈ X for which φ(t, t0, x0) → x∗ as t → ∞. If this region does not depend on t0, it is said to be uniform; if it coincides with X, then x∗ is globally attractive. Similarly, we can define the region of stability, the region of asymptotic stability and the region of exponential stability associated with x∗. Again, these regions are said to be uniform if they do not depend on t0. If these regions cover the entire state space X, then the fixed point is called globally stable, globally asymptotically stable, or globally exponentially stable, respectively.

We remark that there exist examples of stable fixed points that are not attractive, and examples ofattractive fixed points that are not stable. The notion of exponential stability is the strongest in thesense that an exponentially stable fixed point is also asymptotically stable (i.e., a stable attractor).Similarly, it is easily seen that uniform exponential stability implies uniform asymptotic stability.

A set S ⊂ X is called a positive invariant set of a flow φ if x0 ∈ S implies that there exists a t0 ∈ T such that φ(t, t0, x0) ∈ S for all t ≥ t0. It is called a negative invariant set if this condition holds for t ≤ t0, and it is said to be an invariant set if it is both positive and negative invariant. The idea of a (positive, negative) invariant set is, therefore, that a flow remains in the set once it has started there at


time t0. Naturally, a set S ⊆ X is said to be an invariant set of the differential equation (3.1.1) if it is an invariant set of its associated flow. Also, any point x0 ∈ X naturally generates the invariant set S = {φ(t, t0, x0) | t ≥ t0}, consisting of all points that the flow φ(t, t0, x0) hits as time evolves. In particular, for every fixed point x∗ of (3.1.1) the singleton S = {x∗} is an invariant set.

The following proposition gives a first reason not to distinguish all of the above notions of stability.

Proposition 3.2 Let φ : T × T × X → X be a flow with T = R and suppose that x∗ is a fixed point. If φ is linear, then

(a) x∗ is attractive if and only if x∗ is globally attractive.

(b) x∗ is asymptotically stable if and only if x∗ is globally asymptotically stable.

(c) x∗ is exponentially stable if and only if x∗ is globally exponentially stable.

If φ is time-invariant then

(a) x∗ is stable if and only if x∗ is uniformly stable.

(b) x∗ is asymptotically stable if and only if x∗ is uniformly asymptotically stable.

(c) x∗ is exponentially stable if and only if x∗ is uniformly exponentially stable.

Proof. All 'if' parts are trivial. To prove the 'only if' parts, let φ be linear and suppose that x∗ is attractive. Without loss of generality we assume that x∗ = 0. Take x0 ∈ X and δ as in Definition 3.1. Then there exists α > 0 such that ‖αx0‖ < δ and, by linearity of the flow, lim_{t→∞} ‖φ(t, t0, x0)‖ = α⁻¹ lim_{t→∞} ‖φ(t, t0, αx0)‖ = 0, i.e., 0 is a global attractor. The second and third claims for linear flows are now obvious. Next, let φ be time-invariant and x∗ = 0 stable. Then for ε > 0 and t0 ∈ T there exists δ > 0 such that ‖x0‖ ≤ δ implies ‖φ(t + τ, t0 + τ, x0)‖ = ‖φ(t, t0, x0)‖ ≤ ε for all t ≥ t0 and τ ∈ T. Set t′ = t + τ and t0′ = t0 + τ to infer that ‖x0‖ ≤ δ implies ‖φ(t′, t0′, x0)‖ ≤ ε for all t′ ≥ t0′. But these are the trajectories passing through x0 at time t0′ with t0′ arbitrary. Hence 0 is uniformly stable. The last claims follow by a similar reasoning.

Definition 3.3 (Definite functions) Let S ⊆ Rn have the point x∗ in its interior and let T ⊆ R. A function V : S × T → R is said to be

(a) positive definite (with respect to x∗) if there exists a continuous, strictly increasing function a : R+ → R+ with a(0) = 0 such that V(x, t) ≥ a(‖x − x∗‖) for all (x, t) ∈ S × T.

(b) positive semi-definite if V(x, t) ≥ 0 for all (x, t) ∈ S × T.

(c) decrescent (with respect to x∗) if there exists a continuous, strictly increasing function b : R+ → R+ with b(0) = 0 such that V(x, t) ≤ b(‖x − x∗‖) for all (x, t) ∈ S × T.


(d) negative definite (with respect to x∗) or negative semi-definite if −V is positive definite or positive semi-definite, respectively.

We introduced positive definite and positive semi-definite matrices in Chapter 1. This use of terminology is consistent with Definition 3.3 in the following sense. If X is a real symmetric matrix, then for all vectors x we have

λmin(X)‖x‖² ≤ x>Xx ≤ λmax(X)‖x‖².

Hence, the function V(x) := x>Xx is positive definite with respect to the origin if and only if the smallest eigenvalue λmin(X) is positive, i.e., if and only if X is positive definite as a matrix (denoted X ≻ 0). Similarly, V is negative definite with respect to the origin if and only if λmax(X) < 0, which we denote by X ≺ 0.
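The eigenvalue sandwich above is easy to verify numerically. A minimal sketch (Python/NumPy; the example matrix is our own choice for illustration):

```python
import numpy as np

# For symmetric X: lambda_min(X) ||x||^2 <= x' X x <= lambda_max(X) ||x||^2.
X = np.array([[2.0, -1.0],
              [-1.0, 2.0]])            # symmetric; eigenvalues 1 and 3
eigs = np.linalg.eigvalsh(X)           # returned in ascending order
lmin, lmax = eigs[0], eigs[-1]

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.standard_normal(2)
    q = x @ X @ x
    assert lmin * (x @ x) - 1e-12 <= q <= lmax * (x @ x) + 1e-12

print(lmin > 0)   # True: X is positive definite, so V(x) = x' X x is as well
```

The `eigvalsh` routine exploits symmetry and returns real eigenvalues, which is exactly what the definiteness test needs.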

Consider the system (3.1.1) and suppose that x∗ is an equilibrium point. Let S be a set which has x∗ in its interior and suppose that V : S × T → R has continuous partial derivatives (i.e., V is continuously differentiable). Consider, for (x0, t0) ∈ S × T, the function V∗ : T → R defined by the composition

V∗(t) := V(φ(t, t0, x0), t).

Then this function is differentiable and its derivative reads

V̇∗(t) = ∂xV(φ(t, t0, x0), t) f(φ(t, t0, x0), t) + ∂tV(φ(t, t0, x0), t).

Now introduce the mapping V′ : S × T → R by setting

V′(x, t) := ∂xV(x, t) f(x, t) + ∂tV(x, t).    (3.1.5)

V′ is called the derivative of V along trajectories of (3.1.1) and, by construction, V̇∗(t) = V′(φ(t, t0, x0), t) for all t ∈ T. It is very important to observe that V′ depends not only on V but also on the differential equation (3.1.1). It is rather common to write V̇ for V′ and even more common to confuse V̇∗ with V′. Formally, these objects are different, as V̇∗ is a function of time whereas V′ is a function of the state and time.

The main stability results for autonomous systems of the form (3.1.1) are summarized in the following result.

Theorem 3.4 (Lyapunov theorem) Consider the differential equation (3.1.1) and let x∗ ∈ X be an equilibrium point which belongs to the interior of a set S.

(a) If there exists a positive definite, continuously differentiable function V : S × T → R with V(x∗, t) = 0 and V′ negative semi-definite, then x∗ is stable. If, in addition, V is decrescent, then x∗ is uniformly stable.

(b) If there exists a positive definite, decrescent and continuously differentiable function V : S × T → R with V(x∗, t) = 0 and V′ negative definite, then x∗ is uniformly asymptotically stable.


Proof. 1. Let t0 ∈ T and ε > 0. Since V(·, t0) is continuous at x∗ and V(x∗, t0) = 0, there exists δ > 0 such that V(x0, t0) ≤ a(ε) for every x0 ∈ S with ‖x0 − x∗‖ < δ. Since V is positive definite and V′ negative semi-definite, we have for every x0 ∈ S with ‖x0 − x∗‖ < δ and t ≥ t0:

a(‖x(t) − x∗‖) ≤ V(x(t), t) ≤ V(x0, t0) ≤ a(ε),

where we denote x(t) = φ(t, t0, x0). Since a is strictly increasing, this implies (3.1.2), i.e., x∗ is stable. If, in addition, V is decrescent, then V(x, t) ≤ b(‖x − x∗‖) for all (x, t) ∈ S × T. Apply the previous argument with δ such that b(δ) ≤ a(ε). Then δ is independent of t0 and V(x0, t0) ≤ b(δ) ≤ a(ε) for every (x0, t0) ∈ S × T such that ‖x0 − x∗‖ < δ. Hence, (3.1.2) holds for all t0.

2. By item 1, x∗ is uniformly stable. It thus suffices to show that x∗ is uniformly attractive. Let δ > 0 be such that all x0 with ‖x0 − x∗‖ < δ belong to S. Since x∗ is an interior point of S, such a δ obviously exists. Let x0 satisfy ‖x0 − x∗‖ < δ and let t0 ∈ T. Under the given hypothesis, there exist continuous, strictly increasing functions a, b and c such that a(‖x − x∗‖) ≤ V(x, t) ≤ b(‖x − x∗‖) and V′(x, t) ≤ −c(‖x − x∗‖) for all (x, t) ∈ S × T. Let ε > 0 and γ > 0 be such that b(γ) < a(ε), and let t1 > t0 + b(δ)/c(γ). We claim that there exists τ ∈ [t0, t1] such that x(τ) := φ(τ, t0, x0) satisfies ‖x(τ) − x∗‖ ≤ γ. Indeed, if no such τ exists, integration of both sides of the inequality V̇(x(t), t) ≤ −c(‖x(t) − x∗‖) yields

V(x(t1), t1) ≤ V(x0, t0) − ∫_{t0}^{t1} c(‖x(t) − x∗‖) dt < b(‖x0 − x∗‖) − (t1 − t0)c(γ) < b(δ) − (b(δ)/c(γ)) c(γ) = 0,

which contradicts the assumption that V(x(t1), t1) ≥ 0. Consequently, it follows from the hypothesis that for all t ≥ τ:

a(‖x(t) − x∗‖) ≤ V(x(t), t) ≤ V(x(τ), τ) ≤ b(‖x(τ) − x∗‖) ≤ b(γ) ≤ a(ε).

Since a is strictly increasing, this yields ‖x(t) − x∗‖ ≤ ε for all t ≥ τ. As ε is arbitrary, this proves (3.1.3) for all t0.

The functions V that satisfy either of the properties of Theorem 3.4 are generally referred to as Lyapunov functions. The main implication of Theorem 3.4 is that stability of equilibrium points of differential equations of the form (3.1.1) can be verified by searching for suitable Lyapunov functions.

3.1.2 Stability and dissipativity

An intuitive way to think about the result of Theorem 3.4 is to view V as a storage function of the autonomous dynamical system described by the differential equation (3.1.1). At a distinguished


equilibrium state x∗, the storage vanishes but is positive everywhere else. For example, a damped pendulum described by the nonlinear differential equations

ẋ1 = x2,    ẋ2 = −sin(x1) − d x2

has all points (kπ, 0) with k ∈ Z as its fixed points (corresponding to the vertical positions of the pendulum). The function V(x1, x2) := (1/2)x2² + (1 − cos(x1)), representing the sum of kinetic and potential mechanical energy in the system, vanishes at the fixed points (kπ, 0) with k even and is a Lyapunov function: V′(x1, x2) = −d x2² is negative semi-definite provided that the damping coefficient d > 0. Hence, the fixed points (kπ, 0) with k even are stable equilibria. In fact, these points are even uniformly asymptotically stable (although this does not follow from Theorem 3.4 directly, since V′ vanishes on the entire line x2 = 0).
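The claim that V decreases along trajectories of the damped pendulum can be checked by simulation. A sketch using NumPy/SciPy (the damping value d = 0.5 and the initial condition are illustrative choices, not taken from the text):

```python
import numpy as np
from scipy.integrate import solve_ivp

d = 0.5  # damping coefficient (illustrative value, d > 0)

def pendulum(t, x):
    # x1' = x2,  x2' = -sin(x1) - d*x2
    return [x[1], -np.sin(x[0]) - d * x[1]]

def V(x1, x2):
    # kinetic + potential energy, vanishing at (0, 0)
    return 0.5 * x2**2 + (1.0 - np.cos(x1))

sol = solve_ivp(pendulum, (0.0, 20.0), [1.0, 0.0],
                dense_output=True, rtol=1e-8, atol=1e-10)
t = np.linspace(0.0, 20.0, 500)
x1, x2 = sol.sol(t)
energy = V(x1, x2)

# V' = -d*x2^2 <= 0: the energy is non-increasing along the trajectory
# (up to integration tolerance) and the state settles near the origin.
print(np.all(np.diff(energy) <= 1e-6), energy[-1] < 0.05)
```

The simulation illustrates the asymptotic stability claim: the energy decays monotonically toward zero even though V′ is only semi-definite.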

Storage functions, introduced in Chapter 2, and Lyapunov functions are closely related, as can be seen by comparing (3.1.5) with the differential dissipation inequality (2.2.3) of Chapter 2. Indeed, if w(t) = w∗ is taken to be a constant input in the input-state-output system (2.2.1), then we obtain the autonomous time-invariant system

ẋ = f(x, w∗)
z = g(x, w∗).

If x∗ is an equilibrium point of this system and the system is known to be dissipative with respect to a supply function that satisfies

s(w∗, g(x, w∗)) ≤ 0

for all x in a neighborhood of x∗, then any (differentiable) storage function V satisfies the differential dissipation inequality (2.2.3) and is monotone non-increasing along solutions in a neighborhood of x∗. If the storage function is moreover non-negative in this neighborhood, then it follows from Theorem 3.4 that x∗ is a stable equilibrium. In that case, the storage function V is nothing other than a Lyapunov function defined in a neighborhood of x∗. In particular, dissipativity of the system implies stability of the system provided the storage function is non-negative.

Unfortunately, for this general class of nonlinear differential equations there are no systematic procedures for actually finding such functions. We will see next that more explicit results can be obtained for the analysis of stability of solutions of linear differential equations.

3.1.3 Nominal stability of linear systems

Let us consider the linear autonomous system

ẋ = Ax    (3.1.6)

where A : Rn → Rn is a linear map obtained as the linearization of f : X → X around an equilibrium point x∗ ∈ X of (3.1.1). Precisely, for x∗ ∈ X we write

f(x) = f(x∗) + Σ_{j=1}^{n} (∂f/∂x_j)(x∗)[x_j − x_j∗] + …


where we assume that f is at least once differentiable. The linearization of f around x∗ is defined by the system (3.1.6) with A the real n × n matrix

A := ∂x f(x∗).

All elements in ker A are equilibrium points, but we will investigate the stability of (3.1.6) at the equilibrium x∗ = 0. The quadratic function V : X → R defined by

V(x) = x>Xx, with X ≻ 0,

serves as a candidate quadratic Lyapunov function. Indeed, V is continuous at x∗ = 0 and assumes a strong local minimum at x = 0 (actually a strong global minimum of V), while the derivative of V in the direction of the vector field Ax is given by

∂xV(x) · Ax = x>[A>X + XA]x,

which should be negative to guarantee that the origin is an asymptotically stable equilibrium point of (3.1.6). We thus obtain the following result:

Proposition 3.5 Let the system (3.1.6) be a linearization of (3.1.1) at the equilibrium x∗. The following statements are equivalent.

(a) The origin is an asymptotically stable equilibrium for (3.1.6).

(b) The origin is a globally asymptotically stable equilibrium for (3.1.6).

(c) All eigenvalues λ(A) of A have strictly negative real part.

(d) The linear matrix inequalities

A^⊤X + XA ≺ 0,  X ≻ 0

are feasible.

Moreover, if one of these statements holds, then the equilibrium x∗ of the flow (3.1.1) is asymptotically stable.

As the most important implication of Proposition 3.5, asymptotic stability of the equilibrium x∗ of the nonlinear system (3.1.1) can be concluded from the asymptotic stability of its linearization at x∗.
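Since the Lyapunov equation A^⊤X + XA = −I has a unique solution whenever λᵢ(A) + λⱼ(A) ≠ 0 for all i, j, and this solution is positive definite exactly when A is Hurwitz, the feasibility test in statement (d) can be sketched numerically without a semidefinite-programming solver. A minimal Python sketch; the example matrices are hypothetical:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def is_hurwitz_via_lyapunov(A):
    """Solve A'X + XA = -I and check whether X is positive definite."""
    n = A.shape[0]
    # solve_continuous_lyapunov(a, q) solves a X + X a' = q; take a = A' to match A'X + XA = -I
    X = solve_continuous_lyapunov(A.T, -np.eye(n))
    X = 0.5 * (X + X.T)  # symmetrize against round-off
    return bool(np.all(np.linalg.eigvalsh(X) > 0))

A_stable = np.array([[0.0, 1.0], [-2.0, -3.0]])    # eigenvalues -1, -2
A_unstable = np.array([[0.0, 1.0], [2.0, 1.0]])    # eigenvalues 2, -1
print(is_hurwitz_via_lyapunov(A_stable))    # True
print(is_hurwitz_via_lyapunov(A_unstable))  # False
```

For a genuinely semidefinite feasibility problem (e.g. with additional constraints on X) one would instead pass the two inequalities of statement (d) to an SDP solver.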

3.2 Generalized stability regions for LTI systems

As we have seen, the autonomous linear system

ẋ = Ax


is asymptotically stable if and only if all eigenvalues of A lie in C⁻, the open left-half complex plane. For many applications in control and engineering we may be interested in more general stability regions. Let us define a stability region as a subset Cstab ⊆ C with the following two properties:

Property 1: λ ∈ Cstab ⟹ λ̄ ∈ Cstab
Property 2: Cstab is convex.

Typical examples of common stability sets include

Cstab1 = C⁻                                  open left-half complex plane
Cstab2 = C                                   no stability requirement
Cstab3 = {s ∈ C | Re(s) < −α}                guaranteed damping
Cstab4 = {s ∈ C | |s| < r}                   circle centered at the origin
Cstab5 = {s ∈ C | α₁ < Re(s) < α₂}           vertical strip
Cstab6 = {s ∈ C | Re(s) tan(θ) < −|Im(s)|}   conic stability region.

Here, θ ∈ (0, π/2) and r, α, α₁, α₂ are real numbers. We consider the question whether we can derive a feasibility test to verify whether the eigenmodes of the system ẋ = Ax belong to any of these sets. This can indeed be done in the case of the given examples. To see this, let us introduce the notion of an LMI region as follows:

Definition 3.6 For a real symmetric matrix P ∈ S^{2m×2m}, the set of complex numbers

L_P := { s ∈ C | ( I ; sI )^* P ( I ; sI ) ≺ 0 }

is called an LMI region.

If P is partitioned according to P = ( Q S ; S^⊤ R ), then an LMI region is defined by those points s ∈ C for which

Q + sS + s̄S^⊤ + s̄sR ≺ 0.

All of the above examples fit in this definition. Indeed, by setting

P₁ = ( 0 1 ; 1 0 ),  P₂ = ( −1 0 ; 0 0 ),  P₃ = ( 2α 1 ; 1 0 ),  P₄ = ( −r² 0 ; 0 1 ),

P₅ = ( 2α₁ 0 −1 0 ; 0 −2α₂ 0 1 ; −1 0 0 0 ; 0 1 0 0 ),

P₆ = ( 0 0 sin θ cos θ ; 0 0 −cos θ sin θ ; sin θ −cos θ 0 0 ; cos θ sin θ 0 0 )

we obtain that Cstabi = L_{Pi}. More specifically, LMI regions include regions bounded by circles, ellipses, strips, parabolas and hyperbolas. Since any finite intersection of LMI regions is again an LMI region, one can approximate virtually any convex region in the complex plane.


To present the main result of this section, we need to introduce the notation of Kronecker products. Given two matrices A ∈ C^{m×n} and B ∈ C^{k×ℓ}, the Kronecker product of A and B is the mk × nℓ matrix

A ⊗ B = ( a₁₁B … a₁ₙB ; ⋮ ⋱ ⋮ ; aₘ₁B … aₘₙB ).

Some properties pertaining to the Kronecker product are as follows:

• 1 ⊗ A = A = A ⊗ 1.

• (A + B) ⊗ C = (A ⊗ C) + (B ⊗ C).

• (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD).

• (A ⊗ B)^* = A^* ⊗ B^*.

• (A ⊗ B)⁻¹ = A⁻¹ ⊗ B⁻¹ whenever A and B are invertible.

• In general, A ⊗ B ≠ B ⊗ A.

These properties are easily verified and we will not prove them here. Stability regions described by LMI regions lead to the following interesting generalization of the Lyapunov inequality.
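Although we do not prove the identities above, they are easy to spot-check numerically with `numpy.kron`; a small sketch (the random test matrices are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
A, B = rng.standard_normal((2, 3)), rng.standard_normal((4, 2))
C, D = rng.standard_normal((3, 2)), rng.standard_normal((2, 5))

# (A ⊗ B)(C ⊗ D) = (AC) ⊗ (BD)
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))

# (A ⊗ B)* = A* ⊗ B*  (plain transpose here, since the matrices are real)
assert np.allclose(np.kron(A, B).T, np.kron(A.T, B.T))

# In general A ⊗ B and B ⊗ A differ (they agree only up to a permutation)
A2, B2 = rng.standard_normal((2, 2)), rng.standard_normal((2, 2))
print(np.allclose(np.kron(A2, B2), np.kron(B2, A2)))
```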

Theorem 3.7 All eigenvalues of A ∈ R^{n×n} are contained in the LMI region

{ s ∈ C | ( I ; sI )^* ( Q S ; S^⊤ R )( I ; sI ) ≺ 0 }

if and only if there exists X ≻ 0 such that

( I ; A ⊗ I )^* ( X ⊗ Q  X ⊗ S ; X ⊗ S^⊤  X ⊗ R )( I ; A ⊗ I ) ≺ 0.

Note that the latter is an LMI in X and that the Lyapunov theorem (Theorem 3.4) corresponds to taking Q = 0, S = I and R = 0. Among the many interesting special cases of LMI regions that are covered by Theorem 3.7, we mention the stability set Cstab4 with r = 1, used for the characterization of stability of the discrete-time system x(t + 1) = Ax(t). This system is stable if and only if the eigenvalues of A are inside the unit circle. Equivalently, λ(A) ∈ L_P with P = ( −1 0 ; 0 1 ), which by Theorem 3.7 is equivalent to saying that there exists X ≻ 0 such that

( I ; A )^* ( −X 0 ; 0 X )( I ; A ) = A^*XA − X ≺ 0.
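As in the continuous-time case, this discrete-time Lyapunov inequality can be checked numerically by solving the corresponding equation A^⊤XA − X = −I and testing the solution for positive definiteness; a sketch with a hypothetical example matrix:

```python
import numpy as np
from scipy.linalg import solve_discrete_lyapunov

def is_schur_via_lyapunov(A):
    """Solve A'XA - X = -I and check whether X is positive definite."""
    # solve_discrete_lyapunov(a, q) solves a X a' - X + q = 0; take a = A'
    X = solve_discrete_lyapunov(A.T, np.eye(A.shape[0]))
    return bool(np.all(np.linalg.eigvalsh(0.5 * (X + X.T)) > 0))

A = np.array([[0.5, 1.0], [0.0, -0.4]])  # eigenvalues 0.5 and -0.4: inside the unit circle
print(is_schur_via_lyapunov(A))          # True
print(is_schur_via_lyapunov(3.0 * A))    # False: eigenvalues 1.5 and -1.2
```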


3.3 Nominal performance and LMI’s

In this section we will use the results on dissipative systems of Chapter 2 to characterize a number of relevant performance criteria for dynamical systems. In view of forthcoming chapters we consider the system

ẋ = Ax + Bw
z = Cx + Dw    (3.3.1)

where x(t) ∈ X = Rⁿ is the state, w(t) ∈ W = R^m the input and z(t) ∈ Z = R^p the output. Let T(s) = C(sI − A)⁻¹B + D denote the corresponding transfer function and assume throughout this section that the system is asymptotically stable (i.e., the eigenvalues of A are in the open left-half complex plane). We will view w as an input variable (a 'disturbance') whose effect on the output z (an 'error indicator') we wish to minimize. There are various ways to quantify the effect of w on z. For example, for a given input w and for suitable signal norms, the quotient ‖z‖/‖w‖ indicates the relative gain which the input w has on the output z. More generally, the worst-case gain of the system is the quantity

‖T‖ := sup_{0<‖w‖<∞} ‖z‖/‖w‖

which, of course, depends on the chosen signal norms. Other indicators for nominal performance could be the energy in the impulse response of the system, the (asymptotic) variance of the output when the system is fed with inputs of a prescribed stochastic nature, the percentage overshoot in step responses, etc.

3.3.1 Quadratic nominal performance

We start this section by reconsidering Theorem 2.9 from Chapter 2. The following proposition is obtained by making sign changes in Theorem 2.9 and applying the Kalman-Yakubovich-Popov lemma. We will see that it has important implications for the characterization of performance criteria.

Proposition 3.8 Consider the system (3.3.1) with transfer function T. Suppose that A has its eigenvalues in C⁻ and let x(0) = 0. The following statements are equivalent.

(a) There exists ε > 0 such that for all w ∈ L₂

∫₀^∞ ( w ; z )^⊤ ( Q S ; S^⊤ R )( w ; z ) dt ≤ −ε² ∫₀^∞ w^⊤(t)w(t) dt.    (3.3.2)

(b) For all ω ∈ R ∪ {∞} there holds

( I ; T(iω) )^* ( Q S ; S^⊤ R )( I ; T(iω) ) ≺ 0.


(c) There exists K = K^⊤ ∈ R^{n×n} such that

F(K) = ( A^⊤K + KA  KB ; B^⊤K  0 ) + ( 0 I ; C D )^⊤ ( Q S ; S^⊤ R )( 0 I ; C D )

     = ( I 0 ; A B )^⊤ ( 0 K ; K 0 )( I 0 ; A B ) + ( 0 I ; C D )^⊤ ( Q S ; S^⊤ R )( 0 I ; C D )

     = ( I 0 ; A B ; 0 I ; C D )^⊤ ( 0 K 0 0 ; K 0 0 0 ; 0 0 Q S ; 0 0 S^⊤ R )( I 0 ; A B ; 0 I ; C D ) ≺ 0.

This result characterizes quadratic performance of the system (3.3.1) in the sense that it provides necessary and sufficient conditions for the quadratic performance function J := ∫₀^∞ s(w, z) dt to be strictly negative for all square integrable trajectories of the system. Proposition 3.8 provides both an equivalent frequency-domain inequality and an equivalent linear matrix inequality for this property. This very general result proves useful in a number of important special cases, which we describe below.

3.3.2 H∞ nominal performance

A popular performance measure of a stable linear time-invariant system is the H∞ norm of its transfer function. It is defined as follows. Consider the system (3.3.1) together with its transfer function T and assume the system to be asymptotically stable. In that case, T(s) is bounded for all s ∈ C with positive real part. By this we mean that the largest singular value σ_max(T(s)) is finite for all s ∈ C with Re s > 0. This is an example of an H∞ function. To be slightly more formal about this class of functions, let C⁺ denote the set of complex numbers with positive real part. The Hardy space H∞ consists of all complex-valued functions T : C⁺ → C^{p×m} which are analytic and for which

‖T‖∞ := sup_{s∈C⁺} σ_max(T(s)) < ∞.

The left-hand side of this expression satisfies the axioms of a norm and defines the H∞ norm of T. Although H∞ functions are defined on the right-half complex plane, it can be shown that each such function has a unique extension to the imaginary axis (which is usually also denoted by T) and that the H∞ norm is given by

‖T‖∞ = sup_{ω∈R} σ_max(T(iω)).

In words, the H∞ norm of a transfer function is the supremum of the maximum singular value of the frequency response of the system.

Remark 3.9 Various graphical representations of frequency responses are illustrative for investigating system properties like bandwidth, gains, etc. Probably the most important one is a plot of the singular values σⱼ(T(iω)) (j = 1, …, min(m, p)) viewed as functions of the frequency ω ∈ R. For single-input single-output systems there is only one singular value and σ(T(iω)) = |T(iω)|. A Bode diagram of the system is a plot of the mapping ω ↦ |T(iω)| and provides useful information on the extent to which the system amplifies purely harmonic input signals with frequencies ω ∈ R. In order to interpret these diagrams one usually takes a logarithmic scale on the ω axis and plots 20 log₁₀ |T(iω)| to get units in decibels (dB). The H∞ norm of a transfer function is then nothing else than the highest peak value which occurs in the Bode plot. In other words, it is the largest gain when the system is fed with harmonic input signals.

The H∞ norm of a stable linear system admits an interpretation in terms of dissipativity of the system with respect to a specific quadratic supply function. This is expressed in the following result.

Proposition 3.10 Let the system (3.3.1) be asymptotically stable and γ > 0. Then the following statements are equivalent.

(a) ‖T‖∞ < γ.

(b) For all w there holds that

sup_{0<‖w‖₂<∞} ‖z‖₂/‖w‖₂ < γ

where z is the output of (3.3.1) subject to input w and initial condition x(0) = 0.

(c) The system (3.3.1) is strictly dissipative with respect to the supply function s(w, z) = γ²‖w‖² − ‖z‖².

(d) There exists a solution K = K^⊤ to the LMI

( A^⊤K + KA + C^⊤C  KB + C^⊤D ; B^⊤K + D^⊤C  D^⊤D − γ²I ) ≺ 0.    (3.3.3)

Proof. Apply Proposition 3.8 with

( Q S ; S^⊤ R ) = ( −γ²I 0 ; 0 I ).

For a stable system, the H∞ norm of the transfer function therefore coincides with the L₂-induced norm of the input-output operator associated with the system. Using the Kalman-Yakubovich-Popov lemma, this yields an LMI feasibility test to verify whether or not the H∞ norm of the transfer function T is bounded by γ.
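The frequency-domain characterization ‖T‖∞ = sup_ω σ_max(T(iω)) also suggests a simple (non-certified) numerical estimate obtained by gridding the frequency axis, which is handy for sanity-checking an LMI computation; a sketch with a hypothetical first-order example:

```python
import numpy as np

def hinf_grid_estimate(A, B, C, D, n_freq=2000):
    """Lower estimate of ||T||_inf by sampling sigma_max(T(iw)) on a frequency grid."""
    n = A.shape[0]
    peak = 0.0
    for w in np.logspace(-3, 3, n_freq):
        T = C @ np.linalg.solve(1j * w * np.eye(n) - A, B) + D
        peak = max(peak, np.linalg.svd(T, compute_uv=False)[0])
    return peak

# First-order lag T(s) = 1/(s + 1); its peak gain is |T(0)| = 1
A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]); D = np.array([[0.0]])
print(round(hinf_grid_estimate(A, B, C, D), 3))  # 1.0
```

Since the grid can miss a sharp resonance, this only yields a lower bound; the LMI (3.3.3) gives a certified answer.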


3.3.3 H2 nominal performance

The Hardy space H₂ consists of the class of complex-valued functions which are analytic in C⁺ and for which

‖T‖_{H₂} := ( (1/2π) sup_{σ>0} trace ∫_{−∞}^{∞} T(σ + iω)[T(σ + iω)]^* dω )^{1/2}

is finite. This defines the H₂ norm of T. This 'cold-blooded' definition may seem little appealing at first sight, but in fact it has nice and important system-theoretic interpretations. As in H∞, it can be shown that each function in H₂ has a unique extension to the imaginary axis, which we also denote by T, and that in fact the H₂ norm satisfies

‖T‖²_{H₂} = (1/2π) trace ∫_{−∞}^{∞} T(iω)T(iω)^* dω.    (3.3.4)

We will first give an interpretation of the H₂ norm of a system in terms of its impulsive behavior. Consider the system (3.3.1) and suppose that we are interested only in the impulse responses of this system. This means that we take impulsive inputs¹ of the form

w(t) = δ(t)eⱼ

where eⱼ is the j-th basis vector in the standard basis of the input space R^m (j = 1, …, m). The output zⱼ which corresponds to the input w and initial condition x(0) = 0 is uniquely defined and given by

zⱼ(t) = C exp(At)Beⱼ for t > 0,  zⱼ(t) = Deⱼδ(t) for t = 0,  zⱼ(t) = 0 for t < 0.

Since the system is assumed to be stable, the outputs zⱼ are square integrable for all j = 1, …, m, provided that D = 0. In that case

Σ_{j=1}^{m} ‖zⱼ‖₂² = trace ∫₀^∞ B^⊤ exp(A^⊤t)C^⊤C exp(At)B dt = trace ∫₀^∞ C exp(At)BB^⊤ exp(A^⊤t)C^⊤ dt.

Long ago, Parseval taught us that the latter expression is equal to

(1/2π) trace ∫_{−∞}^{∞} T(iω)T(iω)^* dω

which is ‖T‖²_{H₂}. We infer that the squared H₂ norm of T coincides with the total 'output energy' in the impulse responses of the system. What is more, this observation provides a straightforward

¹Formally, the impulse δ is not a function and for this reason it is not a signal either. It requires a complete introduction to distribution theory to make these statements more precise, but we will not do so at this place.


algorithm to determine the H₂ norm of a stable rational transfer function. Indeed, associate with the system (3.3.1) the symmetric non-negative matrices

W := ∫₀^∞ exp(At)BB^⊤ exp(A^⊤t) dt,
M := ∫₀^∞ exp(A^⊤t)C^⊤C exp(At) dt.

Then W is usually referred to as the controllability gramian and M as the observability gramian of the system (3.3.1). The gramians satisfy the matrix equations

AW + WA^⊤ + BB^⊤ = 0,  A^⊤M + MA + C^⊤C = 0

and are, in fact, the unique solutions to these equations whenever A has its eigenvalues in C⁻ (as is assumed here). Consequently,

‖T‖²_{H₂} = trace(CWC^⊤) = trace(B^⊤MB).
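This gramian characterization translates directly into a few lines of code: solve one Lyapunov equation and take a trace. A sketch with a hypothetical first-order example; for T(s) = 1/(s+1) one has ‖T‖²_{H₂} = (1/2π)∫ dω/(1+ω²) = 1/2:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def h2_norm(A, B, C):
    """||T||_H2 via the controllability gramian W solving AW + WA' + BB' = 0."""
    W = solve_continuous_lyapunov(A, -B @ B.T)
    return float(np.sqrt(np.trace(C @ W @ C.T)))

A = np.array([[-1.0]]); B = np.array([[1.0]]); C = np.array([[1.0]])
print(round(h2_norm(A, B, C), 4))  # 0.7071, i.e. 1/sqrt(2)
```

Solving the dual equation for M and evaluating trace(B^⊤MB) would of course give the same value.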

A second interpretation of the H₂ norm makes use of stochastics. Consider the system (3.3.1) and assume that the components of the input w are independent zero-mean white-noise processes. If we take x(0) = 0 as initial condition, the state variance matrix

W(t) := E(x(t)x^⊤(t))

is the solution of the matrix differential equation

Ẇ = AW + WA^⊤ + BB^⊤,  W(0) = 0.

Consequently, with D = 0, the output variance is

E(z^⊤(t)z(t)) = E(x^⊤(t)C^⊤Cx(t)) = E trace(Cx(t)x^⊤(t)C^⊤) = trace(C E(x(t)x^⊤(t)) C^⊤) = trace(CW(t)C^⊤).

Since A is asymptotically stable, the limit W := lim_{t→∞} W(t) exists and is equal to the controllability gramian of the system (3.3.1). Consequently, the asymptotic output variance is

lim_{t→∞} E(z^⊤(t)z(t)) = trace(CWC^⊤),

which is the square of the H₂ norm of the system. The H₂ norm therefore has an interpretation in terms of the asymptotic output variance of the system when it is excited by white-noise input signals.

The following theorem characterizes the H₂ norm in terms of linear matrix inequalities.

Proposition 3.11 Suppose that the system (3.3.1) is asymptotically stable and let T(s) = C(sI − A)⁻¹B + D denote its transfer function. Then

(a) ‖T‖₂ < ∞ if and only if D = 0.


(b) If D = 0 then the following statements are equivalent:

(i) ‖T‖₂ < γ.

(ii) There exists X ≻ 0 such that

AX + XA^⊤ + BB^⊤ ≺ 0,  trace(CXC^⊤) < γ².

(iii) There exists Y ≻ 0 such that

A^⊤Y + YA + C^⊤C ≺ 0,  trace(B^⊤YB) < γ².

(iv) There exist K = K^⊤ ≻ 0 and Z such that

( A^⊤K + KA  KB ; B^⊤K  −γI ) ≺ 0;  ( K  C^⊤ ; C  Z ) ≻ 0;  trace(Z) < γ.    (3.3.5)

(v) There exist K = K^⊤ ≻ 0 and Z such that

( AK + KA^⊤  KC^⊤ ; CK  −γI ) ≺ 0;  ( K  B ; B^⊤  Z ) ≻ 0;  trace(Z) < γ.    (3.3.6)

Proof. The first claim is immediate from the definition of the H₂ norm. To prove the second part, note that ‖T‖₂ < γ is equivalent to requiring that the controllability gramian W satisfies trace(CWC^⊤) < γ². Since the controllability gramian is the unique positive definite solution of the Lyapunov equation AW + WA^⊤ + BB^⊤ = 0, this is equivalent to saying that there exists X ≻ 0 such that

AX + XA^⊤ + BB^⊤ ≺ 0;  trace(CXC^⊤) < γ².

In turn, with a change of variables K := X⁻¹, this is equivalent to the existence of K ≻ 0 and Z such that

A^⊤K + KA + KBB^⊤K ≺ 0;  CK⁻¹C^⊤ ≺ Z;  trace(Z) < γ².

Now, using Schur complements for the first two inequalities yields that ‖T‖₂ < γ is equivalent to the existence of K ≻ 0 and Z such that

( A^⊤K + KA  KB ; B^⊤K  −I ) ≺ 0;  ( K  C^⊤ ; C  Z ) ≻ 0;  trace(Z) < γ²,

which, after rescaling K and Z by a factor γ, is (3.3.5) as desired. The equivalence with (3.3.6) and the matrix inequalities in Y are obtained by a direct dualization and the observation that ‖T‖₂ = ‖T^⊤‖₂.

Interpretation 3.12 The smallest possible upper bound of the H₂ norm of the transfer function can be calculated by minimizing the criterion trace(Z) over the variables K ≻ 0 and Z that satisfy the LMIs defined by the first two inequalities in (3.3.5) or (3.3.6).


3.3.4 Generalized H₂ nominal performance

Consider again the system (3.3.1) and suppose that x(0) = 0 and A has its eigenvalues in C⁻. Recall that ‖T‖_{H₂} < ∞ if and only if D = 0. The system then defines a bounded operator from L₂ inputs to L∞ outputs. That is, for any input w for which ‖w‖₂² := ∫₀^∞ ‖w(t)‖² dt < ∞, the corresponding output z belongs to L∞, the space of signals z : R⁺ → R^p of finite amplitude²

‖z‖∞ := sup_{t≥0} √⟨z(t), z(t)⟩.

The L₂-L∞ induced norm (or 'energy-to-peak' norm) of the system is defined as

‖T‖_{2,∞} := sup_{0<‖w‖₂<∞} ‖z‖∞/‖w‖₂

and satisfies ([26])

‖T‖²_{2,∞} = (1/2π) λ_max( ∫_{−∞}^{∞} T(iω)T(iω)^* dω )    (3.3.7)

where λ_max(·) denotes the maximum eigenvalue. Note that when z is scalar-valued, the latter expression reduces to the H₂ norm, i.e., for systems with scalar-valued output variables

‖T‖_{2,∞} = ‖T‖_{H₂},

which is the reason why we refer to (3.3.7) as a generalized H₂ norm. The following result characterizes an upper bound on this quantity.

Proposition 3.13 Suppose that the system (3.3.1) is asymptotically stable and that D = 0. Then ‖T‖_{2,∞} < γ if and only if there exists a solution K = K^⊤ ≻ 0 to the LMIs

( A^⊤K + KA  KB ; B^⊤K  −γI ) ≺ 0;  ( K  C^⊤ ; C  γI ) ≻ 0.    (3.3.8)

Proof. Firstly, infer from Theorem 2.9 that the existence of K ≻ 0 with

( A^⊤K + KA  KB ; B^⊤K  −I ) ≺ 0

is equivalent to the dissipativity of the system (3.3.1) with respect to the supply function s(w, z) = w^⊤w. Equivalently, for all w ∈ L₂ and t ≥ 0 there holds

x(t)^⊤Kx(t) ≤ ∫₀^t w(τ)^⊤w(τ) dτ.

Secondly, using Schur complements, the LMI

( K  C^⊤ ; C  γ²I ) ≻ 0

is equivalent to the existence of an ε > 0 such that C^⊤C ≺ (γ² − ε²)K. Together, this yields that for all t ≥ 0

⟨z(t), z(t)⟩ = x(t)^⊤C^⊤Cx(t) ≤ (γ² − ε²)x(t)^⊤Kx(t) ≤ (γ² − ε²) ∫₀^t w(τ)^⊤w(τ) dτ ≤ (γ² − ε²) ∫₀^∞ w(τ)^⊤w(τ) dτ.

Taking the supremum over t ≥ 0 yields the existence of ε > 0 such that for all w ∈ L₂

‖z‖∞² ≤ (γ² − ε²)‖w‖₂².

Dividing the latter expression by ‖w‖₂² and taking the supremum over all w ∈ L₂ then yields the result.

²An alternative and more common definition for the L∞ norm of a signal z : R → R^p is ‖z‖∞ := max_{j=1,…,p} sup_{t≥0} |zⱼ(t)|. For scalar-valued signals this coincides with the given definition, but for non-scalar signals it is a different signal norm. When equipped with this alternative amplitude norm of output signals, the characterization (3.3.7) still holds with λ_max(·) redefined as the maximal entry on the diagonal of its argument. See [26] for details.
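By the same Parseval argument as in the previous subsection, the integral in (3.3.7) equals 2π·CWC^⊤ with W the controllability gramian, so the energy-to-peak norm can be computed as √(λ_max(CWC^⊤)). A sketch with a hypothetical first-order example, where the scalar output makes this norm coincide with the H₂ norm, i.e. 1/√2:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def energy_to_peak_norm(A, B, C):
    """||T||_{2,inf} = sqrt(lambda_max(C W C')) with W the controllability gramian."""
    W = solve_continuous_lyapunov(A, -B @ B.T)  # A W + W A' + B B' = 0
    return float(np.sqrt(np.max(np.linalg.eigvalsh(C @ W @ C.T))))

A = np.array([[-1.0]]); B = np.array([[1.0]]); C = np.array([[1.0]])
print(round(energy_to_peak_norm(A, B, C), 4))  # 0.7071
```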

3.3.5 L1 or peak-to-peak nominal performance

Consider the system (3.3.1) and assume again that the system is stable. For fixed initial condition x(0) = 0 this system defines a mapping from bounded-amplitude inputs w ∈ L∞ to bounded-amplitude outputs z ∈ L∞, and a relevant performance criterion is the 'peak-to-peak' or L∞-induced norm of this mapping:

‖T‖_{∞,∞} := sup_{0<‖w‖∞<∞} ‖z‖∞/‖w‖∞.

The following result gives a sufficient condition for γ to be an upper bound of the peak-to-peak gain of the system.

Proposition 3.14 If there exist K ≻ 0, λ > 0 and µ > 0 such that

( A^⊤K + KA + λK  KB ; B^⊤K  −µI ) ≺ 0;  ( λK  0  C^⊤ ; 0  (γ−µ)I  D^⊤ ; C  D  γI ) ≻ 0    (3.3.9)

then the peak-to-peak (or L∞-induced) norm of the system is smaller than γ, i.e., ‖T‖_{∞,∞} < γ.

Proof. The first inequality in (3.3.9) implies that

(d/dt) x(t)^⊤Kx(t) + λx(t)^⊤Kx(t) − µw(t)^⊤w(t) < 0

for all w and x for which ẋ = Ax + Bw. Now assume that x(0) = 0 and w ∈ L∞ with ‖w‖∞ ≤ 1. Then, since K ≻ 0, we obtain (pointwise in t ≥ 0) that

x(t)^⊤Kx(t) ≤ µ/λ.

Since the second inequality in (3.3.9) is strict, there exists an ε > 0 such that taking a Schur complement yields

( λK  0 ; 0  (γ−µ)I ) − (1/(γ−ε)) ( C D )^⊤ ( C D ) ≻ 0,

so that, pointwise in t ≥ 0 and for all ‖w‖∞ ≤ 1, we can write

⟨z(t), z(t)⟩ ≤ (γ−ε)[λx(t)^⊤Kx(t) + (γ−µ)w(t)^⊤w(t)] ≤ (γ−ε)(µ + γ − µ) = γ(γ−ε) < γ².

Consequently, the peak-to-peak gain of the system is smaller than γ.

Remark 3.15 We emphasize that Proposition 3.14 gives only a sufficient condition for γ to be an upper bound of the peak-to-peak gain of the system. The minimal γ ≥ 0 for which there exist K ≻ 0, λ > 0 and µ > 0 such that (3.3.9) is satisfied is, in general, only an upper bound of the actual peak-to-peak gain of the system.

3.4 The generalized plant concept

[Figure 3.1: A system interconnection with plant components G1, G2, controller components K1, K2, exogenous signals w1, w2, w3, actuated signals u1, u2, measured signal y and controlled output z.]

Consider a configuration of interconnected systems such as the one given in Figure 3.1. The subsystems Gⱼ, called plant components, are assumed to be given, whereas the subsystems Kⱼ are to be designed and are generally referred to as controller components. In this section, we would like to illustrate how one embeds a specific system interconnection (such as the one in Figure 3.1) into a so-called generalized plant. The first step is to identify the following (possibly multivariable) signals:


• Generalized disturbances. The collection w of all exogenous signals that affect the interconnection and cannot be altered or influenced by any of the controller components.

• Controlled outputs. The collection z of all signals such that the channel w ↦ z allows one to characterize whether the controlled system has user-desired or user-specified properties.

• Control inputs. The collection u of all signals that are actuated (or generated) by the controller components.

• Measured outputs. The collection y of all signals that enter the controller components.

After having specified these four signals, one can disconnect the controller components to arrive at the open-loop interconnection. Algebraically, the open-loop interconnection is described by

( z ; y ) = P ( w ; u ) = ( P11 P12 ; P21 P22 )( w ; u )    (3.4.1)

where P defines the specific manner in which the plant components interact. A transfer function or state-space description of (3.4.1) can be computed manually or with the help of basic routines in standard modeling toolboxes. The controlled interconnection is obtained by connecting the controller as

u = Ky.

With y = P21w + P22Ky, or (I − P22K)y = P21w, let us assume that K is such that I − P22K has a proper inverse. Then y = (I − P22K)⁻¹P21w and hence

z = [P11 + P12K(I − P22K)⁻¹P21]w = (P ⋆ K)w

such that the transfer function w ↦ z is proper. The general closed-loop interconnection is depicted in Figure 3.2.

[Figure 3.2: General closed-loop interconnection: the generalized plant P maps (w, u) to (z, y), and the controller K maps y to u.]

Clearly, I − P22K has a proper inverse if and only if its direct feedthrough matrix I − P22(∞)K(∞) is non-singular, a test which is easily verifiable. Equivalently,

( I  −K ; −P22  I ) has a proper inverse.    (3.4.2)
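At a fixed frequency the blocks Pᵢⱼ and K are plain matrices, so the star product z = (P ⋆ K)w can be evaluated directly from the formula above. A sketch with hypothetical scalar data:

```python
import numpy as np

def star_product(P11, P12, P21, P22, K):
    """Lower LFT P * K = P11 + P12 K (I - P22 K)^{-1} P21 for matrix blocks."""
    ny = P22.shape[0]
    return P11 + P12 @ K @ np.linalg.solve(np.eye(ny) - P22 @ K, P21)

P11 = np.array([[1.0]]); P12 = np.array([[2.0]])
P21 = np.array([[3.0]]); P22 = np.array([[0.5]])
K = np.array([[0.2]])
# Scalar case: 1 + 2*0.2*3/(1 - 0.5*0.2) = 1 + 1.2/0.9
print(star_product(P11, P12, P21, P22, K)[0, 0])
```

The `solve` call fails exactly when I − P22K is singular, which mirrors the properness condition (3.4.2) at that frequency.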

In general, the controller K is required to stabilize the controlled interconnection. For most applications, it is not sufficient that K just renders P ⋆ K stable. Instead, we will say that K stabilizes P (or P is stabilized by K) if the interconnection depicted in Figure ?? and defined through the relations

( z ; y ) = P ( w ; u ),  u = Kv + v₁,  v = y + v₂

or, equivalently, through the relations

( z ; v₁ ; v₂ ) = ( P11 P12 0 ; 0 I −K ; −P21 −P22 I )( w ; u ; v )    (3.4.3)

defines a proper and stable transfer matrix mapping col(w, v₁, v₂) to col(z, u, v). Note that the mapping col(u, v) ↦ col(v₁, v₂) is represented by (3.4.2). Consequently, (3.4.2) holds whenever K stabilizes P. In that case, the mapping col(v₁, v₂) ↦ col(u, v) can be explicitly computed as

( I  −K ; −P22  I )⁻¹ = ( (I − KP22)⁻¹  K(I − P22K)⁻¹ ; (I − P22K)⁻¹P22  (I − P22K)⁻¹ ).

Consequently, this transfer matrix is explicitly given as

( P11 + ( P12 0 )( I −K ; −P22 I )⁻¹( 0 ; P21 )   ( P12 0 )( I −K ; −P22 I )⁻¹ ;
  ( I −K ; −P22 I )⁻¹( 0 ; P21 )                  ( I −K ; −P22 I )⁻¹ ) =

= ( P11 + P12K(I − P22K)⁻¹P21   P12(I − KP22)⁻¹   P12K(I − P22K)⁻¹ ;
    K(I − P22K)⁻¹P21            (I − KP22)⁻¹      K(I − P22K)⁻¹ ;
    (I − P22K)⁻¹P21             (I − P22K)⁻¹P22   (I − P22K)⁻¹ ).    (3.4.4)

Hence, K stabilizes P if it satisfies (3.4.2) and if all nine block transfer matrices in (3.4.4) are stable.

Let us now assume that P and K admit the state-space descriptions

ẋ = Ax + B1w + B2u
z = C1x + D11w + D12u    (3.4.5)
y = C2x + D21w + D22u

and

ẋ_K = A_K x_K + B_K u_K
y_K = C_K x_K + D_K u_K.    (3.4.6)


Clearly, (3.4.3) admits the state-space realization

( ẋ ; ẋ_K ; z ; v₁ ; v₂ ) = ( A 0 B1 B2 0 ; 0 A_K 0 0 B_K ; C1 0 D11 D12 0 ; 0 −C_K 0 I −D_K ; −C2 0 −D21 −D22 I )( x ; x_K ; w ; u ; v ).    (3.4.7)

Obviously, (3.4.2) just translates into the property that

( I  −D_K ; −D22  I ) is non-singular.    (3.4.8)

We can write (3.4.7) in the more compact format

( ẋ ; ẋ_K ; z ; v₁ ; v₂ ) = ( 𝒜 ℬ₁ ℬ₂ ; 𝒞₁ 𝒟₁₁ 𝒟₁₂ ; 𝒞₂ 𝒟₂₁ 𝒟₂₂ )( x ; x_K ; w ; u ; v )

where the state is col(x, x_K), the inputs are grouped as w and col(u, v), the outputs as z and col(v₁, v₂), and 𝒟₂₂ equals the matrix in (3.4.8). Non-singularity of 𝒟₂₂ allows us to derive the following state-space realization of the transfer matrix (3.4.4):

( ẋ ; ẋ_K ; z ; u ; v ) = ( 𝒜 − ℬ₂𝒟₂₂⁻¹𝒞₂  ℬ₁ − ℬ₂𝒟₂₂⁻¹𝒟₂₁  ℬ₂𝒟₂₂⁻¹ ; 𝒞₁ − 𝒟₁₂𝒟₂₂⁻¹𝒞₂  𝒟₁₁ − 𝒟₁₂𝒟₂₂⁻¹𝒟₂₁  𝒟₁₂𝒟₂₂⁻¹ ; −𝒟₂₂⁻¹𝒞₂  −𝒟₂₂⁻¹𝒟₂₁  𝒟₂₂⁻¹ )( x ; x_K ; w ; v₁ ; v₂ ).    (3.4.9)

Let us now assume that the realizations of P and K are stabilizable and detectable. It is a simple exercise to verify that the realizations (3.4.7) and (3.4.9) are then stabilizable and detectable. This allows us to conclude that the transfer matrix (3.4.4) with stabilizable and detectable realization (3.4.9) is stable if and only if 𝒜 − ℬ₂𝒟₂₂⁻¹𝒞₂ is Hurwitz. More explicitly,

( A 0 ; 0 A_K ) + ( B2 0 ; 0 B_K )( I −D_K ; −D22 I )⁻¹( 0 C_K ; C2 0 ) is Hurwitz.    (3.4.10)

In summary, K stabilizes P if and only if (for stabilizable and detectable realizations) (3.4.8) and (3.4.10) hold true.

Finally, we call P a generalized plant if it indeed admits a stabilizing controller. This property can be easily tested both in terms of the transfer matrix P and in terms of its state-space realization.

Suppose that P admits a stabilizing controller. In terms of (stabilizable/detectable) representations this means (3.4.10), which in turn implies, e.g. with the Hautus test, that

(A, B2) is stabilizable and (A, C2) is detectable.    (3.4.11)


Conversely, (3.4.11) allows us to determine F and L such that A + B2F and A + LC2 are Hurwitz. Then the standard observer-based controller for the system ẋ = Ax + B2u, y = C2x + D22u, which is defined as

ẋ_K = (A + B2F + LC2 + LD22F)x_K − Ly,  u = Fx_K,

does indeed stabilize P. The realization (A_K, B_K, C_K, D_K) = (A + B2F + LC2 + LD22F, −L, F, 0) is obviously stabilizable and detectable. Since D_K = 0 we infer (3.4.8). Finally, a simple computation shows

( A 0 ; 0 A_K ) + ( B2 0 ; 0 B_K )( I 0 ; −D22 I )⁻¹( 0 C_K ; C2 0 ) = ( A  B2F ; −LC2  A + B2F + LC2 )

and the similarity transformation (error dynamics)

( I 0 ; I −I )( A  B2F ; −LC2  A + B2F + LC2 )( I 0 ; I −I )⁻¹ = ( A + B2F  −B2F ; 0  A + LC2 )

reveals that (3.4.10) is true as well. Hence (3.4.11) is necessary and sufficient for P to be a generalized plant.
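The observer-based construction can be sketched numerically: pick F and L by pole placement and verify that the closed-loop matrix computed above is Hurwitz. The double-integrator data below are hypothetical; `scipy.signal.place_poles(A, B, poles)` returns a gain G with eig(A − BG) at the requested locations, so F = −G, and L is obtained from the dual pair (A^⊤, C2^⊤):

```python
import numpy as np
from scipy.signal import place_poles

A = np.array([[0.0, 1.0], [0.0, 0.0]])   # double integrator (hypothetical example)
B2 = np.array([[0.0], [1.0]])
C2 = np.array([[1.0, 0.0]])
D22 = np.zeros((1, 1))

F = -place_poles(A, B2, [-1.0, -2.0]).gain_matrix          # A + B2 F Hurwitz
L = -place_poles(A.T, C2.T, [-3.0, -4.0]).gain_matrix.T    # A + L C2 Hurwitz

AK = A + B2 @ F + L @ C2 + L @ D22 @ F   # observer-based controller dynamics
# Closed-loop matrix ( A  B2F ; -LC2  A + B2F + LC2 )
closed = np.block([[A, B2 @ F], [-L @ C2, AK]])
print(bool(np.all(np.linalg.eigvals(closed).real < 0)))  # True: (3.4.10) holds
```

By the triangularizing similarity transformation, the closed-loop eigenvalues are exactly those of A + B2F and A + LC2, here {−1, −2} ∪ {−3, −4}.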

Let us finally derive a test directly in terms of transfer matrices. If K stabilizes P, we conclude that

( I  −K ; −P22  I )⁻¹ is proper and stable.    (3.4.12)

Note that this comes down to K stabilizing P22 in the sense of our definition, for an empty performance channel w ↦ z. We conclude that the set of controllers which stabilize P is a subset of the set of controllers which stabilize P22. Generalized plants are characterized by the fact that these two sets of controllers actually coincide. This can be most easily verified in terms of state-space realizations: with stabilizable and detectable realizations (3.4.5) and (3.4.6), suppose that K renders (3.4.12) satisfied. In exactly the same fashion as above we observe that (3.4.12) admits the realization

( ẋ ; ẋ_K ; u ; v ) = ( 𝒜 − ℬ₂𝒟₂₂⁻¹𝒞₂  ℬ₂𝒟₂₂⁻¹ ; −𝒟₂₂⁻¹𝒞₂  𝒟₂₂⁻¹ )( x ; x_K ; v₁ ; v₂ )

which is stabilizable and detectable. This implies that 𝒜 − ℬ₂𝒟₂₂⁻¹𝒞₂ is Hurwitz, which in turn means that K also stabilizes P.

This leads us to the following two easily verifiable input-output and state-space tests for P being a generalized plant.

• Determine a K which stabilizes P22. Then P is a generalized plant if and only if K also stabilizes P.

• If P admits a stabilizable and detectable representation (3.4.5), then it is a generalized plant if and only if (A, B2) is stabilizable and (A, C2) is detectable.


3.5 Further reading

Lyapunov theory: [5, 16, 27, 47]. More details on the generalized H₂ norm can be found in [26]. A first variation of Theorem 3.7 appeared in [3].

3.6 Exercises

Exercise 1
Consider the non-linear differential equation
$$\begin{cases} \dot x_1 = x_1(1 - x_1) \\ \dot x_2 = 1 \end{cases} \qquad (3.6.1)$$

(a) Show that there exists one and only one solution of (3.6.1) which passes through $x_0 \in \mathbb{R}^2$ at time $t = 0$.

(b) Show that (3.6.1) admits only one equilibrium point.

(c) Show that the set $S := \{x \in \mathbb{R}^2 : |x_1| = 1\}$ is a positive and negative invariant set of (3.6.1).

Exercise 2
Show that

(a) the quadratic function $V(x) := x^\top K x$ is positive definite if and only if $K$ is a positive definite matrix.

(b) the function $V : \mathbb{R} \times \mathbb{R}^+ \to \mathbb{R}$ defined as $V(x, t) := e^{-t}x^2$ is decrescent but not positive definite.

Exercise 3
A pendulum of mass $m$ is connected to a servo motor which is driven by a voltage $u$. The angle which the pendulum makes with respect to the upright vertical axis through the center of rotation is denoted by $\theta$ (that is, $\theta = 0$ means that the pendulum is in the upright position). The system is described by the equations
$$J\frac{d^2\theta}{dt^2} = mgl\sin(\theta) + u \qquad (3.6.2)$$
$$y = \theta \qquad (3.6.3)$$
where $l$ denotes the distance from the axis of the servo motor to the center of mass of the pendulum, $J$ is the inertia and $g$ is the gravitational constant. The system is specified by the constants $J = 0.03$, $m = 1$, $l = 0.15$ and $g = 10$.


(a) Determine the equilibrium points of this system.

(b) Are the equilibrium points Lyapunov stable? If so, determine a Lyapunov function.

(c) Linearize the system around the equilibrium points and provide a state-space representation of the linearized systems.

(d) Verify whether the linearized systems are stable, unstable or asymptotically stable.

(e) A proportional feedback controller is a controller of the form $u = ky$ where $k \in \mathbb{R}$. Does there exist a proportional feedback controller such that the unstable equilibrium point of the system becomes asymptotically stable?

Exercise 4
Let a stability region $\mathbb{C}_{\mathrm{stab}}$ be defined as those complex numbers $s \in \mathbb{C}$ which satisfy
$$\operatorname{Re}(s) < -\alpha,\qquad |s - c| < r,\qquad |\operatorname{Im}(s)| < |\operatorname{Re}(s)|,$$
where $\alpha > 0$, $c > 0$ and $r > 0$. Specify a real symmetric matrix $P \in \mathbb{S}^{2m \times 2m}$ such that $\mathbb{C}_{\mathrm{stab}}$ coincides with the LMI region $L_P$ as specified in Definition 3.6.

Exercise 5
Let $0 < \alpha \leq \pi$ and consider the Lyapunov equation $A^\top X + XA + I = 0$ where
$$A = \begin{pmatrix} \sin(\alpha) & \cos(\alpha) \\ -\cos(\alpha) & \sin(\alpha) \end{pmatrix}.$$
Show that the solution $X$ of the Lyapunov equation diverges in the sense that $\det(X) \longrightarrow \infty$ whenever $\alpha \longrightarrow 0$.
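The divergence can be observed numerically. The sketch below (plain NumPy, not part of the exercise's intended analytic argument) solves the Lyapunov equation by vectorization, $(I \otimes A^\top + A^\top \otimes I)\,\mathrm{vec}(X) = -\mathrm{vec}(I)$, which is uniquely solvable for $0 < \alpha \le \pi$ since no two eigenvalues of $A$ sum to zero:

```python
import numpy as np

def solve_lyapunov(A):
    """Solve A^T X + X A + I = 0 via vectorization:
    (I (x) A^T + A^T (x) I) vec(X) = -vec(I)."""
    n = A.shape[0]
    K = np.kron(np.eye(n), A.T) + np.kron(A.T, np.eye(n))
    x = np.linalg.solve(K, -np.eye(n).flatten())
    return x.reshape(n, n)

def A_of(alpha):
    s, c = np.sin(alpha), np.cos(alpha)
    return np.array([[s, c], [-c, s]])

for alpha in (1.0, 0.1, 0.01):
    X = solve_lyapunov(A_of(alpha))
    print(f"alpha={alpha:5.2f}  det(X)={np.linalg.det(X):10.2f}")
```

The determinant grows without bound as $\alpha \to 0$, which is consistent with the claim of the exercise.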

Exercise 6
Consider the suspension system in Exercise 4 of Chapter 2. Recall that the variable $q_0$ represents the road profile.

(a) Consider the case where $F = 0$ and $q_0 = 0$ (thus no active force between chassis and axle and a 'flat' road characteristic). Verify whether this system is asymptotically stable.

(b) Determine a Lyapunov function $V : X \to \mathbb{R}$ of this system (with $F = 0$ and $q_0 = 0$) and show that its derivative is negative along solutions of the autonomous behavior of the system (i.e. $F = 0$ and $q_0 = 0$).

(c) Design your favorite road profile $q_0$ in MATLAB and simulate the response of the system to this road profile (the force $F$ is kept 0). Plot the variables $q_1$ and $q_2$. What are your conclusions?

(d) Consider, with $F = 0$, the transfer function $T$ which maps the road profile $q_0$ to the output $\operatorname{col}(q_1, q_2)$ of the system. Determine the norms $\|T\|_{H_\infty}$ and $\|T\|_{H_2}$.


Exercise 7
Consider the system $\dot x = Ax + Bw$, $z = Cx + Dw$ with
$$A = \begin{pmatrix} -1 & 0 & 0 & 1 \\ 0 & -1 & 4 & -3 \\ 1 & -3 & -1 & -3 \\ 0 & 4 & 2 & -1 \end{pmatrix},\quad B = \begin{pmatrix} 0 & 1 \\ 0 & 0 \\ -1 & 0 \\ 0 & 0 \end{pmatrix},\quad C = \begin{pmatrix} -1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \end{pmatrix},\quad D = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}.$$

(a) Write a program to compute theH∞ norm of this system.

(b) Check the eigenvalues of the Lyapunov matrix. What can you conclude from this?
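For part (a), one possible approach — sketched here in Python/NumPy rather than MATLAB, and only a frequency-grid estimate (a lower bound), valid as an $H_\infty$ norm estimate when $\sigma(A) \subset \mathbb{C}^-$ — is to sweep the frequency axis and take the largest singular value of the transfer matrix:

```python
import numpy as np

A = np.array([[-1, 0, 0, 1],
              [0, -1, 4, -3],
              [1, -3, -1, -3],
              [0, 4, 2, -1]], dtype=float)
B = np.array([[0, 1], [0, 0], [-1, 0], [0, 0]], dtype=float)
C = np.array([[-1, 0, 1, 0], [0, 1, 0, 1]], dtype=float)
D = np.array([[0, 1], [0, 0]], dtype=float)

def hinf_lower_bound(A, B, C, D, n_grid=2000):
    """Grid estimate of sup_w sigma_max(C (jw I - A)^{-1} B + D).
    Only a lower bound: the true supremum may fall between grid points."""
    I = np.eye(A.shape[0])
    ws = np.concatenate(([0.0], np.logspace(-3, 3, n_grid)))
    return max(np.linalg.norm(C @ np.linalg.solve(1j * w * I - A, B) + D, 2)
               for w in ws)

print("eigenvalues of A:", np.linalg.eigvals(A))   # stability check first
print("H-infinity norm (grid estimate):", hinf_lower_bound(A, B, C, D))
```

A bisection method based on a Hamiltonian eigenvalue test (see Section 4.3) gives the norm to prescribed accuracy instead of a grid estimate.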

Exercise 8
Consider a batch chemical reactor with a constant volume $V$ of liquids. Inside the reactor the series reaction
$$A \xrightarrow{\;k_1\;} B \xrightarrow{\;k_2\;} C$$
takes place. Here $k_1$ and $k_2$ represent the kinetic rate constants (1/sec) for the conversions $A \to B$ and $B \to C$, respectively. The conversions are assumed to be irreversible, which leads to the model equations
$$\dot C_A = -k_1C_A,\qquad \dot C_B = k_1C_A - k_2C_B,\qquad \dot C_C = k_2C_B$$
where $C_A$, $C_B$ and $C_C$ denote the concentrations of the components $A$, $B$ and $C$, respectively, and $k_1$ and $k_2$ are positive constants. Reactant $B$ is the desired product and we will be interested in the evolution of its concentration.

(a) Show that the system which describes the evolution of $C_B$ is asymptotically stable.

(b) Determine a Lyapunov function for this system.

(c) Suppose that at time $t = 0$ the reactor is injected with an initial concentration $C_A(0) = 10$ [mol/liter] of reactant $A$ and that $C_B(0) = C_C(0) = 0$. Plot the time evolution of the concentration $C_B$ of reactant $B$ if $(k_1, k_2) = (0.2, 0.4)$ and if $(k_1, k_2) = (0.3, 0.3)$.
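For orientation (this does not replace the requested plot), a minimal forward-Euler simulation of the model equations in Python; the step size and horizon are arbitrary choices:

```python
import numpy as np

def simulate(k1, k2, CA0=10.0, T=30.0, dt=1e-3):
    """Forward-Euler integration of the series reaction
    dCA/dt = -k1*CA, dCB/dt = k1*CA - k2*CB, dCC/dt = k2*CB."""
    n = int(T / dt)
    CA, CB, CC = CA0, 0.0, 0.0
    traj = np.empty((n + 1, 3))
    traj[0] = CA, CB, CC
    for i in range(n):
        dCA = -k1 * CA
        dCB = k1 * CA - k2 * CB
        dCC = k2 * CB
        CA, CB, CC = CA + dt * dCA, CB + dt * dCB, CC + dt * dCC
        traj[i + 1] = CA, CB, CC
    return traj

traj = simulate(0.2, 0.4)
print(f"peak CB: {traj[:, 1].max():.3f}")   # ≈ 2.5
```

Note that the total concentration $C_A + C_B + C_C$ is conserved (the derivatives sum to zero), which forward Euler preserves exactly and which serves as a sanity check on the simulation.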

Exercise 9
Consider the nonlinear scalar differential equation
$$\dot x = \sqrt{x}$$
with initial condition $x(0) = 0$.

(a) Show that this differential equation does not satisfy the Lipschitz condition that guarantees uniqueness of solutions $x : \mathbb{R}^+ \to \mathbb{R}$.


(b) Show that the differential equation has at least two solutions $x(t)$, $t \geq 0$, with $x(0) = 0$.

Exercise 10
Consider the discrete-time system
$$x(t + 1) = f(x(t)). \qquad (3.6.4)$$

(a) Show that $x^*$ is an equilibrium point of (3.6.4) if and only if $f(x^*) = x^*$.

(b) Let $x(t + 1) = Ax(t)$ be the linearization of (3.6.4) around the equilibrium point $x^*$. Derive an LMI feasibility test which is necessary and sufficient for $A$ to have its eigenvalues in $\{z \in \mathbb{C} : |z| < 1\}$. (Hint: apply Theorem ??.)

Exercise 11
In the chemical process industry, distillation columns play a key role in splitting an input stream of chemical species into two or more output streams of desired chemical species. Distillation is usually the most economical method for separating liquids, and consists of a process of multi-stage equilibrium separations. Figure 3.3 illustrates a typical distillation column.

Separation of the input components, the feed, is achieved by controlling the transfer of components between the various stages (also called trays or plates) within the column, so as to produce output products at the bottom and at the top of the column. In a typical distillation system, two recycle streams are returned to the column. A condenser is added at the top of the column and a fraction of the overhead vapor $V$ is condensed to form a liquid recycle $L$. The liquid recycle provides the liquid stream needed in the tower. The remaining fraction of $V$ is the distillate or top product. A vaporizer or reboiler is added to the bottom of the column and a portion of the bottom liquid, $L_b$, is vaporized and recycled to the tower as a vapor stream $V_b$. This provides the vapor stream needed in the tower, while the remaining portion of $L_b$ is the bottom product.


Figure 3.3: Binary distillation column

The column consists of $n$ stages, numbered from top to bottom. The feed enters the column at stage $n_f$, with $1 < n_f < n$. The feed flow, $F$ [kmol/hr], is a saturated liquid with composition $z_F$ [mole fraction]. $L$ [kmol/hr] denotes the reflux flow rate of the condenser, and $V_b$ [kmol/hr] is the boilup flow rate of the reboiler. The variable
$$w = \begin{pmatrix} L \\ V_b \\ F \end{pmatrix}$$
is taken as the input of the plant. The top product consists of a distillate stream $D$ [kmol/hr] with composition $X_d$ [mole fraction]. Likewise, the bottom product consists of a bottom stream $B$ with composition $X_b$ [mole fraction]. The output of the system is taken to be
$$z = \begin{pmatrix} X_d \\ X_b \end{pmatrix}$$
and therefore consists of the distillate composition and the bottom composition, respectively.

A model for this type of reactor is obtained as follows. The stages above the feed stage (index $i < n_f$) define the enriching section and those below the feed stage (index $i > n_f$) the stripping section of the column. The liquid flow rate in the stripping section is defined as $L_b = L + qF$ where $0 \leq q \leq 1$ is a constant. The vapor flow rate in the enriching section is given by $V = V_b + (1 - q)F$. The distillate and bottom product flow rates are $D = V - L$ and $B = L_b - V_b$, respectively. Denote by $X_i$ and $Y_i$ [mole fraction] the liquid and vapor compositions of stage $i$, respectively. For constant liquid holdup conditions, the material balances of the column are given as follows.

$$\begin{aligned}
M_d\frac{dX_1}{dt} &= VY_2 - (L + D)X_1 && \text{(condenser stage)} \\
M\frac{dX_i}{dt} &= L(X_{i-1} - X_i) + V(Y_{i+1} - Y_i) && (1 < i < n_f) \\
M\frac{dX_f}{dt} &= LX_{f-1} - L_bX_f + V_bY_{f+1} - VY_f + Fz_f && \text{(feed stage)} \\
M\frac{dX_i}{dt} &= L_b(X_{i-1} - X_i) + V_b(Y_{i+1} - Y_i) && (n_f < i < n) \\
M_b\frac{dX_n}{dt} &= L_bX_{n-1} - V_bY_n - BX_n && \text{(reboiler stage)}
\end{aligned}$$

Here, $M$, $M_d$ and $M_b$ [kmol] denote the nominal stage holdup of material at the trays, the condenser and the bottom, respectively. The vapor-liquid equilibrium describes the relation between the vapor and liquid compositions $Y_i$ and $X_i$ on each stage $i$ of the column and is given by the non-linear expression
$$Y_i = \frac{aX_i}{1 + (a - 1)X_i},\qquad i = 1, \ldots, n,$$
where $a$ is the so-called relative volatility (dependent on the product). With $x$ denoting the vector of components $X_i$, these equations yield a nonlinear model of the form
$$\dot x = f(x, w),\qquad z = g(x, w)$$
of state dimension $n$ (equal to the number of stages).

We consider a medium-sized propane-butane distillation column whose physical parameters are given in Table 3.1. The last three entries in this table define $w^*$.


n     Number of stages      20
nf    Feed stage            6
Md    Condenser holdup      200 [kmol]
Mb    Reboiler holdup       400 [kmol]
M     Stage holdup          50 [kmol]
zf    Feed composition      0.5 [mole fraction]
q     Feed liquid fraction  1
a     Relative volatility   2.46
L*    Reflux flow           1090 [kmol/hr]
Vb*   Boilup vapor flow     1575 [kmol/hr]
F*    Feed flow             1000 [kmol/hr]

Table 3.1: Operating point data

(a) Calculate an equilibrium state $x^*$ of the model if the input is set to $w^*$. The equilibrium point $(w^*, x^*, z^*)$ represents a steady-state or nominal operating point of the column.

(b) Construct (or compute) a linear model of the column when linearized around the equilibrium point $(w^*, x^*, z^*)$.

(c) Is the linear model stable?

(d) Is the linear model asymptotically stable?

(e) Make a Bode diagram of the $2 \times 3$ transfer function of this system.

(f) Determine the responses of the system if the set-points of, respectively, $L$, $V_b$ and $F$ undergo a +10% step change with respect to their nominal values $L^*$, $V_b^*$ and $F^*$. Can you draw any conclusion on the (relative) sensitivity of the system with respect to its inputs $w$? Are there any major differences in the settling times?


Chapter 4

Controller synthesis

In this chapter we provide a powerful result that allows us to step in a straightforward manner from the performance analysis conditions formulated in terms of matrix inequalities to the corresponding matrix inequalities for controller synthesis. This is achieved by a nonlinear and essentially bijective transformation of the controller parameters.

4.1 The setup

Suppose a linear time-invariant system is described as
$$\begin{pmatrix} \dot x \\ z_1 \\ \vdots \\ z_q \\ y \end{pmatrix} = \begin{pmatrix} A & B_1 & \cdots & B_q & B \\ C_1 & D_1 & \cdots & D_{1q} & E_1 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ C_q & D_{q1} & \cdots & D_q & E_q \\ C & F_1 & \cdots & F_q & 0 \end{pmatrix}\begin{pmatrix} x \\ w_1 \\ \vdots \\ w_q \\ u \end{pmatrix}. \qquad (4.1.1)$$

We denote by $u$ the control input, by $y$ the measured output available for control, and by $w_j \to z_j$ the channels on which we want to impose certain robustness and/or performance objectives. Since we want to extend the design technique to mixed problems with various performance specifications on various channels, we already start at this point with a multi-channel system description. Sometimes we collect the signals as

$$z = \begin{pmatrix} z_1 \\ \vdots \\ z_q \end{pmatrix},\qquad w = \begin{pmatrix} w_1 \\ \vdots \\ w_q \end{pmatrix}.$$

87

Page 97: Linear Matrix Inequalities in Control · 2006-06-06 · is to reformulate a given control problem to verifying whether a specific linear matrix inequality is solvable or, alternatively,

88 4.1. THE SETUP

Remark 4.1 Note that we do not exclude the situation that some of the signals $w_j$ or $z_j$ are identical. Therefore, we only need to consider an equal number of input and output signals. Moreover, it might seem restrictive to only consider the diagonal channels and neglect the channels $w_j \to z_k$ for $j \neq k$. This is not the case. As a typical example, suppose we intend to impose for $z = Tw$ specifications on $L_jTR_j$ where $L_j$, $R_j$ are arbitrary matrices that pick out certain linear combinations of the signals $z$, $w$ (or of the rows/columns of the transfer matrix if $T$ is described by an LTI system). If we set $w = R_jw_j$, $z_j = L_jz$, we are hence interested in specifications on the diagonal channels of
$$\begin{pmatrix} z_1 \\ z_2 \\ \vdots \end{pmatrix} = \begin{pmatrix} L_1 \\ L_2 \\ \vdots \end{pmatrix} T \begin{pmatrix} R_1 & R_2 & \cdots \end{pmatrix}\begin{pmatrix} w_1 \\ w_2 \\ \vdots \end{pmatrix}.$$
If $T$ is LTI, the selection matrices $L_j$ and $R_j$ can easily be incorporated into the realization to arrive at the description (4.1.1).

A controller is any finite-dimensional linear time-invariant system described as
$$\begin{pmatrix} \dot x_c \\ u \end{pmatrix} = \begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}\begin{pmatrix} x_c \\ y \end{pmatrix} \qquad (4.1.2)$$
that has $y$ as its input and $u$ as its output. Controllers are hence simply parameterized by the matrices $A_c$, $B_c$, $C_c$, $D_c$.

The controlled or closed-loop system then admits the description
$$\begin{pmatrix} \dot\xi \\ z \end{pmatrix} = \begin{pmatrix} \mathcal{A} & \mathcal{B} \\ \mathcal{C} & \mathcal{D} \end{pmatrix}\begin{pmatrix} \xi \\ w \end{pmatrix} \quad\text{or}\quad \begin{pmatrix} \dot\xi \\ z_1 \\ \vdots \\ z_q \end{pmatrix} = \begin{pmatrix} \mathcal{A} & \mathcal{B}_1 & \cdots & \mathcal{B}_q \\ \mathcal{C}_1 & \mathcal{D}_1 & \cdots & \mathcal{D}_{1q} \\ \vdots & \vdots & \ddots & \vdots \\ \mathcal{C}_q & \mathcal{D}_{q1} & \cdots & \mathcal{D}_q \end{pmatrix}\begin{pmatrix} \xi \\ w_1 \\ \vdots \\ w_q \end{pmatrix}. \qquad (4.1.3)$$

The corresponding input-output mappings (or transfer matrices) are denoted as
$$z = Tw \quad\text{or}\quad \begin{pmatrix} z_1 \\ \vdots \\ z_q \end{pmatrix} = \begin{pmatrix} T_1 & & * \\ & \ddots & \\ * & & T_q \end{pmatrix}\begin{pmatrix} w_1 \\ \vdots \\ w_q \end{pmatrix},$$
respectively.

One can easily calculate a realization of $T_j$ as
$$\begin{pmatrix} \dot\xi \\ z_j \end{pmatrix} = \begin{pmatrix} \mathcal{A} & \mathcal{B}_j \\ \mathcal{C}_j & \mathcal{D}_j \end{pmatrix}\begin{pmatrix} \xi \\ w_j \end{pmatrix} \qquad (4.1.4)$$
where
$$\begin{pmatrix} \mathcal{A} & \mathcal{B}_j \\ \mathcal{C}_j & \mathcal{D}_j \end{pmatrix} = \begin{pmatrix} A + BD_cC & BC_c & B_j + BD_cF_j \\ B_cC & A_c & B_cF_j \\ C_j + E_jD_cC & E_jC_c & D_j + E_jD_cF_j \end{pmatrix}.$$



Figure 4.1: Multi-channel closed-loop system


It simplifies some calculations if we use the equivalent alternative formula
$$\begin{pmatrix} \mathcal{A} & \mathcal{B}_j \\ \mathcal{C}_j & \mathcal{D}_j \end{pmatrix} = \begin{pmatrix} A & 0 & B_j \\ 0 & 0 & 0 \\ C_j & 0 & D_j \end{pmatrix} + \begin{pmatrix} 0 & B \\ I & 0 \\ 0 & E_j \end{pmatrix}\begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}\begin{pmatrix} 0 & I & 0 \\ C & 0 & F_j \end{pmatrix}. \qquad (4.1.5)$$
From this notation it is immediate that the left-hand side is an affine function of the controller parameters.
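This affine dependence is easy to confirm numerically. The sketch below, with hypothetical dimensions and random data in place of a concrete plant, builds the closed-loop blocks both from the direct formula below (4.1.4) and from (4.1.5), and checks affinity in the controller parameters:

```python
import numpy as np

# Hypothetical dimensions and random plant data (assumptions, not from the notes):
rng = np.random.default_rng(0)
n, nc, nu, ny, nz, nw = 3, 2, 1, 1, 2, 2
A  = rng.standard_normal((n, n));   B  = rng.standard_normal((n, nu))
Bj = rng.standard_normal((n, nw));  C  = rng.standard_normal((ny, n))
Cj = rng.standard_normal((nz, n));  Dj = rng.standard_normal((nz, nw))
Ej = rng.standard_normal((nz, nu)); Fj = rng.standard_normal((ny, nw))

def closed_loop(Ac, Bc, Cc, Dc):
    """Closed-loop blocks (A, B_j; C_j, D_j) via formula (4.1.5)."""
    base = np.block([
        [A, np.zeros((n, nc)), Bj],
        [np.zeros((nc, n)), np.zeros((nc, nc)), np.zeros((nc, nw))],
        [Cj, np.zeros((nz, nc)), Dj]])
    left = np.block([
        [np.zeros((n, nc)), B],
        [np.eye(nc), np.zeros((nc, nu))],
        [np.zeros((nz, nc)), Ej]])
    right = np.block([
        [np.zeros((nc, n)), np.eye(nc), np.zeros((nc, nw))],
        [C, np.zeros((ny, nc)), Fj]])
    ctrl = np.block([[Ac, Bc], [Cc, Dc]])
    return base + left @ ctrl @ right

Ac = rng.standard_normal((nc, nc)); Bc = rng.standard_normal((nc, ny))
Cc = rng.standard_normal((nu, nc)); Dc = rng.standard_normal((nu, ny))

# Agreement with the direct block formula:
direct = np.block([
    [A + B @ Dc @ C,   B @ Cc,  Bj + B @ Dc @ Fj],
    [Bc @ C,           Ac,      Bc @ Fj],
    [Cj + Ej @ Dc @ C, Ej @ Cc, Dj + Ej @ Dc @ Fj]])
print(np.allclose(closed_loop(Ac, Bc, Cc, Dc), direct))   # True

# Affinity in the controller parameters: equal increments give equal changes.
M0 = closed_loop(np.zeros((nc, nc)), np.zeros((nc, ny)),
                 np.zeros((nu, nc)), np.zeros((nu, ny)))
M1 = closed_loop(Ac, Bc, Cc, Dc)
M2 = closed_loop(2 * Ac, 2 * Bc, 2 * Cc, 2 * Dc)
print(np.allclose(M2 - M1, M1 - M0))                      # True
```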

4.2 From analysis to synthesis – a general procedure

As a paradigm example let us consider the design of a controller that achieves stability and quadratic performance in the channel $w_j \to z_j$. For that purpose we suppose that we are given a performance index
$$P_j = \begin{pmatrix} Q_j & S_j \\ S_j^T & R_j \end{pmatrix} \quad\text{with}\quad R_j \geq 0.$$

In Chapter 2 we have revealed that the following conditions are equivalent: the controller (4.1.2) renders (4.1.4) internally stable and leads to
$$\int_0^\infty \begin{pmatrix} w_j(t) \\ z_j(t) \end{pmatrix}^T P_j \begin{pmatrix} w_j(t) \\ z_j(t) \end{pmatrix} dt \leq -\varepsilon \int_0^\infty w_j(t)^Tw_j(t)\,dt$$
for some $\varepsilon > 0$ if and only if
$$\sigma(\mathcal{A}) \subset \mathbb{C}^- \quad\text{and}\quad \begin{pmatrix} I \\ T_j(i\omega) \end{pmatrix}^* P_j \begin{pmatrix} I \\ T_j(i\omega) \end{pmatrix} < 0 \ \text{ for all } \omega \in \mathbb{R} \cup \{\infty\}$$
if and only if there exists a symmetric $\mathcal{X}$ satisfying
$$\mathcal{X} > 0,\qquad \begin{pmatrix} \mathcal{A}^T\mathcal{X} + \mathcal{X}\mathcal{A} & \mathcal{X}\mathcal{B}_j \\ \mathcal{B}_j^T\mathcal{X} & 0 \end{pmatrix} + \begin{pmatrix} 0 & I \\ \mathcal{C}_j & \mathcal{D}_j \end{pmatrix}^T P_j \begin{pmatrix} 0 & I \\ \mathcal{C}_j & \mathcal{D}_j \end{pmatrix} < 0. \qquad (4.2.1)$$

The corresponding quadratic performance synthesis problem amounts to finding controller parameters $\begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}$ and an $\mathcal{X} > 0$ such that (4.2.1) holds.

Obviously, $\mathcal{A}$ depends on the controller parameters. Since $\mathcal{X}$ is also a variable, we observe that $\mathcal{X}\mathcal{A}$ depends non-linearly on the variables to be found.

It has been observed only quite recently [18, 36] that there exists a nonlinear transformation
$$\left( \mathcal{X}, \begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix} \right) \to v = \left( X, Y, \begin{pmatrix} K & L \\ M & N \end{pmatrix} \right) \qquad (4.2.2)$$

90 Compilation: February 8, 2005

Page 100: Linear Matrix Inequalities in Control · 2006-06-06 · is to reformulate a given control problem to verifying whether a specific linear matrix inequality is solvable or, alternatively,

4.2. FROM ANALYSIS TO SYNTHESIS – A GENERAL PROCEDURE 91

and a $\mathcal{Y}$ such that, with the functions
$$\mathbf{X}(v) := \begin{pmatrix} Y & I \\ I & X \end{pmatrix},\qquad \begin{pmatrix} \mathbf{A}(v) & \mathbf{B}_j(v) \\ \mathbf{C}_j(v) & \mathbf{D}_j(v) \end{pmatrix} := \begin{pmatrix} AY + BM & A + BNC & B_j + BNF_j \\ K & XA + LC & XB_j + LF_j \\ C_jY + E_jM & C_j + E_jNC & D_j + E_jNF_j \end{pmatrix}, \qquad (4.2.3)$$
one has
$$\mathcal{Y}^T\mathcal{X}\mathcal{Y} = \mathbf{X}(v),\qquad \begin{pmatrix} \mathcal{Y}^T\mathcal{X}\mathcal{A}\mathcal{Y} & \mathcal{Y}^T\mathcal{X}\mathcal{B}_j \\ \mathcal{C}_j\mathcal{Y} & \mathcal{D}_j \end{pmatrix} = \begin{pmatrix} \mathbf{A}(v) & \mathbf{B}_j(v) \\ \mathbf{C}_j(v) & \mathbf{D}_j(v) \end{pmatrix}. \qquad (4.2.4)$$

Hence, under congruence transformations with the matrices
$$\mathcal{Y} \quad\text{and}\quad \begin{pmatrix} \mathcal{Y} & 0 \\ 0 & I \end{pmatrix}, \qquad (4.2.5)$$
the blocks transform as
$$\mathcal{X} \to \mathbf{X}(v),\qquad \begin{pmatrix} \mathcal{X}\mathcal{A} & \mathcal{X}\mathcal{B}_j \\ \mathcal{C}_j & \mathcal{D}_j \end{pmatrix} \to \begin{pmatrix} \mathbf{A}(v) & \mathbf{B}_j(v) \\ \mathbf{C}_j(v) & \mathbf{D}_j(v) \end{pmatrix}.$$
Therefore, the original blocks that depend non-linearly on the decision variables $\mathcal{X}$ and $\begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}$ are transformed into blocks that are affine functions of the new variables $v$.

If $\mathcal{Y}$ is nonsingular, we can perform a congruence transformation on the two inequalities in (4.2.1) with the nonsingular matrices (4.2.5) to obtain
$$\mathcal{Y}^T\mathcal{X}\mathcal{Y} > 0,\qquad \begin{pmatrix} \mathcal{Y}^T[\mathcal{A}^T\mathcal{X} + \mathcal{X}\mathcal{A}]\mathcal{Y} & \mathcal{Y}^T\mathcal{X}\mathcal{B}_j \\ \mathcal{B}_j^T\mathcal{X}\mathcal{Y} & 0 \end{pmatrix} + \begin{pmatrix} 0 & I \\ \mathcal{C}_j\mathcal{Y} & \mathcal{D}_j \end{pmatrix}^T P_j \begin{pmatrix} 0 & I \\ \mathcal{C}_j\mathcal{Y} & \mathcal{D}_j \end{pmatrix} < 0, \qquad (4.2.6)$$
which is nothing but
$$\mathbf{X}(v) > 0,\qquad \begin{pmatrix} \mathbf{A}(v)^T + \mathbf{A}(v) & \mathbf{B}_j(v) \\ \mathbf{B}_j(v)^T & 0 \end{pmatrix} + \begin{pmatrix} 0 & I \\ \mathbf{C}_j(v) & \mathbf{D}_j(v) \end{pmatrix}^T P_j \begin{pmatrix} 0 & I \\ \mathbf{C}_j(v) & \mathbf{D}_j(v) \end{pmatrix} < 0. \qquad (4.2.7)$$

For $R_j = 0$ (as happens for the positive real performance index), we infer $P_j = \begin{pmatrix} Q_j & S_j \\ S_j^T & 0 \end{pmatrix}$, which implies that the inequalities (4.2.7) are affine in $v$. For a general performance index with $R_j \geq 0$, the second inequality in (4.2.7) is non-linear but convex in $v$. It is straightforward to transform it into a genuine LMI with a Schur complement argument. Since it is more convenient to stay with the inequalities in the form (4.2.7), we rather formulate a general auxiliary result that shows how to perform the linearization whenever it is required for computational purposes.


Lemma 4.2 (Linearization Lemma) Suppose that $A$ and $S$ are constant matrices, that $B(v)$, $Q(v) = Q(v)^T$ depend affinely on a parameter $v$, and that $R(v)$ can be decomposed as $TU(v)^{-1}T^T$ with $U(v)$ being affine. Then the non-linear matrix inequalities
$$U(v) > 0,\qquad \begin{pmatrix} A \\ B(v) \end{pmatrix}^T \begin{pmatrix} Q(v) & S \\ S^T & R(v) \end{pmatrix}\begin{pmatrix} A \\ B(v) \end{pmatrix} < 0$$
are equivalent to the linear matrix inequality
$$\begin{pmatrix} A^TQ(v)A + A^TSB(v) + B(v)^TS^TA & B(v)^TT \\ T^TB(v) & -U(v) \end{pmatrix} < 0.$$
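The lemma is a Schur complement argument, and the claimed equivalence can be spot-checked numerically. In the sketch below all matrices are random stand-ins (no affine parameter dependence is modelled; we simply fix one value of each matrix):

```python
import numpy as np

def lemma_holds(seed):
    """One random instance: with U > 0 and R = T U^{-1} T^T, the signs of
    the nonlinear inequality and of the linearized LMI must agree."""
    rng = np.random.default_rng(seed)
    p, q, n, r = 3, 2, 2, 2
    A = rng.standard_normal((p, n))
    B = rng.standard_normal((q, n))           # stands for the affine B(v)
    Q = rng.standard_normal((p, p)); Q = Q + Q.T - 4 * np.eye(p)
    S = rng.standard_normal((p, q))
    T = rng.standard_normal((q, r))
    W = rng.standard_normal((r, r)); U = W @ W.T + 0.1 * np.eye(r)  # U > 0
    R = T @ np.linalg.solve(U, T.T)

    M = np.vstack([A, B])
    P = np.block([[Q, S], [S.T, R]])
    nonlinear = M.T @ P @ M                   # (A; B)^T (Q S; S^T R) (A; B)
    lmi = np.block([[A.T @ Q @ A + A.T @ S @ B + B.T @ S.T @ A, B.T @ T],
                    [T.T @ B, -U]])
    neg_def = lambda X: np.linalg.eigvalsh(X).max() < 0
    return neg_def(nonlinear) == neg_def(lmi)

print(all(lemma_holds(s) for s in range(100)))   # True
```

The check works because, with $U > 0$, the $(2,2)$ block $-U$ of the LMI is negative definite and its Schur complement is exactly the nonlinear quadratic form.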

In order to apply this lemma we rewrite the second inequality of (4.2.7) as
$$\begin{pmatrix} I & 0 \\ \mathbf{A}(v) & \mathbf{B}_j(v) \\ 0 & I \\ \mathbf{C}_j(v) & \mathbf{D}_j(v) \end{pmatrix}^T \begin{pmatrix} 0 & I & 0 & 0 \\ I & 0 & 0 & 0 \\ 0 & 0 & Q_j & S_j \\ 0 & 0 & S_j^T & R_j \end{pmatrix}\begin{pmatrix} I & 0 \\ \mathbf{A}(v) & \mathbf{B}_j(v) \\ 0 & I \\ \mathbf{C}_j(v) & \mathbf{D}_j(v) \end{pmatrix} < 0 \qquad (4.2.8)$$
which is, after a simple permutation, nothing but
$$\begin{pmatrix} I & 0 \\ 0 & I \\ \mathbf{A}(v) & \mathbf{B}_j(v) \\ \mathbf{C}_j(v) & \mathbf{D}_j(v) \end{pmatrix}^T \begin{pmatrix} 0 & 0 & I & 0 \\ 0 & Q_j & 0 & S_j \\ I & 0 & 0 & 0 \\ 0 & S_j^T & 0 & R_j \end{pmatrix}\begin{pmatrix} I & 0 \\ 0 & I \\ \mathbf{A}(v) & \mathbf{B}_j(v) \\ \mathbf{C}_j(v) & \mathbf{D}_j(v) \end{pmatrix} < 0. \qquad (4.2.9)$$
This inequality can be linearized according to Lemma 4.2 with an arbitrary factorization $R_j = T_jT_j^T$, leading to
$$\begin{pmatrix} 0 & 0 \\ 0 & R_j \end{pmatrix} = \begin{pmatrix} 0 \\ T_j \end{pmatrix}\begin{pmatrix} 0 & T_j^T \end{pmatrix}.$$

So far we have discussed how to derive the synthesis inequalities (4.2.7). Let us now suppose that we have verified that these inequalities do have a solution, and that we have computed some solution $v$. If we can find a preimage $\left( \mathcal{X}, \begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix} \right)$ of $v$ under the transformation (4.2.2) and a nonsingular $\mathcal{Y}$ for which (4.2.4) holds, then we can simply reverse all the steps performed above to reveal that (4.2.7) is equivalent to (4.2.1). Therefore, the controller defined by $\begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}$ renders (4.2.1) satisfied and, hence, leads to the desired quadratic performance specification for the controlled system.

Before we comment on the resulting design procedure, let us first provide a proof of the following result that summarizes the discussion.

Theorem 4.3 There exists a controller $\begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}$ and an $\mathcal{X}$ satisfying (4.2.1) iff there exists a $v$ that solves the inequalities (4.2.7). If $v$ satisfies (4.2.7), then $I - XY$ is nonsingular and there exist


nonsingular $U$, $V$ with $I - XY = UV^T$. The unique $\mathcal{X}$ and $\begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}$ with
$$\begin{pmatrix} Y & V \\ I & 0 \end{pmatrix}\mathcal{X} = \begin{pmatrix} I & 0 \\ X & U \end{pmatrix}$$
and
$$\begin{pmatrix} K & L \\ M & N \end{pmatrix} = \begin{pmatrix} U & XB \\ 0 & I \end{pmatrix}\begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}\begin{pmatrix} V^T & 0 \\ CY & I \end{pmatrix} + \begin{pmatrix} XAY & 0 \\ 0 & 0 \end{pmatrix} \qquad (4.2.10)$$
satisfy the LMIs (4.2.1).

Note that $U$ and $V$ are square and nonsingular so that (4.2.10) leads to the formulas
$$\mathcal{X} = \begin{pmatrix} Y & V \\ I & 0 \end{pmatrix}^{-1}\begin{pmatrix} I & 0 \\ X & U \end{pmatrix}$$
and
$$\begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix} = \begin{pmatrix} U & XB \\ 0 & I \end{pmatrix}^{-1}\begin{pmatrix} K - XAY & L \\ M & N \end{pmatrix}\begin{pmatrix} V^T & 0 \\ CY & I \end{pmatrix}^{-1}.$$
Due to the zero blocks in the inverses, the formulas can be rendered even more explicit. Of course, numerically it is better to directly solve the equations (4.2.10) by a stable technique.
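The algebra of (4.2.10) and its inversion can be exercised numerically. In the sketch below $X$, $Y$, $K$, $L$, $M$, $N$ are random placeholders rather than a feasible synthesis solution, and we use the simple factorization $U = I - XY$, $V = I$:

```python
import numpy as np

rng = np.random.default_rng(2)
n, nu, ny = 3, 1, 1
A = rng.standard_normal((n, n)); B = rng.standard_normal((n, nu))
C = rng.standard_normal((ny, n))

# Placeholder data standing in for a synthesis solution v = (X, Y, (K L; M N)):
X = rng.standard_normal((n, n)); X = X @ X.T + n * np.eye(n)
Y = rng.standard_normal((n, n)); Y = Y @ Y.T + n * np.eye(n)
K = rng.standard_normal((n, n));  L = rng.standard_normal((n, ny))
M = rng.standard_normal((nu, n)); N = rng.standard_normal((nu, ny))

# Any nonsingular factorization I - XY = U V^T will do; the simplest choice:
U = np.eye(n) - X @ Y
V = np.eye(n)

# Invert (4.2.10) for the controller parameters:
lhs = np.block([[U, X @ B], [np.zeros((nu, n)), np.eye(nu)]])
rhs = np.block([[V.T, np.zeros((n, ny))], [C @ Y, np.eye(ny)]])
mid = np.block([[K - X @ A @ Y, L], [M, N]])
ctrl = np.linalg.solve(lhs, mid) @ np.linalg.inv(rhs)
Ac, Bc = ctrl[:n, :n], ctrl[:n, n:]
Cc, Dc = ctrl[n:, :n], ctrl[n:, n:]

# Round trip: substituting back into (4.2.10) must recover (K L; M N).
shift = np.block([[X @ A @ Y, np.zeros((n, ny))],
                  [np.zeros((nu, n)), np.zeros((nu, ny))]])
back = lhs @ ctrl @ rhs + shift
print(np.allclose(back, np.block([[K, L], [M, N]])))   # True
```

In a real design the data would of course come from an LMI solver, and (as Remark 4.6 suggests) one would pick a better-conditioned factorization of $I - XY$ than $V = I$.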

Proof. Suppose a controller and some $\mathcal{X}$ satisfy (4.2.1). Let us partition
$$\mathcal{X} = \begin{pmatrix} X & U \\ U^T & * \end{pmatrix} \quad\text{and}\quad \mathcal{X}^{-1} = \begin{pmatrix} Y & V \\ V^T & * \end{pmatrix}$$
according to $\mathcal{A}$. Define
$$\mathcal{Y} = \begin{pmatrix} Y & I \\ V^T & 0 \end{pmatrix} \quad\text{and}\quad \mathcal{Z} = \begin{pmatrix} I & 0 \\ X & U \end{pmatrix} \quad\text{to get}\quad \mathcal{Y}^T\mathcal{X} = \mathcal{Z}. \qquad (4.2.11)$$

Without loss of generality we can assume that the dimension of $A_c$ is larger than that of $A$. Hence $U$ has more columns than rows, and we can perturb this block (since we work with strict inequalities) such that it has full row rank. Then $\mathcal{Z}$ has full row rank and, hence, $\mathcal{Y}$ has full column rank.

Due to $XY + UV^T = I$, we infer
$$\mathcal{Y}^T\mathcal{X}\mathcal{Y} = \begin{pmatrix} Y & I \\ I & X \end{pmatrix} = \mathbf{X}(v),$$
which leads to the first relation in (4.2.4). Let us now consider
$$\begin{pmatrix} \mathcal{Y} & 0 \\ 0 & I \end{pmatrix}^T \begin{pmatrix} \mathcal{X}\mathcal{A} & \mathcal{X}\mathcal{B}_j \\ \mathcal{C}_j & \mathcal{D}_j \end{pmatrix}\begin{pmatrix} \mathcal{Y} & 0 \\ 0 & I \end{pmatrix} = \begin{pmatrix} \mathcal{Y}^T\mathcal{X}\mathcal{A}\mathcal{Y} & \mathcal{Y}^T\mathcal{X}\mathcal{B}_j \\ \mathcal{C}_j\mathcal{Y} & \mathcal{D}_j \end{pmatrix}.$$


Using (4.1.5), a very brief calculation (do it!) reveals that
$$\begin{pmatrix} \mathcal{Y}^T\mathcal{X}\mathcal{A}\mathcal{Y} & \mathcal{Y}^T\mathcal{X}\mathcal{B}_j \\ \mathcal{C}_j\mathcal{Y} & \mathcal{D}_j \end{pmatrix} = \begin{pmatrix} \mathcal{Z}\mathcal{A}\mathcal{Y} & \mathcal{Z}\mathcal{B}_j \\ \mathcal{C}_j\mathcal{Y} & \mathcal{D}_j \end{pmatrix} = \begin{pmatrix} AY & A & B_j \\ 0 & XA & XB_j \\ C_jY & C_j & D_j \end{pmatrix} + {}$$
$${} + \begin{pmatrix} 0 & B \\ I & 0 \\ 0 & E_j \end{pmatrix}\left[ \begin{pmatrix} U & XB \\ 0 & I \end{pmatrix}\begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}\begin{pmatrix} V^T & 0 \\ CY & I \end{pmatrix} + \begin{pmatrix} XAY & 0 \\ 0 & 0 \end{pmatrix} \right]\begin{pmatrix} I & 0 & 0 \\ 0 & C & F_j \end{pmatrix}.$$
If we introduce the new parameters $\begin{pmatrix} K & L \\ M & N \end{pmatrix}$ as in (4.2.10), we infer
$$\begin{pmatrix} \mathcal{Y}^T\mathcal{X}\mathcal{A}\mathcal{Y} & \mathcal{Y}^T\mathcal{X}\mathcal{B}_j \\ \mathcal{C}_j\mathcal{Y} & \mathcal{D}_j \end{pmatrix} = \begin{pmatrix} AY & A & B_j \\ 0 & XA & XB_j \\ C_jY & C_j & D_j \end{pmatrix} + \begin{pmatrix} 0 & B \\ I & 0 \\ 0 & E_j \end{pmatrix}\begin{pmatrix} K & L \\ M & N \end{pmatrix}\begin{pmatrix} I & 0 & 0 \\ 0 & C & F_j \end{pmatrix} = {}$$
$${} = \begin{pmatrix} AY + BM & A + BNC & B_j + BNF_j \\ K & XA + LC & XB_j + LF_j \\ C_jY + E_jM & C_j + E_jNC & D_j + E_jNF_j \end{pmatrix} = \begin{pmatrix} \mathbf{A}(v) & \mathbf{B}_j(v) \\ \mathbf{C}_j(v) & \mathbf{D}_j(v) \end{pmatrix}.$$

Hence the relations (4.2.4) are valid. Since $\mathcal{Y}$ has full column rank, (4.2.1) implies (4.2.6), and by (4.2.4), (4.2.6) is identical to (4.2.7). This proves necessity.

To reverse the arguments we assume that $v$ is a solution of (4.2.7). Due to $\mathbf{X}(v) > 0$, we infer that $I - XY$ is nonsingular. Hence we can factorize $I - XY = UV^T$ with square and nonsingular $U$, $V$. Then $\mathcal{Y}$ and $\mathcal{Z}$ defined in (4.2.11) are, as well, square and nonsingular. Hence we can choose $\mathcal{X}$, $\begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}$ such that (4.2.10) holds true; this implies that, again, the relations (4.2.4) are valid. Therefore, (4.2.7) and (4.2.6) are identical. Since $\mathcal{Y}$ is nonsingular, a congruence transformation with $\mathcal{Y}^{-1}$ and $\operatorname{diag}(\mathcal{Y}^{-1}, I)$ leads from (4.2.6) back to (4.2.1), and the proof is finished.

We have obtained a general procedure for deriving from analysis inequalities the corresponding synthesis inequalities, and for constructing corresponding controllers, as follows:

• Rewrite the analysis inequalities in the blocks $\mathcal{X}$, $\mathcal{X}\mathcal{A}$, $\mathcal{X}\mathcal{B}_j$, $\mathcal{C}_j$, $\mathcal{D}_j$ in order to be able to find a (formal) congruence transformation involving $\mathcal{Y}$ which leads to inequalities in the blocks $\mathcal{Y}^T\mathcal{X}\mathcal{Y}$, $\mathcal{Y}^T\mathcal{X}\mathcal{A}\mathcal{Y}$, $\mathcal{Y}^T\mathcal{X}\mathcal{B}_j$, $\mathcal{C}_j\mathcal{Y}$, $\mathcal{D}_j$.

• Perform the substitution (4.2.4) to arrive at matrix inequalities in the variables $v$.

• After having solved the synthesis inequalities for $v$, one factorizes $I - XY$ into nonsingular blocks $UV^T$ and solves the equations (4.2.10) to obtain the controller parameters $A_c$, $B_c$, $C_c$, $D_c$ and a Lyapunov matrix $\mathcal{X}$ which render the analysis inequalities satisfied.


The power of this procedure lies in its simplicity and its generality. Virtually all controller designmethods that are based on matrix inequality analysis results can be converted with ease into thecorresponding synthesis result. In the subsequent section we will include an extensive discussionof how to apply this technique to the various analysis results that have been obtained in the presentnotes.

Remark 4.4 (controller order) In Theorem 4.3 we have not restricted the order of the controller. In proving necessity of the solvability of the synthesis inequalities, the size of $A_c$ was arbitrary. The specific construction of a controller in proving sufficiency leads to an $A_c$ that has the same size as $A$. Hence Theorem 4.3 also includes the side result that controllers of order larger than that of the plant offer no advantage over controllers that have the same order as the plant. The story is very different in reduced-order control: then the intention is to include a constraint $\dim(A_c) \leq k$ for some $k$ that is smaller than the dimension of $A$. It is not very difficult to derive the corresponding synthesis inequalities; however, they include rank constraints that are hard if not impossible to treat by current optimization techniques. We will only briefly comment on a concrete result later.

Remark 4.5 (strictly proper controllers) Note that the direct feed-through of the controller $D_c$ is actually not transformed; we simply have $D_c = N$. If we intend to design a strictly proper controller (i.e. $D_c = 0$), we can just set $N = 0$ to arrive at the corresponding synthesis inequalities. The construction of the other controller parameters remains the same. Clearly, the same holds if one wishes to impose an arbitrary more refined structural constraint on the direct feed-through term, as long as it can be expressed in terms of LMIs.

Remark 4.6 (numerical aspects) After having verified the solvability of the synthesis inequalities, we recommend taking some precautions to improve the conditioning of the calculations that reconstruct the controller out of the decision variable $v$. In particular, one should avoid that the parameters $v$ get too large, and that $I - XY$ is close to singular, which might render the controller computation ill-conditioned. We have observed good results with the following two-step procedure:

• Add to the feasibility inequalities the bounds
$$\|X\| < \alpha,\qquad \|Y\| < \alpha,\qquad \left\| \begin{pmatrix} K & L \\ M & N \end{pmatrix} \right\| < \alpha$$
as extra constraints and minimize $\alpha$. Note that these bounds are equivalently rewritten in LMI form as
$$X < \alpha I,\qquad Y < \alpha I,\qquad \begin{pmatrix} \alpha I & 0 & K & L \\ 0 & \alpha I & M & N \\ K^T & M^T & \alpha I & 0 \\ L^T & N^T & 0 & \alpha I \end{pmatrix} > 0.$$
Hence they can be easily included in the feasibility test, and one can directly minimize $\alpha$ to compute the smallest bound $\alpha^*$.

• In a second step, one adds to the feasibility inequalities and to the bounding inequalities, for some enlarged but fixed $\alpha > \alpha^*$, the extra constraint
$$\begin{pmatrix} Y & \beta I \\ \beta I & X \end{pmatrix} > 0.$$


Of course, the resulting LMI system is feasible for $\beta = 1$. One can hence maximize $\beta$ to obtain a supremal $\beta^* > 1$. The value $\beta^*$ gives an indication of the conditioning of the controller reconstruction procedure. In fact, the extra inequality is equivalent to $X - \beta^2Y^{-1} > 0$. Hence, maximizing $\beta$ amounts to 'pushing $X$ away from $Y^{-1}$'. Therefore, this step is expected to push the smallest singular value of $I - XY$ away from zero. The larger the smallest singular value of $I - XY$, the larger one can choose the smallest singular values of both $U$ and $V$ in the factorization $I - XY = UV^T$. This improves the conditioning of $U$ and $V$, and renders the calculation of the controller parameters more reliable.

4.3 Other performance specifications

4.3.1 H∞ Design

The optimal value of the $H_\infty$ problem is defined as
$$\gamma_j^* = \inf_{\substack{A_c, B_c, C_c, D_c \\ \text{such that } \sigma(\mathcal{A}) \subset \mathbb{C}^-}} \|T_j\|_\infty.$$

Clearly, the number $\gamma_j$ is larger than $\gamma_j^*$ iff there exists a controller which renders
$$\sigma(\mathcal{A}) \subset \mathbb{C}^- \quad\text{and}\quad \|T_j\|_\infty < \gamma_j$$
satisfied. These two properties are equivalent to stability and quadratic performance for the index
$$P_j = \begin{pmatrix} Q_j & S_j \\ S_j^T & R_j \end{pmatrix} = \begin{pmatrix} -\gamma_jI & 0 \\ 0 & (\gamma_jI)^{-1} \end{pmatrix}.$$

The corresponding synthesis inequalities (4.2.7) are rewritten with Lemma 4.2 to
$$\mathbf{X}(v) > 0,\qquad \begin{pmatrix} \mathbf{A}(v)^T + \mathbf{A}(v) & \mathbf{B}_j(v) & \mathbf{C}_j(v)^T \\ \mathbf{B}_j(v)^T & -\gamma_jI & \mathbf{D}_j(v)^T \\ \mathbf{C}_j(v) & \mathbf{D}_j(v) & -\gamma_jI \end{pmatrix} < 0.$$

Note that the optimal $H_\infty$ value $\gamma_j^*$ is then just given by the minimal $\gamma_j$ for which these inequalities are feasible; one can directly compute $\gamma_j^*$ by a standard LMI algorithm.

For the controller reconstruction, one should improve the conditioning (as described in the previous section) by an additional LMI optimization. We recommend not to perform this step with the optimal value $\gamma_j^*$ itself but with a slightly increased value $\gamma_j > \gamma_j^*$. This is motivated by the observation that, at optimality, the matrix $\mathbf{X}(v)$ is often (but not always!) close to singular; then $I - XY$ is close to singular, and it is expected to be difficult to render it better conditioned if $\gamma_j$ is too close to the optimal value $\gamma_j^*$.
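For computing $\|T_j\|_\infty$ of a given stable realization (the analysis step, not the synthesis LMIs above), a standard alternative to LMI solvers is bisection on $\gamma$ combined with a Hamiltonian eigenvalue test. This characterization is not derived in these notes (it is the classical result found e.g. in Zhou/Doyle/Glover), so treat the sketch as an assumption-laden illustration:

```python
import numpy as np

def hinf_norm(A, B, C, D, tol=1e-8):
    """Bisection on gamma: for Hurwitz A, ||C(sI-A)^{-1}B + D||_inf < gamma
    iff gamma > sigma_max(D) and the Hamiltonian below has no eigenvalues
    on the imaginary axis (standard result, assumed here, not derived)."""
    def has_imag_eig(gamma):
        R = gamma**2 * np.eye(D.shape[1]) - D.T @ D
        Ri = np.linalg.inv(R)
        Ah = A + B @ Ri @ D.T @ C
        H = np.block([
            [Ah, B @ Ri @ B.T],
            [-C.T @ (np.eye(D.shape[0]) + D @ Ri @ D.T) @ C, -Ah.T]])
        return np.min(np.abs(np.linalg.eigvals(H).real)) < 1e-9
    lo = np.linalg.norm(D, 2) + 1e-12
    hi = lo + 1.0
    while has_imag_eig(hi):       # grow until gamma is an upper bound
        hi *= 2.0
    while hi - lo > tol:          # bisect down to the norm
        mid = 0.5 * (lo + hi)
        if has_imag_eig(mid):
            lo = mid
        else:
            hi = mid
    return hi

# Sanity check on a first-order example: ||1/(s+1)||_inf = 1.
A = np.array([[-1.0]]); B = np.array([[1.0]])
C = np.array([[1.0]]);  D = np.array([[0.0]])
print(hinf_norm(A, B, C, D))   # ≈ 1.0
```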


4.3.2 Positive real design

In this problem the goal is to test whether there exists a controller which renders the following two conditions satisfied:
$$\sigma(\mathcal{A}) \subset \mathbb{C}^-,\qquad T_j(i\omega)^* + T_j(i\omega) > 0 \ \text{ for all } \omega \in \mathbb{R} \cup \{\infty\}.$$
This is equivalent to stability and quadratic performance for
$$P_j = \begin{pmatrix} Q_j & S_j \\ S_j^T & R_j \end{pmatrix} = \begin{pmatrix} 0 & -I \\ -I & 0 \end{pmatrix},$$
and the corresponding synthesis inequalities read as
$$\mathbf{X}(v) > 0,\qquad \begin{pmatrix} \mathbf{A}(v)^T + \mathbf{A}(v) & \mathbf{B}_j(v) - \mathbf{C}_j(v)^T \\ \mathbf{B}_j(v)^T - \mathbf{C}_j(v) & -\mathbf{D}_j(v) - \mathbf{D}_j(v)^T \end{pmatrix} < 0.$$

4.3.3 H2-problems

Let us define the linear functional $f_j(Z) := \operatorname{trace}(Z)$.

Then we recall that $\mathcal{A}$ is stable and $\|T_j\|_2 < \gamma_j$ iff there exists a symmetric $\mathcal{X}$ with
$$\mathcal{D}_j = 0,\quad \mathcal{X} > 0,\quad \begin{pmatrix} \mathcal{A}^T\mathcal{X} + \mathcal{X}\mathcal{A} & \mathcal{X}\mathcal{B}_j \\ \mathcal{B}_j^T\mathcal{X} & -\gamma_jI \end{pmatrix} < 0,\quad f_j(\mathcal{C}_j\mathcal{X}^{-1}\mathcal{C}_j^T) < \gamma_j. \qquad (4.3.1)$$
The latter inequality is rendered affine in $\mathcal{X}$ and $\mathcal{C}_j$ by introducing the auxiliary variable (or slack variable) $Z_j$. Indeed, the analysis test is equivalent to
$$\mathcal{D}_j = 0,\quad \begin{pmatrix} \mathcal{A}^T\mathcal{X} + \mathcal{X}\mathcal{A} & \mathcal{X}\mathcal{B}_j \\ \mathcal{B}_j^T\mathcal{X} & -\gamma_jI \end{pmatrix} < 0,\quad \begin{pmatrix} \mathcal{X} & \mathcal{C}_j^T \\ \mathcal{C}_j & Z_j \end{pmatrix} > 0,\quad f_j(Z_j) < \gamma_j. \qquad (4.3.2)$$

This version of the inequalities is suited to simply read off the corresponding synthesis inequalities.

Corollary 4.7 There exists a controller that renders (4.3.2) satisfied for some $\mathcal{X}$, $Z_j$ iff there exist $v$ and $Z_j$ with
$$\mathbf{D}_j(v) = 0,\quad \begin{pmatrix} \mathbf{A}(v)^T + \mathbf{A}(v) & \mathbf{B}_j(v) \\ \mathbf{B}_j(v)^T & -\gamma_jI \end{pmatrix} < 0,\quad \begin{pmatrix} \mathbf{X}(v) & \mathbf{C}_j(v)^T \\ \mathbf{C}_j(v) & Z_j \end{pmatrix} > 0,\quad f_j(Z_j) < \gamma_j. \qquad (4.3.3)$$

The proof of this statement and the controller construction are literally the same as for quadratic performance.
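For a given state-space model with $\mathcal{D}_j = 0$, the $H_2$ norm itself can be computed from the controllability Gramian, $\|T\|_2^2 = \operatorname{trace}(CWC^T)$ with $AW + WA^T + BB^T = 0$ — the standard Gramian characterization, dual to the trace test in (4.3.1) and assumed here rather than derived. A NumPy sketch using Kronecker vectorization:

```python
import numpy as np

def h2_norm(A, B, C):
    """||C(sI-A)^{-1}B||_2 for Hurwitz A (and D = 0):
    solve A W + W A^T + B B^T = 0, then the norm is sqrt(trace(C W C^T))."""
    n = A.shape[0]
    K = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    w = np.linalg.solve(K, -(B @ B.T).flatten())
    W = w.reshape(n, n)
    return np.sqrt(np.trace(C @ W @ C.T))

# Sanity check: for 1/(s+1) the Gramian is W = 1/2, so ||T||_2 = sqrt(1/2).
A = np.array([[-1.0]]); B = np.array([[1.0]]); C = np.array([[1.0]])
print(h2_norm(A, B, C))   # ≈ 0.7071 (= sqrt(1/2))
```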


For the generalized $H_2$-norm $\|T_j\|_{2g}$, we recall that $A$ is stable and $\|T_j\|_{2g} < \gamma_j$ iff

$$D_j = 0, \quad X > 0, \quad \begin{pmatrix} A^TX + XA & XB_j \\ B_j^TX & -\gamma_jI \end{pmatrix} < 0, \quad C_jX^{-1}C_j^T < \gamma_jI.$$

These conditions are nothing but

$$D_j = 0, \quad \begin{pmatrix} A^TX + XA & XB_j \\ B_j^TX & -\gamma_jI \end{pmatrix} < 0, \quad \begin{pmatrix} X & C_j^T \\ C_j & \gamma_jI \end{pmatrix} > 0,$$

and it is straightforward to derive the synthesis LMI's.

Note that the corresponding inequalities are equivalent to (4.3.3) for the function

$$f_j(Z) = Z.$$

In contrast to the genuine $H_2$-problem, there is no need for the extra variable $Z_j$ to render the inequalities affine.
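For cross-checking such analysis results on small examples, the generalized $H_2$-norm can also be computed directly: it is the energy-to-peak gain $\lambda_{\max}(C_jWC_j^T)^{1/2}$, where $W$ is the controllability Gramian. This closed form is standard; the concrete system below is a hypothetical example of our own:

```python
import numpy as np

def lyap(A, Q):
    """Solve A W + W A^T + Q = 0 by vectorization (fine for small systems)."""
    n = A.shape[0]
    K = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    return np.linalg.solve(K, -Q.flatten(order="F")).reshape((n, n), order="F")

# Hypothetical stable example.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])

W = lyap(A, B @ B.T)                                    # controllability Gramian
g2g = float(np.sqrt(np.linalg.eigvalsh(C @ W @ C.T).max()))
print(g2g)   # sqrt(1/12) ~ 0.2887 for this data
```

For this example the Gramian can be solved by hand ($W_{11} = 1/12$, $W_{22} = 1/6$, $W_{12} = 0$), which confirms the printed value.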

Remarks.

• If $f$ assigns to $Z$ its diagonal $\operatorname{diag}(z_1,\ldots,z_m)$ (where $m$ is the dimension of $Z$), one characterizes a bound on the gain of $L_2 \ni w_j \to z_j \in L_\infty$ if equipping $L_\infty$ with the norm $\|x\|_\infty := \operatorname{ess\,sup}_{t\geq 0}\max_k|x_k(t)|$ [26,29]. Note that the three concrete $H_2$-like analysis results for $f_j(Z) = \operatorname{trace}(Z)$, $f_j(Z) = Z$, $f_j(Z) = \operatorname{diag}(z_1,\ldots,z_m)$ are exact characterizations, and that the corresponding synthesis results do not involve any conservatism.

• In fact, Corollary 4.7 holds for any affine function $f$ that maps symmetric matrices into symmetric matrices (of possibly different dimension) and that has the property $Z \geq 0 \Rightarrow f(Z) \geq 0$. Hence, Corollary 4.7 admits many other specializations.

• Similarly as in the $H_\infty$ problem, we can directly minimize the bound $\gamma_j$ to find the optimal $H_2$-value or the optimal generalized $H_2$-value that can be achieved by stabilizing controllers.

• We observe that it causes no trouble in our general procedure to derive the synthesis inequalities if the underlying analysis inequalities involve certain auxiliary parameters (such as $Z_j$) as extra decision variables.

• It is instructive to equivalently rewrite (4.3.2) as $X > 0$, $Z_j > 0$, $f_j(Z_j) < \gamma_j$ and

$$\begin{pmatrix} I & 0 \\ XA & XB_j \\ 0 & I \end{pmatrix}^T \begin{pmatrix} 0 & I & 0 \\ I & 0 & 0 \\ 0 & 0 & -\gamma_jI \end{pmatrix} \begin{pmatrix} I & 0 \\ XA & XB_j \\ 0 & I \end{pmatrix} < 0,$$

$$\begin{pmatrix} I & 0 \\ 0 & I \\ C_j & D_j \end{pmatrix}^T \begin{pmatrix} -X & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & Z_j^{-1} \end{pmatrix} \begin{pmatrix} I & 0 \\ 0 & I \\ C_j & D_j \end{pmatrix} \leq 0.$$


Note that the last inequality is non-strict and includes the algebraic constraint $D_j = 0$. It can be equivalently replaced by

$$\begin{pmatrix} I \\ C_j \end{pmatrix}^T \begin{pmatrix} -X & 0 \\ 0 & Z_j^{-1} \end{pmatrix} \begin{pmatrix} I \\ C_j \end{pmatrix} < 0, \qquad D_j = 0.$$

The synthesis relations then read as $X(v) > 0$, $Z_j > 0$, $f_j(Z_j) < \gamma_j$ and

$$\begin{pmatrix} I & 0 \\ A(v) & B_j(v) \\ 0 & I \end{pmatrix}^T \begin{pmatrix} 0 & I & 0 \\ I & 0 & 0 \\ 0 & 0 & -\gamma_jI \end{pmatrix} \begin{pmatrix} I & 0 \\ A(v) & B_j(v) \\ 0 & I \end{pmatrix} < 0, \qquad (4.3.4)$$

$$\begin{pmatrix} I \\ C_j(v) \end{pmatrix}^T \begin{pmatrix} -X(v) & 0 \\ 0 & Z_j^{-1} \end{pmatrix} \begin{pmatrix} I \\ C_j(v) \end{pmatrix} < 0, \qquad D_j(v) = 0. \qquad (4.3.5)$$

The first inequality is affine in $v$, whereas the second one can be rendered affine in $v$ and $Z_j$ with Lemma 4.2.

4.3.4 Upper bound on peak-to-peak norm

The controller (4.1.2) renders $A$ stable and the bound

$$\|z_j\|_\infty \leq \gamma_j\|w_j\|_\infty \ \text{ for all } w_j \in L_\infty$$

satisfied if there exist a symmetric $X$ and real parameters $\lambda$, $\mu$ with

$$\lambda > 0, \quad \begin{pmatrix} A^TX + XA + \lambda X & XB_j \\ B_j^TX & 0 \end{pmatrix} + \begin{pmatrix} 0 & I \\ C_j & D_j \end{pmatrix}^T \begin{pmatrix} -\mu I & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} 0 & I \\ C_j & D_j \end{pmatrix} < 0,$$

$$\begin{pmatrix} 0 & I \\ C_j & D_j \end{pmatrix}^T \begin{pmatrix} 0 & 0 \\ 0 & \frac{1}{\gamma_j}I \end{pmatrix} \begin{pmatrix} 0 & I \\ C_j & D_j \end{pmatrix} < \begin{pmatrix} \lambda X & 0 \\ 0 & (\gamma_j - \mu)I \end{pmatrix}.$$

(Note that $X > 0$ is built in. Where?) The inequalities are obviously equivalent to

$$\lambda > 0, \quad \begin{pmatrix} A^TX + XA + \lambda X & XB_j \\ B_j^TX & -\mu I \end{pmatrix} < 0, \quad \begin{pmatrix} \lambda X & 0 & C_j^T \\ 0 & (\gamma_j - \mu)I & D_j^T \\ C_j & D_j & \gamma_jI \end{pmatrix} > 0,$$

and the corresponding synthesis inequalities thus read as

$$\lambda > 0, \quad \begin{pmatrix} A(v)^T + A(v) + \lambda X(v) & B_j(v) \\ B_j(v)^T & -\mu I \end{pmatrix} < 0, \quad \begin{pmatrix} \lambda X(v) & 0 & C_j(v)^T \\ 0 & (\gamma_j - \mu)I & D_j(v)^T \\ C_j(v) & D_j(v) & \gamma_jI \end{pmatrix} > 0.$$

If these inequalities are feasible, one can construct a stabilizing controller which bounds the peak-to-peak norm of $z_j = T_jw_j$ by $\gamma_j$. We would like to stress that the converse of this statement is not true, since the analysis result involves conservatism.


Note that the synthesis inequalities are formulated in terms of the variables $v$, $\lambda$, and $\mu$; hence they are non-linear since $\lambda X(v)$ depends quadratically on $\lambda$ and $v$. This problem can be overcome as follows: For fixed $\lambda > 0$, test whether the resulting linear matrix inequalities are feasible; if yes, one can stop since the bound $\gamma_j$ on the peak-to-peak norm has been assured; if the LMI's are infeasible, one has to pick another $\lambda > 0$ and repeat the test.

In practice, it might be advantageous to find the best possible upper bound on the peak-to-peak norm that can be assured with the present analysis result. This leads to the problem of minimizing $\gamma_j$ under the synthesis inequality constraints as follows: Perform a line-search over $\lambda > 0$ to minimize $\gamma_j^*(\lambda)$, the minimal value of $\gamma_j$ if $\lambda > 0$ is held fixed; note that the calculation of $\gamma_j^*(\lambda)$ indeed amounts to solving a genuine LMI problem. The line-search leads to the best achievable upper bound

$$\gamma_j^u = \inf_{\lambda > 0}\gamma_j^*(\lambda).$$

To estimate the conservatism, let us recall that $\|T_j\|_\infty$ is a lower bound on the peak-to-peak norm of $T_j$. If we calculate the minimal achievable $H_\infty$-norm of $T_j$, say $\gamma_j^l$, we know that the actual optimal peak-to-peak gain must be contained in the interval

$$[\gamma_j^l, \gamma_j^u].$$

If the length of this interval is small, we have a good estimate of the actual optimal peak-to-peak gain that is achievable by control, and if the interval is large, this estimate is poor.
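For a first-order system both ends of this comparison can be computed in closed form: the true peak-to-peak gain equals the $L_1$-norm of the impulse response plus the direct term, $|d| + |cb/a|$, while $\|T\|_\infty$ provides the lower bound. Computing $\gamma_j^u$ itself requires an LMI solver and is omitted here; the sketch below only illustrates the two analytic quantities on a hypothetical scalar example (all data our own):

```python
import numpy as np

# Hypothetical first-order system: x' = a x + b w, z = c x + d w, with a < 0.
a, b, c, d = -1.0, 1.0, 1.0, 0.0

# True peak-to-peak (L-infinity induced) gain: integral of |h(t)| plus |d|,
# with impulse response h(t) = c e^{a t} b.
peak_to_peak = abs(c * b / a) + abs(d)

# H-infinity norm as a lower bound, evaluated on a frequency grid plus w = 0.
w = np.logspace(-4, 4, 2000)
hinf = np.abs(c * b / (1j * w - a) + d).max()
hinf = max(hinf, abs(c * b / (0.0 - a) + d))

print(hinf, peak_to_peak)   # both equal 1 here: h(t) >= 0, so the bound is tight
```

Since the impulse response is nonnegative in this example, $\|h\|_1 = H(0) = \|H\|_\infty$ and the lower bound is exact; the LMI upper bound $\gamma_j^u$ would lie at or above this value.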

4.4 Multi-objective and mixed controller design

In a realistic design problem one is usually not just confronted with a single-objective problem; one has to render various objectives satisfied. As a typical example, one might wish to keep the $H_\infty$-norm of the channel $z_1 = T_1w_1$ below a bound $\gamma_1$ to ensure robust stability against uncertainties entering as $w_1 = \Delta z_1$, where the stable mapping $\Delta$ has $L_2$-gain smaller than $1/\gamma_1$, and render, at the same time, the $H_2$-norm of $z_2 = T_2w_2$ as small as possible to ensure good performance measured in the $H_2$-norm (such as guaranteeing small asymptotic variance of $z_j$ against white noise inputs $w_j$, or small energy of the output $z_j$ against pulses as inputs $w_j$).

Such a problem would lead to minimizing $\gamma_2$ over all controllers which render

$$\sigma(A) \subset \mathbb{C}^-, \quad \|T_1\|_\infty < \gamma_1, \quad \|T_2\|_2 < \gamma_2 \qquad (4.4.1)$$

satisfied. This is a multi-objective $H_2/H_\infty$ control problem with two performance specifications.

Note that it is often interesting to investigate the trade-off between the $H_\infty$-norm and the $H_2$-norm constraint. For that purpose one plots the curve of optimal values if varying $\gamma_1$ in some interval $[\gamma_1^l, \gamma_1^u]$, where the lower bound $\gamma_1^l$ could be taken close to the smallest achievable $H_\infty$-norm of $T_1$. Note that the optimal value will be non-increasing if increasing $\gamma_1$. The actual curve will provide insight into how far one can improve performance by giving up robustness. In practice, it might be numerically advantageous to give up the hard constraints and proceed, alternatively, as follows: For fixed real weights $\alpha_1$ and $\alpha_2$, minimize

$$\alpha_1\gamma_1 + \alpha_2\gamma_2$$

over all controllers that satisfy (4.4.1). The larger $\alpha_j$, the more weight is put on penalizing large values of $\gamma_j$, and the more the optimization procedure is expected to reduce the corresponding bound $\gamma_j$.

Multi-objective control problems as formulated here are hard to solve. Let us briefly sketch one line of approach. The Youla parameterization [17] reveals that the set of all $T_j$ that can be obtained by internally stabilizing controllers can be parameterized as

$$T_1^j + T_2^jQT_3^j \quad\text{with } Q \text{ varying freely in } RH_\infty^{p\times q}.$$

Here $T_1^j$, $T_2^j$, $T_3^j$ are real-rational proper and stable transfer matrices which can be easily computed in terms of the system description (4.1.1) and an arbitrary stabilizing controller. Recall also that $RH_\infty^{p\times q}$ denotes the algebra of real-rational proper and stable transfer matrices of dimension $p\times q$. With this re-parameterization, the multi-objective control problem then amounts to finding a $Q \in RH_\infty^{p\times q}$ that minimizes $\gamma_2$ under the constraints

$$\|T_1^1 + T_2^1QT_3^1\|_\infty < \gamma_1, \qquad \|T_1^2 + T_2^2QT_3^2\|_2 < \gamma_2. \qquad (4.4.2)$$

After this re-formulation, we are hence faced with a convex optimization problem in the parameter $Q$ which varies in the infinite-dimensional space $RH_\infty^{p\times q}$. A pretty standard Ritz-Galerkin approximation scheme leads to finite-dimensional problems. In fact, consider for a fixed real parameter $a > 0$ the sequence of finite-dimensional subspaces

$$S_\nu := \left\{Q_0 + Q_1\frac{s-a}{s+a} + Q_2\frac{(s-a)^2}{(s+a)^2} + \cdots + Q_\nu\frac{(s-a)^\nu}{(s+a)^\nu} \ : \ Q_0,\ldots,Q_\nu \in \mathbb{R}^{p\times q}\right\}$$

of the space $RH_\infty^{p\times q}$. Let us now denote the infimum of all $\gamma_2$ satisfying the constraint (4.4.2) for $Q \in RH_\infty^{p\times q}$ by $\gamma_2^*$, and that for $Q \in S_\nu$ by $\gamma_2(\nu)$. Since $S_\nu \subset RH_\infty^{p\times q}$, we clearly have

$$\gamma_2^* \leq \gamma_2(\nu+1) \leq \gamma_2(\nu) \ \text{ for all } \nu = 0, 1, 2, \ldots.$$

Hence solving the optimization problems for increasing $\nu$ leads to a non-increasing sequence of values $\gamma_2(\nu)$ that are all upper bounds on the actual optimum $\gamma_2^*$. If we now note that any element of $RH_\infty^{p\times q}$ can be approximated in the $H_\infty$-norm with arbitrary accuracy by an element in $S_\nu$ if $\nu$ is chosen sufficiently large, it is not surprising that $\gamma_2(\nu)$ actually converges to $\gamma_2^*$ for $\nu \to \infty$. To be more precise, we need to assume that the strict constraint $\|T_1^1 + T_2^1QT_3^1\|_\infty < \gamma_1$ is feasible for $Q \in S_\nu$ and some $\nu$, and that $T_1^2$, and $T_2^2$ or $T_3^2$, are strictly proper such that $\|T_1^2 + T_2^2QT_3^2\|_2$ is finite for all $Q \in RH_\infty^{p\times q}$. Then it is not difficult to show that

$$\lim_{\nu\to\infty}\gamma_2(\nu) = \gamma_2^*.$$

Finally, we observe that computing $\gamma_2(\nu)$ is in fact an LMI problem. For more information on this and related problems the reader is referred to [7,30,38].


We observe that the approach sketched above suffers from two severe disadvantages: First, if improving the approximation accuracy by letting $\nu$ grow, the size of the LMI's and the number of variables that are involved grow drastically, which renders the corresponding computations slow. Second, increasing $\nu$ amounts to a potential increase of the McMillan degree of $Q \in S_\nu$, which leads to controllers whose McMillan degree cannot be bounded a priori.

In view of these difficulties, it has been proposed to replace the multi-objective control problem by the mixed control problem. To prepare its definition, recall that the conditions (4.4.1) are guaranteed by the existence of symmetric matrices $X_1$, $X_2$, $Z_2$ satisfying

$$X_1 > 0, \quad \begin{pmatrix} A^TX_1 + X_1A & X_1B_1 & C_1^T \\ B_1^TX_1 & -\gamma_1I & D_1^T \\ C_1 & D_1 & -\gamma_1I \end{pmatrix} < 0,$$

$$D_2 = 0, \quad \begin{pmatrix} A^TX_2 + X_2A & X_2B_2 \\ B_2^TX_2 & -\gamma_2I \end{pmatrix} < 0, \quad \begin{pmatrix} X_2 & C_2^T \\ C_2 & Z_2 \end{pmatrix} > 0, \quad \operatorname{trace}(Z_2) < \gamma_2.$$

If trying to apply the general procedure to derive the synthesis inequalities, there is some trouble since the controller parameter transformation depends on the closed-loop Lyapunov matrix; here two such matrices $X_1$, $X_2$ do appear, such that the technique breaks down. This observation itself motivates a remedy: Just force the two Lyapunov matrices to be equal. This certainly introduces conservatism that is, in general, hard to quantify. On the positive side, if one can find a common matrix

$$X = X_1 = X_2$$

that satisfies the analysis relations, we can still guarantee (4.4.1) to hold. However, the converse is not true, since (4.4.1) does not imply the existence of a common Lyapunov matrix satisfying the above inequalities.

This discussion leads to the definition of the mixed $H_2/H_\infty$ control problem: Minimize $\gamma_2$ subject to the existence of $X$, $Z_2$ satisfying

$$\begin{pmatrix} A^TX + XA & XB_1 & C_1^T \\ B_1^TX & -\gamma_1I & D_1^T \\ C_1 & D_1 & -\gamma_1I \end{pmatrix} < 0,$$

$$D_2 = 0, \quad \begin{pmatrix} A^TX + XA & XB_2 \\ B_2^TX & -\gamma_2I \end{pmatrix} < 0, \quad \begin{pmatrix} X & C_2^T \\ C_2 & Z_2 \end{pmatrix} > 0, \quad \operatorname{trace}(Z_2) < \gamma_2.$$

This problem is amenable to our general procedure. One proves as before that the corresponding synthesis LMI's are

$$\begin{pmatrix} A(v)^T + A(v) & B_1(v) & C_1(v)^T \\ B_1(v)^T & -\gamma_1I & D_1(v)^T \\ C_1(v) & D_1(v) & -\gamma_1I \end{pmatrix} < 0,$$

$$D_2(v) = 0, \quad \begin{pmatrix} A(v)^T + A(v) & B_2(v) \\ B_2(v)^T & -\gamma_2I \end{pmatrix} < 0, \quad \begin{pmatrix} X(v) & C_2(v)^T \\ C_2(v) & Z_2 \end{pmatrix} > 0, \quad \operatorname{trace}(Z_2) < \gamma_2,$$


and the controller construction remains unchanged.

Let us conclude this section with some important remarks.

• After having solved the synthesis inequalities corresponding to the mixed problem for $v$ and $Z_2$, one can construct a controller which satisfies (4.4.1) and which has a McMillan degree (size of $A_c$) that is not larger than (equal to) the size of $A$.

• For the controller resulting from mixed synthesis one can perform an analysis with different Lyapunov matrices $X_1$ and $X_2$ without any conservatism. In general, the actual $H_\infty$-norm of $T_1$ will be strictly smaller than $\gamma_1$, and the $H_2$-norm will be strictly smaller than the optimal value obtained from solving the mixed problem. Judging a mixed controller should, hence, rather be based on an additional non-conservative and direct analysis.

• Performing synthesis by searching for a common Lyapunov matrix introduces conservatism. Little is known about how to estimate this conservatism a priori. However, the optimal value of the mixed problem is always an upper bound on the optimal value of the actual multi-objective problem.

• Starting from a mixed controller, it has been suggested in [33,34] how to compute sequences of upper and lower bounds, on the basis of solving LMI problems, that approach the actual optimal value. This allows one to provide an a posteriori estimate of the conservatism that is introduced by setting $X_1$ equal to $X_2$.

• If starting from different versions of the analysis inequalities (e.g. through scaling the Lyapunov matrix), the artificial constraint $X_1 = X_2$ might lead to a different mixed control problem. Therefore, it is recommended to choose those analysis tests that are expected to lead to Lyapunov matrices which are close to each other. However, there is no general rule how to guarantee this property.

• In view of the previous remark, let us sketch one possibility to reduce the conservatism in mixed design. If we multiply the analysis inequalities for stability of $A$ and for $\|T_1\|_\infty < \gamma_1$ by an arbitrary real parameter $\alpha > 0$, we obtain

$$\alpha X_1 > 0, \quad \begin{pmatrix} A^T(\alpha X_1) + (\alpha X_1)A & (\alpha X_1)B_1 & \alpha C_1^T \\ B_1^T(\alpha X_1) & -\alpha\gamma_1I & \alpha D_1^T \\ \alpha C_1 & \alpha D_1 & -\alpha\gamma_1I \end{pmatrix} < 0.$$

If we multiply the last row and the last column of the second inequality with $\frac{1}{\alpha}$ (which is a congruence transformation) and if we introduce $Y_1 := \alpha X_1$, we arrive at the following equivalent version of the analysis inequality for the $H_\infty$-norm constraint:

$$Y_1 > 0, \quad \begin{pmatrix} A^TY_1 + Y_1A & Y_1B_1 & C_1^T \\ B_1^TY_1 & -\gamma_1\alpha I & D_1^T \\ C_1 & D_1 & -(\gamma_1/\alpha)I \end{pmatrix} < 0.$$

Performing mixed synthesis with this analysis inequality leads to optimal values of the mixed $H_2/H_\infty$ problem that depend on $\alpha$. Each of these values forms an upper bound on the actual optimal value of the multi-objective problem, such that the best bound is found by performing a line-search over $\alpha > 0$.

• Contrary to previous approaches to the mixed problem, the one presented here does not require identical input- or output-signals of the $H_\infty$ and $H_2$ channels. In view of their interpretation (uncertainty for $H_\infty$ and performance for $H_2$), such a restriction is, in general, very unnatural. However, due to this flexibility, it is even more crucial to suitably scale the Lyapunov matrices.

• We can incorporate with ease various other performance or robustness specifications (formulated in terms of linear matrix inequalities) on other channels. Under the constraint of using the same Lyapunov matrix for all desired specifications, the design of a mixed controller is straightforward. Hence, one could conceivably consider a mixture of $H_\infty$, $H_2$, generalized $H_2$, and peak-to-peak upper bound requirements on more than one channel. In its flexibility and generality, this approach is unique; however, one should never forget the conservatism that is involved.

• Using the same Lyapunov function might appear less restrictive if viewing the resulting procedure as a Lyapunov shaping technique. Indeed, one can start with the most important specification to be imposed on the controller. This amounts to solving a single-objective problem without conservatism. Then one keeps the already achieved property as a constraint and systematically imposes other specifications on other channels of the system to exploit possible additional freedom that is left in designing the controller. Hence, the Lyapunov function is shaped to realize additional specifications.

• Finally, constraints that are not necessarily related to input-output specifications can be incorporated as well. As a nice example we mention the possibility to place the eigenvalues of $A$ into an arbitrary LMI region $\{z \in \mathbb{C} : P + zQ + \bar{z}Q^T < 0\}$, with $P = (p_{kl})$ symmetric and $Q = (q_{kl})$. For that purpose one just has to include

$$\begin{pmatrix} p_{11}X(v) + q_{11}A(v) + q_{11}A(v)^T & \cdots & p_{1k}X(v) + q_{1k}A(v) + q_{k1}A(v)^T \\ \vdots & \ddots & \vdots \\ p_{k1}X(v) + q_{k1}A(v) + q_{1k}A(v)^T & \cdots & p_{kk}X(v) + q_{kk}A(v) + q_{kk}A(v)^T \end{pmatrix} < 0$$

in the set of synthesis LMI's (see Chapter 2).
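The simplest instance is the shifted half-plane $\operatorname{Re} z < -\alpha$, obtained for scalar region data $p_{11} = 2\alpha$, $q_{11} = 1$. At the analysis level (standard Chilali-Gahinet form, with $X(v)$, $A(v)$ replaced by $X$ and $AX$), the block condition collapses to $2\alpha X + AX + XA^T < 0$ with $X > 0$, so a certificate can come from a plain Lyapunov solve instead of an SDP. A numeric sketch with a hypothetical system of our own:

```python
import numpy as np

def lyap(A, Q):
    """Solve A W + W A^T + Q = 0 by vectorization (small systems)."""
    n = A.shape[0]
    K = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    return np.linalg.solve(K, -Q.flatten(order="F")).reshape((n, n), order="F")

# Half-plane region Re z < -alpha: p11 = 2*alpha, q11 = 1, so the block
# condition is 2*alpha*X + A X + X A^T < 0 together with X > 0.
alpha = 1.0
A = np.array([[0.0, 1.0], [-6.0, -5.0]])     # eigenvalues -2 and -3

X = lyap(A + alpha * np.eye(2), np.eye(2))   # (A+aI)X + X(A+aI)^T = -I
lmi = 2 * alpha * X + A @ X + X @ A.T        # equals -I by construction

print(np.linalg.eigvalsh(X).min() > 0)       # True: X > 0
print(np.linalg.eigvalsh(lmi).max() < 0)     # True: region LMI satisfied
print(np.linalg.eigvals(A).real.max())       # about -2, indeed < -alpha
```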

4.5 Elimination of parameters

The general procedure described in Section 4.2 leads to synthesis inequalities in the variables $K$, $L$, $M$, $N$ and $X$, $Y$ as well as some auxiliary variables. For specific problems it is often possible to eliminate some of these variables in order to reduce the computation time. For example, since $K$ has the same size as $A$, eliminating $K$ for a system with McMillan degree 20 would save 400 variables. In view of the fact that, in our experience, present-day solvers are practical for solving problems up to about one thousand variables, parameter elimination might be of paramount importance for the ability to solve realistic design problems.


In general, one cannot eliminate any variable that appears in at least two synthesis inequalities. Hence, in mixed design problems, parameter elimination is typically only possible under specific circumstances. In single-objective design problems one has to distinguish various information structures. In output-feedback design problems it is in general not possible to eliminate $X$, $Y$, but it might be possible to eliminate some of the variables $K$, $L$, $M$, $N$ if they only appear in one inequality. For example, in quadratic performance problems one can eliminate all the variables $K$, $L$, $M$, $N$. In state-feedback design one can typically eliminate, in addition, $X$, and for estimation problems one can eliminate $Y$.

To understand which variables can be eliminated and how this is performed, we turn to a discussion of two topics that will be of relevance, namely the dualization of matrix inequalities and explicit solvability tests for specifically structured LMI's [9,31].

4.5.1 Dualization

The synthesis inequalities for quadratic performance can be written in the form (4.2.9). The second inequality has the structure

$$\begin{pmatrix} I \\ M \end{pmatrix}^T \begin{pmatrix} Q & S \\ S^T & R \end{pmatrix} \begin{pmatrix} I \\ M \end{pmatrix} < 0 \quad\text{and}\quad R \geq 0. \qquad (4.5.1)$$

Let us re-formulate these conditions in geometric terms. For that purpose we abbreviate

$$P = \begin{pmatrix} Q & S \\ S^T & R \end{pmatrix} \in \mathbb{R}^{(k+l)\times(k+l)}$$

and observe that (4.5.1) is nothing but

$$P < 0 \ \text{ on } \operatorname{im}\begin{pmatrix} I \\ M \end{pmatrix} \quad\text{and}\quad P \geq 0 \ \text{ on } \operatorname{im}\begin{pmatrix} 0 \\ I \end{pmatrix}.$$

Since the direct sum of $\operatorname{im}\begin{pmatrix} I \\ M \end{pmatrix}$ and $\operatorname{im}\begin{pmatrix} 0 \\ I \end{pmatrix}$ is the whole space $\mathbb{R}^{k+l}$, we can apply the following dualization lemma if $P$ is non-singular.

Lemma 4.8 (Dualization Lemma) Let $P$ be a non-singular symmetric matrix in $\mathbb{R}^{n\times n}$, and let $U$, $V$ be two complementary subspaces whose sum equals $\mathbb{R}^n$. Then

$$x^TPx < 0 \ \text{ for all } x \in U\setminus\{0\} \quad\text{and}\quad x^TPx \geq 0 \ \text{ for all } x \in V \qquad (4.5.2)$$

is equivalent to

$$x^TP^{-1}x > 0 \ \text{ for all } x \in U^\perp\setminus\{0\} \quad\text{and}\quad x^TP^{-1}x \leq 0 \ \text{ for all } x \in V^\perp. \qquad (4.5.3)$$

Proof. Since $U \oplus V = \mathbb{R}^n$ is equivalent to $U^\perp \oplus V^\perp = \mathbb{R}^n$, it suffices to prove that (4.5.2) implies (4.5.3); the converse implication follows by symmetry. Let us assume that (4.5.2) is true.


Moreover, let us assume that $U$ and $V$ have dimension $k$ and $l$ respectively. We infer from (4.5.2) that $P$ has at least $k$ negative eigenvalues and at least $l$ non-negative eigenvalues. Since $k + l = n$ and since $P$ is non-singular, we infer that $P$ has exactly $k$ negative and $l$ positive eigenvalues. We first prove that $P^{-1}$ is positive definite on $U^\perp$. We assume, to the contrary, that there exists a vector $y \in U^\perp\setminus\{0\}$ with $y^TP^{-1}y \leq 0$. Define the non-zero vector $z = P^{-1}y$. Then $z$ is not contained in $U$ since, otherwise, we would conclude from (4.5.2) on the one hand $z^TPz < 0$, and on the other hand $z^TPz = y^Tz = 0$ because $y = Pz$ belongs to $U^\perp$. Therefore, the space $U_e := \operatorname{span}(z) + U$ has dimension $k+1$. Moreover, $P$ is negative semi-definite on this space: for any $x \in U$ we have

$$(z+x)^TP(z+x) = y^TP^{-1}y + y^Tx + x^Ty + x^TPx = y^TP^{-1}y + x^TPx \leq 0.$$

This implies that $P$ has at least $k+1$ non-positive eigenvalues, a contradiction to the already established fact that $P$ has exactly $k$ negative eigenvalues and that $0$ is not an eigenvalue of $P$.

Let us now prove that $P^{-1}$ is negative semi-definite on $V^\perp$. For that purpose we just observe that $P + \varepsilon I$ satisfies

$$x^T(P+\varepsilon I)x < 0 \ \text{ for all } x \in U\setminus\{0\} \quad\text{and}\quad x^T(P+\varepsilon I)x > 0 \ \text{ for all } x \in V\setminus\{0\}$$

for all small $\varepsilon > 0$. Due to what has already been proved, this implies

$$x^T(P+\varepsilon I)^{-1}x > 0 \ \text{ for all } x \in U^\perp\setminus\{0\} \quad\text{and}\quad x^T(P+\varepsilon I)^{-1}x < 0 \ \text{ for all } x \in V^\perp\setminus\{0\}$$

for all small $\varepsilon$. Since $P$ is non-singular, $(P+\varepsilon I)^{-1}$ converges to $P^{-1}$ for $\varepsilon \to 0$. After taking the limit, we end up with

$$x^TP^{-1}x \geq 0 \ \text{ for all } x \in U^\perp\setminus\{0\} \quad\text{and}\quad x^TP^{-1}x \leq 0 \ \text{ for all } x \in V^\perp\setminus\{0\}.$$

Since we already know that the first inequality must be strict, the proof is finished.

Let us hence introduce

$$P^{-1} = \begin{pmatrix} \tilde{Q} & \tilde{S} \\ \tilde{S}^T & \tilde{R} \end{pmatrix} \in \mathbb{R}^{(k+l)\times(k+l)}$$

and observe that

$$\operatorname{im}\begin{pmatrix} I \\ M \end{pmatrix}^\perp = \ker\begin{pmatrix} I & M^T \end{pmatrix} = \operatorname{im}\begin{pmatrix} -M^T \\ I \end{pmatrix} \quad\text{as well as}\quad \operatorname{im}\begin{pmatrix} 0 \\ I \end{pmatrix}^\perp = \operatorname{im}\begin{pmatrix} I \\ 0 \end{pmatrix}.$$

Hence Lemma 4.8 implies that (4.5.1) is equivalent to

$$\begin{pmatrix} -M^T \\ I \end{pmatrix}^T \begin{pmatrix} \tilde{Q} & \tilde{S} \\ \tilde{S}^T & \tilde{R} \end{pmatrix} \begin{pmatrix} -M^T \\ I \end{pmatrix} > 0 \quad\text{and}\quad \tilde{Q} \leq 0. \qquad (4.5.4)$$

As an immediate consequence, we arrive at the following dual version of the quadratic performance synthesis inequalities.
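The equivalence of (4.5.1) and (4.5.4) is easy to confirm numerically: plant a $P$ for which (4.5.1) holds by construction and evaluate the dual quantities. All data below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(1)
k, l = 3, 2
M = rng.standard_normal((l, k))
S = rng.standard_normal((k, l))
R = np.eye(l)                               # R >= 0 (here even R > 0)

# Choose Q so that (I; M)^T P (I; M) = -I < 0 holds by construction.
Q = -(S @ M + M.T @ S.T + M.T @ R @ M) - np.eye(k)
P = np.block([[Q, S], [S.T, R]])            # non-singular: it is negative
Pi = np.linalg.inv(P)                       # definite on a k-dim subspace and
                                            # positive definite on an l-dim one
V = np.vstack([-M.T, np.eye(l)])            # basis of im(I; M)-perp

dual_lmi = V.T @ Pi @ V                     # should be > 0 by (4.5.4)
Q_tilde = Pi[:k, :k]                        # should be <= 0 by (4.5.4)

print(np.linalg.eigvalsh(dual_lmi).min() > 0)   # True
print(np.linalg.eigvalsh(Q_tilde).max() < 0)    # True (strict here since R > 0)
```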


Corollary 4.9 Let $P_j = \begin{pmatrix} Q_j & S_j \\ S_j^T & R_j \end{pmatrix}$ be non-singular, and abbreviate $P_j^{-1} = \begin{pmatrix} \tilde{Q}_j & \tilde{S}_j \\ \tilde{S}_j^T & \tilde{R}_j \end{pmatrix}$. Then

$$\begin{pmatrix} I & 0 \\ A(v) & B_j(v) \\ 0 & I \\ C_j(v) & D_j(v) \end{pmatrix}^T \begin{pmatrix} 0 & I & 0 & 0 \\ I & 0 & 0 & 0 \\ 0 & 0 & Q_j & S_j \\ 0 & 0 & S_j^T & R_j \end{pmatrix} \begin{pmatrix} I & 0 \\ A(v) & B_j(v) \\ 0 & I \\ C_j(v) & D_j(v) \end{pmatrix} < 0, \quad R_j \geq 0$$

is equivalent to

$$\begin{pmatrix} -A(v)^T & -C_j(v)^T \\ I & 0 \\ -B_j(v)^T & -D_j(v)^T \\ 0 & I \end{pmatrix}^T \begin{pmatrix} 0 & I & 0 & 0 \\ I & 0 & 0 & 0 \\ 0 & 0 & \tilde{Q}_j & \tilde{S}_j \\ 0 & 0 & \tilde{S}_j^T & \tilde{R}_j \end{pmatrix} \begin{pmatrix} -A(v)^T & -C_j(v)^T \\ I & 0 \\ -B_j(v)^T & -D_j(v)^T \\ 0 & I \end{pmatrix} > 0, \quad \tilde{Q}_j \leq 0.$$

Remark. Any non-singular performance index $P_j = \begin{pmatrix} Q_j & S_j \\ S_j^T & R_j \end{pmatrix}$ can be inverted to $P_j^{-1} = \begin{pmatrix} \tilde{Q}_j & \tilde{S}_j \\ \tilde{S}_j^T & \tilde{R}_j \end{pmatrix}$. Recall that we required $P_j$ to satisfy $R_j \geq 0$ since, otherwise, the synthesis inequalities may not be convex. The above discussion reveals that any non-singular performance index has to satisfy as well $\tilde{Q}_j \leq 0$ since, otherwise, we are sure that the synthesis inequalities are not feasible. We stress this point since, in general, $R_j \geq 0$ does not imply $\tilde{Q}_j \leq 0$. (Take e.g. $P_j > 0$ such that $P_j^{-1} > 0$.)

Similarly, we can dualize the $H_2$-type synthesis inequalities as formulated in (4.3.4)-(4.3.5).

Corollary 4.10 For $\gamma_j > 0$,

$$\begin{pmatrix} I & 0 \\ A(v) & B_j(v) \\ 0 & I \end{pmatrix}^T \begin{pmatrix} 0 & I & 0 \\ I & 0 & 0 \\ 0 & 0 & -\gamma_jI \end{pmatrix} \begin{pmatrix} I & 0 \\ A(v) & B_j(v) \\ 0 & I \end{pmatrix} < 0$$

if and only if

$$\begin{pmatrix} -A(v)^T \\ I \\ -B_j(v)^T \end{pmatrix}^T \begin{pmatrix} 0 & I & 0 \\ I & 0 & 0 \\ 0 & 0 & -\frac{1}{\gamma_j}I \end{pmatrix} \begin{pmatrix} -A(v)^T \\ I \\ -B_j(v)^T \end{pmatrix} > 0.$$

For $X(v) > 0$ and $Z_j > 0$,

$$\begin{pmatrix} I \\ C_j(v) \end{pmatrix}^T \begin{pmatrix} -X(v) & 0 \\ 0 & Z_j^{-1} \end{pmatrix} \begin{pmatrix} I \\ C_j(v) \end{pmatrix} < 0$$

if and only if

$$\begin{pmatrix} -C_j(v)^T \\ I \end{pmatrix}^T \begin{pmatrix} -X(v)^{-1} & 0 \\ 0 & Z_j \end{pmatrix} \begin{pmatrix} -C_j(v)^T \\ I \end{pmatrix} > 0.$$


Again, Lemma 4.2 allows us to render the first and the second dual inequalities affine in $\gamma_j$ and $X(v)$, respectively.

4.5.2 Special linear matrix inequalities

Let us now turn to specific linear matrix inequalities for which one can easily derive explicit solvability tests.

We start with a trivial example that is cited for later reference.

Lemma 4.11 The inequality

$$\begin{pmatrix} P_{11} & P_{12} & P_{13} \\ P_{21} & P_{22}+X & P_{23} \\ P_{31} & P_{32} & P_{33} \end{pmatrix} < 0$$

in the symmetric unknown $X$ has a solution if and only if

$$\begin{pmatrix} P_{11} & P_{13} \\ P_{31} & P_{33} \end{pmatrix} < 0.$$

Proof. The direction 'only if' is obvious by cancelling the second block row/column. To prove the converse implication, we just need to observe that any $X$ with

$$X < -P_{22} + \begin{pmatrix} P_{21} & P_{23} \end{pmatrix} \begin{pmatrix} P_{11} & P_{13} \\ P_{31} & P_{33} \end{pmatrix}^{-1} \begin{pmatrix} P_{12} \\ P_{32} \end{pmatrix}$$

(such as $X = -\alpha I$ for sufficiently large $\alpha > 0$) is a solution (Schur).
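A quick numeric illustration of Lemma 4.11 (all matrix data below are arbitrary): spoil only the middle diagonal block of a negative definite matrix, check that the solvability test is unaffected, and repair the matrix with $X = -\alpha I$:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 3
G = rng.standard_normal((3 * n, 3 * n))
P = -(G @ G.T) - np.eye(3 * n)            # negative definite base matrix
P[n:2*n, n:2*n] += 50.0 * np.eye(n)       # spoil only the middle diagonal block

idx = np.r_[0:n, 2*n:3*n]
sub = P[np.ix_(idx, idx)]                 # rows/columns 1 and 3 of the partition
print(np.linalg.eigvalsh(sub).max() < 0)  # True: the solvability test holds
print(np.linalg.eigvalsh(P).max())        # typically > 0: a repair is needed

X = -100.0 * np.eye(n)                    # X = -alpha*I with alpha large enough
Pmod = P.copy()
Pmod[n:2*n, n:2*n] += X
print(np.linalg.eigvalsh(Pmod).max() < 0) # True: the LMI of Lemma 4.11 holds
```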

Remark. This result extends to finding a common solution to a whole system of LMI's, due to the following simple fact: For finitely many symmetric matrices $Q_1,\ldots,Q_m$, there exists an $X$ with $X < Q_j$, $j = 1,\ldots,m$.

The first of three more advanced results in this vein is just a simple consequence of a Schur complement argument, and it can be viewed as a powerful variant of what is often called the technique of 'completing the squares'.

Lemma 4.12 Let $P$ be a symmetric matrix partitioned into three block rows/columns, and consider the LMI

$$\begin{pmatrix} P_{11} & P_{12}+X^T & P_{13} \\ P_{21}+X & P_{22} & P_{23} \\ P_{31} & P_{32} & P_{33} \end{pmatrix} < 0 \qquad (4.5.5)$$

in the unstructured matrix $X$. There exists a solution $X$ of this LMI iff

$$\begin{pmatrix} P_{11} & P_{13} \\ P_{31} & P_{33} \end{pmatrix} < 0 \quad\text{and}\quad \begin{pmatrix} P_{22} & P_{23} \\ P_{32} & P_{33} \end{pmatrix} < 0. \qquad (4.5.6)$$


If (4.5.6) holds, one particular solution is given by

$$X = P_{32}^TP_{33}^{-1}P_{31} - P_{21}. \qquad (4.5.7)$$

Proof. If (4.5.5) has a solution, then (4.5.6) just follows from (4.5.5) by canceling the first or second block row/column.

Now suppose that (4.5.6) holds, which implies $P_{33} < 0$. We observe that (4.5.5) is equivalent to (Schur complement)

$$\begin{pmatrix} P_{11} & P_{12}+X^T \\ P_{21}+X & P_{22} \end{pmatrix} - \begin{pmatrix} P_{13} \\ P_{23} \end{pmatrix} P_{33}^{-1} \begin{pmatrix} P_{31} & P_{32} \end{pmatrix} < 0.$$

Due to (4.5.6), the diagonal blocks are negative definite. $X$ defined in (4.5.7) just renders the off-diagonal block zero, so that it is a solution of the latter matrix inequality.
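Formula (4.5.7) can be verified numerically: build a $P$ for which both conditions in (4.5.6) hold, perturb only the (1,2)/(2,1)-blocks (which leaves (4.5.6) untouched), and check that the explicit $X$ restores negative definiteness. The data are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 3
G = rng.standard_normal((3 * n, 3 * n))
P = -(G @ G.T) - np.eye(3 * n)        # negative definite, so (4.5.6) holds
P[:n, n:2*n] += 10.0                  # perturb the (1,2)/(2,1) blocks only;
P[n:2*n, :n] += 10.0                  # the two sub-LMIs in (4.5.6) are untouched

P21 = P[n:2*n, :n]
P31, P32, P33 = P[2*n:, :n], P[2*n:, n:2*n], P[2*n:, 2*n:]

X = P32.T @ np.linalg.solve(P33, P31) - P21   # formula (4.5.7)

Pmod = P.copy()
Pmod[n:2*n, :n] += X
Pmod[:n, n:2*n] += X.T
print(np.linalg.eigvalsh(Pmod).max() < 0)     # True: (4.5.5) is satisfied
```

With this $X$ the off-diagonal block of the Schur-complement form vanishes exactly, which is why the repaired matrix is guaranteed negative definite.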

An even more powerful generalization is the so-called Projection Lemma.

Lemma 4.13 (Projection Lemma) For arbitrary $A$, $B$ and a symmetric $P$, the LMI

$$A^TXB + B^TX^TA + P < 0 \qquad (4.5.8)$$

in the unstructured $X$ has a solution if and only if

$$Ax = 0 \text{ or } Bx = 0 \ \text{ imply } \ x^TPx < 0 \text{ or } x = 0. \qquad (4.5.9)$$

If $A_\perp$ and $B_\perp$ denote arbitrary matrices whose columns form a basis of $\ker(A)$ and $\ker(B)$ respectively, (4.5.9) is equivalent to

$$A_\perp^TPA_\perp < 0 \quad\text{and}\quad B_\perp^TPB_\perp < 0. \qquad (4.5.10)$$

We give a full proof of the Projection Lemma since it provides a scheme for constructing a solution $X$ if it exists. It also reveals that, in suitable coordinates, Lemma 4.13 reduces to Lemma 4.12 if the kernels of $A$ and $B$ together span the whole space.

Proof. The proof of 'only if' is trivial. Indeed, let us assume that there exists some $X$ with $A^TXB + B^TX^TA + P < 0$. Then $Ax = 0$ or $Bx = 0$ with $x \neq 0$ imply the desired inequality $0 > x^T(A^TXB + B^TX^TA + P)x = x^TPx$.

For proving 'if', let $S = (S_1\ S_2\ S_3\ S_4)$ be a nonsingular matrix such that the columns of $S_3$ span $\ker(A) \cap \ker(B)$, those of $(S_1\ S_3)$ span $\ker(A)$, and those of $(S_2\ S_3)$ span $\ker(B)$. Instead of (4.5.8), we consider the equivalent inequality $S^T(4.5.8)S < 0$, which reads as

$$(AS)^TX(BS) + (BS)^TX^T(AS) + S^TPS < 0. \qquad (4.5.11)$$


Now note that $AS$ and $BS$ have the structure $(0\ A_2\ 0\ A_4)$ and $(B_1\ 0\ 0\ B_4)$, where $(A_2\ A_4)$ and $(B_1\ B_4)$ have full column rank respectively. The rank properties imply that the equation

$$(AS)^TX(BS) = \begin{pmatrix} 0 \\ A_2^T \\ 0 \\ A_4^T \end{pmatrix} X \begin{pmatrix} B_1 & 0 & 0 & B_4 \end{pmatrix} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ Z_{21} & 0 & 0 & Z_{24} \\ 0 & 0 & 0 & 0 \\ Z_{41} & 0 & 0 & Z_{44} \end{pmatrix}$$

has a solution $X$ for arbitrary $Z_{21}$, $Z_{24}$, $Z_{41}$, $Z_{44}$. With $Q := S^TPS$ partitioned accordingly, (4.5.11) hence reads as

$$\begin{pmatrix} Q_{11} & Q_{12}+Z_{21}^T & Q_{13} & Q_{14}+Z_{41}^T \\ Q_{21}+Z_{21} & Q_{22} & Q_{23} & Q_{24}+Z_{24} \\ Q_{31} & Q_{32} & Q_{33} & Q_{34} \\ Q_{41}+Z_{41} & Q_{42}+Z_{24}^T & Q_{43} & Q_{44}+Z_{44}+Z_{44}^T \end{pmatrix} < 0 \qquad (4.5.12)$$

with free blocks $Z_{21}$, $Z_{24}$, $Z_{41}$, $Z_{44}$. Since

$$\ker(AS) = \operatorname{im}\begin{pmatrix} I & 0 \\ 0 & 0 \\ 0 & I \\ 0 & 0 \end{pmatrix} \quad\text{and}\quad \ker(BS) = \operatorname{im}\begin{pmatrix} 0 & 0 \\ I & 0 \\ 0 & I \\ 0 & 0 \end{pmatrix},$$

the hypothesis (4.5.9) just amounts to the conditions

$$\begin{pmatrix} Q_{11} & Q_{13} \\ Q_{31} & Q_{33} \end{pmatrix} < 0 \quad\text{and}\quad \begin{pmatrix} Q_{22} & Q_{23} \\ Q_{32} & Q_{33} \end{pmatrix} < 0.$$

By Lemma 4.12, we can hence find a matrix $Z_{21}$ which renders the upper-left $3\times 3$ block of (4.5.12) negative definite. The blocks $Z_{41}$ and $Z_{24}$ can be taken arbitrary. After having fixed $Z_{21}$, $Z_{41}$, $Z_{24}$, we can choose $Z_{44}$ according to Lemma 4.11 such that the whole matrix on the left-hand side of (4.5.12) is negative definite.

Remark. We can, of course, replace $<$ everywhere by $>$. It is important to recall that the unknown $X$ is unstructured. If one requires $X$ to have a certain structure (such as being symmetric), the tests, if existing at all, are much more complicated. There is, however, a generally valid extension of the Projection Lemma to block-triangular unknowns $X$ [28]. Note that the results do not hold true as formulated if one just replaces the strict inequalities by non-strict inequalities (as is sometimes erroneously claimed in the literature)! Again, it is possible to provide a full generalization of the Projection Lemma to non-strict inequalities.
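As a sanity check of the Projection Lemma, consider the special case $B = I$ (so $\ker(B) = \{0\}$ and only the first condition in (4.5.10) remains). Then an explicit solution is $X = \tau A$ with $\tau$ sufficiently negative, since $A^TXB + B^TX^TA + P = 2\tau A^TA + P$, which equals $P < 0$ on $\ker(A)$ and is pushed negative elsewhere by $\tau$. A numeric sketch with arbitrary data of our own:

```python
import numpy as np

rng = np.random.default_rng(4)
n, m = 2, 5
A = rng.standard_normal((n, m))       # wide matrix: ker(A) has dimension 3
P = A.T @ A - np.eye(m)               # indefinite in general, but P = -I on ker(A)

# Condition (4.5.10) with B = I: only A_perp^T P A_perp < 0 is required.
Aperp = np.linalg.svd(A)[2][n:].T     # orthonormal basis of ker(A)
print(np.linalg.eigvalsh(Aperp.T @ P @ Aperp).max())  # about -1: condition holds

# Explicit solution guess X = tau*A with tau = -1:
# A^T X + X^T A + P = 2*tau*A^T A + P = -A^T A - I < 0.
X = -1.0 * A
lhs = A.T @ X + X.T @ A + P
print(np.linalg.eigvalsh(lhs).max())  # about -1: (4.5.8) is solved
```

For general $A$, $B$ the proof's coordinate construction (via Lemmas 4.11 and 4.12) is needed instead of such a one-line guess.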

Let

$$P = \begin{pmatrix} Q & S \\ S^T & R \end{pmatrix} \ \text{ with } R \geq 0 \ \text{ have the inverse } \ P^{-1} = \begin{pmatrix} \tilde{Q} & \tilde{S} \\ \tilde{S}^T & \tilde{R} \end{pmatrix} \ \text{ with } \tilde{Q} \leq 0 \qquad (4.5.13)$$

and let us finally consider the quadratic inequality

$$\begin{pmatrix} I \\ A^TXB + C \end{pmatrix}^T P \begin{pmatrix} I \\ A^TXB + C \end{pmatrix} < 0 \qquad (4.5.14)$$


in the unstructured unknown $X$. According to Lemma 4.8, we can dualize this inequality to

$$\begin{pmatrix} -B^TX^TA - C^T \\ I \end{pmatrix}^T P^{-1} \begin{pmatrix} -B^TX^TA - C^T \\ I \end{pmatrix} > 0. \qquad (4.5.15)$$

It is pretty straightforward to derive necessary conditions for the solvability of (4.5.14). Indeed, let us assume that (4.5.14) holds for some $X$. If $A_\perp$ and $B_\perp$ denote basis matrices of $\ker(A)$ and $\ker(B)$ respectively, we infer

$$\begin{pmatrix} I \\ A^TXB + C \end{pmatrix} B_\perp = \begin{pmatrix} I \\ C \end{pmatrix} B_\perp \quad\text{and}\quad \begin{pmatrix} -B^TX^TA - C^T \\ I \end{pmatrix} A_\perp = \begin{pmatrix} -C^T \\ I \end{pmatrix} A_\perp.$$

Since $B_\perp^T(4.5.14)B_\perp < 0$ and $A_\perp^T(4.5.15)A_\perp > 0$, we arrive at the two easily verifiable inequalities

$$B_\perp^T \begin{pmatrix} I \\ C \end{pmatrix}^T P \begin{pmatrix} I \\ C \end{pmatrix} B_\perp < 0 \quad\text{and}\quad A_\perp^T \begin{pmatrix} -C^T \\ I \end{pmatrix}^T P^{-1} \begin{pmatrix} -C^T \\ I \end{pmatrix} A_\perp > 0 \qquad (4.5.16)$$

which are necessary for a solution of (4.5.14) to exist. One can constructively prove that they are also sufficient [35].
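Both conditions in (4.5.16) are cheap to evaluate, and their necessity can be confirmed by planting a solution: pick some $X_0$, build $P$ so that (4.5.14) holds for $X_0$ by construction, and evaluate both tests. All dimensions and data below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(5)
k, l = 3, 2
A = rng.standard_normal((1, l))       # ker(A) is 1-dimensional
B = rng.standard_normal((2, k))       # ker(B) is 1-dimensional
C = rng.standard_normal((l, k))
X0 = rng.standard_normal((1, 2))      # the planted solution of (4.5.14)

M0 = A.T @ X0 @ B + C
S = rng.standard_normal((k, l))
R = np.eye(l)
Q = -(S @ M0 + M0.T @ S.T + M0.T @ R @ M0) - np.eye(k)  # (I;M0)^T P (I;M0) = -I
P = np.block([[Q, S], [S.T, R]])
Pi = np.linalg.inv(P)

def null_basis(M):
    """Orthonormal basis of ker(M) via SVD."""
    Vh = np.linalg.svd(M)[2]
    return Vh[np.linalg.matrix_rank(M):].T

Bp, Ap = null_basis(B), null_basis(A)
IC = np.vstack([np.eye(k), C])
CI = np.vstack([-C.T, np.eye(l)])

print(np.linalg.eigvalsh(Bp.T @ IC.T @ P @ IC @ Bp).max() < 0)   # True
print(np.linalg.eigvalsh(Ap.T @ CI.T @ Pi @ CI @ Ap).min() > 0)  # True
```

Since $BB_\perp = 0$ and $AA_\perp = 0$, the unknown $X_0$ drops out exactly as in the derivation above, which is why both tests must pass.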

Lemma 4.14 (Elimination Lemma) Under the hypotheses (4.5.13) on $P$, the inequality (4.5.14) has a solution if and only if (4.5.16) holds true.

Proof. It remains to prove that (4.5.16) implies the existence of a solution of (4.5.14).

Let us first show that one may assume without loss of generality that R > 0 and Q̃ < 0. For that purpose we need information about the inertia of P. Due to R ≥ 0, P and P^{-1} have size(R) positive eigenvalues (since none of the eigenvalues can vanish). Similarly, Q̃ ≤ 0 implies that P^{-1} and P have size(Q̃) = size(Q) negative eigenvalues. Let us now consider (4.5.14) with the perturbed data
\[
P_\epsilon := \begin{pmatrix} Q & S \\ S^T & R + \epsilon I \end{pmatrix} \quad\text{where}\quad \epsilon > 0
\]
is fixed sufficiently small such that (4.5.16) persist to hold for P_ε, and such that P_ε and P have the same number of positive and negative eigenvalues. Trivially, the right-lower block of P_ε is positive definite. The Schur complement $Q - S(R + \epsilon I)^{-1} S^T$ of this right-lower block must be negative definite since P_ε has size(Q) negative and size(R) positive eigenvalues. Hence the left-upper block of $P_\epsilon^{-1}$, which equals $[Q - S(R + \epsilon I)^{-1} S^T]^{-1}$, is negative definite as well. If the result is proved for R > 0 and Q̃ < 0, we can conclude that (4.5.14) has a solution X for the perturbed data P_ε. Due to P ≤ P_ε, the very same X also satisfies the original inequality for P.

Let us hence assume from now on that R > 0 and Q̃ < 0. We observe that the left-hand side of (4.5.14) equals
\[
\begin{pmatrix} I \\ C \end{pmatrix}^T P \begin{pmatrix} I \\ C \end{pmatrix}
+ (A^T X B)^T (S^T + RC) + (S^T + RC)^T (A^T X B) + (A^T X B)^T R (A^T X B).
\]


Hence (4.5.14) is equivalent to (Schur)
\[
\begin{pmatrix}
\begin{pmatrix} I \\ C \end{pmatrix}^T P \begin{pmatrix} I \\ C \end{pmatrix}
+ (A^T X B)^T (S^T + RC) + (S^T + RC)^T (A^T X B) & (A^T X B)^T \\
A^T X B & -R^{-1}
\end{pmatrix} < 0
\]
or
\[
\begin{pmatrix}
\begin{pmatrix} I \\ C \end{pmatrix}^T P \begin{pmatrix} I \\ C \end{pmatrix} & 0 \\
0 & -R^{-1}
\end{pmatrix}
+ \begin{pmatrix} A (S^T + RC) & A \end{pmatrix}^T X \begin{pmatrix} B & 0 \end{pmatrix}
+ \begin{pmatrix} B & 0 \end{pmatrix}^T X^T \begin{pmatrix} A (S^T + RC) & A \end{pmatrix} < 0. \tag{4.5.17}
\]

The inequality (4.5.17) has the structure required in the Projection Lemma. We need to show that
\[
\begin{pmatrix} B & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = 0, \quad
\begin{pmatrix} x \\ y \end{pmatrix} \neq 0 \tag{4.5.18}
\]
or
\[
\begin{pmatrix} A (S^T + RC) & A \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = 0, \quad
\begin{pmatrix} x \\ y \end{pmatrix} \neq 0 \tag{4.5.19}
\]
imply
\[
\begin{pmatrix} x \\ y \end{pmatrix}^T
\begin{pmatrix}
\begin{pmatrix} I \\ C \end{pmatrix}^T P \begin{pmatrix} I \\ C \end{pmatrix} & 0 \\
0 & -R^{-1}
\end{pmatrix}
\begin{pmatrix} x \\ y \end{pmatrix}
= x^T \begin{pmatrix} I \\ C \end{pmatrix}^T P \begin{pmatrix} I \\ C \end{pmatrix} x - y^T R^{-1} y < 0. \tag{4.5.20}
\]

In a first step we show that (4.5.17), and hence (4.5.14), has a solution if A = I. Let us assume (4.5.18). Then (4.5.20) is trivial if x = 0. For x ≠ 0 we infer Bx = 0, and the first inequality in (4.5.16) implies
\[
x^T \begin{pmatrix} I \\ C \end{pmatrix}^T P \begin{pmatrix} I \\ C \end{pmatrix} x < 0,
\]

which shows that (4.5.20) is true. Let us now assume (4.5.19) with A = I. We infer x ≠ 0 and $y = -(S^T + RC)x$. The left-hand side of (4.5.20) is nothing but
\[
x^T \begin{pmatrix} I \\ C \end{pmatrix}^T P \begin{pmatrix} I \\ C \end{pmatrix} x
- x^T (S^T + RC)^T R^{-1} (S^T + RC) x
= x^T \begin{pmatrix} I \\ C \end{pmatrix}^T P \begin{pmatrix} I \\ C \end{pmatrix} x
- x^T \begin{pmatrix} I \\ C \end{pmatrix}^T \begin{pmatrix} S \\ R \end{pmatrix} R^{-1} \begin{pmatrix} S^T & R \end{pmatrix} \begin{pmatrix} I \\ C \end{pmatrix} x
\]
\[
= x^T \begin{pmatrix} I \\ C \end{pmatrix}^T \left[ P - \begin{pmatrix} S R^{-1} S^T & S \\ S^T & R \end{pmatrix} \right] \begin{pmatrix} I \\ C \end{pmatrix} x
= x^T ( Q - S R^{-1} S^T ) x,
\]


which is indeed negative since $\tilde Q^{-1} = Q - S R^{-1} S^T < 0$ and x ≠ 0. We conclude that, for A = I, (4.5.17) and hence
\[
\begin{pmatrix} I \\ X B + C \end{pmatrix}^T P \begin{pmatrix} I \\ X B + C \end{pmatrix} < 0
\]
have a solution.

By symmetry, since one can apply the arguments provided above to the dual inequality (4.5.15), we can infer that
\[
\begin{pmatrix} I \\ A^T X + C \end{pmatrix}^T P \begin{pmatrix} I \\ A^T X + C \end{pmatrix} < 0
\]
has a solution X. This implies that (4.5.17) has a solution for B = I. Therefore, with the Projection Lemma, (4.5.19) implies (4.5.20) for general A.

In summary, we have proved for general A and B that (4.5.18) or (4.5.19) imply (4.5.20). With the Projection Lemma we can hence infer the solvability of (4.5.17), and thus that of (4.5.14).
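The constructive idea behind the proof is easy to exercise numerically. In the sketch below (all data hypothetical, my own choices) A and B are square and invertible, so the kernels are trivial, the conditions (4.5.16) are void, and the explicit choice $A^T X B + C = -R^{-1} S^T$ from the proof solves (4.5.14):

```python
import numpy as np

# Hypothetical data; P = [[Q, S], [S^T, R]] satisfies (4.5.13) by construction.
S = np.array([[0.4, -0.7], [1.1, 0.2]])
Q, R = -np.eye(2), np.eye(2)                 # then Q - S R^{-1} S^T < 0
P = np.block([[Q, S], [S.T, R]])

A = np.array([[2.0, 0.3], [0.1, 2.0]])       # invertible
B = np.array([[1.5, -0.2], [0.4, 1.8]])      # invertible
C = np.array([[0.5, -1.0], [0.2, 0.3]])

# Constructive choice from the proof: make A^T X B + C = -R^{-1} S^T.
Z = -np.linalg.solve(R, S.T)
X = np.linalg.solve(A.T, Z - C) @ np.linalg.inv(B)

outer = np.vstack([np.eye(2), A.T @ X @ B + C])
lhs = outer.T @ P @ outer                    # equals Q - S R^{-1} S^T here
print(np.linalg.eigvalsh(lhs).max() < 0)     # True
```

The resulting left-hand side of (4.5.14) collapses to the Schur complement $Q - S R^{-1} S^T$, which is negative definite by construction.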

4.5.3 The quadratic performance problem

For the performance index
\[
P_j = \begin{pmatrix} Q_j & S_j \\ S_j^T & R_j \end{pmatrix}, \; R_j \geq 0,
\quad\text{with inverse}\quad
P_j^{-1} = \begin{pmatrix} \tilde Q_j & \tilde S_j \\ \tilde S_j^T & \tilde R_j \end{pmatrix}, \; \tilde Q_j \leq 0, \tag{4.5.21}
\]

we have derived the following synthesis inequalities:
\[
X(v) > 0, \qquad
\begin{pmatrix} I & 0 \\ A(v) & B_j(v) \\ 0 & I \\ C_j(v) & D_j(v) \end{pmatrix}^T
\begin{pmatrix} 0 & I & 0 & 0 \\ I & 0 & 0 & 0 \\ 0 & 0 & Q_j & S_j \\ 0 & 0 & S_j^T & R_j \end{pmatrix}
\begin{pmatrix} I & 0 \\ A(v) & B_j(v) \\ 0 & I \\ C_j(v) & D_j(v) \end{pmatrix} < 0. \tag{4.5.22}
\]

Due to the specific structure
\[
\begin{pmatrix} A(v) & B_j(v) \\ C_j(v) & D_j(v) \end{pmatrix}
= \begin{pmatrix} AY & A & B_j \\ 0 & XA & XB_j \\ C_j Y & C_j & D_j \end{pmatrix}
+ \begin{pmatrix} 0 & B \\ I & 0 \\ 0 & E_j \end{pmatrix}
\begin{pmatrix} K & L \\ M & N \end{pmatrix}
\begin{pmatrix} I & 0 & 0 \\ 0 & C & F_j \end{pmatrix}, \tag{4.5.23}
\]
it is straightforward to apply Lemma 4.14 to eliminate all the variables $\left(\begin{smallmatrix} K & L \\ M & N \end{smallmatrix}\right)$. For that purpose it suffices to compute basis matrices
\[
\Phi_j = \begin{pmatrix} \Phi_j^1 \\ \Phi_j^2 \end{pmatrix} \text{ of } \ker\begin{pmatrix} B^T & E_j^T \end{pmatrix}
\quad\text{and}\quad
\Psi_j = \begin{pmatrix} \Psi_j^1 \\ \Psi_j^2 \end{pmatrix} \text{ of } \ker\begin{pmatrix} C & F_j \end{pmatrix}.
\]
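In a computational implementation, these basis matrices are plain null-space computations. A minimal sketch with hypothetical plant matrices (the numbers below are illustrative only, not from the text):

```python
import numpy as np

def null_basis(M, tol=1e-10):
    # columns spanning ker(M), via the SVD
    U, s, Vh = np.linalg.svd(M)
    rank = int((s > tol).sum())
    return Vh[rank:].T

# Hypothetical plant data for illustration.
B  = np.array([[0.0], [1.0]])    # control input matrix (2 states, 1 input)
Ej = np.array([[1.0]])           # control feedthrough into performance channel j
C  = np.array([[1.0, 0.0]])      # measurement matrix
Fj = np.array([[0.0]])           # disturbance feedthrough into the measurement

Phi = null_basis(np.hstack([B.T, Ej.T]))   # basis of ker(B^T  Ej^T)
Psi = null_basis(np.hstack([C, Fj]))       # basis of ker(C  Fj)

# Columns of Phi / Psi: what control cannot influence / what y cannot see.
print(np.allclose(np.hstack([B.T, Ej.T]) @ Phi, 0))   # True
print(np.allclose(np.hstack([C, Fj]) @ Psi, 0))       # True
```

The partitions $\Phi_j^1, \Phi_j^2$ and $\Psi_j^1, \Psi_j^2$ are then just row slices of these bases, conforming to the block sizes of the stacked kernel matrices.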


Corollary 4.15 For a performance index with (4.5.21), there exists a solution v of (4.5.22) if and only if there exist symmetric X and Y that satisfy
\[
\begin{pmatrix} Y & I \\ I & X \end{pmatrix} > 0, \tag{4.5.24}
\]
\[
\Psi_j^T
\begin{pmatrix} I & 0 \\ A & B_j \\ 0 & I \\ C_j & D_j \end{pmatrix}^T
\begin{pmatrix} 0 & X & 0 & 0 \\ X & 0 & 0 & 0 \\ 0 & 0 & Q_j & S_j \\ 0 & 0 & S_j^T & R_j \end{pmatrix}
\begin{pmatrix} I & 0 \\ A & B_j \\ 0 & I \\ C_j & D_j \end{pmatrix} \Psi_j < 0, \tag{4.5.25}
\]
\[
\Phi_j^T
\begin{pmatrix} -A^T & -C_j^T \\ I & 0 \\ -B_j^T & -D_j^T \\ 0 & I \end{pmatrix}^T
\begin{pmatrix} 0 & Y & 0 & 0 \\ Y & 0 & 0 & 0 \\ 0 & 0 & \tilde Q_j & \tilde S_j \\ 0 & 0 & \tilde S_j^T & \tilde R_j \end{pmatrix}
\begin{pmatrix} -A^T & -C_j^T \\ I & 0 \\ -B_j^T & -D_j^T \\ 0 & I \end{pmatrix} \Phi_j > 0. \tag{4.5.26}
\]

Remark. Note that the columns of $\left(\begin{smallmatrix} B \\ E_j \end{smallmatrix}\right)$ indicate to what extent the right-hand side of (4.1.1) can be modified by control, and the rows of $\begin{pmatrix} C & F_j \end{pmatrix}$ determine those functionals that provide information about the system state and the disturbance that is available for control. Roughly speaking, the columns of Φ_j or of Ψ_j indicate what cannot be influenced by control or which information cannot be extracted from the measured output. Let us hence compare (4.5.24)-(4.5.26) with the synthesis inequalities that would be obtained for
\[
\begin{pmatrix} \dot x \\ z_1 \\ \vdots \\ z_q \end{pmatrix}
= \begin{pmatrix} A & B_1 & \cdots & B_q \\ C_1 & D_1 & \cdots & D_{1q} \\ \vdots & \vdots & \ddots & \vdots \\ C_q & D_{q1} & \cdots & D_q \end{pmatrix}
\begin{pmatrix} x \\ w_1 \\ \vdots \\ w_q \end{pmatrix} \tag{4.5.27}
\]

without control input and measurement output. For this system we could choose Φ = I and Ψ = I to arrive at the synthesis inequalities
\[
\begin{pmatrix} Y & I \\ I & X \end{pmatrix} > 0, \tag{4.5.28}
\]
\[
\begin{pmatrix} I & 0 \\ A & B_j \\ 0 & I \\ C_j & D_j \end{pmatrix}^T
\begin{pmatrix} 0 & X & 0 & 0 \\ X & 0 & 0 & 0 \\ 0 & 0 & Q_j & S_j \\ 0 & 0 & S_j^T & R_j \end{pmatrix}
\begin{pmatrix} I & 0 \\ A & B_j \\ 0 & I \\ C_j & D_j \end{pmatrix} < 0, \tag{4.5.29}
\]
\[
\begin{pmatrix} -A^T & -C_j^T \\ I & 0 \\ -B_j^T & -D_j^T \\ 0 & I \end{pmatrix}^T
\begin{pmatrix} 0 & Y & 0 & 0 \\ Y & 0 & 0 & 0 \\ 0 & 0 & \tilde Q_j & \tilde S_j \\ 0 & 0 & \tilde S_j^T & \tilde R_j \end{pmatrix}
\begin{pmatrix} -A^T & -C_j^T \\ I & 0 \\ -B_j^T & -D_j^T \\ 0 & I \end{pmatrix} > 0. \tag{4.5.30}
\]


Since there is no control and no measured output, these can be viewed as analysis inequalities for (4.5.27). Hence we have displayed very nicely to what extent controls or measurements influence the synthesis inequalities through Φ_j and Ψ_j. Finally, we note that (4.5.28)-(4.5.30) are equivalent to X > 0, (4.5.29), or to Y > 0, (4.5.30). Moreover, if dualizing X > 0, (4.5.29), we arrive at Y > 0, (4.5.30) for Y := X^{-1}.

Proof of Corollary 4.15. The first inequality (4.5.24) is just X(v) > 0. The inequalities (4.5.25)-(4.5.26) are obtained by simply applying Lemma 4.14 to the second inequality of (4.5.22), viewed as a quadratic matrix inequality in the unknowns $\left(\begin{smallmatrix} K & L \\ M & N \end{smallmatrix}\right)$. For that purpose we first observe that
\[
\ker\begin{pmatrix} 0 & I & 0 \\ B^T & 0 & E_j^T \end{pmatrix}, \qquad
\ker\begin{pmatrix} I & 0 & 0 \\ 0 & C & F_j \end{pmatrix}
\]
have the basis matrices
\[
\begin{pmatrix} \Phi_j^1 \\ 0 \\ \Phi_j^2 \end{pmatrix}, \qquad
\begin{pmatrix} 0 \\ \Psi_j^1 \\ \Psi_j^2 \end{pmatrix}
\]
respectively. Due to
\[
\begin{pmatrix} I & 0 \\ A(v) & B_j(v) \\ 0 & I \\ C_j(v) & D_j(v) \end{pmatrix}
\begin{pmatrix} 0 \\ \Psi_j^1 \\ \Psi_j^2 \end{pmatrix}
= \begin{pmatrix} I & 0 & 0 \\ 0 & I & 0 \\ AY & A & B_j \\ 0 & XA & XB_j \\ 0 & 0 & I \\ C_j Y & C_j & D_j \end{pmatrix}
\begin{pmatrix} 0 \\ \Psi_j^1 \\ \Psi_j^2 \end{pmatrix}
= \begin{pmatrix} 0 & 0 \\ I & 0 \\ A & B_j \\ XA & XB_j \\ 0 & I \\ C_j & D_j \end{pmatrix} \Psi_j,
\]
the solvability condition that corresponds to the first inequality in (4.5.16) reads
\[
\Psi_j^T
\begin{pmatrix} 0 & 0 \\ I & 0 \\ A & B_j \\ XA & XB_j \\ 0 & I \\ C_j & D_j \end{pmatrix}^T
\begin{pmatrix}
0 & 0 & I & 0 & 0 & 0 \\
0 & 0 & 0 & I & 0 & 0 \\
I & 0 & 0 & 0 & 0 & 0 \\
0 & I & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & Q_j & S_j \\
0 & 0 & 0 & 0 & S_j^T & R_j
\end{pmatrix}
\begin{pmatrix} 0 & 0 \\ I & 0 \\ A & B_j \\ XA & XB_j \\ 0 & I \\ C_j & D_j \end{pmatrix} \Psi_j < 0,
\]
which simplifies to
\[
\Psi_j^T
\begin{pmatrix} I & 0 \\ XA & XB_j \\ 0 & I \\ C_j & D_j \end{pmatrix}^T
\begin{pmatrix} 0 & I & 0 & 0 \\ I & 0 & 0 & 0 \\ 0 & 0 & Q_j & S_j \\ 0 & 0 & S_j^T & R_j \end{pmatrix}
\begin{pmatrix} I & 0 \\ XA & XB_j \\ 0 & I \\ C_j & D_j \end{pmatrix} \Psi_j < 0.
\]


This is clearly nothing but (4.5.25). The very same simple steps lead to (4.5.26). Indeed, we have
\[
\begin{pmatrix} -A(v)^T & -C_j(v)^T \\ I & 0 \\ -B_j(v)^T & -D_j(v)^T \\ 0 & I \end{pmatrix}
\begin{pmatrix} \Phi_j^1 \\ 0 \\ \Phi_j^2 \end{pmatrix}
= \begin{pmatrix}
-Y A^T & 0 & -Y C_j^T \\
-A^T & -A^T X & -C_j^T \\
I & 0 & 0 \\
0 & I & 0 \\
-B_j^T & -B_j^T X & -D_j^T \\
0 & 0 & I
\end{pmatrix}
\begin{pmatrix} \Phi_j^1 \\ 0 \\ \Phi_j^2 \end{pmatrix}
= \begin{pmatrix}
-Y A^T & -Y C_j^T \\
-A^T & -C_j^T \\
I & 0 \\
0 & 0 \\
-B_j^T & -D_j^T \\
0 & I
\end{pmatrix} \Phi_j
\]
such that the solvability condition that corresponds to the second inequality in (4.5.16) is
\[
\Phi_j^T
\begin{pmatrix}
-Y A^T & -Y C_j^T \\ -A^T & -C_j^T \\ I & 0 \\ 0 & 0 \\ -B_j^T & -D_j^T \\ 0 & I
\end{pmatrix}^T
\begin{pmatrix}
0 & 0 & I & 0 & 0 & 0 \\
0 & 0 & 0 & I & 0 & 0 \\
I & 0 & 0 & 0 & 0 & 0 \\
0 & I & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & \tilde Q_j & \tilde S_j \\
0 & 0 & 0 & 0 & \tilde S_j^T & \tilde R_j
\end{pmatrix}
\begin{pmatrix}
-Y A^T & -Y C_j^T \\ -A^T & -C_j^T \\ I & 0 \\ 0 & 0 \\ -B_j^T & -D_j^T \\ 0 & I
\end{pmatrix} \Phi_j > 0,
\]
which simplifies to
\[
\Phi_j^T
\begin{pmatrix} -Y A^T & -Y C_j^T \\ I & 0 \\ -B_j^T & -D_j^T \\ 0 & I \end{pmatrix}^T
\begin{pmatrix} 0 & I & 0 & 0 \\ I & 0 & 0 & 0 \\ 0 & 0 & \tilde Q_j & \tilde S_j \\ 0 & 0 & \tilde S_j^T & \tilde R_j \end{pmatrix}
\begin{pmatrix} -Y A^T & -Y C_j^T \\ I & 0 \\ -B_j^T & -D_j^T \\ 0 & I \end{pmatrix} \Phi_j > 0,
\]
and we arrive at (4.5.26).

Starting from the synthesis inequalities (4.5.22) in the variables X, Y, $\left(\begin{smallmatrix} K & L \\ M & N \end{smallmatrix}\right)$, we have derived the equivalent inequalities (4.5.24)-(4.5.26) in the variables X and Y only. Testing feasibility of these latter inequalities can hence be accomplished much faster. This is particularly advantageous when optimizing an additional parameter, such as minimizing the sub-optimality level γ in the H∞ problem.

To conclude this section, let us comment on how to compute the controller after having found solutions X, Y of (4.5.24)-(4.5.26). One possibility is to explicitly solve the quadratic inequality (4.5.22) in $\left(\begin{smallmatrix} K & L \\ M & N \end{smallmatrix}\right)$ along the lines of the proof of Lemma 4.14, and to reconstruct the controller parameters as earlier. One could as well proceed directly: starting from X and Y, we can compute non-singular U and V with $U V^T = I - XY$, and determine $\mathcal X > 0$ by solving the first equation in (4.2.10). Due


to (4.1.5), we can apply Lemma 4.14 directly to the analysis inequality
\[
\begin{pmatrix} I & 0 \\ \mathcal A & \mathcal B_j \\ 0 & I \\ \mathcal C_j & \mathcal D_j \end{pmatrix}^T
\begin{pmatrix} 0 & \mathcal X & 0 & 0 \\ \mathcal X & 0 & 0 & 0 \\ 0 & 0 & Q_j & S_j \\ 0 & 0 & S_j^T & R_j \end{pmatrix}
\begin{pmatrix} I & 0 \\ \mathcal A & \mathcal B_j \\ 0 & I \\ \mathcal C_j & \mathcal D_j \end{pmatrix} < 0
\]
if viewing $\left(\begin{smallmatrix} A_c & B_c \\ C_c & D_c \end{smallmatrix}\right)$ as variables. It is not difficult (and you should provide the details!) to verify the solvability conditions for this quadratic inequality, and to construct an explicit solution along the lines of the proof of Lemma 4.14. Alternatively, one can transform the quadratic inequality into a linear matrix inequality with Lemma 4.2, and apply the Projection Lemma to reconstruct the controller parameters. For the latter step the LMI-Lab offers a standard routine. We conclude that there are many essentially equivalent ways to compute a controller once one has determined X and Y.

4.5.4 H2-problems

Recalling (4.2.3), we observe that both inequalities in the H2-synthesis conditions (4.3.3) involve the variables M and N, but only the first one,
\[
\begin{pmatrix} A(v)^T + A(v) & B_j(v) \\ B_j(v)^T & -\gamma_j I \end{pmatrix} < 0, \tag{4.5.31}
\]
is affected by K and L. This might suggest that the latter two variables can be eliminated from the synthesis conditions. Since (4.5.31) is affine in $\begin{pmatrix} K & L \end{pmatrix}$, we can indeed apply the Projection Lemma to eliminate these variables. It is not difficult to arrive at the following alternative synthesis conditions for H2-type criteria.

Corollary 4.16 There exists a controller that renders (4.3.2) satisfied for some $\mathcal X$, Z_j iff there exist X, Y, M, N, Z_j with $f_j(Z_j) < \gamma_j$, $D_j + E_j N F_j = 0$ and
\[
\begin{pmatrix} Y & I & (C_j Y + E_j M)^T \\ I & X & (C_j + E_j N C)^T \\ C_j Y + E_j M & C_j + E_j N C & Z_j \end{pmatrix} > 0, \qquad
\Psi_j^T \begin{pmatrix} A^T X + X A & X B_j \\ B_j^T X & -\gamma_j I \end{pmatrix} \Psi_j < 0,
\]
\[
\begin{pmatrix} (AY + BM) + (AY + BM)^T & B_j + B N F_j \\ (B_j + B N F_j)^T & -\gamma_j I \end{pmatrix} < 0. \tag{4.5.32}
\]


Proof. We only need to show that the elimination of K and L in (4.5.31) leads to the last two inequalities of (4.5.32). Let us recall
\[
\begin{pmatrix} A(v) & B_j(v) \end{pmatrix}
= \begin{pmatrix} AY & A & B_j \\ 0 & XA & XB_j \end{pmatrix}
+ \begin{pmatrix} 0 & B \\ I & 0 \end{pmatrix}
\begin{pmatrix} K & L \\ M & N \end{pmatrix}
\begin{pmatrix} I & 0 & 0 \\ 0 & C & F_j \end{pmatrix}
= \begin{pmatrix} AY + BM & A + BNC & B_j + BNF_j \\ 0 & XA & XB_j \end{pmatrix}
+ \begin{pmatrix} 0 \\ I \end{pmatrix}
\begin{pmatrix} K & L \end{pmatrix}
\begin{pmatrix} I & 0 & 0 \\ 0 & C & F_j \end{pmatrix}.
\]
Therefore, (4.5.31) is equivalent to
\[
\begin{pmatrix} AY + Y A^T & A & B_j \\ A^T & A^T X + XA & XB_j \\ B_j^T & B_j^T X & -\gamma_j I \end{pmatrix}
+ \operatorname{sym}\left( \begin{pmatrix} B \\ 0 \\ 0 \end{pmatrix} \begin{pmatrix} M & N \end{pmatrix} \begin{pmatrix} I & 0 & 0 \\ 0 & C & F_j \end{pmatrix} \right)
+ \operatorname{sym}\left( \begin{pmatrix} 0 \\ I \\ 0 \end{pmatrix} \begin{pmatrix} K & L \end{pmatrix} \begin{pmatrix} I & 0 & 0 \\ 0 & C & F_j \end{pmatrix} \right) < 0,
\]
where sym(M) := M + M^T is just an abbreviation to shorten the formulas. Now note that
\[
\ker\begin{pmatrix} 0 & I & 0 \end{pmatrix}, \qquad
\ker\begin{pmatrix} I & 0 & 0 \\ 0 & C & F_j \end{pmatrix}
\]
have the basis matrices
\[
\begin{pmatrix} I & 0 \\ 0 & 0 \\ 0 & I \end{pmatrix}, \qquad
\begin{pmatrix} 0 \\ \Psi_j^1 \\ \Psi_j^2 \end{pmatrix}
\]
respectively. Therefore, the Projection Lemma leads to the two inequalities
\[
\begin{pmatrix} 0 \\ \Psi_j^1 \\ \Psi_j^2 \end{pmatrix}^T
\begin{pmatrix} AY + Y A^T & A & B_j \\ A^T & A^T X + XA & XB_j \\ B_j^T & B_j^T X & -\gamma_j I \end{pmatrix}
\begin{pmatrix} 0 \\ \Psi_j^1 \\ \Psi_j^2 \end{pmatrix} < 0
\]
and
\[
\begin{pmatrix} AY + Y A^T & B_j \\ B_j^T & -\gamma_j I \end{pmatrix}
+ \operatorname{sym}\left( \begin{pmatrix} B \\ 0 \end{pmatrix} \begin{pmatrix} M & N \end{pmatrix} \begin{pmatrix} I & 0 \\ 0 & F_j \end{pmatrix} \right) < 0,
\]
which are easily rewritten into the last two inequalities of (4.5.32).

If it happens that E_j vanishes, we can also eliminate all the variables $\left(\begin{smallmatrix} K & L \\ M & N \end{smallmatrix}\right)$ from the synthesis inequalities. The corresponding results are obtained in a straightforward fashion, and their proof is left as an exercise.

Corollary 4.17 Suppose that E_j = 0. Then there exists a controller that renders (4.3.2) satisfied for some $\mathcal X$, Z_j iff D_j = 0 and there exist X, Y, Z_j with $f_j(Z_j) < \gamma_j$ and
\[
\begin{pmatrix} Y & I & (C_j Y)^T \\ I & X & C_j^T \\ C_j Y & C_j & Z_j \end{pmatrix} > 0, \qquad
\Psi_j^T \begin{pmatrix} A^T X + XA & X B_j \\ B_j^T X & -\gamma_j I \end{pmatrix} \Psi_j < 0,
\]
\[
\begin{pmatrix} \Phi & 0 \\ 0 & I \end{pmatrix}^T
\begin{pmatrix} AY + Y A^T & B_j \\ B_j^T & -\gamma_j I \end{pmatrix}
\begin{pmatrix} \Phi & 0 \\ 0 & I \end{pmatrix} < 0
\]


where Φ is a basis matrix of ker(B^T).

Remarks.

• Once the synthesis inequalities have been solved, the computation of $\begin{pmatrix} K & L \end{pmatrix}$ or of $\left(\begin{smallmatrix} K & L \\ M & N \end{smallmatrix}\right)$ can be performed along the lines of the proof of the Projection Lemma.

• Our main concern was to perform the variable elimination with as few computations as possible. The results should be read as examples of how one can proceed in specific circumstances, and they can easily be extended to various other performance specifications. As an exercise, the reader should eliminate variables in the peak-to-peak upper bound synthesis LMI's.

4.6 State-feedback problems

The state-feedback problem is characterized by
\[
y = x \quad\text{or}\quad
\begin{pmatrix} C & F_1 & \cdots & F_q \end{pmatrix} = \begin{pmatrix} I & 0 & \cdots & 0 \end{pmatrix}.
\]
Then the formulas (4.2.3) read as
\[
\begin{pmatrix} A(v) & B_j(v) \\ C_j(v) & D_j(v) \end{pmatrix}
= \begin{pmatrix} AY + BM & A + BN & B_j \\ K & XA + L & XB_j \\ C_j Y + E_j M & C_j + E_j N & D_j \end{pmatrix}.
\]

Note that the variable L only appears in the (2,2)-block, and that we can assign an arbitrary matrix in this position by suitably choosing L. Therefore, by varying L, the (2,2)-block of
\[
\begin{pmatrix} A(v)^T + A(v) & B_j(v) \\ B_j(v)^T & 0 \end{pmatrix}
= \begin{pmatrix}
(AY + BM) + (AY + BM)^T & (A + BN) + K^T & B_j \\
K + (A + BN)^T & (XA + L) + (XA + L)^T & XB_j \\
B_j^T & B_j^T X & 0
\end{pmatrix}
\]
varies in the set of all symmetric matrices. This allows us to apply Lemma 4.11 in order to eliminate L from the synthesis inequalities, which leads to a drastic simplification.

Let us illustrate all this for the quadratic performance problem. The corresponding synthesis inequalities (4.2.7) read as
\[
\begin{pmatrix} Y & I \\ I & X \end{pmatrix} > 0,
\]
\[
\begin{pmatrix}
(AY + BM) + (AY + BM)^T & (A + BN) + K^T & B_j \\
K + (A + BN)^T & (XA + L) + (XA + L)^T & XB_j \\
B_j^T & B_j^T X & 0
\end{pmatrix}
+ \begin{pmatrix} 0 & 0 & I \\ C_j Y + E_j M & C_j + E_j N & D_j \end{pmatrix}^T P_j
\begin{pmatrix} 0 & 0 & I \\ C_j Y + E_j M & C_j + E_j N & D_j \end{pmatrix} < 0.
\]


These imply, just by cancelling the second block row/column,
\[
Y > 0, \qquad
\begin{pmatrix} (AY + BM) + (AY + BM)^T & B_j \\ B_j^T & 0 \end{pmatrix}
+ \begin{pmatrix} 0 & I \\ C_j Y + E_j M & D_j \end{pmatrix}^T P_j
\begin{pmatrix} 0 & I \\ C_j Y + E_j M & D_j \end{pmatrix} < 0
\]
or
\[
Y > 0, \qquad
\begin{pmatrix} I & 0 \\ AY + BM & B_j \\ 0 & I \\ C_j Y + E_j M & D_j \end{pmatrix}^T
\begin{pmatrix} 0 & I & 0 & 0 \\ I & 0 & 0 & 0 \\ 0 & 0 & Q_j & S_j \\ 0 & 0 & S_j^T & R_j \end{pmatrix}
\begin{pmatrix} I & 0 \\ AY + BM & B_j \\ 0 & I \\ C_j Y + E_j M & D_j \end{pmatrix} < 0. \tag{4.6.1}
\]

This is a drastic simplification since only the variables Y and M appear in the resulting inequalities. It is no problem to reverse the arguments in order to show that the reduced inequalities are equivalent to the full synthesis inequalities.

However, proceeding in a different fashion leads to another fundamental insight: with solutions Y and M of (4.6.1), one can in fact design a static controller which solves the quadratic performance problem. Indeed, we just choose
\[
D_c := M Y^{-1}
\]
to infer that the static controller $u = D_c y$ leads to a controlled system with the describing matrices
\[
\begin{pmatrix} \mathcal A & \mathcal B_j \\ \mathcal C_j & \mathcal D_j \end{pmatrix}
= \begin{pmatrix} A + B D_c & B_j \\ C_j + E_j D_c & D_j \end{pmatrix}
= \begin{pmatrix} (AY + BM) Y^{-1} & B_j \\ (C_j Y + E_j M) Y^{-1} & D_j \end{pmatrix}.
\]

We infer that (4.6.1) is identical to
\[
Y > 0, \qquad
\begin{pmatrix} I & 0 \\ \mathcal A Y & \mathcal B_j \\ 0 & I \\ \mathcal C_j Y & \mathcal D_j \end{pmatrix}^T
\begin{pmatrix} 0 & I & 0 & 0 \\ I & 0 & 0 & 0 \\ 0 & 0 & Q_j & S_j \\ 0 & 0 & S_j^T & R_j \end{pmatrix}
\begin{pmatrix} I & 0 \\ \mathcal A Y & \mathcal B_j \\ 0 & I \\ \mathcal C_j Y & \mathcal D_j \end{pmatrix} < 0.
\]

If we perform congruence transformations with $Y^{-1}$ and $\left(\begin{smallmatrix} Y^{-1} & 0 \\ 0 & I \end{smallmatrix}\right)$, we arrive with $\mathcal X := Y^{-1}$ at
\[
\mathcal X > 0, \qquad
\begin{pmatrix} I & 0 \\ \mathcal X \mathcal A & \mathcal X \mathcal B_j \\ 0 & I \\ \mathcal C_j & \mathcal D_j \end{pmatrix}^T
\begin{pmatrix} 0 & I & 0 & 0 \\ I & 0 & 0 & 0 \\ 0 & 0 & Q_j & S_j \\ 0 & 0 & S_j^T & R_j \end{pmatrix}
\begin{pmatrix} I & 0 \\ \mathcal X \mathcal A & \mathcal X \mathcal B_j \\ 0 & I \\ \mathcal C_j & \mathcal D_j \end{pmatrix} < 0.
\]

Hence the static gain D_c indeed defines a controller which solves the quadratic performance problem.


Corollary 4.18 Under the state-feedback information structure, there exists a dynamic controller $\left(\begin{smallmatrix} A_c & B_c \\ C_c & D_c \end{smallmatrix}\right)$ and some $\mathcal X$ which satisfy (4.2.1) iff there exist solutions Y and M of the inequalities (4.6.1). If Y and M solve (4.6.1), the static state-feedback controller gain
\[
D_c = M Y^{-1}
\]
and the Lyapunov matrix $\mathcal X := Y^{-1}$ render (4.2.1) satisfied.
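The stability core of this recipe is easy to exercise numerically. The sketch below uses hypothetical double-integrator data and a hand-picked solution of the simplest state-feedback inequality pair Y > 0, (AY + BM) + (AY + BM)^T < 0 (the performance channel is dropped for brevity, so this is only the stabilization part of the result); the static gain $D_c = M Y^{-1}$ then stabilizes, with Lyapunov matrix $\mathcal X = Y^{-1}$:

```python
import numpy as np

# Hypothetical double-integrator data.
A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0], [1.0]])

# Hand-picked solution of Y > 0 and (A Y + B M) + (A Y + B M)^T < 0:
# here sym(A Y + B M) = -2 I.
Y = np.array([[2.0, -1.0], [-1.0, 2.0]])
M = np.array([[-2.0, -1.0]])

Dc = M @ np.linalg.inv(Y)          # static state-feedback gain
Acl = A + B @ Dc                   # closed-loop A matrix
X = np.linalg.inv(Y)               # Lyapunov matrix X = Y^{-1}

print(np.linalg.eigvals(Acl).real.max() < 0)                  # True: stable
print(np.linalg.eigvalsh(X @ Acl + Acl.T @ X).max() < 0)      # True: X certifies it
```

The second check works because $X A_{cl} + A_{cl}^T X = Y^{-1} \big[ (AY + BM) + (AY + BM)^T \big] Y^{-1}$, which is exactly the congruence transformation used in the text.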

In literally the same fashion as for output-feedback control, we arrive at the following general procedure to proceed from analysis inequalities to synthesis inequalities, and to construct a static state-feedback controller:

• Rewrite the analysis inequalities in the blocks $\mathcal X$, $\mathcal X \mathcal A$, $\mathcal X \mathcal B_j$, $\mathcal C_j$, $\mathcal D_j$ in order to be able to find a (formal) congruence transformation involving Y which leads to inequalities in the blocks $Y^T \mathcal X Y$, $Y^T \mathcal X \mathcal A Y$, $Y^T \mathcal X \mathcal B_j$, $\mathcal C_j Y$, $\mathcal D_j$.

• Perform the substitutions
\[
Y^T \mathcal X Y \to Y
\quad\text{and}\quad
\begin{pmatrix} Y^T \mathcal X \mathcal A Y & Y^T \mathcal X \mathcal B_j \\ \mathcal C_j Y & \mathcal D_j \end{pmatrix}
\to \begin{pmatrix} AY + BM & B_j \\ C_j Y + E_j M & D_j \end{pmatrix}
\]
to arrive at matrix inequalities in the variables Y and M.

• After having solved the synthesis inequalities for Y and M, the static controller gain and the Lyapunov matrix
\[
D_c = M Y^{-1} \quad\text{and}\quad \mathcal X = Y^{-1}
\]
render the analysis inequalities satisfied.

As an illustration, starting from the analysis inequalities (4.3.2) for H2-type problems, the corresponding state-feedback synthesis conditions read as
\[
\begin{pmatrix} (AY + BM)^T + (AY + BM) & B_j \\ B_j^T & -\gamma_j I \end{pmatrix} < 0, \qquad
\begin{pmatrix} Y & (C_j Y + E_j M)^T \\ C_j Y + E_j M & Z_j \end{pmatrix} > 0, \qquad
f_j(Z_j) < \gamma_j, \quad D_j = 0.
\]

All our previous remarks pertaining to the (more complicated) procedure for the output-feedback information structure apply without modification.

In general we can conclude that, for state-feedback problems, dynamics in the controller do not offer any advantage over static controllers. This is also true for mixed control problems. This statement requires extra attention since our derivation was based on eliminating the variable L, which might occur in several matrix inequalities. At this point the remark after Lemma 4.11 comes into play:


This particular elimination result also applies to systems of matrix inequalities so that, indeed, the occurrence of L in various inequalities does not harm the arguments.

As earlier, in the single-objective quadratic performance problem by state-feedback, it is possible to eliminate the variable M in (4.6.1). Alternatively, one could as well exploit the particular structure of the system description to simplify the conditions in Corollary 4.15. Both approaches lead to the following result.

Corollary 4.19 For the state-feedback quadratic performance problem with an index satisfying (4.5.21), there exists a dynamic controller and some $\mathcal X$ with (4.2.1) iff there exists a symmetric Y which solves
\[
Y > 0, \qquad
\Phi^T
\begin{pmatrix} -A^T & -C_j^T \\ I & 0 \\ -B_j^T & -D_j^T \\ 0 & I \end{pmatrix}^T
\begin{pmatrix} 0 & Y & 0 & 0 \\ Y & 0 & 0 & 0 \\ 0 & 0 & \tilde Q_j & \tilde S_j \\ 0 & 0 & \tilde S_j^T & \tilde R_j \end{pmatrix}
\begin{pmatrix} -A^T & -C_j^T \\ I & 0 \\ -B_j^T & -D_j^T \\ 0 & I \end{pmatrix} \Phi > 0. \tag{4.6.2}
\]

Remarks. All these results should be viewed as illustrations of how to proceed for specific system descriptions. Indeed, another popular choice is the so-called full information structure in which both the state and the disturbance are measurable:
\[
y = \begin{pmatrix} x \\ w \end{pmatrix}.
\]
Similarly, one could consider the corresponding dual versions that are typically related to estimation problems, such as e.g.
\[
\begin{pmatrix} B \\ E_1 \\ \vdots \\ E_q \end{pmatrix} = \begin{pmatrix} I \\ 0 \\ \vdots \\ 0 \end{pmatrix}.
\]

We have collected all the auxiliary results that allow us to handle these specific problems without any complications.

4.7 Discrete-Time Systems

Everything that has been said so far can easily be extended to discrete-time design problems. This is particularly surprising since, in the literature, discrete-time problem solutions often seem much more involved and harder to master than their continuous-time counterparts.

Our general procedure to step from analysis to synthesis, as well as the technique to recover the controller, needs no change at all; in particular, the concrete formulas for the block substitutions do not change. The elimination of transformed controller parameters proceeds in the same fashion on the basis of the Projection Lemma or the Elimination Lemma and the specialized versions thereof.


As an example we consider the problem discussed in [12]: the mixed H2/H∞ problem with different disturbance inputs and controlled outputs in discrete time.

It is well-known [12] that A has all its eigenvalues in the unit disk, that the discrete-time H2-norm of
\[
C_1 (zI - A)^{-1} B_1 + D_1
\]
is smaller than γ_1, and that the discrete-time H∞-norm of
\[
C_2 (zI - A)^{-1} B_2 + D_2
\]
is smaller than γ_2 iff there exist symmetric matrices X_1, X_2, and Z with trace(Z) < γ_1 and
\[
\begin{pmatrix} X_1 & X_1 A & X_1 B_1 \\ A^T X_1 & X_1 & 0 \\ B_1^T X_1 & 0 & \gamma_1 I \end{pmatrix} > 0, \qquad
\begin{pmatrix} X_1 & 0 & C_1^T \\ 0 & I & D_1^T \\ C_1 & D_1 & Z \end{pmatrix} > 0,
\]
\[
\begin{pmatrix} X_2 & 0 & A^T X_2 & C_2^T \\ 0 & \gamma_2 I & B_2^T X_2 & D_2^T \\ X_2 A & X_2 B_2 & X_2 & 0 \\ C_2 & D_2 & 0 & \gamma_2 I \end{pmatrix} > 0. \tag{4.7.1}
\]
Note that we have transformed these analysis LMI's such that they are affine in the blocks that will be transformed for synthesis.

The mixed problem consists of searching for a controller that renders these inequalities satisfied with a common Lyapunov function X = X_1 = X_2. The solution is immediate: perform congruence transformations of (4.7.1) with
\[
\operatorname{diag}(Y, Y, I), \quad \operatorname{diag}(Y, I, I), \quad \operatorname{diag}(Y, I, Y, I)
\]
and read off the synthesis LMI's using (4.2.3). After solving the synthesis LMI's, we stress again that the controller construction proceeds along the same steps as in Theorem 4.3. The inclusion of pole constraints for arbitrary LMI regions (related, of course, to discrete-time stability) and other criteria poses no extra problems.
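To see how such discrete-time analysis LMIs relate to classical Lyapunov equations, here is a sketch with hypothetical data (my own construction, not from [12]): the first LMI of (4.7.1) becomes feasible by taking X_1 as the inverse of a slightly inflated controllability Gramian, which works whenever the spectral norm of A is below one.

```python
import numpy as np

# Hypothetical stable discrete-time data (spectral norm of A is < 1).
A  = np.array([[0.5, 0.0], [0.1, 0.3]])
B1 = np.array([[1.0], [0.5]])
g1 = 1.0

# Controllability Gramian W = A W A^T + B1 B1^T, solved via vectorization.
W = np.linalg.solve(np.eye(4) - np.kron(A, A), (B1 @ B1.T).ravel()).reshape(2, 2)
W = (W + W.T) / 2                  # symmetrize against round-off

# Inflate: then Y - A Y A^T - B1 B1^T / g1 = 0.5 (I - A A^T) > 0 since ||A|| < 1.
Y = W + 0.5 * np.eye(2)
X1 = np.linalg.inv(Y)

# First LMI of (4.7.1), assembled for this X1.
M = np.block([
    [X1,         X1 @ A,           X1 @ B1],
    [A.T @ X1,   X1,               np.zeros((2, 1))],
    [B1.T @ X1,  np.zeros((1, 2)), g1 * np.eye(1)],
])
print(np.linalg.eigvalsh(M).min() > 0)   # True
```

Taking the Schur complement of the lower-right diagonal blocks and performing a congruence with Y reduces positivity of M to exactly the inflated Gramian inequality, which mirrors the analysis-to-LMI reasoning in the text.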

4.8 Exercises

Exercise 1
Derive an LMI solution of the H∞-problem for the system
\[
\begin{pmatrix} \dot x \\ z_1 \\ y \end{pmatrix}
= \begin{pmatrix} A & B_1 & B \\ C_1 & D_1 & E \\ C & F & 0 \end{pmatrix}
\begin{pmatrix} x \\ w_1 \\ u \end{pmatrix}
\quad\text{with}\quad
C = \begin{pmatrix} I \\ 0 \end{pmatrix}, \;
F = \begin{pmatrix} 0 \\ I \end{pmatrix}
\quad\text{such that}\quad
y = \begin{pmatrix} x \\ w_1 \end{pmatrix}.
\]
(This is the so-called full information problem.)


Exercise 2 (Nominal and robust estimation)
Consider the system
\[
\begin{pmatrix} \dot x \\ z \\ y \end{pmatrix}
= \begin{pmatrix} A & B_1 \\ C_1 & D_1 \\ C & F \end{pmatrix}
\begin{pmatrix} x \\ w \end{pmatrix}
\]
and interconnect it with the estimator
\[
\begin{pmatrix} \dot x_c \\ \hat z \end{pmatrix}
= \begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}
\begin{pmatrix} x_c \\ y \end{pmatrix} \tag{4.8.1}
\]
where both A and A_c are Hurwitz. The goal in optimal estimation is to design an estimator which keeps $z - \hat z$ as small as possible for all disturbances w in a certain class. Out of the multitude of possibilities, we choose the L2-gain of $w \to z - \hat z$ (for zero initial condition of both the system and the estimator) as the measure of the estimation quality.

This leads to the following problem formulation: given γ > 0, test whether there exists an estimator which renders
\[
\sup_{w \in L_2,\; w \neq 0} \frac{\| z - \hat z \|_2}{\| w \|_2} < \gamma \tag{4.8.2}
\]
satisfied. If yes, reveal how to design an estimator that leads to this property.

(a) Show that the estimation problem is a specialization of the general output-feedback H∞-design problem.

(b) Due to the specific structure of the open-loop system, show that there exists a linearizing transformation of the estimator parameters which does not involve any matrices that describe the open-loop system.

Hint: To find the transformation, proceed as in the proof of Theorem 4.3 with the factorization
\[
\mathcal Y^T \mathcal X = \mathcal Z \quad\text{where}\quad
\mathcal Y^T = \begin{pmatrix} I & Y^{-1} V \\ I & 0 \end{pmatrix}, \quad
\mathcal Z = \begin{pmatrix} Y^{-1} & 0 \\ X & U \end{pmatrix},
\]
and consider as before the blocks $\mathcal Y^T \mathcal X \mathcal A \mathcal Y$, $\mathcal Y^T \mathcal X \mathcal B$, $\mathcal C \mathcal Y$.

(c) Now assume that the system is affected by time-varying uncertain parameters as
\[
\begin{pmatrix} \dot x \\ z \\ y \end{pmatrix}
= \begin{pmatrix} A(\Delta(t)) & B_1(\Delta(t)) \\ C_1(\Delta(t)) & D_1(\Delta(t)) \\ C(\Delta(t)) & F(\Delta(t)) \end{pmatrix}
\begin{pmatrix} x \\ w \end{pmatrix}
\quad\text{where}\quad
\begin{pmatrix} A(\Delta) & B_1(\Delta) \\ C_1(\Delta) & D_1(\Delta) \\ C(\Delta) & F(\Delta) \end{pmatrix}
\text{ is affine in } \Delta
\]
and $\Delta(t) \in \operatorname{conv}\{\Delta_1, \ldots, \Delta_N\}$. Derive LMI conditions for the existence of an estimator that guarantees (4.8.2) for all uncertainties, and show how to actually compute such an estimator if the LMI's are feasible.

Hint: Recall what we have discussed for the state-feedback problem in Section 8.1.3.


(d) Suppose that the uncertainty enters rationally, and that it has been pulled out to arrive at the LFT representation
\[
\begin{pmatrix} \dot x \\ z_1 \\ z \\ y \end{pmatrix}
= \begin{pmatrix} A & B_1 & B_2 \\ C_1 & D_1 & D_{12} \\ C_2 & D_{21} & D_2 \\ C & F_1 & F_2 \end{pmatrix}
\begin{pmatrix} x \\ w_1 \\ w \end{pmatrix}, \qquad
w_1(t) = \Delta(t) z_1(t), \quad \Delta(t) \in \operatorname{conv}\{\Delta_1, \ldots, \Delta_N\}
\]
of the uncertain system. Derive synthesis inequalities with full-block scalings that guarantee the existence of an estimator which guarantees (4.8.2) for all uncertainties, and reveal how to actually compute such an estimator if the LMI's are feasible. What happens if D_1 = 0 such that the uncertainty enters affinely?

Hint: The results should be formulated analogously to what we have done in Section 8.1.2. There are two possibilities to proceed: you can either just use the transformation (4.2.10) to obtain synthesis inequalities that can be rendered convex by an additional congruence transformation, or you can employ the alternative parameter transformation as derived in part (b) of this exercise to directly obtain a convex test.

Exercise 3
This is a simulation exercise that involves the synthesis of an active controller for the suspension system in Exercise 3 of Chapter 2. We consider the rear wheel of a tractor-trailer combination as depicted in Figure 4.2. Here m_1 represents the tire, wheel and rear axle mass, and m_2 denotes a fraction of the semitrailer mass.

Figure 4.2: Active suspension system

The deflection variables q_i are properly scaled so that q_2 − q_1 = 0 and q_1 − q_0 = 0 in steady state. The system is modeled by the state space equations
\[
\dot x = A x + B \begin{pmatrix} q_0 \\ F \end{pmatrix}, \qquad
z = C x + D \begin{pmatrix} q_0 \\ F \end{pmatrix}
\]


where
\[
A = \begin{pmatrix}
0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 \\
-\frac{k_1 + k_2}{m_1} & \frac{k_2}{m_1} & -\frac{b_1 + b_2}{m_1} & \frac{b_2}{m_1} \\
\frac{k_2}{m_2} & -\frac{k_2}{m_2} & \frac{b_2}{m_2} & -\frac{b_2}{m_2}
\end{pmatrix}; \qquad
B = \begin{pmatrix}
\frac{b_1}{m_1} & 0 \\
0 & 0 \\
\frac{k_1}{m_1} - \frac{b_1 (b_1 + b_2)}{m_1^2} & -\frac{1}{m_1} \\
\frac{b_1 b_2}{m_1 m_2} & \frac{1}{m_2}
\end{pmatrix}
\]
\[
C = \begin{pmatrix}
1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 \\
\frac{k_2}{m_2} & -\frac{k_2}{m_2} & \frac{b_2}{m_2} & -\frac{b_2}{m_2} \\
-1 & 1 & 0 & 0
\end{pmatrix}; \qquad
D = \begin{pmatrix}
-1 & 0 \\
0 & 1 \\
\frac{b_1 b_2}{m_1 m_2} & \frac{1}{m_2} \\
0 & 0
\end{pmatrix}.
\]
Here, $x = \operatorname{col}(q_1, q_2, \dot q_1 - b_1 q_0 / m_1, \dot q_2)$ and $z = \operatorname{col}(q_1 - q_0, F, \ddot q_2, q_2 - q_1)$ define the state and the to-be-controlled output, respectively. The control input is the force F; the exogenous input is the road profile q_0.
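The model above can be sanity-checked in code: since $z_3 = \ddot q_2 = \dot x_4$, the third rows of (C, D) must coincide with the fourth rows of (A, B). A sketch with placeholder parameter values (the exercise itself uses Table 2.1 of Chapter 2; the numbers below are purely illustrative):

```python
import numpy as np

# Placeholder parameter values for illustration only (not Table 2.1 data).
m1, m2 = 1.5e3, 1.0e4
k1, k2 = 5.0e6, 1.0e6
b1, b2 = 1.7e3, 2.0e5

A = np.array([
    [0, 0, 1, 0],
    [0, 0, 0, 1],
    [-(k1 + k2)/m1,  k2/m1, -(b1 + b2)/m1,  b2/m1],
    [ k2/m2,        -k2/m2,  b2/m2,        -b2/m2],
])
B = np.array([
    [b1/m1, 0],
    [0, 0],
    [k1/m1 - b1*(b1 + b2)/m1**2, -1/m1],
    [b1*b2/(m1*m2), 1/m2],
])
C = np.array([
    [1, 0, 0, 0],
    [0, 0, 0, 0],
    [k2/m2, -k2/m2, b2/m2, -b2/m2],
    [-1, 1, 0, 0],
])
D = np.array([
    [-1, 0],
    [0, 1],
    [b1*b2/(m1*m2), 1/m2],
    [0, 0],
])

# z3 = \ddot q_2 = \dot x_4, so row 3 of (C, D) must equal row 4 of (A, B).
print(np.allclose(C[2], A[3]) and np.allclose(D[2], B[3]))   # True
```

The same assembly code can be reused with the actual Table 2.1 values when solving the exercise in MATLAB or elsewhere.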

Let the physical parameters be specified as in Table 2.1 in Chapter 2 and let $b_1 = 1.7 \times 10^3$. The aim of the exercise is to design an active suspension control system that generates the force F as a (causal) function of the measured variable $y = \operatorname{col}(\ddot q_2, q_2 - q_1)$.

The main objective of the controller design is to achieve low levels of acceleration throughout the vehicle ($\ddot q_2$), bounded suspension deflection ($q_2 - q_1$ and $q_1 - q_0$), and bounded dynamic tire force (F).

(a) Let the road profile be represented by $q_0 = W_{q_0} \tilde q_0$ where $\tilde q_0 \in L_2$ is equalized in frequency and $W_{q_0}$ is the transfer function
\[
W_{q_0}(s) = \frac{0.01}{0.4 s + 1}
\]
reflecting the quality of the road when the vehicle drives at constant speed. Define the to-be-controlled output $\tilde z = W_z z$ where W_z is a weighting matrix with transfer function
\[
W_z(s) = \begin{pmatrix}
200 & 0 & 0 & 0 \\
0 & 0.1 & 0 & 0 \\
0 & 0 & \frac{0.0318 s + 0.4}{3.16 \times 10^{-4} s^2 + 0.0314 s + 1} & 0 \\
0 & 0 & 0 & 100
\end{pmatrix}.
\]
The weight on the chassis acceleration reflects the human sensitivity to vertical accelerations. Use the routines ltisys, smult and sdiag (from the LMI toolbox) to implement the generalized plant
\[
P : \begin{pmatrix} \tilde q_0 \\ F \end{pmatrix} \mapsto \begin{pmatrix} \tilde z \\ y \end{pmatrix}
\]
and synthesize with the routine hinflmi a controller which minimizes the H∞ norm of the closed-loop transfer function $T : \tilde q_0 \mapsto \tilde z$.

(b) Construct with this controller the closed-loop system which maps $q_0$ to z (not $\tilde q_0$ to $\tilde z$!) and validate the controlled system by plotting the four frequency responses of the closed-loop


system and the four responses to a step with amplitude 0.2 (meter). (See the routines slft, ssub and splot.) What are your conclusions about the behavior of this active suspension system?

(c) Partition the output z of the system into
\[
z = \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}; \qquad
z_1 = \begin{pmatrix} q_1 - q_0 \\ F \end{pmatrix}; \qquad
z_2 = \begin{pmatrix} \ddot q_2 \\ q_2 - q_1 \end{pmatrix},
\]
and let the weights on the signal components be as in the first part of this exercise. Let T_i, i = 1, 2, be the transfer function mapping $\tilde q_0 \mapsto \tilde z_i$. We wish to obtain insight in the achievable trade-offs between upper bounds of $\|T_1\|_\infty$ and $\|T_2\|_2$. To do this,

(i) Calculate the minimal achievable H∞ norm of T_1.

(ii) Calculate the minimal achievable H2 norm of T_2.

(iii) Calculate the minimal achievable H2 norm of T_2 subject to the bound $\|T_1\|_\infty < \gamma_1$ where γ_1 takes some values¹ in the interval [0.15, 0.30].

Make a plot of the Pareto optimal performances, i.e., plot the minimal achievable H2 norm of T_2 as a function of γ_1. (See the routine hinfmix for details.)

¹Slightly depending on your patience and the length of your coffee breaks, I suggest about 5.


Chapter 5

Robust stability and robust performance

First principle models of physical systems are often represented by state space descriptions in which the various components of the state variable represent well defined physical quantities. Variations, perturbations or uncertainties in specific physical parameters lead to uncertainty in the model. Often, this uncertainty is reflected by variations in well distinguished parameters or coefficients in the model, while in addition the nature and/or range of the uncertain parameters may be known, or partially known. Since very small parameter variations may have a major impact on the dynamics of a system, it is of evident importance to analyse parametric uncertainties of dynamical systems. This will be the subject of this chapter. We reconsider the notions of nominal stability and nominal performance introduced in Chapter 3 in the light of different types of parametric uncertainty that affect the behavior of the system. We aim to derive robust stability tests and robust performance tests for systems with time-varying rate-bounded parametric uncertainties.

5.1 Parametric uncertainties

Suppose that δ = (δ_1, . . . , δ_p) is the vector which expresses the ensemble of all uncertain quantities in a given dynamical system. There are at least two distinct cases which are of independent interest:

(a) time-invariant parametric uncertainties: the vector δ is a fixed but unknown element of an uncertainty set Δ ⊆ R^p.

(b) time-varying parametric uncertainties: the vector δ is an unknown time-varying function δ : R → R^p whose values δ(t) belong to an uncertainty set Δ ⊆ R^p, and possibly satisfy additional constraints on rates of variation, continuity, spectral content, etc.



The first case typically appears in models in which the physical parameters are fixed but only approximately known up to some level of accuracy. The second case typically captures models in which the uncertainty in parameters, coefficients, or other physical quantities is time-dependent. One may object that in many practical situations both time-varying and time-invariant uncertainties occur, so that the distinction between the two cases may seem somewhat artificial. This is true, but since time-invariant uncertainties can equivalently be viewed as time-varying uncertainties with a zero rate constraint, combined time-varying and time-invariant uncertainties are certainly not excluded.

A rather general class of uncertain continuous-time dynamical systems is described by the state space equations

    ẋ = f(x, w, δ)    (5.1.1a)
    z = g(x, w, δ)    (5.1.1b)

where δ may or may not be time-varying, and x, w and z are the state, the input and the output, which take values in the state space X, the input space W and the output space Z, respectively. This constitutes a generalization of the model described in (2.2.1) in Chapter 2. If the uncertainties δ are fixed but unknown elements of an uncertainty set Δ ⊆ R^p then one way to think of equations of this sort is to view them as a set of time-invariant systems, parametrized by δ ∈ Δ. However, if δ is time-varying, then (5.1.1a) is to be interpreted as ẋ(t) = f(x(t), w(t), δ(t)) and (5.1.1) is better viewed as a time-varying dynamical system. If the components of δ(t) coincide, for example, with state components then (5.1.1) defines a non-linear system, even when the mappings f and g are linear. If δ(t) is scalar valued and assumes values in a finite set Δ = {1, . . . , K} then (5.1.1) defines a hybrid system of K modes whose kth mode is defined by the dynamics

ẋ = fk(x, w) := f(x, w, k)

z = gk(x, w) := g(x, w, k)

and where the time-varying behavior of δ(t) defines the switching events between the various modes. In any case, the system (5.1.1) is of considerable theoretical and practical interest as it covers quite a few relevant classes of dynamical systems.

5.1.1 Affine parameter dependent systems

If f and g in (5.1.1) are linear in x and w then the uncertain model assumes the representation

    ( ẋ )   ( A(δ)  B(δ) ) ( x )
    ( z ) = ( C(δ)  D(δ) ) ( w )    (5.1.2)

in which δ may or may not be time-varying.

Of particular interest will be those systems in which the system matrices affinely depend on δ. This means that

A(δ) = A0 + δ1A1 + . . . + δp Ap

B(δ) = B0 + δ1B1 + . . . + δpBp

C(δ) = C0 + δ1C1 + . . . + δpCp

D(δ) = D0 + δ1D1 + . . . + δpDp

or, written in a more compact form,

S(δ) = S0 + δ1S1 + . . . + δpSp

where

    S(δ) = ( A(δ)  B(δ) )
           ( C(δ)  D(δ) )

is the system matrix associated with (5.1.2). We call these models affine parameter-dependent models.
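Numerically, evaluating an affine parameter-dependent system matrix is a one-line operation. The following sketch (Python with numpy assumed; all data hypothetical) forms S(δ) = S0 + δ1 S1 + · · · + δp Sp:

```python
import numpy as np

def affine_system_matrix(S0, S_list, delta):
    """Evaluate S(delta) = S0 + delta_1*S1 + ... + delta_p*Sp."""
    S = S0.astype(float).copy()
    for dj, Sj in zip(delta, S_list):
        S = S + dj * Sj
    return S

# Hypothetical data for illustration: one state, one input, p = 1.
S0 = np.array([[-1.0, 1.0],
               [ 1.0, 0.0]])   # nominal (A0 B0; C0 D0)
S1 = np.array([[ 0.5, 0.0],
               [ 0.0, 0.0]])   # uncertainty direction
S = affine_system_matrix(S0, [S1], [2.0])
# S[0, 0] = -1 + 2*0.5 = 0
```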

5.1.2 Polytopic parameter dependent systems

In Definition 1.3 of Chapter 1 we introduced the notion of a convex combination of a finite set of points. This notion gains relevance in the context of dynamical systems if 'points' become systems. Consider a time-varying dynamical system

ẋ(t) = A(t)x(t) + B(t)w(t)

z(t) = C(t)x(t) + D(t)w(t)

with input w, output z and state x. Suppose that its system matrix

    S(t) := ( A(t)  B(t) )
            ( C(t)  D(t) )

is a time-varying object which for any time instant t ∈ R can be written as a convex combination of the p system matrices S1, . . . , Sp. This means that there exist functions αj : R → [0, 1] such that for any time instant t ∈ R we have

    S(t) = Σ_{j=1}^p αj(t) Sj

where Σ_{j=1}^p αj(t) = 1 and

    Sj = ( Aj  Bj )
         ( Cj  Dj ),    j = 1, . . . , p


are constant system matrices of equal dimension. In particular, this implies that the system matrices S(t), t ∈ R, belong to the convex hull of S1, . . . , Sp, i.e.,

S(t) ∈ conv(S1, . . . , Sp), t ∈ R.

Such models are called polytopic linear differential inclusions and arise in a wide variety of modeling problems.
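As a quick sketch (Python/numpy assumed; the vertex matrices are hypothetical), a point in such a polytope of systems is just a convex combination of the vertex matrices:

```python
import numpy as np

def convex_combination(alphas, S_list):
    """Form S = sum_j alpha_j * S_j, checking alpha_j >= 0 and sum alpha_j = 1."""
    alphas = np.asarray(alphas, dtype=float)
    if np.any(alphas < 0) or not np.isclose(alphas.sum(), 1.0):
        raise ValueError("coefficients must be nonnegative and sum to one")
    return sum(a * S for a, S in zip(alphas, S_list))

# Two hypothetical vertex system matrices and the midpoint between them.
S1 = np.array([[-1.0, 0.0], [1.0, 0.0]])
S2 = np.array([[-3.0, 0.0], [1.0, 0.0]])
S_mid = convex_combination([0.5, 0.5], [S1, S2])
# S_mid[0, 0] = -2.0
```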

5.2 Robust stability

5.2.1 Time-invariant parametric uncertainty

An important issue in the design of control systems involves the question as to what extent the stability and performance of the controlled system are robust against perturbations and uncertainties in the parameters of the system. In this section we consider the linear time-invariant system defined by

    ẋ = A(δ)x    (5.2.1)

where the state matrix A(·) is a continuous function of a real valued time-invariant parameter vector δ = col(δ1, . . . , δp) which we assume to be contained in an uncertainty set Δ ⊆ R^p. Let X = R^n be the state space of this system. We will analyze the robust stability of the equilibrium point x* = 0 of this system. Precisely, we address the question when the equilibrium point x* = 0 of (5.2.1) is asymptotically stable in the sense of Definition 3.1 for all δ ∈ Δ.

Example 5.1 As an example, let

    A(δ) = ( −1   2δ1   2 )
           ( δ2   −2    1 )
           ( 3    −1    (δ3 − 10)/(δ1 + 1) )

where δ1, δ2 and δ3 are bounded as δ1 ∈ [−0.5, 1], δ2 ∈ [−2, 1] and δ3 ∈ [−0.5, 2]. Then the uncertainty set Δ is polytopic and defined by

    Δ = {col(δ1, δ2, δ3) | δ1 ∈ [−0.5, 1], δ2 ∈ [−2, 1], δ3 ∈ [−0.5, 2]}

and Δ = conv(Δ0) with

    Δ0 = {col(δ1, δ2, δ3) | δ1 ∈ {−0.5, 1}, δ2 ∈ {−2, 1}, δ3 ∈ {−0.5, 2}}

the set of vertices (or generators) of Δ.
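The 2^p corner points of such a box uncertainty set are easy to enumerate; a minimal sketch (Python, standard library only):

```python
from itertools import product

def box_vertices(intervals):
    """All corner points of [a1,b1] x ... x [ap,bp] (2**p points)."""
    return [list(v) for v in product(*intervals)]

# The box of Example 5.1.
verts = box_vertices([(-0.5, 1.0), (-2.0, 1.0), (-0.5, 2.0)])
# len(verts) == 2**3 == 8; e.g. [-0.5, -2.0, -0.5] is one vertex
```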

In the case of time-invariant parametric uncertainties, the system ẋ = A(δ)x is asymptotically stable if and only if A(δ) is Hurwitz for all δ ∈ Δ. That is, if and only if the eigenvalues of A(δ) lie in the open left-half complex plane for all admissible perturbations δ ∈ Δ. Hence, using Proposition 3.5, the verification of robust stability amounts to checking whether

    ρ(A(δ)) := max_{λ ∈ λ(A(δ))} Re(λ) < 0 for all δ ∈ Δ.

If Δ is a continuum in R^p, this means verifying an inequality at an infinite number of points. Even if Δ is a polytope, it will not suffice to check the above inequality on the vertices of the uncertainty set only. Since ρ(A(δ)) is not a convex or concave function of δ (except in a few trivial cases), it will be difficult to find its global maximum over Δ.

Quadratic stability with time-invariant uncertainties

We will apply Theorem 3.4 to infer the asymptotic stability of the equilibrium point x* = 0 of (5.2.1).

Definition 5.2 (Quadratic stability) The system (5.2.1) is said to be quadratically stable for perturbations δ ∈ Δ if there exists a matrix K = K> ≻ 0 such that

    A(δ)>K + K A(δ) ≺ 0 for all δ ∈ Δ.    (5.2.2)

The importance of this definition becomes apparent when considering quadratic Lyapunov functions V(x) = x>Kx. Indeed, if K ≻ 0 satisfies (5.2.2) then there exists an ε > 0 such that

    A(δ)>K + K A(δ) + εK ⪯ 0 for all δ ∈ Δ

so that the time-derivative V′ of the composite function V(x(t)) = x(t)>Kx(t) along solutions of (5.2.1), defined in (3.1.5), satisfies

    V′(x) + εV(x) = ∂xV(x)A(δ)x + εV(x) = x>[A(δ)>K + K A(δ) + εK]x ≤ 0

for all x ∈ X and all δ ∈ Δ. But then the composite function V*(t) = V(x(t)), with x(·) a solution of (5.2.1), satisfies

    V̇*(t) + εV*(t) = V′(x(t)) + εV(x(t)) = x(t)>[A(δ)>K + K A(δ) + εK]x(t) ≤ 0

for all t ∈ R and all δ ∈ Δ. Integrating this expression over an interval [t0, t1] shows that V* decays exponentially according to V*(t1) ≤ V*(t0) e^{−ε(t1−t0)} for all t1 ≥ t0 and all δ ∈ Δ. Now use that

    λmin(K)‖x‖² ≤ x>Kx ≤ λmax(K)‖x‖²

to see that

    ‖x(t1)‖² ≤ (1/λmin(K)) V(x(t1)) ≤ (1/λmin(K)) V(x(t0)) e^{−ε(t1−t0)} ≤ (λmax(K)/λmin(K)) ‖x(t0)‖² e^{−ε(t1−t0)}.


That is,

    ‖x(t)‖ ≤ ‖x(t0)‖ √(λmax(K)/λmin(K)) e^{−(ε/2)(t−t0)} for all δ ∈ Δ,

which shows that the origin is globally exponentially stable (and hence globally asymptotically stable) for all δ ∈ Δ. Moreover, the exponential decay rate ε/2 does not depend on δ.

It is worthwhile to understand (and perhaps appreciate) the arguments in this reasoning as they are at the basis of more general results to come.

Verifying quadratic stability with time-invariant uncertainties

The verification of quadratic stability of the system places an infinite number of constraints on the symmetric matrix K in (5.2.2). It is the purpose of this section to make additional assumptions on the way the uncertainty enters the system, so as to convert (5.2.2) into a numerically tractable condition.

Suppose that A(δ) in (5.2.1) is an affine function of the parameter vector δ. That is, suppose that there exist real matrices A0, . . . , Ap, all of dimension n × n, such that

A(δ) = A0 + δ1A1 + · · · + δp Ap (5.2.3)

with δ ∈ Δ. Then (5.2.1) is referred to as an affine parameter-dependent model. In addition, let us suppose that the uncertainty set Δ is convex and coincides with the convex hull of a set Δ0 ⊂ R^p. With this structure on A and Δ we state the following result.

Proposition 5.3 If A(·) is an affine function and Δ = conv(Δ0) with Δ0 ⊂ R^p, then the system (5.2.1) is quadratically stable if and only if there exists K = K> ≻ 0 such that

    A(δ)>K + K A(δ) ≺ 0 for all δ ∈ Δ0.

Proof. The proof of this result is an application of Proposition 1.14 in Chapter 1. Indeed, fix x ∈ R^n and consider the mapping fx : Δ → R defined by

    fx(δ) := x>[A(δ)>K + K A(δ)]x.

The domain Δ of this mapping is a convex set and by assumption Δ = conv(Δ0). Since A(·) is affine, it follows that fx(·) is a convex function. In particular, Proposition 1.14 (Chapter 1) yields that fx(δ) < 0 for all δ ∈ Δ if and only if fx(δ) < 0 for all δ ∈ Δ0. Since x is arbitrary, it follows that the inequality A(δ)>K + K A(δ) ≺ 0 holds for all δ ∈ Δ if and only if it holds for all δ ∈ Δ0, which yields the result.

Obviously, the importance of this result lies in the fact that quadratic stability can be concluded from a finite test of matrix inequalities whenever Δ0 consists of a finite number of elements. That is, when the uncertainty set is the convex hull of a finite number of points in R^p. In that case, the condition stated in Proposition 5.3 is a feasibility test of a (finite) system of LMIs.
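Given a candidate K produced by some LMI solver, checking the vertex conditions of Proposition 5.3 only needs an eigenvalue routine. A minimal sketch (Python/numpy; the affine family below is hypothetical toy data):

```python
import numpy as np

def certifies_quadratic_stability(K, A_vertices, tol=1e-9):
    """True if K = K^T > 0 and A^T K + K A < 0 at every vertex matrix A."""
    if not np.allclose(K, K.T):
        return False
    if np.min(np.linalg.eigvalsh(K)) <= tol:         # K must be positive definite
        return False
    for A in A_vertices:
        L = A.T @ K + K @ A                          # symmetric by construction
        if np.max(np.linalg.eigvalsh(L)) >= -tol:    # must be negative definite
            return False
    return True

# Toy affine family A(d) = -I + d*N with d in conv{-0.5, 0.5}; K = I certifies it.
N = np.array([[0.0, 1.0], [0.0, 0.0]])
vertices = [-np.eye(2) - 0.5 * N, -np.eye(2) + 0.5 * N]
ok = certifies_quadratic_stability(np.eye(2), vertices)   # True
```

In a real design, K itself comes from an SDP solver; the check above merely validates a proposed certificate.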


Example 5.4 Continuing Example 5.1, the matrix A(δ) is not affine in δ, but by setting δ4 = (δ3 − 10)/(δ1 + 1) + 1/2 we obtain that

    A(δ) = ( −1   2δ1   2 )
           ( δ2   −2    1 )
           ( 3    −1    δ4 − 1/2 ),    col(δ1, δ2, δ4) ∈ Δ = [−0.5, 1] × [−2, 1] × [−9, 8].

Since Δ = conv(Δ0) with Δ0 = {δ^1, . . . , δ^N} consisting of the N = 2^3 = 8 corner points of the uncertainty set, the verification of the quadratic stability of (5.2.1) is a feasibility test of the (finite) system of LMIs

    K ≻ 0,   A(δ^j)>K + K A(δ^j) ≺ 0,   j = 1, . . . , 8.

The test will not pass. By Proposition 5.3, the system is not quadratically stable for the given uncertainty set.

Example 5.5 Consider the uncertain control system ẋ = A(δ)x + B(δ)u where we wish to construct a feedback law u = Fx such that the controlled system ẋ = (A(δ) + B(δ)F)x is quadratically stable for all δ in some uncertainty set Δ = conv(Δ0) with Δ0 a finite set. By Proposition 5.3, this is equivalent to finding F and K ≻ 0 such that

    (A(δ) + B(δ)F)>K + K(A(δ) + B(δ)F) ≺ 0 for all δ ∈ Δ0.

This is not a system of LMIs. However, with X = K⁻¹ and L = FK⁻¹, and assuming that A(·) and B(·) are affine, we can transform the latter to an LMI feasibility test: find X ≻ 0 such that

    A(δ)X + XA(δ)> + B(δ)L + (B(δ)L)> ≺ 0 for all δ ∈ Δ0.

Whenever the test passes, the feedback law is given by F = LX⁻¹.
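The change of variables in Example 5.5 is a congruence transformation: multiplying the Lyapunov inequality from both sides by X = K⁻¹ gives the stated LMI with L = FX. The identity behind this step can be checked numerically (Python/numpy sketch with random, purely hypothetical data):

```python
import numpy as np

rng = np.random.default_rng(0)
n, m = 3, 2
A = rng.standard_normal((n, n))
B = rng.standard_normal((n, m))
F = rng.standard_normal((m, n))

M = rng.standard_normal((n, n))
K = M @ M.T + n * np.eye(n)      # some positive definite K
X = np.linalg.inv(K)             # X = K^{-1}
L = F @ X                        # L = F K^{-1}

Acl = A + B @ F                  # closed-loop state matrix
lhs = X @ (Acl.T @ K + K @ Acl) @ X
rhs = A @ X + X @ A.T + B @ L + (B @ L).T
# lhs and rhs agree: the two feasibility tests are equivalent by congruence
```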

Parameter-dependent Lyapunov functions with time-invariant uncertainties

The main disadvantage in searching for one quadratic Lyapunov function for a class of uncertain models is the conservatism of the test. Indeed, Example 5.4 shows that (5.2.1) may fail to be quadratically stable, but no conclusions can be drawn from this observation concerning the instability of the uncertain system. To reduce the conservatism of the quadratic stability test we will consider quadratic Lyapunov functions for the system (5.2.1) which are parameter dependent, i.e., Lyapunov functions V : X × Δ → R of the form

    V(x, δ) := x>K(δ)x

where K(δ) is a matrix-valued function that is allowed to depend on the uncertain parameter δ. A sufficient condition for robust asymptotic stability can be stated as follows.


Proposition 5.6 Let the uncertainty set Δ be compact and suppose that K(δ) is continuously differentiable on Δ and satisfies

    K(δ) ≻ 0 for all δ ∈ Δ    (5.2.4a)
    A(δ)>K(δ) + K(δ)A(δ) ≺ 0 for all δ ∈ Δ.    (5.2.4b)

Then the system (5.2.1) is globally asymptotically stable for all δ ∈ Δ.

Proof. Let K(δ) satisfy (5.2.4) and consider V(x, δ) = x>K(δ)x as candidate Lyapunov function. There exists ε > 0 such that K(δ) satisfies A(δ)>K(δ) + K(δ)A(δ) + εK(δ) ⪯ 0. Take the time derivative of the composite function V*(t) := V(x(t), δ) along solutions of (5.2.1) to infer that V̇*(t) + εV*(t) ≤ 0 for all t ∈ R and all δ ∈ Δ. This means that for all δ ∈ Δ, V*(·) decays exponentially along solutions of (5.2.1) according to V*(t) ≤ V*(0)e^{−εt}. Define a := inf_{δ∈Δ} λmin(K(δ)) and b := sup_{δ∈Δ} λmax(K(δ)). Since Δ is compact, the positive definiteness of K implies that both a and b are positive and we have that a‖x‖² ≤ V(x, δ) ≤ b‖x‖² for all δ ∈ Δ and all x ∈ X. Together with the exponential decay of V* this yields that ‖x(t)‖² ≤ (b/a)‖x(0)‖²e^{−εt} for all δ ∈ Δ, which proves the exponential and asymptotic stability of (5.2.1).

The search for matrix-valued functions that satisfy the conditions (5.2.4) is much more involved and virtually intractable from a computational point of view. We therefore wish to turn Proposition 5.6 into a numerically efficient scheme to compute parameter-dependent Lyapunov functions. For this, first consider Lyapunov functions that are affine in the parameter δ, i.e.,

    K(δ) = K0 + δ1K1 + · · · + δpKp

where K0, . . . , Kp are real symmetric matrices of dimension n × n and δ = col(δ1, . . . , δp) is the time-invariant uncertainty vector. Clearly, with K1 = · · · = Kp = 0 we are back to the case of parameter-independent quadratic Lyapunov functions as discussed in the previous subsection. The system (5.2.1) is called affine quadratically stable if there exist matrices K0, . . . , Kp such that K(δ) satisfies the conditions (5.2.4) of Proposition 5.6.

Let A(·) be affine and represented by (5.2.3). Suppose that Δ is convex with Δ = conv(Δ0), where Δ0 is a finite set of vertices of Δ. Then the expression

    L(δ) := A(δ)>K(δ) + K(δ)A(δ)

in (5.2.4b) is, in general, not affine in δ. As a consequence, for fixed x ∈ R^n, the function fx : Δ → R defined by

    fx(δ) := x>L(δ)x    (5.2.5)

will not be convex, so that the implication

    {fx(δ) < 0 for all δ ∈ Δ0}  ⟹  {fx(δ) < 0 for all δ ∈ Δ}    (5.2.6)


used in the previous section (proof of Proposition 5.3), will not hold. Expanding L(δ) yields

    L(δ) = [A0 + Σ_{j=1}^p δj Aj]>[K0 + Σ_{j=1}^p δj Kj] + [K0 + Σ_{j=1}^p δj Kj][A0 + Σ_{j=1}^p δj Aj]
         = Σ_{i=0}^p Σ_{j=0}^p δi δj [Ai>Kj + Kj Ai]

where, for the latter compact notation, we introduced δ0 = 1. Consequently, (5.2.5) takes the form

    fx(δ) = c0 + Σ_{j=1}^p δj cj + Σ_{j=1}^p Σ_{i=1}^{j−1} δi δj cij + Σ_{j=1}^p δj² dj

where c0, cj, cij and dj are constants. Consider the following uncertainty sets

    Δ = {δ ∈ R^p | δk ∈ [δ̲k, δ̄k]},   Δ0 = {δ ∈ R^p | δk ∈ {δ̲k, δ̄k}}    (5.2.7)

where δ̲k ≤ δ̄k. Then Δ = conv(Δ0) and it is easily seen that a sufficient condition for the implication (5.2.6) to hold for the uncertainty sets (5.2.7) is that fx(δ1, . . . , δj, . . . , δp) is partially convex, that is, convex in each of its arguments δj, j = 1, . . . , p. Since fx is a twice differentiable function, this is the case when

    dj = (1/2) ∂²fx/∂δj² = x>[Aj>Kj + Kj Aj]x ≥ 0

for j = 1, . . . , p. Since x is arbitrary, we therefore obtain that

    Aj>Kj + Kj Aj ⪰ 0,   j = 1, . . . , p

is a sufficient condition for (5.2.6) to hold on the uncertainty sets (5.2.7). This leads to the following main result.

Theorem 5.7 If A(·) is an affine function described by (5.2.3) and Δ = conv(Δ0) assumes the form (5.2.7), then the system (5.2.1) is affine quadratically stable if there exist real matrices K0, . . . , Kp such that K(δ) = K0 + Σ_{j=1}^p δj Kj satisfies

    A(δ)>K(δ) + K(δ)A(δ) ≺ 0 for all δ ∈ Δ0    (5.2.8a)
    K(δ) ≻ 0 for all δ ∈ Δ0    (5.2.8b)
    Aj>Kj + Kj Aj ⪰ 0 for j = 1, . . . , p.    (5.2.8c)

In that case, the parameter-varying function K(δ) satisfies the conditions (5.2.4) and V(x, δ) := x>K(δ)x is a quadratic parameter-dependent Lyapunov function of the system.

Proof. It suffices to prove that (5.2.8) implies (5.2.4) for all δ ∈ Δ. Let x be a non-zero, fixed but arbitrary element of R^n. Since K(δ) is affine in δ, the mapping

    δ ↦ x>K(δ)x

with δ ∈ Δ is convex. Consequently, x>K(δ)x is larger than zero for all δ ∈ Δ if it is larger than zero for all δ ∈ Δ0. As x is arbitrary, this yields that (5.2.4a) is implied by (5.2.8b). It now suffices to prove that (5.2.8a) and (5.2.8c) imply (5.2.4b) for all δ ∈ Δ. This, however, we showed in the arguments preceding the theorem.

Theorem 5.7 reduces the problem of verifying affine quadratic stability of the system (5.2.1) with box-type uncertainties to a feasibility problem for a (finite) set of linear matrix inequalities.

5.2.2 Time-varying parametric uncertainty

Robust stability against time-varying perturbations is generally a more demanding requirement than robust stability against time-invariant parameter uncertainties. In this section we consider the question of robust stability for the system

    ẋ(t) = A(δ(t))x(t)    (5.2.9)

where the values of the time-varying parameter vector δ(t) belong to the uncertainty set Δ ⊂ R^p for all time t ∈ R. In this section we assess the robust stability of the fixed point x* = 0. It is important to remark that, unlike the case with time-invariant uncertainties, robust stability of the origin of the time-varying system (5.2.9) is not equivalent to the condition that the (time-varying) eigenvalues λ(A(δ(t))) belong to the stability region C⁻ for all admissible perturbations δ(t) ∈ Δ.

Proposition 5.8 The uncertain system (5.2.9) with time-varying uncertainties δ(·) ∈ Δ is asymptotically stable if there exists a matrix K = K> ≻ 0 such that (5.2.2) holds.

The inequality (5.2.2) is a time-invariant condition that needs to be verified for all points in the uncertainty set Δ so as to conclude asymptotic stability under the time-varying uncertainties that occur in (5.2.9). Since Proposition 5.8 is obtained as a special case of Theorem 5.9 below, we defer its proof.

An interesting observation related to Proposition 5.8 is that the existence of a real symmetric matrix K = K> ≻ 0 satisfying (5.2.2) not only yields quadratic stability of the system (5.2.1) with δ ∈ Δ but also asymptotic stability of (5.2.9) with δ(t) ∈ Δ. Hence, the existence of such a K guarantees asymptotic stability of (5.2.9) even under arbitrarily fast variations of the time-varying parameter vector δ(·). If additional a priori information on the time-varying parameters is known, the result of Proposition 5.8 may become too conservative and we may therefore like to resort to different techniques that incorporate information about the parameter trajectories δ(·).

One way to do this is to assume that the trajectories δ(·) are continuously differentiable with

    δ(t) ∈ Δ,   δ̇(t) ∈ Λ for all time t ∈ R.    (5.2.10)

Here, Δ and Λ are subsets of R^p and we will assume that these sets are compact. We therefore assume that not only the values but also the rates of the parameter trajectories are constrained.


Quadratic stability with time-varying uncertainties

A central result for achieving robust stability of the system (5.2.9) against all uncertainties (5.2.10) is given in the following theorem.

Theorem 5.9 Suppose that the function K : Δ → S^n is continuously differentiable on a compact set Δ and satisfies

    K(δ) ≻ 0 for all δ ∈ Δ    (5.2.11a)
    ∂δK(δ)λ + A(δ)>K(δ) + K(δ)A(δ) ≺ 0 for all δ ∈ Δ and λ ∈ Λ.    (5.2.11b)

Then the origin of the system (5.2.9) is exponentially stable against all time-varying uncertainties δ : R → R^p that satisfy (5.2.10). Moreover, in that case V(x, δ) := x>K(δ)x is a quadratic parameter-dependent Lyapunov function for the system (5.2.9).

Proof. The proof follows very much the same lines as the proof of Proposition 5.6, but now includes time-dependence of the parameter functions. Suppose that K(δ) satisfies the hypothesis. Consider V(x, δ) = x>K(δ)x as candidate Lyapunov function. Let a := inf_{δ∈Δ} λmin(K(δ)) and b := sup_{δ∈Δ} λmax(K(δ)). Since Δ is compact, the positive definiteness of K(δ) for all δ ∈ Δ implies that both a and b are positive. In addition, we can find ε > 0 such that K(δ) satisfies

    aI ⪯ K(δ) ⪯ bI,   ∂δK(δ)λ + A(δ)>K(δ) + K(δ)A(δ) + εK(δ) ⪯ 0

for all δ ∈ Δ and λ ∈ Λ. Take the time derivative of the composite function V*(t) := V(x(t), δ(t)) along solutions of (5.2.9) to infer that

    V̇*(t) + εV*(t) = x(t)>A(δ(t))>K(δ(t))x(t) + x(t)>K(δ(t))A(δ(t))x(t)
                     + εx(t)>K(δ(t))x(t) + x(t)>{ Σ_{k=1}^p ∂kK(δ(t)) δ̇k(t) } x(t) ≤ 0

for all t ∈ R, all δ(t) ∈ Δ and all δ̇(t) ∈ Λ. This means that for this class of uncertainties V*(·) decays exponentially along solutions of (5.2.9) according to V*(t) ≤ V*(0)e^{−εt}. Moreover, since a‖x‖² ≤ V(x, δ) ≤ b‖x‖² for all δ ∈ Δ and all x ∈ X, we infer that ‖x(t)‖² ≤ (b/a)‖x(0)‖²e^{−εt} for all t ≥ 0 and all uncertainties δ(t) satisfying (5.2.10). Hence, (5.2.9) is exponentially stable against the uncertainties (5.2.10).

Theorem 5.9 involves a search for matrix functions satisfying the inequalities (5.2.11) to guarantee robust asymptotic stability. Note that the result is only a sufficient algebraic test, which provides a quadratic parameter-dependent Lyapunov function when the test passes. The result is not easy to apply or verify by a computer program as it involves (in general) an infinite number of conditions in the inequalities (5.2.11). We will therefore focus on a number of special cases that convert Theorem 5.9 into a feasible algorithm.


For this, first consider the case where the parameters are time-invariant. This is equivalent to saying that Λ = {0}. The conditions (5.2.11) then coincide with (5.2.4) and we therefore obtain Proposition 5.6 as a special case. In particular, the sufficient condition (5.2.11) for robust stability in Theorem 5.9 is also necessary in this case.

If we allow arbitrarily fast time-variations in δ(t) then we consider rate constraints of the form Λ = [−r, r]^p with r → ∞. For (5.2.11b) to hold for all λ with |λk| ≤ r as r → ∞, it is immediate that ∂δK(δ) needs to vanish for all δ ∈ Δ. Consequently, in this case K cannot depend on δ and Theorem 5.9 reduces to Proposition 5.8. In particular, this argument proves Proposition 5.8 as a special case.

Verifying quadratic stability with time-varying uncertainties

In this section we will assume that A(·) in (5.2.9) is an affine function of δ(t). The uncertainty sets Δ and Λ are assumed to be convex sets defined by the 'boxes'

    Δ = {δ ∈ R^p | δk ∈ [δ̲k, δ̄k]},   Λ = {λ ∈ R^p | λk ∈ [λ̲k, λ̄k]}.    (5.2.12)

Stated otherwise, the uncertainty regions are the convex hulls of the sets

    Δ0 = {δ ∈ R^p | δk ∈ {δ̲k, δ̄k}},   Λ0 = {λ ∈ R^p | λk ∈ {λ̲k, λ̄k}}.

In addition, the search for a parameter-dependent K(δ) will be restricted to the class of affine functions K(δ) represented by

    K(δ) = K0 + δ1K1 + · · · + δpKp

where Kj ∈ S^n, j = 0, 1, . . . , p, are symmetric matrices. For this class of parameter-dependent functions we have ∂δk K(δ) = Kk, so that (5.2.11b) reads

    Σ_{k=1}^p ∂δk K(δ) λk + A(δ)>K(δ) + K(δ)A(δ) = Σ_{k=1}^p Kk λk + Σ_{ν=0}^p Σ_{μ=0}^p δν δμ (Aν>Kμ + Kμ Aν).

Here, we set δ0 = 1 to simplify notation. The latter expression is

• affine in K1, . . . , K p

• affine inλ1, . . . , λp

• quadratic inδ1, . . . , δp due to the mixture of constant, linear and quadratic terms.

Again, similar to (5.2.5), introduce for arbitrary x ∈ X the function fx : Δ × Λ → R defined by

    fx(δ, λ) := x>[ Σ_{k=1}^p Kk λk + Σ_{ν=0}^p Σ_{μ=0}^p δν δμ (Aν>Kμ + Kμ Aν) ] x.


A sufficient condition for the implication

    {fx(δ, λ) < 0 for all (δ, λ) ∈ Δ0 × Λ0}  ⟹  {fx(δ, λ) < 0 for all (δ, λ) ∈ Δ × Λ}

is that fx is partially convex in each of its arguments δj, j = 1, . . . , p. As in Subsection 5.2.1, this condition translates to the requirement that

    Aj>Kj + Kj Aj ⪰ 0,   j = 1, . . . , p,

which brings us to the following main result.

Theorem 5.10 Suppose that A(·) is affine as described by (5.2.3), and assume that δ(t) satisfies the constraints (5.2.10) with Δ and Λ the compact sets specified in (5.2.12). Then the origin of the system (5.2.9) is robustly asymptotically stable against all time-varying uncertainties that satisfy (5.2.10) if there exist real matrices K0, . . . , Kp such that K(δ) = K0 + Σ_{j=1}^p δj Kj satisfies

    Σ_{k=1}^p Kk λk + Σ_{k=0}^p Σ_{ℓ=0}^p δk δℓ (Ak>Kℓ + Kℓ Ak) ≺ 0 for all δ ∈ Δ0 and λ ∈ Λ0    (5.2.13a)
    K(δ) ≻ 0 for all δ ∈ Δ0    (5.2.13b)
    Aj>Kj + Kj Aj ⪰ 0 for j = 1, . . . , p.    (5.2.13c)

Moreover, in that case, V(x, δ) := x>K(δ)x defines a quadratic parameter-dependent Lyapunov function for the system.

Theorem 5.10 provides an LMI feasibility test to verify robust asymptotic stability against uncertainty described by (5.2.10).
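A sketch of the finite vertex test of Theorem 5.10 (Python/numpy; in an actual design K0, . . . , Kp would come from an SDP solver, here they are supplied by hand as hypothetical data):

```python
import numpy as np

def tv_vertex_test(A_list, K_list, delta_vertices, lam_vertices, tol=1e-9):
    """Check (5.2.13a)-(5.2.13c) for A(d) = A0 + sum d_j A_j, K(d) = K0 + sum d_j K_j."""
    def Adelta(d):
        return A_list[0] + sum(dj * Aj for dj, Aj in zip(d, A_list[1:]))
    def Kdelta(d):
        return K_list[0] + sum(dj * Kj for dj, Kj in zip(d, K_list[1:]))
    for Aj, Kj in zip(A_list[1:], K_list[1:]):                 # (5.2.13c)
        if np.min(np.linalg.eigvalsh(Aj.T @ Kj + Kj @ Aj)) < -tol:
            return False
    for d in delta_vertices:
        if np.min(np.linalg.eigvalsh(Kdelta(d))) <= tol:       # (5.2.13b)
            return False
        for lam in lam_vertices:                               # (5.2.13a)
            M = sum(lk * Kk for lk, Kk in zip(lam, K_list[1:])) \
                + Adelta(d).T @ Kdelta(d) + Kdelta(d) @ Adelta(d)
            if np.max(np.linalg.eigvalsh(M)) >= -tol:
                return False
    return True

# Toy: p = 1, A(d) = -I + d*N, constant K (K1 = 0), |d| <= 0.5, rate bound |d'| <= 1.
N = np.array([[0.0, 1.0], [0.0, 0.0]])
ok = tv_vertex_test([-np.eye(2), N], [np.eye(2), np.zeros((2, 2))],
                    [(-0.5,), (0.5,)], [(-1.0,), (1.0,)])   # True
```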

It is interesting to compare the numerical complexity of the conditions of Theorem 5.7 with those of Theorem 5.10. If the uncertainty vector δ is p-dimensional then the vertex set Δ0 has 2^p elements, so that the verification of the conditions (5.2.8) requires a feasibility test of

    2^p + 2^p + p

linear matrix inequalities. In the time-varying case the vertex set Λ0 also has 2^p elements, which implies that the conditions of Theorem 5.10 require a feasibility test of

    2^{2p} + 2^p + p = 4^p + 2^p + p

linear matrix inequalities.
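In code the two counts are trivial to tabulate (sketch; function names are ad hoc):

```python
def lmi_count_time_invariant(p):
    """Theorem 5.7: 2**p Lyapunov inequalities + 2**p positivity tests + p multiconvexity tests."""
    return 2**p + 2**p + p

def lmi_count_time_varying(p):
    """Theorem 5.10: 4**p vertex pairs + 2**p positivity tests + p multiconvexity tests."""
    return 4**p + 2**p + p

# For p = 3: 8 + 8 + 3 = 19 versus 64 + 8 + 3 = 75 inequalities.
```

The exponential growth in p is the price paid for the vertex-enumeration approach.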

Generalizations

Assuming an affine structure for the state evolution map A(·) and for the matrix function K(·), we were able (Theorem 5.10) to restrict the search for a parameter-dependent Lyapunov function for the system (5.2.9) to a finite-dimensional subspace. The central idea that led to Theorem 5.10 allows many generalizations to non-affine structures. For example, if b1(δ), b2(δ), . . . , bp(δ) denote a set of scalar, continuously differentiable basis functions in the uncertain parameter δ, we may assume that A(δ) and K(δ) allow for expansions

    A(δ) = A1 b1(δ) + · · · + Ap bp(δ)
    K(δ) = K1 b1(δ) + · · · + Kp bp(δ).

The condition (5.2.11b) in Theorem 5.9 then involves the partial derivatives

    ∂kK(δ) = Σ_{j=1}^p Kj ∂k bj(δ).

The robust stability conditions (5.2.11) in Theorem 5.9 then translate to

    Σ_{k=1}^p Kk bk(δ) ≻ 0 for all δ ∈ Δ    (5.2.14a)

    Σ_{j=1}^p ( Σ_{k=1}^p Kj ∂k bj(δ) λk + [A(δ)>Kj + Kj A(δ)] bj(δ) ) ≺ 0 for all δ ∈ Δ and λ ∈ Λ,    (5.2.14b)

which is finite-dimensional in the unknowns K1, . . . , Kp, but not yet an LMI feasibility test. Possible basis functions are

(a) the natural basis

    b1(δ) = δ1, . . . , bp(δ) = δp

where δj = ⟨ej, δ⟩ is the jth component of δ in the natural basis {ej}_{j=1}^p of R^p;

(b) a polynomial basis

    b_{k1,...,kp}(δ) = δ1^{k1} · · · δp^{kp},   kν ∈ {0, 1, 2, . . .},   ν = 1, . . . , p.

In conclusion, in this section we obtained from the nominal stability characterizations in Chapter 3 the corresponding robust stability tests against both time-invariant and time-varying, rate-bounded parametric uncertainties. We continue this chapter by also generalizing the performance characterizations of Chapter 3 to robust performance.

5.3 Robust dissipativity

Among the various refinements and generalizations of the notion of a dissipative dynamical system, we mentioned in Section 2.2 the idea of a robustly dissipative system. To make this more precise, let s : W × Z → R be a supply function associated with the uncertain system (5.1.1) where the uncertain parameter δ(·) satisfies the constraints (5.2.10).


Definition 5.11 (Robust dissipativity) The system (5.1.1) with supply function s is said to be robustly dissipative against time-varying uncertainties (5.2.10) if there exists a function V : X × Δ → R such that the dissipation inequality

    V(x(t0), δ(t0)) + ∫_{t0}^{t1} s(w(t), z(t)) dt ≥ V(x(t1), δ(t1))    (5.3.1)

holds for all t0 ≤ t1 and all signals (w, x, z, δ) that satisfy (5.1.1) and (5.2.10).

Any function V that satisfies (5.3.1) is called a (parameter-dependent) storage function and (5.3.1) is referred to as the robust dissipation inequality. If the composite function V(x(t), δ(t)) is differentiable as a function of time t, then it is easily seen that the system (5.1.1) is robustly dissipative if and only if

    ∂xV(x, δ) f(x, w, δ) + ∂δV(x, δ)λ ≤ s(w, g(x, w, δ))    (5.3.2)

holds for all points (x, w, δ, λ) ∈ R^n × R^m × Δ × Λ. The latter robust differential dissipation inequality (5.3.2) makes robust dissipativity a de facto local property of the functions f and g, the supply function s and the uncertainty sets Δ and Λ.

As in Chapter 2, we will specialize this to linear systems with quadratic supply functions and derive explicit tests for the verification of robust dissipativity. Suppose that f and g in (5.1.1) are linear in x and w. This results in the model (5.1.2) in which the uncertainty δ will be time-varying. In addition, suppose that the supply function is a general quadratic form in (w, z) given by (2.3.2) in Chapter 2. A sufficient condition for robust dissipativity is then given as follows.

Theorem 5.12 Suppose that the function K : Δ → S^n is continuously differentiable on a compact set Δ and satisfies

F(K, \delta, \lambda) := \begin{pmatrix} \partial_\delta K(\delta)\lambda + A(\delta)^\top K(\delta) + K(\delta)A(\delta) & K(\delta)B(\delta) \\ B(\delta)^\top K(\delta) & 0 \end{pmatrix} - \begin{pmatrix} 0 & I \\ C(\delta) & D(\delta) \end{pmatrix}^\top \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} 0 & I \\ C(\delta) & D(\delta) \end{pmatrix} \preceq 0    (5.3.3)

for all δ ∈ Δ and all λ ∈ Λ. Then the uncertain system

\dot{x}(t) = A(\delta(t))x(t) + B(\delta(t))w(t)
z(t) = C(\delta(t))x(t) + D(\delta(t))w(t)    (5.3.4)

where δ : R → R^p is in the class of continuously differentiable functions satisfying the value and rate constraints (5.2.10), is robustly dissipative with respect to the quadratic supply function

s(w, z) = \begin{pmatrix} w \\ z \end{pmatrix}^\top \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix} \begin{pmatrix} w \\ z \end{pmatrix}.

Moreover, in that case, V(x, δ) := x^\top K(\delta)x is a parameter dependent storage function.

Proof. With V(x, δ) := x^\top K(\delta)x, the robust differential dissipation inequality (5.3.2) reads col(x, w)^\top F(K, δ, λ) col(x, w) ≤ 0 for all x ∈ R^n, w ∈ R^m, δ ∈ Δ and λ ∈ Λ. In particular, if (5.3.3) holds for all (δ, λ) ∈ Δ × Λ, then (5.1.2) is robustly dissipative.
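The verification of (5.3.3) can be sketched numerically. The following toy example is entirely our own construction (a scalar system A(δ) = −2 + δ, B = 1, C = 0.5, D = 0, the L2-gain supply s(w, z) = γ²w² − z², and the candidate K(δ) = 0.5 + 0.1δ; none of this appears in the text): we grid Δ = Λ = [−1, 1] and check that F(K, δ, λ) is negative semidefinite at every grid point. Gridding only provides evidence for (5.3.3), not a proof.

```python
import numpy as np

# Hypothetical scalar data and certificate (not from the text)
gamma, k0, k1 = 1.0, 0.5, 0.1   # supply weight and K(δ) = k0 + k1·δ

def F(d, lam):
    A, B, C, D = -2.0 + d, 1.0, 0.5, 0.0
    K, dK = k0 + k1 * d, k1
    P1 = np.array([[0.0, 1.0], [C, D]])               # [0 I; C D]
    Pqr = np.array([[gamma**2, 0.0], [0.0, -1.0]])    # [Q S; Sᵀ R] for s = γ²w² − z²
    lhs = np.array([[dK * lam + 2 * A * K, K * B],
                    [K * B, 0.0]])
    return lhs - P1.T @ Pqr @ P1                      # F(K, δ, λ) as in (5.3.3)

grid = np.linspace(-1.0, 1.0, 21)                     # Δ = Λ = [−1, 1]
worst = max(np.linalg.eigvalsh(F(d, lam))[-1] for d in grid for lam in grid)
print("largest eigenvalue of F over the grid:", worst)  # ≤ 0 on every grid point
```

A negative `worst` certifies (5.3.3) only on the grid; a proof over the whole box requires, e.g., the relaxations developed later in these notes.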


Theorem 5.12 provides a sufficient condition for robust dissipativity. We remark that the condition (5.3.3) is also necessary if the class of storage functions V(x, δ) that satisfy the dissipation inequality (5.3.1) is restricted to functions that are quadratic in x. A result similar to Theorem 5.12 can be derived for the notion of a robust strictly dissipative system. Furthermore, if the uncertain parameter δ is time-invariant, then Λ = {0} and F(K, δ, λ) simplifies to an expression that does not depend on λ. Only in this case can one apply the Kalman-Yakubovich-Popov Lemma 2.12 to infer that (5.3.3) is equivalent to the frequency domain characterization: . . .

(to be continued)

5.4 Robust performance

5.4.1 Robust quadratic performance

Consider the uncertain input-output system as described by (5.3.4) and suppose that δ : R → R^p is in the class of continuously differentiable functions satisfying the value and rate constraints (5.2.10) with Δ and Λ both compact subsets of R^p.

Proposition 5.13 Consider the uncertain system (5.3.4) where δ satisfies the value and rate constraints (5.2.10) with Δ and Λ compact. Let

P = \begin{pmatrix} Q & S \\ S^\top & R \end{pmatrix}

be a real symmetric matrix with R ⪰ 0. Suppose there exists a continuously differentiable Hermitian-valued function K(δ) such that K(δ) ≻ 0 and

\begin{pmatrix} \partial_\delta K(\delta)\lambda + A(\delta)^\top K(\delta) + K(\delta)A(\delta) & K(\delta)B(\delta) \\ B(\delta)^\top K(\delta) & 0 \end{pmatrix} + \begin{pmatrix} 0 & I \\ C(\delta) & D(\delta) \end{pmatrix}^\top P \begin{pmatrix} 0 & I \\ C(\delta) & D(\delta) \end{pmatrix} \prec 0

for all δ ∈ Δ and λ ∈ Λ. Then

(a) the uncertain system (5.3.4) is exponentially stable and

(b) there exists ε > 0 such that for x(0) = 0, for all w ∈ L2 and all uncertain parameter functions δ(·) satisfying (5.2.10) we have

\int_0^\infty \begin{pmatrix} w(t) \\ z(t) \end{pmatrix}^\top P \begin{pmatrix} w(t) \\ z(t) \end{pmatrix} dt \leq -\varepsilon^2 \int_0^\infty w(t)^\top w(t)\, dt.    (5.4.1)


5.5 Further reading

5.6 Exercises

Exercise 1
Consider the system \dot{x} = A(δ)x with A(δ) and Δ defined in Example 5.4.

(a) Show that A(δ) is quadratically stable on the set 0.4Δ.

(b) Derive a quadratic stability test for the eigenvalues of A(δ) being located in a disc of radius r around 0.

(c) Can you find the smallest radius r by convex optimization?

Exercise 2
Time-invariant perturbations and arbitrarily fast perturbations can be viewed as two extreme cases of time-varying uncertainty sets of the type (5.2.10). These two extreme manifestations of time-varying perturbations reduce Theorem 5.9 to two special cases.

(a) Show that the result of Theorem 5.7 is obtained as a special case of Theorem 5.10 if Λ = {0}.

(b) Show that if Λ = [−r, r]^p with r → ∞ then the matrices K_0, \dots, K_k satisfying the conditions of Theorem 5.10 necessarily satisfy K_1 = \dots = K_k = 0.

The latter property implies that with arbitrarily fast perturbations the only affine parameter-dependent Lyapunov matrices K(δ) = K_0 + \sum_{j=1}^{k} \delta_j K_j are the constant (parameter-independent) ones. It is in this sense that Theorem 5.10 reduces to Proposition 5.3 for arbitrarily fast perturbations.

Exercise 3
Reconsider the suspension system of Exercise 3 in Chapter 4. Suppose that the road profile q0 = 0 and the active suspension force F = 0. Let k = 50000 and b = 5000. The suspension damping is a time-varying uncertain quantity with

b_2(t) \in [50 \times 10^3 - b,\ 50 \times 10^3 + b], \quad t \geq 0    (5.6.1)

and the suspension stiffness is a time-varying uncertainty parameter with

k_2(t) \in [500 \times 10^3 - k,\ 500 \times 10^3 + k], \quad t \geq 0.    (5.6.2)

Let δ = col(b_2, k_2) be the vector containing the uncertain physical parameters.


(a) Let x = col(q_1, q_2, \dot{q}_1, \dot{q}_2) denote the state of this system and write this system in the form (5.2.1). Verify whether A(δ) is affine in the uncertainty parameter δ.

(b) Use Proposition 5.3 to verify whether this system is quadratically stable. If so, give a quadratic Lyapunov function for this system.

(c) Calculate vertex matrices A_1, \dots, A_k such that A(δ) ∈ conv(A_1, \dots, A_k) for all δ satisfying the specifications.

(d) Suppose that b_2 and k_2 are time-varying and that their rates of variation satisfy

|\dot{b}_2| \leq \beta    (5.6.3a)
|\dot{k}_2| \leq \kappa    (5.6.3b)

where β = 1 and κ = 3.7. Use Theorem 5.10 to verify whether there exists a parameter-dependent Lyapunov function that proves affine quadratic stability of the uncertain system. If so, give such a Lyapunov function.

Exercise 4
In Exercise 8 of Chapter 3 we considered the batch chemical reactor where the series reaction

A \xrightarrow{k_1} B \xrightarrow{k_2} C

takes place. Here k_1 and k_2 are the kinetic rate constants of the conversions from product A to product B and from product B to product C, respectively. We will be interested in the concentration C_B of product B.

(a) Show that C_B satisfies the differential equation \ddot{C}_B + (k_1 + k_2)\dot{C}_B + k_1 k_2 C_B = 0 and represent this system in state space form with state x = col(C_A, C_B).

(b) Show that the state space system is of the form (5.2.1) where A is an affine function of the kinetic rate constants.

(c) Verify whether this system is quadratically stable in view of jointly uncertain kinetic constants k_1 and k_2 in the range [0.1, 1]. If so, calculate a Lyapunov function for the uncertain system.

(d) At time t = 0 the reactor is injected with an initial concentration C_{A0} = 10 (mol/liter) of reactant A while the concentrations C_B(0) = C_C(0) = 0. Plot the time evolution of the concentration C_B of reactant B if

k_1(t) = 1 - 0.9\exp(-t), \qquad k_2(t) = 0.1 + 0.9\exp(-t).
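For part (d), the time evolution can be obtained by numerically integrating the series-reaction equations \dot{C}_A = -k_1(t)C_A and \dot{C}_B = k_1(t)C_A - k_2(t)C_B. A minimal sketch with a plain forward-Euler scheme (the step size, the horizon and the use of Euler instead of a library ODE solver are our own choices):

```python
import numpy as np

k1 = lambda t: 1.0 - 0.9 * np.exp(-t)
k2 = lambda t: 0.1 + 0.9 * np.exp(-t)

# Series reaction A -> B -> C:  dCA/dt = -k1(t)·CA,  dCB/dt = k1(t)·CA - k2(t)·CB
dt, T = 1e-3, 20.0
t = np.arange(0.0, T, dt)
ca, cb = np.empty_like(t), np.empty_like(t)
ca[0], cb[0] = 10.0, 0.0            # CA0 = 10 mol/liter, CB(0) = 0
for i in range(len(t) - 1):
    ca[i + 1] = ca[i] + dt * (-k1(t[i]) * ca[i])
    cb[i + 1] = cb[i] + dt * (k1(t[i]) * ca[i] - k2(t[i]) * cb[i])

print(f"peak CB = {cb.max():.2f} mol/liter at t = {t[cb.argmax()]:.2f}")
# plot with, e.g., matplotlib: plt.plot(t, cb)
```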

Exercise 5
Let f : Δ → R be partially convex and suppose that Δ = \{δ \mid δ_k ∈ [\underline{\delta}_k, \overline{\delta}_k],\ k = 1, \dots, p\} with \underline{\delta}_k \leq \overline{\delta}_k. Let Δ_0 = \{δ \mid δ_k ∈ \{\underline{\delta}_k, \overline{\delta}_k\},\ k = 1, \dots, p\} be the corresponding set of corner points. Show that for all γ ∈ R we have that

f(δ) ≤ γ for all δ ∈ Δ

if and only if

f(δ) ≤ γ for all δ ∈ Δ_0.


Chapter 6

Robust linear algebra problems

Control systems need to operate well in the face of many types of uncertainties. That is, stability and performance of controlled systems need to be as robust as possible against perturbations, model uncertainties and un-modelled dynamics. These uncertainties may be known, partly known or completely unknown. It will be one of the fundamental insights in this chapter that guarantees for robust stability and robust performance against linear time-invariant uncertainties of complex systems can be reduced to robust linear algebra problems. This has the main advantage that complex and diverse robustness questions in control can be transformed into a specific algebraic problem and subsequently be solved. It is of quite some independent interest to understand the emergence of robustness in algebraic problems and to convey the essential techniques for handling them. We therefore turn in this chapter to a practically most relevant problem in linear algebra: robust approximation by least-squares.

6.1 Robust least-squares

The classical least squares approximation problem is perfectly understood when it comes to questions about existence, uniqueness, and representations of its solutions. Today, highly efficient algorithms exist for solving this problem, and it has found widespread use in diverse fields of application. The purpose of this section is to introduce the robust counterpart of least squares approximation.

Given matrices M ∈ R^{q×r} and N ∈ R^{p×r} and with the Frobenius norm ‖·‖_F, the least squares approximation problem amounts to solving

\min_{X \in \mathbb{R}^{p \times q}} \|XM - N\|_F.    (6.1.1)

It will be explained below why we start with the somewhat unusual representation XM − N of the residual. First note that

\|XM - N\|_F < \gamma \quad \text{if and only if} \quad \sum_{j=1}^{r} \frac{1}{\gamma}\, e_j^\top (XM - N)^\top (XM - N)\, e_j < \gamma.

Therefore, if we define

G(X) := \begin{pmatrix} XMe_1 - Ne_1 \\ \vdots \\ XMe_r - Ne_r \end{pmatrix} \quad \text{and} \quad P_\gamma := \begin{pmatrix} -\gamma I & 0 \\ 0 & \frac{1}{\gamma} I \end{pmatrix},

then G is an affine function and it follows that

\|XM - N\|_F < \gamma \quad \text{if and only if} \quad \begin{pmatrix} I \\ G(X) \end{pmatrix}^\top P_\gamma \begin{pmatrix} I \\ G(X) \end{pmatrix} \prec 0.

Therefore, we can compute the X which minimizes γ by solving an LMI problem. In practice, it often happens that the matrices M and N depend (continuously) on an uncertain parameter vector δ that is known to be contained in some (compact) parameter set δ ⊂ R^m. To express this dependence, we write M(δ) and N(δ) for M and N. For a given matrix X ∈ R^{p×q}, the worst case squared residual is the number

\max_{\delta \in \boldsymbol{\delta}} \|X M(\delta) - N(\delta)\|_F^2.    (6.1.2)

The robust least squares problem then amounts to finding the X which renders this worst-case residual as small as possible. That is, it requires solving the min-max problem

\inf_{X \in \mathbb{R}^{p \times q}} \max_{\delta \in \boldsymbol{\delta}} \|X M(\delta) - N(\delta)\|_F^2.

With

G(\delta, X) := \begin{pmatrix} X M(\delta)e_1 - N(\delta)e_1 \\ \vdots \\ X M(\delta)e_r - N(\delta)e_r \end{pmatrix}

optimal or almost optimal solutions (see Chapter 1) to this problem are obtained by minimizing γ such that the inequality

\begin{pmatrix} I \\ G(\delta, X) \end{pmatrix}^\top P_\gamma \begin{pmatrix} I \\ G(\delta, X) \end{pmatrix} \prec 0 \quad \text{for all } \delta \in \boldsymbol{\delta}

is feasible in X. We thus arrive at a semi-infinite LMI problem. The main goal of this chapter is to show how to reformulate this semi-infinite LMI problem into a finite-dimensional one if M(δ) and N(δ) depend rationally on the parameter δ.
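A minimal numerical sketch of these notions. The data M(δ) = M0 + δ·M1, N(δ) = N0 + δ·N1 and the interval for δ below are hypothetical choices, and the worst case in (6.1.2) is only approximated by gridding:

```python
import numpy as np

# Hypothetical uncertain data: M(δ) = M0 + δ·M1, N(δ) = N0 + δ·N1, δ ∈ [-1, 1]
M0 = np.array([[1.0, 0.0, 2.0],
               [0.0, 1.0, 1.0]])          # q×r = 2×3
N0 = np.array([[1.0, 2.0, 3.0]])          # p×r = 1×3
M1 = 0.3 * np.ones((2, 3))
N1 = 0.1 * np.ones((1, 3))

# Classical least squares (6.1.1): X = N0·pinv(M0) minimizes ‖X M0 − N0‖_F
X = N0 @ np.linalg.pinv(M0)
nominal_res = np.linalg.norm(X @ M0 - N0, 'fro')

# Check of the reformulation: with g = vec(X M − N),
# ‖X M − N‖_F < γ  ⟺  −γ + (1/γ)·‖g‖² < 0
gamma = nominal_res + 0.5
g = (X @ M0 - N0).ravel()
assert (np.linalg.norm(g) < gamma) == (-gamma + g @ g / gamma < 0)

# Worst-case residual (6.1.2), approximated by gridding δ
deltas = np.linspace(-1.0, 1.0, 201)
worst = max(np.linalg.norm(X @ (M0 + d * M1) - (N0 + d * N1), 'fro')
            for d in deltas)
print(nominal_res, worst)   # the worst case is never below the nominal residual
```

Solving the min-max problem itself requires an SDP solver; the point here is only to make the worst-case residual and the γ-reformulation tangible.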

6.2 Linear fractional representations of rational functions

Suppose that G(δ) is a real matrix-valued function which depends on a parameter vector δ = col(δ_1, \dots, δ_m) ∈ R^m. Throughout this section it will be essential to identify, as usual, the matrix G(δ) with the linear function which maps a vector ξ into the vector η = G(δ)ξ.


Figure 6.1: Linear Fractional Representation

Definition 6.1 A linear fractional representation (LFR) of G(δ) is a pair (H, Δ(δ)) where

H = \begin{pmatrix} A & B \\ C & D \end{pmatrix}

is a constant partitioned matrix and Δ a linear function of δ such that, for all δ for which I − AΔ(δ) is invertible and all (ξ, η), there holds

\eta = G(\delta)\xi    (6.2.1)

if and only if there exist vectors w and z such that

\begin{pmatrix} z \\ \eta \end{pmatrix} = \underbrace{\begin{pmatrix} A & B \\ C & D \end{pmatrix}}_{H} \begin{pmatrix} w \\ \xi \end{pmatrix}, \qquad w = \Delta(\delta)z.    (6.2.2)

This relation is pictorially expressed in Figure 6.1. We will sometimes call Δ(δ) the parameter-block of the LFR, and the LFR is said to be well-posed at δ if I − AΔ(δ) is invertible. If (6.2.2) is a well-posed LFR of G(δ) at δ then

\exists\, w, z : (6.2.2) \iff \exists\, z : (I - A\Delta(\delta))z = B\xi,\ \eta = C\Delta(\delta)z + D\xi
\iff \exists\, z : z = (I - A\Delta(\delta))^{-1}B\xi,\ \eta = C\Delta(\delta)z + D\xi
\iff \eta = [D + C\Delta(\delta)(I - A\Delta(\delta))^{-1}B]\xi =: [\Delta(\delta) \star H]\xi.

Hence, in that case, the equations (6.2.2) define a linear mapping in that G(δ) = Δ(δ) ⋆ H. Here, the 'star product' Δ(δ) ⋆ H is defined by the latter expression and will implicitly mean that I − AΔ(δ) is invertible. Since Δ(δ) is linear in δ, this formula reveals that any G(δ) which admits an LFR (H, Δ(δ)) must be a rational function of δ. Moreover, since G(0) = Δ(0) ⋆ H = D it is clear that zero is not a pole of G. The main point of this section is to prove that the converse is true as well. That is, we will show that any matrix-valued multivariable rational function without pole in zero admits an LFR.

Before proving this essential insight we first derive some elementary operations that allow manipulations with LFR's. This will prove useful especially in later chapters, where all operations on LFR's are fully analogous to manipulating (feedback) interconnections of linear systems and their state-space realizations. Let us summarize some of the most important operations for a single LFR G(δ) = Δ(δ) ⋆ H or for two LFR's G_1(δ) = Δ_1(δ) ⋆ H_1 and G_2(δ) = Δ_2(δ) ⋆ H_2, assuming compatible matrix dimensions. We refer to [?, Chapter 10] for a nice collection of a variety of other operations and configurations that can all be easily derived.

Figure 6.2: Sum of LFR's

Summation
As depicted in Figure 6.2, η = [G_1(δ) + G_2(δ)]ξ admits the LFR

\begin{pmatrix} z_1 \\ z_2 \\ \eta \end{pmatrix} = \begin{pmatrix} A_1 & 0 & B_1 \\ 0 & A_2 & B_2 \\ C_1 & C_2 & D_1 + D_2 \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ \xi \end{pmatrix}, \qquad \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = \begin{pmatrix} \Delta_1(\delta) & 0 \\ 0 & \Delta_2(\delta) \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}.

Multiplication
As shown in Figure 6.3, η = [G_1(δ)G_2(δ)]ξ admits the LFR

\begin{pmatrix} z_1 \\ z_2 \\ \eta \end{pmatrix} = \begin{pmatrix} A_1 & B_1C_2 & B_1D_2 \\ 0 & A_2 & B_2 \\ C_1 & D_1C_2 & D_1D_2 \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ \xi \end{pmatrix}, \qquad \begin{pmatrix} w_1 \\ w_2 \end{pmatrix} = \begin{pmatrix} \Delta_1(\delta) & 0 \\ 0 & \Delta_2(\delta) \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}.

Augmentation
For arbitrary matrices L and R, the augmented system η = [LG(δ)R]ξ admits the LFR

\begin{pmatrix} z \\ \eta \end{pmatrix} = \begin{pmatrix} A & BR \\ LC & LDR \end{pmatrix} \begin{pmatrix} w \\ \xi \end{pmatrix}, \qquad w = \Delta(\delta)z.


Figure 6.3: Product of LFR's

This allows us to construct LFR's of row, column and diagonal augmentations by summation:

\begin{pmatrix} G_1(\delta) & G_2(\delta) \end{pmatrix} = G_1(\delta)\begin{pmatrix} I & 0 \end{pmatrix} + G_2(\delta)\begin{pmatrix} 0 & I \end{pmatrix}

\begin{pmatrix} G_1(\delta) \\ G_2(\delta) \end{pmatrix} = \begin{pmatrix} I \\ 0 \end{pmatrix} G_1(\delta) + \begin{pmatrix} 0 \\ I \end{pmatrix} G_2(\delta)

\begin{pmatrix} G_1(\delta) & 0 \\ 0 & G_2(\delta) \end{pmatrix} = \begin{pmatrix} I \\ 0 \end{pmatrix} G_1(\delta) \begin{pmatrix} I & 0 \end{pmatrix} + \begin{pmatrix} 0 \\ I \end{pmatrix} G_2(\delta) \begin{pmatrix} 0 & I \end{pmatrix}.

Inversion
If D is invertible then ξ = G(δ)^{-1}η admits the LFR

\begin{pmatrix} z \\ \xi \end{pmatrix} = \begin{pmatrix} A - BD^{-1}C & -BD^{-1} \\ D^{-1}C & D^{-1} \end{pmatrix} \begin{pmatrix} w \\ \eta \end{pmatrix}, \qquad w = \Delta(\delta)z.

LFR of LFR is LFR
As illustrated in Figure 6.4, if I − D_1A_2 is invertible then η = [G_1(δ) ⋆ H_2]ξ admits the LFR

\begin{pmatrix} z_1 \\ \eta \end{pmatrix} = \underbrace{\begin{pmatrix} A_1 + B_1A_2(I - D_1A_2)^{-1}C_1 & B_1(I - A_2D_1)^{-1}B_2 \\ C_2(I - D_1A_2)^{-1}C_1 & D_2 + C_2D_1(I - A_2D_1)^{-1}B_2 \end{pmatrix}}_{H_{12}} \begin{pmatrix} w_1 \\ \xi \end{pmatrix}, \qquad w_1 = \Delta_1(\delta)z_1.

In other words, (Δ_1(δ) ⋆ H_1) ⋆ H_2 = Δ_1(δ) ⋆ H_{12} with H_{12} as defined here. This expression is relevant for the shifting and scaling of parameters. Indeed, with A_1 = 0 the LFR

G_1(\delta) = \Delta_1(\delta) \star H_1 = D_1 + C_1\Delta_1(\delta)B_1

is a joint scaling and shifting of the uncertainty block Δ_1(δ). We thus find that G_1(δ) ⋆ H_2 = Δ_1(δ) ⋆ H_{12}. If the parameter block Δ_1(δ) is not shifted, then in addition D_1 = 0, which leads to the even more special case

[C_1\Delta_1(\delta)B_1] \star H_2 = \Delta_1(\delta) \star H_2 \quad \text{with} \quad H_2 = \begin{pmatrix} B_1A_2C_1 & B_1B_2 \\ C_2C_1 & D_2 \end{pmatrix}.    (6.2.3)


Figure 6.4: LFR of LFR is LFR

If Δ_1(δ) is of smaller dimension than C_1Δ_1(δ)B_1, this formula reveals that one can compress [C_1Δ_1(δ)B_1] ⋆ H_2 to the more compact LFR Δ_1(δ) ⋆ H_2.

On the other hand, (6.2.3) easily allows us to convince ourselves that any LFR with a general linear parameter-block Δ(δ) can be transformed into one with the concrete parameter dependence

\Delta_d(\delta) = \begin{pmatrix} \delta_1 I_{d_1} & & 0 \\ & \ddots & \\ 0 & & \delta_m I_{d_m} \end{pmatrix}    (6.2.4)

where I_{d_1}, \dots, I_{d_m} are identity matrices of sizes d_1, \dots, d_m respectively. In this case we call the positive integer vector d = (d_1, \dots, d_m) the order of the LFR. Indeed, if Δ(δ) depends linearly on δ, there exist suitable coefficient matrices Δ_j such that

w = \Delta(\delta)z = [\delta_1\Delta_1 + \cdots + \delta_m\Delta_m]z.    (6.2.5)

Again we view the matrices δ_j I as non-dynamic system components and simply name the input and output signals of each of these subsystems. Then w = Δ(δ)z is equivalent to w = w_1 + \cdots + w_m, w_j = δ_j z_j = [δ_j I_j]z_j, z_j = Δ_j z, which is the desired alternative LFR. We observe that the size of I_j, which determines how often δ_j has to be repeated, corresponds to the number of components of the signals w_j and z_j. Since the coefficient matrices Δ_j often have small rank in practice, this procedure can be adapted to reduce the order of the LFR. One just needs to perform the factorizations

\Delta_1 = L_1R_1, \quad \dots, \quad \Delta_m = L_mR_m

such that the number of columns of L_j and of rows of R_j equals the rank of Δ_j and is, therefore, as small as possible. These full-rank factorizations can be easily computed by the Gauss elimination algorithm or by singular value decompositions. Then (6.2.5) reads as

w = \left[L_1(\delta_1 I_{d_1})R_1 + \cdots + L_m(\delta_m I_{d_m})R_m\right]z


which leads to the following smaller order LFR of w = Δ(δ)z:

\begin{pmatrix} z_1 \\ \vdots \\ z_m \\ w \end{pmatrix} = \begin{pmatrix} 0 & \cdots & 0 & R_1 \\ \vdots & \ddots & \vdots & \vdots \\ 0 & \cdots & 0 & R_m \\ L_1 & \cdots & L_m & 0 \end{pmatrix} \begin{pmatrix} w_1 \\ \vdots \\ w_m \\ z \end{pmatrix}, \qquad \begin{pmatrix} w_1 \\ \vdots \\ w_m \end{pmatrix} = \begin{pmatrix} \delta_1 I_{d_1} & & 0 \\ & \ddots & \\ 0 & & \delta_m I_{d_m} \end{pmatrix} \begin{pmatrix} z_1 \\ \vdots \\ z_m \end{pmatrix}.

Since an LFR of an LFR is an LFR, it is straightforward to combine this LFR of w = Δ(δ)z with that of η = G(δ)ξ to obtain the following new LFR of η = G(δ)ξ with a parameter-block that admits the desired diagonal structure:

\begin{pmatrix} z_1 \\ \vdots \\ z_m \\ \eta \end{pmatrix} = \begin{pmatrix} R_1AL_1 & \cdots & R_1AL_m & R_1B \\ \vdots & \ddots & \vdots & \vdots \\ R_mAL_1 & \cdots & R_mAL_m & R_mB \\ CL_1 & \cdots & CL_m & D \end{pmatrix} \begin{pmatrix} w_1 \\ \vdots \\ w_m \\ \xi \end{pmatrix}, \qquad w_j = [\delta_j I_{d_j}]z_j.

Let us now turn to the construction of LFR's for rational functions. We begin with G(δ) depending affinely on δ. Then G(δ) = G_0 + Δ(δ) with Δ(δ) linear, and an LFR of η = G(δ)ξ is simply given by

\begin{pmatrix} z \\ \eta \end{pmatrix} = \begin{pmatrix} 0 & I \\ I & G_0 \end{pmatrix} \begin{pmatrix} w \\ \xi \end{pmatrix}, \qquad w = \Delta(\delta)z.

As just described, one can easily go one step further to obtain an LFR with a block-diagonalparameter-block as described for general LFR’s.

We proceed with a one-variable polynomial dependence

\eta = \left[G_0 + \delta G_1 + \cdots + \delta^p G_p\right]\xi.    (6.2.6)

For achieving efficient LFR's we rewrite this as

\eta = G_0\xi + \delta\left[G_1 + \delta\left[G_2 + \cdots + \delta\left[G_{p-1} + \delta G_p\right]\cdots\right]\right]\xi.    (6.2.7)

Then we can separate the uncertainties as

\eta = G_0\xi + w_1,\quad w_1 = \delta z_1,\quad z_1 = G_1\xi + w_2,\quad w_2 = \delta z_2,\quad z_2 = G_2\xi + w_3,\quad \dots,\quad w_{p-1} = \delta z_{p-1},\quad z_{p-1} = G_{p-1}\xi + w_p,\quad w_p = \delta z_p,\quad z_p = G_p\xi.

With w = col(w_1, \dots, w_p), z = col(z_1, \dots, z_p), this LFR of (6.2.6) reads as

\begin{pmatrix} z_1 \\ \vdots \\ z_{p-1} \\ z_p \\ \eta \end{pmatrix} = \begin{pmatrix} 0 & I & \cdots & 0 & G_1 \\ \vdots & \ddots & \ddots & \vdots & \vdots \\ 0 & \cdots & 0 & I & G_{p-1} \\ 0 & \cdots & 0 & 0 & G_p \\ I & \cdots & 0 & 0 & G_0 \end{pmatrix} \begin{pmatrix} w_1 \\ \vdots \\ w_{p-1} \\ w_p \\ \xi \end{pmatrix}, \qquad w = \delta z.


Now assume that G(δ) is a one-parameter matrix-valued rational function without pole in zero. It is straightforward to construct a square polynomial matrix D such that DG = N is itself polynomial. For example, one can determine the common multiple of the elements' denominators in each row and collect these scalar polynomials on the diagonal of D. Since none of the denominators of the elements of G vanishes in zero, this procedure guarantees that D(0) is nonsingular. Therefore det(D(δ)) does not vanish identically, which implies G(δ) = D(δ)^{-1}N(δ) for all δ for which D(δ) is non-singular. In summary, one can construct polynomial matrices

D(\delta) = D_0 + \delta D_1 + \cdots + \delta^p D_p \quad \text{and} \quad N(\delta) = N_0 + \delta N_1 + \cdots + \delta^p N_p

with either D_p ≠ 0 or N_p ≠ 0 and with the following properties:

• D is square and D(0) = D_0 is non-singular.

• If δ is chosen such that D(δ) is invertible then G has no pole at δ, and η = G(δ)ξ is equivalent to D(δ)η = N(δ)ξ.

Clearly D(δ)η = N(δ)ξ reads as

\left[D_0 + \delta D_1 + \cdots + \delta^p D_p\right]\eta = \left[N_0 + \delta N_1 + \cdots + \delta^p N_p\right]\xi,

which is equivalent to

\eta = D_0^{-1}N_0\xi + \delta(D_0^{-1}N_1\xi - D_0^{-1}D_1\eta) + \cdots + \delta^p(D_0^{-1}N_p\xi - D_0^{-1}D_p\eta)

and hence to

\eta = D_0^{-1}N_0\xi + D_0^{-1}w_1, \qquad w_1 = \delta z_1, \qquad z_1 = \sum_{j=1}^{p} \delta^{j-1}\left[(N_j - D_jD_0^{-1}N_0)\xi - D_jD_0^{-1}w_1\right].

Exactly as described for polynomial dependence this leads to the LFR

\begin{pmatrix} z_1 \\ \vdots \\ z_{p-1} \\ z_p \\ \eta \end{pmatrix} = \begin{pmatrix} -D_1D_0^{-1} & I & \cdots & 0 & N_1 - D_1D_0^{-1}N_0 \\ \vdots & \ddots & \ddots & \vdots & \vdots \\ -D_{p-1}D_0^{-1} & \cdots & 0 & I & N_{p-1} - D_{p-1}D_0^{-1}N_0 \\ -D_pD_0^{-1} & \cdots & 0 & 0 & N_p - D_pD_0^{-1}N_0 \\ D_0^{-1} & \cdots & 0 & 0 & D_0^{-1}N_0 \end{pmatrix} \begin{pmatrix} w_1 \\ \vdots \\ w_{p-1} \\ w_p \\ \xi \end{pmatrix}, \qquad w_1 = \delta z_1,\ \dots,\ w_p = \delta z_p.

If this LFR is well-posed at δ, then ξ = 0 implies η = 0, which in turn shows that D(δ) is invertible. Therefore we can conclude that the above representation is indeed an LFR of η = G(δ)ξ as desired.
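For p = 1 this construction can be written down and tested directly. The sketch below assumes the scalar fractional representation G(δ) = (N0 + δN1)/(D0 + δD1) with D0 invertible; the coefficient values are our own choices:

```python
import numpy as np

def star(Delta, H, k):
    # Δ ⋆ H = D + C·Δ·(I − A·Δ)^{-1}·B, with A the leading k×k block of H
    A, B, C, D = H[:k, :k], H[:k, k:], H[k:, :k], H[k:, k:]
    return D + C @ Delta @ np.linalg.inv(np.eye(k) - A @ Delta) @ B

# Scalar rational G(δ) = (N0 + δ·N1)/(D0 + δ·D1), D0 ≠ 0 (p = 1)
N0, N1, D0, D1 = 1.0, 2.0, 1.0, -0.5
H = np.array([[-D1 / D0, N1 - D1 * N0 / D0],
              [1.0 / D0, N0 / D0]])
for d in (-0.4, 0.0, 0.8):
    lfr_val = star(np.array([[d]]), H, k=1)[0, 0]
    assert abs(lfr_val - (N0 + d * N1) / (D0 + d * D1)) < 1e-12
print("rational one-parameter LFR matches G(δ)")
```

Well-posedness fails exactly where 1 + δ·D1/D0 = 0, i.e. where the denominator D0 + δ·D1 vanishes, consistent with the remark above.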

Remark 6.2 We have essentially recapitulated how to construct a realization for the rational function G(1/s) in the variable s, which is proper since G has no pole in zero. There exists a large body


of literature on how to compute efficient minimal realizations, either by starting with polynomial fractional representations that are coprime, or by reducing the above LFR on the basis of state-space techniques. If constructing a minimal realization of G(1/s) with any of these (numerically stable) techniques, it is thus possible to determine a minimally sized matrix A with

G(\delta) = C\left(\tfrac{1}{\delta}I - A\right)^{-1}B + D = D + C\delta(I - A\delta)^{-1}B = [\delta I_{\dim(A)}] \star \begin{pmatrix} A & B \\ C & D \end{pmatrix}.

This leads to the minimally sized LFR

\begin{pmatrix} z \\ \eta \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix} \begin{pmatrix} w \\ \xi \end{pmatrix}, \qquad w = \delta z

of (6.2.1) with rational one-parameter dependence.

Remark 6.3 Let us choose an arbitrary mapping π : {1, \dots, p} → {1, \dots, p}. If replacing w_j = [δI]z_j in the above LFR with w_j = [δ_{π(j)}I]z_j, we can follow the above derivation backwards to observe that one obtains an LFR of

[D_0 + \delta_{\pi(1)}D_1 + (\delta_{\pi(1)}\delta_{\pi(2)})D_2 + \cdots + (\delta_{\pi(1)}\cdots\delta_{\pi(p)})D_p]\eta = [N_0 + \delta_{\pi(1)}N_1 + (\delta_{\pi(1)}\delta_{\pi(2)})N_2 + \cdots + (\delta_{\pi(1)}\cdots\delta_{\pi(p)})N_p]\xi

with a specific multi-linear dependence of D(δ) and N(δ) on δ.

Let us finally consider a general multivariable matrix-valued rational function G(δ) without pole in zero. This just means that each of its elements can be represented as

\frac{\sum_{j_1=0}^{p_1}\cdots\sum_{j_m=0}^{p_m} \alpha_{j_1,\dots,j_m}\,\delta_1^{j_1}\cdots\delta_m^{j_m}}{\sum_{j_1=0}^{l_1}\cdots\sum_{j_m=0}^{l_m} \beta_{j_1,\dots,j_m}\,\delta_1^{j_1}\cdots\delta_m^{j_m}} \quad \text{with } \beta_{0,\dots,0} \neq 0.

In literally the same fashion as described for one-variable polynomials one can easily construct multivariable polynomial matrices D and N with DG = N and nonsingular D(0). Again, for all δ for which D(δ) is non-singular one concludes that η = G(δ)ξ is equivalent to D(δ)η = N(δ)ξ or to the kernel representation

0 = H(\delta)\begin{pmatrix} \xi \\ \eta \end{pmatrix} \quad \text{with} \quad H(\delta) = \begin{pmatrix} N(\delta) & -D(\delta) \end{pmatrix}.    (6.2.8)

One can decompose

H(\delta) = H_0 + \delta_1 H_1^1(\delta) + \cdots + \delta_m H_m^1(\delta)

with easily determined polynomial matrices H_j^1 whose degrees are strictly smaller than those of H.

Then (6.2.8) is equivalent to

0 = H_0\begin{pmatrix} \xi \\ \eta \end{pmatrix} + \sum_{j=1}^{m} w_j^1, \qquad w_j^1 = \delta_j z_j^1, \qquad z_j^1 = H_j^1(\delta)\begin{pmatrix} \xi \\ \eta \end{pmatrix}.    (6.2.9)


Let us stress that the channel j should be considered absent (empty) if H_j^1 is the zero polynomial matrix. With w^1 = col(w_1^1, \dots, w_m^1), z^1 = col(z_1^1, \dots, z_m^1), E_0 := (I \cdots I), H^1 = col(H_1^1, \dots, H_m^1), (6.2.9) is more compactly written as

0 = H_0\begin{pmatrix} \xi \\ \eta \end{pmatrix} + E_0 w^1, \qquad z^1 = H^1(\delta)\begin{pmatrix} \xi \\ \eta \end{pmatrix}, \qquad w^1 = \mathrm{diag}(\delta_1 I, \dots, \delta_m I)z^1.    (6.2.10)

We are now in the position to iterate. For this purpose one decomposes

H^1(\delta) = H_1 + \delta_1 H_1^2(\delta) + \cdots + \delta_m H_m^2(\delta)

and pulls parameters out of the middle relation in (6.2.10) as

z^1 = H_1\begin{pmatrix} \xi \\ \eta \end{pmatrix} + w_1^2 + \cdots + w_m^2, \qquad w_j^2 = \delta_j z_j^2, \qquad z_j^2 = H_j^2(\delta)\begin{pmatrix} \xi \\ \eta \end{pmatrix}.

Again using compact notation, (6.2.10) is equivalent to

0 = H_0\begin{pmatrix} \xi \\ \eta \end{pmatrix} + E_0 w^1, \qquad z^1 = H_1\begin{pmatrix} \xi \\ \eta \end{pmatrix} + E_1 w^2, \qquad z^2 = H^2(\delta)\begin{pmatrix} \xi \\ \eta \end{pmatrix},

w^1 = \mathrm{diag}(\delta_1 I, \dots, \delta_m I)z^1, \qquad w^2 = \mathrm{diag}(\delta_1 I, \dots, \delta_m I)z^2.

If we continue in this fashion, the degree of H^j is strictly decreased in each step, and the iteration stops after k steps when H^k(δ) = H_k is just a constant matrix. One arrives at

0 = H_0\begin{pmatrix} \xi \\ \eta \end{pmatrix} + E_0 w^1, \qquad z^j = H_j\begin{pmatrix} \xi \\ \eta \end{pmatrix} + E_j w^{j+1}, \qquad w^j = \mathrm{diag}(\delta_1 I, \dots, \delta_m I)z^j, \qquad j = 1, \dots, k

with E_k and w^{k+1} being empty. In a last step we turn this implicit relation into a genuine input-output relation. For this purpose we partition

H_j = \begin{pmatrix} N_j & -D_j \end{pmatrix}

conformably with H = (N\ -D). Let us recall that D_0 = D(0) is non-singular and hence

0 = H_0\begin{pmatrix} \xi \\ \eta \end{pmatrix} + E_0 w^1 \quad \text{is equivalent to} \quad \eta = D_0^{-1}N_0\xi + D_0^{-1}E_0 w^1.

After performing this substitution we indeed directly arrive at the desired explicit LFR of the input-output relation η = G(δ)ξ:

\eta = D_0^{-1}N_0\xi + D_0^{-1}E_0 w^1, \qquad z^j = [N_j - D_jD_0^{-1}N_0]\xi - [D_jD_0^{-1}E_0]w^1 + E_j w^{j+1},

w^j = \mathrm{diag}(\delta_1 I, \dots, \delta_m I)z^j, \qquad j = 1, \dots, k.

Of course this LFR can again be modified to one with the parameter-block (6.2.4), which just amounts to a reordering in the present situation. We have constructively proved the desired representation theorem for rational G(δ), where the presented technique is suited to lead to reasonably sized LFR's in practice.


Theorem 6.4 Suppose that all entries of the matrix G(δ) are rational functions of δ = (δ_1, \dots, δ_m) whose denominators do not vanish at δ = 0. Then there exist matrices A, B, C, D and non-negative integers d_1, \dots, d_m such that, with Δ(δ) = diag(δ_1 I_{d_1}, \dots, δ_m I_{d_m}),

\eta = G(\delta)\xi \quad \text{if and only if there exist } w, z : \quad \begin{pmatrix} z \\ \eta \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix}\begin{pmatrix} w \\ \xi \end{pmatrix}, \qquad w = \Delta(\delta)z

for all δ with det(I − AΔ(δ)) ≠ 0 and for all ξ.

Example 6.5 Consider

\eta = \underbrace{\frac{1}{(\delta_2-1)(\delta_2+\delta_1^2+\delta_1-1)} \begin{pmatrix} 2\delta_2^2-\delta_2+2\delta_1^2\delta_2-\delta_1^2+2\delta_1 & \delta_1\delta_2+\delta_1^2+\delta_2+\delta_1 \\ (-2\delta_1\delta_2+3\delta_1+2\delta_2-1)\delta_1 & (\delta_1\delta_2+1)\delta_1 \end{pmatrix}}_{G(\delta)} \xi

or

\underbrace{\begin{pmatrix} (\delta_2-1)(\delta_2+\delta_1^2+\delta_1-1) & 0 \\ 0 & (\delta_2-1)(\delta_2+\delta_1^2+\delta_1-1) \end{pmatrix}}_{D(\delta)} \eta = \underbrace{\begin{pmatrix} 2\delta_2^2-\delta_2+2\delta_1^2\delta_2-\delta_1^2+2\delta_1 & \delta_1\delta_2+\delta_1^2+\delta_2+\delta_1 \\ (-2\delta_1\delta_2+3\delta_1+2\delta_2-1)\delta_1 & (\delta_1\delta_2+1)\delta_1 \end{pmatrix}}_{N(\delta)} \xi.

Sorting for common powers leads to

\eta = \delta_1\left[\begin{pmatrix} 2 & 1 \\ -1 & 1 \end{pmatrix}\xi + \eta\right] + \delta_1^2\left[\begin{pmatrix} -1 & 1 \\ 3 & 0 \end{pmatrix}\xi + \eta\right] + \delta_1^2\delta_2\left[\begin{pmatrix} 2 & 0 \\ -2 & 1 \end{pmatrix}\xi - \eta\right] + \delta_2\left[\begin{pmatrix} -1 & 1 \\ 0 & 0 \end{pmatrix}\xi + 2\eta\right] + \delta_1\delta_2\left[\begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix}\xi - \eta\right] + \delta_2^2\left[\begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}\xi - \eta\right].

In a first step we pull out the parameters as

\eta = w_1 + w_4, \qquad w_1 = \delta_1 z_1, \qquad w_4 = \delta_2 z_4

z_1 = \begin{pmatrix} 2 & 1 \\ -1 & 1 \end{pmatrix}\xi + \eta + \delta_1\left[\begin{pmatrix} -1 & 1 \\ 3 & 0 \end{pmatrix}\xi + \eta\right] + \delta_1\delta_2\left[\begin{pmatrix} 2 & 0 \\ -2 & 1 \end{pmatrix}\xi - \eta\right]

z_4 = \begin{pmatrix} -1 & 1 \\ 0 & 0 \end{pmatrix}\xi + 2\eta + \delta_1\left[\begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix}\xi - \eta\right] + \delta_2\left[\begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}\xi - \eta\right].

We iterate with the second step as

\eta = w_1 + w_4, \qquad w_1 = \delta_1 z_1, \quad w_2 = \delta_1 z_2, \quad w_3 = \delta_1 z_3, \quad w_4 = \delta_2 z_4, \quad w_5 = \delta_2 z_5

z_1 = \begin{pmatrix} 2 & 1 \\ -1 & 1 \end{pmatrix}\xi + \eta + w_2, \qquad z_2 = \begin{pmatrix} -1 & 1 \\ 3 & 0 \end{pmatrix}\xi + \eta + \delta_2\left[\begin{pmatrix} 2 & 0 \\ -2 & 1 \end{pmatrix}\xi - \eta\right]

z_4 = \begin{pmatrix} -1 & 1 \\ 0 & 0 \end{pmatrix}\xi + 2\eta + w_3 + w_5, \qquad z_3 = \begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix}\xi - \eta, \qquad z_5 = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}\xi - \eta.


It just requires one last step to complete the separation as

\eta = w_1 + w_4, \qquad w_1 = \delta_1 z_1, \quad w_2 = \delta_1 z_2, \quad w_3 = \delta_1 z_3, \quad w_4 = \delta_2 z_4, \quad w_5 = \delta_2 z_5, \quad w_6 = \delta_2 z_6

z_1 = \begin{pmatrix} 2 & 1 \\ -1 & 1 \end{pmatrix}\xi + \eta + w_2, \qquad z_2 = \begin{pmatrix} -1 & 1 \\ 3 & 0 \end{pmatrix}\xi + \eta + w_6, \qquad z_6 = \begin{pmatrix} 2 & 0 \\ -2 & 1 \end{pmatrix}\xi - \eta

z_4 = \begin{pmatrix} -1 & 1 \\ 0 & 0 \end{pmatrix}\xi + 2\eta + w_3 + w_5, \qquad z_3 = \begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix}\xi - \eta, \qquad z_5 = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}\xi - \eta.

In matrix format this can be expressed as

\begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \\ z_5 \\ z_6 \\ \eta \end{pmatrix} = \begin{pmatrix} I_2 & I_2 & 0 & I_2 & 0 & 0 & T_1 \\ I_2 & 0 & 0 & I_2 & 0 & I_2 & T_2 \\ -I_2 & 0 & 0 & -I_2 & 0 & 0 & T_3 \\ 2I_2 & 0 & I_2 & 2I_2 & I_2 & 0 & T_4 \\ -I_2 & 0 & 0 & -I_2 & 0 & 0 & T_5 \\ -I_2 & 0 & 0 & -I_2 & 0 & 0 & T_6 \\ I_2 & 0 & 0 & I_2 & 0 & 0 & 0 \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \\ w_5 \\ w_6 \\ \xi \end{pmatrix}

with the 2 × 2 coefficient blocks

T_1 = \begin{pmatrix} 2 & 1 \\ -1 & 1 \end{pmatrix}, \quad T_2 = \begin{pmatrix} -1 & 1 \\ 3 & 0 \end{pmatrix}, \quad T_3 = \begin{pmatrix} 0 & 1 \\ 2 & 0 \end{pmatrix}, \quad T_4 = \begin{pmatrix} -1 & 1 \\ 0 & 0 \end{pmatrix}, \quad T_5 = \begin{pmatrix} 2 & 0 \\ 0 & 0 \end{pmatrix}, \quad T_6 = \begin{pmatrix} 2 & 0 \\ -2 & 1 \end{pmatrix},

and

\begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \\ w_5 \\ w_6 \end{pmatrix} = \begin{pmatrix} \delta_1 I_2 & 0 & 0 & 0 & 0 & 0 \\ 0 & \delta_1 I_2 & 0 & 0 & 0 & 0 \\ 0 & 0 & \delta_1 I_2 & 0 & 0 & 0 \\ 0 & 0 & 0 & \delta_2 I_2 & 0 & 0 \\ 0 & 0 & 0 & 0 & \delta_2 I_2 & 0 \\ 0 & 0 & 0 & 0 & 0 & \delta_2 I_2 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \\ z_5 \\ z_6 \end{pmatrix}.

The parameter-block turns out to be of size 12. We stress that, even for the rather simple case at hand, a less careful construction can easily lead to much larger-sized LFR's. For example, with SISO realization techniques it is not difficult to find LFR's of the elements of D, N of sizes 4 and 3 respectively. If using the operations augmentation, inversion and multiplication, one obtains an LFR with parameter-block of size (4 + 4) + (4 + 3 + 3 + 3) = 21. We will briefly comment on various aspects of the practical construction of LFR's in the next section.
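The hand-made LFR of Example 6.5 can be validated numerically: evaluate D + CΔ(I − AΔ)^{-1}B at sample points (δ1, δ2) and compare with G(δ) itself. The sample points below are our own choices, and points where I − AΔ is close to singular are simply skipped:

```python
import numpy as np

I2, Z = np.eye(2), np.zeros((2, 2))
T1 = np.array([[2.0, 1.0], [-1.0, 1.0]])
T2 = np.array([[-1.0, 1.0], [3.0, 0.0]])
T3 = np.array([[0.0, 1.0], [2.0, 0.0]])
T4 = np.array([[-1.0, 1.0], [0.0, 0.0]])
T5 = np.array([[2.0, 0.0], [0.0, 0.0]])
T6 = np.array([[2.0, 0.0], [-2.0, 1.0]])

# A, B, C, D of the LFR of Example 6.5 (η = w1 + w4 already substituted)
A = np.block([[ I2,     I2, Z,   I2,     Z,  Z],
              [ I2,     Z,  Z,   I2,     Z,  I2],
              [-I2,     Z,  Z,  -I2,     Z,  Z],
              [2 * I2,  Z,  I2,  2 * I2, I2, Z],
              [-I2,     Z,  Z,  -I2,     Z,  Z],
              [-I2,     Z,  Z,  -I2,     Z,  Z]])
B = np.vstack([T1, T2, T3, T4, T5, T6])
C = np.hstack([I2, Z, Z, I2, Z, Z])
D = np.zeros((2, 2))

checked = 0
for d1, d2 in [(0.3, -0.4), (0.1, 0.2), (-0.2, 0.5)]:
    Delta = np.diag([d1] * 6 + [d2] * 6)     # diag(δ1·I2, δ1·I2, δ1·I2, δ2·I2, ...)
    M = np.eye(12) - A @ Delta
    if abs(np.linalg.det(M)) < 1e-8:
        continue                             # skip ill-posed sample points
    G_lfr = D + C @ Delta @ np.linalg.inv(M) @ B
    den = (d2 - 1) * (d2 + d1**2 + d1 - 1)
    N = np.array([[2*d2**2 - d2 + 2*d1**2*d2 - d1**2 + 2*d1,
                   d1*d2 + d1**2 + d2 + d1],
                  [(-2*d1*d2 + 3*d1 + 2*d2 - 1) * d1,
                   (d1*d2 + 1) * d1]])
    assert np.allclose(G_lfr, N / den)       # the LFR reproduces G(δ)
    checked += 1
print(f"LFR reproduced G(δ) at {checked} sample points")
```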

6.3 On the practical construction of LFR’s

LFR-Toolbox. The practical construction of LFR’s is supported by a very helpful, professionallycomposed and freely available Matlab LFR-Toolbox developed by Francois Magni [?]. The toolbox


is complemented with an excellent manual which contains numerous theoretical and practical hints,a whole variety of examples and many pointers to the recent literature. The purpose of this sectionis to provide some additional remarks that might not be as easily accessible.

Reduction. It is of crucial importance to construct LFR's with a reasonably sized parameter-block. After having found one LFR Δ(δ) ⋆ H, it is hence desirable to systematically construct an LFR Δ_r(δ) ⋆ H_r whose parameter-block Δ_r(δ) is as small as possible and such that Δ(δ) ⋆ H = Δ_r(δ) ⋆ H_r for all δ for which both are well-posed. Unfortunately this minimal representation problem does not admit a computationally tractable solution. Two reduction schemes, channel-by-channel [?] and multi-channel [?] reduction, have been suggested in the literature and are implemented in the LFR-Toolbox [?]. To explain the channel-by-channel scheme consider the LFR

\begin{pmatrix} z_1 \\ \vdots \\ z_m \\ \eta \end{pmatrix} = \begin{pmatrix} A_{11} & \cdots & A_{1m} & B_1 \\ \vdots & \ddots & \vdots & \vdots \\ A_{m1} & \cdots & A_{mm} & B_m \\ C_1 & \cdots & C_m & D \end{pmatrix} \begin{pmatrix} w_1 \\ \vdots \\ w_m \\ \xi \end{pmatrix}, \qquad w_j = \delta_j z_j, \quad j = 1, \dots, m.    (6.3.1)

For each j we associate with this LFR (via the substitution w_j → x_j, z_j → ẋ_j) the LTI system

\[
\begin{pmatrix} z_1 \\ \vdots \\ \dot x_j \\ \vdots \\ z_m \\ \eta \end{pmatrix}
=
\begin{pmatrix}
A_{11} & \cdots & A_{1j} & \cdots & A_{1m} & B_1 \\
\vdots & \ddots & \vdots & & \vdots & \vdots \\
A_{j1} & \cdots & A_{jj} & \cdots & A_{jm} & B_j \\
\vdots & & \vdots & \ddots & \vdots & \vdots \\
A_{m1} & \cdots & A_{mj} & \cdots & A_{mm} & B_m \\
C_1 & \cdots & C_j & \cdots & C_m & D
\end{pmatrix}
\begin{pmatrix} w_1 \\ \vdots \\ x_j \\ \vdots \\ w_m \\ \xi \end{pmatrix}.
\]

If non-minimal, this realization can be reduced to a controllable and observable one, to arrive at

\[
\begin{pmatrix} \hat z_1 \\ \vdots \\ \dot{\hat x}_j \\ \vdots \\ \hat z_m \\ \eta \end{pmatrix}
=
\begin{pmatrix}
\hat A_{11} & \cdots & \hat A_{1j} & \cdots & \hat A_{1m} & \hat B_1 \\
\vdots & \ddots & \vdots & & \vdots & \vdots \\
\hat A_{j1} & \cdots & \hat A_{jj} & \cdots & \hat A_{jm} & \hat B_j \\
\vdots & & \vdots & \ddots & \vdots & \vdots \\
\hat A_{m1} & \cdots & \hat A_{mj} & \cdots & \hat A_{mm} & \hat B_m \\
\hat C_1 & \cdots & \hat C_j & \cdots & \hat C_m & \hat D
\end{pmatrix}
\begin{pmatrix} w_1 \\ \vdots \\ \hat x_j \\ \vdots \\ w_m \\ \xi \end{pmatrix}
\]

with x̂_j of smaller size than x_j (the hatted data denote the reduced realization). Sequentially performing this reduction step for all j = 1, …, m, we arrive at the LFR

\[
\begin{pmatrix} \hat z_1 \\ \vdots \\ \hat z_j \\ \vdots \\ \hat z_m \\ \eta \end{pmatrix}
=
\begin{pmatrix}
\hat A_{11} & \cdots & \hat A_{1j} & \cdots & \hat A_{1m} & \hat B_1 \\
\vdots & \ddots & \vdots & & \vdots & \vdots \\
\hat A_{j1} & \cdots & \hat A_{jj} & \cdots & \hat A_{jm} & \hat B_j \\
\vdots & & \vdots & \ddots & \vdots & \vdots \\
\hat A_{m1} & \cdots & \hat A_{mj} & \cdots & \hat A_{mm} & \hat B_m \\
\hat C_1 & \cdots & \hat C_j & \cdots & \hat C_m & \hat D
\end{pmatrix}
\begin{pmatrix} \hat w_1 \\ \vdots \\ \hat w_j \\ \vdots \\ \hat w_m \\ \xi \end{pmatrix},
\qquad \hat w_j = \delta_j \hat z_j \tag{6.3.2}
\]


with ẑ_j and ŵ_j of smaller size than z_j and w_j. A little reflection convinces us that (6.3.1) is equivalent to (6.3.2) for all δ for which both LFR's are well-posed. Note that the algebraic varieties of δ for which the original LFR and the new LFR are well-posed are generally different. For a multi-channel extension we refer to [?], and for an illustration of how these schemes work in practice we consider an example.

Example 6.6 Let us continue Example 6.5 by constructing an LFR of the rational matrix with the LFR-Toolbox. The block structure is (d1, d2) = (32, 23), and after SISO and MIMO reduction it amounts to (d1, d2) = (23, 10) and (d1, d2) = (13, 10) respectively. Starting with the Horner decomposition of G even leads to the larger reduced size (d1, d2) = (14, 12). Even after reduction, all these LFR's are much larger than the one found in Exercise 6.5. Constructing an LFR of the polynomial matrix (D − N) (after Horner decomposition and reduction) leads to (d1, d2) = (7, 6), which allows us to find an LFR of D⁻¹N with the same parameter-block (Exercise 1). Reducing the manually constructed LFR from Exercise 6.5 leads to the smallest sizes (d1, d2) = (6, 5). We stress that this is still far from optimal!

Approximation. If exact reduction of a given LFR G(δ) = Δ(δ) ⋆ H is hard or impossible, one might construct an approximation Δa(δ) ⋆ Ha with a smaller-sized Δa(δ). For this purpose we stress that LFR's are typically used for parameters δ in some a priori given set δ. If both the original and the approximate LFR's are well-posed for all δ ∈ δ, we can use

\[
\sup_{\delta \in \boldsymbol{\delta}} \| \Delta(\delta) \star H - \Delta_a(\delta) \star H_a \|
\]

(with any matrix norm ‖·‖) as a measure for the distance of the two LFR's, and hence as an indicator for the approximation quality. For LMI-based approximate reduction algorithms that are similar to the balanced truncation reduction of transfer matrices we refer to [?]. We stress that the evaluation of the distance just requires knowledge of the values G(δ) = Δ(δ) ⋆ H at δ ∈ δ. Given G(δ), this leads to the viable idea of constructing LFR's by non-linear optimization [?]. Among the various possibilities we would like to sketch one scheme that will be successfully applied to our example. Choose some parameter-block Δa(δ) (of small size) which depends linearly on δ, together with finitely many distinct points δ¹, …, δᴺ ∈ δ (which should be well-distributed in δ). Then the goal is to find Aa, Ba, Ca, Da such that

\[
\max_{j=1,\dots,N} \left\| G(\delta^j) - \Delta_a(\delta^j) \star \begin{pmatrix} A_a & B_a \\ C_a & D_a \end{pmatrix} \right\| \tag{6.3.3}
\]

is as small as possible. Clearly

\[
V_j = G(\delta^j) - \Delta_a(\delta^j) \star \begin{pmatrix} A_a & B_a \\ C_a & D_a \end{pmatrix}
\;\Longleftrightarrow\;
\exists\, U_j:\; V_j = G(\delta^j) - D_a - C_a U_j,\;\;
U_j = (I - \Delta_a(\delta^j) A_a)^{-1} \Delta_a(\delta^j) B_a
\]
\[
\Longleftrightarrow\;
\exists\, U_j:\; [\Delta_a(\delta^j) A_a - I]\, U_j + \Delta_a(\delta^j) B_a = 0,\;\;
C_a U_j + D_a - G(\delta^j) = -V_j.
\]

Therefore we have to solve the nonlinear optimization problem

\[
\begin{array}{ll}
\text{minimize} & \displaystyle\max_{j=1,\dots,N} \| C_a U_j + D_a - G(\delta^j) \| \\[1mm]
\text{subject to} & [\Delta_a(\delta^j) A_a - I]\, U_j + \Delta_a(\delta^j) B_a = 0, \quad j = 1,\dots,N
\end{array} \tag{6.3.4}
\]


in the variables Aa, Ba, Ca, Da, U_j, j = 1, …, N. It might be relevant to explicitly include the constraint that Δa(δ^j)Aa − I has to be invertible. Moreover we stress that it is immediate to formulate variations which might considerably simplify the corresponding computations. For example, if striving for LFR's which are exact in δ¹, …, δᴺ, one could solve the non-linear least-squares problem

\[
\inf_{\substack{A_a,\,B_a,\,C_a,\,D_a \\ U_j,\; j=1,\dots,N}}\;
\sum_{j=1}^{N} \left\|
\begin{pmatrix} \Delta_a(\delta^j) A_a - I \\ C_a \end{pmatrix} U_j
+
\begin{pmatrix} \Delta_a(\delta^j) B_a \\ D_a - G(\delta^j) \end{pmatrix}
\right\|_F. \tag{6.3.5}
\]

Example 6.7 We continue Example 6.5 by constructing an LFR with nonlinear least-squares optimization. It is simple to verify that the denominator polynomial (δ2 − 1)(δ2 + δ1² + δ1 − 1) is nonzero on the box δ = {(δ1, δ2) : −0.3 ≤ δ1 ≤ 0.3, −0.3 ≤ δ2 ≤ 0.3}. With Δa(δ) = diag(δ1, δ1, δ1, δ2, δ2, δ2) and the 49 points {−0.3 + 0.1j : j = 0, …, 6}² we solve (6.3.5) to achieve a cost smaller than 1.3·10⁻⁹. The deviation (6.3.3) turns out to be not larger than 8·10⁻⁵ on the denser grid {−0.3 + 0.01j : j = 0, …, 60}² with 3721 points. We have thus constructed an approximate LFR of considerably reduced order (d1, d2) = (3, 3). It can actually be verified that

\[
G(\delta) =
\begin{pmatrix}
\delta_1 & 0 & 0 & 0 \\
0 & \delta_1 & 0 & 0 \\
0 & 0 & \delta_2 & 0 \\
0 & 0 & 0 & \delta_2
\end{pmatrix}
\star
\begin{pmatrix}
1 & 1 & 1 & 2 & 2 & 1 \\
1 & 0 & 1 & 1 & -1 & 1 \\
1 & 0 & 1 & 1 & -1 & 1 \\
0 & 0 & 0 & 1 & 1 & 1 \\
1 & 0 & 1 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0
\end{pmatrix},
\]

which could not be reconstructed with any of the suggested techniques! Examples 6.5–6.7 demonstrate that it might be difficult to reconstruct this small-sized LFR from the corresponding rational function with reasonable computational effort. As the main lesson to be learnt for practice, one should keep track of how large-sized LFR's are actually composed of smaller ones through interconnection, in order not to lose essential structural information.

Approximate LFR's of non-rational functions. We observe that the suggested approximate reduction technique relies only on the availability of G(δ^j) for j = 1, …, N. This opens the path to determining LFR approximations for non-rational matrix-valued mappings, or even for mappings that are just defined through look-up tables. As we have seen, the direct construction of LFR's then requires the solution of a generally non-convex optimization problem.

Alternatively, one could first interpolate the data matrices with multivariable polynomial or rational functions, and then construct an LFR of the interpolant. With multivariable polynomial matrices D(δ) and N(δ) of fixed degree and to-be-determined coefficients, this requires solving the equations D(δ^j)⁻¹N(δ^j) = G(δ^j), which can be reduced to D(δ^j)G(δ^j) − N(δ^j) = 0 (for all j = 1, …, N) and hence turns out to be a linear system of equations in the coefficient matrices.

As a further alternative one can rely on polynomial or rational approximation. This amounts to finding polynomial matrices D, N which minimize max_{j=1,…,N} ‖D(δ^j)⁻¹N(δ^j) − G(δ^j)‖. For a fixed denominator polynomial matrix D and for a fixed degree of N this is clearly a convex optimization


problem. Indeed, as an immediate consequence of the proof of Theorem 6.4, any such fixed-denominator function can be parameterized as

\[
\Delta(\delta) \star \begin{pmatrix} A_a & B_a \\ C_a & D_a \end{pmatrix}
\]

with fixed Aa, Ca and free Ba, Da. Then it is obvious that (6.3.4) is a convex optimization problem which can be reduced to an LMI problem if one uses any matrix norm whose sublevel sets admit an LMI representation. For general rational approximation one has to rely on solving the non-convex problem (6.3.4) with free Aa, Ba, Ca, Da, or one could turn to the multitude of existing alternatives, such as Padé approximation, for which we refer to the approximation literature.

Non-diagonal parameter-blocks. We have shown how to construct LFR's of rational matrices with diagonal parameter-blocks in a routine fashion. It is often possible to further reduce the size of LFR's with the help of full non-diagonal parameter-blocks, which are only required to depend linearly on the parameters.

Example 6.8 At this point we disclose that the rational matrix in Examples 6.5–6.7 was actually constructed as

\[
G(\delta) =
\begin{pmatrix}
\delta_1 & \delta_2 & \delta_1 \\
0 & \delta_1 & \delta_2 \\
0 & 0 & \delta_2
\end{pmatrix}
\star
\begin{pmatrix}
1 & 1 & 0 & 1 & 0 \\
1 & 0 & 1 & -1 & 1 \\
0 & 0 & 1 & 1 & 1 \\
1 & 0 & 0 & 0 & 0 \\
0 & 1 & -1 & 0 & 0
\end{pmatrix},
\]

which is still smaller than all the LFR's constructed so far.

Example 6.9 Let us illustrate how to actually find such reduced-size LFR's by means of the example

\[
G(\delta) = \frac{3\delta_1^2 - 2\delta_2}{1 - 4\delta_1 + 2\delta_1\delta_2 - \delta_2}.
\]

A key role is played by finding suitable (matrix) factorizations. Indeed, η = G(δ)ξ iff

\[
\eta = (3\delta_1^2 - 2\delta_2)\xi + (4\delta_1 - 2\delta_1\delta_2 + \delta_2)\eta
= \begin{pmatrix} \delta_1 & \delta_2 \end{pmatrix}
\begin{pmatrix} 3\delta_1 & 4 - 2\delta_2 \\ -2 & 1 \end{pmatrix}
\begin{pmatrix} \xi \\ \eta \end{pmatrix}
\]
\[
\Longleftrightarrow\quad
\eta = w_1,\quad
w_1 = \begin{pmatrix} \delta_1 & \delta_2 \end{pmatrix} z_1,\quad
z_1 = \begin{pmatrix} 4\eta \\ -2\xi + \eta \end{pmatrix}
+ \begin{pmatrix} 1 \\ 0 \end{pmatrix}
\begin{pmatrix} \delta_1 & \delta_2 \end{pmatrix}
\begin{pmatrix} 3\xi \\ -2\eta \end{pmatrix}
\]
\[
\Longleftrightarrow\quad
\eta = w_1,\quad
w_1 = \begin{pmatrix} \delta_1 & \delta_2 \end{pmatrix} z_1,\quad
z_1 = \begin{pmatrix} 4\eta + w_2 \\ -2\xi + \eta \end{pmatrix},\quad
w_2 = \begin{pmatrix} \delta_1 & \delta_2 \end{pmatrix} z_2,\quad
z_2 = \begin{pmatrix} 3\xi \\ -2\eta \end{pmatrix},
\]

which leads to the LFR

)which leads to the LFR

z1

z2

η

=

4 1 01 0 −20 0 3

−2 0 01 0 0

w1

w2

ξ

,

(w1w2

)=

(δ1 δ2 0 00 0 δ1 δ2

)z1

z2

. (6.3.6)


We observe that this LFR has the structure

\[
\begin{pmatrix} z \\ \tilde z \\ \eta \end{pmatrix}
=
\begin{pmatrix} A & B \\ LA & LB \\ C & D \end{pmatrix}
\begin{pmatrix} w \\ \xi \end{pmatrix},
\qquad
w = \begin{pmatrix} \Delta(\delta) & \tilde\Delta(\delta) \end{pmatrix}
\begin{pmatrix} z \\ \tilde z \end{pmatrix}
\quad\text{with}\quad
L = \begin{pmatrix} 0 & -2 & -4/3 \end{pmatrix},
\]

where z collects the first three components of (z_1, z_2) in (6.3.6) and z̃ the last one. Due to z̃ = Lz it is hence equivalent to

\[
\begin{pmatrix} z \\ \eta \end{pmatrix}
=
\begin{pmatrix} A & B \\ C & D \end{pmatrix}
\begin{pmatrix} w \\ \xi \end{pmatrix},
\qquad
w = [\Delta(\delta) + \tilde\Delta(\delta) L]\, z.
\]

Therefore (6.3.6) can be compressed to

\[
\begin{pmatrix} z_1 \\ z_2 \\ \eta \end{pmatrix}
=
\begin{pmatrix}
4 & 1 & 0 \\
1 & 0 & -2 \\
0 & 0 & 3 \\
1 & 0 & 0
\end{pmatrix}
\begin{pmatrix} w_1 \\ w_2 \\ \xi \end{pmatrix},
\qquad
\begin{pmatrix} w_1 \\ w_2 \end{pmatrix}
=
\begin{pmatrix}
\delta_1 & \delta_2 & 0 \\
0 & -2\delta_2 & \delta_1 - \tfrac{4}{3}\delta_2
\end{pmatrix}
\begin{pmatrix} z_1 \\ z_2 \end{pmatrix}
\]

with a parameter-block of size 2 × 3. Standard LFR's can be obtained with orders (d1, d2) = (3, 1) or (d1, d2) = (2, 2), which involve parameter-blocks of larger size 4 × 4.

Nonlinear parameter-blocks and parameter transformations. System descriptions of interconnected non-linear components very often involve highly structured matrix-valued mappings such as

\[
G(\delta) =
\begin{pmatrix}
0 & \delta_1\delta_4 & \delta_2\delta_4 + \cos(\delta_3) \\
-\delta_1\delta_4 & -\delta_2\delta_4 - \cos(\delta_3) & -\delta_1\delta_4^2 + \delta_4\cos(\delta_3)
\end{pmatrix}.
\]

One can construct a fractional representation with the nonlinear parameter-block Δ(δ) = diag(δ1 I2, δ2 I2, cos(δ3) I2, δ4 I4) of size 10 × 10. If performing the parameter transformation

\[
\tilde\delta_1 := \delta_1\delta_4,\quad
\tilde\delta_2 := \delta_2\delta_4 + \cos(\delta_3),\quad
\tilde\delta_3 := \cos(\delta_3),\quad
\tilde\delta_4 := \delta_4,
\]

one has to pull the transformed parameters out of

\[
G(\tilde\delta) =
\begin{pmatrix}
0 & \tilde\delta_1 & \tilde\delta_2 \\
-\tilde\delta_1 & -\tilde\delta_2 & (\tilde\delta_3 - \tilde\delta_1)\tilde\delta_4
\end{pmatrix}
\]

with only one bilinear element. Now one can construct an LFR with parameter-block Δ(δ̃) = diag(δ̃1 I2, δ̃2 I2, δ̃3, δ̃4) of smaller size 6 × 6. In concrete applications, if one has to analyze G(δ) for δ in a given set δ, it remains to determine the image δ̃ of δ under the non-linear parameter transformation and to perform the desired analysis for G(δ̃) with δ̃ ∈ δ̃.

6.4 Robust linear algebra problems

Recall that we translated the robust counterpart of least-squares approximation into the min-max problem

\[
\inf_{X \in \mathbb{R}^{p\times q}} \; \max_{\delta \in \boldsymbol{\delta}} \; \| X M(\delta) - N(\delta) \|_F^2.
\]
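Before turning to LMI relaxations, this min-max problem can be probed by brute force on a parameter grid. The sketch below uses illustrative data of our own choosing (the toy model M(δ) = (1 + δ)I, N(δ) ≡ N0 is not from the text) to compare the nominal least-squares solution with its worst-case residual:

```python
import numpy as np

rng = np.random.default_rng(0)
N0 = rng.standard_normal((2, 2))
M = lambda d: (1.0 + d) * np.eye(2)   # toy rational data M(delta)
N = lambda d: N0                      # N(delta) constant here
grid = np.linspace(-0.3, 0.3, 13)     # samples of the parameter set

def worst_case(X):
    """max over the grid of || X M(d) - N(d) ||_F^2."""
    return max(np.linalg.norm(X @ M(d) - N(d), 'fro')**2 for d in grid)

X_nom = N0 @ np.linalg.inv(M(0.0))    # nominal fit, exact at d = 0
assert np.linalg.norm(X_nom @ M(0.0) - N0) < 1e-12
assert worst_case(X_nom) > 0.0        # ... but not robustly exact
```

Minimizing `worst_case` over X with a general-purpose optimizer only gives a sampled (lower-bound) version of the true min-max value; the LFR machinery developed in this section replaces the sampling by a semi-infinite constraint.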


Let us now assume that M(δ) and N(δ) are rational functions of δ without pole in zero, and determine an LFR of the stacked columns of M(δ) and N(δ) as

\[
\begin{pmatrix}
M(\delta) e_1 \\ N(\delta) e_1 \\ \vdots \\ M(\delta) e_l \\ N(\delta) e_l
\end{pmatrix}
= \Delta(\delta) \star
\begin{pmatrix}
A & B \\
C_1^M & D_1^M \\
C_1^N & D_1^N \\
\vdots & \vdots \\
C_l^M & D_l^M \\
C_l^N & D_l^N
\end{pmatrix}.
\]

Then it is clear that

\[
G(\delta, X) :=
\begin{pmatrix}
X M(\delta) e_1 - N(\delta) e_1 \\ \vdots \\ X M(\delta) e_l - N(\delta) e_l
\end{pmatrix}
\]

admits the LFR

\[
\Delta(\delta) \star
\begin{pmatrix}
A & B \\
X C_1^M - C_1^N & X D_1^M - D_1^N \\
\vdots & \vdots \\
X C_l^M - C_l^N & X D_l^M - D_l^N
\end{pmatrix}
= \Delta(\delta) \star
\begin{pmatrix}
A & B \\ C(X) & D(X)
\end{pmatrix}
= \Delta(\delta) \star H(X).
\]

Robust least-squares approximation then requires finding X ∈ R^{p×q} which minimizes γ under the semi-infinite constraint

\[
\begin{pmatrix} I \\ \Delta(\delta) \star H(X) \end{pmatrix}^{\!T}
\begin{pmatrix} -\gamma I & 0 \\ 0 & I \end{pmatrix}
\begin{pmatrix} I \\ \Delta(\delta) \star H(X) \end{pmatrix}
< 0 \quad\text{for all } \delta \in \boldsymbol{\delta},
\]

where the middle factor expresses the bound ‖Δ(δ) ⋆ H(X)‖² < γ.

This structure is the starting-point for deriving LMI relaxations on the basis of the so-called full-block S-procedure in the next section.

Example 6.10 As a concrete example we consider robust interpolation as suggested in [?]. Given data points (a_i, b_i), i = 1, …, k, standard polynomial interpolation requires finding the coefficients of the polynomial p(t) = z_0 + z_1 t + ⋯ + z_n tⁿ such that p(a_i) = b_i for all i = 1, …, k. If such an interpolant does not exist one can instead search for a least-squares approximant, a polynomial for which ∑_{i=1}^{k} |p(a_i) − b_i|² is as small as possible. Hence we require to solve

\[
\min_{z \in \mathbb{R}^{n+1}}
\left\|
z^T
\begin{pmatrix}
1 & \cdots & 1 \\
a_1 & \cdots & a_k \\
\vdots & & \vdots \\
a_1^n & \cdots & a_k^n
\end{pmatrix}
-
\begin{pmatrix} b_1 & \cdots & b_k \end{pmatrix}
\right\|^2.
\]
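The nominal problem is an ordinary linear least-squares problem in the coefficient vector z; a numpy sketch with illustrative data of our own:

```python
import numpy as np

a = np.array([0.0, 0.5, 1.0, 1.5])   # abscissae a_i (illustrative)
b = np.array([1.0, 1.8, 3.1, 4.9])   # ordinates b_i
n = 2                                 # polynomial degree

# row i of V is (1, a_i, ..., a_i^n); z minimizes || V z - b ||^2,
# the transposed form of the problem above
V = np.vander(a, n + 1, increasing=True)
z = np.linalg.lstsq(V, b, rcond=None)[0]
p = lambda t: np.polyval(z[::-1], t)  # p(t) = z0 + z1 t + ... + zn t^n
```

With k = n + 1 distinct points the residual vanishes and p interpolates the data exactly; with more points, z is the least-squares approximant.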


Let us now assume that (a_i, b_i) are only known to satisfy |a_i − ā_i| ≤ r_i and |b_i − b̄_i| ≤ s_i for i = 1, …, k (since they are acquired through error-prone experiments). Then one should search for the coefficient vector z which minimizes the worst-case least-squares error

\[
\max_{|a_i - \bar a_i| \le r_i,\; |b_i - \bar b_i| \le s_i,\; i = 1,\dots,k}
\left\|
z^T
\begin{pmatrix}
1 & \cdots & 1 \\
a_1 & \cdots & a_k \\
\vdots & & \vdots \\
a_1^n & \cdots & a_k^n
\end{pmatrix}
-
\begin{pmatrix} b_1 & \cdots & b_k \end{pmatrix}
\right\|^2.
\]

Note that it is rather simple to determine the LFR

\[
\begin{pmatrix}
1 & \cdots & 1 \\
a_1 & \cdots & a_k \\
\vdots & & \vdots \\
a_1^n & \cdots & a_k^n \\
b_1 & \cdots & b_k
\end{pmatrix}
= \operatorname{diag}(a_1 I_n, \dots, a_k I_n, b_1, \dots, b_k) \star
\underbrace{\begin{pmatrix}
N & & & & e_1 & & \\
& \ddots & & & & \ddots & \\
& & N & & & & e_1 \\
& & & 0_k & e_1 & \cdots & e_k \\
M & \cdots & M & 0 & e_1 & \cdots & e_1 \\
0 & \cdots & 0 & e & 0 & \cdots & 0
\end{pmatrix}}_{\textstyle \begin{pmatrix} A & B \\ C & D \end{pmatrix}}
\]

with the standard unit vectors e_j, the all-ones row vector e, and with

\[
N = \begin{pmatrix} 0 & 0 \\ I_{n-1} & 0 \end{pmatrix} \in \mathbb{R}^{n \times n},
\qquad
M = \begin{pmatrix} 0 \\ I_n \end{pmatrix} \in \mathbb{R}^{(n+1) \times n},
\]

just by combining (through augmentation) the LFR's for one-variable polynomials. Then it is straightforward to derive an LFR in the variables

\[
(\delta_1, \dots, \delta_k, \delta_{k+1}, \dots, \delta_{2k}) :=
\left( \tfrac{1}{r_1}(a_1 - \bar a_1), \dots, \tfrac{1}{r_k}(a_k - \bar a_k),\;
\tfrac{1}{s_1}(b_1 - \bar b_1), \dots, \tfrac{1}{s_k}(b_k - \bar b_k) \right)
\]

by affine substitution. Indeed, with

\[
\Delta_0 = \operatorname{diag}(\bar a_1 I_n, \dots, \bar a_k I_n, \bar b_1, \dots, \bar b_k),
\qquad
W = \operatorname{diag}(r_1 I_n, \dots, r_k I_n, s_1, \dots, s_k)
\]

we infer

\[
\operatorname{diag}(a_1 I_n, \dots, a_k I_n, b_1, \dots, b_k)
= \Delta_0 + W \operatorname{diag}(\delta_1 I_n, \dots, \delta_k I_n, \delta_{k+1}, \dots, \delta_{2k}),
\]

which leads to the LFR

\[
\begin{pmatrix}
1 & \cdots & 1 \\
\bar a_1 + r_1\delta_1 & \cdots & \bar a_k + r_k\delta_k \\
\vdots & & \vdots \\
(\bar a_1 + r_1\delta_1)^n & \cdots & (\bar a_k + r_k\delta_k)^n \\
\bar b_1 + s_1\delta_{k+1} & \cdots & \bar b_k + s_k\delta_{2k}
\end{pmatrix}
= \operatorname{diag}(\delta_1 I_n, \dots, \delta_k I_n, \delta_{k+1}, \dots, \delta_{2k}) \star H
\]


with

\[
H = \begin{pmatrix}
A(I - \Delta_0 A)^{-1} W & (I - A\Delta_0)^{-1} B \\
C(I - \Delta_0 A)^{-1} W & D + C\Delta_0 (I - \Delta_0 A)^{-1} B
\end{pmatrix},
\]

just by using the formula for computing the LFR of an LFR. Hence robust polynomial interpolation nicely fits into the general robust least-squares framework as introduced in this section.

6.5 Full block S-procedure

Unless specified otherwise, all matrices and vectors in this section are tacitly assumed to have their elements in K, where K equals either R or C.

Many robust linear algebra problems, in particular those appearing in robust control, can be converted to testing whether

\[
\Delta \star H \ \text{is well-posed and}\quad
\begin{pmatrix} I \\ \Delta \star H \end{pmatrix}^{\!*}
P_p
\begin{pmatrix} I \\ \Delta \star H \end{pmatrix} < 0
\quad\text{for all } \Delta \in \boldsymbol{\Delta}_c. \tag{6.5.1}
\]

Here Δc is some set of K-matrices, and the LFR coefficient matrix H as well as the Hermitian Pp expressing the desired bound on Δ ⋆ H are partitioned as

\[
H = \begin{pmatrix} A & B \\ C & D \end{pmatrix}
\quad\text{and}\quad
P_p = \begin{pmatrix} Q_p & S_p \\ S_p^* & R_p \end{pmatrix}
\quad\text{with } R_p \ge 0.
\]

Recall that Δ ⋆ H is well-posed, by definition, if I − AΔ is invertible. Here is the fundamental result which allows us to design relaxations of (6.5.1) that are computationally tractable with LMI's.

Theorem 6.11 (Concrete full block S-procedure) If there exist matrices Q = Q*, S, R = R* with

\[
\begin{pmatrix} \Delta \\ I \end{pmatrix}^{\!*}
\begin{pmatrix} Q & S \\ S^* & R \end{pmatrix}
\begin{pmatrix} \Delta \\ I \end{pmatrix} \ge 0
\quad\text{for all } \Delta \in \boldsymbol{\Delta}_c \tag{6.5.2}
\]

and

\[
\begin{pmatrix} I & 0 \\ A & B \end{pmatrix}^{\!*}
\begin{pmatrix} Q & S \\ S^* & R \end{pmatrix}
\begin{pmatrix} I & 0 \\ A & B \end{pmatrix}
+
\begin{pmatrix} 0 & I \\ C & D \end{pmatrix}^{\!*}
\begin{pmatrix} Q_p & S_p \\ S_p^* & R_p \end{pmatrix}
\begin{pmatrix} 0 & I \\ C & D \end{pmatrix}
< 0, \tag{6.5.3}
\]

then (6.5.1) is true. The converse holds in case Δc is compact.

As will be explained below, it is elementary to show that (6.5.2) and (6.5.3) are sufficient for (6.5.1), whereas the proof of necessity is more difficult. Before we embed this concrete version of the S-procedure into a more versatile abstract formulation, we intend to address the practical benefit of this reformulation. With the help of the auxiliary variables Q, S, R, which are often called


multipliers or scalings, it has been possible to equivalently rephrase (6.5.1), which involves a multivariable rational function, in terms of the LMI conditions (6.5.2) and (6.5.3). Let us introduce, for the purpose of clarity, the obviously convex set of all multipliers that satisfy the infinite family of LMI's (6.5.2):

\[
\mathbf{P}_{\rm all} :=
\left\{
P = \begin{pmatrix} Q & S \\ S^* & R \end{pmatrix} :\;
\begin{pmatrix} \Delta \\ I \end{pmatrix}^{\!*}
\begin{pmatrix} Q & S \\ S^* & R \end{pmatrix}
\begin{pmatrix} \Delta \\ I \end{pmatrix} \ge 0
\ \text{ for all } \Delta \in \boldsymbol{\Delta}_c
\right\}.
\]

Testing (6.5.1) then simply amounts to finding an element of Pall which fulfills the LMI (6.5.3). Unfortunately, the actual implementation of this test requires a description of Pall through finitely many LMI's, or at least a close approximation thereof. The need for such an approximation is the fundamental reason for conservatism in typical robust stability and robust performance tests in control!

Let us assume that Pi ⊂ Pall is an inner approximation with an LMI description. Then the computation of some P ∈ Pi with (6.5.3) is computationally tractable, and if the existence of such a multiplier can be verified, it is clear that (6.5.1) holds. On the other hand, let Pall ⊂ Po be an outer approximation with an LMI description. If one can computationally confirm that there does not exist any P ∈ Po with (6.5.3), it is guaranteed that (6.5.1) is not true. We stress that the non-existence of P ∈ Pi, or the existence of P ∈ Po satisfying (6.5.3), does not allow us to draw any conclusion without additional knowledge about the quality of the inner or outer approximation, respectively. In the sequel we discuss a selection of possible choices for inner approximations that have been suggested in the literature.

• Spectral-Norm Multipliers. The simplest and crudest inner approximation is obtained by neglecting any structural properties of the matrices in Δc and just exploiting a known bound on their spectral norm. Indeed, suppose that r > 0 is any (preferably smallest) number with

\[
\|\Delta\| \le r \quad\text{for all } \Delta \in \boldsymbol{\Delta}_c.
\]

Since

\[
\|\Delta\| \le r
\;\text{ iff }\;
\tfrac{1}{r}\,\Delta^*\Delta \le r I
\;\text{ iff }\;
\begin{pmatrix} \Delta \\ I \end{pmatrix}^{\!*}
\begin{pmatrix} -\tfrac{1}{r} I & 0 \\ 0 & r I \end{pmatrix}
\begin{pmatrix} \Delta \\ I \end{pmatrix} \ge 0,
\]

we can choose the set

\[
\mathbf{P}_n :=
\left\{
\tau \begin{pmatrix} -\tfrac{1}{r} I & 0 \\ 0 & r I \end{pmatrix} :\; \tau \ge 0
\right\}
\]

as an inner approximation, and checking (6.5.3) amounts to solving a one-parameter LMI problem.

• Full-Block Multipliers. For a more refined inner approximation let us assume that the set Δc is described as

\[
\boldsymbol{\Delta}_c = \operatorname{conv}(\boldsymbol{\Delta}_g) = \operatorname{conv}\{\Delta^1, \dots, \Delta^N\}
\quad\text{with}\quad
\boldsymbol{\Delta}_g = \{\Delta^1, \dots, \Delta^N\}
\]

as the finitely many generators. Then define

\[
\mathbf{P}_f :=
\left\{
P = \begin{pmatrix} Q & S \\ S^* & R \end{pmatrix} :\;
Q \le 0,\;\;
\begin{pmatrix} \Delta \\ I \end{pmatrix}^{\!*} P \begin{pmatrix} \Delta \\ I \end{pmatrix} \ge 0
\ \text{ for all } \Delta \in \boldsymbol{\Delta}_g
\right\}.
\]


Since any P ∈ Pf has a left upper block which is negative semi-definite, the mapping

\[
\Delta \mapsto
\begin{pmatrix} \Delta \\ I \end{pmatrix}^{\!*} P \begin{pmatrix} \Delta \\ I \end{pmatrix}
\quad\text{is concave,}
\]

and hence positivity of its values at the generators Δ ∈ Δg implies positivity for all Δ ∈ Δc. We conclude that Pf ⊂ Pall. On the other hand, Pf is described by finitely many LMI's; hence searching for P ∈ Pf satisfying (6.5.3) is a standard LMI problem that can be easily implemented.
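The concavity step can be made fully explicit: for Δλ = λΔ1 + (1 − λ)Δ2 the linear terms of the quadratic form cancel, and f(Δλ) − λf(Δ1) − (1 − λ)f(Δ2) = −λ(1 − λ)(Δ1 − Δ2)*Q(Δ1 − Δ2) ⪰ 0 when Q ⪯ 0, so positivity at the generators propagates to the convex hull. A numerical spot check with random real data (a sketch; all names ours):

```python
import numpy as np

rng = np.random.default_rng(2)
k = 3
Qh = rng.standard_normal((k, k)); Q = -(Qh @ Qh.T)   # Q <= 0
S = rng.standard_normal((k, k))
R = rng.standard_normal((k, k)); R = R + R.T
P = np.block([[Q, S], [S.T, R]])

def f(D):
    """Delta -> [Delta; I]^T P [Delta; I]."""
    F = np.vstack([D, np.eye(k)])
    return F.T @ P @ F

D1 = rng.standard_normal((k, k))
D2 = rng.standard_normal((k, k))
lam = 0.3
gap = f(lam*D1 + (1-lam)*D2) - (lam*f(D1) + (1-lam)*f(D2))
# matrix concavity: gap = -lam(1-lam)(D1-D2)^T Q (D1-D2) >= 0
assert np.linalg.eigvalsh(gap).min() >= -1e-9
```

The identity in the comment is exactly why the sign constraint is placed on Q only: S and R drop out of the concavity argument.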

The construction of LFR's revealed that Δc is often parameterized with a linear parameter-block Δ(δ) in δ ∈ δ ⊂ Kᵐ as

\[
\boldsymbol{\Delta}_c = \{\Delta(\delta) : \delta \in \boldsymbol{\delta}\}.
\]

Just by linearity of Δ(δ) we infer for convex finitely generated parameter sets that

\[
\boldsymbol{\Delta}_c = \operatorname{conv}\{\Delta(\delta^j) : j = 1, \dots, N\}
\quad\text{in case of}\quad
\boldsymbol{\delta} = \operatorname{conv}\{\delta^1, \dots, \delta^N\}.
\]

For parameter boxes (products of intervals) defined as

\[
\boldsymbol{\delta} = \{\delta \in \mathbb{R}^m : \delta_j \in [a_j, b_j],\ j = 1, \dots, m\}
\]

we can use the representation

\[
\boldsymbol{\delta} = \operatorname{conv}(\boldsymbol{\delta}_g)
\quad\text{with}\quad
\boldsymbol{\delta}_g := \{\delta \in \mathbb{R}^m : \delta_j \in \{a_j, b_j\},\ j = 1, \dots, m\}.
\]

All these choices do not depend upon a specific structure of Δ(δ) and hence allow for non-diagonal parameter-blocks.

• Extended Full-Block Multipliers. Can we be more specific if δ is a box generated by δg and the parameter-block Δ(δ) is diagonal, as in (6.2.4)? With the column partition (E_1 ⋯ E_m) = I of the identity matrix conformable with that of Δ(δ), we infer that

\[
\Delta(\delta) = \sum_{k=1}^{m} E_k [\delta_k I] E_k^T.
\]

Let us then define

\[
\mathbf{P}_{fe} :=
\left\{
P :\; E_k^T Q E_k \le 0,\ k = 1, \dots, m,\;\;
\begin{pmatrix} \Delta(\delta) \\ I \end{pmatrix}^{\!*} P \begin{pmatrix} \Delta(\delta) \\ I \end{pmatrix} \ge 0
\ \text{ for all } \delta \in \boldsymbol{\delta}_g
\right\}.
\]

Due to E_k^T Q E_k ≤ 0, k = 1, …, m, we conclude for any P ∈ Pfe that

\[
\delta \mapsto
\begin{pmatrix} \Delta(\delta) \\ I \end{pmatrix}^{\!*}
\begin{pmatrix} Q & S \\ S^* & R \end{pmatrix}
\begin{pmatrix} \Delta(\delta) \\ I \end{pmatrix}
\quad\text{is concave in } \delta_k \text{ for } k = 1, \dots, m.
\]

It is easily seen that this again implies Pfe ⊂ Pall. We stress that Pf ⊂ Pfe, so that Pfe is a potentially better inner approximation of Pall which leads to less conservative numerical results. This comes at the expense of a more complex description of Pfe, since it involves a larger number of LMI's. This observation leads us to the description of the last inner approximation of Pall in our non-exhaustive list.


• Diagonal Multipliers. If m_j = (a_j + b_j)/2 denotes the mid-point of the interval [a_j, b_j] and d_j = (b_j − a_j)/2 half its diameter, we infer for δ_j ∈ R that

\[
\delta_j \in [a_j, b_j]
\;\text{ iff }\;
(\delta_j - m_j)^2 \le d_j^2
\;\text{ iff }\;
\begin{pmatrix} \delta_j \\ 1 \end{pmatrix}^{\!*}
\underbrace{\begin{pmatrix} -1 & m_j \\ m_j & d_j^2 - m_j^2 \end{pmatrix}}_{\textstyle \begin{pmatrix} q_j & s_j \\ s_j^* & r_j \end{pmatrix}}
\begin{pmatrix} \delta_j \\ 1 \end{pmatrix} \ge 0.
\]

Since

\[
\begin{pmatrix} E_k & 0 \\ 0 & E_k \end{pmatrix}^{\!T}
\begin{pmatrix} \Delta(\delta) \\ I \end{pmatrix}
=
\begin{pmatrix} \sum_{l=1}^{m} E_k^T E_l [\delta_l I] E_l^T \\ E_k^T \end{pmatrix}
=
\begin{pmatrix} \delta_k I \\ I \end{pmatrix} E_k^T
\]

(due to E_k^T E_l = 0 for k ≠ l and E_k^T E_k = I), we infer for D_j ≥ 0 that

\[
\begin{pmatrix} \Delta(\delta) \\ I \end{pmatrix}^{\!T}
\left[\sum_{j=1}^{m}
\begin{pmatrix} E_j & 0 \\ 0 & E_j \end{pmatrix}
\begin{pmatrix} q_j D_j & s_j D_j \\ s_j^* D_j & r_j D_j \end{pmatrix}
\begin{pmatrix} E_j & 0 \\ 0 & E_j \end{pmatrix}^{\!T}\right]
\begin{pmatrix} \Delta(\delta) \\ I \end{pmatrix}
=
\sum_{j=1}^{m} E_j
\begin{pmatrix} \delta_j I \\ I \end{pmatrix}^{\!T}
\begin{pmatrix} q_j D_j & s_j D_j \\ s_j^* D_j & r_j D_j \end{pmatrix}
\begin{pmatrix} \delta_j I \\ I \end{pmatrix}
E_j^T
=
\sum_{j=1}^{m} E_j D_j E_j^T
\begin{pmatrix} \delta_j \\ 1 \end{pmatrix}^{\!T}
\begin{pmatrix} q_j & s_j \\ s_j^* & r_j \end{pmatrix}
\begin{pmatrix} \delta_j \\ 1 \end{pmatrix}
\ge 0.
\]

The fact that δ_j is real can be expressed as

\[
\delta_j^* - \delta_j = 0
\quad\text{or}\quad
\begin{pmatrix} \delta_j \\ 1 \end{pmatrix}^{\!*}
\begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
\begin{pmatrix} \delta_j \\ 1 \end{pmatrix} = 0.
\]

Exactly the same computation as above reveals for any G_j = G_j^* that

\[
\begin{pmatrix} \Delta(\delta) \\ I \end{pmatrix}^{\!*}
\left[\sum_{j=1}^{m}
\begin{pmatrix} E_j & 0 \\ 0 & E_j \end{pmatrix}
\begin{pmatrix} 0 & G_j \\ -G_j & 0 \end{pmatrix}
\begin{pmatrix} E_j & 0 \\ 0 & E_j \end{pmatrix}^{\!T}\right]
\begin{pmatrix} \Delta(\delta) \\ I \end{pmatrix} = 0.
\]

This implies for the set of block-diagonal multipliers

\[
\mathbf{P}_d :=
\left\{
\sum_{j=1}^{m}
\begin{pmatrix} E_j & 0 \\ 0 & E_j \end{pmatrix}
\begin{pmatrix} q_j D_j & s_j D_j + G_j \\ s_j^* D_j - G_j & r_j D_j \end{pmatrix}
\begin{pmatrix} E_j & 0 \\ 0 & E_j \end{pmatrix}^{\!T}
:\; D_j \ge 0,\ G_j = G_j^*
\right\}
\]

that Pd ⊂ Pall. Note that the set Pd is explicitly parametrized by finitely many linear matrix inequalities.

• Mixtures. Let us assume that Δ(δ) is diagonal as in (6.2.4). Our discussion makes it possible to construct sets of multipliers that are mixtures of those that have been listed, as illustrated for the real-parameter example

\[
\boldsymbol{\Delta}_c = \{\operatorname{diag}(\delta_1 I_2,\ \delta_2,\ \delta_3,\ \delta_4 I_4) :\;
\delta_1 \in [-1, 1],\ \delta_2 \in [2, 4],\ |\delta_3| \le 1 - \delta_4,\ \delta_4 \ge -1\}.
\]


One can then work with the multiplier set of all

\[
P =
\begin{pmatrix}
-D_1 & 0 & 0 & 0 & G_1 & 0 & 0 & 0 \\
0 & -D_2 & 0 & 0 & 0 & 3D_2 + G_2 & 0 & 0 \\
0 & 0 & Q_{11} & Q_{12} & 0 & 0 & S_{11} & S_{12} \\
0 & 0 & Q_{12}^* & Q_{22} & 0 & 0 & S_{21} & S_{22} \\
-G_1 & 0 & 0 & 0 & D_1 & 0 & 0 & 0 \\
0 & 3D_2 - G_2 & 0 & 0 & 0 & -8D_2 & 0 & 0 \\
0 & 0 & S_{11}^* & S_{21}^* & 0 & 0 & R_{11} & R_{12} \\
0 & 0 & S_{12}^* & S_{22}^* & 0 & 0 & R_{12}^* & R_{22}
\end{pmatrix}
\]

with

\[
D_1 \ge 0,\quad D_2 \ge 0,\quad G_1 = G_1^*,\quad G_2 = G_2^*
\quad\text{and}\quad
Q_{11} \le 0,\quad Q_{22} \le 0,
\]

as well as

\[
\begin{pmatrix} \delta_3 & 0 \\ 0 & \delta_4 I_4 \\ 1 & 0 \\ 0 & I_4 \end{pmatrix}^{\!*}
\begin{pmatrix}
Q_{11} & Q_{12} & S_{11} & S_{12} \\
Q_{12}^* & Q_{22} & S_{21} & S_{22} \\
S_{11}^* & S_{21}^* & R_{11} & R_{12} \\
S_{12}^* & S_{22}^* & R_{12}^* & R_{22}
\end{pmatrix}
\begin{pmatrix} \delta_3 & 0 \\ 0 & \delta_4 I_4 \\ 1 & 0 \\ 0 & I_4 \end{pmatrix}
\ge 0
\quad\text{for } (\delta_3, \delta_4) \in \{(-2, -1),\ (2, -1),\ (0, 1)\}.
\]

Of course it is straightforward to construct other block structures, which all might lead to different numerical results in actual computations.

With the various classes of multipliers we arrive at the chain of inclusions

\[
\mathbf{P}_n \subset \mathbf{P}_d \subset \mathbf{P}_f \subset \mathbf{P}_{fe} \subset \mathbf{P}_{\rm all},
\]

which is briefly characterized as allowing a reduction of conservatism at the expense of an increase in the complexity of the descriptions. As the main distinction we stress that the number of LMI's needed to describe Pd grows linearly in the number m of parameters, whereas that for parametrizing Pf grows exponentially in m. This shows the practical relevance of allowing for mixed block structures, in order to reduce conservatism while avoiding an explosion of computational complexity.

Remark 6.12 In practice, instead of just testing feasibility of (6.5.3) for some multiplier class P, it is rather suggested to choose some ε > 0 and to infimize γ over P ∈ P with

\[
\begin{pmatrix} I & 0 \\ A & B \end{pmatrix}^{\!*}
\begin{pmatrix} Q & S \\ S^* & R \end{pmatrix}
\begin{pmatrix} I & 0 \\ A & B \end{pmatrix}
+
\begin{pmatrix} 0 & I \\ C & D \end{pmatrix}^{\!*}
\begin{pmatrix} Q_p & S_p \\ S_p^* & R_p \end{pmatrix}
\begin{pmatrix} 0 & I \\ C & D \end{pmatrix}
\le \gamma
\begin{pmatrix} \varepsilon I & 0 \\ 0 & I \end{pmatrix}.
\]

In this fashion the largest eigenvalue of the left-hand side is really pushed to its smallest possible value. If there exists some P ∈ P with (6.5.3) then, trivially, there exists some ε > 0 such that the infimal value γ* is negative. It requires only slight modifications of the arguments to follow in order to show that

\[
\Delta \star H \ \text{is well-posed and}\quad
\begin{pmatrix} I \\ \Delta \star H \end{pmatrix}^{\!*} P_p \begin{pmatrix} I \\ \Delta \star H \end{pmatrix} \le \gamma^* I
\quad\text{for all } \Delta \in \boldsymbol{\Delta}_c.
\]

Hence the value of γ* provides an indication of the distance to robust performance failure.


To conclude this section, let us formulate a somewhat more abstract version of the full-block S-procedure in terms of quadratic forms on parameter-dependent subspaces, which is considerably more powerful than the concrete version, as will be illustrated by means of examples. Very similar to our general recipe for constructing LFR's, the abstraction relies on the mapping interpretation of the involved matrices. Indeed, if we fix Δ ∈ Δc such that I − AΔ is invertible, we observe that

\[
\begin{pmatrix} I \\ \Delta \star H \end{pmatrix}^{\!*} P_p \begin{pmatrix} I \\ \Delta \star H \end{pmatrix} < 0
\]

iff

\[
\xi \ne 0,\;\; \eta = [\Delta \star H]\xi
\;\;\Longrightarrow\;\;
\begin{pmatrix} \xi \\ \eta \end{pmatrix}^{\!*} P_p \begin{pmatrix} \xi \\ \eta \end{pmatrix} < 0,
\]

iff (using the LFR)

\[
\xi \ne 0,\;\;
\begin{pmatrix} z \\ \eta \end{pmatrix} = \begin{pmatrix} A & B \\ C & D \end{pmatrix}\begin{pmatrix} w \\ \xi \end{pmatrix},\;\; w = \Delta z
\;\;\Longrightarrow\;\;
\begin{pmatrix} \xi \\ \eta \end{pmatrix}^{\!*} P_p \begin{pmatrix} \xi \\ \eta \end{pmatrix} < 0,
\]

iff (just by rearrangements)

\[
\begin{pmatrix} 0 & I \end{pmatrix}\begin{pmatrix} w \\ \xi \end{pmatrix} \ne 0,\;\;
\begin{pmatrix} I & 0 \\ A & B \end{pmatrix}\begin{pmatrix} w \\ \xi \end{pmatrix} \in \operatorname{im}\begin{pmatrix} \Delta \\ I \end{pmatrix}
\;\;\Longrightarrow\;\;
\left[\begin{pmatrix} 0 & I \\ C & D \end{pmatrix}\begin{pmatrix} w \\ \xi \end{pmatrix}\right]^{*}
P_p
\begin{pmatrix} 0 & I \\ C & D \end{pmatrix}\begin{pmatrix} w \\ \xi \end{pmatrix} < 0.
\]

On the other hand, I − AΔ is invertible iff (I − AΔ)z = 0 implies z = 0, iff

\[
z = Aw,\;\; w = \Delta z \;\;\Longrightarrow\;\; w = 0,\ z = 0,
\]

iff

\[
\begin{pmatrix} I \\ A \end{pmatrix} w \in \operatorname{im}\begin{pmatrix} \Delta \\ I \end{pmatrix}
\;\;\Longrightarrow\;\; w = 0,
\]

iff

\[
\begin{pmatrix} 0 & I \end{pmatrix}\begin{pmatrix} w \\ \xi \end{pmatrix} = 0,\;\;
\begin{pmatrix} I & 0 \\ A & B \end{pmatrix}\begin{pmatrix} w \\ \xi \end{pmatrix} \in \operatorname{im}\begin{pmatrix} \Delta \\ I \end{pmatrix}
\;\;\Longrightarrow\;\; \begin{pmatrix} w \\ \xi \end{pmatrix} = 0.
\]

With the abbreviations

\[
W = \begin{pmatrix} 0 & I \end{pmatrix},\quad
V = \begin{pmatrix} I & 0 \\ A & B \end{pmatrix},\quad
S(\Delta) = \begin{pmatrix} \Delta \\ I \end{pmatrix},\quad
T = \begin{pmatrix} 0 & I \\ C & D \end{pmatrix}^{\!*} P_p \begin{pmatrix} 0 & I \\ C & D \end{pmatrix},
\]

it is hence quite obvious that Theorem 6.11 is a direct consequence of the following result.

Theorem 6.13 (Abstract full block S-procedure) Let V ∈ C^{v×n}, W ∈ C^{w×n} and T = T* ∈ C^{n×n} with x*Tx ≥ 0 whenever Wx = 0. Moreover suppose that S(Δ) ∈ C^{v×•} depends continuously on Δ ∈ Δc and has constant rank on Δc. Then, for all Δ ∈ Δc,

\[
V x \in \operatorname{im}(S(\Delta))
\;\;\Longrightarrow\;\;
\begin{cases}
x^* T x < 0 & \text{if } W x \ne 0, \\
x = 0 & \text{if } W x = 0,
\end{cases}
\]

if and only if there exists some Hermitian matrix P ∈ C^{v×v} with

\[
S(\Delta)^* P S(\Delta) \ge 0 \ \text{ for all } \Delta \in \boldsymbol{\Delta}_c
\quad\text{and}\quad
V^* P V + T < 0.
\]


Remark 6.14 We observe that the proof of the 'if' direction is trivial. Suppose Vx ∈ im(S(Δ)). Then there exists z with Vx = S(Δ)z, which in turn implies x*V*PVx = z*S(Δ)*PS(Δ)z ≥ 0. Let Wx ≠ 0, so that x ≠ 0 and hence x*V*PVx + x*Tx < 0, which implies x*Tx < 0. Now suppose Wx = 0, so that x*Tx ≥ 0 by hypothesis. If x ≠ 0 we infer x*V*PVx < 0, which contradicts x*V*PVx ≥ 0; hence we can conclude x = 0. A proof of the other direction is somewhat less simple and can be found in [?].

Remark 6.15 Let us explicitly formulate the specialization of Theorem 6.13 in which the matrices T and W are empty:

\[
V x \in \operatorname{im}(S(\Delta)) \;\;\Longrightarrow\;\; x = 0 \quad\text{holds for all } \Delta \in \boldsymbol{\Delta}_c
\]

iff there exists some Hermitian P ∈ C^{v×v} with

\[
S(\Delta)^* P S(\Delta) \ge 0 \ \text{ for all } \Delta \in \boldsymbol{\Delta}_c
\quad\text{and}\quad V^* P V < 0.
\]

The latter specialization is particularly useful for guaranteeing well-posedness of LFR's. We have argued above that I − AΔ is non-singular iff

\[
\begin{pmatrix} I \\ A \end{pmatrix} w \in \operatorname{im}\begin{pmatrix} \Delta \\ I \end{pmatrix} \;\;\Longrightarrow\;\; w = 0.
\]

This holds for all Δ ∈ Δc iff

\[
\begin{pmatrix} I \\ A \end{pmatrix}^{\!*} P \begin{pmatrix} I \\ A \end{pmatrix} < 0
\quad\text{for some } P \in \mathbf{P}_{\rm all}.
\]

To conclude this section, let us illustrate the flexibility of the abstract version of the full-block S-procedure by considering the problem of guaranteeing that

\[
N - \Delta M \ \text{has full column rank for all } \Delta \in \boldsymbol{\Delta}_c.
\]

Literally the same arguments as for non-singularity reveal that this holds for all Δ ∈ Δc iff

\[
\begin{pmatrix} N \\ M \end{pmatrix}^{\!*} P \begin{pmatrix} N \\ M \end{pmatrix} < 0
\quad\text{for some } P \in \mathbf{P}_{\rm all}.
\]

In the framework suggested it is straightforward to devise numerically verifiable robust full-rank tests, in contrast to standard structured singular value theory, which requires the development of dedicated algorithms [?].


6.6 Numerical solution of robust linear algebra problems

6.6.1 Relaxations for computing upper bounds

Let us assume that the decision variable is v ∈ Rⁿ, and suppose that

\[
H(v) = \begin{pmatrix} A & B \\ C(v) & D(v) \end{pmatrix}
\quad\text{with } C(v),\, D(v) \text{ depending affinely on } v,
\]

and

\[
P_p(v) = \begin{pmatrix} Q_p(v) & S_p \\ S_p^* & T_p U_p(v)^{-1} T_p^* \end{pmatrix}
\quad\text{with } Q_p(v),\, U_p(v) \text{ depending affinely on } v.
\]

Moreover, let Δ(δ) be a parameter-block which is linear in δ, and let δ be some subset of Rᵐ. With the linear cost functional defined by c ∈ Rⁿ we consider the following paradigm problem.

Problem. Infimize c*v over all v which satisfy Up(v) > 0 and

\[
\det(I - A\Delta(\delta)) \ne 0,
\qquad
\begin{pmatrix} I \\ \Delta(\delta) \star H(v) \end{pmatrix}^{\!T}
P_p(v)
\begin{pmatrix} I \\ \Delta(\delta) \star H(v) \end{pmatrix} < 0
\quad\text{for all } \delta \in \boldsymbol{\delta}. \tag{6.6.1}
\]

Denote the optimal value by γopt ∈ [−∞, ∞].

In various circumstances we have made explicit how to construct a set P with an LMI description such that

\[
\begin{pmatrix} \Delta(\delta) \\ I \end{pmatrix}^{\!*} P \begin{pmatrix} \Delta(\delta) \\ I \end{pmatrix} \ge 0
\quad\text{for all } \delta \in \boldsymbol{\delta}. \tag{6.6.2}
\]

This allows us to consider the following relaxation.

Problem. Infimize c*v over all v and P ∈ P with

\[
U_p(v) > 0,
\qquad
\begin{pmatrix} I & 0 \\ A & B \end{pmatrix}^{\!*} P \begin{pmatrix} I & 0 \\ A & B \end{pmatrix}
+
\begin{pmatrix} 0 & I \\ C(v) & D(v) \end{pmatrix}^{\!*} P_p(v) \begin{pmatrix} 0 & I \\ C(v) & D(v) \end{pmatrix} < 0.
\]

Denote the optimal value by γrel ∈ [−∞, ∞].

Since P has an LMI description, the linearization lemma ?? reveals that we can compute γrel by solving a genuine LMI problem. The (simple) sufficiency conclusion in the full-block S-procedure implies that v is feasible for the original problem whenever v and P ∈ P are feasible for its relaxation. We can hence infer

\[
\gamma_{\rm opt} \le \gamma_{\rm rel},
\]

including the cases γopt ∈ {−∞, ∞} and γrel ∈ {−∞, ∞} with the usual interpretations.

Unfortunately it is impossible to make a priori statements on the relaxation gap γrel − γopt in the generality discussed here. However, let us stress again that this gap can possibly be reduced by employing a larger set (a superset) of P, and that γopt = γrel if δ is compact and if P is replaced with the set of all multipliers P that satisfy (6.6.2), just due to the full-block S-procedure. Summarizing, the choice of an increasing family of multiplier sets P1 ⊂ P2 ⊂ ⋯ leads to a family of LMI relaxations with non-increasing optimal values.

We conclude this section by stressing that the developed techniques allow an immediate extension to multiple objectives expressed by finitely many constraints
\[
\begin{pmatrix} I \\ \Delta(\delta)\star H(v) \end{pmatrix}^* P_{p,j}(v)
\begin{pmatrix} I \\ \Delta(\delta)\star H(v) \end{pmatrix} < 0
\quad\text{for all } \delta\in\boldsymbol{\delta},\quad j = 1, \ldots, p.
\]
These constraints are relaxed as follows: for all j = 1, ..., p, there exists P_j ∈ P with
\[
\begin{pmatrix} I & 0 \\ A & B \end{pmatrix}^* P_j \begin{pmatrix} I & 0 \\ A & B \end{pmatrix}
+ \begin{pmatrix} 0 & I \\ C(v) & D(v) \end{pmatrix}^* P_{p,j}(v)
\begin{pmatrix} 0 & I \\ C(v) & D(v) \end{pmatrix} < 0.
\]

It is hence essential to exploit the extra freedom to relax each individual constraint with its own individual multiplier in order to keep conservatism subdued. We remark as well that one could even allow the LFR and the class of multipliers used for each relaxation to vary from constraint to constraint without introducing any extra complications.

6.6.2 When are relaxations exact?

We have seen in the previous section that multiplier relaxations typically cause a gap γ_rel − γ_opt. In this section we formulate a general principle that provides a numerically verifiable sufficient condition for the absence of this gap and hence for exactness of the relaxation.

One situation is simple: if the relaxation is strictly feasible and has optimal value γ_rel = −∞, this certainly implies γ_opt = −∞ and there is no relaxation gap. Let us hence assume from now on that γ_rel > −∞.

The key is to consider the dual of the suggested relaxations. For this purpose we first apply the linearization lemma in order to reformulate the relaxation as infimizing c^*v over v and P ∈ P with
\[
\begin{pmatrix} I & 0 & 0 \\ A & B & 0 \end{pmatrix}^* P \begin{pmatrix} I & 0 & 0 \\ A & B & 0 \end{pmatrix}
+ \begin{pmatrix}
0 & C(v)^* S_p^* & C(v)^* T_p \\
S_p C(v) & S_p D(v) + D(v)^* S_p^* + Q_p(v) & D(v)^* T_p \\
T_p^* C(v) & T_p^* D(v) & -U_p(v)
\end{pmatrix} < 0.
\]
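The linearization step invoked here is, in essence, a Schur-complement argument: with U ≻ 0, the nonlinear constraint X + R U⁻¹ Rᵀ ≺ 0 is equivalent to the linear matrix inequality [X, R; Rᵀ, −U] ≺ 0. A small numerical illustration of this equivalence (all data made up):

```python
import numpy as np

# Schur-complement equivalence behind the linearization step: with U > 0,
#     X + R U^{-1} R^T < 0   <=>   [[X, R], [R^T, -U]] < 0.
rng = np.random.default_rng(1)
n, m = 4, 2
R = rng.standard_normal((n, m))
F = rng.standard_normal((m, m))
U = F @ F.T + np.eye(m)                        # U > 0
X = -R @ np.linalg.solve(U, R.T) - np.eye(n)   # so that X + R U^{-1} R^T = -I

small = X + R @ np.linalg.solve(U, R.T)        # the nonlinear constraint
big = np.block([[X, R], [R.T, -U]])            # its linearization

assert np.max(np.linalg.eigvalsh(small)) < 0
assert np.max(np.linalg.eigvalsh(big)) < 0     # both negative definite, as claimed
print("Schur-complement equivalence verified")
```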

With the standard basis vectors e_1, ..., e_n of R^n and with e_0 = 0 define, for j = 0, 1, ..., n,
\[
c_j = c^* e_j,\qquad
W_j = \begin{pmatrix}
0 & C(e_j)^* S_p^* & C(e_j)^* T_p \\
S_p C(e_j) & S_p D(e_j) + D(e_j)^* S_p^* + Q_p(e_j) & D(e_j)^* T_p \\
T_p^* C(e_j) & T_p^* D(e_j) & -U_p(e_j)
\end{pmatrix}.
\]


Then we have to infimize c^*x over x and P ∈ P subject to
\[
\begin{pmatrix} I & 0 & 0 \\ A & B & 0 \end{pmatrix}^* P \begin{pmatrix} I & 0 & 0 \\ A & B & 0 \end{pmatrix}
+ W_0 + \sum_j x_j W_j \leq 0.
\]

For \boldsymbol{\delta} = \mathrm{conv}\{\delta^1, \ldots, \delta^N\} let us now consider the concrete class P of full-block multipliers P which are just implicitly described by the (strictly feasible) constraints
\[
\begin{pmatrix} I \\ 0 \end{pmatrix}^* P \begin{pmatrix} I \\ 0 \end{pmatrix} \leq 0,\qquad
\begin{pmatrix} \Delta(\delta^j) \\ I \end{pmatrix}^* P \begin{pmatrix} \Delta(\delta^j) \\ I \end{pmatrix} \geq 0
\quad\text{for all } j = 1, \ldots, N.
\]

With Lagrange multipliers M, M̃ and M_j, j = 1, ..., N, the Lagrangian hence reads as
\[
\sum_{j=1}^n c_j x_j
+ \Bigl\langle \begin{pmatrix} I & 0 & 0 \\ A & B & 0 \end{pmatrix}^* P \begin{pmatrix} I & 0 & 0 \\ A & B & 0 \end{pmatrix},\, M \Bigr\rangle
+ \langle W_0, M\rangle + \sum_{j=1}^n x_j \langle W_j, M\rangle
+ \Bigl\langle \begin{pmatrix} I \\ 0 \end{pmatrix}^* P \begin{pmatrix} I \\ 0 \end{pmatrix},\, \tilde M \Bigr\rangle
- \sum_{j=1}^N \Bigl\langle \begin{pmatrix} \Delta(\delta^j) \\ I \end{pmatrix}^* P \begin{pmatrix} \Delta(\delta^j) \\ I \end{pmatrix},\, M_j \Bigr\rangle
\]

or, after 'sorting for the primal variables',
\[
\langle W_0, M\rangle + \sum_{j=1}^n \bigl(c_j + \langle W_j, M\rangle\bigr) x_j
+ \Bigl\langle P,\;
\begin{pmatrix} I & 0 & 0 \\ A & B & 0 \end{pmatrix} M \begin{pmatrix} I & 0 & 0 \\ A & B & 0 \end{pmatrix}^*
+ \begin{pmatrix} I \\ 0 \end{pmatrix} \tilde M \begin{pmatrix} I \\ 0 \end{pmatrix}^*
- \sum_{j=1}^N \begin{pmatrix} \Delta(\delta^j) \\ I \end{pmatrix} M_j \begin{pmatrix} \Delta(\delta^j) \\ I \end{pmatrix}^* \Bigr\rangle.
\]

By standard duality we can draw the following conclusions:

(a) The primal is not strictly feasible (which just means γ_rel = ∞) iff there exist M ≥ 0, M̃ ≥ 0, M_j ≥ 0, not all of them vanishing, with ⟨W_0, M⟩ ≥ 0, ⟨W_j, M⟩ = 0 for j = 1, ..., n, and
\[
\begin{pmatrix} I & 0 & 0 \\ A & B & 0 \end{pmatrix} M \begin{pmatrix} I & 0 & 0 \\ A & B & 0 \end{pmatrix}^*
+ \begin{pmatrix} I \\ 0 \end{pmatrix} \tilde M \begin{pmatrix} I \\ 0 \end{pmatrix}^*
- \sum_{j=1}^N \begin{pmatrix} \Delta(\delta^j) \\ I \end{pmatrix} M_j \begin{pmatrix} \Delta(\delta^j) \\ I \end{pmatrix}^* = 0. \tag{6.6.3}
\]

(b) If the primal is strictly feasible and has finite optimal value, there exist M ≥ 0, M̃ ≥ 0, M_j ≥ 0 which maximize ⟨W_0, M⟩ under the constraints ⟨W_j, M⟩ + c_j = 0, j = 1, ..., n, and (6.6.3). The optimal value of this Lagrange dual problem equals γ_rel.

Theorem 6.16 If γ_rel = ∞ and if M in (a) has rank one, then γ_opt = ∞. If γ_rel < ∞ and if M in (b) has rank one, then γ_opt = γ_rel.


It is convenient to summarize this result as follows: if there exists either an infeasibility certificate or a dual optimizer such that M has rank one, then the relaxation is exact.

Proof. If M has rank one, it can be decomposed as M = mm^*. Let us partition m = col(w, ξ, ξ_e) according to the columns of (A B 0) and define z = (A B 0)m = Aw + Bξ. The essential point is to conclude from (6.6.3) that
\[
w = \Delta(\delta^0) z \quad\text{for some } \delta^0 \in \boldsymbol{\delta}.
\]
Indeed, just by using the definitions we obtain with (6.6.3) the relation
\[
\begin{pmatrix} w \\ z \end{pmatrix} \begin{pmatrix} w^* & z^* \end{pmatrix}
+ \begin{pmatrix} I \\ 0 \end{pmatrix} \tilde M \begin{pmatrix} I & 0 \end{pmatrix}
- \sum_{j=1}^N \begin{pmatrix} \Delta(\delta^j) \\ I \end{pmatrix} M_j \begin{pmatrix} \Delta(\delta^j) \\ I \end{pmatrix}^* = 0.
\]
From the right-lower block zz^* = \sum_{j=1}^N M_j we infer that z^*x = 0 implies M_jx = 0 for all j, such that there exist α_j with M_j = zα_jz^*. If z ≠ 0 we have α_j ≥ 0 and \sum_{j=1}^N α_j = 1 (since the M_j sum up to zz^*). Now the right-upper block relation wz^* − \sum_{j=1}^N Δ(δ^j)M_j = 0 allows us to conclude, by right-multiplication with z and division by z^*z ≠ 0, that
\[
w = \sum_{j=1}^N \Delta(\delta^j) z \alpha_j
= \Delta\Bigl(\sum_{j=1}^N \alpha_j \delta^j\Bigr) z = \Delta(\delta^0) z
\quad\text{with}\quad \delta^0 := \sum_{j=1}^N \alpha_j \delta^j \in \boldsymbol{\delta}.
\]
If z = 0 we infer M_j = 0 and hence ww^* + M̃ = 0, which implies w = 0 (and M̃ = 0) such that w = Δ(δ^0)z holds for an arbitrary δ^0 ∈ \boldsymbol{\delta}.

Let us now assume γ_rel = ∞. It can happen that I − AΔ(δ^0) is singular; this implies that Problem (6.6.1) is not feasible and hence γ_opt = ∞.

Let us continue under the hypothesis that I − AΔ(δ^0) is non-singular, which implies w = Δ(δ^0)(I − AΔ(δ^0))^{-1}Bξ (due to z = Aw + Bξ and w = Δ(δ^0)z). We infer for all x ∈ R^n:

\[
\begin{aligned}
&\xi^* \begin{pmatrix} I \\ D(x) + C(x)\Delta(\delta^0)(I - A\Delta(\delta^0))^{-1} B \end{pmatrix}^*
\begin{pmatrix} Q_p(x) & S_p \\ S_p^* & T_p U_p(x)^{-1} T_p^* \end{pmatrix}
\begin{pmatrix} I \\ D(x) + C(x)\Delta(\delta^0)(I - A\Delta(\delta^0))^{-1} B \end{pmatrix} \xi = \\
&\quad= \begin{pmatrix} w \\ \xi \end{pmatrix}^*
\begin{pmatrix} 0 & I \\ C(x) & D(x) \end{pmatrix}^*
\begin{pmatrix} Q_p(x) & S_p \\ S_p^* & T_p U_p(x)^{-1} T_p^* \end{pmatrix}
\begin{pmatrix} 0 & I \\ C(x) & D(x) \end{pmatrix}
\begin{pmatrix} w \\ \xi \end{pmatrix} = \\
&\quad= \max_\eta \begin{pmatrix} w \\ \xi \\ \eta \end{pmatrix}^*
\begin{pmatrix}
0 & C(x)^* S_p^* & C(x)^* T_p \\
S_p C(x) & S_p D(x) + D(x)^* S_p^* + Q_p(x) & D(x)^* T_p \\
T_p^* C(x) & T_p^* D(x) & -U_p(x)
\end{pmatrix}
\begin{pmatrix} w \\ \xi \\ \eta \end{pmatrix} \geq \\
&\quad\geq \begin{pmatrix} w \\ \xi \\ \xi_e \end{pmatrix}^*
\begin{pmatrix}
0 & C(x)^* S_p^* & C(x)^* T_p \\
S_p C(x) & S_p D(x) + D(x)^* S_p^* + Q_p(x) & D(x)^* T_p \\
T_p^* C(x) & T_p^* D(x) & -U_p(x)
\end{pmatrix}
\begin{pmatrix} w \\ \xi \\ \xi_e \end{pmatrix}
= m^* W_0 m + \sum_{j=1}^n x_j\, m^* W_j m.
\end{aligned}
\]


By hypothesis we have
\[
0 \leq \langle W_0, M\rangle + \sum_{j=1}^n x_j \langle W_j, M\rangle
= m^* W_0 m + \sum_{j=1}^n x_j\, m^* W_j m \quad\text{for all } x\in R^n.
\]
From the above-derived chain of inequalities we can hence infer that no x ∈ R^n can ever be feasible for (6.6.1), which implies again that γ_opt = ∞ and thus γ_opt = γ_rel.

Let us now assume γ_rel < ∞. Since the relaxation is strictly feasible, it is guaranteed that I − AΔ(δ^0) is nonsingular, and the above-derived chain of inequalities is again true. Let us apply this chain for any x that is feasible for Problem (6.6.1) to infer
\[
0 \geq m^* W_0 m + \sum_{j=1}^n x_j\, m^* W_j m.
\]
Just due to ⟨W_0, M⟩ = γ_rel and c_j = −⟨W_j, M⟩ we can conclude
\[
\sum_{j=1}^n c_j x_j
= \gamma_{\rm rel} - \langle W_0, M\rangle - \sum_{j=1}^n x_j \langle W_j, M\rangle
= \gamma_{\rm rel} - m^* W_0 m - \sum_{j=1}^n x_j\, m^* W_j m
\]
and hence c^*x ≥ γ_rel. Since x was an arbitrary feasible point of Problem (6.6.1) we infer γ_opt ≥ γ_rel and thus γ_opt = γ_rel.

It is interesting to note that the proof reveals how one can construct a worst-case parameter uncertainty from a rank-one dual multiplier! In a similar fashion it is possible to apply this principle to a whole variety of other problems, which reveals its broad applicability.
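A hedged numerical sketch of this extraction step, for the simplest case Δ(δ) = δI: given vertex multipliers of the form M_j = α_j z zᵀ, the weights α_j and the worst-case parameter δ⁰ = Σ_j α_j δʲ can be read off directly (all data below are illustrative):

```python
import numpy as np

# Worst-case parameter from a rank-one dual multiplier, for Delta(delta) = delta * I:
# if M_j = alpha_j * z z^T at the vertices delta^j, then delta0 = sum_j alpha_j delta^j
# and w = Delta(delta0) z, exactly as in the proof of Theorem 6.16.
rng = np.random.default_rng(2)
deltas = np.array([-1.0, 0.5, 2.0])          # vertices delta^1, ..., delta^N
alphas = np.array([0.2, 0.5, 0.3])           # convex weights hidden in the dual
z = rng.standard_normal(3)
Ms = [a * np.outer(z, z) for a in alphas]    # vertex multipliers M_j

# recover the weights and the worst-case parameter from the M_j alone
rec = np.array([np.trace(M) / (z @ z) for M in Ms])
delta0 = rec @ deltas
w = sum(d * M for d, M in zip(deltas, Ms)) @ z / (z @ z)   # from w z^T = sum_j Delta(delta^j) M_j

assert np.allclose(rec, alphas)
assert np.allclose(w, delta0 * z)            # indeed w = Delta(delta0) z
print("worst-case parameter delta0 =", delta0)
```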

In one specific case, namely for one-dimensional uncertainty, one can in fact always construct a dual multiplier of rank one.

Theorem 6.17 Suppose that \boldsymbol{\delta} = \mathrm{conv}\{\delta^1, \delta^2\}. Then γ_rel = γ_opt.

Proof. Let us first investigate the constraint (6.6.3) in the specific case N = 2. Left-multiplication with (I −Δ(δ^1)) and right-multiplication with (I −Δ(δ^2))^* leads, with
\[
U := \begin{pmatrix} I & -\Delta(\delta^1) \end{pmatrix} \begin{pmatrix} I & 0 & 0 \\ A & B & 0 \end{pmatrix}
\quad\text{and}\quad
V := \begin{pmatrix} I & -\Delta(\delta^2) \end{pmatrix} \begin{pmatrix} I & 0 & 0 \\ A & B & 0 \end{pmatrix},
\]
to the relation UMV^* + M̃ = 0. This implies UMV^* + VMU^* = −2M̃ ≤ 0 and UMV^* − VMU^* = 0, which can be expressed as
\[
\begin{pmatrix} U^* \\ V^* \end{pmatrix}^* \begin{pmatrix} 0 & -M \\ -M & 0 \end{pmatrix} \begin{pmatrix} U^* \\ V^* \end{pmatrix} \geq 0,
\qquad
\begin{pmatrix} U^* \\ V^* \end{pmatrix}^* \begin{pmatrix} 0 & M \\ -M & 0 \end{pmatrix} \begin{pmatrix} U^* \\ V^* \end{pmatrix} = 0. \tag{6.6.4}
\]


Now choose an arbitrary x ∈ R^n. Since M ≠ 0, we can apply Lemma ?? to infer the existence of a vector m ≠ 0 (possibly depending on x) with
\[
m^* \Bigl( W_0 + \sum_{j=1}^n x_j W_j \Bigr) m
\;\geq\; \Bigl\langle W_0 + \sum_{j=1}^n x_j W_j,\, M \Bigr\rangle \tag{6.6.5}
\]

and
\[
\begin{pmatrix} Um \\ Vm \end{pmatrix}^* \begin{pmatrix} 0 & -I \\ -I & 0 \end{pmatrix} \begin{pmatrix} Um \\ Vm \end{pmatrix} \geq 0,
\qquad
\begin{pmatrix} Um \\ Vm \end{pmatrix}^* \begin{pmatrix} 0 & I \\ -I & 0 \end{pmatrix} \begin{pmatrix} Um \\ Vm \end{pmatrix} = 0,
\]
i.e., (Um)^*(Vm) = (Vm)^*(Um) ≤ 0.

If we partition m again as col(w, ξ, ξ_e) and if we define z = Aw + Bξ, we infer Um = w − Δ(δ^1)z and Vm = w − Δ(δ^2)z. If Vm = 0 we conclude w = Δ(δ^2)z. If Vm ≠ 0 there exists, by Lemma ??, some α ≤ 0 with w − Δ(δ^1)z = α[w − Δ(δ^2)z], or
\[
w = \Bigl( \frac{1}{1-\alpha}\,\Delta(\delta^1) - \frac{\alpha}{1-\alpha}\,\Delta(\delta^2) \Bigr) z
= \Delta\Bigl( \frac{1}{1-\alpha}\,\delta^1 - \frac{\alpha}{1-\alpha}\,\delta^2 \Bigr) z.
\]
Since α ≤ 0, the coefficients 1/(1−α) and −α/(1−α) are nonnegative and sum to one. Therefore, in both cases, w = Δ(δ^0)z for some δ^0 ∈ \mathrm{conv}\{\delta^1, \delta^2\}.

If γ_rel = ∞, we can infer ⟨W_0 + \sum_{j=1}^n x_j W_j, M⟩ ≥ 0 and hence with (6.6.5) we obtain m^*W_0m + \sum_{j=1}^n x_j m^*W_jm ≥ 0. This allows us to finish the proof as above.

If γ_rel < ∞, and if x is chosen feasible for Problem (6.6.1), we infer again 0 ≥ m^*W_0m + \sum_{j=1}^n x_j m^*W_jm. Now we exploit (6.6.5) to conclude 0 ≥ ⟨W_0 + \sum_{j=1}^n x_j W_j, M⟩ or c^*x ≥ γ_rel. We conclude again that γ_opt ≥ γ_rel, and thus equality.

Remarks. This is a novel result on the absence of a relaxation gap for robust semi-definite programs with implicitly described full-block multipliers. Only slight modifications of our arguments lead to the same insights for block-diagonal multipliers, which reveals that we have provided generalizations of the results from [?] and [?]. It is important to stress that full-block multipliers are in general expected to lead to less conservative results. We can conclude, however, that the one-parameter case allows the restriction to the diagonal multiplier class without introducing conservatism. In the case of complex uncertainties as appearing in SSV-theory we will be able to easily derive similar insights in a considerably more general setting in the sections to follow.

Let us summarize our findings in the following general principle.

(Absence of relaxation gap.) If the Lagrange multiplier corresponding to the concrete S-procedure multiplier inequality has rank one, then there is no relaxation gap.

It is the main purpose of this section to illustrate this principle by means of obtaining a novel result on the absence of a relaxation gap for (implicitly described) full-block multipliers. A similar derivation in a somewhat more general setting, including full complex uncertainty blocks as they appear in robust control, will follow in a later section.


Chapter 7

Analysis of input-output behavior

The main concern in control theory is to study how signals are processed by dynamical systems and how this processing can be influenced to achieve a certain desired behavior. For that purpose one has to specify the signals (time series, trajectories) of interest. This is done by deciding on the set of values which the signals can take (such as R^n) and on the time set on which they are defined (such as the full time axis R, the half axis [0, ∞), or the corresponding discrete-time versions Z and N). A dynamical system is then nothing but a mapping that assigns to a certain input signal some output signal. Very often, such a mapping is defined by a differential equation with a fixed initial condition or by an integral equation, which amounts to considering systems or mappings with a specific explicit description or representation.

The first purpose of this chapter is to discuss stability properties of systems in a rather abstract and general setting. In a second step we turn to specialized system representations and investigate to what extent one can arrive at refined results which are amenable to computational techniques.

We stress that it is somewhat restrictive to consider a system as a mapping of signals, thus fixing what is considered as an input signal or output signal. It is not too difficult to extend our results to the more general and elegant setting of viewing a system as a subset of signals or, in the modern language, as a behavior [?]. In particular concerning robust stability issues, this point of view has been adopted as well in the older literature, in which systems have been defined as relations rather than mappings [?,?,?,?].

7.1 Basic Notions

Let us now introduce the concrete ingredients that are required to develop the general theory. The basis for defining signal spaces is formed by L^n, the set of all time-functions or signals x : [0, ∞) → R^n which are Lebesgue-measurable. Without bothering too much about the exact definition, one should recall that all continuous or piecewise continuous signals are contained in L^n. The book by Rudin [?] is an excellent source for a precise exposition of all the mathematical concepts in this chapter. From an engineering perspective it is helpful to consult [?], in which some of the advanced constructions are put on an elementary yet insightful footing.

For any x ∈ L^n one can calculate the energy integral
\[
\|x\|_2 := \sqrt{\int_0^\infty \|x(t)\|^2\,dt}
\]
which is either finite or infinite. If we collect all signals with a finite value we arrive at
\[
L_2^n := \{x\in L^n : \|x\|_2 < \infty\}.
\]
It can be shown that L^n_2 is a linear space, that ‖·‖_2 is a norm on L^n_2, and that L^n_2 is complete, i.e., a Banach space. The quantity ‖x‖_2 is often called the L_2-norm or energy of the signal x. In the sequel, if the number of components n of the underlying signal is understood from the context or irrelevant, we simply write L_2 instead of L^n_2.

Actually L^n_2 admits additional structure. Indeed, let us define the bilinear form
\[
\langle x, y\rangle = \int_0^\infty x(t)^T y(t)\,dt
\]
on L^n_2 × L^n_2. Bilinearity just means that ⟨·, y⟩ is linear for each y ∈ L^n_2 and ⟨x, ·⟩ is linear for each x ∈ L^n_2. It is not difficult to see that ⟨·, ·⟩ defines an inner product on L^n_2. The above-defined norm and the inner product are related in the standard fashion by ‖x‖²_2 = ⟨x, x⟩. As a complete inner product space, L^n_2 is in fact a Hilbert space.

For any x ∈ L^n_2 one can calculate the Fourier transform x̂, which is a function mapping the imaginary axis C^0 into C^n such that
\[
\int_{-\infty}^{\infty} \hat x(i\omega)^* \hat x(i\omega)\,d\omega \quad\text{is finite.}
\]
Indeed, a fundamental result in the theory of the Fourier transformation on L_2-spaces, the so-called Parseval theorem, states that
\[
\int_0^\infty x(t)^T y(t)\,dt = \frac{1}{2\pi}\int_{-\infty}^{\infty} \hat x(i\omega)^* \hat y(i\omega)\,d\omega.
\]
Recall that the Fourier transform x̂ has in fact a unique continuation into C^0 ∪ C^+ that is analytic in C^+. Hence x̂ is not just an element of L_2(C^0, C^n) but even of the subspace H_2(C^+, C^n), one of the Hardy spaces. Moreover, one has L_2(C^0, C^n) = H_2(C^−, C^n) + H_2(C^+, C^n), the sum on the right is direct, and the two spaces are orthogonal to each other. This corresponds via the Paley-Wiener theorem to the orthogonal direct sum decomposition L_2(R, C^n) = L_2((−∞, 0], C^n) + L_2([0, ∞), C^n). These facts, which can be found in more detail e.g. in [?, ?], are mentioned only for clarification and are not alluded to in the sequel.


Roughly speaking, stability of a system is related to the property that it maps any input signal of bounded energy into an output signal which also has bounded energy. Since it is desired to deal with unstable systems as well, we cannot confine ourselves to signals with finite energy. This motivates the introduction of a larger class of signals that have finite energy over finite intervals only.

For that purpose it is convenient to introduce, for each T ≥ 0, the so-called truncation operator P_T: it assigns to any signal x ∈ L^n the signal P_Tx that is identical to x on [0, T] and vanishes identically on (T, ∞):
\[
P_T : L^n \to L^n,\qquad
(P_T x)(t) := \begin{cases} x(t) & \text{for } t\in[0,T], \\ 0 & \text{for } t\in(T,\infty). \end{cases}
\]
Note that L^n is a linear space and that P_T is a linear operator on that space with the property P_TP_T = P_T. Hence P_T is a projection.

Now it is straightforward to define the so-called extended space L^n_2e. It just consists of all signals x ∈ L^n such that P_Tx has finite energy for all T ≥ 0:
\[
L_{2e}^n := \{x\in L^n : P_T x\in L_2^n \text{ for all } T\geq 0\}.
\]
Hence any x ∈ L^n_2e has the property that
\[
\|P_T x\|_2 = \sqrt{\int_0^T \|x(t)\|^2\,dt}
\]
is finite for every T. We observe that ‖P_Tx‖_2 does not decrease if T increases. Therefore ‖P_Tx‖_2, viewed as a function of T, either stays bounded for T → ∞ and then converges, or it is unbounded and then diverges to ∞. For any x ∈ L^n_2e we can hence conclude that ‖P_Tx‖_2 is bounded for T → ∞ iff x is contained in L^n_2. Moreover,
\[
x\in L_2^n \quad\text{implies}\quad \|x\|_2 = \lim_{T\to\infty} \|P_T x\|_2.
\]

Example 7.1 The signal defined by x(t) = e^t is contained in L_2e but not in L_2. The signal defined by x(0) = 0 and x(t) = 1/t for t > 0 is not contained in L_2e. In general, since continuous or piecewise continuous signals are bounded on [0, T] for every T ≥ 0, they are all contained in L^n_2e.

A dynamical system S is a mapping
\[
S : U_e \to L_{2e}^l \quad\text{with}\quad U_e \subset L_{2e}^k.
\]
Note that the domain of definition U_e is not necessarily a subspace and that, in practical applications, it is typically smaller than L^k_2e since it imposes some additional regularity or boundedness properties on its signals. The system S is causal if its domain of definition satisfies
\[
P_T U_e \subset U_e \quad\text{for all } T\geq 0
\]
and if
\[
P_T S(u) = P_T S(P_T u) \quad\text{for all } T\geq 0,\ u\in U_e.
\]
It is easily seen that P_TS = P_TSP_T is equivalent to the following more intuitive fact: if u_1 and u_2 are two input signals that are identical on [0, T], P_Tu_1 = P_Tu_2, then S(u_1) and S(u_2) are also identical on the same time interval, P_TS(u_1) = P_TS(u_2). In other words, future values of the inputs do not have any effect on past output values. This mathematical definition matches the intuitive notion of causality.
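In discrete time the causality condition P_T S(u) = P_T S(P_T u) is easy to test; a convolution with a kernel supported on nonnegative times satisfies it exactly. A small sketch (illustrative kernel and signals):

```python
import numpy as np

# Discrete-time causality check: a convolution with a kernel supported on
# t >= 0 satisfies P_T S(u) = P_T S(P_T u) exactly.
rng = np.random.default_rng(3)
h = np.exp(-np.arange(50))                 # causal impulse response
S = lambda u: np.convolve(h, u)[: len(u)]  # causal convolution system

def P(T, u):                               # truncation operator P_T
    v = u.copy()
    v[T:] = 0.0
    return v

u = rng.standard_normal(200)
for T in (10, 80, 150):
    assert np.allclose(P(T, S(u)), P(T, S(P(T, u))))
print("future inputs do not affect past outputs")
```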

Our main interest in this abstract setting is to characterize invertibility and stability of a system. Among the many possibilities to define system stability, those based on the gain or the incremental gain have turned out to be of prominent importance.

Definition 7.2 The L_2-gain of the system S defined on U_e is given as
\[
\|S\| := \sup\Bigl\{ \frac{\|P_T S(u)\|_2}{\|P_T u\|_2} \;\Big|\; u\in U_e,\ T\geq 0,\ \|P_T u\|_2\neq 0 \Bigr\}
= \inf\{\gamma\in\mathbb{R} \mid \forall\, u\in U_e,\ T\geq 0:\ \|P_T S(u)\|_2 \leq \gamma\|P_T u\|_2\}.
\]
If ‖S‖ < ∞, S is said to have finite L_2-gain.

The reader is invited to prove that the two different ways of expressing ‖S‖ do indeed coincide. If S has finite L_2-gain we infer for γ := ‖S‖ that
\[
\|P_T S(u)\|_2 \leq \gamma\|P_T u\|_2 \quad\text{for all } T\geq 0,\ u\in U_e. \tag{7.1.1}
\]
In particular, this implies
\[
\|S(u)\|_2 \leq \gamma\|u\|_2 \quad\text{for all } u\in U_e\cap L_2^k. \tag{7.1.2}
\]
Hence, if the input has finite energy, then the output has finite energy and the output energy is bounded by a constant times the input energy. If S is actually causal, the converse is true as well and property (7.1.2) implies (7.1.1). Indeed, for u ∈ U_e we infer P_Tu ∈ U_e ∩ L^k_2, such that we can apply (7.1.2) to conclude ‖S(P_Tu)‖_2 ≤ γ‖P_Tu‖_2. Using causality, it remains to observe ‖P_TS(u)‖_2 = ‖P_TS(P_Tu)‖_2 ≤ ‖S(P_Tu)‖_2 to end up with (7.1.1).

We conclude for causal S that
\[
\|S\| = \sup_{u\in U_e\cap L_2^k,\ \|u\|_2>0} \frac{\|S(u)\|_2}{\|u\|_2}
= \inf\{\gamma\in\mathbb{R} \mid \forall\, u\in U_e\cap L_2^k:\ \|S(u)\|_2 \leq \gamma\|u\|_2\}.
\]
Hence the L_2-gain of S is the worst amplification of the system if the size of the input and output signals is measured by their energy.

Example 7.3 Consider the linear time-invariant system
\[
\dot x = Ax + Bu,\quad y = Cx + Du,\quad x(0) = 0 \tag{7.1.3}
\]
with A ∈ R^{n×n}, B ∈ R^{n×k}, C ∈ R^{l×n} and D ∈ R^{l×k}. A standard fact in the theory of differential equations reveals that any u ∈ L^k_2e leads to a unique response y ∈ L^l_2e. It is not difficult to prove that the output has finite energy for all u ∈ L^k_2 if and only if the corresponding transfer matrix C(sI − A)^{-1}B has all its poles in the open left-half plane. If (A, B) is stabilizable and (A, C) is detectable, this is equivalent to A being Hurwitz. If (7.1.3) maps L^k_2 into L^l_2, it has finite L_2-gain, which is known to equal the H_∞-norm of the transfer matrix C(sI − A)^{-1}B + D.
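A crude way to estimate this H_∞-norm is a frequency sweep over the imaginary axis; for the illustrative second-order system below (not taken from the text), the analytic peak gain 1/(2ζ√(1−ζ²)) ≈ 2.07 at damping ζ = 0.25 is recovered:

```python
import numpy as np

# Frequency sweep estimating the H-infinity norm of C (sI - A)^{-1} B + D for
# an illustrative second-order system (natural frequency 1, damping 0.25).
A = np.array([[0.0, 1.0], [-1.0, -0.5]])   # Hurwitz
B = np.array([[0.0], [1.0]])
C = np.array([[1.0, 0.0]])
D = np.array([[0.0]])

def gain(w):
    G = C @ np.linalg.solve(1j * w * np.eye(2) - A, B) + D
    return np.linalg.svd(G, compute_uv=False)[0]   # largest singular value

hinf = max(gain(w) for w in np.linspace(0.0, 10.0, 10001))
print("H-infinity norm estimate:", hinf)           # analytic peak is ~2.066
assert 2.0 < hinf < 2.1
```

A grid search is only a lower-bound heuristic; certified computations use bisection on a Hamiltonian-matrix test.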

With a continuous function φ : R → R let us define the static nonlinear system
\[
S(u)(t) := \varphi(u(t)) \quad\text{for } t\geq 0. \tag{7.1.4}
\]
For S to map L_2e into L_2e, one either requires additional properties of φ or one has to restrict the domain of definition. If U_e denotes all continuous signals in L_2e, we conclude that S maps U_e into U_e ⊂ L_2e, simply due to the fact that t → φ(u(t)) is continuous for u ∈ U_e. However, S is not guaranteed to have finite L_2-gain; just think of φ(x) = √x and the signal u(t) = 1/(t + 1), t ≥ 0. In general, if φ satisfies
\[
|\varphi(x)| \leq \alpha|x| \quad\text{for all } x\in\mathbb{R},
\]
the reader is invited to show that S : L_2e → L_2e is causal and has finite L_2-gain bounded by α.
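The claimed gain bound is easy to check numerically for a concrete sector-bounded nonlinearity such as φ = tanh, which satisfies |φ(x)| ≤ |x| (so α = 1); since the bound holds pointwise, it holds for every truncated energy integral as well (illustrative signals):

```python
import numpy as np

# Gain bound for a static nonlinearity: phi = tanh satisfies |phi(x)| <= |x|,
# so the L_2-gain of S(u)(t) = phi(u(t)) is at most alpha = 1 (the pointwise
# bound implies the bound for every energy integral).
rng = np.random.default_rng(4)
t = np.linspace(0.0, 20.0, 20001)
dt = t[1] - t[0]
l2 = lambda x: float(np.sqrt(np.sum(x**2) * dt))

for _ in range(5):
    u = rng.standard_normal(t.size) * np.exp(-0.1 * t)   # finite-energy test signal
    assert l2(np.tanh(u)) <= 1.0 * l2(u) + 1e-12
print("gain bound alpha = 1 verified on random signals")
```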

For nonlinear systems it is often more relevant to compare the distance of two different input signals u_1, u_2 with the distance of the corresponding output signals. In engineering terms this amounts to investigating the effect of input variations on the system's output, whereas in mathematical terms it is related to continuity properties. From the abundance of possible definitions, the most often used version compares the energy of the input increment u_1 − u_2 with the energy of the output increment S(u_1) − S(u_2), which leads to the notion of incremental L_2-gain.

Definition 7.4 The incremental L_2-gain of the system S : U_e → L^l_2e is defined as
\[
\|S\|_i := \sup\Bigl\{ \frac{\|P_T S(u_1) - P_T S(u_2)\|_2}{\|P_T u_1 - P_T u_2\|_2} \;\Big|\; u_1, u_2\in U_e,\ T\geq 0,\ \|P_T u_1 - P_T u_2\|_2\neq 0 \Bigr\}
\]
\[
= \inf\{\gamma\in\mathbb{R} \mid \forall\, u_1, u_2\in U_e,\ T\geq 0:\ \|P_T S(u_1) - P_T S(u_2)\|_2 \leq \gamma\|P_T u_1 - P_T u_2\|_2\}.
\]
If ‖S‖_i < ∞, S is said to have finite incremental L_2-gain.

Similarly as for the L_2-gain, the system S has finite incremental L_2-gain if there exists a real γ > 0 such that
\[
\|P_T S(u_1) - P_T S(u_2)\|_2 \leq \gamma\|P_T u_1 - P_T u_2\|_2 \quad\text{for all } T\geq 0,\ u_1, u_2\in U_e. \tag{7.1.5}
\]
This reveals
\[
\|S(u_1) - S(u_2)\|_2 \leq \gamma\|u_1 - u_2\|_2 \quad\text{for all } u_1, u_2\in U_e\cap L_2^k. \tag{7.1.6}
\]


If S is causal, it can again be easily seen that (7.1.6) implies (7.1.5), and that
\[
\|S\|_i = \sup_{u_1, u_2\in U_e\cap L_2^k,\ \|u_1-u_2\|_2>0} \frac{\|S(u_1)-S(u_2)\|_2}{\|u_1-u_2\|_2}
= \inf\{\gamma\in\mathbb{R} \mid \forall\, u_1, u_2\in U_e\cap L_2^k:\ \|S(u_1)-S(u_2)\|_2 \leq \gamma\|u_1-u_2\|_2\}.
\]

Example 7.5 If the function φ in (7.1.4) satisfies the Lipschitz condition
\[
|\varphi(x_1) - \varphi(x_2)| \leq \alpha|x_1 - x_2| \quad\text{for all } x_1, x_2\in\mathbb{R},
\]
the mapping defined by (7.1.4) has finite incremental L_2-gain bounded by α. Note that one cannot guarantee that it has finite L_2-gain, as seen with φ(x) = 1 + |x| and the signal u(t) = 0, which is mapped into φ(u(t)) = 1, not contained in L_2.

It follows directly from the definitions that
\[
\|S\| \leq \|S\|_i \quad\text{if } S(0) = 0.
\]
Hence systems mapping zero into zero have finite L_2-gain if they have finite incremental L_2-gain. The converse is, in general, not true, and the reader is asked to construct a suitable counterexample. It is obvious that
\[
\|S\| = \|S\|_i \quad\text{if } S \text{ is linear.}
\]
Hence linear systems have finite L_2-gain iff they have finite incremental L_2-gain.

For reasons of clarity we have opted to confine ourselves to a few concepts in the pretty specific L_2-setting, which sets the stage for various modifications or extensions that have been suggested in the literature. Let us hint at a few of these generalizations. All notions are as easily introduced for L_p- and L_pe-spaces with 1 ≤ p ≤ ∞. The time axis can be chosen as all nonnegative integers to investigate discrete-time systems. A mixture of continuous and discrete time allows one to consider hybrid systems or systems with jumps. Moreover, the value set of the signals can be taken to be an arbitrary normed space, thus including infinite-dimensional systems. Finally, the presented stability concepts are important samples out of a multitude of other possibilities. For example, stability of S is often defined by just requiring S to map U_e ∩ L_2 into L_2 without necessarily having finite L_2-gain. On the other hand, in various applications it is important to qualify in a more refined fashion how ‖P_TS(u)‖_2 is related to ‖P_Tu‖_2 for varying T. For example, one could work with ‖P_TS(u)‖_2 ≤ γ(‖P_Tu‖_2) to hold for T ≥ 0, u ∈ U_e, and for some function γ : [0, ∞) → [0, ∞) in a specific class. We have confined ourselves to linear functions γ, but one could as well take the class of affine or monotonic functions, or classes of an even more specific type with desired tangency and growth properties for T → 0 and T → ∞ respectively, some of which can be found e.g. in [?, ?]. Many of the abstract results to be presented could be formulated with general designer-chosen stability properties that only need to obey certain (technical) axiomatic hypotheses.


7.2 Robust Input-Output Stability

In general, robustness analysis is nothing but an investigation of the sensitivity of an interesting system property against (possibly large) perturbations that are known to belong to an a priori specified class which determines their structure and size. One can argue that robustness questions take center stage in most natural sciences, in engineering and in mathematics. In control one is typically interested in how a desired property of a system interconnection, such as stability or performance, is affected by perturbations of the system's components that are caused e.g. by modeling inaccuracies or by failures. The purpose of this section is to develop tools for guaranteeing the stability of an interconnection of linear time-invariant systems against rather general classes of system perturbations, which are also called uncertainties.

7.2.1 A Specific Feedback Interconnection

As we have clarified in Section ??, robust stability and performance questions can be handled in quite some generality on the basis of the setup in Figure 7.1. Here M is a fixed nominal model and Δ describes an uncertainty which is allowed to vary in a certain class \boldsymbol{\Delta}. Both the nominal system and the uncertainty are interconnected via feedback as suggested in the diagram, and as analytically expressed by the relations
\[
\begin{pmatrix} w \\ M(w) \end{pmatrix} - \begin{pmatrix} \Delta(z) \\ z \end{pmatrix}
= \begin{pmatrix} w_{\rm in} \\ z_{\rm in} \end{pmatrix}. \tag{7.2.1}
\]
The signals w_in and z_in are external inputs or disturbances, and w, z constitute the corresponding response of the interconnection.

The two most basic properties of such feedback interconnections are well-posedness and stability. Well-posedness means that for all external signals w_in and z_in there exists a unique response w, z satisfying (7.2.1); we can then conclude that the relations (7.2.1) actually define a mapping (w_in, z_in) → (w, z). Stability is then just related to whether this mapping has finite gain or not.

andz which satisfies (7.2.1). We can conclude that the relations (7.2.1) actually define a mapping(win, zin) → (w, z). Stability is then just related to whether this mapping has finite gain or not.

After these intuitive explanations we intend to introduce the precise mathematical framework chosenfor our exposition. We start by fixing the hypothesis on the component systems.

Assumption 7.6 For each1 ∈ 1, the mappings

M : We → L l2e and 1 : Ze → Lk

2e

with We ⊂ Lk2e andZe ⊂ L l

2e are causal and of finiteL2-gain. Moreover,1 is star-shaped with starcenter 0:

1 ∈ 1 implies τ1 ∈ 1 for all τ ∈ [0, 1].


[Figure 7.1: Uncertainty Feedback Configuration — the nominal system M in feedback with the uncertainty Δ, with external inputs w_0, z_0 and internal loop signals w, z.]

For each Δ ∈ \boldsymbol{\Delta} we note that {τΔ | τ ∈ [0, 1]} just defines a line segment in the set of all causal and stable systems connecting 0 with Δ. This justifies the terminology that \boldsymbol{\Delta} is star-shaped with center 0, since any Δ ∈ \boldsymbol{\Delta} can be connected to 0 with such a line segment that is fully contained in \boldsymbol{\Delta}. We infer that 0 ∈ \boldsymbol{\Delta}, which is consistent with viewing \boldsymbol{\Delta} as an uncertainty class where Δ = 0 corresponds to the unperturbed or nominal interconnection described by
\[
\begin{pmatrix} w \\ z \end{pmatrix}
= \begin{pmatrix} w_{\rm in} \\ M(w_{\rm in}) - z_{\rm in} \end{pmatrix}. \tag{7.2.2}
\]

Just for notational convenience we define the interconnection mapping
\[
I_M(\Delta)\begin{pmatrix} w \\ z \end{pmatrix}
= \begin{pmatrix} I & -\Delta \\ M & -I \end{pmatrix}\begin{pmatrix} w \\ z \end{pmatrix}
:= \begin{pmatrix} w \\ M(w) \end{pmatrix} - \begin{pmatrix} \Delta(z) \\ z \end{pmatrix}. \tag{7.2.3}
\]
Note that I_M(Δ) depends both on the component systems M, Δ and on the specific interconnection under investigation. The asymmetric notation should stress that M is fixed whereas Δ is allowed to vary in the class \boldsymbol{\Delta}.

Now we are ready to define well-posedness.

Definition 7.7 The mapping I_M(Δ) is well-posed with respect to the external signal set W_in,e × Z_in,e ⊂ L^k_2e × L^l_2e if for each (w_in, z_in) ∈ W_in,e × Z_in,e there exists a unique (w, z) ∈ W_e × Z_e satisfying
\[
I_M(\Delta)\begin{pmatrix} w \\ z \end{pmatrix} = \begin{pmatrix} w_{\rm in} \\ z_{\rm in} \end{pmatrix}
\]
such that the correspondingly defined map
\[
W_{{\rm in},e}\times Z_{{\rm in},e} \ni
\begin{pmatrix} w_{\rm in} \\ z_{\rm in} \end{pmatrix} \mapsto
\begin{pmatrix} w \\ z \end{pmatrix} \in W_e\times Z_e \tag{7.2.4}
\]
is causal.


Mathematically, well-posedness hence just means that
\[
I_M(\Delta) : W_{{\rm res},e}\times Z_{{\rm res},e} \to W_{{\rm in},e}\times Z_{{\rm in},e}
\quad\text{has a causal inverse } I_M(\Delta)^{-1}
\]
with some set W_res,e × Z_res,e ⊂ W_e × Z_e. Typically, W_e × Z_e and W_in,e × Z_in,e are explicitly given user-specified subsets of L_2e defined through additional regularity or boundedness properties of the corresponding signals. It is then often pretty simple to rely on standard existence and uniqueness results (for differential or integral equations) to guarantee well-posedness, as illustrated in the examples below. In contrast, it is not required to explicitly describe the (generally Δ-dependent) set W_res,e × Z_res,e which collects all responses of the interconnection to disturbances in W_in,e × Z_in,e.

For well-posedness it is clearly required to choose an input signal set with P_T(W_in,e × Z_in,e) ⊂ W_in,e × Z_in,e for all T ≥ 0. In addition, the unperturbed interconnection mapping I_M(0) is required to be well-posed. In view of (7.2.2) this is guaranteed if the input signal sets satisfy W_in,e ⊂ W_e and M(W_in,e) − Z_in,e ⊂ Z_e. This brings us to the following standing hypothesis on the interconnection, including some abbreviations in order to simplify the notation.

Assumption 7.8 Well-posedness of I_M(Δ) is always understood with respect to a disturbance input set W_in,e × Z_in,e satisfying P_T(W_in,e × Z_in,e) ⊂ W_in,e × Z_in,e for all T ≥ 0 and
\[
W_{{\rm in},e} \subset W_e \quad\text{and}\quad M(W_{{\rm in},e}) - Z_{{\rm in},e} \subset Z_e.
\]
In addition the following standard abbreviations are employed:
\[
W\times Z := (W_e\times Z_e)\cap L_2^{k+l}
\quad\text{and}\quad
W_{\rm in}\times Z_{\rm in} := (W_{{\rm in},e}\times Z_{{\rm in},e})\cap L_2^{k+l}.
\]

Example 7.9 ConsiderM : Lk2e → L l

2e defined by

x = Ax + Bw, z = Cx

with zero initial condition and A Hurwitz. Let Δ denote the class of all mappings

Δ(z)(t) := φ(z(t)) for t ≥ 0

where φ : R → R is continuously differentiable and satisfies |φ(x)| ≤ |x| for all x ∈ R. It is obvious that |φ(x)| ≤ |x| for all x ∈ R implies |τφ(x)| ≤ |x| for all x ∈ R and τ ∈ [0, 1]. This shows that Δ is star-shaped with star-center zero, and the remaining facts in Assumption 7.6 follow from Example 7.3.

The interconnection is described by

ẋ = Ax + Bw, win = w − φ(z), zin = Cx − z (7.2.5)

and defines the mapping I_M(Δ). To investigate well-posedness it is convenient to rewrite these relations into the nonlinear initial value problem

ẋ(t) = Ax(t) + Bφ(Cx(t) − zin(t)) + Bwin(t), x(0) = 0 (7.2.6)


with the interconnection response as output:

w(t) = φ(Cx(t) − zin(t)) + win(t), z(t) = Cx(t) − zin(t).

Let us restrict the external disturbance input signals (win, zin) to those which are piecewise right-continuous, thus defining the subspace Win,e × Zin,e of L2e^{k+l}. Then standard elementary results in differential equation theory guarantee the existence of a unique continuously differentiable solution of the initial value problem (7.2.6) for each (win, zin) ∈ Win,e × Zin,e, and it is clear that (win, zin) → (w, z) is causal. By introducing convenient regularity properties of the input signals it turns out to be simple to establish well-posedness. Of course one can allow for more irregular disturbance inputs by relying on the more advanced theory of Carathéodory for results on differential equations. Our example illustrates that this more technical framework can be avoided without losing practical relevance.
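The construction is easy to reproduce numerically. The sketch below (all data — `A`, `B`, `C`, `phi` and the input signals — are our own illustrative choices, not from the text) integrates the initial value problem (7.2.6) with a simple forward-Euler scheme and then checks causality: two disturbance inputs that coincide up to some time T produce interconnection responses that coincide up to T.

```python
import numpy as np

# Illustrative data: A Hurwitz, phi continuously differentiable with |phi(x)| <= |x|.
A = np.array([[-1.0, 0.5], [0.0, -2.0]])
B = np.array([1.0, 1.0])
C = np.array([1.0, 0.0])
phi = np.tanh

def simulate(win, zin, h=1e-3, steps=5000):
    """Forward-Euler integration of (7.2.6) with x(0) = 0, returning the
    interconnection response (w, z) of (7.2.5) as output."""
    x = np.zeros(2)
    w_out, z_out = np.zeros(steps), np.zeros(steps)
    for k in range(steps):
        t = k * h
        z = C @ x - zin(t)                  # z(t) = C x(t) - zin(t)
        w = phi(z) + win(t)                 # w(t) = phi(z(t)) + win(t)
        w_out[k], z_out[k] = w, z
        x = x + h * (A @ x + B * w)         # xdot = A x + B phi(Cx - zin) + B win
    return w_out, z_out

zin = lambda t: 0.1 * np.cos(t)
win1 = lambda t: np.sin(t)
win2 = lambda t: np.sin(t) if t <= 2.5 else 5.0  # piecewise right-continuous jump

w1, z1 = simulate(win1, zin)
w2, z2 = simulate(win2, zin)

m = 2500  # index of time t = 2.5 for step size h = 1e-3
assert np.allclose(w1[:m], w2[:m]) and np.allclose(z1[:m], z2[:m])  # causality
assert not np.allclose(w1, w2)  # inputs, hence responses, differ after t = 2.5
```

The jump in `win2` is exactly the kind of piecewise right-continuous input admitted above; the Euler loop only ever consumes inputs up to the current time, which is why the truncated responses agree.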

In addition to well-posedness, it is of interest to investigate stability of the feedback interconnection. Among the various possible choices, we confine ourselves to I_M(Δ)^{-1} having finite L2-gain or finite incremental L2-gain. If these properties hold for all Δ ∈ Δ, the interconnection will be called robustly stable. Uniform robust stability then just means that the (incremental) L2-gain of I_M(Δ)^{-1} is uniformly bounded in Δ ∈ Δ, i.e., the bound does not depend on Δ ∈ Δ.

In this book we provide sufficient conditions for the following precise stability properties:

• Under the hypothesis that I_M(Δ) is well-posed, guarantee that there exists a constant c with ‖I_M(Δ)^{-1}‖ ≤ c for all Δ ∈ Δ. In particular, I_M(Δ)^{-1} then has finite L2-gain for all Δ ∈ Δ.

• If M and all systems in Δ have finite incremental L2-gain, characterize that I_M(Δ) is well-posed and that there exists a c with ‖I_M(Δ)^{-1}‖i ≤ c for all Δ ∈ Δ. In particular, any I_M(Δ)^{-1} has a finite incremental L2-gain.

It is important to observe the fundamental difference in these two characterizations: In the first scenario one has to assume well-posedness and one concludes stability, whereas in the second both well-posedness and stability are concluded.

For systems described by differential equations, stability is often related to the state converging to zero for t → ∞. Continuing Example 7.9, let us now illustrate how to guarantee this attractiveness of the solutions of (nonlinear) differential equations by input-output stability.

Example 7.10 Let us consider the uncertain nonlinear system

ẋ = Ax + Bφ(Cx), x(t0) = x0 (7.2.7)

with t0 > 0, with A being Hurwitz and with the nonlinearity being continuously differentiable and satisfying |φ(x)| ≤ |x|. We are interested in the question of whether all trajectories of (7.2.7) converge to zero for t → ∞. This holds true if I_M(Δ)^{-1} as defined in Example 7.9 has a finite L2-gain. Why? Since A is stable, it suffices to prove the desired properties for x0 in the controllable


subspace of (A, B). Given any such x0, we can find win(.) on [0, t0) which steers the state trajectory of ẋ = Ax + Bwin from x(0) = 0 to x(t0) = x0. Define zin(t) := Cx(t) on [0, t0), and concatenate these external inputs with win(t) = 0, zin(t) = 0 on [t0, ∞), implying (win, zin) ∈ L2^{k+l}. At this point we exploit stability to infer that (w, z) = I_M(Δ)^{-1}(win, zin) is also contained in L2^{k+l}. Therefore (7.2.5) reveals that both x and ẋ are in L2, which implies lim_{t→∞} x(t) = 0. It is obvious by the concatenation construction that the responses of (7.2.6) and (7.2.7) coincide on [t0, ∞), which proves the desired stability property.

In this construction it is clear that the energy of win and hence also of zin can be bounded in terms of ‖x0‖. If I_M(Δ)^{-1} has finite L2-gain, we infer that the energy of the responses w and z is also bounded in terms of ‖x0‖. Due to ẋ = Ax + Bw, the same is true for ‖x‖2 and even for ‖x‖∞. These arguments reveal how to deduce Lyapunov stability. Uniformity with respect to the uncertainty is guaranteed by a uniform bound on the L2-gain of I_M(Δ)^{-1}.

Before we embark on a careful exposition of the corresponding two fundamental theorems, which are meant to summarize a variety of diverse results in the literature, we include a section with a preparatory discussion meant to clarify the main ideas and to technically simplify the proofs.

7.2.2 An Elementary Auxiliary Result

The well-posedness and stability results to be derived are based on separation. The main ideas are easily demonstrated in terms of mappings M and Δ defined through matrix multiplication. Writing [I, −Δ; M, −I] for the block matrix with rows (I, −Δ) and (M, −I), well-posedness is then related to whether

[I, −Δ; M, −I] is non-singular (7.2.8)

and stability translates into the existence of a bound c for the spectral norm of the inverse:

‖[I, −Δ; M, −I]^{-1}‖ ≤ c. (7.2.9)

Let us define the graph subspaces

U = im(I; M) and V = im(Δ; I)

of the mappings M and Δ, where (A; B) denotes the stacked matrix with block A on top of block B. It is obvious that (7.2.8) equivalently translates into the property that U and V only intersect in zero:

U ∩ V = {0}. (7.2.10)
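In the matrix case, the link between non-singularity as in (7.2.8), the trivial-intersection property (7.2.10), and the norm bound (7.2.9) can be confirmed numerically. A sketch with illustrative random data (our own choice), scaled so that the loop gain satisfies a small-gain condition and the interconnection matrix is guaranteed non-singular:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3
# Contractive M and Delta, so that ||M|| * ||Delta|| < 1 (small-gain).
M = rng.standard_normal((n, n)); M = 0.9 * M / np.linalg.norm(M, 2)
D = rng.standard_normal((n, n)); D = 0.9 * D / np.linalg.norm(D, 2)  # "Delta"
I = np.eye(n)

K = np.block([[I, -D], [M, -I]])   # the matrix in (7.2.8)
U = np.vstack([I, M])              # basis of U = im(I; M)
V = np.vstack([D, I])              # basis of V = im(Delta; I)

# (7.2.8) non-singular  <=>  U and V intersect only in zero:
# stacking the two bases side by side gives full column rank 2n.
assert np.linalg.matrix_rank(np.hstack([U, V])) == 2 * n
assert abs(np.linalg.det(K)) > 1e-9

# The stability bound (7.2.9): the spectral norm of the inverse.
c = np.linalg.norm(np.linalg.inv(K), 2)
assert np.isfinite(c) and c > 0
print("||K^-1|| =", c)
```

The columns of K are precisely a basis of U next to (a sign-flipped) basis of V, which is why the rank test and the determinant test agree.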

Let us now prove that (7.2.9) can be similarly characterized by U and V actually keeping nonzero distance. Indeed,

‖[I, −Δ; M, −I]^{-1}(win; zin)‖ ≤ c ‖(win; zin)‖ for all (win, zin)

is equivalent to

‖(w; z)‖² ≤ c² ‖[I, −Δ; M, −I](w; z)‖² for all (w, z). (7.2.11)

If we exploit the evident inequality

‖(w; z)‖² ≤ ‖(I; M)w‖² + ‖(Δ; I)z‖² ≤ max{1 + ‖M‖², 1 + ‖Δ‖²} ‖(w; z)‖²

and abbreviate γ² := max{1 + ‖M‖², 1 + ‖Δ‖²}, we observe that (7.2.11) implies

‖(I; M)w‖² + ‖(Δ; I)z‖² ≤ (cγ)² ‖(I; M)w − (Δ; I)z‖² for all (w, z)

and that (7.2.11) is implied by

‖(I; M)w‖² + ‖(Δ; I)z‖² ≤ c² ‖(I; M)w − (Δ; I)z‖² for all (w, z).

In terms of the subspaces U and V we conclude that (7.2.9) is guaranteed if

‖u‖² + ‖v‖² ≤ c² ‖u − v‖² for all u ∈ U, v ∈ V. (7.2.12)

(Conversely, (7.2.9) implies (7.2.12) with c replaced by cγ, but this will not be exploited in the sequel.) Clearly (7.2.12) strengthens (7.2.10) in that it requires all vectors u ∈ U and v ∈ V of length 1 to keep at least the distance √2/c. Readers familiar with the standard gap metric for subspaces will recognize that this just comes down to requiring U and V to have positive distance.

It is not exaggerated to state that the verifiable characterization of whether two sets are separated is at the heart of a huge variety of mathematical problems. For example, in convex analysis sets are separated by trying to locate them in opposite half-spaces. This can alternatively be expressed as trying to locate either set in the zero sub-level and super-level set of an affine function. As the reader can immediately verify, it is impossible to separate subspaces in this fashion. This motivates to try separation in exactly the same vein by quadratic functions instead. For the Euclidean subspaces U and V, quadratic separation then means to find a symmetric P such that uᵀPu ≤ 0 and vᵀPv ≥ 0 for all u ∈ U, v ∈ V. Strict quadratic separation is defined as the existence of ε > 0 for which

uᵀPu ≤ −ε‖u‖² and vᵀPv ≥ 0 for all u ∈ U, v ∈ V. (7.2.13)

It is the main purpose of this section to prove, for a considerably more general situation, that (7.2.13) does allow to compute a positive c for which (7.2.12) is satisfied. In the exercises the reader is asked to prove the converse for the specific matrix problem discussed so far.
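For the matrix problem just discussed, strict quadratic separation (7.2.13) and the distance bound (7.2.12) can be sampled numerically. A small-gain sketch with illustrative data of our own: for ‖M‖ < 1 and ‖Δ‖ ≤ 1, the separator P = diag(−I, I) satisfies uᵀPu ≤ −ε‖u‖² on U = im(I; M) with ε = (1 − ‖M‖²)/(1 + ‖M‖²), and vᵀPv ≥ 0 on V = im(Δ; I):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 3
M = rng.standard_normal((n, n)); M = 0.8 * M / np.linalg.norm(M, 2)
D = rng.standard_normal((n, n)); D = 0.9 * D / np.linalg.norm(D, 2)  # "Delta"

# Quadratic form (a; b) -> ||b||^2 - ||a||^2.
P = np.block([[-np.eye(n), np.zeros((n, n))],
              [np.zeros((n, n)), np.eye(n)]])
U = np.vstack([np.eye(n), M])     # u = (w; Mw)
V = np.vstack([D, np.eye(n)])     # v = (Delta z; z)

# Strict separation (7.2.13): U^T P U <= -eps * U^T U  and  V^T P V >= 0.
s = np.linalg.norm(M, 2)
eps = (1 - s**2) / (1 + s**2)
assert np.max(np.linalg.eigvalsh(U.T @ P @ U + eps * (U.T @ U))) <= 1e-9
assert np.min(np.linalg.eigvalsh(V.T @ P @ V)) >= -1e-9

# Sample the distance bound (7.2.12): the ratio stays bounded on random vectors.
ratio = 0.0
for _ in range(2000):
    u = U @ rng.standard_normal(n)
    v = V @ rng.standard_normal(n)
    ratio = max(ratio, (u @ u + v @ v) / max((u - v) @ (u - v), 1e-12))
assert np.isfinite(ratio) and ratio > 0
print("empirical c^2 ~", ratio)
```

The sampled ratio only gives a lower estimate of the best c² in (7.2.12); the point of Lemma 7.12 below is that (7.2.13) already guarantees such a finite constant.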

Let us turn to an abstraction. We assume U and V to be subsets (not necessarily subspaces) of a normed linear space X. Let us first fix what we mean by referring to a quadratic mapping.


Definition 7.11 Σ : X → R is said to be quadratic if

Σ(x) = ⟨x, x⟩

holds for some mapping ⟨., .⟩ : X × X → R that is additive in both arguments and bounded. ⟨., .⟩ is additive if

⟨x + y, z⟩ = ⟨x, z⟩ + ⟨y, z⟩, ⟨x, y + z⟩ = ⟨x, y⟩ + ⟨x, z⟩

and it is bounded if there exists a σ > 0 with

|⟨x, y⟩| ≤ σ‖x‖‖y‖

for all x, y, z ∈ X.

After all these preparations we now turn to the very simple quadratic separation result which will form the basis for the proofs of our main results in the next two sections.

Lemma 7.12 Suppose that Σ : X → R is quadratic, and that there exists an ε > 0 for which

Σ(u) ≤ −ε‖u‖² and Σ(v) ≥ 0 for all u ∈ U, v ∈ V. (7.2.14)

Then there exists a positive real number c (which only depends on ε and Σ) such that

‖u‖² + ‖v‖² ≤ c‖u − v‖² for all u ∈ U, v ∈ V. (7.2.15)

Proof. Let us choose α with 0 < α < ε. We have

Σ(x + u) − Σ(u) − α‖u‖² = ⟨x, x⟩ + ⟨x, u⟩ + ⟨u, x⟩ − α‖u‖² ≤ σ‖x‖² + 2σ‖x‖‖u‖ − α‖u‖² ≤ δ‖x‖²

where δ > 0 is chosen as max{σ + 2σt − αt² : t ∈ R}, such that the last inequality follows from

σ + 2σ‖u‖/‖x‖ − α‖u‖²/‖x‖² ≤ δ for ‖x‖ ≠ 0. Choosing x = v − u, we can conclude

Σ(v) − Σ(u) − α‖u‖² ≤ δ‖u − v‖².

We now exploit (7.2.14) to infer

(ε − α)‖u‖² ≤ δ‖u − v‖².

Recall that ε − α > 0, which motivates the choice of α. If we observe that ‖v‖² = ‖v − u + u‖² ≤ 2‖u − v‖² + 2‖u‖², we end up with

‖u‖² + ‖v‖² ≤ 3‖u‖² + 2‖u − v‖² ≤ ( 3δ/(ε − α) + 2 ) ‖u − v‖²

with c := 3δ/(ε − α) + 2.
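As a quick sanity check on the constants produced by this proof (with arbitrary illustrative values for σ, α, ε of our own choice): the maximum defining δ is attained at t = σ/α and equals σ + σ²/α, and the lemma's constant is c = 3δ/(ε − α) + 2.

```python
import numpy as np

sigma, alpha, eps = 2.0, 0.5, 1.0   # any 0 < alpha < eps and sigma > 0 work

# delta = max_t (sigma + 2*sigma*t - alpha*t^2), attained at t = sigma/alpha.
t = np.linspace(0.0, 50.0, 200001)
delta_grid = np.max(sigma + 2 * sigma * t - alpha * t**2)
delta = sigma + sigma**2 / alpha
assert abs(delta_grid - delta) < 1e-3

c = 3 * delta / (eps - alpha) + 2   # the constant from the proof of Lemma 7.12
assert c > 0
print("delta =", delta, " c =", c)
```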


Remark. Note that it is not really essential that Σ is quadratic. In fact, we only exploited that

(Σ(v) − Σ(u) − α‖u‖²) / ‖v − u‖² is bounded on {(u, v) : u ∈ U, v ∈ V, u ≠ v}

for some 0 < α < ε. The concept of quadratic continuity [?] (meaning that boundedness is guaranteed for all α > 0) has been introduced just for this purpose. Our preference for the more specific version is motivated by the robust stability tests that are amenable to computational techniques as discussed in later sections. Far more general results about topological separation to guarantee robust stability are discussed in the literature, and the interested reader is referred to [?, ?, ?].

7.2.3 An Abstract Stability Characterization

Let us now continue the discussion in Section 7.2.1 with our first fundamental result on guaranteeing stability under the assumption of well-posedness. More concretely, we assume that I_M(Δ)^{-1} defined on Win,e × Zin,e exists and is causal for all Δ ∈ Δ. The main point is to formulate sufficient conditions for this inverse to admit a uniform bound on its L2-gain. In view of our preparations it is not difficult to provide an abstract criterion on the basis of Lemma 7.12.

Theorem 7.13 Suppose that for all Δ ∈ Δ the map I_M(Δ) is well-posed, and (w, z) = I_M(Δ)^{-1}(win, zin) implies win + δΔ(z) ∈ Win,e for all δ ∈ [0, 1]. Let Σ : L2^{k+l} → R be quadratic, and suppose

Σ(Δ(z); z) ≥ 0 for all z ∈ Z, Δ ∈ Δ. (7.2.16)

If there exists an ε > 0 with

Σ(w; M(w)) ≤ −ε‖w‖² for all w ∈ W, (7.2.17)

there exists a constant c such that

‖I_M(Δ)^{-1}‖ ≤ c for all Δ ∈ Δ.

Proof. Fix an arbitrary Δ ∈ Δ and recall that τΔ ∈ Δ for all τ ∈ [0, 1]. Our first goal is to show that there exists c > 0 such that

‖x‖ ≤ c‖I_M(τΔ)(x)‖ for all x ∈ W × Z, τ ∈ [0, 1]. (7.2.18)

The construction will reveal that c depends neither on τ nor on Δ. Property (7.2.18) is proved by applying Lemma 7.12 to the sets

U := {(w; M(w)) : w ∈ W} and V := {(τΔ(z); z) : z ∈ Z}.


Due to (7.2.16) and τΔ ∈ Δ we certainly have Σ(v) ≥ 0 for all v ∈ V. Moreover, for all u ∈ U we infer with u = (w; M(w)) that

Σ(u) = Σ(w; M(w)) ≤ −ε‖w‖² ≤ −(ε / (1 + ‖M‖²)) ‖(w; M(w))‖² = −(ε / (1 + ‖M‖²)) ‖u‖²

due to ‖(w; M(w))‖² ≤ (1 + ‖M‖²)‖w‖². By Lemma 7.12 there exists c > 0 with

‖(w; M(w))‖² + ‖(τΔ(z); z)‖² ≤ c² ‖(w; M(w)) − (τΔ(z); z)‖²

for all w ∈ W and z ∈ Z. Since the left-hand side bounds ‖(w, z)‖² from above, we infer ‖x‖ ≤ c‖I_M(τΔ)(x)‖ for all x ∈ W × Z. By construction the constant c only depends on ε, ‖M‖, and Σ, which guarantees (7.2.18) uniformly in Δ ∈ Δ.

If we knew that I_M(τΔ)^{-1} mapped Win × Zin into W × Z (and not just into We × Ze as guaranteed by hypothesis), the proof would be finished, since then (7.2.18) just implies that the gain of I_M(τΔ)^{-1} is bounded by c for all τ ∈ [0, 1], and hence also for τ = 1. To finish the proof it thus suffices to show that

I_M(τΔ)^{-1}(Win × Zin) ⊂ W × Z (7.2.19)

for all τ ∈ [0, 1]. Let us assume that this is true for some τ0 ∈ [0, 1]. For τ > τ0 take y = (win, zin) ∈ Win × Zin and set xτ = I_M(τΔ)^{-1}(y) ∈ We × Ze. With xτ = (wτ, zτ) we have

I_M(τ0Δ)(xτ) = [I_M(τ0Δ)(xτ) − I_M(τΔ)(xτ)] + I_M(τΔ)(xτ) = ((τ − τ0)Δ(zτ); 0) + (win; zin).

We now exploit the extra hypothesis (applied for τΔ(zτ) and δ = (τ − τ0)/τ ∈ [0, 1]) to infer that (τ − τ0)Δ(zτ) + win ∈ Win,e. Hence the right-hand side is contained in the domain of definition of I_M(τ0Δ)^{-1} such that

xτ = I_M(τ0Δ)^{-1}[ ((τ − τ0)Δ(zτ); 0) + y ].

Since the L2-gain of I_M(τ0Δ)^{-1} is bounded by c we obtain

‖PT xτ‖ ≤ c‖PT (τ − τ0)Δ(zτ)‖ + c‖PT y‖ ≤ c|τ − τ0| ‖Δ‖ ‖PT xτ‖ + c‖PT y‖.

Together with ‖PT y‖ ≤ ‖y‖ we arrive at

(1 − c|τ − τ0| ‖Δ‖) ‖PT xτ‖ ≤ c‖y‖.

For all τ ∈ [τ0, 1] with c|τ − τ0| ‖Δ‖ < 1, we can hence infer that ‖PT xτ‖ is bounded for all T ≥ 0, which actually implies xτ ∈ L2^{k+l} and hence xτ ∈ W × Z.

We have thus proved: If (7.2.19) holds for τ0 ∈ [0, 1], then it also holds automatically for all τ ∈ [0, 1] ∩ [τ0, τ0 + δ0] if 0 < δ0 < 1/(c‖Δ‖). Starting with τ0 = 0, we conclude by induction that (7.2.19) holds for all τ ∈ [0, 1] ∩ [0, kδ0] for k = 0, 1, 2, ... and thus, with sufficiently large k > 0, indeed for all τ ∈ [0, 1].


Example 7.14 Let us continue Example 7.9, where we have seen that Δ is star-shaped with star-center zero. Well-posedness has been verified, where we observe that the interconnection response (w, z) is piecewise right-continuous for (win, zin) ∈ Win,e × Zin,e. We infer Δ(z) ∈ Win,e and thus win + δΔ(z) ∈ Win,e for all δ ∈ [0, 1], just because Win,e is a linear space. Moreover, the condition |φ(z)| ≤ |z| combined with the sign condition zφ(z) ≥ 0 is equivalent to the sector condition

0 ≤ zφ(z) ≤ z² for all z ∈ R

which can be written as

(φ(z); z)ᵀ P (φ(z); z) ≥ 0 for all z ∈ R with P = [−2, 1; 1, 0]

(the quadratic form is 2φ(z)(z − φ(z))).

This motivates to define the quadratic function

Σ(w, z) := ∫₀^∞ (w(t); z(t))ᵀ P (w(t); z(t)) dt

on L2 × L2 to conclude that (7.2.16) is indeed satisfied. Now (7.2.17) simply amounts to requiring that M is strictly dissipative with respect to the quadratic supply rate defined with P, and this dissipativity guarantees that I_M(Δ)^{-1} has uniformly bounded L2-gain. If M is described by a linear time-invariant system, its dissipativity with respect to P can be numerically verified by translation into a feasibility test for a suitably defined linear matrix inequality.
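To make the content of this dissipativity test concrete: with the sector multiplier P = [−2, 1; 1, 0], condition (7.2.17) for an LTI system with transfer function G(s) = C(sI − A)⁻¹B amounts in the frequency domain to sup_ω Re G(jω) < 1, a circle-criterion-type condition; the exact certificate would be the corresponding KYP-lemma LMI. A sketch sampling this condition on a grid, with illustrative data of our own (here G(s) = 0.5/((s + 1)(s + 3))):

```python
import numpy as np

# Illustrative stable SISO system: G(s) = 0.5 / ((s + 1)(s + 3)).
A = np.array([[-1.0, 1.0], [0.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[0.5, 0.0]])

# A grid only approximates sup_w Re G(jw); an LMI via the KYP lemma is exact.
omegas = np.logspace(-3, 3, 4000)
re_G = np.array([((C @ np.linalg.inv(1j * w * np.eye(2) - A) @ B)[0, 0]).real
                 for w in omegas])
margin = 1.0 - re_G.max()
assert margin > 0  # dissipativity margin => uniformly bounded gain of IM(Delta)^-1
print("sup Re G(jw) ~", re_G.max(), " margin ~", margin)
```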

The example demonstrates how to actually apply Theorem 7.13 in concrete situations, and how dissipativity will result in computational LMI tests for robust stability. It also illustrates how to verify the rather abstract hypotheses and that they often simplify considerably in concrete situations. We will explicitly formulate such specializations in later sections.

Remarks.

• Since we allow for general domains of definition that are just subsets of L2 or L2e, we require the extra condition that win + δΔ(z) = (1 − δ)win + δw ∈ Win,e for all δ ∈ [0, 1] and all interconnection responses (w, z) = I_M(Δ)^{-1}(win, zin) with (win, zin) ∈ Win × Zin. This technical property is often simple to verify. For example it holds if We equals Win,e and both are convex, or if Δ(Z) ⊂ Win,e and Win,e is a subspace of L2e^k. In subsequent discussions M will be taken linear, such that it is natural to assume that its domain of definition We is a subspace of L2e^k and to take Win,e = We.

• A careful analysis of the proof reveals that (7.2.16)-(7.2.17) have to be guaranteed only for all interconnection responses (w, z) = I_M(Δ)^{-1}(win, zin) with (win, zin) ∈ Win × Zin. This insight might simplify the actual verification of the two quadratic inequalities.

• The proof of the theorem proceeds via a homotopy argument: The L2-gain of I_M(0)^{-1} is finite. Then one proves that I_M(Δ)^{-1} has finite L2-gain by showing that the gain ‖I_M(τΔ)^{-1}‖ is bounded by c, which prevents it from blowing up as τ moves from zero to one. It is hence not difficult to obtain similar stability results if replacing the line segment [0, 1] ∋ τ → τΔ by any continuous curve γ : [0, 1] → Δ connecting 0 and Δ as γ(0) = 0, γ(1) = Δ. Instead of being star-shaped, it then suffices to require that Δ contains 0 and is path-wise connected. In view of our detailed proofs the advanced reader should be in the position to generalize our arguments to more advanced stability questions that are not covered directly by Theorem 7.13.

• One can easily avoid homotopy arguments and directly guarantee that I_M(Δ)^{-1} has finite L2-gain for a fixed Δ ∈ Δ with the quadratic inequalities

Σ(PT Δ(z); PT z) ≥ 0 for all z ∈ Ze, T ≥ 0

and

Σ(PT w; PT M(w)) ≤ −ε‖PT w‖² for all w ∈ We, T ≥ 0

for some ε > 0. It even suffices to ask these properties only for interconnection responses (w, z) = I_M(Δ)^{-1}(win, zin) with (win, zin) ∈ Win × Zin. Due to Lemma 7.12 the corresponding proof is so simple that it can be left as an exercise.

7.2.4 A Characterization of Well-Posedness and Stability

The goal of this section is to get rid of the hypothesis that I_M(Δ) has a causal inverse. The price to be paid is the requirement that all mappings have finite incremental L2-gain, and to replace the crucial positivity and negativity conditions in (7.2.16) and (7.2.17) by their incremental versions. The main result will guarantee a uniform bound on the incremental L2-gain of I_M(Δ)^{-1}. Technically, we will apply Banach's fixed point theorem, for which we require completeness of W × Z as a subset of L2^{k+l}.

Theorem 7.15 Suppose there exists a δ0 ∈ (0, 1] such that Win + δΔ(Z) ⊂ Win,e for all δ ∈ [0, δ0] and all Δ ∈ Δ. Let all Δ ∈ Δ have finite incremental L2-gain, and let Σ : L2^{k+l} → R be quadratic such that

Σ(Δ(z1) − Δ(z2); z1 − z2) ≥ 0 for all z1, z2 ∈ Z. (7.2.20)

Moreover, let M have finite incremental L2-gain and let there exist an ε > 0 such that

Σ(w1 − w2; M(w1) − M(w2)) ≤ −ε‖w1 − w2‖² for all w1, w2 ∈ W. (7.2.21)

If W × Z is a closed subset of L2^{k+l}, then I_M(Δ) has a causal inverse and there exists a constant c > 0 with

‖I_M(Δ)^{-1}‖i ≤ c for all Δ ∈ Δ.


Proof. Let us fix Δ ∈ Δ.

First step. Similarly as in the proof of Theorem 7.13, the first step is to show the existence of some c > 0 with

‖x1 − x2‖² ≤ c² ‖I_M(τΔ)(x1) − I_M(τΔ)(x2)‖² for all x1, x2 ∈ W × Z, τ ∈ [0, 1] (7.2.22)

where c is independent of Δ ∈ Δ. For that purpose we now choose

U := {(w1 − w2; M(w1) − M(w2)) : w1, w2 ∈ W}, V := {(τΔ(z1) − τΔ(z2); z1 − z2) : z1, z2 ∈ Z}.

For v ∈ V we have Σ(v) ≥ 0 by (7.2.20). For u ∈ U we conclude with u = (w1 − w2; M(w1) − M(w2)) from (7.2.21) that

Σ(u) ≤ −ε‖w1 − w2‖² ≤ −(ε / (1 + ‖M‖i²)) ‖(w1 − w2; M(w1) − M(w2))‖² = −(ε / (1 + ‖M‖i²)) ‖u‖²

due to ‖(w1 − w2; M(w1) − M(w2))‖² ≤ (1 + ‖M‖i²)‖w1 − w2‖². Hence Lemma 7.12 implies the existence of c > 0 (only depending on ε, ‖M‖i, Σ) such that

‖(w1 − w2; M(w1) − M(w2))‖² + ‖(τΔ(z1) − τΔ(z2); z1 − z2)‖² ≤ c² ‖(w1 − w2; M(w1) − M(w2)) − (τΔ(z1) − τΔ(z2); z1 − z2)‖²

for all (wj, zj) ∈ W × Z. Since the left-hand side bounds ‖w1 − w2‖² + ‖z1 − z2‖² from above, we arrive at (7.2.22).

Second step. Suppose that, for τ0 ∈ [0, 1], I_M(τ0Δ) has an inverse defined on Win × Zin. Let us then prove that

I_M(τΔ) has an inverse defined on Win × Zin for all τ ∈ [0, 1] ∩ [τ0, τ0 + δ1] (7.2.23)

with 0 < δ1 < min{δ0, (c‖Δ‖i)^{-1}}. Let us first observe that (7.2.22) together with the existence of the inverse imply

‖I_M(τ0Δ)^{-1}(y1) − I_M(τ0Δ)^{-1}(y2)‖² ≤ c² ‖y1 − y2‖² for all y1, y2 ∈ Win × Zin. (7.2.24)

To verify that I_M(τΔ) has an inverse amounts to checking that for all y ∈ Win × Zin there is a unique x ∈ W × Z with I_M(τΔ)(x) = y. This relation can be rewritten as I_M(τ0Δ)(x) = y − I_M(τΔ)(x) + I_M(τ0Δ)(x) or as the fixed-point equation

x = I_M(τ0Δ)^{-1}( y − I_M(τΔ)(x) + I_M(τ0Δ)(x) ) =: F(x). (7.2.25)

Hence we need to guarantee the existence of a unique x ∈ W × Z with F(x) = x, and we simply apply Banach's fixed-point theorem. Recall that W × Z is complete. With y = (win, zin), x = (w, z) we have

y − I_M(τΔ)(x) + I_M(τ0Δ)(x) = (win + (τ − τ0)Δ(z); zin). (7.2.26)


For τ ∈ [τ0, τ0 + δ1], we can exploit the hypothesis to infer that Win + (τ − τ0)Δ(Z) ⊂ Win,e and thus the right-hand side of (7.2.26) is contained in Win,e × Zin; since Δ(Z) ⊂ L2^k it is even contained in Win × Zin. This reveals that F maps W × Z into W × Z. Finally, using (7.2.24) and with xj = (wj, zj) we conclude

‖F(x1) − F(x2)‖² ≤ c² ‖[y − I_M(τΔ)(x1) + I_M(τ0Δ)(x1)] − [y − I_M(τΔ)(x2) + I_M(τ0Δ)(x2)]‖²
≤ c² ‖[I_M(τ0Δ)(x1) − I_M(τΔ)(x1)] − [I_M(τ0Δ)(x2) − I_M(τΔ)(x2)]‖²
≤ c² ‖((τ − τ0)Δ(z1); 0) − ((τ − τ0)Δ(z2); 0)‖²
≤ c² |τ − τ0|² ‖Δ(z1) − Δ(z2)‖² ≤ c² |τ − τ0|² ‖Δ‖i² ‖z1 − z2‖² ≤ c² |τ − τ0|² ‖Δ‖i² ‖x1 − x2‖².

For τ ∈ [τ0, τ0 + δ1] we have c²|τ − τ0|² ‖Δ‖i² < 1, such that F is a strict contraction. All this guarantees that F has a unique fixed point in W × Z.

Third step. Obviously, I_M(τΔ) does have an inverse for τ = 0. Hence this mapping has an inverse for τ ∈ [0, 1] ∩ [0, δ1] and, by induction, for τ ∈ [0, 1] ∩ [0, kδ1] with k = 0, 1, 2, .... For sufficiently large k we conclude that I_M(Δ) has an inverse. By (7.2.23) it is clear that the incremental gain of this inverse is bounded by c. Then it is a simple exercise to prove that the causal mapping I_M(Δ) has a causal inverse I_M(Δ)^{-1} defined on Win,e × Zin,e with the same bound c on its incremental L2-gain.
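The fixed-point construction (7.2.25) is easy to exercise numerically in the matrix setting of Section 7.2.2 (illustrative data, our own choice): with K(τ) = [I, −τΔ; M, −I], iterating x ← K(τ0)⁻¹(y − K(τ)x + K(τ0)x) converges to the solution of K(τ)x = y when |τ − τ0| is small enough for the map to be a contraction.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 2
M = rng.standard_normal((n, n)); M = 0.3 * M / np.linalg.norm(M, 2)
D = rng.standard_normal((n, n)); D = 0.3 * D / np.linalg.norm(D, 2)  # "Delta"
I = np.eye(n)

def K(tau):  # interconnection matrix with the scaled uncertainty tau * Delta
    return np.block([[I, -tau * D], [M, -I]])

tau0, tau = 0.0, 0.3
y = rng.standard_normal(2 * n)
K0_inv = np.linalg.inv(K(tau0))

# Banach iteration for (7.2.25); contraction factor ~ |tau - tau0| ||Delta|| ||K0^-1||.
x = np.zeros(2 * n)
for _ in range(200):
    x = K0_inv @ (y - K(tau) @ x + K(tau0) @ x)

assert np.allclose(K(tau) @ x, y, atol=1e-8)       # x solves K(tau) x = y
assert np.allclose(x, np.linalg.solve(K(tau), y))  # agrees with a direct solve
```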

Remarks.

• The technical property Win + δΔ(Z) ⊂ Win,e for δ ∈ [0, δ0] is certainly true if Δ maps Z into Win,e and Win,e is a linear space. For linear M defined on a linear space We and the choice Win,e = We, the uncertainties have to map Ze into We. We stress that (7.2.21) and (7.2.17) are then obviously equivalent.

• Even when dealing with nonlinear uncertainties Δ, they often are guaranteed to have the property Δ(0) = 0. Then we infer that I_M(Δ)(0) = 0 such that the same must hold for its inverse. Therefore

‖I_M(Δ)^{-1}‖ ≤ ‖I_M(Δ)^{-1}‖i

and Theorem 7.15 guarantees a bound on the L2-gain of the inverse as well.

• If the uncertainties Δ ∈ Δ are linear and defined on the linear space Ze, (7.2.20) is equivalent to (7.2.16), and I_M(Δ) as well as its inverse (if existing) are linear. Since uncertainties could be defined by infinite-dimensional systems (partial differential equations), even in this simplified situation Theorem 7.15 might be extremely helpful in guaranteeing well-posedness.

• We have made use of a rather straightforward application of the global version of Banach's fixed-point theorem. As variants one could work with well-established local versions thereof. We finally also stress for Theorem 7.15 that it might not be directly applicable to specific practical problems, but the techniques of proof might be successfully modified along the lines of the rather detailed exposition of the basic ingredients.


However, one can also change the viewpoint: Given a quadratic mapping Σ, define the class of uncertainties Δ as those that satisfy (7.2.16). Then all systems that have the property (7.2.17) cannot be destabilized by this class of uncertainties. Classical small-gain and passivity theorems fall into this class, as will be discussed in Section ??.


Chapter 8

Robust controller synthesis

8.1 Robust controller design

So far we have presented techniques to design controllers for nominal stability and nominal performance. Previous chapters have been devoted to a thorough discussion of how to analyze, for a fixed stabilizing controller, robust stability or robust performance. For time-invariant or time-varying parametric uncertainties, we have seen direct tests formulated as searching for constant or parameter-dependent quadratic Lyapunov functions. For much larger classes of uncertainties, we have derived tests in terms of integral quadratic constraints (IQCs) that involve additional variables which have been called scalings or multipliers.

Typically, only those IQC tests with a class of multipliers that admits a state-space description as discussed in Sections ??-?? of Chapter 4 are amenable to a systematic output-feedback controller design procedure which is reminiscent of the D/K-iteration in µ-theory. This will be the first subject of this chapter.

In a second section we consider as a particular information structure the robust state-feedback de-sign problem. We will reveal that the search for static state-feedback gains which achieve robustperformance can be transformed into a convex optimization problem.

The discussion is confined to the quadratic performance problem since most results can be extendedin a pretty straightforward fashion to the other specifications considered in these notes.


8.1.1 Robust output-feedback controller design

If characterizing robust performance by an IQC, the goal in robust design is to find a controller and a multiplier such that, for the closed-loop system, the corresponding IQC test is satisfied. Hence the multiplier appears as an extra unknown, which makes the problem hard if not impossible to solve.

However, if the multiplier is held fixed, searching for a controller amounts to a nominal design problem that can be approached with the techniques described earlier. If the controller is held fixed, the analysis techniques presented in Chapter 7 can be used to find a suitable multiplier. Hence, instead of trying to search for a controller and a multiplier jointly, one iterates between the search for a controller with fixed multiplier and the search for a multiplier with fixed controller. This procedure is known from µ-theory as scalings/controller iteration or D/K iteration.

To be more concrete, we consider the specific example of achieving robust quadratic performance against time-varying parametric uncertainties as discussed in Section ??.

The uncontrolled unperturbed system is described by (4.1.1). We assume that w1 → z1 is the uncertainty channel, and the uncontrolled uncertain system is described by including

w1(t) = Δ(t)z1(t)

where Δ(.) varies in the set of continuous curves satisfying

Δ(t) ∈ Δc := conv{Δ1, ..., ΔN} for all t ≥ 0.

We assume (w.l.o.g.) that

0 ∈ conv{Δ1, ..., ΔN}.

The performance channel is assumed to be given by w2 → z2, and the performance index

Pp = [Qp, Sp; Spᵀ, Rp], Rp ≥ 0, with the inverse Pp^{-1} = [Q̃p, S̃p; S̃pᵀ, R̃p], Q̃p ≤ 0

is used to define the quadratic performance specification

∫₀^∞ (w2(t); z2(t))ᵀ Pp (w2(t); z2(t)) dt ≤ −ε‖w2‖².

The goal is to design a controller that achieves robust stability and robust quadratic performance. We can guarantee both properties by finding a controller, a Lyapunov matrix X, and a multiplier

P = [Q, S; Sᵀ, R], Q < 0, (Δj; I)ᵀ [Q, S; Sᵀ, R] (Δj; I) > 0 for all j = 1, ..., N (8.1.1)


that satisfy the inequalities

X > 0,

[ I    0    0  ]ᵀ [ 0  I  0   0   0   0  ] [ I    0    0  ]
[ XA  XB1  XB2 ]  [ I  0  0   0   0   0  ] [ XA  XB1  XB2 ]
[ 0    I    0  ]  [ 0  0  Q   S   0   0  ] [ 0    I    0  ]
[ C1   D1  D12 ]  [ 0  0  Sᵀ  R   0   0  ] [ C1   D1  D12 ]  < 0.
[ 0    0    I  ]  [ 0  0  0   0   Qp  Sp ] [ 0    0    I  ]
[ C2  D21   D2 ]  [ 0  0  0   0   Spᵀ Rp ] [ C2  D21   D2 ]

(Recall that the condition on the left-upper block of P can be relaxed in particular cases, which could reduce the conservatism of the test.)

If we apply the controller parameter transformation of Chapter ??, we arrive at the synthesis matrix inequalities

X(v) > 0,

(∗)ᵀ [ 0  I  0   0   0   0  ] [ I      0      0    ]
     [ I  0  0   0   0   0  ] [ A(v)  B1(v)  B2(v) ]
     [ 0  0  Q   S   0   0  ] [ 0      I      0    ]
     [ 0  0  Sᵀ  R   0   0  ] [ C1(v) D1(v) D12(v) ]  < 0.
     [ 0  0  0   0   Qp  Sp ] [ 0      0      I    ]
     [ 0  0  0   0   Spᵀ Rp ] [ C2(v) D21(v) D2(v) ]

Unfortunately, there is no obvious way to render these synthesis inequalities convex in all variables v, Q, S, R.

This is the reason why we consider, instead, the problem with a scaled uncertainty

w1(t) = [r Δ(t)]z1(t), Δ(t) ∈ Δc (8.1.2)

where the scaling factor r is contained in the interval [0, 1]. Due to

(rΔ; I)ᵀ [Q, rS; rSᵀ, r²R] (rΔ; I) = r² (Δ; I)ᵀ [Q, S; Sᵀ, R] (Δ; I),

we conclude that the corresponding analysis and synthesis inequalities are given by (8.1.1) and

X > 0,

[ I    0    0  ]ᵀ [ 0  I   0     0    0   0  ] [ I    0    0  ]
[ XA  XB1  XB2 ]  [ I  0   0     0    0   0  ] [ XA  XB1  XB2 ]
[ 0    I    0  ]  [ 0  0   Q    rS    0   0  ] [ 0    I    0  ]
[ C1   D1  D12 ]  [ 0  0  rSᵀ  r²R    0   0  ] [ C1   D1  D12 ]  < 0   (8.1.3)
[ 0    0    I  ]  [ 0  0   0     0   Qp   Sp ] [ 0    0    I  ]
[ C2  D21   D2 ]  [ 0  0   0     0   Spᵀ  Rp ] [ C2  D21   D2 ]


or

X(v) > 0,
\[
(\ast)^T
\begin{pmatrix}
0 & I & 0 & 0 & 0 & 0\\
I & 0 & 0 & 0 & 0 & 0\\
0 & 0 & Q & rS & 0 & 0\\
0 & 0 & rS^T & r^2 R & 0 & 0\\
0 & 0 & 0 & 0 & Q_p & S_p\\
0 & 0 & 0 & 0 & S_p^T & R_p
\end{pmatrix}
\begin{pmatrix}
I & 0 & 0\\
A(v) & B_1(v) & B_2(v)\\
0 & I & 0\\
C_1(v) & D_1(v) & D_{12}(v)\\
0 & 0 & I\\
C_2(v) & D_{21}(v) & D_2(v)
\end{pmatrix} < 0.
\tag{8.1.4}
\]

For r = 0 we hence have to solve the nominal quadratic performance synthesis inequalities. If they are not solvable, the robust quadratic performance synthesis problem is not solvable either, and we can stop. If they are solvable, the idea is to increase the parameter r from zero to one while keeping the synthesis inequalities feasible. Increasing r is achieved by alternately maximizing r over v satisfying (8.1.4) (for fixed P) and by varying X and P in (8.1.3) (for a fixed controller).

The maximization of r proceeds along the following steps:

Initialization. Perform a nominal quadratic performance design by solving (8.1.4) for r = 0. Proceed if these inequalities are feasible and compute a corresponding controller.

After this initial phase, the iteration is started. The (j−1)-st step of the iteration leads to a controller, a Lyapunov matrix X, and a multiplier P that satisfy the inequalities (8.1.1) and (8.1.3) for the parameter r = r_{j−1}. Then it proceeds as follows:

First step: Fix the controller and maximize r by varying the Lyapunov matrix X and the scaling P such that (8.1.1) and (8.1.3) hold. The maximal radius is denoted as r̂_j, and it satisfies r_{j−1} ≤ r̂_j.

Second step: Fix the resulting scaling P and find the largest r by varying the variables v in (8.1.4). The obtained maximum r_j clearly satisfies r̂_j ≤ r_j.

The iteration defines a sequence of radii

r1 ≤ r2 ≤ r3 ≤ · · ·

and corresponding controllers that guarantee robust stability and robust quadratic performance for all uncertainties (8.1.2) with radius r = r_j.

If we are in the lucky situation that there is an index for which r_j ≥ 1, the corresponding controller is robustly performing for all uncertainties with values in Δ_c as desired, and we are done. However, if r_j < 1 for all indices, we cannot guarantee robust performance for r = 1, but we still have a guarantee of robust performance for r = r_j!

Before entering a brief discussion of this procedure, let us include the following remarks on the start-up and on the computations. If the nominal performance synthesis problem has a solution, the LMIs (8.1.1)-(8.1.3) do have a solution X and P for the resulting controller and for some (possibly small) r > 0; this just follows by continuity. Hence the iteration does not get stuck after the first step. Secondly, for a fixed r, the first step of the iteration amounts to solving an analysis problem, and finding a solution v of (8.1.4) can be converted to an LMI problem. Therefore, the maximization of r can be performed by bisection.
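The bisection over r can be sketched as follows. Here `feasible` is a hypothetical stand-in for the LMI solver call that checks the inequalities for a fixed r; feasibility is assumed to be monotone in r (feasible for r implies feasible for every smaller radius):

```python
def maximize_r(feasible, r_max=1.0, tol=1e-4):
    """Bisection for the largest r in [0, r_max] with feasible(r) True.

    `feasible` is a hypothetical oracle standing in for an SDP/LMI
    solver that checks (8.1.3) for the given radius r.
    """
    if not feasible(0.0):
        return None            # nominal problem unsolvable: stop
    if feasible(r_max):
        return r_max           # full radius already achieved
    lo, hi = 0.0, r_max
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if feasible(mid):
            lo = mid           # mid achievable: raise the lower bound
        else:
            hi = mid           # mid infeasible: lower the upper bound
    return lo

# Toy oracle: pretend the LMIs are feasible exactly for r <= 0.62.
r_star = maximize_r(lambda r: r <= 0.62)
```

In the actual iteration, `feasible` would alternate between the analysis problem (fixed controller) and the synthesis problem (fixed multiplier).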

Even if the inequalities (8.1.1)-(8.1.4) are solvable for r = 1, it can happen that the limit of r_j is smaller than one. As a remedy, one could consider another parameter to maximize, or one could modify the iteration scheme that has been sketched above. For example, it is possible to take the fine structure of the involved functions into account and to suggest other variable combinations that render the resulting iteration steps convex. Unfortunately, one cannot give general recommendations for modifications which guarantee success.

Remark. It should be noted that the controller/multiplier iteration can be extended to all robust performance tests that are based on families of dynamic IQCs which are described by real rational multipliers. Technically, one just requires a parametrization of the multipliers such that the corresponding analysis test (for a fixed controller) and the controller synthesis (for a fixed multiplier) both reduce to solving standard LMI problems.

8.1.2 Robust state-feedback controller design

For the same set-up as in the previous section we consider the corresponding synthesis problem if the state of the underlying system is measurable. According to our discussion in Section 4.6, the resulting synthesis inequalities read as

Q < 0,
\[
\begin{pmatrix} \Delta_j \\ I \end{pmatrix}^T
\begin{pmatrix} Q & S \\ S^T & R \end{pmatrix}
\begin{pmatrix} \Delta_j \\ I \end{pmatrix} > 0
\ \text{ for all } j = 1, \dots, N
\]

and

Y > 0,
\[
(\ast)^T
\begin{pmatrix}
0 & I & 0 & 0 & 0 & 0\\
I & 0 & 0 & 0 & 0 & 0\\
0 & 0 & Q & S & 0 & 0\\
0 & 0 & S^T & R & 0 & 0\\
0 & 0 & 0 & 0 & Q_p & S_p\\
0 & 0 & 0 & 0 & S_p^T & R_p
\end{pmatrix}
\begin{pmatrix}
I & 0 & 0\\
AY + BM & B_1 & B_2\\
0 & I & 0\\
C_1 Y + E_1 M & D_1 & D_{12}\\
0 & 0 & I\\
C_2 Y + E_2 M & D_{21} & D_2
\end{pmatrix} < 0
\]

in the variables Y, M, Q, S, R.

In this form these inequalities are not convex. However, we can apply the Dualization Lemma (Section 4.5.1) to arrive at the equivalent inequalities

R > 0,
\[
\begin{pmatrix} I \\ -\Delta_j^T \end{pmatrix}^T
\begin{pmatrix} Q & S \\ S^T & R \end{pmatrix}
\begin{pmatrix} I \\ -\Delta_j^T \end{pmatrix} < 0
\ \text{ for all } j = 1, \dots, N
\]


and Y > 0,
\[
(\ast)^T
\begin{pmatrix}
0 & I & 0 & 0 & 0 & 0\\
I & 0 & 0 & 0 & 0 & 0\\
0 & 0 & Q & S & 0 & 0\\
0 & 0 & S^T & R & 0 & 0\\
0 & 0 & 0 & 0 & Q_p & S_p\\
0 & 0 & 0 & 0 & S_p^T & R_p
\end{pmatrix}
\begin{pmatrix}
-(AY + BM)^T & -(C_1 Y + E_1 M)^T & -(C_2 Y + E_2 M)^T\\
I & 0 & 0\\
-B_1^T & -D_1^T & -D_{21}^T\\
0 & I & 0\\
-B_2^T & -D_{12}^T & -D_2^T\\
0 & 0 & I
\end{pmatrix} > 0
\]

in the variables Y, M, Q, S, R. It turns out that these dual inequalities are all affine in the unknowns. Testing feasibility hence amounts to solving a standard LMI problem. If the LMIs are feasible, a robust static state-feedback gain is given by D_c = MY^{-1}. This is one of the very few lucky instances in the world of designing robust controllers!
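The last step can be illustrated numerically. The matrices Y and M below are hypothetical placeholders for the output of an LMI solver applied to the dual synthesis inequalities; the sketch only shows how the gain is recovered and how positive definiteness of Y can be certified:

```python
import numpy as np

# Hypothetical output of an LMI solver for the dual synthesis
# inequalities: Y must be positive definite, M has the row dimension
# of the input and the column dimension of the state.
Y = np.array([[2.0, 0.5],
              [0.5, 1.0]])
M = np.array([[0.3, -0.4]])

# Y > 0 can be certified by a Cholesky factorization (it raises
# LinAlgError if Y is not positive definite) ...
np.linalg.cholesky(Y)

# ... and the robust static state-feedback gain is recovered as
# Dc = M Y^{-1}, i.e. the state feedback u = Dc x.
Dc = M @ np.linalg.inv(Y)
```

By construction, Dc satisfies Dc Y = M, which is exactly the substitution M = D_c Y used in the synthesis inequalities.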

8.1.3 Affine parameter dependence

Let us finally consider the system
\[
\begin{pmatrix} \dot x \\ z \\ y \end{pmatrix} =
\begin{pmatrix}
A(\Delta(t)) & B_1(\Delta(t)) & B(\Delta(t))\\
C_1(\Delta(t)) & D(\Delta(t)) & E(\Delta(t))\\
C(\Delta(t)) & F(\Delta(t)) & 0
\end{pmatrix}
\begin{pmatrix} x \\ w \\ u \end{pmatrix},
\quad \Delta(t) \in \operatorname{conv}\{\Delta_1, \dots, \Delta_N\}
\]

where the describing matrices depend affinely on the time-varying parameters. If designing output-feedback controllers, there is no systematic alternative to pulling out the uncertainties and applying the scaling techniques as in Section 8.1.1.

For robust state-feedback design there is an alternative without scalings. One just needs to directly solve the system of LMIs

Y > 0,
\[
(\ast)^T
\begin{pmatrix}
0 & I & 0 & 0\\
I & 0 & 0 & 0\\
0 & 0 & Q_p & S_p\\
0 & 0 & S_p^T & R_p
\end{pmatrix}
\begin{pmatrix}
I & 0\\
A(\Delta_j)Y + B(\Delta_j)M & B_1(\Delta_j)\\
0 & I\\
C_1(\Delta_j)Y + E(\Delta_j)M & D(\Delta_j)
\end{pmatrix} < 0,
\quad j = 1, \dots, N
\tag{8.1.5}
\]
in the variables Y and M.

For the controller gain D_c = MY^{-1} we obtain

Y > 0,
\[
(\ast)^T
\begin{pmatrix}
0 & I & 0 & 0\\
I & 0 & 0 & 0\\
0 & 0 & Q_p & S_p\\
0 & 0 & S_p^T & R_p
\end{pmatrix}
\begin{pmatrix}
I & 0\\
(A(\Delta_j) + B(\Delta_j)D_c)Y & B_1(\Delta_j)\\
0 & I\\
(C_1(\Delta_j) + E(\Delta_j)D_c)Y & D(\Delta_j)
\end{pmatrix} < 0,
\quad j = 1, \dots, N
\]


A convexity argument leads to

Y > 0,
\[
(\ast)^T
\begin{pmatrix}
0 & I & 0 & 0\\
I & 0 & 0 & 0\\
0 & 0 & Q_p & S_p\\
0 & 0 & S_p^T & R_p
\end{pmatrix}
\begin{pmatrix}
I & 0\\
(A(\Delta(t)) + B(\Delta(t))D_c)Y & B_1(\Delta(t))\\
0 & I\\
(C_1(\Delta(t)) + E(\Delta(t))D_c)Y & D(\Delta(t))
\end{pmatrix} < 0
\]

for all parameter curves Δ(t) ∈ conv{Δ_1, ..., Δ_N}, and we can perform a congruence transformation as in Section 4.6 to get

X > 0,
\[
(\ast)^T
\begin{pmatrix}
0 & I & 0 & 0\\
I & 0 & 0 & 0\\
0 & 0 & Q_p & S_p\\
0 & 0 & S_p^T & R_p
\end{pmatrix}
\begin{pmatrix}
I & 0\\
X(A(\Delta(t)) + B(\Delta(t))D_c) & XB_1(\Delta(t))\\
0 & I\\
C_1(\Delta(t)) + E(\Delta(t))D_c & D(\Delta(t))
\end{pmatrix} < 0.
\]

These two inequalities imply, in turn, robust exponential stability and robust quadratic performance for the controlled system as seen in Section ??.

We have proved that it suffices to directly solve the LMIs (8.1.5) to compute a robust static state-feedback controller. Hence, if the system's parameter dependence is affine, we have found two equivalent sets of synthesis inequalities that differ in the number of the involved variables and in the sizes of the LMIs that are involved. In practice, the correct choice is dictated by whichever system can be solved faster, more efficiently, or numerically more reliably.

Remark. Here is the reason why it is possible to directly solve the robust performance problem by state-feedback without scalings, and why this technique does, unfortunately, not extend to output-feedback control: The linearizing controller parameter transformation for state-feedback problems does not involve the matrices that describe the open-loop system, whereas that for output-feedback problems indeed depends on the matrices A, B, C of the open-loop system description.

Let us conclude this chapter by stressing, again, that these techniques find straightforward extensions to other performance specifications. As an exercise, the reader is asked to work out the details of the corresponding results for the robust H_2-synthesis problem by state- or output-feedback.

8.2 Exercises

Exercise 1

This is an exercise on robust control. To reduce the complexity of programming, we consider a non-dynamic system only.


Suppose you are given the algebraic uncertain system
\[
\begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \\ z \\ y_1 \\ y_2 \end{pmatrix} =
\begin{pmatrix}
0 & 1 & 0 & 1 & 1 & 1 & 0\\
0.5 & 0 & 0.5 & 0 & 1 & 0 & 1\\
2a & 0 & a & 0 & 1 & 0 & 0\\
0 & -2a & 0 & -a & 1 & 0 & 0\\
1 & 1 & 1 & 1 & 0 & 0 & 0\\
1 & 0 & 0 & 0 & 0 & 0 & 0\\
0 & 1 & 0 & 0 & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \\ w \\ u_1 \\ u_2 \end{pmatrix},
\]

with a time-varying uncertainty
\[
\begin{pmatrix} w_1 \\ w_2 \\ w_3 \\ w_4 \end{pmatrix} =
\begin{pmatrix}
\delta_1(t) & & & \\
& \delta_1(t) & & \\
& & \delta_2(t) & \\
& & & \delta_2(t)
\end{pmatrix}
\begin{pmatrix} z_1 \\ z_2 \\ z_3 \\ z_4 \end{pmatrix},
\quad |\delta_1(t)| \le 0.7, \ |\delta_2(t)| \le 0.7.
\]

As the performance measure we choose the L_2-gain of the channel w → z.

(a) For the uncontrolled system and for each a ∈ [0, 1], find the minimal robust L_2-gain level of the channel w → z by applying the robust performance analysis test in Chapter 3 with the following class of scalings \(P = \bigl(\begin{smallmatrix} Q & S \\ S^T & R \end{smallmatrix}\bigr)\):

• P is as in µ-theory: Q, S, R are block-diagonal, Q < 0, R is related to Q (how?), and S is skew-symmetric.

• P is general with Q < 0.

• P is general with Q_1 < 0, Q_2 < 0, where Q_j denote the blocks Q(1:2, 1:2) and Q(3:4, 3:4) in Matlab notation.

Draw plots of the corresponding optimal values versus the parameter a and comment!

(b) For a = 0.9, apply the controller
\[
\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} =
\begin{pmatrix} 0 & 0 \\ 0 & k \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
\]
and perform the analysis test with the largest class of scalings for k ∈ [−1, 1]. Plot the resulting optimal value over k and comment.

(c) Perform a controller/scaling iteration to minimize the optimal values for the controller structures
\[
\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} =
\begin{pmatrix} 0 & 0 \\ 0 & k_2 \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} u_1 \\ u_2 \end{pmatrix} =
\begin{pmatrix} k_1 & k_{12} \\ k_{21} & k_2 \end{pmatrix}
\begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.
\]
Start from gain zero and plot the optimal values that are reached in each step of the iteration to reveal how they decrease. Comment on the convergence.

(d) With the last full controller from the previous exercise for a performance level that is close to the limit, redo the analysis of the first part. Plot the curves and comment.


Chapter 9

Linear parameterically varying systems

Linear parameterically varying (LPV) systems are linear systems whose describing matrices depend on a time-varying parameter such that both the parameter itself and its rate of variation are known to be contained in pre-specified sets.

In robust control, the goal is to find one fixed controller that achieves robust stability and robust performance for all possible parameter variations, irrespective of which specific parameter curve does indeed perturb the system.

Instead, in LPV control, it is assumed that the parameter (and, possibly, its rate of variation), although not known a priori, is (are) on-line measurable. Hence the actual parameter value (and its derivative) can be used as extra information to control the system; the controller will turn out to depend on the parameter as well. We will actually also choose an LPV structure for the controller to be designed.

We would like to stress the decisive distinction to the control of time-varying systems: In the standard techniques for controlling time-varying systems, the model description is assumed to be known a priori over the whole time interval [0, ∞). In LPV control, the model is assumed to be known, at time instant t, only over the interval [0, t].

The techniques we would like to develop closely resemble those for robust control we have investigated earlier. It is possible to apply them

• to control certain classes of nonlinear systems

• to provide a systematic procedure for gain-scheduling

with guarantees for stability and performance.


Before we explore these applications in more detail we would like to start by presenting the available problem setups and solution techniques for LPV control.

9.1 General Parameter Dependence

Suppose that δ_c, δ̇_c ⊂ R^m are two parameter sets such that δ_c × δ̇_c is compact, and that the matrix-valued function
\[
\begin{pmatrix}
A(p) & B_p(p) & B(p)\\
C_p(p) & D_p(p) & E(p)\\
C(p) & F(p) & 0
\end{pmatrix}
\text{ is continuous in } p \in \delta_c.
\tag{9.1.1}
\]

Consider the Linear Parameterically Varying (LPV) system that is described as
\[
\begin{pmatrix} \dot x \\ z_p \\ y \end{pmatrix} =
\begin{pmatrix}
A(\delta(t)) & B_p(\delta(t)) & B(\delta(t))\\
C_p(\delta(t)) & D_p(\delta(t)) & E(\delta(t))\\
C(\delta(t)) & F(\delta(t)) & 0
\end{pmatrix}
\begin{pmatrix} x \\ w_p \\ u \end{pmatrix},
\quad \delta(t) \in \delta_c,\ \dot\delta(t) \in \dot\delta_c.
\tag{9.1.2}
\]

We actually mean the family of systems that is obtained by letting δ(.) vary in the set of continuously differentiable parameter curves
\[
\delta : [0, \infty) \to \mathbb{R}^m
\ \text{ with }\ \delta(t) \in \delta_c,\ \dot\delta(t) \in \dot\delta_c \ \text{ for all } t \ge 0.
\]

The signals admit the same interpretations as in Chapter 4: u is the control input, y is the measured output available for control, and w_p → z_p denotes the performance channel.

In LPV control, it is assumed that the parameter δ(t) is on-line measurable. Hence the actual value of δ(t) can be taken as extra information for the controller to achieve the desired design goal.

In view of the specific structure of the system description, we assume that the controller admits a similar structure. In fact, an LPV controller is defined by functions
\[
\begin{pmatrix} A_c(p) & B_c(p) \\ C_c(p) & D_c(p) \end{pmatrix}
\ \text{ that are continuous in } p \in \delta_c
\tag{9.1.3}
\]

as
\[
\begin{pmatrix} \dot x_c \\ u \end{pmatrix} =
\begin{pmatrix} A_c(\delta(t)) & B_c(\delta(t)) \\ C_c(\delta(t)) & D_c(\delta(t)) \end{pmatrix}
\begin{pmatrix} x_c \\ y \end{pmatrix}
\]
with the following interpretation: the controller evolves according to linear dynamics that are defined at time instant t via the actually measured value of δ(t).

Note that a robust controller would simply be defined with a constant matrix
\[
\begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}
\]
that does not depend on δ, which clarifies the difference between robust controllers and LPV controllers.

The controlled system admits the description
\[
\begin{pmatrix} \dot\xi \\ z_p \end{pmatrix} =
\begin{pmatrix} \mathcal{A}(\delta(t)) & \mathcal{B}(\delta(t)) \\ \mathcal{C}(\delta(t)) & \mathcal{D}(\delta(t)) \end{pmatrix}
\begin{pmatrix} \xi \\ w_p \end{pmatrix},
\quad \delta(t) \in \delta_c,\ \dot\delta(t) \in \dot\delta_c
\tag{9.1.4}
\]

where the function
\[
\begin{pmatrix} \mathcal{A}(p) & \mathcal{B}(p) \\ \mathcal{C}(p) & \mathcal{D}(p) \end{pmatrix}
\ \text{ is continuous in } p \in \delta_c
\]

and given as
\[
\begin{pmatrix}
A(p) + B(p)D_c(p)C(p) & B(p)C_c(p) & B_p(p) + B(p)D_c(p)F(p)\\
B_c(p)C(p) & A_c(p) & B_c(p)F(p)\\
C_p(p) + E(p)D_c(p)C(p) & E(p)C_c(p) & D_p(p) + E(p)D_c(p)F(p)
\end{pmatrix}
\]
or
\[
\begin{pmatrix}
A(p) & 0 & B_p(p)\\
0 & 0 & 0\\
C_p(p) & 0 & D_p(p)
\end{pmatrix}
+
\begin{pmatrix}
0 & B(p)\\
I & 0\\
0 & E(p)
\end{pmatrix}
\begin{pmatrix} A_c(p) & B_c(p) \\ C_c(p) & D_c(p) \end{pmatrix}
\begin{pmatrix}
0 & I & 0\\
C(p) & 0 & F(p)
\end{pmatrix}.
\]

To evaluate performance, we concentrate again on the quadratic specification
\[
\int_0^\infty
\begin{pmatrix} w_p(t) \\ z_p(t) \end{pmatrix}^T
P_p
\begin{pmatrix} w_p(t) \\ z_p(t) \end{pmatrix} dt \le -\varepsilon \|w_p\|_2^2
\tag{9.1.5}
\]

with an index
\[
P_p = \begin{pmatrix} Q_p & S_p \\ S_p^T & R_p \end{pmatrix}, \quad R_p \ge 0,
\ \text{ that has the inverse }\
P_p^{-1} = \begin{pmatrix} \tilde Q_p & \tilde S_p \\ \tilde S_p^T & \tilde R_p \end{pmatrix}, \quad \tilde Q_p \le 0.
\]

In order to abbreviate the formulation of the analysis result we introduce the following differential operator.

Definition 9.1 If X : δ_c ∋ p → X(p) ∈ R^{n×n} is continuously differentiable, the continuous mapping ∂X : δ_c × δ̇_c → R^{n×n} is defined as
\[
\partial X(p, q) := \sum_{j=1}^m \frac{\partial X}{\partial p_j}(p)\, q_j.
\]

Note that this definition is simply motivated by the fact that, along any continuously differentiable parameter curve δ(.), we have
\[
\frac{d}{dt} X(\delta(t))
= \sum_{j=1}^m \frac{\partial X}{\partial p_j}(\delta(t))\, \dot\delta_j(t)
= \partial X(\delta(t), \dot\delta(t)).
\tag{9.1.6}
\]


(We carefully wrote down the definitions and relations, and one should read all this correctly. X and ∂X are functions of the parameters p ∈ δ_c and q ∈ δ̇_c, respectively. In the definition of ∂X, no time-trajectories are involved. The definition of ∂X is just tailored to obtain the property (9.1.6) when plugging in a function of time.)
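Property (9.1.6) is easy to check numerically. The sketch below uses a hypothetical parameter-dependent X(p) on R^2 (all concrete matrices and the curve δ(t) are illustrative choices) and central finite differences:

```python
import numpy as np

def X(p):
    """A hypothetical continuously differentiable X(p), p in R^2."""
    return np.array([[1.0 + p[0]**2, p[0] * p[1]],
                     [p[0] * p[1],   2.0 + p[1]**2]])

def dX(p, q, h=1e-6):
    """(dX)(p, q) = sum_j dX/dp_j(p) q_j, via central differences."""
    out = np.zeros((2, 2))
    for j in range(len(p)):
        e = np.zeros(len(p)); e[j] = h
        out += (X(p + e) - X(p - e)) / (2 * h) * q[j]
    return out

# Along the curve delta(t) = (sin t, cos t), the time derivative
# d/dt X(delta(t)) must equal (dX)(delta(t), delta_dot(t)) -- (9.1.6).
t, ht = 0.7, 1e-6
delta = lambda s: np.array([np.sin(s), np.cos(s)])
delta_dot = np.array([np.cos(t), -np.sin(t)])
lhs = (X(delta(t + ht)) - X(delta(t - ht))) / (2 * ht)
rhs = dX(delta(t), delta_dot)
```

Note that the operator itself involves no time trajectory: it takes the pair (p, q), and (9.1.6) emerges only when q is replaced by the actual rate δ̇(t).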

In view of the former discussion, the following analysis result comes as no surprise.

Theorem 9.2 Suppose there exists a continuously differentiable \(\mathcal{X}(p)\) defined for p ∈ δ_c such that for all p ∈ δ_c and q ∈ δ̇_c one has \(\mathcal{X}(p) > 0\) and
\[
\begin{pmatrix}
\partial\mathcal{X}(p,q) + \mathcal{A}(p)^T\mathcal{X}(p) + \mathcal{X}(p)\mathcal{A}(p) & \mathcal{X}(p)\mathcal{B}(p)\\
\mathcal{B}(p)^T\mathcal{X}(p) & 0
\end{pmatrix}
+
\begin{pmatrix} 0 & I \\ \mathcal{C}(p) & \mathcal{D}(p) \end{pmatrix}^T
P_p
\begin{pmatrix} 0 & I \\ \mathcal{C}(p) & \mathcal{D}(p) \end{pmatrix} < 0.
\tag{9.1.7}
\]
Then there exists an ε > 0 such that, for each parameter curve with δ(t) ∈ δ_c and δ̇(t) ∈ δ̇_c, the system (9.1.4) is exponentially stable and satisfies (9.1.5) if the initial condition is zero and if w_p ∈ L_2.

In view of our preparations the proof is a simple exercise that is left to the reader.

We can now use the same procedure as for LTI systems to arrive at the corresponding synthesis result. One just has to keep in mind that all the matrices are actually functions of p ∈ δ_c or of (p, q) ∈ δ_c × δ̇_c. If partitioning

\[
\mathcal{X} = \begin{pmatrix} X & U \\ U^T & \ast \end{pmatrix},
\quad
\mathcal{X}^{-1} = \begin{pmatrix} Y & V \\ V^T & \ast \end{pmatrix},
\]
we can again assume w.l.o.g. that U, V have full row rank. (Note that this requires the compactness hypothesis on δ_c and δ̇_c. Why?) With
\[
\mathcal{Y} = \begin{pmatrix} Y & I \\ V^T & 0 \end{pmatrix}
\quad\text{and}\quad
\mathcal{Z} = \begin{pmatrix} I & 0 \\ X & U \end{pmatrix}
\]
we obtain the identities
\[
\mathcal{Y}^T\mathcal{X} = \mathcal{Z}
\quad\text{and}\quad
I - XY = UV^T.
\]

If we apply the differential operator ∂ to the first functional identity, we arrive at \((\partial\mathcal{Y})^T\mathcal{X} + \mathcal{Y}^T(\partial\mathcal{X}) = \partial\mathcal{Z}\). (Do the simple calculations. Note that ∂ is not the usual differentiation, so you cannot apply the standard product rule.) If we right-multiply by \(\mathcal{Y}\), this leads to
\[
\mathcal{Y}^T(\partial\mathcal{X})\mathcal{Y}
= (\partial\mathcal{Z})\mathcal{Y} - (\partial\mathcal{Y})^T\mathcal{Z}^T
= \begin{pmatrix} 0 & 0 \\ \partial X & \partial U \end{pmatrix}
\begin{pmatrix} Y & I \\ V^T & 0 \end{pmatrix}
- \begin{pmatrix} \partial Y & \partial V \\ 0 & 0 \end{pmatrix}
\begin{pmatrix} I & X \\ 0 & U^T \end{pmatrix}
\]
and hence to
\[
\mathcal{Y}^T(\partial\mathcal{X})\mathcal{Y}
= \begin{pmatrix}
-\partial Y & -(\partial Y)X - (\partial V)U^T\\
(\partial X)Y + (\partial U)V^T & \partial X
\end{pmatrix}.
\]


If we introduce the transformed controller parameters
\[
\begin{pmatrix} K & L \\ M & N \end{pmatrix}
= \begin{pmatrix} U & XB \\ 0 & I \end{pmatrix}
\begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}
\begin{pmatrix} V^T & 0 \\ CY & I \end{pmatrix}
+ \begin{pmatrix} XAY & 0 \\ 0 & 0 \end{pmatrix}
+ \begin{pmatrix} (\partial X)Y + (\partial U)V^T & 0 \\ 0 & 0 \end{pmatrix},
\]

a brief calculation reveals that
\[
\mathcal{Y}^T(\partial\mathcal{X} + \mathcal{A}^T\mathcal{X} + \mathcal{X}\mathcal{A})\mathcal{Y}
= \begin{pmatrix}
-\partial Y + \operatorname{sym}(AY + BM) & (A + BNC) + K^T\\
(A + BNC)^T + K & \partial X + \operatorname{sym}(AX + LC)
\end{pmatrix}
\]
\[
\mathcal{Y}^T\mathcal{X}\mathcal{B}
= \begin{pmatrix} B_p + BNF \\ XB_p + LF \end{pmatrix},
\quad
\mathcal{C}\mathcal{Y}
= \begin{pmatrix} C_p Y + EM & C_p + ENC \end{pmatrix},
\quad
\mathcal{D} = D_p + ENF
\]

where we again used the abbreviation sym(M) = M + M^T. Compared to a parameter-independent Lyapunov function, we have modified the transformation to K by (∂X)Y + (∂U)V^T in order to eliminate the extra term that appears from the congruence transformation of ∂\(\mathcal{X}\). If \(\mathcal{X}\) does not depend on p, ∂\(\mathcal{X}\) vanishes identically and the original transformation is recovered.

We observe that L, M, N are functions of p ∈ δ_c only, whereas K also depends on q ∈ δ̇_c. In fact, this function has the structure
\[
K(p, q) = K_0(p) + \sum_{i=1}^m K_i(p)\, q_i
\tag{9.1.8}
\]
(why?) and, hence, it is fully described by specifying
\[
K_i(p), \quad i = 0, 1, \dots, m,
\]
which depend, as well, on p ∈ δ_c only.

Literally as in Theorem 4.3 one can now prove the following synthesis result for LPV systems.

Theorem 9.3 If there exists an LPV controller defined by (9.1.3) and a continuously differentiable \(\mathcal{X}(.)\) defined for p ∈ δ_c that satisfy (9.1.7), then there exist continuously differentiable functions X, Y and continuous functions K_i, L, M, N defined on δ_c such that, with K given by (9.1.8), the inequalities
\[
\begin{pmatrix} Y & I \\ I & X \end{pmatrix} > 0
\tag{9.1.9}
\]
and
\[
\begin{pmatrix}
-\partial Y + \operatorname{sym}(AY + BM) & (A + BNC) + K^T & B_p + BNF\\
(A + BNC)^T + K & \partial X + \operatorname{sym}(AX + LC) & XB_p + LF\\
(B_p + BNF)^T & (XB_p + LF)^T & 0
\end{pmatrix}
+ (\ast)^T P_p
\begin{pmatrix}
0 & 0 & I\\
C_p Y + EM & C_p + ENC & D_p + ENF
\end{pmatrix} < 0
\tag{9.1.10}
\]


hold on δ_c × δ̇_c. Conversely, suppose the continuously differentiable X, Y and the continuous K_i (defining K as in (9.1.8)), L, M, N satisfy these synthesis inequalities. Then one can factorize I − XY = UV^T with continuously differentiable square and nonsingular U, V, and
\[
\mathcal{X} = \begin{pmatrix} Y & V \\ I & 0 \end{pmatrix}^{-1}
\begin{pmatrix} I & 0 \\ X & U \end{pmatrix}
\tag{9.1.11}
\]
\[
\begin{pmatrix} A_c & B_c \\ C_c & D_c \end{pmatrix}
= \begin{pmatrix} U & XB \\ 0 & I \end{pmatrix}^{-1}
\begin{pmatrix}
K - XAY - \bigl[(\partial X)Y + (\partial U)V^T\bigr] & L\\
M & N
\end{pmatrix}
\begin{pmatrix} V^T & 0 \\ CY & I \end{pmatrix}^{-1}
\tag{9.1.12}
\]
render the analysis inequalities (9.1.7) satisfied.

Remark. Note that the formula (9.1.12) just emerges from the modified controller parameter transformation. We observe that the matrices B_c, C_c, D_c are functions of p ∈ δ_c only. Due to the dependence of K on q and due to the extra term U^{-1}[(∂X)Y + (∂U)V^T]V^{-T} in the formula for A_c, this latter matrix is a function that depends both on p ∈ δ_c and q ∈ δ̇_c. It has the same structure as K and can be written as
\[
A_c(p, q) = A_0(p) + \sum_{i=1}^m A_i(p)\, q_i.
\]

A straightforward calculation reveals that
\[
A_i = U^{-1}\Bigl[K_i V^{-T} - \frac{\partial X}{\partial p_i} Y V^{-T} - \frac{\partial U}{\partial p_i}\Bigr],
\quad i = 1, \dots, m.
\]

Hence, to implement this controller, one indeed requires not only the measurement of δ(t) but also of its rate of variation δ̇(t). However, one could possibly exploit the freedom in choosing U and V to render A_i = 0 such that A_c does not depend on q any more. Recall that U and V need to be related by I − XY = UV^T; hence let us choose
\[
V^T := U^{-1}(I - XY).
\]

This leads to
\[
A_i = U^{-1}\Bigl[\Bigl(K_i - \frac{\partial X}{\partial p_i} Y\Bigr)(I - XY)^{-1} U - \frac{\partial U}{\partial p_i}\Bigr],
\quad i = 1, \dots, m.
\]

Therefore, U should be chosen as a nonsingular solution of the system of first-order partial differential equations
\[
\frac{\partial U}{\partial p_i}(p)
= \Bigl[K_i(p) - \frac{\partial X}{\partial p_i}(p)\, Y(p)\Bigr]\bigl(I - X(p)Y(p)\bigr)^{-1} U(p),
\quad i = 1, \dots, m.
\]
This leads to A_i = 0 such that the implementation of the LPV controller does not require any on-line measurement of the rate of the parameter variations. First-order partial differential equations can be solved by the method of characteristics [10]. We cannot go into further details at this point.

In order to construct a controller that solves the LPV problem, one has to verify the solvability of the synthesis inequalities in the unknown functions X, Y, K_i, L, M, N, and for designing a controller, one has to find functions that solve them.


However, standard algorithms do not allow us to solve functional inequalities directly. Hence we need to include a discussion of how to reduce these functional inequalities to finitely many LMIs in real variables.

First step. Since q ∈ δ̇_c enters the inequality (9.1.10) affinely, we can replace the set δ̇_c, if convex, by its set of extreme points. Let us make the assumption, non-restrictive in practice, that this set has finitely many generators:
\[
\dot\delta_c = \operatorname{conv}\{\dot\delta^1, \dots, \dot\delta^k\}.
\]
Solving (9.1.9)-(9.1.10) over (p, q) ∈ δ_c × δ̇_c is equivalent to solving (9.1.9)-(9.1.10) for
\[
p \in \delta_c, \quad q \in \{\dot\delta^1, \dots, \dot\delta^k\}.
\tag{9.1.13}
\]

Second step. Instead of searching over the set of all continuous functions, we restrict the search to a finite-dimensional subspace thereof, as is standard in Ritz-Galerkin techniques. Let us hence choose basis functions
\[
f_1(.), \dots, f_l(.) \ \text{ that are continuously differentiable on } \delta_c.
\]

Then all the functions to be found are assumed to belong to the subspace spanned by the functions f_j. This leads to the Ansatz
\[
X(p) = \sum_{j=1}^l X_j f_j(p), \quad
Y(p) = \sum_{j=1}^l Y_j f_j(p),
\]
\[
K_i(p) = \sum_{j=1}^l K_i^j f_j(p), \quad i = 0, 1, \dots, m,
\]
\[
L(p) = \sum_{j=1}^l L_j f_j(p), \quad
M(p) = \sum_{j=1}^l M_j f_j(p), \quad
N(p) = \sum_{j=1}^l N_j f_j(p).
\]

We observe
\[
\partial X(p, q) = \sum_{j=1}^l X_j\, \partial f_j(p, q),
\quad
\partial Y(p, q) = \sum_{j=1}^l Y_j\, \partial f_j(p, q).
\]

If we plug these formulas into the inequalities (9.1.9)-(9.1.10), we observe that all the coefficient matrices enter affinely. After this substitution, (9.1.9)-(9.1.10) turns out to be a family of linear matrix inequalities in the matrix variables X_j, Y_j, K_i^j, L_j, M_j, N_j that is parameterized by (9.1.13). The variables of this system of LMIs are now real numbers; however, since the parameter p still varies in the infinite set δ_c, we have to solve infinitely many LMIs. This is, in fact, a so-called semi-infinite (not infinite-dimensional, as often claimed) convex optimization problem.


Third step. To reduce the semi-infinite system of LMIs to finitely many LMIs, the presently chosen route is to just fix a finite subset
\[
\delta_{\mathrm{finite}} \subset \delta_c
\]
and solve the LMI system in those points only. Hence the resulting family of LMIs is parameterized by
\[
p \in \delta_{\mathrm{finite}}
\quad\text{and}\quad
q \in \{\dot\delta^1, \dots, \dot\delta^k\}.
\]

We end up with a finite family of linear matrix inequalities in real-valued unknowns that can be solved by standard algorithms. Since a systematic choice of points δ_finite is obtained by gridding the parameter set, this last step is often called the gridding phase, and the whole procedure is said to be a gridding technique.
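Generating a uniform grid δ_finite over a box-shaped parameter set can be sketched as follows (the box bounds and grid density are illustrative):

```python
import itertools
import numpy as np

def grid_points(lower, upper, n_per_axis):
    """Uniform grid over the box [lower_i, upper_i] in R^m.

    Each returned point p, paired with each generator of the rate set,
    yields one LMI of the resulting finite family.
    """
    axes = [np.linspace(lo, hi, n_per_axis) for lo, hi in zip(lower, upper)]
    return [np.array(p) for p in itertools.product(*axes)]

# Illustrative box [-1, 1] x [0, 2] with 5 points per axis: 25 points.
delta_finite = grid_points([-1.0, 0.0], [1.0, 2.0], 5)
```

Note the curse of dimensionality: the number of grid points, and hence of LMIs, grows exponentially with the parameter dimension m, which is one reason to keep the grid coarse.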

Remark on the second step. Due to Weierstraß' approximation theorem, one can choose a sequence of functions f_1, f_2, ... on δ_c such that the union of the subspaces
\[
S_\nu = \operatorname{span}\{f_1, \dots, f_\nu\}
\]
is dense in the set of all continuously differentiable mappings on δ_c with respect to the norm
\[
\|f\| = \max_{p \in \delta_c} |f(p)|
+ \sum_{j=1}^m \max_{p \in \delta_c} \Bigl|\frac{\partial f}{\partial p_j}(p)\Bigr|.
\]

This implies that, given any continuously differentiable g on δ_c and any accuracy level ε > 0, one can find an index ν_0 such that there exists an f ∈ S_{ν_0} for which
\[
\forall\, p \in \delta_c,\ q \in \dot\delta_c:
\quad |g(p) - f(p)| \le \varepsilon,
\quad |\partial g(p, q) - \partial f(p, q)| \le \varepsilon.
\]
(Provide the details.) Functions in the subspace S_ν hence approximate any function g and its image ∂g under the differential operator ∂ up to arbitrary accuracy, if the index ν is chosen sufficiently large.

Therefore, if (9.1.9)-(9.1.10), viewed as functional inequalities, do have a solution, then they have a solution when restricting the search to the finite-dimensional subspace S_ν for sufficiently large ν, i.e., when incorporating sufficiently many basis functions. However, the number of basis functions determines the number of variables in the resulting LMI problem. Keeping the number of unknowns small requires an efficient choice of the basis functions, which is, in theory and practice, a difficult problem for which one can hardly give any general recipes.

Remark on the third step. By compactness of δ_c and continuity of all functions, solving the LMIs for p ∈ δ_c or for p ∈ δ_finite is equivalent if only the points are chosen sufficiently dense. A measure of density is the infimal ε such that the balls of radius ε around each of the finitely many points in δ_finite already cover δ_c:
\[
\delta_c \subset \bigcup_{p_0 \in \delta_{\mathrm{finite}}} \{p \mid \|p - p_0\| \le \varepsilon\}.
\]

If the data functions describing the system are also differentiable in δ, one can apply the mean value theorem to provide explicit estimates of the accuracy of the required approximation. Again, however, it is important to observe that the number of LMIs to solve depends on the number of grid points; hence one has to keep this number small in order to avoid large LMI problems.
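The density measure ε can be estimated by sampling the parameter set finely and taking the worst-case distance from a sample to the grid. A sketch, with a one-dimensional set and grid as an illustrative choice:

```python
import numpy as np

def covering_radius(grid, samples):
    """Smallest eps such that balls of radius eps around the grid points
    cover all the samples; this estimates the true covering radius of
    the parameter set when the samples are dense in it."""
    grid = np.asarray(grid)
    return max(np.min(np.linalg.norm(grid - p, axis=1)) for p in samples)

# Illustrative: the grid {0, 0.5, 1} on the interval [0, 1] leaves the
# points 0.25 and 0.75 farthest from the grid, so eps = 0.25.
grid = [[0.0], [0.5], [1.0]]
samples = [np.array([s]) for s in np.linspace(0.0, 1.0, 1001)]
eps = covering_radius(grid, samples)
```

This is only an estimate from below: points of δ_c that are not sampled may lie farther from the grid than any sample does.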

Remark on extensions. Only slight adaptations are required to treat all the other performance specifications (such as bounds on the L_2-gain and on the analogue of the H_2-norm or generalized H_2-norm for time-varying systems) as well as the corresponding mixed problems as discussed in Chapter ?? in full generality. Note also that, for single-objective problems, the techniques to eliminate parameters literally apply; there is no need to go into the details. In particular for solving gain-scheduling problems, it is important to observe that one can as well let the performance index depend on the measured parameter without any additional difficulty. As a designer, one can hence ask for different performance properties in different parameter ranges, which has considerable relevance in practical controller design.

Remark on robust LPV control. As another important extension we mention robust LPV design. It might happen that some parameters are indeed on-line measurable, whereas others have to be considered as unknown perturbations with which the controller cannot be scheduled. Again, it is straightforward to extend the robustness design techniques that have been presented in Chapter ?? from LTI systems and controllers to LPV systems and controllers. This even allows one to include dynamic uncertainties when using IQCs to capture their properties. Note that the scalings that appear in such techniques constitute extra problem variables. In many circumstances it causes no extra technical difficulties to let these scalings also depend on the scheduling parameter, which reduces the conservatism.

9.2 Affine Parameter Dependence

Suppose that the matrices (9.1.1) describing the system are affine functions on the set
\[
\delta_c = \operatorname{conv}\{\delta^1, \dots, \delta^k\}.
\]

In that case we intend to search, as well, for an LPV controller that is defined with affine functions (9.1.3). Note that the describing matrices for the closed-loop system are also affine in the parameter if
\[
\begin{pmatrix} B \\ E \end{pmatrix}
\quad\text{and}\quad
\begin{pmatrix} C & F \end{pmatrix}
\quad\text{are parameter independent,}
\]
which is assumed from now on. Finally, we let \(\mathcal{X}\) in Theorem 9.2 be constant.

Since R_p ≥ 0, we infer that (9.1.7) is satisfied if and only if it holds for the generators p = δ^j of the set δ_c. Therefore, the analysis inequalities reduce to the finite set of LMIs
\[
X > 0, \quad
\begin{pmatrix}
\mathcal{A}(\delta^j)^T X + X\mathcal{A}(\delta^j) & X\mathcal{B}(\delta^j)\\
\mathcal{B}(\delta^j)^T X & 0
\end{pmatrix}
+
\begin{pmatrix} 0 & I \\ \mathcal{C}(\delta^j) & \mathcal{D}(\delta^j) \end{pmatrix}^T
P_p
\begin{pmatrix} 0 & I \\ \mathcal{C}(\delta^j) & \mathcal{D}(\delta^j) \end{pmatrix} < 0
\ \text{ for all } j = 1, \dots, k.
\]


Under the present structural assumptions, the affine functions \(\bigl(\begin{smallmatrix} A_c & B_c \\ C_c & D_c \end{smallmatrix}\bigr)\) are transformed into affine functions \(\bigl(\begin{smallmatrix} K & L \\ M & N \end{smallmatrix}\bigr)\) under the controller parameter transformation as considered in the previous section.

Then the synthesis inequalities (9.1.9)-(9.1.10), whose variables are the constant X and Y and the affine functions \(\bigl(\begin{smallmatrix} K & L \\ M & N \end{smallmatrix}\bigr)\), turn out to be affine in the parameter p. This implies that we can replace the search over δ_c without loss of generality by the search over the generators δ^j of this set. Therefore, solving the design problem amounts to testing whether the LMIs
\[
\begin{pmatrix} Y & I \\ I & X \end{pmatrix} > 0
\]

and
\[
\begin{pmatrix}
\operatorname{sym}\bigl(A(\delta^j)Y + BM(\delta^j)\bigr) & \ast & \ast\\
\bigl(A(\delta^j) + BN(\delta^j)C\bigr)^T + K(\delta^j) & \operatorname{sym}\bigl(A(\delta^j)X + L(\delta^j)C\bigr) & \ast\\
\bigl(B_p(\delta^j) + BN(\delta^j)F\bigr)^T & \bigl(XB_p(\delta^j) + L(\delta^j)F\bigr)^T & 0
\end{pmatrix}
+ (\ast)^T P_p
\begin{pmatrix}
0 & 0 & I\\
C_p(\delta^j)Y + EM(\delta^j) & C_p(\delta^j) + EN(\delta^j)C & D_p(\delta^j) + EN(\delta^j)F
\end{pmatrix} < 0
\]
for j = 1, ..., k admit a solution.

Since affine, the functions K, L, M, N are parameterized as
\[
\begin{pmatrix} K(p) & L(p) \\ M(p) & N(p) \end{pmatrix}
= \begin{pmatrix} K_0 & L_0 \\ M_0 & N_0 \end{pmatrix}
+ \sum_{i=1}^m \begin{pmatrix} K_i & L_i \\ M_i & N_i \end{pmatrix} p_i
\]
with real matrices K_i, L_i, M_i, N_i. Hence, the synthesis inequalities form genuine linear matrix inequalities that can be solved by standard algorithms.

9.3 LFT System Description

Similarly as for our discussion of robust controller design, let us assume in this section that the LPV system is described as an LTI system
\[
\begin{pmatrix} \dot x \\ z_u \\ z_p \\ y \end{pmatrix} =
\begin{pmatrix}
A & B_u & B_p & B\\
C_u & D_{uu} & D_{up} & E_u\\
C_p & D_{pu} & D_{pp} & E_p\\
C & F_u & F_p & 0
\end{pmatrix}
\begin{pmatrix} x \\ w_u \\ w_p \\ u \end{pmatrix}
\tag{9.3.1}
\]


in which the parameter enters via the uncertainty channel $w_u \to z_u$ as
\[
w_u(t) = \Delta(t) z_u(t), \qquad \Delta(t) \in \boldsymbol{\Delta}_c. \tag{9.3.2}
\]
The size and the structure of the possible parameter values $\Delta(t)$ are captured by the convex set
\[
\boldsymbol{\Delta}_c := \operatorname{conv}\{\Delta_1, \dots, \Delta_N\}
\]
whose generators $\Delta_j$ are given explicitly. We assume w.l.o.g. that $0 \in \boldsymbol{\Delta}_c$. As before, we concentrate on the quadratic performance specification with index $P_p$ imposed on the performance channel $w_p \to z_p$.

Adjusted to the structure of (9.3.1)-(9.3.2), we assume that the measured parameter curve enters the controller also in a linear fractional fashion. Therefore, we assume that the to-be-designed LPV controller is defined by scheduling the LTI system
\[
\dot x_c = A_c x_c + B_c \begin{pmatrix} y \\ w_c \end{pmatrix}, \qquad
\begin{pmatrix} u \\ z_c \end{pmatrix} = C_c x_c + D_c \begin{pmatrix} y \\ w_c \end{pmatrix}
\tag{9.3.3}
\]
with the actual parameter curve entering as
\[
w_c(t) = \Delta_c(\Delta(t)) z_c(t). \tag{9.3.4}
\]
The LPV controller is hence parameterized through the matrices $A_c$, $B_c$, $C_c$, $D_c$, and through a possibly nonlinear matrix-valued scheduling function
\[
\Delta_c(\Delta) \in \mathbb{R}^{n_r \times n_c} \ \text{defined on } \boldsymbol{\Delta}_c.
\]

Figure 9.1 illustrates this configuration.

The goal is to construct an LPV controller such that, for all admissible parameter curves, the controlled system is exponentially stable and the quadratic performance specification with index $P_p$ for the channel $w_p \to z_p$ is satisfied.

The solution of this problem is approached with a simple trick. In fact, the controlled system can, alternatively, be obtained by scheduling the LTI system
\[
\begin{pmatrix} \dot x \\ z_u \\ z_c \\ z_p \\ y \\ w_c \end{pmatrix}
=
\begin{pmatrix}
A & B_u & 0 & B_p & B & 0 \\
C_u & D_{uu} & 0 & D_{up} & E_u & 0 \\
0 & 0 & 0 & 0 & 0 & I_{n_c} \\
C_p & D_{pu} & 0 & D_{pp} & E_p & 0 \\
C & F_u & 0 & F_p & 0 & 0 \\
0 & 0 & I_{n_r} & 0 & 0 & 0
\end{pmatrix}
\begin{pmatrix} x \\ w_u \\ w_c \\ w_p \\ u \\ z_c \end{pmatrix}
\tag{9.3.5}
\]
with the parameter as
\[
\begin{pmatrix} w_u \\ w_c \end{pmatrix} =
\begin{pmatrix} \Delta(t) & 0 \\ 0 & \Delta_c(\Delta(t)) \end{pmatrix}
\begin{pmatrix} z_u \\ z_c \end{pmatrix}, \tag{9.3.6}
\]


Figure 9.1: LPV system and LPV controller with LFT description

and then controlling this parameter dependent system with the LTI controller (9.3.3). Alternatively, we can interconnect the LTI system (9.3.5) with the LTI controller (9.3.3) to arrive at the LTI system
\[
\begin{pmatrix} \dot x \\ z_u \\ z_c \\ z_p \end{pmatrix} =
\begin{pmatrix}
\mathcal{A} & \mathcal{B}_u & \mathcal{B}_c & \mathcal{B}_p \\
\mathcal{C}_u & \mathcal{D}_{uu} & \mathcal{D}_{uc} & \mathcal{D}_{up} \\
\mathcal{C}_c & \mathcal{D}_{cu} & \mathcal{D}_{cc} & \mathcal{D}_{cp} \\
\mathcal{C}_p & \mathcal{D}_{pu} & \mathcal{D}_{pc} & \mathcal{D}_{pp}
\end{pmatrix}
\begin{pmatrix} x \\ w_u \\ w_c \\ w_p \end{pmatrix}, \tag{9.3.7}
\]
and then re-connect the parameter as in (9.3.6). This latter interconnection order is illustrated in Figure 9.2.

Note that (9.3.5) is an extension of the original system (9.3.1) with an additional uncertainty channel $w_c \to z_c$ and with an additional control channel $z_c \to w_c$; the numbers $n_r$ and $n_c$ of the components of $w_c$ and $z_c$ dictate the sizes of the identity matrices $I_{n_r}$ and $I_{n_c}$ that are indicated by their respective indices.

Once the scheduling function $\Delta_c(\Delta)$ has been fixed, it turns out that (9.3.3) is a robust controller for the system (9.3.5) with uncertainty (9.3.6). The genuine robust control problem, in which the parameter is not measured on-line, corresponds to the situation $n_r = 0$ and $n_c = 0$, so that (9.3.5) and (9.3.1) are identical. In LPV control we have the extra freedom of being able to first extend the system as in (9.3.5) and to design a robust controller for this extended system. It will turn out that this extra freedom renders the corresponding synthesis inequalities convex.


Figure 9.2: LPV system and LPV controller: alternative interpretation

Before we embark on a solution of the LPV problem, let us include some further comments on the corresponding genuine robust control problem. We have seen in Section 8.1.1 that the search for a robust controller leads to the problem of having to solve the matrix inequalities
\[
\mathcal{X}(v) > 0,\qquad
(*)^T
\begin{pmatrix}
0 & \mathcal{X}(v) & 0 & 0 & 0 & 0 \\
\mathcal{X}(v) & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & Q & S & 0 & 0 \\
0 & 0 & S^T & R & 0 & 0 \\
0 & 0 & 0 & 0 & Q_p & S_p \\
0 & 0 & 0 & 0 & S_p^T & R_p
\end{pmatrix}
\begin{pmatrix}
I & 0 & 0 \\
A(v) & B_u(v) & B_p(v) \\
0 & I & 0 \\
C_u(v) & D_{uu}(v) & D_{up}(v) \\
0 & 0 & I \\
C_p(v) & D_{pu}(v) & D_{pp}(v)
\end{pmatrix} < 0,
\]
\[
\begin{pmatrix} \Delta \\ I \end{pmatrix}^T
\begin{pmatrix} Q & S \\ S^T & R \end{pmatrix}
\begin{pmatrix} \Delta \\ I \end{pmatrix} > 0
\quad\text{for all } \Delta \in \boldsymbol{\Delta}_c
\]
in the parameter $v$ and in the multiplier $P = \begin{pmatrix} Q & S \\ S^T & R \end{pmatrix}$.

Recall from our earlier discussion that one of the difficulties is a numerically tractable parameterization of the set of multipliers. This was the reason to introduce, at the expense of conservatism, the following subset of multipliers that admits a description in terms of finitely many LMIs:

\[
\mathbf{P} := \left\{ P = \begin{pmatrix} Q & S \\ S^T & R \end{pmatrix} \;\middle|\; Q < 0,\;
\begin{pmatrix} \Delta_j \\ I \end{pmatrix}^T P \begin{pmatrix} \Delta_j \\ I \end{pmatrix} > 0 \ \text{for } j = 1, \dots, N \right\}. \tag{9.3.8}
\]
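The condition $Q < 0$ is what makes this subset useful: for $Q < 0$ the map $\Delta \mapsto \big(\begin{smallmatrix}\Delta \\ I\end{smallmatrix}\big)^T P \big(\begin{smallmatrix}\Delta \\ I\end{smallmatrix}\big)$ is concave, so positivity at the generators $\Delta_j$ extends to the whole convex hull. A small numerical sketch with a hypothetical candidate multiplier ($S = 0$, $Q = -qI$, $R = rI$, norm-bounded random generators; none of this data comes from an actual design):

```python
import numpy as np

rng = np.random.default_rng(1)

k, m = 3, 2                       # each Delta_j is a k x m matrix
# random generators scaled to spectral norm 1
gens = []
for _ in range(4):
    D = rng.standard_normal((k, m))
    gens.append(D / np.linalg.norm(D, 2))

q, r = 1.0, 1.5                   # r > q * max_j ||Delta_j||^2 = q
P = np.block([[-q * np.eye(k), np.zeros((k, m))],
              [np.zeros((m, k)), r * np.eye(m)]])

def vertex_form(P, D):
    # [Delta; I]^T P [Delta; I] = r I - q Delta^T Delta for this candidate
    V = np.vstack([D, np.eye(D.shape[1])])
    return V.T @ P @ V

# the defining conditions of the set: Q < 0 and positivity at the vertices
assert np.max(np.linalg.eigvalsh(-q * np.eye(k))) < 0
assert all(np.min(np.linalg.eigvalsh(vertex_form(P, D))) > 0 for D in gens)

# by concavity in Delta (a consequence of Q < 0), positivity extends
# to the whole convex hull of the generators
for _ in range(100):
    lam = rng.random(len(gens)); lam /= lam.sum()
    D = sum(l * G for l, G in zip(lam, gens))
    assert np.min(np.linalg.eigvalsh(vertex_form(P, D))) > 0

multiplier_ok = True
```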


Even after confining the search to $v$ and $P \in \mathbf{P}$, no technique is known for solving the resulting, still non-convex, synthesis inequalities by standard algorithms.

In contrast to what we have seen for state-feedback design, the same is true of the dual inequalities, which read as
\[
\mathcal{X}(v) > 0,\qquad
(*)^T
\begin{pmatrix}
0 & \mathcal{X}(v) & 0 & 0 & 0 & 0 \\
\mathcal{X}(v) & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \tilde Q & \tilde S & 0 & 0 \\
0 & 0 & \tilde S^T & \tilde R & 0 & 0 \\
0 & 0 & 0 & 0 & \tilde Q_p & \tilde S_p \\
0 & 0 & 0 & 0 & \tilde S_p^T & \tilde R_p
\end{pmatrix}
\begin{pmatrix}
-A(v)^T & -C_u(v)^T & -C_p(v)^T \\
I & 0 & 0 \\
-B_u(v)^T & -D_{uu}(v)^T & -D_{pu}(v)^T \\
0 & I & 0 \\
-B_p(v)^T & -D_{up}(v)^T & -D_{pp}(v)^T \\
0 & 0 & I
\end{pmatrix} > 0,
\]
\[
\begin{pmatrix} I \\ -\Delta^T \end{pmatrix}^T
\begin{pmatrix} \tilde Q & \tilde S \\ \tilde S^T & \tilde R \end{pmatrix}
\begin{pmatrix} I \\ -\Delta^T \end{pmatrix} < 0
\quad\text{for all } \Delta \in \boldsymbol{\Delta}_c.
\]

Again, even confining the search to the set of multipliers
\[
\tilde{\mathbf{P}} := \left\{ \tilde P = \begin{pmatrix} \tilde Q & \tilde S \\ \tilde S^T & \tilde R \end{pmatrix} \;\middle|\; \tilde R > 0,\;
\begin{pmatrix} I \\ -\Delta_j^T \end{pmatrix}^T \tilde P \begin{pmatrix} I \\ -\Delta_j^T \end{pmatrix} < 0 \ \text{for } j = 1, \dots, N \right\} \tag{9.3.9}
\]
does not lead to a convex feasibility problem.

Since the non-convexity is caused by the multiplication of functions that depend on $v$ with the multipliers, one could be led to the idea that it might help to eliminate as many of the variables involved in $v$ as possible. We can indeed apply the technique exposed in Section 4.5.3 and eliminate $K$, $L$, $M$, $N$.

For that purpose one needs to compute basis matrices
\[
\Phi = \begin{pmatrix} \Phi_1 \\ \Phi_2 \\ \Phi_3 \end{pmatrix} \ \text{of } \ker\begin{pmatrix} B^T & E_u^T & E_p^T \end{pmatrix}
\quad\text{and}\quad
\Psi = \begin{pmatrix} \Psi_1 \\ \Psi_2 \\ \Psi_3 \end{pmatrix} \ \text{of } \ker\begin{pmatrix} C & F_u & F_p \end{pmatrix},
\]
respectively. After elimination, the synthesis inequalities read as
\[
\begin{pmatrix} Y & I \\ I & X \end{pmatrix} > 0, \tag{9.3.10}
\]
\[
\Psi^T
\begin{pmatrix}
I & 0 & 0 \\ A & B_u & B_p \\ 0 & I & 0 \\ C_u & D_{uu} & D_{up} \\ 0 & 0 & I \\ C_p & D_{pu} & D_{pp}
\end{pmatrix}^T
\begin{pmatrix}
0 & X & 0 & 0 & 0 & 0 \\
X & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & Q & S & 0 & 0 \\
0 & 0 & S^T & R & 0 & 0 \\
0 & 0 & 0 & 0 & Q_p & S_p \\
0 & 0 & 0 & 0 & S_p^T & R_p
\end{pmatrix}
\begin{pmatrix}
I & 0 & 0 \\ A & B_u & B_p \\ 0 & I & 0 \\ C_u & D_{uu} & D_{up} \\ 0 & 0 & I \\ C_p & D_{pu} & D_{pp}
\end{pmatrix}
\Psi < 0, \tag{9.3.11}
\]


\[
\Phi^T
\begin{pmatrix}
-A^T & -C_u^T & -C_p^T \\ I & 0 & 0 \\ -B_u^T & -D_{uu}^T & -D_{pu}^T \\ 0 & I & 0 \\ -B_p^T & -D_{up}^T & -D_{pp}^T \\ 0 & 0 & I
\end{pmatrix}^T
\begin{pmatrix}
0 & Y & 0 & 0 & 0 & 0 \\
Y & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & \tilde Q & \tilde S & 0 & 0 \\
0 & 0 & \tilde S^T & \tilde R & 0 & 0 \\
0 & 0 & 0 & 0 & \tilde Q_p & \tilde S_p \\
0 & 0 & 0 & 0 & \tilde S_p^T & \tilde R_p
\end{pmatrix}
\begin{pmatrix}
-A^T & -C_u^T & -C_p^T \\ I & 0 & 0 \\ -B_u^T & -D_{uu}^T & -D_{pu}^T \\ 0 & I & 0 \\ -B_p^T & -D_{up}^T & -D_{pp}^T \\ 0 & 0 & I
\end{pmatrix}
\Phi > 0 \tag{9.3.12}
\]
in the variables $X$, $Y$ and in the multipliers $P$ and $\tilde P$ that are coupled as
\[
\tilde P = \begin{pmatrix} \tilde Q & \tilde S \\ \tilde S^T & \tilde R \end{pmatrix} =
\begin{pmatrix} Q & S \\ S^T & R \end{pmatrix}^{-1} = P^{-1}. \tag{9.3.13}
\]
Hence, after elimination, it turns out that the inequalities (9.3.10)-(9.3.12) are indeed affine in the unknowns $X$, $Y$, $P$ and $\tilde P$. Unfortunately, non-convexity re-appears through the coupling (9.3.13) of the multipliers $P$ and $\tilde P$.

Let us now turn back to the LPV problem where we allow, via the scheduling function $\Delta_c(\Delta)$ in the controller, extra freedom in the design process.

For guaranteeing stability and performance of the controlled system, we employ extended multipliers adjusted to the extended uncertainty structure (9.3.6) that are given as
\[
P_e = \begin{pmatrix} Q_e & S_e \\ S_e^T & R_e \end{pmatrix} =
\begin{pmatrix}
Q & Q_{12} & S & S_{12} \\
Q_{21} & Q_{22} & S_{21} & S_{22} \\
* & * & R & R_{12} \\
* & * & R_{21} & R_{22}
\end{pmatrix}
\quad\text{with } Q_e < 0,\ R_e > 0 \tag{9.3.14}
\]
and that satisfy
\[
\begin{pmatrix}
\Delta & 0 \\ 0 & \Delta_c(\Delta) \\ I & 0 \\ 0 & I
\end{pmatrix}^T
P_e
\begin{pmatrix}
\Delta & 0 \\ 0 & \Delta_c(\Delta) \\ I & 0 \\ 0 & I
\end{pmatrix} > 0 \quad\text{for all } \Delta \in \boldsymbol{\Delta}_c. \tag{9.3.15}
\]

The corresponding dual multipliers $\tilde P_e = P_e^{-1}$ are partitioned similarly as
\[
\tilde P_e = \begin{pmatrix} \tilde Q_e & \tilde S_e \\ \tilde S_e^T & \tilde R_e \end{pmatrix} =
\begin{pmatrix}
\tilde Q & \tilde Q_{12} & \tilde S & \tilde S_{12} \\
\tilde Q_{21} & \tilde Q_{22} & \tilde S_{21} & \tilde S_{22} \\
* & * & \tilde R & \tilde R_{12} \\
* & * & \tilde R_{21} & \tilde R_{22}
\end{pmatrix}
\quad\text{with } \tilde Q_e < 0,\ \tilde R_e > 0 \tag{9.3.16}
\]
and they satisfy
\[
\begin{pmatrix}
I & 0 \\ 0 & I \\ -\Delta^T & 0 \\ 0 & -\Delta_c(\Delta)^T
\end{pmatrix}^T
\tilde P_e
\begin{pmatrix}
I & 0 \\ 0 & I \\ -\Delta^T & 0 \\ 0 & -\Delta_c(\Delta)^T
\end{pmatrix} < 0 \quad\text{for all } \Delta \in \boldsymbol{\Delta}_c.
\]


As indicated by our notation, we observe that
\[
\begin{pmatrix} Q & S \\ S^T & R \end{pmatrix} \in \mathbf{P}
\quad\text{and}\quad
\begin{pmatrix} \tilde Q & \tilde S \\ \tilde S^T & \tilde R \end{pmatrix} \in \tilde{\mathbf{P}}
\]
for the corresponding sub-matrices of $P_e$ and $\tilde P_e$ respectively.

If we recall the description (9.3.6)-(9.3.7) of the controlled LPV system, the desired exponential stability and quadratic performance property is satisfied if we can find a Lyapunov matrix $\mathcal{X}$ and an extended scaling $P_e$ with (9.3.14)-(9.3.15) such that
\[
\mathcal{X} > 0,\qquad
(*)^T
\begin{pmatrix}
0 & \mathcal{X} & 0 & 0 & 0 & 0 & 0 & 0 \\
\mathcal{X} & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & Q & Q_{12} & S & S_{12} & 0 & 0 \\
0 & 0 & Q_{21} & Q_{22} & S_{21} & S_{22} & 0 & 0 \\
0 & 0 & * & * & R & R_{12} & 0 & 0 \\
0 & 0 & * & * & R_{21} & R_{22} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & Q_p & S_p \\
0 & 0 & 0 & 0 & 0 & 0 & S_p^T & R_p
\end{pmatrix}
\begin{pmatrix}
I & 0 & 0 & 0 \\
\mathcal{A} & \mathcal{B}_u & \mathcal{B}_c & \mathcal{B}_p \\
0 & I & 0 & 0 \\
0 & 0 & I & 0 \\
\mathcal{C}_u & \mathcal{D}_{uu} & \mathcal{D}_{uc} & \mathcal{D}_{up} \\
\mathcal{C}_c & \mathcal{D}_{cu} & \mathcal{D}_{cc} & \mathcal{D}_{cp} \\
0 & 0 & 0 & I \\
\mathcal{C}_p & \mathcal{D}_{pu} & \mathcal{D}_{pc} & \mathcal{D}_{pp}
\end{pmatrix}
< 0. \tag{9.3.17}
\]

We are now ready to formulate an LMI test for the existence of an LPV controller such that thecontrolled LPV system fulfills this latter analysis test.

Theorem 9.4 The following statements are equivalent:

(a) There exists a controller (9.3.3) and a scheduling function $\Delta_c(\Delta)$ such that the controlled system as described by (9.3.4)-(9.3.7) admits a Lyapunov matrix $\mathcal{X}$ and a multiplier (9.3.14)-(9.3.15) that satisfy (9.3.17).

(b) There exist $X$, $Y$ and multipliers $P \in \mathbf{P}$, $\tilde P \in \tilde{\mathbf{P}}$ that satisfy the linear matrix inequalities (9.3.10)-(9.3.12).

Proof. Let us first prove (a) $\Rightarrow$ (b). We can apply the technique described in Section 4.5.3 to eliminate the controller parameters in the inequality (9.3.17). According to Corollary 4.15, this leads to the coupling condition (4.5.24) and to the two synthesis inequalities (4.5.25)-(4.5.26). The whole point is to show that the latter two inequalities can indeed be simplified to (9.3.11)-(9.3.12). Let us illustrate this simplification for the first inequality only, since a duality argument leads to the same conclusions for the second one.

With
\[
\Psi_e = \begin{pmatrix} \Psi_1 \\ \Psi_2 \\ 0 \\ \Psi_3 \end{pmatrix}
\ \text{as a basis matrix of } \ker\begin{pmatrix} C & F_u & 0 & F_p \\ 0 & 0 & I_{n_r} & 0 \end{pmatrix},
\]


the inequality that corresponds to (4.5.24) reads as
\[
\Psi_e^T\,(*)^T
\begin{pmatrix}
0 & X & 0 & 0 & 0 & 0 & 0 & 0 \\
X & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & Q & Q_{12} & S & S_{12} & 0 & 0 \\
0 & 0 & Q_{21} & Q_{22} & S_{21} & S_{22} & 0 & 0 \\
0 & 0 & * & * & R & R_{12} & 0 & 0 \\
0 & 0 & * & * & R_{21} & R_{22} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & Q_p & S_p \\
0 & 0 & 0 & 0 & 0 & 0 & S_p^T & R_p
\end{pmatrix}
\begin{pmatrix}
I & 0 & 0 & 0 \\
A & B_u & 0 & B_p \\
0 & I & 0 & 0 \\
0 & 0 & I & 0 \\
C_u & D_{uu} & 0 & D_{up} \\
0 & 0 & 0 & 0 \\
0 & 0 & 0 & I \\
C_p & D_{pu} & 0 & D_{pp}
\end{pmatrix}
\Psi_e < 0.
\]

Due to the zero block in $\Psi_e$, it is obvious that this is the same as
\[
\Psi^T\,(*)^T
\begin{pmatrix}
0 & X & 0 & 0 & 0 & 0 & 0 & 0 \\
X & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & Q & Q_{12} & S & S_{12} & 0 & 0 \\
0 & 0 & Q_{21} & Q_{22} & S_{21} & S_{22} & 0 & 0 \\
0 & 0 & * & * & R & R_{12} & 0 & 0 \\
0 & 0 & * & * & R_{21} & R_{22} & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & Q_p & S_p \\
0 & 0 & 0 & 0 & 0 & 0 & S_p^T & R_p
\end{pmatrix}
\begin{pmatrix}
I & 0 & 0 \\
A & B_u & B_p \\
0 & I & 0 \\
0 & 0 & 0 \\
C_u & D_{uu} & D_{up} \\
0 & 0 & 0 \\
0 & 0 & I \\
C_p & D_{pu} & D_{pp}
\end{pmatrix}
\Psi < 0.
\]

The two zero block rows in the outer factor allow us to simplify this latter inequality to (9.3.11), which finishes the proof of (a) $\Rightarrow$ (b).

The constructive proof of (b) $\Rightarrow$ (a) is more involved and proceeds in three steps. Let us assume that we have computed solutions $X$, $Y$ and $P \in \mathbf{P}$, $\tilde P \in \tilde{\mathbf{P}}$ with (9.3.10)-(9.3.12).

First step: Extension of Scalings. Since $P \in \mathbf{P}$ and $\tilde P \in \tilde{\mathbf{P}}$, let us recall that we have
\[
\begin{pmatrix} \Delta \\ I \end{pmatrix}^T P \begin{pmatrix} \Delta \\ I \end{pmatrix} > 0
\quad\text{and}\quad
\begin{pmatrix} I \\ -\Delta^T \end{pmatrix}^T \tilde P \begin{pmatrix} I \\ -\Delta^T \end{pmatrix} < 0
\quad\text{for all } \Delta \in \boldsymbol{\Delta}_c. \tag{9.3.18}
\]

Due to $0 \in \boldsymbol{\Delta}_c$, we get $R > 0$ and $\tilde Q < 0$. Hence we conclude for the diagonal blocks of $P$ that $Q < 0$ and $R > 0$, and for the diagonal blocks of $\tilde P$ that $\tilde Q < 0$ and $\tilde R > 0$. If we introduce
\[
Z = \begin{pmatrix} I \\ 0 \end{pmatrix} \quad\text{and}\quad \tilde Z = \begin{pmatrix} 0 \\ I \end{pmatrix}
\]
with the same row partition as $P$, these properties can be expressed as
\[
Z^T P Z < 0,\ \tilde Z^T P \tilde Z > 0 \quad\text{and}\quad Z^T \tilde P Z < 0,\ \tilde Z^T \tilde P \tilde Z > 0. \tag{9.3.19}
\]
If we observe that $\operatorname{im}(\tilde Z)$ is the orthogonal complement of $\operatorname{im}(Z)$, we can apply the Dualization Lemma to infer
\[
\tilde Z^T P^{-1} \tilde Z > 0,\ Z^T P^{-1} Z < 0 \quad\text{and}\quad \tilde Z^T \tilde P^{-1} \tilde Z > 0,\ Z^T \tilde P^{-1} Z < 0. \tag{9.3.20}
\]
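This application of the Dualization Lemma can be checked numerically. For the coordinate choice $Z = (I\ 0)^T$, $\tilde Z = (0\ I)^T$ it reduces to a Schur-complement fact: if $Q < 0$ and $R > 0$, then the $(1,1)$ block of $P^{-1}$ is negative definite and the $(2,2)$ block is positive definite. A sketch with random data (the block sizes are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(2)

k, m = 3, 2
# build P = [[Q, S], [S^T, R]] with Q < 0 and R > 0, as guaranteed by (9.3.19)
G = rng.standard_normal((k, k)); Q = -(G @ G.T) - np.eye(k)
H = rng.standard_normal((m, m)); R = (H @ H.T) + np.eye(m)
S = rng.standard_normal((k, m))
P = np.block([[Q, S], [S.T, R]])

Z = np.vstack([np.eye(k), np.zeros((m, k))])    # im(Z)
Zt = np.vstack([np.zeros((k, m)), np.eye(m)])   # im(Z~), orthogonal complement

maxeig = lambda M: np.max(np.linalg.eigvalsh(M))
mineig = lambda M: np.min(np.linalg.eigvalsh(M))

# hypotheses: Z^T P Z < 0 and Z~^T P Z~ > 0
assert maxeig(Z.T @ P @ Z) < 0 and mineig(Zt.T @ P @ Zt) > 0

# conclusion of the Dualization Lemma for this coordinate choice:
# Z~^T P^{-1} Z~ > 0 and Z^T P^{-1} Z < 0
Pinv = np.linalg.inv(P)
assert mineig(Zt.T @ Pinv @ Zt) > 0
assert maxeig(Z.T @ Pinv @ Z) < 0

dualization_ok = True
```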


For the given $P$ and $\tilde P$, we try to find an extension $P_e$ with (9.3.14) such that the dual multiplier $\tilde P_e = P_e^{-1}$ is related to the given $\tilde P$ as in (9.3.16). After a suitable permutation, this amounts to finding an extension
\[
\begin{pmatrix} P & T \\ T^T & T^T N T \end{pmatrix}
\quad\text{with}\quad
\begin{pmatrix} \tilde P & * \\ * & * \end{pmatrix} =
\begin{pmatrix} P & T \\ T^T & T^T N T \end{pmatrix}^{-1}, \tag{9.3.21}
\]
where the specific parameterization of the new blocks in terms of a non-singular matrix $T$ and some symmetric $N$ will turn out convenient. Such an extension is very simple to obtain. However, we also need to obey the positivity/negativity constraints in (9.3.14) that amount to
\[
\begin{pmatrix} Z & 0 \\ 0 & Z \end{pmatrix}^T
\begin{pmatrix} P & T \\ T^T & T^T N T \end{pmatrix}
\begin{pmatrix} Z & 0 \\ 0 & Z \end{pmatrix} < 0 \tag{9.3.22}
\]
and
\[
\begin{pmatrix} \tilde Z & 0 \\ 0 & \tilde Z \end{pmatrix}^T
\begin{pmatrix} P & T \\ T^T & T^T N T \end{pmatrix}
\begin{pmatrix} \tilde Z & 0 \\ 0 & \tilde Z \end{pmatrix} > 0. \tag{9.3.23}
\]

We can assume w.l.o.g. (perturb, if necessary) that $P - \tilde P^{-1}$ is non-singular. Then we set
\[
N = (P - \tilde P^{-1})^{-1}
\]
and observe that (9.3.21) holds for any non-singular $T$.

The main goal is to adjust $T$ to render (9.3.22)-(9.3.23) satisfied. We will in fact construct the sub-blocks $T_1 = T Z$ and $T_2 = T \tilde Z$ of $T = (T_1\ T_2)$. Due to (9.3.19), the conditions (9.3.22)-(9.3.23) read in terms of these blocks as (Schur)
\[
T_1^T \left[ N - Z (Z^T P Z)^{-1} Z^T \right] T_1 < 0
\quad\text{and}\quad
T_2^T \left[ N - \tilde Z (\tilde Z^T P \tilde Z)^{-1} \tilde Z^T \right] T_2 > 0. \tag{9.3.24}
\]

If we denote by $n_+(S)$, $n_-(S)$ the number of positive, negative eigenvalues of the symmetric matrix $S$, we hence have to calculate $n_-\big(N - Z (Z^T P Z)^{-1} Z^T\big)$ and $n_+\big(N - \tilde Z (\tilde Z^T P \tilde Z)^{-1} \tilde Z^T\big)$. Simple Schur complement arguments reveal that
\[
n_-\begin{pmatrix} Z^T P Z & Z^T \\ Z & N \end{pmatrix}
= n_-(Z^T P Z) + n_-\big(N - Z (Z^T P Z)^{-1} Z^T\big)
= n_-(N) + n_-\big(Z^T P Z - Z^T N^{-1} Z\big)
= n_-(N) + n_-\big(Z^T \tilde P^{-1} Z\big).
\]
Since $Z^T P Z$ and $Z^T \tilde P^{-1} Z$ have the same size and are both negative definite by (9.3.19) and (9.3.20), we conclude $n_-(Z^T P Z) = n_-(Z^T \tilde P^{-1} Z)$. This leads to
\[
n_-\big(N - Z (Z^T P Z)^{-1} Z^T\big) = n_-(N).
\]
Literally the same arguments reveal
\[
n_+\big(N - \tilde Z (\tilde Z^T P \tilde Z)^{-1} \tilde Z^T\big) = n_+(N).
\]
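The Schur complement arguments used here are instances of the Haynsworth inertia additivity formula: for a symmetric block matrix with non-singular $(1,1)$ block, the inertia equals the inertia of that block plus the inertia of its Schur complement. A numerical sketch with random (hypothetical) data:

```python
import numpy as np

rng = np.random.default_rng(3)

def inertia(S):
    # (number of positive, number of negative) eigenvalues of a symmetric matrix
    w = np.linalg.eigvalsh(S)
    return int(np.sum(w > 0)), int(np.sum(w < 0))

na, nc = 3, 2
# indefinite but non-singular (1,1) block with known eigenvalues {2, -1, 0.5}
O, _ = np.linalg.qr(rng.standard_normal((na, na)))
A = O @ np.diag([2.0, -1.0, 0.5]) @ O.T
C = rng.standard_normal((nc, nc)); C = (C + C.T) / 2
B = rng.standard_normal((nc, na))

M = np.block([[A, B.T], [B, C]])
SchurA = C - B @ np.linalg.inv(A) @ B.T

# Haynsworth: inertia(M) = inertia(A) + inertia(C - B A^{-1} B^T)
pA, mA = inertia(A)
pS, mS = inertia(SchurA)
assert inertia(M) == (pA + pS, mA + mS)

inertia_ok = True
```

Taking the Schur complement with respect to the other diagonal block instead gives the second chain of equalities used in the derivation above.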


These two relations imply that there exist $T_1$, $T_2$ with $n_-(N)$, $n_+(N)$ columns that satisfy (9.3.24). Hence the matrix $T = (T_1\ T_2)$ has $n_+(N) + n_-(N)$ columns. Since the numbers of rows of $T_1$, $T_2$, $Z$, $\tilde Z$, $N$ are all identical, $T$ is actually a square matrix. We can assume w.l.o.g. - by perturbing $T_1$ or $T_2$ if necessary - that the square matrix $T$ is non-singular.

This finishes the construction of the extended multiplier (9.3.14). Let us observe that the dimensions of $Q_{22}$/$R_{22}$ equal the numbers of columns of $T_1$/$T_2$, which are, in turn, given by the integers $n_-(N)$/$n_+(N)$.

Second Step: Construction of the scheduling function. Let us fix $\Delta$ and apply the Elimination Lemma to (9.3.15), with $\Delta_c(\Delta)$ viewed as the unknown. We observe that the solvability conditions of the Elimination Lemma just amount to the two inequalities (9.3.18). We conclude that for any $\Delta \in \boldsymbol{\Delta}_c$ one can indeed compute a $\Delta_c(\Delta)$ which satisfies (9.3.15).

Due to the structural simplicity, we can even provide an explicit formula which shows that $\Delta_c(\Delta)$ can be selected to depend smoothly on $\Delta$. Indeed, by a straightforward Schur-complement argument, (9.3.15) is equivalent to
\[
\begin{pmatrix}
U_{11} & U_{12} & (W_{11} + \Delta)^T & W_{21}^T \\
U_{21} & U_{22} & W_{12}^T & (W_{22} + \Delta_c(\Delta))^T \\
W_{11} + \Delta & W_{12} & V_{11} & V_{12} \\
W_{21} & W_{22} + \Delta_c(\Delta) & V_{21} & V_{22}
\end{pmatrix} > 0
\]
for $U = R_e - S_e^T Q_e^{-1} S_e > 0$, $V = -Q_e^{-1} > 0$, $W = Q_e^{-1} S_e$. Obviously this can be rewritten as
\[
\begin{pmatrix} U_{22} & * \\ W_{22} + \Delta_c(\Delta) & V_{22} \end{pmatrix}
-
\begin{pmatrix} U_{21} & W_{12}^T \\ W_{21} & V_{21} \end{pmatrix}
\begin{pmatrix} U_{11} & (W_{11} + \Delta)^T \\ W_{11} + \Delta & V_{11} \end{pmatrix}^{-1}
\begin{pmatrix} U_{12} & W_{21}^T \\ W_{12} & V_{12} \end{pmatrix} > 0,
\]
in which $\Delta_c(\Delta)$ only appears in the off-diagonal position. Since we are sure that there does indeed exist a $\Delta_c(\Delta)$ that renders the inequality satisfied, the diagonal blocks must be positive definite. If we then choose $\Delta_c(\Delta)$ such that the off-diagonal block vanishes, we have found a solution of the inequality; this leads to the explicit formula
\[
\Delta_c(\Delta) = -W_{22} +
\begin{pmatrix} W_{21} & V_{21} \end{pmatrix}
\begin{pmatrix} U_{11} & * \\ W_{11} + \Delta & V_{11} \end{pmatrix}^{-1}
\begin{pmatrix} U_{12} \\ W_{12} \end{pmatrix}
\]
for the scheduling function. We note that $\Delta_c(\Delta)$ has dimension $n_-(N) \times n_+(N)$.
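The "choose the free block so that the off-diagonal term vanishes" step is elementary and worth isolating. In the following sketch, a hypothetical block matrix is positive definite as soon as its diagonal blocks are, because the free off-diagonal block $X$ (playing the role of $W_{22} + \Delta_c(\Delta)$ above) can simply cancel the coupling:

```python
import numpy as np

rng = np.random.default_rng(4)

n = 3
# hypothetical positive definite diagonal blocks and an arbitrary coupling B
GA = rng.standard_normal((n, n)); A = GA @ GA.T + np.eye(n)
GC = rng.standard_normal((n, n)); C = GC @ GC.T + np.eye(n)
B = rng.standard_normal((n, n))

# the free block enters only in the off-diagonal position
X = -B                                          # cancel the coupling
M = np.block([[A, (B + X).T], [B + X, C]])      # = blockdiag(A, C)

# positivity now follows from A > 0 and C > 0 alone
assert np.min(np.linalg.eigvalsh(M)) > 0
off_diagonal_ok = True
```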

Third Step: LTI controller construction. After having constructed the scalings, the last step is to construct an LTI controller and a Lyapunov matrix that render the inequality (9.3.17) satisfied. We are confronted with a standard nominal quadratic design problem, of which we are sure that it admits a solution, and for which the controller construction proceeds along the steps that have been intensively discussed in Chapter ??.

We have shown that the LMIs that need to be solved for designing an LPV controller are identical to those for designing a robust controller, with the only exception that the coupling condition (9.3.13) drops out. Therefore, the search for $X$ and $Y$ and for the multipliers $P \in \mathbf{P}$ and $\tilde P \in \tilde{\mathbf{P}}$ to satisfy (9.3.10)-(9.3.12) amounts to testing the feasibility of standard LMIs. Moreover, the proof of Theorem 9.4 provides an explicit construction of the controller. Hence we conclude that we have found a full solution to the quadratic performance LPV control problem (including $L_2$-gain and dissipativity specifications) for full block scalings $P_e$ that satisfy $Q_e < 0$. The more interesting general case without this still restrictive negativity hypothesis is dealt with in future work.

Remarks.

• The proof reveals that the scheduling function $\Delta_c(\Delta)$ has as many rows/columns as there are negative/positive eigenvalues of $P - \tilde P^{-1}$ (assuming w.l.o.g. that the latter is non-singular). If it happens that $P - \tilde P^{-1}$ is positive or negative definite, there is no need to schedule the controller at all; we obtain a controller that solves the robust quadratic performance problem.

• Previous approaches to the LPV problem [2,6,21,37] were based on $\Delta_c(\Delta) = \Delta$, such that the controller is scheduled with an identical copy of the parameters. These results were based on block-diagonal parameter matrices and multipliers that were assumed block-diagonal as well. The use of full block scalings [32] requires the extension to a more general scheduling function that is - as seen a posteriori - a quadratic function of the parameter $\Delta$.

• It is possible to extend the procedure to $H_2$-control and to the other performance specifications in these notes. However, this requires restrictive hypotheses on the system description. The extension to general mixed problems seems nontrivial and is open in its full generality.

9.4 A Sketch of Possible Applications

It is obvious how to apply robust or LPV control techniques in linear design: if the underlying system is affected, possibly in a nonlinear fashion, by some possibly time-varying parameter (such as varying resonance poles and the like), one could strive

• either for designing a robust controller if the actual parameter changes are not available ason-line information

• or for constructing an LPV controller if the parameter (and its rate of variation) can be measured on-line.

As such, the presented techniques can be a useful extension of the nominal design specifications that have been considered previously.

In a brief final and informal discussion we would like to point out possible applications of robustand LPV control techniques to the control of nonlinear systems:


• They clearly apply if one can systematically embed a nonlinear system in a class of linearsystems that admit an LPV parameterization.

• Even if it is required to perform a heuristic linearization step, they can improve classical gain-scheduling design schemes for nonlinear systems since they lead to a one-shot construction ofa family of linear controllers.

9.4.1 From Nonlinear Systems to LPV Systems

In order to apply the techniques discussed in these notes to nonlinear systems, one uses variations of what is often called global linearization.

Consider a nonlinear system described by
\[
\dot x = f(x) \tag{9.4.1}
\]
where we assume that $f : \mathbb{R}^n \to \mathbb{R}^n$ is a smooth vector field.

If $f(0) = 0$, it is often possible to rewrite $f(x) = A(x) x$ with a smooth matrix-valued mapping $A(\cdot)$. If one can guarantee that the LPV system
\[
\dot x = A(\delta(t)) x
\]
is exponentially stable, we can conclude that the nonlinear system
\[
\dot x = A(x) x
\]
has $0$ as a globally exponentially stable equilibrium. Note that one can and should impose a priori bounds on the state trajectories, such as $x(t) \in M$ for some set $M$, so that the stability of the LPV system only has to be assured for $\delta(t) \in M$; of course, one can then only conclude stability for trajectories of the nonlinear system that remain in $M$.
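A toy instance of this embedding (the system is hypothetical and chosen only so that the computations can be done by hand): for $f(x) = \big({-x_1 + \tfrac12 x_2 x_1},\ {-x_2}\big)$ we may take $A(x) = \operatorname{diag}\big({-1 + \tfrac12 x_2},\ {-1}\big)$, and on $M = \{x : |x_2| \le 1\}$ the function $V(x) = \|x\|^2$ decreases for every frozen parameter value $\delta = x_2 \in [-1, 1]$:

```python
import numpy as np

rng = np.random.default_rng(5)

def f(x):
    # hypothetical nonlinear system with f(0) = 0
    return np.array([-x[0] + 0.5 * x[1] * x[0], -x[1]])

def A(x):
    # factorization f(x) = A(x) x, with A depending smoothly on the state
    return np.array([[-1.0 + 0.5 * x[1], 0.0],
                     [0.0, -1.0]])

# the factorization is exact
for _ in range(20):
    x = rng.uniform(-1, 1, 2)
    assert np.allclose(f(x), A(x) @ x)

# on M = {|x_2| <= 1}: A(delta) + A(delta)^T < 0 for every frozen parameter,
# so V(x) = ||x||^2 certifies exponential stability of the LPV embedding
for d in np.linspace(-1.0, 1.0, 21):
    Ad = A(np.array([0.0, d]))
    assert np.max(np.linalg.eigvalsh(Ad + Ad.T)) < 0

embedding_ok = True
```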

A slightly more general procedure allows us to consider arbitrary system trajectories instead of equilibrium points (or constant trajectories) only. In fact, suppose $x_1(\cdot)$ and $x_2(\cdot)$ are two trajectories of (9.4.1). By the mean-value theorem, there exist
\[
\eta_j(t) \in \operatorname{conv}\{x_1(t), x_2(t)\}
\]
such that
\[
\dot x_1(t) - \dot x_2(t) = f(x_1(t)) - f(x_2(t)) =
\begin{pmatrix}
\frac{\partial f_1}{\partial x}(\eta_1(t)) \\ \vdots \\ \frac{\partial f_n}{\partial x}(\eta_n(t))
\end{pmatrix}
(x_1(t) - x_2(t)).
\]
Therefore, the increment $\xi(t) = x_1(t) - x_2(t)$ satisfies the LPV system
\[
\dot\xi(t) = A(\eta_1(t), \dots, \eta_n(t))\, \xi(t)
\]


with parameters $\eta_1, \dots, \eta_n$. Once this LPV system is shown to be exponentially stable, one can conclude that $\xi(t) = x_1(t) - x_2(t)$ converges exponentially to zero for $t \to \infty$. If $x_2(\cdot)$ is a nominal system trajectory (such as an equilibrium point or a given trajectory to be investigated), we can conclude that $x_1(t)$ approaches this nominal trajectory exponentially.

Finally, the following procedure is often referred to as global linearization. Let
\[
\mathcal{F} \ \text{be the closure of} \ \operatorname{conv}\{ f_x(x) \mid x \in \mathbb{R}^n \}.
\]
Clearly, $\mathcal{F}$ is a closed and convex subset of $\mathbb{R}^{n \times n}$. It is not difficult to see that any pair of trajectories $x_1(\cdot)$, $x_2(\cdot)$ of (9.4.1) satisfies the linear differential inclusion
\[
\dot x_1(t) - \dot x_2(t) \in \mathcal{F}\, (x_1(t) - x_2(t)). \tag{9.4.2}
\]

Proof. Fix any $t$ and consider the closed convex set
\[
\mathcal{F}[x_1(t) - x_2(t)] \subset \mathbb{R}^n.
\]
Suppose this set is contained in the negative half-space defined by the vector $y \in \mathbb{R}^n$:
\[
y^T \mathcal{F}[x_1(t) - x_2(t)] \leq 0.
\]
Due to the mean-value theorem, there exists a $\xi \in \operatorname{conv}\{x_1(t), x_2(t)\}$ with
\[
y^T [\dot x_1(t) - \dot x_2(t)] = y^T [f(x_1(t)) - f(x_2(t))] = y^T f_x(\xi) [x_1(t) - x_2(t)].
\]
Since $f_x(\xi) \in \mathcal{F}$, we infer
\[
y^T [\dot x_1(t) - \dot x_2(t)] \leq 0.
\]
Hence $\dot x_1(t) - \dot x_2(t)$ is contained, as well, in the negative half-space defined by $y$. Since $\mathcal{F}$ is closed and convex, we can indeed infer (9.4.2) as desired.

To analyze the stability of the differential inclusion, one can cover the setF by the convex hull offinitely many matricesA j and apply the techniques that have been presented in these notes.
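Because the Lyapunov inequality $A^T P + P A < 0$ is affine in $A$, checking it at the vertex matrices $A_j$ of such a cover certifies it on the whole convex hull, and hence exponential stability of the inclusion. A sketch with randomly perturbed (hypothetical) vertex matrices for which $P = I$ happens to work:

```python
import numpy as np

rng = np.random.default_rng(6)

n = 3
# hypothetical vertex matrices A_j covering the Jacobian set F
verts = [-np.eye(n) + 0.2 * rng.standard_normal((n, n)) for _ in range(4)]

P = np.eye(n)   # candidate common Lyapunov matrix, V(xi) = xi^T P xi

def lyap(Aj, P):
    return Aj.T @ P + P @ Aj

# negativity at the vertices ...
assert all(np.max(np.linalg.eigvalsh(lyap(Aj, P))) < 0 for Aj in verts)

# ... extends to every convex combination, since lyap(., P) is affine
for _ in range(100):
    lam = rng.random(len(verts)); lam /= lam.sum()
    Amix = sum(l * Aj for l, Aj in zip(lam, verts))
    assert np.max(np.linalg.eigvalsh(lyap(Amix, P))) < 0

cover_ok = True
```

In general a common $P$ would itself be found by solving the finitely many vertex LMIs with an SDP solver; the identity choice here only keeps the illustration solver-free.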

Remarks. Of course, there are many other possibilities to embed nonlinear systems in a family of linear systems that depend on a time-varying parameter. Since there is no general recipe for transforming a given problem to the LPV scenario, we have only sketched a few ideas. Although we concentrated on stability analysis, these ideas extend straightforwardly to various nominal or robust performance design problems, which is a considerable advantage over other techniques for nonlinear systems. This is particularly important since, in practical problems, non-linearities are often highly structured and not all states enter non-linearly. For example, in a stabilization problem, one might arrive at a system
\[
\dot x = A(y) x + B(y) u, \qquad y = C x
\]
where $u$ is the control input and $y$ is the measured output that captures, as well, those states that enter the system non-linearly. We can use the LPV techniques to design a stabilizing LPV controller for this system. Since $y$ is the scheduling variable, this controller will, in general, depend non-linearly on $y$; hence LPV control amounts to a systematic technique to design nonlinear controllers for nonlinear systems 'whose non-linearities can be measured'.


9.4.2 Gain-Scheduling

A typical engineering technique to attack design problems for nonlinear systems proceeds as follows:Linearize the system around a couple of operating points, design good linear controllers for each ofthese points, and then glue these linear controllers together to control the nonlinear system.

Although this scheme seems to work reasonably well in many practical circumstances, there areconsiderable drawbacks:

• There is no general recipe for how to glue controllers together. It is hard to discriminate between several conceivable controller interpolation techniques.

• It is not clear how to design the linear controllers such that, after interpolation, the overallcontrolled system shows the desired performance.

• There are no guarantees whatsoever that the overall system is even stabilized, not to speak of guarantees for performance. Only through nonlinear simulations can one roughly assess whether the chosen design scenario has been successful.

Based on LPV techniques, one can provide a recipe to systematically design a family of linear con-trollers that is scheduled on the operating point without the need for ad-hoc interpolation strategies.Moreover, one can provide, at least for the linearized family of systems, guarantees for stability andperformance, even if the system undergoes rapid changes of the operating condition.

Again, we just look at the stabilization problem and observe that the extensions to include performance specifications as well are straightforward.

Suppose a nonlinear system
\[
\dot x = a(x, u), \qquad y = c(x, u) - r \tag{9.4.3}
\]
has $x$ as its state, $u$ as its control, $r$ as a reference input, and $y$ as a tracking error output that is also the measured output. We assume that, for each reference input $r$, the system admits a unique equilibrium (operating condition)
\[
0 = a(x_0(r), u_0(r)), \qquad 0 = c(x_0(r), u_0(r)) - r
\]
such that $x_0(\cdot)$, $u_0(\cdot)$ are smooth in $r$. (In general, one applies the implicit function theorem to guarantee the existence of such a parameterized family of equilibria under certain conditions. In practice, the calculation of these operating points is the first step to be done.)

The next step is to linearize the system around each operating point to obtain
\[
\dot x = a_x(x_0(r), u_0(r))\, x + a_u(x_0(r), u_0(r))\, u, \qquad
y = c_x(x_0(r), u_0(r))\, x + c_u(x_0(r), u_0(r))\, u - r.
\]
This is indeed a family of linear systems that is parameterized by $r$.
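For a concrete (entirely hypothetical) scalar example, take $a(x,u) = -x^3 - x + u$ and $c(x,u) = x$. The operating points can then be computed by hand, $x_0(r) = r$, $u_0(r) = r^3 + r$, and the parameterized family of linearizations can be generated numerically, e.g. by central finite differences:

```python
import numpy as np

def a(x, u):
    # hypothetical scalar dynamics
    return -x**3 - x + u

def c(x, u):
    return x

def equilibrium(r):
    # solved by hand for this toy system
    return r, r**3 + r          # x0(r), u0(r)

def jacobians(x0, u0, h=1e-6):
    # central finite-difference linearization around an operating point
    ax = (a(x0 + h, u0) - a(x0 - h, u0)) / (2 * h)
    au = (a(x0, u0 + h) - a(x0, u0 - h)) / (2 * h)
    return ax, au

family = []
for r in np.linspace(-1.0, 1.0, 5):
    x0, u0 = equilibrium(r)
    # (x0(r), u0(r)) is indeed an operating point: a = 0 and c - r = 0
    assert abs(a(x0, u0)) < 1e-12 and abs(c(x0, u0) - r) < 1e-12
    ax, au = jacobians(x0, u0)
    # matches the analytic Jacobians a_x = -3 x0^2 - 1 and a_u = 1
    assert abs(ax - (-3 * x0**2 - 1)) < 1e-4 and abs(au - 1.0) < 1e-6
    family.append((r, ax, au))

assert len(family) == 5
gain_schedule_ok = True
```

An LPV design would now treat $r$ (or $x_0(r)$) as the scheduled parameter of this family instead of interpolating pointwise designed controllers.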

In standard gain-scheduling, linear techniques are used to find, for eachr , a good tracking controllerfor each of these systems, and the resulting controllers are then somehow interpolated.


At this point we can exploit the LPV techniques to systematically design an LPV controller that achieves good tracking for all reference trajectories in a certain class, even if these references vary quickly with time. This systematic approach directly leads to a family of linear controllers, where the interpolation step is taken care of by the algorithm. Still, however, one has to confirm by nonlinear simulations that the resulting LPV controller works well for the original nonlinear system. Note that the Taylor linearization can sometimes be replaced by global linearization (as discussed in the previous section), which leads to a priori guarantees for the controlled nonlinear system.

Again, this was only a very brief sketch of ideas to apply LPV control in gain-scheduling, and werefer to [13] for a broader exposition of gain-scheduling in nonlinear control.
