Coherent Space Time Block Codes from Sets of Subspaces · 2009. 7. 10.

Christian Pietsch

Coherent Space Time Block Codes from Sets of Subspaces

Multiple input multiple output (MIMO) systems are an attractive option for wireless communications due to their capability of providing extra capacity and/or diversity in comparison with single antenna schemes. This book contributes to the analysis and the construction of space time constellations that allow for a reliable transmission in fading environments where the transmitter does not have any knowledge about the channel state information. Initiated by new findings on the structure of orthogonal space time block codes (OSTBCs), we establish two mappings that link these constellations to unique sets of subspaces, which are termed Grassmannian packings in the mathematical literature. We derive the packing properties that result from OSTBCs. In the first place, this gives us new insight into the structure of OSTBCs. Moreover, since Grassmannian packings have been previously applied for the construction of non-coherent constellations, we identify similarities and differences between a variety of coherent and non-coherent constellations. The packings that are related to OSTBCs are severely constrained. Allowing for more general packings, this lets us construct space time constellations that support higher data rates. It provides a new powerful framework that links the design of general coherent space time constellations with the search for good Grassmannian packings. We derive packing properties that yield space time constellations with excellent performance in terms of mutual information and diversity. We propose two methods that enable the design of these packings. Constructing space time constellations from the resulting packings, we obtain full rate coherent space time block codes (STBCs) that turn out to be superior to the best known coherent STBCs that we are aware of. While the emphasis of this work is on the analysis and the construction of space time constellations, a well defined transmission model eases many derivations. In particular, the relationship between models that include spreading matrices, dispersion matrices, real-valued and complex-valued notation turns out to be important.

Cover design

The figure on the front cover symbolizes the mapping of two points from the Grassmann manifold GR(2, 1) to the two dimensional Euclidean space ℝ². A detailed explanation is given in Appendix 3.A1.3, see also Figure 3.7.

Acknowledgments

Having finalized my Ph.D. with this thesis, I may now look back at some wonderful years that I was fortunate enough to spend at the Institute of Information Technology at the University of Ulm. Surely, this time would not have been so rewarding if it had not been for the great people I had the chance to work with. At this point, I would like to express my gratitude to everybody who contributed to my work and thesis in one way or another.

In particular, I owe many thanks to Prof. Jürgen Lindner for providing me with the great opportunity to do challenging research at his institute and for the trust he put into my work by giving me a lot of freedom to choose an interesting topic and by letting me present my results at various national and international conferences. Additionally, I am very much indebted to Prof. Tobias Weber, who kindly agreed to act as a co-supervisor.

I would like to say thank you to all my former colleagues, none of whom I would want to have missed. It is always difficult to pick out individuals since others may unintentionally appear less important. Nevertheless, I would like to mention a few people explicitly. There is my long term friend Ivan Periša, with whom I have been lucky to spend my years throughout high school and university. There are my office mates Siegfried Grob and Alexander Linduska, who always contributed to an agreeable working atmosphere. Surely, I will always remember the regular tea kitchen discussions with Markus Dangl, Alexander Linduska, Christian Sgraja, Zoran Utkovski, and Matthias Wetz on various scientific and non-scientific topics. Moreover, I really enjoyed the trips to various COST meetings with Werner Teich, who was also always patient enough to listen to virtually anything that one wanted to discuss. Additionally, there are Werner Birkle and Werner Hack, who were invaluable for my work with the MIMO demonstrator. Finally, many thanks to all those colleagues who were involved in proof-reading my manuscript!

Last but not least, I am deeply grateful to my parents for their continuous support throughout the past years.

Christian Pietsch

Neu-Ulm, October 2008

Contents

Acknowledgments

1 Introduction

2 Models and Design Criteria for Space Time Block Coding Techniques
  2.1 The Multiple Input Multiple Output Channel
  2.2 Basic System Models and Their Relationship
    2.2.1 Linear Dispersion Model
    2.2.2 Spreading Model
    2.2.3 Matched Filtering and Transmission Matrices
    2.2.4 Relationship Between the Two Models
  2.3 Real-Valued Notation
    2.3.1 Channel Matrices and Signal Vectors
    2.3.2 Real-Valued Spreading Matrices
    2.3.3 Real-Valued Dispersion Matrices
  2.4 Summary of General System Model Equations and Discussion
  2.5 Maximum Likelihood Detection
    2.5.1 ML Metric – Decision Criterion
    2.5.2 Structural Properties
  2.6 Capacity and Rate
    2.6.1 Spreading Matrices and Capacity
    2.6.2 Dispersion Matrices and Capacity
    2.6.3 Outage Capacity Versus Ergodic Capacity
  2.7 Diversity
    2.7.1 Diversity Considerations for a Single Symbol Transmission
    2.7.2 Diversity Considerations for a Multiple Symbol Transmission
  2.8 Joint Consideration of Capacity, Rate, and Diversity
  Appendix
  2.A1 Another Example of Notation

3 Orthogonal Space Time Block Codes – Two Subspace Representations
  3.1 Dispersion Matrices and Basic Properties of OSTBCs
    3.1.1 Rate Related Issues
    3.1.2 Rate 1/2 OSTBCs with Minimal Delay
    3.1.3 Notational Aspects
  3.2 Sets of Subspaces from Dispersion Matrices
    3.2.1 Square Dispersion Matrices
    3.2.2 Non-Square Dispersion Matrices
  3.3 Subspaces from Square Transmit Matrices and Implications
    3.3.1 Identification of Transmit Matrices with Subspaces
    3.3.2 Subspace Based ML Metric
    3.3.3 Constraints on Packings Related to OSTBCs
    3.3.4 Geometrical Interpretation
    3.3.5 Link Between Coherent and Non-Coherent Schemes
  Appendix
  3.A1 A Short Introduction to Grassmann Manifolds
    3.A1.1 Definition
    3.A1.2 Distance Measures
    3.A1.3 Example: Points in GR(2, 1) and the Embedding of GR(2, 1) in ℝ²
  3.A2 Complex Orthogonal Designs from Real Orthogonal Designs
  3.A3 STBCs from Packings with Subspace Dimensionality ≤ ℓ/2
  3.A4 Derivation of (3.77)

4 Design of High Rate Coherent STBCs from Grassmannian Packings
  4.1 Mutual Information Preserving Sets of Subspaces
    4.1.1 Packing Properties and Geometrical Interpretation
    4.1.2 A Packing from an Orthogonal Spreading Matrix
  4.2 Diversity Optimization
    4.2.1 Pairwise Diversity
    4.2.2 Pairwise Diversity and Real-Valued Notation
    4.2.3 Enhanced Packings by Applying Rotations
    4.2.4 Random Search for Good Packings
  4.3 Comparison with Other STBCs
    4.3.1 An Upper Bound on the Performance – OSTBCs
    4.3.2 The Linear Dispersion Codes Proposed in [51]
    4.3.3 Multi Layer Space Time Block Codes
    4.3.4 Golden Code
    4.3.5 Comparison in Terms of BER Performance
  Appendix
  4.A1 Some Principal Angles Between the Subspaces in (4.26)
  4.A2 Diversity Considerations with Complex-Valued Data Symbols
  4.A3 Dispersion Matrices from Random Grassmannian Packings
    4.A3.1 A Set of Real-Valued 4 × 4 Dispersion Matrices
    4.A3.2 A Set of Complex-Valued 4 × 4 Dispersion Matrices

5 Summary and Future Work

List of Frequently Used Accents, Acronyms, Operators, and Symbols

Bibliography

1 Introduction

In the mid-1990s, Foschini [34, 35] and Telatar [106, 107] triggered a new field of interest by pointing out that there exists an enormous capacity advantage in wireless communications when using multiple antennas, both at the transmitter and at the receiver. Ever since, these systems, which have come to be known as multiple input multiple output (MIMO) systems, have been treated extensively by many research teams worldwide, resulting in an abundance of publications on virtually any related issue. Most significantly, besides capacity, MIMO schemes are nowadays also widely appreciated for their ability to enable reliable transmissions. Meanwhile, several testbeds have successfully verified that MIMO transmissions are possible under realistic conditions, see [8, 10, 53, 87, 92] for some examples. Furthermore, companies are starting to implement first products utilizing some of the predicted gains. Despite all these efforts, there is still a big gap between what theory predicts and what is achieved in practice. Hence, more sophisticated techniques are yet to be developed to make use of the full potential that MIMO systems bear. The following paragraphs briefly summarize the most important existing results and outline the contributions of this thesis to the enhancement of MIMO techniques.

Concerning capacity, MIMO systems were found to be attractive primarily because many scenarios allow for a linear increase of mutual information with respect to the number of transmit antennas or the number of receive antennas¹ while maintaining bandwidth and fixing transmit power [107]. Indeed, this is a significant capacity gain, especially at high signal to noise ratios (SNRs), since additional bandwidth is usually not available (or very expensive) and simply increasing transmit power only leads to a logarithmic capacity gain [19]. The linear increase is due to the fact that the channel impulse response usually spans a multi-dimensional space when using several antennas at the transmitter and at the receiver [107]. Each of these dimensions may be used for an independent data transmission, thus providing a number of parallel transmission links. In practice, however, accessing these links individually would require perfect channel state information at the transmitter [81, 95, 110], which is unlikely to be obtained. Outdated channel state information or simply estimation errors may cause significant losses in data rate or performance [72, 84]. In this context, a lot of theoretical work has been carried out to compute the maximum achievable data rates under certain constraints like, for example, the availability of perfect or imperfect channel state information at the transmitter or the receiver, see for instance [45, 77] and the references therein. Usually, the lack of channel knowledge, no matter whether this is at the transmitter or at the receiver, leads to a small degradation in terms of mutual information [45], but, most importantly, it does not affect the linear increase of capacity. As a matter of fact, Foschini already proposed a scheme in his initial publication [34], nowadays commonly referred to as diagonal Bell Labs layered space time architecture (D-BLAST), that makes use of the additional data rate while only demanding channel knowledge at the receiver. Most existing schemes have in common that they transmit data in parallel in order to exploit the additional data rate. Due to this fact, the capacity gain of MIMO systems is often also termed multiplexing gain in the literature [65].

¹ Depending on which number is smaller.

With respect to reliability, the gain results from the fact that the individual channel impulse responses from each transmit antenna to each receive antenna often exhibit independent fading characteristics [18, Chapter 6]. Therefore, letting the same data experience multiple components of the MIMO channel enhances the likelihood of the data being received correctly at the receiver. Such gains, where replicas of the same data are transmitted across several independent transmission links, are commonly named diversity gains [88]. In the case of MIMO systems, we usually speak of spatial diversity as it primarily results from the spatial domain. Its magnitude is specified according to the number of independent transmission links. For a time invariant flat fading MIMO scheme, the maximum value is easily seen to be given by the product of the number of transmit antennas and the number of receive antennas [103]. Space time codes that achieve the maximum level of diversity are said to be fully diverse. Interestingly, a moderate number of antennas at the transmitter and at the receiver is already sufficient to provide a bit error probability that comes close to additive white Gaussian noise (AWGN) performance in an independent identically distributed (i.i.d.) Rayleigh fading environment [91]. Concerning MIMO transmissions, Alamouti [3] proposed a simple encoding scheme that efficiently exploits the full potential of spatial diversity without requiring channel state information at the transmitter. While Alamouti’s scheme was designed only for systems with two transmit antennas, Tarokh et al. [102] came up with a generalization for an arbitrary number of transmit antennas, named orthogonal space time block codes (OSTBCs). Besides diversity, these space time codes are also very attractive because they reduce the detection complexity of the maximum likelihood (ML) receiver significantly [3, 65], but their drawback is that they are not capable of providing a multiplexing gain [66, 93, 102]. Even worse, OSTBCs actually have a rate loss compared with single antenna systems if more than two transmit antennas are applied. This led to new designs like quasi OSTBCs [61] or diagonal algebraic space time (DAST) block codes [20] in order to cope with this additional loss.

Initially, capacity and diversity related issues were mainly considered as two separate lines of research. Consequently, the majority of the early MIMO transmission techniques put emphasis on maximizing either data rate or diversity, D-BLAST and OSTBCs being just two examples. Nowadays, it has been widely accepted that one needs to consider both aspects jointly for the construction of well performing space time constellations. Bounds have been established on how much mutual information and diversity can be achieved simultaneously [21, 120]. It becomes clear that large data rates and a high level of diversity are not necessarily mutually exclusive properties asymptotically [21]. In the literature, the term space time coding has been adopted, generally, for the application of MIMO signaling techniques at the transmitter. It is also common practice to distinguish between two branches, namely, the design of space time trellis codes (STTCs) and the design of space time block codes (STBCs) [40, 81, 110]. While STTCs, see e.g. [103, 114] for some early work, often justify their name since they allow for gains similar to the ones ordinary channel codes provide, it is important to note that many (but not all) STBCs do not have this ability. Nevertheless, the design of STBCs has proven to be a very powerful approach to make MIMO transmissions feasible in today’s applications, partly also due to the fact that they require less complex receivers. This thesis contributes to the search for superior STBCs with new findings on the structure of OSTBCs and new strategies for the construction of high rate space time constellations that evolve from them.

STBCs themselves may be grouped into various categories. In the first place, they may be distinguished by the amount of channel state information they need. Some schemes include partial knowledge about the channel impulse response (e.g., its statistical properties) for optimization at the transmitter, see e.g. [27, 64, 95, 121], whereas others do not even require channel state information at the receiver, explicitly or implicitly. The latter ones are usually referred to as differential, see e.g. [50, 58, 60, 67], or non-coherent space time codes, see e.g. [49, 56, 57, 118, 119], respectively. We will mostly deal with coherent space time coding schemes that require channel knowledge only at the receiver, but an interesting link to non-coherent schemes will be discussed, too [85]. There exists an abundance of different proposals on how to design such STBCs. Most strategies make use of very general concepts like, e.g., the linear dispersion codes (LDCs) [51] in order to structure the design problem. Having a structured model, numerical techniques are often applied to find good solutions, see e.g. [51, 52]. Unfortunately, the corresponding functions are usually highly non-convex, which makes it difficult to find an optimal solution. On the other hand, a purely analytical optimization is also hard to achieve. Very powerful STBCs have been proposed by applying division algebras and number theory, see [7, 22, 25, 37, 97]. However, these techniques are difficult to handle, in particular for higher dimensional STBCs, where they may also require some type of numerical optimization [80].

Now, in detail, the contributions of this work are the following. We show that there exist two possibilities to identify OSTBCs with sets of subspaces, which are named Grassmannian packings in the mathematical literature [28]. The properties of these packings give us new insight into the geometrical structure of OSTBCs. In particular, we obtain a new link between coherent and non-coherent space time constellations which provides detailed information on their relationship. Motivated by these results, we generalize this concept to construct high rate STBCs. This gives us a new powerful framework that links the design of general coherent space time constellations with the search for good Grassmannian packings. We derive packing properties that yield space time constellations with optimized performance in terms of mutual information and diversity. We also suggest two methods that enable us to construct excellent packings. From these packings, we obtain full rate coherent STBCs that turn out to be superior to the best known coherent STBCs [7, 24, 116, 117] that we are aware of. The emphasis of this work is clearly on the relationship between coherent STBCs and Grassmannian packings, but a well defined transmission model turns out to be helpful to identify properties and design criteria. Therefore, we work out a unified model that includes spreading matrices and dispersion matrices, both applying real-valued or complex-valued notation. While most parts of the model have been described in the literature before, it is particularly their connection that gives us new insights. Besides, much of the work relies on real-valued notation, whose potential is often neglected. The main results and some related work were also published in [82–86].

This thesis is structured into three major parts. In Chapter 2, we discuss various aspects of the transmission model. We also state some important design criteria that are commonly used for the construction of space time constellations. Chapter 3 addresses the connection between OSTBCs and Grassmannian packings. Further, we show what the implications are on the relationship between coherent and non-coherent packing based constellations. Finally, we generalize the concept for the design of high rate coherent space time constellations in Chapter 4. Here, the emphasis is on how to construct good sets of dispersion matrices from Grassmannian packings. A summary of the main results is given in Chapter 5. Additionally, we point out how our findings may affect future research in this field.

A list of symbols, operators, and accents is given at the end of this thesis. We usually introduce all entities when they appear for the first time, but the reader is advised to consult this list when in doubt about a particular definition. As for general conventions, we denote vectors and matrices by bold face lower and upper case letters, respectively. (·)^H is the Hermitian (conjugate transpose), (·)^T is the transpose, and (·)^* the complex conjugate of a vector or a matrix. The trace of a matrix A is denoted by tr(A). Its squared Frobenius norm is ‖A‖². If not introduced otherwise, we refer to entries of A by [A]_ij where i indicates the row and j the column. Similarly, [A]_:j and [A]_i: select entire columns and rows, and [a]_i addresses the ith element of a vector a. Real-valued notation is indicated by a long bar above the letter (not to be confused with the short bar, which is the statistical average, e.g., the average energy per bit Ē_b). If the results apply for real-valued and complex-valued notation, we usually omit the bar. Having a complex-valued symbol, ℜ(·) and ℑ(·) denote its real part and imaginary part. Finally, we frequently use diagonal matrices Φ with (principal) angles on the diagonal. Then, cos(Φ) and sin(Φ) must be interpreted as the cosine and the sine of the diagonal elements, respectively, without affecting the off-diagonal elements.
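As a small illustration of these conventions, the following Python sketch evaluates tr(A), ‖A‖², and cos(Φ) for hand-picked matrices. The helper names (`hermitian`, `matmul`, `trace`, `frobenius_sq`, `diag_apply`) and the example values are assumptions made for this illustration only. The sketch relies on the standard identity ‖A‖² = tr(A^H A).

```python
import math

def hermitian(A):
    """Conjugate transpose (.)^H of a matrix stored as a list of rows."""
    return [[complex(A[i][j]).conjugate() for i in range(len(A))]
            for j in range(len(A[0]))]

def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def trace(A):
    """Sum of the diagonal entries, tr(A)."""
    return sum(A[i][i] for i in range(len(A)))

def frobenius_sq(A):
    """Squared Frobenius norm: sum of |[A]_ij|^2 over all entries."""
    return sum(abs(a) ** 2 for row in A for a in row)

def diag_apply(func, Phi):
    """Apply func to the diagonal entries only; off-diagonal entries stay untouched."""
    return [[func(v) if i == j else v for j, v in enumerate(row)]
            for i, row in enumerate(Phi)]

# Hypothetical example values (not taken from the thesis).
A = [[1 + 2j, 0], [3, -1j]]
Phi = [[0.0, 0.0], [0.0, math.pi / 3]]

norm_sq = frobenius_sq(A)                    # |1+2j|^2 + 0 + 9 + 1
trace_form = trace(matmul(hermitian(A), A))  # tr(A^H A), same value up to rounding
cos_Phi = diag_apply(math.cos, Phi)          # off-diagonal zeros remain zeros
sin_Phi = diag_apply(math.sin, Phi)
```

Under this convention, cos(Φ)² + sin(Φ)² equals the identity matrix, which would not hold if the cosine were also applied to the off-diagonal zeros.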

2 Models and Design Criteria for Space Time Block Coding Techniques

Throughout the literature, plenty of models have been deployed to describe space time coded transmissions. Most models are related, but each bears its own advantages and disadvantages. Many space time codes are closely connected with the model that was used to describe them. Some of these models are similar concerning their concept and merely differ in notation. Some other models, however, apply different concepts and, therefore, provide easier access to other code properties. For this reason, the description of an existing space time code using such models often reveals properties that are hidden otherwise. Also, understanding the relationship between the models is essential for the comparison of space time codes that were introduced using different frameworks. In this chapter, we discuss two important concepts that are used to describe space time transmissions, namely the dispersion model [51] and the spreading model [52, 83]. As most models have in common that using complex-valued data symbols imposes some restrictions on the code design, we explain how real-valued data symbols and a purely real-valued notation [79] allow for a generalization which has slightly different implications for the two discussed models. We also address basic aspects of ML detection and design criteria like transmission rate and diversity. To begin with, we introduce the MIMO channel and the specific assumptions that we apply in the course of this thesis.

2.1 The Multiple Input Multiple Output Channel

In a transmission system with multiple antennas at the transmitter and the receiver, each transmit and receive antenna pair defines a wireless channel. The set of all these wireless channels is commonly referred to as the MIMO channel, see Figure 2.1. The number of transmit antennas, which we denote by n_t, gives the number of channel inputs and the number of receive antennas, say n_r, defines the number of channel outputs. In total, we thus have n_t n_r different pairs of transmit and receive antennas. Each of the wireless channels is described by its impulse response h_ij(τ, t) where i and j identify the receive antenna and the transmit antenna, respectively. In general, these impulse responses may be a function of the absolute time t and disperse the transmit signals in time, which is indicated by the delay time τ.

Figure 2.1: MIMO channel with n_t transmit antennas and n_r receive antennas. (Transmit antennas 1, ..., n_t are linked to receive antennas 1, ..., n_r by the impulse responses h_11(τ, t), ..., h_{n_r n_t}(τ, t).)

We restrict our analysis to slowly time variant, frequency flat channels only. This eases the analysis because there exists a simple baseband channel model where each channel impulse response may be described by a single complex-valued coefficient h_ij, see [88]. Besides, the computation of the channel output then reduces to a multiplication instead of a convolution, which would be much more difficult to handle in the theoretical analysis. Moreover, we assume that the time variance of the channel is sufficiently slow so that h_ij may be considered constant for the duration of at least one space time codeword. It may change from codeword to codeword, though. All in all, our assumptions comply with the well known quasi static [45] channel model where the channel is assumed constant during one block of symbols, which we assume to extend across ℓ time slots, and varies independently from block to block. Note that we do not indicate the blockwise time dependence of h_ij because we only consider single blocks in our analysis where h_ij is always fixed.

At first glance, these restrictions appear to be quite severe. However, frequency selective channels may be easily split up into several parallel frequency flat channels using orthogonal frequency division multiplexing (OFDM) [32]. Then, space time coding is applied to each of the frequency slots individually. Doing so, one surely loses frequency diversity, but spatial diversity often provides a level of diversity that is sufficiently high already. Besides, additional spreading across the frequency domain may always be applied, independently of the actual space time code. Also, the assumption that the channel impulse response does not change for a small number of time slots is quite moderate.

To allow for a discrete matrix transmission model, all individual channel impulse responses are gathered in the channel matrix (which we also refer to as (MIMO) channel impulse response in the following)

$$\mathbf{H} = \begin{pmatrix} h_{11} & \cdots & h_{1 n_t} \\ \vdots & \ddots & \vdots \\ h_{n_r 1} & \cdots & h_{n_r n_t} \end{pmatrix} \tag{2.1}$$

where the individual channel impulse response h_ij is the entry of row i and column j.

2.2 Basic System Models and Their Relationship

An elementary model for data transmissions across a MIMO channel where the channel matrix H stays constant for a duration of ℓ time slots is given by

$$\mathbf{Y} = \mathbf{H}\mathbf{X} + \mathbf{N}. \tag{2.2}$$

Here, X is named transmit matrix. It is a matrix with n_t rows and ℓ columns whose entries are typically complex values, which we term transmit symbols. The transmit symbols in row i are assigned to transmit antenna i and the ℓ consecutive time slots. Further, Y is the receive matrix whose entries are called receive symbols. Like for X, each column is associated with a distinct time slot whereas the rows correspond to different antennas (here, the receive antennas), see also Figure 2.2 for an illustration of how the model incorporates time slots. Y results from the superposition of the desired signal components, namely HX, and the noise matrix N. We model the receiver noise components that N gathers by independent identically distributed (i.i.d.) Gaussian random variables with variance σ_n² per real dimension.

Figure 2.2: Elementary model for MIMO transmissions; blocks 1, ..., ℓ mark time slots. (The figure depicts Y = HX + N, with the columns of Y, X, and N indexed by the time slots 1, ..., ℓ.)

A space time constellation or, equivalently, an STBC is well defined by a set of transmit matrices 𝒳 whose cardinality we denote by |𝒳|. We convey data across the MIMO channel by selecting and transmitting a matrix from 𝒳. Hereby, a single transmit matrix carries log₂(|𝒳|) bit of information. We often refer to each element of 𝒳 as a point of the constellation. This is motivated by the observation that each transmit matrix denotes a point in the signal space of the transmission [70]. Clearly, the set of transmit matrices should be designed such that the receiver has the chance to distinguish between its elements. The constraints on the elements of 𝒳 very much depend on the assumptions that we impose on the receiver and the characteristics of the MIMO channel. Details on this are not discussed in this section, but we should mention that the subsequent considerations apply to constellations that are meant to be used for coherent transmissions, i.e., transmissions where the receiver has knowledge about H. We do refer to non-coherent space time constellations again towards the end of Chapter 3, though, to point out some similarities between the coherent ones that we discuss and general non-coherent constellations.
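The elementary model (2.2) can be sketched in a few lines of Python. The code below draws one channel realization, picks a transmit matrix from a toy constellation, and forms Y = HX + N. The constellation, the function names, and the choice of i.i.d. Rayleigh fading with unit average power for H are assumptions made for this illustration; the text itself only fixes the matrix dimensions and the noise variance σ_n² per real dimension.

```python
import math
import random

def matmul(A, B):
    """Matrix product of two matrices stored as lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def complex_gauss(rng, sigma):
    """Complex Gaussian sample with variance sigma**2 per real dimension."""
    return complex(rng.gauss(0.0, sigma), rng.gauss(0.0, sigma))

def draw_channel(rng, n_r, n_t):
    """One n_r x n_t channel matrix H, held fixed for a whole block."""
    # Assumption: i.i.d. Rayleigh fading with unit average power per coefficient.
    s = math.sqrt(0.5)
    return [[complex_gauss(rng, s) for _ in range(n_t)] for _ in range(n_r)]

def transmit(H, X, sigma_n, rng):
    """Return Y = H X + N for one block of time slots."""
    HX = matmul(H, X)
    return [[hx + complex_gauss(rng, sigma_n) for hx in row] for row in HX]

# Hypothetical toy constellation: four 2 x 2 transmit matrices (n_t = 2, l = 2).
constellation = [
    [[1, 0], [0, 1]],
    [[-1, 0], [0, -1]],
    [[0, 1], [-1, 0]],
    [[0, -1], [1, 0]],
]
bits_per_matrix = math.log2(len(constellation))

rng = random.Random(0)
H = draw_channel(rng, n_r=3, n_t=2)
Y = transmit(H, constellation[2], sigma_n=0.1, rng=rng)
print(bits_per_matrix, len(Y), len(Y[0]))
```

With four transmit matrices, each block of ℓ = 2 time slots carries log₂(4) = 2 bit, and the receive matrix Y has n_r = 3 rows and ℓ = 2 columns.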

Now, an unstructured set 𝒳 is impractical for various reasons. Most importantly, efficient equalization techniques are heavily based on how the information is embedded in the transmit matrices, see e.g. [2, 29, 31, 44], but also the construction of well performing sets of transmit matrices requires some sort of structure. Almost all coherent STBCs¹ that have been proposed in the literature are constructed from multiple data symbols x_i which are linearly combined to form the entries of the transmit matrices. Much of our analysis is also based on such linear STBCs where we always assume that n data symbols are transmitted jointly. The data symbols themselves are chosen independently from discrete sets of scalars that are termed alphabets 𝒜_i, i.e. x_i ∈ 𝒜_i. Typically, all data symbols are chosen from the same alphabet, i.e. 𝒜 ≡ 𝒜_i. The essence of the design of linear STBCs is to intelligently distribute the symbols x_i across all transmit antennas within a block of ℓ consecutive time slots. We should always keep in mind, however, that the linear structure imposes constraints that prohibit certain gains, which other schemes are capable of achieving. Some more details on this will be discussed in Section 3.3 and in the first two paragraphs of Chapter 4. For the time being, though, we restrict our considerations to these linear schemes.

The dispersion model [51] and the spreading model [52, 83] are two schemes that

provide a reasonable mathematical framework for the description of linear STBCs. The

dispersion model is closely linked with the matrix model that we introduced in (2.2)

while the spreading model deploys signal vectors. Nevertheless, having already agreed

on the specifications of the MIMO channel in Section 2.1, both models are equivalent

with respect to the type of STBCs which they describe. Each model has its own advan-

tages and disadvantages when it comes to the identification of certain properties of the

STBCs, though. In the remainder of this section, we will discuss the conceptual aspects

and the structural properties of both models and their relationship. To do so, we do

not specify the elements of A at the moment. Hence, the reader may assume complex-valued or real-valued elements in A. However, we soon discover that some existing STBCs require real-valued alphabets in order to be covered properly. So, we stress that we usually assume real-valued alphabets in all subsequent sections and chapters if not mentioned otherwise. Note that this is also indicated by the bar above all data symbols in upcoming sections and chapters (e. g., x̄i is a real-valued data symbol). Real-valued alphabets also raise the desire for a general real-valued notation, but this discussion

is postponed to Section 2.3 because it is less common in literature and needs more

attention. Some additional remarks are finally given in Section 2.4, including a brief

summary of the actual model equations that we apply throughout this work.

¹ An example of an exception is the STBC described in [94].



2.2.1 Linear Dispersion Model

The linear dispersion model [51] defines the transmit matrices as a weighted sum of

dispersion matrices Ci where the weighting coefficients are the information carrying

data symbols xi. Mathematically, this may be expressed by

$$X = \sum_{i=1}^{n} x_i C_i. \tag{2.3}$$

Hence, the STBC is fully defined by a set of dispersion matrices C with |C| = n and the symbol alphabet A.

Example 2.1 shows how a well known STBC, namely Alamouti’s OSTBC [3], fits

into the linear dispersion model. We note that a formal introduction to OSTBCs is

not to come before Chapter 3. Nevertheless, Alamouti’s OSTBC [3] serves well as an

illustrative example because of its simple structure. We thus make use of it in several

examples to point out specific aspects about space time coding along the way.

Example 2.1 (Linear dispersion matrices). Most commonly Alamouti’s code is represented by the orthogonal code matrix [102]

$$X = \begin{pmatrix} x_1 & -x_2^* \\ x_2 & x_1^* \end{pmatrix} \tag{2.4}$$

which we simply assume given at the moment. For implementational purposes, (2.4)

actually gives the necessary detail in the sense that we know when and where to trans-

mit which symbol. Nevertheless, we require the dispersion matrices because they give

us more insight into the structure of the scheme later on. Obviously, two complex-valued symbols are transmitted jointly. More specifically, at time slot 1, x1 and x2 are transmitted from antenna 1 and 2, respectively. At time slot 2, −x2* and x1* are transmitted from antenna 1 and 2. This means, we would need two dispersion matrices, one for x1 and one for x2. However, it immediately becomes clear that a description with dispersion matrices according to (2.3) using the complex symbols x1 and x2 is not possible because of the complex conjugate operation. Still, a description with dispersion matrices exists. By splitting up the symbols into their real and imaginary parts, we come up with a slightly modified version of (2.4), namely

$$X = \begin{pmatrix} \Re(x_1) + j\Im(x_1) & -\Re(x_2) + j\Im(x_2) \\ \Re(x_2) + j\Im(x_2) & \Re(x_1) - j\Im(x_1) \end{pmatrix}. \tag{2.5}$$

From this, we can easily derive a set of four dispersion matrices

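$$C_1 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, \quad C_2 = j\begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \tag{2.6a}$$

$$C_3 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \quad \text{and} \quad C_4 = j\begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \tag{2.6b}$$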



that are weighted with the symbols ℜ(x1), ℑ(x1), ℜ(x2), and ℑ(x2), respectively. Later on, we simply refer to these real-valued data symbols as x̄1, x̄2, x̄3, and x̄4.

Remark. In the previous example, it might have been more straightforward to use a

STBC that does not require the complex conjugate operation, but, by using Alamouti’s

OSTBC, we already make the reader aware of the difficulties that occur if the data

symbols are chosen from a complex-valued alphabet.
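As a quick numerical sanity check of Example 2.1, the following sketch rebuilds the code matrix (2.4) from the four dispersion matrices in (2.6) via the weighted sum (2.3). It assumes NumPy; the helper names `disperse` and `alamouti` and the test symbols are illustrative choices, not part of the original text.

```python
import numpy as np

# Dispersion matrices of Alamouti's OSTBC as listed in (2.6a) and (2.6b).
C = [
    np.array([[1, 0], [0, 1]], dtype=complex),
    1j * np.array([[1, 0], [0, -1]], dtype=complex),
    np.array([[0, -1], [1, 0]], dtype=complex),
    1j * np.array([[0, 1], [1, 0]], dtype=complex),
]

def disperse(real_symbols, dispersion_matrices):
    """Weighted sum of dispersion matrices, cf. (2.3)."""
    return sum(x * Ci for x, Ci in zip(real_symbols, dispersion_matrices))

def alamouti(x1, x2):
    """Code matrix (2.4): rows are antennas, columns are time slots."""
    return np.array([[x1, -np.conj(x2)], [x2, np.conj(x1)]])

# Arbitrary complex test symbols (illustrative values only).
x1, x2 = 1 + 2j, -3 + 0.5j

# Real-valued data symbols: real and imaginary parts of x1 and x2.
x_real = [x1.real, x1.imag, x2.real, x2.imag]

X = disperse(x_real, C)
energy = abs(x1) ** 2 + abs(x2) ** 2
```

Here `X` coincides with `alamouti(x1, x2)`, and `X @ X.conj().T` equals `energy` times the identity, which is the orthogonality the example refers to.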

2.2.2 Spreading Model

Clearly, linear space time block coding may be interpreted as spreading data symbols

across the dimensions space and time. It may appear that this is merely a matter

of wording. However, spreading schemes are usually linked to a certain model de-

scription, which is rather uncommon for MIMO systems, but otherwise a well known

concept in communications [69]. Some authors actually do apply the model implic-

itly to simplify certain types of analysis, see e. g. [22, 52, 101], but we often find it

beneficial to define a spreading model explicitly as such.

The key idea of the linear MIMO spreading scheme is to define a spreading matrix

S which maps the data symbol vector x to its transmit symbol vector s (see footnote 2)

s = Sx. (2.7)

Here, x is composed of the data symbols xi from before, i. e. [x]_i = xi. Further, S is constructed in such a way that s stacks the vectors that are transmitted consecutively.

In other words, the symbols [s]_1, ..., [s]_{nt} are transmitted from antenna 1 to nt during the first time slot, the symbols [s]_{1+nt}, ..., [s]_{2nt} are transmitted from antenna 1 to nt during the second time slot, and so on. Using the effective channel impulse response

Hℓ that results from the concatenation of ℓ time slots, i. e.

$$H_\ell = \begin{pmatrix} H & \cdots & 0 \\ \vdots & \ddots & \vdots \\ 0 & \cdots & H \end{pmatrix}, \tag{2.8}$$

the overall transmission is then described by

y = HℓSx + n (2.9)

where n is now termed noise vector, see also Figure 2.3 for a visualization of the

structure of this mathematical model.

Example 2.2 (A spreading matrix). Let us again consider Alamouti’s scheme. Using

the details from Example 2.1, we may easily construct the transmit vector

$$s = \begin{pmatrix} x_1 & x_2 & -x_2^* & x_1^* \end{pmatrix}^T. \tag{2.10}$$

² Note that we may refer to x and s simply as the symbol vector whenever there is no confusion possible.



[Figure: block structure of y = Hℓ S x + n, with H repeated on the block diagonal of Hℓ]

Figure 2.3: Spreading model; blocks 1, . . . , ℓ mark time slots; white spaces denote zero elements.

A straightforward definition of the symbol vector would be x = ( x1 x2 )^T. However, we have similar difficulty as in the previous example, because there does not exist a linear mapping from, for example, x1 to x1*. For this reason, a spreading matrix S that maps x to s does not exist. If we define a modified symbol vector

$$\bar{x} = \begin{pmatrix} \Re(x_1) & \Im(x_1) & \Re(x_2) & \Im(x_2) \end{pmatrix}^T, \tag{2.11}$$

though, a spreading matrix exists, namely

$$S = \begin{pmatrix} 1 & j & 0 & 0 \\ 0 & 0 & 1 & j \\ 0 & 0 & -1 & j \\ 1 & -j & 0 & 0 \end{pmatrix}. \tag{2.12}$$

That means, s = Sx̄.
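The spreading model of Example 2.2 can be checked numerically in the same spirit. The sketch below, which assumes NumPy and uses an arbitrary random channel and arbitrary test symbols (neither taken from the original text), applies the spreading matrix (2.12) to the real-valued symbol vector (2.11), compares the result with the transmit vector (2.10), and confirms that the noise-free part of (2.9) equals the column-stacked output of the matrix model Y = HX.

```python
import numpy as np

# Complex-valued spreading matrix of Alamouti's OSTBC, cf. (2.12).
S = np.array([
    [1,  1j,  0,  0],
    [0,  0,   1,  1j],
    [0,  0,  -1,  1j],
    [1, -1j,  0,  0],
])

# Arbitrary complex test symbols and their real-valued symbol vector (2.11).
x1, x2 = 0.7 - 1.2j, 2.0 + 0.3j
x_bar = np.array([x1.real, x1.imag, x2.real, x2.imag])

s = S @ x_bar                                            # spreading, cf. (2.7)
s_expected = np.array([x1, x2, -np.conj(x2), np.conj(x1)])  # cf. (2.10)

# Random channel (assumption: 2 receive antennas, i.i.d. Gaussian entries).
rng = np.random.default_rng(0)
n_r, n_t, ell = 2, 2, 2
H = rng.normal(size=(n_r, n_t)) + 1j * rng.normal(size=(n_r, n_t))
H_ell = np.kron(np.eye(ell), H)      # block diagonal structure of (2.8)

y_noise_free = H_ell @ S @ x_bar     # noise-free part of (2.9)

# Matrix model: undo the column stacking of s, then stack the columns of Y.
X = s.reshape((n_t, ell), order="F")
Y = H @ X
y_from_matrix_model = Y.flatten(order="F")
```

Both descriptions give the same received samples, which illustrates the statement that y evolves from Y by stacking its columns.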

2.2.3 Matched Filtering and Transmission Matrices

The definition of the matched filter output [70] is useful because many receive al-

gorithms rely on it. Additionally, matched filtering often gives more insight into the

structure of the transmission scheme, in particular, when despreading is included in

the matched filtering process. Most importantly, at the moment, we can use matched

filtering to motivate the definition of transmission matrices [69]. To do so, we first

consider the matched filter output for the dispersion model, which is

$$Y_m = H^H H X + H^H N \tag{2.13}$$

where H^H is the impulse response of the matched filter. Note that the matched filter

covers the spatial dimensions such that it implicitly includes maximum ratio combining

(MRC) with respect to the signal contributions from different receive antennas. It is

self-suggestive to combine the impulse response of the channel and of the matched

filter to form a single matrix

$$R = H^H H, \tag{2.14}$$



which is the type of matrices that we refer to as transmission matrix from now on (not

to be confused with the transmit matrix X). It provides a direct link between the input

and output symbol matrices in (2.13)

$$Y_m = R X + H^H N. \tag{2.15}$$

A similar description exists for the spreading model as well. Here, the matched filter

output is

$$y_m = R_\ell s + H_\ell^H n. \tag{2.16}$$

This particular transmission matrix Rℓ evolves from R in the same manner as Hℓ does

from H. Moreover, additional despreading with S^H motivates yet another definition of a transmission matrix

\begin{align}
\tilde{x} &= S^H R_\ell S x + S^H H_\ell^H n \tag{2.17} \\
&= R_s x + S^H H_\ell^H n. \tag{2.18}
\end{align}

Here, we have Rs = S^H Rℓ S where the subscript s indicates that spreading is included.

The latter one of these three transmission matrices turns out to be the most useful one

for our considerations since it directly maps the data symbols.

All transmission matrices have in common that they are hermitian. The main di-

agonal values denote the effective transmission coefficients for each of the symbols

whereas the off-diagonal elements determine the amount of crosstalk between differ-

ent pairs of symbols. Therefore, an informal design goal is to make the off-diagonal

elements as small as possible to avoid crosstalk while the main diagonal elements

should be as large as possible. Clearly, this points out an advantage of Rs concerning

the design of transmission strategies, i. e. STBCs. Namely, R and Rℓ are fixed by the

channel impulse response and may not be altered by any type of signal processing at

the transmitter or the receiver, but the structure of Rs may be optimized by designing

the spreading matrix appropriately. Hence, Rs captures the properties of the chan-

nel as well as those of spatial and temporal spreading. It is also important to realize

that models which include matched filtering or matched filtering with despreading

still provide sufficient statistics. This ensures their optimality concerning the

detection process. Another useful property of the transmission matrices is the fact that

they also denote the noise correlation matrices after matched filtering/despreading.

Note that the above considerations hold for uncorrelated noise at the receiver, but a

generalization for colored noise is straightforward, see [86].
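To make the three transmission matrices concrete, the following sketch computes R, Rℓ and Rs for Alamouti's scheme with the spreading matrix from (2.12). It assumes NumPy, a random channel with three receive antennas, and an arbitrary real-valued test vector; these are illustrative choices only. For this particular code, the real part of Rs turns out to be a scaled identity, i. e. there is no real-valued crosstalk between the four real data symbols.

```python
import numpy as np

# Spreading matrix of Alamouti's OSTBC, cf. (2.12).
S = np.array([
    [1,  1j,  0,  0],
    [0,  0,   1,  1j],
    [0,  0,  -1,  1j],
    [1, -1j,  0,  0],
])

# Random channel (assumption: 3 receive antennas, 2 transmit antennas).
rng = np.random.default_rng(1)
n_r, n_t, ell = 3, 2, 2
H = rng.normal(size=(n_r, n_t)) + 1j * rng.normal(size=(n_r, n_t))
H_ell = np.kron(np.eye(ell), H)          # cf. (2.8)

R = H.conj().T @ H                       # cf. (2.14)
R_ell = np.kron(np.eye(ell), R)          # built from R like H_ell from H
R_s = S.conj().T @ R_ell @ S             # includes spreading and despreading

# Noise-free matched filtering with despreading, cf. (2.17) and (2.18).
x_bar = np.array([1.0, -1.0, 3.0, 1.0])  # arbitrary real-valued data symbols
y = H_ell @ S @ x_bar
x_tilde = S.conj().T @ H_ell.conj().T @ y

# Diagonal entries: effective transmission coefficients of the symbols.
gains = np.real(np.diag(R_s))
# Real part of the off-diagonal entries: crosstalk seen by real symbols.
crosstalk = R_s.real - np.diag(gains)
```

All three matrices are hermitian, `x_tilde` equals `R_s @ x_bar`, and `crosstalk` vanishes up to rounding while every entry of `gains` equals the trace of R.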

2.2.4 Relationship Between the Two Models

To begin with, we again stress that the linear dispersion model and the spreading

model are mathematically equivalent. In other words, both models carry out the same

mapping between the data symbols and the receive symbols, at least, when consider-

ing the type of channel that we introduced in Section 2.1 (see the remark at the end of



Figure 2.4: Mapping between spreading matrices and dispersion matrices.

this section for some more details). We have also mentioned before that, despite this

equivalence, each model provides its own advantages for the code design and analysis,

which we work out in the course of the thesis. Just to give a couple of examples, we

will observe that the spreading model proves useful for any type of capacity analysis

whereas the dispersion model is superior for certain types of code construction. Fur-

thermore, another advantage of the spreading model is that it provides a vector matrix

representation that many detection techniques require for straightforward implemen-

tation [29,31].

To make use of the properties of both models, it is important to understand their

relationship. The link between the two models is actually rather simple. Namely,

y evolves from Y by stacking its columns. The same applies for the relationship be-

tween s and X. In order to compare different space time constellations, it is also

inevitable to grasp the connection between the spreading matrix S and the dispersion

matrices Ci. We illustrate this mapping in Figure 2.4. In words, the dispersion matrix

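Ci defines the ith column of S by stacking its columns. Further, there exists a simple link between the entries of the transmission matrix Rs and the dispersion matrices.

Specifically,

\begin{align}
[R_s]_{ij} &= [S]_{:i}^H R_\ell [S]_{:j} \tag{2.19} \\
&= \operatorname{tr}\left( C_i^H R C_j \right). \tag{2.20}
\end{align}

Applying some equivalence properties of traces, (2.20) becomes

\begin{align}
[R_s]_{ij} = \operatorname{tr}\left( C_i^H R C_j \right) &= \operatorname{tr}\left( C_j^T R^* C_i^* \right) \tag{2.21} \\
&= \frac{1}{2} \operatorname{tr}\left( C_i^H R C_j + C_j^T R^* C_i^* \right) \tag{2.22} \\
&= \frac{1}{2} \operatorname{tr}\left( R C_j C_i^H + R^* C_i^* C_j^T \right) \tag{2.23} \\
&= \frac{1}{2} \operatorname{tr}\left( \Re(R) \left( C_j C_i^H + C_i^* C_j^T \right) + j \Im(R) \left( C_j C_i^H - C_i^* C_j^T \right) \right). \tag{2.24}
\end{align}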

Later on, some special cases occur, namely, whenever the dispersion matrices are either



real-valued or purely imaginary. Then, we have

$$[R_s]_{ij} = \frac{1}{2} \operatorname{tr}\left( \Re(R) \left( C_j C_i^H + C_i C_j^H \right) + j \Im(R) \left( C_j C_i^H - C_i C_j^H \right) \right) \tag{2.25}$$

if Ci and Cj are both purely real or both purely imaginary and

$$[R_s]_{ij} = \frac{1}{2} \operatorname{tr}\left( j \Im(R) \left( C_j C_i^H + C_i C_j^H \right) + \Re(R) \left( C_j C_i^H - C_i C_j^H \right) \right) \tag{2.26}$$

if either one of them is real and the other one is imaginary. For a purely real-valued

transmission matrix R, the second addend vanishes in (2.25) whereas the first one

vanishes in (2.26). Thus, [Rs]ij is determined only by the hermitian or skew hermitian components of the matrix product of the dispersion matrices Ci and Cj. Further, note

that the first addends are real-valued in both (2.25) and (2.26) whereas the last

ones are purely imaginary. Since the imaginary part does not affect the decision if

the dispersion matrices are weighted with real symbols, only the first addends are of

interest in those cases.
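The link between the two models can also be verified numerically. The sketch below assumes NumPy and a random channel (an illustrative choice); it builds S by stacking the columns of the Alamouti dispersion matrices from (2.6) and compares the entries of Rs obtained from (2.19) with the trace expressions (2.20) and (2.24). The function names `trace_entry` and `split_entry` are hypothetical helpers introduced only for this check.

```python
import numpy as np

# Dispersion matrices of Alamouti's OSTBC, cf. (2.6a) and (2.6b).
C = [
    np.array([[1, 0], [0, 1]], dtype=complex),
    1j * np.array([[1, 0], [0, -1]], dtype=complex),
    np.array([[0, -1], [1, 0]], dtype=complex),
    1j * np.array([[0, 1], [1, 0]], dtype=complex),
]

# Column i of S is obtained by stacking the columns of C_i.
S = np.column_stack([Ci.flatten(order="F") for Ci in C])

# Random channel (assumption: 3 receive antennas) and transmission matrices.
rng = np.random.default_rng(2)
ell = 2
H = rng.normal(size=(3, 2)) + 1j * rng.normal(size=(3, 2))
R = H.conj().T @ H
R_ell = np.kron(np.eye(ell), R)
R_s = S.conj().T @ R_ell @ S             # entries as in (2.19)

def trace_entry(Ci, Cj):
    """Trace form of a single entry of R_s, cf. (2.20)."""
    return np.trace(Ci.conj().T @ R @ Cj)

def split_entry(Ci, Cj):
    """Entry of R_s split into real and imaginary part of R, cf. (2.24)."""
    A = Cj @ Ci.conj().T
    B = Ci.conj() @ Cj.T
    return 0.5 * np.trace(R.real @ (A + B) + 1j * R.imag @ (A - B))

R_s_trace = np.array([[trace_entry(Ci, Cj) for Cj in C] for Ci in C])
R_s_split = np.array([[split_entry(Ci, Cj) for Cj in C] for Ci in C])
```

All three ways of computing Rs agree, and the stacked matrix `S` reproduces the spreading matrix of Example 2.2.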

Remarks. a) Although we will not make use of this property throughout this work, we

point out that the spreading model is more general than the dispersion model in terms

of the type of the channel impulse responses it allows for. The dispersion model, by

virtue of its structure, inherently assumes a constant channel impulse response during

the transmission of one block. For the spreading model, on the other hand, a straightforward extension for more general channel conditions exists, simply by letting Hℓ attain a structure that has different blocks on the diagonal or that is not block diagonal.

This way, the spreading model is easily adapted for time variant and/or frequency

selective channels. However, this structure should be taken into account if one wants

to design STBCs for such scenarios, which we do not. b) Sometimes when we switch

between the description of a constellation in terms of dispersion matrices and the same

constellation in terms of spreading matrices, we omit a scaling factor for convenience.

For example, a unitary ℓ × ℓ dispersion matrix would cause a column with Euclidean norm √ℓ in the spreading matrix, which we usually set to 1 without mentioning this explicitly.

2.3 Real-Valued Notation

In communications, it is common practice to represent baseband signals with com-

plex numbers. Complex numbers, however, often prohibit a straightforward way that

certain aspects are dealt with, see e. g. [79, 82]. We already had to cope with some

difficulty when we represented Alamouti’s OSTBC with dispersion matrices in Exam-

ple 2.1 and with spreading matrices in Example 2.2. This was because the model does

not allow for the complex conjugate operation. Generalized, this means that the use of

complex-valued data symbols hides some degrees of freedom that are thus not acces-

sible for any type of optimization. To overcome this problem, we split up the symbols

into their real and imaginary parts and use each of the parts as separate real-valued



symbols. Still, the evolving transmit signals are complex-valued and the modeling of

the transmission itself is also complex-valued. Considering Example 2.2, in particular,

this solution is rather unsatisfactory because the resulting spreading matrix carries out

a mapping from real-valued entities to complex-valued entities. With such a mapping,

an analytical analysis takes more effort or even gets infeasible at times. Therefore, we

are interested in a model that maps real-valued data symbols to real-valued transmit

symbols. So, an entirely real-valued transmission model also requires a real-valued

representation of the spreading matrices, the dispersion matrices, and the channel

impulse responses.

Remark. For the sake of completeness, we mention another important advantage of

real-valued notation, which is less significant for our considerations in the remainder

of this thesis, though. That is, complex-valued notation lacks the ability to capture

certain types of correlation in a single correlation matrix correctly, which are, however,

easily incorporated with real-valued notation. As a matter of fact, this is the reason

why Neeser and Massey initially suggested the use of real-valued notation in [79].

2.3.1 Channel Matrices and Signal Vectors

The connection between complex-valued notation and real-valued notation is purely

mathematical. A complex number a may be interpreted as a two-dimensional entity whose components are orthogonal. A 2 × 2 matrix that is constructed from the complex number in the following manner

$$\bar{A} = \begin{pmatrix} \Re(a) & -\Im(a) \\ \Im(a) & \Re(a) \end{pmatrix} \tag{2.27}$$

bears the same property. It is an equivalent description of the complex number in the sense that it yields the same result with respect to addition and multiplication of two arbitrary complex numbers [79]. The representation of the complex conjugate of a is simply given by the transpose of the matrix in (2.27).

An extension to matrices is straightforward. Each of the elements of the complex-valued matrix A is transformed according to (2.27), i. e.,

$$\bar{A} = \Re(A) \otimes \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} + \Im(A) \otimes \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}. \tag{2.28}$$

For the impulse response of the MIMO channel, this is

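$$\bar{H} = \begin{pmatrix}
\Re(h_{11}) & -\Im(h_{11}) & \cdots & \Re(h_{1n_t}) & -\Im(h_{1n_t}) \\
\Im(h_{11}) & \Re(h_{11}) & \cdots & \Im(h_{1n_t}) & \Re(h_{1n_t}) \\
\vdots & \vdots & \ddots & \vdots & \vdots \\
\Re(h_{n_r1}) & -\Im(h_{n_r1}) & \cdots & \Re(h_{n_rn_t}) & -\Im(h_{n_rn_t}) \\
\Im(h_{n_r1}) & \Re(h_{n_r1}) & \cdots & \Im(h_{n_rn_t}) & \Re(h_{n_rn_t})
\end{pmatrix}. \tag{2.29}$$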



Again, the product of several matrices gives the same result using either one of the

notations – the complex-valued one or the real-valued one. That means, the con-

catenation of several system matrices (like channel matrices or (complex) spreading

matrices) is the same in both cases, which, of course, is an essential requirement.

For the signal vectors, preserving orthogonality between the real part and the imag-

inary part is not essential. As a matter of fact, the orthogonality between the real

and imaginary part of the signal components is one of the limitations of complex-

valued notation. That is why we simply stack the real and the imaginary part sym-

bolwise to obtain a real-valued description of the signal vector. Mathematically, we

may express the relationship between the complex-valued transmit vector s and its

real-valued counterpart s̄ in the following way:

$$\bar{s} = \Re(s) \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \Im(s) \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{2.30}$$

At this point, we want to emphasize again that y = Hs + n and ȳ = H̄s̄ + n̄ are two equivalent equations if (2.30) and (2.28) apply for the relationship between all

corresponding vectors and matrices, respectively. So, both equations model the same

transmission over the same channel.
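The equivalence of the two notations is easy to confirm numerically. The following sketch assumes NumPy; the function names `real_matrix` and `real_vector` are hypothetical helpers that implement the Kronecker constructions (2.28) and (2.30), and the random test matrices are illustrative only.

```python
import numpy as np

# Real 2x2 representation of the imaginary unit j, cf. (2.27) with a = j.
J = np.array([[0.0, -1.0], [1.0, 0.0]])

def real_matrix(A):
    """Real-valued counterpart of a complex matrix, cf. (2.28)."""
    A = np.atleast_2d(A)
    return np.kron(A.real, np.eye(2)) + np.kron(A.imag, J)

def real_vector(s):
    """Real-valued counterpart of a complex vector, cf. (2.30)."""
    s = np.asarray(s)
    return np.kron(s.real, [1.0, 0.0]) + np.kron(s.imag, [0.0, 1.0])

rng = np.random.default_rng(3)
A = rng.normal(size=(2, 3)) + 1j * rng.normal(size=(2, 3))
B = rng.normal(size=(3, 2)) + 1j * rng.normal(size=(3, 2))
H = rng.normal(size=(2, 2)) + 1j * rng.normal(size=(2, 2))
s = rng.normal(size=2) + 1j * rng.normal(size=2)
a = 0.3 - 1.7j

# Products of matrices are preserved by the mapping (2.28).
product_ok = np.allclose(real_matrix(A @ B), real_matrix(A) @ real_matrix(B))
# Complex conjugation of a scalar corresponds to transposition in (2.27).
conjugate_ok = np.allclose(real_matrix(np.conj(a)), real_matrix(a).T)
# The noise-free transmission is the same in both notations.
transmission_ok = np.allclose(real_vector(H @ s), real_matrix(H) @ real_vector(s))
```

All three flags evaluate to `True`, in line with the statement that both notations model the same transmission.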

2.3.2 Real-Valued Spreading Matrices

The impulse responses of all physically existing MIMO channels are described properly

with complex-valued notation as well as with real-valued notation where the mapping

between the two is according to (2.28). Therefore, its real-valued representation al-

ways has a structure that allows for this transformation. The real-valued spreading

matrices that we introduce now do not require such a structure. They actually do have

this particular structure if and only if it is possible to express the corresponding STBC

with complex-valued data symbol vectors. From the examples before, we know that

this is not possible for all existing STBCs, including Alamouti’s OSTBC. Therefore, the

real-valued spreading matrices of these STBCs must have a more general structure to

allow for arbitrary linear mappings between real-valued data symbols and real-valued

transmit symbols. A simple example is given by the complex conjugate operation that

is detailed now.

Example 2.3 (Complex conjugate operation). Let x be an arbitrary complex-valued data symbol. No linear transformation with only x as input is able to provide x* at its output. That means, x′ where x′ = ax will not be the complex conjugate of x for an arbitrary but fixed complex constant a and an arbitrary complex variable x. Now, real-valued notation easily incorporates the complex conjugate operation as a linear transform as the following expression shows:

$$\begin{pmatrix} \Re(x') \\ \Im(x') \end{pmatrix} = \begin{pmatrix} \Re(x) \\ -\Im(x) \end{pmatrix} = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} \Re(x) \\ \Im(x) \end{pmatrix} = \bar{S}\bar{x}. \tag{2.31}$$



Again, we want to stress that this simple 2 × 2 spreading matrix does not have a structure according to (2.27), which is the reason why the complex-valued notation

does not exist in the first place.

The complex conjugate operation is common to various space time coding schemes

and Alamouti’s OSTBC is just one out of many STBCs that applies it. To demonstrate

how real-valued notation actually affects the spreading matrices, we give another ex-

ample.

Example 2.4 (Real-valued spreading matrix). We again consider Alamouti’s OSTBC.

The mapping between data vector and transmit vector is clearly defined by (2.4). From

this, the construction of the real-valued spreading matrix is straightforward, i. e.

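$$\bar{S} = \begin{pmatrix} 1 & 0 & 0 & 0 & 0 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 & 0 & -1 \\ 0 & 0 & 1 & 0 & -1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 1 & 0 & 0 \end{pmatrix}^T. \tag{2.32}$$

Note that the relationship between S̄ and the complex-valued spreading matrix S from Example 2.2 is

$$\bar{S} = \Re(S) \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \Im(S) \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{2.33}$$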

This is the mapping that applies for the relationship between any real-valued spreading

matrices and their complex-valued counterparts, those which take real-valued symbols

at their input. A mapping according to (2.33) exists for any real-valued spreading matrix, but, as expected, this spreading matrix does not have a complex-valued equivalent according to (2.28). It is also important to note that S̄ is non-square whereas the

spreading matrix in Example 2.2 is square. Some consequences will become clear in

Section 2.6 where we consider the capacity of a MIMO channel.
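A short numerical sketch of Examples 2.3 and 2.4, assuming NumPy: it derives the real-valued spreading matrix from the complex-valued one of Example 2.2 via (2.33), compares it with (2.32), and tests whether a real matrix has the 2 × 2 block structure of (2.27). The helper names `real_spreading` and `has_complex_structure` are hypothetical and only serve this illustration.

```python
import numpy as np

def real_spreading(S):
    """Real-valued spreading matrix from a complex-valued one, cf. (2.33)."""
    S = np.asarray(S)
    return np.kron(S.real, [[1.0], [0.0]]) + np.kron(S.imag, [[0.0], [1.0]])

def has_complex_structure(M):
    """True if every 2x2 block of M has the form (2.27)."""
    M = np.asarray(M, dtype=float)
    a, b = M[0::2, 0::2], M[0::2, 1::2]
    c, d = M[1::2, 0::2], M[1::2, 1::2]
    return bool(np.allclose(a, d) and np.allclose(b, -c))

# Complex-valued spreading matrix of Alamouti's OSTBC, cf. (2.12).
S = np.array([
    [1,  1j,  0,  0],
    [0,  0,   1,  1j],
    [0,  0,  -1,  1j],
    [1, -1j,  0,  0],
])
S_bar = real_spreading(S)                 # 8 x 4, non-square

# Reference matrix as printed in (2.32), including the transpose.
S_bar_ref = np.array([
    [1, 0, 0, 0,  0, 0, 1,  0],
    [0, 1, 0, 0,  0, 0, 0, -1],
    [0, 0, 1, 0, -1, 0, 0,  0],
    [0, 0, 0, 1,  0, 1, 0,  0],
], dtype=float).T

# Complex conjugation as a real-valued linear map, cf. (2.31).
S_conj = np.array([[1.0, 0.0], [0.0, -1.0]])
x = 0.8 + 0.6j
x_conj_bar = S_conj @ np.array([x.real, x.imag])

# For comparison: the real representation of a complex scalar, cf. (2.27).
a = 2.0 - 1.0j
A_bar = np.array([[a.real, -a.imag], [a.imag, a.real]])
```

Neither `S_conj` nor `S_bar` passes the structure test, whereas `A_bar` does, which matches the observation that these spreading matrices have no complex-valued equivalent according to (2.28).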

2.3.3 Real-Valued Dispersion Matrices

In Example 2.1, we applied real-valued data symbols to come up with the (complex-

valued) dispersion matrices that correctly describe Alamouti’s OSTBC. In terms of

spreading matrices, the counterpart of these particular dispersion matrices is a

complex-valued spreading matrix that takes real-valued data symbols and returns

complex-valued transmit symbols, which we described in Example 2.2. Naturally, real-

valued spreading matrices also have their corresponding dispersion matrices. Once

more, let us consider Alamouti’s scheme as an example.



Example 2.5 (Real-valued dispersion matrices). The real-valued dispersion matrices

are

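$$\bar{C}_1 = \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{pmatrix}^T, \quad \bar{C}_2 = \begin{pmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & -1 \end{pmatrix}^T, \tag{2.34a}$$

$$\bar{C}_3 = \begin{pmatrix} 0 & 0 & 1 & 0 \\ -1 & 0 & 0 & 0 \end{pmatrix}^T, \quad \text{and} \quad \bar{C}_4 = \begin{pmatrix} 0 & 0 & 0 & 1 \\ 0 & 1 & 0 & 0 \end{pmatrix}^T. \tag{2.34b}$$

It is easily verified that the matrices C̄i evolve from S̄ in the same manner as Ci from S, see also Figure 2.4. Further, note that C̄i is constructed from the corresponding Ci in Example 2.1 in the same way as S̄ is from S, i. e.,

$$\bar{C}_i = \Re(C_i) \otimes \begin{pmatrix} 1 \\ 0 \end{pmatrix} + \Im(C_i) \otimes \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \tag{2.35}$$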

Just like the spreading matrices, the complex-valued dispersion matrices in Exam-

ple 2.1 differ from the real-valued dispersion matrices in (2.34) with respect to their

dimensionality. The interpretation is different, though. Contrary to the spreading ma-

trices, it is now the number of time slots that is linked to the number of columns. Since

we usually construct sets of square dispersion matrices, this means that the number

of time slots in the real-valued case is twice the number of time slots compared with

the complex-valued case. This means that the rate (in terms of number of symbols per

time slot) of the STBC constructed from the real-valued set is only half of the rate of

the complex-valued one if the cardinality of both sets is the same and the dispersion

matrices are weighted with real-valued data symbols chosen from the same alphabet

in both cases. In other words, having set these constraints concerning the dispersion

matrices and the channel, we need twice as many dispersion matrices if we apply real-

valued notation in order to obtain the same rate. Therefore, detection usually gets

more involved since double the number of symbols have to be decided on jointly, but,

on the positive side, we have more degrees of freedom for optimization due to a larger

signal space, see Chapter 4.
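The real-valued dispersion matrices of Example 2.5 can be reproduced with a few lines. The sketch below assumes NumPy; `real_stack` and `real_matrix` are hypothetical helper names for the constructions (2.35) and (2.28), and the channel and data symbols are arbitrary test values. It also checks that the entirely real-valued dispersion model yields the real-valued counterpart of the complex-valued receive matrix (noise omitted).

```python
import numpy as np

def real_stack(M):
    """Stack real and imaginary parts element-wise along rows, cf. (2.35)."""
    M = np.asarray(M)
    return np.kron(M.real, [[1.0], [0.0]]) + np.kron(M.imag, [[0.0], [1.0]])

def real_matrix(A):
    """Real-valued counterpart of a complex matrix, cf. (2.28)."""
    A = np.asarray(A)
    J = np.array([[0.0, -1.0], [1.0, 0.0]])
    return np.kron(A.real, np.eye(2)) + np.kron(A.imag, J)

# Complex-valued dispersion matrices of Alamouti's OSTBC, cf. (2.6).
C = [
    np.array([[1, 0], [0, 1]], dtype=complex),
    1j * np.array([[1, 0], [0, -1]], dtype=complex),
    np.array([[0, -1], [1, 0]], dtype=complex),
    1j * np.array([[0, 1], [1, 0]], dtype=complex),
]
C_bar = [real_stack(Ci) for Ci in C]      # four real 4 x 2 matrices

# Arbitrary channel and real-valued data symbols (illustrative values).
H = np.array([[0.5 + 1.0j, -1.0], [0.2j, 1.5 - 0.5j]])
x_bar = [1.0, -3.0, 3.0, -1.0]

Y = H @ sum(x * Ci for x, Ci in zip(x_bar, C))
Y_bar = real_matrix(H) @ sum(x * Ci for x, Ci in zip(x_bar, C_bar))
```

`C_bar` matches the matrices listed in (2.34), and `Y_bar` equals `real_stack(Y)`.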

2.4 Summary of General System Model Equations and Discussion

From Examples 2.1 and 2.2 and the considerations in the previous section, we have

learned that the general case is covered only if the real part and the imaginary part

of complex-valued data symbols are dealt with independently. Eventually, this means

that all data symbols should be real-valued. Provided that this is the case, we have the

choice to use complex-valued or real-valued entities to model the remaining mappings

and transmission. From now on, we mostly³ restrict our considerations to one of these

³ In Section 4.2.4, we propose a STBC that is optimal in terms of mutual information only if complex-valued data symbols are applied. Further, in Section 4.3, some of the constellations from literature require complex-valued data symbols as well.


general models where it should become clear from the context which one is actually

used. It is important to keep in mind that any upcoming transmission equation could

be rewritten by using any one of these general models. At times, we switch between

the models without explicitly mentioning, simply because one or the other model is

more suited for certain observations and derivations. In other words, we always use

the model that is best suited for the statements that we intend to make. In summary,

the four general model equations are

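$$y = H S \bar{x} + n \quad \text{and} \quad \bar{y} = \bar{H} \bar{S} \bar{x} + \bar{n} \tag{2.36}$$

and

$$Y = H \sum_{i=1}^{n} \bar{x}_i C_i + N \quad \text{and} \quad \bar{Y} = \bar{H} \sum_{i=1}^{n} \bar{x}_i \bar{C}_i + \bar{N}. \tag{2.37}$$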

As we usually apply real-valued data symbols there should not be any confusion con-

cerning the use of the matrices S and Ci from now on. If not stated otherwise, they

always map real-valued data symbols to complex-valued transmit signals and their

interrelationship with S̄ and C̄i is always according to (2.33) and (2.35), respectively.

Typically, the general case is covered by using complex-valued dispersion matrices

weighted with real-valued data symbols, i. e. it is modeled similarly according to the

equation on the left hand side in (2.37). As a matter of fact, this was already suggested

by Hassibi et al. in their initial publication on linear dispersion codes [51]. Contrary

to our use, it is, however, common practice to define two sets of dispersion matrices,⁴

one set for the real parts of the data symbols and another one for the imaginary parts

of the data symbols. Since the mapping between the real-valued data symbols we use

and the real and imaginary parts of the corresponding complex-valued data symbols

is arbitrary, we prefer the use of a single set of dispersion matrices, which is basically

the union of the two sets that are usually defined. We discuss this issue in a little more

detail when we introduce OSTBCs formally in Section 3.1. Certain alphabets like, e. g.,

8 PSK (phase shift keying) cause correlation among pairs of the real-valued symbols. Although it is not impossible to handle correlation among data symbols theoretically,

the assumption of uncorrelated data symbols eases the overall analysis. For this reason,

we continue to consider alphabets with uncorrelated real and imaginary part only. The

restriction is only minor and does not affect our overall conclusions. Hence, we suggest

to choose amplitude shift keying (ASK) alphabets for the real-valued data symbols.

Then, if we had the desire to establish a relationship to equivalent complex-valued

data symbols, we would get symbols from a regular quadrature amplitude modulation

(QAM) scheme, but this is not really necessary.

Despite the fact that the data symbols are usually real-valued, we refer to those

model equations that use complex-valued channel matrices, spreading matrices, or

dispersion matrices as complex-valued notation. Real-valued notation requires the

⁴ The authors in [51] do briefly mention in a footnote that this is not necessary, though.



entire model to be real-valued. In Appendix 2.A1, we give another simple example to

point out some more differences between the two notations. It may help to understand

further why we favor real-valued notation at times.

For the sake of completeness, we want to mention that other concepts exist in liter-

ature to cover those mappings that require real-valued data symbols in our considera-

tions. For example, some authors define different effective channel impulse responses

for each time slot in order to cover the complex conjugate operation, see e. g. [81].

By doing so, a spreading model with independent complex-valued data symbols may

be applied for the representation of, e. g., Alamouti’s OSTBC. The resulting channel

matrix Hℓ is still block diagonal, but with different blocks on the diagonal. Unfortu-

nately, this representation lacks intuition because the channel seems to be time variant

although it is actually not. Also, certain theoretical analysis is not as straightforward

as it is with our model. Furthermore, yet another strategy that is often found in litera-

ture is to define two sets of dispersion matrices – one set for the complex-valued data

symbols and another one for the complex conjugate of the data symbols. In terms of

spreading matrices, it would mean that the data vector contains the symbols as well

as their complex conjugate counterparts, which causes correlation among the data

symbols – something we do not desire because it also complicates the analysis. This

concept is often referred to as being widely linear [39, 115] because it applies linear

filtering for the signal and its complex conjugate counterpart [12], but appears to be

non-linear with respect to the overall signal at first glance.

2.5 Maximum Likelihood Detection

At the receiver, a maximum likelihood (ML) detector returns the constellation point

which was most likely transmitted. It performs best or, at least, equally well in terms

of symbol error rate (SER) compared with all other potential receiving methods, as-

suming that all constellation points are equally likely to be transmitted. With the

right mapping between symbols and bits, the same statement applies for the bit error

rate (BER) as well. Unfortunately, its large number of operations often renders such

a detector infeasible for implementation in real applications for complexity reasons.

Nevertheless, progress in the development of the sphere detector [2, 33, 109] and

higher computational power of today’s personal computers enable us to achieve ML

simulation results for many STBCs with moderate constellation size. Moreover, there

exist space time codes that allow for ML detection with reduced complexity like, e. g.,

OSTBCs, see Chapter 3. Besides, other detection techniques exist that often closely

approach ML performance at much lower complexity. Therefore, ML BER results de-

note reasonable performance bounds. Furthermore, the ML decision metric gives a lot

of insight into the structural properties of STBCs, in general. It provides us with ideas

for the construction of new space time constellations and helps us to analyze existing

ones later on in this work.


2.5.1 ML Metric – Decision Criterion

To identify the most likely space time codeword at the receiver, an obvious way, and sometimes the only way, is to compare the received signal with all potential constellation points. Among these, the constellation point that is most similar to the received signal point gives the decision result. What we need is an appropriate measure of similarity. As we assume independent Gaussian distributed noise components with equal variance in all dimensions, this measure is the (squared) Euclidean distance between the received signal and the potential noise free received signal candidates, which we can easily compute because we assume the channel impulse response to be perfectly known at the receiver. Mathematically, the decision rule may be expressed as

\[
\hat{X} = \arg\min_{\breve{X} \in \mathcal{X}} \left\| Y - H \breve{X} \right\|^2 . \tag{2.38}
\]

Here, $\breve{X}$ is the test candidate chosen from the space time constellation $\mathcal{X}$. The same criterion, simply in the context of the spreading model, is

\[
\hat{s} = \arg\min_{\breve{s} \in \mathcal{S}} \left\| y - H_\ell \breve{s} \right\|^2 \tag{2.39}
\]

where $\mathcal{S}$ and $\breve{s}$ are defined according to $\mathcal{X}$ and $\breve{X}$, respectively. Note that the hat always indicates that the decision has been taken on the corresponding variable and the breve identifies test candidates.

Rather than the transmit vector $s$ and the transmit matrix $X$ themselves, we are usually interested in the decisions on the information carrying data symbols $x_1, \ldots, x_n$ that $s$ or $X$ are constructed from. The definition of the corresponding decision rules as a function of $x_1, \ldots, x_n$ is straightforward and becomes

\[
\hat{x} = \arg\min_{\breve{x} \in \mathcal{A}^n} \left\| y - H_\ell S \breve{x} \right\|^2 \tag{2.40}
\]

for the spreading model and

\[
\left( \hat{x}_1, \ldots, \hat{x}_n \right) = \arg\min_{\breve{x}_1, \ldots, \breve{x}_n \in \mathcal{A}} \left\| Y - H \sum_{i=1}^{n} C_i \breve{x}_i \right\|^2 \tag{2.41}
\]

for the dispersion model.⁵

Altogether, we have $|\mathcal{A}|^n$ different data symbol vectors where $|\mathcal{A}|$ denotes the cardinality of the alphabet $\mathcal{A}$. This number increases exponentially with the total number of data symbols. So, the number of tests that have to be carried out becomes infeasible for $n$ larger than a certain limit, unless the structural properties of the space time constellations allow us to get by with fewer than $|\mathcal{A}|^n$ comparisons.

⁵ In (2.40), according to the definition of $x$, $\mathcal{A}^n$ is the set of all $n$-dimensional vectors with elements from $\mathcal{A}$.
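The exhaustive search behind (2.40) can be sketched in a few lines. The following is only an illustration under stated assumptions: real-valued notation, a random matrix `H_eff` standing in for the effective channel $H_\ell S$, a binary alphabet, and noise-free reception. The function name `ml_detect` and all variable names are hypothetical and do not appear in the original text.

```python
import itertools
import numpy as np

def ml_detect(y, H_eff, alphabet):
    """Exhaustive ML search over all |A|^n real-valued data vectors, cf. (2.40)."""
    n = H_eff.shape[1]
    best_x, best_metric = None, np.inf
    for cand in itertools.product(alphabet, repeat=n):
        x = np.array(cand, dtype=float)
        metric = np.linalg.norm(y - H_eff @ x) ** 2  # squared Euclidean distance
        if metric < best_metric:
            best_x, best_metric = x, metric
    return best_x, best_metric

rng = np.random.default_rng(0)
alphabet = (-1.0, 1.0)               # assumed binary alphabet A
n = 4                                # number of real-valued data symbols
H_eff = rng.standard_normal((8, n))  # assumption: stands in for H_l S
x_true = rng.choice(alphabet, size=n)
y = H_eff @ x_true                   # noise-free received vector
x_hat, metric = ml_detect(y, H_eff, alphabet)
num_candidates = len(alphabet) ** n  # |A|^n tests are carried out
```

Even in this toy setting the loop visits all $|\mathcal{A}|^n$ candidates, which is why the search quickly becomes impractical as $n$ grows.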



2.5.2 Structural Properties

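The squared Euclidean norm in (2.39) may be expressed as a scalar product. Expanding the resulting expression, we come up with

\begin{align}
\hat{x} &= \arg\min_{\breve{x} \in \mathcal{A}^n} \left( \left( y - H_\ell S \breve{x} \right)^H \left( y - H_\ell S \breve{x} \right) \right) \tag{2.42} \\
&= \arg\min_{\breve{x} \in \mathcal{A}^n} \left( y^H y - \left( \tilde{x}^H \breve{x} + \breve{x}^T \tilde{x} \right) + \breve{x}^T R_s \breve{x} \right) \tag{2.43} \\
&= \arg\min_{\breve{x} \in \mathcal{A}^n} \left( y^H y - \sum_{i=1}^{n} [\breve{x}]_i \left( [\tilde{x}]_i + [\tilde{x}]_i^* \right) + \sum_{i=1}^{n} \sum_{j=1}^{n} [R_s]_{ij} [\breve{x}]_i [\breve{x}]_j \right) . \tag{2.44}
\end{align}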

In (2.44), the first term is independent of $\breve{x}$ and thus common to all distances. Hence, it is irrelevant for the minimization. Also, the matched filter output $\tilde{x}$ does not depend on the test candidate. That means the second term is linear with respect to $\breve{x}_1, \ldots, \breve{x}_n$. However, the third one causes addends that are quadratic with respect to $\breve{x}_1, \ldots, \breve{x}_n$. In particular, the addends with $i \neq j$ introduce dependencies between different data symbols, which is the reason why the decisions have to be taken jointly for all symbols $x_1, \ldots, x_n$. This is also why we usually have to carry out $|\mathcal{A}|^n$ comparisons for the decision on a single data vector, a number that increases exponentially with the number of symbols that are transmitted jointly. Conversely, the complexity is significantly reduced to only $n|\mathcal{A}|$ comparisons if $R_s$ has purely imaginary off-diagonal elements since, in this case, we may minimize the metric individually for each $\breve{x}_i$. The quadratic terms containing two different data symbols then cancel: $R_s$ is Hermitian, $[R_s]_{ij} = [R_s]_{ji}^*$, so the addends $(i,j)$ and $(j,i)$ sum to $2\,\mathrm{Re}\{[R_s]_{ij}\}\,\breve{x}_i \breve{x}_j$, which vanishes for purely imaginary $[R_s]_{ij}$. With real-valued notation, it is clear that $R_s$ has to be diagonal to allow for symbolwise detection.⁶
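The decoupling argument can be checked numerically. The sketch below assumes real-valued notation and a hypothetical effective channel `G` (standing in for $H_\ell S$) with orthogonal columns, so that $R_s = G^T G$ is diagonal; the 4-ary alphabet and the noise level are arbitrary choices. It compares the joint search with $|\mathcal{A}|^n$ candidates against $n$ separate searches with $|\mathcal{A}|$ candidates each.

```python
import itertools
import numpy as np

def joint_ml(y, G, alphabet):
    """Joint ML decision: tests all |A|^n candidate vectors."""
    n = G.shape[1]
    cands = [np.array(c) for c in itertools.product(alphabet, repeat=n)]
    return min(cands, key=lambda x: np.sum((y - G @ x) ** 2))

def symbolwise_ml(y, G, alphabet):
    """Symbolwise decision, valid when R_s = G^T G is diagonal: n|A| tests."""
    x_tilde = G.T @ y                 # matched filter output
    r_diag = np.sum(G * G, axis=0)    # diagonal entries of R_s
    a = np.asarray(alphabet, dtype=float)
    # Per-symbol part of the metric: -2 a [x_tilde]_i + [R_s]_ii a^2
    metrics = -2.0 * np.outer(x_tilde, a) + np.outer(r_diag, a ** 2)
    return a[np.argmin(metrics, axis=1)]

rng = np.random.default_rng(1)
alphabet = (-3.0, -1.0, 1.0, 3.0)     # assumed real-valued 4-ary alphabet
n = 3
Q, _ = np.linalg.qr(rng.standard_normal((6, n)))  # orthonormal columns
G = Q * np.array([0.5, 1.0, 2.0])     # orthogonal columns with unequal gains
x_true = rng.choice(alphabet, size=n)
y = G @ x_true + 0.3 * rng.standard_normal(6)
x_joint = joint_ml(y, G, alphabet)
x_symb = symbolwise_ml(y, G, alphabet)
R_s = G.T @ G
```

With a diagonal `R_s` both detectors return the same data vector; as soon as `G` has non-orthogonal columns, only the joint search is guaranteed to be ML.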

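For academic reasons, especially for the design of new space time constellations, it is of interest to split up $y$ into its components and insert these into the metric

\begin{align}
\hat{x} &= \arg\min_{\breve{x} \in \mathcal{A}^n} \left\| H_\ell S \left( x - \breve{x} \right) + n \right\|^2 \tag{2.45} \\
&= \arg\min_{\breve{x} \in \mathcal{A}^n} \left( \Delta x^T R_s \Delta x + \Delta x^T S^H H_\ell^H n + n^H H_\ell S \Delta x + n^H n \right) \tag{2.46}
\end{align}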

where we substituted $x - \breve{x} = \Delta x$ to simplify our notation. Without noise at the receiver, what remains in the parentheses in (2.46) is only the first term. This term is the squared Euclidean distance, measured at the receiver, between the transmitted data vector $x$ and the data vector $\breve{x}$ chosen as a test candidate. It is obvious that we are interested in making this distance term as large as possible whenever $\Delta x$ is not the all-zero vector, i. e., $x$ and $\breve{x}$ are not identical. Doing so, the decision becomes less vulnerable with respect to the noise term. Therefore, the ultimate goal is to construct space time constellations whose distance profile is optimized for the receiver. As we do not assume knowledge about the actual channel impulse response at the transmitter, the optimization has to be carried out for certain statistical assumptions or a deterministic but unknown realization. Some hints on what the structure of the problem looks like are already given by simply rewriting the distance measure as

\[
\Delta x^T R_s \Delta x = \sum_{i=1}^{n} [R_s]_{ii} \, \Delta x_i^2 + \sum_{i=1}^{n} \sum_{j=1, j \neq i}^{n} [R_s]_{ij} \, \Delta x_i \Delta x_j . \tag{2.47}
\]

It is a non-negative quadratic form as $R_s$ is a Hermitian positive semidefinite matrix. Clearly, this is a necessity also because the quadratic form is a distance measure, which has to be larger than or equal to zero by definition. In particular, the left hand sum consists of non-negative elements whose individual contributions are essential for reliability reasons. A formal analysis is to follow in Section 2.7.

⁶ If complex symbols were allowed, $R_s$ would have to be diagonal for decoupled ML decisions, too.

2.6 Capacity and Rate

Since capacity considerations have been one of the major driving forces for the development of MIMO systems, it is quite natural that plenty of related questions have been posed, which have attracted a lot of attention in the literature ever since, see [45] and the references therein. Depending on certain constraints like, e. g., channel knowledge and time variance, different expressions have been derived for characterizing the capacity of a MIMO channel. In many introductory papers on MIMO systems like, e. g., [40], the use of MIMO techniques is motivated with the capacity expression for scenarios with fixed flat fading channel impulse responses where the transmitter does not have channel state information. This capacity,⁷ which applies to the type of scenarios that we consider, is known to be

\[
C = \log_2 \left( \det \left( I_{n_r} + \frac{\sigma_s^2}{\sigma_n^2} H H^H \right) \right) . \tag{2.48}
\]

Here, $\sigma_s^2$ is the average power of the transmit symbols per real dimension. Applying some well known properties of matrices and determinants, we may easily verify that the following equalities hold:

\[
\det \left( I_{n_r} + \frac{\sigma_s^2}{\sigma_n^2} H H^H \right) = \prod_{i=1}^{n_\lambda} \left( 1 + \frac{\sigma_s^2}{\sigma_n^2} \lambda_i \right) = \det \left( I_{n_t} + \frac{\sigma_s^2}{\sigma_n^2} H^H H \right) \tag{2.49}
\]

where $\lambda_i$ with $i \in \{1, \ldots, n_\lambda\}$ are the non-zero eigenvalues of $H H^H$ or, equivalently, of $H^H H$. It immediately follows that the capacity expression is directly linked with the transmission matrix $R$ and its eigenvalues:

\begin{align}
C &= \log_2 \left( \det \left( I_{n_t} + \frac{\sigma_s^2}{\sigma_n^2} R \right) \right) \tag{2.50} \\
&= \sum_{i=1}^{n_\lambda} \log_2 \left( 1 + \frac{\sigma_s^2}{\sigma_n^2} \lambda_i \right) . \tag{2.51}
\end{align}

⁷ In the strict information theoretic sense [19], this is not a capacity because a maximization over the input constellations has not been carried out. However, unavailable knowledge about the channel impulse response at the transmitter prevents such a maximization. For this reason, it is common practice to refer to these expressions as a capacity, since it is the best we can do considering the constraints of the model.

Equation (2.51) scales linearly with the number of non-zero eigenvalues. In other words, each additional non-zero eigenvalue, which we refer to as an eigenmode of the MIMO channel, provides a new degree of freedom that may be used for the transmission of an additional data stream, superimposed on the already existing ones. In total, the MIMO channel provides $n_\lambda$ degrees of freedom where, expressed in mathematical terms, $n_\lambda$ is the rank of $R$. Therefore, the MIMO channel supports a maximum of $n_\lambda$ parallel data streams that can be transmitted independently. An STBC that makes use of the additional degrees of freedom by transmitting multiple data symbols per time slot is said to exploit the multiplexing gain. To formalize the term multiplexing gain, we define the transmission rate of an STBC as

\[
r = \frac{n}{2\ell} . \tag{2.52}
\]

This is the average number of complex data symbols transmitted per time slot. Commonly, the order of the multiplexing gain is defined by the rate of the scheme. In the following, we refer to an STBC as having full rate or maximum multiplexing gain if $r = n_\lambda$.⁸ In most cases, this means that we choose $r = n_t$ if we want to achieve full rate. Note that the factor $1/2$ in the rate definition is due to the fact that we refer to $n$ as being the number of real-valued data symbols.
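The equalities (2.48) to (2.51) are easy to verify numerically. The following sketch assumes a randomly drawn complex channel with $n_r = 2$ and $n_t = 4$, an arbitrary ratio $\sigma_s^2/\sigma_n^2 = 10$, and takes $R = H^H H$ as suggested by (2.49) and (2.50); the variable names are chosen for this illustration only.

```python
import numpy as np

rng = np.random.default_rng(2)
n_r, n_t = 2, 4
# Assumed i.i.d. complex Gaussian channel realization
H = (rng.standard_normal((n_r, n_t))
     + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
snr = 10.0  # stands for sigma_s^2 / sigma_n^2

# (2.48): determinant of the n_r x n_r matrix
c_rx = np.log2(np.linalg.det(np.eye(n_r) + snr * (H @ H.conj().T)).real)

# (2.50): determinant of the n_t x n_t matrix with R = H^H H
R = H.conj().T @ H
c_tx = np.log2(np.linalg.det(np.eye(n_t) + snr * R).real)

# (2.51): sum over the non-zero eigenvalues (eigenmodes) of R
eigvals = np.linalg.eigvalsh(R)
nonzero = eigvals[eigvals > 1e-10]
c_eig = np.sum(np.log2(1.0 + snr * nonzero))
n_lambda = nonzero.size   # rank of R, here min(n_r, n_t)
```

All three values coincide up to rounding, and `n_lambda` equals the rank of `R`, i.e. the number of eigenmodes available for multiplexing.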

Remarks. a) As we do not have channel knowledge at the transmitter, it is not possible to address each of the eigenmodes separately. The transmission of the parallel data streams is done jointly across all eigenmodes. That means it is the task of the receiver to isolate the data streams from each other. With perfect channel knowledge at the transmitter, this separation could have been done at the transmitter already. Besides, perfect channel knowledge would allow us to apply water filling techniques [19, page 349] where the power of each data stream is adjusted according to the state of the eigenmode. This results in a slightly higher capacity, but does not have any influence on the actual achievable multiplexing gain. In our case, we transmit all data streams with equal power, a strategy that is shown to be optimal if the transmitter does not have channel knowledge in a Rayleigh fading environment [107]. b) Furthermore, we want to point out that we find (2.50) to be more convenient than (2.48) because it contains $R$ explicitly and allows us to draw some important conclusions on the structure of the spreading matrices and dispersion matrices. It is also noteworthy that the close tie between the transmission matrix $R$ and the capacity expression is one of the advantages of the spreading model that is to be discussed in the next subsection.

⁸ Contrary to our definition, $r = 1$ is sometimes referred to as being full rate in the literature. It matches our definition only if either the transmitter or the receiver has just a single antenna.



2.6.1 Spreading Matrices and Capacity

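Since we usually use real-valued data symbols in our considerations, it is more natural to give the capacity as a function of real-valued matrices, i. e., we apply real-valued notation:

\[
C = \frac{1}{2} \log_2 \left( \det \left( I_{2n_t} + \frac{\sigma_s^2}{\sigma_n^2} R \right) \right) . \tag{2.53}
\]

Here, $R$ denotes the real-valued $2n_t \times 2n_t$ transmission matrix. Both expressions, (2.50) and (2.53), return the same value. The factor $1/2$ is due to the fact that the real-valued $R$ simply has twice the dimension of its complex-valued counterpart, with each eigenvalue occurring twice,⁹ i. e., the determinant of the real-valued representation of a matrix $A$ equals $|\det(A)|^2$. Using the same ideas, we can easily incorporate several time slots:

\[
C = \frac{1}{2\ell} \log_2 \left( \det \left( I_{2n_t\ell} + \frac{\sigma_s^2}{\sigma_n^2} R_\ell \right) \right) . \tag{2.54}
\]

Here, the factor $1/\ell$ guarantees that the expression stays normalized with respect to a single time slot.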
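The statement that (2.50) and (2.53) return the same value can be illustrated as follows. The sketch assumes one common real-valued representation of a complex matrix, namely the block form [[Re A, -Im A], [Im A, Re A]]; the ordering of real and imaginary parts used elsewhere in this work may differ, which would not change eigenvalues or determinants. The helper `real_rep` and all variable names are hypothetical.

```python
import numpy as np

def real_rep(A):
    """Assumed real-valued representation [[Re A, -Im A], [Im A, Re A]]."""
    return np.block([[A.real, -A.imag], [A.imag, A.real]])

rng = np.random.default_rng(3)
n_r, n_t = 3, 2
H = (rng.standard_normal((n_r, n_t))
     + 1j * rng.standard_normal((n_r, n_t))) / np.sqrt(2)
snr = 5.0  # stands for sigma_s^2 / sigma_n^2

R = H.conj().T @ H                    # complex-valued, n_t x n_t
R_real = real_rep(H).T @ real_rep(H)  # real-valued, 2 n_t x 2 n_t

# (2.50) with complex notation versus (2.53) with real-valued notation
c_complex = np.log2(np.linalg.det(np.eye(n_t) + snr * R).real)
c_real = 0.5 * np.log2(np.linalg.det(np.eye(2 * n_t) + snr * R_real))

# Each eigenvalue of R occurs twice in R_real
eig_complex = np.linalg.eigvalsh(R)
eig_real = np.linalg.eigvalsh(R_real)

# Determinant relation behind the factor 1/2
A = np.eye(n_t) + snr * R
det_real = np.linalg.det(real_rep(A))
det_abs2 = abs(np.linalg.det(A)) ** 2
```

The doubled eigenvalues square the determinant, and the factor $1/2$ in front of the logarithm compensates for exactly that.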

A simple reasoning gives us some hints on how to choose spreading matrices that are optimal in terms of capacity. To begin with, let us interpret $H_\ell S$ as an effective channel impulse response. It is straightforward to show that the mutual information (per time slot) is then upper bounded by

\[
C = \frac{1}{2\ell} \log_2 \left( \det \left( I_{2n_t\ell} + \frac{\sigma_x^2}{\sigma_n^2} S^T R_\ell S \right) \right) \tag{2.55}
\]

where $\sigma_x^2$ is the average power of the real-valued data symbols. Optimum spreading matrices have to preserve the value of the original capacity expression in (2.54). In other words, the mutual information between the transmit signal vector and the signal vector at the receiver should not be degraded due to spreading. Comparing (2.55) with (2.54), we have to consider two aspects, namely, the final value of $C$ and the variance of the signal that is actually transmitted from the antennas. Matrix theory tells us [11] that orthonormal spreading matrices $S$ do not affect the eigenvalues of the original transmission matrix $R_\ell$. As a result, the value of $C$ stays unchanged (assuming $\sigma_s^2 = \sigma_x^2$ for now). This immediately becomes clear from

\[
C = \frac{1}{2\ell} \log_2 \left( \det \left( S^T \right) \det \left( I_{2n_t\ell} + \frac{\sigma_x^2}{\sigma_n^2} R_\ell \right) \det \left( S \right) \right) \tag{2.56}
\]

since the determinant of orthonormal matrices is one or minus one and (2.56) is a valid representation of (2.55) if $S$ is orthonormal. To come up with (2.56), we applied some basic properties of determinants, including that $\det(AB) = \det(A)\det(B)$. Finally, we have to justify the assumption that $\sigma_s^2 = \sigma_x^2$. Having defined an orthonormal spreading matrix $S$ and fixed $\sigma_x^2$, it is easily verified that the variance of the transmit symbols $\sigma_s^2$ equals the variance of the data symbols $\sigma_x^2$, namely

\[
\sigma_s^2 I = E\left( s s^T \right) = E\left( S x x^T S^T \right) = S \, E\left( x x^T \right) S^T = \sigma_x^2 I . \tag{2.57}
\]

Looking at the entire constellation, orthonormal spreading matrices preserve the norm of the data vectors and their pairwise distances. Geometrically, the application of such spreading matrices corresponds to a rotation of the data vector in the signal space. We remark that similar considerations have also been carried out in [52].

⁹ Information theoretic books often have the factor $1/2$ because they consider real-valued quantities, e. g. [19].

Generalizing the results from above, it is easily shown that any real-valued spreading matrix with orthonormal rows fulfills the necessary conditions required to preserve mutual information. This means that spreading matrices may be rectangular, in general, with more columns than rows. Despite the fact that more data symbols are transmitted in parallel, the order of the multiplexing gain is not affected by any spreading matrix that complies with these conditions. In contrast to other spreading matrices, we want to stress explicitly that spreading matrices with orthonormal rows do not affect the mutual information, no matter what $R$ is. Hence, we refer to space time constellations whose spreading matrices have orthonormal rows as mutual information preserving constellations in the following. The construction of such spreading matrices will be an important design criterion in Chapter 4.
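A quick numerical check of these statements is sketched below. It assumes a random positive semidefinite matrix in place of $R_\ell$, takes $\sigma_s^2 = \sigma_x^2$, and, for a rectangular spreading matrix with $n$ columns, uses an identity matrix of size $n$ inside the determinant of (2.55). The function `capacity_per_slot` and all variable names are hypothetical.

```python
import numpy as np

def capacity_per_slot(R, snr, ell, S=None):
    """Evaluates (2.54) if S is None, otherwise (2.55) with S^T R S."""
    M = R if S is None else S.T @ R @ S
    _, logdet = np.linalg.slogdet(np.eye(M.shape[0]) + snr * M)
    return logdet / np.log(2.0) / (2 * ell)

rng = np.random.default_rng(4)
n_t, ell = 2, 2
m = 2 * n_t * ell                    # dimension of the real-valued R_l
B = rng.standard_normal((m, m))
R_ell = B.T @ B                      # assumed stand-in for R_l (symmetric PSD)
snr = 3.0                            # stands for sigma_x^2 / sigma_n^2

# Square orthonormal spreading matrix (a rotation or reflection)
S_square, _ = np.linalg.qr(rng.standard_normal((m, m)))

# Rectangular spreading matrix with orthonormal rows: m rows, m + 4 columns
Q, _ = np.linalg.qr(rng.standard_normal((m + 4, m)))
S_wide = Q.T

c_ref = capacity_per_slot(R_ell, snr, ell)               # (2.54)
c_square = capacity_per_slot(R_ell, snr, ell, S_square)  # (2.55), square S
c_wide = capacity_per_slot(R_ell, snr, ell, S_wide)      # (2.55), wide S

# (2.57): with E(x x^T) = sigma_x^2 I, the transmit covariance is
# sigma_x^2 S S^T, which equals sigma_x^2 I for orthonormal rows
gram_rows = S_wide @ S_wide.T
```

For the wide matrix the equality follows from $\det(I + S^T R_\ell S) = \det(I + R_\ell S S^T)$ together with $S S^T = I$, so orthonormal rows are sufficient and a square $S$ is not required.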

Remarks. a) With complex-valued notation, mutual information preserving spreading matrices require at least twice as many columns as rows. This is because these matrices map real-valued data symbols to complex-valued transmit symbols. Only if we allowed for complex-valued data symbols would the same reasoning from above apply and unitary spreading matrices be optimal. b) Reconsidering (2.55) with a fixed realization of $R_\ell$, we can actually construct non-orthogonal spreading matrices that do not affect the transmit power, but achieve higher mutual information. An improvement is impossible only if the eigenvalues of $R_\ell$ are all the same, i. e., $R_\ell$ is a scaled identity matrix. The optimized spreading matrix that maximizes the expression in (2.55) turns out to be the water filling solution, which is known to achieve capacity when the transmitter has knowledge about the channel impulse response [45, 81, 110]. Such spreading matrices, however, work well only for particular channel realizations and fail for many others. For this reason, channel knowledge at the transmitter is essential for being able to exploit these additional resources. Without channel knowledge at the transmitter, we need a fixed spreading matrix that works well for the majority of channel realizations, because it is impossible to adjust it according to the channel conditions. The results above show us that spreading matrices with orthonormal rows fulfill this condition.
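Remark b) can be illustrated at the level of eigenmodes. The sketch below is a simplification, not the construction of an actual spreading matrix: it assumes that the channel has already been decomposed into parallel eigenmodes with known gains $\lambda_i/\sigma_n^2$ and compares the textbook water filling power allocation with equal power under the same total power. The functions `water_filling` and `sum_rate` and the example gains are hypothetical.

```python
import numpy as np

def water_filling(gains, total_power):
    """Water filling over parallel channels with positive gains lambda_i / sigma_n^2."""
    gains = np.asarray(gains, dtype=float)
    order = np.argsort(gains)[::-1]       # strongest eigenmode first
    g = gains[order]
    power = np.zeros_like(g)
    for k in range(len(g), 0, -1):        # try the k strongest eigenmodes
        mu = (total_power + np.sum(1.0 / g[:k])) / k   # water level
        p = mu - 1.0 / g[:k]
        if p[-1] >= 0:                    # weakest active mode gets non-negative power
            power[:k] = p
            break
    out = np.zeros_like(power)
    out[order] = power                    # undo the sorting
    return out

def sum_rate(gains, power):
    """Sum of log2(1 + gain_i * power_i) over all eigenmodes."""
    return np.sum(np.log2(1.0 + np.asarray(gains, dtype=float) * power))

gains = np.array([4.0, 1.0, 0.1])         # assumed unequal eigenmode gains
total_power = 1.0
p_wf = water_filling(gains, total_power)
p_eq = np.full(gains.size, total_power / gains.size)
rate_wf = sum_rate(gains, p_wf)
rate_eq = sum_rate(gains, p_eq)

p_flat = water_filling([2.0, 2.0, 2.0], total_power)  # equal gains
```

For unequal gains the water filling allocation switches off the weakest eigenmode and achieves a higher sum rate than equal power; for equal gains it reduces to equal power, in line with the statement that no improvement is possible when $R_\ell$ is a scaled identity matrix.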

Example 2.6 (Mutual information provided by Alamouti's OSTBC). Alamouti's OSTBC is known to preserve mutual information only for certain channel conditions [93]. Based on the observations we made in this section, we adopt a modified approach to analyze Alamouti's OSTBC with respect to mutual information. First of all, let us consider the corresponding spreading matrix in (2.32). Its dimensionality already tells us that this spreading matrix cannot be optimal in terms of mutual information for general MIMO channels because the matrix has fewer columns than rows. Alamouti's OSTBC requires two transmit antennas, which means that $R_2$ is an $8 \times 8$ matrix. To fully cover the potential of multiplexing, it is necessary to transmit at least eight real-valued data symbols in parallel, but Alamouti's scheme only transmits four jointly.

Nevertheless, Alamouti’s OSTBC may still be optimal as long as the number of eigen-

modes of the channel is limited
