UNIVERSITY OF CALIFORNIA SAN DIEGO
Bandwidth Efficient Channel Estimation for Multiple-Input Multiple-Output
(MIMO) Wireless Communication Systems: A Study of Semi-Blind and
Superimposed Schemes.
A dissertation submitted in partial satisfaction of the
requirements for the degree Doctor of Philosophy
in
Electrical Engineering
(Communication Theory and Systems)
by
Aditya K. Jagannatham
Committee in charge:
Professor Bhaskar D. Rao, Chair
Professor Ian Abramson
Professor Robert Bitmead
Professor Kenneth Kreutz-Delgado
Professor Laurence Milstein
2007
Copyright
Aditya K. Jagannatham, 2007
All rights reserved.
The dissertation of Aditya K. Jagannatham is approved,
and it is acceptable in quality and form for publication
on microfilm:
Chair
University of California San Diego
2007
To My Father,
Prof. Anantha Swamy Jagannatham
(January 5, 1950 - October 14, 2004)
TABLE OF CONTENTS
Signature Page
Dedication
Table of Contents
List of Figures
List of Tables
Acknowledgements
Vita and Publications
Abstract
1 Introduction
    1.1 MIMO System Modeling and Channel Estimation
    1.2 Estimation Philosophies
        1.2.1 Pilot based Estimation
        1.2.2 Blind Estimation
        1.2.3 Semi-Blind Philosophy
    1.3 Complex-Constrained Cramer-Rao Bounds
    1.4 Whitening-Rotation Based Semi-Blind MIMO Channel Estimation
    1.5 FIM based Regularity Analysis of Semi-Blind MIMO FIR Channels
    1.6 Semi-Blind Channel Estimation for MRT Based MIMO Systems
    1.7 Superimposed Pilots for MIMO Channel Estimation
    1.8 Channel Estimation for Time-Varying Channels
    1.9 Discussion
2 Complex Constrained Cramer-Rao Bound (CC-CRB)
    2.1 Introduction
    2.2 CRB For Complex Parameters With Constraints
    2.3 A Constrained Matrix Estimation Example
        2.3.1 Problem Formulation
        2.3.2 Cramer-Rao Bound
        2.3.3 ML Estimate and Simulation Results
    2.4 Conclusion
3 Whitening-Rotation Based Semi-Blind MIMO Channel Estimation
    3.1 Introduction
    3.2 Problem Formulation
    3.3 Estimation accuracy for semi-blind approaches
        3.3.1 Estimation Accuracy of the WR scheme
        3.3.2 Constrained CRB of the WR scheme
    3.4 Algorithms
        3.4.1 Orthogonal Pilot ML (OPML) estimator
        3.4.2 Iterative ML procedure for general pilot - IGML
        3.4.3 Total Optimization
    3.5 Simulation Results
    3.6 OFDM Channel Estimation
        3.6.1 Problem Description
        3.6.2 Simulation results
    3.7 Conclusions
    3.8 Appendix for Chapter(3)
4 Fisher Information Based Regularity and Semi-Blind Estimation of MIMO-FIR Channels
    4.1 Introduction
    4.2 Problem Formulation
    4.3 Semi-Blind Fisher Information Matrix (FIM)
        4.3.1 FIM: A General Result
        4.3.2 Blind FIM
        4.3.3 Pilots and FIM
        4.3.4 Pilots and Identifiability
    4.4 Semi-Blind Estimation: Performance
        4.4.1 Asymptotic Semi-Blind FIM
    4.5 Semi-blind Estimation: Algorithm
        4.5.1 Orthogonal Pilot ML (OPML) for Q Estimation
        4.5.2 Orthogonal Pilot Matrix Construction
    4.6 Simulation Results
    4.7 Conclusion
    4.8 Appendix for Chapter(4)
        4.8.1 Proof of Theorem 2
        4.8.2 Proof of Theorem 3
5 Semi-Blind Estimation for Maximum Ratio Transmission
    5.1 Introduction
    5.2 Preliminaries
        5.2.1 System Model and Notation
        5.2.2 Conventional Least Squares Estimation (CLSE)
        5.2.3 Semi-Blind Estimation
    5.3 Conventional Least Squares Estimation (CLSE)
        5.3.1 Perturbation of Eigenvectors
        5.3.2 MSE in v_c
        5.3.3 Received SNR and Symbol Error Rate (SER)
    5.4 Closed-Form Semi-Blind estimation (CFSB)
        5.4.1 MSE in v_s with Perfect u_s
        5.4.2 Received SNR with Perfect u_s
        5.4.3 MSE in v_s with Noise-Free Training
        5.4.4 Received SNR with Noise-Free Training
        5.4.5 Semi-blind Estimation: Summary
    5.5 Comparison of CLSE and Semi-blind Schemes
        5.5.1 Performance of a 2 × 2 System with CLSE and CFSB
        5.5.2 Discussion
        5.5.3 Semi-blind Estimation: Limitations and Alternative Solutions
    5.6 Simulation Results
    5.7 Conclusion
        5.7.1 Proof of Lemma 1:
        5.7.2 Received SNR with perfect u_s
        5.7.3 Proof for equations (5.28) and (5.29)
        5.7.4 Performance of Alamouti Space-Time Coded Data with Conventional Estimation
        5.7.5 Other Useful Lemmas:
6 Superimposed Pilots for MIMO Channel Estimation
    6.1 Superimposed Pilots (SP) Based MIMO Estimation
    6.2 MSE of Estimation
        6.2.1 Cramer-Rao Bound (CRB) for SP Estimation
        6.2.2 Semi-Blind SP Estimation
    6.3 Throughput Performance
        6.3.1 A Throughput Lower Bound for Channels with Correlated Noise
        6.3.2 Throughput Comparison of Superimposed and Conventional Pilots (CP)
        6.3.3 Conventional Pilots (CP) based estimation
    6.4 Optimal Power Allocation in SP
        6.4.1 Minimum Variance Distortionless Response (MVDR) Beamformer
    6.5 Simulation Results
        6.5.1 MSE of Estimation
        6.5.2 Throughput Performance
        6.5.3 Optimal Power Allocation
    6.6 Conclusion
    6.7 Appendix for Chapter(6)
        6.7.1 Proof of Expression for MSEs in section(6.2)
        6.7.2 Proof of Theorem 6
        6.7.3 Proof of Theorem 7
        6.7.4 MVDR - Post-Processing SNR
7 MIMO Time Varying Channel Estimation
    7.1 Introduction
    7.2 Problem Setup
        7.2.1 SP Estimation Based on the CEBEM MIMO Model
    7.3 EM Based Algorithm for CEBEM SP Estimation
        7.3.1 Likelihood computation and Sphere Decoding
    7.4 Simulations
    7.5 Conclusion
8 Conclusions
Bibliography
LIST OF FIGURES
Figure 1.1: Schematic representation of a MIMO System.
Figure 1.2: Schematic representation of a MIMO frame.
Figure 1.3: Pictorial Representation of Pilot vs. Blind Tradeoff.
Figure 1.4: Pictorial Representation of Conventional Vs. Superimposed Pilots
Figure 2.1: Computed MSE Vs SNR, $|\hat{Q}(1,1) - Q(1,1)|^2$
Figure 2.2: Computed MSE Vs SNR, $\|\hat{Q} - Q\|^2$
Figure 3.1: MSE vs. SNR of OPML semi-blind channel estimation and the semi-blind CRB with perfect knowledge of W. Also shown for reference is MSE of the exclusively training based channel estimate. H is an 8 × 4 complex flat-fading channel matrix and pilot length L = 12.
Figure 3.2: Computed MSE vs. Pilot length (L) for the OPML, IGML, ROML and exclusive training based channel estimation. H is an 8 × 4 complex flat-fading channel matrix and SNR = 8 dB.
Figure 3.3: Comparison of OPML with perfect W, OPML with imperfect or estimated W, total optimization and training based estimation of H.
Figure 3.4: Probability of Bit Error vs. SNR for 8 × 4 MIMO system employing OPML, Total Optimization (N = 1000, 500). The performance of the exclusively training based channel estimate is also given for comparison.
Figure 3.5: Constrained Vs. unconstrained channel estimation for OFDM.
Figure 4.1: Schematic representation of an SB system.
Figure 4.2: Schematic representation of input and output symbol blocks.
Figure 4.3: Paley Hadamard Matrix
Figure 4.4: Rank deficiency of the complex MIMO FIM Vs. number of transmitted pilot symbols (Lp) for a 6 × 5 MIMO FIR system of length Lh = 5.
Figure 4.5: MSE Vs SNR in a 4 × 2 MIMO channel with Lh = 2 channel taps, Lp = 20 pilot symbols.
Figure 4.6: MSE performance for estimation of a 4 × 2 MIMO frequency-selective channel. Left - MSE Vs. Lp and Right - MSE Vs. number of blind symbols.
Figure 4.7: Symbol error rate (SER) Vs. SNR for QPSK symbol transmission of a 4 × 2 MIMO frequency selective channel with Lh = 2 channel taps.
Figure 5.1: MIMO system model, with beamforming at the transmitter and receiver.
Figure 5.2: Comparison of the transmission scheme for conventional least squares (CLSE) and closed-form semi-blind (CFSB) estimation.
Figure 5.3: Average channel gain of a t = r = 2 MIMO channel with L = 2, N = 8 and PD = 6dB, for the CLSE and beamforming, CFSB and beamforming (with and without knowledge of u1), CLSE and white data (Alamouti-coded), and perfect beamforming at transmitter and receiver. Also plotted is the theoretical result for the performance of Alamouti-coded data with channel estimation error, given by (5.34).
Figure 5.4: MSE in v1 vs training data length L, for a t = r = 4 MIMO system. Curves for CLSE, CFSB and OPML with perfect u1 are plotted. The top five curves correspond to a training symbol SNR of 2dB, and the bottom five curves 10dB.
Figure 5.5: SER of beamformed-data vs number of training symbols L, t = r = 4 system, for two different values of white-data length N, and data and training symbol SNR fixed at PT = PD = 6dB. The two competing semi-blind techniques, OPML and CFSB, are plotted. CFSB marginally outperforms OPML for N = 50, as it only requires an accurate estimate of u1 from the blind data.
Figure 5.6: SER vs L, t = r = 4 system, for two different values of N, and data and training symbol SNR fixed at PT = PD = 6dB. The theoretical and experimental curves are plotted for the CFSB estimation technique. Also, the LCSB technique outperforms both the conventional (CLSE) and semi-blind (CFSB) techniques.
Figure 5.7: SER versus data SNR for the t = r = 2 system, with L = 2, N = 16, γp = 2dB. ‘CLSE-Alamouti’ refers to the performance of the spatially-white data with conventional estimation, ‘CLSE-bf’ is the performance of the beamformed data with vc, ‘CFSB’ and ‘LCSB’ refer to the performance of the corresponding techniques after accounting for the loss due to the white data. ‘CFSB-u1’ is the performance of CFSB with perfect-u1, and ‘Perf-bf’ is the performance with the perfect u1 and v1 assumption.
Figure 6.1: Schematic of a Superimposed Pilot System.
Figure 6.2: Schematic diagram of the superimposed pilot (SP) frame structure.
Figure 6.3: Schematic of conventional (time-multiplexed) pilots frame (block) structure.
Figure 6.4: MSE of Estimation of MIMO wireless channel with r = t = 4, PNR = 5dB, Nf = 10 and Lp = 8 symbols.
Figure 6.5: MSE of Estimation of SIMO Rayleigh wireless channel with r = 4 antennas, Nf = 20, Lp = 8, PNR = 5dB.
Figure 6.6: Throughput performance of SP and CP vs. Nf, SNR = PNR = 5dB, Lp = 64.
Figure 6.7: Throughput performance of SP and CP Vs. SNR for a 4 × 4 Rayleigh flat-fading MIMO channel with Nf = 10 sub-frames and Lp = 64 pilots.
Figure 6.8: Detection performance vs. SNR for SP based estimation. SER vs. SNR ($P_d/\sigma_n^2$) for QPSK signaling, r = 4 SIMO channel and different [Nf, Lp, αs (dB)].
Figure 6.9: Optimal power allocation ratio $10 \log_{10}(\rho_d^\star / \rho_t^\star)$ of an r = 4 antenna SIMO channel Vs. Total Power (αs dB) for various Nf, Lp.
Figure 7.1: MSE of Kalman based estimation of a time-varying wireless channel.
LIST OF TABLES
Table 6.1: Table showing covariance matrices for SP and CP systems with channel estimation error.
ACKNOWLEDGEMENTS
First and foremost, I would like to thank my advisor Prof. Bhaskar Rao
for his continuous guidance and dedicated personal efforts which led to the fruition
of this research work. His valuable advice and inputs contributed to a great extent
in this work. I am also thankful to him for the uninterrupted financial support
during my several years here at UCSD. It was a privilege to work with him. I
would also like to thank my committee members, Prof. Ian Abramson, Prof.
Robert Bitmead, Prof. Kenneth Kreutz-Delgado and Prof. Laurence Milstein,
for their inputs and critique which have helped address important aspects in this
research. I owe special gratitude to the UCSD CoRe research grant agency¹ and
the affiliated companies for supporting me throughout my PhD program.
¹ This work was supported by CoRe Research Grants Com00-10074, com02-10119, com04-10176 and com04-10173.
The text of chapter 2, in part, is a reprint of the material as it appears
in A.K. Jagannatham and B.D. Rao, “Cramer-Rao Lower Bound for Constrained
Complex Parameters”, IEEE Signal Processing Letters, Vol. 11, No. 11, Nov’04,
Pages: 875 - 878 and A. K. Jagannatham and B. D. Rao, “Complex Constrained
CRB and its applications to Semi-Blind MIMO and OFDM Channel Estimation”,
Sensor Array and Multichannel Signal Processing Workshop Proceedings, 2004,
Barcelona, Spain, 18-21 July 2004, Pages: 397 - 401. Chapter 3, in part, is a
reprint of a paper which has been published as A.K. Jagannatham and B. D.
Rao, “Whitening-Rotation Based Semi-Blind MIMO Channel Estimation”, IEEE
Transactions on Signal Processing Vol. 54, No. 3, Mar’06, Pages: 861 - 869, A. K.
Jagannatham and B. D. Rao, “A Semi-Blind Technique For MIMO Channel Matrix
Estimation”, 4th IEEE Workshop on Signal Processing Advances in Wireless Com-
munications, 2003, Rome, Italy, 15-18 June 2003, Pages: 304 - 308, and
A. K. Jagannatham and B. D. Rao, “Constrained ML Algorithms for Semi-Blind
MIMO Channel Estimation”, IEEE Global Telecommunications Conference, 2004
GLOBECOM ’04, Vol. 4, Nov’29 - Dec’3, 2004, Pages: 2475 - 2479. The text of
chapter 4 has appeared in A.K. Jagannatham and B. D. Rao, “FIM Regularity
for Gaussian Semi-Blind MIMO FIR Channel Estimation”, Conference Record of
the Thirty-Ninth Asilomar Conference on Signals, Systems and Computers, Oct.
28 - Nov. 1, 2005, Pages: 848 - 852. Chapter 5, in part, is a reprint of the
material which has appeared as A.K. Jagannatham, C.R. Murthy and B.D. Rao,
“A Semi-Blind Channel Estimation Scheme for MRT”, Proceedings of IEEE Inter-
national Conference on Acoustics, Speech, and Signal Processing, 2005, (ICASSP
’05), Mar’05, Vol. 3, Pages: 585 - 588. The final chapter, chapter 6, is adapted
from the content of A.K. Jagannatham and B.D. Rao, “Superimposed Vs. Conven-
tional Pilots for Channel Estimation”, Conference Record of the Fortieth Asilomar
Conference on Signals, Systems and Computers, Nov., 2006. The dissertation au-
thor was the primary researcher and author, and the co-authors listed in these
publications contributed to or supervised the research which forms the basis for
this dissertation.
My labmate Chandra R. Murthy deserves a special mention not only for
his valuable technical feedback but also for directly collaborating with me on part
of this work. I am grateful to the DSP lab members Abhijeet Bhorkar, Zhongren
Cao, Ethan Duni, Yogananda Isukapalli, Cecile Levasseur, Joseph Murray, June
Chul Roh, Shankar Shivappa, Anand Subramaniam, Thomas Svantesson, Yeliz
Tokgoz, David Wipf, Chengjin Zhang, Wenyi Zhang, Jun Zheng and UCSD col-
leagues Preeti Nagvanshi, Ramesh Annavajjala for the many hours of discussions,
both technical and non-technical. UCSD’s Mesa graduate housing has provided
me with a very comfortable and affordable abode during the years of my graduate
studies. This stay was made lively by my roommates Ashay Dhamdhere, Sandeep
Kanumuri, Daniel Richter and I thank them for providing great company. I also
wish to thank friends at UCSD, Sumit Bhardwaj, Anuj Grover, Anuj Mishra,
Swamy Muddu, Ali Rangwala, Sourja Ray, Satish Narayanasamy and Sachin Ta-
lati, whose companionship has contributed to enriching the grad life experience at
UCSD. My studies so far have been made a fun-filled experience by friends Mohan
Dunga, Prashanth Gangu, Phanindra Ganti, Srikanth Geedipalli, Ram Kolli,
Girish Nagavarapu, Sameer Ranjan, Pramod Reddy, Sridhar Reddy, Saurabha
Tavildar, Sampath Vetsa, Satish Vutukuru and many others.
I express a deep sense of gratitude to family friends Dr. Venkat R. Mali
and Harini Mali for their help, both material and emotional, during my stay here.
My uncle, Sreedhara Swamy Jagannatham and aunt Lalitha Jagannatham have
been a constant source of support and motivation for me during this entire period
and I wish to thank them profoundly. Finally, I wish to greatly thank my sister
Anila and brother-in-law Dr. Nandan R. Thirunahari for their help and encour-
agement over the years. Seeing them happy contributes to the joy of my life and
I wish them all the success in their careers.
In the tradition of India, it is not customary to thank one’s parents be-
cause they are in essence present in each of their child’s endeavors. This work
belongs to my mother Smt. Bhagya L. Jagannatham and my father Dr. Anantha
Swamy Jagannatham (Professor of Chemistry and former Vice Chancellor of Osma-
nia University, Hyderabad, India), as much as it is mine. The toughest challenge
during this course was coping with the loss of my father, whose dream it was to see
me earn this PhD. In him I lost a mentor, supporter and a great friend. I dedicate
this thesis to his loving memory. May his soul rest in peace.
VITA
1980 Born - Hyderabad, Andhra Pradesh, INDIA.
1995 Central Board of Secondary Education Certificate, Hyderabad Public School, Hyderabad, AP, INDIA.
1997 AP State Board of Intermediate Education Certificate, Little Flower Junior College, Hyderabad, INDIA.
2001 Bachelor of Technology, Electrical Engineering, Indian Institute of Technology Bombay, Powai, Mumbai, INDIA.
2001-2007 Research Assistant, Department of Electrical and Computer Engineering, University of California San Diego, La Jolla, CA, U.S.A.
2003 Graduate Student Intern, Zyray Wireless (Now Broadcom Inc.), San Diego, CA, U.S.A.
2004 Master of Science Degree, Electrical Engineering, University of California San Diego, La Jolla, CA, U.S.A.
2007 Doctor of Philosophy Degree, Electrical Engineering (Communication Theory and Systems), University of California San Diego, La Jolla, CA, U.S.A.
PUBLICATIONS
A. K. Jagannatham and B. D. Rao, “Whitening-Rotation Based Semi-Blind MIMO Channel Estimation”, IEEE Transactions on Signal Processing, Vol. 54, No. 3, Mar’06, Pages: 861 - 869.
A. K. Jagannatham and B. D. Rao, “Cramer-Rao Lower Bound for Constrained Complex Parameters”, IEEE Signal Processing Letters, Vol. 11, No. 11, Nov’04, Pages: 875 - 878.
A. K. Jagannatham and B. D. Rao, “Superimposed Vs. Conventional Pilots for Channel Estimation”, Conference Record of the Fortieth Asilomar Conference on Signals, Systems and Computers, Nov., 2006.
A. K. Jagannatham and B. D. Rao, “FIM Regularity for Gaussian Semi-Blind MIMO FIR Channel Estimation”, Conference Record of the Thirty-Ninth Asilomar Conference on Signals, Systems and Computers, Oct. 28 - Nov. 1, 2005, Pages: 848 - 852.
A. K. Jagannatham, C. R. Murthy and B. D. Rao, “A Semi-Blind Channel Estimation Scheme for MRT”, Proceedings of IEEE International Conference on Acoustics, Speech, and Signal Processing, 2005, (ICASSP ’05), Mar’05, Vol. 3, Pages: 585 - 588.
A. K. Jagannatham and B. D. Rao, “Constrained ML Algorithms for Semi-Blind MIMO Channel Estimation”, IEEE Global Telecommunications Conference, 2004 GLOBECOM ’04, Vol. 4, Nov’29 - Dec’3, 2004, Pages: 2475 - 2479.
A. K. Jagannatham and B. D. Rao, “Complex Constrained CRB and its applications to Semi-Blind MIMO and OFDM Channel Estimation”, Sensor Array and Multichannel Signal Processing Workshop Proceedings, 2004, Barcelona, Spain, 18-21 July 2004, Pages: 397 - 401.
A. K. Jagannatham and B. D. Rao, “A Semi-Blind Technique For MIMO Channel Matrix Estimation”, 4th IEEE Workshop on Signal Processing Advances in Wireless Communications, 2003, Rome, Italy, 15-18 June 2003, Pages: 304 - 308.
ABSTRACT OF THE DISSERTATION
Bandwidth Efficient Channel Estimation for Multiple-Input Multiple-Output
(MIMO) Wireless Communication Systems: A Study of Semi-Blind and
Superimposed Schemes
by
Aditya K. Jagannatham
Doctor of Philosophy in Electrical Engineering
(Communication Theory and Systems)
University of California, San Diego, 2007
Professor Bhaskar D. Rao, Chair
This dissertation aims to explore and analyze novel schemes for band-
width efficient channel estimation in multiple-input multiple-output (MIMO) wire-
less systems. As the number of receive/transmit antennas increases in MIMO sys-
tems, the number of channel coefficients to be estimated increases. This, together
with the low SNR of operation in MIMO systems, necessitates an increase in the
pilot symbol overhead which leads to a reduction in the bandwidth efficiency. To
alleviate this problem, we study several procedures such as whitening-rotation and
superimposed pilots for bandwidth efficient MIMO channel estimation. The Cramer-Rao
lower bound (CRLB) serves as an important tool in the performance evaluation of estimators which arise
frequently in the fields of communications and signal processing. In applications
such as semi-blind channel estimation one is frequently faced with the estimation of
constrained complex parameters. We present a result for the Cramer-Rao bound
(CRB) for complex-constrained parameters and the utility of this framework is
illustrated in the subsequent work on semi-blind channel estimation.
In addition to using the pilot sequence, the accuracy of the channel esti-
mate at the receiver can be enhanced by employing second-order statistical infor-
mation. For this purpose, we propose a whitening-rotation (WR) based algorithm
for semi-blind estimation of the complex flat-fading MIMO channel matrix H. Uti-
lizing the complex constrained CRB, we show that the semi-blind scheme can signifi-
cantly improve estimation accuracy. Next, we consider the problem of semi-blind
(SB) channel estimation for multiple-input multiple-output (MIMO) frequency-
selective (FS) channels. We motivate a Fisher information matrix (FIM) based
analysis of this semi-blind estimation problem and demonstrate that the rank de-
ficiency of the FIM is related to the number of un-identifiable parameters. We
also establish the minimum number of pilot symbols necessary to achieve regu-
larity (full-rank) of the FIM for identifiability. The efficacy and applicability of
the semi-blind philosophy is further exemplified by demonstrating its utility in
the context of channel estimation for MIMO systems employing Maximum Ratio
Transmission (MRT).
Superimposed pilots (SP) are another interesting alternative to reduce
the impact of pilot overhead without a significant increase in computational
complexity. We present a study of the mean-squared error (MSE) and throughput
performance of superimposed pilots (SP) for the estimation of a MIMO wireless
channel. We illustrate a semi-blind scheme for SP based MIMO channel estima-
tion, which improves performance over the traditional mean-estimator. A new
result is presented for the worst-case capacity of a communication channel with
correlated information symbols and noise. We also address the issue of optimal
source-pilot power allocation for SP. Finally, we consider the problem of esti-
mation of a time-selective MIMO wireless channel using superimposed pilot (SP)
symbols. We demonstrate a scheme for channel estimation based on a complex ex-
ponential basis expansion model (CEBEM) approximation of the time-selective
wireless channel. We further reduce the MSE of estimation by employing an
expectation-maximization (EM) based iterative estimation procedure.
1 Introduction
In recent years, the field of wireless communications has experienced
revolutionary growth that has changed the face of telecommunications. On the
one hand, the easy availability of GSM and CDMA based mobile wireless cellular
devices has connected hitherto far-flung corners of the world; on the other, Wi-Fi
based 802.11b/a devices have enabled ubiquitous data/content access. However,
at present, the bandwidth available on these devices is limited. Currently there
is a flurry of research and development activity to produce 4G wireless devices
that will in the near future support very high data rate applications such as DVB,
which distributes high-definition multimedia content.
One of the most challenging issues faced by designers in such an endeavor
is the harsh nature of the wireless radio channel. Unlike the wireline channel, the
wireless channel is highly variable and subject to fading. This fading nature of the
wireless channel arises from the superposition of multiple signal paths due to scat-
tering from local obstructions such as buildings, trees and other objects. Initially,
wireless communication systems were single-input single-output (SISO) systems,
which means that they employ a single transmit antenna that inputs the transmit
symbol stream into the radio channel, and a single output antenna that receives
a single copy of the signal. However, the capacity of such a SISO wireless link is
adversely affected by fading as the received signal is severely attenuated when the
channel is in a deep fade. This problem can be partially alleviated by introducing
multiple antennas at the receiver. By separating the receiver antennas by spacing
greater than approximately half the wavelength of the narrowband signal, one can
ensure that the radio channel seen by each of these antennas is an independent re-
alization of the fading process. Thus, at any given instant, the probability that all
of these channels are in a deep fade is greatly reduced, thereby ensuring signal reli-
ability. This scheme, whereby multiple antennas at the receiver ensure
improved signal quality by combating fading, is termed diversity reception.
Alternatively, it can be demonstrated that by introducing multiple transmit
antennas the resulting multiple-input single-output (MISO) system can combat
fading by using transmit diversity [1]. Thus, diversity at the wireless transmitter
or the receiver helps combat fading, and systems that use multiple antennas
at the transmitter or the receiver are colloquially termed smart antenna
systems.
Figure 1.1: Schematic representation of a MIMO System.
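The diversity argument above can be illustrated numerically. The following is a minimal sketch, assuming independent Rayleigh fading with unit average power on every receive branch, so that each branch gain |h|^2 is exponentially distributed with mean 1. The function names and the fade threshold are hypothetical choices made for illustration and do not appear in the dissertation.

```python
import math
import random


def deep_fade_probability(num_antennas, threshold):
    """Closed-form probability that every branch gain |h|^2 is below threshold.

    Assumes independent Rayleigh fading with unit average power per branch,
    so each |h|^2 is exponentially distributed with mean 1.
    """
    single_branch = 1.0 - math.exp(-threshold)
    return single_branch ** num_antennas


def simulate_deep_fade(num_antennas, threshold, trials=100_000, seed=0):
    """Monte Carlo estimate of the same probability."""
    rng = random.Random(seed)
    all_faded = 0
    for _ in range(trials):
        # Draw one exponentially distributed gain per receive antenna.
        gains = [rng.expovariate(1.0) for _ in range(num_antennas)]
        # The link is in a deep fade only if the best branch is below threshold.
        if max(gains) < threshold:
            all_faded += 1
    return all_faded / trials


if __name__ == "__main__":
    threshold = 0.1  # illustrative: 10 dB below the average branch power
    for r in (1, 2, 4):
        print(r, deep_fade_probability(r, threshold), simulate_deep_fade(r, threshold))
```

Under these assumptions the probability of a simultaneous deep fade falls geometrically with the number of independent branches, which is the reliability gain the text attributes to diversity reception.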
It is now interesting to consider the effect of introducing multiple an-
tennas at both the receiver and the transmitter; such a system is termed a
multiple-input multiple-output (MIMO) system. In one of the most interesting
results in communication theory, it can be demonstrated [2] that introducing
multiple antennas in this way makes possible a linear increase in the
information-theoretic capacity with the minimum of the number of transmit and
receive antennas. Thus, a MIMO system can effectively result in a multi-fold
increase in capacity over a conventional SISO wireless link. This is achieved by
transmitting different information streams over the transmit antennas and is
termed spatial multiplexing. Thus, a MIMO system offers the dual benefits
of increased capacity due to spatial multiplexing and fading suppression due to
receive/transmit diversity. These properties have greatly attracted the interest of
researchers in the wireless community and in recent years there has been a flurry
of activity in the MIMO area. This forms the general area of work described in
this thesis.
Channel estimation for a wireless system is a significantly more challenging
task than for a wireline channel. This is attributed to the mobility of wireless
devices, which causes temporal variations in the wireless channel arising from the
Doppler effect. Thus, the period of time for which the radio channel is invariant,
also known as the coherence time, is of the order of a few milliseconds,
necessitating frequent re-estimation of the channel coefficients.
The issue of channel estimation assumes even greater significance in the
context of a MIMO system. As the number of antennas increases, the number of
parameters to be estimated grows as the product $rt$, where $r$ and $t$ denote the
number of receive and transmit antennas, respectively. Further, because of the
diversity gain of the MIMO system, the SNR required at each antenna is lower. For
instance, employing binary orthogonal FSK modulation at an operating BER of
$2 \times 10^{-3}$, an SNR of 25 dB is required with a single receive antenna,
while an SNR of 12 dB suffices with 4 antennas [3]. Hence, in a MIMO system,
ironically, one needs to estimate many more parameters at a much lower SNR than in
a SISO system. Such harsh conditions motivate the development of more robust
channel estimation techniques which estimate the MIMO channel "efficiently". This
notion of efficiency will be clarified in the discussion that follows, and the topic
of efficient channel estimation forms the central goal of this thesis.
1.1 MIMO System Modeling and Channel Estimation
The MIMO wireless system can be represented as a matrix wireless channel, as
will be seen below. Let $x_d(k) \in \mathbb{C}^{t \times 1}$ be the $k$th transmitted
MIMO symbol vector. This vector $x_d(k)$ is given as
\[
x_d(k) = \begin{bmatrix} x_{d,1}(k) \\ x_{d,2}(k) \\ \vdots \\ x_{d,t}(k) \end{bmatrix},
\]
where $x_{d,j}(k)$, $1 \le j \le t$, is the symbol transmitted from the $j$th transmit
antenna. Similarly, the receive symbol vector $y_d(k)$ is given as
\[
y_d(k) = \begin{bmatrix} y_{d,1}(k) \\ y_{d,2}(k) \\ \vdots \\ y_{d,r}(k) \end{bmatrix},
\]
where $y_{d,i}(k)$, $1 \le i \le r$, is the received signal at the $i$th receive
antenna. In a flat-fading MIMO system (where the symbol duration is much greater
than the multipath delay spread of the channel), the input-output system model at
each receive antenna can be expressed as
\[
y_{d,i}(k) = \sum_{j=1}^{t} h_{i,j}\, x_{d,j}(k) + \eta_i(k),
\]
where $\eta_i(k)$ is the noise added at the $i$th receiver. This can be represented
in matrix form as
\[
y_d(k) = H x_d(k) + \eta(k),
\]
Figure 1.2: Schematic representation of a MIMO frame.
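where the matrix $H \in \mathbb{C}^{r \times t}$ represents the MIMO channel. This
matrix $H$ is given as
\[
H = \begin{bmatrix}
h_{11} & h_{12} & \cdots & h_{1t} \\
h_{21} & h_{22} & \cdots & h_{2t} \\
\vdots & \vdots & \ddots & \vdots \\
h_{r1} & h_{r2} & \cdots & h_{rt}
\end{bmatrix}.
\]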
The coefficient $h_{ij}$ represents the flat-fading channel between the $i$th
receiver and the $j$th transmitter. Knowledge of this channel matrix is necessary
for detection of the transmitted symbols $x_d(k)$. Hence, it is necessary to
estimate the channel matrix $H$. This procedure of estimating $H$ is known as
channel estimation. The channel estimate can then either be employed at the receiver
for detection [4] or be fed back to the transmitter [5, 6] for transmit precoding
and beamforming. Finally, it is worth mentioning that the combined impact of channel
state feedback and estimation error is an interesting problem which has been handled
in a comprehensive fashion in [7].
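The flat-fading model above can be sketched numerically. The following is a minimal illustration in plain Python, not part of the thesis: the antenna counts, the Gaussian fading model, the QPSK symbol values and the helper name `mimo_output` are all assumptions chosen for the example.

```python
import random

def mimo_output(H, x, noise):
    """Return y = H x + noise for an r x t channel H given as a list of rows."""
    return [sum(h_ij * x_j for h_ij, x_j in zip(row, x)) + n_i
            for row, n_i in zip(H, noise)]

random.seed(0)          # fixed seed so the sketch is repeatable
r, t = 3, 2             # illustrative antenna counts (assumed)

# Channel with unit-variance complex Gaussian entries (an assumed fading model).
H = [[complex(random.gauss(0, 0.5 ** 0.5), random.gauss(0, 0.5 ** 0.5))
      for _ in range(t)] for _ in range(r)]

x = [1 + 1j, -1 + 1j]   # one transmitted symbol vector (QPSK points, assumed)
eta = [complex(random.gauss(0, 0.05), random.gauss(0, 0.05)) for _ in range(r)]

y = mimo_output(H, x, eta)

# Per-antenna form: y_i = sum_j h_ij x_j + eta_i, written out for the first antenna.
y_0 = H[0][0] * x[0] + H[0][1] * x[1] + eta[0]
```

The matrix form and the per-antenna sum agree entry by entry, which is all the sketch is meant to show.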
1.2 Estimation Philosophies
1.2.1 Pilot based Estimation
A typical wireless system involves the transmission of a sequence of symbols
known as a frame. The number of symbols in the frame is termed the frame length.
Each such frame begins with a transmission of pilot symbols (or simply pilots), and
the number of these symbols is the pilot length of the frame. A schematic diagram of
such a system is shown in fig(1.2). These pilots are a fixed set of symbols which
are known at the receiver. Thus, by observing the outputs corresponding to this
known sequence of pilots, the receiver can estimate the MIMO channel. This estimate
can then be employed for detection of the information symbols transmitted
subsequently. Such a scheme is termed pilot based estimation and is the most
commonly employed channel estimation procedure. A mathematically rigorous treatment
of this procedure is presented in the subsequent chapters. Overall, the above scheme
has the dual benefits of a robust estimate and low computational complexity.
However, a major drawback of such a scheme is that the predetermined pilots
themselves carry no information. Hence, these pilot symbols are in effect an
overhead on the communication system and waste bandwidth, making such schemes
"bandwidth inefficient".
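As a rough illustration of pilot based estimation, the sketch below uses a least-squares estimate from a block of known pilots. It is an assumption-laden example rather than the estimator analyzed in the thesis: the least-squares form, the $\pm 1$ orthogonal pilot pattern, the dimensions and the noise level are all chosen here for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
r, t, L = 4, 2, 8        # receive antennas, transmit antennas, pilot length (assumed)

# Unknown channel, drawn here from an assumed complex Gaussian model.
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)

# Orthogonal pilots: rows of a +/-1 pattern, so that Xp @ Xp^H = L * I.
Xp = np.array([[1, 1, 1, 1, 1, 1, 1, 1],
               [1, -1, 1, -1, 1, -1, 1, -1]], dtype=complex)

noise = 0.05 * (rng.standard_normal((r, L)) + 1j * rng.standard_normal((r, L)))
Yp = H @ Xp + noise      # received pilot block

# Least-squares estimate H_hat = Yp Xp^H (Xp Xp^H)^{-1}.
H_hat = Yp @ Xp.conj().T @ np.linalg.inv(Xp @ Xp.conj().T)

mse = np.mean(np.abs(H - H_hat) ** 2)   # average squared error per channel coefficient
```

All `L` pilot columns are spent on estimation and carry no data, which is the overhead the text refers to.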
1.2.2 Blind Estimation
Blind estimation is a novel strategy to eliminate the pilot overhead in a
communication system. Ideally, a blind scheme does not employ any pilot symbols
and instead relies only on the information symbol outputs to estimate the channel.
One may ask how this can be achieved, since the information symbols, by their
very definition, are unknown at the receiver. Note that though the information
symbols are individually unknown, one can have statistical knowledge about an
ensemble of such symbols. For instance, if the transmitter employs a symmetric
transmit constellation with equal a priori probabilities (as is frequently the
case), then the received symbol stream has a statistical mean of zero.
Further, with knowledge of the covariance of the input information symbols, the
computed covariance of the output information symbols can be employed to esti-
mate at least part of the channel. Thus, statistical information provides a viable
means to estimate the channel. Theoretically, if such a blind scheme were possible,
it would completely eliminate the need to transmit pilot symbols and thus would be
totally bandwidth efficient. In principle such schemes exist [8], but they are
often computationally complex and are plagued by convergence problems. Further,
most blind schemes can estimate the channel only up to a residual indeterminate
phase factor. Thus, blind schemes, while being extremely bandwidth efficient, are
unattractive to implement in wireless systems where robustness of the estimate
and computational complexity are critical.
Figure 1.3: Pictorial Representation of Pilot vs. Blind Tradeoff.
1.2.3 Semi-Blind Philosophy
Thus, as seen above, pilot and blind estimation schemes exhibit an inherent
tradeoff between complexity and robustness on the one hand and bandwidth efficiency
on the other, as shown in fig(1.3). One can now ask: is it possible to design
schemes which are a hybrid of pilot and blind schemes? In other words, we desire to
construct a channel estimation scheme with a limited number of pilots to alleviate
the potential shortcomings of a blind scheme, while also employing statistical
"blind" information for bandwidth efficiency. Such a scheme is semi-blind in nature
since it employs both pilot and blind information. The development of such a scheme
is motivated by the following reasoning. Given a certain amount of pilot
information, we wish to enhance the quality of the channel estimate by employing
statistical information to aid the estimation process. Equivalently, for a given
quality of the channel estimate, we wish to minimize the number of pilot symbols
transmitted by employing statistical information, thereby increasing the bandwidth
efficiency. The formulation and analysis of such schemes in the context of a MIMO
wireless system forms a focal point of this thesis.
1.3 Complex-Constrained Cramer-Rao Bounds
At this point, the discussion digresses slightly to discuss another impor-
tant aspect of channel estimation. The formulation of novel channel estimation
schemes as described above is incomplete without an accompanying analysis that
evaluates the merits and demerits of the particular estimator. In particular, in the
context of estimation, we are left with the problem of evaluating the mean-squared
error performance of the designed estimator, which is the frequently employed met-
ric to judge the performance of an estimator. The Cramer-Rao bound theory from
statistical estimation literature presents a classic method to characterize the per-
formance of an unbiased or asymptotically unbiased estimator. However, most
literature deals only with the estimation of unconstrained real parameters. By this
we mean that the components of the parameter vector being estimated are real and
can vary independently of each other in a space of appropriate dimension.
However, in the context of semi-blind estimation as described above, one is
frequently confronted with the problem of analyzing the performance of an estimator
with constraints. Further, the components of the parameter vector are usually drawn
from an underlying complex space. Such an estimation problem with constraints
essentially reduces to the estimation of a parameter vector lying on a complex
manifold. For instance, consider a unit-norm constrained complex parameter vector
$\theta \in \mathbb{C}^{m \times 1}$, i.e. $\theta$ is an $m$-dimensional complex
vector with the constraint $f(\theta)$ defined as
\[
f(\theta) \triangleq \theta^H \theta = \|\theta\|^2 = 1. \tag{1.1}
\]
The above scenario can be considered as a typical example of constrained complex
parameter estimation. As will be seen in the later chapters, the nature of semi-
blind estimation necessitates the development of a framework to analyze the MSE
performance of such estimators. In chapter(2) we address the issue of Cramer-Rao
Bounds for a constrained complex parameter. This framework is then employed
throughout the rest of the thesis to analyze the performance of the proposed esti-
mation schemes.
1.4 Whitening-Rotation Based Semi-Blind MIMO Channel Estimation
In chapter(3), we introduce the "whitening-rotation" (WR) scheme for semi-blind
MIMO flat-fading channel estimation. This scheme is described as follows. Consider
a MIMO channel $H \in \mathbb{C}^{r \times t}$ which has at least as many receive
antennas as transmit antennas, i.e. $r \ge t$. Then, the channel matrix $H$ can be
decomposed as $H = WQ^H$, where $W \in \mathbb{C}^{r \times t}$ and
$Q \in \mathbb{C}^{t \times t}$ is unitary, i.e. $Q^HQ = QQ^H = I$. The matrix $W$
is popularly termed the whitening matrix$^1$. $Q$ induces a rotation on the space
$\mathbb{C}^{t \times 1}$ and is therefore known as the rotation matrix. For
instance, consider the singular value decomposition (SVD) of $H$ given as
$H = P\Sigma V^H$. A possible choice for $W, Q$, which is employed in subsequent
portions of this work, is given by
\[
W = P\Sigma, \quad \text{and} \quad Q = V. \tag{1.2}
\]
$^1$ If $a \in \mathbb{C}^{t \times 1}$ is a random vector such that
$E\{aa^H\} = I$ and $b \in \mathbb{C}^{r \times 1}$ is obtained by transforming $a$
as $b = Ha$, then $W$ can be employed to decorrelate or whiten $b$ as
$c = W^{\dagger} b$, i.e. $E\{cc^H\} = I$.
It is then evident that all such matrices $W$ satisfy the property
$WW^H = HH^H$, and it is well known that $W$ can be determined from the blind data
alone. $Q$ can then be determined exclusively from the transmitted pilot symbols.
Such a technique potentially improves estimation accuracy because the matrix $Q$,
by virtue of its unitary constraint, is parameterized by fewer parameters ($t^2$
real parameters, compared with $2rt$ for the unconstrained matrix $H$) and hence
can be determined with greater accuracy from the limited pilot data.
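The decomposition can be checked numerically. The sketch below builds $W$ and $Q$ from a thin SVD of a randomly drawn channel and verifies the stated properties; the dimensions and the random channel are assumptions, and in a real receiver $W$ would be estimated from data rather than computed from a known $H$.

```python
import numpy as np

rng = np.random.default_rng(1)
r, t = 4, 2              # illustrative dimensions with r >= t (assumed)
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)

# Thin SVD: H = P @ diag(s) @ Vh, with P of size r x t and Vh = V^H of size t x t.
P, s, Vh = np.linalg.svd(H, full_matrices=False)

W = P @ np.diag(s)       # whitening matrix, W = P Sigma
Q = Vh.conj().T          # rotation matrix, Q = V

real_params_H = 2 * r * t   # real parameters in an unconstrained complex r x t matrix
real_params_Q = t * t       # real dimension of the set of t x t unitary matrices
```

For these dimensions the pilot data only has to resolve 4 real parameters instead of 16, which is the intuition behind the accuracy gain.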
The estimation of the unitary matrix $Q$ is significantly more involved since
$Q$, unlike $H$, is a constrained matrix, with the constraint
$QQ^H = Q^HQ = I_t$. Thus, the matrix $Q$ lies on a constrained manifold. In this
context, the complex-constrained Cramer-Rao bound (CC-CRB) theory mentioned in the
last section is employed to bound the MSE of estimation of the matrix $Q$, in
essence quantifying the performance of the WR semi-blind scheme. In chapter(3) we
demonstrate through a CC-CRB analysis that the WR scheme has an MSE that is at
least 3 dB lower than the MSE of a pilot based estimator. Thus, it leads to a
significant reduction in the MSE of estimation.
In chapter(3), we also formulate several maximum-likelihood (ML) schemes
for the estimation of the unitary matrix Q. These schemes are then demonstrated
to asymptotically achieve the CC-CRB, thus indeed reducing the MSE over con-
ventional pilot based estimation. The constrained ML schemes and the CC-CRB
analysis form the focus of chapter(3).
The results arising out of the above CC-CRB analysis in the context of
WR based MIMO estimation are very general and can be readily applied in a wide
variety of estimation scenarios. One such application in the context of orthogo-
nal frequency division multiplexing (OFDM) systems is demonstrated towards the
end of chapter(3). We consider the problem of time-domain vs. frequency-domain
channel estimation for OFDM systems, similar to the study in [9]. In that study, the
authors derive the ratio of the MSEs of the above competing schemes for OFDM
channel estimation. However, their analysis involves an elaborate computation of
the actual MSE covariance matrices. We present an alternative and simpler approach
to characterize the above scenario using the framework of constrained complex
parameters. The analysis presented demonstrates the versatile nature of the CC-CRB
concept and its suitability for diverse estimation scenarios.
1.5 FIM based Regularity Analysis of Semi-Blind MIMO FIR Channels
The WR scheme described above is one possible algorithm for efficient
estimation of the MIMO wireless channel. In the effort to minimize the number of
pilot symbols transmitted in a frame, it is essential to address the following
question: exactly how many pilots are required to estimate a wireless channel in
the presence of blind information? In other words, is blind information alone
sufficient to estimate the MIMO channel, thus making the transmission of pilots
redundant? We now investigate the theoretical limit on the minimum number of pilot
transmissions necessary for complete estimation of the MIMO channel$^2$. With
knowledge of the least number of pilots necessary, one can in principle restrict
the number of pilot transmissions to this minimum, enhancing the bandwidth
efficiency. Also, to make our study more general, we now consider a frequency
selective MIMO channel modeled as an $L_h$ tap MIMO FIR filter.
$^2$ By "complete" we mean the total identifiability of the MIMO channel without
any residual indeterminacy.
Addressing the above issue forms the central aim of chapter(4). We demonstrate
there that a Fisher Information Matrix (FIM) based analysis can be employed as a
tool to characterize the number of identifiable parameters in the MIMO system. To
be more precise, it is demonstrated that the rank of the FIM is equal to the number
of identifiable parameters or, in other words, that the dimension of the nullspace
of the FIM yields the number of unidentifiable parameters. First, employing the
Gaussianity assumption on the transmitted information symbols, we compute the blind
information likelihood of a MIMO FIR system. The corresponding blind FIM component
is denoted by the matrix $J^b$. It is then demonstrated that the dimension of the
nullspace of $J^b$ is at least $t^2$, thus implying that blind information alone
is not sufficient to estimate the MIMO channel.
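One way to see why second-order blind information leaves a residual ambiguity is sketched below for the simpler flat-fading (single-tap) case, which is an assumption made here for brevity; the thesis treats the FIR case. With zero-mean, unit-covariance Gaussian inputs, the output statistics depend on $H$ only through $HH^H$, so $H$ and $HU$ are indistinguishable for any unitary $U$. The set of $t \times t$ unitary matrices has $t^2$ real degrees of freedom, in line with the nullspace dimension quoted above.

```python
import numpy as np

rng = np.random.default_rng(2)
r, t = 4, 2              # illustrative dimensions (assumed)
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)

# Random t x t unitary matrix from the QR factorization of a complex Gaussian matrix.
A = rng.standard_normal((t, t)) + 1j * rng.standard_normal((t, t))
U, _ = np.linalg.qr(A)

H_alt = H @ U            # a different channel matrix

sigma2 = 0.1             # assumed noise variance
R_y = H @ H.conj().T + sigma2 * np.eye(r)               # output covariance under H
R_y_alt = H_alt @ H_alt.conj().T + sigma2 * np.eye(r)   # output covariance under H U
```

The two covariances coincide even though the channels differ, so pilots are needed to resolve the rotation.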
Hence, we now rely on pilot symbols to identify the $t^2$ parameters which
cannot be identified from blind information alone. This can be done by simply
reformulating the FIM for the total information available, i.e. pilot and blind
symbol outputs. It is demonstrated therein that under certain circumstances, the
FIM corresponding to pilot and blind information can be evaluated as a sum
$J^b + J^t$, where $J^b$ and $J^t$ are the FIMs corresponding to the blind and
training symbols respectively. The rank of this matrix can now be evaluated for
each additional pilot symbol. The minimum number of pilot symbols necessary for
identifiability is then the smallest number of pilot transmissions for which the
resulting FIM has full rank. The study in the chapter shows that at least $t$ pilot
symbol transmissions are necessary in the MIMO FIR context for identifiability.
Thus we address the question of the fundamental limit on the number of pilot
transmissions.
Further, we demonstrate that the semi-blind CRB converges asymptotically to
the complex-constrained CRB derived in chapter(3). Thus, the semi-blind scheme can
indeed achieve better performance by greatly reducing the MSE of estimation of the
MIMO FIR channel. Finally, for optimum MSE performance, it is desirable to employ
an orthogonal pilot sequence. The design of such a sequence is not straightforward
in the context of a MIMO frequency selective channel, as the resulting Toeplitz
structure of the pilot symbol matrix imposes additional constraints on the nature
of the pilot sequence. We demonstrate a construction scheme based on the
Paley-Hadamard matrix structure to construct an orthogonal pilot symbol matrix in
the context of a MIMO frequency selective channel.
1.6 Semi-Blind Channel Estimation for MRT Based MIMO Systems
Maximum ratio transmission (MRT) is an innovative transmission scheme for MIMO
systems. It relies on beamforming employing the dominant singular vectors of the
MIMO channel. Let the singular value decomposition of the MIMO channel be given as
$H = U\Sigma V^H$, where $U \in \mathbb{C}^{r \times r}$ and
$V \in \mathbb{C}^{t \times t}$ are the left and right singular matrices
respectively. Let $u_1$ denote the dominant left singular vector of the MIMO channel
and $v_1$ the dominant right singular vector. In MRT, the MIMO transmitter employs
$v_1$ for transmit beamforming. Let $x_d(k)$ be the $k$th transmitted data symbol.
The transmit vector for this symbol is given as $v_1 x_d(k)$. At the MIMO receiver,
the received data vector can be expressed as
\[
y_d(k) = H v_1 x_d(k) + \eta(k) = \sigma_1 u_1 x_d(k) + \eta(k),
\]
where $\sigma_1$ is the dominant singular value of the channel. It can now be seen
that the receiver can employ $u_1$ for receive beamforming. Hence, the final MRT
system can be represented as
\[
\tilde{y}_d(k) = u_1^H y_d(k) = \sigma_1 x_d(k) + \tilde{\eta}(k),
\]
where $\tilde{\eta}(k) = u_1^H \eta(k)$ is the noise after receive beamforming. It
can be seen that the above channel can be modeled as a SISO channel with gain
$\sigma_1$. The attractive feature of MRT is its low implementation complexity
while retaining the diversity gain of the MIMO system. It can be demonstrated that
MRT achieves the full MIMO diversity. Further, it also achieves the MIMO capacity
at low SNR. Hence, MRT has many advantages for implementation in a MIMO system.
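The beamforming steps above can be sketched with a standard SVD routine. This is an illustrative example under assumed dimensions, an assumed QPSK data symbol and an assumed noise level; it takes $H$ as known, whereas the chapter is concerned with estimating the beamforming vectors.

```python
import numpy as np

rng = np.random.default_rng(3)
r, t = 3, 4              # illustrative antenna counts (assumed)
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)

U, s, Vh = np.linalg.svd(H)   # singular values in s are sorted in descending order
u1 = U[:, 0]                  # dominant left singular vector
v1 = Vh[0, :].conj()          # dominant right singular vector (first column of V)
sigma1 = s[0]                 # dominant singular value

x = 1 - 1j                    # one data symbol (assumed QPSK point)
eta = 0.01 * (rng.standard_normal(r) + 1j * rng.standard_normal(r))

y = H @ (v1 * x) + eta        # transmit beamforming with v1
y_eff = u1.conj() @ y         # receive beamforming with u1^H
```

After both beamforming steps the link behaves like a scalar channel whose gain is the dominant singular value, plus the projected noise `u1.conj() @ eta`.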
It can be observed from the above description that an implementation of MRT
requires the estimation of the dominant left and right singular vectors of the MIMO
channel. One scheme to estimate these vectors is to first estimate the MIMO channel
and then perform an SVD on the channel estimate to in turn estimate the dominant
singular vectors. This is termed the conventional scheme for MRT channel
estimation. However, such a two-step estimation procedure is sub-optimal and can
result in poor MSE performance. In chapter(5) we present a semi-blind scheme for
channel estimation in the context of MRT. This scheme directly estimates the left
and right beamforming vectors while employing the statistical information from the
transmitted data symbols.
We employ a framework based on the eigenvector perturbation analysis from [10]
to derive expressions for the MSE and BER performance of both schemes. It is also
demonstrated that the semi-blind scheme can potentially achieve a lower MSE than
the conventional scheme. Thus, semi-blind estimation provides a versatile
estimation philosophy to address a multitude of estimation problems arising in the
context of MIMO systems.
1.7 Superimposed Pilots for MIMO Channel Estimation
Up to this point, our desire to improve the MIMO bandwidth efficiency has been
focused on a conventional pilot transmission model, where the pilot symbols are
time multiplexed with the information symbols. However, superimposed pilots (SP)
present a paradigm shift in the area of pilot based channel estimation. In SP based
systems, the pilot symbols are superimposed over the information symbols (see
fig(1.4)), thus enabling the transmission of information symbols over the entire
frame. Such a scheme results in a reduction in the signal power allocated to the
data symbols. However, from Shannon's famous channel capacity result [11], the
capacity of a communication system varies logarithmically with SNR, while it
depends linearly on bandwidth. Thus, by avoiding the exclusive transmission of
pilot symbols, one is in fact enhancing the bandwidth available for information
transmission, thus enhancing the overall throughput of the system.
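The logarithmic-versus-linear argument can be made concrete with the capacity formula $C = B \log_2(1 + \mathrm{SNR})$. The numbers below (1 MHz bandwidth, 20 dB SNR, factors of two) are assumptions picked for illustration and say nothing about the actual power split in an SP system.

```python
import math

def capacity(bandwidth_hz, snr_linear):
    """Shannon capacity C = B * log2(1 + SNR) in bits per second."""
    return bandwidth_hz * math.log2(1.0 + snr_linear)

B = 1e6                      # assumed bandwidth in Hz
snr = 10 ** (20 / 10)        # assumed 20 dB SNR, converted to linear scale

c_ref = capacity(B, snr)
c_half_power = capacity(B, snr / 2)      # halving the signal power
c_half_bandwidth = capacity(B / 2, snr)  # halving the bandwidth at the same SNR
```

In this example, giving up half the signal power costs roughly 15% of the capacity, while giving up half the bandwidth costs 50%.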
At the receiver it is now necessary to develop novel signal processing schemes
for pilot and data separation followed by channel estimation. In chapter(6) we
focus on the MSE and throughput performance of MIMO channel estimation with
superimposed pilots. We derive the Cramer-Rao Bound for SP based MIMO estimation.
Employing an asymptotic analysis, we demonstrate that the MSE bound of an SP scheme
for a SIMO system is 3 dB lower than that of the mean-based SP estimator popularly
employed in the literature. This is shown to arise because the mean-based estimator
only employs the first-order statistical information and ignores the information
present in the second-order output statistics (the covariance). Based on this, we
present an improved semi-blind estimator that employs both the first-order (mean)
and second-order (covariance) information to compute an enhanced estimate of the
MIMO channel. This semi-blind estimate is seen to have superior performance
compared to the SP mean-based estimator. Despite this improvement in the MSE
performance of the semi-blind SP estimator, SP based estimation is outperformed by
conventional pilots (CP) in terms of MSE. The reason for this performance
degradation of SP is the additional interference from the information symbols.
Hence, for a given constant per-frame pilot power, the estimation performance of an
SP system is bounded by the performance of the CP system. Yet, in spite of such a
loss in MSE performance, SP can result in a net gain in throughput owing to the
bandwidth efficiency arising from simultaneous transmission of pilot and
information streams, as mentioned in the discussion above. Motivated by this
observation, we derive a framework to quantify the throughput performance of SP and
CP systems. A precise closed form expression for the capacity of a system with
error in the channel estimate is intractable. Hence, we employ the framework of
worst case capacity analysis with estimation error, first proposed in [12]. We
generalize that result to the scenario with correlated information and noise
symbols, where the correlation itself arises due to the error in the channel
estimate. This framework can then be employed to characterize the throughput
performance of SP and CP systems. It is observed that SP based systems can yield an
improved throughput performance in comparison to CP. Finally, we also address one
other crucial question in SP systems, that of power allocation between source and
pilot symbols. We derive a closed form expression for the optimum pilot power
allocation, which is presented towards the end.
Figure 1.4: Pictorial Representation of Conventional Vs. Superimposed Pilots
1.8 Channel Estimation for Time-Varying Channels
Up until this point in this thesis our study has largely focused on channels
that are block time-invariant, i.e. channels that are invariant over one frame of
symbols. This is true of MIMO channels where there is no relative mobility between
the transmitter and receiver such as an indoor wireless LAN scenario. However,
in the context of mobile cellular communications, the mobile terminal users are
frequently in motion, introducing a Doppler shift in the carrier. Together with the
multipath environment, this results in a temporal variation of the MIMO channel,
making the channel time selective. Such a time-varying channel presents additional
challenges for channel estimation.
Popular approaches to estimating time-selective channels involve developing a
parametric model for the time-varying channel and then following a parametric
estimation approach for the model coefficients. One such frequently employed scheme
involves modeling the time-selective MIMO channel as a vector auto-regressive (AR)
process. The coefficients of this AR process are then estimated using the
Yule-Walker MMSE algorithm. The optimal channel estimator, conditioned on this AR
model, is a Kalman filter [13].
Recently, there has been increasing research activity on complex exponential
basis expansion models (CEBEM) for the modeling of time-selective channels. In
chapter(7) we study the performance of CEBEM in the context of channel estimation
using superimposed pilots for time-varying MIMO channels. The performance of the SP
scheme can be further enhanced by employing an iterative estimation procedure which
uses a soft-decision based scheme to compute the channel estimate. The
expectation-maximization (EM) algorithm naturally lends itself to such an
estimation scenario, since the transmitted information symbols can be treated as
the classical missing information. Thus, by computing the likelihoods for the
different symbols of the source constellation, one can arrive at soft decisions on
the transmitted symbols which can be employed to enhance the channel estimate. The
complexity of EM based iterative estimation grows as $M^t$, where $M$ is the size
of the transmit constellation, i.e. exponentially with the number of transmit
antennas $t$. For instance, for a MIMO system with 4 transmit antennas employing a
16-QAM constellation, it is required to perform $16^t = 16^4 = 65{,}536$ likelihood
computations for each symbol, per EM iteration, which is prohibitively large for
implementation in a wireless device where the available computational power is
limited. Hence we present a novel modification to the EM algorithm where the number
of likelihood computations can be greatly reduced by employing the sphere decoding
(SD) algorithm in conjunction with the EM algorithm. Thus, the SD-EM algorithm
based on a CEBEM model is a viable and effective strategy to tackle the problem of
channel estimation for a time-varying MIMO channel.
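The growth in the number of likelihood evaluations can be tabulated directly. The helper name `em_likelihood_count` is hypothetical and the sketch only counts candidate transmit vectors; it does not implement EM or sphere decoding.

```python
def em_likelihood_count(constellation_size, num_tx_antennas):
    """Number of candidate transmit vectors, M ** t, scored per symbol per EM iteration."""
    return constellation_size ** num_tx_antennas

# 16-QAM with 1 to 4 transmit antennas.
counts = {t: em_likelihood_count(16, t) for t in range(1, 5)}
```

Each additional transmit antenna multiplies the count by the constellation size, which is why restricting the search to a small candidate set matters.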
1.9 Discussion
Currently, the Global System for Mobile Communications (GSM) standard, which is
widely employed around the world as a cellular mobile standard, uses 26 symbols in
every time slot of 156 symbols for synchronization and channel acquisition [14].
Thus, the pilots occupy about 16.7% of each time slot (26/156), which is a
significant overhead. This figure is bound to rise in MIMO systems in view of the
reasons cited earlier in the chapter. Further, as the communication bandwidth
increases due to the changing nature of modern wireless devices, which support
increasingly rich multimedia applications, the Doppler spread increases. For
instance, at a speed of about $v = 60$ miles/hr ($\approx 26$ m/s), the Doppler
bandwidth $D$ of a channel with carrier frequency $f_c = 2$ GHz is
\[
D = \frac{v}{c} f_c = \frac{26}{3 \times 10^8} \times 2 \times 10^9 \approx 173\ \text{Hz}, \tag{1.3}
\]
which we round to about 200 Hz. Thus the channel coherence time $T_c$ is of the
order of $1/(4D) \approx 1.25$ ms. This in turn implies that pilots have to be sent
over the channel every 1.25 milliseconds, resulting in a substantial bandwidth
overhead and poor spectral efficiency.
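The arithmetic in this paragraph can be reproduced with a few lines of Python. The function names are hypothetical, the speed of light is taken as $3 \times 10^8$ m/s, and the $1/(4D)$ rule of thumb for coherence time is the one used in the text.

```python
C_LIGHT = 3.0e8                    # speed of light in m/s

def doppler_hz(speed_mps, carrier_hz):
    """Maximum Doppler shift D = (v / c) * fc."""
    return speed_mps / C_LIGHT * carrier_hz

def coherence_time_s(doppler):
    """Rule-of-thumb coherence time Tc = 1 / (4 D), as used in the text."""
    return 1.0 / (4.0 * doppler)

D_exact = doppler_hz(26.0, 2.0e9)          # Doppler bandwidth without rounding
Tc_exact = coherence_time_s(D_exact)       # coherence time for the unrounded value
Tc_rounded = coherence_time_s(200.0)       # coherence time with D rounded to 200 Hz

gsm_pilot_fraction = 26 / 156              # training symbols per GSM time slot
```

Without rounding, the Doppler bandwidth is about 173 Hz and the coherence time about 1.44 ms; rounding the Doppler bandwidth to 200 Hz gives the 1.25 ms quoted above.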
On the other hand, emerging wireless standards and strategies are
progressively more complex in terms of wireless connectivity and spectrum
management. Wireless ad hoc networks are aimed at supporting communication between
a large number of mobile nodes in which each node acts as a data forwarding node,
thus setting up a mobile network route on the fly. Dynamic spectrum strategies such
as cognitive radio are based on the concept of multi-user spectrum utilization:
when a block of spectrum which is not currently being utilized, also termed a
"spectrum-hole", becomes available, it is rapidly allocated to other users. Such
wireless scenarios of ad hoc networks and cognitive radio require fast channel
acquisition to support the increasingly dynamic and fluctuating communication
links. This is more complex in the context of MIMO, where the nodes are equipped
with multiple antennas, thus giving rise to new challenges in the design and
implementation of channel estimation algorithms. This thesis addresses some issues
in such an endeavor.
2 Complex Constrained Cramer-Rao Bound (CC-CRB)
2.1 Introduction
In this chapter, we present a general theory of the Cramer-Rao Bound (CRB) for
the estimation of complex-constrained parameters, which provides a valuable
framework to analyze the MIMO channel estimation problems arising in the chapters
that follow. The CRB serves as an important tool in the performance evaluation of
estimators which arise frequently in the fields of communications and signal
processing. Most problems involving the CRB are formulated in terms of
unconstrained real parameters [15]. Two useful developments of the CRB theory have
been presented in later research. The first is a CRB formulation for unconstrained
complex parameters given in [16]. This treatment has valuable applications in
studying the base-band performance of modern communication systems, where the
problem of estimating complex parameters arises frequently. The second is the
development of the CRB theory for constrained real parameters [17–19]. However, in
applications such as semi-blind channel estimation one is faced with the estimation
of constrained complex parameters. Though one can reduce the problem to that of
estimating constrained real parameters by considering the real and imaginary
components of the complex parameter vector, the resulting expressions are
complicated and insight is lost. Using the calculus of complex derivatives, as is
often done in signal processing applications, considerable insight and simplicity
can be achieved by working with the complex vector parameter as a single entity
[15, 20, 21]. We thus present an extension of the result in [17–19], inspired by
the theory in [16], for the case of constrained complex parameters. To conclude, we
illustrate its usefulness by an example of a semi-blind channel estimation problem.
2.2 CRB For Complex Parameters With Constraints
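Consider the complex parameter vector $\gamma \in \mathbb{C}^{n \times 1}$. Let
$\gamma \triangleq \alpha + j\beta$ such that the real and imaginary parameter
vectors $\alpha, \beta \in \mathbb{R}^{n \times 1}$ and
$\xi \triangleq [\alpha^T, \beta^T]^T$. Assume that the likelihood function of the
(possibly complex) observation vector $\omega \in \Omega$ parameterized by $\xi$ is
$s(\omega; \xi)$. Let $\hat{\xi} : \Omega \to \mathbb{R}^{2n \times 1}$ be given as
$\hat{\xi} \triangleq [\hat{\alpha}^T, \hat{\beta}^T]^T$, where
$\hat{\alpha}, \hat{\beta}$ are unbiased estimators of $\alpha, \beta$
respectively. In the following analysis, we define the gradient
$\frac{dr(\alpha)}{d\alpha} \in \mathbb{R}^{1 \times n}$ of a scalar function
$r(\alpha)$ as a row vector:
\[
\frac{dr(\alpha)}{d\alpha} \triangleq \left[ \frac{dr(\alpha)}{d\alpha_1}, \frac{dr(\alpha)}{d\alpha_2}, \ldots, \frac{dr(\alpha)}{d\alpha_n} \right]. \tag{2.1}
\]
Let $\theta \in \mathbb{C}^{2n \times 1}$ be defined as in [16] by
\[
\theta \triangleq \begin{bmatrix} \gamma \\ \gamma^* \end{bmatrix}. \tag{2.2}
\]
Suppose now that the $l$ complex constraints on $\theta$ are given as
\[
h(\theta) = 0, \tag{2.3}
\]
i.e. $h(\theta) \in \mathbb{C}^{l \times 1}$. We then construct an extended
constraint set (of possibly redundant constraints)
$f(\theta) \in \mathbb{C}^{2l \times 1}$ as
\[
f(\theta) \triangleq \begin{bmatrix} h(\theta) \\ h^*(\theta) \end{bmatrix} = 0. \tag{2.4}
\]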
An important observation from (2.4) above is that symmetric complex constraints on
these parameters are treated as disjoint. For instance, given the orthogonality of
complex parameter vectors $\theta_1, \theta_2$, i.e. $\theta_1^H \theta_2 = 0$, the
symmetric constraint $\theta_2^H \theta_1 = 0$ is to be treated as an additional
complex constraint, and hence
$f(\theta) = [\theta_1^H \theta_2, \theta_2^H \theta_1]^T$. The extension of the
constraints is akin to the extension of the parameter set from $\gamma$ to
$\theta = [\gamma^T, \gamma^H]^T$ called for when dealing with complex parameters,
and the need will become evident from the proof of lemma(1). Reparameterizing
$h(\theta) = h_R(\theta) + j h_I(\theta)$ in terms of $\xi$, let the set of $2l$
parameter constraints for $\xi$ be given by
\[
g(\xi) = \left. \left[ h_R(\theta)^T, h_I(\theta)^T \right]^T \right|_{\gamma = \alpha + j\beta}.
\]
Employing notation defined in [17] and borrowing the notion of a complex derivative
from [15, 20], we define $F(\theta) \in \mathbb{C}^{2l \times 2n}$ as
\[
F(\theta) \triangleq \frac{\partial f(\theta)}{\partial \theta} = \left[ \frac{\partial f(\theta)}{\partial \gamma}, \frac{\partial f(\theta)}{\partial \gamma^*} \right]. \tag{2.5}
\]
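It then follows from the properties of the complex derivative [20] that
$$F(\theta) = \frac{1}{2}\, T\, \frac{\partial g(\xi)}{\partial \xi}\, S, \tag{2.6}$$
where $T \in \mathbb{C}^{2l\times 2l}$, $S \in \mathbb{C}^{2n\times 2n}$ are given as
$$T \triangleq \begin{bmatrix} 1 & j \\ 1 & -j \end{bmatrix} \otimes I_{l\times l}, \qquad S \triangleq \begin{bmatrix} 1 & 1 \\ -j & j \end{bmatrix} \otimes I_{n\times n}. \tag{2.7}$$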
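As a quick sanity check of (2.6), the following minimal sketch evaluates both sides for a single complex parameter ($n = 1$) under the unit-modulus constraint $h(\theta) = \gamma^*\gamma - 1 = 0$ ($l = 1$). The constraint, the numerical value of $\gamma$ and the helper `matmul` are illustrative assumptions and do not appear in the text above.

```python
# Check F(theta) = (1/2) T (dg/dxi) S for the scalar constraint |gamma|^2 = 1
# (n = 1 parameter, l = 1 constraint). The constraint is an illustrative choice.

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

alpha, beta = 0.6, 0.8                 # gamma = alpha + j*beta
gamma = complex(alpha, beta)

# f(theta) = [h, h*]^T with h = gamma* gamma - 1, differentiated with respect to
# [gamma, gamma*] while treating gamma and gamma* as independent variables.
F_direct = [[gamma.conjugate(), gamma],
            [gamma.conjugate(), gamma]]

# g(xi) = [h_R, h_I]^T = [alpha^2 + beta^2 - 1, 0]^T, differentiated with respect
# to xi = [alpha, beta]^T.
dg_dxi = [[2 * alpha, 2 * beta],
          [0.0, 0.0]]

T = [[1, 1j], [1, -1j]]                # (2.7) with l = 1
S = [[1, 1], [-1j, 1j]]                # (2.7) with n = 1

F_from_real = [[0.5 * x for x in row] for row in matmul(matmul(T, dg_dxi), S)]

# Largest entrywise difference between the two routes to F(theta).
max_gap = max(abs(a - b)
              for ra, rb in zip(F_direct, F_from_real)
              for a, b in zip(ra, rb))
print(max_gap)
```

Both routes give $F(\theta) = [\gamma^*, \gamma;\ \gamma^*, \gamma]$, so `max_gap` should be zero up to rounding.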
The non-minimality of the set of complex constraints does not affect the CRB. Alternatively, a minimal set of complex constraints can be obtained by first formulating $g(\xi)$ and then reparameterizing in terms of $\theta$. However, such a process involves a tedious procedure of separating the real and imaginary parts, when it might be more natural to consider the complex parameters themselves, as in the above example of orthogonality of parameter vectors. Let $\mathrm{rank}(F(\theta)) = k < 2n$. Hence there exists a $U \in \mathbb{C}^{2n\times(2n-k)}$ such that the columns of $U$ form an orthonormal basis for the nullspace of $F(\theta)$, i.e. $F(\theta)U = 0$. Let the likelihood of the observed data $p(\omega;\theta)$ be reparameterized as $s(\omega;\xi)$ by substituting $\gamma = \alpha + j\beta$, $\gamma^* = \alpha - j\beta$. Define $\Delta$ as
$$\Delta \triangleq \frac{\partial \ln p(\omega;\theta)}{\partial \theta} = \left[\left(\frac{1}{2}\frac{\partial \ln s(\omega;\xi)}{\partial \alpha} - \frac{j}{2}\frac{\partial \ln s(\omega;\xi)}{\partial \beta}\right), \left(\frac{1}{2}\frac{\partial \ln s(\omega;\xi)}{\partial \alpha} + \frac{j}{2}\frac{\partial \ln s(\omega;\xi)}{\partial \beta}\right)\right]^T, \tag{2.8}$$
where the last equality follows from the definition of $p(\omega;\theta)$. Let $J = \mathrm{E}\{\Delta^*\Delta^T\}$ denote the Fisher information matrix (FIM) for the unconstrained estimation of $\theta$. Also assume that
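A.1: The parameter vector $\xi \in \mathbb{R}^{2n\times 1}$ and the likelihood function $s(\omega;\xi)$ satisfy the regularity conditions as in [17, 22]. We present them below for the sake of completeness.
(i) $\xi \in \Xi$, where $\Xi \subseteq \mathbb{R}^{2n}$.
(ii) $\frac{\partial s(\omega;\xi)}{\partial \xi_i}$, $i \in \{1, 2, \ldots, 2n\}$, exists and is a.s. finite for every $\xi \in \Xi$.
(iii) $\int \left|\frac{\partial^k s(\omega;\xi)}{\partial \xi_i^k}\right| d\omega < \infty$, for every $\xi \in \Xi$, and $k = 1, 2$.
(iv) $\mathrm{E}\left\{\left|\frac{\partial s(\omega;\xi)}{\partial \xi_i}\right|^2\right\} < \infty$, for every $\xi \in \Xi$.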
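We now present a result for the constrained complex estimator $\hat{\theta}$ analogous to the real case.
Lemma 1. Under assumption A.1 and constraints given by (2.3), the constrained estimator $\hat{\theta} : \Omega \to \mathbb{C}^{2n\times 1}$ defined as
$$\hat{\theta} \triangleq \begin{bmatrix} \hat{\alpha} + j\hat{\beta} \\ \hat{\alpha} - j\hat{\beta} \end{bmatrix} \tag{2.9}$$
satisfies the property
$$\mathrm{E}\left\{\left(\hat{\theta} - \theta\right)\Delta^T\right\} UU^H = UU^H. \tag{2.10}$$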
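Proof. From the results for the constrained real parameter vector in [17, 19] we have
$$\mathrm{E}\left\{\left(\hat{\xi} - \xi\right)\bar{\Delta}^T\right\} \bar{U}\bar{U}^T = \bar{U}\bar{U}^T, \tag{2.11}$$
where $\bar{\Delta} = \left[\frac{\partial \ln s(\omega;\xi)}{\partial \alpha}, \frac{\partial \ln s(\omega;\xi)}{\partial \beta}\right]^T$, and $\bar{U} \in \mathbb{R}^{2n\times(2n-k)}$ is an orthonormal basis for the nullspace of $\frac{\partial g(\xi)}{\partial \xi}$. Let $\bar{U} = \left[U_I^T, U_R^T\right]^T$, $U_I, U_R \in \mathbb{R}^{n\times(2n-k)}$, $\tilde{\alpha} \triangleq \hat{\alpha} - \alpha$ and $\tilde{\beta} \triangleq \hat{\beta} - \beta$. Then rewriting the above expression in terms of block partitioned matrices we have,
$$\int_\Omega \begin{bmatrix} \tilde{\alpha} \\ \tilde{\beta} \end{bmatrix} \left[\frac{\partial \ln s(\omega;\xi)}{\partial \alpha} \;\; \frac{\partial \ln s(\omega;\xi)}{\partial \beta}\right] \begin{bmatrix} U_I \\ U_R \end{bmatrix} \left[U_I^T \;\; U_R^T\right] d\omega = \begin{bmatrix} U_I \\ U_R \end{bmatrix} \left[U_I^T \;\; U_R^T\right]. \tag{2.12}$$
Let $U \in \mathbb{C}^{2n\times(2n-k)}$ be defined as
$$U \triangleq \frac{1}{\sqrt{2}} \begin{bmatrix} U_I + jU_R \\ U_I - jU_R \end{bmatrix}.$$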
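With some manipulation, (2.12) can be written in terms of complex matrices as
$$\int_\Omega \begin{bmatrix} \tilde{\alpha} + j\tilde{\beta} \\ \tilde{\alpha} - j\tilde{\beta} \end{bmatrix} \left[\frac{1}{2}\frac{\partial \ln s(\omega;\xi)}{\partial \alpha} - \frac{j}{2}\frac{\partial \ln s(\omega;\xi)}{\partial \beta}, \;\; \frac{1}{2}\frac{\partial \ln s(\omega;\xi)}{\partial \alpha} + \frac{j}{2}\frac{\partial \ln s(\omega;\xi)}{\partial \beta}\right] UU^H \, d\omega = UU^H.$$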
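Using (2.8) and (2.9), the above equation can be expressed in the form given by (2.10). It remains to show that $U$ forms a basis for the nullspace of $F(\theta)$. It follows from the definition of $\bar{U}$ that $\frac{\partial g(\xi)}{\partial \xi}\bar{U} = 0$, and this equality is true if and only if
$$\frac{1}{\sqrt{2}}\frac{\partial g(\xi)}{\partial \xi}\left(\frac{1}{2}SS^H\right)\bar{U} = 0 \tag{2.13}$$
$$\Leftrightarrow \frac{1}{2\sqrt{2}}\, T\, \frac{\partial g(\xi)}{\partial \xi}\, SS^H\bar{U} = 0 \tag{2.14}$$
$$\Leftrightarrow F(\theta)\left(\frac{1}{\sqrt{2}}S^H\bar{U}\right) = 0, \tag{2.15}$$
where the equivalences in (2.13), (2.14) follow from the facts that $\frac{1}{2}SS^H = I$ and that $T$ is invertible, respectively. The matrices $S, T$ have been defined in (2.7). It can be seen that $U = \frac{1}{\sqrt{2}}S^H\bar{U}$ and therefore $U \perp F(\theta)$. Moreover, $U^HU = \frac{1}{2}\bar{U}^TSS^H\bar{U} = I_{(2n-k)\times(2n-k)}$. Hence $U$ contains orthonormal columns. Showing that it spans the nullspace of $F(\theta)$ completes the proof. Suppose $U$ does not span the nullspace of $F(\theta)$. Then there exists a nonzero $u \triangleq \left[u_a^T, u_b^T\right]^T$, where $u_a, u_b \in \mathbb{C}^{n\times 1}$, such that $F(\theta)u = 0$ and $U^Hu = 0$. Hence we have $T\frac{\partial g(\xi)}{\partial \xi}Su = 0 \Rightarrow \frac{\partial g(\xi)}{\partial \xi}Su = 0$, as $T$ is an invertible matrix. Let $\tilde{u} \triangleq Su = \left[u_a^T + u_b^T, \; ju_b^T - ju_a^T\right]^T$. Since $\frac{\partial g(\xi)}{\partial \xi}$ is real we have $\frac{\partial g(\xi)}{\partial \xi}\tilde{u}_R = 0$, where $\tilde{u}_R$ is the real part of $\tilde{u}$. Also, it can be observed that $U^Hu = 0 \Rightarrow \bar{U}^T\tilde{u} = 0$ and, since $\bar{U}$ is a real matrix, $\bar{U}^T\tilde{u}_R = 0$. Thus there exists a real vector, viz. $v \triangleq \tilde{u}_R \in \mathbb{R}^{2n\times 1}$, such that $\frac{\partial g(\xi)}{\partial \xi}v = 0$ and $\bar{U}^Tv = 0$, contradicting the assumption that $\bar{U}$ is a basis for the nullspace of $\frac{\partial g(\xi)}{\partial \xi}$. This completes the proof.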
Theorem 1. Under assumption A.1 and constraints given by (2.3), the CRB for estimation of the constrained parameter $\theta \in \mathbb{C}^{2n\times 1}$ is then given as
$$\mathrm{E}\left\{\left(\hat{\theta} - \theta\right)\left(\hat{\theta} - \theta\right)^H\right\} \ge U\left(U^HJU\right)^{-1}U^H. \tag{2.16}$$
Proof. Let $P_U = UU^H$ be the projection matrix onto the column space of $U$ and let $W \in \mathbb{C}^{2n\times 2n}$ be an arbitrary matrix. Let $\tilde{\theta} \triangleq \hat{\theta} - \theta$. As in [17] we now consider $\mathrm{E}\left\{\left(\tilde{\theta} - WP_U\Delta^*\right)\left(\tilde{\theta} - WP_U\Delta^*\right)^H\right\}$. Following a procedure similar to that for real vectors provided in [17], the proof of (2.16) then follows by making the obvious modifications for complex matrices (i.e. replacing the transpose operator with the Hermitian, etc.).
2.3 A Constrained Matrix Estimation Example
2.3.1 Problem Formulation
We consider in this section the problem of pilot assisted semi-blind estimation of a complex MIMO (Multi-Input Multi-Output) channel matrix $H \in \mathbb{C}^{t\times t}$ (i.e. # transmit antennas = # receive antennas = $t$). Let a total of $L$ pilot symbols be transmitted. The channel input-output relation is represented as
$$y_k = Hx_k + v_k, \quad k = 1, 2, \ldots, L, \tag{2.17}$$
where $y_k, x_k \in \mathbb{C}^{t\times 1}$ are the received and transmitted signal vectors at the $k$-th time instant. $v_k \in \mathbb{C}^{t\times 1}$ is spatio-temporally uncorrelated Gaussian noise such that $\mathrm{E}\{v_kv_k^H\} = \sigma_n^2 I$. $H$ can be factorized using its singular value decomposition (SVD) as $H = P\Sigma Q^H$, where $P, Q \in \mathbb{C}^{t\times t}$ are orthogonal matrices such that $P^HP = Q^HQ = I$, $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_t)$, $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_t > 0$. $P$ and $\Sigma$ can be estimated using blind techniques. We then employ the pilot data exclusively to estimate the constrained orthogonal matrix $Q$. More about the significance of such a problem can be found in [23].
2.3.2 Cramer-Rao Bound
Let $\tilde{y}_k = P^Hy_k$, $\tilde{v}_k = P^Hv_k$. Denote by $q_i$ the $i$-th column of the matrix $Q$. The unconstrained input-output relation for each $q_i$ can be written as
$$\tilde{y}_{k,i} = \sigma_i x_k^H q_i + \tilde{v}_{k,i}, \tag{2.18}$$
where $\tilde{y}_{k,i}$ denotes the $i$-th element of $\tilde{y}_k$ and analogously for $\tilde{v}_{k,i}$. Define the desired parameter vector to be estimated as $\theta \triangleq \left[\mathrm{vec}(Q)^T, \mathrm{vec}(Q^*)^T\right]^T$. It can now be seen that $\theta$ is a constrained parameter vector and the constraints are given as
$$q_i^Hq_i = 1, \quad 1 \le i \le t, \tag{2.19}$$
$$q_i^Hq_j = 0, \quad 1 \le i < j \le t. \tag{2.20}$$
Hence, the set of $t + \binom{t}{2}$ complex constraints $h(\theta)$ is given as
$$h(\theta) = \begin{bmatrix} q_1^Hq_1 - 1 \\ q_1^Hq_2 \\ q_1^Hq_3 \\ \vdots \\ q_t^Hq_t - 1 \end{bmatrix}.$$
The extended constraint set $f(\theta)$ is then given as
$$f(\theta) = \begin{bmatrix} q_1^Hq_1 - 1 \\ q_1^Hq_2 \\ q_1^Hq_3 \\ \vdots \\ q_t^Hq_t - 1 \\ q_1^Hq_1 - 1 \\ q_2^Hq_1 \\ q_3^Hq_1 \\ \vdots \\ q_t^Hq_t - 1 \end{bmatrix}.$$
The vector $f(\theta)$ can be employed to compute $U$. However, it can be noticed that the repeated constraint $q_i^Hq_i - 1$ for $i = 1, 2, \ldots, t$ is trivially redundant. Eliminating this redundancy, the minimal set of $t + 2\binom{t}{2} = t^2$ non-redundant constraints $f(\theta)$ can be obtained as
$$f(\theta) = \begin{bmatrix} q_1^Hq_1 - 1 \\ q_1^Hq_2 \\ q_2^Hq_1 \\ q_1^Hq_3 \\ q_3^Hq_1 \\ \vdots \\ q_t^Hq_t - 1 \end{bmatrix}.$$
The matrix $F(\theta)$ is constructed as given in (2.5), by differentiating $f(\theta)$ with respect to the parameter vector $\theta$. For example, the derivative of constraint #2, i.e. $q_1^Hq_2$, is given as
$$\frac{\partial q_1^Hq_2}{\partial \theta} = \left[0, q_1^H, 0, \ldots, q_2^T, 0, 0, \ldots\right],$$
where we have used the fact that $\frac{\partial q_1^H}{\partial q_1} = \frac{\partial q_2}{\partial q_2^H} = 0$. This result follows from the properties of the complex derivative in [15]. Similarly,
$$\frac{\partial q_1^Hq_1}{\partial \theta} = \left[q_1^H, 0, 0, \ldots, q_1^T, 0, 0, \ldots\right],$$
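and so on. The columns of the matrix $U$ form an orthonormal basis for the nullspace of $F(\theta)$. Hence, for this example, the matrices $F(\theta) \in \mathbb{C}^{t^2\times 2t^2}$, $U \in \mathbb{C}^{2t^2\times t^2}$ can be written explicitly and are given as
$$F(\theta) = \begin{bmatrix}
q_1^H & 0 & 0 & \cdots & q_1^T & 0 & 0 & \cdots \\
0 & q_1^H & 0 & \cdots & q_2^T & 0 & 0 & \cdots \\
q_2^H & 0 & 0 & \cdots & 0 & q_1^T & 0 & \cdots \\
0 & q_2^H & 0 & \cdots & 0 & q_2^T & 0 & \cdots \\
q_3^H & 0 & 0 & \cdots & 0 & 0 & q_1^T & \cdots \\
0 & 0 & q_1^H & \cdots & q_3^T & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix},$$
$$U = \frac{1}{\sqrt{2}} \begin{bmatrix}
q_1 & 0 & q_2 & 0 & q_3 & \cdots \\
0 & q_1 & 0 & q_2 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots \\
-q_1^* & -q_2^* & 0 & 0 & 0 & \cdots \\
0 & 0 & -q_1^* & -q_2^* & 0 & \cdots \\
0 & 0 & 0 & 0 & -q_1^* & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}.$$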
The simple and insightful structure of the above matrices $F(\theta)$, $U$ in terms of the orthogonal parameter vectors $q_1, q_2, \ldots, q_t$ is particularly appealing and illustrates the efficacy of using the complex CRB. From (2.18) and using the results for least-squares estimation [15], the Fisher information matrix $J(\theta) \in \mathbb{C}^{2t^2\times 2t^2}$ for the unconstrained case is given by the block diagonal matrix $J(\theta) = \frac{1}{\sigma_n^2}\left(I_{2\times 2} \otimes \Sigma^2 \otimes X_pX_p^H\right)$. The complex constrained CRB for the parameter vector $\theta$ is then obtained by substituting these matrices in (2.16).
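The structure of $F(\theta)$ and $U$ can be checked numerically in the smallest non-trivial case $t = 2$. The sketch below is only an illustration: the particular unitary $Q$ and the helper functions are assumptions, and the rows of `F` follow the constraint order $q_1^Hq_1 - 1$, $q_1^Hq_2$, $q_2^Hq_1$, $q_2^Hq_2 - 1$.

```python
import math

def conj(v):
    """Entrywise complex conjugate of a vector."""
    return [x.conjugate() for x in v]

def neg(v):
    return [-x for x in v]

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def hermitian(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[x.conjugate() for x in col] for col in zip(*A)]

s = 1 / math.sqrt(2)
q1 = [s, 1j * s]          # columns of an example 2 x 2 unitary Q (assumed)
q2 = [s, -1j * s]
z = [0.0, 0.0]

# theta = [q1; q2; q1*; q2*]; each row lists the derivative blocks
# [d/dq1, d/dq2, d/dq1*, d/dq2*] of one constraint.
F = [conj(q1) + z + q1 + z,        # q1^H q1 - 1
     z + conj(q1) + q2 + z,        # q1^H q2
     conj(q2) + z + z + q1,        # q2^H q1
     z + conj(q2) + z + q2]        # q2^H q2 - 1

# Columns of U for t = 2, before the common 1/sqrt(2) scaling.
cols = [q1 + z + neg(conj(q1)) + z,
        z + q1 + neg(conj(q2)) + z,
        q2 + z + z + neg(conj(q1)),
        z + q2 + z + neg(conj(q2))]
U = [[s * c[i] for c in cols] for i in range(8)]     # 8 x 4

FU = matmul(F, U)
UhU = matmul(hermitian(U), U)

fu_residual = max(abs(x) for row in FU for x in row)
gram_error = max(abs(UhU[i][j] - (1.0 if i == j else 0.0))
                 for i in range(4) for j in range(4))
print(fu_residual, gram_error)
```

Both printed quantities should vanish up to rounding, confirming $F(\theta)U = 0$ and $U^HU = I$ for this choice of $Q$.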
2.3.3 ML Estimate and Simulation Results
We now compute the Maximum-Likelihood (ML) estimate and compare its performance with that predicted by the CRB. The received symbol vectors can be stacked as $\tilde{Y}_p \triangleq (\tilde{y}_1, \tilde{y}_2, \ldots, \tilde{y}_L)$. Let $X_p$ be defined analogously by stacking the transmitted symbol vectors. Then $\hat{Q}$, the ML estimate of $Q$, is given as a solution of the constrained minimization
$$\hat{Q} = \arg\min_Q \left\|\tilde{Y}_p^H - X_p^HQ\Sigma\right\|^2 \quad \text{subject to } QQ^H = I,$$
where the norm $\|\cdot\|$ is the matrix Frobenius norm such that $\|A\|^2 = \mathrm{tr}(AA^H)$. From [24], the constrained estimate $\hat{Q}$ employing an orthonormal pilot sequence $X_p$ (i.e. $X_pX_p^H = I$) is given as
$$\hat{Q} = P_pR_p^H \quad \text{where} \quad P_p\Sigma_pR_p^H = \mathrm{SVD}\left(X_p\tilde{Y}_p^H\Sigma\right). \tag{2.21}$$
Our simulation set-up consists of a $4 \times 4$ MIMO channel $H$ (i.e. $t = 4$). A single realization of $H$ was generated as a matrix of zero-mean circularly symmetric complex Gaussian random entries such that the variance of the real and imaginary parts was unity. The source symbol vectors $x \in \mathbb{C}^{4\times 1}$ are assumed to be drawn from a BPSK constellation and the orthonormality condition is achieved by using the Hadamard structure. The transmitted pilot was assumed to be of length $L = 12$ symbols. The error was then averaged for a fixed $H$ over several instantiations ($N_i = 1000$) of the channel noise $v_k$. Figure 2.1 shows the MSE in the first element $Q(1,1)$ (i.e. $|\hat{Q}(1,1) - Q(1,1)|^2$) vs. its CRB. Similar results were obtained for the CRB of other elements of $Q$. Figure 2.2 then shows the total MSE in estimation of $Q$ (i.e. $\|\hat{Q} - Q\|^2$) vs. the trace of the CRB matrix. The ML estimate $\hat{Q}$ can be seen to achieve a performance close to the CRB, and its performance progressively improves with increasing SNR.
[Figure 2.1: Computed MSE vs. SNR, $|\hat{Q}(1,1) - Q(1,1)|^2$. Plot title: "MSE Vs CRLB For Estimation of Q(1,1)"; x-axis: SNR (0 to 20); y-axis: MSE (logarithmic scale); curves: Computed MSE, CRLB.]
[Figure 2.2: Computed MSE vs. SNR, $\|\hat{Q} - Q\|^2$. Plot title: "MSE Vs CRLB For Estimation of Q"; x-axis: SNR (0 to 20); y-axis: MSE (logarithmic scale); curves: MSE, CRLB.]
2.4 Conclusion
As illustrated in the example above, the CC-CRB framework provides
an elegant means to characterize the MSE of estimation of constrained matrices,
a problem that frequently arises in the context of semi-blind MIMO estimation.
A complete example of such an estimation procedure is demonstrated in the next
chapter on whitening-rotation (WR) based MIMO channel estimation.
Acknowledgement
The text of this chapter, in part, is a reprint of the material as it appears
in A. K. Jagannatham and B. D. Rao, “Cramer-Rao Lower Bound for Constrained
Complex Parameters”, IEEE Signal Processing Letters, Vol. 11, No. 11, Nov’04,
Pages: 875 - 878 and A. K. Jagannatham and B. D. Rao,“Complex Constrained
CRB and its applications to Semi-Blind MIMO and OFDM Channel Estimation”,
Sensor Array and Multichannel Signal Processing Workshop Proceedings, 2004,
Barcelona, Spain, 18-21 July 2004, Pages: 397 - 401.
3 Whitening-Rotation Based Semi-Blind MIMO Channel Estimation
3.1 Introduction
As elaborated in Chapter 1, semi-blind schemes provide a bandwidth efficient means to estimate the MIMO wireless channel. In this chapter, we present one such semi-blind scheme, termed the whitening-rotation (WR) scheme, for MIMO channel estimation. We utilize the fact that the MIMO channel matrix $H$ can be decomposed as the product $H = WQ^H$, where $W$ is a whitening matrix and $Q$ is a unitary matrix, i.e. $QQ^H = Q^HQ = I$. It is well known that $W$ can be computed blind from the second order statistical information in the received output data. Training data can then be utilized to estimate only the unitary matrix $Q$. Significant estimation gains can then be achieved, since such unitary matrices are parameterized by a much smaller number of parameters. A more rigorous justification of this statement is given in subsequent sections. Such a whitening-rotation factorization based estimation procedure naturally arises in the independent component analysis (ICA) based framework for source separation, where it has been noted that when the sources are uncorrelated Gaussian, the channel matrix can be estimated blind up to a rotation matrix. A more complete discussion of ICA can be found in [25, 26]. A totally blind higher order statistics algorithm based on such a decomposition is elaborated in [27], for any source distribution.
Extensive work has been done by Slock et al. in [28, 29], where several semi-blind techniques have been reported. Literature more relevant to our semi-blind estimation scheme can be found in Pal's work [30, 31]. However, that work does not consider the problem of a constrained estimator for $Q$. Our research is novel in the
following aspects. First, we use the theory of complex constrained Cramer-Rao
bound (CC-CRB) reported in [32] to quantify exactly how much improvement in
performance can be achieved over a traditional training based technique. Also,
since Q is a unitary constrained matrix, optimal estimation of Q necessitates the
construction of constrained estimators. Such an estimator can be found in [33,34]
for an orthogonal pilot sequence. We refer to this as the OPML estimator and
examine its properties. Another salient feature of this chapter is the development
of a novel IGML algorithm for the constrained estimation of Q employing any (not
necessarily orthogonal) pilot sequence. We then present the ROML algorithm as
a low complexity alternative to the IGML estimator.
The chapter is organized as follows. The next section describes the prob-
lem setup. An analysis of the constrained CRBs is given in section 3.3 and estima-
tion algorithms are presented in section 3.4. Finally, simulation results are given
in section 3.5 and we conclude with section 3.7.
3.2 Problem Formulation
Consider a flat-fading MIMO channel matrix $H \in \mathbb{C}^{r\times t}$, where $t$ is the number of transmit antennas and $r$ is the number of receive antennas in the system, and each $h_{ij}$ represents the flat-fading channel coefficient between the $i$th receiver and $j$th transmitter. Denoting the complex received data by $y \in \mathbb{C}^{r\times 1}$, the equivalent base-band system can be modelled as
$$y(k) = Hx(k) + \eta(k), \tag{3.1}$$
where $k$ represents the time instant, $x \in \mathbb{C}^{t\times 1}$ is the complex transmitted symbol vector and $\eta$ is spatio-temporally white additive Gaussian noise such that $\mathrm{E}\{\eta(k)\eta(l)^H\} = \delta(k,l)\sigma_n^2 I$, where $\delta(k,l) = 1$ if $k = l$ and 0 otherwise. Also, the sources are assumed to be spatially and temporally independent with identical source power $\sigma_s^2$, i.e. $\mathrm{E}\{x(k)x(l)^H\} = \delta(k,l)\sigma_s^2 I$. The signal to noise ratio (SNR) of operation is defined as $\mathrm{SNR} \triangleq \frac{\sigma_s^2}{\sigma_n^2}$. Now assume that the channel has been used for a total of $N$ symbol transmissions. Out of these $N$ transmissions, the initial $L$ symbols are known training symbols and the observed outputs are thus training outputs. Stacking the training symbols as a matrix we have $X_p = [x(1), x(2), \ldots, x(L)]$, where $X_p \in \mathbb{C}^{t\times L}$. $Y_p \in \mathbb{C}^{r\times L}$ is given by similarly stacking the received training outputs. The remaining $N - L$ information symbols transmitted are termed 'blind symbols' and their corresponding outputs 'blind outputs'. $X_b \in \mathbb{C}^{t\times(N-L)}$, $Y_b \in \mathbb{C}^{r\times(N-L)}$ can be defined analogously for the blind symbols. $[X_p, Y_p]$ and $Y_b$ together constitute the complete available data.
Consider two possible estimation strategies. $H$ can be estimated exclusively using the pilot $X_p$ as
$$\hat{H}_{TS} = Y_pX_p^\dagger, \tag{3.2}$$
where $X_p^\dagger$ denotes the Moore-Penrose pseudo-inverse of $X_p$. This qualifies as training based estimation and is simple to implement. However, it results in poor usage
of available bandwidth since the pilot itself conveys no source information. Alter-
natively, H may be estimated from blind data without the aid of any pilot. Thus,
in effect this reduces to the case L = 0 and only blind data Yb is available. This
is very efficient in usage of bandwidth since it totally eliminates the need for a
pilot. However, most second order statistics based blind techniques are limited
to estimating the channel matrix up to a scaling and permutation indeterminacy
as detailed in [26],[8]. Blind methods that employ higher order statistics typi-
cally require a large number of data symbols. Moreover, such techniques are often
computationally complex and result in ill-convergence. Based on the above obser-
vations, one is motivated to find a technique which performs reasonably well in
terms of bandwidth efficiency and computational complexity. Moreover, pilot sym-
bols are usually feasible in communication scenarios. Hence, the focus of our work
has been to develop a semi-blind estimation procedure which uses a small number
of pilot symbols along with blind data. Such a procedure serves the dual pur-
pose of reducing the required pilot overhead at the same time achieving a greater
estimation accuracy for a given number of pilot symbols.
Consider a MIMO channel $H \in \mathbb{C}^{r\times t}$ which has at least as many receive antennas as transmit antennas, i.e. $r \ge t$. Then, the channel matrix $H$ can be decomposed as $H = WQ^H$, where $W \in \mathbb{C}^{r\times t}$ and $Q \in \mathbb{C}^{t\times t}$ is unitary, i.e. $Q^HQ = QQ^H = I$. The matrix $W$ is popularly termed the whitening matrix$^1$. $Q$ induces a rotation on the space $\mathbb{C}^{t\times 1}$ and is therefore known as the rotation matrix. For instance, consider the singular value decomposition (SVD) of $H$ given as $H = P\Sigma V^H$. A possible choice for $W, Q$, which is employed in subsequent portions of this chapter, is given by
$$W = P\Sigma, \quad \text{and} \quad Q = V. \tag{3.3}$$
It is then evident that all such matrices $W$ satisfy the property $WW^H = HH^H$, and it is well known that $W$ can be determined from the blind data $Y_b$. $Q$ can then be determined exclusively from the pilot data $X_p$. This semi-blind estimation procedure is termed a Whitening-Rotation (WR) scheme. Such a technique potentially improves estimation accuracy because the matrix $Q$, by virtue of its unitary constraint, is parameterized by a smaller number of parameters and hence can be determined with greater accuracy from the limited pilot data $X_p$. The precise improvement in quantitative terms is presented in the next section.
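The property $WW^H = HH^H$ says that second order statistics of the output fix $W$ but carry no information about the rotation $Q$. A minimal numerical sketch of this, with an arbitrarily chosen $W$ and two example unitary matrices (all values are illustrative assumptions), is:

```python
import math

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def hermitian(A):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[x.conjugate() for x in col] for col in zip(*A)]

def max_abs_diff(A, B):
    return max(abs(a - b) for ra, rb in zip(A, B) for a, b in zip(ra, rb))

s = 1 / math.sqrt(2)
W = [[2.0, 0.5j], [1.0, 1.0], [0.0, 3.0]]       # arbitrary r x t factor (r = 3, t = 2)
Q_a = [[s, s], [1j * s, -1j * s]]               # one unitary rotation
Q_b = [[1.0, 0.0], [0.0, 1j]]                   # a different unitary rotation

WWh = matmul(W, hermitian(W))
gaps = []
for Q in (Q_a, Q_b):
    H = matmul(W, hermitian(Q))                 # H = W Q^H
    gaps.append(max_abs_diff(matmul(H, hermitian(H)), WWh))
print(gaps)
```

Both channels share the same $HH^H$, which is why the pilot is still needed to resolve $Q$.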
To avoid repetition, we present here a list of assumptions which may be
potentially employed in our work. The exact subset of assumptions used will be
stated specifically in the result.
A.1 $W \in \mathbb{C}^{r\times t}$ is perfectly known at the output.
A.2 $X_p \in \mathbb{C}^{t\times L}$ is orthogonal, i.e. $X_pX_p^H = \sigma_s^2 L\, I_{t\times t}$.
A.1 is reasonable if we assume the transmission of a long data stream from which $W$ can be estimated with considerable accuracy, and A.2 can be easily achieved for signal constellations such as BPSK, QPSK etc. by using an integer orthogonal structure such as the Hadamard matrix.
$^1$ If $a \in \mathbb{C}^{t\times 1}$ is a random vector such that $\mathrm{E}\{aa^H\} = I$ and $b \in \mathbb{C}^{r\times 1}$ is obtained by transforming $a$ as $b = Ha$, then $W$ can be employed to decorrelate or whiten $b$ as $c = W^\dagger b$, i.e. $\mathrm{E}\{cc^H\} = I$.
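A minimal sketch of how A.2 can be met with a BPSK pilot is given below. It assumes a Sylvester-type Hadamard construction (so $L$ is a power of two), unit symbol power $\sigma_s^2 = 1$ and a hypothetical $2 \times 4$ channel. Under A.2 the pseudo-inverse reduces to $X_p^\dagger = X_p^H/(\sigma_s^2L)$, so the training based estimate (3.2) recovers $H$ exactly in the noiseless case.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    M = [[1]]
    while len(M) < n:
        M = [row + row for row in M] + [row + [-x for x in row] for row in M]
    return M

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

t, L = 4, 8
Xp = hadamard(L)[:t]                       # t x L BPSK pilot, sigma_s^2 = 1
Xp_H = [list(col) for col in zip(*Xp)]     # entries are real, so X_p^H = X_p^T

gram = matmul(Xp, Xp_H)                    # should equal L * I under A.2

H = [[1 + 1j, 0, 2, -1j],
     [0.5, -1, 1j, 3]]                     # hypothetical 2 x 4 channel
Yp = matmul(H, Xp)                         # noiseless training outputs
H_ts = [[x / L for x in row] for row in matmul(Yp, Xp_H)]   # Y_p X_p^H / L

recovery_error = max(abs(a - b) for ra, rb in zip(H_ts, H) for a, b in zip(ra, rb))
print(gram[0], recovery_error)
```

With noise present the same expression gives the training based estimate whose error the following sections compare against the semi-blind scheme.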
3.3 Estimation accuracy for semi-blind approaches
We now present a general result to quantify the improvement in estima-
tion accuracy of semi-blind schemes over training based channel estimators. The
Cramer-Rao bound (CRB) is frequently used as a framework to study the estima-
tion efficiency. However, semi-blind approaches involve estimation of constrained
complex parameter vectors. Therefore in our analysis, we use the CC-CRB frame-
work developed in [32], inspired by the result in [17], which provides an ideal setting
to study the performance of such schemes. However, from the CRB matrices which
describe a lower bound on the estimation covariance it is harder to interpret the
achievable estimation accuracy in quantitative terms. This necessitates the development of a positive scalar measure to evaluate and contrast the performance of
different estimators. Frequently, the trace of the covariance or the MSE in estima-
tion is used to quantify the performance of an estimator. We next present a result
which justifies the use of such a positive scalar measure.
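Lemma 2. Let $A, B \in \mathbb{C}^{n\times n}$ be positive definite matrices and let $A \ge B$, i.e. $u^HAu \ge u^HBu$, $\forall u \in \mathbb{C}^{n\times 1}$. Then $\mathrm{tr}(A) = \mathrm{tr}(B) \Leftrightarrow A = B$.
Proof. It is easy to see that $A = B \Rightarrow \mathrm{tr}(A) = \mathrm{tr}(B)$. To prove the converse, observe that $A \ge B \Rightarrow G = A - B \ge 0$ and hence $G$ is Positive Semi-Definite (PSD). Further, $\mathrm{tr}(A) = \mathrm{tr}(B) \Rightarrow \mathrm{tr}(G) = 0 \Rightarrow \sum_{i=1}^{n} \lambda_i = 0$, where $\lambda_i$ are the eigenvalues of $G$. However, $G$ is PSD and hence $\lambda_i \ge 0$, $\forall i$. Therefore $\lambda_i = 0$, $\forall i \Rightarrow G = 0 \Rightarrow A = B$.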
Setting $A, B$ to be the error covariance and the covariance lower bound (obtained from the CRB analysis) respectively, it is easy to see that if the trace of the covariance approaches the trace of the bound, then the covariance itself approaches the bound. Thus, given the estimation error matrix $E \triangleq \hat{H} - H$, it is reasonable to consider the mean of the squared Frobenius norm of $E$, given by $\mathrm{E}\{\|E\|_F^2\} = \mathrm{E}\{\mathrm{tr}(EE^H)\}$, as a performance measure. We now present a central result which relates the MSE of estimation to the number of unconstrained parameters in $H$.
Lemma 3. Under A.2, the minimum estimation error in $H$ is directly proportional to $N_\theta$, the number of unconstrained real parameters required to describe $H$, and in fact
$$\mathrm{E}\left\{\left\|\hat{H} - H\right\|_F^2\right\} \ge \frac{\sigma_n^2}{2\sigma_s^2L} N_\theta. \tag{3.4}$$
Proof. $H$ is an $r \times t$ dimensional matrix and therefore has $2rt$ real parameters. Let the parameter vector $\gamma$ be defined as $\gamma \triangleq \left[\mathrm{vec}(H^T)^T, \mathrm{vec}(H^H)^T\right]^T$, where $\mathrm{vec}(H)$ denotes a stacking of the columns of $H$ as $\mathrm{vec}(H) = \left[h_1^T, h_2^T, \ldots, h_t^T\right]^T$ and $h_i$ denotes the $i$th column of $H$ for $1 \le i \le t$. Since we are concerned with a constrained parameter estimation problem, we wish to employ the CC-CRB (Complex Constrained CRB). For this purpose we need to redefine the following notation. Let the extended set of constraints on $\gamma$ be given as $f(\gamma) = 0$ such that $f(\gamma) \in \mathcal{F}^{k\times 1}$, where $\mathcal{F}$ is the space of functions $f$ such that $f : \mathbb{C}^{2rt} \to \mathbb{R}$. Let $F(\theta) \in \mathbb{C}^{k\times 2rt}$ be defined as $F(\theta) \triangleq \frac{\partial f(\gamma)}{\partial \gamma}$. Thus, there exists a matrix $U$ such that the columns of $U$ form an orthonormal basis of the nullspace of $F(\theta)$.
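Since the number of unconstrained parameters in $H$ is $N_\theta$, the number of constraints on the system is given as $2rt - N_\theta$. This can be seen as follows. Let the elements of $H$ be stacked as $\delta \triangleq \left[\mathrm{vec}(\mathrm{Re}(H))^T, \mathrm{vec}(\mathrm{Im}(H))^T\right]^T \in \mathbb{R}^{2rt\times 1}$. Define $\zeta \triangleq [\zeta_1, \zeta_2, \ldots, \zeta_{N_\theta}]^T$ as the vector of the unconstrained parameters $\zeta_i$, $1 \le i \le N_\theta$. Let the parametric representation of the elements of $\delta$ be given as $\delta_j \triangleq \chi_j(\zeta)$, $1 \le j \le 2rt$, with $\chi_j : \mathbb{R}^{N_\theta\times 1} \to \mathbb{R}$. Let $\tilde{\delta} \triangleq [\delta_1, \delta_2, \ldots, \delta_{N_\theta}]^T$. Define the vector function $\tilde{\chi}$ as $\tilde{\chi} \triangleq [\chi_1, \chi_2, \ldots, \chi_{N_\theta}]^T$. Therefore, $\tilde{\chi} : \mathbb{R}^{N_\theta\times 1} \to \mathbb{R}^{N_\theta\times 1}$ with $\tilde{\chi}(\zeta) = \tilde{\delta}$. Now, by the inverse function theorem [35], under mild conditions$^2$ on $\tilde{\chi}$, there exists an inverse function $\bar{\chi} : \mathbb{R}^{N_\theta\times 1} \to \mathbb{R}^{N_\theta\times 1}$ such that $\bar{\chi}(\tilde{\delta}) = \zeta$. The $2rt - N_\theta$ constraints on the parameter vector $\delta$, and in turn on the elements of $H$, are then obtained by the constraint equations
$$\chi_j\left(\bar{\chi}(\tilde{\delta})\right) - \delta_j = 0, \quad N_\theta + 1 \le j \le 2rt. \tag{3.5}$$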
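Therefore, $\mathrm{rank}(F(\theta)) = 2rt - N_\theta$, the number of non-redundant constraints. It follows that $U \in \mathbb{C}^{2rt\times N_\theta}$. From [32], the CC-CRB for the estimation of $\gamma$ is given as
$$\mathrm{E}\left\{\left(\hat{\gamma} - \gamma\right)\left(\hat{\gamma} - \gamma\right)^H\right\} \ge U\left(U^HJU\right)^{-1}U^H, \tag{3.6}$$
where $J$ is the unconstrained complex Fisher information matrix (FIM). $J$ for the above scenario is then given as $J = \frac{\sigma_s^2L}{\sigma_n^2} I_{2rt\times 2rt}$ [15]. Substituting this expression for $J$ in (3.6) and considering the trace of the resulting matrices on both sides, as justified by Lemma 2, we have
$$\mathrm{tr}\left(\mathrm{E}\left\{\left(\hat{\gamma} - \gamma\right)\left(\hat{\gamma} - \gamma\right)^H\right\}\right) \ge \mathrm{tr}\left(\left(U^HJU\right)^{-1}\right), \tag{3.7}$$
$$\mathrm{E}\left\{2\left\|\hat{H} - H\right\|_F^2\right\} \ge \mathrm{tr}\left(\frac{\sigma_n^2}{\sigma_s^2L} I_{N_\theta\times N_\theta}\right) = \frac{\sigma_n^2}{\sigma_s^2L} N_\theta,$$
$$\mathrm{E}\left\{\left\|\hat{H} - H\right\|_F^2\right\} \ge \frac{\sigma_n^2}{2\sigma_s^2L} N_\theta. \tag{3.8}$$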
Thus, the above result validates the claim that the estimation of a matrix
with fewer un-constrained parameters i.e. a constrained matrix, can result in a
significant improvement in estimation accuracy. We next examine the significance
of the result in lemma 3 as applied to the WR based semi-blind algorithm.
$^2$ Existence of the inverse function requires that the derivative $\tilde{\chi}'$ be continuous and that the linear operator $\tilde{\chi}'$ be invertible. A rigorous formulation can be found in [35].
3.3.1 Estimation Accuracy of the WR scheme
The following result, which compares the lower bounds of estimation er-
rors of the training based and WR schemes, gives critical insight into the estimation
accuracy of the proposed semi-blind scheme.
Lemma 4. Under assumptions A.1 and A.2, the potential gain of the semi-blind algorithm (in dB) in terms of MSE of estimation is $10\log_{10}\left(\frac{2r}{t}\right)$.
Proof. Under A.1, since $W$ is perfectly known, it suffices to estimate the unitary matrix $Q$ to estimate the channel matrix as $\hat{H} = W\hat{Q}^H$. From [36], the number of real parameters required to parameterize $Q$, which under A.1 equals the number of unconstrained parameters in $H$, is given as $N_Q = t^2$. However, the general matrix $H$ has $N_H = 2rt$ unconstrained real parameters. Hence, from the result in Lemma 3, the estimation gain in dB of the semi-blind scheme, which estimates the constrained unitary matrix rather than the complex matrix $H$, is given by
$$G = 10\log_{10}\left(\frac{N_H}{N_Q}\right) \text{dB} = 10\log_{10}\left(\frac{2r}{t}\right) \text{dB}, \tag{3.9}$$
which completes the proof.
Two advantages of the WR scheme can be seen from the above result.
1. In the case when the number of receivers equals the number of transmitters
i.e. r = t, the algorithm can potentially perform 3dB more efficiently than
estimating H directly.
2. The estimation gain progressively increases as r, the number of receive an-
tennas, increases. This can be expected since as r increases, the complexity
of estimating H (size r × t) increases while that of Q (size t × t) remains
constant.
Thus for a size $8 \times 4$ complex channel matrix $H$, i.e. $H \in \mathbb{C}^{8\times 4}$, the estimation gain of the semi-blind technique is 6 dB, which represents a significant improvement over the conventional technique described in (3.2).
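The numbers quoted above follow directly from (3.4) and (3.9). The short sketch below evaluates the bound and the gain; the function names, the pilot length and the SNR value are illustrative assumptions.

```python
import math

def mse_lower_bound(n_params, pilot_len, snr):
    """Right-hand side of (3.4): N_theta / (2 L SNR), with SNR = sigma_s^2 / sigma_n^2."""
    return n_params / (2.0 * pilot_len * snr)

def wr_gain_db(r, t):
    """Gain (3.9) of estimating Q (t^2 real parameters) instead of H (2rt real parameters)."""
    n_h = 2 * r * t
    n_q = t * t
    return 10.0 * math.log10(n_h / n_q)

r, t, L, snr = 8, 4, 12, 10.0                    # example operating point (assumed)
bound_training = mse_lower_bound(2 * r * t, L, snr)
bound_semiblind = mse_lower_bound(t * t, L, snr)

print(wr_gain_db(4, 4))                          # square channel, about 3 dB
print(wr_gain_db(8, 4))                          # 8 x 4 channel, about 6 dB
print(bound_training / bound_semiblind)          # ratio of the two bounds, 2r/t
```

The ratio of the two bounds equals $2r/t$ regardless of $L$ and the SNR, which is why the gain in (3.9) depends only on the antenna configuration.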
3.3.2 Constrained CRB of the WR scheme
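An exact expression is now derived for the variance bound in each element of $H$. To begin with, we assume that only A.1 holds. Let the channel matrix be factorized using its singular value decomposition (SVD) as $H = P\Sigma Q^H$, where $P \in \mathbb{C}^{r\times t}$, $Q \in \mathbb{C}^{t\times t}$ are orthogonal matrices such that $P^HP = I_{t\times t}$, $Q^HQ = I_{t\times t}$, $\Sigma = \mathrm{diag}(\sigma_1, \sigma_2, \ldots, \sigma_t)$, $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_t > 0$. As seen earlier in (3.3), $W$ can be given as $W = P\Sigma$. Let $q_i$ for $1 \le i \le t$ be the columns of $Q$. Define the desired parameter vector to be estimated as $\rho \triangleq \left[\mathrm{vec}(Q)^T, \mathrm{vec}(Q^*)^T\right]^T = \left[q_1^T, q_2^T, \ldots, q_t^T, q_1^H, q_2^H, \ldots, q_t^H\right]^T$. It can then be seen that $\rho$ is a constrained parameter vector and the constraints are given as
$$q_i^Hq_i = 1, \quad 1 \le i \le t, \tag{3.10}$$
$$q_i^Hq_j = 0, \quad 1 \le i < j \le t. \tag{3.11}$$
Let $U_f \in \mathbb{C}^{2t^2\times t^2}$ be defined as
$$U_f = \begin{bmatrix} U_1 \\ U_2 \end{bmatrix} \triangleq \frac{1}{\sqrt{2}} \begin{bmatrix}
q_1 & 0 & q_2 & 0 & q_3 & \cdots \\
0 & q_1 & 0 & q_2 & 0 & \cdots \\
0 & 0 & 0 & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots \\
-q_1^* & -q_2^* & 0 & 0 & 0 & \cdots \\
0 & 0 & -q_1^* & -q_2^* & 0 & \cdots \\
0 & 0 & 0 & 0 & -q_1^* & \cdots \\
\vdots & \vdots & \vdots & \vdots & \vdots & \ddots
\end{bmatrix}. \tag{3.12}$$
From [32], $C_Q$, the CC-CRB for the estimation error of $\rho$, can be obtained as
$$C_Q = U_f\left(U_f^HJU_f\right)^{-1}U_f^H, \tag{3.13}$$
and the Fisher information matrix $J \in \mathbb{C}^{2t^2\times 2t^2}$ for the unconstrained case is given by the block diagonal matrix $J = \frac{1}{\sigma_n^2}\left(I_{2\times 2} \otimes \Sigma^2 \otimes X_pX_p^H\right)$. Block partitioning $C_Q$ as
$$C_Q = \begin{bmatrix} C_{Q_{11}} & C_{Q_{12}} \\ C_{Q_{21}} & C_{Q_{22}} \end{bmatrix}, \tag{3.14}$$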
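the CRB for the estimation of $\omega = \mathrm{vec}(Q)$ is given by $C_{Q_{11}}$. Let $\theta = \mathrm{vec}(H^T)$ and $\Gamma = W \otimes I_{t\times t}$. We then have $\theta = \Gamma\omega$. Hence, from the property of the CRB under transforms [15], the error covariance of estimation of the channel matrix $H$ is then given as
$$\mathrm{E}\left\{\left(\hat{\theta} - \theta\right)\left(\hat{\theta} - \theta\right)^H\right\} \ge \Gamma C_{Q_{11}}\Gamma^H. \tag{3.15}$$
Eq. (3.15) gives the bound for a general pilot $X_p$. Additionally, if A.2 holds, then from [15] it follows that $J = \frac{\sigma_s^2L}{\sigma_n^2}\left(I_{2\times 2} \otimes \Sigma^2 \otimes I\right)$ and is therefore diagonal. Further, it can be verified that $U_f^HJU_f$ is also diagonal and is given as $U_f^HJU_f = \frac{\sigma_s^2L}{\sigma_n^2}\frac{1}{2}\tilde{\Sigma}$, where $\tilde{\Sigma} \in \mathbb{C}^{t^2\times t^2}$ is given as $\tilde{\Sigma} = \mathrm{diag}\left(\left[2\sigma_1^2, \sigma_1^2 + \sigma_2^2, \sigma_2^2 + \sigma_1^2, 2\sigma_2^2, \sigma_1^2 + \sigma_3^2, \ldots\right]\right)$. Hence, $C_{Q_{11}} = U_1\frac{\sigma_n^2}{\sigma_s^2L}\left(\frac{1}{2}\tilde{\Sigma}\right)^{-1}U_1^H$. Substituting these quantities in Eq. (3.15), the CRB for the estimation of $\theta$ is obtained as
$$\mathrm{E}\left\{\left(\hat{\theta} - \theta\right)\left(\hat{\theta} - \theta\right)^H\right\} \ge \left(P\Sigma \otimes I_{t\times t}\right)U_1\frac{\sigma_n^2}{\sigma_s^2L}\left(\frac{1}{2}\tilde{\Sigma}\right)^{-1}U_1^H\left(P\Sigma \otimes I_{t\times t}\right)^H = C_H. \tag{3.16}$$
The bound on the variance of the $(k,l)$ element of $H$ is given by the diagonal entry $C_H((k-1)t + l, (k-1)t + l)$:
$$\mathrm{E}\left\{\left|\hat{H}(k,l) - H(k,l)\right|^2\right\} \ge C_H((k-1)t + l, (k-1)t + l) = \frac{\sigma_n^2}{\sigma_s^2L}\sum_{i=1}^{t}\sum_{j=1}^{t}\frac{\sigma_i^2}{\sigma_j^2 + \sigma_i^2}\left|P_{k,i}\right|^2\left|Q_{l,j}\right|^2, \tag{3.17}$$
where $P_{k,i}$, $Q_{l,j}$ represent the $(k,i)$ element of $P$ and the $(l,j)$ element of $Q$ respectively. Thus Eq. (3.17) gives the variance bound for the estimation of each element of $H$. The weighting factor $\frac{\sigma_i^2}{\sigma_j^2 + \sigma_i^2}$ in each term of the above summation results in the net reduction of estimation error over the training based scheme as given in Lemma 4.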
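One way to see the consistency between the per-element bound (3.17) and Lemma 3 is to sum (3.17) over all elements of $H$: because the columns of $P$ and $Q$ have unit norm, the sum collapses to $\frac{\sigma_n^2}{\sigma_s^2L}\sum_{i,j}\frac{\sigma_i^2}{\sigma_i^2+\sigma_j^2} = \frac{\sigma_n^2}{2\sigma_s^2L}t^2$, which is (3.4) with $N_\theta = t^2$. The sketch below checks this for $t = 2$; the matrices, singular values and operating point are illustrative assumptions.

```python
def element_bound(k, l, P, Q, sigmas, pilot_len, snr):
    """Right-hand side of (3.17) for element (k, l) of H (0-based indices)."""
    t = len(sigmas)
    total = 0.0
    for i in range(t):
        for j in range(t):
            weight = sigmas[i] ** 2 / (sigmas[i] ** 2 + sigmas[j] ** 2)
            total += weight * abs(P[k][i]) ** 2 * abs(Q[l][j]) ** 2
    return total / (pilot_len * snr)     # snr = sigma_s^2 / sigma_n^2

s = 2 ** -0.5
P = [[s, s], [s, -s]]                    # example matrix with orthonormal columns
Q = [[s, s], [1j * s, -1j * s]]          # example unitary matrix
sigmas = [2.0, 0.5]                      # example singular values
L, snr = 8, 10.0

total_bound = sum(element_bound(k, l, P, Q, sigmas, L, snr)
                  for k in range(2) for l in range(2))
lemma3_bound = len(sigmas) ** 2 / (2.0 * L * snr)    # (3.4) with N_theta = t^2
print(total_bound, lemma3_bound)
```

The two printed values should agree, since each pair of weights $\frac{\sigma_i^2}{\sigma_i^2+\sigma_j^2}$ and $\frac{\sigma_j^2}{\sigma_i^2+\sigma_j^2}$ sums to one.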
3.4 Algorithms
3.4.1 Orthogonal Pilot ML (OPML) estimator
Under A.1 and A.2, $\hat{Q}$, the constrained OPML estimator of $Q$ such that $\hat{Q} : \mathbb{C}^{r\times L} \to S$, where $S$ is the manifold of $t \times t$ unitary matrices, is obtained by minimizing the ML cost
$$\left\|Y_p - WQ^HX_p\right\|^2 \quad \text{such that } QQ^H = I. \tag{3.18}$$
It is shown in [24, 37] that $\hat{Q}$ under the above conditions is given by
$$\hat{Q} = V_QU_Q^H \quad \text{where} \quad U_Q\Sigma_QV_Q^H = \mathrm{SVD}\left(W^HY_pX_p^H\right). \tag{3.19}$$
The above equation thus yields a closed form expression for the computation of $\hat{Q}$, the ML estimate of $Q$. The channel matrix $H$ is then estimated as $\hat{H} = W\hat{Q}^H$. We next present properties of the above estimator.
Properties of the OPML Estimator:
In this section we discuss properties of the OPML estimator. We show
that the estimator is biased and hence does not achieve the CRB for finite sample
length. However, from the properties of ML estimators, it achieves the CRB
asymptotically as the sample length increases. Further, it is also shown that the
bound is achieved for all sample lengths at high SNR .
P.1 There does not exist a finite length constrained unbiased estimator of the rotation matrix $Q$, and hence $\hat{Q}$, the OPML estimator of $Q$, is biased.
Proof. Let there exist $\hat{Q}$ such that $\hat{Q} : \mathbb{C}^{r\times L} \to S$ is a constrained unbiased estimator of $Q$. Here $\mathbb{C}^{r\times L}$ is the observation space (of $Y_p$) and $S$ is the manifold of unitary matrices. Then $\hat{Q} = Q + E$, where the error $E$ is such that $\mathrm{E}\{E\} = 0$. Now since $\hat{Q}$ is a constrained estimator we have $\hat{Q}^H\hat{Q} = I$ and therefore
$$\left(Q + E\right)^H\left(Q + E\right) = I,$$
which, when simplified using the fact that $Q^HQ = I$, yields
$$Q^HE + E^HQ + E^HE = 0.$$
Rearranging terms in the above expression, taking the trace and taking the expectation of quantities on both sides (where the expectation is with respect to the distribution of $E$ conditioned on $Q$) yields
$$\mathrm{tr}\left(Q^H\mathrm{E}\{E\} + \mathrm{E}\{E^H\}Q\right) = -\mathrm{E}\left\{\|E\|^2\right\}. \tag{3.20}$$
It can immediately be observed that the right hand side is strictly less than 0 while the left hand side is equal to zero (by virtue of $\mathrm{E}\{E\} = 0$), and hence the contradiction.
The above result then implies that the CRB cannot be achieved in a general scenario, as there does not exist an unbiased estimator, which is necessary for the achievement of the CRB. However, the properties presented next guarantee the asymptotic achievability of the CRB both in sample length and SNR.
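The mechanism behind P.1 is easiest to see in the scalar case $t = 1$, where $Q = e^{j\phi}$ lies on the unit circle. The following sketch is a toy illustration and not the OPML estimator itself: it assumes an estimator that returns $e^{j(\phi+\epsilon)}$ with a small set of equally likely, symmetric phase errors $\epsilon$. Every estimate satisfies the constraint, yet their mean is pulled inside the unit circle and so cannot equal $Q$.

```python
import cmath

phi = 0.9
Q = cmath.exp(1j * phi)                       # true 1 x 1 "rotation"
phase_errors = [-0.4, -0.2, 0.0, 0.2, 0.4]    # hypothetical, equally likely errors

estimates = [cmath.exp(1j * (phi + e)) for e in phase_errors]
mean_estimate = sum(estimates) / len(estimates)

# Every estimate has unit modulus, i.e. satisfies the constraint.
constraint_gap = max(abs(abs(q) - 1.0) for q in estimates)
shrinkage = abs(mean_estimate)                # equals the average of cos(error), below 1
bias = abs(mean_estimate - Q)

print(constraint_gap, shrinkage, bias)
```

The mean equals $Q$ multiplied by the average of $\cos\epsilon$, which is strictly below one unless the error is identically zero. This is the scalar counterpart of the $-\mathrm{E}\{\|E\|^2\}$ term in (3.20).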
P.2 The OPML estimator achieves the CRB given in (3.17) as the pilot sequence length $L \to \infty$.
Proof. Follows from the asymptotic property of ML estimators, reviewed in [15].
P.3 The OPML estimator of $Q$ achieves the CRB given in (3.17) at high SNR, i.e. as $\frac{\sigma_s^2}{\sigma_n^2} \to \infty$.
Proof. The above result can be proved using the theory of matrix eigenspace perturbation analysis detailed in [10]. The detailed proof can be found in the appendix.
3.4.2 Iterative ML procedure for general pilot - IGML
The ML estimate of Q for an orthogonal pilot Xp is given by (3.19).
In this section we present the IGML algorithm to compute the estimate for any
43
given pilot sequence $X_p$, i.e. when A.2 does not necessarily hold. As shown
later, the proposed IGML scheme reduces to the OPML under A.2. The ML cost
function to be minimized is given as in (3.18). Let A.1 hold true and $\tilde{Y}_p \triangleq P^H Y_p$.
With constraints given by (3.10) and (3.11), the Lagrange cost $f(Q, \lambda, \mu)$ to be
minimized can then be formulated as
$$f(Q, \lambda, \mu) = \sum_{i=1}^{t} \left\| \tilde{Y}_p(i) - \sigma_i q_i^H X_p \right\|^2 + \sum_{i=1}^{t} \mathrm{Re}\left\{ \lambda_i \left( q_i^H q_i - 1 \right) \right\} + \sum_{i=1}^{t} \sum_{j=i+1}^{t} \mathrm{Re}\left\{ \mu_{ij} q_i^H q_j \right\},$$
where $\lambda_i \in \mathbb{R}$, $\mu_{ij} \in \mathbb{C}$ are the Lagrange multipliers, $\tilde{Y}_p(i) \in \mathbb{C}^{1 \times L}$ is the $i$-th row
(output at the $i$-th receiver) and $q_i$ is the $i$-th column of $Q$ for $1 \le i, j \le t$. Define
the matrix of Lagrange multipliers $S \in \mathbb{C}^{t \times t}$ as $S_{ii} \triangleq \lambda_i$, $S_{ij} \triangleq \mu_{ij}$ if $i > j$ and
$S_{ij} \triangleq \mu_{ji}^*$ if $i < j$. Observe that $S$ is a Hermitian symmetric matrix, i.e. $S = S^H$.
The above cost function can now be differentiated with respect to $\mathrm{Re}\, q_i$, $\mathrm{Im}\, q_i$
for $1 \le i \le t$. These quantities can then be equated to 0 for extrema and after
some manipulation, the resulting equations can be represented in terms of complex
matrices as
$$X_p \tilde{Y}_p^H \Sigma - X_p X_p^H Q \Sigma^2 = QS, \tag{3.21}$$
where $Q$ is unitary. We avoid repeated mention of this constraint in the following
analysis and it is implicitly assumed to hold. Let $A \triangleq X_p \tilde{Y}_p^H \Sigma = X_p Y_p^H W$.
Left-multiplying (3.21) by $Q^H$ then gives
$$Q^H A - Q^H X_p X_p^H Q \Sigma^2 = S.$$
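As noted, $S = S^H$ and therefore the Lagrange multiplier matrix $S$ can be eliminated
as
$$Q^H A - A^H Q = Q^H X_p X_p^H Q \Sigma^2 - \Sigma^2 Q^H X_p X_p^H Q. \tag{3.22}$$
Adding and subtracting $L\sigma_s^2 \Sigma^2$ in (3.22) and rearranging terms yields
$$Q^H \left( A + \left( L\sigma_s^2 I_{t \times t} - X_p X_p^H \right) Q \Sigma^2 \right) = \left( A^H + \Sigma^2 Q^H \left( L\sigma_s^2 I_{t \times t} - X_p X_p^H \right) \right) Q.$$
Let $T \triangleq A + \left( L\sigma_s^2 I_{t \times t} - X_p X_p^H \right) Q \Sigma^2$. Thus from the above equation, $Q^H T$
is Hermitian symmetric, or in other words $Q^H T = T^H Q$. Also, if $U_T \Lambda_T V_T^H = \mathrm{SVD}(T)$
then $Q^H U_T \Lambda_T V_T^H = \mathrm{SVD}\left(Q^H T\right)$. We have then from the symmetry
of $Q^H T$,
$$Q^H U_T = V_T \;\Rightarrow\; Q = U_T V_T^H. \tag{3.23}$$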
Expression (3.23) gives the critical step in the IGML algorithm which is succinctly
presented below. Some of the definitions above are repeated for the sake of com-
pleteness.
IGML Algorithm: Let A.1 hold, i.e. $\hat{W} = W = P\Sigma$. $X_p$ is the transmitted pilot
symbol sequence and not necessarily orthogonal. We then compute the constrained
ML estimate of $Q$ as follows.
S.1 Compute $A = X_p Y_p^H W$, where $Y_p$ is the received output data.
S.2 Let $\hat{Q}_0$ denote the initial estimate of the unitary matrix $Q$. Compute $\hat{Q}_0$ by
employing $X_p$, $W$ and $Y_p$ in (3.19).
S.3 Repeat for $N$ iterations. At the $k$-th iteration, i.e. $1 \le k \le N$,
S.3.1 Let $T_k = A + \left( L\sigma_s^2 I_{t \times t} - X_p X_p^H \right) \hat{Q}_{k-1} \Sigma^2$.
S.3.2 Compute the refined estimate $\hat{Q}_k$ from $T_k$ by employing (3.23).
S.4 Finally estimate $H$ as $\hat{H} = W \hat{Q}_N^H$.
$N$, the number of iterations, is small and typically $N \le 5$ as found in
our simulations. It can now also be noticed that if A.2 holds, $X_p X_p^H = L\sigma_s^2 I$.
Therefore, $T = A = X_p Y_p^H W$. The SVD of $T$ is then given by $U_T \Lambda_T V_T^H = V_Q \Sigma_Q U_Q^H$.
It follows that the IGML solution given as
$$\hat{Q} = U_T V_T^H = V_Q U_Q^H, \tag{3.24}$$
is similar to the solution given in (3.19). Thus, when Xp is orthogonal, the IGML
algorithm converges in a single iteration to the OPML solution.
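The steps S.1 to S.4 can be sketched numerically as follows. This is only an illustrative NumPy sketch under stated assumptions: the function names `polar_unitary` and `igml`, the matrix sizes and the test data are hypothetical, and the initial estimate $\hat{Q}_0$ is assumed to be the unitary polar factor of $A$, which is what the reduction $T = A$ under A.2 suggests for (3.19).

```python
import numpy as np

def polar_unitary(T):
    """Unitary factor U V^H of T, taken from the SVD T = U diag(s) V^H, as in (3.23)."""
    U, _, Vh = np.linalg.svd(T)
    return U @ Vh

def igml(Yp, Xp, W, sigma_s2, n_iter=5):
    """Sketch of S.1-S.4 for the model H = W Q^H with W = P Sigma (so W^H W = Sigma^2)."""
    t, L = Xp.shape
    Sigma2 = np.real(np.diag(W.conj().T @ W))        # diagonal entries of Sigma^2
    A = Xp @ Yp.conj().T @ W                          # S.1
    D = L * sigma_s2 * np.eye(t) - Xp @ Xp.conj().T   # vanishes for an orthogonal pilot
    Q = polar_unitary(A)                              # S.2 (assumed form of the initial estimate)
    for _ in range(n_iter):                           # S.3
        T = A + (D @ Q) * Sigma2[np.newaxis, :]       # S.3.1: (D Q) right-multiplied by Sigma^2
        Q = polar_unitary(T)                          # S.3.2
    return Q, W @ Q.conj().T                          # S.4: H_hat = W Q_hat^H

# Noiseless check with an orthogonal pilot (hypothetical sizes r = 8, t = 4, L = 16).
rng = np.random.default_rng(0)
r, t, L, sigma_s2 = 8, 4, 16, 1.0
P = np.linalg.qr(rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t)))[0]
W = P @ np.diag([2.0, 1.5, 1.0, 0.5])
Q_true = np.linalg.qr(rng.standard_normal((t, t)) + 1j * rng.standard_normal((t, t)))[0]
H = W @ Q_true.conj().T
F = np.fft.fft(np.eye(L)) / np.sqrt(L)               # unitary DFT matrix
Xp = np.sqrt(L * sigma_s2) * F[:t, :]                 # Xp Xp^H = L sigma_s2 I
Yp = H @ Xp
Q_hat, H_hat = igml(Yp, Xp, W, sigma_s2)
```

With an orthogonal pilot the correction term `D` is zero, so every iteration returns the same matrix as the initial estimate, in line with the single-iteration remark above.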
Finally, we wish to compare the CRB of estimation of $H$ for the IGML and
OPML schemes. Let $X_p$, the pilot for the IGML scheme, be a random sequence such
that $\mathrm{E}\{X_p X_p^H\} = L\sigma_s^2 I$, or in other words it is statistically white. Denoting by $\tilde{J}$
the unconstrained FIM for IGML, we have from section (3.3), $\tilde{J} = I_{2 \times 2} \otimes \frac{1}{\sigma_n^2} X_p X_p^H$.
Therefore, $\mathrm{E}\{\tilde{J}\} = J$. The average CRB of IGML, where the averaging is over
the distribution of $X_p$, is then given as $\mathrm{CRB}_{IGML} = \mathrm{E}\{\tilde{J}^{-1}\}$. Employing Jensen's
inequality for matrices from [38] we have,
$$\mathrm{CRB}_{IGML} = \mathrm{E}\left\{\tilde{J}^{-1}\right\} \ge \left(\mathrm{E}\{\tilde{J}\}\right)^{-1} = J^{-1} = \mathrm{CRB}_{OPML}. \tag{3.25}$$
Thus, the error in the estimation of H is minimum for an orthogonal pilot Xp.
Similar optimality properties of orthogonal pilots have been previously reported in
[39] and [40].
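The matrix Jensen inequality behind (3.25) can be illustrated numerically. The sketch below is an assumption-laden toy: it uses only the $t \times t$ block $X_p X_p^H / \sigma_n^2$ of the FIM (dropping the $I_{2 \times 2} \otimes$ factor), draws hypothetical complex Gaussian pilots, and replaces expectations by sample averages.

```python
import numpy as np

rng = np.random.default_rng(1)
t, L, sigma_s2, sigma_n2, trials = 4, 12, 1.0, 0.5, 2000  # hypothetical values

J_sum = np.zeros((t, t), dtype=complex)
Jinv_sum = np.zeros((t, t), dtype=complex)
for _ in range(trials):
    # Statistically white pilot: E{Xp Xp^H} = L sigma_s2 I
    Xp = np.sqrt(sigma_s2 / 2) * (rng.standard_normal((t, L))
                                  + 1j * rng.standard_normal((t, L)))
    J = Xp @ Xp.conj().T / sigma_n2
    J_sum += J
    Jinv_sum += np.linalg.inv(J)

mean_of_inverse = Jinv_sum / trials               # sample version of E{J^{-1}}
inverse_of_mean = np.linalg.inv(J_sum / trials)   # sample version of (E{J})^{-1}
gap = mean_of_inverse - inverse_of_mean
# The gap should be positive semidefinite: its smallest eigenvalue is non-negative.
min_eig = np.linalg.eigvalsh((gap + gap.conj().T) / 2).min()
```

Because the matrix inverse is operator convex on positive definite matrices, the ordering also holds for the sample averages themselves, so `min_eig` is non-negative up to rounding.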
’Rotation-Optimization’ ML (ROML)
The IGML scheme suggested above to compute $Q$ for a general pilot
sequence $X_p$ might be computationally complex owing to the multiple SVD
computations involved. Thus, to avoid the complexity involved in the full computation
of the optimal ML solution, we propose a simple ROML procedure for the
sub-optimal estimation of $Q$, thus trading optimality for reduced complexity. The first step of
ROML involves construction of a modified cost function as
$$\min_Q \left\| \tilde{W} Y_p - Q^H X_p \right\|^2 \quad \text{where } QQ^H = I. \tag{3.26}$$
$\bar{Y}_p = \tilde{W} Y_p$ is the whitening pre-equalized data. The closed form solution $\hat{Q}$ for the
modified cost in (3.26) is given as
$$\hat{Q} = V_h U_h^H \quad \text{where } U_h S_h V_h^H = \mathrm{SVD}\left( \tilde{W} Y_p X_p^H \right), \tag{3.27}$$
which can be implemented with low complexity. This result for problem (3.26)
follows by noting its similarity to problem (3.18). Several choices can then be
considered for the pre-equalization filter $\tilde{W}$. The standard Zero-Forcing (ZF) equalizer
is given by $\tilde{W}_{ZF} = W^\dagger$ (where $\dagger$ denotes the Moore-Penrose pseudo-inverse) and is
usually referred to as 'data whitening' in the literature. However, ZF is susceptible to
noise enhancement, as frequently noted in the literature. Alternatively, a robust MMSE
pre-filter is given as $\tilde{W}_{MMSE} = \sigma_s^2 W^H \left( \sigma_s^2 WW^H + \sigma_n^2 I \right)^{-1}$.
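The closed form (3.27) and the two pre-filters can be sketched in a few lines of NumPy. The function names `zf_prefilter`, `mmse_prefilter` and `roml_rotation`, and all sizes and test data, are hypothetical choices made for illustration.

```python
import numpy as np

def zf_prefilter(W):
    """Zero-forcing pre-filter: Moore-Penrose pseudo-inverse of W."""
    return np.linalg.pinv(W)

def mmse_prefilter(W, sigma_s2, sigma_n2):
    """MMSE pre-filter sigma_s^2 W^H (sigma_s^2 W W^H + sigma_n^2 I)^{-1}."""
    r = W.shape[0]
    return sigma_s2 * W.conj().T @ np.linalg.inv(
        sigma_s2 * W @ W.conj().T + sigma_n2 * np.eye(r))

def roml_rotation(Yp, Xp, W_pre):
    """Closed form (3.27): Q_hat = V_h U_h^H with U_h S_h V_h^H = SVD(W_pre Yp Xp^H)."""
    U, _, Vh = np.linalg.svd(W_pre @ Yp @ Xp.conj().T)
    return Vh.conj().T @ U.conj().T

# Noiseless example with a general (non-orthogonal) pilot, hypothetical sizes.
rng = np.random.default_rng(2)
r, t, L = 8, 4, 12
P = np.linalg.qr(rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t)))[0]
W = P @ np.diag([2.0, 1.5, 1.0, 0.5])
Q_true = np.linalg.qr(rng.standard_normal((t, t)) + 1j * rng.standard_normal((t, t)))[0]
Xp = rng.standard_normal((t, L)) + 1j * rng.standard_normal((t, L))
Yp = W @ Q_true.conj().T @ Xp                     # Yp = H Xp with H = W Q^H
Q_zf = roml_rotation(Yp, Xp, zf_prefilter(W))
Q_mmse = roml_rotation(Yp, Xp, mmse_prefilter(W, 1.0, 0.1))
```

In this noiseless setting the ZF variant returns the true rotation for any full-rank pilot, while the MMSE variant still returns a unitary matrix but is biased by the regularization.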
$\hat{Q}$ given by (3.27) is a reasonably accurate closed form estimate of $Q$.
However, the resulting estimate does not have any statistical optimality properties
as it does not compute the solution to the true cost function given in (3.18).
This estimate of $Q$ can now be employed to initialize the IGML procedure to
minimize the true cost. However, to avoid the complexity associated with an SVD
computation, a constrained minimization procedure (e.g. 'fmincon' in MATLAB)
can now be employed to converge to the solution with the $t^2$ non-linear constraints
given by the unit norm and mutual orthogonality of the rows of $Q$. This procedure
then yields $\hat{Q}$ which is close to the optimal ML estimate and the low computational
cost of the proposed solution makes it attractive to implement in practical systems.
3.4.3 Total Optimization
This procedure builds on the above described schemes. The ML schemes
(OPML and IGML) for estimating the unitary matrix are optimal given perfect
knowledge of W . However, in finite total symbol run situations where this as-
sumption is not valid (for example in fast fading mobile environments where the
data symbols available in the channel coherence time are limited and hence the
estimated whitening matrix may not be exact as assumed earlier), the disjoint
estimation of the whitening matrix from blind symbols and rotation matrix from
pilot symbols is not optimal. We present a scheme for such a system to iteratively
compute the joint solution for W and Q based on minimizing a Gaussian likelihood
cost function.
Initialization of W and Q
$W$ can be estimated from the output correlation matrix $R_y$ which is given
as
$$R_y = \sigma_s^2 WW^H + \sigma_n^2 I. \tag{3.28}$$
The ML estimate of $R_y$ can be computed blindly from the entire received data
$y(1), y(2), \ldots, y(N)$ as $\hat{R}_y = \frac{1}{N} \sum_{i=1}^{N} y(i) y(i)^H$. Using relation (3.28) and assuming
that $\sigma_s^2$ and $\sigma_n^2$ are known at the receiver, $WW^H$ may be estimated as
$$\hat{H}\hat{H}^H = \frac{1}{\sigma_s^2} \left( \hat{R}_y - \sigma_n^2 I \right) = \hat{W}_{ML} \hat{W}_{ML}^H. \tag{3.29}$$
$\hat{W}_{ML}$ can then be computed from a Cholesky factorization of $\hat{R}_y$. $\hat{Q}$, the initial
estimate of $Q$, is then computed by employing $\hat{W}$ in the OPML or IGML algorithms
outlined in sections (3.4.1) and (3.4.2) respectively.
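A minimal sketch of this initialization is given below. Note that it swaps in an eigendecomposition of the matrix in (3.29) in place of the Cholesky factorization mentioned above, since $WW^H$ has rank $t < r$; the function name `whitening_from_covariance`, the sizes and the Gaussian test data are all hypothetical.

```python
import numpy as np

def whitening_from_covariance(R, t, sigma_s2, sigma_n2):
    """Return W_hat (r x t) with W_hat W_hat^H approximating (R - sigma_n2 I) / sigma_s2."""
    r = R.shape[0]
    G = (R - sigma_n2 * np.eye(r)) / sigma_s2
    G = (G + G.conj().T) / 2                         # enforce Hermitian symmetry
    evals, evecs = np.linalg.eigh(G)                 # eigenvalues in ascending order
    idx = np.argsort(evals)[::-1][:t]                # keep the t dominant eigenpairs
    sigma_hat = np.sqrt(np.clip(evals[idx], 0.0, None))
    return evecs[:, idx] * sigma_hat                 # P_hat @ diag(sigma_hat)

rng = np.random.default_rng(3)
r, t, N, sigma_s2, sigma_n2 = 8, 4, 5000, 1.0, 0.1
W = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))

# Exact covariance from (3.28)
R_exact = sigma_s2 * W @ W.conj().T + sigma_n2 * np.eye(r)
W_from_exact = whitening_from_covariance(R_exact, t, sigma_s2, sigma_n2)

# Sample covariance from N received vectors (white Gaussian symbols and noise)
X = np.sqrt(sigma_s2 / 2) * (rng.standard_normal((t, N)) + 1j * rng.standard_normal((t, N)))
noise = np.sqrt(sigma_n2 / 2) * (rng.standard_normal((r, N)) + 1j * rng.standard_normal((r, N)))
Y = W @ X + noise
R_hat = Y @ Y.conj().T / N
W_from_data = whitening_from_covariance(R_hat, t, sigma_s2, sigma_n2)
rel_err = (np.linalg.norm(W_from_data @ W_from_data.conj().T - W @ W.conj().T)
           / np.linalg.norm(W @ W.conj().T))
```

The returned factor is determined only up to a unitary rotation on the right, which is exactly the ambiguity that the pilot-based estimate of $Q$ resolves.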
Likelihood for Total Optimization
In order to arrive at a reasonably tractable likelihood function, we now
assume that the transmitted data $x(k)$, $k = L + 1, \ldots, N$ is Gaussian, i.e. $x \sim \mathcal{N}(0, \sigma_s^2 I)$.
The likelihood of the complete received data, conditioned on the pilot
symbols $X_p$, is given as
$$\mathcal{L}(W, Q) = \underbrace{\frac{1}{2}(N - L) \ln |R(W)| - \sum_{i=L+1}^{N} y(i)^H R(W)^{-1} y(i)}_{\mathcal{L}_1(W)} \; \underbrace{- \frac{1}{\sigma_n^2} \sum_{j=1}^{L} \left\| y(j) - WQx(j) \right\|^2}_{\mathcal{L}_2(W, Q)}, \tag{3.30}$$
where $R(W) \triangleq \sigma_s^2 WW^H + \sigma_n^2 I$. $\mathcal{L}_1$ is a function entirely of blind data and $\mathcal{L}_2$
depends only on training data. This cost function can be minimized for $W$ to
compute $\hat{W}$ as given below.
Total Optimization: Let $X_p$ be the transmitted pilot symbol sequence, not necessarily
orthogonal. We then compute estimates of the $W$ and $Q$ matrices as follows.
T.1 Compute $\hat{W}_0$, the initial estimate of $W$, from (3.29).
T.2 Compute $\hat{Q}_0$ by employing $X_p$, $\hat{W}_0$ in the IGML algorithm in section (3.4.2).
T.3 Repeat for $N_T$ iterations. At the $k$-th iteration, i.e. $1 \le k \le N_T$,
T.3.1 Using $\hat{W}_{k-1}$ as an initial estimate, compute $\hat{W}_k$ by minimizing
$\mathcal{L}(W, \hat{Q}_k)$ ('fminunc' in MATLAB).
T.3.2 Compute the IGML estimate of $\hat{Q}_k$ from $\hat{W}_k$.
T.4 Finally estimate $H$ as $\hat{H} = \hat{W}_{N_T} \hat{Q}_{N_T}^H$.
It is seen from the simulation results that minimization of the above like-
lihood yields an improved estimate of the channel matrix H even when elements
of the transmitted symbol vectors x(k) are drawn from a discrete signal constel-
lation. This solution however involves a computational overhead. Nevertheless it
provides a useful benchmark for the estimation of the flat-fading channel matrix
H. Practical implementation of this algorithm would require a recipe for efficient
numerical computation.
As the data length N increases with pilot length L kept constant, the
effect of L2 on the above expression diminishes for the estimation of W . Hence
for large blind data lengths N , maximizing the likelihood expression L with re-
spect to W , reduces to maximization of L1. The solution W is then given by the
ML estimate in (3.29). The second step maximizes L2, which is the cost function
optimized by the OPML and IGML algorithms. Thus, as N → ∞, the total opti-
mization scheme reduces to a one iteration algorithm involving the ML estimation
of W followed by the constrained ML estimation of Q.
3.5 Simulation Results
Our simulation set-up consists of an $8 \times 4$ MIMO channel $H$ (i.e. $r = 8$, $t = 4$).
$H$ was generated as a matrix of zero-mean circularly symmetric complex Gaussian
random entries such that the sum variance of the real and imaginary parts was
unity. For an orthogonal pilot, the source symbol vectors $x \in \mathbb{C}^{4 \times 1}$ are assumed to
be drawn from a BPSK constellation and the orthonormality condition is achieved
by using the Hadamard structure. Otherwise, for general pilot sequences and
data vectors, symbols were drawn from a 16-QAM signal constellation. Further,
for the transmitted training symbol vectors $X_p X_p^H = L\sigma_s^2 I$ and
for the data vectors $\mathrm{E}\{x(k) x(l)^H\} = \delta(k, l)\sigma_s^2 I$, thus maintaining the source power
$\sigma_s^2$ constant. Noise vectors $\eta(k)$ were generated as spatio-temporally uncorrelated
complex Gaussian random vectors with the variance of each element equal to $\sigma_n^2$.
The SNR of operation was measured as $\mathrm{SNR} = 10 \log_{10}\left( \sigma_s^2 / \sigma_n^2 \right)$. Simulations
described below investigate the performance of the proposed semi-blind algorithm
under different conditions.
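A set-up of this kind can be sketched as follows. This is only an assumed reconstruction: the helper `hadamard` builds a Sylvester-type Hadamard matrix, which requires a power-of-two pilot length (so $L = 16$ is used here instead of $L = 12$), and the least-squares estimate at the end is an ordinary training-based estimate included purely as a sanity check.

```python
import numpy as np

def hadamard(n):
    """Sylvester Hadamard matrix of order n (n must be a power of two)."""
    Hn = np.array([[1.0]])
    while Hn.shape[0] < n:
        Hn = np.kron(Hn, np.array([[1.0, 1.0], [1.0, -1.0]]))
    return Hn

rng = np.random.default_rng(4)
r, t, L = 8, 4, 16
snr_db = 8.0
sigma_s2 = 1.0
sigma_n2 = sigma_s2 / 10 ** (snr_db / 10)            # SNR = 10 log10(sigma_s2 / sigma_n2)

# Channel: circularly symmetric complex Gaussian entries with unit total variance
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)

# Orthogonal BPSK pilot: t rows of a Hadamard matrix, so Xp Xp^H = L sigma_s2 I
Xp = np.sqrt(sigma_s2) * hadamard(L)[:t, :]

# Spatio-temporally white noise with per-element variance sigma_n2
noise = np.sqrt(sigma_n2 / 2) * (rng.standard_normal((r, L))
                                 + 1j * rng.standard_normal((r, L)))
Yp = H @ Xp + noise

# Ordinary least-squares training estimate of H
H_train = Yp @ Xp.conj().T @ np.linalg.inv(Xp @ Xp.conj().T)
mse_train = np.mean(np.abs(H_train - H) ** 2)
```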
Experiment 1: In this experiment, we demonstrate the enhancement in estimation
accuracy that can be achieved by the use of statistical side information (white
data) as quantified by lemma 4. For this purpose we evaluate the MSE performance
of the different constrained ML estimators of $Q$ under A.1 and compare it
to the training based estimate given by (3.2) which neglects white data. The MSE
of estimation of the channel matrix H has been averaged over 1000 instantiations
of the channel noise η. In Fig.3.1., this MSE has been plotted vs. SNR in the
range 4dB ≤ SNR ≤ 11dB for the OPML semi-blind scheme. As noted in section
3.3.1, the MSE of semi-blind scheme is 6 dB lower than that of exclusively training
based channel estimation. The CRB of the semi-blind scheme is also plotted for
reference.
[Figure: plot titled "ML Error vs. SNR. H is 8 X 4, L = 12". Axes: SNR vs. MSE (log scale). Legend: Training, OPML, CR Bound.]
Figure 3.1: MSE vs. SNR of OPML semi-blind channel estimation and the semi-blind
CRB with perfect knowledge of $W$. Also shown for reference is the MSE of the
exclusively training based channel estimate. $H$ is an $8 \times 4$ complex flat-fading
channel matrix and pilot length $L = 12$.
Next we compute the MSE for different pilot lengths $L$ in the range
$20 \le L \le 100$. A statistically white pilot ($\mathrm{E}\{X_p X_p^H\} = L\sigma_s^2 I$) was employed for
the IGML, ROML and training based schemes while an orthogonal pilot $X_p$ was
used for the OPML scheme with $X_p X_p^H = L\sigma_s^2 I$, thus maintaining constant source
power. Fig.3.2.-left shows the error for these different schemes and also that for
the exclusive training based scheme. It can be seen that the semi-blind schemes
are 6dB more efficient than the training scheme as suggested by lemma 4. OPML
performs very close to the CRB while the IGML progressively improves towards
the CRB as the pilot length increases. In Fig.3.2.-right, which is a blown up ver-
sion of the same plot, it is seen that the ROML because of its sub-optimality loses
slightly (0.5 dB) in terms of estimation gain when compared to the other con-
strained estimators.
[Figure: two panels, each titled "ML Error. H is 8 X 4, SNR = 7.96 dB". Axes: Pilot Length vs. MSE (log scale). The left panel spans pilot lengths 20 to 100; the right panel is a magnified view over 50 to 70. Legend: Training, OPML, IGML, ROML, CR Bound.]
Figure 3.2: Computed MSE vs. Pilot length ($L$) for the OPML, IGML, ROML
and exclusive training based channel estimation. $H$ is an $8 \times 4$ complex flat-fading
channel matrix and SNR = 8 dB.
Experiment 2: We now consider the effect of estimation inaccuracies in $W$ arising
from the availability of finite blind data. We demonstrate the performance of the
Total Optimization procedure for the joint optimization of W and Q and contrast
it with the MSE of the IGML estimate with imperfect W . We consider estimation
of W from N − L = 300, 500, 1000 blind data symbols with the source symbols
drawn from a 16-QAM constellation and employing (3.29). The pilot sequence Xp
was orthogonal. As in the previous experiment, we consider the MSE in estimation
for different pilot lengths 20 ≤ L ≤ 100. It can then be seen from Fig.3.3. that
while the OPML with imperfect W for N − L = 500 performs marginally better
than the training sequence based technique (L ≤ 60 training symbols), the
TotOpt scheme which optimizes the likelihood in (3.30) performs consistently better
than the training sequence based scheme in all the cases. Their performance is
also compared to the situation of availability of perfect knowledge of W (perf W)
which can be seen to achieve the best performance. As noted in sec 3.4.3, the
performance of TotOpt approaches that of the OPML with perfect W as N → ∞.
[Figure: plot titled "Total Optimization. H is 8 X 4, SNR = 10 dB". Axes: Pilot Length (L) vs. MSE (log scale). Legend: Perf W; Training; Imperf W, N-L = 500; Tot Opt, N-L = 300; Tot Opt, N-L = 500; Tot Opt, N-L = 1000.]
Figure 3.3: Comparison of OPML with perfect $W$, OPML with imperfect or estimated
$W$, total optimization and training based estimation of $H$.
Experiment 3: Finally, we consider $P_e$ of detection of the transmitted symbol vectors
employing $\hat{H}$ estimated from different schemes. We illustrate the performance
of OPML with perfect knowledge of W at the receiver and Total Optimization
with $N - L = 1000, 500$ blind symbols. The performance of the exclusively training
based estimate of $H$ is also plotted for $L = 12$. Fig. 3.4 shows the probability
of detection error vs. SNR for a linear MMSE receiver at the output for an $8 \times 4$
system $H$. It can be seen that at an SNR of 6 dB the semi-blind scheme achieves
about a 1 dB improvement in bit error performance and
thus improves over the exclusively training based estimate.
[Figure: plot titled "PROB. SYMBOL ERROR VS Es/No, L = 12". Axes: Eb/No (dB) vs. Pe (log scale). Legend: OPML, Perfect W; Tot Opt, N-L = 1000; Tot Opt, N-L = 500; Training.]
Figure 3.4: Probability of Bit Error vs. SNR for $8 \times 4$ MIMO system employing
OPML, Total Optimization ($N = 1000, 500$). The performance of the exclusively
training based channel estimate is also given for comparison.
3.6 OFDM Channel Estimation
The concept of constrained parameter estimation and CC-CRB provides
a general and powerful framework to characterize problems of a wide variety arising
in the context of wireless channel estimation. To illustrate this point we demonstrate
another application of constrained estimation framework by considering a problem
that arises in the context of channel estimation in Orthogonal frequency division
multiplexing (OFDM) communication systems. The problem of time vs frequency
domain channel estimation for an OFDM based communication system has been
detailed in [9,41]. It has been shown therein that when the number of subcarriers
K exceeds the numbers of taps L in the channel impulse response (CIR), the
time domain least squares channel estimate (TLSE) is more accurate than the
frequency domain least squares estimate (FLSE). Indeed, we demonstrate below
that this result follows as an immediate consequence of the complex constrained
CRB theory developed above.
3.6.1 Problem Description
Employing notation in [9], let the complex baseband channel from the
transmitter to the receiver be modelled by a tapped delay line as
$$h(\tau, t) = \sum_{l=0}^{L-1} h_l(t)\, \delta(\tau - lT_s), \tag{3.31}$$
where $L$ is the number of taps in the channel and is known. Let $K$ denote the
number of subcarriers and $p \triangleq [a_0, a_1, \ldots, a_{K-1}]^T$ be the pilot signal known at
the receiver. Denoting the discrete time CIR as $h = [h_0, h_1, \ldots, h_{L-1}]^T$, the
cyclic-prefix extended OFDM communication system can be modeled as
$$r = ah + n \tag{3.32}$$
where $r, n \in \mathbb{C}^{K \times 1}$ are the received symbol vector and additive white Gaussian
noise respectively. The matrix $a \in \mathbb{C}^{K \times L}$ is constructed from the pilot symbols as
$$a \triangleq \begin{bmatrix}
a_0 & a_{K-1} & a_{K-2} & \cdots & a_{K-L+1} \\
a_1 & a_0 & a_{K-1} & \cdots & a_{K-L+2} \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{L-1} & a_{L-2} & a_{L-3} & \cdots & a_0 \\
\vdots & \vdots & \vdots & \ddots & \vdots \\
a_{K-1} & a_{K-2} & a_{K-3} & \cdots & a_{K-L}
\end{bmatrix}. \tag{3.33}$$
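$\hat{h}$, the LS estimate of $h$, is given as
$$\hat{h} = \left( a^H a \right)^{-1} a^H r. \tag{3.34}$$
The frequency domain equivalent of the system in (3.32) can be obtained by
computing the DFT of both sides as
$$R = Fr = Fah + Fn, \tag{3.35}$$
where $F \in \mathbb{C}^{K \times K}$ is given as
$$F = \begin{bmatrix}
W^{00} & \cdots & W^{0(K-1)} \\
\vdots & \ddots & \vdots \\
W^{(K-1)0} & \cdots & W^{(K-1)(K-1)}
\end{bmatrix} \tag{3.36}$$
and $W^{il} \triangleq e^{-j 2\pi il / K}$. The system in (3.35) is then given as
$$R = AH + N, \tag{3.37}$$
where $A \triangleq \mathrm{diag}(Fp) \in \mathbb{C}^{K \times K}$, $N \triangleq Fn$ and $H \triangleq \tilde{F}h$, where $\tilde{F}$ is the left $K \times L$
sub-matrix of $F$. The unconstrained least squares estimate of $H$, which is also the
FLSE, is therefore given as
$$\hat{H}_f = \left( A^H A \right)^{-1} A^H R. \tag{3.38}$$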
However, the parameter vector $H$ is a constrained parameter vector and in fact,
the constraints on $H$ are given as
$$f(H) \triangleq \bar{F}^H H = 0 \tag{3.39}$$
where $\bar{F}$ is the right $K \times (K - L)$ sub-matrix of $F$. Therefore, from (3.39), it can
be seen that the number of constraints on $H$ is $K - L$. Hence, even though $H$
contains $K$ complex parameters ($2K$ real parameters), it only contains $L$ ($< K$)
un-constrained complex parameters ($2L$ real parameters) and these un-constrained
parameters are in fact the elements of the parameter vector $h$. $H$ is given as
a function of its un-constrained parameters as $H = \tilde{F}h$. Thus, an alternative
constrained technique to estimate $H$, based on the estimate in (3.34), is given as
$$\hat{H}_t = \tilde{F} \hat{h}, \tag{3.40}$$
which is also the TLSE of $H$. Assuming a constant power spectrum as in [9],
$A^H A = I$. Hence, the pilot orthogonality requirement of theorem 3 is satisfied
and in fact, the CRB is exactly achievable since the estimation problem in this
case involves a linear least squares cost function and the noise is Gaussian [15].
Therefore, from theorem 3, the ratio of the estimation error of the FLSE in (3.38)
to the estimation error in the TLSE in (3.40) is precisely given by the ratio of the
number of parameters to the number of un-constrained parameters as
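$$\frac{\mathrm{E}\left\{ \left\| \hat{H}_f - H \right\|_F^2 \right\}}{\mathrm{E}\left\{ \left\| \hat{H}_t - H \right\|_F^2 \right\}} = \frac{K}{L}, \tag{3.41}$$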
as reported in [9], where the above conclusion was reached after an explicit com-
putation of the covariance matrices of the time domain and frequency domain
estimation schemes. Thus, the constrained parameter framework and particularly
theorem 3 provides a powerful framework, where results such as the one in (3.41)
can be deduced by just reckoning the number of un-constrained parameters, thus
avoiding explicit computation of the error covariance matrices.
[Figure: plot titled "OFDM - Channel Estimation. K = 40, L = 5". Axes: SNR vs. MSE (log scale). Legend: Unconstrained - FLSE, Constrained - TLSE.]
Figure 3.5: Constrained Vs. unconstrained channel estimation for OFDM.
3.6.2 Simulation results
Our simulation setup consisted of an OFDM system with K = 40 subcar-
riers and L = 5 taps. The channel h was generated as a complex Gaussian vector
of zero mean independent entries and with the variance of real and imaginary parts
equal to 0.5. The time domain and frequency domain channel estimates were found
as given in (3.40) and (3.38) respectively. The experiment was repeated for 1000
iterations at different SNRs in the range 2 − 14 dB. The mean estimation error
vs SNR is given in Fig. 3.5. It can be seen that the time domain estimate is more
accurate than the frequency domain estimate. Also, the ratio of the estimation
error of the FLSE to TLSE is precisely $10 \log_{10}(K/L)$, i.e. approximately 9 dB for $K = 40$ and $L = 5$.
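The $K/L$ ratio in (3.41) can be checked with a small Monte Carlo sketch. The set-up below is an assumption rather than the exact experiment reported here: it works directly in the frequency domain model (3.37), uses a hypothetical unit-modulus pilot spectrum so that $A^H A = I$, draws white noise in the frequency domain, and fixes an arbitrary noise variance.

```python
import numpy as np

rng = np.random.default_rng(5)
K, L, trials, sigma2 = 40, 5, 1000, 0.1

F = np.fft.fft(np.eye(K))            # F[i, l] = exp(-j 2 pi i l / K), as in (3.36)
F_left = F[:, :L]                    # left K x L sub-matrix of F
A = np.diag(np.exp(1j * 2 * np.pi * rng.random(K)))   # unit-modulus pilot spectrum
M = A @ F_left                       # maps the L taps h to the noiseless observation

err_f = 0.0
err_t = 0.0
for _ in range(trials):
    h = (rng.standard_normal(L) + 1j * rng.standard_normal(L)) / np.sqrt(2)
    H = F_left @ h
    N = np.sqrt(sigma2 / 2) * (rng.standard_normal(K) + 1j * rng.standard_normal(K))
    R = A @ H + N
    H_f = A.conj().T @ R                              # FLSE (3.38), using A^H A = I
    h_hat = np.linalg.lstsq(M, R, rcond=None)[0]      # LS estimate of the L taps
    H_t = F_left @ h_hat                              # TLSE (3.40)
    err_f += np.sum(np.abs(H_f - H) ** 2)
    err_t += np.sum(np.abs(H_t - H) ** 2)

ratio = err_f / err_t                # should be close to K / L = 8
```

The FLSE error lives in all $K$ dimensions while the TLSE error is confined to the $L$-dimensional column space of the left sub-matrix of $F$, which is why counting unconstrained parameters predicts the ratio.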
3.7 Conclusions
A semi-blind scheme based on a whitening-rotation decomposition of the
channel matrix H has been proposed for MIMO flat-fading channel estimation.
The algorithm computes the whitening matrix W blind from received data and
the unitary matrix Q exclusively from the pilot data. Closed form expressions
for the CRB of the proposed scheme have been derived employing the CC-CRB
framework. Using the bounds, it is shown that the lower bound for the MSE in
channel matrix estimation is directly proportional to the number of un-constrained
parameters leading to the conclusion that the semi-blind scheme can be very effi-
cient when the number of receive antennas is greater than or equal to the number
of transmit antennas. We also develop and analyze algorithms for channel estima-
tion based on the decomposition. Properties of the constrained ML estimator of
Q have been studied and an iterative constrained Q-estimator has been detailed
for non-orthogonal pilot sequences. In the absence of perfect knowledge of W , a
Gaussian likelihood function has been presented for the joint estimation of W and
Q. Simulation results have been presented to support the algorithms and analy-
sis and they demonstrate improved performance compared to exclusively training
based estimation. The applicability of the above framework is also shown in the
context of time versus frequency domain channel estimation in OFDM systems.
Acknowledgement
The text of this chapter, in part, is a reprint of the material as it appears
in: A. K. Jagannatham and B. D. Rao, “Whitening-Rotation Based Semi-Blind
MIMO Channel Estimation”, IEEE Transactions on Signal Processing, Vol. 54,
No. 3, Mar. 2006, Pages: 861-869; A. K. Jagannatham and B. D. Rao, “A Semi-Blind
Technique For MIMO Channel Matrix Estimation”, 4th IEEE Workshop on Signal
Processing Advances in Wireless Communications, Rome, Italy, 15-18 June 2003,
Pages: 304-308; A. K. Jagannatham and B. D. Rao, “Constrained
ML Algorithms for Semi-Blind MIMO Channel Estimation”, IEEE Global
Telecommunications Conference (GLOBECOM ’04), Vol. 4, Nov. 29 - Dec. 3, 2004,
Pages: 2475-2479; A. K. Jagannatham and B. D. Rao, “Complex Constrained
CRB and its applications to Semi-Blind MIMO and OFDM Channel Estimation”,
Sensor Array and Multichannel Signal Processing Workshop Proceedings,
Barcelona, Spain, 18-21 July 2004, Pages: 397-401.
3.8 Appendix for Chapter(3)
In this section we present a more rigorous justification of P.3, i.e. the
OPML estimator of $Q$ achieves the CRB given in (3.18) at high SNR, i.e. as
$\sigma_s^2/\sigma_n^2 \to \infty$. We start by recapitulating a result from matrix perturbation theory
[10].
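Perturbation of eigenvectors: Consider a first order perturbation of a symmetric
matrix $R \in \mathbb{C}^{n \times n}$ by an error matrix $\Delta R$ to yield $\tilde{R} \in \mathbb{C}^{n \times n}$, i.e. $\tilde{R} = R + \Delta R$. Let
$s_k$ represent the eigenvectors and $\lambda_k$ the eigenvalues of $R$ for $k = 1, 2, \ldots, n$. Also
let the eigenvalues be distinct, i.e. $\lambda_i \ne \lambda_j$, $\forall i \ne j$. Then for small perturbations
$\Delta R$, the perturbed eigenvectors $\tilde{s}_k$ can be approximated as
$$\tilde{s}_k = s_k + \sum_{\substack{r=1 \\ r \ne k}}^{n} \omega_{rk} s_r, \qquad \omega_{rk} \triangleq \frac{s_r^H \Delta R\, s_k}{\lambda_k - \lambda_r}, \tag{3.42}$$
and the perturbed eigenvalues $\tilde{\lambda}_k$ are given as
$$\tilde{\lambda}_k = \lambda_k + \Delta\lambda_k, \qquad \Delta\lambda_k \triangleq s_k^H \Delta R\, s_k. \tag{3.43}$$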
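The first-order results (3.42) and (3.43) can be checked numerically. The sketch below assumes a small hypothetical Hermitian test matrix with well separated eigenvalues and compares the first-order predictions against an exact eigendecomposition of the perturbed matrix; all variable names and sizes are illustrative.

```python
import numpy as np

rng = np.random.default_rng(6)
n, eps = 4, 1e-5

# Hermitian test matrix with distinct eigenvalues 1, 2, 3, 4
V = np.linalg.qr(rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n)))[0]
R = (V * np.array([1.0, 2.0, 3.0, 4.0])) @ V.conj().T
G = rng.standard_normal((n, n)) + 1j * rng.standard_normal((n, n))
dR = eps * (G + G.conj().T) / 2                      # small Hermitian perturbation

lam0, S = np.linalg.eigh(R)
lam_exact, S_exact = np.linalg.eigh(R + dR)

M = S.conj().T @ dR @ S                              # M[r, k] = s_r^H dR s_k
lam_first_order = lam0 + np.real(np.diag(M))         # (3.43)

S_first_order = S.copy()
for k in range(n):
    for r in range(n):
        if r != k:
            omega_rk = M[r, k] / (lam0[k] - lam0[r])  # (3.42)
            S_first_order[:, k] += omega_rk * S[:, r]

# Eigenvectors are defined only up to a unit-modulus factor: align before comparing.
phase = np.sum(S.conj() * S_exact, axis=0)           # s_k^H s_exact_k for each k
S_exact_aligned = S_exact * (phase.conj() / np.abs(phase))

eigval_err = np.max(np.abs(lam_exact - lam_first_order))
eigvec_err = np.max(np.abs(S_exact_aligned - S_first_order))
```

Both errors are second order in the perturbation size, so they should be several orders of magnitude below `eps`.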
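Perturbation analysis of the SVD: The above results then provide a framework
for the analysis of an SVD of a perturbed matrix as the SVD involves operations
similar to the eigen decomposition. Let $\Phi \triangleq \frac{1}{\sigma_s^2 L} W^H Y_p X_p^H$. $\eta_p \in \mathbb{C}^{r \times L}$ is obtained
by stacking the noise vectors $\eta(k)$, $k = 1, 2, \ldots, L$ as $\eta_p \triangleq [\eta(1), \eta(2), \ldots, \eta(L)]$.
Thus, $Y_p = HX_p + \eta_p$ and substituting this in the expression for $\Phi$ yields
$$\Phi = \frac{1}{\sigma_s^2 L} \Sigma P^H \left( P\Sigma Q^H X_p + \eta_p \right) X_p^H = \Sigma^2 Q^H + E,$$
where $E = \frac{1}{\sigma_s^2 L} \Sigma U^H \eta_p X_p^H$ can be regarded as a perturbation matrix. From (3.19)
it is clear that computing the high SNR estimate of $Q$ involves a computation
of the SVD of $\Phi$. Define $\Omega \in \mathbb{C}^{t \times t}$ as $\Omega \triangleq \Sigma^2$. The SVD of the unperturbed
matrix $\Omega Q^H$ is given as $I_{t \times t}\, \Omega\, Q^H$. We now wish to use the result in (3.42) to
compute the SVD of the perturbed matrix $\Phi = \Omega Q^H + E$ given as $\tilde{I} \tilde{\Omega} \tilde{Q}^H$ where
$\tilde{I}, \tilde{Q} \in \mathbb{C}^{t \times t}$ and $\tilde{\Omega} = \mathrm{diag}\left( \tilde{\Omega}_1, \tilde{\Omega}_2, \ldots, \tilde{\Omega}_t \right)$. The perturbed left and right singular
vectors $\tilde{I}_i, \tilde{q}_i \in \mathbb{C}^{t \times 1}$ for $i = 1, \ldots, t$ are given in terms of the basis vectors of $I$, $Q$
as
$$\tilde{I}_i = I_i + K_{ii} I_i + \sum_{\substack{j=1 \\ j \ne i}}^{t} \alpha_{ji} I_j, \qquad \tilde{q}_i = q_i + L_{ii} q_i + \sum_{\substack{j=1 \\ j \ne i}}^{t} \beta_{ji} q_j, \tag{3.44}$$
where $K_{ii}, \alpha_{ji}, L_{ii}, \beta_{ji}$ are the perturbation coefficients. Recasting the above expressions
in terms of matrices, the perturbed matrices $\tilde{I}, \tilde{Q}$ are given as $\tilde{I} = IC$, $\tilde{Q} = QD$
where $C, D \in \mathbb{C}^{t \times t}$ are defined as
$$C \triangleq \begin{bmatrix}
1 + K_{11} & \alpha_{12} & \cdots & \alpha_{1t} \\
\alpha_{21} & 1 + K_{22} & \cdots & \alpha_{2t} \\
\vdots & \vdots & \ddots & \vdots \\
\alpha_{t1} & \alpha_{t2} & \cdots & 1 + K_{tt}
\end{bmatrix}, \qquad
D \triangleq \begin{bmatrix}
1 + L_{11} & \beta_{12} & \cdots & \beta_{1t} \\
\beta_{21} & 1 + L_{22} & \cdots & \beta_{2t} \\
\vdots & \vdots & \ddots & \vdots \\
\beta_{t1} & \beta_{t2} & \cdots & 1 + L_{tt}
\end{bmatrix}. \tag{3.45}$$
In the absence of noise, $K_{ii} = \alpha_{ji} = L_{ii} = \beta_{ji} = 0$, $\forall\, i, j$ and thus $C = D = I_{t \times t}$.
Hence these coefficients are essentially small in magnitude and therefore higher
order terms involving them are neglected.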
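Perturbation coefficients: We wish to now find expressions for these perturbation
coefficients in terms of the perturbation matrix $E$. By definition $\tilde{I}_i$ are the
eigenvectors of $\Phi\Phi^H$ which is given as,
$$\Phi\Phi^H = \left( \Omega Q^H + E \right) \left( \Omega Q^H + E \right)^H = \Omega^2 + \bar{E},$$
where the perturbation matrix $\bar{E} = EQ\Omega + \Omega Q^H E^H$ and the higher order term
$EE^H$ has been neglected. Thus, $I_i$ is the eigenvector of $\Omega^2$ while $\tilde{I}_i$ is the perturbed
eigenvector of $\Omega^2 + \bar{E}$. Hence from Eq (3.42) the coefficients $\alpha_{ji}$ are given as
$$\alpha_{ji} = \frac{I_j^H \bar{E} I_i}{\Omega_i^2 - \Omega_j^2} = \frac{\Omega_i I_j^H E q_i + \Omega_j q_j^H E^H I_i}{\Omega_i^2 - \Omega_j^2}. \tag{3.46}$$
Similarly $\tilde{q}_i$ is the perturbed eigenvector of $\Phi^H \Phi$. Hence considering $\Phi^H \Phi$ and
repeating the above procedure it can be shown that the complex coefficients $\beta_{ji}$
are given as
$$\beta_{ji} = \frac{\Omega_j I_j^H E q_i + \Omega_i q_j^H E^H I_i}{\Omega_i^2 - \Omega_j^2}. \tag{3.47}$$
It can also be observed that
$$\alpha_{ij} = -\alpha_{ji}^*, \qquad \beta_{ij} = -\beta_{ji}^*. \tag{3.48}$$
Thus $C, D$ are skew-Hermitian matrices. Let $\tilde{\Omega}_i = \Omega_i + \Delta\Omega_i$. $\tilde{\Omega}_1^2, \tilde{\Omega}_2^2, \ldots, \tilde{\Omega}_t^2$ are
eigenvalues of the perturbed matrix $\Phi\Phi^H$. We have $(\Omega_i + \Delta\Omega_i)^2 - \Omega_i^2 \approx 2\Omega_i \Delta\Omega_i$.
Hence using the result for perturbed eigenvalues from Eq (3.43) we have $2\Omega_i \Delta\Omega_i = I_i^H \bar{E} I_i$
and hence
$$\Delta\Omega_i = \frac{1}{2\Omega_i} I_i^H \left( \Omega Q^H E^H + EQ\Omega \right) I_i. \tag{3.49}$$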
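Finally, we derive a constraint for the coefficients $K_{ii}, L_{ii}$. By definition, the singular
vectors satisfy $\| \tilde{I}_i \|^2 = \| \tilde{q}_i \|^2 = 1$. Using this in (3.44) we have
$$K_{ii} + K_{ii}^* = -\sum_{\substack{j=1 \\ j \ne i}}^{t} |\alpha_{ji}|^2, \qquad L_{ii} + L_{ii}^* = -\sum_{\substack{j=1 \\ j \ne i}}^{t} |\beta_{ji}|^2. \tag{3.50}$$
Also, from the properties of the singular value decomposition it can be seen that
$\tilde{\Omega}_i \tilde{q}_i^H = \tilde{I}_i^H \left( \tilde{I} \tilde{\Omega} \tilde{Q}^H \right) = \tilde{I}_i^H \left( \Omega Q^H + E \right)$. This yields,
$$\tilde{\Omega}_i \Big( q_i + L_{ii} q_i + \sum_{\substack{j=1 \\ j \ne i}}^{t} \beta_{ji} q_j \Big)^H = \Big( I_i^H + K_{ii}^* I_i^H + \sum_{\substack{j=1 \\ j \ne i}}^{t} \alpha_{ji}^* I_j^H \Big) \left( I \Omega Q^H + E \right). \tag{3.51}$$
Right multiplying both sides by $q_i$ in (3.51) and simplifying by ignoring second
and higher order terms (such as $\Delta\Omega_i L_{ii}$, $K_{ii} E$ etc.) yields,
$$(1 + K_{ii}^*)\, \Omega_i + I_i^H E q_i = (\Omega_i + \Delta\Omega_i)(1 + L_{ii}^*)$$
$$\Rightarrow\; K_{ii}^* - L_{ii}^* = \frac{1}{\Omega_i} \left( \Delta\Omega_i - \frac{1}{\Omega_i} I_i^H EQ\Omega I_i \right). \tag{3.52}$$
Finally, substituting the expression for $\Delta\Omega_i$ from (3.49) yields
$$K_{ii}^* - L_{ii}^* = \frac{1}{2\Omega_i^2} I_i^H \left( \Omega Q^H E^H - EQ\Omega \right) I_i. \tag{3.53}$$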
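(3.46), (3.47), (3.53) thus give expressions for the perturbation coefficients, which depend
on the current realization of the observation noise $\eta_p$ through $E$. Next, we express
the estimation error in $H$ in terms of the perturbation coefficients. From (3.19), $\hat{Q}^H$
is given as $\hat{Q}^H = \tilde{I} \tilde{Q}^H$. We now employ the above expressions for the perturbation
coefficients to compute the MSE in the estimation of $H$.
Estimation error: The estimate of $H$ is given as $\hat{H} = W\hat{Q}^H = P\sqrt{\Omega}\, \tilde{I} \tilde{Q}^H$. Simplifying
the estimation error $H - \hat{H}$ yields,
$$\begin{aligned}
H - \hat{H} &= P\sqrt{\Omega}\, Q^H - P\sqrt{\Omega}\, C D^H Q^H \\
&= \sum_{i=1}^{t} p_i \sqrt{\Omega_i}\, q_i^H - \sum_{i=1}^{t} \sqrt{\Omega_i}\, p_i \left( 1 + K_{ii} + L_{ii}^* \right) q_i^H \\
&\quad - \sum_{j=1}^{t} \sum_{i=j+1}^{t} \sqrt{\Omega_j}\, p_i \left( \beta_{ij}^* - \alpha_{ij}^* \right) q_j^H - \sum_{i=1}^{t} \sum_{j=1}^{i-1} \sqrt{\Omega_i}\, p_i \left( \alpha_{ij} - \beta_{ij} \right) q_j^H,
\end{aligned}$$
where we have employed (3.48) and neglected second order terms of the type
$\alpha_{ij} \beta_{kl}$ in the above expansion. The estimation error of the $(k, l)$-th term may then
be obtained as,
$$\begin{aligned}
H_{kl} - \hat{H}_{kl} &= -\sum_{i=1}^{t} \sqrt{\Omega_i}\, P_{ki} \left( K_{ii} + L_{ii}^* \right) Q_{li}^* - \sum_{j=1}^{t} \sum_{i=j+1}^{t} \sqrt{\Omega_j}\, P_{ki} \left( \beta_{ij}^* - \alpha_{ij}^* \right) Q_{lj}^* \\
&\quad - \sum_{i=1}^{t} \sum_{j=1}^{i-1} \sqrt{\Omega_i}\, P_{ki} \left( \alpha_{ij} - \beta_{ij} \right) Q_{lj}^*.
\end{aligned}$$
From (3.50) we have $K_{ii} + L_{ii}^* = K_{ii} - L_{ii} - \sum_{j=1, j \ne i}^{t} |\beta_{ji}|^2$ and therefore neglecting
higher than second order terms $|K_{ii} + L_{ii}^*|^2 \approx |K_{ii} - L_{ii}|^2$. Utilizing this fact, the
variance of the estimation error in the $(k, l)$-th term can be written as
$$\begin{aligned}
\mathrm{E}\left\{ \left| H_{kl} - \hat{H}_{kl} \right|^2 \right\} &= \sum_{i=1}^{t} \Omega_i |P_{ki}|^2\, \mathrm{E}\left\{ |K_{ii} - L_{ii}|^2 \right\} |Q_{li}|^2 \\
&\quad + \sum_{j=1}^{t} \sum_{i=j+1}^{t} \Omega_j |P_{ki}|^2\, \mathrm{E}\left\{ |\beta_{ij} - \alpha_{ij}|^2 \right\} |Q_{lj}|^2 \\
&\quad + \sum_{i=1}^{t} \sum_{j=1}^{i-1} \Omega_i |P_{ki}|^2\, \mathrm{E}\left\{ |\alpha_{ij} - \beta_{ij}|^2 \right\} |Q_{lj}|^2. \tag{3.54}
\end{aligned}$$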
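Perturbation coefficient statistics: From the expressions for the perturbation coefficients
$\alpha_{ij}$, $\beta_{ij}$ given in (3.46) and (3.47) respectively we have,
$$|\alpha_{ij} - \beta_{ij}| = \frac{\left| I_j^H E q_i - q_j^H E^H I_i \right|}{\Omega_i + \Omega_j}. \tag{3.55}$$
Let $A_{ii} \triangleq \mathrm{E}\left\{ |K_{ii} - L_{ii}|^2 \right\}$ and $B_{ij} \triangleq \mathrm{E}\left\{ |\beta_{ij} - \alpha_{ij}|^2 \right\}$. Also, let $E = \Sigma G$ where
$G \triangleq \frac{1}{\sigma_s^2 L} U^H \eta_p X_p^H$. Since the noise $\eta$ is spatio-temporally white, the elements of
the matrix $\eta_p$ are uncorrelated. Further the variance of each element of $\eta_p$ is $\sigma_n^2$.
$U, X_p$ are orthogonal matrices. Hence $G$ has uncorrelated entries. However, since
every column of $X_p$ has a variance of $\sigma_s^2 L$, the variance of each element of $G$ is
given as $\mathrm{E}\left\{ |G_{ij}|^2 \right\} \triangleq \sigma_G^2 = \frac{\sigma_n^2}{\sigma_s^2 L}$. Also,
$$I_j^H E q_i - q_j^H E^H I_i = \sigma_j I_j^H G q_i - \sigma_i q_j^H G^H I_i. \tag{3.56}$$
Since $|I_i|^2 = |q_i|^2 = 1$ we have $\mathrm{E}\left\{ \left| I_j^H G q_i \right|^2 \right\} = \mathrm{E}\left\{ \left| q_j^H G^H I_i \right|^2 \right\} = \sigma_G^2$. Also,
from properties of zero-mean circularly symmetric random variables $I_j^H G q_i$ and
$q_j^H G^H I_i$ are uncorrelated. Therefore, $\mathrm{E}\left\{ \left| I_j^H E q_i - q_j^H E^H I_i \right|^2 \right\} = \left( \sigma_j^2 + \sigma_i^2 \right) \sigma_G^2 = (\Omega_i + \Omega_j)\, \sigma_G^2$.
Substituting this result in (3.55) yields
$$B_{ij} = \frac{\sigma_G^2}{\Omega_i + \Omega_j}, \tag{3.57}$$
where the expectation is over the distribution of $\eta_p$ conditioned on the channel
matrix $H$. Similarly repeating the above analysis for $K_{ii} - L_{ii}$ from the expression
given in (3.53) yields $A_{ii} = \frac{\sigma_G^2}{2\Omega_i}$. Substituting these results in (3.54) we have
$$\begin{aligned}
\mathrm{E}\left\{ \left| H_{kl} - \hat{H}_{kl} \right|^2 \right\} &= \frac{\sigma_n^2}{\sigma_s^2 L} \sum_{i=1}^{t} \frac{1}{2} |P_{ki}|^2 |Q_{li}|^2 + \frac{\sigma_n^2}{\sigma_s^2 L} \sum_{i=1}^{t} \sum_{\substack{j=1 \\ j \ne i}}^{t} \frac{\Omega_i}{\Omega_i + \Omega_j} |P_{ki}|^2 |Q_{lj}|^2 \\
&= \frac{\sigma_n^2}{\sigma_s^2 L} \sum_{i=1}^{t} \sum_{j=1}^{t} \frac{\sigma_i^2}{\sigma_j^2 + \sigma_i^2} |P_{ki}|^2 |Q_{lj}|^2, \tag{3.58}
\end{aligned}$$
which is similar to the CRB of the variance of each element given in (3.18). Thus
the constrained ML estimator achieves the CRB asymptotically in SNR.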
4 Fisher Information Based
Regularity and Semi-Blind
Estimation of MIMO-FIR
Channels
4.1 Introduction
Identifiability is another key consideration in the study of channel estima-
tion, especially in the context of semi-blind channel estimation where it is unclear
as to how much of the wireless channel can be estimated from the statistical infor-
mation available. In [42] the authors make an intuitive observation that the nullity
of the Fisher information matrix equals the number of unidentifiable parameters.
It is then demonstrated that at least one parameter is unidentifiable from the sta-
tistical information in the context of a SIMO channel. This is also reported widely
in research on subspace based channel estimation such as [43]. In [44,45], a variety
of channel estimation scenarios such as pilot, blind and semi-blind are considered
for a SIMO channel where criteria have been derived for identifiability. In the
context of MIMO channels, criteria for identifiability have been demonstrated in
[46]. Treating the transmitted symbols as deterministic unknown quantities, the
conditions for the regularity of the MIMO FIM and strict identifiability have been
demonstrated in [47]. Yet another interesting result on identifiability is reported in
Figure 4.1: Schematic representation of an SB system.
[48] where it has been demonstrated that in the context of orthogonal space-time
block codes the channel matrix can be estimated up to a complex scalar ambiguity
from blind symbols. Identifiability results for a general MIMO space-time coding
framework (not necessarily orthogonal) are presented in [49].
Parallel to these developments, work has been simultaneously reported
on blind and semi-blind algorithms for the estimation of MIMO frequency selective
channels. In [50] Tugnait and Huang have elaborated a blind algorithm based on
linear prediction for the estimation of MIMO frequency selective channels from
higher order statistics, and also demonstrate that it is not necessary to assume
that the channel matrix is column reduced. Another interesting scheme is demon-
strated in [51] for equalizable matrices that are not necessarily irreducible but can
be decomposed into the product of an irreducible and para-unitary component. A
closed form blind channel estimator has been demonstrated for MIMO channels
employing orthogonal space-time block codes in [52]. An algorithm for the simulta-
neous semi-blind estimation of channel state information of multiple transmitters
employing orthogonal space-time block codes has been demonstrated in [53].
In this work we address issues regarding the semi-blind estimation of
MIMO-FIR channels through an FIM approach. The FIM has also been employed
as a tool in [42] and [47] to study SIMO and MIMO channel identifiability issues
respectively. However, a key difference in our work is that we treat the unknown
data symbols as stochastic quantities bearing valuable statistical information. In
fact, a comparison of the stochastic (random) and deterministic (nonrandom) sig-
nal scenarios has been presented in [54] in the context of direction of arrival es-
timation where it has been demonstrated that if Cs, Cd denote the Cramer-Rao
Bounds (CRB) on the covariance matrices of the stochastic and deterministic es-
timation schemes, then Cd ≥ Cs. In other words, the stochastic model is statisti-
cally more efficient than its deterministic counterpart. The stochastic signal model
has the advantage that the number of unknowns in the system no longer grows
with the number of transmitted data symbols and moreover, the statistical information can
be employed to enhance the accuracy of the computed channel estimate as we
demonstrate. It can be observed that an r × t MIMO system with Lh channel
taps has 2rtLh real parameters. Given the additional statistical information in a
semi-blind system, a key concern is how many pilot symbols are necessary for iden-
tifiability. Using the FIM framework, we demonstrate that at least t pilot symbols
are necessary for regularity (zero nullity of the FIM) which implies identifiability
[47] of the MIMO-FIR channel. This framework can also demonstrate the well
known result that the MIMO channel can be estimated at most up to $t^2$
indeterminate parameters from second order statistical information¹. Thus, in a
MIMO-FIR system, under certain mild conditions, pilot symbols are needed only
to identify the $t^2$ blindly unidentifiable parameters. Further, we quantify the
performance enhancement of a semi-blind scheme relative to an exclusively training based
scheme which does not leverage the statistical information available. Employing
an asymptotic analysis we show that the semi-blind CRB (SB-CRB) indeed
approaches the complex-constrained CRB (CC-CRB) [55] for the estimation of these
$t^2$ parameters. Therefore, the semi-blind MSE of estimation is potentially very
small compared to an exclusively pilot based scheme.
¹ This does not imply that $2rtL_h - t^2$ of the original $2rtL_h$ parameters can be estimated but rather
that each of the $2rtL_h$ parameters can be identified only up to $t^2$ degrees of freedom. This will be more
clear from the discussion in the following sections.
It is known from earlier work that the $t^2$ blindly unidentifiable
parameters correspond to a $t \times t$ unitary matrix indeterminacy [46]. Employing this
observation, we describe a scheme for the SB estimation of a MIMO-FIR
channel based on an irreducible-unitary decomposition. The irreducible part can be
estimated from blind symbols by employing second order statistics (SOS) based
techniques. One such scheme which utilizes linear prediction is elaborated in [50].
Finally, we demonstrate an orthogonal pilot sequence based maximum-likelihood
(ML) scheme for the estimation of the unitary matrix indeterminacy and also de-
scribe a technique for the construction of an orthogonal pilot symbol matrix for
a MIMO FIR system from Paley-Hadamard matrices. The rest of the chapter is
organized as follows. The MIMO-FIR estimation problem is formulated in section
4.2. In section 4.3 we present results on the SB-FIM followed by section 4.4 which
describes the irreducible-unitary SB scheme for channel estimation. Simulation
results are presented in section 4.6 followed by conclusions in section 4.7. In what
follows, $i \in m, n$ represents $m \le i \le n$ with $i, m, n \in \mathbb{N}$, where $\mathbb{N}$ denotes the set of
natural numbers, $\mathrm{rank}(\cdot)$ denotes the rank of a matrix and $\mathcal{N}(\cdot)$ represents the null space
of a matrix.
4.2 Problem Formulation
Consider an $L_h$ tap frequency-selective MIMO channel. The system
input-output relation can be expressed as,
\[ y(k) = \sum_{i=0}^{L_h-1} H(i)\,x(k-i) + \eta(k), \tag{4.1} \]
where $y(k), x(k)$ are the $k$th received and transmitted symbol vectors
respectively. $\eta(k)$ is spatio-temporally white additive Gaussian noise of variance $\sigma_n^2$,
i.e. $E\{\eta(k)\eta(l)^H\} = \sigma_n^2\,\delta(k-l)\,I_r$ where $\delta(k) = 1$ if $k = 0$ and 0 otherwise.
Let $t, r$ be the number of transmitters and receivers respectively, so that $y(k) \in \mathbb{C}^{r\times 1}$ and
$x(k) \in \mathbb{C}^{t\times 1}$. Each $H(i) \in \mathbb{C}^{r\times t}$, $i \in 0, L_h-1$ is the MIMO channel matrix
corresponding to the $i$-th lag. Also, we assume $r > t$, i.e. the number of receivers is
greater than the number of transmitters. Let $x_p(1), x_p(2), \ldots, x_p(L_p)$ be a burst
of $L_p$ transmitted pilot symbols. The subscript $p$ in the above notation represents
pilots. We also assume that the leading channel matrix $H(0)$ is full rank. Let
$H \in \mathbb{C}^{r\times L_h t}$ be defined as,
\[ H \triangleq [H(0), H(1), \ldots, H(L_h-1)]. \]
The input output relation can then be represented as $Y_p = HX_p + N_p$, where the
block Toeplitz pilot matrix $X_p \in \mathbb{C}^{L_h t\times L_p}$ is constructed from the transmitted pilot
symbols as
\[ X_p \triangleq \begin{bmatrix} x_p(1) & x_p(2) & \cdots & x_p(L_p) \\ 0 & x_p(1) & \cdots & x_p(L_p-1) \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & x_p(L_p-L_h+1) \end{bmatrix}. \tag{4.2} \]
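The equivalence between the convolution model (4.1) and the compact pilot relation built from (4.2) can be checked numerically. The following is a minimal sketch, assuming NumPy, a noise-free channel and zero input before the first pilot; the helper names `pilot_toeplitz` and `fir_output` and the chosen dimensions are hypothetical and only serve the illustration.

```python
import numpy as np

def pilot_toeplitz(xp, Lh):
    """Stack pilots xp (t x Lp, column k holds x_p(k+1)) into the (Lh*t) x Lp matrix of (4.2)."""
    t, Lp = xp.shape
    Xp = np.zeros((Lh * t, Lp), dtype=complex)
    for j in range(Lh):
        # block row j holds the pilot stream delayed by j symbols, zero padded on the left
        Xp[j * t:(j + 1) * t, j:] = xp[:, :Lp - j]
    return Xp

def fir_output(taps, xp):
    """Noise-free output of (4.1), assuming zero input before the first pilot."""
    r, Lp = taps[0].shape[0], xp.shape[1]
    Y = np.zeros((r, Lp), dtype=complex)
    for k in range(Lp):
        for i, Hi in enumerate(taps):
            if k - i >= 0:
                Y[:, k] += Hi @ xp[:, k - i]
    return Y

rng = np.random.default_rng(0)
r, t, Lh, Lp = 4, 2, 2, 6          # illustrative sizes (assumption)
taps = [rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t)) for _ in range(Lh)]
xp = rng.choice([1, -1], size=(t, Lp)) + 1j * rng.choice([1, -1], size=(t, Lp))  # QPSK-like pilots

H = np.hstack(taps)                # H = [H(0), ..., H(Lh-1)], size r x (Lh*t)
Xp = pilot_toeplitz(xp, Lh)
print(Xp.shape, np.allclose(H @ Xp, fir_output(taps, xp)))
```

The product `H @ Xp` reproduces the direct convolution column by column, which is why the pilot matrix carries $L_h t$ rows and $L_p$ columns.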
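For the data symbols (which are blind information at the receiver), let us stack
$N > L_h$ transmitted symbol vectors in $\mathcal{X}(k)$ described by the system model given
below as,
\[ \underbrace{\begin{bmatrix} y(kN) \\ y(kN-1) \\ \vdots \\ y((k-1)N+L_h) \end{bmatrix}}_{\mathcal{Y}(k)} = \mathcal{H} \underbrace{\begin{bmatrix} x(kN) \\ x(kN-1) \\ \vdots \\ x((k-1)N+1) \end{bmatrix}}_{\mathcal{X}(k)} + \begin{bmatrix} \eta(kN) \\ \eta(kN-1) \\ \vdots \\ \eta((k-1)N+L_h) \end{bmatrix}, \tag{4.3} \]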
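where the matrix $\mathcal{H} \in \mathbb{C}^{(N-L_h+1)r\times Nt}$ is the standard block Sylvester channel
matrix often employed for the analysis of MIMO-FIR channels [42] and is given
as,
\[ \mathcal{H} \triangleq \begin{bmatrix} H(0) & H(1) & H(2) & \cdots & H(L_h-1) & 0 & \cdots \\ 0 & H(0) & H(1) & \cdots & H(L_h-2) & H(L_h-1) & \cdots \\ 0 & 0 & H(0) & \cdots & H(L_h-3) & H(L_h-2) & \cdots \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots \end{bmatrix}. \]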
The input vector $\mathcal{X}(k) \in \mathbb{C}^{Nt\times 1}$ is the data symbol block. This model is represented
pictorially in fig.4.2, where each block of data is $N$ symbols long. Such
a stacking of the input/output symbols into blocks results in loss of a small number
of information symbols ($L_h - 1$ symbols per block) due to interblock interference
(IBI) as depicted in fig.4.2. This model is similar to the one considered in [56] and
is adopted because eliminating the IBI makes the analysis tractable by yielding
simple likelihood expressions.
Figure 4.2: Schematic representation of input and output symbol blocks.
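The block Sylvester structure of the stacked data model can be illustrated with a short numerical sketch. It assumes NumPy, a single noise-free block ($k = 1$) and hypothetical names (`block_sylvester`, `X_block`, `Y_block`); it simply checks that the stacked product agrees with a direct evaluation of (4.1) for the IBI-free outputs $y(L_h), \ldots, y(N)$.

```python
import numpy as np

def block_sylvester(taps, N):
    """Build the ((N-Lh+1)*r) x (N*t) block Sylvester matrix from taps H(0), ..., H(Lh-1)."""
    Lh = len(taps)
    r, t = taps[0].shape
    rows = N - Lh + 1
    S = np.zeros((rows * r, N * t), dtype=complex)
    for m in range(rows):
        for i, Hi in enumerate(taps):
            # block row m carries H(i) in block column m + i
            S[m * r:(m + 1) * r, (m + i) * t:(m + i + 1) * t] = Hi
    return S

rng = np.random.default_rng(1)
r, t, Lh, N = 3, 2, 2, 5           # illustrative sizes (assumption)
taps = [rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t)) for _ in range(Lh)]
x = rng.standard_normal((N, t)) + 1j * rng.standard_normal((N, t))   # x[n-1] stores x(n), n = 1..N

# direct convolution (4.1) for the outputs that do not depend on the previous block
y = {n: sum(taps[i] @ x[n - 1 - i] for i in range(Lh)) for n in range(Lh, N + 1)}

X_block = np.concatenate([x[n - 1] for n in range(N, 0, -1)])     # x(N), x(N-1), ..., x(1)
Y_block = np.concatenate([y[n] for n in range(N, Lh - 1, -1)])    # y(N), y(N-1), ..., y(Lh)
S = block_sylvester(taps, N)
print(S.shape, np.allclose(S @ X_block, Y_block))
```

Only $N - L_h + 1$ output vectors appear per block, which is the loss of $L_h - 1$ symbols per block mentioned above.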
Let the transmitted data symbols $x(k)$ be spatio-temporally white, i.e.
$E\{x(k)x(l)^H\} = \sigma_s^2\,\delta(k-l)\,I_t$, with the normalized source power $\sigma_s^2 \triangleq 1$. Hence the
covariance of the block input vector $\mathcal{X}(k)$ is given as $R_{\mathcal{X}} \triangleq E\{\mathcal{X}(i)\mathcal{X}(i)^H\} = I_{Nt}$.
Next, we present insights into the nature of the above estimation problem.
4.3 Semi-Blind Fisher Information Matrix (FIM)
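In this section we formally set up the complex FIM for the estimation of
the channel matrix $H$ and provide insights into the nature of semi-blind estimation.
The parameter vector to be estimated $\theta_H \in \mathbb{C}^{2L_h rt\times 1}$ is defined by stacking the
complex parameter vector and its conjugate as suggested in [16,55] as,
\[ \theta_H \triangleq \begin{bmatrix} \theta_{H(0)} \\ \theta_{H(1)} \\ \vdots \\ \theta_{H(L_h-1)} \end{bmatrix}, \quad \text{where } \theta_{H(i)} \triangleq \begin{bmatrix} \mathrm{vec}(H(i)) \\ \mathrm{vec}(H(i))^* \end{bmatrix} \in \mathbb{C}^{2rt\times 1}, \]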
and $\mathrm{vec}(\cdot)$ denotes the standard matrix vector operator which represents a column
wise stacking of the entries of a matrix into a single column vector. In what follows
$k \in 0, L_h - 1$, $i \in 1, rt$. Observe also that $\theta_H^*(2krt + i) = \theta_H((2k + 1)rt + i)$. Let
$L_b$ blocks of data symbols $\mathcal{X}(p)$, $p \in 1, L_b$ be transmitted. In addition, let the data
symbol vectors $x(l)$, $l \in L_p + 1, NL_b + L_p$ be Gaussian. Then, $R_{\mathcal{Y}}$, the correlation
matrix of the output $\mathcal{Y}$ defined in section 4.2 is given as,
\[ R_{\mathcal{Y}} = E\{\mathcal{Y}(l)\mathcal{Y}(l)^H\} = \mathcal{H}\mathcal{H}^H + \sigma_n^2 I, \]
where $R_{\mathcal{Y}} \in \mathbb{C}^{(N-L_h+1)r\times(N-L_h+1)r}$. To make the analysis tractable, the ISI between
the pilot and first block of blind output symbols is ignored as shown in fig.4.2. The
log-likelihood expression for the semi-blind scenario described above is then given
by a simple sum of the blind and pilot log-likelihoods as,
\[ \mathcal{L}(\mathcal{Y}; \theta_H) = \mathcal{L}_b + \mathcal{L}_p, \]
where $\mathcal{L}_b$, the Gaussian log-likelihood of the blind symbols is given as,
\[ \mathcal{L}_b = -\sum_{k=1}^{L_b} \mathrm{tr}\left(\mathcal{Y}(k)^H R_{\mathcal{Y}}^{-1}\mathcal{Y}(k)\right) - L_b \ln\det R_{\mathcal{Y}}, \]
and $\mathcal{L}_p$, the least-squares log-likelihood of the pilot part is given as,
\[ \mathcal{L}_p = -\frac{1}{\sigma_n^2}\sum_{i=1}^{L_p}\left\| y_p(i) - \sum_{j=0}^{L_h-1} H(j)x_p(i-j) \right\|^2. \tag{4.4} \]
Hence, the FIM for the sum likelihood is given as,
\[ J_{\theta_H} = J^b + J^p, \tag{4.5} \]
where $J^b, J^p \in \mathbb{C}^{2rtL_h\times 2rtL_h}$ are the FIMs for the blind and pilot symbol bursts
respectively, which are defined by the likelihoods $\mathcal{L}_b, \mathcal{L}_p$ [15, 55]. This splitting of
the FIM into pilot and blind components is similar to the approaches employed in
[57, 58]. Next we present a general result on the properties of the FIM before we
apply it to the problem at hand in the succeeding sections.
4.3.1 FIM: A General Result
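In this section, we present an interesting property of an FIM based
analysis by demonstrating a relation between the rank of the FIM and the number
of unidentifiable parameters. Let $\alpha \in \mathbb{C}^{k\times 1}$ be the complex parameter vector of
interest. As described in [16,55] for estimation of complex parameters, we employ
a stacking of $\alpha$ as $\theta = [\alpha^T, \alpha^H]^T \in \mathbb{C}^{n\times 1}$ where $n \triangleq 2k$. Let $p(\omega; g(\theta))$ be the
pdf of the observation vector $\omega$ parameterized by $g(\theta)$, where $g(\theta): \mathbb{C}^{n\times 1} \to \mathbb{C}^{l\times 1}$
is a function of the parameter vector $\theta$. Similar to stacking $\alpha, \alpha^*$, let the function
$f(\theta): \mathbb{C}^{n\times 1} \to \mathbb{C}^{m\times 1}$, $m \triangleq 2l$ be defined as,
\[ f(\theta) = \begin{bmatrix} g(\theta) \\ g^*(\theta) \end{bmatrix}. \tag{4.6} \]
Given the log-likelihood $\mathcal{L}(\omega; \theta) \triangleq \ln p(\omega; f(\theta))$, the FIM $J_\theta \in \mathbb{C}^{n\times n}$ is given [15]
as,
\[ J_\theta \triangleq -E\left\{ \frac{\partial^2 \mathcal{L}(\omega; \theta)}{\partial\theta\,\partial\theta^H} \right\}. \]
Let $f(\theta)$ be an identifiable function of the parameter $\theta$, i.e. the FIM with respect
to $f(\theta)$ has full rank. We then have the following result.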
Lemma 5. Let $p(\omega; f(\theta))$ be the pdf of the observation vector $\omega$ and $f(\theta): \mathbb{C}^{n\times 1} \to \mathbb{C}^{m\times 1}$
be a function of the parameter vector $\theta$ satisfying the following conditions.
C.1 Let $f(\theta)$ itself be an identifiable function of the parameter $\theta$, i.e. the
FIM with respect to $f(\theta)$ has full rank.
C.2 Let $\mathrm{rank}\left(\mathcal{N}\left(\frac{\partial f(\theta)}{\partial\theta}\right)\right) \ge d$, or in other words the dimension of the null
space of $\frac{\partial f(\theta)}{\partial\theta}$ is at least $d$.
Under the above conditions, the FIM $J(\theta) \in \mathbb{C}^{n\times n}$ is rank deficient and moreover,
\[ \mathrm{rank}(J(\theta)) \le n - d. \]
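Proof. Let $p(\omega; f(\theta))$ be the pdf of the observations $\omega$. The derivative of the
log-likelihood with respect to the parameter vector $\theta$ is given as,
\[ \frac{\partial}{\partial\theta}\ln p(\omega; f(\theta)) = \frac{\partial}{\partial f(\theta)}\ln p(\omega; f(\theta))\,\frac{\partial f(\theta)}{\partial\theta}. \]
The un-constrained FIM for the estimation of the parameter vector $\theta$ is given as,
\[ J(\theta) = E\left\{ -\left(\frac{\partial}{\partial\theta}\ln p(\omega; f(\theta))\right)^T \frac{\partial}{\partial\theta}\ln p(\omega; f(\theta)) \right\} = \left(\frac{\partial f(\theta)}{\partial\theta}\right)^T \underbrace{E\left\{ -\left(\frac{\partial}{\partial f(\theta)}\ln p(\omega; f(\theta))\right)^T \frac{\partial}{\partial f(\theta)}\ln p(\omega; f(\theta)) \right\}}_{J(f(\theta))} \frac{\partial f(\theta)}{\partial\theta}. \]
The rank of a product cannot exceed the rank of any of its factors, and by condition C.2
the matrix $\frac{\partial f(\theta)}{\partial\theta}$ has rank at most $n - d$. Hence it follows that $\mathrm{rank}(J(\theta)) \le n - d$.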
The value $d$ has a critical significance in the above lemma and is related
to the number of un-identifiable parameters in the system. We now provide a
deeper insight into the above result that connects the nature of the FIM to
the number of unidentifiable parameters. Explicitly, let $\theta$ be reparameterized by
the real parameter vector $\xi \triangleq [\xi_1^T, \xi_2^T]^T$, $\xi_1 \in \mathbb{R}^{d'\times 1}$, $\xi_2 \in \mathbb{R}^{d\times 1}$ as $\theta(\xi)$. Let
$f(\theta(\xi)) \in \mathbb{C}^{m\times 1}$ satisfy the property,
\[ \frac{\partial f(\theta(\xi))}{\partial\xi_2} = \frac{\partial f(\theta)}{\partial\theta}\frac{\partial\theta}{\partial\xi_2} = 0_{m\times d}, \tag{4.7} \]
or in other words, the function $f(\theta)$ remains unchanged as the parameter vector
$\theta$ varies over the $d$ dimensional constrained manifold $\theta(\xi_2)$ and thus $\frac{\partial f(\theta)}{\partial\theta}$ has at
least a $d$ dimensional null space. The parameter vector $\xi_2$ is the un-constrained
parameterization of the constraint manifold and represents the unidentifiable
parameters. We assume that $\frac{\partial\theta}{\partial\xi_2}$ is full rank, i.e. $\theta$ is non-trivially dependent on
$\xi_2$.
The above scenario can be better illustrated with the aid of the following
example. Consider the estimation of a frequency flat channel matrix $H$, i.e. $L_h = 1$.
Let the system be parameterized as $\alpha = \mathrm{vec}(H)$ and hence from the discussion
above
\[ \theta \triangleq \begin{bmatrix} \mathrm{vec}(H) \\ \mathrm{vec}(H)^* \end{bmatrix}. \tag{4.8} \]
From the system model described in (4.1), the pdf of the observation vector
$y \in \mathbb{C}^{r\times 1}$ is given by $y \sim \mathcal{N}(0, g(\theta))$, where $g(\theta) \in \mathbb{C}^{r^2\times 1}$ is the output
correlation defined as $g(\theta) \triangleq \mathrm{vec}(HH^H + \sigma_n^2 I)$. Consider a re-parameterization of
the channel matrix $H$ by the real parameter vector $\xi$ as $H(\xi) = W(\xi_1)Q(\xi_2)$,
where $W(\xi_1) \in \mathbb{C}^{r\times t}$ is also known as a whitening matrix and $Q(\xi_2) \in \mathbb{C}^{t\times t}$
is a unitary matrix. $g(\theta)$ is now a many to one mapping since
$g(\theta(\xi)) = \mathrm{vec}\left(W(\xi_1)W(\xi_1)^H + \sigma_n^2 I\right)$ and
\[ f(\theta(\xi)) = \begin{bmatrix} \mathrm{vec}\left(W(\xi_1)W(\xi_1)^H + \sigma_n^2 I\right) \\ \left(\mathrm{vec}\left(W(\xi_1)W(\xi_1)^H + \sigma_n^2 I\right)\right)^* \end{bmatrix}, \tag{4.9} \]
which is independent of the parameter vector $\xi_2$. Hence, it is a many to one
mapping since for all unitary matrices $Q(\xi_2)$ we have,
\[ \frac{\partial f(\theta(\xi))}{\partial\xi_2} = 0_{r^2\times t^2}, \tag{4.10} \]
since $\xi_2 \in \mathbb{R}^{t^2\times 1}$ ($t^2$ is the number of real parameters to characterize a $t \times t$ unitary
matrix).
Thus, the rank of the FIM is deficient by at least d, which is the number of
un-identifiable parameters. This implies that each parameter θi in θ is identifiable
only up to d degrees of freedom owing to the un-identifiability of the ξ2 component
which is of dimension d. For instance, in the above example of the flat-fading
channel, each parameter can be identified only up to the corresponding parameter
of the whitening matrix. The above result has interesting applications, especially
in the investigation of identifiability issues in the context of blind and semi-blind
wireless channel estimation. In the next section we apply this analysis to the
problem of semi-blind MIMO-FIR channel estimation where we examine the rank
of the semi-blind FIM for different cases and derive further insights into the nature
of the estimation problem.
4.3.2 Blind FIM
We now apply the above result to our problem of MIMO-FIR channel
estimation. We start by investigating the properties of the blind FIM $J^b$. Let
the block Toeplitz parameter derivative matrix $\mathcal{E}(k) \in \mathbb{C}^{(N-L_h+1)r\times Nt}$ be defined
employing complex derivatives as
\[ \mathcal{E}(krt + i) \triangleq \frac{\partial\mathcal{H}}{\partial\theta_H(2krt+i)}. \]
From the results for the Fisher information matrix of a complex Gaussian stochastic
process [59], $J^b$ defined in (4.5) above is given as,
\[ \begin{aligned} J^b_{2krt+i,\,2lrt+j} &= J^b_{(2l+1)rt+j,\,(2k+1)rt+i} = L_b\,\mathrm{tr}\left(\mathcal{E}(krt+i)\,\mathcal{H}^H R_{\mathcal{Y}}^{-1}\,\mathcal{H}\,\mathcal{E}(lrt+j)^H R_{\mathcal{Y}}^{-1}\right) \\ J^b_{2krt+i,\,(2l+1)rt+j} &= \left(J^b_{(2l+1)rt+j,\,2krt+i}\right)^* = L_b\,\mathrm{tr}\left(\mathcal{E}(krt+i)\,\mathcal{H}^H R_{\mathcal{Y}}^{-1}\,\mathcal{E}(lrt+j)\,\mathcal{H}^H R_{\mathcal{Y}}^{-1}\right), \end{aligned} \tag{4.11} \]
where $J^b_{k,l}$ denotes its $(k, l)$th element. We can now apply the result in lemma 5
above to this FIM $J^b$ and we have the following result on the rank of the
blind FIM for the MIMO FIR channel.
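Theorem 2. Let the channel matrix $H(0)$ have full column rank. Then, a rank
upper-bound on the blind FIM $J^b \in \mathbb{C}^{2rtL_h\times 2rtL_h}$ defined in (4.11) above is given
as,
\[ \mathrm{rank}(J^b) \le 2rtL_h - t^2. \]
In fact, a basis for a $t^2$ dimensional subspace of the null space $\mathcal{N}(J^b)$ is given by
$\mathcal{U}(H) \in \mathbb{C}^{2rtL_h\times t^2}$ as,
\[ \mathcal{U}(H) \triangleq \begin{bmatrix} U(H(0)) \\ U(H(1)) \\ \vdots \\ U(H(L_h-1)) \end{bmatrix}, \]
where the matrix function $U(H): \mathbb{C}^{r\times t} \to \mathbb{C}^{2rt\times t^2}$ for the matrix $H = [h_1, h_2, \ldots, h_t]$
is defined as,
\[ U(H) = \begin{bmatrix} -h_1^* & -h_2^* & -h_3^* & \cdots & 0 & 0 & \cdots \\ 0 & 0 & 0 & \cdots & -h_1^* & -h_2^* & \cdots \\ 0 & 0 & 0 & \cdots & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots \\ h_1 & 0 & 0 & \cdots & h_2 & 0 & \cdots \\ 0 & h_1 & 0 & \cdots & 0 & h_2 & \cdots \\ 0 & 0 & h_1 & \cdots & 0 & 0 & \cdots \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots & \ddots \end{bmatrix}. \tag{4.12} \]
Proof. See Appendix 4.8.1.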
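The rank deficiency can be observed numerically in the simplest setting. The sketch below is an illustration under stated assumptions rather than the computation used in this chapter: it takes the flat fading case ($L_h = 1$), parameterizes $H$ by the real and imaginary parts of its entries instead of the stacked complex vector $\theta_H$, and uses the per-sample Gaussian FIM $\mathrm{tr}(R^{-1}\,\partial R_a\,R^{-1}\,\partial R_b)$ with $R = HH^H + \sigma_n^2 I$. The function names and the tolerance are hypothetical.

```python
import numpy as np

def blind_fim_flat(H, sigma2=1.0):
    """Per-sample blind FIM of y ~ CN(0, H H^H + sigma2 I) w.r.t. Re/Im parts of H (flat fading)."""
    r, t = H.shape
    R = H @ H.conj().T + sigma2 * np.eye(r)
    Rinv = np.linalg.inv(R)
    dR = []
    for unit in (1.0, 1j):                     # real parts first, then imaginary parts
        for col in range(t):
            for row in range(r):
                E = np.zeros((r, t), dtype=complex)
                E[row, col] = unit
                dR.append(E @ H.conj().T + H @ E.conj().T)   # derivative of R along E
    n = len(dR)
    J = np.empty((n, n))
    for a in range(n):
        for b in range(n):
            J[a, b] = np.trace(Rinv @ dR[a] @ Rinv @ dR[b]).real
    return J

def numerical_rank(J, rel_tol=1e-9):
    """Count singular values above a relative threshold (the threshold is an assumption)."""
    s = np.linalg.svd(J, compute_uv=False)
    return int(np.sum(s > rel_tol * s[0]))

rng = np.random.default_rng(2)
r, t = 4, 2
H = (rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))) / np.sqrt(2)
J = blind_fim_flat(H)
deficiency = J.shape[0] - numerical_rank(J)
print(J.shape[0], deficiency)    # number of real parameters 2rt and the observed rank deficiency
```

For a full column rank $H$ the null directions are the perturbations $HS$ with $S$ skew-Hermitian, which form a $t^2$ dimensional real space, so the deficiency printed for this $4 \times 2$ example equals $t^2 = 4$.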
Thus from the above result and lemma 5, it is clear that the MIMO-FIR
impulse response of the channel can be estimated at most up to an indeterminacy of
$t^2$ real parameters from the statistical information. Thus, using blind information
alone is not sufficient for the estimation of the MIMO channel. At the same time, the above
result has significant implications for estimation accuracy. As $r, L_h$ increase, the
number of real parameters ($2rtL_h$) in the system that need to be identified increases
many fold but the number of parameters that cannot be identified from blind
symbols may be as small as $t^2$, implying that a large part of the channel can be identified from
the blind symbols without any need for pilots. Continuing, we derive properties of
the semi-blind (pilot+blind) FIM $J_{\theta_H}$ defined in (4.5).
4.3.3 Pilots and FIM
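Recall that $x_p(1), x_p(2), \ldots, x_p(L_p)$ are the $L_p$ transmitted pilot
symbols. Then, the FIM of the pilot symbols $J^p$ is given as $J^p = \sum_{i=1}^{L_p} J^p(i)$, where
$J^p(i)$ is the FIM contribution from the $i$th pilot symbol transmission. Given
complex vectors in $\mathbb{C}^{t\times 1}$, let the matrix function $V(i, j): (\mathbb{C}^{t\times 1}, \mathbb{C}^{t\times 1}) \to \mathbb{C}^{2rt\times 2rt}$ be
defined as,
\[ {}_iV_j \triangleq \begin{bmatrix} x_p(i)x_p(j)^H \otimes I_r & 0 \\ 0 & x_p(i)^* x_p(j)^T \otimes I_r \end{bmatrix}, \quad \text{if } i, j > 0, \tag{4.13} \]
and ${}_iV_j = 0_{2rt\times 2rt}$ if $i \le 0$ or $j \le 0$. After some manipulations, it can be shown
that the FIM contribution $J^p(i) \in \mathbb{C}^{2rtL_h\times 2rtL_h}$ is given as,
\[ J^p(i) = \frac{1}{\sigma_n^2}\begin{bmatrix} {}_iV_i & {}_iV_{i-1} & \cdots & {}_iV_{i-L_h+1} \\ {}_{i-1}V_i & {}_{i-1}V_{i-1} & \cdots & {}_{i-1}V_{i-L_h+1} \\ \vdots & \vdots & \ddots & \vdots \\ {}_{i-L_h+1}V_i & {}_{i-L_h+1}V_{i-1} & \cdots & {}_{i-L_h+1}V_{i-L_h+1} \end{bmatrix}. \tag{4.14} \]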
The following result bounds the rank of the semi-blind (pilot + blind) FIM JθH .
Theorem 3. Let $L_p \le t$ pilot symbols $x_p(1), x_p(2), \ldots, x_p(L_p)$ be transmitted and
the matrix $H(0)$ be full column rank as assumed above. A rank upper bound of the
sum (pilot + blind) FIM $J_{\theta_H}$ defined in (4.5) above is given as,
\[ \mathrm{rank}(J_{\theta_H}) \le 2rtL_h - t^2 + \left(2tL_p - L_p^2\right), \quad 0 \le L_p \le t \tag{4.15} \]
or in other words, a lower bound on the rank deficiency is given as
$t^2 - \left(2tL_p - L_p^2\right) = (t - L_p)^2$.
Proof. See Appendix 4.8.2.
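The bound (4.15) is easy to tabulate. The following small sketch (function names are hypothetical) evaluates the rank upper bound and the corresponding lower bound on the rank deficiency for a $6 \times 5$ system with $L_h = 5$, the configuration used later in the simulations.

```python
def sb_fim_rank_bound(r, t, Lh, Lp):
    """Upper bound (4.15) on the rank of the semi-blind FIM, stated for 0 <= Lp <= t."""
    if not 0 <= Lp <= t:
        raise ValueError("bound (4.15) is stated for 0 <= Lp <= t")
    return 2 * r * t * Lh - t * t + (2 * t * Lp - Lp * Lp)

def rank_deficiency_lower_bound(t, Lp):
    """Lower bound (t - Lp)^2 on the rank deficiency implied by (4.15)."""
    return (t - Lp) ** 2

r, t, Lh = 6, 5, 5
deficiencies = [rank_deficiency_lower_bound(t, Lp) for Lp in range(t + 1)]
full_rank_bound = sb_fim_rank_bound(r, t, Lh, t)
print(deficiencies)        # [25, 16, 9, 4, 1, 0]
print(full_rank_bound)     # 300, i.e. 2*r*t*Lh
```

The deficiency bound reaches zero only at $L_p = t$, which is the observation formalized in the next section.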
The above result gives an expression for the rank upper bound of the
MIMO-FIR Fisher information matrix as a function of the number of transmitted pilot symbols. Since
identifiability requires a full rank FIM, it thus presents a key insight into the
number of pilot symbols needed for identifiability of the MIMO FIR system as
shown next.
4.3.4 Pilots and Identifiability
From the above result, one can obtain a lower bound for the minimum
number of pilot symbols necessary to achieve regularity or a full rank FIM JθH .
This result is stated below.
Lemma 6. The number of pilot symbol transmissions $L_p$ should at least equal the
number of transmit antennas $t$ for the FIM $J_{\theta_H}$ to be full rank and hence the
MIMO-FIR system in (4.1) to be identifiable.
Proof. It is easy to see from (4.15), that for $L_p < t$,
\[ \mathrm{rank}(J_{\theta_H}) \le 2rtL_h - (t - L_p)^2 < 2rtL_h, \]
i.e. strictly less than full rank. As the number of pilot symbols increases, for $L_p = t$,
$\mathrm{rank}(J_{\theta_H}) \le 2rtL_h$, where $2rtL_h$ is the dimension of $J_{\theta_H}$ and therefore represents
full rank. Hence, at least $t$ pilot symbols are necessary for the identifiability of the
MIMO-FIR wireless channel.
Thus at least $t$ pilot symbols are necessary for the system to become
identifiable. This seemingly counter intuitive result implies that the use of statistical
information does not necessarily reduce the minimum number of
pilot symbols for identifiability, which is equal to $t$ symbols even for pilot based
estimation. However, one has to observe that in the case of semi-blind
estimation potentially fewer parameters ($t^2$) need to be estimated from the pilots. Hence,
even though semi-blind schemes necessitate the transmission of a similar
minimum number of pilot symbols, the accuracy of estimation of such a scheme can
be higher owing to the fact that they estimate fewer parameters. Exactly how
much improvement one can expect by using such a scheme is quantified in the
next section where we present results about the asymptotic performance of the SB
estimator using the above FIM framework.
4.4 Semi-Blind Estimation: Performance
In this section we demonstrate and quantify the advantage of employing
semi-blind estimation as compared to pilot based estimation. For this
purpose, let the MIMO transfer function of the FIR channel be defined as
\[ H(z) = \sum_{i=0}^{L_h-1} H(i)z^{-i}. \]
Let $H(z)$ satisfy:
A.1 $H(z)$ is irreducible i.e. $H(z)$ has full column rank for all $z \neq 0$ (including
$z = \infty$). It follows that if $H(z)$ is irreducible, the leading coefficient matrix
$H(0) = [h_1(0), h_2(0), \ldots, h_t(0)]$ has full column rank (substitute $z = \infty$ in $H(z)$).
A.2 $H(z)$ is column reduced, i.e. the trailing coefficient matrix
$H(L_h - 1) = [h_1(L_h - 1), h_2(L_h - 1), \ldots, h_t(L_h - 1)]$
has full column rank.
The above assumptions are mild in nature and usually satisfied with very high
probability by wireless channel matrices arising from the random fading coeffi-
cients. For a discussion about special scenarios where the above conditions are
not satisfied the reader is referred to works [50,51]. Under the assumptions above,
it is known [46] that $H(z)$ can be identified up to a constant $t \times t$ unitary
matrix from second-order statistical information. Based on the above observations we
conjecture that if the MIMO transfer function $H(z)$ satisfies A.1 and A.2 described
above, the rank of the blind MIMO Fisher information matrix $J^b$ is given as
\[ \mathrm{rank}(J^b) = 2rtL_h - t^2, \tag{4.16} \]
i.e. the upper bound on the rank of the blind FIM $J^b$ in theorem 2 holds with
equality. A proof of the above statement has not been obtained, however this
has been extensively observed in our simulations. It is further conjectured that
in the case of fading wireless channels where the channel coefficients are random
quantities, the above result holds with probability 1 (see section 4.6). Therefore,
for the purposes of this section we assume that the result holds.
The above result implies that at most $t^2$ of the $2rtL_h$ parameters of the
MIMO-FIR system are unidentifiable from statistical information. This has
significant implications for semi-blind schemes which leverage statistical information
to potentially achieve estimation gains. In the next section, through an analysis of
the asymptotic FIM, we quantify this improvement in performance that can result
from a semi-blind scheme.
4.4.1 Asymptotic Semi-Blind FIM
Consider now the asymptotic performance of the semi-blind scheme from
an FIM perspective. As the amount of blind information increases (Lb → ∞),
the variance of estimation of the blind information (for instance the covariance
matrices) progressively decreases to zero, implying that the blindly identifiable
parameters (such as the whitening matrix) can be estimated accurately. Thus,
the SB estimation problem reduces to the constrained estimation problem of the
$t^2$ blindly un-identifiable parameters. The CRB for such an estimation scheme is
given by the Complex-Constrained CRB as illustrated in [55]. The next result
demonstrates that the SB-CRB indeed converges to the complex-constrained CRB
(CC-CRB) as the amount of blind information increases. Hence, the limiting MSE
is equal to the MSE for the complex constrained estimation of the $t^2$ blindly
un-identifiable parameters. SB techniques can therefore yield a far lower MSE of
estimation than an exclusively pilot based scheme as illustrated by the following
result.
Theorem 4. Let $J^p = \frac{L_p}{\sigma_n^2} I_{2rtL_h}$, which is achieved by an orthogonal pilot matrix².
Then, as the number of blind symbol transmissions increases ($L_b \to \infty$), the
semi-blind CRB $J_{\theta_H}^{-1}$ approaches the CRB for the exclusive estimation of the $t^2$ blindly
unidentifiable parameters. Further, the trace of the CRB matrix converges to,
\[ \lim_{L_b\to\infty} E\left\{\left\|\hat{H} - H\right\|_F^2\right\} \ge \frac{1}{2}\mathrm{tr}\left(J_{\theta_H}^{-1}\right) = \left(\frac{\sigma_n^2}{2L_p}\right)t^2, \tag{4.17} \]
which depends only on $t^2$, the number of blindly unidentifiable parameters.
² The construction of such an orthogonal pilot matrix $X_p^o$ is shown later in section 4.5.2.
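Proof. Given the fact that $J^p = \frac{L_p}{\sigma_n^2} I_{2rtL_h}$, the semi-blind FIM can be expressed as
\[ J_{\theta_H} = \frac{L_p}{\sigma_n^2} I_{2rtL_h\times 2rtL_h} + L_b\tilde{J}^b, \]
where $\tilde{J}^b$ is the blind FIM corresponding to a single observed blind symbol block
$\mathcal{Y}$ and is given as $\tilde{J}^b \triangleq \left(J^b/L_b\right)$, where the blind FIM $J^b$ is defined in (4.11).
From (4.16), it can be seen that $\tilde{J}^b$ is rank deficient and
$\mathrm{rank}(\tilde{J}^b) = \mathrm{rank}(J^b) = 2rtL_h - t^2$. Let the eigen-decomposition of $\tilde{J}^b$ be given as
$\tilde{J}^b = E_b\Lambda_b E_b^H$, where $\Lambda_b \in \mathbb{C}^{(2rtL_h-t^2)\times(2rtL_h-t^2)}$ is a diagonal matrix. Then,
\[ \begin{aligned} J(\theta_H) &= \frac{L_p}{\sigma_n^2}\left[E_b^\perp, E_b\right]\left[E_b^\perp, E_b\right]^H + L_b E_b\Lambda_b E_b^H \\ &= \left[E_b, E_b^\perp\right]\begin{bmatrix} \frac{L_p}{\sigma_n^2}I + L_b\Lambda_b & 0 \\ 0 & \frac{L_p}{\sigma_n^2}I \end{bmatrix}\left[E_b, E_b^\perp\right]^H. \end{aligned} \]
Hence the CRB $J^{-1}(\theta_H)$ is given as,
\[ J^{-1}(\theta_H) = \left[E_b, E_b^\perp\right]\begin{bmatrix} \left(\frac{L_p}{\sigma_n^2}I + L_b\Lambda_b\right)^{-1} & 0 \\ 0 & \frac{\sigma_n^2}{L_p}I \end{bmatrix}\left[E_b, E_b^\perp\right]^H. \]
As the number of blind symbols $L_b \to \infty$, the diagonal matrix
$\left(\frac{L_p}{\sigma_n^2}I + L_b\Lambda_b\right)^{-1} \to 0_{(2rtL_h-t^2)\times(2rtL_h-t^2)}$ in the above expression. Thus the semi-blind bound approaches
the complex constrained Cramer-Rao bound (CC-CRB) [55] given as,
\[ \lim_{L_b\to\infty} J^{-1}(\theta_H) = \frac{\sigma_n^2}{L_p}E_b^\perp {E_b^\perp}^H. \]
In fact, the bound on the MSE is clearly seen to be given as,
\[ \begin{aligned} E\left\{\left\|\hat{\theta}_H - \theta_H\right\|_F^2\right\} &\ge \frac{\sigma_n^2}{L_p}\mathrm{tr}\left(E_b^\perp {E_b^\perp}^H\right) \\ \Rightarrow\; 2\left(E\left\{\left\|\hat{H} - H\right\|_F^2\right\}\right) &\ge \frac{\sigma_n^2}{L_p}\mathrm{tr}\left(E_b^\perp {E_b^\perp}^H\right) \\ E\left\{\left\|\hat{H} - H\right\|_F^2\right\} &\ge \frac{1}{2}\frac{\sigma_n^2}{L_p}\left(2rtL_h - \left(2rtL_h - t^2\right)\right) = \left(\frac{\sigma_n^2}{2L_p}\right)t^2, \end{aligned} \]
which is the constrained bound for the estimation of the MIMO channel matrix
$H$.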
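The limiting behaviour in the proof can be reproduced with a synthetic stand-in for the blind FIM. The sketch below, assuming NumPy, does not compute the actual $\tilde{J}^b$ of (4.11); it draws a random positive semi-definite matrix with the conjectured rank $2rtL_h - t^2$ (eigenvalues chosen arbitrarily in $[0.5, 1.5]$), adds the orthogonal pilot FIM and evaluates half the trace of the inverse for growing $L_b$. All names and numerical values are illustrative assumptions.

```python
import numpy as np

def half_trace_crb(J_blind_unit, Lp, sigma2, Lb):
    """0.5 * tr(J^{-1}) for J = (Lp/sigma2) I + Lb * J_blind_unit."""
    n = J_blind_unit.shape[0]
    J = (Lp / sigma2) * np.eye(n) + Lb * J_blind_unit
    return 0.5 * np.trace(np.linalg.inv(J))

rng = np.random.default_rng(3)
r, t, Lh, Lp, sigma2 = 4, 2, 2, 20, 1.0
n, d = 2 * r * t * Lh, t * t                      # total parameters and null space dimension

# synthetic rank-(n - d) stand-in for the per-block blind FIM (assumption, not (4.11))
Q, _ = np.linalg.qr(rng.standard_normal((n, n)))
eigs = np.concatenate([rng.uniform(0.5, 1.5, n - d), np.zeros(d)])
J_unit = (Q * eigs) @ Q.T                         # Q diag(eigs) Q^T

limit = sigma2 * t * t / (2 * Lp)                 # right-hand side of (4.17)
values = [half_trace_crb(J_unit, Lp, sigma2, Lb) for Lb in (1, 100, 10**8)]
print(values, limit)
```

The computed values decrease with $L_b$ and settle at $\sigma_n^2 t^2 / (2L_p)$, since only the $t^2$ directions not seen by the blind FIM keep a pilot-limited variance.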
Thus the bound for the MSE of estimation and hence the asymptotic
MSE of the maximum-likelihood estimate of the channel matrix $H$ with the aid
of blind information, is directly proportional to $t^2$. The MSE of least-squares
estimation exclusively using pilot symbols is given as
$\frac{1}{2}\mathrm{tr}\left((J^p)^{-1}\right) = \left(\frac{\sigma_n^2}{2L_p}\right)2rtL_h$
and is proportional to $2rtL_h$, the total number of real parameters. Hence, the SB
estimate, which has an asymptotic MSE lower by a factor of $2rtL_h/t^2 = 2\left(\frac{r}{t}\right)L_h$, can potentially
be very efficient compared to exclusive pilot only channel estimation schemes. For
instance, in a MIMO system with $r = 4$, $t = 2$ and $L_h = 2$ channel taps, the
potential reduction in MSE by employing a semi-blind estimation procedure is
$2\left(\frac{r}{t}\right)L_h = 8$, i.e. approximately 9dB, as demonstrated in section 4.6. Thus the SB estimation scheme
can result in significantly lower MSE.
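The arithmetic behind the 9dB figure can be written out as a short sketch. The function names are hypothetical, and the values $L_p = 20$ and $\sigma_n^2 = 1$ are assumptions used only to make the two MSE expressions concrete.

```python
import math

def pilot_only_mse(r, t, Lh, Lp, sigma2):
    """Least-squares (pilot only) MSE bound: (sigma2 / (2 Lp)) * 2 r t Lh."""
    return sigma2 / (2 * Lp) * 2 * r * t * Lh

def semi_blind_asymptotic_mse(t, Lp, sigma2):
    """Asymptotic semi-blind bound (4.17): (sigma2 / (2 Lp)) * t^2."""
    return sigma2 / (2 * Lp) * t * t

def gain_db(r, t, Lh):
    """Ratio of the two bounds in dB; it depends only on r, t and Lh."""
    return 10 * math.log10(2 * r * t * Lh / (t * t))

r, t, Lh, Lp, sigma2 = 4, 2, 2, 20, 1.0
ratio = pilot_only_mse(r, t, Lh, Lp, sigma2) / semi_blind_asymptotic_mse(t, Lp, sigma2)
print(ratio, round(gain_db(r, t, Lh), 2))
```

The ratio is $2(r/t)L_h = 8$ irrespective of $L_p$ and $\sigma_n^2$, corresponding to $10\log_{10} 8 \approx 9.03$dB.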
4.5 Semi-blind Estimation: Algorithm
As shown above, the SB problem involves identifying $t^2$ parameters from
the pilot data. These $t^2$ parameters correspond to a unitary matrix. More precisely,
let $H(z) \in \mathbb{C}^{r\times t}(z)$ be the $r \times t$ irreducible channel transfer matrix. Let the input
output system model be as shown in section 4.2. Then, $H(z)$ can be identified up
to a unitary matrix from the output second order statistics of data. The matrix
$H \in \mathbb{C}^{r\times L_h t}$ can be expressed as
\[ H = W\left(I_{L_h} \otimes Q^H\right), \quad \text{where } W \triangleq [W(0), W(1), \ldots, W(L_h - 1)]. \]
From the above result, the matrices $W(i)$, $i \in 0, L_h - 1$ can be estimated blindly
from the correlation lags $R_y(j)$, $j \in 0, L_h - 1$. In the flat fading channel case
($L_h = 1$), this can be done by a simple Cholesky decomposition of the instantaneous
output correlation matrix $R_y(0)$. However, for the case of frequency selective
channels, estimating the matrices $W(i)$ is more involved and a scheme based on
designing multiple delay linear predictors is given in [50] (Set $n_a = 0$, $d = n_b = L_h - 1$
and it follows that $F_i = W(i)$). It thus remains to compute the unitary
matrix $Q \in \mathbb{C}^{t\times t}$, i.e. $Q$ is such that $QQ^H = Q^HQ = I$ and $Q$ has very few
parameters ($t^2$ real parameters, [55]). In the next section, we present algorithms
for the estimation of this unitary $Q$ indeterminacy from the transmitted pilot
symbols.
4.5.1 Orthogonal Pilot ML (OPML) for Q Estimation
We now describe a procedure to estimate the unitary matrix $Q$ from
an orthogonal pilot symbol sequence $X_p$. Let $X_p(i)$, $i \in 0, L_h - 1$ be defined as
$X_p(i) \triangleq [x_p(L_h - i), x_p(L_h - i + 1), \ldots, x_p(L_p - i)]$. Let the pilot matrix $X_p^o$ be
defined as,
\[ X_p^o \triangleq \begin{bmatrix} X_p(0) \\ X_p(1) \\ \vdots \\ X_p(L_h - 1) \end{bmatrix}. \]
We choose a different structure for this pilot matrix as compared to $X_p$ in (4.2)
to enable the construction of an orthogonal pilot matrix as shown later. The least
squares cost function for the constrained estimation of the unitary matrix $Q$ can
then be written as,
\[ \left\| Y_p - \sum_{i=0}^{L_h-1} W(i)Q^H X_p(i) \right\|^2, \quad \text{subject to } QQ^H = I_t. \]
Let the pilot matrix $X_p^o$ be orthogonal, i.e. $X_p^o\left(X_p^o\right)^H = L_p I_{L_h t}$. The cost
minimizing $\hat{Q}$ is then given as,
\[ \hat{Q} = UV^H, \quad \text{where } U\Sigma V^H = \mathrm{SVD}\left(\sum_{i=0}^{L_h-1} X_p(i)Y_p^H W(i)\right). \]
Proof. Follows from an extension of the result in [55].
Finally, $\hat{H}$ is given as $\hat{H} \triangleq \hat{W}\left(I_{L_h} \otimes \hat{Q}^H\right)$, consistent with the decomposition above. It now remains to demonstrate a
scheme to construct such an orthogonal pilot matrix $X_p^o$ which is treated next.
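The closed form for the unitary factor can be exercised on a small noise-free example. The following is a minimal sketch assuming NumPy; for simplicity it uses the flat fading case ($L_h = 1$), pilots taken from two rows of a $4 \times 4$ Hadamard matrix, and a known whitening matrix $W$. The function name `opml_unitary` and the test dimensions are hypothetical.

```python
import numpy as np

def opml_unitary(Y, W_taps, X_blocks):
    """Unitary estimate Q = U V^H from the SVD of sum_i X_p(i) Y^H W(i)."""
    M = sum(Xi @ Y.conj().T @ Wi for Xi, Wi in zip(X_blocks, W_taps))
    U, _, Vh = np.linalg.svd(M)
    return U @ Vh

rng = np.random.default_rng(4)
r, t, Lp = 4, 2, 4
hadamard4 = np.array([[1, 1, 1, 1],
                      [1, -1, 1, -1],
                      [1, 1, -1, -1],
                      [1, -1, -1, 1]], dtype=complex)
X = hadamard4[:t]                                   # t x Lp pilots with X X^H = Lp I
W = rng.standard_normal((r, t)) + 1j * rng.standard_normal((r, t))   # assumed known whitening matrix
Q_true, _ = np.linalg.qr(rng.standard_normal((t, t)) + 1j * rng.standard_normal((t, t)))

Y = W @ Q_true.conj().T @ X                         # noise-free pilot outputs for H = W Q^H
Q_hat = opml_unitary(Y, [W], [X])
print(np.allclose(Q_hat, Q_true))
```

In this noise-free setting the matrix inside the SVD equals $L_p Q W^H W$, whose unitary polar factor is $Q$ itself, so the estimate coincides with the true unitary matrix; with noise the same expression gives the constrained least-squares solution described above.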
Figure 4.3: Paley Hadamard Matrix
4.5.2 Orthogonal Pilot Matrix Construction
An orthogonal pilot in the context of MIMO FIR channels can be
constructed by employing the Paley Hadamard (PH) orthogonal matrix structure
shown in fig.4.3 and such a scheme has been described in [60]. The PH
matrix has blocks of shifted orthogonal rows (illustrated with the aid of rectangular
boundaries), thus giving it the block Sylvester structure. Each transmit stream
of orthogonal pilots for the FIR system can be constructed by considering the 'L'
shaped block shown in the figure and removing the prefix of $L_h$ (channel length)
symbols at the receiver. Thus, the modified pilot matrix (obtained by removing the
initial $L_h - 1$ columns corresponding to zero transmissions in (4.2)) for orthogonal
pilots $X_p^o$ is given as,
\[ X_p^o \triangleq \begin{bmatrix} x_p(L_h) & x_p(L_h + 1) & \cdots & x_p(L_p) \\ x_p(L_h - 1) & x_p(L_h) & \cdots & x_p(L_p - 1) \\ \vdots & \vdots & \ddots & \vdots \\ x_p(1) & x_p(2) & \cdots & x_p(L_p - L_h + 1) \end{bmatrix}. \tag{4.18} \]
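For reference, a Hadamard matrix with a shifted (circulant) core can be generated with the standard Paley type I construction. The sketch below is an assumption about the kind of matrix meant by fig.4.3, not a reproduction of the scheme in [60]: for a prime $q$ with $q \equiv 3 \pmod 4$ it builds a Hadamard matrix of order $q + 1$ from the quadratic character modulo $q$. Function names are hypothetical.

```python
def legendre(a, q):
    """Quadratic character of a modulo an odd prime q: 0, +1 or -1."""
    a %= q
    if a == 0:
        return 0
    return 1 if pow(a, (q - 1) // 2, q) == 1 else -1

def paley_hadamard(q):
    """Paley type I Hadamard matrix of order q + 1 for a prime q with q % 4 == 3."""
    if q % 4 != 3 or any(q % p == 0 for p in range(2, int(q ** 0.5) + 1)):
        raise ValueError("q must be a prime congruent to 3 modulo 4")
    n = q + 1
    H = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j or i == 0:
                H[i][j] = 1                       # unit diagonal and all-ones first row
            elif j == 0:
                H[i][j] = -1                      # first column below the diagonal
            else:
                H[i][j] = legendre(i - j, q)      # circulant core from the quadratic character
    return H

def gram(H):
    """Return H H^T as a list of lists."""
    n = len(H)
    return [[sum(H[i][k] * H[j][k] for k in range(n)) for j in range(n)] for i in range(n)]

H12 = paley_hadamard(11)
G = gram(H12)
print(len(H12), all(G[i][j] == (12 if i == j else 0) for i in range(12) for j in range(12)))
```

Each row of the lower right $q \times q$ core is a cyclic shift of the previous one, which is the shifted row structure referred to above. This construction yields orders $q + 1$ such as 12, 20, 48, 68 and 140 (for $q = 11, 19, 47, 67, 139$).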
Orthogonal pilots have been shown to be optimally suited for MIMO channel estimation
in studies such as [39, 40]. However, the Sylvester structure of FIR pilot matrices
further constrains the set of orthogonal pilot symbol streams compared to
flat-fading channels. As the number of channel taps increases, employing a PH matrix
to construct an orthogonal pilot symbol stream implies choosing a PH matrix
with a large dimension. This in turn implies an increase in the length of the pilot
symbol sequence and hence a larger overhead in communication. This problem can
be alleviated by employing a non-orthogonal pilot symbol sequence which results
in slightly suboptimal estimation performance but enables the designer to choose
any pilot length desired. The IGML scheme for channel estimation using
non-orthogonal pilots is described in [55] for flat-fading channels and can be extended
to FIR channels in a straightforward manner. Experimental results have shown
its performance to be comparable to the scheme that employs orthogonal pilots.
Figure 4.4: Rank deficiency of the complex MIMO FIM Vs. number of transmitted
pilot symbols ($L_p$) for a $6 \times 5$ MIMO FIR system of length $L_h = 5$. (Plot of FIM rank deficiency versus pilot length $L_p$, $t = 5$.)
4.6 Simulation Results
In this section we present results of computer simulation experiments to
illustrate the salient aspects of the work described above. In a majority of our
simulations below we consider a $4 \times 2$ MIMO FIR channel with 2 taps i.e. $L_h = 2$,
$r = 4$ and $t = 2$. Each of the elements of $H$ is generated as a zero-mean circularly
symmetric complex Gaussian random variable of unit variance, i.e. a Rayleigh
fading wireless channel. The orthogonal pilot sequence is constructed from Paley
Hadamard matrices by employing the scheme in section 4.5.2. The transmitted
symbols, both pilot and blind (data), are assumed to be drawn from a quadrature
phase shift keying (QPSK) symbol constellation [3].
Experiment 1: In fig.4.4. we plot the rank deficiency of the FIM of a 6×5 MIMO
FIR system (r = 6, t = 5) with Lh = 5 channel taps as a function of the num-
ber of transmitted pilot symbols Lp. The rank was computed for 100 realizations
of randomly generated Rayleigh fading MIMO channels and the rank deficiency
observed was precisely [25, 16, 9, 4, 1, 0] for Lp = [0, 1, 2, 3, 4, 5] transmitted pilot
symbols respectively. Hence, rank deficiency 25 for Lp = 0 verifies that the result
in (4.16) holds with overwhelming probability in the case of randomly generated
MIMO channels. Further, for $1 \le L_p \le 5$, the rank deficiency is given as $(5 - L_p)^2$
which additionally verifies that the bound in (4.15) for FIM rank deficiency as a
function of number of pilot symbols holds with equality with high probability.
Experiment 2: In fig. 4.5 we plot the MSE vs. SNR when the whitening matrix
W(z) is estimated from NLb = 1000 and 5000 blind received symbols employing the
linear prediction based scheme from [50]. The Q matrix is estimated from Lp = 20
orthogonal pilot symbols employing the semi-blind scheme in section 4.5.1. For
comparison we also plot the MSE when H is estimated exclusively from training
using least-squares [15, 55], the asymptotic complex constrained CRB (CC-CRB)
given by (4.17) and the MSE of estimation with the genie assisted case of perfect
knowledge of W(z). It can be observed that the MSE progressively decreases
towards the complex constrained CRB as the number of blind symbols increases.
Also observe that, as illustrated in theorem 4, the asymptotic SB estimation error is
$10\log_{10}(32/4) = 9$ dB lower than that of the pilot based scheme, as illustrated in section 4.4.1.
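The 9 dB figure is a one-line calculation. The sketch below assumes that 32 is the number of real parameters 2rtLh of the 4 × 2, two-tap channel and that 4 = t^2 is the number of real parameters in the unitary matrix Q. The text states only the ratio 32/4, so this reading and the function name are assumptions.

```python
import math


def asymptotic_gain_db(r, t, num_taps):
    """Ratio of assumed parameter counts, expressed in dB."""
    full = 2 * r * t * num_taps   # assumed: real parameters in the full channel
    constrained = t ** 2          # assumed: real parameters in the unitary Q
    return 10 * math.log10(full / constrained)


gain = asymptotic_gain_db(4, 2, 2)
print(round(gain, 2))  # 9.03
```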
[Plot: MSE vs. SNR (dB), Lh = 2, r × t = 4 × 2, Lp = 20. Curves: semi-blind with NLb = 1000 and NLb = 5000, training, asymptotic CRB.]
Figure 4.5: MSE vs. SNR in a 4 × 2 MIMO channel with Lh = 2 channel taps, Lp
= 20 pilot symbols.
In fig. 4.6 (left) we plot the MSE performance of the competing estimation
schemes above for different transmitted pilot symbol lengths Lp and 5000
transmitted QPSK data symbols (blind received symbols). As illustrated in section
4.5.1, we employ Paley-Hadamard matrices to construct the orthogonal pilot
sequences. Since such matrices exist only for certain lengths Lp, we plot the
performance for Lp = 12, 20, 48, 68, 140 pilot symbols. The asymptotic semi-blind
performance is 9 dB lower in MSE, as seen above. Also, for a given number of blind
symbols, the MSE gap between the semi-blind scheme with W(z) estimation and the
training scheme progressively decreases. This is because more blind symbols are
required to estimate the whitening matrix W(z) accurately enough for the MSE of
the semi-blind scheme to remain commensurate with the performance improvement of
the pilot based scheme. Finally, in fig. 4.6 (right) we plot the performance of the
competing schemes for different numbers of blind symbols in the range 1000 to 5000
QPSK symbols and Lp = 12 pilot symbols. The performance of the SB scheme with
W(z) estimated can be seen to progressively improve as the number of received blind
symbols increases.
[Plots: left, MSE vs. pilot length Lp, Lh = 2, r × t = 4 × 2, SNR = 5 dB; curves: semi-blind with NLb = 5000, training, asymptotic CRB. Right, MSE vs. NLb (number of blind symbols), SNR = 5 dB, Lh = 2, r × t = 4 × 2, Lp = 20; curves: semi-blind, imperfect W(z), training, asymptotic CRB.]
Figure 4.6: MSE performance for estimation of a 4 × 2 MIMO frequency-selective
channel. Left: MSE vs. Lp. Right: MSE vs. number of blind symbols.
Experiment 3: We compare the symbol error rate (SER) performance of the
training and semi-blind channel estimation schemes. At the receiver, we employ a
stacking as in (4.3) of 7 received symbol vectors y(k) followed by a MIMO minimum
mean-square error (MMSE) detector [4] constructed from the MIMO channel
matrix H. In fig. 4.7 we plot the SER of detection of the transmitted QPSK
symbols vs. SNR in the range 2 to 16 dB. It can be seen that the asymptotic
semi-blind estimator has a 1 to 2 dB improvement in detection performance over
the exclusive training based scheme. The SER drops from around $1 \times 10^{-1}$ at
2 dB to $1 \times 10^{-8}$ at 16 dB. Thus, an SB based estimation scheme can potentially
yield significant throughput gains when employed for the estimation of the wireless
MIMO frequency-selective channel.
[Plot: SER vs. SNR, Lh = 2, r × t = 4 × 2. Curves: training, asymptotic SB.]
Figure 4.7: Symbol error rate (SER) vs. SNR for QPSK symbol transmission of
a 4 × 2 MIMO frequency selective channel with Lh = 2 channel taps.
4.7 Conclusion
In this work we have investigated the rank properties of the semi-blind
FIM of a MIMO FIR channel and demonstrated that at least t pilot symbol
transmissions are necessary to achieve a full rank FIM (and hence identifiability) for
an Lh tap r × t (r > t) channel. Under certain mild conditions, the MIMO
channel transfer function H(z) can be decomposed as $H(z) = W(z)Q^H$, where
the whitening transfer function W(z) can be estimated from the blind symbols
alone. A constrained semi-blind estimation scheme has been presented to estimate
the unitary matrix Q from pilot symbols along with an algorithm to achieve an
orthogonal pilot matrix structure for MIMO frequency selective channels using
Paley-Hadamard matrices. Simulation results demonstrate the performance of the
proposed scheme.
Acknowledgement
The text of this chapter, in part, is a reprint of the material as it appears
in A. K. Jagannatham and B. D. Rao, "FIM Regularity for Gaussian Semi-Blind
MIMO FIR Channel Estimation", Conference Record of the Thirty-Ninth Asilomar
Conference on Signals, Systems and Computers, Oct. 28 - Nov. 1, 2005,
Pages: 848 - 852.
4.8 Appendix for Chapter 4
4.8.1 Proof of Theorem 2
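Proof. We illustrate the result for the simpler case of the flat fading channel, i.e.,
Lh = 1. Then, $H = H(0) \in \mathbb{C}^{r \times t}$. Let $\Phi \triangleq (HH^H + \sigma_n^2 I)^{-1}$. Let the
blind FIM $J^b$ be block partitioned as
\[
J^b \triangleq \begin{bmatrix} J^b_{11} & J^b_{12} \\ J^b_{21} & J^b_{22} \end{bmatrix}. \tag{4.19}
\]
It can be verified from (4.11) that $J^b_{21} = (J^b_{12})^H$ and $J^b_{22} = (J^b_{11})^T$. The block
components of the FIM are given as
\[
J^b_{11} = \begin{bmatrix}
h_1^H \Phi h_1 \Phi_{11} & h_1^H \Phi h_1 \Phi_{21} & \cdots & h_1^H \Phi h_t \Phi_{r1} \\
h_1^H \Phi h_1 \Phi_{12} & h_1^H \Phi h_1 \Phi_{22} & \cdots & h_1^H \Phi h_t \Phi_{r2} \\
\vdots & \vdots & \ddots & \vdots \\
h_1^H \Phi h_1 \Phi_{1r} & h_1^H \Phi h_1 \Phi_{2r} & \cdots & h_1^H \Phi h_t \Phi_{rr} \\
h_2^H \Phi h_1 \Phi_{11} & h_2^H \Phi h_1 \Phi_{21} & \cdots & h_2^H \Phi h_t \Phi_{r1} \\
h_2^H \Phi h_1 \Phi_{12} & h_2^H \Phi h_1 \Phi_{22} & \cdots & h_2^H \Phi h_t \Phi_{r2} \\
\vdots & \vdots & \ddots & \vdots \\
h_t^H \Phi h_1 \Phi_{1r} & h_t^H \Phi h_1 \Phi_{2r} & \cdots & h_t^H \Phi h_t \Phi_{rr}
\end{bmatrix},
\]
which can be written succinctly as $(H^H \Phi H) \otimes \Phi^T$. Similarly, $J^b_{12}$ is given as
\[
J^b_{12} = \begin{bmatrix}
\chi_{11}\chi_{11} & \chi_{12}\chi_{11} & \cdots & \chi_{11}\chi_{21} & \cdots & \chi_{1r}\chi_{t1} \\
\chi_{11}\chi_{12} & \chi_{12}\chi_{12} & \cdots & \chi_{11}\chi_{22} & \cdots & \chi_{1r}\chi_{t2} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
\chi_{11}\chi_{1r} & \chi_{12}\chi_{1r} & \cdots & \chi_{11}\chi_{2r} & \cdots & \chi_{1r}\chi_{tr} \\
\chi_{21}\chi_{11} & \chi_{22}\chi_{11} & \cdots & \chi_{21}\chi_{21} & \cdots & \chi_{2r}\chi_{t1} \\
\chi_{21}\chi_{12} & \chi_{22}\chi_{12} & \cdots & \chi_{21}\chi_{22} & \cdots & \chi_{2r}\chi_{t2} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
\chi_{21}\chi_{1r} & \chi_{22}\chi_{1r} & \cdots & \chi_{21}\chi_{2r} & \cdots & \chi_{2r}\chi_{tr} \\
\chi_{t1}\chi_{11} & \chi_{t2}\chi_{11} & \cdots & \chi_{t1}\chi_{21} & \cdots & \chi_{tr}\chi_{t1} \\
\vdots & \vdots & \ddots & \vdots & \ddots & \vdots \\
\chi_{t1}\chi_{1r} & \chi_{t2}\chi_{1r} & \cdots & \chi_{t1}\chi_{2r} & \cdots & \chi_{tr}\chi_{tr}
\end{bmatrix},
\]
where $\chi \triangleq H^H \Phi$. It can now be seen that $J^b U = 0$, where U is as defined
in (4.12). For instance, the top t elements of $J^b U(:, 1)$ (where we employ MATLAB
notation and U(:, 1) denotes the first column of U) are given as
$[J^b_{11}, J^b_{12}]\, U(:, 1) = -(h_1^H \Phi h_1)\, \Phi^T h_2^* + (h_2^H \Phi)^T (h_1^H \Phi h_1) = 0$, and so on.
The structure of the FIM for the most general case of arbitrary Lh is complex, but
the result can be seen to hold by employing a symbolic manipulation software tool
such as the MATLAB symbolic toolbox package.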
It can be seen that the top part of the null space basis matrix U (H) is
U (H(0)). As assumed earlier, rank (H(0)) = t. Now it is easy to see that if U (H)
is rank deficient, U (H(0)) is rank deficient and from its structure, H(0) is rank
deficient, violating the assumption. Hence $\mathrm{rank}(U(H)) = t^2$.
4.8.2 Proof of Theorem 3
Proof. $J^b$ and $J^p(i)$, $i \in 1, \ldots, k$, are positive semi-definite (PSD) matrices. We use the
following property: if A, B are PSD matrices, $(A + B)v = 0 \Leftrightarrow Av = Bv = 0$.
Therefore, $J_{\theta_H} v = 0 \Leftrightarrow J^b v = J^p(i) v = 0_{2rtL_h \times 1}$, $\forall i \in 1, \ldots, k$. In other words,
\[
\mathcal{N}(J) = \mathcal{N}(J^b) \cap \bigcap_{i=1}^{L_p} \mathcal{N}(J^p(i)).
\]
Let $v \in \mathcal{N}(J)$. Then, from the null space structure of $J^b$ in (4.12), it follows that
$v = U(H)s$, where $s \in \mathbb{C}^{t^2 \times 1}$. Also,
\[
J^p(i) v = J^p(i) U(H) s = 0, \quad \forall i \in 1, \ldots, L_p.
\]
From lemma 7, this implies that iVi U(H(0))s = 0, $\forall i \in 1, \ldots, L_p$, where iVi is as
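defined in (4.13). Let the matrix $T(u) : \mathbb{C}^{t \times 1} \to \mathbb{C}^{2t \times t^2}$ be defined as
\[
T(u) \triangleq \begin{bmatrix}
0 & -u_1^* & -u_2^* & \cdots \\
-u_1^* & 0 & 0 & \cdots \\
\vdots & \vdots & \vdots & \ddots \\
u_2 & u_1 & 0 & \cdots \\
0 & 0 & u_1 & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}. \tag{4.20}
\]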
Recall that H(0) is assumed to have full column rank. Then, from the structure
of U(H(0)) given in (4.12) it can be shown that the relation above holds if and
only if $Ps = 0$, where the matrix $P \in \mathbb{C}^{2L_p t \times t^2}$ is given as
\[
P \triangleq \begin{bmatrix} T(x_p(1)) \\ T(x_p(2)) \\ \vdots \\ T(x_p(L_p)) \end{bmatrix}. \tag{4.21}
\]
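It can then be seen that the matrix $G \in \mathbb{C}^{2tL_p \times L_p^2}$ forms a basis for the left nullspace of
P, i.e., $G^T P = 0$, where
\[
G \triangleq \begin{bmatrix}
x_p(1) & x_p(2) & 0 & \cdots \\
x_p(1)^* & 0 & x_p(2)^* & \cdots \\
0 & 0 & x_p(1) & \cdots \\
0 & x_p(1)^* & 0 & \cdots \\
\vdots & \vdots & \vdots & \ddots
\end{bmatrix}. \tag{4.22}
\]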
Thus, $\mathrm{rank}(P) \le 2tL_p - L_p^2$ and therefore the right nullity (or nullity) of P is
$\dim(\mathcal{N}(P)) \ge t^2 - (2tL_p - L_p^2)$. Therefore,
$\mathrm{rank}(J_{\theta_H}) \le 2rtL_h - \dim(\mathcal{N}(P)) \le 2rtL_h - t^2 + (2tL_p - L_p^2)$.
Lemma 7. Let $J^p(i) v = 0$, $\forall i \in 1, \ldots, L_p$, where $v = U(H)s$. Then, iVi U(H(0))s =
0, $\forall i \in 1, \ldots, L_p$, where iVi is as defined in (4.13).
Proof. Consider $J^p(1)$, the FIM contribution of the first transmitted pilot symbol.
It can be seen clearly that $J^p(1)$ is given as
\[
J^p(1) = \begin{bmatrix}
1V1 & 0_{2rt \times (2L_h - 2)rt} \\
0_{(2L_h - 2)rt \times 2rt} & 0_{(2L_h - 2)rt \times (2L_h - 2)rt}
\end{bmatrix}. \tag{4.23}
\]
Hence, $J^p(1) U(H) s = 0$ implies that 1V1 U(H(0))s = 0. Further, from the properties
of the matrix Kronecker product we have $AB \otimes CD = (A \otimes C)(B \otimes D)$.
Substituting $A = x_p(i)$, $B = x_p(i)^H$, $C = D = I_r$, one can then obtain that
\[
K(x_p(1))\, U(H(0))\, s = 0, \qquad
K(x_p(i)) \triangleq \begin{bmatrix}
x_p(i)^H \otimes I_r & 0 \\
0 & x_p(i)^T \otimes I_r
\end{bmatrix}. \tag{4.24}
\]
Since H(0) (and hence H(0)∗) is full rank, after some manipulation it can be shown
that the above condition implies T (xp(1)) s = 0. Now consider the contribution
of the second pilot transmission Jp(2). Jp(2)U (H) s = 0 implies that
K (xp(2)) U (H(0)) s + K (xp(1)) U (H(1)) s = 0. (4.25)
Since T (xp(1)) s = 0 and U (H(0)) , U (H(1)) have the same structure, it can be
shown that K (xp(1)) U (H(1)) s = 0 and hence it follows from the above equation
that K (xp(2)) U (H(0)) s = 0 which in turn implies that T (xp(2)) s = 0 and hence
2V2 U(H(0)) = 0 and so on. This proves the lemma.
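The Kronecker mixed-product property used in the proof of Lemma 7 can be checked numerically. The sketch below is a dependency-free illustration with hypothetical helper names and an arbitrary pilot vector; it verifies $AB \otimes CD = (A \otimes C)(B \otimes D)$ for $A = x$, $B = x^H$, $C = D = I_r$ with t = r = 2.

```python
def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    """Kronecker product of two matrices stored as lists of rows."""
    return [[a_ij * b_kl for a_ij in row_a for b_kl in row_b]
            for row_a in a for row_b in b]


x = [[1 + 2j], [3 - 1j]]                    # illustrative pilot vector (t = 2)
x_h = [[row[0].conjugate() for row in x]]   # Hermitian transpose, 1 x 2
eye = [[1, 0], [0, 1]]                      # I_r with r = 2

lhs = kron(matmul(x, x_h), matmul(eye, eye))   # (AB) kron (CD)
rhs = matmul(kron(x, eye), kron(x_h, eye))     # (A kron C)(B kron D)
max_diff = max(abs(p - q)
               for row_l, row_r in zip(lhs, rhs)
               for p, q in zip(row_l, row_r))
print(max_diff < 1e-12)
```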
5 Semi-Blind Estimation for
Maximum Ratio Transmission
5.1 Introduction
To recapitulate, the standard technique to estimate the channel is to
transmit a sequence of training symbols (also called pilot symbols) at the begin-
ning of each frame. This training symbol sequence is known at the receiver and
thus the channel is estimated from the measured outputs to training symbols.
Training based schemes usually have very low complexity making them ideally
suited for implementation in systems (e.g., mobile stations) where the available
computational capacity is limited.
However, the above training-based technique for channel estimation in
MRT based MIMO systems is transmission scheme agnostic. For example, channel
estimation algorithms when MRT is employed at the transmitter only need to
estimate v1 and u1, where v1 and u1 are the dominant eigenvectors of $H^H H$
and $H H^H$ respectively, H is the r × t channel transfer matrix, and r / t are the
number of receive / transmit antennas. Hence, techniques that estimate the entire
H matrix from a set of training symbols and use the estimated H to compute v1
and u1 may be inefficient, compared to techniques designed to use the training
data specifically for estimating the beamforming vectors. Moreover, as r increases,
the mean squared error (MSE) in estimation of $v_1 \in \mathbb{C}^t$ remains constant since
the number of unknown parameters in v1 does not change with r, while that of H
increases since the number of elements, rt, grows linearly with r. Added to this, the
complexity of reliably estimating the channel increases with its dimensionality. The
channel estimation problem is further complicated in MIMO systems because the
SNR per bit required to achieve a given system throughput performance decreases
as the number of antennas is increased. Such low SNR environments call for more
training symbols, lowering the effective data rate.
For the above reasons, semi-blind techniques can enhance the accuracy
of channel estimation by efficiently utilizing not only the known training sym-
bols but also the unknown data symbols. Hence, they can be used to reduce the
amount of training data required to achieve the desired system performance, or
equivalently, achieve better accuracy of estimation for a given number of training
symbols, thereby improving the spectral efficiency and channel throughput. Work
on semi-blind techniques for the design of fractional semi-blind equalizers in multi-
path channels has been reported earlier by Pal in [30,31]. In [28,58] error bounds
and asymptotic properties of blind and semi-blind techniques are analyzed. In
[24, 55], an orthogonal pilot based maximum likelihood (OPML) semi-blind esti-
mation scheme is proposed, where the channel matrix H is factored into the prod-
uct of a whitening matrix W and a unitary rotation matrix Q. W is estimated
from the data using a blind algorithm, while Q is estimated exclusively from the
training data using the OPML algorithm. However, feedback-based transmission
schemes such as MRT pose new challenges for semi-blind estimation, because em-
ployment of the precoder (beamforming vector) corresponding to an erroneous
channel estimate precludes the use of the received data symbols to improve the
channel estimate. This necessitates the development of new transmission schemes
to enable implementation of semi-blind estimation, as shown in Section 5.2-5.2.3.
Furthermore, the proposed techniques specifically estimate the MRT beamforming
vector and hence can potentially achieve better estimation accuracy compared to
techniques that are independent of the transmission scheme.
The contributions of this chapter are as follows. We describe the training-only
based conventional least squares estimation (CLSE) algorithm, and derive
analytical expressions for the MSE in the beamforming vector, the mean received SNR
and the symbol error rate (SER) performance. For improved spectral efficiency
(reduced training overhead), we propose a closed-form semi-blind (CFSB) algorithm
that estimates u1 from the data using a blind algorithm, and estimates v1 exclu-
sively from the training. This necessitates the introduction of a new signal trans-
mission scheme that involves transmission of information-bearing spectrally white
data symbols to enable semi-blind estimation of the beamforming vectors. Expres-
sions are derived for the performance of the proposed CFSB scheme. We show that
given perfect knowledge of u1 (which can be achieved when there are a large num-
ber of white data symbols), the error in estimating v1 using the semi-blind scheme
asymptotically achieves the theoretical Cramer-Rao lower bound (CRB), and thus
the CFSB scheme outperforms the CLSE scheme. However, there is a trade-off in
transmission of white data symbols in semi-blind estimation, since the SER for the
white data is frequently greater than that for the beamformed data. Thus, we show
that there exist scenarios where for a reasonable number of white data symbols,
the gains from beamformed data for this improved estimate in CFSB outweigh
the loss in performance due to transmission of white data. As a more general
estimation method when a given number of blind data symbols are available, we
propose a new scheme that judiciously combines the above described CFSB and
CLSE estimates based on a heuristic criterion. Through Monte-Carlo simulations,
we demonstrate that this proposed linearly-combined semi-blind (LCSB) scheme
outperforms the CLSE and CFSB scheme in terms of both estimation accuracy as
well as SER and thus achieves good performance.
The rest of this chapter is organized as follows. In Section 5.2, we present
the problem setup and notation. We also present both the CFSB and CLSE
schemes in detail. The MSE and the received SNR performance of the CLSE
scheme are derived using a first order perturbation analysis in Section 5.3 and the
performance of the CFSB scheme is analyzed in Section 5.4. In Section 5.5, to
conduct an end-to-end system comparison, we derive the performance of Alamouti
space-time coded data with training-based channel estimation, and present the
proposed LCSB algorithm. We compare the different schemes through Monte-
Carlo simulations in Section 5.6 and present our conclusions in Section 5.7.
5.2 Preliminaries
5.2.1 System Model and Notation
Fig. 5.1 shows the MIMO system model with beamforming at the trans-
mitter and the receiver. We model a flat-fading channel by a complex-valued
channel matrix $H \in \mathbb{C}^{r \times t}$. We assume that H is quasi-static and constant over
the period of one transmission block. We denote the singular value decomposition
(SVD) of H by $H = U \Sigma V^H$, where $\Sigma \in \mathbb{R}^{r \times t}$ contains the singular values
$\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_m > 0$ along the diagonal, and m = rank(H). Let v1
and u1 denote the first columns of V and U, respectively.
The channel input-output relation at time instant k is
\[
y_k = H x_k + \eta_k, \tag{5.1}
\]
where $x_k \in \mathbb{C}^t$ is the channel input, $y_k \in \mathbb{C}^r$ is the channel output, and $\eta_k \in \mathbb{C}^r$
is the spatially and temporally white noise vector with i.i.d. zero mean circularly
symmetric complex Gaussian (ZMCSCG) entries. The input $x_k$ could denote either
data or training symbols. Also, we let the noise power in each receive antenna
be unity, that is, $E\{\eta_k \eta_k^H\} = I_r$, where $E\{\cdot\}$ denotes the expectation operation,
and $I_r$ is the r × r identity matrix.
Let L training symbol vectors be transmitted at an average power $P_T$ per
vector (T stands for 'training'). The training symbols are stacked together to form
a training symbol matrix $X_p \in \mathbb{C}^{t \times L}$ as $X_p = [x_1, x_2, \ldots, x_L]$ (p stands for 'pilot').
We employ orthogonal training sequences because of their optimality properties in
channel estimation [40]. That is, $X_p X_p^H = \gamma_p I_t$, where $\gamma_p \triangleq L P_T / t$, thus maintaining
the training power of $P_T$. The data symbols $x_k$ could either be spatially-white
(i.e., $E\{x_k x_k^H\} = (P_D / t)\, I_t$), or they could be the result of using beamforming at the
transmitter with unit-norm weight vector $w \in \mathbb{C}^{t \times 1}$ (i.e., $E\{x_k x_k^H\} = P_D\, w w^H$),
where the data transmit power is $E\{x_k^H x_k\} = P_D$ (D stands for 'data'). We let
N denote the number of spatially-white data symbols transmitted, that is, a total
of N + L symbols are transmitted prior to transmitting beamformed data. Note
that the N white data symbols carry (unknown) information bits, and hence are
not a waste of available bandwidth.
In this chapter, we restrict our attention to the case where the transmitter
employs MRT to send data, that is, a single data stream is transmitted over t
transmit antennas after passing through a beamformer w. Given the channel
matrix H, the optimum choice of w is v1 [61]. Thus, MRT only needs an accurate
estimate of v1 to be fed-back to the transmitter. We assume that t ≥ 2, since
when t = 1, estimation of the beamforming vector has no relevance. Finally,
we will compare the performance of different estimation techniques using several
different measures, namely, the MSE in the estimate of v1, the gain (rather, the
power amplification/attenuation), and the symbol error rate (SER) of the one-
dimensional channel resulting from beamforming with the estimated vector v1
assuming uncoded M-ary QAM transmission. The performance of a practical
communication system would also be affected by factors such as quantization error in v1,
errors in the feedback channel, feedback delay in time-varying environments, etc.,
and a detailed study of these factors warrants separate treatment.
5.2.2 Conventional Least Squares Estimation (CLSE)
Conventionally, an ML estimate of the channel matrix, $H_c$, is first obtained
from the training data as the solution to the following least squares problem:
\[
H_c = \arg\min_{G \in \mathbb{C}^{r \times t}} \| Y_p - G X_p \|_F^2, \tag{5.2}
\]
where $\|\cdot\|_F$ represents the Frobenius norm, $Y_p$ is the r × L matrix of received symbols
given by $Y_p = H X_p + \eta_p$, and $\eta_p \in \mathbb{C}^{r \times L}$ is the set of AWGN (spatially and
temporally white) vectors. From [15], the solution to this least squares estimation
problem can be shown to be $H_c = Y_p X_p^{\dagger}$, where $X_p^{\dagger}$ is the Moore-Penrose generalized
inverse of $X_p$. Since orthogonal training sequences are employed, we have
$X_p^{\dagger} = \frac{1}{\gamma_p} X_p^H$, and consequently
\[
H_c = \frac{1}{\gamma_p} Y_p X_p^H. \tag{5.3}
\]
The ML estimates of v1 and u1, denoted $v_c$ and $u_c$ respectively, are now obtained
via an SVD of the estimated channel matrix $H_c$. Since $H_c$ is the ML estimate of
H, from properties of ML estimation of principal components [62], the $v_c$ obtained
by this technique is also the ML estimate of v1 given only the training data.
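As a minimal illustration of (5.3), the sketch below forms the least squares estimate $H_c = Y_p X_p^H / \gamma_p$ in plain Python and checks that it recovers H exactly when the received pilots are noise-free. The pilot matrix built from scaled DFT rows is an assumption made here for convenience (the chapter only requires $X_p X_p^H = \gamma_p I_t$), and all function names are hypothetical.

```python
import cmath
import math
import random


def matmul(a, b):
    """Product of two matrices stored as lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def herm(a):
    """Hermitian (conjugate) transpose."""
    return [[a[i][j].conjugate() for i in range(len(a))]
            for j in range(len(a[0]))]


def dft_pilots(t, L, gamma_p):
    """Assumed pilot design: scaled DFT rows, so X X^H = gamma_p * I_t (needs L >= t)."""
    scale = math.sqrt(gamma_p / L)
    return [[scale * cmath.exp(2j * math.pi * i * k / L) for k in range(L)]
            for i in range(t)]


def clse_channel_estimate(Y, X, gamma_p):
    """Equation (5.3): H_c = Y X^H / gamma_p for orthogonal pilots."""
    return [[v / gamma_p for v in row] for row in matmul(Y, herm(X))]


rng = random.Random(1)
r, t, L, gamma_p = 3, 2, 4, 8.0
H = [[complex(rng.gauss(0, 1), rng.gauss(0, 1)) for _ in range(t)]
     for _ in range(r)]
X = dft_pilots(t, L, gamma_p)
Y = matmul(H, X)                      # noise-free received pilots
H_c = clse_channel_estimate(Y, X, gamma_p)
max_err = max(abs(H_c[i][j] - H[i][j]) for i in range(r) for j in range(t))
print(max_err < 1e-9)
```

With noise added to Y, the same function returns $H + E_p$ with $E_p = \eta_p X_p^H / \gamma_p$, which is the perturbation analyzed in Section 5.3.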
5.2.3 Semi-Blind Estimation
In the scenario that the transmitted data symbols are spatially-white, the
ML estimate of u1 is the dominant eigenvector of the output correlation matrix
$R_y$, which is estimated as $\hat{R}_y = \sum_{i=1}^{N} y_i y_i^H$. Now, the estimate of u1 is obtained
by computing the following SVD:
\[
\hat{U} \hat{\Sigma}^2 \hat{U}^H = \hat{R}_y. \tag{5.4}
\]
Note that it is possible to use the entire received data to compute $\hat{R}_y$ in (5.4)
rather than just the data symbols; in this case, N should be changed to N + L.
The estimate of u1, denoted $u_s$ (the subscript 's' stands for semi-blind), is thus
computed blind from the received data as the first column of $\hat{U}$. As N grows, a
near perfect estimate of u1 can be obtained.
In order to estimate u1 as described above, it is necessary that the transmitted
symbols be spatially-white. If the transmitter uses any (single) beamforming
vector w, the expected value of the correlation at the receiver is
$Hw(Hw)^H = H w w^H H^H \neq H H^H$, and hence, the estimated eigenvector will be a vector
proportional to Hw instead of u1. Fig. 5.2 shows a schematic representation of the
CLSE and the CFSB schemes. Thus, the CFSB scheme involves a two-phase
data transmission: spatially-white data followed by beamformed data. White data
transmission could lead to a loss of performance relative to beamformed data, but
this performance loss can be compensated for by the gain obtained from the im-
proved estimate of the MRT beamforming vector. Thus, the semi-blind scheme can
have an overall better performance than the CLSE scheme. Section 5.5 presents
an overall SER comparison in a practical scenario, after accounting for the perfor-
mance of the white data as well as for the beamformed data.
Having obtained the estimate of u1 from the white data, the training
symbols are now exclusively used to estimate v1. Since the vector v1 has fewer
real parameters (2t−1) than the channel matrix H (2rt), it is expected to achieve a
greater accuracy of estimation for the same number of training symbols, compared
to the CLSE technique which requires an accurate estimate of the full H matrix in
order to estimate v1 accurately. If u1 is estimated perfectly from the blind data,
the received training symbols can be filtered by $u_1^H$ to obtain
\[
u_1^H Y_p = \sigma_1 v_1^H X_p + u_1^H \eta_p. \tag{5.5}
\]
Since $\|u_1\| = 1$ (here $\|\cdot\|$ represents the 2-norm), the statistics of the Gaussian
noise $\eta_p$ are unchanged by the above operation. We seek the estimate of v1 as the
solution to the following least squares problem
\[
v_s = \arg\min_{v \in \mathbb{C}^t,\ \|v\| = 1} \left\| u_1^H Y_p - \sigma_1 v^H X_p \right\|^2, \tag{5.6}
\]
where $v_s$ denotes the semi-blind estimate of v1. The following lemma establishes
the solution.
Lemma 1. If $X_p$ satisfies $X_p X_p^H = \gamma_p I_t$, the least squares estimate of v1 (under
$\|v_1\| = 1$) given perfect knowledge of u1 is
\[
v_s = \frac{X_p Y_p^H u_1}{\left\| X_p Y_p^H u_1 \right\|}. \tag{5.7}
\]
Proof. See Appendix 5.7.1.
Closed-Form Semi-Blind Estimation Algorithm (CFSB)
Based on the above observations, the proposed CFSB algorithm is as
follows. First, we obtain us, the estimate of u1, from (5.4). Then, we estimate
v1 from the L training symbols by substituting us for u1 in (5.7). This requires
L + N symbols to actually estimate v1, however, N of these symbols are data
symbols (which carry information bits). Hence, we can potentially achieve the
desired accuracy of estimation of v1 using fewer training symbols compared to the
CLSE technique.
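The two steps of the CFSB algorithm can be sketched in dependency-free Python as follows. Several choices here are assumptions made for illustration and are not taken from the text: the dominant eigenvector in (5.4) is found by power iteration instead of an SVD, the pilots are scaled DFT rows, the white data are QPSK, and the 2 × 2 test channel with singular values 3 and 1 is arbitrary. All function names are hypothetical.

```python
import cmath
import math
import random


def cgauss(rng):
    """Unit-variance circularly symmetric complex Gaussian sample."""
    return complex(rng.gauss(0, 1), rng.gauss(0, 1)) / math.sqrt(2)


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def unit(v):
    norm = math.sqrt(sum(abs(c) ** 2 for c in v))
    return [c / norm for c in v]


def dominant_eigvec(R, iters=200):
    """Power iteration, used here as a stand-in for the SVD in (5.4)."""
    n = len(R)
    u = unit([1.0 + 0j] * n)
    for _ in range(iters):
        u = unit([sum(R[a][b] * u[b] for b in range(n)) for a in range(n)])
    return u


def cfsb(Y_data, Y_pilot, X_pilot):
    r, N, L = len(Y_data), len(Y_data[0]), len(X_pilot[0])
    # Step 1: sample correlation sum_i y_i y_i^H and its dominant eigenvector u_s
    R = [[sum(Y_data[a][i] * Y_data[b][i].conjugate() for i in range(N))
          for b in range(r)] for a in range(r)]
    u_s = dominant_eigvec(R)
    # Step 2: z = u_s^H Y_p, then v_s = X_p z^H / ||X_p z^H|| as in (5.7)
    z = [sum(u_s[a].conjugate() * Y_pilot[a][k] for a in range(r))
         for k in range(L)]
    v_s = unit([sum(x * zk.conjugate() for x, zk in zip(row, z))
                for row in X_pilot])
    return u_s, v_s


rng = random.Random(7)
s = 1 / math.sqrt(2)
u1, u2 = [s, s], [s, -s]
v1, v2 = [s, 1j * s], [s, -1j * s]
# Illustrative channel H = 3 u1 v1^H + 1 u2 v2^H
H = [[3 * u1[a] * v1[b].conjugate() + u2[a] * v2[b].conjugate()
      for b in range(2)] for a in range(2)]

t, N, L, P_D, gamma_p = 2, 2000, 4, 10.0, 20.0
amp = math.sqrt(P_D / t)   # per-antenna amplitude of white QPSK data
X_data = [[amp * complex(rng.choice((-1, 1)), rng.choice((-1, 1))) / math.sqrt(2)
           for _ in range(N)] for _ in range(t)]
X_pilot = [[math.sqrt(gamma_p / L) * cmath.exp(2j * math.pi * i * k / L)
            for k in range(L)] for i in range(t)]
Y_data = [[y + cgauss(rng) for y in row] for row in matmul(H, X_data)]
Y_pilot = [[y + cgauss(rng) for y in row] for row in matmul(H, X_pilot)]

u_s, v_s = cfsb(Y_data, Y_pilot, X_pilot)
align_u = abs(sum(a.conjugate() * b for a, b in zip(u1, u_s)))
align_v = abs(sum(a.conjugate() * b for a, b in zip(v1, v_s)))
print(align_u, align_v)
```

The estimates are compared through the magnitudes of the inner products with u1 and v1 because a dominant eigenvector is only defined up to a phase, and that phase carries over from $u_s$ to $v_s$.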
An alternative to employing u1 at the receiver is to employ maximum ratio
combining (MRC), i.e., to use an estimate of Hv1/ ‖Hv1‖ (which can be accurately
estimated as the dominant eigenvector of the sample covariance matrix of the
beamformed data). The performance of such a scheme is summarized in [63], and
the analysis can be carried out along the lines presented in this chapter.
5.3 Conventional Least Squares Estimation (CLSE)
5.3.1 Perturbation of Eigenvectors
We recapitulate a result from matrix perturbation theory [10] that we
will use frequently in the sequel. Consider a first order perturbation of a Hermitian
symmetric matrix R by an error matrix $\Delta R$ to get $\hat{R}$, that is, $\hat{R} = R + \Delta R$. Then,
if the eigenvalues of R are distinct, for small perturbations, the eigenvectors $\hat{s}_k$ of
$\hat{R}$ can be approximately expressed in terms of the eigenvectors $s_k$ of R as
\[
\hat{s}_k = s_k + \sum_{\substack{r=1 \\ r \neq k}}^{n} \frac{s_r^H \Delta R\, s_k}{\lambda_k - \lambda_r}\, s_r, \tag{5.8}
\]
where n is the rank of R, $\lambda_k$ is the k-th eigenvalue of R, and $\lambda_k \neq \lambda_j$, $k \neq j$.
When k = 1, we have $\hat{s}_1 = S d$, where $S = [s_1, s_2, \ldots, s_n]$ is the matrix
of eigenvectors and $d = \left[1, \frac{s_2^H \Delta R\, s_1}{\lambda_1 - \lambda_2}, \ldots, \frac{s_n^H \Delta R\, s_1}{\lambda_1 - \lambda_n}\right]^T$. One could scale the vector $\hat{s}_1$
to construct a unit-norm vector as $\tilde{s}_1 = \hat{s}_1 / \|\hat{s}_1\|$. Then, $\tilde{s}_1 = S \tilde{d}$, where
$\tilde{d} = d / \|d\| = [1 + \Delta d_1, \Delta d_2, \ldots, \Delta d_n]^T$. Following an approach similar to [64], if $\Delta d_i$
are small, since $\|\tilde{d}\| = 1$, the components $\Delta d_i$ are approximately given by
\[
\Delta d_i \simeq \frac{s_i^H \Delta R\, s_1}{\lambda_1 - \lambda_i}, \quad i = 2, \ldots, n, \qquad
\Delta d_1 \simeq -\frac{1}{2} \sum_{i=1}^{n} |\Delta d_i|^2. \tag{5.9}
\]
Note that $\Delta d_1$ is real, and is a higher-order term compared to $\Delta d_i$, $i \ge 2$.
We will use this fact in our first-order approximations to ignore terms such as
$|\Delta d_1|^2, |\Delta d_1|^3, \ldots$ and $|\Delta d_i|^3, |\Delta d_i|^4, \ldots$, $i \ge 2$. In the sequel, we assume that
the dominant singular value of H is distinct, so the conditions required for the
above result are valid.
5.3.2 MSE in vc
To compute the MSE in $v_c$, we use (5.3) to write the matrix $H_c^H H_c$ as a
perturbation of $H^H H$ and use the above matrix perturbation result to derive the
desired expressions.
\[
H_c^H H_c = V \Sigma^2 V^H + E_t, \tag{5.10}
\]
where $E_t \approx \left[ V \Sigma U^H E_p + E_p^H U \Sigma V^H \right]$ with $E_p = \frac{1}{\gamma_p} \eta_p X_p^H$. Here, we have ignored
the $E_p^H E_p$ term in writing the expression for $E_t$, since it is a second order term due
to the $\frac{1}{\gamma_p}$ factor in $E_p$. Now, we can regard $E_t$ as a perturbation of the matrix $H^H H$.
As seen in Section 5.2(5.2.2), $v_c$ is estimated from the SVD of $H_c$. Since the basis
vectors V span $\mathbb{C}^t$, we can let $v_c = V d$, and write $d = [1 + \Delta d_1, \Delta d_2, \ldots, \Delta d_t]^T$
as a perturbation of $[1, 0, \ldots, 0]^T$.
For $i \ge 2$, $\Delta d_i$ is obtained from (5.9) as
\[
\Delta d_i = \frac{v_i^H E_t v_1}{\sigma_1^2 - \sigma_i^2}
= \frac{\sigma_i u_i^H E_p v_1 + \sigma_1 v_i^H E_p^H u_1}{\sigma_1^2 - \sigma_i^2}. \tag{5.11}
\]
Note that, if r < t, we have $\sigma_i = 0$, i > r, hence, $\Delta d_i = v_i^H E_p^H u_1 / \sigma_1$, for i > r.
Therefore, to simplify notation, we can define $u_i \triangleq 0_{r \times 1}$ and $v_j \triangleq 0_{t \times 1}$, for i > r
and j > t respectively. The following result is used to find $E\{|\Delta d_i|^2\}$.
Lemma 2. Let $\mu_1, \mu_2 \in \mathbb{C}$ be fixed complex numbers. Let $\sigma_p^2 = \frac{1}{\gamma_p}$ denote the
variance of one of the elements of $E_p$. Then,
\[
E\left\{ \left| \mu_1 u_i^H E_p v_j + \mu_2 v_i^H E_p^H u_j \right|^2 \right\} = \sigma_p^2 \left( |\mu_1|^2 + |\mu_2|^2 \right), \tag{5.12}
\]
for any $1 \le i \le r$, $1 \le j \le t$.
Proof. Let $a \triangleq u_i^H E_p v_j$ and $b \triangleq v_i^H E_p^H u_j$. Then, from lemma 6 in Section
5.7.5 of the Appendix, a and b are circularly symmetric random variables. Since
$E_p$ is circularly symmetric ($E\{E_p(i, j) E_p(k, l)\} = 0$, $\forall\, i, j, k, l$) and a and $b^*$ are
both linear combinations of elements of $E_p$, we have $E\{a b^*\} = 0$. Finally, since
$\|u_i\| = \|v_j\| = 1$, the variances of a and b are equal, and $\sigma_a^2 = \sigma_b^2 = \sigma_p^2$. Substituting,
we have
\[
E\left\{ \left| \mu_1 u_i^H E_p v_j + \mu_2 v_i^H E_p^H u_j \right|^2 \right\}
= |\mu_1|^2 \sigma_a^2 + |\mu_2|^2 \sigma_b^2 = \sigma_p^2 \left( |\mu_1|^2 + |\mu_2|^2 \right).
\]
Using the above lemma with $\mu_1 = \sigma_i$, $\mu_2 = \sigma_1$ and j = 1, we get, for
$i \ge 2$,
\[
E\{|\Delta d_i|^2\} = \sigma_p^2\, \frac{\sigma_1^2 + \sigma_i^2}{(\sigma_1^2 - \sigma_i^2)^2}, \tag{5.13}
\]
where the expectation is taken with respect to the AWGN term $\eta_p$. The following
lemma helps simplify the expression further. We omit the proof, as it is
straightforward.
Lemma 3. If $v_c = V d$, then
\[
\|v_c - v_1\|^2 = 2\,(1 - \mathrm{Re}(d_1)) = -(\Delta d_1 + \Delta d_1^*), \tag{5.14}
\]
where $d_1 = 1 + \Delta d_1$ is the first element of d.
Using (5.13) in (5.9) and substituting into (5.14), the final estimation
error is
\[
E\{\|v_c - v_1\|^2\} = \frac{1}{\gamma_p} \sum_{i=2}^{t} \frac{\sigma_1^2 + \sigma_i^2}{(\sigma_1^2 - \sigma_i^2)^2}. \tag{5.15}
\]
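For a given set of singular values, (5.15) is straightforward to evaluate. The helper below is a hypothetical sketch that pads the singular values with zeros up to t, following the convention that $\sigma_i = 0$ beyond the rank of H; the numerical inputs are arbitrary.

```python
def clse_mse(singular_values, t, gamma_p):
    """Evaluate (5.15); singular values beyond rank(H) are taken as zero."""
    s = sorted(singular_values, reverse=True)
    s = s + [0.0] * (t - len(s))
    s1_sq = s[0] ** 2
    return sum((s1_sq + si ** 2) / (s1_sq - si ** 2) ** 2
               for si in s[1:t]) / gamma_p


# Singular values (3, 1), t = 2 transmit antennas, gamma_p = 100 (20 dB)
print(clse_mse([3.0, 1.0], t=2, gamma_p=100.0))
# Rank-one channel with sigma_1 = 2 and t = 3
print(clse_mse([2.0], t=3, gamma_p=10.0))
```

The expression grows without bound as $\sigma_2$ approaches $\sigma_1$, which reflects the requirement that the dominant singular value be distinct.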
5.3.3 Received SNR and Symbol Error Rate (SER)
In this section, we derive the expression for the received SNR when beamforming
using $v_c$ at the transmitter and filtering using $u_c$ at the receiver. Since
the unitary matrices V and U span $\mathbb{C}^t$ and $\mathbb{C}^r$, $v_c$ and $u_c$ can be expressed as
$u_c = U c$ and $v_c = V d$ respectively. Borrowing notation from Section 5.3-5.3.2, let
$c = [1 + \Delta c_1, \Delta c_2, \ldots, \Delta c_r]^T \in \mathbb{C}^r$ and $d = [1 + \Delta d_1, \Delta d_2, \ldots, \Delta d_t]^T \in \mathbb{C}^t$
respectively. Then, c can be derived by a perturbation analysis on $H_c H_c^H$ analogous to
that in (5.10) in Section 5.3-5.3.2. We obtain
\[
\Delta c_i = \frac{\sigma_i v_i^H E_p^H u_1 + \sigma_1 u_i^H E_p v_1}{\sigma_1^2 - \sigma_i^2},
\]
where, as before, we define $\sigma_i = 0$, and $u_i \triangleq 0_{r \times 1}$, $v_j \triangleq 0_{t \times 1}$, for i > r and j > t
respectively, so that $\Delta c_i = 0$; i > r, as expected. The channel gain is given by
\[
u_c^H H v_c = c^H \Sigma d = \sigma_1 (1 + \Delta d_1)(1 + \Delta c_1^*) + \sum_{i=2}^{t} \sigma_i\, \Delta c_i^* \Delta d_i.
\]
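Ignoring higher order terms (cf. Section 5.3(5.3.1)), the power amplification
$\rho_c \triangleq E\{|u_c^H H v_c|^2\}$ is
\[
\rho_c \approx \sigma_1^2\, E\left\{ 1 + (\Delta d_1 + \Delta d_1^*) + (\Delta c_1 + \Delta c_1^*)
+ \sum_{i=2}^{t} \frac{\sigma_i}{\sigma_1} \left( \Delta c_i \Delta d_i^* + \Delta c_i^* \Delta d_i \right) \right\}. \tag{5.16}
\]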
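From (5.9) and (5.13), we have
\[
E\{\Delta d_1 + \Delta d_1^*\} = -\frac{1}{\gamma_p} \sum_{i=2}^{t} \frac{\sigma_1^2 + \sigma_i^2}{(\sigma_1^2 - \sigma_i^2)^2}; \qquad
E\{\Delta c_1 + \Delta c_1^*\} = -\frac{1}{\gamma_p} \sum_{i=2}^{r} \frac{\sigma_1^2 + \sigma_i^2}{(\sigma_1^2 - \sigma_i^2)^2}.
\]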
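Now, $E\{\Delta c_i \Delta d_i^*\}$ can be written as
\[
\begin{aligned}
E\{\Delta c_i \Delta d_i^*\}
&= E\left\{ \left( \frac{\sigma_i v_i^H E_p^H u_1 + \sigma_1 u_i^H E_p v_1}{\sigma_1^2 - \sigma_i^2} \right)
\left( \frac{\sigma_i v_1^H E_p^H u_i + \sigma_1 u_1^H E_p v_i}{\sigma_1^2 - \sigma_i^2} \right) \right\} \\
&= E\left\{ \frac{\sigma_1 \sigma_i \left( \left| u_i^H E_p v_1 \right|^2 + \left| v_i^H E_p^H u_1 \right|^2 \right)}{(\sigma_1^2 - \sigma_i^2)^2} \right\} \\
&= \frac{2 \sigma_p^2 \sigma_1 \sigma_i}{(\sigma_1^2 - \sigma_i^2)^2}
= \frac{2 \sigma_1 \sigma_i}{\gamma_p (\sigma_1^2 - \sigma_i^2)^2}.
\end{aligned}
\]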
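And likewise for $\Delta c_i^* \Delta d_i$. Denoting $m \triangleq \mathrm{rank}(H)$, the power amplification is
\[
\begin{aligned}
\rho_c &= \sigma_1^2 \left( 1 - \frac{1}{\gamma_p} \sum_{i=2}^{t} \frac{\sigma_1^2 + \sigma_i^2}{(\sigma_1^2 - \sigma_i^2)^2}
- \frac{1}{\gamma_p} \sum_{i=2}^{r} \frac{\sigma_1^2 + \sigma_i^2}{(\sigma_1^2 - \sigma_i^2)^2}
+ \frac{4}{\gamma_p} \sum_{i=2}^{m} \frac{\sigma_i^2}{(\sigma_1^2 - \sigma_i^2)^2} \right) \\
&= \sigma_1^2 - \frac{2}{\gamma_p} \sum_{i=2}^{m} \frac{\sigma_1^2}{\sigma_1^2 - \sigma_i^2} - \frac{1}{\gamma_p} (r + t - 2m).
\end{aligned} \tag{5.17}
\]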
In obtaining (5.17), we have used the fact that $\sigma_i = 0$ for i > m, where m =
rank(H). Finally, the received SNR is
\[
\mathrm{SNR} = \rho_c P_D, \tag{5.18}
\]
where $P_D$ is the power per data symbol. The power amplification with perfect
knowledge of H at the transmitter and the receiver is $\rho_p \triangleq \sigma_1^2$. As $\gamma_p = L P_T / t$
increases, $\rho_c$ approaches $\rho_p$. Note that, when r = 1, the above expression simplifies
to $\rho_c = \rho_p - \frac{1}{\gamma_p}(t - 1)$. Also, $\sum_{i=2}^{m} \frac{\sigma_1^2}{\sigma_1^2 - \sigma_i^2} \ge (m - 1)$ since $\sigma_1 \ge \sigma_i$. Hence, if r = t, the
CLSE performs best when the channel is spatially single dimensional (for example,
in keyhole channels or highly correlated channels), that is, $\sigma_i = 0$, $i \ge 2$. In this
case, we have $\rho_c = \rho_p - \frac{2}{\gamma_p}(t - 1)$. At the other extreme, if the dominant singular
values are very close to each other such that $(\sigma_1^2 - \sigma_2^2) < 2/\gamma_p$, the analysis is
incorrect because it requires that the dominant singular values of H be sufficiently
separated. For Rayleigh fading channels, i.e., H has i.i.d. ZMCSCG entries of unit
variance, we can numerically evaluate the probability $\Pr\{\sigma_1^2 - \sigma_2^2 < 2/\gamma_p\}$ to be
approximately $1.7 \times 10^{-4}$, with r = t = 4 and a typical value of $\gamma_p = 10$ dB. Thus,
the above analysis is valid for most channel instantiations.
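The two forms of (5.17) can be compared numerically as a sanity check on the simplification. The sketch below uses hypothetical function names and arbitrary singular values; `sv` holds the m nonzero singular values in decreasing order, and zeros are appended where the sums run beyond m.

```python
def rho_c_expanded(sv, r, t, gamma_p):
    """First line of (5.17)."""
    m = len(sv)
    s1 = sv[0] ** 2

    def squares(n):
        # sigma_i^2 for i = 2, ..., n, with zeros beyond the rank m
        return [x ** 2 for x in sv[1:]] + [0.0] * (n - m)

    def err(n):
        return sum((s1 + si) / (s1 - si) ** 2 for si in squares(n))

    cross = sum(si / (s1 - si) ** 2 for si in squares(m))
    return s1 * (1 - err(t) / gamma_p - err(r) / gamma_p + 4 * cross / gamma_p)


def rho_c_simplified(sv, r, t, gamma_p):
    """Second line of (5.17)."""
    m = len(sv)
    s1 = sv[0] ** 2
    gap_sum = sum(s1 / (s1 - x ** 2) for x in sv[1:])
    return s1 - 2 * gap_sum / gamma_p - (r + t - 2 * m) / gamma_p


sv = [3.0, 2.0, 1.0]   # illustrative singular values, m = 3
print(rho_c_expanded(sv, r=4, t=3, gamma_p=10.0))
print(rho_c_simplified(sv, r=4, t=3, gamma_p=10.0))
```

Both lines agree, and for r = 1 the simplified form reduces to $\rho_p - (t - 1)/\gamma_p$ as noted in the text.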
Having determined the expected received SNR for a given channel
instantiation, assuming uncoded M-ary QAM transmission, the corresponding SER $P_M$
is given as [3]
\[
P_{\sqrt{M}}(\rho_c) = 2 \left[ 1 - \frac{1}{\sqrt{M}} \right] Q\left( \sqrt{\frac{3 \rho_c P_T}{M - 1}} \right) \tag{5.19}
\]
\[
P_M(\rho_c) = 1 - \left( 1 - P_{\sqrt{M}}(\rho_c) \right)^2, \tag{5.20}
\]
where $Q(\cdot)$ is the Gaussian Q-function, and $\rho_c$ is given by (5.17). The above
expression can now be averaged over the probability density function of $\sigma_i^2$ through
numerical integration.
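A short sketch of (5.19) and (5.20) follows. The function takes the received SNR (power amplification times symbol power) as a single argument, which is a simplification made here, and the Q-function is computed from the complementary error function in the standard library; the function names are hypothetical.

```python
import math


def q_function(x):
    """Gaussian Q-function via the complementary error function."""
    return 0.5 * math.erfc(x / math.sqrt(2))


def ser_mqam(snr, M):
    """SER of square M-ary QAM at the given received SNR, per (5.19)-(5.20)."""
    root_m = math.sqrt(M)
    p_root = 2 * (1 - 1 / root_m) * q_function(math.sqrt(3 * snr / (M - 1)))
    return 1 - (1 - p_root) ** 2


for snr_db in (0, 5, 10, 15):
    snr = 10 ** (snr_db / 10)
    print(snr_db, ser_mqam(snr, M=4), ser_mqam(snr, M=16))
```

Averaging over the channel would then amount to evaluating `ser_mqam` at the SNR obtained from (5.17) and (5.18) for each channel draw, or integrating numerically against the density of the singular values.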
5.4 Closed-Form Semi-Blind estimation (CFSB)
First, recall that the first order Taylor expansion of a function of two
variables g(x, y) is given by
\[
\begin{aligned}
g(x + \Delta x, y + \Delta y) - g(x, y)
&= \frac{\partial g(x, y)}{\partial x} \Delta x + \frac{\partial g(x, y)}{\partial y} \Delta y + O(\Delta x^2) + O(\Delta y^2) \\
&\approx [g(x + \Delta x, y) - g(x, y)] + [g(x, y + \Delta y) - g(x, y)].
\end{aligned}
\]
Now, in CFSB, the error in v1 (or loss in SNR) occurs due to two reasons: first,
the noise in the received training symbols, and second, the use of an imperfect
estimate of u1 (from the noise in the data symbols and availability of only a finite
number N of unknown white data). More precisely, let the estimator of v1 be
expressed as a function $v_s = f(Y_p, u_s)$ of the two variables $Y_p$ and $u_s$. Using the
above expansion, we have
\[
f(Y_p, u_s) - f(H X_p, u_1) \approx [f(Y_p, u_1) - f(H X_p, u_1)] + [f(H X_p, u_s) - f(H X_p, u_1)], \tag{5.21}
\]
where $v_s = f(Y_p, u_s)$ and, from (5.7), $v_1 = f(H X_p, u_1)$. Since the training noise
$\eta_p$ and the error in the estimate $u_s$ are mutually independent, we get
\[
E\{\|v_s - v_1\|^2\} \approx
\underbrace{E\{\|f(Y_p, u_1) - f(H X_p, u_1)\|^2\}}_{T_1}
+ \underbrace{E\{\|f(H X_p, u_s) - f(H X_p, u_1)\|^2\}}_{T_2}. \tag{5.22}
\]
Note that the term $T_1$ represents the MSE in $v_s$ as if the receiver had perfect
knowledge of u1 (i.e., $u_s = u_1$), and the term $T_2$ represents the MSE in $v_s$ when
the training symbols are noise-free (i.e., $Y_p = H X_p$). Hence, the error in $v_s$ can
be thought of as the sum of two terms: the first one being the error due to the
noise in the training data, and the second being the error due to the noise
in the white (unknown) data. A similar decomposition can be used to express the loss in
channel gain (relative to $\sigma_1$).
5.4.1 MSE in vs with Perfect us
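In this section we consider the error arising exclusively from the training
noise, by setting $\hat{u}_s = u_1$. Let $\tilde{v}_s$ be defined as $\tilde{v}_s \triangleq \frac{X_p Y_p^H u_1}{\sigma_1\gamma_p}$. Then, from (5.5),
\[ \tilde{v}_s = v_1 + \frac{E_p^H u_1}{\sigma_1}, \]
where $E_p \triangleq \eta_p X_p^H/\gamma_p$, as before. Recall from (5.7) that $\hat{v}_s = \tilde{v}_s/\|\tilde{v}_s\|$. Now, $\|\tilde{v}_s\|$ can
be simplified as $\|\tilde{v}_s\|^2 \simeq 1 + \left(u_1^H E_p v_1 + v_1^H E_p^H u_1\right)/\sigma_1$, whence we get
\[ \hat{v}_s \simeq \left(v_1 + \frac{E_p^H u_1}{\sigma_1}\right)\left[1 - \frac{1}{2\sigma_1}\left(u_1^H E_p v_1 + v_1^H E_p^H u_1\right)\right]. \]
Ignoring terms of order $E_p^H E_p$ and simplifying, the MSE in $\hat{v}_s$ is
\[ \hat{v}_s - v_1 \simeq \frac{E_p^H u_1}{\sigma_1} - \frac{1}{2\sigma_1}\left(u_1^H E_p v_1 + v_1^H E_p^H u_1\right)v_1 \]
\[ \|\hat{v}_s - v_1\|^2 = \frac{\|E_p^H u_1\|^2}{\sigma_1^2} - \frac{1}{4\sigma_1^2}\left|u_1^H E_p v_1 + v_1^H E_p^H u_1\right|^2. \tag{5.23} \]
Taking expectation and simplifying the above expression using Lemma 2, we get
\[ E\|\hat{v}_s - v_1\|^2 = \frac{1}{2\gamma_p\sigma_1^2}(2t-1). \tag{5.24} \]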
Interestingly, the above expression is the Cramer-Rao lower bound (CRB) for the
estimation of v1 assuming perfect knowledge of u1, which we prove in the following
theorem.
Theorem 5. The error given in (5.24) is the CRB for the estimation of v1 under
perfect knowledge of u1.
Proof. From (5.36), the effective SNR for estimation of $v_1$ is $\gamma_s = \gamma_p\sigma_1^2$. From the
results derived for the CRB with constrained parameters [23, 65], since $\bar{X}_p\bar{X}_p^H = I_t/\gamma_p$,
the estimation error in $v_1$ is proportional to the number of parameters,
which equals 2t − 1 as $v_1$ is a t-dimensional complex vector with one constraint
($\|v_1\| = 1$). The estimation error is given by
\[ E\|\hat{v}_s - v_1\|^2 = \frac{1}{2\gamma_s}\,(\text{Num. Parameters}) = \frac{1}{2\gamma_p\sigma_1^2}(2t-1), \tag{5.25} \]
which agrees with the ML error derived in (5.24).
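The closed form (5.24) can be checked with a short Monte Carlo experiment. The sketch below is illustrative only and rests on simplifying assumptions that are not part of the original text: coordinates are rotated so that $v_1 = e_1$, and $E_p^H u_1$ is drawn directly as a vector of i.i.d. circular complex Gaussian entries with variance $1/\gamma_p$ (which is what orthogonal training and unit-variance noise imply). The function names are hypothetical.

```python
import math
import random


def crb_perfect_u1(t, gamma_p, sigma1):
    """Closed form (5.24): (2t - 1) / (2 * gamma_p * sigma_1^2)."""
    return (2 * t - 1) / (2.0 * gamma_p * sigma1 ** 2)


def simulate_mse_perfect_u1(t, gamma_p, sigma1, trials, seed=0):
    """Monte Carlo estimate of E||v_hat_s - v_1||^2 with perfect u_1.

    Assumption of this sketch: v_1 = e_1 and E_p^H u_1 has i.i.d.
    CN(0, 1/gamma_p) entries.
    """
    rng = random.Random(seed)
    scale = 1.0 / (sigma1 * math.sqrt(gamma_p))
    std = math.sqrt(0.5)  # per-component std of a unit-variance complex Gaussian
    total = 0.0
    for _ in range(trials):
        # v_tilde = v_1 + E_p^H u_1 / sigma_1
        v = [complex(rng.gauss(0.0, std), rng.gauss(0.0, std)) * scale
             for _ in range(t)]
        v[0] += 1.0
        norm = math.sqrt(sum(abs(x) ** 2 for x in v))
        v_hat = [x / norm for x in v]  # normalisation step of (5.7)
        # squared Euclidean error against v_1 = e_1
        total += abs(v_hat[0] - 1.0) ** 2 + sum(abs(x) ** 2 for x in v_hat[1:])
    return total / trials


if __name__ == "__main__":
    t, gamma_p, sigma1 = 4, 100.0, 1.0
    print("simulated:", simulate_mse_perfect_u1(t, gamma_p, sigma1, 20000))
    print("closed form (5.24):", crb_perfect_u1(t, gamma_p, sigma1))
```

Because (5.24) is a first-order result, the simulated value is expected to match it only up to terms of order $1/(\gamma_p\sigma_1^2)^2$, so the agreement improves as the effective SNR grows.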
5.4.2 Received SNR with Perfect us
We start with the expression for the channel gain when using $\hat{v}_s$ and $\hat{u}_s$ as
the transmit and receive beamforming vectors. When we have perfect knowledge
of $u_1$ at the receiver, $\hat{u}_s = u_1$ and $\hat{v}_s = \tilde{v}_s/\|\tilde{v}_s\|$, where $\tilde{v}_s = v_1 + E_u u_1$ and
$E_u \triangleq E_p^H/\sigma_1$. The power amplification with perfect knowledge of $u_1$ is denoted by
$\rho_u \triangleq E\left|u_1^H H \hat{v}_s\right|^2 = E\left\{\left|u_1^H H \tilde{v}_s\right|^2/\|\tilde{v}_s\|^2\right\}$. As shown in Appendix 5.7.2, this can
be simplified to
\[ \rho_u = \sigma_1^2 - \frac{t-1}{\gamma_p}. \tag{5.26} \]
Finally, the received SNR is given by $P_D\rho_u$, as before. Comparing the above
expression with the power amplification with CLSE (5.17), we see that when r = t,
even in the best case of a spatially single-dimensional channel, $\rho_c = \rho_p - \frac{2}{\gamma_p}(t-1) < \rho_u$.
Next, when r = 1, the CLSE and CFSB techniques perform exactly the same:
$\rho_c = \rho_u = \sigma_1^2 - \frac{t-1}{\gamma_p}$, since $u_1 = 1$ (that is, no receive beamforming is needed).
Thus, if perfect knowledge of $u_1$ is available at the receiver, CFSB is guaranteed
to perform at least as well as CLSE, regardless of the training symbol SNR.
5.4.3 MSE in vs with Noise-Free Training
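We now present analysis to compute the second term in (5.22), the MSE
in $\hat{v}_s$ solely due to the use of the erroneous vector $\hat{u}_s$ in (5.7), and hence let $\eta_p = 0$,
or $Y_p = HX_p$. As in Section 5.3.3, we can express $\hat{u}_s$ as a linear combination c
of the columns of U as $\hat{u}_s = Uc$. We slightly abuse notation from Section 5.4.1
and redefine $\tilde{v}_s$ as $\tilde{v}_s \triangleq X_pY_p^H\hat{u}_s/\gamma_p = V\Sigma c$. Hence,
\[ \|\tilde{v}_s\|^2 = c^H\Sigma^2 c. \]
Thus, from (5.7), we have $\hat{v}_s = V\tilde{c}$, where $\tilde{c} = \frac{\Sigma c}{\sqrt{c^H\Sigma^2 c}}$. From Lemma 3,
\[ \|\hat{v}_s - v_1\|^2 = 2\left(1 - \mathrm{Re}(\tilde{c}_1)\right). \tag{5.27} \]
Let $c = [1+\Delta c_1, \Delta c_2, \ldots, \Delta c_r]^T$. Then, as shown in Appendix 5.7.3, $\tilde{c}_1$, the
first element of $\tilde{c}$, is given by
\[ \tilde{c}_1 \simeq 1 - \frac{1}{2}\sum_{i=2}^{r}\frac{\sigma_i^2}{\sigma_1^2}|\Delta c_i|^2, \tag{5.28} \]
and hence $\|\hat{v}_s - v_1\|^2 = \sum_{i=2}^{r}\frac{\sigma_i^2}{\sigma_1^2}|\Delta c_i|^2$. Let $\gamma_d$ be defined as $\gamma_d \triangleq NP_D/t$. Then,
from Appendix 5.7.3, $E|\Delta c_i|^2$ is given by
\[ E|\Delta c_i|^2 = \frac{1}{(\sigma_1^2-\sigma_i^2)^2}\left(\frac{\sigma_1^2\sigma_i^2}{N} + \frac{\sigma_i^2+\sigma_1^2}{\gamma_d} + \frac{N}{\gamma_d^2}\right). \tag{5.29} \]
Substituting, we get the final expression for the MSE as
\[ E\|\hat{v}_s - v_1\|^2 = \sum_{i=2}^{r}\frac{\sigma_i^2}{\sigma_1^2(\sigma_1^2-\sigma_i^2)^2}\left(\frac{\sigma_1^2\sigma_i^2}{N} + \frac{\sigma_i^2+\sigma_1^2}{\gamma_d} + \frac{N}{\gamma_d^2}\right). \tag{5.30} \]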
Note that the above expression decreases as O(1/N) (since γd depends linearly on
N), and therefore the MSE asymptotically (as N → ∞) approaches the bound in
(5.24).
5.4.4 Received SNR with Noise-Free Training
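The power amplification with noise-free training, denoted $\rho_w$, is given by
$\rho_w = \left|\hat{u}_s^H H\hat{v}_s\right|^2$. We also have $\hat{u}_s = Uc$ and $\hat{v}_s = V\tilde{c}$, where $\tilde{c} = \frac{\Sigma c}{\sqrt{c^H\Sigma^2 c}}$. Then,
$\hat{u}_s^H H\hat{v}_s = c^H\Sigma\tilde{c} = \sqrt{c^H\Sigma^2 c}$, and thus
\begin{align*}
\rho_w = c^H\Sigma^2 c &= \sigma_1^2(1+\Delta c_1^*)(1+\Delta c_1) + \sum_{i=2}^{r}\sigma_i^2\Delta c_i^*\Delta c_i \\
&\simeq \sigma_1^2(1+\Delta c_1+\Delta c_1^*) + \sum_{i=2}^{r}\sigma_i^2|\Delta c_i|^2.
\end{align*}
Substituting for $\Delta c_1$ from (5.9) and $\Delta c_i$ from (5.29), we obtain the power amplification
with noise-free training as
\[ \rho_w = \sigma_1^2 - \sum_{i=2}^{r}\frac{1}{(\sigma_1^2-\sigma_i^2)}\left(\frac{\sigma_1^2\sigma_i^2}{N} + \frac{\sigma_1^2+\sigma_i^2}{\gamma_d} + \frac{N}{\gamma_d^2}\right). \tag{5.31} \]
As before, the received SNR is given by $P_D\rho_w$. Note that $\rho_w$ approaches $\rho_p = \sigma_1^2$
for large values of length N and SNR $\gamma_d$.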
5.4.5 Semi-blind Estimation: Summary
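Recall that $\gamma_p = LP_T/t$ and $\gamma_d = NP_D/t$. The final expressions for the
MSE in $\hat{v}_s$ and the power amplification, from (5.22), are:
\[ E\|v_1 - \hat{v}_s\|^2 = \frac{(2t-1)}{2\gamma_p\sigma_1^2} + \sum_{i=2}^{r}\frac{\sigma_i^2}{\sigma_1^2(\sigma_1^2-\sigma_i^2)^2}\left(\frac{\sigma_1^2\sigma_i^2}{N} + \frac{\sigma_i^2+\sigma_1^2}{\gamma_d} + \frac{N}{\gamma_d^2}\right), \tag{5.32} \]
\[ \rho_s = \sigma_1^2 - \frac{t-1}{\gamma_p} - \sum_{i=2}^{r}\frac{1}{(\sigma_1^2-\sigma_i^2)}\left(\frac{\sigma_1^2\sigma_i^2}{N} + \frac{\sigma_1^2+\sigma_i^2}{\gamma_d} + \frac{N}{\gamma_d^2}\right). \tag{5.33} \]
The SER with semi-blind estimation is given by $P_M(\rho_s)$, with $P_M(\cdot)$ defined as in
(5.19) and (5.20).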
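The summary expressions (5.32) and (5.33) are straightforward to evaluate for a given channel instantiation. The following is a minimal sketch, not the authors' implementation; the function names are hypothetical, and `sigma_sq` is assumed to hold the squared singular values $\sigma_1^2 > \sigma_2^2 \geq \ldots$ in decreasing order, with all SNR quantities in linear units.

```python
def _white_data_term(s1, si, N, gamma_d):
    """Bracketed term shared by (5.29)-(5.33); s1 and si are squared singular values."""
    return s1 * si / N + (s1 + si) / gamma_d + N / gamma_d ** 2


def cfsb_mse(sigma_sq, t, gamma_p, N, gamma_d):
    """MSE in the CFSB estimate of v_1, following (5.32)."""
    s1 = sigma_sq[0]
    training_part = (2 * t - 1) / (2.0 * gamma_p * s1)
    white_part = sum(
        si / (s1 * (s1 - si) ** 2) * _white_data_term(s1, si, N, gamma_d)
        for si in sigma_sq[1:])
    return training_part + white_part


def cfsb_power_gain(sigma_sq, t, gamma_p, N, gamma_d):
    """Power amplification rho_s with CFSB, following (5.33)."""
    s1 = sigma_sq[0]
    white_loss = sum(
        _white_data_term(s1, si, N, gamma_d) / (s1 - si)
        for si in sigma_sq[1:])
    return s1 - (t - 1) / gamma_p - white_loss


if __name__ == "__main__":
    sigma_sq = [4.0, 1.0]            # example squared singular values (assumed)
    t, L, N = 2, 2, 8
    P_T = P_D = 10 ** 0.6            # 6 dB in linear units
    gamma_p, gamma_d = L * P_T / t, N * P_D / t
    print(cfsb_mse(sigma_sq, t, gamma_p, N, gamma_d))
    print(cfsb_power_gain(sigma_sq, t, gamma_p, N, gamma_d))
```

As N and $\gamma_d$ grow, the white-data terms vanish and the two functions approach (5.24) and (5.26) respectively, which is a useful consistency check on any implementation.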
5.5 Comparison of CLSE and Semi-blind Schemes
In order to compare the CFSB and CLSE techniques, one needs to account
for the performance of the white data versus beamformed data, an issue
we address now. Generic comparison of the semi-blind and conventional schemes
for any arbitrary system configuration is difficult, so we consider an example to
illustrate the trade-offs involved. We consider the 2 × 2 system with the Alamouti
scheme [1] employed for white data transmission, and with uncoded 4-QAM symbol
transmission. The choice of the Alamouti scheme enables us to present a fair
comparison of the two estimation algorithms since it has an effective data rate of one
symbol per channel use, the same as that of MRT. Additionally, it is possible to employ
a simple receiver structure, which makes the performance analysis tractable.
Let the beamformed data and the white data be statistically independent,
and a zero-forcing receiver based on the conventional estimate of the channel (5.3)
be used to detect the white data symbols. In Appendix 5.7.4, we derive the average
SNR of this system as
\[ \rho_w = \frac{\left(\|H\|_F^4 + \frac{\|H\|_F^2}{\gamma_p}\right)P_x}{\frac{\|H\|_F^2}{\gamma_p}P_x + \|H\|_F^2 + \frac{2r}{\gamma_p}}, \tag{5.34} \]
where $\|\cdot\|_F$ is the Frobenius norm, $P_x$ is the per-symbol transmit power and $\gamma_p = LP_T/t$
as defined before. From (5.34), we can also obtain the symbol error rate
performance of the Alamouti coded white data by using (5.19) with $\rho_cP_D$ replaced
by $\rho_w$. The resulting expression can be numerically averaged over the pdf of
$\|H\|_F^2$, which is Gamma distributed with 2rt degrees of freedom, to obtain the
SER. The analysis of the beamformed data with the CFSB estimation when the
Alamouti scheme is employed to transmit spatially white data remains largely the
same as that presented in the previous section, where we had assumed that $X_d$
satisfies $E\{X_dX_d^H\} = \gamma_dI_t$. With Alamouti white-data transmission, we have that
$X_dX_d^H = \gamma_dI_t$, which causes the $E_\chi$ term to drop out in (5.40) of Appendix 5.7.3.
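For a given channel instantiation, (5.34) can be evaluated directly. The sketch below is illustrative; the function name and argument names are assumptions, and all quantities are in linear units. The three intermediate terms correspond to the signal, leakage and noise powers derived in Appendix 5.7.4.

```python
def alamouti_zf_snr(h_fro_sq, r, gamma_p, p_x):
    """Average SNR of Alamouti-coded white data with a ZF receiver, per (5.34).

    h_fro_sq: squared Frobenius norm of the channel, ||H||_F^2.
    r: number of receive antennas.
    gamma_p: training SNR parameter L * P_T / t.
    p_x: per-symbol transmit power.
    """
    signal = (h_fro_sq ** 2 + h_fro_sq / gamma_p) * p_x
    leakage = (h_fro_sq / gamma_p) * p_x      # interference from the other symbol
    noise = h_fro_sq + 2.0 * r / gamma_p
    return signal / (leakage + noise)


if __name__ == "__main__":
    for gamma_p in (1.0, 10.0, 100.0, 1e6):
        print(gamma_p, alamouti_zf_snr(3.0, 2, gamma_p, 2.0))
```

With perfect training ($\gamma_p \to \infty$) the expression tends to $\|H\|_F^2 P_x$, the familiar Alamouti SNR with perfect channel knowledge.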
5.5.1 Performance of a 2 × 2 System with CLSE and CFSB
In order to get a more concrete feel for the expressions obtained in the
preceding sections, let us consider a 2 × 2 system with L = 2, N = 8, $P_D$ = 6 dB and
110 total symbols per frame, i.e., 2 training symbols, 8 white data symbols and
100 beamformed data symbols in the semi-blind case, and 2 training symbols and
108 beamformed data symbols in the conventional case. The average channel
power gain ρ versus training symbol SNR ($P_T$), obtained under different CSI and
signal transmission conditions, is shown in Fig. 5.3. When the receiver has
perfect channel knowledge (labelled perfect $u_1$, $v_1$), the average power gain ρ is
$E\{\sigma_1^2\}$ = 5.5 dB, independent of the training symbol SNR. The ρ with CLSE as
well as the semi-blind techniques asymptotically tend to this gain of 5.5 dB as the
SNR becomes large, since the loss due to estimation error becomes negligible. The
channel power gain with only white (Alamouti) data transmission asymptotically
approaches 3 dB (the gain per symbol of the 2 × 2 system with Alamouti encoding).
The channel power gain at any PT is given by (5.34), which is validated
in Fig. 5.3 through simulation. Observe that at a given training SNR, there is
a loss of approximately Pa = −3dB in terms of the channel gain performance for
the Alamouti scheme compared to the beamforming with conventional estimation.
The results of the channel power gain obtained by employing the CFSB technique
with N = 8 Alamouti-coded data symbols are shown in Fig. 5.3, and show the
improved performance of CFSB. By transmitting a few (N = 8) Alamouti-coded
symbols, the CFSB scheme obtains a better estimate of v1, thereby gaining about
Psb = 0.8dB per symbol over the CLSE scheme, at a training SNR of 2dB.
If the frame length is 110 symbols, we have $L_d$ = 100 beamformed data
symbols in the semi-blind case and $L_d + N$ = 108 beamformed data symbols in
the conventional case. Using the beamforming vectors estimated by the CFSB
algorithm, we then have a net power gain $\rho_g$ given by $\rho_g = \frac{L_d + N}{L_d/P_{sb} + N/P_a}$, or about
0.4 dB per frame. Thus, this simple example shows that CFSB estimation can
potentially offer an overall better performance compared to the CLSE. Although
we have considered uncoded modulation here, in more practical situations a chan-
nel code will be used with interleaving both between the white and beamformed
symbols as well as across multiple frames. In this case, burst errors can be avoided
and the errors in the white data symbols corrected. Furthermore, the performance
of the white data symbols can also be improved by employing an MMSE receiver
or other more advanced multi-user detectors rather than the zero-forcing receiver,
leading to additional improvements in the CFSB technique.
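The net per-frame gain quoted above can be reproduced with a few lines of arithmetic. This is a small sketch with hypothetical function names; it simply converts $P_{sb}$ = 0.8 dB and $P_a$ = −3 dB to linear units and evaluates the expression for $\rho_g$.

```python
import math


def db_to_linear(x_db):
    """Convert a power ratio from dB to linear units."""
    return 10.0 ** (x_db / 10.0)


def net_power_gain_db(L_d, N, p_sb_db, p_a_db):
    """Net gain rho_g = (L_d + N) / (L_d / P_sb + N / P_a), returned in dB."""
    p_sb = db_to_linear(p_sb_db)   # gain of CFSB over CLSE on beamformed symbols
    p_a = db_to_linear(p_a_db)     # loss of Alamouti relative to beamforming
    rho_g = (L_d + N) / (L_d / p_sb + N / p_a)
    return 10.0 * math.log10(rho_g)


if __name__ == "__main__":
    # Values from the 2 x 2 example: 100 beamformed and 8 white data symbols.
    print(net_power_gain_db(100, 8, 0.8, -3.0))
```

The result is a little under 0.4 dB, consistent with the figure quoted in the text. The same function can be used to see how the net gain changes with the frame length $L_d$.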
5.5.2 Discussion
We are now in a position to discuss the merits of the conventional es-
timation and the semi-blind estimation. Clearly, the CLSE enjoys the advan-
tages of being simple and easy to implement. As with any semi-blind technique,
CFSB being a second-order method requires the channel to be relatively slowly
time-varying. If not, the CLSE can still estimate the channel quickly from a few
training symbols, whereas the CFSB may not be able to converge to an accurate
estimate of u1 from the second order statistics computed using just a few received
vectors. Another disadvantage of the CFSB is that it requires the implementation
of two separate receivers, one for detecting the white data and the other for the
beamformed data. However, the CFSB estimation could outperform the CLSE in
channels where the loss due to the transmission of spatially-white data is not too
great, i.e., in full column-rank channels. Given the parameters N , L, PT and PD,
the theory developed in this chapter can be used to decide if the CFSB technique
would offer any performance benefits versus the CLSE technique. If the CFSB
technique is to perform comparably or better than the CLSE, two things need to
be satisfied:
1. The estimation performance of CLSE and CFSB should be comparable, i.e.,
the number of white data symbols N and the data power PD should be large
enough to ensure that the estimate us is accurate, so that the resulting vs
can perform comparably to the conventional estimate. For example, since
the channel gain with semi-blind estimation is given by (5.33), N should be
chosen to be of the same order as γd; and both N and γd should be of the
order higher than γp. With such a choice, the (t − 1)/γp term will dominate
the SNR loss in the CFSB, thus enabling the beamformed data with CFSB
estimation to outperform the beamformed data with CLSE.
2. The block length should be sufficiently long to ensure that after sending L+N
symbols, there is sufficient room to send as many beamformed symbols as is
necessary for the CFSB technique to be able to make up for the performance
lost during the white data transmission. In the above example, after having
obtained the appropriate value of N , one can use (5.34) to determine the
loss due to the white data symbols (for the t = 2 case), and then finally
determine whether the block length is long enough for the CFSB to be able
to outperform the CLSE method.
In Section 5.6, we demonstrate through additional simulations that the CFSB tech-
nique does offer performance benefits relative to the CLSE, for an appropriately
designed system.
5.5.3 Semi-blind Estimation: Limitations and Alternative Solutions
The CFSB algorithm requires a sufficiently large number N of spatially-white
data symbols to guarantee a near-perfect estimate of $u_1$, and the error caused by a
small N cannot be overcome by increasing the white-data SNR. It is therefore desirable to find
an estimation scheme that performs at least as well as the CLSE algorithm,
regardless of the values of N and L. Formal fusion of the estimates obtained from
the CLSE and CFSB techniques is difficult, hence we adopt an intuitive approach
and consider a simple weighted linear combination of the estimated beamforming
vectors as follows:
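\[ \hat{u}_1 = \frac{\beta_u\gamma_p\hat{u}_c + \gamma_d\hat{u}_s}{\|\beta_u\gamma_p\hat{u}_c + \gamma_d\hat{u}_s\|_2}, \qquad \hat{v}_1 = \frac{\beta_v\gamma_p\hat{v}_c + \gamma_d\hat{v}_s}{\|\beta_v\gamma_p\hat{v}_c + \gamma_d\hat{v}_s\|_2}. \tag{5.35} \]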
The above estimates will be referred to as the linear combination semi-blind (LCSB)
estimates. The weights $\gamma_p = LP_T/t$ and $\gamma_d = NP_D/t$ are a measure of the accuracy
of the vectors estimated from the CLSE and CFSB schemes respectively. The
scaling factors $\beta_u$ and $\beta_v$ are introduced because $\hat{u}_c$ obtained from known training
symbols is more reliable than the blind estimate $\hat{u}_s$ when L = N and $\gamma_p = \gamma_d$. In
our simulations, for t = r = 4, the choice βu = βv = 4 was found to perform well.
Analysis of the impact of βu and βv is a topic for future research.
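The weighted combination in (5.35) amounts to a few vector operations. The following sketch is illustrative only: the function name is hypothetical, and it assumes that the two input estimates already share a common phase reference, since singular vectors are defined only up to a phase and the text does not specify an alignment step.

```python
import math


def lcsb_combine(x_c, x_s, gamma_p, gamma_d, beta):
    """Linear combination semi-blind (LCSB) estimate following (5.35).

    x_c: CLSE estimate of the beamforming vector (list of complex numbers).
    x_s: CFSB estimate of the same vector.
    gamma_p, gamma_d: weights L*P_T/t and N*P_D/t (linear units).
    beta: scaling factor favouring the training-based estimate.
    Assumption: x_c and x_s are phase-aligned before being combined.
    """
    combined = [beta * gamma_p * a + gamma_d * b for a, b in zip(x_c, x_s)]
    norm = math.sqrt(sum(abs(z) ** 2 for z in combined))
    return [z / norm for z in combined]


if __name__ == "__main__":
    v_c = [0.8 + 0.0j, 0.6 + 0.0j]
    v_s = [0.6 + 0.0j, 0.8 + 0.0j]
    print(lcsb_combine(v_c, v_s, gamma_p=2.0, gamma_d=8.0, beta=4.0))
```

Setting `gamma_d` to zero returns the (normalized) CLSE estimate, and a very large `gamma_d` returns the CFSB estimate, so the combination interpolates between the two schemes.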
5.6 Simulation Results
In this section, we present simulation results to illustrate the performance
of the different estimation schemes. The simulation setup consists of a Rayleigh
flat fading channel with 4 transmit antennas and 4 receive antennas (t = r = 4).
The data (and training) are drawn from a 16-QAM constellation. 10,000 random
instantiations of the channel were used in the averaging.
Measuring the error between singular vectors
In the simulations, $v_1$ and $\hat{v}_1$ are obtained by computing the SVD of
two different matrices $H$ and $\hat{H}$ respectively. However, the SVD involves an
unknown phase factor, that is, if $v_1$ is a singular vector, so is $v_1e^{j\phi}$ for any
$\phi \in (-\pi, \pi]$. Hence, for computational consistency in measuring the MSE in $v_1$, we
use the following dephased norm in our simulations, similar to [66]:
$\|v_1 - \hat{v}_1\|_{DN}^2 \triangleq 2\left(1 - \left|v_1^H\hat{v}_1\right|\right)$, which satisfies
$\|v_1 - \hat{v}_1\|_{DN}^2 = \min_{\phi\in(-\pi,\pi]}\left\|v_1 - \hat{v}_1e^{j\phi}\right\|^2$. The
norm considered in our analysis is implicitly consistent with the above dephased
norm. For example, the norm in (5.14) is the same as the dephased norm, since
the perturbation term $\Delta d_1$ is real (as noted in Section 5.3.1). Also, for small
additive perturbations, it can easily be shown that (for example) in (5.23), the
dephased norm reduces to the Euclidean norm.
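The dephased norm is simple to compute, and its defining property can be checked by brute force. The sketch below is illustrative (the function names are hypothetical) and assumes both vectors have unit norm, as singular vectors do.

```python
import cmath
import math


def dephased_sq_norm(v, v_hat):
    """Dephased squared norm 2 * (1 - |v^H v_hat|) for unit-norm vectors."""
    inner = sum(a.conjugate() * b for a, b in zip(v, v_hat))
    return 2.0 * (1.0 - abs(inner))


def brute_force_min(v, v_hat, steps=3600):
    """Minimum over a phase grid of ||v - v_hat * exp(j*phi)||^2."""
    best = float("inf")
    for k in range(steps):
        phi = -math.pi + 2.0 * math.pi * (k + 1) / steps
        rot = cmath.exp(1j * phi)
        dist = sum(abs(a - b * rot) ** 2 for a, b in zip(v, v_hat))
        best = min(best, dist)
    return best


if __name__ == "__main__":
    v = [1.0 + 0.0j, 0.0j]
    v_hat = [0.6 + 0.0j, 0.8j]
    print(dephased_sq_norm(v, v_hat), brute_force_min(v, v_hat))
```

The closed form follows because, for unit-norm vectors, $\|v_1 - \hat{v}_1e^{j\phi}\|^2 = 2 - 2\,\mathrm{Re}(e^{j\phi}v_1^H\hat{v}_1)$, which is minimized when the phase rotation makes the inner product real and positive.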
Experiment 1
In this experiment, we compute the MSE of conventional estimation and
the MSE of the semi-blind estimation with perfect u1, which serves as a benchmark
for the performance of the proposed semi-blind scheme. Fig. 5.4 shows the MSE
in v1 versus L, for two different values of pilot SNR (or γp), with perfect u1.
CFSB performs better than the CLSE technique by about 6dB, in terms of the
training symbol SNR for achieving the same MSE in v1. The experimental curves
agree well with the theoretical curves from (5.15), (5.24). Also, the results for
the performance of the semi-blind OPML technique proposed in [55] are plotted
in Fig. 5.4. In the OPML technique, the channel matrix H is factored into the
product of a whitening matrix W (= UΣ) and a unitary rotation matrix Q. A
blind algorithm is used to estimate W , while the training data is used exclusively
to estimate Q. Thus, the OPML technique outperforms the CFSB because it
assumes perfect knowledge of the entire U and Σ matrices (and is computationally
more expensive). The CFSB technique, on the other hand, only needs an accurate
estimate of u1 from the spatially-white data.
Experiment 2
Next, we relax the perfect u1 assumption. Fig. 5.5 shows the SER performance
of the CLSE, OPML and the CFSB schemes at two different values of
N , as well as the N = ∞ (perfect knowledge of U) case. At N = 50 white data
symbols, the CLSE technique outperforms the CFSB for L ≥ 24, as the error in u1
dominates the error in the semi-blind technique. As white data length increases,
the CFSB performs progressively better than the CLSE. Also, in the presence of
a finite number (N) of white data, the CFSB marginally outperforms the OPML
scheme as CFSB only requires an accurate estimate of the dominant eigenvector
u1 from the white data. In Fig. 5.6, we plot both the theoretical and experimental
curves for the CFSB scheme when N = 100, as well as the simulation result for
the LCSB scheme defined in Section 5.5.3. The LCSB outperforms the CLSE
and the CFSB technique at both N = 50 and N = 100. Thus, the theory devel-
oped in this chapter can be used to compare the performance of CFSB and CLSE
techniques for any choice of N and L.
Experiment 3
Finally, as an example of overall performance comparison, Fig. 5.7 shows
the SER performance versus the data SNR of the different estimation schemes for a
2× 2 system, with uncoded 4-QAM transmission, L = 2 training symbols, N = 16
white data symbols (for the semi-blind technique) and a frame size Ld = 500
symbols. The parameter values are chosen for illustrative purposes, and as L and
PT increase, the gap between the CLSE and CFSB reduces. From the graph, it is
clear that the LCSB scheme outperforms the CLSE scheme in terms of its SER
performance, including the effect of white data transmission.
5.7 Conclusion
In this chapter, we have investigated training-only and semi-blind channel
estimation for MIMO flat-fading channels with MRT, in terms of the MSE in the
beamforming vector v1, received SNR and the SER with uncoded M-ary QAM
modulation. The CFSB scheme is proposed as a closed-form semi-blind solution
for estimating the optimum transmit beamforming vector v1, and is shown to
achieve the CRB with the perfect u1 assumption. Analytical expressions for the
MSE, the channel power gain and the SER performance of both the CLSE and
the CFSB estimation schemes are developed, which can be used to compare their
performance. A novel LCSB algorithm is proposed, which is shown to outperform
both the CFSB and the CLSE schemes over a wide range of training lengths and
SNRs. We have also presented Monte-Carlo simulation results to illustrate the
relative performance of the different techniques.
Acknowledgement
The text of this chapter, in part, is a reprint of the material as it appears
in A. K. Jagannatham, C. R. Murthy and B. D. Rao, "A Semi-Blind Channel
Estimation Scheme for MRT", Proceedings of IEEE International Conference on
Acoustics, Speech, and Signal Processing (ICASSP '05), Mar. 2005, Vol. 3,
pp. 585-588.
Appendix for Chapter 5
5.7.1 Proof of Lemma 1:
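Let $\bar{Y}_p \triangleq \frac{u_1^HY_p}{\sigma_1\gamma_p}$, $\bar{X}_p \triangleq \frac{X_p}{\gamma_p}$, and $n \triangleq \frac{u_1^H\eta_p}{\sigma_1\gamma_p}$. Then, since the training
sequence is orthogonal, $\bar{X}_p\bar{X}_p^H = I_t/\gamma_p$ holds. Substituting into (5.5), we have
\[ \bar{Y}_p = v_1^H\bar{X}_p + n. \tag{5.36} \]
Thus, we seek the estimate of $v_1$ as the solution to the following least squares
problem
\[ \hat{v}_s = \arg\min_{v\in\mathbb{C}^t,\ \|v\|=1}\left\|\bar{Y}_p - v^H\bar{X}_p\right\|^2. \tag{5.37} \]
Note that
\begin{align*}
\arg\min_{v_1:\ \|v_1\|=1}\left\|\bar{Y}_p - v_1^H\bar{X}_p\right\|^2 &= \arg\min_{v_1:\ \|v_1\|=1}\left(\bar{Y}_p\bar{Y}_p^H + \frac{\|v_1\|^2}{\gamma_p} - \bar{Y}_p\bar{X}_p^Hv_1 - v_1^H\bar{X}_p\bar{Y}_p^H\right) \\
&= \arg\max_{v_1:\ \|v_1\|=1}\left(\bar{Y}_p\bar{X}_p^Hv_1 + v_1^H\bar{X}_p\bar{Y}_p^H\right).
\end{align*}
The $v_1$ that maximizes the above expression is readily found to be $v_1 = \bar{X}_p\bar{Y}_p^H/\left\|\bar{X}_p\bar{Y}_p^H\right\|$.
Substituting for $\bar{X}_p$ and $\bar{Y}_p$, the desired result is obtained.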
5.7.2 Received SNR with perfect us
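Here, we derive the expression in (5.26). For notational simplicity, define
$x \triangleq v_1^HE_uu_1$ and $y \triangleq u_1^HE_u^HE_uu_1$. Then, we have
\begin{align*}
\rho_u &= E\left\{\frac{\sigma_1^2(1+x)(1+x^*)}{1+x+x^*+y}\right\} \\
&\simeq \sigma_1^2E\left\{(1+x)(1+x^*)\left(1 - (x+x^*+y) + (x+x^*+y)^2\right)\right\} \\
&\simeq \sigma_1^2\left(1 + E\{xx^* - y\}\right), \tag{5.38}
\end{align*}
where $x^*$ is the complex conjugate of $x$. Also, $E\{xx^*\} = \frac{\sigma_p^2}{\sigma_1^2} = \frac{1}{\gamma_p\sigma_1^2}$, and
$E\{y\} = E\{u_1^HE_u^HE_uu_1\} = \frac{t}{\gamma_p\sigma_1^2}$. Thus, the power amplification for perfect $u_1$ is given by
$\rho_u = \sigma_1^2 - \frac{t-1}{\gamma_p}$.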
5.7.3 Proof for equations (5.28) and (5.29)
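In order to derive an expression for $\tilde{c}_1$, we write $c = [1+\Delta c_1, \Delta c_2, \ldots, \Delta c_r]^T$
as a perturbation of $[1, 0, \ldots, 0]^T$. Since $\tilde{c} = \frac{\Sigma c}{\sqrt{c^H\Sigma^2c}}$, equating components, we have
\begin{align*}
\tilde{c}_1 &= \frac{\sigma_1(1+\Delta c_1)}{\sqrt{\sigma_1^2|1+\Delta c_1|^2 + \sum_{i=2}^{r}\sigma_i^2|\Delta c_i|^2}} \\
&\simeq (1+\Delta c_1)\left[1 - \frac{1}{2}\left(2\Delta c_1 + \sum_{i=2}^{r}\frac{\sigma_i^2}{\sigma_1^2}|\Delta c_i|^2\right)\right] \\
&\simeq 1 - \frac{1}{2}\sum_{i=2}^{r}\frac{\sigma_i^2}{\sigma_1^2}|\Delta c_i|^2.
\end{align*}
Substituting in (5.27), we get
\[ \|v_1 - \hat{v}_s\|^2 = \sum_{i=2}^{r}\frac{\sigma_i^2}{\sigma_1^2}|\Delta c_i|^2. \tag{5.39} \]
It now remains to compute $\Delta c_i$. Recall that $\hat{u}_s$ is computed from the SVD in (5.4).
Stacking the transmitted and received data vectors into matrices $X_d \in \mathbb{C}^{t\times N}$ and
$Y_d \in \mathbb{C}^{r\times N}$ and the noise vectors into $\eta_d \in \mathbb{C}^{r\times N}$, with appropriate scaling we can
rewrite (5.4) as
\[ \hat{U}\hat{\Sigma}^2\hat{U}^H = HH^H + E_s, \]
where
\[ E_s \triangleq HE_\chi H^H + HE_{\chi\eta} + E_{\chi\eta}^HH^H + E_\eta, \]
and $E_\chi \triangleq \frac{1}{\gamma_d}\left(X_dX_d^H - \gamma_dI_t\right)$, $E_{\chi\eta} \triangleq \frac{X_d\eta_d^H}{\gamma_d}$, $E_\eta \triangleq \frac{1}{\gamma_d}\left(\eta_d\eta_d^H - NI\right)$, and finally
$\gamma_d = \frac{NP_D}{t}$, as before.
Observe that, since the white data $X_d$ and the AWGN are mutually independent,
the elements of $E_\chi$, $E_{\chi\eta}$ and $E_\eta$ are pairwise uncorrelated. Also,
$E|E_\chi(i,j)|^2 = \left(\frac{P_D}{t}\right)^2\Big/\left(N\left(\frac{P_D}{t}\right)^2\right) = 1/N$,
$E|E_{\chi\eta}(i,j)|^2 = \left(\frac{P_D}{t}\right)\Big/\left(N\left(\frac{P_D}{t}\right)^2\right) = 1/\gamma_d$, and
$E|E_\eta(i,j)|^2 = 1\Big/\left(N\left(\frac{P_D}{t}\right)^2\right) = N/\gamma_d^2$. Thus, from the first order
perturbation analysis (5.8), $\Delta c_i = \frac{u_i^HE_su_1}{\sigma_1^2-\sigma_i^2}$, and therefore
\[ E|\Delta c_i|^2 = \frac{1}{(\sigma_1^2-\sigma_i^2)^2}\left(E\left|u_i^HHE_\chi H^Hu_1\right|^2 + E\left|u_i^HHE_{\chi\eta}u_1\right|^2 + E\left|u_i^HE_{\chi\eta}^HH^Hu_1\right|^2 + E\left|u_i^HE_\eta u_1\right|^2\right). \tag{5.40} \]
Simplifying the different components in the above expression, we have
\[ E\left|u_i^HHE_\chi H^Hu_1\right|^2 = \sigma_1^2\sigma_i^2/N, \qquad E\left|u_i^HE_\eta u_1\right|^2 = N/\gamma_d^2, \]
and
\[ E\left|u_i^HHE_{\chi\eta}u_1\right|^2 = \sigma_i^2/\gamma_d. \]
Similarly, since $H^Hu_1 = \sigma_1v_1$, the remaining term is $E\left|u_i^HE_{\chi\eta}^HH^Hu_1\right|^2 = \sigma_1^2/\gamma_d$.
Substituting into (5.40), we get (5.29).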
5.7.4 Performance of Alamouti Space-Time Coded Data with Conventional Estimation
In this section, we determine the performance of Alamouti space-time
coded data for a general r × 2 matrix channel with estimation error and a
zero-forcing receiver. Similar results for other specific cases can be found in [67], [68].
Denote the r × 2 channel matrix H in terms of its columns as $H = [h_1, h_2]$. Also,
let the 2 × L orthogonal training symbol matrix $X_p$ be defined in terms of its rows
as $X_p = [X_{p1}^T, X_{p2}^T]^T$. Thus, from (5.3), the channel is estimated conventionally as
\begin{align*}
\hat{H}_c &= \frac{1}{\gamma_p}\left[Y_pX_{p1}^H,\ Y_pX_{p2}^H\right] \\
\left[\hat{h}_1,\ \hat{h}_2\right] &= \left[h_1 + \frac{\eta_pX_{p1}^H}{\gamma_p},\ h_2 + \frac{\eta_pX_{p2}^H}{\gamma_p}\right] \tag{5.41}
\end{align*}
The effective channel with Alamouti-coded data transmission can be represented
by stacking two consecutively received r × 1 vectors $y_1$ and $y_2^*$ vertically as follows
\[ \begin{bmatrix} y_1 \\ y_2^* \end{bmatrix} = \begin{bmatrix} h_1 & h_2 \\ -h_2^* & h_1^* \end{bmatrix}\begin{bmatrix} x_1 \\ x_2^* \end{bmatrix} + \begin{bmatrix} \eta_{w1} \\ \eta_{w2}^* \end{bmatrix}, \tag{5.42} \]
where $\eta_{wi}$, i = 1, 2 is the AWGN affecting the white data symbols. When a
zero-forcing receiver based on the estimated channel is employed, the received vectors
are decoded using $\left[\hat{h}_1, \hat{h}_2\right]$ as
\[ \begin{bmatrix} \hat{x}_1 \\ \hat{x}_2^* \end{bmatrix} = \begin{bmatrix} \hat{h}_1^H & -\hat{h}_2^T \\ \hat{h}_2^H & \hat{h}_1^T \end{bmatrix}\begin{bmatrix} y_1 \\ y_2^* \end{bmatrix}. \tag{5.43} \]
It is clear from symmetry that the performance of x1 and x2 will be the same; hence,
we can focus on determining the SER performance of x1. Now, x1 contains three
components, the signal component coming from x1, and a leakage term coming
from the symbol x2 and the noise term coming from the white noise term ηw as
follows
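\[ \hat{x}_1 = \underbrace{\left(\hat{h}_1^Hh_1 + h_2^H\hat{h}_2\right)}_{\xi_{x1}}x_1 + \underbrace{\left(\hat{h}_1^Hh_2 - h_1^H\hat{h}_2\right)}_{\xi_{x2}}x_2^* + \underbrace{\hat{h}_1^H\eta_{w1} - \eta_{w2}^H\hat{h}_2}_{\xi_n} \tag{5.44} \]
The coefficient of the $x_1$ term, denoted $\xi_{x1}$, is
\begin{align*}
\xi_{x1} &= \left(h_1 + \frac{\eta_pX_{p1}^H}{\gamma_p}\right)^Hh_1 + h_2^H\left(h_2 + \frac{\eta_pX_{p2}^H}{\gamma_p}\right), \tag{5.45} \\
&= \|H\|_F^2 + \frac{X_{p1}\eta_p^Hh_1 + h_2^H\eta_pX_{p2}^H}{\gamma_p}. \tag{5.46}
\end{align*}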
From the above equation, it is clear that the performance of the x1 symbol is
dependent on the training noise instantiation ηp. However, we can consider the
average power gain, averaged over the training noise, as follows
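\begin{align*}
E|\xi_{x1}|^2 &= \|H\|_F^4 + \frac{1}{\gamma_p^2}E\left\{X_{p1}\eta_p^Hh_1h_1^H\eta_pX_{p1}^H + h_2^H\eta_pX_{p2}^HX_{p2}\eta_p^Hh_2\right\}, \tag{5.47} \\
&= \|H\|_F^4 + \frac{1}{\gamma_p^2}\left(\gamma_p\|h_1\|^2 + \gamma_p\|h_2\|^2\right), \tag{5.48} \\
&= \|H\|_F^4 + \frac{\|H\|_F^2}{\gamma_p}, \tag{5.49}
\end{align*}
where, in (5.47), the cross terms disappear since the noise $\eta_p$ is zero-mean and due
to the orthogonality of the training $X_p$. Similarly, the coefficient of the $x_2^*$ term,
denoted $\xi_{x2}$, can be simplified as
\[ \xi_{x2} = \frac{X_{p1}\eta_p^Hh_2 - h_1^H\eta_pX_{p2}^H}{\gamma_p}. \tag{5.50} \]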
We will assume for simplicity that the x2 term is an additive white Gaussian noise
impairing the estimation of x1, i.e., we do not perform joint detection. This noise
term is independent of the AWGN component ηw. Similar to the coefficient of x1,
we can consider the average power gain of the x2 term, which can be obtained after
a little manipulation as
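\[ E|\xi_{x2}|^2 = \frac{\|H\|_F^2}{\gamma_p}. \tag{5.51} \]
Finally, the noise term, denoted $\xi_n$, is
\[ \xi_n = h_1^H\eta_{w1} - \eta_{w2}^Hh_2 + \frac{X_{p1}\eta_p^H\eta_{w1} - \eta_{w2}^H\eta_pX_{p2}^H}{\gamma_p}, \tag{5.52} \]
from which we can obtain the noise power as
\[ E|\xi_n|^2 = \|H\|_F^2 + \frac{2r}{\gamma_p}. \tag{5.53} \]
Thus, the SNR for detection of a white data symbol is given by
\[ \rho_w = \frac{\left(\|H\|_F^4 + \frac{\|H\|_F^2}{\gamma_p}\right)P_x}{\frac{\|H\|_F^2}{\gamma_p}P_x + \|H\|_F^2 + \frac{2r}{\gamma_p}}. \tag{5.54} \]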
5.7.5 Other Useful Lemmas:
In this section, we present three useful lemmas without proof for the sake
of brevity.
Lemma 4. Let $X_p \in \mathbb{C}^{t\times L}$ be an orthogonal set of vectors (i.e., $X_pX_p^H = \gamma_pI_t$),
and let $\eta_p \in \mathbb{C}^{r\times L}$ contain i.i.d. ZMCSCG entries with mean $\mu = 0$ and variance
$\sigma_n^2 = 1$. Then, the elements of $E_p = X_p\eta_p^H$ are uncorrelated, and the variance of
each element of $E_p$ is $\sigma_p^2 = \gamma_p$.
Lemma 5. A transformation of $E_p$ (defined in Lemma 4) by any orthogonal matrix
$V \in \mathbb{C}^{t\times t}$ (i.e., $VV^H = V^HV = I_t$) to get $\tilde{E} = VE_p$, leaves the second order
statistics of $E_p$ unaltered, that is,
\[ E\{\tilde{E}(i,j)\} = E\{E_p(i,j)\} = 0 \]
\[ E\{\tilde{E}(i,j)\tilde{E}^*(k,l)\} = E\{E_p(i,j)E_p^*(k,l)\} = \sigma_p^2\,\delta(i-k, j-l), \quad \forall\ i, j, k, l, \]
where $\delta(p,q) = 1$ when p = q = 0, and 0 otherwise.
Lemma 6. If the random matrix $X_p \in \mathbb{C}^{t\times L}$ has zero-mean circularly symmetric
i.i.d. entries, then so does $v^HX_p$, where $v \in \mathbb{C}^{t\times 1}$. Further, if $v$ satisfies $\|v\| = 1$,
then the variance of an element of $X_p$ is the same as that of $v^HX_p$.
Figure 5.1: MIMO system model, with beamforming at the transmitter and receiver.
Figure 5.2: Comparison of the transmission scheme for conventional least squares
(CLSE) and closed-form semi-blind (CFSB) estimation.
[Plot: Power Amplification (dB) versus Pilot SNR (dB). Curves: Perfect u1, v1; CFSB + perfect u1; CFSB + estimated u1; CLSE + beamforming; CLSE + Alamouti, exp; CLSE + Alamouti, theory.]
Figure 5.3: Average channel gain of a t = r = 2 MIMO channel with L = 2,
N = 8 and PD = 6dB, for the CLSE and beamforming, CFSB and beamforming
(with and without knowledge of u1), CLSE and white data (Alamouti-coded), and
perfect beamforming at transmitter and receiver. Also plotted is the theoretical
result for the performance of Alamouti-coded data with channel estimation error,
given by (5.34).
[Plot: MSE in v1 versus Pilot Length (L), for pilot SNR = 2dB and pilot SNR = 10dB. Curves: CLSE-Theory; CLSE; CFSB-Theory; CFSB, perfect u1; OPML, perfect U.]
Figure 5.4: MSE in v1 vs training data length L, for a t = r = 4 MIMO system.
Curves for CLSE, CFSB and OPML with perfect u1 are plotted. The top five
curves correspond to a training symbol SNR of 2dB, and the bottom five curves
10dB.
[Plots: two panels of SER versus number of pilot symbols. First panel curves: CLSE-exp; OPML, N=50; CFSB, N=50; CFSB-u1; OPML-U; Perf-bf. Second panel curves: CLSE-exp; OPML, N=100; CFSB, N=100; CFSB-u1, theory; CFSB-u1; Perf-bf.]
Figure 5.5: SER of beamformed-data vs number of training symbols L, t = r = 4
system, for two different values of white-data length N , and data and training
symbol SNR fixed at PT = PD = 6dB. The two competing semi-blind techniques,
OPML and CFSB, are plotted. CFSB marginally outperforms OPML for N = 50,
as it only requires an accurate estimate of u1 from the blind data.
[Plots: two panels of SER versus number of pilot symbols. First panel curves: CLSE-exp; CLSE-theory; CFSB-exp, N=50; LCSB, N=50; Perf-bf. Second panel curves: CLSE, exp; CFSB-theory, N=100; CFSB-exp, N=100; Perf-bf.]
Figure 5.6: SER vs L, t = r = 4 system, for two different values of N , and data and
training symbol SNR fixed at PT = PD = 6dB. The theoretical and experimental
curves are plotted for the CFSB estimation technique. Also, the LCSB technique
outperforms both the conventional (CLSE) and semi-blind (CFSB) techniques.
[Plot: SER versus data SNR (dB). Curves: CLSE-Alamouti; CLSE-bf; CFSB; CFSB-u1; LCSB; Perf-bf.]
Figure 5.7: SER versus data SNR for the t = r = 2 system, with L = 2, N =
16, γp = 2dB. ‘CLSE-Alamouti’ refers to the performance of the spatially-white
data with conventional estimation, ‘CLSE-bf’ is the performance of the beamformed
data with vc, ‘CFSB’ and ‘LCSB’ refer to the performance of the corresponding
techniques after accounting for the loss due to the white data. ‘CFSB-u1’
is the performance of CFSB with perfect-u1, and ‘Perf-bf’ is the performance with
the perfect u1 and v1 assumption.
6 Superimposed Pilots for MIMO Channel Estimation
Until the recent past, blind schemes [8, 42] have been the only alterna-
tive to estimate a channel without wasting bandwidth. Frequently, such schemes
cannot estimate the channel completely and leave a residual indeterminate 'phase
factor', such as a complex phase for SIMO [44] and a unitary matrix [46] for MIMO
systems. Further, the optimization algorithms are often complex, involving second
and higher order statistics, and frequently result in sub-optimal performance from
convergence to local minima. Recent advances in signal processing have suggested
an innovative scheme for channel estimation with superimposed pilot (SP) symbols,
also termed hidden or embedded pilots. SP based schemes employ additional
power to transmit a repetitive sequence of pilot symbols superimposed over the
information bearing data symbols and hence do not sacrifice bandwidth by exclusively
transmitting pilots. Further, since they employ first order (mean) statistics,
compared to blind algorithms which traditionally employ second and higher order
statistics, they result in simple algorithms which obviate convergence problems.
Thus, they offer the attractive benefit of bandwidth efficiency with moderate com-
putational complexity. Early research on such channel estimation schemes has
been reported in [69, 70]. Alternative schemes for SP based estimation have been
explored in [71–73] and [74]. SP based channel estimation for OFDM systems is
discussed in [75].
In this chapter, we derive the true Cramer-Rao Bounds (CRB) for SP
based estimation where only approximate bounds have been derived previously in
the literature [71]. We demonstrate that the simple first-order statistic (mean)
based estimation scheme proposed in works such as [71, 72] has a sub-optimal
estimation performance compared to the CRB. Hence, we propose a semi-blind SP
estimation scheme which asymptotically achieves the CRB [30, 55], thus improving
the estimation performance over existing schemes. In the SIMO context, this
estimate can be seen to have an asymptotic MSE that is 3dB lower than the mean
based estimate. Another aspect of our work is the development of a framework for
the throughput performance analysis of SP. A similar study has been presented in
[76, 77] for frequency selective single-input multiple-output (SIMO) channels. A
novel contribution of our work is to derive an expression for the capacity lower
bound of channels with source-noise correlation to analyze the throughput
performance of SP. This framework is more general and can be used to characterize
the throughput performance of any estimator, and is not limited to the MMSE
estimate as in [39, 76]. Specifically, we focus on the maximum-likelihood (ML)
estimate, which is commonly employed in practice. This expression for worst case
capacity is also utilized to demonstrate the throughput gains of SP over a system
employing conventional pilots (CP). From simulation studies employing this
framework, SP can be seen to potentially outperform CP in terms of overall system
throughput, especially in scenarios where the block length is small so that repeated
transmission of pilot symbols results in a significant bandwidth overhead. This
typically arises in ad hoc and sensor networks, where the information is communicated
in short bursts over a large number of channels. It also arises in mobile wireless
scenarios where a short coherence time renders repeated training inefficient.
Figure 6.1: Schematic of a Superimposed Pilot System.
Further, unlike in CP based estimation where the channel estimate is
independent of the data SNR and depends only on pilot to noise ratio (PNR),
in SP estimation the transmitted data has a corrupting influence on the channel
estimate. This aspect has been considered in the study in [72, 78]. Our work
further addresses this issue of optimal pilot-source power allocation for a fixed
total transmit power based on maximizing the post-processing SNR (PSNR) for a
Capon like receive beamformer.
The rest of the chapter is organized as follows. In the next section we
formulate the problem. We derive the expressions for SP estimation in section
(6.2) and present the CRB analysis for SP in section (6.2.1). The expression
for worst case capacity with correlation is presented in section (6.3) followed by
performance comparison with CP in section (6.3.2). The expressions for optimum
power allocation are derived in section (6.4). We present results from simulation
studies in section (6.5) and conclusions in the end.
6.1 Superimposed Pilots (SP) Based MIMO Estimation
Figure 6.2: Schematic diagram of the superimposed pilot (SP) frame structure.
Consider a multiple-input multiple-output (MIMO) wireless system with
$r$ receive antennas, $t$ transmit antennas and $r \geq t$, i.e. at least as many receive
antennas as transmit antennas. Let $\mathbf{H} = [\mathbf{h}_1, \mathbf{h}_2, \ldots, \mathbf{h}_t] \in \mathbb{C}^{r \times t}$ denote the flat-fading
MIMO channel, where $\mathbf{h}_j \triangleq [h_{1j}, h_{2j}, \ldots, h_{rj}]^T$ represents the vector of
complex fading coefficients between the $j$th transmit antenna and the receiver. The
equivalent discrete-time baseband MIMO system model after matched filtering and
sampling is given as,
\[ \mathbf{y}(k) = \mathbf{H}\mathbf{x}(k) + \boldsymbol{\eta}(k), \quad 1 \leq k \leq N_b, \tag{6.1} \]
where the index $k$ denotes the time instant and $\mathbf{y}(k) \in \mathbb{C}^{r \times 1}$, $\mathbf{x}(k) \in \mathbb{C}^{t \times 1}$ denote the
$k$th received and transmitted symbol vectors respectively. The vector $\boldsymbol{\eta}(k) \in \mathbb{C}^{r \times 1}$ is
complex circularly-symmetric spatio-temporally uncorrelated additive white Gaussian
noise (AWGN) of power $\sigma_n^2$, i.e. $E\{\boldsymbol{\eta}(k)\boldsymbol{\eta}(l)^H\} = \sigma_n^2\,\delta(k-l)\,\mathbf{I}_r$, where $\delta(k) = 1$
if $k = 0$ and $0$ otherwise. The SP estimation scheme can be described as follows.
Let each frame of contiguous transmitted symbols contain $N_f$ sub-frames of length
$L_p$ symbols, where $N_b \triangleq N_fL_p$ denotes the block length. The transmitted data symbols
$\mathbf{x}_d^s(k)$ are assumed to be stochastic in nature with $E\{\mathbf{x}_d^s(k)\} = \mathbf{0}$ and power $P_d^s$,
i.e. $E\{\mathbf{x}_d^s(k)\mathbf{x}_d^s(l)^H\} = P_d^s\,\delta(k-l)\,\mathbf{I}_t$. Let $\mathbf{X}_d \triangleq [\mathbf{x}_d^s(1), \mathbf{x}_d^s(2), \ldots, \mathbf{x}_d^s(N_b)] \in \mathbb{C}^{t \times N_b}$
be the transmitted information symbol sequence. Each such sub-frame consists
of independent data symbols with the pilot sequence $\mathbf{X}_p \in \mathbb{C}^{t \times L_p}$, defined as
$\mathbf{X}_p \triangleq [\mathbf{x}_p(1), \mathbf{x}_p(2), \ldots, \mathbf{x}_p(L_p)]$, of length $L_p$ symbols and pilot power $P_t^s$, superimposed
over the data symbols, i.e. $\mathrm{tr}(\mathbf{X}_p\mathbf{X}_p^H) = tP_t^sL_p$. Also, let $\rho_d^s \triangleq (P_d^s/\sigma_n^2)$ and
$\rho_t^s \triangleq (P_t^s/\sigma_n^2)$ be the signal-to-noise power ratio (SNR) and pilot-to-noise power
ratio (PNR) respectively. A schematic diagram of this SP frame structure is given
in fig.(6.2). The actual transmitted symbol at the $k$th instant, $\mathbf{x}_s(k)$, is therefore
given as,
\[ \mathbf{x}_s(k) \triangleq \mathbf{x}_d^s(k) + \mathbf{x}_p^s(k) = \mathbf{x}_d^s(k) + \mathbf{x}_p(\mathrm{mod}(k-1, L_p) + 1). \tag{6.2} \]
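The SP system model has the form,
\[ \mathbf{y}_s(k) = \mathbf{H}\underbrace{\left(\mathbf{x}_d^s(k) + \mathbf{x}_p(\mathrm{mod}(k-1, L_p) + 1)\right)}_{\mathbf{x}_s(k)} + \boldsymbol{\eta}(k), \quad 1 \leq k \leq N_b, \tag{6.3} \]
where $\mathbf{y}_s(k), \mathbf{x}_s(k)$ are the $k$th received symbol vector and transmitted symbol vector
respectively. We employ a scheme similar to the ones suggested in [69, 71, 73] to
estimate the channel $\mathbf{H}$, which is described below. Let $\bar{\mathbf{y}}_s(k) \in \mathbb{C}^{r \times 1}$, $k \in \{1, \ldots, L_p\}$, be
defined as,
\[ \bar{\mathbf{y}}_s(k) \triangleq \frac{1}{N_f}\sum_{j=0}^{N_f-1}\mathbf{y}_s(k + jL_p), \quad 1 \leq k \leq L_p. \tag{6.4} \]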
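Let $\bar{\mathbf{Y}}_s \triangleq [\bar{\mathbf{y}}_s(1), \bar{\mathbf{y}}_s(2), \ldots, \bar{\mathbf{y}}_s(L_p)] \in \mathbb{C}^{r \times L_p}$ be a stacking of the processed received
symbol vectors. Statistically, $E\{\bar{\mathbf{Y}}_s\} = \mathbf{H}\mathbf{X}_p$. The channel estimate $\hat{\mathbf{H}}_s$ is now
computed by the standard least squares procedure [15] as,
\[ \hat{\mathbf{H}}_s = \bar{\mathbf{Y}}_s\mathbf{X}_p^{\dagger} = \bar{\mathbf{Y}}_s\mathbf{X}_p^H\left(\mathbf{X}_p\mathbf{X}_p^H\right)^{-1} = \mathbf{Y}_s\left(\mathbf{X}_p^s\right)^{\dagger}, \tag{6.5} \]
where $\mathbf{X}_p^s \triangleq [\mathbf{X}_p, \mathbf{X}_p, \ldots, \mathbf{X}_p] \in \mathbb{C}^{t \times N_fL_p}$ is the superimposed pilot signal. We refer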
to the above estimate as the mean-estimate for superimposed pilots as it employs
the mean of the received signal Ys. Since it is based only on the first order statistics
of Ys, it converges faster (compared to second and higher order statistics) while
having a low complexity of implementation. The estimate Hs is then used for
detection of the transmitted data xsd(k) after removing the superimposed pilot
symbol xp (mod(k − 1, Lp) + 1).
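The averaging in (6.4) and the least squares step in (6.5) can be sketched numerically as follows. This is a minimal illustration, not the implementation used for the results in this chapter: the function names `dft_pilots` and `sp_mean_estimate` are hypothetical, and the use of DFT rows as the orthogonal pilot block is an assumption made only to obtain a pilot matrix with orthogonal rows.

```python
import numpy as np

def dft_pilots(t, Lp, Pt):
    """Pilot block X_p (t x Lp) with X_p X_p^H = Pt * Lp * I_t (needs t <= Lp).
    Rows of an Lp-point DFT matrix are one convenient choice (an assumption)."""
    k = np.arange(Lp)
    rows = np.arange(t)[:, None]
    return np.sqrt(Pt) * np.exp(-2j * np.pi * rows * k / Lp)

def sp_mean_estimate(Ys, Xp):
    """Mean-estimate of (6.5): average the N_f sub-frames, then least squares."""
    r, Nb = Ys.shape
    Lp = Xp.shape[1]
    Nf = Nb // Lp
    # (6.4): column k of Ybar averages y_s(k), y_s(k + Lp), y_s(k + 2 Lp), ...
    Ybar = Ys.reshape(r, Nf, Lp).mean(axis=1)
    # (6.5): Ybar X_p^H (X_p X_p^H)^{-1}
    return Ybar @ Xp.conj().T @ np.linalg.inv(Xp @ Xp.conj().T)
```

With noise-free, data-free observations `Ys = H @ np.tile(Xp, (1, Nf))` the function returns `H` exactly; with data and noise present, the residual terms are those appearing in (6.6).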
6.2 MSE of Estimation
In this section, we first compute the MSE of estimation for the SP es-
timator given in (6.5). Following this, we present the Cramer-Rao bound (CRB)
for SP based estimation, which yields a lower asymptotic MSE than that obtained
for the estimator in (6.5), since the mean-estimate ignores the channel information
available in the second-order statistics (source covariance). We demonstrate
for a SIMO channel that this asymptotic MSE bound for SP is 3dB lower than
that achieved by the simplistic mean estimator and develop a semi-blind MIMO
estimation scheme that achieves this bound at high SNR ($\rho_d^s$).
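From equation (6.4) above, the quantity $\bar{\mathbf{Y}}_s$ is given as $\bar{\mathbf{Y}}_s = \mathbf{H}\mathbf{X}_p + \mathbf{H}\bar{\mathbf{X}}_d^s + \bar{\mathbf{N}}$,
where $\bar{\mathbf{X}}_d^s$ and $\bar{\mathbf{N}}$ are defined analogously for $\mathbf{x}_d^s(k), \boldsymbol{\eta}(k)$, $1 \leq k \leq N_b$, as
in (6.4). Simplifying the expression for the SP estimate given in (6.5), the quantity
$\hat{\mathbf{H}}_s$ can be seen to be given as,
\[ \hat{\mathbf{H}}_s = \mathbf{H} + \mathbf{H}\bar{\mathbf{X}}_d^s\mathbf{X}_p^H\left(\mathbf{X}_p\mathbf{X}_p^H\right)^{-1} + \bar{\mathbf{N}}\mathbf{X}_p^H\left(\mathbf{X}_p\mathbf{X}_p^H\right)^{-1}. \tag{6.6} \]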
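The MSE of the mean-estimate for SP, defined as $\mathrm{MSE}_s \triangleq E\{\|\hat{\mathbf{H}}_s - \mathbf{H}\|_F^2\}$, can
be simplified as demonstrated in appendix section (6.7.1) to yield,
\[ \mathrm{MSE}_s = E\left\{\mathrm{tr}\left((\hat{\mathbf{H}}_s - \mathbf{H})(\hat{\mathbf{H}}_s - \mathbf{H})^H\right)\right\} = \frac{1}{N_f}\left(\mathrm{tr}(\mathbf{H}\mathbf{H}^H)P_d^s + r\sigma_n^2\right)\mathrm{tr}\left((\mathbf{X}_p\mathbf{X}_p^H)^{-1}\right). \tag{6.7} \]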
The optimal pilot symbol matrix for SP estimation that minimizes the MSE of
estimation can be obtained as $\mathbf{X}_p^{\star} = \arg\min \mathrm{MSE}_s = \arg\min \mathrm{tr}\left((\mathbf{X}_p\mathbf{X}_p^H)^{-1}\right)$. The
following result gives the structure of the optimal pilot matrix $\mathbf{X}_p^{\star}$.
Lemma 8. For a fixed total pilot power $\mathrm{tr}(\mathbf{X}_p\mathbf{X}_p^H) = tL_pP_t^s$, the optimal pilot
symbol matrix $\mathbf{X}_p^{\star} \in \mathbb{C}^{t \times L_p}$ that minimizes the quantity $\mathrm{MSE}_s$, the MSE of estimation
of the MIMO channel $\mathbf{H}$ using superimposed pilots, is given by $\mathbf{X}_p^{\star}$ such that
$\mathbf{X}_p^{\star}(\mathbf{X}_p^{\star})^H = P_t^sL_p\mathbf{I}_t$, i.e. the pilot symbol matrix $\mathbf{X}_p$ is orthogonal.
Proof. Similar to [79,80].
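One standard way to see why orthogonality is optimal, offered here as a sketch and not necessarily the argument of [79,80], is through the eigenvalues $\lambda_1, \ldots, \lambda_t > 0$ of $\mathbf{X}_p\mathbf{X}_p^H$ (a symbol introduced only for this remark). The power constraint fixes their sum, and the arithmetic-harmonic mean inequality gives,
\[ \mathrm{tr}\left((\mathbf{X}_p\mathbf{X}_p^H)^{-1}\right) = \sum_{i=1}^{t}\frac{1}{\lambda_i} \geq \frac{t^2}{\sum_{i=1}^{t}\lambda_i} = \frac{t^2}{tL_pP_t^s} = \frac{t}{L_pP_t^s}, \]
with equality if and only if all $\lambda_i$ are equal, i.e. $\mathbf{X}_p\mathbf{X}_p^H = P_t^sL_p\mathbf{I}_t$.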
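In the remainder of the chapter we assume that $\mathbf{X}_p = \mathbf{X}_p^{\star}$, the optimal orthogonal
pilot sequence. Thus, the MSE for SP based estimation is given as,
\[ \mathrm{MSE}_s = \frac{tP_d^s}{N_bP_t^s}\mathrm{tr}(\mathbf{H}\mathbf{H}^H) + \frac{rt\sigma_n^2}{N_bP_t^s}. \tag{6.8} \]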
We wish to look at the dominant MSE term, which is achieved by defining $\mathrm{MSE}_s^{\infty} \in O(P_d^s)$,
the asymptotic MSE of the mean-estimate at high SNR, as,
\[ \mathrm{MSE}_s^{\infty} \triangleq P_d^s\left(\lim_{P_d^s \to \infty}\frac{\mathrm{MSE}_s}{P_d^s}\right) = \frac{tP_d^s}{N_bP_t^s}\mathrm{tr}(\mathbf{H}\mathbf{H}^H). \tag{6.9} \]
Hence, $\mathrm{MSE}_s$ can be expressed as $\mathrm{MSE}_s = \mathrm{MSE}_s^{\infty} + o(P_d^s)$. Ideally, it is desired that
$\mathrm{MSE}_s^{\infty} = 0$, to ensure that the MSE does not progressively increase without bound
as the source power $P_d^s$ increases. However, as is seen from above, this is not true of
the SP mean-estimate, which is adversely affected as $P_d^s$ increases. Thus, increasing
$P_d^s$ might potentially worsen the estimate $\hat{\mathbf{H}}_s$ and in turn result in
poor detection performance. We explore the Cramer-Rao bound in the
next section to address the issue of optimal MSE performance.
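The behaviour described above is easy to tabulate. The following sketch evaluates (6.8) and its dominant term (6.9) for a given channel; the function name `sp_mean_mse` and its argument names are hypothetical and chosen only for this illustration, and orthogonal pilots are assumed as in Lemma 8.

```python
import numpy as np

def sp_mean_mse(H, Pd, Pt, sigma2, Nb):
    """Return (MSE_s, MSE_s_inf) from (6.8) and (6.9) for orthogonal pilots."""
    r, t = H.shape
    gain = np.linalg.norm(H, "fro") ** 2          # tr(H H^H)
    mse_inf = t * Pd * gain / (Nb * Pt)           # data-induced term, grows with Pd
    mse = mse_inf + r * t * sigma2 / (Nb * Pt)    # plus the noise-induced term
    return mse, mse_inf
```

Sweeping `Pd` with the other arguments fixed shows the linear growth of the data-induced term, while the noise-induced term stays constant.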
6.2.1 Cramer-Rao Bound (CRB) for SP Estimation
In this section, we compute the complex CRB for the SP based estimation
of $\mathbf{H}$. To make the analysis tractable and demonstrate insights into SP
estimation, we assume a Gaussian symbol source, i.e. $\mathbf{x}_d(k) \sim \mathcal{N}(\mathbf{0}, P_d^s\mathbf{I}_t)$. It is
worth mentioning that the results derived employing this simplification are in close
agreement with the performance of a system employing a discrete signal constellation
such as quadrature phase-shift keying (QPSK). As suggested in [32] for the
construction of CRBs of complex parameters, let the complex parameter vector
$\boldsymbol{\theta} \in \mathbb{C}^{2rt \times 1}$ be constructed by stacking the parameter vector $\mathrm{vec}(\mathbf{H})$ and its conjugate as
$\boldsymbol{\theta} \triangleq [\mathrm{vec}(\mathbf{H}), \mathrm{vec}(\mathbf{H}^*)]^T$. From the SP system model for pilot symbol outputs given
in (6.3), the parameter dependent log-likelihood (log-likelihood ignoring additive
constants) for the estimation of the parameter vector $\boldsymbol{\theta}$ is given as,
\[ \mathcal{L}(\mathbf{Y}_s|\mathbf{X}_p^s; \boldsymbol{\theta}) = -N_b\ln|\mathbf{R}_e| - \sum_{i=1}^{N_b}\left(\mathbf{y}_s(i) - \mathbf{H}\mathbf{x}_p^s(i)\right)^H\mathbf{R}_e^{-1}\left(\mathbf{y}_s(i) - \mathbf{H}\mathbf{x}_p^s(i)\right), \tag{6.10} \]
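where $\mathbf{Y}_s \triangleq [\mathbf{y}_s(1), \mathbf{y}_s(2), \ldots, \mathbf{y}_s(N_b)]$, $\mathbf{x}_p^s(i) \triangleq \mathbf{x}_p(\mathrm{mod}(i-1, L_p) + 1)$ and $\mathbf{R}_e$, the
covariance of this effective noise, is given as $\mathbf{R}_e \triangleq P_d^s\mathbf{H}\mathbf{H}^H + \sigma_n^2\mathbf{I}_r$. The Cramer-Rao
Bound (CRB) for the estimation of $\boldsymbol{\theta}$ is given by the matrix $\mathbf{J}_{\theta}^{-1}$, where $\mathbf{J}_{\theta} \in \mathbb{C}^{2rt \times 2rt}$
is the complex Fisher information matrix (FIM) for the parameter vector $\boldsymbol{\theta} \in \mathbb{C}^{2rt \times 1}$
and is given as,
\[ \mathbf{J}_{\theta} = -E\left\{\frac{\partial^2\mathcal{L}(\mathbf{Y}_s|\mathbf{X}_p^s; \boldsymbol{\theta})}{\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^H}\right\}. \]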
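For readers who wish to experiment with the likelihood in (6.10), it can be evaluated for a candidate channel along the following lines. This is a sketch under the Gaussian-source assumption stated above; the function name `sp_log_likelihood` is hypothetical, and additive constants are dropped as in (6.10).

```python
import numpy as np

def sp_log_likelihood(Ys, Xsp, H, Pd, sigma2):
    """Log-likelihood (6.10), up to additive constants, for a candidate H."""
    r = H.shape[0]
    Nb = Ys.shape[1]
    Re = Pd * H @ H.conj().T + sigma2 * np.eye(r)   # effective noise covariance R_e(H)
    _, logdet = np.linalg.slogdet(Re)
    E = Ys - H @ Xsp                                 # residuals y_s(i) - H x_p^s(i)
    # sum_i e_i^H R_e^{-1} e_i, computed column-wise without forming R_e^{-1}
    quad = np.real(np.sum(E.conj() * np.linalg.solve(Re, E)))
    return -Nb * logdet - quad
```

Because $\mathbf{R}_e$ itself depends on $\mathbf{H}$, the function is not a simple quadratic in the channel, which is why a closed-form maximizer is not available in general.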
Therefore, the MSE lower bound for SP based estimation, denoted by $\mathrm{MSE}_b$, is given
as $\mathrm{MSE}_b = \mathrm{tr}(\mathbf{J}_{\theta}^{-1})$, which is also the asymptotic MSE of a maximum-likelihood
(ML) estimator which maximizes the likelihood in (6.10). The SP mean-estimate
suggested in [71] and described in equation (6.5) above is the ML estimate
ignoring the dependence of the covariance $\mathbf{R}_e$ on $\mathbf{H}$ and employs a straightforward
LS estimator, i.e.,
\[ \hat{\mathbf{H}}_s = \arg\min \mathcal{L}(\mathbf{Y}_s|\mathbf{X}_p^s, \mathbf{R}_e; \mathbf{H}) = \arg\min \sum_{i=1}^{N_b}\left(\mathbf{y}_s(i) - \mathbf{H}\mathbf{x}_p^s(i)\right)^H\mathbf{R}_e^{-1}\left(\mathbf{y}_s(i) - \mathbf{H}\mathbf{x}_p^s(i)\right), \]
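where $\mathbf{R}_e$ is assumed known. This procedure, although suboptimal, results in a
simple estimation algorithm when compared to minimizing the true cost function
involving $\mathbf{R}_e(\mathbf{H})$. The FIM corresponding to such an estimation procedure, which
exclusively employs the information in the pilots while ignoring the information in
the covariance $\mathbf{R}_e$, is given by the Pilot FIM (PFIM) component $\mathbf{J}_{\theta}^p$ of the total
FIM $\mathbf{J}_{\theta}$ as,
\[ \mathbf{J}_{\theta}^p = -E\left\{\frac{\partial^2\mathcal{L}(\mathbf{Y}_s|\mathbf{X}_p^s, \mathbf{R}_e; \mathbf{H})}{\partial\boldsymbol{\theta}\,\partial\boldsymbol{\theta}^H}\right\} = \begin{bmatrix} \left(\mathbf{X}_p^s(\mathbf{X}_p^s)^H\right) \otimes (\mathbf{R}_e^{-1})^T & \mathbf{0} \\ \mathbf{0} & \left(\mathbf{X}_p^s(\mathbf{X}_p^s)^H\right)^T \otimes \mathbf{R}_e^{-1} \end{bmatrix}, \tag{6.11} \]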
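where $\otimes$ denotes the matrix Kronecker product. Hence the MSE for the exclusive
pilot based SP estimation of $\mathbf{H}$ is given as,
\[ \mathrm{MSE}_b^p = \frac{1}{2}\mathrm{tr}\left((\mathbf{J}_{\theta}^p)^{-1}\right) = \mathrm{tr}\left(\left(\mathbf{X}_p^s(\mathbf{X}_p^s)^H\right)^{-1}\right)\mathrm{tr}(\mathbf{R}_e) = \mathrm{tr}(\mathbf{H}\mathbf{H}^H)\frac{P_d^s}{N_bP_t^s}t + \frac{\sigma_n^2}{N_bP_t^s}rt, \]
which is equal to the MSE for the SP estimate given in section (6.2). The factor
$\frac{1}{2}$ in the above expression is to account for the fact that the parameter vector $\boldsymbol{\theta}$
represents the MSE of $\mathbf{H}$ and $\mathbf{H}^*$.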
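Thus, the mean based estimate is suboptimal in the sense that it ignores
the information in the second order statistics (covariance $\mathbf{R}_e$) while minimizing the
likelihood in (6.10). The true FIM corresponding to information in both $\mathbf{X}_p$ and
$\mathbf{R}_e$ can be obtained as $\mathbf{J}_{\theta} = \mathbf{J}_{\theta}^p + \mathbf{J}_{\theta}^r$, where the FIM component $\mathbf{J}_{\theta}^r$ corresponds
to the information in the covariance matrix $\mathbf{R}_e$. Let the block Toeplitz parameter
derivative matrix $\mathbf{E}(i) \in \mathbb{C}^{r \times t}$ be defined employing complex derivatives as $\mathbf{E}(i) \triangleq \frac{\partial\mathbf{H}}{\partial\theta_i}$.
The component $\mathbf{J}_{\theta}^r$ can be seen to be given as [59,81],
\[ J_{i,j}^r = J_{rt+j,rt+i}^r = N_b(P_d^s)^2\,\mathrm{tr}\left(\mathbf{E}(i)\mathbf{H}^H\mathbf{R}_e^{-1}\mathbf{H}\mathbf{E}(j)^H\mathbf{R}_e^{-1}\right), \]
\[ J_{i,rt+j}^r = \left(J_{rt+j,i}^r\right)^* = N_b(P_d^s)^2\,\mathrm{tr}\left(\mathbf{E}(i)\mathbf{H}^H\mathbf{R}_e^{-1}\mathbf{E}(j)\mathbf{H}^H\mathbf{R}_e^{-1}\right). \]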
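The covariance FIM $\mathbf{J}_{\theta}^r$ can be block partitioned as,
\[ \mathbf{J}^r \triangleq N_b(P_d^s)^2\begin{bmatrix} \mathbf{J}_{11}^r & \mathbf{J}_{12}^r \\ \mathbf{J}_{21}^r & \mathbf{J}_{22}^r \end{bmatrix}. \]
It can be verified that $\mathbf{J}_{21}^r = (\mathbf{J}_{12}^r)^H$ and $\mathbf{J}_{22}^r = (\mathbf{J}_{11}^r)^T$. The block components of the
FIM are given as $\mathbf{J}_{11}^r = \left(\mathbf{H}^H\mathbf{R}_e^{-1}\mathbf{H}\right) \otimes (\mathbf{R}_e^{-1})^T$ and,
\[ \mathbf{J}_{12}^r = \left(\mathbf{e}^H \otimes \mathbf{H}^H\mathbf{R}_e^{-1} \otimes \mathbf{e}\right) \odot \left(\mathbf{e}^H \otimes \mathbf{H}^H\mathbf{R}_e^{-1} \otimes \mathbf{e}\right)^T, \]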
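where $\mathbf{e} = [1, 1, \ldots, 1]^T \in \mathbb{C}^{r \times 1}$ and $\odot$ denotes the matrix Hadamard product. The
expressions for the FIM components $\mathbf{J}_{\theta}^p$, $\mathbf{J}_{\theta}^r$ given above can be employed to obtain the
true FIM $\mathbf{J}_{\theta}$. Thus, the CRB for SP based estimation of $\mathbf{H}$ is obtained as,
\[ E\left\{(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}})(\boldsymbol{\theta} - \hat{\boldsymbol{\theta}})^H\right\} \geq \mathbf{J}_{\theta}^{-1}. \tag{6.12} \]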
Also, $\mathbf{J}_{\theta} > \mathbf{J}_{\theta}^p$ in the positive semi-definite matrix sense [82] and hence, $\mathrm{MSE}_b < \mathrm{MSE}_s$.
A more insightful result can be obtained in the context of a SIMO channel
$\mathbf{h} \in \mathbb{C}^{r \times 1}$, i.e. $t = 1$. The high SNR approximation to the CRB matrix given by
the result below yields a critical insight into the relation between this MSE bound
$\mathrm{MSE}_b$ and the quantity $\mathrm{MSE}_s$.
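Theorem 6. In the context of a SIMO wireless channel $\mathbf{h} \in \mathbb{C}^{r \times 1}$, the MSE bound
for SP based estimation is given as $\mathrm{MSE}_b = \mathrm{MSE}_b^{\infty} + o(P_d^s)$, where $\mathrm{MSE}_b^{\infty}$, the
high SNR asymptote, is,
\[ \mathrm{MSE}_b^{\infty} \triangleq P_d^s\left(\lim_{P_d^s \to \infty}\frac{\mathrm{MSE}_b}{P_d^s}\right) = \frac{1}{2}\left(\frac{P_d^s}{N_bP_t^s}\right)\|\mathbf{h}\|^2. \tag{6.13} \]
Proof. Given in appendix 6.2.1.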
Interesting observations can be made from the above result. The quantity
$\mathrm{MSE}_b^{\infty}$, or the dominant term in the SP MSE bound, increases linearly with $P_d^s$,
similar to the MSE of the simplistic mean-estimator in section (6.1). This means
that even if one were to use the available statistical information for estimation,
the least achievable MSE of estimation still increases with SNR, similar to the
mean-estimate in section (6.2) above. Hence, the problem of source-pilot power
allocation assumes a critical significance in the context of SP and is addressed in
section (6.4). We also have the following result.
Lemma 9. The asymptotic MSE measures $\mathrm{MSE}_b^{\infty}$ and $\mathrm{MSE}_s^{\infty}$, the asymptotic
MSE bound and the asymptotic MSE of the mean-estimate respectively for SP
based estimation, are related as,
\[ \frac{\mathrm{MSE}_b^{\infty}}{\mathrm{MSE}_s^{\infty}} = \frac{1}{2}. \tag{6.14} \]
Proof. Follows from (6.9) and (6.13).
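Spelling the step out for the SIMO case: setting $t = 1$ in (6.9) gives $\mathrm{tr}(\mathbf{h}\mathbf{h}^H) = \|\mathbf{h}\|^2$, so that,
\[ \frac{\mathrm{MSE}_b^{\infty}}{\mathrm{MSE}_s^{\infty}} = \frac{\frac{1}{2}\left(\frac{P_d^s}{N_bP_t^s}\right)\|\mathbf{h}\|^2}{\left(\frac{P_d^s}{N_bP_t^s}\right)\|\mathbf{h}\|^2} = \frac{1}{2}, \qquad 10\log_{10}\left(\tfrac{1}{2}\right) \approx -3.01\ \mathrm{dB}. \]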
The above result implies that at reasonably high SNRs, the MSE of esti-
mating the channel by employing the complete information in the likelihood (6.10)
is 3dB lower than that of the mean-estimate. Neglecting the covariance informa-
tion in Re results in a 3 dB loss of estimation performance in the SIMO context.
We now describe a semi-blind estimation algorithm below, which asymptotically
achieves the above MSE bound for SP based MIMO estimation and thus has a
lower MSE of estimation compared to the mean-estimator of section (6.1).
6.2.2 Semi-Blind SP Estimation
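In this section, we derive a semi-blind SP estimator that asymptotically
achieves the CRB for SP estimation by employing the information in the output
covariance $\mathbf{R}_e$. Observe that the output covariance $\mathbf{R}_y$ is given as,
\[ \mathbf{R}_y = E\{\mathbf{y}_s(i)\mathbf{y}_s(i)^H\} = (P_d^s + P_t^s)\mathbf{H}\mathbf{H}^H + \sigma_n^2\mathbf{I}_r = P_t^s\mathbf{H}\mathbf{H}^H + \mathbf{R}_e. \]
Hence, let the output covariance $\mathbf{R}_y$ be estimated from the received data symbols
as $\hat{\mathbf{R}}_y \triangleq \frac{1}{N_b}\left(\sum_{i=1}^{N_b}\mathbf{y}_s(i)\mathbf{y}_s(i)^H\right)$. Employing a Cholesky matrix factorization, one
can compute the matrix estimate $\hat{\mathbf{W}}$ such that,
\[ \hat{\mathbf{W}}\hat{\mathbf{W}}^H = \frac{1}{(P_d^s + P_t^s)}\left(\hat{\mathbf{R}}_y - \sigma_n^2\mathbf{I}_r\right). \tag{6.15} \]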
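The matrix $\mathbf{W}$ is also known as the whitening matrix [55] and differs from the
estimate of the channel $\hat{\mathbf{H}}_b$ by a unitary matrix $\mathbf{Q}_b$, i.e. $\hat{\mathbf{H}}_b = \hat{\mathbf{W}}\mathbf{Q}_b^H$. The unitary
matrix $\mathbf{Q}_b$ can be estimated from $\mathbf{X}_p^s$ by minimizing the likelihood,
\[ \hat{\mathbf{Q}}_b = \arg\min_{\mathbf{Q}}\ \mathrm{tr}\left(\left(\mathbf{Y}_s - \hat{\mathbf{W}}\mathbf{Q}^H\mathbf{X}_p^s\right)^H\hat{\mathbf{R}}_e^{-1}\left(\mathbf{Y}_s - \hat{\mathbf{W}}\mathbf{Q}^H\mathbf{X}_p^s\right)\right), \tag{6.16} \]
subject to the constraint $\mathbf{Q}_b\mathbf{Q}_b^H = \mathbf{Q}_b^H\mathbf{Q}_b = \mathbf{I}_t$. The quantity $\hat{\mathbf{R}}_e$ is given as
$\hat{\mathbf{R}}_e \triangleq P_d^s\hat{\mathbf{W}}\hat{\mathbf{W}}^H + \sigma_n^2\mathbf{I}_r$. It can then be demonstrated that the optimal unitary
matrix $\hat{\mathbf{Q}}_b$ that minimizes the above likelihood is given as,
\[ \hat{\mathbf{Q}}_b = \mathbf{U}_b\mathbf{V}_b^H, \quad \text{where } \mathbf{U}_b\boldsymbol{\Sigma}_b\mathbf{V}_b^H = \mathrm{SVD}\left(\mathbf{X}_p^s(\mathbf{Y}_s)^H\hat{\mathbf{R}}_e^{-1}\hat{\mathbf{W}}\right), \tag{6.17} \]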
under the condition that the pilot symbol matrix Xp is orthogonal, as has been as-
sumed for optimal MSE performance. The semi-blind channel estimate is obtained
as Hb = WQHb . This is akin to the whitening-rotation SB procedure elaborated
in [55]. The above SB estimator yields a biased estimate at low SNR, owing to the
constrained ML estimator in (6.16). However, the bias progressively decreases as
the SNR increases. Hence, theoretically, the SB estimator asymptotically achieves
the MSE lower bound in (6.13) at high SNR [55]. Simulation studies demonstrated
in section (6.5) suggest that the performance of the proposed SB scheme is close
to the bound even for moderately high SNR.
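The whitening-rotation steps in (6.15) to (6.17) can be sketched as below. This is an illustration under stated assumptions, not the simulation code behind the reported results: all function names are hypothetical, and the whitening factor is computed from a rank-$t$ eigendecomposition in place of the Cholesky factorization mentioned in the text, because the sample matrix $\hat{\mathbf{R}}_y - \sigma_n^2\mathbf{I}_r$ may be indefinite or rank deficient when $r > t$. Orthogonal pilots are assumed, as required for (6.17).

```python
import numpy as np

def whitening_matrix(Ry_hat, sigma2, P_total, t):
    """Return W (r x t) with W W^H ~ (Ry_hat - sigma2 I) / P_total, cf. (6.15).
    Assumption: a rank-t eigendecomposition replaces the Cholesky factor."""
    r = Ry_hat.shape[0]
    C = (Ry_hat - sigma2 * np.eye(r)) / P_total
    lam, V = np.linalg.eigh((C + C.conj().T) / 2)   # eigenvalues in ascending order
    lam_t = np.clip(lam[-t:], 0.0, None)            # keep the t largest, floor at 0
    return V[:, -t:] * np.sqrt(lam_t)

def rotation_estimate(Ys, Xsp, W, Re):
    """Unitary Q of (6.17): Q = U V^H from the SVD of X_p^s Y_s^H R_e^{-1} W."""
    M = Xsp @ Ys.conj().T @ np.linalg.solve(Re, W)
    U, _, Vh = np.linalg.svd(M)
    return U @ Vh

def semi_blind_estimate(Ys, Xsp, Pd, Pt, sigma2):
    """Whitening-rotation estimate H_b = W Q^H for orthogonal pilots."""
    r, Nb = Ys.shape
    t = Xsp.shape[0]
    Ry_hat = Ys @ Ys.conj().T / Nb                   # sample output covariance
    W = whitening_matrix(Ry_hat, sigma2, Pd + Pt, t)
    Re = Pd * W @ W.conj().T + sigma2 * np.eye(r)    # plug-in estimate of R_e
    Q = rotation_estimate(Ys, Xsp, W, Re)
    return W @ Q.conj().T
```

The rotation step only resolves the unitary ambiguity left by the whitening step, which is why the pilots are needed even though the covariance carries most of the channel information.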
6.3 Throughput Performance
One of the promising aspects of SP estimation compared to CP, is the
potential savings in bandwidth due to the transmission of superimposed data and
pilot signal, thereby eliminating an exclusive slot for the transmission of pilot sym-
bols. In this section we wish to quantify the throughput performance of SP and con-
trast it with that of CP to demonstrate the achievable bandwidth gains. The result
in [39] provides an expression to characterize the worst case capacity of a communi-
cation channel in the presence of channel estimation errors. The framework therein
relies on the central assumption that the channel estimate $\hat{\mathbf{H}}$ and the estimation
error $\mathbf{H} - \hat{\mathbf{H}}$ satisfy the decorrelation property, i.e. $E\{\hat{\mathbf{H}}(\mathbf{H} - \hat{\mathbf{H}})^H\} = \mathbf{0}_{r \times r}$,
which is satisfied by the minimum mean-squared error (MMSE) estimate. However,
this result cannot be used in the context of SP based estimation for the
following reasons.
A.1 The SP channel estimate $\hat{\mathbf{H}}_s$ is correlated with the transmitted data
symbols $\mathbf{X}_d^s$ as $E\{\hat{\mathbf{H}}_s[\mathbf{x}_d^s(k)]_i\} = \left(\frac{P_d^s}{N_bP_t^s}\right)\mathbf{h}_i(\mathbf{x}_p^s(k))^H$. This can be seen
from (6.6) and is unlike the scenario in [39] where the channel estimate is
uncorrelated with the data, i.e. $E\{\hat{\mathbf{H}}[\mathbf{x}_d^s(k)]_i\} = \mathbf{0}_{r \times r}$.
A.2 Further, it can also be observed that the decorrelation property mentioned
above is not satisfied by many estimators including the least-squares (LS)
estimator. For instance, it can be observed from section (6.2) that,
\[ \mathrm{tr}\left(E\left\{\hat{\mathbf{H}}_c(\mathbf{H} - \hat{\mathbf{H}}_c)^H\right\}\right) = -\left(\frac{t\sigma_n^2}{N_bP_t^s}\right)\mathrm{tr}(\mathbf{I}_r) \neq 0. \tag{6.18} \]
This is a disadvantage since the LS estimate is robust and has a low computational
complexity which makes it especially suited for implementation
in wireless systems. Therefore it is of interest to develop a framework that
takes into account such estimators.
The following discussion presents a result for the worst-case capacity $C_w$ of a
channel with non-zero signal-noise correlation. This framework can be employed
to quantify the throughput performance of the SP system. Further, we also utilize
this framework to demonstrate the bandwidth gains of SP when compared with a
system employing conventional pilots (CP) for channel estimation.
6.3.1 A Throughput Lower Bound for Channels with Correlated Noise
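In this section, similar to the result in [39], we derive an expression for
the throughput lower bound of a communication system with correlated noise.
Consider the communication channel,
\[ \mathbf{y}(k) = \mathbf{H}\mathbf{x}(k) + \mathbf{v}(k) = \mathbf{s}(k) + \mathbf{v}(k), \quad \mathbf{s}(k), \mathbf{y}(k) \in \mathbb{C}^{r \times 1}, \tag{6.19} \]
where $\mathbf{v}(k) \in \mathbb{C}^{r \times 1}$ is additive noise and $\mathbf{s}(k) \triangleq \mathbf{H}\mathbf{x}(k)$. The worst case capacity
for the above channel is,
\[ C_w = \min_{p_v(\cdot),\ \mathrm{tr}(\mathbf{R}_v) = r\sigma_n^2}\ \max_{p_x(\cdot),\ \mathrm{tr}(\mathbf{R}_x) = tP_d} I(\mathbf{y}; \mathbf{x}). \tag{6.20} \]
The important difference between the above model and the one in [39] is that
the signal and noise components $\mathbf{s}(k), \mathbf{v}(k)$ are not necessarily uncorrelated, i.e.
$E\{\mathbf{v}(k)\mathbf{s}(l)^H\} = \delta(k-l)\mathbf{R}_{vs} \neq \mathbf{0}$. The result below gives the expression for the
worst case capacity of the above channel.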
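Theorem 7. Worst Case Correlated Capacity: Let the system input-output
model of a matrix-valued noisy communication channel be given as,
\[ \mathbf{y}(k) = \mathbf{H}\mathbf{x}(k) + \mathbf{v}(k) = \mathbf{s}(k) + \mathbf{v}(k), \tag{6.21} \]
where $\mathbf{x}(k) \in \mathbb{C}^{t \times 1}$, $\mathbf{v}(k) \in \mathbb{C}^{r \times 1}$ represent the signal and the unknown noise components
respectively. Let $\mathbf{v}(k), \mathbf{x}(k)$ satisfy the covariance constraints,
\[ E\{\mathbf{x}(k)^H\mathbf{x}(l)\} = \delta(k-l)\,\mathrm{tr}(\mathbf{R}_x) = tP_d, \quad E\{\mathbf{v}(k)^H\mathbf{v}(l)\} = \delta(k-l)\,\mathrm{tr}(\mathbf{R}_v) = r\sigma_n^2, \]
and $\delta(k) = 1$ if and only if $k = 0$ and $\delta(k) = 0$ otherwise. Further, let the
correlation between the signal and noise components be given as,
\[ E\{\mathbf{v}(k)\mathbf{s}(l)^H\} = \delta(k-l)\mathbf{R}_{vs} = \delta(k-l)\mathbf{R}_{sv}^H, \]
where $\mathbf{R}_{vs}$ is not necessarily $\mathbf{0}_{r \times r}$. For the above communication system, the worst
case capacity $C_w$ as defined in (6.20) is given by the expression,
\[ C_w(\mathbf{R}_s, \mathbf{R}_v, \mathbf{R}_{vs}) = \min_{\mathrm{tr}(\mathbf{R}_v) = r\sigma_n^2}\ \max_{\mathrm{tr}(\mathbf{R}_x) = tP_d} \log\left|\mathbf{I} + \mathbf{R}_{v|s}^{-1}(\mathbf{R}_s + \mathbf{R}_{vs})\mathbf{R}_s^{-1}(\mathbf{R}_s + \mathbf{R}_{vs})^H\right|, \tag{6.22} \]
where the conditional covariance $\mathbf{R}_{v|s} \in \mathbb{C}^{r \times r}$ is given as $\mathbf{R}_{v|s} \triangleq \mathbf{R}_v - \mathbf{R}_{vs}\mathbf{R}_s^{-1}\mathbf{R}_{sv}$.
Proof. See appendix 6.7.3.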
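We employ this result next to derive expressions for the worst case capacity
of the SP and CP estimation schemes. Also, it can be seen from (6.22)
that for the case of uncorrelated noise, i.e. $\mathbf{R}_{sv} = \mathbf{R}_{vs} = \mathbf{0}_{r \times r}$, the expression
above reduces to the result in [39] for the uncorrelated signal-noise case,
\[ C_w = \min_{\mathrm{tr}(\mathbf{R}_v) = r\sigma_n^2}\ \max_{\mathrm{tr}(\mathbf{R}_x) = tP_d} \log\left|\mathbf{I} + \mathbf{R}_v^{-1}\mathbf{R}_s\right| = \min_{\mathrm{tr}(\mathbf{R}_v) = r\sigma_n^2}\ \max_{\mathrm{tr}(\mathbf{R}_x) = tP_d} \log\left|\mathbf{I} + \mathbf{R}_v^{-1}\mathbf{H}\mathbf{R}_x\mathbf{H}^H\right|. \tag{6.23} \]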
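For fixed covariance matrices, the log-determinant inside (6.22) is straightforward to evaluate numerically. The sketch below does only that: it does not perform the outer minimization and maximization, the function name `correlated_rate` is hypothetical, and the base-2 logarithm is an assumption made so that the value reads in bits per channel use.

```python
import numpy as np

def correlated_rate(Rs, Rv, Rvs):
    """Log-det term of (6.22) in bits, for fixed R_s, R_v and R_vs (no min/max).
    Assumes R_s and the conditional covariance R_{v|s} are invertible."""
    r = Rs.shape[0]
    Rsv = Rvs.conj().T
    Rv_s = Rv - Rvs @ np.linalg.solve(Rs, Rsv)           # R_{v|s}
    A = Rs + Rvs
    M = np.eye(r) + np.linalg.solve(Rv_s, A @ np.linalg.solve(Rs, A.conj().T))
    _, logdet = np.linalg.slogdet(M)                      # natural-log |M|
    return logdet / np.log(2.0)
```

Setting `Rvs` to zero recovers the uncorrelated expression $\log|\mathbf{I} + \mathbf{R}_v^{-1}\mathbf{R}_s|$ of (6.23), which gives a quick sanity check of an implementation.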
6.3.2 Throughput Comparison of Superimposed and Conventional Pilots (CP)
We now apply the result for the worst case capacity derived above to the
scenarios of SP and CP based channel estimation. Let $\tilde{\mathbf{y}}_s(k)$ denote the output of
the SP system after removal of the pilot symbol $\mathbf{x}_p(\mathrm{mod}(k-1, L_p) + 1)$ employing
the estimate $\hat{\mathbf{H}}_s$ described in (6.6). From (6.3), the input-output relation for the
SP channel after pilot removal is given as,
\[ \tilde{\mathbf{y}}_s(k) = \hat{\mathbf{H}}_s\mathbf{x}_d^s(k) + \underbrace{\left(\mathbf{H} - \hat{\mathbf{H}}_s\right)\left(\mathbf{x}_p(\mathrm{mod}(k-1, L_p) + 1) + \mathbf{x}_d^s(k)\right) + \boldsymbol{\eta}(k)}_{\mathbf{v}_s(k)}, \tag{6.24} \]
where $\mathbf{s}_s(k) \triangleq \hat{\mathbf{H}}_s\mathbf{x}_d^s(k)$ and $\mathbf{s}_s(k), \mathbf{v}_s(k) \in \mathbb{C}^{r \times 1}$ denote the effective SP channel
output (after pilot removal) and noise respectively at the $k$th time instant. The optimal
$\mathbf{R}_x$ which maximizes the worst case capacity in (6.22) depends on the channel
matrix $\mathbf{H}$ and can be fairly challenging to compute. However, in simplistic communication
scenarios where the channel information is not fed back to the transmitter,
a reasonable choice for the transmit covariance matrix is Rx = PtIt, where power
is loaded uniformly on all the transmit antennas. Further, since in our study we
are only interested in a comparison between the SP and CP scenarios, the above
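choice of $\mathbf{R}_x$ can be used as a benchmark. Hence, the throughput lower bound
for the SP system and the throughput bound for the SP semi-blind system, in bits per
channel use, are given as,
\[ C_w^s = C_w\left(\mathbf{R}_s^s, \mathbf{R}_v^s, \mathbf{R}_{vs}^s\right), \quad C_w^b = C_w\left(\mathbf{R}_s^b, \mathbf{R}_v^b, \mathbf{R}_{vs}^b\right), \tag{6.25} \]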
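where the expressions for the covariance matrices $\mathbf{R}_s, \mathbf{R}_v, \mathbf{R}_{vs}$ for the different estimation
schemes are listed in table (6.1). The covariance matrices associated with
the SP mean-estimator can be derived from the expression in (6.6) and employing
the expression for $E\{(\hat{\mathbf{H}}_s - \mathbf{H})(\hat{\mathbf{H}}_s - \mathbf{H})^H\}$, which can be simplified as,
\[ \begin{aligned} E\left\{(\hat{\mathbf{H}}_s - \mathbf{H})(\hat{\mathbf{H}}_s - \mathbf{H})^H\right\} &= \left(\frac{1}{N_bP_t^s}\right)^2 E\left\{\mathbf{H}\mathbf{X}_d^s\mathbf{X}_p^H\mathbf{X}_p(\mathbf{X}_d^s)^H\mathbf{H}^H\right\} + \left(\frac{1}{N_bP_t^s}\right)^2 E\left\{\mathbf{N}\mathbf{X}_p^H\mathbf{X}_p\mathbf{N}^H\right\} \\ &= \left(\frac{1}{N_bP_t^s}\right)^2\left(E\left\{\mathbf{H}\mathbf{X}_d^s\mathbf{G}(\mathbf{X}_d^s)^H\mathbf{H}^H\right\} + E\left\{\mathbf{N}_s\mathbf{G}\mathbf{N}_s^H\right\}\right) \\ &= \left(\frac{tP_d^s}{N_bP_t^s}\right)\mathbf{H}\mathbf{H}^H + \left(\frac{t\sigma_n^2}{N_bP_t^s}\right)\mathbf{I}_r, \qquad \mathbf{G} \triangleq \mathbf{X}_p^H\mathbf{X}_p, \end{aligned} \]
where the last equality follows from the simplification $E\{\mathbf{X}_d\mathbf{G}\mathbf{X}_d^H\} = \mathrm{tr}(\mathbf{G})P_d^s\mathbf{I}_t = tN_bP_t^sP_d^s\mathbf{I}_t$,
as demonstrated in appendix section (6.7.1). For the SP semi-blind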
estimate, we derive the approximate error covariance $E\{(\mathbf{H} - \hat{\mathbf{H}}_b)(\mathbf{H} - \hat{\mathbf{H}}_b)^H\}$
from the expression for the semi-blind CRB in (6.12). The quantity $\mathcal{J}^b$, the error
matrix for SP semi-blind estimation, is given as,
\[ E\left\{(\hat{\mathbf{H}}_b - \mathbf{H})(\hat{\mathbf{H}}_b - \mathbf{H})^H\right\} \approx \mathcal{J}^b, \quad \mathcal{J}_{ij}^b \triangleq \sum_{k=0}^{t-1}\left[\mathbf{J}_{\theta}^{-1}\right]_{i+kr,\,j+kr}. \tag{6.26} \]
The above expression provides a good lower-bound on the error covariance, and
is tight even at moderate SNR. It can hence be employed to derive the error
covariance matrices associated with SP semi-blind estimation.
Figure 6.3: Schematic of conventional (time-multiplexed) pilots frame (block)
structure.
6.3.3 Conventional Pilots (CP) based estimation
In contrast to SP, CP based channel estimation involves the exclusive
transmission of pilot symbols, which results in a bandwidth overhead. The CP
system frame can be modeled as a transmission of $L_p$ pilot symbols followed by
$(N_f - 1)L_p$ information bearing data symbols. Since the total pilot power in SP
is $N_bP_t^s$, we scale each CP pilot symbol by a factor of $\sqrt{N_f}$ to transmit equal
pilot power, i.e. $P_t^c = P_t^s\sqrt{N_f}$. Similarly the CP source power is scaled as $P_d^c =
P_d^s/\sqrt{1 - \frac{1}{N_f}}$ to ensure equal source power. A schematic diagram for the frame
structure of a CP based system is given in fig.(6.3). The input-output model for
the CP system is given as,
\[ \mathbf{y}_c(k) = \mathbf{H}\mathbf{x}_c(k) + \boldsymbol{\eta}(k), \quad \text{where } \mathbf{x}_c(k) = \begin{cases} \mathbf{x}_p^c(k) = \sqrt{N_f}\,\mathbf{x}_p^s(k), & 1 \leq k \leq L_p \\ \mathbf{x}_d^c(k) = \frac{1}{\sqrt{1 - \frac{1}{N_f}}}\,\mathbf{x}_d^s(k), & L_p + 1 \leq k \leq N_b \end{cases} \tag{6.27} \]
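Defining a stacking of the received pilot symbol outputs as
\[ \mathbf{Y}_c \triangleq [\mathbf{y}_c(1), \mathbf{y}_c(2), \ldots, \mathbf{y}_c(L_p)], \tag{6.28} \]
the conventional estimate $\hat{\mathbf{H}}_c$ is then given by the well known LS estimate as,
\[ \hat{\mathbf{H}}_c = \mathbf{Y}_c\left(\mathbf{X}_p^c\right)^{\dagger} = \mathbf{Y}_c\left(\mathbf{X}_p^c\right)^H\left(\mathbf{X}_p^c\left(\mathbf{X}_p^c\right)^H\right)^{-1}, \tag{6.29} \]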
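Table 6.1: Table showing covariance matrices for SP and CP systems with channel
estimation error.

| | SP | CP |
|---|---|---|
| $\mathbf{R}_{vs}$ | $\mathbf{R}_{vs}^s = -\frac{t(P_d^s)^2}{N_bP_t^s}\left(1 + \frac{t}{N_b}\right)\mathbf{H}\mathbf{H}^H - \frac{P_d^st\sigma_n^2}{N_bP_t^s}\mathbf{I}_r$ | $\mathbf{R}_{vs}^c = -\frac{t\sigma_n^2P_d^c}{N_fP_t^c}\mathbf{I}_r$ |
| $\mathbf{R}_s$ | $\mathbf{R}_s^s = P_d^s\left(1 + \frac{tP_d^s}{N_bP_t^s}\right)\mathbf{H}\mathbf{H}^H + \frac{t\sigma_n^2P_d^s}{N_bP_t^s}\mathbf{I}_r$ | $\mathbf{R}_s^c = P_d^c\mathbf{H}\mathbf{H}^H + \frac{tP_d^c\sigma_n^2}{N_fP_t^c}\mathbf{I}_r$ |
| $\mathbf{R}_v$ | $\mathbf{R}_v^s = \sigma_n^2\mathbf{I}_r + (P_d^s + P_t^s)\left(\frac{tP_d^s}{N_bP_t^s}\mathbf{H}\mathbf{H}^H + \frac{t\sigma_n^2}{N_bP_t^s}\mathbf{I}_r\right)$ | $\mathbf{R}_v^c = \sigma_n^2\mathbf{I}_r + \frac{t\sigma_n^2P_d^c}{N_fP_t^c}\mathbf{I}_r$ |

| | SP-SB | CSIR |
|---|---|---|
| $\mathbf{R}_{vs}$ | $\mathbf{R}_{vs}^b = -P_d^s\mathcal{J}^b$ | $\mathbf{R}_{vs}^p = \mathbf{0}_r$ |
| $\mathbf{R}_s$ | $\mathbf{R}_s^b = P_d^s\mathbf{H}\mathbf{H}^H + P_d^s\mathcal{J}^b$ | $\mathbf{R}_s^p = P_d\mathbf{H}\mathbf{H}^H$ |
| $\mathbf{R}_v$ | $\mathbf{R}_v^b = (P_d^s + P_t^s)\mathcal{J}^b + \sigma_n^2\mathbf{I}_r$ | $\mathbf{R}_v^p = \sigma_n^2\mathbf{I}_r$ |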
where $\mathbf{X}_p^c = [\mathbf{x}_p^c(1), \ldots, \mathbf{x}_p^c(L_p)]$. The worst case throughput performance of CP
is given as,
\[ C_w^c = \left(1 - \frac{1}{N_f}\right)C_w\left(\mathbf{R}_s^c, \mathbf{R}_v^c, \mathbf{R}_{vs}^c\right), \tag{6.30} \]
where the factor $\left(\frac{N_b - L_p}{N_b}\right) = \left(1 - \frac{1}{N_f}\right)$ arises due to a loss of one sub-frame per
frame owing to exclusive transmission of the pilot symbols. This results in a loss
in throughput in CP systems, especially for a low number of sub-frames $N_f$. As
illustrated by the simulation results, for reasonable values of SNR ($= P_d/\sigma_n^2$),
PNR ($= P_t/\sigma_n^2$) and number of sub-frames ($= N_f$), an SP scheme has a throughput
of approximately 0.5 bits per channel use greater than that of CP. This is predominantly
because the CP is disadvantaged by the loss of one sub-frame of bandwidth
due to the transmission of pilot symbols exclusively, while the estimation errors
are comparable at low SNRs. Hence, for reasonable SNRs and short data frame
sizes SP has a higher throughput than CP. This makes SP especially suitable for
employment in scenarios such as ad hoc and sensor networks, where the information
transmitted is typically bursty and of short duration and the pilot overhead
in CP would be comparable to the total transmitted data.
6.4 Optimal Power Allocation in SP
It can be seen from (6.6) that $\hat{\mathbf{H}}_s$, the estimate of the channel, is corrupted
by the data symbols $\mathbf{x}_d^s(k)$, which enhance the noise during the estimation of the
channel. This scenario presents an interesting tradeoff in SP systems. While on
one hand, higher data power improves the detection performance, it also results
in a poor channel estimate and loss in detection performance. In fact, for a given
number of frames $N_f$, if the source power $P_d$ is too high, the detection performance
tends to be very poor. Motivated by this observation, we derive expressions for
the optimal data SNR $\rho_d^s\ (\triangleq P_d^s/\sigma_n^2)$ in the SIMO context to maximize the post-processing
SNR (PSNR) for Capon beamforming. Consider the analogous SIMO
SP system model, obtained by setting the number of transmit antennas $t = 1$ in
(6.3), with channel vector denoted as $\mathbf{h}$. After estimation of $\hat{\mathbf{h}}_s$ and subtracting
the pilot symbol $x_p(\mathrm{mod}(k-1, L_p) + 1)$, the model for the detection of the symbol
$x_d^s(k)$ employing the computed estimate $\hat{\mathbf{h}}_s$ is given, as demonstrated in (6.24), as,
\[ \tilde{\mathbf{y}}_s(k) = \hat{\mathbf{h}}_sx_d^s(k) + \underbrace{\Delta\mathbf{h}_s\left(x_p(\mathrm{mod}(k-1, L_p) + 1) + x_d^s(k)\right) + \boldsymbol{\eta}(k)}_{\mathbf{v}_s(k)}, \tag{6.31} \]
where $\mathbf{v}_s(k)$ represents the effective detection noise and $\Delta\mathbf{h}_s$, the error in the estimate
of $\mathbf{h}$, is defined as $\Delta\mathbf{h}_s \triangleq \mathbf{h} - \hat{\mathbf{h}}_s$. The expression for the covariance of the
effective noise $\mathbf{R}_v^s$ is given in table 6.1. In the discussion below, we present the
expression for the optimum SNR-PNR allocation for PSNR maximization at the
receiver.
6.4.1 Minimum Variance Distortionless Response (MVDR) Beamformer
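The MVDR beamformer [83] $\mathbf{w}_m$ is given as a solution to the detection
SNR maximization criterion described as,
\[ \mathbf{w}_m = \arg\min_{\mathbf{w}}\ \mathbf{w}^H\mathbf{R}_v^s\mathbf{w} \quad \text{subject to} \quad \mathbf{w}^H\hat{\mathbf{h}}_s = 1. \]
From the result in [83], $\mathbf{w}_m$ is given as $\mathbf{w}_m^H = \left(\hat{\mathbf{h}}_s^H(\mathbf{R}_v^s)^{-1}\hat{\mathbf{h}}_s\right)^{-1}\hat{\mathbf{h}}_s^H(\mathbf{R}_v^s)^{-1}$. Substituting
this in (6.31) above, the expression for the estimation of $x_d(k)$ can be
obtained as,
\[ \mathbf{w}_m^H\tilde{\mathbf{y}}_s(k) = \mathbf{w}_m^H\hat{\mathbf{h}}_sx_d^s(k) + \mathbf{w}_m^H\mathbf{v}_s(k) = x_d^s(k) + \mathbf{w}_m^H\mathbf{v}_s(k). \]
Hence, the post-processing SNR for the MVDR beamformer can be seen to be
given as,
\[ \kappa_m = \frac{P_d^s}{E\{|\mathbf{w}_m^H\mathbf{v}_s(k)|^2\}} = P_d\,\hat{\mathbf{h}}_s^H(\mathbf{R}_v^s)^{-1}\hat{\mathbf{h}}_s. \tag{6.32} \]
[Plot: MIMO MSE Vs. SNR for SP Estimation, PNR = 5 dB, Nf = 10, Lp = 8. Horizontal axis: SNR; vertical axis: MSE. Curves: SP Mean Estimate, SP Mean Estimate Theory, SP CRB, SP SemiBlind, SP Semi Blind Asymp, CP.]
Figure 6.4: MSE of Estimation of MIMO wireless channel with r = t = 4, PNR =
5dB, Nf = 10 and Lp = 8 symbols.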
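The MVDR weights and the post-processing SNR of (6.32) can be computed as in the following sketch. The function names `mvdr_weights` and `mvdr_psnr` are hypothetical, and the noise covariance passed in is assumed Hermitian positive definite.

```python
import numpy as np

def mvdr_weights(h_hat, Rv):
    """MVDR weights w_m = R_v^{-1} h / (h^H R_v^{-1} h), so that w_m^H h = 1."""
    Rinv_h = np.linalg.solve(Rv, h_hat)
    return Rinv_h / (h_hat.conj() @ Rinv_h)

def mvdr_psnr(h_hat, Rv, Pd):
    """Post-processing SNR of (6.32): P_d h^H R_v^{-1} h."""
    return Pd * np.real(h_hat.conj() @ np.linalg.solve(Rv, h_hat))
```

The two functions are consistent in the sense that `Pd / (w^H Rv w)` evaluated at the MVDR weights equals `mvdr_psnr`, since the output noise power of the beamformer is $1/(\hat{\mathbf{h}}_s^H(\mathbf{R}_v^s)^{-1}\hat{\mathbf{h}}_s)$.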
As demonstrated in appendix 6.7.4, the above expression can be simplified by
substituting the expression for $\mathbf{R}_v^s$ in table 6.1 to yield,
\[ \kappa_m \approx \frac{\rho_t^s\rho_d^sN_b\|\mathbf{h}\|^2}{(\rho_d^s + \rho_t^s)\rho_d^s\|\mathbf{h}\|^2 + \rho_t^s(N_b + 1) + \rho_d^s}, \tag{6.33} \]
where $\rho_d^s, \rho_t^s$ are the data and pilot SNR respectively as defined previously. Let the
total symbol transmit power be constrained as,
\[ \rho_t^s + \rho_d^s = \alpha_s. \tag{6.34} \]
The optimum power allocation $\rho_t^{\star}, \rho_d^{\star}$ that maximizes the above expression for the
post-processing SNR $\kappa_m$ is given by the following result.
Lemma 10. The optimum PNR $\rho_t^{\star}$ that maximizes the post-processing SNR $\kappa_m$
for the MVDR beamformer with the transmit power constraint in (6.34) above, is
given as,
\[ \rho_t^{\star} = \frac{1}{\gamma}\left(\sqrt{\delta^2 + \delta\alpha_s\gamma} - \delta\right), \quad \rho_d^{\star} = \alpha_s - \rho_t^{\star}, \tag{6.35} \]
where $\delta \triangleq (\alpha_s)^2\|\mathbf{h}\|^2 + \alpha_s$ and $\gamma \triangleq N_b - \alpha_s\|\mathbf{h}\|^2$.
Proof. The above result can be readily obtained by differentiating the expression
in (6.33) and noting that $\rho_t^{\star} > 0$.
The expression in (6.35) gives the optimum pilot and data power allocation
that maximizes the post-processing SNR for MVDR reception.
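The closed form in (6.35) can be checked against a direct search over the constraint (6.34). In the sketch below, the function names `psnr` and `optimal_pnr` are hypothetical, all SNR quantities are in linear scale (not dB), and the case $\gamma = 0$ is excluded by assumption since (6.35) divides by $\gamma$.

```python
import numpy as np

def psnr(rho_t, alpha, h_norm2, Nb):
    """Approximate post-processing SNR (6.33) with rho_d = alpha - rho_t."""
    rho_d = alpha - rho_t
    den = (rho_d + rho_t) * rho_d * h_norm2 + rho_t * (Nb + 1) + rho_d
    return rho_t * rho_d * Nb * h_norm2 / den

def optimal_pnr(alpha, h_norm2, Nb):
    """Closed-form allocation (6.35); assumes gamma = Nb - alpha*||h||^2 != 0."""
    delta = alpha ** 2 * h_norm2 + alpha
    gamma = Nb - alpha * h_norm2
    rho_t = (np.sqrt(delta ** 2 + delta * alpha * gamma) - delta) / gamma
    return rho_t, alpha - rho_t
```

Under the constraint, the denominator of (6.33) becomes $\delta + \gamma\rho_t^s$, so setting the derivative to zero gives the quadratic $\gamma(\rho_t^s)^2 + 2\delta\rho_t^s - \alpha_s\delta = 0$, whose positive root is (6.35). A grid search over `psnr` for $0 < \rho_t^s < \alpha_s$ should therefore land on the value returned by `optimal_pnr`.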
[Plot: SIMO Estimation MSE Vs SNR, PNR = 8 dB, Nf = 20, Lp = 12. Horizontal axis: SNR; vertical axis: Estimation MSE. Curves: SP Mean Estimate, SP Mean Estimate Theory, SP CRB, SP CRB Asymp, SP Semi Blind, MATLAB, SP Semi Blind Asymp, CP.]
Figure 6.5: MSE of Estimation of SIMO Rayleigh wireless channel with r = 4
antennas, Nf = 20, Lp = 8, PNR = 5dB.
6.5 Simulation Results
6.5.1 MSE of Estimation
In our simulations we employ a MIMO/SIMO wireless channel with $r = 4$
receive antennas, $t \in \{4, 1\}$ transmit antennas and QPSK symbol modulation. We
consider a Rayleigh fading channel with coefficients $H_{ij}$, $1 \leq i, j \leq 4$, generated as
independent zero-mean circularly symmetric complex Gaussian random variables of
unit variance, i.e. $E\{|H_{ij}|^2\} = 1$. In the first example we consider the estimation
performance of the semi-blind SP scheme with an orthogonal pilot sequence Xp of
length Lp = 12 symbols and Nf = 10 sub-frames per frame, or a total of Nb = 120
symbols per frame. It is assumed that the channel does not vary significantly
over the block, or in other words, the channel coherence time is larger than the
block duration. Fig.(6.4) shows computed MSE averaged over 2000 independent
realizations of the wireless channel H. It is seen that the MSE of the SP estimate
Hs given by (6.5) is in close agreement with theory from section(6.2). The semi-
blind estimate in (6.17) has a lower MSE than the mean-estimate and achieves the
CRB in (6.12). The asymptotic semi-blind estimator, which has the least MSE, is
the semi-blind estimate as $N_b \to \infty$, implying that the estimate of the whitening
matrix $\hat{\mathbf{W}} = \mathbf{W}$. It can also be seen that even though the CRB results are derived
assuming Gaussian signaling, they are in close agreement with the performance of
a system employing a discrete constellation, QPSK in the above case. Fig.(6.5)
shows the MSE of estimation of a SIMO channel h with r = 4 receive antennas. In
this scenario, the semi-blind estimate in (6.17) involves the constrained estimation
of a scalar phase. As illustrated in lemma 9, it is seen that at high SNR the
SP MSE bound is 3dB lower than the MSE of the SP mean-estimate. We also
plot the MSE of the estimate $\hat{\mathbf{h}}_f$, which is obtained by employing a numerical
optimization routine fminunc(·) in MATLAB to optimize the likelihood in (6.10).
The mean-estimate $\hat{\mathbf{h}}_s$ is employed to initialize the procedure. This estimate can
also be seen to achieve the asymptotic MSE bound for SP estimation. Thus, both
these estimators asymptotically outperform the SP mean-estimator. Finally, the
MSE of the CP estimate from (6.29) is plotted for comparison and can be seen to
outperform SP based estimation. This is expected as the performance of CP is not
limited by the data SNR. However, CP has a net throughput loss compared to SP
as is seen next.
[Plot: Worstcase MIMO Throughput Outage with SP. Horizontal axis: Throughput (bits/channel use); vertical axis: outage probability. Curves: CP, SP-Mean, SP-SemiBlind, Perfect CSIR.]
Figure 6.6: Throughput performance of SP and CP vs. Nf , SNR = PNR = 5dB,
Lp = 64.
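The agreement between the simulated MSE of the mean-estimate and the theoretical value in (6.8) can be reproduced in outline with a short Monte Carlo experiment. The sketch below is not the code used to generate the figures: the function name `simulate_sp_mse`, the DFT-row pilot construction and the trial count are assumptions made for this illustration, and powers are in linear scale.

```python
import numpy as np

def simulate_sp_mse(H, Pd, Pt, sigma2, Lp, Nf, trials, rng):
    """Empirical MSE of the SP mean-estimate (6.5) for a fixed channel H."""
    r, t = H.shape
    Nb = Lp * Nf
    # Orthogonal pilot block (DFT rows, an assumption) repeated over N_f sub-frames
    Xp = np.sqrt(Pt) * np.exp(-2j * np.pi * np.outer(np.arange(t), np.arange(Lp)) / Lp)
    Xsp = np.tile(Xp, (1, Nf))
    pinv = Xsp.conj().T / (Nb * Pt)        # (X_p^s)^dagger, valid for orthogonal pilots
    err = 0.0
    for _ in range(trials):
        bits = rng.integers(0, 2, size=(2, t, Nb)) * 2 - 1
        Xd = np.sqrt(Pd / 2) * (bits[0] + 1j * bits[1])          # QPSK, power Pd per antenna
        N = np.sqrt(sigma2 / 2) * (rng.standard_normal((r, Nb))
                                   + 1j * rng.standard_normal((r, Nb)))
        Ys = H @ (Xd + Xsp) + N                                   # SP model (6.3)
        err += np.linalg.norm(Ys @ pinv - H, "fro") ** 2
    return err / trials
```

For a fixed channel, the returned average should approach $\frac{tP_d^s}{N_bP_t^s}\mathrm{tr}(\mathbf{H}\mathbf{H}^H) + \frac{rt\sigma_n^2}{N_bP_t^s}$ as the number of trials grows; averaging additionally over channel realizations gives curves of the kind plotted against SNR.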
6.5.2 Throughput Performance
Utilizing the framework of worst case correlated capacity developed in
section(6.3.2), we compute the net throughput performance of a superimposed
pilot system and contrast it with the performance of a system employing conven-
tional or time-multiplexed pilots. The results shown in fig. (6.6) consider a system
with Lp = 64 pilot symbols, Nf = 4 sub-frames and PNR, SNR fixed at 5dB.
Employing the expressions in (6.25) and (6.30), we plot the probability of outage
of the throughput lower bounds for SP and CP. In Fig. (6.6) it can be seen that
the throughput of SP semi-blind estimation and SP mean-estimation is approximately
1.0 and 0.5 bits per channel use, respectively, higher than that of CP based
estimation. This throughput margin progressively decreases as the CP bandwidth
loss relative to the block length, i.e. Lp symbols per block of Nb = NfLp symbols,
decreases. Thus, SP estimation can potentially yield significant bandwidth gains,
especially in scenarios which warrant communication in bursts of shorter block
lengths, where employing CP results in significant pilot overheads.
Fig. (6.7) shows the net throughput, computed from the bit error rate (BER) of detection, for SP and CP
based detection employing a QPSK symbol constellation, for SNR in the range
2 − 35dB, Nf = 10 sub-frames and pilot sequence length Lp = 64. We employ
a minimum mean-squared error (MMSE) receiver for symbol detection. The net
throughput expressions for SP and CP, denoted by µs, µc respectively, are given
as,
$$\mu_s = t\,q\,(1 - p_e^s), \qquad \mu_c = t\,q\left(1 - \frac{1}{N_f}\right)(1 - p_e^c), \tag{6.36}$$
where $q$ is the number of bits per complex symbol ($q = 2$ for QPSK) and $p_e^s$, $p_e^c$
are the bit error rates of the SP and CP systems respectively. The SP throughput
with perfect channel state information at receiver (Per CSIR) is also given for
comparison. It can be observed that SP based detection schemes, both mean and
semi-blind, outperform CP by about 0.7 bits per channel use in the mid-SNR range
of 5 − 15dB. This experiment demonstrates a practical MIMO scenario where SP
yields throughput gains over CP by avoiding the exclusive transmission of pilots.
As the SNR increases, the throughput of the SP scheme progressively worsens due
to increasing signal interference.
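The net throughput expressions in (6.36) are simple enough to evaluate directly. The following is a minimal sketch, assuming $t$ is the number of transmit antennas; the function names and the bit error rates used are hypothetical illustration values, not results from this chapter.

```python
def net_throughput_sp(t, q, p_e_sp):
    """SP net throughput mu_s = t*q*(1 - p_e^s), in bits per channel use."""
    return t * q * (1.0 - p_e_sp)


def net_throughput_cp(t, q, n_f, p_e_cp):
    """CP net throughput mu_c = t*q*(1 - 1/N_f)*(1 - p_e^c)."""
    return t * q * (1.0 - 1.0 / n_f) * (1.0 - p_e_cp)


# Hypothetical operating point: 4 transmit antennas, QPSK, N_f = 10 sub-frames.
# The error rates below are made-up inputs chosen only to exercise the formulas.
t, q, n_f = 4, 2, 10
mu_s = net_throughput_sp(t, q, p_e_sp=0.05)
mu_c = net_throughput_cp(t, q, n_f, p_e_cp=0.02)
print(f"mu_s = {mu_s:.3f}, mu_c = {mu_c:.3f} bits/channel use")
```

With these inputs SP retains a higher net throughput despite its larger error rate, because CP pays the fixed $1/N_f$ pilot overhead.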
6.5.3 Optimal Power Allocation
Next we consider the problem of optimal data/pilot transmit power allocation
for receive post-processing SNR (PSNR) maximization, as examined in
section(6.4). Fig.(6.8) demonstrates the symbol error rate (SER) performance as
a function of the transmitted SNR for a QPSK transmit constellation, when
MVDR beamforming is employed at the receiver. We consider the receiver SER
corresponding to several choices of sub-frame number, pilot sequence length and
total transmit power, given in the figure by the legend entry [Nf , Lp, αs (dB)]. For
instance, the legend entry [16, 8, 12.5] denotes the SIMO SER performance curve
for Nf = 16, Lp = 8, αs = 12.5dB. The SER performance reaches a minimum
for a unique SNR (in this scenario for ρd ≈ 11dB) and increases for higher data
power allocation. The corresponding vertical line represents the analytically com-
puted optimal power allocation from the expression in (6.35) and is seen to be
approximately 0.5dB away from optimal performance. Thus, it yields a reliable
benchmark for optimal power allocation. Fig. (6.9) illustrates the optimal power
allocation ratio $10\log_{10}\left(\rho_d^\star/\rho_t^\star\right)$ vs. total transmit power $\alpha_s$ dB for different
pilot lengths $L_p$ and numbers of sub-frames $N_f$. It can be seen that as the block length
($N_fL_p$) increases, the fraction of pilot power decreases from −8dB to −13dB. Further,
increasing total transmit power results in increasing pilot power allocation to
offset the increase in estimation error from data.
6.6 Conclusion
In the above work we have derived the CRB for SP estimation and demon-
strated a semi-blind scheme that achieves this CRB. We have analyzed the effective
throughput of SP and CP systems employing a novel result for the worst-case ca-
pacity with correlated symbols and noise. It has been observed that SP has a
higher effective throughput than CP based systems. Thus, SP based estimation
can lead to a significant conservation of bandwidth in communication systems.
Acknowledgement
The text of this chapter, in part, is a reprint of the material as it appears
in A. K. Jagannatham and B. D. Rao, “Superimposed Vs. Conventional Pilots for
Channel Estimation”, Conference Record of the Fortieth Asilomar Conference on
Signals, Systems and Computers, Nov., 2006.
[Figure 6.7: Throughput performance of SP and CP Vs. SNR for a 4 × 4 Rayleigh flat-fading MIMO channel with Nf = 10 sub-frames and Lp = 64 pilots. Plot title: "Throughput Vs SNR, Nf = 10, Lp = 64, PNR = 5 dB"; x-axis: SNR (dB); y-axis: Throughput; curves: SP, CP, SP-SB, SP-SB Asymp, SP-Per CSIR.]
[Figure 6.8: Detection performance vs. SNR for SP based estimation. SER vs. SNR (Pd/σn^2) for QPSK signaling, r = 4 SIMO channel and different [Nf , Lp, αs (dB)]. Plot title: "SER Vs. SNR for Total Power Constraint, [Nf, Lp, α (dB)]"; x-axis: SNR (dB); y-axis: SER; curves: [8,4,15], [8,8,12.5], [16,8,12.5], each with its ρopt marker.]
6.7 Appendix for Chapter(6)
6.7.1 Proof of Expression for MSEs in section(6.2)
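The expression for $\mathrm{MSE}_s$ can be simplified as,
$$\mathrm{MSE}_s = \mathrm{tr}\left(E\left\{\left(H X_d^s\right) F_p \left(H X_d^s\right)^H\right\}\right) + \mathrm{tr}\left(E\left\{N F_p N^H\right\}\right),$$
where $F_p \triangleq \left(X_p^H\left(X_p X_p^H\right)^{-1}\right)\left(X_p^H\left(X_p X_p^H\right)^{-1}\right)^H$. Let $U = [u(1), u(2), \ldots, u(L)] \in \mathbb{C}^{m\times L}$ be any matrix such that $E\left\{u(k)u(l)^H\right\} = \sigma_u^2\,\delta(k-l)\,I$. Then, $E\left\{U F_p U^H\right\}$ is given as,
$$E\left\{U F_p U^H\right\} = \sum_{j=1}^{L}\sum_{i=1}^{L} E\left\{u(i)\,[F_p]_{ij}\,u(j)^H\right\} = \sum_{j=1}^{L}\sum_{i=1}^{L} \sigma_u^2\,\delta(i-j)\,[F_p]_{ij}\,I_m = \sum_{j=1}^{L}\sigma_u^2\,[F_p]_{jj}\,I_m = \sigma_u^2\,\mathrm{tr}(F_p)\,I_m.$$
Hence, it follows that $E\left\{X_d^s F_p \left(X_d^s\right)^H\right\} = \frac{P_d^s}{N_f}\,\mathrm{tr}(F_p)\,I_t$ and $E\left\{N F_p N^H\right\} = \frac{\sigma_n^2}{N_f}\,\mathrm{tr}(F_p)\,I_r$.
Further,
$$\mathrm{tr}(F_p) = \mathrm{tr}\left(\left(X_p^H\left(X_p X_p^H\right)^{-1}\right)\left(X_p^H\left(X_p X_p^H\right)^{-1}\right)^H\right) = \mathrm{tr}\left(\left(X_p X_p^H\right)^{-1}\left(X_p X_p^H\right)^{-1} X_p X_p^H\right) = \mathrm{tr}\left(\left(X_p X_p^H\right)^{-1}\right).$$
Hence the expression in (6.7) follows.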
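The trace identity $\mathrm{tr}(F_p) = \mathrm{tr}\left((X_pX_p^H)^{-1}\right)$ used above can be checked numerically. The sketch below does so in plain Python for a hypothetical $t = 2$ pilot block of length $L = 4$; the pilot values and the helper names are illustration-only assumptions.

```python
def herm(a):
    """Conjugate transpose of a matrix stored as a list of rows."""
    return [[a[i][j].conjugate() for i in range(len(a))] for j in range(len(a[0]))]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inv2(m):
    """Inverse of a 2 x 2 matrix via the adjugate formula."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


def trace(m):
    return sum(m[i][i] for i in range(len(m)))


# Hypothetical pilot matrix X_p (t = 2 rows, L = 4 columns), QPSK-like entries.
X_p = [[1 + 1j, 1 - 1j, -1 + 1j, 1 + 1j],
       [1 - 1j, 1 + 1j, 1 + 1j, -1 - 1j]]

gram_inv = inv2(matmul(X_p, herm(X_p)))   # (X_p X_p^H)^{-1}, t x t
A = matmul(herm(X_p), gram_inv)           # X_p^H (X_p X_p^H)^{-1}, L x t
F_p = matmul(A, herm(A))                  # L x L
lhs, rhs = trace(F_p), trace(gram_inv)
print(abs(lhs - rhs))                     # numerically negligible
```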
6.7.2 Proof of Theorem 6
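It can be observed that,
$$E\left\{\mathrm{tr}\left(\frac{\partial\left(y_s(i) - h\,x_p^s(i)\right)^H}{\partial\theta_i}\,\frac{\partial R_e^{-1}}{\partial\theta_j^*}\,\left(y_s(i) - h\,x_p^s(i)\right)\right)\right\} = \mathrm{tr}\left(x_p^s(i)^*\,\frac{\partial h^H}{\partial\theta_i}\,\frac{\partial R_e^{-1}}{\partial\theta_j^*}\,E\left\{y_s(i) - h\,x_p^s(i)\right\}\right) = 0$$
since $E\left\{y_s(i) - h\,x_p^s(i)\right\} = E\left\{h\,x_d^s(i) + \eta(i)\right\} = 0$. Hence, the total FIM corresponding to information in both $X_p$ and $R_e$ can be obtained as $J_\theta = J_\theta^p + J_\theta^r$, where the FIM component $J_\theta^r$ corresponds to the information in the covariance matrix $R_e$. From the results for the FIM of a complex Gaussian stochastic process [59,81], the covariance FIM component $J_\theta^r \in \mathbb{C}^{2r\times 2r}$ is given as,
$$J_\theta^r(i,j) = J_\theta^r(r+j,\,r+i) = N_b\left(P_d^s\right)^2\,\mathrm{tr}\left(\frac{\partial h}{\partial h_i}\,h^H R_e^{-1} h\,\left(\frac{\partial h}{\partial h_j}\right)^H R_e^{-1}\right), \quad 1 \le i,j \le r,$$
$$J_\theta^r(i,\,r+j) = \left(J_\theta^r(r+j,\,i)\right)^* = N_b\left(P_d^s\right)^2\,\mathrm{tr}\left(\frac{\partial h}{\partial h_i}\,h^H R_e^{-1}\,\frac{\partial h}{\partial h_j}\,h^H R_e^{-1}\right), \quad 1 \le i,j \le r.$$
[Figure 6.9: Optimal power allocation ratio 10 log10(ρd⋆/ρt⋆) of a r = 4 antenna SIMO channel Vs. Total Power (αs dB) for various Nf , Lp. Plot title: "Optimal Power Allocation for Superimposed Pilots"; x-axis: Total SNR (ρs + ρt) dB; y-axis: ρt^opt/ρs^opt dB; curves: Nf = 64, Lp = 32; Nf = 32, Lp = 16; Nf = 16, Lp = 16.]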
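It can then be shown after simplification that the matrix $J_\theta^r$ is given as,
$$J_\theta^r = N_b\left(P_d^s\right)^2\begin{bmatrix}\left(h^H R_e^{-1} h\right)\left(R_e^{-1}\right)^T & \left(h^H R_e^{-1}\right)^T h^H R_e^{-1}\\[2pt] \left(h^H R_e^{-1}\right)^H\left(h^H R_e^{-1}\right)^* & \left(h^H R_e^{-1} h\right) R_e^{-1}\end{bmatrix}.$$
Using results on matrix inversion [82], the quantities $h^H R_e^{-1}$ and $h^H R_e^{-1} h$ can be
further simplified as $h^H R_e^{-1} = \frac{h^H}{\sigma_n^2 + P_d^s\|h\|^2}$ and $h^H R_e^{-1} h = \frac{\|h\|^2}{\sigma_n^2 + P_d^s\|h\|^2}$. Substituting
these expressions in the FIM expression above we obtain the final expression for
$J_\theta^r$. The SP FIM is given as,
$$J_\theta = N_b P_t^s\begin{bmatrix}\left(R_e^{-1}\right)^T & 0_{r\times r}\\ 0_{r\times r} & R_e^{-1}\end{bmatrix} + N_b\left(P_d^s\right)^2\begin{bmatrix}\dfrac{\|h\|^2\left(R_e^{-1}\right)^T}{\sigma_n^2 + P_d^s\|h\|^2} & \dfrac{h^* h^H}{\left(\sigma_n^2 + P_d^s\|h\|^2\right)^2}\\[8pt] \dfrac{h\,h^T}{\left(\sigma_n^2 + P_d^s\|h\|^2\right)^2} & \dfrac{\|h\|^2 R_e^{-1}}{\sigma_n^2 + P_d^s\|h\|^2}\end{bmatrix}.$$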
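Let the constants $\alpha, \beta, \gamma, \theta$ be defined as $\alpha \triangleq N_b P_t^s$, $\gamma \triangleq \sigma_n^2 + P_d^s\|h\|^2$, $\beta \triangleq \frac{N_b\left(P_d^s\right)^2}{\gamma}$
and $\theta \triangleq \frac{\beta}{\gamma}$. Substituting these in the above expression for the FIM, $J_\theta$ can be
written as,
$$J_\theta = \underbrace{\begin{bmatrix}\left(\alpha + \beta\|h\|^2\right)\left(R_e^{-1}\right)^T & 0\\ 0 & \left(\alpha + \beta\|h\|^2\right) R_e^{-1}\end{bmatrix}}_{K_\theta^{-1}} + \theta\begin{bmatrix}0 & h^* h^H\\ h\,h^T & 0\end{bmatrix},$$
where $K_\theta$ is defined as,
$$K_\theta \triangleq \frac{1}{\alpha + \beta\|h\|^2}\begin{bmatrix}R_e^T & 0\\ 0 & R_e\end{bmatrix}.$$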
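Employing the matrix inversion lemma [82], the CRB for the parameter vector $\theta$
given by $J_\theta^{-1}$ can be expressed as,
$$J_\theta^{-1} = K_\theta - K_\theta\begin{bmatrix}h^* & 0\\ 0 & h\end{bmatrix} R_\theta^{-1}\begin{bmatrix}0 & h^H\\ h^T & 0\end{bmatrix} K_\theta,$$
where the $2\times 2$ matrix $R_\theta$ is defined as
$$R_\theta \triangleq \frac{1}{\theta} I_2 + \begin{bmatrix}0 & h^H\\ h^T & 0\end{bmatrix} K_\theta\begin{bmatrix}h^* & 0\\ 0 & h\end{bmatrix} = \frac{1}{\theta} I_2 + \frac{1}{\alpha + \beta\|h\|^2}\begin{bmatrix}0 & h^H R_e h\\ h^T R_e^T h^* & 0\end{bmatrix}.$$
The MSE bound for the estimation of the parameter vector $\theta$ is given as
$$\mathrm{MSE}_b = \frac{1}{2}\,\mathrm{tr}\left(J_\theta^{-1}\right) = \frac{1}{2}\,\mathrm{tr}\left(K_\theta\right) - \frac{1}{2}\,\mathrm{tr}\left(R_\theta^{-1}\begin{bmatrix}0 & h^H\\ h^T & 0\end{bmatrix} K_\theta K_\theta\begin{bmatrix}h^* & 0\\ 0 & h\end{bmatrix}\right).$$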
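Simplifying the above expression, it can be demonstrated that the MSE lower
bound for the estimation of $h$ is given as
$$E\left\{\left\|\hat{h} - h\right\|^2\right\} \ge \frac{\mathrm{tr}(R_e)}{\alpha + \beta\|h\|^2} + \frac{\left(h^H R_e h\right)}{\left(\alpha + \beta\|h\|^2\right)}\,\frac{\left(h^H R_e R_e h\right)}{\left(\alpha + \beta\|h\|^2\right)^2\,|R_\theta|}, \tag{6.37}$$
where $|R_\theta|$ is the determinant of the matrix $R_\theta$ and is given as,
$$|R_\theta| = \frac{1}{\theta^2} - \frac{\left(h^H R_e h\right)\left(h^T R_e^T h^*\right)}{\left(\alpha + \beta\|h\|^2\right)^2} = \frac{\left(\sigma_n^2 + P_d^s\|h\|^2\right)^4}{\left(N_b\left(P_d^s\right)^2\right)^2} - \frac{\|h\|^4\left(\sigma_n^2 + P_d^s\|h\|^2\right)^2}{\left(\alpha + \beta\|h\|^2\right)^2} = \gamma^2\left(\frac{1}{\beta^2} - \frac{\|h\|^4}{\left(\alpha + \beta\|h\|^2\right)^2}\right) = \frac{\alpha\gamma^2\left(\alpha + 2\beta\|h\|^2\right)}{\beta^2\left(\alpha + \beta\|h\|^2\right)^2}.$$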
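At high SNR, i.e. as $P_d^s \to \infty$, it can be seen that $|R_\theta| \to \frac{2\alpha\gamma^2}{\beta^3\|h\|^2}$. It can
be observed that $h^H R_e R_e h \to \left(P_d^s\right)^2\|h\|^6$ and $\frac{\left(h^H R_e h\right)}{\left(\alpha + \beta\|h\|^2\right)} \to \frac{P_d^s\|h\|^2}{\beta}$ as $P_d^s \to \infty$.
Substituting these and the expression for $|R_\theta|$ from above in the MSE expression
in (6.37), the high SNR CRB asymptote is obtained as,
$$\mathrm{MSE}_b^\infty = P_d^s\left(\lim_{P_d^s\to\infty}\frac{\mathrm{MSE}_b}{P_d^s}\right) = P_d^s\left(\lim_{P_d^s\to\infty}\frac{1}{P_d^s}\left(\frac{\|h\|^2}{N_b} + \frac{\|h\|^2 P_d^s}{2N_b P_t^s}\right)\right) = \frac{\|h\|^2 P_d^s}{2N_b P_t^s}.$$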
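The convergence of the bound in (6.37) to the high SNR asymptote can be illustrated numerically. The sketch below is not part of the derivation: it assumes $R_e = P_d^s\,h h^H + \sigma_n^2 I$ (consistent with the simplification $h^H R_e^{-1} = h^H/\gamma$ used above), which gives $\mathrm{tr}(R_e) = P_d^s\|h\|^2 + r\sigma_n^2$, $h^H R_e h = \|h\|^2\gamma$ and $h^H R_e R_e h = \|h\|^2\gamma^2$. All parameter values are hypothetical.

```python
def sp_crb_bound(h_norm_sq, r, n_b, p_d, p_t, sigma2):
    """Right-hand side of (6.37), assuming R_e = P_d h h^H + sigma2 * I."""
    alpha = n_b * p_t
    gamma = sigma2 + p_d * h_norm_sq
    beta = n_b * p_d ** 2 / gamma
    a = alpha + beta * h_norm_sq
    tr_re = p_d * h_norm_sq + r * sigma2        # tr(R_e)
    h_re_h = h_norm_sq * gamma                  # h^H R_e h
    h_re2_h = h_norm_sq * gamma ** 2            # h^H R_e R_e h
    det_r = alpha * gamma ** 2 * (alpha + 2 * beta * h_norm_sq) / (beta ** 2 * a ** 2)
    return tr_re / a + (h_re_h / a) * h_re2_h / (a ** 2 * det_r)


def sp_crb_asymptote(h_norm_sq, n_b, p_d, p_t):
    """High SNR asymptote ||h||^2 P_d / (2 N_b P_t)."""
    return h_norm_sq * p_d / (2 * n_b * p_t)


# Hypothetical setting: ||h||^2 = 1, r = 4, N_b = 256, P_t = 1, sigma_n^2 = 1.
ratios = []
for p_d in (1e2, 1e4, 1e6):
    bound = sp_crb_bound(1.0, 4, 256, p_d, 1.0, 1.0)
    ratios.append(bound / sp_crb_asymptote(1.0, 256, p_d, 1.0))
print(ratios)  # the ratio approaches 1 as P_d grows
```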
6.7.3 Proof of Theorem 7
The capacity of the communication channel of (6.19) with uncorrelated
Gaussian noise is given by the well known maximization of mutual information
[2, 11]. When the nature of the noise process v(k) is unknown, the worst case
capacity [39] can be expressed as,
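$$C_w = \min_{p_v(\cdot),\ \mathrm{tr}(R_v) = r\sigma_n^2}\ \max_{p_s(\cdot),\ \mathrm{tr}(R_s) = tP_d} I(y; s).$$
The system in (6.21) can be equivalently written as,
$$y(k) \equiv \left(I + R_{vs}R_s^{-1}\right)s(k) + \tilde{v}(k),$$
where $v \triangleq \tilde{v} + R_{vs}R_s^{-1}s$, and the innovations noise $\tilde{v}$ is uncorrelated with the source
$s$, i.e. $E\left\{\tilde{v}s^H\right\} = 0$. The covariance $R_{\tilde{v}}$ is given as $R_{\tilde{v}} = R_{v|s} = R_v - R_{vs}R_s^{-1}R_{sv}$. It
can be seen that the transformation,
$$\begin{bmatrix}v\\ s\end{bmatrix} = \begin{bmatrix}I_r & R_{vs}R_s^{-1}\\ 0_{r\times r} & I_r\end{bmatrix}\begin{bmatrix}\tilde{v}\\ s\end{bmatrix} \tag{6.38}$$
is invertible (since the transform matrix is upper triangular with a unit diagonal). Therefore, given
a distribution function $p_{v,s}(\cdot)$, there exists a distribution $p_{\tilde{v},s}(\cdot)$ and vice versa.
Hence it can be seen that,
$$\min_{p_v(\cdot),\ E\{vv^H\} = R_v}\ \max_{p_s(\cdot),\ E\{ss^H\} = R_s} I(y; s) = \min_{p_{\tilde{v}}(\cdot),\ E\{\tilde{v}\tilde{v}^H\} = R_{\tilde{v}}}\ \max_{p_s(\cdot),\ E\{ss^H\} = R_s} I(y; s).$$
Now, employing the result for worst case capacity with uncorrelated noise from
[39], the worst case capacity for the above system can be seen to be given as,
$$C_w = \min_{\mathrm{tr}(R_v) = r\sigma_n^2}\ \max_{\mathrm{tr}(R_s) = tP_d}\ \log\left|I + R_{\tilde{v}}^{-1}\left(I + R_{vs}R_s^{-1}\right)R_s\left(I + R_{vs}R_s^{-1}\right)^H\right| = \min_{\mathrm{tr}(R_v) = r\sigma_n^2}\ \max_{\mathrm{tr}(R_s) = tP_d}\ \log\left|I + R_{v|s}^{-1}\left(R_s + R_{vs}\right)R_s^{-1}\left(R_s + R_{vs}\right)^H\right|,$$
which is the expression for the worst case capacity in (6.22).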
6.7.4 MVDR - Post-Processing SNR
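Below, we derive the expression in (6.33) for the MVDR
post-processing SNR $\kappa_m$. From table 6.1, the covariance of the effective noise is
$R_v^s = \beta_h\,h h^H + \beta_n I$, where the constants $\beta_h, \beta_n$ are defined as $\beta_h \triangleq \frac{P_d^s}{N_b}\left(1 + \frac{P_d^s}{P_t^s}\right)$
and $\beta_n \triangleq \frac{\sigma_n^2}{N_b}\left(1 + \frac{P_d^s}{P_t^s}\right) + \sigma_n^2$. Using results on matrix inversion [82], the matrix
$\left(R_v^s\right)^{-1}$ can be expressed as,
$$\left(R_v^s\right)^{-1} = \frac{1}{\beta_h}\left(\frac{\beta_h}{\beta_n} I - \frac{\beta_h}{\beta_n} I\,h\left(1 + h^H\frac{\beta_h}{\beta_n} I\,h\right)^{-1} h^H\frac{\beta_h}{\beta_n} I\right) = \frac{1}{\beta_n} I - \left(\frac{\beta_h}{\beta_n}\right)\frac{h h^H}{\beta_n + \beta_h\|h\|^2}. \tag{6.39}$$
Substituting this expression for $\left(R_v^s\right)^{-1}$ in (6.32), the expression for $\kappa_m$ can be
simplified as,
$$\kappa_m = P_d^s\,\hat{h}_s^H\left(R_v^s\right)^{-1}\hat{h}_s \approx P_d^s\,h^H\left(R_v^s\right)^{-1} h = \frac{P_d^s}{\beta_n}\|h\|^2 - \left(\frac{\beta_h}{\beta_n}\right)\frac{P_d^s\|h\|^4}{\beta_n + \beta_h\|h\|^2} = \frac{P_d^s\|h\|^2}{\beta_n + \beta_h\|h\|^2}.$$
Substituting the expressions for $\beta_h, \beta_n$ defined above, the final expression for $\kappa_m$
in terms of the quantities $P_d^s$, $P_t^s$ is obtained as,
$$\kappa_m = \frac{N_b\|h\|^2 P_d^s P_t^s}{\|h\|^2 P_d^s\left(P_d^s + P_t^s\right) + \sigma_n^2\left(P_t^s\left(1 + N_b\right) + P_d^s\right)},$$
which reduces to the expression in (6.33).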
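The closed form for $\kappa_m$ lends itself to a quick numerical study of the data/pilot power split. The sketch below evaluates $\kappa_m$ in both of the forms derived above and then searches a grid for the maximizing data power. It assumes a total power constraint of the form $P_d^s + P_t^s = \text{total}$, which is an illustration-only reading of the constraint; the grid search is a stand-in and not the analytical solution in (6.35). All numbers are hypothetical.

```python
def mvdr_psnr(p_d, p_t, h_norm_sq, n_b, sigma2):
    """kappa_m in terms of P_d^s and P_t^s (final expression above)."""
    num = n_b * h_norm_sq * p_d * p_t
    den = h_norm_sq * p_d * (p_d + p_t) + sigma2 * (p_t * (1 + n_b) + p_d)
    return num / den


def mvdr_psnr_via_betas(p_d, p_t, h_norm_sq, n_b, sigma2):
    """kappa_m = P_d ||h||^2 / (beta_n + beta_h ||h||^2), the intermediate form."""
    beta_h = (p_d / n_b) * (1 + p_d / p_t)
    beta_n = (sigma2 / n_b) * (1 + p_d / p_t) + sigma2
    return p_d * h_norm_sq / (beta_n + beta_h * h_norm_sq)


def best_data_power(total, h_norm_sq, n_b, sigma2, steps=2000):
    """Grid search for the data power maximizing kappa_m under P_d + P_t = total."""
    grid = [total * i / steps for i in range(1, steps)]
    return max(grid, key=lambda p_d: mvdr_psnr(p_d, total - p_d, h_norm_sq, n_b, sigma2))


# Hypothetical scenario: ||h||^2 = 2, N_b = 128, sigma_n^2 = 1, total power 10.
k_direct = mvdr_psnr(3.0, 1.0, 2.0, 128, 1.0)
k_betas = mvdr_psnr_via_betas(3.0, 1.0, 2.0, 128, 1.0)
p_d_opt = best_data_power(10.0, 2.0, 128, 1.0)
print(k_direct, k_betas, p_d_opt)
```

The agreement of the two forms is a direct check of the substitution step; the location of the maximizer depends on the assumed constraint.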
7 MIMO Time Varying Channel
Estimation
7.1 Introduction
The previous chapter has studied a novel scheme for channel estimation
based on superimposed pilots(SP). In such systems, pilot symbols are not trans-
mitted exclusively but superimposed over the information symbols and hence do
not result in a bandwidth overhead. In [84] it has been demonstrated that the
transmission of such a sequence of superimposed pilot symbols can in fact result
in increased throughput performance.
The problem of MIMO channel estimation is further complicated by the
relative motion between the base station and mobile terminals. This results in
a time varying channel arising due to the doppler shift in the carrier signal. The
velocity of the mobile terminal dictates the doppler bandwidth of the channel which
in turn determines the coherence time of the channel, or the time duration for
which the channel can be assumed to be static. Thus, as the velocity of the mobile
node increases, the coherence time decreases. A popular scheme to estimate such
a time varying channel is the auto-regressive (AR) modeling based Kalman filter
estimation [13]. More recently, in works such as [85,86], complex exponential basis
expansion modeling (CEBEM) based channel estimation with superimposed pilots
has been shown to yield promising results. The CEBEM models for time-varying
channels were first presented in [87] and have recently gained much attention. In
this work, we study the performance of SP for CEBEM based MIMO time-selective
channel estimation.
Further, the performance of the SP based channel estimation scheme
can be significantly improved by employing a soft-decision based iterative algo-
rithm. The expectation-maximization algorithm[82,88] provides a framework that
is suited for such a procedure since the unknown information symbols can be
treated as missing data. However, one of the major shortcomings of such a scheme
is the associated high computational cost. For instance, employing a 16-QAM
constellation in a 4 transmit antenna MIMO system, it is necessary to perform
$16^4 = 65{,}536$ likelihood computations, which is prohibitively high. In this context,
we suggest a novel modification to the EM algorithm based on the sphere-decoding
algorithm [89]. This scheme can reduce the computational complexity order by
reducing the number of likelihood computations to the number of sphere vectors.
This scheme trades off MSE performance for computational complexity and has
a slightly higher MSE owing to the sub-optimality resulting from the selection of
fewer source vectors. In the end we present simulation results comparing the
performance of different time-varying MIMO channel estimation schemes. In the
discussion that follows the notation $k \in \overline{m,n}$ represents $m \le k \le n$, where $k, m, n$
are integers. The vector $e_m$ is defined as $e_m \triangleq [1, 1, \ldots, 1]^T \in \mathbb{C}^{m\times 1}$.
7.2 Problem Setup
Consider an r×t MIMO system, i.e. a MIMO system with r receive and t
transmit antennas. Let the multiple-input multiple-output (MIMO) system model
be described as,
$$y(l) = \int_{-\infty}^{\infty} H(l,\tau)\,x(l-\tau)\,d\tau + \eta(l), \tag{7.1}$$
where $l$ denotes continuous time and $H(l,\tau) \in \mathbb{C}^{r\times t}$ is the time-varying MIMO
channel impulse response. For simplicity, we assume that the coherence bandwidth
$B_c$ of the channel is such that $B_c \gg R_s$, where $R_s$ is the symbol rate. Hence,
from [3], the MIMO channel response $H(l,\tau)$ can be approximated by the
time-selective but frequency-flat response $H(l,\tau) = H(l)\delta(\tau)$. The discrete time MIMO
system model at the sampling instants can be represented as,
$$y(k) = H(k)x(k) + \eta(k), \tag{7.2}$$
where the index $k$ denotes the sampling index. We now consider the problem of
estimation of the channel $H(k)$ using superimposed pilot symbols. Let a frame
of $N_b$ symbols be transmitted by repeated superposition of a pilot sequence $X_p =
[x_p(1), x_p(2), \ldots, x_p(L_p)]$ of length $L_p$ symbols, i.e. $x_p(k) = x_p\left(\mathrm{mod}(k-1, L_p) + 1\right)$.
Hence, the SP system model can be derived as,
$$y(k) = H(k)\left(x_d(k) + x_p\left(\mathrm{mod}(k-1, L_p) + 1\right)\right) + \eta(k), \tag{7.3}$$
where $x_d(k)$ are the stochastic zero-mean ($E\{x_d(k)\} = 0$) transmitted data symbols
of power $P_d$, i.e. $E\left\{x_d(k)x_d(k)^H\right\} = P_d I_t$.
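The periodic superposition in (7.3) can be sketched in a few lines. The snippet below builds the transmitted frame $x_d(k) + x_p(\mathrm{mod}(k-1, L_p)+1)$ using 0-based Python indexing, so the pilot index becomes `k % L_p`. The function name and the toy symbol values are hypothetical.

```python
def superimpose(data, pilots):
    """Return x_s(k) = x_d(k) + x_p(mod(k-1, L_p) + 1) for every symbol instant.

    `data` holds N_b vectors of length t, `pilots` holds L_p vectors of length t.
    The 1-based index mod(k-1, L_p) + 1 of the text maps to k % L_p here.
    """
    l_p = len(pilots)
    return [[d + p for d, p in zip(x_d, pilots[k % l_p])]
            for k, x_d in enumerate(data)]


# Toy example: t = 2 antennas, L_p = 2 pilot vectors, N_b = 4 symbol instants.
pilots = [[1 + 1j, 1 - 1j], [-1 + 1j, 1 + 1j]]
data = [[0.5j, -0.5], [0.5, 0.5j], [-0.5j, 0.5], [-0.5, -0.5j]]
frame = superimpose(data, pilots)
print(frame[2])  # data[2] plus the first pilot vector again
```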
7.2.1 SP Estimation Based on the CEBEM MIMO Model
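Let $f_d$ be the maximum frequency of the doppler spread. Then, $\bar{f}_d \triangleq
f_d/R_s$ is the normalized doppler component. From [85], the complex exponential
basis expansion of the time varying MIMO channel $H(k)$ can be expressed as,
$$H(k) = \sum_{v=-V}^{V}\mathcal{H}(v)\,e^{j2\pi v(k-1)/N_b}, \qquad V \triangleq \lceil \bar{f}_d N_b\rceil,$$
where the matrices $\mathcal{H}(v) \in \mathbb{C}^{r\times t}$, $-V \le v \le V$, are the coefficient matrices of the
CEBEM model and $\mathcal{H} \triangleq \left[\mathcal{H}(-V), \mathcal{H}(-V+1), \ldots, \mathcal{H}(V)\right]$. Let the exponential
basis vector $d_v(k) \in \mathbb{C}^{(2V+1)\times 1}$ be defined, consistently with the expansion above, as,
$$d_v(k) \triangleq \begin{bmatrix}e^{-j2\pi V(k-1)/N_b}\\ e^{-j2\pi(V-1)(k-1)/N_b}\\ \vdots\\ e^{j2\pi V(k-1)/N_b}\end{bmatrix}.$$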
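Hence, the equivalent CEBEM based SP system model can be described as,
$$y(k) = \mathcal{H}D_v(k)\left(x_d(k) + x_p\left(\mathrm{mod}(k-1, L_p) + 1\right)\right) + \eta(k),$$
where $D_v(k) \triangleq \left(d_v(k)\otimes I_t\right)$ and $\otimes$ denotes the matrix Kronecker product. Let
the block diagonal matrices $X_p, X_s$ be given as,
$$X_p \triangleq \begin{bmatrix}x_p(1) & 0 & 0 & \ldots & 0\\ 0 & x_p(2) & 0 & \ldots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \ldots & x_p(N_b)\end{bmatrix}, \qquad X_s \triangleq \begin{bmatrix}x_s(1) & 0 & 0 & \ldots & 0\\ 0 & x_s(2) & 0 & \ldots & 0\\ \vdots & \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & 0 & \ldots & x_s(N_b)\end{bmatrix},$$
where $x_s(k) \triangleq x_d(k) + x_p\left(\mathrm{mod}(k-1, L_p) + 1\right)$ is the symbol vector transmitted at the $k$th
symbol instant, and let $X_d$ be defined analogously from the data vectors $x_d(k)$. Let $D_v \triangleq [d_v(1), d_v(2), \ldots, d_v(N_b)]$. Let the received symbol
vector matrix $Y_s$ be defined as $Y_s \triangleq [y_s(1), y_s(2), \ldots, y_s(N_b)]$. The above CEBEM
estimation scenario can be recast as,
$$Y_s = \mathcal{H}\left(D_v\otimes I_t\right)X_p + \mathcal{H}\left(D_v\otimes I_t\right)X_d + N_s,$$
where the matrix $N_s$ represents a stacking of the noise vectors $\eta(k)$ similar to $Y_s$
defined above. The pilot based estimate $\hat{\mathcal{H}}_p$ is given as,
$$\hat{\mathcal{H}}_p = Y_s\left[\left(D_v\otimes I_t\right)X_p\right]^{\dagger}, \tag{7.4}$$
where $\dagger$ denotes the Moore-Penrose pseudo-inverse [82]. The time-selective channel
estimate is now given as,
$$\hat{H}(k) \triangleq \hat{\mathcal{H}}_p\left[d_v(k)\otimes I_t\right].$$
Thus, one can arrive at an estimate of the time-varying MIMO channel $H(k)$. In
the next section we present an EM based iterative algorithm to enhance the above
channel estimate.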
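To make the basis expansion concrete, the following sketch synthesizes $H(k)$ from a set of CEBEM coefficient matrices. It assumes the exponent normalization $2\pi v(k-1)/N_b$ taken from the expansion of $H(k)$; the function names and the scalar coefficient values are hypothetical and chosen only for illustration.

```python
import cmath
import math


def cebem_order(fd_norm, n_b):
    """V = ceil(normalized doppler * N_b), the one-sided number of basis functions."""
    return math.ceil(fd_norm * n_b)


def cebem_basis(k, v_max, n_b):
    """Basis vector d_v(k) for 1-based time index k, ordered v = -V..V."""
    return [cmath.exp(2j * math.pi * v * (k - 1) / n_b) for v in range(-v_max, v_max + 1)]


def cebem_channel(coeffs, k, n_b):
    """H(k) = sum_v coeffs[v] * exp(j 2 pi v (k-1) / N_b); coeffs are r x t matrices."""
    v_max = (len(coeffs) - 1) // 2
    d = cebem_basis(k, v_max, n_b)
    r, t = len(coeffs[0]), len(coeffs[0][0])
    return [[sum(d[i] * coeffs[i][a][b] for i in range(len(coeffs)))
             for b in range(t)] for a in range(r)]


# Hypothetical 1 x 1 example with N_b = 240 and normalized doppler 0.005 (V = 2).
n_b = 240
v_max = cebem_order(0.005, n_b)
coeffs = [[[0.1j]], [[0.2]], [[1.0]], [[0.2]], [[-0.1j]]]
h_first = cebem_channel(coeffs, 1, n_b)[0][0]     # all exponentials equal 1 at k = 1
h_mid = cebem_channel(coeffs, n_b // 2, n_b)[0][0]
print(v_max, h_first, h_mid)
```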
7.3 EM Based Algorithm for CEBEM SP Estimation
Let the symbols at each transmit antenna be drawn from a set $\Omega$ of size
$|\Omega|$. Let $\Gamma$ denote the set of all possible symbol vectors from which each data
symbol vector $x_d(k)$ is drawn, i.e. $x_d(k) \in \Gamma$ and $\Gamma = \Omega\times\Omega\times\ldots\times\Omega$. The size of this set is
given as $|\Gamma| = |\Omega|^t$. Let this set $\Gamma$ be indexed by $j \in \overline{1,|\Omega|^t}$, where $x^j$ denotes the $j$-th
symbol vector in the set. Also, let $x^j \in \Gamma$ have an a priori probability denoted by $\gamma_j$, i.e.
$p\left(x_d(i) = x^j\right) = \gamma_j,\ \forall i$. Define the indicator parameter $\chi_i(j)$, $i \in \overline{1,N_b}$, $j \in \overline{1,|\Omega|^t}$,
as $\chi_i(j) = 1$ if $x_d(i) = x^j$ and 0 otherwise. It can be seen that $x_d(i)$ is given as
a function of the $\chi_i(j)$ as $x_d(i) = \sum_{j=1}^{|\Omega|^t}\chi_i(j)\,x^j$. The ML estimate of $\mathcal{H}$ is given by
optimization of the cost function,
$$\hat{\mathcal{H}} = \arg\min\left\|Y_s - \mathcal{H}\left[\left(D_v\otimes I_t\right)X_s\right]\right\|^2.$$
The estimate $\hat{\mathcal{H}}$ is then given by the least squares solution,
$$\hat{\mathcal{H}} = Y_s\left[\left(D_v\otimes I_t\right)X_s\right]^{\dagger}.$$
The closed form solution above can be implemented with very low computational
complexity, if the $\chi_i(j)$ were known. It can thus be seen that this formulation
naturally leads the way for application of the EM algorithm by defining the complete
data $\Pi = (Y_s, \chi)$, where $\chi \in \{0,1\}^{|\Omega|^t\times N_b}$ is defined as $\chi \triangleq \left[\chi_1, \chi_2, \ldots, \chi_{N_b}\right]$
and $\chi_i \triangleq \left[\chi_i(1), \chi_i(2), \ldots, \chi_i\left(|\Omega|^t\right)\right]^T \in \{0,1\}^{|\Omega|^t\times 1}$, $i \in \overline{1,N_b}$. The matrix $\chi$ is
popularly known as the 'missing' data. In Gaussian noise, the log-likelihood of the
complete data $\Pi$ is given (up to sign and additive constants) by the sum of the log-likelihoods of the individual $y_s(i)$
as,
$$L_g\left(\Pi; \mathcal{H}\right) = \sum_{k=1}^{N_b}\sum_{j=1}^{|\Omega|^t}\chi_k(j)\left\|y_s(k) - \mathcal{H}D_v(k)\left(x_p(k) + x^j\right)\right\|^2. \tag{7.5}$$
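The EM algorithm can now be employed to compute the ML estimate of the matrix
$\mathcal{H}$ as follows. Let $\mathcal{H}^{(k)}$ denote the estimate of $\mathcal{H}$ at the $k$th iteration. The algorithm
is initialized by the pilot estimate $\mathcal{H}^{(0)} = \hat{\mathcal{H}}_p$. Then, $\mathcal{H}^{(k)}$ is given as,
$$\mathcal{H}^{(k)} = \arg\min_{\mathcal{H}}\ U^{(k-1)}\left(\Pi; \mathcal{H}\right) \tag{7.6}$$
where the quantity $U^{(k-1)}\left(\Pi; \mathcal{H}\right)$ is defined as,
$$U^{(k-1)}\left(\Pi; \mathcal{H}\right) \triangleq \int_{-\infty}^{\infty} p\left(\chi\,|\,Y_s; \mathcal{H}^{(k-1)}\right) L_g\left(\Pi; \mathcal{H}\right) d\chi = \sum_{i=1}^{N_b}\sum_{j=1}^{|\Omega|^t} E\left\{\chi_i(j)\,\big|\,\mathcal{H}^{(k-1)}\right\}\left\|y_s(i) - \mathcal{H}D_v(i)\left(x_p(i) + x^j\right)\right\|^2.$$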
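By definition of $\chi_i(j)$ it follows that $E\left\{\chi_i(j)\,|\,\mathcal{H}^{(k)}\right\} = p\left(x^j\,|\,y_s(i); \mathcal{H}^{(k)}\right)$ and is
given as,
$$E\left\{\chi_i(j)\,\big|\,\mathcal{H}^{(k)}\right\} = p\left(x^j\,|\,y_s(i); \mathcal{H}^{(k)}\right) = \frac{p\left(y_s(i), x^j; \mathcal{H}^{(k)}\right)}{p\left(y_s(i); \mathcal{H}^{(k)}\right)} \tag{7.7}$$
$$= \frac{p\left(y_s(i)\,|\,x^j; \mathcal{H}^{(k)}\right) p\left(x^j\right)}{\sum_{l=1}^{|\Omega|^t} p\left(y_s(i)\,|\,x^l; \mathcal{H}^{(k)}\right) p\left(x^l\right)}, \tag{7.8}$$
where the likelihood $p\left(y_s(i)\,|\,x^j; \mathcal{H}^{(k)}\right)$ is given, up to a constant factor, by the exponential function,
$$p\left(y_s(i)\,|\,x^j; \mathcal{H}^{(k)}\right) \propto e^{-\left\|y_s(i) - \mathcal{H}^{(k)}D_v(i)\left(x_p(i) + x^j\right)\right\|^2}.$$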
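The closed form expression for $\mathcal{H}^{(k)}$ which optimizes the cost in (7.6) is
given as,
$$\mathcal{H}^{(k)} = R_{yx}^{(k)}\left(R_{xx}^{(k)}\right)^{-1},$$
where the matrices $R_{yx}^{(k)}$ and $R_{xx}^{(k)}$ are defined as,
$$R_{yx}^{(k)} \triangleq \frac{1}{N_b}\sum_{i=1}^{N_b}\sum_{j=1}^{|\Omega|^t} p\left(x^j\,|\,y_s(i); \mathcal{H}^{(k)}\right) y_s(i)\left(x_p(i) + x^j\right)^H D_v(i)^H,$$
$$R_{xx}^{(k)} \triangleq \frac{1}{N_b}\sum_{i=1}^{N_b}\sum_{j=1}^{|\Omega|^t} p\left(x^j\,|\,y_s(i); \mathcal{H}^{(k)}\right) D_v(i)\left(x_p(i) + x^j\right)\left(x_p(i) + x^j\right)^H D_v(i)^H.$$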
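The E-step in (7.8) amounts to normalizing prior-weighted exponentials of squared distances. The sketch below computes these posteriors for one received vector, given the noise-free outputs $z_j = \mathcal{H}^{(k)}D_v(i)\left(x_p(i) + x^j\right)$ for each candidate. It assumes unit noise variance, as in the exponential above, and subtracts the smallest distance before exponentiating, which is a numerical-stability choice added here and does not change the result. The function name and the toy values are hypothetical.

```python
import math


def posteriors(y, candidate_outputs, priors):
    """p(x^j | y) from (7.8), with likelihoods proportional to exp(-||y - z_j||^2)."""
    dists = [sum(abs(a - b) ** 2 for a, b in zip(y, z)) for z in candidate_outputs]
    d_min = min(dists)                      # common factor cancels in the ratio
    weights = [math.exp(-(d - d_min)) * g for d, g in zip(dists, priors)]
    total = sum(weights)
    return [w / total for w in weights]


# Toy single-antenna example: four candidate noise-free outputs, uniform priors.
y = [1 + 0.1j]
candidate_outputs = [[1 + 0j], [-1 + 0j], [1j], [-1j]]
post = posteriors(y, candidate_outputs, [0.25] * 4)
print(post)  # the first candidate, closest to y, dominates
```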
7.3.1 Likelihood computation and Sphere Decoding
The complexity of the posterior probability computation in (7.8) is of
the order of $O\left(|\Omega|^t\right)$ likelihood computations, such as $p\left(y_s(i)\,|\,x^j; \mathcal{H}^{(k)}\right)$, $j \in \overline{1,|\Omega|^t}$,
for each $i \in \overline{1,N_b}$. As seen above, for a 4 × 4 MIMO system employing a 16-QAM
signal constellation, $|\Omega|^t \approx 10^5$, making the computational complexity prohibitively
high for real-time implementation. It can also be observed that the
quantity $p\left(y_s(i)\,|\,x^j; \mathcal{H}^{(k)}\right)$, $j \in \overline{1,|\Omega|^t}$, is significantly different from zero only for
symbol vectors $x^j$ in the neighborhood of $x_d(i) = \sum_{j=1}^{|\Omega|^t}\chi_i(j)\,x^j$. The sphere decoding
algorithm described in [89] for maximum likelihood detection in MIMO
systems can be employed to find the vectors $x^j$ such that
$$\left\|y_s(i) - \mathcal{H}^{(k)}D_v(i)\left(x_p(i) + x^j\right)\right\| \le r_e \tag{7.9}$$
where $r_e$ is the sphere radius. We then choose $r_e$ such that only a few, $N_{sp}$, signal vectors
lie in the sphere, with $N_c < N_{sp} \ll |\Omega|^t$. $N_c$ is a certain 'critical' number of
constellation vectors which contribute significantly to the likelihood expression for
each received symbol vector $y_s(i)$. These $N_{sp}$ symbol vectors can then be used to compute the
probabilities $p\left(y_s(i)\,|\,x^j; \mathcal{H}^{(k)}\right)$ of the vectors $x^j$. Let $F^{(k)}(i) \in \mathbb{R}^{2r\times 2t}$ be defined
as,
$$F^{(k)}(i) \triangleq \begin{bmatrix}\mathrm{Re}\left(\mathcal{H}^{(k)}D_v(i)\right) & -\mathrm{Im}\left(\mathcal{H}^{(k)}D_v(i)\right)\\ \mathrm{Im}\left(\mathcal{H}^{(k)}D_v(i)\right) & \mathrm{Re}\left(\mathcal{H}^{(k)}D_v(i)\right)\end{bmatrix}, \tag{7.10}$$
and $\check{y}_s(i) \in \mathbb{R}^{2r\times 1}$ be defined as $\check{y}_s(i) = \left[\mathrm{Re}\left(y_s(i)^T\right), \mathrm{Im}\left(y_s(i)^T\right)\right]^T$. Then, the
condition in (7.9) reduces to building the set of sphere vectors $\Omega_i$ such that,
$$\Omega_i \triangleq \left\{\check{x}^j : \left\|\check{y}_s(i) - F\left(\check{x}_p(i) + \check{x}^j\right)\right\| \le r_e\right\}, \tag{7.11}$$
where $\check{x}^j, \check{x}_p(i)$ are obtained by a stacking of $x^j, x_p(i)$ respectively, similar to $\check{y}_s(i)$.
The construction of the above set can be further simplified as follows. Let the
source vectors $x^j$ be drawn from a 16-QAM symbol constellation; this scheme
can be readily extended to square constellations of other sizes. Then, $x^j$ can
be represented as $x^j = \sqrt{\frac{P_d}{10}}\left(2s^j - 5\left(e_t + \sqrt{-1}\,e_t\right)\right)$, where $s^j = s_p^j + \sqrt{-1}\,s_q^j$.
Each element of the vectors $s_p^j, s_q^j$ is drawn from the set $S \triangleq \overline{S_{min}, S_{max}}$, where
$S_{min} = 1$, $S_{max} = 4$ for the 16-QAM constellation. The above set $\Omega_i$ can be recast
as,
$$\Omega_i \triangleq \left\{\check{s}^j : \left\|\tilde{y}_s(i) - G\check{s}^j\right\| \le r_e\right\}, \tag{7.12}$$
where $\tilde{y}_s(i) \triangleq \check{y}_s(i) - F\check{x}_p(i) + \sqrt{\frac{5P_d}{2}}\,Fe_{2t}$ and $G \triangleq \sqrt{\frac{2P_d}{5}}\,F$. The vector $\check{s}^j$ is
obtained from $s^j$ similar to $\check{y}_s(i)$ defined above. Let $VR = G$ be the QR factorization
of $G$ (i.e. $V^HV = I$ and $R$ is upper triangular). Let $V$ be block partitioned
as $V = \left[V_r, V_n\right]$, $V_r \in \mathbb{R}^{2r\times 2t}$, $V_n \in \mathbb{R}^{2r\times(2r-2t)}$. Let $u = [u_1, u_2, \ldots, u_{2t}]^T \triangleq V_r^H\tilde{y}_s(i)$.
Below, we adapt the sphere decoding algorithm in [89] to find the set of sphere
vectors $\Omega_i$. Let $r_{m,n}$ denote the $(m,n)$th entry of the matrix $R$.
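Algorithm
S.1 Set $k = 2t$, $r_{2t}^2 = r_e^2 - \left\|V_n^H\tilde{y}_s(i)\right\|^2$, $u_{2t|2t+1} = u_{2t}$, $\Omega_i = \emptyset$.
S.2 Computation of bounds for $s_k$:
$$U_k \triangleq \min\left(\left\lfloor\frac{r_k + u_{k|k+1}}{r_{k,k}}\right\rfloor,\ S_{max}\right), \qquad s_k = \max\left(\left\lceil\frac{-r_k + u_{k|k+1}}{r_{k,k}}\right\rceil - 1,\ S_{min} - 1\right).$$
S.3 Set $s_k = s_k + 1$. If $s_k \le U_k$ go to S.5. Else go to S.4.
S.4 Set $k = k + 1$. If $k = 2t + 1$ terminate algorithm. Else go to S.3.
S.5 If $k = 1$ go to S.6. Else, set $k = k - 1$,
$$u_{k|k+1} = u_k - \sum_{j=k+1}^{2t} r_{k,j}\,s_j, \qquad r_k^2 = r_{k+1}^2 - \left(u_{k+1} - \sum_{j=k+1}^{2t} r_{k+1,j}\,s_j\right)^2.$$
Go to S.2.
S.6 Solution found. Let $s \triangleq [s_1, s_2, \ldots, s_{2t}]^T$ be partitioned as $s = \left[s_p^T, s_q^T\right]^T$,
where $s_p, s_q \in \mathbb{R}^{t\times 1}$. Set $\Omega_i = \Omega_i \cup \left\{\sqrt{\frac{P_d}{10}}\left(2\left(s_p + \sqrt{-1}\,s_q\right) - 5\left(e_t + \sqrt{-1}\,e_t\right)\right)\right\}$
and go to S.3.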
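The enumeration performed by steps S.1 to S.6 can be sketched as a depth-first search over the coordinates of $s$, starting from the last one. The code below is a recursive restructuring written for illustration, not a line-by-line transcription of the steps above: it assumes a square upper triangular $R$ with nonzero diagonal, takes the already reduced squared radius ($r_e^2 - \|V_n^H\tilde{y}_s(i)\|^2$ in S.1) as `radius_sq`, and uses hypothetical names and toy numbers.

```python
import math


def sphere_points(R, u, radius_sq, s_min, s_max):
    """All integer vectors s with s_min <= s_k <= s_max and ||u - R s||^2 <= radius_sq."""
    n = len(u)
    found = []
    s = [0] * n

    def search(k, budget):
        # Part of u[k] not explained by the coordinates fixed so far (u_{k|k+1}).
        centre = u[k] - sum(R[k][j] * s[j] for j in range(k + 1, n))
        half_width = math.sqrt(budget) / abs(R[k][k])
        lower = max(math.ceil(centre / R[k][k] - half_width), s_min)
        upper = min(math.floor(centre / R[k][k] + half_width), s_max)
        for cand in range(lower, upper + 1):
            s[k] = cand
            left = budget - (centre - R[k][k] * cand) ** 2
            if left < 0:            # guard against rounding at the sphere boundary
                continue
            if k == 0:
                found.append(list(s))
            else:
                search(k - 1, left)

    search(n - 1, radius_sq)
    return found


# Toy 2 x 2 example with coordinates restricted to {1, 2, 3, 4}.
R = [[2.0, 1.0], [0.0, 2.0]]
u = [5.2, 4.1]
points = sphere_points(R, u, radius_sq=6.0, s_min=1, s_max=4)
print(sorted(points))
```

Each returned integer vector would then be mapped back to a constellation vector as in step S.6.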
[Figure 7.1: MSE of Kalman based estimation of a time-varying wireless channel. Plot title: "MSE Vs. SNR for CEBEM-EM Based Time Selective Channel Estimation"; x-axis: SNR (dB); y-axis: MSE; curves: MSE-CEBEM, MSE Static, MSE-CEBEM-EM, MSE-CEBEM-EM-SD.]
We then sort the computed probabilities $p\left(y_s(i)\,|\,x^j; \mathcal{H}^{(k)}\right)$, for $x^j \in \Omega_i$,
and choose the largest $N_c$ of them and the corresponding symbol vectors. The
rest of the probabilities for each $y_s(i)$ are set to 0. These probabilities are used
to compute the posterior probabilities in (7.8). Thus every iteration of the EM
algorithm uses only $N_c \ll |\Omega|^t$ symbol vectors, significantly speeding up the
likelihood computation.
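The truncation step just described can be sketched as follows: keep the $N_c$ largest likelihood values, zero the rest, and renormalize to obtain the truncated posteriors. Uniform priors are assumed here so that normalizing the likelihoods gives (7.8) directly; the function name and the input values are hypothetical.

```python
def keep_top(likelihoods, n_c):
    """Zero all but the n_c largest likelihoods and renormalize (uniform priors assumed)."""
    order = sorted(range(len(likelihoods)), key=lambda j: likelihoods[j], reverse=True)
    keep = set(order[:n_c])
    kept = [p if j in keep else 0.0 for j, p in enumerate(likelihoods)]
    total = sum(kept)
    return [p / total for p in kept]


# Four sphere vectors, of which only the N_c = 2 most likely are retained.
truncated = keep_top([0.5, 0.1, 0.3, 0.1], n_c=2)
print(truncated)
```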
7.4 Simulations
For our simulations, we consider a 4×4 MIMO system with frame length
Nb = 240 symbols. The information, pilot symbols are drawn from a QPSK symbol
constellation. The wireless channel between the receive antenna i and transmit
antenna j is given by the modified Jakes' process outlined in [90] as,
$$[H(t)]_{ij} = \sqrt{\frac{2}{N_p}}\sum_{n=1}^{N_p} e^{j[\Psi_n]_{ij}}\cos\left(2\pi f_d t\cos[\Upsilon_n]_{ij} + [\Phi]_{ij}\right),$$
and the matrix $\Upsilon_n$ is given as,
$$\Upsilon_n = \frac{1}{4N_p}\left((2\pi n - \pi)\,e_r e_t^T + \Theta\right),$$
where the entries of $\Psi_n, \Theta, \Phi \in \mathbb{R}^{r\times t}$ are IID and uniformly distributed as $U[-\pi, \pi)$
and $f_d \triangleq f_c\left(v_m/c\right)$ is the doppler frequency shift for a node in motion with velocity
$v_m$. The coefficients of the channel matrix $H(k)$ at the sampling instants $k$ are
given as,
$$[H(k)]_{ij} = \sqrt{\frac{2}{N_p}}\sum_{n=1}^{N_p} e^{j[\Psi_n]_{ij}}\cos\left(2\pi\bar{f}_d k\cos[\Upsilon_n]_{ij} + [\Phi]_{ij}\right), \tag{7.13}$$
and $\bar{f}_d \triangleq f_d/R_s$ is the normalized doppler frequency. The normalized doppler is a
convenient handle on the nature of variation of a fast-fading process and for most
wireless applications $\bar{f}_d \le 0.01$. We employ average mean-squared error as the
metric to evaluate the performance of the above estimators which is given as,
$$\overline{\mathrm{MSE}} = \frac{1}{rtN_b}\sum_{k=1}^{N_b}\left\|\hat{H}(k) - H(k)\right\|_F^2 \tag{7.14}$$
In fig.(7.1) we plot the MSE of CEBEM based SP estimation of the time-varying
4 × 4 MIMO channel employing the standard mean-estimator described in (7.4).
We also plot the MSE of EM based iterative estimation with sphere-decoding
described in section(7.3) for re =√
30 and compare this performance with the
standard mean based SP estimator. It can be seen that the EM based iterative
scheme has a significantly lower MSE compared to the mean-estimator. It can
also be seen this low complexity iterative procedure has a slight performance loss
compared to the full complexity EM, which results from the sub-optimality of
the sphere decoding based EM procedure. This MSE difference worsens as the
SNR increases. The MSE of the static SP mean estimator, which assumes an
invariant MIMO channel matrix H(k) = H, is plotted for comparison. It can be
seen that the static estimator results in poor MSE performance.
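The simulation model in (7.13) and the metric in (7.14) can be sketched in plain Python as follows. This is an illustrative reconstruction and not the code used for the reported results: the number of paths `n_p`, the seed, the small 2 x 2 dimensions, and the use of the time-averaged true channel as a stand-in for a static estimate are all assumptions made here.

```python
import cmath
import math
import random


def jakes_channel(r, t, n_b, fd_norm, n_p=8, seed=0):
    """Samples H(1), ..., H(n_b) of (7.13) as a list of r x t complex matrices."""
    rng = random.Random(seed)
    draw = lambda: rng.uniform(-math.pi, math.pi)
    phi = [[draw() for _ in range(t)] for _ in range(r)]
    theta = [[draw() for _ in range(t)] for _ in range(r)]
    psi = [[[draw() for _ in range(t)] for _ in range(r)] for _ in range(n_p)]
    scale = math.sqrt(2.0 / n_p)
    frames = []
    for k in range(1, n_b + 1):
        h_k = []
        for i in range(r):
            row = []
            for j in range(t):
                acc = 0j
                for n in range(1, n_p + 1):
                    upsilon = (2 * math.pi * n - math.pi + theta[i][j]) / (4 * n_p)
                    phase = 2 * math.pi * fd_norm * k * math.cos(upsilon) + phi[i][j]
                    acc += cmath.exp(1j * psi[n - 1][i][j]) * math.cos(phase)
                row.append(scale * acc)
            h_k.append(row)
        frames.append(h_k)
    return frames


def average_mse(est, true):
    """Average MSE of (7.14): squared Frobenius error summed over time, over r*t*N_b."""
    n_b, r, t = len(true), len(true[0]), len(true[0][0])
    total = sum(abs(est[k][i][j] - true[k][i][j]) ** 2
                for k in range(n_b) for i in range(r) for j in range(t))
    return total / (r * t * n_b)


channel = jakes_channel(r=2, t=2, n_b=240, fd_norm=0.005)
# Stand-in for a static estimate: the time average of the true channel.
static = [[sum(h[i][j] for h in channel) / len(channel) for j in range(2)] for i in range(2)]
mse_static = average_mse([static] * len(channel), channel)
print(mse_static)
```

Even this idealized static estimate incurs a nonzero error on a time-varying channel, which is the qualitative effect described above.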
7.5 Conclusion
In this work we successfully demonstrated the applicability of CEBEM
based modeling in the context of time-selective MIMO channel estimation using
superimposed pilots (SP). An expectation-maximization(EM) based soft decoding
procedure has been suggested for iterative refining of the MIMO channel estimate.
The computational complexity of the EM implementation has been substantially
reduced by adapting the sphere decoding algorithm for likelihood computation.
8 Conclusions
In this thesis we investigated several different schemes for bandwidth ef-
ficient channel estimation in the context of MIMO systems. As the number of
receive/transmit antennas grows in a MIMO system, the number of parameters
to be estimated increases significantly. This, coupled with the low SNR regime
operation of MIMO systems poses great challenges for channel estimation. Trans-
mission of pilot symbols to estimate all the channel coefficients constitutes a sig-
nificant bandwidth overhead in MIMO systems. The alternative to pilot based
estimation is blind estimation, in which the channel is estimated exclusively from
the statistical information present in the transmitted information symbols. Such
a scheme is bandwidth optimal as it avoids the transmission of pilots. However,
it can result in a significant computational complexity overhead and also such al-
gorithms frequently result in convergence to local minima due to non-convexity of
the cost functions.
We have demonstrated that semi-blind estimation which employs both
pilots and blind information, significantly reduces the mean-squared error of es-
timation of a MIMO system. Further, by employing pilots, one can resolve the
indeterminacy that often arises in blind estimation. By achieving an enhanced
MSE performance through employment of blind information, the number of pilots
in the symbol frame can be reduced which leads to greater bandwidth efficiency.
From a constrained CRB analysis, it has been shown that asymptotically such
a semi-blind scheme leads to at least a 3dB decrease in the MSE of estimation.
Several constrained maximum-likelihood schemes have been suggested that asymptotically
achieve this constrained Cramer-Rao bound for semi-blind estimation.
We have also addressed the issue of the minimum number of pilot symbols
required for the identifiability of a MIMO frequency selective (FS) channel through
a Fisher information matrix based regularity analysis. It is demonstrated that the
rank deficiency of the Gaussian FIM of a FS channel is at least $t^2$ and further, that
at least t pilot symbols are necessary for the complete estimation of a FIR MIMO
channel. It is also shown that the semi-blind estimation bound asymptotically
converges to the complex constrained CRB thus resulting in a significantly reduced
MSE of estimation of the semi-blind scheme.
The semi-blind estimation philosophy has also been demonstrated to yield
performance improvements in the context of maximum-ratio transmission (MRT)
based MIMO systems. MRT based MIMO relies on beamforming at the transmitter
and receiver to utilize the dominant eigenmode of transmission of a MIMO system
and hence has a low implementation complexity. A semi-blind scheme for MRT
based estimation which directly estimates the dominant left and right singular
vectors of the MIMO channel matrix has been demonstrated. The expressions for
MSE performance and the effective channel gain for the semi-blind scheme and
conventional pilot scheme have been derived using a matrix perturbation analysis.
It has been demonstrated that semi-blind estimation yields MSE and throughput
gains compared to the conventional pilot based scheme.
In a paradigm shift in channel estimation, pilots superimposed over data
symbols or superimposed pilots offer another alternative for bandwidth efficient
channel estimation. Such a scheme avoids the exclusive transmission of pilots. A
semi-blind scheme has been demonstrated for SP based estimation, which improves
performance over the traditional mean-estimator. The throughput performance of
the SP system has been characterized through the development of a worst case
capacity analysis for systems with information noise correlation. Through this
analysis it has been demonstrated that the SP scheme can yield throughput gains
compared to a conventional system employing time-multiplexed pilots. Finally,
an application of SP based estimation has been demonstrated in the context of a
time-varying MIMO channel which is modeled using a complex exponential basis
expansion model.
In summary, the schemes proposed in this thesis represent progress towards
the design of bandwidth efficient algorithms for channel estimation in MIMO
systems. Future work can include designing novel implementation strategies to de-
ploy these schemes in real time wireless devices. Such efficient algorithms can
result in significantly enhancing the bandwidth efficiency of MIMO systems by
reducing the pilot overhead.
Bibliography
[1] S. Alamouti, “A simple transmit diversity technique for wireless communications,” Selected Areas in Communications, IEEE Journal on, vol. 16, pp. 1451–1458, Oct. 1998.
[2] E. Telatar, “Capacity of multi-antenna Gaussian channels,” European Transactions on Telecommunications, vol. 10, pp. 585–596, Nov. 1999.
[3] J. Proakis, Digital Communications. New York, NY 10020: McGraw-Hill Higher Education, international ed., 2001.
[4] A. Paulraj, R. Nabar, and D. Gore, Introduction to Space-Time Wireless Communications. Cambridge University Press, first ed., 2003.
[5] J. Zheng, E. Duni, and B. D. Rao, “Analysis of multiple antenna systems with finite-rate feedback using high resolution quantization theory,” IEEE Transactions on Signal Processing, To appear.
[6] J. Zheng and B. D. Rao, “Capacity analysis of correlated multiple antenna systems with finite rate feedback,” Proceedings of the IEEE International Conference on Communications, Jun. 2006.
[7] Y. Isukapalli, R. Annavajjala, and B. D. Rao, “Performance analysis of transmit beamforming for MISO systems with imperfect feedback,” IEEE Transactions on Communications, In Review.
[8] L. Tong and S. Perreau, “Multichannel blind identification: From subspace to maximum likelihood methods,” Proceedings of the IEEE, Oct. 1998.
[9] Z. Cheng and D. Dahlhaus, “Time versus frequency domain channel estimation for OFDM systems with antenna arrays,” Proc. of 6th International Conference on Signal Processing (ICSP’02), Beijing, China, vol. 2, pp. 1340–1343, Aug 2002.
[10] J. H. Wilkinson, The Algebraic Eigenvalue Problem. Walton St., Oxford: Oxford University Press, first ed., 1965.
[11] T. M. Cover and J. A. Thomas, Elements of Information Theory. N.Y.: John Wiley & Sons, Inc., 1991.
[12] B. Hassibi and B. Hochwald, “How much training is needed in multiple-antenna wireless links,” IEEE Trans. on Info. Theory, Apr 2003.
[13] M. Yan and B. D. Rao, “Performance of an array receiver with a Kalman channel predictor for fast Rayleigh flat fading environments,” IEEE Journal on Selected Areas in Communications-Wireless Series, vol. 19, pp. 1164–1172, Jun 2001.
[14] T. S. Rappaport, Wireless Communications: Principles and Practice. Upper Saddle River, NJ 07458: Prentice Hall, second ed., 2002.
[15] S. M. Kay, Fundamentals of Statistical Signal Processing, Vol. I: Estimation Theory. Prentice Hall PTR, first ed., 1993.
[16] A. van den Bos, “A Cramer-Rao lower bound for complex parameters,” IEEE Transactions on Signal Processing, vol. 42, p. 2859, Oct. 1994.
[17] P. Stoica and B. C. Ng, “On the Cramer-Rao bound under parametric constraints,” IEEE Signal Processing Letters, vol. 5, pp. 177–179, Jul 1998.
[18] J. Gorman and A. Hero, “Lower bounds on parametric estimators with constraints,” IEEE Transactions on Information Theory, vol. 36, pp. 1285–1301, Nov. 1990.
[19] T. Marzetta, “A simple derivation of the constrained multiple parameter Cramer-Rao bound,” IEEE Transactions on Signal Processing, vol. 41, pp. 2247–2249, June 1993.
[20] D. H. Brandwood, “A complex gradient operator and its application in adaptive array theory,” IEE Proc., vol. 130, pp. 11–16, Feb. 1983.
[21] R. Fischer, Precoding and Signal Shaping for Digital Transmission (Appendix-A). Wiley InterSciences, 2002.
[22] S. Zacks, The Theory of Statistical Inference. John Wiley and Sons, first ed., 1971.
[23] A. K. Jagannatham and B. D. Rao, “A semi-blind technique for MIMO channel matrix estimation,” in Proc. of IEEE Workshop on Signal Processing Advances in Wireless Communications (SPAWC 2003), #582, (Rome, Italy), 2003.
[24] A. Medles, D. T. M. Slock, and E. D. Carvalho, “Linear prediction based semi-blind estimation of MIMO FIR channels,” Third IEEE SPAWC, Taoyuan, Taiwan, pp. 58–61.
[25] P. Comon, “Independent component analysis, a new concept?,” Signal Processing, vol. 36, no. 3, pp. 287–314, 1994.
[26] J. Cardoso, “Blind signal separation: Statistical principles,” Proceedings of the IEEE, vol. 86, pp. 2009–25, Oct 1998.
[27] V. Zarzoso and A. Nandi, “Adaptive blind source separation for virtually any source probability density function,” IEEE Transactions on Signal Processing, vol. 48, Feb. 2000.
[28] E. Carvalho and D. Slock, “Asymptotic performance of ML methods for semi-blind channel estimation,” Thirty-First Asilomar Conference, vol. 2, pp. 1624–8, 1998.
[29] A. Medles and D. Slock, “Semiblind channel estimation for MIMO spatial multiplexing systems,” Vehicular Technology Conference, Fall 2001, 2001.
[30] D. Pal, “Fractionally spaced semi-blind equalization of wireless channels,” The Twenty-Sixth Asilomar Conference, vol. 2, pp. 642–645, 1992.
[31] D. Pal, “Fractionally spaced equalization of multipath channels: a semi-blind approach,” 1993 International Conference on Acoustics, Speech and Signal Processing, vol. 3, pp. 9–12, 1993.
[32] A. K. Jagannatham and B. D. Rao, “Cramer-Rao lower bound for constrained complex parameters,” IEEE Signal Processing Letters, vol. 11, pp. 875–878, Nov. 2004.
[33] A. Medles, D. Slock, and E. D. Carvalho, “Linear prediction based semi-blind estimation of MIMO FIR channels,” Third IEEE SPAWC, Taiwan, 2001.
[34] Y. Sung and L. T. et al., “Semiblind channel estimation for space-time coded WCDMA,” 36th Asilomar Conference on Sig., Syst., pp. 1637–1641, 2002.
[35] A. Taylor and W. Mann, Advanced Calculus. Wiley Text Books, 3rd ed.
[36] P. Vaidyanathan, Multirate Systems and Filter Banks. Englewood Cliffs, NJ, USA: Prentice Hall, 1993.
[37] G. H. Golub and C. F. Van Loan, Matrix Computations. Johns Hopkins Univ Pr, second ed., 1984.
[38] T. S. Ferguson, Mathematical Statistics: A Decision Theoretic Approach. Boston: Academic Press, 1967.
[39] B. Hassibi and B. Hochwald, “How much training is needed in multiple-antenna wireless links,” IEEE Transactions on Information Theory, vol. 49, pp. 951–964, Apr 2003.
[40] T. Marzetta, “BLAST training: Estimating channel characteristics for high-capacity space-time wireless,” Proc. 37th Annual Allerton Conference on Communications, Control, and Computing, Monticello, IL, pp. 22–24, Sept. 1999.
[41] J. Heiskala and J. Terry, OFDM Wireless LANs: A Theoretical and Practical Guide. SAMS Publishing, 2002.
[42] Y. Hua, “Fast maximum-likelihood for blind identification of multiple FIR channels,” IEEE Transactions on Signal Processing, vol. 44, pp. 661–672, March 1996.
[43] P. Loubaton, E. Moulines, and P. Regalia, Signal Processing Advances in Wireless and Mobile Communications, Chapter 3: Subspace Method for Blind Identification and Deconvolution. Upper Saddle River, NJ 07458: Prentice Hall, first ed., 2001.
[44] E. Carvalho and D. Slock, “Blind and semi-blind FIR multichannel estimation: Global identifiability conditions,” IEEE Transactions on Signal Processing, vol. 50, pp. 1053–1064, April 2004.
[45] E. deCarvalho and D. Slock, Signal Processing Advances in Wireless and Mobile Communications, Chapter 7: Semi-Blind methods for FIR multi-channel estimation. Upper Saddle River, NJ 07458: Prentice Hall, first ed., 2001.
[46] D. Slock, “Blind joint equalization of multiple synchronous mobile users using oversampling and/or multiple antennas,” Conference Record of the Twenty-Eighth Asilomar Conference on Signals, Systems and Computers, 1994, vol. 2, pp. 1154–1158.
[47] T. Moore, B. Sadler, and R. Kozick, “Regularity and strict identifiability in MIMO systems,” IEEE Transactions on Signal Processing, vol. 50, pp. 1831–1842, August 2002.
[48] N. Ammar and Z. Ding, “On blind channel identifiability under space-time coded transmission,” Asilomar Conference on Signals, Systems and Computers, vol. 1, pp. 664–668, Nov. 2002.
[49] A. Swindlehurst and G. Leus, “Blind and semi-blind equalization for generalized space-time block codes,” IEEE Transactions on Signal Processing, vol. 50, pp. 2489–2498, Oct. 2002.
[50] J. Tugnait and B. Huang, “Multistep linear predictors-based blind identification and equalization of multiple-input multiple-output channels,” IEEE Transactions on Signal Processing, vol. 48, pp. 26–38, January 2000.
[51] Y. Inouye and R.-W. Liu, “A system-theoretic foundation for blind equalization of an FIR MIMO channel system,” IEEE Transactions on Circuits and Systems-I: Fundamental Theory and Applications, vol. 49, pp. 425–435, April 2002.
[52] S. Shahbazpanahi, A. Gershman, and J. H. Manton, “Closed form blind MIMO channel estimation for orthogonal space-time block codes,” IEEE Transactions on Signal Processing, vol. 53, pp. 4506–4517, Dec. 2005.
[53] S. Shahbazpanahi, A. Gershman, and G. Giannakis, “Semi-blind multi-user MIMO channel estimation based on Capon and MUSIC techniques,” Proceedings of the International Conference on Acoustics, Speech, and Signal Processing, vol. 4, pp. 773–776, 2005.
[54] P. Stoica and A. Nehorai, “Performance study of conditional and unconditional direction of arrival estimation,” IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 38, pp. 1783–1795, Oct.
[55] A. K. Jagannatham and B. D. Rao, “Whitening rotation based semi-blind MIMO channel estimation,” IEEE Transactions on Signal Processing, vol. 54, pp. 861–869, Mar. 2006.
[56] N. Dhahir and A. Sayed, “The finite-length multi-input multi-output MMSE-DFE,” IEEE Transactions on Signal Processing, vol. 48, pp. 2921–2936, Oct. 2000.
[57] A. Medles and D. T. Slock, “Augmenting the training sequence part in semi-blind estimation for MIMO channels,” Proc. of the 37th Asilomar Conference on Signals, Systems and Computers, 2003.
[58] E. Carvalho and D. Slock, “Cramer-Rao bounds for semi-blind and training sequence based channel estimation,” First IEEE Workshop on Signal Processing Advances in Wireless Communications, pp. 129–32, 1997.
[59] K. Miller, Complex Stochastic Processes: An Introduction to Theory and Application. Addison-Wesley, first ed., 1974.
[60] M. Siyau, P. Nobles, and R. Ormondroyd, “Channel estimation for layered space-time systems,” Proc. Signal Processing Advances in Wireless Communications, pp. 482–486, Jun. 2003.
[61] T. K. Y. Lo, “Maximum ratio transmission,” IEEE Trans. Commun., vol. 47, pp. 1458–1461, Oct. 1999.
[62] T. W. Anderson, An Introduction to Multivariate Statistical Analysis, ch. 11. John Wiley & Sons, 1971.
[63] A. K. Jagannatham, C. R. Murthy, and B. D. Rao, “A semi-blind MIMO channel estimation scheme for MRT,” in Proc. ICASSP, vol. 3, (Philadelphia, PA, USA), pp. 585–588, Mar. 2005.
[64] M. Kaveh and A. J. Barabell, “The statistical performance of the MUSIC and the minimum-norm algorithms in resolving plane waves in noise,” IEEE Transactions on Acoustics, Speech, and Signal Processing, vol. 34, no. 2, pp. 331–341, 1986.
[65] A. K. Jagannatham and B. D. Rao, “Complex constrained CRB and its application to semi-blind MIMO and OFDM channel estimation,” in Proc. of the IEEE SAM Workshop, 2004, (Sitges, Barcelona).
[66] D. J. Love and R. W. Heath, Jr., “Equal gain transmission in multiple-input multiple-output wireless systems,” IEEE Trans. Commun., vol. 51, pp. 1102–1110, July 2003.
[67] D. Gu and C. Leung, “Performance analysis of transmit diversity scheme with imperfect channel estimation,” IEE Electronics Letters, vol. 39, pp. 402–403, Feb. 2003.
[68] T. Baykas and A. Yongacoglu, “Robustness of transmit diversity schemes with multiple receive antennas at imperfect channel state information,” in IEEE CCECE 2003, vol. 1, pp. 191–194, May 2003.
[69] F. Mazzenga, “Channel estimation and equalization for M-QAM transmission with a hidden pilot sequence,” IEEE Transactions on Broadcasting, vol. 46, pp. 170–176, Jun 2000.
[70] B. Farhang-Boroujeny, “Pilot-based channel identification: Proposal for semi-blind identification of communication channels,” Electronics Letters, vol. 31, pp. 1044–1046, June 1995.
[71] G. Zhou, M. Viberg, and T. McKelvey, “Superimposed periodic pilots for blind channel estimation,” Proceedings of the 35th Annual Asilomar Conference on Signals, Systems and Computers, pp. 653–657, Nov 2001.
[72] J. Tugnait and W. Luo, “On channel estimation using superimposed training and first-order statistics,” IEEE Communications Letters, vol. 7, pp. 413–415, Sep. 2003.
[73] A. Orozco-Lugo, M. Lara, and D. McLernon, “Channel estimation using implicit training,” IEEE Transactions on Signal Processing, vol. 52, pp. 240–254, Jan 2004.
[74] M. Ghogho and A. Swami, “Estimation of doubly-selective channels in block transmissions using data-dependent training,” Proceedings of the European Signal Processing Conference (EUSIPCO), 2006.
[75] N. Chen and G. Zhou, “A superimposed periodic pilot scheme for semi-blind channel estimation of OFDM systems,” Proceedings of 2002 IEEE 10th Digital Signal Processing Workshop and the 2nd Signal Processing Education Workshop, pp. 362–365, Oct 2002.
[76] P. Bohlin and M. Coldrey, “Performance evaluation of MIMO communication systems based on superimposed pilots,” IEEE International Conference on Acoustics, Speech and Signal Processing, pp. 425–428, 2004.
[77] M. Coldrey, Ph.D. Thesis: Estimation and performance analysis for wireless multiple antenna communication channels. Goteborg, Sweden: Chalmers University of Technology, 2006.
[78] J. K. Tugnait, H. Shuangchi, and X. Meng, “On superimposed-training power allocation for time-varying channel estimation,” IEEE/SP 13th Workshop on Statistical Signal Processing, pp. 1330–1335, Jul. 2005.
[79] M. Biguesh and A. Gershman, “Training-based MIMO channel estimation: a study of estimator tradeoffs and optimal training signals,” IEEE Transactions on Signal Processing, vol. 54, Mar. 2006.
[80] M. Dong and L. Tong, “Optimal design and placement of pilot symbols for channel estimation,” IEEE Transactions on Signal Processing, vol. 50, December 2002.
[81] A. K. Jagannatham and B. D. Rao, “Semi-blind MIMO FIR channel estimation: Regularity and algorithms,” Submitted to the IEEE Transactions on Signal Processing.
[82] T. K. Moon and W. C. Stirling, Mathematical Methods and Algorithms for Signal Processing. Prentice Hall, first ed., 2000.
[83] H. L. Van Trees, Optimum Array Processing: Part IV of Detection, Estimation and Modulation Theory. New York: Wiley Interscience, 2002.
[84] A. K. Jagannatham and B. D. Rao, “Superimposed pilots (SP) vs. conventional pilots (CP) based MIMO wireless channel estimation,” In preparation.
[85] S. He and J. K. Tugnait, “Direct equalization of multiuser doubly-selective channels based on superimposed training,” Proc. of European Signal Processing Conf. (EUSIPCO), Florence, Italy, Sep 2006.
[86] M. Ghogho and A. Swami, “Estimation of doubly-selective channels in block transmissions using data-dependent superimposed training,” Proc. of European Signal Processing Conf. (EUSIPCO), Florence, Italy, Sep 2006.
[87] G. Giannakis and C. Tepedelenlioglu, “Basis expansion models and diversity techniques for blind identification and equalization of time-varying channels.”
[88] A. P. Dempster, N. M. Laird, and D. B. Rubin, “Maximum likelihood from incomplete data via the EM algorithm,” J. Royal Stats. Soc., vol. 39, pp. 1–38, 1977.
[89] B. Hassibi and H. Vikalo, “On the sphere decoding algorithm: Part I, the expected complexity,” submitted to IEEE Transactions on Signal Processing.
[90] “Simulation models with correct statistical properties for Rayleigh fading channels,” IEEE Transactions on Communications, vol. 51, pp. 920–928, Jun. 2003.