3D Massive MIMO and Artificial Intelligence for Next Generation
Wireless Networks
Rubayet Shafin
Dissertation submitted to the Faculty of the
Virginia Polytechnic Institute and State University
in partial fulfillment of the requirements for the degree of
Doctor of Philosophy
in
Electrical Engineering
Lingjia Liu, Chair
Jeffrey H. Reed
Harpreet S. Dhillon
Yang (Cindy) Yi
Zhenyu "James" Kong
February 25, 2020
Blacksburg, Virginia
Keywords: 3D Massive MIMO, Channel Estimation, Machine Learning for Wireless
Copyright 2020, Rubayet Shafin
3D Massive MIMO and Artificial Intelligence for Next Generation
Wireless Networks
Rubayet Shafin
(ABSTRACT)
3-dimensional (3D) massive multiple-input-multiple-output (MIMO)/full dimensional (FD)
MIMO and application of artificial intelligence are two main driving forces for next generation
wireless systems. This dissertation focuses on aspects of channel estimation and precoding
for 3D massive MIMO systems and application of deep reinforcement learning (DRL) for
MIMO broadcast beam synthesis. To be specific, downlink (DL) precoding and power allo-
cation strategies are identified for a time-division-duplex (TDD) multi-cell multi-user massive
FD-MIMO network. Utilizing channel reciprocity, DL channel state information (CSI) feed-
back is eliminated and the DL multi-user MIMO precoding is linked to the uplink (UL)
direction of arrival (DoA) estimation through estimation of signal parameters via rotational
invariance technique (ESPRIT). Assuming non-orthogonal/non-ideal spreading sequences of
the UL pilots, the performance of the UL DoA estimation is analytically characterized and
the characterized DoA estimation error is incorporated into the corresponding DL precoding
and power allocation strategy. Simulation results verify the accuracy of our analytical char-
acterization of the DoA estimation and demonstrate that the introduced multi-user MIMO
precoding and power allocation strategy outperforms existing zero-forcing based massive
MIMO strategies.
In 3D massive MIMO systems, especially in TDD mode, a base station (BS) relies on the
uplink sounding signals from mobile stations to obtain the spatial information for downlink
MIMO processing. Accordingly, multi-dimensional parameter estimation of the MIMO channel
becomes crucial for such systems to realize the predicted capacity gains. In this work, we also
study the joint estimation of elevation and azimuth angles as well as the delay parameters
for 3D massive MIMO orthogonal frequency division multiplexing (OFDM) systems under
a parametric channel model. We introduce a matrix-based joint parameter estimation
method, and analytically characterize its performance for massive MIMO OFDM systems.
Results show that antenna array configuration at the BS plays a critical role in determining
the underlying channel estimation performance, and the characterized MSEs match well with
the simulated ones. Also, the joint parametric channel estimation outperforms the MMSE-
based channel estimation in terms of the correlation between the estimated channel and the
real channel.
Beamforming in MIMO systems is one of the key technologies for modern wireless communication.
Creating wide common beams is essential for enhancing the coverage of cellular
networks and for improving the broadcast operation for control signals. However, in order to
maximize the coverage, patterns for broadcast beams need to be adapted based on the users’
movement over time. In this dissertation, we present a MIMO broadcast beam optimization
framework using deep reinforcement learning. Our proposed solution can autonomously and
dynamically adapt the MIMO broadcast beam parameters based on users’ distribution in the
network. Extensive simulation results show that the introduced algorithm can achieve the
optimal coverage, and converge to the oracle solution for both single-cell and multiple-cell
environments and for both periodic and Markov mobility patterns.
3D Massive MIMO and Artificial Intelligence for Next Generation
Wireless Networks
Rubayet Shafin
(GENERAL AUDIENCE ABSTRACT)
Multiple-input-multiple-output (MIMO) is a technology where a transmitter with multiple
antennas communicates with one or multiple receivers having multiple antennas.
3-dimensional (3D) massive MIMO is a recently developed technology where a base station
(BS) or cell tower with a large number of antennas placed in a two dimensional array com-
municates with hundreds of user terminals simultaneously. 3D massive MIMO/full dimen-
sional (FD) MIMO and application of artificial intelligence are two main driving forces for
next generation wireless systems. This dissertation focuses on aspects of channel estimation
and precoding for 3D massive MIMO systems and application of deep reinforcement learn-
ing (DRL) for MIMO broadcast beam synthesis. To be specific, downlink (DL) precoding
and power allocation strategies are identified for a time-division-duplex (TDD) multi-cell
multi-user massive FD-MIMO network. Utilizing channel reciprocity, DL channel state in-
formation (CSI) feedback is eliminated and the DL multi-user MIMO precoding is linked
to the uplink (UL) direction of arrival (DoA) estimation through estimation of signal pa-
rameters via rotational invariance technique (ESPRIT). Assuming non-orthogonal/non-ideal
spreading sequences of the UL pilots, the performance of the UL DoA estimation is analyt-
ically characterized and the characterized DoA estimation error is incorporated into the
corresponding DL precoding and power allocation strategy. Simulation results verify the
accuracy of our analytical characterization of the DoA estimation and demonstrate that the
introduced multi-user MIMO precoding and power allocation strategy outperforms existing
zero-forcing based massive MIMO strategies.
In 3D massive MIMO systems, especially in TDD mode, a BS relies on the uplink sounding
signals from mobile stations to obtain the spatial information for downlink MIMO processing.
Accordingly, multi-dimensional parameter estimation of the MIMO channel becomes crucial
for such systems to realize the predicted capacity gains. In this work, we also study the joint
estimation of elevation and azimuth angles as well as the delay parameters for 3D massive
MIMO orthogonal frequency division multiplexing (OFDM) systems under a parametric
channel model. We introduce a matrix-based joint parameter estimation method, and
analytically characterize its performance for massive MIMO OFDM systems. Results show
that antenna array configuration at the BS plays a critical role in determining the underlying
channel estimation performance, and the characterized MSEs match well with the simulated
ones. Also, the joint parametric channel estimation outperforms the MMSE-based channel
estimation in terms of the correlation between the estimated channel and the real channel.
Beamforming in MIMO systems is one of the key technologies for modern wireless communication.
Creating wide common beams is essential for enhancing the coverage of cellular
networks and for improving the broadcast operation for control signals. However, in order to
maximize the coverage, patterns for broadcast beams need to be adapted based on the users’
movement over time. In this dissertation, we present a MIMO broadcast beam optimization
framework using deep reinforcement learning. Our proposed solution can autonomously and
dynamically adapt the MIMO broadcast beam parameters based on users’ distribution in the
network. Extensive simulation results show that the introduced algorithm can achieve the
optimal coverage, and converge to the oracle solution for both single-cell and multiple-cell
environments and for both periodic and Markov mobility patterns.
To my parents, my sister, and my brother
Acknowledgments
First and foremost, I owe my deepest gratitude to my advisor, Dr. Lingjia Liu, for his
unwavering support during my PhD studies. I would like to sincerely thank him for always
believing in me. His tremendous patience and the time and effort he dedicated to me made
my PhD journey an extremely rewarding and joyful experience. His supervision and priceless
advice helped me grow not only as a researcher but also as a human being. I am also
thankful to my PhD dissertation committee members, Dr. Jeff Reed, Dr. Harpreet Dhillon,
Dr. Yang Yi, and Dr. James Kong, for their valuable comments and suggestions that helped me
improve the quality of this dissertation. I am grateful to all my lab members for their help
and support. Special thanks to Hao Chen, who helped me a lot in settling down during my
early years of PhD studies. I have always tried to emulate his great work ethic and dedication
to research. I am also thankful to all my friends in Wireless@VT for their support during
my PhD. Last but not least, I am grateful to my parents, my brother, and my sister for
their encouragement and unconditional love. Without their constant support, I could
never have been able to finish this dissertation.
Contents
List of Figures
List of Tables
1 Introduction
1.1 Massive MIMO as an Enabling Technology
1.2 AI – The New Wireless Frontier
1.3 Contribution
1.4 Organization of the Dissertation
2 Multi-Cell Multi-User Massive FD-MIMO
2.1 System and Channel Model
2.2 Uplink Channel Characterization Through DoA Estimation
2.2.1 UL DoA Estimation through Unitary ESPRIT
2.2.2 RMSE Characterization
2.2.3 Complexity Analysis
2.3 Downlink Precoding and Achievable Rate Analysis
2.3.1 Optimum Precoding for Sum-rate Maximization
2.3.2 Large-Antenna System Analysis
2.3.3 Precoding Complexity Analysis
2.4 Performance Evaluation
3 Joint Parameter Estimation for 3D Massive MIMO
3.1 System Model
3.2 Parameter Estimation Framework
3.2.1 Joint Angle and Delay Estimation Using Standard ESPRIT
3.2.2 Parameter Pairing and Channel Gains Estimation
3.3 RMSE Characterization of the Joint Angle-Delay Estimation
3.4 Simulation Results
4 Superimposed Pilot for Massive FD-MIMO Systems
4.1 Motivation and Literature Review
4.2 System and Channel Model
4.3 Uplink Channel Estimation and Performance Characterization
4.3.1 Uplink DoA Estimation using Unitary ESPRIT
4.3.2 RMSE Characterization
4.4 Achievable Rate Analysis
4.4.1 Uplink Rate Analysis
4.4.2 Optimum Downlink Precoding
4.5 Performance Evaluation
4.6 Summary of Chapter 4
5 MIMO Broadcast-Beam Optimization Through DRL
5.1 Network Model and Problem Statement
5.2 Learning Framework
5.2.1 Beam Learning Framework
5.2.2 Offline Training
5.3 DRL for Broadcast Beam Optimization
5.3.1 Background of DRL
5.3.2 Broadcast beam optimization for dynamic environment
5.4 Simulation Results and Performance Analysis
5.4.1 Results for single sector dynamic environment
5.4.2 Results for multiple sector dynamic environment
5.4.3 Multi-sectors environment with Markovian mobility pattern
5.5 Chapter Summary
6 Conclusion
Appendices
Appendix A Proofs for Chapter 2, Chapter 3
A.1 Proof of Theorem 2.4
A.2 Proof of Theorem 2.7
A.3 Proof of Theorem 2.12
A.4 Proof of Theorem 3.2
A.5 Proof of Theorem 4.2
A.6 Proof of Theorem 4.3
A.7 Proof of Theorem 4.8
A.8 Proof of Theorem 4.10
A.9 Proof of Theorem 4.12
A.10 Proof of Theorem 4.13
Bibliography
List of Figures
2.1 Network Model.
2.2 Elevation Angle Estimation for 64 Antennas.
2.3 Azimuth Angle Estimation for 64 Antennas.
2.4 Elevation Angle Estimation for 256 Antennas.
2.5 Azimuth Angle Estimation for 256 Antennas.
2.6 Average Achievable Sum-Rate Comparison.
2.7 Computational Complexity Comparison for DoA Estimation Algorithms.
2.8 Computational Complexity Comparison for Precoding Methods.
3.1 Performance of Delay Estimation.
3.2 Elevation Angle Estimation Performance.
3.3 Azimuth Angle Estimation Performance.
3.4 Correlation Between Underlying True Channel and Estimated Channel.
4.1 Uplink Transmission Phases in Superimposed Pilot System.
4.2 Elevation Angle Estimation for 64 Antennas.
4.3 Azimuth Angle Estimation for 64 Antennas.
4.4 Angle Estimation for 16 × 4 Antenna Array.
4.5 Elevation Angle Estimation for 256 Antenna Elements.
4.6 Azimuth Angle Estimation for 256 Antenna Elements.
4.7 Uplink Rate CDF when δs = 0.1 and δd = 0.9, and SNR = -5 dB.
4.8 Uplink Rate CDF when δs = 0.9 and δd = 0.1, and SNR = -5 dB.
4.9 Uplink Rate CDF when δs = 0.1 and δd = 0.9, and SNR = 20 dB.
4.10 Uplink Rate CDF when δs = 0.9 and δd = 0.1, and SNR = 20 dB.
4.11 Uplink Rate vs SNR when δs = 0.1 and δd = 0.9
4.12 Uplink Rate vs SNR when δs = 0.9 and δd = 0.1
4.13 Downlink Rate when the uplink SNR = 20 dB
4.14 Downlink Rate when the uplink SNR = 5 dB
4.15 Total Rate when δs = 0.1
4.16 Total Rate when δs = 0.9
4.17 Total Rate when δs = 0.1
4.18 Total Rate when δs = 0.9
4.19 Uplink Transmission Phases in Superimposed Pilot System.
5.1 Offline training
5.2 Reinforcement Learning Framework for Beam Optimization
5.3 DRL State Representation for Beam Optimization Problem
5.4 Replay Buffer architecture for multiple sector case
5.5 Neural Network architecture for multiple sector case
5.6 Periodic Change in Scenarios
5.7 Beam pattern corresponding to a typical RL action.
5.8 Results for periodic mobility pattern in a single sector dynamic environment:
(a) average squared difference (ASD) between reward achieved by DRL agent
and the reward obtained by Oracle; (b) average mismatch (AM) between
actions taken by the DRL agents and the Oracle.
5.9 Users’ Distribution Patterns for 2 Scenarios.
5.10 Results for periodic mobility pattern in a multiple sector dynamic environment:
(a) average squared difference (ASD) between reward achieved by DRL
agent and the reward obtained by Oracle; (b) average mismatch (AM) between
actions taken by DRL agents for each sector and the corresponding Oracles.
5.11 Instantaneous rewards (a) and instantaneous actions (b) at convergence for
multiple sectors environment and periodic user-mobility pattern.
5.12 Results for Global solution for periodic mobility pattern in a multiple sector
dynamic environment: (a) average squared difference (ASD) between reward
achieved by DRL agent and the reward obtained by Oracle; (b) average mismatch
(AM) between actions taken by DRL agents for each sector and the
corresponding Oracles.
5.13 Results for average squared difference (ASD) in reward between the DRL
agent and the Oracle for periodic mobility pattern in a single sector dynamic
environment. ASDs for different size of action space have been plotted in
figures (a) - (e).
5.14 Results for ASD in reward between DRL agent and the Oracle for periodic
mobility pattern in a multiple sector dynamic environment. ASDs for different
size of action space have been plotted in figures (a) and (b).
5.15 State Transition Diagram for Markov Mobility.
5.16 Results for Markov mobility pattern in a multiple sector dynamic environment:
(a) average squared difference (ASD) between reward achieved by DRL agent
and the reward obtained by Oracle; (b) average mismatch (AM) between
actions taken by DRL agent for each sector and the corresponding Oracles.
5.17 Instantaneous reward (a) and instantaneous actions (b) at convergence for
multiple sectors environment and Markov user mobility pattern.
List of Tables
5.1 Notation for System Variables
5.2 Simulation Parameters
Chapter 1
Introduction
Massive MIMO, or large-scale MIMO, has generated significant interest in both academia [1]
and industry [2]. Because of its promise of fulfilling future throughput demand via aggressive
spatial multiplexing, massive MIMO is considered one of the key enabling technologies for
next generation wireless networks. On the other hand, artificial intelligence (AI) or machine
learning/deep learning techniques are envisioned as the game changer for future wireless
communication. Hence, both massive MIMO and AI will play a critical role in the design of
Beyond 5G and 6G cellular networks.
1.1 Massive MIMO as an Enabling Technology
Due to form-factor limitations at the base station (BS), three-dimensional (3D) massive
MIMO/Full Dimension MIMO (FD-MIMO) systems have been introduced in 3GPP to deploy
active antenna elements in a two dimensional (2D) antenna array enabling the exploitation
of the degrees of freedom in both azimuth and elevation domains. Due to the availability
of the huge spectrum in the millimeter wave (mmWave) band, mmWave communication
is considered another enabling technology for future cellular networks: 5G and beyond.
However, due to its significantly higher path loss compared to the microwave channel, it
is extremely challenging to establish effective communication over outdoor channels using
mmWave bands. This challenge can be tackled using beamforming techniques where the base
station serves multiple users with narrower beams. This is possible if a large number of
antennas are deployed at the base station in order to realize the narrow beams. As a result,
massive MIMO is a natural counterpart for the mmWave cellular network.
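The path-loss argument above can be made concrete with a rough calculation. The following is a minimal sketch, assuming pure free-space (Friis) propagation and ideal coherent array gain; the carrier frequencies (2 GHz and 28 GHz), the 100 m distance, and the function names are illustrative assumptions and are not taken from this dissertation. Real mmWave links suffer additional losses (blockage, atmospheric absorption) that this sketch ignores.

```python
import math

def fspl_db(distance_m: float, freq_hz: float) -> float:
    """Free-space path loss in dB: 20*log10(4*pi*d*f/c)."""
    c = 299_792_458.0  # speed of light in m/s
    return 20.0 * math.log10(4.0 * math.pi * distance_m * freq_hz / c)

def ideal_array_gain_db(num_antennas: int) -> float:
    """Ideal coherent beamforming gain of an N-element array in dB."""
    return 10.0 * math.log10(num_antennas)

# Assumed example: same 100 m link at a microwave and a mmWave carrier.
extra_loss_db = fspl_db(100.0, 28e9) - fspl_db(100.0, 2e9)

# Array gain available from a 256-element array (assumed size).
gain_256_db = ideal_array_gain_db(256)
```

Under these assumptions the additional loss depends only on the frequency ratio, 20 log10(28/2), and an array of a few hundred elements provides a gain of comparable size, which is the sense in which large arrays and mmWave bands complement each other.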
Since the benefits of massive MIMO or massive FD-MIMO are limited by the accuracy of
the downlink (DL) channel state information (CSI) available at the base station, it is critical
for the BS to obtain the corresponding DL CSI. In general, the BS can obtain the
CSI through one of the following: 1) DL CSI feedback, where the CSI is
fed back from mobile stations (MSs), and 2) DL/UL channel reciprocity, where the BS estimates
the uplink (UL) CSI and infers the DL CSI through channel reciprocity. Note that
DL CSI feedback is heavily used in frequency-division-duplex (FDD) systems where only a
few bits of the corresponding DL CSI are fed back to the BS [3] to achieve a
good tradeoff between DL MIMO performance and UL feedback overhead/reliability. To
utilize the DL/UL channel reciprocity, the critical point becomes estimating the UL channel
at the BS. Based on UL pilots/reference signals sent from MSs, there are generally two
methods to estimate the UL channel. The first is to estimate the corresponding channel transfer
function (e.g., the UL channel matrix). Alternatively, the UL direction of arrival (DoA) can be
estimated at the BS using the ESPRIT algorithm [4, 5]. Even though the DoA only provides
partial information on the UL channel, it is shown in [6, 7, 8, 9] that it can be directly
linked to DL MIMO precoding in TDD systems. It is important to note that the DoA-based
MIMO precoding strategy has also been introduced to FDD systems, demonstrating
significant performance benefits in practice [10].
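To illustrate the idea of DoA estimation through rotational invariance, the following is a minimal sketch of standard one-dimensional ESPRIT for a uniform linear array. It is not the 2D Unitary ESPRIT scheme developed in later chapters; the half-wavelength spacing, the array size, the number of snapshots, the noise level, and all function names are assumptions chosen for illustration.

```python
import numpy as np

def ula_steering(thetas_rad, num_antennas, spacing=0.5):
    """Steering matrix of a uniform linear array (spacing in wavelengths)."""
    m = np.arange(num_antennas)[:, None]
    return np.exp(2j * np.pi * spacing * m * np.sin(np.atleast_1d(thetas_rad))[None, :])

def esprit_doa_1d(X, num_sources, spacing=0.5):
    """Standard ESPRIT on snapshots X of shape (antennas, snapshots)."""
    R = X @ X.conj().T / X.shape[1]            # sample covariance matrix
    _, eigvecs = np.linalg.eigh(R)             # eigenvalues in ascending order
    Es = eigvecs[:, -num_sources:]             # signal subspace
    # Shift invariance of two overlapping subarrays: Es[1:] ~ Es[:-1] @ Psi
    Psi = np.linalg.lstsq(Es[:-1], Es[1:], rcond=None)[0]
    phases = np.angle(np.linalg.eigvals(Psi))  # equals 2*pi*spacing*sin(theta)
    sines = np.clip(phases / (2 * np.pi * spacing), -1.0, 1.0)
    return np.sort(np.arcsin(sines))

# Assumed scenario: 2 uncorrelated sources, 16 antennas, 200 snapshots.
rng = np.random.default_rng(0)
true_doas = np.deg2rad([-20.0, 35.0])
A = ula_steering(true_doas, 16)
S = (rng.standard_normal((2, 200)) + 1j * rng.standard_normal((2, 200))) / np.sqrt(2)
N = 0.1 * (rng.standard_normal((16, 200)) + 1j * rng.standard_normal((16, 200)))
est_doas = esprit_doa_1d(A @ S + N, num_sources=2)
```

The estimator never searches over angles: the DoAs follow in closed form from the eigenvalues of the small matrix Psi, which is one reason ESPRIT-type methods are attractive for large arrays.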
Despite promising better performance, non-linear precoding schemes, such as dirty paper
coding (DPC) or vector perturbation, are not practical for MIMO systems due to their high
implementation complexity. In recent years, simple linear processing techniques have been
shown to offer significant performance gains for multi-user massive MIMO scenarios where
the base stations employ a large number of antennas [1]. Hence, most of the prior works
in the massive MIMO literature have focused on maximum ratio transmission (MRT) and zero
forcing (ZF)-based methods for DL MIMO precoding [11, 12]. However, for mmWave massive
FD-MIMO systems, it is possible to design low-complexity precoders with better performance
than the conventional ZF/MRT-based precoders.
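For reference, the two conventional linear precoders mentioned above can be sketched as follows. This is a generic textbook formulation, assuming a narrowband downlink model y = H W s + n in which row k of H is the channel of user k; the i.i.d. Rayleigh channel, the dimensions, and the function names are illustrative assumptions and do not reproduce the precoders proposed in this dissertation.

```python
import numpy as np

def mrt_precoder(H):
    """Maximum ratio transmission: columns proportional to the conjugate channels."""
    W = H.conj().T
    return W / np.linalg.norm(W, axis=0, keepdims=True)

def zf_precoder(H):
    """Zero forcing: right pseudo-inverse of H with unit-norm columns."""
    W = H.conj().T @ np.linalg.inv(H @ H.conj().T)
    return W / np.linalg.norm(W, axis=0, keepdims=True)

# Assumed example: K = 4 single-antenna users, M = 64 BS antennas.
rng = np.random.default_rng(1)
K, M = 4, 64
H = (rng.standard_normal((K, M)) + 1j * rng.standard_normal((K, M))) / np.sqrt(2)

# Effective channels: off-diagonal entries are inter-user interference.
G_zf = H @ zf_precoder(H)
G_mrt = H @ mrt_precoder(H)
zf_leakage = np.abs(G_zf - np.diag(np.diag(G_zf))).max()
mrt_leakage = np.abs(G_mrt - np.diag(np.diag(G_mrt))).max()
```

ZF removes inter-user interference at the cost of a K × K matrix inversion, whereas MRT maximizes the desired signal gain per user but leaves residual interference.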
Massive MIMO, with a large number of antennas deployed at the BS, promises a dramatic
increase in spectral efficiency compared to traditional small-scale MIMO systems, and
is considered a candidate technology for the next generation cellular networks (5G and
beyond). However, realizing the throughput gains promised by massive MIMO is contingent
upon the availability of accurate channel state information (CSI) at the BS for downlink
precoding. CSI can be obtained by estimating the transfer function of the channel between
transmitter and receiver. However, because of the large dimensionality, this traditional
transfer-function-based estimation may not yield the expected performance for massive MIMO
systems [13, 14]. Alternatively, channel estimation can be performed by estimating channel
parameters such as directions of arrival, delays, and channel gains [15]. When the system is
well-calibrated, the parametric approach to channel estimation can offer great performance
gains over the simple unstructured interpolation approaches [16].
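The appeal of the parametric approach is that a few path parameters describe a channel vector with many entries. The sketch below assumes a uniform linear array, a narrowband channel with three paths, and path angles that are already known (for example from a DoA estimator); the complex path gains are then obtained by least squares. The array geometry, angles, gains, noise level, and function names are all illustrative assumptions rather than the estimator developed in Chapter 3.

```python
import numpy as np

def ula_steering(thetas_rad, num_antennas, spacing=0.5):
    """Steering matrix of a uniform linear array (spacing in wavelengths)."""
    m = np.arange(num_antennas)[:, None]
    return np.exp(2j * np.pi * spacing * m * np.sin(np.atleast_1d(thetas_rad))[None, :])

def estimate_path_gains(y, thetas_rad, spacing=0.5):
    """Least-squares path gains given the path angles; returns (gains, rebuilt channel)."""
    A = ula_steering(thetas_rad, y.shape[0], spacing)
    gains, *_ = np.linalg.lstsq(A, y, rcond=None)
    return gains, A @ gains

# Assumed example: 64 antennas, 3 paths, noisy observation of the channel vector.
rng = np.random.default_rng(2)
M = 64
thetas = np.deg2rad([-30.0, 5.0, 40.0])
true_gains = np.array([1.0 + 0.5j, -0.7j, 0.4 - 0.2j])
h_true = ula_steering(thetas, M) @ true_gains
noise = 0.05 * (rng.standard_normal(M) + 1j * rng.standard_normal(M))
est_gains, h_est = estimate_path_gains(h_true + noise, thetas)

# Normalized correlation between the true and the reconstructed channel.
correlation = np.abs(np.vdot(h_true, h_est)) / (
    np.linalg.norm(h_true) * np.linalg.norm(h_est)
)
```

Here 64 channel coefficients are reconstructed from only three complex gains, so the noise is averaged over the whole array instead of being estimated entry by entry.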
An efficient method for estimating the angles and delays of multiple paths of a known signal
is presented in [17]; however, the complexity of the algorithm is prohibitively high due to its
iterative nature. Analytical results on the performance of standard ESPRIT are derived in [18];
however, the results are associated with the distribution of the eigenvectors of the sample covariance
matrix. Assuming a small additive perturbation, [19] provides an explicit first-order
expression of the signal subspace. Nevertheless, the authors in [19] only consider
the 1D parameter estimation problem. A rather general framework for MSE analysis was
presented in [20]; however, these results are very complicated, and can only be simplified
in the single-path case. The analytical results are simplified for 3D massive MIMO systems
in [6], but only angle estimation is considered. Simplified results for both angle and
delay estimation are shown in [7], but a single-carrier system is assumed.
1.2 AI – The New Wireless Frontier
Cellular data traffic has witnessed exponential growth over the last few years, primarily
due to the widespread use of mobile devices and novel application services. The Cisco Visual
Networking Index (VNI) forecast predicts a threefold increase of global IP traffic from 122
exabytes (EB) per month in 2017 to 396 EB per month in 2022. In order to handle this massive
data flow and ensure superior quality of experience (QoE) to the end users, wireless cellular
networks are also becoming extremely complicated. With the coexistence of different types of
networks, managing networks efficiently has become a critical issue for 5G [21, 22, 23] and
beyond systems. AI is
regarded as one of the frontiers of beyond 5G and 6G wireless systems [24, 25, 26, 27, 28, 29].
In order to reduce the network management complexity and operational cost, the self-organizing
network (SON) has been introduced in the Third Generation Partnership Project (3GPP) as
one of the enabling technologies for advanced mobile networks [30, 31]. SON aims to achieve
autonomous functionalities within the Radio Access Network (RAN). These self-X functionalities
include self-configuration, self-optimization, and self-healing [32, 33]. Self-optimization
within SON refers to the process of self-tuning of network parameters for achieving optimum
performance in terms of any predefined metric of interest. The idea is to dynamically update
the cellular radio resource parameters based on the changes in propagation characteristics,
traffic pattern, or network deployment scenarios. User distribution in wireless cellular networks
changes dynamically over time. These changes are the result of users’ mobility behavior.
For instance, in the daytime, users are more densely populated in commercial areas,
whereas at night, users are primarily clustered in residential areas. Users’ large time-scale
movement also depends on the specific time within the week (workdays and weekends) or year
(holidays). Accordingly, to maximize the overall throughput and coverage of the wireless
networks, cell-specific cellular radio parameters should also be updated taking into account
the changes in users’ distribution.
The multiple-input multiple-output (MIMO) system [3] is one of the backbones of current and
next generation cellular networks. Massive MIMO [1], where a large number of antennas are
deployed at the base stations (BS), is envisioned as a key enabler for 5G systems. Beamforming
refers to a MIMO technique for coherently combining the signals generated by multiple
antennas in the MIMO arrays. 3-dimensional (3D) massive MIMO/full-dimension (FD)
MIMO [2, 6, 34] promises tremendous throughput gain by enabling simultaneous beamforming
in both the elevation and azimuth domains. With a large antenna array, it is possible to create
sharp beams towards desired users, and hence reduce the interference significantly [35]; this
beamforming is used to improve a user’s throughput and is therefore user-specific. Cellular
networks, on the other hand, also need to create cell-specific wide beams. In fact, sectorization
can be viewed as a special case of a wide beam where a separate wide beam is used
to cover a separate sector belonging to the same cell site. In practice, wide beams are essential
for connecting as many users as possible. This essentially provides the coverage for cellular
networks. Another important application of wide beams is the broadcast technology for
sending out the wireless control and access signals as prescribed by LTE and LTE-Advanced
systems. As a result, generating accurate wide beam patterns that cover the maximum
number of users in the network is critical.
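The contrast between sharp user-specific beams and wide cell-specific beams can be illustrated numerically. The sketch below assumes a uniform linear array with half-wavelength spacing and uniform weights steered to broadside, and it measures the half-power beamwidth on an angular grid; using a single active element as the "wide beam" is only the simplest possible illustration and is not the broadcast-beam design studied in this dissertation. All names and parameter values are assumptions.

```python
import numpy as np

def normalized_pattern_db(weights, angles_rad, spacing=0.5):
    """Normalized power pattern (dB) of a uniform linear array for given weights."""
    m = np.arange(len(weights))[:, None]
    steering = np.exp(2j * np.pi * spacing * m * np.sin(angles_rad)[None, :])
    power = np.abs(weights.conj() @ steering) ** 2
    return 10.0 * np.log10(power / power.max() + 1e-12)

def half_power_beamwidth_deg(weights, spacing=0.5, num_points=3601):
    """Angular width (degrees) over which the pattern stays within 3 dB of its peak."""
    angles_deg = np.linspace(-90.0, 90.0, num_points)
    step_deg = 180.0 / (num_points - 1)
    pattern = normalized_pattern_db(weights, np.deg2rad(angles_deg), spacing)
    return np.count_nonzero(pattern >= -3.0) * step_deg

def uniform_weights(num_antennas):
    """Equal-amplitude, equal-phase weights (broadside beam)."""
    return np.ones(num_antennas) / np.sqrt(num_antennas)

bw_1 = half_power_beamwidth_deg(uniform_weights(1))    # single element: no directivity
bw_8 = half_power_beamwidth_deg(uniform_weights(8))
bw_64 = half_power_beamwidth_deg(uniform_weights(64))
```

The beamwidth shrinks roughly in inverse proportion to the number of coherently combined elements, so a large array naturally produces narrow beams, and covering a whole sector requires deliberately shaped weights.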
Unfortunately, most of the work in the MIMO literature focuses on maximizing MIMO
throughput or increasing the reliability of the data plane. Meanwhile, wide-beam parameters
are set manually in modern cellular networks. A group of network engineers performs drive tests
and physically visits each base station site to fix the parameters controlling the shape, tilt,
and beam-width of the wide beam. Once fixed, these wide-beam parameters are not changed until
some major fault or complaint shows up. In other words, these parameters remain unchanged for
a long period of time, often years. As a result, these parameters currently cannot be updated
based on users’ movement or change in distribution. Accordingly, this fixed parameter setup
results in a strictly suboptimal solution in terms of overall network coverage.
In cellular networks, users’ movement changes in a dynamic fashion. To maximize the coverage
area, the wide-beam parameters need to be dynamically updated based on the user
movement. Reinforcement learning (RL) has been shown to be a useful tool for dynamic spectrum
access (DSA) as well as small cell networks. A Q-learning based framework has been introduced
in [36] for managing cumulative interference, originating from multiple cognitive
radios, at the primary users’ receivers in wireless regional area networks (WRANs). The
introduced RL system is shown to autonomously learn a policy that handles the cumulative
interference at the primary users and keeps the interference level at the primary protection
contour below a predefined threshold. An RL-based power control strategy has been developed
in [37] for cognitive femtocell networks, and it has been shown that RL can enhance the
capacity of femtocells while ensuring a minimum quality of service (QoS) to macrocells. In
a similar setup, [38] proposes an RL framework for interference management in small-cell
networks. The problem of dynamic channel assignment (DCA) has been addressed in [39] by
utilizing a real-time RL-based approach. A multirate transmission control (MTC) strategy
has been proposed in [40] using the Q-learning algorithm for wideband code division multiple
access (CDMA) systems.
Recently, deep reinforcement learning (DRL) [41, 42] has been shown to be capable of learning
human-level control policies on a variety of Atari games [43]. DRL agents learn to
estimate the Q-values of selecting the best possible actions from the current state of the video
games. However, compared to traditional Q-learning, in a deep learning based Q-network, the
Q-values are approximated using a deep neural network instead of storing the Q-values for all
state-action pairs in a tabular form. As a result, DRL has the ability to estimate
Q-values even for very large state and action spaces. Our recent work [44] shows that DRL-based
resource allocation can help improve the network performance of a DSA network.
1.3 Contribution
In this dissertation, we first characterize the optimal/near-optimal DL MIMO precoding and
power allocation strategies for a TDD multi-cell multi-user mmWave massive FD-MIMO network.
An ESPRIT-based UL DoA estimation scheme is introduced, and the performance of
the DoA estimation is analytically characterized assuming non-orthogonal spreading
sequences for the UL pilots/reference signals. DL multi-user MIMO precoding and power
allocation strategies are identified based on the UL DoA estimates and their corresponding
error performance. Performance evaluation is conducted to illustrate the benefits of the
introduced MIMO precoding and power allocation strategy over our previous scheme [6] as
well as popular zero forcing (ZF)-based precoding [11, 12].
Next, we propose a parametric channel estimation framework for 3D millimeter wave massive
MIMO OFDM systems. To be specific, we jointly estimate the angle and delay parameters,
and based on the estimated angles and delays, we formulate a maximum likelihood based
estimator for estimating the complex path gains. Moreover, we analytically characterize the
root mean squared error (RMSE) of the estimation of delays, and elevation and azimuth
angles, and simplify the results for massive MIMO systems. Finally, through simulation,
we study the performance of the proposed joint estimation algorithm via the correlation
between the estimated channel and the real channel.
Finally, we present a DRL-based framework for MIMO broadcast beam optimization to
maximize the coverage instead of the throughput. This will be an important step towards
realizing the potential of SON.
The detailed contributions of this dissertation can be summarized as follows:
• First, we present a unitary ESPRIT-based uplink DoA estimation method for multi-cell
multi-user mmWave massive FD-MIMO OFDM networks. Unlike the majority of existing
work, our scheme considers a more realistic scenario where non-orthogonal spreading
sequences are used as UL pilots for both intra-cell and inter-cell users. As a result,
due to the non-zero correlation coefficients among users' spreading sequences, the UL
DoA/channel estimation is subject to intra-cell interference, inter-cell interference, and
the so-called pilot contamination.
• Second, we analytically characterize the mean square error (MSE) of unitary ESPRIT-
based UL DoA estimation for the corresponding multi-cell multi-user FD-MIMO net-
work. Our analytical results show how different perturbation components, namely
noise elements, and intra-cell and inter-cell interferences, affect the UL DoA/chan-
nel estimation performance. The MSE has been related to key physical parameters
such as number of antennas, BS array geometry, complex path gains, and correlation
coefficients between users’ spreading sequences.
• Third, we derive the sum-rate maximizing DL precoding and power allocation strategy
for our FD-MIMO system. Furthermore, we perform a large antenna array regime analysis
for DL precoding and identify the optimum power allocation under both perfect
and imperfect DoA estimation scenarios.
• Fourth, regarding MU-MIMO precoding, we validate our algorithms and analytical
results through extensive simulation. The evaluation results demonstrate that the
simulated MSEs for different antenna numbers and antenna array geometries match
the analytical expressions well for both elevation and azimuth estimation in
large SNR regimes. Moreover, we also show that the introduced sum-rate maximization
precoding strategy outperforms both eigenbeamforming and ZF-based precoding over
all SNR regimes.
• Fifth, we propose a novel framework for superimposed pilot based channel estimation
and downlink processing for 3D massive MIMO systems. We demonstrate that DoA can
be used for uplink channel estimation and downlink precoder design for superimposed
pilot systems and can offer significant performance gain over traditional orthogonal
pilot strategies under certain conditions.
• Sixth, we propose a novel parametric channel estimation framework for jointly estimating
directions of arrival and delays for 3D massive FD-MIMO systems, and analytically
characterize the estimation performance.
• Seventh, we propose a double DQN-based framework [45] for dynamically optimizing
MIMO broadcast beams for cellular networks. The proposed learning-based
algorithm can autonomously update the beam patterns based on changes in user
distribution [46].
• Finally, we propose beam optimization algorithms for both single-cell and multi-cell
environments. For the multi-cell environment, we propose a novel neural network
architecture for computing the Q-values while keeping the computational complexity
increasing only linearly with the number of BSs in the network. We present extensive
simulation work validating our proposed solution. We consider both
periodic and Markov mobility patterns, and show that the proposed DRL-based algorithm
can achieve perfect convergence with the Oracle for both single-cell and multi-cell
environments and for any user distribution.
1.4 Organization of the Dissertation
The rest of the dissertation is organized as follows. Chapter 2 presents the channel estimation
and precoding framework for multi-cell massive MU-MIMO network. Section 2.1 describes
the system model and the channel model for the underlying multi-cell multi-user massive
FD-MIMO network. Section 2.2 presents the ESPRIT-based UL DoA estimation method and
the performance characterization of DoA estimation for the multi-cell massive MU-MIMO network.
The achievable sum-rate analysis under both perfect and imperfect DoA estimation as well
as the optimal MIMO precoding and power allocation strategies are contained in Section
2.3. Simulation results for massive MU-MIMO network are presented in Section 2.4.
Chapter 3 presents our work on joint parameter estimation for massive MIMO systems. Section
3.1 describes the system model, Section 3.2 presents the framework for joint parameter
estimation, Section 3.3 studies the RMSE characterization for the joint parameter estimation,
and simulation results for joint parameter estimation are presented in Section 3.4.
Chapter 4 presents our work on superimposed pilot based framework for 3D massive MIMO
systems. Section 4.1 provides the background and motivation of the superimposed pilot sys-
tem for FD-MIMO, and highlights our contribution in this dissertation. Section 4.2 presents
the system and channel model for superimposed pilot framework. Section 4.3 presents the
DoA estimation strategy for superimposed pilot and characterizes the uplink DoA estimation
performance. Section 4.4 characterizes both uplink and downlink achievable rates for the
superimposed pilot system. Section 4.5 presents the simulation results and Section 4.6 summarizes
the chapter.
Our work on MIMO broadcast beam optimization using deep reinforcement learning (DRL)
is presented in Chapter 5. Section 5.1 presents the network model and problem statement;
Section 5.2 presents the beam learning framework; Section 5.3 introduces the DRL-based
optimization strategies for both single-cell and multi-cell environments; Section 5.4 presents
the simulation work.
Finally, we conclude the dissertation in Chapter 6.
Chapter 2
Multi-Cell Multi-User Massive
FD-MIMO
2.1 System and Channel Model
Figure 2.1: Network Model.
We consider a multi-cell multi-user MIMO-OFDM system consisting of $G$ BSs as depicted
in Figure 2.1. Each BS has $N_r$ antennas and supports $J$ mobile stations
(MSs), each having $N_t$ transmit antennas. After appending the cyclic prefix (CP), the
resulting time-domain transmit signal at each MS is first passed through a parallel-to-serial
converter followed by a digital-to-analog converter (DAC), resulting in the baseband OFDM
signal. The baseband signal is then up-converted and sent through a frequency-selective
fading channel, which is assumed to remain time-invariant during an OFDM symbol duration.
Note that we assume the same number of antennas at all MSs for clarity of exposition.
However, the algorithm and analysis presented in this work are not restricted by this
assumption. All the results presented in this work can be straightforwardly extended to
the scenario where users in the cell have different numbers of antennas.
In the UL, each MS sends $N_t$ spreading sequences of length $Q$ as pilots/reference signals, one
on each transmit antenna. Accordingly, the $N_r \times Q$ frequency-domain received signal for the
$k$-th subcarrier at the $i$-th BS can be written as
$$Z_i(k) = \sum_{g=0}^{G-1} \sum_{j=0}^{J-1} \sqrt{\Lambda_{jg,i}}\, H_{jg,i}(k)\, X_{jg}(k) + W_i(k), \tag{2.1}$$
where $H_{jg,i}(k)$ is the $N_r \times N_t$ channel matrix for the channel between the $i$-th BS and the $j$-th
MS in the $g$-th cell at the $k$-th subcarrier, and $\Lambda_{jg,i}$ is the corresponding large-scale fading
coefficient, which is independent of the subcarrier frequency; $X_{jg}(k)$ is the $N_t \times Q$ frequency-domain
transmit signal from the $j$-th MS in the $g$-th cell for the $k$-th subcarrier, and $W_i(k)$
is the corresponding $N_r \times Q$ noise matrix. Note that each row vector of $X_{jg}(k)$ is a length-$Q$
spreading sequence. The channel transfer function, $H_{jg,i}(k)$, can be written as
$$H_{jg,i}(k) = \sum_{\ell=0}^{L_{jg,i}-1} C_{jg,i}(\ell)\, e^{-j 2\pi k \ell / N_c}, \tag{2.2}$$
where $C_{jg,i}(\ell)$ is the $N_r \times N_t$ channel impulse response (CIR) for the $\ell$-th tap of the channel
between the $i$-th BS and the $j$-th MS in the $g$-th cell, and $N_c$ denotes the total number of subcarriers.
Here, we assume that the channel, which can be represented by an equivalent discrete-time
linear channel impulse response, has a finite number ($L_{jg,i}$) of non-zero taps.
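As a quick illustration of (2.2), the following minimal sketch evaluates the channel transfer function on all subcarriers from a set of tap matrices and cross-checks it against a zero-padded FFT. The function name `channel_frequency_response`, the array layout, and the random taps are assumptions made only for this example; they are not part of the system model above.

```python
import numpy as np

def channel_frequency_response(C, Nc):
    """Evaluate H(k) = sum_l C(l) exp(-j 2 pi k l / Nc) for k = 0..Nc-1.

    C has shape (L, Nr, Nt): one Nr x Nt matrix per non-zero tap (assumed layout).
    Returns an array of shape (Nc, Nr, Nt).
    """
    L = C.shape[0]
    k = np.arange(Nc)[:, None]                    # subcarrier index, shape (Nc, 1)
    ell = np.arange(L)[None, :]                   # tap index, shape (1, L)
    phase = np.exp(-2j * np.pi * k * ell / Nc)    # DFT kernel, shape (Nc, L)
    return np.einsum("kl,lrt->krt", phase, C)

# Illustrative sizes and random taps (hypothetical values).
rng = np.random.default_rng(0)
L, Nr, Nt, Nc = 3, 4, 2, 16
C = rng.standard_normal((L, Nr, Nt)) + 1j * rng.standard_normal((L, Nr, Nt))
H = channel_frequency_response(C, Nc)

# Cross-check: (2.2) is an Nc-point DFT of the zero-padded tap sequence.
H_fft = np.fft.fft(C, n=Nc, axis=0)
```

Because (2.2) is a DFT along the tap index, an FFT is the natural way to compute all subcarriers at once when $N_c$ is large.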
Using the geometric channel model for mmWave frequencies, the impulse response for the
$\ell$-th tap of the channel between the $i$-th BS and the $j$-th MS in the $g$-th cell can be represented
by [6, 7, 47]
$$C_{jg,i}(\ell) = \sum_{p=0}^{P_{jg,i,\ell}-1} \alpha_{jg,i}(\ell,p)\, e_{r,jg,i}(\ell,p)\, e_{t,jg,i}^{H}(\ell,p), \tag{2.3}$$
where $\alpha_{jg,i}(\ell,p)$, $e_{r,jg,i}(\ell,p)$, and $e_{t,jg,i}(\ell,p)$ are, respectively, the channel gain, the $N_r \times 1$ receive
antenna array response, and the $N_t \times 1$ transmit antenna array response for the $p$-th sub-path
within the $\ell$-th tap of the channel between the $i$-th BS and the $j$-th MS in the $g$-th cell; $P_{jg,i,\ell}$ is
the total number of sub-paths within the $\ell$-th tap of the channel; and $(\cdot)^H$ denotes the Hermitian
transpose operation. In the FD-MIMO network of interest, a 1D uniform linear array (ULA)
is assumed at each MS. The corresponding transmit antenna array response can be described
using the Vandermonde structure
$$e_{t,jg,i}(\ell,p) = \left[1,\; e^{j\omega_{jg,i,\ell,p}},\; \ldots,\; e^{j(N_t-1)\omega_{jg,i,\ell,p}}\right]^{T},$$
where $\omega_{jg,i,\ell,p} = (2\pi\Delta_t/\lambda)\cos\Omega_{jg,i,\ell,p}$, $\Delta_t$ is the spacing between adjacent transmit antenna
elements, $\Omega_{jg,i,\ell,p}$ is the transmit angle (DoD) for the $p$-th sub-path within the $\ell$-th tap of the channel
between the $i$-th base station and the $j$-th user in the $g$-th cell, and $\lambda$ is the carrier wavelength.
On the other hand, for FD-MIMO networks the antenna array at the BS is a 2D planar
array placed in the X-Z plane, with $M_1$ and $M_2$ antenna elements in the vertical and horizontal
directions, respectively. Accordingly, the total number of receive antenna elements at the
base station is $N_r = M_1 \times M_2$. Therefore, the receive antenna array response for the $p$-th
sub-path within the $\ell$-th tap can be expressed as $e_{r,jg,i}(\ell,p) = a(v_{jg,i,\ell,p}) \otimes a(u_{jg,i,\ell,p})$, where $\otimes$
represents the Kronecker product, and
$$a(u_{jg,i,\ell,p}) = \left[1,\; e^{ju_{jg,i,\ell,p}},\; \ldots,\; e^{j(M_1-1)u_{jg,i,\ell,p}}\right]^{T}, \qquad a(v_{jg,i,\ell,p}) = \left[1,\; e^{jv_{jg,i,\ell,p}},\; \ldots,\; e^{j(M_2-1)v_{jg,i,\ell,p}}\right]^{T}$$
can be treated as the receive steering vectors
in the elevation and azimuth domains, respectively. Here, $u_{jg,i,\ell,p} = \frac{2\pi\Delta_r}{\lambda}\cos\theta_{jg,i,\ell,p}$ and
$v_{jg,i,\ell,p} = \frac{2\pi\Delta_r}{\lambda}\sin\theta_{jg,i,\ell,p}\cos\phi_{jg,i,\ell,p}$ are the two receive spatial frequencies at the BS, $\Delta_r$ is
the spacing between adjacent antenna elements in the receive antenna array, and $\theta_{jg,i,\ell,p}$ and
$\phi_{jg,i,\ell,p}$ are the elevation and azimuth DoAs for the $p$-th sub-path within the $\ell$-th tap for the
channel between the $i$-th BS and the $j$-th MS in the $g$-th cell, respectively. In this work, we do
not consider user mobility or scheduler impact on the system performance; we will
address these important issues in our future work.
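The geometric model in (2.3) can be sketched numerically as follows. This is only an illustration under stated assumptions: half-wavelength element spacing at both ends (the `spacing_over_lambda=0.5` default), and hypothetical helper names `ula_response`, `bs_array_response`, and `tap_matrix` that do not appear in the text.

```python
import numpy as np

def ula_response(N, omega):
    """Vandermonde vector [1, e^{j w}, ..., e^{j (N-1) w}]^T."""
    return np.exp(1j * omega * np.arange(N))

def bs_array_response(M1, M2, theta, phi, spacing_over_lambda=0.5):
    """Receive response a(v) kron a(u) of the M1 x M2 planar array."""
    u = 2 * np.pi * spacing_over_lambda * np.cos(theta)                 # elevation frequency
    v = 2 * np.pi * spacing_over_lambda * np.sin(theta) * np.cos(phi)   # azimuth frequency
    return np.kron(ula_response(M2, v), ula_response(M1, u))

def tap_matrix(alphas, thetas, phis, dods, M1, M2, Nt, spacing_over_lambda=0.5):
    """C(l) = sum_p alpha_p e_r(p) e_t(p)^H for a single tap, as in (2.3)."""
    C = np.zeros((M1 * M2, Nt), dtype=complex)
    for a, th, ph, dod in zip(alphas, thetas, phis, dods):
        e_r = bs_array_response(M1, M2, th, ph, spacing_over_lambda)
        omega = 2 * np.pi * spacing_over_lambda * np.cos(dod)
        e_t = ula_response(Nt, omega)
        C += a * np.outer(e_r, e_t.conj())      # outer product e_r e_t^H
    return C

# One sub-path with illustrative (hypothetical) gain and angles.
M1, M2, Nt = 4, 3, 2
C0 = tap_matrix([1.0 + 0.5j], [np.pi / 3], [np.pi / 4], [np.pi / 6], M1, M2, Nt)
```

With a single sub-path the tap matrix has rank one, which is the structure the DoA estimation in the next section relies on.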
2.2 Uplink Channel Characterization Through DoA Estimation
In this section, we present the UL DoA estimation procedure for the multi-cell multi-user
massive FD-MIMO network assuming non-orthogonal/non-ideal spreading
sequences, and characterize the corresponding estimation performance. We choose an ESPRIT-based
method for DoA estimation over other high-resolution DoA estimation methods, such
as MUSIC, since ESPRIT offers better resolvability and unbiased estimates with lower variance.
Most importantly, ESPRIT provides significant computational advantages in terms of
faster processing speed, lower storage requirements, and indifference to knowledge of the precise
array geometry.
2.2.1 UL DoA Estimation through Unitary ESPRIT
Let the $n$-th MS in the $i$-th cell be the target user, which tries to communicate with the $i$-th
BS. In massive FD-MIMO networks, the number of scheduled users may be quite large, and
hence, due to the limited availability of orthogonal spreading codes, it may not be possible to
assign orthogonal sequences to all scheduled users. With this in mind, in this work we
assume a more realistic scenario in which only the spreading sequences used by the same MS
are orthogonal, while spreading sequences for different MSs within a cell are non-orthogonal.
Furthermore, we assume that the same pool of spreading sequences is reused across all cells
as UL pilots, complying with the 3GPP LTE/LTE-Advanced standards [48].
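Let the correlation among the spreading sequences from different MSs be denoted as $\rho_1$.
Now, for estimating the UL channel of the $n$-th MS in the $i$-th cell at the $i$-th BS, the $N_r \times Q$
received signal at the $k$-th subcarrier, $Z_i(k)$, is first correlated with the spreading sequences
of the $n$-th MS. Hence, after correlating the received signal with the target user's sequence, from
(2.1), we have
$$Z_i(k) X_{ni}^{H}(k) = \sum_{g=0}^{G-1} \sum_{j=0}^{J-1} \sqrt{\Lambda_{jg,i}}\, H_{jg,i}(k)\, X_{jg}(k) X_{ni}^{H}(k) + W'_i(k), \tag{2.4}$$
where $W'_i(k) = W_i(k) X_{ni}^{H}(k)$ is the equivalent noise element. Now, we can rewrite (2.4) as
$$\begin{aligned} Z_i(k) X_{ni}^{H}(k) ={}& \sqrt{\Lambda_{ni,i}}\, H_{ni,i}(k) + \sum_{\substack{j=0 \\ j \neq n}}^{J-1} \sqrt{\Lambda_{ji,i}}\, H_{ji,i}(k)\, \rho_1 \mathbf{1}_{N_t} + \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \sum_{\substack{j=0 \\ j \neq n}}^{J-1} \sqrt{\Lambda_{jg,i}}\, H_{jg,i}(k)\, \rho_1 \mathbf{1}_{N_t} \\ &+ \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \sqrt{\Lambda_{ng,i}}\, H_{ng,i}(k) + W'_i(k), \end{aligned} \tag{2.5}$$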
where $\mathbf{1}_{N_t}$ denotes an $N_t \times N_t$ matrix with each element being unity. In (2.5), the first term,
$\sqrt{\Lambda_{ni,i}}\, H_{ni,i}(k)$, represents the target user's UL channel; the first summation term represents
the intra-cell interference; the second (double) summation term represents the inter-cell interference
caused by users in other cells whose spreading sequences are different from (non-orthogonal to)
that of the target user; and the last summation term represents the inter-cell interference
caused by users in other cells whose spreading sequences are exactly the same as that of the
target user (pilot contamination). In realistic wireless networks such as LTE/LTE-Advanced networks,
there exists a nonzero correlation between different pilot sequences. For example, Zadoff-Chu
sequences are used because different prime-length Zadoff-Chu sequences have constant
cross-correlation [49]. To reflect this practical constraint as well as provide system design insights,
a correlation coefficient, $\rho_1$, is introduced in (2.5) as a system design parameter to capture
the tradeoff between the training sequence length and the corresponding system performance.
Note that $\rho_1 = 0$ results in the special case where all the users in a cell are assigned
orthogonal codes. It is also important to note that the value of $\rho_1$ depends on the length of
the spreading sequences the system designer chooses, which in turn depends on the coherence
time of the channel. This provides a way to investigate the impact of channel coherence on
the network performance: a smaller coherence time leads to a shorter spreading sequence,
resulting in higher values for $\rho_1$.
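The relation between sequence length and correlation can be illustrated with Zadoff-Chu sequences. The sketch below is only an example under stated assumptions: it uses the common odd-length definition $x(n) = e^{-j\pi r n(n+1)/Q}$, an illustrative prime length $Q = 31$ and roots 1 and 2, and it measures the normalized cyclic cross-correlation, which need not coincide with the exact definition of $\rho_1$ used in this chapter. The helper names are hypothetical.

```python
import numpy as np

def zadoff_chu(root, length):
    """Odd-length Zadoff-Chu sequence x(n) = exp(-j pi root n (n + 1) / length)."""
    n = np.arange(length)
    return np.exp(-1j * np.pi * root * n * (n + 1) / length)

def cyclic_cross_correlation(x, y):
    """Normalized cyclic cross-correlation sum_n x(n) conj(y(n + lag)) / N at every lag."""
    N = len(x)
    return np.array([np.vdot(np.roll(y, -lag), x) for lag in range(N)]) / N

Q = 31                                   # prime sequence length (illustrative assumption)
x1, x2 = zadoff_chu(1, Q), zadoff_chu(2, Q)

# Cross-correlation magnitude between two different roots, at all lags.
rho = np.abs(cyclic_cross_correlation(x1, x2))

# Autocorrelation of a single root, for comparison.
auto = np.abs(cyclic_cross_correlation(x1, x1))
```

For a prime length the cross-correlation magnitude is flat at $1/\sqrt{Q}$ for every lag, so halving the sequence length raises the correlation by a factor of about $\sqrt{2}$, in line with the tradeoff described above.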
Because of the large path loss, which is manifested by the large-scale fading coefficients,
the overall gains of the inter-cell interference channels are relatively small compared to that of
the target user's channel. Furthermore, owing to the small correlation coefficient, $\rho_1$,
the intra-cell interference terms can also be considered small relative to the term
for the target user's channel. Hence, during the ESPRIT-based parameter estimation phase, we
can treat the interference and noise elements together, and (2.5) can be written as
$$\hat{H}_{ni,i}(k) = Z_i(k) X_{ni}^{H}(k) = \sqrt{\Lambda_{ni,i}}\, H_{ni,i}(k) + W''_i(k), \tag{2.6}$$
where $W''_i(k)$ is the equivalent noise-plus-interference matrix. Now, using (2.2) and (2.3),
we can write $H_{ni,i}(k)$ as
$$H_{ni,i}(k) = \sum_{\ell=0}^{L_{ni,i}-1} \sum_{p=0}^{P_{ni,i,\ell}-1} \alpha_{ni,i}(\ell,p)\, e_{r,ni,i}(\ell,p)\, e_{t,ni,i,k}^{H}(\ell,p), \tag{2.7}$$
where $e_{t,ni,i,k}(\ell,p) = e_{t,ni,i}(\ell,p)\, e^{-j2\pi k\ell/N_c}$. In order to jointly estimate the elevation and azimuth
angles of the uplink channel between the $i$-th base station and the $n$-th user in the $i$-th cell, we
can now apply a low-complexity DoA estimation algorithm based on unitary ESPRIT.
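To make the shift-invariance idea behind ESPRIT concrete, the following sketch runs standard (non-unitary) least-squares ESPRIT on a noiseless one-dimensional ULA. It is a simplified stand-in, not the 2D unitary ESPRIT used in this chapter: the function `esprit_1d`, the array size, and the spatial frequencies are assumptions chosen only for illustration.

```python
import numpy as np

def esprit_1d(X, num_paths):
    """Standard least-squares ESPRIT for a uniform linear array.

    X is an (M, T) data matrix; returns num_paths spatial frequencies in (-pi, pi].
    """
    U, _, _ = np.linalg.svd(X, full_matrices=False)
    Us = U[:, :num_paths]                       # signal subspace estimate
    # Shift invariance between the two overlapping subarrays: Us[1:] = Us[:-1] @ Psi
    Psi = np.linalg.pinv(Us[:-1]) @ Us[1:]
    # Eigenvalues of Psi are e^{j w}; their phases are the spatial frequencies.
    return np.angle(np.linalg.eigvals(Psi))

rng = np.random.default_rng(1)
M, T = 8, 20                                    # antennas and snapshots (illustrative)
true_freqs = np.array([-1.0, 0.4, 2.0])         # hypothetical spatial frequencies
A = np.exp(1j * np.outer(np.arange(M), true_freqs))         # Vandermonde steering matrix
S = rng.standard_normal((3, T)) + 1j * rng.standard_normal((3, T))
est = np.sort(esprit_1d(A @ S, num_paths=3))
```

In the noiseless case the recovered frequencies agree with the true ones up to numerical precision; the unitary variant additionally applies forward-backward averaging and a real-valued transformation before the subspace step.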
High-frequency channels, especially millimeter-wave channels, usually have only a few
scattering clusters [50]. In this work, we focus on the simple case where each scattering cluster
contributes a single propagation path. This is a reasonable assumption for the analysis of
FD-MIMO systems [6, 51, 52]. Hence, for clarity of exposition and notational convenience,
we drop the sub-path index, $p$, from $\alpha_{jg,i}(\ell,p)$, $e_{r,jg,i}(\ell,p)$, and $e_{t,jg,i,k}(\ell,p)$. However,
our results also hold for the multiple sub-path scenario, because ESPRIT can
distinguish sub-paths as long as the spatial resolvability of the array is higher than
the angular spread between two sub-paths [53]. Note that in this work,
instead of Standard ESPRIT, we utilize Unitary ESPRIT [54] for DoA estimation, which
provides superior estimation performance when the sub-paths within the same
cluster are highly correlated. Moreover, because of Forward-Backward Averaging (FBA),
Unitary ESPRIT can still estimate the DoAs of two sub-paths that are
completely correlated or coherent. It is also noteworthy that it is unlikely to have
more than two completely coherent sub-paths in the mmWave propagation channel based on
the 3GPP mmWave channel model [55] and the seminal work in [56]. Therefore, the introduced
algorithm is applicable to most general mmWave channels. For the very special
case where more than two sub-paths are completely coherent, and all such sub-path DoAs
are required to be estimated, the spatial smoothing technique can be applied in conjunction
with FBA to de-correlate the corresponding signals [57]. However, this is outside the scope
of the current work and we will consider this special case in our future work. Now, (2.7)
can be written as
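$$H_{ni,i}(k) = A_{ni,i}\, D_{ni,i}\, B_{ni,i}^{H}(k), \tag{2.8}$$
where $A_{ni,i} = \left[e_{r,ni,i}(0) \;\ldots\; e_{r,ni,i}(L_{ni,i}-1)\right]$, $D_{ni,i} = \mathrm{diag}\left[\alpha_{ni,i}(0) \;\ldots\; \alpha_{ni,i}(L_{ni,i}-1)\right]$,
and $B_{ni,i}(k) = \left[e_{t,ni,i,k}(0) \;\ldots\; e_{t,ni,i,k}(L_{ni,i}-1)\right]$. Hence, from (2.6), the channel matrix,
$\hat{H}_{ni,i}(k)$, can be written as
$$\hat{H}_{ni,i}(k) = \sqrt{\Lambda_{ni,i}}\, A_{ni,i}\, D_{ni,i}\, B_{ni,i}^{H}(k) + W''_i(k). \tag{2.9}$$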
By converting all the complex matrices to real matrices, Unitary ESPRIT performs
the computations in real, instead of complex, numbers from the beginning to the end of the
algorithm, and hence reduces the computational complexity significantly. Since we are only
interested in estimating UL DoAs, the noisy channel from (2.9) can be expressed as
$$\hat{H}_{ni,i}(k) = A_{ni,i}\, S_{ni,i}(k) + W''_i(k), \tag{2.10}$$
where $S_{ni,i}(k) = \sqrt{\Lambda_{ni,i}}\, D_{ni,i}\, B_{ni,i}^{H}(k)$. In order to perform unitary ESPRIT, we need to use
forward-backward averaging on the received signal:
$$\begin{aligned} \hat{H}^{\mathrm{fba}}_{ni,i}(k) &= \left[\hat{H}_{ni,i}(k) \;\;\; \Pi_{N_r} \hat{H}^{*}_{ni,i}(k)\, \Pi_{N_t}\right] \\ &= \left[A_{ni,i} S_{ni,i}(k) \;\;\; \Pi_{N_r} A^{*}_{ni,i} S^{*}_{ni,i}(k)\, \Pi_{N_t}\right] + \left[W''_i(k) \;\;\; \Pi_{N_r} W''^{*}_i(k)\, \Pi_{N_t}\right], \end{aligned} \tag{2.11}$$
where $A^{*}$ represents the complex conjugate of $A$, and $\Pi_p$ denotes the $p \times p$ exchange matrix
with ones on its antidiagonal and zeros elsewhere. The subspace decomposition of the signal
space of the received signal through singular value decomposition can then be written as:
$$\left[A_{ni,i} S_{ni,i}(k) \;\;\; \Pi_{N_r} A^{*}_{ni,i} S^{*}_{ni,i}(k)\, \Pi_{N_t}\right] = \left[U^{\mathrm{sig}}_{ni,i} \;\;\; U^{\mathrm{noise}}_{ni,i}\right] \begin{bmatrix} \Sigma^{\mathrm{sig}}_{ni,i} & 0 \\ 0 & 0 \end{bmatrix} \begin{bmatrix} V^{\mathrm{sig}\,H}_{ni,i} \\ V^{\mathrm{noise}\,H}_{ni,i} \end{bmatrix}. \tag{2.12}$$
From this step onward, we follow our earlier work [6] in order to apply ESPRIT-based
techniques to (2.12). Hence, the details are not repeated here.
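The forward-backward averaging step in (2.11) amounts to appending a conjugated, row- and column-reversed copy of the channel estimate. A minimal sketch follows, assuming a random matrix in place of the actual estimate; the function name `forward_backward_average` is hypothetical.

```python
import numpy as np

def forward_backward_average(H):
    """Return [H, Pi_Nr conj(H) Pi_Nt], where Pi is the exchange matrix."""
    # Multiplying by exchange matrices on both sides reverses rows and columns.
    backward = np.conj(H)[::-1, ::-1]
    return np.hstack([H, backward])

rng = np.random.default_rng(2)
Nr, Nt = 6, 2                                   # illustrative sizes
H_hat = rng.standard_normal((Nr, Nt)) + 1j * rng.standard_normal((Nr, Nt))
H_fba = forward_backward_average(H_hat)

# Centro-Hermitian check: Pi_Nr conj(H_fba) Pi_2Nt reproduces H_fba.
is_centro_hermitian = np.allclose(np.conj(H_fba)[::-1, ::-1], H_fba)
```

The centro-Hermitian structure of the extended matrix is what allows the subsequent computations to be mapped to real-valued ones.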
2.2.2 RMSE Characterization
The theoretical performance of the unitary ESPRIT-based UL DoA estimation can be
characterized with the root mean squared error (RMSE) of the estimation serving as the
performance metric. Let $\hat{v}_{ni,i,\ell}$ denote the estimated spatial frequency for the $\ell$-th tap of the
target user's channel, i.e., the channel between the $i$-th BS and the $n$-th MS in the $i$-th cell; the
estimation error is then given by $\Delta v_{ni,i,\ell} = v_{ni,i,\ell} - \hat{v}_{ni,i,\ell}$. Similarly, $\Delta u_{ni,i,\ell} = u_{ni,i,\ell} - \hat{u}_{ni,i,\ell}$.
It has been shown in [20] that the unitary transformation does not affect the MSE of the
ESPRIT methods; however, the statistics of the noise and the signal subspace are changed
due to the forward-backward averaging performed in (2.11). To be specific, the covariance
and complementary covariance matrices for the equivalent noise-plus-interference matrix
$W''_i(k)$ in (2.6) become, respectively [20]:
$$R^{(\mathrm{fba})}_i(k) = \begin{bmatrix} R_i(k) & 0 \\ 0 & \Pi_{N_rN_t} R^{*}_i(k)\, \Pi_{N_rN_t} \end{bmatrix}; \qquad C^{(\mathrm{fba})}_i(k) = \begin{bmatrix} 0 & R_i(k)\, \Pi_{N_rN_t} \\ \Pi_{N_rN_t} R^{*}_i(k) & 0 \end{bmatrix}, \tag{2.13}$$
where $R_i(k) = \mathbb{E}_{\alpha,\theta,\phi,\psi}\left\{\mathrm{vec}\{W''_i(k)\}\, \mathrm{vec}\{W''_i(k)\}^{H}\right\}$, and the expectation, $\mathbb{E}_{\alpha,\theta,\phi,\psi}$, is
taken with respect to different channel realizations (i.e., w.r.t. the channel gains, the DoAs, both
azimuth and elevation, and the DoDs of the interference channels). Now the expression of the
covariance matrix, $R_i(k)$, can be simplified using the following lemma:
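Lemma 2.1. The covariance matrix, $R_i(k)$, of the equivalent noise-plus-interference matrix,
$W''_i(k)$, is given by:
$$R_i(k) = R_{i,1}(k) + R_{i,2}(k) + R_{i,3}(k) + R_{i,4}(k), \tag{2.14}$$
where $R_{i,4}(k) = \sigma^2 I_{N_rN_t}$, $\sigma^2$ is the noise variance, and
$$R_{i,1}(k) = \mathbb{E}_{\alpha,\theta,\phi,\psi}\left\{ \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \left(\sqrt{\Lambda_{ng,i}}\right)^2 R_{ng,i}(k) \right\}, \tag{2.15}$$
$$R_{i,2}(k) = \mathbb{E}_{\alpha,\theta,\phi,\psi}\left\{ \rho_1^2 \sum_{\substack{j=0 \\ j \neq n}}^{J-1} \left(\sqrt{\Lambda_{ji,i}}\right)^2 \mathcal{X}_{t,r}\, R_{ji,i}(k)\, \mathcal{X}_{t,r} \right\}, \tag{2.16}$$
$$R_{i,3}(k) = \mathbb{E}_{\alpha,\theta,\phi,\psi}\left\{ \rho_1^2 \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \sum_{\substack{j=0 \\ j \neq n}}^{J-1} \left(\sqrt{\Lambda_{jg,i}}\right)^2 \mathcal{X}_{t,r}\, R_{jg,i}(k)\, \mathcal{X}_{t,r} \right\}, \tag{2.17}$$
where $\mathcal{X}_{t,r} = (\mathbf{1}_{N_t} \otimes I_{N_r})$, and $R_{pq,r}(k) = P_{pq,r}(k) P^{H}_{pq,r}(k)$, where
$P_{pq,r}(k) = \left(B^{*}_{pq,r}(k) \otimes A_{pq,r}\right) \mathrm{vec}\{D_{pq,r}\}$.
Proof Sketch. Lemma 2.1 can be proved using the properties of matrix vectorization, and
with the assumption of the independence of channel gains.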
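The vectorization property used in the proof sketch is the identity $\mathrm{vec}\{A D B^{H}\} = (B^{*} \otimes A)\,\mathrm{vec}\{D\}$, which is where the vector $P_{pq,r}(k)$ of Lemma 2.1 comes from. The following sketch checks it numerically on random stand-in matrices; the sizes and the helper `crandn` are assumptions for illustration only.

```python
import numpy as np

rng = np.random.default_rng(3)
Nr, Nt, L = 5, 3, 2                   # illustrative dimensions

def crandn(*shape):
    """Complex Gaussian array of the given shape."""
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

A = crandn(Nr, L)                     # stand-in for A_{pq,r}
B = crandn(Nt, L)                     # stand-in for B_{pq,r}(k)
D = np.diag(crandn(L))                # diagonal matrix of path gains

def vec(M):
    """Column-stacking vectorization."""
    return M.flatten(order="F")

P = np.kron(B.conj(), A) @ vec(D)     # (B^* kron A) vec{D}
H_vec = vec(A @ D @ B.conj().T)       # vec{A D B^H}
R = np.outer(P, P.conj())             # R = P P^H, size (Nr Nt) x (Nr Nt)
```

Since $R_{pq,r}(k)$ is an outer product of a single vector, each interfering channel contributes a rank-one term before the expectation is taken.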
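It is noteworthy that in Lemma 2.1, $R_{i,1}(k)$, $R_{i,2}(k)$, $R_{i,3}(k)$, and $R_{i,4}(k)$ correspond,
respectively, to the effects of pilot contamination, intra-cell interference, inter-cell interference,
and the noise element of the noise-plus-interference signal. Now, the first-order approximation
of the mean square estimation error of $v_{ni,i,\ell}$ for Unitary ESPRIT is given by [20]:
$$\mathbb{E}\left\{(\Delta v_{ni,i,\ell})^2\right\} = \frac{1}{2}\left( r^{(v)H}_{ni,i,\ell}\, W^{*}_{ni,i,\mathrm{mat}}\, R^{(\mathrm{fba})T}_i\, W^{T}_{ni,i,\mathrm{mat}}\, r^{(v)}_{ni,i,\ell} - \mathrm{Re}\left\{ r^{(v)T}_{ni,i,\ell}\, W_{ni,i,\mathrm{mat}}\, C^{(\mathrm{fba})}_i\, W^{T}_{ni,i,\mathrm{mat}}\, r^{(v)}_{ni,i,\ell} \right\} \right), \tag{2.18}$$
where
$$r^{(v)}_{ni,i,\ell} = q_\ell \otimes \left( \left[ \left(J^{(v)}_1 U^{\mathrm{sig}}_{ni,i}\right)^{+} \left(J^{(v)}_2 / e^{jv_{ni,i,\ell}} - J^{(v)}_1\right) \right]^{T} p_\ell \right), \tag{2.19}$$
$$W_{ni,i,\mathrm{mat}} = \left(\Sigma^{\mathrm{sig}\,-1}_{ni,i}\, V^{\mathrm{sig}\,T}_{ni,i}\right) \otimes \left(U^{\mathrm{noise}}_{ni,i}\, U^{\mathrm{noise}\,H}_{ni,i}\right). \tag{2.20}$$
Here, $J^{(v)}_1 = [I_{M_2-1} \;\; 0] \otimes I_{M_1}$ and $J^{(v)}_2 = [0 \;\; I_{M_2-1}] \otimes I_{M_1}$ are the selection matrices for
the first and second subarrays, respectively, for the spatial frequency $v_{ni,i,\ell}$; $T_{ni,i}$ is the
transformation matrix, $q_\ell$ is the $\ell$-th column of $T_{ni,i}$, and $p^{T}_\ell$ is the $\ell$-th row of $T^{-1}_{ni,i}$;
$R^{(\mathrm{fba})}_i$ and $C^{(\mathrm{fba})}_i$ are the covariance and complementary covariance matrices of the
noise-plus-interference, respectively. Now, let us consider the following lemma: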
Lemma 2.2. The covariance and complementary covariance matrices of the forward-backward
averaged signal can be decomposed as
$$R^{(\mathrm{fba})}_i(k) = R^{(\mathrm{fba})}_{i,1}(k) + R^{(\mathrm{fba})}_{i,2}(k) + R^{(\mathrm{fba})}_{i,3}(k) + R^{(\mathrm{fba})}_{i,4}(k), \tag{2.21}$$
$$C^{(\mathrm{fba})}_i(k) = C^{(\mathrm{fba})}_{i,1}(k) + C^{(\mathrm{fba})}_{i,2}(k) + C^{(\mathrm{fba})}_{i,3}(k) + C^{(\mathrm{fba})}_{i,4}(k), \tag{2.22}$$
where
$$R^{(\mathrm{fba})}_{i,m}(k) = \begin{bmatrix} R_{i,m}(k) & 0 \\ 0 & \Pi_{N_rN_t} R^{*}_{i,m}(k)\, \Pi_{N_rN_t} \end{bmatrix}; \qquad C^{(\mathrm{fba})}_{i,m}(k) = \begin{bmatrix} 0 & R_{i,m}(k)\, \Pi_{N_rN_t} \\ \Pi_{N_rN_t} R^{*}_{i,m}(k) & 0 \end{bmatrix}, \tag{2.23}$$
for $m = 1, \ldots, 4$, where the $R_{i,m}(k)$ are given by Lemma 2.1.
Proof Sketch. This lemma can be proved by substituting (2.14) into (2.13), and by utilizing
the definitions of the $R_{i,m}(k)$ from Lemma 2.1.
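Using Lemma 2.2, we can separately investigate the effects of the different elements of the
noise-plus-interference signal on the DoA estimation performance, and hence can write (2.18) as
$$\mathbb{E}\left\{(\Delta v_{ni,i,\ell})^2\right\} = \sum_{m=1}^{4} \mathbb{E}\left\{(\Delta v_{ni,i,\ell})^2_m\right\},$$
where
$$\mathbb{E}\left\{(\Delta v_{ni,i,\ell})^2_m\right\} = \frac{1}{2}\left( r^{(v)H}_{ni,i,\ell}\, W^{*}_{ni,i,\mathrm{mat}}\, R^{(\mathrm{fba})T}_{i,m}\, W^{T}_{ni,i,\mathrm{mat}}\, r^{(v)}_{ni,i,\ell} - \mathrm{Re}\left\{ r^{(v)T}_{ni,i,\ell}\, W_{ni,i,\mathrm{mat}}\, C^{(\mathrm{fba})}_{i,m}\, W^{T}_{ni,i,\mathrm{mat}}\, r^{(v)}_{ni,i,\ell} \right\} \right). \tag{2.24}$$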
Now, (2.24) depends on the singular value decomposition (SVD) of the noiseless received
signal, which can be difficult to obtain at the BS. However, for massive MIMO systems, this
becomes tractable because the steering vectors become asymptotically orthogonal. We consider the
following lemma [6] to facilitate the derivation of the MSE expression for massive MIMO
systems:
Lemma 2.3. If the elevation and azimuth angles are both drawn independently from a
continuous distribution, the normalized array response vectors become orthogonal asymptotically,
that is, $\bar{e}_{r,jg,i}(m) \perp \mathrm{span}\left\{\bar{e}_{r,j'g',i'}(n) \mid \forall (j,g,i,m) \neq (j',g',i',n)\right\}$
when the number of antennas at the base station grows large, where $\bar{e}_{r,jg,i}(m) = \frac{1}{\sqrt{N_r}}\, e_{r,jg,i}(m)$.
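Using this property, we can analytically characterize the effect of each individual perturbation
element on the DoA estimation performance. To be specific, the MSE of UL DoA estimation
due to pilot contamination is given by the following theorem:
Theorem 2.4. For the massive MIMO network, the MSE of the unitary ESPRIT-based UL
DoA estimation due to pilot contamination is given by
$$\mathbb{E}_{\theta,\phi,\psi}\left\{(\Delta v_{ni,i,\ell})^2_1\right\} = \frac{1}{8|\alpha_{ni,i}(\ell)|^2 N_t^2 \Lambda_{ni,i} (M_2-1)^2 M_1^2} \times \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \Lambda_{ng,i}\, X_{ng,i} \sum_{m=0}^{L_{ng,i}-1} |\alpha_{ng,i}(m)|^2 \times \left( Y_{ng,i} + Y'_{ng,i} - 2\,\mathrm{Re}\left\{e^{j\Phi}\, \tilde{Y}_{ng,i}\right\} \right), \tag{2.25}$$
where $\Phi = \left((M_1-1)u_{ni,i,\ell} + (M_2-1)v_{ni,i,\ell}\right)$, and $X_{ng,i}$ and $Y_{ng,i}$ are given by
$$X_{ng,i} = \mathbb{E}_{\psi}\left\{ \left| 1 + e^{-j(\omega_{ni,i,\ell}-\omega_{ng,i,m})} + \ldots + e^{-j(N_t-1)(\omega_{ni,i,\ell}-\omega_{ng,i,m})} \right|^2 \right\}, \tag{2.26}$$
$$Y_{ng,i} = \mathbb{E}_{\theta,\phi}\left\{ \left| \left(1 + e^{j(u_{ni,i,\ell}-u_{ng,i,m})} + \ldots + e^{j(M_1-1)(u_{ni,i,\ell}-u_{ng,i,m})}\right) \left(e^{j(M_2-1)(v_{ni,i,\ell}-v_{ng,i,m})} - 1\right) \right|^2 \right\}, \tag{2.27}$$
and $Y'_{ng,i}$ and $\tilde{Y}_{ng,i}$ are given by
$$Y'_{ng,i} = \mathbb{E}_{\theta,\phi}\left\{ \left| \left(e^{j(M_1-1)u_{ng,i,m}} + e^{ju_{ni,i,\ell}} e^{j(M_1-2)u_{ng,i,m}} + \ldots + e^{j(M_1-1)u_{ni,i,\ell}}\right) \left(e^{j(M_2-1)v_{ni,i,\ell}} - e^{j(M_2-1)v_{ng,i,m}}\right) \right|^2 \right\}, \tag{2.28}$$
$$\begin{aligned} \tilde{Y}_{ng,i} = \mathbb{E}_{\theta,\phi}\Big[ & \left(e^{-j(M_1-1)u_{ng,i,m}} + e^{-ju_{ni,i,\ell}} e^{-j(M_1-2)u_{ng,i,m}} + \ldots + e^{-j(M_1-1)u_{ni,i,\ell}}\right) \left(e^{j(M_2-1)(v_{ng,i,m}-v_{ni,i,\ell})} - 1\right) \\ & \times \left(1 + \ldots + e^{j(M_1-1)(u_{ng,i,m}-u_{ni,i,\ell})}\right) \times \left(e^{-j(M_2-1)v_{ni,i,\ell}} - e^{-j(M_2-1)v_{ng,i,m}}\right) \Big] \end{aligned} \tag{2.29}$$
for $m = 0, \ldots, L_{ng,i}-1$, and $\mathbb{E}_{\psi}$ and $\mathbb{E}_{\theta,\phi}$ denote, respectively, expectations with respect to
the DoD and the DoAs of the interference channel.
Proof. See Appendix A.1.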
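Remark 2.5. Based on the Jacobian, the MSEs of the elevation and azimuth angles can be obtained
from the MSEs of the spatial frequencies as follows [58]:
$$\mathbb{E}_{\theta,\phi,\psi}\left\{(\Delta\theta_\ell)^2\right\} = \mathbb{E}_{\theta,\phi,\psi}\left\{(\Delta u_\ell)^2\right\} \frac{1}{\pi^2 \sin^2(\theta_\ell)}, \tag{2.30}$$
$$\mathbb{E}_{\theta,\phi,\psi}\left\{(\Delta\phi_\ell)^2\right\} = \frac{\mathbb{E}_{\theta,\phi,\psi}\left\{(\Delta u_\ell)^2\right\} \cot^2(\theta_\ell) \cot^2(\phi_\ell)}{\pi^2 \sin^2(\theta_\ell)} + \frac{\mathbb{E}_{\theta,\phi,\psi}\left\{(\Delta v_\ell)^2\right\}}{\pi^2 \sin^2(\theta_\ell) \sin^2(\phi_\ell)}. \tag{2.31}$$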
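The mapping in Remark 2.5 can be checked numerically. The sketch below assumes half-wavelength element spacing, so that $u = \pi\cos\theta$ and $v = \pi\sin\theta\cos\phi$ (which is what the factor $\pi^2$ suggests), and it assumes uncorrelated errors in $u$ and $v$, since (2.30) and (2.31) contain no cross term. The function names, angles, and MSE values are hypothetical.

```python
import numpy as np

def angle_mse_from_spatial_mse(mse_u, mse_v, theta, phi):
    """First-order mapping of spatial-frequency MSEs to (elevation, azimuth) MSEs."""
    s2 = np.pi**2 * np.sin(theta)**2
    mse_theta = mse_u / s2
    mse_phi = (mse_u / (np.tan(theta)**2 * np.tan(phi)**2) / s2
               + mse_v / (s2 * np.sin(phi)**2))
    return mse_theta, mse_phi

def angles_from_spatial(u, v):
    """Invert u = pi cos(theta), v = pi sin(theta) cos(phi)."""
    theta = np.arccos(u / np.pi)
    phi = np.arccos(v / (np.pi * np.sin(theta)))
    return theta, phi

theta0, phi0 = np.deg2rad(60.0), np.deg2rad(40.0)      # illustrative DoAs
u0, v0 = np.pi * np.cos(theta0), np.pi * np.sin(theta0) * np.cos(phi0)

# Finite-difference Jacobian of the inverse map (angles as functions of u, v).
eps = 1e-6
dth_du = (angles_from_spatial(u0 + eps, v0)[0] - theta0) / eps
dph_du = (angles_from_spatial(u0 + eps, v0)[1] - phi0) / eps
dph_dv = (angles_from_spatial(u0, v0 + eps)[1] - phi0) / eps

mse_u, mse_v = 1e-4, 2e-4                               # hypothetical MSE values
mse_theta, mse_phi = angle_mse_from_spatial_mse(mse_u, mse_v, theta0, phi0)
```

The closed-form values agree with the error propagation computed from the numerical Jacobian, which confirms that the azimuth MSE picks up contributions from both spatial frequencies while the elevation MSE depends on $u$ only.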
Remark 2.6. The expectation in (2.25) of Theorem 2.4 is not taken with respect
to time; rather, it is taken with respect to the DoAs or DoDs of the interfering users, which are
random because of the random locations of those users.
Similarly, the effects of intra-cell interference, inter-cell interference, and noise elements on the
MSE performance are characterized in Theorem 2.7, Theorem 2.8, and Theorem 2.9, respectively:
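Theorem 2.7. For the massive MIMO network, the MSE of the unitary ESPRIT-based UL
DoA estimation due to intra-cell interference is given by
$$\mathbb{E}_{\theta,\phi,\psi}\left\{(\Delta v_{ni,i,\ell})^2_2\right\} = \frac{\rho_1^2\, |X''_{ni,i,\ell}|^2}{8|\alpha_{ni,i}(\ell)|^2 N_t^2 \Lambda_{ni,i} (M_2-1)^2 M_1^2} \times \sum_{\substack{j=0 \\ j \neq n}}^{J-1} \Lambda_{ji,i}\, X_{ji,i} \sum_{m=0}^{L_{ji,i}-1} |\alpha_{ji,i}(m)|^2 \times \left( Y_{ji,i} + Y'_{ji,i} - 2\,\mathrm{Re}\left\{e^{j\Phi}\, \tilde{Y}_{ji,i}\right\} \right), \tag{2.32}$$
where $X''_{ni,i,\ell} = \sum_{m=0}^{N_t-1} e^{jm\omega_{ni,i,\ell}}$.
Proof. See Appendix A.2.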
Theorem 2.8. For the massive MIMO network, the MSE of the unitary ESPRIT-based UL
DoA estimation due to inter-cell interference is given by
$$\mathbb{E}_{\theta,\phi,\psi}\left\{(\Delta v_{ni,i,\ell})^2_3\right\} = \frac{\rho_1^2\, |X''_{ni,i,\ell}|^2}{8|\alpha_{ni,i}(\ell)|^2 N_t^2 \Lambda_{ni,i} (M_2-1)^2 M_1^2} \times \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \sum_{\substack{j=0 \\ j \neq n}}^{J-1} \Lambda_{jg,i}\, X_{jg,i} \sum_{m=0}^{L_{jg,i}-1} |\alpha_{jg,i}(m)|^2 \times \left( Y_{jg,i} + Y'_{jg,i} - 2\,\mathrm{Re}\left\{e^{j\Phi}\, \tilde{Y}_{jg,i}\right\} \right). \tag{2.33}$$
Proof. The proof is similar to the proof of Theorem 2.7.
Theorem 2.9. For the massive MIMO network, the MSE of the unitary ESPRIT-based UL
DoA estimation due to the noise element is given by
$$\mathbb{E}\left\{(\Delta v_{ni,i,\ell})^2_4\right\} = \frac{\sigma^2}{2|\alpha_{ni,i}(\ell)|^2 N_t \Lambda_{ni,i} (M_2-1)^2 M_1},$$
where $\sigma^2$ is the noise variance.
Proof. This theorem can be proved following the line of proof for Theorem 1 in [6].
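To get a feel for how the noise-induced MSE of Theorem 2.9 scales with the array dimensions, the closed form can be evaluated directly. The function name and all parameter values below are assumptions chosen for illustration; they are not taken from the simulation setup of this chapter.

```python
def noise_mse_spatial_freq(sigma2, alpha_abs2, large_scale, Nt, M1, M2):
    """Noise-induced MSE of the azimuth spatial frequency, in the form of Theorem 2.9.

    sigma2: noise variance; alpha_abs2: |alpha|^2 of the target path;
    large_scale: large-scale fading coefficient Lambda of the target user.
    """
    return sigma2 / (2.0 * alpha_abs2 * Nt * large_scale * (M2 - 1) ** 2 * M1)

# Hypothetical operating point: 8 x 8 array versus 8 x 15 array.
base = noise_mse_spatial_freq(sigma2=0.1, alpha_abs2=1.0, large_scale=1.0, Nt=2, M1=8, M2=8)
wider = noise_mse_spatial_freq(sigma2=0.1, alpha_abs2=1.0, large_scale=1.0, Nt=2, M1=8, M2=15)
ratio = base / wider
```

Because the expression depends on $(M_2-1)^2$, doubling $M_2-1$ from 7 to 14 reduces the noise-induced MSE by a factor of four, whereas the dependence on $M_1$ and $N_t$ is only linear.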
Similarly, we can also obtain the MSE expressions for the elevation spatial frequency, $\mathbb{E}\left\{(\Delta u_{ni,i,\ell})^2\right\}$.
Accordingly, based on the Jacobian matrices, we can characterize the MSE expressions for the UL
elevation and azimuth DoAs from the MSEs of the spatial frequencies.
Remark 2.10. From Theorem 2.7 and Theorem 2.8 we can observe that a non-zero correlation
among the spreading sequences of different MSs causes intra- and inter-cell interference
for UL DoA estimation, and the corresponding MSEs of the estimation are directly affected
by the correlation coefficient, $\rho_1$. On the other hand, as we can see from Theorem 2.4, the
MSE due to pilot contamination does not depend on the correlation coefficient.
Furthermore, these four theorems suggest that our original work in [6] may yield strictly
suboptimal solutions, since in that work we only consider the DoA estimation error due to noise
elements. This observation will be verified through performance evaluation in Section 2.4.
2.2.3 Complexity Analysis
In this subsection, we discuss the computational complexity of the unitary ESPRIT-based DoA
estimation procedure presented in Section 2.2.1. We first summarize the computational
complexity of some basic operations in terms of floating point operations (FLOPS). Computing
the product of two matrices of sizes $m \times n$ and $n \times p$ requires $2(n-1)mp$ FLOPS.
Taking the inverse of an $n \times n$ positive definite matrix requires $n^3 + n^2 + n$ FLOPS;
taking the SVD of an $m \times n$ matrix requires $4m^2 n + 22n^3$ FLOPS; and finding the
eigenvalues of an $n \times n$ matrix requires $n^3$ FLOPS. We can now describe the
computational complexity of each step of the presented algorithm. For the complexity analysis,
we assume that all the channels have $L$ resolvable paths. Correlating the received signal with
the training symbol matrix in (2.4) requires $C_a = 2(Q-1)N_r N_t$ FLOPS. The forward-backward
averaging in (2.11) requires $C_b = 2N_r N_t(N_r + N_t - 2)$ FLOPS. The SVD of the
forward-backward averaged received signal in (2.12) requires $C_c = 8N_r^2 N_t + 176N_t^3$
FLOPS. Solving the shift-invariance equations for the elevation and azimuth spatial frequencies
requires, respectively,
$$C_d = 2[M_2(M_1-1)-1]L^2 + [L^3 + L^2 + L] + [2(L-1)M_2 L(M_1-1)]$$
and
$$C_e = 2[M_1(M_2-1)-1]L^2 + [L^3 + L^2 + L] + [2(L-1)M_1 L(M_2-1)]$$
FLOPS. Finally, calculating the eigenvalues of the two shift-invariance operator matrices
requires $C_f = 2L^3$ FLOPS. Hence, the total computational complexity of our ESPRIT-based DoA
estimation method can be written as $C_{ESPRIT} = C_a + C_b + C_c + C_d + C_e + C_f$. Next, for
comparison, we compute the computational complexity of the MUSIC algorithm. Computing the
covariance matrix of the received signal requires $D_a = (Q+1)N_r^2$ FLOPS, and computing the
SVD of the covariance matrix requires $D_b = 26N_r^3$ FLOPS. Let $N_g$ denote the number of
grid points for the candidate DoA search. The total number of FLOPS required for extracting the
eigenvectors corresponding to the noise subspace is then
$D_c = N_g([2N_r(N_r - L)] + [2N_r - 3])$. Hence, the computational cost of the MUSIC
algorithm is $C_{MUSIC} = D_a + D_b + D_c$.
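The two FLOP counts are easy to tabulate numerically. The following is a minimal sketch that simply evaluates the expressions above; the function names are hypothetical, $N_r = M_1 M_2$ is assumed for the rectangular BS array, and the value $Q = 16$ used in the loop is an illustrative assumption, since the text does not fix it here.

```python
def esprit_flops(M1, M2, Nt, L, Q):
    """Evaluate C_ESPRIT = Ca + Cb + Cc + Cd + Ce + Cf from the text."""
    Nr = M1 * M2                          # assumed: BS array has M1 x M2 elements
    Ca = 2 * (Q - 1) * Nr * Nt            # correlation with the training matrix
    Cb = 2 * Nr * Nt * (Nr + Nt - 2)      # forward-backward averaging
    Cc = 8 * Nr**2 * Nt + 176 * Nt**3     # SVD of the averaged signal
    inv = L**3 + L**2 + L                 # inverse of an L x L matrix
    Cd = 2 * (M2 * (M1 - 1) - 1) * L**2 + inv + 2 * (L - 1) * M2 * L * (M1 - 1)
    Ce = 2 * (M1 * (M2 - 1) - 1) * L**2 + inv + 2 * (L - 1) * M1 * L * (M2 - 1)
    Cf = 2 * L**3                         # eigenvalues of the two operators
    return Ca + Cb + Cc + Cd + Ce + Cf


def music_flops(M1, M2, L, Q, Ng=360):
    """Evaluate C_MUSIC = Da + Db + Dc from the text."""
    Nr = M1 * M2
    Da = (Q + 1) * Nr**2                  # covariance matrix
    Db = 26 * Nr**3                       # SVD of the covariance matrix
    Dc = Ng * (2 * Nr * (Nr - L) + (2 * Nr - 3))  # grid search term
    return Da + Db + Dc


# Square arrays M1 = M2 = M; Q = 16 is an illustrative assumption.
for M in range(2, 9):
    print(M, esprit_flops(M, M, Nt=8, L=4, Q=16), music_flops(M, M, L=4, Q=16))
```

Under these assumed parameters the $26N_r^3$ term dominates the MUSIC count as the array grows, whereas the ESPRIT count grows only quadratically in $N_r$.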
2.3 Downlink Precoding and Achievable Rate Analysis
2.3.1 Optimum Precoding for Sum-rate Maximization
In the DL, at the i-th BS, the $N_s \times 1$ information symbol vector intended for the n-th MS
in the i-th cell on the k-th subcarrier can be expressed as
$s^{dl}_{ni}[k] = [s^{dl}_{ni,0}[k], \ldots, s^{dl}_{ni,N_s-1}[k]]$,
where $s^{dl}_{ni,p}[k]$ is the p-th information symbol intended for the n-th MS. Accordingly, the
$N_r \times 1$ downlink frequency domain transmit signal from the i-th BS can be written as
$$x^{dl}_i[k] = \sum_{j=0}^{J-1} x^{dl}_{ji}[k] = \sum_{j=0}^{J-1} V_{ji}[k]\, s^{dl}_{ji}[k], \tag{2.34}$$
where xdlji[k] = Vji[k]sdlji[k], and Vji[k] is the Nr × Ns precoding matrix for the j-th MS in
the i-th cell on the k-th subcarrier. Now, the Nt × 1 received signal at the n-th MS in the
i-th cell on the k-th subcarrier, ydlni[k], can be written as
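$$\begin{aligned}
y^{dl}_{ni}[k] &= \sum_{g=0}^{G-1} \sqrt{\Lambda_{ni,g}}\, H^{dl}_{ni,g}[k]\, x^{dl}_g[k] + n^{dl}_{ni}[k]
= \sum_{g=0}^{G-1} \sum_{j=0}^{J-1} \sqrt{\Lambda_{ni,g}}\, H^{dl}_{ni,g}[k]\, V_{jg}[k]\, s^{dl}_{jg}[k] + n^{dl}_{ni}[k] \\
&= \sqrt{\Lambda_{ni,i}}\, H^{dl}_{ni,i}[k]\, V_{ni}[k]\, s^{dl}_{ni}[k]
+ \sum_{\substack{j=0 \\ j \neq n}}^{J-1} \sqrt{\Lambda_{ni,i}}\, H^{dl}_{ni,i}[k]\, V_{ji}[k]\, s^{dl}_{ji}[k] \\
&\quad + \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \sum_{j=0}^{J-1} \sqrt{\Lambda_{ni,g}}\, H^{dl}_{ni,g}[k]\, V_{jg}[k]\, s^{dl}_{jg}[k] + n^{dl}_{ni}[k],
\end{aligned} \tag{2.35}$$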
where Hdlni,g[k] is the Nt × Nr downlink channel between the g-th BS and the n-th MS in
i-th cell on the k-th subcarrier, and ndlni[k] is the corresponding Nt × 1 noise vector at the
receiver with $E\{n^{dl}_{ni}[m]\, n^{dl\,H}_{ni}[n]\} = \sigma^2 I_{N_t}\delta(m-n)$. In (2.35), the first term is the desired signal,
while the second and third terms represent the intra- and inter-cell interferences, respectively.
Now, the rate for n-th MS in i-th cell is given by
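$$I_{ni}[k] = \log_2 \det\Bigg( I + \Lambda_{ni,i}\, H^{dl}_{ni,i}[k]\, V_{ni}[k]\, V^H_{ni}[k]\, H^{dl\,H}_{ni,i}[k]
\times \Bigg( \sum_{(j,g) \neq (n,i)} \Lambda_{ni,g}\, H^{dl}_{ni,g}[k]\, V_{jg}[k]\, V^H_{jg}[k]\, H^{dl\,H}_{ni,g}[k] + \sigma^2 I \Bigg)^{-1} \Bigg). \tag{2.36}$$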
Accordingly, the sum-rate maximization (SRM) problem can be expressed as
$$\max_{\{V_{ji}[k]\}} \; \sum_{j=0}^{J-1} I_{ji}[k]
\quad \text{s.t.} \quad \sum_{j=0}^{J-1} \mathrm{Tr}\left(V_{ji}[k]\, V_{ji}[k]^H\right) \le P_t, \tag{2.37}$$
where $P_t$ is the total power available at the BS for each subcarrier. In general, it is challenging
to solve the problem in (2.37) since it is highly non-convex. Alternatively, sum-MSE (mean
square error) minimization is another popular utility maximization problem for DL multi-user
MIMO systems. Let $T_{ni}$ be the DL receive processing matrix for the n-th MS. The
estimated received symbol vector can then be written as $\hat{s}^{dl}_{ni}[k] = T^H_{ni}\, y^{dl}_{ni}[k]$.
Now, the n-th MS's MSE matrix can be defined as
$$\begin{aligned}
E_{ni}[k] &= E\left[ \left(\hat{s}^{dl}_{ni}[k] - s^{dl}_{ni}[k]\right) \left(\hat{s}^{dl}_{ni}[k] - s^{dl}_{ni}[k]\right)^H \right] \\
&= \left(I - \sqrt{\Lambda_{ni,i}}\, T^H_{ni}\, H^{dl}_{ni,i}[k]\, V_{ni}[k]\right) \left(I - \sqrt{\Lambda_{ni,i}}\, T^H_{ni}\, H^{dl}_{ni,i}[k]\, V_{ni}[k]\right)^H \\
&\quad + \sum_{(j,g) \neq (n,i)} \Lambda_{ni,g}\, T^H_{ni}\, H^{dl}_{ni,g}[k]\, V_{jg}[k]\, V^H_{jg}[k]\, H^{dl\,H}_{ni,g}[k]\, T_{ni} + \sigma^2 T^H_{ni} R_{ni}.
\end{aligned} \tag{2.38}$$
Accordingly, the sum-MSE minimization problem can be defined as
$$\min_{\{V_{ji}[k]\}} \; \sum_{j=0}^{J-1} \varepsilon_{ji}[k]
\quad \text{s.t.} \quad \sum_{j=0}^{J-1} \mathrm{Tr}\left(V_{ji}[k]\, V_{ji}[k]^H\right) \le P_t, \tag{2.39}$$
where $\varepsilon_{ni}[k] = \mathrm{Tr}\{E_{ni}[k]\}$. The relationship between the problems in (2.37) and (2.39) can
be established by the following lemma [59]:
Lemma 2.11. The sum-rate maximization problem in (2.37) and the sum-MSE minimization
problem in (2.39) are equivalent in the sense that the optimal solutions, $\{V_{ji}[k]\}_{j=0}^{J-1}$,
for both problems are identical.
In this work, we assume that no coordination is available among BSs, which is a typical
scenario in TDD-based FD-MIMO networks. Hence, problem in (2.39) can be written as
$$\min_{\{V_{ji}[k]\}} \; \left\| T^H_i\, H^{dl}_{i,i}[k]\, V_i[k] - I \right\|_F^2
\quad \text{s.t.} \quad \sum_{j=0}^{J-1} \mathrm{Tr}\left(V_{ji}[k]\, V_{ji}[k]^H\right) \le P_t, \tag{2.40}$$
where $T^H_i = \mathrm{blkdiag}\{T^H_{0i}, T^H_{1i}, \ldots, T^H_{(J-1)i}\}$,
$V_i[k] = [V_{0i}[k], V_{1i}[k], \ldots, V_{(J-1)i}[k]]$, and
$H^{dl}_{i,i}[k] = \left[\sqrt{\Lambda_{0i,i}}\, H^{dl\,T}_{0i,i}[k], \sqrt{\Lambda_{1i,i}}\, H^{dl\,T}_{1i,i}[k], \ldots, \sqrt{\Lambda_{(J-1)i,i}}\, H^{dl\,T}_{(J-1)i,i}[k]\right]^T$.
Now, using the channel reciprocity property, the downlink channel can be written in terms of the uplink channel:
$$H^{dl}_{ni,i}[k] = H^T_{ni,i}[k] = B^*_{ni,i}\, D_{ni,i}\, A^T_{ni,i}[k], \tag{2.41}$$
where $A_{ni,i}[k] = [e_{r,ni,i,k}(0) \; \ldots \; e_{r,ni,i,k}(L_{ni,i}-1)]$ with
$e_{r,ni,i,k}(\ell) = e_{r,ni,i}(\ell)\, e^{-j2\pi k\ell/N_c}$, and
$B_{ni,i} = [e_{t,ni,i}(0) \; \ldots \; e_{t,ni,i}(L_{ni,i}-1)]$. Assuming each MS will only use its own DL CSI
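for receive processing, we have $T^H_{ni}\, H^{dl}_{ni,i}[k] = D_{ni,i}\, A^T_{ni,i}[k]$. Accordingly, the problem in (2.40)
can be expressed as
$$\min_{\{V_{ji}[k]\}} \; \left\| \tilde{D}_{i,i}\, A^T_{i,i}[k]\, V_i[k] - I \right\|_F^2
\quad \text{s.t.} \quad \sum_{j=0}^{J-1} \mathrm{Tr}\left(V_{ji}[k]\, V_{ji}[k]^H\right) \le P_t, \tag{2.42}$$
where $\tilde{D}_{i,i} = \mathrm{blkdiag}\{\tilde{D}_{0i,i}, \tilde{D}_{1i,i}, \ldots, \tilde{D}_{(J-1)i,i}\}$,
$A_{i,i}[k] = [A_{0i,i}[k], A_{1i,i}[k], \ldots, A_{(J-1)i,i}[k]]$,
and $\tilde{D}_{ni,i} = \sqrt{\Lambda_{ni,i}}\, D_{ni,i}$ accounts for both the large- and small-scale fading effects. The
solution to this problem is given by the following theorem:
Theorem 2.12. Let $\tilde{D}_{i,i}\, A^T_{i,i}[k] = U_{i,i}\, [\Lambda_{i,i}, 0]\, W^H_{i,i}$ be the SVD of the effective channel,
$\tilde{D}_{i,i}\, A^T_{i,i}[k]$, where $\Lambda_{i,i} = \mathrm{diag}\{\lambda_{0i,i}, \lambda_{1i,i}, \ldots, \lambda_{(JN_s-1)i,i}\}$. Then the optimal precoding matrix
for the problem in (2.42) is given by $V_i[k] = W_{i,i}\, [\Xi_{i,i}, 0]^T\, U^H_{i,i}$, where
$\Xi_{i,i} = \mathrm{diag}\{\xi_{0i,i}, \xi_{1i,i}, \ldots, \xi_{(JN_s-1)i,i}\}$,
and $\xi_{mi,i} = \lambda_{mi,i}/(\lambda^2_{mi,i} + \eta)$, with the smallest $\eta \ge 0$ satisfying
$\sum_{m=0}^{JN_s-1} |\xi_{mi,i}|^2 \le P_t$.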
Proof. See Appendix A.3.
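The structure of the precoder in Theorem 2.12 can be illustrated numerically. The sketch below is not the implementation used for the reported results; it assumes the effective channel is available as a full-row-rank matrix `H_eff` of size $JN_s \times N_r$, and it finds the regularization parameter $\eta$ by bisection, which is one possible choice since the transmit power $\sum_m |\xi_m|^2$ decreases monotonically in $\eta$. The function name and the random test channel are hypothetical.

```python
import numpy as np


def regularized_svd_precoder(H_eff, P_t, n_bisect=200):
    """Build V = W [Xi, 0]^T U^H with xi_m = lam_m / (lam_m**2 + eta).

    H_eff stands in for the (J*Ns) x Nr effective channel and is assumed
    to have full row rank, so that all singular values are positive.
    """
    # Economy SVD: H_eff = U @ diag(lam) @ Wh, Wh holds the first rows of W^H.
    U, lam, Wh = np.linalg.svd(H_eff, full_matrices=False)

    def power(eta):
        # Transmit power Tr(V V^H) = sum_m xi_m^2 for a given eta.
        return np.sum((lam / (lam**2 + eta)) ** 2)

    eta = 0.0
    if power(0.0) > P_t:
        lo, hi = 0.0, 1.0
        while power(hi) > P_t:        # grow the bracket until feasible
            hi *= 2.0
        for _ in range(n_bisect):     # shrink towards the smallest feasible eta
            mid = 0.5 * (lo + hi)
            if power(mid) > P_t:
                lo = mid
            else:
                hi = mid
        eta = hi

    xi = lam / (lam**2 + eta)
    V = Wh.conj().T @ np.diag(xi) @ U.conj().T
    return V, eta


# Hypothetical 4 x 16 effective channel with i.i.d. complex Gaussian entries.
rng = np.random.default_rng(0)
H_demo = (rng.standard_normal((4, 16)) + 1j * rng.standard_normal((4, 16))) / np.sqrt(2)
V_zf, eta_zf = regularized_svd_precoder(H_demo, P_t=1e6)     # loose power budget
V_reg, eta_reg = regularized_svd_precoder(H_demo, P_t=0.01)  # tight power budget
```

With a loose power budget the search returns $\eta = 0$ and the precoder inverts the effective channel exactly; with a tight budget $\eta > 0$ and the power constraint is met with (near) equality.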
From Theorem 2.12, it can be seen that the optimal precoder that minimizes the sum-MSE,
and hence maximizes the sum-rate can be constructed from the estimated UL DoAs as well
as the path gains. In this paper, we assume that the BS has perfect knowledge of the path
gains. However, path gains can also be estimated using maximum likelihood (ML) method
once the DoAs have been estimated. Our work on this aspect can be found in [60].
2.3.2 Large-Antenna System Analysis
In this section, we present the achievable rate analysis and simplified precoding strategy for
massive FD-MIMO systems. Our discussions in this sub-section are based on asymptotic
analysis. This can be viewed as the special case of Section 2.3.1 where the number of antennas
at the base station goes large asymptotically.
Achievable Rate under Perfect Channel Estimation
In this case, (2.35) can be written as
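$$y^{dl}_{ni}[k] = B^*_{ni,i}\, \tilde{D}_{ni,i}\, A^T_{ni,i}[k]\, V_{ni}[k]\, s^{dl}_{ni}[k]
+ \sum_{\substack{j=0 \\ j \neq n}}^{J-1} B^*_{ni,i}\, \tilde{D}_{ni,i}\, A^T_{ni,i}[k]\, V_{ji}[k]\, s^{dl}_{ji}[k] + n'_{ni}[k], \tag{2.43}$$
where
$$n'_{ni}[k] = \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \sum_{j=0}^{J-1} \sqrt{\Lambda_{ni,g}}\, H^{dl}_{ni,g}[k]\, V_{jg}[k]\, s^{dl}_{jg}[k] + n^{dl}_{ni}[k] \tag{2.44}$$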
is the equivalent noise-plus-inter-cell-interference vector. As the number of antennas grows
large, the right singular matrix, Wi,i, in Theorem 2.12 can be approximated as the DoA
matrix, A∗i,i. In other words, for massive FD-MIMO systems, eigen directions align with
the directions of arrivals, which is also validated in [6] and [7]. From Lemma 2.3, the array
steering vectors for different MSs become orthogonal as the number of antennas grows large,
i.e., we have $(1/N_r)\, A^H_{ji,i}[k]\, A_{j'i,i}[k] \to 0$ as $N_r \to \infty$ for all $j \neq j'$. Hence, for massive
MIMO systems, beamforming in the DoA directions nullifies the intra-cell interference.
Therefore, the optimum eigen-beamformer under perfect DoA estimation is
$$V^{eig}_{ni}[k] = \frac{1}{N_r}\, A^*_{ni,i}[k], \tag{2.45}$$
and accordingly, the received signal in (2.43) can be written as
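$$y^{dl}_{ni}[k] = B^*_{ni,i}\, \tilde{D}_{ni,i}\, s^{dl}_{ni}[k] + n'_{ni}[k]. \tag{2.46}$$
Now, the signal in (2.46), under the optimal receive processing, results in
$$\tilde{y}^{dl}_{ni}[k] = \tilde{D}_{ni,i}\, s^{dl}_{ni}[k] + \tilde{n}'_{ni}[k], \tag{2.47}$$
where $\tilde{y}^{dl}_{ni}[k] = (B^T_{ni,i} B^*_{ni,i})^{-1} B^T_{ni,i}\, y^{dl}_{ni}[k]$, and
$\tilde{n}'_{ni}[k] = (B^T_{ni,i} B^*_{ni,i})^{-1} B^T_{ni,i}\, n'_{ni}[k]$. Accordingly,
the achievable rate for the n-th user in the i-th cell, $I_{ni}[k]$, can be expressed as
$$I_{ni}[k] = \log_2 \det\Bigg( I_{L_{ni,i}} + \tilde{D}_{ni,i}\, Q^{dl}_{ni}[k]\, \tilde{D}^H_{ni,i}
\Bigg( \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \sum_{j=0}^{J-1} \Lambda_{ni,g}\, \bar{B}^H_{ni,i}\, H^{dl}_{ni,g}[k]\, V_{jg}[k]\, Q^{dl}_{jg}[k]\, V^H_{jg}[k]\, H^{dl\,H}_{ni,g}[k]\, \bar{B}_{ni,i} + \sigma^2 I \Bigg)^{-1} \Bigg), \tag{2.48}$$
where $\bar{B}^H_{ni,i} = (B^T_{ni,i} B^*_{ni,i})^{-1} B^T_{ni,i}$, and
$Q^{dl}_{ni}[k] = E\{s^{dl}_{ni}[k]\, s^{dl\,H}_{ni}[k]\}$ is the covariance matrix of
the transmit symbol vector from the i-th BS intended for the n-th MS on the k-th subcarrier.
Now, (2.48) can succinctly be written as
$$I_{ni}[k] = \log_2 \det\left( I_{L_{ni,i}} + \tilde{D}_{ni,i}\, Q^{dl}_{ni}[k]\, \tilde{D}^H_{ni,i}\, R'^{-1}_{ni}[k] \right), \tag{2.49}$$
where the inter-cell interference-plus-noise covariance matrix, $R'_{ni}[k]$, is defined as
$$R'_{ni}[k] = \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \sum_{j=0}^{J-1} \Lambda_{ni,g}\, \bar{B}^H_{ni,i}\, H^{dl}_{ni,g}[k]\, V_{jg}[k]\, Q^{dl}_{jg}[k]\, V^H_{jg}[k]\, H^{dl\,H}_{ni,g}[k]\, \bar{B}_{ni,i} + \sigma^2 I. \tag{2.50}$$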
Let us now consider the following lemma:
Lemma 2.13. Assuming all BSs apply the same precoding strategy, the equivalent inter-cell
interference-plus-noise covariance matrix, $R'_{ni}[k]$, is approximated by
$$R'_{ni}[k] \approx (\zeta_{ni} + \sigma^2)\, I_{L_{ni,i}}, \tag{2.51}$$
where $\zeta_{ni} = J(G-1)\, E\{\Lambda_{ni,g}\, p_{jg,\ell}[k]\, |\alpha_{ni,g}(\ell)|^2\}$, and $p_{jg,\ell}[k]$ is the power allocated to the
$\ell$-th symbol for the j-th user in the g-th cell on the k-th subcarrier.
Proof. This lemma can be proved by substituting (2.41) in (2.50), and by utilizing the
orthogonality property from Lemma 2.3. Details are omitted due to page limitation.
Accordingly, (2.49) results in
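$$I_{ni}[k] = \log_2 \det\left( I_{L_{ni,i}} + \frac{1}{\zeta_{ni} + \sigma^2}\, \tilde{D}_{ni,i}\, Q^{dl}_{ni}[k]\, \tilde{D}^H_{ni,i} \right). \tag{2.52}$$
Assuming a Gaussian input signal, $Q^{dl}_{ni}[k] = E\{s^{dl}_{ni}[k]\, s^{dl\,H}_{ni}[k]\} = \mathrm{diag}\{p_{ni,0}[k], \ldots, p_{ni,L-1}[k]\}$,
where $p_{ni,\ell}[k]$ is the power to be allocated to the $\ell$-th information symbol on the k-th
subcarrier for the target user. Now, using the Hadamard inequality, (2.52) can be rewritten as
$$I_{ni}[k] = \log_2 \prod_{\ell} \left( 1 + \frac{\Lambda_{ni,i}\, |\alpha_{ni,i}(\ell)|^2\, p_{ni,\ell}[k]}{\zeta_{ni} + \sigma^2} \right)
= \sum_{\ell=0}^{L_{ni,i}-1} \log_2\left(1 + \gamma_{ni,\ell}\, p_{ni,\ell}[k]\right), \tag{2.53}$$
where $\gamma_{ni,\ell} = \Lambda_{ni,i}\, |\alpha_{ni,i}(\ell)|^2 / (\zeta_{ni} + \sigma^2)$. Accordingly, the optimal power allocation under
perfect DoA estimation is the well-known water-filling solution, which can be expressed as
$$p_{ni,\ell}[k] = \left[\mu_{ni,\ell}[k] - 1/\gamma_{ni,\ell}\right]^{\diamond}, \tag{2.54}$$
where $[x]^{\diamond}$ denotes a function with $[x]^{\diamond} = 0$ when $x < 0$ and $[x]^{\diamond} = x$ when $x \ge 0$, and
$\mu_{ni,\ell}[k]$ is the corresponding Lagrange multiplier.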
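As an illustration of the water-filling rule in (2.54), the following sketch allocates a total power budget over the resolvable paths of one user. It assumes a single common water level $\mu$ and a per-user total power constraint `p_total`; both are simplifying assumptions, since the constraint is not restated at this point in the text, and the function name is hypothetical.

```python
import numpy as np


def water_filling(gamma, p_total):
    """Allocate p_l = max(mu - 1/gamma_l, 0) with sum(p) = p_total.

    gamma holds the effective path gains gamma_l > 0; mu is the common
    water level (an assumption of this sketch).
    """
    gamma = np.asarray(gamma, dtype=float)
    order = np.argsort(-gamma)                 # strongest paths first
    inv_sorted = 1.0 / gamma[order]
    # Drop the weakest path until the water level exceeds every used floor.
    for n_active in range(gamma.size, 0, -1):
        mu = (p_total + inv_sorted[:n_active].sum()) / n_active
        if mu > inv_sorted[n_active - 1]:
            break
    p = np.zeros_like(gamma)
    p[order[:n_active]] = mu - inv_sorted[:n_active]
    return p, mu


p_demo, mu_demo = water_filling([4.0, 1.0, 0.1], p_total=1.0)
```

In this example the weakest path receives no power, because its floor $1/\gamma_{\ell} = 10$ lies above the water level.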
System Achievable Rate under DoA Estimation Errors
In this case, since the BS does not have perfect DoA estimation, the array steering matrix
for the n-th MS in i-th cell, in the presence of DoA estimation error, can be expressed in the
form of
$$\hat{A}_{ni,i}[k] = \left[\hat{e}_{r,ni,i,k}(0) \;\; \hat{e}_{r,ni,i,k}(1) \;\; \ldots \;\; \hat{e}_{r,ni,i,k}(L_{ni,i}-1)\right],$$
where $\hat{e}_{r,ni,i,k}(\ell) = e^{-j2\pi k\ell/N_c}\, a(v_{ni,i,\ell} + \Delta v_{ni,i,\ell}) \otimes a(u_{ni,i,\ell} + \Delta u_{ni,i,\ell})$, and $\Delta u_{ni,i,\ell}$ and $\Delta v_{ni,i,\ell}$
represent the DoA estimation errors in the azimuth and elevation spatial frequencies for the
$\ell$-th path of the channel between the i-th BS and the n-th MS in the i-th cell. Now, let us
consider the following lemma:
Lemma 2.14. For the massive FD-MIMO OFDM system, the normalized steering vectors
$\bar{e}_{r,jg,i,k}(\ell) = (1/\sqrt{N_r})\, e_{r,jg,i,k}(\ell)$ and
$\hat{\bar{e}}_{r,j'g',i',k}(\ell') = (1/\sqrt{N_r})\, \hat{e}_{r,j'g',i',k}(\ell')$, $\forall (j, g, i, \ell) \neq (j', g', i', \ell')$,
become orthonormal asymptotically as the number of antennas $N_r \to \infty$.
Proof. A similar lemma is proved in [6], and hence omitted here.
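Using Lemma 2.14, we have $\frac{1}{N_r}\, A^T_{ni,i}[k]\, \hat{A}^*_{jg,i}[k] = 0$, $\forall (n, i) \neq (j, g)$, and
$\frac{1}{N_r}\, A^T_{ni,i}[k]\, \hat{A}^*_{ni,i}[k] = \frac{1}{N_r}\, \mathrm{diag}\{e^T_{r,ni,i,k}(0)\, \hat{e}^*_{r,ni,i,k}(0), \ldots, e^T_{r,ni,i,k}(L_{ni,i}-1)\, \hat{e}^*_{r,ni,i,k}(L_{ni,i}-1)\}$
for $(n, i) = (j, g)$. Hence,
for massive MIMO systems, we can express the optimum eigen-beamformer under imperfect
DoA estimation as
$$V^{eig}_{ni}[k] = \frac{1}{N_r}\, \hat{A}^*_{ni,i}[k]. \tag{2.55}$$
Accordingly, the achievable rate, in the presence of UL DoA estimation error, can be written as
$$I_{ni}[k] = E\left\{ \sum_{\ell=0}^{L_{ni,i}-1} \log\left(1 + \tilde{\gamma}_{ni,\ell}\, \left|e^T_{r,ni,i,k}(\ell)\, \hat{e}^*_{r,ni,i,k}(\ell)\right|^2\, \hat{p}_{ni,\ell}[k]\right) \right\}, \tag{2.56}$$
where $\hat{p}_{ni,\ell}[k]$ denotes the power to be allocated in the presence of DoA estimation error,
$\tilde{\gamma}_{ni,\ell} = \Lambda_{ni,i}\, |\alpha_{ni,i}(\ell)|^2 / (N_r^2 (\sigma^2 + \zeta_{ni}))$, and the expectation is taken with respect to the estimation
error. Using the method of Lagrange multipliers, the optimal expected power allocation on the $\ell$-th
information symbol for the n-th MS in the i-th cell on the k-th subcarrier is given by
$$E\{\hat{p}_{ni,\ell}[k]\} = \left[\mu_{ni,\ell}[k] - \frac{1}{\tilde{\gamma}_{ni,\ell}\, E\left\{\left|e^T_{r,ni,i,k}(\ell)\, \hat{e}^*_{r,ni,i,k}(\ell)\right|^2\right\}}\right]^{\diamond}, \tag{2.57}$$
where $\mu_{ni,\ell}[k]$ is the corresponding Lagrange multiplier. Finally, (2.57) can be simplified as
[6]:
$$E\{\hat{p}_{ni,\ell}[k]\} = \left[\mu_{ni,\ell}[k] - \frac{1}{\tilde{\gamma}_{ni,\ell}\, M_1^2 M_2^2}
\left(1 + \frac{M_1^2\, E[(\Delta v_{ni,i,\ell})^2]}{12}\right) \left(1 + \frac{M_2^2\, E[(\Delta u_{ni,i,\ell})^2]}{12}\right)\right]^{\diamond}. \tag{2.58}$$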
It can be observed that, in the absence of DoA estimation error, the optimal power
allocation in (2.58) converges to the water-filling solution in (2.54). It is also to be
noted here that both power allocations in (2.54) and (2.58) take into account the effects of
inter-cell interference, unlike the single-user eigen-beamforming presented in [6].
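The convergence of (2.58) to (2.54) can be checked with a few lines of code. The sketch below evaluates the bracketed term of (2.58) for a given water level; it assumes that the gain appearing in (2.56) to (2.58) equals the gain $\gamma_{ni,\ell}$ of (2.53) scaled by $1/N_r^2$ with $N_r = M_1 M_2$, as in the definition following (2.56), and the function names and numerical values are hypothetical.

```python
def inverse_gain_with_doa_error(gamma_scaled, M1, M2, mse_v, mse_u):
    """Floor subtracted from the water level in (2.58).

    gamma_scaled is the path gain normalized by Nr**2 (Nr = M1 * M2);
    mse_v and mse_u are the MSEs of the two spatial-frequency estimates.
    """
    penalty = (1 + M1**2 * mse_v / 12) * (1 + M2**2 * mse_u / 12)
    return penalty / (gamma_scaled * M1**2 * M2**2)


def expected_power(mu, gamma_scaled, M1, M2, mse_v, mse_u):
    """Expected power on one path: positive part of (mu - floor)."""
    return max(mu - inverse_gain_with_doa_error(gamma_scaled, M1, M2, mse_v, mse_u), 0.0)


# Illustrative values: gamma = 2 for an 8 x 8 array, water level mu = 1.
gamma = 2.0
gamma_scaled = gamma / (8 * 8) ** 2
p_perfect = expected_power(1.0, gamma_scaled, 8, 8, mse_v=0.0, mse_u=0.0)
p_errors = expected_power(1.0, gamma_scaled, 8, 8, mse_v=1e-3, mse_u=1e-3)
```

With zero estimation error the floor reduces to $1/\gamma_{ni,\ell}$, so the allocation coincides with (2.54); any nonzero error raises the floor and lowers the power assigned to that path for the same water level.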
2.3.3 Precoding Complexity Analysis
In this subsection, we briefly discuss the computational complexity of the proposed DoA-based
precoding strategy presented in Theorem 2.12. As in Section 2.2.3, for the computational
complexity analysis we again assume that all the channels have $L$ resolvable paths. Forming
the effective channel, $\tilde{D}_{i,i}\, A^T_{i,i}[k]$, requires $E_a = 2(JL-1)JLN_r$ FLOPS. Taking the SVD
of the effective channel requires $E_b = 4J^2L^2N_r + 22N_r^3$ FLOPS. Constructing the final
precoder requires $E_c = [2(N_r-1)N_rJL] + 2(JL-1)N_rJL$ FLOPS. Hence, the total number of
FLOPS required for our DoA-based precoder can be written as $C_{DoA} = E_a + E_b + E_c$. Next,
for comparison, we calculate the computational cost of conventional Block Diagonalization
(BD)-based precoding. Let $L_j$ denote the rank of the matrix
$[H^{dl\,T}_{0i,i}, \ldots, H^{dl\,T}_{(j-1)i,i}, H^{dl\,T}_{(j+1)i,i}, \ldots, H^{dl\,T}_{(J-1)i,i}]^T$. For the
complexity analysis, we assume that $L_j = L$, $\forall j$. Hence, following the steps of
conventional block diagonalization precoding, the total number of FLOPS required for BD is
$$C_{BD} = J\left([4(J-1)^2 N_t^2 N_r + 22N_r^3] + [2(N_r-1)N_t(N_r-L)] + [4N_t^2(N_r-L) + 22(N_r-L)^3]\right).$$
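A direct evaluation of the two precoding FLOP counts can be sketched as follows. The function names are hypothetical and the parameter values in the loop ($J = 10$ users, $L = 4$ paths, $N_t = 8$ MS antennas, square BS arrays with $N_r = M^2$) are illustrative choices, not necessarily those used for the reported figures.

```python
def doa_precoder_flops(J, L, Nr):
    """Evaluate C_DoA = Ea + Eb + Ec from the text."""
    Ea = 2 * (J * L - 1) * J * L * Nr            # forming the effective channel
    Eb = 4 * J**2 * L**2 * Nr + 22 * Nr**3       # SVD of the effective channel
    Ec = 2 * (Nr - 1) * Nr * J * L + 2 * (J * L - 1) * Nr * J * L  # final precoder
    return Ea + Eb + Ec


def bd_precoder_flops(J, L, Nr, Nt):
    """Evaluate C_BD from the text, assuming L_j = L for every user."""
    per_user = (
        (4 * (J - 1) ** 2 * Nt**2 * Nr + 22 * Nr**3)
        + 2 * (Nr - 1) * Nt * (Nr - L)
        + (4 * Nt**2 * (Nr - L) + 22 * (Nr - L) ** 3)
    )
    return J * per_user


for M in (4, 8, 12, 16):
    Nr = M * M
    print(M, doa_precoder_flops(10, 4, Nr), bd_precoder_flops(10, 4, Nr, 8))
```

The BD count carries a factor $J$ in front of its cubic terms because the decomposition is repeated per user, which is why it grows faster with the array size in this evaluation.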
2.4 Performance Evaluation
In this section, we evaluate the ESPRIT-based UL DoA estimation for multi-cell multi-
user massive FD-MIMO OFDM networks through simulation. For simulation evaluation, we
consider seven hexagonal cells with MSs uniformly distributed in each cell. Without loss
of generality, we assume that number of co-scheduled MSs in each cell is 10. An M1 ×M2
(antenna elements in elevation direction, and antenna elements in the azimuth direction)
rectangular antenna array is assumed at the BS, whereas the mobile device has a uniform
linear array. Different MSs use non-orthogonal spreading sequences as UL pilots, and
the same pool of sequences is reused in all seven cells. Therefore, in the UL, the target BS
is subject to intra-cell interference as well as interference from MSs in six other neighboring
cells for the purpose of DoA estimation. Cell radius is set to be 1000 meters. The system
is assumed to operate at the mmWave band with 28 GHz carrier frequency. 4 dominant
clusters are assumed for each UL channel from the MS to the BS, and each cluster contributes
one resolvable path. The antenna spacing for both the received and transmit antenna arrays
is assumed to be 0.5λ. The number of transmit antennas at each MS is set to be 8. In this
paper, we invoke the far field assumption, and the wavefront impinging on the antenna array
is assumed to be planar. The transmission medium is assumed to be isotropic and linear.
Figure 2.2: Elevation Angle Estimation for 64 Antennas. (Plot of RMSE for angle estimation, in degrees, versus SNR in dB; simulation and analytical results for the 16 × 4, 4 × 16, and 8 × 8 arrays.)
The estimation performance of elevation and azimuth angles for 8 × 8, 4 × 16, and 16 × 4
antenna arrays are shown in Fig. 2.2 and Fig. 2.3, respectively, where the RMSE of the
DoA estimation has been used as the performance metric, and the correlation coefficient
of spreading sequences, ρ1, is chosen to be 0.1. As the figure suggests, the analytical re-
sults of DoA estimation match well with that of empirical results asymptotically with SNR.
Furthermore, antenna array geometry has a significant impact on estimation performance.
Figure 2.3: Azimuth Angle Estimation for 64 Antennas. (Plot of RMSE for angle estimation, in degrees, versus SNR in dB; simulation and analytical results for the 16 × 4, 4 × 16, and 8 × 8 arrays.)
Fig. 2.2 clearly suggests that the 16× 4 antenna array performs better than 8× 8 and 4× 16
arrays in elevation angle estimation. However, 8×8 array configuration may outperform the
4× 16 configuration in azimuth angle estimation as shown in Fig. 2.3. This is quite counter-
intuitive since the 4×16 array has more elements in the azimuth domain. The reason mainly
comes from the fact that the azimuth DoA estimation is actually coupled with elevation DoA
estimation. For the 4×16 array, the performance of the elevation DoA estimation may be so
bad that it affects the azimuth DoA estimation performance. This dependence is manifested
through Jacobians (see Remark 2.5), which, in fact, results from the underlying physics/
coordinate system of the 3D MIMO model. On the other hand, elevation estimation is
not dependent on azimuth estimation, and hence, 16 × 4 array geometry still outperforms
8× 8 array in elevation angle estimation. These observations can provide important design
intuitions for FD-MIMO networks that adopt subspace-based channel estimation methods.
The elevation and azimuth angle estimation results for 16× 16, 8× 32, and 32× 8 antenna
arrays are shown in Fig. 2.4 and Fig. 2.5, respectively. Comparing the results with those
presented in Fig. 2.2 and Fig. 2.3, we can observe that as the total number of antennas
increases, the DoA estimation accuracy accordingly increases, which is also evident from our
analytical results.
In Fig. 2.6, the average achievable sum-rates for different precoding strategies are compared
for multi-cell multi-user massive FD-MIMO networks. Five schemes are compared: the
introduced scheme presented in Theorem 2.12; the block-diagonalization based zero forcing
(BD-ZF) precoding method [11, 12] assuming full CSI at the BS; and three eigen-beamforming
schemes based on the large antenna system analysis. To be specific, Scheme A is the
single-user eigen-beamforming introduced in [6]. This scheme uses the eigen-beamformer in (2.55),
and applies the modified water-filling power allocation presented in [6], taking into account
the DoA estimation error due to the noise. However, Scheme A does not incorporate the effects
of intra/inter-cell interference into the power allocation. In Scheme B, (2.55) is used as the
beamformer and the traditional water-filling in (2.54) is used as the power allocation, assuming
ideal DoA estimation. Scheme C uses the same beamformer as Schemes A and B; however, it
utilizes the power allocation in (2.58), considering the DoA estimation error due to
intra/inter-cell interference of the network. Fig. 2.6 clearly suggests that the scheme introduced
in Theorem 2.12 achieves the best performance among all precoding strategies over the entire SNR
regime of interest. Even assuming full CSI at the BS, the BD-ZF scheme performs worst in
the medium to high SNR regime. This suggests that the BD-ZF based precoding strategy may
yield strictly suboptimal performance for massive FD-MIMO networks. It is to be noted
here that even though BD uses full channel state information, the performance gain of the
DoA-based method over the BD method comes from the fact that the DoA-based method utilizes
the structure of the underlying MIMO channel, whereas the BD method does not.
Figure 2.4: Elevation Angle Estimation for 256 Antennas. (Plot of RMSE for angle estimation, in degrees, versus SNR in dB; simulation and analytical results for the 32 × 8, 8 × 32, and 16 × 16 arrays.)
For the three eigen-beamforming schemes based on the large antenna system analysis, we
have the following observation: Scheme C outperforms both Schemes A and B over the
entire SNR regime since Scheme C considers the comprehensive characterization of the DoA
estimation for power allocation as discussed in Remark 1. Scheme A performs better than
Scheme B at low SNRs, indicating the importance of incorporating the DoA estimation error.
Since the DoA estimation error decreases as SNR increases, both Schemes A and B approach
Scheme C asymptotically. This is because the power allocation in (2.58) converges to the
water-filling solution in (2.54) with increasing SNR.
Figure 2.5: Azimuth Angle Estimation for 256 Antennas. (Plot of RMSE for angle estimation, in degrees, versus SNR in dB; simulation and analytical results for the 32 × 8, 8 × 32, and 16 × 16 arrays.)
Figure 2.6: Average Achievable Sum-Rate Comparison. (Plot of sum-rate in b/s/Hz versus SNR in dB for the sum-rate-maximizing precoding, eigen-beamforming Schemes A, B, and C, and BD-ZF with full CSI.)
Figure 2.7: Computational Complexity Comparison for DoA Estimation Algorithms. (Plot of computational complexity in FLOPS, scale 10^6, versus M1 = M2 for ESPRIT and MUSIC.)
In Fig. 2.7 we compare the computational complexity of our ESPRIT-based DoA estimation
method with that of the widely used MUSIC algorithm, for a square BS antenna array (i.e., M1 =
M2). The number of transmit antennas is 8, and the number of paths in the channel is 4. For the
MUSIC algorithm, the number of grid points for the candidate DoA search is 360, which is a
typical number. We can observe that as the number of antennas increases, the complexity of the
MUSIC algorithm increases much faster than that of the ESPRIT algorithm. Hence, for massive
MIMO systems, MUSIC-based DoA estimation will incur a significantly larger computational
burden than the ESPRIT method. In Fig. 2.8, the computational complexity of our proposed
DoA-based precoding scheme (Theorem 2.12) is compared with that of traditional Block
Diagonalization (BD) precoding. We can observe that for low and mid-size arrays, the
complexities of both algorithms are similar. However, as the number of antennas increases, the
DoA-based precoder outperforms the BD method in terms of computational complexity.
Figure 2.8: Computational Complexity Comparison for Precoding Methods. (Plot of computational complexity in FLOPS, scale 10^8, versus M1 = M2 for DoA-based precoding and BD precoding.)
Chapter 3
Joint Parameter Estimation for 3D
Massive MIMO
3.1 System Model
Consider a MIMO-OFDM system in uplink with Nr receive antennas at the base station
(BS), and a single transmit antenna at the user equipment (UE). The BS antenna array is
in the form of uniform rectangular array (URA) so that the estimation degree of freedom from
both vertical and horizontal dimensions can be achieved. At UE, the high-rate information
symbols to be transmitted are grouped into blocks of length Nc. The i-th such block at the
transmitter can be represented as xi = [xi(0), xi(1), . . . , xi(Nc − 1)]T , where xi(k) denotes
the information symbol on the k-th subcarrier within i-th OFDM block at the transmitter.
The continuous-time received signal at time instant $t$ for the i-th OFDM symbol can then
be represented as
$$y_i(t) = \sum_{\ell=0}^{L-1} \alpha_i(\ell)\, e_r(\ell)\, s_i(t - \tau_\ell) + w_i(t), \tag{3.1}$$
where $s_i(t) = \sum_{k=0}^{N_c-1} x_i(k)\, e^{j2\pi kt/T}$, $\alpha_i(\ell)$ is the complex channel gain for the $\ell$-th resolvable
path during the i-th OFDM symbol, $e_r(\ell)$ and $\tau_\ell$ are, respectively, the $N_r \times 1$ receive
antenna array response and the path delay for the $\ell$-th tap, $T$ is the OFDM symbol duration,
and $w_i(t)$ is the corresponding noise element. It is noteworthy here that DoAs and delays
are relatively long-term statistics of the channel, and change much more slowly than the channel
gains.
The BS antenna array is placed in the X-Z plane, with $M_1$ and $M_2$ antenna elements in the
vertical and horizontal directions, respectively. Accordingly, the total number of receive antenna
elements at the base station is $N_r = M_1 M_2$. Since the antenna elements are placed in a
2D plane, for each resolvable path, there will be an azimuth DoA and an elevation DoA.
Therefore, the receive antenna array response can be expressed as $e_r(\ell) = a(v_\ell) \otimes a(u_\ell)$,
where $\otimes$ represents the Kronecker product. The vectors
$a(u_\ell) = [1 \;\; e^{ju_\ell} \;\; \ldots \;\; e^{j(M_1-1)u_\ell}]^T$ and
$a(v_\ell) = [1 \;\; e^{jv_\ell} \;\; \ldots \;\; e^{j(M_2-1)v_\ell}]^T$
can be viewed as the receive steering vectors for the elevation
and azimuth angles, respectively. Here, $u_\ell = \frac{2\pi d}{\lambda} \cos\theta_\ell$ and
$v_\ell = \frac{2\pi d}{\lambda} \sin\theta_\ell \cos\phi_\ell$ are the two
receive spatial frequencies at the base station, $d$ is the spacing between adjacent antenna
elements, $\lambda$ is the carrier wavelength, and $\theta_\ell$ and $\phi_\ell$ are the elevation and azimuth DoAs of
the $\ell$-th path, respectively.
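The array response defined above can be generated with a single Kronecker product. The following minimal sketch assumes angles in radians and half-wavelength spacing by default; the function names are hypothetical.

```python
import numpy as np


def spatial_frequencies(theta, phi, d_over_lambda=0.5):
    """Return (u, v) for elevation theta and azimuth phi, both in radians."""
    u = 2 * np.pi * d_over_lambda * np.cos(theta)
    v = 2 * np.pi * d_over_lambda * np.sin(theta) * np.cos(phi)
    return u, v


def ura_response(theta, phi, M1, M2, d_over_lambda=0.5):
    """Nr x 1 response e_r = a(v) kron a(u), with Nr = M1 * M2."""
    u, v = spatial_frequencies(theta, phi, d_over_lambda)
    a_u = np.exp(1j * u * np.arange(M1))   # elevation steering vector, length M1
    a_v = np.exp(1j * v * np.arange(M2))   # azimuth steering vector, length M2
    return np.kron(a_v, a_u)


e_demo = ura_response(np.pi / 3, np.pi / 4, M1=4, M2=3)
u_demo, v_demo = spatial_frequencies(np.pi / 3, np.pi / 4)
```

Because $a(u_\ell)$ is the inner factor of the Kronecker product, consecutive entries of the response differ by the phase $u_\ell$, while entries that are $M_1$ positions apart differ by the phase $v_\ell$.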
Now, after sampling, the discrete time received signal at the n-th time sample can be written as
$$y_i[n] = \sum_{\ell=0}^{L-1} \sum_{k=0}^{N_c-1} x_i(k)\, e^{j2\pi kn/N_c}\, e^{-j2\pi k \Delta f \tau_\ell}\, \alpha_i(\ell)\, e_r(\ell) + w_i[n], \tag{3.2}$$
where $\Delta f$ denotes the OFDM subcarrier spacing. After taking the FFT, the frequency domain
received signal at the k-th subcarrier can therefore be written as
$$y_i[k] = x_i(k) \sum_{\ell=0}^{L-1} \alpha_i(\ell)\, e_r(\ell)\, e^{-j2\pi k \Delta f \tau_\ell} + w_i[k]. \tag{3.3}$$
Accordingly, after correlating with the transmit symbol $x_i(k)$, the received signal at the k-th
subcarrier can be denoted as $\tilde{y}_i[k] = \frac{x_i^*(k)}{|x_i(k)|^2}\, y_i[k]$. Now, stacking the correlated received signal
for all subcarriers into columns, from (3.3), we have
$$Y_i = A D_i B + W_i, \tag{3.4}$$
where $Y_i = [\tilde{y}_i[0], \tilde{y}_i[1], \ldots, \tilde{y}_i[N_c-1]]$ is the $N_r \times N_c$ received signal matrix,
$A = [e_r(0), e_r(1), \ldots, e_r(L-1)]$ is the $N_r \times L$ array steering matrix,
$D_i = \mathrm{diag}\{\alpha_i(0), \alpha_i(1), \ldots, \alpha_i(L-1)\}$ is the $L \times L$
diagonal matrix containing the complex channel gains for all the paths, $W_i$ is the
corresponding $N_r \times N_c$ noise matrix, and $B$ is the $L \times N_c$ delay-manifold matrix containing the path
delays, given by
$$B = \begin{bmatrix}
1 & e^{j\omega_0} & \cdots & e^{j(N_c-1)\omega_0} \\
1 & e^{j\omega_1} & \cdots & e^{j(N_c-1)\omega_1} \\
\vdots & \vdots & \ddots & \vdots \\
1 & e^{j\omega_{L-1}} & \cdots & e^{j(N_c-1)\omega_{L-1}}
\end{bmatrix}, \tag{3.5}$$
where $\omega_\ell = 2\pi (\Delta f) \tau_\ell$ is the temporal frequency corresponding to the delay $\tau_\ell$.
3.2 Parameter Estimation Framework
In this section, we will construct a space-time manifold through vectorization and jointly
estimate the delay and DoAs using ESPRIT-type algorithm.
3.2.1 Joint Angle and Delay Estimation Using Standard ESPRIT
For the joint angle and delay estimation (JADE) algorithm, the first step is to construct the
manifold matrix, which involves all three parameters: delay, elevation angle, and azimuth
angle. Taking the vectorization of the received signal in (3.4) gives
$$y^{(i)}_v = A(\tau, \theta, \phi)\, d^{(i)} + \mathrm{vec}\{W_i\}, \tag{3.6}$$
where $d^{(i)} = \mathrm{diag}\{D_i\} = [\alpha_i(0), \ldots, \alpha_i(L-1)]^T$, and $A(\tau, \theta, \phi) = B^T \odot A$ is the space-time
array manifold matrix, where $B$ and $A$ are the time delay matrix and array manifold matrix,
respectively, with Vandermonde structure, and $\odot$ denotes the Khatri-Rao product, i.e., a
column-wise Kronecker product. Now, collecting $y^{(i)}_v$ for $K$ OFDM symbols, we have
$$Y_v = A(\tau, \theta, \phi)\, S + W_v, \tag{3.7}$$
where $S = [d^{(0)} \;\; d^{(1)} \;\; \ldots \;\; d^{(K-1)}]$ can be regarded as the equivalent transmit signal,
and $W_v = [\mathrm{vec}\{W_0\} \;\; \mathrm{vec}\{W_1\} \;\; \ldots \;\; \mathrm{vec}\{W_{K-1}\}]$ is the corresponding noise
matrix. From (3.7), we can observe that if we can utilize the shift-invariance property of
the highly structured manifold matrix, we can apply ESPRIT-type algorithms in order to
jointly estimate the unknown parameters. Next, we briefly describe the steps for 3D joint
parameter estimation through the standard ESPRIT method.
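The vectorization step behind (3.6) rests on the identity $\mathrm{vec}\{A D B\} = (B^T \odot A)\, d$ for diagonal $D$ with diagonal $d$, which can be checked numerically. The sketch below is an illustration only: the matrix `A` is filled with random unit-modulus entries as a placeholder for a true steering matrix, and the helper name `khatri_rao` is hypothetical.

```python
import numpy as np


def khatri_rao(X, Y):
    """Column-wise Kronecker product of X (p x L) and Y (q x L)."""
    cols = [np.kron(X[:, l], Y[:, l]) for l in range(X.shape[1])]
    return np.stack(cols, axis=1)


rng = np.random.default_rng(0)
Nr, Nc, L = 6, 5, 3
A = np.exp(1j * rng.uniform(-np.pi, np.pi, (Nr, L)))  # placeholder steering matrix
omega = rng.uniform(-np.pi, np.pi, L)                 # temporal frequencies
B = np.exp(1j * np.outer(omega, np.arange(Nc)))       # L x Nc delay manifold, as in (3.5)
d = rng.standard_normal(L) + 1j * rng.standard_normal(L)  # path gains

Y = A @ np.diag(d) @ B              # noise-free received matrix, as in (3.4)
y_v = Y.flatten(order="F")          # vec{.}: stack the columns of Y
A_st = khatri_rao(B.T, A)           # space-time manifold, (Nr * Nc) x L
```

Each column of `A_st` is the Kronecker product of one delay signature with the matching array response, which is the structure the shift-invariance relations exploit.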
In order to estimate $\omega_\ell$, we should take the first and, respectively, the last $M_1 M_2 (N_c - 1)$ rows
of $Y_v$ as two sub-matrices, while for $\theta_\ell$ estimation, we may take its first and, respectively,
last $M_1 - 1$ rows for all $N_c M_2$ blocks. Similarly, for $\phi_\ell$ estimation, we need to select its first
and, respectively, last $M_2 - 1$ rows for all $N_c M_1$ blocks. Hence, we may define the selection
matrices as follows:
$$\begin{aligned}
J^{(1)}_1 &= [I_{N_c-1} \;\; 0] \otimes I_{M_1 M_2}, & J^{(1)}_2 &= [0 \;\; I_{N_c-1}] \otimes I_{M_1 M_2}, \\
J^{(2)}_1 &= I_{M_2 N_c} \otimes [I_{M_1-1} \;\; 0], & J^{(2)}_2 &= I_{M_2 N_c} \otimes [0 \;\; I_{M_1-1}], \\
J^{(3)}_1 &= I_{N_c} \otimes [I_{M_2-1} \;\; 0] \otimes I_{M_1}, & J^{(3)}_2 &= I_{N_c} \otimes [0 \;\; I_{M_2-1}] \otimes I_{M_1},
\end{aligned}$$
where $J^{(r)}_1$ and $J^{(r)}_2$ are the two selection matrices for the r-th parameter mode, with
$r = 1, 2,$ and $3$ for path-delay, elevation, and azimuth angle, respectively. Now, through the
shift-invariance property of the space-time manifold matrix, we have the following shift relations:
$$\begin{aligned}
J^{(1)}_1 A(\tau, \theta, \phi)\, \Omega &= J^{(1)}_2 A(\tau, \theta, \phi), \\
J^{(2)}_1 A(\tau, \theta, \phi)\, \Theta &= J^{(2)}_2 A(\tau, \theta, \phi), \\
J^{(3)}_1 A(\tau, \theta, \phi)\, \Phi &= J^{(3)}_2 A(\tau, \theta, \phi),
\end{aligned} \tag{3.8}$$
where $\Omega = \mathrm{diag}\{e^{j\omega_0}, \ldots, e^{j\omega_{L-1}}\}$, $\Theta = \mathrm{diag}\{e^{ju_0}, \ldots, e^{ju_{L-1}}\}$, and
$\Phi = \mathrm{diag}\{e^{jv_0}, \ldots, e^{jv_{L-1}}\}$
are the corresponding diagonal matrices containing, respectively, the delay, elevation, and
azimuth parameters for each path. Now, we need to perform a subspace decomposition of the
received signal in (3.7) through the singular value decomposition (SVD). Let the signal subspace
of the received signal, $Y_v$, be denoted as $U_s$. It can be observed that the columns of the
space-time manifold matrix, $A(\tau, \theta, \phi)$, also span the same $L$-dimensional signal subspace,
i.e., $\mathrm{Range}\{A(\tau, \theta, \phi)\} = \mathrm{Range}\{U_s\}$. Therefore, there exists a non-singular transformation
matrix, $T$, such that $U_s = A(\tau, \theta, \phi)\, T$. Hence, we can write the shift-invariance equations
in (3.8) in terms of the signal subspace:
$$\begin{aligned}
J^{(1)}_1 U_s \Psi_\tau &= J^{(1)}_2 U_s, \\
J^{(2)}_1 U_s \Psi_\theta &= J^{(2)}_2 U_s, \\
J^{(3)}_1 U_s \Psi_\phi &= J^{(3)}_2 U_s,
\end{aligned} \tag{3.9}$$
where $\Psi_\tau = T^{-1} \Omega T$, $\Psi_\theta = T^{-1} \Theta T$, and $\Psi_\phi = T^{-1} \Phi T$ are the three shift-invariance
operators for the path-delay, elevation, and azimuth angles, respectively. Hence, from (3.9),
we can solve for the shift-invariance operators using the least squares (LS) or total least squares
(TLS) method. Let the eigenvalues of the matrices $\Psi_\tau$, $\Psi_\theta$, and $\Psi_\phi$ be denoted as
$\lambda_{\ell\tau}$, $\lambda_{\ell\theta}$, and $\lambda_{\ell\phi}$ for $\ell = 0, 1, \ldots, L-1$. Hence, the temporal frequency and the elevation and azimuth
spatial frequencies can be computed as $\omega_\ell = \mathrm{angle}(\lambda_{\ell\tau})$, $u_\ell = \mathrm{angle}(\lambda_{\ell\theta})$, and $v_\ell = \mathrm{angle}(\lambda_{\ell\phi})$.
Accordingly, the path-delays and the elevation and azimuth angles can be found by simple
parameter transformation.
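The steps above can be sketched for a small noise-free example. The code below uses the LS solution of the shift-invariance equations and returns the three sets of frequencies unpaired; the array sizes, path parameters, and function names are illustrative assumptions, and no noise, forward-backward averaging, or TLS refinement is included.

```python
import numpy as np


def steering(n, w):
    """Vandermonde vector [1, e^{jw}, ..., e^{j(n-1)w}]."""
    return np.exp(1j * w * np.arange(n))


def jade_esprit(Yv, M1, M2, Nc, L):
    """Return unpaired estimates (omega, u, v) from the stacked data Yv."""
    U, _, _ = np.linalg.svd(Yv, full_matrices=False)
    Us = U[:, :L]                                   # signal subspace

    def first_last(n):
        I = np.eye(n)
        return I[:-1], I[1:]                        # [I 0] and [0 I]

    Fc, Lc = first_last(Nc)
    F1, L1 = first_last(M1)
    F2, L2 = first_last(M2)
    selections = [
        (np.kron(Fc, np.eye(M1 * M2)), np.kron(Lc, np.eye(M1 * M2))),  # delay
        (np.kron(np.eye(M2 * Nc), F1), np.kron(np.eye(M2 * Nc), L1)),  # elevation
        (np.kron(np.kron(np.eye(Nc), F2), np.eye(M1)),
         np.kron(np.kron(np.eye(Nc), L2), np.eye(M1))),                # azimuth
    ]
    estimates = []
    for J1, J2 in selections:
        # LS solution of J1 Us Psi = J2 Us, then phases of the eigenvalues.
        Psi = np.linalg.lstsq(J1 @ Us, J2 @ Us, rcond=None)[0]
        estimates.append(np.angle(np.linalg.eigvals(Psi)))
    return estimates


# Illustrative two-path example (all values are assumptions).
M1, M2, Nc, L, K = 3, 3, 4, 2, 6
omega_true = np.array([0.3, -1.1])
u_true = np.array([0.7, 2.0])
v_true = np.array([-0.5, 1.2])
A_st = np.stack(
    [np.kron(np.kron(steering(Nc, omega_true[l]), steering(M2, v_true[l])),
             steering(M1, u_true[l])) for l in range(L)],
    axis=1,
)
rng = np.random.default_rng(1)
S = rng.standard_normal((L, K)) + 1j * rng.standard_normal((L, K))
omega_est, u_est, v_est = jade_esprit(A_st @ S, M1, M2, Nc, L)
```

Because the three eigenvalue problems are solved independently, the entries of the three returned vectors are generally not in the same path order, so a pairing step is still needed before the paths can be assembled.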
It is to be noted here that an advantage of jointly estimating the angle and delay parameters
is that the method can work even when the number of paths exceeds the number of antennas
($L > M_1 M_2$). In order to be able to estimate all the paths in the underlying channel, in
our proposed formulation, we only need the space-time manifold matrix to be a tall one
($M_1 M_2 N_c > L$), which can easily be satisfied even for a relatively large number of paths.
3.2.2 Parameter Pairing and Channel Gains Estimation
After estimating the path-delays and the elevation and azimuth angles, we need to pair
up the respective path parameters. We can apply simultaneous schur decomposition (SSD)
in order to couple the parameters. However, the computational complexity of the SSD for
large antenna systems and for 3D parameter pairing is very high, and may not be feasible for
practical cellular systems. Alternatively, we can obtain the correct pairing by correlating the
eigenvectors of the shift-invariance operators. Let the matrices containing the eigenvectors
of the shift-invariance operators $\Psi_\tau$, $\Psi_\theta$, and $\Psi_\phi$ be denoted as $Q_\tau$, $Q_\theta$, and
$Q_\phi$, respectively. Since the delay and angle parameters stem from the same signal subspace,
the products $Q_\theta^{-1} Q_\tau$ and $Q_\phi^{-1} Q_\tau$ should be close to permutation matrices. These permutation
matrices, in essence, indicate how the order of the eigenvalues of the matrices Ψθ and Ψφ
are changed with respect to the order of the eigenvalues of Ψτ . Hence, after reordering the
eigenvalues, we can obtain the correct pairing of the estimated parameters.
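A minimal sketch of the eigenvector-correlation pairing is given below for two modes. It assumes noiseless operators that share the same transformation matrix T, so that the product of one inverse eigenvector matrix with the other is a scaled permutation; the function name `pair_eigenvalues` and the test frequencies are hypothetical.

```python
import numpy as np

def pair_eigenvalues(psi_ref, psi_other):
    """Reorder the eigenvalues of psi_other to match the order of psi_ref."""
    lam_ref, q_ref = np.linalg.eig(psi_ref)
    lam_oth, q_oth = np.linalg.eig(psi_other)
    # |Q_other^{-1} Q_ref| is close to a (scaled) permutation matrix.
    c = np.abs(np.linalg.solve(q_oth, q_ref))
    # For each reference eigenvector, pick the best-matching other eigenvector.
    order = np.argmax(c, axis=0)
    return lam_ref, lam_oth[order]

# Illustrative (assumed) operators built from a common transformation T.
rng = np.random.default_rng(1)
L = 4
T = rng.standard_normal((L, L)) + 1j * rng.standard_normal((L, L))
omega = np.array([0.2, -1.1, 2.0, 0.9])   # delay-mode frequencies
u = np.array([1.5, 0.4, -0.7, -2.2])      # elevation-mode frequencies
Tinv = np.linalg.inv(T)
psi_tau = Tinv @ np.diag(np.exp(1j * omega)) @ T
psi_theta = Tinv @ np.diag(np.exp(1j * u)) @ T

lam_tau, lam_theta = pair_eigenvalues(psi_tau, psi_theta)
pairs = sorted(zip(np.angle(lam_tau).round(6), np.angle(lam_theta).round(6)))
```

With noisy estimates, the product is only approximately a permutation, and a one-to-one assignment step may be preferable to the column-wise maximum used here.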
After pairing the delays and angles, we estimate the complex channel gains using the maximum
likelihood (ML) estimator for each OFDM symbol. Let $\hat{A}$ and $\hat{B}$ denote the estimated
array steering matrix and delay manifold matrix, respectively. Now, the ML estimate of the
diagonal channel gain matrix, $D_i$, for the i-th OFDM symbol can be written as
\[
\hat{D}_i = \arg\max_{D_i} \, p(Y_i \mid D_i) = \arg\min_{D_i} \left\| Y_i - \hat{A} D_i \hat{B} \right\|_F^2, \tag{3.10}
\]
where $\|X\|_F$ denotes the Frobenius norm of the matrix X. Now,
\[
\left\| Y_i - \hat{A} D_i \hat{B} \right\|_F^2 = \mathrm{Tr}\left\{ \left( Y_i - \hat{A} D_i \hat{B} \right) \left( Y_i - \hat{A} D_i \hat{B} \right)^H \right\}. \tag{3.11}
\]
Taking the derivative of (3.11) with respect to $D_i$, and setting the derivative equal to zero,
we obtain
\[
\hat{A}^T Y_i^{*} \hat{B}^T = \hat{A}^T \hat{A}^{*} D_i \hat{B}^{*} \hat{B}^T. \tag{3.12}
\]
Accordingly, the ML estimate of the complex channel gain matrix can be calculated as
\[
\hat{D}_i = \left( \hat{A}^H \hat{A} \right)^{-1} \hat{A}^T Y_i^{*} \hat{B}^T \left( \hat{B} \hat{B}^H \right)^{-1}. \tag{3.13}
\]
The uplink channel then can be reconstructed by utilizing all the estimated parameters.
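As a numerical illustration, the Frobenius-norm objective in (3.10) can be minimized with pseudo-inverses. The sketch below assumes a tall, full-column-rank steering matrix and a wide, full-row-rank delay manifold matrix, solves for an unconstrained gain matrix, and then keeps its diagonal; this is a simplification, and the function name `estimate_gains`, the dimensions, and the frequencies are hypothetical.

```python
import numpy as np

def estimate_gains(Y, A_hat, B_hat):
    """Least-squares fit of D in Y ~ A_hat @ D @ B_hat; returns diag(D)."""
    # pinv(A_hat) is a left inverse of A_hat, pinv(B_hat) a right inverse of B_hat.
    D = np.linalg.pinv(A_hat) @ Y @ np.linalg.pinv(B_hat)
    return np.diag(D)

# Illustrative (assumed) noiseless example with 3 paths.
rng = np.random.default_rng(2)
M, Nc, L = 16, 32, 3
A = np.exp(1j * np.outer(np.arange(M), [0.3, 1.1, -0.8]))     # M x L steering matrix
B = np.exp(-1j * np.outer([0.2, 0.9, 1.7], np.arange(Nc)))    # L x Nc delay manifold
d_true = rng.standard_normal(L) + 1j * rng.standard_normal(L)
Y = A @ np.diag(d_true) @ B
d_est = estimate_gains(Y, A, B)
```

In the noiseless case the true gains are recovered exactly; with noise and imperfect angle and delay estimates, the off-diagonal entries of the fitted matrix are generally non-zero and are discarded by this sketch.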
3.3 RMSE Characterization of the Joint Angle-Delay
Estimation
In this section, we present the theoretical analysis of the root mean square error (RMSE) of
joint angle-delay estimation for massive MIMO OFDM systems. For notational simplicity, we
denote $\mu_\ell^{(1)} = \omega_\ell$, $\mu_\ell^{(2)} = u_\ell$, and $\mu_\ell^{(3)} = v_\ell$. Let us also denote the estimated temporal and
spatial frequencies as $\hat{\mu}_\ell^{(1)} = \hat{\omega}_\ell$, $\hat{\mu}_\ell^{(2)} = \hat{u}_\ell$, and $\hat{\mu}_\ell^{(3)} = \hat{v}_\ell$, respectively. Define the estimation
error as $\Delta\mu_\ell^{(r)} = \mu_\ell^{(r)} - \hat{\mu}_\ell^{(r)}$, for r = 1, 2, and 3. Now, subspace decomposition of the received
signal in (3.7) can be performed through SVD. Let $U_s$ and $V_s$ denote, respectively, the matrices of left
and right singular vectors corresponding to the signal subspace, and let $\Sigma_s$ denote the diagonal
matrix containing the corresponding singular values. The first-order approximation of the
mean squared error of the $\ell$-th path in the r-th mode is given by [20]:
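\[
E\left\{ \left( \Delta\mu_\ell^{(r)} \right)^2 \right\} = \frac{1}{2} \left( r_\ell^{(r)H} \cdot W_{mat}^{*} \cdot R_{nn}^{T} \cdot W_{mat}^{T} \cdot r_\ell^{(r)} - \mathrm{Re}\left\{ r_\ell^{(r)T} \cdot W_{mat} \cdot C_{nn} \cdot W_{mat}^{T} \cdot r_\ell^{(r)} \right\} \right), \quad r \in \{1, 2, 3\}. \tag{3.14}
\]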
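The vector $r_\ell^{(r)}$ and the matrix $W_{mat}$ are given by
\[
r_\ell^{(r)} = q_\ell \otimes \left( \left[ \left( J_1^{(r)} U_s \right)^{+} \left( J_2^{(r)} / e^{j \mu_\ell^{(r)}} - J_1^{(r)} \right) \right]^{T} p_\ell \right), \tag{3.15}
\]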
\[
W_{mat} = \left( \Sigma_s^{-1} V_s^{T} \right) \otimes \left( U_n U_n^{H} \right), \tag{3.16}
\]
where $q_\ell$ is the $\ell$-th column of the transformation matrix T, $p_\ell$ is the $\ell$-th row of the matrix $T^{-1}$,
and $R_{nn}$ and $C_{nn}$ are the noise covariance and complementary covariance matrices, respectively.
Now, let $\bar{a}_{\tau\theta\phi}(\ell)$ denote the normalized space-time steering vector corresponding to the $\ell$-th
path of the channel, i.e., $\bar{a}_{\tau\theta\phi}(\ell) = \frac{1}{\sqrt{M_1 M_2 N_c}} a_{\tau\theta\phi}(\ell)$, where $a_{\tau\theta\phi}(\ell)$ is the $\ell$-th column of
the space-time manifold matrix, A(τ, θ, φ). Accordingly, for massive MIMO systems, we
have the following lemma [61]:
Lemma 3.1. If the elevation and azimuth angles are both drawn independently from any
continuous distribution, the normalized space-time steering vectors are orthogonal, that is,
$\bar{a}_{\tau\theta\phi}(i) \perp \mathrm{span}\{ \bar{a}_{\tau\theta\phi}(j) \mid \forall i \neq j \}$ when $M_1 M_2$ is large and the number of paths $L = o(M_1 M_2)$.
It is apparent that (3.14) relies on the singular value decomposition of the noiseless received
signal, which is difficult to obtain at the base station. In fact, it is very challenging to
simplify such a complicated result in the multipath scenario. Fortunately, for massive MIMO
systems, the result can be significantly simplified due to the orthogonality of the steering
vectors. Specifically, using standard ESPRIT, for the massive MIMO OFDM systems, we
have the simplified RMSE of the 3D parameter estimation as follows:
Theorem 3.2. For parameter estimation based on a uniform planar array of M1 × M2
elements, the root mean square errors of estimation of the delay, and elevation and azimuth
angles for 3D massive MIMO OFDM system are given by:
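\[
\mathrm{RMSE}_{\tau_\ell} = \frac{\sigma}{2\pi (\Delta f)(N_c - 1)} \sqrt{\frac{R_{ss}^{-1}(\ell, \ell)}{K M_1 M_2}}, \tag{3.17}
\]
\[
\mathrm{RMSE}_{\theta_\ell} = \frac{\sigma}{\pi \sin(\theta_\ell)(M_1 - 1)} \sqrt{\frac{R_{ss}^{-1}(\ell, \ell)}{K M_2 N_c}}, \tag{3.18}
\]
\[
\mathrm{RMSE}_{\phi_\ell} = \frac{\sigma}{\pi \sin(\theta_\ell)} \sqrt{\frac{R_{ss}^{-1}(\ell, \ell)}{K N_c}} \times \sqrt{ \frac{\cot^2(\theta_\ell) \cot^2(\phi_\ell)}{(M_1 - 1)^2 M_2} + \frac{1}{\sin^2(\phi_\ell)(M_2 - 1)^2 M_1} }, \tag{3.19}
\]
where $R_{ss}$ is the covariance matrix of the equivalent transmit signal, S, and $R_{ss}(\ell, \ell)$ denotes
its $\ell$-th diagonal element, K is the number of OFDM symbols, $\Delta f$ is the subcarrier spacing, and
$\sigma^2$ is the noise variance.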
Proof. See Appendix A.4.
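To make the scaling in Theorem 3.2 concrete, the closed-form expressions (3.17)-(3.19) can be evaluated directly. The sketch below is only an illustration: the function name `rmse_joint`, the normalized subcarrier spacing, the unit value assumed for the diagonal element of the inverse covariance, and the chosen angles are assumptions, not values from the dissertation.

```python
import numpy as np

def rmse_joint(sigma, rss_inv_ll, K, M1, M2, Nc, delta_f, theta, phi):
    """Evaluate (3.17)-(3.19); theta and phi are given in radians."""
    rmse_tau = (sigma / (2 * np.pi * delta_f * (Nc - 1))
                * np.sqrt(rss_inv_ll / (K * M1 * M2)))
    rmse_theta = (sigma / (np.pi * np.sin(theta) * (M1 - 1))
                  * np.sqrt(rss_inv_ll / (K * M2 * Nc)))
    # Geometry-dependent factor of the azimuth expression (3.19).
    geom = ((1 / np.tan(theta) ** 2) * (1 / np.tan(phi) ** 2) / ((M1 - 1) ** 2 * M2)
            + 1 / (np.sin(phi) ** 2 * (M2 - 1) ** 2 * M1))
    rmse_phi = (sigma / (np.pi * np.sin(theta))
                * np.sqrt(rss_inv_ll / (K * Nc)) * np.sqrt(geom))
    return rmse_tau, rmse_theta, rmse_phi

# Assumed example: SNR = 10 dB with unit signal power, K = 50 symbols, Nc = 64.
sigma = np.sqrt(10 ** (-10 / 10))
angles = (np.deg2rad(75), np.deg2rad(60))
r4 = rmse_joint(sigma, 1.0, 50, 4, 4, 64, 1.0, *angles)   # 4 x 4 array
r8 = rmse_joint(sigma, 1.0, 50, 8, 8, 64, 1.0, *angles)   # 8 x 8 array
```

Under these assumptions, all three expressions scale linearly with the noise standard deviation and decrease when the array grows from 4 × 4 to 8 × 8.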
3.4 Simulation Results
In this section, we evaluate the RMSE of delay and angle estimation for the 3D massive
MIMO OFDM system, and verify the accuracy of our analytical results through extensive
simulations. To evaluate the performance of the DoA estimation, we assume there are 4
resolvable paths, which is a typical number for outdoor millimeter-wave communication
systems at both 28 GHz and 73 GHz [62]. The number of subcarriers of the OFDM system is
64, and the antenna spacing for both the receive and transmit antennas is assumed to be
0.5λ. The elevation and azimuth DoAs (in degrees) are chosen randomly from the uniform distributions
U[65, 90] and U[0, 180], respectively. In our work, we invoke the far-field assumption,
and the wavefront impinging on the antenna array is assumed to be planar. The number
of OFDM symbols is taken to be 50. All the path gains are normalized, i.e.,
$\sum_{\ell=0}^{L-1} |\alpha_i(\ell)|^2 = 1, \ \forall i = 0, \ldots, K-1$,
where K is the number of OFDM symbols, and L is the number of
resolvable paths in the channel. Finally, the total available transmit power is assumed to be
unity, and the SNR is defined as the ratio of the received signal power to the noise power, i.e.,
SNR = 10 log10(1/σ²). The performance of delay, elevation angle, and azimuth angle estimation
for different antenna arrays is shown in Figures 3.1, 3.2, and 3.3, respectively, where the
analytical RMSE results are compared with the simulation results. We can observe that as
the number of antennas increases, the estimation performance also improves. Moreover, in all
cases, our analytical results match the simulation results asymptotically. At low SNR,
however, the gap between the analytical and simulation results is larger. This is because
our analytical results are based on a first-order perturbation expansion [63], which yields
an accurate linear approximation of the perturbed subspace mainly in the high-SNR regime
(low perturbation). Therefore, at moderate to high SNR, the first-order
perturbation expansion becomes accurate, and the empirical and analytical results
converge.
In Figure 3.4, we compare the performance of minimum mean squared error (MMSE)-based
channel estimation with parametric channel estimation. The channel estimation quality is
measured by the correlation between the underlying and the estimated channel. Higher
correlation is expected to result in better system-level performance. We can observe that
in the low and medium SNR regimes, parametric channel estimation yields significantly better
performance than MMSE-based channel estimation, and as the SNR increases both approaches
reach a correlation value of 1.
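The text does not spell out how the correlation between the true and estimated channels is computed. One common choice, assumed here purely for illustration, is the magnitude of the normalized inner product between the two channel matrices; the function name `channel_correlation` and the noise levels are hypothetical.

```python
import numpy as np

def channel_correlation(H_true, H_est):
    """Normalized correlation |<H_true, H_est>| / (||H_true|| * ||H_est||)."""
    # np.vdot flattens both arrays and conjugates the first argument.
    num = np.abs(np.vdot(H_true, H_est))
    return num / (np.linalg.norm(H_true) * np.linalg.norm(H_est))

# Assumed example: a random 16 x 4 channel and two noisy estimates of it.
rng = np.random.default_rng(3)
H = rng.standard_normal((16, 4)) + 1j * rng.standard_normal((16, 4))
noise = rng.standard_normal((16, 4)) + 1j * rng.standard_normal((16, 4))
corr_good = channel_correlation(H, H + 0.1 * noise)   # accurate estimate
corr_poor = channel_correlation(H, H + 5.0 * noise)   # poor estimate
```

This metric lies between 0 and 1 and is invariant to a common complex scaling of the estimate, so it measures how well the channel direction, rather than its magnitude, is recovered.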
[Plot: RMSE of delay estimation (in symbol durations) versus SNR (dB); simulation and analytical results for 8 × 8 and 4 × 4 arrays.]
Figure 3.1: Performance of Delay Estimation.
[Plot: RMSE of elevation angle estimation (in degrees) versus SNR (dB); simulation and analytical results for 8 × 8 and 4 × 4 arrays.]
Figure 3.2: Elevation Angle Estimation Performance.
[Plot: RMSE of azimuth angle estimation (in degrees) versus SNR (dB); simulation and analytical results for 8 × 8 and 4 × 4 arrays.]
Figure 3.3: Azimuth Angle Estimation Performance.
[Plot: correlation value versus SNR (dB).]
Figure 3.4: Correlation Between Underlying True Channel and Estimated Channel.
Chapter 4
Superimposed Pilot for Massive
FD-MIMO Systems
4.1 Motivation and Literature Review
Massive-MIMO, also known as large-scale MIMO, is regarded as one of the key enabling tech-
nologies for 5G cellular networks. First introduced in [1], massive MIMO has created new
research trends not only in academia but also in industry [2]. Placing a large antenna array
at the base station (BS), 3-dimensional (3D) massive MIMO/FD-MIMO systems promise
significant gains in spectral efficiency by coherently communicating with tens of mobile
stations (MSs) through aggressive spatial multiplexing. Constrained by the BS form-factor
limitation, FD-MIMO systems employ active antenna elements placed in a 2D antenna
array, and hence exploit the degrees of freedom in both elevation and azimuth domains [2].
On the other hand, due to the large available spectrum, communication at millimeter wave
(mmWave) frequencies is considered another key enabler for 5G and Beyond 5G systems.
To overcome the high path loss of mmWave channels, it is extremely important to deploy
appropriate beamforming strategies. Equipped with a large antenna array at the BS, mas-
sive FD-MIMO can form very narrow 3D beams and hence can compensate for the mmWave
path loss by focusing more energy in the desired direction. In this way, massive FD-MIMO
comes as a natural partner for mmWave systems.
To harness the benefits of massive FD-MIMO, it is critical for the BSs to have accurate
downlink (DL) channel state information (CSI). For time division duplex (TDD) systems,
it is possible to obtain the DL CSI directly from the estimated uplink (UL) channel using
channel reciprocity. For mmWave channels, it has been shown in our previous work [6, 7]
that the directions of arrival (DoAs) estimated in the UL can be directly linked to the MIMO
precoding in DL to completely avoid DL CSI feedback. Our recent work in [64, 65] extended
the analysis to the case where MSs are assigned non-orthogonal pilot sequences due to the
large number of co-scheduled MSs in massive FD-MIMO networks. Note that the strategy
of conducting DL MIMO precoding based on UL DoA has also been introduced to frequency
division duplex systems showing tremendous performance gains for practical networks [66].
In modern cellular networks, dedicated time/frequency resources are assigned to UL pilot
sequences. This approach simplifies channel estimation procedures while introducing pilot
overhead and corresponding rate-loss for the UL. Meanwhile, this UL rate-loss can be poten-
tially alleviated by superimposing pilot symbols with data symbols in the UL [67, 68, 69, 70].
Note that there is a clear trade-off between UL pilot overhead and DL throughput. A higher
UL pilot overhead will reduce UL throughput while increasing DL throughput by improving
channel estimation. Therefore, to provide a systematic view of the network throughput, in
this paper, we use the overall network achievable rate, $R_{overall} = \kappa_{UL} R_{UL} + \kappa_{DL} R_{DL}$, as the
performance metric. Here, $R_{UL}$ and $R_{DL}$ are the UL and DL achievable rates, respectively, and
$\kappa_{UL}$ and $\kappa_{DL}$ are the respective weights for UL and DL. Note that the UL achievable rate
has two components: 1) the UL achievable rate during the channel estimation phase due to
superimposed pilot and data, and 2) UL achievable rate during the data-only transmission
phase.
The detailed contributions of this paper can be summarized as the following:
• In this work, we present a novel superimposed pilot framework for mmWave 3D massive
MIMO/FD-MIMO systems [71]. Most of the works in the superimposed pilot litera-
ture, including [72, 73? ] consider Rayleigh fading channels for system performance
analysis. While this assumption often makes the analysis simpler, it is only applica-
ble for rich scattering channels. On the other hand, higher frequency channels such
as millimeter wave (mmWave) channels only have a few resolvable paths, and hence,
Rayleigh fading models do not accurately portray the characteristics of mmWave channels.
In the analysis of our superimposed pilot system, we adopt a parametric channel
modeling approach where the channel is represented in terms of some parameters (such
as DoA, DoD, and channel gains) corresponding to each resolvable path. This para-
metric channel modeling approach is more appropriate for mmWave channels [6, 7, 47].
All derivations and analysis carried out in this paper are based on parametric channel
modeling, and the results provide new perspectives on the design of superimposed pilot
systems for the mmWave channels.
• We present a UL DoA estimation method for mmWave FD-MIMO OFDM systems
under superimposed pilots. The majority of works in the literature on superimposed pilot
systems adopt least squares (LS) or linear minimum mean squared error (LMMSE)-based
channel estimation methods [68, 72, 74]. In this work, based on parametric channel
modeling, we estimate channel parameters namely directions of arrival (DoA) of the
resolvable multipath components instead of estimating the channel transfer function,
and demonstrate how the estimated DoA in the uplink can be utilized in both uplink
and downlink processing. The introduced DoA based strategy is especially suitable
for superimposed pilots to reduce UL overhead. To reflect practical conditions, non-orthogonal
scrambling/spreading sequences are assumed in the UL.
• We analytically characterize the root mean squared error (RMSE) of uplink angle es-
timation for superimposed pilot-based 3D massive MIMO systems. The performance
of angle estimation is also connected to important physical parameters namely num-
ber of antennas at the BS, array geometry at the receiver, channel gains, correlation
coefficients among MSs’ scrambling/spreading sequences, and the power split between
the superimposed pilot and data in the UL. Note also that, compared
to our prior works [6, 7], [64], characterizing the DoA estimation performance is
much more involved and non-trivial for superimposed pilot systems.
• We characterize the overall network throughput for the superimposed pilot-based mas-
sive FD-MIMO system. To be specific, we consider jointly the DL achievable rate,
where DL precoders are designed based on UL estimated DoAs, and the UL achievable
rates in both channel estimation phase and data-only transmission phase. Impacts
of imperfect DoA estimation on the uplink rate and overall rate are also investigated. In
contrast to the majority of works in the literature, where uplink transmission is assumed
to consist only of a superimposed pilot and data transmission phase, in this work
we consider a practical and rather general framework for superimposed pilot systems,
where uplink transmission occurs in two phases: a first phase in which both pilots and
data symbols are transmitted, and a second phase in which only data symbols are
transmitted. This general framework provides the system designer with additional control
parameters to tune for performance optimization.
• Most of the works in the superimposed pilot literature use matched filtering (MF) or maximum
ratio combining (MRC) for uplink processing, which are based on complete knowledge
of channel transfer functions. On the other hand, our proposed approach in this work
uses only partial channel information for uplink symbol detection for superimposed
pilot systems. To be specific, we utilize DoAs for uplink processing, and demonstrate
how DoA estimation error affects the uplink rate as well as overall system rates. This
uplink processing strategy with only partial channel information incurs significantly
less computational complexity compared to traditional MF or MRC-based symbol de-
tection.
• Finally, we validate the analytical results through comprehensive simulations, and
identify important system-level design intuitions. We provide design insights which
are novel compared to results in the existing literature for superimposed pilot systems. In
fact, because we use a generalized framework, we are able to identify the optimal
strategies under different conditions.
4.2 System and Channel Model
We consider a 3D massive MIMO OFDM network consisting of G BSs. Each BS, equipped with $N_r$
antennas, serves J users, where each user has $N_t$ antennas. In the uplink, the time-domain transmit
signal from each mobile station, after up-conversion, is transmitted through a frequency-
selective fading channel, which remains time-invariant for one OFDM symbol duration. In
our feedback-free system, each MS superimposes a scrambling/spreading sequence on top of
its UL data. Assume that we collect the received signal for Q symbol periods. Accordingly,
in the UL at the i-th BS, the $N_r \times Q$ received signal at the k-th subcarrier can be expressed as
\[
Z_i(k) = \sum_{g=0}^{G-1} \sum_{j=0}^{J-1} \sqrt{\Lambda_{jg,i}} \, H_{jg,i}(k) \left( X_{jg}(k) + S_{jg}(k) \right) + W_i(k), \tag{4.1}
\]
where $H_{jg,i}(k)$ denotes the $N_r \times N_t$ channel transfer function at the k-th subcarrier for the
channel between the i-th BS and the j-th user in the g-th cell; $\Lambda_{jg,i}$ represents the corresponding
large-scale fading parameter, and is independent of the subcarrier index; $X_{jg}(k)$ denotes the
$N_t \times Q$ frequency-domain UL data matrix from the j-th MS in the g-th cell at the k-th
subcarrier; $S_{jg}(k)$ is the corresponding scrambling/spreading matrix; and $W_i(k)$ is the
$N_r \times Q$ noise element. Note that each scrambling/spreading sequence is
of length Q and each MS utilizes $N_t$ orthogonal sequences for uplink data transmission.
Now, the channel, $H_{jg,i}(k)$, can be expressed as
\[
H_{jg,i}(k) = \sum_{\ell=0}^{L_{jg,i}-1} C_{jg,i}(\ell) \, e^{-j 2\pi k \ell / N_c}, \tag{4.2}
\]
where $C_{jg,i}(\ell)$ denotes the $N_r \times N_t$ channel impulse response (CIR) corresponding to the
$\ell$-th tap for the channel between the j-th user in the g-th cell and the i-th BS. The CIR is assumed to
have a finite number ($L_{jg,i}$) of non-zero taps; $N_c$ is the number of subcarriers.
Using parametric channel modeling for mmWave frequencies, the CIR, $C_{jg,i}(\ell)$, can be
expressed as [6, 7, 47]
\[
C_{jg,i}(\ell) = \sum_{p=0}^{P_{jg,i,\ell}-1} \alpha_{jg,i}(\ell, p) \, e_{r,jg,i}(\ell, p) \, e_{t,jg,i}^{H}(\ell, p), \tag{4.3}
\]
where $\alpha_{jg,i}(\ell, p)$, $e_{r,jg,i}(\ell, p)$, and $e_{t,jg,i}(\ell, p)$ denote, respectively, the complex path gain, the $N_r \times 1$
receive antenna array response, and the $N_t \times 1$ transmit antenna array response corresponding
to the p-th sub-path in the $\ell$-th cluster; the Hermitian operation is denoted as $(\cdot)^H$.
We assume the antennas at each MS are arranged in a uniform linear array (ULA). Accordingly,
we can express the uplink array response in a Vandermonde structure:
$e_{t,jg,i}(\ell, p) = \left[ 1 \ e^{j\omega_{jg,i,\ell,p}} \ \ldots \ e^{j(N_t-1)\omega_{jg,i,\ell,p}} \right]^T$, where $\omega_{jg,i,\ell,p} = (2\pi\Delta_t/\lambda)\cos\Omega_{jg,i,\ell,p}$ denotes the transmit
spatial frequency, λ denotes the carrier wavelength, $\Omega_{jg,i,\ell,p}$ represents the direction of departure
(DoD), and $\Delta_t$ denotes the distance between adjacent transmit antenna elements. Each
BS is equipped with a 2-dimensional (2D) antenna array placed in the X-Z plane, and the numbers
of antennas in the vertical and horizontal directions are represented by $M_1$ and $M_2$, respectively.
Hence, the total number of receive antennas at each BS is $N_r = M_1 M_2$. The receive-side
array response vector, $e_{r,jg,i}(\ell, p)$, can accordingly be written as $e_{r,jg,i}(\ell, p) = a(v_{jg,i,\ell,p}) \otimes a(u_{jg,i,\ell,p})$,
where $a(u_{jg,i,\ell,p}) = \left[ 1 \ e^{j u_{jg,i,\ell,p}} \ \ldots \ e^{j(M_1-1) u_{jg,i,\ell,p}} \right]^T$ and
$a(v_{jg,i,\ell,p}) = \left[ 1 \ e^{j v_{jg,i,\ell,p}} \ \ldots \ e^{j(M_2-1) v_{jg,i,\ell,p}} \right]^T$ represent the receive array responses in the elevation and azimuth
domains, and $\otimes$ denotes the Kronecker product; the elevation and azimuth spatial frequencies are
represented by $u_{jg,i,\ell,p} = \frac{2\pi\Delta_r}{\lambda}\cos\theta_{jg,i,\ell,p}$ and $v_{jg,i,\ell,p} = \frac{2\pi\Delta_r}{\lambda}\sin\theta_{jg,i,\ell,p}\cos\phi_{jg,i,\ell,p}$, respectively,
where $\theta_{jg,i,\ell,p}$ and $\phi_{jg,i,\ell,p}$ denote the corresponding elevation and azimuth angles, respectively.
Finally, $\Delta_r$ denotes the spacing between two adjacent receive antenna elements.
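The channel model in (4.2)-(4.3) and the array responses above can be assembled numerically as in the following sketch. It assumes one path per tap (so the tap index equals the path index), half-wavelength spacing expressed as a fraction of the wavelength, and hypothetical function names and angle values; it is meant only to show how the Kronecker structure and the per-subcarrier phase fit together.

```python
import numpy as np

def steer(n, freq):
    """Vandermonde vector [1, e^{j f}, ..., e^{j (n-1) f}]."""
    return np.exp(1j * freq * np.arange(n))

def upa_response(M1, M2, theta, phi, spacing=0.5):
    """Receive response a(v) kron a(u); spacing is the element spacing over the wavelength."""
    u = 2 * np.pi * spacing * np.cos(theta)                # elevation spatial frequency
    v = 2 * np.pi * spacing * np.sin(theta) * np.cos(phi)  # azimuth spatial frequency
    return np.kron(steer(M2, v), steer(M1, u))

def ula_response(Nt, dod, spacing=0.5):
    """Transmit response of a uniform linear array for a given DoD (radians)."""
    return steer(Nt, 2 * np.pi * spacing * np.cos(dod))

def channel_transfer(k, Nc, M1, M2, Nt, gains, thetas, phis, dods):
    """H(k) from (4.2)-(4.3), assuming a single path per tap."""
    H = np.zeros((M1 * M2, Nt), dtype=complex)
    for ell, (a, th, ph, om) in enumerate(zip(gains, thetas, phis, dods)):
        e_r = upa_response(M1, M2, th, ph)
        e_t = ula_response(Nt, om)
        # Rank-one path contribution with the subcarrier-dependent delay phase.
        H += a * np.outer(e_r, e_t.conj()) * np.exp(-2j * np.pi * k * ell / Nc)
    return H

# Assumed example: two paths, a 4 x 4 receive array, 4 transmit antennas, subcarrier 3 of 64.
H_demo = channel_transfer(3, 64, 4, 4, 4,
                          gains=[1.0, 0.5j], thetas=[1.2, 1.4],
                          phis=[0.5, 2.0], dods=[0.3, 1.0])
```

The resulting matrix has rank equal to the number of distinct paths, which is the low-rank structure that the parametric estimation approach relies on.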
4.3 Uplink Channel Estimation and Performance Characterization
4.3.1 Uplink DoA Estimation using Unitary ESPRIT
In this work, we adopt a parametric approach for channel estimation. In traditional
channel estimation methods, for example, least squares (LS) or linear minimum mean
squared error (LMMSE)-based techniques, the channel transfer function is estimated explicitly.
Hence, for this approach, as the dimension of the channel matrix increases, the estimation
overhead also increases accordingly. For massive MIMO systems, because of the large number
of antennas at the BS, the number of channel coefficients that need to be estimated using
traditional methods grows in proportion to the number of antennas. On the other hand, in the
parametric channel estimation approach, the number of parameters that need to be estimated does not
grow with the number of antennas at either the transmitter or the receiver, and therefore, the estimation
overhead is independent of the channel dimension. Moreover, it has been shown that the
parametric channel estimation approach outperforms channel transfer function estimation based
approaches (LS/LMMSE) for large antenna systems [6, 7, 60]. This makes
parametric channel estimation an attractive solution for massive MIMO or full-dimensional MIMO
(FD-MIMO) systems, especially for mmWave channels, where the number of paths in the channel
is quite limited compared to sub-6 GHz channels.
Let the n-th user in the i-th cell be the target user, which communicates with the i-th BS.
Denote the correlation between the scrambling sequences from different users as ρ1. At the i-th
BS, after correlating with the scrambling sequence for the target user, we have
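\[
Z_i(k) S_{ni}^{H}(k) = \sum_{g=0}^{G-1} \sum_{j=0}^{J-1} \sqrt{\Lambda_{jg,i}} \, H_{jg,i}(k) \left( X_{jg}(k) + S_{jg}(k) \right) S_{ni}^{H}(k) + W'_i(k), \tag{4.4}
\]
where $W'_i(k) = W_i(k) S_{ni}^{H}(k)$ denotes the equivalent noise element.
Now, (4.4) can be re-written as
\[
\begin{aligned}
Z_i(k) S_{ni}^{H}(k) ={}& \sqrt{\Lambda_{ni,i}} \, H_{ni,i}(k) \left( X_{ni}(k) S_{ni}^{H}(k) + \gamma I \right)
+ \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \sqrt{\Lambda_{ng,i}} \, H_{ng,i}(k) \left( X_{ng}(k) S_{ni}^{H}(k) + \gamma I \right) \\
&+ \sum_{\substack{j=0 \\ j \neq n}}^{J-1} \sqrt{\Lambda_{ji,i}} \, H_{ji,i}(k) \left( X_{ji}(k) S_{ni}^{H}(k) + \rho_1 \gamma 1_{N_t} \right) \\
&+ \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \sum_{\substack{j=0 \\ j \neq n}}^{J-1} \sqrt{\Lambda_{jg,i}} \, H_{jg,i}(k) \left( X_{jg}(k) S_{ni}^{H}(k) + \rho_1 \gamma 1_{N_t} \right) + W'_i(k),
\end{aligned}
\tag{4.5}
\]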
where $1_{N_t}$ denotes an $N_t \times N_t$ matrix with each element being unity, and γ is the portion of
power allocated to the pilot symbols; hence, the power allocated to the data symbols is (1 − γ).
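Now, (4.5) can be expressed as
\[
\hat{H}_{ni,i}(k) = Z_i(k) S_{ni}^{H}(k) = \sqrt{\Lambda_{ni,i}} \, H_{ni,i}(k) \left( X_{ni}(k) S_{ni}^{H}(k) + \gamma I \right) + W''_i(k), \tag{4.6}
\]
where $W''_i(k)$ denotes the equivalent noise-plus-interference matrix. Using (4.2) and (4.3),
$H_{ni,i}(k)$ can now be written as
\[
H_{ni,i}(k) = \sum_{\ell=0}^{L_{ni,i}-1} \sum_{p=0}^{P_{ni,i,\ell}-1} \alpha_{ni,i}(\ell, p) \, e_{r,ni,i}(\ell, p) \, e_{t,ni,i,k}^{H}(\ell, p), \tag{4.7}
\]
where $e_{t,ni,i,k}(\ell, p) = e_{t,ni,i}(\ell, p) \, e^{-j 2\pi k \ell / N_c}$.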
In this work, we consider millimeter wave channels, and assume that each cluster contributes
one propagation path [6, 50, 51, 52]. Hence, for notational convenience and clarity of
exposition, we drop the subpath index. Accordingly, (4.7) becomes
\[
H_{ni,i}(k) = A_{ni,i} D_{ni,i} B_{ni,i}^{H}(k), \tag{4.8}
\]
where $A_{ni,i} = \left[ e_{r,ni,i}(0) \ \ldots \ e_{r,ni,i}(L_{ni,i}-1) \right]$ is the receiver-side array steering matrix,
$D_{ni,i} = \mathrm{diag}\left[ \alpha_{ni,i}(0) \ \ldots \ \alpha_{ni,i}(L_{ni,i}-1) \right]$ is the diagonal matrix with the diagonal elements
being the complex path gains, and $B_{ni,i}(k) = \left[ e_{t,ni,i,k}(0) \ \ldots \ e_{t,ni,i,k}(L_{ni,i}-1) \right]$ denotes the
transmitter-side array steering matrix. Accordingly, the channel, $\hat{H}_{ni,i}(k)$, from (4.6), can
be expressed as
\[
\hat{H}_{ni,i}(k) = \sqrt{\Lambda_{ni,i}} \, A_{ni,i} D_{ni,i} B_{ni,i}^{H}(k) \left( X_{ni}(k) S_{ni}^{H}(k) + \gamma I \right) + W''_i(k). \tag{4.9}
\]
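Now, the noisy channel matrix in (4.9) can concisely be expressed as
\[
\hat{H}_{ni,i}(k) = A_{ni,i} S_{ni,i}(k) + W''_i(k), \tag{4.10}
\]
where $S_{ni,i}(k) = \sqrt{\Lambda_{ni,i}} \, D_{ni,i} B_{ni,i}^{H}(k) \left( X_{ni}(k) S_{ni}^{H}(k) + \gamma I \right)$. The forward-backward averaged
received signal can be expressed as:
\[
\begin{aligned}
\hat{H}_{ni,i}^{fba}(k) &= \left[ \hat{H}_{ni,i}(k) \quad \Pi_{N_r} \hat{H}_{ni,i}^{*}(k) \Pi_{N_t} \right] \\
&= \left[ A_{ni,i} S_{ni,i}(k) \quad \Pi_{N_r} A_{ni,i}^{*} S_{ni,i}^{*}(k) \Pi_{N_t} \right] + \left[ W''_i(k) \quad \Pi_{N_r} {W''_i}^{*}(k) \Pi_{N_t} \right],
\end{aligned}
\tag{4.11}
\]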
where A∗ denotes complex conjugate of A, and Πp represents the p× p exchange matrix.
We can therefore apply Unitary ESPRIT on (4.11) for DoA estimation [6].
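The forward-backward extension in (4.11) amounts to appending a conjugated, row- and column-reversed copy of the data matrix. A minimal sketch is shown below; the function name `forward_backward` and the random test matrix are hypothetical, and the array reversal is used as a shortcut for multiplication by the exchange matrices.

```python
import numpy as np

def forward_backward(H):
    """Return [H, Pi_r conj(H) Pi_t] as in (4.11)."""
    # Left/right multiplication by exchange matrices reverses the row/column order.
    H_back = np.conj(H)[::-1, ::-1]
    return np.hstack([H, H_back])

# Assumed example: a random 6 x 3 noisy channel matrix.
rng = np.random.default_rng(4)
H = rng.standard_normal((6, 3)) + 1j * rng.standard_normal((6, 3))
Z = forward_backward(H)
```

The extended matrix is centro-Hermitian, i.e., it equals its own conjugate after reversing rows and columns, which is the property that allows Unitary ESPRIT to work with real-valued computations.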
4.3.2 RMSE Characterization
Let $\hat{v}_{ni,i,\ell}$ and $\hat{u}_{ni,i,\ell}$ denote the estimated azimuth and elevation spatial frequencies for the
$\ell$-th tap, respectively. The corresponding estimation errors are given by $\Delta v_{ni,i,\ell} = v_{ni,i,\ell} - \hat{v}_{ni,i,\ell}$
and $\Delta u_{ni,i,\ell} = u_{ni,i,\ell} - \hat{u}_{ni,i,\ell}$. Using Lemma 2 of [64], the total MSE can be written as
$E\left\{ (\Delta v_{ni,i,\ell})^2 \right\} = \sum_{m=1}^{4} E\left\{ (\Delta v_{ni,i,\ell})^2_m \right\}$, where m = 1, 2, 3, 4 correspond to the MSE due to
pilot contamination, intra-cell interference, inter-cell interference, and the noise element,
respectively. Now, to facilitate the derivation of the MSE expression for the superimposed pilot-based
massive FD-MIMO system, we consider the following lemma [6]:
Lemma 4.1. Normalized array steering vectors are asymptotically orthogonal as the number
of antennas at the BS grows large, i.e., $\bar{e}_{r,jg,i}(m) \perp \mathrm{span}\{ \bar{e}_{r,j'g',i'}(n) \mid \forall (j, g, i, m) \neq (j', g', i', n) \}$,
where $\bar{e}_{r,jg,i}(m) = \frac{1}{\sqrt{N_r}} e_{r,jg,i}(m)$.
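The asymptotic orthogonality stated in the lemma can be checked numerically for a pair of fixed, distinct directions. The sketch below assumes a square array with half-wavelength spacing and two arbitrarily chosen angle pairs; the function names and angles are hypothetical and serve only as an illustration.

```python
import numpy as np

def normalized_upa(M1, M2, theta, phi, spacing=0.5):
    """Unit-norm receive array response for elevation theta and azimuth phi (radians)."""
    u = 2 * np.pi * spacing * np.cos(theta)
    v = 2 * np.pi * spacing * np.sin(theta) * np.cos(phi)
    a = np.kron(np.exp(1j * v * np.arange(M2)), np.exp(1j * u * np.arange(M1)))
    return a / np.sqrt(M1 * M2)

def cross_correlation(M, path1, path2):
    """Magnitude of the inner product between two normalized M x M array responses."""
    e1 = normalized_upa(M, M, *path1)
    e2 = normalized_upa(M, M, *path2)
    return np.abs(np.vdot(e1, e2))

# Assumed directions (elevation, azimuth) of two different paths.
p1 = (np.deg2rad(70), np.deg2rad(30))
p2 = (np.deg2rad(85), np.deg2rad(120))
for M in (2, 8, 32):
    print(M * M, cross_correlation(M, p1, p2))
```

For fixed distinct directions, the inner product is a product of two Dirichlet-type kernels, so it is bounded by a term that decays as the number of antennas in each dimension grows, although it does not decrease monotonically.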
Utilizing Lemma 4.1, it is possible to characterize the performance of DoA estimation for
superimposed pilot-based 3D massive MIMO systems. Specifically, the following theorem analytically
characterizes the MSE of the uplink angle estimation caused by pilot contamination:
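Theorem 4.2. For the superimposed pilot-based 3D massive MIMO system, the MSE, $E\{(\Delta v_{ni,i,\ell})^2_1\}$,
due to pilot contamination is given by
\[
\begin{aligned}
E\left\{ (\Delta v_{ni,i,\ell})^2_1 \right\} ={}& \frac{1}{W_{ni,i,\ell}} \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \Lambda_{ng,i} \, \alpha_{ng,i} \left( X_{ng,i} + \rho_2^2 \gamma(1-\gamma) X'_{ng,i} \right) \left( Y_{ng,i} + Y'_{ng,i} \right) \\
&- \frac{4 \rho_2^2 \gamma(1-\gamma)}{W_{ni,i,\ell}} \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \Lambda_{ng,i} \, \alpha_{ng,i} \, X'_{ng,i} \, \Re\left\{ e^{j\Phi} \tilde{Y}_{ng,i} \right\},
\end{aligned}
\tag{4.12}
\]
where $\Phi = \left( (M_1 - 1) u_{ni,i,\ell} + (M_2 - 1) v_{ni,i,\ell} \right)$, $W_{ni,i,\ell} = 8 |\alpha_{ni,i}(\ell)|^2 N_t^2 \Lambda_{ni,i} (M_2 - 1)^2 M_1^2$,
$\alpha_{ng,i} = \sum_{m=0}^{L_{ng,i}-1} |\alpha_{ng,i}(m)|^2$, and $X_{ng,i}$ and $Y_{ng,i}$ are given by
\[
X_{ng,i} = E_{\psi} \left| \rho_2 \sqrt{\gamma(1-\gamma)} \, X_{\ell,1} X_{\ell,2} + \gamma X_{\ell,3} \right|^2, \qquad
X'_{ng,i} = E_{\psi} \left| \left( \rho_2 \sqrt{\gamma(1-\gamma)} \, N_t + 1 \right) X_{\ell,1} X_{\ell,2} \right|^2, \tag{4.13}
\]
where $X_{\ell,1} = \left( 1 + e^{-j\omega_{ni,i,\ell}} + \ldots + e^{-j(N_t-1)\omega_{ni,i,\ell}} \right)$, $X_{\ell,2} = \left( 1 + e^{j\omega_{ni,i,\ell}} + \ldots + e^{j(N_t-1)\omega_{ni,i,\ell}} \right)$,
and $X_{\ell,3} = \left( 1 + e^{-j(\omega_{ni,i,\ell} - \omega_{ng,i,m})} + \ldots + e^{-j(N_t-1)(\omega_{ni,i,\ell} - \omega_{ng,i,m})} \right)$; $\rho_2$ is the expected correlation
between the data signal and spreading sequences, and
\[
Y_{ng,i} = E_{\theta,\phi} \left| \left( 1 + e^{j(u_{ni,i,\ell} - u_{ng,i,m})} + \ldots + e^{j(M_1-1)(u_{ni,i,\ell} - u_{ng,i,m})} \right) \left( e^{j v_{ni,i,\ell}} e^{-j v_{ng,i,m}} - 1 \right) \right|^2, \tag{4.14}
\]
\[
Y'_{ng,i} = E_{\theta,\phi} \left| \left( e^{j(M_1-1) u_{ng,i,m}} + e^{j u_{ni,i,\ell}} e^{j(M_1-2) u_{ng,i,m}} + \ldots + e^{j(M_1-1) u_{ni,i,\ell}} \right) \times \left( e^{j(M_2-1) v_{ni,i,\ell}} - e^{j(M_2-1) v_{ng,i,m}} \right) \right|^2, \tag{4.15}
\]
\[
\begin{aligned}
\tilde{Y}_{ng,i} = E_{\theta,\phi} \Big[ & \left( e^{-j(M_1-1) u_{ng,i,m}} + e^{-j u_{ni,i,\ell}} e^{-j(M_1-2) u_{ng,i,m}} + \ldots + e^{-j(M_1-1) u_{ni,i,\ell}} \right) \\
& \times \left( e^{-j(M_2-1) v_{ni,i,\ell}} - e^{-j(M_2-1) v_{ng,i,m}} \right) \left( 1 + \ldots + e^{j(M_1-1)(u_{ng,i,m} - u_{ni,i,\ell})} \right) \\
& \times \left( e^{j(M_2-1)(v_{ng,i,m} - v_{ni,i,\ell})} - 1 \right) \Big]
\end{aligned}
\tag{4.16}
\]
for $m = 0, \ldots, L_{ng,i} - 1$, and $E_{\psi}$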
Proof. See Appendix A.5.
Similarly, the MSEs due to intra-cell interference, inter-cell interference, and the noise element are
characterized in the following three theorems, respectively.
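Theorem 4.3. For the superimposed pilot-based 3D massive MIMO system, the MSE, $E\{(\Delta v_{ni,i,\ell})^2_2\}$,
due to intra-cell interference is given by
\[
E\left\{ (\Delta v_{ni,i,\ell})^2_2 \right\} = \frac{\rho_1^2 \gamma^2 + \rho_2^2 \gamma(1-\gamma)}{W_{ni,i,\ell}} \sum_{\substack{j=0 \\ j \neq n}}^{J-1} \Lambda_{ji,i} \, X'_{ji,i} \, \alpha_{ji,i} \left( Y_{ji,i} + Y'_{ji,i} - 2 \Re\left\{ e^{j\Phi} \tilde{Y}_{ji,i} \right\} \right).
\]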
Proof. See Appendix A.6.
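Theorem 4.4. For the superimposed pilot-based 3D massive MIMO system, the MSE, $E\{(\Delta v_{ni,i,\ell})^2_3\}$,
due to inter-cell interference is given by
\[
E\left\{ (\Delta v_{ni,i,\ell})^2_3 \right\} = \frac{\rho_1^2 \gamma^2 + \rho_2^2 \gamma(1-\gamma)}{W_{ni,i,\ell}} \sum_{\substack{g=0 \\ g \neq i}}^{G-1} \sum_{\substack{j=0 \\ j \neq n}}^{J-1} \Lambda_{jg,i} \, X'_{jg,i} \, \alpha_{jg,i} \left( Y_{jg,i} + Y'_{jg,i} - 2 \Re\left\{ e^{j\Phi} \tilde{Y}_{jg,i} \right\} \right).
\]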
Proof. This theorem can be proved similarly to Theorem 4.3.
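Theorem 4.5. For the superimposed pilot-based 3D massive MIMO system, the MSE, $E\{(\Delta v_{ni,i,\ell})^2_4\}$,
due to the noise element is given by
\[
E\left\{ (\Delta v_{ni,i,\ell})^2_4 \right\} = \frac{\sigma^2 \left( \xi |X_{\ell,1}|^2 + \gamma^2 N_t \right)}{2 |\alpha_{ni,i}(\ell)|^2 N_t^2 \Lambda_{ni,i} (M_2 - 1)^2 M_1},
\]
where $\sigma^2$ is the noise variance, and $\xi = N_t \rho_2^2 \gamma(1-\gamma) + 2 \rho_2 \gamma \sqrt{\gamma(1-\gamma)}$.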
Proof. This proof follows the line of proof for Theorem 1 in [6].
Remark 4.6. Theorems 4.2-4.5 explicitly show how the MSE depends on a few physical parameters,
namely the number of antennas, channel gains, correlation between pilot and data sequences,
and ratio of power allocated to pilot and data symbols. Moreover, these results clearly depict
how antenna configurations affect the estimation performance: placing the same total number
of antennas in different orientations will result in different DoA estimation performance for
elevation and azimuth angles.
Remark 4.7. It can be observed from Theorem 4.2 that the MSE due to pilot contamination
does not depend on ρ1, the correlation between scrambling sequences for different users. This
is due to the fact that pilot contamination results from users in other cells that use exactly
the same spreading sequences as the target user. On the other hand, from Theorems 4.3 and
4.4, it is clear that the MSEs due to intra-cell and inter-cell interference are affected by ρ1.
[Diagram: two consecutive uplink phases, a Pilot + Data phase of Qs symbols followed by a Data-Only phase of Qd symbols.]
Figure 4.1: Uplink Transmission Phases in Superimposed Pilot System.
4.4 Achievable Rate Analysis
4.4.1 Uplink Rate Analysis
We consider a practical scenario, where uplink transmission occurs in two phases (see
Fig. 4.1): first, an uplink channel estimation phase with superimposed pilots and data, and
second, an uplink data-only transmission phase. Unlike conventional channel estimation
approaches, where only pilots are transmitted during the channel estimation phase,
superimposed pilot-based approaches can have a non-zero data rate during the uplink channel estimation
phase. We assume the number of symbols used for superimposed pilot and data transmission
is $Q_s$, and the number of symbols used for data-only transmission is $Q_d$. Let $\delta_s$ and $\delta_d$ denote
the fractions of symbols used during the channel estimation and data-only transmission phases,
respectively, i.e., $\delta_s = Q_s/(Q_s + Q_d)$ and $\delta_d = Q_d/(Q_s + Q_d) = 1 - \delta_s$. Accordingly, the average
uplink spectral efficiency for the i-th cell can be written as
\[
I_i^{UL} = \delta_s I_i^{ul,sd} + (1 - \delta_s) I_i^{ul,dd}, \tag{4.17}
\]
where $I_i^{ul,sd}$ and $I_i^{ul,dd}$ are the uplink rates corresponding to the superimposed pilot-based channel
estimation phase and the dedicated data-only transmission phase, respectively.
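As a small worked example of (4.17), the average uplink spectral efficiency is a weighted sum of the two per-phase rates, with weights given by the fraction of symbols spent in each phase. The function name `uplink_spectral_efficiency` and the numerical rates below are hypothetical.

```python
def uplink_spectral_efficiency(rate_sd, rate_dd, Qs, Qd):
    """Average uplink spectral efficiency from (4.17).

    rate_sd: rate during the superimposed pilot-plus-data phase (Qs symbols).
    rate_dd: rate during the data-only phase (Qd symbols).
    """
    delta_s = Qs / (Qs + Qd)
    return delta_s * rate_sd + (1 - delta_s) * rate_dd

# Assumed example: 10 superimposed symbols at 2 bit/s/Hz, 30 data-only symbols at 4 bit/s/Hz.
avg_rate = uplink_spectral_efficiency(2.0, 4.0, Qs=10, Qd=30)
```

Here one quarter of the symbols carry superimposed pilots, so the average is 0.25 × 2 + 0.75 × 4 = 3.5 bit/s/Hz; in the actual system, both per-phase rates also depend on the pilot power fraction and on the DoA estimation quality.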
In this work, we assume that the path gains are known a priori to the BS. In the uplink, the DoA
is the only channel parameter that the BS needs to estimate. The BS utilizes the uplink estimated
DoA for uplink data detection. On the other hand, at the UE side, each UE knows its own
DoD and precodes the uplink data with the DoD steering vectors. In the downlink, with
the TDD operation, BS utilizes the uplink estimated DoA for downlink precoding. On the
other hand, UEs use their own DoD steering matrices for downlink receive processing. Note
that in order to detect symbols in the uplink or do precoding in the downlink, the BSs do
not need the uplink DoD information.
Note that even though in this work we assume the path gains are known a priori,
in practice, based on the estimated angle information, the path gains can also be estimated using
the maximum likelihood (ML) method [60].
Uplink Rate for Channel Estimation Phase
From (4.1), the $N_r \times 1$ uplink received signal at the i-th base station for the k-th subcarrier at the
q-th symbol, $z_i^q(k)$, can be written as
\[
z_i^q(k) = \sum_{g=0}^{G-1} \sum_{j=0}^{J-1} \sqrt{\Lambda_{jg,i}} \, H_{jg,i}(k) \left( x_{jg}^q(k) + s_{jg}^q(k) \right) + w_i^q(k), \tag{4.18}
\]
where $x_{jg}^q(k)$ and $s_{jg}^q(k)$ are the $N_t \times 1$ transmitted data and pilot signal vectors, respectively,
from the j-th user in the g-th cell for the k-th subcarrier at the q-th symbol. Let us assume
that the n-th user in the i-th cell is the target user whose signal the i-th base station wants to
detect, and the mobile devices in the uplink use their own DoDs to precode the information
4.4. Achievable Rate Analysis 77
symbols. Now, from (4.18), we have
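\[
\begin{aligned}
\mathbf{z}_i^q(k) ={}& \sqrt{\Lambda_{ni,i}}\,\mathbf{A}_{ni,i}(k)\mathbf{D}_{ni,i}\left(\mathbf{x}_{ni}^q(k) + \mathbf{s}_{ni}^q(k)\right)
+ \sum_{\substack{j=0\\ j\neq n}}^{J-1} \sqrt{\Lambda_{ji,i}}\,\mathbf{A}_{ji,i}(k)\mathbf{D}_{ji,i}\left(\mathbf{x}_{ji}^q(k) + \mathbf{s}_{ji}^q(k)\right) \\
&+ \sum_{\substack{g=0\\ g\neq i}}^{G-1}\sum_{j=0}^{J-1} \sqrt{\Lambda_{jg,i}}\,\mathbf{A}_{jg,i}(k)\mathbf{D}_{jg,i}\mathbf{B}_{jg,i}^{H}\left(\mathbf{B}_{jg,g}^{H}\right)^{+}\left(\mathbf{x}_{jg}^q(k) + \mathbf{s}_{jg}^q(k)\right) + \mathbf{w}_i^q(k),
\end{aligned}
\tag{4.19}
\]
where $\mathbf{x}_{ni}^q(k)$ and $\mathbf{s}_{ni}^q(k)$ are the $N_s \times 1$ unprecoded data and pilot information symbol vector
in the superimposed channel estimation phase, respectively, and $(\cdot)^{+}$ denotes matrix pseudo-
inverse operation. For detecting the data from the n-th user in i-th cell, the i-th base station
first subtracts the corresponding pilot signal from the received signal by utilizing uplink
estimated DoAs: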
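\[
\begin{aligned}
\mathbf{z}_i^q(k) - \sqrt{\Lambda_{ni,i}}\,\hat{\mathbf{A}}_{ni,i}(k)\mathbf{D}_{ni,i}\mathbf{s}_{ni}^q(k)
={}& \sqrt{\Lambda_{ni,i}}\,\mathbf{A}_{ni,i}(k)\mathbf{D}_{ni,i}\mathbf{x}_{ni}^q(k)
+ \sqrt{\Lambda_{ni,i}}\,\mathbf{A}_{ni,i}(k)\mathbf{D}_{ni,i}\mathbf{s}_{ni}^q(k)
- \sqrt{\Lambda_{ni,i}}\,\hat{\mathbf{A}}_{ni,i}(k)\mathbf{D}_{ni,i}\mathbf{s}_{ni}^q(k) \\
&+ \sum_{\substack{j=0\\ j\neq n}}^{J-1} \sqrt{\Lambda_{ji,i}}\,\mathbf{A}_{ji,i}(k)\mathbf{D}_{ji,i}\left(\mathbf{x}_{ji}^q(k) + \mathbf{s}_{ji}^q(k)\right) \\
&+ \sum_{\substack{g=0\\ g\neq i}}^{G-1}\sum_{j=0}^{J-1} \sqrt{\Lambda_{jg,i}}\,\mathbf{A}_{jg,i}(k)\mathbf{D}_{jg,i}\mathbf{B}_{jg,i}^{H}\left(\mathbf{B}_{jg,g}^{H}\right)^{+}\left(\mathbf{x}_{jg}^q(k) + \mathbf{s}_{jg}^q(k)\right) + \mathbf{w}_i^q(k),
\end{aligned}
\tag{4.20}
\]
where $\hat{\mathbf{A}}_{ni,i}(k)$ denotes the array steering matrix constructed from the uplink estimated DoAs.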
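Now, after correlating with target user's estimated array steering matrix, we have
\[
\begin{aligned}
\mathbf{z}_{ni}^q(k) &= \frac{1}{N_r}\hat{\mathbf{A}}_{ni,i}^{H}(k)\left(\mathbf{z}_i^q(k) - \sqrt{\Lambda_{ni,i}}\,\hat{\mathbf{A}}_{ni,i}(k)\mathbf{D}_{ni,i}\mathbf{s}_{ni}^q(k)\right) \\
&= \frac{\sqrt{\Lambda_{ni,i}}}{N_r}\hat{\mathbf{A}}_{ni,i}^{H}(k)\mathbf{A}_{ni,i}(k)\mathbf{D}_{ni,i}\mathbf{x}_{ni}^q(k) + \mathbf{p}_{ni,q}^{\mathrm{self}}(k) + \mathbf{p}_{ni,q}^{\mathrm{intra}}(k) + \mathbf{p}_{ni,q}^{\mathrm{inter}}(k) + \tilde{\mathbf{w}}_i^q(k),
\end{aligned}
\tag{4.21}
\]
where $\tilde{\mathbf{w}}_i^q(k) = \frac{1}{N_r}\hat{\mathbf{A}}_{ni,i}^{H}(k)\mathbf{w}_i^q(k)$ is the noise term, and $\mathbf{p}_{ni,q}^{\mathrm{self}}(k)$, $\mathbf{p}_{ni,q}^{\mathrm{intra}}(k)$, and $\mathbf{p}_{ni,q}^{\mathrm{inter}}(k)$ are
the self interference, intra-cell interference, and inter-cell interference, respectively, and given
by
\[
\mathbf{p}_{ni,q}^{\mathrm{self}}(k) = \frac{\sqrt{\Lambda_{ni,i}}}{N_r}\left(\hat{\mathbf{A}}_{ni,i}^{H}(k)\left(\mathbf{A}_{ni,i}(k) - \hat{\mathbf{A}}_{ni,i}(k)\right)\mathbf{D}_{ni,i}\mathbf{s}_{ni}^q(k)\right) \tag{4.22}
\]
\[
\mathbf{p}_{ni,q}^{\mathrm{intra}}(k) = \sum_{\substack{j=0\\ j\neq n}}^{J-1} \frac{\sqrt{\Lambda_{ji,i}}}{N_r}\hat{\mathbf{A}}_{ni,i}^{H}(k)\mathbf{A}_{ji,i}(k)\mathbf{D}_{ji,i}\left(\mathbf{x}_{ji}^q(k) + \mathbf{s}_{ji}^q(k)\right) \tag{4.23}
\]
\[
\mathbf{p}_{ni,q}^{\mathrm{inter}}(k) = \sum_{\substack{g=0\\ g\neq i}}^{G-1}\sum_{j=0}^{J-1} \frac{\sqrt{\Lambda_{jg,i}}}{N_r}\hat{\mathbf{A}}_{ni,i}^{H}(k)\mathbf{A}_{jg,i}(k)\mathbf{D}_{jg,i}\mathbf{B}_{jg,i}^{H}\left(\mathbf{B}_{jg,g}^{H}\right)^{+}\left(\mathbf{x}_{jg}^q(k) + \mathbf{s}_{jg}^q(k)\right). \tag{4.24}
\]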
Note that the uplink self interference, $\mathbf{p}_{ni,q}^{\mathrm{self}}(k)$, during the uplink data
detection phase is caused by the mismatch between the uplink estimated DoAs and the true
uplink DoAs. The following theorem characterizes the uplink rate during the channel estimation
phase assuming perfect channel estimation at the BS.
Theorem 4.8. For superimposed pilot-based 3D massive MIMO systems, uplink achievable
rate corresponding to n-th user at k-th subcarrier during superimposed pilot+data transmis-
sion phase and with perfect CSI acquisition is given by
\[
I_{ni}^{\mathrm{ul,sd}}[k] = \sum_{\ell=0}^{L_{ni,i}-1} \log_2\left(1 + \gamma_{ni,\ell}\, p_{ni,\ell}^{\mathrm{ul,sd}}[k]\right), \tag{4.25}
\]
where $\gamma_{ni,\ell} = \Lambda_{ni,i}|\alpha_{ni,i}(\ell)|^2/\sigma^2$, $L_{ni,i}$ denotes the number of symbols, and $p_{ni,\ell}^{\mathrm{ul,sd}}[k]$ is the
uplink power allocated on the $\ell$-th superimposed data symbol during the channel estimation
phase.
Proof. See Appendix A.7.
Remark 4.9. Note that the rate expressions here are based on large antenna approximation.
Hence, in the strict sense, the rate in (4.25) is not the exact achievable rate. However,
following the tradition in massive MIMO literature, where the large antenna assumption is
very common in rate analysis, we stick to the convention and call it ’achievable rate’.
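To make the structure of (4.25) concrete, the sketch below evaluates the sum of per-symbol log terms for given path gains and powers. The function name and the numerical inputs are hypothetical, and the large-antenna approximation discussed in Remark 4.9 is simply taken for granted.

```python
import math


def perfect_csi_uplink_rate(path_gains, large_scale_gain, noise_var, powers):
    """Evaluate sum_l log2(1 + gamma_l * p_l) as in (4.25).

    path_gains: complex gains alpha(l); large_scale_gain: Lambda;
    noise_var: sigma^2; powers: power allocated to each term.
    """
    if len(path_gains) != len(powers):
        raise ValueError("one power value is needed per path gain")
    rate = 0.0
    for alpha, power in zip(path_gains, powers):
        gamma = large_scale_gain * abs(alpha) ** 2 / noise_var
        rate += math.log2(1.0 + gamma * power)
    return rate


# Hypothetical inputs: two terms with unit-magnitude gains.
print(perfect_csi_uplink_rate([1 + 0j, 1j], 1.0, 1.0, [1.0, 3.0]))
```

The same function can be reused for the data-only phase by passing the data-only power allocation instead of the superimposed data power.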
In the presence of DoA estimation error, uplink rate performance can be affected. The next
theorem relates the uplink rate to the estimated DoAs at the BS.
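Theorem 4.10. For superimposed pilot system, uplink achievable rate corresponding to n-
th user at k-th subcarrier during superimposed pilot+data transmission phase and in the
presence of DoA estimation error is given by
\[
I_{ni}^{\mathrm{ul,sd}}[k] = \mathbb{E}\left[\sum_{\ell=0}^{L_{ni,i}-1} \log\left(1 + \frac{\frac{1}{N_r^2}\gamma_{ni,\ell}\left|\hat{\mathbf{e}}_{r,ni,i,k}^{H}(\ell)\,\mathbf{e}_{r,ni,i,k}(\ell)\right|^2 p_{ni,\ell}^{\mathrm{ul,sd}}[k]}{\sigma^2 + \gamma_{ni,\ell}\left|\frac{1}{N_r}\hat{\mathbf{e}}_{r,ni,i,k}^{H}(\ell)\,\mathbf{e}_{r,ni,i,k}(\ell) - 1\right|^2 p_{ni,\ell}^{\mathrm{ul,sp}}[k]}\right)\right], \tag{4.26}
\]
where the expectation is taken with respect to DoA estimation error, $\gamma_{ni,\ell} = \Lambda_{ni,i}|\alpha_{ni,i}(\ell)|^2$,
and $p_{ni,\ell}^{\mathrm{ul,sd}}[k]$ and $p_{ni,\ell}^{\mathrm{ul,sp}}[k]$ are the transmit powers during the uplink channel estimation phase
allocated on the data and pilot symbols, respectively.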
Proof. See Appendix A.8.
Remark 4.11. Theorem 4.10 shows that, unlike the perfect channel estimation case, under
DoA estimation error the uplink rate is affected by the pilot symbols superimposed on the data.
Allocating more power to the superimposed pilot symbols improves the channel
estimation quality but negatively affects the uplink rate during the channel estimation phase.
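The effect described in Remark 4.11 can be illustrated with a single term of (4.26) for one fixed DoA error, without taking the expectation. This is only a sketch under assumptions that are not part of this work: a half-wavelength uniform linear array stands in for the 2D array, the two vectors in (4.26) are read as the estimated and true array response vectors, and all names and numbers are hypothetical.

```python
import cmath
import math


def steering_vector(num_antennas, angle_rad):
    """Array response of a half-wavelength-spaced uniform linear array (assumed geometry)."""
    return [cmath.exp(1j * math.pi * n * math.sin(angle_rad)) for n in range(num_antennas)]


def sinr_with_doa_error(num_antennas, true_angle, est_angle, gamma,
                        noise_var, data_power, pilot_power):
    """SINR inside the log of (4.26) for one term and one realization of the DoA error."""
    e_true = steering_vector(num_antennas, true_angle)
    e_est = steering_vector(num_antennas, est_angle)
    # Inner product between the estimated and the true array response vectors.
    corr = sum(a.conjugate() * b for a, b in zip(e_est, e_true))
    signal = gamma * abs(corr) ** 2 * data_power / num_antennas ** 2
    # Residual pilot (self-interference) left after imperfect pilot subtraction.
    residual = gamma * abs(corr / num_antennas - 1.0) ** 2 * pilot_power
    return signal / (noise_var + residual)


# Same DoA error and data power, two different pilot powers.
low_pilot = sinr_with_doa_error(64, 0.30, 0.31, 2.0, 1.0, 0.5, 0.1)
high_pilot = sinr_with_doa_error(64, 0.30, 0.31, 2.0, 1.0, 0.5, 0.9)
print(low_pilot, high_pilot)
```

For a fixed estimation error, a larger pilot power only enlarges the residual term in the denominator. In the full system the estimation error itself shrinks as pilot power grows, which is the trade-off examined in Section 4.5.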
Uplink Rate for Data-only Transmission Phase
In this subsection, we discuss the uplink achievable rate for the data-only transmission
phase. The $N_r \times 1$ received signal at k-th subcarrier on the $q'$-th symbol during the uplink
data-only transmission phase can be written as
\[
\begin{aligned}
\mathbf{z}_i^{q'}(k) ={}& \sqrt{\Lambda_{ni,i}}\,\mathbf{A}_{ni,i}(k)\mathbf{D}_{ni,i}\mathbf{x}_{ni}^{q'}(k)
+ \sum_{\substack{j=0\\ j\neq n}}^{J-1} \sqrt{\Lambda_{ji,i}}\,\mathbf{A}_{ji,i}(k)\mathbf{D}_{ji,i}\mathbf{x}_{ji}^{q'}(k) \\
&+ \sum_{\substack{g=0\\ g\neq i}}^{G-1}\sum_{j=0}^{J-1} \sqrt{\Lambda_{jg,i}}\,\mathbf{A}_{jg,i}(k)\mathbf{D}_{jg,i}\mathbf{B}_{jg,i}^{H}\left(\mathbf{B}_{jg,g}^{H}\right)^{+}\mathbf{x}_{jg}^{q'}(k) + \mathbf{w}_i^{q'}(k),
\end{aligned}
\tag{4.27}
\]
where the first and second summation terms in (4.27) represent intra- and inter-cell inter-
ference during uplink data transmission phase, respectively, and $\mathbf{w}_i^{q'}(k)$ denotes the corre-
sponding noise element. Uplink rate corresponding to the data-only transmission phase is
characterized in the following theorem assuming perfect CSI availability at the BS.
Theorem 4.12. For superimposed pilot system, uplink achievable rate corresponding to n-th
user at k-th subcarrier during data-only transmission phase and with perfect CSI is given by
\[
I_{ni}^{\mathrm{ul,dd}}[k] = \sum_{\ell=0}^{L_{ni,i}-1} \log_2\left(1 + \gamma_{ni,\ell}\, p_{ni,\ell}^{\mathrm{ul,dd}}[k]\right), \tag{4.28}
\]
where $p_{ni,\ell}^{\mathrm{ul,dd}}[k]$ is the power allocated on the $\ell$-th symbol during the data-only transmission
phase.
Proof. See Appendix A.9.
It can be observed from Theorem 4.8 and Theorem 4.12 that, assuming perfect DoA es-
timation, the uplink rates for the channel estimation phase and the data-only transmission phase are
similar except for the corresponding data power allocation during the two phases. The following
theorem characterizes the uplink rate during the data-only transmission phase taking DoA
estimation error into account.
Theorem 4.13. For superimposed pilot system, uplink achievable rate corresponding to n-
th user at k-th subcarrier during data-only transmission phase and in the presence of DoA
estimation error is given by
\[
I_{ni}^{\mathrm{ul,dd}}[k] = \mathbb{E}\left[\sum_{\ell=0}^{L_{ni,i}-1} \log\left(1 + \frac{\frac{1}{N_r^2}\gamma_{ni,\ell}\left|\hat{\mathbf{e}}_{r,ni,i,k}^{H}(\ell)\,\mathbf{e}_{r,ni,i,k}(\ell)\right|^2 p_{ni,\ell}^{\mathrm{ul,dd}}[k]}{\sigma^2}\right)\right], \tag{4.29}
\]
where $p_{ni,\ell}^{\mathrm{ul,dd}}$ is the corresponding power allocated at $\ell$-th symbol during the data-only transmission,
and the expectation is taken with respect to uplink DoA estimation error.
Proof. See Appendix A.10.
4.4.2 Optimum Downlink Precoding
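At the i-th BS, for downlink transmission, the $N_s \times 1$ frequency domain information vector
intended for the n-th MS can be denoted as $\mathbf{s}_{ni}^{\mathrm{dl}}[k] = \left[s_{ni,0}^{\mathrm{dl}}[k], \ldots, s_{ni,N_s-1}^{\mathrm{dl}}[k]\right]^T$, where $s_{ni,p}^{\mathrm{dl}}[k]$
is the p-th downlink information at the k-th subcarrier for the n-th MS in i-th cell. Hence, we
can express the transmitted $N_r \times 1$ downlink signal from the i-th BS as $\mathbf{x}_i^{\mathrm{dl}}[k] = \sum_{j=0}^{J-1}\mathbf{x}_{ji}^{\mathrm{dl}}[k]$,
where $\mathbf{x}_{ji}^{\mathrm{dl}}[k] = \mathbf{V}_{ji}^{\mathrm{dl}}[k]\mathbf{s}_{ji}^{\mathrm{dl}}[k]$, and $\mathbf{V}_{ji}^{\mathrm{dl}}[k]$ is the $N_r \times N_s$ downlink precoder corresponding to the
j-th MS in i-th cell at the k-th subcarrier. At the n-th MS in i-th cell, the $N_t \times 1$ downlink
received signal at the k-th subcarrier, $\mathbf{y}_{ni}^{\mathrm{dl}}[k]$, can be expressed as
\[
\begin{aligned}
\mathbf{y}_{ni}^{\mathrm{dl}}[k] &= \sum_{g=0}^{G-1} \sqrt{\Lambda_{ni,g}}\,\mathbf{H}_{ni,g}^{\mathrm{dl}}[k]\mathbf{x}_g^{\mathrm{dl}}[k] + \mathbf{n}_{ni}^{\mathrm{dl}}[k]
= \sum_{g=0}^{G-1}\sum_{j=0}^{J-1} \sqrt{\Lambda_{ni,g}}\,\mathbf{H}_{ni,g}^{\mathrm{dl}}[k]\mathbf{V}_{jg}^{\mathrm{dl}}[k]\mathbf{s}_{jg}^{\mathrm{dl}}[k] + \mathbf{n}_{ni}^{\mathrm{dl}}[k] \\
&= \sqrt{\Lambda_{ni,i}}\,\mathbf{H}_{ni,i}^{\mathrm{dl}}[k]\mathbf{V}_{ni}^{\mathrm{dl}}[k]\mathbf{s}_{ni}^{\mathrm{dl}}[k] + \mathbf{I}_{ni}^{\mathrm{intra,dl}} + \mathbf{I}_{ni}^{\mathrm{inter,dl}} + \mathbf{n}_{ni}^{\mathrm{dl}}[k],
\end{aligned}
\tag{4.30}
\]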
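where $\mathbf{H}_{ni,g}^{\mathrm{dl}}[k]$ is the $N_t \times N_r$ channel transfer function at the k-th subcarrier correspond-
ing to the downlink channel between the n-th MS in i-th cell and the g-th BS;
\[
\mathbf{I}_{ni}^{\mathrm{intra,dl}} = \sum_{\substack{j=0\\ j\neq n}}^{J-1} \sqrt{\Lambda_{ni,i}}\,\mathbf{H}_{ni,i}^{\mathrm{dl}}[k]\mathbf{V}_{ji}^{\mathrm{dl}}[k]\mathbf{s}_{ji}^{\mathrm{dl}}[k]
\quad\text{and}\quad
\mathbf{I}_{ni}^{\mathrm{inter,dl}} = \sum_{\substack{g=0\\ g\neq i}}^{G-1}\sum_{j=0}^{J-1} \sqrt{\Lambda_{ni,g}}\,\mathbf{H}_{ni,g}^{\mathrm{dl}}[k]\mathbf{V}_{jg}^{\mathrm{dl}}[k]\mathbf{s}_{jg}^{\mathrm{dl}}[k]
\]
represent, respectively, the intra- and inter-cell interferences; $\mathbf{n}_{ni}^{\mathrm{dl}}[k]$ is the $N_t \times 1$ noise vector with
$\mathbb{E}\{\mathbf{n}_{ni}^{\mathrm{dl}}[m]\,\mathbf{n}_{ni}^{\mathrm{dl}\,H}[n]\} = \sigma^2\mathbf{I}_{N_t}\delta(m-n)$. The first term in (4.30) is the desired signal for the n-th
user in the i-th cell.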
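The n-th user's downlink rate can accordingly be written as [64]:
\[
I_{ni}^{\mathrm{dl}} = \log_2\det\left(\mathbf{I}_{L_{ni,i}} + \bar{\mathbf{D}}_{ni,i}\,\mathbf{Q}_{ni}^{\mathrm{dl}}[k]\,\bar{\mathbf{D}}_{ni,i}^{H}\,\mathbf{R}_{ni}^{\prime\,-1}[k]\right), \tag{4.31}
\]
where $\bar{\mathbf{D}}_{ni,i} = \sqrt{\Lambda_{ni,i}}\,\mathbf{D}_{ni,i}$, $\mathbf{Q}_{ni}^{\mathrm{dl}}[k] = \mathbb{E}\{\mathbf{s}_{ni}^{\mathrm{dl}}[k]\,\mathbf{s}_{ni}^{\mathrm{dl}\,H}[k]\}$ is the covariance matrix of the transmit
symbol vector from the i-th BS intended for the n-th MS on the k-th subcarrier, and $\mathbf{R}'_{ni}[k]$
denotes the inter-cell interference-plus-noise covariance matrix.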
Hence, under the total power constraint, $P_t$, for each subcarrier, the downlink sum-rate for
the i-th BS at the k-th subcarrier can be expressed as
\[
I_i^{\mathrm{dl}}[k] = \sum_{j=0}^{J-1} I_{ji}^{\mathrm{dl}}[k]. \tag{4.32}
\]
Note that the precoders, $\mathbf{V}_{jg}^{\mathrm{dl}}$, where j = 1, . . . , J and g = 1, . . . , G, in (4.32)
are constructed based on the uplink DoAs at the BS. The precoder and power allocation that
maximize the downlink rate in (4.32) are presented in [64], taking the uplink DoA estimation
error into account. For completeness of the exposition, in Section 4.5 we utilize the
theoretical results from [64] for investigating downlink and overall rate performance for
superimposed pilot-based massive MIMO systems. Note that for investigating
downlink rate performance, the BS is assumed to have perfect knowledge of the complex
channel gains. However, once the uplink DoAs are estimated, we can estimate the channel
gains based on estimated DoAs. Our work on this aspect is presented in [60].
Figure 4.2: Elevation Angle Estimation for 64 Antennas. [Plot: RMSE for angle estimation (in degrees) vs. SNR (dB); simulation and analytical results for an 8 × 8 array with data power 0.3 and 0.1.]
4.5 Performance Evaluation
In this section, we present the performance evaluation for superimposed pilot based 3D
massive MIMO systems. We consider seven hexagonal cells with 10 users in each cell. We
assume each channel has four dominant paths, and antennas at both BS and user sides are
0.5 wavelengths apart.
Figure 4.3: Azimuth Angle Estimation for 64 Antennas. [Plot: RMSE for angle estimation (in degrees) vs. SNR (dB); simulation and analytical results for an 8 × 8 array with data power 0.3 and 0.1.]
The RMSE of DoA estimation for 8 × 8 antenna arrays is presented in Fig. 4.2 and Fig. 4.3
for elevation and azimuth angles, respectively. The number of antennas at each user is 8. The
correlation coefficient of spreading sequences, ρ1, is taken as 0.3, and the correlation between data
and pilot vectors is 0.1. We can observe from the figures that the simulation and analytical
results match very closely. Results for a 16 × 4 antenna array are shown in Fig. 4.4, which,
compared with the 8 × 8 case, indicate that antenna array geometry plays a critical role
in DoA estimation performance. Results for 16 × 16 and 32 × 8 arrays are
presented in Fig. 4.5 (elevation) and Fig. 4.6 (azimuth). We can clearly see that DoA estimation
performance improves significantly with the number of antennas at the BS. In our work,
the array response vectors become orthogonal asymptotically as the number of antennas at
the BS grows large. In practice, this asymptotic assumption holds even for relatively small
antenna arrays, for example, with a total of 64 antennas (8 × 8, 4 × 16, or 16 × 4)
[6, 7].
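The asymptotic orthogonality of array response vectors mentioned above can be checked numerically with a short sketch. It assumes a half-wavelength uniform linear array instead of the 2D arrays used in this chapter, and the two angles are arbitrary illustrative values.

```python
import cmath
import math


def steering_vector(num_antennas, angle_rad):
    """Array response of a half-wavelength-spaced uniform linear array (assumed geometry)."""
    return [cmath.exp(1j * math.pi * n * math.sin(angle_rad)) for n in range(num_antennas)]


def normalized_correlation(num_antennas, angle_a, angle_b):
    """Return |e(a)^H e(b)| / N, which equals 1 when the two angles coincide."""
    e_a = steering_vector(num_antennas, angle_a)
    e_b = steering_vector(num_antennas, angle_b)
    inner = sum(x.conjugate() * y for x, y in zip(e_a, e_b))
    return abs(inner) / num_antennas


# Two well-separated angles (radians); the correlation shrinks roughly as 1/N.
for n in (8, 64, 256):
    print(n, normalized_correlation(n, 0.2, 0.9))
```

For this geometry the normalized correlation is bounded by 1/(N |sin(π Δ / 2)|), where Δ is the difference between the sines of the two angles, so closely spaced angles need more antennas before the vectors become nearly orthogonal.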
We next investigate the effects of superimposed pilots on uplink achievable rate. From (4.17),
uplink rate can be written in terms of rate achieved during superimposed pilot-based channel
estimation phase and uplink data-only transmission phase:
\[
I^{\mathrm{UL}} = \delta_s I^{\mathrm{ul,sd}} + \delta_d I^{\mathrm{ul,dd}}, \tag{4.33}
\]
where, again, δs and δd are the ratio of symbols used for channel estimation phase and data
transmission phase, respectively.
Figure 4.4: Angle Estimation for 16 × 4 Antenna Array. [Plot: RMSE for angle estimation (in degrees) vs. SNR (dB); simulation and analytical results for azimuth and elevation, 16 × 4 array with data power 0.3.]
Figure 4.5: Elevation Angle Estimation for 256 Antenna Elements. [Plot: RMSE for angle estimation (in degrees) vs. SNR (dB); simulation and analytical results for 32 × 8 and 16 × 16 arrays with data power 0.3.]
Figure 4.6: Azimuth Angle Estimation for 256 Antenna Elements. [Plot: RMSE for angle estimation (in degrees) vs. SNR (dB); simulation and analytical results for 32 × 8 and 16 × 16 arrays with data power 0.3.]
Figure 4.7: Uplink Rate CDF when δs = 0.1 and δd = 0.9, and SNR= -5 dB. [Plot: CDF vs. data rate (bits/s/Hz); orthogonal pilot transmission and data/pilot power splits 0.1/0.9, 0.3/0.7, 0.5/0.5, 0.7/0.3, 0.9/0.1.]
Figure 4.8: Uplink Rate CDF when δs = 0.9 and δd = 0.1, and SNR= -5 dB. [Plot: CDF vs. data rate (bits/s/Hz); orthogonal pilot transmission and data/pilot power splits 0.1/0.9, 0.3/0.7, 0.5/0.5, 0.7/0.3, 0.9/0.1.]
Figure 4.7 presents cumulative distribution function (CDF) for uplink rate where SNR level
is fixed at -5 dB, and the values for δs and δd are chosen to be 0.1 and 0.9, respectively.
For the uplink channel estimation phase, sum of powers allocated on the pilots and data
symbols is assumed to be unity. The figure shows the curves as we vary pilot power during
the channel estimation phase from 0.1 to 0.9 (i.e., vary the data power from 0.9 to 0.1).
We can observe that as the data power during the channel estimation phase is increased,
the rate performance actually worsens. This is quite counter-intuitive. The reason behind
this phenomenon is as follows: Figure 4.7 depicts the scenario where most of the
symbols are reserved for data-only transmission, while only 10% of the symbols are used
for the channel estimation phase. In order to successfully decode the data symbols, accurate
DoA estimation is required since array steering matrices are utilized as the uplink receive
processing matrices. On the other hand, at the -5 dB SNR level, the uplink DoA estimation
performance is poor, as we can observe from Figures 4.2 to 4.6. Hence, most
of the power during the superimposed channel estimation phase should be allocated to pilot
symbols in order to improve the channel estimation quality. Hence, one can see a trade-off:
even though we are sacrificing the 10% data symbols that are superimposed with the pilots,
it is still better to employ most of the power on pilot symbols for better channel estimation
in order to secure the other 90% of the data symbols.
Figure 4.8 shows the uplink rate CDF where δs = 0.9 and δd = 0.1, and the SNR level is again fixed
at -5 dB. It can be observed that as the pilot power is increased from 0.1 to 0.5, the rate
increases. However, if we further increase the pilot power (i.e., decrease the superimposed
data power), the rate starts decreasing. This is quite different from what we observed in
Figure 4.7. The reason is that, unlike Figure 4.7, in Figure 4.8 most of the data symbols are
used during the superimposed channel estimation phase, and only 10% of the data symbols
are used during the data-only transmission phase. Hence, once a certain level of channel
estimation quality is ensured, we should allocate the rest of the power to the superimposed data
during channel estimation.
Figure 4.9: Uplink Rate CDF when δs = 0.1 and δd = 0.9, and SNR= 20 dB. [Plot: CDF vs. data rate (bits/s/Hz); orthogonal pilot transmission and data/pilot power splits 0.1/0.9, 0.3/0.7, 0.5/0.5, 0.7/0.3, 0.9/0.1.]
In Fig. 4.7 and Fig. 4.8, we also plot the traditional orthogonal pilot transmission scheme,
which corresponds to the case where data power = 0 and pilot power = 1. We can see from
Fig. 4.7 that superimposed pilot strategies (non-zero data power during phase 1 transmission)
do not provide too much advantage over the traditional orthogonal pilot transmission scheme.
This observation is consistent with that in [74]. On the other hand, Fig. 4.8 depicts the
scenario for δs = 0.9, and δd = 0.1. In contrast to Fig. 4.7, it can be seen from Fig. 4.8 that
having a balanced power allocation between data and pilot symbols is important for obtaining
higher uplink rate. In this scenario, the superimposed pilot strategy clearly outperforms the
traditional orthogonal pilot scheme in all SNR regimes.
Figure 4.10: Uplink Rate CDF when δs = 0.9 and δd = 0.1, and SNR= 20 dB. [Plot: CDF vs. data rate (bits/s/Hz); orthogonal pilot transmission and data/pilot power splits 0.1/0.9, 0.3/0.7, 0.5/0.5, 0.7/0.3, 0.9/0.1.]
The uplink rate CDFs for SNR = 20 dB are shown in Figure 4.9 and Figure 4.10. Figure 4.9
depicts the scenario where δs = 0.1 and δd = 0.9, i.e., most of the data symbols are reserved
for data-only transmission. Comparing it with Figure 4.7, we can observe that performance
is worst if the power allocation on the pilots is too low such as pilot power = 0.1 in this
figure. Hence, certain level of power needs to be employed on superimposed pilot symbols for
having better channel estimation. However, as we keep increasing the pilot power allocation,
in contrast to Figure 4.7, the achievable rate performance becomes very similar across power splits. The reason
behind this is that at high SNR, the DoA estimation performance is quite good, as can also
be observed in Figures 4.2 to 4.6, so little pilot power is required to
attain good channel estimation. Figure 4.10, on the other hand, presents the uplink rate
CDFs for the case where δs = 0.9 and δd = 0.1. This depicts the scenario where majority of
the data symbols are used during the channel estimation phase superimposed with the pilot
symbols. In contrast to low-SNR case of figure 4.8, figure 4.10 shows that in the high-SNR
regime, the rate performance increases as we keep increasing power on the superimposed data
symbols during channel estimation phase. Figure 4.11 shows the uplink rate vs SNR plots
for different data-pilot power allocation ratios and for the case where δs = 0.1, and δd = 0.9.
We can observe that the performance improves as we put more power on the superimposed
pilots. However, as the SNR increases the performance-gap among different power splitting
ratios gradually decreases. Figure 4.12 presents the results for uplink rate vs SNR plots for
δs = 0.9 and δd = 0.1. We can observe that the performance improves as the SNR
increases. In this scenario, it can be clearly seen that in the low-SNR regime, while it is detrimental
to invest too little power in the superimposed pilot symbols, it also hurts the data rate if too
much power is invested in the pilot. This observation provides important design intuition
for system architects. In the high-SNR regime, as is evident from Figure 4.12, it is prudent
to allocate more power to the superimposed data symbols.
Figure 4.11: Uplink Rate vs SNR when δs = 0.1 and δd = 0.9. [Plot: rate (bits/s/Hz) vs. SNR (dB); orthogonal pilot transmission and data/pilot power splits 0.1/0.9, 0.3/0.7, 0.5/0.5, 0.7/0.3, 0.9/0.1.]
Figure 4.12: Uplink Rate vs SNR when δs = 0.9 and δd = 0.1. [Plot: rate (bits/s/Hz) vs. SNR (dB); orthogonal pilot transmission and data/pilot power splits 0.1/0.9, 0.3/0.7, 0.5/0.5, 0.7/0.3, 0.9/0.1.]
Figure 4.13: Downlink Rate when the uplink SNR=20 dB. [Plot: CDF vs. data rate (bits/s/Hz) for δs = 0.5, δd = 0.5, uplink SNR = 20 dB, downlink SNR = 10 dB; orthogonal pilot transmission and data/pilot power splits 0.1/0.9, 0.3/0.7, 0.5/0.5, 0.7/0.3, 0.9/0.1.]
Figure 4.14: Downlink Rate when the uplink SNR=5 dB. [Plot: CDF vs. data rate (bits/s/Hz) for δs = 0.5, δd = 0.5, uplink SNR = 5 dB, downlink SNR = 20 dB; orthogonal pilot transmission and data/pilot power splits 0.1/0.9, 0.3/0.7, 0.5/0.5, 0.7/0.3, 0.9/0.1.]
The downlink rate CDFs are shown in figure 4.13 and figure 4.14. Figure 4.13 depicts the
case where equal number of symbols are used for the pilot and data symbols, i.e., δs = 0.5
and δd = 0.5. Downlink SNR level is fixed at 10 dB, while uplink SNR during channel
estimation phase is kept at 20 dB. We can observe that if the uplink pilot power during
channel estimation is too low, the downlink data rate suffers. The reason is that downlink
precoder is based on uplink estimated DoAs. With too little power on the uplink superim-
posed pilot symbols, uplink channel estimation quality degrades, and accordingly, downlink
rate is also affected. On the other hand, at 20 dB uplink SNR, after ensuring a certain
power level, further increasing the allocation for uplink superimposed pilot symbols does not
cause much variation in the downlink rate performance. Figure 4.14 shows the case for a
lower uplink SNR level (5 dB). The symbol split between the superimposed and data-only phases is kept
the same as in Figure 4.13, i.e., δs = 0.5 and δd = 0.5, whereas the downlink SNR level is fixed at
20 dB. In this low uplink SNR case, we can observe that putting more power on the pilot
symbols increases the downlink rate performance since at low uplink SNR scenario, it takes
more power on the pilot symbols in order to achieve a good DoA estimation, which, again,
directly affects downlink rates through the downlink precoder design. Finally, comparing
figure 4.13 and figure 4.14, one can see that as the downlink SNR is increased, as expected,
downlink rate also increases.
Figure 4.15: Total Rate when δs = 0.1. [Plot: rate (bits/s/Hz) vs. superimposed data power for δs = 0.1 and δd = 0.9; curves for (κUL, κDL) = (0.9, 0.1), (0.5, 0.5), (0.1, 0.9).]
Figure 4.16: Total Rate when δs = 0.9. [Plot: rate (bits/s/Hz) vs. superimposed data power for δs = 0.9 and δd = 0.1; curves for (κUL, κDL) = (0.9, 0.1), (0.5, 0.5), (0.1, 0.9).]
The overall achievable rate, $I^{\mathrm{overall}}$, can be expressed in terms of uplink achievable rate, $I^{\mathrm{UL}}$,
and downlink achievable rate, $I^{\mathrm{DL}}$, as follows:
\[
I^{\mathrm{overall}} = \kappa_{\mathrm{UL}} I^{\mathrm{UL}} + \kappa_{\mathrm{DL}} I^{\mathrm{DL}} = \kappa_{\mathrm{UL}}\left(\delta_s I^{\mathrm{ul,sd}} + \delta_d I^{\mathrm{ul,dd}}\right) + \kappa_{\mathrm{DL}} I^{\mathrm{DL}}, \tag{4.34}
\]
where $\kappa_{\mathrm{UL}}$ and $\kappa_{\mathrm{DL}}$ represent the weights/priorities for the uplink and downlink rates, re-
spectively, and $\kappa_{\mathrm{UL}} + \kappa_{\mathrm{DL}} = 1$. In figure 4.15, we present the overall rate vs power allocated
on the superimposed data during channel estimation. Plots for different uplink-downlink
priorities are shown for the scenario where δs = 0.1 and δd = 0.9. We can observe that
as we increase the downlink priority, overall rate increases. Moreover, as we increase the
power allocated on the superimposed data, the rate initially increases slowly. However, after
a certain point, the overall rate keeps dropping. This is the point where uplink channel
estimation quality becomes quite bad since too little power is left for the superimposed pilot
symbols. Overall rate vs superimposed data power for the case where δs = 0.9 and δd = 0.1
is shown in figure 4.16. Comparing with figure 4.15, we can see that overall rate for this case
is worse than that of figure 4.15 for all priority levels. Also, similar to figure 4.15, the rate
drops after the superimposed data power goes above a threshold (0.8 in this case). Finally,
overall rate vs uplink priority level is shown in figure 4.17 and figure 4.18 for the cases where
δs = 0.1 and δs = 0.9, respectively. We can observe that as the uplink priority increases (i.e.,
downlink priority decreases), overall rate also decreases. Moreover, increasing power on the
superimposed data symbols from 0.1 to 0.7 increases the overall data rate.
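The weighting in (4.34) can be sketched in a few lines. The function name and the numerical rates below are hypothetical and are chosen only to show how the priority factor shifts the overall rate.

```python
def overall_rate(kappa_ul, delta_s, rate_sd, rate_dd, rate_dl):
    """Overall achievable rate as in (4.34), with kappa_dl = 1 - kappa_ul and delta_d = 1 - delta_s."""
    if not 0.0 <= kappa_ul <= 1.0 or not 0.0 <= delta_s <= 1.0:
        raise ValueError("kappa_ul and delta_s must lie in [0, 1]")
    kappa_dl = 1.0 - kappa_ul
    delta_d = 1.0 - delta_s
    rate_ul = delta_s * rate_sd + delta_d * rate_dd   # uplink rate, as in (4.33)
    return kappa_ul * rate_ul + kappa_dl * rate_dl


# Hypothetical rates in bits/s/Hz; sweep the uplink priority factor.
for kappa in (0.1, 0.5, 0.9):
    print(kappa, overall_rate(kappa, delta_s=0.1, rate_sd=20.0, rate_dd=60.0, rate_dl=180.0))
```

Because (4.34) is linear in the priority factor, the overall rate decreases with increasing uplink priority whenever the downlink rate exceeds the uplink rate, which matches the trend reported for Figures 4.17 and 4.18.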
Finally, we provide a comparison between DoA-based and traditional strategies for superim-
posed pilot systems. For superimposed pilot systems, a popular strategy in the literature is
to estimate the channel using least squares (LS) and utilize the matched filtering (MF) method
for uplink receive processing. In Fig. 4.19, we compare the proposed DoA-based strategy
with the conventional LS-MF based method for uplink rate performance, where data power
and pilot power are fixed at 0.7 and 0.3, respectively. We can clearly observe that for both
δs = 0.5 and δs = 0.8, the DoA-based strategy outperforms the conventional LS-MF based
method for superimposed pilot-based 3D massive MIMO systems.
Figure 4.17: Total Rate when δs = 0.1. [Plot: rate (bits/s/Hz) vs. uplink priority factor κUL; orthogonal pilot transmission and data/pilot power splits 0.1/0.9, 0.3/0.7, 0.5/0.5, 0.7/0.3.]
Figure 4.18: Total Rate when δs = 0.9. [Plot: rate (bits/s/Hz) vs. uplink priority factor κUL; orthogonal pilot transmission and data/pilot power splits 0.1/0.9, 0.3/0.7, 0.5/0.5, 0.7/0.3.]
Figure 4.19: Uplink Transmission Phases in Superimposed Pilot System. [Plot: uplink rate (bits/s/Hz) vs. SNR (dB) for data power 0.7 and pilot power 0.3; DoA-based and LS-based curves for (δs, δd) = (0.8, 0.2) and (0.5, 0.5).]
4.6 Summary of Chapter 4
In this work, a superimposed pilot based massive FD-MIMO network is introduced and
the corresponding network performance is investigated. Both UL and DL achievable rates
are considered in the analysis to reflect the impact of UL pilot overhead on the network
performance. The performance of UL DoA estimation is analytically characterized for su-
perimposed pilot based massive FD-MIMO networks and is linked to both UL and DL
achievable rate analysis. Both analytical and numerical evaluation suggest that the intro-
duced superimposed pilot based massive FD-MIMO network can significantly reduce the UL
pilot overhead to achieve a good trade-off between UL and DL achievable rates.
Chapter 5
MIMO Broadcast-Beam Optimization
Through DRL
5.1 Network Model and Problem Statement
We consider a cellular network consisting of G BSs and K UEs. We assume the BSs can
have one or multiple sectors, and there are total M sectors in the network, where M ≥ G.
Each sector is equipped with a two dimensional (2D) antenna array whose phases can be
configured so that different array-beam widths (in both elevation and azimuth domain) and
elevation tilt (e-tilt) angle can be updated. Placing 2D antenna array enables the BSs to
beamform in both elevation and azimuth directions, and this is essentially the setup for full
dimension (FD) MIMO systems [2]. The elevation beam-width, φ, azimuth beam-width, ψ,
and e-tilt angle, ζ, constitute the parameter set in constructing the broadcast beams for
each sector. In this work, we focus on optimizing the broadcast beams/sector-wide beams
for cellular network. Let us denote the number of antenna elements in elevation and azimuth
directions by N1 and N2, respectively. Hence, total N = N1N2 number of antenna weights
need to be tuned for generating the FD-MIMO broadcast beams. We can represent the
N1 × N2 antenna weight matrix into a N × 1 weight vector, w, following a vectorization
operation. Each choice of weight vector, w, in fact, consists of a specific choice of φ, ψ, and
ζ. A collection of notations used in this paper is summarized in Table 5.1.
Table 5.1: Notation for System Variables
Variable                                             Notation
No. of BSs                                           G
No. of Sectors                                       M
No. of UEs                                           K
Elevation beam-width                                 φ
Azimuth beam-width                                   ψ
E-tilt angle                                         ζ
No. of antennas at the BSs in elevation direction    N1
No. of antennas at the BSs in azimuth direction      N2
Total no. of antennas                                N
Broadcast signal from m-th BS                        x_m
Broadcast beamforming vector for m-th BS             f_m
Received signal at k-th UE                           y_k
Channel between m-th BS and k-th UE                  h_{m,k}
Beam-pool                                            W
No. of possible beams in beam-pool                   J
j-th beam-weight vector in beam-pool                 w_j
n-th antenna weight in j-th beam                     w_n^j
UEs' SINR threshold for connectivity                 T
Assuming each UE has a single antenna, the downlink broadcast received signal at k-th UE
under m-th cell-sector can be written as
\[
y_k = \mathbf{h}_{m,k}^{T}\mathbf{f}_m x_m + \sum_{\substack{m'=1\\ m'\neq m}}^{M} \mathbf{h}_{m',k}^{T}\mathbf{f}_{m'} x_{m'} + n_k, \tag{5.1}
\]
where $\mathbf{h}_{m,k}$ is the $N \times 1$ channel vector for the channel between m-th sector and the k-th UE,
$x_m$ is the broadcast signal from m-th sector, and $\mathbf{f}_m$ is the corresponding $N \times 1$ broadcast
precoding vector for m-th sector. It can be clearly observed from (5.1) that broadcast beams
from one sector interfere with the beams from other sectors. Hence, in order to maximize the
network coverage, selecting the appropriate broadcast beams for all the sectors is critical.
In this work, we adopt a DRL-based approach where an agent is responsible for selecting
the proper antenna configurations for all sectors [75]. Each BS, for its sectors, has the same
pool of possible antenna weight vectors available, W = {w_1, w_2, . . . , w_J}, where J is the total
number of beam-weight vectors in the pool; w_j = [w^j_1, w^j_2, . . . , w^j_N] is the j-th vector in the
beam pool, and w^j_n is the antenna weight for the n-th antenna element of the j-th
weight vector. Accordingly, each sector chooses its precoder, f_m, from the beam pool, i.e.,
f_m ∈ W. As noted above, each weight vector in the pool corresponds to
a particular choice of elevation and azimuth beam-widths and e-tilt angle. The agent selects
one out of J beam patterns for each sector based on users’ distribution/mobility patterns.
This selection constitutes the action in reinforcement learning.
All BSs in the network transmit sector-specific signals using the wide broadcast beams selected
by the agent. UEs collect measurement results such as Reference Signal Received Power
(RSRP) or Reference Signal Received Quality (RSRQ), and report them to the agent as
observation of the mobile environment. Assuming k-th UE in the network is associated with
m-th sector, from (5.1), the received signal-to-interference-plus-noise ratio (SINR) for k-th
user can be expressed as:
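    \mathrm{SINR}_k = \frac{\left| \mathbf{h}_{m,k}^T \mathbf{f}_m \right|^2}{\sum_{m'=1, m' \neq m}^{M} \left| \mathbf{h}_{m',k}^T \mathbf{f}_{m'} \right|^2 + \sigma^2},    (5.2)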
where σ^2 is the noise variance. In this work, we use the number of connected UEs as a metric
to measure the cell coverage. The number of connected UEs in the network is defined as
the number of UEs whose received SINR is above
a predefined threshold, T. For any user distribution, the objective, hence, is to select the
optimal beam pattern indices for all the sectors under all BSs that maximize the coverage
or total number of connected UEs in the network. The problem can formally be written as:
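    \max_{\mathbf{f}_1, \mathbf{f}_2, \ldots, \mathbf{f}_M} \sum_{k=1}^{K} \mathbb{1}_{\mathrm{SINR}_k > T}    (5.3)
    \text{s.t.} \quad \mathbf{f}_m \in \mathcal{W}, \quad 1 \leq m \leq M,    (5.4)
where the indicator function, 1_{x>T}, is defined as
    \mathbb{1}_{x > T} := \begin{cases} 1, & \text{if } x > T \\ 0, & \text{if } x \leq T. \end{cases}    (5.5)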
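As a minimal illustration of (5.2)-(5.5), the following Python sketch evaluates the SINR of every UE and counts the connected UEs for one fixed choice of precoders. The function name, the array layout, and the `serving` association vector are assumptions made for this example only; the threshold is taken on a linear scale, so a value in dB would first be converted as 10^(dB/10).

```python
import numpy as np

def count_connected_ues(H, F, serving, noise_var, threshold):
    """Count UEs whose SINR exceeds `threshold` (linear scale).

    H: (M, K, N) array, H[m, k] is the channel vector h_{m,k}   (assumed layout)
    F: (M, N) array, F[m] is the precoder f_m chosen for sector m
    serving: (K,) integer array, serving[k] is the sector UE k is attached to
    """
    # gains[m, k] = |h_{m,k}^T f_m|^2, the power UE k receives from sector m
    gains = np.abs(np.einsum("mkn,mn->mk", H, F)) ** 2
    k_idx = np.arange(H.shape[1])
    signal = gains[serving, k_idx]                # numerator of (5.2)
    interference = gains.sum(axis=0) - signal     # sum over m' != m in (5.2)
    sinr = signal / (interference + noise_var)
    return int(np.sum(sinr > threshold)), sinr    # objective (5.3) and the SINRs

# Toy example: 2 sectors, 2 UEs, single-element "arrays"
H = np.array([[[2.0], [1.0]],
              [[1.0], [1.0]]])
F = np.ones((2, 1))
connected, sinr = count_connected_ues(H, F, np.array([0, 1]), noise_var=1.0, threshold=1.0)
```

Evaluating this count for every candidate combination of precoders corresponds to an exhaustive search over (5.3)-(5.4), which is practical only for very small networks.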
The user distribution changes over time, and hence optimal beam patterns that maximize
the number of connected UEs at time t1 may not be the same as those at time t2, where
t1 ≠ t2. The agent, therefore, has to be able to identify users’ mobility pattern, and then
dynamically and autonomously select optimal beams for all the sectors in order to maximize
network coverage. It is to be noted here that we are not using users’ location information to
optimize the beam patterns. In order to minimize the feedback from the network, the agent
uses only users’ RSRP values for the optimization.
In this work, we consider both single-cell and multiple-cell network scenarios. In the single-cell
case, the agent optimizes the broadcast beam for one cell; this represents a noise-limited
environment. In this case, the DRL agent only needs to learn the optimal beam according to the
cell environment, including the UE mobility pattern. On the other hand, in the multiple-cell case,
the broadcast beams for all the cells need to be updated simultaneously; this represents an
interference-limited environment. We address both scenarios under two UE mobility
patterns: first, the periodic
case, where users’ movement changes in a periodic fashion, and second, the Markov case,
where users’ mobility is governed by a transition probability matrix.
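The two mobility patterns can be sketched as sequences of scenario indices. The Python snippet below is only an illustration: the function names, the default of two scenarios, and the example transition matrix are assumptions, while the switching period of 8 time steps matches the periodic setup used later in the simulations.

```python
import numpy as np

def periodic_scenarios(num_steps, num_scenarios=2, period=8):
    """Periodic case: the active scenario changes every `period` time steps."""
    return [(t // period) % num_scenarios for t in range(num_steps)]

def markov_scenarios(num_steps, P, start=0, seed=0):
    """Markov case: the next scenario is drawn from row P[current] of the
    transition probability matrix P (each row must sum to 1)."""
    rng = np.random.default_rng(seed)
    P = np.asarray(P, dtype=float)
    states = [start]
    for _ in range(num_steps - 1):
        states.append(int(rng.choice(len(P), p=P[states[-1]])))
    return states

periodic = periodic_scenarios(20)
# Hypothetical transition matrix that always alternates between two scenarios
alternating = markov_scenarios(4, [[0.0, 1.0], [1.0, 0.0]])
```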
5.2 Learning Framework
In this section, we present the learning framework for MIMO broadcast beam optimization using
DRL as a self-tuning sectorization mechanism. We first briefly describe the background of
DRL, which sets up the foundation for the broadcast beam-learning strategy developed in
subsequent sections.
5.2.1 Beam Learning Framework
Appropriate MIMO broadcast beam selection for cell-sectorization is critical for wireless
network performance optimization. Our objective here is to build a mechanism that automatically
facilitates the selection of the best beams for all the sectors. Moreover, we need
the sectors to autonomously update their beam parameters based on different scenarios or user
distributions, and thereby realize self-tuning sectorization. Towards this, our learning framework is
described as follows:
Specification of design parameters: First of all, the network designer needs to decide on
the objective function that needs to be optimized [76]. For broadcast beam optimization,
an important objective function is the network coverage or total number of connected UEs
in the network. The optimization parameters in this problem are the beam weights for each
sector antenna element. It is necessary to select the optimal beam for each BS from a set of
possible beams. Next, the system designer needs to decide on what inputs, such as RSRP or
RSRQ, are required from the UEs in order to learn their mobility behavior and optimize the
beams. Finally, in order to avoid random broadcast beams during the deployment stage, a
simulation platform based on ray-tracing data is built to train the DRL agent offline.
Learning Engine: An agent or learning engine has the task of learning the UE mobility
pattern and selecting the best beam parameters for each scenario. It takes feedback from
UEs as input, and suggests the optimal beam vectors for all sectors. Updating the beams
based on user distribution by autonomously identifying the underlying mobility pattern requires
training. However, online training is often not desirable because of stringent network management
requirements from the operators. Hence, the training needs to be done offline, and
the training environment has to be as close to the real cellular environment as possible
so that the optimal beams in the training stage are identical to the optimal beams in the
deployment stage; the procedure is presented in detail in the next subsection.
Online Deployment and Occasional Re-training: Once the learning engine is trained
offline, the learned agent is deployed for real-time operation. It enables the BSs to choose
the optimal beams and update the selections based on users’ mobility patterns. Since users’
mobility patterns in the network do not change too frequently, the beam parameters learned
offline can remain unchanged for a long period of time, on the order of weeks or months.
Whenever there is a need to support new scenarios, or any change in mobility patterns
is identified, the learning engine needs to be re-trained offline based on recent data.
The newly learned beam parameters are then pushed to the respective BSs for updated
operation.
5.2.2 Offline Training
Dynamically updating the broadcast beam patterns according to the cellular environment
and user distribution for all cells in real time is intrinsically a difficult problem. Directly
deploying a DRL agent and training it online is not only slow but also costly. During the
online training stage of the DRL, the agent may output some random beams according to
the ε-greedy exploration algorithm. Some of these random beams may not be acceptable
to operators because of degraded network performance. In order to address this issue, we
develop an offline training mechanism using ray-tracing data to train the DRL network before
real deployment. By providing the azimuth angle of arrival, elevation angle of arrival, azimuth
angle of departure, elevation angle of departure, and path loss value of each path for each
location in a cell, ray-tracing can capture the cellular environment well, so that the beam learned
on the offline training platform can match the one needed in online deployment.
The offline training is focused on learning the UE distribution pattern from users’ location
history data. The location data include UEs’ locations and the corresponding time stamps,
and therefore contain the UEs’ mobility pattern information. Together with the ray-tracing
data, which contain the information about the signal propagation environment, UEs’
location history data are used to train the DRL network so that the DRL agent can learn
the best broadcast beam according to both the cellular environment and the UE distribution
pattern. It is to be noted that at each training time step, the BSs select one set of actions,
which moves the agent to a new state, upon which the new reward is computed. Hence, for
each training step, the agent needs to access one set of ray-tracing data. After offline training, the
agent is deployed to provide real-time broadcast beam selection results for all the BSs in the
cellular network. In the following, we describe the detailed steps of offline training.
According to the 3GPP standard on minimization of drive tests (MDT), a BS can configure its
UEs to report measurement results, time stamps, and location information [77]. Therefore,
we assume that UE location history information is available for a cellular network. During
one training step, a batch of time stamps is selected from the location history data, and
the corresponding UEs’ location information is incorporated into the ray-tracing data for every
time stamp, so that the UE distribution at each selected time stamp is combined with the
ray-tracing data. We call the ray-tracing data with UE distribution information
scenario-specific ray-tracing data, and the UEs who report their measurement information at the
selected time stamp selected UEs. Based on the current BSs’ broadcast beams and the scenario-specific
ray-tracing data, the received power for the selected UEs can be calculated, and from it
the network coverage. A reward is provided to the DRL agent based on the coverage,
and the DRL agent accordingly updates its selection of broadcast beams using the
selected optimizer. These offline training steps are repeated until the DRL
agent converges. After the DRL agent converges, it can be deployed in the cellular network
for real-time broadcast beam selection. Details of the DRL agent design are discussed in the next
section. The entire offline training process is depicted in Fig. 5.1 and Algorithm 1.
Figure 5.1: Offline training
Algorithm 1 Offline Training
Input: UE location history data, ray-tracing data of a cellular network
Output: trained DRL agent for broadcast beam selection
STEP 1: Initialization
  Define a pool of candidate antenna patterns;
STEP 2: Learning Best Beams
  while the algorithm has not converged do
    Select a batch of UE locations at different timestamps;
    Incorporate the UE location distribution into the ray-tracing data to create scenario-specific ray-tracing data;
    Calculate the received power for each UE in the scenario-specific ray-tracing data based on the current BSs’ broadcast beams;
    Calculate the network coverage, and calculate a total reward as a function of network coverage;
    Update the DRL agent’s neural weights based on the learning algorithm and reward
  end while
5.3 DRL for Broadcast Beam Optimization
In this section, details on the design of the DRL framework for self-tuning sectorization are
presented. The DRL network is utilized in order to track optimal beams during both
offline training and online deployment. To be specific, a deep Q-network (DQN)-based
architecture is introduced to select MIMO broadcast beams for all sectors in a dynamic
environment. For better stability of the results, we use DQN with experience replay [42, 78].
The agent (decision maker) interacts with the environment by selecting the best broadcast
beam parameters. The DRL has three main components: state, action and reward. The
dynamics between state, action and reward are shown in Fig. 5.2. The agent interacts with the
environment by observing the state of the network and taking the action that maximizes the
reward or network performance metric.
5.3.1 Background of DRL
We consider a reinforcement learning framework where an agent or controller dynamically
interacts with an unknown environment, E, by taking sequential decisions or actions in
discrete time steps. At each time step, t, the agent observes
a state, s_t ∈ S, selects an action, a_t, from a set of allowable actions, A, and receives an
immediate scalar reward, r_t ∈ R(s_t, a_t). Based on its current action, the agent enters a
new state, s_{t+1}. The cumulative discounted reward, R_t, at time step t is defined as
    R_t = \sum_{k=0}^{\infty} \gamma^k r_{t+k},    (5.6)
where γ ∈ (0, 1] is the reward discount factor, which balances the impact of immediate
rewards against that of future rewards. The learning objective is to maximize the expected cumulative
reward at each state, s_t. The Q-value, Q^π(s, a), for a state-action pair, (s, a), is defined as
the expected cumulative discounted reward for taking action a in state s and following a
policy, π, onward, i.e.,
    Q^{\pi}(s, a) = \mathbb{E}[R_t \mid s, a],    (5.7)
where E[·] denotes expectation. Q-learning adopts a value iteration approach to find the
Q-values for each state-action pair, and the optimal value function Q*(s, a) is the one which
provides the maximum action value for state s and action a achievable by following any
policy:
    Q^{*}(s, a) = \max_{\pi} Q^{\pi}(s, a).    (5.8)
Using the Bellman equation [41], the optimal value function in (5.8) can be expressed as
    Q^{*}(s, a) = \mathbb{E}_{s'} \left[ r_t + \gamma \max_{a'} Q^{*}(s', a') \mid s, a \right].    (5.9)
The value iteration algorithm can solve the Bellman equation, and the update rule is given
by
    Q_{i+1}(s, a) \leftarrow \mathbb{E}_{s'} \left[ r_t + \gamma \max_{a'} Q_i(s', a') \mid s, a \right].    (5.10)
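To make the update rule (5.10) concrete, the following Python sketch runs tabular Q-value iteration on a small, fully known Markov decision process. The two-state model, its rewards, and the function name are hypothetical and serve only as an illustration; in the beam optimization problem the transition model is unknown, which is why the Q-values are learned from experience instead.

```python
import numpy as np

def q_value_iteration(P, R, gamma, num_iters=200):
    """Iterate Q_{i+1}(s,a) = R(s,a) + gamma * sum_{s'} P(s'|s,a) max_{a'} Q_i(s',a').

    P: (S, A, S) transition probabilities, R: (S, A) expected immediate rewards.
    """
    Q = np.zeros_like(R, dtype=float)
    for _ in range(num_iters):
        # P @ v contracts the last axis of P with the vector of state values
        Q = R + gamma * (P @ Q.max(axis=1))
    return Q

# Hypothetical 2-state, 2-action model: state 1 is absorbing with reward 2
P = np.array([[[1.0, 0.0], [0.0, 1.0]],
              [[0.0, 1.0], [0.0, 1.0]]])
R = np.array([[0.0, 1.0],
              [2.0, 2.0]])
Q_star = q_value_iteration(P, R, gamma=0.5)
```

For this toy model the fixed point can be checked by hand: Q*(1, ·) = 2/(1 − 0.5) = 4, Q*(0, 1) = 1 + 0.5 · 4 = 3, and Q*(0, 0) = 0.5 · 3 = 1.5.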
In deep Q-learning, the value functions are approximated by a deep neural network parameterized
by the weights, θ:
    Q(s, a; \theta) \approx Q^{\pi}(s, a).    (5.11)
This helps to estimate the Q-values even for a very large state-action space, and reduces
the computational complexity. Next, we describe the state, action, and reward components in detail, and
explain how we model them in the DRL-based MIMO broadcast beam
optimization problem.
State: The state in the introduced RL framework is designed to reflect the network coverage
situation, which can be obtained from UE measurements. To be specific, we design
the state as the connection indicators of UEs in the network (a vector of 1s and 0s). Each UE
reports its status to its attached BS. If a UE’s SINR falls below a predefined threshold, T,
a zero is placed at the element of the vector corresponding to that UE. Otherwise, a one is
placed. Accordingly, a ‘0’ in the state vector represents that the corresponding UE has a
poor connection, and a ‘1’ indicates that the UE has a good connection. The DRL state
representation adopted in this work is depicted in Fig. 5.3.
Figure 5.2: Reinforcement Learning Framework for Beam Optimization
[Figure: the agent sends an action (beam parameters) to the environment (UEs) and receives an observation/state and a reward (no. of connected UEs).]
Figure 5.3: DRL State Representation for Beam Optimization Problem
[Figure: a base station/cell tower and five users with their status entries; Users 1-3 are connected (good connection, signal strength above threshold, status 1) and Users 4-5 are not connected (poor connection, signal strength below threshold, status 0).]
Action: An action of the agent is defined as the selection of a beam index from a pool of
candidate beam patterns. The agent observes the states and the corresponding reward, and takes
the best possible action that maximizes the cumulative discounted future reward for the next
time step. At the beginning of training, the agent explores different actions in an attempt to
learn the best beams for different user distributions. However, once the training phase is
complete, the agent exploits the learned information and only selects the best known actions
that maximize the cumulative reward for each user distribution. We would like to highlight
the fact that a continuous beam space would not be feasible for cellular network coverage
optimization since it can produce many beams which may not be practically realizable at BS
arrays. Hence, the beam pool should consist of a discrete set of beams, which needs to be judiciously
selected based on the particular network under consideration.
Reward: A reward in this work refers to any network performance metric. One way to
design the reward is to use the total number of connected UEs in the network resulting from the
state and the action taken in the previous state. Another approach is to design the reward as
a function of the measurement results, for example, a function of the SINR or RSRP
vector. In this work, we adopt the first approach for designing the reward. It is to be
noted here that maximizing the total number of connected UEs in the network is equivalent to
maximizing the coverage of the cellular network.
The agent’s goal is to maximize the cumulative discounted future reward. The agent gathers
its experiences as tuples, (s_t, a_t, r_t, s_{t+1}), where s_t denotes the current UE connection state; a_t
denotes the action taken in state s_t; r_t is the instantaneous reward obtained by taking action
a_t in state s_t; and s_{t+1} is the next state. The agent stores the history of its
experiences for all time steps in a memory called the experience replay memory [78].
The DRL agent randomly samples mini-batches
of experience from the replay memory for training, and selects an action based on an ε-greedy
policy, i.e., with probability ε, it tries a random action, and with probability (1 − ε), it
selects the best known action so far. The optimum action in a particular state is selected
based on the maximum Q-value [41] for that state. In DQN-based reinforcement
learning, the Q-values are predicted using a deep neural network. The input to the neural network
is the UEs’ connection vector representing the state of the RL environment, and the output is
the Q-values corresponding to all the possible actions, i.e., beam indices from the beam pool.
In the following subsections, we detail the broadcast beam optimization strategies for both
single cell and multiple cell scenarios.
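The ε-greedy rule described above can be sketched in a few lines of Python. The function name and the example Q-values are placeholders chosen for illustration; in the actual agent the Q-values would come from the neural network evaluated on the current UE connection vector.

```python
import numpy as np

def epsilon_greedy(q_values, epsilon, rng):
    """Return a beam index: random with probability epsilon, otherwise the
    index with the largest predicted Q-value."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))   # explore: any beam in the pool
    return int(np.argmax(q_values))               # exploit: best known beam

rng = np.random.default_rng(0)
q_values = [0.1, 0.9, 0.3]                        # hypothetical Q-values for 3 beams
greedy_beam = epsilon_greedy(q_values, epsilon=0.0, rng=rng)
random_beam = epsilon_greedy(q_values, epsilon=1.0, rng=rng)
```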
5.3.2 Broadcast beam optimization for dynamic environment
In this subsection, we present the framework for dynamically optimizing MIMO broadcast
beams, where the RL agent needs to simultaneously control the beam parameters for all
the sectors based on different user distributions. For the single cell case, beam parameters
corresponding to only one sector need to be optimized. This could serve as an example where
a legacy LTE sector is replaced with one massive MIMO unit. The goal is to maximize the
number of connected UEs for different dynamic user distributions. The agent keeps a single
replay memory containing the agent’s experience tuples, and randomly samples from it; this
random sampling from the experience replay memory helps to decorrelate the data [43]. However,
for the multiple-sector case, significant changes to the RL framework are needed
compared to that for single-sector beam optimization. In the multiple-sector environment,
each sector has its own pool of beams or action set. Each sector can hence independently
select its own beam parameters. The setup is similar to that of a multi-agent system [79, 80].
The goal remains the same: to maximize the overall network coverage. This is a challenging
problem in terms of computational tractability. As an illustration, let us consider that
there are m sectors in the network, and each sector has j possible beam patterns (actions)
to select from. Hence, the total number of actions, i.e., all possible combinations of sectors’
beam patterns, becomes j^m, which increases exponentially with the total number of sectors.
If there are 40 base stations, and each has 5 possible actions to choose from, the total number of possible
combinations of beam patterns becomes 5^40, which is an extraordinarily large number, making
it difficult to reach the optimal solution within a reasonable time.
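The size of the joint action space in this example can be checked directly; the short Python snippet below (variable names are ours) computes it with exact integer arithmetic and shows that the count is on the order of 10^27.

```python
num_sectors = 40        # base stations in the example
beams_per_sector = 5    # candidate beam patterns per base station

# Every combination of per-sector beam choices is one joint action
joint_actions = beams_per_sector ** num_sectors
num_digits = len(str(joint_actions))

print(f"joint actions: {joint_actions:.3e} ({num_digits} digits)")
```

Adding one more sector with the same beam pool multiplies this count by 5, which is the exponential growth referred to above.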
One way to find the appropriate broadcast beams for multiple sectors simultaneously is to
use a single large neural network with a large number of output nodes that predicts
the Q-values for all possible j^m actions. However, the total number of training samples
needed to train such a neural network would be extremely large, which may not be feasible
for any practical purpose. In other words, the learning algorithm can almost never
achieve convergence with this architecture, even for a moderate-size cellular network.
To address this issue, we introduce a novel low-complexity algorithm for optimizing the
broadcast beams for multiple sectors where the action space grows only linearly, instead of
exponentially, with the total number of sectors in the network. Let us again assume that there
are m sectors, and each sector has j possible actions (beam-weight set) to choose from.
Unlike the single cell case, for the multiple cell environment, we assume the agent preserves
different replay memories for different sectors. Moreover, we use m different neural networks
for independently computing the Q-values for the m sectors. Each neural network is responsible
for predicting the optimum action for the corresponding sector only. With this architecture, the
number of actions increases only linearly, yet we can still achieve perfect convergence with
reasonably short computation time, which is demonstrated through extensive simulation in
Section 5.4. It is to be noted that the deep Q-learning algorithm proposed in [42] is designed to
create a single NN-based agent that can learn to play Atari games, where the number of valid
actions for the player is quite limited. On the other hand, in terms of training methodology,
the architecture presented in this section for the multiple-sector scenario is scalable with a growing
action space, and in this sense, it provides an architectural generalization of the work in [42].
The details of the architectures for the replay memory and neural networks for multiple-sector
broadcast beam optimization are briefly described next.
Figure 5.4: Replay Buffer architecture for multiple sector case
Replay memory architecture: The replay memory architecture for multiple-sector
broadcast beam optimization is shown in Fig. 5.4. There is a separate buffer for each
sector. The same current state, reward, and next state are stored in all the replay
memories/buffers for the sectors. However, the replay memories differ in the actions taken (beam
indices chosen) by each sector. While all the sectors observe the same current state, s_t,
reward, r_t, and next state, s_{t+1}, the actions stored are different: BS 1’s action is stored
in buffer 1, BS 2’s action is stored in buffer 2, and so on. The rationale behind this buffer
architecture is that states and rewards are network specific, and the same states and rewards
are observed by all sectors. On the other hand, each sector takes its own action, and their
joint actions regulate the overall network state and the corresponding reward.
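The buffer layout described above can be sketched as follows. This is a minimal Python illustration under our own assumptions: the class name, the buffer capacity, and the use of `collections.deque` are not taken from the original implementation.

```python
import random
from collections import deque

class SectorReplayBuffers:
    """One replay buffer per sector. State, reward and next state are shared
    across buffers; only the stored action differs from sector to sector."""

    def __init__(self, num_sectors, capacity=10000):   # capacity is an assumption
        self.buffers = [deque(maxlen=capacity) for _ in range(num_sectors)]

    def store(self, state, actions, reward, next_state):
        # actions[m] is the beam index chosen by sector m at this time step
        for buf, action in zip(self.buffers, actions):
            buf.append((state, action, reward, next_state))

    def sample(self, sector, batch_size):
        buf = self.buffers[sector]
        return random.sample(list(buf), min(batch_size, len(buf)))

memory = SectorReplayBuffers(num_sectors=3)
memory.store(state=(1, 0, 1), actions=[2, 0, 4], reward=2, next_state=(1, 1, 1))
stored_actions = [buf[0][1] for buf in memory.buffers]
```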
Neural Network architecture: For Q-value prediction, a deep convolutional neural network
is used in this work. To make the input suitable for a convolutional
neural network, we transform the (K × 1) UE connection vector into a (K/100 × 100) frame.
Four such frames are stacked together and fed as the input to the neural network for computing
the Q-values. We use three convolutional layers, all with the rectified linear unit (ReLU)
activation function. The first convolutional layer has 32 (8×8) filters. The second and third convolutional
layers have 64 (4×4) filters and 64 (3×3) filters, respectively. Finally, a dense layer with
linear activation function is used as the output layer.
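The pre-processing of the state can be sketched with NumPy as below. The helper names are ours, and two points are assumptions rather than details given in the text: K is taken to be a multiple of 100, and the four frames are stacked along a trailing channel axis.

```python
import numpy as np

def to_frame(connection_vector, width=100):
    """Reshape the (K x 1) UE connection vector into a (K/100 x 100) frame."""
    v = np.asarray(connection_vector).ravel()
    if v.size % width != 0:
        raise ValueError("this sketch assumes K is a multiple of the frame width")
    return v.reshape(v.size // width, width)

def stack_frames(frames):
    """Stack four frames into one network input (channel-last layout assumed)."""
    if len(frames) != 4:
        raise ValueError("expected exactly four frames")
    return np.stack(frames, axis=-1)

K = 300
state = np.arange(K) % 2                     # toy 0/1 connection vector
frame = to_frame(state)                      # shape (3, 100)
network_input = stack_frames([frame] * 4)    # shape (3, 100, 4)
```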
Two such identical neural networks are used in predicting the Q-values. One is used for
computing the running Q-values; this neural network is called the evaluation network. The
other neural network, called the target network, is held fixed for some training duration,
say for P episodes, and every P episodes, the weights of the evaluation neural network are
copied to the target neural network. It has been shown that this two-network
approach for Q-value prediction provides better stability of results at convergence
[43].
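The periodic weight transfer can be sketched as follows, with each network’s weights represented as a dictionary of NumPy arrays. This representation and the function name are assumptions for illustration only; deep learning libraries provide their own mechanisms for copying weights.

```python
import numpy as np

def maybe_sync_target(eval_weights, target_weights, episode, sync_every):
    """Copy the evaluation-network weights into the target network every
    `sync_every` episodes; otherwise leave the target network untouched."""
    if episode % sync_every == 0:
        for name, w in eval_weights.items():
            target_weights[name] = w.copy()   # copy, so later updates do not leak
        return True
    return False

eval_w = {"dense": np.ones(2)}
target_w = {"dense": np.zeros(2)}
synced_early = maybe_sync_target(eval_w, target_w, episode=3, sync_every=5)
target_before = target_w["dense"].copy()
synced_late = maybe_sync_target(eval_w, target_w, episode=5, sync_every=5)
eval_w["dense"] += 1.0                        # further training of the evaluation net
```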
The neural network architecture for predicting the Q-values for multiple sectors is shown in
Fig. 5.5. The depiction is presented for the M-sector case, where M separate neural networks
are used for predicting the Q-values for the M sectors. The input to all neural networks is the
same state vector. The neural networks are identical, and the number of outputs of each neural
network is J. Hence, the size of the action space is JM, instead of J^M, i.e., the total number of actions
grows only linearly with the number of sectors. The optimal action predicted by the Q-values of
neural network 1 is stored in Buffer 1, which corresponds to sector 1. Similarly, the action
predicted by the Q-values of neural network 2 is stored in Buffer 2, which corresponds to
sector 2, and so on. It is to be noted that the neural networks do not share any weight information
during training, and each neural network independently predicts the optimal actions for the
corresponding sector. Hence, there is no additional signaling overhead among the neural
networks. The beam learning procedure for the multiple BS environment is presented in
Algorithm 2.
Figure 5.5: Neural Network architecture for multiple sector case
[Figure: the (K × 1) UE connection vector is fed to M neural networks; the NN for sector m outputs the Q-values for actions 1 to J of sector m.]
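Two steps of the multi-sector procedure lend themselves to a short numerical sketch: the greedy choice of one beam per sector from M independent Q-value vectors, and the regression targets used to train each sector’s network. The Python code below is an illustration under our own naming; the Q-values are hard-coded placeholders rather than network outputs.

```python
import numpy as np

def greedy_joint_action(q_per_sector):
    """q_per_sector: (M, J) array whose row m holds the J Q-values predicted by
    sector m's network. Returns one beam index per sector (M argmax operations
    over J values each, instead of one argmax over J**M joint actions)."""
    return np.asarray(q_per_sector).argmax(axis=1)

def td_targets(rewards, next_q, terminal, gamma):
    """Targets y_j = r_j for terminal transitions, otherwise
    y_j = r_j + gamma * max_a Q_m(s_{j+1}, a), for one sector's mini-batch."""
    rewards = np.asarray(rewards, dtype=float)
    bootstrap = gamma * np.asarray(next_q, dtype=float).max(axis=1)
    return rewards + np.where(np.asarray(terminal), 0.0, bootstrap)

# Hypothetical Q-values for M = 2 sectors and J = 3 beams
joint_action = greedy_joint_action([[0.1, 0.7, 0.2],
                                    [0.9, 0.0, 0.3]])
targets = td_targets(rewards=[5, 3], next_q=[[1.0, 2.0], [4.0, 0.0]],
                     terminal=[False, True], gamma=0.5)
```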
5.4 Simulation Results and Performance Analysis
In this section, we present the simulation results and performance evaluation for self-tuning
sectorization mechanism through DRL-based MIMO broadcast beam optimization. We first
present the results for single sector environment followed by multiple sectors case. Both
periodic and Markov mobility patterns have been considered for the evaluation.
5.4.1 Results for single sector dynamic environment
In this sub-section, we present the performance evaluation for our algorithm for single sector
dynamic environment. The sector is equipped with a two dimensional (2D) antenna array
with 4 antenna elements in both elevation and azimuth directions. The horizontal distance
between BS antenna elements is 0.5 wave-length and the vertical distance between antenna
elements is 1.48 wave-length. We first consider two scenarios or user distributions, and
Algorithm 2 Broadcast Beam Optimization for Multiple Sectors
Input: RSRP measurements from the UEs in the network
Output: Optimum broadcast beam patterns for all sectors that maximize the number of connected UEs
STEP 1: Initialization
  Define a pool of candidate antenna patterns;
  Define the maximum exploration rate, ε_max, minimum exploration rate, ε_min, exploration decay rate, optimizer’s learning rate, α, and reward discount factor, γ;
  Initialize the replay memories, D_m, m = 1, 2, . . . , M.
STEP 2: Optimization of Beam Weights
  for episode = 1, 2, . . . , Z do
    Initialize the state vector at time step 1 as s_1;
    for t = 1, 2, . . . , T′ do
      Sample c from Uniform(0, 1)
      if c ≤ ε then
        Select an action (choose a beam index) for each sector randomly from the beam pool
      else
        for m = 1, 2, . . . , M do
          Select the action for the m-th BS, a^m_t = argmax_{a^m} Q*_m(s_t, a^m; θ_m)
        end for
      end if
      Apply the selected beam patterns on the antenna arrays of the corresponding BSs
      Observe the resulting RL state, s_{t+1}, the UE connection vector.
      Pre-process the state vector into a frame before feeding it to the neural network
      Compute the reward, r_t, which is the number of connected UEs.
      for m = 1, 2, . . . , M do
        Store the experience tuple for the m-th sector, e^m_t = (s_t, a^m_t, r_t, s_{t+1}), in the m-th replay memory, D_m.
        Sample random mini-batches of experience, (s_j, a^m_j, r_j, s_{j+1}), from D_m
        if s_{j+1} is a terminal state then
          Set y^m_j = r_j
        else
          Set y^m_j = r_j + γ max_{a^m} Q_m(s_{j+1}, a^m; θ_m)
        end if
        Perform a gradient descent step on (y^m_j − Q_m(s_j, a^m_j; θ_m))^2
      end for
    end for
  end for
Table 5.2: Simulation Parameters
RL Parameter                                  Specification
Reward Discount Factor, γ                     0.0001
Learning Rate, α                              0.001
Initial exploration probability, ε_max        1.0
Final exploration probability, ε_min          0.000001
Training batch size                           32
Optimizer                                     Adam
Network Parameter                             Specification
Antenna array at BSs                          4 × 4
Antenna separation in azimuth domain          1.48λ
Antenna separation in elevation domain        0.5λ
UEs’ SINR threshold for connectivity, T       -6 dB
BSs height from the ground                    35 m
assume that users switch between Scenario-1 and Scenario-2 periodically every 8 time steps
(see Fig. 5.6). The BS is located at a height of 35 m from the ground, and users are distributed
randomly in the cell. Based on users’ X-, Y-, and Z-coordinates, the two scenarios are defined
as follows: Scenario-1: X ≥ 2600 m, Z ≥ 10 m; Scenario-2: X ≤ 2700 m, Z ≤ 12 m. For
simulation, this partition is used as the users’ mobility pattern. The received power of each UE
is calculated based on ray-tracing data. The noise level is set to −95 dBm, and the SINR threshold
level is kept at 0.1 dB. For a particular user, if the received SINR is above this threshold, we
consider the user to be connected; otherwise, we consider it to be not connected. The
simulation parameters used in this work are summarized in Table 5.2. The general rationale
for selecting the hyper-parameters is described below:
Initial Exploration Rate: At the beginning of training, the agent needs to gather experiences
and explore as much as possible. Accordingly, the initial exploration rate, ε_max, is set to 1,
which corresponds to complete exploration and no exploitation.
Final Exploration Rate: Towards the end of training, the agent should have acquired enough
knowledge about the environment and the underlying user distributions. In this phase, rather
than exploring, the agent should focus on exploitation by taking the already known best
actions for different sectors. Hence, the final exploration rate should be close to zero. However,
in order to avoid the situation where two rewards are very close to each other and the agent
is stuck with the slightly lower reward, the final exploration rate, ε_min, in practice, is not set to
exactly 0. In this work, we set ε_min to a very small number, 0.000001, which corresponds to a
phase of very high exploitation.
Exploration Decay Rate: The exploration decay rate is set based on the total number of
training samples available and the number of training samples used for the initial exploration
phase. Usually, the exploration rate is decreased at regular intervals. Denoting the number
of training samples dedicated to initial observation as T_obs, in this work, we followed the
algorithm below for decaying the exploration rate at time step, t:
  if t ≤ T_obs then
    ε_t = ε_max
  else if t > T_obs and ε_{t−1} > ε_min then
    ε_t = ε_{t−1} − (ε_max − ε_min)/T_expl
  else
    ε_t = ε_min
  end if
Here, T_expl denotes a parameter > 1 controlling the speed of decay. In our work, we set T_expl = 5000 and
T_obs = 1000.
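The decay rule above can be written as a small Python function. The sketch below uses the parameter values stated in this section; the function name is ours, and the `max` clamp is an added safeguard against floating-point overshoot rather than part of the rule itself.

```python
def exploration_schedule(num_steps, eps_max=1.0, eps_min=1e-6,
                         t_obs=1000, t_expl=5000):
    """Return [eps_1, ..., eps_num_steps]: constant eps_max during the first
    t_obs steps, then a linear decrease by (eps_max - eps_min) / t_expl per
    step until eps_min is reached."""
    step = (eps_max - eps_min) / t_expl
    eps, schedule = eps_max, []
    for t in range(1, num_steps + 1):
        if t > t_obs and eps > eps_min:
            eps = max(eps_min, eps - step)   # clamp so eps never drops below eps_min
        schedule.append(eps)
    return schedule

schedule = exploration_schedule(7000)
```

With these values the exploration rate stays at 1 for the first 1000 steps and reaches its floor roughly 5000 steps later.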
Learning Rate: The learning rate, α, determines how fast information acquired from recent
experiences overrides the information from prior experiences. In practice, α is set between 0
and 1. A learning rate of 0 implies that the Q-values are never updated, and hence no learning
takes place. At the other extreme, a learning rate of 1 means the agent only considers
the information from the most recent experience, and ignores any information previously
acquired. In this work, we start training with an initial learning rate of α = 0.001, and every
20000 training steps, we reduce the learning rate by a factor of 10.
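The step-wise reduction of the learning rate described above can be sketched as follows. The function name and the convention that the first drop takes effect exactly at step 20000 are assumptions made for illustration.

```python
def learning_rate(step, initial_lr=1e-3, drop_every=20000, factor=10.0):
    """Step-decay schedule: divide the learning rate by `factor`
    once every `drop_every` training steps (hypothetical helper)."""
    num_drops = step // drop_every  # number of completed decay intervals
    return initial_lr / (factor ** num_drops)


if __name__ == "__main__":
    for s in (0, 19999, 20000, 45000):
        print(s, learning_rate(s))
```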
Reward Discount Factor: The reward discount factor, γ, indicates how the agent values
future rewards. In practice, the value of γ is set between 0 and 1. A value of γ close to zero indicates
that immediate rewards are valued more than distant future rewards. On the other hand,
a value of γ close to 1 implies that long-term cumulative future rewards are more important than the
current reward. Based on our DRL environment, we set the reward discount factor to 0.0001.
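To make the roles of α and γ concrete, the sketch below applies the textbook tabular Q-learning update to a single state-action pair. It is not the neural-network-based update used in this chapter; the function name and the numeric values are hypothetical and serve only to show the limiting cases discussed above.

```python
def q_update(q_sa, reward, max_q_next, alpha, gamma):
    """Textbook tabular Q-learning update for one (state, action) pair.

    q_sa       : current estimate Q(s, a)
    reward     : immediate reward observed after taking a in s
    max_q_next : max over a' of Q(s', a') for the next state s'
    """
    target = reward + gamma * max_q_next   # discounted bootstrap target
    return q_sa + alpha * (target - q_sa)  # move the estimate toward the target


if __name__ == "__main__":
    # Hypothetical numbers: reward = number of connected UEs.
    print(q_update(300.0, 340.0, 350.0, alpha=0.0, gamma=0.5))  # no learning
    print(q_update(300.0, 340.0, 350.0, alpha=1.0, gamma=0.0))  # only the latest reward
```

With γ close to zero, as chosen in this work, the target is dominated by the immediate reward.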
Figure 5.6: Periodic Change in Scenarios
At each time step, the RL agent has 10 actions to choose from, i.e., there are 10 different
beam weight vectors available for the agent. Each of the actions corresponds to a unique
beam pattern. As an illustration, one such beam pattern and the associated elevation and
azimuth cuts are shown in Fig. 5.7. Based on the change in user distribution, the agent
adaptively selects the beam that maximizes the total number of connected UEs. Figure 5.8a
shows the average squared difference (ASD) between the reward (total number of connected
UEs) obtained by the DRL agent and the reward predicted by Oracle:
\[
\mathrm{ASD} = \frac{1}{N'} \sum_{n=1}^{N'} \left( R_n^{\mathrm{Agent}} - R_n^{\mathrm{Oracle}} \right)^2, \tag{5.12}
\]
where $R_n^{\mathrm{Agent}}$ and $R_n^{\mathrm{Oracle}}$ denote the instantaneous reward at the n-th time step obtained by the DRL
agent and the Oracle, respectively; N′ represents the number of time steps used for averaging.
The Oracle is defined as an entity which has complete and perfect knowledge of the environment
and user distribution; it is essentially an exhaustive search method that computes
the maximum attainable reward in any given scenario. Each point in Fig. 5.8a represents the
ASD over N′ = 200 time steps. In Fig. 5.8a, we have also shown the shaded error bar, which
Figure 5.7: Beam pattern corresponding to a typical RL action: (a) beam pattern; (b) elevation and azimuth cuts (power in dB versus azimuth angle in degrees at elevation angle = 0.0°, and power in dB versus elevation angle in degrees at azimuth angle = 0.0°).
represents the maximum difference from the mean value within every N′ time steps. It can
be observed that at the beginning of training, the ASD between the rewards obtained by the RL
agent and the Oracle is quite high. However, as time goes by, the ASD gradually decreases, and
finally, at the completion of training, the rewards from the RL agent converge completely with those
from the Oracle. This is because, at the beginning of training, the agent explores
different actions and fills the memory. During the exploration phase, the agent tries out
all available actions and attempts to learn the optimal beam weights for different user distributions.
Over time, the exploration rate decreases and exploitation increases, i.e., the agent
tends to choose more frequently the best actions known so far that maximize the reward.
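As an illustration of how the ASD in (5.12) can be evaluated from logged rewards, the sketch below computes it over non-overlapping windows of N′ steps. The function names and the toy reward values are hypothetical.

```python
def average_squared_difference(agent_rewards, oracle_rewards):
    """ASD as in (5.12): mean of squared reward differences over the window."""
    if len(agent_rewards) != len(oracle_rewards) or not agent_rewards:
        raise ValueError("reward sequences must be non-empty and of equal length")
    n = len(agent_rewards)
    return sum((a - o) ** 2 for a, o in zip(agent_rewards, oracle_rewards)) / n


def windowed_asd(agent_rewards, oracle_rewards, window=200):
    """One ASD value per consecutive, non-overlapping window of `window` steps."""
    last_start = len(agent_rewards) - window
    return [
        average_squared_difference(agent_rewards[s:s + window],
                                   oracle_rewards[s:s + window])
        for s in range(0, last_start + 1, window)
    ]


if __name__ == "__main__":
    agent = [340, 350, 330, 350]    # toy values: connected UEs per step
    oracle = [350, 350, 340, 350]
    print(windowed_asd(agent, oracle, window=2))
```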
Fig. 5.8b shows the results for average mismatch (AM) in actions (selected beam pattern)
taken by the DRL agent and the Oracle, respectively, where AM is defined as
\[
\mathrm{AM} = \frac{1}{N'} \sum_{n=1}^{N'} \mathbb{1}\left( A_n^{\mathrm{Agent}} \neq A_n^{\mathrm{Oracle}} \right), \tag{5.13}
\]
where $A_n^{\mathrm{Agent}}$ and $A_n^{\mathrm{Oracle}}$ denote the actions selected for the n-th time step by the DRL agent
and the Oracle, respectively, and the indicator function, $\mathbb{1}(A_n^{\mathrm{Agent}} \neq A_n^{\mathrm{Oracle}})$, is defined as
\[
\mathbb{1}\left( A_n^{\mathrm{Agent}} \neq A_n^{\mathrm{Oracle}} \right) =
\begin{cases}
1, & \text{if } A_n^{\mathrm{Agent}} \neq A_n^{\mathrm{Oracle}}, \\
0, & \text{if } A_n^{\mathrm{Agent}} = A_n^{\mathrm{Oracle}}.
\end{cases} \tag{5.14}
\]
Figure 5.8: Results for periodic mobility pattern in a single sector dynamic environment: (a) average squared difference (ASD) between the reward achieved by the DRL agent and the reward obtained by the Oracle; (b) average mismatch (AM) between the actions taken by the DRL agent and the Oracle.
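The average mismatch in (5.13) and (5.14) is simply the fraction of time steps at which the two action sequences disagree. A minimal sketch, with a hypothetical function name and toy action indices:

```python
def average_mismatch(agent_actions, oracle_actions):
    """AM as in (5.13): fraction of steps where agent and Oracle actions differ."""
    if len(agent_actions) != len(oracle_actions) or not agent_actions:
        raise ValueError("action sequences must be non-empty and of equal length")
    mismatches = sum(1 for a, o in zip(agent_actions, oracle_actions) if a != o)
    return mismatches / len(agent_actions)


if __name__ == "__main__":
    print(average_mismatch([1, 2, 2, 1], [1, 2, 1, 1]))
```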
It can be observed that the action mismatch is quite large at the start of training because
of the high exploration rate. However, at the end of the training phase, the actions taken by the DRL
agent and the Oracle converge completely, and the average mismatch reduces to zero. It is to be
noted that the introduced DRL-based self-sectorization method is applicable to any discrete
number of actions. However, as the number of actions grows large, the difference between the optimal
numbers of connected users corresponding to different best beam combinations becomes
smaller. Further increasing the number of actions would not provide much gain, but
may cause longer training time, especially for the multiple sector scenarios.
Figure 5.9: Users' Distribution Patterns for 2 Scenarios: (a) Scenario 1; (b) Scenario 2.
5.4.2 Results for multiple sector dynamic environment:
In this sub-section, we present the simulation results for the multiple sector dynamic environment.
We consider two sectors, each at a height of 35 m above the ground. Each sector has two
possible beam patterns to choose from. Two scenarios are considered, similar to the single sector
case in the previous sub-section. The scenarios with line of sight (LoS) and non-line of sight
(NLoS) UEs are shown in Fig. 5.9. We assume the scenarios change periodically every 8
time steps. The agent is responsible for simultaneously selecting the optimal beam patterns
for both sectors to maximize the number of connected UEs in the network. The average
squared difference in rewards achieved by the agent and the oracle for the multiple sector
scenario is shown in Fig. 5.10a. Similarly to the single cell case, as training progresses, the overall
rewards attained by the agent and the oracle converge completely. In other words, the agent
is able to dynamically optimize the beam patterns for both sectors simultaneously in the
interference environment, and maximize the overall rewards from the network in all scenarios
or user distributions. In Fig. 5.10b, we show the average action mismatch for both sectors.
It can be observed that towards the end of the exploration phase, the average action mismatches between
the sectors and the corresponding Oracles reduce to zero. The instantaneous rewards
and actions at convergence of the algorithm are shown in Fig. 5.11, where, for clarity, we
zoom in on the time steps between 4000 and 4030. We can observe that the scenarios change every
8 time steps and that the maximum number of connected UEs is different for the two scenarios.
The optimal strategy for sector-1 is to select action 1 while in scenario 1, and action 2
while in scenario 2. On the other hand, the optimal strategy for sector-2 is to select action 2 for
both scenarios. In reinforcement learning, it is, in general, difficult to obtain convergence
if the reward values are too close. However, we can observe from Fig. 5.11 that the DRL
agent can completely converge with the oracle and take the corresponding best actions even
when the reward values for scenario 1 and scenario 2 differ. This indicates the accuracy of the
self-tuning sectorization strategy developed in this work.
Figure 5.10: Results for periodic mobility pattern in a multiple sector dynamic environment: (a) average squared difference (ASD) between the reward achieved by the DRL agent and the reward obtained by the Oracle; (b) average mismatch (AM) between the actions taken by the DRL agent for each sector and the corresponding Oracles (AM from Oracle versus simulation steps (×200), for Sector-1 and Sector-2 actions).
Figure 5.11: Instantaneous rewards (a) and instantaneous actions (b) at convergence for the multiple sector environment and periodic user-mobility pattern. Panel (a) plots the number of connected UEs and panel (b) the action index against simulation steps 4000 to 4030, for the DRL agent and the Oracle.
Fig. 5.10 and Fig. 5.11 are based on our introduced neural network architecture, where the Q-values
corresponding to each sector are predicted by a separate neural network. For comparison,
in Fig. 5.12, we present the global solution, where a single neural network is responsible
for predicting the Q-values for all sectors. Hence, if there are 4 actions available for each
sector, for a 2-sector environment, the neural network needs to predict Q-values for 4^2 = 16
actions. We can observe from Fig. 5.12 that, for the single NN-based architecture, more than
7500 time steps are required for the DRL agent to converge with the Oracle. In comparison, for the
introduced NN architecture in Fig. 5.10, the DRL agent converges with the Oracle within only
about 4000 time steps. These results demonstrate the training advantage of the introduced
neural network architecture, especially for large action spaces, over the traditional state-of-the-art
DRL training method [42].
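The difference in output dimensionality between the two architectures compared above can be illustrated with a small counting sketch. The helper names are hypothetical, and the index decoding is just one possible convention for enumerating joint actions; neither is taken from the implementation used in this work.

```python
def joint_q_outputs(actions_per_sector, num_sectors):
    """Outputs of a single network that scores every joint action."""
    return actions_per_sector ** num_sectors


def per_sector_q_outputs(actions_per_sector, num_sectors):
    """Total outputs when each sector has its own network (or head)."""
    return actions_per_sector * num_sectors


def decode_joint_action(index, actions_per_sector, num_sectors):
    """Map a joint-action index to one action per sector (base-A digits).

    Assumed convention: sector 0 is the least significant digit.
    """
    actions = []
    for _ in range(num_sectors):
        index, a = divmod(index, actions_per_sector)
        actions.append(a)
    return actions


if __name__ == "__main__":
    print(joint_q_outputs(4, 2), per_sector_q_outputs(4, 2))
    print(decode_joint_action(13, 4, 2))
```

The joint output size grows exponentially with the number of sectors, whereas the per-sector total grows only linearly, which is consistent with the longer convergence time observed for the single-network solution.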
Figure 5.12: Results for the global solution for periodic mobility pattern in a multiple sector dynamic environment: (a) average squared difference (ASD) between the reward achieved by the DRL agent and the reward obtained by the Oracle; (b) average mismatch (AM) between the actions taken by the DRL agent for each sector and the corresponding Oracles (AM from Oracle versus simulation steps (×200), for Sector-1 and Sector-2 actions).
In general, if the action space grows large, more training time is required for the algorithm to
converge. However, the exact training time required can be determined through experimentation.
For example, Fig. 5.13 shows the results on the average squared difference (ASD) in reward
(number of connected UEs) obtained by the DRL agent and the Oracle for a single sector.
In Fig. 5.13, for a fixed set of hyper-parameters, we compare how the size of the
action space affects the convergence time, where we vary the number of available actions
(possible beam patterns) from 2 to 10. We can observe that, for the single sector case,
training converges within about 3000 time steps for all action space sizes from 2 to 10.
On the other hand, Fig. 5.14 presents the ASD between the DRL agent and the Oracle for the
two-cell dynamic environment, where the comparison of convergence time is shown for
2 and 4 actions. Unlike the performance for the single cell cases in Fig. 5.13, for
the multiple sector case, we can observe that as the number of actions doubles, from 2 to
4, the DRL agent requires more time to converge with the Oracle. However, from
these experiments, it is notable that even though the action space increases linearly, the
required convergence time does not increase proportionally, i.e., the training time does not need to
be doubled when the action space is doubled.
Figure 5.13: Results for average squared difference (ASD) in reward between the DRL agent and the Oracle for periodic mobility pattern in a single sector dynamic environment. ASDs for different sizes of the action space are plotted in panels (a) to (e): 2, 4, 6, 8, and 10 actions, respectively.
Figure 5.14: Results for ASD in reward between the DRL agent and the Oracle for periodic mobility pattern in a multiple sector dynamic environment. ASDs for different sizes of the action space are plotted in panels (a) and (b): 2 and 4 actions, respectively.
5.4.3 Multi-sectors environment with Markovian mobility pattern
In this sub-section, we present the performance analysis for DRL-based self-tuning beamforming
in a multiple sector environment for the case where user distributions alternate
between two scenarios following a Markovian mobility pattern. It is to be noted here that,
in general, users' mobility patterns have some intrinsic regularity. For example, users can be
clustered more in commercial areas during the daytime, while they move to residential areas
in the evening. Hence, the periodic mobility patterns considered in the previous two sub-sections
rather closely depict the actual mobility pattern in a cellular network. Nevertheless, in this
sub-section, we consider Markovian mobility in order to verify the robustness of the developed
self-tuning sectorization algorithm for the extreme case when the users' mobility pattern
does not have any regularity and users move between different scenarios in a random fashion.
We consider two scenarios defined similarly to the ones in Section 5.4.1, and assume the
users' locations switch between these two scenarios with transition probabilities governed by
the state transition diagram shown in Fig. 5.15. Moreover, we consider two sectors, each
having two possible beam patterns to choose from for each scenario. Fig. 5.16a shows the
average squared difference between the rewards attained by the RL agent and the oracle for the Markov
mobility pattern. We can observe that, similarly to the periodic cases presented in the previous
two subsections, the RL agent does converge with the oracle even for probabilistic mobility, and the
ASD goes to zero after the training phase. The average mismatch in actions between the sectors
and the corresponding oracles is shown in Fig. 5.16b. It can be seen that the average mismatch
in actions for both sectors reduces to zero at the end of the training phase. Finally, the
instantaneous rewards achieved and the actions taken by the sectors at convergence of the
algorithm are shown in Fig. 5.17, which, again, indicates perfect convergence for the Markov
mobility pattern in the multiple cell environment.
Figure 5.15: State Transition Diagram for Markov Mobility (states Scenario 1 and Scenario 2; transition probabilities labeled 0.9, 0.1, 0.6, and 0.4).
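A two-state Markov chain of this kind can be simulated as sketched below. The assignment of the four edge labels (0.9, 0.1, 0.6, 0.4) to specific transitions cannot be recovered from the text alone, so the matrix `P` is an assumption made purely for illustration, as are the function names.

```python
import random

# ASSUMPTION: stay in Scenario 1 w.p. 0.9 (leave w.p. 0.1) and stay in
# Scenario 2 w.p. 0.6 (leave w.p. 0.4). Only the four values come from the
# diagram; which edge carries which value is a guess.
P = {1: {1: 0.9, 2: 0.1},
     2: {1: 0.4, 2: 0.6}}


def simulate_scenarios(num_steps, start=1, seed=0):
    """Draw a scenario sequence of length num_steps from the chain P."""
    rng = random.Random(seed)
    state = start
    path = []
    for _ in range(num_steps):
        path.append(state)
        state = 1 if rng.random() < P[state][1] else 2
    return path


def stationary_distribution():
    """Closed-form stationary distribution of a two-state chain."""
    a = P[1][2]  # probability of leaving Scenario 1
    b = P[2][1]  # probability of leaving Scenario 2
    return {1: b / (a + b), 2: a / (a + b)}


if __name__ == "__main__":
    print(simulate_scenarios(20))
    print(stationary_distribution())
```

Unlike the periodic case, the dwell time in each scenario is random, so the agent cannot rely on a fixed switching period.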
Figure 5.16: Results for Markov mobility pattern in a multiple sector dynamic environment: (a) average squared difference (ASD) between the reward achieved by the DRL agent and the reward obtained by the Oracle; (b) average mismatch (AM) between the actions taken by the DRL agent for each sector and the corresponding Oracles (AM from Oracle versus simulation steps (×200), for Sector-1 and Sector-2 actions).
Figure 5.17: Instantaneous rewards (a) and instantaneous actions (b) at convergence for the multiple sector environment and Markov user mobility pattern. Panel (a) plots the reward and panel (b) the action index against simulation steps 6000 to 6030, for the DRL agent and the Oracle.
5.5 Chapter Summary
In this work, we have developed a framework for self-tuning cell sectorization through MIMO
broadcast beam optimization using deep reinforcement learning. To be specific, we have
introduced learning strategies for both single sector and multiple sector environments with
dynamic user distributions. The introduced solutions can autonomously and adaptively update
the RF parameters based on the changes in user distributions. Simulation results show
that the DRL-based method completely converges with the Oracle-suggested optimal solutions
for both periodic and Markovian user mobility patterns.
Chapter 6
Conclusion
Accurate DL CSI is critical for massive MIMO to realize the promised throughput gain. In
this dissertation, we introduced optimal DL MIMO precoding and power allocation strategies
for multi-cell multi-user massive FD-MIMO networks based on UL DoA estimation at the
BS. The UL DoA estimation error for such a network has been analytically characterized and
has been incorporated into the proposed MIMO precoding and power allocation strategy.
Simulation results suggest that the proposed strategy outperforms existing BD-ZF based
MIMO precoding strategies, which require full CSI at the BS. This work sheds light on
system design for massive FD-MIMO communications, which is critical for 5G and Beyond
5G cellular networks. Moreover, based on parametric channel modeling, we have proposed
a framework for estimating parameters for 3D massive MIMO OFDM systems and analytically
characterized the estimation performance. The empirical results on
parameter estimation match the analytical ones asymptotically. Moreover, we have
shown that parametric channel estimation outperforms MMSE-based channel estimation in
terms of correlation between the estimated channel and the underlying channel.
AI is the next frontier for future wireless cellular networks. In this dissertation, we have
developed a framework for MIMO broadcast beam optimization using deep reinforcement
learning. To be specific, we have proposed learning strategies for both single cell and multiple
cell environments with dynamic user distributions. The proposed solutions can autonomously
and adaptively update the RF parameters based on the changes in user distributions. Simulation
results show that the proposed DRL-based method completely converges with the
Oracle-suggested optimal solutions for both periodic and Markovian user mobility patterns.
Appendices
Appendix A
Proofs for Chapter 2, Chapter 3
A.1 Proof of Theorem 2.4
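The effect of pilot contamination on the MSE of DoA estimation is given by
\[
\mathbb{E}\left\{(\Delta v_{n_i,i,\ell})^2\right\}_1 = \frac{1}{2}\Big( \mathbf{r}^{(v)H}_{n_i,i,\ell}\,\mathbf{W}^{*}_{n_i,i,\mathrm{mat}}\,\mathbf{R}^{(\mathrm{fba})T}_{i,1}\,\mathbf{W}^{T}_{n_i,i,\mathrm{mat}}\,\mathbf{r}^{(v)}_{n_i,i,\ell}
- \mathrm{Re}\left\{ \mathbf{r}^{(v)T}_{n_i,i,\ell}\,\mathbf{W}_{n_i,i,\mathrm{mat}}\,\mathbf{C}^{(\mathrm{fba})}_{i,1}\,\mathbf{W}^{T}_{n_i,i,\mathrm{mat}}\,\mathbf{r}^{(v)}_{n_i,i,\ell} \right\}\Big). \tag{A.1}
\]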
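Let us now denote
\[
\boldsymbol{\beta}_{n_i,i,\ell} = \mathbf{V}^{\mathrm{sig}}_{n_i,i}\,\left(\boldsymbol{\Sigma}^{\mathrm{sig}}_{n_i,i}\right)^{-1}\mathbf{q}_\ell, \tag{A.2}
\]
\[
\boldsymbol{\alpha}_{v,n_i,i,\ell} = \left( \mathbf{p}^T_\ell \left(\mathbf{J}^{(v)}_1 \mathbf{U}^{\mathrm{sig}}_{n_i,i}\right)^{+} \left(\mathbf{J}^{(v)}_2 / e^{j v_{n_i,i,\ell}} - \mathbf{J}^{(v)}_1\right) \left(\mathbf{U}^{\mathrm{noise}}_{n_i,i}\,\mathbf{U}^{\mathrm{noise}\,H}_{n_i,i}\right) \right)^{T}. \tag{A.3}
\]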
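Using (2.19) and (2.20), we have $\mathbf{W}^{T}_{n_i,i,\mathrm{mat}}\,\mathbf{r}^{(v)}_{n_i,i,\ell} = \boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}$. The MSE in (A.1) becomes
\[
\mathbb{E}\left\{(\Delta v_{n_i,i,\ell})^2\right\}_1 = \frac{1}{2}\Big( \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{H} \mathbf{R}^{(\mathrm{fba})T}_{i,1} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
- \mathrm{Re}\left\{ \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{T} \mathbf{C}^{(\mathrm{fba})}_{i,1} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \right\}\Big). \tag{A.4}
\]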
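It can be easily verified that $\boldsymbol{\alpha}_{v,n_i,i,\ell}$ can be written as
\[
\begin{aligned}
\boldsymbol{\alpha}^{T}_{v,n_i,i,\ell} &= \mathbf{c}^T_\ell \left( \left(\mathbf{J}_{v,2}\mathbf{A}_{n_i,i}\right)^{+} \mathbf{J}_{v,2} - \left(\mathbf{J}_{v,1}\mathbf{A}_{n_i,i}\right)^{+} \mathbf{J}_{v,1} \right) \\
&= \frac{1}{(M_2-1)M_1} \Big[ -1, -e^{-j u_{n_i,i,\ell}}, \ldots, -e^{-j(M_1-1)u_{n_i,i,\ell}}, 0, \ldots, 0, \\
&\qquad e^{-j(M_2-1)v_{n_i,i,\ell}}, e^{-j((M_2-1)v_{n_i,i,\ell}+u_{n_i,i,\ell})}, \ldots, e^{-j((M_2-1)v_{n_i,i,\ell}+(M_1-1)u_{n_i,i,\ell})} \Big].
\end{aligned} \tag{A.5}
\]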
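Next, in order to obtain the expression for $\boldsymbol{\beta}_{n_i,i,\ell}$, we need to perform the SVD of the
perturbation-free signal in (2.11):
\[
\sqrt{\Lambda_{n_i,i}} \left[ \mathbf{A}_{n_i,i}\mathbf{D}_{n_i,i}\mathbf{B}^{H}_{n_i,i}(k) \;\; \boldsymbol{\Pi}_{N_r}\mathbf{A}^{*}_{n_i,i}\mathbf{D}^{*}_{n_i,i}\mathbf{B}^{T}_{n_i,i}(k)\boldsymbol{\Pi}_{N_t} \right]
= \mathbf{A}_{n_i,i}\,\mathrm{diag}\{\mathbf{b}_{n_i,i}\} \left[ \tilde{\mathbf{B}}^{H}_{n_i,i}(k) \;\; \boldsymbol{\Gamma}_{n_i,i}\tilde{\mathbf{B}}^{T}_{n_i,i}(k)\boldsymbol{\Pi}_{N_t} \right],
\]
where
\[
\boldsymbol{\Gamma}_{n_i,i} = \mathrm{diag}\left[ e^{-j((M_1-1)u_{n_i,i,0}+(M_2-1)v_{n_i,i,0})}, \ldots, e^{-j((M_1-1)u_{n_i,i,L_{n_i,i}-1}+(M_2-1)v_{n_i,i,L_{n_i,i}-1})} \right],
\]
\[
\tilde{\mathbf{B}}^{H}_{n_i,i}(k) = \mathrm{diag}\left\{ e^{j\phi'_{n_i,i,0}}, \ldots, e^{j\phi'_{n_i,i,L_{n_i,i}-1}} \right\} \mathbf{B}^{H}_{n_i,i}(k), \quad \text{and}
\]
\[
\mathbf{b}_{n_i,i} = \left[ b_{n_i,i,0}, \ldots, b_{n_i,i,L_{n_i,i}-1} \right],
\]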
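where $b_{n_i,i,\ell}$ and $\phi'_{n_i,i,\ell}$ are the amplitude and the phase of the channel gain $\alpha_{n_i,i}(\ell)$, respectively.
Accordingly, based on Lemma 2.3, we can obtain
\[
\mathbf{U}^{\mathrm{sig}}_{n_i,i} = \frac{1}{\sqrt{N_r}}\,\mathbf{A}_{n_i,i},
\]
\[
\boldsymbol{\Sigma}^{\mathrm{sig}}_{n_i,i} = \sqrt{2N_rN_t}\,\sqrt{\Lambda_{n_i,i}}\;\mathrm{diag}\{\mathbf{b}_{n_i,i}\}, \quad \text{and}
\]
\[
\mathbf{V}^{\mathrm{sig}\,H}_{n_i,i} = \frac{1}{\sqrt{2N_t}} \left[ \tilde{\mathbf{B}}^{H}_{n_i,i}(k) \;\; \boldsymbol{\Gamma}_{n_i,i}\tilde{\mathbf{B}}^{T}_{n_i,i}(k)\boldsymbol{\Pi}_{N_t} \right].
\]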
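In [19], the vector $\boldsymbol{\beta}_{n_i,i,\ell}$ is given as $\boldsymbol{\beta}_{n_i,i,\ell} = \mathbf{V}^{\mathrm{sig}}_{n_i,i}\left(\boldsymbol{\Sigma}^{\mathrm{sig}}_{n_i,i}\right)^{-1}\mathbf{U}^{\mathrm{sig}\,H}_{n_i,i}\mathbf{A}_{n_i,i}\mathbf{c}_\ell$. Now substituting here
the expressions of $\mathbf{U}^{\mathrm{sig}}_{n_i,i}$, $\boldsymbol{\Sigma}^{\mathrm{sig}}_{n_i,i}$, and $\mathbf{V}^{\mathrm{sig}}_{n_i,i}$, we obtain:
\[
\boldsymbol{\beta}_{n_i,i,\ell} = \frac{1}{b_{n_i,i,\ell}\,\sqrt{\Lambda_{n_i,i}}\,\sqrt{2N_t}}\;\mathbf{V}^{\mathrm{sig}}_{n_i,i}\,\mathbf{c}_\ell. \tag{A.6}
\]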
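Hence, the expression $\left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)$ in (A.4) can be written as
\[
\begin{aligned}
\boldsymbol{\beta}_{\ell} \otimes \boldsymbol{\alpha}_{v,\ell}
&= \frac{1}{b_{n_i,i,\ell}\,2N_t\sqrt{\Lambda_{n_i,i}}}
\begin{bmatrix}
e^{-j\phi'_{n_i,i,\ell}}\,\mathbf{e}_{t,n_i,i,k}(\ell) \\
e^{j\phi'_{n_i,i,\ell}}\,e^{j((M_1-1)u_{n_i,i,\ell}+(M_2-1)v_{n_i,i,\ell})}\,\boldsymbol{\Pi}_{N_t}\mathbf{e}^{*}_{t,n_i,i,k}(\ell)
\end{bmatrix} \otimes \boldsymbol{\alpha}_{v,\ell} \\
&= \frac{1}{b_{n_i,i,\ell}\,2N_t\sqrt{\Lambda_{n_i,i}}}
\begin{bmatrix}
e^{-j\phi'_{n_i,i,\ell}}\,\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell} \\
e^{j\phi'_{n_i,i,\ell}}\,e^{j((M_1-1)u_{n_i,i,\ell}+(M_2-1)v_{n_i,i,\ell})}\,\boldsymbol{\Pi}_{N_t}\mathbf{e}^{*}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}
\end{bmatrix}.
\end{aligned} \tag{A.7}
\]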
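Now, using equation (2.23) and (A.7), the first term in (A.4) can be written as
\[
\begin{aligned}
&\left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{H} \mathbf{R}^{(\mathrm{fba})T}_{i,1} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \\
&\quad= \frac{1}{b^2_{n_i,i,\ell}\,4N_t^2\,\Lambda_{n_i,i}} \Big( \left(\mathbf{e}^{H}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}^{H}_{v,n_i,i,\ell}\right) \mathbf{R}^{T}_{i,1} \left(\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \\
&\qquad+ \left(\mathbf{e}^{T}_{t,n_i,i,k}(\ell)\boldsymbol{\Pi}_{N_t} \otimes \boldsymbol{\alpha}^{H}_{v,n_i,i,\ell}\right) \boldsymbol{\Pi}_{N_rN_t}\mathbf{R}^{H}_{i,1}\boldsymbol{\Pi}_{N_rN_t} \left(\boldsymbol{\Pi}_{N_t}\mathbf{e}^{*}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \Big).
\end{aligned} \tag{A.8}
\]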
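Now, using (2.15), and after some simplification, we have
\[
\left(\mathbf{e}^{H}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}^{H}_{v,n_i,i,\ell}\right) \mathbf{R}^{T}_{i,1} \left(\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{1}{(M_2-1)^2M_1^2} \sum_{\substack{g=0 \\ g\neq i}}^{G-1} \left(\sqrt{\Lambda_{n_g,i}}\right)^2 X_{n_g,i}\,Y_{n_g,i} \sum_{m=1}^{L} |\alpha_{n_g,i}(m)|^2, \tag{A.9}
\]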
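where $X_{n_g,i}$ and $Y_{n_g,i}$ are given by
\[
X_{n_g,i} = \mathbb{E}_{\psi}\left| 1 + e^{-j(\omega_{n_i,i,\ell}-\omega_{n_g,i,m})} + \ldots + e^{-j(N_t-1)(\omega_{n_i,i,\ell}-\omega_{n_g,i,m})} \right|^2, \tag{A.10}
\]
\[
Y_{n_g,i} = \mathbb{E}_{\theta,\phi}\left| \left( 1 + e^{j(u_{n_i,i,\ell}-u_{n_g,i,m})} + \ldots + e^{j(M_1-1)(u_{n_i,i,\ell}-u_{n_g,i,m})} \right) \left( e^{j v_{n_i,i,\ell}} e^{-j v_{n_g,i,m}} - 1 \right) \right|^2, \tag{A.11}
\]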
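for $m = 0, \ldots, L_{n_g,i}-1$, and $\mathbb{E}_{\psi}$ and $\mathbb{E}_{\theta,\phi}$ denote, respectively, expectations with respect to the
DoD and the DoAs. Now, similarly to (A.9), we also have
\[
\left(\mathbf{e}^{T}_{t,n_i,i,k}(\ell)\boldsymbol{\Pi}_{N_t} \otimes \boldsymbol{\alpha}^{H}_{v,n_i,i,\ell}\right) \boldsymbol{\Pi}_{N_rN_t}\mathbf{R}^{H}_{i,1}\boldsymbol{\Pi}_{N_rN_t} \left(\boldsymbol{\Pi}_{N_t}\mathbf{e}^{*}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{1}{(M_2-1)^2M_1^2} \sum_{\substack{g=0 \\ g\neq i}}^{G-1} \left(\sqrt{\Lambda_{n_g,i}}\right)^2 X_{n_g,i}\,Y'_{n_g,i} \sum_{m=1}^{L} |\alpha_{n_g,i}(m)|^2, \tag{A.12}
\]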
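where
\[
Y'_{n_g,i} = \mathbb{E}_{\theta,\phi}\left| \left( e^{j(M_1-1)u_{n_g,i,m}} + e^{j u_{n_i,i,\ell}} e^{j(M_1-2)u_{n_g,i,m}} + \ldots + e^{j(M_1-1)u_{n_i,i,\ell}} \right) \times \left( e^{j(M_2-1)v_{n_i,i,\ell}} - e^{j(M_2-1)v_{n_g,i,m}} \right) \right|^2, \tag{A.13}
\]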
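for $m = 0, \ldots, L_{n_g,i}-1$. Now, using (A.9) and (A.12), we can write (A.8) as
\[
\left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{H} \mathbf{R}^{(\mathrm{fba})T}_{i,1} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{1}{b^2_{n_i,i,\ell}\,4N_t^2\,\Lambda_{n_i,i}}\,\frac{1}{(M_2-1)^2M_1^2} \sum_{\substack{g=0 \\ g\neq i}}^{G-1} \left(\sqrt{\Lambda_{n_g,i}}\right)^2 X_{n_g,i} \sum_{m=0}^{L_{n_g,i}-1} |\alpha_{n_g,i}(m)|^2 \left( Y_{n_g,i} + Y'_{n_g,i} \right). \tag{A.14}
\]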
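Similarly, we also have
\[
\left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{T} \mathbf{C}^{(\mathrm{fba})}_{i,1} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{1}{b^2_{n_i,i,\ell}\,2N_t^2\,\Lambda_{n_i,i}}\,\frac{1}{(M_2-1)^2M_1^2}\,e^{j\Phi} \sum_{\substack{g=0 \\ g\neq i}}^{G-1} \left(\sqrt{\Lambda_{n_g,i}}\right)^2 X_{n_g,i}\,Y_{n_g,i} \sum_{m=0}^{L_{n_g,i}-1} |\alpha_{n_g,i}(m)|^2, \tag{A.15}
\]
where $\Phi = \left((M_1-1)u_{n_i,i,\ell} + (M_2-1)v_{n_i,i,\ell}\right)$, and $Y_{n_g,i}$ is given in (2.29). Finally, plugging the
expressions from (A.14) and (A.15) into (A.4) completes the proof.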
A.2 Proof of Theorem 2.7
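Similarly to (A.4), the MSE due to intra-cell interference can be written as
\[
\mathbb{E}\left\{(\Delta v_{n_i,i,\ell})^2\right\}_2 = \frac{1}{2}\Big( \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{H} \mathbf{R}^{(\mathrm{fba})T}_{i,2} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
- \mathrm{Re}\left\{ \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{T} \mathbf{C}^{(\mathrm{fba})}_{i,2} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \right\}\Big). \tag{A.16}
\]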
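Using (2.23) for m = 2, the second term in (A.16) can be expressed as
\[
\begin{aligned}
&\left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{T} \mathbf{C}^{(\mathrm{fba})}_{i,2} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \\
&\quad= \frac{1}{b^2_{n_i,i,\ell}\,4N_t^2\,\Lambda_{n_i,i}}\,e^{j\Phi} \Big[ \left(\mathbf{e}^{H}_{t,n_i,i,k}(\ell)\boldsymbol{\Pi}_{N_t} \otimes \boldsymbol{\alpha}^{T}_{v,n_i,i,\ell}\right) \boldsymbol{\Pi}_{N_rN_t}\mathbf{R}^{*}_{i,2} \left(\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \\
&\qquad+ \left(\mathbf{e}^{T}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}^{T}_{v,n_i,i,\ell}\right) \mathbf{R}_{i,2}\boldsymbol{\Pi}_{N_rN_t} \left(\boldsymbol{\Pi}_{N_t}\mathbf{e}^{*}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \Big].
\end{aligned} \tag{A.17}
\]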
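Now, using (2.16), and after some simplifications, we can write the first term in (A.17) as
\[
\begin{aligned}
&\left(\mathbf{e}^{H}_{t,n_i,i,k}(\ell)\boldsymbol{\Pi}_{N_t} \otimes \boldsymbol{\alpha}^{T}_{v,n_i,i,\ell}\right) \boldsymbol{\Pi}_{N_rN_t}\mathbf{R}^{*}_{i,2} \left(\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \\
&\quad= \rho_1^2\,\mathbb{E}_{\alpha,\theta,\phi,\psi} \sum_{\substack{j=0 \\ j\neq n}}^{J-1} \left(\sqrt{\Lambda_{j_i,i}}\right)^2 \left(\mathbf{e}^{H}_{t,n_i,i,k}(\ell)\mathbf{1}_{N_t}\mathbf{B}_{j_i,i}(k) \otimes \boldsymbol{\alpha}^{T}_{v,n_i,i,\ell}\boldsymbol{\Pi}_{N_r}\mathbf{A}^{*}_{j_i,i}\right) \mathrm{vec}\{\mathbf{D}_{j_i,i}\}^{*} \\
&\qquad\times \mathrm{vec}\{\mathbf{D}_{j_i,i}\}^{T} \left(\mathbf{B}^{H}_{j_i,i}(k)\mathbf{1}_{N_t}\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \mathbf{A}^{T}_{j_i,i}\boldsymbol{\alpha}_{v,n_i,i,\ell}\right).
\end{aligned} \tag{A.18}
\]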
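It can be shown that
\[
\begin{aligned}
&\mathbb{E}_{\alpha}\Big[ \left(\mathbf{e}^{H}_{t,n_i,i,k}(\ell)\mathbf{1}_{N_t}\mathbf{B}_{j_i,i}(k) \otimes \boldsymbol{\alpha}^{T}_{v,n_i,i,\ell}\boldsymbol{\Pi}_{N_r}\mathbf{A}^{*}_{j_i,i}\right) \mathrm{vec}\{\mathbf{D}_{j_i,i}\}^{*}\,\mathrm{vec}\{\mathbf{D}_{j_i,i}\}^{T} \times \left(\mathbf{B}^{H}_{j_i,i}(k)\mathbf{1}_{N_t}\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \mathbf{A}^{T}_{j_i,i}\boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \Big] \\
&\quad= \left|X''_{n_i,i,\ell}\right|^2 \sum_{m=0}^{L_{j_i,i}-1} |\alpha_{j_i,i}(m)|^2\,|X_{j_i,i}(m)|^2\;\boldsymbol{\alpha}^{T}_{v,n_i,i,\ell}\boldsymbol{\Pi}_{N_r}\mathbf{e}^{*}_{j_i,i}(m)\,\mathbf{e}^{T}_{j_i,i}(m)\,\boldsymbol{\alpha}_{v,n_i,i,\ell},
\end{aligned} \tag{A.19}
\]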
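where $X''_{n_i,i,\ell} = \sum_{r=0}^{N_t-1} e^{j r \omega_{n_i,i,\ell}}$, and $X_{j_i,i}(m) = \sum_{r=0}^{N_t-1} e^{j r(\omega_{n_i,i,\ell}-\omega_{j_i,i,m})}$. After some tedious but
straightforward calculations, we have
\[
\mathbb{E}_{\theta,\phi}\left[ \boldsymbol{\alpha}^{T}_{v,n_i,i,\ell}\boldsymbol{\Pi}_{N_r}\mathbf{e}^{*}_{j_i,i}(m)\,\mathbf{e}^{T}_{j_i,i}(m)\,\boldsymbol{\alpha}_{v,n_i,i,\ell} \right] = \frac{1}{(M_2-1)^2M_1^2}\,Y_{j_i,i}. \tag{A.20}
\]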
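Now, using (A.19) and (A.20), we can simplify (A.18) as follows:
\[
\left(\mathbf{e}^{H}_{t,n_i,i,k}(\ell)\boldsymbol{\Pi}_{N_t} \otimes \boldsymbol{\alpha}^{T}_{v,n_i,i,\ell}\right) \boldsymbol{\Pi}_{N_rN_t}\mathbf{R}^{*}_{i,2} \left(\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \rho_1^2\,\frac{1}{(M_2-1)^2M_1^2}\,|X''_{n_i,i,\ell}|^2 \sum_{\substack{j=0 \\ j\neq n}}^{J-1} \left(\sqrt{\Lambda_{j_i,i}}\right)^2 X_{j_i,i}\,Y_{j_i,i} \sum_{m=0}^{L_{j_i,i}-1} |\alpha_{j_i,i}(m)|^2. \tag{A.21}
\]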
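Similarly, we can simplify the second term in (A.17) as follows:
\[
\left(\mathbf{e}^{H}_{t,n_i,i,k}(\ell)\boldsymbol{\Pi}_{N_t} \otimes \boldsymbol{\alpha}^{T}_{v,n_i,i,\ell}\right) \boldsymbol{\Pi}_{N_rN_t}\mathbf{R}^{*}_{i,2} \left(\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \rho_1^2\,\frac{1}{(M_2-1)^2M_1^2}\,|X''_{n_i,i,\ell}|^2 \sum_{\substack{j=0 \\ j\neq n}}^{J-1} \left(\sqrt{\Lambda_{j_i,i}}\right)^2 X_{j_i,i}\,Y_{j_i,i} \sum_{m=0}^{L_{j_i,i}-1} |\alpha_{j_i,i}(m)|^2. \tag{A.22}
\]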
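Now, plugging the expressions from (A.21) and (A.22) into (A.17), we obtain
\[
\left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{T} \mathbf{C}^{(\mathrm{fba})}_{i,2} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{1}{b^2_{n_i,i,\ell}\,4N_t^2\,\Lambda_{n_i,i}}\,\rho_1^2\,\frac{1}{(M_2-1)^2M_1^2}\,|X''_{n_i,i,\ell}|^2\,e^{j\Phi} \times \sum_{\substack{j=0 \\ j\neq n}}^{J-1} \left(\sqrt{\Lambda_{j_i,i}}\right)^2 \sum_{m=0}^{L_{j_i,i}-1} |\alpha_{j_i,i}(m)|^2\,X_{j_i,i}\left(2Y_{j_i,i}\right). \tag{A.23}
\]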
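Following a similar procedure, we can also obtain
\[
\left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{H} \mathbf{R}^{(\mathrm{fba})T}_{i,2} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{1}{b^2_{n_i,i,\ell}\,4N_t^2\,\Lambda_{n_i,i}}\,\rho_1^2\,\frac{1}{(M_2-1)^2M_1^2}\,|X''_{n_i,i,\ell}|^2 \times \sum_{\substack{j=0 \\ j\neq n}}^{J-1} \left(\sqrt{\Lambda_{j_i,i}}\right)^2 \sum_{m=0}^{L_{j_i,i}-1} |\alpha_{j_i,i}(m)|^2\,X_{j_i,i}\left(Y_{j_i,i} + Y'_{j_i,i}\right). \tag{A.24}
\]
Now, plugging the expressions from (A.23) and (A.24) into (A.16), we obtain the desired
result.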
A.3 Proof of Theorem 2.12
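The problem in (2.42) is a convex quadratic optimization problem, and can be solved using
the Lagrangian method. The Lagrange function for (2.42) can be written as
\[
\mathcal{L}\left(\mathbf{V}_i[k], \mu_{ik}\right) = \mathrm{Tr}\left\{ \mathbf{R}^{H}_i\mathbf{H}_{i,i}[k]\mathbf{V}_i[k]\mathbf{V}^{H}_i[k]\mathbf{R}_i - \mathbf{R}^{H}_i\mathbf{H}_{i,i}[k]\mathbf{V}_i[k] - \mathbf{V}^{H}_i[k]\mathbf{R}_i + \mathbf{I} \right\}
+ \mu_{ik}\left( \mathrm{Tr}\left\{\mathbf{V}_i[k]\mathbf{V}^{H}_i[k]\right\} - P_t \right), \tag{A.25}
\]
where $\mu_{ik}$ is the corresponding Lagrange multiplier. Now, taking the derivative of the
Lagrange function w.r.t. $\mathbf{V}_i[k]$ and setting the derivative equal to zero, we can obtain the
desired result.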
A.4 Proof of Theorem 3.2
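Proof. In the case of standard ESPRIT and for circularly symmetric white noise, the
complementary covariance matrix, $\mathbf{C}_{nn} = \mathbf{0}$ [? ]. Hence, we can write (3.14) as
\[
\mathbb{E}\left\{\left(\Delta\mu^{(r)}_\ell\right)^2\right\} = \frac{1}{2}\left( \mathbf{r}^{(r)H}_\ell\,\mathbf{W}^{*}_{\mathrm{mat}}\,\mathbf{R}^{T}_{nn}\,\mathbf{W}^{T}_{\mathrm{mat}}\,\mathbf{r}^{(r)}_\ell \right). \tag{A.26}
\]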
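Let us now denote $\boldsymbol{\beta}_\ell = \mathbf{V}_s\boldsymbol{\Sigma}^{-1}_s\mathbf{q}_\ell$, and
\[
\boldsymbol{\alpha}^{(r)}_\ell = \left( \mathbf{p}^T_\ell \left(\mathbf{J}^{(r)}_1\mathbf{U}_s\right)^{\dagger} \left(\mathbf{J}^{(r)}_2 / e^{j\mu^{(r)}_\ell} - \mathbf{J}^{(r)}_1\right) \left(\mathbf{U}_n\mathbf{U}^{H}_n\right) \right)^{T}. \tag{A.27}
\]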
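Hence, we have:
\[
\mathbf{W}^{T}_{\mathrm{mat}}\,\mathbf{r}^{(r)}_\ell = \boldsymbol{\beta}_\ell \otimes \boldsymbol{\alpha}^{(r)}_\ell. \tag{A.28}
\]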
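Substituting (A.28) into (A.26), we obtain
\[
\mathbb{E}\left\{\left(\Delta\mu^{(r)}_\ell\right)^2\right\} = \frac{1}{2}\left(\boldsymbol{\beta}_\ell \otimes \boldsymbol{\alpha}^{(r)}_\ell\right)^{H} \mathbf{R}^{T}_{nn} \left(\boldsymbol{\beta}_\ell \otimes \boldsymbol{\alpha}^{(r)}_\ell\right). \tag{A.29}
\]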
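Now, the noise covariance matrix can be written as $\mathbf{R}_{nn} = \mathbb{E}\left\{\mathrm{vec}\{\mathbf{W}_v\}\,\mathrm{vec}\{\mathbf{W}_v\}^{H}\right\}$. If the
noise is assumed to be circularly symmetric and white Gaussian, the covariance matrix can
then be written as $\mathbf{R}_{nn} = \sigma^2\mathbf{I}_{N_rN_cK}$. Hence, we can succinctly write (A.29) as
\[
\mathbb{E}\left\{\left(\Delta\mu^{(r)}_\ell\right)^2\right\} = \frac{\sigma^2}{2}\left(\boldsymbol{\beta}_\ell \otimes \boldsymbol{\alpha}^{(r)}_\ell\right)^{H}\left(\boldsymbol{\beta}_\ell \otimes \boldsymbol{\alpha}^{(r)}_\ell\right)
= \frac{\sigma^2}{2}\left(\boldsymbol{\beta}^{H}_\ell\boldsymbol{\beta}_\ell\right) \otimes \left(\boldsymbol{\alpha}^{(r)H}_\ell\boldsymbol{\alpha}^{(r)}_\ell\right)
= \frac{\sigma^2}{2}\,\|\boldsymbol{\beta}_\ell\|^2\,\|\boldsymbol{\alpha}^{(r)}_\ell\|^2. \tag{A.30}
\]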
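The vector $\boldsymbol{\alpha}^{(r)}_\ell$ can be written as [63]:
\[
\boldsymbol{\alpha}^{(r)T}_\ell = \mathbf{p}^T_\ell \left(\mathbf{J}^{(r)}_1\mathbf{U}_s\right)^{\dagger} \left(\mathbf{J}^{(r)}_2 / e^{j\mu^{(r)}_\ell} - \mathbf{J}^{(r)}_1\right) \left(\mathbf{U}_n\mathbf{U}^{H}_n\right)
= \mathbf{e}^T_\ell \left( \left(\mathbf{J}^{(r)}_2\mathbf{A}(\tau,\theta,\phi)\right)^{\dagger}\mathbf{J}^{(r)}_2 - \left(\mathbf{J}^{(r)}_1\mathbf{A}(\tau,\theta,\phi)\right)^{\dagger}\mathbf{J}^{(r)}_1 \right), \tag{A.31}
\]
where $\mathbf{e}_\ell = [0 \;\ldots\; 1 \;\ldots\; 0]^T$ is the column selection vector with all zero elements
except the $\ell$-th one.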
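Computing the pseudo-inverses in (A.31) can be very cumbersome. However, for massive
MIMO systems, the pseudo-inverse of the selected signal can be significantly simplified. For
mode r = 1, we have:
\[
\begin{aligned}
\left(\mathbf{J}^{(1)}_1\mathbf{A}(\tau,\theta,\phi)\right)^{\dagger}
&= \left( \left(\mathbf{J}^{(1)}_1\mathbf{A}(\tau,\theta,\phi)\right)^{H}\left(\mathbf{J}^{(1)}_1\mathbf{A}(\tau,\theta,\phi)\right) \right)^{-1} \left(\mathbf{J}^{(1)}_1\mathbf{A}(\tau,\theta,\phi)\right)^{H} \\
&= \frac{1}{M_1M_2(N_c-1)} \left( \frac{\left(\mathbf{J}^{(1)}_1\mathbf{A}(\tau,\theta,\phi)\right)^{H}\left(\mathbf{J}^{(1)}_1\mathbf{A}(\tau,\theta,\phi)\right)}{M_1M_2(N_c-1)} \right)^{-1} \left(\mathbf{J}^{(1)}_1\mathbf{A}(\tau,\theta,\phi)\right)^{H} \\
&\overset{(a)}{=} \frac{1}{M_1M_2(N_c-1)} \left(\mathbf{J}^{(1)}_1\mathbf{A}(\tau,\theta,\phi)\right)^{H},
\end{aligned} \tag{A.32}
\]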
(J
(1)2 A(τ, θ, φ)
)+
=1
M1M2(Nc − 1)
(J
(1)2 A(τ, θ, φ)
)H. (A.33)
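Using (A.32) and (A.33), and noting the definitions of $\mathbf{J}^{(1)}_1$ and $\mathbf{J}^{(1)}_2$ from (3.8), after some
simplifications we can write (A.31) for r = 1 as:
\[
\begin{aligned}
\boldsymbol{\alpha}^{(1)T}_\ell &= \mathbf{e}^T_\ell \left( \left(\mathbf{J}^{(1)}_2\mathbf{A}(\tau,\theta,\phi)\right)^{\dagger}\mathbf{J}^{(1)}_2 - \left(\mathbf{J}^{(1)}_1\mathbf{A}(\tau,\theta,\phi)\right)^{\dagger}\mathbf{J}^{(1)}_1 \right) \\
&= \frac{1}{M_1M_2(N_c-1)} \Big[ -1, -e^{-ju_\ell}, \ldots, -e^{-j(M_1-1)u_\ell}, -e^{-jv_\ell}, \\
&\qquad \ldots, -e^{-j(M_2-1)v_\ell}e^{-j(M_1-1)u_\ell}, 0, \ldots, 0, e^{-j(N_c-1)\omega_\ell}, \\
&\qquad e^{-j(N_c-1)\omega_\ell}e^{-ju_\ell}, \ldots, e^{-j(N_c-1)\omega_\ell}e^{-j(M_2-1)v_\ell}e^{-j(M_1-1)u_\ell} \Big].
\end{aligned} \tag{A.34}
\]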
Accordingly, we have
\[
\left\|\boldsymbol{\alpha}^{(1)}_\ell\right\|^2 = \boldsymbol{\alpha}^{(1)H}_\ell\boldsymbol{\alpha}^{(1)}_\ell = \frac{2}{M_1M_2(N_c-1)^2}. \tag{A.35}
\]
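The norm in (A.35) can be checked numerically by building the vector in (A.34) and summing the squared magnitudes of its entries. The sketch below does this with the Python standard library; the function names, the test values of the frequencies, and the assumed count of zero entries, M1·M2·(Nc − 2), are illustrative assumptions (the norm does not depend on the number of zeros).

```python
import cmath


def alpha_mode1(M1, M2, Nc, u, v, w):
    """Build the vector of (A.34) for mode r = 1 (hypothetical helper).

    The first M1*M2 entries carry a minus sign, the last M1*M2 entries carry
    the extra factor exp(-j(Nc-1)w), and the middle entries are zero.
    """
    scale = 1.0 / (M1 * M2 * (Nc - 1))
    # Spatial phase pattern, with the u-index running fastest as in (A.34).
    block = [cmath.exp(-1j * (m1 * u + m2 * v))
             for m2 in range(M2) for m1 in range(M1)]
    first = [-scale * x for x in block]
    zeros = [0j] * (M1 * M2 * (Nc - 2))  # assumed number of zero entries
    last = [scale * cmath.exp(-1j * (Nc - 1) * w) * x for x in block]
    return first + zeros + last


def squared_norm(vec):
    """Squared Euclidean norm of a complex vector."""
    return sum(abs(x) ** 2 for x in vec)


if __name__ == "__main__":
    M1, M2, Nc = 4, 3, 8
    lhs = squared_norm(alpha_mode1(M1, M2, Nc, u=0.3, v=-0.7, w=1.1))
    rhs = 2.0 / (M1 * M2 * (Nc - 1) ** 2)
    print(lhs, rhs)
```

Because every non-zero entry has magnitude 1/(M1·M2·(Nc − 1)) and there are 2·M1·M2 of them, the two printed values should agree up to rounding.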
Similarly, for the parameter modes r = 2, 3, we can obtain the following relations:
\[
\left\|\boldsymbol{\alpha}^{(2)}_\ell\right\|^2 = \frac{2}{M_2N_c(M_1-1)^2}, \qquad
\left\|\boldsymbol{\alpha}^{(3)}_\ell\right\|^2 = \frac{2}{M_1N_c(M_2-1)^2}.
\]
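Now, extending the results in [63] to the 3D parameter estimation problem, the vector $\boldsymbol{\beta}_\ell$ can
be expressed as $\boldsymbol{\beta}_\ell = \mathbf{V}_s\boldsymbol{\Sigma}^{-1}_s\mathbf{U}^{H}_s\mathbf{A}(\tau,\theta,\phi)\mathbf{e}_\ell$. Since $\boldsymbol{\beta}_\ell$ is the $\ell$-th column of the matrix, $\|\boldsymbol{\beta}_\ell\|^2$
becomes the $\ell$-th diagonal element of the matrix $\mathbf{A}^{H}(\tau,\theta,\phi)\mathbf{U}_s\boldsymbol{\Sigma}^{-2}_s\mathbf{U}^{H}_s\mathbf{A}(\tau,\theta,\phi)$, and can
be succinctly expressed as $\|\boldsymbol{\beta}_\ell\|^2 = R^{-1}_{SS}(\ell,\ell)/K$ [63], where $R_{SS}(\ell,\ell)$ is the $\ell$-th diagonal
element of the equivalent transmit signal covariance matrix. Now, plugging the values of
$\|\boldsymbol{\beta}_\ell\|^2$ and $\|\boldsymbol{\alpha}^{(r)}_\ell\|^2$ into (A.29), for r = 1, 2, 3, we can, respectively, obtain the MSEs of the
temporal frequency, $\omega_\ell$, and the spatial frequencies, $u_\ell$ and $v_\ell$, as follows:
\[
\mathbb{E}\left\{(\Delta\omega_\ell)^2\right\} = \frac{R^{-1}_{ss}(\ell,\ell)}{K}\,\frac{\sigma^2}{M_1M_2(N_c-1)^2}, \tag{A.36}
\]
\[
\mathbb{E}\left\{(\Delta u_\ell)^2\right\} = \frac{R^{-1}_{ss}(\ell,\ell)}{K}\,\frac{\sigma^2}{M_2N_c(M_1-1)^2}, \tag{A.37}
\]
\[
\mathbb{E}\left\{(\Delta v_\ell)^2\right\} = \frac{R^{-1}_{ss}(\ell,\ell)}{K}\,\frac{\sigma^2}{M_1N_c(M_2-1)^2}. \tag{A.38}
\]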
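Now, based on the Jacobian matrix, we have
\[
\mathbb{E}\left\{(\Delta\tau_\ell)^2\right\} = \mathbb{E}\left\{(\Delta\omega_\ell)^2\right\}\frac{1}{4\pi^2(\Delta f)^2}, \tag{A.39}
\]
\[
\mathbb{E}\left\{(\Delta\theta_\ell)^2\right\} = \mathbb{E}\left\{(\Delta u_\ell)^2\right\}\frac{1}{\pi^2\sin^2(\theta_\ell)}, \tag{A.40}
\]
\[
\mathbb{E}\left\{(\Delta\phi_\ell)^2\right\} = \frac{\mathbb{E}\left\{(\Delta u_\ell)^2\right\}\cot^2(\theta_\ell)\cot^2(\phi_\ell)}{\pi^2\sin^2(\theta_\ell)} + \frac{\mathbb{E}\left\{(\Delta v_\ell)^2\right\}}{\pi^2\sin^2(\theta_\ell)\sin^2(\phi_\ell)}. \tag{A.41}
\]
Recognizing that $\mathrm{RMSE}_{\tau_\ell} = \sqrt{\mathbb{E}\{\Delta\tau^2_\ell\}}$, $\mathrm{RMSE}_{\theta_\ell} = \sqrt{\mathbb{E}\{\Delta\theta^2_\ell\}}$, and $\mathrm{RMSE}_{\phi_\ell} = \sqrt{\mathbb{E}\{\Delta\phi^2_\ell\}}$,
and after substituting (A.36), (A.37), and (A.38) into (A.39), (A.40), and (A.41),
respectively, we obtain the desired result.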
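The chain from (A.36) to (A.41) can be evaluated numerically as sketched below. The function names and all numeric inputs are hypothetical and chosen only to exercise the formulas; angles are in radians.

```python
import math


def frequency_mse(sigma2, r_ss_inv, K, M1, M2, Nc):
    """MSEs of the temporal and spatial frequencies, following (A.36)-(A.38).

    r_ss_inv stands for the scalar R_ss^{-1}(l, l); K is the number of snapshots.
    """
    common = r_ss_inv * sigma2 / K
    mse_w = common / (M1 * M2 * (Nc - 1) ** 2)
    mse_u = common / (M2 * Nc * (M1 - 1) ** 2)
    mse_v = common / (M1 * Nc * (M2 - 1) ** 2)
    return mse_w, mse_u, mse_v


def parameter_rmse(mse_w, mse_u, mse_v, delta_f, theta, phi):
    """Map frequency MSEs to RMSEs of delay and angles, following (A.39)-(A.41)."""
    sin_t2 = math.sin(theta) ** 2
    sin_p2 = math.sin(phi) ** 2
    cot_t2 = (math.cos(theta) / math.sin(theta)) ** 2
    cot_p2 = (math.cos(phi) / math.sin(phi)) ** 2
    mse_tau = mse_w / (4 * math.pi ** 2 * delta_f ** 2)
    mse_theta = mse_u / (math.pi ** 2 * sin_t2)
    mse_phi = (mse_u * cot_t2 * cot_p2 / (math.pi ** 2 * sin_t2)
               + mse_v / (math.pi ** 2 * sin_t2 * sin_p2))
    return math.sqrt(mse_tau), math.sqrt(mse_theta), math.sqrt(mse_phi)


if __name__ == "__main__":
    # Hypothetical parameters for illustration only.
    mse = frequency_mse(sigma2=0.1, r_ss_inv=1.0, K=10, M1=8, M2=8, Nc=64)
    print(parameter_rmse(*mse, delta_f=15e3, theta=math.radians(60), phi=math.radians(45)))
```

Note that the angle RMSEs grow as sin(θ) or sin(φ) approaches zero, which is visible directly from the denominators in (A.40) and (A.41).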
A.5 Proof of Theorem 4.2
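From [64] and [6], the MSE $\mathbb{E}\left\{(\Delta v_{n_i,i,\ell})^2\right\}_1$ can be expressed as
\[
\mathbb{E}\left\{(\Delta v_{n_i,i,\ell})^2\right\}_1 = \frac{1}{2}\Big( \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{H} \mathbf{R}^{(\mathrm{fba})T}_{i,1} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
- \mathrm{Re}\left\{ \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{T} \mathbf{C}^{(\mathrm{fba})}_{i,1} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \right\}\Big), \tag{A.42}
\]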
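where the expression $\left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)$ in (A.42) for the superimposed pilot system can be expressed
as
\[
\begin{aligned}
\boldsymbol{\beta}_{\ell} \otimes \boldsymbol{\alpha}_{v,\ell}
&= \frac{1}{b_{n_i,i,\ell}\,2N_t\sqrt{\Lambda_{n_i,i}}}
\begin{bmatrix}
e^{-j\phi'_{n_i,i,\ell}}\,\mathbf{e}_{t,n_i,i,k}(\ell) \\
e^{j\phi'_{n_i,i,\ell}}\,e^{j((M_1-1)u_{n_i,i,\ell}+(M_2-1)v_{n_i,i,\ell})}\,\boldsymbol{\Pi}_{N_t}\mathbf{e}^{*}_{t,n_i,i,k}(\ell)
\end{bmatrix} \otimes \boldsymbol{\alpha}_{v,\ell} \\
&= \frac{1}{b_{n_i,i,\ell}\,2N_t\sqrt{\Lambda_{n_i,i}}}
\begin{bmatrix}
e^{-j\phi'_{n_i,i,\ell}}\,\mathbf{Y}^{H}_{n_i}(k)\,\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell} \\
e^{j\phi'_{n_i,i,\ell}}\,e^{j((M_1-1)u_{n_i,i,\ell}+(M_2-1)v_{n_i,i,\ell})}\,\boldsymbol{\Pi}_{N_t}\mathbf{Y}^{T}_{n_i}(k)\,\mathbf{e}^{*}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}
\end{bmatrix}.
\end{aligned} \tag{A.43}
\]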
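Utilizing (A.43), the first term in (A.42) can thus be expressed as
\[
\begin{aligned}
&\left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{H} \mathbf{R}^{(\mathrm{fba})T}_{i,1} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \\
&\quad= \frac{1}{b^2_{n_i,i,\ell}\,4N_t^2\,\Lambda_{n_i,i}} \Big( \left(\mathbf{e}^{H}_{t,n_i,i,k}(\ell)\mathbf{Y}_{n_i}(k) \otimes \boldsymbol{\alpha}^{H}_{v,n_i,i,\ell}\right) \mathbf{R}^{T}_{i,1} \left(\mathbf{Y}^{H}_{n_i}(k)\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \\
&\qquad+ \left(\mathbf{e}^{T}_{t,n_i,i,k}(\ell)\mathbf{Y}^{*}_{n_i}(k)\boldsymbol{\Pi}_{N_t} \otimes \boldsymbol{\alpha}^{H}_{v,n_i,i,\ell}\right) \boldsymbol{\Pi}_{N_rN_t}\mathbf{R}^{H}_{i,1}\boldsymbol{\Pi}_{N_rN_t} \left(\boldsymbol{\Pi}_{N_t}\mathbf{Y}^{T}_{n_i}(k)\mathbf{e}^{*}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \Big).
\end{aligned} \tag{A.44}
\]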
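Now, we can write the first part in (A.44) as
\[
\left(\mathbf{e}^{H}_{t,n_i,i,k}(\ell)\mathbf{Y}_{n_i}(k) \otimes \boldsymbol{\alpha}^{H}_{v,n_i,i,\ell}\right) \mathbf{R}^{T}_{i,1} \left(\mathbf{Y}^{H}_{n_i}(k)\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{1}{(M_2-1)^2M_1^2} \sum_{\substack{g=0 \\ g\neq i}}^{G-1} \left(\sqrt{\Lambda_{n_g,i}}\right)^2 \left(X_{n_g,i} + \rho_2^2\gamma(1-\gamma)X'_{n_g,i}\right) Y_{n_g,i} \sum_{m=1}^{L} |\alpha_{n_g,i}(m)|^2. \tag{A.45}
\]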
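Now, similarly to the derivation of (A.45), we also have
\[
\left(\mathbf{e}^{T}_{t,n_i,i,k}(\ell)\mathbf{Y}^{*}_{n_i}(k)\boldsymbol{\Pi}_{N_t} \otimes \boldsymbol{\alpha}^{H}_{v,n_i,i,\ell}\right) \boldsymbol{\Pi}_{N_rN_t}\mathbf{R}^{H}_{i,1}\boldsymbol{\Pi}_{N_rN_t} \left(\boldsymbol{\Pi}_{N_t}\mathbf{Y}^{T}_{n_i}(k)\mathbf{e}^{*}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{1}{(M_2-1)^2M_1^2} \sum_{\substack{g=0 \\ g\neq i}}^{G-1} \left(\sqrt{\Lambda_{n_g,i}}\right)^2 \left(X_{n_g,i} + \rho_2^2\gamma(1-\gamma)X'_{n_g,i}\right) Y'_{n_g,i} \sum_{m=1}^{L} |\alpha_{n_g,i}(m)|^2. \tag{A.46}
\]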
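Accordingly, utilizing (A.45) and (A.46), (A.44) can be simplified as
\[
\left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{H} \mathbf{R}^{(\mathrm{fba})T}_{i,1} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{1}{b^2_{n_i,i,\ell}\,4N_t^2\,\Lambda_{n_i,i}}\,\frac{1}{(M_2-1)^2M_1^2} \sum_{\substack{g=0 \\ g\neq i}}^{G-1} \left(\sqrt{\Lambda_{n_g,i}}\right)^2 \left(X_{n_g,i} + \rho_2^2\gamma(1-\gamma)X'_{n_g,i}\right) \times \sum_{m=0}^{L_{n_g,i}-1} |\alpha_{n_g,i}(m)|^2 \left(Y_{n_g,i} + Y'_{n_g,i}\right). \tag{A.47}
\]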
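We can similarly obtain the expression for the second term in (A.42) as
\[
\left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{T} \mathbf{C}^{(\mathrm{fba})}_{i,1} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{1}{b^2_{n_i,i,\ell}\,N_t^2\,\Lambda_{n_i,i}}\,\frac{\rho_2^2\gamma(1-\gamma)}{(M_2-1)^2M_1^2}\,e^{j\Phi} \sum_{\substack{g=0 \\ g\neq i}}^{G-1} \left(\sqrt{\Lambda_{n_g,i}}\right)^2 X'_{n_g,i}\,Y_{n_g,i} \sum_{m=0}^{L_{n_g,i}-1} |\alpha_{n_g,i}(m)|^2. \tag{A.48}
\]
Finally, after inserting the expressions from (A.47) and (A.48) into (A.42), we achieve the
desired result.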
A.6 Proof of Theorem 4.3
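The MSE $\mathbb{E}\left\{(\Delta v_{n_i,i,\ell})^2\right\}_2$ can be written as
\[
\mathbb{E}\left\{(\Delta v_{n_i,i,\ell})^2\right\}_2 = \frac{1}{2}\Big( \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{H} \mathbf{R}^{(\mathrm{fba})T}_{i,2} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
- \mathrm{Re}\left\{ \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{T} \mathbf{C}^{(\mathrm{fba})}_{i,2} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \right\}\Big). \tag{A.49}
\]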
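The second term in (A.49) can be written as
\[
\begin{aligned}
&\left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{T} \mathbf{C}^{(\mathrm{fba})}_{i,2} \left(\boldsymbol{\beta}_{n_i,i,\ell} \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \\
&\quad= \frac{1}{b^2_{n_i,i,\ell}\,4N_t^2\,\Lambda_{n_i,i}}\,e^{j\Phi} \Big[ \left(\mathbf{e}^{H}_{t,n_i,i,k}(\ell)\mathbf{Y}_{n_i}(k)\boldsymbol{\Pi}_{N_t} \otimes \boldsymbol{\alpha}^{T}_{v,n_i,i,\ell}\right) \boldsymbol{\Pi}_{N_rN_t}\mathbf{R}^{*}_{i,2} \left(\mathbf{Y}^{H}_{n_i}(k)\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \\
&\qquad+ \left(\mathbf{e}^{T}_{t,n_i,i,k}(\ell)\mathbf{Y}^{*}_{n_i}(k) \otimes \boldsymbol{\alpha}^{T}_{v,n_i,i,\ell}\right) \mathbf{R}_{i,2}\boldsymbol{\Pi}_{N_rN_t} \left(\boldsymbol{\Pi}_{N_t}\mathbf{Y}^{T}_{n_i}(k)\mathbf{e}^{*}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right) \Big].
\end{aligned} \tag{A.50}
\]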
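Now, after some simplifications, we can write the first term in (A.50) as
\[
\left(\mathbf{e}^{H}_{t,n_i,i,k}(\ell)\mathbf{Y}_{n_i}(k)\boldsymbol{\Pi}_{N_t} \otimes \boldsymbol{\alpha}^{T}_{v,n_i,i,\ell}\right) \boldsymbol{\Pi}_{N_rN_t}\mathbf{R}^{*}_{i,2} \left(\mathbf{Y}^{H}_{n_i}(k)\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{\rho_1^2\gamma^2 + \rho_2^2\gamma(1-\gamma)}{(M_2-1)^2M_1^2} \sum_{\substack{j=0 \\ j\neq n}}^{J-1} \left(\sqrt{\Lambda_{j_i,i}}\right)^2 X'_{j_i,i}\,Y_{j_i,i} \sum_{m=0}^{L_{j_i,i}-1} |\alpha_{j_i,i}(m)|^2. \tag{A.51}
\]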
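In a similar manner, the second term in (A.50) can be written as:
\[
\left(\mathbf{e}^{H}_{t,n_i,i,k}(\ell)\boldsymbol{\Pi}_{N_t} \otimes \boldsymbol{\alpha}^{T}_{v,n_i,i,\ell}\right) \boldsymbol{\Pi}_{N_rN_t}\mathbf{R}^{*}_{i,2} \left(\mathbf{e}_{t,n_i,i,k}(\ell) \otimes \boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{\rho_1^2\gamma^2 + \rho_2^2\gamma(1-\gamma)}{(M_2-1)^2M_1^2} \sum_{\substack{j=0 \\ j\neq n}}^{J-1} \left(\sqrt{\Lambda_{j_i,i}}\right)^2 X'_{j_i,i}\,Y_{j_i,i} \sum_{m=0}^{L_{j_i,i}-1} |\alpha_{j_i,i}(m)|^2. \tag{A.52}
\]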
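Now, after inserting the expressions from (A.51) and (A.52) into (A.50), we obtain
$$
\left(\boldsymbol{\beta}_{n_i,i,\ell}\otimes\boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{T}\cdot\mathbf{C}^{(fba)}_{i,2}\left(\boldsymbol{\beta}_{n_i,i,\ell}\otimes\boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{1}{b^2_{n_i,i,\ell}\,4N_t^2\,\Lambda_{n_i,i}}\,\frac{\rho_1^2\gamma^2+\rho_2^2\gamma(1-\gamma)}{(M_2-1)^2M_1^2}\,e^{j\Phi}\sum_{\substack{j=0\\ j\neq n}}^{J-1}\left(\sqrt{\Lambda_{j_i,i}}\right)^2\sum_{m=0}^{L_{j_i,i}-1}|\alpha_{j_i,i}(m)|^2\,X'_{j_i,i}\left(2Y_{j_i,i}\right). \tag{A.53}
$$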
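Similarly, we can also obtain
$$
\left(\boldsymbol{\beta}_{n_i,i,\ell}\otimes\boldsymbol{\alpha}_{v,n_i,i,\ell}\right)^{H}\cdot\mathbf{R}^{(fba)T}_{i,2}\cdot\left(\boldsymbol{\beta}_{n_i,i,\ell}\otimes\boldsymbol{\alpha}_{v,n_i,i,\ell}\right)
= \frac{1}{b^2_{n_i,i,\ell}\,4N_t^2\,\Lambda_{n_i,i}}\,\frac{\rho_1^2\gamma^2+\rho_2^2\gamma(1-\gamma)}{(M_2-1)^2M_1^2}\sum_{\substack{j=0\\ j\neq n}}^{J-1}\left(\sqrt{\Lambda_{j_i,i}}\right)^2\sum_{m=0}^{L_{j_i,i}-1}|\alpha_{j_i,i}(m)|^2\,X'_{j_i,i}\left(Y_{j_i,i}+Y'_{j_i,i}\right). \tag{A.54}
$$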
Inserting the expressions from (A.53) and (A.54) into (A.49) completes the proof.
A.7 Proof of Theorem 4.8
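For the perfect uplink DoA estimation scenario, we have $\mathbf{A}^{H}_{n_i,i}(k) = \mathbf{A}_{n_i,i}(k)$. Accordingly, from
(4.22), we have $\mathbf{p}^{\mathrm{self}}_{n_i,q}(k) = \mathbf{0}$, and $\mathbf{p}^{\mathrm{intra}}_{n_i,q}(k)$ and $\mathbf{p}^{\mathrm{inter}}_{n_i,q}(k)$ can respectively be written as
$$
\mathbf{p}^{\mathrm{intra}}_{n_i,q}(k) = \sum_{\substack{j=0\\ j\neq n}}^{J-1}\frac{\sqrt{\Lambda_{j_i,i}}}{N_r}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{j_i,i}(k)\,\mathbf{D}_{j_i,i}\left(\mathbf{x}^{q}_{j_i}(k)+\mathbf{s}^{q}_{j_i}(k)\right); \tag{A.55}
$$
$$
\mathbf{p}^{\mathrm{inter}}_{n_i,q}(k) = \sum_{\substack{g=0\\ g\neq i}}^{G-1}\sum_{j=0}^{J-1}\frac{\sqrt{\Lambda_{j_g,i}}}{N_r}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{j_g,i}(k)\,\mathbf{D}_{j_g,i}\,\mathbf{B}^{H}_{j_g,i}\left(\mathbf{B}^{H}_{j_g,g}\right)^{+}\left(\mathbf{x}^{q}_{j_g}(k)+\mathbf{s}^{q}_{j_g}(k)\right). \tag{A.56}
$$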
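Now, using Lemma 2.3, $(1/N_r)\mathbf{A}^{H}_{n_i,i}(k)\mathbf{A}_{j_i,i}(k) \to \mathbf{0}$, $\forall (j \neq n)$, and $(1/N_r)\mathbf{A}^{H}_{n_i,i}(k)\mathbf{A}_{j_g,i}(k) \to \mathbf{0}$,
$\forall (j \neq n \text{ and } g \neq i)$. Hence, for 3D massive MIMO systems, $\mathbf{p}^{\mathrm{intra}}_{n_i,q}(k) \to \mathbf{0}$ and $\mathbf{p}^{\mathrm{inter}}_{n_i,q}(k) \to \mathbf{0}$.
Accordingly, (4.21) simplifies to
$$
\mathbf{z}^{q}_{n_i}(k) = \frac{\sqrt{\Lambda_{n_i,i}}}{N_r}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{n_i,i}(k)\,\mathbf{D}_{n_i,i}\,\mathbf{x}^{q}_{n_i}(k) + \mathbf{w}^{q}_{i}(k)
\overset{(a)}{=} \sqrt{\Lambda_{n_i,i}}\,\mathbf{D}_{n_i,i}\,\mathbf{x}^{q}_{n_i}(k) + \mathbf{w}^{q}_{i}(k), \tag{A.57}
$$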
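where (a) results since $(1/\sqrt{N_r})\mathbf{A}_{n_i,i}(k)$ is unitary. Hence, from (A.57), the mutual information
for the n-th user in the i-th cell at the k-th subcarrier can be written as
$$
I^{ul,sd}_{n_i}[k] = \log_2\det\left(\mathbf{I}_{L_{n_i,i}} + \frac{1}{\sigma^2}\,\mathbf{D}_{n_i,i}\,\mathbf{Q}^{ul,sd}_{n_i}[k]\,\mathbf{D}^{H}_{n_i,i}\right). \tag{A.58}
$$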
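Finally, using Hadamard's inequality, (A.58) results in
$$
I^{ul,sd}_{n_i}[k] = \log_2\prod_{\ell}\left(1 + \frac{\Lambda_{n_i,i}|\alpha_{n_i,i}(\ell)|^2\,p^{ul,sd}_{n_i,\ell}[k]}{\sigma^2}\right)
= \sum_{\ell=0}^{L_{n_i,i}-1}\log_2\left(1 + \gamma_{n_i,\ell}\,p^{ul,sd}_{n_i,\ell}[k]\right). \tag{A.59}
$$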
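The limiting step in this proof relies on array responses toward distinct directions becoming orthogonal after normalization by the number of receive antennas. The following is a minimal numerical sketch of that behavior, assuming a half-wavelength uniform linear array rather than the planar array treated in the dissertation. The names `steering_vector` and `normalized_inner_product` and the two angles are hypothetical, chosen only for illustration.

```python
import numpy as np


def steering_vector(num_antennas, angle_rad):
    """Steering vector of a half-wavelength ULA (assumed geometry, not the thesis array)."""
    n = np.arange(num_antennas)
    return np.exp(1j * np.pi * n * np.sin(angle_rad))


def normalized_inner_product(num_antennas, angle_a, angle_b):
    """Return |(1/Nr) a^H(angle_a) a(angle_b)|."""
    a = steering_vector(num_antennas, angle_a)
    b = steering_vector(num_antennas, angle_b)
    # np.vdot conjugates its first argument, so this is a^H b.
    return abs(np.vdot(a, b)) / num_antennas


# Illustrative directions (assumptions).
theta_a, theta_b = np.deg2rad(10.0), np.deg2rad(25.0)

for nr in (8, 64, 512):
    same = normalized_inner_product(nr, theta_a, theta_a)
    cross = normalized_inner_product(nr, theta_a, theta_b)
    print(f"Nr={nr:4d}  same direction: {same:.4f}  distinct directions: {cross:.6f}")
```

For this geometry the cross term equals |sin(Nr φ/2)| / (Nr |sin(φ/2)|) with φ = π(sin θ_b − sin θ_a), so it is bounded by 1/(Nr |sin(φ/2)|) and vanishes as Nr grows, while the same-direction term stays at one.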
A.8 Proof of Theorem 4.10
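Using Lemma 3 in [6], the product $\mathbf{A}^{H}_{n_i,i}(k)\mathbf{A}_{n_i,i}(k)$ can be written as
$$
\frac{1}{N_r}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{n_i,i}(k) = \frac{1}{N_r}\,\mathrm{diag}\left\{\mathbf{e}^{H}_{r,n_i,i,k}(0)\,\mathbf{e}_{r,n_i,i,k}(0),\,\ldots,\,\mathbf{e}^{H}_{r,n_i,i,k}(L_{n_i,i}-1)\,\mathbf{e}_{r,n_i,i,k}(L_{n_i,i}-1)\right\}.
$$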
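Similarly, using Lemma 2 and Lemma 3 in [6], it can be shown that $\frac{1}{N_r}\mathbf{A}^{H}_{n_i,i}(k)\mathbf{A}_{n_i,i}(k) \to \frac{1}{N_r}\mathbf{I}$,
$\frac{1}{N_r}\mathbf{A}^{H}_{n_i,i}(k)\mathbf{A}_{j_i,i}(k) \to \mathbf{0}$, and $\frac{1}{N_r}\mathbf{A}^{H}_{n_i,i}(k)\mathbf{A}_{j_g,i}(k) \to \mathbf{0}$. Accordingly, from (4.23) and (4.24),
for a large antenna system, $\mathbf{p}^{\mathrm{intra}}_{n_i,q'}(k) \to \mathbf{0}$ and $\mathbf{p}^{\mathrm{inter}}_{n_i,q'}(k) \to \mathbf{0}$, and from (4.22), $\mathbf{p}^{\mathrm{self}}_{n_i,q}(k)$ simplifies
as
$$
\mathbf{p}^{\mathrm{self}}_{n_i,q}(k) = \frac{\sqrt{\Lambda_{n_i,i}}}{N_r}\left(\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{n_i,i}(k) - \mathbf{I}\right)\mathbf{D}_{n_i,i}\,\mathbf{s}^{q}_{n_i}(k). \tag{A.60}
$$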
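Now, from (4.21), we have
$$
\mathbf{z}^{q}_{i}(k) = \frac{\sqrt{\Lambda_{n_i,i}}}{N_r}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{n_i,i}(k)\,\mathbf{D}_{n_i,i}\,\mathbf{x}^{q}_{n_i}(k)
+ \sqrt{\Lambda_{n_i,i}}\left(\frac{1}{N_r}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{n_i,i}(k) - \mathbf{I}\right)\mathbf{D}_{n_i,i}\,\mathbf{s}^{q}_{n_i}(k) + \mathbf{w}^{q}_{i}(k). \tag{A.61}
$$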
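Accordingly, the achievable rate for the superimposed pilot+data transmission phase under DoA
estimation error can be written as
$$
I^{ul,sd}_{n_i}[k] = \mathbb{E}\left\{\log_2\det\left(\mathbf{I}_{L_{n_i,i}} + \frac{\Lambda_{n_i,i}}{N_r^2}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{n_i,i}(k)\,\mathbf{D}_{n_i,i}\,\mathbf{Q}^{ul,sd}_{n_i}[k]\,\mathbf{D}^{H}_{n_i,i}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{n_i,i}(k)\,\left(\mathbf{R}^{ul,s}_{n_i}[k]\right)^{-1}\right)\right\}, \tag{A.62}
$$
where
$$
\mathbf{R}^{ul,s}_{n_i} = \Lambda_{n_i,i}\left(\frac{1}{N_r}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{n_i,i}(k) - \mathbf{I}\right)\mathbf{D}_{n_i,i}\,\mathbf{Q}^{ul,sp}_{n_i}[k]\,\mathbf{D}^{H}_{n_i,i}\left(\frac{1}{N_r}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{n_i,i}(k) - \mathbf{I}\right)^{H} + \sigma^2\mathbf{I},
$$
$\mathbf{Q}^{ul,sp}_{n_i}[k]$ is the covariance matrix of the superimposed pilot symbols, and the expectation
is taken with respect to the DoA estimation error.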
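The term $\frac{\Lambda_{n_i,i}}{N_r^2}\mathbf{A}^{H}_{n_i,i}(k)\mathbf{A}_{n_i,i}(k)\mathbf{D}_{n_i,i}\mathbf{Q}^{ul,sd}_{n_i}[k]\mathbf{D}^{H}_{n_i,i}\mathbf{A}^{H}_{n_i,i}(k)\mathbf{A}_{n_i,i}(k)$ in (A.62) is a diagonal matrix
whose $(\ell,\ell)$-th diagonal element is $\frac{\Lambda_{n_i,i}}{N_r^2}|\alpha_{n_i,i}(\ell)|^2\left|\mathbf{e}^{H}_{r,n_i,i,k}(\ell)\,\mathbf{e}_{r,n_i,i,k}(\ell)\right|^2 p^{ul,sd}_{n_i,\ell}[k]$. Similarly,
$\mathbf{R}^{ul,s}_{n_i}$ in (A.62) is also a diagonal matrix, whose $(\ell,\ell)$-th diagonal element is given
by $\sigma^2 + \Lambda_{n_i,i}|\alpha_{n_i,i}(\ell)|^2\left|\frac{1}{N_r}\mathbf{e}^{H}_{r,n_i,i,k}(\ell)\,\mathbf{e}_{r,n_i,i,k}(\ell) - 1\right|^2 p^{ul,sp}_{n_i,\ell}[k]$. Hence, using Hadamard's
inequality, from (A.62), the expected achievable rate during the superimposed (pilot+data)
transmission phase under DoA estimation error becomes
$$
I^{ul,sd}_{n_i}[k] = \mathbb{E}\left\{\sum_{\ell=0}^{L_{n_i,i}-1}\log\left(1 + \frac{\frac{1}{N_r^2}\,\gamma_{n_i,\ell}\left|\mathbf{e}^{H}_{r,n_i,i,k}(\ell)\,\mathbf{e}_{r,n_i,i,k}(\ell)\right|^2 p^{ul,sd}_{n_i,\ell}[k]}{\sigma^2 + \gamma_{n_i,\ell}\left|\frac{1}{N_r}\,\mathbf{e}^{H}_{r,n_i,i,k}(\ell)\,\mathbf{e}_{r,n_i,i,k}(\ell) - 1\right|^2 p^{ul,sp}_{n_i,\ell}[k]}\right)\right\}, \tag{A.63}
$$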
where the expectation is taken with respect to the DoA estimation error, $\gamma_{n_i,\ell} = \Lambda_{n_i,i}|\alpha_{n_i,i}(\ell)|^2$,
and $p^{ul,sd}_{n_i,\ell}[k]$ and $p^{ul,sp}_{n_i,\ell}[k]$ are the transmit powers during the uplink channel estimation phase
allocated on the data and pilot symbols, respectively.
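The expectation in (A.63) can be approximated by Monte Carlo averaging. The sketch below is illustrative only and rests on several assumptions that are not part of the derivation: a half-wavelength uniform linear array, independent zero-mean Gaussian DoA errors, a base-2 logarithm, and hypothetical path gains and powers. The function names are likewise invented for this example.

```python
import numpy as np


def steering_vector(num_rx, angle_rad):
    """Half-wavelength ULA response (assumed geometry)."""
    n = np.arange(num_rx)
    return np.exp(1j * np.pi * n * np.sin(angle_rad))


def superimposed_rate(gain, inner, p_data, p_pilot, sigma2, num_rx):
    """Sum over paths of the summand in (A.63) for one error realization.

    gain  : gamma_{n_i,l} per path
    inner : e_hat^H e per path (complex), estimated vs. true response
    """
    signal = gain * np.abs(inner) ** 2 * p_data / num_rx ** 2
    self_interference = gain * np.abs(inner / num_rx - 1.0) ** 2 * p_pilot
    return np.sum(np.log2(1.0 + signal / (sigma2 + self_interference)))


# Hypothetical scenario parameters (assumptions, not taken from the thesis).
num_rx = 64
true_angles = np.deg2rad([-20.0, 15.0, 40.0])
gain = np.array([1.0, 0.6, 0.3])
p_data = np.full(3, 0.8)
p_pilot = np.full(3, 0.2)
sigma2 = 0.1
rng = np.random.default_rng(1)


def average_rate(error_std_rad, trials=500):
    """Monte Carlo estimate of the expected rate for a given DoA error level."""
    rates = []
    for _ in range(trials):
        est = true_angles + error_std_rad * rng.standard_normal(true_angles.size)
        inner = np.array([
            np.vdot(steering_vector(num_rx, e), steering_vector(num_rx, t))
            for e, t in zip(est, true_angles)
        ])
        rates.append(superimposed_rate(gain, inner, p_data, p_pilot, sigma2, num_rx))
    return float(np.mean(rates))


perfect = average_rate(0.0, trials=1)
degraded = average_rate(np.deg2rad(0.5))
print(f"no DoA error: {perfect:.3f} bit/s/Hz, 0.5 deg error std: {degraded:.3f} bit/s/Hz")
```

With zero error the inner product equals the number of antennas, the self-interference term vanishes, and the rate reduces to the perfect-estimation form. Any nonzero error both shrinks the signal term and adds pilot self-interference, so the averaged rate cannot exceed the error-free value.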
A.9 Proof of Theorem 4.12
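Following the line of proof of Theorem 4.8 in Appendix A.7, (4.27) can be written as
$$
\mathbf{z}^{q'}_{i}(k) = \sqrt{\Lambda_{n_i,i}}\,\frac{1}{N_r}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{n_i,i}(k)\,\mathbf{D}_{n_i,i}\,\mathbf{x}^{q'}_{n_i}(k) + \mathbf{w}^{q'}_{i}(k)
= \sqrt{\Lambda_{n_i,i}}\,\mathbf{D}_{n_i,i}\,\mathbf{x}^{q'}_{n_i}(k) + \mathbf{w}^{q'}_{i}(k). \tag{A.64}
$$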
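Hence, the mutual information for the n-th user in the i-th cell at the k-th subcarrier can be written as
$$
I^{ul,dd}_{n_i}[k] = \log_2\det\left(\mathbf{I}_{L_{n_i,i}} + \frac{1}{\sigma^2}\,\mathbf{D}_{n_i,i}\,\mathbf{Q}^{ul,dd}_{n_i}[k]\,\mathbf{D}^{H}_{n_i,i}\right)
= \log_2\det\left(\mathbf{I}_{L_{n_i,i}} + \frac{1}{\sigma^2}\,\mathrm{diag}\left\{\Lambda_{n_i,i}\,\alpha_{n_i,i}(0)\,p^{ul,dd}_{n_i,0}[k],\,\ldots,\,\Lambda_{n_i,i}\,\alpha_{n_i,i}(L_{n_i,i}-1)\,p^{ul,dd}_{n_i,L_{n_i,i}-1}[k]\right\}\right).
$$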
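Finally, using Hadamard's inequality, we obtain the desired result:
$$
I^{ul,dd}_{n_i}[k] = \log_2\prod_{\ell}\left(1 + \frac{\Lambda_{n_i,i}|\alpha_{n_i,i}(\ell)|^2\,p^{ul,dd}_{n_i,\ell}[k]}{\sigma^2}\right)
= \sum_{\ell=0}^{L_{n_i,i}-1}\log_2\left(1 + \gamma_{n_i,\ell}\,p^{ul,dd}_{n_i,\ell}[k]\right). \tag{A.65}
$$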
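The passage from the log-determinant to the sum of per-path logarithms can be checked numerically. For a diagonal matrix, Hadamard's inequality holds with equality, so the two expressions should agree to machine precision. The sketch below makes two assumptions for illustration: the large-scale gain Λ is folded into the diagonal path-gain matrix D, and the path gains and powers are randomly drawn, hypothetical values.

```python
import numpy as np

rng = np.random.default_rng(0)

num_paths = 4        # L_{n_i,i} (illustrative)
sigma2 = 0.5         # noise variance (illustrative)
large_scale = 2.0    # Lambda_{n_i,i} (illustrative)

# Complex path gains alpha(l) and per-path transmit powers p(l); hypothetical values.
alpha = (rng.standard_normal(num_paths) + 1j * rng.standard_normal(num_paths)) / np.sqrt(2)
power = rng.uniform(0.1, 1.0, num_paths)

# Assumption: sqrt(Lambda) is absorbed into the diagonal matrix D.
D = np.sqrt(large_scale) * np.diag(alpha)
Q = np.diag(power)

# Left-hand side: log2 det(I + D Q D^H / sigma^2).
M = np.eye(num_paths) + D @ Q @ D.conj().T / sigma2
rate_det = np.log2(np.linalg.det(M).real)

# Right-hand side: sum over paths of log2(1 + Lambda |alpha(l)|^2 p(l) / sigma^2).
rate_sum = np.sum(np.log2(1.0 + large_scale * np.abs(alpha) ** 2 * power / sigma2))

print(rate_det, rate_sum)
```

If Q were not diagonal, that is, if the streams sent over different paths were correlated, the sum would only upper-bound the log-determinant.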
A.10 Proof of Theorem 4.13
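Following the proof of Theorem 4.10, in the presence of DoA estimation error, (4.27) can be written
as
$$
\mathbf{z}^{q'}_{i}(k) = \frac{\sqrt{\Lambda_{n_i,i}}}{N_r}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{n_i,i}(k)\,\mathbf{D}_{n_i,i}\,\mathbf{x}^{q'}_{n_i}(k) + \mathbf{w}^{q'}_{i}(k).
$$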
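Accordingly, the achievable rate for the data-only transmission phase under DoA estimation error is
$$
I^{ul,dd}_{n_i}[k] = \mathbb{E}\left\{\log_2\det\left(\mathbf{I}_{L_{n_i,i}} + \frac{\Lambda_{n_i,i}}{N_r^2\sigma^2}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{n_i,i}(k)\,\mathbf{D}_{n_i,i}\,\mathbf{Q}^{ul,dd}_{n_i}[k]\,\mathbf{D}^{H}_{n_i,i}\,\mathbf{A}^{H}_{n_i,i}(k)\,\mathbf{A}_{n_i,i}(k)\right)\right\}. \tag{A.66}
$$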
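Now, the term $\frac{\Lambda_{n_i,i}}{N_r^2\sigma^2}\mathbf{A}^{H}_{n_i,i}(k)\mathbf{A}_{n_i,i}(k)\mathbf{D}_{n_i,i}\mathbf{Q}^{ul,dd}_{n_i}[k]\mathbf{D}^{H}_{n_i,i}\mathbf{A}^{H}_{n_i,i}(k)\mathbf{A}_{n_i,i}(k)$ in (A.66) is a diagonal
matrix whose $(\ell,\ell)$-th diagonal element is $\frac{\Lambda_{n_i,i}}{N_r^2\sigma^2}|\alpha_{n_i,i}(\ell)|^2\left|\mathbf{e}^{H}_{r,n_i,i,k}(\ell)\,\mathbf{e}_{r,n_i,i,k}(\ell)\right|^2 p^{ul,dd}_{n_i,\ell}[k]$.
Finally, using Hadamard's inequality, the expected achievable rate during the data-only
transmission phase under DoA estimation error becomes
$$
I^{ul,dd}_{n_i}[k] = \mathbb{E}\left\{\sum_{\ell=0}^{L_{n_i,i}-1}\log\left(1 + \frac{\frac{1}{N_r^2}\,\gamma_{n_i,\ell}\left|\mathbf{e}^{H}_{r,n_i,i,k}(\ell)\,\mathbf{e}_{r,n_i,i,k}(\ell)\right|^2 p^{ul,dd}_{n_i,\ell}[k]}{\sigma^2}\right)\right\}, \tag{A.67}
$$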
where the expectation is taken with respect to the DoA estimation error, $\gamma_{n_i,\ell} = \Lambda_{n_i,i}|\alpha_{n_i,i}(\ell)|^2$,
and $p^{ul,dd}_{n_i,\ell}[k]$ are the transmit powers during the uplink data-only transmission phase.
Bibliography
[1] T. L. Marzetta, “Noncooperative cellular wireless with unlimited numbers of base sta-
tion antennas,” IEEE Trans. Wireless Commun., vol. 9, no. 11, pp. 3590 – 3600, Novem-
ber 2010.
[2] Y. Kim, H. Ji, J. Lee, Y.-H. Nam, B. L. Ng, I. Tzanidis, Y. Li, and J. Zhang, “Full
dimension MIMO (FD-MIMO): The next evolution of MIMO in LTE systems,” IEEE
Wireless Commun., vol. 21, no. 3, pp. 92–100, June 2014.
[3] L. Liu, R. Chen, S. Geirhofer, K. Sayana, Z. Shi, and Y. Zhou, “Downlink MIMO in
LTE-Advanced: SU-MIMO vs. MU-MIMO,” IEEE Commun. Mag., vol. 50, no. 2, pp.
140–147, February 2012.
[4] A. Paulraj, R. Roy, and T. Kailath, “A subspace rotation approach to signal parameter
estimation,” Proc. IEEE, vol. 74, no. 7, pp. 1044–1046, July 1986.
[5] R. Shafin and L. Liu, “DoA estimation and performance analysis for multi-cell multi-
user 3D mmWave massive-MIMO OFDM system,” in 2017 IEEE Wireless Communications
and Networking Conference (WCNC). IEEE, 2017, pp. 1–6.
[6] R. Shafin, L. Liu, J. Zhang, and Y. C. Wu, “DoA Estimation and Capacity Analysis
for 3-D Millimeter Wave Massive-MIMO/FD-MIMO OFDM Systems,” IEEE Trans.
Wireless Commun., vol. 15, no. 10, pp. 6963–6978, Oct 2016.
[7] R. Shafin, L. Liu, Y. Li, A. Wang, and J. Zhang, “Angle and Delay Estimation for 3-D
Massive MIMO/FD-MIMO Systems Based on Parametric Channel Modeling,” IEEE
Trans. Wireless Commun., vol. 16, no. 8, pp. 5370–5383, Aug 2017.
[8] H. Almosa, R. Shafin, S. Mosleh, Z. Zhou, Y. Li, J. Zhang, and L. Liu, “Downlink
channel estimation and precoding for FDD 3D massive MIMO/FD-MIMO systems,” in 2017
26th Wireless and Optical Communication Conference (WOCC). IEEE, 2017, pp. 1–4.
[9] R. Shafin, L. Liu, and J. C. Zhang, “DoA estimation and capacity analysis for 3D massive-
MIMO/FD-MIMO OFDM system,” in 2015 IEEE Global Conference on Signal and Informa-
tion Processing (GlobalSIP). IEEE, 2015, pp. 181–184.
[10] D. Vasisht, S. Kumar, H. Rahul, and D. Katabi, “Eliminating channel feedback in next-
generation cellular networks,” in 2016 ACM SIGCOMM Conference, 2016, pp. 398–411.
[11] E. Bjornson, E. G. Larsson, and M. Debbah, “Massive MIMO for Maximal Spectral
Efficiency: How Many Users and Pilots Should Be Allocated?” IEEE Trans. Wireless
Commun., vol. 15, no. 2, pp. 1293–1308, Feb 2016.
[12] T. V. Chien, E. Bjornson, and E. G. Larsson, “Joint Power Allocation and User As-
sociation Optimization for Massive MIMO Systems,” IEEE Trans. Wireless Commun.,
vol. 15, no. 9, pp. 6384–6399, Sept 2016.
[13] R. Shafin, L. Liu, and J. C. Zhang, “On the channel estimation for 3D massive MIMO
systems,” E-LETTER, 2014.
[14] R. Shafin, “Performance analysis of parametric channel estimation for 3D massive
MIMO/FD-MIMO OFDM systems,” Master’s thesis, University of Kansas, 2017.
[15] B. Yang, K. Letaief, R. Cheng, and Z. Cao, “Channel Estimation for OFDM Trans-
mission in Multipath Fading Channels Based on Parametric Channel Modeling,” IEEE
Trans. Commun., vol. 49, no. 3, pp. 467–479, Mar 2001.
[16] M. Larsen, A. Swindlehurst, and T. Svantesson, “Performance bounds for MIMO-
OFDM channel estimation,” IEEE Trans. Signal Process., vol. 57, no. 5, pp. 1901–1916,
May 2009.
[17] M. Wax and A. Leshem, “Joint Estimation of Time Delays and Directions of Arrival of
Multiple Reflections of a Known Signal,” IEEE Trans. Signal Process., vol. 45, no. 10,
pp. 2477–2484, October 1997.
[18] R. Roy and T. Kailath, “ESPRIT-Estimation of Signal Parameters via Rotational In-
variance Techniques,” IEEE Trans. Acoust., Speech, Signal Process., vol. 37, no. 7, pp.
984–995, 1989.
[19] F. Li, H. Liu, and R. Vaccaro, “Performance analysis for DoA estimation algorithms:
unification, simplification, and observations,” IEEE Trans. Aerosp. Electron. Syst.,
vol. 29, no. 4, pp. 1170–1184, Oct 1993.
[20] F. Roemer, M. Haardt, and G. Del Galdo, “Analytical performance assessment of multi-
dimensional matrix- and tensor-based ESPRIT-type algorithms,” IEEE Trans. Signal
Process., vol. 62, no. 10, pp. 2611–2625, May 2014.
[21] J. G. Andrews, S. Buzzi, W. Choi, S. V. Hanly, A. Lozano, A. C. Soong, and J. C. Zhang,
“What will 5G be?” IEEE J. Sel. Areas Commun., vol. 32, no. 6, pp. 1065–1082, 2014.
[22] F. Boccardi, R. W. Heath, A. Lozano, T. L. Marzetta, and P. Popovski, “Five disruptive
technology directions for 5G,” IEEE Communications Magazine, vol. 52, no. 2, pp. 74–
80, 2014.
[23] C.-X. Wang, F. Haider, X. Gao, X.-H. You, Y. Yang, D. Yuan, H. Aggoune, H. Haas,
S. Fletcher, and E. Hepsaydir, “Cellular architecture and key technologies for 5G wireless
communication networks,” IEEE Communications Magazine, vol. 52, no. 2, pp. 122–
130, 2014.
[24] R. Shafin, L. Liu, V. Chandrasekhar, H. Chen, J. Reed et al., “Artificial intelligence-
enabled cellular networks: A critical path to beyond-5g and 6g,” to appear on IEEE
Wireless Communications, arXiv preprint arXiv:1907.07862, 2019.
[25] T. Cousik, R. Shafin, Z. Zhou, K. Kleine, J. Reed, and L. Liu, “Cogrf: A new fron-
tier for machine learning and artificial intelligence for 6g rf systems,” arXiv preprint
arXiv:1909.06862, 2019.
[26] A. Akhtar, J. Ma, R. Shafin, J. Bai, L. Li, Z. Li, and L. Liu, “Low latency scalable
point cloud communication in vanets using v2i communication,” in ICC 2019-2019
IEEE International Conference on Communications (ICC). IEEE, 2019, pp. 1–7.
[27] R. Shafin, L. Liu, J. Ashdown, J. Matyjas, M. Medley, B. Wysocki, and Y. Yi, “Realizing
green symbol detection via reservoir computing: An energy-efficiency perspective,” in
2018 IEEE International Conference on Communications (ICC). IEEE, 2018, pp. 1–6.
[28] S. Mosleh, L. Liu, C. Sahin, Y. R. Zheng, and Y. Yi, “Brain-inspired wireless commu-
nications: Where reservoir computing meets mimo-ofdm,” IEEE Trans. Neural Netw.
Learn. Syst., 2017.
[29] Z. Zhou, L. Liu, and H.-H. Chang, “Learn to demodulate: Mimo-ofdm symbol detection
through downlink pilots,” arXiv preprint arXiv:1907.01516, 2019.
[30] S. Hamalainen, H. Sanneck, and C. Sartori, LTE self-organising networks (SON): net-
work management automation for operational efficiency. John Wiley & Sons, 2012.
[31] M. Peng, D. Liang, Y. Wei, J. Li, and H.-H. Chen, “Self-configuration and self-
optimization in lte-advanced heterogeneous networks,” IEEE Communications Mag-
azine, vol. 51, no. 5, pp. 36–45, 2013.
[32] O. Sallent, J. Perez-Romero, J. Sanchez-Gonzalez, R. Agustı, M. A. Dıaz-Guerra,
D. Henche, and D. Paul, “A roadmap from umts optimization to lte self-optimization,”
IEEE Communications Magazine, vol. 49, no. 6, pp. 172–182, 2011.
[33] H. Hu, J. Zhang, X. Zheng, Y. Yang, and P. Wu, “Self-configuration and self-
optimization for LTE networks,” IEEE Commun. Mag., vol. 48, no. 2, 2010.
[34] R. Shafin and L. Liu, “Multi-Cell Multi-User Massive FD-MIMO: Downlink Precoding
and Throughput Analysis,” IEEE Trans. Wireless Commun., vol. 18, no. 1, pp. 487–502,
Jan 2019.
[35] R. Shafin, L. Liu, Y. Li, A. Wang, and J. Zhang, “Angle and Delay Estimation for 3-D
Massive MIMO/FD-MIMO Systems Based on Parametric Channel Modeling,” IEEE Trans.
Wireless Commun., vol. 16, no. 8, pp. 5370–5383, Aug 2017.
[36] A. Galindo-Serrano and L. Giupponi, “Distributed Q-learning for aggregated interfer-
ence control in cognitive radio networks,” IEEE Trans. Veh. Technol., vol. 59, no. 4,
pp. 1823–1834, 2010.
[37] H. Saad, A. Mohamed, and T. ElBatt, “Distributed cooperative Q-learning for power
allocation in cognitive femtocell networks,” in IEEE Veh. Technol. Conf. (VTC Fall),
2012, 2012, pp. 1–5.
[38] M. Bennis, S. M. Perlaza, P. Blasco, Z. Han, and H. V. Poor, “Self-organization in
small cell networks: A reinforcement learning approach,” IEEE transactions on wireless
communications, vol. 12, no. 7, pp. 3202–3212, 2013.
[39] J. Nie and S. Haykin, “A q-learning-based dynamic channel assignment technique for
mobile communication systems,” IEEE Transactions on Vehicular Technology, vol. 48,
no. 5, pp. 1676–1687, 1999.
[40] Y.-S. Chen, C.-J. Chang, and F.-C. Ren, “Q-learning-based multirate transmission con-
trol scheme for RRM in multimedia WCDMA systems,” IEEE Trans. Veh. Technol.,
vol. 53, no. 1, pp. 38–48, 2004.
[41] R. S. Sutton and A. G. Barto, Reinforcement learning: An introduction. MIT press,
2018.
[42] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves, I. Antonoglou, D. Wierstra, and
M. Riedmiller, “Playing atari with deep reinforcement learning,” arXiv preprint
arXiv:1312.5602, 2013.
[43] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu, J. Veness, M. G. Bellemare, A. Graves,
M. Riedmiller, A. K. Fidjeland, G. Ostrovski et al., “Human-level control through deep
reinforcement learning,” Nature, vol. 518, no. 7540, p. 529, 2015.
[44] H. Chang, H. Song, Y. Yi, J. Zhang, H. He, and L. Liu, “Distributive dynamic spectrum
access through deep reinforcement learning: A reservoir computing based approach,”
IEEE Internet Things J., pp. 1–1, 2019.
[45] H. Van Hasselt, A. Guez, and D. Silver, “Deep reinforcement learning with double
q-learning.” in AAAI, vol. 2. Phoenix, AZ, 2016, p. 5.
[46] R. Shafin, H. Chen, Y. H. Nam, S. Hur, J. Park, J. Reed, L. Liu et al., “Self-tuning sec-
torization: Deep reinforcement learning meets broadcast beam optimization,” to appear
on IEEE Transactions on Wireless Communications, arXiv preprint arXiv:1906.06021,
2020.
[47] M. R. Akdeniz, Y. Liu, M. K. Samimi, S. Sun, S. Rangan, T. S. Rappaport, and
E. Erkip, “Millimeter wave channel modeling and cellular capacity evaluation,” IEEE
J. Sel. Areas Commun., vol. 32, no. 6, pp. 1164–1179, 2014.
[48] 3GPP TR 36.814 V.9.2.0 Evolved Universal Terrestrial Radio Access (E-UTRA); Fur-
ther advancements for E-UTRA physical layer aspects, March 2017.
[49] B. M. Popovic, “Generalized chirp-like polyphase sequences with optimum correlation
properties,” IEEE Trans. Inf. Theory, vol. 38, no. 4, pp. 1406–1409, 1992.
[50] T. S. Rappaport, F. Gutierrez, E. Ben-Dor, J. N. Murdock, Y. Qiao, and J. I. Tamir,
“Broadband millimeter-wave propagation measurements and models using adaptive-
beam antennas for outdoor urban cellular communications,” IEEE Trans. Antennas
Propag., vol. 61, no. 4, pp. 1850–1859, April 2013.
[51] O. E. Ayach, S. Rajagopal, S. Abu-Surra, Z. Pi, and R. W. Heath, “Spatially Sparse Pre-
coding in Millimeter Wave MIMO Systems,” IEEE Trans. Wireless Commun., vol. 13,
no. 3, pp. 1499–1513, March 2014.
[52] A. Alkhateeb, O. E. Ayach, G. Leus, and R. W. Heath, “Channel estimation and hybrid
precoding for millimeter wave cellular systems,” IEEE J. Sel. Topics Signal Process.,
vol. 8, no. 5, pp. 831–846, Oct 2014.
[53] A. Wang, L. Liu, and J. Zhang, “Low complexity direction of arrival (DoA) estimation
for 2D massive MIMO systems,” in IEEE Global Commun. Conf., 2012, pp. 703–707.
[54] M. Haardt and J. A. Nossek, “Unitary ESPRIT: How to obtain increased estimation
accuracy with a reduced computational burden,” IEEE Trans. Signal Process., vol. 43,
no. 5, pp. 1232–1242, 1995.
[55] 3GPP, “Study on channel model for frequency spectrum above 6 GHz,” 3rd Generation
Partnership Project (3GPP), TR 38.900 V14.2.0, Dec 2016.
[56] M. K. Samimi and T. S. Rappaport, “3-D millimeter-wave statistical channel model for
5G wireless system design,” IEEE Transactions on Microwave Theory and Techniques,
vol. 64, no. 7, pp. 2207–2225, 2016.
[57] S. U. Pillai and B. H. Kwon, “Forward/backward spatial smoothing techniques for co-
herent signal identification,” IEEE Trans. Acoust., Speech, and Signal Process., vol. 37,
no. 1, pp. 8–15, 1989.
[58] Y. Zhu, L. Liu, and J. Zhang, “Joint angle and delay estimation for 2D active broadband
MIMO-OFDM systems,” in IEEE Global Commun. Conf., 2013, pp. 3300–3305.
[59] Q. Shi, M. Razaviyayn, Z. Q. Luo, and C. He, “An Iteratively Weighted MMSE Ap-
proach to Distributed Sum-Utility Maximization for a MIMO Interfering Broadcast
Channel,” IEEE Trans. Signal Process., vol. 59, no. 9, pp. 4331–4340, Sept 2011.
[60] R. Shafin, M. Jiang, S. Ma, L. Piazzi, and L. Liu, “Joint Parametric Channel Estimation
and Performance Characterization for 3D Massive MIMO OFDM Systems,” in IEEE
Intl. Conf. on Commun., 2018, pp. 1–6.
[61] R. Shafin, L. Liu, and J. Zhang, “DoA estimation and RMSE characterization for 3D
massive-MIMO/FD-MIMO OFDM system,” in 2015 IEEE Global Communications Confer-
ence (GLOBECOM), Dec 2015, pp. 1–6.
[62] S. Rangan, T. Rappaport, and E. Erkip, “Millimeter Wave Cellular Wireless Networks:
Potentials and Challenges,” Proc. IEEE, vol. 102, no. 3, pp. 366–385, Nov 2014.
[63] F. Li, H. Liu, and R. J. Vaccaro, “Performance Analysis for DOA Estimation Algo-
rithms: Unification, Simplification, and Observations,” IEEE Trans. Aerosp. Electron. Syst.,
vol. 29, no. 4, pp. 1170–1184, October 1993.
[64] R. Shafin and L. Liu, “Multi-cell multi-user massive FD-MIMO: downlink precoding
and throughput analysis,” IEEE Trans. Wireless Commun., vol. 18, no. 1, pp. 487–502,
Jan. 2019.
[65] R. Shafin, L. Liu, J. Ashdown, J. Matyjas, and J. Zhang, “On the Channel Estimation
of Multi-Cell Massive FD-MIMO Systems,” in 2018 IEEE Intl. Conf. Commun. (ICC),
pp. 1–6.
[66] D. Vasisht, S. Kumar, H. Rahul, and D. Katabi, “Eliminating channel feedback
in next-generation cellular networks,” in Proceedings of the 2016 ACM SIGCOMM
Conference, ser. SIGCOMM ’16. New York, NY, USA: ACM, 2016, pp. 398–411.
[Online]. Available: http://doi.acm.org/10.1145/2934872.2934895
[67] H. Zhang, S. Gao, D. Li, H. Chen, and L. Yang, “On superimposed pilot for channel
estimation in multicell multiuser MIMO uplink: Large system analysis,” IEEE Trans.
on Vehi. Tech., vol. 65, no. 3, pp. 1492–1505, March 2016.
[68] K. Upadhya, S. A. Vorobyov, and M. Vehkapera, “Superimposed pilots are superior
for mitigating pilot contamination in massive MIMO,” IEEE Trans. Signal Process.,
vol. 65, no. 11, pp. 2917–2932, June 2017.
[69] J. Ma, C. Liang, C. Xu, and L. Ping, “On orthogonal and superimposed pilot schemes
in massive MIMO NOMA systems,” IEEE J. on Sel. Areas Commun., vol. 35, no. 12,
pp. 2696–2707, Dec. 2017.
[70] Z. Zhou, L. Liu, and J. Zhang, “FD-MIMO via Pilot-Data Superposition: Tensor-Based
DOA Estimation and System Performance,” IEEE J. Sel. Topics Signal Process., vol. 13,
no. 5, pp. 931–946, Sep. 2019.
[71] R. Shafin and L. Liu, “Superimposed pilot for multi-cell multi-user massive FD-MIMO
systems,” to appear in IEEE Transactions on Wireless Communications, 2020.
[72] K. Upadhya, S. A. Vorobyov, and M. Vehkapera, “Downlink performance of super-
imposed pilots in massive MIMO systems,” IEEE Trans. Wireless Commun., vol. 17,
no. 10, pp. 6630–6644, Oct. 2018.
[73] X. Jing, M. Li, H. Liu, S. Li, and G. Pan, “Superimposed Pilot Optimization Design
and Channel Estimation for Multiuser Massive MIMO Systems,” IEEE Trans. Veh.
Technol., vol. 67, no. 12, pp. 11 818–11 832, Dec. 2018.
[74] D. Verenzuela, E. Bjornson, and L. Sanguinetti, “Spectral and energy efficiency of
superimposed pilots in uplink massive MIMO,” IEEE Trans. Wireless Commun., vol. 17,
no. 11, pp. 7099–7115, Nov. 2018.
[75] H. Chen, Y. H. Nam, R. Shafin, and J. Zhang, “Method and apparatus for machine
learning based wide beam optimization in cellular network,” Dec. 10 2019, US Patent
10,505,616.
[76] S. Joseph, R. Misra, and S. Katti, “Towards self-driving radios: Physical-layer control
using deep reinforcement learning,” in Proceedings of the 20th International Workshop
on Mobile Computing Systems and Applications, ser. HotMobile ’19. New York, NY,
USA: ACM, 2019, pp. 69–74.
[77] 3GPP, “Radio measurement collection for Minimization of Drive Tests (MDT),” 3rd
Generation Partnership Project (3GPP), TS 37.320 V14.0.0, Mar. 2017.
[78] L.-J. Lin, “Reinforcement learning for robots using neural networks,” Carnegie-Mellon
Univ Pittsburgh PA School of Computer Science, Tech. Rep., 1993.
[79] K. P. Sycara, “Multiagent systems,” AI Mag., vol. 19, no. 2, p. 79, 1998.
[80] M. Wooldridge, An introduction to multiagent systems. John Wiley & Sons, 2009.