APPLICATION OF SPC IN PERCEPTUAL SPEECH QUALITY CONTROL
IN
MODERN MOBILE RADIO NETWORKS
This thesis is presented for the degree of Doctor of Philosophy
The University of Western Australia
School of Electrical, Electronic and Computer Engineering
August 2012
Ahmad Zamani Jusoh
B. S. Electronic Engineering
(Hanyang University, Republic of Korea)
M. Sc. Digital Communication Systems
(Loughborough University, UK)
THE UNIVERSITY OF WESTERN AUSTRALIA
DECLARATION FOR THESES CONTAINING PUBLISHED WORK AND/OR WORK PREPARED FOR PUBLICATION
The examination of the thesis is an examination of the work of the student. The work must have been substantially conducted by the student during enrolment in the degree.
Where the thesis includes work to which others have contributed, the thesis must include a statement that makes the student's contribution clear to the examiners. This may be in the form of a description of the precise contribution of the student to the work presented for examination and/or a statement of the percentage of the work that was done by the student.
In addition, in the case of co-authored publications included in the thesis, each author must give their signed permission for the work to be included. If signatures from all the authors cannot be obtained, the statement detailing the student's contribution to the work must be signed by the coordinating supervisor.
Please sign one of the statements below.
1. This thesis does not contain work that I have published, nor work under review for publication.
Student Signature
2. This thesis contains only sole-authored work, some of which has been published and/or prepared for publication under sole authorship. The bibliographical details of the work and where it appears in the thesis are outlined below.
Student Signature
3. This thesis contains published work and/or work prepared for publication, some of which has been co-authored. The bibliographical details of the work and where it appears in the thesis are outlined below. The student must attach to this declaration a statement for each publication that clarifies the contribution of the student to the work. This may be in the form of a description of the precise contributions of the student to the published work and/or a statement of percent contribution by the student. This statement must be signed by all authors. If signatures from all the authors cannot be obtained, the statement detailing the student's contribution to the published work must be signed by the coordinating supervisor.
A.Z. Jusoh, R. Togneri, B. Rohani, S. Nordholm, "CUSUM Application in Perceptual Speech Quality Control", Proceedings of APCC 2009, October 2009, Shanghai, China, pp. 694-698.
Student Signature
Abstract
An important aspect of the mobile communications industry is satisfying
customers’ needs as economically as possible. Customers expect good and consistent
quality of service from the provider. In mobile telephony, this amounts to
controlling speech quality as “perceived” by customers. Controlling perceptual speech
quality requires a reliable measurement of the quality first, followed by
direct control over it.
The ultimate measure of perceived speech quality is realized through subjective
listening tests, but this is not practical for real-time, day-to-day applications. In recent
years, objective quality measurement algorithms have been developed to predict the
subjective quality with considerable accuracy. The ITU-T P.862 Perceptual
Evaluation of Speech Quality (PESQ) model is the state-of-the-art recommendation of the
International Telecommunication Union’s Telecommunication Standardization Sector (ITU-T)
for reference-based objective quality measurement. However, these
algorithms have yet to be applied to end-user quality control in cellular networks.
Hence, this thesis presents a research framework for applying the PESQ algorithm to
perceptual speech quality control.
The PESQ algorithm has been extensively used in measurement tools for accurate
assessment of perceptual speech quality in modern telecommunication networks.
However, the shortest period over which PESQ can evaluate speech quality is 320 ms.
Although this or longer periods may be suitable for monitoring speech quality, they may
be too long for effective control of quality in the network. PESQ is calculated from
the so-called “Frame Disturbance” (FD), which is effectively the perceptual distance
between a reference and a distorted speech signal. The FD is calculated every 16 ms.
Although 16 ms is too short for assessing speech quality, it is suitable for control
purposes. FD is therefore investigated as a perceptual metric for control of speech
quality in modern networks, replacing conventional metrics.
Since perceptual quality is a relatively long term aggregate of FD values, the
relationship between the statistics of FD values and the resulting speech quality needs to
be investigated. It is the outcome of this investigation which will help determine the
scheme most suitable for control of speech quality based on FD statistics. It is
envisaged that Statistical Process Control (SPC), a popular method in
manufacturing and industrial process control, will be a promising method.
The control of perceptual speech quality using mechanisms such as power control and a
“hybrid” control mechanism has been studied and applied before. However, a direct
control approach using tools such as SPC has not previously been attempted.
Statistical process control has been widely used in manufacturing and industrial quality
control. An SPC mechanism that has received much attention in the
statistical literature and wide use in industry is the Cumulative Sum (CUSUM) method.
CUSUM is particularly effective at detecting small, persistent process shifts quickly.
In this thesis, the application of CUSUM to perceptual quality control based on FD in a
Universal Mobile Telecommunication System (UMTS) environment is presented. Furthermore,
the performance of CUSUM is compared with that of its counterpart in SPC, the
Exponentially Weighted Moving Average (EWMA). The results show that the CUSUM and EWMA
applications provide better control of speech quality than the conventional
method used in UMTS.
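The two SPC charts named above can be sketched in a few lines of Python. This is only a minimal illustration of the standard textbook forms of the tabular CUSUM and the EWMA chart, written with the symbols that appear in the List of Common Symbols (target mean, reference value K, decision limit H, and EWMA width factor L). The function names, the synthetic samples standing in for log(FD) values, and the parameter values (k = 0.5, h = 5, lam = 0.2, L = 3) are assumptions chosen for illustration; they are not taken from the thesis.

```python
import math


def tabular_cusum(samples, mu0, sigma, k=0.5, h=5.0):
    """Two-sided tabular CUSUM (textbook form, hypothetical parameters).

    Returns the upper sums, the lower sums, and the 0-based indices at
    which either sum exceeds the decision limit H.
    """
    K = k * sigma  # reference value (allowance)
    H = h * sigma  # decision limit
    c_plus, c_minus = 0.0, 0.0  # initial CUSUM values
    upper, lower, alarms = [], [], []
    for i, x in enumerate(samples):
        # Accumulate deviations above mu0 + K and below mu0 - K.
        c_plus = max(0.0, x - (mu0 + K) + c_plus)
        c_minus = max(0.0, (mu0 - K) - x + c_minus)
        upper.append(c_plus)
        lower.append(c_minus)
        if c_plus > H or c_minus > H:
            alarms.append(i)
    return upper, lower, alarms


def ewma_chart(samples, mu0, sigma, lam=0.2, L=3.0):
    """EWMA chart with time-varying control limits (textbook form).

    Returns the EWMA values and the 0-based indices at which the EWMA
    falls outside the upper or lower control limit.
    """
    z = mu0  # the EWMA starts at the target mean
    values, alarms = [], []
    for i, x in enumerate(samples, start=1):
        z = lam * x + (1.0 - lam) * z
        # Half-width of the control limits after i samples.
        width = L * sigma * math.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i))
        )
        values.append(z)
        if z > mu0 + width or z < mu0 - width:
            alarms.append(i - 1)
    return values, alarms


# Synthetic stand-in for a log(FD) series: on target, then a mean shift.
demo = [0.0] * 10 + [2.0] * 10
_, _, cusum_alarms = tabular_cusum(demo, mu0=0.0, sigma=1.0)
_, ewma_alarms = ewma_chart(demo, mu0=0.0, sigma=1.0)
print("first CUSUM alarm index:", cusum_alarms[0])
print("first EWMA alarm index:", ewma_alarms[0])
```

In a control loop, an alarm from either chart would be the trigger for a corrective action such as changing the power-control target or the codec rate; how that action is chosen is outside the scope of this sketch.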
Table of Contents
Abstract ....................................................................................................................................... i
Table of Contents ...................................................................................................................... iii
Dedication ................................................................................................................................. vi
Acknowledgements .................................................................................................................. vii
List of Abbreviations................................................................................................................. ix
List of Common Symbols ......................................................................................................... xi
List of Figures ......................................................................................................................... xiii
List of Tables............................................................................................................................ xv
CHAPTER 1 ............................................................................................................................... 1
INTRODUCTION ....................................................................................................................... 1
1.1 Thesis structure ................................................................................................. 2
1.2 Summary of major Contributions...................................................................... 4
1.2.1 log (FDn) as the perceptual speech quality metric ........................................... 4
1.2.2 The CUSUM application in Speech Codec Rate and Power Control for UMTS ..................... 4
1.2.3 The EWMA application in Power Control for UMTS ..................................... 5
1.3 Publications ....................................................................................................... 5
CHAPTER 2 ............................................................................................................................... 6
LITERATURE REVIEW ............................................................................................................. 6
2.0 Introduction ............................................................................................................. 6
2.1 Speech Quality Metrics and Measurement Method ................................................ 6
2.1.2 Perceptual Speech Quality Metric .................................................................... 8
2.2 Power Control Scheme.......................................................................................... 17
2.2.1 Centralized Power Control ............................................................................. 18
2.2.2 Distributed Power Control ............................................................................. 19
2.3 UMTS Power Control ........................................................................................... 20
2.3.1. Open Loop Power Control ............................................................................ 21
2.3.2. Closed Loop Power Control .......................................................................... 22
2.4 Statistical Process Control (SPC) .......................................................................... 27
2.5 Summary ............................................................................................................... 34
CHAPTER 3 ............................................................................................................................. 35
METHODOLOGY .................................................................................................................... 35
3.0 Introduction ........................................................................................................... 35
3.1 Proposed Perceptual Speech Quality Control Model ............................................ 37
3.1.1 Motivation ...................................................................................................... 37
3.1.2 Proposed Model ............................................................................................. 37
3.1.3 Original Input Speech File and Speech Codec ................................................... 39
3.2 PESQ ..................................................................................................................... 40
3.2.1 Level Alignment ............................................................................................. 41
3.2.2 Input Filtering ................................................................................................. 41
3.2.3 Time Alignment and Equalization ................................................................. 41
3.2.4 Auditory Transform ....................................................................................... 41
3.2.5 Disturbance Processing and Cognitive Modelling ......................................... 42
3.2.6 Disturbance Aggregation and MOS Prediction .............................................. 42
3.2.7 Realignment of Bad Intervals ......................................................................... 43
3.3 Frame Disturbance ................................................................................................ 43
3.4 FQI Feedback Method .......................................................................................... 45
3.5 CUSUM ................................................................................................................ 47
3.5.1 Tabular CUSUM ............................................................................................ 48
3.5.2 The V-mask ........................................................................................................ 51
3.6 EWMA .................................................................................................................. 53
3.7 Closed Loop Power Control in FDD Mode .......................................................... 57
3.7.1 Conventional UMTS Outer Loop Power Control Algorithm ............................ 58
3.8 SPC Based UMTS Power Control ........................................................................ 60
3.9 Summary ............................................................................................................... 62
CHAPTER 4 ............................................................................................................................. 63
THE CUSUM TECHNIQUE APPLICATION IN PERCEPTUAL SPEECH QUALITY CONTROL ................................ 63
4.0 Introduction ........................................................................................................... 63
4.1 Frame Disturbance Analysis ................................................................................. 64
4.1.2 Input speech file and speech codec ................................................................ 64
4.1.3 Methodology .................................................................................................. 64
4.1.4 Simulation result and discussion .................................................................... 65
4.2 Speech Codec Rate Control Simulation model ..................................................... 69
4.2.1 Introduction .................................................................................................... 69
4.2.2 Methodology .................................................................................................. 69
4.2.3 Simulation results and discussion .................................................................. 71
4.3 Power Control Simulation Model ......................................................................... 73
4.3.1 Input speech file ............................................................................................. 74
4.3.2 Speech codec .................................................................................................. 74
4.3.3 Multiplexing and channel coding ................................................................... 75
4.3.4 Power Control ................................................................................................ 76
4.3.5 Channel .......................................................................................................... 79
4.3.6 Summary of simulation parameters ............................................................... 80
4.3.7 Methodology .................................................................................................. 80
4.3.8. Simulation results and discussion ................................................................. 82
4.4 Summary ............................................................................................................... 94
CHAPTER 5 ............................................................................................................................. 96
THE EWMA TECHNIQUE APPLICATION IN PERCEPTUAL SPEECH QUALITY CONTROL ................................. 96
5.0 Introduction ........................................................................................................... 96
5.1 Data Distributions Responses with the Application of EWMA and CUSUM ..... 97
5.1.1 Data Sample ................................................................................................... 97
5.1.2 Methodology .................................................................................................. 98
5.1.3 Simulation result and discussion .................................................................... 99
5.2 Power Control Simulation Model ....................................................................... 102
5.2.1 EWMA based UMTS Power Control .......................................................... 103
5.2.2 Summary of simulation parameters ............................................................. 105
5.2.3 Methodology ................................................................................................ 105
5.2.4. Simulation results and discussion ............................................................... 107
5.3 Summary ............................................................................................................. 125
CHAPTER 6 ........................................................................................................................... 126
CONCLUSIONS ..................................................................................................................... 126
6.1 Summary of Major Findings and Contributions ................................................. 127
6.2 Suggestions for Future Work .............................................................................. 129
APPENDIX ............................................................................................................................. 131
ITU Speech Files .................................................................................................................... 131
Acknowledgements
Thanks be to God!
Many people have contributed to the success of this project through
different kinds of assistance. I want to express my deepest gratitude and
thank them for their contributions.
To my supervisory committee,
I would like to sincerely thank my wonderful supervisors, Associate Professor Roberto
Togneri and Professor Sven Nordholm, for sharing their knowledge and giving me such
helpful guidance along the way to completing this thesis. Their advice and comments
have been insightful to me. Special thanks also go to my former supervisor, Dr
Bijan Rohani, for initiating this research and giving me ideas to enhance it.
To the UWA staff members,
I wish to express my appreciation to all those who assisted me with my
experimental work and my PhD documentation. My thanks also go to my lab mates, Sarajul,
Daniel, and Ingrid, who assisted and shared the experience of doing the research.
To my colleagues and friends,
Special thanks to my friend Daniel for willingly being my proofreader; I really
appreciate it. Thanks also to all my friends who made living in Perth enjoyable and
supported me during my PhD: Nurazura Mohd Diah, Nor Azlin Tajuddin,
Ibrahim Abdul Rahman, Nadzril Sulaiman, Ahmad Fareed Ismail, Hamdan Daniyal,
Nor Fadhilah Mohd Azmin, Abdul Malek Abdul Hamid and others.
To my family,
Thanks to my mother, Fatimah Che Long, my late father, Jusoh Latiff, my lovely wife,
Nor Haslinda Abdul Hameed, my sisters: Zamilah, Zaiton, Zarini and Zahariayana, my
brothers: Mohd Zainuddin, Mohd Zakuan and Mohd Zaim Rasyidi, my family-in-law,
Bungsu Ismail and Mohd Yusof Mohd Nor and family, for always being with me in good
and bad moments. Thank you for your prayers, patience, love and moral guidance
throughout this critical time. Your tremendous support helped me
complete this thesis.
Last but not least, special thanks to my employer, the International Islamic University
Malaysia, and the Ministry of Higher Education for giving me the opportunity and the
scholarship that made my PhD journey a reality.
List of Abbreviations
3G Third Generation
3SQM Single Sided Speech Quality Measure
ACR Absolute Category Rating
ACELP Algebraic Code Excited Linear Prediction
AMR Adaptive Multi-Rate
ANIQUE Auditory Non-Intrusive Quality Estimation
ARL Average Run Length
ASD Auditory Spectrum Distance
BER Bit Error Rate
BS Base Station
BSD Bark Spectral Distance
BTS Base Transceiver Station
CC Convolutional Coding
CD Cepstral Distance
CDMA Code Division Multiple Access
CePC Centralized Power Control Scheme
CLPC Closed Loop Power Control
CRC Cyclic Redundancy Check
CUSUM Cumulative Sum
DCR Degradation Category Rating
DMOS Degradation Mean Opinion Score
DPC Distributed Power Control
EWMA Exponentially Weighted Moving Average
ETSI European Telecommunications Standards Institute
FDD Frequency Division Duplex
FDMA Frequency Division Multiple Access
FER Frame Error Rate
FEP Frame Erasure Pattern
FD Frame Disturbance
FQI Frame Quality Indicator
FFT Fast Fourier Transform
GMA Geometric Moving Average
IRS Intermediate Reference System
ITU International Telecommunication Union
ITU-T International Telecommunication Union Telecommunication Standardization Sector
MNB Measuring Normalizing Blocks
MS Mobile Station
MOS Mean Opinion Score
OLPC Open Loop Power Control
OPCS Optimum Power Control Scheme
PAMS Perceptual Analysis Measurement System
PCM Pulse Code Modulation
PAQM Perceptual Audio Quality Measure
PESQ Perceptual Evaluation of Speech Quality
PSD Power Spectral Density
PSQM Perceptual Speech Quality Measure
QoS Quality of Service
RNC Radio Network Controller
SIR Signal-to-Interference Ratio
SPC Statistical Process Control
TDD Time Division Duplex
TDMA Time Division Multiple Access
TPC Transmit Power Control
TSPC Target-SIR-tracking Power Control
UE User Equipment
UMTS Universal Mobile Telecommunication System
TQM Total Quality Management
VoIP Voice over Internet Protocol
List of Common Symbols
R Transmission rating of E-model
$TPC_{cmd}$ TPC command
δ Step size in inner loop power control
rxTPCcmd Received TPC command
txTPCcmd Transmitted TPC command
$\bar{X}$-chart Shewhart Sample Mean
R-chart Shewhart Sample Range
p-chart Sample Proportion Defective
np-chart Sample Number of Defectives
c-chart Sample Number of Defects
u-chart or $\bar{c}$-chart Sample Number of defects per unit
$D(f)_n$ Disturbance density
$DA(f)_n$ Asymmetrical disturbance density
N Frame number
$M_n$ Multiplication factor
$N_b$ Number of Bark bands
$W_f$ Series of constants
$\mu_0$ Target value of CUSUM
$C^+$ Upper limit of CUSUM
$C^-$ Lower limit of CUSUM
K Reference value of CUSUM
H Tabular CUSUM limit
σ Standard deviation
$C_0^+$ Initial value of the upper CUSUM
$C_0^-$ Initial value of the lower CUSUM
α Probability of a false alarm in CUSUM
β Probability of not detecting a shift of size δ
L Width factor of the EWMA control limits
$z_1$ First value of EWMA
UCL Upper control limit of EWMA
LCL Lower control limit of EWMA
$\mu_s$ Estimated mean of log(FD)
∆ Step size in outer loop power control
P Statistical significance value
l Slot index for inner loop power control algorithm
List of Figures
2.1 Speech quality metric categorization……………………...………….... 7
2.2 Speech quality metrics and location measured………….……………... 7
2.3 Anatomy of the human ear………...…………………………………… 9
2.4 Basic operations performed by a perceptual speech quality metric…… 13
2.5 UMTS power control basic block diagram……………………………. 21
2.6 Open Loop Power Control operation………………………………..… 22
2.7 Closed Loop Power Control operation………………………………… 23
2.8 Block diagram of UMTS CLPC…….…………………………………. 23
2.9 General outer loop power control algorithm…………………………... 24
2.10 General inner loop power control algorithm…………………………... 25
2.11 Production process inputs and outputs………………………………… 28
2.12 Sample of Histogram………………………………………….……….. 29
2.13 Sample of Pareto Chart………………………………………………… 30
2.14 Sample of Cause and Effects Diagram………………………………… 30
2.15 Sample of Scatter Diagram……………………………………………. 31
2.16 Process improvement using the Control Chart………………………… 32
2.17 Sample of basic Shewhart Control Chart………………………………. 33
3.1 Example of perceptual speech quality experienced by more than 30 end users in a simulated 3G UMTS network…………………………... 36
3.2 Proposed model for speech codec rate control………………………… 38
3.3 Application of CUSUM/EWMA based on $FD_n$..………………………... 38
3.4 PESQ block diagram…………………………………………...……… 40
3.5 Structure of PESQ model…………………………………...…………. 40
3.6 $FD_n$ concept in controlling perceptual quality………………………… 44
3.7 The classification of the encoded speech bits and their unequal error protection scheme for UMTS…………………………………….……. 45
3.8 Block diagram of FQI feedback method………...…………………….. 46
3.9 Sample of Tabular CUSUM………………………………...…………. 50
3.10 A typical V-Mask…………………………….………………………... 51
3.11 The physical distance between subgroup samples is equivalent to a unit on the vertical axis……………………………………………….... 52
3.12 Sample of EWMA chart…...……………………………….………….. 55
3.13 Conventional UMTS outer-loop power control flow chart……………. 59
3.14 SPC based UMTS outer-loop power control………………..…………. 61
4.1 Simulation model for frame disturbance analysis………........................ 65
4.2 $\log(FD_n)$ distribution for PESQ MOS 3.0-3.5: (a) 3.0, (b) 3.1, (c) 3.2, (d) 3.3, (e) 3.4 and (f) 3.5……………………………………................. 68
4.3 The simulation model for speech codec rate control…………………... 69
4.4 A CUSUM control chart without the controlling speech codec rate…... 71
4.5 Apply CUSUM with controlling speech codec rate…………………... 71
4.6 A CUSUM control chart without the controlling speech codec rate…... 72
4.7 Apply CUSUM with controlling speech codec rate…………………… 73
4.8 Block diagram of the simulation model of UMTS physical layer (FDD mode)…………………………………………………………………... 74
4.9 Application of CUSUM in UMTS outer-loop power control…...…….. 77
4.10 CUSUM based UMTS outer-loop power control……………….…….. 78
4.11 Performance comparison of CUSUM based and conventional power control (shadowing profile 5 and ∆ = 0.005 dB): (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹……………………………………….. 91
4.12 Performance comparison of CUSUM based and conventional power control (shadowing profile 1 and ∆ = 0.02 dB): (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹……………………………………….. 93
5.1 Data sample which has a normal distribution…………………...……... 97
5.2 Data sample which does not have a normal distribution……………...…. 98
5.3 Result of the application of (a) EWMA technique and (b) CUSUM technique to the normal distribution data………………........................ 100
5.4 Result of the application of (a) EWMA technique and (b) CUSUM technique to the non-normal distribution data…………………………. 101
5.5 Application of EWMA in UMTS outer-loop power control…………... 103
5.6 EWMA based UMTS outer-loop power control………………………. 104
5.7 Performance comparison of conventional, CUSUM based and EWMA based power control (shadowing profile 5 and ∆ = 0.005 dB): (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹………………………... 121
5.8 Performance comparison of conventional, CUSUM based and EWMA based power control (shadowing profile 1 and ∆ = 0.01 dB): (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹………………………... 124
List of Tables
3.1 Number of bits in Classes A, B, and C for each AMR codec mode……… 39
3.2 The parameters for the sample of Tabular CUSUM chart………………... 49
3.3 EWMA parameters for the sample of EWMA chart……………………... 55
4.1 The estimated mean, $\mu_0$, and the standard deviation of the $\log(FD_n)$ distribution………………………………………………………………… 68
4.2 Parameters chosen for CUSUM chart ……….…………………………… 70
4.3 Summary of AMR codec mode 7 frame structure………………………... 75
4.4 Conventional UMTS power control parameters…………………………... 77
4.5 Tapped-delay-line parameters for Vehicular A environment…………..…. 79
4.6 Main simulation parameters………………………………………...…….. 81
4.7 Results for conventional and CUSUM based power control algorithms with outer-loop step down, ∆down = 0.005 dB and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹…………………………….. 84
4.8 Results for conventional and CUSUM based power control algorithms with outer-loop step down, ∆down = 0.01 dB and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹………………….................. 85
4.9 Results for conventional and CUSUM based power control algorithms with outer-loop step down, ∆down = 0.015 dB and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹………………….................. 86
4.10 Results for conventional and CUSUM based power control algorithms with outer-loop step down, ∆down = 0.02 dB and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹……………………………. 87
4.11 Results for conventional and CUSUM based power control algorithms for all outer-loop step sizes and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹………………………………………………………… 88
5.1 Chosen parameters for normal distribution data: (a) EWMA and (b) CUSUM……………………………………………………………….. 99
5.2 Chosen parameters for the non-normal distribution data: (a) EWMA and (b) CUSUM……………………………………………………………….. 99
5.3 Main Simulation Parameters……………………………………………… 106
5.4 Results for Conventional and EWMA based power control algorithms with outer-loop step down, ∆down = 0.005 dB and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹…………………………… 108
5.5 Results for Conventional and EWMA based power control algorithms with outer-loop step down, ∆down = 0.01 dB and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹…………………………….. 109
5.6 Results for Conventional and EWMA based power control algorithms with outer-loop step down, ∆down = 0.015 dB and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹…………………………….. 110
5.7 Results for Conventional and EWMA based power control algorithms with outer-loop step down, ∆down = 0.02 dB and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹…………………………….. 111
5.8 Results for EWMA and CUSUM based power control algorithms with outer-loop step down, ∆down = 0.005 dB and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹…………………………….. 112
5.9 Results for EWMA and CUSUM based power control algorithms with outer-loop step down, ∆down = 0.01 dB and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹……………………………. 113
5.10 Results for EWMA and CUSUM based power control algorithms with outer-loop step down, ∆down = 0.015 dB and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹…………………………..… 114
5.11 Results for EWMA and CUSUM based power control algorithms with outer-loop step down, ∆down = 0.02 dB and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹…………………………….. 115
5.12 Result for Conventional and CUSUM based power control algorithms for all simulated outer loop step sizes and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹…………………………….. 116
5.13 Result for EWMA and CUSUM based power control algorithms for all simulated outer loop step sizes and vehicular speed of (a) 3 km h⁻¹, (b) 50 km h⁻¹ and (c) 120 km h⁻¹…………………………….. 117
A ITU Speech files used for FD analysis for PESQ MOS 3.0………………. 131
B ITU Speech files used for FD analysis for PESQ MOS 3.1………………. 131
C ITU Speech files used for FD analysis for PESQ MOS 3.2………………. 131
D ITU Speech files used for FD analysis for PESQ MOS 3.3………………. 132
E ITU Speech files used for FD analysis for PESQ MOS 3.4………………. 132
F ITU Speech files used for FD analysis for PESQ MOS 3.5………………. 132
CHAPTER 1
INTRODUCTION
As a result of increasing competition, measuring and controlling the end-user
perception of service quality is becoming increasingly important to cellular network
operators. To measure and control perceptual speech quality efficiently, an accurate
speech quality measurement is required. In modern cellular networks, such
measurements serve a variety of purposes, ranging from
daily network maintenance to radio resource management through power control and
link adaptation.
To date, speech quality has been monitored and controlled based on
conventional measurements such as Signal-to-Interference Ratio (SIR), Bit Error Rate
(BER), and Frame Error Rate (FER). The FER measure is widely used in systems such as
3G UMTS (Universal Mobile Telecommunication System) because it is regarded as a
good measure of speech quality. However, FER is not a perceptual measure of speech
quality. Furthermore, none of these non-perceptual measurements has been shown to
estimate speech quality with sufficient accuracy or reliability [1].
Nevertheless, these parametric methods are still commonly used despite their
inferior performance. Since these methods lack accuracy in their prediction of perceived
speech quality, the service provider needs to cater for the worst case scenario in order to
ensure that the quality expectations of almost all customers are met; that is, the provider
has to expend more resources than necessary, such as transmission power and
speech codec rate, to prevent speech quality from dropping below a certain acceptable
limit. There are no constraints on the upper quality value. Therefore, more than
adequate quality is often provided at the expense of valuable resources. In other words, the available
methods do not control the perceptual quality directly, but do so indirectly through
some relevant channel measures. Furthermore, applying any control to the signal will
only result in corresponding changes in the variables measured by the parametric
method, i.e. FER, SIR or BER.
A truly perceptual quality measure is obtained when the received speech signal is
analysed with a perceptual algorithm such as the Perceptual Evaluation of Speech Quality
(PESQ) model, which is the state of the art recommendation of the International
Telecommunication Union’s Telecommunication Standardization Sector (ITU-T) for
referenced perceptual measurement methods. PESQ has been designed to improve on the
previous objective methods. It is implemented commercially in testing devices and
monitoring systems [2]. As such, the application of PESQ as a monitoring and
controlling method will be beneficial to the telecommunications industry. In this thesis,
speech quality monitoring and control based on the Frame Disturbance (FD), which is
extracted from the PESQ algorithm, will be investigated. The Frame Quality Indicator
(FQI) method used for estimation of the perceptual speech quality is applied in this
research to ensure that the FD statistical data can be employed as a speech quality
metric.
The main aim of this research is first to incorporate perceptual quality
measurement schemes in mobile networks in place of their traditional counterparts.
Subsequently, methods for direct control of the perceptual speech quality, such as
Statistical Process Control (SPC), are applied in mobile communication systems. The
control of perceptual speech quality using mechanisms such as power control and
“hybrid” control has been studied and applied before [3-5]; however, a direct
control approach using tools such as SPC is attempted here for the first time.
Statistical process control has been widely used in manufacturing and industrial
quality control [6]. A statistical process control mechanism that has received much
attention in the statistics literature, and much use in industry for controlling the process mean,
is the Cumulative Sum (CUSUM) method. Furthermore, the CUSUM scheme is reported to detect
process shifts faster than any other method [7]. Hence, in this thesis a direct control
approach using the CUSUM scheme to control the perceptual speech quality will be
analysed. The performance of the scheme is compared to that of its counterpart tool in SPC,
the Exponentially Weighted Moving Average (EWMA) scheme.
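As a rough illustration of how a CUSUM scheme flags a shift in a process mean, the following Python sketch implements the textbook two-sided tabular CUSUM. The function name, the reference value `k` and the decision interval `h` are illustrative assumptions; they are not the parameters or the exact scheme used in this thesis.

```python
def tabular_cusum(samples, target, k, h):
    """Two-sided tabular CUSUM (textbook form, illustrative only).

    samples: observed values of the monitored statistic
    target:  in-control mean of the statistic
    k:       reference (slack) value, assumed chosen by the user
    h:       decision interval, assumed chosen by the user
    Returns a list of (index, direction) alarms.
    """
    c_plus = 0.0   # accumulates deviations above target + k
    c_minus = 0.0  # accumulates deviations below target - k
    alarms = []
    for i, x in enumerate(samples):
        c_plus = max(0.0, x - (target + k) + c_plus)
        c_minus = max(0.0, (target - k) - x + c_minus)
        if c_plus > h:
            alarms.append((i, "up"))
            c_plus = 0.0   # restart the upper sum after an alarm
        elif c_minus > h:
            alarms.append((i, "down"))
            c_minus = 0.0  # restart the lower sum after an alarm
    return alarms
```

In a control loop, an "up" or "down" alarm would be the trigger for a corrective action; which action is taken is outside the scope of this sketch.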
The outcome of this study potentially benefits both the network
provider and users. The provider can optimize the network resources by providing just
enough resources to meet the required levels of service, while delivering consistent
perceived quality to the customers. This is achieved while maintaining a satisfactory
service level for all customers.
1.1 Thesis structure
Chapter 2 covers the literature review for the thesis. The chapter starts with a survey
of speech quality metrics. The power control schemes used in mobile communication
systems are then described and discussed. SPC and its tools are reviewed at the end of the
chapter.
In Chapter 3, a novel method of controlling power as well as the speech codec
rate is proposed, both of which are associated with the usage of power in mobile
communication systems. PESQ, the state of the art referenced perceptual speech quality
measure, is explored in the proposed system. The FD, which is extracted from PESQ and is
proposed to replace non-perceptual speech quality metrics such as FER, is described
in detail in the chapter. The FQI method used for estimation of perceptual speech
quality in this research is also discussed. The direct control approach using the
SPC tools CUSUM and EWMA, which has not yet been explored in mobile
communications systems, is described in detail at the end of the chapter.
In Chapter 4, speech codec rate and power control using a CUSUM based
technique is applied in UMTS to improve its performance. The chapter
begins with the analysis of the FD which is extracted from PESQ. The analysis shows
that log(FDn) is normally distributed. Since CUSUM is naturally applied to
normally distributed data, the employment of CUSUM in this research is justified. Then,
the application and analysis of the CUSUM based technique for controlling the speech
codec rate and power in UMTS are discussed. It is shown how quickly the
proposed perceptual speech quality metric, log(FDn), works together with CUSUM to
control the speech quality. Here, the performance of the CUSUM based power control
algorithm is compared with that of the conventional UMTS power control algorithm.
It is demonstrated that CUSUM based power control achieves adequate speech quality
while using fewer system resources. The application of the CUSUM based technique to
power control for UMTS shows a reduction of up to 13% in the
average SIR target compared to its conventional counterpart.
In Chapter 5, power control using the EWMA based technique is applied in
UMTS and compared with the CUSUM based technique. The chapter begins with the
response of both techniques to normally and non-normally distributed data.
It shows that, in the case studied, EWMA is superior to CUSUM in
detecting a shift in non-normally distributed data. The
EWMA based power control technique is then applied to UMTS and its performance is
compared with that of the conventional and CUSUM based
techniques. The results show that the CUSUM based technique achieves up to a 5% reduction in
the SIR target compared to the EWMA based technique, while the EWMA based
technique achieves up to a 9% reduction in the SIR target compared to the conventional
technique.
1.2 Summary of major Contributions
In this section, the major contributions of the thesis are summarized. These
contributions, to the best of the author’s knowledge, are innovative and have not been
published previously by other authors.
1.2.1 log (FDn) as the perceptual speech quality metric
The PESQ algorithm has been extensively used in measurement tools for accurate
assessment of perceptual speech quality in modern telecommunication networks.
However, the smallest period that PESQ can evaluate speech quality is 320 ms [8].
Even though this or longer periods may be suitable for monitoring speech quality, it
may be too long for effective control of quality in the network. As such, it will be
necessary to investigate metrics which can be calculated faster than 320 ms for
application in controlling the quality.
The PESQ score is calculated from the so-called “Frame Disturbance” (FD), which is
effectively the perceptual distance between the reference and the distorted speech
signals [9]. The FD is calculated every 16 ms. Even though 16 ms is too short for
assessing speech quality, it is suitable for control purposes. It is therefore proposed that the FD be
investigated as a perceptual metric for control of speech quality in modern networks.
Since the perceptual quality is a relatively long term aggregate of the FD values
[2], the relationship between the statistics of FD values and the resulting speech quality
is investigated. The numerical analysis of the FD shows that log(FDn) has a normal
distribution for a given perceptual quality, expressed as the Mean Opinion Score (MOS). It is also
demonstrated that the mean of the distribution of log(FDn) increases as the
perceptual quality degrades, and vice versa. These results show that the FD statistics
are appropriate for the application of SPC schemes in directly
controlling the perceptual speech quality.
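The statistics referred to above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, the natural logarithm is an assumption (the base of the logarithm is not stated here), and discarding non-positive FD values before taking the logarithm is likewise an assumption.

```python
import math
import statistics

def log_fd_summary(fd_values):
    """Return (mean, standard deviation) of log(FD) for a set of frames.

    fd_values: per-frame FD values (one value per 16 ms frame).
    Non-positive values are skipped because their logarithm is undefined
    (an assumption made for this sketch).
    """
    logs = [math.log(v) for v in fd_values if v > 0.0]
    if len(logs) < 2:
        raise ValueError("need at least two positive FD values")
    return statistics.mean(logs), statistics.stdev(logs)
```

The mean returned here is the quantity that an SPC chart would monitor: under the finding reported above, a sustained rise in the mean of log(FDn) corresponds to a degradation in perceptual quality.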
1.2.2 The CUSUM application in Speech Codec Rate and Power Control for UMTS
CUSUM based speech codec rate control is applied to UMTS. The CUSUM based
technique allows faster action at the transmitter to control the quality of the speech
signals as required by end users. The performance of CUSUM based
and FER based outer loop power control algorithms is compared through simulations. It is revealed
that the CUSUM based power control achieves adequate speech quality while reducing
the average SIR target by up to 13% relative to the conventional algorithm.
1.2.3 The EWMA application in Power Control for UMTS
The responses of the EWMA and CUSUM techniques to normally distributed and
non-normally distributed data are compared. It is shown that, in our case, the EWMA
technique responds better than the CUSUM technique to data which do not have a normal
distribution. The EWMA based technique is also superior to the CUSUM based technique in
detecting larger shifts. On the other hand,
the CUSUM technique responds better than the EWMA technique to normally distributed
data.
An EWMA based power control technique is applied to UMTS and its performance is
compared with that of conventional and CUSUM based power control techniques. It is
shown that both the EWMA and CUSUM algorithms reduce the average SIR target
compared to the conventional algorithm. However, the CUSUM based power control
achieves adequate speech quality while reducing the average SIR target by up to
5% relative to the EWMA based algorithm.
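For comparison with CUSUM, the textbook EWMA control chart can be sketched as below. The smoothing constant `lam` and the limit width `width` are illustrative defaults chosen for this sketch, not values taken from this thesis, and the function name is hypothetical.

```python
import math

def ewma_chart(samples, target, sigma, lam=0.2, width=3.0):
    """Textbook EWMA chart with time-varying control limits (illustrative).

    samples: observed values of the monitored statistic
    target:  in-control mean, also used as the starting EWMA value
    sigma:   in-control standard deviation of a single observation
    Returns a list of (ewma_value, out_of_control) pairs.
    """
    z = target
    results = []
    for i, x in enumerate(samples, start=1):
        # Exponentially weighted average of the current and past samples.
        z = lam * x + (1.0 - lam) * z
        # Half-width of the control band after i samples.
        half = width * sigma * math.sqrt(
            lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * i))
        )
        results.append((z, abs(z - target) > half))
    return results
```

A small `lam` gives more weight to history and favours detection of small sustained shifts, whereas a larger `lam` reacts more strongly to individual large deviations.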
1.3 Publications
The following publication corroborates the material presented in this thesis:
1. Jusoh A.Z., Togneri R., Rohani B., and Nordholm S., "CUSUM application in
perceptual speech quality control," in 15th Asia-Pacific Conference on
Communication (APCC 2009), Shanghai, China, October 2009, pp. 694-698.
CHAPTER 2
LITERATURE REVIEW
2.0 Introduction
Nowadays, the demand for mobile communications and its associated technologies is
increasing rapidly. However, speech communication is still the main requirement of
end users. The service provider’s profit depends on the service provided, and
the demand for the service depends in turn on the end user’s satisfaction
with the Quality of Service (QoS) they receive. In mobile telephony, this
amounts to controlling speech quality as “perceived” by customers. Controlling
perceptual speech quality necessitates a reliable measurement of the quality first,
followed by direct control of it. Power control schemes have been proposed
to improve the utilisation of resources as well as to provide a better quality of
service to mobile network customers [10]. SPC has been widely used in manufacturing
and industrial quality control [6]. However, a direct control approach using SPC has not
yet been explored in mobile communications systems. In this chapter, a survey of
speech quality metrics is presented, followed by a review of power control schemes for mobile
radio systems and of SPC.
2.1 Speech Quality Metrics and Measurement Method
A wide range of metrics are used, or could be used, to assess speech quality in
mobile radio networks. Figure 2.1 shows the range of metrics, from
conventional metrics to perceptual metrics. There are two types of perceptual metric:
objective and subjective. Objective perceptual metrics are further divided into
referenced and non-referenced metrics.
Figure 2.1: Speech quality metrics categorization.
Traditionally, the speech quality in wireless systems is estimated based on parametric
methods that rely on channel quality measurements at the receiver [4, 11, 12]. SIR, BER,
and FER are the more widely used metrics in mobile radio systems.
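Figure 2.2: Speech quality metrics and location measured.
[Block diagram: Speech In, Speech Encoder, Channel Encoder, Modulator, Channel, Demodulator, Channel Decoder, Speech Decoder, Reconstructed Speech Out; SIR, BER, FER and the perceptual speech quality metric are marked along the receiver chain.]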
Figure 2.2 shows the speech quality metrics and where they are measured in
simplified communication systems. The simplest quality measure at the receiver side is
SIR. It is the quotient between the average received modulated carrier power S or C and
the average received co-channel interference power I. SIR is directly related to the
carrier power and is hence easy to control. The BER can be measured after demodulation. BER
is defined as the ratio of the number of bits that are in error to the total number of bits
entering the modulator over a given period of time. FER can be measured after the
speech undergoes the channel decoding process, and is defined as the ratio of
frames which are in error to the total number of frames over the given period of time.
All of these measurements are related to speech quality. Among them, FER is
considered to be the most reliable and is commonly used in modern mobile radio
networks, since frame errors are the major cause of degradation in speech signal
quality. However, none of these measurements has been shown to accurately and
reliably estimate speech quality [1].
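The three conventional metrics defined above reduce to simple ratios. The following Python sketch restates those definitions; the function names and input representations are hypothetical, and expressing SIR in decibels is a common convention assumed here rather than something stated in the text.

```python
import math

def sir_db(carrier_power, interference_power):
    """SIR as the quotient of average carrier and interference power, in dB."""
    return 10.0 * math.log10(carrier_power / interference_power)

def bit_error_rate(sent_bits, received_bits):
    """Fraction of bits that differ between the sent and demodulated streams."""
    if len(sent_bits) != len(received_bits) or not sent_bits:
        raise ValueError("bit sequences must be non-empty and of equal length")
    errors = sum(1 for s, r in zip(sent_bits, received_bits) if s != r)
    return errors / len(sent_bits)

def frame_error_rate(frame_ok_flags):
    """Fraction of frames flagged as erroneous after channel decoding.

    frame_ok_flags: one boolean per frame, True if the frame passed its check.
    """
    if not frame_ok_flags:
        raise ValueError("at least one frame is required")
    bad = sum(1 for ok in frame_ok_flags if not ok)
    return bad / len(frame_ok_flags)
```

None of these functions looks at the speech signal itself, which is the sense in which the metrics are non-perceptual.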
However, these parametric methods, with their poor performance in measuring
the true speech quality, are still commonly used. Since these methods lack accuracy in
their prediction of perceived speech quality, the service provider needs to cater for the
worst case scenario in order to ensure that the quality expectations of almost all the
customers are met; that is, the provider will have to unnecessarily expend more
resources, such as transmission power and speech codec rate to prevent speech quality
from dropping below a certain acceptable limit. There are no constraints on the upper
quality value. Therefore, often more than adequate quality is provided at the expense of
valuable resources. That is, the available methods do not control perceptual quality
directly but they do so indirectly through some relevant channel measures.
2.1.2 Perceptual Speech Quality Metric
A truly perceptual quality measure is obtained when the received speech
signal is analysed with a perceptual algorithm based on the human hearing system. Perceptual
speech quality measurement is relatively new to mobile radio networks. It
is based on a psychoacoustic sound representation, which is
elaborated on in the following sections. There are two types of perceptual
measurement methods: subjective and objective [13]. The subjective method uses
human test subjects, while the objective method uses a model instead of a
human. Because the subjective method is expensive, time
consuming and not suitable for day-to-day application [13-15], the objective method is
more appropriate for this research.
Psychoacoustics
Psychoacoustics is the study of human perception of sound [16]. Sound is an alternating
air pressure which propagates from the source through a medium to the receptor. The
human perception of sound depends on the auditory behavioural responses of human
listeners, the abilities and limitations of the human ear, and the complex auditory
processing which occurs inside the brain.
Figure 2.3: Anatomy of the human ear [17].
The human ear is divided into three parts: the outer ear, middle ear and the inner
ear as illustrated in Figure 2.3. The outer ear amplifies the incoming air vibrations. The
middle ear transduces these into mechanical vibrations and the inner ear filters and
converts them into hydrodynamic and electro-mechanical vibrations, after which, those
electromechanical signals are transmitted through nerves to the brain.
The human auditory system is remarkable in terms of absolute sensitivity and
the range of intensities to which it can respond [16]. Intensity means the acoustic power
of a sound per unit of area. The audible frequency range of the human ear is roughly
20 to 20,000 Hz, and the ear can respond to intensity levels of up to about 120 dB. Human
hearing is binaural, which allows humans to localize a sound by registering slight
differences in the time, phase, and intensity of the sound striking each ear. Human hearing
can detect time differences as slight as 30 ms; by automatically comparing the left
and right ear receptions and evaluating the sound’s intensity, it allows humans to
identify the approximate location of the original sound.
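To make the decibel figures above concrete, the sketch below converts an acoustic intensity in W/m² to a sound intensity level in dB. The reference intensity of 10⁻¹² W/m² is the conventional threshold-of-hearing reference and is an assumption added for this illustration; it is not quoted from the text.

```python
import math

# Conventional reference intensity in W/m^2 (assumed for this sketch).
REFERENCE_INTENSITY = 1e-12

def intensity_level_db(intensity, reference=REFERENCE_INTENSITY):
    """Sound intensity level in dB relative to the reference intensity."""
    if intensity <= 0.0:
        raise ValueError("intensity must be positive")
    return 10.0 * math.log10(intensity / reference)
```

Under this convention, the reference intensity itself maps to 0 dB and an intensity of 1 W/m² maps to 120 dB, a span of twelve orders of magnitude.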
Sound may be generally characterized by pitch, loudness, and timbre. In
psychoacoustics, pitch is the psychological perception of frequency. According to research
such as [18], pitch is a response pattern to the frequency of a sound. In
music, it is defined as the position of a single sound in the complete range of sound
from lowest to highest. The rise and fall in pitch is dependent on the strength of
vibration of the sound waves that produce that particular sound.
Loudness is a subjective perception of the intensity of sound. The ear is less
sensitive to low frequencies. The maximum sensitivity of human hearing is between
1,000 and 5,000 Hz and the standard threshold of hearing at 1,000 Hz is nominally
taken to be 0 dB.
Timbre is the ability of the ear to distinguish two similar sounds that have the
same pitch and loudness. It is mainly determined by harmonic content and dynamic
characteristics that allow us to discriminate sounds produced by the different sources we
hear at the same time.
The concept of critical bands, masking phenomena and the minimum threshold
of hearing of the human auditory system are important in psychoacoustic modelling
[19]. A critical band is the smallest band of frequencies that activate the same part of the
basilar membrane of the cochlea in the inner ear. The concept of critical bands was
introduced by Fletcher in 1940 [20] and has been widely tested. J.V. Tobias presented the
critical band scale in 1970 [21]. From that scale, it is clear that the critical bands are
much narrower at low frequencies than at high frequencies, with ¾ of the critical bands
lying below 5,000 Hz. At low frequencies the ear can distinguish tones a few hertz
apart, but at high frequencies tones must differ by hundreds of hertz to be
distinguished. When two sounds of equal loudness are close
together in pitch, their combined loudness when sounded together will be only slightly
greater than that of one of them alone. They may lie in the same critical band, where they
compete for the same nerve endings on the basilar membrane of the
inner ear. If the two sounds differ widely in pitch, the perceived loudness of
the combined tones will be greater because they do not overlap on the basilar membrane
and do not compete for the same hair cells. If the tones are not in the same critical band, the
combination of both can be perceived as up to twice as loud as one alone. The theory of
critical bands shows that the human auditory system distinguishes between energy
inside and outside a critical band.
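The narrowing of critical bands towards low frequencies can be illustrated with a frequency-to-Bark conversion. The sketch below uses the widely quoted Zwicker and Terhardt approximation, which is an assumption of this example; it is not the scale of [21], and the function name is hypothetical.

```python
import math

def hz_to_bark(frequency_hz):
    """Approximate critical-band rate (Bark) for a frequency in Hz.

    Uses the Zwicker and Terhardt closed-form approximation
    (assumed here for illustration only).
    """
    return (13.0 * math.atan(0.00076 * frequency_hz)
            + 3.5 * math.atan((frequency_hz / 7500.0) ** 2))
```

With this approximation, 5,000 Hz lies at roughly 18.5 Bark out of about 24.6 Bark at 20,000 Hz, which is consistent with about three quarters of the critical bands lying below 5,000 Hz. A 100 Hz step near 100 Hz spans almost a full Bark, whereas the same step near 10,000 Hz spans only a small fraction of one.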
Simultaneous masking is a characteristic of the human auditory system where
some sounds become inaudible in the presence of louder sounds [22]. The louder sound is
called the masker and the softer sound is called the maskee. Simultaneous masking was
essentially defined through experimentation with pure tones and narrow-band noises
[20, 23]. Masking is the characteristic most heavily exploited by modern lossy coders. The
sound signals which are going to be coded are compared to the minimum threshold and
masking curve. If the sound signals fall below the threshold, they will be discarded
since the ear cannot hear them.
A coder for a communication system based on critical band and masking in an
auditory system has been explored. For example, in 1980, Michael A. Krasner [24]
developed a multiband speech encoding system which uses the results of
psychoacoustic experiments to specify the system structure and parameters.
Subjective Speech Quality Measure
The International Telecommunication Union (ITU) P.800 Recommendation [25]
describes several methods and procedures for subjective evaluations of transmission
quality. The most commonly used methods are the Absolute Category Rating (ACR) and
Degradation Category Rating (DCR) tests. Subjective tests are normally carried out
under controlled conditions in the laboratory. The subjective perceptual measurement
method involves a group of participants rating the quality of some speech samples in a
strictly controlled environment. Careful test design can control some undesirable factors
that influence the voting process.
For an ACR listening test, subjects (untrained listeners) have to rate the overall
quality of a speech clip, which may be distorted, without comparing it with the
original speech clip. This means the subjects do not have to refer to the original speech
clip in rating the given speech clip. The listeners have to give each sentence a rating
from 1 to 5 as follows: (1) bad; (2) poor; (3) fair; (4) good; (5) excellent. The
arithmetic mean of all the individual scores is the MOS and represents the overall
subjective rating of the speech sample [7].
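The MOS computation described above is a plain arithmetic mean over the listeners' category ratings. A minimal Python sketch, with a hypothetical function name and with input validation added as an assumption, is:

```python
def mean_opinion_score(ratings):
    """Arithmetic mean of ACR ratings, each an integer from 1 (bad) to 5 (excellent)."""
    if not ratings:
        raise ValueError("at least one rating is required")
    if any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("ratings must be integers from 1 to 5")
    return sum(ratings) / len(ratings)
```

For example, ratings of 4, 3, 5 and 4 from four listeners give a MOS of 4.0. The same averaging applied to the five-point degradation scale of a DCR test yields the DMOS.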
For a DCR test, the listeners have to rate the degradation level of the speech by
comparing the speech clip under test to the original clip. The listeners have to give each
sentence a rating from 1 to 5 as follows: (1) very annoying; (2) annoying; (3) slightly
annoying; (4) audible but not annoying; (5) inaudible. The average of the opinion scores
of subjects in DCR is called the Degradation Mean Opinion Score (DMOS). Because
the reference speech is provided, the DCR test is more sensitive than the ACR method,
particularly when evaluating good quality speech.
The ACR test tends to be insensitive, to the extent that small differences in quality are
not detected.
Subjective testing methods have been developed to provide an overall score of
the quality of a system or service from the customer’s viewpoint, independent of the
underlying technology used in the network. This method is widely used in
communication systems even though it has limitations. For example, in subjective
perceptual measure, the definition of the test condition and the interpretation of results
are crucial. Hence, this method is tedious, error-prone, expensive, time consuming and
not suitable for real-time and day-to-day application [13-15]. Furthermore the tests and
the results of this method are not always reproducible. However, the subjective
perceptual measures are important because they are the ultimate measure of quality and
provide a benchmark for evaluation and comparison among other speech quality
measures.
Objective Speech Quality Measure
In order to avoid the undesirable features of subjective tests, objective perceptual
methods have been developed. In contrast to the subjective perceptual measure, the
objective perceptual measure replaces the human subject with a computer model.
Objective perceptual methods use models based on the properties of the human auditory
system in an attempt to derive quality estimates which are close to the MOS values of the
subjective perceptual method. Some objective perceptual quality measurement
methods have high correlations (as high as 97%) with the subjective MOS [10].
Furthermore, some objective methods can provide an accurate and reliable measurement
of speech quality in real life situations where the subjective perceptual methods cannot be
used. Objective methods can be categorized as either referenced (input/output based or
double-ended) or non-referenced (output based or single-ended) measurements [11,
12].
Referenced Objective Speech Quality Metric:
In referenced schemes, the received speech signal is compared with the original
undistorted signal. Also called intrusive schemes, such schemes can be very
accurate [15] but they need the availability of the original signal in addition to
the distorted signal at the point of measurement. Thus, they are not applicable to
a measurement of the speech quality at the customer end.
Figure 2.4: Basic operations performed by a perceptual speech quality metric.
The basic operations performed by referenced perceptual speech quality
measurement methods are shown in Figure 2.4. The model
consists of two modules: perceptual transformation and cognition. The
perceptual transformation module transforms each signal into a psychoacoustic
representation which approximates human perception. The cognition
module then maps the difference between the psychoacoustic representations of the
original and degraded signals into an estimated perceptual distortion, which is rated on
the MOS scale.
Several researchers have attempted to adopt reference methods in analysing
perceptual quality. Karjalainen [26] introduced the method of measuring the
distortion of the speech signals in 1985. This method is based on the use of
speech signal as test signals and Auditory Spectrum Distance (ASD) as a
measure of speech quality degradation. This measure relies on comparison of
audible time-frequency-loudness representations of the signals. However,
Karjalainen’s work was almost unnoticed in later research studies.
In 1998, Quackenbush also described various models which use the distortion
parameters extracted from the signal to estimate the subjective quality measure
[27]. The models used objective measures such as the Cepstral Distance (CD).
However, they did not strictly follow the perceptual approach. Similarly, other
researchers introduced models that used objective measures. For example, in
1988, Voran introduced the Measuring Normalizing Blocks (MNB) model
which was based on a multi-scale method to compute a quality score from the
difference between logarithmic spectrograms of the signals [28, 29]. In the early
1990s, various new perceptual quality measurement models for speech and
audio codecs were introduced. In 1992, Wang et al. [29] computed loudness on a
sone scale [30] in Bark bands [31], and evaluated the mean squared Bark
Spectral Distance (BSD). Then, Hollier [32] generalized the approach of Wang et al.
to model both the amount and the distribution of errors.
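The core of a Bark spectral distance can be sketched as a mean, over frames, of the squared Euclidean distance between the per-band loudness vectors of the original and degraded signals. The Python sketch below assumes that the loudness values per Bark band have already been computed, omits any normalisation that a published BSD definition may apply, and uses a hypothetical function name.

```python
def bark_spectral_distance(reference_frames, degraded_frames):
    """Mean squared distance between Bark-band loudness spectra (unnormalised).

    reference_frames, degraded_frames: lists of frames, each frame a list of
    loudness values with one entry per Bark band (assumed precomputed and
    time-aligned).
    """
    if len(reference_frames) != len(degraded_frames) or not reference_frames:
        raise ValueError("frame lists must be non-empty and of equal length")
    total = 0.0
    for ref, deg in zip(reference_frames, degraded_frames):
        if len(ref) != len(deg):
            raise ValueError("frames must have the same number of bands")
        # Squared Euclidean distance between the two loudness vectors.
        total += sum((r - d) ** 2 for r, d in zip(ref, deg))
    return total / len(reference_frames)
```

Identical signals give a distance of zero, and the distance grows with the loudness difference in any band, which is why time alignment of the two signals matters for measures of this kind.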
The exploration of this niche area in the 1990s [33-35] also introduced some
new concepts which were later used in the speech quality models. For example
in 1992, an asymmetry factor was introduced by Beerends and Stemerdink’s
Perceptual Audio Quality Measure (PAQM) [33]. It should be noted that when
audio is mentioned, it indicates a wideband 20 kHz signal, whereas speech
implies a 3 kHz narrowband signal. This asymmetry factor from PAQM was
adapted into a method for speech codec evaluation, Perceptual Speech Quality
Measure, PSQM [36]. In PSQM, the asymmetry factor involved the different
weighting between degraded and reference signals in each time-frequency cell
by the power ratio of the two signals. PSQM was adopted as the objective
quality measurement method for speech codec by ITU in 1998.
Even though most of the methods described above are good in measuring the
speech and audio signal, they were not suitable for measuring speech quality
delivered by communication networks. Communication networks have issues
including filtering, level changes and unknown delays which could vary
dynamically. If these issues are not considered, the reference schemes will be
considered as very inaccurate and useless for such networks. Therefore, the
researchers in the mid 1990s began to focus on solving those issues.
Rix [37] introduced a new model called the Perceptual Analysis Measurement
System (PAMS) to address the problem of linear filtering, which can occur in
several places in a communication system. This model is based on one
developed by Hollier [32]. Later, to overcome a problem in the system, Beerends
and Hekstra improved PSQM to PSQM99 [14] using the PAMS method
proposed in [37].
For proper operation, perceptual models require the reference and degraded
signals to be aligned in time. However none of the early models had the ability
to do that. Rix and Reynolds addressed this problem by adding a set of methods
to PAMS that allowed identification and adjustment for delay changes in speech
signals [14]. Subsequently, in 2001, PSQM was replaced by PESQ which was
based on PSQM99 and PAMS. PESQ model is the state of the art ITU-T
recommendation for a referenced perceptual model measurement method. PESQ
has been designed to improve on the previous objective methods and was
implemented commercially in testing devices and monitoring systems [2]. This
method will be elaborated in detail in Chapter 3.
For network measurements, a referenced method can be employed in
conjunction with test calls. In this case, a test call is made and the corresponding
signal at the receiving end is recorded for assessment with the referenced
method. This, however, is wasteful of network resources. In
addition, it only provides a snapshot of the network quality at the time
and location of measurement. Sometimes, measurements are
carried out during live traffic. In this case, short segments of a test signal are
interleaved with a user signal. A referenced model is used to assess the quality
at the receiving side based on the received test signal and a pre-stored copy.
The situation is different for non-referenced schemes, which can
be applied directly to the received speech signal. When a non-referenced
speech quality model is used, the need for test calls or test
signals is removed. Such a scenario is referred to as non-intrusive or passive
network monitoring.
Non-Referenced Objective Speech Quality Methods:
Also referred to as non-intrusive speech quality measurement methods, these
do not need the injection of a reference signal and are appropriate for monitoring
live traffic. Non-referenced objective perceptual models include the E-model
[38, 39], Audio Non-intrusive Quality Estimation (ANIQUE) [40, 41] and the
ITU-T P.563 Single Sided Speech Quality Measure (3SQM) [42].
E-model is the abbreviation for the European Telecommunications Standards
Institute (ETSI) Computation Model. It was developed in 1996 initially as a
computational tool for network planning. However, this is now being used to
predict speech quality for VoIP non-intrusive applications [43]. The E-model
assumes an additive relationship between a number of transmission
parameters which affect the speech quality. The E-model produces a
transmission rating R which can be used to estimate speech quality. The value
of R lies between 0 and 100. An R value below 50 indicates very poor quality
while a value between 90 and 100 indicates excellent quality. The average
correlation between the quality estimated by the E-model and subjective MOS has
been reported to be 0.74 [44]. Although the E-model has been a useful tool for
non-intrusive voice quality measurement in Voice over Internet Protocol (VoIP)
networks, it has limitations which prevent it from being applied widely in
communication systems. It is expensive, time consuming and only applicable to
a limited number of codecs and network conditions. It also assumes that the
individual transmission parameters are independent of each other and that their
effects are additive, which does not always prove to be true [45].
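As an illustration of the additive structure and of how a rating R relates to an estimated MOS, the sketch below uses the rating equation and the R-to-MOS conversion given in ITU-T Recommendation G.107. Neither formula is stated in this chapter, so both are brought in here as assumptions; the function names and the impairment values in the example are hypothetical.

```python
def transmission_rating(r_o, i_s, i_d, i_e_eff, a=0.0):
    """Additive E-model rating (form taken from ITU-T G.107, not from this chapter).

    r_o: basic signal-to-noise rating, i_s: simultaneous impairments,
    i_d: delay impairments, i_e_eff: effective equipment impairment,
    a: advantage factor.
    """
    return r_o - i_s - i_d - i_e_eff + a


def r_to_mos(r):
    """Convert a transmission rating R to an estimated MOS (G.107 mapping)."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + r * (r - 60.0) * (100.0 - r) * 7e-6


# Hypothetical impairment values, chosen only to exercise the functions.
rating = transmission_rating(r_o=93.0, i_s=1.0, i_d=2.0, i_e_eff=10.0)
print(rating, r_to_mos(rating))
```

Because each impairment is simply subtracted, the sketch also makes the independence assumption criticised in [45] visible: no term can depend on any other.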
The ANIQUE model which was developed in 2004 by Kim [40] is based on the
functional roles of human auditory systems and the characteristics of human
articulation systems. It was reported to have an average correlation of 0.8546
with the subjective MOS.
The 3SQM was released in May 2004 after being selected and standardized by
the ITU-T as Recommendation P.563. It was developed in 2003 jointly by
three companies: Psytechnics, Opticom and SwissQual. The average Pearson
correlation coefficient between 3SQM MOS and subjective MOS has been
reported to be 0.89 [42].
However, 3SQM also has drawbacks when applied to a communications system
function such as link adaptation. Link adaptation is the process of changing the
codec rate, modulation, and other parameters on a packet-to-packet basis, or even
during the transmission of a single packet, in response to channel conditions. The
quality score of the 3SQM is based on the Absolute Category Rating (ACR). It cannot
tell whether the degradation of the speech is caused by channel errors or by the poor
quality of the original source itself. Therefore, a link adaptation technique based
on 3SQM may assume that the degradation of the speech derives from deterioration in
the channel and unnecessarily try to compensate for it.
This differs from a referenced speech quality method such as PESQ, where the
metric gives a quality score relative to the quality of the original signal.
Furthermore, the intrusive metrics are generally more accurate than their
non-intrusive counterparts and give a higher correlation with the subjective MOS.
The correlation coefficients of 3SQM and ANIQUE scores with subjective MOS
values are on average 0.89 and 0.85 respectively, compared with 0.935 for PESQ.
Also, the 3SQM update rate is unacceptably slow for link adaptation in a radio
system, such as power control in UMTS. This is due to the 3 s minimum signal
length required for 3SQM to assess the speech quality.
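To put the 3 s minimum in perspective, it can be compared with the UMTS inner loop power control rate of 1500 commands per second described in Section 2.3.2. The count below is simple arithmetic on those two figures, not a result reported in the cited works; the symbol $N_{\mathrm{TPC}}$ is introduced here only for illustration.

```latex
N_{\mathrm{TPC}} = 3\,\mathrm{s} \times 1500\,\mathrm{s}^{-1} = 4500
```

That is, roughly 4500 power control commands would be issued in the time needed to obtain a single 3SQM score.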
Despite their better reliability, perceptual objective quality measurement
methods have not been adopted in conjunction with speech quality control in mobile
telephony applications. Instead, parametric methods with their inferior performance are
still commonly used. However, because these methods lack accuracy in their prediction
of the perceived speech quality, the service provider has to cater for the worst case
quality scenario. That is, the provider will have to unnecessarily expend more resources,
such as transmission power, to prevent the speech quality from dropping below a certain
acceptable limit. Furthermore, applying any control on the signal will only result in
controlling a variable measured by the parametric method, i.e. FER, SIR or BER.
Therefore, the available methods do not control the perceptual quality directly but they
do so indirectly through some relevant channel measures.
2.2 Power Control Scheme
Power control is acknowledged as a crucial aspect of mobile communication systems
[10]. To date, power control has been comprehensively studied for
Frequency Division Multiple Access (FDMA), Time Division Multiple Access
(TDMA) [46] and Code Division Multiple Access (CDMA) [10, 47-53]. In the early days,
radio telephone systems used high antennae and high power to serve an entire area from
a single base station, and each channel could only be used once in each particular area.
Current cellular systems use lower antennae and lower transmission power to allow
each channel to be reused many times within the same area. This frequency reuse
increases the number of calls which can be accommodated in the same area. Although
there are some variations due to terrain, user density and available cell sites, cellular
systems tend to use simple, geometric patterns to establish frequency reuse. FDMA and
TDMA based mobile radio systems employ this frequency reuse to overcome
the limited availability of frequency spectrum. The more the radio frequencies are
reused, the higher the system capacity will be. However, co-channel interference limits
the number of times a frequency can be reused in a given area, in which case power
control is applied to reduce the effects of co-channel interference and subsequently
allow higher reuse of frequencies.
In controlling power, both the base and mobile transmitter powers can be
adjusted dynamically over a wide range. Typical cellular systems adjust their transmitter
power based on received signal strength. This method adjusts for differences in path
loss as users move closer to or further away from their base stations. There is no
attempt to simultaneously optimize transmitter power for all users. The CDMA-based
mobile system is the one which has implemented this kind of power control. It ensures
that the resources are equally distributed among users. Without power control, the
capacity of CDMA-based systems is even worse than that of FDMA-based systems.
In cellular systems, the quality of a call is usually determined by the SIR.
Traditional reuse distances are selected to maintain an acceptable SIR under worst-case
conditions with simple power control. Hence, researchers have proposed optimum power
control schemes which adjust the transmitted power dynamically so as to meet SIR
requirements. This reduces power consumption and intra-system interference, which
improves call quality and prolongs the battery life of the mobile, and also reduces
out-of-system interference, which helps to meet regulatory requirements.
The power control schemes can be distributed or centralized, as briefly
reviewed below.
2.2.1 Centralized Power Control
A Centralized Power Control (CePC) scheme uses information from all links, and a
central station controls the whole system. The aim of CePC is to maximize
the minimum SIR over the channels in the system. CePC is not usually
implemented in mobile communication systems due to its complexity, but it helps in
the design of other power control schemes, such as distributed power control schemes,
that are easy to implement.
Wu published two papers on centralized power control [48, 49]. In [48], Wu analysed
the Optimum Power Control Scheme (OPCS) for CDMA systems and presented the upper
limit for all transmitter power controls. Using OPCS was shown to
increase the system capacity by 55% over an Interim Standard 95 (IS-95) system with
perfect power control. Wu expanded this work on OPCS in [49] by presenting an
optimum power control algorithm for mobile radio systems based on heterogeneous
SIR, meaning that different SIR values are used for different links.
This approach minimizes the average SIR value required for each
link without compromising the QoS.
2.2.2 Distributed Power Control
The Distributed Power Control (DPC) algorithm uses only local SIR information and
utilizes an iterative scheme to control the transmission power. This means each base
station takes charge of controlling the transmission power of the mobile stations in its
own cell. Therefore, a centralized controller is no longer required. DPC schemes are
more appropriate for practical implementation in mobile communication systems because
they are less computationally complex and require much less signalling than CePC
schemes.
The fundamental work on DPC was carried out by Axen [54, 55]. Axen applied a
simple proportional control algorithm in implementing DPC. The algorithm
decreases the transmitter power in a link if the SIR moves above a target threshold value
and increases the transmitter power when the SIR is too low. However, Axen’s
algorithm becomes unstable if the target threshold value is set too high. In that
case, the transmitters continuously increase their output powers to achieve the
given target. This, however, increases the interference on all other links, so the
transmitters keep increasing their power until they reach their peak output power
and saturate. Zander [56] addressed this problem by presenting a DPC algorithm which
incorporates distributed SIR balancing. Zander’s distributed discrete-time power control
algorithm is also called Distributed Balancing (DB) and is based on the model and
assumptions in [57].
In 1993, Foschini and Miljanic [58] proposed the Target-SIR-tracking Power
Control (TSPC) algorithm, which was further studied in [59-62]. Under TSPC, the
information that each user needs to know, either locally or from the corresponding base
station, is minimal. In [63], Zander, Rasti and Sharafat improved the TSPC by
introducing a new Distributed Constrained Power Control (DCPC) algorithm to deal
with the problem of inefficient energy consumption and unnecessary interference for
the users of the communication network.
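The behaviour described above can be sketched numerically. The code below is a minimal illustration, not the algorithm of any cited paper as published: each link scales its own power by the ratio of target SIR to measured SIR, with a cap on transmit power. The two-link gain matrix, noise level, targets and function names are all hypothetical values chosen for the example.

```python
def sirs(powers, gains, noise):
    """Return the SIR seen on each link for the given transmit powers."""
    result = []
    for i, p_i in enumerate(powers):
        interference = sum(gains[i][j] * p_j
                           for j, p_j in enumerate(powers) if j != i)
        result.append(gains[i][i] * p_i / (interference + noise))
    return result


def dpc_iterate(powers, gains, noise, sir_target, p_max, iterations):
    """Each link uses only its own SIR: p <- min(p_max, p * target / SIR)."""
    powers = list(powers)
    for _ in range(iterations):
        current = sirs(powers, gains, noise)
        powers = [min(p_max, p * sir_target / s)
                  for p, s in zip(powers, current)]
    return powers


# Hypothetical two-link system in linear units (not taken from the thesis).
GAINS = [[1.0, 0.1],
         [0.2, 1.0]]  # GAINS[i][j]: gain from transmitter j to receiver i
NOISE = 0.01

# A reachable target: the powers settle and both links meet the target.
feasible = dpc_iterate([0.1, 0.1], GAINS, NOISE,
                       sir_target=5.0, p_max=1.0, iterations=100)
# A target set too high: both transmitters climb to peak power and stay there.
saturated = dpc_iterate([0.1, 0.1], GAINS, NOISE,
                        sir_target=10.0, p_max=1.0, iterations=100)

print(feasible, sirs(feasible, GAINS, NOISE))
print(saturated, sirs(saturated, GAINS, NOISE))
```

With the lower target the powers converge to a fixed point, whereas with the higher target both links end at the power cap without reaching the target, which is the saturation problem noted in the text.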
2.3 UMTS Power Control
UMTS is a third generation mobile system which integrates most mobile services
into a single system so that all kinds of terminals may be used in all environments.
UMTS separates the roles of service provider, network operator, subscriber and user.
This enables innovative new services without requiring additional network investment
from a service provider.
In UMTS, the relative movement of the User Equipment (UE) and the Base
Station (BS) contributes to channel variations and subsequently affects the received
signal level of the communication system. Therefore, the transmit power must be changed
in response to channel variations to ensure reliable signal demodulation. If the
received signal level is too weak, the QoS is degraded. On the other hand, if the signal
level is too high, it creates too much interference, which reduces system capacity.
Furthermore, excessive transmission power in the uplink shortens the battery life of
the UE. Therefore, UMTS employs power control in an attempt to regulate the received
signal level such that it is within a desired range [47].
UMTS contains both the Frequency Division Duplex (FDD) and Time Division
Duplex (TDD) modes of operation. Generally, FDD mode employs faster uplink and
downlink power control rates than TDD mode. A detailed discussion of UMTS
uplink power control in FDD is given in Chapter 3. Power control in
UMTS can be implemented in two ways: open loop power control and closed loop
power control. Figure 2.5 below shows the block diagrams of the power control in
UMTS.
Figure 2.5: UMTS power control basic block diagram.
[Figure placeholder. Panels: Open Loop power control and Closed Loop power control,
each between BTS and MS (UE). Blocks: Transmit, Measuring received power, Estimating
path loss, Calculating transmission power, Receive, Decide transmission power, Power
control command.]
2.3.1. Open Loop Power Control
The initial power control is open loop. In Open Loop Power Control (OLPC), the
transmitter does not depend on feedback information from the receiver end of the
communication link. Instead, the transmitter estimates the path loss from the signal
received in the downlink. As such, this method gives only a rough power setting. In
OLPC, the MS (UE) transmitter sets its output power based on the received level
of the pilot from the Base Transceiver Station (BTS) or Node B. The UE will continue
estimating the output power until it receives a response from the BTS, as illustrated
in Figure 2.6.
Figure 2.6: Open Loop Power Control operation.
[Figure placeholder. Between MS (UE) and BTS (Node B): MS Access 1, MS Access 2, up to
MS Access n, each with estimated power, followed by the Response with power control.]
OLPC is most effective if the uplink and downlink channels are symmetrical.
Since path loss and shadowing are frequency dependent, the uplink and downlink
channels are symmetrical when they operate on the same frequency. Making use of this
symmetry, open loop power control systems continuously adjust the
transmit power by an amount inversely proportional to changes in the received signal
power [64, 65]. However, the assumption of symmetry between the uplink and downlink
channels is invalid where the transmitter and receiver operate on different
frequency bands [64]. Therefore, in UMTS, open loop power control is primarily used
in TDD mode. In FDD mode, it is used only for the initial power setting.
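The open loop estimate described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the procedure specified for UMTS: the function name, the parameter names and the 24 dBm power cap are hypothetical, and interference and offset terms that a real system would include are left out.

```python
def open_loop_tx_power_dbm(pilot_tx_dbm, pilot_rx_dbm, target_rx_dbm,
                           max_tx_dbm=24.0):
    """Set uplink power from the downlink pilot, assuming a symmetric channel.

    pilot_tx_dbm: pilot power transmitted by the BTS (assumed known to the UE)
    pilot_rx_dbm: pilot power measured at the UE
    target_rx_dbm: received power the BTS should see (hypothetical parameter)
    """
    # Path loss estimated on the downlink and assumed equal on the uplink.
    path_loss_db = pilot_tx_dbm - pilot_rx_dbm
    # Transmit just enough to arrive at the target level, within the UE limit.
    return min(max_tx_dbm, target_rx_dbm + path_loss_db)


# Hypothetical numbers: 113 dB path loss and a -100 dBm target give 13 dBm.
print(open_loop_tx_power_dbm(33.0, -80.0, -100.0))
```

If the received pilot drops by 1 dB, the computed transmit power rises by 1 dB, which is the inverse relationship noted in the text. The estimate is only as good as the symmetry assumption, which is why it fails when uplink and downlink use different frequency bands.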
2.3.2. Closed Loop Power Control
Once communication is established, power is controlled by Closed Loop Power
Control (CLPC). In CLPC, the BTS performs frequent estimations of the received SIR and
compares them to a target SIR. It then commands the MS to increase or decrease its power.
In FDD mode, CLPC is used for both uplink and downlink, whereas in TDD mode it is
used only in the downlink [47, 53]. Unlike OLPC, CLPC depends on feedback information
from the receiver end of the communication link. CLPC for FDD mode is the more
widely used mode in communication systems [66]. The CLPC procedure in UMTS is
divided into two processes, the outer loop and the inner loop, which are illustrated in
Figure 2.7 and Figure 2.8 and briefly elaborated below.
Figure 2.7: Closed Loop Power Control operation.
[Figure placeholder. Elements: RNC, BTS and ME (UE). Outer loop: the Radio Network
Controller (RNC) sets the target for the service, calculates the FER for the
transmission and sends a new SIR target to the BTS. Inner loop (continuous power
control): the MS transmits (Tx) and the BTS sends power control bits to the MS (UE)
1500 times/sec.]
Figure 2.8: Block diagram of UMTS CLPC.
[Figure placeholder. Blocks: Set SIR Target, Compare, Step Selection, Adjust power,
SIR Estimation, Calculate CRC, FER target, Channel, Uplink, Downlink, TPC. Regions:
inner loop and outer loop. Legend: function performed in BS, function performed in UE.
TPC = Transmit Power Control command, SIR = Signal to Interference Ratio,
CRC = Cyclic Redundancy Check, FER = Frame Error Rate.]
Outer Loop Power Control
Outer loop power control is used to maintain the QoS requirement while
minimizing the power. Outer loop power control in UMTS is responsible for adjusting the
SIR target values for the inner loop in an effort to keep the MS’s FER, as measured at
the BTS, close to a specific value. In uplink outer loop power control, the Radio Network
Controller (RNC) is responsible for setting a SIR target in the BTS for each individual
uplink inner loop power control. This SIR target is updated for each UE according to
the estimated uplink quality, based on an FER target (usually 1% for speech services),
to achieve a satisfactory QoS. In downlink outer loop power control, the UE converges to
the required link quality set by the RNC in the downlink. Figure 2.9 shows the general
algorithm of outer loop power control.
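Figure 2.9: General outer loop power control algorithm.
[Figure placeholder. Decision: is the received quality better than the required quality?
Yes: decrease SIR target. No: increase SIR target.]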
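The decision in Figure 2.9 can be sketched as a per-frame update. The rule below is one common way of realising it and is an assumption, not taken from this chapter: the SIR target is raised by a fixed step when a frame fails its CRC and lowered by a much smaller step when it passes, with the two steps balanced so that the target stays level when the frame error rate equals the FER target. The function name and the 0.5 dB step are hypothetical.

```python
def update_sir_target(sir_target_db, crc_ok, fer_target=0.01, step_up_db=0.5):
    """One outer loop update, driven by the CRC result of a received frame."""
    # Down-step chosen so that up and down moves cancel at the FER target.
    step_down_db = step_up_db * fer_target / (1.0 - fer_target)
    if crc_ok:
        # Quality better than required: relax the SIR target slightly.
        return sir_target_db - step_down_db
    # Frame error: quality worse than required, so raise the SIR target.
    return sir_target_db + step_up_db


# Hypothetical run: 99 good frames and 1 errored frame (1% FER).
target = 6.0
for crc_ok in [True] * 99 + [False]:
    target = update_sir_target(target, crc_ok)
print(target)
```

At exactly 1% frame errors the target ends where it started; a higher error rate pushes the target up and a lower one lets it drift down, which is the behaviour shown in the figure.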
Inner Loop Power Control
Inner loop power control is also called fast closed loop power control. In contrast to
outer loop power control, inner loop power control adjusts the transmitted power of
the MS in an attempt to compensate for fading of the uplink radio
channel and consequently meet the SIR target set by the outer loop. That is, in the
uplink, the UE transmitter adjusts its output power in accordance with one or more
Transmit Power Control (TPC) commands received in the downlink in order to maintain the
received uplink SIR at a given SIR target. If the measured SIR at the BTS is higher than
the target SIR, the BTS will command the MS to decrease its power. If it is too low, the
BTS will command the MS to increase its power. The general inner loop
power control algorithm is illustrated in Figure 2.10. The command-react cycle is
executed 1500 times per second.
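Figure 2.10: General inner loop power control algorithm.
[Figure placeholder. Input: measured SIR at BTS. Decisions: measured SIR > SIR target
and measured SIR < SIR target, each with Yes/No branches leading to Decrease power or
Increase power.]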
The outer loop updates the SIR target every 10-100 ms, which is a much lower rate
than that of the inner loop. During this time, the CRC for each frame is calculated and
used to adjust the SIR target. In UMTS inner loop power control, there are two
alternative algorithms which the BS can instruct the UE to use for interpreting the TPC
command (TPCcmd) [52]; these are referred to as algorithm 1 and algorithm 2 in this
thesis. In the inner loop, the BS estimates the received SIR and compares it against
the SIR target once every 0.666 ms time slot. If the estimated received SIR is less
than the SIR target, the BS sends a TPCcmd “1” to the UE; otherwise a “0” is transmitted.
The power control step is the change in the UE transmitter output power in
response to a single TPCcmd. Multiple TPC bits are received by the UE when it is in soft
handover, in which case the behaviour of the algorithms varies slightly. In this study,
only the case when the UE is not in soft handover is considered. On receiving the TPC
bit rxTPCcmd, the UE derives a single transmit TPC command txTPCcmd for each time
slot based on one of the two algorithms. These algorithms are performed in the block
labelled “step selection” in Figure 2.8. Note that the step size in the inner loop power
control is different from the step size in the outer loop power control. Therefore,
the symbol δ is used for the step size in the inner loop power control. The
algorithms used in inner loop power control are described below:
Algorithm 1
A single TPCcmd is received by the UE transmitter in each time slot and the power
control step is applied in the same slot. The steps of the algorithm are as follows:
Step 1: Initialise the time slot index l to 1.
Step 2: Wait for the arrival of rxTPCcmd for time slot l.
Step 3: Decide on the value of txTPCcmd for time slot l.
    If rxTPCcmd = 0, then
        txTPCcmd = -1.
    else if rxTPCcmd = 1, then
        txTPCcmd = +1.
Step 4: Calculate the step size δ for adjusting the transmitter power:
    $\delta = \delta_{TPC} \times txTPC_{cmd}$    (2.1)
    where $\delta_{TPC}$ can take on values of either 1 dB or 2 dB [67].
Step 5: Adjust the transmitter power by the step size δ.
Step 6: Set l = l + 1 and go to Step 2.
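The steps of Algorithm 1 can be sketched as follows. This is an illustrative sketch only: the function names, the list-based input in place of waiting for each slot, and the starting power are hypothetical, while the command mapping and the step rule follow Steps 3 to 5 and equation (2.1).

```python
def tx_tpc_cmd_alg1(rx_tpc_cmd):
    """Step 3: map the received TPC bit to a transmit TPC command."""
    if rx_tpc_cmd == 0:
        return -1
    if rx_tpc_cmd == 1:
        return +1
    raise ValueError("rxTPCcmd must be 0 or 1")


def run_algorithm1(rx_tpc_cmds, start_power_dbm, delta_tpc_db=1.0):
    """Apply one power step per slot; returns the power after each slot."""
    power_dbm = start_power_dbm
    trace = []
    for rx in rx_tpc_cmds:                            # Steps 2 and 6: slot by slot
        delta_db = delta_tpc_db * tx_tpc_cmd_alg1(rx)  # Step 4, equation (2.1)
        power_dbm += delta_db                          # Step 5
        trace.append(power_dbm)
    return trace


# Hypothetical command sequence: up, up, down, up from 10 dBm in 1 dB steps.
print(run_algorithm1([1, 1, 0, 1], start_power_dbm=10.0))
```

Every slot produces a power change under this algorithm, so the transmit power never holds steady: in a static channel it alternates around the level that meets the SIR target.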
Algorithm 2
A single TPCcmd is received by the UE transmitter in each time slot, but the power
control step is applied on the basis of a 5-slot cycle. The steps of the algorithm are
as follows:
Step 1: Initialise the time slot index l to 1.
Step 2: Wait for the arrival of rxTPCcmd for time slot l.
Step 3: Decide on the value of txTPCcmd for time slot l.
    If l is not divisible by 5 (i.e. this is not the fifth time slot in a 5-slot cycle)
        txTPCcmd = 0.
    else
        If all of the last five rxTPCcmd = 0, then
            txTPCcmd = -1.
        else if all of the last five rxTPCcmd = 1, then
            txTPCcmd = +1.
        else
            txTPCcmd = 0.
Step 4: Calculate the step size δ for adjusting the transmitter power:
    $\delta = \delta_{TPC} \times txTPC_{cmd}$    (2.2)
    where $\delta_{TPC}$ takes the value of 1 dB [47, 67].
Step 5: Adjust the transmitter power by the step size δ.
Step 6: Set l = l + 1 and go to Step 2.
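Algorithm 2 can be sketched in the same way. Again this is only an illustration: the function name, the list-based input and the starting power are hypothetical, while the 5-slot decision rule follows Step 3 and equation (2.2).

```python
def run_algorithm2(rx_tpc_cmds, start_power_dbm, delta_tpc_db=1.0):
    """Apply at most one power step per 5-slot cycle; returns power per slot."""
    power_dbm = start_power_dbm
    trace = []
    for l, _ in enumerate(rx_tpc_cmds, start=1):   # l is the 1-based slot index
        if l % 5 != 0:
            tx_tpc_cmd = 0                         # not the fifth slot of a cycle
        else:
            last_five = rx_tpc_cmds[l - 5:l]
            if all(cmd == 0 for cmd in last_five):
                tx_tpc_cmd = -1
            elif all(cmd == 1 for cmd in last_five):
                tx_tpc_cmd = +1
            else:
                tx_tpc_cmd = 0                     # mixed commands: hold power
        power_dbm += delta_tpc_db * tx_tpc_cmd     # equation (2.2) and Step 5
        trace.append(power_dbm)
    return trace


# Hypothetical sequence: five ups, one mixed cycle, then five downs.
commands = [1, 1, 1, 1, 1] + [1, 0, 1, 1, 1] + [0, 0, 0, 0, 0]
print(run_algorithm2(commands, start_power_dbm=10.0))
```

Because a step is taken only when five consecutive commands agree, the power can stay constant for many slots, which gives smaller effective steps than Algorithm 1 at the cost of slower tracking.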
2.4 Statistical Process Control (SPC)
Statistics is a collection of techniques useful for making decisions about a process or
population based on an analysis of the information contained in a sample from that
population [68]. Statistical techniques play a very significant role in quality
improvement. SPC is a method for quality improvement that relies on statistical and
engineering methods. It derives from the concept of Total Quality Management
(TQM), which provides a useful management structure in which to implement statistical
methods. SPC involves using statistical techniques to measure and analyse variation;
hence the procedure helps to monitor the process behaviour.
Statistical process control has been widely used in manufacturing and industrial quality
control [6, 69-71]. However, it has not yet been used in mobile communication systems.
The key element in maintaining and improving quality and productivity is
efficient process control. Therefore, SPC is most often used in manufacturing
processes to monitor product quality and maintain a process at a fixed target, and it
may also be used for analysing process capability and for continuous process
improvement. In manufacturing, SPC is used to monitor the consistency of the processes
used to manufacture a product as designed. It aims to bring the process under control
and keep it there. Regardless of the quality of the design, SPC can ensure that the
product is being manufactured as designed and intended. There are various statistical
tools which are useful in analysing quality problems and improving the performance of a
production process. The basic role of these tools is illustrated in Figure 2.11.
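Figure 2.11: Production process inputs and outputs.
[Figure placeholder. Blocks: input raw materials, components and subassemblies;
controllable inputs; uncontrollable inputs; Process; Measurement, Evaluation,
Monitoring and Control; output product; y = quality characteristic.]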
SPC provides surveillance and feedback for keeping processes in control. It
monitors process quality by detecting and signalling problems and assignable causes of
variation in the process which are about to affect the quality adversely. Hence, it
characterizes the process and reveals its trends and patterns. Because the process
then becomes predictable, the application of SPC reduces the need for inspection. It
also provides a mechanism for making process changes and tracking the effects of those
changes. Once a process is stable and the assignable causes of variation have been
eliminated, SPC provides an ongoing process capability analysis with comparisons to the
desired outcome.
The commonly used tools in SPC include [68, 69, 72]:
1. Histogram or Stem-and-Leaf Display,
2. Check Sheet,
3. Pareto Chart and Analysis,
4. Cause and Effect Diagram,
5. Defect Concentration Diagram,
6. Scatter Diagram, and
7. Control charts.
These seven tools are often called “the magnificent seven” by statisticians due to
their important role in SPC. Each tool is simple to implement, and the tools are
usually used to complement each other.
The Histogram is a fundamental statistical tool of SPC. The shape of the
histogram shows the nature of the distribution of the data. It identifies the average and
variation of the data and also shows the pattern of variation. Specification limits can
be overlaid on it to display the capability of the process by showing whether the
process is within specifications or not. This tool is a very effective graphical aid
and is easily interpreted, as illustrated in Figure 2.12.
Figure 2.12: Sample of Histogram.
[Figure placeholder. Vertical axis: Frequency, 0 to 18.]
A Check Sheet is used in the early stages of SPC implementation. It is a
relatively simple form used to collect data, and can therefore be very useful in data
collection activity. It is designed to facilitate summarizing all the historical defect
data available for a particular product in a particular process. Even though not
mandatory, Check Sheets are beneficial in constructing Pareto Charts.
The Pareto Chart is named after the Italian economist Vilfredo Pareto (1848-1923),
who observed that 80% of the wealth in Italy was held by 20% of the population. The
same pattern is often quoted in other settings, for example 20% of customers
accounting for 80% of sales and 20% of parts accounting for 80% of cost. Juran (1960)
then confirmed these observations and named this discovery the Pareto Principle or
80-20 principle. The Pareto Principle states that not all of the causes of a particular
phenomenon occur with the same frequency or with the same impact. This principle can be
presented as a chart called the Pareto Chart. A Pareto Chart identifies the most
frequently occurring factors. From the chart, analysis can be made to tackle the most
significant problems in the process, as in Figure 2.13.
Figure 2.13: Sample of Pareto Chart.
[Figure placeholder. Title: Sample Pareto Chart. Horizontal axis: Causes (Cause #1 to
Cause #9). Left vertical axis: Defects, 0 to 60. Right vertical axis: Cumulative %,
0% to 100%. Legend: Vital Few, Useful Many, Cumulative %, Cut Off % [42].]
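The analysis behind a Pareto Chart can be sketched as follows. The function name, the 80% cut-off default and the defect counts are hypothetical and serve only to show how the causes are ranked and how the "vital few" are separated from the rest.

```python
def pareto_table(defect_counts, cutoff=0.8):
    """Rank causes by defect count and find those reaching the cut-off share."""
    ordered = sorted(defect_counts.items(), key=lambda item: item[1], reverse=True)
    total = sum(defect_counts.values())
    rows = []
    running = 0
    for cause, count in ordered:
        running += count
        rows.append((cause, count, running / total))  # cumulative fraction
    vital_few = []
    for cause, _, cumulative in rows:
        vital_few.append(cause)
        if cumulative >= cutoff:
            break
    return rows, vital_few


# Hypothetical defect counts per cause (not data from the thesis).
counts = {"C": 10, "A": 50, "E": 3, "B": 32, "D": 5}
rows, vital_few = pareto_table(counts)
for cause, count, cumulative in rows:
    print(cause, count, round(100 * cumulative), "%")
print("vital few:", vital_few)
```

In this made-up data set two of the five causes account for more than 80% of the defects, so they would be the first to investigate.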
Another useful SPC tool is the Cause and Effect Diagram or Fishbone Diagram.
It does not have a statistical basis, yet it is an excellent aid for problem solving and
troubleshooting. It was introduced by Dr. Kaoru Ishikawa in 1943 and is therefore also
known as an Ishikawa Diagram. The tool can reveal the possible contributing factors of
out-of-control processes and the important relationships among the various variables.
It can also provide additional insight into the process behaviour, as shown in
Figure 2.14.
Figure 2.14: Sample of Cause and Effect Diagram.
[Figure placeholder. Cause branches: Machines, Measures, Materials, Environment,
Methods, Men. Outcome: EFFECT.]
A Defect Concentration Diagram is a picture of the product which shows all
relevant views. The various types of defects are drawn on the picture. From the
picture, the location of the defects can be determined, which provides information with
which to start investigating the causes of the defects.
A Scatter Diagram can be used to identify a potential relationship between two
variables, as shown in Figure 2.15. The shape of the scatter diagram often indicates the
type of relationship between the two variables.
Figure 2.15: Sample of Scatter Diagram.
[Figure placeholder. Horizontal axis: Variable 1, 0 to 30. Vertical axis: Variable 2,
0 to 60.]
Arguably, the Control Chart is one of the primary and most successful SPC
techniques. It was originally developed by Walter Shewhart in the early 1920s [73]. It is
a graphical representation of certain descriptive statistics for specific quantitative
measurements of the process over a period of time. These descriptive statistics are
displayed in the control chart and compared to their ‘in-control’ sampling
distributions. The comparison detects any unusual variation in the process which could
indicate a problem, thus helping to reduce variability. The Control Chart monitors the
performance of the process over time and, because out-of-control conditions are
detected immediately, allows process corrections to be made before product is rejected.
A Control Chart distinguishes between variation due to common causes and variation due
to special causes.
Variations due to common causes have a small effect on the process and occur as a
result of the way the process is managed and operated. They can only be removed by
changing or modifying the process.
Variations due to special causes are considered abnormal to the process.
They are often specific to a certain operator, machine, material, etc. It is
important to investigate and rectify variations due to special causes in order to
improve the process quality. This is the key to process improvement.
The most important use of a control chart is to improve the process. This process
improvement activity using the control chart is illustrated in Figure 2.16.
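Figure 2.16: Process improvement using the Control Chart.
[Figure placeholder. Blocks: Input, Process, Output, Measurement System. Improvement
loop: detect assignable cause, identify root cause of problem, implement corrective
action, verify and follow up.]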
The control chart will only detect assignable causes. It is the responsibility of
management, operators and engineers to rectify the problem and eliminate the
assignable causes.
Several different descriptive statistics may be used in control charts, and there are
several different types of control chart which test for different causes and differ,
for example, in how quickly a shift in the process mean is detected. For continuous
(variables) data, the commonly used Control Charts are: Shewhart Sample Mean
($\bar{X}$-chart), Shewhart Sample Range (R-chart), Shewhart Sample (X-chart), CUSUM
chart, EWMA chart, and Moving-average and Range Chart. For discrete (attributes and
countable) data, the commonly used Control Charts are: Sample Proportion Defective
(p-chart), Sample Number of Defectives (np-chart), Sample Number of Defects (c-chart)
and Sample Number of Defects per Unit (u-chart or $\bar{c}$-chart).
In plotting the Control Chart, the plotted values are assumed to be independent and
normally distributed. These assumptions enable predictions to be made about the data.
For a basic Shewhart Control Chart, as in Figure 2.17, the process is considered to be
in control if the plotted statistic is within the control limits. Otherwise, the
process is considered to be out of control.
Figure 2.17: Sample of basic Shewhart Control Chart.
[Figure placeholder. Title: Average Daily Imperfections with Control Limits. Horizontal
axis: samples 1 to 23. Vertical axis: 0.0 to 4.5. Series: Average Daily Imperfections,
Sample Mean, Lower Control Limit, Upper Control Limit.]
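The in-control test for a basic Shewhart chart of sample means can be sketched as below. The sketch assumes the conventional three-sigma limits and assumes that the in-control mean and standard deviation are already known; the function names and the numbers are hypothetical.

```python
import math


def xbar_limits(mu0, sigma, n, k=3.0):
    """Control limits for the mean of n observations: mu0 +/- k*sigma/sqrt(n)."""
    half_width = k * sigma / math.sqrt(n)
    return mu0 - half_width, mu0 + half_width


def out_of_control(sample_means, lcl, ucl):
    """Return the 1-based indices of plotted points outside the control limits."""
    return [i for i, mean in enumerate(sample_means, start=1)
            if mean < lcl or mean > ucl]


# Hypothetical process: in-control mean 2.0, sigma 1.0, samples of size 4.
lcl, ucl = xbar_limits(mu0=2.0, sigma=1.0, n=4)
signals = out_of_control([2.1, 1.8, 3.6, 2.0, 0.4], lcl, ucl)
print(lcl, ucl, signals)
```

A point outside the limits is treated as a signal that an assignable cause may be present; points inside the limits are attributed to common cause variation.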
Control Charts have been used widely in manufacturing industries due to their
capability in improving the quality in a production sector. Control charts are also a
popular technique for improving productivity. They are useful in defect prevention
including preventing unnecessary process adjustment. Control Charts also provide
diagnostic information and information about the process capability. These
characteristics make the use of Control Charts widespread across most industries.
In this research, it is crucial to opt for a tool which can control the process mean
effectively. Among all the tools, the SPC mechanism that has received the most
attention in the statistical literature and the most use in industry for controlling
the process mean is the CUSUM method. Furthermore, the CUSUM scheme has been reported
to detect process shifts faster than other methods [7]. Because of these advantages and
its suitability for the application, CUSUM is the focus of this research. In the next
chapter, CUSUM and its counterpart EWMA [74], which is also excellent at detecting
process shifts and is used for comparison, are discussed in detail.
2.5 Summary
In this chapter, the background information required for the rest of the thesis was
presented. As this thesis is concerned with speech quality, speech
quality metrics and measurement methods were presented. These metrics ranged from
conventional speech quality metrics (i.e. SIR, BER and FER) to perceptual speech
quality metrics.
Perceptual speech quality metrics are divided into two classes: subjective
methods and objective methods. Objective methods comprise two approaches: referenced
objective methods (i.e. PAQM, PAMS, PSQM and PESQ) and non-referenced objective
methods (i.e. ANIQUE, 3SQM and the E-model). PESQ, which is applied in this
research, is the state of the art ITU-T Recommendation for referenced objective speech
quality measurement.
Since control of perceptual speech quality is the focus of the thesis, power
control schemes were described in general, followed by the power control schemes
applied in UMTS. Two general power control schemes were discussed: centralized power
control and distributed power control. Power control in UMTS can be implemented
in two ways: open loop power control and closed loop power control. Closed loop
power control is further divided into two processes: outer loop power control and inner
loop power control. Closed loop power control is the main concern of the thesis and
its application is discussed in Chapter 3.
Statistical Process Control (SPC) was introduced as an important tool for closed
loop power control, and its seven tools were briefly reviewed: Histogram or
Stem-and-Leaf Display, Check Sheet, Pareto Chart and Analysis, Cause and Effect
Diagram, Defect Concentration Diagram, Scatter Diagram and Control Charts. The CUSUM
and EWMA control charts form the main element of the thesis.
CHAPTER 3
METHODOLOGY
3.0 Introduction
UMTS is based on CDMA technology which relies greatly on accurate power control
techniques. In this chapter, a method of controlling power and the speech codec rate is
proposed, both of which are associated with the usage of power in mobile
communication systems. PESQ as the state of the art for reference perceptual speech
quality measure is explored in the proposed system. Furthermore, a direct control
approach using SPC which has not yet been explored in mobile communications
systems is also used in the system. An FQI method used for estimation of the perceptual
speech quality in this research application is also discussed.
The proposed technique aims to maximize the battery life of the mobile stations
and to provide the QoS required by the customers. It also aims to avoid wasting
power and to increase network capacity. Since the transmission channel varies
due to UE movement, the BS transmit power should be adjusted accordingly to
ensure that the received signal stays within the customer’s requirements. If the
BS transmit power does not follow those channel variations, the received signal
will either drop below the requirement and thus degrade QoS, or exceed the
customer demands and eventually waste power and network capacity.
Because customer requirements must be fulfilled, monitoring and control of the
QoS by the service provider is essential in mobile communication systems. To date,
speech quality has been measured and controlled based on radio link measurements
such as SIR, BER or FER, depending on where they are measured at the receiver.
However, these parameters measure the quality of the received radio signal, or the
integrity of the detected bits or frames, but not the speech quality [75].
FER is widely used in communication systems such as in the 3G UMTS because
it is considered a good measure of speech quality. However, FER is not a truly
perceptual measure of speech quality. It only measures the number of frames of data
that contain errors and does not process the human information content or speech.
Speech quality is accurately measured using a perceptual measure, e.g., the ITU-T
recommendation P.862 for reference quality method, PESQ. In fact, for a given FER,
the perceptual speech quality, expressed by a subjective MOS, is a random variable
whose statistical expectation is predicted by the FER.
Figure 3.1: Example of perceptual speech quality experienced by more than 30
end users in a simulated 3G UMTS network.
Figure 3.1 shows an example of perceptual speech quality experienced by more
than 30 end users in simulated 3G UMTS networks. Although the FER is kept at 1% for
all users, the quality experienced by them is mixed. While some may be satisfied with
the speech quality they are experiencing others will be dissatisfied. Network operators
may provide better quality for these unsatisfied customers by providing a lower FER,
such as 0.5% for everyone, in which case, power and hence network capacity is wasted
for those customers who were already experiencing satisfactory quality. Therefore, we
propose a model which employs PESQ together with link adaptation techniques such as
speech codec rate and power control for providing the speech quality which is adequate
for all users.
Furthermore, the proposed model also includes the application of SPC, which is
novel in communication systems. Control of perceptual speech quality using
mechanisms such as power control and a “hybrid” control mechanism has been studied
and applied before [4, 5, 76]. However, a direct control approach using tools
such as Statistical Process Control (SPC) is attempted here for the first time.
[Figure 3.1 plot. Title: MOS values for FER=1%. Horizontal axis: Sample number (0 to 35). Vertical axis: Mean Opinion Score (MOS), 1 to 5, labelled Bad, Poor, Fair, Good, Excellent.]
3.1 Proposed Perceptual Speech Quality Control Model
3.1.1 Motivation
PESQ is designed for a wide range of network conditions and error types [2]. However,
the shortest period over which PESQ can evaluate speech quality is 320 ms [2, 8], which
is too long for effective control of quality in the network. The FD, which is an
intermediate quantity extracted from PESQ, is calculated every 16 ms. Even though
16 ms is too short for assessing the speech quality, it is suitable for control purposes.
As such, a parameter based on FD is proposed as a perceptual metric to replace
non-perceptual measures such as FER. The details of PESQ and FD are discussed in
Sections 3.2 and 3.3 respectively.
The parameter based on the calculated FD is proposed for controlling functions of the
transmitter, such as the transmission power, channel coding or speech codec rate, to
maintain a required quality level. In this work, control of the speech codec rate and
of the transmission power is adopted.
3.1.2 Proposed Model
Figures 3.2 and 3.3 show the proposed perceptual speech quality control model, where
the SPC tools CUSUM and EWMA are applied to control the perceptual speech quality
based on FD. The quantity $\log(FD_n)$ is proposed as a new parameter to control the
perceptual speech quality in the model. The log here is the natural logarithm, as
computed by the log function in Matlab. Figure 3.2 is the model to control the speech
codec rate, whereas Figure 3.3 is the model to control the transmission power at the
transmitter in the UMTS. As PESQ requires both reference and degraded signals to
evaluate the perceptual speech quality and eventually $\log(FD_n)$, a synthesized version
of the degraded speech signal is required. Therefore, FQI, which is associated with the
Frame Erasure Pattern (FEP), is applied in the model.
The details of FQI, CUSUM and EWMA are discussed in Sections 3.4, 3.5 and
3.6 respectively. The application of the proposed technique to speech codec rate and
power control in UMTS is investigated in Chapters 4 and 5.
Figure 3.2: Proposed model for speech codec rate control.
Since the proposed model for power control requires feedback from the receiver
end of the communication link, the Closed Loop Power Control (CLPC) in FDD mode
is applied in this model. For SPC based power control, the inner loop power
control is not included in the model because its update rate of 1500 updates per
second (1500/s) is too fast for the PESQ algorithm to provide a good estimate of
speech quality. In contrast, the outer-loop update rate can be as low as 50/s, which
is suitable for this application. Details of the CLPC process in FDD mode for the
proposed model and of the SPC based power control are discussed in Sections 3.7 and
3.8 respectively.
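As a rough check on these rates, the update intervals can be written out. This is only back-of-envelope arithmetic on the figures quoted above, together with the 20 ms AMR frame duration given in Section 3.1.3 and the 16 ms FD interval given in Section 3.1.1; the symbols $T_{\text{inner}}$ and $T_{\text{outer}}$ are introduced here purely for illustration:

\[
T_{\text{inner}} = \frac{1}{1500\ \text{s}^{-1}} \approx 0.67\ \text{ms}, \qquad
T_{\text{outer}} = \frac{1}{50\ \text{s}^{-1}} = 20\ \text{ms}.
\]

An outer-loop update at 50/s therefore spans one 20 ms speech frame, which is of the same order as the 16 ms FD interval, whereas an inner-loop update is roughly 24 times shorter than the FD interval.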
Figure 3.3: Application of CUSUM/EWMA based on $FD_n$.
[Figure 3.2 block labels: Speech Signal $x_n$, AMR Encoder, Channel Model, AMR Decoder, Receiver $y_n$, PESQ, $\log(FD_n)$, SPC, $FQI_n$ (approximation required).]
[Figure 3.3 block labels: Transmitter, Channel, Receiver, Speech Signal, Received Signal, Reference Signal, AMR Encoder, Outer-loop Power Control, CUSUM/EWMA, $\log(FD_n)$, PESQ, Synthesized Signal $y_n$, AMR Decoder (twice), CRC Check, FEP ($FQI_n$), Delay (twice).]
3.1.3 Original Input Speech File and Speech Codec
Original speech signals have been obtained from the ITU database for voice quality
measurement tests [77]. The signals were stored in files and pre-recorded in 16-bit
linear Pulse Code Modulation (PCM) which is in binary format. Each of these
constituent speech files contained pre-recorded sentences of 8 seconds duration with
approximately 50% speech and 50% silence intervals.
The AMR speech codec is the standard codec for UMTS. It was used in the
analysis at both the transmitter and the receiver. The AMR codec is based on the
Algebraic Code Excited Linear Prediction (ACELP) technique [78, 79]. It encodes speech
into frames of 20 ms duration, and the encoded bits are rearranged into classes A, B
and C in decreasing order of their perceptual importance. There are eight codec modes,
and the number of bits in each frame depends on the mode, as summarized in Table 3.1.
The usage of AMR requires optimized link adaptation that selects the best codec
mode to meet the local radio channel and capacity requirements. The codec mode is
proportional to the quality of the speech, where a higher codec mode will result in better
speech quality and vice versa [80].
Table 3.1: Number of bits in Classes A,B, and C for each AMR codec mode [78].
Codec   Codec rate   No. of bits   No. of         No. of         No. of
mode    (kb/s)       per frame     Class A bits   Class B bits   Class C bits
0       4.75         95            42             53             0
1       5.15         103           49             54             0
2       5.90         118           55             63             0
3       6.70         134           58             76             0
4       7.40         148           61             87             0
5       7.95         159           75             84             0
6       10.2         204           65             99             40
7       12.2         244           81             103            60
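The entries of Table 3.1 can be cross-checked against the 20 ms frame duration quoted above. The following is a minimal sketch, not part of the original analysis: the names `AMR_MODES`, `bits_per_frame` and `inconsistent_modes` are hypothetical, and the check only confirms that each codec rate times 20 ms equals the listed bits per frame and that the class A, B and C bits add up to that total.

```python
# Rows of Table 3.1: (mode, rate in kb/s, bits per frame, class A, class B, class C).
AMR_MODES = [
    (0, 4.75, 95, 42, 53, 0),
    (1, 5.15, 103, 49, 54, 0),
    (2, 5.90, 118, 55, 63, 0),
    (3, 6.70, 134, 58, 76, 0),
    (4, 7.40, 148, 61, 87, 0),
    (5, 7.95, 159, 75, 84, 0),
    (6, 10.2, 204, 65, 99, 40),
    (7, 12.2, 244, 81, 103, 60),
]

FRAME_MS = 20  # AMR frame duration stated in the text


def bits_per_frame(rate_kbps, frame_ms=FRAME_MS):
    """Bits in one frame: kb/s multiplied by ms gives bits."""
    return round(rate_kbps * frame_ms)


def inconsistent_modes(modes=AMR_MODES):
    """Return the codec modes whose table row is not self-consistent."""
    bad = []
    for mode, rate, total, a, b, c in modes:
        if bits_per_frame(rate) != total or a + b + c != total:
            bad.append(mode)
    return bad


if __name__ == "__main__":
    print("inconsistent modes:", inconsistent_modes())
```

An empty list from `inconsistent_modes()` indicates that every row agrees with the stated frame duration and class split.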
3.2 PESQ
Figure 3.4: PESQ block diagram.
Figure 3.4 shows the basic block diagram of PESQ [2, 8]. The algorithm requires both
the original and the degraded speech signals to make a comparison. Signals are
transformed frame-by-frame according to a perceptual model, which represents the
human auditory system. The transformed signals are subtracted to calculate the FD for
each degraded frame. The FD represents the perceptual difference between the two
signals; it is aggregated over all frames, and a mapping function is used to give a MOS
value for the degraded signal.
The PESQ algorithm has been extensively used in measurement tools for
accurate assessment of perceptual speech quality in modern telecommunication
networks. Figure 3.5 shows the structure of the PESQ model, and each of its
blocks is briefly described below. The details can be found in [2, 8].
Figure 3.5: Structure of PESQ model [2].
[Figure 3.5 block labels: Reference signal, System under test, Degraded signal, Level align (twice), Input filter (twice), Time align and equalise, Auditory Transform (twice), Disturbance processing, Cognitive Modelling, Identify bad interval, Re-align bad interval, Prediction of perceived speech quality.]
[Figure 3.4 block labels: Original speech frames $x_n$, Degraded speech frames $y_n$, Time Alignment (delays), Perceptual Model (twice), Internal representation of $x_n$, Internal representation of $y_n$, Disturbance $D_n$, $DA_n$, Averaging & Mapping, MOS.]
3.2.1 Level Alignment
Both reference and degraded signals go through level alignment to ensure that both
have the same constant power level and hence the same standard listening level.
The process involves filtering the signals, computing their power and
finally applying gains to align them.
3.2.2 Input Filtering
The level-aligned signal is filtered, using a Fast Fourier Transform (FFT), with an
input filter that models a standard telephone handset which has Intermediate Reference
System (IRS) or modified IRS receive characteristics.
3.2.3 Time Alignment and Equalization
A time delay is needed to align the degraded and original signals so that
corresponding parts of the two signals can be compared. Both silence and speech
periods are accounted for by the time alignment and equalization process. The process
is performed part by part on the speech signal; each such part is called an utterance.
3.2.4 Auditory Transform
In the auditory transformation, both degraded and reference signals are mapped into a
representation of perceived loudness in time and frequency, based on human
hearing. This transformation process involves the following stages:
Time-frequency mapping: A short-term FFT with a Hann window over 32 ms frames
is used to transform the reference and degraded signals into individual time-frequency
cells, and the instantaneous power spectrum in each frame is calculated.
Bark spectrum: The instantaneous power spectrum in each frame is summed into 42
bins on the modified Bark scale [31].
Frequency equalization: The average Bark spectrum over the non-silent speech
frames is calculated. As the frequency response of the system under test is assumed to
be constant, the ratio between the spectra of the reference and degraded signals gives a
transfer function estimate. This estimate is used to equalise the reference and degraded
signals in frequency.
Equalization of gain variation: In each frame, the ratio between the audible power of
the reference and degraded signals is calculated to identify the gain variations. Hence,
this ratio is used to equalise the gain of the reference and degraded signals in that frame
after being filtered with a first-order low pass filter and bounded.
Loudness mapping: The bark spectrum is mapped to a psychoacoustic scale (Sone) of
loudness to give the indication of perceived loudness in each time-frequency cell.
3.2.5 Disturbance Processing and Cognitive Modelling
The absolute difference in loudness density between the reference and the degraded
signals is calculated as the audible error measure. In PESQ, several
steps are involved before the calculation of a non-linear average over time and frequency.
Deletion: If the difference in loudness density between the degraded and reference
signals is negative, components have been deleted from the original signal. This
deletion leaves a part which overlaps in the degraded signals.
Masking of small disturbance: Masking is modelled using a simple threshold below
which distortions in the degraded signal are inaudible. This threshold is subtracted
from the absolute loudness difference between the degraded and the reference signals
for each frame. The masked value of the absolute loudness difference of each frame is
called the frame disturbance.
Asymmetry: The asymmetry factor is computed from the ratio of the Bark spectral
density of the degraded and the reference signals in each time-frequency cell.
Multiplying each frame disturbance by this factor gives the asymmetric
weighted disturbance, which consequently measures only the additive distortions.
3.2.6 Disturbance Aggregation and MOS Prediction
PESQ calculates two different average disturbance values, one with the asymmetry
factor and one without it. A linear combination of both average disturbances gives the
final PESQ MOS score. The range of the PESQ score is from -0.5 to 4.5.
3.2.7 Realignment of Bad Intervals
If consecutive frame disturbance values are above a given threshold, the frame or
section is identified as a bad interval, i.e. the frame disturbance value is more than 45
and the bad frames are separated by fewer than 5 good frames. Each identified bad
frame or section is realigned and its disturbance is recalculated. A new delay
estimate is calculated using cross-correlation, and the auditory transformation of the
degraded signal is recalculated to give a new disturbance value. For each frame, the
new value is used if the realignment gives a lower disturbance value.
3.3 Frame Disturbance
The signed difference between the distorted and original loudness density in PESQ is
called the raw disturbance density. The minimum of the original and degraded loudness
density is computed for each time-frequency cell. This results in a disturbance density
as a function of time (window number n) and frequency, $D_n(f)$. As PESQ involves
asymmetry effect processing, the asymmetrical disturbance density, $DA_n(f)$, is also
aggregated [2, 8].
The disturbance, $D_n$, and the asymmetrical disturbance, $DA_n$, are calculated by a
non-linear average as below:
\[
D_n = M_n \sqrt[3]{\sum_{f=1,\ldots,Nb} \left( |D_n(f)|\, W_f \right)^3} \tag{3.1}
\]
\[
DA_n = M_n \sum_{f=1,\ldots,Nb} |DA_n(f)|\, W_f \tag{3.2}
\]
$M_n$ is a multiplication factor equal to $((\text{power of original frame} + 10^5)/10^7)^{-0.04}$,
and $Nb$ is the number of Bark bands. $W_f$ is a series of constants which are
proportional to the width of the modified Bark bins.
This results in disturbance and asymmetrical disturbance signals that represent
how distorted the speech is during a very short period of time (16 ms). Details can be
found in [8]. The linear combination of the disturbance and asymmetrical disturbance
values gives the final disturbance, which is referred to as $FD_n$ throughout this
thesis:
\[
FD_n = D_n + DA_n \tag{3.3}
\]
Figure 3.6 illustrates the concept of applying $\log(FD_n)$ for controlling the
perceptual quality of the degraded or received signal, $y_n$, used in this research.
The original signal and the degraded signal are required by PESQ at the transmitter
to calculate the frame disturbance for each frame.
These calculated frame disturbances can then be used for controlling functions
of the transmitter such as the transmission power, channel coding, or speech codec rate
to maintain a required quality level. In the absence of $y_n$ at the transmitting side,
the PESQ must use an approximation of $y_n$.
One possibility for calculating $\log(FD_n)$ at the transmitting side, where the
control is applied, is to use the FEP information through FQI, which has been
successfully applied before [2, 4, 81, 82].
As PESQ requires both degraded and reference signals at the transmitter to
evaluate the perceptual speech quality, and hence to calculate $FD_n$ and subsequently
$\log(FD_n)$, a synthesized degraded speech signal has been used. The approximation of
the degraded signal is obtained by applying an FQI feedback method which is associated
with the FEP. The correlation between the MOS obtained with the synthesized signal and
that obtained with the actual degraded signal is high, between 0.82 and 0.91.
Figure 3.6: $\log(FD_n)$ concept in controlling perceptual quality.
[Figure 3.6 block labels: Transmitter, Channel, Receiver, PESQ, CUSUM/EWMA, Original frames $x_n$, Degraded frames $y_n$, $\log(FD_n)$, $FQI_n$ (approximation required).]
3.4 FQI Feedback Method
The output of the speech encoder consists of bits of data which have unequal perceptual
importance. Hence, the bits are often rearranged according to their perceptual
importance before error protection against transmission errors is applied.
Consequently, the bits which are more important for the reconstruction of speech are
protected more effectively than those which are less important.
The Third Generation (3G) UMTS Adaptive Multi-Rate (AMR) codec is adopted in
this research. In 3G AMR, the encoded speech bits within a 20 ms speech frame are
rearranged based on their perceptual importance. They are classified into class A (the
most important bits), class B, and class C (the least important bits) [83]. Errors in
class A can cause severe damage to speech reproduction, whereas class B and class C
bit errors can be tolerated. Therefore, in a typical implementation, class A bits are
protected by rate-1/3 Convolutional Coding (CC), class B bits by rate-1/2 CC, and
class C bits may not be protected at all. As an additional form of error protection, an
error-concealment mechanism is also provided to mask the effects of class A bit errors
at the receiver [84].
Figure 3.7 shows the classification of the encoded speech bits and their unequal
error protection scheme for UMTS.
Figure 3.7: The classification of the encoded speech bits and their unequal error
protection scheme for UMTS.
[Figure 3.7 block labels: Digitized Speech, AMR speech Codec, bit-class branches, CRC, Convolutional Coding (Rate 1/3, Rate 1/2, Rate 1/2).]
Error concealment is based on a Cyclic Redundancy Check (CRC), which corresponds
to the FQI. In a cellular system, every transmitted frame is sent with a
CRC word. The CRC is used for checking the integrity of the received frame before it
goes through speech decoding. If there are bit errors in class A which cause a frame
erasure, this is indicated by failure of the CRC, and the frame is called a ‘bad’ frame.
The class A bits of that frame are replaced with the corresponding bits from the last
frame whose class A bits were free of errors.
For estimation of perceptual speech quality, the FEP is sent back to the transmitter
through an FQI feedback method in which the receiver sends information back to the
transmitter to indicate whether the received frame was “good” or “bad”.
The FQI is a binary flag, with “good=0” and “bad=1”, indicating whether or not the
received frame should be erased. Figure 3.8 shows the block diagram of the FQI feedback
method in conjunction with PESQ, as used in this research.
Figure 3.8: Block diagram of FQI feedback method.
In this diagram, a corresponding binary signal, denoted by $FQI_n$, is sent back to
the transmitter, where n is the frame number. At the transmitter, copies of the
transmitted frames are tagged with the corresponding $FQI_n$ for the received frames.
These frames are then sent to the speech decoder. The output of the decoder is a
synthesised version of the degraded signal, $y_n$. Subsequently, the frame disturbance
of each frame of the synthesized signal is calculated by the PESQ algorithm.
Unlike its commonly used counterparts, the FQI feedback method
utilizes the acoustic information present in the received signal at the receiver. Its
estimate of the perceived quality is therefore a true measure and is not based only on
statistical expectations.
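The feedback path described above can be sketched in a few lines. This is an illustrative simplification under stated assumptions, not the implementation used in this research: the function names are hypothetical, frames are treated as opaque objects, and the AMR decoder's error concealment is approximated by repeating the whole last good frame rather than only its class A bits.

```python
def fqi_from_crc(crc_ok):
    """Map per-frame CRC results to FQI flags: 0 = good frame, 1 = bad frame."""
    return [0 if ok else 1 for ok in crc_ok]


def synthesize_degraded(tx_frames, fqi, initial=None):
    """Transmitter-side approximation of the degraded frame sequence.

    Each copy of a transmitted frame is tagged with its FQI flag. Frames
    flagged bad are replaced by the last good frame (assumed stand-in for
    the decoder's concealment); `initial` is used before any good frame.
    """
    if len(tx_frames) != len(fqi):
        raise ValueError("one FQI flag is needed per transmitted frame")
    synthesized, last_good = [], initial
    for frame, flag in zip(tx_frames, fqi):
        if flag == 0:
            last_good = frame
        synthesized.append(last_good)
    return synthesized


def frame_erasure_rate(fqi):
    """Fraction of frames flagged bad in the frame erasure pattern."""
    return sum(fqi) / len(fqi) if fqi else 0.0


if __name__ == "__main__":
    tx = ["f0", "f1", "f2", "f3"]           # copies of transmitted frames
    fqi = fqi_from_crc([True, False, False, True])
    print(fqi, synthesize_degraded(tx, fqi), frame_erasure_rate(fqi))
```

In the scheme described in this section, the synthesized sequence would then be passed to PESQ together with the original signal to obtain the frame disturbance; that step is outside the scope of this sketch.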
[Figure 3.8 block labels: Transmitter, Receiver, Physical Layer, Speech Encoder, Speech Decoder (twice), PESQ, CRC Check, Delay (twice), Original signal $x_n$, Degraded signal $y_n$, Synthesized degraded signal $y_n$, Frame disturbance $FD_n$, Tx Frame, Tx Frame + FQI, Rx Frame + CRC, Rx Frame + FQI, $FQI_n$ (FQI=0 if CRC is “good”, FQI=1 if CRC is “bad”).]
3.5 CUSUM
The cumulative sum control chart was introduced by Page in 1954 [7] and has been
studied by many researchers such as Ewan (1963) [85], Lucas (1976) [86], Gan (1991) [87],
Hawkins (1981, 1993) [88], and Woodall and Adams (1993) [89]. Among the many
control chart schemes, it has been argued that CUSUM charts are the most appropriate
and relevant to quality control [85, 90, 91]. The CUSUM technique is also among
the most powerful tools for detecting a shift in a wide range of distributions. It is
naturally applied to normally distributed data [92]. Extensions of this technique have
been explored by many researchers. It started with the application of the CUSUM chart
in economic projects to control the process average with a normally distributed quality
characteristic [93]. This technique is employed in determining the optimum values of
the sample size, the sampling interval and the decision limit.
In current practice, the formulae and equations presented by Montgomery are used
[68], in which CUSUM charts are applied to monitoring the process average and
variability. A CUSUM chart directly incorporates all the information in the sequence
of sample values by plotting the cumulative sums of the deviations of the sample
values from the target value. Let the sample size¹ be $s \ge 1$, let $\bar{x}_j$ be the
average of the jth sample and let $\mu_0$ be the target value for the process average.
The CUSUM control chart up to frame N is formulated as
\[
C_N = \sum_{j=1}^{N} (\bar{x}_j - \mu_0) \tag{3.4}
\]
where $C_N$ is the cumulative sum up to and including the Nth frame. Because the CUSUM
control chart combines information from several samples, CUSUM charts
are more effective than the common control chart (the Shewhart chart) for detecting a
small process shift. Furthermore, CUSUM is particularly effective with sample size
s = 1 [68]. If the process remains in control at the target value $\mu_0$, the CUSUM
defined in (3.4) describes a random walk with zero mean. However, if the average shifts
upwards to some value $\mu_1 > \mu_0$, an upward drift will develop in the CUSUM
$C_N$. Conversely, if the average shifts downward to some value $\mu_1 < \mu_0$, the
CUSUM will drift in the negative direction. If an upward or downward trend develops
in the plotted points, it must be considered as evidence that the process average has
changed, and the cause of that change must be investigated and rectified.
¹ Sample size in this case is the sample size per calculation, which is the number of
measurements used to calculate each value in the CUSUM.
There are two ways of representing the CUSUM charts: the algorithmic or
tabular CUSUM [68, 94] and the V-mask form of the CUSUM [86, 95, 96].
3.5.1 Tabular CUSUM
Tabular CUSUM will be used in this research. Tabular CUSUM works by accumulating
deviations from $\mu_0$ that are above the target with the statistic $C^+$ and those
that are below the target with the statistic $C^-$. The statistics $C^+$ and $C^-$ are
called the one-sided upper and lower CUSUM, respectively. With the log of the frame
disturbance variable $FD_n$, which has a normal distribution (refer to Section 4.1)
with mean $\mu$ and standard deviation $\sigma$, $\log(FD_n) \sim N(\mu, \sigma)$, the
cumulative sums for detecting upward and downward shifts in the mean are calculated as
below:
\[
C_n^+ = \max[0,\; C_{n-1}^+ + \log(FD_n) - (\mu_0 + K)] \quad \text{for an upward shift} \tag{3.5}
\]
\[
C_n^- = \min[0,\; C_{n-1}^- + \log(FD_n) - (\mu_0 - K)] \quad \text{for a downward shift} \tag{3.6}
\]
where $\mu_0$ is the target mean and $K$ is usually called the reference value, the
allowance or the slack value. $K$ is often chosen about halfway between the target mean
and the out-of-control value of the mean, $\mu_1$, that we are interested in detecting
quickly. The starting values are $C_0^+ = C_0^- = 0$.
The tabular CUSUM is designed by choosing values for the reference value $K$
and the decision interval or threshold $H$. The threshold is applied on both sides of
the tabular CUSUM, giving the upper CUSUM and lower CUSUM limits. Usually, these
parameters are selected to provide good Average Run Length (ARL) performance. The ARL
is the average number of points which must be plotted before a point indicates an
out-of-control condition. Define $H = h\sigma_0$ and $K = k\sigma_0$, where $\sigma_0$
is the standard deviation of the sample variable used in performing CUSUM, which in
this research is $\log(FD_n)$.
Using $h = 4$ or $h = 5$ with $k = 1/2$ will generally provide a CUSUM that has good
ARL properties for a shift of $1\sigma$.
Note that $C^+$ and $C^-$ accumulate deviations from the target value $\mu_0$ that are
greater than $K$, with $C^+$ reset to zero whenever it would become negative and $C^-$
reset to zero whenever it would become positive. If $C^+$ exceeds $+H$ or $C^-$ falls
below $-H$, the process is considered to be out of control.
A reasonable value for $H$ is five times the process standard deviation $\sigma$.
The upper CUSUM limit is the threshold on the positive side ($+H$), whereas the lower
CUSUM limit is the threshold on the negative side ($-H$).
Figure 3.9 shows a sample tabular CUSUM based on Table 3.2, which lists the
sample values of $\log(FD_n)$ and the values of the upward and downward CUSUM. The
process standard deviation is $\sigma_0 = 0.120$, with $k = 1/2$ and $h = 4$. Therefore,
$K = 0.060$ and $H = 0.478$. Subsequently, the upper and lower CUSUM limits are 0.478
and -0.478 respectively. The target value $\mu_0$ is set to 0.781. The initial CUSUM
values are $C_0^+ = C_0^- = 0$.
Table 3.2: The parameters for the sample of Tabular CUSUM chart.
Period n   log(FD_n)   C+      C-
1          0.696       0.000   -0.025
2          0.847       0.006   0.000
3          0.785       0.000   0.000
4          0.841       0.000   0.000
5          0.765       0.000   0.000
6          0.840       0.000   0.000
7          0.941       0.100   0.000
8          0.550       0.000   -0.172
9          0.639       0.000   -0.254
10         0.882       0.041   -0.093
11         0.768       0.000   -0.047
12         0.791       0.000   0.000
13         0.918       0.077   0.000
14         0.892       0.128   0.000
15         0.847       0.134   0.000
16         0.868       0.161   0.000
17         0.898       0.217   0.000
18         0.569       0.000   -0.153
19         0.687       0.000   -0.187
20         0.600       0.000   -0.308
To illustrate the calculation, consider period 1. The equations for $C^+$ and $C^-$ are
as follows:
\[
\begin{aligned}
C_1^+ &= \max[0,\; C_0^+ + \log(FD_1) - (\mu_0 + K)] \\
      &= \max[0,\; 0 + 0.696 - (0.781 + 0.060)] \\
      &= \max[0,\; -0.145] \\
      &= 0
\end{aligned}
\]
\[
\begin{aligned}
C_1^- &= \min[0,\; C_0^- + \log(FD_1) - (\mu_0 - K)] \\
      &= \min[0,\; 0 + 0.696 - (0.781 - 0.060)] \\
      &= \min[0,\; -0.025] \\
      &= -0.025
\end{aligned}
\]
[Figure 3.9 plot. Title: Sample of Tabular CUSUM. Horizontal axis: Sample (1 to 20). Vertical axis: Cusum Chart (-0.6 to 0.6). Series: Upper Cusum (+H), C+, C-, Lower Cusum (-H).]
Figure 3.9: Sample of Tabular CUSUM.
The CUSUM variables $C^+$ and $C^-$ are compared against appropriate thresholds
for detection of upward or downward shifts. The thresholds are chosen based on the
trade-off between the responsiveness of the algorithm and the probability of a false
detection. Generally, thresholds that lead to faster detection of a shift in the mean
will also result in a higher probability of false alarms or false detections [97].
Whenever the process is detected as out of control, a search for the
assignable cause is carried out and corrective action is taken; the CUSUM
chart is then reinitialized to zero. The CUSUM is particularly helpful in determining
when the assignable cause occurred.
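The tabular CUSUM recursion of (3.5) and (3.6) can be reproduced on the sample data of Table 3.2. The sketch below is illustrative only: the function name `tabular_cusum` and the list name are hypothetical, the parameter values are those quoted for the sample chart, and the reinitialization after an alarm described above is deliberately left out to keep the example short.

```python
# Sample values of log(FD_n) copied from Table 3.2.
TABLE_3_2_LOG_FD = [
    0.696, 0.847, 0.785, 0.841, 0.765, 0.840, 0.941, 0.550, 0.639, 0.882,
    0.768, 0.791, 0.918, 0.892, 0.847, 0.868, 0.898, 0.569, 0.687, 0.600,
]


def tabular_cusum(samples, target, k, h):
    """Return (c_plus, c_minus, alarms) for the two one-sided CUSUMs.

    c_plus follows (3.5) and is never negative; c_minus follows (3.6) and is
    never positive. `alarms` lists the 1-based periods where c_plus > h or
    c_minus < -h. No reset is applied after an alarm in this sketch.
    """
    c_plus, c_minus, alarms = [], [], []
    cp = cm = 0.0  # starting values C0+ = C0- = 0
    for n, x in enumerate(samples, start=1):
        cp = max(0.0, cp + x - (target + k))
        cm = min(0.0, cm + x - (target - k))
        c_plus.append(cp)
        c_minus.append(cm)
        if cp > h or cm < -h:
            alarms.append(n)
    return c_plus, c_minus, alarms


if __name__ == "__main__":
    cp, cm, alarms = tabular_cusum(TABLE_3_2_LOG_FD, target=0.781, k=0.060, h=0.478)
    for n, (a, b) in enumerate(zip(cp, cm), start=1):
        print(f"{n:2d}  C+ = {a:.3f}  C- = {b:.3f}")
    print("out-of-control periods:", alarms)
```

With these inputs neither statistic crosses its threshold, so the list of out-of-control periods is empty. The computed values can differ from Table 3.2 in the third decimal place, presumably because the table was produced from unrounded data.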
3.5.2 The V-mask
The V-mask control scheme is the alternative to the tabular CUSUM and was proposed by
Barnard in 1959 [95]. It is applied to the successive values of the CUSUM statistic as
below:
\[
C_n = \sum_{j=1}^{n} y_j = y_n + C_{n-1} \tag{3.7}
\]
where $y_n = (x_n - \mu_0)/\sigma$. Figure 3.10 shows a typical V-mask.
Figure 3.10: A sample of out of control V-Mask [98].
In this control scheme, the V-mask is formed by plotting V-shaped limits. The
V-mask is placed on the CUSUM control chart with its Origin point on the last value of
$C_n$ and the Origin-Vertex line parallel to the horizontal axis. The process is in
control if all the previous CUSUM values ($C_1, C_2, \ldots, C_n$) lie between the upper
arm and the lower arm. On the other hand, the process is considered out of control if
any of the previous CUSUM values lies outside the arms. In actual use, the V-mask would
be applied to each new point of the CUSUM chart as soon as it was plotted.
[Figure 3.10 plot axes: horizontal axis n, vertical axis $C_n$.]
The performance of the V-mask is determined by the distance $d$ and the angle $\theta$,
as shown in Figure 3.10.
Johnson [99] recommended the optimum values of $d$ and $\theta$ using the following
equations:
\[
\theta = \tan^{-1}\left(\frac{\delta}{2A}\right) \tag{3.8}
\]
and
\[
d = \left(\frac{2}{\delta^2}\right) \ln\left(\frac{1-\beta}{\alpha}\right) \tag{3.9}
\]
where $\alpha$, $\beta$ and $\delta$ are parameters that have to be chosen appropriately.
$\alpha$ is the probability of a false alarm, where $2\alpha$ is the highest allowable
probability of a signal when the process mean is in control, and $\beta$ is the
probability of not detecting a shift of size $\delta$. The value $A$ is a scale factor
chosen to make the resulting graph easily readable, as shown in Figure 3.11. Many
computer programs use Johnson’s method, e.g. Statgraphics, which uses the following
default values: $\delta = 1$, $\alpha = 0.05$ and $\beta = 0.05$ [68].
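For orientation, (3.8) and (3.9) can be evaluated numerically for the default values quoted above. This is a minimal sketch: the function name `vmask_parameters` is hypothetical, and the scale factor A = 1 is an assumption made only for this example, since no default for A is given in the text.

```python
import math


def vmask_parameters(delta, alpha, beta, a=1.0):
    """Return (d, theta) from Johnson's equations (3.9) and (3.8).

    delta: shift size to detect, alpha and beta: error probabilities,
    a: scale factor A (assumed to be 1 here). theta is in radians.
    """
    theta = math.atan(delta / (2.0 * a))
    d = (2.0 / delta ** 2) * math.log((1.0 - beta) / alpha)
    return d, theta


if __name__ == "__main__":
    d, theta = vmask_parameters(delta=1.0, alpha=0.05, beta=0.05, a=1.0)
    print(f"d = {d:.3f}, theta = {math.degrees(theta):.2f} degrees")
```

Under these assumptions the lead distance is about 5.89 and the arm angle is about 26.6 degrees; a different choice of A changes only the angle.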
Figure 3.11: The physical distance between subgroup samples is equivalent to a
unit on the vertical axis.
The application of the V-mask was established and modified by Lucas in [86, 96].
Due to its complexity, the V-mask method is not always practical, as is the case
in this research application. It is difficult to determine how far backwards the
arms of the V-mask should extend when the V-mask is applied at each new point of the
CUSUM. Furthermore, the ambiguity associated with $\alpha$ and $\beta$ may cause
severe problems in V-mask application. Hence, the tabular CUSUM control scheme is
adopted in this research.
3.6 EWMA
An alternative technique for detecting small shifts is the EWMA, which was
developed by S.W. Roberts in 1959 [100]. EWMA was found to be more efficient for
monitoring stationary autocorrelated processes [101] and the mean of skewed
populations [102].
Like CUSUM, the EWMA control chart is good at detecting small shifts.
It is approximately equivalent to CUSUM and is easy to set up and operate. Like
CUSUM, it is typically used with individual observations. The EWMA is often superior
to the CUSUM charting technique for detecting “larger” shifts and, unlike CUSUM, is
not sensitive to the normality assumption [103]. It is sometimes called a Geometric
Moving Average (GMA) and is used extensively in time series modelling and in
forecasting [104, 105]. EWMA is defined as [106]
\[
z_n = \lambda \log(FD_n) + (1 - \lambda) z_{n-1} \tag{3.10}
\]
where n is the number of the frame being monitored and $0 < \lambda \le 1$ is a
constant. The starting value is the process target, so that
\[
z_0 = \mu_0
\]
Alternatively, the mean of preliminary data, i.e. the mean of several $\log(FD_n)$
values, can be used as the starting value of the EWMA, so that
\[
z_0 = \overline{\log(FD_n)}
\]
If the $\log(FD_n)$ values are independent random variables with variance $\sigma^2$,
then the variance of $z_n$ is
\[
\sigma_{z_n}^2 = \sigma^2 \left(\frac{\lambda}{2-\lambda}\right) [1 - (1-\lambda)^{2n}] \tag{3.11}
\]
Therefore, as with CUSUM, an EWMA control chart is constructed by
plotting the EWMA values, $z_n$, versus the frame number n or time. Unlike
CUSUM, which has constant control limits, the upper and lower control
limits of EWMA control charts depend on the target value $\mu_0$ and the
frame number n. The centre line and control limits are as follows:
\[
UCL = \mu_0 + L\sigma \sqrt{\frac{\lambda}{(2-\lambda)} [1 - (1-\lambda)^{2n}]} \tag{3.12}
\]
\[
CL = \mu_0 \tag{3.13}
\]
\[
LCL = \mu_0 - L\sigma \sqrt{\frac{\lambda}{(2-\lambda)} [1 - (1-\lambda)^{2n}]} \tag{3.14}
\]
where $\sigma$ and n are the standard deviation and the frame number of the data. The
factor $L$ is the width of the control limits.
In general, values of $\lambda$ in the interval $0.05 \le \lambda \le 0.25$ work well in
practice, with $\lambda = 0.05$, $\lambda = 0.10$ and $\lambda = 0.20$ being popular
choices. A good rule of thumb is to use smaller values of $\lambda$ to detect smaller
shifts. Hunter (1989) [107] recommended $\lambda = 0.4$ and $L = 3.054$.
As n gets larger, the control limits will approach steady-state values given by the
following equations:
\[ UCL = \mu_0 + L\sigma \sqrt{\frac{\lambda}{2 - \lambda}} \tag{3.15} \]
\[ LCL = \mu_0 - L\sigma \sqrt{\frac{\lambda}{2 - \lambda}} \tag{3.16} \]
Figure 3.12 shows a sample EWMA chart based on the sample values in Table 3.3. As
with CUSUM, the process is in control as long as the EWMA value z_n stays within the
upper and lower control limits of the chart. Table 3.3 lists the sample values of
log(FD_n) and z_n together with the upper and lower control limits of the EWMA. The
parameters selected were a target value µ_0 = 0.356, L = 3, σ = 0.079 and λ = 0.2.
Table 3.3: EWMA parameters for the sample of EWMA chart.
log(FD_n)   z_n     UCL     LCL
0.268       0.338   0.404   0.308
0.368       0.344   0.417   0.295
0.353       0.346   0.424   0.288
0.321       0.341   0.428   0.283
0.429       0.359   0.431   0.281
0.356       0.358   0.433   0.279
0.401       0.367   0.434   0.278
0.428       0.379   0.434   0.278
0.311       0.365   0.435   0.277
0.201       0.332   0.435   0.277
0.323       0.331   0.435   0.277
0.383       0.341   0.435   0.277
0.368       0.346   0.435   0.277
0.293       0.336   0.435   0.277
0.211       0.311   0.435   0.277
0.460       0.341   0.435   0.277
0.292       0.331   0.435   0.277
0.412       0.347   0.435   0.277
0.377       0.353   0.435   0.276
0.528       0.388   0.435   0.276
[Figure: EWMA chart. x-axis: Sample (1 to 20); y-axis: EWMA (0.2445 to 0.4945); plotted series: Average, UCL, LCL.]
Figure 3.12: Sample of EWMA chart.
To illustrate the calculations, consider the first observation log(FD_1) = 0.268.
The first value of the EWMA, z_1, is
\[ z_1 = \lambda \log(FD_1) + (1 - \lambda) z_0 = (0.2)(0.268) + (1 - 0.2)(0.356) = 0.338 \]
Therefore, z_1 = 0.338 is the first value plotted in Figure 3.12. The second value of
the EWMA, z_2, is
\[ z_2 = \lambda \log(FD_2) + (1 - \lambda) z_1 = (0.2)(0.368) + (1 - 0.2)(0.338) = 0.344 \]
The other values of the EWMA are computed similarly.
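The recursion of equation (3.10) can be sketched in a few lines of Python. This is only an illustrative sketch, not the simulation code used in this work: the function name `ewma` is hypothetical, and the inputs are the first five log(FD_n) values of Table 3.3 with λ = 0.2 and z_0 = µ_0 = 0.356.

```python
def ewma(values, lam, z0):
    """Return the EWMA sequence z_1..z_N defined by equation (3.10)."""
    z = []
    prev = z0
    for x in values:
        # z_n = lambda * log(FD_n) + (1 - lambda) * z_{n-1}
        prev = lam * x + (1.0 - lam) * prev
        z.append(prev)
    return z


# First five log(FD_n) observations from Table 3.3 (illustrative subset).
log_fd = [0.268, 0.368, 0.353, 0.321, 0.429]
z = ewma(log_fd, lam=0.2, z0=0.356)
print([round(v, 3) for v in z])
```

Rounded to three decimals, the values reproduce the first five z_n entries of Table 3.3.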
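The control limits in Figure 3.12 are found using equations (3.12) and (3.14). For
n = 1,
\[ UCL = \mu_0 + L\sigma \sqrt{\frac{\lambda}{2 - \lambda}\left[1 - (1 - \lambda)^{2n}\right]}
       = 0.356 + (3)(0.079) \sqrt{\frac{0.2}{2 - 0.2}\left[1 - (1 - 0.2)^{2(1)}\right]} = 0.404 \]
and
\[ LCL = \mu_0 - L\sigma \sqrt{\frac{\lambda}{2 - \lambda}\left[1 - (1 - \lambda)^{2n}\right]}
       = 0.356 - (3)(0.079) \sqrt{\frac{0.2}{2 - 0.2}\left[1 - (1 - 0.2)^{2(1)}\right]} = 0.308 \]
For n = 2, the limits are
\[ UCL = \mu_0 + L\sigma \sqrt{\frac{\lambda}{2 - \lambda}\left[1 - (1 - \lambda)^{2n}\right]}
       = 0.356 + (3)(0.079) \sqrt{\frac{0.2}{2 - 0.2}\left[1 - (1 - 0.2)^{2(2)}\right]} = 0.417 \]
and
\[ LCL = \mu_0 - L\sigma \sqrt{\frac{\lambda}{2 - \lambda}\left[1 - (1 - \lambda)^{2n}\right]}
       = 0.356 - (3)(0.079) \sqrt{\frac{0.2}{2 - 0.2}\left[1 - (1 - 0.2)^{2(2)}\right]} = 0.295 \]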
As n increases, the control limits widen until they stabilize at the steady-state
values given by equations (3.15) and (3.16):
\[ UCL = \mu_0 + L\sigma \sqrt{\frac{\lambda}{2 - \lambda}}
       = 0.356 + (3)(0.079) \sqrt{\frac{0.2}{2 - 0.2}} = 0.435 \]
\[ LCL = \mu_0 - L\sigma \sqrt{\frac{\lambda}{2 - \lambda}}
       = 0.356 - (3)(0.079) \sqrt{\frac{0.2}{2 - 0.2}} = 0.276 \]
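The time-varying and steady-state limits of equations (3.12) to (3.16) can be evaluated with the short Python sketch below. The function names are hypothetical and the sketch is not the simulation code of this work. Evaluated with the rounded parameters quoted in the text (µ_0 = 0.356, σ = 0.079, λ = 0.2, L = 3), it agrees with the tabulated limits to within about 0.001, a difference that presumably comes from rounding of σ.

```python
import math


def ewma_limits(n, mu0, sigma, lam, L):
    """Return (LCL, UCL) for frame n, following equations (3.12) and (3.14)."""
    half_width = L * sigma * math.sqrt(
        lam / (2.0 - lam) * (1.0 - (1.0 - lam) ** (2 * n))
    )
    return mu0 - half_width, mu0 + half_width


def ewma_steady_state_limits(mu0, sigma, lam, L):
    """Return the steady-state (LCL, UCL) of equations (3.15) and (3.16)."""
    half_width = L * sigma * math.sqrt(lam / (2.0 - lam))
    return mu0 - half_width, mu0 + half_width


for n in (1, 2, 20):
    lcl, ucl = ewma_limits(n, mu0=0.356, sigma=0.079, lam=0.2, L=3)
    print(n, round(lcl, 4), round(ucl, 4))
print(ewma_steady_state_limits(mu0=0.356, sigma=0.079, lam=0.2, L=3))
```

The limits for large n converge to the steady-state pair, which is why the last rows of Table 3.3 no longer change.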
As in CUSUM applications, whenever the process is detected to be out of control,
the assignable cause is investigated and corrective action is taken. The EWMA chart
is then reinitialized to zero and its limits are refreshed.
Like CUSUM, the EWMA performs well against small shifts but does not react to
large shifts as quickly as the Shewhart chart. However, EWMA is superior to
CUSUM only for larger shifts, particularly if λ > 0.1. Whereas CUSUM is
sensitive to the normality assumption, EWMA is insensitive to the statistical
distribution of the data [103]. This is because the EWMA can be viewed as a
weighted average of all past and current observations. Even though log(FD_n) has a
normal distribution, applying EWMA as a control tool in this research is worthwhile
for making the comparison with CUSUM.
3.7 Closed Loop Power Control in FDD Mode
In FDD mode, CLPC is used for both uplink and downlink, whereas in TDD mode it is
used only in the downlink [47, 53]. Furthermore, CLPC in FDD mode is more widely used
in UMTS [66]. The CLPC procedure in UMTS is divided into two processes, the
outer loop and the inner loop [52]. The inner loop process was discussed earlier in
Chapter 2 (Section 2.3.2). The block diagram of UMTS CLPC is shown in Figure 2.10.
The outer loop operates within the BS. It dynamically sets the SIR target for the
inner loop based on the FER target, which is usually 1% for speech services in order
to achieve satisfactory speech quality. The outer loop sets the target SIR at
the BS according to the needs of the end users and aims at constant quality. The
inner loop, on the other hand, regulates the transmit power of the UE, such as a
mobile handset, in an attempt to compensate for signal amplitude fading and meet the
SIR target. When the inner loop is unable to combat channel fading, the FER increases.
Consequently, the outer loop increases the SIR target to maintain the FER target.
3.7.1 Conventional UMTS Outer Loop Power Control Algorithm
Figure 3.13 shows a commonly accepted flow chart of a conventional UMTS outer loop
power control [51]. For each speech frame received at the BS, a CRC is used to check
whether the frame contains errors. If the CRC detects an error in the frame, the SIR
target is increased by K multiplied by a given step size ∆ in dB, where K is a
positive integer. The value of K is related to the desired FER as
\[ K = \frac{1}{FER} - 1 \tag{3.17} \]
The algorithm aims to keep the actual FER less than or equal to the FER target of
equation (3.17). In its steady state, the SIR target is not far from the minimum SIR
required to maintain the FER target. Setting a small ∆ reduces the excess SIR but may
result in inaccurate tracking of the channel variation and a longer convergence time.
Conversely, setting a large ∆ lets the SIR target follow fast channel changes, but it
leads to larger interference and a decrease in the system capacity [51]. Hence, the
outer loop step up size ∆up and step down size ∆down can be formulated as
\[ \Delta_{up} = K \times \Delta \quad \text{and} \quad \Delta_{down} = \Delta \tag{3.18} \]
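As a worked instance of equations (3.17) and (3.18), the block below uses the FER target of 1% and the step size ∆ = 0.005 dB that are adopted for the simulations in Chapter 4 (Table 4.4). The closing remark on the balance between up and down steps is a standard reading of this kind of jump algorithm and is offered here only as an explanatory note.

```latex
K = \frac{1}{FER} - 1 = \frac{1}{0.01} - 1 = 99
\qquad
\Delta_{up} = K \times \Delta = 99 \times 0.005\ \text{dB} = 0.495\ \text{dB}
\qquad
\Delta_{down} = \Delta = 0.005\ \text{dB}
% In steady state one step up is balanced by K steps down, so on average
% one frame in K + 1 = 100 is in error, i.e. FER = 1/(K + 1) = 1%.
```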
Note that the dynamic range of the SIR target is limited. Hence, the new SIR target
is compared with the allowed minimum and maximum limits of the SIR target. If the SIR
target falls outside these limits, it is clamped to the nearest limit. The flow is
repeated for subsequent frames.
[Flow chart: Start -> Check CRC of current frame -> CRC in error? (Yes: SIR target = SIR target + ∆up; No: SIR target = SIR target - ∆down) -> SIR_target > maximum SIR target? (Yes: SIR target = maximum SIR_target) -> SIR_target < minimum SIR target? (Yes: SIR target = minimum SIR_target) -> Process next frame]
Figure 3.13: Conventional UMTS outer-loop power control flow chart.
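The per-frame update described by Figure 3.13 can be sketched in Python as follows. This is a minimal sketch and not the simulation code of this work: the function name is hypothetical, and the clamp limits `sir_min_db` and `sir_max_db` are placeholder values chosen for illustration only, since the allowed SIR range is not stated here.

```python
def outer_loop_update(sir_target_db, crc_error, fer_target=0.01, step_db=0.005,
                      sir_min_db=0.0, sir_max_db=10.0):
    """One conventional outer-loop step for a single received speech frame.

    sir_min_db and sir_max_db are hypothetical limits used only for illustration.
    """
    k = 1.0 / fer_target - 1.0           # equation (3.17)
    if crc_error:
        sir_target_db += k * step_db     # step up by K * delta, equation (3.18)
    else:
        sir_target_db -= step_db         # step down by delta, equation (3.18)
    # Clamp the new target to the allowed dynamic range.
    return min(max(sir_target_db, sir_min_db), sir_max_db)


# Example: one frame in error followed by three error-free frames.
sir = 4.0
for crc_error in (True, False, False, False):
    sir = outer_loop_update(sir, crc_error)
    print(round(sir, 3))
```

Because one step up equals K steps down, the target drifts downward slowly between errors and jumps up sharply when a frame error occurs.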
3.8 SPC Based UMTS Power Control
In the SPC based UMTS power control model shown in Figure 3.3, perceptual speech
quality is measured by PESQ. The FD, which is extracted from PESQ, is processed by
the SPC tool, and the resulting FD statistic is then applied in the outer-loop power
control. This SPC based technique employs the PESQ reference implementation
software supplied by the ITU [108]. A delay was introduced to account for the
round-trip delay between the AMR encoder and the decoder at the transmitter.
Minor modifications were made to the reference implementation to integrate
PESQ into the simulation model. In the simulations, level alignment in the reference
implementation of PESQ was disabled in order to speed up the simulations. Furthermore,
the performance of PESQ with and without level alignment was confirmed to be
identical [4]. Interfacing code (wrappers) was also added to the original C code of
the reference implementation of PESQ in order to integrate PESQ into Matlab Simulink.
Figure 3.14 shows a flow chart for the SPC based outer loop power control. The
difference between the conventional and the SPC based outer loop power control flow
charts is highlighted in the chart. Compared with the conventional power control, there
is an additional step after the CRC indicates an error in the Class A bits of the
received frame. The SIR target is not increased automatically; it is increased only if
the FD statistic is higher than the SPC upper limit. Otherwise, the SIR target is
decreased. If the FD statistic is less than the SPC lower limit, the SIR target is
also decreased. Hence, the scheme ensures that the perceptual speech quality
received by the end user meets the customer requirement while optimizing
the power usage in the system.
[Flow chart: Start -> Check CRC of current frame -> CRC in error? (No: SIR target = SIR target - ∆down; Yes: log(FD_n) < SPC thresholds? (Yes: SIR target = SIR target - ∆down; No: SIR target = SIR target + ∆up)) -> SIR_target > maximum SIR target? (Yes: SIR target = maximum SIR_target) -> SIR_target < minimum SIR target? (Yes: SIR target = minimum SIR_target) -> Process next frame]
Figure 3.14: SPC based UMTS outer-loop power control
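The additional decision in the SPC based flow chart can be sketched as a small change to the conventional update. The Python sketch below is one reading of the description in this section and not the simulation code of this work: the function name, the argument `fd_statistic` (standing for whatever SPC statistic is derived from log(FD_n), for example a CUSUM or EWMA value) and the clamp limits are all hypothetical.

```python
def spc_outer_loop_update(sir_target_db, crc_error, fd_statistic, spc_upper_limit,
                          fer_target=0.01, step_db=0.005,
                          sir_min_db=0.0, sir_max_db=10.0):
    """One SPC based outer-loop step for a single received speech frame.

    The SIR target is raised only when the CRC fails AND the FD statistic is
    above the SPC upper limit; in every other case it is lowered by one step.
    sir_min_db and sir_max_db are hypothetical limits used only for illustration.
    """
    k = 1.0 / fer_target - 1.0
    if crc_error and fd_statistic > spc_upper_limit:
        sir_target_db += k * step_db
    else:
        sir_target_db -= step_db
    return min(max(sir_target_db, sir_min_db), sir_max_db)


# A CRC error with acceptable perceptual quality does not raise the target.
print(round(spc_outer_loop_update(4.0, True, fd_statistic=0.10, spc_upper_limit=0.35), 3))
# A CRC error with degraded perceptual quality does.
print(round(spc_outer_loop_update(4.0, True, fd_statistic=0.50, spc_upper_limit=0.35), 3))
```

The design intent illustrated here is that a frame error alone no longer costs transmit power; power is added only when the perceptual measure confirms that quality has actually degraded.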
3.9 Summary
The proposed perceptual speech quality control technique was described thoroughly in
this chapter. In this technique, a non-perceptual metric such as FER is replaced by
the log(FD_n) parameter, which is extracted from PESQ, in mobile communication
systems. The PESQ components and the FD were briefly described. The statistical
analysis of the FD is discussed in Chapter 4.
As PESQ requires both the degraded and the reference signals at the transmitter to
evaluate the perceptual speech quality and hence calculate the FD, a synthesized
degraded speech signal is used. The approximation of the degraded signal is obtained
by applying the FQI feedback method, which is associated with the FEP. This method
was discussed thoroughly in this chapter.
The novelty of this thesis, a direct control approach using SPC, was
discussed. This approach is the first such attempt in a mobile communication system.
Two prominent SPC tools are considered for perceptual speech quality control:
CUSUM and EWMA, which were discussed in Sections 3.5 and 3.6. The CUSUM chart
can be represented in two ways: the tabular CUSUM and the V-mask form. Due to the
complexity of applying the V-mask form, the tabular CUSUM was adopted for the
proposed application. Both CUSUM and EWMA are well known for their ability to
monitor small changes in the process mean. The performance of the two tools, as
applied to the outer loop power control in UMTS, is compared in Chapter 5.
The UMTS closed loop power control in FDD mode, which is applied in this
research, was discussed, including details of the conventional UMTS outer loop and
inner loop power control. Subsequently, the proposed SPC based power control, which
is new in communication systems, was described.
CHAPTER 4
THE CUSUM TECHNIQUE APPLICATION IN PERCEPTUAL SPEECH QUALITY CONTROL
4.0 Introduction
Power control is an important means of providing a fair operating environment among
all users. An adequate QoS for a speech user can be measured in terms of PESQ, and
accurate power control therefore operates to minimize interference among all users
while providing the adequate QoS required in UMTS.
In this chapter, speech codec rate control and power control using a
CUSUM based technique are applied in UMTS to improve its performance.
This chapter begins with an analysis of the FD, which is extracted from PESQ. Then,
the application and analysis of the CUSUM based technique for controlling the speech
codec rate and the power in Universal Mobile Telecommunication Systems
(UMTS) are discussed. This is followed by the experimental results, and the chapter
ends with a summary.
The contributions of this chapter are as follows:
• Presentation of the FD analysis. It is observed that FD_n has a log-normal
distribution for a given perceptual quality MOS.
• Application of a CUSUM based speech codec rate control for UMTS. The
CUSUM based technique allows faster action at the transmitter to
control the quality of the speech signals as required by the end users. Hence,
a conventional parameter such as FER can be replaced with log(FD_n).
• Comparison of the performance of CUSUM based and FER based outer loop
power control algorithms through simulations. It is shown that the CUSUM
based power control achieves adequate speech quality while reducing the
average SIR target by up to 13% relative to the conventional algorithm.
4.1 Frame Disturbance Analysis
Details of the FD are described in Section 3.3. In this section, the simulation and the
analysis of the FD distribution are presented.
4.1.2 Input speech file and speech codec
Input speech samples used in the analysis are from the ITU database for speech quality
measurement tests. Each speech file contained pre-recorded sentences of 8 s duration
with approximately 50% speech and 50% silence intervals. However, the FD is calculated
with the silence periods removed to ensure that only the active speech is considered in
this application. The AMR speech codec, which is the standard codec for UMTS, was used
in the analysis at both the transmitter and the receiver.
4.1.3 Methodology
In the absence of a degraded speech signal at the transmitter site, an approximation must
be used. In this case, the FQI method is applied to synthesize the speech signal output,
as shown in Figure 4.1. Degraded speech signals with PESQ MOS ranging from 3.0 to 3.5
were collected and saved. To obtain a more reliable FD distribution, 10 sets were
collected for each PESQ MOS, where each set contains the FD values of 10 speech files.
Each speech file contains on average 243 FD samples. The silence parts
of the speech signal output were removed for this analysis. The simulation model is
shown in Figure 4.1. The estimated mean and standard deviation of the distribution for
each PESQ MOS from 3.0 to 3.5 were observed and recorded.
By applying the sample mean estimation theorem [68], the estimated mean of
log(FD_n), µ_s, of one set of 10 speech files is given by
\[ E[\log(FD_n)] = \frac{1}{N} \sum_{n=1}^{N} \log(FD_n) = \mu_s \tag{4.1} \]
where N = 2430 for 10 speech files. Consequently, the estimated mean for all 10 sets of
speech files is given by
\[ E[\mu_s] = \frac{1}{M} \sum_{m=1}^{M} \mu_{s,m} = \mu_0 \tag{4.2} \]
where M = 10, µ_{s,m} is the estimated mean of the m-th set, and µ_0 is the target mean
of log(FD_n), which will be used for CUSUM based speech codec control in the next
section.
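The two-level averaging of equations (4.1) and (4.2) can be sketched in Python as follows. The function names are hypothetical, and the numbers in the usage example are toy values chosen only to show the call pattern; they are not measured log(FD_n) data.

```python
from statistics import fmean


def set_mean(log_fd_values):
    """Equation (4.1): sample mean of log(FD_n) over one set of speech files."""
    return fmean(log_fd_values)


def target_mean(sets):
    """Equation (4.2): mean of the per-set means, used as the target mean mu_0."""
    return fmean(set_mean(s) for s in sets)


# Toy illustration with two tiny sets (hypothetical values, not thesis data).
toy_sets = [[0.30, 0.40], [0.35, 0.37]]
print(target_mean(toy_sets))
```

In the analysis described above, each set would hold N = 2430 values and there would be M = 10 sets per PESQ MOS level.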
[Figure 4.1 block diagram. Blocks: AMR Encoder, Channel Model, AMR Decoder, PESQ. Signals: x_n, y_n, FQI_n, FD_n. Labels: Reference Signal, Synthesis Signal.]
Figure 4.1: Simulation model for frame disturbance analysis.
4.1.4 Simulation result and discussion
The result of FD analysis is shown in Figure 4.2 over a range of PESQ MOS values.
[Figure 4.2, panels (a) to (f): histograms of log(FD_n). x-axis: log(FD_n), from -3 to 4; y-axis: relative frequency.]
Figure 4.2: log(FD_n) distribution for PESQ MOS 3.0-3.5: (a) 3.0, (b) 3.1,
(c) 3.2, (d) 3.3, (e) 3.4 and (f) 3.5.
Analysis of the FD shows that FD_n has a log-normal distribution for a given
perceptual quality MOS, as shown in Figure 4.2. Table 4.1 shows that the mean of the
distribution of log(FD_n) increases as the perceptual quality degrades, and
vice versa. The distribution suggests that, for a given perceptual quality, the FD can
take a wide range of values. Some large values can be tolerated while the overall
quality remains the same. Note that the log(FD_n) parameter is used for all simulations
in this thesis (Sections 4.2, 4.3 and 5.2).
Table 4.1: The estimated mean, µ_0, and the standard deviation of the
log(FD_n) distribution.
PESQ MOS   Target mean, µ_0   Standard deviation
3.0        0.5507             0.0476
3.1        0.5017             0.0482
3.2        0.4692             0.0385
3.3        0.3559             0.0343
3.4        0.2506             0.0337
3.5        0.1443             0.0384
4.2 Speech Codec Rate Control Simulation Model
4.2.1 Introduction
A simulation model for speech codec rate control is shown in Figure 4.3 below. The
same input speech and speech codec employed for the frame disturbance analysis (Section
4.1) have been used for the simulations.
Figure 4.3: The simulation model for speech codec rate control.
4.2.2 Methodology
By applying the sample mean estimation theorem [68], the estimated mean of
log(FD_n) for one set of 10 speech files is given by
\[ E[\log(FD_n)] = \frac{1}{N} \sum_{n=1}^{N} \log(FD_n) = \mu \tag{4.3} \]
where N = 2430, and µ is the mean of log(FD_n) that will be used for the CUSUM
application.
The perceived speech quality differs among end users because it depends on their
individual perception. The CUSUM technique is therefore applied on a per end user
basis. Two cases are analysed using the CUSUM control chart. The
quality of the speech was controlled to a PESQ MOS of 3.3, which is considered a good
speech quality score. Hence, based on Table 4.1, the CUSUM target mean, µ_0, is
set to 0.3559.
[Figure 4.3 block diagram. Blocks: AMR Encoder, Channel Model, AMR Decoder, PESQ, CUSUM. Signals: x_n, y_n, FQI_n, log(FD_n). Labels: Reference Signal, Approximation required.]
A total of 50 speech files were simulated in sequence, with the initial AMR speech
codec rate set to mode 2. The mean, µ, of log(FD_n) for each speech file was applied to
the CUSUM control chart.
Table 4.2: Parameters chosen for CUSUM chart.
Parameter                       Value
K                               ½σ
Upper limit                     0.3512
Lower limit                     -0.3512
Target mean                     0.3559
Initial AMR speech codec rate   2
Case 1
The first 40 speech files have a PESQ MOS of 3.3, while the other 10 speech files are
degraded speech files.
Case 2
The first 40 speech files have a PESQ MOS of 3.3, while the other 10 are speech files
of better quality.
The process standard deviation, σ, for the first 40 (in-control) simulated speech files
is 0.0878. K was set to ½σ and H was set to 4σ (4 × 0.0878 = 0.3512). Therefore, the
CUSUM upper and lower limits were set to 0.3512 and -0.3512, respectively. The process
mean for the first 40 simulated speech files is 0.3575. The chosen CUSUM parameters are
shown in Table 4.2.
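A tabular CUSUM with the parameters of Table 4.2 can be sketched in Python as below. The recursion used is the common textbook form of the tabular CUSUM and is assumed here to match the one described in Chapter 3; the function names are hypothetical. The input is an idealised sequence in which every file takes exactly the reported process mean (0.3575 for the first 40 files, 0.4674 for the last 10 in Case 1). The actual per-file values are not listed here, so the detection point of this idealised run is not expected to coincide with the simulated charts.

```python
def tabular_cusum(values, target, k, h):
    """Return a list of (C_plus, C_minus, alarm) for each sample.

    C_minus is accumulated as a non-negative magnitude; a chart that plots the
    lower side as negative values would show -C_minus against -h.
    """
    c_plus, c_minus = 0.0, 0.0
    results = []
    for x in values:
        c_plus = max(0.0, x - (target + k) + c_plus)
        c_minus = max(0.0, (target - k) - x + c_minus)
        results.append((c_plus, c_minus, c_plus > h or c_minus > h))
    return results


def first_alarm(results):
    """Return the 1-based index of the first out-of-control sample, or None."""
    for index, (_, _, alarm) in enumerate(results, start=1):
        if alarm:
            return index
    return None


sigma = 0.0878                                   # process standard deviation
samples = [0.3575] * 40 + [0.4674] * 10          # idealised Case 1 sequence
chart = tabular_cusum(samples, target=0.3559, k=sigma / 2, h=0.3512)
print(first_alarm(chart))
```

With the slack K = ½σ, the small offset of the in-control mean from the target never accumulates, whereas the sustained upward shift builds up in C_plus until it crosses the limit.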
4.2.3 Simulation results and discussion
Case 1
[Figure 4.4: chart titled "Uncontrolled CUSUM". x-axis: Speech Sample (1 to 49); y-axis: Cusum Chart (-0.6 to 1.0); plotted series: Upper Cusum, C+, C-, Lower Cusum.]
Figure 4.4: A CUSUM control chart without controlling speech codec rate.
Figure 4.4 shows the CUSUM control chart without speech codec rate control.
The process mean for the last 10 simulated speech files increased to 0.4674. An
out-of-control condition was detected at the 43rd speech signal sample at the CUSUM
upper limit, which indicated a degradation of the speech signals.
[Figure 4.5: chart titled "Controlled CUSUM". x-axis: Speech Sample (1 to 49); y-axis: Cusum Chart (-0.6 to 1.0); plotted series: Upper Cusum, C+, C-, Lower Cusum.]
Figure 4.5: A CUSUM control chart with speech codec rate control.
Figure 4.5 shows that the degradation of the speech signals was rectified by
increasing the speech codec rate from mode 2 to mode 3 starting at the 44th speech
signal sample.
Case 2
[Figure 4.6: chart titled "Uncontrolled CUSUM". x-axis: Speech Sample (1 to 49); y-axis: Cusum Chart (-1.0 to 0.6); plotted series: Upper Cusum, C+, C-, Lower Cusum.]
Figure 4.6: A CUSUM control chart without controlling speech codec rate.
Figure 4.6 shows the CUSUM control chart without speech codec rate control.
The process mean for the last 10 simulated speech files decreased to 0.2446. The
out-of-control condition was detected at the 44th speech signal sample, but this time
it occurred at the CUSUM lower limit. This indicated that the signal quality exceeded
what the end user or customer requires.
[Figure 4.7: chart titled "Controlled CUSUM". x-axis: Speech Sample (1 to 49); y-axis: Cusum Chart (-1.0 to 0.6); plotted series: Upper Cusum, C+, C-, Lower Cusum.]
Figure 4.7: A CUSUM control chart with speech codec rate control.
Figure 4.7 shows that the excess quality of the speech signals was rectified by
decreasing the speech codec rate from mode 2 to mode 1 starting at the 45th speech
signal sample.
4.3 Power Control Simulation Model
In this section, the CUSUM based technique described in Chapter 3 is incorporated in
the outer loop of the UMTS power control. The performance of this CUSUM based
technique is compared against its conventional UMTS counterpart using computer
simulations.
A Matlab Simulink implementation of the UMTS physical layer was used for the
simulations. The physical layer was implemented at chip level according to the 3GPP
technical specifications². Figure 4.8 shows a block diagram of the simulation model
with the relevant functional blocks. The building blocks and some of the important
simulation parameters are described briefly in the following sub-sections.
² We would like to thank and acknowledge PHYBIT Inc. Singapore for permitting us to use their
UMTS physical layer simulation software.
Figure 4.8: Block diagram of the simulation model of UMTS physical
layer (FDD mode).
4.3.1 Input speech file
For consistency, only one input speech file was used for all simulations in this chapter.
This speech file was constructed by the combination of five speech files (O_M01L1A,
O_M01L1C, O_F01L6A, O_F01L6B, and O_F01L6C) from the ITU database for voice
quality measurement tests [77]. Each of the constituent speech files consisted of 8
seconds of pre-recorded sentences with approximately 50% speech and 50% silence
intervals. Constituent speech files were recorded in 16-bit, 8 kHz linear PCM format.
4.3.2 Speech codec
The AMR speech codec has been employed for all the simulations in this chapter. This
codec, which is mandatory for UMTS [78], was described in Section 3.1.3.
Generally, AMR codec mode 7 produces the best speech quality.
Therefore, this mode has been used for the simulations in this chapter. The frame
structure of AMR codec mode 7 is summarized in Table 4.3.
Table 4.3: Summary of AMR codec mode 7 frame structure.
Codec rate (kbps)   Bits per frame   Class A bits   Class B bits   Class C bits
12.2                244              81             103            60
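As a consistency check on Table 4.3, the number of bits per frame follows directly from the codec rate and the 20 ms speech frame duration, and it equals the sum of the three bit classes:

```latex
12.2\ \text{kb/s} \times 20\ \text{ms} = 244\ \text{bits per frame}
\qquad
81 + 103 + 60 = 244
```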
4.3.3 Multiplexing and channel coding
In the simulation model shown in Figure 4.8, the 20 ms encoded speech frames are
processed by the “Multiplexing and Channel Encoding” blocks, referred to hereafter as
MC blocks. In UMTS, data arriving from Layer 2 are processed once per Transmission
Time Interval (TTI). In this case, each AMR speech frame corresponds to a 20 ms TTI.
The operations of the MC block in each TTI are summarized below.
Transport Channel (TrCH) allocation
In UMTS, the transmitted data can be divided into distinct logical channels which are
referred to as Transport Channels (TrCH). Separate TrCHs are assigned to the
AMR Class A, B, and C output bits, denoted TrCH1, TrCH2, and TrCH3 respectively.
CRC attachment
In every TTI, a 12-bit CRC is attached to TrCH1. The receiver uses the CRC bits to
detect any Class A errors. There are no CRC bits for TrCH2 and TrCH3.
Channel Coding
Convolutional coding (CC) has been recommended for speech [109]. A rate 1/3 code is
used for TrCH1 while a rate 1/2 code is used for both TrCH2 and TrCH3 [110].
First interleaving
Two stages of interleaving are required in UMTS to spread the channel errors as widely
as possible and so achieve the best performance. The first interleaver is an
inter-frame interleaver that operates individually on each TrCH for every
TTI. In the case of speech, TrCH bits are written into the interleaver row by row, with
two columns in each row. Subsequently, the bits are read out column by column [53].
Radio frame segmentation
Data are transmitted in 10 ms radio frames in UMTS, so each 20 ms speech TTI is
equivalent to two radio frames. The radio frame segmentation process divides the data
from each TrCH into two consecutive radio frames.
Multiplexing
The transport channels are serially multiplexed, frame by frame, into a single
Coded Composite Transport Channel (CCTrCH). For this
multiplexing, each transport channel provides data in 10 ms frames.
Second interleaving
The second interleaver is an intra-frame interleaver that operates on 10 ms radio
frames. The bits of a radio frame are written into the interleaver row by
row, where each row contains 30 columns. Subsequently, the bits are read out column by
column after inter-column permutation has been applied.
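The row-in, column-out principle shared by the two interleaving stages can be sketched in Python as below. This is a simplified illustration and not the 3GPP procedure: the function name is hypothetical, padding of incomplete rows is not handled, and the inter-column permutation is passed in as a parameter (`column_order`) rather than hard-coded, because the standard permutation pattern is not reproduced here.

```python
def block_interleave(bits, num_columns, column_order=None):
    """Write bits row by row into a matrix, then read them out column by column.

    column_order optionally gives the order in which columns are read
    (an inter-column permutation); by default columns are read left to right.
    """
    if len(bits) % num_columns != 0:
        raise ValueError("number of bits must be a multiple of num_columns")
    if column_order is None:
        column_order = range(num_columns)
    rows = [bits[i:i + num_columns] for i in range(0, len(bits), num_columns)]
    return [row[c] for c in column_order for row in rows]


# Two-column example, as in the first interleaver for a 20 ms speech TTI.
print(block_interleave([0, 1, 2, 3, 4, 5], num_columns=2))
# Same input with the two columns read in swapped order.
print(block_interleave([0, 1, 2, 3, 4, 5], num_columns=2, column_order=[1, 0]))
```

With two columns, the first column collects the even-indexed bits and the second the odd-indexed bits, which is how consecutive bits end up in different radio frames after segmentation.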
4.3.4 Power Control
In this section, simulation details of the conventional UMTS and CUSUM based power
control algorithms are given. Since the CUSUM technique is naturally suited to
normally distributed data [92], and log(FD_n) was observed to be normally distributed
(Section 4.1), the application of CUSUM to power control is justified.
Conventional UMTS power control
Closed loop power control as described in Section 3.7 was simulated for the
conventional UMTS power control, incorporating both the inner and outer loops. The
TPC commands for the inner loop were applied based on Algorithm 1 given in Section
2.3.2. Transmission power was updated using a step size δ of 1 dB, which is the
mandatory step size specified in [67], at an update rate of 1500 s^-1, which
corresponds to once every time slot. The outer loop was based on the algorithm
proposed by Sampath et al. [51]. A flow chart of the algorithm is shown in Figure 3.13.
The FER target for the algorithm was set to 1%, and the SIR target was updated using a
step size ∆ of 0.005 dB [47] at a rate of 50 s^-1, which corresponds to once every
speech frame. A summary of the simulation parameters for the conventional UMTS power
control is given in Table 4.4.
Table 4.4: Conventional UMTS power control parameters.
Type         Algorithm        Update rate (s^-1)   Step up size (dB)   Step down size (dB)   FER target (%)
Outer loop   Sampath et al.   50                   ∆up = 0.495         ∆down = 0.005         1
Inner loop   Algorithm 1      1500                 δup = 1             δdown = 1             -
CUSUM based UMTS Power Control
The simulation model for UMTS power control based on CUSUM is as described and
illustrated in Chapter 3. Figure 4.9 shows the application of CUSUM in the UMTS
outer-loop power control.
Figure 4.9: Application of CUSUM in UMTS outer-loop power control.
A flow chart for the CUSUM based outer loop power control is depicted in
Figure 4.10, based on Figure 3.14 (Chapter 3, Section 3.8).
[Figure 4.9 block diagram. Sides: Transmitter, Receiver. Blocks: AMR Encoder, Outer-loop Power Control, Channel, CUSUM, PESQ, AMR Decoder (two instances), CRC Check, FEP (FQI_n), Delay (two instances). Signals: Speech Signal, Reference Signal, Synthesized Signal, Received Signal, y_n, log(FD_n).]
[Flow chart: Start -> Check CRC of current frame -> CRC in error? (No: SIR target = SIR target - ∆down; Yes: log(FD_n) < CUSUM thresholds? (Yes: SIR target = SIR target - ∆down; No: SIR target = SIR target + ∆up)) -> SIR_target > maximum SIR target? (Yes: SIR target = maximum SIR_target) -> SIR_target < minimum SIR target? (Yes: SIR target = minimum SIR_target) -> Process next frame]
Figure 4.10: CUSUM based UMTS outer-loop power control
4.3.5 Channel
In the mobile radio channel, noise sources can be subdivided into multiplicative and
additive effects. The simplest practical case of a mobile radio channel is an additive
white Gaussian noise (AWGN) channel [111].
For the purpose of the simulations, the 6-ray Vehicular A channel model specified by
3GPP [112, 113] was used to model the fast multipath channel. The relative
time delays and average powers of each path of the channel model are summarized in
Table 4.5. The Power Spectral Density (PSD) of each path follows the classical PSD
[114]. Shadowing on the logarithmic scale was modeled with a correlated normal
distribution. Normally, the mean value of the distribution is practically equal to the
path loss. In this research, however, the path loss was assumed to be compensated for
by the power control subsystem, which implies that the mean of the distribution is
0 dB. The standard deviation of the distribution is a function of the propagation
environment. For urban environments, an 8 dB standard deviation has been used [115].
Furthermore, a de-correlation distance of 20 m has been used in this model [116]. The
de-correlation distance is the travel distance over which the signal shadowing
de-correlates, and it depends on the propagation environment.
Table 4.5: Tapped-delay-line parameters for Vehicular A environment [113].
Tap number   Relative delay (ns)   Relative avg. power (dB)   Doppler spectrum
1            0                     0.0                        Classical
2            310                   -1.0                       Classical
3            710                   -9.0                       Classical
4            1090                  -10.0                      Classical
5            1730                  -15.0                      Classical
6            2510                  -20.0                      Classical
4.3.6 Summary of simulation parameters
A summary of the main simulation parameters is given in Table 4.6. Note that K and the
CUSUM target are set to ½σ and 0.02 respectively, where σ is the process
standard deviation. Based on the FD analysis in Section 4.1, the CUSUM target of 0.02
was found to be equivalent to a PESQ MOS of 4.0. The PESQ MOS starts
deteriorating rapidly once it reaches the value 3.6 for AMR codec mode 7 (12.2 kb/s)
[80]. Therefore, the chosen CUSUM target is appropriate to ensure that the CUSUM
based power control activates SIR target reduction only when the quality is good.
4.3.7 Methodology
Based on 3GPP recommendations [112], three representative vehicular speeds of 3, 50,
and 120 km/h were employed for the performance comparison between the conventional and
CUSUM based UMTS power control algorithms. To ensure that the channel error patterns
were independent across simulations, 5 different channel shadowing profiles were
simulated for each vehicular speed. Each power control algorithm was simulated for
outer loop step sizes ∆ of 0.005, 0.01, 0.015 and 0.02 dB. For each simulation, a 40 s
speech file was transmitted over the UMTS physical layer shown in Figure 4.8, with
only one power control algorithm enabled at a time. In each case, the variations of the
SIR target and the channel shadowing profile were recorded.
For each simulation, the PESQ algorithm was applied to the received speech file
together with the original transmitted file, and the corresponding actual PESQ MOS was
calculated.
Table 4.6: Main simulation parameters.
_____________________________________________________________________
Chip rate                              3.84 Mc/s
Spreading factor                       128
Channel bit rate                       60 kb/s
Speech coding                          AMR (rate 12.2 kb/s)
Channel coding
  Class A                              Rate 1/3 CC + 12 bit CRC
  Class B                              Rate 1/2 CC
  Class C                              Rate 1/2 CC
Interleaving                           both inter- and intra-frame
Modulation                             QPSK
Power control
  Inner loop
    Update rate                        1500 s^-1
    Up/down step size (δup or δdown)   1 dB
  Outer loop (conventional)
    FER target                         1%
    Control variable                   CRC flags
    Step down ∆down                    0.005, 0.01, 0.015 and 0.02 dB
    Step up ∆up                        0.495, 0.99, 1.485 and 1.98 dB
    Update rate                        50 s^-1
  Outer loop (CUSUM based)
    FER target                         1%
    Control variable                   CUSUM threshold and CRC flags
    CUSUM target                       0.02
    Step up/down                       as conventional above
    Update rate                        50 s^-1
Channel type
  AWGN                                 ON
  Log-normal fading                    ON
  (Std, decorrelation distance)        (8 dB, 20 m)
  Fast fading                          6-tap Vehicular A
  Vehicular speed                      3, 50 and 120 km/h
Receiver                               Rake (6 fingers)
Initial SIR                            4 dB
4.3.8 Simulation results and discussion
Simulation results for each outer loop step size and for vehicular speeds of 3, 50, and
120 km/h are given in Table 4.7(a)-(c), Table 4.8(a)-(c), Table 4.9(a)-(c) and Table
4.10(a)-(c) respectively. These results include the average and standard deviation of
the SIR target and the PESQ MOS for the two power control algorithms,
obtained for each shadowing profile. Furthermore, the gain of the CUSUM based power
control with respect to the conventional power control, calculated as the difference
between the SIR targets in the two cases, is shown. Ensemble averages over all
shadowing profiles are also included.
The statistical significance of the difference between the two methods is assessed by
applying the t-test. In this statistical significance testing, the p-value is the
probability of obtaining a test statistic at least as extreme as the one that was
actually observed. If the p-value is less than the significance level α (Greek alpha),
which is often 0.05 or 0.01 [117], the result is said to be statistically significant.
Table 4.7(a)-(c), Table 4.8(a)-(c), Table 4.9(a)-(c) and Table 4.10(a)-(c) show that
the p-value is less than the significance level; therefore, the results are considered
to be statistically significant.
From Table 4.7(a)-(c), Table 4.8(a)-(c), Table 4.9(a)-(c) and Table 4.10(a)-(c),
it is observed that the CUSUM based power control achieved gains of 3% to 14% in the
SIR target. The ensemble-average PESQ MOS of the CUSUM based technique stays in all
cases within the desired range of 3 to 3.5, that is, “fair” to “good” quality. The
ensemble-average PESQ MOS differences are all less than 0.2 MOS and hardly
perceptible. Therefore, both power control algorithms deliver adequate perceptual
quality, although at different costs in terms of average SIR target levels.
It is also observed that, in general, the perceptual quality delivered by the
conventional power control technique is slightly higher than that of the CUSUM based
technique. The reason for this observation is the ability of the perceptual scheme to
trade off average transmit power against perceptual quality in a more controlled
manner. In contrast, inefficiencies in conventional power control do not allow for
precise control of the speech quality. Network providers have to balance their desire to
increase capacity by reducing the average SIR level against their commitment to provide
adequate quality to customers, which is achieved by power control. However,
inefficiencies in a conventional technique will tip the balance one way or the other. That
is, at times more quality than necessary is provided, at the cost of extra power and hence
reduced capacity, while at other times quality is degraded and sacrificed to gain
higher capacity. A CUSUM based technique, on the other hand, always provides
adequate perceptual quality and at the same time avoids situations where a
conventional technique provides more quality than necessary at the cost of an increased
average SIR target. Therefore, a CUSUM based technique does a better “balancing” act
than its counterpart. It should also be noted that the same gains could not be achieved
by allowing a larger FER target for conventional power control without affecting the
perceptual quality more severely [80]. In that case, the FER is increased regardless of
the effect on the perceptual quality, whereas with CUSUM based power control, the FER
is only increased when the quality is not noticeably affected.
The SIR target gain arises from the number of times the CUSUM based algorithm
avoids increasing the SIR target where the conventional algorithm cannot. A lower SIR
target therefore corresponds to more efficient power control. The gain generally
increased with the power control step size, as noted in Table 4.7(a)-(c), Table
4.8(a)-(c), Table 4.9(a)-(c) and Table 4.10(a)-(c).
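To make the role of the step sizes concrete, the following sketch shows one conventional outer-loop update driven only by the CRC flag. The relation between the up and down steps used here, ∆up = ∆down (1/FER − 1), is the commonly used balance condition for this kind of loop; it is an assumption in the sense that the text does not state it explicitly, although it reproduces the listed ∆up values for a 1% FER target. Function and variable names are illustrative.

```python
def step_up(delta_down_db, fer_target):
    """Up step that balances the loop at the FER target: up * FER = down * (1 - FER)."""
    return delta_down_db * (1.0 / fer_target - 1.0)

def conventional_update(sir_target_db, crc_error, delta_down_db, fer_target=0.01):
    """One outer-loop update: raise the target on a frame erasure, lower it otherwise."""
    if crc_error:
        return sir_target_db + step_up(delta_down_db, fer_target)
    return sir_target_db - delta_down_db

# Up steps for the four simulated down steps at a 1% FER target.
ups = [round(step_up(d, 0.01), 3) for d in (0.005, 0.01, 0.015, 0.02)]

# One erasure in 100 frames leaves the target where it started (balance at 1% FER).
sir = 4.0
for frame in range(100):
    sir = conventional_update(sir, crc_error=(frame == 99), delta_down_db=0.005)
```

Because every erasure raises the target by the full up step, each increase that a quality-aware algorithm can safely skip translates directly into a lower average SIR target.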
Table 4.7: Results for conventional and CUSUM based power control
algorithms with outer-loop step down, ∆down = 0.005 dB and vehicular speed of
(a) 3 km h-1, (b) 50 km h-1 and (c) 120 km h-1.
(a)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
profile 1  4.075  0.164  3.932  0.121  0.143  3.87E-35   3.169  3.057  0.112
profile 2  4.112  0.081  3.845  0.119  0.267  0.00E+00   3.205  3.177  0.028
profile 3  4.058  0.068  3.848  0.124  0.210  0.00E+00   3.152  3.034  0.118
profile 4  4.081  0.152  3.887  0.122  0.194  3.24E-19   3.048  3.016  0.032
profile 5  4.022  0.061  3.957  0.119  0.065  4.16E-155  3.153  3.100  0.053
Average    4.070  0.105  3.894  0.121  0.176  6.48E-20   3.145  3.077  0.069
(b)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
profile 1  4.260  0.097  3.966  0.229  0.294  4.27E-15   3.016  2.982  0.034
profile 2  4.125  0.062  4.091  0.128  0.034  0.00E+00   3.152  3.006  0.146
profile 3  4.190  0.083  4.022  0.129  0.168  3.83E-15   3.095  2.967  0.128
profile 4  4.134  0.073  3.998  0.125  0.136  1.20E-25   3.201  3.171  0.030
profile 5  4.124  0.095  3.849  0.090  0.275  0.00E+00   3.083  2.997  0.086
Average    4.167  0.082  3.985  0.140  0.181  1.62E-15   3.109  3.024  0.085
(c)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
profile 1  3.624  0.204  3.560  0.271  0.064  3.76E-121  3.466  3.395  0.071
profile 2  3.691  0.143  3.578  0.221  0.113  0.00E+00   3.392  3.291  0.101
profile 3  3.658  0.161  3.518  0.246  0.140  0.00E+00   3.317  3.244  0.073
profile 4  3.619  0.212  3.462  0.209  0.157  1.78E-45   3.493  3.315  0.178
profile 5  3.610  0.212  3.499  0.289  0.111  1.39E-42   3.460  3.314  0.146
Average    3.640  0.186  3.523  0.247  0.117  2.78E-43   3.426  3.312  0.114
Table 4.8: Results for conventional and CUSUM based power control
algorithms with outer-loop step down, ∆down = 0.01 dB and vehicular speed of
(a) 3 km h-1, (b) 50 km h-1 and (c) 120 km h-1.
(a)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
profile 1  4.062  0.096  3.920  0.112  0.142  0.00E+00   3.076  3.065  0.011
profile 2  4.085  0.113  3.936  0.120  0.149  0.00E+00   3.240  3.052  0.188
profile 3  4.081  0.088  3.803  0.122  0.278  1.98E-28   3.204  2.997  0.207
profile 4  4.080  0.111  3.834  0.121  0.246  2.45E-41   3.193  3.178  0.015
profile 5  4.083  0.128  3.847  0.122  0.236  0.00E+00   3.136  3.040  0.096
Average    4.078  0.107  3.868  0.119  0.210  3.960E-29  3.170  3.066  0.103
(b)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
profile 1  4.183  0.128  3.652  0.130  0.531  0.00E+00   3.035  3.067  0.032
profile 2  4.184  0.102  3.784  0.129  0.400  4.30E-67   3.095  3.046  0.049
profile 3  4.371  0.150  3.911  0.127  0.460  2.45E-45   3.089  3.026  0.063
profile 4  4.184  0.123  3.806  0.130  0.378  0.00E+00   3.171  3.046  0.125
profile 5  4.332  0.138  4.104  0.129  0.228  4.52E-123  3.157  3.039  0.118
Average    4.251  0.128  3.851  0.129  0.399  4.900E-46  3.109  3.045  0.077
(c)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
profile 1  3.412  0.271  3.123  0.463  0.289  0.00E+00   3.356  3.296  0.060
profile 2  3.663  0.177  3.222  0.251  0.441  2.89E-98   3.270  3.266  0.004
profile 3  3.300  0.350  3.009  0.287  0.291  0.00E+00   3.374  3.213  0.161
profile 4  3.351  0.286  3.257  0.287  0.094  5.67E-24   3.426  3.214  0.212
profile 5  3.408  0.291  3.236  0.317  0.172  4.00E-29   3.237  3.119  0.118
Average    3.427  0.275  3.169  0.321  0.257  1.134E-24  3.333  3.222  0.111
Table 4.9: Results for conventional and CUSUM based power control
algorithms with outer-loop step down, ∆down = 0.015 dB and vehicular speed of
(a) 3 km h-1, (b) 50 km h-1 and (c) 120 km h-1.
(a)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
profile 1  3.988  0.158  3.916  0.124  0.072  0.00E+00   3.027  2.932  0.095
profile 2  4.057  0.141  3.860  0.124  0.197  0.00E+00   3.248  3.070  0.178
profile 3  3.984  0.135  3.955  0.122  0.029  4.70E-55   3.189  3.008  0.181
profile 4  4.049  0.143  3.829  0.125  0.220  1.26E-56   3.201  3.151  0.050
profile 5  4.175  0.170  3.884  0.114  0.291  0.00E+00   3.108  2.995  0.113
Average    4.051  0.149  3.889  0.122  0.162  9.65E-56   3.155  3.031  0.124
(b)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
profile 1  4.356  0.173  3.957  0.231  0.399  0.00E+00   3.123  2.992  0.131
profile 2  4.254  0.160  4.105  0.233  0.149  1.89E-98   3.130  3.057  0.073
profile 3  4.386  0.182  3.868  0.138  0.518  0.00E+00   3.123  3.042  0.081
profile 4  4.213  0.189  3.828  0.135  0.385  3.12E-47   3.150  3.037  0.113
profile 5  4.237  0.164  3.797  0.140  0.440  0.00E+00   3.177  3.037  0.140
Average    4.289  0.174  3.911  0.176  0.378  4.72E-99   3.141  3.033  0.108
(c)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
profile 1  3.248  0.323  3.171  0.366  0.077  0.00E+00   3.295  3.196  0.099
profile 2  3.420  0.276  3.088  0.266  0.332  1.65E-41   3.315  3.244  0.071
profile 3  3.210  0.342  3.063  0.320  0.147  0.00E+00   3.262  3.183  0.079
profile 4  3.260  0.346  3.118  0.314  0.142  2.39E-56   3.331  3.164  0.167
profile 5  3.330  0.285  2.844  0.542  0.486  1.12E-59   3.178  3.046  0.132
Average    3.294  0.314  3.057  0.362  0.237  3.30E-42   3.276  3.167  0.110
Table 4.10: Results for conventional and CUSUM based power control
algorithms with outer-loop step down, ∆down = 0.02 dB and vehicular speed of
(a) 3 km h-1, (b) 50 km h-1 and (c) 120 km h-1.
(a)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
profile 1  4.147  0.205  3.944  0.211  0.203  6.90E-189  3.108  3.076  0.032
profile 2  4.102  0.165  3.984  0.238  0.118  0.00E+00   3.263  3.240  0.023
profile 3  4.134  0.194  3.906  0.222  0.228  0.00E+00   3.211  3.100  0.111
profile 4  4.014  0.167  3.944  0.252  0.070  4.32E-48   3.265  3.178  0.087
profile 5  4.130  0.238  3.918  0.238  0.212  1.69E-111  3.191  3.136  0.055
Average    4.105  0.192  3.939  0.232  0.166  3.38E-12   3.208  3.146  0.062
(b)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
profile 1  4.176  0.238  3.766  0.284  0.410  0.00E+00   3.108  3.055  0.053
profile 2  4.298  0.222  3.825  0.378  0.473  0.00E+00   3.127  3.055  0.072
profile 3  4.360  0.235  3.933  0.357  0.427  4.12E-26   3.181  3.018  0.163
profile 4  4.236  0.222  3.835  0.304  0.401  1.56E-13   3.208  3.210  0.002
profile 5  4.367  0.184  3.871  0.355  0.410  2.45E-45   3.256  2.932  0.324
Average    4.287  0.220  3.846  0.336  0.441  3.90E-14   3.176  3.054  0.123
(c)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
profile 1  3.185  0.370  2.794  0.339  0.391  0.00E+00   3.310  3.151  0.159
profile 2  3.363  0.234  2.939  0.332  0.424  3.56E-17   3.305  3.219  0.086
profile 3  3.122  0.330  2.812  0.435  0.310  3.16E-63   3.138  3.083  0.055
profile 4  3.260  0.286  2.808  0.338  0.452  1.78E-98   3.246  3.135  0.111
profile 5  3.368  0.340  2.749  0.513  0.619  0.00E+00   3.351  3.152  0.199
Average    3.260  0.312  2.820  0.391  0.439  7.12E-18   3.270  3.148  0.122
A summary of the ensemble averages for the outer-loop step sizes of 0.005, 0.01,
0.015 and 0.02 dB at each vehicular speed is given in Table 4.11(a)-(c).
Table 4.11: Results for conventional and CUSUM based power control
algorithms for all outer-loop step sizes and vehicular speed of (a) 3 km h-1, (b)
50 km h-1 and (c) 120 km h-1.
(a)
Step size  Ave SIR target (dB)                           PESQ MOS
(dB)       Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
0.005      4.070  0.105  3.894  0.121  0.176  6.48E-20   3.145  3.077  0.069
0.010      4.078  0.107  3.868  0.119  0.210  3.96E-29   3.170  3.066  0.103
0.015      4.051  0.149  3.889  0.122  0.162  9.65E-56   3.155  3.031  0.124
0.020      4.106  0.192  3.939  0.232  0.167  3.38E-12   3.208  3.146  0.062
(b)
Step size  Ave SIR target (dB)                           PESQ MOS
(dB)       Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
0.005      4.167  0.082  3.985  0.140  0.181  1.62E-15   3.109  3.024  0.085
0.010      4.251  0.128  3.851  0.129  0.399  4.90E-46   3.109  3.045  0.064
0.015      4.289  0.174  3.911  0.176  0.378  4.72E-99   3.141  3.033  0.108
0.020      4.287  0.220  3.846  0.336  0.442  3.90E-14   3.176  3.054  0.123
(c)
Step size  Ave SIR target (dB)                           PESQ MOS
(dB)       Conv   Std    CUSUM  Std    Gain   P value    Conv   CUSUM  Difference
0.005      3.640  0.186  3.523  0.247  0.117  2.78E-43   3.426  3.312  0.114
0.010      3.427  0.275  3.169  0.321  0.257  1.13E-24   3.333  3.222  0.111
0.015      3.294  0.314  3.057  0.362  0.237  3.30E-42   3.276  3.167  0.110
0.020      3.260  0.312  2.820  0.391  0.439  7.12E-18   3.270  3.148  0.122
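The percentage gains quoted in the discussion can be related to the ensemble averages of Table 4.11 with a few lines of arithmetic. The sketch below assumes that a percentage gain is the SIR target gain in dB divided by the conventional average SIR target in dB; the text does not define the percentage explicitly, so this definition is an assumption.

```python
# (conventional average SIR target in dB, CUSUM gain in dB) for step sizes
# 0.005, 0.010, 0.015 and 0.020 dB, copied from Table 4.11(a)-(c).
TABLE_4_11 = {
    "3 km/h": [(4.070, 0.176), (4.078, 0.210), (4.051, 0.162), (4.106, 0.167)],
    "50 km/h": [(4.167, 0.181), (4.251, 0.399), (4.289, 0.378), (4.287, 0.442)],
    "120 km/h": [(3.640, 0.117), (3.427, 0.257), (3.294, 0.237), (3.260, 0.439)],
}

def relative_gain_percent(conv_db, gain_db):
    """Gain as a percentage of the conventional average SIR target (assumed definition)."""
    return 100.0 * gain_db / conv_db

gains = [
    relative_gain_percent(conv, gain)
    for rows in TABLE_4_11.values()
    for conv, gain in rows
]
lowest, highest = min(gains), max(gains)
```

Under this definition the ensemble-average gains range from about 3.2% to about 13.5%, which is in line with the range reported in the discussion.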
A set of representative curves comparing the performance of the CUSUM based and
conventional outer-loop power control algorithms for vehicular speeds of 3 km h-1, 50
km h-1 and 120 km h-1 is shown in Figure 4.11(a)-(c) respectively. In each case, the
shadowing profile and the SIR targets for the two algorithms are shown. CRC flags
indicating frame erasures for both systems are also shown for comparison. The CRC flag
is set to “1” to indicate a frame erasure. Note that the perceptual speech quality
control in the CUSUM based technique depends not only on the CUSUM threshold but also
on the CRC flags (refer to Figures 3.14 and 3.15). It can be observed from Figure
4.11(a)-(c) that the SIR target for conventional outer-loop power control was increased
whenever the corresponding CRC flag indicated a frame erasure. This also applied to
CUSUM based power control. However, there were situations in which frame erasures
occurred but the SIR target for the CUSUM based technique was not increased, giving
rise to the observed gaps between the SIR targets of the two algorithms in Figure
4.11(a)-(c).
The average area of the gap corresponds to the gain achieved by the CUSUM
based algorithm over its conventional counterpart. The set of curves corresponding to
the best scenario, which resulted in the highest SIR target gain, is shown in Figure
4.12(a)-(c). In this case, at a step size of 0.02 dB, the SIR target gains are 0.203,
0.410 and 0.619 dB for vehicular speeds of 3 km h-1, 50 km h-1 and 120 km h-1 respectively.
[Figure 4.11 plots, panels (a)-(c): conventional SIR targets (dB), CUSUM based SIR targets (dB),
shadowing profile (dB) and CRC flags of the conventional and CUSUM based systems, plotted
against time (0 to 40 s); vertical axis: amplitude (dB) and flag, from -25 to 25.]
(a) Average SIR Targets (dB): Conventional = 4.022, CUSUM = 3.957, gain = 0.065
(b) Average SIR Targets (dB): Conventional = 4.124, CUSUM = 3.849, gain = 0.275
(c) Average SIR Targets (dB): Conventional = 3.610, CUSUM = 3.499, gain = 0.111
Figure 4.11: Performance comparison of CUSUM based and conventional power control
(shadowing profile 5 and ∆ = 0.005 dB): (a) 3 km h-1, (b) 50 km h-1 and (c) 120 km h-1.
[Figure 4.12 plots, panels (a)-(c): conventional SIR targets (dB), CUSUM based SIR targets (dB),
shadowing profile (dB) and CRC flags of the conventional and CUSUM based systems, plotted
against time (0 to 40 s); vertical axis: amplitude (dB) and flag, from -25 to 25.]
(a) Average SIR Targets (dB): Conventional = 4.147, CUSUM = 3.944, gain = 0.203
(b) Average SIR Targets (dB): Conventional = 4.176, CUSUM = 3.766, gain = 0.410
(c) Average SIR Targets (dB): Conventional = 3.368, CUSUM = 2.749, gain = 0.619
Figure 4.12: Performance comparison of CUSUM based and conventional power control
(shadowing profile 1 and ∆ = 0.02 dB): (a) 3 km h-1, (b) 50 km h-1 and (c) 120 km h-1.
4.4 Summary
Based on the FD analysis, it was shown (in Section 4.1.4) that log(FDn) has a normal
distribution and that the mean of log(FDn) is inversely proportional to the quality of
speech. The result of the FD analysis suggests that transmission parameters such as the
speech codec rate should not be adapted on a frame-by-frame basis, as is the current
practice. The current practice leads to inefficient utilization of resources and possibly
unsatisfactory perceptual quality. To maintain a certain level of end-user perceptual
quality, what is needed is to detect a shift in the distribution of log(FDn) and take
steps to rectify it, such as controlling the transmission power, channel coding or
speech codec rate, as was done in this analysis. The result of the analysis shows that
a conventional parameter such as FER can be replaced with the FD of PESQ. Using FER to
control speech quality will result in loss of quality and/or inefficient use of radio
resources. Applying this new parameter to the CUSUM scheme will allow faster action at
the transmitter to control the quality of the speech signals as required by the end
users. Hence, it will help the provider to optimize network resources.
A CUSUM based power control technique was incorporated in the UMTS outer-loop
power control. It was designed to avoid unnecessary increases in transmitter power
levels when the perceived speech quality was adequate. The CUSUM based algorithm would
enable network operators to have direct control over the perceptual speech quality by
setting the value of the CUSUM thresholds. This could not be achieved with the
conventional UMTS power control, as the network operator could only control the
delivered service quality by adjusting the FER target.
The performance of the CUSUM based and conventional UMTS power control
algorithms was compared by computer simulations using a comprehensive set of
parameters. These parameters were the step size of the outer-loop power control, the
vehicular speed and the channel shadowing profile. The simulation results showed that
the CUSUM based power control achieved adequate speech quality while reducing the
average SIR target by up to 13% relative to the conventional algorithm. To justify this
result, the CUSUM based algorithm is compared with its SPC counterpart, EWMA, in
Chapter 5.
The outcomes of this research will potentially benefit both network providers and
users. The provider can optimize the network by allocating the resources needed to meet
the required levels of service and to provide consistent perceived quality to customers.
Employing FD as a new parameter to control perceptual speech quality in the CUSUM based
algorithm optimizes network resources while maintaining satisfactory service levels for
all customers. The CUSUM based algorithm had the ability to trade off transmit power
(a lowered average SIR target) against perceptual quality in a more controlled manner
while still providing adequate quality to the users.
CHAPTER 5
THE EWMA TECHNIQUE APPLICATION IN PERCEPTUAL SPEECH QUALITY CONTROL
5.0 Introduction
Like CUSUM, EWMA is acknowledged as a good SPC tool for detecting small shifts.
The performance of the two methods is often compared by researchers [118, 119]. EWMA
is often superior to CUSUM for detecting larger shifts and is not sensitive to the
normality assumption [103].
Therefore, in this chapter, power control using the EWMA based technique is
applied in UMTS and compared with the CUSUM based technique. The chapter first covers
the analysis of data distributions (normal and non-normal) with the application of both
techniques. The application and analysis of the EWMA based technique for power control
in UMTS is then discussed, followed by a comparison with the CUSUM based technique.
The chapter ends with a summary.
The conclusions from this chapter are as follows:
• A comparison of the EWMA and CUSUM techniques on data with a normal
distribution and data with a non-normal distribution is presented. It is
shown that, in our case, the EWMA technique responds better than the CUSUM
technique to data which does not have a normal distribution. The EWMA based
technique is also superior to the CUSUM based technique in detecting larger
shifts. On the other hand, the CUSUM technique responds better than the
EWMA technique to data with a normal distribution.
• A performance comparison between the EWMA based and CUSUM based
power control algorithms is presented through simulations. It is shown that
both the EWMA and CUSUM algorithms reduce the average SIR target
compared to the conventional algorithm. However, the CUSUM based power
control achieves adequate speech quality while reducing the average SIR
target by up to 5% relative to the EWMA based algorithm.
5.1 Data Distributions Responses with the Application of EWMA and CUSUM
In this section, EWMA and CUSUM techniques are applied to two samples of data:
one sample has a normal distribution and the other does not. This data is used to
observe how efficiently the EWMA and CUSUM techniques detect a shift of the data in
the sample.
5.1.1 Data Sample
Data samples used in the analysis are from the FD analysis of Section 4.1. Data for
log(FDn) with PESQ MOS 3.5 is employed for the application of the EWMA and CUSUM
techniques. Figure 5.1 shows the distribution of log(FDn), which is a normal
distribution.
[Figure 5.1 plot: histogram of relative frequency versus sample data.]
Figure 5.1: log(FDn) data sample which has a normal distribution.
Figure 5.2 shows the distribution of linear FDn, which does not have a normal distribution.
[Figure 5.2 plot: histogram of relative frequency versus sample data.]
Figure 5.2: Linear FDn data sample which does not have a normal distribution.
5.1.2 Methodology
EWMA and CUSUM techniques are applied to the first 100 data points of both samples
(log(FDn) and linear FDn). Plot patterns of data from both samples are observed. The
chosen EWMA and CUSUM parameters for the normal distribution data are shown in
Table 5.1(a)-(b) respectively. The chosen EWMA and CUSUM parameters for the
non-normal distribution data are shown in Table 5.2(a)-(b) respectively. The mean, µ,
and the standard deviation, σ, of the normal distribution data are 0.14 and 0.81
respectively. The mean, µ, and the standard deviation, σ, of the non-normal
distribution data are 1.64 and 1.59 respectively. The chosen parameters of CUSUM are
equivalent to the chosen parameters for EWMA in each case.
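As an illustration of how an EWMA chart with λ = 0.2 and L = 3 is evaluated, the sketch below uses the standard textbook recursion and steady-state control limits. It is not the implementation used for the analysis, the function names are illustrative, and the example data is hypothetical (a step shift in an otherwise constant series), so it is not expected to reproduce the limits listed in the tables.

```python
import math

def ewma_limits(center, sigma, lam=0.2, width=3.0):
    """Steady-state EWMA control limits, returned as (LCL, CL, UCL)."""
    half = width * sigma * math.sqrt(lam / (2.0 - lam))
    return center - half, center, center + half

def ewma(data, center, lam=0.2):
    """EWMA statistic z_i = lam * x_i + (1 - lam) * z_(i-1), started at the centre line."""
    z, out = center, []
    for x in data:
        z = lam * x + (1.0 - lam) * z
        out.append(z)
    return out

def out_of_control(stats, lcl, ucl):
    """Zero-based indices of the points that fall outside the control limits."""
    return [i for i, z in enumerate(stats) if z < lcl or z > ucl]

# Hypothetical data: in control for 10 points, then a shift of three standard deviations.
data = [0.0] * 10 + [3.0] * 10
lcl, cl, ucl = ewma_limits(center=0.0, sigma=1.0)
flagged = out_of_control(ewma(data, cl), lcl, ucl)
```

For λ = 0.2 and L = 3 the steady-state limits lie one process standard deviation either side of the centre line, since 3·sqrt(0.2/1.8) = 1.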
Table 5.1: Chosen parameters for normal distribution data: (a) EWMA and (b)
CUSUM.
(a)
λ 0.2
L 3
UCL (Steady state) 0.828
CL 0.243
LCL (Steady state) -0.341
(b)
K ½ σ
Upper limit 2.978
Lower limit -2.978
Target mean 0.243
Table 5.2: Chosen parameters for the non-normal distribution data: (a) EWMA
and (b) CUSUM.
(a)
λ 0.2
L 3
UCL (Steady state) 2.672
CL 1.694
LCL (Steady state) 0.717
(b)
K ½ σ
Upper limit 6.552
Lower limit -6.552
Target mean 1.694
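The CUSUM parameters above (reference value K = ½σ, symmetric upper and lower limits, target mean) correspond to the usual two-sided tabular CUSUM. The following sketch shows that standard form, with the lower sum kept negative so that it can be compared against a negative lower limit. It is an illustration under that assumption rather than the code used for the analysis, and the data and the decision limit of 5 are hypothetical.

```python
def cusum(data, target, k):
    """Two-sided tabular CUSUM: upper sums (>= 0) and lower sums (<= 0) per point."""
    hi = lo = 0.0
    upper, lower = [], []
    for x in data:
        hi = max(0.0, hi + (x - target - k))  # accumulates upward deviations beyond K
        lo = min(0.0, lo + (x - target + k))  # accumulates downward deviations beyond K
        upper.append(hi)
        lower.append(lo)
    return upper, lower

def cusum_signals(upper, lower, limit):
    """Zero-based indices where either sum crosses the symmetric decision limit."""
    return [i for i, (u, l) in enumerate(zip(upper, lower)) if u > limit or l < -limit]

# Hypothetical data: on target for 10 points, then an upward shift of two sigma (sigma = 1).
data = [0.0] * 10 + [2.0] * 10
upper, lower = cusum(data, target=0.0, k=0.5)
signals = cusum_signals(upper, lower, limit=5.0)
```

In this example the upper sum grows by 1.5 per point after the shift and crosses the limit at the fourth shifted point.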
5.1.3 Simulation result and discussion
The results of the analysis are shown below. Figure 5.3(a)-(b) shows the result of
applying EWMA and CUSUM techniques to the normal distribution data. Figure
5.4(a)-(b) shows the result of applying EWMA and CUSUM techniques to the data
which does not have a normal distribution.
Figure 5.3: Result of the application of (a) EWMA technique and (b) CUSUM
technique to the normal distribution data.
From Figure 5.3, it is observed that for normal distribution data, the CUSUM
technique is slightly more sensitive to a shift of the distribution than the EWMA
technique. For both techniques, the data is considered out of control whenever it goes
beyond the upper or lower limit. CUSUM has a slightly higher percentage of data which
is out of control, 16% (16 out of 100 data points) compared with 15% (15 out of 100
data points) for the EWMA technique. Longer out-of-control periods occurred with the
CUSUM technique, from the 26th to the 33rd point and from the 69th to the 73rd point.
Nevertheless, both CUSUM and EWMA are considered to be among the best tools for
detecting a small shift of the distribution [6, 119], as shown in Figure 5.3(a)-(b).
Therefore, the use of the EWMA technique for comparison with the CUSUM technique in
controlling perceptual speech quality in the later sections is appropriate.
Figure 5.4: Result of the application of (a) EWMA technique and (b) CUSUM
technique to the non-normal distribution data.
From Figure 5.4, it is observed that for the non-normal distribution data, the
EWMA technique appears more sensitive to a shift of the data distribution. Although
EWMA has only a slightly higher percentage of data which is out of control, 16% (16
out of 100 data points) compared with 15% (15 out of 100 data points) for the CUSUM
technique, the CUSUM plot looks steady from the 40th to the 69th point and from the
80th to the 95th point, while the EWMA plot keeps changing during these periods. This
indicates that EWMA has higher sensitivity to non-normal distribution data than to
normal distribution data.
Both figures show that EWMA is more sensitive in detecting shifts with
non-normal distribution data than with normal distribution data. Since the standard
deviation of the sample of non-normal distribution data is 1.92, compared to 0.35
for the sample of normal distribution data, the figures also imply that EWMA is more
sensitive to a larger shift of data. In Figure 5.4(a), the EWMA plot goes up and
down frequently, whereas the CUSUM plot is steadier. However, the new metric employed
to replace the conventional metric (FER) in controlling perceptual speech quality is
log(FDn), which has a normal distribution and reflects the quality of speech. The
superiority of EWMA over CUSUM in detecting a larger shift for non-normal
distributions therefore does not apply in this case.
5.2 Power Control Simulation Model
In this section, the EWMA based technique described in Chapter 3 is incorporated in
the outer loop of the UMTS power control. Using computer simulations, the performance
of this EWMA based technique is compared against the conventional technique and its
SPC counterpart, CUSUM, which was analysed in Chapter 4.
The Matlab Simulink implementation of the UMTS physical layer which was used
for the simulations in Chapter 4 is also used in these simulations. The physical layer
is described in detail in Section 4.3.3. The same input speech file, speech codec and
channel parameters used in the simulation model in Section 4.3 are used for the
simulations in this chapter to make a valid comparison.
5.2.1 EWMA based UMTS Power Control
The simulation model for UMTS power control based on EWMA is as described and
illustrated in Chapter 3. Figure 5.5 shows the application of EWMA in UMTS
outer-loop power control.
Figure 5.5: Application of EWMA in UMTS outer-loop power control.
The flow chart for the EWMA based outer loop power control is depicted in
Figure 5.6. Note that this flow chart differs from the flow chart for the conventional
UMTS power control shown in Figure 3.14 (Section 3.7.1).
[Figure 5.5 block diagram. Transmitter: Speech Signal, AMR Encoder, Outer-loop Power Control,
AMR Decoder, Delay, Reference Signal. Channel. Receiver: Received Signal, CRC Check FEP (FQIn),
AMR Decoder, Delay, Synthesized Signal, PESQ, Log(FDn), EWMA Yn.]
[Figure 5.6 flow chart. Boxes: Start; Check CRC of current frame; decision "CRC in error?";
decision "log(FDn) < EWMA thresholds?"; SIR target = SIR target + ∆up; SIR target = SIR target
- ∆down; decision "SIR_target > maximum SIR target?" (yes: SIR target = maximum SIR_target);
decision "SIR_target < minimum SIR target?" (yes: SIR target = minimum SIR_target); Process
next frame.]
Figure 5.6: EWMA based UMTS outer-loop power control.
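The decision logic of the flow chart in Figure 5.6 can be sketched as a single per-frame update. The exact branch routing of the flow chart is not recoverable from the text, so the following is an assumption: the SIR target is raised only when the frame is in error and the EWMA of log(FDn) is not below the threshold, it is lowered otherwise, and it is then clamped to the minimum and maximum SIR targets. All names and the numeric values in the usage lines are hypothetical.

```python
def ewma_outer_loop_step(sir_target, crc_error, log_fd, z_prev, threshold,
                         delta_up, delta_down, sir_min, sir_max, lam=0.2):
    """One frame of an EWMA-gated outer loop (sketch; branch order is assumed)."""
    z = lam * log_fd + (1.0 - lam) * z_prev  # EWMA of log(FDn)
    quality_ok = z < threshold               # assumed reading of the threshold test
    if crc_error and not quality_ok:
        sir_target += delta_up               # erasure with degraded quality: step up
    else:
        sir_target -= delta_down             # otherwise: step down
    sir_target = min(max(sir_target, sir_min), sir_max)  # keep within [min, max]
    return sir_target, z

# Erasure while the quality statistic is still below the threshold: no increase.
skipped, z1 = ewma_outer_loop_step(4.0, True, 0.0, 0.0, 0.5, 0.495, 0.005, 0.0, 10.0)
# Erasure with a large log(FDn): the EWMA crosses the threshold and the target rises.
raised, z2 = ewma_outer_loop_step(4.0, True, 5.0, 0.0, 0.5, 0.495, 0.005, 0.0, 10.0)
```

The only difference from a conventional update is the extra quality test on the erasure branch, which is where the avoided increases, and hence the SIR target gain, come from.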
5.2.2 Summary of simulation parameters
A summary of the main simulation parameters is given in Table 5.3. The parameters λ, L
and the EWMA target are set to 0.2, 3 and 0.02 respectively, where L is the width of
the control limits in multiples of the process standard deviation σ. The EWMA target
of 0.02 is equivalent to a PESQ MOS score of 4. Hence, it is equivalent to the CUSUM
parameters set in Section 4.3.6.
5.2.3 Methodology
Based on 3GPP recommendations [112], three representative vehicular speeds of 3, 50
and 120 km/h were employed for performance comparisons between the EWMA and
CUSUM based UMTS power control algorithms. To ensure that the channel error patterns
were independent across the simulations, five different channel shadowing profiles were
simulated for each vehicular speed. Each power control algorithm was simulated for
outer-loop step sizes ∆ of 0.005, 0.01, 0.015 and 0.02 dB. For each simulation, a 40 s
speech file was transmitted over the UMTS physical layer shown in Figure 4.8 (Section
4.3), with only one power control algorithm enabled at a time. In each case, the
variations of the SIR target and the channel shadowing profile were recorded.
For each simulation, the PESQ algorithm was applied to the received speech file
together with the original transmitted file, and the corresponding PESQ MOS was
calculated.
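The size of the simulation campaign described above follows directly from the parameter sets. The sketch below simply enumerates the combinations; the constant names are illustrative, and the outer-loop update rate of 50 s-1 is the value listed in the simulation parameters.

```python
from itertools import product

SPEEDS_KMH = (3, 50, 120)
PROFILES = (1, 2, 3, 4, 5)
STEP_DOWN_DB = (0.005, 0.01, 0.015, 0.02)
ALGORITHMS = ("EWMA", "CUSUM")

SPEECH_SECONDS = 40       # length of the transmitted speech file
OUTER_LOOP_RATE_HZ = 50   # outer-loop updates per second

# One run per (algorithm, speed, shadowing profile, step size) combination.
runs = list(product(ALGORITHMS, SPEEDS_KMH, PROFILES, STEP_DOWN_DB))
updates_per_run = SPEECH_SECONDS * OUTER_LOOP_RATE_HZ
```

This gives 60 runs per algorithm, each contributing 2000 outer-loop updates to the recorded SIR target trace.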
Table 5.3: Main Simulation Parameters.
_____________________________________________________________________________
Chip rate 3.84 Mc/s
Spreading factor 128
Channel bit rate 60 kb/s
Speech coding AMR (rate 12.2 kb/s)
Channel Coding
Class A Rate 1/3 CC + 12 bit CRC
Class B Rate 1/2 CC
Class C Rate 1/2 CC
Interleaving both inter and intra-frame
Modulation QPSK
Power Control
Inner Loop
Update rate 1500 s-1
Up/down step size (δup or δdown) 1 dB
Outer-Loop (Conventional)
FER target 1%
Control variable CRC flags
Step down ∆down 0.005, 0.01, 0.015 and 0.02 dB
Step up ∆up 0.495, 0.99, 1.485 and 1.98 dB
Update rate 50 s-1
Outer-Loop (EWMA based)
FER target 1%
Control variable EWMA threshold and CRC flags
EWMA target 0.02
Step up/down as conventional above
Update rate 50 s-1
Channel type
AWGN ON
Log-normal Fading ON
(Std, decorrelation distance) (8 dB, 20 m)
Fast Fading 6-tap Vehicular A
Vehicular speed 3, 50 and 120 km/h
Receiver Rake (6 fingers)
Initial SIR 4 dB
5.2.4. Simulation results and discussion
Simulation results for each outer-loop step size at vehicular speeds of 3, 50 and
120 km/h are given in Table 5.4(a)-(c), Table 5.5(a)-(c), Table 5.6(a)-(c) and Table
5.7(a)-(c) respectively. The results include the average and standard deviation of the
SIR target and the PESQ MOS for the two power control algorithms, obtained for each
shadowing profile. The gain of the CUSUM based power control with respect to the EWMA
based power control, calculated as the difference between the average SIR targets in
both cases, is also shown. The ensemble averages over all shadowing profiles are also
included.
The statistical significance of the difference between the two methods is assessed
by applying the t-test. Table 5.4(a)-(c), Table 5.5(a)-(c), Table 5.6(a)-(c),
Table 5.7(a)-(c), Table 5.8(a)-(c), Table 5.9(a)-(c), Table 5.10(a)-(c) and Table
5.11(a)-(c) show that the p-values are less than the significance level. The maximum
p-value in the tables is 1.67E-04, which is below the significance level of 0.01.
Therefore, the results can be considered statistically significant.
From Table 5.4(a)-(c), Table 5.5(a)-(c), Table 5.6(a)-(c) and Table 5.7(a)-(c), it
is observed that the EWMA based power control achieves average gains of 2% to 9%
in the SIR target over the conventional power control. On the other hand, from
Table 5.8(a)-(c), Table 5.9(a)-(c), Table 5.10(a)-(c) and Table 5.11(a)-(c), it is
observed that the CUSUM based power control achieves gains of 1% to 5% in the SIR
target over the EWMA based power control. It is also observed that the maximum average
difference between the PESQ values of the algorithms is 0.188 (3.270-3.082), so all
the average PESQ MOS differences are less than 0.2 MOS and hardly perceptible.
Therefore, we can say that these speech files have similar perceptual qualities.
The SIR target gain arises from the number of times the EWMA and CUSUM
based algorithms avoid increasing the SIR target where the conventional algorithm
cannot. However, EWMA is observed to be less sensitive in detecting a shift of
the log(FDn) distribution, which is a normal distribution. The gain of the CUSUM based
algorithm over the EWMA based algorithm increased with the power control step size, as
noted in Table 5.8(a)-(c), Table 5.9(a)-(c), Table 5.10(a)-(c) and Table 5.11(a)-(c).
Table 5.4: Results for Conventional and EWMA based power control
algorithms with outer-loop step down, ∆down = 0.005 dB and vehicular speed of
(a) 3 km h-1, (b) 50 km h-1 and (c) 120 km h-1.
(a)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    EWMA   Std    Gain   P value    Conv   EWMA   Difference
profile 1  4.075  0.164  4.002  0.136  0.073  2.34E-19   3.169  3.030  0.139
profile 2  4.112  0.081  3.978  0.147  0.134  2.15E-11   3.205  3.065  0.140
profile 3  4.058  0.068  3.985  0.232  0.073  1.45E-89   3.152  3.089  0.063
profile 4  4.081  0.152  4.025  0.148  0.056  0.00E+00   3.048  3.037  0.011
profile 5  4.022  0.061  3.969  0.059  0.053  3.10E-23   3.153  3.152  0.001
Average    4.070  0.105  3.992  0.144  0.078  4.300E-12  3.145  3.075  0.071
(b)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    EWMA   Std    Gain   P value    Conv   EWMA   Difference
profile 1  4.260  0.097  4.017  0.095  0.243  0.00E+00   3.016  3.097  0.081
profile 2  4.125  0.062  4.012  0.167  0.113  1.28E-23   3.152  3.019  0.133
profile 3  4.190  0.083  4.031  0.158  0.159  4.04E-32   3.095  2.910  0.185
profile 4  4.134  0.073  4.083  0.091  0.051  2.87E-13   3.201  3.138  0.063
profile 5  4.124  0.095  3.987  0.053  0.137  1.98E-23   3.083  3.036  0.047
Average    4.167  0.082  4.026  0.113  0.141  5.740E-14  3.109  3.040  0.102
(c)
Channel    Ave SIR target (dB)                           PESQ MOS
profile    Conv   Std    EWMA   Std    Gain   P value    Conv   EWMA   Difference
profile 1  3.624  0.204  3.532  0.252  0.092  2.65E-45   3.466  3.346  0.120
profile 2  3.691  0.143  3.600  0.258  0.091  3.70E-76   3.392  3.353  0.039
profile 3  3.658  0.161  3.617  0.257  0.041  1.69E-37   3.317  3.296  0.021
profile 4  3.619  0.212  3.603  0.253  0.016  0.00E+00   3.493  3.353  0.140
profile 5  3.610  0.212  3.504  0.283  0.106  0.00E+00   3.460  3.287  0.173
Average    3.640  0.186  3.571  0.261  0.069  4.225E-38  3.426  3.327  0.099
Table 5.5: Results for Conventional and EWMA based power control
algorithms with outer-loop step down, ∆down = 0.01 dB and vehicular speed of
(a) 3 km h-1, (b) 50 km h-1 and (c) 120 km h-1.
(a)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  Conv   Std    EWMA   Std    Gain EWMA   P value    Conv   EWMA   Difference
profile 1        4.062  0.096  3.959  0.227  0.103       4.10E-34   3.076  3.012  0.064
profile 2        4.085  0.113  3.965  0.147  0.120       0.00E+00   3.240  3.036  0.204
profile 3        4.081  0.088  3.967  0.232  0.114       0.00E+00   3.204  3.092  0.112
profile 4        4.080  0.111  3.998  0.148  0.082       3.25E-54   3.193  3.052  0.141
profile 5        4.083  0.128  4.001  0.159  0.082       1.56E-23   3.136  3.091  0.045
Average          4.078  0.107  3.978  0.183  0.100       3.120E-24  3.170  3.057  0.113
(b)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  Conv   Std    CUSUM  Std    Gain CUSUM  P value    Conv   CUSUM  Difference
profile 1        4.183  0.128  3.652  0.130  0.531       0.00E+00   3.035  3.067  0.032
profile 2        4.184  0.102  3.784  0.129  0.400       4.30E-67   3.095  3.046  0.049
profile 3        4.371  0.150  3.911  0.127  0.460       2.45E-45   3.089  3.026  0.063
profile 4        4.184  0.123  3.806  0.130  0.378       0.00E+00   3.171  3.046  0.125
profile 5        4.332  0.138  4.104  0.129  0.228       4.52E-123  3.157  3.039  0.118
Average          4.251  0.128  3.851  0.129  0.399       4.900E-46  3.109  3.045  0.077
(c)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  Conv   Std    EWMA   Std    Gain EWMA   P value    Conv   EWMA   Difference
profile 1        3.412  0.271  3.313  0.350  0.099       2.88E-41   3.356  3.195  0.161
profile 2        3.663  0.177  3.313  0.302  0.350       0.00E+00   3.270  3.258  0.012
profile 3        3.300  0.350  3.230  0.215  0.070       2.34E-61   3.374  3.163  0.211
profile 4        3.351  0.286  3.323  0.207  0.028       0.00E+00   3.426  3.281  0.145
profile 5        3.408  0.291  3.368  0.308  0.040       4.24E-32   3.237  3.187  0.050
Average          3.427  0.275  3.309  0.276  0.117       8.480E-33  3.333  3.217  0.116
Table 5.6: Results for Conventional and EWMA based power control
algorithms with outer-loop step down, ∆down = 0.015 dB and vehicular speed of
(a) 3 km h-1, (b) 50 km h-1 and (c) 120 km h-1.
(a)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  Conv   Std    EWMA   Std    Gain EWMA   P value    Conv   EWMA   Difference
profile 1        3.988  0.158  4.008  0.127  -0.020      1.25E-23   3.027  3.021  0.006
profile 2        4.057  0.141  3.913  0.115  0.144       0.00E+00   3.248  3.036  0.212
profile 3        3.984  0.135  3.976  0.109  0.008       3.56E-17   3.189  3.062  0.127
profile 4        4.049  0.143  3.982  0.129  0.067       4.14E-35   3.201  2.985  0.216
profile 5        4.175  0.170  4.039  0.129  0.136       2.18E-21   3.108  3.091  0.017
Average          4.051  0.149  3.984  0.122  0.067       7.120E-18  3.155  3.039  0.116
(b)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  Conv   Std    EWMA   Std    Gain EWMA   P value    Conv   EWMA   Difference
profile 1        4.356  0.173  4.057  0.212  0.299       1.45E-29   3.123  3.094  0.029
profile 2        4.254  0.160  4.062  0.167  0.192       0.00E+00   3.130  3.010  0.120
profile 3        4.386  0.182  3.989  0.184  0.397       0.00E+00   3.123  2.896  0.227
profile 4        4.213  0.189  4.008  0.194  0.205       3.65E-18   3.150  3.098  0.052
profile 5        4.237  0.164  3.822  0.127  0.415       2.70E-28   3.177  2.888  0.289
Average          4.289  0.174  3.988  0.177  0.302       7.300E-19  3.141  2.997  0.143
(c)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  Conv   Std    EWMA   Std    Gain EWMA   P value    Conv   EWMA   Difference
profile 1        3.248  0.323  3.176  0.363  0.072       2.76E-98   3.295  3.146  0.149
profile 2        3.420  0.276  3.125  0.353  0.295       1.65E-49   3.315  3.149  0.166
profile 3        3.210  0.342  3.148  0.369  0.062       0.00E+00   3.262  3.063  0.199
profile 4        3.260  0.346  3.144  0.345  0.116       0.00E+00   3.331  3.186  0.145
profile 5        3.330  0.285  3.096  0.413  0.235       2.93E-49   3.178  3.087  0.091
Average          3.294  0.314  3.138  0.369  0.156       9.160E-50  3.276  3.126  0.150
Table 5.7: Results for Conventional and EWMA based power control
algorithms with outer-loop step down, ∆down = 0.02 dB and vehicular speed of
(a) 3 km h-1, (b) 50 km h-1 and (c) 120 km h-1.
(a)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  Conv   Std    EWMA   Std    Gain EWMA   P value    Conv   EWMA   Difference
profile 1        4.147  0.205  3.965  0.207  0.182       4.54E-12   3.108  3.037  0.071
profile 2        4.102  0.165  3.934  0.248  0.168       0.00E+00   3.263  3.201  0.062
profile 3        4.134  0.194  3.999  0.171  0.135       0.00E+00   3.211  3.141  0.070
profile 4        4.014  0.167  3.964  0.254  0.050       2.87E-32   3.265  3.226  0.039
profile 5        4.130  0.228  4.007  0.157  0.123       1.98E-23   3.191  3.116  0.075
Average          4.105  0.192  3.974  0.207  0.132       9.080E-13  3.208  3.144  0.063
(b)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  Conv   Std    EWMA   Std    Gain EWMA   P value    Conv   EWMA   Difference
profile 1        4.176  0.238  3.852  0.187  0.324       3.76E-76   3.108  3.049  0.059
profile 2        4.298  0.222  3.808  0.227  0.490       0.00E+00   3.127  3.006  0.121
profile 3        4.360  0.235  3.912  0.230  0.448       1.69E-37   3.181  2.944  0.237
profile 4        4.236  0.222  3.982  0.198  0.254       0.00E+00   3.208  3.173  0.035
profile 5        4.367  0.184  3.863  0.252  0.504       2.80E-45   3.256  2.948  0.308
Average          4.287  0.220  3.883  0.219  0.404       3.380E-38  3.176  3.024  0.152
(c)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  Conv   Std    EWMA   Std    Gain EWMA   P value    Conv   EWMA   Difference
profile 1        3.368  0.340  2.988  0.411  0.380       0.00E+00   3.310  3.004  0.306
profile 2        3.363  0.234  2.954  0.309  0.409       0.00E+00   3.305  3.181  0.124
profile 3        3.122  0.330  2.981  0.311  0.141       3.45E-27   3.138  3.096  0.042
profile 4        3.260  0.286  3.001  0.397  0.259       0.00E+00   3.246  3.014  0.232
profile 5        3.185  0.370  2.935  0.310  0.250       2.65E-43   3.351  3.114  0.237
Average          3.260  0.312  2.972  0.348  0.288       6.900E-28  3.270  3.082  0.188
Table 5.8: Results for EWMA and CUSUM based power control algorithms
with outer-loop step down, ∆down = 0.005 dB and vehicular speed of (a) 3 km h-1,
(b) 50 km h-1 and (c) 120 km h-1.
(a)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
profile 1        4.002  0.136  3.932  0.121  0.070       1.78E-21   3.030  3.057  0.027
profile 2        3.978  0.147  3.845  0.119  0.133       3.67E-11   3.065  3.177  0.112
profile 3        3.985  0.232  3.848  0.124  0.137       0.00E+00   3.089  3.034  0.055
profile 4        4.025  0.148  3.887  0.122  0.138       4.14E-35   3.037  3.016  0.021
profile 5        3.969  0.059  3.957  0.119  0.012       2.57E-05   3.152  3.100  0.052
Average          3.992  0.144  3.894  0.121  0.098       5.14E-06   3.075  3.077  0.054
(b)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
profile 1        4.017  0.095  3.966  0.229  0.051       5.70E-67   3.097  2.982  0.115
profile 2        4.012  0.167  4.091  0.128  -0.079      1.78E-43   3.019  3.006  0.013
profile 3        4.031  0.158  4.022  0.129  0.009       0.00E+00   2.910  2.967  0.057
profile 4        4.083  0.091  3.998  0.125  0.085       0.00E+00   3.138  3.171  0.033
profile 5        3.987  0.053  3.849  0.090  0.138       6.07E-19   3.036  2.997  0.039
Average          4.026  0.113  3.985  0.140  0.041       1.21E-19   3.040  3.024  0.051
(c)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
profile 1        3.532  0.252  3.560  0.271  -0.028      0.00E+00   3.346  3.395  0.049
profile 2        3.600  0.258  3.578  0.221  0.022       1.57E-16   3.353  3.291  0.062
profile 3        3.617  0.257  3.518  0.246  0.099       2.04E-56   3.296  3.244  0.052
profile 4        3.603  0.253  3.462  0.209  0.141       0.00E+00   3.353  3.315  0.039
profile 5        3.504  0.283  3.499  0.289  0.005       1.54E-75   3.287  3.314  0.027
Average          3.571  0.261  3.523  0.247  0.048       3.14E-17   3.327  3.312  0.046
Table 5.9: Results for EWMA and CUSUM based power control algorithms
with outer-loop step down, ∆down = 0.01 dB and vehicular speed of (a) 3 km h-1,
(b) 50 km h-1 and (c) 120 km h-1.
(a)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
profile 1        3.959  0.227  3.920  0.112  0.039       3.31E-12   3.012  3.065  0.053
profile 2        3.965  0.147  3.936  0.120  0.029       0.00E+00   3.036  3.052  0.016
profile 3        3.967  0.232  3.803  0.122  0.164       5.76E-24   3.092  2.997  0.095
profile 4        3.998  0.148  3.834  0.121  0.164       0.00E+00   3.052  3.178  0.126
profile 5        4.001  0.159  3.847  0.122  0.154       2.16E-27   3.091  3.040  0.051
Average          3.978  0.183  3.868  0.119  0.110       6.62E-13   3.057  3.066  0.068
(b)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
profile 1        3.920  0.088  3.652  0.130  0.268       0.00E+00   2.997  3.067  0.070
profile 2        4.033  0.121  3.784  0.129  0.249       0.00E+00   3.029  3.046  0.017
profile 3        4.069  0.134  3.911  0.127  0.158       0.00E+00   2.866  3.026  0.160
profile 4        3.978  0.143  3.806  0.130  0.172       3.56E-137  3.179  3.046  0.133
profile 5        4.041  0.168  3.910  0.129  0.131       3.89E-36   3.104  3.039  0.065
Average          4.008  0.131  3.813  0.129  0.196       7.78E-37   3.035  3.045  0.089
(c)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
profile 1        3.313  0.350  3.123  0.463  0.190       6.23E-46   3.195  3.296  0.101
profile 2        3.313  0.302  3.222  0.251  0.091       1.54E-04   3.258  3.266  0.008
profile 3        3.230  0.215  3.009  0.287  0.221       0.00E+00   3.163  3.213  0.050
profile 4        3.323  0.207  3.257  0.287  0.066       3.67E-127  3.281  3.214  0.067
profile 5        3.368  0.308  3.236  0.317  0.132       3.89E-36   3.187  3.119  0.068
Average          3.309  0.276  3.169  0.321  0.140       3.08E-05   3.217  3.222  0.059
Table 5.10: Results for EWMA and CUSUM based power control algorithms
with outer-loop step down, ∆down = 0.015 dB and vehicular speed of (a) 3 km h-1,
(b) 50 km h-1 and (c) 120 km h-1.
(a)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
profile 1        4.008  0.127  3.916  0.124  0.092       0.00E+00   3.021  2.932  0.089
profile 2        3.913  0.115  3.860  0.124  0.053       0.00E+00   3.036  3.070  0.034
profile 3        3.976  0.109  3.955  0.122  0.021       5.67E-45   3.062  3.008  0.054
profile 4        3.982  0.129  3.829  0.125  0.153       2.34E-87   2.985  3.151  0.166
profile 5        4.039  0.129  3.884  0.114  0.155       5.70E-170  3.091  2.995  0.097
Average          3.984  0.122  3.889  0.122  0.095       1.13E-45   3.039  3.031  0.088
(b)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
profile 1        4.057  0.212  3.957  0.231  0.100       0.00E+00   3.094  2.992  0.102
profile 2        4.062  0.167  4.105  0.233  -0.043      0.00E+00   3.010  3.057  0.046
profile 3        3.989  0.184  3.868  0.138  0.121       0.00E+00   2.896  3.042  0.146
profile 4        4.008  0.194  3.828  0.135  0.180       1.34E-15   3.098  3.037  0.061
profile 5        3.822  0.127  3.797  0.140  0.026       3.25E-12   2.888  3.037  0.149
Average          3.988  0.177  3.911  0.176  0.077       6.5E-13    2.997  3.033  0.101
(c)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
profile 1        3.176  0.363  3.171  0.366  0.005       0.00E+00   3.146  3.196  0.050
profile 2        3.125  0.353  3.088  0.266  0.037       0.00E+00   3.149  3.244  0.095
profile 3        3.148  0.369  3.063  0.320  0.085       4.90E-59   3.063  3.183  0.120
profile 4        3.144  0.345  3.118  0.314  0.026       2.56E-10   3.186  3.164  0.023
profile 5        3.096  0.413  2.844  0.542  0.252       1.50E-238  3.087  3.046  0.041
Average          3.138  0.369  3.057  0.362  0.081       5.12E-11   3.126  3.167  0.066
Table 5.11: Results for EWMA and CUSUM based power control algorithms
with outer-loop step down, ∆down = 0.02 dB and vehicular speed of (a) 3 km h-1,
(b) 50 km h-1 and (c) 120 km h-1.
(a)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
profile 1        3.965  0.207  3.944  0.211  0.021       8.36E-04   3.037  3.076  0.039
profile 2        3.934  0.248  3.984  0.238  -0.050      0.00E+00   3.201  3.240  0.039
profile 3        3.999  0.171  3.906  0.222  0.093       0.00E+00   3.141  3.100  0.041
profile 4        3.964  0.254  3.944  0.252  0.020       2.76E-78   3.226  3.178  0.049
profile 5        4.007  0.157  3.918  0.238  0.089       0.00E+00   3.116  3.136  0.020
Average          3.974  0.207  3.939  0.232  0.035       1.67E-04   3.144  3.146  0.038
(b)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
profile 1        3.852  0.187  3.766  0.284  0.086       8.36E-04   3.049  3.055  0.006
profile 2        3.808  0.227  3.825  0.378  -0.017      2.80E-23   3.006  3.055  0.049
profile 3        3.912  0.230  3.933  0.357  -0.021      0.00E+00   2.944  3.018  0.074
profile 4        3.982  0.198  3.835  0.304  0.147       0.00E+00   3.173  3.210  0.036
profile 5        3.863  0.252  3.871  0.355  -0.008      2.56E-43   2.948  2.932  0.016
Average          3.883  0.219  3.846  0.336  0.037       1.67E-04   3.024  3.054  0.036
(c)
                 Ave SIR target (dB)                                PESQ MOS
Channel Profile  EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
profile 1        2.988  0.411  2.749  0.513  0.239       1.01E-29   3.004  3.151  0.147
profile 2        2.954  0.309  2.939  0.332  0.015       0.00E+00   3.181  3.219  0.037
profile 3        2.981  0.311  2.812  0.435  0.169       3.20E-77   3.096  3.083  0.013
profile 4        3.001  0.397  2.808  0.338  0.193       2.56E-56   3.014  3.135  0.121
profile 5        2.935  0.310  2.794  0.339  0.141       5.70E-124  3.114  3.152  0.038
Average          2.972  0.348  2.820  0.391  0.151       2.02E-30   3.082  3.148  0.071
A summary of the ensemble averages for the outer-loop step sizes of 0.005, 0.01,
0.015 and 0.02 dB for the EWMA based algorithm compared with the conventional power
control algorithm is given in Table 5.12(a)-(c).
Table 5.12: Results for Conventional and EWMA based power control
algorithms for all simulated outer loop step sizes and vehicular speeds of (a) 3
km h-1, (b) 50 km h-1 and (c) 120 km h-1.
(a)
                 Ave SIR target (dB)                                PESQ MOS
Step Size (dB)   Conv   Std    EWMA   Std    Gain EWMA   P value    Conv   EWMA   Difference
0.005            4.070  0.105  3.000  0.121  1.070       4.30E-12   3.145  3.145  0.069
0.010            4.078  0.107  3.868  0.119  0.210       3.12E-24   3.170  3.170  0.103
0.015            4.051  0.149  3.889  0.122  0.162       7.12E-18   3.155  3.155  0.124
0.020            4.106  0.192  3.939  0.232  0.167       9.08E-13   3.208  3.208  0.062
(b)
                 Ave SIR target (dB)                                PESQ MOS
Step Size (dB)   Conv   Std    EWMA   Std    Gain EWMA   P value    Conv   EWMA   Difference
0.005            4.167  0.082  4.026  0.113  0.141       5.74E-14   3.109  3.109  0.085
0.010            4.251  0.128  4.008  0.131  0.243       1.60E-25   3.109  3.109  0.064
0.015            4.289  0.174  3.988  0.177  0.301       7.30E-19   3.141  3.141  0.108
0.020            4.287  0.220  3.883  0.219  0.404       3.30E-38   3.176  3.176  0.123
(c)
                 Ave SIR target (dB)                                PESQ MOS
Step Size (dB)   Conv   Std    EWMA   Std    Gain EWMA   P value    Conv   EWMA   Difference
0.005            3.640  0.186  3.571  0.216  0.069       4.23E-38   3.426  3.426  0.114
0.010            3.427  0.275  3.309  0.276  0.118       8.48E-33   3.333  3.333  0.111
0.015            3.294  0.314  3.138  0.369  0.156       9.16E-50   3.276  3.276  0.110
0.020            3.260  0.312  2.972  0.348  0.288       6.90E-28   3.270  3.270  0.122
A summary of the ensemble averages for the outer-loop step sizes of 0.005, 0.01,
0.015 and 0.02 dB for the CUSUM based algorithm compared with the EWMA based
algorithm is given in Table 5.13(a)-(c).
Table 5.13: Results for EWMA and CUSUM based power control algorithms for
all simulated outer loop step sizes and vehicular speeds of (a) 3 km h-1, (b) 50
km h-1 and (c) 120 km h-1.
(a)
                 Ave SIR target (dB)                                PESQ MOS
Step Size (dB)   EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
0.005            3.992  0.144  3.894  0.121  0.098       5.14E-06   3.075  3.077  0.054
0.010            3.978  0.183  3.868  0.119  0.110       6.62E-13   3.057  3.066  0.068
0.015            3.984  0.122  3.889  0.122  0.095       1.13E-45   3.039  3.031  0.088
0.020            3.974  0.207  3.939  0.232  0.035       1.67E-04   3.144  3.146  0.038
(b)
                 Ave SIR target (dB)                                PESQ MOS
Step Size (dB)   EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
0.005            4.026  0.113  3.985  0.140  0.041       1.21E-19   3.040  3.024  0.051
0.010            4.008  0.131  3.813  0.129  0.196       7.78E-37   3.035  3.045  0.089
0.015            3.988  0.177  3.911  0.176  0.077       6.50E-13   2.997  3.033  0.101
0.020            3.883  0.219  3.846  0.336  0.037       1.67E-04   3.024  3.054  0.036
(c)
                 Ave SIR target (dB)                                PESQ MOS
Step Size (dB)   EWMA   Std    CUSUM  Std    Gain CUSUM  P value    EWMA   CUSUM  Difference
0.005            3.571  0.261  3.523  0.247  0.048       3.14E-17   3.327  3.312  0.046
0.010            3.309  0.276  3.169  0.321  0.140       3.08E-05   3.217  3.222  0.059
0.015            3.138  0.369  3.057  0.362  0.081       5.12E-11   3.126  3.167  0.066
0.020            2.972  0.348  2.820  0.391  0.151       2.02E-30   3.082  3.148  0.071
A set of representative curves comparing the performance of the conventional,
EWMA based and CUSUM based outer-loop power control algorithms for vehicular
speeds of 3 km h-1, 50 km h-1 and 120 km h-1 is shown in Figure 5.7(a)-(c),
respectively. Since the same simulation results are used, the figure is similar to
Figure 4.11 except for the addition of an EWMA curve for comparison. In each case,
the shadowing profile and the SIR targets of the three algorithms are shown. The CRC
flag indicates a frame erasure. Note that, like the conventional and CUSUM based
algorithms, the EWMA based technique depends on the CRC flags; it additionally uses
the EWMA threshold in controlling perceptual speech quality. It can be observed from
Figure 5.7(a)-(c) that the SIR targets were increased when the corresponding CRC flag
indicated a frame erasure. However, there were situations in which frame erasures
occurred but the SIR target of the EWMA and CUSUM based techniques was not
increased, giving rise to the observed gaps between the SIR targets of the three
algorithms in Figure 5.7(a)-(c). The average area of the gap corresponds to the gain
achieved by the CUSUM based algorithm over its SPC counterpart, EWMA, and to the
gain achieved by the EWMA based algorithm over the conventional algorithm. The set
of curves corresponding to the best scenario, which resulted in the highest SIR target
gain of CUSUM over EWMA, is shown in Figure 5.8(a)-(c). In this case, at the given
step size of 0.01 dB, the SIR target gains are 0.039, 0.268 and 0.190 dB for vehicular
speeds of 3 km h-1, 50 km h-1 and 120 km h-1, respectively.
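The relation between the gap and the gain can be sketched as follows, assuming the two SIR target traces are sampled at the same instants so that the average gap equals the difference of the two average SIR targets. The function name and its arguments are hypothetical and are used only for illustration.

```python
def sir_target_gain(reference_targets_db, candidate_targets_db):
    """Average gap (dB) between two SIR target traces of equal length.

    A positive value means the candidate algorithm operates, on average,
    at a lower SIR target than the reference algorithm.
    """
    if not reference_targets_db or len(reference_targets_db) != len(candidate_targets_db):
        raise ValueError("traces must be non-empty and of equal length")
    gaps = [r - c for r, c in zip(reference_targets_db, candidate_targets_db)]
    return sum(gaps) / len(gaps)
```

For example, the average SIR targets reported for the 3 km h-1 case of Figure 5.8 (EWMA = 3.959 dB, CUSUM = 3.920 dB) differ by 0.039 dB, which is the gain quoted above.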
[Figure 5.7(a): Conventional, CUSUM based and EWMA based SIR targets (dB), shadowing profile (dB) and CRC flags of the three systems; amplitude (dB) / flag against time (0 to 40 sec). Average SIR Targets (dB): Conventional = 4.022, CUSUM = 3.957, EWMA = 3.969]
[Figure 5.7(b): same quantities and axes. Average SIR Targets (dB): Conventional = 4.124, CUSUM = 3.849, EWMA = 3.987]
[Figure 5.7(c): same quantities and axes. Average SIR Targets (dB): Conventional = 4.022, CUSUM = 3.957, EWMA = 3.969]
Figure 5.7: Performance comparison of Conventional, CUSUM based and
EWMA based power control (shadowing profile 5 and ∆ = 0.005 dB): (a) 3 km
h-1, (b) 50 km h-1 and (c) 120 km h-1.
[Figure 5.8(a): Conventional, CUSUM based and EWMA based SIR targets (dB), shadowing profile (dB) and CRC flags of the three systems; amplitude (dB) / flag against time (0 to 40 sec). Average SIR Targets (dB): Conventional = 4.062, CUSUM = 3.920, EWMA = 3.959]
[Figure 5.8(b): same quantities and axes. Average SIR Targets (dB): Conventional = 4.183, CUSUM = 3.652, EWMA = 3.920]
[Figure 5.8(c): same quantities and axes. Average SIR Targets (dB): Conventional = 3.412, CUSUM = 3.123, EWMA = 3.313]
Figure 5.8: Performance comparison of Conventional, CUSUM based and
EWMA based power control (shadowing profile 1 and ∆ = 0.01 dB): (a) 3 km
h-1, (b) 50 km h-1 and (c) 120 km h-1.
5.3 Summary
The analysis of data distributions with EWMA and CUSUM shows that CUSUM is
slightly more sensitive for normally distributed data. However, for data that are
not normally distributed, EWMA shows more sensitivity than CUSUM. It is also
observed that EWMA tends to be more sensitive to a larger shift of the distribution.
Since log_n(FD) has a normal distribution (FD analysis in Section 4.1), applying
CUSUM for controlling the transmitter parameters in the UMTS system is more
appropriate than applying EWMA. Furthermore, the comparison in this chapter showed
that CUSUM based power control reduced the average SIR target by up to 5% relative
to the EWMA based power control technique. Applying this new parameter in both the
EWMA and CUSUM schemes in a mobile communication system allows faster action at
the transmitter to control the quality of the speech signals as required by the end
users. Hence, it helps providers to optimize network resources.
The performance of EWMA based and CUSUM based power control was
compared by computer simulations using a comprehensive set of parameters. These
parameters were the step size of the outer loop power control, the vehicular speed and
the channel shadowing profile. Both algorithms are better than the conventional
algorithm in terms of SIR target reduction while still providing adequate perceptual
quality to the end users. The simulation results showed that the EWMA based power
control reduces the average SIR target by up to 9% relative to the conventional algorithm.
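One plausible reading of these percentages, assuming they are relative reductions computed directly on the average SIR targets in dB, can be checked against the tabulated averages. The helper below is an illustrative sketch and its name is hypothetical; the thesis does not state the exact formula used.

```python
def relative_reduction_percent(baseline_db, reduced_db):
    """Relative reduction of an average SIR target, in percent of the baseline.

    Assumption: the percentage is taken on the dB values themselves.
    """
    if baseline_db <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline_db - reduced_db) / baseline_db


# Average rows of Table 5.7(b) (Conv 4.287 dB, EWMA 3.883 dB) and
# Table 5.9(b) (EWMA 4.008 dB, CUSUM 3.813 dB).
ewma_vs_conventional = relative_reduction_percent(4.287, 3.883)
cusum_vs_ewma = relative_reduction_percent(4.008, 3.813)
```

Under this assumption the two values are about 9.4% and 4.9%, which is consistent with the "up to 9%" and "up to 5%" figures quoted in this summary.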
CHAPTER 6
CONCLUSIONS
Mobile communication system usage has expanded over the years. Mobile phones, for
example, have become a necessity rather than an accessory. Therefore, the demand
for good quality systems, including good speech quality, is rising. Mobile
communication system providers compete among themselves to offer a better QoS to
their customers while gaining financial benefits and avoiding energy wastage.
End users, for their part, want a good QoS at a reasonable cost.
This has inspired researchers to find methods of decreasing the energy
usage while providing an adequate QoS to the customer. To achieve this, a system
that can control the resources at the transmitter, through functions such as power
control and speech codec rate control, needs to be constructed. Consequently, over the years, power control
has received considerable attention and many good power control algorithms have been
proposed. However most of the proposed algorithms measure speech quality indirectly
based on some channel quality metric such as SIR, BER, and FER. It is agreed by many
researchers that these parameters are actually measures of the quality of received radio
signals, or integrity of the detected bits or frames but not the speech quality as perceived
by the end user. Employment of these inaccurate channel quality metrics will result in
inefficiency in power control algorithms. In this case, at times, more than adequate
quality is provided at the expense of network capacity, while at other times a connection
is considered technically successful but the quality of speech may be poor.
Therefore, to avoid inefficiencies in controlling functions such as power control
and speech codec rate at the transmitter, more reliable speech quality measures must
be used. Indeed, the most reliable speech quality measure should come from the end
user. Hence, a control algorithm based on the human auditory system should be
designed for efficient control of system resources. Among the various reliable
perceptual speech quality metrics, PESQ, a state of the art referenced objective
speech quality measure, has an advantage over earlier referenced objective speech
quality measures and is therefore employed as the perceptual speech quality
metric in controlling mobile system resources.
A power control algorithm based on PESQ has been applied by researchers to the
UMTS system. It was shown that this algorithm is better than the conventional
algorithm of UMTS in saving system resources while providing customers with a
satisfactory QoS. However, the smallest period over which PESQ can evaluate speech
quality is 320 ms, which is too long for effective control of quality in networks.
In this thesis, FD, which is extracted from PESQ, is proposed to replace a
non-perceptual metric such as FER in mobile radio systems. The FD is calculated every
16 ms and is therefore suitable for control purposes. In order to control functions at
the transmitter such as speech codec rate and power based on the distribution of
log_n(FD), an SPC tool, which is novel in mobile communication systems, is applied.
CUSUM and EWMA techniques have been applied to UMTS, and their effectiveness in better
addressing the aforementioned trade-off between radio resources and speech quality
has been shown by computer simulation as well as analysis. The major findings and
contributions of this thesis, together with possible extensions, are summarized as
follows.
6.1 Summary of Major Findings and Contributions
In this thesis, log_n(FD) has been proposed for use as a perceptual metric to replace
non-perceptual measures such as SIR, BER and FER. The PESQ score is a function of FD.
Specifically, FD represents the perceptual degradation of each frame of speech. The
analysis of FD in Chapter 4 shows that log_n(FD) has a normal distribution whose mean
increases as the perceptual speech quality degrades, and vice versa.
The FD analysis suggests that transmission parameters such as power should not be
adapted on a frame by frame basis, as is the current practice. Current practices lead to
inefficient utilization of resources and possibly unsatisfactory perceptual speech quality.
To maintain a certain level of end user perceptual quality, what is needed is to detect
a shift in the distribution of log_n(FD) and take steps to rectify it, such as controlling
the transmission power, channel coding or speech codec rate.
A CUSUM based technique was proposed as a novel technique for controlling
the speech codec rate and the transmission power in mobile communication systems. In
Chapter 4, Section 4.2, the CUSUM based technique is applied to control the speech
codec rate for UMTS. The CUSUM based technique, which employs log_n(FD) as the
perceptual speech quality parameter, allows faster action at the transmitter to control
the quality of the speech signals as required by end users. Hence, a non-perceptual
speech parameter such as FER can be replaced with log_n(FD).
In Chapter 4, Section 4.3, the CUSUM based technique was applied for
controlling the transmission power at the transmitter for UMTS. The UMTS outer loop
power control was modified to employ the CUSUM based technique. Instead of
increasing the SIR target every time a frame error occurred, the perceptual importance
of the erroneous frame was determined by the CUSUM based technique before the
process proceeded. If the erroneous frame was of sufficient perceptual importance, only
then was the SIR target increase allowed, otherwise, the SIR target was decreased. A
comparison of the performance of CUSUM based and FER based outer loop power
control algorithms through simulations using a comprehensive set of parameters was
carried out, and the simulation results show the CUSUM based power control achieves
adequate speech quality while reducing the average SIR target by up to 13% relative to
the conventional algorithm.
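The difference between the two outer-loop update rules described above can be sketched as follows. This is a minimal illustration of the decision logic only: the function names are hypothetical, the perceptual-importance decision is passed in as a flag rather than computed by a CUSUM chart, and the step sizes in the usage note are assumed values.

```python
def conventional_update(sir_target_db, frame_error, step_up_db, step_down_db):
    """FER based rule: raise the SIR target on every frame error,
    otherwise lower it by the step-down size."""
    if frame_error:
        return sir_target_db + step_up_db
    return sir_target_db - step_down_db


def perceptual_update(sir_target_db, frame_error, perceptually_important,
                      step_up_db, step_down_db):
    """CUSUM based rule as described in the text: raise the SIR target only
    when the erroneous frame is of sufficient perceptual importance,
    otherwise lower it."""
    if frame_error and perceptually_important:
        return sir_target_db + step_up_db
    return sir_target_db - step_down_db
```

With an assumed step up of 0.5 dB and a step down of 0.01 dB, a frame error judged perceptually unimportant moves a 4.0 dB target to 4.5 dB under the conventional rule but to 3.99 dB under the perceptual rule. Repeated over a call, such decisions produce the lower average SIR target reported for the CUSUM based algorithm.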
The CUSUM based power control algorithm enabled the trade-off of average
perceptual quality with average SIR target in a more controlled manner. This cannot be
achieved with conventional power control of UMTS. This inefficiency of the
conventional power control would not allow accurate control of speech quality. This is
mainly due to inaccuracy of the FER as a non-perceptual quality metric in representing
speech quality. The conventional power control algorithm tried to keep FER within a
specified range that would guarantee good quality in all situations. This, however, at
times meant that more than necessary perceptual quality was provided.
The application of a CUSUM based technique in power control in UMTS
requires feedback of FEP, every 20 ms, from the receiver end of the communication
link to the transmitter. This implies that a feedback channel is required for this
purpose. FEP could, however, be carried on the feedback channels already available.
In Chapter 5, the EWMA and CUSUM based techniques are compared. In Section 5.1,
the response of the EWMA and CUSUM based techniques to the data distribution is
observed and analysed, showing that in our case the EWMA technique responds better
than the CUSUM technique to data that are not normally distributed. On the other
hand, the CUSUM technique responds better than the EWMA technique to normally
distributed data. The analysis also implies that EWMA is more sensitive to a larger
shift of the data, in particular for data that are not normally distributed.
In Section 5.2, the performance comparison between EWMA based and
CUSUM based power control algorithms is presented through simulations. It is shown
that both the EWMA and CUSUM algorithms reduce the average SIR target compared to
the conventional algorithm, with EWMA based power control achieving a reduction of
up to 9% relative to the conventional algorithm. CUSUM based power control, in turn,
achieves adequate speech quality while reducing the average SIR target by up to 5%
relative to the EWMA based algorithm. This shows that CUSUM is more sensitive to
the log_n(FD) distribution, which is a normal distribution. Generally, the perceptual
quality delivered by conventional power control is slightly higher than that of the
CUSUM/EWMA based algorithms. This is due to the ability of the perceptual
algorithms to trade off transmit power against perceptual quality in a more controlled
manner while still providing adequate quality to the users. Furthermore, it should be
noted that the MOS differences between the perceptual algorithms and the conventional
algorithm are hardly perceptible (less than 0.2 MOS). Therefore, we can say that all
the algorithms deliver adequate perceptual quality but at a different cost in terms of
average SIR target levels.
6.2 Suggestions for Future Work
The benefits of applying SPC in the power control of mobile communication systems are
shown in this thesis to be mainly the saving of precious system resources such as
transmitter power while providing adequate speech quality to end users.
However, since the thesis is mostly based on numerical simulations, more theoretical
analysis would improve the balance between numerical and theoretical analysis.
Furthermore, there are a number of extensions to this work that could be
considered for future research, which could potentially improve the
performance of SPC based techniques. The suggested extensions are as follows:
Pre-emptive perceptual power control
The proposed perceptual power control algorithms are reactive, since they wait for a
frame error to occur and then react according to the perceptual importance of the
erroneous frame in an attempt to keep the overall perceptual quality within a
prescribed range. When the received frames on which the quality measure is made are
so severely corrupted that the overall perceived speech quality is significantly
degraded, the system cannot do much to improve the speech quality.
However, pre-emptive measures that predict the perceptual significance of
frames before they are transmitted, and protect them accordingly, can avoid this
situation. The frames can be protected in many ways, such as unequal error
protection or unequal signal power allocation for the frames.
Simplification of PESQ
PESQ is designed for a wide range of network conditions and error types as well as
applications. For a specific application such as in a mobile communication system, the
PESQ algorithm could be simplified for the application without losing much accuracy.
There are functional blocks in PESQ, such as input filtering, whose necessity should
be studied and justified. The simpler the PESQ algorithm, the smaller the memory space
required for implementation of SPC based algorithms on mobile and base stations. With
fewer blocks in PESQ, the execution time of the algorithm will also be shorter.
APPENDIX
ITU Speech Files

TABLE A: ITU Speech files used for FD analysis for PESQ MOS 3.0
Speaker gender   ITU File name   Speaker gender   ITU File name
Female           O_0F01L84       Male             O_M02L3C
Female           O_0F02LBA       Male             O_M02L2D
Female           O_0F02L8C       Male             O_M02L3A
Female           O_0F02L8E       Male             O_M02L3E
Female           O_0F02L8F       Male             O_M02L4A

TABLE B: ITU Speech files used for FD analysis for PESQ MOS 3.1
Speaker gender   ITU File name   Speaker gender   ITU File name
Female           O_0F01L5A       Male             O_M01L02
Female           O_0F02L5B       Male             O_M01L2B
Female           O_0F02L5D       Male             O_M01L04
Female           O_0F02L6C       Male             O_M01L07
Female           O_0F02L7B       Male             O_M01L08

TABLE C: ITU Speech files used for FD analysis for PESQ MOS 3.2
Speaker gender   ITU File name   Speaker gender   ITU File name
Female           O_0F01L5C       Male             O_M01L06
Female           O_0F01L5E       Male             O_M01L08
Female           O_0F01L6B       Male             O_M01L12
Female           O_0F01L6D       Male             O_M01L16
Female           O_0F01L7C       Male             O_M01L18
TABLE D: ITU Speech files used for FD analysis for PESQ MOS 3.3
Speaker gender   ITU File name   Speaker gender   ITU File name
Female           O_0F01L6A       Male             O_M01L0F
Female           O_0F01L7A       Male             O_M01L09
Female           O_0F01L7D       Male             O_M01L10
Female           O_0F01L60       Male             O_M01L13
Female           O_0F01L61       Male             O_M01L14

TABLE E: ITU Speech files used for FD analysis for PESQ MOS 3.4
Speaker gender   ITU File name   Speaker gender   ITU File name
Female           O_0F01L5F       Male             O_M01L0A
Female           O_0F01L6F       Male             O_M01L0B
Female           O_0F01L7A       Male             O_M01L0E
Female           O_0F01L61       Male             O_M01L1F
Female           O_0F01L67       Male             O_M01L2B

TABLE F: ITU Speech files used for FD analysis for PESQ MOS 3.5
Speaker gender   ITU File name   Speaker gender   ITU File name
Female           O_0F01L5F       Male             O_M01L0C
Female           O_0F01L6A       Male             O_M01L0D
Female           O_0F01L7A       Male             O_M01L0E
Female           O_0F01L61       Male             O_M01L1A
Female           O_0F01L68       Male             O_M01L1B
BIBLIOGRAPHY
[1] Objective Quality Measurement of Telephone Band (300-34000Hz) Speech
Codecs, ITU-T Recommendation P.861, August 1996.
[2] J. G. B. Antony W.Rix1, Michael P. Hollier1 and Andries P. Hekstra2,
"Perceptual Evaluation of Speech Quality (PESQ) - A New Method for Speech
Quality Assessment of Telephone Networks and Codecs," in Proceeding IEEE
International Conference on Acoustics, Speech, and Signal Processing (ICASSP
'01) Salt Lake City, Utah, USA, May 2001, pp. 749-752.
[3] Behrooz Rohani and H. J. Zepernick, "Application of a Perceptual Speech
Quality Metric in Power Control of UTMS," in 2nd ACM International
Workshop on Quality of Service & Security for Wireless and Mobile Networks
(Q2sWinet'06), Torremolinos,(Malaga), Spain, Oct 2006, pp. 87-94.
[4] Behrooz Rohani, et al., "Application of a Perceptual Speech Quality Metric for
Link Adaptation in Wireless Systems," in 1st International Symposium on
Wireless Communication Systems, Mauritius, Sept 2004, pp. 260-264.
[5] S. Mohammed, et al., "Integrating Network Measurements and Speech Quality
Subjective Scores for Control Purposes," in 20th Annual Joint Conference of the
IEEE Computer and Communication Societies INFOCOM 2001, Anchorage,
Alaska, USA, April 2001, pp. 641-649.
[6] W.H. Woodwall and D. C. Montgomery, "Research Issues and Ideas in
Statistical Process Control," Journal of Quality Technology, vol. 31, pp. 376-
386, 1999.
[7] E.S Page, "Continuous Inspection Schemes," Biometrika, vol. 41, pp. 100-115,
1954.
134
[8] Perceptual Evaluation of Speech Quality (PESQ), An Objective Method for End-
to-End Speech Quality Assessment of Narrow Band Telephone Networks and
Speech Codecs, ITU-T Recommendation P.862, Feb 2001.
[9] Evaluation of Speech Quality (PESQ), and Objective Method for End-to end
Speech Quality Assessment of Narrow-band Telephone Networks and Speech
Codecs, ITU-T Recommendation P.862, Feb. 2001.
[10] D. M. Novakovic and M. L. Dukic, "Evolution of the Power Control Techniques
for DS-CDMA Toward 3G Wireless Communication Systems," IEEE
Communications Surveys & Tutorials, vol. 3, pp. 2-15, 2000.
[11] S. Nanda, et al., "Adaptation Techniques in Wireless Packet Data Services,"
IEEE Comm. Magazine, pp. 54-64, Jan 2000.
[12] S. Pennock, "Accuracy of the Perceptual Evaluation of Speech Quality (PESQ)
Algorithm," in Proc. of MESAQIN - The Measurement of Speech and Audion
Quality in Networks, Prague, Czech Republic, Jan 2002.
[13] H. Hosseini, et al., "Objective Characterization of Voice Service Quality in
Wideband CDMA," in IEEE VTC Conference, Rhodes, Greece, May 2001, pp.
2708-2711.
[14] A. W. Rix, "Perceptual Speech Quality Assessment - A Review," in IEEE
Conference on Acoustics, Speech, and Signal Processing, Montreal, Quebec,
Canada, May 2004, pp. 1056-1059.
[15] A. W. Rix, et al., "Objective Assessment of speech and Audio Quality -
Technology and Applications," IEEE Transactions on Audio, Speech, and
Language Processing, vol. 14, pp. 1890-1901, 2006.
[16] E. Zwicker and H. Fastl, Psychoacoustics: Facts and Models, Second Updated
ed. Heidelberg: Springer, 1999.
135
[17] T. Dimauro. (2002, Spring). Physics of Speech, Hearing, and Sound. Available:
http://sdsu-physics.org/physics201/physics201.html
[18] J. G. W. Bernstein and A. J. Oxenham, "The Relationship Between Frequency
Selectivity and Pitch Discrimination: Effects of Stimulus Level," The Journal of
the Acoustical Society of America, vol. 120, pp. 3916-3928, 2006.
[19] K. Suresh, et al., "Direct MDCT Domain Psychoacoustic Modeling," in
Symposium on Signal Processing and Information Technology, 2007 IEEE
International Cairo, Egypt, Dec 2007, pp. 742-747.
[20] H. Fletcher, "Auditory Patterns," Reviews of Modern Physics, vol. 12, p. 47,
1940.
[21] J. V. Tobias, Foundation of Modern Auditory Theory, vol. 1: Academic Press,
1970.
[22] M. Bosi and R. E. Goldberg, Introduction to Digital Audio Coding and
Standards. Boston: Kluwer Academic Publishers, December 2002.
[23] F. Harvey, "The Relation Between Loudness and Masking," The Journal of the
Acoustical Society of America, vol. 7, p. 238, 1936.
[24] M. Krasner, "The Critical Band Coder--Digital Encoding of Speech Signals
Based on the Perceptual Requirements of the Auditory System," in IEEE
International Conference on Acoustics, Speech, and Signal Processing (ICASSP
'80), Denver, Colorado, USA, April 1980, pp. 327-331.
[25] Methods for Subjective Determination of Transmission Quality, ITU-R
Rocemmendation P.800, August 1996.
[26] M. Karjalainen, "A New Auditory Model for the Evaluation of Sound Quality of
Audio Systems," in IEEE International Conference on Acoustics, Speech, and
136
Signal Processing (ICASSP '85), Tampa, Florida, USA, March 1985, pp. 608-
611.
[27] Schuyler R. Quackenbush, et al., Objective Measures of Speech Quality. New
York: Prentice Hall, 1998.
[28] S. Voran, "Objective Estimation of Perceived Speech Quality. I. Development of
the Measuring Normalizing Block Technique," IEEE Transactions on Speech
and Audio Processing, vol. 7, pp. 371-382, 1999.
[29] S. Wang, et al., "An Objective Measure for Predicting Subjective Quality of
Speech Coders," IEEE Journal on Selected Areas in Communications, vol. 10,
pp. 819-829, June 1992.
[30] R. Mannel, "The Perceptual and Auditory Implications of Parametric Scaling in
Synthetic Speech," Ph.D, Dept of Linguistic, Macquarie University, Sydney,
1994.
[31] J. O. Smith, III and J. S. Abel, "Bark and ERB Bilinear Transforms," IEEE
Transactions on Speech and Audio Processing, vol. 7, pp. 697-708, 1999.
[32] M. P. Hollier, et al., "Error Activity and Error Entropy as a Measure of
Psychoacoustic Significance in the Perceptual Domain," IEEE Proceedings on
Vision, Image and Signal Processing, vol. 141, pp. 203-208, 1994.
[33] J.G. Beereds and J. A. Stemerdink, "A Perceptual Audio Quality Measure Based
on a Psychoacoustic Sound Representation," Journal of the Audio Engineering
Society, vol. 40, pp. 963-974, Dec 1992.
[34] B. Pailard, et al., "PERCEVAL: Perceptual Evaluation of the Quality of Audio
Signals," Journal of the Audio Engineering Society, vol. 40, pp. 21-31, Jan 1992.
[35] C. Colomes, et al., "A Perceptual Model Applied to Audio Bit-Rate Reduction,"
Journal of the Audio Engineering Society, vol. 43, pp. 223-240, April 1995.
137
[36] J.G. Beerends and J. A. Stemerdink, "A Perceptual Speech Quality Measure
Based on a Psycho Sound Reperesentation," Journal of the Audio Engineering
Society, vol. 42, pp. 115-123, November 1994.
[37] A. W. Rix and M. P. Hollier, "The Perceptual Analysis Measurement System for
Robust End-to-End Speech Quality Assessment," in Proceedings IEEE
International Conference on Acoustics, Speech, and Signal Processing ( ICASSP
'00), Istanbul, Turkey, June 2000, pp. 1515-1518
[38] The E-model, a Computational Model for Use in Transmission Planning, ITU-T
Recommendation G. 107, July 2002.
[39] L. Carvalho, et al., "An E-model Implementation for Speech Quality Evaluation
in VoIP Systems," in Proceedings 10th IEEE Symposium on Computers and
Communications ( ISCC 2005) Cartegena, Spain, June 2005, pp. 933-938.
[40] D. S. Kim, "ANIQUE: An Auditory Model for Single-Ended Speech Quality
Estimation," IEEE Transactions on Speech and Audio Processing, vol. 13, pp.
821-831, 2005.
[41] K. Doh-Suk and A. Tarraf, "Perceptual Model for Non-intrusive Speech Quality
Assessment," in Proceedings IEEE International Conference on Acoustics,
Speech, and Signal Processing (ICASSP '04), 2004, pp. iii-1060-3.
[42] Single-ended Method for Objective Speech Quality Assessement in Narrow-band
Telephony Applications, ITU-T Recommendation P.563, May 2004.
[43] A. P. Markopoulou, et al., "Assessment of VoIP Quality Over Internet
Backbones," in Proceedings Twenty-First Annual Joint Conference of the IEEE
Computer and Communications Societies (INFOCOM 2002), New York, USA,
June 2002, pp. 150-159.
138
[44] A. Takahashi, et al., "Objective Assessment Methodology for Estimating
Conversational Quality in VoIP," IEEE Transactions onAudio, Speech, and
Language Processing, vol. 14, pp. 1984-1993, 2006.
[45] S. Moller and G. Berger, "Describing Telephone Speech Codec Quality
Degradations by Means of Impairment Factors," Journal of the Audio
Engineering Society, vol. 50, pp. 667-680, September 2002.
[46] S. C. Chen, et al., "On Distributed Power Control for Radio Networks," in IEEE
International Conference on Communications (ICC '94) 'Serving Humanity
Through Communications', New Orleans, Louisiana, USA, May 1994, pp. 1281-
1285
[47] Harri Holma and A. Toskala, WCDMA for UMTS-HSPA Evolution and LTE, 4
ed. Chichester: John Wiley & Sons Ltd, 2007.
[48] W. Qiang, "Performance of Optimum Transmitter Power Control in CDMA
Cellular Mobile Systems," IEEE Transactions on Vehicular Technology, vol. 48,
pp. 571-575, 1999.
[49] W. Qiang, "Optimum Transmitter Power Control in Cellular Systems with
Heterogeneous SIR Thresholds," IEEE Transactions on Vehicular Technology,
vol. 49, pp. 1424-1429, 2000.
[50] R. Prasad and T. Ojanpera, "A Survey on CDMA: Evolution Towards Wideband
CDMA," in Proceedings IEEE 5th International Symposium on Spread
Spectrum Techniques and Applications, Sun City, South Africa, Sept 1998, pp.
323-331.
[51] A. Sampath, et al., "On Setting Reverse Link Target SIR in a CDMA System,"
in IEEE 47th Vehicular Technology Conference, Pheonix, Arizona, USA, May
1997, pp. 929-933.
139
[52] M. P. J. Baker and T. J. Mouslsley, "Power control in UMTS Release '99," in
First International Conference on (Conf. Publ. No. 471) 3G Mobile
Communication Technologies, London, UK, March 2000, pp. 36-40.
[53] R. Tanner and J. Woodards, WCDMA Requirement and Practical Design, 3rd
ed.: Chichester : John Wiley and Sons, 2004.
[54] H. Axen, "Power Control in Cellular Mobile Telephone Systems (in Swedish),"
Erricson Radio Systems, 1990.
[55] H. Axen, "Uplink C/I as Control Parameter for Mobile Station Power Control (in
Swedish)," Erricson Radio Systems, 1990.
[56] J. Zander, "Distributed Cochannel Interference Control in Cellular Radio
Systems," IEEE Transactions on Vehicular Technology, vol. 41, pp. 305-311,
1992.
[57] J. Zander, "Performance of Optimum Transmitter Power Control in Cellular
Radio Systems," IEEE Transactions on Vehicular Technology, vol. 41, pp. 57-
62, 1992.
[58] G. J. Foschini and Z. Miljanic, "A Simple Distributed Autonomous Power
Control Algorithm and Its Convergence," IEEE Transactions on Vehicular
Technology, vol. 42, pp. 641-646, 1993.
[59] S. A. Grandhi, et al., "Distributed Power Control in Cellular Radio Systems,"
IEEE Transactions on Communications, vol. 42, pp. 226-228, 1994.
[60] S. A. Grandhi and J. Zander, "Constrained Power Control in Cellular Radio
Systems," in IEEE 44th Vehicular Technology Conference, Stockholm, Sweden,
June 1994, pp. 824-828.
140
[61] F. Berggren, et al., "A Generalized Algorithm for Constrained Power Control
with Capability of Temporary Removal," IEEE Transactions on Vehicular
Technology, vol. 50, pp. 1604-1612, 2001.
[62] M. Rasti, et al., "Improved Distributed Power Control Algorithms with Gradual
Removal in Wireless Networks," in 14th European Wireless Conference (EW
2008) Prague, Czech Republic, June 2008, pp. 1-5.
[63] M. Rasti, et al., "A Distributed and Efficient Power Control Algorithm for
Wireless Networks," in IEEE 19th International Symposium onPersonal, Indoor
and Mobile Radio Communications (PIMRC 2008), Cannes, France, Sept 2008,
pp. 1-6.
[64] V. Vanghi, et al., The cdma2000 Systems for Mobile Communications. New
Jersey: Prentice Hall, 2004.
[65] J. P. Castro, The UMTS Network and Radio Access Technology: Air Interface
Techniques for Future Mobile Systems. Chichester: John Wiley and Sons, 2001.
[66] A. M. Viterbi and A. J. Viterbi, "Erlang Capacity of a Power Controlled CDMA
System," in Proceedings IEEE International Symposium on Information Theory,
San Antonio, Texas, USA, Jan 1993, pp. 254-254.
[67] Technical Specification Group Access Nertowrk; Physical Layer procedures
(FDD) (Release 6), V 6.2.0, June 2004.
[68] D. C. Montgomery, Introduction to Statistical Quality Control. New York,
1996.
[69] George Box and A. Luceno, Statistical Control by Monitoring and Feedback
Adjustment: John Wiley & Son, 1997.
[70] A. Hossain, et al., "Statistical Process Control of an Industrial Process in Real
Time," IEEE Transactions on Industry Applications, vol. 32, pp. 243-249, 1996.
141
[71] A. Cinar and C. Undey, "Statistical Process and Controller Performance
Monitoring. A Tutorial on Current Methods and Future Directions," in
Proceedings of the American Control Conference San Diego, California, USA,
June 1999, pp. 2625-2639.
[72] Ming T. Tham. (1997, An Introduction to SPC. Available:
http://lorien.ncl.ac.uk/ming/spc/spc0.htm
[73] W. A. Shewhart, "Quality Control Charts," Bell System Technical Journal, vol.
5, pp. 593-602, 1926.
[74] M. E. Camargo, et al., "Statistical Quality Control: A Case Study Research," in
4th IEEE International Conference on Management of Innovation and
Technology (ICMIT 2008), Bangkok, Thailand, Sept 2008, pp. 746-750.
[75] R. E. Mohammad Abaii, "Transmission Power Control Using Perceptual Quality
Metrics," in Preceeding on 14th IEEE Personal Indoor and Mobile Radio
Communications (PIMRC 2003), Beijing, China, Sept 2003, pp. 2317-2321.
[76] A.R Prasad, et al., "Perceptual Quality Measurement and Control: Definition,
Application and Performance," in 4th International Symposium on Wireless
Personal Multimedia Communication (WPMC'01), Aalborg, Denmark, Sept
2001, pp. 553-556.
[77] ITU-T Coded-Speech Database, ITU-T Recommendation, February 1998.
[78] Mandatory Speech Codec Speech Processing Functions; Adaptive Multi-rate
(AMR) Speech Codec Frame Structure (Release 6), 3GPP TS 26.101 V6.0.0,
September 2004.
[79] Mandatory Speech Codec Speech Processing Functions; Interface to lu, Uu and
Nb (Release 6), 3GPP TS 26.102 V6.0.0, September 2004.
142
[80] B. Rohani and H. J. Zepernick, "An Efficient Method for Perceptual Evaluation
of Speech Quality in UMTS," in Proceedings International Conference on
Multimedia Communications System, Montreal, Canada, Aug 2005, pp. 185-190.
[81] B. Rohani and H. J. Zepernick, "Frame Erasure Pattern Feedback for Real-time
Perceptual Quality Estimation," in Proceedings of the Joint Conference of the
Fourth International Conference on Information, Communications and Signal
Processing and Fourth Pacific Rim Conference on Multimedia, Singapore, Dec
2003, pp. 110-113.
[82] B. Rohani and H. J. Zepernick, "Feedback Method for Real-time Perceptual
Quality Estimation," Electronics Letters, vol. 40, pp. 913-915, 2004.
[83] AMR Speech Codec Frame Structure, 3G TS 26.10, March 2002.
[84] AMR Speech Codec; Error Concealment of Lost Frames, 3G TS 26.091, March
2001.
[85] Ewan and W.D., "When and How to Use Cusum Chart," Technometrics, vol. 5,
pp. 1-22, 1963.
[86] J. M. Lucas, "The Design and Use of V-mask Control Scheme," Journal of
Quality Technology, vol. 8, pp. 1-12, 1976.
[87] F. F. Gan, "Joint Monitoring of Process Mean and Variance Using Exponentially
Weighted Moving Average Control Charts," Technometrics, vol. 37, pp. 446-
453, 1995.
[88] D. M. Hawkins, "Self-Starting Cusum Charts for Location and Scale," Journal
of the Royal Statistical Society. Series D (The Statistician), vol. 36, pp. 299-316,
1987.
[89] W. H. Woodall and B. M. Adams, "The Statistical Design of Cusum Charts,"
Quality Engineering, vol. 5, pp. 559 - 570, 1993.
143
[90] Bissel and A.F., "Cusum Techniques for Quality Control," Applied Statistics,
vol. 18, pp. 1-30, 1969.
[91] W. Zhang and Y. Mei, "A CUSUM Chart Using Absolute Sample Values to
Monitor Process Mean and Variance," in IEEE International Conference on
Industrial Engineering and Engineering Management (IEEM 2009), Hong
Kong, Dec 2009, pp. 414-418.
[92] J. D. Healy, "A Note on Multivariate CUSUM Procedure," Technometrics, vol.
29, pp. 409-412, Nov. 1987.
[93] A. L. Goel and S. M. Wu, "Economically Optimum Design of Cusum Charts,"
Management Science, vol. 19, pp. 1271-1282, 1973.
[94] R. Gerlach, et al., "Diagnostics for Time Series Analysis," Journal of Time
Series Analysis, vol. 20, pp. 309-330, 1999.
[95] G. A. Barnard, "Control Charts and Stochastic Processes," Journal of the Royal
Statistical Society. Series B (Methodological), vol. 21, pp. 239-271, 1959.
[96] J. M. Lucas, "A modified V Mask Control Scheme," Technometrics, vol. 15, pp.
833-847, 1973.
[97] L.A Jones, et al., "The Run Length Distribution of the CUSUM with Estimated
Parameters," Journal of Quality Technology, vol. 36, pp. 95-108, Jan 2004.
[98] Michael J. Cybrynski, et al. (2010, 3 July). Defining the V-Mask for a Two-
Sided Cusum Scheme (Second ed.). Available:
http://www.jmu.edu/docs/sasdoc/sashtml/qc/chap12/sect16.htm
[99] N. L. Johnson, "A Simple Theoretical Approach to Cumulative Sum Control
Charts," Journal of the American Statistical Association, vol. 56, pp. 835-840,
1961.
144
[100] S. W. Roberts, "Control Chart Tests Based on Geometric Moving Averages,"
Technometrics, vol. 42, pp. 97-101, 2000.
[101] H.-Y. Wang, "An EWMA for Monitoring Stationary Autocorrelated Process," in
International Conference on Computational Intelligence and Software
Engineering (CiSE 2009), Wuhan, China, Dec 2009, pp. 1-4.
[102] M. Khoo and A. Atta, "An EWMA Control Chart for Monitoring the Mean of
Skewed Populations Using Weighted Variance," in IEEE International
Conference on Industrial Engineering and Engineering Management (IEEM
2008), Singapore, Dec 2008, pp. 218-223.
[103] D. M. Hawkins, Olwell, David H., Cumulative Sum Charts and Charting for
Quality Improvement. New York: Springer-Verlag, 1998.
[104] G.E. Box, et al., Time Series Analysis: Forecasting and Control: Prentice Hall
PTR, 1994.
[105] M. D.C., et al., Forecasting and Time Series Analysis, 2nd ed. New York:
McGraw-Hill, 1990.
[106] J. S. Hunter, "The Exponetially Weighted Moving Average," Journal of Quality
Technology, vol. 18, pp. 203-210, 1986.
[107] J. S. Hunter, "A One-point Plot Equaivalent to the Shewhart Chart with Western
Electric Rules," Quality Engineering, vol. 2, pp. 13 - 19, 1989.
[108] "Perceptual Evaluation of Speech Quality (PESQ), An Objective Method for
End-to-End Speech Quality Assessment of Narrow Band Telephone Networks
and Speech Codecs," I.-T. R. P.862, Ed., ed: , 2001.
[109] 3GPP Technical Report 25.101 V4.1.0, "Channel Coding and Multiplexing
Example (Release 4)," June June 2001.
145
[110] Channel Coding and Multiplexing Example (Release 4), 3GPP Technical Report
25.101 V4.1.0, June 2001.
[111] Simon R. Saunders, Antennas and Propagation for Wireless Communication
Systems: Chichester: John Wiley & Sons Ltd, 1999.
[112] User Equipment (UE) Radio Transmission and Reception (FDD) (Release 6),
3GPP Technical Specification 25.101 V6.10.0, Dec 2005.
[113] Guidelines for Evaluation of Radio Transmission Technologies for IMT-2000,
ITU-R Recommendation M.1225, Feb 1977.
[114] W. Jakes, Microwave Mobile Communication: New York: John Wiley and Sons,
1978.
[115] K.S. Gilhousen, et al., "On the Capacity of a Celullar CDMA system," IEEE
Trans. Veh. Technology, vol. 40, pp. 303-312, May 1991.
[116] M.C. Jeruchim, et al., Simulation of Communication Systems, Modeling,
Methodolgy and Techniques, 2nd ed.: New York: Kluwer Academic, 2000.
[117] W. R.Rice, "Analyzing Tables of Statistical Tests," Evolution, vol. 43, pp. 223-
225, Jan 1989.
[118] L. Van Brackle and M. R. Reynolds, "EWMA and CUSUM Control Charts in
the Presence of Correlation," Communications in Statistic-Simulation and
Computation, vol. 26, pp. 979-1008, 1997.
[119] De Vargas V.D.C.C., et al., "Comparative Study of the Performance of the
CUSUM and EWMA Control Charts," Computers and Industrial Engineering,
vol. 46 pp. 707-724, 2004.