CHAPTER 2
VIDEO TRANSMISSION IN WLAN WITH UNEQUAL
ERROR PROTECTION
2.1 AN OVERVIEW OF DIGITAL VIDEO
The analog video signal consists of a sequence of video frames. The
video frames are generated at a fixed frame rate of 30 frames/s in the National
Television Standards Committee (NTSC) format. To obtain a digital video
signal the analog video signal is passed to a digitizer. The digitizer samples
and quantizes the analog video signal. Each sample corresponds to a pixel.
The most common digital frame formats are Common Intermediate Format
(CIF) with 352 × 288 pixels (i.e., 352 pixels in the horizontal direction and
288 pixels in the vertical direction), Source Intermediate Format (SIF) with
352 × 240 pixels, and Quarter CIF (QCIF) with 176 × 144 pixels. In all three
frame formats, each video frame is divided into three components: the
luminance component (Y) and the two chrominance components, hue (U) and
saturation/intensity (V). Since the human eye is less sensitive to color
information than to luminance information, the chrominance components are
sampled at a lower resolution. Typically, each chrominance component is
sampled at half the resolution of the luminance component in both the
horizontal and vertical directions; this is referred to as 4:2:0 chroma
subsampling. In the QCIF frame format, for instance, there are 176 × 144
luminance samples and 88 × 72 samples of each chrominance component in each
video frame when 4:2:0 chroma subsampling is used. Finally, each sample is
quantized; typically, 8 bits are used per sample.
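Putting the QCIF numbers above together, the raw bit rate before compression follows directly; this is an illustrative calculation (the function name is not from the text):

```python
FRAME_RATE = 30      # frames/s (NTSC)
BITS_PER_SAMPLE = 8  # 8-bit quantization

def qcif_frame_bits():
    """Bits per raw QCIF frame: one 176x144 luminance plane plus
    two 88x72 chrominance planes, 8 bits per sample."""
    y_samples = 176 * 144        # luminance
    c_samples = 88 * 72          # per chrominance component
    return (y_samples + 2 * c_samples) * BITS_PER_SAMPLE

raw_bitrate = qcif_frame_bits() * FRAME_RATE  # bits per second, uncompressed
```

At 304128 bits per frame this is roughly 9.1 Mbit/s before compression, which is why video compression is indispensable for transmission over WLAN.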
2.2 VIDEO QUALITY PERFORMANCE METRICS AND
REQUIREMENTS
The important metrics for evaluating video transmission are the PSNR,
the end to end packet delay and the delay jitter.
2.2.1 Peak Signal to Noise Ratio
As in (Klaue et al 2003), digital video quality measurements must
be made based on the perceived quality of the actual video being received by
the users of the digital video system, because the impression of the user is
what counts in the end.
There are basically two approaches to measure digital video quality,
namely subjective quality measures and objective quality measures.
Subjective quality metrics capture the crucial factor, namely the impression
of the user watching the video, but they are costly to obtain because of the
time, manpower and special equipment they require. Such methods are
described in detail by the International
Telecommunication Union (ITU-R, 2000), (ITU-T, 1996), American National
Standards Institute (ANSI ) (ANSI T1.801.01/02-1996) and MPEG (Moving
Picture Experts Group) (ISO-IEC/JTC1/SC29/WG11.1996). The human
quality impression usually is given on a scale from 5 (best) to 1 (worst) as in
Table 2.1. This scale is called Mean Opinion Score (MOS).
Table 2.1 Subjective quality and impairment scale
Scale Quality Impairment
5 Excellent Imperceptible
4 Good Perceptible, but not annoying
3 Fair Slightly annoying
2 Poor Annoying
1 Bad Very annoying
Since complex and expensive subjective quality measurements are not
always feasible, objective metrics have been developed to emulate the
quality impression of the human visual system. In (Wolf and Pinson 2002),
there is an exhaustive discussion of various objective metrics and their
performance compared to subjective tests. The most popular method, however,
is the calculation of the PSNR on a frame-by-frame basis. The PSNR
compares the maximum possible signal energy to the noise energy, a
comparison that has been shown to correlate well with subjective quality
perception. The PSNR is the ratio between the maximum possible power of
a signal and the power of the corrupting noise that affects the fidelity of
its representation. It is usually expressed on the decibel (dB) scale.
The following equation defines the PSNR between the luminance
component Y of a source image S and a destination image D of size f1 x f2:

PSNR = 10 log10 ( MAX_S^2 / MSE )                (2.1)

where MSE is the mean square error between the source and the destination
images, given by

MSE = (1 / (f1 f2)) * SUM(i=0..f1-1) SUM(j=0..f2-1) [ S(i,j) - D(i,j) ]^2    (2.2)

Here, MAX_S is the maximum possible pixel value of the image.
When the pixels are represented using 8 bits per sample, this value is 255.
More generally, when samples are represented using bs bits per
sample, MAX_S = 2^bs - 1.
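Equations (2.1) and (2.2) translate directly into code; the sketch below operates on luminance planes given as 2-D lists and is illustrative, not part of the evaluation tool chain described later:

```python
import math

def mse(src, dst):
    """Mean square error between two equal-size luminance planes,
    given as 2-D lists of pixel values (Equation 2.2)."""
    f1, f2 = len(src), len(src[0])
    total = sum((src[i][j] - dst[i][j]) ** 2
                for i in range(f1) for j in range(f2))
    return total / (f1 * f2)

def psnr(src, dst, bits_per_sample=8):
    """PSNR in dB between source and destination frames (Equation 2.1)."""
    max_s = 2 ** bits_per_sample - 1          # 255 for 8-bit samples
    err = mse(src, dst)
    if err == 0:
        return float("inf")                   # identical frames
    return 10 * math.log10(max_s ** 2 / err)
```

For identical frames the MSE is zero and the PSNR is reported as infinite, which is the usual convention.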
When the focus is on the distortion introduced by the network alone,
that is, on comparing the received (possibly distorted) video with the
undistorted video sent, the PSNR of the encoded video is compared with that
of the received video frame by frame. Another possibility is to calculate
the MOS first and then the percentage of frames with a MOS worse than that
of the sent (undistorted) video; this method has the advantage of showing
clearly the distortion caused by the network. A possible conversion between
PSNR and MOS (Klaue et al 2003) is shown in Table 2.2.
Table 2.2 Possible PSNR to MOS conversion
PSNR (dB) MOS
>37 5 (Excellent)
31-37 4 (Good)
25-31 3 (Fair)
20-25 2 (Poor)
<20 1 (Bad)
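The mapping of Table 2.2 can be expressed as a small helper; treating the lower bound of each band as inclusive is an assumption, since the table leaves the boundary values ambiguous:

```python
def psnr_to_mos(psnr_db):
    """Map a per-frame PSNR (dB) to the MOS bands of Table 2.2."""
    if psnr_db > 37:
        return 5   # Excellent
    if psnr_db >= 31:
        return 4   # Good
    if psnr_db >= 25:
        return 3   # Fair
    if psnr_db >= 20:
        return 2   # Poor
    return 1       # Bad
```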
2.2.2 End to End Delay
In video transmission systems, not only the actual loss but also the
end to end delay of the system matters for the perceived video quality.
The end to end delay is the time elapsed from when the image is captured by
the video camera at the sender side until it is displayed on the monitor at
the receiver. In a typical video conferencing application, as shown in
(Baldi and Ofek 2000), the end to end delay is modeled with four components
whose values depend on the system configuration.
Processing Delay (PD) is introduced on both the sender and
receiver sides; it is the time consumed for grabbing, digitizing
and compressing the video at the sender end, and for
decompression and display at the receiver.

Network Delay (ND) is the time needed to move data units from
the source to the destination; it includes the protocol processing
delay at the sender and receiver and the propagation delay.

Processing Resynchronization delay (PR) cancels the delay
variation in generating the compressed video units.

Network Resynchronization delay (NR) cancels the variations of
the delay experienced in the network (e.g., the delay jitter due to
queuing in the network nodes).

Therefore the total end to end delay, given by the sum of the four
components (PD + ND + PR + NR), should be constant and remain within a
certain limit.
The typical end to end delay as per the International Telegraph and
Telephone Consultative Committee (CCITT) G.114 delay recommendations,
and its impact on quality of service, is given in Table 2.3.
Table 2.3 Typical end to end delay requirements for multimedia traffic
CCITT G.114 Delay Recommendations
One-way Delay Characterization of Quality
0-150 ms Acceptable for most user applications
150-400 ms May impact some applications
Above 400 ms Unacceptable for general network planning purposes
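The G.114 bands of Table 2.3 can likewise be encoded as a simple classifier (band edges assumed inclusive on the lower side):

```python
def g114_band(one_way_delay_ms):
    """Characterize one-way delay per the CCITT G.114 bands of Table 2.3."""
    if one_way_delay_ms <= 150:
        return "acceptable for most user applications"
    if one_way_delay_ms <= 400:
        return "may impact some applications"
    return "unacceptable for general network planning purposes"
```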
2.2.3 Delay Jitter
The variation in delay is called delay jitter. Digital video always
consists of frames that are to be displayed at a constant rate. Displaying a
frame before or after a defined time bound results in “jerkiness” (Wolf and
Pinson 2002). This issue is addressed by a technique called play-out
buffering. These buffers absorb the jitter introduced by network delivery
delays. A sufficiently large play-out buffer can compensate for any amount
of jitter: in the extreme case, the buffer holds the entire video and
display does not start until the last frame is received, which eliminates
any possible jitter at the cost of an additional delay equal to the entire
transmission time. The other extreme is a buffer capable of holding exactly
one frame; jitter then cannot be eliminated, but no additional delay is
introduced.
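The delay/jitter trade-off of the play-out buffer can be illustrated with a toy model: frame i is due for display at t0 + d + i*T, where d is the buffer delay and T the frame period (all names here are illustrative, not from the text):

```python
def late_frames(arrival_times, frame_period, buffer_delay):
    """Count frames that miss their display deadline.

    Frame i is scheduled for display at t0 + buffer_delay + i*frame_period,
    where t0 is the arrival time of the first frame.  A larger buffer_delay
    absorbs more jitter at the cost of extra start-up latency.
    """
    t0 = arrival_times[0]
    return sum(1 for i, t in enumerate(arrival_times)
               if t > t0 + buffer_delay + i * frame_period)
```

With arrival times [0, 40, 66, 110, 133] ms and a 33.3 ms frame period, a zero-delay buffer misses two deadlines while a 15 ms buffer misses none, at the cost of 15 ms of added latency.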
According to (Horowitz 2009), natural conversation begins to break
down when the end to end delay exceeds 250 ms. Today, most video
conferencing end points running a point-to-point conference call at
30 frames/s have an end to end delay of approximately 200 ms (including
PD, NR and PR) excluding the network delay, which can therefore be at most
about 50 ms if the threshold is to be met.
2.3 IMPLEMENTATION OF VIDEO TRANSMISSION IN NS-2
2.3.1 Video Trace Generation
The YUV information of each video is captured using a personal
computer video capture card and the bttvgrab (v. 0.15.10) software, and is
stored on disk. The YUV information is grabbed at a frame rate of
30 frames/s in the QCIF format with 4:2:0 chrominance subsampling and 8-bit
quantization. The QCIF format is chosen because it allows traces to be
generated for evaluating transmission performance in wireless networking
systems: handheld devices of next-generation wireless systems are expected
to have a screen size that corresponds to the QCIF video format. The stored
YUV frame sequences are used as input for both the MPEG-4 encoder and the
H.263/H.264 AVC (Advanced Video Coding) encoder, as shown in Figure 2.1,
and the trace file of the video is generated (Ke et al 2008).
Figure 2.1 Generation of video trace file
2.3.2 Video Streaming Application
To optimize the transmission of video over a network, wired or
wireless, an application is needed that sits above the other layers to read
the video file and produce the encoded video packets. For this purpose, a
new application that reads an encoded video file and delivers it to the
underlying layer is introduced by the tool set
(http://140.116.72.80/~smallko/ns2/MultimediaComm_en.htm) that combines
EvalVid (Klaue et al 2003) and NS-2 (Network Simulator-2)
(http://www.isi.edu./ns/nam/ns). It reads the file frame by frame and
transfers each frame to a new agent introduced in the transport layer. The
new agent divides the frames into UDP (User Datagram Protocol) packets
suitable for transmission over the network. The sink agent in the receiver
performs the reverse process (Ke et al 2008).
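The fragmentation step can be sketched as follows; the 1000-byte payload matches the value used later in the simulation setup, and the function name is illustrative:

```python
MAX_PAYLOAD = 1000  # bytes of video data per UDP packet

def fragment_frame(frame_bytes, max_payload=MAX_PAYLOAD):
    """Split one encoded video frame into payload-sized segments
    for transmission as individual UDP packets."""
    return [frame_bytes[i:i + max_payload]
            for i in range(0, len(frame_bytes), max_payload)]

segments = fragment_frame(bytes(2500))   # a hypothetical 2500-byte frame
# yields segments of 1000, 1000 and 500 bytes
```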
2.3.3 Network Simulation Agents
Three connecting simulation agents, namely MyTrafficTrace,
MyUDP, and MyUDPSink, are implemented between NS-2 and EvalVid.
These interfaces are designed either to read the video trace file or to generate
the data required to evaluate the delivered video quality.
MyTrafficTrace: The MyTrafficTrace agent is employed to
extract the frame type and the frame size of the video trace file
generated from the output of the video sender component of
EvalVid. Furthermore, this agent fragments the video frames
into smaller segments and sends these segments to the lower
UDP layer at the appropriate time according to the user settings
specified in the simulation script file.
MyUDP: MyUDP is an extension of the UDP agent. This new
agent allows users to specify the output file name of the sender
trace file and it records the timestamp of each transmitted
packet, the packet identity and the packet payload size. The task
of the MyUDP agent corresponds to the task that tools such as
tcpdump or windump perform in a real network environment.
MyUDPSink : MyUDPSink is the receiving agent for the
fragmented video frame packets sent by MyUDP. This agent
also records the timestamp, packet identity and payload size of
each received packet in the user specified file.
Figure 2.2 illustrates the QoS assessment framework for video
traffic enabled by the new tool set that combines EvalVid and NS-2.
Figure 2.3 illustrates the implementation of typical video streaming in
WLAN using NS-2.
Figure 2.2 QoS assessment frame work of EvalVid in NS-2
Figure 2.3 Implementation of video streaming in WLAN using NS-2
2.4 PROPOSED CROSS LAYER SCHEME
2.4.1 Introduction
The cross layer design is an evolving paradigm in the design of
wireless network architecture that takes into consideration the dependencies
and interaction among layers and supports optimization across layers. The
cross layer design is not a replacement for layered architecture but it enables
information exchange between the layers, thereby making the system more
adaptive. Future wireless networks need to support multimedia applications
such as media streaming, video conferencing and interactive video. The
classical layered network architecture struggles to support their QoS
requirements, such as throughput, delay and PSNR. This issue can be tackled
by cross layer design.
There is considerable ongoing research in this area. Mihaela Van
Der Schaar and Sai Shankar (2005) introduce a new fairness concept for
wireless multimedia systems that employs different cross layer strategies
combining the Application, MAC and PHY layers, and show its advantages over
existing resource allocation mechanisms used in wireline communications.
In (Ksentini et al 2006), H.264 wireless video transmission over IEEE
802.11 WLAN is proposed using a cross layer architecture that leverages
the H.264 error resilience scheme of data partitioning and the QoS features
of the IEEE 802.11e MAC protocol. In (Pollin et al 2007), a two phase
methodology has been proposed that resolves the sleep time trade-off across
the physical and link layers and schedules nodes at run time with a near
optimal energy efficient configuration in the solution space; it is applied
to MPEG-4 video transmission and video delivery with guaranteed QoS over
slow fading channels.
Since multimedia services require high PSNR and low end to end
delay, works that consider low end to end delay, low packet loss and error
protection for video packets were reviewed. Lu et al (2005) presented a
scheme for video streaming over WLAN in which the retry limit is made
adaptive with respect to the content of the video packet. The intention of
the authors is to prevent packets with high temporal correlation from being
dropped and thus to avoid errors propagating throughout the GOP. The
packets of the independently coded I (Intra coded) frames are retransmitted
with an infinite retry limit, and the algorithm is named Content Aware
Retry Limit. However, no error protection for the packets is considered,
and the infinite retry limit increases the buffering delay and hence the
end to end delay. In
the literature (Park and Wang 1998), the authors have proposed an adaptive
FEC mechanism to facilitate real time applications whose timing constraints
rule out the use of Automatic Repeat reQuest (ARQ) schemes. An adaptive FEC
scheme in which the degree of redundancy is adjusted as a function of
network status is analyzed and shown to outperform static, non-adaptive FEC
in a dynamic asynchronous transfer mode network. An unequal error
protection scheme, wherein the different layers of a scalable video coder
are coded using FEC with different degrees of redundancy to maintain
acceptable picture quality for Internet video streaming, is presented by
Horn et al (1999). Here the FEC is added at the packet level and it is
sender based.
Bajic (2007) proposes a sender based efficient cross layer FEC scheme for
wireless video multicasting that maintains the received video quality of
all users above a predetermined level. Based on feedback from each user
about the number of packets received, the sender calculates the number of
packets each user has lost and adjusts the FEC redundancy to reduce the
packet loss rate. The main drawback of sender based FEC schemes is the need
for feedback information and hence increased delay. An access point based
hybrid error recovery scheme that exploits the advantages of both ARQ and
FEC mechanisms is proposed by Qiao and Shin (2000) for wireless video
streaming with guaranteed QoS. In Lin et al (2006), an AP based Enhanced
Adaptive FEC (EAFEC) mechanism is proposed based on information from the
interface queue length and MAC layer retransmissions, and an improvement in
decodable frame rate is shown over the static FEC scheme. Similar work is
presented in (Lin et al 2008) for video transmission in WLAN. Though this
scheme proposes an efficient adaptive FEC mechanism for video delivery, it
does not consider the frame priority or the information content of the
frames, and offers uniform error protection to all frames.
From the literature survey, it was observed that none of the
previous works considered the video frame priority when adapting the error
protection of video packets for transmission in WLAN. Hence, in the present
research work, a cross layer optimization framework is proposed that
includes adaptation across the application, MAC and PHY layers. The
application layer packet prioritization mechanism, the AP queue length, and
the MAC layer retransmission time used for estimating the PHY layer channel
state, together with scheduling information, are jointly used to provide
more FEC for video packets of significant information content than for
packets of less importance. This helps realize optimized PSNR and delay
performance for video transmission. Hence the proposed cross layer strategy
is named the Unequal Error Protection scheme for video transmission in
WLAN.
2.4.2 Cross Layer Design Strategy for Video Transmission in WLAN
with Unequal Error Protection
2.4.2.1 Video compression and frame priority
The MPEG compression standard is one of the most widely used
schemes for the delivery of video. One of its major features is random
access capability. The frames are organized into a Group Of Pictures (GOP),
which is the smallest random access unit. A GOP consists of I, P and B
frames. Frames coded without any reference to past frames are called I
frames; they exploit no temporal correlation, so their compression rate is
low compared to frames that use temporal correlations for prediction. The
number of frames between two I frames is thus a tradeoff between
compression efficiency and random access convenience. To improve
compression efficiency, the MPEG scheme contains two other frame types, the
inter frames: Prediction coded (P) frames and Bidirectionally predictive
coded (B) frames. The P frames are coded using motion compensated
prediction from a preceding I or P frame. The B frames achieve a higher
level of compression by using motion compensated prediction from both past
and future I or P frames. In the proposed architecture, the application
layer sets the priority levels for the different frame types depending on
their contribution to perceived video quality. The I frame always has the
highest priority, and the next two levels of priority are assigned to the
P and B frames, respectively.
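The priority assignment just described can be written down directly; the numeric levels below are illustrative (lower value = higher priority):

```python
PRIORITY = {"I": 0, "P": 1, "B": 2}   # I highest, then P, then B

def gop_priorities(gop_pattern):
    """Priority level for each frame in a GOP pattern such as 'IBBPBBP'."""
    return [PRIORITY[frame_type] for frame_type in gop_pattern]
```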
2.4.2.2 Cross layer design: Building blocks
In this work, a set of building blocks for channel and application
adaptive wireless streaming applications is considered. They exploit two
well-known modules in wireless video communication.
The first module deals with error control, based on the fact that different parts of a video bit stream are of different importance. To ensure better video quality, two error correcting methods are generally used: the ARQ and FEC mechanisms (Elaoud and Ramanathan 1998). In the ARQ mechanism, the receiver requests the retransmission of erroneous packets from the sender after detecting an error. This increases the end to end round trip delay due to the retransmission of lost packets and is viewed as inefficient for delay sensitive video transmission. The FEC mechanism lets the sender transmit the original data together with redundant packets so that errors can be corrected at the receiver when they occur. The FEC mechanism is suitable for error correction in video transmission, as the round trip delay is avoided at the cost of the transmission time required for the redundant packets.

The FEC mechanism can be implemented at the byte level or the packet level, and is realized by block erasure codes that add redundant information. In an (n, k) block erasure code, the original data is divided into blocks of k original packets. The FEC encoder generates (n - k) redundant packets, adds them to the source packets, and transmits n encoded data packets, as shown in Figure 2.4. At the FEC decoder, the original packets can be reconstructed if at least k packets arrive at the receiver.
Figure 2.4 Application layer packet level FEC mechanism
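As a minimal concrete instance of an (n, k) block erasure code, a single XOR parity packet gives (n, k) = (k + 1, k): any one erased packet can be reconstructed from the k packets that do arrive. This is a sketch for illustration only; a deployed scheme would use a more general code:

```python
def xor_parity_encode(packets):
    """Encode k equal-length source packets into k+1 packets by appending
    one XOR parity packet -- a minimal (n, k) = (k+1, k) erasure code."""
    parity = bytes(packets[0])
    for p in packets[1:]:
        parity = bytes(a ^ b for a, b in zip(parity, p))
    return packets + [parity]

def xor_parity_decode(received):
    """Recover the k source packets when at most one of the k+1 packets
    was erased (erasures marked as None)."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("more than one erasure: not recoverable")
    if missing:
        size = len(next(p for p in received if p is not None))
        rec = bytes(size)                       # all-zero accumulator
        for p in received:
            if p is not None:
                rec = bytes(a ^ b for a, b in zip(rec, p))
        received[missing[0]] = rec              # XOR of survivors = lost packet
    return received[:-1]                        # drop the parity packet
```

Losing the parity packet itself costs nothing, and losing any one source packet is repaired by XOR-ing the survivors; two or more erasures exceed the code's capability, mirroring the "at least k packets must arrive" condition above.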
The second module exploits the differing importance of the packets:
adaptivity to channel conditions can be realized by dropping less important
packets while providing error protection for the important ones. A
packetization scheme is used so that FEC codes can be applied at the
application packet level, across different application packets rather than
at the bit level, thereby reducing delay at the receiver. The proposed
cross layer based unequal FEC scheme, referred to as the Unequal Error
Protection (UEP) scheme, is expected to outperform the conventional layered
architecture that applies uniform FEC adaptation to all frames (Lin et al
2006), referred to as the Equal Error Protection (EEP) scheme, under the
same or tighter constraints. The reason is that the adaptation is done at
the application layer, but with the granularity of the MAC and PHY layers.
2.4.2.3 Problem formulation
Let us consider that there are NP, NM, NN, NT and NA parameters
that can be considered in the PHY, MAC, Network, Transport and Application
layers, respectively, for cross layer adaptation. Some of the adaptation
parameters in each layer are given below.

Physical (P1 ... PNP): modulation rate (constellation size),
bandwidth, transmit and receive power, Signal to Noise Ratio
(SNR) and BER.

MAC (M1 ... MNM): scheduling mechanisms, admission control,
error correcting mechanisms, retry limit.

Network (N1 ... NNN): routing algorithms based on power
allocation, bandwidth and distance.

Transport (T1 ... TNT): error protection strategies, scheduling.

Application (A1 ... ANA): compression ratio, forward error
correction mechanism, throughput, delay, application priority.
A cross layer design problem can be formulated with the objective
of selecting the optimum combination that gives the required QoS. The joint
cross layer strategy is defined as in Equation (2.3) (Mihaela Van Der
Schaar and Sai Shankar 2005),

S = {P1 ... PNP, M1 ... MNM, N1 ... NNN, T1 ... TNT, A1 ... ANA}    (2.3)

leading to

NP * NM * NN * NT * NA                (2.4)

possible combinations.
The joint cross layer strategy defined in this work involves the Application, MAC and PHY layers. The channel status estimated from the MAC retransmission information (MRT), the AP current queue length (AQl) and the frame priority (AFrP) at the application layer are used in optimizing the number of FEC (AFEC) packets added at the application layer. The optimal solution is the one represented by the following equation,
Sopt(x) = Max Q[S(x)]                (2.5)

where x denotes a combination of MRT, AQl, AFrP and AFEC, and Q[S(x)]
represents the quality of video transmission, defined by the constraints

Delay[S(x)] <= Dmax
PSNR[S(x)] >= PSNRmin

where Dmax is the maximum permissible end to end delay, and
PSNRmin is the minimum required PSNR.
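Since the number of candidate strategies in Equation (2.4) is finite, the constrained maximization of Equation (2.5) can in principle be done by exhaustive search. The sketch below is a generic illustration with hypothetical callables, not the optimizer actually used in this work:

```python
def best_strategy(candidates, quality, delay, psnr, d_max, psnr_min):
    """Exhaustive search for Equation (2.5): maximize quality subject to
    delay <= d_max and psnr >= psnr_min; returns None if nothing is
    feasible."""
    feasible = [s for s in candidates
                if delay(s) <= d_max and psnr(s) >= psnr_min]
    return max(feasible, key=quality, default=None)
```

For example, a candidate with high quality but excessive delay is rejected in favor of the best candidate that satisfies both constraints.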
The block diagram of the proposed cross layer scheme is given in
Figure 2.5.
Figure 2.5 Block diagram of proposed Cross Layer Scheme
2.4.2.4 Implementation of UEP for video packets
During multimedia packet transmission, network congestion and the
limited amount of data that can be stored in queues lead to the need to
discard some of the transmitted packets. Network congestion may happen
whenever the overall traffic load exceeds the available network resources;
under such conditions, some of the packets to be transmitted are either
buffered or discarded by a traffic policer that monitors the network
status. On the other hand, when a node cannot access a shared transmission
medium,
outgoing packets are buffered in a queue that can easily become saturated
if the buffer is under-dimensioned with respect to the number of users.
Further, an excessive waiting time in the queues can make a video packet
obsolete, since it cannot be decoded in time; the video source decoder
therefore treats its late arrival as a loss. To control losses and
excessive delays, network elements and edge nodes need to adopt an
appropriate scheduling strategy that shapes the incoming and outgoing
traffic according to specific policies such as Integrated Services and
Differentiated Services, described in the NS-2 documentation
(http://www.isi.edu./ns/nam/ns). The main purpose of these schemes is to
prevent users/services that violate their traffic limitations from
jeopardizing the QoS of other connections.
At the edge of the network, each packet is classified and a service
class is assigned (e.g., two rate Three Colour Marker, Multi Protocol Label
Switching). This class determines how the network elements handle packets
along the data path, according to different scheduling and queue management
schemes such as DropTail, Random Early Detection and Weighted Random Early
Detection. For video applications, these strategies turn out to be crucial
to the video quality perceived by end users, since the significant amount
of information to be transmitted must be accurately controlled. Generally,
classifiers label packets according to their size and the buffer levels;
the video content is not taken into consideration. Recently, new techniques
have shown that a classifying strategy aware of the significance of each
packet in the decoding process improves the QoS experienced by the end
user. In the proposed scheme, a joint optimization is evolved that
considers both the content of the video packets and their service class,
constrained by the maximum end to end delay. As a consequence, the
relevance of each packet in the decoding process changes according to the
information it contains, and the adoption of an algorithm that adapts the
protection level to the packet type can significantly improve the quality
of the reconstructed sequence.
Video transmission in WLAN is considered for analysis in this work.
In infrastructure mode, when any wired or wireless node wants to send data
packets to another wireless node, the data must first be sent to the Access
Point, which then forwards the packets to the corresponding node. The AP is
therefore a good place for introducing the FEC mechanism to improve the
quality of video delivered over the network (Lin et al 2006, Qiao and Shin
2000). For efficient adaptive FEC, when a block of source packets is
received, the AP dynamically determines how many redundant FEC packets
(No_FEC) should be generated, based on the current network condition and
the frame type of the packets. The current interface queue length (Qlen),
calculated by the weighted moving average method with a weighting factor
QW, is a good indicator of network traffic load: when the network load is
high, queue occupancy is high, and vice versa. When queue occupancy is low
and the frame type is I, more redundant FEC packets can be generated; as
queue occupancy increases, the number of FEC packets is reduced
appropriately to avoid unnecessary network congestion. If the queue
occupancy is beyond a threshold level and/or the frame type is P or B, FEC
packets are not generated, to avoid congestion as well as to reduce the end
to end delay considerably.
The current packet retransmission time is a good indicator of
wireless channel status. This retransmission time (RT) is calculated using
the weighted moving average method with a weighting factor RW. When the
wireless channel is good, the number of retransmissions is small;
otherwise, it is large. When the wireless channel is experiencing more
losses and the frame type is I, more redundant FEC packets are generated;
when the channel is good and the frame type is I, fewer FEC packets are
generated. Here the channel status from the physical layer and the number
of retransmissions from the MAC layer are sent to the application layer as
cross layer information. The flow diagram for implementing UEP, that is,
priority based FEC in the application layer, is depicted in Figure 2.6.
Figure 2.6 Flow Diagram of UEP scheme
In the flow of Figure 2.6, the frame type is first added to the packet
header and the two estimates are updated by the weighted moving averages

Qlen = (1 - QW) * Curr_Q + QW * Qlen
RT = (1 - RW) * Curr_RT + RW * RT

For I frame packets, No_FEC = Max_FEC when Qlen < Threshold1;
No_FEC = Max_FEC * (Threshold2 - Qlen) / (Threshold2 - Threshold1) when
Threshold1 <= Qlen < Threshold2; and no FEC is added when
Qlen >= Threshold2. The channel estimate then adjusts the count: No_FEC is
scaled by {1 - (Threshold4 - RT) / (Threshold4 - Threshold3)} when
Threshold3 <= RT < Threshold4, left unscaled when RT >= Threshold4, and no
FEC is added when RT < Threshold3. P frame packets receive no FEC and are
transmitted or dropped according to the same queue and channel tests.
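One plausible reading of the flow of Figure 2.6 is sketched below. The thresholds are the case 1 values from Table 2.4; the weighting factors and some branch semantics (e.g., adding no FEC on a clean channel) are assumptions, since the text does not pin them down:

```python
MAX_FEC = 8          # maximum redundant packets per block
T1, T2 = 10, 40      # queue-length thresholds (Table 2.4, case 1)
T3, T4 = 4, 15       # retransmission thresholds (Table 2.4, case 1)
QW = RW = 0.5        # weighting factors -- assumed, not given in the text

def ewma(old, current, w):
    """Weighted moving average used for both Qlen and RT."""
    return (1 - w) * current + w * old

def fec_packets(frame_type, qlen, rt):
    """Redundant FEC packets for one block of video packets (UEP)."""
    if frame_type != "I":
        return 0                       # P/B frames get no protection
    if qlen >= T2 or rt < T3:
        return 0                       # congested queue or clean channel
    # Base count shrinks as the queue fills ...
    n = MAX_FEC if qlen < T1 else MAX_FEC * (T2 - qlen) / (T2 - T1)
    # ... and is scaled down while the channel estimate is still good.
    scale = 1.0 if rt >= T4 else 1 - (T4 - rt) / (T4 - T3)
    return round(n * scale)
```

For example, with a lightly loaded queue (Qlen = 5) and a bad channel (RT = 15), an I frame block gets the full 8 redundant packets, while at Qlen = 25 it gets 4.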
2.4.3 Simulation Setup and Results
The simulation to validate the proposed cross layer scheme is
carried out using Network Simulator NS-2.31. NS-2 is an event driven
simulator that works at the packet level. In this simulation, a wireless
network with one access point and four nodes is considered. One node is the
source, which has video files to transmit to the destination node attached
to the same access point; the two other nodes are involved in data
transmission and hence contribute background traffic. The destination node
moves randomly inside the region of the AP. An application is invoked to
record the sender trace file. Each frame is fragmented into 1000-byte
segments for transmission; the maximum packet length is 1028 bytes,
including the IP header (20 bytes) and UDP header (8 bytes). The video
packets are stored in a new file. The EvalVid (Ke et al 2008) framework for
video transmission in NS-2 is used.
A Highway video sequence (a racing car on a highway) and a Foreman video sequence, consisting of 2000 frames and 400 frames, respectively, in QCIF format (176 x 144 pixels), are chosen as the applications. A random uniform error model is incorporated to introduce errors at the physical layer so as to map realistic channel scenarios. In this study, the network end to end delay and the PSNR are considered as the evaluation parameters for video transmission under different channel induced packet error rates at the physical layer; here the error rate is taken as the transition probability of packets from good to bad. In (Li et al 2007), wireless AP queue capacities are shown to be packet based and to range from 40 packets to over 338 packets for various commercial and residential APs. In this simulation, the wireless AP queue capacity has therefore been taken as 50 packets of 1500 bytes, and the lower and higher threshold values of the queue length are chosen within this limit.

In the IEEE 802.11 MAC protocol, a frame lost due to random channel errors is retransmitted up to a certain limit. Allowing a higher retransmission limit increases the chance of successful packet transmission at the cost of longer packet transmission delays, while a lower limit results in a larger packet loss probability with smaller delays for the delivered packets; both packet loss and transmission delay have a negative impact on network performance (Bai et al 2010). Further, when the retransmission time is used as a measure of channel status and integrated with the adaptive FEC scheme, the selection of the lower and higher threshold values for this parameter becomes very important. Therefore, the simulation was run for different combinations of these threshold values at the worst error rate of 0.6. The improvement in average PSNR over the No Error Protection scheme and the average end to end delay were measured and tabulated in Table 2.4. Here the base EEP scheme, that is, without application frame prioritization, is used with a maximum of 8 FEC packets. From the tabulated results, case 1 is selected, since it provides the better performance in terms of improved PSNR and reduced end to end delay, and its thresholds are fixed for the performance evaluation of the application priority based UEP scheme. The parameters considered in the simulation are given in Table 2.5.
Table 2.4 Performance for different Threshold values
Case   Threshold1   Threshold2   Threshold3   Threshold4   Improvement in       Average end to
                                                           Average PSNR (dB)    end delay (s)
1      10           40           4            15           5.82                 0.07743
2      10           40           5            10           5.64                 0.07621
3      5            40           5            15           5.32                 0.07171
4      10           45           5            15           6.08                 0.07941
5      10           40           5            25           3.89                 0.06603
Table 2.5 Simulation parameters for the cross layer based UEP
Layer         Parameter                     Value
Application   AP Queue Type                 Drop tail
              Max Length (Capacity)         50
              Threshold1                    10
              Threshold2                    40
              FEC packets                   Minimum 0, Maximum 8
              Application video sequence    i. Foreman sequence of 400 frames
                                            ii. Highway sequence of 2000 frames
MAC           Protocol                      802.11b
              Bandwidth                     11 Mbps
              Short Retry Limit             7
              Long Retry Limit:
                Threshold3                  4
                Threshold4                  15
With the application video sequence being the Highway video, the average PSNR measured for the frames received at the application layer using EEP and UEP is shown in Figure 2.7. To demonstrate the significance of error protection and FEC adaptation, these measures are also compared with the No Error Protection (NEP) scheme. The results are obtained for various error rates. They show that the PSNR performance is severely affected when the error rate is high and no FEC is applied. The PSNR values improve significantly when the EEP scheme is implemented. The proposed UEP scheme shows a reduction in PSNR compared to EEP, but an improvement of almost 2.5 dB over NEP at an error rate of 0.6. The corresponding user perceived video quality is within the good range (31-37 dB) with reference to Table 2.2, even under the worst channel conditions.
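The PSNR figures reported throughout this section follow the standard definition over 8-bit samples introduced in Section 2.2.1. A minimal sketch of the computation, assuming a flattened luminance plane as input (the function name is illustrative):

```python
import math

def psnr(original, received, peak=255):
    """PSNR in dB between two equal-size 8-bit sample sequences,
    e.g. the flattened luminance plane of a QCIF frame."""
    mse = sum((o - r) ** 2 for o, r in zip(original, received)) / len(original)
    if mse == 0:
        return float('inf')     # identical frames: PSNR is unbounded
    return 10 * math.log10(peak ** 2 / mse)
```

For instance, a uniform error of 10 grey levels on every sample gives an MSE of 100 and hence a PSNR of about 28.1 dB, which falls just below the good range cited above.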
Figure 2.7 PSNR performance for highway video sequence under
varying error rates
The end to end delay experienced by the video packets in the simulated network is shown in Figure 2.8 for the three schemes under an error rate of 0.5 for the Highway video application. It is observed that video packets in the EEP scheme experience more delay than in the other two schemes. For a better understanding, the average PSNR and average end to end delay for the three schemes under an error rate of 0.5 are compared in Figures 2.9 and 2.10, respectively. From the results it is observed that the average PSNR is lower in the UEP scheme than in the EEP scheme; however, the average end to end delay is also significantly lower and close to that of the NEP scheme. The average end to end delay is maximum for the EEP scheme. User perceived video quality depends on both the PSNR and the end to end delay. A higher PSNR is normally preferred, but not at the cost of increased delay. Real time multimedia applications such as video conferencing, in particular, have very stringent end to end delay requirements, as given in Section 2.2.3. Under bad channel conditions, the EEP scheme shows the best PSNR performance, but at the cost of increased latency. The NEP scheme shows good delay performance, but at the cost of reduced PSNR. The UEP scheme offers the best trade-off between PSNR and delay compared to the EEP and NEP schemes.
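The average end to end delay plotted in Figure 2.10 is computed over delivered packets only, since lost packets have no receive time. A minimal sketch of this computation from per-packet timestamps (the dictionary-based trace format is an assumption; actual field names depend on the simulator):

```python
def average_delay(send_times, recv_times):
    """Average end to end delay in seconds over delivered packets.

    send_times maps packet id -> send time at the source;
    recv_times maps packet id -> receive time at the application layer.
    Packets absent from recv_times are lost and excluded.
    """
    delays = [recv_times[pid] - send_times[pid]
              for pid in send_times if pid in recv_times]
    return sum(delays) / len(delays)
```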
Figure 2.8 End to end delay experienced by the video packets over the
simulation time for highway video sequence
Figure 2.9 Comparison of average PSNR performance for highway
video sequence under error rate 0.5
Figure 2.10 Comparison of average end to end delay performance for
highway video sequence under error rate 0.5
The other application, a Foreman video sequence of 400 frames in QCIF format, is considered next, and the simulation is performed to obtain the same set of results. The PSNR obtained for different frames is compared for all three schemes and shown in Figure 2.11. The PSNR values are obtained with the introduction of different error probabilities. To show clearly the improvement in PSNR realized for most of the frames under adaptation, the average PSNR obtained for the three mechanisms adopted by the application layer is compared in Figure 2.12 for different error rates. The results show that when the error rate is high, say above 0.3, the average PSNR realized with EEP, which applies adaptive FEC to all frames, is significantly higher than in the NEP case, and remains reasonably constant at around 38 dB in the simulation. When UEP, which applies adaptive FEC only to I frames, is used, the average PSNR decreases but is still definitely better than in the NEP case. The results confirm that the average PSNR decreases with increasing error rate, but an improvement is achieved when the FEC adaptation mechanism is introduced in the application layer. The improvement in average PSNR is significant at higher error rates.
Figure 2.11 PSNR for each frame of foreman video sequence
Figure 2.12 PSNR performance for foreman video sequence under
varying error rates
The end to end delay variation experienced by the packets at an error rate of 0.5 is shown and compared in Figure 2.13 for the three schemes. The end to end delay is almost the same for NEP and UEP.
Figure 2.13 End to end delay experienced by the video packets over
simulation time for foreman video sequence
In the EEP scheme, there is a considerable improvement in PSNR, but the end to end delay also increases considerably, as seen in Figure 2.14. This delay reduces to that of the NEP case when the proposed UEP scheme is used. The average PSNR in this scheme is slightly reduced, but the video frames are still decodable.

The received frames are decoded at the receiver for the three schemes and the video stream is viewed using a YUV viewer. The snapshots given in Figure 2.15 show a significant improvement in the video stream for the proposed scheme over NEP at the same error rate of 0.4.
Figure 2.14 Comparison of average end to end delay for foreman video
sequence under error rate 0.5
Figure 2.15 Snapshots of the foreman video sequence under the three schemes
The improvement in the QoS measures realized by the EEP and UEP schemes over NEP is calculated and tabulated in Table 2.6. The results show that the performance improvement achieved by the proposed scheme is better for the Foreman video than for the Highway video.
Table 2.6 Comparison of performance of EEP with UEP scheme

Application   Percentage Improvement   Percentage Increase in
video         in PSNR                  End to end delay
              EEP        UEP           EEP        UEP
Highway       12         4             168        50
Foreman       23         7             42         2
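The percentage figures in Table 2.6 are changes relative to the NEP baseline. As a sketch (the numeric values below are hypothetical, chosen only to reproduce the form of the Highway PSNR row, not the thesis measurements):

```python
def pct_improvement(nep_value, scheme_value):
    """Percentage change of a scheme's metric relative to the NEP baseline."""
    return 100.0 * (scheme_value - nep_value) / nep_value

# Illustrative only: an EEP average PSNR of 33.6 dB against an assumed
# NEP baseline of 30.0 dB is a 12% improvement, the form of the
# Highway/EEP entry above.
```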
The reason for the difference in PSNR improvement between the two applications is analyzed. It is found that the average temporal correlation between the corresponding macroblocks of successive frames in the Highway video is 0.8, whereas that of the Foreman video is comparatively higher, with a correlation coefficient of 0.92. The content of the packets belonging to the encoded frames is important for obtaining undistorted frames at the receiver. If the temporal correlation is low, it is difficult to reconstruct the erasures caused by the loss of packets, which may significantly reduce the PSNR. Hence the proposed scheme is best suited for real time multimedia applications with high temporal correlation. The correlation coefficient values obtained for the corresponding macroblocks of successive frames of the Foreman and Highway videos are shown in Figures 2.16 and 2.17, respectively.
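The temporal correlation reported above is the Pearson correlation coefficient between co-located macroblocks of successive luminance frames. A minimal sketch of the computation (the helper names and the flattened-frame layout are assumptions for illustration):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sample sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def macroblock(frame, row, col, width, size=16):
    """Extract the 16x16 macroblock at block position (row, col) from a
    flattened luminance plane `frame` whose lines are `width` pixels."""
    return [frame[(row * size + r) * width + col * size + c]
            for r in range(size) for c in range(size)]
```

Applying `pearson` to `macroblock(frame_n, ...)` and `macroblock(frame_n1, ...)` for every block position, and averaging, gives the per-frame-pair coefficients plotted in Figures 2.16 and 2.17.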
2.5 SUMMARY
In this work, an adaptive unequal error protection mechanism with cross layer communication among the application, MAC and PHY layers has been proposed and implemented for video transmission. The following are the important observations made from the results obtained.
Figure 2.16 Correlation between the corresponding macroblocks of
frame 9 and 10 of the foreman video sequence
Figure 2.17 Correlation between the corresponding macroblocks of
frame 25 and 26 of the highway video sequence
When the simulation is carried out with the Highway video sequence as the application, there is no significant improvement in average PSNR at low error rates. As the error rate is increased to 0.5, the percentage improvement in PSNR achieved by the EEP scheme is 12%, at the cost of a 168% increase in average end to end delay over NEP. The percentage improvement in PSNR achieved by the proposed UEP scheme is only 4%, but with only a 50% increase in average end to end delay compared to the NEP scheme.

Though the percentage improvement in PSNR is lower in the proposed scheme than in EEP, it keeps the PSNR within the good range of perceived video quality with a much smaller increase in delay.

The proposed scheme is hence best suited to loss tolerant, delay sensitive video transmission applications in a WLAN.

The application dependent behavior of the proposed scheme is verified by the temporal correlation among the macroblocks of successive frames of the application video sequences used. The temporal correlation coefficient for the Highway video is found to be 0.8, whereas it is 0.92 for the Foreman video. This slightly higher correlation works well during the error resilience process, thereby improving the performance.