CHAPTER 2
VIDEO TRANSMISSION IN WLAN WITH UNEQUAL
ERROR PROTECTION
2.1 AN OVERVIEW OF DIGITAL VIDEO
The analog video signal consists of a sequence of video frames. The
video frames are generated at a fixed frame rate of 30 frames/s in the National
Television Standards Committee (NTSC) format. To obtain a digital video
signal the analog video signal is passed to a digitizer. The digitizer samples
and quantizes the analog video signal. Each sample corresponds to a pixel.
The most common digital frame formats are Common Intermediate Format
(CIF) with 352 × 288 pixels (i.e., 352 pixels in the horizontal direction and
288 pixels in the vertical direction), Source Intermediate Format (SIF) with
352 × 240 pixels, and Quarter CIF (QCIF) with 176 × 144 pixels. In all three
frame formats, each video frame is divided into three components: the
luminance component (Y) and the two chrominance components, hue (U) and
saturation/intensity (V). Since the human eye is less sensitive to color
information than to luminance information, the chrominance components are
sampled at a lower resolution. Typically, each chrominance component is
sampled at half the resolution of the luminance component in both the
horizontal and vertical directions; this is referred to as 4:2:0 chroma
subsampling. In the QCIF frame format, for instance, there are 176 × 144
luminance samples and 88 × 72 samples of each chrominance component in each
video frame when 4:2:0 chroma subsampling is used. Finally, each sample is
quantized; typically, 8 bits are used per sample.
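Putting the QCIF numbers above together, the raw bit rate before compression follows directly; this is an illustrative calculation (the function name is not from the text):

```python
FRAME_RATE = 30      # frames/s (NTSC)
BITS_PER_SAMPLE = 8  # 8-bit quantization

def qcif_frame_bits():
    """Bits per raw QCIF frame: one 176x144 luminance plane plus
    two 88x72 chrominance planes, 8 bits per sample."""
    y_samples = 176 * 144        # luminance
    c_samples = 88 * 72          # per chrominance component
    return (y_samples + 2 * c_samples) * BITS_PER_SAMPLE

raw_bitrate = qcif_frame_bits() * FRAME_RATE  # bits per second, uncompressed
```

At 304128 bits per frame this is roughly 9.1 Mbit/s before compression, which is why video compression is indispensable for transmission over WLAN.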
2.2 VIDEO QUALITY PERFORMANCE METRICS AND
REQUIREMENTS
The important metrics for evaluating video transmission are the PSNR,
the end to end packet delay and the delay jitter.
2.2.1 Peak Signal to Noise Ratio
As in (Klaue et al 2003), digital video quality measurements must
be made based on the perceived quality of the actual video being received by
the users of the digital video system, because the impression of the user is
what counts in the end.
There are basically two approaches to measure digital video quality,
namely subjective quality measures and objective quality measures.
Subjective quality metrics capture the crucial factor, namely the impression
of the user watching the video, but they are costly to obtain because of the
time, manpower and special equipment they require. Such methods are
described in detail by the International
Telecommunication Union (ITU-R, 2000), (ITU-T, 1996), American National
Standards Institute (ANSI ) (ANSI T1.801.01/02-1996) and MPEG (Moving
Picture Experts Group) (ISO-IEC/JTC1/SC29/WG11.1996). The human
quality impression usually is given on a scale from 5 (best) to 1 (worst) as in
Table 2.1. This scale is called Mean Opinion Score (MOS).
Table 2.1 Subjective quality and impairment scale
Scale Quality Impairment
5 Excellent Imperceptible
4 Good Perceptible, but not annoying
3 Fair Slightly annoying
2 Poor Annoying
1 Bad Very annoying
Since complex and expensive subjective quality measurements are not
always feasible, objective metrics have been developed to emulate the
quality impression of the human visual system. In (Wolf and Pinson 2002),
there is an exhaustive discussion of various objective metrics and their
performance compared to subjective tests. The most popular method, however,
is the calculation of the PSNR on a frame-by-frame basis. The PSNR
compares the maximum possible signal energy to the noise energy, a
comparison that has been shown to correlate well with subjective quality
perception. The PSNR is the ratio between the maximum possible power of
a signal and the power of the corrupting noise that affects the fidelity of
its representation. It is usually expressed on the decibel (dB) scale.
The following equation defines the PSNR between the luminance
component Y of a source image S and a destination image D of size f1 x f2:

PSNR = 10 log10 ( MAX_S^2 / MSE )                (2.1)

where MSE is the mean square error between the source and the destination
images, given by

MSE = (1 / (f1 f2)) * SUM(i=0..f1-1) SUM(j=0..f2-1) [ S(i,j) - D(i,j) ]^2    (2.2)

Here, MAX_S is the maximum possible pixel value of the image.
When the pixels are represented using 8 bits per sample, this value is 255.
More generally, when samples are represented using bs bits per
sample, MAX_S = 2^bs - 1.
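Equations (2.1) and (2.2) translate directly into code; the sketch below operates on luminance planes given as 2-D lists and is illustrative, not part of the evaluation tool chain described later:

```python
import math

def mse(src, dst):
    """Mean square error between two equal-size luminance planes,
    given as 2-D lists of pixel values (Equation 2.2)."""
    f1, f2 = len(src), len(src[0])
    total = sum((src[i][j] - dst[i][j]) ** 2
                for i in range(f1) for j in range(f2))
    return total / (f1 * f2)

def psnr(src, dst, bits_per_sample=8):
    """PSNR in dB between source and destination frames (Equation 2.1)."""
    max_s = 2 ** bits_per_sample - 1          # 255 for 8-bit samples
    err = mse(src, dst)
    if err == 0:
        return float("inf")                   # identical frames
    return 10 * math.log10(max_s ** 2 / err)
```

For identical frames the MSE is zero and the PSNR is reported as infinite, which is the usual convention.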
When the focus is on the distortion introduced by the network alone,
that is, on comparing the received (possibly distorted) video with the
undistorted video sent, the PSNR of the encoded video is compared with that
of the received video frame by frame. Another possibility is to calculate
the MOS first and then the percentage of frames with a MOS worse than that
of the sent (undistorted) video; this method has the advantage of showing
clearly the distortion caused by the network. A possible conversion between
PSNR and MOS (Klaue et al 2003) is shown in Table 2.2.
Table 2.2 Possible PSNR to MOS conversion
PSNR (dB) MOS
>37 5 (Excellent)
31-37 4 (Good)
25-31 3 (Fair)
20-25 2 (Poor)
<20 1 (Bad)
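The mapping of Table 2.2 can be expressed as a small helper; treating the lower bound of each band as inclusive is an assumption, since the table leaves the boundary values ambiguous:

```python
def psnr_to_mos(psnr_db):
    """Map a per-frame PSNR (dB) to the MOS bands of Table 2.2."""
    if psnr_db > 37:
        return 5   # Excellent
    if psnr_db >= 31:
        return 4   # Good
    if psnr_db >= 25:
        return 3   # Fair
    if psnr_db >= 20:
        return 2   # Poor
    return 1       # Bad
```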
2.2.2 End to End Delay
In video transmission systems, not only the actual loss but also the
end to end delay of the system matters for the perceived video quality.
The end to end delay is the time elapsed from when the image is captured by
the video camera at the sender side until it is displayed on the monitor at
the receiver. In a typical video conferencing application, as shown in
(Baldi and Ofek 2000), the end to end delay is modeled with four components
whose values depend on the system configuration.
Processing Delay (PD) is introduced on both the sender and
receiver sides; it is the time consumed for grabbing, digitizing
and compressing the video at the sender end, and for
decompression and display at the receiver.

Network Delay (ND) is the time needed to move data units from
the source to the destination; it includes the protocol processing
delay at the sender and receiver and the propagation delay.

Processing Resynchronization delay (PR) cancels the delay
variation in generating the compressed video units.

Network Resynchronization delay (NR) cancels the variations of
the delay experienced in the network (e.g., the delay jitter due to
queuing in the network nodes).

Therefore the total end to end delay, given by the sum of the four
components (PD + ND + PR + NR), should be constant and remain within a
certain limit.
The typical end to end delay as per the International Telegraph and
Telephone Consultative Committee (CCITT) G.114 delay recommendations,
and its impact on quality of service, is given in Table 2.3.
Table 2.3 Typical end to end delay requirements for multimedia traffic
CCITT G.114 Delay Recommendations
One-way Delay Characterization of Quality
0-150 ms Acceptable for most user applications
150-400 ms May impact some applications
Above 400 ms Unacceptable for general network planning purposes
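The G.114 bands of Table 2.3 can likewise be encoded as a simple classifier (band edges assumed inclusive on the lower side):

```python
def g114_band(one_way_delay_ms):
    """Characterize one-way delay per the CCITT G.114 bands of Table 2.3."""
    if one_way_delay_ms <= 150:
        return "acceptable for most user applications"
    if one_way_delay_ms <= 400:
        return "may impact some applications"
    return "unacceptable for general network planning purposes"
```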
2.2.3 Delay Jitter
The variation in delay is called delay jitter. Digital video always
consists of frames that are to be displayed at a constant rate. Displaying a
frame before or after a defined time bound results in “jerkiness” (Wolf and
Pinson 2002). This issue is addressed by a technique called play-out
buffering. These buffers absorb the jitter introduced by network delivery
delays. A sufficiently large play-out buffer can compensate for any amount
of jitter: in the extreme case, the buffer holds the entire video and
display does not start until the last frame is received, which eliminates
any possible jitter at the cost of an additional delay equal to the entire
transmission time. The other extreme is a buffer capable of holding exactly
one frame; jitter then cannot be eliminated, but no additional delay is
introduced.
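The delay/jitter trade-off of the play-out buffer can be illustrated with a toy model: frame i is due for display at t0 + d + i*T, where d is the buffer delay and T the frame period (all names here are illustrative, not from the text):

```python
def late_frames(arrival_times, frame_period, buffer_delay):
    """Count frames that miss their display deadline.

    Frame i is scheduled for display at t0 + buffer_delay + i*frame_period,
    where t0 is the arrival time of the first frame.  A larger buffer_delay
    absorbs more jitter at the cost of extra start-up latency.
    """
    t0 = arrival_times[0]
    return sum(1 for i, t in enumerate(arrival_times)
               if t > t0 + buffer_delay + i * frame_period)
```

With arrival times [0, 40, 66, 110, 133] ms and a 33.3 ms frame period, a zero-delay buffer misses two deadlines while a 15 ms buffer misses none, at the cost of 15 ms of added latency.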
According to (Horowitz 2009), natural conversation begins to break
down when the end to end delay exceeds 250 ms. Today, most video
conferencing end points running a point-to-point conference call at
30 frames/s have an end to end delay of approximately 200 ms (including
PD, NR and PR) excluding the network delay, which can therefore be at most
about 50 ms if the threshold is to be met.
2.3 IMPLEMENTATION OF VIDEO TRANSMISSION IN NS-2
2.3.1 Video Trace Generation
The YUV information of each video is captured using a personal
computer video capture card and the bttvgrab (v. 0.15.10) software, and is
stored on disk. The YUV information is grabbed at a frame rate of
30 frames/s in the QCIF format with 4:2:0 chrominance subsampling and 8-bit
quantization. The QCIF format is chosen because it allows traces to be
generated for evaluating transmission performance in wireless networking
systems: handheld devices of next-generation wireless systems are expected
to have a screen size that corresponds to the QCIF video format. The stored
YUV frame sequences are used as input for both the MPEG-4 encoder and the
H.263/H.264 AVC (Advanced Video Coding) encoder, as shown in Figure 2.1,
and the trace file of the video is generated (Ke et al 2008).
Figure 2.1 Generation of video trace file
2.3.2 Video Streaming Application
To optimize the transmission of video over a network, wired or
wireless, an application is needed that sits above the other layers to read
the video file and produce the encoded video packets. For this purpose, a
new application that reads an encoded video file and delivers it to the
underlying layer is introduced by the tool set
(http://140.116.72.80/~smallko/ns2/MultimediaComm_en.htm) that combines
EvalVid (Klaue et al 2003) and NS-2 (Network Simulator-2)
(http://www.isi.edu./ns/nam/ns). It reads the file frame by frame and
transfers each frame to a new agent introduced in the transport layer. The
new agent divides the frames into UDP (User Datagram Protocol) packets
suitable for transmission over the network. The sink agent in the receiver
performs the reverse process (Ke et al 2008).
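The fragmentation step can be sketched as follows; the 1000-byte payload matches the value used later in the simulation setup, and the function name is illustrative:

```python
MAX_PAYLOAD = 1000  # bytes of video data per UDP packet

def fragment_frame(frame_bytes, max_payload=MAX_PAYLOAD):
    """Split one encoded video frame into payload-sized segments
    for transmission as individual UDP packets."""
    return [frame_bytes[i:i + max_payload]
            for i in range(0, len(frame_bytes), max_payload)]

segments = fragment_frame(bytes(2500))   # a hypothetical 2500-byte frame
# yields segments of 1000, 1000 and 500 bytes
```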
2.3.3 Network Simulation Agents
Three connecting simulation agents, namely MyTrafficTrace,
MyUDP, and MyUDPSink, are implemented between NS-2 and EvalVid.
These interfaces are designed either to read the video trace file or to generate
the data required to evaluate the delivered video quality.
MyTrafficTrace: The MyTrafficTrace agent is employed to
extract the frame type and the frame size of the video trace file
generated from the output of the video sender component of
EvalVid. Furthermore, this agent fragments the video frames
into smaller segments and sends these segments to the lower
UDP layer at the appropriate time according to the user settings
specified in the simulation script file.
MyUDP: MyUDP is an extension of the UDP agent. This new
agent allows users to specify the output file name of the sender
trace file and it records the timestamp of each transmitted
packet, the packet identity and the packet payload size. The task
of the MyUDP agent corresponds to the task that tools such as
tcpdump or windump perform in a real network environment.
MyUDPSink : MyUDPSink is the receiving agent for the
fragmented video frame packets sent by MyUDP. This agent
also records the timestamp, packet identity and payload size of
each received packet in the user specified file.
Figure 2.2 illustrates the QoS assessment framework for video
traffic enabled by the new tool set that combines EvalVid and NS-2.
Figure 2.3 illustrates the implementation of typical video streaming in
WLAN using NS-2.
Figure 2.2 QoS assessment frame work of EvalVid in NS-2
Figure 2.3 Implementation of video streaming in WLAN using NS-2
2.4 PROPOSED CROSS LAYER SCHEME
2.4.1 Introduction
The cross layer design is an evolving paradigm in the design of
wireless network architecture that takes into consideration the dependencies
and interaction among layers and supports optimization across layers. The
cross layer design is not a replacement for layered architecture but it enables
information exchange between the layers, thereby making the system more
adaptive. Future wireless networks need to support multimedia applications
such as media streaming, video conferencing and interactive video. The
classical layered network architecture struggles to support their QoS
requirements, such as throughput, delay and PSNR. This issue can be tackled
by cross layer design.
There is considerable ongoing research in this area. Mihaela Van
Der Schaar and Sai Shankar (2005) introduce a new fairness concept for
wireless multimedia systems that employs different cross layer strategies
combining the Application, MAC and PHY layers, and show its advantages over
existing resource allocation mechanisms used in wireline communications.
In (Ksentini et al 2006), H.264 wireless video transmission over IEEE
802.11 WLAN is proposed using a cross layer architecture that leverages
the H.264 error resilience scheme of data partitioning and the QoS features
of the IEEE 802.11e MAC protocol. In (Pollin et al 2007), a two phase
methodology has been proposed that resolves the sleep time trade-off across
the physical and link layers and schedules nodes at run time with a near
optimal energy efficient configuration in the solution space; it is applied
to MPEG-4 video transmission and video delivery with guaranteed QoS over
slow fading channels.
Since multimedia services require high PSNR and low end to end
delay, works that consider low end to end delay, low packet loss and error
protection for video packets were reviewed. Lu et al (2005) presented a
scheme for video streaming over WLAN in which the retry limit is made
adaptive with respect to the content of the video packet. The intention of
the authors is to prevent packets with high temporal correlation from being
dropped and thus to avoid errors propagating throughout the GOP. The
packets of the independently coded I (Intra coded) frames are retransmitted
with an infinite retry limit, and the algorithm is named Content Aware
Retry Limit. However, no error protection for the packets is considered,
and the infinite retry limit increases the buffering delay and hence the
end to end delay. In
the literature (Park and Wang 1998), the authors have proposed an adaptive
FEC mechanism to facilitate real time applications whose timing constraints
rule out the use of Automatic Repeat reQuest (ARQ) schemes. An adaptive FEC
scheme in which the degree of redundancy is adjusted as a function of
network status is analyzed and shown to outperform static, non-adaptive FEC
in a dynamic asynchronous transfer mode network. An unequal error
protection scheme, wherein the different layers of a scalable video coder
are coded using FEC with different degrees of redundancy to maintain
acceptable picture quality for Internet video streaming, is presented by
Horn et al (1999). Here the FEC is added at the packet level and it is
sender based.
Bajic (2007) proposes a sender based efficient cross layer FEC scheme for
wireless video multicasting that maintains the received video quality of
all users above a predetermined level. Based on feedback from each user
about the number of packets received, the sender calculates the number of
packets each user has lost and adjusts the FEC redundancy to reduce the
packet loss rate. The main drawback of sender based FEC schemes is the need
for feedback information and hence increased delay. An access point based
hybrid error recovery scheme that exploits the advantages of both ARQ and
FEC mechanisms is proposed by Qiao and Shin (2000) for wireless video
streaming with guaranteed QoS. In Lin et al (2006), an AP based Enhanced
Adaptive FEC (EAFEC) mechanism is proposed based on information from the
interface queue length and MAC layer retransmissions, and an improvement in
decodable frame rate is shown over the static FEC scheme. Similar work is
presented in (Lin et al 2008) for video transmission in WLAN. Though this
scheme proposes an efficient adaptive FEC mechanism for video delivery, it
does not consider the frame priority or the information content of the
frames, and offers uniform error protection to all frames.
From the literature survey, it was observed that none of the
previous works considered the video frame priority when adapting the error
protection of video packets for transmission in WLAN. Hence, in the present
research work, a cross layer optimization framework is proposed that
includes adaptation across the application, MAC and PHY layers. The
application layer packet prioritization mechanism, the AP queue length, and
the MAC layer retransmission time used for estimating the PHY layer channel
state, together with scheduling information, are jointly used to provide
more FEC for video packets of significant information content than for
packets of less importance. This helps realize optimized PSNR and delay
performance for video transmission. Hence the proposed cross layer strategy
is named the Unequal Error Protection scheme for video transmission in
WLAN.
2.4.2 Cross Layer Design Strategy for Video Transmission in WLAN
with Unequal Error Protection
2.4.2.1 Video compression and frame priority
The MPEG compression standard is one of the most widely used
schemes for the delivery of video. One of its major features is random
access capability. The frames are organized into a Group Of Pictures (GOP),
which is the smallest random access unit. A GOP consists of I, P and B
frames. Frames coded without any reference to past frames are called I
frames; they exploit no temporal correlation, so their compression rate is
low compared to frames that use temporal correlations for prediction. The
number of frames between two I frames is thus a tradeoff between
compression efficiency and random access convenience. To improve
compression efficiency, the MPEG scheme contains two other frame types, the
inter frames: Prediction coded (P) frames and Bidirectionally predictive
coded (B) frames. The P frames are coded using motion compensated
prediction from a preceding I or P frame. The B frames achieve a higher
level of compression by using motion compensated prediction from both past
and future I or P frames. In the proposed architecture, the application
layer sets the priority levels for the different frame types depending on
their contribution to perceived video quality. The I frame always has the
highest priority, and the next two levels of priority are assigned to the
P and B frames, respectively.
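The priority assignment just described can be written down directly; the numeric levels below are illustrative (lower value = higher priority):

```python
PRIORITY = {"I": 0, "P": 1, "B": 2}   # I highest, then P, then B

def gop_priorities(gop_pattern):
    """Priority level for each frame in a GOP pattern such as 'IBBPBBP'."""
    return [PRIORITY[frame_type] for frame_type in gop_pattern]
```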
2.4.2.2 Cross layer design: Building blocks
In this work, a set of building blocks for channel and application
adaptive wireless streaming applications is considered. They exploit two
well-known modules in wireless video communication.
The first module deals with error control, based on the fact that different parts of a video bit stream are of different importance. To ensure better video quality, two error correcting methods are generally used: the ARQ and FEC mechanisms (Elaoud and Ramanathan 1998). In the ARQ mechanism, the receiver requests the retransmission of erroneous packets from the sender after detecting an error. This increases the end to end round trip delay due to the retransmission of lost packets and is viewed as inefficient for delay sensitive video transmission. The FEC mechanism lets the sender transmit the original data together with redundant packets so that errors can be corrected at the receiver when they occur. The FEC mechanism is suitable for error correction in video transmission, as the round trip delay is avoided at the cost of the transmission time required for the redundant packets.

The FEC mechanism can be implemented at the byte level or the packet level, and is realized by block erasure codes that add redundant information. In an (n, k) block erasure code, the original data is divided into blocks of k original packets. The FEC encoder generates (n - k) redundant packets, adds them to the source packets, and transmits n encoded data packets, as shown in Figure 2.4. At the FEC decoder, the original packets can be reconstructed if at least k packets arrive at the receiver.
Figure 2.4 Application layer packet level FEC mechanism
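As a minimal concrete instance of an (n, k) block erasure code, a single XOR parity packet gives (n, k) = (k + 1, k): any one erased packet can be reconstructed from the k packets that do arrive. This is a sketch for illustration only; a deployed scheme would use a more general code:

```python
def xor_parity_encode(packets):
    """Encode k equal-length source packets into k+1 packets by appending
    one XOR parity packet -- a minimal (n, k) = (k+1, k) erasure code."""
    parity = bytes(packets[0])
    for p in packets[1:]:
        parity = bytes(a ^ b for a, b in zip(parity, p))
    return packets + [parity]

def xor_parity_decode(received):
    """Recover the k source packets when at most one of the k+1 packets
    was erased (erasures marked as None)."""
    missing = [i for i, p in enumerate(received) if p is None]
    if len(missing) > 1:
        raise ValueError("more than one erasure: not recoverable")
    if missing:
        size = len(next(p for p in received if p is not None))
        rec = bytes(size)                       # all-zero accumulator
        for p in received:
            if p is not None:
                rec = bytes(a ^ b for a, b in zip(rec, p))
        received[missing[0]] = rec              # XOR of survivors = lost packet
    return received[:-1]                        # drop the parity packet
```

Losing the parity packet itself costs nothing, and losing any one source packet is repaired by XOR-ing the survivors; two or more erasures exceed the code's capability, mirroring the "at least k packets must arrive" condition above.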
The second module exploits the differing importance of the packets:
adaptivity to channel conditions can be realized by dropping less important
packets while providing error protection for the important ones. A
packetization scheme is used so that FEC codes can be applied at the
application packet level, across different application packets rather than
at the bit level, thereby reducing delay at the receiver. The proposed
cross layer based unequal FEC scheme, referred to as the Unequal Error
Protection (UEP) scheme, is expected to outperform the conventional layered
architecture that applies uniform FEC adaptation to all frames (Lin et al
2006), referred to as the Equal Error Protection (EEP) scheme, under the
same or tighter constraints. The reason is that the adaptation is done at
the application layer, but with the granularity of the MAC and PHY layers.
2.4.2.3 Problem formulation
Let us consider that there are NP, NM, NN, NT and NA parameters
that can be considered in the PHY, MAC, Network, Transport and Application
layers, respectively, for cross layer adaptation. Some of the adaptation
parameters in each layer are given below.

Physical (P1 ... PNP): modulation rate (constellation size),
bandwidth, transmit and receive power, Signal to Noise Ratio
(SNR) and BER.

MAC (M1 ... MNM): scheduling mechanisms, admission control,
error correcting mechanisms, retry limit.

Network (N1 ... NNN): routing algorithms based on power
allocation, bandwidth and distance.

Transport (T1 ... TNT): error protection strategies, scheduling.

Application (A1 ... ANA): compression ratio, forward error
correction mechanism, throughput, delay, application priority.
A cross layer design problem can be formulated with the objective
of selecting the optimum combination that gives the required QoS. The joint
cross layer strategy is defined as in Equation (2.3) (Mihaela Van Der
Schaar and Sai Shankar 2005),

S = {P1 ... PNP, M1 ... MNM, N1 ... NNN, T1 ... TNT, A1 ... ANA}    (2.3)

leading to

NP * NM * NN * NT * NA                (2.4)

possible combinations.
The joint cross layer strategy defined in this work involves the Application, MAC and PHY layers. The channel status estimated from the MAC retransmission information (MRT), the AP current queue length (AQl) and the frame priority (AFrP) at the application layer are used in optimizing the number of FEC (AFEC) packets added at the application layer. The optimal solution is the one represented by the following equation,
Sopt(x) = Max Q[S(x)]                (2.5)

where x denotes a combination of MRT, AQl, AFrP and AFEC, and Q[S(x)]
represents the quality of video transmission, defined by the constraints

Delay[S(x)] <= Dmax
PSNR[S(x)] >= PSNRmin

where Dmax is the maximum permissible end to end delay, and
PSNRmin is the minimum required PSNR.
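Since the number of candidate strategies in Equation (2.4) is finite, the constrained maximization of Equation (2.5) can in principle be done by exhaustive search. The sketch below is a generic illustration with hypothetical callables, not the optimizer actually used in this work:

```python
def best_strategy(candidates, quality, delay, psnr, d_max, psnr_min):
    """Exhaustive search for Equation (2.5): maximize quality subject to
    delay <= d_max and psnr >= psnr_min; returns None if nothing is
    feasible."""
    feasible = [s for s in candidates
                if delay(s) <= d_max and psnr(s) >= psnr_min]
    return max(feasible, key=quality, default=None)
```

For example, a candidate with high quality but excessive delay is rejected in favor of the best candidate that satisfies both constraints.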
The block diagram of the proposed cross layer scheme is given in
Figure 2.5.
Figure 2.5 Block diagram of proposed Cross Layer Scheme
2.4.2.4 Implementation of UEP for video packets
During multimedia packet transmission, network congestion and the
limited amount of data that can be stored in queues lead to the need to
discard some of the transmitted packets. Network congestion may happen
whenever the overall traffic load exceeds the available network resources;
under such conditions, some of the packets to be transmitted are either
buffered or discarded by a traffic policer that monitors the network
status. On the other hand, when a node cannot access a shared transmission
medium,
outgoing packets are buffered in a queue that can easily become saturated
if the buffer is under-dimensioned with respect to the number of users.
Further, an excessive waiting time in the queues can make a video packet
obsolete, since it cannot be decoded in time; the video source decoder
therefore treats its late arrival as a loss. To control losses and
excessive delays, network elements and edge nodes need to adopt an
appropriate scheduling strategy that shapes the incoming and outgoing
traffic according to specific policies such as Integrated Services and
Differentiated Services, described in the NS-2 documentation
(http://www.isi.edu./ns/nam/ns). The main purpose of these schemes is to
prevent users/services that violate their traffic limitations from
jeopardizing the QoS of other connections.
At the edge of the network, each packet is classified and a service
class is assigned (e.g., two rate Three Colour Marker, Multi Protocol Label
Switching). This class determines how the network elements handle packets
along the data path, according to different scheduling and queue management
schemes such as DropTail, Random Early Detection and Weighted Random Early
Detection. For video applications, these strategies turn out to be crucial
to the video quality perceived by end users, since the significant amount
of information to be transmitted must be accurately controlled. Generally,
classifiers label packets according to their size and the buffer levels;
the video content is not taken into consideration. Recently, new techniques
have shown that a classifying strategy aware of the significance of each
packet in the decoding process improves the QoS experienced by the end
user. In the proposed scheme, a joint optimization is evolved that
considers both the content of the video packets and their service class,
constrained by the maximum end to end delay. As a consequence, the
relevance of each packet in the decoding process changes according to the
information it contains, and the adoption of an algorithm that adapts the
protection level to the packet type can significantly improve the quality
of the reconstructed sequence.
Video transmission in WLAN is considered for analysis in this work.
In infrastructure mode, when any wired or wireless node wants to send data
packets to another wireless node, the data must first be sent to the Access
Point, which then forwards the packets to the corresponding node. The AP is
therefore a good place for introducing the FEC mechanism to improve the
quality of video delivered over the network (Lin et al 2006, Qiao and Shin
2000). For efficient adaptive FEC, when a block of source packets is
received, the AP dynamically determines how many redundant FEC packets
(No_FEC) should be generated, based on the current network condition and
the frame type of the packets. The current interface queue length (Qlen),
calculated by the weighted moving average method with a weighting factor
QW, is a good indicator of network traffic load: when the network load is
high, queue occupancy is high, and vice versa. When queue occupancy is low
and the frame type is I, more redundant FEC packets can be generated; as
queue occupancy increases, the number of FEC packets is reduced
appropriately to avoid unnecessary network congestion. If the queue
occupancy is beyond a threshold level and/or the frame type is P or B, FEC
packets are not generated, to avoid congestion as well as to reduce the end
to end delay considerably.
The current packet retransmission time is a good indicator of
wireless channel status. This retransmission time (RT) is calculated using
the weighted moving average method with a weighting factor RW. When the
wireless channel is good, the number of retransmissions is small;
otherwise, it is large. When the wireless channel is experiencing more
losses and the frame type is I, more redundant FEC packets are generated;
when the channel is good and the frame type is I, fewer FEC packets are
generated. Here the channel status from the physical layer and the number
of retransmissions from the MAC layer are sent to the application layer as
cross layer information. The flow diagram for implementing UEP, that is,
priority based FEC in the application layer, is depicted in Figure 2.6.
Figure 2.6 Flow Diagram of UEP scheme
In the flow of Figure 2.6, the frame type is first added to the packet
header and the two estimates are updated by the weighted moving averages

Qlen = (1 - QW) * Curr_Q + QW * Qlen
RT = (1 - RW) * Curr_RT + RW * RT

For I frame packets, No_FEC = Max_FEC when Qlen < Threshold1;
No_FEC = Max_FEC * (Threshold2 - Qlen) / (Threshold2 - Threshold1) when
Threshold1 <= Qlen < Threshold2; and no FEC is added when
Qlen >= Threshold2. The channel estimate then adjusts the count: No_FEC is
scaled by {1 - (Threshold4 - RT) / (Threshold4 - Threshold3)} when
Threshold3 <= RT < Threshold4, left unscaled when RT >= Threshold4, and no
FEC is added when RT < Threshold3. P frame packets receive no FEC and are
transmitted or dropped according to the same queue and channel tests.
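One plausible reading of the flow of Figure 2.6 is sketched below. The thresholds are the case 1 values from Table 2.4; the weighting factors and some branch semantics (e.g., adding no FEC on a clean channel) are assumptions, since the text does not pin them down:

```python
MAX_FEC = 8          # maximum redundant packets per block
T1, T2 = 10, 40      # queue-length thresholds (Table 2.4, case 1)
T3, T4 = 4, 15       # retransmission thresholds (Table 2.4, case 1)
QW = RW = 0.5        # weighting factors -- assumed, not given in the text

def ewma(old, current, w):
    """Weighted moving average used for both Qlen and RT."""
    return (1 - w) * current + w * old

def fec_packets(frame_type, qlen, rt):
    """Redundant FEC packets for one block of video packets (UEP)."""
    if frame_type != "I":
        return 0                       # P/B frames get no protection
    if qlen >= T2 or rt < T3:
        return 0                       # congested queue or clean channel
    # Base count shrinks as the queue fills ...
    n = MAX_FEC if qlen < T1 else MAX_FEC * (T2 - qlen) / (T2 - T1)
    # ... and is scaled down while the channel estimate is still good.
    scale = 1.0 if rt >= T4 else 1 - (T4 - rt) / (T4 - T3)
    return round(n * scale)
```

For example, with a lightly loaded queue (Qlen = 5) and a bad channel (RT = 15), an I frame block gets the full 8 redundant packets, while at Qlen = 25 it gets 4.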
2.4.3 Simulation Setup and Results
The simulation to validate the proposed cross layer scheme is
carried out using Network Simulator NS-2.31. NS-2 is an event driven
simulator that works at the packet level. In this simulation, a wireless
network with one access point and four nodes is considered. One node is the
source, which has video files to transmit to the destination node attached
to the same access point; the two other nodes are involved in data
transmission and hence contribute background traffic. The destination node
moves randomly inside the region of the AP. An application is invoked to
record the sender trace file. Each frame is fragmented into 1000-byte
segments for transmission; the maximum packet length is 1028 bytes,
including the IP header (20 bytes) and UDP header (8 bytes). The video
packets are stored in a new file. The EvalVid (Ke et al 2008) framework for
video transmission in NS-2 is used.
A Highway video sequence (a racing car on a highway) and a Foreman video sequence, consisting of 2000 frames and 400 frames, respectively, in QCIF format (176 x 144 pixels), are chosen as the applications. A random uniform error model is incorporated to introduce errors at the physical layer so as to map realistic channel scenarios. In this study, the network end to end delay and the PSNR are considered as the evaluation parameters for video transmission under different channel induced packet error rates at the physical layer; here the error rate is taken as the transition probability of packets from good to bad. In (Li et al 2007), wireless AP queue capacities are shown to be packet based and to range from 40 packets to over 338 packets for various commercial and residential APs. In this simulation, the wireless AP queue capacity has therefore been taken as 50 packets of 1500 bytes, and the lower and higher threshold values of the queue length are chosen within this limit.

In the IEEE 802.11 MAC protocol, a frame lost due to random channel errors is retransmitted up to a certain limit. Allowing a higher retransmission limit increases the chance of successful packet transmission at the cost of longer packet transmission delays, while a lower limit results in a larger packet loss probability with smaller delays for the delivered packets; both packet loss and transmission delay have a negative impact on network performance (Bai et al 2010). Further, when the retransmission time is used as a measure of channel status and integrated with the adaptive FEC scheme, the selection of the lower and higher threshold values for this parameter becomes very important. Therefore, the simulation was run for different combinations of these threshold values at the worst error rate of 0.6. The improvement in average PSNR over the No Error Protection scheme and the average end to end delay were measured and tabulated in Table 2.4. Here the base EEP scheme, that is, without application frame prioritization, is used with a maximum of 8 FEC packets. From the tabulated results, case 1 is selected, since it provides the better performance in terms of improved PSNR and reduced end to end delay, and its thresholds are fixed for the performance evaluation of the application priority based UEP scheme. The parameters considered in the simulation are given in Table 2.5.
Table 2.4 Performance for different Threshold values
Case   Threshold1   Threshold2   Threshold3   Threshold4   Improvement in       Average end to
                                                           Average PSNR (dB)    end delay (s)
1      10           40           4            15           5.82                 0.07743
2      10           40           5            10           5.64                 0.07621
3      5            40           5            15           5.32                 0.07171
4      10           45           5            15           6.08                 0.07941
5      10           40           5            25           3.89                 0.06603
Table 2.5 Simulation parameters for the cross layer based UEP
Layer         Parameter                     Value
Application   AP Queue Type                 Drop tail
              Max Length (Capacity)         50
              Threshold1                    10
              Threshold2                    40
              FEC packets                   Minimum 0, Maximum 8
              Application video sequence    i. Foreman sequence of 400 frames
                                            ii. Highway sequence of 2000 frames
MAC           Protocol                      802.11b
              Bandwidth                     11 Mbps
              Short Retry Limit             7
              Long Retry Limit:
                Threshold3                  4
                Threshold4                  15
With the application video sequence being the Highway video, the average PSNR measured for the frames received at the application layer using EEP and UEP is shown in Figure 2.7. To demonstrate the significance of error protection and FEC adaptation, these measures are also compared with the No Error Protection (NEP) scheme. The results are obtained for various error rates. They show that the PSNR performance is severely affected when the error rate is high and no FEC is applied. The PSNR values improve significantly when the EEP scheme is implemented. The proposed UEP scheme shows a reduction in PSNR compared to EEP, but an improvement of almost 2.5 dB over NEP at an error rate of 0.6. The corresponding user perceived video quality is within the good range (31-37 dB) with reference to Table 2.2, even under the worst channel conditions.
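The PSNR figures reported throughout this section follow the standard definition over 8-bit samples introduced in Section 2.2.1. A minimal sketch of the computation, assuming a flattened luminance plane as input (the function name is illustrative):

```python
import math

def psnr(original, received, peak=255):
    """PSNR in dB between two equal-size 8-bit sample sequences,
    e.g. the flattened luminance plane of a QCIF frame."""
    mse = sum((o - r) ** 2 for o, r in zip(original, received)) / len(original)
    if mse == 0:
        return float('inf')     # identical frames: PSNR is unbounded
    return 10 * math.log10(peak ** 2 / mse)
```

For instance, a uniform error of 10 grey levels on every sample gives an MSE of 100 and hence a PSNR of about 28.1 dB, which falls just below the good range cited above.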
Figure 2.7 PSNR performance for highway video sequence under
varying error rates
The end to end delay experienced by the video packets in the simulated network is shown in Figure 2.8 for the three schemes under an error rate of 0.5 for the Highway video application. It is observed that video packets in the EEP scheme experience more delay than in the other two schemes. For a better understanding, the average PSNR and average end to end delay for the three schemes under an error rate of 0.5 are compared in Figures 2.9 and 2.10, respectively. From the results it is observed that the average PSNR is lower in the UEP scheme than in the EEP scheme; however, the average end to end delay is also significantly lower and close to that of the NEP scheme. The average end to end delay is maximum for the EEP scheme. User perceived video quality depends on both the PSNR and the end to end delay. A higher PSNR is normally preferred, but not at the cost of increased delay. Real time multimedia applications such as video conferencing, in particular, have very stringent end to end delay requirements, as given in Section 2.2.3. Under bad channel conditions, the EEP scheme shows the best PSNR performance, but at the cost of increased latency. The NEP scheme shows good delay performance, but at the cost of reduced PSNR. The UEP scheme offers the best trade-off between PSNR and delay compared to the EEP and NEP schemes.
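The average end to end delay plotted in Figure 2.10 is computed over delivered packets only, since lost packets have no receive time. A minimal sketch of this computation from per-packet timestamps (the dictionary-based trace format is an assumption; actual field names depend on the simulator):

```python
def average_delay(send_times, recv_times):
    """Average end to end delay in seconds over delivered packets.

    send_times maps packet id -> send time at the source;
    recv_times maps packet id -> receive time at the application layer.
    Packets absent from recv_times are lost and excluded.
    """
    delays = [recv_times[pid] - send_times[pid]
              for pid in send_times if pid in recv_times]
    return sum(delays) / len(delays)
```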
Figure 2.8 End to end delay experienced by the video packets over the
simulation time for highway video sequence
Figure 2.9 Comparison of average PSNR performance for highway
video sequence under error rate 0.5
Figure 2.10 Comparison of average end to end delay performance for
highway video sequence under error rate 0.5
The other application, a Foreman video sequence of 400 frames in QCIF format, is considered next, and the simulation is performed to obtain the same set of results. The PSNR obtained for different frames is compared for all three schemes and shown in Figure 2.11. The PSNR values are obtained with the introduction of different error probabilities. To show clearly the improvement in PSNR realized for most of the frames under adaptation, the average PSNR obtained for the three mechanisms adopted by the application layer is compared in Figure 2.12 for different error rates. The results show that when the error rate is high, say above 0.3, the average PSNR realized with EEP, which applies adaptive FEC to all frames, is significantly higher than in the NEP case, and remains reasonably constant at around 38 dB in the simulation. When UEP, which applies adaptive FEC only to I frames, is used, the average PSNR decreases but is still definitely better than in the NEP case. The results confirm that the average PSNR decreases with increasing error rate, but an improvement is achieved when the FEC adaptation mechanism is introduced in the application layer. The improvement in average PSNR is significant at higher error rates.
Figure 2.11 PSNR for each frame of foreman video sequence
Figure 2.12 PSNR performance for foreman video sequence under
varying error rates
The end to end delay variation experienced by the packets at an error rate of 0.5 is shown and compared in Figure 2.13 for the three schemes. The end to end delay is almost the same for NEP and UEP.
Figure 2.13 End to end delay experienced by the video packets over
simulation time for foreman video sequence
In the EEP scheme, there is a considerable improvement in PSNR, but the end to end delay also increases considerably, as seen in Figure 2.14. This delay reduces to that of the NEP case when the proposed UEP scheme is used. The average PSNR in this scheme is slightly reduced, but the video frames are still decodable.

The received frames are decoded at the receiver for the three schemes and the video stream is viewed using a YUV viewer. The snapshots given in Figure 2.15 show a significant improvement in the video stream for the proposed scheme over NEP at the same error rate of 0.4.
Figure 2.14 Comparison of average end to end delay for foreman video
sequence under error rate 0.5
Figure 2.15 Snapshots of the foreman video sequence under the three schemes
The improvement in the QoS measures realized by the EEP and UEP schemes over NEP is calculated and tabulated in Table 2.6. The results show that the performance improvement achieved by the proposed scheme is better for the Foreman video than for the Highway video.
Table 2.6 Comparison of performance of EEP with UEP scheme

Application   Percentage Improvement   Percentage Increase in
video         in PSNR                  End to end delay
              EEP        UEP           EEP        UEP
Highway       12         4             168        50
Foreman       23         7             42         2
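The percentage figures in Table 2.6 are changes relative to the NEP baseline. As a sketch (the numeric values below are hypothetical, chosen only to reproduce the form of the Highway PSNR row, not the thesis measurements):

```python
def pct_improvement(nep_value, scheme_value):
    """Percentage change of a scheme's metric relative to the NEP baseline."""
    return 100.0 * (scheme_value - nep_value) / nep_value

# Illustrative only: an EEP average PSNR of 33.6 dB against an assumed
# NEP baseline of 30.0 dB is a 12% improvement, the form of the
# Highway/EEP entry above.
```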
The reason for the difference in PSNR improvement between the two applications is analyzed. It is found that the average temporal correlation between the corresponding macroblocks of successive frames in the Highway video is 0.8, whereas that of the Foreman video is comparatively higher, with a correlation coefficient of 0.92. The content of the packets belonging to the encoded frames is important for obtaining undistorted frames at the receiver. If the temporal correlation is low, it is difficult to reconstruct the erasures caused by the loss of packets, which may significantly reduce the PSNR. Hence the proposed scheme is best suited for real time multimedia applications with high temporal correlation. The correlation coefficient values obtained for the corresponding macroblocks of successive frames of the Foreman and Highway videos are shown in Figures 2.16 and 2.17, respectively.
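The temporal correlation reported above is the Pearson correlation coefficient between co-located macroblocks of successive luminance frames. A minimal sketch of the computation (the helper names and the flattened-frame layout are assumptions for illustration):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sample sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def macroblock(frame, row, col, width, size=16):
    """Extract the 16x16 macroblock at block position (row, col) from a
    flattened luminance plane `frame` whose lines are `width` pixels."""
    return [frame[(row * size + r) * width + col * size + c]
            for r in range(size) for c in range(size)]
```

Applying `pearson` to `macroblock(frame_n, ...)` and `macroblock(frame_n1, ...)` for every block position, and averaging, gives the per-frame-pair coefficients plotted in Figures 2.16 and 2.17.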
2.5 SUMMARY
In this work, an adaptive unequal error protection mechanism with cross layer communication among the application, MAC and PHY layers has been proposed and implemented for video transmission. The following are the important observations made from the results obtained.
Figure 2.16 Correlation between the corresponding macroblocks of
frame 9 and 10 of the foreman video sequence
Figure 2.17 Correlation between the corresponding macroblocks of
frame 25 and 26 of the highway video sequence
When the simulation is carried out with the Highway video sequence as the application, there is no significant improvement in average PSNR at low error rates. As the error rate is increased to 0.5, the percentage improvement in PSNR achieved by the EEP scheme is 12%, at the cost of a 168% increase in average end to end delay over NEP. The percentage improvement in PSNR achieved by the proposed UEP scheme is only 4%, but with only a 50% increase in average end to end delay compared to the NEP scheme.

Though the percentage improvement in PSNR is lower in the proposed scheme than in EEP, it keeps the PSNR within the good range of perceived video quality with a much smaller increase in delay.

The proposed scheme is hence best suited to loss tolerant, delay sensitive video transmission applications in a WLAN.

The application dependent behavior of the proposed scheme is verified by the temporal correlation among the macroblocks of successive frames of the application video sequences used. The temporal correlation coefficient for the Highway video is found to be 0.8, whereas it is 0.92 for the Foreman video. This slightly higher correlation works well during the error resilience process, thereby improving the performance.