QoS Measurement and Management for VoIP

Wenyu Jiang, IRT Lab, March 5, 2003


Page 1: QoS Measurement and Management for VoIP

QoS Measurement and Management for VoIP

Wenyu Jiang

IRT Lab, March 5, 2003

Page 2: QoS Measurement and Management for VoIP

Introduction to VoIP & IP Telephony

Transport of voice packets over IP networks

Cost savings
– Consolidates voice and data networks
– Avoids leased lines, long-distance toll calls

Smart and new services
– Call management (filtering, TOD forwarding): CPL
– Better than PSTN quality: wide-band codecs

Protocols and Standards
– Signaling: SIP (IETF), H.323 (ITU-T)
– Transport: RTP/RTCP (IETF)

Page 3: QoS Measurement and Management for VoIP

Practical Issues in VoIP

Quality of Service (QoS)
– Internet is a best-effort network: loss, delay and jitter
– Users expect at least PSTN quality for VoIP!

Ease of deployment
– Requires seamless integration with legacy networks (PSTN/PBX)
– Security is a must

High yardstick of service availability
– Can your network achieve 99.999% up time?

Page 4: QoS Measurement and Management for VoIP

Outline

QoS measurement
– Objective vs. subjective metrics
– Automated measurement of subjective quality

QoS management: improving your quality
– End-to-End: FEC, LBR, PLC
– Network provisioning: voice traffic aggregation

Reality check
– Performance of end-points (IP phones, …)
– Deployment issues in VoIP
– Evaluation of VoIP service availability through Internet measurement

Page 5: QoS Measurement and Management for VoIP

Workings of a VoIP Client

Audio is packetized, encoded and transmitted
Forward error correction (FEC) may be used to recover lost packets
Playout control smooths out jitter to minimize late losses; coupled with FEC
Packet loss concealment (PLC)
– Last line of “defense” after FEC and playout

FEC affects playout control

[Figure: VoIP client pipeline. Labels: multimedia packets with FEC; Internet (added loss, jitter); FEC recovery; playout delay control & decoding (added late losses); loss concealment (losses unrecoverable by FEC)]

Page 6: QoS Measurement and Management for VoIP

LBR: An Alternative to FEC

An (n,k) block FEC code can recover up to n-k losses per block

Low Bit-rate Redundancy (LBR)
– Transmit a lower bit-rate version of original audio
– No notion of “blocks”
– Not bit-exact recovery

[Figure: transmission timelines. Block FEC: data packets A to F sent as FEC block 1 and FEC block 2, each with FEC data. LBR: data packets A to F with LBR data a', b', c', d']
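As a rough illustration of the (n,k) block FEC idea, the Python sketch below builds a (3,2) code whose third packet is the XOR of the two data packets, so any single loss (n-k = 1) in a block can be repaired bit-exactly. The XOR parity scheme and the function names are assumptions chosen for illustration; the slides do not say which FEC code was used.

```python
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """XOR two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode_block(data: List[bytes]) -> List[bytes]:
    """Append one XOR parity packet to k equal-length data packets, giving a (k+1, k) code."""
    parity = data[0]
    for pkt in data[1:]:
        parity = xor_bytes(parity, pkt)
    return data + [parity]


def decode_block(received: List[Optional[bytes]]) -> Optional[List[bytes]]:
    """Recover the k data packets from a block in which None marks a lost packet.

    Returns None when more than one packet is missing, because a single
    parity packet can repair at most n - k = 1 loss.
    """
    missing = [i for i, pkt in enumerate(received) if pkt is None]
    if len(missing) > 1:
        return None
    if missing:
        present = [pkt for pkt in received if pkt is not None]
        rebuilt = present[0]
        for pkt in present[1:]:
            rebuilt = xor_bytes(rebuilt, pkt)
        received = list(received)
        received[missing[0]] = rebuilt
    return list(received[:-1])  # drop the parity packet
```

LBR differs in that the redundant copy is a lower bit-rate encoding of the audio, so the repaired frame is only an approximation of the original.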

Page 7: QoS Measurement and Management for VoIP

Objective QoS Metrics: Loss

Internet packet loss is often bursty
– May worsen voice quality more than random (Bernoulli) loss

Characterization of packet loss
– 2-state Markov (Gilbert) model: conditional loss prob.
– More detailed models, but more states!
   Extended Gilbert model, nth order Markov model
   Hidden Markov model, Gilbert-Elliot model, inter-loss distance
– More states → larger test set, loss of big picture
– Adaptive applications can trade off model accuracy for fast feedback
   Gilbert model provides an acceptable compromise

[Figure: Gilbert model state diagram. State 0 (non-loss), state 1 (loss); transition probabilities: 0→1 is p, 0→0 is 1-p, 1→0 is q, 1→1 is 1-q = p_c]
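The two-state Gilbert model can be sketched in a few lines of Python. The simulator below generates a loss trace from the transition probabilities p (non-loss to loss) and q (loss to non-loss), and compares it with the stationary loss rate p / (p + q) that follows from the two-state chain. The function names and the example parameters are illustrative assumptions, not values from the slides.

```python
import random
from typing import List


def gilbert_trace(p: float, q: float, n: int, seed: int = 0) -> List[int]:
    """Simulate n packets; 1 = lost, 0 = received.

    p: P(loss | previous packet received), the 0 -> 1 transition.
    q: P(received | previous packet lost), the 1 -> 0 transition.
    """
    rng = random.Random(seed)
    state, trace = 0, []
    for _ in range(n):
        if state == 0:
            state = 1 if rng.random() < p else 0
        else:
            state = 0 if rng.random() < q else 1
        trace.append(state)
    return trace


def unconditional_loss(p: float, q: float) -> float:
    """Stationary loss probability p_u = p / (p + q)."""
    return p / (p + q)


def conditional_loss(trace: List[int]) -> float:
    """Estimate p_c = P(loss | previous packet lost) from a trace."""
    after_loss = [cur for prev, cur in zip(trace, trace[1:]) if prev == 1]
    return sum(after_loss) / len(after_loss) if after_loss else 0.0
```

Setting q = 1 - p makes the next state independent of the current one, which reduces the model to random (Bernoulli) loss.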

Page 8: QoS Measurement and Management for VoIP

Effect of Gilbert Loss Model

Loss burst distribution of a packet trace
– Roughly, though not exactly exponential

Loss burstiness on FEC performance
– FEC less efficient under bursty loss

[Figure: number of occurrences (log scale) vs. loss burst length; series: packet trace, Gilbert model]
[Figure: p_f, final loss % after FEC, vs. conditional loss p_c (%); series: Gilbert, Bernoulli]

Page 9: QoS Measurement and Management for VoIP

Objective QoS Metrics: Delay

Complementary Conditional CDF (C3DF)
– More descriptive than auto-correlation function (ACF)
– Delay correlation rises rapidly beyond a threshold
– Approximates conditional late loss probability

[Figure: C3DF curves, y: probability vs. x: delay (sec); series: lag = 1, 2, 3, 5, 10, 20 and unconditional]

$f(t) = P[d_i > t \mid d_{i-l} > t]$, lag $l = 1, 2, 3, \ldots$; $d_i$: delay of packet $i$
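One way to estimate the C3DF from a measured delay trace is to count, among packets whose predecessor at a given lag exceeded the threshold t, how many also exceeded t. The sketch below does this directly; the function names and the list-of-delays input format are assumptions made for illustration.

```python
from typing import Sequence


def c3df(delays: Sequence[float], t: float, lag: int) -> float:
    """Estimate P[d_i > t | d_{i-lag} > t] from a delay trace.

    Returns 0.0 when no packet `lag` positions earlier exceeded t.
    """
    if lag < 1:
        raise ValueError("lag must be at least 1")
    conditioned = [
        delays[i] > t
        for i in range(lag, len(delays))
        if delays[i - lag] > t
    ]
    return sum(conditioned) / len(conditioned) if conditioned else 0.0


def ccdf(delays: Sequence[float], t: float) -> float:
    """Unconditional complementary CDF P[d_i > t]."""
    return sum(d > t for d in delays) / len(delays)
```

If t is read as the playout deadline, c3df(delays, t, 1) approximates the probability that a packet arrives late given that the previous one did, which is how the C3DF relates to conditional late loss.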

Page 10: QoS Measurement and Management for VoIP

Subjective QoS Metrics

Perceived quality
– Mean Opinion Score (MOS)
   ITU-T P.800/830
   Obtained via listening tests
– MOS variations
   DMOS (Degradation)
   CMOS (Comparison)
   MOSc (Conversational): considers delay
   A/B preference

Pros: more meaningful to end users
Cons: time consuming, labor intensive

MOS Grade   Score
Excellent   5
Good        4
Fair        3
Poor        2
Bad         1

Page 11: QoS Measurement and Management for VoIP

Effect of Loss Model on Perceived Quality

Codec: G.729 (8 kb/s ITU std)
Random (Bernoulli) vs. bursty (Gilbert) loss
– Bursty loss → lower MOS
– True even when FEC or LBR is used

[Figure: "Effect of random vs. bursty loss on MOS quality"; MOS vs. loss probability; series: random (Bernoulli) loss, bursty (Gilbert) loss]
[Figure: "random vs. bursty loss on FEC (G.723.1) quality"; MOS vs. loss probability; series: FEC (3,2) (Gilbert), FEC (3,2) (Bernoulli)]

Page 12: QoS Measurement and Management for VoIP

Going Further: Bridging Objective and Subjective Metrics

The E-model (ITU-T G.107/108)
– Originally for telephone network planning
– Considers various impairments
– Reduces to delay and loss impairment when adapted for VoIP

Objective quality estimation algorithms
– Suitable when network stats are not available, e.g., phone-to-phone service with IP in between
– Speech recognition performance may be used as a quality predictor, by comparing with original text

Page 13: QoS Measurement and Management for VoIP

The E-model

Map from loss and delay to impairment scores (Ie, Id)
Compute a gross score (R value) and map to MOSc
Limited number of codec loss impairment mappings

[Figure: Ie (loss impairment) vs. average loss probability; G.729, T=20ms, random loss]
[Figure: "R to MOS mapping"; MOS vs. R value]
[Figure: "E-model Id"; Id (delay impairment) vs. delay (ms)]
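The last step of the E-model, mapping an R value to MOS, can be sketched as follows. The conversion formula is the one given in ITU-T G.107. The base value of 93.2 and the reduction R = R0 - Ie - Id are assumptions made for illustration, and the Ie and Id inputs would have to come from codec and delay curves such as those plotted on this slide.

```python
def r_to_mos(r: float) -> float:
    """Map an E-model R value to MOS using the ITU-T G.107 conversion."""
    if r <= 0:
        return 1.0
    if r >= 100:
        return 4.5
    return 1.0 + 0.035 * r + 7e-6 * r * (r - 60.0) * (100.0 - r)


def voip_r_value(ie: float, i_d: float, r0: float = 93.2) -> float:
    """Simplified R value keeping only loss (Ie) and delay (Id) impairments.

    r0 is an assumed base rating, not a value taken from the slides.
    """
    return r0 - ie - i_d


def predict_mos_c(ie: float, i_d: float) -> float:
    """Conversational MOS estimate from the two impairment scores."""
    return r_to_mos(voip_r_value(ie, i_d))
```

Because Id grows with delay while Ie falls when FEC repairs more losses, the same two functions can be used to compare settings that trade delay against residual loss.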

Page 14: QoS Measurement and Management for VoIP

Using Speech Recognition to Predict MOS

Evaluation of automatic speech recognition (ASR) based MOS prediction
– IBM ViaVoice Linux version
– Codec used: G.729
– Performance metric
   absolute word recognition ratio:
   $R_{abs} = \dfrac{\text{number of correctly recognized words}}{\text{total number of spoken words}}$
   relative word recognition ratio:
   $R_{rel}(p) = \dfrac{R_{abs}(p)}{R_{abs}(0\%)}$, where $p$ is the loss probability
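The two recognition metrics are simple ratios; a minimal Python sketch is shown below. The function names and the dictionary layout (loss probability mapped to R_abs) are assumptions made for illustration.

```python
from typing import Dict


def absolute_ratio(correct_words: int, total_words: int) -> float:
    """R_abs = correctly recognized words / total spoken words."""
    if total_words <= 0:
        raise ValueError("total_words must be positive")
    return correct_words / total_words


def relative_ratios(r_abs_by_loss: Dict[float, float]) -> Dict[float, float]:
    """R_rel(p) = R_abs(p) / R_abs(0) for each loss probability p.

    The dictionary must contain the loss-free baseline under key 0.0.
    """
    baseline = r_abs_by_loss[0.0]
    return {p: r / baseline for p, r in r_abs_by_loss.items()}
```

Dividing by the loss-free baseline removes the speaker's own recognition level from the metric, leaving only the degradation caused by packet loss.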

Page 15: QoS Measurement and Management for VoIP

Recognition Ratio vs. MOS

Both MOS and R_abs decrease w.r.t. loss
Then, eliminate middle variable p

[Figure: "Impact of packet loss on audio quality"; MOS vs. loss rate (%); G.729 codec]
[Figure: "Impact of packet loss on automatic speech recognition"; word recognition ratio (%) vs. loss rate (%); G.729 codec]
[Figure: "mapping from speech recognition performance to MOS"; MOS vs. word recognition ratio (%)]

Page 16: QoS Measurement and Management for VoIP

Speaker Dependency

Absolute performance is speaker-dependent
But relative word recognition ratio is not
Suitable for MOS prediction

[Figure: relative word recognition ratio R_rel vs. packet loss probability p (%); series: Speaker A, Speaker B, Speaker C]
[Figure: MOS vs. relative word recognition ratio R_rel; series: Speaker A, Speaker B, Speaker C]
[Figure: word recognition ratio vs. packet loss probability p (%); series: Speaker A, Speaker B, Speaker C]

Page 17: QoS Measurement and Management for VoIP

Summary of QoS Measurement

Loss burstiness:
– Affects (generally worsens) perceived quality as well as FEC performance
– May be described with, e.g., a Gilbert model

Delay correlation:
– Increases rapidly beyond a threshold, revealed through Complementary Conditional CDF (C3DF)
– Late losses are also bursty

Perceived quality (MOS) estimation
– Analytical: the E-model
– If network statistics N/A: relative word recognition ratio can provide speaker-independent MOS prediction

Page 18: QoS Measurement and Management for VoIP

Outline

QoS measurement
– Objective vs. subjective metrics
– Automated measurement of subjective quality

QoS management: improving your quality
– End-to-End: FEC, LBR, PLC
– Network provisioning: voice traffic aggregation

Reality check
– Performance of VoIP end-points (IP phones, …)
– Deployment issues in VoIP
– Evaluation of VoIP service availability through Internet measurement

Page 19: QoS Measurement and Management for VoIP

Quality of FEC vs. LBR

FEC is substantially and consistently better
– At comparable bandwidth overhead
– Across all codec configurations tested

[Figure: "FEC vs. LBR based on G.723.1"; MOS vs. loss probability; series: J: FEC (2,1), I: G.723.1 LBR]
[Figure: "FEC vs. LBR based on AMR"; MOS vs. loss probability; series: N: AMR12.2+FEC (3,2), M: AMR12.2+6.7 LBR]
[Panel labels: G.729+G.723.1 LBR; AMR LBR]

Page 20: QoS Measurement and Management for VoIP

Quality of FEC under Bursty Loss

Packet interval T has a stronger effect on MOS with FEC than without FEC

[Figure: MOS (Mean Opinion Score) vs. p_u (overall loss rate), conditional loss probability p_c = 30%; series: T=20ms; T=40ms; T=20ms, FEC; T=40ms, FEC; annotation: 0.5-0.6 MOS]

Page 21: QoS Measurement and Management for VoIP

FEC MOS Optimization Considering Delay Effect

Larger T → higher FEC efficiency, but more delay

Optimizing T with the E-model
– Calculate final loss probability after FEC, apply delay impairment of FEC, map to MOSc

Prediction close to FEC MOS test results
– Suitable for analytical perceived quality prediction

[Figure: "FEC MOS optimization, Id != 0, d=3*T"; MOS_c vs. packet interval T (ms); series: p_u=4%, p_u=8%, p_u=12%, p_u=16%]
[Figure: "FEC MOS prediction, p_c=30%"; MOS_c vs. original loss rate (%); series: E-model prediction T=40ms, real MOS test T=40ms]

Page 22: QoS Measurement and Management for VoIP

Trade-off Analysis between Codec Robustness and FEC

3 loss repair options
– FEC, LBR, PLC

Loss-resilient codec
– Better PLC
   iLBC (IETF)
– But higher bit rate
– Better than FEC?

[Figure: MOS vs. average loss probability; series: iLBC 14kb/s, G.729 8kb/s, G.723.1 6.3kb/s]

Page 23: QoS Measurement and Management for VoIP

Observations and Results

When considering delay:
– iLBC is usually preferred in low loss conditions
– G.729 or G.723.1 + FEC better for high loss

Example: max bandwidth 14 kb/s
– Consider delay impairment (use MOSc)

[Figure: MOS_c vs. average loss probability; series: iLBC, no FEC; G.729+(5,3); G.723.1+(2,1), T=60ms]
[Figure: "Max BW: 14 kb/s"; MOS_c vs. average loss probability; curve segments labeled iLBC; G.729+(5,3); G.723.1+(2,1), T=60ms]

Page 24: QoS Measurement and Management for VoIP

Effect of Max Bandwidth on Achievable Quality

From 14 to 21 kb/s: significant improvement in MOSc
From 21 to 28 kb/s: marginal change due to increasing delay impairment by FEC

[Figure: MOS_c vs. average loss probability; series: Max BW: 14 kb/s, Max BW: 21 kb/s, Max BW: 28 kb/s]

Page 25: QoS Measurement and Management for VoIP

Provisioning a VoIP Network

Silence detection/suppression
– Transmit only during On period, saves bandwidth
– Allows traffic aggregation through statistical multiplexing

Characteristics of On/Off patterns in VoIP
– Traditionally found to be exponentially distributed
– Modern silence detectors (G.729B VAD, NeVoT SD) produce different patterns

[Figure: "talk-spurt/gap distribution, G.729B VAD"; complementary CDF (log scale) vs. spurt/gap duration (in 10 ms frames); series: real spurt CDF, exponential spurt CDF, real gap CDF, exponential gap CDF]
[Figure: "talk-spurt/gap distribution, NeVoT SD (default setting)"; complementary CDF (log scale) vs. spurt/gap duration (in 10 ms frames); series: real spurt CDF, exponential spurt CDF, real gap CDF, exponential gap CDF]

Page 26: QoS Measurement and Management for VoIP

Traffic Aggregation Simulation

Token bucket filter with N sources, R: reserved to peak BW ratio
CDF model resembles trace model in most cases
Exponential (traditional) model
– Under-predicts out-of-profile packet probability
– Under-prediction ratio depends on token buffer size B

Similar results for NeVoT SD
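A token bucket filter over aggregated On/Off voice sources can be sketched as below. The frame-based time step, one packet per active source per frame, whole-packet tokens, and all names are simplifying assumptions for illustration; the slides do not describe the simulator at this level of detail.

```python
from typing import List


def out_of_profile_fraction(active_per_frame: List[int], n_sources: int,
                            r: float, bucket_size: float) -> float:
    """Fraction of packets that find no token in a token bucket filter.

    active_per_frame[i] is the number of sources in the On state in frame i,
    each sending one packet. Tokens arrive at r * n_sources per frame
    (r = reserved-to-peak bandwidth ratio) and the bucket holds at most
    bucket_size tokens.
    """
    tokens = bucket_size              # start with a full bucket
    sent = out_of_profile = 0
    for active in active_per_frame:
        tokens = min(bucket_size, tokens + r * n_sources)
        conforming = min(active, int(tokens))   # one whole token per packet
        tokens -= conforming
        sent += active
        out_of_profile += active - conforming
    return out_of_profile / sent if sent else 0.0
```

Feeding the function with activity counts built from measured talk-spurt and gap durations, and again with counts drawn from an exponential fit, gives the kind of comparison this slide summarizes.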

Page 27: QoS Measurement and Management for VoIP

Summary of QoS Management

End-to-End
– FEC is superior in quality to LBR
– Codec robustness is better than FEC in low loss conditions
   Combining both schemes brings the best of both sides

Network provisioning
– Observation: New silence detectors (G.729B, NeVoT SD) → non-exponential voice On/Off patterns
– Result: performance of voice traffic aggregation under new On/Off patterns
– Important in traffic engineering and Service Level Agreement (SLA) validation

Page 28: QoS Measurement and Management for VoIP

Outline

QoS measurement
– Objective vs. subjective metrics
– Automated measurement of subjective quality

QoS management: improving your quality
– End-to-End: FEC, LBR, PLC
– Network provisioning: voice traffic aggregation

Reality check
– Performance of end-points (IP phones, …)
– Deployment issues in VoIP
– Assessment of VoIP service availability through Internet measurement

Page 29: QoS Measurement and Management for VoIP

Mouth-to-ear Delay of VoIP End-points

All receivers can adjust M2E delay adaptively whenever it is too low or too high
M2E delay depends mainly on receiver (esp. RAT)
HW phones have relatively low delay (~45-90ms)

[Figure: M2E delay (ms) vs. time (sec); series: experiment 1-1, experiment 1-2; silence gaps marked]
[Figure: "Effect of Sender and Receiver"; M2E delay (ms) by receiver (3Com, Cisco, Mediatrix, Pingtel, RAT); series: sender 3Com, Cisco, Mediatrix, Pingtel, RAT]

Page 30: QoS Measurement and Management for VoIP

But Adaptiveness ≠ Perfection

Symptom of playout buffer underflow
Waveforms are dropped
Occurred at point of delay adjustment
Bugs in software?
Is a LAN enough for perfect quality?

Page 31: QoS Measurement and Management for VoIP

Major Observations

Overall: end-points matter a lot!

HW IP phones: 45-90ms average M2E delay

SW clients:
– Messenger 2000 lowest (68ms), XP (96-120ms)
   c.f. GSM-PSTN: 110ms either direction
– NetMeeting very bad (> 400ms)

PLC robustness
– Acceptable in all 3 IP phones tested, Cisco phone more robust

Silence detection/suppression
– Works for speech input
– Often fails for non-speech (e.g., music) input
   Generates many unnatural gaps
   Not good for customer support center (on-hold music)!

Acoustic echo cancellation (AEC):
– Good on most IP phones (Echo Return Loss > 40 dB)
– But some do not implement AEC at all

Page 32: QoS Measurement and Management for VoIP

Reality Check #2: IP Telephony Deployment

Localized deployment at Columbia Univ.

[Figure: deployment architecture. Labels: SIP proxy, redirect server; SQL database; sipd; conference server; voicemail server; T1/E1; RTP/SIP; regular phone; SIP/PSTN gateway; telephone switch/PBX; web based configuration; web server; server status monitoring; core server; IP phones]

Page 33: QoS Measurement and Management for VoIP

Issues and Lessons Learned

PSTN/PBX integration
– Requires full understanding of legacy networks
   Lower layer (e.g., T1 line configuration)
   – Parameters must match on both PSTN/PBX and gateway!
   PBX access configurations
   – To ensure calls go through in both directions
   Address translation (dial-plan) in both directions
– Previous lessons/experiences can help greatly
   E.g., second gateway installed in weeks instead of months

Security
– Issue: SIP/PSTN gateway has no authentication feature
– Solution:
   Use gateway’s access control lists to block direct calls
   SIP proxy server handles authentication using record-route

Page 34: QoS Measurement and Management for VoIP

Reality Check #3: VoIP Service Availability

Focus on availability rather than traditional QoS
– Delay is a minor issue; FEC recovers most isolated losses
– Ability to make a call is vital, especially in emergency

Internet measurement sites:
– 14 nodes worldwide, not just Internet2 and alike

Definitions:
– Availability = MTBF / (MTBF + MTTR)
– Availability = successful calls / first call attempts

Reference points:
– Equipment availability: 99.999% (“5 nines”) → about 5 minutes of downtime per year
– AT&T: 99.98% availability (1997)
– IP frame relay SLA: 99.9%
– UK mobile phone survey: 97.1-98.8%
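Both availability definitions, and the conversion from an availability figure to yearly downtime, reduce to one-line formulas. The sketch below uses a 365-day year, and the function names are illustrative assumptions.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes; leap years ignored


def availability(mtbf: float, mttr: float) -> float:
    """Availability = MTBF / (MTBF + MTTR), both in the same time unit."""
    return mtbf / (mtbf + mttr)


def downtime_minutes_per_year(avail: float) -> float:
    """Expected downtime per year implied by an availability figure."""
    return (1.0 - avail) * MINUTES_PER_YEAR


def call_availability(succeeded: int, failed: int) -> float:
    """Availability = successful calls / first call attempts."""
    return succeeded / (succeeded + failed)
```

For example, downtime_minutes_per_year(0.99999) comes to roughly 5.26 minutes, which is where the "5 minutes/year" figure for five nines comes from.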

Page 35: QoS Measurement and Management for VoIP

First Look at Availability

Call success probability:
– 62,027 calls succeeded, 292 failed → 99.53% availability
– Roughly constant across I2, I2+, commercial ISPs: 99.39-99.58%

Overall network loss
– PSTN: once connected, call usually of good quality
   exception: mobile phones
– Compute % time below loss threshold
   5% loss causes degradation for many codecs
   others acceptable till 20%

% of time at or below loss threshold:

loss       0%     5%      10%     20%
All        82.3   97.48   99.16   99.75
ISP        78.6   96.72   99.04   99.74
I2         97.7   99.67   99.77   99.79
I2+        86.8   98.41   99.32   99.76
US         83.6   96.95   99.27   99.79
Int.       81.7   97.73   99.11   99.73
US ISP     73.6   95.03   98.92   99.79
Int. ISP   81.2   97.60   99.10   99.71

Page 36: QoS Measurement and Management for VoIP

Network Outages

Sustained packet losses
– arbitrarily defined at 8 packets
– far beyond recoverable (FEC, interpolation)

23% of packet losses are outages
Make up significant part of 0.25% unavailability
Symmetric: A→B ⇔ B→A
Spatially correlated: A→B ⇒ A→X
Not correlated across networks (e.g., I2 and commercial)
Mostly short (a few seconds), but some are very long (100’s of seconds) and make up the majority of outage time

[Figure: complementary CDF (log scale) vs. outage duration (sec); series: US domestic paths, international paths]
[Figure: complementary CDF (log scale) vs. outage duration (sec); series: all paths, Internet2]
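Outages defined as sustained losses can be extracted from a per-packet loss trace by scanning for runs of consecutive losses. The sketch below reads the slide's threshold as "at least 8 consecutive lost packets", which is an assumption about the exact definition; the function names are illustrative as well.

```python
from typing import List, Tuple

OUTAGE_THRESHOLD = 8  # consecutive lost packets; assumed reading of the slide


def find_outages(lost: List[bool],
                 threshold: int = OUTAGE_THRESHOLD) -> List[Tuple[int, int]]:
    """Return (start_index, length) for every run of >= threshold consecutive losses."""
    outages, run_start = [], None
    for i, is_lost in enumerate(lost + [False]):  # sentinel closes a trailing run
        if is_lost and run_start is None:
            run_start = i
        elif not is_lost and run_start is not None:
            if i - run_start >= threshold:
                outages.append((run_start, i - run_start))
            run_start = None
    return outages


def outage_share_of_losses(lost: List[bool],
                           threshold: int = OUTAGE_THRESHOLD) -> float:
    """Fraction of all lost packets that fall inside outages."""
    total = sum(lost)
    in_outages = sum(length for _, length in find_outages(lost, threshold))
    return in_outages / total if total else 0.0
```

Multiplying each outage length by the packet interval converts it to a duration in seconds, the quantity on the x-axis of the plots above.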

Page 37: QoS Measurement and Management for VoIP

Outage-induced Call Abortion Probability

Long interruption → user likely to abandon call
From E.855 survey: $P[\text{holding}] = e^{-t/17.26}$ (t in seconds)
→ half the users will abandon call after 12s

2,566 calls have at least one outage
946 of 2,566 expected to be dropped → 1.53% of all calls

Paths      Calls expected to be dropped
all        1.53%
I2         1.16%
I2+        1.15%
ISP        1.82%
US         0.99%
Int.       1.78%
US ISP     0.86%
Int. ISP   2.30%
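The holding-time formula can be evaluated directly. The sketch below implements P[holding] = e^(-t/17.26) and an expected number of abandoned calls; treating each call by its longest outage only is a simplifying assumption, and the function names are illustrative.

```python
import math
from typing import Iterable

HOLD_TIME_CONSTANT = 17.26  # seconds, from the formula on the slide


def holding_probability(t: float) -> float:
    """P[user still holding after an interruption of t seconds] = exp(-t / 17.26)."""
    return math.exp(-t / HOLD_TIME_CONSTANT)


def abandon_probability(t: float) -> float:
    """Probability that the user has abandoned the call by time t."""
    return 1.0 - holding_probability(t)


def expected_dropped_calls(longest_outages: Iterable[float]) -> float:
    """Expected number of abandoned calls, given each call's longest outage in seconds."""
    return sum(abandon_probability(t) for t in longest_outages)
```

The time at which half the users have given up is 17.26 × ln 2, about 11.96 s, consistent with the 12 s quoted above.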

Page 38: QoS Measurement and Management for VoIP

Summary of Service Availability

Through several metrics, one can translate from network loss to VoIP service availability (no Internet dial-tone)

Current results show availability far below five 9’s, but comparable to mobile telephony
– Outage statistics are similar in research and ISP networks

Working on identifying fault sources and locations

Additional measurement sites are welcome

Page 39: QoS Measurement and Management for VoIP

Conclusions

Measuring QoS
– Loss burstiness and delay correlation affect (generally worsen) perceived quality
– Bridging objective and subjective metrics: the E-model, or speech recognition based MOS prediction
– Performance of real products: IP phones and soft clients

Ensuring/improving QoS
– Network provisioning (voice traffic aggregation)
   Efficient, but may be expensive to deploy and manage
– End-to-End (FEC > LBR, PLC)
   Easier to deploy, but must control overhead of FEC

Reality Check
– Good implementation at the end-point (e.g., IP phones) is vital
– VoIP deployment requires PSTN integration and security
– Service availability is crucial for VoIP, but still far from 99.999% over the Internet

Page 40: QoS Measurement and Management for VoIP

Ongoing and Future Work

Sampling Internet performance
– Where do the problems reside?
   Access networks (Cable, DSL), or International paths?
– How can we solve these problems?
   Can adaptive FEC react fast enough to changes in network conditions?

Playout delay behaviors of VoIP end-points
– How well do they react to jitter, delay spikes?