SpeechTEK 2006 – Voice Over IP Tutorial
Andrew Hunt, Ph.D., VP Engineering, Holly Connects
Welcome!
Who are you?
Andrew Hunt, Ph.D., VP of Engineering
Level 11, 301 George Street, Sydney, NSW 2000, Australia
Tel: +61 2 8207 8207
Email: [email protected]
http://www.holly-connects.com/
DRAFT V03
Timing
8:30am Start
10:00-10:30am Coffee break
12:00-1:00pm Lunch
2:30-3:00pm Coffee break
4:30pm Close
Agenda
1. Welcome & Introductions
2. Why Voice Over IP?
3. Brief History of Telephony
4. Digital Voice
5. Voice Over IP – Basics
6. VoIP Protocols
7. SIP: Session Initiation Protocol
8. RTP: Real-time Transport Protocol
9. Network Issues and Design
10. VoIP and Speech Recognition
11. VoIP and Mobile Telephony
12. Closing
Objectives
Informative
Relevant
Interesting
Interactive
Why Voice Over IP?
Module Overview
Why Voice Over IP?
What is it?
Cost
Functionality & flexibility
Mobility
VoIP Definition
Definitions
“Voice Over IP” is the use of the internet, intranets and other IP networks for the delivery of voice conversations
“Internet Protocol” (IP) is a protocol used for communicating data across a packet-switched network (specifically IPv4 or IPv6)
Many VoIP protocols exist: focus today on SIP and related protocols for use in speech recognition and IVR contexts
Why VoIP?
Cost
Global data traffic exceeded voice traffic in the late 1990s
Telco charges & revenue largely from voice traffic
Migration to shared networks
Single network for voice and data
Utilize spare capacity in many data networks
Inwards-out approach to migration
Traditional voice carriage costs to business being driven down by VoIP
Residential advantages
Free services: Skype etc.
Lower cost services: Vonage, Skype etc.
Regulatory and service capability issues are evolving
Why VoIP?
Functionality and Flexibility
Virtualization & mobility: move calls and agents locally, nationally, globally
Match traditional telephony functions
Transfers, voice mail, conferencing, redial, speed-dial, forwarding etc.
Integration with software and internet services
Availability notification (IM)
Instant messaging
Multi-media services: video, data files etc.
Options for traditional and VoIP telephony to co-exist and migrate steadily
Why VoIP?
Mobility
Make VoIP calls virtually anywhere in the world
Number portability: landline, mobile, internet
Synchronization of address books and contacts
Brief History of Telephony
Module Overview
Brief History of Telephony
Telephony switching
Basic concepts of telephony
Call establishment
Circuits
Switching
Migrating from analogue to digital
Telephony: Switching and Circuits
Alexander Graham Bell
Thomas Watson
Telephony: Switching and Circuits
1. Caller picks up
2. Caller “dials”
3. Callee phone rings
4. Callee answers
5. Circuit established
6. Conversation
7. Hang-up & tear-down
Telephony: Switching and Circuits
[Figure: telephones connected through a network of five exchanges]
Telephony: Switching and Circuits
PSTN: Public Switched Telephone Network
1. A-party picks up
2. A-party dials
3. B-party rings
4. B-party answers
5. Circuit established
6. Conversation
7. Hang-up & tear-down
Telephony: Switching and Circuits
Pre-Digital “Analogue” Era
1878: New Haven, Connecticut
World’s first commercial telephone exchange
Built from “carriage bolts, handles from teapot lids and bustle wire”
Cost: $40 including the furniture
1891: Topeka, Kansas
Almon Strowger, an undertaker, patented the Strowger switch
Automation of the telephone circuit switching by decadic pulses
1950 onwards: Crossbar switches
1964: “Dual Tone Multi-Frequency” (DTMF) introduced
Telephony: Switching and Circuits
Pre-Digital “Analogue” Era
Infrastructure
Local wiring to each phone
System of local, regional, national and international exchanges
Shared connections between exchanges
Call establishment
Numbering scheme (many iterations): Voice → decadic pulsing → DTMF
Mapping of numbering scheme to the exchanges (e.g. CAstle=22)
Call communication
Dedicated circuit per call
Telephony: Switching and Circuits
Digital Era
Digital started with the core telco networks
Efficiency on long-distance carriage
Efficiency of solid state switching technology
Migrated to local exchanges
What is digital voice?
Digital Voice
Module Overview
Digital Speech
Sampling: creating digital audio
CODECs: Compression and companding
Sharing channels: Time Division Multiplexing (TDM)
Standard digital data links: E1 & T1
Telephony: Packet Switch Network
[Figure: digitised voice carried as bit streams (…0101100…) between telephones across the PSTN (Public Switched Telephone Network)]
Sampling Theory
Sampling Theory
Sample frequency → bandwidth
Sampling Theory
Sample resolution = bits → noise/error
Sampling Theory
Time (msec)   Value
0             0
1             29
2             57
3             82
4             102
5             117
6             125
7             126
8             120
9             108
10            89
11            66
12            39
13            9
14            -20
15            -49
16            -75
17            -97
18            -114
19            -124
20            -127
…             …
Sampling Theory: Companding
Non-linear sampling → better noise/error
G.711 μ-law for North America and Japan
G.711 A-law for Europe and rest of the world
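The idea behind companding can be sketched with the continuous μ-law curve. This is an illustration only: G.711 itself specifies a piecewise-linear approximation of this curve, and the function names and the simple 8-bit rounding below are assumptions made for the example, not part of the standard.

```python
import math

MU = 255  # mu value associated with 8-bit mu-law telephony


def mu_law_compress(x: float) -> float:
    """Map a linear sample in [-1.0, 1.0] onto the mu-law curve in [-1.0, 1.0]."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def mu_law_expand(y: float) -> float:
    """Inverse of mu_law_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def quantize_8bit(y: float) -> int:
    """Round a value in [-1.0, 1.0] to a signed code in -127..127 (illustrative only)."""
    return round(y * 127)


# A quiet sample keeps far more resolution after companding than with
# plain linear 8-bit quantisation.
quiet = 0.01
linear_code = quantize_8bit(quiet)
companded_code = quantize_8bit(mu_law_compress(quiet))
```

Quiet samples are spread over many more codes than loud ones, which is why non-linear sampling gives a better noise/error trade-off for speech at 8 bits.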
Sampling Theory
Measure                 G.711 Telephony               Compact Disc
Sampling rate           8,000 Hz                      44,100 Hz
Frequency range         Low – 3.5 kHz (4 kHz max)     Lower – 22 kHz
Sample type             8-bit A-law / μ-law, Mono     16-bit linear PCM, Stereo
Signal-to-noise ratio   < 70 dB                       96 dB
Data bandwidth          8 kByte/sec                   176 kByte/sec
Perceived quality       Telephony                     Great
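The data bandwidth row follows directly from sampling rate, sample size and channel count. A minimal sketch of that arithmetic (the function name is chosen for this example):

```python
def pcm_bytes_per_second(sample_rate_hz: int, bits_per_sample: int, channels: int) -> int:
    """Uncompressed audio data rate in bytes per second."""
    return sample_rate_hz * bits_per_sample * channels // 8


telephony = pcm_bytes_per_second(8_000, 8, 1)        # 8,000 bytes/s = 64 kbit/s
compact_disc = pcm_bytes_per_second(44_100, 16, 2)   # 176,400 bytes/s
```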
Digital Transport
[Figure: sample values from several voice channels interleaved onto one shared link. Labels: TDM = Time Division Multiplex, Latency, Packet]
Digital Transport
E1
World (ex. NA and Japan)
2.048 Mbit/s full duplex
32 time slots = 32 channels of 8-bit x 8 kHz
1 time slot reserved for framing
1 time slot is typically reserved for signalling
30 time slots for voice communications
E3 = 16 x E1 = 480 channels
T1
North America and Japan
1.544 Mbit/s full duplex (1.536 Mbit/s of payload plus 8 kbit/s of framing)
24 time slots = 24 channels of 8-bit x 8kHz
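The link rates can be checked from the slot counts. The sketch below assumes 8,000 frames per second with 8 bits per time slot, as stated above; the single framing bit per T1 frame comes from the T1 frame format rather than from this slide, and the function name is illustrative.

```python
SLOT_BITS = 8
FRAMES_PER_SECOND = 8_000


def link_rate_bps(time_slots: int, framing_bits_per_frame: int = 0) -> int:
    """Bit rate of a TDM link carrying one 8-bit sample per slot per frame."""
    return (time_slots * SLOT_BITS + framing_bits_per_frame) * FRAMES_PER_SECOND


e1_rate = link_rate_bps(32)                            # 2,048,000 bit/s
t1_payload = link_rate_bps(24)                         # 1,536,000 bit/s
t1_line = link_rate_bps(24, framing_bits_per_frame=1)  # 1,544,000 bit/s
```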
Compression
Compression of voice and audio
CODEC = COmpression - DECompression
Reduce bandwidth for the audio signal = more channels on the same transport
Lossless vs. lossy algorithms
Latency impact
CPU impact
Quality impact
Speech recognition impact
Sampling Theory
CODEC         Characteristics                                  Description
ITU-T G.711   Sample rate: 8 kHz                               Standard telephony quality with “companding”
              Sample size: 8-bit A-law/μ-law
              Bandwidth: 64 kbit/s
ITU-T G.726   Sample rate: 8 kHz                               Adaptive Differential PCM (ADPCM). Supersedes G.721 & G.723
              Sample size: 2, 3, 4, 5-bit
              Bandwidth: 16, 24, 32, 40 kbit/s
ITU-T G.728   Bandwidth: 16 kbit/s                             LD-CELP = Low Delay Code Excited Linear Prediction
              Delay: 5 samples, 0.625 ms
ITU-T G.729   Bandwidth: 8 or 6.4, 11.8 kbit/s                 CS-ACELP = Conjugate-Structure Algebraic-Code-Excited Linear Prediction
              Delay: 10 ms chunks
ITU-T G.722   Sample rate: 16 kHz                              Standard wideband speech ADPCM codec. Non-telephony CODEC
              Sample size: 14-bit
              Bandwidth: 32-64 kbit/s
Digital Transport
CODECs can increase channel capacity
1 E1/T1 channel = 64 kbit/s
= 1 channel of G.711
= 2 channels of G.726 @ 32 kbit/s
= 8 channels of G.729 @ 8 kbit/s
1 T1 = 24 channels
= 24 channels of G.711
= 48 channels of G.726 @ 32 kbit/s
= 192 channels of G.729 @ 8 kbit/s
BUT, CODECs can reduce voice quality
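The capacity figures are simple integer division of the available bit rate by the CODEC bit rate. A rough sketch that ignores signalling and framing overhead (the names are illustrative):

```python
CHANNEL_KBPS = 64   # one E1/T1 time slot
T1_SLOTS = 24


def calls_per_link(link_kbps: int, codec_kbps: int) -> int:
    """Number of whole calls that fit in the given bit rate."""
    return link_kbps // codec_kbps


per_slot = {codec: calls_per_link(CHANNEL_KBPS, rate)
            for codec, rate in {"G.711": 64, "G.726": 32, "G.729": 8}.items()}
per_t1 = {codec: calls * T1_SLOTS for codec, calls in per_slot.items()}
```

With these inputs a T1 carries 24, 48 and 192 calls for G.711, G.726 at 32 kbit/s and G.729 at 8 kbit/s respectively.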
Digital Transport
Advanced topics
Unreliable tone transmission on some CODECs
DTMF, Fax, Modem etc.
Use “out-of-band” communication (more later)
Silence suppression
Voice Activity Detection
Replace by simulated background noise
e.g. G.729 Annex B
Comfort noise generator (CNG)
Played when a communication channel fails temporarily
Usability: Reduces hang-up on temporary outages
Voice Over IP – Basics
Module Overview
Voice Over IP Basics
Internet and intranet for voice communication
Challenges of the internet protocols
Quick guide to the internet protocols: TCP/IP, UDP
Applying internet protocols to session management & voice carriage
Telephony: Packet Switch Network
[Figure: digitised voice carried as bit streams (…0101100…) between telephones across the PSTN (Public Switched Telephone Network)]
Voice Over Internet Protocol
[Figure: the same bit streams (…0101100…) carried between end-points across an intranet or the Internet]
Challenges
Call establishment protocols
Connect A-party and B-party
Perform advanced telephony functions
Audio transport protocols
Get the packets from A-to-B / B-to-A on time
Latency
Packet loss
Jitter
Quality of service
The Problem
The internet was not designed to carry real-time voice traffic!!!
(well almost)
Internet Protocol Suite Stack
Layer 1   Link          Ethernet, Wi-Fi, ATM, Frame Relay…
Layer 2   Network       IP (IPv4, IPv6)
Layer 3   Transport     TCP, UDP, SCTP, DCCP, IL, RUDP, …
Layer 4   Application   DNS, FTP, HTTP, SMTP, SNMP, TELNET, SIP, RTP, H.323…
Voice Over Internet Protocol
[Figure: two applications communicating across the Internet]
Internet Protocol Suite Stack
[Figure, built up over three slides: two end hosts, each with Link, Network, Transport and Application layers, joined by a peer-to-peer connection. Intermediate nodes implement only the Link and Network layers. A packet (0101100) is relayed hop by hop between the hosts; the final slide adds the label “Packet Loss”.]
Internet Protocol Suite Stack
TCP = Transmission Control Protocol (written TCP/IP when carried over IP)
Transport protocol
One of the core protocols of the Internet protocol suite: 75% of all traffic
Applications on networked hosts can create connections to one another using TCP for exchange of data or packets.
Guarantees reliable and in-order delivery of data from sender to receiver
Distinguishes data for multiple, concurrent applications on the same host
TCP supports many of the most popular application protocols including HTTP (Web), Email and SIP
Send a stream of bytes through a virtual “pipe”
Utilizes sequence numbers, acknowledgement, timeout, retransmission…
Internet Protocol Suite Stack
TCP/IP for Voice Over IP
Good for session management
H.323 / ASN.1 built on TCP/IP
Sub-optimal for near-realtime audio transport
Latency
Jitter
Aside: Utilized for Skype
Internet Protocol Suite Stack
UDP = User Datagram Protocol
Transport protocol:
One of the core protocols of the Internet protocol suite: 20% of all traffic
Does not provide the reliability and ordering guarantees of TCP
Datagrams may arrive out of order or be dropped by the network
Datagram transmission is stateless in the network
Lower overhead = faster, more efficient, suited to time-sensitive comms
UDP supports many application protocols including DNS and RTP
Internet Protocol Suite Stack
UDP for Voice Over IP
OK for session management
End-parties need reliable communication about session status
SIP utilises UDP with retry behaviour to withstand packet loss
UDP offers faster setup time than TCP/IP
Suited to near-realtime audio transport
Utilized by RTP
Better latency than TCP/IP (though not ideal)
Better jitter than TCP/IP (though not ideal)
Packet loss causes poorer audio transmission than TCP/IP
Internet Protocols for Voice Over IP
Internet vs. PSTN
Internet has smart terminals and “dumb” network
PSTN has “dumb” terminals and smart network
PSTN dedicates virtual connections for audio and session
Internet normally creates connections on an as-needed basis
Internet protocols emerging for traffic shaping suited to telephony
Network tools also emerging for banishing and punishing telephony
VoIP Protocols
Module Overview
VoIP Protocols
Overview of the landscape of VoIP protocols
Major IETF standards: SIP, RTP, RTCP
Major ITU-T standards: H.323 family
VoIP Protocols: IETF
Protocol       Name                                               Description
SIP            Session Initiation Protocol                        Session management on UDP
RTP, SRTP      Real-time Transport Protocol, Secure RTP           Audio/video media delivery on UDP
RTCP, SRTCP    Real-time Transport Control Protocol, Secure RTCP  Out-of-band control protocol for RTP
VoIP Protocols
About SIP
IETF Protocol
Standard for initiating, modifying, and terminating a user session that may involve media elements such as voice, video, instant messaging etc.
Used widely in telephony environments
Supported by numerous IVR platforms
Accepted in 2000 as the signalling protocol of the IMS architecture
Other uses
MRCP v2: Media Resource Control Protocol for Speech Recognition and Text-to-Speech
Microsoft Messenger
VoIP Protocols: ITU-T
Protocol   Description
H.323      Umbrella recommendation for audio-visual comms on any packet network. References the following specifications:
H.225.0    Protocol to describe call signaling, the media (audio and video), the stream packetization, media stream synchronization and control message formats
H.245      Control protocol for multimedia communication with messages and procedures used for opening and closing logical channels for audio, video and data, capability exchange, control and indications
H.235      Describes security in H.323
H.239      Describes dual stream use in videoconferencing, usually one for live video, the other for presentation
VoIP Protocols
About H.323
Based on ISDN Q.931
Suited to internetworking between IP and ISDN / QSIG
Similar call model to ISDN
Used widely in telephony environments
Telecommunications backbones
Other uses
Microsoft NetMeeting
Today’s Focus
We’ll focus on SIP today
SIP: Session Initiation Protocol
Module Overview
Session Initiation Protocol
Components of the SIP architecture
SIP Messaging
Standard SIP exchanges
Ring, hold, answer, transfer, consultative transfer, conferencing
SIP addresses
SIP working with RTP for Voice
SIP Overview
Session Initiation Protocol (SIP)
There are many applications of the Internet that require the creation and management of a session, where a session is considered an exchange of data between an association of participants. The implementation of these applications is complicated by the practices of participants: users may move between endpoints, they may be addressable by multiple names, and they may communicate in several different media - sometimes simultaneously. Numerous protocols have been authored that carry various forms of real-time multimedia session data such as voice, video, or text messages. The Session Initiation Protocol (SIP) works in concert with these protocols by enabling Internet endpoints (called user agents) to discover one another and to agree on a characterization of a session they would like to share. For locating prospective session participants, and for other functions, SIP enables the creation of an infrastructure of network hosts (called proxy servers) to which user agents can send registrations, invitations to sessions, and other requests. SIP is an agile, general-purpose tool for creating, modifying, and terminating sessions that works independently of underlying transport protocols and without dependency on the type of session that is being established.
SIP Overview
Session Initiation Protocol (SIP)
http://www.ietf.org/rfc/rfc3261.txt
Protocol developed by the IETF MMUSIC Working Group (now SIP)
Scope: initiate, modify and terminate an interactive user session that involves voice and multimedia elements such as video, instant messaging and games.
SIP 2.0 published as RFC 3261 in 2002
Earlier version published as RFC 2543 in 1999 (now obsolete), following initial drafts in 1996
SIP enables device-to-device communication with media communication via other protocols
SDP: Session Description Protocol - RFC 2327 (describe media capabilities)
RTP: Real-time Transport Protocol - RFC 3550 (transport audio, video, media)
RTCP: Real-time Transport Control Protocol - RFC 3550 (control transport of media)
Standard protocols with high level of product interoperability
SIP Overview
Session Initiation Protocol (SIP)
SIP is an “application-layer” control protocol in the Internet stack
SIP is a device-to-device, client-server session signalling protocol
SIP establishes sessions for voice and other media
Allows integration with other services: web, email, IM…
Allows presence and mobility services
SIP Overview
Applications of SIP
SIP can convey arbitrary payload
Session description
Instant messages
Pictures (e.g. picture of the caller)
Speech recognition control
Web pages
SIP Overview: Devices
[Figure: example SIP phones and softphones from Cisco, xTen, Siemens, Avaya, BlackBerry and Express Talk]
SIP Overview
Network Devices
SIP Proxy Server
Intermediary to relay call signalling
SIP Redirect Server
Redirects callers to other servers
SIP Registrar
Accept registration requests from users
Maintains user’s whereabouts
SIP IVR
SIP PBX
SIP Communications
SIP Jargon
User Agent Client = Initiates a communication
User Agent Server = Respondent to a communication
Note: device can be both client and server in a single session
Examples:
Desktop phone is both a client (makes calls) and server (receives calls)
SIP Communications
SIP Addresses
SIP address can make you globally reachable
Callees bind to this address using SIP REGISTER method
Callers use this address to establish real-time communication with callees
SIP address is a URI address format:
sip:[email protected]
sip:[email protected]?subject=wassup
Can embed in web pages or place on your business card
Highlighted text is the public identifier
SIP URI address contents
Must include host
May include user name
May include the port number
May include other parameters (e.g., transport)
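The parts of a SIP URI listed above can be pulled apart with a small parser. This is a simplified sketch, not the full RFC 3261 grammar: it does not handle IPv6 literals or escaped characters, and the example address `alice@example.com` is hypothetical.

```python
import re
from typing import Dict, NamedTuple, Optional


class SipUri(NamedTuple):
    user: Optional[str]
    host: str
    port: Optional[int]
    params: Dict[str, str]
    headers: Dict[str, str]


_SIP_URI = re.compile(
    r"^sips?:"
    r"(?:(?P<user>[^@;?]+)@)?"      # optional user name
    r"(?P<host>[^:;?]+)"            # mandatory host
    r"(?::(?P<port>\d+))?"          # optional port
    r"(?P<params>(?:;[^;?]+)*)"     # optional ;name=value parameters
    r"(?:\?(?P<headers>.+))?$"      # optional ?name=value headers
)


def _pairs(text: str, separator: str) -> Dict[str, str]:
    """Split 'a=1;b=2' style text into a dictionary."""
    result = {}
    for item in filter(None, text.split(separator)):
        key, _, value = item.partition("=")
        result[key] = value
    return result


def parse_sip_uri(uri: str) -> SipUri:
    match = _SIP_URI.match(uri)
    if match is None:
        raise ValueError(f"not a SIP URI: {uri!r}")
    port = match.group("port")
    return SipUri(
        user=match.group("user"),
        host=match.group("host"),
        port=int(port) if port else None,
        params=_pairs(match.group("params"), ";"),
        headers=_pairs(match.group("headers") or "", "&"),
    )


example = parse_sip_uri("sip:alice@example.com:5060;transport=udp?subject=hello")
```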
SIP Communications
Protocol design
Similar protocol to HTTP and SMTP
Transmission via UDP messages
Human-readable messages
Simple interaction mechanism
User Agent “A” sends a Control Message to User Agent “B”
User Agent “B” sends a Response Code to User Agent “A”
Retry in the event of communication failure
SIP Communications
SIP Methods (Control Messages)
INVITE = invite a user agent to a session
ACK = acknowledge a communication
OPTIONS = query servers about their capabilities
REGISTER = register with a SIP Registrar
BYE = terminate a session
CANCEL = cancel a pending request (e.g. an INVITE that has not yet been answered)
SIP Communications
SIP Response Codes
1xx: Provisional -- request received, continuing to process the request
2xx: Success -- the action was successfully received, understood, and accepted
200 OK
3xx: Redirection -- further action required to complete the request
4xx: Client Error -- the request contains bad syntax or cannot be fulfilled
5xx: Server Error -- the server failed to fulfil an apparently valid request
6xx: Global Failure -- the request cannot be fulfilled at any server
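Because the class is carried in the first digit, a user agent can classify any response code, including ones it does not recognise. A minimal sketch (function names are illustrative):

```python
RESPONSE_CLASSES = {
    1: "Provisional",
    2: "Success",
    3: "Redirection",
    4: "Client Error",
    5: "Server Error",
    6: "Global Failure",
}


def response_class(code: int) -> str:
    """Return the class name for a SIP response code such as 180 or 200."""
    if not 100 <= code <= 699:
        raise ValueError(f"not a SIP response code: {code}")
    return RESPONSE_CLASSES[code // 100]


def is_final(code: int) -> bool:
    """1xx responses are provisional; every other class ends the transaction."""
    return response_class(code) != "Provisional"
```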
SIP Communications
SIP Extensions (selection amongst many)
INFO = carry session-related control information
RFC 2976
e.g. ISUP and ISDN signalling messages
REFER = refer the recipient to a new resource
RFC 3515
e.g. Call transfer
SIP: Simple Peer-to-Peer Session
Simple peer-to-peer SIP session
Assumptions
“A” knows address of “B”
“A” and “B” can see each other on the network
Audio communication
Humans are using “A” and “B”
SIP: Simple Peer-to-Peer Session
[Call-flow diagram between “A” and “B”]
User makes a call to “B”
A → B: INVITE From:A To:B A-SDP
B → A: 100 TRYING
B → A: 180 RINGING (play ring-tone to user B, play ring-tone to user A)
- - - Waiting for answer - - -
B user answers call
B → A: 200 OK B-SDP
A → B: ACK
A ↔ B: RTP Audio Streams, one in each direction (A hears B, B hears A)
- - - Call Established - - -
B hangs up
B → A: BYE (B terminates RTP audio)
A → B: 200 OK (A terminates audio)
- - - Session Over - - -
SIP: Simple Peer-to-Peer Session
[Same call flow, annotated to distinguish SIP messages (INVITE, ACK, BYE), status codes (100 TRYING, 180 RINGING, 200 OK) and the RTP audio streams]
DRAFT V03DRAFT V03 SIP CommunicationsSIP Communications
SIP Message: Example
INVITE sip:[email protected]:5077 SIP/2.0Via: SIP/2.0/UDP 10.0.0.3:5060From: <sip:[email protected]>;tag=40A0C340-2BCTo: <sip:[email protected]>Date: Fri, 07 Jul 2006 01:58:55 GMTCall-ID: [email protected]: timer,100relMin-SE: 1800Cisco-Guid: 4217155242-210899419-2988540394-3730825786User-Agent: Cisco-SIPGateway/IOS-12.xAllow: INVITE, OPTIONS, BYE, CANCEL, ACK, PRACK, COMET, REFER, SUBSCRIBE, NOTIFY, INFOCSeq: 101 INVITEMax-Forwards: 6Remote-Party-ID: <sip:[email protected]>;party=calling;screen=yes;privacy=offTimestamp: 1152237535Contact: <sip:[email protected]:5060>Expires: 180Allow-Events: telephone-eventContent-Type: application/sdpContent-Length: 235
<<SDP header goes here>>
Header
To & From
Unique call ID
User agent details
Transmission info
Multi-part media content
SDP removed
SIP Communications
SDP – Session Description Protocol
RFC 2327
Describe media capability of a SIP user agent
v=0
o=Holly-HVG-4-2 2890844526 2890842809 IN IP4 10.0.0.113
s=sip call from the hvg
c=IN IP4 10.0.0.113
t=0 0
m=audio 11946 RTP/AVP 8 101
c=IN IP4 10.0.0.113
a=rtpmap:0 PCMU/8000
a=rtpmap:8 PCMA/8000
a=rtpmap:101 telephone-event/8000
a=fmtp:101 0-16

Callouts on the slide: Agent Description; Agent Address; Audio Capability #1, RTP Protocol; G.711 u-law & A-law, 8000 Hz; DTMF events 0123456789ABCD*#
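To show how a user agent might read the audio capability out of this description, here is a minimal parsing sketch. It handles only the `m=audio` and `a=rtpmap` lines and assumes a single audio stream; a real SDP parser does considerably more. The function name is illustrative.

```python
from typing import List, Optional, Tuple

SDP_EXAMPLE = "\r\n".join([
    "v=0",
    "o=Holly-HVG-4-2 2890844526 2890842809 IN IP4 10.0.0.113",
    "s=sip call from the hvg",
    "c=IN IP4 10.0.0.113",
    "t=0 0",
    "m=audio 11946 RTP/AVP 8 101",
    "c=IN IP4 10.0.0.113",
    "a=rtpmap:0 PCMU/8000",
    "a=rtpmap:8 PCMA/8000",
    "a=rtpmap:101 telephone-event/8000",
    "a=fmtp:101 0-16",
])


def parse_sdp_audio(sdp: str) -> Tuple[Optional[int], List[Tuple[int, Optional[str]]]]:
    """Return (RTP port, [(payload type, encoding name/clock rate), ...])."""
    port = None
    offered: List[int] = []
    rtpmap = {}
    for line in sdp.splitlines():
        if line.startswith("m=audio "):
            fields = line.split()          # m=audio <port> <proto> <payload types...>
            port = int(fields[1])
            offered = [int(pt) for pt in fields[3:]]
        elif line.startswith("a=rtpmap:"):
            payload_type, encoding = line[len("a=rtpmap:"):].split(" ", 1)
            rtpmap[int(payload_type)] = encoding
    return port, [(pt, rtpmap.get(pt)) for pt in offered]


port, formats = parse_sdp_audio(SDP_EXAMPLE)
```

For the example above, the media line offers payload types 8 and 101 on port 11946, which the rtpmap lines identify as G.711 A-law and DTMF telephone events.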
SIP Communications
SDP Protocol Structure
v= (protocol version)
o= (owner/creator and session identifier)
s= (session name)
i=* (session information)
u=* (URI of description)
e=* (email address)
p=* (phone number)
c=* (connection information)
b=* (bandwidth information)
One or more time descriptions
z=* (time zone adjustments)
k=* (encryption key)
a=* (zero or more session attribute lines)
Zero or more media descriptions
  m= (media name and transport address)
  i=* (media title)
  c=* (connection information - optional if included at session-level)
  b=* (bandwidth information)
  k=* (encryption key)
  a=* (zero or more media attribute lines)
Time description
  t= (time the session is active)
  r=* (zero or more repeat times)
* = optional
SIP Registration
Registration Process
“A” → SIP Registrar: REGISTER
SIP Registrar → “A”: 200 OK
- - - Repeat regularly (5, 60+ min) - - -
Example request:
REGISTER
Via: IP:Host
From: “Andrew Hunt” <sip:[email protected]>
To: <sip:asterisk.holly-connects.com:5060>
Call-ID: <<something unique>>
Expires: 3600
SIP: SIP Session Via a Proxy
[Call-flow diagram between “A”, a SIP Proxy and “B”]
User makes a call to “B”
A → Proxy: INVITE From:A To:B A-SDP
Proxy → B: INVITE From:A To:B A-SDP
Proxy → A: 100 TRYING
B → Proxy: 100 TRYING
B → Proxy → A: 180 RINGING (play ring-tone to user B, play ring-tone to user A)
- - - Waiting for answer - - -
B user answers call
B → Proxy → A: 200 OK B-SDP
A → Proxy → B: ACK
A ↔ B: RTP Audio Streams, one in each direction (A hears B, B hears A)
- - - Call Established - - -
A hangs up
A → Proxy → B: BYE
B → Proxy → A: 200 OK
A terminates RTP audio, B terminates RTP audio
- - - Session Over - - -
SIP: SIP Proxy Chaining
[Figure: “A” reaches “B” through a chain of three SIP Proxies]
SIP Proxy
SIP Proxy Function
Serve as rendezvous point at which callees are reachable
Perform routing function
Select the next hop or hops when chaining
Forking: try multiple destinations in parallel or sequence
Avoid loops when chaining
Available capabilities
Programmable routing decisions & tables
Least-cost routing
Firewall traversal
Direct certain calls to PSTN via gateway (e.g. 911, local calls)
SIP: Session Via a PBX
[Call-flow diagram between “A”, a SIP PBX (e.g. Asterisk) and “B”]
A → PBX: REGISTER, PBX → A: OK
B → PBX: REGISTER, PBX → B: OK
User makes a call to “B”
A → PBX: INVITE From:A To:B A-SDP
PBX → A: 100 TRYING
PBX → B: INVITE From:P To:B P-SDP
B → PBX: 100 TRYING
B → PBX: 180 RINGING (play ring-tone to user B)
PBX → A: 180 RINGING (play ring-tone to user A)
- - - Waiting for answer - - -
B user answers call
B → PBX: 200 OK B-SDP
PBX → A: 200 OK B-SDP
A → PBX: ACK
PBX → B: ACK
A ↔ PBX: RTP Audio Streams, PBX ↔ B: RTP Audio Streams (A hears B)
- - - Call Established - - -
B hangs up
B → PBX: BYE, PBX → B: 200 OK (B terminates RTP audio)
PBX → A: BYE, A → PBX: 200 OK (A terminates RTP audio)
- - - Session Over - - -
SIP: Session Via a PBX with Redirect (Direct Audio Link)
[Call-flow diagram between “A”, a SIP PBX (e.g. Asterisk) and “B”]
- - - Call Established - - -
PBX → B: (Re)INVITE From:P To:B A-SDP
PBX → A: (Re)INVITE From:P To:A B-SDP
200 OK
A ↔ B: RTP Audio Streams
- - - Call Continues - - -
B hangs up
B → PBX: BYE, PBX → B: 200 OK (B terminates RTP audio)
PBX → A: BYE, A → PBX: 200 OK (A terminates RTP audio)
- - - Session Over - - -
SIP-TDM Gateway
Translate signalling messages
To/From traditional telephony and VoIP
e.g. ISUP ↔ SIP/RTP
Support heterogeneous environments
Staged migration to VoIP
Numerous gateway products available
[Figure: PSTN (TDM/ISDN) ↔ VoIP Gateway ↔ IP network (SIP/RTP)]
SIP-TDM Gateway
[Figure: gateway decomposed into a Signalling gateway, a Media Gateway Controller and a Media Gateway, sitting between the PSTN (TDM, ISDN) and the IP network (SIP, RTP)]
SIP-TDM Gateway
[Call-flow diagram between a SIP UA, a TDM-VoIP Bridge and the PSTN; message order reconstructed from the slide labels]
User makes a call
SIP UA → Bridge: INVITE 555 1234
Bridge → SIP UA: 100 TRYING
Bridge → PSTN: Q.931 SETUP
PSTN → Bridge: Q.931 CALL PROCEEDING
PSTN → Bridge: Q.931 PROGRESS (phone rings)
Bridge → SIP UA: 180 RINGING (play ring-tone)
Pick-up
PSTN → Bridge: Q.931 CONNECT
Bridge → SIP UA: 200 OK
SIP UA → Bridge: ACK
SIP UA ↔ Bridge: RTP Audio Streams
- - - Call in Progress - - -
Hang up
SIP side: BYE, 200 OK (terminates RTP audio)
Q.931 side: DISCONNECT, RELEASE, RELEASE COMPLETE (terminates RTP audio)
- - - Session Over - - -
SIP: Sending Auxiliary Information
User makes a call from “A”
INVITE sip:[email protected]:5077 SIP/2.0
<<SIP header goes here>>
Content-type: multipart/mixed; boundary="gc0p4Jq0M2Yt08jU534c0p"
MIME-version: 1.0

This is a multi-part message in MIME format.
--gc0p4Jq0M2Yt08jU534c0p
Content-Type: application/sdp
Content-Length: 235

<<SDP header goes here>>
--gc0p4Jq0M2Yt08jU534c0p
Content-type: image/jpeg

PGh0bWw+CiAgPGhlYWQ+CiAgPC9oZWFkPgogIDxib2R5PgogICAgPHA+VGhpcyBpcyB0aGUgYm9keSBvZiB0aGUgbWVzc2FnZS48L3A+CiAgPC9ib2R5Pgo8L2h0bWw+Cg==
--gc0p4Jq0M2Yt08jU534c0p

Callee's phone displays: “Brad calling”
SIP: Failed Establishment
[Call-flow diagram between “A” and “B”]
User makes a call to “B”
A → B: INVITE From:A To:B A-SDP
Timeout
- - - Retries - - -
A → B: INVITE From:A To:B A-SDP
Timeout
A → B: INVITE From:A To:B A-SDP
Timeout
“A” gives up
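The retry pattern can be made concrete with the timer values from RFC 3261 for an INVITE sent over UDP. Those values are not on the slide: the RFC uses a default T1 of 500 ms, doubles the retransmission interval each time, and abandons the request after 64 × T1. The function below is an illustrative sketch of that schedule.

```python
from typing import List, Tuple


def invite_retransmit_times(t1: float = 0.5, timeout_multiplier: int = 64) -> Tuple[List[float], float]:
    """Return (times at which the INVITE is sent, time at which the caller gives up).

    All times are in seconds, measured from the first transmission.
    """
    deadline = timeout_multiplier * t1
    send_times = []
    elapsed = 0.0
    interval = t1
    while elapsed < deadline:
        send_times.append(elapsed)
        elapsed += interval
        interval *= 2      # back off: each wait is twice the previous one
    return send_times, deadline


send_times, give_up_at = invite_retransmit_times()
```

With the defaults the INVITE is sent seven times over 32 seconds before the caller gives up.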
RTP: Real-time Transport Protocol
Module Overview
RTP: Real-time Transport Protocol
Really real-time?
RTP session overview
RTP packets
RTP and network transmission
Real-time Transport Protocol
Standard for real-time transport over IP networks
Streaming audio and video
Utilised in SIP/RTP and H.323
Adopted by 3GPP for next generation cellular telephony
Widespread use in streaming: QuickTime, Real, Microsoft
RTP assumes
Network is dumb and imperfect, end-points are smart
Network may exhibit delays, jitter, packet loss etc.
Real-time Transport Protocol is NOT REAL-TIME
No end-to-end protocol, including RTP, can ensure in-time delivery. This would require the support of lower layers (switches, routers etc.)
RTP provides functionality suited for carrying real-time content, e.g., a timestamp and control mechanisms for synchronizing different streams with timing properties
Real-time Transport Protocol
One RTP session transmits one media type
Audio / voice
Video
Multi-media requires multiple RTP sessions
RTP session:
Implements a particular RTP profile
Includes an RTP data flow
Transports a single media type according to one or more payload formats, e.g. audio in G.711 format
Includes an RTP control protocol flow
Providing reception quality feedback, user information, etc.
Associates:
Source and destination IP addresses
A pair of UDP ports: one for RTP, one for RTCP
Real-time Transport Protocol
[Figure: RTP packet layout with callouts: Media Content; Supports mixing, e.g. for audio conferencing; Detect packet loss; Time of last payload sample]
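The fields behind those callouts live in the 12-byte fixed RTP header defined in RFC 3550: the sequence number lets a receiver detect loss, the timestamp places the payload in time, and the SSRC (plus optional CSRC entries) identifies sources when streams are mixed. A minimal sketch of packing and unpacking that header follows; it ignores padding and header extensions, and the type and function names are chosen for this example.

```python
import struct
from typing import NamedTuple, Tuple


class RtpHeader(NamedTuple):
    payload_type: int       # e.g. 0 = PCMU, 8 = PCMA
    sequence_number: int    # increments by one per packet
    timestamp: int          # in units of the media clock (8000 Hz for G.711)
    ssrc: int               # synchronisation source identifier
    marker: bool = False


_FIXED_HEADER = struct.Struct("!BBHII")   # 12 bytes, network byte order


def pack_rtp(header: RtpHeader, payload: bytes) -> bytes:
    byte0 = 2 << 6   # version 2, no padding, no extension, CSRC count 0
    byte1 = (int(header.marker) << 7) | (header.payload_type & 0x7F)
    return _FIXED_HEADER.pack(
        byte0,
        byte1,
        header.sequence_number & 0xFFFF,
        header.timestamp & 0xFFFFFFFF,
        header.ssrc & 0xFFFFFFFF,
    ) + payload


def unpack_rtp(packet: bytes) -> Tuple[RtpHeader, bytes]:
    byte0, byte1, sequence_number, timestamp, ssrc = _FIXED_HEADER.unpack_from(packet)
    if byte0 >> 6 != 2:
        raise ValueError("not an RTP version 2 packet")
    csrc_count = byte0 & 0x0F               # contributing sources added by a mixer
    header_length = _FIXED_HEADER.size + 4 * csrc_count
    header = RtpHeader(byte1 & 0x7F, sequence_number, timestamp, ssrc, bool(byte1 >> 7))
    return header, packet[header_length:]


# 20 ms of G.711 A-law audio is 160 one-byte samples.
original = RtpHeader(payload_type=8, sequence_number=1000, timestamp=160, ssrc=0x1234ABCD)
packet = pack_rtp(original, b"\xd5" * 160)
decoded, audio = unpack_rtp(packet)
```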
Real-time Transport Protocol
[Figure: packets travel from Source across the Network to a Playback Buffer at the Destination. Callouts: Packet loss; Out-of-order recovery; Late packet loss]
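The role of the playback buffer can be sketched as a simple classification of packets against their playout deadlines. The model is an assumption made for illustration: the sender emits packet n at n × 20 ms, arrival times are measured from when packet 0 was sent, and playback starts after a fixed buffering delay. Real jitter buffers are usually adaptive.

```python
from typing import Iterable, List, Tuple


def playout(arrivals: Iterable[Tuple[int, float]],
            packet_ms: float = 20,
            buffer_ms: float = 60) -> Tuple[List[int], List[int], List[int]]:
    """Classify packets as played, late or missing.

    arrivals: (sequence number, arrival time in ms) in any order.
    Packet n must arrive by buffer_ms + n * packet_ms to be played.
    """
    arrival_time = {}
    for sequence_number, time_ms in arrivals:
        arrival_time.setdefault(sequence_number, time_ms)   # ignore duplicates

    played, late, missing = [], [], []
    last = max(arrival_time, default=-1)
    for sequence_number in range(last + 1):
        deadline = buffer_ms + sequence_number * packet_ms
        if sequence_number not in arrival_time:
            missing.append(sequence_number)        # packet loss in the network
        elif arrival_time[sequence_number] <= deadline:
            played.append(sequence_number)         # includes out-of-order recovery
        else:
            late.append(sequence_number)           # arrived, but too late to play
    return played, late, missing


# Packet 2 overtakes packet 1, packet 3 never arrives, packet 4 is too late.
played, late, missing = playout([(0, 30), (2, 75), (1, 80), (4, 200), (5, 150)])
```

A deeper buffer recovers more late and out-of-order packets at the cost of added latency.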
Network Issues and Design
Module Overview
Network Issues and Design
Quality of Service
Packet loss, Latency, Jitter
Managing Quality of Service
Network quality
MPLS
Silence suppression
Local vs. long distance
Network design
Firewalls: NAT & STUN
Quality of Service
QoS Definition
Probability of the network meeting a given traffic contract
Informally refers to the probability of a packet succeeding in passing between two points in the network within its desired latency period
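The informal definition can be turned into a simple measurement. The sketch below estimates QoS as the fraction of sent packets that arrive within a latency bound. The function name `qos_estimate`, the sample delays and the 150 ms bound are assumptions chosen for illustration.

```python
def qos_estimate(delays_ms, max_latency_ms):
    """Fraction of sent packets delivered within the latency bound.

    delays_ms: one entry per packet sent; None marks a packet never delivered.
    """
    if not delays_ms:
        raise ValueError("need at least one packet")
    on_time = sum(1 for d in delays_ms if d is not None and d <= max_latency_ms)
    return on_time / len(delays_ms)


# Five packets sent: one lost, one too slow, three on time.
estimate = qos_estimate([20, 35, None, 180, 40], max_latency_ms=150)
print(estimate)
```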
Quality of Service
What can go wrong as packets go from “A” to “B”?
Dropped packets: routers might fail to deliver (drop) some packets if they arrive when the router's buffers are full. Some, none, or all of the packets might be dropped, depending on the state of the network, and it is impossible to predict in advance which will be dropped.
Delay: it may take a long time for a packet to get from “A” to “B” because it gets held up in long queues or takes a less direct route to avoid congestion
Jitter: packets from source will reach the destination with different delays, sometimes by taking different routes
Out-of-order delivery: different packets may take different routes with enough difference in delay to change the order of arrival
Error: sometimes packets are misdirected, or combined together, or corrupted, while en route
Quality of Service
QoS issues can have a major impact on real-time media streaming
Packet loss
Missing packet replaced by silence
Jitter & out-of-order
Abrupt and unwanted variation in packet arrival timing
Arrival of packets out-of-order
[Audio examples: original, with packet loss, with jitter]
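Jitter can be quantified at the receiver. The sketch below uses the running interarrival jitter estimate described in RFC 3550, where each new sample moves the estimate by one sixteenth of the difference. The function name and the sample timings are hypothetical, and send and arrival times are assumed to be in the same units.

```python
def interarrival_jitter(packets):
    """Running jitter estimate over (send_timestamp, arrival_time) pairs.

    The pairs are in arrival order. Only changes in transit time matter,
    so the two clocks do not need to be synchronised.
    """
    jitter = 0.0
    prev_transit = None
    for sent, received in packets:
        transit = received - sent
        if prev_transit is not None:
            d = abs(transit - prev_transit)
            jitter += (d - jitter) / 16.0  # smoothing gain of 1/16
        prev_transit = transit
    return jitter


steady = interarrival_jitter([(0, 50), (20, 70), (40, 90)])
bumpy = interarrival_jitter([(0, 50), (20, 70), (40, 100)])
print(steady, bumpy)
```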
Quality of Service
Measure QoS on routers and end-points
QoS tends to degrade with network size and congestion
Managing Quality of Service
Generously over-provision a network – expensive and does not scale
Reserve network resources: e.g. RSVP = Resource Reservation Protocol
DiffServ: Differentiated services for bulk flows (e.g. packets from a university)
Multi-Protocol Label Switching (MPLS): emulates some properties of a circuit-switched network over a packet-switched network.
Traffic shaping: control computer network traffic to optimize or guarantee performance, low latency, and/or bandwidth. Traffic shaping deals with concepts of classification, queue disciplines, enforcing policies, congestion management, quality of service (QoS), and fairness.
Silence suppression
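As one concrete example of the DiffServ approach listed above, an application can ask the operating system to mark its outgoing packets with a DiffServ code point (DSCP). The sketch below is an assumption-laden illustration: the helper names are hypothetical, Expedited Forwarding (DSCP 46) is a commonly used marking for voice media, and routers along the path are free to ignore or rewrite the marking.

```python
import socket

DSCP_EF = 46  # Expedited Forwarding, commonly used for voice media


def tos_byte(dscp: int) -> int:
    """DSCP occupies the upper six bits of the IP TOS / Traffic Class byte."""
    if not 0 <= dscp <= 63:
        raise ValueError("DSCP is a 6-bit value")
    return dscp << 2


def mark_socket(sock: socket.socket, dscp: int = DSCP_EF) -> None:
    """Request DSCP marking on an IPv4 socket; the OS or network may ignore it."""
    sock.setsockopt(socket.IPPROTO_IP, socket.IP_TOS, tos_byte(dscp))


print(hex(tos_byte(DSCP_EF)))
```

A caller would typically apply `mark_socket` to the UDP socket used for RTP before sending media.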
Quality of Service
QoS on Local Network
QoS generally not a major issue in single-site deployments
Dedicate separate switches for voice and data traffic
Provide redundant networks for voice and data traffic
Gigabit Ethernet
Monitor QoS
Alarm network outages
Firewalls
Network Address Translation (NAT): re-writing the source and/or destination addresses of IP packets as they pass through a router or firewall
Aka network masquerading or IP-masquerading
Firewalls use NAT to enable multiple hosts on a private network to access the Internet using a single public IP address
Common in home and SOHO routers
SDP addresses are not translated by NAT
STUN (Simple Traversal of UDP Through NATs, RFC 3489):
Network protocol allowing a client behind NAT to find (a) its public address, (b) the type of NAT and (c) the Internet-side port associated by the NAT with a particular local port.
Info is used to set up UDP communication between two machines both behind NAT routers
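The reason NAT causes trouble for SIP can be shown with a small check: NAT rewrites the IP header, but the address inside the SDP body is left untouched, so the far end may be told to send media to an unreachable private address. The following Python sketch is illustrative only; the function names and the example addresses are hypothetical.

```python
import ipaddress


def sdp_connection_address(sdp: str) -> str:
    """Return the address from the first SDP 'c=' line (c=IN IP4 <addr>)."""
    for line in sdp.splitlines():
        if line.startswith("c="):
            return line.split()[2]
    raise ValueError("no c= line in SDP")


def nat_problem(sdp: str, source_ip_seen_by_peer: str) -> bool:
    """True if the SDP advertises a private address that differs from the
    source address the far end actually sees after NAT."""
    advertised = ipaddress.ip_address(sdp_connection_address(sdp))
    return advertised.is_private and str(advertised) != source_ip_seen_by_peer


offer = "\n".join([
    "v=0",
    "o=alice 2890844526 2890844526 IN IP4 192.168.1.20",
    "s=-",
    "c=IN IP4 192.168.1.20",
    "t=0 0",
    "m=audio 49170 RTP/AVP 0",
])
# The NAT rewrote the packet's source address, but not the SDP body.
print(nat_problem(offer, source_ip_seen_by_peer="203.0.113.5"))
```

A STUN lookup gives the client the public address and port it would need to place in the SDP instead.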
VoIP and Speech Recognition
Module Overview
VoIP and Speech Recognition
MRCP v1 & v2
Impact of CODECs
Network issues: packet loss, latency
MRCP: Media Resource Control Protocol
MRCP v1
IETF protocol
Client control of speech resources
Speech recognition
Text-to-speech
MRCP structure is similar to HTTP and SIP
Request by client in header+body format
Response by server
Media delivery typically via RTP
Widely supported by VoiceXML Platforms
Leverages existing W3C standards for speech recognition and TTS markup
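The header+body shape shared with HTTP and SIP can be sketched generically. The Python below builds and parses a text message with a start line, headers, a blank line and a body. The start line and header names are modelled loosely on MRCP v1 and are assumptions for illustration; this is not a conforming MRCP implementation and has not been checked field by field against the specification.

```python
def build_request(start_line: str, headers: dict, body: str = "") -> bytes:
    """Serialise a start line, headers and optional body, separated by CRLF."""
    payload = body.encode("utf-8")
    lines = [start_line]
    lines += [f"{name}: {value}" for name, value in headers.items()]
    if payload:
        lines.append(f"Content-Length: {len(payload)}")
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + payload


def parse_message(raw: bytes):
    """Split a message back into (start_line, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    start_line, *header_lines = head.decode("ascii").split("\r\n")
    headers = dict(line.split(": ", 1) for line in header_lines)
    return start_line, headers, body.decode("utf-8")


# Hypothetical text-to-speech request carrying a small markup body.
raw = build_request(
    "SPEAK 543257 MRCP/1.0",
    {"Content-Type": "application/synthesis+ssml"},
    "<speak>Hello</speak>",
)
start_line, headers, body = parse_message(raw)
print(start_line, headers["Content-Length"], body)
```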
MRCP: Media Resource Control Protocol
MRCP v2
IETF protocol to supersede MRCP v1
Broader client control of speech resources
Adds speaker verification and speaker identification
Adds recording
Utilizes SIP+SDP to establish the media pipe
VoIP and Speech Recognition
Speech Recognition and CODECs
Lossless CODECs do not affect speech recognition accuracy
No loss of information
Lossy CODECs can affect speech recognition accuracy
Greater compression tends to cause greater degradation
Speech recognizers are generally very reliable with widely deployed CODECs
Mobile telephony has extensive compression
Speech recognition trained explicitly for mobile performance
DSR Aurora: Distributed Speech Recognition
CODEC specialized for the requirements of speech recognition
Promoted for mobile carrier usage
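The effect of a lossy CODEC on the signal can be illustrated with mu-law companding, the idea behind the G.711 mu-law format. The sketch below uses the continuous companding formula with uniform quantisation. It is an assumption-based illustration and is not bit-exact G.711; the function names are hypothetical.

```python
import math

MU = 255.0  # companding parameter used by mu-law


def compress(x: float) -> float:
    """Map a sample in [-1, 1] onto a logarithmic scale."""
    return math.copysign(math.log1p(MU * abs(x)) / math.log1p(MU), x)


def expand(y: float) -> float:
    """Inverse of compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(MU)) / MU, y)


def codec_round_trip(x: float, bits: int = 8) -> float:
    """Encode a sample with the given number of bits, then decode it."""
    levels = 2 ** (bits - 1) - 1
    quantised = round(compress(x) * levels)
    return expand(quantised / levels)


sample = 0.5
error_8_bit = abs(codec_round_trip(sample, bits=8) - sample)
error_4_bit = abs(codec_round_trip(sample, bits=4) - sample)
print(error_8_bit, error_4_bit)
```

With fewer bits per sample the reconstruction error grows, which is the kind of information loss that can affect a recognizer's front end.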
VoIP and Speech Recognition
Speech Recognition and Latency
Speech recognition not as sensitive to latency as humans
Late packet is better than no packet
Speech recognizers have extensive buffers for non-real-time processing
Note: excessive latency (>1sec) can cause caller perceived service issues
[Diagram: RTP -> Buffer -> Speech Recognizer -> Result]
VoIP and Speech Recognition
Speech Recognition and Packet Loss
Speech recognizers are sensitive to packet loss
ASR can use packet loss information to limit the resulting increase in recognition errors
[Diagram: RTP -> Buffer -> Speech Recognizer -> Result]
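Packet loss information is available to the recognizer's front end from the RTP sequence numbers. The sketch below lists the sequence numbers missing from a received stream, allowing for 16-bit wraparound. The function name is hypothetical, and the input is assumed to be already reordered into playback order.

```python
def lost_sequence_numbers(received):
    """Return the RTP sequence numbers missing between consecutive packets.

    received: sequence numbers in playback order (after any reordering).
    """
    lost = []
    for prev, cur in zip(received, received[1:]):
        gap = (cur - prev) % 65536  # sequence numbers wrap at 16 bits
        lost.extend((prev + k) % 65536 for k in range(1, gap))
    return lost


missing = lost_sequence_numbers([65533, 65534, 1, 2, 5])
print(missing)
```

A recognizer could use such a list to mark the affected frames as unreliable rather than treating substituted silence as real audio.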
VoIP and Mobile Telephony
Module Overview
VoIP and Mobile Telephony
Analog mobile
Digital mobile
3G and SIP
IMS – IP Multi-Media Subsystem Architecture
Mobile Telephony
Analog Mobile
Experimental systems from 1920s
“1G” – 1st Generation
AMPS: Advanced Mobile Phone System
Analog transmission
1978: Trial in Chicago
1979: Commercial launch in Japan
1981: Commercial launch in Sweden, Norway, Denmark, Finland
1983: Commercial launch in Chicago
Issues: limited capacity, fraud, subscriber volume…
Mobile Telephony
2G – 2nd Generation Mobile
Objectives achieved
Digital technology
Increased capacity
Greater security against fraud
Global roaming
Advanced services
Lower power = smaller handsets = longer battery life
Many standards evolved (examples)
GSM: Pan-European standard that spread globally
CDMA: Americas and parts of Asia (aka PCS)
PDC: Japan
Limitations:
Optimized for voice – not suited to data
Mobile Telephony
2.5G – Stepping Stone from 2G to 3G
2G system with both packet switching (for data) and circuit switching (for voice)
2.5G is a marketing term – not a standard
Objectives achieved
Re-use of much 2G infrastructure (GSM & CDMA)
Data rate of 144 kbit/sec or better
Used for sending photos and much more
Mobile Telephony
3G – 3rd Generation Mobile
Combines high-speed mobile access with Internet Protocol (IP) based services
Covers range of network technologies
WCDMA, CDMA2000, UMTS, EDGE
Data rate: 384 kbit/sec for mobile systems and 2 Mbit/sec for stationary systems
Enables video, TV, images, music, games, location services…
Mobile Telephony
3GPP – 3rd Generation Partnership Project
Collaboration agreement (December 1998) between ETSI (Europe), ARIB/TTC (Japan), CCSA (China), ATIS (North America) and TTA (South Korea)
Goal: global 3G specification within the scope of the ITU's IMT-2000 project
3GPP specifications are based on evolved GSM specifications
Now generally known as the UMTS system
Introduced IMS…
Mobile Telephony
IMS – IP Multi-Media Sub-System
Emerged in 3GPP Release 5 with following enhancements
Principles
Access independence: work with fixed, mobile or wireless networks
Different network architectures: implement on operator-selected architectures
Terminal and user mobility: provides terminal mobility (roaming)
Extensive IP-based services: offer just about any IP-based service. VoIP, push-to-talk over cellular (POC), multiparty gaming, video conferencing, messaging, community services, presence information, content sharing…
Mobile Telephony
IMS is Built on SIP
3GPP Variant of SIP
Application servers for SIP session management
Caller ID, call waiting, call forwarding, transfer, call blocking, interception, announcements, conferencing, voice-mail, SMS…
CSCF (Call Session Control Function) and other network functions are implemented using SIP
Media Resource Function (MRF): SIP end-point with IVR-like functionality
TDM-VoIP gateways to bridge to fixed and mobile telephony
Who’s in control?
TDM = dumb terminals, smart network
Internet VoIP = smart terminals, dumb network
IMS = dumb/smart terminals, smart network
Closing
Module Objectives
Go home!!
Further Information
Web sites
IETF: http://www.ietf.org/html.charters/sip-charter.html
SIP Tutorial: http://www.iptel.org/sip/siptutorial.pdf
SIP Home Page: http://www.cs.columbia.edu/sip/
SIP Forum: http://www.sipforum.org/
Asterisk PBX: http://www.asterisk.org/
VoIP Wiki Reference: http://www.voip-info.org/wiki/
SIP Knowledge: http://www.sipknowledge.com/SIP_RFC.htm
SIP FAQ: http://www.sipknowledge.com/faq_main.htm
SIP Tech Portal: http://www.tech-invite.com/
Thank you!!!