SIP Overload Control
Report from the IETF Design Team

Volker Hilt
volkerh@bell-labs.com
Bell-Labs/Alcatel-Lucent

All Rights Reserved © Alcatel-Lucent 2006

Slide 2 | Volker Hilt | March 2008

Overview

IETF SIP Overload Control Design Team

Simulation Results for SIP

SIP Overload Control

Conclusion

Slide 3 | Volker Hilt | March 2008

IETF SIP Overload Control Design Team

The team was founded at the beginning of 2007 by the SIPPING WG.

Members
Eric Noel, Carolyn Johnson (AT&T Labs)
Volker Hilt, Indra Widjaja (Bell-Labs/Alcatel-Lucent)
Charles Shen, Henning Schulzrinne (Columbia University)
Ping Wu*, Tadeusz Drwiega*, Mary Barnes (Nortel)
Jonathan Rosenberg (Cisco)
Nick Stewart (British Telecom)

Developed four independent simulation tools: AT&T, Bell-Labs, Columbia, Nortel*.
Simulation results for SIP.
Proposals and initial results for overload-controlled SIP.

* Left the design team in fall 2007

Slide 4 | Volker Hilt | March 2008

Simulation Results
Setup and Assumptions

SIP server topology consisting of UAs, edge proxies and core proxies.

UAs are connected to edge proxies.
Each UA creates a single call (an INVITE transaction followed by a BYE transaction).
Call arrivals follow a Poisson process.
Load is equally distributed across edge proxies.

Edge proxies forward requests to a core proxy.
Edge proxies reject a request if one/both core proxies are overloaded.
To break up the problem domain, we assume edge proxies have infinite capacity.

Core proxies forward each call to the edge proxy of the destination.
Capacity: 500 messages per second at a constant rate.

All proxies are modeled as queuing systems.
Queue size: 500 messages.

Media path congestion is not considered.

[Figures: Proxy Model; Server Topology]
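The proxy model above (finite queue, constant service rate, Poisson arrivals) can be illustrated with a small discrete-event sketch. This is not one of the design team's simulators; the function name, the parameter names and the single-proxy simplification are assumptions made for illustration. Only the 500 messages per second capacity and the 500-message queue come from the slide.

```python
import random
from collections import deque


def simulate_proxy(arrival_rate, duration, capacity=500.0, queue_size=500, seed=1):
    """Simulate one proxy as a finite FIFO queue with a constant service rate.

    Returns (accepted, dropped) message counts. All names are illustrative;
    only the default capacity and queue size are taken from the slide.
    """
    rng = random.Random(seed)
    service_time = 1.0 / capacity      # constant processing time per message
    in_system = deque()                # departure times of messages still queued
    now = 0.0
    last_departure = 0.0
    accepted = dropped = 0
    while True:
        now += rng.expovariate(arrival_rate)   # exponential gaps = Poisson arrivals
        if now >= duration:
            break
        # Remove messages that finished processing before this arrival.
        while in_system and in_system[0] <= now:
            in_system.popleft()
        if len(in_system) >= queue_size:
            dropped += 1                       # buffer full: the message is dropped
            continue
        start = max(now, last_departure)       # wait behind messages already queued
        last_departure = start + service_time
        in_system.append(last_departure)
        accepted += 1
    return accepted, dropped


if __name__ == "__main__":
    for rate in (250.0, 1000.0):
        print(rate, simulate_proxy(rate, duration=5.0))
```

Below capacity the queue never fills; above capacity the buffer saturates and every further arrival is dropped, which is the starting point for the overload scenarios that follow.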

Slide 5 | Volker Hilt | March 2008

Simulation Results
Client-to-Server vs. Server-to-Server Communication

Server-to-Server Communication

A server sends a stream of SIP requests to other servers.
SIP request streams between servers are dynamic.
Load between servers can be reduced gradually by rejecting/retrying some of the requests.
Overload control can use feedback to request that an upstream server reduce traffic to a desired amount.

Client-to-Server Communication

UAs typically initiate only a single request at a time.
A UA can be told to wait a certain time before re-sending the request.
Problem: a large number of UAs can cause overload even if all UAs are told to back off.
Feedback-based overload control does not prevent overload in the server.

[Figures: Server-to-Server Communication; UA-to-Server Communication]

Slide 6 | Volker Hilt | March 2008

Simulation Results
Scenario 1: No Overload Control

Assumptions

Proxies do not use any overload control.

When the input buffer of a proxy is filled up, messages are dropped.

Requests that failed at one of the core proxies are not retried.

Results

The number of INVITE transactions completed (i.e., calls set up) drops to zero.

Congestion collapse!

[Figure: Sim1. Carried load (cps) vs. offered load (cps); series: AT&T Sim1 gput, CU Sim1 gput, BL Sim1 gput]

Slide 7 | Volker Hilt | March 2008

Simulation Results
Scenario 2: 503 (Service Unavailable) – Reject Requests

Assumptions

Proxies use 503 (Service Unavailable) responses to reject requests during overload.

Watermark-based (Bang-Bang) overload control algorithm:
Enter overload state when queue length reaches high watermark (400 messages).
Enter normal state when queue length drops below low watermark (300 messages).
When in overload state, a proxy rejects all incoming requests with 503 responses.

Edge proxies do not retry requests that are rejected.
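The watermark-based (bang-bang) rule in the assumptions above can be written as a small state machine. This is a minimal sketch; the class and method names are hypothetical, and only the two watermark values (400 and 300 messages) are taken from the slide.

```python
class WatermarkControl:
    """Bang-bang overload state driven by queue length (illustrative names)."""

    def __init__(self, high=400, low=300):
        self.high = high          # enter overload when the queue reaches this length
        self.low = low            # leave overload when the queue drops below this
        self.overloaded = False

    def update(self, queue_length):
        """Update and return the overload state for the current queue length."""
        if not self.overloaded and queue_length >= self.high:
            self.overloaded = True
        elif self.overloaded and queue_length < self.low:
            self.overloaded = False
        return self.overloaded

    def admit(self, queue_length):
        """Return True to process a request, False to reject it with a 503."""
        return not self.update(queue_length)
```

The gap between the two watermarks gives hysteresis: once the proxy has entered the overload state at 400 messages, it keeps rejecting requests until the queue has drained below 300.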

Results

Provides little or no improvement compared to no overload control.

Note: Performance can be improved by using other control algorithms.

Congestion collapse!

[Figure: Stateful 503. Carried load (cps) vs. offered load (cps); series: AT&T Sim2 gput, CU Sim2 gput, BL Sim2 gput]

Slide 8 | Volker Hilt | March 2008

Simulation Results
Scenario 3: 503 (Service Unavailable) - Retry Requests

Assumptions

Same assumptions as in Scenario 2.

But: edge proxies retry all requests that have been rejected by one core proxy at the second core proxy.

Requests are rejected only if they fail at both core proxies.
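The retry behaviour assumed for the edge proxies in this scenario can be sketched as follows. The function name and the representation of a core proxy as a callable returning a status code are assumptions for illustration, not part of the simulators.

```python
def forward_with_retry(request, core_proxies):
    """Try each core proxy in turn; reject only if all of them answer 503.

    `core_proxies` is a list of callables that take a request and return a
    SIP status code (a hypothetical stand-in for a real proxy). Returns
    (final_status, attempts), where attempts is the number of requests the
    edge proxy actually sent.
    """
    attempts = 0
    for proxy in core_proxies:
        attempts += 1
        status = proxy(request)
        if status != 503:
            return status, attempts
    return 503, attempts


if __name__ == "__main__":
    busy = lambda req: 503   # overloaded core proxy
    ok = lambda req: 200     # core proxy that accepts the call
    print(forward_with_retry("INVITE", [busy, ok]))
    print(forward_with_retry("INVITE", [busy, busy]))
```

Each rejected request is sent a second time, so when both core proxies are overloaded the edge proxy doubles the number of messages it offers for that call.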

Results

Retrying requests decreases goodput.

Increased load caused by retries.

Congestion collapse!

[Figure: Stateful 503. Carried load (cps) vs. offered load (cps); series: AT&T Sim3 gput, CU Sim3 gput, BL Sim3 gput]

Slide 9 | Volker Hilt | March 2008

SIP Overload Control
Mechanisms and Algorithms

The IETF SIP overload design team has started to investigate solutions for more effective SIP overload control mechanisms.

Current simulations of SIP overload control mechanisms are focusing on:
Server-to-server communication.
Feedback channel between core and edge proxy (hop-by-hop).
Four different overload control mechanisms: on/off control, rate-based control, loss-based control, window-based control.

Simulation results are available for these four mechanisms and different control algorithm proposals.
Contributed by AT&T Labs, Bell-Labs/Alcatel-Lucent, Columbia University.
Initial simulation results.

Slide 10 | Volker Hilt | March 2008

SIP Overload Control
Hop-by-hop vs. end-to-end

SIP requests for the same source/destination pair can travel along different paths, depending on provider policies, services invoked, forwarding rules, request forking, load balancing, etc.

A SIP proxy cannot make assumptions about which downstream proxies will be on the path of a SIP request.

Hop-by-hop overload control
Server provides overload control feedback to its direct upstream neighbor.
No knowledge about routing policies of neighbors needed.
Neighbor processes feedback and rejects/retries excess requests if needed.

End-to-end overload control
Feedback from all servers on all paths between a source and a destination needs to be considered.
A server needs to track the load of all servers a request may traverse on its way to the target.
Complex and challenging since requests for the same destination may travel along very different paths.
May be applicable in limited, tightly controlled environments.

[Figures: Hop-by-hop; End-to-end (each showing servers A, B, C, D)]

Slide 11 | Volker Hilt | March 2008

SIP Overload Control
AT&T Labs Simulation Results

Every sampling interval, core proxies estimate optimal control parameters such that the queueing delay stays within a pre-defined target delay. Core proxies rely solely on measured offered load and measured internal queueing delay (no signaling from edge proxies to core proxies).

On/off control builds upon the existing SIP 503 Retry-After capability. Each control interval, core proxies estimate the optimal Retry-After timer value and share it with edge proxies within the 503 messages.

In rate-based control, each control interval, core proxies estimate the optimal controlled load and active sources, then share them with edge proxies either through dedicated signaling or as overhead in response messages. Edge proxies execute a percent-blocking throttling algorithm.
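On the edge-proxy side, on/off control amounts to pausing traffic to a core proxy for the duration given in the Retry-After value of a 503 response. The sketch below shows only that sender-side behaviour; the class and method names are hypothetical, and how the core proxy estimates the timer value is not covered here.

```python
class OnOffSender:
    """Edge-proxy view of one core proxy under on/off control (illustrative)."""

    def __init__(self):
        self.blocked_until = 0.0   # time before which no request may be sent

    def on_503(self, now, retry_after):
        """Record a 503 response carrying a Retry-After value in seconds."""
        self.blocked_until = max(self.blocked_until, now + retry_after)

    def may_send(self, now):
        """Return True if requests may be forwarded to the core proxy again."""
        return now >= self.blocked_until


if __name__ == "__main__":
    sender = OnOffSender()
    sender.on_503(now=1.0, retry_after=2.0)
    print(sender.may_send(2.0), sender.may_send(3.0))
```

Taking the maximum of the old and new deadlines means a later 503 with a shorter timer cannot cut an earlier, longer pause short; that is a design choice of this sketch rather than something stated on the slide.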

[Figure: Stateless 503. Carried load (cps) vs. offered load (cps); series: Theoretical, AT&T Labs Sim3 no control gput, AT&T Labs Sim3 RetryAfter algo1 gput, AT&T Labs Sim3 Rate algo1 gput]

Slide 12 | Volker Hilt | March 2008

SIP Overload Control
Bell-Labs/Alcatel-Lucent Simulation Results

Loss-based overload control.

Feedback loop between receiver (core proxy) and sender (edge proxy).
Feedback in SIP responses.

Receiver-driven control algorithm.
Estimates current processor utilization.
Compares to target processor utilization.
Multiplicative increase and decrease of loss rate to reach target utilization.

Sender adjusts the load it sends to the receiver based on the feedback received, using percent blocking.

Overload control algorithms:
Occupancy algorithm (OCC),
Acceptance Rate/Occupancy algorithm (ARO)
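The feedback loop described above can be sketched in two parts: a receiver that scales a loss rate up or down depending on measured utilization, and a sender that applies percent blocking. This is a generic illustration and not the OCC or ARO algorithm; the target utilization, the scaling factor, the floor value and all names are assumptions.

```python
import random


def update_loss_rate(loss_rate, utilization, target=0.9, factor=1.2, floor=0.01):
    """Receiver side: multiplicative increase/decrease of the loss rate.

    `target`, `factor` and `floor` are illustrative values, not taken
    from the slides.
    """
    if utilization > target:
        # Above target: ask senders to drop a larger share of requests.
        return min(1.0, max(loss_rate, floor) * factor)
    # At or below target: relax; snap to zero once the rate becomes negligible.
    reduced = loss_rate / factor
    return reduced if reduced >= floor else 0.0


class PercentBlockingSender:
    """Sender side: reject a random fraction of requests equal to the loss rate."""

    def __init__(self, seed=None):
        self.loss_rate = 0.0
        self.rng = random.Random(seed)

    def on_feedback(self, loss_rate):
        """Store the loss rate carried in a SIP response from the receiver."""
        self.loss_rate = loss_rate

    def should_forward(self):
        """Return True to forward the next request, False to reject it locally."""
        return self.rng.random() >= self.loss_rate
```

With a loss rate of 0.0 every request is forwarded, and with 1.0 none is; values in between block that fraction of requests on average.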

Slide 13 | Volker Hilt | March 2008

SIP Overload Control
Columbia University Simulation Results

Window-based overload control

SIP session as control unit, dynamically estimated from processed SIP messages.

Receiver (core proxy in the scenario)
decreases the window on session arrival,
dynamically computes the available window,
splits it and feeds it back to active senders.
Feedback is piggybacked in responses/requests.

Sender (edge proxy in the scenario)
forwards a session only if a window slot is available.
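The sender and receiver roles described above can be sketched as follows. This is not any of the CU-WIN algorithms: the equal split among senders, the choice to let feedback replace (rather than add to) the sender's window, and all names are assumptions made for illustration.

```python
def split_window(available, senders):
    """Receiver side: divide the available window among active senders.

    An equal split is assumed here; leftover slots go to the first senders
    in the list. Returns a dict mapping sender to granted slots.
    """
    if not senders:
        return {}
    base, extra = divmod(max(available, 0), len(senders))
    return {s: base + (1 if i < extra else 0) for i, s in enumerate(senders)}


class WindowSender:
    """Sender side: forward a new session only while a window slot is free."""

    def __init__(self):
        self.window = 0

    def on_feedback(self, slots):
        """Replace the current window with the value fed back by the receiver."""
        self.window = slots

    def try_forward(self):
        """Consume one slot and return True, or return False if none is left."""
        if self.window > 0:
            self.window -= 1
            return True
        return False


if __name__ == "__main__":
    grants = split_window(7, ["edge1", "edge2", "edge3"])
    sender = WindowSender()
    sender.on_feedback(grants["edge1"])
    print(grants, [sender.try_forward() for _ in range(4)])
```

Unlike loss-based control, the sender stops completely once its slots are used up and only resumes when new feedback arrives.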

[Figure: CU Overload Control Results. Goodput (cps) vs. load (cps); series: Sim1, Sim2, Sim3, CU-WIN-I, CU-WIN-II, CU-WIN-III, Theoretical]

Three different window adaptation algorithms work equally well in steady state:
CU-WIN-I: keep current estimated sessions below total allowed sessions given target delay.
CU-WIN-II: open up the window after a new session is processed.
CU-WIN-III: discrete version of CU-WIN-I, divided into control intervals.

Slide 14 | Volker Hilt | March 2008

Conclusion

Current Status

The SIP protocol is vulnerable to congestion collapse under overload.
The 503 response mechanism is insufficient to avoid congestion collapse.
Simulation results confirm problems.

The IETF design team is investigating different mechanisms and algorithms for SIP overload control.
Initial simulation results are available.
Results show stable server-to-server goodput under overload.

Open Issues
Overload control with uneven distribution of load and fluctuating load conditions.
Dynamic arrival and departure of servers.
Fairness between upstream neighbors.

Comments and suggestions are very welcome!