The Congestion Manager

Hari Balakrishnan Srinivasan Seshan

MIT LCS CMU

http://nms.lcs.mit.edu/

draft-ietf-ecm-cm-01.txt


July 31, 2000 48th IETF (Pittsburgh) ECM WG 2

CM architecture

• Integrates congestion management across all applications (transport protocols & user-level apps)

• Exposes API for application adaptation, accommodating ALF applications

• This draft: sender-only module

[Figure: CM architecture. Labels: HTTP, RTP/RTCP, NNTP, . . .; TCP1, TCP2, UDP, SCTP; API; Congestion Manager; IP]


Outline

• Draft overview (“tutorial” for slackers!)
  – Terminology
  – System components
  – Abstract CM API
  – Applications

• Issues for discussion


Assumptions & terminology

• Application: Any protocol that uses CM

• Well-behaved application: Incorporates application-level receiver feedback, e.g., TCP (ACKs), RTP (RTCP RRs), …

• Stream
  – Group of packets with five things in common: [src_addr, src_port, dst_addr, dst_port, ip_proto]

• Macroflow
  – Group of streams sharing same congestion control and scheduling algorithms (a “congestion group”)


Architectural components

• CM scope is per-macroflow; not on data path

• Congestion controller algorithm MUST be TCP-friendly (see Floyd document)

• Scheduler apportions bandwidth to streams

[Figure: CM components. Labels: CM; Congestion controller; Scheduler; API to streams on macroflow]


Congestion Controller

• One per macroflow

• Addresses two issues:
  – WHEN can macroflow transmit?
  – HOW MUCH data can be transmitted?

• Uses app notifications to manage state
  – cm_update() from streams
  – cm_notify() from IP output whenever packet sent

• Standard API for scheduler interoperability
  – query(), notify(), update()

• A large number of controllers are possible
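
As a rough illustration of the controller interface, the following Python sketch assumes a window-based AIMD controller in the style of TCP. Only the method names query(), notify(), and update() come from the slide; the class name, the argument lists, the byte-based window, and the halving response to loss are assumptions made for this example, not the draft's algorithm.

```python
from dataclasses import dataclass

MTU = 1500  # assumed packet size in bytes


@dataclass
class AimdController:
    """Hypothetical per-macroflow controller; not the draft's algorithm."""
    cwnd: int = MTU        # congestion window, bytes
    outstanding: int = 0   # bytes sent but not yet reported on

    def query(self) -> int:
        """HOW MUCH may be sent right now (0 means: not yet)."""
        return max(0, self.cwnd - self.outstanding)

    def notify(self, nsent: int) -> None:
        """Called when IP output actually transmits nsent bytes."""
        self.outstanding += nsent

    def update(self, nrecd: int, nlost: int) -> None:
        """Feedback from a stream: bytes received and bytes lost."""
        self.outstanding = max(0, self.outstanding - nrecd - nlost)
        if nlost > 0:
            self.cwnd = max(MTU, self.cwnd // 2)      # multiplicative decrease
        else:
            self.cwnd += MTU * nrecd // self.cwnd     # additive increase


cc = AimdController(cwnd=4 * MTU)
cc.notify(3 * MTU)
room_before = cc.query()               # one MTU of window is still free
cc.update(nrecd=2 * MTU, nlost=MTU)    # a loss report halves the window
```

Any controller with the same three entry points could be swapped in, which is the point of the "large number of controllers" remark.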


Scheduler

• One per macroflow

• Addresses one issue:
  – WHICH stream on macroflow gets to transmit

• Standard API for congestion controller interoperability
  – schedule(), query_share(), notify()
  – This does not presume any scheduler sophistication

• A large number of schedulers are possible
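
A minimal sketch of about the simplest scheduler that fits this interface is a round-robin with equal shares. The names schedule(), query_share(), and notify() come from the slide; their argument lists, the add_stream() and add_request() helpers, and the equal-share policy are assumptions for illustration.

```python
from collections import deque


class RoundRobinScheduler:
    """Hypothetical scheduler: equal shares, strict rotation."""

    def __init__(self):
        self.streams = set()     # all streams on the macroflow
        self.pending = deque()   # stream IDs waiting for a grant

    def add_stream(self, sid):
        self.streams.add(sid)

    def add_request(self, sid):
        """Hypothetical helper: stream sid asked to transmit."""
        self.pending.append(sid)

    def schedule(self):
        """WHICH stream transmits next; None if nobody is waiting."""
        return self.pending.popleft() if self.pending else None

    def query_share(self, sid):
        """Fraction of the macroflow's bandwidth apportioned to sid."""
        return 1.0 / len(self.streams) if sid in self.streams else 0.0

    def notify(self, sid, nsent):
        """Transmission report; plain round-robin has no use for it."""


sched = RoundRobinScheduler()
for sid in (1, 2):
    sched.add_stream(sid)
for sid in (1, 2, 1):
    sched.add_request(sid)
order = [sched.schedule() for _ in range(4)]
```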


Sharing

• All streams on macroflow share congestion state

• What should granularity of macroflow be?
  – [Discussed in November ‘99 IETF]
  – Default is all streams to given destination address
  – Grouping & ungrouping API allows this to be changed by an application program
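
The default grouping can be sketched in a few lines of Python. The StreamInfo type mirrors the five fields listed under terminology; the function name, the example addresses, and the dictionary representation of macroflows are assumptions for illustration.

```python
from collections import defaultdict
from typing import NamedTuple


class StreamInfo(NamedTuple):
    src_addr: str
    src_port: int
    dst_addr: str
    dst_port: int
    ip_proto: int


def default_macroflows(streams):
    """Group streams by destination address (the stated default)."""
    groups = defaultdict(list)
    for s in streams:
        groups[s.dst_addr].append(s)
    return dict(groups)


streams = [
    StreamInfo("10.0.0.1", 4000, "192.0.2.7", 80, 6),     # TCP
    StreamInfo("10.0.0.1", 4001, "192.0.2.7", 80, 6),     # TCP
    StreamInfo("10.0.0.1", 5004, "192.0.2.7", 5004, 17),  # UDP
    StreamInfo("10.0.0.1", 4002, "198.51.100.9", 80, 6),  # TCP, other host
]
macroflows = default_macroflows(streams)
```

Note that the TCP and UDP streams to 192.0.2.7 land in the same macroflow: the grouping key ignores ports and protocol.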


Abstract CM API

• State maintenance

• Data transmission

• Application notification

• Querying

• Sharing granularity


State maintenance

• stream_info is platform-dependent data structure, containing: [src_addr, src_port, dst_addr, dst_port, ip_proto]

• cm_open(stream_info) returns stream ID, sid

• cm_close(sid) SHOULD be called at the end

• cm_mtu(sid) gives path MTU for stream

• Add call for sid ---> stream_info (so non apps can query too)
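
The state-maintenance calls amount to a small table keyed by stream ID. The sketch below is an in-memory stand-in, not the real CM: the StreamTable class, the fixed path MTU, and the cm_stream_info() name for the proposed reverse lookup are all assumptions.

```python
import itertools


class StreamTable:
    """Hypothetical in-memory stand-in for CM stream state."""

    def __init__(self, path_mtu=1500):
        self._next_sid = itertools.count(1)
        self._streams = {}          # sid -> stream_info tuple
        self._path_mtu = path_mtu   # assumed fixed; a real CM would discover it

    def cm_open(self, stream_info):
        sid = next(self._next_sid)
        self._streams[sid] = stream_info
        return sid

    def cm_close(self, sid):
        self._streams.pop(sid, None)   # closing twice is harmless here

    def cm_mtu(self, sid):
        if sid not in self._streams:
            raise KeyError(f"unknown stream {sid}")
        return self._path_mtu

    def cm_stream_info(self, sid):
        """The proposed sid -> stream_info lookup (name is hypothetical)."""
        return self._streams[sid]


cm = StreamTable()
info = ("10.0.0.1", 4000, "192.0.2.7", 80, 6)
sid = cm.cm_open(info)
mtu = cm.cm_mtu(sid)
looked_up = cm.cm_stream_info(sid)
cm.cm_close(sid)
```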


Data transmission

• Two API modes, neither of which buffers data

• Accommodates ALF-oriented applications

• Callback-based

• Application controls WHAT to send at any point in time


Callback-based transmission

[Figure: exchange between Application and CM. 1. cm_request(); 2. cmapp_send() /* callback */]

• Useful for ALF applications

• TCP too
  – On a callback, decide what to send (e.g., retransmission), independent of previous requests
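
The point that the decision is made at grant time, not at request time, can be shown with a small sketch. The names cm_request() and cmapp_send() are from the slide; the CallbackCM stub, its grant_all() trigger, and the TcpLikeSender class are assumptions for illustration.

```python
class CallbackCM:
    """Hypothetical CM stub: grants are issued in request order."""

    def __init__(self):
        self._requests = []   # callbacks of streams awaiting a grant

    def cm_request(self, cmapp_send):
        self._requests.append(cmapp_send)

    def grant_all(self):
        """Stand-in for the CM deciding that the macroflow may transmit."""
        pending, self._requests = self._requests, []
        for cmapp_send in pending:
            cmapp_send()


class TcpLikeSender:
    def __init__(self):
        self.new_data = ["seg3", "seg4"]
        self.retransmit = []   # segments detected as lost
        self.sent = []

    def cmapp_send(self):
        # Decide at grant time: a retransmission goes ahead of new data.
        queue = self.retransmit or self.new_data
        if queue:
            self.sent.append(queue.pop(0))


cm, app = CallbackCM(), TcpLikeSender()
cm.cm_request(app.cmapp_send)    # requested while only new data was queued
app.retransmit.append("seg1")    # loss detected before the grant arrives
cm.cm_request(app.cmapp_send)
cm.grant_all()
```

The first grant carries the retransmission even though the request that earned it was made before the loss was known; the CM never buffers or inspects the data.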


Synchronous transmission

• Applications that transmit off a (periodic) timer loop
  – Send callbacks wreck timing structure

• Use a different callback

• First, register rate and RTT thresholds
  – cm_setthresh() per stream

• cmapp_update(newrate, newrtt, newrttdev) when values change

• Application adjusts period, packet size, etc.
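
A sketch of the synchronous mode, under stated assumptions: the threshold is taken to be a relative change in rate, only the rate threshold is modeled (the RTT threshold is omitted for brevity), and the ThresholdNotifier class stands in for the CM side. Only cmapp_update(newrate, newrtt, newrttdev) and the idea of cm_setthresh() come from the slide.

```python
class AudioSender:
    """Timer-driven app: sends one packet every `period` seconds."""

    def __init__(self, packet_bytes=160):
        self.packet_bytes = packet_bytes
        self.period = 0.020   # 20 ms frames to start with

    def cmapp_update(self, newrate, newrtt, newrttdev):
        # Keep the packet size fixed and stretch the period to fit the rate.
        self.period = max(0.020, self.packet_bytes / newrate)


class ThresholdNotifier:
    """Hypothetical CM side: calls back only on significant rate changes."""

    def __init__(self, app, rate_thresh):
        self.app = app
        self.rate_thresh = rate_thresh   # stand-in for cm_setthresh()
        self.reported_rate = None

    def rate_changed(self, rate, rtt, rttdev):
        last = self.reported_rate
        if last is None or abs(rate - last) / last >= self.rate_thresh:
            self.reported_rate = rate
            self.app.cmapp_update(rate, rtt, rttdev)
            return True
        return False


app = AudioSender()
cm = ThresholdNotifier(app, rate_thresh=0.25)
# Rates in bytes per second: a small dip is ignored, a halving is reported.
calls = [cm.rate_changed(r, 0.1, 0.01) for r in (8000.0, 7500.0, 4000.0)]
```

The application's timer loop is never interrupted by send callbacks; it only hears from the CM when the estimate moves past the registered threshold.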


Application notification

• Tell CM of successful transmissions and congestion
  – cm_update(sid, nrecd, nlost, lossmode, rtt)
  – nrecd, nlost counted since last cm_update call
  – lossmode specifies type of congestion as bit-vector: CM_PERSISTENT, CM_TRANSIENT, CM_ECN

• Should we define more specifics?
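
One way to picture the lossmode bit-vector is as a set of flags that a TCP-like sender fills in before calling cm_update(). The three flag names are from the slides; the numeric bit values, the classify() helper, and its parameters are assumptions, and the event-to-flag mapping follows the examples given in the deck (timeouts, three duplicate ACKs, ECN echo).

```python
from enum import IntFlag


class LossMode(IntFlag):
    # Bit values are assumptions; the slides give only the names.
    CM_PERSISTENT = 1
    CM_TRANSIENT = 2
    CM_ECN = 4


def classify(timeout=False, dup_acks=0, ecn_echo=False):
    """Map TCP-style events to a lossmode bit-vector."""
    mode = LossMode(0)
    if timeout:
        mode |= LossMode.CM_PERSISTENT
    if dup_acks >= 3:
        mode |= LossMode.CM_TRANSIENT
    if ecn_echo:
        mode |= LossMode.CM_ECN
    return mode


mode = classify(dup_acks=3, ecn_echo=True)
```

Because the value is a bit-vector, one report can carry more than one congestion signal at once.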


Notification of transmission

• cm_notify(stream_info, nsent) from IP output routine
  – Allows CM to estimate outstanding bytes

• Each cmapp_send() grant has an expiration
  – max(RTT, CM_GRANT_TIME)

• If app decides NOT to send on a grant, SHOULD call cm_notify(stream_info, 0)

• CM congestion controller MUST be robust to broken or crashed apps that forget to do this
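
The grant lifetime and the role of cm_notify() can be sketched as follows. The value of CM_GRANT_TIME is an assumption (the slides do not give one), and for brevity cm_notify() here takes the grant object rather than stream_info; both the Grant and OutstandingTracker classes are hypothetical.

```python
CM_GRANT_TIME = 0.5   # seconds; assumed value, the slides do not give one


class Grant:
    def __init__(self, issued_at, rtt):
        self.expires_at = issued_at + max(rtt, CM_GRANT_TIME)
        self.used = False

    def expired(self, now):
        """True if the app never reported on this grant in time."""
        return not self.used and now >= self.expires_at


class OutstandingTracker:
    """Hypothetical CM-side accounting driven by cm_notify()."""

    def __init__(self):
        self.outstanding = 0

    def cm_notify(self, grant, nsent):
        grant.used = True          # nsent == 0 hands the grant back unused
        self.outstanding += nsent


tracker = OutstandingTracker()
short_rtt = Grant(issued_at=0.0, rtt=0.1)   # lifetime set by CM_GRANT_TIME
long_rtt = Grant(issued_at=0.0, rtt=0.8)    # lifetime set by the RTT
declined = Grant(issued_at=0.0, rtt=0.1)

tracker.cm_notify(short_rtt, 1460)   # app sent one segment
tracker.cm_notify(declined, 0)       # app chose not to send
# long_rtt is never reported on, as a crashed app would leave it.
```

The expiration is what lets the CM reclaim a grant such as long_rtt without relying on the application to behave.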


Querying

• cm_query(sid, rate, srtt, rttdev) fills values
  – Note: CM may not maintain rttdev, so consider removing this?

• Invalid or non-existent estimate signaled by negative value
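
A caller therefore has to check each returned value before using it. In the sketch below, cm_query() is reduced to a function over a dictionary of known estimates (an assumption; the slide's call takes a sid and fills output parameters), -1.0 is the assumed sentinel, and the usable() helper is hypothetical. The fallback of srtt / 2 for a missing deviation follows the initial value used in standard TCP timer computation (RFC 6298).

```python
def cm_query(state):
    """Return (rate, srtt, rttdev); -1.0 marks an estimate the CM lacks."""
    return tuple(state.get(key, -1.0) for key in ("rate", "srtt", "rttdev"))


def usable(value, fallback):
    """Replace a negative (invalid) estimate with the caller's fallback."""
    return value if value >= 0 else fallback


# A CM that tracks rate and smoothed RTT but not the RTT deviation.
rate, srtt, rttdev = cm_query({"rate": 125000.0, "srtt": 0.08})
rttdev_est = usable(rttdev, srtt / 2)
```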


Sharing granularity

• cm_getmacroflow(sid) returns mflow identifier

• cm_setmacroflow(mflow_id, sid) sets macroflow for a stream
  – If mflow_id is -1, new macroflow created

• Iteration over flows allows grouping
  – Each call overrides previous mflow association

• This API sets grouping, not sharing policy
  – Such policy is scheduler-dependent
  – Examples include proxy destinations, client prioritization, etc.
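
The grouping calls can be exercised with a small table. The two function names and the -1 convention are from the slide; the MacroflowTable class, the integer identifiers, and the choice to return the resulting macroflow ID from cm_setmacroflow() are assumptions.

```python
import itertools


class MacroflowTable:
    """Hypothetical sid -> macroflow map."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._mflow_of = {}

    def cm_getmacroflow(self, sid):
        return self._mflow_of[sid]

    def cm_setmacroflow(self, mflow_id, sid):
        if mflow_id == -1:
            mflow_id = next(self._ids)    # -1 asks for a fresh macroflow
        self._mflow_of[sid] = mflow_id    # overrides any earlier association
        return mflow_id


table = MacroflowTable()
group = table.cm_setmacroflow(-1, 10)   # stream 10 starts a new macroflow
for sid in (11, 12):                    # iterate to pull other streams in
    table.cm_setmacroflow(group, sid)
solo = table.cm_setmacroflow(-1, 12)    # a later call moves stream 12 out
```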


Example applications

• TCP/CM
  – Like RFC 2140, TCP-INT, TCP sessions

• Congestion-controlled UDP

• Real-time streaming applications
  – Synchronous API, esp. for audio

• HTTP server
  – Uses TCP/CM for concurrent connections
  – cm_query() to pick content formats
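
For the HTTP server case, picking a content format from the queried rate might look like the sketch below. The variant names and rate cut-offs are invented for illustration, and the rate argument stands for the value a cm_query() call would report (negative meaning no valid estimate).

```python
# (minimum rate in bytes/s, variant), richest first; all values are made up.
VARIANTS = [(250_000, "video-high"), (60_000, "video-low"), (0, "text-only")]


def pick_variant(rate):
    """Choose the richest variant the reported rate can sustain."""
    if rate < 0:                   # negative means no valid estimate
        return VARIANTS[-1][1]     # be conservative
    for min_rate, name in VARIANTS:
        if rate >= min_rate:
            return name


choice = pick_variant(100_000.0)
```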


Linux implementation

[Figure: Linux implementation. Labels: App stream; cmapp_*(); Stream requests, updates; libcm.a (user-level library; implements API); Control socket for callbacks; System calls (e.g., ioctl); CM macroflows, kernel API; Congestion controller; Scheduler; TCP; UDP-CC; cm_notify(); ip_output(); IP]


Server performance

[Figure: CPU seconds for 200K pkts (y-axis, 0 to 45) vs. packet size in bytes (x-axis, 0 to 1600). Series: cmapp_send(); Buffered UDP-CC; TCP/CM, no delack; TCP, w/ delack; TCP/CM, w/ delack; TCP, no delack]


Security issues

• Incorrect reports of losses or congestion; absence of reports when there’s congestion

• Malicious application can wreck other flows in macroflow

• These are all examples of “NOT-well-behaved applications”

• RFC 2140 has a list
  – Will be incorporated in next revision
  – Also, draft-ietf-ipsec-ecn-02.txt has relevant stuff


Issues for discussion

• Prioritization to override cwnd limitation

• cm_request(num_packets)
  – Request multiple transmissions in a single call

• Reporting variances
  – Should all CM-to-app reports include a variance?

• Reporting congestion state
  – Should we try and define “persistent” congestion?

• Sharing policy interface
  – Scheduler-dependent (many possibilities)


Overriding cwnd limitations

• Prioritization
  – Suppose a TCP loses a packet due to congestion
  – Sender calls cm_update()
  – This causes CM to cut window
  – Now, outstanding exceeds cwnd
  – What happens to the retransmission?

• Solution(?)
  – Add a priority parameter to cm_request()
  – At most one high-priority packet per RTT?
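
The scenario and the proposed fix can be sketched with a toy window. Everything here is an assumption layered on the slide's open question: the Window class, packet-counted windows, the high_priority flag, and the once-per-RTT reset via new_rtt().

```python
class Window:
    """Toy per-macroflow window, counted in packets."""

    def __init__(self, cwnd, outstanding):
        self.cwnd = cwnd
        self.outstanding = outstanding
        self.priority_used = False   # reset once per RTT

    def may_send(self, high_priority=False):
        if self.outstanding < self.cwnd:
            return True
        if high_priority and not self.priority_used:
            self.priority_used = True    # at most one override per RTT
            return True
        return False

    def new_rtt(self):
        self.priority_used = False


w = Window(cwnd=8, outstanding=8)   # window is full
w.cwnd //= 2                        # loss reported via cm_update(): window cut

normal = w.may_send()                           # outstanding (8) exceeds cwnd (4)
retransmit = w.may_send(high_priority=True)     # the one override this RTT
second = w.may_send(high_priority=True)         # refused until the next RTT
```

Without the override, the retransmission would wait until enough outstanding data drained to fall below the reduced window.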


A more complex cm_request()?

• Issue raised by Joe Touch
  – cm_request(num_packets)

• Potential advantage: higher performance due to fewer protection-boundary crossings

• Disadvantage: makes internals complicated

• Observe that:
  – Particular implementations MAY batch together libcm-to-kernel calls, preserving simple app API
  – Benefits may be small (see graph)


Reporting variances

• Some CM calls do not include variances, e.g., no rate-variance reported

• There are many ways to calculate variances
  – These are perhaps better done by each application (e.g., by a TCP)

• The CM does not need to maintain variances to do congestion control

• In fact, our implementation of CM doesn’t even maintain rttdev...
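
As one example of an application computing its own deviation, the sketch below uses the smoothed-RTT and mean-deviation filters of standard TCP (RFC 6298, with gains 1/8 and 1/4). This is the TCP technique, not something the CM draft specifies; the RttEstimator class and the sample values are for illustration only.

```python
class RttEstimator:
    """Smoothed RTT and mean deviation, as in standard TCP (RFC 6298)."""
    ALPHA, BETA = 1 / 8, 1 / 4

    def __init__(self):
        self.srtt = None
        self.rttdev = None

    def sample(self, rtt):
        if self.srtt is None:   # first measurement
            self.srtt, self.rttdev = rtt, rtt / 2
        else:
            # Update the deviation first, using the old smoothed RTT.
            self.rttdev = ((1 - self.BETA) * self.rttdev
                           + self.BETA * abs(self.srtt - rtt))
            self.srtt = (1 - self.ALPHA) * self.srtt + self.ALPHA * rtt


est = RttEstimator()
for rtt in (0.100, 0.100, 0.180):   # seconds; e.g. the rtt passed to cm_update()
    est.sample(rtt)
```

An application that wants a different estimator (a true variance, a windowed maximum, and so on) can substitute it without any change to the CM.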


Semantics of congestion reports

• CM_PERSISTENT
  – Persistent congestion (e.g., TCP timeouts)
  – Causes CM to go back into slow start

• CM_TRANSIENT: Transient congestion, e.g., three duplicate ACKs

• CM_ECN: ECN echoed from receiver

• Should we more precisely define when CM_PERSISTENT should be reported?
  – E.g., no feedback for an entire RTT (“window”)


Sharing policy

• Sender talking to a proxy receiver
  – See, e.g., MUL-TCP

• Client prioritization & differentiation

• These are scheduler issues
  – Particular schedulers may provide interfaces for these and more
  – The scheduler interface specified here is intentionally simple and minimalist

• Vern will talk more about the scheduler


Future Evolution

• Support for non-well behaved applications
  – Likely use of separate headers

• Policy interfaces for sharing

• Handling QoS-enabled paths
  – E.g., delay- and loss-based divisions

• Aging of congestion information for idle periods

• Expanded sharing of congestion information
  – Within cluster and across macroflows