9.CongestionC
-
Upload
insha-rafiq -
Category
Documents
-
view
212 -
download
0
description
Transcript of 9.CongestionC
-
Flow Control
-
TCP Flow Controlreceiver: explicitly informs sender of (dynamically changing) amount of free buffer space RcvWindow field in TCP segmentsender: keeps the amount of transmitted, unACKed data less than most recently received RcvWindowsender wont overrunreceivers buffers bytransmitting too much, too fastreceiver bufferingRcvBuffer = size or TCP Receive Buffer
RcvWindow = amount of spare room in Buffer
-
*TCP Flow Control: How it Worksspare room in buffer= RcvWindowsource port #dest port #applicationdata (variable length)sequence numberacknowledgement numberrcvr window sizeptr urgent datachecksumFSRPAUheadlennotusedOptions (variable length)
-
TCP: setting timeouts
-
TCP Round Trip Time and TimeoutQ: how to set TCP timeout value?longer than RTTnote: RTT will varytoo short: premature timeoutunnecessary retransmissionstoo long: slow reaction to segment lossQ: how to estimate RTT?SampleRTT: measured time from segment transmission until ACK receiptignore retransmissions, cumulatively ACKed segmentsSampleRTT will vary, want estimated RTT smootheruse several recent measurements, not just current SampleRTT
-
High-level IdeaSet timeout = average + safe margin
-
Estimating Round Trip TimeEstimatedRTT = (1- )*EstimatedRTT + *SampleRTTExponential weighted moving averageinfluence of past sample decreases exponentially fasttypical value: = 0.125 SampleRTT: measured time from segment transmission until ACK receipt SampleRTT will vary, want a smoother estimated RTTuse several recent measurements, not just current SampleRTT
-
Setting TimeoutProblem:using the average of SampleRTT will generate many timeouts due to network variations
Solution:EstimtedRTT plus safety marginlarge variation in EstimatedRTT -> larger safety marginTimeoutInterval = EstimatedRTT + 4*DevRTTDevRTT = (1-)*DevRTT + *|SampleRTT-EstimatedRTT|
(typically, = 0.25) Then set timeout interval:
-
An Example TCP Session
-
TCP Round Trip Time and TimeoutSetting the timeoutEstimtedRTT plus safety marginlarge variation in EstimatedRTT -> larger safety margin
EstimatedRTT = (1-x)*EstimatedRTT + x*SampleRTTExponential weighted moving averageinfluence of given sample decreases exponentially fasttypical value of x: 0.1Timeout = EstimatedRTT + 4*DeviationDeviation = (1-x)*Deviation + x*|SampleRTT-EstimatedRTT|
-
Fast retransmit
-
*Fast RetransmitTimeout period often relatively long:long delay before resending lost packet
Detect lost segments via duplicate ACKssender often sends many segments back-to-backif segment is lost, there will likely be many duplicate ACKsIf sender receives 3 ACKs for the same data, it supposes that segment after ACKed data was lost:resend segment before timer expires
-
*Triple Duplicate Ack12345623PacketsAcknowledgements (waiting seq#)47
-
* event: ACK received, with ACK field value of y if (y > SendBase) { SendBase = y if (there are currently not-yet-acknowledged segments) start timer } else { increment count of dup ACKs received for y if (count of dup ACKs received for y = 3) { resend segment with sequence number y Fast Retransmit:a duplicate ACK for already ACKed segmentfast retransmit
-
Congestion Control
-
Principles of Congestion ControlCongestion:informally: too many sources sending too much data too fast for network to handlemanifestations:lost packets (buffer overflow at routers)long delays (queuing in router buffers)a highly important problem!
-
Causes/costs of congestion: scenario 1 two senders, two receiversone router, infinite buffers no retransmission
-
Causes/costs of congestion: scenario 1 Throughput increases with loadMaximum total load C (Each session C/2)Large delays when congestedThe load is stochastic
-
Causes/costs of congestion: scenario 2 one router, finite buffers sender retransmission of lost packet
-
Causes/costs of congestion: scenario 2 always: (goodput)Like to maximize goodput!perfect retransmission:retransmit only when loss: Actual retransmission of delayed (not lost) packet makes larger (than perfect case) for same
-
Causes/costs of congestion: scenario 2 costs of congestion: more work (retrans) for given goodputunneeded retransmissions: link carries (and delivers) multiple copies of pkt
outoutinoutin
-
Causes/costs of congestion: scenario 3 four sendersmultihop pathstimeout/retransmit
Q: what happens as and increase ?
-
Causes/costs of congestion: scenario 3 Another cost of congestion: when packet dropped, any upstream transmission capacity used for that packet was wasted!
-
Approaches towards congestion controlEnd-end congestion control:no explicit feedback from networkcongestion inferred from end-system observed loss, delayapproach taken by TCPNetwork-assisted congestion control:routers provide feedback to end systemssingle bit indicating congestion (SNA, DECbit, TCP/IP ECN, ATM)explicit rate sender should send atTwo broad approaches towards congestion control:
-
Goals of congestion controlThroughput:Maximize goodputthe total number of bits end-endFairness:Give different sessions equal share.Max-min fairnessMaximize the minimum rate session.Single link:Capacity Rsessions mEach sessions: R/m
-
Max-min fairnessModel: Graph G(V,e) and sessions s1 smFor each session si a rate ri is selected.The rates are a Max-Min fair allocation:The allocation is maximalNo ri can be simply increasedIncreasing allocation ri requires reducingSome session jrj riMaximize minimum rate session.
-
Max-min fairness: AlgorithmModel: Graph G(V,e) and sessions s1 smAlgorithmic view:For each link compute its fair share f(e).Capacity / # sessionselect minimal fair share link.Each session passing on it, allocate f(e).Subtract the capacities and delete sessionscontinue recessively.Fluid view.
-
Max-min fairnessExample
Throughput versus fairness.
-
Case study: ATM ABR congestion controlABR: available bit rate:elastic service if senders path underloaded: sender can use available bandwidthif senders path congested: sender lowers ratea minimum guaranteed rateAim: coordinate increase/decrease rateavoid loss!
-
Case study: ATM ABR congestion controlRM (resource management) cells:sent by sender, in between data cellsone out of every 32 cells.RM cells returned to sender by receiverEach router modifies the RM cellInfo in RM cell set by switchesnetwork-assisted 2 bit info.NI bit: no increase in rate (mild congestion)CI bit: congestion indication (lower rate)
-
Case study: ATM ABR congestion controltwo-byte ER (explicit rate) field in RM cellcongested switch may lower ER value in cellsender send rate thus minimum supportable rate on pathEFCI bit in data cells: set to 1 in congested switchif data cell preceding RM cell has EFCI set, sender sets CI bit in returned RM cell
-
Case study: ATM ABR congestion controlHow does the router selects its action: selects a rateSet congestion bitsVendor dependent functionalityAdvantages:fast responseaccurate responseDisadvantages:network level designIncrease router tasks (load).Interoperability issues.
-
End to end control
-
End to end feedbackAbstraction:Alarm flag. observable at the end stations
-
Simple Abstraction
-
Simple Abstraction
-
Simple feedback modelEvery RTT receive feedbackHigh CongestionDecrease rate
Low congestionIncrease rate
Variable rate controls the sending rate.
-
Multiplicative UpdateCongestion:Rate = Rate/2No Congestion:Rate= Rate *2Performance Fast responseUn-fair:Ratios unchanged
-
Additive UpdateCongestion:Rate = Rate -1No Congestion:Rate= Rate +1Performance Slow responseFairness: Divides spare BW equallyDifference remains unchanged
-
AIMD SchemeAdditive IncreaseFairness: ratios improvesMultiplicative DecreaseFairness: ratio unchangedFast responsePerformance:Congestion -Fast responseFairness
-
AIMD: Two users, One linkBW limitFairnessRate of User 1Rate of User 2
-
*TCP Congestion ControlClosed-loop, end-to-end, window-based congestion controlDesigned by Van Jacobson in late 1980s, based on the AIMD alg. of Dah-Ming Chu and Raj JainWorks well so far: the bandwidth of the Internet has increased by more than 200,000 timesMany versionsTCP/Tahoe: this is a less optimized versionTCP/Reno: many OSs today implement Reno type congestion controlTCP/Vegas: not currently usedFor more details: see TCP/IP illustrated; or read http://lxr.linux.no/source/net/ipv4/tcp_input.c for linux implementation
-
TCP/Reno Congestion Detection
3 dup ACKs indicates network capable of delivering some segments
timeout is more alarming
Philosophy:Detect congestion in two cases and react differently:
3 dup ACKstimeout event
-
Basic StructureTwo phasesslow-start: MIcongestion avoidance: AIMD
Important variables:cwnd: congestion window sizessthresh: threshold between the slow-start phase and the congestion avoidance phase
-
Visualization of the Two Phases
-
Slow Start: MIWhat is the goal? getting to equilibrium gradually but quickly
Implements the MI algorithmdouble cwnd every RTT until network congested get a rough estimate of the optimal of cwnd
-
Slow-startcwnd = 2cwnd = 4cwnd = 6Initially:cwnd = 1;ssthresh = infinite (e.g., 64K);
For each newly ACKed segment:if (cwnd < ssthresh) /* slow start*/ cwnd = cwnd + 1;cwnd = 8
-
Startup Behavior with Slow-startSee [Jac89]
-
*TCP/Reno Congestion AvoidanceMaintains equilibrium and reacts around equilibrium
Implements the AIMD algorithmincreases window by 1 per round-trip time (how?)cuts window size to half when detecting congestion by 3DUPto 1 if timeoutif already timeout, doubles timeout
-
*TCP/Reno Congestion AvoidanceInitially:cwnd = 1;ssthresh = infinite (e.g., 64K);For each newly ACKed segment:if (cwnd < ssthresh) /* slow start*/ cwnd = cwnd + 1;else /* congestion avoidance; cwnd increases (approx.) by 1 per RTT */ cwnd += 1/cwnd;Triple-duplicate ACKs: /* multiplicative decrease */cwnd = ssthresh = cwnd/2;Timeout:ssthresh = cwnd/2;cwnd = 1;(if already timed out, double timeout value; this is called exponential backoff)
-
*TCP/Reno: Big Picture TimecwndslowstartcongestionavoidanceTDTD: Triple duplicate acknowledgementsTO: TimeoutTOssthreshssthreshssthreshssthreshcongestionavoidanceTDcongestionavoidanceslow startcongestionavoidanceTD
-
*A SessionQuestion: when cwnd is cut to half, why sending rate is not?
-
*TCP/Reno Queueing DynamicsConsider congestion avoidance onlyTimecwndcongestionavoidanceTDssthreshbottleneckbandwidthThere is a filling and draining of buffer process for each TCP flow.
-
TCP & AIMD: congestionDynamic window size [Van Jacobson]Initialization: Slow startSteady state: AIMDCongestion = timeoutTCP TahoeCongestion = timeout || 3 duplicate ACKTCP Reno & TCP new RenoCongestion = higher latencyTCP Vegas
-
TCP Congestion Controlend-end control (no network assistance)transmission rate limited by congestion window size, Congwin, over segments:w segments, each with MSS bytes sent in one RTT:Congwin
-
TCP congestion control:probing for usable bandwidth: ideally: transmit as fast as possible (Congwin as large as possible) without lossincrease Congwin until loss (congestion)loss: decrease Congwin, then begin probing (increasing) again
two phasesslow startcongestion avoidance important variables:Congwinthreshold: defines threshold between two slow start phase, congestion control phase
-
TCP Slowstartexponential increase (per RTT) in window size (not so slow!)In case of timeout:Threshold=CongWin/2initialize: Congwin = 1for (each segment ACKed) Congwin++until (congestion event OR CongWin > threshold)Host Aone segmentRTTHost Btwo segmentsfour segments
-
TCP Tahoe Congestion Avoidance/* slowstart is over */ /* Congwin > threshold */Until (timeout) { /* loss event */ every ACK: Congwin += 1/Congwin }threshold = Congwin/2Congwin = 1perform slowstartCongestion avoidanceTCP Taheo
-
TCP RenoFast retransmit:After receiving 3 duplicate ACKResend first packet in window.Try to avoid waiting for timeoutFast recovery:After retransmission do not enter slowstart.Threshold = Congwin/2Congwin = 3 + Congwin/2Each duplicate ACK received Congwin++After new ACK Congwin = Threshold return to congestion avoidanceSingle packet drop: great!
-
TCP Congestion Window Trace
-
TCP Vegas:Idea: track the RTTTry to avoid packet losslatency increases: lower ratelatency very low: increase rateImplementation:sample_RTT: current RTTBase_RTT: min. over sample_RTTExpected = Congwin / Base_RTTActual = number of packets sent / sample_RTT =Expected - Actual
-
TCP Vegas = Expected - ActualCongestion Avoidance:two parameters: and , ) Congwin = Congwin -1Otherwise no changeNote: Once per RTTSlowstartparameter If ( > ) then move to congestion avoidanceTimeout: same as TCP Taheo
-
TCP Dynamics: RateTCP Reno with NO Fast Retransmit or RecoverySending rate: Congwin*MSS / RTTAssume fixed RTT WW/2Actual Sending rate: between W*MSS / RTT and (1/2) W*MSS / RTT Average (3/4) W*MSS / RTT
-
TCP Dynamics: LossLoss rate (TCP Reno)No Fast Retransmit or RecoveryConsider a cycleTotal packet sent: about (3/8) W2 MSS/RTT = O(W2)One packet lossLoss Probability: p=O(1/W2) or W=O(1/p)
-
TCP latency modelingQ: How long does it take to receive an object from a Web server after sending a request? TCP connection establishmentdata transfer delay
Notation, assumptions:Assume one link between client and server of rate RAssume: fixed congestion window, W segmentsS: MSS (bits)O: object size (bits)no retransmissionsno loss, no corruption
-
TCP latency modelingOptimal Setting: Time = O/R
Two cases to consider:WS/R > RTT + S/R: ACK for first segment in window returns before windows worth of data sentWS/R < RTT + S/R: wait for ACK after sending windows worth of data sent
-
TCP latency ModelingCase 1: latency = 2RTT + O/RCase 2: latency = 2RTT + O/R+ (K-1)[S/R + RTT - WS/R]K:= O/WS
-
TCP Latency Modeling: Slow StartNow suppose window grows according to slow start. Will show that the latency of one object of size O is: where P is the number of times TCP stalls at server:- where Q is the number of times the server would stall if the object were of infinite size.
- and K is the number of windows that cover the object.
-
TCP Latency Modeling: Slow Start (cont.)Example:
O/S = 15 segments
K = 4 windows
Q = 2
P = min{K-1,Q} = 2
Server stalls P=2 times.
-
TCP Latency Modeling: Slow Start (cont.)
*******************