Congestion Control In The Internet Part 2: How it is ......1. Congestion Control in the Internet is...
Transcript of Congestion Control In The Internet Part 2: How it is ......1. Congestion Control in the Internet is...
Congestion Control In The Internet
Part 2: How it is implementedin TCP
JYLeBoudec2014
1
Contents
1. CongestioncontrolinTCP2. ThefairnessofTCP3. Theloss‐throughputformula4. ExplicitCongestionNotification(ECN)andRED5. OtherBellsandWhistles
2
1. Congestion Control in the Internet is in TCP
TCPisusedtoavoidcongestionintheInternetinadditiontowhatwasshown:aTCPsourceadjustsitswindowtothecongestionstatusoftheInternet(slowstart,congestionavoidance)thisavoidscongestioncollapseandensuressomefairness
TCPsourcesinterpretslossesasanegativefeedbackusetoreducethesendingrate
UDPsourcesareaproblemfortheInternetuseforlonglivedsessions(ex:RealAudio)isathreat:congestioncollapseUDPsourcesshouldimitateTCP:“TCPfriendly”
3
TCP Congestion Control is based on AIMD
TCPadjuststhewindowsize(inadditiontoofferedwindowiecreditmechanism)
basedonthebeliefthatrate
W=min(cwnd, OfferedWindow)OfferedWindow =windowobtainedbyTCP’swindowfieldcwnd =controlledbyTCPcongestioncontrol
PrinciplesofTCPCongestionControlnegativefeedback=loss,positivefeedback=ACKreceivedAdditiveIncrease(+1MaximumSegmentSize‐‐ MSS),MultiplicativeDecrease( 0.5)Slowstartwithincreasefactor 2
InadditiontoAIMD:whenlossisdetectedbytimeout‐>slowstart
Lossdetectedbyfastretransmit=>fastrecovery(noslowstart) 4
A Trace
5
created from data from: IEEE Transactions on Networking, Oct. 95, “TCP Vegas”, L. Brakmo and L. Petersen
twnd (= ssthresh)
cwnd
A B C
0 1 2 3 4 5 6 7 8 9
60
30
0
Bytes
seconds
A Trace
6
created from data from: IEEE Transactions on Networking, Oct. 95, “TCP Vegas”, L. Brakmo and L. Petersen
twnd
cwnd
congestion avoidanceslowstart congestion avoidance
A B C
B C
0 1 2 3 4 5 6 7 8 9
60
30
0
Bytes
seconds
How TCP approximates……multiplicativedecrease: (onlossdetectionbytimeout)
cwnd = 0.5 current_windowcwnd = max (cwnd, 2 MSS)
…additiveincrease:foreachACKreceived
cwnd = cwnd + MSS MSS / cwndThisisequivalentinpacketsto
cwnd ← cwnd1
cwndcwnd = min (cwnd, max‐size) (64KB)
7
cwnd=
1 MSS
32 4
How TCP approximates…
…multiplicativeincrease:(SlowStart)non dupl. ack received during slow start ‐>
cwnd = cwnd + MSS (in bytes) (1)if cwnd = twnd then go to congestion avoidance
(1)isequivalentinpacketstocwnd = cwnd + 1 (in packets)
8
32 5cwnd = 1 seg 4 6 7 8
AIMD and Slow Start
9
loss
loss
window size
slow start
A finite state machine description (incomplete)
10
exponential increase for cwnd until cwnd = twnd
Slow Start
additiveincrease for twnd, cwnd = twnd
CongestionAvoidance
cwnd = twnd
retransmissiontimeout:
- multiplicative decrease for twnd- cwnd = 1 seg
retransmissiontimeout:
- multiplicative decrease for twnd- cwnd = 1 seg
connection opening: twnd = 65535 Bcwnd = 1 seg
notesthis shows only 2 states out of 3twnd = target window
Assume a TCP flow uses WiFi with high loss ratio. Assume some packets are lost in spite of WiFi retransmissions.
When a packet is lost on the WiFi link…
1. TheTCPsourceknowsitisalossduetochannelerrorsandnotcongestion,thereforedoesnotreducethewindow
2. TheTCPsourcethinksitisacongestionlossandreducesitswindow
3. ItdependsiftheMAClayerusesretransmissions
4. Idon’tknow11
1. 2. 3. 4.
16%
2%
19%
63%
Fast Recovery
Slowstartusedwhenweassumethatthenetworkconditionisneworabruptlychangingi.e.atbeginningandafterlossdetectedbytimeoutInallotherpacketlossdetectionevents,slowstartisnotused,but“fastrecovery”isusedinsteadWithFastRecovery
targetwindowishalvedButcongestionwindowisallowedtoincreasebeyondthetargetwindowuntilthelossisrepaired
12
Fast Recovery Details
Whenlossisdetectedby3duplicateackstwnd = 0.5 current-sizetwnd = max (twnd, 2 MSS)cwnd = twnd + 3 MSS (exp. increase)cwnd = min (cwnd, 64K)
ForeachduplicatedACKreceivedcwnd = cwnd + MSS (exp. increase)cwnd = min (cwnd, 64K)
Whenlossisrepairedcwnd = twndGoto congestion avoidance
13
Fast Recovery Example
14
twnd=cwnd = 800seq=201:301seq=301:351seq=351:401seq=401:501
twnd=cwnd=813seq=501:601
seq=601:701
seq=701:751twnd=407, cwnd=707
seq=201:301seq=751:801
twnd=407, cwnd=807twnd=407, cwnd=907
seq=801:901twnd=407, cwnd=1007twnd=407, cwnd=407
Ack = 101,win=1’000
Ack = 101,win=1’000
Ack = 101,win=1’000Ack = 101,win=1’000
Ack = 101,win=1’000
Ack = 801,win=1’000
1234567891011121314151617181920
Ack = 101,win=1’000
Ack = 101,win=1’000
TcpMaxDupACKs=3
15
Attime1,thesenderisin“congestionavoidance”mode.Thecongestionwindowincreaseswitheveryreceivednon‐duplicateack (asattime6).Thetargetwindow(alsocalledslow‐startthreshold)isequaltothecongestionwindow.
Thesecondpacketislost.
Attime12,itslossisdetectedbyfastretransmit,i.e.receptionof3duplicateacks.Thesendergoesinto“fastrecovery”mode.Thetargetwindowissettohalfthevalueofthecongestionwindow;thecongestionwindowissettothetargetwindowplus3packets(oneforeachduplicateack received).
Attime13thesourceretransmitsthelostpacket.Attime14ittransmitsafreshpacket.Thisispossiblebecausethewindowislargeenough.Thewindowsize,whichistheminimumofthecongestionwindowandtheadvertisedwindow,isequalto707.Sincethelastacked byteis101,itispossibletosendupto807.
Attimes15,16and18,thecongestionwindowisincreasedby1MMS,i.e.100bytes,byapplicationofthecongestionavoidancealgorithm.Thisallowstosendonefreshpacketattime17.
Attime18thelostpacketisacked,thesourceexitsthefastrecoverymodeandenterscongestionavoidance.Thecongestionwindowissettothetargetwindow.
How many new segments of size 100 bytes can the source send at time 20 ?
A. 1B. 2C. 3D. 4
F. 0G. Idon’t know
16
1 2 3 4 0 I don’t know
0% 0% 0%0%0%0%0%
Fast Recovery Example
18
twnd
cwnd
0 1 2 3 4 5 6
60
30
0
Bytes
seconds
E F
A B C D E F
A-B, E-F: fast recoveryC-D: slow start
A finite state machine description
19
Slow Start
- exponentialincrease
CongestionAvoidance
- additiveincrease
FastRecovery
- exponential increase
beyond twnd
new connection:
retr. timeout:
cwnd = twnd:
retr. timeout:
retr. timeout: expected ack received:
fast retransmit:
fastretransmit:
2. Fairness of TCP
TCPimplementssomeapproximationofAIMDAIMDisdesignedtobeapproximatelyfairinasinglelinkscenarioQ:whathappensinanetwork?DoesTCPdistributeratesaccordingtomax‐minfairnessorproportionalfairness?
20
Does TCP distribute rates according to max‐min fairness or proportional fairness ?
1. Max‐min2. Proportional3. Noneoftheabove4. Idon’tknow
21
1. 2. 3. 4.
23%
0%
32%
45%
TCP and RTT
TCPtendstodistributeratesoastomaximizeutilityofsourcegivenby
√
Theutility dependsontheroundtriptime ;
23
TheutilityU isadecreasingfunction
of
Whatdoesthisimply?
and send to destination using one TCP connection each, RTTs are 60ms and 140ms. Bottleneck is link « router‐destination ». Who
gets more ?
getsahigherthroughputgetsahigherthroughput
3. Bothgetthesame4. Idon’tknow
24
router destination10 Mb/s, 20 ms 1 Mb/s 10 ms
10 Mb/s, 60 ms
S1
S2
1. 2. 3. 4.
81%
2%7%9%
The Bias of TCP Against Large RTTs
WithTCP,twocompetingsourceswithdifferentRTTsarenottreatedequally
sourcewithlargeRTTobtainsless
Asourcethatusesmanyhopsobtainslessratebecauseoftwocombinedfactors,oneisgood,theotherisbad:1. thissourceusesmoreresources.Themechanicsof proportional
fairnessleadstothissourcehavinglessrate– thisisdesirableinviewofthetheoryoffairness.
2. thissourcehasalargerRTT.Themechanicsof additiveincreaseleadstothissourcehavinglessrate– thisisabuginthedesignofTCP
Causeis:additiveincreaseisonepacketperRTT(insteadofonepacketperconstanttimeinterval)
26
3. TCP Loss ‐ Throughput Formula
27
Consideralarge TCPconnection(manybytestotransmit)Assumeweobservethat,inaverage,afractionq ofpacketsislost(ormarkedwithECN)Canwesaysomethingabouttheexpectedthroughputforthisconnection?
TCP Loss ‐ Throughput Formula
28
TCPconnectionthroughput (bytespersecond)RTTT (seconds)segmentsizeL(bytes)averagepacketlossratio ∈ 0,1constantC =1.22
Assumes:transmissiontimenegligiblecomparedtoRTT,lossesarerare,timespentinSlowStartandFastRecoverynegligible
Guess the ratio between the throughputs 1and 2 of S1 and S2
5. Noneoftheabove6. Idon’tknow
291. 2. 3. 4. 5. 6.
8%
2%6%6%
17%
62%
router destination10 Mb/s, 20 ms 1 Mb/s 10 ms
10 Mb/s, 60 ms
S1
S2
4. Explicit Congestion Notification (ECN)
Whyinvented:signalcongestionwithoutdroppingpacketsdroppingpacketsisnotnice(delay,retransmissions)(=DECbit)Howdoesitwork:routermarkspacketinsteadofdroppingTCPdestinationechoesthemarkbacktosource
Atthesource,TCPinterpretsamarkedpacketinthesamewayasiftherewouldbealossdetectedbyfastretransmit
31
Explicit Congestion Notification (ECN)
32
payloadTCP
headerIPheader
SR
S reduces window by 1/2
ECN Flags2bitsinIPheader(directpath)codefor4possiblecodepointsnonECNCapable(nonECT)ECNcapableandnocongestionECT(0)andECT(1)ECNcapableandcongestionexperienced(CE)
3bitsinTCPheader(returnpath)ECE(ECNecho)bit= 0or1CWR=ack ofECE=1received(windowreduced)ECE=1issetbyRuntilRreceivesaTCPsegmentwithCWR=1NS=nonce,usedforStocheckthatECNistakenseriously.
33
ECT(0)CE
CE
ECT(0) ECE=1
ECT(0) ECE=1
Put correct labels
1. 1=CA,2=SS2. 1=SS,2=MD3. 1=CA,2=MD4. Idon’tknow
34
assume TCP with ECN is used and there is no packet loss CA :congestion avoidance=additive increase
SS : slow start = multiplicative increase
MD : multiplicative decrease
1. 2. 3. 4.
15%
0%
69%
15%
Nonce Sum(RFC 3540)
Whyinvented?SusesECN routersdonotdroppackets,useECNinsteadbutECNcouldbepreventedby:RnotimplementingECN,NATsthatdropECNinfoinheader
Insuchcases,theflowofSisnot congestioncontrolled;thisshouldbedetected‐>reverttononECNWhatdoesitdo?NonceSumbitinTCPheader(NS)isusedbyStoverifythatECNworksforthisTCP
36
ECT(0)CE
CE
ECT(0) ECE=1
ECT(0) ECE=1
Nonce Sum(RFC 3540)
Howdoesitwork?SourceSrandomlychoosesECT(0)orECT(1)inIPheaderandverifiesthatthereceivedNSbitiscompatiblewiththeECT(0)/ECT(1)chosenbyS
Noncongestedrouterdoesnothing;congestedrouterseesECT(0)orECT(1)andmarkpacketasCEinsteadofdroppingitRechoesbacktoSthexor ofallECTbitsinNSfieldofTCPheader;IfRdoesnottakeECNseriously,NSdoesnotcorrespondandSdetectsit;SdetectsthatECNdoesnotwork;MaliciousRcannotcomputeNScorrectlybecauseCEpacketsdonotcarryECTbitIfrouterorNATdropsECNbitsthenRcannotcomputethecorrectNSbitandSdetectsthatECNdoesnotwork
37
ECT(0)CE
CE
ECT(0) ECE=1
ECT(0) ECE=1
From
RFC 35
40
38
RED (Random Early Detection)
ProblemsolvedbyRED=whentomarkapacketwithECNPrinciple:queueestimatesitsaveragequeuelength
avg a measured + (1 - a) avg
incomingpacketismarkedwithprobabilitygivenbyREDcurve
auniformization procedureisalsoappliedtopreventburstsofmarking
39
avg (queue size)th‐min th‐max
max‐p
1
q (marking probability)
Active Queue Management
REDcanalsobeappliedevenifECNisnotsupportedInsuchacase,apacketmaybedroppedevenifthereissomespaceavailableExpectedbenefit
avoidconstantlyfullbuffersavoidirregulardroppatterns
ThisiscalledActiveQueueManagementasopposedtopassivequeuemanagement=dropapacketwhenqueueisfull=“TailDrop”
40
In a network where all flows use TCP with ECN and all routers support ECN, we expect that …
1. thereisnopacketloss2. thereisnopacketloss
duetocongestion3. thereisnopacketloss
duetocongestioninrouters
4. noneoftheabove5. Idon’tknow
411. 2. 3. 4. 5.
0% 0%0%0%0%
5. Other Bells and WhistlesTCP for Very High Speed Networks
(slidefromPresentation:"CongestionControlonHigh‐SpeedNetworks”,Injong Rhee,Lisong Xu,Slide7)thefigureassumeslossesaredetectedbyfastretransmitandignorestheshort“fastrecovery”phase
Packet loss
Time (RTT)Congestion avoidance
Packet loss Packet losscwnd
Slow start
Packet loss
100,000 10Gbps
50,000 5Gbps
1.4 hours 1.4 hours 1.4 hours
TCP
1
SlowIncreasecwnd =cwnd +
1 0.5
FastDecreasecwnd =cwnd *
0.5
CUBIC, FastTCP etc.
Whyinvented?Onlongfatnetworks(verylargebandwidthdelayproduct),AIMDistooslow
SolutionistoincreasemorequicklythanAIMDButonlyinconditionswhereoneTCPconnectioncanobtainaverylargethroughput
CubicisonesuchmodificationofTCPBehaveslikeTCPwhenRTTislessthan100msec,otherwiseismoreaggressiveImplementedinLinuxservers
43
TCP Friendly Applications
UDPapplicationscanbeathreattonetworkperformanceiftheygeneratealotoftraffic;SomeUDPapplicationssendatconstantrate(e.g.ahighfrequencysensor);careshouldbedoneinsuchnetworkstoavoidcongestioncollapse(controltherateandthecapacityofthenetwork)SomeotherUDPapplicationscanadapttheirrate(e.g.VBRvideo);theyimitateTCPandarecalledTCP‐friendly:
applicationdeterminesthesendingratehencecodingratefeedbackir receivedinformofcountoflostpacketssendingrateissetto:rateofaTCPflowexperiencingthesamelossratio,usingthelossthroughputformulaSeeforexampletheTFRC(TCP‐FriendlyRateControl)protocol
44
Facts to remember
45
TCPperformscongestioncontrolinend‐systemssenderincreasesitssendingwindowuntillossoccurs,thendecreases
additiveincrease(noloss)multiplicativedecrease(loss)
NegativefeedbackiseitherpacketdroporECNNegativebiastowardslongroundtriptimesispartlyundueUDPapplicationsshouldbehavelikeTCPwiththesamelossrateorshouldbeisolatede.g.byclassbasedqueuing.