Fast Convergence techniques-APNIC-42wiki.bdnog.org/lib/exe/fetch.php/bdnog7/fast... · 50-ms...
Need for Fast Convergence
It is not only browsing, mail and watching videos anymore. The Internet and networks now carry voice/video calls, as well as business and mission-critical data.
There is no room for outage or interruption.
Need for Fast Convergence
A few years ago, convergence time in an Ethernet network was about 2 minutes.
At present it takes a few seconds when no fast convergence techniques are applied in the interface and protocol configuration.
But many critical services demand <50 ms convergence time in a carrier-grade network.
Design Consideration
• Network Topology
• IP Planning
• IGP Fine Tuning
• Scaling BGP
• Type of Service Delivery
Network Topology: Bad Example
Network Topology: Better Example
Better IP Plan, Better Convergence
• A domain/area-based IP plan must be in place to minimize the number of prefixes
• Prefix summary or area summary is very effective for aggregating individual small prefixes within the area
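As a rough illustration of why an area-based plan helps, the sketch below collapses contiguous prefixes into one summary using Python's standard `ipaddress` module. The 10.10.x.0/24 allocation is a made-up example, not taken from the slides.

```python
import ipaddress

# Hypothetical allocation: four contiguous /24s assigned to a single area.
area_prefixes = [
    ipaddress.ip_network("10.10.0.0/24"),
    ipaddress.ip_network("10.10.1.0/24"),
    ipaddress.ip_network("10.10.2.0/24"),
    ipaddress.ip_network("10.10.3.0/24"),
]

# collapse_addresses merges adjacent networks into the smallest covering set.
summary = list(ipaddress.collapse_addresses(area_prefixes))
print(summary)
```

With a contiguous plan the area can be advertised as one prefix instead of four; a scattered allocation would not collapse this way.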
IGP Fast Convergence
• Failure Detection
• Event Propagation
• SPF Run
• RIB/FIB Update
• Time to detect the network failure, e.g. interface down condition.
• Time to propagate the event, i.e. flood the LSA across the topology.
• Time to perform SPF calculations on all routers upon reception of the new information.
• Time to update the forwarding tables for all routers in the area.
Purging the RIB on link failure
• Routing protocols are more efficient than the RIB process at detecting a link failure and deleting the associated next-hop routes of the failed interface. Enabling this feature reduces convergence time significantly, especially with a large routing table.
ip routing protocol purge interface
Link Failure Detection Process
Here are a few methods to detect a link failure:
1. IGP keepalive timers/fast hellos with a dead/hold interval of one second and sub-second hello intervals. This is CPU hungry.
2. carrier-delay msec 0, at the physical layer
3. BFD, an open standard that is more reliable than IGP keepalives/fast hellos
Link Failure Detection
• Set carrier-delay to 0 ms to change the link state instantly. If you are using any other transport service such as SDH or DWDM, set the value according to your transport network.
int gi0/0/1
 carrier-delay msec 0
Link Failure Detection
• Enable BFD to notify routing protocols about a link failure within a sub-second interval. Without BFD it will take at least 1 second.
int gi0/0/1
 ip ospf bfd
 bfd interval 50 min_rx 50 multiplier 3
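The BFD timers above translate into a detection time by simple arithmetic. The sketch below assumes both ends use the same interval and multiplier; the function name is illustrative only.

```python
def bfd_detection_time_ms(interval_ms, multiplier):
    """A neighbor is declared down after `multiplier` consecutive
    BFD packets are missed, each expected every `interval_ms`."""
    return interval_ms * multiplier

# Values from the slide: bfd interval 50 min_rx 50 multiplier 3
detection = bfd_detection_time_ms(50, 3)
print(detection, "ms")
```

This gives 150 ms, compared with at least 1 second when relying on IGP hellos alone.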
Link Failure Detection
• On an Ethernet interface, IS-IS/OSPF will attempt to elect a DIS/DR when it forms an adjacency
– As the link is running as a point-to-point link, configuring IS-IS/OSPF to operate in "point-to-point mode" reduces link failure detection time
int gi0/0/1
 isis network point-to-point
int gi0/0/1
 ip ospf network point-to-point
SPF Calculation
• Incremental SPF (iSPF) further minimizes the amount of calculation needed when partial changes occur in the network
• iSPF needs to be enabled under the OSPF/IS-IS process
router ospf 10
 ispf
Set Overload Bit
• Wait until iBGP is running before providing a transit path
router isis isp
 set-overload-bit on-startup wait-for-bgp
router ospf 10
 max-metric router-lsa on-startup wait-for-bgp
• Avoids blackholing traffic on router restart
• Causes OSPF/IS-IS to announce its prefixes with the highest possible metric until iBGP is up and running
Non-Stop Forwarding
• Cisco NSF with SSO, or Juniper Nonstop Active Routing, for systems with dual route processors allows a router that has experienced a hardware or software failure of the active route processor to maintain data link layer connections and to continue forwarding packets during the switchover to the standby route processor
Event Propagation
After Link Down Event | Remarks | Command
LSA generation delay | timers throttle lsa initial hold max_wait | timers throttle lsa 0 20 1000
LSA reception delay | This delay is the sum of the ingress queuing delay and the LSA arrival delay | timers pacing retransmission 100
Processing delay | timers pacing flood (ms) with the default value of 55 ms | timers pacing flood 15
Packet propagation delay | 12 usec for a 1500-byte packet over a 1 Gbps link | N/A
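The 12 usec figure quoted for a 1500-byte packet on a 1 Gbps link matches the time needed to clock the packet's bits onto the wire. A minimal check, with a function name chosen only for this example:

```python
def serialization_delay_us(packet_bytes, link_bps):
    """Time to place a packet on the wire, in microseconds."""
    return packet_bytes * 8 * 1_000_000 / link_bps

delay_1g = serialization_delay_us(1500, 1_000_000_000)
print(delay_1g, "usec")
```

At this scale the per-hop transmission time is negligible next to the timer values in the table.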
RIB/FIB Update
Link/Node Down → SPF Calculation → RIB Update → FIB Update → Communication
The fewer the prefixes, the less time it takes to converge the RIB and FIB
RIB/FIB Update
• After completing the SPF computation, OSPF/IS-IS performs a sequential RIB update to reflect the changed topology. The RIB updates are then propagated to the FIB table
• The RIB/FIB update process may contribute the most to the convergence time in topologies with a large number of prefixes, e.g. thousands or tens of thousands
• The platform you are using also matters: a higher-capacity CPU and more RAM will deliver better performance.
Configuration Template
router ospf 10
 max-metric router-lsa on-startup wait-for-bgp
 timers lsa arrival 50
 timers throttle lsa all 10 100 1000
 timers throttle spf 10 100 1000
 timers pacing flood 5
 timers pacing retransmission 60
 ispf
 bfd all-interfaces
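The three values in a throttle timer (initial delay, hold time, maximum wait) describe an exponential backoff. The sketch below is a simplified model of that behaviour under continuous churn: the first event waits the initial delay, and each later wait doubles from the hold time up to the maximum. It assumes events keep arriving and ignores the reset after a quiet period; it is not the router's actual scheduler.

```python
def throttle_waits(initial_ms, hold_ms, max_ms, events):
    """Successive wait intervals for `events` back-to-back triggers."""
    waits = []
    current_hold = hold_ms
    for i in range(events):
        if i == 0:
            waits.append(initial_ms)          # first run: initial delay only
        else:
            waits.append(min(current_hold, max_ms))
            current_hold = min(current_hold * 2, max_ms)  # double, capped
    return waits

# Values from "timers throttle spf 10 100 1000"
spf_waits = throttle_waits(10, 100, 1000, 7)
print(spf_waits)
```

A single failure is therefore handled after 10 ms, while a flapping link is progressively dampened up to one second between runs.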
Configuration Template
router isis ISP
 set-overload-bit on-startup wait-for-bgp
 spf-interval 5 1 20
 lsp-gen-interval 5 1 20
 prc-interval 5 1 20
 fast-flood 10
 bfd all-interfaces
 ispf level-1-2 60
Final Calculation
Event | Time (ms) | Remarks
Failure detection delay: carrier-delay msec 0 | 0 | about 5-10 ms worst case to detect
With BFD | 150 | multiplier 3 is the last count: 50 ms interval
Maximum SPF runtime | 64 | doubling for safety makes it 64 ms
Maximum RIB update | 20 | doubling for safety makes it 20 ms
OSPF interface flood pacing timer | 5 | does not apply to the initial LSA flooded
LSA generation initial delay | 10 | enough to detect multiple link failures resulting from an SRLG failure
SPF initial delay | 10 | enough to hold SPF to allow two consecutive LSAs to be flooded
Network geographical size / physical media (fiber) | 0 | signal propagation is negligible
Final FIB update time: maximum 500 ms. It is sub-second convergence.
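Adding up the figures from the table gives a feel for how much headroom the 500 ms ceiling leaves. The sketch below simply sums the BFD case; treating the components as strictly additive is a simplifying assumption made here, not a claim from the slides.

```python
# Per-component budget in milliseconds, taken from the table (BFD case).
budget_ms = {
    "failure detection (BFD, 50 ms x 3)": 150,
    "maximum SPF runtime": 64,
    "maximum RIB update": 20,
    "OSPF interface flood pacing": 5,
    "LSA generation initial delay": 10,
    "SPF initial delay": 10,
    "signal propagation": 0,
}

total_ms = sum(budget_ms.values())
print(total_ms, "ms, within 500 ms:", total_ms <= 500)
```

Failure detection is the largest single item, which is why the detection method dominates the tuning effort.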
Beyond Sub-second Convergence
But if you need <50 ms convergence time, you need to do more:
i. RSVP-based link/node protection route
ii. LDP-based LFA-FRR
50-ms Convergence: Do we really need this?
• Most of the applications and services we use today are fine with sub-second (500 ms) convergence.
• A few applications, such as stock trading, mobile phone recharge and some poorly written apps, ask for 50 ms convergence.
• L2 circuit emulation over IP sometimes breaks above 100 ms
• http://www.ethernetacademy.net/Ethernet-Academy-Articles/putting-50-milliseconds-in-perspective
LFA-FRR
• Provides local sub-100 ms convergence times and complements any other fast convergence tuning techniques that have been employed
• LFA-FRR is easily configured on a router with a single command and calculates everything automatically
• Easier and less complex than RSVP-based Traffic Engineering.
Prerequisite
• Need MPLS LDP configuration
• Need BFD configuration to trigger Fast Reroute
• Need some Fast Reroute configuration under the OSPF process
• Need some special configuration based on platform
mpls ldp discovery targeted-hello accept
router ospf Y
 router-id xxxxx
 ispf
 prefix-priority high route-map TE_PREFIX
 fast-reroute per-prefix enable area y prefix-priority high
 fast-reroute per-prefix remote-lfa tunnel mpls-ldp
ip prefix-list TE_PREFIX seq 5 permit a.b.c.d/32
!
route-map TE_PREFIX permit 10
 match ip address prefix-list TE_PREFIX
How it works
1. Initially the best path for the prefix 172.16.1.0/24 is B-A-B1-B3
2. Once the link between B and A fails, the previously computed LFA tunnel is triggered by BFD
3. The target prefix(es) are immediately passed through the B-D LFA tunnel
4. No packet drop is observed because router B does not wait for IGP convergence
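The pre-computation in the steps above relies on a loop-free test. A neighbor N of router S is a loop-free alternate for destination D when dist(N, D) < dist(N, S) + dist(S, D), the basic inequality from RFC 5286. The sketch below checks it on two hypothetical unit-cost rings; the node names S, E, D, X, Y, N and the topologies are illustrative and do not come from the slides.

```python
import heapq

def shortest_distances(graph, source):
    """Dijkstra: cost of the shortest path from source to every node."""
    dist = {source: 0}
    queue = [(0, source)]
    while queue:
        d, node = heapq.heappop(queue)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        for neighbor, cost in graph[node].items():
            nd = d + cost
            if nd < dist.get(neighbor, float("inf")):
                dist[neighbor] = nd
                heapq.heappush(queue, (nd, neighbor))
    return dist

def is_loop_free_alternate(graph, s, n, d):
    """True if neighbor n will not send traffic for d back through s."""
    dist_n = shortest_distances(graph, n)
    dist_s = shortest_distances(graph, s)
    return dist_n[d] < dist_n[s] + dist_s[d]

def ring(nodes):
    """Build a unit-cost ring connecting the nodes in order."""
    graph = {name: {} for name in nodes}
    for a, b in zip(nodes, nodes[1:] + nodes[:1]):
        graph[a][b] = 1
        graph[b][a] = 1
    return graph

five_ring = ring(["S", "E", "D", "X", "N"])
six_ring = ring(["S", "E", "D", "X", "Y", "N"])
print(is_loop_free_alternate(five_ring, "S", "N", "D"))
print(is_loop_free_alternate(six_ring, "S", "N", "D"))
```

In the five-node ring the directly connected neighbor qualifies. In the six-node ring it does not, because N has an equal-cost path to D back through S. Cases like this are where a remote LFA tunnel to a node further around the ring becomes useful.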
LFA-FRR Design Consideration
• In a ring topology
• Fewer prefixes make for quicker convergence
• Specific prefixes with higher priority will show the best performance, without any service interruption or packet drop.
ROBI39-DHKTL25#sh ip int brief
Loopback1           10.253.51.91  YES NVRAM up up
MPLS-Remote-Lfa124  10.10.202.69  YES unset up up
show ip cef 10.255.255.29
10.255.255.29/32
  nexthop 10.10.202.65 Vlan10 label [166|1209]
    repair: attached-nexthop 10.253.51.94 MPLS-Remote-Lfa124
Before/After LFA-FRR
Xshell:\> ping 10.252.51.111 -t
Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Reply from 10.252.51.111: bytes=32 time=4ms TTL=253
Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Request timed out.
Reply from 10.252.51.111: bytes=32 time=61ms TTL=253
Reply from 10.252.51.111: bytes=32 time=86ms TTL=253
Reply from 10.252.51.111: bytes=32 time=70ms TTL=253
Reply from 10.252.51.111: bytes=32 time=147ms TTL=253

Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Reply from 10.252.51.111: bytes=32 time=1ms TTL=253
Reply from 10.252.51.111: bytes=32 time=1ms TTL=253
Reply from 10.252.51.111: bytes=32 time=27ms TTL=253
Reply from 10.252.51.111: bytes=32 time=32ms TTL=253
Reply from 10.252.51.111: bytes=32 time=1ms TTL=253
Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Reply from 10.252.51.111: bytes=32 time=2ms TTL=253
Reply from 10.252.51.111: bytes=32 time=1ms TTL=253
BGP Fast Convergence
LFA-FRR or RSVP can improve L2-VPN and intra-AS convergence but can't do much for external prefixes learned via eBGP
BGP Fast Convergence
The BGP PIC Edge for IP and MPLS-VPN feature improves BGP convergence after a network failure.
Prerequisites
• BGP and the IP or Multiprotocol Label Switching (MPLS) network are up and running, with the customer site connected to the provider site by more than one path (multihomed).
• Ensure that the backup/alternate path has a unique next hop that is not the same as the next hop of the best path.
• Enable the Bidirectional Forwarding Detection (BFD) protocol to quickly detect link failures of directly connected neighbors.
How It Works: PE-CE Link/PE Failure
• eBGP sessions exist between the PE and CE routers.
• Traffic from CE1 uses PE1 to reach network x.x.x.x/24 towards the router CE2. CE1 has two paths: PE1 as the primary path and PE2 as the backup/alternate path.
• CE1 is configured with the BGP PIC feature. BGP computes PE1 as the best path and PE2 as the backup/alternate path and installs both routes into the RIB and CEF plane. When the CE1-PE1 link/PE goes down, CEF detects the link failure and points the forwarding object to the backup/alternate path. Traffic is quickly rerouted due to local fast convergence in CEF.
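The idea of pointing a shared forwarding object at the backup path can be pictured with a small conceptual model. This is only an illustration of prefix independence, not how CEF is implemented; the class name, the prefixes and the PE1/PE2 labels are assumptions made for the example.

```python
class PathList:
    """Forwarding object shared by every prefix learned via the same paths."""
    def __init__(self, primary, backup):
        self.primary = primary
        self.backup = backup
        self.primary_up = True

    def next_hop(self):
        return self.primary if self.primary_up else self.backup

# Hypothetical FIB: many prefixes point at ONE shared path list.
shared = PathList(primary="PE1", backup="PE2")
fib = {f"198.51.{i}.0/24": shared for i in range(100)}

before = {path.next_hop() for path in fib.values()}

# Link failure detected: a single change repairs every prefix at once.
shared.primary_up = False
after = {path.next_hop() for path in fib.values()}
print(before, after)
```

Because the switch is one update rather than one per prefix, the repair time in this model does not grow with the size of the routing table.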
How It Works: Dual CE-PE Line/Node Failure
• eBGP sessions exist between the PE and CE routers. Traffic from CE1 uses PE1 to reach network x.x.x.x/24 through router CE3.
• CE1 has two paths: PE1 as the primary path and PE2 as the backup/alternate path.
• An iBGP session exists between the CE1 and CE2 routers.
• If the CE1-PE1 link or PE1 goes down and BGP PIC is enabled on CE1, BGP recomputes the best path, removing the next hop PE1 from the RIB and reinstalling CE2 as the next hop into the RIB and Cisco Express Forwarding. CE1 automatically gets a backup/alternate repair path into Cisco Express Forwarding, and the traffic loss during forwarding is now in subseconds, thereby achieving fast convergence.
How It Works: IP MPLS PE Down
• The PE routers are VPNv4 iBGP peers with route reflectors in the MPLS network.
• Traffic from CE1 uses PE1 to reach network x.x.x.x/24 towards router CE3. CE3 is dual-homed with PE3 and PE4. PE1 has two paths to reach CE3 from the route reflectors: PE4 is the primary path with the next hop as a PE4 address.
• PE3 is the backup/alternate path with the next hop as a PE3 address.
• When PE4 goes down, PE1 learns of the removal of the host prefix from the IGP in subseconds, recomputes the best path, selects PE3 as the best path, and installs the routes into the RIB and Cisco Express Forwarding plane. Normal BGP convergence will happen while BGP PIC is redirecting the traffic towards PE3, and packets are not lost.
Configuration Template
router bgp 65000
 no synchronization
 neighbor 10.0.0.10 remote-as 65000
 neighbor 10.0.0.10 update-source Loopback0
 no auto-summary
 !
 address-family vpnv4
  bgp additional-paths install
  neighbor 10.0.0.10 activate
  neighbor 10.0.0.10 send-community both
 exit-address-family
 !
 address-family ipv4 vrf abc
  import path selection all
  neighbor 10.10.10.20 remote-as 65534
  neighbor 10.10.10.20 activate
 exit-address-family
Conclusion
IGP Fine Tuning
• 100% dynamic and simplified
• Can reach sub-second convergence time
LFA-FRR
• LFA tunnel pre-computed, pre-installed
• Prefix-independent
• Simple, deployment friendly, good scaling
• Can reach <50 ms convergence time, suitable for intra-AS and L2-VPN traffic
But
• Topology dependent
• IP FRR IGP computation is a very CPU-intensive task
BGP PIC
• Can achieve <50 ms convergence time for inter-AS and L3-VPN traffic
Thank You