67886 - Switch and Router Design - TNG Presentation | … · · 2017-08-28
Bottlenecks: Memory, memory, …
Packet Processing Examples
• Address Lookup (IP/Ethernet)
  – Where to send an incoming packet?
  – Exact Match: "Use output-port 3 to send packets to MAC address 01:23:45:67:89:ab"
  – Longest Prefix Match: "Use output-port 4 to send packets to destination network 111.15/16"
• Firewall, ACL
  – Which packets to accept or deny? "Drop all packets from evil source network 66.66/16 on ports 6-666"
  – Usually needs 5 fields: source-address, dest-address, source-port, dest-port, protocol
Packet Processing Examples
• Intrusion Detection Schemes
  – Deep Packet Inspection (DPI): "Drop all packets that contain the string EvilWorm anywhere within the packet"
  – SNORT rule set
Packet Processing Rate

Year    Line        40B packets (Mpkt/s)
1997    622 Mb/s    1.94
1999    2.5 Gb/s    7.81
2001    10 Gb/s     31.25
2003    40 Gb/s     125

1. Lookup mechanism must be simple and easy to implement
2. (Surprise?) Memory access time is the long-term bottleneck
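The packet rates in the table follow directly from the line rate and the 40-byte packet size. A minimal sketch of that arithmetic (function names are illustrative, not from the slides):

```python
PACKET_BITS = 40 * 8  # minimum-size 40-byte packet, as in the table


def mpkts_per_second(line_rate_bps, packet_bits=PACKET_BITS):
    """Millions of packets per second at a given line rate."""
    return line_rate_bps / packet_bits / 1e6


def ns_per_packet(line_rate_bps, packet_bits=PACKET_BITS):
    """Time budget per packet, in nanoseconds."""
    return packet_bits / line_rate_bps * 1e9


for label, rate in [("622 Mb/s", 622e6), ("2.5 Gb/s", 2.5e9),
                    ("10 Gb/s", 10e9), ("40 Gb/s", 40e9)]:
    print(f"{label:>9}: {mpkts_per_second(rate):7.2f} Mpkt/s, "
          f"{ns_per_packet(rate):6.1f} ns per packet")
```

The per-packet time is the budget that every lookup, including all of its memory accesses, has to fit into.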
Memory Technology (2003-04)

Technology        Single chip density   $/chip ($/MByte)         Access speed   Watts/chip
Networking DRAM   64 MB                 $30-$50 ($0.50-$0.75)    40-80 ns       0.5-2 W
SRAM              4 MB                  $20-$30 ($5-$8)          4-8 ns         1-3 W
TCAM              1 MB                  $200-$250 ($200-$250)    4-8 ns         15-30 W

Note: Price, speed and power are manufacturer and market dependent. The numbers are a bit outdated but give the general idea.
Simplest Task: Exact Matching
• Mostly in bridges
  – Bridges work in layer 2 (Ethernet)
  – A bridge connects two Ethernet networks
  – Wire-speed forwarding:
    • Each time a packet arrives at a bridge, forward it according to the destination MAC address
    • Also store/update the source MAC address (learning)
    • Should be done at wire speed
[Figure: a bridge with attached hosts a, b, c and d]
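The forward-and-learn behaviour can be sketched in a few lines. This is a minimal model, assuming a Python dict as the exact-match table and flooding (returned as `None`) for unknown destinations; the class and method names are hypothetical:

```python
class LearningBridge:
    """Toy layer-2 bridge: exact match on destination MAC, learn on source MAC."""

    def __init__(self):
        self.table = {}  # MAC address -> port on which it was last seen

    def handle(self, src_mac, dst_mac, in_port):
        """Learn the sender's port, then return the output port (None = flood)."""
        self.table[src_mac] = in_port      # learning: store/update the source address
        return self.table.get(dst_mac)     # exact-match lookup on the destination


bridge = LearningBridge()
first = bridge.handle("aa:aa:aa:aa:aa:aa", "bb:bb:bb:bb:bb:bb", in_port=1)
reply = bridge.handle("bb:bb:bb:bb:bb:bb", "aa:aa:aa:aa:aa:aa", in_port=2)
```

The first frame is flooded because the destination is still unknown; the reply is sent to port 1 because the first frame taught the bridge where the sender lives.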
Solution 1: Binary Search
• MAC addresses have values which can be sorted
• Thus, when keeping them sorted, one can perform a binary search on the array and find the right MAC address
• However, each iteration is a memory access → log N memory accesses → works fine (even using DRAM) for low speeds and small N (around 10 Mb/s, 8K values) but doesn't scale to larger N / higher speeds (not even for 100 Mb/s, 64K values)
• Using faster hardware (SRAM) won't really solve the problem (and it is more expensive…)
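To make the cost concrete, here is a sketch of binary search over a sorted MAC table that counts one memory access per probe. MAC addresses are modelled as 48-bit integers and the table contents are made up for illustration:

```python
def binary_search_lookup(sorted_macs, ports, target):
    """Return (port or None, number of table reads) for a sorted MAC table."""
    lo, hi, reads = 0, len(sorted_macs) - 1, 0
    while lo <= hi:
        mid = (lo + hi) // 2
        reads += 1                       # each probe is one memory access
        if sorted_macs[mid] == target:
            return ports[mid], reads
        if sorted_macs[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return None, reads


# Illustrative table with 8K entries (values are arbitrary).
macs = [0x0123456789AB + 7 * i for i in range(8192)]
ports = [i % 4 for i in range(8192)]
port, reads = binary_search_lookup(macs, ports, macs[5000])
```

The number of reads grows with log2 N, so every doubling of the table adds another dependent memory access to the per-packet budget.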
Scaling using Hashing
• Hashing is much faster than binary search on average, but much slower in the worst case (up to linear time…)
• However, one can choose (pre-compute) good hash functions, so the number of collisions is small and bounded
  – Precomputation takes a lot of time, but addresses are not added at a rapid rate
  – Applying the hash function is done at wire speed
• More sophisticated data structures / hashing techniques can also be applied (e.g. to reduce memory)
  – Bloom filters, fingerprinting, etc.
Example (Gigaswitch, 1994)
• N = 64K; binary search takes 16 memory accesses
• For each 48-bit address addr, we first apply h(addr) to get a 48-bit value:
  – The 16 LSBs are the hash-table entry index (64K entries)
  – Each entry is a balanced binary tree of height at most 3, sorted by the remaining 32 MSBs
  – The hash function should guarantee that no more than 8 addresses are in the same tree, and that we can disambiguate between addresses using the 32 MSBs
• Solve corner cases separately (CAM); rehashing
  – 4 memory accesses
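The scheme can be sketched as follows. This is not the Gigaswitch hash itself: the invertible mixing function `mix` is a hypothetical stand-in for h, a sorted list stands in for the small balanced tree, and the table is simply rebuilt with a new constant whenever some bucket exceeds 8 entries. Input addresses are assumed to be distinct.

```python
import random
from bisect import bisect_left

MASK48 = (1 << 48) - 1
INDEX_BITS = 16                      # 16 LSBs select one of 64K entries
INDEX_MASK = (1 << INDEX_BITS) - 1
MAX_BUCKET = 8                       # bound on addresses per entry


def mix(addr, a):
    """Invertible 48-bit mixing (assumed stand-in for h); a must be odd."""
    x = addr & MASK48
    x ^= x >> 24                     # invertible: the top 24 bits are unchanged
    x = (x * a) & MASK48             # invertible: odd multiplier modulo 2**48
    x ^= x >> 24
    return x


def build(addresses, seed=0):
    """Search for a constant whose buckets all respect MAX_BUCKET (rehash otherwise)."""
    rng = random.Random(seed)
    while True:
        a = rng.getrandbits(48) | 1
        buckets = {}
        for addr in addresses:
            h = mix(addr, a)
            buckets.setdefault(h & INDEX_MASK, []).append(h >> INDEX_BITS)
        if all(len(b) <= MAX_BUCKET for b in buckets.values()):
            for b in buckets.values():
                b.sort()             # sorted by the remaining 32 MSBs
            return a, buckets


def contains(table, addr):
    """One hash-table access, then a search among at most 8 sorted 32-bit tags."""
    a, buckets = table
    h = mix(addr, a)
    bucket = buckets.get(h & INDEX_MASK, [])
    tag = h >> INDEX_BITS
    i = bisect_left(bucket, tag)
    return i < len(bucket) and bucket[i] == tag
```

Because `mix` is a bijection on 48-bit values, the pair (index, 32-bit tag) identifies an address uniquely, which is what allows disambiguation using only the 32 MSBs.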
IP longest prefix matching

[Figure: an incoming packet with Destination = 12.5.9.16 and a payload is looked up in the IP forwarding table]

IP Forwarding Table
Prefix         Next Hop       Interface        Match for 12.5.9.16
0.0.0.0/0      10.14.11.33    Output-port 1    OK
12.0.0.0/8     10.14.22.19    Output-port 2    better
12.4.0.0/15    10.1.3.77      Output-port 3    even better
12.5.8.0/23    10.1.3.23      Output-port 4    best!
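As a reference point, longest prefix matching can be written as a plain scan over all entries using Python's standard `ipaddress` module. The table rows below are as read from the slide's forwarding table; the scan is only a correctness baseline, not a wire-speed method:

```python
import ipaddress

# (prefix, next hop, interface), as read from the forwarding table on the slide.
FORWARDING_TABLE = [
    ("0.0.0.0/0", "10.14.11.33", "Output-port 1"),
    ("12.0.0.0/8", "10.14.22.19", "Output-port 2"),
    ("12.4.0.0/15", "10.1.3.77", "Output-port 3"),
    ("12.5.8.0/23", "10.1.3.23", "Output-port 4"),
]


def longest_prefix_match(destination, table=FORWARDING_TABLE):
    """Return (network, next hop, interface) of the longest matching prefix, or None."""
    addr = ipaddress.ip_address(destination)
    best = None
    for prefix, next_hop, port in table:
        net = ipaddress.ip_network(prefix)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, next_hop, port)
    return best


result = longest_prefix_match("12.5.9.16")
```

For 12.5.9.16 all four entries match, and the /23 entry wins because it is the most specific.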
Longest Prefix Match is Harder than Exact Match
• The destination address of an arriving packet does not carry with it the information to determine the length of the longest matching prefix
• Hence, one needs to search among the space of all prefix lengths, as well as the space of all prefixes of a given length
Current Practical Data
• Caching works poorly in backbone routers
  – 250,000 concurrent flows
• Wire-speed lookup needed for 40-byte packets
  – 50% are TCP acks
  – 32 nsec/packet at 10 Gb/s and 8 nsec/packet at 40 Gb/s
• Lookup dominated by memory accesses → speed is measured by memory accesses
• Prefix length 8-32
• Today 150,000 prefixes → with growth, 1 million prefixes
• Higher speeds need SRAM → worth minimizing memory
Problem Definition

Routing table:
192.2.0/22, R2
192.2.2/24, R3
200.11.0/22, R4

[Figure: address line showing the ranges 192.2.0/22, 192.2.2/24 (nested inside 192.2.0/22) and 200.11.0/22, with example destinations 192.2.0.1, 192.2.2.100 and 200.11.0.33]

LPM: Find the most specific route, or the longest matching prefix among all the prefixes matching the destination address of an incoming packet
LPM in IPv4
Use 32 exact match algorithms for LPM!
[Figure: the network address is matched in parallel by exact match against prefixes of length 1, of length 2, …, of length 32; a priority encoder picks among the matches and outputs the port]
We can start with prefix length 8
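The idea of one exact-match structure per prefix length can be sketched with one Python dict per length, probed from the longest length down so that the first hit plays the role of the priority encoder. The routes are the three from the Problem Definition slide, written as full dotted quads because `ipaddress` does not accept the shorthand 192.2.0/22; function names are illustrative:

```python
import ipaddress


def build_tables(routes):
    """Map prefix length -> {prefix bits as int -> next hop}."""
    tables = {}
    for cidr, hop in routes:
        net = ipaddress.ip_network(cidr)
        key = int(net.network_address) >> (32 - net.prefixlen)
        tables.setdefault(net.prefixlen, {})[key] = hop
    return tables


def lookup(tables, address):
    """Probe the exact-match table of each length, longest first."""
    addr = int(ipaddress.ip_address(address))
    for length in sorted(tables, reverse=True):
        hop = tables[length].get(addr >> (32 - length))
        if hop is not None:
            return hop
    return None


tables = build_tables([("192.2.0.0/22", "R2"),
                       ("192.2.2.0/24", "R3"),
                       ("200.11.0.0/22", "R4")])
```

In software the probes are sequential, so the worst case costs one exact-match lookup per distinct prefix length; the hardware picture on the slide performs them in parallel.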
Metrics for Lookup Algorithms
• Speed (= number of memory accesses)
• Storage requirements (= amount of memory)
• Low update time
• Scalability
  – With length of prefix: IPv4 unicast (32b), Ethernet (48b), IPv4 multicast (64b), IPv6 unicast (128b)
  – With size of routing table: (sweet spot for today's designs = 1 million)
• Flexibility in implementation
• Low preprocessing time
Our Toy Example
P1 = 101*
P2 = 111*
P3 = 11001*
P4 = 1*
P5 = 0*
P6 = 1000*
P7 = 100000*
P8 = 100*
P9 = 110*
Packet: 128.0.0.1 → 100..001 → P4, P6, P7, P8 → Forward to P7
Unibit (= Radix) Tries
Prefixes: P1 = 101*, P2 = 111*, P3 = 11001*, P4 = 1*, P5 = 0*, P6 = 1000*, P7 = 100000*, P8 = 100*, P9 = 110*
[Figure: trie node layout with a 0-pointer, a 1-pointer and an optional prefix]
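Unibit Tries
Prefixes: P1 = 101*, P2 = 111*, P3 = 11001*, P4 = 1*, P5 = 0*, P6 = 1000*, P7 = 100000*, P8 = 100*, P9 = 110*
[Figure: unibit trie built from P1-P9; each edge is labeled 0 or 1 and each prefix is stored at the node reached by its bits]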
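Compacting One-Way Branches (variant of PATRICIA tree)
Prefixes: P1 = 101*, P2 = 111*, P3 = 11001*, P4 = 1*, P5 = 0*, P6 = 1000*, P7 = 100000*, P8 = 100*, P9 = 110*
[Figure: the same trie with one-way branches compacted into single edges, e.g. the edge 01 from P9 to P3 and the edge 00 from P6 to P7]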
Unibit Tries – Running Example
Prefixes: P1 = 101*, P2 = 111*, P3 = 11001*, P4 = 1*, P5 = 0*, P6 = 1000*, P7 = 100000*, P8 = 100*, P9 = 110*
[Figure: the compacted trie from the previous slide, traversed one node per slide]
Input: 1001
Step 1: Memory: null
Step 2: Memory: P4
Step 3: Memory: P4
Step 4: Memory: P8 (replacing P4)
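The walk in the running example can be reproduced with a small unibit trie. This is a minimal sketch without path compaction; the `Node` class and function names are illustrative, and `best` plays the role of the "Memory" register on the slides:

```python
class Node:
    """Unibit trie node: a 0-pointer, a 1-pointer and an optional prefix."""
    __slots__ = ("children", "prefix")

    def __init__(self):
        self.children = [None, None]
        self.prefix = None


def insert(root, bits, name):
    """Insert a prefix given as a bit string without the trailing '*'."""
    node = root
    for b in bits:
        i = int(b)
        if node.children[i] is None:
            node.children[i] = Node()
        node = node.children[i]
    node.prefix = name


def lookup(root, bits):
    """Follow the address bits, remembering the last prefix seen on the path."""
    node, best = root, root.prefix
    for b in bits:
        node = node.children[int(b)]
        if node is None:
            break
        if node.prefix is not None:
            best = node.prefix
    return best


TOY = {"P1": "101", "P2": "111", "P3": "11001", "P4": "1", "P5": "0",
       "P6": "1000", "P7": "100000", "P8": "100", "P9": "110"}
root = Node()
for name, bits in TOY.items():
    insert(root, bits, name)
```

For the input 1001 the remembered prefix goes from nothing to P4 and then to P8, and the walk stops because node 100 has no 1-child.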
Unibit Tries - Analysis
• W-bit prefixes, N prefixes: O(W) lookup, O(NW) storage and O(W) update complexity
• Patricia: O(N) storage (why?)
• Still slow, high memory, but:
  – Simple
  – Extensible to wider fields
Multi-bit Tries
Binary trie: Depth = W, Degree = 2, Stride = 1 bit
Multi-ary trie: Depth = W/k, Degree = 2^k, Stride = k bits
Principle: Trade Memory for Speed
Prefix Expansion with Multi-bit Tries
If stride = k bits, prefix lengths that are not a multiple of k need to be expanded

E.g., k = 2:
Prefix    Expanded prefixes
0*        00*, 01*
11*       11*

Maximum number of expanded prefixes corresponding to one non-expanded prefix = 2^(k-1)
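Expansion itself is mechanical: pad the prefix with every bit combination needed to reach the next multiple of k. A small sketch, with prefixes given as bit strings without the trailing `*` (the function name is illustrative):

```python
def expand(prefix, k):
    """Expand a binary prefix to the next length that is a multiple of k."""
    pad = (-len(prefix)) % k          # number of bits missing to the next multiple of k
    if pad == 0:
        return [prefix]               # already aligned, nothing to expand
    return [prefix + format(i, f"0{pad}b") for i in range(2 ** pad)]
```

For example, `expand("0", 2)` yields `["00", "01"]` while `expand("11", 2)` stays `["11"]`. The worst case is a prefix one bit past a multiple of k, which needs k-1 padding bits and therefore produces 2^(k-1) expanded prefixes.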
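Quadrary-Trie (k=2)
Prefixes: P1 = 101*, P2 = 111*, P3 = 11001*, P4 = 1*, P5 = 0*, P6 = 1000*, P7 = 100000*, P8 = 100*, P9 = 110*
[Figure: quadrary trie (stride k = 2) for P1-P9; each node has branches 00, 01, 10 and 11, and expanded prefixes appear as pairs P1a/P1b, P2a/P2b, P3a/P3b, P4a/P4b, P5a/P5b and P9a/P9b alongside P6, P7 and P8]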
Prefix Expansion Increases Storage Consumption
• Replication of next-hop ptr
• Greater number of unused (null) pointers in a node
• Improvement: From Fixed-Stride Tries to Variable-Stride Tries
Time ~ W/k
Storage ~ NW/k * 2^(k-1)
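A fixed-stride trie with expansion done at insertion time can be sketched as below. It is a simplified model with assumed names: each node holds 2^k slots, a slot remembers the longest original prefix expanded into it, and lookup consumes k bits per node visit, which is where the W/k time comes from.

```python
class Node:
    """Fixed-stride trie node with 2**k slots."""

    def __init__(self, k):
        self.best = [None] * (1 << k)    # (original prefix length, name) per slot
        self.child = [None] * (1 << k)


def insert(root, k, bits, name):
    node, pos = root, 0
    while len(bits) - pos > k:                   # descend through whole k-bit chunks
        idx = int(bits[pos:pos + k], 2)
        if node.child[idx] is None:
            node.child[idx] = Node(k)
        node, pos = node.child[idx], pos + k
    rest = bits[pos:]                            # the last 0..k bits of the prefix
    free = k - len(rest)
    base = (int(rest, 2) if rest else 0) << free
    for i in range(1 << free):                   # prefix expansion into 2**free slots
        slot = base + i
        if node.best[slot] is None or node.best[slot][0] < len(bits):
            node.best[slot] = (len(bits), name)  # the longer original prefix wins


def lookup(root, k, bits):
    """One node visit per k address bits; trailing bits short of a full chunk are ignored."""
    node, best = root, None
    for pos in range(0, len(bits) - k + 1, k):
        idx = int(bits[pos:pos + k], 2)
        if node.best[idx] is not None:
            best = node.best[idx][1]
        node = node.child[idx]
        if node is None:
            break
    return best


TOY = {"P1": "101", "P2": "111", "P3": "11001", "P4": "1", "P5": "0",
       "P6": "1000", "P7": "100000", "P8": "100", "P9": "110"}
root = Node(2)
for name, bits in TOY.items():
    insert(root, 2, bits, name)
```

With k = 2, P8 = 100* expands to 1000 and 1001, but the slot for 1000 is kept by the longer prefix P6, so P8 survives only at 1001. The replicated slots and the empty child pointers are the storage cost listed above.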
Ternary Content-Addressable Memory (TCAM)
[Figure: a search key is compared against all rows (0-9) of the TCAM array in parallel; the match lines feed an encoder that outputs the matching entry]
Each entry is a word in {0,1,*}^W and represents a rule
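A software model helps to see what the hardware computes. In this sketch each rule is a string over 0, 1 and *, the loop stands in for the parallel comparison, and returning the first matching row stands in for the encoder. Ordering the rows by decreasing prefix length, so that the first match is the longest prefix, is an assumption of this sketch; the 8-bit width and the use of the toy prefixes are chosen for illustration.

```python
def matches(rule, key):
    """A ternary rule matches when every symbol is '*' or equals the key bit."""
    return len(rule) == len(key) and all(r in ("*", k) for r, k in zip(rule, key))


def tcam_search(rules, key):
    """Return the index of the first matching row, or None if no row matches."""
    for i, rule in enumerate(rules):     # hardware compares all rows at once
        if matches(rule, key):
            return i
    return None


WIDTH = 8
TOY = {"P1": "101", "P2": "111", "P3": "11001", "P4": "1", "P5": "0",
       "P6": "1000", "P7": "100000", "P8": "100", "P9": "110"}
ordered = sorted(TOY.items(), key=lambda item: -len(item[1]))   # longest prefix first
names = [name for name, _ in ordered]
rules = [bits + "*" * (WIDTH - len(bits)) for _, bits in ordered]
```

With this ordering, `names[tcam_search(rules, "10010000")]` is P8, the same answer the trie walk gives for an address starting with 1001.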
Example
[Figure: a TCAM array of ternary rules (rows 0-9), ending with an all-* entry that matches any key, searched with the key 0011101010101001110001110001110; the match lines feed the encoder]
TCAM Benefits and Disadvantages
• Deterministic search throughput: O(1) search
• Very flexible; applies to other problems as well
  – Next week: multi-field packet classification
• However, relatively costly and energy-consuming
  – $150 for a small (4 Mbit) TCAM
  – Energy depends on the number of entries
• ~10 million TCAM devices already deployed
Typical Dimensions and Speed
• 100K-200K rules
• 100-150 symbols per rule
• 133 million searches per second for 144-bit keys
  – Suitable even for 40 Gb/s traffic
• IPv4 and IPv6 lookups are trivial with TCAM
• Extra symbols are left in each entry, which can be used to optimize TCAM performance