Dynamic Algorithms with Worst-case Performance for Packet Classification
Pankaj Gupta and Nick McKeown
Stanford University
{pankaj, nickm}@stanford.edu
May 17, 2000
Outline
Motivation for packet classification
Problem definition
Previous work
Goals of this work
Algorithm 1: Heap-on-trie (HoT)
Algorithm 2: Binarysearchtree-on-trie (BoT)
Conclusions and open problems
Need for differentiated services
[Figure: network diagram showing ISP1, ISP2, ISP3, a NAP, E1 and E2, with interfaces X, Y and Z]
Service: Example
Traffic Shaping: Ensure that ISP3 does not inject more than 50Mbps of total traffic on interface X, of which no more than 10Mbps is email traffic
Packet Filtering: Deny all traffic from ISP2 (on interface X) destined to E2
Policy Routing: Send all voice-over-IP traffic arriving from E1 (on interface Y) and destined to E2 via a separate ATM network
Multi-field Packet Classification
Given a classifier with N rules, find the action associated with the highest priority rule matching an incoming packet.
Example: A packet (152.168.3.32, 152.163.171.71, …, TCP) would have action A_{N-1} applied to it.
Rule      Field 1            Field 2           …  Field k  Action
Rule N    152.163.190.69/21  152.163.80.11/32  …  UDP      A_N
Rule N-1  152.168.3.0/24     152.163.0.0/16    …  TCP      A_{N-1}
…         …                  …                 …  …        …
Rule 1    152.168.0.0/16     152.0.0.0/8       …  ANY      A_1
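The problem statement can be made concrete with a minimal linear-search sketch. This is only the baseline that the later slides improve on, not one of the proposed algorithms. It assumes that a larger rule number means higher priority, uses only the two address fields and the protocol field (the elided fields are left out), and the names `Rule`, `net`, `classify` and the value of `N` are hypothetical.

```python
import ipaddress
from typing import NamedTuple, Optional


class Rule(NamedTuple):
    priority: int                  # assumption: larger value = higher priority
    src: ipaddress.IPv4Network     # Field 1
    dst: ipaddress.IPv4Network     # Field 2
    proto: str                     # Field k: "TCP", "UDP" or "ANY"
    action: str


def net(prefix: str) -> ipaddress.IPv4Network:
    # strict=False masks off host bits, so 152.163.190.69/21 is accepted.
    return ipaddress.ip_network(prefix, strict=False)


def classify(rules, src: str, dst: str, proto: str) -> Optional[str]:
    """Scan every rule, O(N) per packet, and keep the best match."""
    s, d = ipaddress.ip_address(src), ipaddress.ip_address(dst)
    best = None
    for r in rules:
        if s in r.src and d in r.dst and r.proto in ("ANY", proto):
            if best is None or r.priority > best.priority:
                best = r
    return best.action if best else None


N = 100  # hypothetical classifier size, only used to number the rules
RULES = [
    Rule(N, net("152.163.190.69/21"), net("152.163.80.11/32"), "UDP", "A_N"),
    Rule(N - 1, net("152.168.3.0/24"), net("152.163.0.0/16"), "TCP", "A_{N-1}"),
    Rule(1, net("152.168.0.0/16"), net("152.0.0.0/8"), "ANY", "A_1"),
]

print(classify(RULES, "152.168.3.32", "152.163.171.71", "TCP"))
```

The example packet matches both Rule N-1 and Rule 1, and the higher-priority Rule N-1 decides the action.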
Performance Metrics of a Classification Algorithm
Data structure storage requirements
Packet classification time (query time)
Incremental update time
Undesirable values of Performance Metrics
Data structure storage requirements: O(N^d)
Packet classification time (query time): O(N)
Incremental update time: O(N)
N = 16-64K, d = 2-6
Previous Work
Scheme: Pros / Cons
Linear Search: Good storage and update time characteristics / High classification time
Grid-of-tries (Srinivasan et al. [Sigcomm98]): Fast 2-D classification / No incremental updates, does not extend to multiple dimensions
Bitmap intersection (Stiliadis et al. [Sigcomm98]): Fast classification for multiple dimensions / No incremental updates, large amount of memory storage and bandwidth
Tuple Space Search (Suri et al. [Sigcomm99]): Good storage characteristics / Non-deterministic search and update time
Previous Work (contd.)
Crossproducting (Srinivasan et al. [Sigcomm98])
Recursive Flow Classification (Gupta et al. [Sigcomm99])
FIS tree (Muthukrishnan et al. [Infocom00])
Please see paper for more references
All of this work “sacrifices” one metric (among update time, query time or storage) for another, OR achieves good performance on the basis of observed structure of classifiers
Goal of this work
Algorithms that simultaneously:
Achieve good update time, storage and query time characteristics
Do not rely on the structure of classifiers for good performance, i.e., work in worst case
Extend to generic multi-dimensional classifiers
Bounds: known
O(W) (32), O(log N) (16)
Algorithm                                      Query           Storage       Update
Linear search                                  O(N)            O(N)          O(log N)
Crossproducting/recursive flow classification  O(log N)        O(N^d)        -
Grid-of-tries                                  O(log^(d-1) N)  O(N log^d N)  -
Bounds: obtained
Algorithm                       Query           Storage       Update
Heap-on-Trie (HoT)              O(log^d N)      O(N log^d N)  O(log^(d+1) N)
Binarysearchtree-on-trie (BoT)  O(log^(d+1) N)  O(N log^d N)  O(log^d N)
(bounds written with log N in place of W)
Heap-on-trie (HoT)
One-dimensional generic 4-bit classifier
Rule  Range   Maximal Prefixes
R5    [3,11]  0011, 01**, 10**
R4    [2,7]   001*, 01**
R3    [4,11]  01**, 10**
R2    [4,7]   01**
R1    [1,14]  0001, 001*, 01**, 10**, 110*, 1110
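The maximal prefixes in the table can be reproduced with a short sketch that repeatedly peels the largest aligned power-of-two block off the left end of the range. The function name `range_to_prefixes` is hypothetical, and this is one common way to do the split rather than necessarily the authors' procedure.

```python
def range_to_prefixes(lo, hi, width):
    """Split the integer range [lo, hi] into maximal prefixes such as '01**'."""
    prefixes = []
    while lo <= hi:
        # Grow the block while it stays aligned at lo and inside [lo, hi].
        size = 1
        while (size * 2 <= (1 << width)
               and lo % (size * 2) == 0
               and lo + size * 2 - 1 <= hi):
            size *= 2
        free_bits = size.bit_length() - 1      # number of wildcard bits
        fixed_len = width - free_bits
        fixed = format(lo >> free_bits, "b").zfill(fixed_len) if fixed_len else ""
        prefixes.append(fixed + "*" * free_bits)
        lo += size
    return prefixes


for name, (lo, hi) in {"R5": (3, 11), "R4": (2, 7), "R3": (4, 11),
                       "R2": (4, 7), "R1": (1, 14)}.items():
    print(name, range_to_prefixes(lo, hi, 4))
```

For a W-bit field the loop emits at most 2W - 2 prefixes, which is why one range lands on at most about 2W trie nodes.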
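[Figure: 4-bit trie with edges labeled 0 and 1. Each rule is stored at the nodes of its maximal prefixes: 0001 {R1}, 001* {R4, R1}, 0011 {R5}, 01** {R5,R4,R3,R2,R1}, 10** {R5,R3,R1}, 110* {R1}, 1110 {R1}]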
Heap-on-Trie
Search 0011: the path from the root passes 001* {R4, R1} and ends at 0011 {R5}
Update R4: touches the nodes 001* {R4, R1} and 01** {R5,R4,R3,R2,R1}
Search: O(W)
Update: O(W log N)
Storage: O(NW)
Static case: pre-compute highest priority rule
Dynamic case: need to keep the prefixes around
One range is allocated to at most 2W nodes
Store rules at each trie node in a separate heap
[Figure: heap holding the priorities 5, 4, 3, 2, 1 of the rules {R5,R4,R3,R2,R1} stored at one trie node]
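A one-dimensional heap-on-trie can be sketched as follows, under several stated assumptions: a larger priority value wins, a dictionary keyed by prefix bits stands in for the trie nodes, and deletion is done lazily with a tombstone set because Python's `heapq` has no arbitrary-delete operation (so rule names must not be reused). The class and method names are hypothetical, and this is an illustration of the idea rather than the authors' implementation.

```python
import heapq


class HoT:
    """One-dimensional heap-on-trie sketch."""

    def __init__(self, width):
        self.width = width
        self.heaps = {}       # prefix bits, e.g. "01" -> heap of (-priority, name)
        self.deleted = set()  # lazily deleted rule names (simplification)

    def _prefixes(self, lo, hi):
        # Maximal prefixes covering [lo, hi], returned as fixed-bit strings.
        out = []
        while lo <= hi:
            size = 1
            while lo % (2 * size) == 0 and lo + 2 * size - 1 <= hi:
                size *= 2
            free = size.bit_length() - 1
            bits = format(lo, "0{}b".format(self.width))
            out.append(bits[: self.width - free])
            lo += size
        return out

    def insert(self, name, priority, lo, hi):
        # At most about 2W nodes, one O(log N) heap push each.
        for p in self._prefixes(lo, hi):
            heapq.heappush(self.heaps.setdefault(p, []), (-priority, name))

    def delete(self, name):
        self.deleted.add(name)

    def search(self, key):
        # Walk the W+1 nodes on the path of key and compare the heap tops.
        bits = format(key, "0{}b".format(self.width))
        best = None
        for depth in range(self.width + 1):
            heap = self.heaps.get(bits[:depth])
            while heap and heap[0][1] in self.deleted:
                heapq.heappop(heap)   # discard tombstoned entries
            if heap and (best is None or heap[0] < best):
                best = heap[0]
        return best[1] if best else None


hot = HoT(4)
for name, prio, lo, hi in [("R5", 5, 3, 11), ("R4", 4, 2, 7), ("R3", 3, 4, 11),
                           ("R2", 2, 4, 7), ("R1", 1, 1, 14)]:
    hot.insert(name, prio, lo, hi)
print(hot.search(0b0011))
```

A heap with a position index per rule would give true O(W log N) deletion in the worst case; the tombstone set here only keeps the sketch short.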
HoT: multiple dimensions
[Figure: a Dimension-1 trie, Dimension-2 tries, and heaps. The Dimension-1 node 01* points to a Dimension-2 structure holding all rules that have 01* as first dimension]
HoT: Bounds achieved
Algorithm       Query       Storage       Update
One dimension   O(log N)    O(N log N)    O(log^2 N)
d dimensions    O(log^d N)  O(N log^d N)  O(log^(d+1) N)
(bounds written with log N in place of W)
Binarysearchtree-on-trie (BoT)
Basic Idea:
Assign each range to only one trie node instead of 2W trie nodes
Improves update time at the cost of search time
Assigning ranges to trie nodes: Definitions
Range represented by a prefix at trie node v:
St(v): the prefix followed by 00..0
Pt(v): the prefix followed by 10..0 (D-point)
En(v): the prefix followed by 11..1
Example: 4-bit prefix = 01**
St(v) = 0100 = 4
En(v) = 0111 = 7
Pt(v) = 0110 = 6
[Figure: 4-bit trie with each node labeled by its d-point. Level 0: 8. Level 1: 4, 12. Level 2: 2, 6, 10, 14. Level 3: 1, 3, 5, 7, 9, 11, 13, 15]
*: d-point = 1000 = 8
0*: d-point = 0100 = 4
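The three definitions reduce to a few bit operations. The sketch below uses a hypothetical helper `st_en_pt` that takes the fixed bits of a prefix as a string. For a full-length prefix, which has no free bits, it returns the single value as the d-point; that is an assumption, since the slides only show d-points for shorter prefixes.

```python
def st_en_pt(prefix_bits, width):
    """Return (St, En, Pt) for the prefix whose fixed bits are prefix_bits."""
    free = width - len(prefix_bits)
    base = (int(prefix_bits, 2) << free) if prefix_bits else 0
    st = base                       # prefix followed by 00..0
    en = base + (1 << free) - 1     # prefix followed by 11..1
    # prefix followed by 10..0; assumption: a full-length prefix returns itself
    pt = base + (1 << (free - 1)) if free > 0 else base
    return st, en, pt


print(st_en_pt("01", 4))   # the 01** example: St = 4, En = 7, Pt = 6
print(st_en_pt("", 4))     # the root *
print(st_en_pt("0", 4))    # 0*
```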
Assigning ranges to nodes
Assign range H to node v, if v satisfies both of the following:
1. H contains Pt(v)
2. H does not contain Pt(parent(v))
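One way to find the node that satisfies both conditions is to walk down from the root and stop at the first d-point that falls inside H. The sketch below reads the rule that way, which is an interpretation of the slide rather than a quote from it; `assign_node` is a hypothetical name, and letting a range that misses every d-point fall through to a full-length leaf is also an assumption.

```python
def assign_node(lo, hi, width):
    """Return the prefix bits of the trie node that owns the range [lo, hi]."""
    prefix = ""
    while len(prefix) < width:
        free = width - len(prefix)
        base = (int(prefix, 2) << free) if prefix else 0
        pt = base + (1 << (free - 1))      # d-point of the current node
        if lo <= pt <= hi:
            return prefix                  # first d-point inside H: stop here
        # H lies entirely on one side of the d-point, so descend that way.
        prefix += "0" if hi < pt else "1"
    return prefix                          # assumption: fall through to a leaf


for name, (lo, hi) in {"R5": (3, 11), "R4": (2, 7), "R3": (4, 11),
                       "R2": (4, 7), "R1": (1, 14)}.items():
    print(name, repr(assign_node(lo, hi, 4)))
```

With the 4-bit example ranges, R5, R3 and R1 stop at the root (d-point 8) and R4 and R2 stop at 0* (d-point 4).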
Assigning ranges to nodes: example
Ranges: R5 [3,11], R4 [2,7], R3 [4,11], R2 [4,7], R1 [1,14]
Node * (d-point 8): {R5, R3, R1}
Node 0* (d-point 4): {R4, R2}
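[Figure: ranges of rules R1, R2, R3, R4 stored at node v (*), drawn on the number line from St(v) to En(v) around Pt(v), with a query point Q and positions labeled a, b, c, d]
Store left endpoint coordinates in BBSTleft(v)
Store right endpoint coordinates in BBSTright(v)
If Q < Pt(v), find the highest priority of all ranges whose left endpoints are less than or equal to Q.
If Q > Pt(v), symmetrically find the highest priority of all ranges whose right endpoints are greater than or equal to Q.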
BoT: data structure and algorithms
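Binarysearchtree-on-trie
Search: O(W log N)
Update: O(log N)
Storage: O(NW)
Multi-dimension: similar extension as HoT
[Figure: trie nodes with d-point 8 holding {R5, R3, R1} and d-point 4 holding {R4, R2}, each with a BBSTleft and a BBSTright; example operations "Search 5" and "Update"]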
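The search logic can be sketched in one dimension as below. Two simplifications are assumptions of this sketch and not part of the slides: sorted Python lists maintained with `bisect.insort` stand in for the balanced binary search trees, so inserts here cost O(N) rather than O(log N), and the highest priority among the qualifying ranges is found by a plain scan instead of an augmented-tree query. A larger priority value is assumed to win, and all names are hypothetical.

```python
import bisect


class BoT:
    """One-dimensional binarysearchtree-on-trie sketch."""

    def __init__(self, width):
        self.width = width
        self.left = {}    # node prefix -> sorted [(left endpoint, priority, name)]
        self.right = {}   # node prefix -> sorted [(right endpoint, priority, name)]

    def _pt(self, prefix):
        free = self.width - len(prefix)
        base = (int(prefix, 2) << free) if prefix else 0
        return base + (1 << (free - 1)) if free else base

    def insert(self, name, priority, lo, hi):
        # Descend to the first node whose d-point lies inside [lo, hi].
        prefix = ""
        while len(prefix) < self.width and not lo <= self._pt(prefix) <= hi:
            prefix += "0" if hi < self._pt(prefix) else "1"
        bisect.insort(self.left.setdefault(prefix, []), (lo, priority, name))
        bisect.insort(self.right.setdefault(prefix, []), (hi, priority, name))

    def search(self, q):
        bits = format(q, "0{}b".format(self.width))
        best = None
        for depth in range(self.width + 1):
            prefix = bits[:depth]
            if q <= self._pt(prefix):
                # Every range here contains Pt(v), so left endpoint <= q suffices.
                lst = self.left.get(prefix, [])
                cands = lst[: bisect.bisect_left(lst, (q + 1,))]
            else:
                # Symmetric case: right endpoint >= q suffices.
                lst = self.right.get(prefix, [])
                cands = lst[bisect.bisect_left(lst, (q,)):]
            for _, prio, name in cands:
                if best is None or prio > best[0]:
                    best = (prio, name)
        return best[1] if best else None


bot = BoT(4)
for name, prio, lo, hi in [("R5", 5, 3, 11), ("R4", 4, 2, 7), ("R3", 3, 4, 11),
                           ("R2", 2, 4, 7), ("R1", 1, 1, 14)]:
    bot.insert(name, prio, lo, hi)
print(bot.search(5))
```

Searching for 5 consults the left-endpoint list at the root, because 5 < 8, and the right-endpoint list at 0*, because 5 > 4, which mirrors the "Search 5" example. Each rule is stored at exactly one node, which is what keeps updates cheap compared with HoT.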
Conclusions and open problems
Presented two algorithms with reasonable worst-case bounds in all performance metrics
The algorithms seem to be simple to implement though no implementation results were shown
What are non-trivial lower bounds to the query time in d-dimensional classification, both static and dynamic, in a given amount of space?
Can we improve upon both the search and update times in the paper in the same or better storage complexity?