Servers in Action: Towards Distributed Traffic Measurement ...2 Network Measurement in Data Centers...
Transcript of Servers in Action: Towards Distributed Traffic Measurement ...2 Network Measurement in Data Centers...
1
Servers in Action: Towards Distributed Traffic Measurement in Data Centers
Praveen Tammana & Myungjin Lee School of Informatics
University of Edinburgh
MSN’14 - Cosener's House, Abingdon
2
Network Measurement in Data Centers
● Data center applications– High Bandwidth– Latency sensitive
● Network Management– Control
● Routing, Access control
– Measurement● Traffic engineering, Security applications● Basic requirements
– Accuracy, Scalability● Data center requirements
– Programmable, Responsive, Evolvable
3
Data Center Measurement Requirements
● Responsiveness : Quick control loop decisions (Heavy flow scheduling)
● Programmability: Adaptable to dynamic workloads
● Evolvability: Software based measurement module (bloom filter, trie, hashtable)
Core
Edge
Aggregate
4
Network Measurement Framework
AccountingTraffic Engineering
Fault diagnosis
SLA monitoring Anomaly detection
worms, portscans, botnets
Forensic analysis Network ManagementTasks
Flow Collector
Flow Monitoring● Software ● Hardware
5
Software Based - Flow Monitoring
● Sampling
– NetFlow, sFlow
– High traffic rates compliance with limited switch resources (SRAM, CPU)
● Problem
– Not Accurate (Basic Requirement)
➔ Flow coverage and accuracy are compromised.
➔ Not suitable for management tasks that requires fine grained flow details.
AccountingTraffic Engineering
Fault diagnosis
SLA monitoring Anomaly detection
worms, portscans, botnets
Forensic analysis
sampled
Packet stream
Counters
Task 1
Task 2
Task N
# Bytes/Pkts
Management Tasks
6
Hardware Based - Flow Monitoring
● Task Oriented
– Task 1 : Anomaly Detection
– Task 2 : Traffic Engineering
● Problem
– Not Evolvable (DC Requirement)
● Higher speed links (40/100 Gbps)
● SLA monitoring in data centers
Packet stream
Counters(SRAM) Counters Counters
# Bytes/Pkts
Task 1 Task 2 Task N
7
Network Is Evolving
(Net Optics 2013)
8
Distributed Traffic Measurement
● Our approach :
– Distribute flow monitoring overhead between switches and servers
Sta
tistic
pkt
s
Statistic pkts
f1f2f3f4
f1
f5
Flows
Monitor Flows1
2
3Aggregate pkts Report results
Collector
9
Distributed Traffic Measurement
Core
Edge
Administrators have complete control of switches and servers
- High computational resources (multiple cores, large memory)
- Hosts observe relevant traffic of running services
- Monitors less traffic than switch
Aggregate
10
Proposed Framework
Measurement module
● Producers
● Monitor traffic and generates statistic packets (s-pkts) (e.g., per flow record)
● Feeds s-pkts to ToR switch
Aggregation module
● Consumers
● Aggregates statistic packets (s-pkts)
● Counters can be stored in high density DRAM
11
Proposed Framework – Packet Processing
Measurement module
NIC
1. Copy of regular packets
Reglular packets
12
Measurement module
NIC
1. Copy of regular packets
Regular packets
2. statistic packets (s-pkts)
Proposed Framework – Packet Processing
13
Aggregation Module
3. Copy of statistic packets (s-pkts)
Reglular packets
Measurement module
NIC
1. Copy of regular packets
Regular packets
2. statistic packets (s-pkts)
Ing
r es
s p
ort
Eg
r es
s p
ort
Proposed Framework – Packet Processing
14
On going work : s-pkt forwarding
● How to forward s-pkt ?– Packet path encoding and IP source route option
– Use switch forwarding table
15
Usecase – Hierarchical Heavy Hitter (HHH)
40
400
19
12 7
1 5 2
21
12 9
9 3 5 4
Threshold : T=10
11
****
0*** 1***
00** 01**
000* 001* 010* 011*
0000 0001 0010 0011 0100 0101 0110 0111
HHH: Longest IP prefix occupies more than fraction T of link bandwidthafter excluding any descendant HHH
Traffic volume for each IP Prefix
16
HHH Detection
f1f2f3f4
f1
f5
HHH Measurement Module
HHH Aggregation Module
Collector
Report HHH
Pre-filtering
IP Prefix Trie (Source IP)
Statis
tic p
kts
f1f2f3
f1
HHH Measurement Module
Pre-filtering
Statistic pkts
17
Evaluation
● Simulation setup
– Measurement module : Customized YAF
– Aggregation module : IP Prefix Trie
– Packet trace – T. Benson : University data center
● Aggregation module performance
– HHH Accuracy
– Computation overhead on Servers and switches
– Compared with NetFlow
18
Preliminary Results
Aggregation module overhead (AMO) over NetFlow overhead (NFO)
HHH Accuracy Varying Sampling Rates
FPR : False Positive RateFNR : False Negative Rate
19
Preliminary Results
Aggregation module overhead (AMO) over NetFlow overhead (NFO)
HHH Accuracy Varying Sampling Rates
FPR : False Positive RateFNR : False Negative Rate
Correctness : 100% Sampling rate – 100% Accuracy Overhead: AMO is just < 2% of NFO
20
Conclusions and Future work
● Conclusions
– Our framework offloads overhead on switch
– Evolves along with data center traffic volume
– Provides more flexibility to data centre operators● Future Work
– Prototyping proposed framework
– Exploring performance across different measurement tasks
– Endhost based network trouble shooting (e.g., packet loss, delay)
– Impact of packet loss on accuracy
– Distributing measurement task overhead across network
22
Challenges
– Handling multiple paths between End Hosts
– Consistency with forwarding rules update
23
Measurement Tasks
● Hierarchical Heavy Hitter (HHH)● Heavy Hitter● Superspreader● Flow Size Distribution● DDoS
24
Proposed Framework : s-pkt forwarding
S R
T1
SS
T2
A1
Flow Path : S → T1 → A1 → T2 → R
Encodes path information into packet
25
Proposed Framework : s-pkt Forwarding
S R
T1
SS
T2
A1
Flow Path : S → T1 → A1 → T2 → R
s-pkt : R → T2 → A1 → T1
s-pkt
s-pk
t s-
pkt
1. Generate s-pkt
2. Enables IP source routing option