Fine-Grained Latency and Loss Measurements in the
Presence of Reordering
Myungjin Lee, Sharon Goldberg, Ramana Rao Kompella, George Varghese
Trend toward low-latency networks
- Low latency: one of the important metrics in designing a network
- Switch vendors are introducing switches that provide low latency
- Financial data centers are beginning to demand more stringent latency guarantees
Benefits of low-latency networks
- An automated trading program can buy shares cheaply
- A cluster application can run 1000s more instructions
[Figure: a financial service provider network connecting a content provider and a brokerage; the provider claims an end-to-end latency SLA of a few microseconds]
But… guaranteeing low latency in data centers is hard
- Congestion needs to be kept below a certain level
- Reason 1: no traffic models exist for different applications, which hinders managers from predicting offending applications
- Reason 2: a new application's behavior is often unforeseen until it is actually deployed (e.g., the TCP incast problem [SIGCOMM '09])
Latency & loss measurements are crucial
- Need latency & loss measurements on a continuous basis
- Detect problems, then fix them: re-route the offending application, upgrade links, etc.
Goal: provide fine-grained end-to-end aggregate latency and loss measurements in data center environments
[Figure: end-to-end latency and loss measurements between endpoints A and B across the financial service provider network]
Measurement model
- Out-of-order packet delivery due to multiple paths
- Packet filtering associates the packet stream between A and B
- Time synchronization: IEEE 1588, GPS clock, etc.
- No header changes: regular packets carry no timestamp
[Figure: A and B at the edges of the provider network; multiple paths cause out-of-order delivery; filters select the measured stream]
Measurement model
- Interval message: a special 'sync' control packet that marks off a measurement interval, injected by measurement modules at an edge (e.g., Router A)
- Measurement interval: the set of packets 'bookended' by a pair of interval messages
[Figure: Router A and Router B with filters; a pair of interval messages delimits the measurement interval]
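The bookending idea above can be sketched in a few lines. This is an illustrative simulation, not the paper's implementation: the function name and the `"SYNC"` marker are assumptions standing in for the real interval messages.

```python
# Hypothetical sketch: group a packet stream into measurement intervals
# delimited by 'sync' interval messages injected by the sender's edge.

def split_into_intervals(stream):
    """Return the list of measurement intervals, each the set of packets
    'bookended' by a pair of interval messages."""
    intervals, current = [], []
    for pkt in stream:
        if pkt == "SYNC":          # interval message marks a boundary
            if current:
                intervals.append(current)
            current = []
        else:
            current.append(pkt)
    return intervals

stream = ["SYNC", "p1", "p2", "SYNC", "p3", "SYNC"]
print(split_into_intervals(stream))   # [['p1', 'p2'], ['p3']]
```

In the real system both routers see the same interval messages, so they agree on which packets belong to each interval even though timestamps differ.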
Existing solutions
- Active probes. Problem: not effective due to the huge probe rate required
- Storing timestamps and packet digests locally. Problem: significant communication overhead; packet sampling trades off accuracy against overhead
- Lossy Difference Aggregator (LDA) [Kompella, SIGCOMM '09]: the state-of-the-art solution, with a FIFO packet-delivery assumption. Problem: not suitable when packets can be reordered
LDA in the packet loss case
- Key point: only useful buckets must be used for estimation
- A useful bucket: a bucket updated by the same set of packets at A and B
- Bad packets: lost packets that corrupt buckets
[Figure: Router A and Router B hash packets into buckets, each holding a packet count and a timestamp sum; a lost packet corrupts one bucket, which is excluded. True delay = ((3 − 1) + (11 − 7) + (9 − 5)) / 3 ≈ 3.3; estimated delay over useful buckets = (12 − 6) / 2 = 3; estimation error = 9%]
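The LDA structure above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class name, the bucket count, and the CRC32 bucket hash are assumptions, and only the basic count-match rule is modeled.

```python
import zlib

class LDA:
    """Minimal sketch of a Lossy Difference Aggregator: each packet is
    hashed to a bucket that accumulates a timestamp sum and a count."""
    def __init__(self, n_buckets=4):
        self.sums = [0.0] * n_buckets
        self.counts = [0] * n_buckets

    def update(self, pkt_bytes, timestamp):
        b = zlib.crc32(pkt_bytes) % len(self.sums)   # bucket choice
        self.sums[b] += timestamp
        self.counts[b] += 1

def lda_delay(sender, receiver):
    """Average one-way delay over 'useful' buckets, i.e. buckets whose
    packet counts match at sender and receiver (FIFO assumption)."""
    delay_sum = cnt = 0
    for sa, ca, sb, cb in zip(sender.sums, sender.counts,
                              receiver.sums, receiver.counts):
        if ca == cb and ca > 0:      # count match => bucket assumed useful
            delay_sum += sb - sa
            cnt += cb
    return delay_sum / cnt if cnt else None
```

A lost packet leaves a count mismatch, so its bucket is simply skipped; the next slides show why the count match alone fails under reordering.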
LDA in the packet loss + reordering case
- Problem: LDA confounds loss and reordering
- A packet count match in buckets between A and B is insufficient
- Reordered packets are also bad packets
- Significant error in loss and aggregate latency estimation
[Figure: with reordering, a bucket's count can match even though different packets updated it at A and B (buckets freeze after being updated). True delay = 3.3; estimated delay = (12 + 24 − 6 − 9) / 4 = 5.25; estimation error = 59%]
Quick fix of LDA: per-path LDA
- Let LDA operate on a per-path basis
- Exploits the fact that packets within a flow are not reordered by ECMP
- Issues: (1) associating a flow with a path is difficult; (2) not scalable: potentially millions of separate TCP flows to handle
Packet reordering in IP networks
- Today's trend: no reordering among packets in a flow; no reordering across flows between two interfaces
- New trend: data centers exploit path diversity; ECMP splits flows across multiple equal-cost paths, so reordering can occur across flows
- Future direction: switches may allow reordering within switches for improved load balancing and utilization; reordering-tolerant TCP for use in data centers
Proposed approach: FineComb
- Objective: detect and correct unusable buckets; control the number of unusable buckets
- Key ideas:
  1) Incremental stream digests: detect unusable buckets
  2) Stash recovery: make corrupted buckets useful by correction
  3) Packet sampling: control the number of bad packets included
Incremental stream digests (ISDs)
- An ISD = H(pkt1) ⊕ H(pkt2) ⊕ … ⊕ H(pktk), where ⊕ is an invertible commutative operator (e.g., XOR)
- Property 1: low collision probability. Two different packet streams hash to different values; this allows detecting corrupted buckets
- Property 2: invertibility. A packet digest is easily added to or subtracted from an ISD; this is the basis of stash recovery
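An XOR-based ISD can be sketched in a few lines. The class and helper names are illustrative, and a truncated SHA-256 stands in for whatever packet hash the real system uses; only the two properties above are what matters.

```python
import hashlib

def pkt_digest(pkt_bytes):
    """32-bit digest of a packet's invariant content (illustrative hash)."""
    return int.from_bytes(hashlib.sha256(pkt_bytes).digest()[:4], "big")

class ISD:
    """Incremental stream digest: XOR of per-packet digests. XOR is
    commutative (order-insensitive) and invertible (x ^ x = 0), so a
    digest can be added, and later subtracted, in any order."""
    def __init__(self):
        self.value = 0
    def add(self, pkt_bytes):
        self.value ^= pkt_digest(pkt_bytes)
    def remove(self, pkt_bytes):       # same operation, by invertibility
        self.value ^= pkt_digest(pkt_bytes)

a, b = ISD(), ISD()
for p in [b"p1", b"p2", b"p3"]:
    a.add(p)
for p in [b"p3", b"p1", b"p2"]:        # same packets, different order
    b.add(p)
assert a.value == b.value              # commutativity: order is irrelevant
```

Commutativity is what makes the digest robust to reordering within a bucket, while invertibility is what lets stash recovery "undo" a stray packet.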
ISDs handle loss and reordering
- ISDs detect buckets corrupted by loss and reordering
- Buckets are usable only if both the packet counts and the ISDs match between A and B
[Figure: each bucket now holds a packet count, a timestamp sum, and an ISD; both a lost packet and a reordered packet leave the affected bucket's ISDs mismatched between A and B. True delay = 3.3]
Latency and loss estimation
- Average latency: delay sum over usable buckets = (12 − 6) + 0 + 0 = 6; usable packet count = 2; average latency = 6 / 2 = 3.0
- Loss: loss count sum over bad buckets = 3; total packets = 7; loss rate = 3 / 7 ≈ 0.43
[Figure: Router A and Router B bucket contents (packet count, timestamp sum, ISD) used in the computation]
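The two estimators above can be sketched together. This is an illustrative simplification, not the paper's exact procedure: buckets are modeled as (count, timestamp sum, ISD) triples, and loss over bad buckets is charged as the raw count difference, ignoring the stash corrections the next slides describe.

```python
def estimate(sender, receiver):
    """sender/receiver: parallel lists of (count, ts_sum, isd) buckets.
    A bucket is usable iff BOTH its packet count and its ISD match;
    latency is averaged over usable buckets, loss summed over the rest.
    Returns (avg_latency, loss_rate)."""
    delay_sum = usable_cnt = lost = total_sent = 0
    for (ca, sa, da), (cb, sb, db) in zip(sender, receiver):
        total_sent += ca
        if ca == cb and da == db:        # usable: same packet set at A and B
            delay_sum += sb - sa
            usable_cnt += cb
        else:                            # bad bucket: count diff ~ losses
            lost += ca - cb
    avg = delay_sum / usable_cnt if usable_cnt else None
    rate = lost / total_sent if total_sent else 0.0
    return avg, rate
```

For example, a usable bucket holding 2 packets with timestamp sums 6 (at A) and 12 (at B) contributes an average delay of (12 − 6) / 2 = 3.0, matching the slide's arithmetic.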
Stash recovery
- Stash: a set of (timestamp, bucket index, hash value) tuples for packets that are potentially reordered
- (−) stash: contains packets potentially added at the receiver (Router B); in recovery, their packet digests are subtracted from bad buckets at the receiver
- (+) stash: contains packets potentially missing at the receiver (Router B); in recovery, their packet digests are added to bad buckets at the receiver
Stash recovery (cont.)
- A bad bucket can be recovered iff reordered packets corrupted it
- Recovered reordered packets are not counted as lost packets, which increases loss estimation accuracy
[Figure: recovery tries all subsets of the (−) stash at B against a bad bucket; when subtracting a subset's digests makes the ISDs match the corresponding bucket at A, the bucket is recovered]
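The subset search above can be sketched as follows. This is a hypothetical illustration of the (−) stash case only: the function name and argument layout are assumptions, stash entries are reduced to (timestamp, digest) pairs, and the brute-force subset enumeration is exponential, used here only to make the idea concrete.

```python
from itertools import combinations

def recover_bucket(cnt_b, sum_b, isd_b, target_cnt, target_isd, stash):
    """Try to repair a bad receiver bucket by removing some subset of
    potentially-reordered packets recorded in the (-) stash.
    stash: list of (timestamp, digest) for packets that may not belong.
    Returns the corrected (count, ts_sum), or None if no subset works."""
    for r in range(len(stash) + 1):
        for subset in combinations(stash, r):
            isd = isd_b
            for _, d in subset:
                isd ^= d                 # subtract digests (XOR inverts)
            if cnt_b - len(subset) == target_cnt and isd == target_isd:
                return (cnt_b - len(subset),
                        sum_b - sum(ts for ts, _ in subset))
    return None                          # bucket was corrupted by loss
```

If recovery succeeds, the bucket's ISD now matches the sender's, so the bucket becomes usable and its packets are not misclassified as lost.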
Sizing buckets and stashes
- Known loss and reordering rates: given a fixed storage size, we obtain the optimal packet sampling rate p* and provision the stash and buckets based on p*
- Unknown loss and reordering rates: use multiple banks optimized for different sets of loss and reordering rates (details can be found in our paper)
Accuracy of latency estimation
[Plot: average relative error vs. reordering rate; FineComb (ISD + stash) is up to 1000x more accurate than FineComb- (ISD only). Packet loss rate = 0.01%, #packets = 5M, true mean delay = 10 μs]
Accuracy of loss estimation
[Plot: average relative error vs. reordering rate; packet loss rate = 0.01%, #packets = 5M. The stash helps obtain accurate loss estimation]
Summary
- Data centers require fine-grained end-to-end latency and loss measurements
- We proposed a data structure called FineComb that is resilient to packet loss and reordering: the incremental stream digest detects corrupted buckets, and the stash recovers buckets corrupted only by reordered packets
- Evaluation shows FineComb achieves higher accuracy in latency and loss estimation than LDA

Thank you! Questions?
Backup
Microscopic loss estimation
[Plot: average relative error vs. reordering rate]
Handling unknown loss & reordering rates
[Plot: average relative error vs. reordering rate; LDA: 2 banks, FineComb: 4 banks with the same memory size]