Fine-Grained Latency and Loss Measurements in the
Presence of Reordering
Myungjin Lee, Sharon Goldberg, Ramana Rao Kompella, George Varghese
Trend toward low-latency networks
- Low latency: one of the important metrics in designing a network
- Switch vendors are introducing switches that provide low latency
- Financial data centers are beginning to demand more stringent latency guarantees
Benefits of low-latency networks
- An automated trading program can buy shares cheaply
- A cluster application can run 1000s more instructions
[Figure: a financial service provider network connecting a content provider and a brokerage; the provider claims an end-to-end latency SLA of a few microseconds]
But… guaranteeing low latency in data centers is hard
- Congestion needs to be kept below a certain level
- Reason 1: no traffic models exist for different applications, which hinders managers from predicting offending applications
- Reason 2: a new application's behavior is often unforeseen until it is actually deployed (e.g., the TCP incast problem [SIGCOMM '09])
Latency & loss measurements are crucial
- Need latency & loss measurements on a continuous basis
- Detect problems, then fix them: re-route the offending application, upgrade links, etc.
Goal: provide fine-grained end-to-end aggregate latency and loss measurements in data center environments
[Figure: end-to-end latency and loss measurements between endpoints A and B across the financial service provider network]
Measurement model
- Out-of-order packet delivery due to multiple paths
- Packet filtering associates the packet stream between A and B
- Time synchronization: IEEE 1588, GPS clock, etc.
- No header changes: regular packets carry no timestamp
[Figure: A and B at the edges of the provider network; multiple paths cause out-of-order delivery; filters select the measured stream]
Measurement model
- Interval message: a special 'sync' control packet that marks off a measurement interval, injected by measurement modules at an edge (e.g., Router A)
- Measurement interval: the set of packets 'bookended' by a pair of interval messages
[Figure: Router A and Router B with filters; a pair of interval messages delimits the measurement interval]
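The bookending idea above can be sketched in a few lines. This is an illustrative simulation, not the paper's implementation: the function name and the `"SYNC"` marker are assumptions standing in for the real interval messages.

```python
# Hypothetical sketch: group a packet stream into measurement intervals
# delimited by 'sync' interval messages injected by the sender's edge.

def split_into_intervals(stream):
    """Return the list of measurement intervals, each the set of packets
    'bookended' by a pair of interval messages."""
    intervals, current = [], []
    for pkt in stream:
        if pkt == "SYNC":          # interval message marks a boundary
            if current:
                intervals.append(current)
            current = []
        else:
            current.append(pkt)
    return intervals

stream = ["SYNC", "p1", "p2", "SYNC", "p3", "SYNC"]
print(split_into_intervals(stream))   # [['p1', 'p2'], ['p3']]
```

In the real system both routers see the same interval messages, so they agree on which packets belong to each interval even though timestamps differ.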
Existing solutions
- Active probes. Problem: not effective due to the huge probe rate required
- Storing timestamps and packet digests locally. Problem: significant communication overhead; packet sampling trades off accuracy against overhead
- Lossy Difference Aggregator (LDA) [Kompella, SIGCOMM '09]: the state-of-the-art solution, with a FIFO packet-delivery assumption. Problem: not suitable when packets can be reordered
LDA in the packet loss case
- Key point: only useful buckets must be used for estimation
- A useful bucket: a bucket updated by the same set of packets at A and B
- Bad packets: lost packets that corrupt buckets
[Figure: Router A and Router B hash packets into buckets, each holding a packet count and a timestamp sum; a lost packet corrupts one bucket, which is excluded. True delay = ((3 − 1) + (11 − 7) + (9 − 5)) / 3 ≈ 3.3; estimated delay over useful buckets = (12 − 6) / 2 = 3; estimation error = 9%]
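The LDA structure above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the class name, the bucket count, and the CRC32 bucket hash are assumptions, and only the basic count-match rule is modeled.

```python
import zlib

class LDA:
    """Minimal sketch of a Lossy Difference Aggregator: each packet is
    hashed to a bucket that accumulates a timestamp sum and a count."""
    def __init__(self, n_buckets=4):
        self.sums = [0.0] * n_buckets
        self.counts = [0] * n_buckets

    def update(self, pkt_bytes, timestamp):
        b = zlib.crc32(pkt_bytes) % len(self.sums)   # bucket choice
        self.sums[b] += timestamp
        self.counts[b] += 1

def lda_delay(sender, receiver):
    """Average one-way delay over 'useful' buckets, i.e. buckets whose
    packet counts match at sender and receiver (FIFO assumption)."""
    delay_sum = cnt = 0
    for sa, ca, sb, cb in zip(sender.sums, sender.counts,
                              receiver.sums, receiver.counts):
        if ca == cb and ca > 0:      # count match => bucket assumed useful
            delay_sum += sb - sa
            cnt += cb
    return delay_sum / cnt if cnt else None
```

A lost packet leaves a count mismatch, so its bucket is simply skipped; the next slides show why the count match alone fails under reordering.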
LDA in the packet loss + reordering case
- Problem: LDA confounds loss and reordering
- A packet count match in buckets between A and B is insufficient
- Reordered packets are also bad packets
- Significant error in loss and aggregate latency estimation
[Figure: with reordering, a bucket's count can match even though different packets updated it at A and B (buckets freeze after being updated). True delay = 3.3; estimated delay = (12 + 24 − 6 − 9) / 4 = 5.25; estimation error = 59%]
Quick fix of LDA: per-path LDA
- Let LDA operate on a per-path basis
- Exploits the fact that packets within a flow are not reordered by ECMP
- Issues: (1) associating a flow with a path is difficult; (2) not scalable: potentially millions of separate TCP flows to handle
Packet reordering in IP networks
- Today's trend: no reordering among packets in a flow; no reordering across flows between two interfaces
- New trend: data centers exploit path diversity; ECMP splits flows across multiple equal-cost paths, so reordering can occur across flows
- Future direction: switches may allow reordering within switches for improved load balancing and utilization; reordering-tolerant TCP for use in data centers
Proposed approach: FineComb
- Objective: detect and correct unusable buckets; control the number of unusable buckets
- Key ideas:
  1) Incremental stream digests: detect unusable buckets
  2) Stash recovery: make corrupted buckets useful by correction
  3) Packet sampling: control the number of bad packets included
Incremental stream digests (ISDs)
- An ISD = H(pkt1) ⊕ H(pkt2) ⊕ … ⊕ H(pktk), where ⊕ is an invertible commutative operator (e.g., XOR)
- Property 1: low collision probability. Two different packet streams hash to different values; this allows detecting corrupted buckets
- Property 2: invertibility. A packet digest is easily added to or subtracted from an ISD; this is the basis of stash recovery
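An XOR-based ISD can be sketched in a few lines. The class and helper names are illustrative, and a truncated SHA-256 stands in for whatever packet hash the real system uses; only the two properties above are what matters.

```python
import hashlib

def pkt_digest(pkt_bytes):
    """32-bit digest of a packet's invariant content (illustrative hash)."""
    return int.from_bytes(hashlib.sha256(pkt_bytes).digest()[:4], "big")

class ISD:
    """Incremental stream digest: XOR of per-packet digests. XOR is
    commutative (order-insensitive) and invertible (x ^ x = 0), so a
    digest can be added, and later subtracted, in any order."""
    def __init__(self):
        self.value = 0
    def add(self, pkt_bytes):
        self.value ^= pkt_digest(pkt_bytes)
    def remove(self, pkt_bytes):       # same operation, by invertibility
        self.value ^= pkt_digest(pkt_bytes)

a, b = ISD(), ISD()
for p in [b"p1", b"p2", b"p3"]:
    a.add(p)
for p in [b"p3", b"p1", b"p2"]:        # same packets, different order
    b.add(p)
assert a.value == b.value              # commutativity: order is irrelevant
```

Commutativity is what makes the digest robust to reordering within a bucket, while invertibility is what lets stash recovery "undo" a stray packet.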
ISDs handle loss and reordering
- ISDs detect buckets corrupted by loss and reordering
- Buckets are usable only if both the packet counts and the ISDs match between A and B
[Figure: each bucket now holds a packet count, a timestamp sum, and an ISD; both a lost packet and a reordered packet leave the affected bucket's ISDs mismatched between A and B. True delay = 3.3]
Latency and loss estimation
- Average latency: delay sum over usable buckets = (12 − 6) + 0 + 0 = 6; usable packet count = 2; average latency = 6 / 2 = 3.0
- Loss: loss count sum over bad buckets = 3; total packets = 7; loss rate = 3 / 7 ≈ 0.43
[Figure: Router A and Router B bucket contents (packet count, timestamp sum, ISD) used in the computation]
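The two estimators above can be sketched together. This is an illustrative simplification, not the paper's exact procedure: buckets are modeled as (count, timestamp sum, ISD) triples, and loss over bad buckets is charged as the raw count difference, ignoring the stash corrections the next slides describe.

```python
def estimate(sender, receiver):
    """sender/receiver: parallel lists of (count, ts_sum, isd) buckets.
    A bucket is usable iff BOTH its packet count and its ISD match;
    latency is averaged over usable buckets, loss summed over the rest.
    Returns (avg_latency, loss_rate)."""
    delay_sum = usable_cnt = lost = total_sent = 0
    for (ca, sa, da), (cb, sb, db) in zip(sender, receiver):
        total_sent += ca
        if ca == cb and da == db:        # usable: same packet set at A and B
            delay_sum += sb - sa
            usable_cnt += cb
        else:                            # bad bucket: count diff ~ losses
            lost += ca - cb
    avg = delay_sum / usable_cnt if usable_cnt else None
    rate = lost / total_sent if total_sent else 0.0
    return avg, rate
```

For example, a usable bucket holding 2 packets with timestamp sums 6 (at A) and 12 (at B) contributes an average delay of (12 − 6) / 2 = 3.0, matching the slide's arithmetic.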
Stash recovery
- Stash: a set of (timestamp, bucket index, hash value) tuples for packets that are potentially reordered
- (−) stash: contains packets potentially added at the receiver (Router B); in recovery, their packet digests are subtracted from bad buckets at the receiver
- (+) stash: contains packets potentially missing at the receiver (Router B); in recovery, their packet digests are added to bad buckets at the receiver
Stash recovery (cont.)
- A bad bucket can be recovered iff reordered packets corrupted it
- Recovered reordered packets are not counted as lost packets, which increases loss estimation accuracy
[Figure: recovery tries all subsets of the (−) stash at B against a bad bucket; when subtracting a subset's digests makes the ISDs match the corresponding bucket at A, the bucket is recovered]
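The subset search above can be sketched as follows. This is a hypothetical illustration of the (−) stash case only: the function name and argument layout are assumptions, stash entries are reduced to (timestamp, digest) pairs, and the brute-force subset enumeration is exponential, used here only to make the idea concrete.

```python
from itertools import combinations

def recover_bucket(cnt_b, sum_b, isd_b, target_cnt, target_isd, stash):
    """Try to repair a bad receiver bucket by removing some subset of
    potentially-reordered packets recorded in the (-) stash.
    stash: list of (timestamp, digest) for packets that may not belong.
    Returns the corrected (count, ts_sum), or None if no subset works."""
    for r in range(len(stash) + 1):
        for subset in combinations(stash, r):
            isd = isd_b
            for _, d in subset:
                isd ^= d                 # subtract digests (XOR inverts)
            if cnt_b - len(subset) == target_cnt and isd == target_isd:
                return (cnt_b - len(subset),
                        sum_b - sum(ts for ts, _ in subset))
    return None                          # bucket was corrupted by loss
```

If recovery succeeds, the bucket's ISD now matches the sender's, so the bucket becomes usable and its packets are not misclassified as lost.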
Sizing buckets and stashes
- Known loss and reordering rates: given a fixed storage size, we obtain the optimal packet sampling rate p* and provision the stash and buckets based on p*
- Unknown loss and reordering rates: use multiple banks optimized for different sets of loss and reordering rates (details can be found in our paper)
Accuracy of latency estimation
[Plot: average relative error vs. reordering rate; FineComb (ISD + stash) is up to 1000x more accurate than FineComb- (ISD only). Packet loss rate = 0.01%, #packets = 5M, true mean delay = 10 μs]
Accuracy of loss estimation
[Plot: average relative error vs. reordering rate; packet loss rate = 0.01%, #packets = 5M. The stash helps obtain accurate loss estimation]
Summary
- Data centers require fine-grained end-to-end latency and loss measurements
- We proposed a data structure called FineComb that is resilient to packet loss and reordering: the incremental stream digest detects corrupted buckets, and the stash recovers buckets corrupted only by reordered packets
- Evaluation shows FineComb achieves higher accuracy in latency and loss estimation than LDA

Thank you! Questions?
Backup
Microscopic loss estimation
[Plot: average relative error vs. reordering rate]
Handling unknown loss & reordering rates
[Plot: average relative error vs. reordering rate; LDA: 2 banks, FineComb: 4 banks with the same memory size]