DiffProbe: Detecting ISP Service Discrimination

Partha Kanuparthy, Constantine Dovrolis

Net Neutrality

Recent FCC-ISP debates: Comcast throttling dispute, etc.

FCC broadband mapping framework:
– Tools to estimate performance
– $350M in stimulus funds

What is Service Discrimination?

ISPs can classify certain apps as low-priority and service them accordingly

Discrimination can manifest as (relatively):
– high delays
– high loss rates

ISPs can also do shaping:
– leads to low throughput (=> both delay and loss)
– ShaperProbe: first step

Goals

Problem:
– Is an application's traffic being classified low-priority by an ISP?
– Is the ISP doing loss or delay discrimination, or both?
– Can we identify the scheduler type?

Solution: compare the performance of normal and application traffic sent simultaneously

Identifying discrimination is not easy:

1. Congestion events can be short-lived (µs to ms scales)

– Bad idea: compare delays/loss rates from different times

2. Customer may see the same performance if there is no cross-traffic

– Bad idea: call this no discrimination

[Figure labels: DiffProbe, server, WRR]

Delay Discrimination: Practice

Non-discriminatory schedulers (single queue): First-Come-First-Serve (FCFS)

Discriminatory schedulers (multiple classes): Strict Priority (SP), Weighted Fair Queuing (WFQ)

Delay discrimination creates a difference in delay distributions

Loss Discrimination: Practice

Non-discriminatory buffer managers: DropTail (DT), Random Early Detection (RED)

Discriminatory buffer managers: Weighted RED (WRED), Drop-from-Longest-Queue

Loss discrimination creates a difference in loss rates


Rest of the talk…

– High level design
– Detecting delay discrimination
– Detecting loss discrimination
– The DiffProbe tool
– ShaperProbe

High-level design

Send normal (P) and application (A) traffic simultaneously

Measure one-way delays (OWDs) and lost packets for each flow

[Figure: DiffProbe and server, with application traffic (A) and normal traffic (P) between them]

Avoiding Classification

A flow: ...

P flow has to be:
– sufficiently different from A to avoid classification (e.g., alter payload, ports, gaps)
– sufficiently similar to A that both flows observe the same network performance when there is no discrimination:
  – same packet size distribution for A and P
  – send a P packet at about the same time as an A packet
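The constraints on the P flow can be illustrated with a minimal Python sketch. The `Packet` record, `make_p_flow`, and the `offset` parameter are hypothetical names chosen for illustration, not DiffProbe's code, and the sketch assumes that randomizing the payload bytes and changing the destination port is enough to avoid the classifier.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Packet:
    send_time: float  # seconds from the start of the flow
    dst_port: int
    payload: bytes


def make_p_flow(a_flow, p_port, offset=0.0, seed=None):
    """Derive a P flow with the same packet sizes and timing as the A flow.

    Assumption: a different port plus random payload bytes avoids
    port-based and payload-based classification.
    """
    rng = random.Random(seed)
    p_flow = []
    for pkt in a_flow:
        # Same length as the A packet, so the size distribution is identical.
        payload = bytes(rng.getrandbits(8) for _ in range(len(pkt.payload)))
        # Sent at (about) the same time as the matching A packet.
        p_flow.append(Packet(pkt.send_time + offset, p_port, payload))
    return p_flow


a_flow = [Packet(0.00, 5060, b"voice-frame-1"), Packet(0.02, 5060, b"voice-frame-2")]
p_flow = make_p_flow(a_flow, p_port=40000, seed=1)
```

Keeping sizes and send times identical is what lets A and P sample the same queue state; only the fields a classifier might key on are changed.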

Probing Patterns

Create two probing structures using A and P:
– Balanced Load Period (BLP): send both flows at their normal rates
– Load Increase Period (LIP): scale up the P flow's rate

Why create LIP? To maximize the chances of queuing in the ISP network

[Figure: A and P packet patterns]

Discrimination Identifiability

The user does not always “see” discrimination:
– no high-priority backlog “=>” low-priority traffic gets the link capacity

We use BLP to detect unidentifiable conditions for delay discrimination:
– identifiable if P delays created during LIP are larger than during BLP

(90th percentile of P's delays during LIP) / (median of P's delays during BLP) > ...
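The identifiability condition for delay discrimination can be sketched in Python as follows. The threshold on the ratio is not recoverable from the slide, so `ratio_threshold` is a hypothetical parameter (its default of 1.0 is an assumption), and the nearest-rank `percentile` helper is likewise a choice made for this sketch.

```python
import math
import statistics


def percentile(samples, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def delay_identifiable(p_delays_lip, p_delays_blp, ratio_threshold=1.0):
    """True if the LIP created noticeably larger P delays than the BLP.

    ratio_threshold is an assumed value, not one taken from the talk.
    """
    high_lip = percentile(p_delays_lip, 90)
    typical_blp = statistics.median(p_delays_blp)
    return high_lip > ratio_threshold * typical_blp
```

If this check fails, the path probably never built a queue during the LIP, so equal A and P delays say nothing about whether the ISP discriminates.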

Overview

– High level design
– Detecting delay discrimination
– Detecting loss discrimination
– The DiffProbe tool

Detecting Delay Discrimination

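We observe the empirical delay distributions of the A and P flows during LIP

No delay discrimination: the two distributions are equal, e.g. FCFS (Comcast)

Delay discrimination: the two distributions differ, e.g. WRR 1:3 (emulated)

[Figure: delay distributions of A and P for each case]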

Detecting Delay Discrimination (2)

Pre-processing:
– Pairing: consider only those (A, P) sample pairs that were sent within an MTU transmission time, τ, of each other
– Discard delay values in the τ-neighborhood of the estimated propagation delay; such samples don't see queuing
– Subtract the propagation delay estimate from the samples
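The three pre-processing steps can be sketched in Python as below. Two details are assumptions made for this sketch and are not stated in the talk: the propagation delay is estimated as the minimum observed one-way delay, and a pair is dropped when either of its samples lies within τ of that estimate. The function name and the `(send_time, owd)` tuple layout are hypothetical.

```python
def preprocess(a_samples, p_samples, mtu_bytes, capacity_bps):
    """Pair A and P samples and return their queuing delays.

    a_samples, p_samples: lists of (send_time_s, one_way_delay_s).
    """
    tau = mtu_bytes * 8 / capacity_bps  # MTU transmission time
    # Assumption: the smallest delay seen approximates the propagation delay.
    prop = min(owd for _, owd in list(a_samples) + list(p_samples))

    p_sorted = sorted(p_samples)
    a_out, p_out = [], []
    j = 0
    for t_a, d_a in sorted(a_samples):
        # Skip P samples sent more than tau before this A sample.
        while j < len(p_sorted) and p_sorted[j][0] < t_a - tau:
            j += 1
        if j < len(p_sorted) and abs(p_sorted[j][0] - t_a) <= tau:
            d_p = p_sorted[j][1]
            j += 1  # each P sample is paired at most once
            # Drop pairs that did not see queuing (within tau of prop).
            if d_a - prop > tau and d_p - prop > tau:
                a_out.append(d_a - prop)
                p_out.append(d_p - prop)
    return a_out, p_out
```

With a 1500-byte MTU on a 12 Mbps path, τ is 1 ms, so only A and P packets sent within 1 ms of each other are compared.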



Hypothesis test:

1. Null hypothesis: equal distributions

2. Compute the Kullback-Leibler (KL) divergence of the pre-processed samples

3. Compute the KL divergences of uniform random partitions of the combined samples

4. Is (2) > (3)?
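A minimal Python sketch of this resampling test follows. It is an illustration under stated assumptions rather than DiffProbe's implementation: the KL divergence is estimated from equal-width histograms with a small smoothing constant, and `bins`, `eps`, `trials`, and the function names are hypothetical. The 0.05 significance level matches the p-value cutoff the talk gives for rejecting the null hypothesis.

```python
import math
import random


def kl_divergence(x, y, bins=20, eps=1e-6):
    """Histogram estimate of KL(x || y) over a shared set of bins."""
    lo, hi = min(x + y), max(x + y)
    width = (hi - lo) / bins or 1.0  # avoid zero width if all values match

    def hist(samples):
        counts = [0] * bins
        for v in samples:
            counts[min(int((v - lo) / width), bins - 1)] += 1
        total = len(samples) + eps * bins
        # eps smoothing keeps every bin probability non-zero.
        return [(c + eps) / total for c in counts]

    px, py = hist(x), hist(y)
    return sum(a * math.log(a / b) for a, b in zip(px, py))


def kl_test(a_delays, p_delays, trials=500, alpha=0.05, seed=0):
    """Return (p_value, reject_null) for the null of equal distributions."""
    rng = random.Random(seed)
    observed = kl_divergence(a_delays, p_delays)
    pooled = list(a_delays) + list(p_delays)
    n_a = len(a_delays)
    exceed = 0
    for _ in range(trials):
        rng.shuffle(pooled)  # one uniform random partition of the pool
        if kl_divergence(pooled[:n_a], pooled[n_a:]) >= observed:
            exceed += 1
    p_value = (exceed + 1) / (trials + 1)
    return p_value, p_value < alpha
```

Random partitions of the pooled samples show how large the divergence gets by chance when both flows share one distribution; the observed divergence is significant only if it exceeds nearly all of them.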

• Follow-up test: compare all higher percentiles (50th - 90th) of the A and P delay distributions
– Redo the test, swapping A and P as inputs
– If this test fails, we state that delay discrimination is unknown
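The percentile comparison can be sketched in Python as follows. The step of 10 between the 50th and 90th percentiles, the nearest-rank percentile definition, and the verdict strings are assumptions made for illustration; the talk only says that all higher percentiles are compared and that the test is repeated with A and P swapped.

```python
import math


def percentile(samples, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def higher_percentiles_larger(x, y, percentiles=(50, 60, 70, 80, 90)):
    """True if every chosen percentile of x exceeds the same percentile of y."""
    return all(percentile(x, q) > percentile(y, q) for q in percentiles)


def delay_verdict(a_delays, p_delays):
    if higher_percentiles_larger(a_delays, p_delays):
        return "A delayed more than P"
    # Redo the test with the inputs swapped.
    if higher_percentiles_larger(p_delays, a_delays):
        return "P delayed more than A"
    return "unknown"
```

Requiring every percentile to agree makes the verdict conservative: distributions that cross each other fall into the "unknown" case instead of being labeled as discrimination.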

Delay Discrimination: Accuracy

Evaluate using simulations:
– Discrimination using SP and WFQ
– Skype iSAC packet trace as the A flow
– Cross-traffic: interactive TCP sessions (200 users)
– Half of the user traffic classified low-priority
– BLP, LIP durations: 30 s

90+% accuracy among detectable trials

95% confidence, 2% error margin

WFQ with weights 1:1.5 is similar to FCFS

[Figure: accuracy for FCFS, SP, and WFQ, by WFQ weights]

SP or WFQ?

SP-like and WFQ-like scheduling create different delays

Idea: some P packets serviced just after an A packet would:
– see only A's non-preemption delay (if any) under SP
– but see A's queuing delays under WFQ

Method: choose the subset of P samples received very close to, but after, an A packet

[Figure: distribution of the P subset. Labels: low-priority; SP (non-preemption); WFQ 1:2 (queuing)]

Overview

– High level design
– Detecting delay discrimination
– Detecting loss discrimination
– The DiffProbe tool

Detecting Loss Discrimination

Estimate the loss rates of the A and P flows during LIP as the fraction of packets lost

No loss discrimination: the two loss rates are equal

Loss discrimination: the two loss rates differ

[Figure: loss rates under WRR 1:3 with Drop-Longest-Queue (emulated)]

Detecting Loss Discrimination (2)

Pre-processing (to estimate the A and P loss rates):
– Pairing: same as that for delay discrimination
– ensures the A and P flows sample the same congestion events under DropTail/RED

Use the two-proportion test on the A and P loss rates

Unidentifiability: fewer than 10 dropped packets in each flow
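The loss comparison can be sketched in Python with the standard pooled two-proportion z-test. The 0.05 significance level and the function name are assumptions for this sketch, and the unidentifiable case is read literally as both flows having fewer than 10 dropped packets.

```python
import math


def loss_discrimination_test(lost_a, sent_a, lost_p, sent_p,
                             alpha=0.05, min_losses=10):
    """Two-sided two-proportion z-test on the A and P loss rates.

    Returns None when unidentifiable, otherwise (p_value, reject_null).
    alpha is an assumed significance level.
    """
    if lost_a < min_losses and lost_p < min_losses:
        return None  # too few drops in each flow to say anything

    rate_a, rate_p = lost_a / sent_a, lost_p / sent_p
    pooled = (lost_a + lost_p) / (sent_a + sent_p)
    se = math.sqrt(pooled * (1 - pooled) * (1 / sent_a + 1 / sent_p))
    if se == 0:
        return 1.0, False  # identical degenerate rates
    z = (rate_a - rate_p) / se
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail
    return p_value, p_value < alpha
```

For example, 100 losses out of 1000 A packets against 20 out of 1000 P packets gives a z-score above 7, so the null hypothesis of equal loss rates is rejected.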

Loss Discrimination: Accuracy

Buffer sizes set according to the bandwidth-delay product

90+% accuracy for discriminating configurations

WFQ 1:1.5 is similar to DT: similar loss rates

[Figure: WRED accuracy; f: min queue threshold of normal flows]

[Figure: Drop-Longest-Queue (WFQ) vs. DT]

Overview

– High level design
– Detecting delay discrimination
– Detecting loss discrimination
– The DiffProbe tool

Implementing DiffProbe

DiffProbe runs as client-server (~7,500 lines of code)
– Classifier types: port, payload
– A flow: Skype and Vonage voice traces
– P flow: randomize payload, port of A flow
– LIP, BLP durations: 30 s each

Pre-probing: estimate path capacity using packet trains
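As an illustration of capacity estimation from packet trains, here is a minimal Python sketch based on train dispersion: N back-to-back packets of L bytes that arrive spread over time Δ suggest a capacity of (N − 1)·8L/Δ. Taking the median over several trains is an assumption made here to damp cross-traffic noise, and the function names are hypothetical; the talk does not describe DiffProbe's exact estimator.

```python
import statistics


def train_capacity_bps(recv_times, packet_bytes):
    """Capacity implied by one train of equal-sized back-to-back packets."""
    dispersion = recv_times[-1] - recv_times[0]
    return (len(recv_times) - 1) * packet_bytes * 8 / dispersion


def estimate_capacity_bps(trains, packet_bytes):
    """Median of the per-train estimates (an assumed way to combine trains)."""
    return statistics.median(
        train_capacity_bps(times, packet_bytes) for times in trains
    )


# Eleven 1500-byte packets arriving 1 ms apart imply roughly 12 Mbps.
train = [i * 0.001 for i in range(11)]
capacity = estimate_capacity_bps([train], packet_bytes=1500)
```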

Experiments

Emulations:
– discriminating link configured using tc
– Pareto cross-traffic
– SP, WRR, and Drop-Longest-Queue discriminators
– No false positives or false negatives

Real-world experiments (Skype and Vonage):

[Figure: KL-test p-values across access ISP runs]

– We do not have ground truth
– A high KL-test p-value is a good “indicator” of no discrimination
– One ISP showed multi-path routing, which created different delays

Validation

ISPs have so far not disclosed details of application discrimination practices (if any): no ground truth!

Discrimination: a significant difference in the delays and/or losses of A and P
– Why? Controlled-environment trials!

Validation ideas?

Overview

– High level design
– Detecting delay discrimination
– Detecting loss discrimination
– The DiffProbe tool

ShaperProbe

A pre-probing module of DiffProbe to answer:
– Can we detect traffic shaping by ISPs?
– What is the shaping configuration?

Key idea: probe and detect level shifts in rate (the token bucket signature)

Example (upload): 7 Mbps -> 2 Mbps in 8 s
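The level-shift idea can be illustrated with a short Python sketch. This is not ShaperProbe's detection algorithm: it uses a simple least-squares change-point search, and `min_ratio`, the function name, and the returned field names are assumptions for illustration. The burst-size formula follows from the token bucket model: a bucket of depth σ refilled at rate ρ and drained at peak rate C empties after σ / (C − ρ) seconds.

```python
import statistics


def detect_level_shift(rates_bps, interval_s, min_ratio=1.5):
    """Find a single downward level shift in a series of received rates.

    rates_bps: one rate sample per interval_s seconds.
    Returns None if no shift of at least min_ratio is found.
    """
    if len(rates_bps) < 4:
        return None

    def sse(segment):
        mean = statistics.fmean(segment)
        return sum((r - mean) ** 2 for r in segment)

    # Split point that best explains the series as two constant levels.
    k = min(range(2, len(rates_bps) - 1),
            key=lambda i: sse(rates_bps[:i]) + sse(rates_bps[i:]))
    peak = statistics.median(rates_bps[:k])
    sustained = statistics.median(rates_bps[k:])
    if sustained <= 0 or peak / sustained < min_ratio:
        return None

    shift_time_s = k * interval_s
    return {
        "peak_bps": peak,
        "sustained_bps": sustained,
        "shift_time_s": shift_time_s,
        # Token bucket depth: sigma = (C - rho) * time until the bucket empties.
        "burst_bits": (peak - sustained) * shift_time_s,
    }
```

Applied to the example rates, a drop from 7 Mbps to 2 Mbps after 8 s corresponds to a bucket depth of (7 − 2) Mbps × 8 s = 40 Mbit, or 5 MB.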

ShaperProbe (contd.)

Deployed at Google M-Lab: 60,000+ runs so far

Who shapes traffic?

...among 700+ other ASes.

Thank You!

partha @ cc.gatech.edu


Hypothesis test:
– Null hypothesis: equal distributions
– Compute the Kullback-Leibler (KL) divergence of the pre-processed samples
– Bootstrap: compute the KL divergences of uniform random partitions of the combined samples; this gives us a KL distribution
– Reject the null hypothesis if the p-value is < 0.05


Follow-up test (if the KL test rejects the null hypothesis):
– Compare the higher percentiles of the A and P delay distributions
– Redo the test, swapping A and P as inputs
– If this test fails, we state that delay discrimination is unknown

SP or WFQ? (2)

For the distribution of this subset of P samples:
– SP if the 95th percentile of P delay ≈ the 5th percentile
– WFQ-like otherwise
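The subset selection and the SP/WFQ rule can be sketched in Python as below. How close "very close" is and how tight "≈" is are not given in the talk, so `window` and `tolerance` are hypothetical parameters the caller must choose, and the function names are illustrative only.

```python
import bisect
import math


def percentile(samples, q):
    """Nearest-rank percentile for q in (0, 100]."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(q * len(ordered) / 100))
    return ordered[rank - 1]


def p_subset_after_a(a_recv_times, p_samples, window):
    """Delays of P packets received within `window` seconds after an A packet.

    p_samples: list of (recv_time_s, queuing_delay_s).
    """
    a_sorted = sorted(a_recv_times)
    subset = []
    for recv_time, delay in p_samples:
        # Index of the first A packet received at or after this P packet.
        i = bisect.bisect_left(a_sorted, recv_time)
        if i > 0 and recv_time - a_sorted[i - 1] <= window:
            subset.append(delay)
    return subset


def classify_scheduler(subset_delays, tolerance):
    """SP if the subset's delays are tightly clustered, else WFQ-like."""
    spread = percentile(subset_delays, 95) - percentile(subset_delays, 5)
    return "SP" if spread <= tolerance else "WFQ-like"
```

Under SP these P packets wait only for the A packet already in service, so their delays cluster tightly; under WFQ they also inherit A's queuing delay, which widens the distribution.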

[Figure: distribution of the P subset under WFQ and under SP; WFQ-SP accuracy]