Ph.D. Thesis Presentation

Aleksandar Kuzmanovic

Edge-based Inference, Control, and DoS Resilience for the Internet

The Internet

A system of astonishing scale and complexity

[Figure: the network in 1969 (nodes UCLA, UCSB, SRI, UTAH) and in 2004]

Internet Design Principles

Network as a black-box

End-to-end argument [Clark84]
– The core is simple
– Intelligence at the endpoints

Implications
– Easy to upgrade the network
– Easy to incrementally deploy new services

Why End-Point Approach Today?

Scalability
– e2e scalability

Deployability
– IP and the network core are not extensible and are slowly evolving: IPv6 (10 years), IP Multicast (domain dependent)

Goal: Improve network performance right here – right now!

Network Performance

Internet traffic
– HTTP (web browsing)
– FTP (file transfer)

Fact: 95% of the traffic today is TCP-based

Performance
– QoS differentiation: net win for both HTTP and FTP flows; end-point-based two-level differentiation scheme
– Denial of Service: DoS attacks can demolish network performance; prevent DoS attacks via a robust end-point protocol design

End-Point Service Differentiation

TCP-Low Priority
– Utilizes only the excess network bandwidth

Key mechanism
– Early congestion indications: one-way packet delay

Performance
– Can improve HTTP file transfers by more than 90% when FTP flows use TCP-LP

Deployability
– No changes in the network core
– Sender-side modification of TCP

High-speed version developed in cooperation with SLAC
– Tested over Gb/s networks in the US

http://www.ece.rice.edu/networks/TCP-LP
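The key mechanism above (inferring congestion early from one-way packet delay) can be sketched as a simple threshold test. This is a minimal illustration, not the TCP-LP implementation: the class name, the smoothing gain of 1/8, and the threshold fraction of 0.15 are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DelayCongestionDetector:
    """Hypothetical early-congestion detector driven by one-way delay samples."""
    threshold: float = 0.15      # assumed fraction of the observed delay range
    gain: float = 0.125          # assumed smoothing gain for the delay average
    d_min: float = float("inf")  # smallest one-way delay seen so far
    d_max: float = 0.0           # largest one-way delay seen so far
    smoothed: Optional[float] = None

    def observe(self, one_way_delay: float) -> bool:
        """Record one delay sample; return True if congestion is indicated."""
        self.d_min = min(self.d_min, one_way_delay)
        self.d_max = max(self.d_max, one_way_delay)
        if self.smoothed is None:
            self.smoothed = one_way_delay
        else:
            self.smoothed = (1 - self.gain) * self.smoothed + self.gain * one_way_delay
        # Congestion is signalled once the smoothed delay climbs above a small
        # fraction of the range between the minimum and maximum observed delay.
        limit = self.d_min + (self.d_max - self.d_min) * self.threshold
        return self.smoothed > limit
```

A low-priority sender built on such a detector would back off as soon as `observe` returns True, well before packet loss occurs, which is what lets it yield to regular TCP flows.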

Denial of Service

A malicious way to consume resources in a network, a server cluster or in an end host, thereby denying service to other legitimate users

Example
– TCP's well-known vulnerability to high-rate non-responsive flows

[Figure: attacker and victim]

Design Principles - Revisited

Design Principles
– Intelligence at the endpoints
– The core is simple
– Trust and cooperation among the endpoints

Implications
– Easy to incrementally implement new services
– Easy to upgrade the network
– Large-scale system

Implement more intelligence at routers?
– Scalability issue
– Detecting misbehaving flows in routers is a hard problem: a needle in a haystack

[Figure: core routers]

Design Principles - Revisited

Design Principles
– Intelligence at the endpoints
– The core is simple
– Trust and cooperation among the endpoints

Implications
– Malicious clients may misuse the intelligence
– Hard to detect endpoint misbehavior
– Large-scale system

Implement more intelligence at routers?
– Scalability issue
– Detecting misbehaving flows in routers is a hard problem: a needle in a haystack

[Figure: core routers]

End-Point Protocol Design

Performance vs. Security
– End-point protocols are designed to maximize performance, but ignore security
– 95% of the Internet traffic is TCP traffic: this can have catastrophic consequences

DoS-resilient protocol design
– Jointly optimize performance and security
– Outperforms the core-based solutions

[Figure: endpoints]

Remaining Outline

End-point protocol vulnerabilities
– Low-rate TCP-targeted DoS attacks
– Receiver-based TCP stacks with a misbehaving receiver

Limitations of network-based solutions

DoS-resilient end-point protocol design

Low-Rate Attacks

TCP is vulnerable to low-rate DoS attacks

[Figure labels: DoS Rate; DoS Inter-burst Period; TCP; DoS]

TCP: a Dual Time-Scale Perspective

Two time-scales fundamentally required
– RTT time-scales (~10-100 ms): AIMD control
– RTO time-scales (RTO = SRTT + 4*RTTVAR): avoid congestion collapse

Lower-bounding the RTO parameter:
– [AllPax99]: minRTO = 1 sec to avoid spurious retransmissions
– RFC2988 recommends minRTO = 1 sec

The discrepancy between RTO and RTT time-scales is a key source of vulnerability to low-rate attacks
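The RTO computation on this slide can be sketched as follows. This is a minimal illustration of the RFC 2988 estimator (gains 1/8 and 1/4, K = 4) with the minRTO lower bound applied; the class and method names are hypothetical, and clock granularity is ignored for brevity.

```python
from dataclasses import dataclass
from typing import Optional

ALPHA = 1 / 8   # gain for the smoothed RTT (RFC 2988)
BETA = 1 / 4    # gain for the RTT variation (RFC 2988)
K = 4           # multiplier on RTTVAR in the RTO formula


@dataclass
class RtoEstimator:
    """Hypothetical helper that tracks SRTT/RTTVAR and returns a bounded RTO."""
    min_rto: float = 1.0            # lower bound in seconds
    srtt: Optional[float] = None
    rttvar: Optional[float] = None

    def update(self, rtt_sample: float) -> float:
        """Feed one RTT measurement (seconds); return the resulting RTO."""
        if self.srtt is None:
            # First measurement: SRTT = R, RTTVAR = R / 2.
            self.srtt = rtt_sample
            self.rttvar = rtt_sample / 2
        else:
            # RTTVAR is updated before SRTT, using the old SRTT.
            self.rttvar = (1 - BETA) * self.rttvar + BETA * abs(self.srtt - rtt_sample)
            self.srtt = (1 - ALPHA) * self.srtt + ALPHA * rtt_sample
        return max(self.min_rto, self.srtt + K * self.rttvar)
```

With RTTs in the 10-100 ms range, SRTT + 4*RTTVAR stays far below one second, so the returned value is pinned at minRTO. That fixed, predictable value is the time-scale gap the slide points to.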

The Low-Rate Attack

[Figure: attacker and victim; TCP sending rate vs. time and DoS rate vs. time]

The Low-Rate Attack

At a random initial time, a short burst (~RTT) is sufficient to create an outage
– Outage: an event of correlated packet losses that forces TCP to enter the RTO mechanism

The impact of the outage is distributed to all TCP flows

[Figure: attacker and victim; DoS rate vs. time with a short burst (~RTT) at a random initial phase; TCP sending rate vs. time with the resulting outage]

The Low-Rate Attack

The outage synchronizes all TCP flows
– All flows react simultaneously and identically: they back off for minRTO

The attacker stops transmitting to elude detection

[Figure: attacker and victim; TCP sending rate vs. time with the minRTO back-off interval; DoS rate vs. time with a random initial phase]

The Low-Rate Attack

Once the TCP flows try to recover, hit them again

Exploit protocol determinism

[Figure: attacker and victim; TCP sending rate vs. time with the minRTO back-off interval; DoS rate vs. time with a random initial phase]

The Low-Rate Attack

And keep repeating…

RTT-time-scale outages spaced at minRTO periods can deny service to TCP traffic

[Figure: attacker and victim; TCP sending rate vs. time with successive minRTO intervals; DoS rate vs. time with a random initial phase]
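The attack cycle on these slides can be captured in a simplified model. The sketch below is an idealization, not the thesis's analysis: it assumes every burst causes a timeout for a sending flow, that the flow then stays silent for exactly minRTO, and that it fills the link instantly once it resumes (no slow start, no exponential back-off). The function names are hypothetical.

```python
import math


def normalized_throughput(period: float, min_rto: float = 1.0) -> float:
    """Fraction of time a TCP flow can send under periodic outages.

    Assumes outages every `period` seconds and a flow that is silent for
    `min_rto` seconds after each outage that hits it.
    """
    if period <= 0:
        raise ValueError("period must be positive")
    # Bursts that arrive while the flow is silent do not matter; the flow is
    # hit again by the first burst at or after its retransmission time.
    periods_until_hit = math.ceil(min_rto / period)
    cycle = periods_until_hit * period
    return (cycle - min_rto) / cycle


def average_attack_rate(burst_rate: float, burst_length: float, period: float) -> float:
    """Long-term average rate of a square-wave attack stream."""
    return burst_rate * burst_length / period
```

In this model a period equal to minRTO (or an integer fraction of it) drives the flow's throughput to zero, while the attacker's average rate is only the burst rate scaled by the duty cycle `burst_length / period`.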

Vulnerability of Receiver-Based TCP to Misbehaviors

Sender-based TCP
– Control functions given to the sender

[Figure: TCP sender and TCP receiver. Labels: send buffer, recv buffer, flow control, reliability, congestion control, resequencing, RWND, CWND, SND.NXT, SND.UNA, RCV.NXT, RCV.WND, SEG.SEQ, SEG.ACK, SEG.WND]

Receiver-Based TCP

Receiver decides how much data can be sent, and which data should be sent by the sender

DATA-ACK communication becomes REQ-DATA

Example protocols
– TFRC [RFC3448], WebTP, and RCP

[Figure: RCP receiver and RCP sender. Labels: recv/req buffer, send buffer, flow control, reliability, congestion control, CWND, RWND, RCV.NXT, REQ.NXT, SND.NXT, SEG.REQ, SEG.DEQ, SEG.SEQ]

Why Receiver-Based TCP?

Example: Busy web server
– Receiver-based TCP distributes the state management across a large number of clients

Generally
– Whenever feedback is needed from the receiver, receiver-based TCP has an advantage over sender-based schemes due to the locality of information

Benefits [RCP03]

Performance                      Functionality
- Loss recovery                  - Seamless handoffs
- Congestion control             - Server migration
- Power management for           - Bandwidth aggregation
  mobile devices                 - Web response times
- Network-specific congestion control

Vulnerability

Receivers decide which packets are sent and when
– Receivers remotely control servers

Receivers have both the means and the incentive to manipulate the congestion control algorithm
– Means: open-source OS
– Incentive: faster web browsing and file download

[Figure: client (receiver) sends a request to the server (sender), which returns data]

Receiver-Induced DoS Attacks

Request flood attack
– A misbehaving receiver floods the server with requests; the server replies and congests the network

Goals
– Evaluate network-based schemes
– Develop end-point solutions

[Figure: malicious client sending requests to the server]

Remaining Outline

End-point protocol vulnerabilities

Limitations of network-based solutions
– Low-rate attacks
– Misbehaving receivers

DoS-resilient end-point protocol design

[Figure: core routers]

Random Early Detection with Preferential Dropping

RED-PD [MFW01] is designed to detect and thwart non-responsive flows
– Monitors only a subset of flows at the router and compares their rates to the target bandwidth (TB)
– TB is computed as the TCP-fair throughput for the observed Ploss and RTT = 40 ms
– If Ti > TB, flow i is malicious

Key questions
– Can algorithms intended to find high-rate attacks detect low-rate attacks?
– Could we tune the algorithms to detect low-rate attacks without having too many false alarms?
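The rate test on this slide can be sketched as below. The sketch assumes the simple square-root TCP throughput model, sqrt(1.5) / (RTT * sqrt(Ploss)) in packets per second, as the TCP-fair function; the function names and the dictionary of per-flow rates are hypothetical, and the code is a simplification of the comparison described above rather than the RED-PD implementation.

```python
import math

REDPD_RTT = 0.040  # fixed reference RTT of 40 ms, in seconds


def target_bandwidth(loss_rate: float, rtt: float = REDPD_RTT) -> float:
    """TCP-fair throughput (packets/s) under the square-root model (assumed)."""
    return math.sqrt(1.5) / (rtt * math.sqrt(loss_rate))


def flag_flows(flow_rates: dict, loss_rate: float) -> list:
    """Return the ids of monitored flows whose rate Ti exceeds TB."""
    tb = target_bandwidth(loss_rate)
    return [flow for flow, rate in flow_rates.items() if rate > tb]
```

Because the comparison uses a rate averaged over the monitoring interval, a stream that sends short bursts separated by long silences can keep its average below TB, which is the question the slide raises.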

The Time-Scale Issue

Scenario: 9 TCP Sack flows with RED and RED-PD
– RED-PD detects high-bandwidth flows (DoS inter-burst period < 500 ms)
– RED-PD fails to detect low-rate attacks (DoS inter-burst period > 500 ms)

CHOKe

CHOKe [PPP00] controls misbehaving flows by preventing any single flow from monopolizing buffer resources

Question:
– Why don't we use CHOKe against low-rate attacks?
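For reference, the CHOKe admission rule can be sketched as follows. This is a minimal illustration under stated assumptions: the queue is a plain list of flow ids (one entry per buffered packet), the average queue size is supplied by the caller, and the function name and parameters are hypothetical.

```python
import random


def choke_enqueue(queue, flow_id, avg_queue, min_th, max_th, max_p, rng=random):
    """Apply a CHOKe-style admission test; return True if the packet is queued.

    `queue` is a list of flow ids, mutated in place.
    """
    if avg_queue <= min_th:
        queue.append(flow_id)
        return True
    if queue:
        # Compare the arrival with one randomly chosen buffered packet.
        idx = rng.randrange(len(queue))
        if queue[idx] == flow_id:
            # Same flow: drop both the buffered packet and the arrival.
            del queue[idx]
            return False
    if avg_queue > max_th:
        return False
    # Otherwise fall back to a RED-style probabilistic drop.
    drop_p = max_p * (avg_queue - min_th) / (max_th - min_th)
    if rng.random() < drop_p:
        return False
    queue.append(flow_id)
    return True
```

A flow that occupies a large share of the buffer is likely to match the random draw and lose two packets at once. A low-rate attacker occupies the buffer only during short bursts, which is the case the following slides examine.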

Flow Filtering Scenario

Heterogeneous RTT environment:
– Short-RTT flows are the most vulnerable to low-rate attacks

[Figure: flows ordered by RTT; the outage length sets a cut-off time-scale that separates flows that pass from flows that do not pass]

Implications:
– Long-RTT flows 'collaborate' in the attack
– Less-than-bottleneck rates are needed to attack short-RTT flows

CHOKe and Flow Filtering

[Figure labels: C; TCP (short-RTT); TCP (long-RTT); DoS]

The DoS flow utilizes only 3.3% of the bottleneck capacity

CHOKe fails to throttle the low-rate attack against short-RTT flows

Request Flooding DoS Attack

Pushback [RFC3168]
– Network nodes coordinate efforts to detect a malicious (flooding) node

But in the request flooding scenario, the flooding machine is not malicious; it is itself a victim…

[Figure: misbehaving client and server]

Bandwidth Stealing

Fact
– Network-based schemes lack exact knowledge of end-point parameters

Example
– RED-PD doesn't know about RTT: TB = f(Ploss, RTT=40ms)

Implication
– Clients with RTT > 40 ms can exploit this vulnerability

Algorithmic misbehavior
– We generalized the TCP formula: T = f(Ploss, RTT, a, b)
– Our algorithm tells how to re-tune AIMD parameters to steal bandwidth, yet elude detection

[Figure: misbehaving client and server]
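The gap between TB = f(Ploss, RTT=40ms) and a flow's true fair share can be illustrated numerically. The sketch below is not the thesis's algorithm or its generalized formula: it assumes a commonly used square-root approximation for AIMD(a, b) throughput, with a the additive increase per RTT and b the fractional window decrease on loss, and the function names are hypothetical.

```python
import math

REFERENCE_RTT = 0.040  # RTT assumed by the router-side test, in seconds


def aimd_throughput(loss_rate: float, rtt: float, a: float = 1.0, b: float = 0.5) -> float:
    """Approximate AIMD(a, b) throughput in packets/s (assumed model).

    Reduces to sqrt(1.5) / (rtt * sqrt(loss_rate)) for standard TCP (a=1, b=0.5).
    """
    return math.sqrt(a * (2 - b) / (2 * b)) / (rtt * math.sqrt(loss_rate))


def undetected_headroom(loss_rate: float, rtt: float) -> float:
    """Ratio between the router's threshold and the flow's true TCP-fair rate."""
    threshold = aimd_throughput(loss_rate, REFERENCE_RTT)
    fair_rate = aimd_throughput(loss_rate, rtt)
    return threshold / fair_rate
```

Under this model a flow with an 80 ms RTT has a fair rate half of the 40 ms threshold, so the headroom is 2: it could double its rate (for example with a = 4 at the same b) and still sit at the threshold. A defense therefore needs the flow's actual RTT, which the router does not have.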

Summary of Limitations

Low-rate attacks
– RED-PD: issue of time-scales
– CHOKe: flow filtering

Misbehaving receivers
– Pushback: no distinction between causes and effects
– RED-PD: no knowledge of endpoint parameters

Can we do better from the endpoints?
– End-point parameter randomization
– End-point TCP-fairness verification

[Figure: endpoints]

End-point minRTO Randomization

Observe:
– Low-rate attacks exploit protocol determinism: minRTO = 1 sec

Question:
– Can minRTO randomization alleviate the problem?

Approach:
– Randomize the minRTO parameter: $\mathrm{minRTO} = \mathrm{uniform}(a, b)$

Insight:
– The most vulnerable time-scale is T = b: wait for flows to recover and then hit them again

End-point minRTO Randomization

TCP throughput formula on the T = b time-scale of the low-rate attack:

$$\rho(T=b) = \frac{b-a}{b}\cdot\frac{n}{n+1}$$

n: number of TCP flows; a, b: parameters of the uniform distribution

[Figure: $\rho(T=b;\ b=1)$ versus a, with curves for low and high aggregation; annotation: spurious retransmissions [AllPax99]]

[Figure: $\rho(T=b;\ a=1)$ versus b, with curves for low and high aggregation; annotation: bad for short-lived (HTTP) traffic]

End-point minRTO Randomization

TCP throughput formula on the T = b time-scale of the Shrew attack:

$$\rho(T=b) = \frac{b-a}{b}\cdot\frac{n}{n+1}$$

n: number of TCP flows; a, b: parameters of the uniform distribution

Randomizing the minRTO parameter shifts and smooths TCP's null time-scales

The fundamental tradeoff between TCP performance and vulnerability to low-rate DoS attacks remains
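The expression (b - a)/b * n/(n + 1) for the T = b time-scale can be checked numerically. The sketch below assumes an idealized model that is not spelled out on the slides: each of the n flows draws its minRTO independently from uniform(a, b), an outage occurs every b seconds, and the link is fully used from the moment the first flow recovers until the next outage. Function names are hypothetical.

```python
import random


def closed_form(n: int, a: float, b: float) -> float:
    """Normalized aggregate throughput (b - a)/b * n/(n + 1)."""
    return (b - a) / b * n / (n + 1)


def simulate(n: int, a: float, b: float, rounds: int = 20000, seed: int = 0) -> float:
    """Monte Carlo estimate of the same quantity under the assumed model."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(rounds):
        # The link becomes busy when the earliest flow times out and resumes.
        first_recovery = min(rng.uniform(a, b) for _ in range(n))
        total += (b - first_recovery) / b
    return total / rounds
```

The n/(n + 1) factor comes from the expected minimum of n uniform draws, a + (b - a)/(n + 1): with more flows, some flow is likely to recover soon after a, so high aggregation helps. Widening [a, b] raises throughput under attack but brings the costs the slides name: spurious retransmissions and worse behavior for short-lived traffic.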

An End-Point Solution

Sender-side verification:
– Ping Agent: measures RTT without cooperation from the receiver
– TFRC Agent: computes the “TCP-fair” rate
– Control Agent: enforces the sending rate

[Figure: sender-side architecture with the send buffer (SND.NXT; SEG.SEQ, SEG.REQ, SEG.DEQ), Ping Agent (PNG.SND, PNG.RCV), TFRC Agent, and Control Agent; signals shown: RTT, Ploss, measured throughput, computed throughput]
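The verification step can be sketched as a comparison between the measured throughput and a TCP-fair rate computed at the sender. The sketch uses the TFRC throughput equation from RFC 3448 with b = 1 and t_RTO = 4*RTT; the function names and the `tolerance` margin are hypothetical, and the code is not the thesis's Control Agent.

```python
import math


def tfrc_rate(packet_size: float, rtt: float, loss_rate: float,
              b: int = 1, t_rto: float = None) -> float:
    """TCP-fair sending rate in bytes/s from the RFC 3448 throughput equation."""
    if t_rto is None:
        t_rto = 4 * rtt  # simplification recommended by RFC 3448
    denom = (rtt * math.sqrt(2 * b * loss_rate / 3)
             + t_rto * (3 * math.sqrt(3 * b * loss_rate / 8))
             * loss_rate * (1 + 32 * loss_rate ** 2))
    return packet_size / denom


def is_misbehaving(measured_rate: float, packet_size: float, rtt: float,
                   loss_rate: float, tolerance: float = 1.2) -> bool:
    """Flag a receiver whose requested rate exceeds the fair rate by a margin.

    `tolerance` is an assumed safety margin to limit false positives.
    """
    return measured_rate > tolerance * tfrc_rate(packet_size, rtt, loss_rate)
```

Here `rtt` would come from the sender's own probes and `loss_rate` from what the sender observes, so the check does not rely on values reported by the receiver.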

Evaluation

Scenarios:
– With a behaving receiver (to study false positives)
– With misbehaving receivers (to study detection)

The end-point scheme is able to detect even very moderate misbehaviors

Slight inaccuracy for higher packet loss ratios (due to TFRC conservatism)

Summary

Denial of Service attacks represent a fundamental threat to today’s Internet

Network-based solutions are necessary, yet are quite often very limited

End-point protocols are optimized for performance, not security

DoS-resilient protocol design
– Parameter randomization
– Ability to control the other end-point

Conclusions

Improve network performance via
– End-point QoS differentiation
– DoS-resilient protocol design

QoS differentiation
– Developed, implemented, and tested TCP-LP
– Can significantly improve the network performance

Denial of Service
– Pro-active approach
– Jointly consider both performance and security aspects

Publications

[1] Measuring Service in Multi-Class Networks, In IEEE INFOCOM 2001.

[2] Measurement Based Characterization and Classification of QoS-Enhanced Systems, In IEEE TPDS, 14(7): 671-685, 2003.

[3] TCP-LP: A Distributed Algorithm for Low Priority Data Transfer, In IEEE INFOCOM 2003.

[4] TCP-LP: Low-Priority Service via End-Point Congestion Control, To appear in IEEE/ACM ToN.

[5]* HSTCP-LP: A Protocol for Low-Priority Bulk Data Transfer in High-Speed High-RTT Networks, In PFLDnet 2004.

[6] Low-Rate TCP-Targeted Denial of Service Attacks (The Shrew vs. the Mice and Elephants), In ACM SIGCOMM 2003.

[7] Low-Rate TCP-Targeted Denial of Service Attacks and Counter Strategies, Submitted to IEEE/ACM ToN.

[8] A Performance vs. Trust Perspective in the Design of End-Point Congestion Control Protocols, In IEEE ICNP 2004.

[9] Receiver-based Congestion Control with a Misbehaving Receiver: Vulnerabilities and End-Point Solutions, Submitted to IEEE/ACM ToN.

* With R. Les Cottrell, SLAC.