FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

24
FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology

Transcript of FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

Page 1: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

FAST TCP in Linux

Cheng Jin David Wei Steven Low California Institute of Technology

Page 2: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Outline

Overview of FAST TCP. Design details. SC2002 experiments. On-going experiments.

Page 3: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Motivation

Clear and present need in the HENP community.

High capacity network infrastructure available today.

Existing TCP protocol does not perform well.

Page 4: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

FAST vs Linux TCP

Distance = 10,037 km; Delay = 180 ms; MTU = 1500 B; Duration: 3600 sLinux TCP Experiments: Jan 28-29, 2003

1333173.182Linux TCPtxqlen=100

3909319.352Linux TCPtxqlen=10000

78 1851.86 1Linux TCPtxqlen=100

753

387

111

Transfer(GB)

1,7972FAST

19.11.2002

9259.281FAST

19.11.2002

2662.671Linux TCP

txqlen=10000

Throughput(Mbps)

BmpsPeta

Flows

18.03

Page 5: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Aggregate Throughput

Linux TCP Linux TCP FAST

Average utilization

19%

27%

92%FAST Standard MTU of 1500 bytes Utilization averaged over one hour

txq=100 txq=10000

95%

16%

48%

Linux TCP Linux TCP FAST

2G

1G

Page 6: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Summary of Changes

RTT estimation: high-resolution timer. Fast convergence to equilibrium. Delay monitoring in equilibrium. Pacing cwnd to smooth out traffic.

Page 7: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

FAST TCP Flow Chart

Slow Start

Fast Convergence

Equilibrium

Loss Recovery

NormalRecovery

Time-out

Page 8: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

RTT Estimation

Measure queueing delay. Kernel timestamp with s resolution. Exponential averaging of RTT samples.

Page 9: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Fast Convergence

Monitor the per-ack queueing delay to avoid overshoot.

Window increment scales with the “distance” from equilibrium.

Page 10: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Equilibrium

Define an equilibrium neighborhood around the target queueing delay.

Small cwnd adjustment in large time-scale -- one per RTT.

Per-ack delay monitoring to enable timely detection of changes in equilibrium.

Page 11: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Pacing

What do we pace? Increment to window.

Time-Driven vs. event-driven. Trade-off between complexity and

performance. Timer resolution is important.

Page 12: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Time-Based Pacing

cwnd increments are scheduled at fixed intervals.

data ack data

Page 13: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Event-Based Pacing

Detect sufficiently large gap between consecutive bursts and delay cwnd increment until the end of each such burst.

Page 14: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

SCinet Caltech-SLAC experiments

http://netlab.caltech.edu/FAST

SC2002 Baltimore, Nov 2002

Experiment

Sunnyvale Baltimore

Chicago

Geneva

3000km 1000km

70

00

km

C. Jin, D. Wei, S. LowFAST Team and Partners

Internet: distributed feedbacksystem Rf (s)

Rb’(s)

x

p

TCP AQM

Theory

FAST TCP Standard MTU Peak window = 14,255 pkts Throughput averaged over > 1hr 925 Mbps single flow/GE card

9.28 petabit-meter/sec 1.89 times LSR

8.6 Gbps with 10 flows 34.0 petabit-meter/sec 6.32 times LSR

21TB in 6 hours with 10 flows

Implementation Sender-side modification Delay based

Highlights

1 2

1

2

7

9

10G

enev

a-Sunnyv

ale

Baltim

ore-

Sunn

yval

eFA

ST

I2 L

SR

#flows

Page 15: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Network

(Sylvain Ravot, Caltech/CERN)

Page 16: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

FAST BMPS

Internet2Land Speed

Record

FAST

1 2

1

2

7

9

10

Geneva-S

unnyvale

Baltim

ore-S

unnyvale

#flows

FAST Standard MTU Throughput averaged over > 1hr

Page 17: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Aggregate Throughput

1 flow 2 flows 7 flows 9 flows 10 flows

Average utilization

95%

92%

90%

90%

88%FAST Standard MTU of 1500 bytes Utilization averaged over one hour

1hr 1hr 6hr 1.1hr 6hr

Page 18: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Caltech-SLAC Entry

Rapid recoveryafter possiblehardware glitch

Power glitch -- Reboot

100-200Mbps ACK traffic

Page 19: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

SCinet Caltech-SLAC experiments

http://netlab.caltech.edu/FAST

SC2002 Baltimore, Nov 2002

PrototypeC. Jin, D. Wei

TheoryD. Choe (Postech/Caltech), J. Doyle, S. Low, F. Paganini (UCLA), J. Wang, Z. Wang (UCLA)

Experiment/facilities Caltech: J. Bunn, C. Chapman, C. Hu (Williams/Caltech), H. Newman, J. Pool, S.

Ravot (Caltech/CERN), S. Singh CERN: O. Martin, P. Moroni Cisco: B. Aiken, V. Doraiswami, R. Sepulveda, M. Turzanski, D. Walsten, S. Yip DataTAG: E. Martelli, J. P. Martin-Flatin Internet2: G. Almes, S. Corbato Level(3): P. Fernes, R. Struble SCinet: G. Goddard, J. Patton SLAC: G. Buhrmaster, R. Les Cottrell, C. Logg, I. Mei, W. Matthews, R. Mount, J.

Navratil, J. Williams StarLight: T. deFanti, L. Winkler TeraGrid: L. Winkler

Major sponsorsARO, CACR, Cisco, DataTAG, DoE, Lee Center, NSF

Acknowledgments

Page 20: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Fairness Experiment I

Page 21: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Fairness Experiment II

Page 22: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

10GbE Mystery

Frequent packet loss with 1500-byte MTU. None with larger MTUs.

Packet loss even when cwnd is capped at 300 - 500 packets.

Routers have large queue size of 4000 packets.

Packets captured at both sender and receiver using tcpdump.

Page 23: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

How Did the Loss Happen?

loss detected

Page 24: FAST TCP in Linux Cheng Jin David Wei Steven Low California Institute of Technology.

netlab.caltech.edu

Future Work

Redesigning a beta version. Stability and fairness experiments. More 10 GbE WAN tests. Extensive testing on shared links.