FAST Protocols for High Speed Network

FAST Protocols for High Speed Network David Wei @ netlab, Caltech For HENP WG, Feb 1st 2003


Transcript of FAST Protocols for High Speed Network

Page 1: FAST  Protocols for High Speed Network

FAST Protocols for High Speed Network

David Wei @ netlab, Caltech

For HENP WG, Feb 1st 2003

Page 2: FAST  Protocols for High Speed Network

FAST Protocols for Ultrascale Networks

netlab.caltech.edu/FAST

Internet: distributed feedback control system
• TCP: adapts sending rate to congestion
• AQM: feeds back congestion information

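Theory

[Block diagram: TCP/AQM feedback loop. Source rates x pass through the forward routing R_f(s) to give link rates y; AQM turns y into link prices p; p passes through the backward routing R_b'(s) to give the prices q seen by TCP.]

AQM (link) dynamics: $\dot{p}_l = \frac{1}{c_l}\,(y_l(t) - c_l)$

[TCP (source) rate equation in $x_i$, $q_i$, $T_i$ and $d_i$: garbled in the transcript and not recoverable.]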

Experiment

[Network map. Labels: Calren2/Abilene, Chicago, Amsterdam, CERN, Geneva, SURFNet, StarLight]

• WAN in Lab, Caltech
• Research & production networks
• Multi-Gbps, 50-200 ms delay

Implementation

[State diagram: slow start, equilibrium, FAST recovery, FAST retransmit, timeout; speed labels 155 Mb/s and 10 Gb/s]

People

• Faculty: Doyle (CDS, EE, BE), Low (CS, EE), Newman (Physics), Paganini (UCLA)
• Staff/Postdoc: Bunn (CACR), Jin (CS), Ravot (Physics), Singh (CACR)
• Students: Choe (Postech/CIT), Hu (Williams), J. Wang (CDS), Z. Wang (UCLA), Wei (CS)
• Industry: Doraiswami (Cisco), Yip (Cisco)
• Partners: CERN, Internet2, CENIC, StarLight/UI, SLAC, AMPATH, Cisco

Page 3: FAST  Protocols for High Speed Network

FAST project

• Goal: protocols (TCP/AQM) for ultrascale networks
– Bandwidth: 10 Mbps to > 100 Gbps
– Delay: 50-200 ms
– Research: theory, algorithms, design, implementation, demo, deployment

• Urgent need:
– Large amount of data to share (500 TB at SLAC)
– Typical file transfer at SLAC: ~1 TB (about 15 min at 10 Gbps)
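The "15 min at 10 Gbps" figure can be checked with a short calculation. The sketch below assumes a decimal terabyte (10^12 bytes) and a link that sustains its full rate with no protocol overhead; neither assumption is stated on the slide, and the helper name is illustrative.

```python
def transfer_time_seconds(size_bytes: float, rate_bps: float) -> float:
    """Ideal time to move size_bytes over a path that sustains rate_bps."""
    return size_bytes * 8 / rate_bps


TB = 10**12  # decimal terabyte (assumption; the slide does not say which)

# Rates taken from the capacities mentioned in the deck.
for label, rate_bps in [("155 Mbps", 155e6), ("1 Gbps", 1e9), ("10 Gbps", 10e9)]:
    minutes = transfer_time_seconds(1 * TB, rate_bps) / 60
    print(f"1 TB at {label}: {minutes:.1f} min")
```

At full line rate the 10 Gbps case comes out a little under the quoted 15 minutes, which is consistent with the slide's rounded figure.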

Page 4: FAST  Protocols for High Speed Network

HEP Network (DataTAG)

[Network map. Labels: NL SURFnet, UK SuperJANET4, It GARR-B, Fr Renater, GEANT, GENEVA, New York, STAR-TAP, STARLIGHT, ABILENE, ESNET, CALREN; wave triangle]

• 2.5 Gbps wavelength triangle in 2002
• 10 Gbps triangle in 2003

Newman (Caltech)

Page 5: FAST  Protocols for High Speed Network

Projected performance

ns-2: capacity = 155 Mbps, 622 Mbps, 2.5 Gbps, 5 Gbps, 10 Gbps; 100 sources; 100 ms round-trip propagation delay

[Plot: projected performance by year and capacity. ’01: 155 Mbps, ’02: 622 Mbps, ’03: 2.5 Gbps, ’04: 5 Gbps, ’05: 10 Gbps]

J. Wang (Caltech)

Page 6: FAST  Protocols for High Speed Network

Throughput as function of the time

Chicago -> CERN

Linux kernel 2.4.19

Traffic generated by iperf (throughput measured over the last 5 s)

TCP single stream

RTT = 119ms MTU = 1500

Duration of the test : 2 hours

[Plot: throughput (Mb/s, axis 0-500) vs. time (s, axis 0-7000)]

By Sylvain Ravot (Caltech)

Current TCP (Linux Reno)

Page 7: FAST  Protocols for High Speed Network

As MTU increases…

1.5K, 4K, 9K …

[Plot: throughput (Mb/s, axis 0-1000) vs. time (s, axis 0-3000) for MTU = 1498, 3998 and 8988]

By Sylvain Ravot (Caltech)

Current TCP (Linux Reno)
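One reason a larger MTU helps Reno on a long path is that congestion avoidance grows the window by about one packet per round trip, so bigger packets regain lost bandwidth faster. The back-of-envelope sketch below is not from the slides: it assumes a 1 Gb/s path, reuses the 119 ms RTT from the Chicago -> CERN test, treats the MTU as the packet size, and models a single loss that halves the window.

```python
def reno_recovery_seconds(capacity_bps: float, rtt_s: float, mtu_bytes: int) -> float:
    """Time for congestion avoidance to regrow the window after one halving.

    Assumes the window that fills the path is capacity * RTT / packet size,
    that a loss halves it, and that it then grows by one packet per RTT.
    """
    full_window = capacity_bps * rtt_s / (8 * mtu_bytes)  # packets in flight
    return (full_window / 2) * rtt_s


CAPACITY_BPS = 1e9  # assumed 1 Gb/s path (not stated on the slide)
RTT_S = 0.119       # RTT reported for the Chicago -> CERN test

for mtu in (1500, 4000, 9000):
    t = reno_recovery_seconds(CAPACITY_BPS, RTT_S, mtu)
    print(f"MTU {mtu:>4} B: about {t:,.0f} s to recover from one loss")
```

Under these assumptions the recovery time scales inversely with the MTU, which matches the direction of the measured curves.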

Page 8: FAST  Protocols for High Speed Network

Better?????

By Some Dreamers (Somewhere)

Page 9: FAST  Protocols for High Speed Network

FAST

Network: CERN (Geneva) – SLAC (Sunnyvale), GE, standard MTU

Sunnyvale -> CERN

Linux kernel 2.4.18-FAST enabled

RTT = 180 ms

MTU = 1500

By C. Jin & D. Wei (Caltech)

Page 10: FAST  Protocols for High Speed Network

Theoretical Background

Page 11: FAST  Protocols for High Speed Network

Congestion control

[Diagram: sources sending at rates $x_i(t)$ over shared links]

Routing matrix: $R_{li} = 1$ if source $i$ uses link $l$, and $R_{li} = 0$ if source $i$ does not use link $l$

Page 12: FAST  Protocols for High Speed Network

Congestion control

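[Diagram: sources send at rates $x_i(t)$; links feed back congestion measures $p_l(t)$]

Example congestion measure $p_l(t)$
– Loss (Reno)
– Queueing delay (Vegas)

Routing: $R_{li} = 1$ if source $i$ uses link $l$

Aggregate rate at link $l$: $y_l = \sum_i R_{li}\, x_i$

AQM: $y_l(t) \to p_l(t)$

TCP: $x_i(t)$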
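The routing relation on this slide is a matrix-vector product. The sketch below uses a made-up two-link, three-source topology to show how link rates follow from source rates. The second helper, which sums link measures along each source's path to get q, is an assumption: q appears on the overview diagram but is not defined in the deck.

```python
# Hypothetical 2-link, 3-source topology (not from the slides).
# R[l][i] = 1 if source i uses link l, else 0.
R = [
    [1, 1, 0],  # link 0 carries sources 0 and 1
    [1, 0, 1],  # link 1 carries sources 0 and 2
]
x = [10.0, 20.0, 30.0]  # source rates x_i, e.g. in Mb/s


def link_rates(R, x):
    """Aggregate rate y_l = sum_i R_li * x_i at every link."""
    return [sum(r_li * x_i for r_li, x_i in zip(row, x)) for row in R]


def path_prices(R, p):
    """End-to-end measure q_i = sum_l R_li * p_l for every source (assumed form)."""
    n_links, n_sources = len(R), len(R[0])
    return [sum(R[l][i] * p[l] for l in range(n_links)) for i in range(n_sources)]


y = link_rates(R, x)
q = path_prices(R, [0.5, 0.25])
print("link rates:", y)
print("path prices:", q)
```

Source 0 crosses both links, so it contributes to both link rates and sees the sum of both link measures.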

Page 13: FAST  Protocols for High Speed Network

TCP/AQM

• Congestion control is a distributed asynchronous algorithm to share bandwidth

• It has two components
– TCP: adapts sending rate (window) to congestion
– AQM: adjusts & feeds back congestion information

• They form a distributed feedback control system
– Equilibrium & stability depend on both TCP and AQM
– And on delay, capacity, routing, #connections

TCP (sets $x_i(t)$): Reno, Vegas

AQM (sets $p_l(t)$): DropTail, RED, REM/PI, AVQ

Page 14: FAST  Protocols for High Speed Network

Methodology

Protocol (Reno, Vegas, RED, REM/PI…)

$x(t+1) = F(p(t), x(t))$

$p(t+1) = G(p(t), x(t))$

Equilibrium
• Performance: throughput, loss, delay
• Fairness
• Utility

Dynamics
• Local stability
• Cost of stabilization
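The pair of update rules F and G can be explored numerically. The sketch below is a toy instance on a single link and is not the FAST algorithm: F is assumed to be a rate that is inversely proportional to the price, G is assumed to be a price that integrates excess demand, and the weights, capacity and step size gamma are all illustrative values.

```python
def simulate(weights, capacity, gamma=1e-4, p0=0.1, steps=500):
    """Iterate x(t+1) = F(p(t), x(t)), p(t+1) = G(p(t), x(t)) on one link.

    F and G are illustrative choices, not the FAST algorithm:
      F: each source sets x_i = w_i / p  (rate falls as congestion rises)
      G: the link integrates excess demand, p += gamma * (sum(x) - capacity)
    """
    p = p0
    x = [w / p for w in weights]
    for _ in range(steps):
        y = sum(x)                                      # aggregate link rate
        next_p = max(1e-9, p + gamma * (y - capacity))  # G(p(t), x(t))
        next_x = [w / p for w in weights]               # F(p(t), x(t))
        p, x = next_p, next_x
    return p, x


p_eq, x_eq = simulate(weights=[1.0, 2.0, 1.0], capacity=100.0)
print(f"price {p_eq:.4f}, rates {[round(r, 2) for r in x_eq]}")
```

In this toy setting the iteration settles where total demand equals capacity, with rates in proportion to the weights. That is the equilibrium side of the methodology; whether and how fast it settles depends on gamma, which is the dynamics side.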

Page 15: FAST  Protocols for High Speed Network

Goal: Fast AQM Scalable TCP

• Equilibrium properties
– Uses end-to-end delay (and loss) as congestion measure
– Achieves any desired fairness, expressed by utility function
– Very high bandwidth utilization (99% in theory)

• Stability properties
– Stability for arbitrary delay, capacity, routing & load
– Good performance
  • Negligible queueing delay & loss introduced by the protocol
  • Fast response
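One common way to make "fairness, expressed by utility function" concrete is the network utility maximization problem. The formulation below is the standard one from the literature and is offered as an assumed reading of this slide; the symbols $U_i$ and $\alpha$ are not defined anywhere in the deck, while $R_{li}$, $x_i$ and $c_l$ are used as on the earlier slides.

```latex
\max_{x \ge 0} \; \sum_i U_i(x_i)
\quad \text{subject to} \quad
\sum_i R_{li}\, x_i \le c_l \quad \text{for every link } l

% One widely used family of utilities (alpha-fairness):
U_i(x_i) =
\begin{cases}
  \log x_i, & \alpha = 1 \\[2pt]
  \dfrac{x_i^{1-\alpha}}{1-\alpha}, & \alpha \ne 1,\ \alpha > 0
\end{cases}
```

With $\alpha = 1$ the optimum is proportionally fair, and as $\alpha$ grows it approaches max-min fairness, so choosing the utility function selects the fairness criterion.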

Page 16: FAST  Protocols for High Speed Network

Implementation and Experiment

Page 17: FAST  Protocols for High Speed Network

Implementation

First version (demonstrated at the SuperComputing Conference, Nov 2002):

• Sender-side kernel modification (good for file-sharing services)

• Challenges:
– Effects ignored in theory
– Large window size and high speed

Page 18: FAST  Protocols for High Speed Network

SCinet Caltech-SLAC experiments

netlab.caltech.edu/FAST

SC2002 Baltimore, Nov 2002

Network topology

[Map: Sunnyvale, Baltimore, Chicago, Geneva; link lengths 3000 km, 1000 km and 7000 km]

C. Jin, D. Wei, S. Low
FAST Team and Partners

Highlights

FAST TCP
• Standard MTU
• Peak window = 14,255 pkts
• Throughput averaged over > 1 hr
• 925 Mbps single flow/GE card: 9.28 petabit-meter/sec, 1.89 times LSR
• 8.6 Gbps with 10 flows: 34.0 petabit-meter/sec, 6.32 times LSR
• 21 TB in 6 hours with 10 flows

[Bar chart: FAST vs. I2 LSR by #flows (1, 2, 7, 9, 10) on the Geneva-Sunnyvale and Baltimore-Sunnyvale paths]
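The peak window on this slide can be compared with the bandwidth-delay product of the single-flow run. The sketch below takes 925 Mbps and 1500-byte packets from this slide and the 180 ms RTT from the Sunnyvale -> CERN slide. Treating the MTU as the packet size and ignoring header overhead are simplifying assumptions.

```python
def window_packets(throughput_bps: float, rtt_s: float, packet_bytes: int) -> float:
    """Packets that must be in flight to sustain throughput_bps over rtt_s."""
    return throughput_bps * rtt_s / (8 * packet_bytes)


# 925 Mbps single flow, 180 ms RTT, 1500-byte packets.
w = window_packets(925e6, 0.180, 1500)
print(f"window needed: about {w:,.0f} packets")
```

The estimate lands slightly below the reported peak of 14,255 packets, as expected when the average throughput is used rather than the peak.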

Page 19: FAST  Protocols for High Speed Network

FAST BMPS

Record            Date        Flows  Bmps (peta)  Thruput (Mbps)  Distance (km)  Delay (ms)  MTU (B)  Duration (s)  Transfer (GB)  Path
Alaska-Amsterdam  9.4.2002    1      4.92         401             12,272         -           -        13            0.625          Fairbanks, AK – Amsterdam, NL
MS-ISI            29.3.2000   2      5.38         957             5,626          -           4,470    82            8.4            MS, WA – ISI, VA
Caltech-SLAC      19.11.2002  1      9.28         925             10,037         180         1,500    3,600         387            CERN – Sunnyvale
Caltech-SLAC      19.11.2002  2      18.03        1,797           10,037         180         1,500    3,600         753            CERN – Sunnyvale
Caltech-SLAC      18.11.2002  7      24.17        6,123           3,948          85          1,500    21,600        15,396         Baltimore – Sunnyvale
Caltech-SLAC      19.11.2002  9      31.35        7,940           3,948          85          1,500    4,030         3,725          Baltimore – Sunnyvale
Caltech-SLAC      20.11.2002  10     33.99        8,609           3,948          85          1,500    21,600        21,647         Baltimore – Sunnyvale

Mbps = 10^6 b/s; GB = 2^30 bytes

• C. Jin, D. Wei, S. Low
• FAST Team and Partners
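The Bmps and Transfer columns follow from the throughput, distance and duration columns. The sketch below recomputes them for three of the Caltech-SLAC rows, using the footnote's definitions of Mbps and GB. The function names are illustrative, and small differences from the table are expected because the listed throughputs are rounded.

```python
# (flows, throughput in Mbps, distance in km, duration in s) from the table
RUNS = [
    (1, 925, 10_037, 3_600),
    (7, 6_123, 3_948, 21_600),
    (10, 8_609, 3_948, 21_600),
]


def bmps_peta(throughput_mbps: float, distance_km: float) -> float:
    """Bandwidth-distance product in petabit-meters per second."""
    return (throughput_mbps * 1e6) * (distance_km * 1e3) / 1e15


def transfer_gb(throughput_mbps: float, duration_s: float) -> float:
    """Data moved, in GB of 2**30 bytes as the table footnote defines them."""
    return throughput_mbps * 1e6 * duration_s / 8 / 2**30


for flows, mbps, km, secs in RUNS:
    print(f"{flows:>2} flows: {bmps_peta(mbps, km):.2f} Pb-m/s, "
          f"{transfer_gb(mbps, secs):,.0f} GB")
```

Dividing a FAST Bmps value by the earlier record in the table gives the "times LSR" ratios quoted in the highlights, for example 9.28 / 4.92 for the single-flow run.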


Page 23: FAST  Protocols for High Speed Network

FAST Aggregate throughput

Flows                1      2      7      9       10
Average utilization  95%    92%    90%    90%     88%
Duration             1 hr   1 hr   6 hr   1.1 hr  6 hr

FAST
• Standard MTU
• Utilization averaged over > 1 hr

C. Jin, D. Wei, S. Low

Page 24: FAST  Protocols for High Speed Network

FAST vs Linux TCP (2.4.18-3)

[Bar charts: average utilization of Linux TCP (txq=100), Linux TCP (txq=10000) and FAST, in 1G and 2G panels. Values in extraction order: 19%, 27%, 92%; 95%, 16%, 48%]

FAST
• Standard MTU
• Utilization averaged over 1 hr

C. Jin (Caltech)

Page 25: FAST  Protocols for High Speed Network

Trial Deployment

FAST kernel installed:
• SLAC: Les Cottrell, etc. (www-iepm.slac.stanford.edu/monitoring/bulk/fast)
• FermiLab: Michael Ernst, etc.

Coming soon:
• 10-Gbps NIC testing (Sunnyvale - CERN)
• Internet2
• …

Page 26: FAST  Protocols for High Speed Network

Detailed Information:

• Home Page: http://Netlab.caltech.edu/FAST

• Theory: http://netlab.caltech.edu/FAST/overview.html

• Implementation & Testing: http://netlab.caltech.edu/FAST/software.html

• Publications: http://netlab.caltech.edu/FAST/publications.html

Page 27: FAST  Protocols for High Speed Network

FAST

netlab.caltech.edu/FAST

Acknowledgments

• Theory: D. Choe (Postech/Caltech), J. Doyle, S. Low, F. Paganini (UCLA), J. Wang, Z. Wang (UCLA)

• Prototype: C. Jin, D. Wei

• Experiment/facilities
– Caltech: J. Bunn, C. Chapman, C. Hu (Williams/Caltech), H. Newman, J. Pool, S. Ravot (Caltech/CERN), S. Singh
– CERN: O. Martin, P. Moroni
– Cisco: B. Aiken, V. Doraiswami, R. Sepulveda, M. Turzanski, D. Walsten, S. Yip
– DataTAG: E. Martelli, J. P. Martin-Flatin
– Internet2: G. Almes, S. Corbato
– Level(3): P. Fernes, R. Struble
– SCinet: G. Goddard, J. Patton
– SLAC: G. Buhrmaster, R. Les Cottrell, C. Logg, I. Mei, W. Matthews, R. Mount, J. Navratil, J. Williams
– StarLight: T. deFanti, L. Winkler
– TeraGrid: L. Winkler

• Major sponsors: ARO, CACR, Cisco, DataTAG, DoE, Lee Center, NSF

Page 28: FAST  Protocols for High Speed Network

Thanks

Questions?