Masaki Hirabaru <[email protected]>
CRL, Japan
ITRC MAI BoF
Kochi, November 6, 2003
TCP Performance Measurement in High-Bandwidth, High-Latency Networks
Acknowledgements
• David Lapsley, Haystack Observatory, MIT
• Koyama Yasuhiro, Kashima Space Research Center, CRL
• Junichi Nakajima, Kashima Space Research Center, CRL
• Jouko Ritakari, Metsähovi Radio Observatory, HUT
• Katsushi Kobayashi, Information and Network Systems, CRL
• CRL RDNP / Tokyo XP / TransPAC / Abilene / Caltech
Background
• e-VLBI – geographically distributed observation, interconnecting radio antennas around the world
• Gigabit/real-time VLBI – multi-gigabit rate sampling
VLBI (Very Long Baseline Interferometry)
Motivations
• MIT Haystack – CRL Kashima e-VLBI Experiment on August 27, 2003, to measure UT1-UTC in 24 hours
  – 41.54 GB CRL => MIT at 107 Mbps (~50 mins)
  – 41.54 GB MIT => CRL at 44.6 Mbps (~120 mins)
  – RTT ~220 ms, UDP throughput 300-400 Mbps; however, TCP ~6-8 Mbps (per session, tuned)
  – BBFTP with 5 x 10 TCP sessions to gain performance
• HUT – CRL Kashima Gigabit VLBI Experiment
  – RTT ~325 ms, UDP throughput ~70 Mbps; however, TCP ~2 Mbps (as is), ~10 Mbps (tuned)
  – NetAnts (5 TCP sessions with FTP stream restart extension)
These applications need high-speed, real-time, reliable, high-performance transfer of huge data volumes over long-haul paths.
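As a rough cross-check of the transfer times quoted above, the figures are consistent with simple arithmetic; a minimal sketch in Python, assuming GB means 10^9 bytes:

```python
# Cross-check of the Haystack-Kashima transfer times quoted on this slide,
# assuming 41.54 GB means 41.54 * 10^9 bytes.
volume_bits = 41.54e9 * 8

print(volume_bits / 107e6 / 60)    # ~52 min at 107 Mbps (CRL => MIT)
print(volume_bits / 44.6e6 / 60)   # ~124 min at 44.6 Mbps (MIT => CRL)
```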
Purpose
• Ensure >= 1 Gbps end-to-end performance in high bandwidth-delay product networks
  – to support networked science applications
  – to help operations find a bottleneck
  – to evaluate advanced transport protocols (e.g. Tsunami, SABUL, HSTCP, FAST, XCP, ikob)
Contents
• Advanced TCP evaluation on TransPAC/Internet2
• How to isolate host and implementation specific constraints
• TCP monitoring with web100
• Advanced TCP evaluation in a laboratory
• Summary and future work
[Figure: Network Diagram for TransPAC/I2 Measurement (Oct. 2003). Japan (Kashima, Koganei, Tokyo XP, QGPOP, Genkai XP, Kitakyushu, Fukuoka), Korea (Seoul XP, Daejon, Taegu, Kwangju, Busan via APII/JGN and KOREN), the US (TransPAC to Los Angeles, Abilene via Chicago and Indianapolis to New York, I2 venue, MIT Haystack), and Europe (GEANT, Nordunet, Funet via Stockholm and Helsinki to HUT), with link capacities from 0.1G to 10G and path lengths from 100 km to 9,000 km. Legend: server (general); server (e-VLBI); Abilene Observatory: servers at each NOC; CMM: common measurement machines.]
Measurement hosts (sender and receiver):
– Mark5: Linux 2.4.7 (RH 7.1), P3 1.3 GHz, 256 MB memory, GbE SK-9843
– PE1650: Linux 2.4.22 (RH 9), Xeon 1.4 GHz, 1 GB memory, GbE Intel Pro/1000 XT
Iperf UDP ~900 Mbps (no loss)
TransPAC/I2 #1: Reno (Win 64MB)
TransPAC/I2 #1: Reno (better case)
TransPAC/I2 #1: Reno (worse case)
TransPAC/I2 #1: Reno (10 mins)
TransPAC/I2 #2: High Speed (Win 64MB)
TransPAC/I2 #2: High Speed (better case)
TransPAC/I2 #2: High Speed (worse case)
TransPAC/I2 #2: High Speed (60 mins)
TransPAC/I2 #1: Reno (Win 12MB)
TransPAC/I2 #2: High Speed (Win 12MB)
*Testing FAST TCP over Abilene by Shalunov, Internet2 Meeting, Oct. 2003
[Figure: Path 4, Pittsburgh to Atlanta, RTT = 26.9 ms; throughput, loss, and queue over 30 min for FAST, Linux, STCP, and HSTCP; annotated "Room for mice!"]
*TCP comparison (dummynet, 800 Mbps, 120 ms) from the FAST TCP presentation at IETF, Vienna, July 2003
1st Step: Tuning a Host with UDP
• Remove any bottlenecks on a host
  – CPU, memory, bus, OS (driver), …
• Dell PowerEdge 1650 (*not enough power)
  – Intel Xeon 1.4 GHz x1(2), 1 GB memory
  – Intel Pro/1000 XT onboard, PCI-X (133 MHz)
• Dell PowerEdge 2650
  – Intel Xeon 2.8 GHz x1(2), 1 GB memory
  – Intel Pro/1000 XT, PCI-X (133 MHz)
• Iperf UDP throughput 957 Mbps
  – GbE wire rate; headers: UDP (8B) + IP (20B) + Ethernet II (38B)
  – Linux 2.4.22 (RedHat 9) with web100
  – PE1650: TxIntDelay=0
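The 957 Mbps figure (and the 941 Mbps TCP figure on the next slide) follow directly from the per-packet header overhead at a 1500-byte IP MTU; a minimal sketch of the arithmetic in Python:

```python
# GbE payload throughput at a 1500-byte IP MTU. Ethernet II framing costs
# 38 bytes per packet (preamble + header + FCS + inter-frame gap).
LINE_RATE = 1_000_000_000   # GbE line rate, bit/s
MTU = 1500                  # IP MTU, bytes
ETH = 38                    # Ethernet framing overhead per packet, bytes

udp_payload = MTU - 20 - 8   # minus IP (20 B) and UDP (8 B) headers
tcp_payload = MTU - 20 - 32  # minus IP (20 B) and TCP with timestamps (32 B)

print(LINE_RATE * udp_payload / (MTU + ETH) / 1e6)  # ~957 Mbps (Iperf UDP)
print(LINE_RATE * tcp_payload / (MTU + ETH) / 1e6)  # ~941 Mbps (Iperf TCP)
```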
Evaluating Advanced TCPs
• Reno (Linux TCP, web100 version)
  – Ack: w = w + 1/w;  Loss: w = w - (1/2)*w
• HighSpeed TCP (included in web100)
  – Ack: w = w + a(w)/w;  Loss: w = w - (1/b(w))*w
• FAST TCP (binary, provided by Caltech)
  – w = (1/2)*(w_old*baseRTT/avgRTT + α + w_current)
• Limited Slow-Start (included in web100)
Note: differences are on the sender side only.
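A minimal Python sketch of the sender-side window-update rules as written above; this is not the kernel code, and a(w), b(w), and the FAST parameters are illustrative placeholders rather than the real HighSpeed TCP tables or the Caltech tuning:

```python
# Window updates per the slide's formulas (w in packets).

def reno_ack(w):
    return w + 1.0 / w              # additive increase: ~+1 packet per RTT

def reno_loss(w):
    return w - 0.5 * w              # multiplicative decrease: halve the window

def hstcp_ack(w, a=lambda w: 10.0): # placeholder a(w); really grows with w
    return w + a(w) / w

def hstcp_loss(w, b=lambda w: 8.0): # placeholder b(w); gentler back-off at large w
    return w - (1.0 / b(w)) * w

def fast_update(w_old, w_current, base_rtt, avg_rtt, alpha=100):
    # FAST adjusts the window from RTT measurements instead of reacting to loss.
    return 0.5 * (w_old * base_rtt / avg_rtt + alpha + w_current)

# Example: one loss event at w = 10000 packets
print(reno_loss(10000.0), hstcp_loss(10000.0))   # 5000.0 vs. 8750.0
```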
Web100 (http://www.web100.org)
• A kernel patch for monitoring/modifying TCP metrics in the Linux kernel
• Alpha 2.3 for Linux 2.4.22, released September 12, 2003
2nd Step: Tuning a Host with TCP
• Maximum socket buffer size (TCP window size)
  – net.core.wmem_max, net.core.rmem_max (64MB)
  – net.ipv4.tcp_wmem, net.ipv4.tcp_rmem (64MB)
• Driver descriptor length
  – e1000: TxDescriptors=1024, RxDescriptors=256 (default)
• Interface queue length
  – txqueuelen=100 (default)
  – net.core.netdev_max_backlog=300 (default)
• Interface queue discipline
  – fifo (default)
• MTU
  – mtu=1500 (IP MTU)
• Iperf TCP throughput 941 Mbps
  – GbE wire rate; headers: TCP (32B) + IP (20B) + Ethernet II (38B)
  – Linux 2.4.22 (RedHat 9) with web100
• Web100 (incl. HighSpeed TCP)
  – net.ipv4.web100_no_metric_save=1 (do not store TCP metrics in the route cache)
  – net.ipv4.WAD_IFQ=1 (do not send a congestion signal on buffer full)
  – net.ipv4.web100_rbufmode=0, net.ipv4.web100_sbufmode=0 (disable auto-tuning)
  – net.ipv4.WAD_FloydAIMD=1 (HighSpeed TCP)
  – net.ipv4.web100_default_wscale=7 (default)
• FAST TCP from Caltech
  – Linux 2.4.20
  – net.ipv4.tcp_vegas_cong_avoid=1
  – net.ipv4.tcp_vegas_pascing=1, net.ipv4.tcp_vegas_fast_converge=1
  – net.ipv4.tcp_vegas_alpha=100, net.ipv4.tcp_vegas_beta=120, net.ipv4.tcp_vegas_gamma=50
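The socket-buffer and web100 settings above can be applied with sysctl or by writing /proc/sys directly. A hedged sketch in Python: it requires root, the web100_* and WAD_* keys exist only on a kernel carrying the web100 patch, and the min/default tcp_wmem/tcp_rmem values below are illustrative, since the slide specifies only the 64 MB maximum:

```python
# Apply the tuning values listed above by writing /proc/sys (Linux, run as root).
SETTINGS = {
    "net/core/wmem_max": "67108864",             # 64 MB max send socket buffer
    "net/core/rmem_max": "67108864",             # 64 MB max receive socket buffer
    "net/ipv4/tcp_wmem": "4096 65536 67108864",  # min/default illustrative; 64 MB max
    "net/ipv4/tcp_rmem": "4096 87380 67108864",
    "net/ipv4/web100_no_metric_save": "1",       # do not cache TCP metrics per route
    "net/ipv4/WAD_IFQ": "1",                     # no congestion signal on buffer full
    "net/ipv4/web100_rbufmode": "0",             # disable receive-buffer autotuning
    "net/ipv4/web100_sbufmode": "0",             # disable send-buffer autotuning
    "net/ipv4/WAD_FloydAIMD": "1",               # switch sender to HighSpeed TCP
}

for key, value in SETTINGS.items():
    with open("/proc/sys/" + key, "w") as f:
        f.write(value)
```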
Test in a laboratory (1) – no bottleneck
[Setup: PE 2650 and PE 1650 (sender and receiver) connected by GbE (GbE/T, GbE/SX) through an L2 switch (12GCF) and a PacketSphere emulator set to bandwidth 1 Gbps, buffer 0 KB, delay 88 ms, loss 0]
• #0: Reno => Reno
– net.ipv4.WAD_IFQ=0 (congestion signal on buffer full)
• #1: Reno => Reno
• #2: High Speed TCP => Reno
• #3: FAST TCP => Reno
2*BDP = 11MB
#0, 1, 2: Data obtained on sender
#3: Data obtained on receiver
Laboratory #0: Reno (no bottleneck)
Laboratory #0: Reno (60 mins)
Laboratory #1: Reno (no bottleneck)
Laboratory #1: Reno (60 mins)
Laboratory #2: High Speed (no bottleneck)
Laboratory #2: High Speed (60 mins)
Laboratory #3: FAST (no bottleneck)
Laboratory #3: FAST (60 mins)
Test in a laboratory (2) – with bottleneck
[Setup: PE 2650 and PE 1650 (sender and receiver) connected by GbE (GbE/T, GbE/SX) through an L2 switch (12GCF) and a PacketSphere emulator set to bandwidth 800 Mbps, buffer 256 KB, delay 88 ms, loss 0]
• #4: Reno => Reno
• #5: High Speed TCP => Reno
• #6: FAST TCP => Reno
2*BDP = 11MB
#4, 5: Data obtained on sender
#6: Data obtained on receiver
Laboratory #4: Reno (800Mbps)
Laboratory #4: Reno (startup)
Laboratory #5: High Speed (800Mbps)
Laboratory #5: High Speed (startup)
Laboratory #5: High Speed (180-360)
Laboratory #5: High Speed (Win 12MB)
Laboratory #5: High Speed (Limited slow-start)
Laboratory #6: FAST (800Mbps)
Laboratory #6: FAST (startup)
Test in a laboratory (3) – with bottleneck
[Setup: PE 2650 and PE 1650 (sender and receiver) connected by GbE (GbE/T, GbE/SX) through an L2 switch (12GCF) and a PacketSphere emulator set to bandwidth 100 Mbps, buffer 256 KB, delay 88 ms, loss 0]
• #7: Reno => Reno
• #8: HighSpeed TCP => Reno
• #9: FAST TCP => Reno
• #10: Reno => Reno (100M shaping)
2*BDP = 11MB
#7, 8, 10: Data obtained on sender
#9: Data obtained on receiver
Laboratory #7: Reno (100Mbps)
Laboratory #7: Reno (10 mins)
Laboratory #8: High Speed (100Mbps)
Laboratory #8: High Speed (10 mins)
Laboratory #9: FAST (100Mbps)
Laboratory #9: FAST (10 mins)
Laboratory #10: Reno (100Mbps shaping)
Summary and Future Work
• No significant differences when there is no bottleneck
  – Packet loss
• With a bottleneck, FAST would work better if the RTT increase can be detected
  – RED on the bottleneck
  – Queue length on the bottleneck
  – Signal from the bottleneck or from the receiver
• Appropriate window size to avoid loss
  – Autotuning?
• Mobile
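On the window-size point, the window needed to fill the pipe follows from the bandwidth-delay product. A back-of-the-envelope sketch using figures quoted earlier in this talk (a target of >= 1 Gbps and the ~220 ms Kashima-Haystack RTT), suggesting why the 64 MB socket buffers configured earlier comfortably cover a single stream:

```python
# Bandwidth-delay product for the TransPAC/I2 path, using figures quoted
# earlier in this talk: >= 1 Gbps target rate, ~220 ms RTT.
target_bps = 1_000_000_000
rtt_s = 0.220

bdp_bytes = target_bps * rtt_s / 8
print(bdp_bytes / 2**20)   # ~26 MiB of data in flight to fill the pipe
```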
Questions?
• See http://www2.crl.go.jp/ka/radioastro/index.html for VLBI