Khatri Thesis

SCTP PERFORMANCE IMPROVEMENT BASED ON ADAPTIVE RETRANSMISSION TIME-OUT ADJUSTMENT THESIS Presented to the Graduate Council of Texas State University-San Marcos in Partial Fulfillment of the Requirements for the Degree Master of SCIENCE by Sagun Khatri, B.A. San Marcos, Texas August 2011


SCTP PERFORMANCE IMPROVEMENT BASED ON ADAPTIVE

RETRANSMISSION TIME-OUT ADJUSTMENT

THESIS

Presented to the Graduate Council of Texas State University-San Marcos

in Partial Fulfillment of the Requirements

for the Degree

Master of SCIENCE

by

Sagun Khatri, B.A.

San Marcos, Texas

August 2011


SCTP PERFORMANCE IMPROVEMENT BASED ON ADAPTIVE

RETRANSMISSION TIME-OUT ADJUSTMENT

Committee Members Approved:

Wuxu Peng, Chair

Stan McClellan

Hongchi Shi

Approved:

J. Michael Willoughby

Dean of the Graduate College


FAIR USE AND AUTHOR’S PERMISSION STATEMENT

Fair Use

This work is protected by the Copyright Laws of the United States (Public Law 94-553, section 107). Consistent with fair use as defined in the Copyright Laws, brief quotations from this material are allowed with proper acknowledgement. Use of this material for financial gain without the author’s express written permission is not allowed.

Duplication Permission

As the copyright holder of this work I, Sagun Khatri, authorize duplication of this work, in whole or in part, for educational or scholarly purposes only.


TABLE OF CONTENTS

LIST OF TABLES

LIST OF FIGURES

CHAPTER

1. INTRODUCTION
   1.1 Background
   1.2 Motivation
   1.3 Objectives
   1.4 Layout of the Thesis

2. BACKGROUND
   2.1 Commonalities between SCTP and TCP
   2.2 Differences between SCTP and TCP
   2.3 Retransmission
   2.4 Jacobson’s Algorithm
   2.5 Performance Deterioration of Jacobson’s Algorithm
   2.6 Karn’s Algorithm
   2.7 Fast Retransmission Timeout
   2.8 Related Work
   2.9 Summary

3. ADAPTIVE RTO MIN (ARM) ALGORITHM
   3.1 Introduction
   3.2 Research Logistics
   3.3 Adaptive RTO MIN (ARM) Algorithm
   3.4 Data Gathering for Multiple Payloads
       3.4.1 Performance Evaluation for 50 bytes Payload
       3.4.2 Performance Evaluation for 500 bytes Payload
       3.4.3 Performance Evaluation for 1000 bytes Payload
       3.4.4 Performance Evaluation for 2000 bytes Payload
   3.5 Algorithm Comparison Chart
   3.6 Summary

4. CONCLUSIONS AND FUTURE WORK

BIBLIOGRAPHY

LIST OF TABLES

Table 3.1 Data in this table are the outcome of the Adaptive RTO algorithm implementation.

Table 3.2 Data gathered using the static RTO MIN (SRM) and Adaptive RTO MIN (ARM) algorithms, executed on multiple payload and file sizes.

Table 3.3 This chart shows us that when a chunk’s payload size is relatively small the Adaptive RTO algorithm performs better than the static RTO MIN.

LIST OF FIGURES

Figure 1.1 In this diagram the X-axis represents number of RTO updates and the Y-axis represents time in milliseconds.

Figure 1.2 In this figure the X-axis represents number of RTO updates, and the Y-axis represents time in milliseconds.

Figure 2.1 The Jacobson’s algorithm

Figure 2.2 In this figure we see how the Jacobson’s algorithm currently behaves in SCTP.

Figure 2.3 This is a zoomed-in version of above Figure 2.2.

Figure 2.4 Karn’s Algorithm - the Retransmission Ambiguity Problem

Figure 3.1 The SCTP Echo Server running at the Texas State University–Texas State Computer Science Department.

Figure 3.2 The Adaptive RTO algorithm divides the possible RTT values into five sectors.

Figure 3.3 Adaptive RTO algorithm.

Figure 3.4 This graph is generated from the data in Table 3.1.

Figure 3.5 Without the Adaptive RTO MIN algorithm, the static RTO MIN holds the RTO from falling below 1,000 milliseconds.

CHAPTER 1

INTRODUCTION

1.1 Background

The Stream Control Transmission Protocol (SCTP) is a new IP transport protocol, existing at an equivalent level with the User Datagram Protocol (UDP) and the Transmission Control Protocol (TCP), which provide transport layer functions to many Internet applications. SCTP has been approved by the IETF as a Proposed Standard [Ong and Yoakum, 2002].

Like TCP, SCTP provides a reliable transport service, ensuring that data is transported across the network without error and in sequence. Like TCP, SCTP is a session-oriented mechanism, meaning that a relationship is created between the endpoints of an SCTP association prior to data being transmitted, and this relationship is maintained until all data transmission has been successfully completed [Ong and Yoakum, 2002]. The word “association” is used in SCTP instead of “connection” to avoid the connotation that a connection involves communication between only two IP addresses. An association refers to a communication between two systems, which may involve more than two IP addresses due to multihoming [Stevens et al., 2004].

Unlike TCP, SCTP provides a number of functions that are critical for telephony signaling transport, and at the same time can potentially benefit other applications needing transport with additional performance and reliability [Ong and Yoakum, 2002].

1.2 Motivation

In recent years network infrastructure has changed significantly. In the 1980s the Round Trip Time (RTT) for a packet crossing the Continental US, for example between New York City and Los Angeles, California, was around 200 milliseconds. Currently the same round trip can easily take less than 60 milliseconds. This decrease shows that there has been a major improvement in the network infrastructure. With the help of new technologies, such as fiber cables and better satellite communication, faster data transfer between two endpoints is going to be an ongoing trend.

Even though the network infrastructure has improved significantly over the years, some of the network algorithms have not been fine tuned to take advantage of the infrastructure improvement. Some of the network algorithms designed in the late 1980s are still being used; Jacobson’s algorithm is a good example of this trend. Jacobson’s algorithm calculates the Retransmission Time-Out (RTO) from each measured Round Trip Time (RTT) [Stevens et al., 2004]. The algorithm was designed for the network infrastructure of the late 1980s, when bandwidth was not as abundant.

1.3 Objectives

This work focuses on improving the file transfer time for the Stream Control Transmission Protocol (SCTP). SCTP borrows many features from TCP, but there are some areas where SCTP needs fine tuning in order to take advantage of its unique features, such as multi-homing. One particular area where SCTP needs improvement is the implementation of the Retransmission Time-Out Minimum (RTO MIN).

[Figure 1.1 (plot not reproduced): series RTO, RTT, and RTO MIN, with regions labeled “Waste”; X-axis: Number of RTO and RTT Updates; Y-axis: Time (milliseconds).]

Figure 1.1: In this diagram the X-axis represents number of RTO updates and the Y-axis represents time in milliseconds.

The Retransmission Time-Out Minimum (RTO MIN) constant in SCTP is set very high (i.e., 1000 milliseconds), as seen in Figure 1.1 and Figure 1.2. Jacobson’s algorithm does not respond to sporadic changes in the RTT value, mainly due to the RTO MIN, as seen in Figure 1.2. The result is a waste of valuable system resources that could have been used for transmitting a few more packets of data.

The aim of this thesis is not only to reduce the time taken to transfer a file from one node to another, but also to make sure that the new algorithm does not have any side effects, like bandwidth congestion. To accomplish this we propose to fine tune the RTO MIN to make SCTP aware of a packet loss in a significantly shorter amount of time. Thus, the retransmission will be expedited.

[Figure 1.2 (plot not reproduced): series RTO, RTT, and RTO MIN, with a region labeled “Waste”; X-axis: Number of RTO and RTT Updates; Y-axis: Time (milliseconds).]

Figure 1.2: In this figure the X-axis represents number of RTO updates, and the Y-axis represents time in milliseconds.

With the static RTO MIN (SRM), Jacobson’s algorithm is idle the majority of the time, as seen in Figure 1.2. In this thesis we propose a new algorithm called the Adaptive RTO MIN (ARM) algorithm, which dynamically lowers the lower bound of the RTO, thus forcing Jacobson’s algorithm to engage in the RTO’s calculation.

The retransmission timer is a key feature of a reliable link or transport layer protocol. It can greatly influence peer-to-peer performance. A too optimistic retransmission timer often expires prematurely. Such an event is called a spurious timeout [Ludwig and Sklower, 2000]. The ARM algorithm should therefore avoid spurious timeouts: fine tuning the RTO MIN should not make the RTO too optimistic, otherwise there is a risk of high bandwidth consumption because of spurious retransmissions. Detecting packet loss faster should result in a faster file transfer compared to the existing algorithm.

1.4 Layout of the Thesis

Chapter 1: Introduction to SCTP. Describes the fundamental differences between SCTP, TCP, and UDP.

Chapter 2: Provides detailed background information on SCTP and research that is related to this thesis.

Chapter 3: Detailed description of the Adaptive RTO MIN algorithm.

Chapter 4: Conclusion and Future Work.

CHAPTER 2

BACKGROUND

2.1 Commonalities between SCTP and TCP

SCTP shares many features of TCP. For example:

• Both TCP and SCTP are developed to achieve the highest possible throughput in various network scenarios. Thus, they will try to make use of all available bandwidth in the network to transmit data as fast as possible to remote users. This is known as a thick stream. An example of a thick stream would be a file transfer from one node to the other. When very few packets are sent without the need to make use of the available bandwidth, and those packets are small compared to the available payload, this is known as a thin stream [Pedersen, 2006].

• SCTP and TCP adjust the sending rate to avoid overwhelming both the receiver and the network. Limiting the sending rate to avoid overwhelming a receiver is called flow control. Limiting the sending rate to avoid overwhelming the network is called congestion control [Matthews, 2005].

• SCTP and TCP maintain another limit called the congestion window. The congestion window typically starts at 1 Maximum Segment Size (MSS) and then increases with each segment that is successfully acknowledged [Matthews, 2005].

• In SCTP and TCP the congestion window grows multiplicatively with each acknowledgement. This phase of multiplicative growth is called slow start. However, this multiplicative growth does not continue forever; there is an adaptively determined threshold [Matthews, 2005].

• In SCTP data can be transmitted in one or more streams within a single association and subjected to common congestion and flow control. These mechanisms are based on TCP. This means that SCTP is using slow start and congestion avoidance in its procedures. During slow start the initial congestion window is set to 2 × Maximum Transmission Unit (MTU). During congestion avoidance, the congestion window is increased by 1 × MTU per RTT.
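As a rough illustration of the two growth phases described above, the following C sketch advances the congestion window once per RTT. It assumes that every packet in the window is acknowledged, approximates slow start as a doubling per RTT, and caps the window at the threshold. The names `initial_cwnd`, `cwnd_after_rtt`, and `ssthresh` are hypothetical and are not taken from any SCTP implementation.

```c
#include <stdint.h>

/* Initial congestion window: 2 x MTU, as stated in the text. */
uint32_t initial_cwnd(uint32_t mtu)
{
    return 2 * mtu;
}

/*
 * Congestion window after one more loss-free RTT (simplified model).
 * Below the threshold: slow start, approximated as doubling per RTT
 * and capped at the threshold (an assumption of this sketch).
 * At or above the threshold: congestion avoidance, +1 MTU per RTT.
 */
uint32_t cwnd_after_rtt(uint32_t cwnd, uint32_t ssthresh, uint32_t mtu)
{
    if (cwnd < ssthresh) {
        uint32_t doubled = cwnd * 2;
        return doubled > ssthresh ? ssthresh : doubled;
    }
    return cwnd + mtu;
}
```

With an MTU of 1500 bytes and a threshold of 12000 bytes, the window in this model grows 3000, 6000, 12000 during slow start and then 13500, 15000, and so on during congestion avoidance.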

2.2 Differences between SCTP and TCP

SCTP differs from TCP in fundamental ways, which is why there is a need to optimize algorithms to better suit SCTP, rather than simply copying the code from TCP. The following are some key differences between the two protocols:

• Unlike TCP, SCTP is message-oriented. It provides sequenced delivery of individual records. As in the User Datagram Protocol (UDP), the length of a record written by the sender is passed to the receiving application [Stevens et al., 2004].

• SCTP can provide multiple streams between connection endpoints, each with its own reliable sequenced delivery of messages. A lost message in one of these streams does not block delivery of messages in any other streams. This approach is in contrast to TCP, where a loss at any point in the single stream of bytes blocks delivery of all future data on the connection until the loss is repaired [Stevens et al., 2004].

• SCTP also provides a multihoming feature, which allows a single SCTP endpoint to support multiple IP addresses. This feature can provide increased robustness against network failure. An endpoint can have multiple redundant network connections, where each of these networks has a different connection to the Internet. SCTP can work around a failure of one network or path across the Internet by switching to another address already associated with the SCTP association. The word “association” is used in SCTP instead of “connection” to avoid the connotation that a connection involves communication between only two IP addresses [Stevens et al., 2004].

• In SCTP a message from the application layer is transmitted in a data chunk which has its own unique Transmission Sequence Number (TSN). Several chunks of different types may get bundled into one packet as long as the total size of the packet does not exceed the MTU of the network path. If a message does not fit into a single packet according to the MTU, it is fragmented into multiple data chunks, each of which fits into a packet [Stewart et al., 2000].

• SCTP uses SACK to acknowledge the receipt of data chunks. In the absence of loss, a SACK is sent back for every second packet received or within 200 milliseconds of the arrival of any unacknowledged data chunks.
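The fragmentation rule in the list above can be made concrete with a small C sketch that estimates how many data chunks a message needs. It is an approximation under stated assumptions: a 20-byte IPv4 header without options, the 12-byte SCTP common header, one 16-byte DATA chunk header per packet, no bundling, and no chunk padding. The function names are hypothetical.

```c
#include <stddef.h>

#define IPV4_HEADER_BYTES            20u  /* assumes no IP options */
#define SCTP_COMMON_HEADER_BYTES     12u
#define SCTP_DATA_CHUNK_HEADER_BYTES 16u

/* User payload that fits into one packet carrying a single DATA chunk. */
size_t max_payload_per_packet(size_t mtu)
{
    size_t overhead = IPV4_HEADER_BYTES + SCTP_COMMON_HEADER_BYTES +
                      SCTP_DATA_CHUNK_HEADER_BYTES;
    return mtu > overhead ? mtu - overhead : 0;
}

/* Number of DATA chunks a message is split into (ceiling division). */
size_t data_chunks_needed(size_t message_len, size_t mtu)
{
    size_t per_packet = max_payload_per_packet(mtu);
    if (per_packet == 0 || message_len == 0)
        return 0;
    return (message_len + per_packet - 1) / per_packet;
}
```

Under these assumptions, with a 1500-byte MTU a 1000-byte message travels in one chunk, while a 2000-byte message is fragmented into two.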


2.3 Retransmission

When an SCTP sender transmits a chunk, it also sets a timer called a retransmission timer. When an acknowledgment arrives, the timer is cancelled. If the timer expires before an acknowledgment arrives, the chunk will be retransmitted.

SCTP does not always wait for a retransmission timer to expire before retransmitting data. SCTP will also interpret a series of duplicate acknowledgments as an early sign of packet loss [Matthews, 2005]. Fast retransmission occurs when four Selective Acknowledgements (SACKs) are received, and this is discussed in detail in later sections.

The RTT between a client and a server can change rapidly with time as network conditions change. For optimal performance a timeout and retransmission algorithm should be used that takes into account the actual RTT’s characteristics along with changes in the RTT over time. Much work has been focused on this area, mostly relating to TCP, but the same ideas apply to any network application [Allman and Paxson, 1999, Coene and Pastor-Balbas, 2006].

The retransmission time-out (RTO) value is the time that elapses after a packet was sent until the sender considers it lost and retransmits it. This event is called a timeout. The RTO is a prediction of the upper limit of the round trip time (RTT), i.e., the time that elapses after a packet left the sender until the sender receives a positive acknowledgment (ACK) for that packet. The time that remains until the timeout for a packet occurs is maintained by the retransmission timer state (REXMT). Thus, the RTO is the REXMT’s initial value [Ludwig and Sklower, 2000].

delta = measuredRTT − srtt

srtt ← srtt + g × delta

rttvar ← rttvar + h(|delta| − rttvar)

RTO = srtt + 4× rttvar

Figure 2.1: The Jacobson’s algorithm

The retransmission timer is the key feature of a reliable link or transport layer protocol. It can greatly influence peer-to-peer performance. A too optimistic retransmission timer often expires prematurely. Such an event is called a spurious timeout. It causes unnecessary traffic, so-called spurious retransmissions, that reduces a connection’s effective throughput. On the other hand, a retransmission timer that is too conservative may cause a long idle time before the lost packet is retransmitted. This can also degrade performance [Ludwig and Sklower, 2000].

2.4 Jacobson’s Algorithm

We want to calculate the RTO to use for every packet that we send. To calculate this, we measure the RTT: the actual round-trip time for a packet. Every time we measure an RTT, we update two statistical estimators: srtt is the smoothed RTT estimator and rttvar is the smoothed mean deviation estimator. The latter is a good approximation of the standard deviation, but easier to compute since it does not involve a square root. Given these two estimators, the RTO is assigned as srtt plus four times rttvar. [Jacobson and Karels, 1988] provides all the details of these calculations, which we can summarize in Figure 2.1.

In Figure 2.1 delta is the difference between the measured RTT and the current smoothed RTT estimator (srtt), g is the gain applied to the RTT estimator and equals 1/8, and h is the gain applied to the mean deviation estimator and equals 1/4.

Another point made in [Jacobson and Karels, 1988] is that when the retransmission timer expires, an exponential backoff must be used for the next RTO. For example, if our first RTO is 2 seconds and the reply is not received in this time, then the next RTO is 4 seconds. If there is still no reply, the next RTO is 8 seconds, and then 16, and so on [Stevens et al., 2004].
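The update rules of Figure 2.1 and the exponential backoff can be sketched in C as follows. This is a minimal floating-point illustration, not the kernel implementation: the struct and function names are hypothetical, and the caller is assumed to have initialized the estimators from an earlier measurement.

```c
/* Per-path estimator state, all values in milliseconds (illustrative names). */
struct rto_estimator {
    double srtt;    /* smoothed RTT estimator */
    double rttvar;  /* smoothed mean deviation estimator */
    double rto;     /* current retransmission time-out */
};

static const double GAIN_G = 1.0 / 8.0;  /* gain for srtt, from the text */
static const double GAIN_H = 1.0 / 4.0;  /* gain for rttvar, from the text */

/* Apply one RTT measurement, following Figure 2.1 line by line. */
void rto_update(struct rto_estimator *e, double measured_rtt)
{
    double delta = measured_rtt - e->srtt;
    double abs_delta = delta < 0.0 ? -delta : delta;

    e->srtt += GAIN_G * delta;
    e->rttvar += GAIN_H * (abs_delta - e->rttvar);
    e->rto = e->srtt + 4.0 * e->rttvar;
}

/* Exponential backoff: double the RTO after the timer expires. */
void rto_backoff(struct rto_estimator *e)
{
    e->rto *= 2.0;
}
```

For example, starting from srtt = 100 and rttvar = 10, a measured RTT of 180 gives delta = 80, srtt = 110, rttvar = 27.5, and RTO = 110 + 4 × 27.5 = 220 milliseconds.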

2.5 Performance Deterioration of Jacobson’s Algorithm

A retransmission timer that is too conservative may cause a long idle time before the lost packet is retransmitted. This can degrade performance [Ludwig and Sklower, 2000]. In SCTP, RTO MIN is a constant that keeps the RTO from falling below 1,000 milliseconds. During the research, we observed that the static RTO MIN strongly degrades the performance of Jacobson’s algorithm.

RTO MIN’s very liberal value of 1,000 milliseconds keeps Jacobson’s algorithm from playing a proactive role in RTO calculation, as seen in Figure 2.2 and Figure 2.3. In many circumstances (due to modern broadband networks) the RTT is well below 1,000 milliseconds. Thus, most of the time Jacobson’s algorithm is not used for the RTO calculation, which causes a long idle time before realizing a packet has been lost. In other words, SCTP has to wait for almost a whole second before realizing a packet has been lost. In the long run this delay of approximately 1,000 milliseconds turns out to be very costly in terms of transmission time.

[Figure 2.2 (plot not reproduced): series RTO and RTT; X-axis: Number of RTO and RTT Updates; Y-axis: Time (milliseconds).]

Figure 2.2: In this figure we see how the Jacobson’s algorithm currently behaves in SCTP.

[Figure 2.3 (plot not reproduced): series RTO and RTT; X-axis: Number of RTO and RTT Updates; Y-axis: Time (milliseconds).]

Figure 2.3: This is a zoomed-in version of above Figure 2.2.

[Figure 2.4 (diagram not reproduced): three client/server request and reply exchanges, labeled (a) lost request, (b) lost reply, and (c) RTO too short.]

Figure 2.4: Karn’s Algorithm - the Retransmission Ambiguity Problem

2.6 Karn’s Algorithm


Jacobson’s algorithm is used to calculate the RTO each time an RTT is measured, and to increase the RTO for retransmission. However, a problem arises when an ACK arrives for a retransmitted packet. This is called the retransmission ambiguity problem. Figure 2.4 shows the following three possible scenarios when the retransmission timer expires.

(a) The request is lost.

(b) The reply is lost.


(c) The RTO is too small.

When the client receives a reply to a request that was retransmitted, it cannot tell to which request the reply corresponds. In example (c) of Figure 2.4 the reply corresponds to the original request, while in the two other examples, the reply corresponds to the retransmitted request.

Karn’s algorithm [Karn and Partridge, 1991] handles this scenario with the following rules that apply whenever a reply is received for a request that was retransmitted.

• If an RTT was measured, do not use it to update the estimators since it is not known to which request the reply corresponds.

• Since this reply arrived before the retransmission timer expired, reuse this RTO for the next packet. Only when a reply is received to a request that was not retransmitted will RTT estimators be updated and RTO recalculated.
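The two rules above amount to a guard in front of the estimator update. The following C sketch shows one way to express that guard; the names are hypothetical, the gains 1/8 and 1/4 are those given for Figure 2.1, and the estimators are assumed to be already initialized.

```c
#include <stdbool.h>

/* Estimator state in milliseconds (illustrative names). */
struct karn_state {
    double srtt;
    double rttvar;
    double rto;
};

/*
 * Handle a reply. Returns true if the RTT sample was used.
 * If the request was retransmitted, the sample is ambiguous: the
 * estimators are left alone and the current RTO is reused as is.
 */
bool karn_on_reply(struct karn_state *s, double measured_rtt,
                   bool was_retransmitted)
{
    if (was_retransmitted)
        return false;

    double delta = measured_rtt - s->srtt;
    double abs_delta = delta < 0.0 ? -delta : delta;

    s->srtt += delta / 8.0;
    s->rttvar += (abs_delta - s->rttvar) / 4.0;
    s->rto = s->srtt + 4.0 * s->rttvar;
    return true;
}
```

In this sketch a backed-off RTO therefore stays in force until a reply arrives for a request that was sent only once.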

2.7 Fast Retransmission Timeout

In SCTP, fast retransmission is triggered by four SACKs. Whenever the sender receives a SACK that reports missing data chunks, it will wait for three further SACKs reporting the same data chunks as missing before doing the fast retransmission. By waiting for four consecutive SACKs, SCTP tries to eliminate spurious retransmissions caused by packets that are received out of order [Pedersen, 2006].
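The four-SACK rule can be pictured as a per-chunk counter. The C sketch below is an assumption-laden illustration rather than the kernel’s bookkeeping: the struct, its fields, and the function name are hypothetical, and the function is assumed to be called once for every SACK that reports the chunk as missing.

```c
#include <stdbool.h>
#include <stdint.h>

#define FAST_RTX_MISS_THRESHOLD 4u  /* four SACKs, as stated in the text */

/* Sender-side bookkeeping for one outstanding data chunk (illustrative). */
struct chunk_state {
    uint32_t tsn;               /* Transmission Sequence Number */
    unsigned miss_reports;      /* SACKs that reported this chunk missing */
    bool fast_retransmitted;    /* fast retransmit already performed */
};

/*
 * Record one SACK that reports the chunk as missing.
 * Returns true exactly once, when the fourth report arrives and the
 * chunk should be fast retransmitted.
 */
bool on_sack_reports_missing(struct chunk_state *c)
{
    if (c->fast_retransmitted)
        return false;

    c->miss_reports++;
    if (c->miss_reports >= FAST_RTX_MISS_THRESHOLD) {
        c->fast_retransmitted = true;
        return true;
    }
    return false;
}
```

The first three reports only increment the counter, which is what filters out chunks that were merely reordered in the network.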


2.8 Related Work

A lot of research has been done to improve late retransmission. This section focuses on what others have done in order to improve the late retransmission.

Early Fast Retransmit (EFR) is an optional mechanism in FreeBSD, which is active whenever the congestion window is larger than the number of unacknowledged packets and there are packets to be sent. It starts a timer that closely follows the RTT and RTTVAR for every packet sent, and when the timer goes off and the stream is still not using the entire congestion window, it retransmits all packets that could have been acknowledged [Pedersen et al., 2006].

Much research has been done in the quest to improve Jacobson’s algorithm, but so far there has not been a definite answer. Ekstrom and Ludwig [Ekstrom and Ludwig, 2004] indicate that the RTO algorithm defined in RFC2988 [Paxson and Allman, 2000], used in both TCP and SCTP, responds inappropriately to certain fluctuations in RTT. This causes the characteristic spike in RTO when there is a sudden movement in RTT, as seen in Figure 2.2. The reason behind the spike is that the RTTVAR computation does not distinguish between positive and negative variations. Their proposed algorithm alleviates the sudden fluctuation for a wide range of cases, and the findings in [Pedersen et al., 2006] concur with their algorithm. However, the RTO produced by their solution is on average higher than that of RFC2988 [Paxson and Allman, 2000], which is not an optimal solution [Petlund et al., 2009].

The “Early Retransmit (ER)” algorithm [Allman et al., 2010] suggests that a mechanism should be in place to recover lost segments when there are too few unacknowledged packets to trigger Fast Retransmit. The Early Retransmit algorithm reduces waiting time in four situations [Petlund et al., 2009]:

• The congestion window is still initially small.

• It is small because of heavy loss.

• Flow control limits the send window size.

• The application has no data to send.

The RTO MIN is an important factor in calculating the RTO itself. Some applications may want to lower the RTO MIN to less than 1,000 milliseconds, which will allow the sender to reach the maximum retransmission threshold faster in the case of network failures. However, Allman and Paxson [Allman and Paxson, 1999] warn that lowering the RTO MIN may have a negative impact on network behavior.

Other applications might want to eliminate the binary exponential back-off in the RTO calculation in order to speed up failure detection. RFC4166 suggests not eliminating the binary exponential back-off altogether, because when network congestion does occur, not backing off the timer may worsen the congestion [Coene and Pastor-Balbas, 2006].

2.9 Summary

Even though there has been a tremendous amount of research in this area, there is still a great deal of fine tuning to be done in terms of enhancing the retransmission timeout algorithms. In the next chapter we give a detailed description of the algorithm we call the Adaptive RTO MIN (ARM) algorithm, which exploits the variations in RTT and the dynamic range of RTO. This algorithm improves the performance of SCTP by implementing a dynamically variable minimum value for RTO. Previously, the minimum value of RTO (RTO MIN) has been statically defined.


CHAPTER 3

ADAPTIVE RTO MIN (ARM) ALGORITHM

3.1 Introduction

The idea behind the Adaptive RTO algorithm comes from observing the secluded role Jacobson’s algorithm plays in calculating the RTO when the lower bound is set by the static RTO MIN (SRM). With the SRM, Jacobson’s algorithm is idle the majority of the time. The Adaptive RTO MIN (ARM) algorithm dynamically lowers the lower bound of the RTO, thus forcing Jacobson’s algorithm to engage in the RTO’s calculation.

RFC-2960 defines RTO MIN as a constant with a value of one second, which is 1,000 milliseconds [Stewart et al., 2000]. The primary purpose of the RTO MIN is to act as a lower bound for the RTO, i.e., the RTO’s value cannot fall below that threshold. The RTO MIN constant makes the RTO very unresponsive to the RTT’s sporadic behavior. For example, even if the RTT is at 700 milliseconds the RTO will not make any adjustments accordingly, as seen in Figure 2.2.

The only way Jacobson’s algorithm can play a proactive role in the RTO calculation is if the RTT value is in the vicinity of 800 to 900 milliseconds. In our research, we found that waiting 1000 milliseconds for the retransmission timer to expire is a waste of time, since, as mentioned before, the RTT for the Continental US is about 60 milliseconds. If there are some spurious RTTs once in a while, then why not catch them early and retransmit for faster file transfer?

Modern bandwidth is big enough for us to afford such retransmissions, and the results from the data collection show that doing so does not retransmit any more than the current algorithm. On some occasions the Adaptive RTO algorithm can prevent a retransmission from happening and wasting resources.

3.2 Research Logistics

For this research we set up three computers, as seen in Figure 3.1. Two computers ran an SCTP echo client program and resided in an external network, and the third computer ran an SCTP echo server program and resided in the Texas State University–San Marcos Computer Science Department. All three were running Linux CentOS 5.4, kernel version 2.6.18, with the default SCTP module provided by the kernel.

The physical distance between the computers was less than 50 miles. The RTT between the two client computers and the server was typically around 30 milliseconds.

To compare the time taken by the Adaptive RTO MIN (ARM) algorithm and the static RTO MIN (SRM), we transmitted the same file from the two client computers, where one was running the ARM algorithm and the other SRM. In addition, the client computers were transmitting the file in 10 streams, with the data being mirrored in all 10 streams, as diagrammed in Figure 3.1.

3.3 Adaptive RTO MIN (ARM) Algorithm

The ARM algorithm calculates RTO MIN based on the current value of RTT. Dynamically lowering the RTO MIN forces Jacobson’s algorithm to play a proactive role in calculating RTO.

[Figure 3.1 (diagram not reproduced): labels External Network, Internet, University Network, SCTP Echo Client 1, SCTP Echo Client 2, SCTP Echo Server, Router (twice), RTT 28 to 32 ms, 10 Streams (twice).]

Figure 3.1: The SCTP Echo Server running at the Texas State University–Texas State Computer Science Department.

The Adaptive RTO MIN algorithm calculates the RTO MIN value based on the current RTT with the principle of exponential decay. This is diagrammed in Figure 3.2 and Figure 3.3.

There are two reasons for choosing the multiplicative values in the Adaptive RTO MIN algorithm:

• Multiplication by 2, 1.75, 1.50, 1.25, and 1.125 is easy and efficient to calculate using right bit shifts and additions, rather than floating point calculations.

• When the RTT is in the lower range, from 1 to 50 milliseconds, it is safe to

double the RTO MIN value, because the RTO MIN will at most have an upper

bound of 100 milliseconds. Whereas, doubling the RTT MIN value while the

Page 27: Khatri Thesis

21

8000 50 100 150 200 250 300 350 400 450 500 550 600 650 700 750

2.25

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

RTT Value

RT

O M

IN R

atio

n

Figure 3.2: The Adaptive RTO algorithm divides the possible RTT values into five sectors.

1 i f ( r t t <= 50)2 rto min = r t t ∗ 23 else i f ( r t t <= 100)4 rto min = r t t ∗ 1 .755 else i f ( r t t <= 200)6 rto min = r t t ∗ 1 .57 else i f ( r t t <= 400)8 rto min = r t t ∗ 1 .259 else

10 rto min = r t t ∗ 1 .125

Figure 3.3: Adaptive RTO algorithm.

Page 28: Khatri Thesis

22

Table 3.1: Data in this table are the outcome of the Adaptive RTO algorithm implemen-tation.

RTO MIN and RTO After the Adaptive RTO Algorithm

Round Trip Time Retransmission Time-Out Retransmission Time-Out Minimum

29 85 5833 78 6628 66 5632 64 6462 108 108106 159 159184 286 276215 401 26840 356 8040 320 8036 291 7244 252 8860 211 10560 175 10570 152 12281 150 14185 148 14889 155 15590 157 15793 162 162

RTT is in the range of 201 to 400 milliseconds is not wise, because the RTO MIN's upper bound would then be almost the same as the static RTO MIN. Thus, using an exponential decay principle in the Adaptive RTO MIN algorithm made the most sense.
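The shift-and-add argument in the first reason above can be illustrated with a minimal sketch in C. The function name adaptive_rto_min and the use of unsigned 32-bit millisecond values are assumptions made for illustration only; the thesis does not show the actual implementation. Each multiplier from Figure 3.3 is decomposed into the RTT plus right-shifted copies of itself.

```c
#include <stdint.h>

/* Hypothetical helper (name and types are assumptions): returns the
 * adaptive RTO MIN in milliseconds for an RTT given in milliseconds,
 * using only shifts and additions. */
uint32_t adaptive_rto_min(uint32_t rtt)
{
    if (rtt <= 50)
        return rtt << 1;                        /* rtt * 2     */
    else if (rtt <= 100)
        return rtt + (rtt >> 1) + (rtt >> 2);   /* rtt * 1.75  */
    else if (rtt <= 200)
        return rtt + (rtt >> 1);                /* rtt * 1.5   */
    else if (rtt <= 400)
        return rtt + (rtt >> 2);                /* rtt * 1.25  */
    else
        return rtt + (rtt >> 3);                /* rtt * 1.125 */
}
```

For the RTT values 62, 106, and 215 from Table 3.1, this sketch yields 108, 159, and 268, matching the RTO MIN column. Because every shift truncates, the 1.75 branch can come out one millisecond below the truncated exact product; an RTT of 51, for example, gives 88 rather than 89.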

Figure 3.4 shows the resulting performance of the ARM algorithm. Note that with the ARM algorithm the RTO, which is calculated by Jacobson's algorithm, is no longer overestimated, as it was with the static RTO MIN of 1,000 milliseconds.

By lowering the RTO MIN value and allowing it to update dynamically based


[Figure 3.4 plot. x-axis: Number of RTO, RTT, and RTO MIN Updates (0 to 22); y-axis: Time (milliseconds) (0 to 450); series: RTO, RTT, RTO MIN.]

Figure 3.4: This graph is generated from the data in Table 3.1.

on current RTT values, Jacobson's algorithm becomes more proactive in the RTO calculation, while leaving enough room for the RTT to fluctuate without congesting the network.

As shown in Figure 3.4, the ARM algorithm gives the RTO enough room to make the correct decision, but at the same time the RTO MIN keeps the RTO from falling too low. In a sense, the RTO MIN acts like a safety net for the RTO. Previously, the RTO constantly hovered at 1,000 milliseconds even when the RTT was around 30 milliseconds.
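The "safety net" behaviour can be sketched in C as follows. This is an illustration under stated assumptions, not the code used in this work: the struct and function names are hypothetical, the smoothing gains of 1/8 and 1/4 are the values recommended by the SCTP specification rather than values quoted in this chapter, and the upper bound on the RTO is omitted for brevity. The caller supplies the floor, which may be either the static 1,000 milliseconds or the adaptive RTO MIN.

```c
/* Hypothetical estimator state; all times are in milliseconds. */
struct rto_state {
    double srtt;    /* smoothed round-trip time         */
    double rttvar;  /* round-trip time variation        */
    double rto;     /* current retransmission time-out  */
    int    primed;  /* 0 until the first RTT sample     */
};

/* Feed one RTT sample into a Jacobson-style estimator and apply
 * rto_min as the lower bound on the resulting RTO. */
void rto_update(struct rto_state *s, double rtt, double rto_min)
{
    if (!s->primed) {
        /* First measurement: initialise from the sample itself. */
        s->srtt = rtt;
        s->rttvar = rtt / 2.0;
        s->primed = 1;
    } else {
        double diff = s->srtt - rtt;
        if (diff < 0.0)
            diff = -diff;
        /* Assumed gains: beta = 1/4 for the variation, alpha = 1/8 for the mean. */
        s->rttvar = 0.75 * s->rttvar + 0.25 * diff;
        s->srtt = 0.875 * s->srtt + 0.125 * rtt;
    }
    s->rto = s->srtt + 4.0 * s->rttvar;
    if (s->rto < rto_min)
        s->rto = rto_min;   /* the floor acts as the safety net */
}
```

With a first RTT sample of 30 milliseconds, a floor of 60 milliseconds leaves the computed RTO of 90 milliseconds untouched, whereas a floor of 1,000 milliseconds forces the RTO up to 1,000 milliseconds, which is the overestimation described above.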

3.4 Data Gathering for Multiple Payloads

To verify the performance of the Adaptive RTO algorithm, a wealth of data were gathered using the network configuration of Figure 3.1, with multiple payload sizes and multiple file sizes. The payload size ranged from 50 bytes to 2000 bytes, and the file


[Figure 3.5 plot. x-axis: Number of RTO, RTT, and RTO MIN Updates (0 to 22); y-axis: Time (milliseconds) (0 to 1100); series: RTO, RTT, RTO MIN.]

Figure 3.5: Without the Adaptive RTO MIN algorithm, the static RTO MIN keeps the RTO from falling below 1,000 milliseconds.

sizes ranged from 32 kilobytes to 2048 kilobytes (2 megabytes). The data indicate that the ARM algorithm performs better with smaller payloads.

As seen in Table 3.2, the Adaptive RTO algorithm behaves approximately the same as the static RTO MIN when the congestion level in the network is comparable, i.e. there is no excessive misfiring of retransmissions. The conventional thought is that lowering the RTO MIN might have undesirable side effects, such as network congestion [Allman and Paxson, 1999, Coene and Pastor-Balbas, 2006]. The Adaptive RTO algorithm and the static RTO MIN behave approximately the same while operating in the non-congested mode. However, the Adaptive RTO algorithm still outperforms the static RTO MIN in terms of total transmission time and the number of retransmissions due to fast retransmission (Fast RTX), albeit by small margins.


The advantage comes when using small packets, as in "thin streams" or control streams. During the data collection phase, the Adaptive RTO algorithm improved the transfer rate by up to 5% while retaining approximately the same retransmission rate as the static RTO MIN (SRM). Table 3.2 shows the findings for multiple payload and file sizes.

In Table 3.2, each sub-table represents a different payload and file size. Furthermore, each sub-table is divided into two groups of columns that represent data collected without and with congestion. The rows of the sub-tables are defined below:

• Number of Fast RTX: the total number of chunks retransmitted due to fast retransmission.

• Number of RTX Time-Out: the total number of chunks retransmitted due to time-out.

• Number of RTX PMTU: the total number of chunks retransmitted because the chunk's size was greater than the maximum transmission unit.

• Transmission Time (seconds): the total time taken to transmit a file.
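The four rows defined above can be pictured as one record per column of a sub-table. The following C sketch is purely illustrative: the struct, its field names, and the helper total_rtx are hypothetical and do not come from the measurement tooling used in this work.

```c
/* Hypothetical record for one column (SRM or ARM, with or without
 * congestion) of a sub-table in Table 3.2. */
struct transfer_stats {
    unsigned fast_rtx;     /* chunks retransmitted by fast retransmission    */
    unsigned timeout_rtx;  /* chunks retransmitted after a time-out          */
    unsigned pmtu_rtx;     /* chunks retransmitted for exceeding the MTU     */
    unsigned time_s;       /* total transmission time in seconds             */
};

/* Total number of retransmitted chunks across all three causes. */
unsigned total_rtx(const struct transfer_stats *s)
{
    return s->fast_rtx + s->timeout_rtx + s->pmtu_rtx;
}
```

As a usage example, the 50-byte payload with congestion gives {1862, 55, 0, 5716} for the SRM and {2005, 342, 0, 5498} for the ARM algorithm, so total_rtx returns 1917 and 2347 respectively.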

To collect the data, a text file was transferred from the two client computers to the echo server, as seen in Figure 3.1. The file size is given in the heading of each sub-table in Table 3.2. Network congestion was emulated by uploading a very large file in the background from one of the client computers.


Table 3.2: Data gathered using the static RTO MIN (SRM) and the Adaptive RTO MIN algorithm (ARM), executed on multiple payload and file sizes.

Transferring 32 KB file with 50 bytes payload
                               Without Congestion    With Congestion
                               SRM       ARM         SRM       ARM
Number of Fast RTX             164       177         1862      2005
Number of RTX Time-Out         5         0           55        342
Number of RTX PMTU             0         0           0         0
Transmission Time (seconds)    1673      1667        5716      5498

Transferring 512 KB file with 500 bytes payload
                               Without Congestion    With Congestion
                               SRM       ARM         SRM       ARM
Number of Fast RTX             656       159         3009      2350
Number of RTX Time-Out         34        7           88        383
Number of RTX PMTU             0         0           0         0
Transmission Time (seconds)    2756      2724        9013      8581

Transferring 1024 KB file with 1000 bytes payload
                               Without Congestion    With Congestion
                               SRM       ARM         SRM       ARM
Number of Fast RTX             311       181         2223      2118
Number of RTX Time-Out         6         9           97        343
Number of RTX PMTU             0         0           0         0
Transmission Time (seconds)    2728      2721        9542      9334

Transferring 2048 KB file with 2000 bytes payload
                               Without Congestion    With Congestion
                               SRM       ARM         SRM       ARM
Number of Fast RTX             362       290         2735      2572
Number of RTX Time-Out         14        9           64        393
Number of RTX PMTU             0         0           0         0
Transmission Time (seconds)    2742      2723        9224      9237


The data in Table 3.2 show the performance improvement achieved by implementing the ARM algorithm over the standard SRM, which is discussed in detail in the following sections.

3.4.1 Performance Evaluation for 50 bytes Payload

The data from Table 3.2 suggest that the ARM algorithm performs better with small payload sizes.

When the payload size is 50 bytes and there is no network congestion, the ARM algorithm transfers a file in approximately the same time as the SRM. The retransmission rate due to both fast retransmission and time-out is approximately the same.

By contrast, the ARM algorithm outperforms the SRM when there is network congestion, transferring a file 4% faster. As expected, the number of chunks retransmitted due to time-out increases for the ARM algorithm. However, the number of chunks retransmitted due to time-out is minuscule in comparison to the total number of chunks actually transmitted. In comparison to the SRM, the ARM algorithm retransmitted 0.3% more chunks in total, which, as stated earlier, is well within the network bandwidth's capacity.

3.4.2 Performance Evaluation for 500 bytes Payload

When the payload size is 500 bytes and there is no network congestion, the ARM algorithm transfers a file in approximately the same time as the SRM. The retransmission rate due to both fast retransmission and time-out is approximately the same as well.

The ARM algorithm performs better than the SRM when there is network congestion. The ARM algorithm transferred a file 5% faster than the SRM. The total number of retransmissions caused by the ARM algorithm was in fact 0.1% less than with the SRM.

With the 500-byte payload, the ARM algorithm seems to be more efficient than with the 50-byte payload, because it transferred a file in 5% less time with 0.1% fewer retransmissions.

3.4.3 Performance Evaluation for 1000 bytes Payload

With a payload size of 1,000 bytes and no network congestion, the ARM algorithm transmits a file in approximately the same time as the SRM, and the retransmission rate is approximately the same for both.

The ARM algorithm performs better than the SRM when there is network congestion. The ARM algorithm transferred a file 2.23% faster than the SRM. The total number of retransmissions caused by the ARM algorithm was approximately the same as with the SRM.

3.4.4 Performance Evaluation for 2000 bytes Payload

As the payload size increases, the ARM algorithm and the SRM perform the same, with or without network congestion. The total number of chunks retransmitted is approximately the same, and the time taken to transmit a file is the same as well.


Table 3.3: This chart shows that when a chunk's payload size is relatively small, the Adaptive RTO algorithm performs better than the static RTO MIN.

Comparison Chart

Payload Size (bytes)    Without Congestion    With Congestion
2,000                   ∼                     ∼
1,000                   ∼                     +2.2%
500                     ∼                     +5.03%
50                      ∼                     +4%

3.5 Algorithm Comparison Chart

Table 3.2 presents the data for different file and payload sizes, with and without the ARM algorithm. A detailed interpretation of the data was provided in the previous sections.

Table 3.3 is a comparison chart that summarizes the findings of the previous sections. The chart shows how the ARM algorithm compares to the SRM with respect to file transfer time. The convention used in the chart is as follows: '+' represents a time gain achieved by the ARM algorithm, and '∼' indicates that the ARM algorithm's performance is approximately the same as the SRM's.
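The '+' entries in the chart can be related back to the congested transmission times in Table 3.2 with a short C sketch. The struct and function names are hypothetical, and expressing the gain relative to the ARM time is an inference: it appears to reproduce the 5.03% and 2.2% entries, but the text does not state which denominator was used.

```c
#include <stddef.h>

/* One row of with-congestion transmission times from Table 3.2 (seconds). */
struct congested_times {
    unsigned payload_bytes;
    double   srm_seconds;   /* static RTO MIN   */
    double   arm_seconds;   /* Adaptive RTO MIN */
};

static const struct congested_times table_3_2_congested[] = {
    { 2000, 9224.0, 9237.0 },
    { 1000, 9542.0, 9334.0 },
    {  500, 9013.0, 8581.0 },
    {   50, 5716.0, 5498.0 },
};

/* Time gain of ARM over SRM as a percentage of the ARM time
 * (assumed convention, see the note above). */
double time_gain_percent(double srm_seconds, double arm_seconds)
{
    return (srm_seconds - arm_seconds) / arm_seconds * 100.0;
}

/* Gain for the listed payload size, or 0.0 if the size is not in the table. */
double gain_for_payload(unsigned payload_bytes)
{
    size_t n = sizeof table_3_2_congested / sizeof table_3_2_congested[0];
    for (size_t i = 0; i < n; i++) {
        if (table_3_2_congested[i].payload_bytes == payload_bytes)
            return time_gain_percent(table_3_2_congested[i].srm_seconds,
                                     table_3_2_congested[i].arm_seconds);
    }
    return 0.0;
}
```

Under this assumption the gains come out at roughly 3.97%, 5.03%, and 2.23% for the 50-, 500-, and 1,000-byte payloads, and roughly -0.14% for the 2,000-byte payload, which the chart marks as '∼'.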

With the help of Table 3.3, it is easy to see the conditions that favor the ARM algorithm. The data suggest that, under network congestion, the ARM algorithm is better than the SRM when the payload is relatively small (a "thin stream"), and that the ARM algorithm performs approximately the same as the SRM when the payload size increases to the vicinity of 2,000 bytes.


3.6 Summary

The data gathering and analysis in this chapter have shown the Adaptive RTO MIN (ARM) algorithm to be a viable replacement for the current static RTO MIN implementation. Under network congestion, the ARM algorithm can transmit a file from one endpoint to the other in up to 5% less time than the static RTO MIN, without unnecessary retransmission, and in some cases the ARM algorithm even reduced the amount of retransmission compared to the static RTO MIN.


CHAPTER 4

CONCLUSIONS AND FUTURE WORK

In this thesis we have shown what others have done to enhance SCTP's retransmission time-out performance, and how their approaches differ from ours. We have also shown how the Adaptive RTO algorithm helps improve SCTP's performance compared to the existing implementation.

A significant amount of data was gathered to show that the Adaptive RTO algorithm improves the file transfer rate in SCTP by about 5%. We have been vigilant in making sure that the Adaptive RTO algorithm does not unnecessarily clog the network's bandwidth, and that the current implementation of RTO MIN can be safely replaced with the Adaptive RTO algorithm. Furthermore, we have tested the Adaptive RTO algorithm extensively in multiple scenarios. For example, data were gathered when the network was operating in a non-congested environment, as well as when it was operating under heavy congestion. We have also shown that the Adaptive RTO algorithm does not clog the network's bandwidth under any of the tested conditions.

This is evolutionary research. This thesis does not solve all the problems, but it sets up additional studies and hypotheses. Had time permitted, I would have pursued the following, which in my opinion form the next cycle in this evolution:

• Design and implement congestion detection via the Adaptive RTO algorithm, and make use of SCTP's unique multihoming feature to switch between IP addresses. Currently this feature is not fully implemented in SCTP; the application layer needs to handle the logic for selecting IP addresses. The Adaptive RTO algorithm would help detect network congestion and make better decisions about switching between multiple IP addresses.

• Design and implement a better path selection algorithm via the Adaptive RTO algorithm. The Adaptive RTO algorithm would detect a better path for transferring data from one node to the other.


BIBLIOGRAPHY

[Allman et al., 2010] Allman, M., Avrachenkov, K., Ayesta, U., Blanton, J., and Hurtig, P. (2010). Early retransmit for TCP and Stream Control Transmission Protocol (SCTP).

[Allman and Paxson, 1999] Allman, M. and Paxson, V. (1999). On estimating end-to-end network path properties. SIGCOMM.

[Coene and Pastor-Balbas, 2006] Coene, L. and Pastor-Balbas, J. (2006). Telephony signaling transport over Stream Control Transmission Protocol (SCTP) applicability statement.

[Ekstrom and Ludwig, 2004] Ekstrom, H. and Ludwig, R. (2004). The peak-hopper: A new end-to-end retransmission timer for reliable unicast transport. INFOCOM.

[Jacobson and Karels, 1988] Jacobson, V. and Karels, M. J. (1988). Congestion avoidance and control. ACM Computer Communications Review.

[Karn and Partridge, 1991] Karn, P. and Partridge, C. (1991). Improving round-trip time estimates in reliable transport protocols. ACM Transactions on Computer Systems, 9(4):364–373.

[Ludwig and Sklower, 2000] Ludwig, R. and Sklower, K. (2000). The Eifel retransmission timer. ACM Computer Communications Review.


[Matthews, 2005] Matthews, J. (2005). Computer Networking Internet Protocols In Action. Wiley, Hoboken, NJ, 1st edition.

[Ong and Yoakum, 2002] Ong, L. and Yoakum, J. (2002). An introduction to the Stream Control Transmission Protocol (SCTP).

[Paxson and Allman, 2000] Paxson, V. and Allman, M. (2000). Computing TCP's retransmission timer, RFC 2988 (proposed standard).

[Pedersen, 2006] Pedersen, J. (2006). Evaluation of SCTP retransmission delays. Master's thesis, University of Oslo Department of Informatics.

[Pedersen et al., 2006] Pedersen, J., Griwodz, C., and Halvorsen, P. (2006). Considerations of SCTP retransmission delays for thin streams. LCN.

[Petlund et al., 2009] Petlund, A., Beskow, P., Pedersen, J., Paaby, E. S., Griwodz, C., and Halvorsen, P. (2009). Improving SCTP retransmission delays for time-dependent thin streams. Multimedia Tools and Applications, 45:33–60.

[Stevens et al., 2004] Stevens, W. R., Fenner, B., and Rudoff, A. M. (2004). UNIX Network Programming The Socket Networking API Volume 1. Addison-Wesley, Boston, MA, 3rd edition.

[Stewart et al., 2000] Stewart, R., Xie, Q., Morneault, K., Sharp, C., Schwarzbauer, H., Taylor, T., Rytina, I., Kalla, M., Zhang, L., and Paxson, V. (2000). Stream Control Transmission Protocol.


VITA

Sagun Khatri was born in Kathmandu, Nepal, on December 27, 1981, the son of

Sridhar and Sarita Khatri. After completing his work at Galaxy Public High School

in Kathmandu, Nepal, he entered Luther College–Decorah, Iowa. In the Fall of 2006,

he received the degree of Bachelor of Arts from Luther College–Decorah, Iowa. In

Fall 2008, he entered the Graduate College of Texas State University-San Marcos.

Permanent Address: 12800 Harrisglenn Drive Apt# 1534

Austin, Texas 78753

This thesis was typed by Sagun Khatri.
