
  • The Hong Kong Polytechnic University

    Department of Electronic and Information Engineering

    Stability Analysis of The Internet Congestion Control

    Xi Chen

    A thesis submitted in partial fulfillment of the requirements for

    the degree of Doctor of Philosophy

    February 2009

  • Abstract

    Abstract of the thesis entitled “Stability analysis of the Internet congestion control”

    submitted by Xi CHEN for the degree of Doctor of Philosophy at The Hong Kong Poly-

    technic University in February 2009

The Internet has become an important medium of information transfer. The TCP/IP protocol suite and the interconnected gateways provide reliable channels for the flow of information, which are shared evenly among different connections. It has been

    known that a bottleneck RED (Random Early Detection) gateway can become oscillatory

    when regulating multiple identical TCP (Transmission Control Protocol) flows.

    In this thesis, we will study the stability issue in the TCP-RED system. The stabil-

    ity boundary of the TCP-RED system depends on many network parameters, making the

    adjustment of the RED gateway a difficult task. Based on a fluid-flow model (FFM), we

    formulate analytical conditions that describe the stability boundary of the RED gateway

    which depends on the number of TCP Reno connections. The proposed model accurately

generates a stability boundary surface in a four-dimensional space, which facilitates the

    adjustment of parameters for stable operation of the RED gateway. The accuracy of the

    analytical results has been verified using the ns-2 network simulator.

    We will use the fluid-flow model to derive the system characteristic frequency, and then

    compare it with the frequencies of the RED queue length waveforms observed from ns-2

simulations. The ns-2 simulator is a simulation tool widely accepted by industry for verification purposes. Analysis of the TCP source frequency distribution reveals the

    occurrence of period doubling when the system enters the instability region as the filter

    resolution varies. Since random events and a large number of TCP flows are involved in

    the process of generating the average system dynamics, a statistical viewpoint is taken in

    the analysis. Our results reflect the true system behavior as they are based on data from ns-2

    simulations rather than numerical simulations of analytical models. The physical mecha-

nism of oscillation is explained in terms of the difference between the TCP source frequency and

    the TCP-RED system characteristic frequency.

    The detrended fluctuation analysis (DFA) method is used to analyze the stability of

    the Internet RED gateway. In DFA, time-series data are analyzed to generate a key pa-

rameter called the power-law scaling exponent, which indicates the long-range correlations of the time series. By examining the variation of the DFA scaling exponent as the system parameters vary, we quantify the stability of the RED system in terms of the system's characteristics.

    Finally, the random explicit congestion notification (ECN) marking distribution mech-

    anism in RED gateways has been studied. The randomness of the RED ECN marking

algorithm is incorporated into the FFM. The new model is shown to have better dynamic

    performance, as verified by the waveforms provided by ns-2 simulations.


  • Acknowledgements

    My sincere gratitude goes to my supervisors Dr. Siu-Chung Wong and Prof. Michael

    Tse, for their valuable advice, patient guidance, and generous support throughout the study.

Without their support, this research project would not have been completed.

I thank my former advisor Dr. Wing-Kuen Ling, who introduced me to nonlinear science, for his constant teaching and encouragement.

I would also like to thank my collaborator, Prof. Ljiljana Trajković, for her valuable

    ideas and suggestions.

At the same time, I would like to acknowledge the members of our research group:

    Yuehui Huang, Xiaohui Qu, Xiaoke Xu, Jie Zhang, Junfeng Sun, Guang Feng, Sufen Chen,

    Zhen Li, Takayuki Kimura, Xiaofan Liu, Xiaodong Luo, Qingfeng Zhou, Rongtao Xu, Xia

    Zheng, Yang Liu, and Xiumin Li for their support and valuable discussions on my research.

    I wish to thank Dr. Jianbo Gao, Dr. Wen-wen Tung and Rongsheng Huang for their

    hospitality during my visit at the University of Florida.

    I gratefully acknowledge the Research Committee of The Hong Kong Polytechnic Uni-

    versity for the financial support during the entire period of my candidature.

Last, but far from least, I would like to thank my parents and my elder sister for

    their love and care over the years, for their persistent support, encouragement, and under-

    standing.


  • Abbreviations

Abbreviation Full phrase

    ACK Acknowledgement

    AI Additive Increase

    AIMD Additive Increase Multiplicative Decrease

    AQM Active Queue Management

    ARED Adaptive RED

    AVQ Adaptive Virtual Queue

    BRED Balanced RED

    CBR Constant Bit Rate

    CBT RED Class-Based Threshold RED

    CE Congestion Experience

    CHOKe CHOose and Keep for responsive flows

    cwnd Congestion Window Size

    DDE Delayed Differential Equation

    DFA Detrended Fluctuation Analysis

    DoD Department of Defense

    DSRED Double Slope RED

    ECN Explicit Congestion Notification

    e-mail Electronic Mail

FFT Fast Fourier Transform

    FFM Fluid-Flow Model

    FRED Flow RED

    FIN Finish

FTP File Transfer Protocol

    HS TCP High Speed TCP

IAB Internet Architecture Board

IETF Internet Engineering Task Force

    IP Internet Protocol

    LBL Lawrence Berkeley Laboratory

    LRD Long-Range Dependence

    MD Multiplicative Decrease

    MILNET MILitary NETwork

    OSI Open Systems Interconnection

OTcl Object-oriented Tool Command Language

    P Proportional

    PD Proportional-Differential

    PI Proportional-Integral

    PI-PD Proportional-Integral-Proportional-Derivative

    PARC Palo Alto Research Center

    QoS Quality of Service

    RED Random Early Detection

    REM Random Exponential Marking

    RFC Request For Comments

    RFFM Randomized Fluid-Flow Model

    rms Root Mean Square

RST Reset

    RTT Round Trip Time

    rwnd Receiver Advertised Window Size

    SACK Selective ACKnowledgment

    SRED Stabilized RED

    SRTT Sample Round Trip Time

    SSH Secure Shell

    ssthresh Slow Start Threshold

    SYN Synchronization

    TCP Transmission Control Protocol

    TCPW TCP Westwood

    Telnet Network Terminal Protocol

TFTP Trivial File Transfer Protocol

    TO Timeout

    UC Berkeley University of California, Berkeley

    UDP User Datagram Protocol

USC University of Southern California

    VINT Virtual InterNetwork Testbed

    WWW World-Wide-Web


  • Nomenclature List

    Symbol Description

    α the exponential moving average weight parameter at the RED gateway

    β the DFA scaling exponent of the queue length

    βT j the DFA scaling exponent of the series of the jth TCP window period

    βT the DFA scaling exponent of the TCP window period

    ϕ the constant for controlling target queue length

    Φ the target oscillation range, (Xmax − Xmin)

    κ the proportionality constant used in fluid flow model

    ρi, j the degree of similarity between flow i and flow j

B the buffer size of the RED gateway

    C the bottleneck bandwidth

    cm ECN unmark counter

    Dev the estimated mean deviation

fδ the TCP source frequency, 1/T

    H the Hurst parameter

K the number of packets in flight

    N the number of connections

pb the marking/dropping probability assigned by the RED algorithm

pmax the marking probability when the average queue length is at Xmax

q the instantaneous queue length

    q0 the target queue length

    Ro propagation delay

ro the round trip time

RTT the estimated average round trip time

S the matrix constructed from ρi, j

T the TCP source sending-rate period

    T j the TCP window period of the jth TCP flow

    w the TCP window size

    Wsum the total window size

    x the average queue length

    Xmax the maximum threshold at the RED gateway

    Xmin the minimum threshold at the RED gateway

    wR the weight factor for computing sample round trip time


  • Table of Contents

    Abstract iii

    Acknowledgements v

    Abbreviations vi

    Nomenclature List ix

    1 Introduction 1

    1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

    1.2 Contribution of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    1.3 Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

    1.4 Organization of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . 8

    2 Background 10

    2.1 Internetworking: Concepts, Architectures and Protocols . . . . . . . . . . . 10

    2.2 TCP and Congestion Control . . . . . . . . . . . . . . . . . . . . . . . . . 15

    2.2.1 TCP Prime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

    2.2.2 Congestion Control Algorithm . . . . . . . . . . . . . . . . . . . . 20

    2.2.3 TCP Flavors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

    2.3 Active Queue Management (AQM) . . . . . . . . . . . . . . . . . . . . . 27

2.3.1 The Need of AQM . . . . . . . . . . . . . . . . . . . . . . . . . . 28

    2.3.2 Explicit Congestion Notification (ECN) . . . . . . . . . . . . . . . 29

    2.3.3 Random Early Detection (RED) . . . . . . . . . . . . . . . . . . . 30

    2.3.4 RED and Its Variants . . . . . . . . . . . . . . . . . . . . . . . . . 34

    2.4 Network Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

    2.4.1 Discrete Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

    2.4.2 Fluid Flow Model . . . . . . . . . . . . . . . . . . . . . . . . . . 39

    2.5 Network Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

    2.5.1 Data Collection and Analysis . . . . . . . . . . . . . . . . . . . . . 42

    3 Stability Analysis 45

    3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

    3.2 TCP-RED Fluid-Flow Model . . . . . . . . . . . . . . . . . . . . . . . . . 48

    3.3 Stability Boundary of TCP-RED System . . . . . . . . . . . . . . . . . . . 51

    3.3.1 Steady-State Solution and Target Queue Length . . . . . . . . . . . 51

    3.3.2 Linearization and Perturbation . . . . . . . . . . . . . . . . . . . . 52

    3.3.3 Closed-Form Stability Condition . . . . . . . . . . . . . . . . . . . 53

    3.4 Verification of Stability Boundaries . . . . . . . . . . . . . . . . . . . . . . 55

    3.4.1 Definition of Stability . . . . . . . . . . . . . . . . . . . . . . . . 55

    3.4.2 Comparison of Stability Boundaries Using Different Simulation

    Methods and Approximations . . . . . . . . . . . . . . . . . . . . 59

    3.4.3 Cross-Sectional Views . . . . . . . . . . . . . . . . . . . . . . . . 60

    3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

    4 Nonlinear Analysis 67

    4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

    4.2 Characteristic Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

    4.2.1 Steady-State Solution . . . . . . . . . . . . . . . . . . . . . . . . . 71

4.2.2 Finding the Characteristic Frequency . . . . . . . . . . . . . . . . 72

    4.3 Actual Steady-State Waveforms of TCP Sources . . . . . . . . . . . . . . . 74

    4.4 Comparison of Results from Fluid-Flow Model Calculations and Ns-2 Sim-

    ulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

    4.5 Characteristic Frequency and Period Doubling From A Statistical Perspective 80

    4.6 Mechanism of Oscillation . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

    4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

    5 Detrended Fluctuation Analysis 90

    5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

    5.2 Property of Time Series and DFA method . . . . . . . . . . . . . . . . . . 94

    5.2.1 Self-Similarity and Long-Range Dependence . . . . . . . . . . . . 94

    5.2.2 DFA Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

    5.3 Long-Range Power-Law Correlations in Queue Length . . . . . . . . . . . 96

    5.3.1 Scaling Exponent of Queue length and System Stability . . . . . . 98

    5.3.2 Interpretation from a Waveform Viewpoint . . . . . . . . . . . . . 99

    5.4 Long-Range Power-Law Correlations in Series of TCP Window Periods . . 100

    5.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

    6 Randomized Fluid Flow Model 108

    6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

    6.2 RED ECN Marking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

    6.2.1 Distribution of Consecutive ECN Generation . . . . . . . . . . . . 110

    6.2.2 Randomized Fluid-Flow Model . . . . . . . . . . . . . . . . . . . 112

    6.3 Results from RFFM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

    6.3.1 Comparison RFFM and ns-2 Simulations . . . . . . . . . . . . . . 115

    6.3.2 Phase Portrait and Randomness . . . . . . . . . . . . . . . . . . . 117

    6.4 Modeling Interactive Bottleneck Gateways . . . . . . . . . . . . . . . . . . 117

6.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

    7 Conclusion and Future Work 123

    7.1 Main Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

    7.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

    Bibliography 126


  • List of Tables

    2.1 RED configuration parameters in ns-2 simulator . . . . . . . . . . . . . . 43

    3.1 Abbreviations used in graphical presentation . . . . . . . . . . . . . . . . . 55

    4.1 Parameters for ns-2 simulations . . . . . . . . . . . . . . . . . . . . . . . . 77

    6.1 Simulation parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112


  • List of Figures

    1.1 Flow charts of the main contributions. . . . . . . . . . . . . . . . . . . . . 5

    2.1 Illustration of links and a network cloud. (a) Point-to-point link; (b) multiple-

    access link; (c) switched network cloud. . . . . . . . . . . . . . . . . . . . 11

    2.2 The illustration of interconnection of networks. An internet is formed by

    a gateway interconnecting two physical networks. The networks can be of

    different types. End systems can be attached to either of the networks. . . . 12

    2.3 Illustration of the internetworking concept. An internet is formed with six

    gateways interconnecting five physical networks. A host in the underly-

    ing physical structure can be attached to any one of the physical networks

    which are interconnected by gateways. A host in the internet can commu-

    nicate with any other host in the internet, even though the two hosts may

    be attached to different types of networks in the internet. . . . . . . . . . . 14

    2.4 Illustration of the Internet layer operations. . . . . . . . . . . . . . . . . . 15

2.5 (a) Three-way handshaking in TCP connection establishing, and (b) four-

    way handshaking in TCP connection termination. . . . . . . . . . . . . . . 18

    2.6 Illustration of congestion collapse. . . . . . . . . . . . . . . . . . . . . . . 20

2.7 Illustration of TCP congestion window. “SS” is the slow start phase, dur-

    ing which the window size exponentially increases until it reaches the slow

    start threshold, ssthresh. ssthresh is set to half of the current window size

    whenever a congestion is experienced. “CA” represents the congestion

    avoidance phase during which the window size is increasing linearly and

    decreasing multiplicatively. Multiplicative decrease of the TCP window

    size occurs either by receiving an ECN bit header or three duplicate ACKs.

    When “TO” timer expires, the window size is reduced to 1 and the system

    re-enters the slow start phase. . . . . . . . . . . . . . . . . . . . . . . . . . 24

2.8 Marking probability function in RED. Dashed line is the marking or dropping

    probability for the original RED, and solid line is the probability for the

    gentle RED. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

    2.9 The control feedback loop formed by the TCP sources and the RED gateway. 40

    2.10 Fields appearing in the trace file. . . . . . . . . . . . . . . . . . . . . . . . 43

3.1 A system of N TCP flows, from Si to Di, where i = 1, 2, · · · , N, passing

    through a common bottleneck link between G1 and G2. . . . . . . . . . . . 48

    3.2 Stability boundary surface for (a) Φ = 128 packets, (b) Φ = 256 packets,

    and (c) Φ = 384 packets. Region below each surface is “stable” and above

    is “unstable”. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

    3.3 ns-2 simulation for Pmax = 1/11.4: N = 256, C = 51.2 Mbps, α = 0.01,

    qo = 384 packets, and ro = 64 ms. . . . . . . . . . . . . . . . . . . . . . . 56

    3.4 ns-2 simulation for Pmax = 1/13.8 (adjusted to maintain the original target

    queue length): N = 256, C = 51.2 Mbps, α = 0.01, qo = 384 packets, and

    ro = 73 ms (Ro = 20 ms). . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

3.5 ns-2 simulation for Pmax = 1/17 (adjusted to make the system stable): N =

    256, C = 51.2 Mbps, α = 0.01, qo = 384 packets, and ro = 75 ms (Ro = 20

    ms). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

    3.6 ns-2 simulation for Pmax = 1/44: N = 150, C = 30 Mbps, α = 0.001, qo =

    384 packets, ro = 155 ms and K = 1078.8. . . . . . . . . . . . . . . . . . . 60

    3.7 FFT of the average queue length for the first 100 seconds of the ns-2 simu-

    lation shown in Fig. 3.6. . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

    3.8 ns-2 simulation for Pmax = 1/41: for N = 150, C = 30 Mbps, α = 0.001,

    qo = 384 packets, ro = 145 ms, and K = 1006.9. . . . . . . . . . . . . . . . 62

    3.9 FFT of average queue length for the first 100 seconds ns-2 simulation

    shown in Fig. 3.8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

    3.10 Comparison of various methods and approximations. Region below a curve

    is “stable” and above is “unstable”. The deficiency of the Padé(0,1) lin-

    earization (i.e., “A1” and “SL1”) is clearly evident. . . . . . . . . . . . . . 63

    3.11 Comparison of stability boundaries from closed-form solution based on

    Padé(1,1) linearization (solid and dashed curves) for various α, Φ = 256

    packets, and qo = 384 packets, corresponding to Fig. 3.2(b). Region below

    a curve is “stable” and above is “unstable”. . . . . . . . . . . . . . . . . . 64

    3.12 Comparison of stability boundaries from closed-form solution based on

    Padé(1,1) linearization (solid curves) for various N and Φ = 256 packets

    corresponding to Fig. 3.2(b). Region below a curve is “stable” and above

    is “unstable”. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

3.13 Comparison of stability boundaries between the closed-form solution based

on Padé(1,1) linearization (dashed curves) for various N and Φ = 256 pack-

    ets corresponding to Fig. 3.2(b). The values of N for the full simulations

based on the fluid-flow model (labelled as “S”) are intentionally adjusted to

    fit the ns-2 simulations of Fig. 3.12 and to show a constant offset of N = 30

    of the fluid-flow model from the actual values given by ns-2 simulations.

    Region below a curve is “stable” and above is “unstable”. . . . . . . . . . . 66

    3.14 Comparison of stability boundaries for various Φ for N = 256, α = 0.001

    and qo = 384 packets. Region below a curve is “stable” and above is “un-

    stable”. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

    4.1 Simulated queue length waveforms of TCP-RED using the ns-2 simulator

    for different values of filter resolution α. (a) α = 0.1, (b) α = 0.0005, and

    (c) α = 0.0008 for 170 TCP connections. Each connection shares a fixed

    bandwidth of 1.5 Mb/s in the bottleneck link. . . . . . . . . . . . . . . . . 70

    4.2 Ideal steady-state waveform of TCP sender’s window size. . . . . . . . . . 75

    4.3 Waveform of TCP source window size of a connection at filter resolution

    α = 0.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

    4.4 Waveform of TCP source window size of a connection at filter resolution

    α = 0.001. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

    4.5 Comparison of characteristic frequency fc from linearized fluid-flow model

    and peak oscillation frequency from ns-2 simulations. Bandwidths are in-

    dicated as vertical bars for the ns-2 data. System parameters are as listed in

    Table 4.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

    4.6 Waveforms of queue length found from linearized fluid-flow model for sys-

    tem parameters shown in Table 4.1. . . . . . . . . . . . . . . . . . . . . . . 78

4.7 Frequency distribution from FFT of the ns-2 simulated queue length wave-

    form of Fig. 4.1 (a). The distance between the two arrows is the bandwidth

    at this peak frequency. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

    4.8 Frequency distribution from FFT of the ns-2 simulated queue length wave-

    form for α = 0.001. The distance between the two arrows is the bandwidth

    at this peak frequency. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

    4.9 Distribution of TCP source window frequency for the 52nd flow at α = 0.1. 81

    4.10 Distribution of TCP source window frequency for the 153rd flow at α =

    0.001. Period doubling in the statistical sense is clearly evident from the

    emergence of a small “bump” at half of the characteristic frequency. . . . . 82

    4.11 Distribution of TCP source window frequency for α = 0.1. . . . . . . . . . 83

    4.12 Distribution of TCP source window frequency for α = 0.002. Period dou-

    bling in the statistical sense is clearly evident from the emergence of a

    small “bump” at half of the characteristic frequency. Impulse at 1 Hz re-

    flects time-out saturation. . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

    4.13 Distribution of TCP source window frequency for α = 0.0015. Period

    doubling becomes more evident as the “bump” at half of the characteristic

    frequency grows. Impulse at 1 Hz reflects time-out saturation. . . . . . . . 84

    4.14 Distribution of TCP source window frequency for α = 0.001. Period dou-

    bling persists as the “bump” at half of the characteristic frequency stays in

    the distribution. Impulse at 1 Hz reflects time-out saturation. . . . . . . . . 84

    4.15 Distribution of TCP source window frequency for α = 0.0008. Period

    doubling becomes less persistent as the “bump” at half of the characteristic

    frequency begins to shrink. Impulse at 1 Hz reflects time-out saturation. . . 85

    4.16 Distribution of TCP source window frequency for α = 0.0006. Period

    doubling begins to subside. Impulse at 1 Hz reflects time-out saturation. . . 85

4.17 Distribution of TCP source window frequency for α = 0.0005. Stability is

    about to resume as period doubling subsides. . . . . . . . . . . . . . . . . 86

    4.18 Distribution of TCP source window frequency for α = 0.0004. Stability is

    resumed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

    4.19 Distribution of TCP source window frequency for α = 0.0001. . . . . . . . 87

    4.20 Distribution of TCP source window frequency for α = 0.00001. . . . . . . 87

    5.1 Simulated RED queue length waveforms using ns-2 simulator for filter res-

    olutions of α = 0.0001 and α = 0.0008 indicating stable waveforms (a),

    (c) and (e); and unstable waveforms (b), (d), and (f), respectively. Figures

    in (c) and (e) are enlarged views of (a), and (d) and (f) are enlarged views

    of (b). There are 170 TCP connections. Each connection shares a fixed

    bandwidth of 1.5 Mb/s in the bottleneck link. . . . . . . . . . . . . . . . . 97

    5.2 DFA scaling exponent for RED instantaneous queue length with different

    choices of α value. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

    5.3 DFA scaling exponents in region 1 of Fig. 5.2 with varying α. . . . . . . . 99

    5.4 DFA scaling exponents of stationary signals. . . . . . . . . . . . . . . . . . 100

5.5 Relationship among system instability, positive feedback system and long-

    range correlation in the queue length series of the system. . . . . . . . . . . 101

    5.6 DFA scaling exponent of T1(i) series of 170 TCP connections with α =0.1,

    0.002, 0.0015, 0.001, 0.0008, 0.0006, 0.0005, 0.0004, 0.0001, and 0.00001 . 102

    5.7 DFA scaling exponent of the T j(i) series of 170 TCP connections for j =

    1, · · · , 170 with α = 0.001 . . . . . . . . . . . . . . . . . . . . . . . . . . 102

    5.8 DFA results and distribution of the DFA results of T j(i) series of the 170

    TCP connections, where j is the connection number for α = 0.1 . . . . . . 103

5.9 From top to bottom are the distributions of ρ for α = 0.0001, 0.1, and 0.001,

    respectively. The system is stable for α=0.0001 and 0.1, and unstable for

    α=0.001. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

    5.10 DFA scaling exponents of T j(i) series for the 170 TCP connections with

    varying α. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

    5.11 Relationship among the stable TCP-RED system, long-range anti-correlation

    of TCP window period series, and the negative feedback system. . . . . . . 106

    5.12 An illustration of the competition of the bandwidth between two TCP flows

    in a stable TCP-RED system. . . . . . . . . . . . . . . . . . . . . . . . . . 107

    6.1 Phase portraits from the original FFM: current total window size Nw(t)

    versus the total window size in last round trip time Nw(t−r(t)). Waveforms

of 30 sec simulation time are shown in blue. Steady-state waveforms are

    shown in red, after a simulation time of 25 sec. (a) Stable trajectory with α

    = 0.0001, showing a fixed point. (b) Unstable trajectory with α = 0.0008

    showing limit cycles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

    6.2 Phase portraits from ns-2 simulations: current total window size versus the

    total window size in last round trip time for the simulation time from 25 to

    30 sec. (a) Stable trajectory with α = 0.0001. (b) Unstable trajectory with

    α = 0.0008. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

    6.3 Instantaneous queue length comparison for stable network in Table 6.1 with

    α = 0.0001. Waveforms in the left and right panels are instantaneous queue

    length results from RFFM and ns-2 simulations, respectively. Waveforms

    in the lower panels are the magnified versions of those in the upper row. . . 115

6.4 Instantaneous queue length comparison for stable network in Table 6.1 with

    α = 0.0008. Waveforms in the left and right panels are instantaneous queue

    length results from RFFM and ns-2 simulations, respectively. Waveforms

    on the lower panels are the magnified versions of those in the upper panels. 116

    6.5 Fixed point from RFFM with α = 0.0001: total window size Nw(t) versus

    the total window size in last round trip time Nw(t − r(t)). (a) Total simu-

    lation time of 30 sec is shown in blue, and the steady state portion in time

    interval from 25 to 30 sec is shown in red. (b) Magnified version of the red

    portion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

    6.6 Limit cycle from RFFM with α = 0.0008: total window size Nw(t) versus

    the total window size in last round trip time Nw(t−r(t)). (a) Total simulation

    time of 30 sec is shown in blue and the steady state portion at time interval

    from 25 to 30 sec is shown in red. (b) Magnified version of the red portion. 118

6.7 A network of N TCP flows, from Si to Di, where i = 1, 2, · · · , N,

    passing through two bottleneck links between G1 to G2 and G2 to G3. . . . . 119

    6.8 Ns-2 results of the instantaneous queue length for a network with two in-

    teracting bottlenecks, where C2 = 0.995C1. . . . . . . . . . . . . . . . . . 120

    6.9 Illustration of queue length interaction between two interacting gateways,

    where Ro(1,2) is the one way propagation delay between G1 and G2. . . . . . 120

    6.10 Waveforms of the instantaneous queue length of FFM for two gateway net-

    work, where C2 = 0.995C1. There are noticeable differences from ns-2

    simulations shown in Fig. 6.8. . . . . . . . . . . . . . . . . . . . . . . . . 121

    6.11 Waveforms of the instantaneous queue length of RFFM for network with

    two interacting gateways, where C2 = 0.995C1. RFFM is able to model the

    interaction between gateways. . . . . . . . . . . . . . . . . . . . . . . . . 121


  • Chapter 1

    Introduction

Nowadays, the Internet has become fully integrated with society in much of the developed world.

    The Internet and the TCP/IP Internet protocol suite1 have revolutionized the way we in-

teract and communicate. Conventional media of information exchange and distribution are gradually giving way to a more efficient, rapid, economical and globalized information exchange infrastructure: the Internet. The success of the Internet can be mostly

    credited to the capability of its protocols, among which Transmission Control Protocol

    (TCP) and Internet Protocol (IP) are the two most important, providing users and develop-

    ers with robust and interoperable services based on a set of enduring design principles such

as simplicity, scalability, distributed architectures and end-to-end connectivity.

    The worldwide computer network, the Internet, interconnects millions of end systems2,

    such as personal computers, workstations, servers and so on, around the world. In 1986,

the Internet suffered a series of congestion collapses, in which network throughput dropped sharply toward zero while the data traffic increased. In response to the problem of Internet congestion collapse, the Internet research community proposed congestion control mechanisms. The main goal of congestion control is to optimize

1 A protocol is a standard that provides syntactic and semantic rules for communication [34].

2 End systems, end points, end nodes and hosts all refer to either a TCP sender or a TCP receiver. These terms are used interchangeably throughout this thesis.


    computer network performance by adjusting the sending rates of the end systems according

to the level of congestion on the transmission path. Specifically, TCP sources keep track of the packets they have sent. If packet loss is observed, the TCP sending rate will be reduced,

    in order to avoid further loss and congestion. If all data are delivered, the TCP source will

    slowly increase its sending rate to maximize the utilization of the network resources. The

    TCP congestion control has been extremely successful in minimizing the packet loss and

    maximizing network utilization.
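To make the additive-increase/multiplicative-decrease behaviour described above concrete, the following Python sketch updates a TCP-like congestion window once per round trip. It is a minimal illustration only: the constants, the simplified loss signal and the update rules are assumptions for the example, not the full TCP Reno algorithm reviewed in Chapter 2.

def aimd_update(cwnd, loss_detected, ssthresh):
    """One round-trip update of a TCP-like congestion window (simplified sketch).

    cwnd          -- current congestion window, in packets
    loss_detected -- True if a loss (or congestion signal) was seen this round trip
    ssthresh      -- slow-start threshold, in packets
    """
    if loss_detected:
        # Multiplicative decrease: halve the window and remember the new threshold.
        ssthresh = max(cwnd / 2.0, 2.0)
        cwnd = ssthresh
    elif cwnd < ssthresh:
        # Slow start: roughly exponential growth, doubling per round trip.
        cwnd = cwnd * 2.0
    else:
        # Congestion avoidance: additive increase of one packet per round trip.
        cwnd = cwnd + 1.0
    return cwnd, ssthresh

# Example: a source backing off once after a loss and then probing again.
cwnd, ssthresh = 1.0, 64.0
for rtt, loss in enumerate([False] * 8 + [True] + [False] * 8):
    cwnd, ssthresh = aimd_update(cwnd, loss, ssthresh)
    print(f"RTT {rtt:2d}: cwnd = {cwnd:6.1f}, ssthresh = {ssthresh:5.1f}")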

    However, as the demand for higher quality of service (QoS) in the Internet increases,

    there are indications that the TCP congestion control is reaching its limits. In particular, the

packet loss lowers network efficiency, since the end systems and the Internet gateways3 constantly operate on packets which are eventually dropped. With the

purpose of enhancing the network performance, the Internet Engineering Task Force4 (IETF)

    suggests the deployment of Active Queue Management (AQM) [16] and Explicit Conges-

    tion Notification (ECN) [46, 119] to avoid packet loss by allowing the gateways to assist the

network management. Processes in the end systems communicate logically through

    transport-layer protocols, dominantly TCP, whereas in practice, end systems are indirectly

    connected via gateways or routers. The queue length in the buffer should be managed in-

    telligently, in order to prevent buffer delays from getting too long, and to keep the packet

    loss as small as possible. AQM algorithms have been implemented to manage the queue

    length in the buffer and to assist the management of network performance with a congestion

control algorithm. The basic idea of AQM is to detect congestion in advance and to signal a congestion notification to the end systems before packet loss and queue overflow occur. The goal of ECN is to provide the network with the ability to explicitly signal to the end systems the congestion detected by AQM before packet loss occurs. Hence,

3 A gateway is a network node that connects to two or more networks and forwards packets from one network to another [118]. The two terms are used interchangeably throughout the thesis.

4 The Internet Engineering Task Force is the group responsible for protocol standards and technical aspects of TCP/IP and the Internet, under the Internet Architecture Board (IAB), which sets the technical direction and decides on the standards of TCP/IP and the global Internet [70].


    the end systems can adjust their sending rates upon receiving the ECN signal. ECN can be

    effective when it is used with AQM.
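As an illustration of the AQM idea, the sketch below follows a RED-style rule using the symbols from the Nomenclature List (α, Xmin, Xmax, pmax, pb): the gateway keeps an exponentially weighted moving average of the queue length and marks arriving packets with a probability that rises linearly between the two thresholds. This is a minimal sketch of the marking law only, under those assumptions; it omits the count-based probability adjustment and the gentle region of the actual RED algorithm described in Chapter 2.

import random

def red_average(x_avg, q_inst, alpha):
    """Exponentially weighted moving average of the queue length (weight alpha)."""
    return (1.0 - alpha) * x_avg + alpha * q_inst

def red_marking_probability(x_avg, x_min, x_max, p_max):
    """Linear RED-style marking/dropping probability for a given average queue length."""
    if x_avg < x_min:
        return 0.0
    if x_avg >= x_max:
        return 1.0  # simplified: always mark above the maximum threshold
    return p_max * (x_avg - x_min) / (x_max - x_min)

def should_mark(x_avg, x_min, x_max, p_max):
    """Bernoulli trial deciding whether to set the ECN bit on an arriving packet."""
    return random.random() < red_marking_probability(x_avg, x_min, x_max, p_max)

# Example with illustrative threshold values, in packets.
x_avg = 0.0
for q in [50, 120, 200, 310, 400]:
    x_avg = red_average(x_avg, q, alpha=0.01)
    print(q, round(x_avg, 1), should_mark(x_avg, x_min=100, x_max=300, p_max=0.1))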

This chapter is organized as follows. First, in Section 1.1, we look at what the

    future Internet would be like, and what will be needed to get there. Then, Section 1.2 gives

    an overview of the contributions of this thesis. After that, publications arising from this

    project are listed in Section 1.3. Finally, Section 1.4 provides the outline of the rest of the

    thesis.

    1.1 Motivation

The last two decades have seen growing interest in research on Internet congestion con-

    trol. TCP may be the most complex protocol in the suite of Internet protocols. It provides

    reliable, flow-controlled, end-to-end, streaming service between two end systems on an un-

reliable network for communication. The implementation of congestion control mechanisms in TCP has further enhanced network efficiency and performance. Internet traffic is mainly composed of TCP traffic [38, 81, 143]. However, in the future, end-to-end TCP congestion control alone may not be enough to manage the Internet traffic, whose volume has been increasing, and will probably continue to increase, exponentially. The most widely deployed version of

    TCP is Reno. Therefore, Reno is the major type of TCP to be studied in this thesis.

    More recently, the idea of allowing AQM gateways to assist the network management

    on an end-to-end basis has been generally accepted. Among all the AQM algorithms, Ran-

    dom Early Detection (RED) is probably the most famous one. It has been recommended

by the IETF for the next-generation Internet [16], and it has been implemented in some commer-

    cial gateways [30]. The RED gateways are said to be able to enhance the throughput and

    fairness, and to avoid packet loss and global synchronization. However, the oscillatory

    and instability problems of the TCP-RED system have constrained the wide deployment


of RED gateways for over a decade. While many AQMs have been proposed to avoid the instability problem, they introduce new problems such as complexity, fairness, and scalability. Due to the complexity of the implementation and the many parameters involved in the design of the TCP-RED system, its continuous improvement has been a challenging problem and has drawn considerable interest.

A full understanding and explanation of the oscillation and instability problems is generally unavailable, as the Internet is probably the most complicated man-made system ever constructed. Furthermore, the TCP-RED system is a cross-layer optimization mechanism, spanning the transport layer and the Internet layer (also known as the internetwork or IP layer) [14]. Recent studies have attempted to relate the dynamics of network models to those of the real Internet. However, these studies have not been sufficiently verified by real traffic data.

    In the light of these motivations, this thesis has three purposes:

1. to solve the oscillation and instability problems in TCP-RED,

    2. to understand the mechanism of the oscillations as well as the dynamics of the TCP-

    RED system, and

    3. to establish a more realistic model for the Internet.

    1.2 Contribution of the Thesis

    As shown in Fig. 1.1, this thesis contains four fairly independent contributions in order to

solve the instability and oscillation problems of the TCP-RED system.

    Firstly, based on a fluid-flow model, we have developed an analytical closed-form so-

    lution for finding the stability boundary of the TCP-RED system. The solution is very

    accurate for multiple TCP connections. The simplicity of the solution allows easy and fast

Figure 1.1: Flow chart of the main contributions: the instability and oscillatory problems of TCP-RED are addressed through closed-form stability conditions using the FFM (Chapter 3), the mechanism of oscillation and instability of TCP-RED (Chapter 4), stability analysis using the DFA method (Chapter 5), and the RFFM, an enhanced FFM for TCP-RED (Chapter 6).

    generation of the stability boundaries in the essential parameter space. The solution can be

    used to formulate guidelines for setting parameters in RED gateways to avoid instability.

    Secondly, the fluid-flow model has been used to calculate the characteristic frequency

    of the TCP-RED system with multiple identical Reno TCP connections. Period doubling

    has been observed in a statistical sense from ns-2 simulations using statistical frequency

    distribution of TCP source windows. The physical mechanism for the onset of period

    doubling has been explained in terms of the difference in the TCP sending frequency and

    the system’s characteristic frequency. Viable verifications using the industry standard ns-2

simulation tool are provided. The bifurcation and stability results reflect the true behavior of the actual system.
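The following Python sketch indicates the kind of statistical frequency analysis meant here: each per-flow window-size series (as would be extracted from ns-2 traces) is transformed with an FFT, and the magnitude spectra are averaged over flows; a secondary peak near half the characteristic frequency in the averaged distribution is the signature of period doubling. The sampling rate and the synthetic test signal below are assumptions used purely for illustration.

import numpy as np

def frequency_distribution(series_list, sample_rate_hz):
    """Average FFT magnitude spectrum over many per-flow time series."""
    n = min(len(s) for s in series_list)
    spectra = []
    for s in series_list:
        s = np.asarray(s[:n], dtype=float)
        s = s - s.mean()                      # remove the DC component
        spectra.append(np.abs(np.fft.rfft(s)))
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate_hz)
    return freqs, np.mean(spectra, axis=0)

# Synthetic example: flows oscillating at f_c with a weaker component at f_c / 2,
# mimicking the period-doubled regime (illustrative data, not ns-2 output).
rate, f_c, t = 100.0, 2.0, np.arange(0, 60, 0.01)
flows = [np.sin(2 * np.pi * f_c * t + p) + 0.3 * np.sin(np.pi * f_c * t + p)
         for p in np.random.uniform(0, 2 * np.pi, 20)]
freqs, spectrum = frequency_distribution(flows, rate)
print("dominant peak at %.2f Hz" % freqs[np.argmax(spectrum)])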

Thirdly, based on the data collected from the ns-2 simulations, long-range power-law

    correlations of the queue length waveforms in the RED gateway have been studied using the

Detrended Fluctuation Analysis (DFA) method. It is shown that the scaling exponent varies with the relative stability of the RED gateway; since the scaling exponent is independent of the stationarity of the queue length, it can be used as an indicator of the


    stability of the TCP-RED system.
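A minimal sketch of the DFA computation is given below, assuming the standard first-order procedure: integrate the mean-removed series, remove a local linear trend in non-overlapping windows of size n, and fit the slope of log F(n) against log n. The window sizes and the white-noise test signal are illustrative choices, not the settings used in Chapter 5.

import numpy as np

def dfa_scaling_exponent(series, window_sizes):
    """First-order detrended fluctuation analysis; returns the scaling exponent."""
    x = np.asarray(series, dtype=float)
    profile = np.cumsum(x - x.mean())             # integrated (profile) series
    fluctuations = []
    for n in window_sizes:
        segments = len(profile) // n
        sq_residuals = []
        for k in range(segments):
            seg = profile[k * n:(k + 1) * n]
            t = np.arange(n)
            trend = np.polyval(np.polyfit(t, seg, 1), t)   # local linear trend
            sq_residuals.append(np.mean((seg - trend) ** 2))
        fluctuations.append(np.sqrt(np.mean(sq_residuals)))
    # The scaling exponent is the slope of log F(n) against log n.
    slope, _ = np.polyfit(np.log(window_sizes), np.log(fluctuations), 1)
    return slope

# Uncorrelated white noise should give an exponent close to 0.5.
print(dfa_scaling_exponent(np.random.randn(10000), [16, 32, 64, 128, 256, 512]))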

    Fourthly, the random marking mechanism in RED gateways and the distribution of

    ECN markings have been evaluated using a pseudo random variable generator. The results

show that randomness is one of the key causes of the difference between the TCP-RED model and ns-2 simulations. A randomized fluid-flow model has been developed by injecting the same kind of randomness into the fluid-flow model. The interaction between

    participating gateways can be captured by the model. In terms of the ability to capture the

    salient dynamical features of the RED gateway, the randomized fluid flow model shows

    significant improvement over the original fluid flow model.
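The sketch below illustrates the kind of randomness involved: if each packet arriving within a round trip is marked by an independent Bernoulli trial with probability pb, then the number of ECN marks per round trip becomes a random variable, and the resulting random marking rate, rather than the deterministic value pb, is what can be injected into a fluid-flow model. This is a hedged illustration of the idea only; the actual RFFM construction is given in Chapter 6.

import random

def ecn_marks_per_rtt(num_packets, p_b):
    """Number of packets marked in one round trip when each packet is marked
    independently with probability p_b (Bernoulli trials)."""
    return sum(1 for _ in range(num_packets) if random.random() < p_b)

def randomized_marking_rate(num_packets, p_b):
    """Marking rate seen in one round trip: a random realization around p_b."""
    if num_packets == 0:
        return 0.0
    return ecn_marks_per_rtt(num_packets, p_b) / num_packets

# The deterministic FFM always uses p_b; the randomized rate fluctuates around it.
for rtt in range(5):
    print(rtt, randomized_marking_rate(num_packets=200, p_b=0.05))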


    1.3 Publications

    Journal Papers

    1. X. Chen, S. C. Wong, C. K. Tse and F. C. M. Lau, “Oscillation and Period Doubling

    in TCP/RED System: Analysis and Verification,” International Journal of Bifurca-

    tion and Chaos, vol. 18, no. 5, pp. 1459–1475, May 2008.

    2. X. Chen, S. C. Wong, C. K. Tse and L. Trajković, “Detrended Fluctuation Analysis

    of the Stability of Internet Gateway Employing the Random Early Detection Algo-

    rithm,” International Journal of Bifurcation and Chaos, to appear.

    3. X. Chen, S. C. Wong, C. K. Tse and L. Trajković, “Stability Analysis of Adaptive

    TCP-RED Gateway with Multiple Connections,” submitted to IEEE Transactions on

    Networking.

    4. X. Chen, S. C. Wong and C. K. Tse, “Adding Randomness to Modeling Internet

    TCP-RED System,” submitted to IEEE Transactions on Circuits and Systems -part

    II: Express Briefs.

    International Conference Papers

    1. X. Chen, S. C. Wong, C. K. Tse, and L. Trajković, “Stability Analysis of RED Gate-

    way with Multiple TCP Reno Connections,” in Proceedings of IEEE International

    Symposium on Circuits and Systems 2007, pp. 1429–1432, New Orleans, LA, USA,

May 2007.

    2. X. Chen, S. C. Wong, C. K. Tse, and L. Trajković, “Stability Study of the TCP-RED

    System Using Detrended Fluctuation Analysis,” in Proceedings of IEEE Interna-

    tional Symposium on Circuits and Systems 2008, pp. 324–327, Seattle, WA, USA,

May 2008.


    1.4 Organization of the Thesis

    The remaining part of this thesis is organized in the following way:

    Chapter 2 introduces basic knowledge on TCP and active queue management. Two

    kinds of modeling techniques for the TCP-RED system, namely, discrete model and fluid

flow model, are discussed. The industry-standard network simulator, ns-2, is introduced. Methods for collecting and analyzing network data from ns-2 simulations are explained.

Chapter 3 studies stability issues in the TCP-RED system. Stability conditions for the TCP-RED system are defined. Based on a fluid-flow model (FFM), analytical conditions which

describe the stability boundary of the RED gateway, depending on the number of TCP Reno

    connections are formulated. The accuracy of the analytical results is verified using the

    results from ns-2 network simulations.

    Chapter 4 studies the nonlinear dynamics of the TCP-RED system. The chapter de-

    scribes the derivation of the characteristic frequency of the TCP-RED system from the

    fluid-flow model. The obtained frequencies are compared with the frequencies of the RED

    queue length waveforms observed from ns-2 simulations. Analysis of the TCP source fre-

    quency distribution reveals the occurrence of period doubling when the system enters the

    instability region as the filter resolution varies. Since random events and a large number of

    TCP flows are involved in the process of generating the average system dynamics, a statis-

    tical viewpoint is taken in the analysis. The results reflect the true system behavior as they

    are based on data from ns-2 simulations rather than numerical simulations of analytical

    models. The physical mechanism of oscillation is explained in terms of the differences in

    the TCP source frequency and the TCP-RED system characteristic frequency.

Chapter 5 proposes a fast algorithm for detecting congestion. The detrended fluc-

    tuation analysis (DFA) method is used to analyze the stability of the TCP-RED system. In

DFA, time-series data are analyzed to generate a key parameter called the power-law scaling exponent, which indicates the long-range correlations of the time series. By


examining the variation of the DFA scaling exponent as the system parameters vary, we quantify the stability of the RED system in terms of the system's characteris-

    tics.

Chapter 6 presents a modified FFM. It has been found that waveforms from the deterministic FFM never match those from ns-2 simulations, and that the mismatch is mainly due to the randomness of the way in which the RED gateways issue ECN markings. The distribution of the randomness in the ECN markings of RED gateways is studied and then incorporated into the FFM to form the RFFM (Randomized FFM). Using the RFFM, a bifurcation study is performed and compared with ns-2 simulations.

    Chapter 7 provides the conclusions and discusses some research directions for future

work.

  • Chapter 2

    Background

    In this chapter, the concepts, architecture, and protocols of the Internet are introduced,

    along with a detailed discussion of the essential features and mechanisms of Transmission

    Control Protocol (TCP) and TCP congestion control algorithms. The roles and algorithms

    of queue management in a network are discussed. After that, network models for TCP-

    AQM are reviewed. Finally, a network simulation tool, ns-2, is introduced.

2.1 Internetworking: Concepts, Architectures and Protocols

    Data communication networks have been growing explosively and have become an essen-

    tial tool for communications in developed societies. Networks were constructed to provide

    users with an ability to share information resources. A network consists of two or more

    end systems or hosts [14], which are the ultimate consumers of communication services.

    Specifically, an end system, which can be a computer, workstation, mobile device and so

    on, employs internet communication and executes applications on behalf of users. The end

    systems are directly connected through a physical medium, such as twisted pair cables,


    coaxial cables, or optical fibers. The end systems in a network are referred to as nodes, and

    the physical medium is called a link. Depending on how a node is attached to a link, the

    link can be either limited to a pair of nodes, which is known as point-to-point link, or shared

    by more than two nodes, which is known as multiple-access link. The point-to-point link

    and multiple-access link are shown in Figs. 2.1 (a) and (b), respectively. Besides the direct

    links, end systems can be indirectly connected through one or more medium nodes called

    switches, and the resulting network is called a switched network, as illustrated in Fig. 2.1

    (c). One of the most common switched networks is the packet-switched network1, in which

    data are divided into small pieces called packets and sent individually, instead of being

    transferred as strings of continuous bits. A network can be represented by a network cloud

    shown in Fig. 2.1 (c). The nodes inside the cloud, the switches, implement the network,

    and the outside nodes are the users of the network, called hosts. In general, a network cloud

    depicts any size and any type of network, regardless of its link type or switch type.


    Figure 2.1: Illustration of links and a network cloud. (a) Point-to-point link; (b) multiple-access link; (c) switched network cloud.

1 The two most common types of switched networks are known as the packet-switched network and the circuit-switched network. The overwhelming majority of computer networks deploy the former type of switch, while the circuit switch is most notably employed by the telephone system. In this thesis the packet-switched network is considered.


    Networks were originally conceived to be small systems consisting of rather few nodes,

and a user attached to a given network could not access another network, as switches

were limited in their ability to scale and to handle heterogeneity. As the need for data communication services grew, a single network became inadequate to meet the information-flow needs of businesses and individuals. Universal service, with which an end

system in any part of an organization can communicate with other end systems, is highly

    desirable. In the early 1970s, the term internetworking was coined. The internetworking

    or internet2 scheme provides universal service among heterogeneous networks. Additional

hardware systems are needed to interconnect a set of physical networks.

    The basic hardware component that carries out relaying service between networks is

    a gateway or router3. As illustrated in Fig. 2.2, two physical networks are connected by

a gateway. A gateway connects two or more networks, and it appears in each connected network as an attached host [13]. An internetworking gateway makes it possible to choose the network technology that suits each user, and it plays much the same role as a switch, which stores and forwards packets.


Figure 2.2: The illustration of the interconnection of networks. An internet is formed by a gateway interconnecting two physical networks. The networks can be of different types. End systems can be attached to either of the networks.

    An internet consists of a set of networks interconnected by gateways. The size of an

2 When written with an uppercase I, the term Internet refers to the widely used global Internet, while the one with a lowercase i, the internet, refers to an arbitrary collection of networks interconnected to provide host-to-host packet delivery service.

3 In the Internet community, a gateway specifically refers to an IP-level router, while a router is a switch that receives data transmission units from input interfaces and, depending on the addresses in those units, routes them to the appropriate output interfaces [13]. In this thesis, the terms gateway and router are used interchangeably.


    internet, which depends on the number of connected networks, the number of end systems

    and users attached to each network, can vary. An internet can be built from an interconnec-

    tion of internets. Thus, arbitrarily large network clouds can be formed by interconnecting

    clouds. Fig. 2.3 illustrates the concept of the internetworking. Although the Internet is

    much more complicated and includes much more heterogenous nodes than the internet il-

    lustrated in Fig. 2.3, the basic idea of the Internet, to which a large percentage of networks

    are connected, is the same. That is why the Internet has been known as the network of

    networks [14].

    In general, both internet software and internet hardware together provide the appear-

    ance of a single, seamless communication system. The most important protocols devel-

    oped for internetworking are known as the TCP/IP Internet Protocols [124]. The internet

    architecture is based on four layers4 5. The host’s layer structure in Fig. 2.4 depicts the

    internet layer architecture. The bottom layer of the internet model is the link layer. The

    links allow data to be transferred within each network. The same link layer protocols are

    required for all the Internet nodes, including hosts and gateways, to communicate in their

    directly-connected network. The second layer is Internet layer, also known as the Internet

    Protocol (IP) or Internetworking layer. Protocols in this layer specify the format of pack-

    ets sent across an internet and provide the function necessary for connecting networks and

    gateways into one coherent system. The IP layer is responsible for delivering data from the

    source host to the final destination host. IP is a connectionless or datagram internetwork

    service, providing no end-to-end delivery guarantees. The IP layer is required by both hosts

and gateways. The third layer is known as the transport layer, which specifies how to ensure

    transfer reliability, and provides end-to-end communication services. The transport layer

    contains two primary transport layer protocols at present: Transmission Control Protocol

4 Another internetworking layer model is the Open Systems Interconnection (OSI) seven-layer model [127, 154]. In this thesis, the TCP/IP four-layer reference model is discussed.

5 While some divide the internet layering model into four layers [11, 14, 31], others describe the internet with five layers, in which the link layer is separated into two layers: the network interface layer and the physical layer [33].


Figure 2.3: Illustration of the internetworking concept. An internet is formed with six gateways interconnecting five physical networks. A host in the underlying physical structure can be attached to any one of the physical networks, which are interconnected by gateways. A host in the internet can communicate with any other host in the internet, even though the two hosts may be attached to different types of networks in the internet.


    and User Datagram Protocol (UDP). Reliable connection-oriented data transport service

    is provided by TCP, which is discussed in detail in the following sections. The top layer is

called the application layer, which supports the direct interface to a user application. The layer contains many widely used protocols such as Electronic Mail (e-mail), the World-Wide-Web (WWW), the File Transfer Protocol (FTP), the Trivial File Transfer Protocol (TFTP), Secure

    Shell (SSH), and Network Terminal Protocol or remote login (Telnet).


    Figure 2.4: Illustration of the Internet layer operations.

    TCP/IP protocol software is required in both hosts and gateways. Nevertheless, gate-

    ways do not need the protocols from all layers. More specifically, gateways necessitate the

    Internet protocol layer and the link layer to provide the connectivity service. An example

    of the Internet layer operations is presented in Fig. 2.4.

    2.2 TCP and Congestion Control

    2.2.1 TCP Prime

    The first TCP reference was a note in 1973 written by Vinton G. Cerf with the title of

    “A Partial Specification of an International Transmission Protocol”. The protocol design

    choices were then discussed and published. The protocol was split into TCP and IP, in

    which TCP aims to handle packetization, error control, retransmission and reassembly,


while IP specifies the routing of packets [21]. In February 1980, the U.S. Department of Defense (DoD) adopted TCP/IP as the preferred protocol to build a network of networks, which was later split into the military network (MILNET) for military-related sites and the Internet [22, 23]. By the time the Internet started to be popularized by private companies, the

    networking revolution had begun. Immense opportunities in research and business have

    been provided since then.

The TCP protocol, defined in RFCs6 793, 1122, 1323, 2018 and 2581 [3, 14, 15, 99, 126], is

    the predominant transport protocol of today’s Internet. More than 80% of the total Internet

    traffic volume is carried by TCP which provides a reliable data transfer service on unre-

    liable networks [143]. Numerous Internet applications, such as, Electronic Mail (e-mail),

    World-Wide-Web (WWW), File Transfer (FTP), Secure Shell (SSH), The Network Termi-

    nal Protocol (Telnet), and streaming media application, etc., are all built on TCP. Another

    type of transport protocol is UDP which provides a much simpler service to the application.

    It is connectionless, unreliable and not stream-oriented and it supports neither congestion

    control nor flow control.

    TCP is a connection-oriented protocol. An end-to-end connection between a TCP

    source and a TCP receiver must be established for data transfer. There are two procedures

    in connection-oriented service. One is connection establishment and the other is connec-

    tion termination. The way of establishing a connection follows a three-way handshake,

    in which handshake refers to the exchange of control information as shown in Fig. 2.5(a).

    The TCP sender sends an initial request message with SYN7 and a sequence number x to

    establish communication. Once the TCP receiver receives the SYN, the receiver will record

    6 RFC (Request For Comments) is a series of chronological documents that contain ideas, techniques, observations, and proposed and accepted TCP/IP protocol standards. RFC documents are available at www.ietf.org.

    7 SYN is the name of one code bit field in the TCP header. When SYN is set to 1, its corresponding sequence number is the initial sequence number, and the sequence number of the first data byte is this sequence number plus one. When SYN is set to 0, its corresponding sequence number is the first data byte's sequence number. Other TCP header code bits used in opening or closing a TCP connection include ACK, FIN and RST.


    the sequence number, and reply with a message carrying SYN, whose sequence number is y, and ACK,

    whose acknowledgment number is x + 1. When the TCP sender receives the reply mes-

    sage from the TCP receiver, the sender will send a message with ACK set and an acknowledg-

    ment number y + 1. When the connection is established

    between two end systems, the data can be sent from both directions, known as full-duplex

    service, until one of the systems issues a FIN packet, or a RST packet, or the connection

    times out. When the transfer completes, the connection is explicitly terminated. The pro-

    cess of terminating a TCP connection follows a four-way handshake. The TCP connections

    are full-duplex and there are two independent transfer streams, one in each direction. The

    TCP sender closes its application, sends a message of FIN and waits for the TCP receiver’s

    acknowledgment. The receiver acknowledges the FIN packet and sends an ACK to inform

    the sender that no more data is available. After a connection has been closed in a given di-

    rection, TCP would refuse to accept more data from that direction. On the other hand, data

    can still flow in the other direction until the sender closes it. When both directions have

    been closed, the end systems delete the records of the TCP connections. The termination

    of a TCP connection procedure, four-way handshake, is shown in Fig. 2.5(b).
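    To make the exchange above concrete, the following minimal Python sketch reproduces the sequence- and acknowledgment-number bookkeeping of the three-way handshake; the function name, the message representation and the initial sequence numbers are illustrative assumptions, not part of any TCP implementation.

```python
# A minimal sketch of the sequence-number bookkeeping in the three-way
# handshake described above. The message representation and the initial
# sequence numbers x and y are illustrative assumptions.

def three_way_handshake(x, y):
    """Return the control messages exchanged when the sender (initial sequence
    number x) opens a connection to the receiver (initial sequence number y)."""
    messages = []
    # Step 1: sender -> receiver, SYN carrying sequence number x
    messages.append(("sender->receiver", {"SYN": 1, "seq": x}))
    # Step 2: receiver -> sender, SYN with sequence number y and ACK of x + 1
    messages.append(("receiver->sender", {"SYN": 1, "seq": y, "ACK": 1, "ack": x + 1}))
    # Step 3: sender -> receiver, ACK acknowledging y + 1
    messages.append(("sender->receiver", {"ACK": 1, "ack": y + 1}))
    return messages


if __name__ == "__main__":
    for direction, fields in three_way_handshake(x=100, y=300):
        print(direction, fields)
```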

    Retransmission Mechanism

    TCP guarantees an orderly delivery of all bytes data without any duplication, by using an

    acknowledgement mechanism to check the accuracy of the data received. Every time TCP

    sends a packet, it starts a timer and waits for an acknowledgment. If the timer expires

    before the acknowledgement reaches the TCP sender, the packet is then considered lost or

    corrupted. Unacknowledged data are retransmitted later. An ACK is generated by the TCP

    receiver upon correct reception of the data. In this way, the reception

    of an ACK at the TCP sender guarantees that the data have reached its destination correctly.

    The TCP receiver has two choices when receiving a TCP packet. It can either generate an


    Figure 2.5: (a) Three-way handshake in TCP connection establishment, and (b) four-way handshake in TCP connection termination.

    ACK as soon as a packet is received or delay the ACK generation for a while, known as

    delayed ACK. By holding up the ACK, the receiver may be able to acknowledge two packets

    at the same time and therefore to reduce ACK traffic. However, if an ACK is delayed for

    too long, a timeout and retransmission may be triggered. In practice, the delay of ACK

    should not be set longer than 500 ms [60].

    The retransmission mechanism is one of the key principles for providing reliable data trans-

    fer. TCP employs a retransmission timer for each packet sent. If no ACK is received

    within the retransmission timeout (TO) period, the packet is assumed to be lost and is

    sent again. When the ACK is received within the TO

    period, the retransmission timer is cleared. In many popular TCP implementa-

    tions, the minimum TO is set to 1 second [60]. A TO that is too long may result in a longer delay

    before a lost packet is detected in a busy network environment. However, an inappropriately short TO

    would lead to unnecessary retransmission traffic, which wastes network re-

    sources and adds extra load. Therefore, an appropriate TO is very important

    for obtaining optimal performance. TO should be set according to the value of the average

    round trip time (RTT), which is the time duration for a packet traveling from one end of a

    network to the other end and back again [118]. TCP records the time at which each packet

    is sent and the time at which an ACK for that packet arrives. From the difference of the two

    times, TCP knows the sample round trip time (SRTT). TCP estimates the average RTT,

    denoted \overline{RTT}, in the following way:

    \overline{RTT} = (1 - w_R) \cdot \overline{RTT} + w_R \cdot SRTT \qquad (2.1)

    in which w_R (0 ≤ w_R < 1) is a constant weighting factor that weights the old average

    against the latest sample round trip time. A typical value for w_R is 0.125 [34]. The TO is

    estimated as:

    TO = \overline{RTT} + \eta \cdot Dev \qquad (2.2)

    in which Dev is the estimated mean deviation, used to describe the fluctuations, and η

    is a factor that controls how much the deviation affects the TO. A value suggested

    by researchers for η is 3 [34]. Dev is calculated as follows:

    Dev = (1 - w_D) \cdot Dev + w_D \cdot |SRTT - \overline{RTT}| \qquad (2.3)

    where w_D is a fraction between 0 and 1. This operation works like a low-pass filter that

    controls how fast a new sample affects the mean deviation. A typical value for w_D is 0.25

    [34]. Equation (2.3) maintains an exponentially weighted moving average of the

    deviation.

    8 The original value for η was 2 in 4.3BSD UNIX, and was changed to 4 in 4.4BSD UNIX.
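    The estimators in Eqs. (2.1)–(2.3) can be sketched directly in code. The following Python fragment is only an illustration of the three update rules; the class name and the initial values are assumptions, while the weights w_R = 0.125 and w_D = 0.25 follow the typical values quoted above and η = 4 follows the 4.4BSD value mentioned in the footnote.

```python
# A sketch of the RTT and TO estimators in Eqs. (2.1)-(2.3); class and
# variable names are illustrative.

class RttEstimator:
    def __init__(self, first_sample, w_r=0.125, w_d=0.25, eta=4.0):
        self.w_r = w_r               # weighting factor w_R in Eq. (2.1)
        self.w_d = w_d               # weighting factor w_D in Eq. (2.3)
        self.eta = eta               # deviation factor; 4.4BSD uses 4 (see footnote)
        self.avg_rtt = first_sample  # average RTT, initialised with the first sample
        self.dev = first_sample / 2.0

    def update(self, srtt):
        """Feed one sample round trip time (SRTT) and return the new TO."""
        # Eq. (2.3): exponentially weighted moving average of the deviation
        self.dev = (1 - self.w_d) * self.dev + self.w_d * abs(srtt - self.avg_rtt)
        # Eq. (2.1): exponentially weighted moving average of the RTT
        self.avg_rtt = (1 - self.w_r) * self.avg_rtt + self.w_r * srtt
        # Eq. (2.2): retransmission timeout
        return self.avg_rtt + self.eta * self.dev


if __name__ == "__main__":
    est = RttEstimator(first_sample=0.100)          # seconds
    for sample in [0.110, 0.095, 0.150, 0.105]:
        print("TO = %.3f s" % est.update(sample))
```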


    Flow Control

    To prevent a TCP sender’s sending rate from being too high for the TCP receiver to han-

    dle, end-to-end flow control is implemented in TCP [54, 55]. The end-to-end flow control

    is necessary especially for a heterogeneous network environment. Flow control adopts a

    sliding window mechanism9 to continuously inform the TCP sender how much data the re-

    ceiver can accommodate, known as the receiver advertised window size (rwnd), through

    ACKs. The TCP sender controls its outgoing data with a send window whose size is no

    greater than the rwnd carried in the most recently received ACK. Since this work

    focuses on congestion control, and the advertised window size is assumed to be larger than

    the send window size, the advertised window size can be ignored.

    2.2.2 Congestion Control Algorithm


    Figure 2.6: Illustration of congestion collapse.

    While flow control prevents buffer overflow at the TCP receiver, it does not pre-

    vent buffer overflow at the intermediate gateways. The congestion control mechanism was

    9 Sliding window: an algorithm at the heart of TCP that allows the sender to transmit multiple packets up to the size of the window before receiving an ACK.


    introduced in the late 1980s by Van Jacobson [71] to regulate the transmission rate of

    each connection and to prevent it from reaching an inappropriately high rate that the gate-

    way cannot handle. The uncontrolled high rate may eventually lead to congestion collapse

    as illustrated in Fig. 2.6. In Fig. 2.6, the red dashed line represents the network capac-

    ity, at which the network performance would be perfect. Ideally, the network throughput

    converges to this perfect case as the traffic load increases, as shown by the green curve.

    However, in reality the throughput decreases dramatically after some point, before ever reaching

    the perfect case, as shown by the orange curve. This phenomenon is called congestion col-

    lapse. The purpose of congestion control is to avoid congestion collapse and to maintain

    optimal (the highest) throughput of a network. TCP congestion control is a window-based

    mechanism. There are two key variables in a TCP congestion control algorithm: conges-

    tion window size (cwnd) and slow start threshold (ssthresh). cwnd limits the amount of

    data a sender can send into the network, while ssthresh is a threshold variable which separates

    the different congestion control mechanism phases. As mentioned earlier, rwnd is assumed

    to be greater than the send window size. In fact, the maximum send window size is given

    by min(rwnd, cwnd). Since congestion control is the main focus, rwnd is considered

    to be larger than cwnd. This assumption holds when the gateway is the bottleneck of the

    network. Therefore, the window size will refer to the send window, which is controlled by

    the congestion control mechanism, i.e., cwnd.

    The principal operation of TCP congestion control in Reno10 [3] involves the following

    mechanisms: slow start, congestion avoidance, and fast retransmit/fast recovery [135].

    Slow Start

    At the start-up of a connection, a TCP sender starts cautiously with a small (no more than

    2) window size, and then it tries to probe the available bottleneck capacity by exponentially

    increasing the window size until the window size reaches ssthresh. During this slow start

    10 TCP Reno is the most widely deployed TCP version, and is considered the standard TCP.


    phase, for each ACK received, cwnd will increase by one, thus cwnd is doubled every RTT.

    It takes log2 N round trips before the TCP sender can send N packets. The slow start phase

    ends either when cwnd reaches ssthresh, i.e. cwnd ≥ ssthresh, or when congestion

    occurs, which means that either the TO expires or three duplicate ACKs are received. The

    value of ssthresh can be arbitrarily high at the beginning. When TCP experiences

    congestion, it reduces ssthresh to half of the current window size (the window size is reset

    to 1 when a timeout occurs), i.e. ssthresh_new = max(2, cwnd/2). When cwnd increases to a

    value greater than ssthresh, the congestion avoidance phase takes over. Slow start

    is used again once a timeout occurs. As shown in Fig. 2.7, the window size increases

    exponentially during the slow start phase.
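    A rough sketch of the slow start growth rule may help: cwnd grows by one packet per ACK, so it doubles roughly every RTT until it reaches ssthresh. The function below is an illustrative toy model, not the behaviour of any particular TCP implementation.

```python
# A toy model of slow start: cwnd grows by one packet per ACK, i.e. it
# roughly doubles every RTT, until it reaches ssthresh. Units are packets.

def slow_start(cwnd, ssthresh, rounds):
    """Simulate up to `rounds` RTTs of slow start and return the cwnd trajectory."""
    trajectory = [cwnd]
    for _ in range(rounds):
        acks_this_rtt = cwnd                  # one ACK per packet sent in this RTT
        for _ in range(acks_this_rtt):
            if cwnd >= ssthresh:              # hand over to congestion avoidance
                return trajectory
            cwnd += 1                         # +1 per ACK => doubling per RTT
        trajectory.append(cwnd)
    return trajectory


if __name__ == "__main__":
    print(slow_start(cwnd=1, ssthresh=64, rounds=10))   # [1, 2, 4, 8, 16, 32, 64]
```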

    Congestion Avoidance

    During the congestion avoidance phase, the window size increases more cautiously, growing

    by 1/cwnd for each ACK received. Hence, in the conges-

    tion avoidance phase the window size increases linearly by one packet for each RTT, which

    is often known as additive increase (AI) algorithm. The window size is halved (if the halved

    value is smaller than 1 packet, then the window size will be reduced to 1 packet) once congestion

    is detected. This is often referred to as multiplicative decrease (MD). A conges-

    tion can be due to a TO expiration which will trigger a retransmission with the window size

    reduced to 1 and ssthresh halved. Occurrence of three duplicate ACKs of Fast Retransmit

    in TCP is also considered a signal of packet loss and thus triggers the multiplicative de-

    crease in both the window size and ssthresh. Furthermore, in an ECN (Explicit Congestion

    Notification)-capable network, receiving a packet with the ECN bit set in its header is also considered a signal

    of congestion. The window size and ssthresh are reduced to half of the current window

    size. As shown in Fig. 2.7, the window size dynamics in the TCP congestion avoidance

    phase is known as Additive Increase Multiplicative Decrease (AIMD). The AIMD pattern


    of continual increase and decrease of the window size continues throughout the lifetime

    of the connection. The important concept for AIMD is that the source reduces its window

    size at a much faster rate than it increases it. In steady state, a non-congested connection is

    maintained in the congestion avoidance phase and follows the AIMD pattern. Ideally, in

    the congestion avoidance phase, the waveform of the window size resembles a periodic

    sawtooth. This periodic behavior is the basis of many TCP dynamics models.
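    The AIMD pattern can be illustrated with a short toy simulation: one packet of additive increase per RTT, and a halving of the window whenever a congestion signal is assumed to arrive. The loss condition used below (a fixed window threshold) is purely illustrative.

```python
# A toy sketch of the AIMD sawtooth: additive increase of one packet per RTT,
# multiplicative decrease (halving) whenever a congestion signal is assumed.
# The loss condition (a fixed threshold) is purely illustrative.

def aimd(cwnd, rtts, loss_threshold):
    """Return the window trajectory over `rtts` round trips."""
    trajectory = []
    for _ in range(rtts):
        if cwnd > loss_threshold:             # ECN mark or three duplicate ACKs assumed
            cwnd = max(1.0, cwnd / 2.0)       # multiplicative decrease
        else:
            cwnd += 1.0                       # additive increase: one packet per RTT
        trajectory.append(cwnd)
    return trajectory


if __name__ == "__main__":
    # Produces the familiar sawtooth oscillating roughly between 20 and 40 packets.
    print(aimd(cwnd=20.0, rtts=60, loss_threshold=40.0))
```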

    Fast Retransmit/Fast Recovery

    Fast Retransmit is an enhancement to TCP for reducing the waiting time for a sender before

    retransmitting a lost packet. When a TCP sender receives three duplicate ACKs, i.e. an

    original plus three absolutely identical copies in a row, congestion is declared. The packet is

    considered lost. When a duplicate ACK is received by the TCP sender, it indicates that the

    receiver has received a packet out of order, suggesting that the earlier packet has probably

    been lost. The lost packet is then retransmitted without waiting for the retransmission timer

    to expire. The send window size is reduced to half of its current value. Note that this cannot

    happen if the congestion window is smaller than four packets.

    The Fast Recovery algorithm is another improvement of TCP. While the fast retransmit

    algorithm sends the lost packet in the congestion avoidance phase and not in slow start

    phase, the fast recovery algorithm allows high throughput under moderate congestion, in-

    stead of resuming slow start, especially for large windows. Fast recovery is only executed

    if the packet loss has been detected by fast retransmit.
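    The duplicate-ACK counting that underlies fast retransmit can be sketched as follows; the ACK numbering and the hypothetical ACK stream are simplified for illustration.

```python
# A sketch of duplicate-ACK counting for fast retransmit. ACK numbers here
# denote the next packet expected; the stream below is hypothetical.

def fast_retransmit(ack_stream, cwnd):
    """Scan ACK numbers; after three duplicate ACKs, retransmit and halve cwnd."""
    events = []
    last_ack, dup_count = None, 0
    for ack in ack_stream:
        events.append(("ack", ack))
        if ack == last_ack:
            dup_count += 1
            if dup_count == 3:                # original ACK plus three duplicates
                cwnd = max(1, cwnd // 2)      # window halved (multiplicative decrease)
                events.append(("retransmit", ack, "cwnd", cwnd))
        else:
            last_ack, dup_count = ack, 0
    return events


if __name__ == "__main__":
    # Packet 5 is lost, so the ACK for packet 5 keeps repeating.
    for event in fast_retransmit([1, 2, 3, 4, 5, 5, 5, 5, 6], cwnd=16):
        print(event)
```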

    2.2.3 TCP Flavors

    The original TCP [126] version includes mechanisms for providing reliable connection

    services, such as timeout based retransmission, full-duplex data service, and flow control.

    Since it does not include any congestion control mechanism, it is no longer used in the


    Figure 2.7: Illustration of the TCP congestion window. "SS" is the slow start phase, during which the window size increases exponentially until it reaches the slow start threshold, ssthresh. ssthresh is set to half of the current window size whenever congestion is experienced. "CA" represents the congestion avoidance phase, during which the window size increases linearly and decreases multiplicatively. Multiplicative decrease of the TCP window size occurs either upon receiving an ECN bit header or three duplicate ACKs. When the "TO" timer expires, the window size is reduced to 1 and the system re-enters the slow start phase.


    current Internet. However, it is the basis of all other TCP versions.

    TCP Tahoe [71] is implemented with a congestion control mechanism which includes

    slow start, congestion avoidance, and fast retransmit. Fast recovery is not yet included in

    Tahoe. Each time a packet loss occurs, TCP Tahoe re-enters the slow start phase.

    TCP Reno [3] is the most widely used TCP version, which has included slow start,

    congestion avoidance, fast retransmit and fast recovery. The main improvement of Reno

    over Tahoe is that Reno reduces the window size as well as ssthresh to half of the current

    window size when a congestion signal is received, which can be either an ECN bit or three

    duplicate ACKs. As a result, TCP Reno does not re-enter slow start

    for each packet loss, and therefore throughput in Reno is improved. As in Tahoe,

    Reno still enters slow start when a timeout occurs. Reno is the TCP version studied in this

    thesis.

    TCP NewReno [47] improves the loss recovery performance of Reno by modifying the

    fast recovery mechanism. In NewReno, when a packet loss is detected during fast

    retransmit, the highest sequence number transmitted till then is remembered. Fast recovery

    is finished only after receiving the ACK with the highest sequence number sent before

    a loss occurs. Additional losses during fast recovery are detected by the reception of partial

    acknowledgments. A partial acknowledgment is an ACK for new data with a lower sequence

    number than the highest data packet retransmitted before the packet loss. A problem with

    NewReno is that when there are no packet losses but packets are reordered by more

    than three sequence numbers, NewReno may mistakenly enter fast recovery.

    NewReno is still under investigation and is being enhanced by incorporating additional

    algorithms [63].

    TCP SACK [99] uses selective acknowledgments (SACKs) to enable the TCP sender

    to retransmit lost packets faster than one packet per round trip time. Using SACK, the TCP

    receiver can explicitly inform the TCP sender about packets that have been received

    successfully, so the sender needs to retransmit only the packets that have actually been lost.


    The SACK implementation can still use the same congestion control algorithms as Reno.

    Moreover, SACK is useful for satellite Internet access [2].

    TCP Vegas [17] was presented in 1994, before the introduction of NewReno and SACK.

    It is fundamentally different from other TCP versions. Vegas does not use packet loss as

    a trigger to reduce the window size. In Vegas, additive increase and additive decrease of

    the window size are deployed. The idea of Vegas is to use throughput to detect conges-

    tion. Vegas estimates the throughput of a connection by evaluating the packets-in-flight,

    which is a delay-bandwidth product. If the estimated throughput is higher than the actual

    throughput, congestion is said to exist. Hence, reduction of the window size at the TCP

    sender is executed. A variable, Diff, representing the difference between the estimated and

    actual throughputs is maintained in Vegas. Vegas also modifies the slow start algorithm

    to find the correct window size without incurring a loss by exponentially increasing the

    window size every other RTT. RTT is used to calculate the variable, Diff. A different re-

    transmission strategy is used in Vegas, whose principle is that when RTT is greater than

    the timeout value, instead of waiting for three duplicate ACKs, Vegas starts retransmission

    after receiving one duplicate ACK. By doing so, cases in which the senders never receive

    three duplicate ACKs can be avoided.
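    The Vegas congestion detector described above is often written in the following textbook form, where the expected throughput uses the minimum observed RTT and the actual throughput uses the current RTT; the thresholds alpha and beta below are assumptions for illustration and are not taken from the thesis.

```python
# A sketch of the Vegas congestion detection logic in its common textbook
# form: expected throughput uses the minimum (base) RTT, actual throughput
# uses the current RTT, and Diff drives an additive window adjustment.
# The thresholds alpha and beta are illustrative assumptions.

def vegas_adjust(cwnd, base_rtt, current_rtt, alpha=1.0, beta=3.0):
    """Return the new congestion window after one Vegas update."""
    expected = cwnd / base_rtt             # throughput if no queuing occurred
    actual = cwnd / current_rtt            # measured throughput
    diff = (expected - actual) * base_rtt  # extra packets queued in the network
    if diff < alpha:
        return cwnd + 1.0                  # additive increase: path under-used
    if diff > beta:
        return cwnd - 1.0                  # additive decrease: queue building up
    return cwnd                            # leave the window unchanged


if __name__ == "__main__":
    print(vegas_adjust(cwnd=20.0, base_rtt=0.100, current_rtt=0.105))
```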

    FAST TCP [75] is an alternative congestion control algorithm built on TCP Vegas. FAST

    TCP aims at providing flow level properties such as stable equilibrium, well-defined fair-

    ness, high throughput and link utilization. The basic idea of FAST TCP is to use queuing

    delay to assess and address the congestion. There is an additional modification requirement

    at the sender. FAST TCP applies an equation based approach at the source to control the

    sending rate. By appropriately selecting the equation and feedback mechanism, FAST TCP

    eliminates the packet level oscillations, improves the flow level dynamics and achieves its

    objective of high performance, stability and fairness in general networks. However, its stability eval-

    uation is limited to a single link with heterogeneous sources, and feedback delay is ignored.

    Moreover, many experimental scenarios are designed to identify the properties of FAST


    TCP but those scenarios are not very realistic.

    TCP Westwood [97] (TCPW) aims to improve the congestion control function by esti-

    mating an eligible sending rate, and configuring the congestion control parameters accord-

    ingly. TCP Westwood is a sender side modification of TCP Reno. TCP Westwood con-

    gestion control is based on bandwidth estimation by monitoring the ACK reception rate.

    By employing an adaptive decrease mechanism, TCP Westwood congestion control may

    avoid halving the TCP window size as in standard TCP AIMD and improve

    the stability of TCP. Compared to standard TCP, TCP Westwood provides a congestion

    window that is reduced more in the presence of heavy congestion and less in the presence

    of light congestion.

    HighSpeed TCP [50] (HS TCP) reduces the loss recovery time by modifying standard

    TCP’s AIMD algorithm. HS TCP would only be effective for large congestion windows.

    When the congestion window size is smaller than a threshold or the loss ratio is too high,

    HS TCP will behave the same as the standard TCP algorithm. HS TCP performs well in

    high-speed long-distance links. On the one hand, HS TCP improves the link utilization of

    bursty traffic networks. On the other hand, it may lose fairness between the connections.

    2.3 Active Queue Management (AQM)

    The performance of applications that are built on TCP depends not only on the TCP conges-

    tion control algorithm but also on the queue management strategies in network routers.

    Active queue management mechanisms are implemented in Internet gateways to assist con-

    gestion control algorithms to manage the network. The main goal of AQM algorithms is to

    allow network operators simultaneously to achieve low packet loss and high throughput by

    detecting incipient congestion.


    2.3.1 The Need for AQM

    TCP congestion control is an end-to-end control scheme. In TCP congestion control algo-

    rithms, packet loss is considered as an indicator of congestion, and thus triggers congestion

    control in which the sending rate of the TCP sender is reduced. TCP congestion control

    has been considered a great success in the Internet so far. However, it has limitations. A

    TCP end system heals the network only after congestion has occurred, i.e., one or more

    packets have been lost. When congestion occurs at the gateway, it takes about one RTT be-

    fore the TCP sender is informed. As the network is already congested and can handle less

    injected traffic, it is harmful to the network if the TCP sender keeps increasing its sending rate

    before it is notified of the congestion. Further increase of the sending rate when congestion

    already occurs would lead to more serious congestion, causing even longer transmission

    delay, faster degradation of network throughput, and more packet losses. Additionally,

    there is increasing demand for QoS in the Internet, as multimedia and peer-to-peer applica-

    tions are becoming more popular. Congestion control, on its own, may not be able to fulfill the

    requirement. Moreover, TCP traffic contributes to the bursty nature of the Internet traffic.

    Larger buffer size at gateways is required in order to absorb the burstiness and to lower the

    number of lost packets. However, larger buffer size may cause longer queuing delay and

    even larger burstiness when congestion occurs. The queue length in the buffer should be

    managed intelligently in order to optimize the network performance. Hence, an intelligent

    queue management scheme is needed.

    Traditionally, the DropTail queue management, in which a first-in-first-out policy is

    used, is widely deployed in Internet gateways. It is simple and easy to implement and

    therefore still dominates today’s Internet. In the DropTail scheme, if the incoming packets

    exceed the buffer capacity, the latest arriving packets will be dropped. The DropTail queue

    management scheme has the following problems: firstly, it may cause lock-out, a scenario


    where the network bandwidth is occupied by a small number of traffic flows while access requests from other con-

    nections to the gateway are all denied. Secondly, the network is inefficient,

    because no advance congestion warning is given to the end systems. Thirdly, global syn-

    chronization of TCP flows may occur as a result of the high correlation among packet

    droppings. The simultaneous drops of the packets result in the synchronous reduction of

    the TCP window sizes. Moreover, there might be large queue oscillation and jitter which

    causes TCP traffic to be more bursty. Additionally, since DropTail does not drop packets

    until the queue is full, the queue stays full for long periods of time. Thus, a long

    queuing delay is maintained.

    To avoid the inherent problems of DropTail queue management, IETF recommends

    AQM for the next generation Internet. Unlike DropTail queue management, AQM is a

    proactive congestion control scheme and provides preventive measures to manage the gate-

    way buffer.

    Goals of AQM have been specified in RFC 2309 as follows:

    • Reduce the number of packets dropped in routers;

    • Provide low-delay interactive services;

    • Avoid lock-out behaviors.

    2.3.2 Explicit Congestion Notification (ECN)

    The original Internet protocol does not have any code field for signaling congestion, and

    the most common way to signal congestion is by dropping packets. Explicit Congestion

    Notification (ECN) [46, 119] is an extension to the Internet protocol to provide the ability

    of explicit notification of congestion. It is inherently coupled with AQM. The basic idea

    of AQM is to provide TCP senders with information about the imminent congestion, by

    sending appropriate indications to the TCP senders before the queue overflows. Instead


    of informing TCP senders of congestion by dropping packets, as is the case in a DropTail

    queue management network, AQM gateways can mark packets during congestion by setting

    the ECN bit in the packet header. The Congestion Experienced (CE) code point in the IP packet

    header is reserved as the possible congestion indication11. In an ECN-capable network, i.e.,

    one that is able to react to a congestion notification, the TCP sources respond to the ECN

    bit set in exactly the same way they react to a dropped packet. The choice of marking

    packets during congestion depends on the AQM policy. There are two main advantages of

    the ECN-enabled network. One is its ability to avoid unnecessary packet drops, and the

    other is its ability to reduce unnecessary retransmission TOs.

    ECN is an optional feature, and the effectiveness of ECN requires the deployment of

    AQM.

    2.3.3 Random Early Detection (RED)

    Random Early Detection (RED) [45] is the default AQM scheme recommended by IETF

    for the next generation Internet [16]. RED was first proposed by Sally Floyd and Van

    Jacobson in 1993 [45] and is probably the most well known AQM scheme.

    The basic idea of RED is to detect an imminent congestion at gateways by comparing

    the computed average queue length, x, at the gateways with two thresholds, Xmin and Xmax.

    In the original RED algorithm, if the average queue length x is smaller than Xmin, no packet

    will be marked. That is to say, all TCP senders can further increase their sending rates,

    since the network is able to accommodate more packets. If the average queue length x

    is greater than Xmax and not greater than the buffer size B, all packets in the buffer will

    be marked to prevent the threatening congestion. As the queue length reaches the buffer

    size B, congestion occurs: all the packets in the buffer are marked and all arriving

    packets are dropped. RED can also drop packets

    11Technically, two bits are required for ECN. One is set by the source to indicate that the packet is ECN-capable. The other is set by gateways along the transmission path when congestion is experienced.


    instead of marking them, depending on the configuration of RED and on whether the network is

    ECN capable. When the average queue length is between Xmin and Xmax, congestion is

    well controlled and high utilization is maintained. The packets in the buffer are randomly

    chosen and marked by a certain marking probability p. The marking probability increases

    linearly with increasing x, until x reaches Xmax, at which the probability is given by pmax.

    The probability function for gentle RED is given as follows and shown in Fig. 2.8.

    p_b =
    \begin{cases}
    0 & 0 \le x < X_{min} \\
    \dfrac{x - X_{min}}{X_{max} - X_{min}} \, p_{max} & X_{min} \le x \le X_{max} \\
    p_{max} + \dfrac{1 - p_{max}}{X_{max}} \, (x - X_{max}) & X_{max} < x \le 2X_{max} \\
    1 & 2X_{max} \le x \le B
    \end{cases}
    \qquad (2.4)

    where x is the average queue length at the RED gateway; α is the queue weight in RED12;

    Xmax and Xmin are the maximum and minimum thresholds at the RED gateway, respectively; B

    is the buffer size; pb is the marking probability assigned by the RED algorithm; and pmax is the

    marking probability when the average queue length is at Xmax.
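    Equation (2.4) translates directly into code. The sketch below is a plain transcription of the gentle RED marking probability; the parameter values in the example are illustrative only.

```python
# A plain transcription of the gentle RED marking probability in Eq. (2.4).

def gentle_red_probability(x, x_min, x_max, p_max):
    """Return p_b for average queue length x (x is assumed to be at most B)."""
    if x < x_min:
        return 0.0
    if x <= x_max:
        # Linear ramp from 0 at Xmin up to pmax at Xmax
        return (x - x_min) / (x_max - x_min) * p_max
    if x <= 2 * x_max:
        # "Gentle" region: linear ramp from pmax at Xmax up to 1 at 2*Xmax
        return p_max + (1.0 - p_max) / x_max * (x - x_max)
    # For 2*Xmax <= x <= B every packet is marked
    return 1.0


if __name__ == "__main__":
    for q in [10, 50, 100, 150, 220, 300]:
        print(q, round(gentle_red_probability(q, x_min=50, x_max=150, p_max=0.1), 3))
```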

    As shown in Fig. 2.8, the marking probability is a function only of the average queue

    length, x. However, the real implementation for RED is ac