
  • The Hong Kong Polytechnic University

    Department of Electronic and Information Engineering

    Stability Analysis of The Internet Congestion Control

    Xi Chen

    A thesis submitted in partial fulfillment of the requirements for

    the degree of Doctor of Philosophy

    February 2009

  • Abstract

    Abstract of the thesis entitled “Stability analysis of the Internet congestion control”

    submitted by Xi CHEN for the degree of Doctor of Philosophy at The Hong Kong Poly-

    technic University in February 2009

The Internet has become an important medium of information transfer. The TCP/IP protocol suite and the interconnected gateways provide reliable channels for the flow of information, which are shared evenly among different connections. It has been

    known that a bottleneck RED (Random Early Detection) gateway can become oscillatory

    when regulating multiple identical TCP (Transmission Control Protocol) flows.

    In this thesis, we will study the stability issue in the TCP-RED system. The stabil-

    ity boundary of the TCP-RED system depends on many network parameters, making the

    adjustment of the RED gateway a difficult task. Based on a fluid-flow model (FFM), we

    formulate analytical conditions that describe the stability boundary of the RED gateway

    which depends on the number of TCP Reno connections. The proposed model accurately

generates a stability boundary surface in a four-dimensional space, which facilitates the

    adjustment of parameters for stable operation of the RED gateway. The accuracy of the

    analytical results has been verified using the ns-2 network simulator.

    We will use the fluid-flow model to derive the system characteristic frequency, and then

    compare it with the frequencies of the RED queue length waveforms observed from ns-2

simulations. The ns-2 simulator is a simulation tool widely accepted by industry for verification purposes. Analysis of the TCP source frequency distribution reveals the

    occurrence of period doubling when the system enters the instability region as the filter

    resolution varies. Since random events and a large number of TCP flows are involved in

    the process of generating the average system dynamics, a statistical viewpoint is taken in

    the analysis. Our results reflect the true system behavior as they are based on data from ns-2

    simulations rather than numerical simulations of analytical models. The physical mecha-

nism of oscillation is explained in terms of the difference between the TCP source frequency and

    the TCP-RED system characteristic frequency.

    The detrended fluctuation analysis (DFA) method is used to analyze the stability of

    the Internet RED gateway. In DFA, time-series data are analyzed to generate a key pa-

rameter called the power-law scaling exponent, which indicates the long-range correlations of the time series. By examining the variation of the DFA scaling exponent as the system parameters vary, we quantify the stability of the RED system in terms of the system's characteristics.

    Finally, the random explicit congestion notification (ECN) marking distribution mech-

    anism in RED gateways has been studied. The randomness of the RED ECN marking

algorithm is incorporated into the FFM. The new model is shown to have better dynamic

    performance, as verified by the waveforms provided by ns-2 simulations.


  • Acknowledgements

    My sincere gratitude goes to my supervisors Dr. Siu-Chung Wong and Prof. Michael

    Tse, for their valuable advice, patient guidance, and generous support throughout the study.

Without their support, this research project would not have been completed.

I thank my former advisor Dr. Wing-Kuen Ling, who introduced me to nonlinear science, for his constant teaching and encouragement.

I would also like to thank my collaborator, Prof. Ljiljana Trajković, for her valuable

    ideas and suggestions.

At the same time, I would like to acknowledge the members of our research group:

    Yuehui Huang, Xiaohui Qu, Xiaoke Xu, Jie Zhang, Junfeng Sun, Guang Feng, Sufen Chen,

    Zhen Li, Takayuki Kimura, Xiaofan Liu, Xiaodong Luo, Qingfeng Zhou, Rongtao Xu, Xia

    Zheng, Yang Liu, and Xiumin Li for their support and valuable discussions on my research.

    I wish to thank Dr. Jianbo Gao, Dr. Wen-wen Tung and Rongsheng Huang for their

    hospitality during my visit at the University of Florida.

    I gratefully acknowledge the Research Committee of The Hong Kong Polytechnic Uni-

    versity for the financial support during the entire period of my candidature.

Last, but far from least, I would like to thank my parents and my elder sister for

    their love and care over the years, for their persistent support, encouragement, and under-

    standing.


  • Abbreviations

Abbreviation Full phrase

    ACK Acknowledgement

    AI Additive Increase

    AIMD Additive Increase Multiplicative Decrease

    AQM Active Queue Management

    ARED Adaptive RED

    AVQ Adaptive Virtual Queue

    BRED Balanced RED

    CBR Constant Bit Rate

    CBT RED Class-Based Threshold RED

    CE Congestion Experience

    CHOKe CHOose and Keep for responsive flows

    cwnd Congestion Window Size

    DDE Delayed Differential Equation

    DFA Detrended Fluctuation Analysis

    DoD Department of Defense

    DSRED Double Slope RED

    ECN Explicit Congestion Notification

    e-mail Electronic Mail

FFT Fast Fourier Transform

    FFM Fluid-Flow Model

    FRED Flow RED

    FIN Finish

FTP File Transfer Protocol

    HS TCP High Speed TCP

IAB Internet Architecture Board

IETF Internet Engineering Task Force

    IP Internet Protocol

    LBL Lawrence Berkeley Laboratory

    LRD Long-Range Dependence

    MD Multiplicative Decrease

    MILNET MILitary NETwork

    OSI Open Systems Interconnection

OTcl Object-oriented Tool Command Language

    P Proportional

    PD Proportional-Differential

    PI Proportional-Integral

    PI-PD Proportional-Integral-Proportional-Derivative

    PARC Palo Alto Research Center

    QoS Quality of Service

    RED Random Early Detection

    REM Random Exponential Marking

    RFC Request For Comments

    RFFM Randomized Fluid-Flow Model

    rms Root Mean Square

RST Reset

    RTT Round Trip Time

    rwnd Receiver Advertised Window Size

    SACK Selective ACKnowledgment

    SRED Stabilized RED

    SRTT Sample Round Trip Time

    SSH Secure Shell

    ssthresh Slow Start Threshold

    SYN Synchronization

    TCP Transmission Control Protocol

    TCPW TCP Westwood

    Telnet Network Terminal Protocol

TFTP Trivial File Transfer Protocol

    TO Timeout

    UC Berkeley University of California, Berkeley

    UDP User Datagram Protocol

USC University of Southern California

    VINT Virtual InterNetwork Testbed

    WWW World-Wide-Web


  • Nomenclature List

    Symbol Description

    α the exponential moving average weight parameter at the RED gateway

    β the DFA scaling exponent of the queue length

    βT j the DFA scaling exponent of the series of the jth TCP window period

    βT the DFA scaling exponent of the TCP window period

    ϕ the constant for controlling target queue length

    Φ the target oscillation range, (Xmax − Xmin)

    κ the proportionality constant used in fluid flow model

    ρi, j the degree of similarity between flow i and flow j

B the buffer size of the RED gateway

    C the bottleneck bandwidth

    cm ECN unmark counter

    Dev the estimated mean deviation

fδ the TCP source frequency, 1/T

    H the Hurst parameter

K the number of packets in flight

    N the number of connections

pb the marking/dropping probability assigned by the RED algorithm

pmax the marking probability when the average queue length is at Xmax

q the instantaneous queue length

    q0 the target queue length

    Ro propagation delay

ro the round trip time

RTT the estimated average round trip time

S the matrix constructed from ρi, j

T the TCP source sending-rate period

    T j the TCP window period of the jth TCP flow

    w the TCP window size

    Wsum the total window size

    x the average queue length

    Xmax the maximum threshold at the RED gateway

    Xmin the minimum threshold at the RED gateway

    wR the weight factor for computing sample round trip time


  • Table of Contents

    Abstract iii

    Acknowledgements v

    Abbreviations vi

    Nomenclature List ix

    1 Introduction 1

    1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

    1.2 Contribution of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . 4

    1.3 Publications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7

    1.4 Organization of the Thesis . . . . . . . . . . . . . . . . . . . . . . . . . . 8

    2 Background 10

    2.1 Internetworking: Concepts, Architectures and Protocols . . . . . . . . . . . 10

    2.2 TCP and Congestion Control . . . . . . . . . . . . . . . . . . . . . . . . . 15

    2.2.1 TCP Prime . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15

    2.2.2 Congestion Control Algorithm . . . . . . . . . . . . . . . . . . . . 20

    2.2.3 TCP Flavors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23

    2.3 Active Queue Management (AQM) . . . . . . . . . . . . . . . . . . . . . 27

2.3.1 The Need of AQM . . . . . . . . . . . . . . . . . . . . . . . . . . 28

    2.3.2 Explicit Congestion Notification (ECN) . . . . . . . . . . . . . . . 29

    2.3.3 Random Early Detection (RED) . . . . . . . . . . . . . . . . . . . 30

    2.3.4 RED and Its Variants . . . . . . . . . . . . . . . . . . . . . . . . . 34

    2.4 Network Models . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

    2.4.1 Discrete Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

    2.4.2 Fluid Flow Model . . . . . . . . . . . . . . . . . . . . . . . . . . 39

    2.5 Network Simulator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42

    2.5.1 Data Collection and Analysis . . . . . . . . . . . . . . . . . . . . . 42

    3 Stability Analysis 45

    3.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45

    3.2 TCP-RED Fluid-Flow Model . . . . . . . . . . . . . . . . . . . . . . . . . 48

    3.3 Stability Boundary of TCP-RED System . . . . . . . . . . . . . . . . . . . 51

    3.3.1 Steady-State Solution and Target Queue Length . . . . . . . . . . . 51

    3.3.2 Linearization and Perturbation . . . . . . . . . . . . . . . . . . . . 52

    3.3.3 Closed-Form Stability Condition . . . . . . . . . . . . . . . . . . . 53

    3.4 Verification of Stability Boundaries . . . . . . . . . . . . . . . . . . . . . . 55

    3.4.1 Definition of Stability . . . . . . . . . . . . . . . . . . . . . . . . 55

    3.4.2 Comparison of Stability Boundaries Using Different Simulation

    Methods and Approximations . . . . . . . . . . . . . . . . . . . . 59

    3.4.3 Cross-Sectional Views . . . . . . . . . . . . . . . . . . . . . . . . 60

    3.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62

    4 Nonlinear Analysis 67

    4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

    4.2 Characteristic Frequency . . . . . . . . . . . . . . . . . . . . . . . . . . . 69

    4.2.1 Steady-State Solution . . . . . . . . . . . . . . . . . . . . . . . . . 71

4.2.2 Finding the Characteristic Frequency . . . . . . . . . . . . . . . . 72

    4.3 Actual Steady-State Waveforms of TCP Sources . . . . . . . . . . . . . . . 74

    4.4 Comparison of Results from Fluid-Flow Model Calculations and Ns-2 Sim-

    ulations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

    4.5 Characteristic Frequency and Period Doubling From A Statistical Perspective 80

    4.6 Mechanism of Oscillation . . . . . . . . . . . . . . . . . . . . . . . . . . . 88

    4.7 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89

    5 Detrended Fluctuation Analysis 90

    5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90

    5.2 Property of Time Series and DFA method . . . . . . . . . . . . . . . . . . 94

    5.2.1 Self-Similarity and Long-Range Dependence . . . . . . . . . . . . 94

    5.2.2 DFA Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

    5.3 Long-Range Power-Law Correlations in Queue Length . . . . . . . . . . . 96

    5.3.1 Scaling Exponent of Queue length and System Stability . . . . . . 98

    5.3.2 Interpretation from a Waveform Viewpoint . . . . . . . . . . . . . 99

    5.4 Long-Range Power-Law Correlations in Series of TCP Window Periods . . 100

    5.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107

    6 Randomized Fluid Flow Model 108

    6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108

    6.2 RED ECN Marking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 110

    6.2.1 Distribution of Consecutive ECN Generation . . . . . . . . . . . . 110

    6.2.2 Randomized Fluid-Flow Model . . . . . . . . . . . . . . . . . . . 112

    6.3 Results from RFFM . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115

    6.3.1 Comparison RFFM and ns-2 Simulations . . . . . . . . . . . . . . 115

    6.3.2 Phase Portrait and Randomness . . . . . . . . . . . . . . . . . . . 117

    6.4 Modeling Interactive Bottleneck Gateways . . . . . . . . . . . . . . . . . . 117

6.5 Summary . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122

    7 Conclusion and Future Work 123

    7.1 Main Contributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123

    7.2 Future Work . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125

    Bibliography 126


  • List of Tables

    2.1 RED configuration parameters in ns-2 simulator . . . . . . . . . . . . . . 43

    3.1 Abbreviations used in graphical presentation . . . . . . . . . . . . . . . . . 55

    4.1 Parameters for ns-2 simulations . . . . . . . . . . . . . . . . . . . . . . . . 77

    6.1 Simulation parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 112


  • List of Figures

    1.1 Flow charts of the main contributions. . . . . . . . . . . . . . . . . . . . . 5

    2.1 Illustration of links and a network cloud. (a) Point-to-point link; (b) multiple-

    access link; (c) switched network cloud. . . . . . . . . . . . . . . . . . . . 11

    2.2 The illustration of interconnection of networks. An internet is formed by

    a gateway interconnecting two physical networks. The networks can be of

    different types. End systems can be attached to either of the networks. . . . 12

    2.3 Illustration of the internetworking concept. An internet is formed with six

    gateways interconnecting five physical networks. A host in the underly-

    ing physical structure can be attached to any one of the physical networks

    which are interconnected by gateways. A host in the internet can commu-

    nicate with any other host in the internet, even though the two hosts may

    be attached to different types of networks in the internet. . . . . . . . . . . 14

    2.4 Illustration of the Internet layer operations. . . . . . . . . . . . . . . . . . 15

2.5 (a) Three-way handshaking in TCP connection establishing, and (b) four-

    way handshaking in TCP connection termination. . . . . . . . . . . . . . . 18

    2.6 Illustration of congestion collapse. . . . . . . . . . . . . . . . . . . . . . . 20

2.7 Illustration of TCP congestion window. “SS” is the slow start phase, dur-

    ing which the window size exponentially increases until it reaches the slow

    start threshold, ssthresh. ssthresh is set to half of the current window size

    whenever a congestion is experienced. “CA” represents the congestion

    avoidance phase during which the window size is increasing linearly and

    decreasing multiplicatively. Multiplicative decrease of the TCP window

    size occurs either by receiving an ECN bit header or three duplicate ACKs.

    When “TO” timer expires, the window size is reduced to 1 and the system

    re-enters the slow start phase. . . . . . . . . . . . . . . . . . . . . . . . . . 24

2.8 Marking probability function in RED. Dashed line is the marking or dropping

    probability for the original RED, and solid line is the probability for the

    gentle RED. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32

    2.9 The control feedback loop formed by the TCP sources and the RED gateway. 40

    2.10 Fields appearing in the trace file. . . . . . . . . . . . . . . . . . . . . . . . 43

3.1 A system of N TCP flows, from Si to Di, where i = 1, 2, · · · , N, passing

    through a common bottleneck link between G1 and G2. . . . . . . . . . . . 48

    3.2 Stability boundary surface for (a) Φ = 128 packets, (b) Φ = 256 packets,

    and (c) Φ = 384 packets. Region below each surface is “stable” and above

    is “unstable”. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

    3.3 ns-2 simulation for Pmax = 1/11.4: N = 256, C = 51.2 Mbps, α = 0.01,

    qo = 384 packets, and ro = 64 ms. . . . . . . . . . . . . . . . . . . . . . . 56

    3.4 ns-2 simulation for Pmax = 1/13.8 (adjusted to maintain the original target

    queue length): N = 256, C = 51.2 Mbps, α = 0.01, qo = 384 packets, and

    ro = 73 ms (Ro = 20 ms). . . . . . . . . . . . . . . . . . . . . . . . . . . . 57

3.5 ns-2 simulation for Pmax = 1/17 (adjusted to make the system stable): N =

    256, C = 51.2 Mbps, α = 0.01, qo = 384 packets, and ro = 75 ms (Ro = 20

    ms). . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

    3.6 ns-2 simulation for Pmax = 1/44: N = 150, C = 30 Mbps, α = 0.001, qo =

    384 packets, ro = 155 ms and K = 1078.8. . . . . . . . . . . . . . . . . . . 60

    3.7 FFT of the average queue length for the first 100 seconds of the ns-2 simu-

    lation shown in Fig. 3.6. . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

    3.8 ns-2 simulation for Pmax = 1/41: for N = 150, C = 30 Mbps, α = 0.001,

    qo = 384 packets, ro = 145 ms, and K = 1006.9. . . . . . . . . . . . . . . . 62

    3.9 FFT of average queue length for the first 100 seconds ns-2 simulation

    shown in Fig. 3.8. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

    3.10 Comparison of various methods and approximations. Region below a curve

    is “stable” and above is “unstable”. The deficiency of the Padé(0,1) lin-

    earization (i.e., “A1” and “SL1”) is clearly evident. . . . . . . . . . . . . . 63

    3.11 Comparison of stability boundaries from closed-form solution based on

    Padé(1,1) linearization (solid and dashed curves) for various α, Φ = 256

    packets, and qo = 384 packets, corresponding to Fig. 3.2(b). Region below

    a curve is “stable” and above is “unstable”. . . . . . . . . . . . . . . . . . 64

    3.12 Comparison of stability boundaries from closed-form solution based on

    Padé(1,1) linearization (solid curves) for various N and Φ = 256 packets

    corresponding to Fig. 3.2(b). Region below a curve is “stable” and above

    is “unstable”. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65

3.13 Comparison of stability boundaries between the closed-form solution based

on Padé(1,1) linearization (dashed curves) for various N and Φ = 256 pack-

    ets corresponding to Fig. 3.2(b). The values of N for the full simulations

based on the fluid-flow model (labelled as “S”) are intentionally adjusted to

    fit the ns-2 simulations of Fig. 3.12 and to show a constant offset of N = 30

    of the fluid-flow model from the actual values given by ns-2 simulations.

    Region below a curve is “stable” and above is “unstable”. . . . . . . . . . . 66

    3.14 Comparison of stability boundaries for various Φ for N = 256, α = 0.001

    and qo = 384 packets. Region below a curve is “stable” and above is “un-

    stable”. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66

    4.1 Simulated queue length waveforms of TCP-RED using the ns-2 simulator

    for different values of filter resolution α. (a) α = 0.1, (b) α = 0.0005, and

    (c) α = 0.0008 for 170 TCP connections. Each connection shares a fixed

    bandwidth of 1.5 Mb/s in the bottleneck link. . . . . . . . . . . . . . . . . 70

    4.2 Ideal steady-state waveform of TCP sender’s window size. . . . . . . . . . 75

    4.3 Waveform of TCP source window size of a connection at filter resolution

    α = 0.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

    4.4 Waveform of TCP source window size of a connection at filter resolution

    α = 0.001. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76

    4.5 Comparison of characteristic frequency fc from linearized fluid-flow model

    and peak oscillation frequency from ns-2 simulations. Bandwidths are in-

    dicated as vertical bars for the ns-2 data. System parameters are as listed in

    Table 4.1. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78

    4.6 Waveforms of queue length found from linearized fluid-flow model for sys-

    tem parameters shown in Table 4.1. . . . . . . . . . . . . . . . . . . . . . . 78

4.7 Frequency distribution from FFT of the ns-2 simulated queue length wave-

    form of Fig. 4.1 (a). The distance between the two arrows is the bandwidth

    at this peak frequency. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79

    4.8 Frequency distribution from FFT of the ns-2 simulated queue length wave-

    form for α = 0.001. The distance between the two arrows is the bandwidth

    at this peak frequency. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80

    4.9 Distribution of TCP source window frequency for the 52nd flow at α = 0.1. 81

    4.10 Distribution of TCP source window frequency for the 153rd flow at α =

    0.001. Period doubling in the statistical sense is clearly evident from the

    emergence of a small “bump” at half of the characteristic frequency. . . . . 82

    4.11 Distribution of TCP source window frequency for α = 0.1. . . . . . . . . . 83

    4.12 Distribution of TCP source window frequency for α = 0.002. Period dou-

    bling in the statistical sense is clearly evident from the emergence of a

    small “bump” at half of the characteristic frequency. Impulse at 1 Hz re-

    flects time-out saturation. . . . . . . . . . . . . . . . . . . . . . . . . . . . 83

    4.13 Distribution of TCP source window frequency for α = 0.0015. Period

    doubling becomes more evident as the “bump” at half of the characteristic

    frequency grows. Impulse at 1 Hz reflects time-out saturation. . . . . . . . 84

    4.14 Distribution of TCP source window frequency for α = 0.001. Period dou-

    bling persists as the “bump” at half of the characteristic frequency stays in

    the distribution. Impulse at 1 Hz reflects time-out saturation. . . . . . . . . 84

    4.15 Distribution of TCP source window frequency for α = 0.0008. Period

    doubling becomes less persistent as the “bump” at half of the characteristic

    frequency begins to shrink. Impulse at 1 Hz reflects time-out saturation. . . 85

    4.16 Distribution of TCP source window frequency for α = 0.0006. Period

    doubling begins to subside. Impulse at 1 Hz reflects time-out saturation. . . 85

4.17 Distribution of TCP source window frequency for α = 0.0005. Stability is

    about to resume as period doubling subsides. . . . . . . . . . . . . . . . . 86

    4.18 Distribution of TCP source window frequency for α = 0.0004. Stability is

    resumed. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86

    4.19 Distribution of TCP source window frequency for α = 0.0001. . . . . . . . 87

    4.20 Distribution of TCP source window frequency for α = 0.00001. . . . . . . 87

    5.1 Simulated RED queue length waveforms using ns-2 simulator for filter res-

    olutions of α = 0.0001 and α = 0.0008 indicating stable waveforms (a),

    (c) and (e); and unstable waveforms (b), (d), and (f), respectively. Figures

    in (c) and (e) are enlarged views of (a), and (d) and (f) are enlarged views

    of (b). There are 170 TCP connections. Each connection shares a fixed

    bandwidth of 1.5 Mb/s in the bottleneck link. . . . . . . . . . . . . . . . . 97

    5.2 DFA scaling exponent for RED instantaneous queue length with different

    choices of α value. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98

    5.3 DFA scaling exponents in region 1 of Fig. 5.2 with varying α. . . . . . . . 99

    5.4 DFA scaling exponents of stationary signals. . . . . . . . . . . . . . . . . . 100

5.5 Relationship among system instability, positive feedback system and long-

    range correlation in the queue length series of the system. . . . . . . . . . . 101

    5.6 DFA scaling exponent of T1(i) series of 170 TCP connections with α =0.1,

    0.002, 0.0015, 0.001, 0.0008, 0.0006, 0.0005, 0.0004, 0.0001, and 0.00001 . 102

    5.7 DFA scaling exponent of the T j(i) series of 170 TCP connections for j =

    1, · · · , 170 with α = 0.001 . . . . . . . . . . . . . . . . . . . . . . . . . . 102

    5.8 DFA results and distribution of the DFA results of T j(i) series of the 170

    TCP connections, where j is the connection number for α = 0.1 . . . . . . 103

5.9 From top to bottom are the distributions of ρ for α = 0.0001, 0.1, and 0.001,

    respectively. The system is stable for α=0.0001 and 0.1, and unstable for

    α=0.001. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105

    5.10 DFA scaling exponents of T j(i) series for the 170 TCP connections with

    varying α. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106

    5.11 Relationship among the stable TCP-RED system, long-range anti-correlation

    of TCP window period series, and the negative feedback system. . . . . . . 106

    5.12 An illustration of the competition of the bandwidth between two TCP flows

    in a stable TCP-RED system. . . . . . . . . . . . . . . . . . . . . . . . . . 107

    6.1 Phase portraits from the original FFM: current total window size Nw(t)

    versus the total window size in last round trip time Nw(t−r(t)). Waveforms

of 30 sec simulation time are shown in blue. Steady-state waveforms are

    shown in red, after a simulation time of 25 sec. (a) Stable trajectory with α

    = 0.0001, showing a fixed point. (b) Unstable trajectory with α = 0.0008

    showing limit cycles. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113

    6.2 Phase portraits from ns-2 simulations: current total window size versus the

    total window size in last round trip time for the simulation time from 25 to

    30 sec. (a) Stable trajectory with α = 0.0001. (b) Unstable trajectory with

    α = 0.0008. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114

    6.3 Instantaneous queue length comparison for stable network in Table 6.1 with

    α = 0.0001. Waveforms in the left and right panels are instantaneous queue

    length results from RFFM and ns-2 simulations, respectively. Waveforms

    in the lower panels are the magnified versions of those in the upper row. . . 115

6.4 Instantaneous queue length comparison for stable network in Table 6.1 with

    α = 0.0008. Waveforms in the left and right panels are instantaneous queue

    length results from RFFM and ns-2 simulations, respectively. Waveforms

    on the lower panels are the magnified versions of those in the upper panels. 116

    6.5 Fixed point from RFFM with α = 0.0001: total window size Nw(t) versus

    the total window size in last round trip time Nw(t − r(t)). (a) Total simu-

    lation time of 30 sec is shown in blue, and the steady state portion in time

    interval from 25 to 30 sec is shown in red. (b) Magnified version of the red

    portion. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117

    6.6 Limit cycle from RFFM with α = 0.0008: total window size Nw(t) versus

    the total window size in last round trip time Nw(t−r(t)). (a) Total simulation

    time of 30 sec is shown in blue and the steady state portion at time interval

    from 25 to 30 sec is shown in red. (b) Magnified version of the red portion. 118

6.7 A network of N TCP flows, from Si to Di, where i = 1, 2, · · · , N,

    passing through two bottleneck links between G1 to G2 and G2 to G3. . . . . 119

    6.8 Ns-2 results of the instantaneous queue length for a network with two in-

    teracting bottlenecks, where C2 = 0.995C1. . . . . . . . . . . . . . . . . . 120

    6.9 Illustration of queue length interaction between two interacting gateways,

    where Ro(1,2) is the one way propagation delay between G1 and G2. . . . . . 120

    6.10 Waveforms of the instantaneous queue length of FFM for two gateway net-

    work, where C2 = 0.995C1. There are noticeable differences from ns-2

    simulations shown in Fig. 6.8. . . . . . . . . . . . . . . . . . . . . . . . . 121

    6.11 Waveforms of the instantaneous queue length of RFFM for network with

    two interacting gateways, where C2 = 0.995C1. RFFM is able to model the

    interaction between gateways. . . . . . . . . . . . . . . . . . . . . . . . . 121


  • Chapter 1

    Introduction

Nowadays, the Internet has become fully integrated with society in much of the developed world.

    The Internet and the TCP/IP Internet protocol suite1 have revolutionized the way we in-

teract and communicate. Conventional media of information exchange and distribution are gradually giving way to a more efficient, rapid, economical and globalized information exchange infrastructure: the Internet. The success of the Internet can be mostly

    credited to the capability of its protocols, among which Transmission Control Protocol

    (TCP) and Internet Protocol (IP) are the two most important, providing users and develop-

    ers with robust and interoperable services based on a set of enduring design principles such

as simplicity, scalability, distributed architectures and end-to-end connectivity.

    The worldwide computer network, the Internet, interconnects millions of end systems2,

    such as personal computers, workstations, servers and so on, around the world. In 1986,

the Internet suffered a series of congestion collapses, in which network throughput dropped sharply toward zero while the data traffic increased. In response to the problem of Internet congestion collapse, the Internet research community proposed congestion control mechanisms. The main goal of congestion control is to optimize

1 A protocol is a standard that provides syntactic and semantic rules for communication [34].

2 End systems, end points, end nodes and hosts all refer to either a TCP sender or a TCP receiver. These terms are used interchangeably throughout this thesis.


    computer network performance by adjusting the sending rates of the end systems according

to the level of congestion on the transmission path. Specifically, TCP sources keep track of the packets they have sent. If packet loss is observed, the TCP sending rate will be reduced,

    in order to avoid further loss and congestion. If all data are delivered, the TCP source will

    slowly increase its sending rate to maximize the utilization of the network resources. The

    TCP congestion control has been extremely successful in minimizing the packet loss and

    maximizing network utilization.
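To make the additive-increase/multiplicative-decrease behaviour described above concrete, the following Python sketch updates a TCP-like congestion window once per round trip. It is a minimal illustration only: the constants, the simplified loss signal and the update rules are assumptions for the example, not the full TCP Reno algorithm reviewed in Chapter 2.

def aimd_update(cwnd, loss_detected, ssthresh):
    """One round-trip update of a TCP-like congestion window (simplified sketch).

    cwnd          -- current congestion window, in packets
    loss_detected -- True if a loss (or congestion signal) was seen this round trip
    ssthresh      -- slow-start threshold, in packets
    """
    if loss_detected:
        # Multiplicative decrease: halve the window and remember the new threshold.
        ssthresh = max(cwnd / 2.0, 2.0)
        cwnd = ssthresh
    elif cwnd < ssthresh:
        # Slow start: roughly exponential growth, doubling per round trip.
        cwnd = cwnd * 2.0
    else:
        # Congestion avoidance: additive increase of one packet per round trip.
        cwnd = cwnd + 1.0
    return cwnd, ssthresh

# Example: a source backing off once after a loss and then probing again.
cwnd, ssthresh = 1.0, 64.0
for rtt, loss in enumerate([False] * 8 + [True] + [False] * 8):
    cwnd, ssthresh = aimd_update(cwnd, loss, ssthresh)
    print(f"RTT {rtt:2d}: cwnd = {cwnd:6.1f}, ssthresh = {ssthresh:5.1f}")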

    However, as the demand for higher quality of service (QoS) in the Internet increases,

    there are indications that the TCP congestion control is reaching its limits. In particular, the

packet loss lowers network efficiency, since the end systems and the Internet gateways3 constantly operate on packets which are eventually dropped. With the

purpose of enhancing the network performance, the Internet Engineering Task Force4 (IETF)

    suggests the deployment of Active Queue Management (AQM) [16] and Explicit Conges-

    tion Notification (ECN) [46, 119] to avoid packet loss by allowing the gateways to assist the

network management. Processes in the end systems communicate logically through

    transport-layer protocols, dominantly TCP, whereas in practice, end systems are indirectly

    connected via gateways or routers. The queue length in the buffer should be managed in-

    telligently, in order to prevent buffer delays from getting too long, and to keep the packet

    loss as small as possible. AQM algorithms have been implemented to manage the queue

    length in the buffer and to assist the management of network performance with a congestion

control algorithm. The basic idea of AQM is to detect congestion in advance and to signal a congestion notification to the end systems before packet loss and queue overflow occur. The goal of ECN is to provide the network with the ability to explicitly signal to the end systems the congestion detected by AQM before packet loss occurs. Hence,

3 A gateway is a network node that connects to two or more networks and forwards packets from one network to another [118]. The two terms are used interchangeably throughout the thesis.

4 The Internet Engineering Task Force is the group responsible for protocol standards and technical aspects of TCP/IP and the Internet, under the Internet Architecture Board (IAB), which sets the technical direction and decides on the standards of TCP/IP and the global Internet [70].


    the end systems can adjust their sending rates upon receiving the ECN signal. ECN can be

    effective when it is used with AQM.
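As an illustration of the AQM idea, the sketch below follows a RED-style rule using the symbols from the Nomenclature List (α, Xmin, Xmax, pmax, pb): the gateway keeps an exponentially weighted moving average of the queue length and marks arriving packets with a probability that rises linearly between the two thresholds. This is a minimal sketch of the marking law only, under those assumptions; it omits the count-based probability adjustment and the gentle region of the actual RED algorithm described in Chapter 2.

import random

def red_average(x_avg, q_inst, alpha):
    """Exponentially weighted moving average of the queue length (weight alpha)."""
    return (1.0 - alpha) * x_avg + alpha * q_inst

def red_marking_probability(x_avg, x_min, x_max, p_max):
    """Linear RED-style marking/dropping probability for a given average queue length."""
    if x_avg < x_min:
        return 0.0
    if x_avg >= x_max:
        return 1.0  # simplified: always mark above the maximum threshold
    return p_max * (x_avg - x_min) / (x_max - x_min)

def should_mark(x_avg, x_min, x_max, p_max):
    """Bernoulli trial deciding whether to set the ECN bit on an arriving packet."""
    return random.random() < red_marking_probability(x_avg, x_min, x_max, p_max)

# Example with illustrative threshold values, in packets.
x_avg = 0.0
for q in [50, 120, 200, 310, 400]:
    x_avg = red_average(x_avg, q, alpha=0.01)
    print(q, round(x_avg, 1), should_mark(x_avg, x_min=100, x_max=300, p_max=0.1))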

This chapter is organized as follows. First, in Section 1.1, we look at what the

    future Internet would be like, and what will be needed to get there. Then, Section 1.2 gives

    an overview of the contributions of this thesis. After that, publications arising from this

    project are listed in Section 1.3. Finally, Section 1.4 provides the outline of the rest of the

    thesis.

    1.1 Motivation

The last two decades have seen growing interest in research on Internet congestion con-

    trol. TCP may be the most complex protocol in the suite of Internet protocols. It provides

    reliable, flow-controlled, end-to-end, streaming service between two end systems on an un-

reliable network for communication. The implementation of congestion control mechanisms in TCP has further enhanced network efficiency and performance. Internet traffic is mainly composed of TCP traffic [38, 81, 143]. However, in the future, end-to-end TCP congestion control alone may not be enough to manage the Internet traffic, whose volume has been increasing, and will probably continue to increase, exponentially. The most widely deployed version of

    TCP is Reno. Therefore, Reno is the major type of TCP to be studied in this thesis.

    More recently, the idea of allowing AQM gateways to assist the network management

    on an end-to-end basis has been generally accepted. Among all the AQM algorithms, Ran-

    dom Early Detection (RED) is probably the most famous one. It has been recommended

by the IETF for the next-generation Internet [16], and it has been implemented in some commer-

    cial gateways [30]. The RED gateways are said to be able to enhance the throughput and

    fairness, and to avoid packet loss and global synchronization. However, the oscillatory

    and instability problems of the TCP-RED system have constrained the wide deployment


of RED gateways for over a decade. While many AQMs have been proposed to avoid the instability problem, they introduce new problems such as complexity, fairness, and scalability. Due to the complexity of the implementation and the many parameters involved in the design of the TCP-RED system, its continuous improvement has been a challenging problem and has drawn considerable interest.

A full understanding and explanation of the oscillation and instability problems is generally unavailable, as the Internet is probably the most complicated man-made system ever constructed. Furthermore, the TCP-RED system is a cross-layer optimization mechanism, spanning the transport layer and the Internet layer (also known as the internetwork or IP layer) [14]. Recent studies have attempted to relate the dynamics of network models to those of the real Internet. However, these studies have not been sufficiently verified by real traffic data.

    In the light of these motivations, this thesis has three purposes:

1. to solve the oscillation and instability problems in TCP-RED,

    2. to understand the mechanism of the oscillations as well as the dynamics of the TCP-

    RED system, and

    3. to establish a more realistic model for the Internet.

    1.2 Contribution of the Thesis

    As shown in Fig. 1.1, this thesis contains four fairly independent contributions in order to

solve the instability and oscillation problems of the TCP-RED system.

    Firstly, based on a fluid-flow model, we have developed an analytical closed-form so-

    lution for finding the stability boundary of the TCP-RED system. The solution is very

    accurate for multiple TCP connections. The simplicity of the solution allows easy and fast

Figure 1.1: Flow chart of the main contributions: the instability and oscillatory problems of TCP-RED are addressed through closed-form stability conditions using the FFM (Chapter 3), the mechanism of oscillation and instability of TCP-RED (Chapter 4), stability analysis using the DFA method (Chapter 5), and the RFFM, an enhanced FFM for TCP-RED (Chapter 6).

    generation of the stability boundaries in the essential parameter space. The solution can be

    used to formulate guidelines for setting parameters in RED gateways to avoid instability.

    Secondly, the fluid-flow model has been used to calculate the characteristic frequency

    of the TCP-RED system with multiple identical Reno TCP connections. Period doubling

    has been observed in a statistical sense from ns-2 simulations using statistical frequency

    distribution of TCP source windows. The physical mechanism for the onset of period

    doubling has been explained in terms of the difference in the TCP sending frequency and

    the system’s characteristic frequency. Viable verifications using the industry standard ns-2

simulation tool are provided. The bifurcation and stability results reflect the true behavior of the actual system.
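The following Python sketch indicates the kind of statistical frequency analysis meant here: each per-flow window-size series (as would be extracted from ns-2 traces) is transformed with an FFT, and the magnitude spectra are averaged over flows; a secondary peak near half the characteristic frequency in the averaged distribution is the signature of period doubling. The sampling rate and the synthetic test signal below are assumptions used purely for illustration.

import numpy as np

def frequency_distribution(series_list, sample_rate_hz):
    """Average FFT magnitude spectrum over many per-flow time series."""
    n = min(len(s) for s in series_list)
    spectra = []
    for s in series_list:
        s = np.asarray(s[:n], dtype=float)
        s = s - s.mean()                      # remove the DC component
        spectra.append(np.abs(np.fft.rfft(s)))
    freqs = np.fft.rfftfreq(n, d=1.0 / sample_rate_hz)
    return freqs, np.mean(spectra, axis=0)

# Synthetic example: flows oscillating at f_c with a weaker component at f_c / 2,
# mimicking the period-doubled regime (illustrative data, not ns-2 output).
rate, f_c, t = 100.0, 2.0, np.arange(0, 60, 0.01)
flows = [np.sin(2 * np.pi * f_c * t + p) + 0.3 * np.sin(np.pi * f_c * t + p)
         for p in np.random.uniform(0, 2 * np.pi, 20)]
freqs, spectrum = frequency_distribution(flows, rate)
print("dominant peak at %.2f Hz" % freqs[np.argmax(spectrum)])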

Thirdly, based on the data collected from the ns-2 simulations, long-range power-law

    correlations of the queue length waveforms in the RED gateway have been studied using the

Detrended Fluctuation Analysis (DFA) method. It is shown that the scaling exponent varies with the relative stability of the RED gateway; since the scaling exponent is independent of the stationarity of the queue length, it can be used as an indicator of the


    stability of the TCP-RED system.
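A minimal sketch of the DFA computation is given below, assuming the standard first-order procedure: integrate the mean-removed series, remove a local linear trend in non-overlapping windows of size n, and fit the slope of log F(n) against log n. The window sizes and the white-noise test signal are illustrative choices, not the settings used in Chapter 5.

import numpy as np

def dfa_scaling_exponent(series, window_sizes):
    """First-order detrended fluctuation analysis; returns the scaling exponent."""
    x = np.asarray(series, dtype=float)
    profile = np.cumsum(x - x.mean())             # integrated (profile) series
    fluctuations = []
    for n in window_sizes:
        segments = len(profile) // n
        sq_residuals = []
        for k in range(segments):
            seg = profile[k * n:(k + 1) * n]
            t = np.arange(n)
            trend = np.polyval(np.polyfit(t, seg, 1), t)   # local linear trend
            sq_residuals.append(np.mean((seg - trend) ** 2))
        fluctuations.append(np.sqrt(np.mean(sq_residuals)))
    # The scaling exponent is the slope of log F(n) against log n.
    slope, _ = np.polyfit(np.log(window_sizes), np.log(fluctuations), 1)
    return slope

# Uncorrelated white noise should give an exponent close to 0.5.
print(dfa_scaling_exponent(np.random.randn(10000), [16, 32, 64, 128, 256, 512]))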

    Fourthly, the random marking mechanism in RED gateways and the distribution of

    ECN markings have been evaluated using a pseudo random variable generator. The results

show that randomness is one of the key causes of the difference between the TCP-RED model and ns-2 simulations. A randomized fluid-flow model has been developed by injecting the same kind of randomness into the fluid-flow model. The interaction between

    participating gateways can be captured by the model. In terms of the ability to capture the

    salient dynamical features of the RED gateway, the randomized fluid flow model shows

    significant improvement over the original fluid flow model.
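The sketch below illustrates the kind of randomness involved: if each packet arriving within a round trip is marked by an independent Bernoulli trial with probability pb, then the number of ECN marks per round trip becomes a random variable, and the resulting random marking rate, rather than the deterministic value pb, is what can be injected into a fluid-flow model. This is a hedged illustration of the idea only; the actual RFFM construction is given in Chapter 6.

import random

def ecn_marks_per_rtt(num_packets, p_b):
    """Number of packets marked in one round trip when each packet is marked
    independently with probability p_b (Bernoulli trials)."""
    return sum(1 for _ in range(num_packets) if random.random() < p_b)

def randomized_marking_rate(num_packets, p_b):
    """Marking rate seen in one round trip: a random realization around p_b."""
    if num_packets == 0:
        return 0.0
    return ecn_marks_per_rtt(num_packets, p_b) / num_packets

# The deterministic FFM always uses p_b; the randomized rate fluctuates around it.
for rtt in range(5):
    print(rtt, randomized_marking_rate(num_packets=200, p_b=0.05))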


    1.3 Publications

    Journal Papers

    1. X. Chen, S. C. Wong, C. K. Tse and F. C. M. Lau, “Oscillation and Period Doubling

    in TCP/RED System: Analysis and Verification,” International Journal of Bifurca-

    tion and Chaos, vol. 18, no. 5, pp. 1459–1475, May 2008.

    2. X. Chen, S. C. Wong, C. K. Tse and L. Trajković, “Detrended Fluctuation Analysis

    of the Stability of Internet Gateway Employing the Random Early Detection Algo-

    rithm,” International Journal of Bifurcation and Chaos, to appear.

    3. X. Chen, S. C. Wong, C. K. Tse and L. Trajković, “Stability Analysis of Adaptive

    TCP-RED Gateway with Multiple Connections,” submitted to IEEE Transactions on

    Networking.

    4. X. Chen, S. C. Wong and C. K. Tse, “Adding Randomness to Modeling Internet

    TCP-RED System,” submitted to IEEE Transactions on Circuits and Systems -part

    II: Express Briefs.

    International Conference Papers

    1. X. Chen, S. C. Wong, C. K. Tse, and L. Trajković, “Stability Analysis of RED Gate-

    way with Multiple TCP Reno Connections,” in Proceedings of IEEE International

    Symposium on Circuits and Systems 2007, pp. 1429–1432, New Orleans, LA, USA,

May 2007.

    2. X. Chen, S. C. Wong, C. K. Tse, and L. Trajković, “Stability Study of the TCP-RED

    System Using Detrended Fluctuation Analysis,” in Proceedings of IEEE Interna-

    tional Symposium on Circuits and Systems 2008, pp. 324–327, Seattle, WA, USA,

May 2008.


    1.4 Organization of the Thesis

    The remaining part of this thesis is organized in the following way:

    Chapter 2 introduces basic knowledge on TCP and active queue management. Two

    kinds of modeling techniques for the TCP-RED system, namely, discrete model and fluid

flow model, are discussed. The industry-standard network simulator, ns-2, is introduced. Methods for collecting and analyzing network data from ns-2 simulations are explained.

Chapter 3 studies stability issues in the TCP-RED system. Stability conditions for the TCP-RED system are defined. Based on a fluid-flow model (FFM), analytical conditions which

describe the stability boundary of the RED gateway, depending on the number of TCP Reno

    connections are formulated. The accuracy of the analytical results is verified using the

    results from ns-2 network simulations.

    Chapter 4 studies the nonlinear dynamics of the TCP-RED system. The chapter de-

    scribes the derivation of the characteristic frequency of the TCP-RED system from the

    fluid-flow model. The obtained frequencies are compared with the frequencies of the RED

    queue length waveforms observed from ns-2 simulations. Analysis of the TCP source fre-

    quency distribution reveals the occurrence of period doubling when the system enters the

    instability region as the filter resolution varies. Since random events and a large number of

    TCP flows are involved in the process of generating the average system dynamics, a statis-

    tical viewpoint is taken in the analysis. The results reflect the true system behavior as they

    are based on data from ns-2 simulations rather than numerical simulations of analytical

    models. The physical mechanism of oscillation is explained in terms of the differences in

    the TCP source frequency and the TCP-RED system characteristic frequency.

Chapter 5 proposes a fast algorithm for detecting congestion. The detrended fluc-

    tuation analysis (DFA) method is used to analyze the stability of the TCP-RED system. In

DFA, time-series data are analyzed to generate a key parameter called the power-law scaling exponent, which indicates the long-range correlations of the time series. By


examining the variation of the DFA scaling exponent as the system parameters vary, we quantify the stability of the RED system in terms of the system's characteris-

    tics.

Chapter 6 presents a modified FFM. It has been found that waveforms from the deterministic FFM never match those from ns-2 simulations, and that the mismatch is mainly due to the randomness of the way in which the RED gateways issue ECN markings. The distribution of the randomness in the ECN markings of RED gateways is studied and then incorporated into the FFM to form the RFFM (Randomized FFM). Using the RFFM, a bifurcation study is performed and compared with ns-2 simulations.

    Chapter 7 provides the conclusions and discusses some research directions for future

work.

  • Chapter 2

    Background

    In this chapter, the concepts, architecture, and protocols of the Internet are introduced,

    along with a detailed discussion of the essential features and mechanisms of Transmission

    Control Protocol (TCP) and TCP congestion control algorithms. The roles and algorithms

    of queue management in a network are discussed. After that, network models for TCP-

    AQM are reviewed. Finally, a network simulation tool, ns-2, is introduced.

2.1 Internetworking: Concepts, Architectures and Protocols

    Data communication networks have been growing explosively and have become an essen-

    tial tool for communications in developed societies. Networks were constructed to provide

    users with an ability to share information resources. A network consists of two or more

    end systems or hosts [14], which are the ultimate consumers of communication services.

    Specifically, an end system, which can be a computer, workstation, mobile device and so

    on, employs internet communication and executes applications on behalf of users. The end

    systems are directly connected through a physical medium, such as twisted pair cables,


    coaxial cables, or optical fibers. The end systems in a network are referred to as nodes, and

    the physical medium is called a link. Depending on how a node is attached to a link, the

    link can be either limited to a pair of nodes, which is known as point-to-point link, or shared

    by more than two nodes, which is known as multiple-access link. The point-to-point link

    and multiple-access link are shown in Figs. 2.1 (a) and (b), respectively. Besides the direct

    links, end systems can be indirectly connected through one or more medium nodes called

    switches, and the resulting network is called a switched network, as illustrated in Fig. 2.1

    (c). One of the most common switched networks is the packet-switched network1, in which

    data are divided into small pieces called packets and sent individually, instead of being

    transferred as strings of continuous bits. A network can be represented by a network cloud

    shown in Fig. 2.1 (c). The nodes inside the cloud, the switches, implement the network,

    and the outside nodes are the users of the network, called hosts. In general, a network cloud

    depicts any size and any type of network, regardless of its link type or switch type.


    Figure 2.1: Illustration of links and a network cloud. (a) Point-to-point link; (b) multiple-access link; (c) switched network cloud.

1 The two most common types of switched networks are known as the packet-switched network and the circuit-switched network. The overwhelming majority of computer networks deploy the former type of switch, while the circuit switch is most notably employed by the telephone system. In this thesis the packet-switched network is considered.


    Networks were originally conceived to be small systems consisting of rather few nodes,

and a user attached to a given network could not access another network, as switches

were limited in their ability to scale and to handle heterogeneity. As the need for data communication services grew, a single network became inadequate to meet the information-flow needs of businesses and individuals. Universal service, with which an end

system in any part of an organization can communicate with other end systems, is highly

    desirable. In the early 1970s, the term internetworking was coined. The internetworking

    or internet2 scheme provides universal service among heterogeneous networks. Additional

hardware systems are needed to interconnect a set of physical networks.

    The basic hardware component that carries out relaying service between networks is

    a gateway or router3. As illustrated in Fig. 2.2, two physical networks are connected by

a gateway. A gateway connects two or more networks, and it appears in each connected network as an attached host [13]. An internetworking gateway makes it possible to choose the network technology that suits each user, and it plays much the same role as a switch, which stores and forwards packets.


Figure 2.2: The illustration of the interconnection of networks. An internet is formed by a gateway interconnecting two physical networks. The networks can be of different types. End systems can be attached to either of the networks.

    An internet consists of a set of networks interconnected by gateways. The size of an

2 When written with an uppercase I, the term Internet refers to the widely used global Internet, while the one with a lowercase i, the internet, refers to an arbitrary collection of networks interconnected to provide host-to-host packet delivery service.

3 In the Internet community, a gateway specifically refers to an IP-level router, while a router is a switch that receives data transmission units from input interfaces and, depending on the addresses in those units, routes them to the appropriate output interfaces [13]. In this thesis, the terms gateway and router are used interchangeably.


    internet, which depends on the number of connected networks, the number of end systems

    and users attached to each network, can vary. An internet can be built from an interconnec-

    tion of internets. Thus, arbitrarily large network clouds can be formed by interconnecting

    clouds. Fig. 2.3 illustrates the concept of the internetworking. Although the Internet is

    much more complicated and includes much more heterogenous nodes than the internet il-

    lustrated in Fig. 2.3, the basic idea of the Internet, to which a large percentage of networks

    are connected, is the same. That is why the Internet has been known as the network of

    networks [14].

    In general, both internet software and internet hardware together provide the appear-

    ance of a single, seamless communication system. The most important protocols devel-

    oped for internetworking are known as the TCP/IP Internet Protocols [124]. The internet

    architecture is based on four layers4 5. The host’s layer structure in Fig. 2.4 depicts the

    internet layer architecture. The bottom layer of the internet model is the link layer. The

    links allow data to be transferred within each network. The same link layer protocols are

    required for all the Internet nodes, including hosts and gateways, to communicate in their

    directly-connected network. The second layer is Internet layer, also known as the Internet

    Protocol (IP) or Internetworking layer. Protocols in this layer specify the format of pack-

    ets sent across an internet and provide the function necessary for connecting networks and

    gateways into one coherent system. The IP layer is responsible for delivering data from the

    source host to the final destination host. IP is a connectionless or datagram internetwork

    service, providing no end-to-end delivery guarantees. The IP layer is required by both hosts

and gateways. The third layer is known as the transport layer, which specifies how to ensure

    transfer reliability, and provides end-to-end communication services. The transport layer

    contains two primary transport layer protocols at present: Transmission Control Protocol

4 Another internetworking layer model is the Open Systems Interconnection (OSI) seven-layer model [127, 154]. In this thesis, the TCP/IP four-layer reference model is discussed.

5 While some divide the internet layering model into four layers [11, 14, 31], others describe the internet with five layers, in which the link layer is separated into two layers: the network interface layer and the physical layer [33].


Figure 2.3: Illustration of the internetworking concept. An internet is formed with six gateways interconnecting five physical networks. A host in the underlying physical structure can be attached to any one of the physical networks, which are interconnected by gateways. A host in the internet can communicate with any other host in the internet, even though the two hosts may be attached to different types of networks in the internet.


    and User Datagram Protocol (UDP). Reliable connection-oriented data transport service

    is provided by TCP, which is discussed in detail in the following sections. The top layer is

called the application layer, which supports the direct interface to a user application. The layer contains many widely used protocols such as Electronic Mail (e-mail), the World-Wide-Web (WWW), the File Transfer Protocol (FTP), the Trivial File Transfer Protocol (TFTP), Secure

    Shell (SSH), and Network Terminal Protocol or remote login (Telnet).


    Figure 2.4: Illustration of the Internet layer operations.

    TCP/IP protocol software is required in both hosts and gateways. Nevertheless, gate-

    ways do not need the protocols from all layers. More specifically, gateways necessitate the

    Internet protocol layer and the link layer to provide the connectivity service. An example

    of the Internet layer operations is presented in Fig. 2.4.

    2.2 TCP and Congestion Control

    2.2.1 TCP Prime

    The first TCP reference was a note in 1973 written by Vinton G. Cerf with the title of

    “A Partial Specification of an International Transmission Protocol”. The protocol design

    choices were then discussed and published. The protocol was split into TCP and IP, in

    which TCP aims to handle packetization, error control, retransmission and reassembly,


while IP specifies the routing of packets [21]. In February 1980, the U.S. Department of Defense (DoD) adopted TCP/IP as the preferred protocol to build a network of networks, which was later split into the military network (MILNET) for military-related sites and the Internet [22, 23]. By the time the Internet started to be popularized by private companies, the

    networking revolution had begun. Immense opportunities in research and business have

    been provided since then.

The TCP protocol, defined in RFCs6 793, 1122, 1323, 2018 and 2581 [3, 14, 15, 99, 126], is

    the predominant transport protocol of today’s Internet. More than 80% of the total Internet

    traffic volume is carried by TCP which provides a reliable data transfer service on unre-

    liable networks [143]. Numerous Internet applications, such as, Electronic Mail (e-mail),

    World-Wide-Web (WWW), File Transfer (FTP), Secure Shell (SSH), The Network Termi-

    nal Protocol (Telnet), and streaming media application, etc., are all built on TCP. Another

    type of transport protocol is UDP which provides a much simpler service to the application.

    It is connectionless, unreliable and not stream-oriented and it supports neither congestion

    control nor flow control.

    TCP is a connection-oriented protocol. An end-to-end connection between a TCP

    source and a TCP receiver must be established for data transfer. There are two procedures

    in connection-oriented service. One is connection establishment and the other is connec-

    tion termination. The way of establishing a connection follows a three-way handshake,

    in which handshake refers to the exchange of control information as shown in Fig. 2.5(a).

    The TCP sender sends an initial request message with SYN7 and a sequence number x to

    establish communication. Once the TCP receiver receives the SYN, the receiver will record

    6 RFC (Request For Comments) is a series of chronological documents that contain ideas, techniques, observations, and proposed and accepted TCP/IP protocol standards. RFC documents are available at www.ietf.org.

    7 SYN is the name of one code bit field in the TCP header. When SYN is set to 1, its corresponding sequence number is the initial sequence number, and the sequence number of the first data byte is this sequence number plus one. When SYN is set to 0, its corresponding sequence number is the first data byte's sequence number. Other TCP header code bits used in opening or closing a TCP connection include ACK, FIN and RST.


    the sequence number, and reply with a message carrying SYN, whose sequence number is y, and ACK,

    whose acknowledgment number is x + 1. When the TCP sender receives the reply mes-

    sage from the TCP receiver, the sender will send a message with ACK set and an acknowledg-

    ment number y + 1. When the connection is established

    between two end systems, the data can be sent from both directions, known as full-duplex

    service, until one of the systems issues a FIN packet, or a RST packet, or the connection

    times out. When the transfer completes, the connection is explicitly terminated. The pro-

    cess of terminating a TCP connection follows a four-way handshake. The TCP connections

    are full-duplex and there are two independent transfer streams, one in each direction. The

    TCP sender closes its application, sends a message of FIN and waits for the TCP receiver’s

    acknowledgment. The receiver acknowledges the FIN packet and sends an ACK to inform

    the sender that no more data is available. After a connection has been closed in a given di-

    rection, TCP would refuse to accept more data from that direction. On the other hand, data

    can still flow in the other direction until the sender closes it. When both directions have

    been closed, the end systems delete the records of the TCP connections. The termination

    of a TCP connection procedure, four-way handshake, is shown in Fig. 2.5(b).
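    To make the exchange above concrete, the following minimal Python sketch reproduces the sequence- and acknowledgment-number bookkeeping of the three-way handshake; the function name, the message representation and the initial sequence numbers are illustrative assumptions, not part of any TCP implementation.

```python
# A minimal sketch of the sequence-number bookkeeping in the three-way
# handshake described above. The message representation and the initial
# sequence numbers x and y are illustrative assumptions.

def three_way_handshake(x, y):
    """Return the control messages exchanged when the sender (initial sequence
    number x) opens a connection to the receiver (initial sequence number y)."""
    messages = []
    # Step 1: sender -> receiver, SYN carrying sequence number x
    messages.append(("sender->receiver", {"SYN": 1, "seq": x}))
    # Step 2: receiver -> sender, SYN with sequence number y and ACK of x + 1
    messages.append(("receiver->sender", {"SYN": 1, "seq": y, "ACK": 1, "ack": x + 1}))
    # Step 3: sender -> receiver, ACK acknowledging y + 1
    messages.append(("sender->receiver", {"ACK": 1, "ack": y + 1}))
    return messages


if __name__ == "__main__":
    for direction, fields in three_way_handshake(x=100, y=300):
        print(direction, fields)
```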

    Retransmission Mechanism

    TCP guarantees an orderly delivery of all bytes data without any duplication, by using an

    acknowledgement mechanism to check the accuracy of the data received. Every time TCP

    sends a packet, it starts a timer and waits for an acknowledgment. If the timer expires

    before the acknowledgement reaches the TCP sender, the packet is then considered lost or

    corrupted. Unacknowledged data are retransmitted later. An ACK is generated by the TCP

    receiver upon correct reception of the data. In this way, the reception

    of an ACK at the TCP sender guarantees that the data have reached its destination correctly.

    The TCP receiver has two choices when receiving a TCP packet. It can either generate an


    Figure 2.5: (a) Three-way handshake in TCP connection establishment, and (b) four-way handshake in TCP connection termination.

    ACK as soon as a packet is received or delay the ACK generation for a while, known as

    delayed ACK. By holding up the ACK, the receiver may be able to acknowledge two packets

    at the same time and therefore to reduce ACK traffic. However, if an ACK is delayed for

    too long, a timeout and retransmission may be triggered. In practice, the delay of ACK

    should not be set longer than 500 ms [60].

    The retransmission mechanism is one of the key principles for providing reliable data trans-

    fer. TCP employs a retransmission timer for each packet sent. If no ACK is received

    within the retransmission timeout (TO) period, the packet is assumed to be lost and is

    sent again. When the ACK is received within the TO

    period, the retransmission timer is cleared. In many popular TCP implementa-

    tions, the minimum TO is set to 1 second [60]. A TO that is too long may result in a longer delay

    before a lost packet is detected in a busy network environment. However, an inappropriately short TO

    would lead to unnecessary retransmission traffic, which wastes network re-

    sources and adds extra load. Therefore, an appropriate TO is very important

    for obtaining optimal performance. TO should be set according to the value of the average

    round trip time (RTT), which is the time duration for a packet traveling from one end of a

    network to the other end and back again [118]. TCP records the time at which each packet

    is sent and the time at which an ACK for that packet arrives. From the difference of the two

    times, TCP knows the sample round trip time (SRTT). TCP estimates the average RTT,

    denoted \overline{RTT}, in the following way:

    \overline{RTT} = (1 - w_R) \cdot \overline{RTT} + w_R \cdot SRTT \qquad (2.1)

    in which w_R (0 ≤ w_R < 1) is a constant weighting factor that weights the old average

    against the latest sample round trip time. A typical value for w_R is 0.125 [34]. The TO is

    estimated as:

    TO = \overline{RTT} + \eta \cdot Dev \qquad (2.2)

    in which Dev is the estimated mean deviation, used to describe the fluctuations, and η

    is a factor that controls how much the deviation affects the TO. A value suggested

    by researchers for η is 3 [34]. Dev is calculated as follows:

    Dev = (1 - w_D) \cdot Dev + w_D \cdot |SRTT - \overline{RTT}| \qquad (2.3)

    where w_D is a fraction between 0 and 1. This operation works like a low-pass filter that

    controls how fast a new sample affects the mean deviation. A typical value for w_D is 0.25

    [34]. Equation (2.3) maintains an exponentially weighted moving average of the

    deviation.

    8 The original value for η was 2 in 4.3BSD UNIX, and was changed to 4 in 4.4BSD UNIX.
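    The estimators in Eqs. (2.1)–(2.3) can be sketched directly in code. The following Python fragment is only an illustration of the three update rules; the class name and the initial values are assumptions, while the weights w_R = 0.125 and w_D = 0.25 follow the typical values quoted above and η = 4 follows the 4.4BSD value mentioned in the footnote.

```python
# A sketch of the RTT and TO estimators in Eqs. (2.1)-(2.3); class and
# variable names are illustrative.

class RttEstimator:
    def __init__(self, first_sample, w_r=0.125, w_d=0.25, eta=4.0):
        self.w_r = w_r               # weighting factor w_R in Eq. (2.1)
        self.w_d = w_d               # weighting factor w_D in Eq. (2.3)
        self.eta = eta               # deviation factor; 4.4BSD uses 4 (see footnote)
        self.avg_rtt = first_sample  # average RTT, initialised with the first sample
        self.dev = first_sample / 2.0

    def update(self, srtt):
        """Feed one sample round trip time (SRTT) and return the new TO."""
        # Eq. (2.3): exponentially weighted moving average of the deviation
        self.dev = (1 - self.w_d) * self.dev + self.w_d * abs(srtt - self.avg_rtt)
        # Eq. (2.1): exponentially weighted moving average of the RTT
        self.avg_rtt = (1 - self.w_r) * self.avg_rtt + self.w_r * srtt
        # Eq. (2.2): retransmission timeout
        return self.avg_rtt + self.eta * self.dev


if __name__ == "__main__":
    est = RttEstimator(first_sample=0.100)          # seconds
    for sample in [0.110, 0.095, 0.150, 0.105]:
        print("TO = %.3f s" % est.update(sample))
```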


    Flow Control

    To prevent a TCP sender’s sending rate from being too high for the TCP receiver to han-

    dle, end-to-end flow control is implemented in TCP [54, 55]. The end-to-end flow control

    is necessary especially for a heterogeneous network environment. Flow control adopts a

    sliding window mechanism9 to continuously inform the TCP sender how much data the re-

    ceiver can accommodate, known as the receiver advertised window size (rwnd), through

    ACKs. The TCP sender controls its outgoing data with a send window whose size is no

    greater than the rwnd carried in the most recently received ACK. Since this work

    focuses on congestion control, and the advertised window size is assumed to be larger than

    the send window size, the advertised window size can be ignored.

    2.2.2 Congestion Control Algorithm


    Figure 2.6: Illustration of congestion collapse.

    While flow control prevents buffer overflow at the TCP receiver, it does not pre-

    vent buffer overflow at the intermediate gateways. The congestion control mechanism was

    9 Sliding window: an algorithm at the heart of TCP that allows the sender to transmit multiple packets up to the size of the window before receiving an ACK.


    introduced in the late 1980s by Van Jacobson [71] to regulate the transmission rate of

    each connection and to prevent it from reaching an inappropriately high rate that the gate-

    way cannot handle. The uncontrolled high rate may eventually lead to congestion collapse

    as illustrated in Fig. 2.6. In Fig. 2.6, the red dashed line represents the network capac-

    ity, at which the network performance would be perfect. Ideally, the network throughput

    converges to this perfect case as the traffic load increases, as shown by the green curve.

    However, in reality the throughput decreases dramatically after some point, before ever reaching

    the perfect case, as shown by the orange curve. This phenomenon is called congestion col-

    lapse. The purpose of congestion control is to avoid congestion collapse and to maintain

    optimal (the highest) throughput of a network. TCP congestion control is a window-based

    mechanism. There are two key variables in a TCP congestion control algorithm: conges-

    tion window size (cwnd) and slow start threshold (ssthresh). cwnd limits the amount of

    data a sender can send into the network, while ssthresh is a threshold variable which separates

    the different congestion control mechanism phases. As mentioned earlier, rwnd is assumed

    to be greater than the send window size. In fact, the maximum send window size is given

    by min(rwnd, cwnd). Since congestion control is the main focus, rwnd is considered

    to be larger than cwnd. This assumption holds when the gateway is the bottleneck of the

    network. Therefore, the window size will refer to the send window, which is controlled by

    the congestion control mechanism, i.e., cwnd.

    The principal operation of TCP congestion control in Reno10 [3] involves the following

    mechanisms: slow start, congestion avoidance, and fast retransmit/fast recovery [135].

    Slow Start

    At the start-up of a connection, a TCP sender starts cautiously with a small (no more than

    2) window size, and then it tries to probe the available bottleneck capacity by exponentially

    increasing the window size until the window size reaches ssthresh. During this slow start

    10 TCP Reno is the most widely deployed TCP version, and is considered the standard TCP.


    phase, for each ACK received, cwnd will increase by one, thus cwnd is doubled every RTT.

    It takes log2 N round trips before the TCP sender can send N packets. The slow start phase

    ends either when cwnd reaches ssthresh, i.e. cwnd ≥ ssthresh, or when congestion

    occurs, which means that either the TO expires or three duplicate ACKs are received. The

    value of ssthresh can be arbitrarily high at the beginning. When TCP experiences

    congestion, it reduces ssthresh to half of the current window size (the window size is reset

    to 1 when a timeout occurs), i.e. ssthresh_new = max(2, cwnd/2). When cwnd increases to a

    value greater than ssthresh, the congestion avoidance phase takes over. Slow start

    is used again once a timeout occurs. As shown in Fig. 2.7, the window size increases

    exponentially during the slow start phase.
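    A rough sketch of the slow start growth rule may help: cwnd grows by one packet per ACK, so it doubles roughly every RTT until it reaches ssthresh. The function below is an illustrative toy model, not the behaviour of any particular TCP implementation.

```python
# A toy model of slow start: cwnd grows by one packet per ACK, i.e. it
# roughly doubles every RTT, until it reaches ssthresh. Units are packets.

def slow_start(cwnd, ssthresh, rounds):
    """Simulate up to `rounds` RTTs of slow start and return the cwnd trajectory."""
    trajectory = [cwnd]
    for _ in range(rounds):
        acks_this_rtt = cwnd                  # one ACK per packet sent in this RTT
        for _ in range(acks_this_rtt):
            if cwnd >= ssthresh:              # hand over to congestion avoidance
                return trajectory
            cwnd += 1                         # +1 per ACK => doubling per RTT
        trajectory.append(cwnd)
    return trajectory


if __name__ == "__main__":
    print(slow_start(cwnd=1, ssthresh=64, rounds=10))   # [1, 2, 4, 8, 16, 32, 64]
```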

    Congestion Avoidance

    During the congestion avoidance phase, the window size increases more cautiously, growing

    by 1/cwnd for each ACK received. Hence, in the conges-

    tion avoidance phase the window size increases linearly by one packet for each RTT, which

    is often known as additive increase (AI) algorithm. The window size is halved (if the halved

    value is smaller than 1 packet, then the window size will be reduced to 1 packet) once congestion

    is detected. This is often referred to as multiplicative decrease (MD). A conges-

    tion can be due to a TO expiration which will trigger a retransmission with the window size

    reduced to 1 and ssthresh halved. Occurrence of three duplicate ACKs of Fast Retransmit

    in TCP is also considered a signal of packet loss and thus triggers the multiplicative de-

    crease in both the window size and ssthresh. Furthermore, in an ECN (Explicit Congestion

    Notification)-capable network, receiving a packet with the ECN bit set in its header is also considered a signal

    of congestion. The window size and ssthresh are reduced to half of the current window

    size. As shown in Fig. 2.7, the window size dynamics in the TCP congestion avoidance

    phase is known as Additive Increase Multiplicative Decrease (AIMD). The AIMD pattern


    of continual increase and decrease of the window size continues throughout the lifetime

    of the connection. The important concept for AIMD is that the source reduces its window

    size at a much faster rate than it increases it. In steady state, a non-congested connection is

    maintained in the congestion avoidance phase and follows the AIMD pattern. Ideally, in

    the congestion avoidance phase, the waveform of the window size resembles a periodic

    sawtooth. This periodic behavior is the basis of many TCP dynamics models.
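    The AIMD pattern can be illustrated with a short toy simulation: one packet of additive increase per RTT, and a halving of the window whenever a congestion signal is assumed to arrive. The loss condition used below (a fixed window threshold) is purely illustrative.

```python
# A toy sketch of the AIMD sawtooth: additive increase of one packet per RTT,
# multiplicative decrease (halving) whenever a congestion signal is assumed.
# The loss condition (a fixed threshold) is purely illustrative.

def aimd(cwnd, rtts, loss_threshold):
    """Return the window trajectory over `rtts` round trips."""
    trajectory = []
    for _ in range(rtts):
        if cwnd > loss_threshold:             # ECN mark or three duplicate ACKs assumed
            cwnd = max(1.0, cwnd / 2.0)       # multiplicative decrease
        else:
            cwnd += 1.0                       # additive increase: one packet per RTT
        trajectory.append(cwnd)
    return trajectory


if __name__ == "__main__":
    # Produces the familiar sawtooth oscillating roughly between 20 and 40 packets.
    print(aimd(cwnd=20.0, rtts=60, loss_threshold=40.0))
```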

    Fast Retransmit/Fast Recovery

    Fast Retransmit is an enhancement to TCP for reducing the waiting time for a sender before

    retransmitting a lost packet. When a TCP sender receives three duplicate ACKs, i.e. an

    original plus three absolutely identical copies in a row, congestion is declared. The packet is

    considered lost. When a duplicate ACK is received by the TCP sender, it indicates that the

    receiver has received a packet out of order, suggesting that the earlier packet has probably

    been lost. The lost packet is then retransmitted without waiting for the retransmission timer

    to expire. The send window size is reduced to half of its current value. Note that this cannot

    happen if the congestion window is smaller than four packets.

    The Fast Recovery algorithm is another improvement of TCP. While the fast retransmit

    algorithm sends the lost packet in the congestion avoidance phase and not in slow start

    phase, the fast recovery algorithm allows high throughput under moderate congestion, in-

    stead of resuming slow start, especially for large windows. Fast recovery is only executed

    if the packet loss has been detected by fast retransmit.
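    The duplicate-ACK counting that underlies fast retransmit can be sketched as follows; the ACK numbering and the hypothetical ACK stream are simplified for illustration.

```python
# A sketch of duplicate-ACK counting for fast retransmit. ACK numbers here
# denote the next packet expected; the stream below is hypothetical.

def fast_retransmit(ack_stream, cwnd):
    """Scan ACK numbers; after three duplicate ACKs, retransmit and halve cwnd."""
    events = []
    last_ack, dup_count = None, 0
    for ack in ack_stream:
        events.append(("ack", ack))
        if ack == last_ack:
            dup_count += 1
            if dup_count == 3:                # original ACK plus three duplicates
                cwnd = max(1, cwnd // 2)      # window halved (multiplicative decrease)
                events.append(("retransmit", ack, "cwnd", cwnd))
        else:
            last_ack, dup_count = ack, 0
    return events


if __name__ == "__main__":
    # Packet 5 is lost, so the ACK for packet 5 keeps repeating.
    for event in fast_retransmit([1, 2, 3, 4, 5, 5, 5, 5, 6], cwnd=16):
        print(event)
```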

    2.2.3 TCP Flavors

    The original TCP [126] version includes mechanisms for providing reliable connection

    services, such as timeout based retransmission, full-duplex data service, and flow control.

    Since it does not include any congestion control mechanism, it is no longer used in the


    Figure 2.7: Illustration of the TCP congestion window. "SS" is the slow start phase, during which the window size increases exponentially until it reaches the slow start threshold, ssthresh. ssthresh is set to half of the current window size whenever congestion is experienced. "CA" represents the congestion avoidance phase, during which the window size increases linearly and decreases multiplicatively. Multiplicative decrease of the TCP window size occurs either upon receiving an ECN bit header or three duplicate ACKs. When the "TO" timer expires, the window size is reduced to 1 and the system re-enters the slow start phase.


    current Internet. However, it is the basis of all other TCP versions.

    TCP Tahoe [71] is implemented with a congestion control mechanism which includes

    slow start, congestion avoidance, and fast retransmit. Fast recovery is not yet included in

    Tahoe. Each time a packet loss occurs, TCP Tahoe re-enters the slow start phase.

    TCP Reno [3] is the most widely used TCP version, which has included slow start,

    congestion avoidance, fast retransmit and fast recovery. The main improvement of Reno

    over Tahoe is that Reno reduces the window size as well as ssthresh to half of the current

    window size when a congestion signal is received, which can be either an ECN bit or three

    duplicate ACKs. As a result, TCP Reno does not re-enter slow start

    for each packet loss, and therefore throughput in Reno is improved. As in Tahoe,

    Reno still enters slow start when a timeout occurs. Reno is the TCP version studied in this

    thesis.

    TCP NewReno [47] improves the loss recovery performance of Reno by modifying the

    fast recovery mechanism. In NewReno, when a packet loss is detected during fast

    retransmit, the highest sequence number transmitted till then is remembered. Fast recovery

    is finished only after receiving the ACK with the highest sequence number sent before

    a loss occurs. Additional losses during fast recovery are detected by the reception of partial

    acknowledgments. A partial acknowledgment is an ACK for new data with a lower sequence

    number than the highest data packet retransmitted before the packet loss. A problem with

    NewReno is that when there are no packet losses but packets are reordered by more

    than three sequence numbers, NewReno may mistakenly enter fast recovery.

    NewReno is still under investigation and is being enhanced by incorporating additional

    algorithms [63].

    TCP SACK [99] uses selective acknowledgments (SACKs) to enable the TCP sender

    to retransmit lost packets faster than one packet per round trip time. Using SACK, the TCP

    receiver can explicitly inform the TCP sender about packets that have been received

    successfully, so the sender needs to retransmit only the packets that have actually been lost.


    The SACK implementation can still use the same congestion control algorithms as Reno.

    Moreover, SACK is useful for satellite Internet access [2].

    TCP Vegas [17] was presented in 1994, before the introduction of NewReno and SACK.

    It is fundamentally different from other TCP versions. Vegas does not use packet loss as

    a trigger to reduce the window size. In Vegas, additive increase and additive decrease of

    the window size are deployed. The idea of Vegas is to use throughput to detect conges-

    tion. Vegas estimates the throughput of a connection by evaluating the packets-in-flight,

    which is a delay-bandwidth product. If the estimated throughput is higher than the actual

    throughput, congestion is said to exist. Hence, reduction of the window size at the TCP

    sender is executed. A variable, Diff, representing the difference between the estimated and

    actual throughputs is maintained in Vegas. Vegas also modifies the slow start algorithm

    to find the correct window size without incurring a loss by exponentially increasing the

    window size every other RTT. RTT is used to calculate the variable, Diff. A different re-

    transmission strategy is used in Vegas, whose principle is that when RTT is greater than

    the timeout value, instead of waiting for three duplicate ACKs, Vegas starts retransmission

    after receiving one duplicate ACK. By doing so, cases in which the senders never receive

    three duplicate ACKs can be avoided.
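    The Vegas congestion detector described above is often written in the following textbook form, where the expected throughput uses the minimum observed RTT and the actual throughput uses the current RTT; the thresholds alpha and beta below are assumptions for illustration and are not taken from the thesis.

```python
# A sketch of the Vegas congestion detection logic in its common textbook
# form: expected throughput uses the minimum (base) RTT, actual throughput
# uses the current RTT, and Diff drives an additive window adjustment.
# The thresholds alpha and beta are illustrative assumptions.

def vegas_adjust(cwnd, base_rtt, current_rtt, alpha=1.0, beta=3.0):
    """Return the new congestion window after one Vegas update."""
    expected = cwnd / base_rtt             # throughput if no queuing occurred
    actual = cwnd / current_rtt            # measured throughput
    diff = (expected - actual) * base_rtt  # extra packets queued in the network
    if diff < alpha:
        return cwnd + 1.0                  # additive increase: path under-used
    if diff > beta:
        return cwnd - 1.0                  # additive decrease: queue building up
    return cwnd                            # leave the window unchanged


if __name__ == "__main__":
    print(vegas_adjust(cwnd=20.0, base_rtt=0.100, current_rtt=0.105))
```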

    FAST TCP [75] is an alternative congestion control algorithm built on TCP Vegas. FAST

    TCP aims at providing flow level properties such as stable equilibrium, well-defined fair-

    ness, high throughput and link utilization. The basic idea of FAST TCP is to use queuing

    delay to assess and address the congestion. There is an additional modification requirement

    at the sender. FAST TCP applies an equation based approach at the source to control the

    sending rate. By appropriately selecting the equation and feedback mechanism, FAST TCP

    eliminates the packet level oscillations, improves the flow level dynamics and achieves its

    objective of high performance, stability and fairness in general networks. However, its stability eval-

    uation is limited to a single link with heterogeneous sources, and feedback delay is ignored.

    Moreover, many experimental scenarios are designed to identify the properties of FAST


    TCP but those scenarios are not very realistic.

    TCP Westwood [97] (TCPW) aims to improve the congestion control function by esti-

    mating an eligible sending rate, and configuring the congestion control parameters accord-

    ingly. TCP Westwood is a sender side modification of TCP Reno. TCP Westwood con-

    gestion control is based on bandwidth estimation by monitoring the ACK reception rate.

    By employing an adaptive decrease mechanism, TCP Westwood congestion control may

    avoid halving the TCP window size as in standard TCP AIMD and improve

    the stability of TCP. Compared to standard TCP, TCP Westwood provides a congestion

    window that is reduced more in the presence of heavy congestion and less in the presence

    of light congestion.

    HighSpeed TCP [50] (HS TCP) reduces the loss recovery time by modifying standard

    TCP’s AIMD algorithm. HS TCP would only be effective for large congestion windows.

    When the congestion window size is smaller than a threshold or the loss ratio is too high,

    HS TCP will behave the same as the standard TCP algorithm. HS TCP performs well in

    high-speed long-distance links. On the one hand, HS TCP improves the link utilization of

    bursty traffic networks. On the other hand, it may lose fairness between the connections.

    2.3 Active Queue Management (AQM)

    The performance of applications that are built on TCP depends not only on the TCP conges-

    tion control algorithm but also on the queue management strategies in network routers.

    Active queue management mechanisms are implemented in Internet gateways to assist con-

    gestion control algorithms to manage the network. The main goal of AQM algorithms is to

    allow network operators simultaneously to achieve low packet loss and high throughput by

    detecting incipient congestion.


    2.3.1 The Need for AQM

    TCP congestion control is an end-to-end control scheme. In TCP congestion control algo-

    rithms, packet loss is considered as an indicator of congestion, and thus triggers congestion

    control in which the sending rate of the TCP sender is reduced. TCP congestion control

    has been considered a great success in the Internet so far. However, it has limitations. A

    TCP end system heals the network only after congestion has occurred, i.e., one or more

    packets have been lost. When congestion occurs at the gateway, it takes about one RTT be-

    fore the TCP sender is informed. As the network is already congested and can handle less

    injected traffic, it is harmful to the network if the TCP sender keeps increasing its sending rate

    before it is notified of the congestion. Further increase of the sending rate when congestion

    already occurs would lead to more serious congestion, causing even longer transmission

    delay, faster degradation of network throughput, and more packet losses. Additionally,

    there is increasing demand for QoS in the Internet, as multimedia and peer-to-peer applica-

    tions are becoming more popular. Congestion control, on its own, may not be able to fulfill the

    requirement. Moreover, TCP traffic contributes to the bursty nature of the Internet traffic.

    Larger buffer size at gateways is required in order to absorb the burstiness and to lower the

    number of lost packets. However, larger buffer size may cause longer queuing delay and

    even larger burstiness when congestion occurs. The queue length in the buffer should be

    managed intelligently in order to optimize the network performance. Hence, an intelligent

    queue management scheme is needed.

    Traditionally, the DropTail queue management, in which a first-in-first-out policy is

    used, is widely deployed in Internet gateways. It is simple and easy to implement and

    therefore still dominates today’s Internet. In the DropTail scheme, if the incoming packets

    exceed the buffer capacity, the latest arriving packets will be dropped. The DropTail queue

    management scheme has the following problems: firstly, it may cause lock-out, a scenario


    where the network bandwidth is occupied by a small number of traffic flows while access requests from other con-

    nections to the gateway are all denied. Secondly, the network is inefficient,

    because no advance congestion warning is given to the end systems. Thirdly, global syn-

    chronization of TCP flows may occur as a result of the high correlation among packet

    droppings. The simultaneous drops of the packets result in the synchronous reduction of

    the TCP window sizes. Moreover, there might be large queue oscillation and jitter which

    causes TCP traffic to be more bursty. Additionally, since DropTail does not drop packets

    until the queue is full, the queue stays full for long periods of time. Thus, a long

    queuing delay is maintained.

    To avoid the inherent problems of DropTail queue management, IETF recommends

    AQM for the next generation Internet. Unlike DropTail queue management, AQM is a

    proactive congestion control scheme and provides preventive measures to manage the gate-

    way buffer.

    Goals of AQM have been specified in RFC 2309 as follows:

    • Reduce the number of packets dropped in routers;

    • Provide low-delay interactive services;

    • Avoid lock-out behaviors.

    2.3.2 Explicit Congestion Notification (ECN)

    The original Internet protocol does not have any code field for signaling congestion, and

    the most common way to signal congestion is by dropping packets. Explicit Congestion

    Notification (ECN) [46, 119] is an extension to the Internet protocol to provide the ability

    of explicit notification of congestion. It is inherently coupled with AQM. The basic idea

    of AQM is to provide TCP senders with information about the imminent congestion, by

    sending appropriate indications to the TCP senders before the queue overflows. Instead


    of informing TCP senders of congestion by dropping packets, as is the case in a DropTail

    queue management network, AQM gateways can mark packets during congestion by setting

    the ECN bit in the packet header. The Congestion Experienced (CE) code point in the IP packet

    header is reserved as the possible congestion indication11. In an ECN-capable network, i.e.,

    one that is able to react to a congestion notification, the TCP sources respond to the ECN

    bit set in exactly the same way they react to a dropped packet. The choice of marking

    packets during congestion depends on the AQM policy. There are two main advantages of

    the ECN-enabled network. One is its ability to avoid unnecessary packet drops, and the

    other is its ability to reduce unnecessary retransmission TOs.

    ECN is an optional feature, and the effectiveness of ECN requires the deployment of

    AQM.

    2.3.3 Random Early Detection (RED)

    Random Early Detection (RED) [45] is the default AQM scheme recommended by IETF

    for the next generation Internet [16]. RED was first proposed by Sally Floyd and Van

    Jacobson in 1993 [45] and is probably the most well known AQM scheme.

    The basic idea of RED is to detect an imminent congestion at gateways by comparing

    the computed average queue length, x, at the gateways with two thresholds, Xmin and Xmax.

    In the original RED algorithm, if the average queue length x is smaller than Xmin, no packet

    will be marked. That is to say, all TCP senders can further increase their sending rates,

    since the network is able to accommodate more packets. If the average queue length x

    is greater than Xmax and not greater than the buffer size B, all packets in the buffer will

    be marked to prevent the threatening congestion. As the queue length reaches the buffer

    size B, congestion occurs: all the packets in the buffer are marked and all arriving

    packets are dropped. RED can also drop packets

    11Technically, two bits are required for ECN. One is set by the source to indicate that the packet is ECN-capable. The other is set by gateways along the transmission path when congestion is experienced.


    instead of marking them, depending on the configuration of RED and on whether the network is

    ECN capable. When the average queue length is between Xmin and Xmax, congestion is

    well controlled and high utilization is maintained. The packets in the buffer are randomly

    chosen and marked by a certain marking probability p. The marking probability increases

    linearly with increasing x, until x reaches Xmax, at which the probability is given by pmax.

    The probability function for gentle RED is given as follows and shown in Fig. 2.8.

    p_b =
    \begin{cases}
    0 & 0 \le x < X_{min} \\
    \dfrac{x - X_{min}}{X_{max} - X_{min}} \, p_{max} & X_{min} \le x \le X_{max} \\
    p_{max} + \dfrac{1 - p_{max}}{X_{max}} \, (x - X_{max}) & X_{max} < x \le 2X_{max} \\
    1 & 2X_{max} \le x \le B
    \end{cases}
    \qquad (2.4)

    where x is the average queue length at the RED gateway; α is the queue weight in RED12;

    Xmax and Xmin are the maximum and minimum thresholds at the RED gateway, respectively; B

    is the buffer size; pb is the marking probability assigned by the RED algorithm; and pmax is the

    marking probability when the average queue length is at Xmax.
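    Equation (2.4) translates directly into code. The sketch below is a plain transcription of the gentle RED marking probability; the parameter values in the example are illustrative only.

```python
# A plain transcription of the gentle RED marking probability in Eq. (2.4).

def gentle_red_probability(x, x_min, x_max, p_max):
    """Return p_b for average queue length x (x is assumed to be at most B)."""
    if x < x_min:
        return 0.0
    if x <= x_max:
        # Linear ramp from 0 at Xmin up to pmax at Xmax
        return (x - x_min) / (x_max - x_min) * p_max
    if x <= 2 * x_max:
        # "Gentle" region: linear ramp from pmax at Xmax up to 1 at 2*Xmax
        return p_max + (1.0 - p_max) / x_max * (x - x_max)
    # For 2*Xmax <= x <= B every packet is marked
    return 1.0


if __name__ == "__main__":
    for q in [10, 50, 100, 150, 220, 300]:
        print(q, round(gentle_red_probability(q, x_min=50, x_max=150, p_max=0.1), 3))
```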

    As shown in Fig. 2.8, the marking probability is a function only of the average queue

    length, x. However, the real implementation for RED is ac