Resource Allocation in OFDMA Wireless Networks With Time Varying Arrivals


    Resource Allocation in OFDMA Wireless Networks with Time-Varying Arrivals

    Somsak Kittipiyakul and Tara Javidi, Member, IEEE

    Abstract

    This paper considers optimal subcarrier allocation in OFDMA wireless networks when the arrivals and channels are stochastic and time-varying. Our objective is to minimize the long-term average packet delay over multiple time epochs. We show by an example that a water-filling-based subcarrier allocation policy, which maximizes throughput at each time epoch, is not optimal in this case. The reason is that such a policy ignores the time-varying queue state information, which is necessary to minimize the long-term average delay. We present an optimal policy for the special case of an On-Off channel and two homogeneous users. For a general channel, the optimal policy is complicated and unknown. However, based on the insights learned from the On-Off case, we provide heuristic policies that use different degrees of knowledge about the channel and queue state information. Through simulation, we show that the value of queue vs. channel information varies with the traffic load. For instance, in the low-to-moderate traffic regime, a policy which ignores much of the channel state information outperforms a water-filling policy which ignores much of the queue state information. The opposite is true in the heavy traffic regime.

    Index Terms

    OFDMA, subcarrier allocation, water-filling, load-balancing, queue state information

    I. INTRODUCTION

    ORTHOGONAL frequency-division multiple access (OFDMA) is a promising technique to provide multiple access control (MAC) in high-speed wireless applications (e.g. broadband wireless, 4G systems, LANs) in a hostile multi-path environment with frequency-selective fading. OFDMA achieves high spectral efficiency in a multiuser environment by dividing the total available bandwidth into orthogonal narrow sub-bands to be shared by users in an efficient manner [1]. By adaptively assigning subcarriers to the best users, multi-user water-filling achieves the instantaneous maximum throughput by taking advantage

    S. Kittipiyakul and T. Javidi are with University of California, San Diego. Email: [email protected]

    of channel diversity among users in different locations. Since it is unlikely that selective fading affects users in different locations on similar frequencies, there is an overall gain in selectively assigning subcarriers to users. This is known as multiuser diversity gain. When each user is assumed to have an infinite or sufficient amount of data in the buffer, water-filling is throughput optimal in both the short and the long term. However, when considering finite time-varying packet arrivals and time-varying channels, water-filling is not long-term throughput optimal [2].

    The problem of optimal real-time subcarrier allocation has been recently studied ([3]-[18]). The prior work can be categorized into two classes based on the optimization objectives. The objective in the first class of work ([3]-[9]) is to minimize the total transmit power given constraints on the Quality of Service (QoS) requirements of each user. These constraints include minimum data rate and/or acceptable bit error rate (BER). In [8], although random packet arrivals are considered as inputs to the system, there is a packet scheduler that controls the average transmission rate of data flows into the subcarrier allocation.

    The second class of papers ([10]-[13]) attempts to maximize the total throughput at each decision epoch given a constraint on transmit power per data stream. In [14], the authors generalize the objective to be the sum utility function of the throughputs subject to power constraints. Our work is similar to this class of papers in that we focus on maximizing the system throughput when considering the subcarrier allocation problem in an OFDMA system. We believe that, although there will always be numerous applications for which power minimization is critical, in many commercial applications of future wireless systems, achieving high connection speed (data rate) will be of primary concern rather than power.

    However, the above works [10]-[13] differ from our work in that they all consider throughput maximization at each decision epoch. They provide solutions to a multi-user water-filling problem achieving Shannon capacity under a power constraint at each decision epoch (in this paper, we refer to this as instantaneous throughput maximization). The algorithm is then run every decision epoch to allow updates regarding varying channel states and a varying number of data streams. The problem with this technique is that, by assuming an infinite supply of data per user, the scheduler fails to anticipate the impact of its allocation decision on the future state of the system.

    The present paper is part of an ongoing effort to establish a systematic approach to a series of throughput-related problems which have been observed in OFDMA systems when considering realistic packet arrivals and queue occupancies ([15]-[18]). In [15], a subcarrier allocation based solely on queue backlogs for an MPEG-4 video transmission application is studied. In [16], the same authors study subcarrier allocation based on realistic packet arrivals for MPEG-4 video streaming with respect to a performance metric based on video quality. The authors in [17] propose a subcarrier allocation in an OFDM system with finite buffer space. They show through simulations that water-filling solutions perform poorly with respect to a long-run throughput criterion due to buffer overflow. The authors in [18] demonstrate that a subcarrier allocation that maximizes a network-wide utility function, determined by mean waiting time, significantly extends the stable region for incoming traffic. Furthermore, it provides low-delay transmission and controls the level of fairness for high-speed, bursty, and delay-sensitive traffic.

    In this paper, we focus on long-term average performance, i.e. the average queue backlog over T epochs, rather than on an instantaneous optimization. We argue that the key factor in performance improvement in real systems with finite and time-varying arrivals is the queue backlog information. We show by an example that, even without a constraint on buffer sizes, water-filling-based techniques perform poorly when considered over the long term. We argue that under realistic arrival patterns and moderate loads, maximizing instantaneous throughput (water-filling) is too myopic. Instead, we propose a long-term objective to minimize the average holding cost in the system and attempt to identify the optimal subcarrier allocation with respect to this objective. For a special case of On-Off channels with two homogeneous users, we identify such an optimal policy. In general, an optimal long-term policy must trade off between two competing goals: the desire to get the maximum throughput now and the desire to get the maximum throughput in the future. The second goal requires positioning the system to enjoy the highest multiuser diversity gain in the future by avoiding situations where some queues are empty while others are heavily backlogged. In other words, the optimal long-term policy must take advantage of the knowledge of the current queue backlogs as well as the statistics of the channel and arrival processes to guarantee a balanced load in the future.

    The paper is organized as follows. Section II provides the problem formulation and assumptions. In Section III, we discuss two major classes of policies: the instantaneous throughput maximizing policies (this class includes water-filling) and load-balancing policies. Via a simple example, we show that in general the intersection of these two classes of policies can be empty, making identification of an optimal policy difficult. However, in Section IV, we consider a special case of a two-state (ON-OFF) channel model. For this case, we identify a policy, called Maximum Throughput and Load-Balancing (MTLB), that belongs to both classes and show its optimality for the case of two homogeneous users. In Section V, for the general connectivity model where the optimal policy is unknown, we build on the lesson learned from the MTLB policy in the On-Off connectivity model, namely that queue information is vital for the optimal policy. We design heuristic algorithms derived from MTLB with varying emphasis on channel and queue state information. We then compare their performance via simulation under different arrival loads and traffic distributions. Section VI concludes the paper and discusses directions for future studies. The Appendices contain the construction of the MTLB allocation and the proofs of the discussed theorems and lemmas.

    Here we would like to emphasize that our study shows that load balancing is essential in minimizing

    average delay. This fact is absolutely independent of fairness considerations even though MTLB happens

    to be a max-min fair policy when the users are assumed to be homogeneous. In other words, our interest

    in MTLB policy is not motivated by concerns regarding fairness. Instead, we demonstrate the following

    fundamental point: under realistic arrival patterns, load balancing is an essential element of any delay-optimal policy.

    II. PROBLEM FORMULATION AND ASSUMPTIONS

    A. Model and Notations

    We consider a downlink single-hop OFDMA system composed of a base station and N users with infinite buffers. There are K OFDM subcarriers which are time-slotted. There are N queues at the base station, one for each user, to buffer the data. The users are homogeneous, i.e. they see statistically symmetric arrival and channel connectivity processes. Furthermore, they have the same priority. Packets of fixed size arrive stochastically at each queue j and are transmitted to user j over a set of allocated subcarriers. At the beginning of each timeslot, the assignment of subcarriers to users is made by a centralized resource manager at the base station. The resource manager has perfect knowledge of the queue backlogs and the channel states, which are assumed constant during a timeslot but varying over timeslots (block fading model). We do not allow sharing of any subcarriers. The assignment is announced immediately to all users via a separate control channel. Packets that arrive during the current timeslot are not transmitted in that timeslot.

    In this paper, we use adaptive QAM as an example of the modulation scheme. We use the fact that there is only a very small loss of channel capacity if a white power spectrum is used (i.e. each subcarrier receives equal power) instead of the optimal power spectrum [19]. By allowing the users to be located at different distances from the base, the transmit power per subcarrier of each user is assumed to be pre-adjusted to compensate for the different path losses so that the average received power per subcarrier assigned to user j is equalized for all users j = 1, . . . , N.¹ For example, if the distance between user j and the base is $d_j$ and subcarrier i is assigned to user j, then the required transmit power on subcarrier i is $\frac{P}{K} d_j^{\alpha}$, where $d_j^{\alpha}$ is the path loss, $\alpha$ is a fixed path loss exponent and P is a constant number.² If $h_{ij}$ denotes the channel gain of user j at subcarrier i, then the signal power of this subcarrier received by user j is $\left(\frac{P}{K} d_j^{\alpha}\right) \frac{|h_{ij}|^2}{d_j^{\alpha}}$, which is equal to $\frac{P}{K} |h_{ij}|^2$, independent of the distance $d_j$. Under this assumption, the channel gain $h_{ij}$ can be mapped to the number of packets per timeslot, $c_{ij}$, that subcarrier i can potentially transmit for user j as [19]:

    $$c_{ij} = \left\lfloor \frac{D}{\ell} \max\left\{0,\; 0.31\left(10 \log_{10}\left(\frac{P |h_{ij}|^2}{K N_o}\right) - 6.7\right)\right\} \right\rfloor \qquad (1)$$

    where D is the number of QAM symbols per channel in a timeslot, $\ell$ is the fixed packet length (in bits), $N_o$ is the noise power in the subcarrier and $\lfloor \cdot \rfloor$ is the flooring operation. In other words, we assume that random fading channel conditions can be mapped into a matrix of connectivities $\{c_{ij}\}$ and hence we consider the random connectivity matrix instead of the random fading channels of the users. Figure 1 shows such a procedure for an example user. In this example, we assume that there are two modulation and/or coding types: the first type requires a certain SNR (SNR > S1) and can transmit one packet per timeslot; the other type requires a higher SNR (SNR > S2) and transmits two packets per timeslot. The figure shows that the user having this channel profile can receive 2, 1, and 0 packets on subcarriers 1, 2, and 3, respectively. As a result we map the channel profile for this user given in Figure 1 to a connectivity profile of (2, 1, 0).

    Fig. 1. Mapping of a received SNR channel profile of a user to packet capacity for each subcarrier (SNR versus frequency for subcarriers 1 to 3, with SNR thresholds S1 and S2 for transmitting 1 and 2 packets per timeslot). In this example, the user can potentially transmit 2, 1, and 0 packets on subcarriers 1, 2, and 3, respectively.
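
A minimal sketch of the mapping in (1), assuming it reads as the floor of (D divided by the packet length) times max{0, 0.31(SNR in dB − 6.7)}, where the SNR is P|h_ij|²/(K N_o). The function name and parameter names are illustrative and do not come from the paper.

```python
import math


def packets_per_slot(snr_linear, symbols_per_slot, packet_bits):
    """Packets one subcarrier can carry for one user in a timeslot.

    snr_linear       -- received SNR, P * |h_ij|**2 / (K * N_o), linear scale
    symbols_per_slot -- D, QAM symbols per subcarrier per timeslot
    packet_bits      -- fixed packet length in bits (assumed symbol)
    """
    if snr_linear <= 0:
        return 0  # no signal, nothing can be sent
    snr_db = 10.0 * math.log10(snr_linear)
    # Linear-in-dB approximation of bits per QAM symbol, clipped at zero.
    bits_per_symbol = max(0.0, 0.31 * (snr_db - 6.7))
    return math.floor(symbols_per_slot / packet_bits * bits_per_symbol)


# With D equal to the packet length, the result is the floored bits per symbol.
profile = [packets_per_slot(s, 100, 100) for s in (100.0, 10.0, 1.0)]
```

With these hypothetical SNR values (20 dB, 10 dB and 0 dB), one user's three subcarriers map to a connectivity profile of the same kind as the (2, 1, 0) profile of Figure 1.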

    The following notations are used throughout the paper. Note that we use the following conventions: lower case letters for scalars, bold face lower case letters for row vectors, upper case letters for matrices and scripted upper case letters for spaces of matrices.

    ¹ The pre-adjusted transmit power assumption is used to avoid the near-far problem, which would cause some issues on fairness as discussed in Section VI.

    ² With the symmetric channel and arrival assumption (described later in Assumptions (A1) and (A2)), a subcarrier is equally likely to be assigned to any user and hence the average total transmit power consumed at the base is $\frac{P}{N} \sum_{j=1}^{N} d_j^{\alpha}$, with $\alpha$ the path loss exponent.

    $\mathbf{b}(n) = (b_1, \ldots, b_N)$: backlogs of each queue at the beginning of timeslot n.

    $\mathbf{a}(n) = (a_1, \ldots, a_N)$: stochastic packet arrivals to each queue during timeslot n.

    $C(n) = \{c_{ij}\}$: the K-by-N stochastic connectivity matrix at timeslot n, where $c_{ij}$ denotes the maximum number of packets subcarrier i can serve from queue j. For example, if user 1 has the channel profile at time n given in Figure 1, the first column of C(n) will be the column vector (2, 1, 0).

    $W(n) = \{w_{ij}\}$: the K-by-N allocation matrix at the beginning of timeslot n, with $w_{ij} \in \{0, 1\}$; $w_{ij} = 1$ denotes that subcarrier i is assigned to serve queue j.

    Definition 1: For a row vector $\mathbf{x} = (x_1, \ldots, x_N)$ and a matrix $Y = (\mathbf{y}_1, \ldots, \mathbf{y}_N)$ where $\mathbf{y}_j$ is a column vector, a column-by-column matrix permutation $\boldsymbol{\pi}$ corresponding to a permutation $\pi$ is defined as, for any j and k,

    $$\pi(x_j) = x_k \iff \boldsymbol{\pi}(\mathbf{y}_j) = \mathbf{y}_k$$

    Using the above notations and definition, we make the following assumptions on the arrival and connectivity processes:

    (A1) The packet arrival processes $\{\mathbf{a}(n)\}$ to the users' queues during each timeslot are independent across timeslots. The packet arrival processes are symmetric such that the joint probability mass function is permutation invariant, i.e.

    $$P[\mathbf{a}(n) = \pi(\mathbf{x})] = P[\mathbf{a}(n) = \mathbf{x}]$$

    for any n, vector $\mathbf{x}$ and permutation $\pi$.

    (A2) The connectivity profiles $\{C(n)\}$ are independent across timeslots and symmetric, i.e. the joint pmf for $\{C(n)\}$ is column-by-column permutation invariant:

    $$P[C(n) = \boldsymbol{\pi}(Y)] = P[C(n) = Y]$$

    for any n, matrix Y and column-by-column matrix permutation $\boldsymbol{\pi}$.

    Assumption (A2) is valid when the channel and mobility create a homogeneous environment for all users. Note that (A1) and (A2) imply independence across time but not across users, i.e. at a given time the arrivals to various queues need not be independent.

    Fig. 2. Subcarrier Allocation Problem. Queues 1, . . . , N (users), with arrivals $a_1, \ldots, a_N$, are connected to servers 1, . . . , K (subcarriers) through the connectivities $c_{ij}$.

    B. Problem Formulation

    We now formulate an abstract problem that captures the essential features of the OFDMA problem described above.

    Problem (P): Consider a discrete-time model of N queues served by K servers. At each time, each server can serve one queue, but a queue can be simultaneously served by multiple servers. Server i can serve at most $c_{ij}$ packets from queue j (see Figure 2). At each time, the connectivities $c_{ij}$ of all queue/server pairs are known. We allow for arrivals at each queue at each time, and the arrivals are assumed to occur right before each time. The statistics of the arrival and connectivity processes are assumed to satisfy (A1) and (A2). We wish to determine a Markov server allocation policy $\pi$ that minimizes the cost function at the finite horizon T:

    $$J_T^{\pi} = E[\Phi_T^{\pi} \mid I_0] \qquad (2)$$

    where $I_0$ summarizes all information available at time zero and $\Phi_T^{\pi}$ is the cost under Markov policy $\pi$ over horizon T:

    $$\Phi_T^{\pi} = \sum_{t=0}^{T} \phi(\mathbf{b}(t)) \qquad (3)$$

    where the cost function is $\phi(\mathbf{b}) = \sum_{j=1}^{N} g(b_j)$ and g is a convex and strictly increasing function.

    We note that the restriction to Markov policies does not entail any loss of optimality because Problem (P) is a stochastic control problem with perfect observations [20]. Also, note that when g is the identity function, Problem (P) reduces to an average total backlog ($E[\sum_{t=0}^{T} \sum_{j=1}^{N} b_j(t)]$) minimization problem over horizon T. From Little's theorem, the policy that achieves the minimum average backlog achieves the minimum average packet delay as well. Thus, we can interchange the notions of minimizing average delay and minimizing average backlog.

    C. Related Prior Work in Server Allocation

    Our problem formulation is very similar to the problem of transmission scheduling for wireless and satellite networks, where a limited number of transmitters (servers) or channels have to be allocated to competing users with varying connectivity.

    The authors in [21] consider the problem of allocating a single server to N competing queues. At each timeslot each queue may be connected to or disconnected from the server (ON-OFF), depending on a binary connectivity random variable. They show that the Longest Connected Queue (LCQ) policy stabilizes the system if the system is stabilizable and minimizes the delay for the special case of symmetric queues.

    The authors in [22] further show that, in the case of K servers = N queues, with the constraint that at most C packets can be served in total in each timeslot and with fractional packets allowed to be served to each queue, the optimal policy (the Most Balanced policy) is to serve the queues such that the resulting queue lengths are most balanced. The authors in [22] allow for sharing of the servers (serving a fraction of a packet from a set of queues). Furthermore, they do not allow servers to have distinct connectivity profiles, i.e. in their paper a user is either connected to all servers or to none at all. This is a special case of our problem where C(n) is reduced to a vector.

    In addition, the model used in [23] is similar to the one used in our paper. The authors in [23] consider the problem of batch allocation of bandwidth or servers to multiple queues. Their study focuses on the delay in the observations of channel and queue lengths. Again, the difference from our model is that [23] assumes identical connectivity profiles while we allow for distinct connectivity profiles across servers. However, we do not consider observation delay with respect to queue length, nor do we address imperfect channel estimation.

    III. INSTANTANEOUS THROUGHPUT MAXIMIZING VS. LOAD-BALANCING POLICIES

    In this section, we consider two classes of server allocation policies: a class comprising instantaneous throughput maximizing (IMT) policies and another class of load-balancing (LB) policies. As discussed previously, each class represents one of the competing goals: an IMT policy maximizes the number of packets being served now, while an LB policy maximizes the number of non-empty queues (hence, the multiuser diversity gain and the number of packets served) in the future. To be precise, we first define the feasible allocation and the non-idling feasible allocation. Then, we describe the two classes of policies mentioned above.

    A. Feasible Allocation

    Assume that at the beginning of timeslot n, the state of the system is $(\mathbf{b}, C)$. An allocation $W = \{w_{ij}\}$ is a feasible allocation for timeslot n if
    (a) $c_{ij} = 0 \Rightarrow w_{ij} = 0$; and
    (b) $\sum_{j=1}^{N} w_{ij} \le 1$, $\forall i = 1, \ldots, K$.

    The set of all feasible allocations is denoted by $\mathcal{W}(n, C)$. In addition, define $\mathcal{W}(n, \mathbf{b}, C) \subset \mathcal{W}(n, C)$ to denote the set of all non-idling feasible allocations, i.e. feasible allocations W that also satisfy
    (c) $\sum_{i=1}^{K} w_{ij} c_{ij} \le b_j$, $\forall j = 1, \ldots, N$.
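
Conditions (a) to (c) can be checked mechanically. The sketch below assumes W and C are K-by-N lists of lists (rows are subcarriers, columns are users) and reads the inequalities in (b) and (c) as "at most"; the function names are illustrative only.

```python
def is_feasible(W, C):
    """Conditions (a) and (b) for an allocation W under connectivity C."""
    for w_row, c_row in zip(W, C):
        # (b): each subcarrier (row) serves at most one queue.
        if sum(w_row) > 1:
            return False
        # (a): a subcarrier with zero capacity for a user is never given to it.
        if any(w == 1 and c == 0 for w, c in zip(w_row, c_row)):
            return False
    return True


def is_non_idling(W, C, b):
    """Conditions (a)-(c): feasible, and no queue is offered more than its backlog."""
    if not is_feasible(W, C):
        return False
    for j, backlog in enumerate(b):
        offered = sum(W[i][j] * C[i][j] for i in range(len(W)))
        if offered > backlog:  # (c) violated for queue j
            return False
    return True
```

For instance, with C = [[1, 2], [1, 0]], the allocation [[0, 1], [1, 0]] passes both checks for backlogs [4, 2], while it fails condition (c) for backlogs [4, 1] because user 2 would be offered two packets.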

    B. Instantaneous Throughput Maximizing Policies (IMT)

    Instantaneous Throughput Maximizing: An IMT allocation $W^*(n) = \{w^*_{ij}\} \in \mathcal{W}(n, C)$ is a feasible allocation that achieves the maximum throughput at time n, i.e. for all $W(n) = \{w_{ij}\} \in \mathcal{W}(n, C)$,

    $$\sum_{j=1}^{N} \sum_{i=1}^{K} w^*_{ij} c_{ij} I_{\{b_j > 0\}} \ge \sum_{j=1}^{N} \sum_{i=1}^{K} w_{ij} c_{ij} I_{\{b_j > 0\}}, \qquad (4)$$

    where the indicator function $I_E = 1$ if condition E holds and $I_E = 0$ otherwise.

    Note that the traditional water-filling is an IMT policy. Since water-filling maximizes instantaneous throughput by assigning subcarriers based only on the channel state information (CSI), it potentially empties or shortens some queues such that the queue lengths are significantly unbalanced.
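
For small K and N, an IMT allocation can be found by exhaustive search. This is only an illustrative brute-force sketch, not water-filling or any algorithm from the paper; it scores an allocation exactly as in (4), i.e. by the capacity assigned to non-empty queues. An allocation is encoded as a tuple whose entry i is the user served by subcarrier i (or None), which is an assumption of this sketch.

```python
from itertools import product


def feasible_allocations(C):
    """Yield every feasible allocation for connectivity matrix C (K rows, N columns)."""
    n_users = len(C[0])
    choices = [[None] + [j for j in range(n_users) if row[j] > 0] for row in C]
    yield from product(*choices)


def instantaneous_throughput(assign, C, b):
    """Score from (4): capacity assigned to queues that are non-empty."""
    return sum(C[i][j] for i, j in enumerate(assign) if j is not None and b[j] > 0)


def imt_allocations(C, b):
    """Return the best score and all allocations that attain it."""
    allocs = list(feasible_allocations(C))
    best = max(instantaneous_throughput(a, C, b) for a in allocs)
    return best, [a for a in allocs if instantaneous_throughput(a, C, b) == best]


best, winners = imt_allocations([[1, 2], [1, 0]], [4, 2])
```

Here `winners` contains the single allocation in which subcarrier 1 serves user 2 and subcarrier 2 serves user 1. The search grows as (N + 1)^K, so it is meant for checking small examples only.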

    C. Load-Balancing (LB) Policies

    It is reasonable to maximize the expected future multiuser diversity gain under stochastic arrival and connectivity processes. For that reason, we consider a load-balancing policy: one that distributes the future work load among the queues as evenly as possible so that as many users as possible have data waiting in the queues for transmission. This preserves as much of the future multiuser diversity gain as possible. The future work load here is defined as the length of the queues after the assignment. The Longest Connected Queue policy [21] and the Most Balanced policy [22] are examples of LB policies.

    To introduce the LB policy, we need the following definitions to compare queue vectors in terms of their load distribution:

    Definition 2: The ordering function $\mathrm{ord} : \mathbb{R}^N \to \mathbb{R}^N$ is such that, for all $\mathbf{x} \in \mathbb{R}^N$, $\mathbf{y} = \mathrm{ord}(\mathbf{x})$ has the elements of $\mathbf{x}$ arranged in descending order, i.e. $y_i \ge y_{i+m}$, $m > 0$.

    Definition 3: We say $\mathbf{x} \le_{LQO} \mathbf{y}$ ($\mathbf{x}$ is more balanced than $\mathbf{y}$) iff $\mathrm{ord}(\mathbf{x}) \le_{lex} \mathrm{ord}(\mathbf{y})$, where the relation $\le_{lex}$ on $\mathbb{R}^N$ is the lexicographic (i.e. dictionary or alphabetic) ordering (p. 12 [26]).

    Example: $(2, 2, 1) \le_{LQO} (1, 3, 1)$ because $\mathrm{ord}(2, 2, 1) = (2, 2, 1) \le_{lex} (3, 1, 1) = \mathrm{ord}(1, 3, 1)$.

    Load Balancing: An LB allocation $W^*(n) = \{w^*_{ij}\} \in \mathcal{W}(n, C)$ is a feasible allocation that produces the most balanced queues after the assignment, i.e. for all $W(n) = \{w_{ij}\} \in \mathcal{W}(n, C)$,

    $$[\mathbf{b} - \mathbf{1}(W^* \odot C)]^+ \le_{LQO} [\mathbf{b} - \mathbf{1}(W \odot C)]^+ \qquad (5)$$

    where the element-wise product $W \odot C$ is the matrix $\{w_{ij} c_{ij}\}$ and $\mathbf{1}$ is a row vector of ones. Note that for a vector $\mathbf{v} \in \mathbb{R}^N$, $[\mathbf{v}]^+ = \{v_1^+, \ldots, v_N^+\}$ where $v_j^+ = v_j I_{\{v_j > 0\}}$.

    Note that LB policies potentially sacrifice the current throughput (by giving priority to long queues) for the future throughput (by increasing the future multiuser diversity gain).
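
The ordering of Definitions 2 and 3 maps directly onto Python's built-in lexicographic comparison of tuples. The sketch below assumes the relation is read as "less than or equal" on the descending-sorted vectors, and finds an LB allocation by exhaustive search over feasible allocations; the function names and the tuple encoding of an allocation (entry i is the user served by subcarrier i, or None) are assumptions of this sketch, not the paper's construction.

```python
from itertools import product


def ord_desc(x):
    """Definition 2: the elements of x in descending order."""
    return tuple(sorted(x, reverse=True))


def more_balanced(x, y):
    """Definition 3: True when x is at least as balanced as y."""
    return ord_desc(x) <= ord_desc(y)  # tuples compare lexicographically


def leftover(assign, C, b):
    """Queue lengths after service, i.e. [b - 1(W o C)]^+ in (5)."""
    served = [0] * len(b)
    for i, j in enumerate(assign):
        if j is not None:
            served[j] += C[i][j]
    return tuple(max(bj - s, 0) for bj, s in zip(b, served))


def lb_allocation(C, b):
    """Brute-force search for a feasible allocation with the most balanced leftover."""
    n_users = len(b)
    choices = [[None] + [j for j in range(n_users) if row[j] > 0] for row in C]
    return min(product(*choices), key=lambda a: ord_desc(leftover(a, C, b)))


lb = lb_allocation([[1, 2], [1, 0]], [4, 2])
```

For C = [[1, 2], [1, 0]] and queues [4, 2], the search assigns both subcarriers to user 1 and leaves the queues at (2, 2), trading one packet of immediate throughput for balance.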

    D. Example

    This example shows that an average-delay optimal policy (which is also an average-backlog optimal policy by Little's theorem, as noted earlier) could be a mixture of IMT and LB policies. Table I gives a simple example demonstrating IMT, LB and mixture policies. For illustration, we assume the arrival and connectivity processes to have a periodic structure with a period of six timeslots. The initial queue lengths b are [2, 1]. The subcarrier allocation matrix for each timeslot is denoted by underlining elements of each connectivity matrix. For example, for timeslot 1, the choice of $\begin{pmatrix} 1 & \underline{2} \\ \underline{1} & 0 \end{pmatrix}$ for the IMT policy indicates that it allocates $W = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}$ and achieves 3 packets of throughput while leaving the queue lengths after the allocation at [3, 0], a highly unbalanced state. On the other hand, the choice of $\begin{pmatrix} \underline{1} & 2 \\ \underline{1} & 0 \end{pmatrix}$ indicates that the LB policy assigns $W = \begin{pmatrix} 1 & 0 \\ 1 & 0 \end{pmatrix}$ and has 2 packets of throughput but the leftover queue lengths are balanced at [2, 2]. Which policy is better cannot be told at timeslot 1 since the overall result depends on the future arrivals and connectivities.

    TABLE I. An example showing IMT, LB, and mixture policies. Each entry lists the values for queues 1 and 2; the arrivals and the connectivity matrices are the same under all three policies.

    Timeslot                    1      2      3      4      5      6
    Arrivals (a)               2 1    0 2    2 0    0 2    1 2    2 0
    Connectivity (C), row 1    1 2    2 1    1 1    2 1    2 0    0 2
    Connectivity (C), row 2    1 0    0 1    1 1    1 1    0 0    0 0

    IMT (total throughput 13, total backlog 30)
    Queues (b + a)             4 2    3 2    3 1    1 3    1 4    2 4
    Leftover queues            3 0    1 1    1 1    0 2    0 4    2 2
    Backlog                     6      5      4      4      5      6

    LB (total throughput 12, total backlog 38)
    Queues (b + a)             4 2    2 4    4 2    2 4    3 4    3 4
    Leftover queues            2 2    2 2    2 2    2 2    1 4    3 2
    Backlog                     6      6      6      6      7      7

    Mixture (total throughput 14, total backlog 29)
    Queues (b + a)             4 2    3 2    3 1    1 3    2 3    2 3
    Leftover queues            3 0    1 1    1 1    1 1    0 3    2 1
    Backlog                     6      5      4      4      5      5

    Because of the periodic structure of the arrivals and connectivities in this particular example, the allocations repeat every six timeslots. In every 6-timeslot period, there are 14 packet arrivals but only 13 packet departures under the IMT policy and 12 under LB, resulting in instability of the queue backlogs (growing to infinity). Now we consider a third, mixture policy whose allocations are given in Table I. It results in a better delay performance. In contrast to the pure IMT and LB policies, the mixture policy sometimes maximizes instantaneous throughput (e.g. at timeslot 1), sometimes balances the load (e.g. at timeslot 4), and sometimes does both simultaneously (e.g. at the other timeslots). We notice that this policy stabilizes the queues. This can be seen directly from the queue backlogs. The IMT and LB policies add 1 and 2 packets, respectively, to the queues every 6 timeslots. Hence, their queues grow unboundedly over time, while the queues under the mixture policy return to 2 and 1 packets at the end of every 6 timeslots. Although this is only a pathological example, it illustrates the difficulty of devising an optimal policy in the context of delay, where the loss of delay optimality can cause instability.
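
The totals in Table I can be re-derived from the arrivals and the leftover queues alone. The sketch below assumes the table is read as six two-queue columns per policy; the names ARRIVALS, LEFTOVER and replay are illustrative.

```python
# One period of arrivals, as (queue 1, queue 2) per timeslot.
ARRIVALS = [(2, 1), (0, 2), (2, 0), (0, 2), (1, 2), (2, 0)]

# Leftover queues after service in each timeslot, per policy.
LEFTOVER = {
    "IMT": [(3, 0), (1, 1), (1, 1), (0, 2), (0, 4), (2, 2)],
    "LB": [(2, 2), (2, 2), (2, 2), (2, 2), (1, 4), (3, 2)],
    "Mixture": [(3, 0), (1, 1), (1, 1), (1, 1), (0, 3), (2, 1)],
}


def replay(initial, arrivals, leftovers):
    """Return (total backlog, total throughput, final queues) over one period."""
    b = initial
    total_backlog = 0
    total_throughput = 0
    for a, left in zip(arrivals, leftovers):
        queue = tuple(x + y for x, y in zip(b, a))  # queues after arrivals (b + a)
        total_backlog += sum(queue)
        total_throughput += sum(queue) - sum(left)  # packets that departed
        b = left
    return total_backlog, total_throughput, b


RESULTS = {name: replay((2, 1), ARRIVALS, seq) for name, seq in LEFTOVER.items()}
```

Starting from [2, 1], only the mixture policy ends the period back at (2, 1); IMT ends at (2, 2) and LB at (3, 2), which is the one- and two-packet growth per period described above.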

    IV. SPECIAL CASE: ON-OFF CHANNEL

    In this section, we consider a special case of the connectivity process where $c_{ij}$ only takes the values 0 (OFF) or 1 (ON). Under this On-Off connectivity, we show that there exists a policy that meets both the instantaneous throughput maximizing and the load balancing objectives for the case of two homogeneous users. We then show that, in this special On-Off case, the Maximum Throughput and Load-Balancing (MTLB) policy is an optimal policy for Problem (P). At each timeslot, the MTLB allocation is computed by using a Maximum Weight Matching (MWM) algorithm (see Appendix I). In this section, we give a definition of the MTLB policy specific to the On-Off channel, discuss its existence, and prove its optimality.

    A. MTLB Policy

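    Definition 4: Given state $(\mathbf{b}, C)$ at the beginning of timeslot n, the MTLB policy chooses a non-idling feasible packet withdrawal matrix $W^*(n) = \{w^*_{ij}\} \in \mathcal{W}(n, \mathbf{b}, C)$ such that

    (C1) Maximum Throughput: $W^*(n)$ achieves the maximum throughput, i.e. for all $W = \{w_{ij}\} \in \mathcal{W}(n, \mathbf{b}, C)$,

    $$\sum_{j=1}^{N} \sum_{i=1}^{K} w^*_{ij} \ge \sum_{j=1}^{N} \sum_{i=1}^{K} w_{ij}. \qquad (6)$$

    (C2) Load Balancing: $W^*(n)$ produces the most balanced queues, i.e. for all $W = \{w_{ij}\} \in \mathcal{W}(n, \mathbf{b}, C)$,

    $$\mathbf{b} - \mathbf{1}W^* \le_{LQO} \mathbf{b} - \mathbf{1}W. \qquad (7)$$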
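
For small On-Off instances, conditions (C1) and (C2) can be checked by enumeration. The sketch below is a brute-force illustration under the assumption that the ordering in (7) is the lexicographic comparison of descending-sorted leftover queues; it is not the Maximum Weight Matching construction of Appendix I, and the function names and tuple encoding of an allocation are hypothetical.

```python
from itertools import product


def ord_desc(x):
    """Elements of x in descending order, for the balance comparison."""
    return tuple(sorted(x, reverse=True))


def non_idling_allocations(C, b):
    """Yield (allocation, leftover queues) for every non-idling feasible allocation.

    C is a K-by-N matrix of 0/1 connectivities; entry i of an allocation is the
    user served by subcarrier i, or None if the subcarrier is unused.
    """
    n_users = len(b)
    choices = [[None] + [j for j in range(n_users) if row[j] == 1] for row in C]
    for assign in product(*choices):
        served = [sum(1 for j in assign if j == u) for u in range(n_users)]
        if all(s <= bj for s, bj in zip(served, b)):  # condition (c)
            yield assign, tuple(bj - s for bj, s in zip(b, served))


def mtlb(C, b):
    """Return an allocation meeting both (C1) and (C2), or None if none is found."""
    candidates = list(non_idling_allocations(C, b))
    max_served = max(sum(b) - sum(left) for _, left in candidates)
    best_order = min(ord_desc(left) for _, left in candidates)
    for assign, left in candidates:
        if sum(b) - sum(left) == max_served and ord_desc(left) == best_order:
            return assign, left
    return None


result = mtlb([[1, 1], [1, 0]], (3, 1))
```

With two subcarriers, subcarrier 1 connected to both users, subcarrier 2 connected only to user 1, and queues (3, 1), the search gives both subcarriers to user 1: this serves the maximum of two packets and leaves the queues at (1, 1).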

    B. Existence of MTLB Policy

    We show the existence of MTLB policy constructively in Appendix I by constructing an algorithm that

    results in an MTLB allocation at each decision epoch.

    C. Optimality of MTLB Policy

    We have the following main theorem of the paper:

    Theorem 1: Consider Problem (P) with On-Off connectivity, N = 2 users, a finite horizon T, any initial state $I_0 = (\mathbf{b}, C)$ and a cost function $\phi(\mathbf{b}) = \sum_{j=1}^{N} g(b_j)$ where g is a strictly increasing and convex function. Then the MTLB policy is optimal at all times n = 1, . . . , T.

    The proof is given in Appendix II. We conjecture that Theorem 1 also holds for any $N \ge 3$ users. The proof requires complicated and lengthy arguments and is left for future work.

    Remark: Condition (C1) is a maximum instantaneous throughput or water-filling condition. In other words, the optimality of the MTLB policy shown in Theorem 1 implies that the water-filling criterion is not sufficient to guarantee long-term throughput optimality unless it is complemented by a load-balancing criterion (condition (C2)).

    Remark: Theorem 1, in addition, can be extended to the optimality of MTLB in an expected average cost sense for an infinite horizon problem.

    Corollary 1: Consider an infinite horizon version of Problem (P), where the cost is modified to be the average expected cost at each stage. Then MTLB is optimal for any initial state $I_0 = (\mathbf{b}, C)$.

    Proof: The theorem proves that there exists a stationary MTLB policy which is optimal for Problem (P) for any finite horizon T. Hence, our MTLB policy minimizes the average expected cost $J_T^{\pi}/T$ for any finite horizon T. Since the policy is independent of the horizon T, it is optimal with respect to an average expected cost criterion for the infinite horizon version of the problem.

    V. GENERAL CONNECTIVITY MODEL

    We have seen from the example in Table I that the optimal policy for the general connectivity model, i.e. when $c_{ij} \in \{0, 1, \ldots, c_{max}\}$ with $c_{max} \ge 2$, is rather complicated and unknown. Furthermore, we know that an MTLB policy does not, in general, exist in this more general case.³ Thus, in this section, we return to the two classes of policies, IMT and LB, defined in Section III. We will see that each class can outperform the other depending on the system load. From the On-Off connectivity model, we have learned that queue information is vital for an average-delay optimal policy. We use these insights to design several heuristic policies that are expected to be close to the optimal delay performance for a large subset of admissible loads. The policies use different degrees of knowledge of connectivity and queue information (called Channel and Queue State Information, CSI and QSI, respectively). We then compare the performance of all these policies in terms of average queue backlog (or equivalently, average packet delay) by simulation under different traffic loads and traffic types.

    A. Heuristic Policies

    1) Algo-I (full QSI, On-Off CSI): The subcarrier assignment uses full information about the queue lengths (full QSI) and minimal information about the channel. The channel quality is thresholded to be either ON or OFF (On-Off CSI), i.e. a subcarrier is considered ON if $h_{ij} \ge h_{threshold}$.⁴ Then, the MTLB policy introduced in Section IV-A is used for subcarrier allocation. Note that the best allocation is found by using the maximum weight matching algorithm described in Appendix I.

    ³ Moreover, we do not have any numerical solution for the general connectivity model, since all numerical solutions to the dynamic programming equation (shown in Appendix II-B) suffer from the curse of dimensionality (note that our state space grows exponentially in the number of subcarriers, users, and the maximum queue sizes).

    ⁴ The threshold is arbitrarily chosen. It should be adjusted depending on the load of the system and the number of subcarriers and users. We assume here that the threshold is fixed in our simulation.

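    Algo-I:

    $$c_{ij} = \begin{cases} 1 & \text{if } h_{ij} \ge h_{threshold}, \\ 0 & \text{otherwise.} \end{cases}$$

    $$c_{th} = D \left[ 0.31 \left( 10 \log\left( \frac{P}{K} \frac{h_{threshold}^2}{N_o} \right) - 6.7 \right) \right]^{+}$$

    $$b^p_j = b_j / c_{th}, \quad j = 1, \ldots, N$$

    For state $(\mathbf{b}^p, C)$ compute an MTLB allocation $W^*$ (see Appendix I).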

    Note that the ON subcarriers are treated as in a worst-case scenario (each is credited only with the threshold rate $c_{th}$), so the throughput of user $j$ is equal to $\min\{b_j,\; c_{th} \sum_{i=1}^{K} w^*_{ij}\}$.
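    The pre-processing steps of Algo-I can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the logarithm is assumed to be base 10 (an SNR in dB), and the MTLB step itself is left to the matching procedure of Appendix I.

```python
import math

def onoff_connectivity(h, h_threshold):
    """Threshold channel gains h[i][j] (subcarrier i, user j) to ON (1) or OFF (0)."""
    return [[1 if gain >= h_threshold else 0 for gain in row] for row in h]

def threshold_rate(D, P, K, h_threshold, N_o):
    """Worst-case packets per ON subcarrier, c_th (log assumed to be base 10)."""
    snr_db = 10.0 * math.log10((P / K) * h_threshold ** 2 / N_o)
    return D * max(0.0, 0.31 * (snr_db - 6.7))

def scaled_backlogs(b, c_th):
    """Backlogs expressed in units of c_th, as fed to the MTLB step."""
    if c_th <= 0:
        raise ValueError("c_th must be positive to scale the backlogs")
    return [bj / c_th for bj in b]

def algo_i_throughput(b, w_counts, c_th):
    """Packets served per user: min{b_j, c_th * (number of subcarriers given to j)}."""
    return [min(bj, c_th * wj) for bj, wj in zip(b, w_counts)]
```

    For instance, with a threshold rate of 2 packets per ON subcarrier, a user with 5 queued packets and 2 assigned subcarriers is served 4 packets, while a user with 1 queued packet and 1 subcarrier is served 1 packet.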

    2) Algo-II (full CSI, On-Off QSI): The subcarrier allocation uses full information about the channel (full CSI) and minimal information about the queue lengths (On-Off QSI). The subcarrier allocation considers only the queues that have some data to transmit. This allocation belongs to the class of IMT policies defined in (4).

    Algo-II:

    Assign $W^* = \{w^*_{ij}\}$ such that $w^*_{i j(i)} = 1$, where $j(i) = \arg\max_{j \in \{1,\ldots,N\}} c_{ij} I_{\{b_j > 0\}}$.
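    A minimal sketch of the Algo-II rule is given below. The function name, the tie-breaking rule (lowest user index) and the choice to leave a subcarrier unassigned when every candidate weight is zero are assumptions made for illustration; the text does not specify them.

```python
def algo_ii(c, b):
    """c[i][j]: rate of subcarrier i for user j; b[j]: backlog of user j.

    Returns owner[i], the user that receives subcarrier i, or None when no
    non-empty queue has a positive rate on that subcarrier.
    """
    owner = []
    for row in c:
        best_user, best_rate = None, 0
        for j, rate in enumerate(row):
            effective = rate if b[j] > 0 else 0  # c_ij * I{b_j > 0}
            if effective > best_rate:            # strict: ties keep the lowest index
                best_user, best_rate = j, effective
        owner.append(best_user)
    return owner
```

    Note that only the indicator of a non-empty queue enters the decision, so a user with a single packet competes on equal terms with a heavily backlogged user.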

    3) Algo-III and Algo-IV (full CSI and full QSI): Algo-III is the Maximum-Weight policy proposed in [24], which achieves full throughput for all admissible traffic. Under the Maximum-Weight policy, each subcarrier is assigned to the queue with the highest weighted connectivity, where the weight is the queue backlog size. Thus, when the connectivity of a subcarrier is equal for multiple queues, the subcarrier is given to the longest of those queues. However, Algo-III may over-assign subcarriers to some users (i.e. the users do not have enough data to send over all assigned subcarriers). With an insight into the significance of load balancing, we modify the known Algo-III to arrive at Algo-IV. Algo-IV avoids this imbalance by accounting, in the weight, for the reduction in queue backlog resulting from the packets that will be served by the already assigned subcarriers.

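    Algo-III:

    Assign $W^* = \{w^*_{ij}\}$ such that $w^*_{i j(i)} = 1$, where $j(i) = \arg\max_{j \in \{1,\ldots,N\}} b_j c_{ij}$.

    Algo-IV:

    $X = \{1, \ldots, K\}$;
    Loop (until stop):
        If $X = \emptyset$, then stop;
        $(i^*, j^*) = \arg\max_{i \in X,\, j \in \{1,\ldots,N\}} b_j c_{ij}$;
        If $b_{j^*} c_{i^* j^*} > 0$, then $w^*_{i^* j^*} = 1$, else stop;
        $b_{j^*} = b_{j^*} - c_{i^* j^*}$ and $X = X \setminus \{i^*\}$;
    Assign $W^* = \{w^*_{ij}\}$.

    For both algorithms, note that the throughput of user $j$ is $\min\{b_j, \sum_{i=1}^{K} w^*_{ij} c_{ij}\}$.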
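    The difference between the two rules can be seen in a short sketch. The function names and the tie-breaking order (first maximum found, scanning subcarriers and then users in index order) are assumptions for illustration only.

```python
def algo_iii(c, b):
    """Maximum-Weight rule: subcarrier i goes to argmax_j b[j] * c[i][j]."""
    owner = []
    for row in c:
        weights = [b[j] * row[j] for j in range(len(b))]
        best = max(range(len(b)), key=lambda j: weights[j])
        owner.append(best if weights[best] > 0 else None)
    return owner

def algo_iv(c, b):
    """Greedy variant: after each assignment the winner's backlog is reduced."""
    residual = list(b)
    owner = [None] * len(c)
    free = set(range(len(c)))
    while free:
        i, j = max(((i, j) for i in sorted(free) for j in range(len(b))),
                   key=lambda ij: residual[ij[1]] * c[ij[0]][ij[1]])
        if residual[j] * c[i][j] <= 0:
            break  # no remaining subcarrier can serve any remaining packet
        owner[i] = j
        residual[j] -= c[i][j]
        free.remove(i)
    return owner

def throughput(c, b, owner):
    """Packets served per user: min{b_j, sum of rates of subcarriers given to j}."""
    served = [0] * len(b)
    for i, j in enumerate(owner):
        if j is not None:
            served[j] += c[i][j]
    return [min(bj, sj) for bj, sj in zip(b, served)]
```

    With backlogs (3, 2) and four subcarriers of unit rate for both users, Algo-III gives every subcarrier to the first user and wastes one of them, whereas Algo-IV hands one subcarrier to the second user once the first user's residual backlog has dropped.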

    B. Numerical Comparisons and Simulation

    We consider a downlink OFDMA system in a single cell with one base station, composed of $N = 32$ statistically independent and identical users and $K = 128$ subcarriers. We generate a frequency-selective channel by using a 26-tap multipath model with an exponential intensity profile and use adaptive QAM modulation. We set the parameters $P$, $D$, and $N_o$ in (1) so that the allocation of subcarriers over a block is equivalent to the server scheduling problem where the connectivity $c_{ij} \in \{0, 1, 2, 3, \ldots\}$. All simulations are conducted over 6,000 timeslots.

    We consider arrivals of fixed-size packets where the number of arrivals per timeslot for each queue is a random variable having one of three distributions: heavy-tailed, Poisson and deterministic.⁵ The heavy-tailed distribution is said to model realistic packet traffic (e.g. in the Internet), which exhibits long-range dependency [35]. Here, we use the simplest heavy-tailed distribution, the Pareto distribution.⁶ The heavy-tailed, Poisson and deterministic traffic types represent decreasing degrees of burstiness, respectively.
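    As an illustration of two of these traffic types, the sketch below draws Pareto samples by inverse-transform sampling for a distribution function of the form $1 - (k/x)^{\alpha}$ and generates a deterministic stream whose per-slot counts take only the two integers adjacent to the average. The function names, the use of a running fractional credit to interleave the two integer values, and the absence of any truncation or rounding of the Pareto samples are assumptions; the paper does not describe its generators at this level of detail.

```python
import math
import random

def pareto_sample(k, alpha, rng):
    """One draw X with P(X < x) = 1 - (k / x) ** alpha for x >= k."""
    u = 1.0 - rng.random()          # uniform on (0, 1], avoids division by zero
    return k / u ** (1.0 / alpha)

def deterministic_arrivals(lam, num_slots):
    """Integer arrivals per slot, each floor(lam) or ceil(lam), averaging lam."""
    arrivals, credit = [], 0.0
    for _ in range(num_slots):
        credit += lam
        n = math.floor(credit)      # release the whole packets accumulated so far
        arrivals.append(n)
        credit -= n
    return arrivals
```

    For example, an average of 2.5 packets per slot yields the repeating pattern 2, 3, 2, 3, and with $\alpha = 2$ about three quarters of the Pareto samples fall below $2k$.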

    Figures 3, 4, and 5 compare the performance of the proposed algorithms under different traffic models in terms of the average total queue backlog (equivalently, in terms of the average delay by Little's theorem). For all traffic types, Algo-III and Algo-IV, as expected, outperform Algo-I and Algo-II because Algo-III and Algo-IV use both CSI and QSI for the assignment decision. This observation is consistent with [17] and [18], as discussed before. Furthermore, Algo-IV outperforms Algo-III by balancing the queues (at the cost of a higher computational burden).

    However, the more interesting and important observation is the performance of Algo-I in the light-to-moderate traffic regime (below 6 packets/user/timeslot) for all traffic types. Although Algo-I ignores much of the CSI, it outperforms Algo-II and Algo-III significantly. This is because Algo-II and Algo-III over-assign the capacity. In addition, in this low-to-moderate traffic regime, the traffic load is light enough compared to the system capacity that the load can be sustained even with only On-Off knowledge of the CSI. However, as the traffic intensity increases, the performance of Algo-I degrades sharply, reflecting the policy's maximum stable rate (about 6 to 8 packets/timeslot/user, depending on traffic burstiness). In other words, Algo-II has a larger stability region than Algo-I, since Algo-II uses full CSI and its over-assignment problem disappears at heavy traffic. This insight sheds light on the nature of delay performance versus throughput considerations and the benefit of using queue information. At light-to-moderate traffic intensity (resulting in reasonable delays), the value of QSI outweighs that of CSI. This means that CSI is critical for high-throughput and delay-insensitive applications, while QSI is vital for delay-sensitive traffic at low-to-moderate throughput.

    ⁵ Since we allow only an integer number of packet arrivals in each timeslot, we assume arrivals of $\lfloor\lambda\rfloor$ and $\lceil\lambda\rceil$ packets to allow for a non-integer average number of arrivals $\lambda$.

    ⁶ We use a slightly modified Pareto distribution, i.e. its cumulative distribution function is $P(X < x) = 1 - (k/x)^{\alpha}$, where $k$ is the minimum value the random variable $X$ can take, and we set $\alpha = 2$.

    Fig. 3. Average queue backlog for the truncated heavy-tailed distribution. (Plot: average queue backlog in pkts/timeslot/user versus average load in pkts/timeslot/user for Algo-I, Algo-II, Algo-III and Algo-IV.)

    Fig. 4. Average queue backlog for the Poisson distribution. (Plot: average queue backlog in pkts/timeslot/user versus average load in pkts/timeslot/user for Algo-I, Algo-II, Algo-III and Algo-IV.)

    Fig. 5. Average queue backlog for the constant arrivals. (Plot: average queue backlog in pkts/timeslot/user versus average load in pkts/timeslot/user for Algo-I, Algo-II, Algo-III and Algo-IV.)

    VI. CONCLUSION AND FUTURE RESEARCH

    In this paper, we considered the problem of subcarrier allocation in an OFDMA system. We argued that conventional water-filling policies based on maximizing instantaneous throughput are too myopic and ignore important information about the state variable (queue length). We identified a policy (MTLB) that achieves the instantaneous maximum throughput while also balancing the queue lengths. Such a policy always exists when the channel follows a symmetric ON/OFF model. In this case, we conjectured that MTLB achieves the minimum average delay (mean response time) at any time. We proved this when $N = 2$, while the proof for $N \ge 3$ users remains open.


    For more realistic channels (general connectivity matrix) but symmetric users, we proposed two heuristic policies (Algo-III and Algo-IV) based on the insights obtained from the On-Off connectivity case. We showed by simulation that the value of CSI and QSI in optimizing the performance depends heavily on the arrival statistics. We showed that in the low-to-moderate traffic regime, and from a delay optimality perspective, balancing the queues is more critical than opportunistically taking advantage of CSI. The opposite becomes true in the heavy traffic regime. Furthermore, the proposed heuristic algorithms (Algo-III and Algo-IV) use both sets of information and perform well in both regimes. What remains is the extension of the above results to a network of heterogeneous users (in terms of priority, statistics of arrivals and channel quality). We believe that in such systems the fundamental tradeoff between instantaneous throughput and setting up the system for highest future multi-user diversity remains. The challenge, though, is to understand how heterogeneity of users will impact the notion of balancing.

    It is useful to note that our symmetry assumption is a statistical notion. It is justified if we assume that variation in the channel and mobility create a homogeneous environment for all users. In cases where this is not true, as with low mobility or users in a large cell, the near-far problem and the consequent fairness-throughput issues are challenging problems by themselves, as is evident in [25] and the references therein. We note that our work is complementary to these papers, in that we address the significance of queue information in allocating subcarriers even in the absence of such a near-far problem. Therefore, combining the general connectivity matrix and heterogeneous users in our model to reflect practical systems is an important problem for future research.

    ACKNOWLEDGMENT

    This research was supported in part by the National Science Foundation ADVANCE Cooperative Agreement No. SBE-0123552. The authors would like to thank Professor Richard Ladner and Tami Tamir for their helpful suggestions.

    REFERENCES

    [1] R. S. Cheng and S. Verdu, "Gaussian multiaccess channels with ISI: capacity region and multiuser water-filling," IEEE Trans. on Inform. Theory, vol. 39, no. 3, pp. 773-785, May 1993.

    [2] S. Kittipiyakul and T. Javidi, "Subcarrier allocation in OFDMA systems: beyond water-filling," 2004 Asilomar Conference on Signals, Systems, and Computers, Nov. 2004.

    [3] D. Kivanc and H. Liu, "Computationally efficient bandwidth allocation and power control for OFDMA," IEEE Trans. on Wireless Comm., vol. 2, no. 6, Nov. 2003.

    [4] C. Y. Wong, R. S. Cheng, K. B. Letaief, and R. D. Murch, "Multiuser OFDM with adaptive subcarrier, bit and power allocation," IEEE JSAC, vol. 17, no. 10, pp. 1747-1757, Oct. 1999.

    [5] S. Pietrzyk and G. J. M. Janssen, "Multiuser subcarrier allocation for QoS provision in the OFDMA systems," IEEE VTC 2002 Fall, vol. 2, Sept. 2002.

    [6] E. Bakhtiari and B. Khalaj, "A new joint power and subcarrier allocation scheme for multiuser OFDM systems," 14th IEEE Proc. on Personal, Indoor and Mobile Radio Comm., 2003.

    [7] G. Zhang, "Subcarrier and bit allocation for real-time services in multiuser OFDM systems," 2004 IEEE International Conf. on Communications, 2004.

    [8] Y. Zhang and K. Letaief, "Adaptive resource allocation and scheduling for multiuser packet-based OFDM networks," IEEE ICC 2004, Paris, June 2004.

    [9] H. Yin and H. Liu, "An efficient multiuser loading algorithm for OFDM-based broadband wireless systems," Globecom '00, San Francisco, USA, 2000.

    [10] M. Ergen, S. Coleri and P. Varaiya, "QoS aware adaptive resource allocation techniques for fair scheduling in OFDMA based broadband wireless access systems," IEEE Transactions on Broadcasting, vol. 49, no. 4, Dec. 2003.

    [11] J. Jang, K. B. Lee, and Y. H. Lee, "Transmit power and bit allocations for OFDM systems in a fading channel," Proc. IEEE GLOBECOM 2003, Dec. 2003.

    [12] G. Munz, S. Pfletschinger, and J. Speidel, "An efficient water-filling algorithm for multiple access OFDM," IEEE Globecom '02, Taipei, Taiwan, November 2002.

    [13] W. Rhee and J. M. Cioffi, "Increase in capacity of multiuser OFDM system using dynamic subchannel allocation," Proc. of Vehicular Tech. Conf. (VTC), 2000, vol. 2, pp. 1085-1089, May 2000.

    [14] G. Song and Y. Li, "Cross-layer optimization for OFDM wireless networks - part I: theoretical framework," IEEE Trans. Wireless Comm., vol. 4, no. 2, March 2005.

    [15] J. Gross, J. Klaue, H. Karl and A. Wolisz, "Subcarrier allocation for variable bit rate video streams in wireless OFDM systems," IEEE VTC, Florida, USA, 2003.

    [16] J. Gross, J. Klaue, H. Karl, and A. Wolisz, "Cross-layer optimization of OFDM transmission systems for MPEG-4 video streaming," Computer Communications, 2004.

    [17] G. Li and H. Liu, "Dynamic resource allocation with finite buffer constraint in broadband OFDMA networks," IEEE Wireless Comm. and Networking, vol. 2, pp. 1037-1042, March 2003.

    [18] G. Song, Y. Li, L. Cimini, Jr., and H. Zheng, "Joint channel-aware and queue-aware data scheduling in multiple shared wireless channels," Globecom 2003, San Francisco, 2003.

    [19] A. Czylwik, "Adaptive OFDM for wideband radio channels," Proc. GLOBECOM '96, vol. 1, pp. 713-718.

    [20] R. Kumar and P. Varaiya, Stochastic Control, Prentice-Hall, 1986.

    [21] L. Tassiulas and A. Ephremides, "Dynamic server allocation to parallel queues with randomly varying connectivity," IEEE Trans. on Info. Theory, vol. 39, no. 2, pp. 466-478, 1993.

    [22] A. Ganti, Transmission Scheduling for Multi-Beam Satellite Systems, Doctoral Thesis, Dept. of EECS, MIT, Cambridge, MA, 2003.

    [23] N. Ehsan and M. Liu, "Properties of optimal resource sharing in a delay channel," IEEE CDC 2004, Dec. 2004.

    [24] T. Javidi, "Rate stable resource allocation in OFDM systems: from waterfilling to queue-balancing," Allerton Conference on Communication, Control, and Computing, September 2004.

    [25] Z. Shen, J. G. Andrews, and B. L. Evans, "Adaptive resource allocation in multiuser OFDM systems with proportional fairness," IEEE Trans. on Wireless Communications, Dec. 2005.

    [26] D. M. Topkis, Supermodularity and Complementarity, Princeton University Press, USA, 1998.

    [27] U. Manber, Introduction to Algorithms: A Creative Approach, Addison-Wesley Publishing Company, 1989.

    [28] N. Harvey, R. Ladner, L. Lovasz, and T. Tamir, "Semi-matchings for bipartite graphs and load balancing," Proc. of the Workshop on Algorithms and Data Structures (WADS '03), Ottawa, Canada, July 2003.

    [29] K. Mehlhorn and S. Naher, The LEDA Platform of Combinatorial and Geometric Computing, Cambridge University Press, 1999.

    [30] B. Hajek, "Optimal control of two interacting service stations," IEEE Trans. Auto. Control, vol. AC-29, pp. 491-499, 1984.

    [31] G. Koole, "Convexity in tandem queues," Prob. in Engr. and Info. Sciences, 2004.

    [32] E. Altman, B. Gaujal and A. Hordijk, Discrete-Event Control of Stochastic Networks: Multimodularity and Regularity, Springer-Verlag, Germany, 2003.

    [33] A. W. Marshall and I. Olkin, Inequalities: Theory of Majorization and Its Application, Academic Press, USA, 1979.

    [34] E. M. Yeh and A. S. Cohen, "Delay optimal rate allocation in multiaccess fading communications," Proc. Allerton Conf. on Communication, Control, and Computing, Monticello, IL, 2004.

    [35] Wikipedia, "Long-range dependency," http://en.wikipedia.org/wiki/Heavytail

    APPENDIX I

    COMPUTATION AND EXISTENCE OF MTLB POLICY

    In this section, we prove the existence of an MTLB policy under an On-Off channel model. The proof is constructive and uses ideas from the graph matching literature. An algorithm to compute an MTLB assignment is proposed. This algorithm is based on the notions of alternating and balancing paths, which are later used to prove the optimality of MTLB. Note that the discussion in this section (and hence the existence proof) is valid for general $N$.

    A. Alternating and Balancing Paths

    Let U be the set of all queues and V the set of all servers. We have the following definitions:

    Definition 5: The ordered interleaved sequence of queues and servers

    $$S(W, u_1, u_k) := (u_1, v_1, u_2, \ldots, u_{k-1}, v_{k-1}, u_k)$$

    is said to be an alternating path from node $u_1$ to node $u_k$ corresponding to allocation $W$ if

    a) $u_l \ne u_j$ and $v_l \ne v_j$ for all $l = 1, \ldots, k$, $j = 1, \ldots, k$ and $l \ne j$;
    b) both queues $u_l$ and $u_{l+1}$ have connectivity to server $v_l$, i.e. $c_{v_l,u_l} = c_{v_l,u_{l+1}} = 1$; and
    c) $W$ assigns server $v_{l-1}$ to queue $u_l$, i.e. $w_{v_{l-1},u_l} = 1$.

    In addition, $S(W, u_1, u_k)$ is called a balancing path from node $u_1$ to node $u_k$ corresponding to allocation $W$ if it also meets the condition

    d) $b_{u_1} - w_{u_1} \ge b_{u_k} - w_{u_k} + 2$, where $w_{u_l} = \sum_{i=1}^{K} w_{i,u_l}$ for $l = 1, k$.

    Definition 6: $W^a$ ($W^b$) is the alternating (balancing) allocation of allocation $W$ along an alternating (balancing) path $S(W, u_1, u_k)$ if queues $u_l$ are reassigned to servers $v_l$, $l = 1, \ldots, k - 1$.

    An example of an alternating path and alternating allocation is shown in Figure 6. Notice that the above notions are conceptually similar to the notion of alternating path in the graph matching literature [27]. In particular, our definition of balancing path is related to the notion of cost-reducing path in [28], in that a balancing path is used to reduce the cost of unbalancedness of the queues.

    Fig. 6. Example of an alternating path and the alternating allocation from queue $u_1$ to queue $u_3$. (a) Alternating path $(u_1, v_1, u_2, v_2, u_3)$. The dotted and solid lines show connectivity. The solid lines show the assignment of queues to servers. (b) Alternating allocation.

    B. Existence of MTLB: Construction and Computation

    We now propose a graph algorithm to construct an MTLB assignment. We first convert the original graph of queues and servers (Figure 2) into the following Equivalent Bipartite Graph with proper weights on the edges. We then run Maximum Weight Matching (MWM) on the equivalent bipartite graph. In Theorem 2, we show that the resulting assignment satisfies conditions (C1) and (C2); hence, it is MTLB.

    Equivalent Bipartite Graph Construction

    1) Associated with each queue $j$, construct $m_j = \min(b_j, \sum_{i=1}^{K} c_{ij})$ nodes labeled $a_{j1}, a_{j2}, \ldots, a_{jm_j}$.
    2) Let $U^{eq} = \{a_{11}, a_{12}, \ldots, a_{1m_1}, a_{21}, \ldots, a_{Nm_N}\}$ be the set of all such nodes.
    3) Let $V^{eq} = \{v_i\}_{i=1}^{K}$ be the set of servers.
    4) Let $E^{eq} = \{(a_{jm}, v_i) : c_{ij} = 1\}$ be the set of edges representing connectivities.
    5) Let $\omega : E^{eq} \to \mathbb{Z}_{++}$, $\omega(a_{jm}, \cdot) = b_j - m + 1$, be the positive integer weight of each edge in $E^{eq}$.

    Definition 7: [29] Consider a bipartite graph $(U, E, V)$ with weight function $\omega : E \to \mathbb{R}$. A matching $M$ is a subset of $E$ such that no two edges in $M$ share an endpoint. The weight of a matching $M$ is $\omega(M) = \sum_{e \in M} \omega(e)$. A matching $M$ is a maximum weight matching (MWM) if its weight is at least as large as the weight of any other matching.

    Definition 8: A subcarrier allocation $W = \{w_{ij}\}$ and a matching $M$ are said to be equivalent when 1) $M$ is a matching on the equivalent bipartite graph, and 2) $w_{ij} = 1$ if and only if there exists $m$ such that $(a_{jm}, v_i)$ is a matching edge, i.e. $(a_{jm}, v_i) \in M$.

    Theorem 2 uses the notion of MWM to constructively prove the existence of MTLB allocation. For

    that, we need the following lemma:

    Lemma 1: Any allocations $W$ and $W'$ in $\mathcal{W}(n,\mathbf{b},C)$, for $n = 1, \ldots, T$, that achieve the maximum throughput $L$ are derived from each other by a sequence of alternating allocations.

    Proof: Let $\mathbf{w} := \mathbf{1}W$. It is sufficient to show that if both $\mathbf{w} + e_i$ and $\mathbf{w} + e_k$ are feasible and both have the maximum throughput $L$, then there exists an alternating path $S(\mathbf{w} + e_k, u_i, u_k)$. If this is not the case, then $\mathbf{w} + e_i + e_k$ must also be feasible. But the throughput of $\mathbf{w} + e_i + e_k$ is equal to $L + 1$, which is a contradiction.

    Lemma 1 means that any two throughput-optimal allocations are related to each other by a re-assignment of some servers that keeps the throughput unchanged. This is true because, under the On-Off channel assumption, a subcarrier can serve one packet from any of the queues connected to it.

    Corollary 2: An allocation $W \in \mathcal{W}(n,\mathbf{b},C)$, for $n = 1, \ldots, T$, which satisfies Condition (C1) also satisfies the Load-Balancing Condition (C2) if and only if it has no balancing path.

    Proof: First observe that any allocation satisfying Condition (C2) must also satisfy Condition (C1) because, if not, there would be an idle server that could have been assigned to serve one more packet, and the queues would be more balanced using this idle server. With this observation, the corollary follows from Lemma 1.

    Theorem 2: A maximum weight matching on the equivalent bipartite graph is equivalent to an MTLB allocation, i.e. the equivalent allocation satisfies both conditions (C1) and (C2).

    Proof: Since all weights are positive, the MWM on the equivalent bipartite graph necessarily matches all possible servers, and hence the equivalent allocation achieves maximum throughput (condition (C1)). We prove the load balancing condition (C2) by contradiction. Suppose the maximum weight matching $M$ results in an allocation $W$ that achieves the maximum throughput but does not produce the most balanced queues. Then, from Corollary 2, there must exist a balancing path $S(\mathbf{w}, u_j, u_i)$ from some queue $u_j$ to $u_i$ such that $b_j - w_j \ge b_i - w_i + 2$. Let us denote the balancing allocation of $W$ along $S(\mathbf{w}, u_j, u_i)$ as $W^b$, and let $M^b$ be the matching equivalent to $W^b$. According to $M$, node $a_{iw_i}$ is matched and $a_{j(w_j+1)}$ is not, while the reverse is true for $M^b$. Hence, $\omega(M^b) - \omega(M) = b_j - w_j - (b_i - w_i + 1) \ge 1$. But this contradicts the assumption that $M$ is the maximum weight matching.

    An example of an MTLB assignment based on the proposed algorithm is shown in Figure 7. It is intuitive to see that maximum weight matching on the equivalent bipartite graph achieves an MTLB assignment. This is because the equivalent bipartite graph in effect expands the individual packets that can possibly be served into nodes and labels each packet with the number of packets waiting behind it (see Figure 7(b)). The maximum weight matching selects the matching that serves the packets with the largest number of packets waiting behind them. This achieves maximum throughput and load balancing at the same time. Over-assignments are avoided since, in the equivalent bipartite graph, only $\min\{b_j, \sum_{i=1}^{K} c_{ij}\}$ packets from each queue $j$ are expanded.

    Fig. 7. Example of MTLB construction. (a) Queue lengths (5 for $U_1$ and 4 for $U_2$) and connectivities to servers A, B and C. (b) The equivalent bipartite graph, with the weights shown at each subnode, e.g. the weights of the edges $(a_{11}, A)$ and $(a_{11}, B)$ are five. The thick edges indicate the maximum weight matching. (c) The edges indicate the resulting MTLB assignment. The leftover queue length after the allocation is $\{3, 3\}$.
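    The construction can be sketched in a few lines. Because the weight of an edge in the equivalent graph depends only on its packet node $a_{jm}$, a maximum weight matching can be found by offering the packet nodes to an augmenting-path matcher in order of decreasing weight; this greedy shortcut is a swapped-in technique chosen for brevity, not the general MWM routine cited in the paper. The function name is hypothetical, and the connectivity used in the usage note (servers A and B see both queues, server C sees only the second queue) is an assumption read off the Figure 7 caption.

```python
def mtlb_allocation(b, c):
    """b[j]: backlog of queue j; c[i][j] in {0, 1}: server i is connected to queue j.

    Returns owner[i], the queue served by server i, or None if server i idles.
    """
    K, N = len(c), len(b)
    # Packet nodes a_jm, m = 1..m_j, each with weight b_j - m + 1.
    nodes = []
    for j in range(N):
        m_j = min(b[j], sum(c[i][j] for i in range(K)))
        nodes += [(b[j] - m + 1, j) for m in range(1, m_j + 1)]
    nodes.sort(key=lambda node: -node[0])   # heaviest packet nodes first

    match_of_server = [None] * K            # index into nodes, or None

    def augment(u, seen):
        """Try to match packet node u, re-routing earlier matches if needed."""
        j = nodes[u][1]
        for i in range(K):
            if c[i][j] and i not in seen:
                seen.add(i)
                if match_of_server[i] is None or augment(match_of_server[i], seen):
                    match_of_server[i] = u
                    return True
        return False

    for u in range(len(nodes)):
        augment(u, set())                   # matched nodes stay matched afterwards
    return [None if u is None else nodes[u][1] for u in match_of_server]
```

    Under the assumed Figure 7 connectivity, calling `mtlb_allocation([5, 4], [[1, 1], [1, 1], [0, 1]])` gives two servers to the first queue and one to the second, which leaves the backlog $\{3, 3\}$ stated in the caption.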

    APPENDIX II

    OPTIMALITY OF MTLB POLICY FOR ON-OFF CONNECTIVITY

    In this section, we analyze the solution to Problem (P) and show that the MTLB policy is optimal for Problem (P) for symmetric On-Off connectivity when $N = 2$. We conjecture the optimality for $N \ge 3$.

    A. Basic Definitions and Notations

    We first revisit the notion of the instantaneous cost function $\phi$, introduced in Problem (P). We use a similar framework as in [23], [30] and [31]. We first define a class of functions, $\mathcal{F}$, to which any cost function $\phi(\mathbf{b})$ of the form $\sum_{j=1}^{N} g(b_j)$ belongs, where $g$ is strictly increasing and convex. We then show that if the cost function belongs to $\mathcal{F}$, then the average optimal cost-to-go function (defined in (10)) also belongs to $\mathcal{F}$. The properties of $\mathcal{F}$ are then used to show the optimality of the MTLB policy.

    For notational convenience, we first define:

    Definition 9: Define the function $R_{ij} : \mathbb{Z}^N \to \mathbb{Z}^N$ to be equivalent to a transfer of a packet from queue $u_i$ to queue $u_j$, i.e. $R_{ij}(\mathbf{b}) = \mathbf{b} - e_i + e_j$, where $e_m$ is a row vector of zeros except for the $m$th element, which is one.

    Next we give the definition of $\mathcal{F}$, the class of the cost functions we consider:

    Definition 10: A function $f : \mathbb{Z}^N_+ \to \mathbb{R}$ belongs to the set $\mathcal{F}$ if, for any $i, j \in \{1, \ldots, N\}$, $f$ satisfies:

    (B.1) $f(\mathbf{b}) \le f(\mathbf{b} + e_i)$;
    (B.2) $f(\mathbf{b}) = f(\pi(\mathbf{b}))$ for any permutation $\pi$;
    (B.3) $f(\mathbf{b} + e_i) - f(\mathbf{b}) \le f(\mathbf{b} + e_i + e_j) - f(\mathbf{b} + e_j)$;
    (B.4) $2f(\mathbf{b}) \le f(\mathbf{b} + e_i) + f(\mathbf{b} - e_i)$;
    (B.5) $2f(\mathbf{b}) \le f(R_{ij}(\mathbf{b})) + f(R_{ji}(\mathbf{b}))$; and
    (B.6) $f(R_{ij}(\mathbf{b})) \le f(\mathbf{b})$ if and only if $b_i \ge b_j + 1$.

    (B.1) is a monotonicity condition. (B.2) is permutation invariance. (B.3) is supermodularity [31]. (B.4) is convexity in $b_i$. (B.5) is directional convexity along the line $b_1 + b_2 = \text{constant}$ [31]. Conditions (B.3)-(B.5) are the second-order relations related to convexity for discrete functions.⁷ (B.6) is the balancing advantage and establishes the optimality of the MTLB policy.

    Fact 1: Any function of the form $\sum_{j=1}^{N} g(b_j)$, where $g$ is strictly increasing and convex, belongs to $\mathcal{F}$.

    Proof of this fact is based on simple testing of (B.1) to (B.6) and hence is left to the readers.
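    As a numerical spot-check rather than a proof, the sketch below tests conditions (B.1) to (B.6) on a small grid for the quadratic cost $\sum_j b_j^2$. The inequality directions follow the descriptive names given for each condition (monotonicity, supermodularity, convexity), the checks involving $\mathbf{b} - e_i$ are applied only where the argument stays non-negative, (B.5) and (B.6) are tested for $i \ne j$, and the function names are hypothetical.

```python
from itertools import permutations, product

def quadratic_cost(b):
    """Example cost sum_j g(b_j) with g(x) = x^2 (strictly increasing on x >= 0, convex)."""
    return sum(x * x for x in b)

def shift(b, i, d):
    """Return b + d * e_i."""
    return tuple(x + d if k == i else x for k, x in enumerate(b))

def transfer(b, i, j):
    """R_ij(b) = b - e_i + e_j: move one packet from queue i to queue j."""
    return shift(shift(b, i, -1), j, +1)

def in_class_F(f, n=2, bmax=5):
    """Check (B.1) to (B.6) for every b in {0, ..., bmax}^n."""
    for b in product(range(bmax + 1), repeat=n):
        if any(f(b) != f(p) for p in permutations(b)):                        # (B.2)
            return False
        for i in range(n):
            if f(b) > f(shift(b, i, 1)):                                      # (B.1)
                return False
            if b[i] >= 1 and 2 * f(b) > f(shift(b, i, 1)) + f(shift(b, i, -1)):  # (B.4)
                return False
            for j in range(n):
                bi, bj = shift(b, i, 1), shift(b, j, 1)
                if f(bi) - f(b) > f(shift(bi, j, 1)) - f(bj):                 # (B.3)
                    return False
                if i != j and b[i] >= 1 and b[j] >= 1:
                    if 2 * f(b) > f(transfer(b, i, j)) + f(transfer(b, j, i)):   # (B.5)
                        return False
                if i != j and b[i] >= 1:
                    if (f(transfer(b, i, j)) <= f(b)) != (b[i] >= b[j] + 1):     # (B.6)
                        return False
    return True
```

    The quadratic cost passes every check on the grid, while a cost built from a saturating (non-convex) per-queue function such as $\min(x, 2)$ fails the convexity condition.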

    B. Dynamic Programming Formulation

    Next we consider a dynamic programming formulation in which $V^{\pi}_n(\mathbf{b}, C)$ is the expected cost-to-go at horizon $n$ under Markovian policy $\pi$. Let allocation $W^{\pi}(n,\mathbf{b},C) \in \mathcal{W}(n,\mathbf{b},C)$ denote the allocation at state $(\mathbf{b}, C)$ prescribed by policy $\pi$ at time $n$. It is clear that:

    $$V^{\pi}_n(\mathbf{b}, C) = \phi(\mathbf{b}) + E_{a,C}\left[ V^{\pi}_{n-1}(\mathbf{b} + \mathbf{a} - \mathbf{1}W^{\pi}(n,\mathbf{b},C), C) \right]. \tag{8}$$

    The equivalence of our cost function in (2) and (8) is due to the validity of the dynamic programming theorem for a finite-horizon Markov Decision Process (MDP) [20]. Define

    $$V^{*}_n(\mathbf{b}, C) := \min_{\pi \in U_n} V^{\pi}_n(\mathbf{b}, C) \tag{9}$$

    to be the minimum cost over all Markovian policies at horizon $n$.

    ⁷ Due to the symmetry assumptions (i.e. condition (B.2)) in our model, conditions (B.3)-(B.5) are special cases of the multimodularity condition stated by Hajek [32].

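    Furthermore, we define the average optimal cost-to-go function as

    $$v_n(\mathbf{b}) := E_{a,C}\left[ V^{*}_n(\mathbf{b} + \mathbf{a}, C) \right]. \tag{10}$$

    In the following proposition we show the iterative structure of $v_n$.

    Proposition 1: Given a horizon $n$, the average optimal cost-to-go at time $n$, $v_n(\mathbf{b})$, satisfies the following recursions:

    $$v_0(\mathbf{b}) = \bar{\phi}(\mathbf{b}) \tag{11}$$

    $$v_n(\mathbf{b}) = \bar{\phi}(\mathbf{b}) + E_{a,C}\left[ \min_{\mathbf{w} \in \mathcal{W}(n,C)} v_{n-1}([\mathbf{b} + \mathbf{a} - \mathbf{w}]^{+}) \right] \tag{12}$$

    where $E_{a,C}$ denotes the expectation with regard to the statistics of arrivals and connectivity and $\bar{\phi}(\mathbf{b}) := E_a[\phi(\mathbf{a} + \mathbf{b})]$.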

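    Proof: From dynamic programming, we have the following recursion for the optimal cost-to-go $V^{*}_n(\mathbf{b}, C)$:

    $$V^{*}_0(\mathbf{b}, C) = \phi(\mathbf{b})$$

    $$V^{*}_n(\mathbf{b}, C) = \phi(\mathbf{b}) + \inf_{\mathbf{w} \in \mathcal{W}(n,\mathbf{b},C)} E_{a,C}\left[ V^{*}_{n-1}(\mathbf{b} + \mathbf{a} - \mathbf{w}, C) \right] \tag{13}$$

    Since $\mathcal{W}(n,\mathbf{b},C)$ is finite, there exists an optimal packet withdrawal $\mathbf{w}^{*}(n,\mathbf{b},C)$ at time $n$ when the state of the queue backlogs is equal to the vector $\mathbf{b}$ and the connectivity profile is $C$. The optimal cost-to-go (13) can then be rewritten as:

    $$V^{*}_n(\mathbf{b}, C) = \phi(\mathbf{b}) + \min_{\mathbf{w} \in \mathcal{W}(n,\mathbf{b},C)} E_{a,C}\left[ V^{*}_{n-1}(\mathbf{b} + \mathbf{a} - \mathbf{w}, C) \right] \tag{14}$$

    $$= \phi(\mathbf{b}) + E_{a,C}\left[ V^{*}_{n-1}(\mathbf{b} + \mathbf{a} - \mathbf{w}^{*}(n,\mathbf{b},C), C) \right]$$

    $$= \phi(\mathbf{b}) + v_{n-1}(\mathbf{b} - \mathbf{w}^{*}(n,\mathbf{b},C)). \tag{15}$$

    Now taking the expectation of both sides we have:

    $$v_n(\mathbf{b}) = E_{a,C}\left[ V^{*}_n(\mathbf{b} + \mathbf{a}, C) \right]$$

    $$= \bar{\phi}(\mathbf{b}) + E_{a,C}\left[ \min_{\mathbf{w} \in \mathcal{W}(n,\mathbf{b}+\mathbf{a},C)} v_{n-1}(\mathbf{b} + \mathbf{a} - \mathbf{w}) \right] \tag{16}$$

    $$= \bar{\phi}(\mathbf{b}) + E_{a,C}\left[ \min_{\mathbf{w} \in \mathcal{W}(n,C)} v_{n-1}([\mathbf{b} + \mathbf{a} - \mathbf{w}]^{+}) \right], \tag{17}$$

    where the last equality is a result of the fact that for any allocation $W \in \mathcal{W}(n+1, C)$, there exists $W' \in \mathcal{W}(n+1,\mathbf{b},C)$ such that $v_n(\mathbf{b} - \mathbf{1}W') = v_n([\mathbf{b} - \mathbf{1}W]^{+})$.

    In addition, $v_0(\mathbf{b}) = E_{a,C}[V^{*}_0(\mathbf{b} + \mathbf{a}, C)] = \bar{\phi}(\mathbf{b})$.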
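    The recursion in (11) and (12) can be evaluated exactly for a toy instance. The sketch below is an illustration under stated assumptions, none of which come from the paper: two queues, two servers, each connectivity entry ON independently with probability 1/2, per-queue arrivals of 0 or 2 packets with equal probability, and a quadratic cost. Exact rational arithmetic is used so that symmetry can be checked without rounding error.

```python
from fractions import Fraction
from functools import lru_cache
from itertools import product

N, K = 2, 2                                          # assumed toy sizes
P_ON = Fraction(1, 2)                                # assumed Bernoulli connectivity
ARRIVALS = {0: Fraction(1, 2), 2: Fraction(1, 2)}    # assumed per-queue arrival pmf

def phi(b):
    """Assumed instantaneous cost, sum_j b_j^2."""
    return sum(x * x for x in b)

def joint(pmf, n):
    """Yield (vector, probability) for n independent draws from pmf."""
    for combo in product(pmf.items(), repeat=n):
        p = Fraction(1)
        for _, q in combo:
            p *= q
        yield tuple(value for value, _ in combo), p

CONNECTIVITY = list(joint({0: 1 - P_ON, 1: P_ON}, N * K))

def withdrawals(C):
    """All vectors w = 1W where each server serves at most one connected queue."""
    options = [[None] + [j for j in range(N) if row[j]] for row in C]
    return {tuple(sum(1 for j in choice if j == q) for q in range(N))
            for choice in product(*options)}

def phi_bar(b):
    """phi_bar(b) = E_a[phi(a + b)]."""
    return sum(p * phi(tuple(x + y for x, y in zip(b, a)))
               for a, p in joint(ARRIVALS, N))

@lru_cache(maxsize=None)
def v(n, b):
    """Average optimal cost-to-go v_n(b) from recursions (11) and (12)."""
    if n == 0:
        return phi_bar(b)
    total = Fraction(0)
    for a, pa in joint(ARRIVALS, N):
        for flat, pc in CONNECTIVITY:
            C = [flat[i * N:(i + 1) * N] for i in range(K)]
            best = min(v(n - 1, tuple(max(x + y - z, 0) for x, y, z in zip(b, a, w)))
                       for w in withdrawals(C))
            total += pa * pc * best
    return phi_bar(b) + total
```

    In this toy instance `v(1, (1, 2))` equals `v(1, (2, 1))`, and `v(2, (1, 0))` exceeds `v(2, (0, 0))`, in line with the permutation invariance and monotonicity properties used in the optimality argument.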

    C. Optimality of the MTLB policy for N = 2 Users

    Outline of the Proof

    Consider Problem (P) with a finite horizon $T$, any initial state $I_0 = (\mathbf{b}, C)$, and a cost function $\phi(\mathbf{b}) = \sum_{j=1}^{N} g(b_j)$, where $g$ is strictly increasing and convex. We prove Theorem 1 via the following three statements:

    (ST1) $v_n$ is strictly increasing, $n = 0, 1, \ldots, T$;
    (ST2) $v_n \in \mathcal{F}$ and $v_n$ is strictly increasing $\Rightarrow$ MTLB is optimal at stage $n + 1$; and
    (ST3) $v_n \in \mathcal{F}$ and MTLB is optimal at $n + 1$ $\Rightarrow$ $v_{n+1} \in \mathcal{F}$.

    Notice that (ST2) and (ST3) allow for an inductive proof of the optimality when $\phi(\mathbf{b}) = \sum_{j=1}^{N} g(b_j)$. We prove (ST1) in Lemma 2.

    Lemma 2: If $\phi(\mathbf{b}) = \sum_{j=1}^{N} g(b_j)$, where $g$ is strictly increasing and convex, then $v_n(\mathbf{b}, C)$ is strictly increasing in $\mathbf{b}$ for all $n = 0, \ldots, T$.

    By using (ST1), we have that any optimal allocation at any stage $n$ must achieve the maximum throughput (condition (C1)) (see Lemma 3 below). Hence, Lemmas 3 and 4 below are sufficient to establish (ST2):

    Lemma 3: If $v_n$ is strictly increasing for all $n = 0, \ldots, T - 1$, any optimal allocation satisfies the maximum throughput condition (C1) in the definition of the MTLB policy.

    Lemma 4: If $v_n \in \mathcal{F}$ and $v_n$ is strictly increasing, then the MTLB policy is optimal at stage $n + 1$.

    The last step is to prove (ST3). However, proving (ST3) requires a more complex development. Due to boundary considerations, it is more convenient to work with a function on $\mathbb{Z}^N$ rather than $\mathbb{Z}^N_+$. We define the following extension:

    Definition 11: Consider $f : \mathbb{Z}^N_+ \to \mathbb{R}$. We denote $\tilde{f} : \mathbb{Z}^N \to \mathbb{R}$ as an extension of $f$ on $\mathbb{Z}^N$ such that $\tilde{f}(\mathbf{b}) = f([\mathbf{b}]^{+})$.

    Furthermore, we define an extension $\tilde{\mathcal{F}}$ of $\mathcal{F}$:

    $$\tilde{\mathcal{F}} := \left\{ f : \mathbb{Z}^N \to \mathbb{R} : f \text{ meets (B.1) to (B.6)} \right\} \tag{18}$$

  • 27

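    The above extensions, together with the optimality of MTLB at stage $n+1$, facilitate the proof of (ST3) as follows:

    $$v_n \in \mathcal{F} \;\Rightarrow\; \hat{v}_n \in \hat{\mathcal{F}} \quad \text{(by Fact 2)} \tag{19}$$

    $$\Rightarrow\; E_{\mathbf{a},C}\Big[\min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b} + \mathbf{a} - \mathbf{w})\Big] \text{ satisfies (B.3) to (B.6)} \quad \text{(by Lemmas 6 and 7)} \tag{20}$$

    $$\Rightarrow\; \hat{v}_{n+1} \in \hat{\mathcal{F}} \quad \text{(by Lemmas 2, 5 and Fact 3)} \tag{21}$$

    $$\Rightarrow\; v_{n+1} \in \mathcal{F} \quad \text{(by Fact 4)} \tag{22}$$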

    where the facts and lemmas are listed below. All the facts are from [23] and [30], and the lemmas are proved in Appendix III.

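    Fact 2: If $f \in \mathcal{F}$, then the function $\hat{f} : \mathbb{Z}^N \to \mathbb{R}$ defined as $\hat{f}(\mathbf{b}) = f([\mathbf{b}]^+)$ is in $\hat{\mathcal{F}}$.

    Fact 3: If $f_1, f_2, \ldots$ is a sequence of functions that belong to $\hat{\mathcal{F}}$, then $h(\mathbf{b}) = \sum_l p_l f_l(\mathbf{b})$ also belongs to $\hat{\mathcal{F}}$, where $p_l$ are constants.

    Fact 4: If $f \in \hat{\mathcal{F}}$, then the restriction of $f$ to the non-negative domain is in $\mathcal{F}$.

    Fact 5: If $f_1, f_2, \ldots$ is a sequence of functions that belong to $\mathcal{F}$, then $h(\mathbf{b}) = \sum_l p_l f_l(\mathbf{b})$ also belongs to $\mathcal{F}$, where $p_l$ are constants.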

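    Lemma 5: If $\phi(\mathbf{b}) = \sum_{j=1}^{N} g(b_j)$, where $g$ is a strictly increasing and convex function, then $v_n(\mathbf{b}, C)$ is permutation invariant in $\mathbf{b}$ for all $n = 0, \ldots, T$.

    Lemma 6: If $\hat{v}_n \in \hat{\mathcal{F}}$ and MTLB is optimal at stage $n+1$, then $E_{\mathbf{a},C}\big[\min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b} + \mathbf{a} - \mathbf{w})\big]$ satisfies (B.3) to (B.5).

    Lemma 7: If $\hat{v}_n \in \hat{\mathcal{F}}$ and MTLB is optimal at stage $n+1$, then $E_{\mathbf{a},C}\big[\min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b} + \mathbf{a} - \mathbf{w})\big]$ satisfies (B.6).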

    Proof of Theorem 1

    Proof: With the above outline, we are ready to show that MTLB is optimal for all stages $n+1$, $n = 0, \ldots, T-1$. By (ST2), it suffices to show that $v_n \in \mathcal{F}$ and that $v_n$ is strictly increasing for all $n$. We show this by induction. Note that, by Lemma 2, we already have (ST1), i.e. $v_n$ is strictly increasing for all $n$.

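    Basis of Induction: By Fact 1, the cost function $\phi(\mathbf{b}) = \sum_{j=1}^{N} g(b_j)$, where $g$ is strictly increasing and convex, belongs to $\mathcal{F}$. From (11), $v_0(\mathbf{b}) = \sum_{\mathbf{a}} P_{\mathbf{a}}(\mathbf{a})\, \phi(\mathbf{b} + \mathbf{a})$. By Fact 5, $v_0(\mathbf{b})$ belongs to $\mathcal{F}$.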

    Induction Step: Suppose $v_n \in \mathcal{F}$. By (ST2), MTLB is optimal at stage $n+1$. Hence, by (ST3), we have $v_{n+1} \in \mathcal{F}$.

    Note that (ST3) is established by showing (19) to (22). Since the proofs of (19), (20) and (22) are straightforward from the stated lemmas and facts, we focus here on (21). Notice that

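    $$\hat{v}_{n+1}(\mathbf{b}) = \phi([\mathbf{b}]^+) + E_{\mathbf{a},C}\Big[\min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b} + \mathbf{a} - \mathbf{w})\Big].$$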

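    Since $\phi \in \mathcal{F}$, we have that $\phi([\mathbf{b}]^+)$ is in $\hat{\mathcal{F}}$ by Fact 2. Because $E_{\mathbf{a},C}\big[\min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b} + \mathbf{a} - \mathbf{w})\big]$ meets (B.3) to (B.6) by (20), $\hat{v}_{n+1}$ also has the same properties (by Fact 3). In addition, by Lemmas 2 and 5, $\hat{v}_{n+1}$ also has the properties (B.1) and (B.2). Therefore, $\hat{v}_{n+1} \in \hat{\mathcal{F}}$.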

    D. Optimality of the MTLB policy for N > 2

    Proving the optimality of the MTLB policy for $N > 2$ users is more difficult and involved. We conjecture the following:

    Conjecture 1: If the cost function $\phi$ is strictly increasing and Schur-convex$^8$ [33], then $v_n$ is a strictly increasing Schur-convex function for all $n$.

    Given this conjecture, the proof of optimality for $N > 2$ is similar to that of Lemma 4. In the lemma, we would have $\mathbf{b} - \mathbf{w}^1 \succ \mathbf{b} - \mathbf{w}^2$ and hence $v_n(\mathbf{b} - \mathbf{w}^1) \ge v_n(\mathbf{b} - \mathbf{w}^2)$. We believe that the proof of the conjecture would need a combination of sample-path arguments and majorization techniques (see [34]); this is a topic of further research.
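    To make the majorization order used in Conjecture 1 concrete, the following Python sketch tests whether one vector is (weakly) majorized by another by comparing partial sums of the components sorted in decreasing order. The function names are hypothetical, and the sum of squares is only an assumed example of a Schur-convex cost (any sum of identical convex functions of the components is Schur-convex).

```python
from typing import Sequence


def is_weakly_majorized(x: Sequence[float], y: Sequence[float], tol: float = 1e-12) -> bool:
    """True if every partial sum of sorted x is at most the matching partial sum of sorted y."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    xs = sorted(x, reverse=True)  # x_[1] >= ... >= x_[N]
    ys = sorted(y, reverse=True)
    sum_x = sum_y = 0.0
    for xi, yi in zip(xs, ys):
        sum_x += xi
        sum_y += yi
        if sum_x > sum_y + tol:
            return False
    return True


def is_majorized(x: Sequence[float], y: Sequence[float], tol: float = 1e-12) -> bool:
    """True if x is majorized by y: weak majorization plus equal totals."""
    return is_weakly_majorized(x, y, tol) and abs(sum(x) - sum(y)) <= tol


def sum_of_squares(x: Sequence[float]) -> float:
    """Assumed example of a Schur-convex cost."""
    return float(sum(v * v for v in x))


balanced = (2, 2, 2)  # more balanced queue vector
skewed = (4, 1, 1)    # same total backlog, less balanced
```

    Here `balanced` is majorized by `skewed`, and the Schur-convex example cost is smaller at the more balanced vector, which is the ordering the conjectured argument for $N > 2$ would rely on.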

    APPENDIX III

    SUPPORTING LEMMAS

    In this appendix we establish the proofs of Lemmas 2 to 7 stated above. The first lemma establishes (ST1) and that $v_n$ satisfies condition (B.1) for all $n = 0, \ldots, T$.

    Lemma 2: If $\phi(\mathbf{b}) = \sum_{j=1}^{N} g(b_j)$, where $g$ is a strictly increasing and convex function, then $v_n(\mathbf{b}, C)$ is strictly increasing in $\mathbf{b}$ for all $n = 0, \ldots, T$, i.e. $\mathbf{b}' > \mathbf{b} \Rightarrow v_n(\mathbf{b}') > v_n(\mathbf{b})$.

    Proof: Since $v_n$ is linearly related to $V_n$ by (10), it suffices to show the strict monotonicity of $V_n(\mathbf{b}, C)$. We proceed by induction.

    Induction Basis: $V_0(\mathbf{b}, C) = \phi(\mathbf{b}) = \sum_{j=1}^{N} g(b_j)$ is strictly increasing by the assumption on $g$.

    $^8$ For any $x = (x_1, \ldots, x_N) \in \mathbb{R}^N$, let $x_{[1]} \ge \cdots \ge x_{[N]}$ denote the components of $x$ in decreasing order. For $x, y \in \mathbb{R}^N$, $x$ is said to be weakly majorized by $y$, written $x \prec_w y$, if $\sum_{i=1}^{k} x_{[i]} \le \sum_{i=1}^{k} y_{[i]}$, $k = 1, \ldots, N$. If, in addition, equality obtains for $k = N$, $x$ is said to be majorized by $y$, written $x \prec y$. A function $f : \mathbb{R}^N \to \mathbb{R}$ is Schur-convex if $x \prec y \Rightarrow f(x) \le f(y)$.


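    Induction Step: Assume $V_{n-1}(\mathbf{b}', C) > V_{n-1}(\mathbf{b}, C)$ for $\mathbf{b}' > \mathbf{b}$. Then

    $$\begin{aligned}
    V_n(\mathbf{b}', C) &= \phi(\mathbf{b}') + \min_{\mathbf{w}' \in \mathcal{W}(n,\mathbf{b}',C)} E_{\mathbf{a},C}\big[V_{n-1}(\mathbf{b}' + \mathbf{a} - \mathbf{w}', C)\big] \\
    &\ge \phi(\mathbf{b}') + \min_{\mathbf{w}' \in \mathcal{W}(n,\mathbf{b}',C)} E_{\mathbf{a},C}\big[V_{n-1}(\mathbf{b} + \mathbf{a} - \mathbf{w}(\mathbf{w}'), C)\big] \\
    &\ge \phi(\mathbf{b}') + \min_{\mathbf{w} \in \mathcal{W}(n,\mathbf{b},C)} E_{\mathbf{a},C}\big[V_{n-1}(\mathbf{b} + \mathbf{a} - \mathbf{w}, C)\big] \\
    &> \phi(\mathbf{b}) + \min_{\mathbf{w} \in \mathcal{W}(n,\mathbf{b},C)} E_{\mathbf{a},C}\big[V_{n-1}(\mathbf{b} + \mathbf{a} - \mathbf{w}, C)\big] \\
    &= V_n(\mathbf{b}, C),
    \end{aligned}$$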

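    where, for $\mathbf{w}' \in \mathcal{W}(n,\mathbf{b}',C)$, we define $\mathbf{w}(\mathbf{w}') \in \mathcal{W}(n,\mathbf{b},C)$ as the allocation that assigns to each queue $j$ the same number of servers as $\mathbf{w}'$ unless the queue is empty, in which case it assigns $w'_j - (b'_j - b_j)$. In light of this, the first inequality holds by the induction hypothesis and the fact that $\mathbf{w}(\mathbf{w}') \ge \mathbf{w}' - (\mathbf{b}' - \mathbf{b})$. The second inequality holds because $\mathbf{w}(\mathbf{w}') \in \mathcal{W}(n,\mathbf{b},C)$. The third inequality is a result of the strict monotonicity of $\phi$.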

    Lemmas 3 and 4 provide (ST2).

    Lemma 3: If $v_n$ is strictly increasing for all $n = 0, \ldots, T-1$, any optimal allocation satisfies condition (C1) in the definition of the MTLB policy.

    Proof: (by contradiction) Assume $W \in \mathcal{W}(n+1,\mathbf{b},C)$ is optimal but does not satisfy (C1), i.e. $\sum_{i,j} w_{ij} = l < L$, where $L$ is the maximum achievable throughput. Now assume, without loss of generality, that $b_1 - w_1 = 0$ and $b_2 - w_2 > 0$ and that there is at least one idle server $p$ such that $c_{p1} > 0$. Note that if there is an idle server $i$ connected to $u_2$, we are done since $\mathbf{1}(W + E_{i2}) \in \mathcal{W}(n+1,\mathbf{b},C)$, where $E_{i2}$ is a matrix of all zeros except element $(i,2)$, which is 1. In other words, $v_n(\mathbf{b} - \mathbf{1}(W + E_{i2})) < v_n(\mathbf{b} - \mathbf{1}W)$ by the strict monotonicity of $v_n$ (Lemma 2), a contradiction with the optimality of $W$.

    Now we argue that there must exist a non-idle server that is connected to both $u_1$ and $u_2$ but assigned to $u_1$, i.e. $S(W, u_2, u_1)$, consisting of $u_2$, that server, and $u_1$, is an alternating path. Note that if such a server does not exist, $u_2$ has used up all its connected servers and thus $l$ is the maximum throughput, a contradiction. Now, let $W^a$ be the alternating allocation of $W$ along $S(W, u_2, u_1)$, i.e. $\mathbf{1}W^a = \mathbf{1}W - \mathbf{e}_1 + \mathbf{e}_2$. Notice that, under $W^a$, one packet remains in $u_1$ and thus the packet can be assigned to server $p$. Hence, $\mathbf{1}W^a + \mathbf{e}_1 \in \mathcal{W}(n+1,\mathbf{b},C)$. But $\mathbf{1}W^a + \mathbf{e}_1$ is nothing but $\mathbf{1}W + \mathbf{e}_2$.

    In other words, we have shown that $\mathbf{1}W' = \mathbf{1}W + \mathbf{e}_2 \in \mathcal{W}(n+1,\mathbf{b},C)$. Hence, $v_n(\mathbf{b} - \mathbf{1}W') < v_n(\mathbf{b} - \mathbf{1}W)$ by the strict monotonicity of $v_n$, a contradiction with the optimality of $W$.


    Lemma 4: If $v_n(\mathbf{b}) \in \mathcal{F}$ and $v_n(\mathbf{b})$ is strictly increasing in $\mathbf{b}$, then the MTLB policy is optimal at stage $n+1$.

    Proof: By Lemma 3, the optimal allocation $\mathbf{w}^*(n+1,\mathbf{b},C) \in \mathcal{W}(n+1,\mathbf{b},C)$ achieves the maximum throughput (C1). It remains to show that, among all allocations achieving the maximum throughput, the most balanced allocation $\mathbf{w}^{\mathrm{MTLB}}$ achieves the minimum cost function $V_{n+1}(\mathbf{b}, C)$ (Eqn. (15)). In other words, $v_n(\mathbf{b} - \mathbf{w}^{\mathrm{MTLB}}) = \min_{\mathbf{w} \in \mathcal{W}(n+1,\mathbf{b},C)} v_n(\mathbf{b} - \mathbf{w})$.

    Assume there exist two possible allocations we can choose from, $\mathbf{w}^1$ and $\mathbf{w}^2 \in \mathcal{W}(n+1,\mathbf{b},C)$, such that $\mathbf{w}^2 = R_{21}(\mathbf{w}^1)$ and $b_1 - w^1_1 \ge b_2 - w^1_2 + 2$. Since $v_n$ meets condition (B.6), it follows that $v_n(\mathbf{b} - \mathbf{w}^1) \ge v_n(R_{12}(\mathbf{b} - \mathbf{w}^1)) = v_n(\mathbf{b} - \mathbf{w}^2)$. Thus, configuration $\mathbf{b} - \mathbf{w}^2$, which is more balanced than configuration $\mathbf{b} - \mathbf{w}^1$, achieves a lower (or the same) average cost $v_n(\cdot)$ at stage $n+1$. Repeating this process a finite number of times, we arrive at an optimal allocation which is MTLB.
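    The exchange step in the proof of Lemma 4 can be traced on a toy instance. The sketch below is illustrative only: it assumes $N = 2$, takes $R_{12}(\mathbf{b}) = \mathbf{b} - \mathbf{e}_1 + \mathbf{e}_2$ and $R_{21}(\mathbf{b}) = \mathbf{b} + \mathbf{e}_1 - \mathbf{e}_2$ as used in the proofs of this appendix, ignores connectivity constraints, and uses a sum of squares (`toy_value`, a hypothetical stand-in) in place of $v_n$.

```python
from typing import List, Tuple

Pair = Tuple[int, int]


def R12(b: Pair) -> Pair:
    """Move one unit from component 1 to component 2: b - e1 + e2."""
    return (b[0] - 1, b[1] + 1)


def R21(b: Pair) -> Pair:
    """Move one unit from component 2 to component 1: b + e1 - e2."""
    return (b[0] + 1, b[1] - 1)


def residual(b: Pair, w: Pair) -> Pair:
    """Queue lengths left after serving w packets: b - w."""
    return (b[0] - w[0], b[1] - w[1])


def toy_value(d: Pair) -> int:
    """Assumed stand-in for v_n: sum of squares of the positive parts."""
    return sum(max(x, 0) ** 2 for x in d)


def balancing_path(d: Pair) -> List[Pair]:
    """Apply R12 while queue 1 exceeds queue 2 by at least two packets."""
    path = [d]
    while d[0] >= d[1] + 2:
        d = R12(d)
        path.append(d)
    return path


b = (5, 3)     # queue lengths before service
w1 = (0, 2)    # an unbalanced allocation (connectivity ignored in this toy)
w2 = R21(w1)   # shift one server from queue 2 to queue 1
```

    With these numbers, `residual(b, w2)` equals `R12(residual(b, w1))`, and the toy value does not increase along `balancing_path(residual(b, w1))`, mirroring the repeated balancing argument.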

    The rest of the appendix provides Lemmas 5 to 7, which are necessary to establish (ST3) as discussed in (19) to (22) in the outline of the proof of the main theorem.

    The next lemma shows that $v_n$ satisfies (B.2) for all $n = 0, \ldots, T$.

    Lemma 5: If $\phi(\mathbf{b}) = \sum_{j=1}^{N} g(b_j)$, where $g$ is a strictly increasing and convex function, then $v_n(\mathbf{b}, C)$ is permutation invariant in $\mathbf{b}$ for all $n = 0, \ldots, T$, i.e. $v_n(\pi(\mathbf{b})) = v_n(\mathbf{b})$.

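    Proof: Again, it suffices to show the permutation invariance property of $V_n(\mathbf{b}, C)$.

    Induction Basis: $V_0(\mathbf{b}, C) = \phi(\mathbf{b}) = \sum_{j=1}^{N} g(b_j)$ is clearly permutation invariant.

    Induction Step: Assume $V_{n-1}(\pi(\mathbf{b}), \pi(C)) = V_{n-1}(\mathbf{b}, C)$. Then

    $$\begin{aligned}
    V_n(\pi(\mathbf{b}), \pi(C)) &= \phi(\pi(\mathbf{b})) + \min_{\mathbf{w} \in \mathcal{W}(n,\pi(\mathbf{b}),\pi(C))} E_{\mathbf{a},C}\big[V_{n-1}(\pi(\mathbf{b}) + \mathbf{a} - \mathbf{w}, C)\big] \\
    &= \phi(\mathbf{b}) + \min_{\mathbf{w} \in \mathcal{W}(n,\pi(\mathbf{b}),\pi(C))} E_{\mathbf{a},C}\big[V_{n-1}(\pi(\mathbf{b}) + \pi(\mathbf{a}) - \mathbf{w}, \pi(C))\big] \\
    &= \phi(\mathbf{b}) + \min_{\mathbf{w} \in \mathcal{W}(n,\mathbf{b},C)} E_{\mathbf{a},C}\big[V_{n-1}(\pi(\mathbf{b}) + \pi(\mathbf{a}) - \pi(\mathbf{w}), \pi(C))\big] \\
    &= \phi(\mathbf{b}) + \min_{\mathbf{w} \in \mathcal{W}(n,\mathbf{b},C)} E_{\mathbf{a},C}\big[V_{n-1}(\mathbf{b} + \mathbf{a} - \mathbf{w}, C)\big] \\
    &= V_n(\mathbf{b}, C),
    \end{aligned}$$

    where the second equality is a direct result of Assumptions (A1) and (A2). The third equality holds since

    $$W \in \mathcal{W}(n,\mathbf{b},C) \;\Leftrightarrow\; \pi(W) \in \mathcal{W}(n,\pi(\mathbf{b}),\pi(C)).$$

    The fourth equality follows from the induction hypothesis.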


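    Next, we show that $E_{\mathbf{a},C}\big[\min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b} + \mathbf{a} - \mathbf{w})\big]$ satisfies (B.3) to (B.6) whenever $\hat{v}_n \in \hat{\mathcal{F}}$. Without loss of generality, we consider the case $i = 1$ and $j = 2$ in (B.3) to (B.6).

    Before we proceed, for notational simplicity, we write

    $$T_n^{\mathbf{a},C}(\mathbf{b}) := \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b} + \mathbf{a} - \mathbf{w}). \tag{23}$$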

    We also introduce the following definition:

    Definition 12: Define $\mathcal{W}^*(n+1,\mathbf{b},C)$ to be the set of all optimal allocations at time $n+1$, when the state of the system at time $n+1$ is $(\mathbf{b}, C)$. In other words,

    $$\mathcal{W}^*(n+1,\mathbf{b},C) := \Big\{ W \in \mathcal{W}(n+1,C) : v_n([\mathbf{b} - \mathbf{1}W]^+) = \min_{W' \in \mathcal{W}(n+1,C)} v_n([\mathbf{b} - \mathbf{1}W']^+) \Big\} \tag{24}$$

    We are now ready to show the following lemma:

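    Lemma 6: If $\hat{v}_n \in \hat{\mathcal{F}}$ and MTLB is optimal at stage $n+1$, then, for any state $\mathbf{b}$, $E_{\mathbf{a},C}\big[T_n^{\mathbf{a},C}(\mathbf{b})\big]$ satisfies (B.3), (B.4), and (B.5).

    Proof: By Fact 3 in Appendix II-C, it suffices to show that $T_n^{\mathbf{a},C}(\mathbf{b})$ satisfies conditions (B.3) to (B.5) for any realization $(\mathbf{a}, C)$ of the arrival and connectivity processes.

    For short notation, let $\mathbf{b}' := \mathbf{a} + \mathbf{b}$. There exists an MTLB allocation $\mathbf{w}^* \in \mathcal{W}(n+1,C)$ such that $\mathbf{w}^* \in \mathcal{W}^*(n+1,\mathbf{b}',C) \cap \mathcal{W}^*(n+1,\mathbf{b}'+\mathbf{e}_1,C) \cap \mathcal{W}^*(n+1,\mathbf{b}'+\mathbf{e}_1+\mathbf{e}_2,C)$. This is because 1) adding one packet to each queue does not create any balancing paths, i.e. if $\mathbf{w}^* \in \mathcal{W}^*(n+1,\mathbf{b}',C)$, then $\mathbf{w}^* \in \mathcal{W}^*(n+1,\mathbf{b}'+\mathbf{e}_1+\mathbf{e}_2,C)$; and 2) $\mathbf{w}^* \in \mathcal{W}^*(n+1,\mathbf{b}',C)$ can be chosen such that it gives priority to serving queue $u_1$, hence adding one packet to $u_1$ does not create any balancing paths, i.e. $\mathbf{w}^* \in \mathcal{W}^*(n+1,\mathbf{b}'+\mathbf{e}_1,C)$. For notational simplicity, let $\mathbf{d} := \mathbf{b}' - \mathbf{w}^* \in \mathbb{Z}^2$.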

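    (i) $T_n^{\mathbf{a},C}(\mathbf{b})$ satisfies (B.3):

    $$\begin{aligned}
    & T_n^{\mathbf{a},C}(\mathbf{b}+\mathbf{e}_1+\mathbf{e}_2) + T_n^{\mathbf{a},C}(\mathbf{b}) - T_n^{\mathbf{a},C}(\mathbf{b}+\mathbf{e}_1) - T_n^{\mathbf{a},C}(\mathbf{b}+\mathbf{e}_2) \\
    &\quad = \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b}'+\mathbf{e}_1+\mathbf{e}_2-\mathbf{w}) + \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b}'-\mathbf{w}) \\
    &\qquad - \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b}'+\mathbf{e}_1-\mathbf{w}) - \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b}'+\mathbf{e}_2-\mathbf{w}) \\
    &\quad \ge \hat{v}_n(\mathbf{d}+\mathbf{e}_1+\mathbf{e}_2) + \hat{v}_n(\mathbf{d}) - \hat{v}_n(\mathbf{d}+\mathbf{e}_1) - \hat{v}_n(\mathbf{d}+\mathbf{e}_2) \\
    &\quad \ge 0,
    \end{aligned}$$

    where the last inequality holds because $\hat{v}_n \in \hat{\mathcal{F}}$ and hence satisfies condition (B.3).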


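    (ii) $T_n^{\mathbf{a},C}(\mathbf{b})$ satisfies (B.4): We need to show the non-negativity of

    $$\begin{aligned}
    & T_n^{\mathbf{a},C}(\mathbf{b}+\mathbf{e}_1) - 2T_n^{\mathbf{a},C}(\mathbf{b}) + T_n^{\mathbf{a},C}(\mathbf{b}-\mathbf{e}_1) \\
    &\quad = \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b}'+\mathbf{e}_1-\mathbf{w}) - 2\min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b}'-\mathbf{w}) + \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b}'-\mathbf{e}_1-\mathbf{w}) \\
    &\quad = \hat{v}_n(\mathbf{d}+\mathbf{e}_1) - 2\hat{v}_n(\mathbf{d}) + \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b}'-\mathbf{e}_1-\mathbf{w})
    \end{aligned} \tag{25}$$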

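    We consider two cases.

    Case 1: If $\mathbf{w}^* \in \mathcal{W}^*(n+1,\mathbf{b}'-\mathbf{e}_1,C)$, then we are done.

    Case 2: $\mathbf{w}^* \notin \mathcal{W}^*(n+1,\mathbf{b}'-\mathbf{e}_1,C)$. This means that $\exists\, S(\mathbf{w}^*, u_2, u_1)$, $d_2 = d_1 + 1$, and $\mathbf{w}^a \in \mathcal{W}^*(n+1,\mathbf{b}'-\mathbf{e}_1,C)$, where $\mathbf{w}^a := \mathbf{1}W^a$ and $W^a$ is the alternating allocation of $\mathbf{w}^*$ along $S(\mathbf{w}^*, u_2, u_1)$. In other words, $\mathbf{w}^a = R_{12}(\mathbf{w}^*) = \mathbf{w}^* - \mathbf{e}_1 + \mathbf{e}_2$. Now, (25) becomes

    $$\begin{aligned}
    & \hat{v}_n(\mathbf{d}+\mathbf{e}_1) - 2\hat{v}_n(\mathbf{d}) + \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b}'-\mathbf{e}_1-\mathbf{w}) \\
    &\quad = \hat{v}_n(\mathbf{d}+\mathbf{e}_1) - 2\hat{v}_n(\mathbf{d}) + \hat{v}_n(\mathbf{b}'-\mathbf{e}_1-\mathbf{w}^a) \\
    &\quad = \hat{v}_n(\mathbf{d}+\mathbf{e}_1) - 2\hat{v}_n(\mathbf{d}) + \hat{v}_n(\mathbf{d}-\mathbf{e}_2) \\
    &\quad = \hat{v}_n(\mathbf{d}+\mathbf{e}_1) - \hat{v}_n(\mathbf{d}) - \hat{v}_n(\pi_{12}(\mathbf{d})) + \hat{v}_n(\mathbf{d}-\mathbf{e}_2) \\
    &\quad = \hat{v}_n(\mathbf{d}+\mathbf{e}_1) - \hat{v}_n(\mathbf{d}) - \hat{v}_n(\mathbf{d}+\mathbf{e}_1-\mathbf{e}_2) + \hat{v}_n(\mathbf{d}-\mathbf{e}_2) \\
    &\quad \ge 0,
    \end{aligned}$$

    where the second equality holds because $\mathbf{b}'-\mathbf{e}_1-\mathbf{w}^a = \mathbf{d}-\mathbf{e}_2$, the third equality holds by the permutation invariance of $\hat{v}_n$, the fourth equality holds because $d_2 = d_1 + 1$, and the last inequality holds because $\hat{v}_n \in \hat{\mathcal{F}}$ and hence satisfies condition (B.3).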

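    (iii) $T_n^{\mathbf{a},C}(\mathbf{b})$ satisfies (B.5): We need to show the non-negativity of

    $$\begin{aligned}
    & T_n^{\mathbf{a},C}(R_{12}(\mathbf{b})) - 2T_n^{\mathbf{a},C}(\mathbf{b}) + T_n^{\mathbf{a},C}(R_{21}(\mathbf{b})) \\
    &\quad = \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(R_{12}(\mathbf{b}')-\mathbf{w}) - 2\min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(\mathbf{b}'-\mathbf{w}) + \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(R_{21}(\mathbf{b}')-\mathbf{w}) \\
    &\quad = \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(R_{12}(\mathbf{b}')-\mathbf{w}) - 2\hat{v}_n(\mathbf{d}) + \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(R_{21}(\mathbf{b}')-\mathbf{w})
    \end{aligned} \tag{26}$$

    Case 1: If $\mathbf{w}^* \in \mathcal{W}^*(n+1,R_{21}(\mathbf{b}'),C) \cap \mathcal{W}^*(n+1,R_{12}(\mathbf{b}'),C)$, then we are done.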


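    Case 2: If $\mathbf{w}^* \notin \mathcal{W}^*(n+1,R_{21}(\mathbf{b}'),C)$, then $\exists\, S(\mathbf{w}^*, u_1, u_2)$. In addition, $d_1 = d_2$ because $\mathbf{w}^*$ is chosen to give priority to serving $u_1$. After an alternating allocation of $\mathbf{w}^*$ along $S(\mathbf{w}^*, u_1, u_2)$, we have $R_{21}(\mathbf{w}^*) \in \mathcal{W}^*(n+1,R_{21}(\mathbf{b}'),C)$. Thus, for this case, (26) becomes:

    $$\begin{aligned}
    & \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(R_{12}(\mathbf{b}')-\mathbf{w}) - 2\hat{v}_n(\mathbf{d}) + \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(R_{21}(\mathbf{b}')-\mathbf{w}) \\
    &\quad = \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(R_{12}(\mathbf{b}')-\mathbf{w}) - 2\hat{v}_n(\mathbf{d}) + \hat{v}_n(\mathbf{d}) \\
    &\quad = \min_{\mathbf{w} \in \mathcal{W}(n+1,C)} \hat{v}_n(R_{12}(\mathbf{b}')-\mathbf{w}) - \hat{v}_n(\mathbf{d})
    \end{aligned} \tag{27}$$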

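    Next we consider the following two subcases:

    Case 2.1: If $\mathbf{w}^* \in \mathcal{W}^*(n+1,R_{12}(\mathbf{b}'),C)$, then (27) is equal to $\hat{v}_n(R_{12}(\mathbf{d})) - \hat{v}_n(\mathbf{d}) \ge 0$, because $d_1 = d_2$ and $\hat{v}_n \in \hat{\mathcal{F}}$ and hence satisfies condition (B.6).

    Case 2.2: If $\mathbf{w}^* \notin \mathcal{W}^*(n+1,R_{12}(\mathbf{b}'),C)$, then $\exists\, S(\mathbf{w}^*, u_2, u_1)$. After an alternating allocation of $\mathbf{w}^*$ along $S(\mathbf{w}^*, u_2, u_1)$, we have $R_{12}(\mathbf{w}^*) \in \mathcal{W}^*(n+1,R_{12}(\mathbf{b}'),C)$. Thus, (27) is equal to zero.

    Case 3: If $\mathbf{w}^* \in \mathcal{W}^*(n+1,R_{21}(\mathbf{b}'),C)$ but $\mathbf{w}^* \notin \mathcal{W}^*(n+1,R_{12}(\mathbf{b}'),C)$, then, by the permutation invariance property of $\hat{v}_n$, this is the same as Case 2.1, where $\mathbf{w}^* \notin \mathcal{W}^*(n+1,R_{21}(\mathbf{b}'),C)$ but $\mathbf{w}^* \in \mathcal{W}^*(n+1,R_{12}(\mathbf{b}'),C)$.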
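    The three difference expressions whose non-negativity is shown in parts (i) to (iii) can be evaluated numerically for a candidate function on a small grid. The sketch below is a hedged illustration for $N = 2$ with $i = 1$, $j = 2$: it encodes the expressions exactly as they appear in the proof above, while the helper names, the grid bounds, and the example functions are assumptions made for illustration.

```python
from itertools import product
from typing import Callable, Tuple

Pair = Tuple[int, int]
Fn = Callable[[Pair], int]


def hat(f: Fn) -> Fn:
    """Extension f_hat(b) = f([b]^+) to all integer pairs."""
    return lambda b: f((max(b[0], 0), max(b[1], 0)))


def b3_expr(f: Fn, b: Pair) -> int:
    """f(b+e1+e2) + f(b) - f(b+e1) - f(b+e2), the expression in part (i)."""
    x, y = b
    return f((x + 1, y + 1)) + f((x, y)) - f((x + 1, y)) - f((x, y + 1))


def b4_expr(f: Fn, b: Pair) -> int:
    """f(b+e1) - 2 f(b) + f(b-e1), the expression in part (ii)."""
    x, y = b
    return f((x + 1, y)) - 2 * f((x, y)) + f((x - 1, y))


def b5_expr(f: Fn, b: Pair) -> int:
    """f(R12(b)) - 2 f(b) + f(R21(b)), the expression in part (iii)."""
    x, y = b
    return f((x - 1, y + 1)) - 2 * f((x, y)) + f((x + 1, y - 1))


def nonnegative_on_grid(f: Fn, lo: int = -3, hi: int = 6) -> bool:
    """Check all three expressions on the integer grid [lo, hi]^2."""
    return all(
        min(b3_expr(f, b), b4_expr(f, b), b5_expr(f, b)) >= 0
        for b in product(range(lo, hi + 1), repeat=2)
    )


# Assumed examples: an extended sum of squares, and a function that fails part (i).
squares_hat = hat(lambda b: b[0] ** 2 + b[1] ** 2)
neg_product = lambda b: -(b[0] * b[1])
```

    A grid check of this kind is only a sanity test on finitely many points; it does not replace the proof, and it does not cover conditions (B.1), (B.2), or (B.6).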

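    Lemma 7: Assume $\hat{v}_n \in \hat{\mathcal{F}}$ and MTLB is optimal at stage $n+1$. For any state $\mathbf{b}$ such that $b_1 \ge b_2 + 1$, $E_{\mathbf{a},C}\big[T_n^{\mathbf{a},C}(\mathbf{b})\big]$ satisfies condition (B.6).

    Proof: For notational simplicity, let us define, for any $(\mathbf{a}, C)$,

    $$Z^{\mathbf{a},C}(\mathbf{b}) := T_n^{\mathbf{a},C}(\mathbf{b}) - T_n^{\mathbf{a},C}(R_{12}(\mathbf{b})). \tag{28}$$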

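    It is easy to see that, for some $(\mathbf{a}, C)$, $Z^{\mathbf{a},C}(\mathbf{b})$ can be negative. However, because of the joint permutation invariance of the pmfs of the arrival and connectivity processes (Assumptions A1 and A2), showing that $E_{\mathbf{a},C}\big[T_n^{\mathbf{a},C}(\mathbf{b})\big]$ meets condition (B.6) is equivalent to showing the non-negativity of

    $$\begin{aligned}
    E_{\mathbf{a},C}\big[T_n^{\mathbf{a},C}(\mathbf{b})\big] - E_{\mathbf{a},C}\big[T_n^{\mathbf{a},C}(R_{12}(\mathbf{b}))\big] &= E_{\mathbf{a},C}\big[T_n^{\mathbf{a},C}(\mathbf{b}) - T_n^{\mathbf{a},C}(R_{12}(\mathbf{b}))\big] \\
    &= E_{\mathbf{a},C}\big[Z^{\mathbf{a},C}(\mathbf{b})\big] \\
    &= \tfrac{1}{2} E_{\mathbf{a},C}\big[Z^{\mathbf{a},C}(\mathbf{b}) + Z^{\pi_{12}(\mathbf{a}),\pi_{12}(C)}(\mathbf{b})\big].
    \end{aligned}$$

    Thus, it suffices to show that, for any $(\mathbf{a}, C)$ and $b_1 \ge b_2 + 1$,

    $$Z^{\mathbf{a},C}(\mathbf{b}) + Z^{\pi_{12}(\mathbf{a}),\pi_{12}(C)}(\mathbf{b}) \ge 0.$$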


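    We show this by noticing that

    $$\begin{aligned}
    Z^{\mathbf{a},C}(\mathbf{b}) + Z^{\pi_{12}(\mathbf{a}),\pi_{12}(C)}(\mathbf{b}) &= Z^{\mathbf{a},C}(\mathbf{b}) + T_n^{\pi_{12}(\mathbf{a}),\pi_{12}(C)}(\mathbf{b}) - T_n^{\pi_{12}(\mathbf{a}),\pi_{12}(C)}(R_{12}(\mathbf{b})) \\
    &= Z^{\mathbf{a},C}(\mathbf{b}) + T_n^{\mathbf{a},C}(\pi_{12}(\mathbf{b})) - T_n^{\mathbf{a},C}(\pi_{12}(R_{12}(\mathbf{b}))) \\
    &= T_n^{\mathbf{a},C}(\mathbf{b}^0) - T_n^{\mathbf{a},C}(\mathbf{b}^1) + T_n^{\mathbf{a},C}(\mathbf{b}^M) - T_n^{\mathbf{a},C}(\mathbf{b}^{M-1})
    \end{aligned} \tag{29}$$

    where $M := b_1 - b_2$ $(\ge 1)$ and $\mathbf{b}^m := \mathbf{b} - m\mathbf{e}_1 + m\mathbf{e}_2$, for $m = 0, \ldots, M$. The first and third equalities follow from (28) and the second equality from the permutation invariance property. Note that $\pi_{12}(\mathbf{b}) = b_2\mathbf{e}_1 + b_1\mathbf{e}_2 = \mathbf{b} - (b_1 - b_2)\mathbf{e}_1 + (b_1 - b_2)\mathbf{e}_2 = \mathbf{b}^M$, $\pi_{12}(R_{12}(\mathbf{b})) = \mathbf{b}^{M-1}$, $\mathbf{b}^{m+1} = R_{12}(\mathbf{b}^m)$, and $\mathbf{b}^{m-1} = R_{21}(\mathbf{b}^m)$.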

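    Now notice that if $M = 1$, then (29) is zero. If $M \ge 2$, we have

    $$\begin{aligned}
    & T_n^{\mathbf{a},C}(\mathbf{b}^0) - T_n^{\mathbf{a},C}(\mathbf{b}^1) - T_n^{\mathbf{a},C}(\mathbf{b}^{M-1}) + T_n^{\mathbf{a},C}(\mathbf{b}^M) \\
    &\quad = \sum_{m=1}^{M-1} \Big\{ T_n^{\mathbf{a},C}(\mathbf{b}^{m-1}) - 2T_n^{\mathbf{a},C}(\mathbf{b}^m) + T_n^{\mathbf{a},C}(\mathbf{b}^{m+1}) \Big\} \\
    &\quad = \sum_{m=1}^{M-1} \Big\{ T_n^{\mathbf{a},C}(R_{21}(\mathbf{b}^m)) - 2T_n^{\mathbf{a},C}(\mathbf{b}^m) + T_n^{\mathbf{a},C}(R_{12}(\mathbf{b}^m)) \Big\} \\
    &\quad \ge 0,
    \end{aligned}$$

    where the inequality holds because $\hat{v}_n \in \hat{\mathcal{F}}$ and, from Lemma 6, $T_n^{\mathbf{a},C}(\cdot)$ satisfies condition (B.5).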
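    The telescoping identity used in the last display holds for any real-valued function on the lattice, not only for $T_n^{\mathbf{a},C}$. The following Python sketch checks it numerically; the function `arbitrary_T` is an arbitrary integer-valued stand-in chosen for illustration (an assumption, not the paper's $T_n^{\mathbf{a},C}$), and the helper names are hypothetical.

```python
from typing import Callable, Tuple

Pair = Tuple[int, int]


def lattice_point(b: Pair, m: int) -> Pair:
    """b^m = b - m*e1 + m*e2."""
    return (b[0] - m, b[1] + m)


def boundary_form(T: Callable[[Pair], int], b: Pair) -> int:
    """T(b^0) - T(b^1) - T(b^{M-1}) + T(b^M) with M = b1 - b2 >= 1."""
    M = b[0] - b[1]
    if M < 1:
        raise ValueError("requires b1 >= b2 + 1")
    return (T(lattice_point(b, 0)) - T(lattice_point(b, 1))
            - T(lattice_point(b, M - 1)) + T(lattice_point(b, M)))


def second_difference_sum(T: Callable[[Pair], int], b: Pair) -> int:
    """Sum over m = 1..M-1 of T(b^{m-1}) - 2 T(b^m) + T(b^{m+1})."""
    M = b[0] - b[1]
    if M < 1:
        raise ValueError("requires b1 >= b2 + 1")
    return sum(
        T(lattice_point(b, m - 1)) - 2 * T(lattice_point(b, m)) + T(lattice_point(b, m + 1))
        for m in range(1, M)
    )


def arbitrary_T(p: Pair) -> int:
    """Arbitrary integer-valued test function (illustrative only)."""
    return p[0] ** 3 - 2 * p[0] * p[1] + 7 * p[1]
```

    For $M = 1$ both sides are zero (the sum is empty), which matches the remark that (29) vanishes in that case; for $M \ge 2$ each summand is a second difference along the direction $\mathbf{e}_2 - \mathbf{e}_1$, which is where condition (B.5) enters.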