Aleksandra Smiljanić: High-Capacity Switching

Belgrade University

Switches with Input Buffers (Cisco)

Packet Switches with Input Buffers

Switching fabric:
- Electronic chips (Mindspeed, AMCC, Vitesse)
- Space-wavelength selector (NEC, Alcatel)
- Fast tunable lasers (Lucent)
- Waveguide arrays (Chiaro)

Scheduler: Packets compete not only with the packets destined for the same output but also with the packets sourced by the same input. Scheduling might become a bottleneck in a switch with hundreds of ports and gigabit line bit-rates.

Optical Packet Cross-bar (NEC,Alcatel)

A 2.56 Tb/s multiwavelength and scalable switch-fabric for fast packet-switching network, PTL 1998,1999, NEC

Optical Packet Cross-bar (Lucent)

A fast 100 channel wavelength tunable transmitter for optical packet switching, PTL 2001, Bell Labs

Scheduling Algorithms for Packet Switches with Input Buffers

Each input sends a request for its head-of-line (HOL) packet to the corresponding output. Each output grants one input, and this input-output pair will be connected in the next time slot.

Output utilization when inputs are fully loaded is:

$$U = 1 - \left(1 - \frac{1}{N}\right)^{N-1}$$
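As a rough illustration of the single request-grant round described above, the following sketch estimates the output utilization by simulation. It assumes that every input is backlogged and that each HOL packet is destined to a uniformly random output, independently in every slot. That traffic model and the name `simulate_utilization` are assumptions made for this example and are not taken from the slides.

```python
import random

def simulate_utilization(n_ports, n_slots, seed=0):
    """Estimate the fraction of outputs that are busy per time slot."""
    rng = random.Random(seed)
    granted = 0
    for _ in range(n_slots):
        # Every backlogged input requests the output of its HOL packet.
        requested = {rng.randrange(n_ports) for _ in range(n_ports)}
        # Each output that received at least one request grants exactly one input.
        granted += len(requested)
    return granted / (n_ports * n_slots)
```

Under these assumptions, the estimate for a large switch (for example 64 ports) settles near 1 - 1/e, roughly 0.63, which is the large-N limit of the utilization expression.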

Scheduling Algorithms for Packet Switches with Input Buffers

In parallel iterative matching (PIM), SLIP, or dual round-robin (DRR), inputs send requests to outputs, outputs grant inputs, and inputs then accept one of the received grants, all in one iteration. It was proven that PIM finds a maximal matching after at most $\log_2 N + 4/3$ iterations on average.

Maximum weighted matching and maximum matching algorithms maximize the weight of the connected pairs and achieve 100% throughput for i.i.d. traffic, but have complexities $O(N^3 \log_2 N)$ and $O(N^{2.5})$, respectively.

Sequential greedy scheduling is a maximal matching algorithm that is simple to implement. A maximal matching algorithm does not leave an input-output pair unmatched if the input has a packet for that output and both are still free.

Bandwidth Reservations in Packet Switches with Input Buffers

Anderson et al.: Time is divided into frames of F time slots. Schedule is calculated in each frame; Statistical matching algorithm.

Stiliadis and Varma: Counters are loaded per frame. Queues with positive counters are served with priority according to parallel iterative matching (PIM), and their counters are then decremented by 1. DRR proposed by Chao et al. could be used as well.

Kam et al.: A counter is incremented according to the negotiated bandwidth and decremented by 1 when the queue is served. A maximal weighted matching algorithm is applied.

Smiljanić: Counters are loaded per frame. Queues with positive counters are served with priority according to a maximal matching algorithm, preferably the sequential greedy scheduling (SGS) algorithm, where inputs sequentially choose outputs to transmit packets to.

Maximum and Maximal Matching Algorithms

It was shown that when packet arrivals are i.i.d. and the traffic distribution is admissible, a load of 100% can pass the cross-bar if the maximum or the maximum weighted matching algorithm is applied.

It was shown that when packet arrivals obey a strong law of large numbers and the traffic distribution is admissible, a load of 50% can pass the cross-bar if a maximal matching algorithm is applied.
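The gap between a maximal and a maximum matching can be seen on a tiny request pattern. The sketch below compares a greedy maximal matching with a brute-force maximum matching. The function names and the 2x2 example are illustrative assumptions, and the brute-force search is only meant for very small N.

```python
from itertools import permutations

def greedy_maximal(requests):
    """Inputs pick, in order, the lowest-numbered free output they request."""
    n = len(requests)
    free_outputs = set(range(n))
    match = {}
    for i in range(n):
        for j in sorted(free_outputs):
            if requests[i][j]:
                match[i] = j
                free_outputs.remove(j)
                break
    return match

def maximum_matching_size(requests):
    """Largest number of requests that any one-to-one assignment can serve."""
    n = len(requests)
    return max(
        sum(1 for i in range(n) if requests[i][perm[i]])
        for perm in permutations(range(n))
    )

# Input 0 requests outputs 0 and 1; input 1 requests only output 0.
requests = [[1, 1], [1, 0]]
maximal = greedy_maximal(requests)          # input 0 takes output 0, input 1 is blocked
maximum = maximum_matching_size(requests)   # pairs (0,1) and (1,0) can both be served
```

In this example the maximal matching connects one pair while the maximum matching connects two, which is the factor-of-two gap behind the 50% and 100% figures.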

PIM, SLIP and DRR

In PIM and SLIP each input sends requests to all outputs for which it has packets, and in DRR only to one chosen output. SLIP and DRR use round-robin choices.

Theorem: PIM finds a maximal matching after at most $\log_2 N + 4/3$ iterations on average.

Proof: Let n inputs request output Q, and let k of these inputs receive no grants from other outputs. With probability $k/n$ output Q grants one of these k inputs and all n requests are resolved, and with probability $1 - k/n$ at most k requests remain unresolved. The average number of unresolved requests is therefore at most $(1 - k/n) \cdot k \le n/4$. So if there are $N^2$ requests at the beginning, the expected number of unresolved requests after i iterations is at most $N^2/4^i$.

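Proof (cont.): Let C be the step on which the last request is resolved. Then:

$$
\begin{aligned}
E[C] &= \sum_{i=0}^{\infty} P\{C > i\} = \sum_{i=0}^{\infty} \sum_{j=1}^{\infty} P\{j \text{ requests after } i \text{ iterations}\} \\
&\le \sum_{i=0}^{\infty} \min\left(1, \sum_{j=1}^{\infty} j \cdot P\{j \text{ requests after } i \text{ iterations}\}\right) \\
&\le \sum_{i=0}^{\infty} \min\left(1, \frac{N^2}{4^i}\right) \le \log_2 N + \frac{4}{3}
\end{aligned}
$$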
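The bound can be checked empirically with a small simulation. The sketch below is a minimal PIM model, assuming uniformly random grant and accept choices and a fully loaded request matrix; the function names and the trial count are assumptions made for the example.

```python
import math
import random

def pim(requests, rng):
    """Run PIM until the matching is maximal; return (match, iterations)."""
    n = len(requests)
    free_in, free_out = set(range(n)), set(range(n))
    match, iterations = {}, 0
    while True:
        # Grant phase: every free output grants one requesting free input at random.
        grants = {}
        for j in sorted(free_out):
            candidates = [i for i in sorted(free_in) if requests[i][j]]
            if candidates:
                grants.setdefault(rng.choice(candidates), []).append(j)
        if not grants:  # no unresolved request is left, so the matching is maximal
            return match, iterations
        iterations += 1
        # Accept phase: every input that received grants accepts one at random.
        for i, offered in grants.items():
            j = rng.choice(offered)
            match[i] = j
            free_in.remove(i)
            free_out.remove(j)

def mean_iterations(n, trials=200, seed=1):
    """Average number of PIM iterations when all N^2 requests are present."""
    rng = random.Random(seed)
    full = [[True] * n for _ in range(n)]
    return sum(pim(full, rng)[1] for _ in range(trials)) / trials
```

Comparing `mean_iterations(16)` with `math.log2(16) + 4/3` shows the measured average staying below the bound for this load pattern.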

Typical Central Controllers (Cisco)

SGS Implementation

Inputs choose outputs one after another. SGS is a maximal matching algorithm.

SGS Uses Pipelining

$I_i \to T_k$: input i chooses an output for time slot k.

Weighted Sequential Greedy Scheduling

1. $i = 1$;
2. Input i chooses an output j from $O_k$ for which it has a packet to send; remove i from $I_k$ and j from $O_k$;
3. If $i < N$, set $i = i + 1$ and go to the previous step.

Weighted Sequential Greedy Scheduling

1. If $k = 1 \bmod F$ then $c_{ij} = a_{ij}$;
2. $I_k = \{1, \ldots, N\}$; $O_k = \{1, \ldots, N\}$; $i = 1$;
3. Input i chooses an output j from $O_k$ for which it has a packet to send such that $c_{ij} > 0$; remove i from $I_k$ and j from $O_k$; $c_{ij} = c_{ij} - 1$;
4. If $i < N$, set $i = i + 1$ and go to the previous step.
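The WSGS steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: ports are numbered from 0, an input picks the lowest-numbered eligible output (the slides do not say how an input chooses among several), and the names `wsgs_frame`, `a`, `queues` and `served` are hypothetical.

```python
def wsgs_frame(a, queues, F):
    """Schedule one frame of F slots.

    a[i][j]      slots reserved per frame for pair (i, j)
    queues[i][j] packets waiting at input i for output j (updated in place)
    Returns served[i][j], the number of slots given to each pair.
    """
    n = len(a)
    credits = [row[:] for row in a]           # c_ij = a_ij at the start of the frame
    served = [[0] * n for _ in range(n)]
    for _ in range(F):                        # one pass per time slot k
        free_outputs = set(range(n))          # O_k
        for i in range(n):                    # inputs choose one after another
            for j in sorted(free_outputs):
                if queues[i][j] > 0 and credits[i][j] > 0:
                    free_outputs.remove(j)    # output j is taken in this slot
                    credits[i][j] -= 1        # c_ij = c_ij - 1
                    queues[i][j] -= 1
                    served[i][j] += 1
                    break
    return served

# Two ports, frame of three slots, every queue backlogged.
a = [[1, 1], [1, 0]]
queues = [[5, 5], [5, 5]]
served = wsgs_frame(a, queues, F=3)
```

In this small case every pair receives exactly its reserved number of slots, so `served` equals `a`. Queue (1, 1) is never served because it holds no credits.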

Non-blocking Nature of WSGS

A maximal matching algorithm does not leave an input or output unmatched if there is a packet to be transmitted from the input to the output in question.

It can be proven that all the traffic passes through a cross-bar with a speedup of two that is run by a maximal matching algorithm, as long as the outputs are not overloaded.

Performance of Maximal Matching Algorithm

Theorem: The maximal matching protocol (and so WSGS) ensures $a_{ij}$ time slots per frame to input-output pair (i,j), if

$$T_i + R_j - a_{ij} = \sum_m a_{im} + \sum_m a_{mj} - a_{ij} \le F$$

Proof: Note that

$$\sum_{m \ne j} c_{im} + \sum_{m \ne i} c_{mj} \le \sum_{m \ne j} a_{im} + \sum_{m \ne i} a_{mj} \le F - a_{ij}$$

where $T_i$ is the number of slots reserved for input i, and $R_j$ is the number of slots reserved for output j.

Admission Control for Maximal Matching Algorithm

The maximal matching (and so WSGS) protocol ensures $a_{ij}$ time slots per frame to input-output pair (i,j) if:

I: $T_i + R_j \le F + 1$

II: $T_i \le (F+1)/2$, $R_j \le (F+1)/2$

III: $t_i \le 1/2$, $r_j \le 1/2$

F is the frame length, $T_i$ the number of slots reserved for input i, and $R_j$ the number of slots reserved for output j. $t_i$, $r_j$ are normalized $T_i$, $R_j$.
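The three conditions can be evaluated directly from a reservation matrix. The sketch below assumes that $T_i$ and $R_j$ are the row and column sums of the reservations, that condition I is checked conservatively over all port pairs, and that the normalization is $t_i = T_i/F$; the function name is hypothetical.

```python
def admission_checks(a, F):
    """Return (cond_I, cond_II, cond_III) for reservation matrix a and frame length F."""
    n = len(a)
    T = [sum(a[i]) for i in range(n)]                        # slots reserved per input
    R = [sum(a[i][j] for i in range(n)) for j in range(n)]   # slots reserved per output
    cond_I = all(T[i] + R[j] <= F + 1 for i in range(n) for j in range(n))
    cond_II = all(2 * x <= F + 1 for x in T + R)             # T_i, R_j <= (F+1)/2
    cond_III = all(2 * x <= F for x in T + R)                # t_i, r_j <= 1/2
    return cond_I, cond_II, cond_III

a = [[1, 1], [1, 0]]
checks_f3 = admission_checks(a, F=3)
checks_f2 = admission_checks(a, F=2)
```

Each condition is stricter than the one before it: III implies II, and II implies I. Admission control based on II or III only needs per-port totals, which keeps it simple.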

Analogy with Circuit Switches

Inputs ~ Switches in the first stage

Time slots in a frame ~ Switches in the middle stage

Outputs ~ Switches in the last stage

Non-blocking condition: $F \ge n$

Strictly non-blocking condition: $F \ge 2n - 1$

Rate and Delay Guaranteed by Maximal Matching Algorithm (and WSGS)

Assume a coarse synchronization on a frame by frame basis, where a frame is the policing interval comprising F cell time slots of duration Tc.

Then, the delay of D=2·F·Tc is provided for the utilization of 50%. Or, this delay and utilization of 100% are provided for the fabric with the speedup of 2.

$U = 50\%$: $D = 2 \cdot F \cdot T_c$

$S = 2$, $U = 100\%$: $D = 2 \cdot F \cdot T_c$

Port Congestion Due to Multicasting

$$\sum_k p^m_{ik} \, |M_{ik}| + \sum_{k,l:\, j \in M_{lk}} p^m_{lk} \le 1$$

$p^m_{ik}$: bit-rate reserved for multicast session k of input i

$M_{ik}$: multicast group k sourced by input i

Solution: Packets should be forwarded through the switch by multicast destination ports.

Forwarding Multicast Traffic

Adding the Port to the Multicast Tree

Removing the Port from the Multicast Tree

Admission Control for Modified WSGS

$$T_i + E_i + R_j \le F + 1$$

$$T_i + P \cdot R_i + R_j \le F + 1$$

where $E_i$ is the number of forwarded packets per frame.

$$T_i \le F_t, \quad R_i \le F_r, \quad F_t + (P + 1) \cdot F_r \le F + 1$$

$$C = \max_{F_t, F_r} \min(F_t, F_r) = \max_{F_t} \min\left(F_t, \frac{F + 1 - F_t}{P + 1}\right) = \frac{F + 1}{P + 2}$$

for

$$F_t = F_r = \frac{F + 1}{P + 2}$$

The modified WSGS protocol ensures negotiated bandwidths to input-output pairs if, for $1 \le i \le N$:

I: $T_i \le (F+1)/(P+2)$, $R_i \le (F+1)/(P+2)$

II: $t_i \le 1/(P+2)$, $r_i \le 1/(P+2)$

$T_i$ is the number of slots reserved for input i, and $R_i$ the number of slots reserved for output i. $t_i$, $r_i$ are normalized $T_i$, $R_i$.

F is the frame length, and P the forwarding fan-out.
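A per-port check of condition I is straightforward to code. The sketch below also evaluates the pairwise load $T_i + P \cdot R_i + R_j$ so that it can be compared with $F + 1$. The function names and sample values are assumptions made for the example.

```python
def modified_wsgs_admissible(T, R, F, P):
    """Condition I: T_i <= (F+1)/(P+2) and R_i <= (F+1)/(P+2) for every port i."""
    # Integer form of the inequality avoids rounding issues.
    return all((P + 2) * x <= F + 1 for x in list(T) + list(R))

def worst_case_load(T, R, P):
    """Largest value of T_i + P*R_i + R_j over all port pairs (i, j)."""
    return max(T[i] + P * R[i] + r_j for i in range(len(T)) for r_j in R)

F, P = 11, 2                      # per-port bound is (F+1)/(P+2) = 3 slots
T, R = [3, 2], [3, 3]
ok = modified_wsgs_admissible(T, R, F, P)
load = worst_case_load(T, R, P)   # stays within F + 1 when condition I holds
```

When every port respects the bound, the worst pairwise load is at most $(P+2)(F+1)/(P+2) = F + 1$, so the per-port test is sufficient for the pairwise condition.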

Rate and Delay Guaranteed by Modified WSGS

Assume again a coarse synchronization on a frame by frame basis.

Then, the delay of $D = F \cdot T_c \cdot \log_P N$ is provided for the utilization of $1/(P+2)$, where P is the forwarding fan-out. Or, this delay and utilization of 100% are provided for the fabric speedup of $P+2$.

$U = 1/(P+2)$: $D = F \cdot T_c \cdot \log_P N$

$S = P + 2$, $U = 100\%$: $D = F \cdot T_c \cdot \log_P N$

Quality of Service, P=2, S=4, B=10Gb/s, Tc=50ns

N          1000     1000      4000     4000
F          10^4     5·10^4    10^4     5·10^4
C [Tb/s]   2.5      2.5       10       10
G [Mb/s]   1        0.2       1        0.2
D [ms]     5        25        5.5      27.5
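The C and G rows appear consistent with two simple relations: C = N·B/S for the switch capacity and G = B/F for the bandwidth granularity. These relations are inferred from the numbers and are not stated on the slide, so the sketch below should be read as an assumption-based cross-check with hypothetical function names.

```python
def capacity_tbps(n_ports, line_rate_gbps, speedup):
    """Assumed relation C = N * B / S, returned in Tb/s."""
    return n_ports * line_rate_gbps / speedup / 1000.0

def granularity_mbps(line_rate_gbps, frame_len):
    """Assumed relation G = B / F (one slot per frame), returned in Mb/s."""
    return line_rate_gbps * 1000.0 / frame_len

B, S = 10, 4                                  # line rate in Gb/s and speedup from the slide title
for n_ports in (1000, 4000):
    for frame_len in (10**4, 5 * 10**4):
        c = capacity_tbps(n_ports, B, S)
        g = granularity_mbps(B, frame_len)
        print(n_ports, frame_len, c, g)
```

Longer frames give a finer granularity, while the D row shows that the delay grows with F.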

References

T. E. Anderson, S. S. Owicki, J. B. Saxe, and C. P. Thacker, “High-speed switch scheduling for local-area networks,” ACM Transactions on Computer Systems, vol. 11, no. 4, November 1993, pp. 319-352.

N. McKeown et al., “The Tiny Tera: A packet switch core,” IEEE Micro, vol. 17, no. 1, Jan.-Feb. 1997, pp. 26-33.

A. Smiljanić, “Flexible bandwidth allocation in high-capacity packet switches,” IEEE/ACM Transactions on Networking, April 2002, pp. 287-293.

A. Smiljanić, “Scheduling of multicast traffic in high-capacity packet switches,” IEEE Communications Magazine, November 2002, pp. 72-77.