FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE...

23
Data Center Quantized Congestion Notification (QCN): Implementation and Evaluation on NetFPGA 2010 June 14th Masato Yasuda (NEC Corporation) Abdul Kader Kabanni (Stanford University)

Transcript of FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE...

Page 1: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

Data Center Quantized Congestion Notification (QCN): Implementation and Evaluation on NetFPGA

2010 June 14thMasato Yasuda (NEC Corporation)

Abdul Kader Kabanni (Stanford University)

Page 2: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 2

Background

▐ In data centers, there is a movement of Network Convergence ▐ The spec of Converged Enhanced Ethernet (CEE) has been discussed in

IEEE 802.1 standard committees

Converged Enhanced Ethernet

Internet

servers

terminals

Internet

terminals

Converged DC transport

LAN(Fibre Channel: FC)(TCP)

SAN

storage storage

servers

Separate Network Converged Network

Page 3: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 3

Congestion Control Mechanism in CEE

▐ In CEE, it has Congestion Control MechanismNon-TCP traffic (storage, media) can avoid network

congestion

▐ Congestion Notification (CN) The spec of Congestion Control Mechanism in CEE It is specified in IEEE802.1Qau in Data Center Bridging Task Group We proposed QCN (Quantized Congestion Notification) and finally

accepted as a standard in March 2010.

Converged Enhanced Ethernet (CEE)Converged Enhanced Ethernet (CEE)

FCoEFCoEIPIP IPIP

TCPTCP UDP(for media)

UDP(for media)

FC(for storage)

FC(for storage)

SupportSupportCongestion Congestion

ControlControlat Layer 2at Layer 2

Page 4: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 4

Significant restrictions/requirements for CN

We must avoid queue oscillation which causes overflow (causes link pause) or underflow (lose utilization)

StableStable

Unlike TCP slow start, sources can come on at the full line rate of 10Gbps

Sources can start Sources can start at line lateat line late

Links can be paused by pause mechanism but CN is needed to avoid congestion spreading (pause spreading to innocent paths).

Links can be Links can be pausedpaused

The algorithm should be simple enough to be implemented in hardware (for 40G or 100G processing)

SimpleSimple

No way to know round trip time. Not automatically self-clocked like TCP.(The algorithm need to have counters for self-clocking)

No perNo per--packet packet ACKsACKs

Page 5: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 5

QCN Overview

Page 6: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 6

QCN (Quantized Congestion Notification)

▐ QCN: Congestion Control Mechanism at Layer 2 Proposed by our group and accepted in IEEE 802.1Qau

▐ Terminology: Congestion Point: Where congestion occurs, mainly switches Reaction Point: Source of traffic, mainly rate limiters in Ethernet

NICs

Reaction PointReaction Point Congestion PointCongestion PointFeedbackMessage

FbFbTrafficfrom othersources

SwitchNIC

Page 7: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 7

▐ Feedback Processing Calculate Feedback value (Fb) from Switch Queue length Send feedback to Reaction Point (RP)

▐ Fb sampling probability No Congestion: Pmin(1%) Under Congestion: Pmax(10%) at maximum

▐ Fb Calculation

QCN Algorithm: Congestion Point (CP)

) Q wQ (F deltaoffb

Qeq : Qlen in equibrium

Qold : Queue length at previous processingQ: Current Queue LengthFb : Feedback (quantized to 6bits) w: fixed parameter (= 2)

QeqQ

QoldQdelta

Qoff

Qoff: Offset

eqoff QQQ Qdelta: Variance

olddelta QQQ

|Fb|

Refle

ction

Prob

ability

Pmin

Pmax

Page 8: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 8

QCN Algorithm: Reaction Point (RP)

▐ Rate Control Rate Decrease: Feedback from CP Rate Increase: Increase by itself

▐ Rate Increase Algorithm: Averaging Principle Based on BIC-TCP: Works without ACK

Target Rate (TR)

Current Rate (CR)

Fast RecoveryFast Recovery

Active Active IncreaseIncrease

HyperHyper--Active Active IncreaseIncrease

Receive FeedbackReceive Feedback

)FGCR(1CR bdCRTR

1/2)F(G bmaxd

TR)/2(CRCR

TRTR

AIRTRTR

TR)/2(CRCR TR)/2(CRCR HAIiRTRTR

Time

Rate

TriggerTimeout or Byte Counter

Page 9: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 9

Implementation

Page 10: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 10

Current QCN Implementation

▐▐ WorldWorld’’s first Hardware QCN Full implementations first Hardware QCN Full implementation It is fully functional We confirmed that the result matches OMNet++ simulation result!

▐ QCN-NIC Multiple Reaction Points Improved Token Bucket Rate Limiter

▐ QCN-Switch Multiple Congestion Points

▐ Compliant with Pseudo Code v2.3 [1] 1Gpbs Platform (NetFPGA)

[1] QCN Pseudo Code v2.3: http://www.ieee802.org/1/files/public/docs2009/au-rong-qcn-serial-hai-v23.pdf

Page 11: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 11

Packet Format

▐ Data Frame (Normal Ethernet UDP/IP Frame)

▐ QCN Fb Frame(64Bytes)

SRC MAC[48:32]DST MAC[47:0]

63 0

SRC MAC[31:0] Type =16’h0800

IP and UDP Header

Extract as flowid[7:0] at QCN Switch

SRC MAC[48:32]DST MAC[47:0]

63 0

SRC MAC[31:0] Type = 16’hXXXX QCN Field[15:0]

flowid[7:0] qcn_fb[5:0]

Ethernet Padding

reserved(2bits)

QCN Field[15:0]

Page 12: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 12

QCN NIC Block Diagram

User Data Path

InputArbiter

Demux DemuxMux

Delay Line

SRAM

ReceivePacketparser

QCNTerminator

Fb Demuxqcn_fb, flowid

RateCalculator

Timer

Byte Count

Token BucketRate Limiter

UDP PktGen

Reaction Point 0

Demux Mux

Reaction Point 1Reaction Point 2

Reaction Point 3

Page 13: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 13

Token Bucket Rate Limiter

▐ Token bucket algorithm avoiding burstiness

Packet

Increase token_fill size tokensIf refresh timer has expired

Decrease packet size tokensin sending packet

Bucket Size B

Wait for being allowed to send

tokens

Send Condition:tokens in the bucket > 0

Send Condition:tokens in the bucket > 0

Send Condition:tokens in the bucket >= token size

Send Condition:tokens in the bucket >= token size

Causing burstiness!

Previous implementation Current implementation

Working Fine

Page 14: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 14

QCN Switch Module Structure

User Data Path

Fb Calculator

QlenMonitor

Fb Generator

PrioritizedMux

Output Queues

InputArbiter

PacketParser

Port ByteCount

Receive Block

FwdPort

Congestion Point

src/dst macsrc/dst port

Demux4port

PrioritizedMux

PrioritizedMux

PrioritizedMuxRate

Limiter

SRAM

Page 15: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 15

Evaluation

Page 16: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 16

QCN Hardware Evaluation System

QCN-NIC0

QCN-Switch

Port1MAC

Port2MAC

QCN-NIC1 CPRate

Limiter

Fb

Mux

Demux

drop

RP0RP1RP2RP3

Receive

Mux

MACDelayLine

RP0RP1RP2RP3

Receive

Mux

MACDelayLine

Port0MAC

Output Queue(Port0)

FwdPort

▐ Number of Reaction Points (RPs): 1-8 Each RP generates 1Gbps UDP Traffic

▐ Limit Rate at Switch: 950Mbps(3.7sec) 200Mbps(3.7sec) 950Gbps(3.7sec)

Limit Rate (950M200M950M)to check flow control

behavior

4RPs

4RPs

Page 17: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 17

Parameters (QCN)

▐ NIC FAST_RECOVERY_THRESHOLD = 5 AI_INC = 0.5 Mbps HAI_INC = 5 Mbps BC_LIMIT = 150 KB (30% randomness) TIMER_PERIOD = 25 ms (30% randomness) MIN_RATE = 0.5 Mbps GD = 1/128

▐ Switch Quantized_Fb: 6 bits Q_EQ = 33 KB W = 2 Base marking = 150 KB, and varies according to the lookup table in

the pseudo code (30% randomness)

Based on QCN Pseudo Code Parameters

Page 18: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 18

Evaluation Result

Page 19: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 19

Hardware Evaluation Result (1RP) , RTT=100usec

Switch: Queue LengthNIC: Rate

020406080

100120140

1 3 5 7 9 11

Evaluation Time [sec]Q

ueue

Siz

e [K

Byt

es]

0

200

400

600

800

1000

1 3 5 7 9 11

Evaluation Time [sec]

Rat

e [M

bps]

Queue length rises temporarilyin limiting Switch rate

but go back to Qeq in shorter time.

Queue length rises temporarilyin limiting Switch rate

but go back to Qeq in shorter time.

The NIC rate behavior well followsthe limited Switch rate

by QCN Algorithm

The NIC rate behavior well followsthe limited Switch rate

by QCN Algorithm

Page 20: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 20

OMNet++ Simulation Result (1RP) , RTT=100usec

Switch: Queue LengthNIC: Rate

020406080

100120140

1 3 5 7 9 11

Simulation Time [sec]

Que

ue S

ize

[KB

ytes

]

0

200

400

600

800

1000

1 3 5 7 9 11

Simulation Time [sec]

Rat

e [M

bps]

Simulation result well matches the Hardware Evaluation Result !Simulation result well matches the Hardware Evaluation Result !

Page 21: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 21

Hardware Evaluation Result (8RPs) , RTT=1msec

Switch: Queue LengthNIC: Rate

020406080

100120140

1 3 5 7 9 11

Evaluation Time [sec]

Que

ue S

ize

[KB

ytes

]

0

200

400

600

800

1000

1 3 5 7 9 11

Evaluation Time [sec]

Rat

e [M

bps]

It also matches OMNet++ Simulation Result with the same parameterIt also matches OMNet++ Simulation Result with the same parameter

Wiggling because of longer delaybut still keeps the same queue length.

Wiggling because of longer delaybut still keeps the same queue length.

The aggregate NIC rate behavior still well follows the Switch limited rate

The aggregate NIC rate behavior still well follows the Switch limited rate

Page 22: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 22

Demonstration

Page 23: FIF (FUTURE INTERNET FORUM)fif.kr/AsiaNetFPGAws/slide/1-4_slide.pdf · 2017. 2. 13. · FIF (FUTURE INTERNET FORUM)

© NEC Corporation 2009Page 23

Conclusion

▐ We presented/demonstrated QCN theory and implementation

▐ QCN Implementation on NetFPGA Board All QCN Functions are implemented (Pseudo Code v2.3) QCN-NIC

• Multiple RPs are ready• Improved Token Bucket Rate Limiter

QCN-Switch• Multiple CPs are ready• Binary Search Logic for Fb Calculation

▐ We have proved that our QCN implementation achieves expected performance. The result matches OMNet++ Simulation