Traffic Load Balancing Issues

MX TRIO LOAD BALANCING Dmitry Shokarev Product Line Management Routing Business Unit Version 1.4, April 2014 Confidential

description

Presentation for the talk given at the Juniper New Network Day conference on 01.01.2014. The speaker is Dmitry Shokarev, Product Line Manager at Juniper Networks. A video recording of this talk from the conference's online broadcast is available here: http://www.youtube.com/watch?v=G96VHB4vfsw

Transcript of Traffic Load Balancing Issues

Page 1

MX TRIO LOAD BALANCING

Dmitry Shokarev
Product Line Management
Routing Business Unit
Version 1.4, April 2014

Confidential

Page 2

Copyright © 2009 Juniper Networks, Inc. www.juniper.net

AGENDA

1. High level load balancing overview

2. Packet parsing and hash computation

3. Advanced Topics

4. Theoretical load balancing efficiency analysis

5. Adaptive and Stateful load balancing

Page 3

HIGH LEVEL OVERVIEW

Page 4

HIGH LEVEL LOAD BALANCING OVERVIEW (SIMPLIFIED)

Ingress PFE: Parse packet -> Compute hash -> Lookup route -> Select next-hop
Egress PFE: Encapsulate

• Parse packet: depending on the interface encapsulation, select packet fields for route lookup
• Compute hash: compute a fixed-size hash value from a variable set of packet fields
• Lookup route: find a route based on the packet fields
• Select next-hop: select the ultimate next-hop from a list of possible next-hops (multiple levels)
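The four ingress steps can be sketched in a few lines of Python. This is only an illustration of the flow: the dictionary keys, the exact-match route table and the use of CRC-32 from `zlib` are assumptions made for the example, not the Trio implementation.

```python
import zlib

def parse_packet(packet: dict) -> tuple:
    """Pick the fields used for hashing (hypothetical field names)."""
    return (packet["src"], packet["dst"], packet["proto"])

def compute_hash(fields: tuple) -> int:
    """Reduce a variable set of fields to a fixed-size value (CRC-32 stand-in)."""
    return zlib.crc32(repr(fields).encode())

def select_next_hop(hash_value: int, next_hops: list) -> str:
    """Choose one next-hop from the list of candidates."""
    return next_hops[hash_value % len(next_hops)]

def forward(packet: dict, routes: dict) -> str:
    fields = parse_packet(packet)
    hash_value = compute_hash(fields)
    # Exact-match lookup stands in for a real longest-prefix-match route lookup.
    next_hops = routes[packet["dst"]]
    return select_next_hop(hash_value, next_hops)

routes = {"192.0.2.1": ["ae0", "ae1", "ae2"]}
pkt = {"src": "198.51.100.7", "dst": "192.0.2.1", "proto": 6}
chosen = forward(pkt, routes)
```

Packets of the same flow carry the same fields, so they always get the same hash and therefore the same next-hop.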

Page 5

PACKET PARSING AND HASH COMPUTATION

Page 6

HOW TO READ THE DIAGRAM [1 OF 2]

Legend
• Hash is symmetric (swapping the fields does not change the hash result)
• ON/OFF marker: the field is configurable (default on, or default off)
• No marker: the field is included by default and can't be turned off
• Match value, hashed: applicable only if there is a field match (TCP or UDP packets in this case). The field is included into the hash
• Match value, not hashed: applicable only if there is a field match (PPTP packets in this case). The field is NOT included into the hash computation
• IIF stands for Incoming Interface Index (internal logical interface identifier)

IPv4, UDP or TCP
  L4: Source Port (ON/OFF), Dest. Port (ON/OFF)
  L3: Protocol (6 or 17), Source Address, Dest. Address, DSCP (ON/OFF)
  L2: IIF (ON/OFF)

IPv4, GRE (PPTP)
  L4: GRE Protocol (0x880B), GRE Key (16 bits)
  L3: Protocol (47), Source Address, Dest. Address, DSCP (ON/OFF)
  L2: IIF (ON/OFF)
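One common way to obtain the symmetry property described in the legend is to put each source/destination pair into a canonical order before hashing. The sketch below assumes that approach and uses CRC-32 as a stand-in hash; the slide does not say how the hardware achieves symmetry.

```python
import zlib

def symmetric_hash(src_addr, dst_addr, src_port, dst_port, proto):
    """Hash a 5-tuple so that swapping source and destination changes nothing."""
    # Sort each pair so that (A, B) and (B, A) give the same byte string.
    lo_addr, hi_addr = sorted((src_addr, dst_addr))
    lo_port, hi_port = sorted((src_port, dst_port))
    key = f"{lo_addr}|{hi_addr}|{lo_port}|{hi_port}|{proto}".encode()
    return zlib.crc32(key)

forward_hash = symmetric_hash("10.0.0.1", "10.0.0.2", 1234, 80, 6)
reverse_hash = symmetric_hash("10.0.0.2", "10.0.0.1", 80, 1234, 6)
```

Both directions of a flow produce the same value, so they pick the same member link or next-hop.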

Page 7

HOW TO READ THE DIAGRAM [2 OF 2]

Ethernet, IPv4
  L3/L4: IPv4 payload (ON/OFF)
  L2: Ether type (0x0800), VLAN Tag 1, VLAN Tag 2..N, Outer 802.1p (ON/OFF), Source MAC and Dest. MAC (ON/OFF), IIF (ON/OFF)

The shaded area (IPv4 payload) refers to a hash field selection procedure defined somewhere else. In this case the IPv4 hash selection procedure is used:

IPv4
  L3: Protocol, Source Address, Dest. Address, DSCP (ON/OFF)

IPv4, UDP or TCP, non-fragmented (Fragment Flag 0, Fragment Offset 0)
  L4: Source Port (ON/OFF), Dest. Port (ON/OFF)
  L3: Protocol (6 or 17), Source Address, Dest. Address, DSCP (ON/OFF)

IPv4, GRE, non-fragmented (Fragment Flag 0, Fragment Offset 0)
  L4: GRE Key (32 bits)
  L3: Protocol (47), Source Address, Dest. Address, DSCP (ON/OFF)

IPv4, PPTP, non-fragmented (Fragment Flag 0, Fragment Offset 0)
  L4: GRE Protocol (0x880B), GRE Key (16 MS Bits), GRE Key (16 LS Bits)
  L3: Protocol (47), Source Address, Dest. Address, DSCP (ON/OFF)

IPv4, GTP, non-fragmented (Fragment Flag 0, Fragment Offset 0)
  L4: Source Port, Dest. Port (2152), GTP TEID (ON/OFF)
  L3: Protocol (17), Source Address, Destination Address, DSCP (ON/OFF)

Page 8

WHICH FIELDS ARE SELECTED WHEN?

The answer depends on the encapsulation on ingress / egress:

IP -> IP: use IP fields
MPLS -> MPLS: use MPLS fields
CCC, VPLS, Bridge -> CCC, VPLS, Bridge: use CCC/Bridge/VPLS fields
IP -> MPLS: use IP fields

CCC

Use CCC/Bridge/VPLS fields

MPLS

Use MPLS fields

VPLS, Bridge

Use CCC/Bridge/VPLS fields

MPLS

IP IP+GRE/IPIP

Use IP fields

Use Inner IP fields

VPLS, Bridge IP (VIA IRB)

Use IP fields

Page 9

HASH FIELD SELECTION, IPV4 TRAFFIC [1 OF 2]

IPv4
  L3: Protocol, Source Address, Dest. Address, DSCP (ON/OFF)
  L2: IIF (ON/OFF)

IPv4, UDP or TCP, non-fragmented (Fragment Flag 0, Fragment Offset 0)
  L4: Source Port (ON/OFF), Dest. Port (ON/OFF)
  L3: Protocol (6 or 17), Source Address, Dest. Address, DSCP (ON/OFF)
  L2: IIF (ON/OFF)

L4 fields are included only for non-fragments.
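The IPv4 selection rules above can be written as a small function. This is a sketch under assumptions: the packet is a plain dictionary with made-up keys, and the default values of the flags are inferred from the names of the configuration knobs (`no-source-port`, `no-destination-port`, `type-of-service`, `incoming-interface-index`), not stated on this slide.

```python
def ipv4_hash_fields(pkt, use_sport=True, use_dport=True,
                     use_dscp=False, use_iif=False):
    """Return the (name, value) pairs that would feed the hash for an IPv4 packet."""
    fields = [("src", pkt["src"]), ("dst", pkt["dst"]), ("proto", pkt["proto"])]
    # A packet is a fragment if More Fragments is set or the offset is non-zero.
    fragmented = pkt.get("more_fragments", False) or pkt.get("frag_offset", 0) != 0
    if pkt["proto"] in (6, 17) and not fragmented:
        # L4 ports are only trusted on non-fragmented TCP (6) or UDP (17).
        if use_sport:
            fields.append(("sport", pkt["sport"]))
        if use_dport:
            fields.append(("dport", pkt["dport"]))
    if use_dscp:
        fields.append(("dscp", pkt["dscp"]))
    if use_iif:
        fields.append(("iif", pkt["iif"]))
    return fields

tcp = {"src": "10.0.0.1", "dst": "10.0.0.2", "proto": 6,
       "sport": 1234, "dport": 80, "dscp": 0, "iif": 5}
whole = [name for name, _ in ipv4_hash_fields(tcp)]
fragment = [name for name, _ in ipv4_hash_fields(dict(tcp, frag_offset=185))]
```

Leaving the ports out for fragments keeps all fragments of one datagram on the same path, since only the first fragment carries the L4 header.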

Page 10

HASH FIELD SELECTION, IPV4 TRAFFIC [2 OF 2]

IPv4, GRE, non-fragmented (Fragment Flag 0, Fragment Offset 0)
  L4: GRE Key (32 bits)
  L3: Protocol (47), Source Address, Dest. Address, DSCP (ON/OFF)
  L2: IIF (ON/OFF)

IPv4, PPTP, non-fragmented (Fragment Flag 0, Fragment Offset 0)
  L4: GRE Protocol (0x880B), GRE Key (16 MS Bits), GRE Key (16 LS Bits)
  L3: Protocol (47), Source Address, Dest. Address, DSCP (ON/OFF)
  L2: IIF (ON/OFF)

IPv4, GTP, non-fragmented (Fragment Flag 0, Fragment Offset 0)
  L4: Source Port, Dest. Port (2152), GTP TEID (ON/OFF)
  L3: Protocol (17), Source Address, Destination Address, DSCP (ON/OFF)
  L2: IIF (ON/OFF)

Page 11

HASH FIELD SELECTION, IPV6 TRAFFIC [1 OF 2]

IPv6
  L3: Next Header, Source Address, Dest. Address, Traffic Class (ON/OFF)
  L2: IIF (ON/OFF)

IPv6, UDP or TCP
  L4: Source Port (ON/OFF), Dest. Port (ON/OFF)
  L3: Next Header (6 or 17), Source Address, Dest. Address, Traffic Class (ON/OFF)
  L2: IIF (ON/OFF)

Page 12

HASH FIELD SELECTION, IPV6 TRAFFIC [2 OF 2]

IPv6, GRE
  L4: GRE Key (32 bits)
  L3: Next Header (47), Source Address, Dest. Address, Traffic Class (ON/OFF)
  L2: IIF (ON/OFF)

IPv6, PPTP
  L4: GRE Protocol (0x880B), GRE Key (16 MS Bits), GRE Key (16 LS Bits)
  L3: Next Header (47), Source Address, Dest. Address, Traffic Class (ON/OFF)
  L2: IIF (ON/OFF)

IPv6, GTP
  L4: Source Port, Dest. Port (2152), GTP TEID (ON/OFF)
  L3: Next Header (17), Source Address, Destination Address, Traffic Class (ON/OFF)
  L2: IIF (ON/OFF)

Page 13

HASH FIELD SELECTION, CCC/BRIDGE/VPLS TRAFFIC [1 OF 2]

Ethernet, non-IP and non-MPLS
  L2: Source MAC and Dest. MAC (ON/OFF), Outer 802.1p (ON/OFF), IIF (ON/OFF), VLAN Tag 1, VLAN Tag 2..N

Note: VLANs are not included.

Page 14

HASH FIELD SELECTION, CCC/BRIDGE/VPLS TRAFFIC [2 OF 2]

Ethernet, IPv4
  L3/L4: IPv4 payload (ON/OFF)
  L2: Ether type (0x0800), VLAN Tag 1 or none, VLAN Tag 2 or none, Outer 802.1p (ON/OFF), Source MAC and Dest. MAC (ON/OFF), IIF (ON/OFF)

Ethernet, IPv6
  L3/L4: IPv6 payload (ON/OFF)
  L2: Ether type (0x86DD), VLAN Tag 1 or none, VLAN Tag 2 or none, Outer 802.1p (ON/OFF), Source MAC and Dest. MAC (ON/OFF), IIF (ON/OFF)

Ethernet, MPLS
  L3/L4: MPLS payload (ON/OFF)
  L2: Ether type (0x8847), VLAN Tag 1 or none, VLAN Tag 2 or none, Outer 802.1p (ON/OFF), Source MAC and Dest. MAC (ON/OFF), IIF (ON/OFF)

A single knob controls payload analysis for all packet types.

Page 15

HASH FIELD SELECTION, MPLS TRAFFIC [JUNOS < 14.1]

MPLS, Encapsulated IPv4 or IPv6
  L3/L4: IPv4, IPv6 payload (ON/OFF)
  L2: Label 1 (20 bits), Label 2..5 (20 bits each), Outer Label EXP (ON/OFF), IIF (ON/OFF)

MPLS, IPv4/IPv6 in Ethernet Pseudo-wire
  L3/L4: IPv4, IPv6 in Ethernet pseudo-wire (ON/OFF)
  L2: Label 1 (20 bits), Label 2..5 (20 bits each), Outer Label EXP (ON/OFF), IIF (ON/OFF)

Up to 5 top labels are used.

Page 16

MPLS ENCAPSULATED TRAFFIC DETERMINATION [JUNOS < 14.1]

1. Start: use up to 5 top labels in the hash computation and include the topmost EXP (if enabled).
2. Bottom of the stack reached? No: End. Yes: check the first nibble of the payload.
3. First nibble 0x4 (IPv4): if the length matches, compute the IPv4 hash.
4. First nibble 0x6 (IPv6): if the length matches, compute the IPv6 hash.
5. Else: check the Ethertype.
   • 0x0800: IPv4
   • 0x86DD: IPv6
   • 0x8100: skip the VLAN tag and check the Ethertype again; if VLANs skipped > 2, End
   • Else: End

Page 17

HASH FIELD SELECTION, MPLS TRAFFIC [JUNOS >= 14.1]

MPLS, Encapsulated IPv4 or IPv6
  L3/L4: IPv4, IPv6 payload (ON/OFF)
  L2: Label 1 (20 bits), Label 2..8 (20 bits each), Outer Label EXP (ON/OFF), IIF (ON/OFF)

MPLS, IPv4/IPv6 in Ethernet Pseudo-wire
  L3/L4: IPv4, IPv6 or MPLS in Ethernet pseudo-wire (ON/OFF)
  L2: Label 1 (20 bits), Label 2..8 (20 bits each), Outer Label EXP (ON/OFF), IIF (ON/OFF)

MPLS, Entropy Label
  L2: Label 1 (20 bits), Label 2..8 (20 bits each), Outer Label EXP (ON/OFF), IIF (ON/OFF)
  Entropy Label Indicator detected: the payload is not processed, and the indicator is not included into the hash.

Page 18

MPLS ENCAPSULATED TRAFFIC DETERMINATION [JUNOS >= 14.1]

1. Start: use up to 8 top labels in the hash computation, except ELI*, and include the topmost EXP (if enabled).
2. Bottom of the stack reached AND no ELI* detected? No: End. Yes: check the first nibble of the payload.
3. First nibble 0x4 (IPv4): if the length matches, compute the IPv4 hash.
4. First nibble 0x6 (IPv6): if the length matches, compute the IPv6 hash.
5. Else: check the Ethertype.
   • 0x0800: IPv4
   • 0x86DD: IPv6
   • 0x8100: skip the VLAN tag and check the Ethertype again; if VLANs skipped > 2, End
   • Else: End

* ELI: Entropy Label Indicator, value of 7
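The payload detection heuristic can be approximated in Python as follows. Several points are assumptions made for this sketch, because the flowchart does not spell them out: "length matches" is taken to mean that the IP length field equals the number of bytes remaining after the label stack, a failed IP check falls through to the Ethernet check, and the return labels are invented names.

```python
import struct

def classify_ip(data: bytes):
    """Return 'ipv4' or 'ipv6' if the first nibble and the length field agree."""
    if not data:
        return None
    nibble = data[0] >> 4
    if nibble == 4 and len(data) >= 20:
        total_length = struct.unpack("!H", data[2:4])[0]
        if total_length == len(data):
            return "ipv4"
    elif nibble == 6 and len(data) >= 40:
        payload_length = struct.unpack("!H", data[4:6])[0]
        if payload_length + 40 == len(data):
            return "ipv6"
    return None

def classify_ethernet(frame: bytes, max_vlans: int = 2):
    """Walk past up to max_vlans 0x8100 tags and look at the Ethertype."""
    offset = 12  # skip destination and source MAC
    vlans_skipped = 0
    while len(frame) >= offset + 2:
        ethertype = struct.unpack("!H", frame[offset:offset + 2])[0]
        if ethertype == 0x8100:
            vlans_skipped += 1
            if vlans_skipped > max_vlans:
                return None
            offset += 4  # TPID plus tag control information
        elif ethertype == 0x0800:
            return "ipv4-in-ethernet"
        elif ethertype == 0x86DD:
            return "ipv6-in-ethernet"
        else:
            return None
    return None

def classify_mpls_payload(payload: bytes):
    """Guess what follows the bottom-of-stack label."""
    return classify_ip(payload) or classify_ethernet(payload)

ipv4_header = bytes([0x45, 0x00]) + struct.pack("!H", 20) + bytes(16)
tagged_frame = bytes(12) + struct.pack("!HHH", 0x8100, 100, 0x0800) + ipv4_header
```

The length check matters because an Ethernet frame whose destination MAC happens to start with nibble 4 or 6 would otherwise be mistaken for an IP packet.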

Page 19

NOTES ON MPLS PAYLOAD PROCESSING

Algorithm features
• Heuristic nature, produces good detection results
• Certain (fixed) requirements to the traffic
  • No control word for EoMPLS/VPLS frames
  • 0x8100 Ethertype for VLANs

[Figure: packet layout by byte offset and bit position. Ethernet header (Destination MAC, Source MAC, TPID 0x8100, .1P, Outer VLAN, Inner VLAN, Ethertype 0x8847 MPLS); MPLS labels 1 to 3 (Label, EXP, S, TTL); encapsulated Ethernet header (Encapsulated Destination MAC, Encapsulated SRC MAC, TPID 0x8100, PCP, DEI, Encapsulated Outer VLAN, Encapsulated Inner VLAN, Ethertype 0x0800 IPv4); IPv4 header (Version, Header Length, DSCP, ECN, Total Length, Identification, Flags, Fragment offset, TTL, Protocol = 17 (UDP), Header checksum, Source Address, Destination Address); UDP header (Source Port, Destination Port, UDP Length, Checksum); Payload Data.]

Sample hash field selection for bridged MPLS traffic with pseudo-wire encapsulated UDP. All optional fields are enabled except IIF; fields included into the computation are in black.

Page 20

HOW IS THE HASH COMPUTED?

Trio hash computation algorithm
• Uses a combination of Cyclic Redundancy Check (CRC) 13 and CRC-31 polynomial functions (similar functions are used to compute the Ethernet frame checksum)
• Implemented in hardware
• Very efficient

Hash function result
• One 31 bit number (used to select the next-hop)
• For hierarchical load balancing, sections of that result are used

Page 21

NEXT-HOP SELECTION EXAMPLE (MULTIPLE LEVELS)

[Diagram: PE0, PE1, PE2 and PE3 with BGP; LSP1, LSP2, LSP3.]

• 1st level balancing: IP Route -> Next-hop list 1.1 -> Indirect next-hop 1, Indirect next-hop 2, Indirect next-hop 3
• 2nd level balancing: List 2.1, List 2.2, List 2.3 -> LSP 1, LSP 2, LSP 3
• 3rd level balancing: AE0 List -> AE0-1, AE0-2, AE0-3

A different set of bits from the hash is used to select a next-hop at each level (to prevent polarization).
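A minimal sketch of the "different bits per level" idea is shown below. The width of 8 bits per level, the CRC-32 stand-in and the single shared list per level are assumptions chosen for readability; the slide does not state which bits the hardware uses at each level.

```python
import zlib

def select_path(hash_value: int, levels: list, bits_per_level: int = 8) -> list:
    """Pick one entry per level, each from a different slice of the hash."""
    mask = (1 << bits_per_level) - 1
    path = []
    for level, candidates in enumerate(levels):
        # Level 0 uses the lowest bits, level 1 the next slice, and so on.
        chunk = (hash_value >> (level * bits_per_level)) & mask
        path.append(candidates[chunk % len(candidates)])
    return path

levels = [
    ["Indirect next-hop 1", "Indirect next-hop 2", "Indirect next-hop 3"],
    ["LSP 1", "LSP 2", "LSP 3"],
    ["AE0-1", "AE0-2", "AE0-3"],
]
flow_hash = zlib.crc32(b"10.0.0.1|10.0.0.2|1234|80|6")
path = select_path(flow_hash, levels)
```

If every level reused the same bits, all flows that chose entry 1 at the first level would also choose entry 1 at the second and third, leaving the other LSPs and member links idle.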

Page 22

POLARIZATION PREVENTION (NETWORK WIDE)

Problem statement
• Hashing at different nodes may produce the same results
• This will result in traffic polarization

Solution
• Include a hash seed into the computation
• The hash seed is based on the system MAC
• Enabled by default, non-configurable

[Diagram: traffic is split ~50% / ~50% by the hash computation at the 1st load balancing decision. At the 2nd load balancing decision the hash computation gives the same result (unless IIF inclusion is enabled), so one link carries ~50% and the other 0%. Different hash seeds fix that.]
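The effect of a per-node seed can be reproduced with a toy two-node model. The example uses keyed BLAKE2 from `hashlib` in place of the hardware hash, and the seed value is a made-up MAC address; both are assumptions for illustration only.

```python
import hashlib

def node_hash(flow_key: bytes, seed: bytes = b"") -> int:
    """Hash a flow key, mixing in an optional per-node seed (keyed BLAKE2 stand-in)."""
    digest = hashlib.blake2b(flow_key, digest_size=4, key=seed).digest()
    return int.from_bytes(digest, "big")

flows = [f"10.0.0.{i}|192.0.2.1|{1000 + i}|80|6".encode() for i in range(200)]

# Node 1 splits the flows over two links; keep what reaches node 2 via link 0.
via_link0 = [f for f in flows if node_hash(f) % 2 == 0]

# Without a seed, node 2 repeats node 1's decision: everything lands on link 0.
unseeded = [node_hash(f) % 2 for f in via_link0]

# With a seed derived from a (made-up) system MAC, node 2 decides independently.
seeded = [node_hash(f, seed=bytes.fromhex("0005860a1b2c")) % 2 for f in via_link0]
```

In the unseeded case the second link at node 2 receives nothing, which is the 50% / 0% split on the slide; with the seed both links are used again.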

Page 23

MULTICAST TRAFFIC LOAD BALANCING

Notes
• Only relevant in the context of aggregated Ethernet (ECMP join load balancing is managed by the downstream)
• In enhanced-ip mode the algorithm behaves exactly the same as for unicast traffic
  • Same fields selected for hashing
  • Same hash computation procedure
  • Same member links are selected

Page 24

HASH CONFIGURATION

IPv4 hash configuration

forwarding-options {
    enhanced-hash-key {
        family inet {
            incoming-interface-index;
            gtp-tunnel-endpoint-identifier;
            no-destination-port;
            no-source-port;
            type-of-service;
        }
    }
}

IPv6 hash configuration

forwarding-options {
    enhanced-hash-key {
        family inet6 {
            incoming-interface-index;
            gtp-tunnel-endpoint-identifier;
            no-destination-port;
            no-source-port;
            traffic-class;
        }
    }
}

MPLS hash configuration

forwarding-options {
    enhanced-hash-key {
        family mpls {
            incoming-interface-index;
            label-1-exp;
            no-payload;
            no-ether-pseudowire; /* 13.3R3 */
        }
    }
}

CCC/VPLS/Bridge hash configuration

forwarding-options {
    enhanced-hash-key {
        family multiservice {
            incoming-interface-index;
            no-mac-address;
            no-payload;
            outer-priority;
        }
    }
}

Page 25

SYMMETRIC LOAD BALANCING

Page 26

SYMMETRIC LOAD BALANCING

Problem statement
• The same flow should reach the same stateful appliance irrespective of the path (through MX1 or MX2)
• The reverse flow should reach the same stateful appliance

Solution
• Disable the router hash seed
• Synchronize the link order through link-index configuration
• The second problem is solved on Trio automatically

[Diagram: MX1 and MX2 connected to the service appliances; Flow A->B and Flow B->A.]

Page 27

CONSISTENT HASHING

Page 28

CONSISTENT HASHING

Enabling highly efficient L3/L4 Load Balancing

Problem statement
• L3-L4 load balancing between servers should remain consistent in failure scenarios (when a server goes down or when it recovers)
• Need to detect and react to server failures

Solution
• Use EBGP for server health checks
• Use modified Equal Cost Multipath to distribute traffic

[Diagram: MX with eBGP sessions to Server 1, Server 2, ... Server N (N = 1..64), each labeled Virtual IP A.]

Page 29

CONSISTENT HASHING IMPLEMENTATION DETAILS

Flow (hash bucket) to ECMP next-hop mapping table in time

All servers active
  Server 1: Flow 1, Flow 2
  Server 2: Flow 3, Flow 4
  Server 3: Flow 5, Flow 6

Server 2 / Link 2 fails
  Server 1: Flow 1, Flow 2, Flow 3
  Server 2: (none)
  Server 3: Flow 5, Flow 6, Flow 4

Server 2 recovers
  Server 1: Flow 1, Flow 2
  Server 2: Flow 3, Flow 4
  Server 3: Flow 5, Flow 6

[Diagram: MX with eBGP sessions to Server 1, Server 2 and Server 3, each labeled Virtual IP A.]
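The mapping table behaviour in this example can be modelled with a fixed bucket table in which only the buckets of a failed server are re-pointed. The class below is a sketch under assumptions: the name `ConsistentBuckets`, the six-bucket table and the round-robin choice of a substitute server are invented for the illustration.

```python
class ConsistentBuckets:
    """Fixed table of hash buckets, each pointing at a server."""

    def __init__(self, servers, num_buckets=6):
        self.servers = list(servers)
        # "Home" owner of every bucket, assigned in contiguous blocks.
        self.home = [self.servers[i * len(self.servers) // num_buckets]
                     for i in range(num_buckets)]
        self.active = set(self.servers)
        self.table = list(self.home)

    def _rebuild(self):
        survivors = [s for s in self.servers if s in self.active]
        if not survivors:
            raise RuntimeError("no active servers left")
        spare = 0
        for bucket, owner in enumerate(self.home):
            if owner in self.active:
                self.table[bucket] = owner  # unaffected flows stay put
            else:
                # Spread the orphaned buckets over the surviving servers.
                self.table[bucket] = survivors[spare % len(survivors)]
                spare += 1

    def fail(self, server):
        self.active.discard(server)
        self._rebuild()

    def recover(self, server):
        self.active.add(server)
        self._rebuild()

lb = ConsistentBuckets(["Server 1", "Server 2", "Server 3"])
before = list(lb.table)
lb.fail("Server 2")
during = list(lb.table)
lb.recover("Server 2")
after = list(lb.table)
```

Buckets 1 to 6 play the role of Flow 1 to Flow 6: when Server 2 fails only buckets 3 and 4 move, and they return to Server 2 when it recovers. A plain modulo over the list of live servers would instead reshuffle most flows.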

Page 30

CONSISTENT HASHING CONFIGURATION, SOFTWARE SUPPORT

Configuration

policy-options {
    policy-statement c-hash {
        from {
            route-filter ${virtual_ip};
        }
        then {
            load-balance consistent-hash;
        }
    }
}
protocols {
    bgp {
        group server-group {
            import c-hash;
        }
    }
}

Software and hardware
LINE CARD: All Trio
JUNOS: 13.3R3
LEVEL: ECMP only
OTHER: Unicast only
SCALING: <1000 ECMP NHs

Page 31

THEORETICAL LOAD BALANCING EFFICIENCY ANALYSIS

Page 32

HOW MANY FLOWS DO WE NEED?

More flows will
• Improve load balancing efficiency (or reduce imbalance)
• Reduce imbalance probability

Some definitions
• Positive imbalance: difference between the max link rate and the expected average
• Tolerance limit: % of capacity that is allowed to be wasted

[Chart: rates of links 1 to 8, marking the max link rate, the expected average and the positive imbalance between them.]

Page 33

ESTIMATING THE FLOW COLLISION PROBABILITY

Traffic model
• N equal traffic flows are sent over M equal paths (or distributed between M member links).
• Traffic flows are balanced between paths using a hash. The hash function produces uniform results, so the probability of a flow taking a specific path is 1/M.
• Balancing is done for each flow independently, i.e. if one flow took path 1 with probability 1/M, another flow will take this path with the same probability.

Bernoulli's trial scheme applies in this case
• A given path is selected with probability 1/M.
• Any of the other paths is selected with probability 1 - 1/M.

Probability of K flows hitting the same link:

$$P(K) = \binom{N}{K} \left(\frac{1}{M}\right)^{K} \left(1 - \frac{1}{M}\right)^{N-K}$$

[Chart: 64 flows distributed over 8 links, probability of K flows hitting the same link. X axis: N flows (0 to 64); Y axis: probability (0% to 16%).]
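The binomial model above is easy to evaluate directly. The function and variable names below are illustrative, and the numbers reproduce the case shown on the chart (N = 64 flows, M = 8 links).

```python
from math import comb

def p_flows_on_link(k: int, n: int, m: int) -> float:
    """Probability that exactly k of n flows land on one given link out of m."""
    p = 1 / m
    return comb(n, k) * p**k * (1 - p)**(n - k)

# Distribution for 64 flows over 8 links, K = 0 .. 64.
distribution = [p_flows_on_link(k, 64, 8) for k in range(65)]
most_likely = max(range(65), key=lambda k: distribution[k])
```

The most likely outcome is the expected average of 8 flows per link, with a probability of roughly 15%, which matches the peak of the chart.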

Page 34

TAKING TOLERANCE INTO ACCOUNT

How to find the imbalance probability
• Define the tolerance limit (25% in this case, i.e. up to 10 flows mapped to a single link is acceptable)
• Sum up the probabilities of the undesired outcomes (11 or more flows mapped to a link)

Some results
• With a 25% imbalance target, the probability of staying within this target is 82.96%
• To reach 99.99% probability, the number of flows needs to be increased to 1605

[Chart: 64 flows distributed over 8 links, probability of K flows hitting the same link; outcomes in green are within the 25% tolerance. X axis: N flows (0 to 64); Y axis: probability (0% to 16%).]
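The 82.96% figure can be checked by summing the binomial probabilities up to the tolerance limit for one given link. The helper below is a sketch with illustrative names; it covers only the 64-flow case and does not attempt to reproduce the 1605-flow result.

```python
from math import comb, floor

def p_within_tolerance(n: int, m: int, tolerance: float) -> float:
    """Probability that one given link carries at most (1 + tolerance) * n / m flows."""
    limit = floor((1 + tolerance) * n / m)
    p = 1 / m
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(limit + 1))

# 64 flows, 8 links, 25% tolerance: up to 10 flows per link are acceptable.
p_ok = p_within_tolerance(64, 8, 0.25)
p_imbalance = 1 - p_ok
```

`p_ok` comes out at about 0.8296, so the undesired outcomes (11 or more flows on the link) carry the remaining 17% or so.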

Page 35

ADAPTIVE LOAD BALANCING

Page 36

ADAPTIVE LOAD BALANCING OVERVIEW

Ingress PFE (from WAN, to fabric): Parse packet -> Compute hash -> Lookup route -> Select next-hop

Monitor utilization and adjust mapping
• Hash Buckets: buckets 1, 2, ... N, each mapped to a LAG link [1 .. M]
• Rate Table: entries 1, 2, ... N, one rate per hash bucket

Implementation details
• Track traffic rate per hash bucket
• Re-map hash buckets periodically if imbalance crosses a threshold
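One possible re-mapping policy, given per-bucket rates, is a greedy loop that moves buckets from the busiest link to the idlest one while the positive imbalance stays above a threshold. The slide does not describe the actual policy, so the function name, the 10% default threshold and the greedy choice are all assumptions.

```python
def rebalance(bucket_rates, bucket_to_link, num_links, threshold=0.1):
    """Return a new bucket-to-link mapping with reduced positive imbalance."""
    mapping = list(bucket_to_link)
    average = sum(bucket_rates) / num_links
    for _ in range(len(mapping)):  # bounded number of moves
        loads = [0.0] * num_links
        for bucket, link in enumerate(mapping):
            loads[link] += bucket_rates[bucket]
        busiest = max(range(num_links), key=loads.__getitem__)
        idlest = min(range(num_links), key=loads.__getitem__)
        if average == 0 or (loads[busiest] - average) / average <= threshold:
            break
        gap = loads[busiest] - loads[idlest]
        # Only buckets smaller than the gap lower the maximum when moved.
        movable = [b for b, link in enumerate(mapping)
                   if link == busiest and 0 < bucket_rates[b] < gap]
        if not movable:
            break
        # Prefer the bucket whose rate is closest to half of the gap.
        best = min(movable, key=lambda b: abs(gap / 2 - bucket_rates[b]))
        mapping[best] = idlest
    return mapping

rates = [5.0, 1.0, 1.0, 1.0]   # bucket 0 carries a high volume flow
start = [0, 0, 1, 1]           # uniform default mapping: link loads 6 and 2
adjusted = rebalance(rates, start, num_links=2)
```

In this toy case the small bucket sharing a link with the heavy one is moved away, bringing the link loads from 6 and 2 to 5 and 3. The heavy bucket itself cannot be split, which is why some imbalance remains.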

Page 37

ADAPTIVE AT WORK

Example
• Balancing towards the network core (8 links in a group)
• Many small flows
• Very few high volume flows

Results
• Without adaptive balancing, flows are distributed in a uniform way, but link rates differ because of the high volume flows
• With adaptive, the imbalance is compensated

[Diagram: traffic with high volume flows spread over Link #1, Link #2, ... Link #8; chart of rates per hash bucket (1 .. N).]

Page 38

ADAPTIVE MAPPING OF HASH BUCKETS

[Chart: link rates for links 1 to 8, sample uniform (default) mapping of hash buckets to links.]

[Chart: link rates for links 1 to 8, sample adaptive mapping of hash buckets to links, with the savings marked.]

Page 39

LAB VERIFICATION

[Chart: interface rate (Gbps, 1.25 to 1.7) over time (MM:SS, 42:00 to 53:00) for Link 1 to Link 6, annotated "Adaptive balancing enabled" and "Adjustment made".]

Page 40

MX ADAPTIVE LOAD BALANCING SUPPORT

Software and hardware
INGRESS LINE CARD: Trio
EGRESS LINE CARD: Trio, DPC
MX MIXED MODE: Yes
JUNOS: 12.3R4

Features
LEVEL: Only across LAG members (no ECMP)
OTHER: Tracks usage and compensates imbalance for unicast traffic only; multicast is load balanced in a regular way

OTHER NOTES
• Optimization is local to the ingress PFE. With multiple ingress PFEs, each ingress PFE compensates imbalance on its own
• Hash bucket counters are maintained per egress IFL
• Multi-LU line cards are supported (MPC3, MPC4)

Page 41

STATEFUL LOAD BALANCING

Page 42

STATEFUL LOAD BALANCING OVERVIEW

Ingress PFE (from WAN, to fabric): Parse packet -> Compute hash -> Lookup route -> Map to hash bucket -> Select a link for a new bucket -> Select next-hop

Hash Buckets: buckets 1, 2, ... N, each mapped to a LAG link [1 .. M]

Implementation details
• Initially all hash buckets point to void
• Map the packet to a hash bucket
• If a hash bucket does not point to a link, incrementally choose a link
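The three implementation steps can be sketched as a lazily filled bucket table. Reading "incrementally choose a link" as round-robin assignment is an assumption, as are the class name, the bucket count and the CRC-32 stand-in hash.

```python
import zlib

class StatefulBalancer:
    """Hash buckets start empty and are bound to links on first use."""

    def __init__(self, num_links: int, num_buckets: int = 1024):
        self.num_links = num_links
        self.buckets = [None] * num_buckets  # None means "points to void"
        self.next_link = 0

    def link_for(self, flow_key: bytes) -> int:
        bucket = zlib.crc32(flow_key) % len(self.buckets)
        if self.buckets[bucket] is None:
            # First packet seen for this bucket: take the next link in turn.
            self.buckets[bucket] = self.next_link
            self.next_link = (self.next_link + 1) % self.num_links
        return self.buckets[bucket]

lb = StatefulBalancer(num_links=4)
first = lb.link_for(b"flow-a")
again = lb.link_for(b"flow-a")
second = lb.link_for(b"flow-b")
```

Because links are handed out in turn as new buckets appear, a small number of equal-size flows ends up spread evenly, which a plain hash cannot guarantee. The cost is the per-bucket state kept on the ingress PFE.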

Page 43

MX STATEFUL LOAD BALANCING SUPPORT

Software and hardware
INGRESS LINE CARD: Trio
EGRESS LINE CARD: Trio, DPC
MX MIXED MODE: Yes
JUNOS: 12.3R3

Features
LEVEL: Only across LAG members (no ECMP)
OTHER: Unicast traffic only; multicast follows regular hashing

OTHER NOTES
• Mapping is local to the ingress PFE. With multiple ingress PFEs, each ingress PFE maintains its own mapping
• Multi-LU line cards are not supported (MPC3, MPC4)

Page 44

USAGE GUIDELINES

Regular
• Best for multiple flows of the same size
• Note: use formulas to estimate the number of flows (threshold)

Adaptive
• Best for multiple flows with few high volume flows
• Requires more NPU memory

Stateful
• Best for few flows of the same size
• Requires more NPU memory

[Charts: one per mode, N flows versus flow rate, with a threshold marked.]

Page 45

RELATED FEATURE LIST AND SCALING

Feature list

SW        FEATURE
10.2      Baseline Trio implementation
11.4R6    Turn off hash calculation based on Layer-4 information for fragments
12.3R2    GTP TEID hash inclusion
12.3R2    Introduce length checks in the heuristic algorithm
12.3R3    Increase number of links in a LAG to 64
13.3R3    Selectively disable hash computation for pseudo-wires only
13.3R3    Consistent Hashing

Current scaling

ECMP PATHS     64
LAG MEMBERS    64
LAG GROUPS     496

Page 46