Architecture for a 10-Gigabit Ethernet Standard -...

26
1 S. Muller - Sun IEEE 802.3 HSSG Architecture for a 10-Gigabit Ethernet Standard Shimon Muller Sun Microsystems Computer Company IEEE 802.3 High Speed Study Group June 1, 1999

Transcript of Architecture for a 10-Gigabit Ethernet Standard -...

1

S. Muller - Sun

IEEE 802.3

HSSG

Architecture for a 10-Gigabit EthernetStandard

Shimon MullerSun Microsystems Computer Company

IEEE 802.3 High Speed Study GroupJune 1, 1999

2

IEEE 802.3

HSSG S. Muller - Sun

Outline

■ Introduction

■ Market Requirements

■ Customer Requirements

■ Standards Requirements

■ Multi-Gigabit Transmission Options

■ 10-Gigabit Ethernet Objectives

■ Architecture

■ Functional Partitioning

■ Principles of Operation

■ Physical Coding and Framing Requirements

■ Media Independent Interfaces

■ Error Handling

■ Summary

3

IEEE 802.3

HSSG S. Muller - Sun

Introduction --- Market Requirements

■ The expectation is that scaling Ethernet to 10Gb/s will necessarilyexpand the scope of its application space

■ “Ethernet everywhere” (or at least “wherever you can get away with it”)

■ Everything from data center and computer room clusters, through traditionalLAN backbone and desktop applications, to MAN/RAN/WAN

4

IEEE 802.3

HSSG S. Muller - Sun

Introduction --- Customer Requirements

■ The motivations and constraints for each one of the envisionedapplications are quite different

■ Long Haul

■ MAN, RAN and WAN

■ Must operate over existing very long fiber links

■ Requires high coding efficiency

■ Not very sensitive to cost

■ Does not address any specific problem for traditional Ethernet users

■ Intermediate Haul

■ Traditional LAN backbones

■ Must operate over the existing cabling infrastructure

■ Coding efficiency is not an issue

■ Cost sensitive

■ Will address future problems of backbone congestion in Ethernet LANs

5

IEEE 802.3

HSSG S. Muller - Sun

Introduction --- Customer Requirements (continued)

■ Short Haul

■ Computer room clusters, server-switch and inter-switch interconnects

■ Cabling infrastructure is not an issue

■ Coding efficiency is not an issue

■ Very cost sensitive

■ Will address existing hot spots in today’s networks

6

IEEE 802.3

HSSG S. Muller - Sun

Introduction --- Standards Requirements

■ “One solution that fits all” for the 10-Gigabit Ethernet Standardwill be sub-optimal for at least one major market segment

■ The standard should be able to accommodate multiple solutionsthat will address the divergent market requirements

■ The various options must be constrained to the Physical Layer

■ Specific Requirements:

■ The MAC should be modified one more time and made “truly” speed-independent

■ Media independent interfaces should be defined below the MAC

■ Several Physical Layers can be defined, that attach to these interfaces

■ This approach is fully compatible with previous 802.3 practice

7

IEEE 802.3

HSSG S. Muller - Sun

Multi-Gigabit Transmission Options

■ “Go Faster”

■ Architecturally straightforward

■ Scale everything by a factor of 10

■ Implementation and cost challenges

■ Requires very high-speed Physical Layer transmission components

■ Significant portions of logic need to be clocked at very high frequencies

■ “Go Wider”

■ Striping of the data stream across multiple transmission channels

■ Can be implemented using proven existing technology

■ Alleviates the very high-speed logic design requirements

■ Will provide a much cheaper alternative in a variety of network environments

■ The transmission channels can be separate physical links (ribbon fiber) or asingle physical link that carries multiple logical channels (WDM)

■ The 10-Gigabit Ethernet standard should accommodate both theserial and the parallel transmission schemes

8

IEEE 802.3

HSSG S. Muller - Sun

Multi-Gigabit Transmission Options (continued)

■ Coarse Granularity Striping

■ The channels’ convergence point is above the MAC Layer --- 802.3adLink Aggregation

■ Distribution/collection typically implemented in s/w or in the switching fabric

■ High-speed operation achieved only when multiple Layer 2/3/4 “flows”can be aggregated

■ For a given “flow”, the throughput and the latency are limited by thespeed of a single channel

■ Fine Granularity Striping

■ The channels’ convergence point is below the MAC Layer

■ Distribution/collection implemented in h/w as part of the Physical Layer

■ Striping of the MAC data stream performed with byte granularity

■ From the MAC Client’s perspective the performance of the aggregate isidentical to that of a single high-speed link

9

IEEE 802.3

HSSG S. Muller - Sun

10-Gigabit Ethernet Objectives

■ Support the speed of 10Gb/s at the MAC/PLS service interface

■ Preserve the 802.3 frame format at the MAC Client service interface

■ Preserve the minimum and maximum frame sizes of current 802.3 standard

■ Support simple forwarding between 10Gb/s, 1Gb/s, 100Mb/s and 10Mb/s

■ Provide support for Full Duplex operation only (no CSMA/CD)

■ Meet all 802 functional requirements, with the possible exception of Ham-ming Distance

■ Support star-wired topologies

■ Support media selected from ISO/IEC 11801

■ Provide a family of Physical Layer specifications which support links of:

■ At least 100m (?) over multi-mode multi-fiber bundles

■ At least 300m (?) over multi-mode single-fiber cable

■ At least 3km (?) over single-mode single-fiber cable

■ At least 50km (?) over single-mode single-fiber cable

■ Provide specifications for optional Media Independent Interfaces

10

IEEE 802.3

HSSG S. Muller - Sun

Architecture --- Functional Partitioning

PCS

PMA

PMD

MEDIUM

PCS

PMA

PMD

MEDIUM

PCS

PMA

PMD

MEDIUM

PCS

PMA

PMD

MEDIUM

PCS

PMA

PMD

MEDIUM

DISTRIBUTOR/COLLECTOR

RECONCILIATIONRECONCILIATION

1Gb/s 1Gb/s

LOWER SPEEDS

MAC --- MEDIA ACCESS CONTROL

LLC --- LOGICAL LINK CONTROL

MAC CONTROL (OPTIONAL)

HIGHER LAYERS

10-GMII

2-GMII

MDI

10-GMII

MDI

ST

RIP

ED

PH

Y

SE

RIA

L P

HY

11

IEEE 802.3

HSSG S. Muller - Sun

Architecture --- Functional Partitioning (continued)

2-TBI156.25MHz/312.50MBd

2-GMII156.25MHz/312.50MBd

MACClientI/F

10-GbEthernet MAC

Distributor

Collector

8B/10B PCS

8B/10B PCS

8B/10B PCS

8B/10B PCS

SER-DES

SER-DES

SER-DES

SER-DES

SER-DES

“Serial” PCS

10-GMII156.25MHz/312.50MBd

MDI3.125GBd

MDI 12.5GBd???

32

32

8

8

8

8

8

8

8

8

10

10

10

10

10

10

10

10

2

2

2

2

2

2

2

2

2

2

?

?

12

IEEE 802.3

HSSG S. Muller - Sun

Architecture --- Principles of Operation

■ Assumptions:

■ A striped bundle contains a fixed number of channels

■ Equal to four

■ All the channels in a bundle have the same nominal speed

■ Equal to 3.125GBd

■ No partial operation is supported

■ For a striped bundle to be considered operational, all the channels in the bun-dle must be operational

■ The end points of a bundle are properly wired

■ In the same order and contiguous

■ The maximum skew between the channels in a bundle is bounded

13

IEEE 802.3

HSSG S. Muller - Sun

Architecture --- Principles of Operation (continued)

■ Distributor:

■ Operates in an open loop

■ Not required to consider the skew at the receiving side or the receiver state

■ Accepts a contiguous byte stream from the MAC (frames and idles) anddivides it into four sub-streams (“mini-frames” and “mini-idles”)

■ Round-robin arbitration

■ Byte granularity

■ Starting point is arbitrary, but the arbitration is contiguous afterwards

■ The first and/or last bytes of a packet may be sent over any channelwith no restrictions

■ The Distributor uniquely “marks” the first and last bytes of a packet

■ The Distributor enforces packet sequencing during the Inter-Packet Gap

14

IEEE 802.3

HSSG S. Muller - Sun

Architecture --- Principles of Operation (continued)

■ Collector:

■ Contains an elasticity buffer per channel

■ Compensate for the worst case skew between the channels

■ Synchronizes the channels using sequence information received duringthe Inter-Packet Gap

■ Reassembles “mini-frames” into packets and sends them to the MAC

■ Reassembly occurs only if all the channels are “in synch” and contain at leasta partial “mini-frame”

■ Uses the uniquely “marked” first and last bytes of a packet to determine packetboundaries across the channels

■ Packet reassembly starting point is determined by the “first” byte of a packet

■ After the first byte, reassembly is round-robin with byte granularity

■ Packet reassembly end point is determined by the “last” byte of a packet

15

IEEE 802.3

HSSG S. Muller - Sun

Physical Coding and Framing Requirements

■ Leverage from 1000BASE-X to the extent possible

■ Enhance the 1000BASE-X PCS to deal with following constraints:

■ 1. Preamble Shrinkage

■ The minimum Preamble of a “mini-frame” may be reduced to one byte (7/4)

■ 1000BASE-X encapsulation can extend the IPG by one symbol at the expenseof the Preamble

■ 1000BASE-X encapsulation requires at least one symbol of Preamble for SPD

■ Solution:

■ Increase the Preamble field of the 10-Gigabit Ethernet frame by one byte (total of eight)

■ 2. Inter-Packet Gap Shrinkage

■ The minimum IPG between “mini-frames” may be reduced to three bytes (12/4)

■ 1000BASE-X encapsulation allows for an EPD of a maximum of three symbols

■ This can potentially eliminate the Idle signalling between frames

■ Solution:

■ Define a new EPD of no more than two symbols

16

IEEE 802.3

HSSG S. Muller - Sun

Physical Coding and Framing Requirements (cont.)

■ 3. Packet Sequencing Signalling

■ Allows for channel synchronization in the Collector

■ Enhances error robustness

■ To avoid additional overhead, packet sequencing is enforced during the IPG

■ Solution:

■ Define multiple flavors of the IDLE ordered set

■ 4. Packet and Frame Delimiters

■ “Mini-frame”-to-Packet reassembly in the Collector requires multiple flavors ofstart and end dilimiters

■ Solution:

■ Define an SPD code-group that indicates the star t of a “mini-frame” on one of the chan-nels AND the star t of a MA C packet for the entire aggregate

■ Define an SFD code-group that indicates the star t of a “mini-frame” ONLY on the remain-ing three channels

■ Define an EPD code-group that indicates the end of a “mini-frame” on one of the chan-nels AND the end of a MA C packet for the entire aggregate

■ Define an EFD code-group that indicates the end of a “mini-frame” ONLY on the remain-ing three channels

17

IEEE 802.3

HSSG S. Muller - Sun

Media Independent Interfaces

■ Standard media independent interfaces are a good thing to have!

■ Provide interoperable points of attachment between components frommultiple vendors

■ Allow for clean architectural partitioning between functional modules

■ Simplify the standard’s specification

■ Multiple MII compliance interfaces may be needed for the 10-Giga-bit Ethernet Standard

■ Allows for various levels of silicon integration in implementations

■ All the interfaces are based on the same concepts that were usedfor Gigabit Ethernet

■ Utilize the full duplex subset of the GMII and the TBI

■ Trade-off between the absolute minimum number of signals used, with-out substantially increasing the baud rate

■ All the interfaces are defined such that they can be overlaid one on topof the other

18

IEEE 802.3

HSSG S. Muller - Sun

Media Independent Interfaces (continued)

■ 10-GMII

■ 156.25MHz clocks

■ 25% increase compared to GMII

■ Both edges of the clocks used for data transfer

■ 32-bit data bus in each direction (TXD/RXD)

■ Increased from 8 on the GMII

■ 2-bit VLD bus in each direction

■ Indicates the number of valid bytes on TXD/RXD

■ Full Duplex operation only

■ GMII COL and CRS signals removed

■ GMII Carrier Extension encodings on TXD/RXD removed

19

IEEE 802.3

HSSG S. Muller - Sun

Media Independent Interfaces (continued)

■ 2-GMII

■ 156.25MHz clocks

■ Both edges of the clocks used for data transfer

■ 8-bit data bus in each direction (TXD/RXD)

■ Same as GMII

■ 1-bit Packet Delimiter (PD) signal in each direction

■ Indicates the first and the last bytes of a MAC packet

■ Full Duplex operation only

■ GMII COL and CRS signals removed

■ GMII Carrier Extension encodings on TXD/RXD removed

■ 2-TBI

■ 156.25MHz clocks

■ Both edges of the clocks used for data transfer

■ 10-bit data bus in each direction (TX/RX)

20

IEEE 802.3

HSSG S. Muller - Sun

Media Independent Interfaces (continued)

PA1PA2PA3PA4

PA5PA6PA7SFD

D1D2D3D4

D57D58D59D60

CRC1CRC2CRC3CRC4

TX_EN/RX_DV

CLK

VLD[1:0]

D[31:0]IPG1IPG2IPG3IPG4

IPG5IPG6IPG7IPG8

IPG9IPG10IPG11IPG12

PA1PA2PA3PA4

PA5PA6PA7SFD

D1D2D3D4

D57D58D59D60

D61CRC1CRC2CRC3

CRC4IPG1IPG2IPG3

0 0 0 1 0 3

IPG4IPG5IPG6IPG7

IPG8IPG9IPG10IPG11

IPG12PA1PA2PA3

PA4PA5PA6PA7

SFDD1D2D3

D56D57D58D59

D60D61CRC1CRC2

CRC4CRC4IPG1IPG2

20

IPG3IPG4IPG5IPG6

IPG7IPG8IPG9IPG10

IPG11IPG12PA1PA2

PA3PA4PA5PA6

PA7SFDD1D2

D3D4D5D6

20 0

10-GMII

Notes: - When TX_EN/RX_DV is asserted, an encoding of VLD=0 indicates four valid bytes on D[31:0].

- A non-zero encoding of VLD is only relevant when TX_EN/RX_DV is asserted.

TX_ER/RX_ER

21

IEEE 802.3

HSSG S. Muller - Sun

Media Independent Interfaces (continued)

2-GMII/2-TBI --- Channel 1

TX_EN/RX_DV

CLK

PD

D[7:0] PA1 PA5 D1 D57 CRC1 PA1 PA5 D1 D57 D61 CRC4 PA4 SFD D56 D60 CRC3 PA3 PA7 D3

SPD D D D D EFD IDLE2 SPD D D D D D EPD IDLE3 SFD D D D D EFD IDLE4 SFD D D8B/10B DCode

D D

TX_ER/RX_ER

IDLE1

Idle1 Idle4Idle3Idle2 Idle2 Idle2 Idle3 Idle3 Idle4 Idle4

CLK

PD

D[7:0] PA2 PA6 D2 D58 CRC2 Idle2 Idle2 Idle2 PA2 PA6 D2 D58 CRC1 Idle3 Idle3 Idle3 PA1 PA5 D1 D57 D61 CRC4 Idle4 Idle4 Idle4 PA4 SFD D4

SFD D D D D EFD IDLE2 SFD D D D D EFD IDLE3 SPD D D D D EPD IDLE4 SFD D DD D DD

Idle1

IDLE1

2-TBI

TX_EN/RX_DV

TX_ER/RX_ER

8B/10BCode

2-TBI

2-GMII/2-TBI --- Channel 2

22

IEEE 802.3

HSSG S. Muller - Sun

Media Independent Interfaces (continued)

CLK

PD

D[7:0] PA3 PA7 D3 D59 CRC3 Idle2 Idle2 Idle2 PA3 PA7 D3 D59 CRC2 Idle3 Idle3 Idle3 PA2 PA6 D2 D58 CRC1 Idle4 Idle4 Idle4 PA1 PA5 D1 D5

SFD D D D D EFD IDLE2 SFD D D D D EFD IDLE3 SFD D D D EFD IDLE4 SPD D DD D DD D

Idle1

IDLE1

PA4 SFD D4 D60 CRC4 Idle2 Idle2 Idle2 PA4 SFD D4 D60 CRC3 Idle3 Idle3 Idle3 PA3 PA7 D3 D59 CRC2 Idle4 Idle4 Idle4 PA2 PA6 D2 D6

SFD D D D D EPD IDLE2 SFD D D D D EFD IDLE3 SFD D D D EFD IDLE4 SFD D DD D DD D

Idle1

IDLE1

TX_EN/RX_DV

TX_ER/RX_ER

8B/10BCode

2-TBI

CLK

TX_EN/RX_DV

TX_ER/RX_ER

PD

D[7:0]

8B/10BCode

2-TBI

2-GMII/2-TBI --- Channel 3

2-GMII/2-TBI --- Channel 4

23

IEEE 802.3

HSSG S. Muller - Sun

Media Independent Interfaces (continued)

PA1PA2PA3PA4

PA5PA6PA7SFD

D1D2D3D4

D57D58D59D60

CRC1CRC2CRC3CRC4

IPG1IPG2IPG3IPG4

IPG5IPG6IPG7IPG8

IPG9IPG10IPG11IPG12

PA1PA2PA3PA4

PA5PA6PA7SFD

D1D2D3D4

D57D58D59D60

D61CRC1CRC2CRC3

CRC4IPG1IPG2IPG3

IPG4IPG5IPG6IPG7

IPG8IPG9IPG10IPG11

IPG12PA1PA2PA3

PA4PA5PA6PA7

SFDD1D2D3

D56D57D58D59

D60D61CRC1CRC2

CRC4CRC4IPG1IPG2

IPG3IPG4IPG5IPG6

IPG7IPG8IPG9IPG10

IPG11IPG12PA1PA2

PA3PA4PA5PA6

PA7SFDD1D2

D3D4D5D6

PA1 PA5 D1 D57 CRC1 PA1 PA5 D1 D57 D61 CRC4 PA4 SFD D56 D60 CRC3 PA3 PA7 D3Idle1 Idle4Idle3Idle2 Idle2 Idle2 Idle3 Idle3 Idle4 Idle4

SPD D D D D EFD IDLE2 SPD D D D D D EPD IDLE3 SFD D D D D EFD IDLE4 SFD D DD D DIDLE1

PA2 PA6 D2 D58 CRC2 Idle2 Idle2 Idle2 PA2 PA6 D2 D58 CRC1 Idle3 Idle3 Idle3 PA1 PA5 D1 D57 D61 CRC4 Idle4 Idle4 Idle4 PA4 SFD D4Idle1

SFD D D D D EFD IDLE2 SFD D D D D EFD IDLE3 SPD D D D D EPD IDLE4 SFD D DD D DDIDLE1

PA3 PA7 D3 D59 CRC3 Idle2 Idle2 Idle2 PA3 PA7 D3 D59 CRC2 Idle3 Idle3 Idle3 PA2 PA6 D2 D58 CRC1 Idle4 Idle4 Idle4 PA1 PA5 D1 D5Idle1

SFD D D D D EFD IDLE2 SFD D D D D EFD IDLE3 SFD D D D EFD IDLE4 SPD D DD D DD DIDLE1

PA4 SFD D4 D60 CRC4 Idle2 Idle2 Idle2 PA4 SFD D4 D60 CRC3 Idle3 Idle3 Idle3 PA3 PA7 D3 D59 CRC2 Idle4 Idle4 Idle4 PA2 PA6 D2 D6Idle1

SFD D D D D EPD IDLE2 SFD D D D D EFD IDLE3 SFD D D D EFD IDLE4 SFD D DD D DD DIDLE1

2-TBI --- Channels 1-4

2-GMII --- Channels 1-4

10-GMII

24

IEEE 802.3

HSSG S. Muller - Sun

Error Handling

■ Data Corruption Errors

■ Data symbol converted to another data symbol

■ Detected and handled by the MAC Layer, analogous to a single-channel trans-mission scheme

■ Per Channel Errors

■ Code violations, Framing errors, Disparity errors, etc.

■ Detected by the PCS in each channel

■ Propagated to the Collector over the 2-GMII using the RX_ER handshake

■ Propagated to the MAC over the 10-GMII using the RX_ER handshake

■ Affect the entire packet in progress

■ Distribution/Collection Errors

■ Length mismatch between “mini-frames” that belong to the same pack-et greater than 1 byte

■ Detected by the Collector

■ Propagated to the MAC over the 10-GMII using the RX_ER handshake

25

IEEE 802.3

HSSG S. Muller - Sun

Error Handling (continued)

■ Channel Skew Errors

■ Overflow of at least one per channel elasticity fifo in the Collector

■ Detected by the Collector

■ The entire packet is dropped in the Collector

■ Channel Synchronization Errors

■ At least one channel is out of synch

■ The channel received a “mini-frame” with a sequence number that does notmatch the expected one

■ The Collector drops packets until synchronization is reestablished

■ Collector cannot reestablish synchronization

■ Timer based

■ Collector reestablished synchronization incorrectly

■ Can happen if multiple “mini-frames” on a single channel “vanished”

■ Results in a very high rate of CRC errors, with no other errors apparent

■ Flush the pipe (link initialization or 802.3x)

26

IEEE 802.3

HSSG S. Muller - Sun

Summary

■ “One solution that fits all” for the 10-Gigabit Ethernet Standardwill not appropriately address the customer needs

■ The standards effort should be prioritized as follows:

■ Phase 1:

■ Overall architecture and structure of the standard

■ Changes to the MAC Layer

■ Media Independent Interfaces

■ Channelized transmission scheme

■ PCS definition based on the 8B/10B coding scheme

■ Physical Layer for ~100m over MMF bundles using SX lasers

■ Physical Layer for ~300m over MMF using LX lasers and WDM

■ Phase 2:

■ Serialized transmission scheme

■ New PCS definition based on a high efficiency encoding method

■ Physical Layer for ~3km over SMF using LX lasers

■ Physical Layer for ~50km over SMF using LX lasers