A Single Chip Terabit Switch
Transcript of A Single Chip Terabit Switch
1
A Single Chip Terabit Switch
Speaker: Fred Heaton
Authors: Bill Dally,Wayne Dettloff, John Eyles,Trey Greer, John Poulton, Teva Stone, Steve Tell
2
Architecture Overview
140 998 Mbps - 3.125Gb/s
Transmitters
140 998 Mb/s - 3.125Gb/s
Receivers
140 x 1403.125Gb/s
Switch
Core
8 bit uP interface
• Re-usable, Low Power Tx/Rx Module
• Rx Clock/Data recovery
• Adaptive termination
• Lane-by-lane selectable data rates
3 selectable Reference clock inputs
5 selectable multiplier settings
•Asynchronous switch operation
•User selectable Transmitter Pre-emphasis
•Scalable architecture
3
Applications3/5 Stage Clos Network configurations
9800 port SNB 3 stage Clos network (30+ Tbps)Protection switchingVideo routingDWDM OEO switching applications
WAN SONET Ring WAN SONET Ring
OEO switch
will be
replaced
by
4
Receiver
140 x 1403.125Gb/s
Switch
Core
RxRX0 ±±±±
Clock Gen
RxDat
Host Interface,
Controller, and
Global logic
8 bit uP Interface
MDIO/MDC
RFCC ±
RFCB ±
RFCA ±
receiver assembly N
receiver assembly 2
receiver assembly 1
receiver assembly 0
RxToggle
8
CDR
To Switch
5
Switch
140 x 1403.125Gb/s
Switch
Core
Crosspoint switch fabric
140 x 140 x 9
•Full reconfiguration and multicast support
•Asynchronous operation
•Differential data path design
•Byte wide datapath + 1 “clock”
From Rx
To Tx
6
Transmitter
140 x 1403.125Gb/s
Switch
Core
Tx
TX0 ±
Clo
ck
Gen
.
TxD
at
TxT
oggle
Switch
control
Host Interface
Controller, and
Global logic
RFCC ±
RFCB ±
RFCA ±
host
control buses
switch
select
tra
nsm
itte
r a
ssem
bly
N
tran
smit
ter
ass
emb
ly 2
tran
smit
ter
ass
emb
ly 1
8
From
Switch
CDR
8 bit uP Interface
MDIO/MDC
• Differential, CML signaling
• Transmitter Pre-emphasis
• On-chip adaptive termination
7
Technology
0.18 micron, 7 Layer MetalFlip-chip die attachCeramic BGA package1mm ball pitch
36 x 36 full array (1296 pins)
168 Vdd pins (1.8 volt nominal)
13 Ivdd pins (2.5/3.3 volt nominal)
473 Ground pins
8
Electrical/Performance Specs
Rx Input Sensitivity - 35 mVJitter Tolerance - 0.65 UIJitter Generation - 0.25 UIPlesiochronous tracking of 200ppm differencebetween reference and embedded data clockSwitch Reconfiguration time ~30uSClock Lock time ~9uSSelectable output levels (675 mV max)Power Dissipation18 Watts @ 2.5 Gbps (1.8 V operation)
22 Watts @ 3.125 Gbps (1.8V operation), 130mW/Tx-Rx pair
9
Receiver
...D
ESD
Programmable
Termination
E
D
E
eclkP
dclkN
eclkN
dclkP
Data/EdgeSamplers
Early/Late
Logic
early
late LPF
720° Phase
Interpolator
dclkP,dclkN
eclkP,eclkN
Clock
Multiplier
RFC
A,B,C
Multiplier
settings
8/10
210-1 PRBS
Checker
to switch
RX+,RX-RXDAT[9:0]
RXTOGGLE
E0
D0
E1
D1
Quadrature clocks
10
Early Late Logic
D0
E0D1
E1 DDLogic threshold
!=
late
= != =
late
dclkP
dclkN
eclkP
eclkN
!=
early
!= != !=
early- Data sample
Logic threshold
1
011 00
000 1
11
Switch Design
Weak
driver
Strong
driver
resistor
row
Diff. ampE
dg
e
Detecto
r
Ed
ge
Detecto
r
Column driver
Colu
mn
Diff. amp
Out+/-
•Asynchronous, low-swing differential design
•Pre-emphasis technique used to drive heavily
loaded row/column lines
•Weak driver and “resistors” used to maintain
differential voltage at DC
• 3 Watts worst case power dissipation at 3.125
Gbps, 2 volt operation
NFET H bridge driver
Edge
Detector
In+
In-
Row driver
Edge
Detector
12
Nominal Spice simulation, 2.5 Gbps operation
Row near end
Row far end
In
Col near end
Out
Xpt amp
3.2ns
260mV
780mV
13
Transmitter
...D
ESD
Programmable
Termination
D
dclkP
dclkP
Tx CDR
720° Phase
Interpolator
dclkP,dclkN
eclkP,eclkN
Clock
Multiplier
RFC
A,B,C Multiplier
settings
8/10
210-1 PRBS
Generator
from switch TX+,TX-
TXDAT
TXTOGGLE
Driver
Quadrature clocks
14
Transmitter Current Steering Element
G
0
G
G
d0
dclk
Vbn
Vbn
dclk[0]
en0N
TX+TX-
s1
s0
• 4 identical sets of current steering elements
•Separate staggered clocks provided to each for slew rate control
•Programmable bias generator (5 settings)
•Equalization tap (5 settings)
Vbn
Vbn
Vbn
Di-bit data
1
0
1
0
1
0
1
d1
d0
d0
d0
d1
d1
d1
dclk
dclk
dclk
GG
GG
dclkdclk
d1
d0
en1N
en2N
en3N
EQ
Bias
Gen.
15
Interconnect Performance @ Gigabit rates
Skin effect attentuation - freq relationshipDielectric loss - to first order, linear attenuation with frequencySerial transmitters use pre-emphasis to compensate for signaldistortion effects of FR4
1.0 2.0 3.0
Frequency (GHz)
Am
pli
tud
e (d
B)
0.0
-9.0
-6.0
-3.0
FR4 dielectric loss
Skin effect loss
(trace geometry)Total trace loss
16
Example Waveform (No Pre-emphasis)
0.20.0
Time (ns)
Am
pli
tud
e (V
)
0.75
0.0
0.25
0.50
0.4 0.6 0.8 1.0 1.2 1.4
Transmitted pulse
Received pulse
17
Example Waveform (Pre-emphasis)
0.20.0
Am
pli
tud
e (V
)
0.75
0.0
0.25
0.50
0.4 0.6 0.8 1.0 1.2 1.4
-0.25
Pre-Emphasis transmitted pulse
Pre-Emphasis pulse response
Response w/ out pre-emphasis
a
b
1.00
� Pre-Emphasis 1-tap equation:
Vpr(n) = a * Vi(n) – b * Vi(n-1)
Narrows pulse which opens the transmit eye at the
receiver
18
At Transmitter: No Pre-Emphasis Signal at Receiver
At Transmitter: Pre-Emphasis Enabled Signal at Receiver
Pre-Emphasis at 2.5Gb/s, PRBS data:
40” FR4RXTX
19
On-chip Processor
8 bit Microprocessor4K x 14 bit instruction ROM
4K x 14 bit instruction RAM
512 x 8 bit data RAM
External EEPROM interface
On reset processor performsOn-chip register initialization
Default switch configuration
Receiver offset trim
Termination resistor calibration
Ongoing polling loopRegister updates, termination resistor updates
20
210-1 PRBS, 3.577 Gbps, 4” FR4
650mV
Duty cycle distortion present due to internal clock load imbalance
Eye reduction in both the time domain and amplitude
21
Terabit Switch Building Blocks
Crosspoint switch demonstratestwo key building blocksHigh-speed I/Os
High-performance switch core
Different switches can berealized by adding application-specific logic between theseGrooming switchFramers and time-slot interchangers
Cell switchScheduler and queueing
140 x 140
3.125Gb/s
Switch
Core
Application-Specific
Logic
140 3.125Gb/s
Transmitters
140 3.125Gb/s
Receivers
22
Other Products
VC2002 SONET/SDH Grooming Switch72 Integrated 2.5Gb/s I/O port pairs
SONET Input and Output Processing
SONET Input and Output Processing
ST-192(c) Support
STS-1 level switching
(3456 x 3456 STS-1 Switch)
multi-stage scalable architecture
VC1001/2/3 Octal SERDESfull-duplex, 1Gbps - 3.125 Gbps, 1.5 - 2.1 WattsQuad version also available