January 2007 RAMP Retreat

BEE3 Update

Chuck Thacker, Technical Fellow

Microsoft Research, 11 January 2007

Outline

• What is BEE3?

• BEE2-BEE3 Differences

• Project participants

• Engineering plan, schedule

What is BEE3?

• Follow-on to BEE2 (BWRC, 2004)
• Board with several highly-connected FPGAs
• Vehicle for computer architecture research
  – Microsoft’s primary interest
• Potential platform for high performance DSP applications
  – Astronomers, and perhaps others
• Allows large scale architectural experiments
  – Although perhaps not as large as originally hoped
  – And certainly not at the speed of a real implementation
• Can scale smoothly from a single board to 64 boards (256 FPGAs)

BEE2

BEE2 – BEE3 Differences

• 4 Xilinx Virtex 5 vs 5 Virtex 2 Pro FPGAs
  – We use XC5VLX110T-ff1136
  – V2Pro is now obsolete (130nm)
  – V5 is a major improvement (65nm)
    • 6-input LUT (64 bit DP RAM)
    • Better Block RAMs
    • Improved interconnect
    • Better signal integrity
• 8 Infiniband/CX4 channels vs 18
• 4 x8 PCI Express Low Profile slots

BEE2 – BEE3 Differences (2)

• 2 Banks DDR2 x 2 vs 4 Banks DDR2 x 1
  – Same capacity (64 GB likely)
  – Lower bandwidth
  – Mandated by fewer signal pins on V5
• 4 10/100/1000 Ethernet channels
• No SATA
  – BEE2 SATA didn’t work anyway; iSCSI instead (?)
• No PowerPCs
  – This version has not yet been released by Xilinx
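The "same capacity (64 GB likely)" figure can be checked with simple arithmetic. The sketch below assumes that "2 Banks DDR2 x 2" means two channels per FPGA with two DIMMs each, and that the DIMMs are 4 GB parts as labeled in the board diagrams of this talk. The constant and function names are illustrative, not from the slides.

```python
# Capacity check for the BEE3 memory system (a sketch under stated assumptions).
FPGAS_PER_BOARD = 4      # from the slides: 4 Virtex 5 FPGAs per board
CHANNELS_PER_FPGA = 2    # assumed reading of "2 Banks DDR2 x 2": 2 channels...
DIMMS_PER_CHANNEL = 2    # ...with 2 DIMMs on each channel
GB_PER_DIMM = 4          # assumed: 4 GB DIMMs, as labeled in the board diagrams


def board_capacity_gb() -> int:
    """Total DRAM on one board, in GB, under the assumptions above."""
    return FPGAS_PER_BOARD * CHANNELS_PER_FPGA * DIMMS_PER_CHANNEL * GB_PER_DIMM


print(board_capacity_gb())
```

With these assumptions the product is 4 x 2 x 2 x 4 = 64 GB, which matches the slide.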

BEE2 – BEE3 Differences (3)

• Divided the system into two boards, Main and Control
  – Main board has FPGAs, all high speed logic
  – Control board handles downloading, monitoring
  – Simplifies main board engineering; can design control board in parallel
• Smaller main board
  – 168 vs 374 in²
  – Fewer layers for lower cost
• Much more “PC-like”
• Can use PC power supplies, peripherals
• Several layouts are being considered
  – All fit in 2U enclosure
  – Much more attention is being given to thermal design
  – Must pass UL, FCC

BEE3 Main Board

[Block diagram. Recoverable labels: four user FPGAs (User1 to User4, 5VLXT); DDR2 DIMM0 to DIMM3 on each FPGA, with memory links labeled 133; four links labeled 108; four 40x2 QSH-DP-040 connectors; eight CX4 ports; four PCI-E 8X slots.]

Bandwidths (per-FPGA)

• Memory
  – 400 MT/s * 8 B/T * 2 channels: 6.4 GB/s
• Ring
  – 400 MT/s * 12 B/T: 4.8 GB/s
• QSH
  – 400 MT/s * 10 B/T: 4 GB/s
• Ethernet
  – 125 MB/s
• CX4
  – 1.25 GB/s * 2 directions * 2 channels: 5 GB/s
• PCI Express
  – Same as CX4
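The per-FPGA figures above follow directly from the factors listed on the slide. A minimal sketch that recomputes them is given here. The helper name `parallel_bw` and the dictionary keys are illustrative. The Ethernet value is derived from the 1000 Mb/s line rate, which is an assumption about how the slide's 125 MB/s was obtained.

```python
# Per-FPGA peak bandwidths, recomputed from the factors on the slide.
# All values are in GB/s, using decimal units (1 GB = 1e9 bytes).


def parallel_bw(mega_transfers: float, bytes_per_transfer: int, channels: int = 1) -> float:
    """Peak bandwidth in GB/s of a parallel interface (hypothetical helper)."""
    return mega_transfers * 1e6 * bytes_per_transfer * channels / 1e9


bandwidths = {
    "memory": parallel_bw(400, 8, channels=2),  # 400 MT/s * 8 B/T * 2 channels
    "ring": parallel_bw(400, 12),               # 400 MT/s * 12 B/T
    "qsh": parallel_bw(400, 10),                # 400 MT/s * 10 B/T
    "ethernet": 1000e6 / 8 / 1e9,               # assumed: 1000 Mb/s line rate / 8 bits
    "cx4": 1.25 * 2 * 2,                        # 1.25 GB/s * 2 directions * 2 channels
}
bandwidths["pci_express"] = bandwidths["cx4"]   # slide: "Same as CX4"

for name, gb_per_s in bandwidths.items():
    print(f"{name:12s} {gb_per_s:.3f} GB/s")
```

These are peak rates only. Sustained throughput would be lower once protocol overhead and memory refresh are taken into account.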

BEE3 Clocking, JTAG

[Block diagram. Recoverable labels: four user FPGAs (User1 to User4, 5VLXT), each with two GTP 8x blocks and Gclk and DDR2 clock inputs; 200 MHz and 333 MHz clocks, each with a 1:4 clock buffer; 156.25 MHz and 125 MHz clocks, each with a 1:8 clock buffer; 100 MHz and 125 MHz clocks; SMA connectors; select/enable signals Sel0,En0 to Sel3,En3; PCI-Express 8x slots #1 to #4; JTAG chain with TDI and TDO; 64-pin 0.1" header connector carrying SelectMAP{Data[15:0], Cclk, RDWR_B, Busy, Prog_B, Init_B, Done} + JTAG{TMS, TCK}, CS_B[3:0], PS_ON#, PWR_OK, 5Vsb x4, GND x20.]

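BEE3 Control Board

[Block diagram. Recoverable labels: Spartan3 FT256 with a 50 MHz clock; PROM; FLASH; DRAM; bus widths labeled 16 and 32; USB controller with USB/H and USB/D ports; Ethernet PHY with RJ45; JTAG; LED x4; PushBtn x4; GPIO x40; 16x4 character LCD; 64-pin 0.1" header connector with 5Vsb x4 and GND x20.]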

BEE3 System (v1)

[Layout drawing. Recoverable labels: power supply; ATX PWR and 12V connectors; four 5VLXT FPGAs (FF1136 and FF1738 footprints marked), each with four 4 GB DDR2-667 DRAM DIMMs; four QSH-DP-040 connectors; four PCI-Express 8x slots; two Fujitsu 2x2 CX4 connectors; GE switch; RJ45; SMA; 1.0V, 1.8V, and 2.5V regulators; control board; 64-pin 0.1" header connectors; area marked "Available for Fans"; note "Main cable harness exits here".]

BEE3 System (v2)

[Layout drawing. Recoverable labels: control board; I/O panel; four 5VLXT FF1136 FPGAs, each with four 4 GB DDR2-667 DRAM DIMMs; four QSH-DP-040 connectors; four PCI-Express 8x slots and one PCIe 1x slot; two Fujitsu 2x2 CX4 connectors; GE switch; 1.0V, 1.8V, and 2.5V regulators; 24-pin ATX PWR and 12V PWR connectors; 64-pin 0.1" header connector.]

BEE3 Main Board (v3)

[Layout drawing. Recoverable labels: four 5VLXT FF1136 FPGAs, each with four 4 GB DDR2-667 DRAM DIMMs; four QSH-DP-040 connectors; four PCI-Express 8x slots; two Fujitsu 2x2 CX4 connectors; two RJ45 connectors; 1.0V, 1.8V, and 2.5V regulators; 24-pin ATX PWR and 12V PWR connectors; 64-pin 0.1" header connector.]

Remaining Issues

• Precise EATX compatibility, or not?
  – Affects layout complexity, thermal design
• Power supply sizing
  – We don’t want to leave the overclockers in the lurch
• Standard power supplies (?)
  – “2U” supplies aren’t as efficient, have fewer vendors
  – Prefer Intel/Google “12V only” supplies (minimum loading issue), if available in time and at reasonable cost
• PCI Express is nonstandard
  – Xilinx hard macro is “device only”, not host
  – Need an intrepid graduate student
  – Can still use it for additional Infiniband/CX4 channels

Project Participants and Roles

• Microsoft Research (Silicon Valley)
  – Funds and manages system engineering
• Celestica (Ottawa and elsewhere)
  – Does main board engineering, produces final systems
  – Microsoft has a very deep relationship with Celestica
• Function Engineering (Palo Alto)
  – Does thermal and mechanical engineering
• Xilinx (San Jose)
  – Provides FPGAs for academic machines
  – Provides FPGA application expertise
• RAMP Group (BWRC)
  – Control board, basic software
• RAMP Community
  – Uses the systems for research

Why is Microsoft interested?

• We believe the overall RAMP effort will have significant impact, and want to support it in the most effective way we can.
  – Simply paying for grad students seems suboptimal
• We observe that universities aren’t very good at this sort of system engineering and production.
  – Grad students are great for many things, but doing things like board layout isn’t among them.
  – Requires deep understanding of tools and production processes. Pros have this.
  – We can open doors that academia can’t.
  – We have experience in managing this sort of program.
• We want the systems themselves
  – As infrastructure for our new effort in computer architecture (yes, this is a recruiting pitch).
• We also want systems to be available to other industrial users
  – This might be more difficult if the systems came from academia.
  – But we don’t want to be in the hardware business.

Plan, schedule

• Generate design spec: 6 weeks
  – Scope layout problems and layer count
• Layout and signal integrity: 12 weeks
  – Parts procurement proceeds in parallel
  – Will probably do 4-5 prototypes
• Board fab, test and assembly: 3 weeks
• Design verification testing: 5 weeks
  – This happens at Microsoft or BWRC
• Production can start in Summer ’07
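As a rough consistency check on the schedule, the phase durations can be summed and projected onto a calendar. The sketch below assumes the four phases run strictly in sequence and that work starts on the date of the talk. Neither assumption is stated on the slide, and the names used are illustrative.

```python
from datetime import date, timedelta

# Phase durations in weeks, as listed on the slide.
PHASES = [
    ("design spec", 6),
    ("layout and signal integrity", 12),
    ("board fab, test and assembly", 3),
    ("design verification testing", 5),
]


def total_weeks(phases) -> int:
    """Sum of all phase durations, in weeks."""
    return sum(weeks for _, weeks in phases)


def earliest_finish(start: date, phases) -> date:
    """Finish date if the phases run back to back (an assumption)."""
    return start + timedelta(weeks=total_weeks(phases))


START = date(2007, 1, 11)  # assumed start: the date of the talk
print(total_weeks(PHASES), earliest_finish(START, PHASES))
```

Under these assumptions the phases total 26 weeks and end in mid-July 2007, which is in line with a Summer ’07 production start.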

Discussion?