HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09

HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09 INFN - Padova michele.michelotto at pd.infn.it


Transcript of HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09

Page 1: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09

HS06 on last generation of HEP worker nodes

Berkeley, Hepix Fall ‘09

INFN - Padova

michele.michelotto at pd.infn.it

Page 2: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09

HEP-SPEC06

Integer tests
• 471.omnetpp
• 473.astar
• 483.xalancbmk

Floating Point tests
• 444.namd
• 447.dealII
• 450.soplex
• 453.povray

• Geometric average on 7 tests
• Sum on all cores
• Sum on all Worker nodes

Hepix Fall 09 michele michelotto - INFN Padova 2
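The aggregation described on this slide (geometric average over the 7 tests, then a sum over cores) can be sketched as follows. This is a minimal illustration, not the official HS06 tooling: the function names and the per-test ratios are hypothetical placeholders.

```python
from math import prod

INTEGER_TESTS = ("471.omnetpp", "473.astar", "483.xalancbmk")
FLOATING_POINT_TESTS = ("444.namd", "447.dealII", "450.soplex", "453.povray")
HS06_TESTS = INTEGER_TESTS + FLOATING_POINT_TESTS


def geometric_mean(values):
    """Geometric mean of a sequence of positive numbers."""
    values = list(values)
    return prod(values) ** (1.0 / len(values))


def copy_score(ratios):
    """Score of one benchmark copy: geometric mean over the 7 tests."""
    return geometric_mean(ratios[name] for name in HS06_TESTS)


def node_score(per_core_ratios):
    """Score of a worker node: sum of the per-core copy scores."""
    return sum(copy_score(ratios) for ratios in per_core_ratios)


# Hypothetical ratios (NOT measured data): every test scores 10.0 on each of 8 cores.
example_ratios = {name: 10.0 for name in HS06_TESTS}
example_node = node_score([example_ratios] * 8)
print(round(example_node, 2))
```

The score of a farm is then the sum of `node_score` over all worker nodes.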

Page 3: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


“old cpu”

Page 4: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


Modern 4core processor

[Figure: HS06/clock/core (0.00 to 8.00) vs. clock (1800 to 3400); series: Xeon DC-QC, nehalem]

processor               HS06        HS06/Clock   HS06/Clock/core   Core/Logical cpu
Intel Clovertown 53xx   53-60       23-26        2.90-3.20         8
Intel Harpertown 54xx   60-70       25-28        3.20-3.50         8
AMD Shanghai 23xx       60-74       25-27        3.20-3.60         8
AMD Istanbul 24xx       96-99       40-44        3.34-3.73         12
Intel Gainestown 5520   80-95-120   43-53        3.33-5.39         8-16

• About 60 measurements from Brunengo, Macorini, Crescente, Calzolari (all INFN), Srinivasan (LBNL), Iribarren (CERN), Alef (GridKa), PIC, RAL, Nikhef

Page 5: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


Measuring Nehalem

• Phase space complicated
– 32-64 bit
– Hyperthreading ON - OFF
– Memory (Size and number of channels)
– Turbo Mode ON - OFF
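To give a feel for the size of this phase space, the sketch below enumerates every combination of the four knobs. The dictionary keys are illustrative names, and the memory sizes (16, 24, 48 GB) are taken from the memory slide of this deck; the talk does not claim that every combination was measured.

```python
from itertools import product

# Knobs listed on the slide; key names are illustrative only.
PHASE_SPACE = {
    "bits": (32, 64),
    "hyperthreading": ("ON", "OFF"),
    "memory_gb": (16, 24, 48),
    "turbo": ("ON", "OFF"),
}

# One dict per configuration, e.g. {"bits": 32, "hyperthreading": "ON", ...}
configurations = [
    dict(zip(PHASE_SPACE, combo)) for combo in product(*PHASE_SPACE.values())
]

print(len(configurations))  # 2 * 2 * 3 * 2 = 24
```

Each configuration is additionally scanned over the number of concurrent benchmark copies (1 to 16 in the plots), which multiplies the number of runs further.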

Page 6: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


The official HS06: 32bit

[Figure: "32 bit - 48GB". HEP-SPEC06 (0 to 180) vs. core (1 to 16); series: 32 bit, linear scaling 32 bit]

• HS06, HT is on: 118.30 (16t), 81.81 (8t)

Page 7: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


64 bit

[Figure: "64 bit - 48GB". HEP-SPEC06 (0 to 250) vs. core (1 to 16); series: 64 bit, linear scaling 64 bit]

• HS06, HT is on: 136.74 (16t), 95.14 (8t)
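The 8-thread and 16-thread figures quoted for the 5520 with HT on (32 bit and 64 bit) show how far the second eight threads fall short of linear scaling. A small sketch of that arithmetic, with illustrative names:

```python
# HS06 with HT on, as quoted on the 32 bit and 64 bit slides (threads -> score).
HS06_HT_ON = {
    "32 bit": {8: 81.81, 16: 118.30},
    "64 bit": {8: 95.14, 16: 136.74},
}


def thread_gain(scores, low=8, high=16):
    """Factor by which the score grows from `low` to `high` threads."""
    return scores[high] / scores[low]


for mode, scores in HS06_HT_ON.items():
    gain = thread_gain(scores)
    # Doubling the thread count would give a factor 2 under linear scaling.
    print(f"{mode}: x{gain:.3f} from 8 to 16 threads ({gain / 2:.0%} of linear)")
```

In both modes, doubling the thread count raises the score by roughly 44 to 45 percent rather than 100 percent.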

Page 8: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


32 vs 64

[Figure: "48GB 32bit vs 64 bit". HEP-SPEC06 (0 to 160) vs. cores (1 to 16); series: 64 bit, 32bit]
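Only two points of each curve are quoted numerically in the slides (8 and 16 threads, HT on, 48GB). A short sketch of the 64 bit gain over 32 bit at those two points, with illustrative names:

```python
# HS06 at 48GB with HT on, from the 32 bit and 64 bit slides.
HS06_48GB = {
    8: {"32 bit": 81.81, "64 bit": 95.14},
    16: {"32 bit": 118.30, "64 bit": 136.74},
}


def gain_64_over_32(scores):
    """Relative gain of the 64 bit score over the 32 bit score."""
    return scores["64 bit"] / scores["32 bit"] - 1.0


gains = {threads: gain_64_over_32(scores) for threads, scores in HS06_48GB.items()}
for threads, gain in gains.items():
    print(f"{threads} threads: 64 bit is {gain:.1%} above 32 bit")
```

At both thread counts the 64 bit build is about 16 percent ahead.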

Page 9: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


Memory
• 48GB (6x8)
• 24GB (3x8)
• 16GB (2x8)

[Figure: "Memory 16 - 24 - 48 GB". HEP-SPEC06 (0 to 160) vs. core (1 to 16); series: 16GB, 24GB and 48GB, each at 32 bit and 64 bit]

Page 10: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


HT OFF 32bit

• With HT OFF I’d imagine saturation after 8th thread
• Will HT OFF 1-8 = HT ON?

[Figure: "32 bit HT ON". HEP-SPEC06 (0 to 180) vs. core (1 to 16); series: 32 bit - HT ON, 32 bit HT ON Linear]

[Figure: "32 bit HT OFF". HEP-SPEC06 (0 to 180) vs. core (1 to 16); series: 32 bit - HT OFF, 32 bit HT OFF Linear]

Page 11: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


HT ON vs HT OFF

[Figure: "32 bit HT OFF vs ON". HEP-SPEC06 (0 to 140) vs. core (1 to 16); series: 32 bit - HT OFF, 32 bit HT ON]

• HT ON: 81.81
• HT OFF: 95.96
• HT OFF is better up to 11t
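The two numbers on this slide can be put side by side with the 16-thread HT ON figure from the official 32 bit measurement. The sketch assumes that both values here refer to 8 threads; the HT ON value matches the 8-thread figure quoted earlier, but the slide does not state the thread count for HT OFF.

```python
# 32 bit HS06 values quoted in the slides.
HT_ON_8T = 81.81    # HT on, 8 threads
HT_OFF = 95.96      # HT off; assumed here to be the 8-thread value
HT_ON_16T = 118.30  # HT on, 16 threads

# Advantage of switching HT off when only 8 copies run.
off_vs_on_at_8 = HT_OFF / HT_ON_8T - 1.0

# Advantage of HT on when all 16 logical CPUs are loaded.
on16_vs_off = HT_ON_16T / HT_OFF - 1.0

print(f"HT OFF vs HT ON at 8 threads: {off_vs_on_at_8:+.1%}")
print(f"HT ON at 16 threads vs HT OFF: {on16_vs_off:+.1%}")
```

Under that assumption HT OFF wins by about 17 percent at 8 threads, while a fully loaded HT ON node ends up about 23 percent higher, consistent with a crossover somewhere between 8 and 16 threads.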

Page 12: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


HT OFF 64bit

• At 64 bit: same behaviour

[Figure: "64 bit HT OFF vs ON". HEP-SPEC06 (0 to 160) vs. core (1 to 16); series: 64 bit - HT OFF, 64 bit HT ON]

[Figure: "32 and 64 bit HT OFF vs ON". HEP-SPEC06 (0 to 160) vs. core (1 to 16); series: 32 bit - HT OFF, 32 bit HT ON, 64 bit - HT OFF, 64 bit HT ON]

Page 13: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


Turbo mode

• Turn off the voltage on idle cores and overclock the active cores by +133 MHz or even +266 MHz if the temperature is ok
– 5520 default clock is 2266 MHz
– 5520 with 3 or 4 active cores: +1 bin, 2400 MHz
– 5520 with 1 or 2 active cores: +2 bin, 2533 MHz

• New half-generation, e.g. Xeon 3500: up to 4 bin
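The bin arithmetic above can be written out explicitly. The sketch assumes a 133.33 MHz base clock and a base multiplier of 17 for the 5520 (which reproduces the 2266 MHz default), and reads the slide's bin table as "3 or 4 active cores: +1 bin, 1 or 2 active cores: +2 bin". The names are illustrative.

```python
BCLK_MHZ = 400.0 / 3.0  # assumed base clock (133.33 MHz); one turbo bin = one multiplier step
BASE_MULTIPLIER = 17    # assumed: 17 x 133.33 MHz = 2266.67 MHz, the slide's 2266 MHz

# Turbo bins available to the 5520 by number of active cores, as read from the slide.
TURBO_BINS_BY_ACTIVE_CORES = {1: 2, 2: 2, 3: 1, 4: 1}


def turbo_clock_mhz(active_cores):
    """Maximum turbo clock for a given number of active cores."""
    bins = TURBO_BINS_BY_ACTIVE_CORES[active_cores]
    return (BASE_MULTIPLIER + bins) * BCLK_MHZ


def max_uplift(active_cores):
    """Upper bound on the turbo speedup, from the clock ratio alone."""
    return turbo_clock_mhz(active_cores) / (BASE_MULTIPLIER * BCLK_MHZ) - 1.0


for cores in (1, 2, 3, 4):
    print(f"{cores} active: {turbo_clock_mhz(cores):.0f} MHz, at most {max_uplift(cores):+.1%}")
```

The clock ratio only bounds the achievable gain (about 6 percent with all four cores active, about 12 percent with one or two); the measured benefit also depends on temperature and on how much of the workload is clock-bound.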

Page 14: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


Turbo mode

• Benefit of Turbo mode decreases as the number of active cores increases

[Figure: "32 bit TURBO OFF vs ON". HEP-SPEC06 (0 to 140) vs. core (1 to 16); series: 32 bit - TURBO OFF, 32 bit TURBO ON]

[Figure: "TURBO MODE SPEEDUP". Speedup (-10.00% to 10.00%) vs. cores (1 to 16); series: TURBO OFF/ON 32 bit, TURBO OFF/ON 64 bit]

Page 15: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


Opteron 2427 32bit

[Figure: "32 bit 2427". HS06 (0 to 250) vs. core (1 to 16); series: 32 bit, 32 bit linear]

• HS06: 98.05 (12t)

Page 16: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


Opteron 2427 – 64bit

[Figure: "64 bit 2427". HS06 (0 to 200) vs. cores (1 to 16); series: 64 bit, linear scaling 64 bit]

• HS06: 111.46 (12t)

Page 17: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


Istanbul 2 x hexacore

[Figure: "32 bit 2427". HS06 (0 to 250) vs. core (1 to 16); series: 32 bit, 32 bit linear]

[Figure: "64 bit 2427". HS06 (0 to 200) vs. cores (1 to 16); series: 64 bit, linear scaling 64 bit]

• Opteron 2427 – 2200 MHz – 32GB
• 2P x 6cores
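With the clock and core count given here, the 12-thread results for the Opteron 2427 can be normalized the same way as in the processor table (HS06/Clock and HS06/Clock/core). The sketch assumes the clock is expressed in GHz, which is what the magnitudes in that table suggest; the slides do not state the unit. Function names are illustrative.

```python
def hs06_per_ghz(hs06, clock_mhz):
    """HS06 divided by the clock, with the clock expressed in GHz (assumed unit)."""
    return hs06 / (clock_mhz / 1000.0)


def hs06_per_ghz_per_core(hs06, clock_mhz, cores):
    """HS06 per GHz, divided by the number of cores."""
    return hs06_per_ghz(hs06, clock_mhz) / cores


# Opteron 2427: 2200 MHz, 2P x 6 cores; HS06 at 12 threads from the two previous slides.
CLOCK_MHZ = 2200
CORES = 12
RESULTS = {"32 bit": 98.05, "64 bit": 111.46}

normalized = {
    mode: (hs06_per_ghz(score, CLOCK_MHZ), hs06_per_ghz_per_core(score, CLOCK_MHZ, CORES))
    for mode, score in RESULTS.items()
}
for mode, (per_ghz, per_ghz_core) in normalized.items():
    print(f"{mode}: {per_ghz:.2f} HS06/GHz, {per_ghz_core:.2f} HS06/GHz/core")
```

The 32 bit result comes out near 44.6 HS06/GHz and 3.71 HS06/GHz/core, close to the upper end of the Istanbul row in the table.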

Page 18: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09

Overbooking?

• 5520 with HT OFF increased performance even after 8 cores

• 2427 doesn’t show any drop with 12 cores fully loaded

• What happens if we start overloading with more processes than cores?

Page 19: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


5520(2266) vs 2427(2200)

Page 20: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


5520(2266) vs 2427(2200)

[Figure: "32 bit 2427". HS06 (0 to 140) vs. core (1 to 22); series: 32 bit 2427, 32 bit 5520 HT ON, 32 bit 5520 HT OFF]

Page 21: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


5520(2266) vs 2427(2200)

[Figure: "64 bit 2427". HS06 (0 to 160) vs. core (1 to 20); series: 64 bit 2427, 64 bit 5520 HT ON, 64 bit 5520 HT OFF]
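For the fully loaded nodes, the comparison can be reduced to the figures quoted earlier in the deck: the 5520 with HT on at 16 threads against the 2427 at 12 threads. A short sketch with illustrative names:

```python
# HS06 at full load, as quoted on the per-processor slides.
FULL_LOAD_HS06 = {
    "5520 HT ON (16t)": {"32 bit": 118.30, "64 bit": 136.74},
    "2427 (12t)": {"32 bit": 98.05, "64 bit": 111.46},
}

# Ratio of the dual-socket 5520 node to the dual-socket 2427 node.
node_ratio = {
    mode: FULL_LOAD_HS06["5520 HT ON (16t)"][mode] / FULL_LOAD_HS06["2427 (12t)"][mode]
    for mode in ("32 bit", "64 bit")
}
for mode, ratio in node_ratio.items():
    print(f"{mode}: 5520 node / 2427 node = {ratio:.3f}")
```

At nearly the same clock (2266 vs 2200 MHz), the 5520 node scores roughly 21 to 23 percent higher when every logical CPU is loaded.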

Page 22: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09

Todo

• Compare with Atlas and CMS code
– GEN, SIM, DIGI and RECO

• Effect of HT, Turbo Mode and Overbooking on Power Consumption

Page 23: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


Questions?

Page 24: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


Nehalem “gainestown”

• 45 nm
• Cache L1 32+32 KB
• Cache L2 256KB/core
• Cache L3 8MB shared
• 80W: E5502 1.86 GHz, E5540 2.53 GHz
• 95W: X5550 2.66 GHz, X5570 2.93 GHz
• Dual Thread (from 5520 upwards)
• Turbo Mode
• Quad core (excl. 5502 and 5508)

Page 25: HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09


Opteron Istanbul

• 45 nm

• Cache L1 128 KB

• Cache L2 512KB/core

• Cache L3 6MB shared

• Power Consumption:
– EE: max 40W (1.8 GHz)
– HE: max 55W (2.0-2.1 GHz)
– Standard: max 75W (2.2-2.6 GHz)
– SE: max 105W (2.8 GHz)