HS06 on last generation of HEP worker nodes Berkeley, Hepix Fall ‘09
HS06 on last generation of HEP worker nodes
Berkeley, Hepix Fall ‘09
INFN - Padova
michele.michelotto at pd.infn.it
HEP-SPEC06
• 471.omnetpp
• 473.astar
• 483.xalancbmk
• 444.namd
• 447.dealII
• 450.soplex
• 453.povray
• Geometric average on the 7 tests
• Sum over all cores
• Sum over all worker nodes
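The aggregation described on this slide (geometric average of the 7 tests, summed over cores, then over worker nodes) can be sketched as follows. This is only an illustration: the function names are hypothetical and the ratios in the usage lines are placeholders, not measurements.

```python
from math import prod


def geometric_mean(ratios):
    """Geometric mean of a non-empty sequence of positive benchmark ratios."""
    if not ratios or any(r <= 0 for r in ratios):
        raise ValueError("ratios must be a non-empty sequence of positive numbers")
    return prod(ratios) ** (1.0 / len(ratios))


def node_score(per_copy_ratios):
    """Sum the geometric means of all benchmark copies run on one node.

    per_copy_ratios holds one list of 7 ratios per copy (one copy per core).
    """
    return sum(geometric_mean(ratios) for ratios in per_copy_ratios)


def site_score(node_scores):
    """Sum the scores of all worker nodes."""
    return sum(node_scores)


# Placeholder input: two copies, each with the same ratio on all 7 tests.
example_node = node_score([[2.0] * 7, [2.0] * 7])
example_site = site_score([example_node, example_node])
print(f"node: {example_node:.2f}  site: {example_site:.2f}")
```

The geometric mean keeps a single unusually fast or slow test from dominating the per-copy score, while the sums make the result additive across cores and nodes.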
Hepix Fall 09 michele michelotto - INFN Padova 2
Integer tests
Floating Point tests
“old cpu”
[Figure: HS06/clock/core versus clock (1800 to 3400); labels: Xeon DC-QC, Nehalem]
Modern 4-core processor
processor               HS06        HS06/Clock   HS06/Clock/core   Core/Logical cpu
Intel Clovertown 53xx   53-60       23-26        2.90-3.20         8
Intel Harpertown 54xx   60-70       25-28        3.20-3.50         8
AMD Shanghai 23xx       60-74       25-27        3.20-3.60         8
AMD Istanbul 24xx       96-99       40-44        3.34-3.73         12
Intel Gainestown 5520   80-95-120   43-53        3.33-5.39         8-16
• About 60 measurements from Brunengo, Macorini, Crescente, Calzolari (all INFN), Srinivasan (LBNL), Iribarren (CERN), Alef (GridKa), PIC, RAL, Nikhef
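The normalised columns of the table can be reproduced with a small helper. The sketch below assumes that "Clock" means the clock in GHz, which is consistent with the table values but is not stated on the slide. The input is the 32 bit Opteron 2427 result reported later in these slides (98.05 HS06, 2200 MHz, 12 cores); the function name is hypothetical.

```python
def hs06_per_ghz_per_core(hs06, clock_mhz, cores):
    """Normalise an HS06 score by clock (assumed to be in GHz) and core count."""
    if clock_mhz <= 0 or cores <= 0:
        raise ValueError("clock and core count must be positive")
    per_ghz = hs06 / (clock_mhz / 1000.0)
    return per_ghz / cores


# Opteron 2427, 32 bit, 12 threads: values taken from the slides.
opteron_2427 = hs06_per_ghz_per_core(98.05, 2200, 12)
print(f"HS06/clock/core: {opteron_2427:.2f}")
```

The result falls inside the 3.34-3.73 range quoted for the Istanbul row.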
Measuring Nehalem
• Phase space complicated
  – 32-64 bit
  – Hyperthreading ON - OFF
  – Memory (size and number of channels)
  – Turbo Mode ON - OFF
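To get a feeling for the size of this phase space, the configurations can be enumerated. A minimal sketch: the memory sizes are the three (16, 24 and 48 GB) measured later in these slides, and the dictionary keys are hypothetical names chosen for the example.

```python
from itertools import product

WORD_SIZES = (32, 64)          # 32 or 64 bit build
HYPERTHREADING = (True, False)
MEMORY_GB = (16, 24, 48)       # sizes used in the memory comparison
TURBO = (True, False)


def measurement_matrix():
    """Return every combination of the four settings as a list of dicts."""
    return [
        {"bits": bits, "ht": ht, "memory_gb": mem, "turbo": turbo}
        for bits, ht, mem, turbo in product(
            WORD_SIZES, HYPERTHREADING, MEMORY_GB, TURBO
        )
    ]


configs = measurement_matrix()
print(len(configs), "configurations")
```

Each configuration is then measured for every number of parallel copies, which is why only a subset of the combinations is shown in the following slides.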
The official HS06: 32bit
[Figure: 32 bit - 48GB; HEP-SPEC06 versus core (1 to 16); series: 32 bit, linear scaling 32 bit]
• HS06 with HT on: 118.30 (16t), 81.81 (8t)
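The two quoted values allow a quick check of how far the 16-thread result is from a linear extrapolation of the 8-thread one. This is a sketch with hypothetical function names; the linear reference here starts from the 8-thread point, not from the single-copy score used for the curve in the plot.

```python
def linear_reference(score, copies, target_copies):
    """Score expected at target_copies if throughput scaled linearly."""
    return score * target_copies / copies


def scaling_efficiency(measured, reference):
    """Fraction of the linear reference that was actually measured."""
    return measured / reference


HS06_8T = 81.81    # 32 bit, HT on, 8 threads (from the slide)
HS06_16T = 118.30  # 32 bit, HT on, 16 threads (from the slide)

ref_16t = linear_reference(HS06_8T, copies=8, target_copies=16)
efficiency = scaling_efficiency(HS06_16T, ref_16t)
print(f"linear reference at 16t: {ref_16t:.2f}")
print(f"measured / linear: {efficiency:.1%}")
```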
64 bit
[Figure: 64 bit - 48GB; HEP-SPEC06 versus core (1 to 16); series: 64 bit, linear scaling 64 bit]
• HS06 with HT on: 136.74 (16t), 95.14 (8t)
32 vs 64
[Figure: 48GB 32 bit vs 64 bit; HEP-SPEC06 versus cores (1 to 16); series: 64 bit, 32 bit]
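The gain of the 64 bit build over the 32 bit one can be read off the numbers quoted on the two previous slides (HT on, 48 GB). A short sketch; the dictionary layout and function name are choices made for the example.

```python
# HS06 with HT on and 48 GB, keyed by (word size in bits, threads).
HS06_HT_ON = {
    (32, 8): 81.81,
    (32, 16): 118.30,
    (64, 8): 95.14,
    (64, 16): 136.74,
}


def relative_gain(new, old):
    """Relative improvement of new over old, e.g. 0.10 for +10%."""
    return new / old - 1.0


gains = {
    threads: relative_gain(HS06_HT_ON[(64, threads)], HS06_HT_ON[(32, threads)])
    for threads in (8, 16)
}
for threads, gain in gains.items():
    print(f"{threads:2d} threads: 64 bit is {gain:+.1%} vs 32 bit")
```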
Memory
• 48GB (6x8)
• 24GB (3x8)
• 16GB (2x8)
[Figure: Memory 16 - 24 - 48 GB; HEP-SPEC06 versus core (1 to 16); series: 16GB 32 bit, 24GB 32 bit, 48GB 32 bit, 16GB 64 bit, 24GB 64 bit, 48GB 64 bit]
HT OFF 32bit
• With HT OFF I’d imagine saturation after the 8th thread
• Will HT OFF 1-8 = HT ON?
[Figure: 32 bit HT ON; HEP-SPEC06 versus core (1 to 16); series: 32 bit - HT ON, 32 bit HT ON Linear]
[Figure: 32 bit HT OFF; HEP-SPEC06 versus core (1 to 16); series: 32 bit - HT OFF, 32 bit HT OFF Linear]
HT ON vs HT OFF
[Figure: 32 bit HT OFF vs ON; HEP-SPEC06 versus core (1 to 16); series: 32 bit - HT OFF, 32 bit HT ON]
• HT ON: 81.81
• HT OFF: 95.96
• HT OFF is better up to 11t
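The quoted 32 bit figures can be put side by side as simple ratios. In this sketch the HT ON value of 81.81 is the 8-thread result from the earlier slide; that the HT OFF value of 95.96 is also an 8-thread result is an assumption, since the slide does not say so. The 16-thread HT ON value of 118.30 is taken from the "official HS06" slide.

```python
HT_ON_8T = 81.81    # 32 bit, HT on, 8 threads
HT_OFF = 95.96      # 32 bit, HT off (assumed here to be the 8-thread value)
HT_ON_16T = 118.30  # 32 bit, HT on, all 16 logical cpus loaded


def ratio(a, b):
    """Plain ratio a / b, used to compare two HS06 scores."""
    return a / b


off_vs_on_8t = ratio(HT_OFF, HT_ON_8T)
full_ht_vs_off = ratio(HT_ON_16T, HT_OFF)

print(f"HT OFF vs HT ON at 8 threads: {off_vs_on_8t - 1:+.1%}")
print(f"HT ON with 16 threads vs HT OFF: {full_ht_vs_off - 1:+.1%}")
```

Read this way, HT pays off only when the logical cpus are actually filled, which matches the remark that HT OFF is better up to 11 threads.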
HT OFF 64bit
• At 64 bit: same behaviour
[Figure: 64 bit HT OFF vs ON; HEP-SPEC06 versus core (1 to 16); series: 64 bit - HT OFF, 64 bit HT ON]
[Figure: 32 and 64 bit HT OFF vs ON; HEP-SPEC06 versus core (1 to 16); series: 32 bit - HT OFF, 32 bit HT ON, 64 bit - HT OFF, 64 bit HT ON]
Turbo mode
• Turn off the voltage on idle cores and overclock the active cores by +133 MHz or even +266 MHz if the temperature is ok
  – 5520 default clock is 2266 MHz
  – 5520 with 3 or 4 active cores: +1 bin, 2400 MHz
  – 5520 with 1 or 2 active cores: +2 bins, 2533 MHz
• New half-generation, e.g. Xeon 3500: up to 4 bins
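The clock values on this slide follow from counting 133.33 MHz steps ("bins"). A minimal sketch for the 5520 only: the nominal multiplier of 17 is an assumption inferred from 2266 MHz / 133.33 MHz, the bin table simply encodes the two cases listed above, and the function names are hypothetical.

```python
NOMINAL_MULTIPLIER = 17  # assumption: 17 x 133.33 MHz gives the 2266 MHz default


def clock_mhz(multiplier):
    """Clock in whole MHz for a multiplier of the 133.33 MHz (400/3) step."""
    return (multiplier * 400) // 3


def turbo_bins(active_cores):
    """Bins added on one 5520 socket, following the two cases on the slide."""
    if not 1 <= active_cores <= 4:
        raise ValueError("a 5520 socket has between 1 and 4 active cores")
    return 2 if active_cores <= 2 else 1


def turbo_clock_mhz(active_cores):
    """Clock reached in Turbo mode for a given number of active cores."""
    return clock_mhz(NOMINAL_MULTIPLIER + turbo_bins(active_cores))


for cores in (1, 2, 3, 4):
    print(f"{cores} active core(s): {turbo_clock_mhz(cores)} MHz")
```

Integer arithmetic is used so that the truncated values match the figures quoted on the slide.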
Turbo mode
• The benefit of Turbo mode decreases as the number of active cores increases
[Figure: 32 bit TURBO OFF vs ON; HEP-SPEC06 versus core (1 to 16); series: 32 bit - TURBO OFF, 32 bit TURBO ON]
[Figure: TURBO MODE SPEEDUP; speedup (-10% to +10%) versus cores (1 to 16); series: TURBO OFF/ON 32 bit, TURBO OFF/ON 64 bit]
Opteron 2427 32bit
[Figure: 32 bit 2427; HS06 versus core (1 to 16); series: 32 bit, 32 bit linear]
• HS06: 98.05 (12t)
Opteron 2427 – 64bit
[Figure: 64 bit 2427; HS06 versus cores (1 to 16); series: 64 bit, linear scaling 64 bit]
• HS06: 111.46 (12t)
Istanbul 2 x hexa-core
[Figure: 32 bit 2427; HS06 versus core (1 to 16); series: 32 bit, 32 bit linear]
[Figure: 64 bit 2427; HS06 versus cores (1 to 16); series: 64 bit, linear scaling 64 bit]
• Opteron 2427 – 2200 MHz – 32GB
• 2P x 6cores
Overbooking?
• 5520 with HT OFF increased performance even after 8 cores
• 2427 doesn’t show any drop with 12 cores fully loaded
• What happens if we start overloading with more processes than cores?
5520(2266) vs 2427(2200)
5520(2266) vs 2427(2200)
[Figure: 32 bit; HS06 versus core (1 to 22); series: 32 bit 2427, 32 bit 5520 HT ON, 32 bit 5520 HT OFF]
5520(2266) vs 2427(2200)
[Figure: 64 bit; HS06 versus core (1 to 20); series: 64 bit 2427, 64 bit 5520 HT ON, 64 bit 5520 HT OFF]
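The headline numbers from the earlier slides can be used to compare the two boxes, both as whole machines and per physical core. A sketch under stated assumptions: the 5520 box is taken to have 8 physical cores (the table lists 8-16 logical cpus) and is compared at 16 threads with HT on, while the 2427 box has 12 cores (2P x 6 cores). The dictionary layout and function names are hypothetical.

```python
MACHINES = {
    "5520": {"cores": 8, "hs06": {32: 118.30, 64: 136.74}},   # HT on, 16 threads
    "2427": {"cores": 12, "hs06": {32: 98.05, 64: 111.46}},   # 12 threads
}


def per_core(machine, bits):
    """HS06 per physical core for one machine and word size."""
    entry = MACHINES[machine]
    return entry["hs06"][bits] / entry["cores"]


def box_ratio(bits):
    """Whole-box HS06 of the 5520 relative to the 2427."""
    return MACHINES["5520"]["hs06"][bits] / MACHINES["2427"]["hs06"][bits]


for bits in (32, 64):
    print(
        f"{bits} bit: box ratio {box_ratio(bits):.2f}, "
        f"per core 5520 {per_core('5520', bits):.2f}, "
        f"2427 {per_core('2427', bits):.2f}"
    )
```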
Todo
• Compare with Atlas and CMS code
  – GEN, SIM, DIGI and RECO
• Effect of HT, Turbo Mode and Overbooking on Power Consumption
Questions?
Nehalem “gainestown”
• 45 nm
• Cache L1 32+32 KB
• Cache L2 256KB/core
• Cache L3 8MB shared
• 80W: E5502 1.86 GHz, E5540 2.53 GHz
• 95W: X5550 2.66 GHz, X5570 2.93 GHz
• Dual Thread (from 5520 upwards)
• Turbo Mode
• Quad core (excl. 5502 and 5508)
Opteron Istanbul
• 45 nm
• Cache L1 128 KB
• Cache L2 512KB/core
• Cache L3 6MB shared
• Power Consumption:
  – EE: max 40W (1.8 GHz)
  – HE: max 55W (2.0-2.1 GHz)
  – Standard: max 75W (2.2-2.6 GHz)
  – SE: max 105W (2.8GHz)