Quality and Performance Advantages of DDR4 over DDR3 Server Forum 2014 Copyright © 2014 Hewlett-Packard
Forward-looking statements
This document contains forward-looking statements regarding future operations, product development, product capabilities and availability dates. This information is subject to substantial uncertainties and is subject to change at any time without prior notification. Statements contained in this document concerning these matters only reflect Hewlett-Packard’s predictions and/or expectations as of the date of this document, and actual results and future plans of Hewlett-Packard may differ significantly as a result of, among other things, changes in product strategy resulting from technological, internal corporate, marketing and other changes. This is not a commitment to deliver material, code or functionality and should not be relied upon in making purchase decisions. HP makes no warranties regarding the accuracy of the information in this document. HP does not warrant or represent that it will introduce any product to which the information relates. It is presented for evaluation by the recipient and to assist HP in defining product direction.
Agenda
1. Comparison of DDR3 and DDR4
2. DDR4 Memory Throughput and Latency
3. A tale of two 8GB DDR4 DIMMs
Results of micro benchmark not affiliated with any specific application
DIMM Labels
DDR4 example: 16GB 2Rx4 PC4 2133 P R
Capacity: 4 GB, 8 GB, 16 GB, 32 GB, 64 GB
Ranks & Width: 1Rx8, 2Rx8, 1Rx4, 2Rx4, 4Rx4
DDR4 Gen: PC4 = 1.2V
DDR4 Bit Rate: 2666 MT/s, 2400 MT/s, 2133 MT/s
DDR4 CAS Latency: S = 19 cycles or 14.25 ns, R = 17 cycles or 14.17 ns, P = 15 cycles or 14.06 ns
DDR4 DIMM Type: E = UDIMM, R = RDIMM, L = LRDIMM
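The DDR4 label fields above can be decoded mechanically. The following is a minimal sketch in Python, assuming the label is written as six space-separated fields exactly as in the example on the slide; the function name `parse_ddr4_label` and the `Ddr4Label` record are hypothetical helpers introduced here for illustration, not part of any HP tool.

```python
from dataclasses import dataclass

# Lookup tables transcribed from the DDR4 label legend on the slide.
CAS_CODES = {"S": 19, "R": 17, "P": 15}  # speed letter -> CAS latency in cycles
DIMM_TYPES = {"E": "UDIMM", "R": "RDIMM", "L": "LRDIMM"}


@dataclass
class Ddr4Label:
    capacity_gb: int
    ranks: int
    width: int
    data_rate_mts: int
    cas_cycles: int
    dimm_type: str


def parse_ddr4_label(label: str) -> Ddr4Label:
    """Hypothetical helper: decode a label such as '16GB 2Rx4 PC4 2133 P R'."""
    fields = label.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {label!r}")
    capacity, organisation, gen, rate, cas_code, type_code = fields
    if gen != "PC4":
        raise ValueError(f"not a DDR4 (PC4) label: {label!r}")
    if not capacity.endswith("GB"):
        raise ValueError(f"capacity must end in 'GB': {capacity!r}")
    ranks, width = organisation.split("Rx")  # '2Rx4' -> ('2', '4')
    return Ddr4Label(
        capacity_gb=int(capacity[:-2]),
        ranks=int(ranks),
        width=int(width),
        data_rate_mts=int(rate),
        cas_cycles=CAS_CODES[cas_code],
        dimm_type=DIMM_TYPES[type_code],
    )


if __name__ == "__main__":
    print(parse_ddr4_label("16GB 2Rx4 PC4 2133 P R"))
```

Note that the letter R appears in both legends: in the fifth position it is a CAS latency code, in the sixth it is the DIMM type, which is why the sketch decodes by position.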
DDR3 example: 16GB 2Rx4 PC3 14900 R 13
Capacity: 4 GB, 8 GB, 16 GB, 32 GB, 64 GB
Ranks & Width: 1Rx8, 2Rx8, 1Rx4, 2Rx4, 4Rx4
DDR3 Gen and Voltage: PC3 = 1.5V, PC3L = 1.35V
DDR3 DIMM Data Rate: 14900 => 1866, 12800 => 1600, 10600 => 1333
DDR3 DIMM Type: E = UDIMM, R = RDIMM, L = LRDIMM
DDR3 CAS Latency: 13 = 13.93 ns at 1866, 11 = 13.75 ns at 1600, 9 = 13.50 ns at 1333
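The nanosecond figures in the two label legends follow from the CAS latency in cycles and the data rate. DDR memory transfers data twice per clock, so the clock period in ns is 2000 divided by the data rate in MT/s. The sketch below is illustrative only; the function name `cas_latency_ns` is introduced here, and the pairing of the DDR4 codes with data rates (CL15 at 2133, CL17 at 2400, CL19 at 2666) is an assumption that is consistent with the ns values on the slide rather than something the slide states directly.

```python
def cas_latency_ns(cas_cycles: int, data_rate_mts: float) -> float:
    """Convert CAS latency from clock cycles to nanoseconds.

    The memory clock in MHz is data_rate / 2 (two transfers per clock),
    so one clock period lasts 2000 / data_rate nanoseconds.
    """
    return cas_cycles * 2000.0 / data_rate_mts


# (CAS cycles, data rate in MT/s). The DDR3 pairs come from the DDR3 legend;
# the DDR4 pairing is an assumption, see the lead-in.
PAIRS = [(9, 1333), (11, 1600), (13, 1866), (15, 2133), (17, 2400), (19, 2666)]

if __name__ == "__main__":
    for cycles, rate in PAIRS:
        print(f"CL{cycles} at {rate} MT/s: {cas_latency_ns(cycles, rate):.2f} ns")
```

Rounded to two decimals, these six results match the values in the legends (13.50, 13.75, 13.93, 14.06, 14.17 and 14.25 ns), which shows that the absolute CAS latency stays close to 14 ns even as the cycle count rises.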
ProLiant 2-Socket DDR3 and DDR4 NUMA Architecture
[Diagram: two layouts, a 24 DIMM design with 3 slots per channel (3SPC) and a 16 DIMM design with 2 slots per channel (2SPC). In each layout, CPU 1 and CPU 2 (SNB/IVB or HSW) are linked by the QPI bus, and each CPU drives four memory channels (Ch1 to Ch4) over the memory bus. Populate white slots first.]
DDR4 and DDR3 Data Rates
Data rates in MT/s:
Generation  Operating Voltage  1DPC  2DPC   3DPC
DDR4        1.20 V             2133  2133*  1866*
DDR3        1.35 V             1600  1600*  1066*
DDR3        1.50 V             1866  1866*  1333*
* HP Smart Memory required
DPC = DIMM per Channel
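As a rough illustration of what these data rates mean, the table can be held as a lookup and converted into a theoretical peak bandwidth per channel. This is a sketch under stated assumptions: the names `DATA_RATES` and `peak_channel_bandwidth_gbs` are hypothetical, and the 8 bytes per transfer assumes the usual 64-bit data path per channel with ECC bits excluded. Measured throughput will be lower than this ceiling.

```python
# (generation, voltage) -> {DIMMs per channel: (data rate in MT/s, needs HP Smart Memory)}
# Values transcribed from the data-rate table; the flag mirrors the '*' footnote.
DATA_RATES = {
    ("DDR4", "1.20 V"): {1: (2133, False), 2: (2133, True), 3: (1866, True)},
    ("DDR3", "1.35 V"): {1: (1600, False), 2: (1600, True), 3: (1066, True)},
    ("DDR3", "1.50 V"): {1: (1866, False), 2: (1866, True), 3: (1333, True)},
}

BYTES_PER_TRANSFER = 8  # assumption: 64-bit data bus per channel, ECC excluded


def peak_channel_bandwidth_gbs(data_rate_mts: int) -> float:
    """Theoretical peak bandwidth of one channel in GB/s (10^9 bytes per second)."""
    return data_rate_mts * BYTES_PER_TRANSFER / 1000.0


if __name__ == "__main__":
    for (generation, voltage), by_dpc in DATA_RATES.items():
        for dpc, (rate, smart_memory) in by_dpc.items():
            note = " (HP Smart Memory required)" if smart_memory else ""
            peak = peak_channel_bandwidth_gbs(rate)
            print(f"{generation} {voltage} {dpc}DPC: {rate} MT/s, "
                  f"peak {peak:.1f} GB/s per channel{note}")
```

The same multiplication explains the DDR3 module naming on the label slide: 1866 MT/s times 8 bytes is roughly 14900 MB/s.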
DDR4 and DDR3 Idle Latency
DDR4 1-rank DIMMs have lower latency than 2-rank DIMMs
DDR3 measured on ProLiant DL360 Gen8
DDR4 measured on ProLiant DL360 Gen9
DDR4 RDIMM relative to DDR3 16GB 2Rx4 RDIMM at 1 DIMM per Channel (DPC)
DDR4 offers higher throughput at lower power and latency than DDR3
DDR3 measured on ProLiant DL360 Gen8
DDR4 measured on ProLiant DL360 Gen9
Resilience
• DDR4 provides retry on Address Parity and Uncorrectable Errors
  – More immune to transient errors
  – Previously caused a system shutdown
  – Longer Mean Time Between Failures
• DDR3 causes a system shutdown with Address Parity or UC Errors
DDR4 is more resilient than DDR3
Factors that affect Throughput and Latency
1. Number of populated channels
2. Number of active cores
3. Data rate and CAS latency
4. Percentage of write traffic
5. Number of ranks on the channel
Load-to-use Idle Latency and CAS Latency
• CAS latency increases with increasing Data Rate
• Idle latency decreases with increasing Data Rate
• Because the internal memory controller clock is the same as the Data Rate
• Same latencies with HW pre-fetcher disabled
[Chart: CAS Latency (ns); data labels CL15, CL13, CL11, CL9]
Remote snoop latency == Local memory latency
Measured on ProLiant DL360 Gen9
Factors affecting Throughput
Writes
Measured on ProLiant DL360 Gen9
Impact of Throughput on Latency
Stair step is due to cores on alternating sockets
Measured on ProLiant DL360 Gen9
Impact of Throughput on Power
[Chart: DIMM Power (W), 0.0 to 5.0, versus Throughput (GB/s), 0 to 125, on a DL360 Gen9 2-Socket Server with 8x 16GB 2Rx4 RDIMM at 1DPC. Series: DDR4-2133, DDR4-1866, DDR4-1600.]
• Power consumption is dominated by throughput, not Data Rate
  – Stair step is due to cores on alternating sockets
Measured on ProLiant DL360 Gen9
8GB 1Rx4 relative to 2Rx8 RDIMM
Same number of DRAMs
1Rx4
• Lower Latency
• x4 ECC Protection
2Rx8
• More Throughput
• Lower Power
• Less ECC Protection
Measured on ProLiant DL360 Gen9
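The "same number of DRAMs" remark can be checked with a little arithmetic. The sketch below assumes each rank of an ECC DIMM is 72 bits wide (64 data bits plus 8 ECC bits), which is the usual arrangement for ECC RDIMMs but is not stated on the slide; the function name `dram_devices` is introduced here for illustration.

```python
RANK_WIDTH_BITS = 72  # assumption: 64 data bits + 8 ECC bits per rank


def dram_devices(ranks: int, device_width_bits: int) -> int:
    """Number of DRAM devices on a DIMM with the given rank count and device width."""
    if RANK_WIDTH_BITS % device_width_bits != 0:
        raise ValueError("device width must divide the rank width evenly")
    return ranks * (RANK_WIDTH_BITS // device_width_bits)


if __name__ == "__main__":
    print("1Rx4:", dram_devices(ranks=1, device_width_bits=4), "devices")
    print("2Rx8:", dram_devices(ranks=2, device_width_bits=8), "devices")
```

Under that assumption both organisations use 18 devices, so the latency, throughput, power and ECC differences listed above come from how the devices are arranged into ranks, not from how many there are.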
Wrap-up
1. Comparison of DDR3 and DDR4
  – DDR4 has superior throughput, latency, power and resiliency
2. DDR4 Memory Throughput and Latency
  – Affected by occupied channels, DPC, data rates, ranks per channel, active cores, writes
3. A tale of two 8GB DDR4 DIMMs
  – Single-rank has better latency and ECC protection than dual-rank
The End – Many Thanks!