Computer Architecture 2009 – PC Structure and Peripherals 1 Computer Architecture...

  • date post

    19-Dec-2015

Transcript of Computer Architecture 2009 – PC Structure and Peripherals 1 Computer Architecture...

  • Slide 1
  • Computer Architecture 2009 PC Structure and Peripherals 1 Computer Architecture PC Structure and Peripherals Dr. Lihu Rappoport
  • Slide 2
  • Memory
  • Slide 3
  • SRAM vs. DRAM
    • Random access: access time is the same for all locations

                     DRAM (Dynamic RAM)             SRAM (Static RAM)
      Refresh        Regular refresh (~1% of time)  No refresh needed
      Address        Multiplexed: row + column      Not multiplexed
      Access         Not true random access         True random access
      Density        High (1 transistor/bit)        Low (6 transistors/bit)
      Power          Low                            High
      Speed          Slow                           Fast
      Price/bit      Low                            High
      Typical usage  Main memory                    Cache
  • Slide 4
  • Technology Trends

                Capacity         Speed
      Logic     2× in 3 years    2× in 3 years
      DRAM      4× in 3 years    1.4× in 10 years
      Disk      2× in 3 years    1.4× in 10 years

    [Figure: CPU-DRAM memory gap (latency)]
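The speed factors in the trends table compound over time, which is what opens the CPU-DRAM latency gap. A minimal sketch of that arithmetic, assuming each factor holds steadily over its stated period (the function name `growth` is illustrative and not from the slides):

```python
def growth(factor: float, period_years: float, years: float) -> float:
    """Compound growth after `years`, given `factor` per `period_years`."""
    return factor ** (years / period_years)

# Speed column of the table: logic 2x per 3 years, DRAM 1.4x per 10 years.
logic_speedup = growth(2.0, 3.0, 10.0)   # roughly 10x over a decade
dram_speedup = growth(1.4, 10.0, 10.0)   # 1.4x over a decade
gap = logic_speedup / dram_speedup       # roughly 7x wider gap per decade

print(f"logic {logic_speedup:.1f}x, DRAM {dram_speedup:.1f}x, gap {gap:.1f}x")
```

Under these assumptions, logic speed pulls ahead of DRAM speed by about a factor of seven every ten years.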
  • Slide 5
  • Basic DRAM chip
    • Addressing sequence
      • Row address is driven, then RAS# is asserted
      • RAS#-to-CAS# delay
      • Column address is driven, then CAS# is asserted
      • Data transfer
    [Figure: DRAM chip block diagram with labels: memory address bus (Addr), row latch, row address decoder, column latch, column address decoder, memory array, RAS#, CAS#, Data]
  • Slide 6
  • Addressing sequence
    • Access sequence
      • Put the row address on the address bus and assert RAS#
      • Wait for the RAS#-to-CAS# delay (tRCD)
      • Put the column address on the address bus and assert CAS#
      • Data transfer
      • Precharge
    [Figure: timing diagram of RAS#, CAS#, A[0:7] (Row i, Col n, Row j) and Data (Data n), annotated with tRAC (access time), RAS/CAS delay, CL (CAS latency) and precharge delay]
  • Slide 7
  • Basic SDRAM controller
    • DRAM data must be periodically refreshed
      • Needed to keep the data correct
      • The DRAM controller performs the DRAM refresh, using a refresh counter
    [Figure: controller block diagram with labels: DRAM address decoder, time delay generator, address mux, RAS#, CAS#, R/W#, A[20:23], A[10:19], A[0:9], memory address bus, D[0:7], Select, Chip select]
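The controller diagram splits the address into three bit fields: A[0:9], A[10:19] and A[20:23]. A small sketch of that split follows. It assumes that A[0:9] is the column, A[10:19] is the row, and A[20:23] goes to the address decoder that produces the chip select; the transcript lists the fields but does not say which one is the row, so this mapping is a guess, and `split_address` is a hypothetical helper.

```python
def split_address(addr: int) -> dict:
    """Split a 24-bit address A[0:23] into the three fields on the slide.

    Assumption (not stated in the transcript): A[0:9] is the column,
    A[10:19] is the row, and A[20:23] selects the chip.
    """
    if not 0 <= addr < (1 << 24):
        raise ValueError("address must fit in 24 bits")
    return {
        "column": addr & 0x3FF,        # A[0:9], 10 bits
        "row": (addr >> 10) & 0x3FF,   # A[10:19], 10 bits
        "chip": (addr >> 20) & 0xF,    # A[20:23], 4 bits
    }

fields = split_address(0x5ABCDE)
print(fields)
```

The address mux would then drive the row field and the column field onto the same ten memory address lines, one after the other.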
  • Slide 8
  • Improved DRAM Schemes
    • Paged Mode DRAM
      • Multiple accesses to different columns of the same row
      • Saves the RAS and RAS-to-CAS delays on the subsequent accesses
      [Figure: timing diagram of RAS#, CAS#, A[0:7] (Row, Col n, Col n+1, Col n+2) and Data (D n, D n+1, D n+2)]
    • Extended Data Output RAM (EDO RAM)
      • A data output latch lets the next column address overlap with the current column's data
      [Figure: timing diagram of RAS#, CAS#, A[0:7] (Row, Col n, Col n+1, Col n+2) and Data (Data n, Data n+1, Data n+2)]
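The saving from paged mode can be sketched with a simplified timing model in which each access costs a row activation, a column access and a precharge. The model and the timing values below are assumptions chosen for illustration only; they do not come from the slides or from any datasheet.

```python
def row_access_time_ns(n_accesses: int, t_rcd: float, t_cas: float,
                       t_rp: float, page_mode: bool) -> float:
    """Time to read n columns of one row under a simplified timing model."""
    if page_mode:
        # Open the row once, stream the columns, precharge once at the end.
        return t_rcd + n_accesses * t_cas + t_rp
    # Basic DRAM: every access repeats the row activation and the precharge.
    return n_accesses * (t_rcd + t_cas + t_rp)

# Hypothetical timings in nanoseconds, for illustration only.
basic = row_access_time_ns(4, t_rcd=20, t_cas=20, t_rp=20, page_mode=False)
paged = row_access_time_ns(4, t_rcd=20, t_cas=20, t_rp=20, page_mode=True)
print(basic, paged)
```

With these made-up numbers, four accesses to the same row take half the time in paged mode, because the row is activated and precharged once instead of four times.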
  • Slide 9
  • Improved DRAM Schemes (cont.)
    • Burst DRAM
      • Generates consecutive column addresses by itself
      [Figure: timing diagram of RAS#, CAS#, A[0:7] (Row, Col n) and Data (Data n, Data n+1, Data n+2)]
  • Slide 10
  • Synchronous DRAM (SDRAM)
    • All signals are referenced to an external clock (100 MHz-200 MHz)
      • Makes timing more precise with other system devices
    • Multiple banks
      • Multiple pages open simultaneously (one per bank)
    • Command-driven functionality instead of signal-driven
      • ACTIVE: selects both the bank and the row to be activated
        • ACTIVE to a new bank can be issued while accessing the current bank
      • READ/WRITE: select the column
    • Read and write accesses to the SDRAM are burst oriented
      • Successive column locations are accessed in the given row
      • Burst length is programmable: 1, 2, 4, 8, and full-page
        • A full-page burst may be ended by BURST TERMINATE to get an arbitrary burst length
    • A user-programmable Mode Register
      • CAS latency, burst length, burst type
    • Auto pre-charge: may close the row at the last read/write in a burst
    • Auto refresh: internal counters generate the refresh address
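The command-driven, multi-bank behaviour can be illustrated with a toy model that remembers which row is open in each bank and lists the commands a read would need. This is a sketch under simplifying assumptions: the class `SdramBanks` and its method are hypothetical, timing is ignored, and an explicit PRECHARGE is issued to close a row instead of the auto pre-charge option.

```python
class SdramBanks:
    """Tracks the open row of each bank and lists the commands a read needs."""

    def __init__(self, n_banks: int = 4):
        self.open_row = [None] * n_banks   # None means no row is open

    def read(self, bank: int, row: int, col: int) -> list:
        cmds = []
        if self.open_row[bank] != row:
            if self.open_row[bank] is not None:
                # A different row is open in this bank: close it first.
                cmds.append(f"PRECHARGE bank={bank}")
            cmds.append(f"ACTIVE bank={bank} row={row}")
            self.open_row[bank] = row
        cmds.append(f"READ bank={bank} col={col}")
        return cmds

mem = SdramBanks()
first = mem.read(0, row=7, col=1)        # row must be activated first
same_row = mem.read(0, row=7, col=2)     # row already open: READ only
other_bank = mem.read(1, row=3, col=0)   # bank 0 keeps its row open
conflict = mem.read(0, row=9, col=0)     # new row in bank 0
```

Because each bank keeps its own open row, the access to bank 1 does not disturb the row that is open in bank 0.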
  • Slide 11
  • SDRAM Timing
    • tRCD: ACTIVE to READ/WRITE gap, in clock cycles = ⌈tRCD(MIN) / clock period⌉
    • tRC: minimum gap between successive ACTIVE commands to different rows in the same bank
    • tRRD: minimum gap between successive ACTIVE commands to different banks
    [Figure: SDRAM timing diagram, BL = 1]
  • Slide 12
  • DDR-SDRAM
    • 2n-prefetch architecture
      • The DRAM cells are clocked at the same speed as SDR SDRAM
      • The internal data bus is twice the width of the external data bus
      • Data capture occurs twice per clock cycle
        • Lower half of the bus is sampled at clock rise
        • Upper half of the bus is sampled at clock fall
    • Uses 2.5 V (vs. 3.3 V in SDRAM)
      • Reduced power consumption
    [Figure: SDRAM array with a 2n-bit internal bus (0:2n-1), split into halves 0:n-1 and n:2n-1 onto an n-bit external bus (0:n-1), 200 MHz clock]
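The 2n-prefetch idea can be sketched as splitting one 2n-bit internal word into two n-bit external transfers, the lower half on the rising edge and the upper half on the falling edge. The function name `ddr_transfers` and the 8-bit external width used in the example are illustrative assumptions.

```python
def ddr_transfers(word: int, n: int) -> tuple:
    """Split one 2n-bit internal word into the two n-bit external transfers."""
    if not 0 <= word < (1 << (2 * n)):
        raise ValueError("word must fit in 2n bits")
    mask = (1 << n) - 1
    rising = word & mask            # bits 0:n-1, sent at the clock rise
    falling = (word >> n) & mask    # bits n:2n-1, sent at the clock fall
    return rising, falling

# One 16-bit internal word over a hypothetical 8-bit external bus.
rising, falling = ddr_transfers(0xBEEF, n=8)
print(hex(rising), hex(falling))
```

The array is read once per clock, yet the external bus carries two transfers per clock, which is where the doubled data rate comes from.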
  • Slide 13
  • DDR SDRAM Timing
    [Figure: timing diagram with a 133 MHz clock and rows for cmd, Bank, Addr and Data; CL = 2. Commands shown: ACT Bank 0 Row i; RD Bank 0 Col j (tRCD > 20 ns); ACT Bank 1 Row m (tRRD > 20 ns); RD Bank 1 Col n; ACT Bank 0 Row l (tRC > 70 ns); NOP elsewhere. Data: j, j+1, j+2, j+3, then n, n+1, n+2, n+3]
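The nanosecond constraints in the timing diagram have to be rounded up to whole clock cycles before the controller can use them. A minimal sketch using the diagram's values (133 MHz clock, tRCD and tRRD of 20 ns, tRC of 70 ns), treating each value as a minimum; the helper name `ns_to_cycles` is an assumption:

```python
import math

def ns_to_cycles(t_ns: float, clock_mhz: float) -> int:
    """Smallest whole number of clock cycles that covers t_ns."""
    return math.ceil(t_ns * clock_mhz / 1000.0)

CLOCK_MHZ = 133
timings_ns = {"tRCD": 20, "tRRD": 20, "tRC": 70}
cycles = {name: ns_to_cycles(t, CLOCK_MHZ) for name, t in timings_ns.items()}
print(cycles)
```

At 133 MHz the clock period is about 7.5 ns, so a 20 ns constraint costs three cycles and a 70 ns constraint costs ten.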
  • Slide 14
  • DIMMs
    • DIMM: Dual In-line Memory Module
      • A small circuit board that holds memory chips
    • 64-bit wide data path (72 bits with parity)
      • Single sided: 9 chips, each with an 8-bit data bus
        • 512 Mbit/chip × 8 chips = 512 MByte per DIMM
      • Dual sided: 18 chips, each with a 4-bit data bus
        • 256 Mbit/chip × 16 chips = 512 MByte per DIMM
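The DIMM arithmetic can be checked with a few lines. This sketch assumes that the chips beyond the 8 (or 16) counted toward capacity hold the parity bits, which is inferred from the 9-versus-8 and 18-versus-16 chip counts and is not stated outright on the slide; `dimm_summary` is a hypothetical helper.

```python
def dimm_summary(total_chips: int, bits_per_chip: int, chip_mbit: int,
                 parity_chips: int) -> dict:
    """Bus width and data capacity of a DIMM built from identical chips."""
    data_chips = total_chips - parity_chips
    return {
        "bus_bits": total_chips * bits_per_chip,     # width including parity
        "data_bits": data_chips * bits_per_chip,     # width without parity
        "capacity_mbyte": data_chips * chip_mbit // 8,
    }

# Assumed parity chip counts: 1 of 9 (single sided), 2 of 18 (dual sided).
single = dimm_summary(9, 8, 512, parity_chips=1)
dual = dimm_summary(18, 4, 256, parity_chips=2)
print(single, dual)
```

Both configurations come out at a 72-bit bus, a 64-bit data path and 512 MByte of data capacity.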
  • Slide 15
  • DRAM Standards
    • SDR SDRAM: PC66, PC100 and PC133
    • DDR SDRAM

                                    DDR200  DDR266  DDR333  DDR400  DDR533
      Bus freq (MHz)                100     133     167     200     266
      Bit/pin (Mbps)                200     266     333     400     533
      Total bandwidth (MByte/sec)   1600    2133    2666    3200    4264

      • Total BW for DDR400: 3200 MByte/sec = 64 bit × 2 × 200 MHz / 8 (bit/byte)
    • Dual channel DDR SDRAM
      • Uses two 64-bit DIMM modules in parallel to get a 128-bit data bus
      • Total BW for DDR400 dual channel: 6400 MByte/sec = 128 bit × 2 × 200 MHz / 8
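The bandwidth formula on this slide is bus width times transfers per clock times bus frequency, divided by 8 bits per byte. A short sketch of it, with an illustrative function name:

```python
def peak_bandwidth_mbyte_s(bus_mhz: float, bus_bits: int = 64,
                           transfers_per_clock: int = 2) -> float:
    """Peak bandwidth in MByte/sec: width x transfers per clock x clock / 8."""
    return bus_bits * transfers_per_clock * bus_mhz / 8

ddr400 = peak_bandwidth_mbyte_s(200)                      # single channel
ddr400_dual = peak_bandwidth_mbyte_s(200, bus_bits=128)   # dual channel
pc100 = peak_bandwidth_mbyte_s(100, transfers_per_clock=1)  # SDR SDRAM
print(ddr400, ddr400_dual, pc100)
```

Setting `transfers_per_clock=1` gives the SDR SDRAM figures, so the same formula covers PC66, PC100 and PC133.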
  • Slide 16
  • DRAM Standards

      Label    Name          Effective Clock Rate  Data Bus    Bandwidth
      PC66     SDRAM         66 MHz                64 bit      0.5 GB/s
      PC100    SDRAM         100 MHz               64 bit      0.8 GB/s
      PC133    SDRAM         133 MHz               64 bit      1.06 GB/s
      PC1600   DDR200        100 MHz               64 bit      1.6 GB/s
      PC1600   DDR200 Dual   100 MHz               2 × 64 bit  3.2 GB/s
      PC2100   DDR266        133 MHz               64 bit      2.1 GB/s
      PC2100   DDR266 Dual   133 MHz               2 × 64 bit  4.2 GB/s
      PC2700   DDR333        166 MHz               64 bit      2.7 GB/s
      PC2700   DDR333 Dual   166 MHz               2 × 64 bit  5.4 GB/s
      PC3200   DDR400        200 MHz               64 bit      3.2 GB/s
      PC3200   DDR400 Dual   200 MHz               2 × 64 bit  6.4 GB/s
      PC4200   DDR533        266 MHz               64 bit      4.2 GB/s
      PC4200   DDR533 Dual   266 MHz               2 × 64 bit  8.4 GB/s
  • Slide 17
  • DDR Memory Performance
    Source: http://www.tomshardware.com/
  • Slide 18
  • DDR2
    • DDR2 achieves high speed using a 4-bit prefetch architecture
      • The SDRAM cells read/write 4× the amount of data as the external bus
      • A DDR2-533 cell works at the same frequency as a DDR266 SDRAM cell or a PC133 SDRAM cell
    • This method comes at the price of increased latency
      • DDR2-based systems may perform worse than DDR1-based systems
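The claim that DDR2-533, DDR266 and PC133 cells run at the same frequency follows from dividing the per-pin data rate by the prefetch depth. A small sketch, taking the prefetch depth of SDR SDRAM to be 1 (an assumption, since the slides only give 2 for DDR and 4 for DDR2) and using the nominal data rates from the names:

```python
def cell_clock_mhz(data_rate_mbps: float, prefetch: int) -> float:
    """DRAM cell (core) frequency implied by a data rate and prefetch depth."""
    return data_rate_mbps / prefetch

pc133 = cell_clock_mhz(133, prefetch=1)      # SDR SDRAM, assumed prefetch 1
ddr266 = cell_clock_mhz(266, prefetch=2)     # DDR, 2n prefetch
ddr2_533 = cell_clock_mhz(533, prefetch=4)   # DDR2, 4-bit prefetch
print(pc133, ddr266, ddr2_533)
```

All three land at about 133 MHz; the small differences come only from the rounded nominal data rates.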
  • Slide 19
  • DDR2 Other Features
    • Shortened page size for reduced activation power
      • Each time an ACTIVATE command is given, all bits in the page are read
      • This is a major contributor to the active power
      • A device with a shorter page size has significantly lower power
      • The 512Mb DDR2 page size is 1 KByte vs. 2 KByte for 512Mb DDR1
    • Eight banks in 1Gb densities and above
      • Increases flexibility in DRAM accesses
      • Also increases the power
  • Slide 20
  • DDR2 vs DDR1 SDRAM

                         DDR1                      DDR2
      Data Bus           64 bit                    64 bit
      Data Rate          200/266/333/400 Mbps      400/533/667/800 Mbps
      Bus Frequency      100/133/166/200 MHz       200/266/333/400 MHz
      DRAM Frequency     100/133/166/200 MHz       100/133/166/200 MHz
      Operation Voltage  2.5 V                     1.8 V
      Package            TSOP                      FBGA
      Densities          128Mb~1Gb                 256Mb~2Gb
      Prefetch size      2 bits                    4 bits
      Burst length       2/4/8                     4/8
      CAS Latency        2, 2.5, 3                 3, 4, 5
      Data Bandwidth     3.2 GB/s                  6.4 GB/s
      Power Consumption  399 mW                    217 mW
  • Slide 21
  • DDR2 Latency
    • Many DDR2-533 modules have 4-4-4 timings (CAS Latency - RAS to CAS Delay - RAS Precharge Time)
      • 1.5× the latency of DDR400 with 2-3-2 timings
      • The 30% growth in bandwidth does not compensate for the worse access time
    • DDR2-533 latency improves considerably at 3-3-3 timings
      • Only 12% worse than the latency of 2-3-2 DDR400

      Memory     Timings   Latency   Dual-channel BW
      DDR400     2.5-3-3   12.5 ns   6.4 GB/sec
      DDR400     2-3-2     10 ns     6.4 GB/sec
      DDR533     3-4-4     11.2 ns   8.5 GB/sec
      DDR533     2.5-3-3   9.4 ns    8.5 GB/sec
      DDR2-533   5-5-5     18.8 ns   8.5 GB/sec
      DDR2-533   4-4-4     15 ns     8.5 GB/sec
      DDR2-533   3-3-3     11.2 ns   8.5 GB/sec
      DDR2-600   5-5-5     16.6 ns   9.6 GB/sec
      DDR2-600   4-4-4     13.3 ns   9.6 GB/sec
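The latency column is consistent with multiplying the CAS latency (in cycles) by the bus clock period, where the bus clock is half the per-pin data rate. A sketch of that calculation, assuming this is how the table was derived and using the nominal data rates from the module names:

```python
def cas_latency_ns(cl: float, data_rate_mbps: float) -> float:
    """CAS latency in ns: CL cycles at a bus clock of half the data rate."""
    bus_clock_mhz = data_rate_mbps / 2   # DDR: two transfers per bus clock
    return cl * 1000.0 / bus_clock_mhz

ddr400_232 = cas_latency_ns(2, 400)      # 2-3-2 DDR400
ddr2_533_444 = cas_latency_ns(4, 533)    # 4-4-4 DDR2-533
ddr2_533_333 = cas_latency_ns(3, 533)    # 3-3-3 DDR2-533

print(f"{ddr400_232:.1f} ns, {ddr2_533_444:.1f} ns, {ddr2_533_333:.1f} ns")
print(f"4-4-4 vs 2-3-2: {ddr2_533_444 / ddr400_232:.2f}x")
print(f"3-3-3 vs 2-3-2: {ddr2_533_333 / ddr400_232:.2f}x")
```

The ratios come out near 1.5 and 1.12, in line with the 1.5× and 12% figures quoted on the slide.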
  • Slide 22
  • DDR2 Latency (cont.)
    • Performance tests
      • DDR2-533 with 4-4-4 timings: worse than DDR400 with 2-3-2 timings
      • DDR2-533 with 3-3-3 timings: better than DDR400 with 2-3-2 timings
    • DDR2-533 modules with 3-3-3 timings
      • Supported by the i925/i915
      • Best choice for enthusiast users
      • Significant improvement
    • Overclocked motherboards clock DDR2-533 at 600 MHz
      • Realized through undocumented memory frequency ratios available in the i925/i915
    • The performance of DDR2-based systems is more sensitive to lower latency than to higher frequency
      • Practically nothing is gained from using DDR2-600 SDRAM with the i925/i915
  • Slide 23
  • DDR3
    • 30% reduction in power consumption compared to DDR2
      • 1.5 V supply voltage, compared to DDR2's 1.8 V or DDR's 2.5 V
      • 90 nanometer fabrication technology
    • Higher bandwidth
      • 8 bit deep prefetch buffer (vs. 4 bit in DDR2 and 2 bit