B. Ramamurthy. 12 stage pipeline At peak speed, the processor can request both an instruction and...
12-stage pipeline. At peak speed, the processor can request both an instruction and a data word on every clock cycle.
We cannot afford pipeline stalls. Solution: add caches, one for instructions and one for data.
Each cache is 16 KB, with 16-word blocks.
Send address to the appropriate cache. The address comes from either the PC or from the ALU.
If the cache signals a hit, the requested word is available on the data lines.
Since there are 16 words in the desired block, we need to select the right word.
The block index field of the address (the word offset within the block) selects the requested word from the 16 words in the indexed block.
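The field selection described above can be sketched in a few lines of Python. This is only an illustration under assumptions the slides do not state: 32-bit byte addresses, 4-byte words, and a direct-mapped cache. The function name split_address and the constant names are made up for this sketch.

```python
# Sketch: splitting an address for a 16 KB cache with 16-word blocks.
# Assumptions (not stated in the slides): 32-bit byte addresses,
# 4-byte words, direct-mapped organization.
CACHE_BYTES = 16 * 1024
WORDS_PER_BLOCK = 16
BYTES_PER_WORD = 4

BLOCK_BYTES = WORDS_PER_BLOCK * BYTES_PER_WORD      # 64 bytes per block
NUM_BLOCKS = CACHE_BYTES // BLOCK_BYTES             # 256 blocks

BYTE_BITS = BYTES_PER_WORD.bit_length() - 1         # 2 bits: byte within word
WORD_BITS = WORDS_PER_BLOCK.bit_length() - 1        # 4 bits: word within block
INDEX_BITS = NUM_BLOCKS.bit_length() - 1            # 8 bits: block within cache
TAG_BITS = 32 - INDEX_BITS - WORD_BITS - BYTE_BITS  # 18 bits left for the tag


def split_address(addr):
    """Return (tag, index, word_in_block, byte_in_word) for a byte address."""
    byte_in_word = addr & (BYTES_PER_WORD - 1)
    word_in_block = (addr >> BYTE_BITS) & (WORDS_PER_BLOCK - 1)
    index = (addr >> (BYTE_BITS + WORD_BITS)) & (NUM_BLOCKS - 1)
    tag = addr >> (BYTE_BITS + WORD_BITS + INDEX_BITS)
    return tag, index, word_in_block, byte_in_word


tag, index, word, byte = split_address(0x12345678)
print(hex(tag), index, word, byte)
```

Under these assumptions the index picks one of 256 blocks, and the 4-bit word field drives the 16-to-1 selection of the requested word.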
If the cache signals a miss, we send the address to main memory, fetch the block from main memory, and fill the cache. The read is then retried and hits.
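The hit and miss handling above can be modeled with a small simulation. This is a minimal sketch, not the hardware design: it assumes a direct-mapped cache, uses word addresses instead of byte addresses, and represents main memory as a plain Python list. The class name DirectMappedCache and its fields are hypothetical.

```python
# Sketch of the read path: look up, fill the block on a miss, return the word.
# Assumptions: direct-mapped cache, word addresses, memory modeled as a list.
class DirectMappedCache:
    def __init__(self, num_blocks=256, words_per_block=16):
        self.num_blocks = num_blocks
        self.words_per_block = words_per_block
        self.valid = [False] * num_blocks
        self.tags = [None] * num_blocks
        self.data = [None] * num_blocks
        self.hits = 0
        self.misses = 0

    def read_word(self, word_addr, memory):
        # Split the word address into tag, index, and word-in-block fields.
        block_addr, word_in_block = divmod(word_addr, self.words_per_block)
        tag, index = divmod(block_addr, self.num_blocks)

        if self.valid[index] and self.tags[index] == tag:
            self.hits += 1
        else:
            # Miss: fetch the whole block from main memory and fill the cache.
            self.misses += 1
            start = block_addr * self.words_per_block
            self.data[index] = memory[start:start + self.words_per_block]
            self.tags[index] = tag
            self.valid[index] = True

        # The block is now present, so the word can be selected from it.
        return self.data[index][word_in_block]


memory = list(range(10000))          # word i simply holds the value i
cache = DirectMappedCache()
cache.read_word(5, memory)           # miss: block 0 is loaded
cache.read_word(6, memory)           # hit: same block
print(cache.hits, cache.misses)
```

The second read hits because the first miss brought in all 16 words of the block, which is the benefit of multiword blocks for sequential accesses.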
Let's look at the schematic of the organization: Fig. 7.9
Assume that on a cache miss we need:
1 memory cycle to send the address to main memory
15 memory cycles to read each DRAM word (assume the bus width is 32 bits = 4 bytes)
1 memory cycle to send each word of data back
Assume a block of 4 words for this calculation (smaller than the 16-word block above).
Total for block access: 1 + 4 X 15 + 4 X 1 = 1 + 60 + 4 = 65 cycles
Bytes received = 1 block of cache = 4 X 4 = 16 bytes
Bytes/cycle = 16/65 ≈ 0.25 (too low for our fast processor!)
What is your solution? We need better bandwidth. Increase the bus width? Interleave memory? Use a wide memory organization? See Fig. 7.11
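The miss penalty arithmetic, and the effect of the options just listed, can be checked with a short calculation. This sketch reuses the timing assumed above (1 cycle for the address, 15 cycles per DRAM access, 1 cycle per bus transfer) and a 4-word block. The function names and the simple timing models for wide and interleaved memory are assumptions made for illustration, not taken from the slides.

```python
import math

# Assumed timing, in memory cycles, matching the example above.
ADDR_CYCLES = 1       # send the address
DRAM_CYCLES = 15      # one DRAM access
XFER_CYCLES = 1       # one transfer over the bus
BLOCK_WORDS = 4
BYTES_PER_WORD = 4


def penalty_wide(width_words):
    """Memory and bus are width_words wide; each access moves that many words."""
    accesses = math.ceil(BLOCK_WORDS / width_words)
    return ADDR_CYCLES + accesses * DRAM_CYCLES + accesses * XFER_CYCLES


def penalty_interleaved(banks):
    """One-word bus; banks are accessed in parallel, words sent one at a time."""
    rounds = math.ceil(BLOCK_WORDS / banks)
    return ADDR_CYCLES + rounds * DRAM_CYCLES + BLOCK_WORDS * XFER_CYCLES


def bytes_per_cycle(penalty):
    return BLOCK_WORDS * BYTES_PER_WORD / penalty


for name, cycles in [
    ("one-word-wide memory", penalty_wide(1)),
    ("two-word-wide memory", penalty_wide(2)),
    ("four-word-wide memory", penalty_wide(4)),
    ("four interleaved banks", penalty_interleaved(4)),
]:
    print(f"{name}: {cycles} cycles, {bytes_per_cycle(cycles):.2f} bytes/cycle")
```

Under this model the one-word-wide case reproduces the 65 cycles computed above, a four-word-wide memory needs 17 cycles, and four interleaved banks need 20 cycles while keeping the narrow bus.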