1
Appendix A
• Pipeline implementation
• Pipeline hazards, detection and forwarding
• Multiple-cycle operations
• MIPS R4000
CDA5155 Spring, 2007, Peir / University of Florida
2
Limits of Pipelining
• Increasing the number of pipeline stages in a
given logic block by a factor of n generally allows
increasing clock speed & throughput by a factor
of almost n.
– Usually less than n because of overheads such as
latches and the need to balance delay across stages.
• But pipelining has a natural limit:
– At least 1 layer of logic gates per pipeline stage!
– Practical minimum is usually several gates (2-10).
– Commercial designs are approaching this point!!
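The "almost n" claim can be illustrated with a rough model. This is only a sketch: it assumes perfectly balanced stages and a fixed per-stage latch overhead, and the function name and the delay numbers are made up for illustration.

```python
def pipelined_speedup(total_logic_delay, n_stages, latch_overhead):
    """Speedup of an n-stage pipeline over the unpipelined logic block.

    Assumption: stages are perfectly balanced, so each stage takes
    total_logic_delay / n_stages plus a fixed latch overhead.
    """
    unpipelined_cycle = total_logic_delay
    pipelined_cycle = total_logic_delay / n_stages + latch_overhead
    return unpipelined_cycle / pipelined_cycle


# Hypothetical numbers: 10 ns of logic, 0.5 ns of latch overhead per stage.
for n in (5, 10, 20):
    print(n, round(pipelined_speedup(10.0, n, 0.5), 2))
```

With these assumed numbers the speedup falls further behind n as n grows, because the fixed overhead becomes a larger share of each cycle.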
3
Simple RISC Datapath
4
Basic RISC Pipelining
• Basic idea:
– Each instruction spends 1 clock cycle in each of the 5
execution stages.
– During 1 clock cycle, the pipeline can be processing
(different stages of) 5 different instructions.
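The overlap can be made concrete by printing a timing diagram. The sketch below assumes the usual stage names IF, ID, EX, MEM, WB and an ideal pipeline with no stalls; the helper names are hypothetical.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]  # assumed classic 5-stage names


def pipeline_diagram(n_instructions):
    """Row i is instruction i; column c is clock cycle c (0-based, no stalls)."""
    n_cycles = n_instructions + len(STAGES) - 1
    rows = []
    for i in range(n_instructions):
        row = [""] * n_cycles
        for s, name in enumerate(STAGES):
            row[i + s] = name  # instruction i enters stage s in cycle i + s
        rows.append(row)
    return rows


def stages_active_in_cycle(rows, cycle):
    """Stages occupied by some instruction during the given cycle."""
    return [row[cycle] for row in rows if row[cycle]]


rows = pipeline_diagram(5)
for row in rows:
    print(" ".join(f"{cell:>3}" for cell in row))
```

In the fifth clock cycle (index 4) every stage is busy with a different instruction.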
5
Adding Pipeline Registers
6
Operations of Pipe Stages
7
Pipeline Hazards
• Hazards are circumstances which may lead to stalls (delays, “bubbles”) in the pipeline if not addressed.
• Three major types:
– Structural hazards
• Lack of HW resources to keep all instructions moving.
– Data hazards
• Data results of earlier instrs. not yet avail. when needed.
– Control hazards
• Control decisions resulting from earlier instrs. (branches) not yet made; don’t know which new instrs. to execute.
8
Structural Hazard Example
Suppose you had a combined instruction+data memory with only 1 read port
9
Hazards Produce “Bubbles”
10
Another View
11
Example Data Hazard
12
Forwarding for Data Hazards
13
Another Forwarding Example
14
Three Types of Data Hazards
• Let i be an earlier instruction, j a later one.
• RAW (read after write)
– j tries to read a value before i writes it.
• WAW (write after write)
– i and j write to same place, but in the wrong order.
– Only occurs if >1 pipeline stage can write.
• WAR (write after read)
– j writes a new value to a location before i has read the
old one.
– Only occurs if writes can happen before reads in pipeline.
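The three definitions reduce to set intersections on the registers each instruction reads and writes. The sketch below only finds the dependences that could become hazards; whether they actually do depends on the pipeline, as the bullets note. The function and argument names are hypothetical.

```python
def classify_hazards(i_reads, i_writes, j_reads, j_writes):
    """Potential hazards between earlier instruction i and later instruction j.

    Each argument is a set of register names.
    """
    hazards = set()
    if i_writes & j_reads:
        hazards.add("RAW")  # j reads something i writes
    if i_writes & j_writes:
        hazards.add("WAW")  # both write the same place
    if i_reads & j_writes:
        hazards.add("WAR")  # j overwrites something i still has to read
    return hazards


# i: add r1, r2, r3    j: sub r4, r1, r5
print(classify_hazards({"r2", "r3"}, {"r1"}, {"r1", "r5"}, {"r4"}))
```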
15
An Unavoidable Stall - Load
16
Stalling for Load Dependent
17
Data Hazard Prevention
• A clever compiler can often reschedule instructions
(code motion) to avoid a stall.
– A simple example:
• Original code:
lw r2, 0(r4)
add r1, r2, r3 Note: Stall happens here!
lw r5, 4(r4)
• Transformed code:
lw r2, 0(r4)
lw r5, 4(r4)
add r1, r2, r3 No stall needed!
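The effect of the reordering can be checked with a small stall counter. This is a sketch under stated assumptions: a 5-stage pipeline with full forwarding, where the only remaining data stall is one cycle when the instruction right after a load reads the loaded register. The tuple encoding of instructions is made up for illustration.

```python
def count_load_use_stalls(program):
    """Count 1-cycle load-use stalls.

    program: list of (op, dest, sources) tuples.
    Assumption: full forwarding, so only a load immediately followed
    by a consumer of its result causes a stall.
    """
    stalls = 0
    for prev, cur in zip(program, program[1:]):
        prev_op, prev_dest, _ = prev
        _, _, cur_sources = cur
        if prev_op == "lw" and prev_dest in cur_sources:
            stalls += 1
    return stalls


original = [
    ("lw", "r2", ("r4",)),
    ("add", "r1", ("r2", "r3")),
    ("lw", "r5", ("r4",)),
]
transformed = [
    ("lw", "r2", ("r4",)),
    ("lw", "r5", ("r4",)),
    ("add", "r1", ("r2", "r3")),
]
print(count_load_use_stalls(original), count_load_use_stalls(transformed))
```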
18
Data Hazard Detection
19
Hazard Detection Logic for Load
• Example: Detecting whether an instruction that has
just been fetched needs to be stalled because of a
dependence on a preceding load.
Note: the right-hand side of the equation should refer to IF/ID.IR.
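The equation itself did not survive in this transcript, so the following is only a generic sketch of a load interlock check, not the slide's exact logic. It compares the load held in ID/EX against the instruction held in IF/ID. The dictionary keys (`op`, `rs`, `rt`) and the opcode class names are hypothetical, and register 0 is not special-cased.

```python
def must_stall_for_load(id_ex, if_id):
    """True if the instruction in IF/ID must stall behind a load in ID/EX.

    Assumptions: MIPS-style encoding where a load writes register rt;
    register-register ALU ops read rs and rt; the other listed classes
    only need rs at this point.
    """
    if id_ex["op"] != "load":
        return False
    load_dest = id_ex["rt"]
    if if_id["op"] == "alu_rr":
        return load_dest in (if_id["rs"], if_id["rt"])
    if if_id["op"] in ("load", "store", "alu_imm", "branch"):
        return load_dest == if_id["rs"]
    return False


# lw r2, 0(r4) followed by add r1, r2, r3
print(must_stall_for_load({"op": "load", "rs": 4, "rt": 2},
                          {"op": "alu_rr", "rs": 2, "rt": 3}))
```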
20
Forwarding Situations in MIPS
Same as Figure A.22
21
Forwarding to The ALU
22
Branch Hazard
• Suppose the new PC value is not computed until the MEM stage.
• Then we must stall 3 clocks after every branch!
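The cost of this stall can be estimated with the usual CPI formula. The branch frequency used below (20%) is an assumption for illustration only and does not come from the slides.

```python
def cpi_with_branch_stalls(branch_fraction, stall_cycles, base_cpi=1.0):
    """Effective CPI when every branch stalls the pipeline for stall_cycles."""
    return base_cpi + branch_fraction * stall_cycles


# Assumed: 20% of instructions are branches, each costing 3 stall clocks.
print(round(cpi_with_branch_stalls(0.2, 3), 2))
```

Under that assumption the 3-clock stall raises CPI from 1.0 to about 1.6, which is why resolving branches earlier matters.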
23
Early Branch Resolution
Branch resolution at ID stage
See Fig. A.24: resolving the branch in the ID stage without latching saves another cycle!!
24
Predict-Not-Taken
Same as Fig. A.12
(Branch resolves in ID)
25
Delayed Branches
Machine code sequence:
Branch instruction
Delay slot instruction(s)
Post-branch instructions
Branch is taken (if taken) at this point
Same as Fig. A.13
26
Filling the Branch-Delay Slot
For (b) and (c), the instruction moved into the delay slot must have no side effects when the branch goes the unexpected way.
27
Multi-Cycle Execution
Same as Fig. A.29
28
Latency & Initiation Interval
• Latency:
– Extra delay cycles before result is available.
• Initiation interval:
– Minimum number of cycles before a new input can be
given to that functional unit.

Functional Unit           Latency   Initiation interval
Integer ALU               0         1
Data memory (loads)       1         1
FP add                    3         1
FP & integer multiply     6         1
FP & integer divide       24        25
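The two numbers answer different questions: latency says when a dependent instruction can start, and initiation interval says when the same unit can accept another operation. The sketch below uses the latency and initiation-interval values listed above; the dictionary keys and helper names are hypothetical, and cycle numbering is an arbitrary choice.

```python
# (latency, initiation interval) per functional unit, from the list above
UNITS = {
    "int_alu": (0, 1),
    "load": (1, 1),
    "fp_add": (3, 1),
    "multiply": (6, 1),
    "divide": (24, 25),
}


def earliest_dependent_issue(unit, issue_cycle):
    """Earliest cycle a consumer of the result can issue without stalling."""
    latency, _ = UNITS[unit]
    return issue_cycle + 1 + latency


def earliest_next_on_same_unit(unit, issue_cycle):
    """Earliest cycle the same unit can accept another operation."""
    _, interval = UNITS[unit]
    return issue_cycle + interval


print(earliest_dependent_issue("fp_add", 1))       # consumer of an FP add
print(earliest_next_on_same_unit("divide", 1))     # next divide
```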
29
Pipelined Multiple-FP Operations
Same as Fig. A.31
30
Pipelining FP Instructions
• Notice instructions may complete out-of-order:
– MULTD IF ID M1 M2 M3 M4 M5 M6 M7 ME WB
– ADDD IF ID A1 A2 A3 A4 ME WB
– LD IF ID EX ME WB
– SD IF ID EX ME WB
• Raises the possibility of WAW hazards, and
structural hazards in MEM & WB stages.
• Structural hazards may occur especially often
with non-pipelined DIV unit.
• Out-of-order completion impacts exception
handling.
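The out-of-order completion can be seen by computing the WB cycle of each instruction. This sketch assumes the four instructions are fetched in consecutive cycles starting at cycle 1 and that nothing stalls; the stage lists follow the sequences shown on this slide.

```python
PIPES = {
    "MULTD": ["IF", "ID", "M1", "M2", "M3", "M4", "M5", "M6", "M7", "ME", "WB"],
    "ADDD": ["IF", "ID", "A1", "A2", "A3", "A4", "ME", "WB"],
    "LD": ["IF", "ID", "EX", "ME", "WB"],
    "SD": ["IF", "ID", "EX", "ME", "WB"],
}


def wb_cycles(program):
    """WB cycle per instruction, assuming one fetch per cycle and no stalls."""
    return [
        (name, fetch_cycle + len(PIPES[name]) - 1)
        for fetch_cycle, name in enumerate(program, start=1)
    ]


result = wb_cycles(["MULTD", "ADDD", "LD", "SD"])
print(result)
# Program order vs. completion order
print([name for name, _ in sorted(result, key=lambda pair: pair[1])])
```

Under these assumptions MULTD is fetched first but reaches WB last.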
31
Issues in Multi-Cycle Operations
• Stall for RAW is longer and more frequent (Fig. A.33)
• WAW is possible; WAR is not (why?)
• Structural Hazard possible for non-pipelined unit
• Multiple WBs are likely (Fig. A.34)
• Handling hazards
– At Issue (ID) stage:
• Check structural hazards: functional unit, WB port
• Check RAW hazards: issue with forwarding
• Check WAW hazards: do not issue, so that writes happen in order
– Detect and stall instruction before MEM and WB stages
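One way to picture the WB-port check at issue is a reservation set of future WB cycles. This is only a sketch, not the mechanism of any particular machine: it assumes in-order issue, a single register write port, that every listed instruction writes a register, and hypothetical ID-to-WB distances derived from stage sequences such as ID A1 A2 A3 A4 ME WB.

```python
def schedule_issue(program, id_to_wb):
    """Return (op, id_cycle, wb_cycle) per instruction.

    An instruction waits in ID until the WB cycle it would need is not
    already reserved by an earlier instruction.
    """
    reserved = set()   # WB cycles already claimed
    schedule = []
    next_id = 1        # earliest cycle the next instruction can be in ID
    for op in program:
        t = next_id
        while t + id_to_wb[op] in reserved:
            t += 1     # stall one cycle in ID
        reserved.add(t + id_to_wb[op])
        schedule.append((op, t, t + id_to_wb[op]))
        next_id = t + 1
    return schedule


# Assumed distances: ADDD needs 6 cycles from ID to WB, LD needs 3.
ID_TO_WB = {"ADDD": 6, "LD": 3}
for entry in schedule_issue(["ADDD", "LD", "LD", "LD", "LD"], ID_TO_WB):
    print(entry)
```

In this example the third LD would reach WB in the same cycle as the ADDD, so it is held in ID for one extra cycle.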
32
Maintaining Precise Exception
• Settle for imprecise exceptions
• Buffer results and complete in order
– Requires large buffers and comparators
– History file, future file approaches
• Software trap handling when an exception occurs
• Hybrid scheme: issue only when it is certain that no earlier instruction will raise an exception
– All instructions before can be completed
– No instructions after can be completed
33
Real MIPS R4000 Pipeline
• IF, IS - Instruction cache fetch, First & Second halves.
• RF - Inst. decode, Register Fetch, hazard check…
• EX - Execution (EA calc, ALU op, target calc…)
• DF, DS - Data cache access, First & Second halves.
• TC - Tag Check, did cache access hit?
• WB - Write-Back for loads & register-register ops.
Read through A.38 – A.49
34
2-Cycle Load Delay
35
Branch Delay