Pipelining in the Pentium 4 - University of...

Pipelining in the Pentium 4 http://www.hardwaresecrets.com/inside-pentium-4-architecture/2/ https://commons.wikimedia.org/wiki/File:Pentium_4,_3.0GHz_(4).jpg 20-stage pipeline in Northwood (2002) microarchitecture Pentium 4 die photo Prescott (2004) microarchitecture had 31 stages and original design had a target clock frequency of 10GHz

Transcript of Pipelining in the Pentium 4 - University of...

Page 1: Pipelining in the Pentium 4 - University of Iowahomepage.cs.uiowa.edu/~bdmyers/cs2630_fa16/public...solid red line: Pointing from where value is actually produced to where it is actually

Pipelining in the Pentium 4

http://www.hardwaresecrets.com/inside-pentium-4-architecture/2/

https://commons.wikimedia.org/wiki/File:Pentium_4,_3.0GHz_(4).jpg

20-stage pipeline in Northwood (2002) microarchitecture

Pentium 4 die photo

Prescott (2004) microarchitecture had 31 stages and original design had a target clock frequency of 10GHz

Page 2

CS 2630 Computer Organization

Meeting 24: Pipelined MIPS processor
Brandon Myers

University of Iowa

Page 3

Minute paper: identify one situation where pipelining happens (except for digital logic and laundry)

Page 4

Let’s review the steps of how lw gets executed

http://courses.cs.washington.edu/courses/cse378

Page 5

IF: Instruction Fetch

http://courses.cs.washington.edu/courses/cse378

Read the instruction out of the instruction memory

Page 6

ID: Instruction Decode

http://courses.cs.washington.edu/courses/cse378

Read values from the register file

Page 7

EX: Execute

http://courses.cs.washington.edu/courses/cse378

ALU computes a result

Page 8

MEM: Access memory

http://courses.cs.washington.edu/courses/cse378

Read the data memory

Page 9

WB: Write back

http://courses.cs.washington.edu/courses/cse378

Write the data back to the register file
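The five steps reviewed above can be traced in a few lines of Python. This is only a sketch of the idea, not the course's datapath: the dictionaries `regs`, `data_mem`, and `instr_mem`, the function `execute_lw`, and all of the values in them are hypothetical and chosen for illustration.

```python
# Hypothetical toy machine state; names and values are illustrative only.
regs = {"$t0": 0, "$t1": 0x1000}
data_mem = {0x1004: 42}
instr_mem = {0: ("lw", "$t0", 4, "$t1")}  # encodes: lw $t0, 4($t1)


def execute_lw(pc):
    """Walk a single lw through the five stages, one comment per stage."""
    # IF: read the instruction out of the instruction memory
    op, rt, offset, rs = instr_mem[pc]
    assert op == "lw"
    # ID: read values from the register file
    base = regs[rs]
    # EX: the ALU computes a result (here, the effective address)
    address = base + offset
    # MEM: read the data memory
    loaded = data_mem[address]
    # WB: write the data back to the register file
    regs[rt] = loaded
    return address, loaded


address, loaded = execute_lw(0)
print(hex(address), loaded, regs["$t0"])
```

In a pipelined processor each commented step would occupy its own clock cycle, with a register holding the intermediate values between steps.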

Page 10

Peer instruction
• Match the stages to what happens in them during the branch instruction

1. IF (instruction fetch)
2. ID (instruction decode)
3. EX (execute)
4. MEM (memory)
5. WB (write back)

a) compare two operands
b) read two registers
c) read the branch instruction bits
d) write a register
e) write memory
f) read memory
g) nothing or none of the above

Page 11

Pipelined execution of a program

http://courses.cs.washington.edu/courses/cse378
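A minimal sketch of what pipelined execution looks like cycle by cycle, assuming an ideal five-stage pipeline with no stalls. The function `pipeline_diagram` and the three-instruction program are hypothetical, chosen so that no instruction depends on another.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_diagram(instructions):
    """Return one row per instruction; row[c] is the stage occupied in cycle c."""
    total_cycles = len(STAGES) + len(instructions) - 1
    rows = []
    for i in range(len(instructions)):
        row = [""] * total_cycles
        # Instruction i enters IF in cycle i and advances one stage per cycle.
        for s, stage in enumerate(STAGES):
            row[i + s] = stage
        rows.append(row)
    return rows


# Hypothetical, mutually independent instructions.
program = ["lw $t0, 4($t1)", "add $t2, $t3, $t4", "sll $s1, $s2, 3"]
rows = pipeline_diagram(program)
for instr, row in zip(program, rows):
    print(f"{instr:<20}" + "".join(f"{stage:<5}" for stage in row))
```

Reading a column of the printed diagram shows which stage each instruction occupies in that cycle; at most one instruction is in any given stage.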

Page 12

Performance of pipelined datapath

[Figure: two timing diagrams for the instructions lw $t0, 4($t1) and lw $t1, 0($t2), each passing through IF ID EX MEM WB]

Pipelined: clock period = 400ps (clock frequency = 2.5GHz); cycle0 through cycle5; time axis 200ps, 1000ps, 1800ps, 2600ps

Single cycle: clock period = 1600ps (clock frequency = 0.625 GHz); time axis 200ps, 1000ps, 1800ps, 2600ps, 3400ps
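The two clock periods on this slide (400ps and 1600ps) are enough to estimate how long each design takes to run a program. The sketch below assumes an ideal five-stage pipeline with no stalls; the function names are hypothetical.

```python
def single_cycle_time_ps(n_instructions, clock_period_ps):
    """Each instruction takes exactly one (long) clock cycle."""
    return n_instructions * clock_period_ps


def pipelined_time_ps(n_instructions, n_stages, clock_period_ps):
    """The first instruction needs n_stages cycles; each later one finishes a cycle after."""
    return (n_stages + n_instructions - 1) * clock_period_ps


# The two lw instructions from the slide.
single = single_cycle_time_ps(2, 1600)  # 3200 ps
piped = pipelined_time_ps(2, 5, 400)    # 2400 ps
print(single, piped)

# With many instructions the speedup approaches 1600 / 400 = 4.
n = 1_000_000
print(single_cycle_time_ps(n, 1600) / pipelined_time_ps(n, 5, 400))
```

With only two instructions the pipeline is mostly filling and draining, so the gain is modest; the benefit shows up on long instruction streams.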

Page 13

Peer instruction
• Suppose
  • clk-to-q delay of our registers is 100ps
  • setup time of our registers is 50ps
  • delay of combinational logic in each stage is 75ps, 200ps, 150ps, 250ps, and 200ps, respectively

What is the maximum clock frequency we can run our processor at?

a) 2.5GHz
b) 0.976GHz
c) 4.44GHz
d) 2.86GHz
e) 3.33GHz
f) 6.67GHz
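One way to work this out, assuming the usual register-to-register timing constraint (clock period ≥ clk-to-q delay + slowest stage's logic delay + setup time). The function name is hypothetical.

```python
def min_clock_period_ps(clk_to_q_ps, setup_ps, stage_logic_delays_ps):
    """The slowest stage, plus register overhead, bounds the clock period."""
    return clk_to_q_ps + max(stage_logic_delays_ps) + setup_ps


period_ps = min_clock_period_ps(100, 50, [75, 200, 150, 250, 200])
freq_ghz = 1000 / period_ps  # a 1000 ps period corresponds to 1 GHz
print(period_ps, freq_ghz)
```

Only the longest stage matters for the period; making the faster stages faster still would not raise the clock frequency.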

Page 14

Peer instruction
• How many instructions/cycle (IPC) will the MIPS processor achieve when pipelining and using each of the 5 stages (IF, ID, EX, MEM, WB) as a pipeline stage?

a) IPC = 5
b) IPC = [fraction garbled in source]
c) IPC = 1
d) IPC = 2
e) IPC ≤ 1
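A quick way to explore the question numerically, assuming an ideal five-stage pipeline that never stalls. The function `ideal_ipc` is a hypothetical helper; any stall would only lower the numbers it returns.

```python
def ideal_ipc(n_instructions, n_stages=5):
    """Instructions completed per cycle, counting pipeline fill time."""
    cycles = n_stages + n_instructions - 1
    return n_instructions / cycles


for n in (1, 5, 100, 1_000_000):
    print(n, ideal_ipc(n))
```

The value climbs toward 1 as the program gets longer but never exceeds it, since at most one instruction can complete in each cycle.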

Page 15

Pipelined datapath

http://courses.cs.washington.edu/courses/cse378

Grey boxes represent registers for all signals

Page 16

Peer instruction
• How should we change the control unit to handle a pipelined processor (stages IF, ID, EX, MEM, WB)?
• single cycle control unit was some combinational logic

a) no change
b) implement as a finite state machine (FSM)
c) calculate the control signals and pass them down the pipeline
d) a different control unit for each stage; pass the instruction bits down the pipeline

Page 17

Control in the pipelined processor

http://courses.cs.washington.edu/courses/cse378

Compute all the control signals during the ID stage. Some signals are not needed until a later stage, so propagate them through the stages.
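The idea of computing the signals once and carrying them forward can be sketched as follows. The signal names, their grouping into `ex`/`mem`/`wb`, and the functions `decode_control` and `advance` are all assumptions made for illustration, not the actual control unit from the course.

```python
# Hypothetical control-signal names and encodings, for illustration only.
CONTROL_TABLE = {
    "lw": {
        "ex": {"alu_src": 1},
        "mem": {"mem_read": 1, "mem_write": 0},
        "wb": {"reg_write": 1, "mem_to_reg": 1},
    },
    "add": {
        "ex": {"alu_src": 0},
        "mem": {"mem_read": 0, "mem_write": 0},
        "wb": {"reg_write": 1, "mem_to_reg": 0},
    },
}


def decode_control(opcode):
    """ID stage: compute every control signal for the instruction at once."""
    return CONTROL_TABLE[opcode]


def advance(id_ex, ex_mem, new_control):
    """One clock edge: each pipeline register keeps only what later stages need."""
    next_mem_wb = {"wb": ex_mem["wb"]} if ex_mem else None
    next_ex_mem = {"mem": id_ex["mem"], "wb": id_ex["wb"]} if id_ex else None
    return new_control, next_ex_mem, next_mem_wb


# Follow a single lw: decoded in ID, then carried through EX, MEM, and WB.
id_ex, ex_mem, mem_wb = advance(None, None, decode_control("lw"))
id_ex, ex_mem, mem_wb = advance(id_ex, ex_mem, None)
id_ex, ex_mem, mem_wb = advance(id_ex, ex_mem, None)
print(mem_wb)
```

By the time the instruction reaches WB, only the write-back signals are still being carried; the EX and MEM groups were dropped once their stages had used them.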

Page 18

Pipelining… what could go wrong?

http://courses.cs.washington.edu/courses/cse378

Page 19

Consider the following programs…

add $t0, $t1, $t2
add $t4, $t0, $t3

=======================================

lw $s0, 4($t0)
sll $s1, $s2, 3

=======================================

beq $zero, $zero, gadget
addi $t1, $zero, 1
gadget: addi $t1, $zero, 2

(see handout)
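For the first two programs, one thing worth checking is whether the second instruction reads a register that the first one writes (a read-after-write dependence). The sketch below is a hypothetical helper that handles only the few instruction forms used here; it says nothing about the branch in the third program, which raises a different kind of problem.

```python
def parse(instr):
    """Return (destination, sources) for a handful of MIPS instruction forms."""
    op, rest = instr.split(None, 1)
    args = [a.strip() for a in rest.split(",")]
    if op in ("add", "sub", "and", "or"):
        return args[0], {args[1], args[2]}
    if op in ("addi", "sll"):
        return args[0], {args[1]}
    if op == "lw":
        # "4($t0)" -> base register "$t0"
        base = args[1][args[1].index("(") + 1 : -1]
        return args[0], {base}
    raise ValueError(f"unsupported instruction: {instr}")


def raw_dependence(first, second):
    """True if `second` reads the register that `first` writes."""
    dest, _ = parse(first)
    _, sources = parse(second)
    return dest != "$zero" and dest in sources


print(raw_dependence("add $t0, $t1, $t2", "add $t4, $t0, $t3"))
print(raw_dependence("lw $s0, 4($t0)", "sll $s1, $s2, 3"))
```

A dependence between back-to-back instructions matters in a pipeline because the second instruction reads its registers in ID before the first has reached WB.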

Page 20

A notation for studying hazards

http://courses.cs.washington.edu/courses/cse378/

Note that there are not two register files (Reg). Rather, this notation is meant to show the active stage for an instruction during each cycle. The register file (Reg) is involved during ID and WB.

[Figure labels: R, R, W, W]

dotted blue line: We need to write $2 before we read $2
solid red line: Pointing from where value is actually produced to where it is actually used

Page 21

Summary
• Pipelined MIPS processor takes multiple cycles to finish a given instruction, but it can execute multiple instructions simultaneously (up to 1 per stage)
• The stage with the longest delay determines the clock period
• Registers separate each stage; control and data signals are sent to the next stage through the registers
• Executing multiple instructions simultaneously can result in hazards

• Next:
  • thinking about hazards systematically
  • modifying the processor to cope with hazards