
Samira Khan, University of Virginia

Feb 9, 2017

Intro to Microarchitecture: Single-Cycle

CS 3330

AGENDA

• Review from last lecture
• ISA tradeoffs
• Single-cycle microarchitecture

Review: ISA vs. Microarchitecture

• ISA (Instruction Set Architecture)
  • Agreed-upon interface between software and hardware
    • SW/compiler assumes, HW promises
  • What the software writer needs to know to write and debug system/user programs
• Microarchitecture
  • Specific implementation of an ISA
  • Not visible to the software
• Microprocessor
  • ISA, uarch, circuits
  • "Architecture" = ISA + microarchitecture

[Figure: levels of the computing stack: Problem, Algorithm, Program, ISA, Microarchitecture, Circuits, Transistors]

Review: ISA

• Instructions
  • Opcodes, Addressing Modes, Data Types
  • Instruction Types and Formats
  • Registers, Condition Codes
• Memory
  • Address space, Addressability, Alignment
  • Virtual memory management
• Call, Interrupt/Exception Handling
• Access Control, Priority/Privilege
• I/O: memory-mapped vs. instr.
• Task/thread Management
• Power and Thermal Management
• Multi-threading support, Multiprocessor support

Microarchitecture

• Implementation of the ISA under specific design constraints and goals
• Anything done in hardware without exposure to software
  • Pipelining (will see later)
  • Clock gating
  • Caching? Levels, size, associativity, replacement policy
  • Prefetching?
  • Voltage/frequency scaling?
  • Error correction?

Property of ISA vs. Uarch?

• ADD instruction's opcode
• Number of general purpose registers
• Number of ports to the register file
• Number of cycles to execute the MUL instruction
• Whether or not the machine employs pipelined instruction execution

• Remember
  • Microarchitecture: Implementation of the ISA under specific design constraints and goals

Design Point

• A set of design considerations and their importance
  • leads to tradeoffs in both ISA and uarch
• Considerations
  • Cost
  • Performance
  • Maximum power consumption
  • Energy consumption (battery life)
  • Availability
  • Reliability and Correctness
  • Time to Market
• Design point determined by the "Problem" space (application space), the intended users/market

Look Forward & Up

ROLE OF THE (COMPUTER) ARCHITECT

(from Yale Patt's lecture notes)

• Look backward (to the past)
  • Understand tradeoffs and designs, upsides/downsides, past workloads. Analyze and evaluate the past
• Look forward (to the future)
  • Be the dreamer and create new designs. Listen to dreamers
  • Push the state of the art. Evaluate new design choices
• Look up (towards problems in the computing stack)
  • Understand important problems and their nature
  • Develop architectures and ideas to solve important problems
• Look down (towards device/circuit technology)
  • Understand the capabilities of the underlying technology
  • Predict and adapt to the future of technology (you are designing for N years ahead). Enable the future technology

Application Space

• Dream, and they will appear…

Tradeoffs: Soul of Computer Architecture

• ISA-level tradeoffs
• Microarchitecture-level tradeoffs
• System and Task-level tradeoffs
  • How to divide the labor between hardware and software
• Computer architecture is the science and art of making the appropriate trade-offs to meet a design point
  • Why art?

ISA Principles and Tradeoffs

Many Different ISAs Over Decades

• x86
• PDP-x: Programmed Data Processor (PDP-11)
• VAX
• IBM 360
• CDC 6600
• SIMD ISAs: CRAY-1, Connection Machine
• VLIW ISAs: Multiflow, Cydrome, IA-64 (EPIC)
• PowerPC, POWER
• RISC ISAs: Alpha, MIPS, SPARC, ARM

• What are the fundamental differences?
  • E.g., how instructions are specified and what they do
  • E.g., how complex are the instructions

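MIPS

Instruction formats (fields listed from most significant bit to least):

R-type:   0 (6-bit) | rs (5-bit) | rt (5-bit) | rd (5-bit) | shamt (5-bit) | funct (6-bit)
I-type:   opcode (6-bit) | rs (5-bit) | rt (5-bit) | immediate (16-bit)
J-type:   opcode (6-bit) | immediate (26-bit)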
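To make the three formats concrete, here is a minimal Python sketch that packs and unpacks fields using exactly the widths listed above. The helper names `encode` and `decode` are illustrative and not part of the lecture; the example register numbers and the `funct` value 0x20 for `add` come from the standard MIPS encoding rather than from these slides.

```python
# Field layouts (name, width in bits), most significant field first,
# taken from the three formats listed above.
R_TYPE = [("opcode", 6), ("rs", 5), ("rt", 5), ("rd", 5), ("shamt", 5), ("funct", 6)]
I_TYPE = [("opcode", 6), ("rs", 5), ("rt", 5), ("immediate", 16)]
J_TYPE = [("opcode", 6), ("immediate", 26)]

def encode(layout, **fields):
    """Pack named fields into one 32-bit word; missing fields default to 0."""
    word = 0
    for name, width in layout:
        value = fields.get(name, 0)
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name} does not fit in {width} bits")
        word = (word << width) | value
    return word

def decode(layout, word):
    """Split a 32-bit word back into its named fields."""
    fields = {}
    shift = 32
    for name, width in layout:
        shift -= width
        fields[name] = (word >> shift) & ((1 << width) - 1)
    return fields

# add $t0, $s1, $s2 -> rs = 17, rt = 18, rd = 8, funct = 0x20 (standard MIPS values)
word = encode(R_TYPE, rs=17, rt=18, rd=8, funct=0x20)
```

Because every format is 32 bits wide and the opcode always sits in the top 6 bits, the same `decode` loop works for all three layouts, which is the fixed-length, uniform-decode property discussed later in the lecture.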

ARM


What Are the Elements of An ISA?

• Instructions
  • Opcode
  • Operand specifiers (addressing modes)
    • How to obtain the operand? Why are there different addressing modes?
• Data types
  • Definition: Representation of information for which there are instructions that operate on the representation
  • Integer, floating point, character, binary, decimal, BCD
  • Doubly linked list, queue, string, bit vector, stack
    • VAX: INSQUEUE and REMQUEUE instructions on a doubly linked list or queue; FINDFIRST
      • Digital Equipment Corp., "VAX11 780 Architecture Handbook," 1977.
    • X86: SCAN opcode operates on character strings; PUSH/POP

Data Type Tradeoffs

• What is the benefit of having more or high-level data types in the ISA?
• What is the disadvantage?
  • Think compiler/programmer vs. microarchitect
• Concept of semantic gap
  • Data types coupled tightly to the semantic level, or complexity of instructions
• Example: Early RISC architectures vs. Intel 432
  • Early RISC: Only integer data type
  • Intel 432: Object data type, capability based machine

Complex vs. Simple Instructions

• Complex instruction: An instruction does a lot of work, e.g. many operations
  • Insert in a doubly linked list
  • Compute FFT
  • String copy
• Simple instruction: An instruction does a small amount of work; it is a primitive from which complex operations can be built
  • Add
  • XOR
  • Multiply

Complex vs. Simple Instructions

• Advantages of Complex Instructions
  + Denser encoding → smaller code size → better memory utilization, saves off-chip bandwidth, better cache hit rate (better packing of instructions)
  + Simpler compiler: no need to optimize small instructions as much
• Disadvantages of Complex Instructions
  - Larger chunks of work → compiler has less opportunity to optimize (limited in fine-grained optimizations it can do)
  - More complex hardware → translation from a high level to control signals and optimization needs to be done by hardware

ISA-level Tradeoffs: Semantic Gap

• Where to place the ISA? Semantic gap
  • Closer to high-level language (HLL) → Small semantic gap, complex instructions
  • Closer to hardware control signals? → Large semantic gap, simple instructions
• RISC vs. CISC machines
  • RISC: Reduced instruction set computer
  • CISC: Complex instruction set computer
    • FFT, QUICKSORT, POLY, FP instructions?
    • VAX INDEX instruction (array access with bounds checking)

ISA-level Tradeoffs: Semantic Gap

• Some tradeoffs (for you to think about)
  • Simple compiler, complex hardware vs. complex compiler, simple hardware
  • Burden of backward compatibility
  • Performance? Energy Consumption?
    • Optimization opportunity: Example of VAX INDEX instruction: who (compiler vs. hardware) puts more effort into optimization?
  • Instruction size, code size

Small versus Large Semantic Gap

• CISC vs. RISC
  • Complex instruction set computer → complex instructions
    • Initially motivated by "not good enough" code generation
  • Reduced instruction set computer → simple instructions
    • John Cocke, mid 1970s, IBM 801
    • Goal: enable better compiler control and optimization
• RISC motivated by
  • Memory stalls (no work done in a complex instruction when there is a memory stall?)
    • When is this correct?
  • Simplifying the hardware → lower cost, higher frequency
  • Enabling the compiler to optimize the code better
    • Find fine-grained parallelism to reduce stalls

ISA-level Tradeoffs: Instruction Length

• Fixed length: Length of all instructions the same
  + Easier to decode single instruction in hardware
  + Easier to decode multiple instructions concurrently
  -- Wasted bits in instructions (Why is this bad?)
  -- Harder-to-extend ISA (how to add new instructions?)
• Variable length: Length of instructions different (determined by opcode and sub-opcode)
  + Compact encoding (Why is this good?)
    Intel 432: 6 to 321 bit instructions.
  -- More logic to decode a single instruction
  -- Harder to decode multiple instructions concurrently
• Tradeoffs
  • Code size (memory space, bandwidth, latency) vs. hardware complexity
  • ISA extensibility and expressiveness vs. hardware complexity
  • Performance? Energy? Smaller code vs. ease of decode

ISA-level Tradeoffs: Uniform Decode

• Uniform decode: Same bits in each instruction correspond to the same meaning
  • Opcode is always in the same location
  • Ditto operand specifiers, immediate values, …
  • Many "RISC" ISAs: Alpha, MIPS, SPARC
  + Easier decode, simpler hardware
  + Enables parallelism: generate target address before knowing the instruction is a branch
  -- Restricts instruction format (fewer instructions?) or wastes space
• Non-uniform decode
  • E.g., opcode can be the 1st-7th byte in x86
  + More compact and powerful instruction format
  -- More complex decode logic

ISA-level Tradeoffs: Number of Registers

• Affects:
  • Number of bits used for encoding register address
  • Number of values kept in fast storage (register file)
  • (uarch) Size, access time, power consumption of register file
• Large number of registers:
  + Enables better register allocation (and optimizations) by compiler → fewer saves/restores
  -- Larger instruction size
  -- Larger register file size
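The link between register count and instruction size can be checked with a little arithmetic. The sketch below is illustrative only; `register_field_bits` and `operand_bits` are hypothetical helper names, and the three-operand format is an assumption modeled on the MIPS R-type layout shown earlier.

```python
def register_field_bits(num_registers: int) -> int:
    """Bits needed to name one of num_registers registers (at least 1)."""
    return max(1, (num_registers - 1).bit_length())

def operand_bits(num_registers: int, operands: int = 3) -> int:
    """Bits an instruction spends on register specifiers alone."""
    return operands * register_field_bits(num_registers)

# 32 registers -> 5-bit fields, matching the 5-bit rs/rt/rd fields of MIPS.
# Doubling to 64 registers costs one more bit per operand, 3 bits per
# three-operand instruction.
for n in (16, 32, 64):
    print(n, register_field_bits(n), operand_bits(n))
```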

ISA-level Tradeoffs: Addressing Modes

• Addressing mode specifies how to obtain an operand of an instruction
  • Register
  • Immediate
  • Memory (displacement, register indirect, indexed, absolute, memory indirect, autoincrement, autodecrement, …)
• More modes:
  + help better support programming constructs (arrays, pointer-based accesses)
  -- make it harder for the architect to design
  -- too many choices for the compiler?
    • Many ways to do the same thing complicates compiler design
    • Wulf, "Compilers and Computer Architecture," IEEE Computer 1981

A Note on RISC vs. CISC

• Usually, …
• RISC
  • Simple instructions
  • Fixed length
  • Uniform decode
  • Few addressing modes
• CISC
  • Complex instructions
  • Variable length
  • Non-uniform decode
  • Many addressing modes

Food for Thought for You

• How would you design a new ISA?
• Where would you place it?
• What design choices would you make in terms of ISA properties?
• What would be the first question you ask in this process?
  • "What is my design point?"

Look Forward & Up

Y86-64 Instruction Set #1

Encodings are listed from byte 0 onward; byte positions run from 0 to 9.

Instruction          Byte 0    Following bytes
halt                 0 0
nop                  1 0
cmovXX rA, rB        2 fn      rA rB
irmovq V, rB         3 0       F rB | V
rmmovq rA, D(rB)     4 0       rA rB | D
mrmovq D(rB), rA     5 0       rA rB | D
OPq rA, rB           6 fn      rA rB
jXX Dest             7 fn      Dest
call Dest            8 0       Dest
ret                  9 0
pushq rA             A 0       rA F
popq rA              B 0       rA F
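Since Y86-64 is a variable-length encoding, a decoder has to learn each instruction's length from its first byte before it can find the next instruction. The sketch below derives the lengths from the table above (1 opcode byte, 1 register byte where registers are named, 8 bytes for V, D, or Dest). The names `LENGTH` and `split_instructions` are made up for this illustration.

```python
# Instruction length in bytes, keyed by icode (high nibble of byte 0).
LENGTH = {
    0x0: 1,   # halt
    0x1: 1,   # nop
    0x2: 2,   # cmovXX rA, rB
    0x3: 10,  # irmovq V, rB
    0x4: 10,  # rmmovq rA, D(rB)
    0x5: 10,  # mrmovq D(rB), rA
    0x6: 2,   # OPq rA, rB
    0x7: 9,   # jXX Dest
    0x8: 9,   # call Dest
    0x9: 1,   # ret
    0xA: 2,   # pushq rA
    0xB: 2,   # popq rA
}

def split_instructions(code: bytes):
    """Split a byte string into instructions by walking icode lengths."""
    pc, out = 0, []
    while pc < len(code):
        icode = code[pc] >> 4
        n = LENGTH[icode]
        out.append(code[pc:pc + n])
        pc += n
    return out

# irmovq $1, register 0 ; OPq (fn 0) register 0, register 3 ; halt
program = bytes.fromhex("30f00100000000000000" "6003" "00")
lengths = [len(i) for i in split_instructions(program)]
```

Note that the walk is inherently sequential: the start of instruction k is unknown until instructions 0 to k-1 have been examined, which is the "harder to decode multiple instructions concurrently" cost of variable-length encodings.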

Now That We Have an ISA

• How do we implement it?
  • i.e., how do we design a system that obeys the hardware/software interface?

Implementing the ISA: Microarchitecture Basics

How Does a Machine Process Instructions?

• What does processing an instruction mean?
• Remember the von Neumann model

AS = Architectural (programmer visible) state before an instruction is processed
        ↓ Process instruction
AS' = Architectural (programmer visible) state after an instruction is processed

• Processing an instruction: Transforming AS to AS' according to the ISA specification of the instruction

The "Process instruction" Step

• ISA specifies abstractly what AS' should be, given an instruction and AS
  • It defines an abstract finite state machine where
    • State = programmer-visible state
    • Next-state logic = instruction execution specification
  • From ISA point of view, there are no "intermediate states" between AS and AS' during instruction execution
    • One state transition per instruction
• Microarchitecture implements how AS is transformed to AS'
  • There are many choices in implementation
  • We can have programmer-invisible state to optimize the speed of instruction execution: multiple state transitions per instruction
    • Choice 1: AS → AS' (transform AS to AS' in a single clock cycle)
    • Choice 2: AS → AS+MS1 → AS+MS2 → AS+MS3 → AS' (take multiple clock cycles to transform AS to AS')
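The two choices can be illustrated with a toy model. The sketch below assumes a made-up one-register machine with a single "increment r0" instruction; the names `single_cycle`, `multi_cycle`, and the microstate field `tmp` are hypothetical. The point is only that both implementations produce the same AS', while the multi-cycle one passes through microstate that the programmer never sees.

```python
def single_cycle(AS):
    """Choice 1: AS -> AS' in one transition (toy 'increment r0' instruction)."""
    return {"PC": AS["PC"] + 1, "r0": AS["r0"] + 1}

def multi_cycle(AS):
    """Choice 2: same AS', reached through programmer-invisible microstate MS."""
    trace = []
    MS = {"tmp": AS["r0"]}              # cycle 1: read operand into microstate
    trace.append((dict(AS), dict(MS)))
    MS["tmp"] += 1                      # cycle 2: compute, touching only MS
    trace.append((dict(AS), dict(MS)))
    AS_next = {"PC": AS["PC"] + 1, "r0": MS["tmp"]}  # cycle 3: commit to AS'
    return AS_next, trace

AS = {"PC": 0, "r0": 41}
AS_multi, trace = multi_cycle(AS)
```

In every intermediate entry of `trace` the architectural state still equals the original AS, which matches the ISA view that there are no intermediate states between AS and AS'.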

A Very Basic Instruction Processing Engine

• Each instruction takes a single clock cycle to execute
• Only combinational logic is used to implement instruction execution
  • No intermediate, programmer-invisible state updates

AS = Architectural (programmer visible) state at the beginning of a clock cycle
        ↓ Process instruction in one clock cycle
AS' = Architectural (programmer visible) state at the end of a clock cycle

A Very Basic Instruction Processing Engine

• Single-cycle machine
• What is the clock cycle time determined by?
• What is the critical path of the combinational logic determined by?

[Figure: AS (State) feeds Combinational Logic, which produces AS']

Assembly/Machine Code View

Programmer-Visible State

• PC: Program counter
  • Address of next instruction
  • Called "RIP" (x86-64)
• Register file
  • Heavily used program data
• Condition codes
  • Store status information about most recent arithmetic or logical operation
  • Used for conditional branching
• Memory
  • Byte addressable array
  • Code and user data
  • Stack to support procedures

[Figure: CPU (PC, Registers, Condition Codes) connected to Memory (Code, Data, Stack) by Addresses, Data, and Instructions]

Instructions (and programs) specify how to transform the values of programmer visible state

Single-cycle vs. Multi-cycle Machines

• Single-cycle machines
  • Each instruction takes a single clock cycle
  • All state updates made at the end of an instruction's execution
  • Big disadvantage: The slowest instruction determines cycle time → long clock cycle time
• Multi-cycle machines
  • Instruction processing broken into multiple cycles/stages
  • State updates can be made during an instruction's execution
  • Architectural state updates made only at the end of an instruction's execution
  • Advantage over single-cycle: The slowest "stage" determines cycle time
• Both single-cycle and multi-cycle machines literally follow the von Neumann model at the microarchitecture level
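A small numeric sketch of how the two designs set their clock. All delays and the per-instruction step lists below are hypothetical values chosen for illustration, not figures from the lecture.

```python
# Hypothetical delays (ns) of the logic each step needs; not from the lecture.
step_delay_ns = {"fetch": 2.0, "decode": 1.0, "execute": 2.0, "memory": 3.0, "writeback": 1.0}

# Steps used by each instruction class (assumed for illustration).
uses = {
    "alu":  ["fetch", "decode", "execute", "writeback"],
    "load": ["fetch", "decode", "execute", "memory", "writeback"],
}

# Single-cycle: the clock must cover the slowest instruction end to end.
single_cycle_clock = max(sum(step_delay_ns[s] for s in steps) for steps in uses.values())

# Multi-cycle: the clock only has to cover the slowest individual step.
multi_cycle_clock = max(step_delay_ns.values())

print(single_cycle_clock, multi_cycle_clock)
```

With these numbers the single-cycle clock is set by the load, so even the ALU instruction, which needs less time, pays for the full load path every cycle.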

Instruction Processing "Stage"

• Instructions are processed under the direction of a "control unit" step by step.
• Instruction stage: Sequence of steps to process an instruction
• Fundamentally, there are five phases:
  • Fetch
  • Decode
  • Evaluate Address / Fetch Operands
  • Execute
  • Store Result
• Not all instructions require all stages

Instruction Processing "Cycle" vs. Machine Clock Cycle

• Single-cycle machine:
  • All phases of the instruction processing cycle take a single machine clock cycle to complete
• Multi-cycle machine:
  • All phases of the instruction processing cycle can take multiple machine clock cycles to complete
  • In fact, each phase can take multiple clock cycles to complete

Instruction Processing Viewed Another Way

• Instructions transform Data (AS) to Data' (AS')
• This transformation is done by functional units
  • Units that "operate" on data
• These units need to be told what to do to the data
• An instruction processing engine consists of two components
  • Datapath: Consists of hardware elements that deal with and transform data signals
    • functional units that operate on data
    • hardware structures (e.g. wires and muxes) that enable the flow of data into the functional units and registers
    • storage units that store data (e.g., registers)
  • Control logic: Consists of hardware elements that determine control signals, i.e., signals that specify what the datapath elements should do to the data

Single-cycle vs. Multi-cycle: Control & Data

• Single-cycle machine:
  • Control signals are generated in the same clock cycle as the one during which data signals are operated on
  • Everything related to an instruction happens in one clock cycle (serialized processing)
• Multi-cycle machine:
  • Control signals needed in the next cycle can be generated in the current cycle
  • Latency of control processing can be overlapped with latency of datapath operation (more parallelism)

Many Ways of Datapath and Control Design

• There are many ways of designing the datapath and control logic
• Single-cycle, multi-cycle, pipelined datapath and control
• Hardwired/combinational vs. microcoded/microprogrammed control
  • Control signals generated by combinational logic versus
  • Control signals stored in a memory structure

Flash-Forward: Performance Analysis

• Execution time of an instruction
  • {CPI} x {clock cycle time}
• Execution time of a program
  • Sum over all instructions [{CPI} x {clock cycle time}]
  • {# of instructions} x {Average CPI} x {clock cycle time}
• Single cycle microarchitecture performance
  • CPI = 1
  • Clock cycle time = long
• Multi-cycle microarchitecture performance
  • CPI = different for each instruction
    • Average CPI → hopefully small
  • Clock cycle time = short

Now, we have two degrees of freedom to optimize independently
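The execution-time formula can be tried out on a small example. The instruction mix, CPI values, and cycle times below are hypothetical numbers picked only to show the arithmetic; `execution_time` is an illustrative helper name.

```python
def execution_time(instr_counts, cpi, cycle_time_ns):
    """Sum over instruction classes of count x CPI x clock cycle time."""
    return sum(n * cpi[kind] * cycle_time_ns for kind, n in instr_counts.items())

# Hypothetical program mix, chosen only for illustration.
counts = {"alu": 600, "load": 300, "store": 100}

# Single-cycle: CPI = 1 for everything, but the clock must fit the slowest instruction.
single = execution_time(counts, {"alu": 1, "load": 1, "store": 1}, cycle_time_ns=10.0)

# Multi-cycle: CPI differs per instruction, with a shorter clock set by the slowest stage.
multi = execution_time(counts, {"alu": 4, "load": 5, "store": 4}, cycle_time_ns=2.0)

print(single, multi)
```

Here the multi-cycle design wins only because its average CPI (4.3) grows less than its clock shrinks (5x); with a different mix or different delays the comparison could go the other way, which is why CPI and cycle time are treated as two separate knobs.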

A Single-Cycle Microarchitecture: A Closer Look

Remember…

• Single-cycle machine

[Figure: AS (State) feeds Combinational Logic, which produces AS']

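Let's Start with the State Elements

• Data and control inputs

[Figure: the state elements and building blocks with their data and control inputs. Labeled elements: PC; Instruction Mem (Instr Addr, Instruction); Register file (ports A, B, W; srcA, valA, srcB, valB, dstW, valW; RegWrite); ALU (inputs A and B; Operation); MUX (inputs 0 and 1; MUX Select); Data Mem (Address, Read Data, Write Data; MemWrite, MemRead)]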

For Now, We Will Assume

• "Magic" memory and register file
• Synchronous write
  • the selected register is updated on the positive edge clock transition when write enable is asserted
  • Cannot affect read output in between clock edges

Instruction Processing

• 6 (5) generic steps
  • Instruction fetch (IF)
  • Instruction decode and register operand fetch (ID/RF)
  • Execute / Evaluate memory address (EX/AG)
  • Memory operand fetch (MEM)
  • Store/writeback result (WB)
  • PC Update

[Figure: single-cycle datapath divided into IF, ID/RF, EX/AG, MEM, WB. Labeled elements: PC and newPC; Instruction Mem (Instr Addr, Instruction); Register file (rA, rB, valA, valB, DestE, valE); ALU; Data Mem (Address, Read Data, Write Data)]


Single-Cycle Datapath for Arithmetic and Logical Instructions

Executing Arith./Logical Operation

OPq rA, rB      encoding: 6 fn | rA rB

• Fetch
  • Read 2 bytes
• Decode
  • Read operand registers
• Execute
  • Perform operation
  • Set condition codes
• Memory
  • Do nothing
• Write back
  • Update register
• PC Update
  • Increment PC by 2

Stage Computation: Arith/Log. Ops

• Formulate instruction execution as sequence of simple steps
• Use same general form for all instructions

Stage        OPq rA, rB                 Comment
Fetch        icode:ifun ← M1[PC]        Read instruction byte
             rA:rB ← M1[PC+1]           Read register byte
             valP ← PC + 2              Compute next PC
Decode       valA ← R[rA]               Read operand A
             valB ← R[rB]               Read operand B
Execute      valE ← valB OP valA        Perform ALU operation
             Set CC                     Set condition code register
Memory
Write back   R[rB] ← valE               Write back result
PC update    PC ← valP                  Update PC
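The stage table above maps almost line for line onto code. The following is a minimal Python sketch, not the course's simulator: `step_opq` is a hypothetical name, memory is modeled as a `bytes` object, registers as a dict, and the `ifun` numbering (0 add, 1 sub, 2 and, 3 xor) follows the usual Y86-64 convention, which these slides do not spell out. The overflow flag is left out to keep the sketch short.

```python
MASK = (1 << 64) - 1  # registers are 64 bits wide

# ifun -> ALU operation on (valB, valA); numbering is an assumption (see lead-in).
ALU = {
    0: lambda b, a: b + a,
    1: lambda b, a: b - a,
    2: lambda b, a: b & a,
    3: lambda b, a: b ^ a,
}

def step_opq(PC, R, M):
    # Fetch
    icode, ifun = M[PC] >> 4, M[PC] & 0xF
    rA, rB = M[PC + 1] >> 4, M[PC + 1] & 0xF
    valP = PC + 2
    assert icode == 0x6, "not an OPq instruction"
    # Decode
    valA, valB = R[rA], R[rB]
    # Execute
    valE = ALU[ifun](valB, valA) & MASK
    CC = {"ZF": valE == 0, "SF": bool(valE >> 63)}
    # Memory: nothing to do
    # Write back (on a copy, so the caller's AS is untouched)
    R = dict(R)
    R[rB] = valE
    # PC update
    return valP, R, CC

# OPq with fn = 1 (sub), rA = 2, rB = 3: R[3] <- R[3] - R[2]
PC, R, CC = step_opq(0, {2: 5, 3: 7}, bytes([0x61, 0x23]))
```

All new state (PC, R, CC) is returned together at the end, mirroring the single-cycle rule that state updates happen only at the end of the instruction's execution.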

ALU Datapath

if MEM[PC] == OPq rA, rB
    R[rB] ← R[rB] op R[rA]
    PC ← PC + 2

[Figure: single-cycle datapath for OPq across IF, ID, EX, MEM, WB, with combinational state update logic. Labeled elements: PC with a +2 adder (ADD); Instruction Mem (Instr Addr, Instruction); Register file (rA, rB, valA, valB, DestE, valE, RegWrite); ALU (ALU OP); Data Mem (Address, Read Data, Write Data)]

**Based on original figure from [P&H CO&D, COPYRIGHT 2004 Elsevier. ALL RIGHTS RESERVED.]


We did not cover these slides in the class. Will learn about these in the next class. They are here for your benefit.

Single-Cycle Datapath for Data Movement Instructions

Executing mrmovq (Load from Mem to Reg)

mrmovq D(rB), rA      encoding: 5 0 | rA rB | D

• Fetch
  • Read 10 bytes
• Decode
  • Read operand registers
• Execute
  • Compute effective address
• Memory
  • Read from memory
• Write back
  • Write to register
• PC Update
  • Increment PC by 10

Stage Computation: mrmovq

• Use ALU for address computation

Stage        mrmovq D(rB), rA           Comment
Fetch        icode:ifun ← M1[PC]        Read instruction byte
             rA:rB ← M1[PC+1]           Read register byte
             valC ← M8[PC+2]            Read displacement D
             valP ← PC + 10             Compute next PC
Decode       valB ← R[rB]               Read operand B
Execute      valE ← valB + valC         Compute effective address
Memory       valM ← M8[valE]            Read value from memory
Write back   R[rA] ← valM               Update register
PC update    PC ← valP                  Update PC
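As a sanity check on the load stages, here is a minimal Python sketch that follows the table step by step. `read8` and `step_mrmovq` are hypothetical helper names, memory is a `bytes` object, and multi-byte values are assumed to be little-endian, as in Y86-64.

```python
def read8(M, addr):
    """M8[addr]: read 8 bytes, little-endian, from a bytes-like memory."""
    return int.from_bytes(M[addr:addr + 8], "little")

def step_mrmovq(PC, R, M):
    # Fetch
    icode = M[PC] >> 4
    rA, rB = M[PC + 1] >> 4, M[PC + 1] & 0xF
    valC = read8(M, PC + 2)
    valP = PC + 10
    assert icode == 0x5, "not an mrmovq instruction"
    # Decode
    valB = R[rB]
    # Execute: the ALU adds base and displacement
    valE = (valB + valC) & ((1 << 64) - 1)
    # Memory
    valM = read8(M, valE)
    # Write back (on a copy of the register file)
    R = dict(R)
    R[rA] = valM
    # PC update
    return valP, R

# mrmovq 8(r2), r1 with r2 = 16, so the effective address is 24.
instr = bytes([0x50, 0x12]) + (8).to_bytes(8, "little")
M = instr + bytes(14) + (0xDEADBEEF).to_bytes(8, "little")
PC, R = step_mrmovq(0, {1: 0, 2: 16}, M)
```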

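Ld Datapath

if MEM[PC] == mrmovq Disp(rB), rA
    EA = Disp + R[rB]
    R[rA] ← MEM[EA]
    PC ← PC + 10

[Figure: single-cycle datapath for mrmovq across IF, ID, EX, MEM, WB, with combinational state update logic. Labeled elements: PC with a +10 adder (ADD); Instruction Mem (Instr Addr, Instruction); Register file (rA, rB, valA, valB, DestE, valE, RegWrite); MUXes (MUX Select) choosing between rB and rA, bringing in the displacement D, and choosing From ALU or From Mem; ALU (ALU OP); Data Mem (Address, Read Data, Write Data)]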


Executing rmmovq (St from Reg to Memory)

rmmovq rA, D(rB)      encoding: 4 0 | rA rB | D

• Fetch
  • Read 10 bytes
• Decode
  • Read operand registers
• Execute
  • Compute effective address
• Memory
  • Write to memory
• Write back
  • Do nothing
• PC Update
  • Increment PC by 10

Stage Computation: rmmovq

• Use ALU for address computation

Stage        rmmovq rA, D(rB)           Comment
Fetch        icode:ifun ← M1[PC]        Read instruction byte
             rA:rB ← M1[PC+1]           Read register byte
             valC ← M8[PC+2]            Read displacement D
             valP ← PC + 10             Compute next PC
Decode       valA ← R[rA]               Read operand A
             valB ← R[rB]               Read operand B
Execute      valE ← valB + valC         Compute effective address
Memory       M8[valE] ← valA            Write value to memory
Write back
PC update    PC ← valP                  Update PC
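The store differs from the load only in the Memory and Write back rows, which a short Python sketch makes visible. `step_rmmovq` is a hypothetical name; memory is a mutable `bytearray` updated in place, and values are assumed to be stored little-endian, as in Y86-64.

```python
def step_rmmovq(PC, R, M):
    """Execute one rmmovq at PC; M is a bytearray and is updated in place."""
    # Fetch
    icode = M[PC] >> 4
    rA, rB = M[PC + 1] >> 4, M[PC + 1] & 0xF
    valC = int.from_bytes(M[PC + 2:PC + 10], "little")
    valP = PC + 10
    assert icode == 0x4, "not an rmmovq instruction"
    # Decode
    valA, valB = R[rA], R[rB]
    # Execute: the ALU adds base and displacement
    valE = (valB + valC) & ((1 << 64) - 1)
    # Memory
    M[valE:valE + 8] = valA.to_bytes(8, "little")
    # Write back: nothing to do
    # PC update
    return valP

# rmmovq r1, 8(r2) with r2 = 16, so the value of r1 lands at address 24.
M = bytearray([0x40, 0x12]) + (8).to_bytes(8, "little") + bytes(22)
PC = step_rmmovq(0, {1: 0x1122334455667788, 2: 16}, M)
```

The register file is read but never written here, which is why the store datapath needs MemWrite asserted and RegWrite deasserted.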

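St Datapath

if MEM[PC] == rmmovq rA, Disp(rB)
    EA = Disp + R[rB]
    MEM[EA] ← R[rA]
    PC ← PC + 10

[Figure: single-cycle datapath for rmmovq across IF, ID, EX, MEM, WB, with combinational state update logic. Labeled elements: PC with a +10 adder (ADD); Instruction Mem (Instr Addr, Instruction); Register file (rA, rB, valA, valB, DestE, valE, RegWrite); MUXes (MUX Select) choosing between rB and rA, bringing in the displacement D, and choosing From ALU or From Mem; ALU (ALU OP); Data Mem (Address, Read Data, Write Data, MemWrite)]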


Executing irmovq (Move Imm to Reg)

irmovq V, rB      encoding: 3 0 | F rB | V

• Fetch
  • Read 10 bytes
• Decode
  • Nothing to read (no source register)
• Execute
  • Add 0 to V
• Memory
  • Do nothing
• Write back
  • Write V to rB
• PC Update
  • Increment PC by 10

Stage Computation: irmovq

• Use ALU to pass the immediate through (add 0)

Stage        irmovq V, rB               Comment
Fetch        icode:ifun ← M1[PC]        Read instruction byte
             rA:rB ← M1[PC+1]           Read register byte
             valC ← M8[PC+2]            Read immediate V
             valP ← PC + 10             Compute next PC
Decode
Execute      valE ← 0 + valC            Pass the immediate through the ALU
Memory
Write back   R[rB] ← valE               Write back result
PC update    PC ← valP                  Update PC

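IRMov Datapath: Option 1

if MEM[PC] == irmovq V, rB
    R[rB] ← V + 0
    PC ← PC + 10

[Figure: single-cycle datapath for irmovq across IF, ID, EX, MEM, WB, with combinational state update logic. Labeled elements: PC with a +10 adder (ADD); Instruction Mem (Instr Addr, Instruction); Register file (rA, rB, valA, valB, DestE, valE, RegWrite); MUXes (MUX Select) choosing between rB and rA, bringing in D, choosing From ALU or From Mem, and an extra MUX with a constant 0 input; ALU (ALU OP); Data Mem (Address, Read Data, Write Data, MemWrite)]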


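IRMov Datapath: Option 2

if MEM[PC] == irmovq V, rB
    R[rB] ← V
    PC ← PC + 10

[Figure: single-cycle datapath for irmovq across IF, ID, EX, MEM, WB, with combinational state update logic. Same labeled elements as Option 1 but without the extra MUX and its constant 0 input: PC with a +10 adder (ADD); Instruction Mem; Register file (rA, rB, valA, valB, DestE, valE, RegWrite); MUXes (MUX Select) for rB/rA, D, and From ALU/From Mem; ALU (ALU OP); Data Mem (Address, Read Data, Write Data, MemWrite)]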


• Tradeoffs between option 1 and option 2?

Executing rrmovq (Move from Reg to Reg)

rrmovq rA, rB      encoding: 2 0 | rA rB

• Fetch
  • Read 2 bytes
• Decode
  • Read operand register rA
• Execute
  • Add 0 to valA
• Memory
  • Do nothing
• Write back
  • Write valA to rB
• PC Update
  • Increment PC by 2

Stage Computation: rrmovq

• Use ALU to pass valA through (add 0)

Stage        rrmovq rA, rB              Comment
Fetch        icode:ifun ← M1[PC]        Read instruction byte
             rA:rB ← M1[PC+1]           Read register byte
             valP ← PC + 2              Compute next PC
Decode       valA ← R[rA]               Read operand A
Execute      valE ← 0 + valA            Pass valA through the ALU
Memory
Write back   R[rB] ← valE               Write back result
PC update    PC ← valP                  Update PC

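rrmov Datapath: Option 1

if MEM[PC] == rrmovq rA, rB
    R[rB] ← R[rA]
    PC ← PC + 2

[Figure: single-cycle datapath for rrmovq across IF, ID, EX, MEM, WB, with combinational state update logic. Labeled elements: PC with a +2 adder (ADD); Instruction Mem (Instr Addr, Instruction); Register file (rA, rB, valA, valB, DestE, valE, RegWrite); MUXes (MUX Select) choosing between rB and rA, bringing in D, choosing From ALU or From Mem, and an extra MUX with a constant 0 input; ALU (ALU OP); Data Mem (Address, Read Data, Write Data, MemWrite)]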


rrmov Datapath: Option 2

if MEM[PC] == rrmovq rA, rB
    R[rB] ← R[rA]
    PC ← PC + 2

[Figure: single-cycle datapath for rrmovq across IF, ID, EX, MEM, WB, with combinational state update logic. Same labeled elements as Option 1 but without the extra MUX and its constant 0 input. The PC adder input is labeled "10 ?" in the figure, although the instruction text above increments the PC by 2]
