CSC 631: High-Performance Computer Architecture
Spring 2017
Lecture 2: Instruction Set Architectures
Analog Computers
§ An analog computer represents problem variables as some physical quantity (e.g., mechanical displacement, voltage on a capacitor) and uses scaled physical behavior to calculate results
Antikythera mechanism, c. 100 BC
[Marsyas, Creative Commons BY-SA 3.0]
Wingtip vortices off Cessna tail in wind tunnel
[BenFrantzDale, Creative Commons BY-SA 3.0]
Digital Computers
§ Represent problem variables as numbers encoded using discrete steps
- Discrete steps provide noise immunity
§ Enables accurate and deterministic calculations
- Same inputs give same outputs exactly
§ Not constrained by physically realizable functions
§ Programmable digital computers are the focus of computer architectures
Charles Babbage (1791-1871)
§ Lucasian Professor of Mathematics, Cambridge University, 1828-1839
§ A true “polymath” with interests in many areas
§ Frustrated by errors in printed tables, wanted to build machines to evaluate and print accurate tables
§ Inspired by earlier work organizing human “computers” to methodically calculate tables by hand
[Copyright expired and in public domain. Image obtained from Wikimedia Commons.]
Difference Engine 1822
§ Continuous functions can be approximated by polynomials, which can be computed from difference tables:
f(n) = n^2 + n + 41
d1(n) = f(n) - f(n-1) = 2n
d2(n) = d1(n) - d1(n-1) = 2
§ Can calculate using only a single adder:
n        0    1    2    3    4
d2(n)              2    2    2
d1(n)         2    4    6    8
f(n)    41   43   47   53   61
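The table above can be reproduced with nothing but repeated addition. The sketch below is a minimal illustration, not a model of Babbage's mechanism; the function name `tabulate` and its parameters are hypothetical, and the starting values f(0) = 41, d1(1) = 2 and d2 = 2 are read off the slide.

```python
def tabulate(f0, d1_first, d2, steps):
    """Tabulate a quadratic using additions only.

    f0:       f(0), the starting value
    d1_first: d1(1) = f(1) - f(0), the first difference
    d2:       the constant second difference
    steps:    how many further values of f to produce
    """
    values = [f0]
    f, d1 = f0, d1_first
    for _ in range(steps):
        f += d1          # f(n) = f(n-1) + d1(n)
        values.append(f)
        d1 += d2         # d1(n+1) = d1(n) + d2
    return values


# f(n) = n^2 + n + 41, starting from f(0) = 41, d1(1) = 2, d2 = 2
table = tabulate(41, 2, 2, 4)
print(table)
```

Each step needs two additions and no multiplication, which is why one adder used twice per step is enough.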
Realizing the Difference Engine
§ Mechanical calculator, hand-cranked, using decimal digits
§ Babbage did not complete the DE, moving on to the Analytical Engine (but used ideas from AE in improved DE 2 plan)
§ Scheutz completed working version in 1855, sold copy to British Government
§ Modern-day recreation of DE 2, including printer, showed entire design possible using original technology
- first at British Science Museum
- copy at Computer History Museum in Mountain View
[Geni, Creative Commons BY-SA 3.0]
Analytical Engine 1837
§ Recognized as first general-purpose digital computer
- Many iterations of the design (multiple Analytical Engines)
§ Contains the major components of modern computers:
- “Store”: Main memory where numbers and intermediate results were held (1,000 decimal words, 40 digits each)
- “Mill”: Arithmetic unit where processing was performed, including addition, multiplication, and division
- Also supported conditional branching and looping, and exceptions on overflow (machine jams and bell rings)
- Had a form of microcode (the “Barrel”)
§ Program, input and output data on punched cards
§ Instruction cards hold opcode and address of operands in store
- 3-address format with two sources and one destination, all in store
§ Branches implemented by mechanically changing order cards were inserted into machine
§ Only small pieces were ever built
Analytical Engine Design Choices
§ Decimal, because storage on mechanical gears
- Babbage considered binary and other bases, but no clear advantage over human-friendly decimal
§ 40-digit precision (equivalent to >133 bits)
- To reduce impact of scaling given lack of floating-point hardware
§ Used “locking” or mechanical amplification to overcome noise in transferring mechanical motion around machine
- Similar to non-linear gain in digital electronic circuits
§ Had a fast “anticipating” carry
- Mechanical version of pass-transistor carry propagate used in CMOS adders (and earlier in relay adders)
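The bit equivalence quoted for 40 decimal digits can be checked with a quick calculation. This is only a sanity check under an assumption: it counts the magnitude alone, so the slide's ">133 bits" presumably also counts the sign, which the slide does not state. The variable names are illustrative.

```python
import math

DIGITS = 40

# Information content of a 40-digit decimal magnitude, in bits.
bits_exact = DIGITS * math.log2(10)

# Whole bits needed to hold the largest 40-digit value, 10**40 - 1.
bits_needed = (10**DIGITS - 1).bit_length()

print(round(bits_exact, 2), bits_needed)
```

The magnitude alone comes to a little under 133 bits, so a binary word holding any 40-digit value plus a sign needs one bit beyond 133.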
Ada Lovelace (1815-1852)
§ Translated lectures of Luigi Menabrea, who published notes of Babbage’s lectures in Italy
§ Lovelace considerably embellished notes and described Analytical Engine program to calculate Bernoulli numbers that would have worked if AE was built
- The first program!
§ Imagined many uses of computers beyond calculations of tables
§ Was interested in modeling the brain
[By Margaret Sarah Carpenter, Copyright expired and in public domain]
Early Programmable Calculators
§ Analog computing was popular in first half of 20th century as digital computing was too expensive
§ But during late 30s and 40s, several programmable digital calculators were built (date when operational)
- Atanasoff Linear Equation Solver (1939)
- Zuse Z3 (1941)
- Harvard Mark I (1944)
- ENIAC (1946)
Atanasoff-Berry Linear Equation Solver (1939)
§ Fixed-function calculator for solving up to 29 simultaneous linear equations
§ Digital binary arithmetic (50-bit fixed-point words)
§ Dynamic memory (rotating drum of capacitors)
§ Vacuum tube logic for processing
[Manop, Creative Commons BY-SA 3.0]
In 1973, Atanasoff was credited as inventor of “automatic electronic digital computer” after patent dispute with Eckert and Mauchly (ENIAC)
Zuse Z3 (1941)
§ Built by Konrad Zuse in wartime Germany using 2000 relays
§ Had normalized floating-point arithmetic with hardware handling of exceptional values (+/- infinity, undefined)
- 1-bit sign, 7-bit exponent, 14-bit significand
§ 64 words of memory
§ Two-stage pipeline: 1) fetch & execute 2) writeback
§ No conditional branch
§ Programmed via paper tape
Replica of the Zuse Z3 in the Deutsches Museum, Munich
[Venusianer, Creative Commons BY-SA 3.0]
Harvard Mark I (1944)
§ Proposed by Howard Aiken at Harvard, and funded and built by IBM
§ Mostly mechanical with some electrically controlled relays and gears
§ Weighed 5 tons and had 750,000 components
§ Stored 72 numbers each of 23 decimal digits
§ Speed: adds 0.3 s, multiplies 6 s, divides 15 s, trig >1 minute
§ Instructions on paper tape (2-address format)
§ Could run long programs automatically
§ Loops by gluing paper tape into loops
§ No conditional branch
§ Although mentioned Babbage in proposal, was more limited than Analytical Engine
[Waldir, Creative Commons BY-SA 3.0]
ENIAC (1946)
§ First electronic general-purpose computer
§ Construction started in secret at UPenn Moore School of Electrical Engineering during WWII to calculate firing tables for US Army, designed by Eckert and Mauchly
§ 17,468 vacuum tubes
§ Weighed 30 tons, occupied 1800 sq ft, power 150 kW
§ Twelve 10-decimal-digit accumulators
§ Had a conditional branch!
§ Programmed by plugboard and switches, time consuming!
§ Purely electronic instruction fetch and execution, so fast
- 10-digit x 10-digit multiply in 2.8 ms (2000x faster than Mark-1)
§ As a result of speed, it was almost entirely I/O bound
§ As a result of large number of tubes, it was often broken (5 days was longest time between failures)
ENIAC
[Public Domain, US Army Photo]
Changing the program could take days!
EDVAC
§ ENIAC team started discussing stored-program concept to speed up programming and simplify machine design
§ John von Neumann was consulting at UPenn and typed up ideas in “First Draft of a Report on the EDVAC”
§ Herman Goldstine circulated the draft June 1945 to many institutions, igniting interest in the stored-program idea
- But also, ruined chances of patenting it
- Report falsely gave sole credit to von Neumann for the ideas
- Maurice Wilkes was excited by report and decided to come to US workshop on building computers
§ Later, in 1948, modifications to ENIAC allowed it to run in stored-program mode, but 6x slower than hardwired
- Due to I/O limitations, this speed drop was not practically significant and improvement in productivity made it worthwhile
§ EDVAC eventually built and (mostly) working in 1951
- Delayed by patent disputes with university
Manchester SSEM “Baby” (1948)
§ Manchester University group built a small-scale experimental machine to demonstrate idea of using cathode-ray tubes (CRTs) for computer memory instead of mercury delay lines
§ Williams-Kilburn Tubes were first random access electronic storage devices
§ 32 words of 32 bits, accumulator, and program counter
§ Machine ran world’s first stored program in June 1948
§ Led to later Manchester Mark-1 full-scale machine
- Mark-1 introduced index registers
- Mark-1 commercialized by Ferranti
Williams-Kilburn Tube Store
[Piero71, Creative Commons BY-SA 3.0]
Cambridge EDSAC (1949)
§ Maurice Wilkes came back from workshop in US and set about building a stored-program computer in Cambridge
§ EDSAC used mercury delay-line storage to hold up to 1024 words (512 initially) of 17 bits (+1 bit of padding in delay line)
§ Two’s-complement binary arithmetic
§ Accumulator ISA with self-modifying code for indexing
§ David Wheeler, who earned the world’s first computer science PhD, invented the subroutine (“Wheeler jump”) for this machine
- Users built a large library of useful subroutines
§ UK’s first commercial computer, LEO-I (Lyons Electronic Office), was based on EDSAC, ran business software in 1951
- Software for LEO was still running in the 1980s in emulation on ICL mainframes!
§ EDSAC-II (1958) was first machine with microprogrammed control unit
Commercial computers: BINAC (1949) and UNIVAC (1951)
§ Eckert and Mauchly left U. Penn after patent rights disputes and formed the Eckert-Mauchly Computer Corporation
§ World’s first commercial computer was BINAC with two CPUs that checked each other
- BINAC apparently never worked after shipment to first (only) customer
§ Second commercial computer was UNIVAC
- Used mercury delay-line memory, 1000 words of 12 alpha characters
- Famously used to predict presidential election in 1952
- Eventually 46 units sold at >$1M each
- Often mistakenly called the IBM UNIVAC
IBM 701 (1952)
§ IBM’s first commercial scientific computer
§ Main memory was 72 Williams Tubes, each 1 Kib, for total of 2048 words of 36 bits each
- Memory cycle time of 12 µs
§ Accumulator ISA with multiplier/quotient register
§ 18-bit/36-bit numbers in sign-magnitude fixed-point
§ Misquote from Thomas Watson Sr/Jr:
“I think there is a world market for maybe five computers”
§ Actually TW Jr said at shareholder meeting:
“as a result of our trip [selling the 701], on which we expected to get orders for five machines, we came home with orders for 18.”
IBM 650 (1953)
§ The first mass-produced computer
§ Low-end system with drum-based storage and digit-serial ALU
§ Almost 2,000 produced
[Cushing Memorial Library and Archives, Texas A&M, Creative Commons Attribution 2.0 Generic]
IBM 650 Architecture
[Figure from 650 Manual, © IBM. Labeled blocks: magnetic drum (1,000 or 2,000 10-digit decimal words); 20-digit accumulator; active instruction (including next program counter); digit-serial ALU]
IBM 650 Instruction Set
§ Address and data in 10-digit decimal words
§ Instructions encode:
- Two-digit opcode encoded 44 instructions in base instruction set, expandable to 97 instructions with options
- Four-digit data address
- Four-digit next instruction address
- Programmers arrange code to minimize drum latency!
§ Special instructions added to compare value to all words on track
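The 2 + 4 + 4 digit split of a 10-digit instruction word can be sketched as a small decoder. It assumes the fields are packed in the order listed above, with the opcode in the two most significant digits, and it ignores the word's sign. The function name `decode_650` and the sample word are hypothetical and do not correspond to any particular 650 instruction.

```python
def decode_650(word):
    """Split a 10-digit decimal word into (opcode, data_addr, next_addr).

    Assumed layout, most significant digit first: 2-digit opcode,
    4-digit data address, 4-digit next-instruction address.
    """
    if not 0 <= word < 10**10:
        raise ValueError("expected a non-negative 10-digit decimal word")
    opcode, rest = divmod(word, 10**8)           # top two digits
    data_addr, next_addr = divmod(rest, 10**4)   # two four-digit fields
    return opcode, data_addr, next_addr


# Hypothetical word: opcode 15, data address 1234, next instruction at 0056
fields = decode_650(1512340056)
print(fields)
```

Because every instruction names its own successor, a programmer could place the next instruction at whatever drum address would be passing under the read head just as the current one finished, instead of waiting for a full rotation.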
Early Instruction Sets
§ Very simple ISAs, mostly single-address accumulator-style machines, as high-speed circuitry was expensive
- Based on earlier “calculator” model
§ Over time, appreciation of software needs shaped ISA
§ Index registers (Kilburn, Mark-1) added to avoid need for self-modifying code to step through array
§ Over time, more index registers were added
§ And more operations on the index registers
§ Eventually, just provide general-purpose registers (GPRs) and orthogonal instruction sets
§ But some other options explored…
Burroughs B5000 Stack Architecture: Robert Barton, 1960
§ Hide instruction set completely from programmer using high-level language (ALGOL)
§ Use stack architecture to simplify compilation, expression evaluation, recursive subroutine calls, interrupt handling, …
Evaluation of Expressions
(a + b * c) / (a + d * c - e)
[Figure: expression tree with / at the root; left subtree a + b * c, right subtree a + d * c - e]
Reverse Polish: a b c * + a d c * + e - /
Evaluation stack, step by step:
- push a, push b, push c: stack holds a, b, c
- multiply: stack holds a, b * c
- add: stack holds a + b * c
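The stack steps shown above generalize to the whole Reverse Polish string. The following is a minimal sketch of a stack evaluator, not B5000 code: operands are pushed, and each operator pops two entries and pushes the result. The names `eval_rpn` and `env`, and the sample variable values, are assumptions made for illustration.

```python
import operator

# Binary operators that can appear in the Reverse Polish string.
OPS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}


def eval_rpn(tokens, env):
    """Evaluate a Reverse Polish token list using an explicit stack."""
    stack = []
    for tok in tokens:
        if tok in OPS:
            right = stack.pop()   # top of stack is the right operand
            left = stack.pop()
            stack.append(OPS[tok](left, right))
        else:
            stack.append(env[tok])  # push the variable's value
    if len(stack) != 1:
        raise ValueError("malformed expression")
    return stack[0]


# Sample values chosen for illustration only.
env = {"a": 2.0, "b": 3.0, "c": 4.0, "d": 5.0, "e": 6.0}
rpn = "a b c * + a d c * + e - /".split()
result = eval_rpn(rpn, env)
print(result)
```

With these values the numerator is 2 + 3 * 4 = 14 and the denominator is 2 + 5 * 4 - 6 = 16. No instruction names an operand location, which is the source of the compilation simplicity the B5000 slide points to.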
IBM’s Big Bet: 360 Architecture
§ By early 1960s, IBM had several incompatible families of computer:
701 → 7094
650 → 7074
702 → 7080
1401 → 7010
§ Each system had its own
- Instruction set
- I/O system and secondary storage (magnetic tapes, drums and disks)
- assemblers, compilers, libraries, ...
- market niche (business, scientific, real time, ...)
IBM 360: Design Premises
Amdahl, Blaauw and Brooks, 1964
§ The design must lend itself to growth and successor machines
§ General method for connecting I/O devices
§ Total performance - answers per month rather than bits per microsecond ⇒ programming aids
§ Machine must be capable of supervising itself without manual intervention
§ Built-in hardware fault checking and locating aids to reduce down time
§ Simple to assemble systems with redundant I/O devices, memories etc. for fault tolerance
§ Some problems required floating-point larger than 36 bits
Stack versus GPR Organization
Amdahl, Blaauw and Brooks, 1964
1. The performance advantage of push-down stack organization is derived from the presence of fast registers and not the way they are used.
2. “Surfacing” of data in stack which are “profitable” is approximately 50% because of constants and common subexpressions.
3. Advantage of instruction density because of implicit addresses is equaled if short addresses to specify registers are allowed.
4. Management of finite-depth stack causes complexity.
5. Recursive subroutine advantage can be realized only with the help of an independent stack for addressing.
6. Fitting variable-length fields into fixed-width word is awkward.
IBM 360: A General-Purpose Register (GPR) Machine
§ Processor State
- 16 General-Purpose 32-bit Registers
  - may be used as index and base register
  - Register 0 has some special properties
- 4 Floating Point 64-bit Registers
- A Program Status Word (PSW)
  - PC, Condition codes, Control flags
§ A 32-bit machine with 24-bit addresses
- But no instruction contains a 24-bit address!
§ Data Formats
- 8-bit bytes, 16-bit half-words, 32-bit words, 64-bit double-words
The IBM 360 is why bytes are 8 bits long today!
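One way to see how a machine with 24-bit addresses can avoid putting a 24-bit address in any instruction is base-plus-displacement addressing. An instruction names a base register (and optionally an index register) and carries a short 12-bit displacement, and the hardware adds them. The sketch below is a simplified model rather than a full description of the 360. The function name `effective_address` is hypothetical, and treating register number 0 as "contributes zero" is taken here as one of the special properties of Register 0.

```python
ADDR_MASK = (1 << 24) - 1   # addresses are 24 bits wide
DISP_LIMIT = 1 << 12        # displacement field is 12 bits


def effective_address(regs, base, index, disp):
    """Form a 24-bit address from base register, index register and displacement.

    regs:  list of 16 general-purpose register values
    base:  base register number (0 means no base contribution)
    index: index register number (0 means no index contribution)
    disp:  unsigned 12-bit displacement carried in the instruction
    """
    if not 0 <= disp < DISP_LIMIT:
        raise ValueError("displacement must fit in 12 bits")
    b = regs[base] if base != 0 else 0
    x = regs[index] if index != 0 else 0
    return (b + x + disp) & ADDR_MASK


regs = [0] * 16
regs[0] = 0xFFFF      # ignored when register 0 is named as base or index
regs[12] = 0x012000   # base of some data area
regs[3] = 0x10        # index into it

addr = effective_address(regs, base=12, index=3, disp=0x0FF)
print(hex(addr))
```

The instruction spends only 4 + 4 + 12 bits on the operand address, yet it can reach any location in the 24-bit address space once a base register is loaded.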
IBM 360: Initial Implementations
                Model 30            ...   Model 70
Storage         8K - 64 KB                256K - 512 KB
Datapath        8-bit                     64-bit
Circuit Delay   30 nsec/level             5 nsec/level
Local Store     Main Store                Transistor Registers
Control Store   Read only 1 μsec          Conventional circuits
The IBM 360 instruction set architecture (ISA) completely hid the underlying technological differences between various models.
Milestone: The first true ISA designed as portable hardware-software interface!
With minor modifications it still survives today!
Acknowledgements
§ These course notes were developed by:
- Krste Asanovic (UCB)
- Arvind (MIT)
- Joel Emer (Intel/MIT)
- James Hoe (CMU)
- John Kubiatowicz (UCB)
- David Patterson (UCB)