Grading
The first priority: Maximize Learning. Your grade will depend on how much you have learned.
2 midterm exams (40% to …%), quizzes and homework (20% to …%)
Final Exam (40%)
Activity in class: questions and discussion in class earn you points and improve the quality of teaching.
Textbook: John L. Hennessy & David A. Patterson,
Morgan Kaufmann, 3rd Edition,
Computer System Architecture
Very late attendance will be counted as an absence.
Using a mobile phone is NOT ALLOWED during class.
Questions and discussions are the best way to communicate.
A digital computer is a machine composed of the following three basic components: CPU, memory, and input/output devices.
The Von Neumann Machine, 1945
The Von Neumann model consists of five major components: the input unit, the output unit, the memory unit, the arithmetic-logic unit, and the control unit.
Computer Architecture ~ PSUT
Von Neumann Model
A refinement of the Von Neumann model, the system bus model, has a CPU (ALU and control), memory, and an input/output unit.
Communication among components is handled by a shared pathway called the system bus, which is made up of the data bus, the address bus, and the control bus. There is also a power bus.
The CPU
CPU (central processing unit) is an older term for processor and microprocessor, the central unit in a computer containing the logic circuitry that performs the instructions of a computer's programs.
RISC Chips
Introduced in the mid-1980s.
Requires fewer transistors and a smaller set of instructions than the complex CPUs that had ever-larger sets of instructions.
030115 Computer Architecture
RISC or CISC: The Great Controversy
RISC proponents argue that RISC machines are both cheaper and faster, and are therefore the machines of the future.
Skeptics note that by making the hardware simpler, RISC architectures put a greater burden on the software. They argue that this is not worth the trouble because conventional microprocessors are becoming increasingly fast and cheap anyway.
The TRUTH:
CISC and RISC implementations are becoming more and more alike. Many of today's RISC chips support as many instructions as yesterday's CISC chips. And today's CISC chips use many techniques formerly associated with RISC chips.
Under the hood of a typical CPU
Processing speed
- The clock frequency is one measure of how fast a computer is;
- however, the length of time to carry out an operation depends not only on how fast the processor cycles, but on how many cycles are required to perform a given operation.
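The point above can be made concrete with the usual execution-time formula: time = instructions × cycles per instruction ÷ clock rate. The formula is standard background and the numbers below are invented for illustration, not figures from the slides:

```python
def execution_time(instruction_count, cycles_per_instruction, clock_hz):
    """Total time = instructions x average cycles per instruction / clock rate."""
    return instruction_count * cycles_per_instruction / clock_hz

# A faster clock does not guarantee a faster machine if each
# operation needs more cycles (hypothetical numbers):
t_a = execution_time(1_000_000, 2, 500_000_000)    # 500 MHz, 2 cycles/op
t_b = execution_time(1_000_000, 5, 1_000_000_000)  # 1 GHz, 5 cycles/op
print(t_a, t_b)  # machine A finishes first despite the slower clock
```

Here the 500 MHz machine wins because its operations take fewer cycles, which is exactly why clock frequency alone is an incomplete measure of speed.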
Voltage requirement
Transistors (electronic switches) in the CPU require some voltage to trigger them.
- In the pre-486 days, everything was 5 volts.
- As chips got faster and power became a concern, designers dropped the chip voltage down to 3.3 volts (external voltage) and 2.8V or 2.5V core voltage.
More on Voltage Requirements…
Power consumption equates largely with heat generation, which is a primary enemy in achieving increased performance.
Newer processors are larger and faster, and keeping them cool can be a major concern.
Reducing power usage is a primary objective for the designers of notebook computers, since they run on batteries with a limited life. (They also are more sensitive to heat problems, since their components are crammed into such a small space.)
Designers compensate by using lower-power semiconductor processes and by shrinking the circuit size and die size. Newer processors reduce voltage levels even more by using what is called a dual voltage, or split rail, design.
A split rail processor uses two different voltages. The external or I/O voltage is higher, typically 3.3V for compatibility with the other chips on the motherboard. The internal or core voltage is lower, typically 2.5 to 2.8 volts. This design allows these lower voltage CPUs to be used without requiring wholesale changes to motherboards, chipsets, etc.
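To see why designers chase lower voltages, the classic CMOS dynamic-power approximation P ≈ C·V²·f helps. This formula is standard background rather than something stated in the slides, and the capacitance and frequency values below are made up for illustration:

```python
def dynamic_power(capacitance_f, voltage_v, frequency_hz):
    """Approximate CMOS dynamic power: P = C * V^2 * f."""
    return capacitance_f * voltage_v ** 2 * frequency_hz

# Dropping the supply from 5 V to a 2.8 V core at the same frequency
# cuts dynamic power by roughly 3x (illustrative C and f values):
p_5v  = dynamic_power(1e-9, 5.0, 100e6)
p_2v8 = dynamic_power(1e-9, 2.8, 100e6)
print(p_5v / p_2v8)
```

Because voltage enters quadratically, even a modest voltage drop saves far more power (and heat) than a comparable frequency reduction would.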
MEMORY
Computers have hierarchies of memories that may be classified according to function, capacity and response times.
Function
"Reads" transfer information from the memory; "writes" transfer information to the memory:
- Random Access Memory (RAM) performs both reads and writes.
- Read-Only Memory (ROM) contains information stored at the time of manufacture that can only be read.
- Programmable Read-Only Memory (PROM) is ROM that can be written once at some point after manufacture.
Capacity
byte = 8 bits;
In modern computers, the total memory may range from, say, 16 MB in a small personal computer to several GB (gigabytes) in large supercomputers.
Memory response is characterized by two different measures:
- Access Time (also termed response time or latency) defines how quickly the memory can respond to a read or write request.
- Memory Cycle Time refers to the minimum period between two successive requests of the memory.
Access times vary from about 80 ns (ns = nanosecond = 10^(-9) seconds) for chips in small personal computers to about 10 ns or less for the fastest chips in caches and buffers. For various reasons, the memory cycle time is more than the speed of the memory chips (i.e., the length of time between successive requests is more than the 80 ns speed of the chips in a small personal computer).
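The distinction between access time and cycle time can be made concrete: cycle time, not access time, limits how often requests can be issued, and so it caps sustained request rate. The 120 ns cycle time below is an assumed value for illustration:

```python
def max_request_rate(cycle_time_s):
    """Sustained requests per second are limited by the memory cycle time."""
    return 1.0 / cycle_time_s

access_time = 80e-9   # 80 ns to answer any single request (from the text)
cycle_time  = 120e-9  # but a new request is accepted only every 120 ns (assumed)

rate = max_request_rate(cycle_time)
print(rate)  # well below the 1/80ns = 12.5 million requests/s access time suggests
```

This is the practical meaning of "cycle time is more than the speed of the chips": back-to-back requests run slower than one isolated access would imply.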
The BUS
To connect the CPU, memory, and the many other devices (display adapter, disk adapter, etc.), the BUS is used.
Definition:
The BUS is simply a common set of wires that connect all the computer devices and chips together.
Some of these wires are used to transmit data. Some send housekeeping signals, like the clock pulse. Some transmit a number (the "address") that identifies a particular device or memory location.
Use of the address
The computer chips and devices watch the address wires and must see their identifying number (address) transmitted before they can transfer data.
Problem!
Starting with machines that used the 386 CPU, CPUs and memory ran faster than other I/O devices.
Solution
Separate the CPU and memory from all the I/O. Today, memory is only added by plugging it into special sockets on the main computer board.
Bus Speeds
Multiple buses with different speeds is one option; alternatively, a single bus supporting different speeds is used.
In a modern PC, there may be a half dozen different BUS areas.
There is certainly a "CPU area" that still contains the CPU, memory, and basic control logic.
There is a "High Speed I/O Device" area that is either a VESA Local Bus (VLB) or a PCI Bus.
In 1987 IBM introduced a new Microchannel (MCA) bus. Competing vendors responded with an extension of the older ISA interface called EISA.
The PCI bus was developed by Intel.
PCI is a 64-bit interface in a 32-bit package.
The PCI bus runs at 33 MHz and can transfer 32 bits of data (four bytes) every clock tick.
That sounds like a 32-bit bus! However, a clock tick at 33 MHz is 30 nanoseconds, and memory only has a speed of 70 nanoseconds. When the CPU fetches data from RAM, it has to wait at least three clock ticks for the data. By transferring data every clock tick, the PCI bus can deliver the same throughput on a 32-bit interface that other parts of the machine deliver through a 64-bit path.
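The arithmetic in the paragraph above can be checked directly. The 33 MHz clock and 70 ns memory speed are the text's figures; treating 33 MHz as exactly 33,000,000 Hz is a simplification:

```python
import math

clock_hz = 33_000_000          # PCI clock from the text
tick_s = 1 / clock_hz          # one tick is about 30 ns
memory_ns = 70                 # memory speed from the text

# Whole clock ticks the CPU must wait for a 70 ns memory access:
wait_ticks = math.ceil(memory_ns / (tick_s * 1e9))
print(wait_ticks)              # 3

# Peak PCI throughput: 4 bytes delivered every tick
throughput_mb_s = 4 * clock_hz / 1_000_000
print(throughput_mb_s)         # 132.0 (MB/s)
```

The 132 MB/s figure is why a 32-bit bus that moves data on every tick keeps pace with slower parts of the machine that use a 64-bit path.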
Things to Know about the I/O Bus
Buses transfer information between parts of a computer. Smaller computers have a single bus; more advanced computers have complex interconnection strategies.
Things to know about the bus:
Transaction: unit of communication on the bus.
Bus Master: the module controlling the bus at a particular time.
Arbitration Protocol: set of signals exchanged to decide which of two competing modules will control the bus at a particular time.
Communication Protocol: algorithm used to transfer data on the bus.
Asynchronous Protocol: communication algorithm that can begin at any time; requires overhead to notify receivers that a transfer is about to begin.
Things to Know about the Bus, continued
Synchronous Protocol: communication algorithm that can begin only at well-known times defined by a global clock.
Transfer Time: time for data to be transferred over the bus in a single transaction.
Bandwidth: data transfer capacity of the bus; usually expressed in bits per second (bps). Sometimes termed throughput.
Bandwidth and transfer time measure related things, but bandwidth takes into account required overheads and is usually a more useful measure of the speed of the bus.
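The last point can be illustrated with a toy calculation: once per-transaction overhead (arbitration, addressing) is counted, effective bandwidth falls below the raw transfer rate. All numbers below are invented for illustration:

```python
def effective_bandwidth(payload_bits, overhead_s, transfer_s):
    """Bits actually delivered per second, counting per-transaction overhead."""
    return payload_bits / (overhead_s + transfer_s)

payload = 32 * 8            # a 32-byte transaction (assumed)
transfer_time = 200e-9      # raw data transfer takes 200 ns (assumed)
overhead = 100e-9           # arbitration + addressing cost 100 ns (assumed)

raw = payload / transfer_time
real = effective_bandwidth(payload, overhead, transfer_time)
print(raw, real)  # effective bandwidth is only 2/3 of the raw rate here
```

This is why bandwidth, which folds in the overhead, is the more honest measure of bus speed than transfer time alone.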
Performance depends on how high-level statements are translated into machine instructions, and on how quickly those instructions are executed.
Computer architects must design a computer to meet functional requirements as well as price, power, and performance goals. Often, they also have to determine what the functional requirements are, which can be a major task.
Once a set of functional requirements has been established, the architect must try to optimize the design.
Here are three major application areas and their main requirements:
Desktop computers: focus on optimizing cost-performance as measured by a single user, with little regard for program size or power consumption.
Server computers: focus on availability, scalability, and throughput cost-performance.
Embedded computers: driven by price and often power issues; code size is also important.
Tasks of Computer Architects
Applications Change over Time
Datasets & memory requirements ⇒ larger: cache & memory architecture become more critical.
Standalone ⇒ networked: I/O integration & system software become more critical.
Single task ⇒ multiple tasks: parallel architectures become critical.
Application Properties to Exploit in Computer Design
Locality in memory/I/O references: programs work on a subset of instructions/data at any point in time, with both spatial and temporal locality.
Parallelism:
- Data-level (DLP): same operation on every element of a data sequence.
- Instruction-level (ILP): independent instructions within a sequential program.
- Thread-level (TLP): parallel tasks within one program.
- Multiprogramming: independent programs.
- Pipelining.
Predictability: control-flow direction, memory references, data values.
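Spatial locality, mentioned above, is what caches exploit by fetching a whole block on each miss. The toy direct-mapped cache below (its sizes and the two access patterns are invented for illustration) shows sequential access hitting far more often than block-sized strides:

```python
def hit_rate(addresses, block_size=16, num_lines=8):
    """Count hits in a toy direct-mapped cache (byte addresses)."""
    lines = [None] * num_lines          # each line remembers one block number
    hits = 0
    for addr in addresses:
        block = addr // block_size
        line = block % num_lines
        if lines[line] == block:
            hits += 1
        else:
            lines[line] = block         # miss: fetch the whole block
    return hits / len(addresses)

sequential = list(range(256))               # walk bytes in order
strided = list(range(0, 256 * 16, 16))      # jump a full block every access
print(hit_rate(sequential), hit_rate(strided))  # 0.9375 vs 0.0
```

Sequential code reuses each fetched block 15 more times, while the strided pattern never revisits a block, which is exactly the property computer designers exploit.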
Levels of Machines
A modern meaning of the term computer architecture covers three aspects of computer design: instruction set architecture, computer organization and computer hardware.
Instruction Set Architecture (ISA) refers to the actual programmer-visible machine interface, such as the instruction set, registers, memory organization and exception handling. Two main approaches: RISC and CISC architectures.
Computer organization and computer hardware are two components of the implementation of a machine.
Computer Architecture
[Figure: system levels: Compiler, OS, Instruction Set, Processor]
Complex Instruction Set (CISC)
Powerful instructions (string search, block move, FFT, etc.)
Usually have variable-length instructions.
Registers have specialized functions.
Reduced Instruction Set (RISC)
Usually fixed-length instructions.
Large orthogonal register sets.
RISC = Reduced Instruction Set Computer. CISC = Complex (and Powerful) Instruction Set Computer.
What does MIPS stand for?
Microprocessor without Interlocked Pipeline Stages.
The MIPS processor is one of the first RISC processors. Again, virtually all processor architectures announced after 1985 have been of RISC style.
What is the main example of a CISC architecture processor? Intel processors (in over 90% of computers).
RISC and CISC Architecture
RISC designers focused on two critical performance techniques in computer design:
- the exploitation of instruction-level parallelism, first through pipelining and later through multiple instruction issue;
- the use of cache, first in simple forms and later using sophisticated organizations and optimizations.
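Pipelining, named above, overlaps the stages of successive instructions. An idealized model (ignoring hazards and stalls, with made-up stage counts and timings) shows why the speedup approaches the number of pipeline stages:

```python
def pipelined_time(n_instructions, n_stages, stage_time_s):
    """Idealized pipeline: fill the pipe once, then one instruction completes per cycle."""
    return (n_stages + n_instructions - 1) * stage_time_s

def unpipelined_time(n_instructions, n_stages, stage_time_s):
    """Without pipelining, every instruction pays for all stages in sequence."""
    return n_instructions * n_stages * stage_time_s

n, stages, t = 1_000_000, 5, 1e-9   # illustrative values
speedup = unpipelined_time(n, stages, t) / pipelined_time(n, stages, t)
print(speedup)  # just under 5, the pipeline depth
```

Real pipelines fall short of this bound because of hazards and memory stalls, which is where the cache technique named alongside it comes in.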
RISC Architecture
All operations on data apply to data in registers and typically change the entire register.
The only operations that affect memory are load and store operations that move data from memory to a register or to memory from a register, respectively.
A small number of memory addressing modes.
The instruction formats are few in number, with all instructions typically being one size.
Large number of registers.
These simple properties lead to dramatic simplifications in the implementation of advanced pipelining techniques, which is why RISC architecture instruction sets were designed this way.
RISC ISA Characteristics
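The load/store discipline described above can be sketched with a toy register machine. The instruction names and the tiny program are invented for illustration; real RISC ISAs such as MIPS follow the same pattern, where arithmetic only ever touches registers:

```python
def run(program, memory):
    """Toy load/store machine: only LOAD/STORE touch memory; ADD works on registers."""
    regs = [0] * 8
    for op, a, b, c in program:
        if op == "LOAD":       # regs[a] <- memory[b]
            regs[a] = memory[b]
        elif op == "STORE":    # memory[b] <- regs[a]
            memory[b] = regs[a]
        elif op == "ADD":      # regs[a] <- regs[b] + regs[c]; no memory operand allowed
            regs[a] = regs[b] + regs[c]
    return regs, memory

# memory[2] = memory[0] + memory[1], expressed load/store style:
program = [
    ("LOAD",  0, 0, 0),
    ("LOAD",  1, 1, 0),
    ("ADD",   2, 0, 1),
    ("STORE", 2, 2, 0),
]
regs, mem = run(program, [10, 32, 0])
print(mem)  # [10, 32, 42]
```

Note how even a one-line computation becomes four instructions: that regularity, not brevity, is what makes these instructions easy to pipeline.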