ECE 4436 ECE 5367 Introduction to Computer Architecture and Design Ji Chen Section : T TH 1:00PM – 2:30PM Prerequisites: ECE 4436

Page 1:

ECE 4436 ECE 5367

Introduction to Computer Architecture and Design

Ji Chen

Section : T TH 1:00PM – 2:30PM

Prerequisites: ECE 4436

Page 2:

Instructor: Ji Chen
Email: [email protected]
Phone: (713)-743-4423
Office: W328
Office Hour: T TH 2:30-3:30 or by appointment

TA: None


Page 4:

Course Contents

1. Introduction, basic computer organization
2. Instruction formats, instruction sets and their design
3. ALU design: Adders, subtracters, logic operations
4. Multiplication, division, floating point arithmetic
5. Datapath design
6. Control design: Hardwired control, microprogrammed control
7. Pipelining
8. Memory systems
9. I/O

Page 5:

Grading

HW/Quiz/Lab   10%
Project       15%
Exam 1        25%
Exam 2        25%
Exam 3        25%

Web: http://www.egr.uh.edu/courses/ece/ECE5367/

Academic Honesty Statement

Page 6:

Computer Organization and Design: The Hardware/Software Interface, by David A. Patterson and John L. Hennessy, 3rd edition

Required NOT REQUIRED

Page 7:

Homework/quiz: There will be several graded homework/lab assignments. Homework turned in late will be accepted only under extraordinary circumstances.

Labs: Laboratory assignments may be worked in teams of two (2); however, there should be no collaboration between teams. Lab assignments turned in late will be penalized 25 points for each calendar day. Both students in a team will receive the same grade for the project.

Projects: Teams of four (4) describe the computer architecture of a modern technology.

Exams: Two mid-term exams and one final exam. A missed exam will result in a grade of zero. Let me know immediately if any situation arises.

Final Exam - TBD

Grading: Your final grade will be computed as follows:

HW/Quiz/Lab   10%
Project       15%
Exam 1        25%
Exam 2        25%
Exam 3        25%

Page 8:

• Since 1946 all computers have had 5 components:
  – Control
  – Datapath
  – Memory
  – Input
  – Output
• Control and Datapath together make up the Processor

Page 9:

• TI SuperSPARC™ TMS390Z50 in Sun SPARCstation 20

[Figure: system block diagram. Labels: Message Bus (MBus), SuperSPARC, Floating-point Unit, Integer Unit, Inst Cache, Ref MMU, Data Cache, Store Buffer, Bus Interface, L2$, CC, MBus Module, MBus, L64852 MBus control, M-S Adapter, SBus, DRAM Controller, SBus DMA, SCSI, Ethernet, STDIO, serial, kbd, mouse, audio, RTC, Floppy, SBus Cards]

Page 10:

Computer Architecture

• Coordination of many levels of abstraction
• Under a rapidly changing set of forces
• Design, Measurement, and Evaluation

[Figure: levels of abstraction. Labels: Application, Operating System, Compiler, Firmware, Instruction Set Architecture, Instr. Set Proc., I/O system, Datapath & Control, Digital Design, Circuit Design, Layout]

Page 11:

Forces on Computer Architecture

[Figure: Computer Architecture and the forces acting on it: Technology, Programming Languages, Operating Systems, History, Applications, Cleverness]

Page 12:

Mixed-Signal

Page 13:

Where are We Going??

ECE 5367, Spring 08

[Figure: “Moore’s Law” plot of Performance (log scale, 1 to 1000) versus Time (1980 to 2000). µProc (CPU): 60%/yr. (2X/1.5 yr). DRAM: 9%/yr. (2X/10 yrs). Processor-Memory Performance Gap: grows 50% / year.]

Arithmetic

[Figure: multiplier datapath. Labels include 34-bit ALU, HI register (16x2 bits), LO register (16x2 bits), Multiplicand Register, Booth Encoder, Control Logic, 34x2 MUX, 32=>34 sign extension, Input Multiplier, Input Multiplicand, Result[HI], Result[LO]]

Single/multicycle Datapaths

Pipelining

[Figure: four overlapped instructions, each passing through the stages IFetch, Dcd, Exec, Mem, WB]

Memory Systems

I/O

Page 14:

• Purchasing perspective
  – Given a collection of machines, which has the
    • Best performance?
    • Least cost?
    • Best performance / cost?
• Design perspective
  – Faced with design options, which has the
    • Best performance improvement?
    • Least cost?
    • Best performance / cost?
• Both require
  – basis for comparison
  – metric for evaluation
• Our goal: understand cost & performance implications of architectural choices

Page 15:

Two Notions of “Performance”

Which has higher performance?

• Time to do the task (Execution Time)
  – execution time, response time, latency
• Tasks per day, hour, week, sec, ns... (Performance)
  – throughput, bandwidth

Response time and throughput often are in opposition.

Plane        Speed      DC to Paris   Passengers   Throughput (pmph)
Boeing 747   610 mph    6.5 hours     470          286,700
Concorde     1350 mph   3 hours       132          178,200
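The throughput column can be reproduced with a few lines of Python. This is a minimal sketch: the names `planes` and `throughput_pmph` are illustrative, and "pmph" is read here as passenger-miles per hour (passengers times speed), an interpretation that matches the figures in the table.

```python
# Speed (mph) and passenger capacity, taken from the table above.
planes = {
    "Boeing 747": {"speed_mph": 610, "passengers": 470},
    "Concorde": {"speed_mph": 1350, "passengers": 132},
}


def throughput_pmph(speed_mph, passengers):
    """Passenger-miles per hour: passengers carried times cruising speed."""
    return passengers * speed_mph


for name, plane in planes.items():
    pmph = throughput_pmph(plane["speed_mph"], plane["passengers"])
    print(f"{name}: {pmph:,} pmph")
```

The faster plane has the lower throughput here, which is the tension between response time and throughput that the slide points to.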

Page 16:

Definitions

• Performance is in units of things-per-second
  – bigger is better
• If we are primarily concerned with response time

$$\text{performance}(x) = \frac{1}{\text{execution\_time}(x)}$$

"X is n times faster than Y" means

$$n = \frac{\text{Performance}(X)}{\text{Performance}(Y)}$$
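The two definitions translate directly into code. The sketch below is illustrative only: the function names `performance` and `times_faster` are not from the slides, and the sample inputs are the DC-to-Paris flight times (3 hours and 6.5 hours) from the plane comparison.

```python
def performance(execution_time):
    """Performance as the reciprocal of execution time (bigger is better)."""
    return 1.0 / execution_time


def times_faster(exec_time_x, exec_time_y):
    """Return n such that X is n times faster than Y."""
    return performance(exec_time_x) / performance(exec_time_y)


# Concorde (3 hours) versus Boeing 747 (6.5 hours), DC to Paris.
n = times_faster(3.0, 6.5)
print(f"Concorde is {n:.2f} times faster")
```

Because performance is the reciprocal of time, the ratio of performances equals the inverse ratio of execution times, 6.5 / 3 in this case.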

Page 17:

Example

• Time of Concorde vs. Boeing 747?
  – Concorde is 1350 mph / 610 mph = 2.2 times faster = 6.5 hours / 3 hours
• Throughput of Concorde vs. Boeing 747?
  – Concorde is 178,200 pmph / 286,700 pmph = 0.62 “times faster”
  – Boeing is 286,700 pmph / 178,200 pmph = 1.60 “times faster”
• Boeing is 1.6 times (“60%”) faster in terms of throughput
• Concorde is 2.2 times (“120%”) faster in terms of flying time

We will focus primarily on execution time for a single job.
Lots of instructions in a program => instruction throughput important!

Page 18:

$$\text{CPU time} = \frac{\text{Seconds}}{\text{Program}} = \frac{\text{Instructions}}{\text{Program}} \times \frac{\text{Cycles}}{\text{Instruction}} \times \frac{\text{Seconds}}{\text{Cycle}}$$
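As a rough illustration of this equation, the sketch below multiplies the three factors. The function name `cpu_time_seconds` and the sample numbers (one billion instructions, a CPI of 2.2, a 1 GHz clock) are assumptions chosen for the example, not values from the course.

```python
def cpu_time_seconds(instruction_count, cpi, clock_rate_hz):
    """Seconds per program = instructions x cycles/instruction x seconds/cycle."""
    seconds_per_cycle = 1.0 / clock_rate_hz
    return instruction_count * cpi * seconds_per_cycle


# Hypothetical program: 1e9 instructions, average CPI of 2.2, 1 GHz clock.
t = cpu_time_seconds(1_000_000_000, 2.2, 1_000_000_000)
print(f"CPU time: {t:.2f} s")
```

Each factor can be attacked separately: fewer instructions per program, fewer cycles per instruction, or a shorter clock cycle.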

Page 19:

Amdahl's Law

Speedup due to enhancement E:

$$\text{Speedup}(E) = \frac{\text{ExTime w/o } E}{\text{ExTime w/ } E} = \frac{\text{Performance w/ } E}{\text{Performance w/o } E}$$

Suppose that enhancement E accelerates a fraction F of the task by a factor S, and the remainder of the task is unaffected. Then:

$$\text{ExTime}(\text{with } E) = \left((1 - F) + \frac{F}{S}\right) \times \text{ExTime}(\text{without } E)$$

$$\text{Speedup}(\text{with } E) = \frac{1}{(1 - F) + \dfrac{F}{S}}$$
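The speedup formula is easy to experiment with in code. This is a minimal sketch; the function name `amdahl_speedup` and the sample values of F and S are assumptions picked for illustration.

```python
def amdahl_speedup(fraction_enhanced, factor):
    """Overall speedup when a fraction F of the task is sped up by a factor S."""
    if not 0.0 <= fraction_enhanced <= 1.0:
        raise ValueError("fraction_enhanced must be between 0 and 1")
    if factor <= 0:
        raise ValueError("factor must be positive")
    return 1.0 / ((1.0 - fraction_enhanced) + fraction_enhanced / factor)


# Hypothetical case: half of the task (F = 0.5) runs twice as fast (S = 2).
print(f"{amdahl_speedup(0.5, 2.0):.3f}")

# Even a very large S cannot push the speedup past 1 / (1 - F).
print(f"{amdahl_speedup(0.5, 1e9):.3f}")
```

With F = 0.5 the overall speedup stays below 2 however large S becomes, because the unaffected half of the task still takes its original time.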

Page 20:

Typical Mix (Base Machine)

Op       Freq   Cycles   CPI(i)   % Time
ALU      50%    1        0.5      23%
Load     20%    5        1.0      45%
Store    10%    3        0.3      14%
Branch   20%    2        0.4      18%
Total                    2.2

How much faster would the machine be if a better data cache reduced the average load time to 2 cycles?

How does this compare with using branch prediction to save a cycle off the branch time?

What if two ALU instructions could be executed at once?
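One way to explore these three questions is to recompute the average CPI for each modified mix. The sketch below is not an official solution: it assumes instruction count and clock cycle time stay fixed, so speedup is just the ratio of CPIs, and it models "two ALU instructions at once" as ALU instructions costing half a cycle each (perfect pairing). The names `base_mix`, `average_cpi`, and `speedup` are illustrative.

```python
# (frequency, cycles) per instruction class, from the Typical Mix table.
base_mix = {
    "ALU": (0.50, 1),
    "Load": (0.20, 5),
    "Store": (0.10, 3),
    "Branch": (0.20, 2),
}


def average_cpi(mix):
    """Weighted average cycles per instruction over the mix."""
    return sum(freq * cycles for freq, cycles in mix.values())


def speedup(new_mix):
    """Speedup over the base machine, assuming equal instruction count and clock."""
    return average_cpi(base_mix) / average_cpi(new_mix)


faster_loads = dict(base_mix, Load=(0.20, 2))       # loads take 2 cycles
faster_branches = dict(base_mix, Branch=(0.20, 1))  # one cycle saved per branch
paired_alu = dict(base_mix, ALU=(0.50, 0.5))        # assumed perfect ALU pairing

print(f"base CPI: {average_cpi(base_mix):.2f}")
for label, mix in [("better data cache", faster_loads),
                   ("branch prediction", faster_branches),
                   ("dual ALU issue", paired_alu)]:
    print(f"{label}: CPI {average_cpi(mix):.2f}, speedup {speedup(mix):.3f}")
```

Under these assumptions the cache improvement helps most, because loads account for the largest share of execution time in the base mix.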