CS211 Advanced Computer Architecture L01 IntroductionShanghaiTech/Fall-2020/... · CS211 Advanced...


Page 1: CS211 Advanced Computer Architecture L01 IntroductionShanghaiTech/Fall-2020/... · CS211 Advanced Computer Architecture L01 Introduction. Chundong Wang. September 7th, 2020. CS211@ShanghaiTech

CS211 Advanced Computer Architecture

L01 Introduction

Chundong Wang
September 7th, 2020

CS211@ShanghaiTech 1

Welcome to CS211

• Course information
  • Advanced topics related to computer architecture
• Prerequisite
  • CS110 (Computer Architecture I) of ShanghaiTech University, or equivalents
• Reference Textbook
  • Computer Architecture: A Quantitative Approach (English edition, 6th edition of the original). John L. Hennessy and David A. Patterson, China Machine Press, July 2019, ISBN 9787111631101
• Instructor
  • Chundong Wang (王春东)
  • Office: SIST 1A-504D
  • Email: wangchd
  • Office hour: 14:00-16:00, every Thursday in the Fall semester of 2020
• Teaching assistant (TA) of Fall 2020
  • Ms. Chunyan Rong (荣春燕)
  • Email: rongchy
• Online
  • https://toast-lab.gitee.io/courses/CS211@ShanghaiTech/Fall-2020/
  • Blackboard, Gradescope
  • WeChat group, in the charge of TA

Why CS211?

• To become a computer architect
  • Alumni of this class helped design your computer
  • To make the frequent cases fast and the rare case correct
• To learn what is under the hood of a computer
  • Innate curiosity
  • To better understand when things break
  • To write better code/applications
  • To write better system software (OS, compiler, DB, etc.)
• Because it is intellectually fascinating!
  • What is the most complex man-made single device?

Mikko Lipasti -- University of Wisconsin

Topics covered by CS211

• Microcode, instructions, ISA
• Pipeline
• Memory hierarchy
  • CPU cache/main memory/disk
• In-order execution and out-of-order execution
• Speculative execution
• Branch prediction
• VLIW
  • Very long instruction word
• Multi-threading
• Vectors
• GPU
  • Graphics processing unit
• Cache coherence
• Memory consistency
• Synchronization
• Virtual machines
• Cluster and distributed computing
• Security and Privacy

Evaluation

Item | Percentage | Note
Participation and quizzes | 10% | In the middle of some live lectures
Homework | 15% | Four or more homeworks to be assigned
Mid-term examination | 15% | 5th November, Wednesday (tentative)
Literature reading | 15% | Four (tentative) papers, details to be issued
Labs | 20% | Five (tentative) labs, details to be announced
Final examination | 25% | To be decided

Basic Rules

• No Abuse
  • Abusive words or behaviors are NOT allowed.
  • Be graceful and respectful.
• No Plagiarism
  • Do not copy ANYTHING from ANYWHERE at ANYTIME.
  • Be yourself.

Policy on Assignments and Independent Work

• With the exception of laboratories and assignments that explicitly permit you to work in groups, all homework and projects are to be YOUR work and your work ALONE.
• PARTNER TEAMS MAY NOT WORK WITH OTHER PARTNER TEAMS
• You can discuss your assignments with other students, and credit will be assigned to students who help others by answering questions (participation), but we expect that what you hand in is yours.
  • Level of detail allowed to discuss with other students: Concepts (material taught in the class or in the textbook)! Pseudocode is NOT allowed!
• Use the Office Hours of the TA and the Prof. if you need help with your homework/project!
  • Better to submit an incomplete homework, even for 0 points, than to risk an F!
• It is NOT acceptable to copy solutions from other students.
  • Never look at homework/project code that was not written by you or your team!
  • Never give your code to anybody else, so secure your computer when you are away from it
• It is NOT acceptable to copy (or start your) solutions from the Web.
• It is NOT acceptable to use PUBLIC GitHub/GitLab/Gitee archives (giving your answers away)
• We have tools and methods, developed over many years, for detecting this. You WILL be caught, and the penalties WILL be severe.
  • At the minimum: an F in the course, and a letter in your university record documenting the incident of cheating.
• Both Giver and Receiver are equally culpable and suffer equal penalties

Timetable before 1st October, 2020

Week | Date | Lecture | Task
1 | 7 Sept., Monday | Introduction & Review (I) | Survey
1 | 9 Sept., Wednesday | Introduction & Review (II) | Lab 0 (with credit)
2 | 14 Sept., Monday | Microcode, ISA, etc. | HW1, Paper Reading 1
3 | 21 Sept., Monday | Pipeline (I) | Paper Reading 2, Lab 0 due, HW2 (tentative)
3 | 23 Sept., Wednesday | Pipeline (II) |
4 | 28 Sept., Monday | Memory Hierarchy (I) | Lab 1, HW1 due

L01 Survey

https://www.wjx.cn/jq/89700035.aspx

Now, let’s dive into CA.

Computer Architecture

Layers of abstraction, from top to bottom:

• Application, e.g., WeChat, Meituan
• Algorithm, e.g., KMP, Quicksort
• Programming Language, e.g., C/C++/Python
• Operating System/Virtual Machines
• Instruction Set Architecture (ISA)
• Microarchitecture
• Gates/Register-Transfer Level (RTL)
• Circuits
• Devices
• Physics

Copyright © 2016 Elsevier Ltd. All rights reserved.

Computing Devices Long Long ago

EDSAC (Electronic Delay Storage Automatic Calculator), University of Cambridge, UK, May 1949

The 1st computer of China

http://www.ict.cas.cn/kxcb/kxtp/200909/t20090922_2514368.html

http://news.sciencenet.cn/sbhtmlnews/2019/9/349501.shtm

Computer 103 (103机)
• Kick-off: On August 1st, 1958, four instructions were executed on Computer 103, which had 1 KB of memory and ran 30 instructions per second.
• Products: In all, 41 machines were manufactured.
• The remaining intact one: Fudan University (复旦大学) purchased it in 1964. In 1973, Fudan donated it to Qufu Normal University (曲阜师范大学). It was delivered on two trucks, accompanied by three instructors. Qufu Normal University built a two-story building for it.

Computing Devices Now

Robots, Supercomputers, Automobiles, Laptops, Set-top boxes, Smart phones, Servers, Media Players, Sensor Nets, Routers, Cameras, Games

Architecture continually changing

Applications and Technology drive each other:
• Applications suggest how to improve technology, and provide revenue to fund development
• Improved technologies make new applications possible
• Cost of software development makes compatibility a major force in the market

Moore’s Law

Gordon Moore (Intel co-founder) predicts: 2X transistors / chip every 2 years
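As a rough illustration of what "2X every 2 years" implies, the sketch below computes the cumulative growth factor after a number of years. The function name `moore_growth_factor` and its parameters are made up for this example, and the numbers follow purely from the doubling rule, not from measured chip data.

```python
def moore_growth_factor(years: float, doubling_period_years: float = 2.0) -> float:
    """Growth factor in transistors per chip if the count doubles every period."""
    return 2.0 ** (years / doubling_period_years)


if __name__ == "__main__":
    # Idealized projection only: real chips do not follow the rule exactly.
    for years in (2, 10, 20):
        print(f"after {years:2d} years: {moore_growth_factor(years):.0f}x")
```

Under this idealized rule, ten years correspond to five doublings.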

"The number of transistors incorporated in a chip will approximately double every 24 months."—Gordon Moore, Intel co-founder

Single-Thread Processor Performance

[Figure: single-thread processor performance. Source: Hennessy & Patterson, 2017]

DRAM Capacity, Bandwidth, Latency Trends

[Figure: DRAM improvement (log scale), 1999 to 2017. Capacity: 128x, Bandwidth: 20x, Latency: 1.3x]

Upheaval in Computer Design

• Most of last 50 years, Moore’s Law ruled
  • Technology scaling allowed continual performance/energy improvements without changing software model
• Last decade, technology scaling slowed/stopped
  • Dennard (voltage) scaling over (supply voltage ~fixed)
  • Moore’s Law (cost/transistor) over?
  • No competitive replacement for CMOS anytime soon
  • Energy efficiency constrains everything
• No “free lunch” for software developers, must consider:
  • Parallel systems
    • e.g., AMD released their first dual-core processor, the Athlon 64 X2 3800+ (2.0 GHz, 512 KB L2 cache per core), on April 21, 2005.
  • Heterogeneous systems
    • e.g., ARM’s big.LITTLE
    • “LITTLE” processors (e.g., Cortex-A53) are designed for maximum power efficiency, while “big” processors (e.g., Cortex-A73) are designed to provide maximum compute performance.

Today’s Dominant Target Systems

• Mobile (smartphone/tablet)
  • >1 billion sold/year
  • Market dominated by ARM-ISA-compatible general-purpose processor in system-on-a-chip (SoC)
  • Plus sea of custom accelerators (radio, image, video, graphics, audio, motion, location, security, etc.)
• Warehouse-Scale Computers (WSCs)
  • 100,000’s cores per warehouse
  • Market dominated by x86-compatible server chips
  • Dedicated apps, plus cloud hosting of virtual machines
  • Now seeing increasing use of GPUs, FPGAs, custom hardware to accelerate workloads
• Embedded computing
  • Wired/wireless network infrastructure, printers
  • Consumer TV/Music/Games/Automotive/Camera/MP3
  • Internet of Things!

A toy example to review CA

What is not covered by CA?

• Programming, and language design
• Compiling
• Data structure and Algorithm
• Process scheduling
• File operations

What is covered by CA?

• Instructions and micro-codes
• Instruction execution: pipeline, in-order or out-of-order, speculation, etc.
• Memory hierarchy: cache, main memory, disk, etc.
• Exception, interrupt, etc.
• I/O
• Single-threaded or multi-threaded

Conventional Machine Structure

[Figure: conventional machine structure, from software down to hardware; a bracket labeled "CA" marks part of the stack]

Software
• Application (ex: browser)
• Compiler, Assembler, Operating System (Mac OSX)

Instruction Set Architecture

Hardware
• Processor, Memory, I/O system
• Datapath & Control
• Digital Design
• Circuit Design
• transistors

Today: ISA, Pipeline, etc.
Next: memory hierarchy, I/O, threading, WSC, etc.

Instruction Set Architecture


ISA: instruction set architecture

[Figure: the instruction set sits between software (above) and hardware (below)]

ISA is the actual programmer-visible instruction set, a critical interface/boundary/contract between software and hardware.

An instruction is a single operation of a processor, with an opcode and zero or more operands, defined by the processor's instruction set.

An example, add of RISC-V

add rd, rs1, rs2
• add: opcode
• rd, rs1, rs2: registers as operands
• [rd] ← [rs1] + [rs2]
• No memory access
• 32-bit long

Encoding (Reg-Reg OP):

Field | funct7 | rs2 | rs1 | funct3 | rd | opcode
Bits | 31-25 | 24-20 | 19-15 | 14-12 | 11-7 | 6-0
Width | 7 | 5 | 5 | 3 | 5 | 7
Value for add | 0000000 | rs2 | rs1 | 000 | rd | 0110011
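The field layout above can be turned into a small encoder. The following is a minimal sketch, assuming the R-type layout shown on this slide; the helper name `encode_add` is made up for this example and is not part of any RISC-V toolchain.

```python
def encode_add(rd: int, rs1: int, rs2: int) -> int:
    """Assemble the 32-bit word for `add rd, rs1, rs2` from its six fields."""
    for reg in (rd, rs1, rs2):
        if not 0 <= reg < 32:  # each register field is 5 bits wide
            raise ValueError("register index must be in 0..31")
    funct7 = 0b0000000  # bits 31-25
    funct3 = 0b000      # bits 14-12
    opcode = 0b0110011  # bits 6-0 (Reg-Reg OP)
    return (funct7 << 25) | (rs2 << 20) | (rs1 << 15) | (funct3 << 12) | (rd << 7) | opcode


if __name__ == "__main__":
    # x5, x6, x7 are the registers that the RISC-V ABI calls t0, t1, t2.
    word = encode_add(rd=5, rs1=6, rs2=7)
    print(f"{word:032b}")
    print(f"0x{word:08x}")
```

Printing the word in binary makes it easy to compare each group of bits against the field table.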

7 Dimensions of ISA

• D1: Class of ISA
  • General-purpose register architectures
    • Operands are either registers or memory locations
    • 80x86 has 16 general-purpose registers and 16 registers for floating-point data
    • RISC-V has 32 general-purpose registers and 32 registers for floating-point data
  • Two versions
    • Register-memory ISA, e.g., 80x86, in which many instructions can access memory
    • Load-store ISA, e.g., ARMv8 and RISC-V, in which only load or store can access memory
• D2: Memory addressing
  • Byte-level addressing to access memory operands
  • ARMv8 demands address alignment in load/store. An access to an object of size s bytes at byte address A is aligned if A mod s = 0. 80x86 and RISC-V do not require such alignment.
• D3: Addressing modes
  • How memory addresses are interpreted and specified
  • Basic: Register, Immediate (constant), Displacement (constant + register)
  • RISC-V supports these three modes
  • Other ISAs have more variations of displacement
    • e.g., 80x86 has three variations of displacement
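The alignment rule (address modulo size equals zero) is easy to check in code. This is a minimal sketch; the function name `is_aligned` and the sample addresses are illustrative assumptions.

```python
def is_aligned(address: int, size: int) -> bool:
    """True if an access of `size` bytes at byte `address` is aligned (A mod s == 0)."""
    if size <= 0:
        raise ValueError("size must be positive")
    return address % size == 0


if __name__ == "__main__":
    # (address, size in bytes) pairs chosen only for illustration.
    for address, size in [(0x1000, 4), (0x1002, 4), (0x1002, 2), (0x1008, 8)]:
        print(hex(address), size, is_aligned(address, size))
```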

7 Dimensions of ISA (cont.)

• D4: Types and sizes of operands
  • 8-bit (character), 16-bit (half word), 32-bit (word), and 64-bit (double word)
  • IEEE 754 floating point in 32-bit (single precision) and 64-bit (double precision)
• D5: Operations
  • Data transfer, arithmetic, control, floating point
  • Floating point operations are complicated
    • Versus fixed point data
    • Binary (0/1) representations of real numbers are inexact, so precision matters!
    • https://docs.oracle.com/cd/E19957-01/806-3568/ncg_goldberg.html
    • https://software.intel.com/sites/default/files/article/164389/fp-consistency-102511.pdf
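A short experiment shows why binary floating point is inexact. The sketch assumes that Python's `float` is an IEEE 754 double, which holds on common platforms; the helper `to_single` is a name made up here for rounding a value to single precision.

```python
import math
import struct
from decimal import Decimal


def to_single(x: float) -> float:
    """Round a Python float (a double) to IEEE 754 single precision and back."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


if __name__ == "__main__":
    print(0.1 + 0.2 == 0.3)              # exact comparison of doubles
    print(math.isclose(0.1 + 0.2, 0.3))  # comparison with a tolerance
    print(Decimal(0.1))                  # exact value stored for 0.1 as a double
    print(Decimal(to_single(0.1)))       # exact value stored for 0.1 as a single
```

Values such as 0.5 that are sums of powers of two are stored exactly, while 0.1 is not, in either precision.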

[Figure: example instructions grouped into data transfer, arithmetic, and control]

7 Dimensions of ISA (cont.)

• D6: Control flow instructions
  • Conditional branch
  • Unconditional jump
  • Procedure call
  • Return
• D7: Encoding an ISA
  • Fixed length or variable length
  • ARMv8 and RISC-V instructions are 32-bit long
  • 80x86 encoding is variable length, ranging from 1 to 18 bytes

Implementing the add instruction of RISC-V

add rd, rs1, rs2

• Instruction makes two changes to machine’s state:
  • Reg[rd] = Reg[rs1] + Reg[rs2]
  • PC = PC + 4 (to get the next instruction)

Encoding (Reg-Reg OP): funct7 = 0000000 (bits 31-25) | rs2 (bits 24-20) | rs1 (bits 19-15) | funct3 = 000 (bits 14-12) | rd (bits 11-7) | opcode = 0110011 (bits 6-0)

Datapath for add

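[Figure: datapath for add. The PC addresses the instruction memory (IMEM), and an adder computes PC + 4. Inst[19:15] (rs1) and Inst[24:20] (rs2) drive the read ports AddrA and AddrB of the register file Reg[ ]. The outputs DataA = Reg[rs1] and DataB = Reg[rs2] feed the ALU (+). The ALU result is written back through DataD to the register selected by AddrD = Inst[11:7] (rd). The control logic decodes Inst[31:0] and sets RegWriteEnable (RegWEn) = 1.]

Reg[rd] = Reg[rs1] + Reg[rs2]
PC = PC + 4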
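The two state changes can be modeled in a few lines of software. This is a behavioral sketch, not a hardware description: the function name `execute_add` is made up, and 32-bit registers with wrap-around on overflow are an assumption of this example.

```python
from typing import List


def execute_add(inst: int, regs: List[int], pc: int) -> int:
    """Apply one `add` to the register file in place and return the next PC."""
    opcode = inst & 0x7F          # Inst[6:0]
    rd = (inst >> 7) & 0x1F       # Inst[11:7]  -> AddrD
    funct3 = (inst >> 12) & 0x7   # Inst[14:12]
    rs1 = (inst >> 15) & 0x1F     # Inst[19:15] -> AddrA
    rs2 = (inst >> 20) & 0x1F     # Inst[24:20] -> AddrB
    funct7 = (inst >> 25) & 0x7F  # Inst[31:25]
    if (opcode, funct3, funct7) != (0b0110011, 0b000, 0b0000000):
        raise ValueError("this sketch only models add")
    # Assumption: 32-bit registers, so the sum wraps around on overflow.
    result = (regs[rs1] + regs[rs2]) & 0xFFFFFFFF
    if rd != 0:  # register x0 is hard-wired to zero in RISC-V
        regs[rd] = result
    return pc + 4


if __name__ == "__main__":
    regs = [0] * 32
    regs[6], regs[7] = 40, 2
    next_pc = execute_add(0x007302B3, regs, pc=0x100)  # add x5, x6, x7
    print(regs[5], hex(next_pc))
```

The bit slices in the comments mirror the wires in the datapath: the same three register indices select the read and write ports of the register file.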


From architecture to microarchitecture

• Instructions are visible to programmers
• Two processors may have the same ISA but different microarchitectures
  • e.g., AMD Opteron and Intel Core i7, with the same 80x86 ISA, have very different pipelines and cache organizations.
• The execution of an instruction is partitioned into multiple stages
• Five classic stages of executing an instruction
  • Instruction fetch
  • Instruction decode/register fetch
  • Execute
  • Memory access
  • Write back

Stages of Execution on Datapath

[Figure: five-stage datapath with PC and +4 adder, instruction memory, registers (rs1, rs2, rd), imm, ALU, and data memory. Stages: 1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Write Back]

Pipeline


Instruction execution one by one

Clock cycles 1 to 13; F = fetch, D = decode, X = execute, M = memory, W = write back.

Inst. No. | Cycles | Stages
i (add) | 1-5 | F D X M W
i+1 | 6-8 | F D X
i+2 | 9-12 | F D X M
i+3 | 13 | F
i+4 | (not started) |

Low utilization of hardware

Idle hardware

[Figure: the same five-stage datapath (1. Instruction Fetch, 2. Decode/Register Read, 3. Execute, 4. Memory, 5. Write Back), highlighting idle hardware]

Pipeline: instruction-level parallelism

Clock cycles 1 to 13; a new instruction starts in every cycle.

Inst. No. | Cycles | Stages
i | 1-5 | F D X M W
i+1 | 2-4 | F D X
i+2 | 3-6 | F D X M
i+3 | 4-8 | F D X M W
i+4 | 5-9 | F D X M W
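The gain from pipelining can be estimated by counting cycles. The sketch below assumes an ideal pipeline (one stage per cycle, one new instruction per cycle, no hazards) and takes the number of stages each instruction uses as input; the function names and the list `COUNTS` are made up for this example.

```python
STAGES = "FDXMW"


def sequential_cycles(stage_counts):
    """Cycles when each instruction finishes before the next one starts."""
    return sum(stage_counts)


def pipelined_cycles(stage_counts):
    """Cycles when instruction k (0-based) starts in cycle k + 1, with no stalls."""
    return max((start + count for start, count in enumerate(stage_counts)), default=0)


def pipeline_chart(stage_counts):
    """One text row per instruction; '.' marks a cycle before the instruction starts."""
    return ["." * start + STAGES[:count] for start, count in enumerate(stage_counts)]


if __name__ == "__main__":
    # Stages used by i, i+1, i+2, i+3, i+4 in the chart for this slide.
    COUNTS = [5, 3, 4, 5, 5]
    print("one by one:", sequential_cycles(COUNTS), "cycles")
    print("pipelined: ", pipelined_cycles(COUNTS), "cycles")
    for row in pipeline_chart(COUNTS):
        print(row)
```

With hazards, the pipelined count grows by the number of stall cycles, which this sketch deliberately ignores.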


Well, an imperfect world

• Hazards
  • A situation that prevents starting the next instruction in the next clock cycle
  • A hazard may stall the pipeline
• Structural hazard
  • A required hardware resource is busy
• Data hazard
  • An instruction depends on the result(s) of a previous instruction
• Control hazard
  • Caused by the pipelining of branches and other instructions that change the PC

Structural Hazard

A required hardware resource is busy

Instruction sequence:
add t0, t1, t2
lw t5, 0(t4)
sll t6, t0, t3
sw t0, 4(t3)
lw t0, 8(t3)

• Instruction and data memory used simultaneously
  • Use two separate memories
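To see where a single shared memory would be needed twice in the same cycle, the sketch below walks the instruction sequence from this slide through an ideal five-stage pipeline. It assumes instruction k (0-based) is fetched in cycle k + 1 and reaches the memory stage in cycle k + 4, and that only `lw` and `sw` access data memory; the names `SEQUENCE` and `memory_conflicts` are made up for this example.

```python
SEQUENCE = [
    "add t0, t1, t2",
    "lw t5, 0(t4)",
    "sll t6, t0, t3",
    "sw t0, 4(t3)",
    "lw t0, 8(t3)",
]

MEM_STAGE_OFFSET = 3  # F, D, X come first, so M is 3 cycles after fetch


def memory_conflicts(program):
    """(cycle, data access, fetched instruction) where one shared memory is needed twice."""
    conflicts = []
    for index, inst in enumerate(program):
        if inst.split()[0] not in ("lw", "sw"):
            continue  # only loads and stores use the memory stage for data
        mem_cycle = index + 1 + MEM_STAGE_OFFSET
        fetch_index = mem_cycle - 1  # instruction being fetched in that cycle
        if fetch_index < len(program):
            conflicts.append((mem_cycle, inst, program[fetch_index]))
    return conflicts


if __name__ == "__main__":
    for cycle, data_access, fetch in memory_conflicts(SEQUENCE):
        print(f"cycle {cycle}: '{data_access}' accesses data while '{fetch}' is fetched")
```

With separate instruction and data memories the two accesses go to different resources, so the conflict disappears.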


Data hazard

An instruction depends on the result(s) of a previous instruction

Instruction sequence:
add t0, t1, t2
lw t5, 0(t4)
sll t6, t0, t3
sw t0, 4(t3)
lw t0, 8(t3)

• Solution 1: stall the pipeline
• Solution 2: register file fast enough to do write then read in one clock cycle?
• Solution 3: forwarding
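The dependences in the instruction sequence can be found mechanically. The sketch below is a simplified read-after-write scan, not a full assembler: it only understands the operand shapes used on this slide, and the names `dest_and_sources` and `raw_dependencies` are made up for this example.

```python
import re

SEQUENCE = [
    "add t0, t1, t2",
    "lw t5, 0(t4)",
    "sll t6, t0, t3",
    "sw t0, 4(t3)",
    "lw t0, 8(t3)",
]


def dest_and_sources(inst):
    """Return (destination register or None, list of source registers)."""
    op, operands = inst.split(None, 1)
    regs = re.findall(r"[a-z][a-z0-9]*", operands)  # skips numeric offsets
    if op == "sw":  # a store reads both registers and writes memory instead
        return None, regs
    return regs[0], regs[1:]


def raw_dependencies(program):
    """(producer index, consumer index, register) for every read of an earlier result."""
    last_writer = {}
    deps = []
    for index, inst in enumerate(program):
        dest, sources = dest_and_sources(inst)
        for reg in sources:
            if reg in last_writer:
                deps.append((last_writer[reg], index, reg))
        if dest is not None:
            last_writer[dest] = index
    return deps


if __name__ == "__main__":
    for producer, consumer, reg in raw_dependencies(SEQUENCE):
        print(f"'{SEQUENCE[consumer]}' reads {reg} written by '{SEQUENCE[producer]}'")
```

Whether a dependence actually stalls the pipeline depends on the distance between producer and consumer and on which of the three solutions the hardware implements.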

Control Hazard

• Branches
  • We need to calculate next PC
  • Pipeline may stall due to branch
  • Jump to next instruction, with a new PC
• Exception
  • Something unusual that breaks the normal flow of instruction execution
  • e.g., “assert” may terminate the program
  • Exception handling, for synchronous and asynchronous events
    • Jump to exception handler program, and return back if possible

[Figure: exception sources along the pipeline (PC, Inst. Mem, Decode, Execute, Data Mem, Write back): PC address exception at fetch, illegal opcode at decode, overflow at execute, data address exceptions at memory access, plus asynchronous interrupts]

Conclusion

• The domain of CA
  • Many interesting topics
  • CA is a set of rules and methods that describe the functionality, organization, and implementation of computer systems (from Wikipedia)
  • CA describes the capabilities and programming model of a computer but not a particular implementation (also from Wikipedia)
• The evolution of CA
  • New challenges emerge from time to time
• Review of ISA and Pipeline

Acknowledgements

• These slides contain materials developed and copyrighted by:
  • Prof. Krste Asanovic (UC Berkeley)
  • Prof. Xuehai Zhou (USTC)
  • Prof. Onur Mutlu (ETH Zurich)
  • Prof. Mikko Lipasti (UW-Madison)
  • Prof. Sören Schwertfeger (ShanghaiTech)
  • Prof. Kenji Kise (Tokyo Tech)
  • Prof. Wenzhi Chen (ZJU)