MIPS ISA-I: The Instruction Set Architecture Lecture notes from MKP, H. H. Lee and S. Yalamanchili.

79
MIPS ISA-I: The Instruction Set Architecture Lecture notes from MKP, H. H. Lee and S. Yalamanchili

Transcript of MIPS ISA-I: The Instruction Set Architecture Lecture notes from MKP, H. H. Lee and S. Yalamanchili.

MIPS ISA-I: The Instruction Set Architecture

Lecture notes from MKP, H. H. Lee and S. Yalamanchili

(2)

Module Outline

Review ISA and understand instruction encodings

• Arithmetic and Logical Instructions

• Review memory organization

• Memory (data movement) instructions

• Control flow instructions

• Procedure/Function calls

• Program assembly, linking, & encoding

(3)

Reading

• Chapter 2 2.1, Figure 2.1, 2.2 – 2.7 2.9, Figure 2.15 2.10, 2.16, 2.17

• Appendix A9, A10

• Goals: Understand how programs are encoded Impact of ISA on program encodings Why are ISAs different?

(4)

Economics of an ISA

Thermal Design Power 130W3.6 GHz

Thermal Design Power 4W1.6 GHz

Software/binary portability

(5)

Below Your Program

• Application software Written in high-level language

• System software Compiler: translates HLL code to

machine code Operating System: service code

o Handling input/outputo Managing memory and storageo Scheduling tasks & sharing

resources• Hardware

Processor, memory, I/O controllers

Abstractions help deal with complexity

(6)

Instruction Set Architecture

• A very important abstraction interface between hardware and low-level software

standardizes instructions, machine language bit patterns, etc.

advantage: different implementations of the same architecture

disadvantage: sometimes prevents using new innovations

• Modern instruction set architectures: 80x86 (aka iA32), PowerPC (e.g. G4, G5) Xscale, ARM, MIPS Intel/HP EPIC (iA64), AMD64, Intel’s EM64T, SPARC,

HP PA-RISC, DEC/Compaq/HP Alpha

(7)

Instructions

• Language of the Machine• More primitive than higher level languages

e.g., no sophisticated control flow• Very restrictive

e.g., MIPS Arithmetic Instructions

• We’ll be working with the MIPS instruction set architecture Representative of Reduced Instruction Set Computer

(RISC) Similar to other architectures developed since the 1980's Used by NEC, Nintendo, Silicon Graphics, Sony

Design goals: Maximize performance and Minimize cost, Reduce design time

(8)

Instruction Set Architecture (ISA)

0x00000000

0x00000001

0x00000002

0x00000003

byte addressed memory

0xFFFFFFFF

Arithmetic Logic Unit (ALU)

0x000x010x02

0x03

0x1FData flow for computation

Data flow for transfersRegister File

Address Space

compiler

ISA

HW

(9)

Intel IA-32 Register View

Courtesy Intel IA-32 Software Developers Manual

• Many features are a byproduct of backward compatibility issues

• Distinctive relative to the MIPS ISA

• More later……

(10)

ARM ISA View

Courtesy ARM University Program Presentation

More later…

(11)

Module Outline

Review ISA and understand instruction encodings

• Arithmetic and Logical Instructions

• Review memory organization

• Memory (data movement) instructions

• Control flow instructions

• Procedure/Function calls

• Program assembly, linking, & encoding

(12)

MIPS Programmer Visible Registers

Register Names Usage by Software Convention

$0 $zero Hardwired to zero

$1 $at Reserved by assembler

$2 - $3 $v0 - $v1 Function return result registers

$4 - $7 $a0 - $a3 Function passing argument value registers

$8 - $15 $t0 - $t7 Temporary registers, caller saved

$16 - $23 $s0 - $s7 Saved registers, callee saved

$24 - $25 $t8 - $t9 Temporary registers, caller saved

$26 - $27 $k0 - $k1 Reserved for OS kernel

$28 $gp Global pointer

$29 $sp Stack pointer

$30 $fp Frame pointer

$31 $ra Return address (pushed by call instruction)

$hi $hi High result register (remainder/div, high word/mult)

$lo $lo Low result register (quotient/div, low word/mult)

(13)

MIPS Register View

• Arithmetic instruction operands must be registers

• Compiler associates variables with registers

• Other registers that are not visible to the programmer Program counter Status register ……

(14)

MIPS arithmetic• Design Principle 1: simplicity favors regularity. • Of course this complicates some things...

C code: A = B + C + D;E = F - A;

MIPS code: add $t0, $s1, $s2add $s0, $t0, $s3sub $s4, $s5, $s0

andi $3, $4, $5 ……..

• Operands must be registers, only 32 registers provided• All memory accesses accomplished via loads and stores

A common feature of RISC processors

• Example

compiler

ISA

HW

RA & scheduling

Note the need for intermediate registers

(15)

Logical Operations

• Instructions for bitwise manipulation

Operation C Java MIPS

Shift left << << sll

Shift right >> >>> srl

Bitwise AND & & and, andi

Bitwise OR | | or, ori

Bitwise NOT ~ ~ nor

Useful for extracting and inserting groups of bits in a word

Example

(16)

AND Operations

• Useful to mask bits in a word Select some bits, clear others to 0

and $t0, $t1, $t2

0000 0000 0000 0000 0000 1101 1100 0000

0000 0000 0000 0000 0011 1100 0000 0000

$t2

$t1

0000 0000 0000 0000 0000 1100 0000 0000$t0

(17)

OR Operations

• Useful to include bits in a word Set some bits to 1, leave others unchanged

or $t0, $t1, $t2

0000 0000 0000 0000 0000 1101 1100 0000

0000 0000 0000 0000 0011 1100 0000 0000

$t2

$t1

0000 0000 0000 0000 0011 1101 1100 0000$t0

(18)

NOT Operations

• Useful to invert bits in a word Change 0 to 1, and 1 to 0

• MIPS has NOR 3-operand instruction a NOR b == NOT ( a OR b )

nor $t0, $t1, $zero

0000 0000 0000 0000 0011 1100 0000 0000$t1

1111 1111 1111 1111 1100 0011 1111 1111$t0

Register 0: always read as zero

(19)

Encoding: Instruction Format

• Instructions, like registers and words of data, are also 32 bits long Example: add $t0, $s1, $s2 registers have numbers, $t0=9, $s1=17, $s2=18

op rs rt rd shamt funcR-Format

Opcode (operation)

source operand 1

source operand 2

destination operand

shift amount

function field

Opcodes on page B-50Encodings – Section B10

(20)

MIPS Encoding: R-Type

31

opcode rs rt rd

26 25 21 20 1615 11 10 6 5 0

shamt funct

0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 00 031

opcode rs rt rd

26 25 21 20 1615 11 10 6 5 0

shamt funct

add $4, $3, $2rt

rs

rd

0 0 0 0 0 0 0 1 1 0 0 0 1 0 0 0 1 0 0 0 0 0 0 0 1 0 0 0 0 00 0

Encoding = 0x00622020

(21)

MIPS Encoding: R-Type

31

opcode rs rt rd

26 25 21 20 1615 11 10 6 5 0

shamt funct

sll $3, $5, 7shamt

rt

rd

0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 1 0 0 1 1 1 0 0 0 0 0 00 031

opcode rs rt rd

26 25 21 20 1615 11 10 6 5 0

shamt funct

0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 1 1 0 0 1 1 1 0 0 0 0 0 00 0

Encoding = 0x000519C0

SPIM Example

(22)

Module Outline

Review ISA and understand instruction encodings

• Arithmetic and Logical Instructions

• Review memory organization

• Memory (data movement) instructions

• Control flow instructions

• Procedure/Function calls

• Program assembly, linking, & encoding

(23)

Memory Organization

• Viewed as a large, single-dimension array, with an address.

• A memory address is an index into the array• "Byte addressing" means that the index points

to a byte of memory.

0

1

2

3

4

5

6

...

8 bits of data

8 bits of data

8 bits of data

8 bits of data

8 bits of data

8 bits of data

8 bits of data

(24)

Memory Organization

• Bytes are nice, but most data items use larger "words“• MIPS provides lw/lh/lb and sw/sh/sb instructions• For MIPS, a word is 32 bits or 4 bytes.

• 232 bytes with byte addresses from 0 to 232-1• 230 words with byte addresses 0, 4, 8, ... 232-4• Words are aligned

i.e., what are the least 2 significant bits of a word address?

0

4

8

12

...

32 bits of data

32 bits of data

32 bits of data

32 bits of data

Registers hold 32 bits of data

(25)

Data Directives

• For placement of data in memory

Stack

Heap

Data

Text

reserved0x00000000

0x00400000

0x10010000

0x7FFFFFFF

Memory Map

.data

.word 0x1234

.byte 0x08

.asciiz “Hello World”

.ascii “Hello World”

.align 2

.space 64

See page B-47Example:

(26)

Endianness [defined by Danny Cohen 1981]

• Byte ordering How is a multiple byte data word stored in memory

• Endianness (from Gulliver’s Travels) Big Endian

o Most significant byte of a multi-byte word is stored at the lowest memory address

o e.g. Sun Sparc, PowerPC Little Endian

o Least significant byte of a multi-byte word is stored at the lowest memory address

o e.g. Intel x86

• Some embedded & DSP processors would support both for interoperability

(27)

Example of Endian Store 0x87654321 at address 0x0000, byte-addressable

0x87

0x65

0x43

0x21

LowerMemoryAddress

HigherMemoryAddress

0x0000

0x0001

0x0002

0x0003

BIG ENDIAN

0x21

0x43

0x65

0x87

LowerMemoryAddress

HigherMemoryAddress

0x0000

0x0001

0x0002

0x0003

LITTLE ENDIAN

(28)

Instruction Set Architecture (ISA)

0x00000000

0x00000001

0x00000002

0x00000003

byte addressed memory

0xFFFFFFFF

Arithmetic Logic Unit (ALU)

0x000x010x02

0x03

0x1FData flow for computation

Data flow for transfersRegister File

Address Space

compiler

ISA

HW

words

(29)

Module Outline

Review ISA and understand instruction encodings

• Arithmetic and Logical Instructions

• Review memory organization

• Memory (data movement) instructions

• Control flow instructions

• Procedure/Function calls

• Program assembly, linking, & encoding

(30)

Memory Instructions

• Load & store instructions: Orthogonal ISA• Example:

C code: long A[100];A[9] = h + A[8];

MIPS code: lw $t0, 32($s3) #load word add $t0, $s2, $t0 sw $t0, 36($s3)

• Remember arithmetic operands are registers, not memory!

32 bits of data

32 bits of data

32 bits of data

32 bits of data

A[0]

A[1]

A[2]

4 bytes

compiler

ISA

HW

data layout

base registerindex

(31)

Registers vs. Memory

• Registers are faster to access than memory

• Operating on memory data requires loads and stores More instructions to be executed

• Compiler must use registers for variables as much as possible Only spill to memory for less frequently used

variables Register usage optimization is important!

• Design Principle 2: Smaller is faster c.f. main memory: millions of locations

• Rationale for the Memory Hierarchy

(32)

MIPS Registers

Register Names Usage by Software Convention

$0 $zero Hardwired to zero

$1 $at Reserved by assembler

$2 - $3 $v0 - $v1 Function return result registers

$4 - $7 $a0 - $a3 Function passing argument value registers

$8 - $15 $t0 - $t7 Temporary registers, caller saved

$16 - $23 $s0 - $s7 Saved registers, callee saved

$24 - $25 $t8 - $t9 Temporary registers, caller saved

$26 - $27 $k0 - $k1 Reserved for OS kernel

$28 $gp Global pointer

$29 $sp Stack pointer

$30 $fp Frame pointer

$31 $ra Return address (pushed by call instruction)

$hi $hi High result register (remainder/div, high word/mult)

$lo $lo Low result register (quotient/div, low word/mult)

(33)

• Consider the load-word and store-word instructions, What would the regularity principle have us do?

• Design Principle 3: Good design demands a compromise

• Introduce a new type of instruction format I-type for data transfer instructions other format was R-type for register

• Example: lw $t0, 32($s2)

35 18 8 32

op rs rt 16 bit number

Encoding Memory Instructions

(34)

MIPS Encoding: I-Type

31

opcode rs rt Immediate Value

26 25 21 20 1615 0

lw $5, 3000($2)Immediate

rs

rt

0 0 1 1 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 0 1 1 1 0 1 1 1 0 0 01 0

Encoding = 0x8C450BB8

0 0 1 1 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 0 1 1 1 0 1 1 1 0 0 01 031

opcode rs rt

26 25 21 20 1615 0

Immediate Value

(35)

MIPS Encoding: I-Type

31

opcode rs rt Immediate Value

26 25 21 20 1615 0

sw $5, 3000($2)Immediate

rs

rt

1 0 1 1 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 0 1 1 1 0 1 1 1 0 0 01 0

Encoding = 0xAC450BB8

1 0 1 1 0 0 0 1 0 0 0 1 0 1 0 0 0 0 1 0 1 1 1 0 1 1 1 0 0 01 031

opcode rs rt

26 25 21 20 1615 0

Immediate Value

SPIM Example

(36)

• Small constants are used quite frequently (50% of operands) e.g., A = A + 5;

B = B + 1;C = C - 18;

• Solutions? put 'typical constants' in memory and load them. create hard-wired registers (like $zero) for constants like one. Use immediate values

• MIPS Instructions:

addi $29, $29, 4slti $8, $18, 10andi $29, $29, 6ori $29, $29, 4

Constants

compiler

ISA

HWHW support

(37)

Immediate Operands

• No subtract immediate instruction Just use a negative constant

addi $s2, $s1, -1

• Hardwired values useful for common operations E.g., move between registers

add $t2, $s1, $zero

• Design Principle 4: Make the common case fast Small constants are common Immediate operand avoids a load instruction

(38)

• We'd like to be able to load a 32 bit constant into a register• Must use two instructions, new "load upper immediate"

instruction

lui $t0, 1010101010101010

• Then must get the lower order bits right, i.e.,

ori $t0, $t0, 1010101010101010

1010101010101010 0000000000000000

0000000000000000 1010101010101010

1010101010101010 1010101010101010

ori

1010101010101010 0000000000000000

filled with zeros

How about larger constants?

compiler

ISA

HW

Instruction synthesis

Now consider la $t0, L1(a pseudo instruction)

(39)

2s-Complement Signed Integers

• Bit 31 is sign bit 1 for negative numbers 0 for non-negative numbers

• –(–2n – 1) can’t be represented

• Non-negative numbers have the same unsigned and 2s-complement representation

• Some specific numbers 0: 0000 0000 … 0000 –1: 1111 1111 … 1111 Most-negative: 1000 0000 … 0000 Most-positive: 0111 1111 … 1111

(40)

Sign Extension

• Representing a number using more bits Preserve the numeric value

• In MIPS instruction set addi: extend immediate value lb, lh: extend loaded byte/halfword beq, bne: extend the displacement

• Replicate the sign bit to the left c.f. unsigned values: extend with 0s

• Examples: 8-bit to 16-bit +2: 0000 0010 => 0000 0000 0000 0010 –2: 1111 1110 => 1111 1111 1111 1110

(41)

Encoding: Constants & Immediates

• Use the I-format

• Compromise: Use instruction sequences to construct larger

constants Avoid another adding another format impact on the

hardware?

Example

(42)

Module Outline

Review ISA and understand instruction encodings

• Arithmetic and Logical Instructions

• Review memory organization

• Memory (data movement) instructions

• Control flow instructions

• Procedure/Function calls

• Program assembly, linking, & encoding

(43)

• Decision making instructions alter the control flow, i.e., change the "next" instruction to be executed

• MIPS conditional branch instructions:

bne $t0, $t1, Label beq $t0, $t1, Label

• Example: if (i==j) h = i + j;

bne $s0, $s1, Labeladd $s3, $s0, $s1

Label: ....

Control

(44)

• MIPS unconditional branch instructions:j label

• Example:

if (i!=j) beq $s4, $s5, Lab1 h=i+j; add $s3, $s4, $s5else j Lab2 h=i-j; Lab1: sub $s3, $s4, $s5

Lab2: ...

• Can you build a simple for loop?

Control

compiler

ISA

HWHW support

Assembler calculates address

(45)

Compiling Loop Statements

• C code:

while (save[i] == k) i += 1; i in $s3, k in $s5, address of save in $s6

• Compiled MIPS code:

Loop: sll $t1, $s3, 2 add $t1, $t1, $s6 lw $t0, 0($t1) bne $t0, $s5, Exit addi $s3, $s3, 1 j LoopExit: …

(46)

• We have: beq, bne, what about Branch-if-less-than?• New instruction:

if $s1 < $s2 then $t0 = 1

slt $t0, $s1, $s2 else $t0 = 0

• Can use this instruction to build "blt $s1, $s2, Label" — can now build general control structures

• For ease of assembly programmers, the assembler allows “blt” as a “pseudo-instruction”

— assembler substitutes them with valid MIPS instructions

— there are policy of use conventions for registers

blt $4 $5 loop slt $1 $4 $5bne $1 $0 loop

Control Flow

compiler

ISA

HW

pseudo-instructions

(47)

Signed vs. Unsigned

• Signed comparison: slt, slti• Unsigned comparison: sltu, sltui• Example

$s0 = 1111 1111 1111 1111 1111 1111 1111 1111

$s1 = 0000 0000 0000 0000 0000 0000 0000 0001

slt $t0, $s0, $s1 # signed

o –1 < +1 $t0 = 1

sltu $t0, $s0, $s1 # unsigned

o +4,294,967,295 > +1 $t0 = 0

(48)

• Instructions:bne $t4,$t5,Label Next instruction is at Label if $t4 $t5beq $t4,$t5,Label Next instruction is at Label if $t4 = $t5j Label Next instruction is at Label

• Formats:

Use Instruction Address Register (PC = program counter) Most branches are local (principle of locality)

• Jump instructions just use high order bits of PC address boundaries of 256 MB

op rs rt 16 bit address

op 26 bit address

I

J

Encoding: Branches & Jumps

Opcodes on page B-50Encodings – Section B10

(49)

BEQ/BNE uses I-Type

31

opcode rs rt Signed Offset Value(encoded in words, e.g. 4-bytes)

26 25 21 20 1615 0

beq $0, $9, 40Offset

Encoded by 40/4 =

10rt

rs

0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 00 0

Encoding = 0x1009000A

0 1 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 00 031

opcode rs rt

26 25 21 20 1615 0

Immediate Value

(50)

MIPS Encoding: J-Type

31

opcode Target Address

26 0

jal 0x00400030Target

0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 00 0

Encoding = 0x0C10000C

0 0 1 1 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 00 031

opcode

26 25 0

Target Address

25

0000 0000 0100 0000 0000 0000 0011 0000XInstruction=4 bytesTarget Address

• jal will jump and pushreturn address in $ra ($31)

SPIM Example

(51)

JR

• JR (Jump Register) Unconditional jump

jr $2 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 00 031

opcode rs 0

26 25 21 20 1615 11 10 6 5 0

0 funct0

(52)

Target Addressing Example

• Loop code from earlier example Assume Loop at location 80000

Loop: sll $t1, $s3, 2 80000 0 0 19 9 4 0

add $t1, $t1, $s6 80004 0 9 22 9 0 32

lw $t0, 0($t1) 80008 35 9 8 0

bne $t0, $s5, Exit 80012 5 8 21 2

addi $s3, $s3, 1 80016 8 19 19 1

j Loop 80020 2 20000

Exit: … 80024

(53)

Branching Far Away

• If branch target is too far to encode with 16-bit offset, assembler rewrites the code

• Examplebeq $s0,$s1, L1

↓bne $s0,$s1, L2j L1

L2: …

(54)

Basic Blocks

• A basic block is a sequence of instructions with No embedded branches (except at end) No branch targets (except at beginning)

A compiler identifies basic blocks for optimization

An advanced processor can accelerate execution of basic blocks

(55)

Byte Halfword Word

Registers

Memory

Memory

Word

Memory

Word

Register

Register

1. Immediate addressing

2. Register addressing

3. Base addressing

4. PC-relative addressing

5. Pseudodirect addressing

op rs rt

op rs rt

op rs rt

op

op

rs rt

Address

Address

Address

rd . . . funct

Immediate

PC

PC

+

+

Addressing Modes

Operand is constant

Operand is in register

lb $t0, 48($s0)

bne $4, $5, Label(label will be assembled into a distance)

j Label

Concatenation w/ PC[31..28]

compiler

ISA

HWHW support

What does this imply about targets?

(56)

To SummarizeMIPS operands

Name Example Comments$s0-$s7, $t0-$t9, $zero, Fast locations for data. In MIPS, data must be in registers to perform

32 registers $a0-$a3, $v0-$v1, $gp, arithmetic. MIPS register $zero always equals 0. Register $at is $fp, $sp, $ra, $at reserved for the assembler to handle large constants.

Memory[0], Accessed only by data transfer instructions. MIPS uses byte addresses, so

230

memory Memory[4], ..., sequential words differ by 4. Memory holds data structures, such as arrays,

words Memory[4294967292] and spilled registers, such as those saved on procedure calls.

MIPS assembly language

Category Instruction Example Meaning Commentsadd add $s1, $s2, $s3 $s1 = $s2 + $s3 Three operands; data in registers

Arithmetic subtract sub $s1, $s2, $s3 $s1 = $s2 - $s3 Three operands; data in registers

add immediate addi $s1, $s2, 100 $s1 = $s2 + 100 Used to add constants

load word lw $s1, 100($s2) $s1 = Memory[$s2 + 100] Word from memory to register

store word sw $s1, 100($s2) Memory[$s2 + 100] = $s1 Word from register to memory

Data transfer load byte lb $s1, 100($s2) $s1 = Memory[$s2 + 100] Byte from memory to register

store byte sb $s1, 100($s2) Memory[$s2 + 100] = $s1 Byte from register to memory

load upper immediate lui $s1, 100 $s1 = 100 * 216 Loads constant in upper 16 bits

branch on equal beq $s1, $s2, 25 if ($s1 == $s2) go to PC + 4 + 100

Equal test; PC-relative branch

Conditional

branch on not equal bne $s1, $s2, 25 if ($s1 != $s2) go to PC + 4 + 100

Not equal test; PC-relative

branch set on less than slt $s1, $s2, $s3 if ($s2 < $s3) $s1 = 1; else $s1 = 0

Compare less than; for beq, bne

set less than immediate

slti $s1, $s2, 100 if ($s2 < 100) $s1 = 1; else $s1 = 0

Compare less than constant

jump j 2500 go to 10000 Jump to target address

Uncondi- jump register jr $ra go to $ra For switch, procedure return

tional jump jump and link jal 2500 $ra = PC + 4; go to 10000 For procedure call

(57)

• Simple instructions all 32 bits wide• Very structured• Only three instruction formats

• Rely on compiler to achieve performance— what are the compiler's goals?

• Help compiler where we can

op rs rt rd shamt funct

op rs rt 16 bit address

op 26 bit address

R

I

J

Summary To Date: MIPS ISA

compiler

ISA

HW

Opcodes on page B-50Encodings – Section B10

Full Example

(58)

Stored Program Computers

• Instructions represented in binary, just like data

• Instructions and data stored in memory

• Programs can operate on programs

e.g., compilers, linkers, …

• Binary compatibility allows compiled programs to work on different computers

Standardized ISAs

The BIG Picture

(59)

• Fetch & Execute Cycle sequential flow of control Instructions are fetched and put into a special

register Bits in the register "control" the subsequent actions Fetch the “next” instruction and continue Program Counter Program Counter + 4 (byte

addressing!) Von Neumann execution model

Example:

Stored Program Concept

compiler

ISA

HW

HW support

(60)

Instruction Set Architecture (ISA)byte addressed memory

0xFFFFFFFF

Arithmetic Logic Unit (ALU)

0x000x010x02

0x03

0x1FProcessor Internal Buses

Memory InterfaceRegister File (Programmer Visible State)

stack

Data segment(static)

Text Segment

Dynamic Data

Reserved

Program Counter

Programmer Invisible State

Kernelregisters Who sees what?

Memory MapInstruction register

(61)

ARM & MIPS Similarities

• ARM: the most popular embedded core• Similar basic set of instructions to MIPS

ARM MIPS

Date announced 1985 1985

Instruction size 32 bits 32 bits

Address space 32-bit flat 32-bit flat

Data alignment Aligned Aligned

Data addressing modes 9 3

Registers 15 × 32-bit 31 × 32-bit

Input/output Memory mapped

Memory mapped

(62)

Compare and Branch in ARM

• Uses condition codes for result of an arithmetic/logical instruction Negative, zero, carry, overflow Compare instructions to set

condition codes without keeping the result

• Each instruction can be conditional Top 4 bits of instruction word:

condition value Can avoid branches over single

instructions

ALU

$0

$1

$31

CPU/Core

Z V C N

(63)

Instruction Encoding

Differences?

(64)

The Intel x86 ISA• Evolution with backward compatibility

8080 (1974): 8-bit microprocessoro Accumulator, plus 3 index-register pairs

8086 (1978): 16-bit extension to 8080o Complex instruction set (CISC)

8087 (1980): floating-point coprocessoro Adds FP instructions and register stack

80286 (1982): 24-bit addresses, MMUo Segmented memory mapping and protection

80386 (1985): 32-bit extension (now IA-32)o Additional addressing modes and operationso Paged memory mapping as well as segments

(65)

The Intel x86 ISA

• Further evolution… i486 (1989): pipelined, on-chip caches and FPU Pentium (1993): superscalar, 64-bit datapath

o Later versions added MMX (Multi-Media eXtension) instructions

o The infamous FDIV bug Pentium Pro (1995), Pentium II (1997)

o New microarchitecture (see Colwell, The Pentium Chronicles)

Pentium III (1999)o Added SSE (Streaming SIMD Extensions) and

associated registers Pentium 4 (2001)

o New microarchitectureo Added SSE2 instructions

(66)

The Intel x86 ISA

• And further… AMD64 (2003): extended architecture to 64 bits EM64T – Extended Memory 64 Technology (2004)

o AMD64 adopted by Intel (with refinements)o Added SSE3 instructions

Intel Core (2006)o Added SSE4 instructions, virtual machine support

AMD64 (announced 2007): SSE5 instructions Intel Advanced Vector Extension (AVX announced

2008)• If Intel didn’t extend with compatibility, its

competitors would! Technical elegance ≠ market success

• Commonly thought of as a Complex Instruction Set Architecture (CISC)

(67)

Basic x86 Registers

(68)

Basic x86 Addressing Modes

• Two operands per instruction

Source/dest operand Second source operand

Register Register

Register Immediate

Register Memory

Memory Register

Memory Immediate

Memory addressing modes Address in register Address = Rbase + displacement Address = Rbase + 2scale × Rindex (scale = 0, 1, 2, or 3) Address = Rbase + 2scale × Rindex + displacement

(69)

x86 Instruction Encoding

• Variable length encoding Postfix bytes specify

addressing mode Prefix bytes modify

operationo Operand length,

repetition, locking, …

(70)

Implementing IA-32

• Complex instruction set makes implementation difficult Hardware translates instructions to simpler

microoperationso Simple instructions: 1–1o Complex instructions: 1–many

Microengine similar to RISC Market share makes this economically viable

• Comparable performance to RISC Compilers avoid complex instructions

• Better code density

(71)

Fallacies

• Powerful instruction higher performance Fewer instructions required But complex instructions are hard to implement

o May slow down all instructions, including simple ones

Compilers are good at making fast code from simple instructions

• Use assembly code for high performance But modern compilers are better at dealing with

modern processors More lines of code more errors and less productivity

(72)

Fallacies

• Backward compatibility instruction set does not change But they do accrete more instructions

x86 instruction set

(73)

• Instruction set design Tradeoffs between compiler complexity and hardware

complexity Orthogonal (RISC) ISAs vs. complex ISAs (more on

this later in the class)

• Design Principles: simplicity favors regularity smaller is faster good design demands compromise make the common case fast

• Instruction set architecture a very important abstraction indeed!

Summary

(74)

Study Guide

• What is i) an orthogonal instruction set, ii) load/store architecture, and iii) instruction set architecture?

• Translate small high level language (e.g., C, Matlab) code blocks into MIPS assembly Allocate variables to registers Layout data in memory Sequence data into/out of registers as necessary Write assembly instructions/program

• Write and execute the proceeding for A few simple if-then-else cases (say from C) for loops and while loops

(75)

Study Guide (cont.)• Utilize data directives to layout data in memory

Check anticipated layout in SPIM Layout a 2D matrix and a 3D matrix Layout a linked list

• Manually assemble instructions and check with SPIM

• Given a program, encode branch and jump instructions Use SPIM to verify your answers – remember SPIM

branches are relative to the PC

• Use SPIM to assemble some small programs Manually disassemble the code

(76)

Study Guide (cont.)

• Synthesize complex inequality tests with the slt instruction e.g., bgt, ble, bge

• Some simple learning exercises – write SPIM programs for Reversing the order of bytes in a register Reversing the order of bytes in a memory location Compute the exclusive-OR of the contents of two

registers Create a linked list to store an array of four numbers,

one number per element Traverse the preceding linked list to compute the sum

of the numbers Fetch a word starting an non-word boundary

(77)

Study Guide (cont.)

• Take a block of simple MIPS code and Translate it to equivalent ARM code Translate it to equivalent x86 code

• Name two advantages of a CISC ISA over a RISC ISA

• Name two disadvantages of a CISC ISA over a RISC ISA

(78)

Glossary

• Basic block• Big endian• Binary compatibility• Byte aligned memory

access• CISC• Data directives• Destination operand• Frame pointer• General purpose

registers• Global pointer

• I-format• Immediate operand• Instruction encoding• Instruction format• Instruction set

architecture• J-format• Little Endian• Machine code (or

language)• Memory map

(79)

Glossary (cont.)

• Native instructions• Orthogonal ISA• PC-relative

addressing• Pseudo instructions• R-format• RISC• Sign extension• Source operand

• Stack pointer• System software vs.

application software• Unsigned vs. signed

instructions• Word aligned memory

access• Von Neumann

execution model