COMP3221 lec34-Cache-II.1 Saeid Nooshabadi COMP 3221 Microprocessors and Embedded Systems Lectures 34: Cache Memory - II http://www.cse.unsw.edu.au/~cs3221 October, 2003 Saeid Nooshabadi [email protected] Some of the slides are adopted from David Patterson (UCB)

COMP3221 lec34-Cache-II.1 Saeid Nooshabadi

COMP 3221

Microprocessors and Embedded Systems

Lectures 34: Cache Memory - II

http://www.cse.unsw.edu.au/~cs3221

October, 2003

Saeid Nooshabadi

[email protected]

Some of the slides are adapted from David Patterson (UCB)

Outline

°Direct-Mapped Cache

°Types of Cache Misses

°A (long) detailed example

°Peer-to-peer education example

°Block Size Tradeoff

°Types of Cache Misses

Review: Memory Hierarchy

[Figure: Processor (Control, Datapath) connected to a chain of Memory levels]

Level:  Registers | Cache L1 (SRAM) | Cache L2 (SRAM) | RAM/ROM (DRAM, EEPROM) | Hard Disk
Speed:  Fastest  ...  Slowest
Size:   Smallest ...  Biggest
Cost:   Highest  ...  Lowest

Review: Direct-Mapped Cache (#1/2)

° In a direct-mapped cache, each memory address is associated with one possible block within the cache

• Therefore, we only need to look in a single location in the cache for the data if it exists in the cache

• Block is the unit of transfer between cache and memory

Review: Direct-Mapped Cache 1 Word Block

° Block size = 4 bytes

° Cache Location 0 can be occupied by data from:
• Memory locations 0 - 3, 8 - B, ...
• In general: any 4 consecutive memory locations starting at address 8*n (n=0,1,2,..)

° Cache Location 1 can be occupied by data from:
• Memory locations 4 - 7, C - F, ...
• In general: any 4 consecutive memory locations starting at address 8*n + 4 (n=0,1,2,..)

[Figure: Memory with byte addresses 0 to F, mapped onto an 8 Byte Direct Mapped Cache with Cache Index 0 and 1]
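The mapping on this slide can be checked with a short sketch. The name `cache_index` and the two constants are illustrative and not part of the lecture; the sketch assumes the 8-byte cache with 4-byte blocks described above.

```python
BLOCK_SIZE = 4   # bytes per block (from the slide)
NUM_BLOCKS = 2   # 8-byte cache / 4-byte blocks

def cache_index(address: int) -> int:
    """Return the cache location that a byte address maps to."""
    block_number = address // BLOCK_SIZE   # which memory block holds the byte
    return block_number % NUM_BLOCKS       # memory blocks wrap around the cache

# Addresses 0-3 and 8-B land in location 0, addresses 4-7 and C-F in location 1.
for addr in range(0x10):
    print(f"address {addr:X} -> cache location {cache_index(addr)}")
```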

Direct-Mapped with 1 Word Blocks Example

[Figure: direct-mapped cache organisation. Labels: address (Address tag, block index, Byte offset), tag RAM, data RAM, compare, mux, hit, data]

Review: Direct-Mapped Cache Terminology

°All fields are read as unsigned integers.

° Index: specifies the cache index (which “row” or “line” of the cache we should look in)

°Offset: once we’ve found correct block, specifies which byte within the block we want

°Tag: the remaining bits after offset and index are determined; these are used to distinguish between all the memory addresses that map to the same location
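As a minimal sketch of these three fields, the function below splits an address with shifts and masks. The name `split_address` and its parameters are assumptions made for illustration; the field widths must be supplied by the caller.

```python
def split_address(address: int, index_bits: int, offset_bits: int):
    """Split an address into (tag, index, offset), all read as unsigned integers."""
    offset = address & ((1 << offset_bits) - 1)                 # lowest bits: byte within block
    index = (address >> offset_bits) & ((1 << index_bits) - 1)  # next bits: cache row
    tag = address >> (offset_bits + index_bits)                 # everything that remains
    return tag, index, offset

# 8-byte cache with 4-byte blocks: 1 index bit, 2 offset bits.
print(split_address(0xD, index_bits=1, offset_bits=2))
```

Addresses that share an index but differ in tag compete for the same cache row, which is why the tag has to be stored and compared.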

Reading Material

° Steve Furber: ARM System-on-Chip Architecture, 2nd Ed., Addison-Wesley, 2000, ISBN: 0-201-67519-6. Chapter 10.

Accessing Data in a Direct Mapped Cache (#1/3)

°Ex.: 16KB of data, direct-mapped, 4 word blocks

°Read 4 addresses 0x00000014, 0x0000001C, 0x00000034, 0x00008014

°Only cache/memory level of hierarchy

Memory

Address (hex)   Value of Word
...             ...
00000010        a
00000014        b
00000018        c
0000001C        d
...             ...
00000030        e
00000034        f
00000038        g
0000003C        h
...             ...
00008010        i
00008014        j
00008018        k
0000801C        l
...             ...

Accessing Data in a Direct Mapped Cache (#2/3)

° 4 Addresses:
• 0x00000014, 0x0000001C, 0x00000034, 0x00008014

° 4 Addresses divided (for convenience) into Tag, Index, Byte Offset fields

Tag                 Index       Offset
000000000000000000  0000000001  0100
000000000000000000  0000000001  1100
000000000000000000  0000000011  0100
000000000000000010  0000000001  0100

tttttttttttttttttt  iiiiiiiiii  oooo

• Tag (18 bits): to check if we have the correct block
• Index (10 bits): to select the block
• Offset (4 bits): byte offset within the block
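The field widths follow from the cache geometry. The sketch below derives them for the 16KB, 4 word block example, assuming 4-byte words and 32-bit addresses (both implied by the slides but stated here as assumptions); the helper name `fields_as_bits` is illustrative.

```python
CACHE_BYTES = 16 * 1024   # 16KB of data
BLOCK_BYTES = 4 * 4       # 4 word blocks, assuming 4-byte words
ADDRESS_BITS = 32         # assuming 32-bit addresses

# For power-of-two sizes, (n - 1).bit_length() equals log2(n).
offset_bits = (BLOCK_BYTES - 1).bit_length()                  # bytes per block -> 4 bits
index_bits = (CACHE_BYTES // BLOCK_BYTES - 1).bit_length()    # 1024 blocks -> 10 bits
tag_bits = ADDRESS_BITS - index_bits - offset_bits            # remaining 18 bits

def fields_as_bits(address: int) -> str:
    """Format an address as 'tag index offset' bit strings."""
    bits = f"{address:0{ADDRESS_BITS}b}"
    tag_end = tag_bits
    index_end = tag_bits + index_bits
    return " ".join([bits[:tag_end], bits[tag_end:index_end], bits[index_end:]])

for addr in (0x00000014, 0x0000001C, 0x00000034, 0x00008014):
    print(f"0x{addr:08X}  {fields_as_bits(addr)}")
```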

Accessing Data in a Direct Mapped Cache (#3/3)

° So let's go through accessing some data in this cache
• 16KB data, direct-mapped, 4 word blocks

° Will see 3 types of events:

° cache miss: nothing in cache in appropriate block, so fetch from memory

° cache hit: cache block is valid and contains proper address, so read desired word

° cache miss, block replacement: wrong data is in cache at appropriate block, so discard it and fetch desired data from memory
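The three events depend only on the valid bit and the tag comparison for the selected row. A minimal sketch, with the hypothetical function name `classify_access`:

```python
def classify_access(valid: bool, stored_tag: int, address_tag: int) -> str:
    """Classify an access to one cache row as one of the three events."""
    if not valid:
        return "miss"                    # nothing in the row yet: fetch from memory
    if stored_tag == address_tag:
        return "hit"                     # valid and tags agree: read the desired word
    return "miss with replacement"       # valid but wrong block: discard it and fetch

print(classify_access(valid=False, stored_tag=0, address_tag=0))
print(classify_access(valid=True, stored_tag=0, address_tag=0))
print(classify_access(valid=True, stored_tag=0, address_tag=2))
```

The stored tag is ignored while the valid bit is 0, because whatever bits the tag holds at power-on are meaningless.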

Example Block

16 KB Direct Mapped Cache, 16B blocks

° Valid bit: determines whether anything is stored in that row (when computer initially turned on, all entries are invalid)

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      0
2      0
3      0
4      0
5      0
6      0
7      0
...
1022   0
1023   0

Read 0x00000014 = 0…00 0..001 0100

Tag field           Index field  Offset
000000000000000000  0000000001   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      0
2      0
3      0
4      0
5      0
6      0
7      0
...
1022   0
1023   0

So we read block 1 (0000000001)

Tag field           Index field  Offset
000000000000000000  0000000001   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      0
2      0
3      0
4      0
5      0
6      0
7      0
...
1022   0
1023   0

No valid data

Tag field           Index field  Offset
000000000000000000  0000000001   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      0
2      0
3      0
4      0
5      0
6      0
7      0
...
1022   0
1023   0

So load that data into cache, setting tag, valid

Tag field           Index field  Offset
000000000000000000  0000000001   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      0    a      b      c      d
2      0
3      0
4      0
5      0
6      0
7      0
...
1022   0
1023   0

Read from cache at offset, return word b

Tag field           Index field  Offset
000000000000000000  0000000001   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      0    a      b      c      d
2      0
3      0
4      0
5      0
6      0
7      0
...
1022   0
1023   0

Read 0x0000001C = 0…00 0..001 1100

Tag field           Index field  Offset
000000000000000000  0000000001   1100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      0    a      b      c      d
2      0
3      0
4      0
5      0
6      0
7      0
...
1022   0
1023   0

Data valid, tag OK, so read offset, return word d

Tag field           Index field  Offset
000000000000000000  0000000001   1100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      0    a      b      c      d
2      0
3      0
4      0
5      0
6      0
7      0
...
1022   0
1023   0

Read 0x00000034 = 0…00 0..011 0100

Tag field           Index field  Offset
000000000000000000  0000000011   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      0    a      b      c      d
2      0
3      0
4      0
5      0
6      0
7      0
...
1022   0
1023   0

So read block 3

Tag field           Index field  Offset
000000000000000000  0000000011   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      0    a      b      c      d
2      0
3      0
4      0
5      0
6      0
7      0
...
1022   0
1023   0

No valid data

Tag field           Index field  Offset
000000000000000000  0000000011   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      0    a      b      c      d
2      0
3      0
4      0
5      0
6      0
7      0
...
1022   0
1023   0

Load that cache block, return word f

Tag field           Index field  Offset
000000000000000000  0000000011   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      0    a      b      c      d
2      0
3      1      0    e      f      g      h
4      0
5      0
6      0
7      0
...
1022   0
1023   0

Read 0x00008014 = 0…10 0..001 0100

Tag field           Index field  Offset
000000000000000010  0000000001   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      0    a      b      c      d
2      0
3      1      0    e      f      g      h
4      0
5      0
6      0
7      0
...
1022   0
1023   0

So read Cache Block 1, Data is Valid

Tag field           Index field  Offset
000000000000000010  0000000001   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      0    a      b      c      d
2      0
3      1      0    e      f      g      h
4      0
5      0
6      0
7      0
...
1022   0
1023   0

Cache Block 1 Tag does not match (0 != 2)

Tag field           Index field  Offset
000000000000000010  0000000001   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      0    a      b      c      d
2      0
3      1      0    e      f      g      h
4      0
5      0
6      0
7      0
...
1022   0
1023   0

Miss, so replace block 1 with new data & tag

Tag field           Index field  Offset
000000000000000010  0000000001   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      2    i      j      k      l
2      0
3      1      0    e      f      g      h
4      0
5      0
6      0
7      0
...
1022   0
1023   0

And return word j

Tag field           Index field  Offset
000000000000000010  0000000001   0100

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      2    i      j      k      l
2      0
3      1      0    e      f      g      h
4      0
5      0
6      0
7      0
...
1022   0
1023   0
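The four reads walked through above can be replayed with a small simulation. This is a sketch under stated assumptions, not the lecture's own code: the cache is modelled as a Python dict (a missing key stands for Valid = 0), the names `MEMORY`, `cache` and `read` are illustrative, and only the twelve memory words listed on the slides are present.

```python
OFFSET_BITS, INDEX_BITS = 4, 10   # 16B blocks, 1024 rows (16KB of data)

# Memory contents from the slides: word address -> value
MEMORY = {0x10: "a", 0x14: "b", 0x18: "c", 0x1C: "d",
          0x30: "e", 0x34: "f", 0x38: "g", 0x3C: "h",
          0x8010: "i", 0x8014: "j", 0x8018: "k", 0x801C: "l"}

cache = {}   # index -> (tag, [4 words]); a missing key means the row is invalid

def read(address):
    """Read one word through the cache; return (event, word)."""
    offset = address & ((1 << OFFSET_BITS) - 1)
    index = (address >> OFFSET_BITS) & ((1 << INDEX_BITS) - 1)
    tag = address >> (OFFSET_BITS + INDEX_BITS)
    row = cache.get(index)
    if row is not None and row[0] == tag:
        event = "hit"
    else:
        event = "miss" if row is None else "miss with replacement"
        base = address - offset   # address of the first byte of the block
        cache[index] = (tag, [MEMORY[base + 4 * w] for w in range(4)])
    return event, cache[index][1][offset // 4]

trace = [read(a) for a in (0x00000014, 0x0000001C, 0x00000034, 0x00008014)]
for event, word in trace:
    print(event, word)
```

After the four reads, row 1 holds tag 2 with words i, j, k, l and row 3 holds tag 0 with words e, f, g, h, matching the final table above.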

Do an example yourself. What happens?

° Choose from: Cache: Hit, Miss, Miss w. replace
Values returned: a, b, c, d, e, ..., k, l

° Read address 0x00000030 ? 000000000000000000 0000000011 0000

° Read address 0x0000001c ? 000000000000000000 0000000001 1100

Cache

Index  Valid  Tag  0x0-3  0x4-7  0x8-b  0xc-f
0      0
1      1      2    i      j      k      l
2      0
3      1      0    e      f      g      h
4      0
5      0
6      0
7      0
...

Answers

° 0x00000030 a hit
Index = 3, Tag matches, Offset = 0, value = e

° 0x0000001c a miss with replacement
Index = 1, Tag mismatch, so replace from memory, Offset = 0xc, value = d

° The Values read from Cache must equal memory values whether or not cached:
• 0x00000030 = e
• 0x0000001c = d

Memory

Address (hex)   Value of Word
...             ...
00000010        a
00000014        b
00000018        c
0000001c        d
...             ...
00000030        e
00000034        f
00000038        g
0000003c        h
...             ...
00008010        i
00008014        j
00008018        k
0000801c        l
...             ...
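These two answers can be checked by starting from the cache state given in the exercise. The sketch below hard-codes that state and the field widths of the 16KB example; `cache`, `memory_blocks` and `lookup` are illustrative names, not part of the lecture.

```python
# Cache state at the start of the exercise: index -> (tag, words)
cache = {1: (2, ["i", "j", "k", "l"]), 3: (0, ["e", "f", "g", "h"])}

# Memory blocks that may be fetched: block base address -> words
memory_blocks = {0x10: ["a", "b", "c", "d"],
                 0x30: ["e", "f", "g", "h"],
                 0x8010: ["i", "j", "k", "l"]}

def lookup(address):
    """Return (event, word) for one read; 18-bit tag, 10-bit index, 4-bit offset."""
    tag, index, offset = address >> 14, (address >> 4) & 0x3FF, address & 0xF
    row = cache.get(index)
    if row and row[0] == tag:
        event = "hit"
    elif row:
        event = "miss with replacement"
    else:
        event = "miss"
    if event != "hit":
        cache[index] = (tag, memory_blocks[address & ~0xF])   # fetch the whole block
    return event, cache[index][1][offset // 4]

answers = {hex(a): lookup(a) for a in (0x00000030, 0x0000001C)}
print(answers)
```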

Block Size Tradeoff (#1/3)

° Benefits of Larger Block Size

• Spatial Locality: if we access a given word, we’re likely to access other nearby words soon (Another Big Idea)

• Very applicable with Stored-Program Concept: if we execute a given instruction, it’s likely that we’ll execute the next few as well

• Works nicely in sequential array accesses too
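The effect on sequential array accesses can be illustrated by counting misses for different block sizes. This is a simplified sketch: it assumes 4-byte words, an array that starts on a block boundary, a cold cache, and a cache large enough that no block is evicted before the scan has moved past it. The function name is hypothetical.

```python
WORD_BYTES = 4   # assuming 4-byte words

def misses_for_sequential_scan(num_words: int, block_bytes: int) -> int:
    """Count misses when reading num_words consecutive words once, in order."""
    last_block = None
    misses = 0
    for i in range(num_words):
        block = (i * WORD_BYTES) // block_bytes   # memory block holding word i
        if block != last_block:                   # first touch of a new block: miss
            misses += 1
            last_block = block
    return misses

for block_bytes in (4, 8, 16, 32):
    print(block_bytes, "byte blocks:", misses_for_sequential_scan(64, block_bytes), "misses")
```

Under these assumptions, each doubling of the block size halves the number of misses for a purely sequential scan, because one fetch brings in more of the words that are about to be used.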

Block Size Tradeoff (#2/3)

° Drawbacks of Larger Block Size

• Larger block size means larger miss penalty
- on a miss, it takes longer to load a new block from the next level

• If block size is too big relative to cache size, then there are too few blocks
- Result: miss rate goes up

° In general, minimize Average Access Time
= Hit Time × Hit Rate + Miss Penalty × Miss Rate

Block Size Tradeoff (#3/3)

°Hit Time = time to find and retrieve data from current level cache

°Miss Penalty = average time to retrieve data on a current level miss (includes the possibility of misses on successive levels of memory hierarchy)

°Hit Rate = % of requests that are found in current level cache

°Miss Rate = 1 - Hit Rate
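With these definitions, the Average Access Time formula from the previous slide can be evaluated directly. The numbers below (1 cycle hit time, 50 cycle miss penalty, 95% hit rate) are made-up values for illustration only, and the function name is an assumption.

```python
def average_access_time(hit_time: float, miss_penalty: float, hit_rate: float) -> float:
    """Average Access Time = Hit Time x Hit Rate + Miss Penalty x Miss Rate."""
    miss_rate = 1 - hit_rate
    return hit_time * hit_rate + miss_penalty * miss_rate

# Hypothetical numbers, in clock cycles.
print(average_access_time(hit_time=1, miss_penalty=50, hit_rate=0.95))   # roughly 3.45 cycles
```

Even a 5% miss rate contributes most of the average here, which is why both the miss rate and the miss penalty matter when choosing a block size.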

Extreme Example: One Big Block

° Cache Size = 4 bytes, Block Size = 4 bytes
• Only ONE entry in the cache!

° If an item is accessed, it is likely to be accessed again soon
• But it is unlikely to be accessed again immediately!

° The next access is likely to be a miss again
• Continually loading data into the cache but discarding it (forcing it out) before it is used again
• Nightmare for cache designer: Ping Pong Effect

[Figure: a single cache entry with a Valid Bit, a Tag, and Cache Data bytes B 3, B 2, B 1, B 0]
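The Ping Pong Effect can be reproduced with a one-entry cache model. This is a minimal sketch: `count_misses` is an illustrative name, and the access trace alternating between two addresses is a hypothetical worst case, not data from the lecture.

```python
BLOCK_BYTES = 4   # the whole cache is a single 4-byte block

def count_misses(addresses) -> int:
    """Count misses for a cache that holds exactly one block."""
    cached_block = None   # None models Valid Bit = 0
    misses = 0
    for address in addresses:
        block = address // BLOCK_BYTES
        if block != cached_block:   # invalid entry or wrong block: load and replace
            misses += 1
            cached_block = block
    return misses

ping_pong = [0x00, 0x10] * 4   # hypothetical trace alternating between two blocks
print(count_misses(ping_pong), "misses out of", len(ping_pong), "accesses")
```

Each access evicts the block that the following access needs, so every access in this trace misses even though only two blocks are ever used.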

Block Size Tradeoff Conclusions

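[Figure: three plots, each against Block Size]
• Miss Penalty vs. Block Size
• Miss Rate vs. Block Size (annotations: "Exploits Spatial Locality"; "Fewer blocks: compromises temporal locality")
• Average Access Time vs. Block Size (annotation: "Increased Miss Penalty & Miss Rate")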

Things to Remember

° Cache Access involves 3 types of events:

°cache miss: nothing in cache in appropriate block, so fetch from memory

°cache hit: cache block is valid and contains proper address, so read desired word

°cache miss, block replacement: wrong data is in cache at appropriate block, so discard it and fetch desired data from memory