Chapter 7 Memory Hierarchy


Transcript of Chapter 7 Memory Hierarchy

Chapter 7 Memory Hierarchy

Outline

Technology Trends

Processor Memory Latency Gap

Time of a full cache miss in instructions executed:
• 1st Alpha: 340 ns / 5.0 ns = 68 clks x 2 (136 instr.)
• 2nd Alpha: 266 ns / 3.3 ns = 80 clks x 4 (320 instr.)
• 3rd Alpha: 180 ns / 1.7 ns = 108 clks x 6 (648 instr.)
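The pattern behind these figures is: stall clocks per miss = memory latency / cycle time, and the instructions forgone are those clocks times the machine's issue width (the x 2, x 4, x 6 multipliers above). A minimal C sketch of that arithmetic, using only the numbers quoted on the slide:

    #include <stdio.h>

    /* Stall clocks per miss = memory latency / cycle time; instructions
       forgone = stall clocks x issue width. Values are the ones quoted
       above; note the slide rounds the 3rd Alpha to 108 clks (648 instr.),
       while the raw ratio 180/1.7 is about 106. */
    int main(void) {
        struct { const char *cpu; double mem_ns, cycle_ns; int issue; } gen[] = {
            { "1st Alpha", 340.0, 5.0, 2 },
            { "2nd Alpha", 266.0, 3.3, 4 },
            { "3rd Alpha", 180.0, 1.7, 6 },
        };
        for (int i = 0; i < 3; i++) {
            double clks = gen[i].mem_ns / gen[i].cycle_ns;
            printf("%s: %.0f clks x %d issue = %.0f instructions per miss\n",
                   gen[i].cpu, clks, gen[i].issue, clks * gen[i].issue);
        }
        return 0;
    }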

Solution: Memory Hierarchy

From the top of the hierarchy to the bottom:
Speed: fastest to slowest
Size: smallest to biggest
Cost: highest to lowest

Memory Hierarchy: Principle

Why Does the Hierarchy Work?

How Does It Work?

Speed (ns):   1's     10's    100's   10,000,000's (10's ms)   10,000,000,000's (10's sec)
Size (bytes): 100's   K's     M's     G's                      T's

How Is the Hierarchy Managed?

Memory Hierarchy Technology

Memory Hierarchy Technology (cont.)

Memory Hierarchy: Terminology

4 Questions for Hierarchy Design

Memory System Design

Summary of Memory Hierarchy

Outline

Basics of Cache

Hits and Misses

Hits and Misses (cont.)

Avoid Waiting for Memory in Write Through

Exploiting Spatial Locality

Block Size Tradeoff

Memory Design to Support Cache

Interleaving for Bandwidth

Cache Performance

Improving Cache Performance

Reduce Miss Ratio with Associativity

Set-Associative Cache

Possible Associativity Structures

Block Placement

Data Placement Policy

Cache Block Replacement

Comparing the Structures

A 4-Way Set-Associative Cache

Reduce Miss Penalty with Multilevel Caches

Sources of Cache Misses

Cache Design Space

Cache Summary

Outline

Virtual Memory

Virtual Memory (cont.)

Why Virtual Memory?

Basic Issues in Virtual Memory

Paging

Key Decisions in Paging

Choosing the Page Size

Page Tables

Page Fault: What Happens When You Miss?

Handling Page Faults

Handling Page Faults (cont.)

Page Replacement: 1-bit LRU

Architecture part: support dirty and used bits in the page table (how?) => the PTE may need to be updated on any instruction fetch, load, or store.
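A minimal sketch of the 1-bit scheme named in the slide title (the clock / second-chance policy), assuming a software-visible PTE array; the struct, field, and function names are illustrative, not from the slides or any particular OS:

    #include <stdbool.h>
    #include <stddef.h>
    #include <stdio.h>

    struct pte {
        bool valid;
        bool used;   /* reference bit, set by hardware on fetch/load/store */
        bool dirty;  /* set on store; a dirty victim must be written back  */
        /* ... frame number, protection bits, etc. ... */
    };

    /* Sweep the "clock hand" until a page with used == 0 is found; pages
       with used == 1 get a second chance (bit cleared, then skipped).
       Assumes at least one valid resident page, so the loop terminates. */
    size_t choose_victim(struct pte table[], size_t npages, size_t *hand) {
        for (;;) {
            struct pte *p = &table[*hand];
            size_t idx = *hand;
            *hand = (*hand + 1) % npages;
            if (p->valid && !p->used)
                return idx;          /* not referenced recently: evict it */
            p->used = false;         /* referenced: clear bit, move on    */
        }
    }

    int main(void) {
        struct pte table[4] = {
            { .valid = true, .used = true  },
            { .valid = true, .used = false },
            { .valid = true, .used = true  },
            { .valid = true, .used = true  },
        };
        size_t hand = 0;
        printf("victim frame: %zu\n", choose_victim(table, 4, &hand)); /* -> 1 */
        return 0;
    }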

Impact of Paging (I)

Hashing: Inverted Page Tables

Two-level Page Tables

Impact of Paging (II)

Making Address Translation Practical

Translation Lookaside Buffer

Translation Lookaside Buffer (cont.)

TLB of MIPS R2000

TLB in Pipeline

Processing in TLB + Cache

Possible Combinations of Events

Virtual Address and Cache

Virtually Addressed Cache

An Alternative: Overlapped TLB and Cache Access

IF cache hit AND (cache tag = PA) THEN
    deliver data to CPU
ELSE IF [cache miss OR (cache tag ≠ PA)] AND TLB hit THEN
    access memory with the PA from the TLB
ELSE
    do standard VA translation
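The overlap only works if the cache can be indexed before translation finishes, i.e. if the index bits fall entirely within the page offset, which virtual and physical addresses share. A small C sketch of that constraint; the 4 KB page, 32-byte block, and direct-mapped sizes are illustrative assumptions, not taken from the slides:

    #include <stdint.h>
    #include <stdbool.h>
    #include <stdio.h>

    #define PAGE_OFFSET_BITS  12                                /* 4 KB pages    */
    #define BLOCK_OFFSET_BITS 5                                 /* 32-byte blocks */
    #define INDEX_BITS (PAGE_OFFSET_BITS - BLOCK_OFFSET_BITS)   /* 7 -> 128 sets  */

    /* Index bits lie inside the page offset, so VA and PA agree on them
       and the cache set can be read while the TLB translates the VPN. */
    static uint32_t cache_index(uint32_t va) {
        return (va >> BLOCK_OFFSET_BITS) & ((1u << INDEX_BITS) - 1);
    }

    /* The overlap requires block offset + index to fit in the page offset;
       a larger cache needs more associativity or page coloring instead. */
    static bool index_is_translation_free(void) {
        return BLOCK_OFFSET_BITS + INDEX_BITS <= PAGE_OFFSET_BITS;
    }

    int main(void) {
        uint32_t va = 0x00403a64;   /* example virtual address */
        printf("set index %u (translation-free: %d)\n",
               cache_index(va), index_is_translation_free());
        return 0;
    }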

Problem with Overlapped Access

Protection with Virtual Memory

A Common Framework for Memory Hierarchies

Modern Systems

Challenge in Memory Hierarchy

Summary

Summary (cont.)