Virtual Memory
Instructor: Nick Riasanovsky, University of California
Transcript of CS61C Su18 Lecture 23
Review of Last Lecture
• The role of the Operating System
  – Booting a computer: BIOS, bootloader, OS boot, initialization
• Base and bounds for multiple processes
  – Simple, but doesn't give us everything we want
• Virtual memory bridges memory and disk
  – Provides the illusion of independent address spaces to processes and protects them from each other
7/31/2018 CS61C Su18 - Lecture 23 2
Agenda
• Virtual Memory and Page Tables
• Administrivia
• Translation Lookaside Buffer (TLB)
• VM Performance
• VM Wrap-up
Memory Fragmentation
As programs start and end, memory becomes fragmented. Therefore, if we ever need a large chunk of memory, programs will need to be moved.
What if we want to run process 6, and we need 32K of space?!
[Diagram: three snapshots of physical memory (OS space plus programs 1-5) showing free fragments of 8K-24K appearing as programs 4 & 5 start and programs 2 & 5 end; no single free region reaches the 32K needed.]
BIG IDEA: USE RAM AS A CACHE ($) FOR DISK
Question: What kind of cache should memory be?
(A) Direct Mapped, write back/write allocate
(B) Direct Mapped, write through/no write allocate
(C) Fully Associative, write back/write allocate
(D) Fully Associative, write through/no write allocate
• A program's view of memory can be broken up into pages by splitting processor-issued addresses into:
  Page Number : Offset
• A page table will enable translation from virtual to physical page
  – Allows storage of a program's pages non-contiguously!
[Diagram: the address space of Program 1 (pages 0-3) mapped through the Page Table of Program 1 into scattered frames of Physical Memory; the page number field acts like a TAG, and the offset selects a byte within the page.]
Memory as a Cache
• Allows storage of a program's pages non-contiguously!
[Diagram: page numbers 0-3 of Program 1 stored in Physical Memory as (Tag = Page Number, Data) pairs, like a fully associative cache.]
Issue: Associative Caches are SLOW
• Must check every entry to see if the tag matches!
• RAM is SLOW!
Solution: Page Table
• A program's view of memory can be broken up into pages by splitting processor-issued addresses into:
  page number (TAG) : offset
• A page table will enable translation from virtual to physical page
[Diagram: virtual page numbers 0-3 of Program 1 index into the Page Table of Program 1, which holds the corresponding block numbers (physical page numbers); those select frames in Physical Memory.]
Virtual and Physical Page Numbers
• VPN: like a tag; corresponds to the size of VIRTUAL memory
• PPN: like a block number; corresponds to the size of PHYSICAL memory
Virtual Address = Virtual Page Number : Offset
Physical Address = Physical Page Number : Offset
VPN bits = log2(Virtual Memory Size / Page Size) = Virtual Address Bits - Offset Bits
PPN bits = log2(Physical Memory Size / Page Size) = Physical Address Bits - Offset Bits
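The bit-width formulas above can be sketched in a few lines of Python. The sizes below (4 GiB virtual, 64 MiB physical, 4 KiB pages) are hypothetical example values, not taken from the slides:

```python
# Sketch: computing VPN/PPN widths from the formulas above.
from math import log2

virtual_size = 2**32    # hypothetical 4 GiB virtual address space
physical_size = 2**26   # hypothetical 64 MiB physical memory
page_size = 2**12       # hypothetical 4 KiB pages

offset_bits = int(log2(page_size))                # bits to address within a page
vpn_bits = int(log2(virtual_size // page_size))   # = virtual address bits - offset bits
ppn_bits = int(log2(physical_size // page_size))  # = physical address bits - offset bits

print(offset_bits, vpn_bits, ppn_bits)  # 12 20 14
```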
Page Table Entry Format
• Contains either the PPN or an indication that the page is not in main memory
• Valid = valid page table entry
  – 1 → virtual page is in physical memory
  – 0 → OS needs to fetch page from disk
• Access Rights checked on every access to see if allowed (provides protection)
  – Read
  – Write
  – Executable: can fetch instructions from page
Page Table Layout
Each page table entry holds V (valid), AR (access rights), and PPN bits.
Given a Virtual Address = VPN : offset,
1) Index into the page table using the VPN
2) Check the Valid and Access Rights bits
3) Combine the PPN and the offset into the Physical Address
4) Use the PA to access memory
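The four steps above can be sketched with a tiny hypothetical page table (a list of (valid, access rights, PPN) entries) and a hypothetical 12-bit offset; none of these values come from the slides:

```python
PAGE_OFFSET_BITS = 12  # hypothetical 4 KiB pages

page_table = [
    (1, "rw-", 0x64),  # VPN 0 -> PPN 0x64
    (0, "---", None),  # VPN 1 not in physical memory
    (1, "r-x", 0x10),  # VPN 2 -> PPN 0x10
]

def translate(virtual_address):
    vpn = virtual_address >> PAGE_OFFSET_BITS             # 1) index with the VPN
    offset = virtual_address & ((1 << PAGE_OFFSET_BITS) - 1)
    valid, rights, ppn = page_table[vpn]
    if not valid:                                         # 2) check the valid bit
        raise LookupError("page fault: OS must fetch page from disk")
    return (ppn << PAGE_OFFSET_BITS) | offset             # 3) combine PPN and offset

pa = translate(0x2ABC)   # VPN 2, offset 0xABC
print(hex(pa))           # 4) use the PA to access memory -> 0x10abc
```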
Private Address Space per User
• Each program has a page table
• Page table contains an entry for each program page
[Diagram: Prog 1, Prog 2, and Prog 3 each access the same virtual address VA1, but each program's own Page Table maps it to a different frame of Physical Memory, alongside the OS pages and free frames.]
Protection Between Processes
• With a bare system, addresses issued with loads/stores are real physical addresses
• This means any program can issue any address, and therefore can access any part of memory, even areas it doesn't own
  – Example: the OS data structures
• We should send all addresses through a mechanism that the OS controls before they make it out to DRAM - a translation mechanism
Question: How big is the page table?
• 64 MiB RAM
• 32-bit virtual address space
• 1 KiB pages
(A) Less than 1 page
(B) Less than 100 pages
(C) Less than 1000 pages
(D) More than 1000 pages

Solution:
Offset bits = log2(1024) = 10
Virtual page bits = 32 - 10 = 22 (2^22 entries in the page table!)
Physical page bits = log2(2^26) - 10 = 26 - 10 = 16 (≈2 B per entry)
Total bytes = 2^22 × 2 = 2^23
Number of pages = 2^23 / 2^10 = 2^13 pages → (D)
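The worked example above (64 MiB RAM, 32-bit virtual addresses, 1 KiB pages) can be checked directly:

```python
# Verifying the page-table size calculation above.
from math import log2

offset_bits = int(log2(1024))              # 10 bits of offset
vpn_bits = 32 - offset_bits                # 22 -> 2**22 page table entries
ppn_bits = int(log2(2**26)) - offset_bits  # 16 bits, roughly 2 bytes per entry
total_bytes = 2**vpn_bits * 2              # 2**23 bytes for the whole table
pages_needed = total_bytes // 2**offset_bits
print(pages_needed)  # 8192 pages, i.e. "more than 1000 pages"
```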
Where Should Page Tables Reside?
• Space required by the page tables (PT) is proportional to the address space, number of users, ...
  ⇒ Too large to keep in registers or caches
• Idea: keep PTs in main memory
• But how can we find the page table in memory if the page table is how we learn physical addresses?
  – PT Base Register: stores the physical address of the current page table
Page Tables in Physical Memory
[Diagram: the virtual address spaces of Prog 1 and Prog 2 both contain VA1; PT Prog1 and PT Prog2 reside in Physical Memory (located via the PT Base Register) and map each program's VA1 to its own frame.]
1) Access page table for address translation
2) Access correct physical address
Requires two accesses of physical memory!
Linear Page Table
[Diagram: the VPN of the virtual address, added to the PT Base Register, indexes a linear Page Table whose entries hold PPNs or DPNs; the selected PPN combined with the offset locates the data word in the Data Pages.]
Page Table Entry (PTE) contains:
• 1 bit to indicate if the page exists
• And either PPN or DPN:
  – PPN (physical page number) for a memory-resident page
  – DPN (disk page number) for a page on the disk
• Status bits for protection and usage (read, write, exec)
OS sets the Page Table Base Register whenever the active user process changes
Hierarchical Page Table
[Diagram: the virtual address is split into p1 (10-bit L1 index, bits 31-22), p2 (10-bit L2 index, bits 21-12), and offset (bits 11-0). The Root of the Current Page Table (a processor register) selects the Level 1 Page Table; p1 selects a Level 2 Page Table; p2 selects the data page in Physical Memory. Pages may be in primary or secondary memory, and a PTE may mark a nonexistent page.]
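The two-level walk above (10-bit p1, 10-bit p2, 12-bit offset) can be sketched with hypothetical dicts standing in for the tables; the mappings are invented for illustration:

```python
OFFSET_BITS, L2_BITS = 12, 10

l2_table = {3: 0x55}        # hypothetical: p2 -> PPN
l1_table = {1: l2_table}    # hypothetical: p1 -> level-2 table

def walk(virtual_address, root):
    offset = virtual_address & ((1 << OFFSET_BITS) - 1)
    p2 = (virtual_address >> OFFSET_BITS) & ((1 << L2_BITS) - 1)
    p1 = virtual_address >> (OFFSET_BITS + L2_BITS)
    l2 = root.get(p1)
    if l2 is None or p2 not in l2:
        raise LookupError("page fault")  # PTE of a nonexistent page
    return (l2[p2] << OFFSET_BITS) | offset

va = (1 << 22) | (3 << 12) | 0x7F       # p1=1, p2=3, offset=0x7F
print(hex(walk(va, l1_table)))          # 0x5507f
```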
Hierarchical Page Table Walk: SPARC v8
[Diagram: the Virtual Address (bits 31-0) is split into Index 1 (31-24), Index 2 (23-18), Index 3 (17-12), and Offset (11-0). The Context Table Register and Context Register select a root pointer in the Context Table; PTPs (page table pointers) in the L1 and L2 tables lead to a PTE in the L3 table, yielding Physical Address = PPN : Offset.]
MMU does this table walk in hardware on a TLB miss
Two-Level Page Tables in Physical Memory
[Diagram: the virtual address spaces of User 1 and User 2 each contain VA1. The PT Base Register points to the current Level 1 PT; Level 1 PT User 1, Level 1 PT User 2, and Level 2 PT User 2 all live in Physical Memory, mapping User1/VA1 and User2/VA1 to different frames.]
Agenda
• Virtual Memory and Page Tables
• Administrivia
• Translation Lookaside Buffer (TLB)
• VM Performance
• VM Wrap-up
Administrivia
• Proj4 due on Friday (8/03)
  – Hold off on submissions for now
• HW7 released
• Guerrilla Session on Wed. @ Cory 540AB, 4-6p
• Regrade requests are open for MT2 until Friday
• The final will be 8/09 7-10PM @ VLSB 2040/2060!
Agenda
• Virtual Memory
• Page Tables
• Administrivia
• Translation Lookaside Buffer (TLB)
• VM Performance
• VM Wrap-up
Virtual Memory Problem
• 2 physical memory accesses per data access = SLOW!
• Since there is locality in the pages of data accessed, there must also be locality in the translations of those pages
• Build a separate cache for the page table
  – For historical reasons, this cache is called a Translation Lookaside Buffer (TLB)
  – Notice that what is stored in the TLB is NOT data, but the VPN → PPN mapping translations
TLBs vs. Caches
• TLBs are usually small, typically 32-128 entries
• TLB access time is comparable to the cache (much faster than accessing main memory)
• TLBs are usually fully/highly associative
[Diagram: the D$/I$ takes a memory address and returns the data at that address, going to the next cache level / main memory on a miss; the TLB takes a VPN and returns a PPN, accessing the page table in main memory on a miss.]
Where Are TLBs Located?
• Which should we check first: cache or TLB?
  – Can the cache hold the requested data if the corresponding page is not in physical memory? No - check the page table first
  – With the TLB first, does the cache receive a VA or a PA? PA
[Diagram: the CPU sends a VPN to the TLB; on a TLB miss the Page Table supplies the PPN. The resulting PA goes to the Cache, which returns data on a hit and accesses Main Memory on a miss. Now the TLB does the translation, not the Page Table!]
Address Translation Using TLB
[Diagram: the Virtual Address splits into VPN (= TLB Tag : TLB Index) and Page Offset. The TLB Tag is used just like in a cache, and on a hit the TLB supplies the PPN. The Physical Address = PPN : Page Offset is then split a second way into the data cache's Tag : Index : Offset to access the cache's (Tag, Block Data) entries. The PA is split two different ways!]
Note: the TIO breakdowns for VA and PA are unrelated.
Typical TLB Entry Format
Valid | Dirty | Ref | Access Rights | TLB Tag | PPN
• Valid: whether that TLB ENTRY is valid (unrelated to the PT valid bit)
• Access Rights: data from the PT
• Dirty: we basically always use write-back, so this indicates whether or not to write the page to disk when replaced
• Ref: used to implement LRU
  – Set when page is accessed, cleared periodically by the OS
• TLB Index: VPN mod (# TLB sets)
• TLB Tag: VPN minus TLB Index (upper bits)
• PPN: data from the PT
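The TLB Index and TLB Tag definitions above can be sketched directly. The sizes here (a 2-way set-associative TLB with 512 entries) and the example VPN are hypothetical:

```python
ENTRIES, WAYS = 512, 2
SETS = ENTRIES // WAYS          # 256 sets -> 8 index bits
INDEX_BITS = SETS.bit_length() - 1

def tlb_split(vpn):
    index = vpn % SETS          # TLB Index = VPN mod (# TLB sets)
    tag = vpn >> INDEX_BITS     # TLB Tag = upper bits of the VPN
    return tag, index

tag, index = tlb_split(0x3ABCD)
print(hex(tag), hex(index))     # 0x3ab 0xcd
```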
Question: How many bits wide are the following?
• 16 KiB pages
• 40-bit virtual addresses
• 64 GiB physical memory
• 2-way set associative TLB with 512 entries

    TLB Tag | TLB Index | TLB Entry
(A)   12    |    14     |    38
(B)   18    |     8     |    45
(C)   14    |    12     |    40
(D)   17    |     9     |    43
(TLB entry: Valid, Dirty, Ref, Access Rights, TLB Tag, PPN)

Solution:
First solve for the T:I:O of the TLB:
O = log2(2^14) = 14
I = log2(512 / 2) = 8
T = 40 - 14 - 8 = 18
Now solve for the size of a TLB entry:
Total bits = 5 + 18 + PPN bits
PPN bits = log2(2^36 / 2^14) = 22
Total bits = 5 + 18 + 22 = 45 → (B)
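The TLB sizing solution above can be checked in a few lines (the 5 status bits are valid, dirty, ref, and 2 access-rights bits, as in the worked total):

```python
# Verifying the TLB sizing example: 16 KiB pages, 40-bit VAs,
# 64 GiB physical memory, 2-way set-associative TLB with 512 entries.
from math import log2

offset_bits = int(log2(16 * 1024))         # 14
index_bits = int(log2(512 // 2))           # 8 (256 sets)
tag_bits = 40 - offset_bits - index_bits   # 18
ppn_bits = int(log2(2**36)) - offset_bits  # 36 - 14 = 22
entry_bits = 5 + tag_bits + ppn_bits       # 5 status bits + tag + PPN
print(tag_bits, index_bits, entry_bits)    # 18 8 45
```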
Fetching Data on a Memory Read
1) Check TLB (input: VPN, output: PPN)
  – TLB Hit: fetch translation, return PPN
  – TLB Miss: check page table (in memory)
    • Page Table Hit: load page table entry into TLB
    • Page Table Miss (Page Fault): fetch page from disk to memory, update corresponding page table entry, then load entry into TLB
2) Check cache (input: PPN, output: data)
  – Cache Hit: return data value to processor
  – Cache Miss: fetch data value from memory, store it in cache, return it to processor
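The two-step read above can be sketched with toy dicts standing in for the TLB, page table, cache, and memory; all structures and addresses here are hypothetical:

```python
tlb = {}                    # VPN -> PPN (the translation cache)
page_table = {0x1: 0x9}     # VPN -> PPN for memory-resident pages
cache = {}                  # PA -> data
memory = {0x9ABC: 42}       # PA -> data

def read(vpn, offset):
    # 1) Check TLB; on a miss, consult the page table and refill the TLB
    if vpn not in tlb:
        if vpn not in page_table:
            raise LookupError("page fault: fetch page from disk first")
        tlb[vpn] = page_table[vpn]
    pa = (tlb[vpn] << 12) | offset   # hypothetical 12-bit offsets
    # 2) Check cache; on a miss, fetch from memory and fill the cache
    if pa not in cache:
        cache[pa] = memory[pa]
    return cache[pa]

print(read(0x1, 0xABC))  # 42
```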
Page Faults
• Load the page off the disk into a free page of memory
  – Switch to some other process while we wait
• An interrupt is thrown when the page is loaded and the process' page table is updated
  – When we switch back to the task, the desired data will be in memory
• If memory is full, replace a page (LRU), writing it back if necessary, and update both page table entries
  – Continuous swapping between disk and memory is called "thrashing"
Performance Metrics
• VM performance also uses hit/miss rates and miss penalties
  – TLB Miss Rate: fraction of TLB accesses that result in a TLB miss
  – Page Table Miss Rate: fraction of PT accesses that result in a page fault
• Caching performance definitions remain the same
  – Somewhat independent, as the TLB will always pass a PA to the cache regardless of TLB hit or miss
Data Fetch Scenarios
• Are the following scenarios for a single data access possible?
  – TLB Miss, Page Fault → Yes
  – TLB Hit, Page Table Hit → No
  – TLB Miss, Cache Hit → Yes
  – Page Table Hit, Cache Miss → Yes
  – Page Fault, Cache Hit → No
[Diagram: CPU sends a VPN to the TLB; on a TLB miss the Page Table is consulted, and on a page fault the page is brought in from Disk. The PA goes to the Cache, which returns data on a hit and goes to Main Memory on a miss.]
Beat the Staff
● You have met the staff, but now it's your chance to beat them
● The staff is working on project 4 too, trying to get the fastest speedup they can
● We have a competition for the students with the best speedups, but we will have an additional reward for any students who Beat the Staff
● More details to come in a Piazza post
Agenda
• Virtual Memory
• Page Tables
• Administrivia
• Translation Lookaside Buffer (TLB)
• VM Performance
• VM Wrap-up
VM Performance
• Virtual memory is the level of the memory hierarchy that sits below main memory
  – The TLB comes before the cache, but affects the transfer of data from disk to main memory
  – Previously we assumed main memory was the lowest level; now we just have to account for disk accesses
• The same CPI and AMAT equations apply, but now treat main memory like a mid-level cache
Typical Performance Stats
[Left: CPU ↔ cache ↔ primary memory. Right: CPU ↔ primary memory ↔ secondary memory.]

Caching                        Demand paging
cache entry                    page frame
cache block (≈32 bytes)        page (≈4 KiB)
cache miss rate (1% to 20%)    page miss rate (<0.001%)
cache hit (≈1 cycle)           page hit (≈100 cycles)
cache miss (≈100 cycles)       page fault (≈5M cycles)
Impact of Paging on AMAT (1/2)
• Memory parameters:
  – L1 cache hit = 1 clock cycle, hits 95% of accesses
  – L2 cache hit = 10 clock cycles, hits 60% of L1 misses
  – DRAM = 200 clock cycles (≈100 nanoseconds)
  – Disk = 20,000,000 clock cycles (≈10 milliseconds)
• Average Memory Access Time (no paging):
  – 1 + 5%×10 + 5%×40%×200 = 5.5 clock cycles
• Average Memory Access Time (with paging):
  – 5.5 (AMAT with no paging) + ?
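The no-paging AMAT arithmetic above can be checked directly:

```python
# Checking the no-paging AMAT calculation above.
l1_hit, l2_hit, dram = 1, 10, 200
l1_miss_rate = 0.05   # 95% of accesses hit in L1
l2_miss_rate = 0.40   # 60% of L1 misses hit in L2

amat = l1_hit + l1_miss_rate * l2_hit + l1_miss_rate * l2_miss_rate * dram
print(round(amat, 1))  # 5.5 clock cycles
```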
Impact of Paging on AMAT (2/2)
• Average Memory Access Time (with paging) =
  5.5 + 5%×40%×(1 - HR_Mem)×20,000,000
• AMAT if HR_Mem = 99%?
  – 5.5 + 0.02×0.01×20,000,000 = 4005.5 (≈728x slower)
  – 1 in 20,000 memory accesses goes to disk: a 10 sec program takes 2 hours!
• AMAT if HR_Mem = 99.9%?
  – 5.5 + 0.02×0.001×20,000,000 = 405.5
• AMAT if HR_Mem = 99.9999%?
  – 5.5 + 0.02×0.000001×20,000,000 = 5.9
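The with-paging AMAT formula above can be evaluated for each of the memory hit rates:

```python
# Checking the with-paging AMAT numbers for each memory hit rate.
DISK = 20_000_000
BASE = 5.5                  # AMAT with no paging
miss_to_mem = 0.05 * 0.40   # fraction of accesses that reach main memory

def amat_with_paging(hr_mem):
    return BASE + miss_to_mem * (1 - hr_mem) * DISK

for hr in (0.99, 0.999, 0.999999):
    print(round(amat_with_paging(hr), 1))  # prints 4005.5, 405.5, 5.9
```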
Impact of TLBs on Performance
• Each TLB miss to the page table ~ an L1 cache miss
• TLB Reach: amount of virtual address space that can be simultaneously mapped by the TLB
  – TLB typically has 128 entries of page size 4-8 KiB
  – 128 × 4 KiB = 512 KiB = just 0.5 MiB
• What can you do to have better performance?
  – Multi-level TLBs (conceptually the same as multi-level caches)
  – Variable page size (segments) (not covered in CS61C)
  – Special situationally-used "superpages" (not covered in CS61C)
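The TLB reach arithmetic above (128 entries of 4 KiB pages) is a one-liner to verify:

```python
# Checking the TLB reach calculation above.
entries = 128
page_size = 4 * 1024          # 4 KiB pages
reach = entries * page_size
print(reach // 1024, "KiB")   # 512 KiB, i.e. just 0.5 MiB
```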
Agenda
• Virtual Memory
• Page Tables
• Administrivia
• Translation Lookaside Buffer (TLB)
• VM Performance
• VM Wrap-up
Hardware/Software Support for Memory Protection
• Different tasks can share parts of their virtual address spaces
  – But need to protect against errant access
  – Requires OS assistance
• Hardware support for OS protection
  – Privileged supervisor mode (a.k.a. kernel mode)
  – Privileged instructions
  – Page tables and other state information only accessible in supervisor mode
  – System call exception (e.g. ecall in RISC-V)
Context Switching
• How does a single processor run many programs at once?
• Context switch: changing the internal state of the processor (switching between processes)
  – Save register values (and PC) and change the value in the Page Table Base Register
• What happens to the TLB?
  – Current entries are for a different process (similar VAs, though!)
  – Set all entries to invalid on a context switch
Page-Based Virtual-Memory Machine
[Diagram: a pipelined datapath (PC → Instr TLB → I$ → D → RegFile → E → M → Data TLB/D$ → W) in which both the instruction fetch and the data access translate a Virtual Address into a Physical Address through a TLB. A Hardware Page Table Walker and the PTBR sit beside the Memory Controller, which accesses Main Memory (DRAM) using physical addresses.]
Address Translation
[Flowchart: a Virtual Address enters the TLB Lookup (hardware). On a TLB Hit, the Protection Check (hardware) either permits the access, yielding the Physical Address used to check the cache / find in memory, or denies it, raising a Protection Fault → SEGFAULT (software). On a TLB Miss, the Page Table "Walk" (hardware or software) either finds the page in memory and updates the TLB, or finds the page not in memory and takes a Page Fault, where the OS loads the page from disk (software).]
Summary
• User program view:
  – Contiguous memory
  – Starts from some set VA
  – "Infinitely" large
  – Is the only running program
• Reality:
  – Non-contiguous memory
  – Starts wherever available memory is
  – Finite size
  – Many programs running simultaneously
• Virtual memory provides:
  – Illusion of contiguous memory
  – All programs starting at the same set address
  – Illusion of ~infinite memory (2^32 or 2^64 bytes)
  – Protection, sharing
• Implementation:
  – Divide memory into chunks (pages)
  – OS controls the page table that maps virtual into physical addresses
  – Memory acts as a cache for disk
  – The TLB is a cache for the page table
7/31/2018 CS61C Su18 - Lecture 23 49