Virtual Memory
Instructor: Nick Riasanovsky, University of California
Transcript of CS61C Su18 Lecture 23
Review of Last Lecture
• The role of the Operating System
  – Booting a computer: BIOS, bootloader, OS boot, initialization
• Base and bounds for multiple processes
  – Simple, but doesn't give us everything we want
• Virtual memory bridges memory and disk
  – Provides the illusion of independent address spaces to processes and protects them from each other
7/31/2018 CS61C Su18 - Lecture 23 2
Agenda
• Virtual Memory and Page Tables
• Administrivia
• Translation Lookaside Buffer (TLB)
• VM Performance
• VM Wrap-up
Memory Fragmentation
As programs start and end, memory becomes fragmented. Therefore, if we ever need a large chunk of memory, programs will need to be moved.
What if we want to run process 6, and we need 32K of space?!
[Diagram: three snapshots of physical memory (OS space plus programs 1-5) showing free fragments of 8K-24K appearing as programs 4 & 5 start and programs 2 & 5 end; no single free region reaches the 32K needed.]
BIG IDEA: USE RAM AS A CACHE ($) FOR DISK
Question: What kind of cache should memory be?
(A) Direct Mapped, write back/write allocate
(B) Direct Mapped, write through/no write allocate
(C) Fully Associative, write back/write allocate
(D) Fully Associative, write through/no write allocate
• A program's view of memory can be broken up into pages by splitting processor-issued addresses into:
  Page Number : Offset
• A page table will enable translation from virtual to physical page
  – Allows storage of a program's pages non-contiguously!
[Diagram: the address space of Program 1 (pages 0-3) mapped through the Page Table of Program 1 into scattered frames of Physical Memory; the page number field acts like a TAG, and the offset selects a byte within the page.]
Memory as a Cache
• Allows storage of a program's pages non-contiguously!
[Diagram: page numbers 0-3 of Program 1 stored in Physical Memory as (Tag = Page Number, Data) pairs, like a fully associative cache.]
Issue: Associative Caches are SLOW
• Must check every entry to see if the tag matches!
• RAM is SLOW!
Solution: Page Table
• A program's view of memory can be broken up into pages by splitting processor-issued addresses into:
  page number (TAG) : offset
• A page table will enable translation from virtual to physical page
[Diagram: virtual page numbers 0-3 of Program 1 index into the Page Table of Program 1, which holds the corresponding block numbers (physical page numbers); those select frames in Physical Memory.]
Virtual and Physical Page Numbers
• VPN: like a tag; corresponds to the size of VIRTUAL memory
• PPN: like a block number; corresponds to the size of PHYSICAL memory
Virtual Address = Virtual Page Number : Offset
Physical Address = Physical Page Number : Offset
VPN bits = log2(Virtual Memory Size / Page Size) = Virtual Address Bits - Offset Bits
PPN bits = log2(Physical Memory Size / Page Size) = Physical Address Bits - Offset Bits
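The bit-width formulas above can be sketched in a few lines of Python. The sizes below (4 GiB virtual, 64 MiB physical, 4 KiB pages) are hypothetical example values, not taken from the slides:

```python
# Sketch: computing VPN/PPN widths from the formulas above.
from math import log2

virtual_size = 2**32    # hypothetical 4 GiB virtual address space
physical_size = 2**26   # hypothetical 64 MiB physical memory
page_size = 2**12       # hypothetical 4 KiB pages

offset_bits = int(log2(page_size))                # bits to address within a page
vpn_bits = int(log2(virtual_size // page_size))   # = virtual address bits - offset bits
ppn_bits = int(log2(physical_size // page_size))  # = physical address bits - offset bits

print(offset_bits, vpn_bits, ppn_bits)  # 12 20 14
```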
Page Table Entry Format
• Contains either the PPN or an indication that the page is not in main memory
• Valid = valid page table entry
  – 1 → virtual page is in physical memory
  – 0 → OS needs to fetch page from disk
• Access Rights checked on every access to see if allowed (provides protection)
  – Read
  – Write
  – Executable: can fetch instructions from page
Page Table Layout
Each page table entry holds V (valid), AR (access rights), and PPN bits.
Given a Virtual Address = VPN : offset,
1) Index into the page table using the VPN
2) Check the Valid and Access Rights bits
3) Combine the PPN and the offset into the Physical Address
4) Use the PA to access memory
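The four steps above can be sketched with a tiny hypothetical page table (a list of (valid, access rights, PPN) entries) and a hypothetical 12-bit offset; none of these values come from the slides:

```python
PAGE_OFFSET_BITS = 12  # hypothetical 4 KiB pages

page_table = [
    (1, "rw-", 0x64),  # VPN 0 -> PPN 0x64
    (0, "---", None),  # VPN 1 not in physical memory
    (1, "r-x", 0x10),  # VPN 2 -> PPN 0x10
]

def translate(virtual_address):
    vpn = virtual_address >> PAGE_OFFSET_BITS             # 1) index with the VPN
    offset = virtual_address & ((1 << PAGE_OFFSET_BITS) - 1)
    valid, rights, ppn = page_table[vpn]
    if not valid:                                         # 2) check the valid bit
        raise LookupError("page fault: OS must fetch page from disk")
    return (ppn << PAGE_OFFSET_BITS) | offset             # 3) combine PPN and offset

pa = translate(0x2ABC)   # VPN 2, offset 0xABC
print(hex(pa))           # 4) use the PA to access memory -> 0x10abc
```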
Private Address Space per User
• Each program has a page table
• Page table contains an entry for each program page
[Diagram: Prog 1, Prog 2, and Prog 3 each access the same virtual address VA1, but each program's own Page Table maps it to a different frame of Physical Memory, alongside the OS pages and free frames.]
Protection Between Processes
• With a bare system, addresses issued with loads/stores are real physical addresses
• This means any program can issue any address, and therefore can access any part of memory, even areas it doesn't own
  – Example: the OS data structures
• We should send all addresses through a mechanism that the OS controls before they make it out to DRAM - a translation mechanism
Question: How big is the page table?
• 64 MiB RAM
• 32-bit virtual address space
• 1 KiB pages
(A) Less than 1 page
(B) Less than 100 pages
(C) Less than 1000 pages
(D) More than 1000 pages

Solution:
Offset bits = log2(1024) = 10
Virtual page bits = 32 - 10 = 22 (2^22 entries in the page table!)
Physical page bits = log2(2^26) - 10 = 26 - 10 = 16 (≈2 B per entry)
Total bytes = 2^22 × 2 = 2^23
Number of pages = 2^23 / 2^10 = 2^13 pages → (D)
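The worked example above (64 MiB RAM, 32-bit virtual addresses, 1 KiB pages) can be checked directly:

```python
# Verifying the page-table size calculation above.
from math import log2

offset_bits = int(log2(1024))              # 10 bits of offset
vpn_bits = 32 - offset_bits                # 22 -> 2**22 page table entries
ppn_bits = int(log2(2**26)) - offset_bits  # 16 bits, roughly 2 bytes per entry
total_bytes = 2**vpn_bits * 2              # 2**23 bytes for the whole table
pages_needed = total_bytes // 2**offset_bits
print(pages_needed)  # 8192 pages, i.e. "more than 1000 pages"
```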
Where Should Page Tables Reside?
• Space required by the page tables (PT) is proportional to the address space, number of users, ...
  ⇒ Too large to keep in registers or caches
• Idea: keep PTs in main memory
• But how can we find the page table in memory if the page table is how we learn physical addresses?
  – PT Base Register: stores the physical address of the current page table
Page Tables in Physical Memory
[Diagram: the virtual address spaces of Prog 1 and Prog 2 both contain VA1; PT Prog1 and PT Prog2 reside in Physical Memory (located via the PT Base Register) and map each program's VA1 to its own frame.]
1) Access page table for address translation
2) Access correct physical address
Requires two accesses of physical memory!
Linear Page Table
[Diagram: the VPN of the virtual address, added to the PT Base Register, indexes a linear Page Table whose entries hold PPNs or DPNs; the selected PPN combined with the offset locates the data word in the Data Pages.]
Page Table Entry (PTE) contains:
• 1 bit to indicate if the page exists
• And either PPN or DPN:
  – PPN (physical page number) for a memory-resident page
  – DPN (disk page number) for a page on the disk
• Status bits for protection and usage (read, write, exec)
OS sets the Page Table Base Register whenever the active user process changes
Hierarchical Page Table
[Diagram: the virtual address is split into p1 (10-bit L1 index, bits 31-22), p2 (10-bit L2 index, bits 21-12), and offset (bits 11-0). The Root of the Current Page Table (a processor register) selects the Level 1 Page Table; p1 selects a Level 2 Page Table; p2 selects the data page in Physical Memory. Pages may be in primary or secondary memory, and a PTE may mark a nonexistent page.]
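The two-level walk above (10-bit p1, 10-bit p2, 12-bit offset) can be sketched with hypothetical dicts standing in for the tables; the mappings are invented for illustration:

```python
OFFSET_BITS, L2_BITS = 12, 10

l2_table = {3: 0x55}        # hypothetical: p2 -> PPN
l1_table = {1: l2_table}    # hypothetical: p1 -> level-2 table

def walk(virtual_address, root):
    offset = virtual_address & ((1 << OFFSET_BITS) - 1)
    p2 = (virtual_address >> OFFSET_BITS) & ((1 << L2_BITS) - 1)
    p1 = virtual_address >> (OFFSET_BITS + L2_BITS)
    l2 = root.get(p1)
    if l2 is None or p2 not in l2:
        raise LookupError("page fault")  # PTE of a nonexistent page
    return (l2[p2] << OFFSET_BITS) | offset

va = (1 << 22) | (3 << 12) | 0x7F       # p1=1, p2=3, offset=0x7F
print(hex(walk(va, l1_table)))          # 0x5507f
```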
Hierarchical Page Table Walk: SPARC v8
[Diagram: the Virtual Address (bits 31-0) is split into Index 1 (31-24), Index 2 (23-18), Index 3 (17-12), and Offset (11-0). The Context Table Register and Context Register select a root pointer in the Context Table; PTPs (page table pointers) in the L1 and L2 tables lead to a PTE in the L3 table, yielding Physical Address = PPN : Offset.]
MMU does this table walk in hardware on a TLB miss
Two-Level Page Tables in Physical Memory
[Diagram: the virtual address spaces of User 1 and User 2 each contain VA1. The PT Base Register points to the current Level 1 PT; Level 1 PT User 1, Level 1 PT User 2, and Level 2 PT User 2 all live in Physical Memory, mapping User1/VA1 and User2/VA1 to different frames.]
Agenda
• Virtual Memory and Page Tables
• Administrivia
• Translation Lookaside Buffer (TLB)
• VM Performance
• VM Wrap-up
Administrivia
• Proj4 due on Friday (8/03)
  – Hold off on submissions for now
• HW7 released
• Guerrilla Session on Wed. @ Cory 540AB, 4-6p
• Regrade requests are open for MT2 until Friday
• The final will be 8/09 7-10PM @ VLSB 2040/2060!
Agenda
• Virtual Memory
• Page Tables
• Administrivia
• Translation Lookaside Buffer (TLB)
• VM Performance
• VM Wrap-up
Virtual Memory Problem
• 2 physical memory accesses per data access = SLOW!
• Since there is locality in the pages of data accessed, there must also be locality in the translations of those pages
• Build a separate cache for the page table
  – For historical reasons, this cache is called a Translation Lookaside Buffer (TLB)
  – Notice that what is stored in the TLB is NOT data, but the VPN → PPN mapping translations
TLBs vs. Caches
• TLBs are usually small, typically 32-128 entries
• TLB access time is comparable to the cache (much faster than accessing main memory)
• TLBs are usually fully/highly associative
[Diagram: the D$/I$ takes a memory address and returns the data at that address, going to the next cache level / main memory on a miss; the TLB takes a VPN and returns a PPN, accessing the page table in main memory on a miss.]
Where Are TLBs Located?
• Which should we check first: cache or TLB?
  – Can the cache hold the requested data if the corresponding page is not in physical memory? No - check the page table first
  – With the TLB first, does the cache receive a VA or a PA? PA
[Diagram: the CPU sends a VPN to the TLB; on a TLB miss the Page Table supplies the PPN. The resulting PA goes to the Cache, which returns data on a hit and accesses Main Memory on a miss. Now the TLB does the translation, not the Page Table!]
Address Translation Using TLB
[Diagram: the Virtual Address splits into VPN (= TLB Tag : TLB Index) and Page Offset. The TLB Tag is used just like in a cache, and on a hit the TLB supplies the PPN. The Physical Address = PPN : Page Offset is then split a second way into the data cache's Tag : Index : Offset to access the cache's (Tag, Block Data) entries. The PA is split two different ways!]
Note: the TIO breakdowns for VA and PA are unrelated.
Typical TLB Entry Format
Valid | Dirty | Ref | Access Rights | TLB Tag | PPN
• Valid: whether that TLB ENTRY is valid (unrelated to the PT valid bit)
• Access Rights: data from the PT
• Dirty: we basically always use write-back, so this indicates whether or not to write the page to disk when replaced
• Ref: used to implement LRU
  – Set when page is accessed, cleared periodically by the OS
• TLB Index: VPN mod (# TLB sets)
• TLB Tag: VPN minus TLB Index (upper bits)
• PPN: data from the PT
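The TLB Index and TLB Tag definitions above can be sketched directly. The sizes here (a 2-way set-associative TLB with 512 entries) and the example VPN are hypothetical:

```python
ENTRIES, WAYS = 512, 2
SETS = ENTRIES // WAYS          # 256 sets -> 8 index bits
INDEX_BITS = SETS.bit_length() - 1

def tlb_split(vpn):
    index = vpn % SETS          # TLB Index = VPN mod (# TLB sets)
    tag = vpn >> INDEX_BITS     # TLB Tag = upper bits of the VPN
    return tag, index

tag, index = tlb_split(0x3ABCD)
print(hex(tag), hex(index))     # 0x3ab 0xcd
```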
Question: How many bits wide are the following?
• 16 KiB pages
• 40-bit virtual addresses
• 64 GiB physical memory
• 2-way set associative TLB with 512 entries

    TLB Tag | TLB Index | TLB Entry
(A)   12    |    14     |    38
(B)   18    |     8     |    45
(C)   14    |    12     |    40
(D)   17    |     9     |    43
(TLB entry: Valid, Dirty, Ref, Access Rights, TLB Tag, PPN)

Solution:
First solve for the T:I:O of the TLB:
O = log2(2^14) = 14
I = log2(512 / 2) = 8
T = 40 - 14 - 8 = 18
Now solve for the size of a TLB entry:
Total bits = 5 + 18 + PPN bits
PPN bits = log2(2^36 / 2^14) = 22
Total bits = 5 + 18 + 22 = 45 → (B)
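The TLB sizing solution above can be checked in a few lines (the 5 status bits are valid, dirty, ref, and 2 access-rights bits, as in the worked total):

```python
# Verifying the TLB sizing example: 16 KiB pages, 40-bit VAs,
# 64 GiB physical memory, 2-way set-associative TLB with 512 entries.
from math import log2

offset_bits = int(log2(16 * 1024))         # 14
index_bits = int(log2(512 // 2))           # 8 (256 sets)
tag_bits = 40 - offset_bits - index_bits   # 18
ppn_bits = int(log2(2**36)) - offset_bits  # 36 - 14 = 22
entry_bits = 5 + tag_bits + ppn_bits       # 5 status bits + tag + PPN
print(tag_bits, index_bits, entry_bits)    # 18 8 45
```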
Fetching Data on a Memory Read
1) Check TLB (input: VPN, output: PPN)
  – TLB Hit: fetch translation, return PPN
  – TLB Miss: check page table (in memory)
    • Page Table Hit: load page table entry into TLB
    • Page Table Miss (Page Fault): fetch page from disk to memory, update corresponding page table entry, then load entry into TLB
2) Check cache (input: PPN, output: data)
  – Cache Hit: return data value to processor
  – Cache Miss: fetch data value from memory, store it in cache, return it to processor
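The two-step read above can be sketched with toy dicts standing in for the TLB, page table, cache, and memory; all structures and addresses here are hypothetical:

```python
tlb = {}                    # VPN -> PPN (the translation cache)
page_table = {0x1: 0x9}     # VPN -> PPN for memory-resident pages
cache = {}                  # PA -> data
memory = {0x9ABC: 42}       # PA -> data

def read(vpn, offset):
    # 1) Check TLB; on a miss, consult the page table and refill the TLB
    if vpn not in tlb:
        if vpn not in page_table:
            raise LookupError("page fault: fetch page from disk first")
        tlb[vpn] = page_table[vpn]
    pa = (tlb[vpn] << 12) | offset   # hypothetical 12-bit offsets
    # 2) Check cache; on a miss, fetch from memory and fill the cache
    if pa not in cache:
        cache[pa] = memory[pa]
    return cache[pa]

print(read(0x1, 0xABC))  # 42
```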
Page Faults
• Load the page off the disk into a free page of memory
  – Switch to some other process while we wait
• An interrupt is thrown when the page is loaded and the process' page table is updated
  – When we switch back to the task, the desired data will be in memory
• If memory is full, replace a page (LRU), writing it back if necessary, and update both page table entries
  – Continuous swapping between disk and memory is called "thrashing"
Performance Metrics
• VM performance also uses hit/miss rates and miss penalties
  – TLB Miss Rate: fraction of TLB accesses that result in a TLB miss
  – Page Table Miss Rate: fraction of PT accesses that result in a page fault
• Caching performance definitions remain the same
  – Somewhat independent, as the TLB will always pass a PA to the cache regardless of TLB hit or miss
Data Fetch Scenarios
• Are the following scenarios for a single data access possible?
  – TLB Miss, Page Fault → Yes
  – TLB Hit, Page Table Hit → No
  – TLB Miss, Cache Hit → Yes
  – Page Table Hit, Cache Miss → Yes
  – Page Fault, Cache Hit → No
[Diagram: CPU sends a VPN to the TLB; on a TLB miss the Page Table is consulted, and on a page fault the page is brought in from Disk. The PA goes to the Cache, which returns data on a hit and goes to Main Memory on a miss.]
Beat the Staff
● You have met the staff, but now it's your chance to beat them
● The staff is working on project 4 too, trying to get the fastest speedup they can
● We have a competition for the students with the best speedups, but we will have an additional reward for any students who Beat the Staff
● More details to come in a Piazza post
Agenda
• Virtual Memory
• Page Tables
• Administrivia
• Translation Lookaside Buffer (TLB)
• VM Performance
• VM Wrap-up
VM Performance
• Virtual memory is the level of the memory hierarchy that sits below main memory
  – The TLB comes before the cache, but affects the transfer of data from disk to main memory
  – Previously we assumed main memory was the lowest level; now we just have to account for disk accesses
• The same CPI and AMAT equations apply, but now treat main memory like a mid-level cache
Typical Performance Stats
[Left: CPU ↔ cache ↔ primary memory. Right: CPU ↔ primary memory ↔ secondary memory.]

Caching                        Demand paging
cache entry                    page frame
cache block (≈32 bytes)        page (≈4 KiB)
cache miss rate (1% to 20%)    page miss rate (<0.001%)
cache hit (≈1 cycle)           page hit (≈100 cycles)
cache miss (≈100 cycles)       page fault (≈5M cycles)
Impact of Paging on AMAT (1/2)
• Memory parameters:
  – L1 cache hit = 1 clock cycle, hits 95% of accesses
  – L2 cache hit = 10 clock cycles, hits 60% of L1 misses
  – DRAM = 200 clock cycles (≈100 nanoseconds)
  – Disk = 20,000,000 clock cycles (≈10 milliseconds)
• Average Memory Access Time (no paging):
  – 1 + 5%×10 + 5%×40%×200 = 5.5 clock cycles
• Average Memory Access Time (with paging):
  – 5.5 (AMAT with no paging) + ?
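The no-paging AMAT arithmetic above can be checked directly:

```python
# Checking the no-paging AMAT calculation above.
l1_hit, l2_hit, dram = 1, 10, 200
l1_miss_rate = 0.05   # 95% of accesses hit in L1
l2_miss_rate = 0.40   # 60% of L1 misses hit in L2

amat = l1_hit + l1_miss_rate * l2_hit + l1_miss_rate * l2_miss_rate * dram
print(round(amat, 1))  # 5.5 clock cycles
```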
Impact of Paging on AMAT (2/2)
• Average Memory Access Time (with paging) =
  5.5 + 5%×40%×(1 - HR_Mem)×20,000,000
• AMAT if HR_Mem = 99%?
  – 5.5 + 0.02×0.01×20,000,000 = 4005.5 (≈728x slower)
  – 1 in 20,000 memory accesses goes to disk: a 10 sec program takes 2 hours!
• AMAT if HR_Mem = 99.9%?
  – 5.5 + 0.02×0.001×20,000,000 = 405.5
• AMAT if HR_Mem = 99.9999%?
  – 5.5 + 0.02×0.000001×20,000,000 = 5.9
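The with-paging AMAT formula above can be evaluated for each of the memory hit rates:

```python
# Checking the with-paging AMAT numbers for each memory hit rate.
DISK = 20_000_000
BASE = 5.5                  # AMAT with no paging
miss_to_mem = 0.05 * 0.40   # fraction of accesses that reach main memory

def amat_with_paging(hr_mem):
    return BASE + miss_to_mem * (1 - hr_mem) * DISK

for hr in (0.99, 0.999, 0.999999):
    print(round(amat_with_paging(hr), 1))  # prints 4005.5, 405.5, 5.9
```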
Impact of TLBs on Performance
• Each TLB miss to the page table ~ an L1 cache miss
• TLB Reach: amount of virtual address space that can be simultaneously mapped by the TLB
  – TLB typically has 128 entries of page size 4-8 KiB
  – 128 × 4 KiB = 512 KiB = just 0.5 MiB
• What can you do to have better performance?
  – Multi-level TLBs (conceptually the same as multi-level caches)
  – Variable page size (segments) (not covered in CS61C)
  – Special situationally-used "superpages" (not covered in CS61C)
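The TLB reach arithmetic above (128 entries of 4 KiB pages) is a one-liner to verify:

```python
# Checking the TLB reach calculation above.
entries = 128
page_size = 4 * 1024          # 4 KiB pages
reach = entries * page_size
print(reach // 1024, "KiB")   # 512 KiB, i.e. just 0.5 MiB
```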
Agenda
• Virtual Memory
• Page Tables
• Administrivia
• Translation Lookaside Buffer (TLB)
• VM Performance
• VM Wrap-up
Hardware/Software Support for Memory Protection
• Different tasks can share parts of their virtual address spaces
  – But need to protect against errant access
  – Requires OS assistance
• Hardware support for OS protection
  – Privileged supervisor mode (a.k.a. kernel mode)
  – Privileged instructions
  – Page tables and other state information only accessible in supervisor mode
  – System call exception (e.g. ecall in RISC-V)
Context Switching
• How does a single processor run many programs at once?
• Context switch: changing the internal state of the processor (switching between processes)
  – Save register values (and PC) and change the value in the Page Table Base Register
• What happens to the TLB?
  – Current entries are for a different process (similar VAs, though!)
  – Set all entries to invalid on a context switch
Page-Based Virtual-Memory Machine
[Diagram: a pipelined datapath (PC → Instr TLB → I$ → D → RegFile → E → M → Data TLB/D$ → W) in which both the instruction fetch and the data access translate a Virtual Address into a Physical Address through a TLB. A Hardware Page Table Walker and the PTBR sit beside the Memory Controller, which accesses Main Memory (DRAM) using physical addresses.]
Address Translation
[Flowchart: a Virtual Address enters the TLB Lookup (hardware). On a TLB Hit, the Protection Check (hardware) either permits the access, yielding the Physical Address used to check the cache / find in memory, or denies it, raising a Protection Fault → SEGFAULT (software). On a TLB Miss, the Page Table "Walk" (hardware or software) either finds the page in memory and updates the TLB, or finds the page not in memory and takes a Page Fault, where the OS loads the page from disk (software).]
Summary
• User program view:
  – Contiguous memory
  – Starts from some set VA
  – "Infinitely" large
  – Is the only running program
• Reality:
  – Non-contiguous memory
  – Starts wherever available memory is
  – Finite size
  – Many programs running simultaneously
• Virtual memory provides:
  – Illusion of contiguous memory
  – All programs starting at the same set address
  – Illusion of ~infinite memory (2^32 or 2^64 bytes)
  – Protection, sharing
• Implementation:
  – Divide memory into chunks (pages)
  – OS controls the page table that maps virtual into physical addresses
  – Memory acts as a cache for disk
  – The TLB is a cache for the page table
7/31/2018 CS61C Su18 - Lecture 23 49