Classical Operating Systems: A Quick Tour Jeff Chase.
-
Upload
madison-mcbride -
Category
Documents
-
view
215 -
download
1
Transcript of Classical Operating Systems: A Quick Tour Jeff Chase.
The Birth of a ProgramThe Birth of a Program
int j;char* s = “hello\n”;
int p() { j = write(1, s, 6); return(j);}
myprogram.c
compiler
…..p: store this store that push jsr _write ret etc.
myprogram.s
assembler data
myprogram.o
linker
object file
data program
(executable file)myprogram
datadatadata
libraries and other
objects
What’s in an Object File or Executable?What’s in an Object File or Executable?
int j = 327;char* s = “hello\n”;char sbuf[512];
int p() { int k = 0; j = write(1, s, 6); return(j);}
text
dataidata
wdata
header
symboltable
relocationrecords
Used by linker; may be removed after final link step and strip.
Header “magic number”indicates type of image.
Section table an arrayof (offset, len, startVA)
program sections
program instructionsp
immutable data (constants)“hello\n”
writable global/static dataj, s
j, s ,p,sbuf
Memory and the CPUMemory and the CPU0
2n
code library
OS data
OS code
Program A
dataData
Program B
Data
registers
CPU
R0
Rn
PC
main memory
x
x
ThreadsThreads
A thread is a schedulable stream of control.defined by CPU register values (PC, SP)
suspend: save register values in memory
resume: restore registers from memory
Multiple threads can execute independently:
They can run in parallel on multiple CPUs...- physical concurrency
…or arbitrarily interleaved on a single CPU.- logical concurrency
Each thread must have its own stack.
A Peek Inside a Running ProgramA Peek Inside a Running Program
0
high
code library
your data
heap
registers
CPU
R0
Rn
PC
“memory”
x
x
your program
common runtime
stack
address space(virtual or physical)
SP
y
y
A Program With Two ThreadsA Program With Two Threads
0
high
code library
data
registers
CPU
R0
Rn
PC
“memory”
x
x
program
common runtime
stack
address space
SP
y
y stack
runningthread
“on deck” and ready to run
Thread Context SwitchThread Context Switch
0
high
code library
data
registers
CPU
R0
Rn
PC
“memory”
x
x
program
common runtime
stack
address space
SP
y
y stack
1. save registers
2. load registers
switch in
switch out
Threads vs. ProcessesThreads vs. Processes
1. The process is a kernel abstraction for an independent executing program.
includes at least one “thread of control”
also includes a private address space (VAS)
2. Threads may share a process/address spaceEvery thread must exist within some process VAS.
Processes may be “multithreaded”.
Threads can access memory as enabled in their VAS.
Threads can access memory only as enabled in their VAS.
data
data
Memory Memory Protection Protection
Paging Virtual memory provides protection by:
• Each process (user or OS) has different virtual memory space.
• The OS maintain the page tables for all processes.
• A reference outside the process allocated space cause an exception that lets the OS decide what to do.
• Memory sharing between processes is done via different Virtual spaces but common physical frames.
[Kedem, CPS 104, Fall05]
The Program and the Process VASThe Program and the Process VAS
text
dataidatawdata
header
symboltable
relocationrecords
program
text
data
BSS
user stack
args/envkernel
data
process VAS
sections segments
BSS“Block Started by Symbol”(uninitialized global data)e.g., heap and sbuf go here.
Args/env strings copied in by kernel when the process is created.
Process text segment is initialized directly from program text
section.
Process data segment(s) are
initialized from idata and wdata sections.
Process stack and BSS (e.g., heap) segment(s) are
zero-filled.
Process BSS segment may be expanded at runtime with a system call (e.g., Unix sbrk) called by the heap manager
routines.
Text and idata segments may be write-protected.
Virtual Address TranslationVirtual Address Translation
VPN offset
29 013
Example: typical 32-bitarchitecture with 8KB pages.
addresstranslation
Virtual address translation maps a virtual page number (VPN) to a physical page frame number (PFN): the rest is easy.
PFN
offset
+
00
virtual address
physical address{
Deliver exception toOS if translation is notvalid and accessible inrequested mode.
A Simple Page TableA Simple Page Table
PFN 0PFN 1
PFN i
page #i offset
user virtual address
PFN i+
offset
process page table
physical memorypage frames
In this example, each VPN j maps to PFN j,
but in practice any physical frame may be
used for any virtual page.
Each process/VAS has its own page table.
Virtual addresses are translated relative to
the current page table.
The page tables are themselves stored in memory; a protected
register holds a pointer to the current page table.
Virtual Memory as a CacheVirtual Memory as a Cache
text
dataidatawdata
header
symboltable, etc.
programsections
text
data
BSS
user stack
args/envkernel
data
processsegments
physicalpage frames
virtualmemory
(big)
physicalmemory(small)
executablefile
backingstorage
virtual-to-physical translations
pageout/eviction
page fetch
Completing a VM ReferenceCompleting a VM Reference
raiseexception
probepage table
loadTLB
probe TLB
accessphysicalmemory
accessvalid?
pagefault?
signalprocess
allocateframe
page ondisk?
fetchfrom disk
zero-fillloadTLB
starthere
MMU
OS
Processes and the KernelProcesses and the Kernel
data dataprocesses in private
virtual address spaces
system call traps...and upcalls (e.g.,
signals)
shared kernel code and data
in shared address space
Threads or processes enter the kernel for services.
The kernel sets up process execution contexts to
“virtualize” the machine.
CPU and devices force entry to the kernel to handle exceptional events.
Architectural Foundations of OS KernelsArchitectural Foundations of OS Kernels
• One or more privileged execution modes (e.g., kernel mode)
protected device control registers
privileged instructions to control basic machine functions
• System call trap instruction and protected fault handling
User processes safely enter the kernel to access shared OS services.
• Virtual memory mapping
OS controls virtual-physical translations for each address space.
• Device interrupts to notify the kernel of I/O completion etc.
Includes timer hardware and clock interrupts to periodically return control to the kernel as user code executes.
• Atomic instructions for coordination on multiprocessors
Classical View: The QuestionsClassical View: The Questions
The basic issues/questions classical OS are how to:
• allocate memory and storage to multiple programs?
• share the CPU among concurrently executing programs?
• suspend and resume programs?
• share data safely among concurrent activities?
• protect one executing program’s storage from another?
• protect the code that implements the protection, and mediates access to resources?
• prevent rogue programs from taking over the machine?
• allow programs to interact safely?
22
The Access Control Model
Object
Resource
Reference monitor
Guard
Do operation
Request
Principal
Source
Authorization
Audit log
Authentication
Policy
1. Isolation boundary2. Access
control 3. Policy
1. Isolation Boundary to prevent attacks outside access-controlled channels
2. Access Control for channel traffic
3. Policy management
[Butler Lampson – Accountability and Freedom]
23
Isolation
• Attacks on:– Program– Isolation– Policy
ServicesBoundary Creator
GUARD
GUARD
policy
policy
Program
Data
guard
Host
• I am isolated if anything that goes wrong is my fault– Actually, my program’s fault
Object
Resource
Reference monitor
Guard
Do operation
Request
Principal
Source
Authorization
Audit log
Authentication
Policy
1. Isolation boundary
2. Access control
3. Policy
[Butler Lampson – Accountability and Freedom]
24
Access Control Mechanisms:The Gold Standard
Authenticate principals: Who made a request Mainly people, but also channels, servers, programs
(encryption implements channels, so key is a principal)
Authorize access: Who is trusted with a resource Group principals or resources, to simplify management
Can define by a property, e.g. “type-safe” or “safe for scripting”
Audit: Who did what when?
• Lock = Authenticate + Authorize• Deter = Authenticate + Audit
Object
Resource
Reference monitor
Guard
Do operation
Request
Principal
Source
Authorization
Audit log
Authentication
Policy
1. Isolation boundary
2. Access control
3. Policy
[Butler Lampson – Accountability and Freedom]
Kernel ModeKernel Mode0
2n
code library
OS data
OS code
Program A
dataData
Program B
Data
registers
CPU
R0
Rn
PC
main memory
x
x
mode
CPU mode (a field in some status
register) indicates whether the CPU is
running in a user program or in the protected kernel.
Some instructions or register accesses are only legal when the CPU is executing in
kernel mode.
physical address space
The KernelThe Kernel• Today, all “real” operating systems have protected kernels.
The kernel resides in a well-known file: the “machine” automatically loads it into memory (boots) on power-on/reset.
Our “kernel” is called the executive in some systems (e.g., XP).
• The kernel is (mostly) a library of service procedures shared by all user programs, but the kernel is protected:
User code cannot access internal kernel data structures directly, and it can invoke the kernel only at well-defined entry points (system calls).
• Kernel code is like user code, but the kernel is privileged:
The kernel has direct access to all hardware functions, and defines the machine entry points for interrupts and exceptions.
Protecting Entry to the KernelProtecting Entry to the Kernel
Protected events and kernel mode are the architectural foundations of kernel-based OS (Unix, XP, etc).
• The machine defines a small set of exceptional event types.
• The machine defines what conditions raise each event.
• The kernel installs handlers for each event at boot time.
e.g., a table in kernel memory read by the machine
The machine transitions to kernel mode only on an exceptional event.
The kernel defines the event handlers.
Therefore the kernel chooses what code will execute in kernel mode, and when.
user
kernel
interrupt orexceptiontrap/return
The Role of EventsThe Role of Events
A CPU event is an “unnatural” change in control flow.Like a procedure call, an event changes the PC.
Also changes mode or context (current stack), or both.
Events do not change the current space!
The kernel defines a handler routine for each event type.Event handlers always execute in kernel mode.
The specific types of events are defined by the machine.
Once the system is booted, every entry to the kernel occurs as a result of an event.
In some sense, the whole kernel is a big event handler.
CPU Events: Interrupts and ExceptionsCPU Events: Interrupts and Exceptions
An interrupt is caused by an external event.device requests attention, timer expires, etc.
An exception is caused by an executing instruction.CPU requires software intervention to handle a fault or trap.
unplanned deliberatesync fault syscall trapasync interrupt AST
control flow
event handler (e.g., ISR: Interrupt Service
Routine)
exception.cc
AST: Asynchronous System TrapAlso called a software interrupt or an Asynchronous or Deferred Procedure Call (APC or DPC)
Note: different “cultures” may use some of these terms (e.g., trap, fault, exception, event, interrupt) slightly differently.
Protection Rings
Kernel/user mode not enough!
Need a mode for hypervisor
X86 rings Has built in security
levels (Rings 0, 1, 2, 3) Ring 0 – OS Software
(most privileged) Ring 3 – User software Ring 1 & 2 – Not used
Xen guest OS executes on e.g. Ring 1
Increasing Privilege Level
Ring 0
Ring 1
Ring 2
Ring 3
[Fischbach]
Example: System Call TrapsExample: System Call Traps
User code invokes kernel services by initiating system call traps.
• Programs in C, C++, etc. invoke system calls by linking to a standard library of procedures written in assembly language.
the library defines a stub or wrapper routine for each syscall
stub executes a special trap instruction (e.g., chmk or callsys or int)
syscall arguments/results passed in registers or user stack
read() in Unix libc.a library (executes in user mode):
#define SYSCALL_READ 27 # code for a read system call
move arg0…argn, a0…an # syscall args in registers A0..ANmove SYSCALL_READ, v0 # syscall dispatch code in V0callsys # kernel trapmove r1, _errno # errno = return statusreturn
Alpha CPU architecture
FaultsFaultsFaults are similar to system calls in some respects:
• Faults occur as a result of a process executing an instruction.
Fault handlers execute on the process kernel stack; the fault handler may block (sleep) in the kernel.
• The completed fault handler may return to the faulted context.
But faults are different from syscall traps in other respects:
• Syscalls are deliberate, but faults are “accidents”.
divide-by-zero, dereference invalid pointer, memory page fault
• Not every execution of the faulting instruction results in a fault.
may depend on memory state or register contents
Processes and the KernelProcesses and the Kernel
data dataprocesses in private
virtual address spaces
system call traps...and upcalls (e.g.,
signals)
shared kernel code and data
in shared address space
Threads or processes enter the kernel for services.
The kernel sets up process execution contexts to
“virtualize” the machine.
CPU and devices force entry to the kernel to handle exceptional events.
Process InternalsProcess Internals
+ +user ID
process IDparent PID
sibling linkschildren
virtual address space process descriptor (PCB)
resources
thread
stack
Each process has a thread bound to the VAS.
The thread has a saved user context as well as a system
context.
The kernel can manipulate the user context to start the
thread in user mode wherever it wants.
Process state includes a file descriptor table, links to maintain the process tree, and a
place to store the exit status.
The address space is represented by page
table, a set of translations to physical
memory allocated from a kernel memory manager.
The kernel must initialize the process
memory with the program image to run.
Kernel Stacks and Trap/Fault HandlingKernel Stacks and Trap/Fault Handling
data
Processes execute user
code on a user stack in the user
portion of the process virtual address space.
Each process has a second kernel stack in kernel space (the kernel portion of the
address space).
stack
stack
stack
stack
System calls and faults run in kernel mode on the process kernel stack.
syscall dispatch
table
System calls run in the process
space, so copyin and copyout can
access user memory.
The syscall trap handler makes an indirect call through the system call dispatch table to the handler for the specific system call.
A Lasting Achievement?A Lasting Achievement?
“Perhaps the most important achievement of Unix is to demonstrate that a powerful operating system for interactive use need not be expensive…it can run on hardware costing as little as $40,000.”
The UNIX Time-Sharing System* D. M. Ritchie and K. Thompson
DEC PDP-11/24
http://histoire.info.online.fr/pdp11.html
Elements of the UnixElements of the Unix
1. rich model for IPC and I/O: “everything is a file”file descriptors: most/all interactions with the outside world are
through system calls to read/write from file descriptors, with a unified set of syscalls for operating on open descriptors of different types.
2. simple and powerful primitives for creating and initializing child processes
fork: easy to use, expensive to implement
Command shell is an “application” (user mode)
3. general support for combining small simple programs to perform complex tasks
standard I/O and pipelines
The ShellThe Shell
The Unix command interpreters run as ordinary user processes with no special privilege.
This was novel at the time Unix was created: other systems viewed the command interpreter as a trusted part of the OS.
Users may select from a range of interpreter programs available, or even write their own (to add to the confusion).
csh, sh, ksh, tcsh, bash: choose your flavor...or use perl.
Shells use fork/exec/exit/wait to execute commands composed of program filenames, args, and I/O redirection symbols.
Shells are general enough to run files of commands (scripts) for more complex tasks, e.g., by redirecting shell’s stdin.
Shell’s behavior is guided by environment variables.
Using the shellUsing the shell
• Commands: ls, cat, and all that
• Current directory: cd and pwd
• Arguments: echo
• Signals: ctrl-c
• Job control, foreground, and background: &, ctrl-z, bg, fg
• Environment variables: printenv and setenv
• Most commands are programs: which, $PATH, and /bin
• Shells are commands: sh, csh, ksh, tcsh, bash
• Pipes and redirection: ls | grep a
• Files and I/O: open, read, write, lseek, close
• stdin, stdout, stderr
• Users and groups: whoami, sudo, groups
Other application programs
cc
Other application programs
Hardware
Kernel
sh who a.out
date
wc
grepedvi
ld
as
comp
cppnroff
Questions about ProcessesQuestions about Processes
A process is an execution of a program within a private virtual address space (VAS).
1. What are the system calls to operate on processes?
2. How does the kernel maintain the state of a process?Processes are the “basic unit of resource grouping”.
3. How is the process virtual address space laid out?What is the relationship between the program and the process?
4. How does the kernel create a new process?How to allocate physical memory for processes?
How to create/initialize the virtual address space?
Process CreationProcess CreationTwo ways to create a process
• Build a new empty process from scratch
• Copy an existing process and change it appropriately
Option 1: New process from scratch
• StepsLoad specified code and data into memory;
Create empty call stack
Create and initialize PCB (make look like context-switch)
Put process on ready list
• Advantages: No wasted work
• Disadvantages: Difficult to setup process correctly and to express all possible options
Process permissions, where to write I/O, environment variables
Example: WindowsNT has call with 10 arguments
[Remzi Arpaci-Dusseau]
Process CreationProcess CreationOption 2: Clone existing process and change
• Example: Unix fork() and exec()Fork(): Clones calling process
Exec(char *file): Overlays file image on calling process
• Fork() Stop current process and save its state
Make copy of code, data, stack, and PCB
Add new PCB to ready list
Any changes needed to PCB?
• Exec(char *file)Replace current data and code segments with those in specified file
• Advantages: Flexible, clean, simple
• Disadvantages: Wasteful to perform copy and then overwrite of memory
[Remzi Arpaci-Dusseau]
Process Creation in UnixProcess Creation in Unix
int pid;int status = 0;
if (pid = fork()) {/* parent */…..pid = wait(&status);
} else {/* child */…..exit(status);
}
Parent uses wait to sleep until the child exits; wait returns child pid and status.
Wait variants allow wait on a specific child, or notification of stops and other signals.
The fork syscall returns twice: it returns a zero to the child and the child process ID (pid) to the parent.
Unix Unix Fork/Exec/Exit/WaitFork/Exec/Exit/Wait Example Example
fork parent fork child
wait exit
int pid = fork();Create a new process that is a clone of its parent.
exec*(“program” [, argvp, envp]);Overlay the calling process virtual memory with a new program, and transfer control to it.
exit(status);Exit with status, destroying the process. Note: this is not the only way for a process to exit!
int pid = wait*(&status);Wait for exit (or other status change) of a child, and “reap” its exit status. Note: child may have exited before parent calls wait!
exec initialize child context
How are Unix shells implemented?How are Unix shells implemented?while (1) {
Char *cmd = getcmd();
int retval = fork();
if (retval == 0) {
// This is the child process
// Setup the child’s process environment here
// E.g., where is standard I/O, how to handle signals?
exec(cmd);
// exec does not return if it succeeds
printf(“ERROR: Could not execute %s\n”, cmd);
exit(1);
} else {
// This is the parent process; Wait for child to finish
int pid = retval;
wait(pid);
}
}
[Remzi Arpaci-Dusseau]
The Concept of ForkThe Concept of Fork
fork creates a child process that is a clone of the parent.
• Child has a (virtual) copy of the parent’s virtual memory.
• Child is running the same program as the parent.
• Child inherits open file descriptors from the parent.
(Parent and child file descriptors point to a common entry in the system open file table.)
• Child begins life with the same register values as parent.
The child process may execute a different program in its context with a separate exec() system call.
What’s So Cool About What’s So Cool About ForkFork
1. fork is a simple primitive that allows process creation without troubling with what program to run, args, etc.
Serves the purpose of “lightweight” processes (like threads?).
2. fork gives the parent program an opportunity to initialize the child process…e.g., the open file descriptors.
Unix syscalls for file descriptors operate on the current process.
Parent program running in child process context may open/close I/O and IPC objects, and bind them to stdin, stdout, and stderr.
Also may modify environment variables, arguments, etc.
3. Using the common fork/exec sequence, the parent (e.g., a command interpreter or shell) can transparently cause children to read/write from files, terminal windows, network connections, pipes, etc.
Unix File DescriptorsUnix File Descriptors
Unix processes name I/O and IPC objects by integers known as file descriptors. • File descriptors 0, 1, and 2 are reserved by convention
for standard input, standard output, and standard error.“Conforming” Unix programs read input from stdin, write
output to stdout, and errors to stderr by default.
• Other descriptors are assigned by syscalls to open/create files, create pipes, or bind to devices or network sockets.pipe, socket, open, creat
• A common set of syscalls operate on open file descriptors independent of their underlying types.read, write, dup, close
Unix File Descriptors IllustratedUnix File Descriptors Illustrated
user space
File descriptors are a special case of kernel object handles.
pipe
file
socket
process filedescriptor
table
kernel
system open file table
tty
Disclaimer: this drawing is
oversimplified.
The binding of file descriptors to objects is specific to each process, like the virtual translations in the virtual address space.
Kernel Object HandlesKernel Object Handles
Instances of kernel abstractions may be viewed as “objects” named by protected handles held by processes.• Handles are obtained by create/open calls, subject to security
policies that grant specific rights for each handle.
• Any process with a handle for an object may operate on the object using operations (system calls).Specific operations are defined by the object’s type.
• The handle is an integer index to a kernel table.
port
file
etc.object
handles
user space kernel
Microsoft Windows object handles Unix file descriptors
Unix PhilosophyUnix Philosophy
Rule of Modularity: Write simple parts connected by clean interfaces.Rule of Composition: Design programs to be connected to other
programs.Rule of Separation: Separate policy from mechanism; separate interfaces
from engines.Rule of Representation: Fold knowledge into data so program logic can
be stupid and robust.Rule of Transparency: Design for visibility to make inspection and
debugging easier.Rule of Repair: When you must fail, fail noisily and as soon as possibleRule of Extensibility: Design for the future, because it will be here
sooner than you think.Rule of Robustness: Robustness is the child of transparency and
simplicity.
[Eric Raymond]
Unix Philosophy: SimplicityUnix Philosophy: Simplicity
Rule of Economy: Programmer time is expensive; conserve it in preference to machine time.
Rule of Clarity: Clarity is better than cleverness.Rule of Simplicity: Design for simplicity; add complexity only where
you must.Rule of Parsimony: Write a big program only when it is clear by
demonstration that nothing else will do.Rule of Generation: Avoid hand-hacking; write programs to write
programs when you can.Rule of Optimization: Prototype before polishing. Get it working before
you optimize it.
[Eric Raymond]
Unix Philosophy: InterfacesUnix Philosophy: Interfaces
Rule of Least Surprise: In interface design, always do the least surprising thing.
Rule of Silence: When a program has nothing surprising to say, it should say nothing.
[Eric Raymond]
Rule of Diversity: Distrust all claims for “one true way”.
Introduction to Virtual AddressingIntroduction to Virtual Addressing
text
data
BSS
user stack
args/envkernel
data
virtualmemory
(big?)
physicalmemory(small?)
virtual-to-physical translations
User processes address memory through virtual
addresses.
The kernel and the machine collude to
translate virtual addresses to
physical addresses.
The kernel controls the virtual-physical translations in effect
for each space.
The machine does not allow a user process to access memory unless the kernel “says it’s OK”.
The specific mechanisms for implementing virtual address translation
are machine-dependent.
Multics (1965-2000)Multics (1965-2000)
• “Multiplexed 24x7 computer utility”
Multi-user “time-sharing”, interactive and batch
Processes, privacy, security, sharing, accounting
“Decentralized system programming”
• Virtual memory with “automatic page turning”
• “Two-dimensional VM” with segmentation
Avoids “complicated overlay techniques”
Modular sharing and dynamic linking
• Hierarchical file system with symbolic names, automatic backup
“Single-level store”
• Dawn of “systems research” as an academic enterprise
Multics ConceptsMultics Concepts
• Segments as granularity of sharingProtection rings and the segment protection level
• Segments have symbolic names in a hierarchical name space
• Segments are “made known” before access from a process.Apply access control at this point.
• Segmentation is a cheap way to extend the address space.Note: paging mechanisms are independent of segmentation.
• Dynamic linking resolves references across segments.
How does the Unix philosophy differ?
Segmentation (1)Segmentation (1)
One-dimensional address space
growing tables
tables may bump
[Tanenbaum]
Variable PartitioningVariable Partitioning
Variable partitioning is the strategy of parking differently sized carsalong a street with no marked parking space dividers.
Wasted space from externalfragmentation
Segmentation with Paging: MULTICS (1)Segmentation with Paging: MULTICS (1)
Descriptor segment points to page tables
Segment descriptor – numbers are field lengths
Segmentation with Paging: MULTICS (2)Segmentation with Paging: MULTICS (2)
A 34-bit MULTICS virtual address
Segmentation with Paging: MULTICS (3)Segmentation with Paging: MULTICS (3)
Conversion of a 2-part MULTICS address into a main memory address
[Tanenbaum]
The Virtual Address SpaceThe Virtual Address Space
A typical process VAS space includes:• user regions in the lower half
V->P mappings specific to each process
accessible to user or kernel code
• kernel regions in upper halfshared by all processes, but accessible only
to kernel code
• Windows on IA32 subdivides kernel region into an unpaged half and a (mostly) paged upper half at 0xC0000000 for page tables and I/O cache.
• Win95/98 used the lower half of system space as a system-wide shared region.
text
data
BSS
user stack
args/env
0
data
kernel textand
kernel data
2n-1
2n-1
0x0
0xffffffff
A VAS for a private address space system (e.g., Unix, NT/XP) executing on a typical 32-bit system (e.g., x86).
sbrk()
jsr
Process and Kernel Address SpacesProcess and Kernel Address Spaces
data
0
2n-1-1
2n-1
2n-1
data
0x7FFFFFFF
0x80000000
0xFFFFFFFF
0x0
n-bit virtual address space
32-bit virtual address space
The OS Directs the MMUThe OS Directs the MMU
The OS controls the operation of the MMU to select:
(1) the subset of possible virtual addresses that are valid for each process (the process virtual address space);
(2) the physical translations for those virtual addresses;
(3) the modes of permissible access to those virtual addresses;
read/write/execute
(4) the specific set of translations in effect at any instant.
need rapid context switch from one address space to another
MMU completes a reference only if the OS “says it’s OK”.MMU raises an exception if the reference is “not OK”.
Example: Windows/IA32Example: Windows/IA32
There is lots more to say about address translation, but we don’t want to spend too much time on it now.
• Each address space has a page directory
• One page: 4K bytes, 1024 4-byte entries (PTEs)
• Each PDIR entry points to a “page table”
• Each “page table” is one page with 1024 PTEs
• each PTE maps one 4K page of the address space
• Each page table maps 4MB of memory: 1024*4K
• One PDIR for a 4GB address space, max 4MB of tables
• Load PDIR base address into a register to activate the VAS
What did we just do?What did we just do?
We used special machine features to “virtualize” a core resource: memory.• Each process/space only gets some of the memory.
• The OS decides how much you get.
• The OS decides what parts of the program and its data are in memory, and what parts you will have to wait for.
• You can’t tell exactly what you have.
• The OS isolates each process from its competitors.
Virtualization involves a clean abstract interface with a level of indirection that enables the system to interpose on important actions, securely and transparently, in order to hide details of the system below the interface.
Mode, Space, and ContextMode, Space, and Context
At any time, the state of each processor (core) is defined by:
1. mode: given by the mode bit(s)Is the CPU executing in the protected kernel or a user program?
2. space: defined by V->P translations currently in effectWhat address space is the CPU running in? Once the system is
booted, it always runs in some virtual address space.
3. context: given by register state and execution streamIs the CPU executing a thread/process, or an interrupt handler?
Where is the stack?
These are important because the mode/space/context determines the meaning and validity of key operations.
VM Internals: Mach/BSD ExampleVM Internals: Mach/BSD Example
start, len,prot
start, len,prot
start, len,prot
start, len,prot
addressspace (task)
vm_maplookupenter
pmap
page table
system-widephys-virtual map
pmap_enter()pmap_remove()
pmap_page_protectpmap_clear_modifypmap_is_modifiedpmap_is_referencedpmap_clear_reference
putpagegetpage
memoryobjects
One pmap (physical map)per virtual address space.
page cells (vm_page_t)array indexed by PFN
Memory ObjectsMemory Objects•Memory objects “virtualize” VM backing storage policy.
• source and sink for pages•triggered by faults
•...or OS eviction policy
• manage their own storage
• external pager has some control:•prefetch
•prewrite
•protect/enable
• can be shared via vm_map()•(Mach extended mmap syscall)
anonymous VM
object->putpage(page)object->getpage(offset, page, mode)
memory object
swappager
vnodepager
externpager
mapped files DSM databasesreliable VM etc.
Windows/NT ProcessesWindows/NT Processes
• A raw NT process is just a virtual address space, a handle table, and an (initially empty) list of threads.
• Processes are themselves objects named by handles, supporting specific operations.
• create threads
• map sections (VM regions)
• NtCreateProcess returns an object handle for the process.• Creator may specify a separate (assignable) “parent” process.
• Inherit VAS from designated parent, or initialize as empty.
• Handles can be inherited; creator controls per-handle inheritance.
Sharing the CPUSharing the CPU
We have seen how an operating system can share and “virtualize” one hardware resource: memory.
How can does an OS share the CPU among multiple running programs (processes)?
• Safely
• Fairly (?)
• Efficiently