Classical Operating Systems: A Quick Tour Jeff Chase.

77
Classical Operating Systems: A Classical Operating Systems: A Quick Tour Quick Tour Jeff Chase

Transcript of Classical Operating Systems: A Quick Tour Jeff Chase.

Classical Operating Systems: A Quick TourClassical Operating Systems: A Quick Tour

Jeff Chase

The Birth of a ProgramThe Birth of a Program

int j;char* s = “hello\n”;

int p() { j = write(1, s, 6); return(j);}

myprogram.c

compiler

…..p: store this store that push jsr _write ret etc.

myprogram.s

assembler data

myprogram.o

linker

object file

data program

(executable file)myprogram

datadatadata

libraries and other

objects

What’s in an Object File or Executable?What’s in an Object File or Executable?

int j = 327;char* s = “hello\n”;char sbuf[512];

int p() { int k = 0; j = write(1, s, 6); return(j);}

text

dataidata

wdata

header

symboltable

relocationrecords

Used by linker; may be removed after final link step and strip.

Header “magic number”indicates type of image.

Section table an arrayof (offset, len, startVA)

program sections

program instructionsp

immutable data (constants)“hello\n”

writable global/static dataj, s

j, s ,p,sbuf

Memory and the CPUMemory and the CPU0

2n

code library

OS data

OS code

Program A

dataData

Program B

Data

registers

CPU

R0

Rn

PC

main memory

x

x

ThreadsThreads

A thread is a schedulable stream of control.defined by CPU register values (PC, SP)

suspend: save register values in memory

resume: restore registers from memory

Multiple threads can execute independently:

They can run in parallel on multiple CPUs...- physical concurrency

…or arbitrarily interleaved on a single CPU.- logical concurrency

Each thread must have its own stack.

A Peek Inside a Running ProgramA Peek Inside a Running Program

0

high

code library

your data

heap

registers

CPU

R0

Rn

PC

“memory”

x

x

your program

common runtime

stack

address space(virtual or physical)

SP

y

y

Two Threads Sharing a CPUTwo Threads Sharing a CPU

reality

concept

context switch

A Program With Two ThreadsA Program With Two Threads

0

high

code library

data

registers

CPU

R0

Rn

PC

“memory”

x

x

program

common runtime

stack

address space

SP

y

y stack

runningthread

“on deck” and ready to run

Thread Context SwitchThread Context Switch

0

high

code library

data

registers

CPU

R0

Rn

PC

“memory”

x

x

program

common runtime

stack

address space

SP

y

y stack

1. save registers

2. load registers

switch in

switch out

Threads vs. ProcessesThreads vs. Processes

1. The process is a kernel abstraction for an independent executing program.

includes at least one “thread of control”

also includes a private address space (VAS)

2. Threads may share a process/address spaceEvery thread must exist within some process VAS.

Processes may be “multithreaded”.

Threads can access memory as enabled in their VAS.

Threads can access memory only as enabled in their VAS.

data

data

Memory Memory Protection Protection

Paging Virtual memory provides protection by:

• Each process (user or OS) has different virtual memory space.

• The OS maintain the page tables for all processes.

• A reference outside the process allocated space cause an exception that lets the OS decide what to do.

• Memory sharing between processes is done via different Virtual spaces but common physical frames.

[Kedem, CPS 104, Fall05]

The Program and the Process VASThe Program and the Process VAS

text

dataidatawdata

header

symboltable

relocationrecords

program

text

data

BSS

user stack

args/envkernel

data

process VAS

sections segments

BSS“Block Started by Symbol”(uninitialized global data)e.g., heap and sbuf go here.

Args/env strings copied in by kernel when the process is created.

Process text segment is initialized directly from program text

section.

Process data segment(s) are

initialized from idata and wdata sections.

Process stack and BSS (e.g., heap) segment(s) are

zero-filled.

Process BSS segment may be expanded at runtime with a system call (e.g., Unix sbrk) called by the heap manager

routines.

Text and idata segments may be write-protected.

Virtual Address TranslationVirtual Address Translation

VPN offset

29 013

Example: typical 32-bitarchitecture with 8KB pages.

addresstranslation

Virtual address translation maps a virtual page number (VPN) to a physical page frame number (PFN): the rest is easy.

PFN

offset

+

00

virtual address

physical address{

Deliver exception toOS if translation is notvalid and accessible inrequested mode.

A Simple Page TableA Simple Page Table

PFN 0PFN 1

PFN i

page #i offset

user virtual address

PFN i+

offset

process page table

physical memorypage frames

In this example, each VPN j maps to PFN j,

but in practice any physical frame may be

used for any virtual page.

Each process/VAS has its own page table.

Virtual addresses are translated relative to

the current page table.

The page tables are themselves stored in memory; a protected

register holds a pointer to the current page table.

Virtual Memory as a CacheVirtual Memory as a Cache

text

dataidatawdata

header

symboltable, etc.

programsections

text

data

BSS

user stack

args/envkernel

data

processsegments

physicalpage frames

virtualmemory

(big)

physicalmemory(small)

executablefile

backingstorage

virtual-to-physical translations

pageout/eviction

page fetch

Completing a VM ReferenceCompleting a VM Reference

raiseexception

probepage table

loadTLB

probe TLB

accessphysicalmemory

accessvalid?

pagefault?

signalprocess

allocateframe

page ondisk?

fetchfrom disk

zero-fillloadTLB

starthere

MMU

OS

Processes and the KernelProcesses and the Kernel

data dataprocesses in private

virtual address spaces

system call traps...and upcalls (e.g.,

signals)

shared kernel code and data

in shared address space

Threads or processes enter the kernel for services.

The kernel sets up process execution contexts to

“virtualize” the machine.

CPU and devices force entry to the kernel to handle exceptional events.

Architectural Foundations of OS KernelsArchitectural Foundations of OS Kernels

• One or more privileged execution modes (e.g., kernel mode)

protected device control registers

privileged instructions to control basic machine functions

• System call trap instruction and protected fault handling

User processes safely enter the kernel to access shared OS services.

• Virtual memory mapping

OS controls virtual-physical translations for each address space.

• Device interrupts to notify the kernel of I/O completion etc.

Includes timer hardware and clock interrupts to periodically return control to the kernel as user code executes.

• Atomic instructions for coordination on multiprocessors

Example: Mac OS XExample: Mac OS X

Classical View: The QuestionsClassical View: The Questions

The basic issues/questions classical OS are how to:

• allocate memory and storage to multiple programs?

• share the CPU among concurrently executing programs?

• suspend and resume programs?

• share data safely among concurrent activities?

• protect one executing program’s storage from another?

• protect the code that implements the protection, and mediates access to resources?

• prevent rogue programs from taking over the machine?

• allow programs to interact safely?

22

The Access Control Model

Object

Resource

Reference monitor

Guard

Do operation

Request

Principal

Source

Authorization

Audit log

Authentication

Policy

1. Isolation boundary2. Access

control 3. Policy

1. Isolation Boundary to prevent attacks outside access-controlled channels

2. Access Control for channel traffic

3. Policy management

[Butler Lampson – Accountability and Freedom]

23

Isolation

• Attacks on:– Program– Isolation– Policy

ServicesBoundary Creator

GUARD

GUARD

policy

policy

Program

Data

guard

Host

• I am isolated if anything that goes wrong is my fault– Actually, my program’s fault

Object

Resource

Reference monitor

Guard

Do operation

Request

Principal

Source

Authorization

Audit log

Authentication

Policy

1. Isolation boundary

2. Access control

3. Policy

[Butler Lampson – Accountability and Freedom]

24

Access Control Mechanisms:The Gold Standard

Authenticate principals: Who made a request Mainly people, but also channels, servers, programs

(encryption implements channels, so key is a principal)

Authorize access: Who is trusted with a resource Group principals or resources, to simplify management

Can define by a property, e.g. “type-safe” or “safe for scripting”

Audit: Who did what when?

• Lock = Authenticate + Authorize• Deter = Authenticate + Audit

Object

Resource

Reference monitor

Guard

Do operation

Request

Principal

Source

Authorization

Audit log

Authentication

Policy

1. Isolation boundary

2. Access control

3. Policy

[Butler Lampson – Accountability and Freedom]

Kernel ModeKernel Mode0

2n

code library

OS data

OS code

Program A

dataData

Program B

Data

registers

CPU

R0

Rn

PC

main memory

x

x

mode

CPU mode (a field in some status

register) indicates whether the CPU is

running in a user program or in the protected kernel.

Some instructions or register accesses are only legal when the CPU is executing in

kernel mode.

physical address space

The KernelThe Kernel• Today, all “real” operating systems have protected kernels.

The kernel resides in a well-known file: the “machine” automatically loads it into memory (boots) on power-on/reset.

Our “kernel” is called the executive in some systems (e.g., XP).

• The kernel is (mostly) a library of service procedures shared by all user programs, but the kernel is protected:

User code cannot access internal kernel data structures directly, and it can invoke the kernel only at well-defined entry points (system calls).

• Kernel code is like user code, but the kernel is privileged:

The kernel has direct access to all hardware functions, and defines the machine entry points for interrupts and exceptions.

Protecting Entry to the KernelProtecting Entry to the Kernel

Protected events and kernel mode are the architectural foundations of kernel-based OS (Unix, XP, etc).

• The machine defines a small set of exceptional event types.

• The machine defines what conditions raise each event.

• The kernel installs handlers for each event at boot time.

e.g., a table in kernel memory read by the machine

The machine transitions to kernel mode only on an exceptional event.

The kernel defines the event handlers.

Therefore the kernel chooses what code will execute in kernel mode, and when.

user

kernel

interrupt orexceptiontrap/return

The Role of EventsThe Role of Events

A CPU event is an “unnatural” change in control flow.Like a procedure call, an event changes the PC.

Also changes mode or context (current stack), or both.

Events do not change the current space!

The kernel defines a handler routine for each event type.Event handlers always execute in kernel mode.

The specific types of events are defined by the machine.

Once the system is booted, every entry to the kernel occurs as a result of an event.

In some sense, the whole kernel is a big event handler.

CPU Events: Interrupts and ExceptionsCPU Events: Interrupts and Exceptions

An interrupt is caused by an external event.device requests attention, timer expires, etc.

An exception is caused by an executing instruction.CPU requires software intervention to handle a fault or trap.

unplanned deliberatesync fault syscall trapasync interrupt AST

control flow

event handler (e.g., ISR: Interrupt Service

Routine)

exception.cc

AST: Asynchronous System TrapAlso called a software interrupt or an Asynchronous or Deferred Procedure Call (APC or DPC)

Note: different “cultures” may use some of these terms (e.g., trap, fault, exception, event, interrupt) slightly differently.

Protection Rings

Kernel/user mode not enough!

Need a mode for hypervisor

X86 rings Has built in security

levels (Rings 0, 1, 2, 3) Ring 0 – OS Software

(most privileged) Ring 3 – User software Ring 1 & 2 – Not used

Xen guest OS executes on e.g. Ring 1

Increasing Privilege Level

Ring 0

Ring 1

Ring 2

Ring 3

[Fischbach]

Example: System Call TrapsExample: System Call Traps

User code invokes kernel services by initiating system call traps.

• Programs in C, C++, etc. invoke system calls by linking to a standard library of procedures written in assembly language.

the library defines a stub or wrapper routine for each syscall

stub executes a special trap instruction (e.g., chmk or callsys or int)

syscall arguments/results passed in registers or user stack

read() in Unix libc.a library (executes in user mode):

#define SYSCALL_READ 27 # code for a read system call

move arg0…argn, a0…an # syscall args in registers A0..ANmove SYSCALL_READ, v0 # syscall dispatch code in V0callsys # kernel trapmove r1, _errno # errno = return statusreturn

Alpha CPU architecture

FaultsFaultsFaults are similar to system calls in some respects:

• Faults occur as a result of a process executing an instruction.

Fault handlers execute on the process kernel stack; the fault handler may block (sleep) in the kernel.

• The completed fault handler may return to the faulted context.

But faults are different from syscall traps in other respects:

• Syscalls are deliberate, but faults are “accidents”.

divide-by-zero, dereference invalid pointer, memory page fault

• Not every execution of the faulting instruction results in a fault.

may depend on memory state or register contents

Processes and the KernelProcesses and the Kernel

data dataprocesses in private

virtual address spaces

system call traps...and upcalls (e.g.,

signals)

shared kernel code and data

in shared address space

Threads or processes enter the kernel for services.

The kernel sets up process execution contexts to

“virtualize” the machine.

CPU and devices force entry to the kernel to handle exceptional events.

Process InternalsProcess Internals

+ +user ID

process IDparent PID

sibling linkschildren

virtual address space process descriptor (PCB)

resources

thread

stack

Each process has a thread bound to the VAS.

The thread has a saved user context as well as a system

context.

The kernel can manipulate the user context to start the

thread in user mode wherever it wants.

Process state includes a file descriptor table, links to maintain the process tree, and a

place to store the exit status.

The address space is represented by page

table, a set of translations to physical

memory allocated from a kernel memory manager.

The kernel must initialize the process

memory with the program image to run.

Kernel Stacks and Trap/Fault HandlingKernel Stacks and Trap/Fault Handling

data

Processes execute user

code on a user stack in the user

portion of the process virtual address space.

Each process has a second kernel stack in kernel space (the kernel portion of the

address space).

stack

stack

stack

stack

System calls and faults run in kernel mode on the process kernel stack.

syscall dispatch

table

System calls run in the process

space, so copyin and copyout can

access user memory.

The syscall trap handler makes an indirect call through the system call dispatch table to the handler for the specific system call.

The Classical OS Model in UnixThe Classical OS Model in Unix

A Lasting Achievement?A Lasting Achievement?

“Perhaps the most important achievement of Unix is to demonstrate that a powerful operating system for interactive use need not be expensive…it can run on hardware costing as little as $40,000.”

The UNIX Time-Sharing System* D. M. Ritchie and K. Thompson

DEC PDP-11/24

http://histoire.info.online.fr/pdp11.html

Elements of the UnixElements of the Unix

1. rich model for IPC and I/O: “everything is a file”file descriptors: most/all interactions with the outside world are

through system calls to read/write from file descriptors, with a unified set of syscalls for operating on open descriptors of different types.

2. simple and powerful primitives for creating and initializing child processes

fork: easy to use, expensive to implement

Command shell is an “application” (user mode)

3. general support for combining small simple programs to perform complex tasks

standard I/O and pipelines

The ShellThe Shell

The Unix command interpreters run as ordinary user processes with no special privilege.

This was novel at the time Unix was created: other systems viewed the command interpreter as a trusted part of the OS.

Users may select from a range of interpreter programs available, or even write their own (to add to the confusion).

csh, sh, ksh, tcsh, bash: choose your flavor...or use perl.

Shells use fork/exec/exit/wait to execute commands composed of program filenames, args, and I/O redirection symbols.

Shells are general enough to run files of commands (scripts) for more complex tasks, e.g., by redirecting shell’s stdin.

Shell’s behavior is guided by environment variables.

Using the shellUsing the shell

• Commands: ls, cat, and all that

• Current directory: cd and pwd

• Arguments: echo

• Signals: ctrl-c

• Job control, foreground, and background: &, ctrl-z, bg, fg

• Environment variables: printenv and setenv

• Most commands are programs: which, $PATH, and /bin

• Shells are commands: sh, csh, ksh, tcsh, bash

• Pipes and redirection: ls | grep a

• Files and I/O: open, read, write, lseek, close

• stdin, stdout, stderr

• Users and groups: whoami, sudo, groups

Other application programs

cc

Other application programs

Hardware

Kernel

sh who a.out

date

wc

grepedvi

ld

as

comp

cppnroff

Questions about ProcessesQuestions about Processes

A process is an execution of a program within a private virtual address space (VAS).

1. What are the system calls to operate on processes?

2. How does the kernel maintain the state of a process?Processes are the “basic unit of resource grouping”.

3. How is the process virtual address space laid out?What is the relationship between the program and the process?

4. How does the kernel create a new process?How to allocate physical memory for processes?

How to create/initialize the virtual address space?

Process CreationProcess CreationTwo ways to create a process

• Build a new empty process from scratch

• Copy an existing process and change it appropriately

Option 1: New process from scratch

• StepsLoad specified code and data into memory;

Create empty call stack

Create and initialize PCB (make look like context-switch)

Put process on ready list

• Advantages: No wasted work

• Disadvantages: Difficult to setup process correctly and to express all possible options

Process permissions, where to write I/O, environment variables

Example: WindowsNT has call with 10 arguments

[Remzi Arpaci-Dusseau]

Process CreationProcess CreationOption 2: Clone existing process and change

• Example: Unix fork() and exec()Fork(): Clones calling process

Exec(char *file): Overlays file image on calling process

• Fork() Stop current process and save its state

Make copy of code, data, stack, and PCB

Add new PCB to ready list

Any changes needed to PCB?

• Exec(char *file)Replace current data and code segments with those in specified file

• Advantages: Flexible, clean, simple

• Disadvantages: Wasteful to perform copy and then overwrite of memory

[Remzi Arpaci-Dusseau]

Process Creation in UnixProcess Creation in Unix

int pid;int status = 0;

if (pid = fork()) {/* parent */…..pid = wait(&status);

} else {/* child */…..exit(status);

}

Parent uses wait to sleep until the child exits; wait returns child pid and status.

Wait variants allow wait on a specific child, or notification of stops and other signals.

The fork syscall returns twice: it returns a zero to the child and the child process ID (pid) to the parent.

Unix Unix Fork/Exec/Exit/WaitFork/Exec/Exit/Wait Example Example

fork parent fork child

wait exit

int pid = fork();Create a new process that is a clone of its parent.

exec*(“program” [, argvp, envp]);Overlay the calling process virtual memory with a new program, and transfer control to it.

exit(status);Exit with status, destroying the process. Note: this is not the only way for a process to exit!

int pid = wait*(&status);Wait for exit (or other status change) of a child, and “reap” its exit status. Note: child may have exited before parent calls wait!

exec initialize child context

How are Unix shells implemented?How are Unix shells implemented?while (1) {

Char *cmd = getcmd();

int retval = fork();

if (retval == 0) {

// This is the child process

// Setup the child’s process environment here

// E.g., where is standard I/O, how to handle signals?

exec(cmd);

// exec does not return if it succeeds

printf(“ERROR: Could not execute %s\n”, cmd);

exit(1);

} else {

// This is the parent process; Wait for child to finish

int pid = retval;

wait(pid);

}

}

[Remzi Arpaci-Dusseau]

The Concept of ForkThe Concept of Fork

fork creates a child process that is a clone of the parent.

• Child has a (virtual) copy of the parent’s virtual memory.

• Child is running the same program as the parent.

• Child inherits open file descriptors from the parent.

(Parent and child file descriptors point to a common entry in the system open file table.)

• Child begins life with the same register values as parent.

The child process may execute a different program in its context with a separate exec() system call.

What’s So Cool About What’s So Cool About ForkFork

1. fork is a simple primitive that allows process creation without troubling with what program to run, args, etc.

Serves the purpose of “lightweight” processes (like threads?).

2. fork gives the parent program an opportunity to initialize the child process…e.g., the open file descriptors.

Unix syscalls for file descriptors operate on the current process.

Parent program running in child process context may open/close I/O and IPC objects, and bind them to stdin, stdout, and stderr.

Also may modify environment variables, arguments, etc.

3. Using the common fork/exec sequence, the parent (e.g., a command interpreter or shell) can transparently cause children to read/write from files, terminal windows, network connections, pipes, etc.

Unix File DescriptorsUnix File Descriptors

Unix processes name I/O and IPC objects by integers known as file descriptors. • File descriptors 0, 1, and 2 are reserved by convention

for standard input, standard output, and standard error.“Conforming” Unix programs read input from stdin, write

output to stdout, and errors to stderr by default.

• Other descriptors are assigned by syscalls to open/create files, create pipes, or bind to devices or network sockets.pipe, socket, open, creat

• A common set of syscalls operate on open file descriptors independent of their underlying types.read, write, dup, close

Unix File Descriptors IllustratedUnix File Descriptors Illustrated

user space

File descriptors are a special case of kernel object handles.

pipe

file

socket

process filedescriptor

table

kernel

system open file table

tty

Disclaimer: this drawing is

oversimplified.

The binding of file descriptors to objects is specific to each process, like the virtual translations in the virtual address space.

Kernel Object HandlesKernel Object Handles

Instances of kernel abstractions may be viewed as “objects” named by protected handles held by processes.• Handles are obtained by create/open calls, subject to security

policies that grant specific rights for each handle.

• Any process with a handle for an object may operate on the object using operations (system calls).Specific operations are defined by the object’s type.

• The handle is an integer index to a kernel table.

port

file

etc.object

handles

user space kernel

Microsoft Windows object handles Unix file descriptors

Unix PhilosophyUnix Philosophy

Rule of Modularity: Write simple parts connected by clean interfaces.Rule of Composition: Design programs to be connected to other

programs.Rule of Separation: Separate policy from mechanism; separate interfaces

from engines.Rule of Representation: Fold knowledge into data so program logic can

be stupid and robust.Rule of Transparency: Design for visibility to make inspection and

debugging easier.Rule of Repair: When you must fail, fail noisily and as soon as possibleRule of Extensibility: Design for the future, because it will be here

sooner than you think.Rule of Robustness: Robustness is the child of transparency and

simplicity.

[Eric Raymond]

Unix Philosophy: SimplicityUnix Philosophy: Simplicity

Rule of Economy: Programmer time is expensive; conserve it in preference to machine time.

Rule of Clarity: Clarity is better than cleverness.Rule of Simplicity: Design for simplicity; add complexity only where

you must.Rule of Parsimony: Write a big program only when it is clear by

demonstration that nothing else will do.Rule of Generation: Avoid hand-hacking; write programs to write

programs when you can.Rule of Optimization: Prototype before polishing. Get it working before

you optimize it.

[Eric Raymond]

Unix Philosophy: InterfacesUnix Philosophy: Interfaces

Rule of Least Surprise: In interface design, always do the least surprising thing.

Rule of Silence: When a program has nothing surprising to say, it should say nothing.

[Eric Raymond]

Rule of Diversity: Distrust all claims for “one true way”.

Introduction to Virtual AddressingIntroduction to Virtual Addressing

text

data

BSS

user stack

args/envkernel

data

virtualmemory

(big?)

physicalmemory(small?)

virtual-to-physical translations

User processes address memory through virtual

addresses.

The kernel and the machine collude to

translate virtual addresses to

physical addresses.

The kernel controls the virtual-physical translations in effect

for each space.

The machine does not allow a user process to access memory unless the kernel “says it’s OK”.

The specific mechanisms for implementing virtual address translation

are machine-dependent.

Multics (1965-2000)Multics (1965-2000)

• “Multiplexed 24x7 computer utility”

Multi-user “time-sharing”, interactive and batch

Processes, privacy, security, sharing, accounting

“Decentralized system programming”

• Virtual memory with “automatic page turning”

• “Two-dimensional VM” with segmentation

Avoids “complicated overlay techniques”

Modular sharing and dynamic linking

• Hierarchical file system with symbolic names, automatic backup

“Single-level store”

• Dawn of “systems research” as an academic enterprise

Multics ConceptsMultics Concepts

• Segments as granularity of sharingProtection rings and the segment protection level

• Segments have symbolic names in a hierarchical name space

• Segments are “made known” before access from a process.Apply access control at this point.

• Segmentation is a cheap way to extend the address space.Note: paging mechanisms are independent of segmentation.

• Dynamic linking resolves references across segments.

How does the Unix philosophy differ?

Segmentation (1)Segmentation (1)

One-dimensional address space

growing tables

tables may bump

[Tanenbaum]

Variable PartitioningVariable Partitioning

Variable partitioning is the strategy of parking differently sized carsalong a street with no marked parking space dividers.

Wasted space from externalfragmentation

Fixed PartitioningFixed Partitioning

Wasted space from internal fragmentation

Segmentation with Paging: MULTICS (1)Segmentation with Paging: MULTICS (1)

Descriptor segment points to page tables

Segment descriptor – numbers are field lengths

Segmentation with Paging: MULTICS (2)Segmentation with Paging: MULTICS (2)

A 34-bit MULTICS virtual address

Segmentation with Paging: MULTICS (3)Segmentation with Paging: MULTICS (3)

Conversion of a 2-part MULTICS address into a main memory address

[Tanenbaum]

The Virtual Address SpaceThe Virtual Address Space

A typical process VAS space includes:• user regions in the lower half

V->P mappings specific to each process

accessible to user or kernel code

• kernel regions in upper halfshared by all processes, but accessible only

to kernel code

• Windows on IA32 subdivides kernel region into an unpaged half and a (mostly) paged upper half at 0xC0000000 for page tables and I/O cache.

• Win95/98 used the lower half of system space as a system-wide shared region.

text

data

BSS

user stack

args/env

0

data

kernel textand

kernel data

2n-1

2n-1

0x0

0xffffffff

A VAS for a private address space system (e.g., Unix, NT/XP) executing on a typical 32-bit system (e.g., x86).

sbrk()

jsr

Process and Kernel Address SpacesProcess and Kernel Address Spaces

data

0

2n-1-1

2n-1

2n-1

data

0x7FFFFFFF

0x80000000

0xFFFFFFFF

0x0

n-bit virtual address space

32-bit virtual address space

The OS Directs the MMUThe OS Directs the MMU

The OS controls the operation of the MMU to select:

(1) the subset of possible virtual addresses that are valid for each process (the process virtual address space);

(2) the physical translations for those virtual addresses;

(3) the modes of permissible access to those virtual addresses;

read/write/execute

(4) the specific set of translations in effect at any instant.

need rapid context switch from one address space to another

MMU completes a reference only if the OS “says it’s OK”.MMU raises an exception if the reference is “not OK”.

Example: Windows/IA32Example: Windows/IA32

There is lots more to say about address translation, but we don’t want to spend too much time on it now.

• Each address space has a page directory

• One page: 4K bytes, 1024 4-byte entries (PTEs)

• Each PDIR entry points to a “page table”

• Each “page table” is one page with 1024 PTEs

• each PTE maps one 4K page of the address space

• Each page table maps 4MB of memory: 1024*4K

• One PDIR for a 4GB address space, max 4MB of tables

• Load PDIR base address into a register to activate the VAS

Top-level page table

[from Tanenbaum]

32 bit address with 2 page table fields

Two-level page tables

What did we just do?What did we just do?

We used special machine features to “virtualize” a core resource: memory.• Each process/space only gets some of the memory.

• The OS decides how much you get.

• The OS decides what parts of the program and its data are in memory, and what parts you will have to wait for.

• You can’t tell exactly what you have.

• The OS isolates each process from its competitors.

Virtualization involves a clean abstract interface with a level of indirection that enables the system to interpose on important actions, securely and transparently, in order to hide details of the system below the interface.

Mode, Space, and ContextMode, Space, and Context

At any time, the state of each processor (core) is defined by:

1. mode: given by the mode bit(s)Is the CPU executing in the protected kernel or a user program?

2. space: defined by V->P translations currently in effectWhat address space is the CPU running in? Once the system is

booted, it always runs in some virtual address space.

3. context: given by register state and execution streamIs the CPU executing a thread/process, or an interrupt handler?

Where is the stack?

These are important because the mode/space/context determines the meaning and validity of key operations.

VM Internals: Mach/BSD ExampleVM Internals: Mach/BSD Example

start, len,prot

start, len,prot

start, len,prot

start, len,prot

addressspace (task)

vm_maplookupenter

pmap

page table

system-widephys-virtual map

pmap_enter()pmap_remove()

pmap_page_protectpmap_clear_modifypmap_is_modifiedpmap_is_referencedpmap_clear_reference

putpagegetpage

memoryobjects

One pmap (physical map)per virtual address space.

page cells (vm_page_t)array indexed by PFN

Memory ObjectsMemory Objects•Memory objects “virtualize” VM backing storage policy.

• source and sink for pages•triggered by faults

•...or OS eviction policy

• manage their own storage

• external pager has some control:•prefetch

•prewrite

•protect/enable

• can be shared via vm_map()•(Mach extended mmap syscall)

anonymous VM

object->putpage(page)object->getpage(offset, page, mode)

memory object

swappager

vnodepager

externpager

mapped files DSM databasesreliable VM etc.

Windows/NT ProcessesWindows/NT Processes

• A raw NT process is just a virtual address space, a handle table, and an (initially empty) list of threads.

• Processes are themselves objects named by handles, supporting specific operations.

• create threads

• map sections (VM regions)

• NtCreateProcess returns an object handle for the process.• Creator may specify a separate (assignable) “parent” process.

• Inherit VAS from designated parent, or initialize as empty.

• Handles can be inherited; creator controls per-handle inheritance.

Sharing the CPUSharing the CPU

We have seen how an operating system can share and “virtualize” one hardware resource: memory.

How can does an OS share the CPU among multiple running programs (processes)?

• Safely

• Fairly (?)

• Efficiently

Sharing DisksSharing Disks

How should the OS mediate/virtualize/share the disk(s) among multiple users or programs?

• Safely

• Fairly

• Securely

• Efficiently

• Effectively

• Robustly