Scheduling and Dispatch

Instructor: Hengming Zou, Ph.D.

In Pursuit of Absolute Simplicity 求于至简,归于永恒

Content

4.1. The Concept of Processes and Threads

4.2. Windows Processes and Threads

4.3. Windows Process and Thread Internals

4.4. Windows OS Thread Scheduling

4.5. Advanced Windows Scheduling

Process Concept

An operating system executes programs:

– Batch system – jobs

– Time-shared systems – user programs or tasks

Process – a program in execution

– Process execution must progress sequentially

A process includes:

– CPU state (one or multiple threads)

– Text & data section

– Resources such as open files, handles, sockets

Process Concept

Traditionally, the process was the unit of scheduling (i.e., there were no threads)

However, like most modern operating systems, Windows schedules threads

Our discussion assumes thread scheduling

Thread States

Five-state diagram for thread scheduling:

– init: The thread is being created

– ready: The thread is waiting to be assigned to a CPU

– running: The thread’s instructions are being executed

– waiting: The thread is waiting for some event to occur

– terminated: The thread has finished execution

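Thread States

[Figure: thread state diagram. Transitions: init → ready (admitted); ready → running (scheduler dispatch); running → ready (interrupt, quantum expired); running → waiting (waiting for I/O or event); waiting → ready (I/O or event completion); running → terminated (exit)]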
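
The five states and their transitions can be written down as a small lookup table. The following is only an illustrative sketch: the names `ThreadState`, `TRANSITIONS`, `next_state`, and `run_events` are hypothetical and do not correspond to any Windows data structure.

```python
from enum import Enum


class ThreadState(Enum):
    INIT = "init"
    READY = "ready"
    RUNNING = "running"
    WAITING = "waiting"
    TERMINATED = "terminated"


# Allowed transitions of the five-state diagram, keyed by (state, event).
TRANSITIONS = {
    (ThreadState.INIT, "admitted"): ThreadState.READY,
    (ThreadState.READY, "scheduler dispatch"): ThreadState.RUNNING,
    (ThreadState.RUNNING, "quantum expired"): ThreadState.READY,
    (ThreadState.RUNNING, "wait for I/O or event"): ThreadState.WAITING,
    (ThreadState.WAITING, "I/O or event completion"): ThreadState.READY,
    (ThreadState.RUNNING, "exit"): ThreadState.TERMINATED,
}


def next_state(state, event):
    """Return the state reached from `state` on `event`, or raise ValueError."""
    try:
        return TRANSITIONS[(state, event)]
    except KeyError:
        raise ValueError(
            f"illegal transition from {state.value!r} on {event!r}"
        ) from None


def run_events(events, state=ThreadState.INIT):
    """Apply a sequence of events, starting in the init state by default."""
    for event in events:
        state = next_state(state, event)
    return state
```

Note that a waiting thread cannot be dispatched directly: it has to pass through the ready state first, which is why `(WAITING, "scheduler dispatch")` is absent from the table.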

Process and Thread Control Blocks

Information associated with process: Process Control Block (PCB)

– Memory management information

– Accounting information

– Process-global vs. thread-specific

Information associated with thread: Thread Control Block (TCB)

– Program counter

– CPU registers

– CPU scheduling information

– Pending I/O information

Process Control Block (PCB)

The Windows implementation of the PCB is split across multiple data structures

[Figure: a Process Control Block (PCB) linked to a list of Thread Control Blocks (TCBs). Fields shown: Process ID (PID), Parent PID, Image File Name, Handle Table, List of open files, List of Thread Control Blocks, Next Process Block, Program Counter, Registers, Next TCB]

CPU Switch from Thread to Thread

[Figure: Thread T1 executes until an interrupt or system call; its state is saved into TCB1 and state is reloaded from TCB2. Thread T2 then executes while T1 is ready or waiting. On the next interrupt or system call, state is saved into TCB2 and reloaded from TCB1, and T1 executes again while T2 is ready or waiting]

Context Switch

Save the state of the old thread and load the saved state for the new thread

Context-switch time is overhead

Thread context-switch can be implemented in kernel or user mode

Interaction with MMU is required when switching between threads in different processes
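
The save-and-reload step can be illustrated with a toy model. This is a sketch under simplifying assumptions: the `TCB` and `CPU` classes and the `context_switch` function are hypothetical, the register file is a plain dictionary, and no MMU interaction is modelled.

```python
from dataclasses import dataclass, field


@dataclass
class TCB:
    """Toy thread control block holding the saved CPU state of one thread."""
    name: str
    program_counter: int = 0
    registers: dict = field(default_factory=dict)


@dataclass
class CPU:
    """Toy CPU: a program counter plus a register file."""
    program_counter: int = 0
    registers: dict = field(default_factory=dict)


def context_switch(cpu, old, new):
    """Save the CPU state into `old`, then load the saved state of `new`."""
    # Save the state of the old thread into its TCB.
    old.program_counter = cpu.program_counter
    old.registers = dict(cpu.registers)
    # Reload the saved state of the new thread from its TCB.
    cpu.program_counter = new.program_counter
    cpu.registers = dict(new.registers)
```

Everything done inside `context_switch` is pure overhead from the point of view of the threads, which is why the cost of this step matters when choosing a quantum.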

Thread Scheduling Queues

Ready queue

– Maintains set of all threads ready and waiting to execute

– There might be multiple ready queues, sorted by priorities

Device queue

– Maintains set of threads waiting for an I/O device

– There might be multiple queues for different devices

Threads migrate between the various queues

Ready Queue and I/O Device Queues

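[Figure: queueing diagram. Threads are dispatched from the ready queue to the CPU. A thread that requests I/O enters the queue of I/O device 1 … n and waits there until the I/O occurs, then returns to the ready queue. A time-out returns the running thread to the ready queue; a finished thread is released]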

Optimization Criteria

CPU scheduling uses heuristics to manage the tradeoffs among contradicting optimization criteria.

Schedulers are optimized for certain workloads

– Interactive vs. batch processing

– I/O-intense vs. compute-intense

Common Optimization Criteria

Maximize CPU utilization

Maximize throughput

Minimize turnaround time

Minimize waiting time

Minimize response time

Basic Scheduling Considerations

What invokes the scheduler?

Which assumptions should a scheduler rely on?

What are its optimization goals?

Rationale:

– Multiprogramming maximizes CPU utilization

– Thread execution experiences cycles of compute- and I/O-bursts

– Scheduler should consider CPU burst distribution

Alternating Sequence of CPU and I/O Bursts

[Figure: a thread alternates between CPU bursts (load val, inc val, read file; inc count, add data, val, write file; load val, inc val, read from file) and I/O bursts (wait for I/O)]

Threads can be described as:

I/O-bound – spends more time doing I/O than computations

– many short CPU bursts

CPU-bound – spends more time doing computations

– few very long CPU bursts

Histogram of CPU-burst Times

[Figure: histogram of CPU-burst times; x-axis: burst duration (msec), 0 to 30; y-axis: distribution]

Many short CPU bursts are typical

Exact figures vary greatly by process and computer

Schedulers

Long-term scheduler (or job scheduler)

– Selects which processes, with their threads, should be brought into the ready queue

– Takes memory management (MM) into consideration (swapped-out processes)

– Controls degree of multiprogramming

– Invoked infrequently, may be slow

Short-term scheduler (or CPU scheduler)

– Selects which thread should be executed next and allocates the CPU to it

– Invoked frequently, must be fast

Windows has no dedicated long-term scheduler

CPU Scheduler

Selects from among the threads in memory that are ready to execute, and allocates the CPU to one of them

CPU scheduling decisions may take place when a thread:

– 1. Switches from running to waiting state

– 2. Switches from running to ready state

– 3. Switches from waiting to ready state

– 4. Terminates

Scheduling under 1 and 4 is nonpreemptive

All other scheduling is preemptive

Dispatcher

Dispatcher module gives control of CPU to the thread selected by the short-term scheduler; this involves:

– switch context

– switch to user mode

– jump to proper location in user program to restart that program

Dispatch latency – time it takes for the dispatcher to stop one thread and start another running

Windows scheduling is event-driven

– No central dispatcher module in the kernel

Scheduling Algorithms: FIFO

First-In, First-Out

Also known as First-Come, First-Served (FCFS)

Thread   Burst Time

T1       20

T2       5

T3       4

Suppose threads arrive in the order: T1, T2, T3

– The Gantt chart for the schedule is:

Scheduling Algorithms: FIFO

Waiting time for T1 = 0; T2 = 20; T3 = 25

Average waiting time: (0 + 20 + 25)/3 = 15

Convoy effect:

– short threads behind long threads experience long waiting times

Gantt chart: T1 (0–20), T2 (20–25), T3 (25–29)

FIFO Scheduling (Cont.)

Suppose that the threads arrive in the order T2, T3, T1.

The Gantt chart for the schedule is:

T2 (0–5), T3 (5–9), T1 (9–29)

Waiting time for T1 = 9; T2 = 0; T3 = 5

Average waiting time: (9 + 0 + 5)/3 ≈ 4.67

Much better than previous case
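
The waiting times for both arrival orders can be reproduced with a few lines of code. This is a minimal sketch that assumes all threads arrive at time 0 and run to completion in arrival order; the function names `fcfs_waiting_times` and `average` are made up for this illustration.

```python
def fcfs_waiting_times(bursts):
    """Return {thread: waiting time} for FIFO/FCFS scheduling.

    `bursts` is a list of (name, burst_time) pairs in arrival order;
    every thread is assumed to arrive at time 0.
    """
    waits = {}
    clock = 0
    for name, burst in bursts:
        waits[name] = clock   # the thread waited until the CPU became free
        clock += burst        # then ran its whole burst without preemption
    return waits


def average(values):
    values = list(values)
    return sum(values) / len(values)


first_order = fcfs_waiting_times([("T1", 20), ("T2", 5), ("T3", 4)])
second_order = fcfs_waiting_times([("T2", 5), ("T3", 4), ("T1", 20)])
print(first_order, average(first_order.values()))
print(second_order, round(average(second_order.values()), 2))
```

With the long thread first, the average waiting time is 15; with the short threads first, it drops to 14/3, or about 4.67. The only thing that changed is the arrival order, which is the convoy effect in miniature.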

Scheduling Algorithms: Round Robin (RR)

Preemptive version of FIFO scheduling algorithm

Each thread gets a small unit of CPU time (quantum), usually 10-100 milliseconds

After this time has elapsed, the thread is preempted and added to the end of the ready queue

Each of n ready threads gets 1/n of the CPU time, in chunks of at most q time units (one quantum) at once

With n threads, no thread waits more than (n-1)q time units

Scheduling Algorithms: Round Robin (RR)

Performance

– q large: RR behaves like FIFO

– q small: q must still be large with respect to the context-switch time, otherwise overhead is too high

Example of RR with Quantum = 10

Assume we have:

Thread   Burst Time

T1       23

T2       7

T3       38

T4       14

Assuming all threads have the same priority, the Gantt chart is:

T1 (0–10), T2 (10–17), T3 (17–27), T4 (27–37), T1 (37–47), T3 (47–57), T4 (57–61), T1 (61–64), T3 (64–74), T3 (74–82)
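
The Gantt chart for this example can be derived mechanically. The sketch below assumes that all threads are ready at time 0, that context-switch time is zero, and that a preempted thread goes to the end of the ready queue; `round_robin` is a hypothetical helper, not a Windows routine.

```python
from collections import deque


def round_robin(bursts, quantum):
    """Simulate round-robin scheduling.

    `bursts` is a list of (name, burst_time) pairs in ready-queue order.
    Returns the Gantt chart as a list of (name, start, end) slices.
    """
    queue = deque(bursts)          # entries are (name, remaining time)
    clock = 0
    chart = []
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)
        chart.append((name, clock, clock + run))
        clock += run
        if remaining > run:
            # Quantum expired: the thread goes to the end of the ready queue.
            queue.append((name, remaining - run))
    return chart


for name, start, end in round_robin(
    [("T1", 23), ("T2", 7), ("T3", 38), ("T4", 14)], quantum=10
):
    print(f"{name}: {start}-{end}")
```

The simulated slice boundaries are 0, 10, 17, 27, 37, 47, 57, 61, 64, 74, 82. T3 receives two slices in a row at the end because it is the only thread left in the queue.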

Example of RR with Quantum = 10

Round-Robin favors CPU-intense over I/O-intense threads

Priority-elevation after I/O completion can compensate for this

Windows uses Round-Robin with a priority-elevation scheme

Round Robin Performance

Shorter quantum yields more context switches

Longer quantum yields shorter average turnaround time

Thread execution time: 15

Quantum   Context switches

20        0

10        1

1         14
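
The context-switch counts for a single thread follow from a one-line formula: the thread is interrupted once for every quantum that expires before it finishes. A small sketch, with `context_switches` as a hypothetical helper name:

```python
import math


def context_switches(execution_time, quantum):
    """Number of quantum expirations a thread experiences before it finishes."""
    return math.ceil(execution_time / quantum) - 1


for q in (20, 10, 1):
    print(q, context_switches(15, q))
```

For an execution time of 15, a quantum of 20 gives 0 switches, a quantum of 10 gives 1, and a quantum of 1 gives 14.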

Scheduling Algorithms: Priority Scheduling

A priority number (integer) is associated with each thread

CPU is allocated to the thread with the highest priority

– Preemptive

– Non-preemptive

Priority Scheduling - Starvation

Starvation is a problem:

– low priority threads may never execute

Solutions:

– 1) Decreasing priority & aging: the Unix approach

    Decrease priority of CPU-intense threads

    Exponential averaging of CPU usage to slowly increase priority of blocked threads

– 2) Priority elevation: the Windows/VMS approach

    Increase priority of a thread on I/O completion

    System gives starved threads an extra burst
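
Starvation and its cure can be demonstrated with a deliberately simplified model. The sketch below is neither the Unix nor the Windows algorithm: it assumes that every thread is always ready, that a higher number means higher priority, and that each waiting thread gains `aging_step` priority per scheduling round while the chosen thread falls back to its base priority. The function `schedule` and its parameters are hypothetical.

```python
def schedule(base_priorities, rounds, aging_step=0):
    """Pick the highest-priority thread in each round.

    base_priorities: dict mapping thread name to base priority
                     (higher number = higher priority).
    aging_step:      priority gained per round by every thread that was
                     not chosen; 0 disables aging.
    Returns the list of chosen thread names, one per round.
    """
    current = dict(base_priorities)
    order = []
    for _ in range(rounds):
        # On a tie, max() keeps the first maximal key in dict order.
        chosen = max(current, key=current.get)
        order.append(chosen)
        for name in current:
            if name == chosen:
                current[name] = base_priorities[name]  # back to base priority
            else:
                current[name] += aging_step            # waiting threads age
    return order


threads = {"high": 10, "low": 1}
print(schedule(threads, rounds=12))                 # "low" never runs
print(schedule(threads, rounds=12, aging_step=1))   # "low" eventually runs
```

Without aging, the low-priority thread is never chosen. With an aging step of 1, its priority creeps upward until it exceeds that of the high-priority thread, at which point it gets the CPU once and drops back to its base priority.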

Multilevel Queue

Ready queue is partitioned into separate queues:

– Real-time (system, multimedia)

– Interactive

Each queue may have its own scheduling algorithm

– Real-Time – RR

– Interactive – RR + priority-elevation + quantum stretching

Multilevel Queue

Scheduling must be done between the queues

Fixed priority scheduling (i.e., serve all real-time threads first, then interactive threads)

– Possibility of starvation

Time slice – each queue gets a certain amount of CPU time which it can schedule amongst its threads

– CPU reserves
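
Fixed-priority scheduling between queues amounts to scanning the queues from highest to lowest priority and taking the first ready thread found. The following sketch shows only this policy, not the time-slice variant; the class `MultilevelQueue` and its methods are hypothetical names.

```python
from collections import deque


class MultilevelQueue:
    """Ready queue partitioned into levels with fixed priority between them."""

    def __init__(self, levels):
        # `levels` lists the queue names from highest to lowest priority.
        # Dicts preserve insertion order, so iteration follows that order.
        self.queues = {level: deque() for level in levels}

    def add(self, level, thread):
        self.queues[level].append(thread)

    def pick_next(self):
        """Return the next thread to run, or None if all queues are empty."""
        for queue in self.queues.values():
            if queue:
                return queue.popleft()
        return None


mlq = MultilevelQueue(["real-time", "interactive"])
mlq.add("interactive", "ui")
mlq.add("real-time", "audio")
mlq.add("interactive", "editor")
print(mlq.pick_next())  # the real-time thread is served first
```

As long as the real-time queue is never empty, `pick_next` never reaches the interactive queue, which is exactly the starvation risk named above.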

Multilevel Queue Scheduling

Windows uses strict Round-Robin for real-time threads

Priority-elevation can be disabled for non-RT threads

Queues, from high priority to low priority:

– Real-time system threads

– Real-time user threads

– System threads

– Interactive user threads

– Background threads

Process Creation

Parent process creates child processes, which create other processes, forming a tree of processes

– Processes start with one initial thread

Resource sharing models

– Parent and children share all resources

– Children share subset of parent’s resources

– Parent and child share no resources

Execution

– Parent’s and children’s threads execute concurrently

– Parent waits until children terminate

Process Creation (Cont.)

How to set up an address space

– Child can be duplicate of parent

– Child may have a program loaded into it

UNIX example

– fork() system call creates new process

– exec() system call used after a fork to replace the process’ memory space with a new program

Windows example

– CreateProcess() system call creates a new process and loads a program for execution
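
The UNIX fork-then-exec pattern can be tried from Python, whose `os` module wraps the underlying system calls. This sketch runs only on POSIX systems (`os.fork` is not available on Windows), and the helper name `fork_exec_wait` is made up for this illustration.

```python
import os
import sys


def fork_exec_wait(argv):
    """Fork a child, replace its image with `argv`, and return its exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: replace the duplicated address space with a new program.
        try:
            os.execvp(argv[0], argv)
        finally:
            # Reached only if exec failed; leave without running parent cleanup.
            os._exit(127)
    # Parent: wait for the child and collect its return code.
    _, status = os.waitpid(pid, 0)
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return None  # the child was terminated by a signal


if __name__ == "__main__":
    code = fork_exec_wait([sys.executable, "-c", "raise SystemExit(7)"])
    print("child exit code:", code)
```

On Windows the two steps are not separate: a single CreateProcess call both creates the process and loads the program, so there is no moment at which the child is a duplicate of its parent.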

[Figure: Process Tree on a UNIX System]

Process Termination

Last thread inside a process executes last statement and returns control to operating system (exit)

– Parent may receive return code (via wait)

– Process’ resources are deallocated by OS

Process Termination

Parent may terminate child processes (kill)

– Child has exceeded allocated resources

– Task assigned to child is no longer required

– Parent is exiting

OS typically does not allow child to continue if its parent terminates (depending on creation flags)

– Cascading termination inside process groups

Single and Multithreaded Processes

[Figure: a single-threaded process has code, data, and files plus one thread with its registers and stack. A multi-threaded process shares code, data, and files among several threads, each with its own registers and stack]

Benefits of Multithreading

Higher Responsiveness

– Dedicated threads for handling user events

Simpler Resource Sharing

– All threads in a process share same address space

Economy - fewer context switches

– If threading implemented in user-space

Utilization of Multiprocessor Architectures

– Multiple threads may run in parallel

User Threads

Thread management within a user-level threads library

– Process is unit of CPU scheduling from kernel perspective

Examples

– POSIX Pthreads

– Mach C-threads

– Solaris threads

– Fibers on Windows

Kernel Threads

Supported by the Kernel

– Thread is unit of CPU scheduling

Examples

– Windows

– Solaris

– OSF/1

– Linux: tasks can act like threads by sharing kernel data structures

Multithreading Models

How are user-level threads mapped on kernel threads?

Many-to-One

– Many user-mode threads mapped on a single kernel thread

One-to-One

– Each user-mode thread mapped on a separate kernel thread

Many-to-Many

– Set of user-mode threads mapped on set of kernel threads

Many-to-One Model

Used on systems that do not support kernel threads

Example:

– POSIX Pthreads

– Mach C-Threads

[Figure: three user threads mapped onto a single kernel thread]

One-to-One Model

Each user-level thread maps to kernel thread

Examples

– Windows

– OS/2

[Figure: three user threads, each mapped onto its own kernel thread]

Many-to-Many Model

Allows many user level threads to be mapped to many kernel threads.

Allows OS to create a sufficient number of kernel threads.

Example

– Solaris 2

[Figure: three user threads mapped onto two kernel threads]

Problems with Multithreading

Semantics of fork()/exec() or CreateProcess() system calls

Coordinated termination

Signal handling

Global data, errno, error handling

Thread specific data

Reentrant vs. non-reentrant system calls

Pthreads

A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization

API specifies behavior of the thread library, not an implementation

Implemented on many UNIX operating systems

Services for Unix (SFU) implements Pthreads on Windows

4.2. Windows Processes and Threads

Windows Processes

What is a process?

– Represents an instance of a running program

    You create a process to run a program

    Starting an application creates a process

– Process defined by:

    Address space

    Resources (e.g. open handles)

    Security profile (token)

Every process starts with one thread

– First thread executes the program’s “main” function

    Can create other threads in the same process

    Can create additional processes

Windows Threads

What is a thread?

– An execution context within a process

– Unit of scheduling (threads run, processes don’t run)

All threads in a process share same process address space

– Services provided so threads can synchronize access to shared resources (critical sections, mutexes, events, semaphores)
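
Because all threads of a process see the same memory, updates to shared data have to be synchronized. The sketch below uses Python's `threading.Lock` as a stand-in for a mutex or critical section; it does not call the Windows synchronization APIs, and `count_with_threads` is a hypothetical helper.

```python
import threading


def count_with_threads(num_threads, increments):
    """Increment one shared counter from several threads under a lock."""
    counter = 0
    lock = threading.Lock()

    def worker():
        nonlocal counter
        for _ in range(increments):
            # The lock makes the read-modify-write of `counter` atomic
            # with respect to the other threads.
            with lock:
                counter += 1

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter


print(count_with_threads(4, 1000))
```

All four workers update the same variable because they live in the same address space; the lock ensures that no increment is lost when two of them reach the update at the same time.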

All threads in the system are scheduled as peers to all others, without regard to their “parent” process

Per-Process Data

Virtual address space

– program code, global storage, heap storage, threads’ stacks

Working set

– physical memory “owned” by the process

Access token

– includes security identifiers

Handle table for Windows kernel objects

Per-Process Data

Environment strings

Command line

These are common to all threads in the process, but separate and protected between processes

Per-Thread Data

User-mode stack

– arguments passed to thread, automatic storage, call frames

Kernel-mode stack (for system calls)

Thread Local Storage (TLS)

– array of pointers to allocate unique data

Scheduling state (Wait, Ready, Running, etc.) and priority
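
The idea behind thread local storage, one name with a separate value per thread, can be illustrated with Python's `threading.local`. This is an analogy only and does not use the Windows TLS slot API; the names `tls`, `worker`, and `run_tls_demo` are hypothetical.

```python
import threading

# One TLS object; every thread that touches it gets its own attributes.
tls = threading.local()


def worker(name, results):
    tls.value = name            # stored in this thread's private slot
    results[name] = tls.value   # reads back this thread's own value


def run_tls_demo(count=3):
    results = {}
    threads = [
        threading.Thread(target=worker, args=(f"thread-{i}", results))
        for i in range(count)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The calling thread never stored anything, so it has no `value`.
    return results, hasattr(tls, "value")


print(run_tls_demo())
```

Each worker reads back exactly what it stored, and the calling thread sees no value at all, even though every thread used the same global name `tls`.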

Per-Thread Data

Hardware context

– Program counter, stack pointer, register values

– Current access mode (user mode or kernel mode)

– (saved in CONTEXT structure if not running)

Access token (optional -- overrides process’s if present)

Process and Thread Identifiers

Every process and every thread has an identifier

Generically: “client ID” (debugger shows as “CID”)

– A.K.A. “process ID” and “thread ID”, respectively

– Process IDs and thread IDs are in the same “number space”

The ID identifies the requesting process or thread to its subsystem server process in API calls that need the server’s help

Process and Thread Identifiers

Visible in:

– PerfMon, Task Manager (for processes),

– Process Viewer (for processes), kernel debugger, etc.

IDs are unique among all existing processes and threads

– might be reused as soon as a process or thread is deleted

Process-Related Performance Counters

Object: Counter | Function

Process: %PrivilegedTime | Percentage of time that the threads in the process have run in kernel mode

Process: %ProcessorTime | Percentage of CPU time that threads have used during specified interval (%PrivilegedTime + %UserTime)

Process: %UserTime | Percentage of time that the threads in the process have run in user mode

Process: ElapsedTime | Total lifetime of process in seconds

Process: ID Process | PID – process IDs are re-used

Process: ThreadCount | Number of threads in a process

Thread-Related Performance Counters

Object: Counter | Function

Process: Priority Base | Base priority of process: starting priority for threads within the process

Thread: %PrivilegedTime | Percentage of time that the thread has run in kernel mode

Thread: %ProcessorTime | Percentage of CPU time that the thread has used during specified interval (%PrivilegedTime + %UserTime)

Thread: %UserTime | Percentage of time that the thread has run in user mode

Thread: ElapsedTime | Total lifetime of thread in seconds

Thread: ID Process | Process PID – process IDs are re-used

Thread: ID Thread | Thread ID – re-used

Thread-Related Performance Counters (contd.)

Object: Counter | Function

Thread: Priority Base | Base priority of thread: may differ from the thread’s starting priority

Thread: Priority Current | The thread’s current dynamic priority

Thread: Start Address | The thread’s starting virtual address (the same for most threads)

Thread: Thread State | Value from 0 through 7 – current state of thread

Thread: Thread Wait Reason | Value from 0 through 19 – reason why the thread is in wait state

Tools for Obtaining Process & Thread Information

Many overlapping tools

– most show one item the others do not

Built-in tools in Windows 2000/XP:

– Task Manager, Performance Tool

– Tasklist (new in XP)

Support Tools

– pviewer - process and thread details (GUI)

– pmon - process list (character cell)

– tlist - shows process tree, thread details (character cell)

Tools for Obtaining Process & Thread Information

Resource Kit tools:

– apimon - system call and page fault monitoring (GUI)

– oh - display open handles (character cell)

– pviewer - processes & threads and security details (GUI)

– ptree - display process tree & kill remote processes (GUI)

– pulist - lists processes and usernames (character cell)

– pstat - process/threads & driver addresses (character cell)

– qslice - can show process-relative thread activity (GUI)

Tools for Obtaining Process & Thread Information

Tools from www.sysinternals.com:

Process Explorer: super Task Manager

– shows open files, loaded DLLs, security info, etc.

Pslist

– list processes on local or remote systems

Ntpmon

– shows process/thread create/deletes

– and context switches on MP systems only

Listdlls

– displays full path of EXE & DLLs loaded in each process

What Are Task Manager’s “Applications”?

A meaningless term at the OS level

– Not a list of processes

– Not a list of “tasks” (another meaningless term)

– It’s a list of top level visible windows in your session that meet certain criteria

What Are Task Manager’s “Applications”?

What does the status column mean?

Running:

– Windows don’t run—threads do

– “Running” is displayed only when the owning thread is waiting for a window message (i.e., not running!)

Not Responding: not waiting for window messages

To map a window to a process

– right-click on a window and select “Go to process”

Process Explorer (Sysinternals)

Super Task Manager

Shows:

– full image path, command line,

– environment variables, parent process,

– security access token, open handles,

– loaded DLLs & mapped files

Lab: The Process List

Run Process Explorer & maximize window

Run Task Manager – click on Processes tab

Arrange windows so you can see both

Notice process tree vs flat list in Task Manager

If parent has exited, process is left justified

Lab: The Process List

1. Sort on first column (“Process”) and note tree view disappears

2. Click on View->Show Process Tree (or CTRL+T) to bring it back

3. Notice description and company name columns

4. Hover mouse over image to see full path of image

5. Right click on a process and choose “Google”

Lab: Refresh Highlighting

1. Change update speed to paused by pressing space bar

2. Run Notepad

3. In ProcExp, hit F5 and notice new process

4. Exit Notepad

5. In ProcExp, hit F5 and notice Notepad in red

Uses

– Understanding process startup sequences

– Detecting appearance of processes coming and going

Process Performance

Click on Performance Tab of process properties

– Note: all these numbers can be configured as columns

Thread Details

Process Explorer “Threads” tab shows which thread(s) are running

– Start address represents where the thread began running (not where it is now)

– Click Module to get details on module containing thread start address

Thread Start Functions

Process Explorer can map the addresses within a module to the names of functions

– This can help identify which component within a process is responsible for CPU usage

Requires access to:

– Symbol file for that module

– Proper version of Dbghelp.dll

Thread Start Functions

By default, Process Explorer looks for:

Dbghelp.dll: – in the default Windows Debugging Tools install directory

Symbols: – _NT_SYMBOL_PATH environment variable

Can also specify with Options->Configure Symbols

Call Stacks

[Figure: call stack containing Function 1, Function 2, and Function 3]

Process Explorer can also show the thread call stack

– Represents sequence of functions called

Important if start address doesn’t indicate what the thread is doing

– E.g. if it’s a generic library start routine

Call Stacks

Click Stack to view call stack

– Lists functions in reverse chronological order

Note that the start address on the Threads tab is different from the first function shown in the stack

– This is because all user threads start in a Windows library function which calls the programmed start address

Example: Viewing Stacks

Problem: Powerpoint was hanging for 1 minute on startup

Thread stack shows waiting on a printer driver

Suspending Processes

Process Explorer can suspend a process

Why would you want to do this?

– You’ve started a long running job but want to pause it to do something else

    Lowering the priority still leaves it running…

– You’ve started a long download but want to have your network bandwidth back temporarily

– Some multi-service system process activity is due to other processes calling upon their services

    Suspend a process that is consuming CPU time to see what that does to the system process in question

Lab: Suspend

Start Notepad

From a command prompt:

1. Suspend Notepad process with Process Explorer

2. Try to switch back to Notepad (should not respond)

3. Open Task Manager and look at Notepad’s status in the applications tab

4. Resume Notepad

Jobs

Jobs are collections of processes

Can be used to specify limits on CPU, memory, and security

Enables control over some unique process & thread settings not available through any process or thread system call

– E.g. length of thread time slice

[Figure: a job containing several processes]

Jobs

How do processes become part of a job?

Job object has to be created (CreateJobObject)

Then processes are explicitly added (AssignProcessToJobObject)

– Processes created by processes in a job are automatically part of the job

– Unless restricted, processes can “break away” from a job

Then quotas and limits are defined (SetInformationJobObject)

– Examples appear under Job Settings

Process Lifetime

Created as an empty shell

Address space created with only ntdll and the main image unless created by POSIX fork()

Handle table created empty or populated via duplication from parent

Process is partially destroyed on last thread exit

Process totally destroyed on last dereference

Thread Lifetime

Created within a process with a CONTEXT record

– Starts running in the kernel but has a trap frame to return to user mode

Threads run until one of the following happens:

– The thread returns to the OS

– ExitThread is called by the thread

– TerminateThread is called on the thread

– ExitProcess is called on the process

Why Do Processes Exit? (or Terminate?)

Normal: Application decides to exit (ExitProcess)

– Usually due to a request from the UI

– or: the C run-time library (CRTL) does ExitProcess when the primary thread function (main, WinMain, etc.) returns to its caller; this forces TerminateThread on the process’s remaining threads

– or, any thread in the process can do an explicit ExitProcess

Why Do Processes Exit? (or Terminate?)

Orderly exit requested from the desktop (ExitProcess)

– e.g. “End Task” from Task Manager “Tasks” tab

– Task Manager sends a WM_CLOSE message to the window’s message loop…

– …which should do an ExitProcess (or equivalent) on itself

Why Do Processes Exit? (or Terminate?)

Forced termination (TerminateProcess)

– if no response to “End Task” in five seconds, Task Manager presents End Program dialog (which does a TerminateProcess)

– or: “End Process” from Task Manager Processes tab

Unhandled exception

– Covered in Unit 4.3 (Process and Thread Internals)

Job Settings

Quotas and restrictions:

– Quotas: total CPU time, # active processes, per-process CPU time, memory usage

– Run-time restrictions: priority of all the processes in job; processors threads in job can run on

– Security restrictions: limits what processes can do (not acquiring administrative privileges; not accessing windows outside the job; no reading/writing the clipboard)

Job Settings

– Scheduling class: number from 0-9 (5 is default); affects length of thread timeslice (or quantum). E.g. can be used to achieve “class scheduling” (partition CPU)

Jobs

Examples where Windows OS uses jobs:

– Add/Remove Programs (“ARP Job”)

– WMI provider

– RUNAS service (SecLogon) uses jobs to terminate processes at log out (SU from the NT4 ResKit didn’t do this)

Process Explorer highlights processes that are members of jobs

– Color can be configured with Options->Configure Highlighting

– For processes in a job, click on Job tab in process properties to see details

Lab: WMI Job

Jobs are used by WMI

– Example: run Psinfo (Sysinternals) and pause output

Lab: RUNAS Job

1. In a command prompt: RUNAS /USER:xxx CMD (where xxx is some other local account)

2. In ProcExp, find newly created cmd.exe process

– What is its parent process?

3. Run Notepad from new CMD window

4. Double click on newly highlighted process & click on Job tab

Programming Slides

NOTE: The remaining slides are for use in a class that covers the programming aspects of the OS (vs a class aimed at system administrators who are not doing programming)

Process Windows APIs

CreateProcess

OpenProcess

GetCurrentProcessId - returns a global ID

GetCurrentProcess - returns a handle

ExitProcess

TerminateProcess - no DLL notification

Get/SetProcessShutdownParameters

GetExitCodeProcess

GetProcessTimes

GetStartupInfo

Windows Thread APIs

CreateThread

CreateRemoteThread

GetCurrentThreadId - returns global ID

GetCurrentThread - returns handle

SuspendThread/ResumeThread

ExitThread

TerminateThread - no DLL notification

GetExitCodeThread

GetThreadTimes

Windows 2000 adds:

– OpenThread

– new thread pooling APIs

Fibers

Implemented completely in user mode

– no “internals” ramifications

– Fibers are still scheduled as threads

– Fiber APIs allow different execution contexts within a thread: stack, fiber-local storage, some registers (essentially those saved and restored for a procedure call)

– cooperatively “scheduled” within the thread

– Analogous to threading libraries under many Unix systems

– Analogous to co-routines in assembly language

– Allow easy porting of apps that “did their own threads” under other systems

Process Creation

BOOL CreateProcess(
    LPCSTR lpApplicationName,
    LPSTR lpCommandLine,
    LPSECURITY_ATTRIBUTES lpProcessAttributes,
    LPSECURITY_ATTRIBUTES lpThreadAttributes,
    BOOL bInheritHandles,
    DWORD dwCreationFlags,
    LPVOID lpEnvironment,
    LPCSTR lpCurrentDirectory,
    LPSTARTUPINFO lpStartupInfo,
    LPPROCESS_INFORMATION lpProcessInformation);

No parent/child relation in Win32

CreateProcess() – new process with primary thread

Parameters

fdwCreate (the dwCreationFlags argument):

– CREATE_SUSPENDED, DETACHED_PROCESS, CREATE_NEW_CONSOLE, CREATE_NEW_PROCESS_GROUP

lpStartupInfo:

– Main window appearance

– Parent’s info: GetStartupInfo

– hStdInput, hStdOutput, hStdError fields for I/O redirection

lpProcessInformation:

– Ptr to handle & ID of new proc/thread

typedef struct _PROCESS_INFORMATION {
    HANDLE hProcess;
    HANDLE hThread;
    DWORD dwProcessId;
    DWORD dwThreadId;
} PROCESS_INFORMATION;

UNIX & Win32 comparison

Windows API has no equivalent to fork()

CreateProcess() similar to fork()/exec()

UNIX $PATH vs. lpCommandLine argument

– Win32 searches in the directory of the current process image; in the current directory; in the Windows system directory (GetSystemDirectory); in the Windows directory (GetWindowsDirectory); in the directories given in PATH

Windows API has no parent/child relations for processes

No UNIX process groups in Windows API

– Limited form: group = processes to receive a console event

Windows API Thread Creation

HANDLE CreateThread(
    LPSECURITY_ATTRIBUTES lpsa,
    DWORD cbStack,
    LPTHREAD_START_ROUTINE lpStartAddr,
    LPVOID lpvThreadParm,
    DWORD fdwCreate,
    LPDWORD lpIDThread);

cbStack == 0: the thread’s stack size defaults to the primary thread’s size

lpStartAddr points to a function declared as DWORD WINAPI ThreadFunc(LPVOID)

lpvThreadParm is a pointer-sized argument (32 bits on 32-bit Windows)

lpIDThread points to a DWORD that receives the thread ID (pass a non-NULL pointer!)

Exiting and Terminating a Process

VOID ExitProcess(UINT uExitCode);

BOOL TerminateProcess(HANDLE hProcess, UINT uExitCode);

BOOL GetExitCodeProcess(HANDLE hProcess, LPDWORD lpExitCode);

Shared resources must be freed before exiting

– Mutexes, semaphores, events

– Use structured exception handling

But: __finally and __except handlers are not executed on ExitProcess; no SEH on TerminateProcess

Windows API Thread Termination

VOID ExitThread(DWORD dwExitCode);

When the last thread in a process terminates, the process itself terminates

(TerminateThread() does not execute final SEH)

Thread object continues to exist until the last handle is closed (CloseHandle())

BOOL GetExitCodeThread(HANDLE hThread, LPDWORD lpdwExitCode);

Returns exit code or STILL_ACTIVE
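Because GetExitCodeThread reports either the exit code or STILL_ACTIVE, a thread that exits with a code equal to STILL_ACTIVE (259 in the Windows headers) cannot be told apart from a running one by its exit code alone. The sketch below is a portable model, not Win32 code: `ExitCodeModel` and the `model_*` functions are hypothetical stand-ins that only mirror the behavior described on this slide.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Numeric value of STILL_ACTIVE in the Windows headers. */
#define MODEL_STILL_ACTIVE 259u

/* Hypothetical stand-in for a thread object: only the fields this sketch needs. */
typedef struct {
    bool     signaled;   /* set once the thread has terminated */
    uint32_t exit_code;  /* MODEL_STILL_ACTIVE until termination */
} ExitCodeModel;

void model_thread_init(ExitCodeModel *t) {
    assert(t != NULL);
    t->signaled = false;
    t->exit_code = MODEL_STILL_ACTIVE;
}

/* Models ExitThread/TerminateThread: store the code and signal the object. */
void model_thread_exit(ExitCodeModel *t, uint32_t code) {
    assert(t != NULL);
    t->exit_code = code;
    t->signaled = true;
}

/* Models GetExitCodeThread: it can only report the stored code. */
uint32_t model_get_exit_code(const ExitCodeModel *t) {
    assert(t != NULL);
    return t->exit_code;
}

/* Models a zero-timeout wait on the thread handle: the unambiguous liveness test. */
bool model_has_terminated(const ExitCodeModel *t) {
    assert(t != NULL);
    return t->signaled;
}
```

In real code the equivalent of `model_has_terminated` would be a wait on the thread handle; the model suggests avoiding 259 as an application exit code.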

Suspending and Resuming Threads

Each thread has a suspend count

A thread can only execute if its suspend count == 0

A thread can be created in the suspended state

DWORD ResumeThread(HANDLE hThread);

DWORD SuspendThread(HANDLE hThread);

Both functions return the previous suspend count, or 0xFFFFFFFF on failure
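The suspend-count rule can be illustrated with a small portable model. This is a sketch under stated assumptions, not Win32 code: `SuspendModel`, `model_suspend`, `model_resume`, and `model_can_run` are hypothetical names, and the cap of 127 is assumed here to mirror MAXIMUM_SUSPEND_COUNT. Both calls return the count as it was before the call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_SUSPEND_FAILED 0xFFFFFFFFu  /* failure value quoted on the slide */
#define MODEL_MAX_SUSPEND    127u         /* assumed cap, mirroring MAXIMUM_SUSPEND_COUNT */

/* Hypothetical stand-in for the per-thread suspend count. */
typedef struct {
    uint32_t suspend_count;
} SuspendModel;

/* Models SuspendThread: increments the count, returns the previous value. */
uint32_t model_suspend(SuspendModel *t) {
    assert(t != NULL);
    if (t->suspend_count >= MODEL_MAX_SUSPEND)
        return MODEL_SUSPEND_FAILED;
    return t->suspend_count++;
}

/* Models ResumeThread: decrements a nonzero count, returns the previous value. */
uint32_t model_resume(SuspendModel *t) {
    assert(t != NULL);
    uint32_t previous = t->suspend_count;
    if (previous > 0)
        t->suspend_count--;
    return previous;
}

/* A thread is eligible to execute only while its suspend count is zero. */
bool model_can_run(const SuspendModel *t) {
    assert(t != NULL);
    return t->suspend_count == 0;
}
```

Two suspends therefore need two resumes before the thread can run again; a resume that returns 1 is the one that actually releases the thread.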

Synchronization & Remote Threads

WaitForSingleObject() and WaitForMultipleObjects() with thread handles as arguments perform thread synchronization

– Waits for thread to become signaled

– ExitThread(), TerminateThread(), ExitProcess() set thread objects to signaled state

CreateRemoteThread() allows creation of thread in another process

– Not implemented in Windows 9x

The single-threaded C library is not thread-safe; use the multithreaded libcmt.lib instead

– #define _MT before any include

– Use _beginthreadex/_endthreadex instead of Create/ExitThread

Windows Process and Thread Internals

Data Structures for each process/thread:

Executive process block (EPROCESS)

Executive thread block (ETHREAD)

Win32 process block

Process environment block

Thread environment block

Windows Process and Thread Internals

[Figure: the process environment block and the thread environment blocks are in the process address space; the process block (EPROCESS), the thread blocks (ETHREAD), the Win32 process block, and the handle table are in the system address space]

Process

Container for an address space and threads

Associated User-mode Process Environment Block (PEB)

Primary Access Token

Quota, Debug port, Handle Table etc

Unique process ID

Queued to the Job, global process list and Session list

MM structures like the WorkingSet, VAD tree, AWE etc

Thread

Fundamental schedulable entity in the system

Represented by ETHREAD that includes a KTHREAD

Queued to the process (both E and K thread)

IRP list, Impersonation Access Token

Unique thread ID

Associated User-mode Thread Environment Block (TEB)

User-mode stack, Kernel-mode stack

Processor Control Block (in KTHREAD) for CPU state when not running

Processes & Threads Internal Data Structures

[Figure: a process object with its access token, its handle table (whose entries reference objects), its virtual address space descriptors (VADs), and its threads; a thread may carry its own access token]

See kernel debugger commands: dt, !process, !thread, !token, !handle, !object

Process/Thread Kernel Debugger Commands

!process [/s Session] [Address/Pid [Flags]]

– !process – display current process (not full details)

– !process 342 – display full details of process 342

– !process 829fa030 – display process identified by EPROCESS address

– !process 0 0 – summary display of all processes

– !process 0 7 – full details of all processes

Process/Thread Kernel Debugger Commands

!thread [Address [Flags]]

– !thread – current thread

– !thread 826e8898 – display thread identified by ETHREAD address

To view user stack, must set process context:

– .process <address of EPROCESS>

– .context <address of page directory (Dirbase)>

!peb [Address]

!teb [Address]

Process Block (!process)

PROCESS ff704020 Cid: 0075 Peb: 7ffdf000 ParentCid: 005d
    DirBase: 0063c000 ObjectTable: ff7063c8 TableSize: 70.
    Image: Explorer.exe
    VadRoot ff70d6e8 Clone 0 Private 229. Modified 236. Locked 0.
    FF7041DC MutantState Signalled OwningThread 0
    Token e1462030
    ElapsedTime 0:01:19.0874
    UserTime 0:00:00.0991
    KernelTime 0:00:02.0613
    QuotaPoolUsage[PagedPool] 18317
    QuotaPoolUsage[NonPagedPool] 3824
    Working Set Sizes (now,min,max) (727, 20, 45) (2908KB, 80KB, 180KB)
    PeakWorkingSetSize 757
    VirtualSize 29 Mb
    PeakVirtualSize 31 Mb
    PageFaultCount 1396
    MemoryPriority FOREGROUND
    BasePriority 8
    CommitCharge 250

Annotations on the output:

– PROCESS: EPROCESS address

– Cid: process ID

– Peb: address of process environment block

– ParentCid: process ID of parent process

– DirBase: physical address of page directory

– VadRoot: root of the process’s Virtual Address Descriptor tree

– ElapsedTime, UserTime, KernelTime: time the process has been running, divided into user and kernel time

Process Block Layout

Quota Block

Exit Status

Primary Access Token

Process ID

Parent Process ID

Exception Port

Debugger Port

Handle Table

Process Environment Block

Create and Exit Time

Next Process Block

Image File Name

Process Priority Class

Memory Management Information

EPROCESS

Kernel Process Block (or PCB)

Image Base Address

Win32 Process Block

Dispatcher Header

Processor Affinity

Kernel Time

User Time

Inswap/Outswap List Entry

Process Spin Lock

Resident Kernel Stack Count

Process Base Priority

Default Thread Quantum

Process State

Thread Seed

Disable Boost Flag

Process Page Directory

KTHREAD . . .

Process Block Layout

lkd> dt nt!_EPROCESS
   +0x000 Pcb                : _KPROCESS
   +0x06c ProcessLock        : _EX_PUSH_LOCK
   +0x070 CreateTime         : _LARGE_INTEGER
   +0x078 ExitTime           : _LARGE_INTEGER
   +0x080 RundownProtect     : _EX_RUNDOWN_REF
   +0x084 UniqueProcessId    : Ptr32 Void
   +0x088 ActiveProcessLinks : _LIST_ENTRY
   +0x090 QuotaUsage         : [3] Uint4B
   +0x09c QuotaPeak          : [3] Uint4B
   +0x0a8 CommitCharge       : Uint4B
   +0x0ac PeakVirtualSize    : Uint4B
   +0x0b0 VirtualSize        : Uint4B
   ...

NOTE: Add “-r” to recurse through substructures

Thread Block (!thread)

THREAD 83160f60 Cid 9f.3d Teb: 7ffdc000 Win32Thread: e153d2c8 WAIT: (WrUserRequest) UserMode Non-Alertable
    808e9d60 SynchronizationEvent
Not impersonating
Owning Process 81b44880
WaitTime (seconds) 953945
Context Switch Count 2697 LargeStack
UserTime 0:00:00.0289
KernelTime 0:00:04.0664
Start Address kernel32!BaseProcessStart (0x77e8f268)
Win32 Start Address 0x020d9d98
Stack Init f7818000 Current f7817bb0 Base f7818000 Limit f7812000 Call 0
Priority 14 BasePriority 8 PriorityDecrement 6 DecrementCount 13
Kernel stack not resident.

ChildEBP RetAddr  Args to Child
f7817bb0 8008f430 00000001 00000000 00000000 ntoskrnl!KiSwapThreadExit
f7817c50 de0119ec 00000001 00000000 00000000 ntoskrnl!KeWaitForSingleObject+0x2a0
f7817cc0 de0123f4 00000001 00000000 00000000 win32k!xxxSleepThread+0x23c
f7817d10 de01f2f0 00000001 00000000 00000000 win32k!xxxInternalGetMessage+0x504
f7817d80 800bab58 00000001 00000000 00000000 win32k!NtUserGetMessage+0x58
f7817df0 77d887d0 00000001 00000000 00000000 ntoskrnl!KiSystemServiceEndAddress+0x4
0012fef0 00000000 00000001 00000000 00000000 user32!GetMessageW+0x30

Annotations on the output:

– THREAD: address of ETHREAD

– Cid: process ID and thread ID

– Teb: address of thread environment block

– WAIT: thread state

– Objects being waited on (here a SynchronizationEvent)

– Address of system service dispatch table

– Start Address: actual thread start address

– Win32 Start Address: address of user thread function

– Priority information

– Stack trace

Thread Block

ETHREAD

Create and Exit Time

Process ID

Thread Start Address

Impersonation Information

LPC Message Information

EPROCESS

Access Token

KTHREAD

Timer Information

Pending I/O Requests

Total User Time

Total Kernel Time

Thread Scheduling Information

Synchronization Information

List of Pending APCs

Timer Block and Wait Blocks

List of Objects Thread is Waiting On

System Service Table

TEB

KTHREAD

Thread Local Storage Array

Kernel Stack Information

Dispatcher Header

Trap Frame

Thread Block (!strct ethread)

lkd> dt nt!_ETHREAD
   +0x000 Tcb              : _KTHREAD
   +0x1c0 CreateTime       : _LARGE_INTEGER
   +0x1c0 NestedFaultCount : Pos 0, 2 Bits
   +0x1c0 ApcNeeded        : Pos 2, 1 Bit
   +0x1c8 ExitTime         : _LARGE_INTEGER
   +0x1c8 LpcReplyChain    : _LIST_ENTRY
   +0x1c8 KeyedWaitChain   : _LIST_ENTRY
   +0x1d0 ExitStatus       : Int4B
   +0x1d0 OfsChain         : Ptr32 Void
   +0x1d4 PostBlockList    : _LIST_ENTRY
   +0x1dc TerminationPort  : Ptr32 _TERMINATION_PORT
   +0x1dc ReaperLink       : Ptr32 _ETHREAD

Process Environment Block

Mapped in user space

Image loader, heap manager, Windows system DLLs use this info

View with !peb or dt nt!_peb

Fields shown in the PEB layout:

– Image base address

– Module list

– Thread-local storage data

– Code page data

– Critical section time-out

– Number of heaps

– Heap size info

– Process heap

– GDI shared handle table

– OS version number info

– Image version info

– Image process affinity mask

Thread Environment Block

User mode data structure

Context for image loader and various Windows DLLs

View with !teb or dt nt!_teb

Fields shown in the TEB layout:

– Exception list

– Stack base

– Stack limit

– Subsystem TIB

– Fiber info

– Thread ID

– Active RPC handle

– PEB

– LastError value

– Count of owned critical sections

– Current locale

– User32 client info

– GDI32 info

– OpenGL info

– TLS array

– Winsock data

Flow of CreateProcess()

1. Open image file (.EXE) to be executed inside the process

2. Create Windows NT executive process object

3. Create initial thread (stack, context, Windows NT executive thread object)

4. Notify Windows subsystem of new process so that it can set up for new process & thread

5. Start execution of initial thread (unless CREATE_SUSPENDED was specified)

6. In context of new process/thread:

– complete initialization of address space (load DLLs)

– begin execution of the program

Stages Windows follows to create a process

[Figure: flowchart in three columns. Creating process: open EXE and create section object; create NT process object; create NT thread object; notify Windows subsystem; start execution of the initial thread; return to caller. Windows subsystem: set up for new process and thread. New process: final process/image initialization; start execution at entry point to image.]

CreateProcess: some notes

CreationFlags: independent bits for priority class; if several are set, NT assigns the lowest-priority class set

Default priority class is Normal unless the creator has priority class Idle

If the real-time priority class is specified and the creator has insufficient privileges, priority class High is used

Caller’s current desktop is used if no desktop is specified

Priority classes: Real-time, High, Normal, Idle
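The priority-class rules on this slide can be written as one small decision function. The following is a portable sketch, not the actual CreateProcess logic: `resolve_priority_class` and the `PC_*` names are hypothetical, the flag values are chosen to mirror the Win32 `*_PRIORITY_CLASS` constants, and only the four classes listed on the slide are modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Values mirror NORMAL/IDLE/HIGH/REALTIME_PRIORITY_CLASS from the Windows headers. */
#define PC_NORMAL   0x00000020u
#define PC_IDLE     0x00000040u
#define PC_HIGH     0x00000080u
#define PC_REALTIME 0x00000100u

/*
 * Returns the priority class a new process would receive under the
 * rules stated on the slide.
 *   creation_flags           - class bits passed by the creator (may be several or none)
 *   creator_class            - the creator's own class (exactly one PC_* bit)
 *   creator_may_use_realtime - whether the creator holds the needed privilege
 */
uint32_t resolve_priority_class(uint32_t creation_flags,
                                uint32_t creator_class,
                                bool creator_may_use_realtime)
{
    /* The creator's class is expected to be a single class bit. */
    assert(creator_class != 0 && (creator_class & (creator_class - 1)) == 0);

    /* Several bits set: test from lowest to highest, so the lowest class wins. */
    if (creation_flags & PC_IDLE)
        return PC_IDLE;
    if (creation_flags & PC_NORMAL)
        return PC_NORMAL;
    if (creation_flags & PC_HIGH)
        return PC_HIGH;
    if (creation_flags & PC_REALTIME)
        return creator_may_use_realtime ? PC_REALTIME : PC_HIGH;

    /* No class bit at all: Normal, unless the creator itself is Idle. */
    return (creator_class == PC_IDLE) ? PC_IDLE : PC_NORMAL;
}
```

For example, a request for Real-time from a creator without the privilege resolves to High, and a request with both High and Idle set resolves to Idle.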

Opening the image to be executed

What kind of application is it?

– Windows: use .EXE directly

– Win16: run NTVDM.EXE

– MS-DOS .EXE, .COM, or .PIF: run NTVDM.EXE

– MS-DOS .BAT or .CMD: run CMD.EXE

– POSIX: run POSIX.EXE

– OS/2 1.x: run OS2.EXE

If executable has no Windows format...

CreateProcess uses Windows “support image”

No way to create non-Windows processes directly

– OS2.EXE runs only on Intel systems

– Multiple MS-DOS apps may share virtual DOS machine

– .BAT or .CMD files are interpreted by CMD.EXE

– Win16 apps may share a virtual DOS machine (VDM). Flags: CREATE_SEPARATE_WOW_VDM, CREATE_SHARED_WOW_VDM. Default: HKLM\System...\Control\WOW\DefaultSeparateVDM

– Sharing of VDM only if apps run on same desktop under same security

If executable has no Windows format...

A debugger may be specified under the following registry key (it is then run instead of the app!):

– \Software\Microsoft\Windows NT\CurrentVersion\Image File Execution Options

Process Creation - next Steps...

CreateProcess has opened the Windows executable and created a section object to map into the process’s address space

Now: create executive process object via NtCreateProcess

– Set up EPROCESS block

– Create initial process address space (page directory, hyperspace page, working set list)

– Create kernel process block (set initial quantum)

– Conclude setup of process address space: VM, map NTDLL.DLL, map language support tables, register process in PsActiveProcessHead

– Set up Process Environment Block

– Complete setup of executive process object

Further Steps...(contd.)

Create Initial Thread and Its Stack and Context

– NtCreateThread;

– new thread is suspended until CreateProcess returns

Notify Windows Subsystem about new process

KERNEL32.DLL sends message to Windows subsystem including:

– Process and thread handles

– Entries in creation flags

– ID of process’s creator

– Flag describing Windows app (CSRSS may show startup cursor)

Further Steps...(contd.)

Windows: duplicate handles (inc usage count), set priority class, bookkeeping

– allocate CSRSS proc/thread block, init exception port, init debug port

– Show cursor (arrow & hourglass), wait 2 sec for GUI call, then wait 5 sec for window

CreateProcess: final steps

Process Initialization in context of new process:

Lower IRQL (dispatch level -> APC level)

Enable working set expansion

Queue APC to exec LdrInitializeThunk in NTDLL.DLL

Lower IRQL to 0; the APC fires:

– Init loader, heap manager, NLS tables, TLS array, critical section structures

– Load DLLs, call DLL_PROCESS_ATTACH function

CreateProcess: Final Steps

Debuggee: all threads are suspended

– Send message to the process’s debug port

– Windows creates CREATE_PROCESS_DEBUG_INFO event

Image begins execution in user-mode (return from trap)

Process Rundown Sequence

1. DLL notification (unless TerminateProcess was used)

2. All handles to executive and kernel objects are closed

3. Terminate any active threads

4. Exit code changes from STILL_ACTIVE to the specified exit code:

BOOL GetExitCodeProcess(HANDLE hProcess, LPDWORD lpdwExitCode);

5. Process object & thread objects become signaled

6. When handle and reference counts to process object == 0, process object is deleted
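The last steps of the rundown sequence separate two events: the process object becomes signaled as soon as the process exits, but the object itself is deleted only when both its handle count and its reference count reach zero. The sketch below is a simplified portable model of that lifetime. `RundownModel` and its functions are hypothetical, and the single "running" reference stands in for whatever references the kernel really holds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_STILL_ACTIVE 259u  /* numeric value of STILL_ACTIVE */

/* Hypothetical model of a process object's lifetime bookkeeping. */
typedef struct {
    uint32_t exit_code;
    bool     signaled;       /* waiters are released once this is set */
    unsigned handle_count;   /* open handles held by other processes */
    unsigned pointer_count;  /* assumed: one reference held while running */
    bool     deleted;
} RundownModel;

void rundown_create(RundownModel *p) {
    assert(p != NULL);
    p->exit_code = MODEL_STILL_ACTIVE;
    p->signaled = false;
    p->handle_count = 1;   /* the creator's handle */
    p->pointer_count = 1;  /* dropped when the process exits */
    p->deleted = false;
}

/* The object is deleted only when no handle and no reference remains. */
static void rundown_maybe_delete(RundownModel *p) {
    if (p->handle_count == 0 && p->pointer_count == 0)
        p->deleted = true;
}

/* Models steps 4 and 5: set the exit code, then signal the object. */
void rundown_exit(RundownModel *p, uint32_t code) {
    assert(p != NULL);
    if (p->signaled)
        return;            /* already exited */
    p->exit_code = code;
    p->signaled = true;
    p->pointer_count--;
    rundown_maybe_delete(p);
}

/* Models CloseHandle on a handle to the process (step 6 when it is the last one). */
void rundown_close_handle(RundownModel *p) {
    assert(p != NULL);
    if (p->handle_count > 0)
        p->handle_count--;
    rundown_maybe_delete(p);
}
```

In this model a creator that never closes its handle keeps the exited process object alive, which is why the exit code stays queryable after the process is gone.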

Thread Startup (in-context thread initialization)

[Figure: flowchart spanning kernel mode, user mode, and CSRSS. Steps shown: lower IRQL to APC level; enable working set expansion; queue user-mode APC to run LdrInitializeThunk and lower IRQL to 0; APC fires; perform in-process context initialization (init loader, load DLLs); if the process has a debugger: suspend all threads, notify the debugger process of the new process and wait for reply (LPC send/receive), resume all threads; send new thread message to subsystem; restore trap frame and dismiss exception; begin execution in user mode.]

Thread Rundown Sequence

1. DLL notification (unless TerminateThread was used)

2. All handles to Windows User and GDI objects are closed

3. Outstanding I/Os are cancelled

4. Thread stack is deallocated

5. Exit code changes from STILL_ACTIVE to the specified exit code:

BOOL GetExitCodeThread(HANDLE hThread, LPDWORD lpdwExitCode);

6. Thread kernel object becomes signaled

7. When handle and reference counts == 0, thread object is deleted

8. If last thread in process, process exits

Start of Thread Wrapper

All threads in all processes appear to have one of just two different start addresses, regardless of which .EXE is running

– One for thread 0 (start of process wrapper)

– the other for all other threads (start of thread wrapper)

These “wrapper” functions are what Process Viewer shows as Thread Start Address for Windows apps

Start of Thread Wrapper

Start of process and start of thread wrappers have same behavior

– Provides default exception handling, access to debugger, etc.

– Forces thread exit when thread function returns

To find “real” Windows start address, use TLIST <processname> (or Kernel Debugger !thread command)

Start of Process/Thread Function (conceptual model)

// BaseProcessStart (BaseThreadStart is basically the same)
void BaseProcessStart(LPTHREAD_START_ROUTINE lpStartAddr, LPVOID lpvThreadParm)
{
    __try {
        // Call the programmed start address; its return value becomes the exit code
        DWORD dwThreadExitCode = lpStartAddr(lpvThreadParm);
        ExitThread(dwThreadExitCode);
    }
    __except (UnhandledExceptionFilter(GetExceptionInformation())) {
        // Runs only if the filter decides to execute the handler
        ExitProcess(GetExceptionCode());
    }
}

Windows Unhandled Exception Filter

if process has a debugger attached
    return EXCEPTION_CONTINUE_SEARCH
if AUTO=0 {                      // run debugger automatically?
    Display message box;         // no - ask user what to do
    if (clicked OK)
        ExitProcess();
}
// either AUTO=1, or (AUTO=0 and user clicked CANCEL),
// so run debugger
GetProfileString("AEdebug", "debugger", ...);
hEvent = CreateEvent( ... );
hProcess = CreateProcess(...);   // Create debugger - pass process id, event to signal
WaitForMultipleObjects( [hEvent, hProcess] );
return EXCEPTION_CONTINUE_SEARCH;

Windows Unhandled Exception Filter

Implication: you can connect a debugger (VC++ or WinDbg) to a running process

– C:\> msdev -p pid

Process Crashes (Windows 2000)

Registry defines behavior for unhandled exceptions

– HKLM\Software\Microsoft\Windows NT\CurrentVersion\AeDebug

– Debugger=filespec of debugger to run on app crash

– Auto: 1 = run debugger immediately, 0 = ask user first

Process Crashes (Windows 2000)

Default on retail system is Auto=1; Debugger=DRWTSN32.EXE

Default with VC++ is Auto=0, Debugger=MSDEV.EXE

Process Crashes (XP & Server 2003)

On XP & Server 2003, when an unhandled exception occurs:

– System first runs DWWIN.EXE; DWWIN creates a process microdump and XML file and offers the option to send the error report

– Then runs debugger (default is Drwtsn32.exe)

Windows Error Reporting

Configurable with System Properties->Advanced->Error Reporting

– HKLM\SOFTWARE\Microsoft\PCHealth\ErrorReporting

Configurable with group policies

– HKLM\SOFTWARE\Policies\Microsoft\PCHealth

Scheduling Criteria

CPU utilization – keep the CPU as busy as possible

Throughput – # of processes/threads that complete their execution per time unit

Turnaround time – amount of time to execute a particular process/thread

Waiting time – amount of time a process/thread has been waiting in the ready queue

Response time – amount of time from when a request is submitted until the first response is produced, not until the output completes (i.e., the hourglass)
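To make the criteria concrete, the sketch below computes average waiting time, average turnaround time, and throughput for a batch of CPU bursts. It rests on simplifying assumptions that are not part of the slides: a single CPU, first-come-first-served order, all threads arriving at time 0, and no preemption or I/O. `SchedMetrics` and `fcfs_metrics` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Aggregate metrics for one scheduling run. */
typedef struct {
    double avg_waiting;     /* mean time spent waiting in the ready queue */
    double avg_turnaround;  /* mean time from submission to completion */
    double throughput;      /* completed threads per time unit */
} SchedMetrics;

/*
 * burst[i] is the CPU time thread i needs. Threads run in array order,
 * one after another, and all are assumed to arrive at time 0.
 */
SchedMetrics fcfs_metrics(const double *burst, size_t n) {
    assert(burst != NULL || n == 0);
    SchedMetrics m = {0.0, 0.0, 0.0};
    double clock = 0.0, wait_sum = 0.0, turnaround_sum = 0.0;

    for (size_t i = 0; i < n; i++) {
        wait_sum += clock;        /* thread i waited until now */
        clock += burst[i];        /* then ran to completion */
        turnaround_sum += clock;  /* and finished at this time */
    }
    if (n > 0 && clock > 0.0) {
        m.avg_waiting = wait_sum / (double)n;
        m.avg_turnaround = turnaround_sum / (double)n;
        m.throughput = (double)n / clock;
    }
    return m;
}
```

With bursts of 24, 3, and 3 time units the waits are 0, 24, and 27, so the average waiting time is 17; running the short bursts first would lower it, which is why ordering policy matters for these criteria.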

Windows Scheduler

Priority-driven, preemptive scheduling system

Highest-priority runnable thread always runs

A thread runs for an amount of time called a quantum

No single scheduler – event-based scheduling code spread across the kernel

Windows Scheduler

Dispatcher routines triggered by the following events:

– Thread becomes ready for execution

– Thread leaves running state (quantum expires, wait state)

– Thread’s priority changes (system call/NT activity)

– Processor affinity of a running thread changes


Windows Scheduling Principles

32 priority levels

Threads within same priority are scheduled Round-Robin

Non-real-time priorities are adjusted dynamically

– Priority elevation in response to certain I/O and dispatch events

– Quantum stretching to optimize responsiveness

Real-time priorities are assigned statically to threads


Scheduling

Multiple threads may be ready to run

“Who gets to use the CPU?”

From Windows API point of view:

Processes are given a priority class upon creation

– Idle, Normal, High, Realtime

– Windows 2000 added “Above normal” and “Below normal”

Threads have a relative priority within the class

– Idle, Lowest, Below_Normal, Normal, Above_Normal, Highest, and Time_Critical


Scheduling

From the kernel’s view:

– Threads have priorities 0 through 31

– Threads are scheduled, not processes

– Priority class is not used to make scheduling decisions

Windows scheduling-related APIs:

– Get/SetPriorityClass

– Get/SetThreadPriority

– Get/SetProcessAffinityMask

– SetThreadAffinityMask

– SetThreadIdealProcessor

– Suspend/ResumeThread


Kernel: Thread Priority Levels

[Figure: thread priority levels, highest to lowest]

– 16-31: 16 “real-time” levels

– 1-15: 15 variable levels

– 0: used by zero page thread

– i: used by idle thread(s)


Windows vs. NT Kernel Priorities

Base priority for each Win32 process class (columns) and Win32 thread priority (rows):

Thread priority   Realtime   High   AboveNormal   Normal   BelowNormal   Idle
Time-critical        31       15        15          15         15         15
Highest              26       15        12          10          8          6
Above-normal         25       14        11           9          7          5
Normal               24       13        10           8          6          4
Below-normal         23       12         9           7          5          3
Lowest               22       11         8           6          4          2
Idle                 16        1         1           1          1          1
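The mapping can be expressed as a simple lookup. The sketch below copies the values from the table; the names `CLASSES`, `BASE_PRIORITY`, and `base_priority` are hypothetical and are not part of the Windows API.

```python
# Columns of the table, left to right.
CLASSES = ["Realtime", "High", "AboveNormal", "Normal", "BelowNormal", "Idle"]

# Rows of the table: Win32 thread priority -> kernel base priority per class.
BASE_PRIORITY = {
    "Time-critical": [31, 15, 15, 15, 15, 15],
    "Highest":       [26, 15, 12, 10, 8, 6],
    "Above-normal":  [25, 14, 11, 9, 7, 5],
    "Normal":        [24, 13, 10, 8, 6, 4],
    "Below-normal":  [23, 12, 9, 7, 5, 3],
    "Lowest":        [22, 11, 8, 6, 4, 2],
    "Idle":          [16, 1, 1, 1, 1, 1],
}


def base_priority(process_class, thread_priority):
    """Return the kernel base priority (0-31) for a class/relative-priority pair."""
    return BASE_PRIORITY[thread_priority][CLASSES.index(process_class)]
```

For example, a Normal thread in a Normal-class process gets base priority 8, and an Idle thread in a Realtime-class process gets 16.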


Windows vs. NT Kernel Priorities

Table shows base priorities

– current or dynamic thread priority may be higher if base <15

Many utilities (such as Process Viewer) show the “dynamic priority” of threads rather than the base

– Performance Monitor can show both

Drivers can set to any value with KeSetPriorityThread


Special Thread Priorities

Idle threads -- one per CPU

When no threads want to run, Idle thread “runs”

– Not a real priority level - appears to have priority zero, but actually runs “below” priority 0

– Provides CPU idle time accounting (unused clock ticks are charged to the idle thread)

Loop:

– Calls HAL to allow for power management

– Processes DPC list; Dispatches to a thread if selected

Server 2003:

– in certain cases, scans per-CPU ready queues for next thread


Special Thread Priorities

Zero page thread -- one per NT system

– Zeroes pages of memory in anticipation of “demand zero” page faults

– Runs at priority zero (lower than any reachable from Windows)

– Part of the “System” process (not a complete process)


Thread Scheduling Priorities vs. Interrupt Request Levels (IRQLs)

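[Figure: IRQLs, highest to lowest, with thread priorities below them]

– 31: High

– 30: Power fail

– 29: Interprocessor Interrupt

– 28: Clock

– Device n ... Device 1

– 2: Dispatch/DPC

– 1: APC

– 0: Passive_Level

Hardware interrupts: Device 1 through High

Software interrupts: APC and Dispatch/DPC

Thread priorities 0-31 all apply at Passive_Level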


Single Processor Thread Scheduling

Priority driven, preemptive

– 32 queues (FIFO lists) of “ready” threads

– UP: highest priority thread always runs

– MP: One of the highest priority runnable threads will be running somewhere

– No attempt to share processor(s) “fairly” among processes, only among threads

Time-sliced, round-robin within a priority level


Single Processor Thread Scheduling

Event-driven:

– no guaranteed execution period before preemption

– When a thread becomes Ready, it either runs immediately or is inserted at the tail of the Ready queue for its current (dynamic) priority


Thread Scheduling

No central scheduler!

– there is no always-instantiated routine called “scheduler”

The “code that does scheduling” is not a thread

Scheduling routines are simply called whenever events occur that change the Ready state of a thread


Thread Scheduling

Things that cause scheduling events include:

– interval timer interrupts (for quantum end)

– interval timer interrupts (for timed wait completion)

– other hardware interrupts (for I/O wait completion)

– one thread changes the state of a waitable object upon which other thread(s) are waiting

– a thread waits on one or more dispatcher objects

– a thread priority is changed


Thread Scheduling

Based on doubly-linked lists (queues) of Ready threads

– Nothing that takes “order-n time” for n threads


Scheduling Data Structures

[Figure: scheduling data structures]

– Process: default base priority, default processor affinity, default quantum

– Thread: base priority, current priority, processor affinity, quantum

– Ready queues: one per priority level, 31 down to 0

– Ready summary (bits 31..0): bitmask for non-empty ready queues

– Idle summary (bits 31..0): bitmask for idle CPUs
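The ready summary is what lets the dispatcher find the highest non-empty queue without scanning all 32 lists. The sketch below models that idea; the class and method names are hypothetical, and Python's `int.bit_length` stands in for a hardware bit scan.

```python
from collections import deque


class ReadyQueues:
    """Toy model: 32 FIFO ready queues plus a bitmask of the non-empty ones."""

    def __init__(self):
        self.queues = [deque() for _ in range(32)]
        self.ready_summary = 0            # bit p is set when queue p is non-empty

    def insert_tail(self, thread, priority):
        # Normal case: a newly ready thread goes to the tail of its queue.
        self.queues[priority].append(thread)
        self.ready_summary |= 1 << priority

    def insert_head(self, thread, priority):
        # A preempted thread goes back to the head of its queue.
        self.queues[priority].appendleft(thread)
        self.ready_summary |= 1 << priority

    def pop_highest(self):
        """Remove and return (thread, priority) from the highest non-empty queue."""
        if self.ready_summary == 0:
            return None
        priority = self.ready_summary.bit_length() - 1   # index of highest set bit
        thread = self.queues[priority].popleft()
        if not self.queues[priority]:
            self.ready_summary &= ~(1 << priority)
        return thread, priority
```

Every operation touches one queue end and one bit, so none of them depends on the number of threads.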


Scheduling Scenarios

Preemption

– A thread becomes Ready at a higher priority than the running thread

– Lower-priority Running thread is preempted

– Preempted thread goes back to head of its Ready queue

– Action: pick lowest priority thread to preempt

Voluntary switch

– Waiting on a dispatcher object

– Termination

– Explicit lowering of priority

– Action: scan for next Ready thread (starting at your priority and going down)


Scheduling Scenarios

Running thread experiences quantum end

– Priority is decremented unless already at base priority

– Thread goes to tail of ready queue for its new priority

– May continue running if no equal or higher-priority threads are Ready

– Action: pick next thread at same priority level


Scheduling Scenarios: Preemption

[Figure: ready queues for priority levels 18 down to 13, showing Running and Ready threads and a thread arriving from the Wait state]

Preemption is strictly event-driven

– does not wait for the next clock tick

– no guaranteed execution period before preemption

– threads in kernel mode may be preempted (unless they raise IRQL to >= 2)


Scheduling Scenarios: Ready after Wait

[Figure: ready queues for priority levels 18 down to 13, showing Running and Ready threads and a thread arriving from the Wait state]

If newly-ready thread is no higher than running thread…

– it is put at tail of ready queue for its current priority

– If priority >=14 quantum is reset (t.b.d.)

– If priority <14 and you’re about to be boosted and didn’t already have a boost, quantum is set to process quantum - 1


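Scheduling Scenarios: Voluntary Switch

[Figure: ready queues for priority levels 18 down to 13, showing the Running thread moving to the Waiting state and the remaining Ready threads]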

When the running thread gives up the CPU…

– Schedule the thread at head of next non-empty “ready” queue


Scheduling Scenarios: Quantum End (“time-slicing”)

When the running thread exhausts its CPU quantum, it goes to the end of its ready queue

Applies to both real-time and dynamic priority threads, user and kernel mode

– Quantums can be disabled for a thread by a kernel function

Default quantum on Professional is 2 clock ticks, 12 on Server

– standard clock tick is 10 msec;

– might be 15 msec on some MP Pentium systems

if no other ready threads at that priority, same thread continues running (just gets new quantum)

if running at boosted priority, priority decays by one at quantum end (described later)


Scheduling Scenarios: Quantum End (“time-slicing”)

[Figure: ready queues for priority levels 18 down to 13, showing Running and Ready threads]


Basic Thread Scheduling States

[Figure: state diagram with Ready (1), Running (2), and Waiting (5)]

– Running to Waiting: voluntary switch

– Running to Ready: preemption, quantum end


Watching Scheduling

CPUSTRES.EXE - Creating a Test Case

Run: cpustres.exe (Resource Kit)


Watching the Scheduler: Performance Monitor - Threads Object

[Screen snapshot from: Programs | Admin. Tools | Performance Monitor]

Select “Add to Chart”, and Object: Thread. Use Ctrl-leftClick to select multiple items in a selection box


Watching the Scheduler: Performance Monitor - Options | Chart

[Screen snapshot from: Performance Monitor, Options menu | Chart command]

Set chart maximum vertical scale to 16

Set update interval to 0.1 seconds or less


Watching the Scheduler: Performance Monitor

[Screen snapshot from: PerfMon main window, setup from previous slide]

Thread states are indicated by numbers (see thread state transition diagram on previous slide, or Perfmon Explain display for Thread State counter)

– 5 = waiting

– 2 = running

– 1 = ready


Priority Adjustments

Dynamic priority adjustments (boost and decay) are applied to threads in “dynamic” classes

– Threads with base priorities 1-15 (technically, 1 through 14)

– Disable if desired with SetThreadPriorityBoost or SetProcessPriorityBoost

Five types:

– I/O completion

– Wait completion on events or semaphores

– When threads in the foreground process complete a wait

– When GUI threads wake up for windows input

– For CPU starvation avoidance


Priority Adjustments

No automatic adjustments in real-time class (16 or above)

Real time here really means “system won’t change the relative priorities of your real-time threads”

Hence, scheduling is predictable with respect to other “real-time” threads (but not for absolute latency)


Priority Boosting

To favor I/O intense threads:

After an I/O: specified by device driver

– IoCompleteRequest( Irp, PriorityBoost )

Common boost values (see NTDDK.H)

– 1: disk, CD-ROM, parallel, video

– 2: serial, network, named pipe, mailslot

– 6: keyboard or mouse

– 8: sound


Priority Boosting

Other cases discussed in WIN Scheduling Internals Section

– After a wait on executive event or semaphore

– After any wait on a dispatcher object by a thread in the foreground process

– GUI threads that wake up to process windowing input (e.g. windows messages) get a boost of 2


Thread Priority Boost and Decay

Behavior of these boosts:

– Applied to thread’s base priority; will not take you above priority 15

– After a boost, you get one quantum; then decays 1 level, runs another quantum
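The boost-and-decay rule can be sketched in a few lines. This is a simplified model under the rules stated on this slide (boost added to the base priority, capped at 15, one level of decay per quantum); the function names are hypothetical.

```python
def boosted_priority(base, boost):
    """Priority right after a boost; real-time threads (16 and above) are never boosted."""
    if base >= 16:
        return base
    return min(base + boost, 15)


def decay_schedule(base, boost):
    """Priority used for each successive quantum after one boost, until back at base."""
    current = boosted_priority(base, boost)
    schedule = [current]
    while current > base:
        current -= 1               # decays one level at each quantum end
        schedule.append(current)
    return schedule
```

For example, a base-priority-8 thread given a boost of 2 runs one quantum at 10, one at 9, and then continues at 8.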


Thread Priority Boost and Decay

[Figure: thread priority versus time. The thread runs at base priority, waits, is boosted upon wait completion, runs, decays one level at each quantum end, may be preempted before quantum end, and ends up running round-robin at base priority]


Thread Scheduling States (2000, XP)

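[Figure: state diagram with Init (0), Ready (1), Running (2), Standby (3), Terminate (4), Waiting (5), and Transition (6)]

– Ready = thread eligible to be scheduled to run

– Standby = thread is selected to run on CPU

– Running to Waiting: voluntary switch

– Running to Ready: preemption, quantum end

– Standby to Ready: preempt

– Waiting to Transition: wait resolved after kernel stack made pageable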


Other Thread States

Transition

– Thread was in a wait entered from user mode for 12 seconds or more

– System was short on physical memory

– Balance set manager (t.b.d.) marked the thread’s kernel stack as pageable (preparatory to “outswapping” the thread’s process)

– Later, the thread’s wait was satisfied, but...

– ...Thread can’t become Ready until the system allocates a nonpageable kernel stack; it is in the “transition” state until then

Initiate

– Thread is “under construction” and can’t run yet

Standby

– One processor has selected a thread for execution on another processor

Terminate

– Thread has executed its last code, but can’t be deleted until all handles and references to it are closed (object manager)


Scheduling Scenarios: Quantum Details

Quantum internally stored as “3 * number of clock ticks”

– Default quantum is 6 on Professional, 36 on Server

Thread->Quantum field is decremented by 3 on every clock tick

Process and thread objects have a Quantum field

– Process quantum is simply used to initialize thread quantum for all threads in the process

Quantum decremented by 1 when you come out of a wait

– So that threads that get boosted after I/O completion won't keep running without ever experiencing quantum end

– Prevents I/O bound threads from getting unfair preference over CPU bound threads


Scheduling Scenarios: Quantum Details

When Thread->Quantum reaches zero (or less than zero):

– you’ve experienced quantum end

– Thread->Quantum = Process->Quantum; // restore quantum

– for dynamic-priority threads, this is the only thing that restores the quantum

– for real-time threads, quantum is also restored upon preemption

Interval timer interrupts when previous IRQL >= 2:

– are not charged to the current thread’s “privileged” time

– but do cause the thread “remaining quantum” counter to be decremented
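The quantum bookkeeping from the last two slides can be modeled as a small counter. This is a sketch of the stated rules only (3 units per clock tick, 1 unit per completed wait, reset from the process quantum at quantum end); the class name and methods are hypothetical.

```python
class ThreadQuantum:
    """Toy model of Thread->Quantum, measured in units of one third of a clock tick."""

    def __init__(self, process_quantum=6):    # 6 = 2 ticks, the Professional default
        self.process_quantum = process_quantum
        self.quantum = process_quantum

    def clock_tick(self):
        return self._charge(3)                 # each clock tick costs 3 units

    def wait_completed(self):
        return self._charge(1)                 # coming out of a wait costs 1 unit

    def _charge(self, units):
        """Deduct units; return True when this charge caused quantum end."""
        self.quantum -= units
        if self.quantum <= 0:
            self.quantum = self.process_quantum   # restore quantum
            return True
        return False
```

With the default of 6 units, a thread that never waits reaches quantum end on its second clock tick, while a thread that keeps waiting between ticks still reaches quantum end after six waits.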


Quantum Stretching

Favoring foreground applications

If normal-priority process owns the foreground window, its threads may be given longer quantum

– Set by Control Panel / System applet / Performance tab

– Stored in …\System\CurrentControlSet\Control\PriorityControl\Win32PrioritySeparation = 0, 1, or 2

– New behavior with NT 4.0; formerly implemented via priority shift


Quantum Stretching

[Screen snapshot from: Control Panel | System | Performance tab]


Quantum Stretching

Resulting quantum:

– “Maximum” = 6 ticks

– (middle) = 4 ticks

– “None” = 2 ticks

Quantum stretching does not happen on Server

– Quantum on Server is always 12 ticks

[Figure: ready queue at priority 8 with Running and Ready threads]


Quantum Selection

As of Windows 2000, can choose short or long quantums (e.g. for Terminal Services)

– NT Server 4.0 was always the same, regardless of slider bar

[Screen snapshots for Windows 2000 and XP from: Control Panel | System | Advanced tab | Performance]


Quantum Control

Finer grained quantum control can be achieved by modifying

– HKLM\System\CurrentControlSet\Control\PriorityControl\Win32PrioritySeparation

– 6 bit value made of three 2-bit fields:

– Bits 5-4: Short vs. Long

– Bits 3-2: Variable vs. Fixed

– Bits 1-0: Quantum Boost


Quantum Control

Short vs. Long

– 0,3: default (short for Pro, long for Server)

– 1: long

– 2: short

Variable vs. Fixed

– 0,3: default (yes for Pro, no for Server)

– 1: yes

– 2: no

Quantum Boost

– 0: fixed (overrides above setting)

– 1: double quantum of foreground threads

– 2,3: triple quantum of foreground threads
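A small decoder makes the field layout concrete. The sketch assumes the usual layout of the 6-bit value (Short vs. Long in bits 5-4, Variable vs. Fixed in bits 3-2, Quantum Boost in bits 1-0); the function and dictionary names are hypothetical.

```python
# "default" means: short and variable on Professional, long and fixed on Server.
SHORT_VS_LONG = {0: "default", 1: "long", 2: "short", 3: "default"}
VARIABLE_VS_FIXED = {0: "default", 1: "variable", 2: "fixed", 3: "default"}
QUANTUM_BOOST = {0: "fixed", 1: "double", 2: "triple", 3: "triple"}


def decode_priority_separation(value):
    """Split a Win32PrioritySeparation value into its three 2-bit fields."""
    if not 0 <= value <= 0x3F:
        raise ValueError("expected a 6-bit value")
    return {
        "short_vs_long": SHORT_VS_LONG[(value >> 4) & 0b11],
        "variable_vs_fixed": VARIABLE_VS_FIXED[(value >> 2) & 0b11],
        "quantum_boost": QUANTUM_BOOST[value & 0b11],
    }
```

For example, 0x26 (binary 10 01 10) decodes to short, variable quantums with the foreground quantum tripled.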


Controlling Quantum with Jobs

Scheduling class   Quantum units
0                   6
1                  12
2                  18
3                  24
4                  30
5                  36
6                  42
7                  48
8                  54
9                  60

If a process is a member of a job, quantum can be adjusted by setting the “Scheduling Class”

– Only applies if process is higher than Idle priority class

– Only applies if system running with fixed quantums (the default on Servers)

Values are 0-9

– 5 is default
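The table follows a simple pattern: each scheduling class adds 6 quantum units, which is 2 clock ticks at 3 units per tick. A minimal sketch, with hypothetical function names:

```python
def job_quantum_units(scheduling_class=5):
    """Quantum units for a job scheduling class (0-9), per the table above."""
    if not 0 <= scheduling_class <= 9:
        raise ValueError("scheduling class must be 0-9")
    return 6 * (scheduling_class + 1)


def job_quantum_ticks(scheduling_class=5):
    """Same quantum expressed in clock ticks (3 quantum units per tick)."""
    return job_quantum_units(scheduling_class) // 3
```

The default class 5 gives 36 units, or 12 ticks, which matches the standard Server quantum.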


Priority Boosting

After an I/O: specified by device driver

– IoCompleteRequest( Irp, PriorityBoost )

Common boost values (see NTDDK.H)

– 1: disk, CD-ROM, parallel, video

– 2: serial, network, named pipe, mailslot

– 6: keyboard or mouse

– 8: sound

After a wait on executive event or semaphore

– Boost value of 1 is used for these objects

– Server 2003: for critical sections and pushlocks:

– Waiting thread is boosted to 1 more than setting thread’s priority (max boost is to 13)

– Setting thread loses boost (lock convoy issue)


Priority Boosting

After any wait on a dispatcher object by a thread in the foreground process:

– Boost value of 2

– XP/2003: boost is lost after one full quantum

– Goal: improve responsiveness of interactive apps

GUI threads that wake up to process windowing input (e.g. windows messages) get a boost of 2

– This is added to the current, not base priority

– Goal: improve responsiveness of interactive apps


Lab: Foreground Priority Boosts

See Book “EXPERIMENT: Watching Foreground Priority Boosts and Decays”, p.351

See Book “EXPERIMENT: Watching Priority Boosts on GUI Threads”, p.353


CPU Starvation Avoidance

Balance Set Manager (system thread) looks for starved threads

– This is a thread, running at priority 16

– Wakes up once per second and examines Ready queues

– Looks for threads that have been Ready for 300 clock ticks (approximately 3 seconds on a 10 ms clock, 4.5 seconds on a 15 ms clock)

– Attempts to resolve “priority inversions” (high priority thread (12 in diagram) waits on something locked by a lower thread (4), which can’t run because of a middle priority CPU-bound thread (7)), but not deterministically (no priority inheritance)

[Figure: thread at priority 12 in Wait, thread at priority 7 in Run, thread at priority 4 in Ready]


CPU Starvation Avoidance

Priority is boosted to 15 (14 prior to NT 4 SP3)

– Quantum is doubled on Win2000/XP and set to 4 on 2003

– At quantum end, returns to previous priority (no gradual decay) and normal quantum

Scans up to 16 Ready threads per priority level each pass

Boosts up to 10 Ready threads per pass

Like all priority boosts, does not apply in the real-time range (priority 16 and above)
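One pass of the starvation scan can be sketched as follows. The limits (300 ticks, 16 threads scanned per level, 10 boosts per pass, boost to 15) come from the slides; the scan order, the data layout, and all names are assumptions, and the sketch does not move boosted threads between queues or adjust their quantum.

```python
from dataclasses import dataclass

STARVATION_TICKS = 300        # Ready this long without running counts as starved
MAX_SCANNED_PER_LEVEL = 16
MAX_BOOSTED_PER_PASS = 10


@dataclass
class ReadyThread:
    name: str
    priority: int
    ready_since_tick: int


def starvation_pass(ready_queues, now_tick):
    """ready_queues: dict of priority -> list of ReadyThread, in queue order.

    Returns the names of the threads boosted to priority 15 during this pass.
    """
    boosted = []
    for priority in range(1, 15):     # dynamic levels below 15; real-time is never boosted
        for thread in ready_queues.get(priority, [])[:MAX_SCANNED_PER_LEVEL]:
            if len(boosted) == MAX_BOOSTED_PER_PASS:
                return boosted
            if now_tick - thread.ready_since_tick >= STARVATION_TICKS:
                thread.priority = 15
                boosted.append(thread.name)
    return boosted
```

Because at most 10 threads are boosted per pass, a large backlog of starved threads is worked off over several one-second passes rather than all at once.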


Lab: CPU Starvation Resolution

See Book EXPERIMENT: Watching Priority Boosts for CPU Starvation, p.355

– CpuStres with two compute-bound threads (“maximum” activity level)

– One is at lower priority than the other

See Book EXPERIMENT: “Listening to Priority Boosting”, p.357


Multiprocessor Scheduling

Threads can run on any CPU, unless specified otherwise

– Tries to keep threads on same CPU (“soft affinity”)

– Setting of which CPUs a thread will run on is called “hard affinity”

Fully distributed (no “master processor”)

– Any processor can interrupt another processor to schedule a thread

Scheduling database:

– Pre-Windows Server 2003: single system-wide list of ready queues

– Windows Server 2003: per-CPU ready queues


Hard Affinity

Affinity is a bit mask where each bit corresponds to a CPU number

– Hard Affinity specifies where a thread is permitted to run; defaults to all CPUs

– Thread affinity mask must be subset of process affinity mask, which in turn must be a subset of the active processor mask
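The subset rule is a pair of bitwise checks. The sketch below is illustrative only: the function names are hypothetical, and rejecting an empty thread mask is an added assumption.

```python
def is_valid_thread_affinity(thread_mask, process_mask, active_mask):
    """True when thread mask is a subset of process mask, itself a subset of active CPUs."""
    return (
        thread_mask != 0                         # assumption: an empty mask is rejected
        and thread_mask & ~process_mask == 0     # no CPU outside the process mask
        and process_mask & ~active_mask == 0     # no CPU outside the active processors
    )


def cpus_in_mask(mask):
    """List the CPU numbers whose bits are set in an affinity mask."""
    return [cpu for cpu in range(mask.bit_length()) if (mask >> cpu) & 1]
```

For example, with four active CPUs (mask 1111) and a process restricted to CPUs 0 and 1 (mask 0011), a thread mask of 0010 is valid but 0100 is not.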


Hard Affinity

Functions to change:

– SetThreadAffinityMask, SetProcessAffinityMask, SetInformationJobObject

Tools to change:

– Task Manager or Process Explorer: right click on process and choose “Set Affinity”

– Psexec -a


Hard Affinity

Can also set an image affinity mask

– See “Imagecfg” tool in Windows 2000 Server Resource Kit Supplement 1

– E.g. Imagecfg –a 2 xyz.exe will run xyz on CPU 1

Can also set “uniprocessor only”: sets affinity mask to one processor

– Imagecfg –u xyz.exe

– System chooses 1 CPU for the process; rotates round robin at each process creation

– Useful as temporary workaround for multithreaded synchronization bugs that appear on MP systems


Hard Affinity

NOTE: Setting hard affinity can lead to threads’ getting less CPU time than they normally would

– More applicable to large MP systems running dedicated server apps

– Also, OS may in some cases run your thread on CPUs other than your hard affinity setting (flushing DPCs, setting system time)

– Thread “system affinity” vs “user affinity”


Soft Processor Affinity

Every thread has an “ideal processor”

– System selects ideal processor for first thread in process (round robin across CPUs)

– Next thread gets next CPU relative to the process seed

– Can override with:

    DWORD SetThreadIdealProcessor(
        HANDLE hThread,            // handle to the thread to be changed
        DWORD  dwIdealProcessor);  // processor number


Soft Processor Affinity

Hard affinity changes update ideal processor settings

Used in selecting where a thread runs next

For Hyperthreaded systems, new Windows API in Server 2003 to allow apps to optimize

– GetLogicalProcessorInformation

For NUMA systems, new APIs to allow apps to optimize:

– Use GetProcessAffinityMask to get list of processors, then GetNumaProcessorNode to get node # for each CPU

– Or call GetNumaHighestNodeNumber and then GetNumaNodeProcessorMask to get processor #s for each node


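Windows 2000/XP Dispatcher Database

[Figure: dispatcher database]

– Processes, each with its threads (Thread 1 to Thread 4)

– Ready Queues: one per priority level, 31 down to 0

– Ready Summary (bits 31..0)

– MP systems only: Active Processor Mask (bits 31..0) and Idle Summary Mask (bits 31..0)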


Choosing a CPU for a Ready Thread (Windows 2000)

When a thread becomes ready to run (e.g. its wait completes, or it is just beginning execution), need to choose a processor for it to run on

First, it sees if any processors are idle that are in the thread’s hard affinity mask:

– If its “ideal processor” is idle, it runs there

– Else, if the previous processor it ran on is idle, it runs there

– Else if the current processor is idle, it runs there

– Else it picks the highest numbered idle processor in the thread’s affinity mask


Choosing a CPU for a Ready Thread (Windows 2000)

If no processors are idle:

– If the ideal processor is in the thread’s affinity mask, it selects that

– Else if the last processor is in the thread’s affinity mask, it selects that

– Else it picks the highest numbered processor in the thread’s affinity mask

Finally, it compares the priority of the new thread with the priority of the thread running on the processor it selected (if any) to determine whether or not to perform a preemption
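The two-stage choice described on this slide and the previous one can be sketched as one function. This is a model of the stated rules, not kernel code; masks are plain integers, the affinity mask is assumed to be non-zero, and all names are hypothetical.

```python
def choose_cpu(affinity, idle_mask, ideal, previous, current):
    """Pick a CPU for a newly ready thread, following the Windows 2000 rules above."""

    def highest(mask):
        return mask.bit_length() - 1          # highest numbered CPU in the mask

    idle_allowed = affinity & idle_mask       # idle CPUs the thread may run on
    if idle_allowed:
        for cpu in (ideal, previous, current):
            if (idle_allowed >> cpu) & 1:
                return cpu
        return highest(idle_allowed)

    # No idle processor is usable: pick one whose running thread might be preempted.
    for cpu in (ideal, previous):
        if (affinity >> cpu) & 1:
            return cpu
    return highest(affinity)


def should_preempt(new_priority, running_priority):
    """running_priority is None when the chosen CPU is idle."""
    return running_priority is None or new_priority > running_priority
```

After `choose_cpu` returns, `should_preempt` corresponds to the final comparison: the new thread takes the CPU only if that CPU is idle or is running a lower-priority thread.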


Selecting a Thread to Run on a CPU (Windows 2000)

System needs to choose a thread to run on a specific CPU:

– At quantum end

– When a thread enters a wait state

– When a thread removes its current processor from its hard affinity mask

– When a thread exits

Starting with the first thread in the highest priority non-empty ready queue, it scans the queue for the first thread that has the current processor in its hard affinity mask and:

– Ran last on the current processor, or

– Has its ideal processor equal to the current processor, or

– Has been in its Ready queue for 3 or more clock ticks, or

– Has a priority >=24


Selecting a Thread to Run on a CPU (Windows 2000)

If it cannot find such a candidate, it selects the highest priority thread that can run on the current CPU (whose hard affinity includes the current CPU)

– Note: this may mean going to a lower priority ready queue to find a candidate
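The selection rules from this slide and the previous one can be sketched as below. The sketch reads "scans the queue" as scanning only the highest priority non-empty queue before falling back, which is an interpretation; the data layout and all names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    priority: int
    affinity: int      # hard affinity bit mask
    last_cpu: int
    ideal_cpu: int
    ready_ticks: int   # clock ticks spent in the Ready queue so far


def select_thread(ready_queues, cpu):
    """ready_queues: dict of priority -> list of Thread, in queue order."""
    levels = sorted((p for p, q in ready_queues.items() if q), reverse=True)
    if not levels:
        return None

    def can_run(t):
        return bool((t.affinity >> cpu) & 1)

    # Pass 1: in the highest non-empty queue, prefer a thread with a reason to run here.
    for t in ready_queues[levels[0]]:
        if can_run(t) and (
            t.last_cpu == cpu
            or t.ideal_cpu == cpu
            or t.ready_ticks >= 3
            or t.priority >= 24
        ):
            return t

    # Pass 2: otherwise take the highest priority thread allowed on this CPU,
    # which may come from a lower priority queue.
    for level in levels:
        for t in ready_queues[level]:
            if can_run(t):
                return t
    return None
```

The first pass is what gives soft affinity its effect: a thread that last ran elsewhere can be skipped in favor of one further back in the same queue.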


Windows Server 2003 Dispatcher Database

[Figure: dispatcher database with per-CPU structures]

– Processes, each with its threads (Thread 1 to Thread 4)

– CPU 0: Ready Queues (31 down to 0), Ready Summary (bits 31..0), Deferred Ready Queue

– CPU 1: Ready Queues (31 down to 0), Ready Summary (bits 31..0), Deferred Ready Queue


Server 2003 Enhancements

Threads always go into the ready queue of their ideal processor

Instead of locking the dispatcher database to look for a candidate to run, per-CPU ready queue is checked first (first grabs PRCB spinlock)

– If a thread has been selected to run on the CPU, does the context swap

– Else begins scan of other CPUs’ ready queues looking for a thread to run

– This scan is done OUTSIDE the dispatcher lock

– Just acquires CPU PRCB lock


Server 2003 Enhancements

Dispatcher lock still acquired to wait or unwait a thread and/or change state of a dispatcher object

Bottom line: dispatcher lock is now held for a MUCH shorter time


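Thread Scheduling States (Server 2003)

[Figure: state diagram with Init (0), Ready (1), Running (2), Standby (3), Terminate (4), Waiting (5), Transition (6), and DeferredReady (7)]

– Running to Waiting: voluntary switch

– Running to Ready: preemption, quantum end

– Standby to Ready: preempt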


Server 2003 Enhancements

Idle processor selection further refined to:

NUMA system:

– if there are idle CPUs in the node containing the thread’s ideal processor, reduce to that set

Hyperthreaded system:

– if one of the idle processors is a physical processor with all logical processors idle, reduce to that set

Then try to eliminate idle CPUs that are sleeping

If thread ran last on a member of the set, pick that CPU

– Else pick lowest numbered CPU in remaining set


Affinity Collisions

Affinity mask bits are listed as CPU 1, CPU 0:

– Thread A: current priority 4, affinity mask 10 (CPU 1 only)

– Thread B: current priority 8, affinity mask 11 (CPU 1 or CPU 0)

– Thread C: current priority 6, affinity mask 01 (CPU 0 only)

Highest-priority n threads may not be running if thread affinity interferes

NT guarantees the highest-priority thread will be Running

– But lower-priority n-1 Ready threads may not be…

– because scheduler will not move running threads among CPUs

Example: Threads became Ready in order A, B, C
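The collision can be reproduced with a toy two-CPU model. The placement rule used here (take the first idle CPU allowed by the affinity mask, else consider preempting the lowest priority thread on an allowed CPU) is a simplification and an assumption; the point is only that running threads are never moved to another CPU. All names are hypothetical.

```python
def place_threads(threads, n_cpus=2):
    """threads: list of (name, priority, affinity mask), in the order they become Ready."""
    running = {}   # cpu number -> (name, priority)
    ready = []     # names of threads left in the Ready state
    for name, priority, affinity in threads:
        allowed = [cpu for cpu in range(n_cpus) if (affinity >> cpu) & 1]
        idle = [cpu for cpu in allowed if cpu not in running]
        if idle:
            running[idle[0]] = (name, priority)
            continue
        # No idle CPU: preempt only if this thread outranks one running on an allowed CPU.
        victim_cpu = min(allowed, key=lambda cpu: running[cpu][1])
        if priority > running[victim_cpu][1]:
            ready.append(running[victim_cpu][0])
            running[victim_cpu] = (name, priority)
        else:
            ready.append(name)
    return running, ready


running, ready = place_threads([("A", 4, 0b10), ("B", 8, 0b11), ("C", 6, 0b01)])
```

A takes CPU 1, B takes the idle CPU 0, and C (priority 6, CPU 0 only) stays Ready behind B even though the lower priority thread A keeps running on CPU 1.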


Thoughts Change Life