Unit 3: Concurrency Instructor: Hengming Zou, Ph.D. In Pursuit of Absolute Simplicity...
Unit 3: Concurrency
Instructor: Hengming Zou, Ph.D.
In Pursuit of Absolute Simplicity (求于至简，归于永恒)

Outline of Content
3.1 Critical Sections, Semaphores, and Monitors
3.2 Windows Trap Dispatching, Interrupts, Synchronization
3.3 Advanced Windows Synchronization
3.4 Windows APIs for Synchronization and IPC

3.1 Critical Sections, Semaphores, and Monitors
The Critical-Section Problem
Software Solutions
Synchronization Hardware
Semaphores
Synchronization in Windows & Linux

The Critical-Section Problem
n threads all competing to use a shared resource
Each thread has a code segment, called its critical section, in which the shared data is accessed
Problem: ensure that
- when one thread is executing in its critical section, no other thread is allowed to execute in its critical section

Solution to Critical-Section Problem
Mutual Exclusion
- Only one thread at a time is allowed into its CS, among all threads that have a CS for the same resource or shared data
- A thread halted in its non-critical section must not interfere with other threads
Progress
- A thread remains inside its CS for a finite time only
- No assumptions are made concerning the relative speed of the threads
Bounded Waiting
- It must not be possible for a thread requiring access to a critical section to be delayed indefinitely
- When no thread is in a critical section, any thread that requests entry must be permitted to enter without delay

Initial Attempts to Solve Problem
Only 2 threads, T0 and T1
General structure of thread Ti (other thread Tj):
    do {
        enter section
        critical section
        exit section
        remainder section
    } while (1);
Threads may share some common variables to synchronize their actions

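First Attempt: Algorithm 1
Shared variables
- Initialization: int turn = 0;
- turn == i: Ti can enter its critical section
Thread Ti
    do {
        while (turn != i) ;    /* busy-wait until it is our turn */
        critical section
        turn = j;
        remainder section
    } while (1);
Satisfies mutual exclusion but not progress: the threads must strictly alternate, so Ti cannot re-enter while Tj stays in its remainder section
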
Second Attempt: Algorithm 2
Shared variables
- Initialization: int flag[2]; flag[0] = flag[1] = 0;
- flag[i] == 1: Ti is ready to enter its critical section
Thread Ti
    do {
        flag[i] = 1;
        while (flag[j] == 1) ;    /* wait while the other thread is interested */
        critical section
        flag[i] = 0;
        remainder section
    } while (1);
Satisfies mutual exclusion but not the progress requirement: if both threads set their flags at the same time, both wait forever

Algorithm 3 (Peterson's Algorithm - 1981)
Shared variables of algorithms 1 and 2 - initialization:
    int flag[2]; flag[0] = flag[1] = 0;
    int turn = 0;
Thread Ti
    do {
        flag[i] = 1;
        turn = j;
        while ((flag[j] == 1) && turn == j) ;
        critical section
        flag[i] = 0;
        remainder section
    } while (1);
Solves the critical-section problem for two threads

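The slide's pseudocode can be tried out on a modern machine, with one caveat: plain int variables are not enough, because compilers and CPUs may reorder the flag and turn accesses. The sketch below assumes C11 atomics (sequentially consistent by default) and POSIX threads. The names peterson_enter, peterson_exit, peterson_demo and the ITERATIONS constant are illustrative and not from the slides, and sched_yield() is added to the spin loop only so the demo also finishes quickly on a single core.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define ITERATIONS 20000            /* per-thread loop count, chosen for the demo */

static atomic_int flag[2];          /* flag[i] == 1: Ti wants to enter */
static atomic_int turn;             /* which thread must wait on a conflict */
static long shared_counter;         /* the shared data protected by the lock */

static void peterson_enter(int i) {
    int j = 1 - i;
    atomic_store(&flag[i], 1);      /* announce interest */
    atomic_store(&turn, j);         /* give priority to the other thread */
    while (atomic_load(&flag[j]) == 1 && atomic_load(&turn) == j)
        sched_yield();              /* busy-wait, yielding the CPU */
}

static void peterson_exit(int i) {
    atomic_store(&flag[i], 0);
}

static void *peterson_thread(void *arg) {
    int i = *(int *)arg;
    for (int k = 0; k < ITERATIONS; k++) {
        peterson_enter(i);
        shared_counter++;           /* critical section */
        peterson_exit(i);
    }
    return NULL;
}

/* Runs T0 and T1 to completion and returns the final counter value. */
long peterson_demo(void) {
    pthread_t t[2];
    int id[2] = {0, 1};
    shared_counter = 0;
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, peterson_thread, &id[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return shared_counter;
}
```

If mutual exclusion holds, peterson_demo() returns exactly 2 * ITERATIONS; lost updates would show up as a smaller value.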
Dekker's Algorithm (1965)
This is the first correct solution proposed for the two-thread (two-process) case
Originally developed by Dekker in a different context, it was applied to the critical-section problem by Dijkstra
Dekker adds the idea of a favored thread and allows access to either thread when the request is uncontested
When there is a conflict, one thread is favored, and the priority reverses after successful execution of the critical section

Dekker's Algorithm (contd)
Shared variables - initialization:
    int flag[2]; flag[0] = flag[1] = 0;
    int turn = 0;
Thread Ti
    do {
        flag[i] = 1;
        while (flag[j]) {
            if (turn == j) {
                flag[i] = 0;              /* back off: Tj is favored */
                while (turn == j) ;
                flag[i] = 1;
            }
        }
        critical section
        turn = j;
        flag[i] = 0;
        remainder section
    } while (1);

Bakery Algorithm (Lamport 1979)
A solution to the critical-section problem for n threads
Before entering its CS, a thread receives a number
Holder of the smallest number enters the CS
If threads Ti and Tj receive the same number: if i < j, then Ti is served first; else Tj is served first
The numbering scheme generates numbers in monotonically non-decreasing order
- i.e., 1, 1, 1, 2, 3, 3, 3, 4, 4, 5

Bakery Algorithm
Notation: "<" establishes lexicographical order among 2-tuples (ticket #, thread id #)
(a, b) < (c, d) if a < c, or if a == c and b < d
max(a0, ..., an-1) is a number k such that k >= ai for i = 0, ..., n - 1
Shared data
    int choosing[n];
    int number[n];    /* the ticket */
Data structures are initialized to 0

Bakery Algorithm
    do {
        choosing[i] = 1;
        number[i] = max(number[0], number[1], ..., number[n-1]) + 1;
        choosing[i] = 0;
        for (j = 0; j < n; j++) {
            while (choosing[j] == 1) ;
            while ((number[j] != 0) && ((number[j], j) "<" (number[i], i))) ;
        }
        critical section
        number[i] = 0;
        remainder section
    } while (1);

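As a hedged illustration of the pseudocode above, the following sketch implements the bakery algorithm with C11 atomics and POSIX threads. The thread count N, the ROUNDS constant, and the names bakery_enter, bakery_exit, and bakery_demo are assumptions made for the example. The lexicographic "<" on (ticket, thread id) pairs is spelled out inline, and sched_yield() is added to the waits so the demo stays fast when there are fewer cores than threads.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define N 3                         /* number of threads (assumed for the demo) */
#define ROUNDS 5000                 /* critical-section entries per thread */

static atomic_int choosing[N];
static atomic_int number[N];        /* the ticket; 0 means "not interested" */
static long bakery_counter;

static void bakery_enter(int i) {
    atomic_store(&choosing[i], 1);
    int max = 0;
    for (int j = 0; j < N; j++) {
        int t = atomic_load(&number[j]);
        if (t > max) max = t;
    }
    atomic_store(&number[i], max + 1);
    atomic_store(&choosing[i], 0);
    for (int j = 0; j < N; j++) {
        while (atomic_load(&choosing[j]) == 1)
            sched_yield();          /* Tj is still picking a ticket */
        for (;;) {
            int nj = atomic_load(&number[j]);
            int ni = atomic_load(&number[i]);
            /* (number[j], j) "<" (number[i], i) in lexicographic order */
            int j_first = nj < ni || (nj == ni && j < i);
            if (nj == 0 || !j_first)
                break;
            sched_yield();          /* Tj holds a smaller ticket: wait */
        }
    }
}

static void bakery_exit(int i) {
    atomic_store(&number[i], 0);
}

static void *bakery_thread(void *arg) {
    int i = *(int *)arg;
    for (int k = 0; k < ROUNDS; k++) {
        bakery_enter(i);
        bakery_counter++;           /* critical section */
        bakery_exit(i);
    }
    return NULL;
}

/* Runs N threads and returns the final counter value. */
long bakery_demo(void) {
    pthread_t t[N];
    int id[N];
    bakery_counter = 0;
    for (int i = 0; i < N; i++) {
        id[i] = i;
        pthread_create(&t[i], NULL, bakery_thread, &id[i]);
    }
    for (int i = 0; i < N; i++)
        pthread_join(t[i], NULL);
    return bakery_counter;
}
```

With mutual exclusion intact, bakery_demo() returns N * ROUNDS.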
Mutual Exclusion - Hardware Support
Interrupt Disabling
- Concurrent threads cannot overlap on a uniprocessor
- Thread will run until it performs a system call or an interrupt happens
Special Atomic Machine Instructions
- Test and Set Instruction - read & write a memory location
- Exchange Instruction - swap register and memory location
Problems with Machine-Instruction Approach
- Busy waiting
- Starvation is possible
- Deadlock is possible

Synchronization Hardware
Test and modify the content of a word atomically
    boolean TestAndSet(boolean &target) {
        boolean rv = target;
        target = true;
        return rv;
    }

Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti
    do {
        while (TestAndSet(lock)) ;
        critical section
        lock = false;
        remainder section
    } while (1);

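In portable C, the closest standard counterpart to the TestAndSet primitive is atomic_flag_test_and_set from C11's <stdatomic.h>, which also returns the previous value. The sketch below is one possible mapping, not the slide author's code; the function names tas_try_acquire, tas_acquire, and tas_release are made up for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;    /* clear == unlocked */

/* One TestAndSet attempt: true if this call took the lock. */
bool tas_try_acquire(void) {
    /* Returns the old value; old value false means the lock was free. */
    return !atomic_flag_test_and_set(&lock);
}

/* The slide's entry section: spin until the old value was false. */
void tas_acquire(void) {
    while (atomic_flag_test_and_set(&lock))
        ;   /* busy-wait */
}

/* The slide's exit section: lock = false. */
void tas_release(void) {
    atomic_flag_clear(&lock);
}
```

A thread brackets its critical section with tas_acquire() and tas_release(); tas_try_acquire() shows the single-attempt behavior, failing while the lock is held.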
Synchronization Hardware
Atomically swap two variables
    void Swap(boolean &a, boolean &b) {
        boolean temp = a;
        a = b;
        b = temp;
    }

Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti
    int key;
    do {
        key = 1;
        while (key == 1) Swap(lock, key);
        critical section
        lock = 0;
        remainder section
    } while (1);

Semaphores
Semaphore S: integer variable
can only be accessed via two atomic operations
    wait(S):
        while (S <= 0) ;    /* busy-wait */
        S--;
    signal(S):
        S++;

Critical Section of n Threads
Shared data
    semaphore mutex;    /* initially mutex = 1 */
Thread Ti
    do {
        wait(mutex);
        critical section
        signal(mutex);
        remainder section
    } while (1);

Semaphore Implementation
Semaphores may suspend/resume threads
- Avoid busy waiting
Define a semaphore as a record
    typedef struct {
        int value;
        struct thread *L;
    } semaphore;
Assume two simple operations
- suspend() suspends the thread that invokes it
- resume(T) resumes the execution of a blocked thread T

Implementation
Semaphore operations now defined as
    wait(S):
        S.value--;
        if (S.value < 0) {
            add this thread to S.L;
            suspend();
        }
    signal(S):
        S.value++;
        if (S.value <= 0) {
            remove a thread T from S.L;
            resume(T);
        }

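A blocking semaphore in this spirit can be sketched on top of POSIX threads. Note the swap: instead of the slide's suspend()/resume() and explicit list S.L, this version lets a pthread condition variable hold the waiting threads, and it keeps value non-negative rather than using negative values to count waiters. All names (semaphore_t, sem_wait_sketch, sem_signal_sketch, semaphore_demo) are illustrative.

```c
#include <pthread.h>

typedef struct {
    int value;                      /* never negative in this variant */
    pthread_mutex_t m;              /* makes wait/signal atomic */
    pthread_cond_t nonzero;         /* waiting threads sleep here */
} semaphore_t;

void sem_init_sketch(semaphore_t *s, int initial) {
    s->value = initial;
    pthread_mutex_init(&s->m, NULL);
    pthread_cond_init(&s->nonzero, NULL);
}

void sem_wait_sketch(semaphore_t *s) {
    pthread_mutex_lock(&s->m);
    while (s->value == 0)           /* sleep instead of busy waiting */
        pthread_cond_wait(&s->nonzero, &s->m);
    s->value--;
    pthread_mutex_unlock(&s->m);
}

void sem_signal_sketch(semaphore_t *s) {
    pthread_mutex_lock(&s->m);
    s->value++;
    pthread_cond_signal(&s->nonzero);   /* wake one waiter, if any */
    pthread_mutex_unlock(&s->m);
}

static semaphore_t mutex_sem;
static long sem_counter;

static void *sem_worker(void *arg) {
    (void)arg;
    for (int k = 0; k < 10000; k++) {
        sem_wait_sketch(&mutex_sem);
        sem_counter++;              /* critical section */
        sem_signal_sketch(&mutex_sem);
    }
    return NULL;
}

/* Four threads share one semaphore initialized to 1. */
long semaphore_demo(void) {
    pthread_t t[4];
    sem_counter = 0;
    sem_init_sketch(&mutex_sem, 1);
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, sem_worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return sem_counter;
}
```

semaphore_demo() mirrors the "critical section of n threads" pattern with mutex = 1 and returns 40000 when no update is lost.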
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code:
    Ti              Tj
    A               wait(flag)
    signal(flag)    B

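The ordering pattern above maps directly onto POSIX unnamed semaphores (sem_init, sem_wait, sem_post), assuming a platform that supports them, such as Linux. In this sketch, A and B are stand-ins that append a letter to a trace buffer; order_demo and the thread function names are hypothetical.

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t flag;                  /* initialized to 0 below */
static char trace[3];               /* records the order of A and B */
static int pos;

static void *thread_i(void *arg) {
    (void)arg;
    trace[pos++] = 'A';             /* statement A */
    sem_post(&flag);                /* signal(flag) */
    return NULL;
}

static void *thread_j(void *arg) {
    (void)arg;
    sem_wait(&flag);                /* wait(flag): blocks until A is done */
    trace[pos++] = 'B';             /* statement B */
    return NULL;
}

/* Starts Tj first on purpose; B must still come after A. */
const char *order_demo(void) {
    pthread_t ti, tj;
    pos = 0;
    sem_init(&flag, 0, 0);
    pthread_create(&tj, NULL, thread_j, NULL);
    pthread_create(&ti, NULL, thread_i, NULL);
    pthread_join(ti, NULL);
    pthread_join(tj, NULL);
    sem_destroy(&flag);
    trace[pos] = '\0';
    return trace;
}
```

Because flag starts at 0, Tj cannot pass wait(flag) until Ti has executed A and signaled, regardless of which thread the scheduler runs first.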
Two Types of Semaphores
Counting semaphore
- integer value can range over an unrestricted domain
Binary semaphore
- integer value can range only between 0 and 1
- can be simpler to implement
A counting semaphore S can be implemented using binary semaphores

Deadlock and Starvation
Deadlock
- two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
    T0              T1
    wait(S);        wait(Q);
    wait(Q);        wait(S);
    signal(S);      signal(Q);
    signal(Q);      signal(S);

Deadlock and Starvation
Starvation: indefinite blocking
- A thread may never be removed from the semaphore queue in which it is suspended
Solution
- all code should acquire/release semaphores in the same order

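One common way to enforce a single acquisition order is to sort the locks by address before taking them. The sketch below does this with two pthread mutexes named S and Q after the deadlock slide; the helpers lock_both and unlock_both and the driver ordered_locking_demo are illustrative, and address ordering is just one possible global order.

```c
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t S = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t Q = PTHREAD_MUTEX_INITIALIZER;
static int work_done;               /* protected by holding both locks */

/* Always locks the lower-addressed mutex first, whatever the caller's order. */
static void lock_both(pthread_mutex_t *a, pthread_mutex_t *b) {
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b) {
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

static void *t0(void *arg) {
    (void)arg;
    for (int k = 0; k < 10000; k++) {
        lock_both(&S, &Q);          /* asks for S then Q */
        work_done++;
        unlock_both(&S, &Q);
    }
    return NULL;
}

static void *t1(void *arg) {
    (void)arg;
    for (int k = 0; k < 10000; k++) {
        lock_both(&Q, &S);          /* asks for Q then S, as on the slide */
        work_done++;
        unlock_both(&Q, &S);
    }
    return NULL;
}

/* Returns the total work count; finishing at all shows no deadlock occurred. */
int ordered_locking_demo(void) {
    pthread_t a, b;
    work_done = 0;
    pthread_create(&a, NULL, t0, NULL);
    pthread_create(&b, NULL, t1, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return work_done;
}
```

Although T1 names the locks in the opposite order, both threads end up acquiring them in the same order, so the circular wait from the T0/T1 example cannot form.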
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects, which may act as mutexes and semaphores
Dispatcher objects may also provide events; an event acts much like a condition variable

Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing

Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
- Chapter 7 - Process Synchronization
- Chapter 8 - Deadlocks

3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization

Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3

Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system

Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons:
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
- Scheduled, preempted, etc., like any other threads
3. Interrupts from external devices
- Interrupt dispatcher invokes the interrupt service routine
- ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread

Trap dispatching
[Figure: trap dispatching. Interrupts are routed to the interrupt dispatcher, which calls interrupt service routines; system service calls go to the system service dispatcher, which calls system services; hardware and software exceptions go to the exception dispatcher, which calls exception handlers; virtual address exceptions go to the virtual memory manager's pager.]
Trap: processor's mechanism to capture an executing thread
- Switch from user to kernel mode
- Interrupts: asynchronous
- Exceptions: synchronous
Interrupt Dispatching
[Figure: an interrupt arrives while user- or kernel-mode code is running; the interrupt dispatch routine and the ISR run in kernel mode.]
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch

Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL

Interrupt Precedence via IRQLs (x86)
[Figure: x86 IRQL levels, highest first]
    31  High                          (hardware interrupts)
    30  Power fail
    29  Interprocessor interrupt
    28  Clock
        Profile & Synch (Srv 2003)
        Device 1
    2   Dispatch/DPC                  (deferrable software interrupts)
    1   APC
    0   Passive/Low                   (normal thread execution)

Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86:
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha:
- PAL code (Privileged Architecture Library - Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level

Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected with an IRQL

Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update system's clock, allocation of CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler - Kernprof.exe, Res Kit)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all, normal thread execution

IRQLs on 64-bit Systems
[Figure: IRQL levels on x64 and IA64, highest first]
x64
    15  High/Profile
    14  Interprocessor Interrupt/Power
    13  Clock
    12  Synch (Srv 2003)
        Device n
        ...
        Device 1
    2   Dispatch/DPC
    1   APC
    0   Passive/Low
IA64
    15  High/Profile/Power
    14  Interprocessor Interrupt
    13  Clock
    12  Synch (MP only)
        Device n
        ...
    4   Device 1
    3   Correctable Machine Check
    2   Dispatch/DPC & Synch (UP only)
    1   APC
    0   Passive/Low

Interrupt Prioritization & Delivery
IRQLs are determined as follows:
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device (can be configured with the IntFilter utility in the Resource Kit)
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector

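The two closed-form rules above are simple arithmetic and can be written down directly. The helper names below are hypothetical, and the functions only encode the formulas as stated on the slide; they do not query a real system, and the bucketized x86 MP case has no formula to encode.

```c
/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_x86_up(int irq) {
    return 27 - irq;
}

/* x64 and IA64 rule from the slide: IRQL = IDT vector number / 16
   (integer division, so each block of 16 vectors shares one IRQL). */
int irql_from_vector(int idt_vector) {
    return idt_vector / 16;
}
```

For example, IRQ 0 on an x86 UP system lands at IRQL 27, and IDT vector 0xD1 on x64 falls in the block of vectors that maps to IRQL 13.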
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when kernel is deep within many layers of code
- Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations

Flow of Interrupts
[Figure not recovered]

Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free, or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient

Kernel Synchronization
[Figure: processor A and processor B both contend for the spinlock that guards the DPC queue; the code between acquire and release is the critical section. Each processor runs:]
    do
        acquire_spinlock(DPC)
    until (SUCCESS)
    begin
        remove DPC from queue
    end
    release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue

Queued Spinlocks
Problem: checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires lock; other processors wait on a processor-local flag
- Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the 1st processor's flag is modified
- Exactly one processor is being signaled
- Pre-determined wait order

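The idea of waiting on a processor-local flag is the same one used by the well-known MCS queue lock. The sketch below is an MCS-style lock in C11 atomics, offered as an analogy under that assumption; it is not the Windows kernel's queued spinlock implementation, and the type and function names (qnode, qlock, qlock_acquire, qlock_release) are invented for the example.

```c
#include <stdatomic.h>
#include <stddef.h>

typedef struct qnode {
    _Atomic(struct qnode *) next;   /* successor in the wait queue */
    atomic_int locked;              /* local flag this waiter spins on */
} qnode;

typedef struct {
    _Atomic(qnode *) tail;          /* last waiter, or NULL if lock is free */
} qlock;

void qlock_acquire(qlock *l, qnode *me) {
    atomic_store(&me->next, (qnode *)NULL);
    atomic_store(&me->locked, 1);
    qnode *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);             /* link behind predecessor */
        while (atomic_load(&me->locked))
            ;   /* spin on our own flag only, not on the shared lock word */
    }
}

void qlock_release(qlock *l, qnode *me) {
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        /* No visible successor: try to mark the lock free. */
        if (atomic_compare_exchange_strong(&l->tail, &expected, (qnode *)NULL))
            return;
        /* A waiter swapped itself into tail but has not linked yet. */
        while ((succ = atomic_load(&me->next)) == NULL)
            ;
    }
    atomic_store(&succ->locked, 0);  /* signal exactly one waiter, in queue order */
}
```

Each caller supplies its own qnode (for example, one per thread), which plays the role of the processor-local flag: release touches only the next waiter's node, so waiters are served in the order they queued.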
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU, which allows thread state transitions in parallel

Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once; for "all", all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include optional timeout argument
- Waiting threads consume no CPU time
Waitable objects include:
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g., it's a mutex or auto-reset event), which thread gets released is unpredictable

Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Figure: interaction with thread scheduling. A thread object is created and initialized (Initialized) and then moves among the Ready, Standby, Running, Waiting, Transition, and Terminated states. A thread that waits on an object handle enters the Waiting state; when the object is set to the signaled state, the wait is complete.]

Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
Dispatcher re-schedules new thread: it may preempt the running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later

What signals an object
For each dispatcher object: the system events that change its state, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled to signaled: owning thread releases mutex
- signaled to nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled to signaled: owning thread or other thread releases mutex
- signaled to nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Semaphore
- nonsignaled to signaled: one thread releases the semaphore, freeing a resource
- signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect: kernel resumes one or more waiting threads

What signals an object (contd)
Event
- nonsignaled to signaled: a thread sets the event
- signaled to nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- nonsignaled to signaled: dedicated thread sets one event in the event pair
- signaled to nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes waiting dedicated thread
Timer
- nonsignaled to signaled: timer expires
- signaled to nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- nonsignaled to signaled: thread terminates
- signaled to nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads

Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)

3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects

Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
- See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
[Figure: DPC queue, a queue head linking a chain of DPC objects]

Delivering a DPC
[Figure: the interrupt dispatch table (levels High, Power failure, Dispatch/DPC, APC, Low), the DPC queue, and the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped

Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()

Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
[Figure: a thread object with two APC queues, kernel (K) and user (U), each holding APC objects]

IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting:
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time

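The four accounting rules form a small decision table, which can be sketched as a pure function. Everything here is hypothetical naming (charge_t, clock_tick_charge, and the three parameters); the real clock ISR is kernel code, and this sketch only restates the rules listed above.

```c
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/* Decides which bucket one clock tick is charged to.
   irql:          IRQL that was interrupted by the clock
   in_dpc:        nonzero if a DPC routine was executing
   was_user_mode: nonzero if the thread was running in user mode */
charge_t clock_tick_charge(int irql, int in_dpc, int was_user_mode) {
    if (irql < 2)
        return was_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return CHARGE_INTERRUPT;        /* irql > 2: servicing an interrupt */
}
```

Note that only the first and third outcomes charge a thread; DPC and interrupt time go to system-wide counters instead.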
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)

Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really counts of interrupts & DPCs

Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus, threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column

Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat

Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout, a header with Size, Type, State, and the wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]

Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (header with Size, Type, State, and wait list head, followed by object-type-specific data) and thread objects (each with a WaitBlockList) are linked through wait blocks. Each wait block holds a list entry, Object and Thread pointers, Key, Type, and a next link.]

3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots

Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted, but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
    VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
    VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
    VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
    VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
    BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );

Critical Section Example
    /* counter is global, shared by all threads */
    volatile int counter = 0;
    CRITICAL_SECTION crit;
    InitializeCriticalSection( &crit );
    /* ... main loop in any of the threads */
    while (!done) {
        EnterCriticalSection( &crit );
        __try {
            counter += local_value;
        }
        __finally {
            /* leave exactly once per Enter, even if an exception occurs */
            LeaveCriticalSection( &crit );
        }
    }
    DeleteCriticalSection( &crit );

Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads:
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
    DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
    DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );

Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
- dwTimeout == 0: no wait, check whether object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for first signaled object or all objects
- When waiting for any object, the function returns the index of the first signaled object (as WAIT_OBJECT_0 + index)
Side effects
- Mutexes, auto-reset events, and waitable timers will be reset to non-signaled state after completing wait functions

Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object

Mutexes
    HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
    HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
    BOOL ReleaseMutex( HANDLE hMutex );

Mutex Example
    /* counter is global, shared by all threads */
    volatile int done = 0, counter = 0;
    HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
    /* main loop in any of the threads, ret is local */
    DWORD ret;
    while (!done) {
        ret = WaitForSingleObject( mutex, INFINITE );
        if (ret == WAIT_OBJECT_0)
            counter += local_value;
        else            /* mutex was abandoned */
            break;      /* exit the loop */
        ReleaseMutex( mutex );
    }
    CloseHandle( mutex );

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0

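The blocking versus polling distinction can be seen in a few lines of pthreads code. This is a minimal sketch with made-up wrapper names (lock_blocking, lock_polling, mutex_release), using a statically initialized default mutex.

```c
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

/* Blocking acquire: plays the role of WaitForSingleObject(hMutex, INFINITE).
   Returns 0 on success, as pthread_mutex_lock does. */
int lock_blocking(void) {
    return pthread_mutex_lock(&m);
}

/* Polling acquire: plays the role of WaitForSingleObject(hMutex, 0).
   Returns 1 if the mutex was acquired, 0 if it was already locked. */
int lock_polling(void) {
    return pthread_mutex_trylock(&m) == 0;
}

void mutex_release(void) {
    pthread_mutex_unlock(&m);
}
```

One difference worth keeping in mind when porting: a Windows mutex can be re-acquired by the thread that already owns it, whereas pthread_mutex_trylock() on a default pthread mutex fails while the mutex is locked, even for the owning thread.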
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value
- ReleaseSemaphore() returns the old semaphore count through lpPreviousCount

Semaphores
    HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
    HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
    BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state

Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTE lpsaBOOL fManualReset BOOL fInititalStateLPTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )BOOL ResetEvent( HANDLE hEvent )BOOL PulseEvent( HANDLE hEvent )
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
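Since pthreads has no direct counterpart to a manual-reset event, one common workaround is to pair a condition variable with a flag. The sketch below is one possible emulation under that assumption; ManualEvent and the event_* functions are hypothetical names, not a standard API.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             signaled;   /* stays 1 until event_reset() is called */
} ManualEvent;

void event_init(ManualEvent *e, int initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

void event_set(ManualEvent *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 1;
    pthread_cond_broadcast(&e->cond);   /* wake every waiting thread */
    pthread_mutex_unlock(&e->lock);
}

void event_reset(ManualEvent *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 0;
    pthread_mutex_unlock(&e->lock);
}

void event_wait(ManualEvent *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

int event_is_set(ManualEvent *e)
{
    int v;
    pthread_mutex_lock(&e->lock);
    v = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return v;
}
```

The flag is what makes the event "sticky": a thread that calls event_wait() after event_set() returns immediately, which a bare condition variable would not guarantee.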
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe )
[Diagram: main connects prog1 to prog2 through a pipe]
Half-duplex character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
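For comparison, the same one-way, byte-oriented behaviour can be seen with the POSIX pipe() call. This is a sketch for illustration on a UNIX-like system, not the Windows API; pipe_roundtrip is a hypothetical helper, and the message is assumed to be small enough to fit in the pipe buffer so the write does not block.

```c
#include <string.h>
#include <unistd.h>

/* Writes msg into a fresh pipe and reads it back from the other end.
   Returns the number of bytes read, or -1 on error. */
long pipe_roundtrip(const char *msg, char *out, size_t outsz)
{
    int fd[2];                       /* fd[0] = read end, fd[1] = write end */
    ssize_t n;
    size_t len = strlen(msg);

    if (len > outsz || pipe(fd) != 0)
        return -1;
    if (write(fd[1], msg, len) != (ssize_t)len) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);                    /* reader sees end-of-file after the data */
    n = read(fd[0], out, outsz);
    close(fd[0]);
    return (long)n;
}
```

Data only flows from the write end to the read end; a second pipe would be needed for replies.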
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same pipe name
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa )
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on remote machine (. - local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive - is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage)
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa)
Combines a WriteFile, ReadFile sequence into one call
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut)
Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence into one call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo)
lpo == NULL
- Call will return as soon as there is a client connection
- Returns false if client connected between the CreateNamedPipe() call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it is easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}   /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many comm.)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile() - open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
Locate a server via mailslot
[Diagram: application clients 0 to n act as mailslot servers; the application server acts as the mailslot client. Message is sent periodically]
Mailslot servers (app client 0 ... app client n)
    hMS = CreateMailslot( "\\.\mailslot\status" ... );
    ReadFile(hMS, &ServStat ...);
    /* connect to server */
Mailslot client (app server)
    while (...) {
        Sleep(...);
        hMS = CreateFile( "\\*\mailslot\status" ... );
        WriteFile(hMS, &StatInfo ...);
    }
Creating a mailslot
HANDLE CreateMailslot(LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa)
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0 - immediate return
- MAILSLOT_WAIT_FOREVER - infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life 意念改变生活
Outline of Content
3.1 Critical Sections, Semaphores, and Monitors
3.2 Windows Trap Dispatching, Interrupts, Synchronization
3.3 Advanced Windows Synchronization
3.4 Windows APIs for Synchronization and IPC
3.1 Critical Sections, Semaphores, and Monitors
The Critical-Section Problem
Software Solutions
Synchronization Hardware
Semaphores
Synchronization in Windows & Linux
The Critical-Section Problem
n threads all competing to use a shared resource
Each thread has a code segment, called critical section, in which the shared data is accessed
Problem: Ensure that
- when one thread is executing in its critical section, no other thread is allowed to execute in its critical section
Solution to Critical-Section Problem
Mutual Exclusion
- Only one thread at a time is allowed into its CS, among all threads that have CS for the same resource or shared data
- A thread halted in its non-critical section must not interfere with other threads
Progress
- A thread remains inside CS for a finite time only
- No assumptions concerning relative speed of the threads
Solution to Critical-Section Problem
Bounded Waiting
- It must not be possible for a thread requiring access to a critical section to be delayed indefinitely
- When no thread is in a critical section, any thread that requests entry must be permitted to enter without delay
Initial Attempts to Solve Problem
Only 2 threads, T0 and T1
General structure of thread Ti (other thread Tj)
    do {
        enter section
            critical section
        exit section
            remainder section
    } while (1);
Threads may share some common variables to synchronize their actions
First Attempt: Algorithm 1
Shared variables
- Initialization: int turn = 0;
- turn == i: Ti can enter its critical section
Thread Ti
    do {
        while (turn != i) ;
            critical section
        turn = j;
            remainder section
    } while (1);
Satisfies mutual exclusion, but not progress
Second Attempt: Algorithm 2
Shared variables
- Initialization: int flag[2]; flag[0] = flag[1] = 0;
- flag[i] == 1: Ti is ready to enter its critical section
Thread Ti
    do {
        flag[i] = 1;
        while (flag[j] == 1) ;
            critical section
        flag[i] = 0;
            remainder section
    } while (1);
Satisfies mutual exclusion, but not the progress requirement
Algorithm 3 (Peterson's Algorithm - 1981)
Shared variables of algorithms 1 and 2 - initialization:
    int flag[2]; flag[0] = flag[1] = 0;
    int turn = 0;
Thread Ti
    do {
        flag[i] = 1;
        turn = j;
        while ((flag[j] == 1) && turn == j) ;
            critical section
        flag[i] = 0;
            remainder section
    } while (1);
Solves the critical-section problem for two threads
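On real hardware the plain int variables of the pseudocode are not enough, because compilers and CPUs may reorder the loads and stores. The sketch below is one way to run Peterson's algorithm with C11 sequentially consistent atomics and two POSIX threads; the function names, the iteration count and the use of sched_yield() in the wait loop are choices made for this illustration.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static atomic_int flag[2];       /* flag[i] == 1: thread i wants to enter */
static atomic_int turn;          /* which thread must wait on a conflict */
static long shared_counter;      /* protected by the Peterson lock */

static void peterson_lock(int i)
{
    int j = 1 - i;
    atomic_store(&flag[i], 1);
    atomic_store(&turn, j);
    while (atomic_load(&flag[j]) == 1 && atomic_load(&turn) == j)
        sched_yield();           /* still a busy wait, but gives up the CPU */
}

static void peterson_unlock(int i)
{
    atomic_store(&flag[i], 0);
}

static void *worker(void *arg)
{
    int i = *(int *)arg;
    for (int k = 0; k < 20000; k++) {
        peterson_lock(i);
        shared_counter++;        /* critical section */
        peterson_unlock(i);
    }
    return NULL;
}

/* Runs two threads that each add 20000 to the counter; returns the total. */
long peterson_demo(void)
{
    pthread_t t[2];
    int id[2] = {0, 1};

    shared_counter = 0;
    atomic_store(&flag[0], 0);
    atomic_store(&flag[1], 0);
    atomic_store(&turn, 0);
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, &id[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return shared_counter;
}
```

If mutual exclusion holds, no increment is lost and peterson_demo() returns 40000.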
Dekker's Algorithm (1965)
This is the first correct solution proposed for the two-thread (two-process) case
Originally developed by Dekker in a different context, it was applied to the critical section problem by Dijkstra
Dekker adds the idea of a favored thread and allows access to either thread when the request is uncontested
When there is a conflict, one thread is favored, and the priority reverses after successful execution of the critical section
Dekker's Algorithm (contd)
Shared variables - initialization:
    int flag[2]; flag[0] = flag[1] = 0;
    int turn = 0;
Thread Ti
    do {
        flag[i] = 1;
        while (flag[j]) {
            if (turn == j) {
                flag[i] = 0;
                while (turn == j) ;
                flag[i] = 1;
            }
        }
            critical section
        turn = j;
        flag[i] = 0;
            remainder section
    } while (1);
Bakery Algorithm (Lamport 1979)
A solution to the critical section problem for n threads
Before entering its CS, a thread receives a number
Holder of the smallest number enters the CS
If threads Ti and Tj receive the same number: if i < j, then Ti is served first; else Tj is served first
The numbering scheme generates numbers in monotonically non-decreasing order
- i.e., 1, 1, 1, 2, 3, 3, 3, 4, 4, 5
Bakery Algorithm
Notation: "<" establishes lexicographical order among 2-tuples (ticket, thread id)
(a, b) < (c, d) if a < c, or if a == c and b < d
max(a0, ..., an-1) is a number k such that k >= ai for i = 0, ..., n - 1
Shared data
    int choosing[n];
    int number[n];    - the ticket
Data structures are initialized to 0
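The two helpers in this notation translate directly into C. This is a small sketch; ticket_less and max_ticket are hypothetical names chosen for the illustration.

```c
/* (a, b) < (c, d) if a < c, or if a == c and b < d */
int ticket_less(int a, int b, int c, int d)
{
    return a < c || (a == c && b < d);
}

/* Largest ticket currently held; 0 when no thread holds a ticket. */
int max_ticket(const int number[], int n)
{
    int k = 0;
    for (int i = 0; i < n; i++)
        if (number[i] > k)
            k = number[i];
    return k;
}
```

Two threads that draw the same ticket are ordered by thread id, so ticket_less(3, 1, 3, 2) holds while ticket_less(3, 2, 3, 1) does not.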
Bakery Algorithm
    do {
        choosing[i] = 1;
        number[i] = max(number[0], number[1], ..., number[n-1]) + 1;
        choosing[i] = 0;
        for (j = 0; j < n; j++) {
            while (choosing[j] == 1) ;
            while ((number[j] != 0) && ((number[j], j) "<" (number[i], i))) ;
        }
            critical section
        number[i] = 0;
            remainder section
    } while (1);
Mutual Exclusion - Hardware Support
Interrupt Disabling
- Concurrent threads cannot overlap on a uniprocessor
- Thread will run until performing a system call or interrupt happens
Special Atomic Machine Instructions
- Test and Set Instruction - read & write a memory location
- Exchange Instruction - swap register and memory location
Problems with Machine-Instruction Approach
- Busy waiting
- Starvation is possible
- Deadlock is possible
Synchronization Hardware
Test and modify the content of a word atomically
    boolean TestAndSet(boolean &target) {
        boolean rv = target;
        target = true;
        return rv;
    }
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti
    do {
        while (TestAndSet(lock)) ;
            critical section
        lock = false;
            remainder section
    } while (1);
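In C11 the test-and-set primitive is available as atomic_flag_test_and_set(). The sketch below wraps it into a minimal spinlock; the names lock_word, test_and_set, spin_acquire and spin_release are chosen for this illustration.

```c
#include <stdatomic.h>

static atomic_flag lock_word = ATOMIC_FLAG_INIT;   /* clear == unlocked */

/* One atomic step: set the flag and return its previous value. */
int test_and_set(void)
{
    return atomic_flag_test_and_set(&lock_word);
}

void spin_acquire(void)
{
    while (test_and_set())
        ;   /* busy wait until the previous value was "clear" */
}

void spin_release(void)
{
    atomic_flag_clear(&lock_word);
}
```

Only the caller that sees the old value "clear" owns the lock; every other caller sees "set" and keeps spinning until spin_release() runs.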
Synchronization Hardware
Atomically swap two variables
    void Swap(boolean &a, boolean &b) {
        boolean temp = a;
        a = b;
        b = temp;
    }
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti
    int key;
    do {
        key = 1;
        while (key == 1) Swap(lock, key);
            critical section
        lock = 0;
            remainder section
    } while (1);
Semaphores
Semaphore S - integer variable
can only be accessed via two atomic operations
    wait (S):
        while (S <= 0) ;
        S--;
    signal (S):
        S++;
Critical Section of n Threads
Shared data
    semaphore mutex;    initially mutex = 1
Thread Ti
    do {
        wait(mutex);
            critical section
        signal(mutex);
            remainder section
    } while (1);
Semaphore Implementation
Semaphores may suspend/resume threads
- Avoid busy waiting
Define a semaphore as a record
    typedef struct {
        int value;
        struct thread *L;
    } semaphore;
Assume two simple operations
- suspend() suspends the thread that invokes it
- resume(T) resumes the execution of a blocked thread T
Implementation
Semaphore operations now defined as
    wait(S):
        S.value--;
        if (S.value < 0) {
            add this thread to S.L;
            suspend();
        }
    signal(S):
        S.value++;
        if (S.value <= 0) {
            remove a thread T from S.L;
            resume(T);
        }
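The bookkeeping of this blocking implementation can be traced with a single-threaded model in which "suspend" and "resume" are represented by return values. Everything here (Semaphore, sem_wait_model, sem_signal_model, the fixed-size FIFO) is a hypothetical stand-in for the record and the list L described above.

```c
#define MAX_WAITERS 8

typedef struct {
    int value;                 /* negative: -value threads are queued */
    int queue[MAX_WAITERS];    /* FIFO of suspended thread ids (the list L) */
    int head, tail;
} Semaphore;

/* wait(S): returns 1 if the caller may proceed, 0 if it was suspended. */
int sem_wait_model(Semaphore *s, int tid)
{
    s->value--;
    if (s->value < 0) {
        s->queue[s->tail++ % MAX_WAITERS] = tid;   /* add to L, then suspend() */
        return 0;
    }
    return 1;
}

/* signal(S): returns the id of the thread to resume, or -1 if none waits. */
int sem_signal_model(Semaphore *s)
{
    s->value++;
    if (s->value <= 0)
        return s->queue[s->head++ % MAX_WAITERS];  /* remove T, then resume(T) */
    return -1;
}
```

Starting from value = 1, one thread passes and the next two are queued, leaving value at -2: the magnitude of a negative value is the number of suspended threads.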
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code
    Ti                Tj
    A                 wait(flag)
    signal(flag)      B
Two Types of Semaphores
Counting semaphore
- integer value can range over an unrestricted domain
Binary semaphore
- integer value can range only between 0 and 1
- can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
Deadlock and Starvation
Deadlock
- two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
    T0            T1
    wait(S)       wait(Q)
    wait(Q)       wait(S)
    signal(S)     signal(Q)
    signal(Q)     signal(S)
Deadlock and Starvation
Starvation - indefinite blocking
- A thread may never be removed from the semaphore queue in which it is suspended
Solution
- all code should acquire/release semaphores in same order
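One way to enforce a single acquisition order is to sort the locks before taking them. The sketch below uses POSIX mutexes and orders them by address; lock_pair and unlock_pair are hypothetical helpers, and the two mutexes are assumed to be distinct.

```c
#include <pthread.h>
#include <stdint.h>

/* Acquire two distinct mutexes in one global order (here: by address), so
   that lock_pair(a, b) and lock_pair(b, a) take them in the same sequence. */
void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

/* Release order does not matter for deadlock avoidance. */
void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}
```

With this helper, the T0/T1 pattern "wait(S); wait(Q)" versus "wait(Q); wait(S)" becomes the same sequence in both threads, so the circular wait cannot form.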
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events. An event acts much like a condition variable
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
- Chapter 7 - Process Synchronization
- Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
- Scheduled, preempted, etc., like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
- interrupt dispatcher invokes the interrupt service routine
- ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread
Trap dispatching
Trap: processor's mechanism to capture executing thread
- Switch from user to kernel mode
- Interrupts - asynchronous
- Exceptions - synchronous
[Diagram: trap handlers route each kind of trap to its dispatcher]
- Interrupt -> interrupt dispatcher -> interrupt service routines
- System service call -> system service dispatcher -> system services
- HW exceptions, SW exceptions -> exception dispatcher -> exception handlers
- Virtual address exceptions -> virtual memory manager's pager
Interrupt Dispatching
[Diagram: an interrupt arrives while user or kernel mode code is running; the dispatch routine and the ISR run in kernel mode]
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- this masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
Interrupt Precedence via IRQLs (x86)
    IRQL   Level
    31     High
    30     Power fail
    29     Interprocessor Interrupt
    28     Clock
           Profile & Synch (Srv 2003)
           Device 1
    2      Dispatch/DPC
    1      APC
    0      Passive/Low
Levels above Dispatch/DPC: hardware interrupts
Dispatch/DPC and APC: deferrable software interrupts
Passive/Low: normal thread execution
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library - Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected with an IRQL
Predefined IRQLs
High
- used when halting the system (via KeBugCheck())
Power fail
- originated in the NT design document, but has never been used
Inter-processor interrupt
- used to request action from other processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update system's clock, allocation of CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler - Kernprof.exe, Res Kit)
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all, normal thread execution
IRQLs on 64-bit Systems
    IRQL   x64                                IA64
    15     High/Profile                       High/Profile/Power
    14     Interprocessor Interrupt/Power     Interprocessor Interrupt
    13     Clock                              Clock
    12     Synch (Srv 2003)                   Synch (MP only)
    ...    Device n                           Device n
    4      ...                                Device 1
    3      Device 1                           Correctable Machine Check
    2      Dispatch/DPC                       Dispatch/DPC & Synch (UP only)
    1      APC                                APC
    0      Passive/Low                        Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device. Can be configured with IntFilter utility in Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source. Processors are assigned round robin for each interrupt vector
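The two arithmetic mapping rules can be written out as plain functions. This is only a sketch of the arithmetic: the function names are hypothetical, and the x64/IA64 rule is assumed to be the IDT vector number divided by 16 with integer division.

```c
/* x86 uniprocessor rule: IRQL = 27 - IRQ */
int irql_x86_up(int irq)
{
    return 27 - irq;
}

/* x64 and IA64 rule (assumed): IRQL = IDT vector number / 16 */
int irql_from_vector(int vector)
{
    return vector / 16;
}
```

For example, IRQ 8 on an x86 uniprocessor maps to IRQL 19, and vector 0xD1 (209) maps to IRQL 13 under the division rule.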
Software interrupts
Initiating thread dispatching
- DPC allow for scheduling actions when kernel is deep within many layers of code
- Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Sync on MP: use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free, or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
Kernel Synchronization
[Diagram: Processor A and Processor B each run the same critical section on the DPC queue, which is guarded by a spinlock]
    do
        acquire_spinlock(DPC)
    until (SUCCESS)
    begin
        remove DPC from queue
    end
    release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Queued Spinlocks
Problem: Checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
- Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the flag of the first waiting processor in the queue is modified
- Exactly one processor is being signaled
- Pre-determined wait order
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU. Allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks). Doesn't use spinlocks when there is no contention. Used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once. "All": all objects must be in the signalled state concurrently to resolve the wait
- All wait calls include optional timeout argument
- Waiting threads consume no CPU time
Waiting
Waitable objects include
- Events (may be auto-reset or manual reset, may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signalled upon exit or terminate)
- Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object, and only one thread is released (e.g., it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Diagram: thread state machine with the states Initialized, Ready, Standby, Running, Waiting, Transition, Terminated]
Diagram labels
- Create and initialize thread object
- Thread waits on an object handle
- Set object to signaled state
- Wait is complete
- Interaction with thread scheduling
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread - it may preempt running thread if it has lower priority, and issues software interrupt to initiate context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object?
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: Owning thread releases mutex
- signaled -> nonsignaled: Resumed thread acquires mutex
- Effect: Kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: Owning thread or other thread releases mutex
- signaled -> nonsignaled: Resumed thread acquires mutex
- Effect: Kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: One thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: A thread acquires the semaphore. More resources are not available
- Effect: Kernel resumes one or more waiting threads
What signals an object? (contd)
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Event
- nonsignaled -> signaled: A thread sets the event
- signaled -> nonsignaled: Kernel resumes one or more threads
- Effect: Kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: Dedicated thread sets one event in the event pair
- signaled -> nonsignaled: Kernel resumes the other dedicated thread
- Effect: Kernel resumes waiting dedicated thread
Timer
- nonsignaled -> signaled: Timer expires
- signaled -> nonsignaled: A thread (re)initializes the timer
- Effect: Kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: Thread terminates
- signaled -> nonsignaled: A thread reinitializes the thread object
- Effect: Kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU. DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Diagram: queue head -> DPC object -> DPC object -> DPC object]
Delivering a DPC
[Diagram: interrupt dispatch table with levels High, Power failure, ..., Dispatch/DPC, APC, Low; DPC objects waiting in the DPC queue; the dispatcher]
1. Timer expires, kernel queues DPC that will release all waiting threads. Kernel requests SW int.
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate itself (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when the thread is in an "alertable wait"
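The delivery rules for the three APC kinds can be written down as one predicate. The sketch below only restates the bullets above; the type and function names (apc_kind, thread_state, apc_deliverable) are hypothetical, and the real kernel checks more state than this model carries.

```c
#include <stdbool.h>

typedef enum { APC_SPECIAL_KERNEL, APC_ORDINARY_KERNEL, APC_USER } apc_kind;

/* Hypothetical snapshot of the state that decides APC delivery. */
typedef struct {
    int  irql;                  /* current IRQL (0 = passive, 1 = APC level) */
    bool kernel_apcs_disabled;  /* e.g. inside a KeEnterCriticalRegion region */
    bool alertable_wait;        /* thread is blocked in an alertable wait */
} thread_state;

/* Returns true if an APC of the given kind could be delivered now. */
bool apc_deliverable(apc_kind kind, const thread_state *t) {
    switch (kind) {
    case APC_SPECIAL_KERNEL:
        return t->irql < 1;     /* blocked only by IRQL 1 or above */
    case APC_ORDINARY_KERNEL:
        return t->irql == 0 && !t->kernel_apcs_disabled;
    case APC_USER:
        return t->alertable_wait;
    }
    return false;
}
```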
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel (K) and user (U), each linking APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to the thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to the thread's kernel time
- If IRQL > 2, charge to interrupt time
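The four accounting rules amount to a small decision function. The following is a sketch, not the kernel's clock ISR: the enum, the function name clock_tick_charge and the in_dpc and user_mode flags are assumptions introduced for illustration.

```c
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_bucket;

/* Decide which counter one clock tick is charged to.
   irql      - IRQL that was interrupted by the clock
   in_dpc    - nonzero if a DPC routine was running
   user_mode - nonzero if the interrupted thread ran in user mode */
charge_bucket clock_tick_charge(int irql, int in_dpc, int user_mode) {
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```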
IRQLs and CPU Time Accounting
Time spent servicing interrupts is NOT charged to the interrupted thread, so if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread or process
- Process Explorer shows these as separate processes (they are not really processes)
- Context switches for these are really the number of interrupts and DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add the Context Switch Delta column
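A small simulation makes the sampling quirk concrete. It is an illustrative model only: the function ticks_charged and its parameters are hypothetical, and it assumes a thread that runs for run_us microseconds starting offset_us after every clock tick (with 0 <= offset_us < tick_us).

```c
/* Count the clock ticks that fire while the thread is running.
   The thread runs during [k*tick_us + offset_us, k*tick_us + offset_us + run_us)
   for every k; ticks fire at tick_us, 2*tick_us, ... up to total_us. */
long ticks_charged(long tick_us, long offset_us, long run_us, long total_us) {
    long charged = 0;
    for (long t = tick_us; t <= total_us; t += tick_us) {
        if (t < offset_us)
            continue;                       /* thread has not started yet */
        long phase = (t - offset_us) % tick_us;
        if (phase < run_us)
            charged++;                      /* tick landed inside a run */
    }
    return charged;
}
```

With a 10 msec tick, a thread that runs 2 msec starting 1 msec after each tick uses 20% of the CPU yet is never charged, while a thread that runs 6 msec starting 5 msec after each tick is charged for every tick.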
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- Some exist exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "Signaled" vs. "nonsignaled"
- When signaled, a wait on the object is satisfied
- Different object types differ in what changes their state
- Wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it is waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes the argument list position for WaitForMultipleObjects
[Figure: thread objects point to a WaitBlockList; each wait block (List entry, Object, Thread, Key, Type, Next link) is linked into a dispatcher object's wait list head and to the next wait block of the same thread]
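The relationship between wait blocks, their Key and their Type can be sketched as a data structure. This is a simplified model, not the layout in ntddk.h: the names dispatcher_header, wait_block and wait_result are hypothetical, the header is reduced to its signaled state, and the chain is NULL-terminated for brevity.

```c
#include <stdbool.h>
#include <stddef.h>

/* Reduced dispatcher header: only the signaled state is modeled. */
typedef struct {
    bool signaled;
} dispatcher_header;

typedef enum { WAIT_TYPE_ANY, WAIT_TYPE_ALL } wait_type;

/* One wait block per handle passed to the wait call. */
typedef struct wait_block {
    dispatcher_header *object;   /* object being waited for */
    int key;                     /* position in the handle array */
    wait_type type;              /* wait for any or for all */
    struct wait_block *next;     /* next block of the same wait call */
} wait_block;

/* Returns the key of the first signaled object for a wait-any,
   0 for a satisfied wait-all, and -1 if the wait is not satisfied. */
int wait_result(const wait_block *first) {
    const wait_block *b;
    if (first == NULL)
        return -1;
    if (first->type == WAIT_TYPE_ANY) {
        for (b = first; b != NULL; b = b->next)
            if (b->object->signaled)
                return b->key;
        return -1;
    }
    for (b = first; b != NULL; b = b->next)
        if (!b->object->signaled)
            return -1;
    return 0;
}
```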
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section without trying to enter it
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
   EnterCriticalSection( &crit );
   __try {
      counter += local_value;
   }
   __finally {
      /* leave exactly once, even if the guarded code raises an exception */
      LeaveCriticalSection( &crit );
   }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
- dwTimeout == 0: no wait, only check whether the object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to an array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects
- When waiting for any object, the function returns the index of the first signaled object (as WAIT_OBJECT_0 + index)
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to the non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() closes the handle; the mutex object is freed once its last handle is closed
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
   ret = WaitForSingleObject( mutex, INFINITE );
   if (ret == WAIT_OBJECT_0)
      counter += local_value;
   else            /* mutex was abandoned */
      break;       /* exit the loop */
   ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
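For comparison with the Windows mutex example, the same shared-counter pattern can be written with the pthread functions listed above. This is a minimal sketch; the helper names (worker, run_counter_demo, try_increment) and the limit of 16 threads are assumptions made for the example.

```c
#include <pthread.h>

#define MAX_WORKERS 16

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;   /* shared by all threads, guarded by counter_lock */

/* Each worker adds 1 to the shared counter 'iterations' times. */
static void *worker(void *arg) {
    long iterations = *(const long *)arg;
    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&counter_lock);     /* blocks, like an INFINITE wait */
        counter += 1;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs nthreads workers and returns the final counter, or -1 on error. */
long run_counter_demo(int nthreads, long iterations) {
    pthread_t tid[MAX_WORKERS];
    int started = 0;
    if (nthreads < 1 || nthreads > MAX_WORKERS)
        return -1;
    counter = 0;
    while (started < nthreads &&
           pthread_create(&tid[started], NULL, worker, &iterations) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return started == nthreads ? counter : -1;
}

/* Polling variant: returns 1 if the lock was free and the counter was
   incremented, 0 if another thread currently holds the lock. */
int try_increment(void) {
    if (pthread_mutex_trylock(&counter_lock) != 0)
        return 0;
    counter += 1;
    pthread_mutex_unlock(&counter_lock);
    return 1;
}
```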
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any positive value up to the maximum
- ReleaseSemaphore() reports the old semaphore count through lpPreviousCount
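The counting behaviour described above can be modeled portably with a mutex and a condition variable. The sketch below is an illustration of the semantics, not the Windows implementation; counting_sem, csem_init, csem_wait and csem_release are hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonzero;   /* signaled whenever count becomes > 0 */
    long count;
    long max;
} counting_sem;

/* Create a semaphore with an initial and a maximum count. */
bool csem_init(counting_sem *s, long initial, long max) {
    if (max < 1 || initial < 0 || initial > max)
        return false;
    if (pthread_mutex_init(&s->lock, NULL) != 0)
        return false;
    if (pthread_cond_init(&s->nonzero, NULL) != 0) {
        pthread_mutex_destroy(&s->lock);
        return false;
    }
    s->count = initial;
    s->max = max;
    return true;
}

/* Wait: block while the count is 0, then decrease it by 1. */
void csem_wait(counting_sem *s) {
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

/* Release: add n to the count and report the old count.
   Fails, leaving the count unchanged, if the maximum would be exceeded. */
bool csem_release(counting_sem *s, long n, long *previous) {
    bool ok = false;
    pthread_mutex_lock(&s->lock);
    if (n > 0 && n <= s->max - s->count) {
        if (previous != NULL)
            *previous = s->count;
        s->count += n;
        ok = true;
        pthread_cond_broadcast(&s->nonzero);
    }
    pthread_mutex_unlock(&s->lock);
    return ok;
}
```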
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- A manual-reset event can signal several threads simultaneously; it must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- An auto-reset event signals a single thread; the event is reset automatically
- fInitialState == TRUE: create the event in the signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
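Because condition variables keep no state of their own, a manual-reset event is usually emulated by pairing one with a mutex and a flag. The following sketch shows one such emulation; the names manual_event, mev_set, mev_reset, mev_wait and release_waiters_demo are hypothetical, and the static initializer assumes a pthread implementation whose initializer macros may appear inside a struct initializer (as on glibc).

```c
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    bool signaled;           /* the state a condition variable lacks */
} manual_event;

#define MANUAL_EVENT_INIT \
    { PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false }

/* Like SetEvent on a manual-reset event: stays signaled, wakes everyone. */
void mev_set(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->changed);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent: later waiters block again. */
void mev_reset(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Block until the event is signaled; does not reset it. */
void mev_wait(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->changed, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

static manual_event start_gate = MANUAL_EVENT_INIT;
static pthread_mutex_t released_lock = PTHREAD_MUTEX_INITIALIZER;
static int released;

static void *waiter(void *arg) {
    (void)arg;
    mev_wait(&start_gate);
    pthread_mutex_lock(&released_lock);
    released++;
    pthread_mutex_unlock(&released_lock);
    return NULL;
}

/* Start n waiters (at most 8), signal the event once, and return how
   many threads got through; -1 on bad arguments. */
int release_waiters_demo(int n) {
    pthread_t tid[8];
    int started = 0;
    if (n < 1 || n > 8)
        return -1;
    mev_reset(&start_gate);
    released = 0;
    while (started < n &&
           pthread_create(&tid[started], NULL, waiter, NULL) == 0)
        started++;
    mev_set(&start_gate);
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return released;
}
```

The flag is what makes the event "sticky": a thread that calls mev_wait() after mev_set() still passes, which a bare pthread_cond_broadcast() would not guarantee.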
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates prog1 and prog2; prog1's output is connected to prog2's input through a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on a pipe handle will block if the pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable. */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
   fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process. */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE,
      0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
   fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd.)
/* Repeat (symmetrically) for the second process. */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles. */
      0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
   fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete. */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server through the same named pipe, each over its own instance
- Server can respond to a client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine ("." means the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd.)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it is easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic (for writes, up to PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
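The fixed-size-record advice for FIFOs can be illustrated with a short POSIX sketch. It is a single-process demonstration only: the record layout, RECORD_LEN and the function fifo_roundtrip are assumptions made for the example, and a real client and server would live in separate processes.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_LEN 32

/* Fixed-size record: the reader always knows how many bytes to read. */
typedef struct {
    char text[RECORD_LEN];
} record;

/* Create a FIFO at 'path', send one record through it and read it back.
   'out' must hold at least RECORD_LEN bytes. Returns 0 on success. */
int fifo_roundtrip(const char *path, const char *msg, char *out) {
    record req, got;
    int rfd, wfd, rc = -1;

    unlink(path);                          /* remove a stale FIFO, if any */
    if (mkfifo(path, 0600) != 0)
        return -1;

    /* Open the read end non-blocking first so the write-end open
       does not block waiting for a reader. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    wfd = (rfd >= 0) ? open(path, O_WRONLY) : -1;
    if (wfd >= 0) {
        memset(&req, 0, sizeof req);
        strncpy(req.text, msg, RECORD_LEN - 1);
        /* A write of at most PIPE_BUF bytes is atomic. */
        if (write(wfd, &req, sizeof req) == (ssize_t)sizeof req &&
            read(rfd, &got, sizeof got) == (ssize_t)sizeof got) {
            memcpy(out, got.text, RECORD_LEN);
            rc = 0;
        }
        close(wfd);
    }
    if (rfd >= 0)
        close(rfd);
    unlink(path);
    return rc;
}
```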
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
      0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
   fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request. */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out. */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
   printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
      PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
      1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
   printf ("Server is awaiting next request.\n");
   if (!ConnectNamedPipe (hNamedPipe, NULL)
         || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
      fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
   }
   printf ("Request is: %s\n", Request.Record);
   /* Send the file, one line at a time, to the client. */
   fp = fopen (File, "r");
   while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
      WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
   fclose (fp);
   DisconnectNamedPipe (hNamedPipe);
} /* End of server operation. */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
Locate a server via mailslot
Mailslot servers (App client 0 ... App client n), each running:
   hMS = CreateMailslot( "\\.\mailslot\status" );
   ReadFile( hMS, &ServStat );
   /* connect to server */
Mailslot client (App server), running:
   while (...) {
      Sleep(...);
      hMS = CreateFile( "\\.\mailslot\status" );
      WriteFile( hMS, &StatInfo );
   }
Message is sent periodically
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieve handle for local mailslot
- \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
33
Critical Sections Semaphores and Monitors The Critical-Section Problem
Software Solutions
Synchronization Hardware
Semaphores
Synchronization in Windows amp Linux
44
The Critical-Section Problem
n threads all competing to use a shared resource
Each thread has a code segment called critical section in which the shared data is accessed
Problem Ensure that
ndash when one thread is executing in its critical section no other thread is allowed to execute in its critical section
55
Solution to Critical-Section Problem
Mutual Exclusion
ndash Only one thread at a time is allowed into its CS among all threads that have CS for the same resource or shared data
ndash A thread halted in its non-critical section must not interfere with other threads
Progress
ndash A thread remains inside CS for a finite time only
ndash No assumptions concerning relative speed of the threads
66
Solution to Critical-Section Problem
Bounded Waiting
ndash It must no be possible for a thread requiring access to a critical section to be delayed indefinitely
ndash When no thread is in a critical section any thread that requests entry must be permitted to enter without delay
77
Only 2 threads T0 and T1
General structure of thread Ti (other thread Tj)
do
enter section
critical section
exit section
reminder section
while (1)
Threads may share some common variables to synchronize their actions
Initial Attempts to Solve Problem
88
First Attempt Algorithm 1
Shared variables
ndash Initialization int turn = 0
ndash turn == i Ti can enter its critical section
Thread Ti
do
while (turn = i)
critical section
turn = j
reminder section
while (1)
Satisfies mutual exclusion but not progress
99
Second Attempt Algorithm 2
Shared variables
ndash initialization int flag[2] flag[0] = flag[1] = 0
ndash flag[i] == 1 Ti can enter its critical section
Thread Ti
do
flag[i] = 1while (flag[j] == 1)
critical section
flag[i] = 0remainder section
while(1)
Satisfies mutual exclusion not progress requirement
1010
Algorithm 3 (Petersonrsquos Algorithm - 1981)
Shared variables of algorithms 1 and 2 - initialization
int flag[2] flag[0] = flag[1] = 0int turn = 0
Thread Ti
do
flag[i] = 1turn = jwhile ((flag[j] == 1) ampamp turn == j)
critical section
flag[i] = 0
remainder section
while (1)
Solves the critical-section problem for two threads
1111
Dekkerrsquos Algorithm (1965)
This is the first correct solution proposed for the two-thread (two-process) case
Originally developed by Dekker in a different context it was applied to the critical section problem by Dijkstra
Dekker adds the idea of a favored thread and allows access to either thread when the request is uncontested
When there is a conflict one thread is favored and the priority reverses after successful execution of the critical section
1212
Dekkerrsquos Algorithm (contd)
Shared variables - initializationint flag[2] flag[0] = flag[1] = 0int turn = 0
Thread Ti
do flag[i] = 1
while (flag[j] ) if (turn == j)
flag[i] = 0while (turn == j)flag[i] = 1
critical section
turn = jflag[I] = 0
remainder section
while (1)
1313
Bakery Algorithm (Lamport 1979)
A Solution to the Critical Section problem for n threads
Before entering its CS a thread receives a number
Holder of the smallest number enters the CS
If threads Ti and Tj receive the same number if i lt j then Ti is served first else Tj is served first
The numbering scheme generates numbers in monotonically non-decreasing order
ndash ie 1112333445
1414
Bakery Algorithm
Notation ldquoltldquo establishes lexicographical order among 2-tuples (ticket thread id )
(ab) lt (cd) if a lt c or if a == c and b lt d
max (a0hellip an-1) = k | k ai for i = 0hellip n ndash 1
Shared data
int choosing[n]
int number[n] - the ticket
Data structures are initialized to 0
1515
Bakery Algorithm
do
choosing[i] = 1
number[i] = max(number[0]number[1] number[n-1]) + 1
choosing[i] = 0
for (j = 0 j lt n j++)
while (choosing[j] == 1)
while ((number[j] = 0) ampamp ((number[j]j) lsquorsquoltlsquorsquo (number[i]i)))
critical section
number[i] = 0
remainder section
while (1)
1616
Mutual Exclusion - Hardware Support
Interrupt Disabling
ndash Concurrent threads cannot overlap on a uniprocessor
ndash Thread will run until performing a system call or interrupt happens
Special Atomic Machine Instructions
ndash Test and Set Instruction - read amp write a memory location
ndash Exchange Instruction - swap register and memory location
Problems with Machine-Instruction Approach
ndash Busy waiting
ndash Starvation is possible
ndash Deadlock is possible
1717
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean amptarget)
boolean rv = target
target = true
return rv
1818
Shared data ndash boolean lock = false
Thread Ti
do
while (TestAndSet(lock))
critical section
lock = false
remainder section
Mutual Exclusion with Test-and-Set
1919
Synchronization Hardware
Atomically swap two variables
void Swap(boolean ampa boolean ampb)
boolean temp = a
a = b
b = temp
2020
Mutual Exclusion with Swap
Shared data (initialized to 0) int lock = 0
Thread Ti
int key
do
key = 1
while (key == 1) Swap(lockkey)
critical section
lock = 0
remainder section
2121
Semaphores
Semaphore S ndash integer variable
can only be accessed via two atomic operations
wait (S)
while (S lt= 0)S--
signal (S)
S++
2222
Critical Section of n Threads
Shared data
semaphore mutex initially mutex = 1
Thread Ti
do wait(mutex) critical section
signal(mutex) remainder section
while (1)
2323
Semaphore Implementation
Semaphores may suspendresume threads
ndash Avoid busy waiting
Define a semaphore as a record
typedef struct
int value struct thread L semaphore
Assume two simple operations
ndash suspend() suspends the thread that invokes it
ndash resume(T) resumes the execution of a blocked thread T
2424
Implementation
Semaphore operations now defined as wait(S)
Svalue--
if (Svalue lt 0)
add this thread to SLsuspend()
signal(S) Svalue++
if (Svalue lt= 0)
remove a thread T from SLresume(T)
2525
Semaphore as a General Synchronization Tool Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code
Ti Tj
A wait(flag)
signal(flag) B
2626
Two Types of Semaphores
Counting semaphore
ndash integer value can range over an unrestricted domain
Binary semaphore
ndash integer value can range only between 0 and 1
ndash can be simpler to implement
Counting semaphore S can be implemented as a binary semaphore
2727
Deadlock and Starvation
Deadlock ndash
ndash two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0 T1
wait(S) wait(Q)
wait(Q) wait(S)
signal(S) signal(Q)
signal(Q) signal(S)
2828
Deadlock and Starvation
Starvation ndash indefinite blocking
ndash A thread may never be removed from the semaphore queue in which it is suspended
Solution ndash
ndash all code should acquirerelease semaphores in same order
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events An event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking multithreading (including real-time threads) and multiprocessing
3131
Further Reading
Ben-Ari M Principles of Concurrent Programming Prentice Hall 1982
Lamport L The Mutual Exclusion Problem Journal of the ACM April 1986
Abraham Silberschatz Peter B Galvin Operating System Concepts John Wiley amp Sons 6th Ed 2003
ndash Chapter 7 - Process Synchronization
ndash Chapter 8 - Deadlocks
3232
32 Trap Dispatching Interrupts Synchronization Trap and Interrupt dispatching
IRQL levels amp Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
ndash Protects the system from the users
ndash Protects the user (process) from themselves
ndash System is not protected from system
Code regions are tagged ldquono write in any moderdquo
Controls ability to execute privileged instructions
A Windows abstraction
ndash Intel Ring 0 Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
ndash Does not affect scheduling
ndash Thread context includes info about execution mode (along with registers etc)
PerfMon counters
ndash ldquoPrivileged Timerdquo and ldquoUser Timerdquo
ndash 4 levels of granularity thread process processor system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1 Requests from user mode
ndash Via the system service dispatch mechanism
ndash Kernel-mode code runs in the context of the requesting thread
2 Dedicated kernel-mode system threads
ndash Some threads in the system stay in kernel mode at all timesmostly in the ldquoSystemrdquo process
ndash Scheduled preempted etc like any other threads
3636
Getting Into Kernel Mode
3 Interrupts from external devicesndash interrupt dispatcher invokes the interrupt service routine
ndash ISR runs in the context of the interrupted thread so-called ldquoarbitrary thread contextrdquo
ndash ISR often requests the execution of a ldquoDPC routinerdquo which also runs in kernel mode
ndash Time not charged to interrupted thread
3737
Trap dispatching
Interruptdispatcher
Systemservice
dispatcher
Interruptserviceroutines
Interruptserviceroutines
Interruptserviceroutines
System services
System services
System services
Exceptiondispatcher
Exceptionhandlers
Exceptionhandlers
Exceptionhandlers
Virtual memorymanagerlsquos pager
Interrupt
System service call
HW exceptionsSW exceptions
Virtual addressexceptions
Trap processorlsquos mechanism to capture executing thread
ndash Switch from user to kernel mode
ndash Interrupts ndash asynchronous
ndash Exceptions - synchronous
Interrupt dispatch routine
Disable interrupts
Record machine state (trap frame) to allow resume
Mask equal- and lower-IRQL interrupts
Find and call appropriate ISR
Dismiss interrupt
Restore machine state (including mode and enabled interrupts)
Disable interrupts
Record machine state (trap frame) to allow resume
Mask equal- and lower-IRQL interrupts
Find and call appropriate ISR
Dismiss interrupt
Restore machine state (including mode and enabled interrupts)
Tell the device to stop interrupting
Interrogate device state start next operation on device etc
Request a DPC
Return to caller
Tell the device to stop interrupting
Interrogate device state start next operation on device etc
Request a DPC
Return to caller
Interrupt service routine
interrupt
user or kernel mode
codekernel mode
Note no thread or process context switch
Note no thread or process context switch
Interrupt Dispatching
3939
IRQL = Interrupt Request Level
ndash Precedence of the interrupt with respect to other interrupts
ndash Different interrupt sources have different IRQLs
ndash not the same as IRQ
IRQL is also a state of the processor
ndash Servicing an interrupt raises processor IRQL to that interruptrsquos IRQL
ndash this masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL gt= DISPATCH_LEVEL
Interrupt Precedence via IRQLs (x86)
4040
PassiveLowAPC
DispatchDPCDevice 1
Profile amp Synch (Srv 2003)
ClockInterprocessor Interrupt
Power failHigh
normal thread execution
Hardware interrupts
Deferrable software interrupts
012
302928
31
Interrupt Precedence via IRQLs (x86)
4141
Interrupt processing
Interrupt dispatch table (IDT)ndash Links to interrupt service routines
x86ndash Interrupt controller interrupts processor (single line)
ndash Processor queries for interrupt vector uses vector as index to IDT
Alphandash PAL code (Privileged Architecture Library ndash Alpha BIOS) determines interrupt vector calls kernel
ndash Kernel uses vector to index IDT
After ISR execution IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devicesndash Contains dispatch code (initial handler)
ndash Dispatch code calls ISR with interrupt object as parameter(HW cannot pass parameters to ISR)
Connectingdisconnecting interrupt objectsndash Dynamic association between ISR and IDT entry
ndash Loadable device drivers (kernel modules)
ndash Turn onoff ISR
Interrupt objects can synchronize access to ISR datandash Multiple instances of ISR may be active simultaneously (MP machine)
ndash Multiple ISR may be connected with IRQL
4343
Predefined IRQLs
High ndash used when halting the system (via KeBugCheck())
Power fail ndash originated in the NT design document but has never been used
Inter-processor interruptndash used to request action from other processor (dispatching a thread updating a processors TLB system shutdown system crash)
Clockndash Used to update systemlsquos clock allocation of CPU time to threads
Profilendash Used for kernel profiling (see Kernel profiler ndash Kernprofexe Res Kit)
4444
Predefined IRQLs (contd)
Device
ndash Used to prioritize device interrupts
DPCdispatch and APC
ndash Software interrupts that kernel and device drivers generate
Passive
ndash No interrupt level at all normal thread execution
4545
IRQLs on 64-bit Systems
PassiveLowAPC
DispatchDPCDevice 1
Device n
Synch (Srv 2003)Clock
Interprocessor InterruptPower
HighProfile
012
1413
15
34
PassiveLowAPC
DispatchDPC amp Synch (UP only)Correctable Machine Check
Device 1
Device nSynch (MP only)
ClockInterprocessor Interrupt
HighProfilePower
x64 IA64
12

Interrupt Prioritization & Delivery
IRQLs are determined as follows:
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device. Can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the I/O APIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source. Processors are assigned round robin for each interrupt vector
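The two fixed IRQL rules above are plain arithmetic. A minimal sketch in C follows; the function names are hypothetical, and the range checks (16 legacy IRQ lines, 8-bit vectors) are assumptions added for illustration, not something the slides state.

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ.
   Assumption: the 16 legacy interrupt lines, IRQ 0..15. */
static int irql_from_irq_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 15);
    return 27 - irq;
}

/* x64 / IA64 rule from the slide: IRQL = IDT vector number / 16,
   which is the upper four bits of an 8-bit vector number. */
static int irql_from_vector(unsigned vector)
{
    assert(vector <= 0xFF);
    return (int)(vector >> 4);
}
```

Under these rules a lower IRQ number yields a higher IRQL on x86 UP systems, while on x64 and IA64 all sixteen vectors in one block share an IRQL.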

Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when the kernel is deep within many layers of code
- Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts

Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- A spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
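The "data cell plus atomic test-and-modify" idea can be sketched in portable C11. This is an illustration only: sketch_spinlock and the spin_* functions are hypothetical names, and nothing here reflects how the kernel actually implements KSPIN_LOCK.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical type for this sketch; not the kernel's KSPIN_LOCK. */
typedef struct { atomic_bool owned; } sketch_spinlock;

#define SKETCH_SPINLOCK_INIT { false }

/* One atomic test-and-modify step: the exchange writes "owned" and
   returns the previous value. Returns true if the caller now owns the lock. */
static bool spin_try_acquire(sketch_spinlock *l)
{
    return !atomic_exchange_explicit(&l->owned, true, memory_order_acquire);
}

/* Busy-wait until the lock is obtained (one owner at a time). */
static void spin_acquire(sketch_spinlock *l)
{
    while (!spin_try_acquire(l)) {
        /* spin; a production lock would add a CPU pause hint here */
    }
}

/* Release the lock; releasing a lock that is not held is a bug. */
static void spin_release(sketch_spinlock *l)
{
    assert(atomic_load_explicit(&l->owned, memory_order_relaxed));
    atomic_store_explicit(&l->owned, false, memory_order_release);
}
```

A caller would bracket the protected data access with spin_acquire() and spin_release(). The single boolean matches the slide's remark that one bit is sufficient.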

Kernel Synchronization
[Figure: Processor A and Processor B each run the sequence below against the same spinlock, which guards the DPC queue]
do acquire_spinlock(DPC) until (SUCCESS)
begin (critical section)
  remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue

Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
The first processor acquires the lock; other processors wait on a processor-local flag
- Thus the busy-wait loop requires no access to the memory bus
When the lock is released, the flag of the first waiting processor is modified
- Exactly one processor is signaled
- Pre-determined wait order
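One well-known way to get this behavior is an MCS-style queue lock, sketched below in C11. It is a swapped-in textbook technique used for illustration; the slides do not say that Windows queued spinlocks are implemented this way, and qnode, qlock, qlock_acquire, and qlock_release are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiting processor; "waiting" is the processor-local flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool waiting;
} qnode;

/* The lock itself only records the tail of the waiter queue. */
typedef struct { _Atomic(qnode *) tail; } qlock;

static void qlock_acquire(qlock *l, qnode *me)
{
    assert(l != NULL && me != NULL);
    atomic_store(&me->next, (qnode *)NULL);
    atomic_store(&me->waiting, true);
    qnode *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);             /* link behind predecessor */
        while (atomic_load(&me->waiting)) {
            /* spin on our own flag only */
        }
    }
}

static void qlock_release(qlock *l, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        /* No visible successor: try to mark the lock free. */
        if (atomic_compare_exchange_strong(&l->tail, &expected, (qnode *)NULL))
            return;
        /* A successor is in the middle of linking itself in; wait for it. */
        while ((succ = atomic_load(&me->next)) == NULL) {
        }
    }
    atomic_store(&succ->waiting, false);           /* signal exactly one waiter */
}
```

Each processor passes its own qnode, so the spin loop reads only that node's flag, and waiters are released in the order in which they joined the queue.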

SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003:
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU. Allows thread state transitions in parallel

SMP Scalability Improvements
XP/2003:
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks). Doesn't use spinlocks when there is no contention. Used for object manager and address windowing extensions (AWE) related locks

Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once. "All": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include an optional timeout argument
- Waiting threads consume no CPU time

Waiting
Waitable objects include:
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)

Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable

Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
Interaction with thread scheduling
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition, and Terminated. Labels: "Create and initialize thread object"; "Thread waits on an object handle"; "Set object to signaled state"; "Wait is complete"]

Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost

Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread - it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later

What signals an object
Dispatcher object | nonsignaled -> signaled | signaled -> nonsignaled | Effect of signaled state on waiting threads
Mutex (kernel mode) | Owning thread releases mutex | Resumed thread acquires mutex | Kernel resumes one waiting thread
Mutex (exported to user mode) | Owning thread or other thread releases mutex | Resumed thread acquires mutex | Kernel resumes one waiting thread
Semaphore | One thread releases the semaphore, freeing a resource | A thread acquires the semaphore; more resources are not available | Kernel resumes one or more waiting threads

What signals an object (contd)
Dispatcher object | nonsignaled -> signaled | signaled -> nonsignaled | Effect of signaled state on waiting threads
Event | A thread sets the event | Kernel resumes one or more threads | Kernel resumes one or more waiting threads
Event pair | Dedicated thread sets one event in the event pair | Kernel resumes the other dedicated thread | Kernel resumes waiting dedicated thread
Timer | Timer expires | A thread (re)initializes the timer | Kernel resumes all waiting threads
Thread | Thread terminates | A thread reinitializes the thread object | Kernel resumes all waiting threads

Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004.
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)

3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects

Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU. DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) is completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
- See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
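The queue-now, run-later pattern described for DPCs can be modeled with a small FIFO of routine/context pairs. The sketch below is a user-mode illustration with hypothetical names (dpc, dpc_queue, dpc_enqueue, dpc_drain, demo_count_call); it does not use or imitate the real kernel DPC API, and IRQL handling is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*dpc_routine)(void *context);

/* One deferred request: the routine to run and its argument. */
typedef struct dpc {
    dpc_routine routine;
    void *context;
    struct dpc *next;
} dpc;

/* One queue per CPU in the slide's model. */
typedef struct { dpc *head, *tail; } dpc_queue;

/* What an ISR would do: append the request and return quickly. */
static void dpc_enqueue(dpc_queue *q, dpc *d)
{
    assert(d->routine != NULL);
    d->next = NULL;
    if (q->tail != NULL) q->tail->next = d; else q->head = d;
    q->tail = d;
}

/* What happens once all higher-priority work is done: run every
   queued routine in FIFO order. Returns how many routines ran. */
static int dpc_drain(dpc_queue *q)
{
    int ran = 0;
    while (q->head != NULL) {
        dpc *d = q->head;
        q->head = d->next;
        if (q->head == NULL) q->tail = NULL;
        d->routine(d->context);
        ran++;
    }
    return ran;
}

/* Demo routine for the usage example: counts invocations. */
static void demo_count_call(void *context)
{
    ++*(int *)context;
}
```

For example, enqueueing two dpc objects that both point at demo_count_call and then calling dpc_drain() runs both routines and leaves the queue empty.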

Deferred Procedure Calls (DPCs)
[Figure: DPC queue - a queue head linked to a chain of DPC objects]

Delivering a DPC
[Figure: interrupt dispatch table (High, Power failure, Dispatch/DPC, APC, Low), the DPC queue, and the dispatcher]
1. Timer expires; the kernel queues a DPC that will release all waiting threads. The kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped

Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
- Permission required for user-mode APCs

Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()

Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented

Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
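The deliverability rules for the three APC flavors can be condensed into one predicate. This is only a restatement of the bullets on these slides in C; the enum, the function name, and the parameter names are hypothetical, and the real kernel logic has more cases than this.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { APC_SPECIAL_KERNEL, APC_ORDINARY_KERNEL, APC_USER_MODE } apc_kind;

/* Returns true if an APC of the given kind could be delivered to a
   thread in the described state, following the slide rules only. */
static bool apc_deliverable(apc_kind kind, int irql,
                            bool kernel_apcs_disabled, bool in_alertable_wait)
{
    assert(irql >= 0);
    switch (kind) {
    case APC_SPECIAL_KERNEL:
        return irql < 1;                          /* blocked at IRQL 1 or above */
    case APC_ORDINARY_KERNEL:
        return irql == 0 && !kernel_apcs_disabled; /* KeEnterCriticalRegion disables */
    case APC_USER_MODE:
        return in_alertable_wait;                 /* e.g. SleepEx with alertable set */
    }
    return false;
}
```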

Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, K (kernel mode) and U (user mode), each linking APC objects]

IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting:
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 & not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
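The four accounting rules form a small decision function. The sketch below restates them in C; the enum and function names are hypothetical, and the user_mode flag is an assumption standing in for however the clock ISR distinguishes user time from kernel time.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* Decide where one clock tick is charged, given the state that the
   clock interrupt found the processor in. */
static charge_target clock_tick_charge(int irql, bool processing_dpc, bool user_mode)
{
    assert(irql >= 0);
    if (irql < 2)
        return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
    if (irql == 2)
        return processing_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return CHARGE_INTERRUPT;   /* IRQL > 2 */
}
```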

IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)

Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs

Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g. one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column

Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat

Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout - header fields Size, Type, State, and Wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]

Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects, each with a WaitBlockList pointing to its chain of wait blocks (fields: List entry, Object, Thread, Key, Type, Next link); each wait block points to a dispatcher object (Size, Type, State, Wait list head, object-type-specific data)]
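The relationship between dispatcher headers, wait blocks, and threads can be sketched as three linked C structs. The field names follow the figure labels, but the types, the singly linked lists, and the helper functions are simplifying assumptions for illustration; the real kernel definitions differ.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct wait_block;

/* Common header of every waitable object (Size, Type, State, Wait list head). */
typedef struct dispatcher_header {
    unsigned char type;
    unsigned char size;
    bool signaled;
    struct wait_block *wait_list_head;  /* wait blocks of threads waiting on this object */
} dispatcher_header;

typedef struct thread_object {
    struct wait_block *wait_block_list; /* wait blocks created by one wait call */
} thread_object;

typedef enum { WAIT_TYPE_ANY, WAIT_TYPE_ALL } wait_type;

typedef struct wait_block {
    struct wait_block *list_entry;      /* next block on the same object's wait list */
    dispatcher_header *object;
    thread_object *thread;
    unsigned short key;                 /* position in the handle array */
    wait_type type;
    struct wait_block *next_link;       /* next block of the same wait call */
} wait_block;

/* Link one wait block to both its object and its waiting thread. */
static void wait_block_attach(wait_block *wb, thread_object *t,
                              dispatcher_header *obj,
                              unsigned short key, wait_type type)
{
    wb->object = obj;
    wb->thread = t;
    wb->key = key;
    wb->type = type;
    wb->list_entry = obj->wait_list_head;
    obj->wait_list_head = wb;
    wb->next_link = t->wait_block_list;
    t->wait_block_list = wb;
}

/* "Any": one signaled object suffices. "All": every object must be signaled. */
static bool wait_is_satisfied(const thread_object *t)
{
    const wait_block *wb = t->wait_block_list;
    assert(wb != NULL);
    bool want_all = (wb->type == WAIT_TYPE_ALL);
    for (; wb != NULL; wb = wb->next_link) {
        if (wb->object->signaled && !want_all) return true;
        if (!wb->object->signaled && want_all) return false;
    }
    return want_all;
}
```

Walking the thread's chain is enough to evaluate a wait, which is the point of chaining all wait blocks from one call to the waiting thread.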

3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots

Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however, the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );

Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* Leave exactly once per Enter, even if the body raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );

Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads:
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );

Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
  dwTimeout == 0 - no wait, check whether object is signaled
  dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles - pointer to array identifying these objects
- bWaitAll - whether to wait for first signaled object or all objects. Function returns index of first signaled object
Side effects
- Mutexes, auto-reset events, and waitable timers will be reset to non-signaled state after completing wait functions
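The "index of first signaled object" is encoded in the DWORD return value as an offset from WAIT_OBJECT_0. The helper below sketches the decoding in portable C. The SK_ constants copy the numeric values of WAIT_OBJECT_0, WAIT_ABANDONED_0, WAIT_TIMEOUT, and WAIT_FAILED so that the sketch builds without the Windows headers; the function and enum names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Numeric values of the corresponding Windows SDK constants. */
#define SK_WAIT_OBJECT_0    0x00000000u
#define SK_WAIT_ABANDONED_0 0x00000080u
#define SK_WAIT_TIMEOUT     0x00000102u
#define SK_WAIT_FAILED      0xFFFFFFFFu

typedef enum { OUTCOME_SIGNALED, OUTCOME_ABANDONED, OUTCOME_TIMEOUT, OUTCOME_FAILED } wait_outcome;

/* Classify a wait return value for a call that waited on `count` handles.
   For wait-any calls, *index receives the array position of the object. */
static wait_outcome decode_wait_result(uint32_t ret, uint32_t count, uint32_t *index)
{
    assert(count >= 1 && count <= 64);   /* MAXIMUM_WAIT_OBJECTS */
    if (ret < SK_WAIT_OBJECT_0 + count) {
        *index = ret - SK_WAIT_OBJECT_0;
        return OUTCOME_SIGNALED;
    }
    if (ret >= SK_WAIT_ABANDONED_0 && ret < SK_WAIT_ABANDONED_0 + count) {
        *index = ret - SK_WAIT_ABANDONED_0;   /* an abandoned mutex */
        return OUTCOME_ABANDONED;
    }
    if (ret == SK_WAIT_TIMEOUT)
        return OUTCOME_TIMEOUT;
    return OUTCOME_FAILED;               /* SK_WAIT_FAILED or an unexpected value */
}
```

On Windows the same arithmetic would be applied directly to the value returned by WaitForMultipleObjects(), using the SDK constants instead of the SK_ copies.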
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object

Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );

Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
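The blocking and polling variants can be put side by side in a few lines of pthreads code. The sketch below is an assumption-level illustration of the comparison; shared_lock, shared_counter, add_blocking, and add_if_free are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t shared_lock = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter = 0;

/* Blocking acquire: the counterpart of waiting with an INFINITE timeout. */
static void add_blocking(int value)
{
    int rc = pthread_mutex_lock(&shared_lock);
    assert(rc == 0);
    (void)rc;
    shared_counter += value;
    pthread_mutex_unlock(&shared_lock);
}

/* Polling acquire: the counterpart of waiting with a timeout of 0.
   Returns false without touching the counter if the mutex is busy. */
static bool add_if_free(int value)
{
    int rc = pthread_mutex_trylock(&shared_lock);
    if (rc == EBUSY)
        return false;
    assert(rc == 0);
    shared_counter += value;
    pthread_mutex_unlock(&shared_lock);
    return true;
}
```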

Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value
- ReleaseSemaphore() returns old semaphore count
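The counting rules above can be modeled in portable C with a pthread mutex and condition variable. This is a user-mode model of the semantics, not the Windows implementation; sketch_semaphore and its functions are hypothetical names, and the maximum-count check mirrors the cSemMax argument of CreateSemaphore().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonzero;
    long count;            /* "signaled" while count > 0 */
    long max;
} sketch_semaphore;

static void sketch_sem_init(sketch_semaphore *s, long initial, long max)
{
    assert(max > 0 && initial >= 0 && initial <= max);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = initial;
    s->max = max;
}

/* Blocking wait: each successful wait decreases the count by exactly 1. */
static void sketch_sem_wait(sketch_semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

/* Zero-timeout wait: returns false instead of blocking. */
static bool sketch_sem_try_wait(sketch_semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    bool ok = s->count > 0;
    if (ok) s->count--;
    pthread_mutex_unlock(&s->lock);
    return ok;
}

/* Release: add n to the count and report the previous count.
   Fails, leaving the count unchanged, if the maximum would be exceeded. */
static bool sketch_sem_release(sketch_semaphore *s, long n, long *previous)
{
    pthread_mutex_lock(&s->lock);
    bool ok = n > 0 && s->count + n <= s->max;
    if (ok) {
        if (previous != NULL) *previous = s->count;
        s->count += n;
        pthread_cond_broadcast(&s->nonzero);
    }
    pthread_mutex_unlock(&s->lock);
    return ok;
}
```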

Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE - create event in signaled state

Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );

Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
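Because a condition variable carries no state of its own, event-like behavior is usually emulated by pairing it with a mutex and a boolean. The sketch below shows one way to do that for both manual-reset and auto-reset behavior. sketch_event and its functions are hypothetical names, and this emulation is an assumption for illustration rather than part of the pthreads API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  changed;
    bool signaled;
    bool manual_reset;
} sketch_event;

static void sketch_event_init(sketch_event *e, bool manual_reset, bool initial_state)
{
    assert(e != NULL);
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->changed, NULL);
    e->signaled = initial_state;
    e->manual_reset = manual_reset;
}

/* Signal the event: wake all waiters (manual-reset) or one waiter (auto-reset). */
static void sketch_event_set(sketch_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    if (e->manual_reset)
        pthread_cond_broadcast(&e->changed);
    else
        pthread_cond_signal(&e->changed);
    pthread_mutex_unlock(&e->lock);
}

static void sketch_event_reset(sketch_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Wait until signaled. An auto-reset event is consumed by the waiter;
   a manual-reset event stays signaled until sketch_event_reset(). */
static void sketch_event_wait(sketch_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->changed, &e->lock);
    if (!e->manual_reset)
        e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}
```

The stored boolean is what makes the difference: a thread that starts waiting after the set call still sees the event as signaled, which a bare pthread_cond_broadcast() would not provide.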

Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates prog1 and prog2, which are connected by a pipe]
Half-duplex character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way

I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable. */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process. */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);

Pipe example (contd)
/* Repeat (symmetrically) for the second process. */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles. */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete. */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;

Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same instance
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions

Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on remote machine (. - local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)

Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive - is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );

Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo

Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence in one call

Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle in one call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if client connected between the CreateNamedPipe call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios

Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server.\n"); exit(3);
}
/* Write the request. */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out. */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;

Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client. */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation. */

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many comm.)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile() - open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain

Locate a server via mailslot
[Figure: app clients 0 ... n act as mailslot servers; the app server acts as mailslot client. Message is sent periodically]
Mailslot servers (app client 0 ... app client n):
hMS = CreateMailslot( "\\.\mailslot\status" ... ); ReadFile( hMS, &ServStat ... ); /* connect to server */
Mailslot client (app server):
while (...) { Sleep(...); hMS = CreateFile( "\\*\mailslot\status" ... ); WriteFile( hMS, &StatInfo ... ); }

Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0 - immediate return
- MAILSLOT_WAIT_FOREVER - infinite wait

Opening a mailslot
CreateFile with the following names:
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max. message length 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
44
The Critical-Section Problem
n threads all competing to use a shared resource
Each thread has a code segment called critical section in which the shared data is accessed
Problem Ensure that
ndash when one thread is executing in its critical section no other thread is allowed to execute in its critical section
55
Solution to Critical-Section Problem
Mutual Exclusion
ndash Only one thread at a time is allowed into its CS among all threads that have CS for the same resource or shared data
ndash A thread halted in its non-critical section must not interfere with other threads
Progress
ndash A thread remains inside CS for a finite time only
ndash No assumptions concerning relative speed of the threads
66
Solution to Critical-Section Problem
Bounded Waiting
ndash It must no be possible for a thread requiring access to a critical section to be delayed indefinitely
ndash When no thread is in a critical section any thread that requests entry must be permitted to enter without delay
77
Only 2 threads T0 and T1
General structure of thread Ti (other thread Tj)
do
enter section
critical section
exit section
reminder section
while (1)
Threads may share some common variables to synchronize their actions
Initial Attempts to Solve Problem
88
First Attempt Algorithm 1
Shared variables
ndash Initialization int turn = 0
ndash turn == i Ti can enter its critical section
Thread Ti
do
while (turn = i)
critical section
turn = j
reminder section
while (1)
Satisfies mutual exclusion but not progress
99
Second Attempt Algorithm 2
Shared variables
ndash initialization int flag[2] flag[0] = flag[1] = 0
ndash flag[i] == 1 Ti can enter its critical section
Thread Ti
do
flag[i] = 1while (flag[j] == 1)
critical section
flag[i] = 0remainder section
while(1)
Satisfies mutual exclusion not progress requirement
1010
Algorithm 3 (Petersonrsquos Algorithm - 1981)
Shared variables of algorithms 1 and 2 - initialization
int flag[2] flag[0] = flag[1] = 0int turn = 0
Thread Ti
do
flag[i] = 1turn = jwhile ((flag[j] == 1) ampamp turn == j)
critical section
flag[i] = 0
remainder section
while (1)
Solves the critical-section problem for two threads
1111
Dekkerrsquos Algorithm (1965)
This is the first correct solution proposed for the two-thread (two-process) case
Originally developed by Dekker in a different context it was applied to the critical section problem by Dijkstra
Dekker adds the idea of a favored thread and allows access to either thread when the request is uncontested
When there is a conflict one thread is favored and the priority reverses after successful execution of the critical section
1212
Dekkerrsquos Algorithm (contd)
Shared variables - initializationint flag[2] flag[0] = flag[1] = 0int turn = 0
Thread Ti
do flag[i] = 1
while (flag[j] ) if (turn == j)
flag[i] = 0while (turn == j)flag[i] = 1
critical section
turn = jflag[I] = 0
remainder section
while (1)
1313
Bakery Algorithm (Lamport 1979)
A Solution to the Critical Section problem for n threads
Before entering its CS a thread receives a number
Holder of the smallest number enters the CS
If threads Ti and Tj receive the same number if i lt j then Ti is served first else Tj is served first
The numbering scheme generates numbers in monotonically non-decreasing order
ndash ie 1112333445
1414
Bakery Algorithm
Notation ldquoltldquo establishes lexicographical order among 2-tuples (ticket thread id )
(ab) lt (cd) if a lt c or if a == c and b lt d
max (a0hellip an-1) = k | k ai for i = 0hellip n ndash 1
Shared data
int choosing[n]
int number[n] - the ticket
Data structures are initialized to 0
1515
Bakery Algorithm
do
choosing[i] = 1
number[i] = max(number[0]number[1] number[n-1]) + 1
choosing[i] = 0
for (j = 0 j lt n j++)
while (choosing[j] == 1)
while ((number[j] = 0) ampamp ((number[j]j) lsquorsquoltlsquorsquo (number[i]i)))
critical section
number[i] = 0
remainder section
while (1)
1616
Mutual Exclusion - Hardware Support
Interrupt Disabling
ndash Concurrent threads cannot overlap on a uniprocessor
ndash Thread will run until performing a system call or interrupt happens
Special Atomic Machine Instructions
ndash Test and Set Instruction - read amp write a memory location
ndash Exchange Instruction - swap register and memory location
Problems with Machine-Instruction Approach
ndash Busy waiting
ndash Starvation is possible
ndash Deadlock is possible
1717
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean amptarget)
boolean rv = target
target = true
return rv
1818
Shared data ndash boolean lock = false
Thread Ti
do
while (TestAndSet(lock))
critical section
lock = false
remainder section
Mutual Exclusion with Test-and-Set
1919
Synchronization Hardware
Atomically swap two variables
void Swap(boolean ampa boolean ampb)
boolean temp = a
a = b
b = temp
2020
Mutual Exclusion with Swap
Shared data (initialized to 0) int lock = 0
Thread Ti
int key
do
key = 1
while (key == 1) Swap(lockkey)
critical section
lock = 0
remainder section
2121
Semaphores
Semaphore S ndash integer variable
can only be accessed via two atomic operations
wait (S)
while (S lt= 0)S--
signal (S)
S++
2222
Critical Section of n Threads
Shared data
semaphore mutex initially mutex = 1
Thread Ti
do wait(mutex) critical section
signal(mutex) remainder section
while (1)
2323
Semaphore Implementation
Semaphores may suspendresume threads
ndash Avoid busy waiting
Define a semaphore as a record
typedef struct
int value struct thread L semaphore
Assume two simple operations
ndash suspend() suspends the thread that invokes it
ndash resume(T) resumes the execution of a blocked thread T
2424
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
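To trace how value and the waiting list evolve, the two operations can be modelled in plain, single-threaded C. This is only a bookkeeping model under stated assumptions: threads are represented by integer ids, suspend() and resume(T) are replaced by return values, and the waiting list is a small fixed-size FIFO. All names (model_sem, model_wait, model_signal, MAX_WAITERS) are hypothetical.

```c
#include <stddef.h>

#define MAX_WAITERS 16   /* assumption: at most 16 blocked threads in this sketch */

typedef struct {
    int value;
    int waiting[MAX_WAITERS];  /* FIFO of blocked thread ids (the list S.L) */
    size_t head, count;
} model_sem;

static void model_sem_init(model_sem *s, int initial) {
    s->value = initial;
    s->head = 0;
    s->count = 0;
}

/* wait(S): returns 1 if thread `tid` would be suspended, 0 if it proceeds. */
static int model_wait(model_sem *s, int tid) {
    s->value--;
    if (s->value < 0) {
        s->waiting[(s->head + s->count) % MAX_WAITERS] = tid;
        s->count++;
        return 1;
    }
    return 0;
}

/* signal(S): returns the id of the thread to resume, or -1 if none waits. */
static int model_signal(model_sem *s) {
    s->value++;
    if (s->value <= 0) {
        int tid = s->waiting[s->head];
        s->head = (s->head + 1) % MAX_WAITERS;
        s->count--;
        return tid;
    }
    return -1;
}
```

With an initial value of 1, the first wait proceeds and later waits queue up; while threads are blocked, the magnitude of the negative value equals the number of waiting threads.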
2525
Semaphore as a General Synchronization Tool
Execute B in Tj only after A has executed in Ti
Use semaphore flag initialized to 0
Code:
Ti                  Tj
A                   wait(flag)
signal(flag)        B
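The ordering pattern can be tried out with POSIX threads and an unnamed POSIX semaphore. This is a sketch assuming a Linux-like system where sem_init is supported; thread_i, thread_j, run_ordered and the trace array are hypothetical names used only to record the order in which A and B ran.

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t flag;        /* initialized to 0 in run_ordered() */
static int trace[2];      /* order in which A and B executed */
static int trace_len;

static void *thread_i(void *arg) {
    (void)arg;
    trace[trace_len++] = 'A';   /* statement A */
    sem_post(&flag);            /* signal(flag) */
    return NULL;
}

static void *thread_j(void *arg) {
    (void)arg;
    sem_wait(&flag);            /* wait(flag): blocks until A is done */
    trace[trace_len++] = 'B';   /* statement B */
    return NULL;
}

/* Starts Tj before Ti on purpose; returns 0 on success, -1 on setup failure. */
static int run_ordered(void) {
    pthread_t ti, tj;
    trace_len = 0;
    if (sem_init(&flag, 0, 0) != 0) return -1;
    if (pthread_create(&tj, NULL, thread_j, NULL) != 0) return -1;
    if (pthread_create(&ti, NULL, thread_i, NULL) != 0) return -1;
    pthread_join(ti, NULL);
    pthread_join(tj, NULL);
    sem_destroy(&flag);
    return 0;
}
```

Even though Tj is started first, B is recorded after A, because Tj cannot pass wait(flag) until Ti has signalled.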
2626
Two Types of Semaphores
Counting semaphore
- integer value can range over an unrestricted domain
Binary semaphore
- integer value can range only between 0 and 1
- can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
2727
Deadlock and Starvation
Deadlock
- two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0              T1
wait(S);        wait(Q);
wait(Q);        wait(S);
signal(S);      signal(Q);
signal(Q);      signal(S);
2828
Deadlock and Starvation
Starvation - indefinite blocking
- A thread may never be removed from the semaphore queue in which it is suspended
Solution
- all code should acquire/release semaphores in the same order
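One common way to enforce a single acquisition order is to rank the locks and always take the lower-ranked one first. The sketch below uses POSIX mutexes in place of the semaphores S and Q and ranks them by address; both choices are assumptions for illustration, and lock_pair and unlock_pair are hypothetical helper names.

```c
#include <pthread.h>
#include <stdint.h>

/* Acquire two distinct mutexes in one global order (lower address first),
   regardless of the order in which the caller names them. */
static void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

/* Release order does not matter for deadlock avoidance. */
static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}
```

If T0 calls lock_pair(&S, &Q) and T1 calls lock_pair(&Q, &S), both end up locking in the same order, so the circular wait from the deadlock example cannot form. The helper must not be called with the same mutex twice.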
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events. An event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
3131
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
- Chapter 7 - Process Synchronization
- Chapter 8 - Deadlocks
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times, mostly in the "System" process
- Scheduled, preempted, etc. like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
- Interrupt dispatcher invokes the interrupt service routine
- ISR runs in the context of the interrupted thread, the so-called "arbitrary thread context"
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread
3737
Trap dispatching
[Figure: trap handlers. Interrupt -> interrupt dispatcher -> interrupt service routines; System service call -> system service dispatcher -> system services; HW exceptions / SW exceptions -> exception dispatcher -> exception handlers; Virtual address exceptions -> virtual memory manager's pager]
Trap: processor's mechanism to capture executing thread
- Switch from user to kernel mode
- Interrupts - asynchronous
- Exceptions - synchronous
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
[Figure: an interrupt arrives while user or kernel mode code is running; the dispatch routine and the ISR run in kernel mode]
Note: no thread or process context switch
Interrupt Dispatching
3939
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
Interrupt Precedence via IRQLs (x86)
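The masking rule can be expressed as a small model in C. This is not kernel code: cpu_model, is_masked, begin_service, end_service and may_wait_or_page_fault are hypothetical names, and the three constants simply restate the levels 0, 1 and 2 used for passive, APC and dispatch/DPC.

```c
#include <stdbool.h>

enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 };

/* Hypothetical per-processor state: only the current IRQL. */
typedef struct { int irql; } cpu_model;

/* Interrupts at an IRQL equal to or lower than the processor's are masked. */
static bool is_masked(const cpu_model *cpu, int interrupt_irql) {
    return interrupt_irql <= cpu->irql;
}

/* Servicing an interrupt raises the processor IRQL to that interrupt's IRQL.
   Returns the previous level so it can be restored after the ISR. */
static int begin_service(cpu_model *cpu, int interrupt_irql) {
    int old = cpu->irql;
    cpu->irql = interrupt_irql;
    return old;
}

static void end_service(cpu_model *cpu, int old_irql) {
    cpu->irql = old_irql;
}

/* No waits or page faults at IRQL >= DISPATCH_LEVEL. */
static bool may_wait_or_page_fault(const cpu_model *cpu) {
    return cpu->irql < DISPATCH_LEVEL;
}
```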
4040
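[Figure: x86 IRQLs, highest to lowest]
31  High
30  Power fail
29  Interprocessor Interrupt
28  Clock
    Profile & Synch (Srv 2003)
    Device levels, down to Device 1
    (all of the above: hardware interrupts)
2   Dispatch/DPC   (deferrable software interrupts)
1   APC            (deferrable software interrupts)
0   Passive/Low    (normal thread execution)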
Interrupt Precedence via IRQLs (x86)
4141
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library - Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected with an IRQL
4343
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update the system's clock and to allocate CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler - Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd.)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all, normal thread execution
4545
IRQLs on 64-bit Systems
[Figure: IRQLs on x64 and IA64, highest to lowest]
IRQL  x64                              IA64
15    High/Profile                     High/Profile/Power
14    Interprocessor Interrupt/Power   Interprocessor Interrupt
13    Clock                            Clock
12    Synch (Srv 2003)                 Synch (MP only)
      Device n                         Device n
4     ...                              Device 1
3     Device 1                         Correctable Machine Check
2     Dispatch/DPC                     Dispatch/DPC & Synch (UP only)
1     APC                              APC
0     Passive/Low                      Passive/Low
4646
Interrupt Prioritization amp Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device
  Can be configured with IntFilter utility in Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source
  Processors are assigned round robin for each interrupt vector
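As a worked example of the two deterministic rules, the arithmetic can be written as two small C functions. The function names are hypothetical, and the x64/IA64 rule is read here as integer division of the IDT vector number by 16, which is an assumption about the slide's formula.

```c
/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
static int irql_from_irq_x86_up(int irq) {
    return 27 - irq;
}

/* x64 and IA64 rule, assumed to be vector / 16 with integer division,
   so each block of 16 vectors shares one IRQL. */
static int irql_from_vector_64(int vector) {
    return vector / 16;
}
```

For instance, IRQ 8 on an x86 uniprocessor maps to IRQL 19, and under the assumed division rule vector 0x41 (65) maps to IRQL 4.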
4747
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when kernel is deep within many layers of code
- Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Sync on MP: use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
Synchronization on SMP Systems
5050
Kernel Synchronization
[Figure: Processor A and Processor B contend for the spinlock that guards the DPC queue; each executes the following]
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin                            (critical section)
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: Checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
- Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the flag of the 1st processor in the queue is modified
- Exactly one processor is being signaled
- Pre-determined wait order
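The idea of spinning on a processor-local flag can be illustrated with an MCS-style queue lock in C11 atomics. This is a well-known textbook technique with the same properties (one waiter signalled per release, FIFO order); it is not claimed to be how Windows implements queued spinlocks. The names qnode, qlock, qlock_acquire and qlock_release are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiting processor; must_wait is the processor-local flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool must_wait;
} qnode;

/* The lock itself only remembers the tail of the waiter queue. */
typedef struct { _Atomic(qnode *) tail; } qlock;

static void qlock_acquire(qlock *l, qnode *me) {
    atomic_store(&me->next, NULL);
    atomic_store(&me->must_wait, true);
    qnode *prev = atomic_exchange(&l->tail, me);    /* join the queue */
    if (prev == NULL)
        return;                                     /* lock was free */
    atomic_store(&prev->next, me);                  /* link behind predecessor */
    while (atomic_load(&me->must_wait)) { /* spin on our own flag only */ }
}

static void qlock_release(qlock *l, qnode *me) {
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;                                 /* nobody is waiting */
        /* A successor swapped the tail but has not linked in yet. */
        while ((succ = atomic_load(&me->next)) == NULL) { /* brief spin */ }
    }
    atomic_store(&succ->must_wait, false);          /* signal exactly one waiter */
}
```

Each waiter polls only its own node, so the shared lock word is touched once on entry and once on exit rather than on every spin iteration.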
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU
  Allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks: PFN or Page Frame Database lock
- Some locks completely eliminated
  Charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks)
  Doesn't use spinlocks when no contention
  Used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once
  "All": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include optional timeout argument
- Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset, may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object, and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Figure: interaction with thread scheduling. Thread states: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Labels: "Create and initialize thread object", "Thread waits on an object handle", "Set object to signaled state", "Wait is complete"]
5858
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread - it may preempt a running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
For each dispatcher object: the system events and resulting state change, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: Owning thread releases mutex
- signaled -> nonsignaled: Resumed thread acquires mutex
- Effect: Kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: Owning thread or other thread releases mutex
- signaled -> nonsignaled: Resumed thread acquires mutex
- Effect: Kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: One thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: A thread acquires the semaphore; more resources are not available
- Effect: Kernel resumes one or more waiting threads
6161
What signals an object (contd.)
Event
- nonsignaled -> signaled: A thread sets the event
- signaled -> nonsignaled: Kernel resumes one or more threads
- Effect: Kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: Dedicated thread sets one event in the event pair
- signaled -> nonsignaled: Kernel resumes the other dedicated thread
- Effect: Kernel resumes waiting dedicated thread
Timer
- nonsignaled -> signaled: Timer expires
- signaled -> nonsignaled: A thread (re)initializes the timer
- Effect: Kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: Thread terminates
- signaled -> nonsignaled: A thread reinitializes the thread object
- Effect: Kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU. DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
  See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
6565
[Figure: DPC queue: queue head -> DPC object -> DPC object -> DPC object]
Deferred Procedure Calls (DPCs)
6666
Delivering a DPC
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with levels High, Power failure, Dispatch/DPC, APC, Low; a DPC queue holding DPC objects; the dispatcher]
1. Timer expires, kernel queues DPC that will release all waiting threads. Kernel requests SW int
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
6969
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
Asynchronous Procedure Calls (APCs)
7070
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
7171
[Figure: a thread object with two APC queues, kernel (K) and user (U), each linking APC objects]
Asynchronous Procedure Calls (APCs)
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
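The four accounting rules amount to a small decision function evaluated on every clock tick. The sketch below restates them in C; the enum, the function name charge_tick and the user_mode parameter (used to pick between user and kernel time below IRQL 2) are hypothetical.

```c
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} tick_charge;

/* Decide which counter one clock tick is charged to, given the IRQL the
   processor was at, whether a DPC was running, and the interrupted mode. */
static tick_charge charge_tick(int irql, bool in_dpc, bool user_mode) {
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```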
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
7676
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Looking at Waiting Threads
7777
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
[Figure: dispatcher object layout: header with Size, Type, State and Wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (Size, Type, State, Wait list head, object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type and Next link]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    /* leave exactly once, even if the body raises an exception */
    __finally { LeaveCriticalSection( &crit ); }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
  dwTimeout == 0 - no wait, check whether object is signaled
  dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles - pointer to array identifying these objects
- bWaitAll - whether to wait for first signaled object or all objects
  Function returns index of first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else    /* mutex was abandoned */
        break;  /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
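The blocking and polling variants can be placed side by side in a short pthreads sketch that protects a shared counter, in the spirit of the Windows mutex example. The names counter_lock, counter, add_blocking and add_polling are hypothetical.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter;   /* shared by all threads */

/* Blocking variant: comparable to waiting with an infinite timeout. */
static void add_blocking(int v) {
    pthread_mutex_lock(&counter_lock);
    counter += v;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling variant: comparable to a zero timeout.
   Returns 0 on success, or EBUSY if the mutex is currently held. */
static int add_polling(int v) {
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc != 0)
        return rc;
    counter += v;
    pthread_mutex_unlock(&counter_lock);
    return 0;
}
```

A caller of add_polling has to decide what to do on EBUSY, for example retry later or do other work, which is the same decision a zero-timeout wait forces on the Windows side.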
8888
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value
- ReleaseSemaphore() returns old semaphore count
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
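A condition variable carries no state of its own, which is why it does not map directly onto a manual-reset event. One common workaround, sketched here under that assumption, pairs the condition variable with a mutex and an explicit boolean. The type mr_event and its functions are hypothetical names, not part of pthreads.

```c
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool signaled;      /* the persistent state a bare condition variable lacks */
} mr_event;

static void mr_event_init(mr_event *e, bool initial) {
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial;
}

/* Like setting a manual-reset event: stays signaled, releases all waiters. */
static void mr_event_set(mr_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like resetting the event by hand. */
static void mr_event_reset(mr_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; returns at once if it already is. */
static void mr_event_wait(mr_event *e) {
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)            /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}
```

Because the flag persists, a thread that calls mr_event_wait after mr_event_set does not block, which matches the manual-reset behaviour and differs from a plain pthread_cond_signal that is lost when nobody is waiting.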
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
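For comparison, the POSIX counterpart of an anonymous pipe is pipe(), which also yields one read end and one write end. The sketch below is a POSIX analogue, not Windows code; it writes a short message and reads it back within one process. The helper name pipe_round_trip is hypothetical, and the message is assumed to be much smaller than the pipe's capacity so that the write does not block.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes msg into a fresh pipe and reads it back into out.
   Returns the number of bytes read, or -1 on failure. */
static ssize_t pipe_round_trip(const char *msg, char *out, size_t out_size) {
    int fds[2];                 /* fds[0] = read end, fds[1] = write end */
    size_t len = strlen(msg);
    ssize_t n = -1;

    if (pipe(fds) != 0)
        return -1;
    if (write(fds[1], msg, len) == (ssize_t)len) {
        close(fds[1]);          /* no more writers: a later read would see EOF */
        fds[1] = -1;
        n = read(fds[0], out, out_size);
    }
    if (fds[1] != -1)
        close(fds[1]);
    close(fds[0]);
    return n;
}
```

Data flows in one direction only, from the write end to the read end; a second pipe would be needed for traffic in the opposite direction.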
9494
IO Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple, independent instances of a named pipe
- Several clients can communicate with a single server using the same instance
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa);
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on remote machine (. - local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd.)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive - is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage);
9999
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
WriteFile, ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
CreateFile, WriteFile, ReadFile, CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
- Call will return as soon as there is a client connection
- Returns false if client connected between CreateNamedPipe call and ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
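A minimal sketch of the FIFO side of this comparison, assuming a POSIX system: it creates a FIFO with mkfifo(), sends one fixed-size record through it, and reads the record back. For brevity both ends live in one process, whereas a real client and server would each open one end. RECORD_SIZE and fifo_round_trip are hypothetical names.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_SIZE 32   /* assumption: fixed-size records, as suggested above */

/* Sends one fixed-size record through a FIFO created at `path`.
   Returns 0 on success and fills `out`; returns -1 on any failure. */
static int fifo_round_trip(const char *path, const char *text,
                           char out[RECORD_SIZE]) {
    char record[RECORD_SIZE] = {0};
    int rfd = -1, wfd = -1, rc = -1;

    strncpy(record, text, RECORD_SIZE - 1);     /* pad the record with zeros */
    if (mkfifo(path, 0600) != 0)
        return -1;
    /* Open the read end non-blocking so open() does not wait for a writer. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd >= 0)
        wfd = open(path, O_WRONLY);             /* succeeds: a reader exists */
    if (wfd >= 0 &&
        write(wfd, record, RECORD_SIZE) == RECORD_SIZE &&
        read(rfd, out, RECORD_SIZE) == RECORD_SIZE)
        rc = 0;
    if (wfd >= 0) close(wfd);
    if (rfd >= 0) close(rfd);
    unlink(path);                               /* remove the FIFO again */
    return rc;
}
```

Because the FIFO is a byte stream, the reader relies on the agreed RECORD_SIZE to find record boundaries, which is the reason fixed-size records are the easy choice here.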
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf("Request is %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (Windows 2000: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
107107
Locate a server via mailslot
[Figure: application clients 0..n act as mailslot servers; each creates the "status" mailslot with CreateMailslot(), reads the server status with ReadFile(hMS, &ServStat), and then connects to the server. The application server acts as the mailslot client: in a loop it sleeps, opens the "status" mailslot with CreateFile(), and writes its status with WriteFile(hMS, &StatInfo). The message is sent periodically.]
108108
Creating a mailslot
HANDLE CreateMailslot (LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa)
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for this many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
55
Solution to Critical-Section Problem
Mutual Exclusion
- Only one thread at a time is allowed into its CS, among all threads that have a CS for the same resource or shared data
- A thread halted in its non-critical section must not interfere with other threads
Progress
- A thread remains inside its CS for a finite time only
- No assumptions are made concerning the relative speed of the threads
66
Solution to Critical-Section Problem
Bounded Waiting
- It must not be possible for a thread requiring access to a critical section to be delayed indefinitely
- When no thread is in a critical section, any thread that requests entry must be permitted to enter without delay
77
Initial Attempts to Solve Problem
Only 2 threads, T0 and T1
General structure of thread Ti (other thread Tj):
do {
    enter section
    critical section
    exit section
    remainder section
} while (1);
Threads may share some common variables to synchronize their actions
88
First Attempt: Algorithm 1
Shared variables
- Initialization: int turn = 0;
- turn == i: Ti can enter its critical section
Thread Ti
do {
    while (turn != i) ;   /* busy-wait */
    critical section
    turn = j;
    remainder section
} while (1);
Satisfies mutual exclusion, but not progress
99
Second Attempt: Algorithm 2
Shared variables
- Initialization: int flag[2]; flag[0] = flag[1] = 0;
- flag[i] == 1: Ti is ready to enter its critical section
Thread Ti
do {
    flag[i] = 1;
    while (flag[j] == 1) ;   /* busy-wait */
    critical section
    flag[i] = 0;
    remainder section
} while (1);
Satisfies mutual exclusion, but not the progress requirement
1010
Algorithm 3 (Peterson's Algorithm - 1981)
Shared variables of algorithms 1 and 2 - initialization:
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti
do {
    flag[i] = 1;
    turn = j;
    while ((flag[j] == 1) && turn == j) ;   /* busy-wait */
    critical section
    flag[i] = 0;
    remainder section
} while (1);
Solves the critical-section problem for two threads
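Peterson's algorithm can be tried out on a modern machine, as a minimal sketch under some assumptions: POSIX threads stand in for T0 and T1, and C11 sequentially consistent atomics replace the plain int variables (ordinary loads and stores may be reordered by compilers and CPUs, which would break the algorithm). The names peterson_enter, peterson_exit and peterson_demo are illustrative only, and the sched_yield() call is a courtesy for machines with few cores.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define PETERSON_ITERS 20000

static atomic_int flag[2];      /* flag[i] == 1: Ti wants to enter       */
static atomic_int turn;         /* thread that goes first on a conflict  */
static long peterson_counter;   /* shared data guarded by the algorithm  */

static void peterson_enter(int i)
{
    int j = 1 - i;

    atomic_store(&flag[i], 1);
    atomic_store(&turn, j);
    /* Wait only while Tj is interested and it is Tj's turn. */
    while (atomic_load(&flag[j]) == 1 && atomic_load(&turn) == j)
        sched_yield();          /* courtesy only; a plain spin also works */
}

static void peterson_exit(int i)
{
    atomic_store(&flag[i], 0);
}

static void *peterson_worker(void *arg)
{
    int i = *(int *)arg;

    for (int k = 0; k < PETERSON_ITERS; k++) {
        peterson_enter(i);
        peterson_counter++;     /* critical section */
        peterson_exit(i);
    }
    return NULL;
}

/* Runs T0 and T1 to completion and returns the final counter value. */
long peterson_demo(void)
{
    pthread_t t[2];
    int id[2] = {0, 1};

    peterson_counter = 0;
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, peterson_worker, &id[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return peterson_counter;
}
```

If mutual exclusion holds, no increment is lost and peterson_demo() returns 2 * PETERSON_ITERS.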
1111
Dekker's Algorithm (1965)
This is the first correct solution proposed for the two-thread (two-process) case
Originally developed by Dekker in a different context, it was applied to the critical-section problem by Dijkstra
Dekker adds the idea of a favored thread and allows access to either thread when the request is uncontested
When there is a conflict, one thread is favored, and the priority reverses after successful execution of the critical section
1212
Dekker's Algorithm (contd.)
Shared variables - initialization:
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti
do {
    flag[i] = 1;
    while (flag[j]) {
        if (turn == j) {
            flag[i] = 0;
            while (turn == j) ;   /* busy-wait */
            flag[i] = 1;
        }
    }
    critical section
    turn = j;
    flag[i] = 0;
    remainder section
} while (1);
1313
Bakery Algorithm (Lamport 1974)
A solution to the critical-section problem for n threads
Before entering its CS, a thread receives a number
Holder of the smallest number enters the CS
If threads Ti and Tj receive the same number: if i < j, then Ti is served first; else Tj is served first
The numbering scheme generates numbers in monotonically non-decreasing order
- e.g., 1, 1, 1, 2, 3, 3, 3, 4, 4, 5
1414
Bakery Algorithm
Notation: "<" establishes lexicographical order among 2-tuples (ticket #, thread id #)
(a, b) < (c, d) if a < c, or if a == c and b < d
max(a0, ..., an-1) is a number k such that k >= ai for i = 0, ..., n - 1
Shared data
int choosing[n];
int number[n];   /* the ticket */
Data structures are initialized to 0
1515
Bakery Algorithm
do {
    choosing[i] = 1;
    number[i] = max(number[0], number[1], ..., number[n-1]) + 1;
    choosing[i] = 0;
    for (j = 0; j < n; j++) {
        while (choosing[j] == 1) ;
        while ((number[j] != 0) && ((number[j], j) "<" (number[i], i))) ;
    }
    critical section
    number[i] = 0;
    remainder section
} while (1);
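The bakery algorithm can be sketched the same way for n threads. This is an illustrative version, not a production lock: it assumes POSIX threads and C11 sequentially consistent atomics, the helper names (pair_less, bakery_lock, bakery_unlock, bakery_demo) are made up for the example, and ticket numbers are left to grow without bound, which a real implementation would have to address.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define NTHREADS 4
#define ROUNDS   2000

static atomic_int choosing[NTHREADS];
static atomic_int number[NTHREADS];   /* 0 means "not competing" */
static long bakery_counter;

/* (a, b) < (c, d) in lexicographic order. */
static int pair_less(int a, int b, int c, int d)
{
    return a < c || (a == c && b < d);
}

static void bakery_lock(int i)
{
    int max = 0;

    atomic_store(&choosing[i], 1);
    for (int j = 0; j < NTHREADS; j++) {
        int n = atomic_load(&number[j]);
        if (n > max)
            max = n;
    }
    atomic_store(&number[i], max + 1);          /* take a ticket */
    atomic_store(&choosing[i], 0);

    for (int j = 0; j < NTHREADS; j++) {
        while (atomic_load(&choosing[j]) == 1)  /* Tj is still choosing */
            sched_yield();
        for (;;) {                              /* Tj's ticket sorts first */
            int nj = atomic_load(&number[j]);
            if (nj == 0 || !pair_less(nj, j, atomic_load(&number[i]), i))
                break;
            sched_yield();
        }
    }
}

static void bakery_unlock(int i)
{
    atomic_store(&number[i], 0);
}

static void *bakery_worker(void *arg)
{
    int i = *(int *)arg;

    for (int k = 0; k < ROUNDS; k++) {
        bakery_lock(i);
        bakery_counter++;                       /* critical section */
        bakery_unlock(i);
    }
    return NULL;
}

/* Runs NTHREADS workers and returns the final counter value. */
long bakery_demo(void)
{
    pthread_t t[NTHREADS];
    int id[NTHREADS];

    bakery_counter = 0;
    for (int i = 0; i < NTHREADS; i++) {
        id[i] = i;
        pthread_create(&t[i], NULL, bakery_worker, &id[i]);
    }
    for (int i = 0; i < NTHREADS; i++)
        pthread_join(t[i], NULL);
    return bakery_counter;
}
```

The sched_yield() calls only keep the demo responsive when there are more threads than cores; the algorithm itself is a pure busy-wait.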
1616
Mutual Exclusion - Hardware Support
Interrupt Disabling
- Concurrent threads cannot overlap on a uniprocessor
- A thread runs until it performs a system call or an interrupt occurs
Special Atomic Machine Instructions
- Test and Set Instruction - read & write a memory location
- Exchange Instruction - swap register and memory location
Problems with Machine-Instruction Approach
- Busy waiting
- Starvation is possible
- Deadlock is possible
1717
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean &target) {
    boolean rv = target;
    target = true;
    return rv;
}
1818
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti
do {
    while (TestAndSet(lock)) ;   /* busy-wait */
    critical section
    lock = false;
    remainder section
} while (1);
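In C11, the test-and-set instruction is exposed as atomic_flag_test_and_set(), so the lock above can be sketched directly. The wrapper names tas_acquire, tas_release and tas_demo are assumptions made for this example, and POSIX threads are used only to exercise the lock.

```c
#include <pthread.h>
#include <stdatomic.h>

#define TAS_THREADS 4
#define TAS_ITERS   20000

static atomic_flag tas_lock = ATOMIC_FLAG_INIT;   /* clear == unlocked */
static long tas_counter;

static void tas_acquire(void)
{
    /* Returns the previous value and sets the flag, like TestAndSet. */
    while (atomic_flag_test_and_set(&tas_lock))
        ;                                         /* busy-wait */
}

static void tas_release(void)
{
    atomic_flag_clear(&tas_lock);                 /* lock = false */
}

static void *tas_worker(void *arg)
{
    (void)arg;
    for (int k = 0; k < TAS_ITERS; k++) {
        tas_acquire();
        tas_counter++;                            /* critical section */
        tas_release();
    }
    return NULL;
}

/* Runs TAS_THREADS workers and returns the final counter value. */
long tas_demo(void)
{
    pthread_t t[TAS_THREADS];

    tas_counter = 0;
    for (int i = 0; i < TAS_THREADS; i++)
        pthread_create(&t[i], NULL, tas_worker, NULL);
    for (int i = 0; i < TAS_THREADS; i++)
        pthread_join(t[i], NULL);
    return tas_counter;
}
```

Unlike the bakery algorithm, this lock gives no ordering guarantee, so a thread can in principle starve.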
1919
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
2020
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
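The swap-based lock maps onto C11's atomic_exchange(), which stores a new value and returns the old one in a single atomic step. The sketch below assumes that mapping; xchg_acquire, xchg_release and xchg_demo are hypothetical names, and POSIX threads are used only to exercise the lock.

```c
#include <pthread.h>
#include <stdatomic.h>

#define XCHG_THREADS 4
#define XCHG_ITERS   20000

static atomic_int xchg_lock;      /* 0 = free, 1 = held */
static long xchg_counter;

static void xchg_acquire(void)
{
    int key = 1;

    /* Swap(lock, key): lock receives key, key receives the old lock. */
    while (key == 1)
        key = atomic_exchange(&xchg_lock, key);
}

static void xchg_release(void)
{
    atomic_store(&xchg_lock, 0);
}

static void *xchg_worker(void *arg)
{
    (void)arg;
    for (int k = 0; k < XCHG_ITERS; k++) {
        xchg_acquire();
        xchg_counter++;           /* critical section */
        xchg_release();
    }
    return NULL;
}

/* Runs XCHG_THREADS workers and returns the final counter value. */
long xchg_demo(void)
{
    pthread_t t[XCHG_THREADS];

    xchg_counter = 0;
    for (int i = 0; i < XCHG_THREADS; i++)
        pthread_create(&t[i], NULL, xchg_worker, NULL);
    for (int i = 0; i < XCHG_THREADS; i++)
        pthread_join(t[i], NULL);
    return xchg_counter;
}
```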
2121
Semaphores
Semaphore S - integer variable
can only be accessed via two atomic operations
wait (S):
    while (S <= 0) ;   /* busy-wait */
    S--;
signal (S):
    S++;
2222
Critical Section of n Threads
Shared data:
semaphore mutex;   /* initially mutex = 1 */
Thread Ti:
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
2323
Semaphore Implementation
Semaphores may suspend/resume threads
- Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
- suspend() suspends the thread that invokes it
- resume(T) resumes the execution of a blocked thread T
2424
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
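A blocking semaphore along these lines can be sketched with POSIX threads. This is one possible rendering, not the slides' implementation: a condition variable plays the role of the list S.L with suspend()/resume(), and the extra wakeups counter (an addition of this sketch) guards against spurious wakeups. All names (bsem, bsem_wait, bsem_signal, bsem_demo) are illustrative.

```c
#include <pthread.h>

#define BSEM_THREADS 4
#define BSEM_ITERS   10000

typedef struct {
    int value;              /* negative: number of suspended threads    */
    int wakeups;            /* resume() calls not yet consumed          */
    pthread_mutex_t m;      /* makes wait/signal atomic                 */
    pthread_cond_t  c;      /* stands in for the list L of waiters      */
} bsem;

static void bsem_init(bsem *s, int value)
{
    s->value = value;
    s->wakeups = 0;
    pthread_mutex_init(&s->m, NULL);
    pthread_cond_init(&s->c, NULL);
}

static void bsem_wait(bsem *s)
{
    pthread_mutex_lock(&s->m);
    s->value--;
    if (s->value < 0) {
        do {                                /* suspend() */
            pthread_cond_wait(&s->c, &s->m);
        } while (s->wakeups < 1);
        s->wakeups--;
    }
    pthread_mutex_unlock(&s->m);
}

static void bsem_signal(bsem *s)
{
    pthread_mutex_lock(&s->m);
    s->value++;
    if (s->value <= 0) {                    /* someone is suspended */
        s->wakeups++;
        pthread_cond_signal(&s->c);         /* resume(T) */
    }
    pthread_mutex_unlock(&s->m);
}

static bsem bsem_mutex;
static long bsem_counter;

static void *bsem_worker(void *arg)
{
    (void)arg;
    for (int k = 0; k < BSEM_ITERS; k++) {
        bsem_wait(&bsem_mutex);
        bsem_counter++;                     /* critical section */
        bsem_signal(&bsem_mutex);
    }
    return NULL;
}

/* Uses the semaphore as a mutex (initial value 1) for n threads. */
long bsem_demo(void)
{
    pthread_t t[BSEM_THREADS];

    bsem_counter = 0;
    bsem_init(&bsem_mutex, 1);
    for (int i = 0; i < BSEM_THREADS; i++)
        pthread_create(&t[i], NULL, bsem_worker, NULL);
    for (int i = 0; i < BSEM_THREADS; i++)
        pthread_join(t[i], NULL);
    return bsem_counter;
}
```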
2525
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code:
Ti                Tj
A                 wait(flag)
signal(flag)      B
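The ordering pattern can be tried with an unnamed POSIX semaphore, where sem_wait() and sem_post() correspond to wait(flag) and signal(flag). This is a sketch that assumes a platform supporting sem_init() (Linux does; some systems only offer named semaphores). The trace buffer and the function names are made up for the example.

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t order_flag;     /* the semaphore "flag", initialized to 0 */
static char order_trace[2];  /* records the order in which A and B ran */
static int order_pos;

static void *order_thread_i(void *arg)
{
    (void)arg;
    order_trace[order_pos++] = 'A';   /* statement A */
    sem_post(&order_flag);            /* signal(flag) */
    return NULL;
}

static void *order_thread_j(void *arg)
{
    (void)arg;
    sem_wait(&order_flag);            /* wait(flag): blocks until A is done */
    order_trace[order_pos++] = 'B';   /* statement B */
    return NULL;
}

/* Returns 1 if A ran before B, 0 otherwise. */
int order_demo(void)
{
    pthread_t ti, tj;

    order_pos = 0;
    sem_init(&order_flag, 0, 0);
    /* Start Tj first on purpose: it still has to wait for Ti. */
    pthread_create(&tj, NULL, order_thread_j, NULL);
    pthread_create(&ti, NULL, order_thread_i, NULL);
    pthread_join(ti, NULL);
    pthread_join(tj, NULL);
    sem_destroy(&order_flag);
    return order_trace[0] == 'A' && order_trace[1] == 'B';
}
```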
2626
Two Types of Semaphores
Counting semaphore
- integer value can range over an unrestricted domain
Binary semaphore
- integer value can range only between 0 and 1
- can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
2727
Deadlock and Starvation
Deadlock
- two or more threads are waiting indefinitely for an event that can be caused only by one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0             T1
wait(S)        wait(Q)
wait(Q)        wait(S)
signal(S)      signal(Q)
signal(Q)      signal(S)
2828
Deadlock and Starvation
Starvation - indefinite blocking
- A thread may never be removed from the semaphore queue in which it is suspended
Solution
- all code should acquire/release semaphores in the same order
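One way to enforce a single acquisition order is to route every multi-lock acquisition through a helper that sorts the locks first. The sketch below assumes POSIX mutexes in place of the semaphores S and Q and uses the lock address as the global order; lock_both, unlock_both and lock_order_demo are hypothetical names.

```c
#include <pthread.h>
#include <stdint.h>

#define ORDER_ITERS 20000

static pthread_mutex_t S = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t Q = PTHREAD_MUTEX_INITIALIZER;
static long ordered_counter;

/* Acquire both locks in one global order (by address), whatever
 * order the caller names them in. */
static void lock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

static void *order_t0(void *arg)
{
    (void)arg;
    for (int k = 0; k < ORDER_ITERS; k++) {
        lock_both(&S, &Q);            /* T0 names S first */
        ordered_counter++;
        unlock_both(&S, &Q);
    }
    return NULL;
}

static void *order_t1(void *arg)
{
    (void)arg;
    for (int k = 0; k < ORDER_ITERS; k++) {
        lock_both(&Q, &S);            /* T1 names Q first */
        ordered_counter++;
        unlock_both(&Q, &S);
    }
    return NULL;
}

/* Completes (instead of deadlocking) and returns the final counter. */
long lock_order_demo(void)
{
    pthread_t t0, t1;

    ordered_counter = 0;
    pthread_create(&t0, NULL, order_t0, NULL);
    pthread_create(&t1, NULL, order_t1, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return ordered_counter;
}
```

With plain pthread_mutex_lock(&S); pthread_mutex_lock(&Q) in T0 and the reverse in T1, the same program could hang exactly as in the T0/T1 example.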
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events. An event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
3131
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
- Chapter 7 - Process Synchronization
- Chapter 8 - Deadlocks
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt Dispatching
IRQL Levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times, mostly in the "System" process
- Scheduled, preempted, etc., like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
- Interrupt dispatcher invokes the interrupt service routine
- ISR runs in the context of the interrupted thread, so-called "arbitrary thread context"
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread
3737
Trap dispatching
[Figure: each kind of trap is routed to its handler. Interrupt -> interrupt dispatcher -> interrupt service routines; system service call -> system service dispatcher -> system services; HW and SW exceptions -> exception dispatcher -> exception handlers; virtual address exceptions -> virtual memory manager's pager]
Trap: processor's mechanism to capture executing thread
- Switch from user to kernel mode
- Interrupts - asynchronous
- Exceptions - synchronous
Interrupt Dispatching
[Figure: an interrupt arrives while user- or kernel-mode code is running; control passes to the kernel-mode interrupt dispatch routine, which calls the interrupt service routine]
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch
3939
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
4040
Interrupt Precedence via IRQLs (x86)
[Figure: x86 IRQL scale, highest to lowest]
31   High                          (hardware interrupts)
30   Power fail
29   Interprocessor Interrupt
28   Clock
     Profile & Synch (Srv 2003)
     Device 1
2    Dispatch/DPC                  (deferrable software interrupts)
1    APC
0    Passive/Low                   (normal thread execution)
4141
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library - Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn ISR on/off
Interrupt objects can synchronize access to ISR data
- Multiple instances of an ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected to one IRQL
4343
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update the system's clock and to allocate CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler - Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd.)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all; normal thread execution
4545
IRQLs on 64-bit Systems
IRQL   x64                               IA64
15     High/Profile                      High/Profile/Power
14     Interprocessor Interrupt/Power    Interprocessor Interrupt
13     Clock                             Clock
12     Synch (Srv 2003)                  Synch (MP only)
       Device n                          Device n
4                                        Device 1
3      Device 1                          Correctable Machine Check
2      Dispatch/DPC                      Dispatch/DPC & Synch (UP only)
1      APC                               APC
0      Passive/Low                       Passive/Low
4646
Interrupt Prioritization & Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device; this can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the I/O APIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
4747
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when the kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
5050
Kernel Synchronization
[Figure: processors A and B each run the same sequence against one spinlock that guards the shared DPC queue]
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin                        // critical section
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
The first processor acquires the lock; other processors wait on a processor-local flag
- Thus, the busy-wait loop requires no access to the memory bus
When the lock is released, the flag of the first waiting processor is modified
- Exactly one processor is being signaled
- Pre-determined wait order
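The idea of spinning on a local flag can be illustrated with an MCS-style queue lock. This is a user-mode sketch of the general technique, not the Windows kernel's implementation: it assumes C11 atomics and POSIX threads, each thread's mcs_node plays the role of the processor-local flag, and every name in it is hypothetical.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define MCS_THREADS 4
#define MCS_ITERS   2000

typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;   /* next waiter in the queue   */
    atomic_int locked;                 /* local flag this waiter spins on */
} mcs_node;

typedef struct {
    _Atomic(mcs_node *) tail;          /* last waiter, NULL when free */
} mcs_lock;

static mcs_lock queue_lock;
static long mcs_counter;

static void mcs_acquire(mcs_lock *l, mcs_node *me)
{
    atomic_store(&me->next, (mcs_node *)NULL);
    atomic_store(&me->locked, 1);
    mcs_node *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->locked))   /* spin on our own flag only */
            sched_yield();                 /* demo courtesy on few cores */
    }
}

static void mcs_release(mcs_lock *l, mcs_node *me)
{
    mcs_node *succ = atomic_load(&me->next);

    if (succ == NULL) {
        mcs_node *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected,
                                           (mcs_node *)NULL))
            return;                        /* nobody was waiting */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();                 /* successor is still linking in */
    }
    atomic_store(&succ->locked, 0);        /* signal exactly one waiter */
}

static void *mcs_worker(void *arg)
{
    mcs_node me;                           /* this thread's queue entry */

    (void)arg;
    for (int k = 0; k < MCS_ITERS; k++) {
        mcs_acquire(&queue_lock, &me);
        mcs_counter++;                     /* critical section */
        mcs_release(&queue_lock, &me);
    }
    return NULL;
}

/* Runs MCS_THREADS workers and returns the final counter value. */
long mcs_demo(void)
{
    pthread_t t[MCS_THREADS];

    mcs_counter = 0;
    for (int i = 0; i < MCS_THREADS; i++)
        pthread_create(&t[i], NULL, mcs_worker, NULL);
    for (int i = 0; i < MCS_THREADS; i++)
        pthread_join(t[i], NULL);
    return mcs_counter;
}
```

Waiters are served in the order in which they exchanged themselves into tail, which gives the pre-determined wait order; a kernel version would spin without yielding.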
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length of time they are held
- Scheduling database now per-CPU; allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas; allocating and mapping system page table entries; charging commitment of pages; allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once; "all": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include an optional timeout argument
- Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Figure: thread state diagram (Initialized, Ready, Standby, Running, Waiting, Transition, Terminated) showing the interaction with thread scheduling: a thread object is created and initialized; a thread that waits on an object handle enters the Waiting state; when the object is set to the signaled state, the wait is complete]
5858
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread: it may preempt a running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: owning thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: owning thread or other thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect: kernel resumes one or more waiting threads
6161
What signals an object (contd.)
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Event
- nonsignaled -> signaled: a thread sets the event
- signaled -> nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: dedicated thread sets one event in the event pair
- signaled -> nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes waiting dedicated thread
Timer
- nonsignaled -> signaled: timer expires
- signaled -> nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: thread terminates
- signaled -> nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: DPC queue - a queue head followed by a linked list of DPC objects]
6666
Delivering a DPC
[Figure: interrupt dispatch table with levels from High and Power failure down to Dispatch/DPC, APC, and Low, shown next to the DPC queue and the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
6767
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
- Permission required for user-mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless the thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserApc
- Only deliverable when the thread is in an "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel-mode (K) and user-mode (U), each holding a list of APC objects]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 & not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
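The accounting rules above amount to a small decision function. The sketch below only restates them in code; the enum, the function name clock_tick_charge and its parameters are illustrative and do not correspond to any actual kernel interface.

```c
/* Where one clock tick is charged (hypothetical names). */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} tick_charge;

/*
 * irql:      processor IRQL that was interrupted by the clock
 * in_dpc:    nonzero if a DPC routine was being processed
 * user_mode: nonzero if the interrupted code ran in user mode
 */
tick_charge clock_tick_charge(int irql, int in_dpc, int user_mode)
{
    if (irql < 2)                      /* passive or APC level */
        return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
    if (irql == 2)                     /* dispatch/DPC level */
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return CHARGE_INTERRUPT;           /* device IRQLs and above */
}
```

For example, a tick that interrupts a DPC routine at IRQL 2 is charged to DPC time, while a tick at a device IRQL is charged to interrupt time and to no thread at all.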
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (they are not really processes)
- Context switches for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus, threads may run but never get charged
View context switch activity with Process Explorer
- Add the Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout - a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: each thread object points to its WaitBlockList; every wait block (List entry, Object, Thread, Key, Type, Next link) ties the thread to one dispatcher object (Size, Type, State, Wait list head, object-type-specific data) and is linked into that object's wait list]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted, but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection ( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection ( &crit );
    __try {
        counter += local_value;
    }
    __finally {   /* leave exactly once, even if an exception occurs */
        LeaveCriticalSection ( &crit );
    }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
- dwTimeout == 0: no wait, check whether the object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to an array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects; when waiting for any object, the return value minus WAIT_OBJECT_0 is the index of the first signaled object
Side effects
- Mutexes, auto-reset events, and waitable timers will be reset to the non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done = 0, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else        /* mutex was abandoned */
        break;  /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
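The blocking and polling variants compared above can be sketched as follows. This is a minimal illustration, not code from the course; counter_lock, counter, add_blocking() and add_if_free() are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical shared state protected by one mutex. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

/* Blocking variant: waits as long as needed for the mutex,
   comparable to WaitForSingleObject(hMutex, INFINITE). */
void add_blocking(int value) {
    int rc = pthread_mutex_lock(&counter_lock);
    assert(rc == 0);
    (void)rc;
    counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling variant: returns false at once if the mutex is busy,
   comparable to WaitForSingleObject(hMutex, 0). */
bool add_if_free(int value) {
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc == EBUSY)
        return false;
    assert(rc == 0);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return true;
}
```

A caller that cannot afford to block would use add_if_free() and retry later when it returns false.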
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInherit, LPTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal() - one thread
– pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
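Because pthreads has no manual-reset event, one is usually built from a mutex, a condition variable and a flag. The sketch below shows one way to do that; the manual_event type and the event_*() functions are hypothetical names, not part of POSIX.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical manual-reset event: stays signaled until event_reset(). */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} manual_event;

void event_init(manual_event *e, bool initial_state) {
    assert(e != NULL);
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Signal the event and wake every waiting thread. */
void event_set(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Return the event to the non-signaled state. */
void event_reset(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Block until the event is signaled; does not reset it. */
void event_wait(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

bool event_is_set(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    bool s = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return s;
}

void event_destroy(manual_event *e) {
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what makes the event "sticky": a thread that calls event_wait() after event_set() still passes, which a bare condition variable would not guarantee.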
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive – is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage);
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
Combines a WriteFile, ReadFile sequence into one call
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
Combines CreateFile, WriteFile, ReadFile, CloseHandle into one call
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if client connected between the CreateNamedPipe() call and ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic (for writes, up to PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
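The fixed-size-record convention mentioned above can be sketched for a FIFO as follows. RECORD_SIZE, fifo_send() and fifo_receive() are hypothetical names chosen for this illustration; creating the FIFO with mkfifo() and opening it is left to the caller.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_SIZE 64   /* hypothetical fixed record length, well below PIPE_BUF */

/* Write one zero-padded, fixed-size record; returns 0 on success, -1 on error. */
int fifo_send(int fd, const char *text) {
    char record[RECORD_SIZE] = {0};
    assert(strlen(text) < RECORD_SIZE);
    strcpy(record, text);
    return write(fd, record, RECORD_SIZE) == RECORD_SIZE ? 0 : -1;
}

/* Read exactly one record; returns 0 on success, -1 otherwise. */
int fifo_receive(int fd, char record[RECORD_SIZE]) {
    return read(fd, record, RECORD_SIZE) == RECORD_SIZE ? 0 : -1;
}
```

Because every message has the same length, the reader never has to search the byte stream for message boundaries, which is what PIPE_TYPE_MESSAGE provides automatically on Windows.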
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf( "Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile() – open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
107107
Locate a server via mailslot
[Figure: application clients 0 to n act as mailslot servers. Each one creates the status mailslot with CreateMailslot(), reads the server status with ReadFile(hMS, &ServStat) and then connects to the server. The application server acts as mailslot client: in a loop it sleeps, opens the status mailslot with CreateFile() and sends its status with WriteFile(hMS, &StatInfo). The message is sent periodically.]
108108
Creating a mailslot
HANDLE CreateMailslot(LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa);
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is max message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0 – immediate return
– MAILSLOT_WAIT_FOREVER – infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name - retrieve handle for local mailslot
– \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length: 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
66
Solution to Critical-Section Problem
Bounded Waiting
– It must not be possible for a thread requiring access to a critical section to be delayed indefinitely
– When no thread is in a critical section, any thread that requests entry must be permitted to enter without delay
77
Initial Attempts to Solve Problem
Only 2 threads, T0 and T1
General structure of thread Ti (other thread Tj)
do {
    enter section
    critical section
    exit section
    remainder section
} while (1);
Threads may share some common variables to synchronize their actions
88
First Attempt: Algorithm 1
Shared variables
– Initialization: int turn = 0;
– turn == i: Ti can enter its critical section
Thread Ti
do {
    while (turn != i);
    critical section
    turn = j;
    remainder section
} while (1);
Satisfies mutual exclusion but not progress: the threads must alternate strictly, so Ti cannot re-enter while Tj stays in its remainder section
99
Second Attempt: Algorithm 2
Shared variables
– initialization: int flag[2]; flag[0] = flag[1] = 0;
– flag[i] == 1: Ti is ready to enter its critical section
Thread Ti
do {
    flag[i] = 1;
    while (flag[j] == 1);
    critical section
    flag[i] = 0;
    remainder section
} while (1);
Satisfies mutual exclusion, but not the progress requirement: if both threads set their flags at the same time, both wait forever
1010
Algorithm 3 (Peterson's Algorithm - 1981)
Shared variables of algorithms 1 and 2 - initialization
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti
do {
    flag[i] = 1;
    turn = j;
    while ((flag[j] == 1) && turn == j);
    critical section
    flag[i] = 0;
    remainder section
} while (1);
Solves the critical-section problem for two threads
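On modern hardware the plain loads and stores of Peterson's algorithm may be reordered, so a working version needs sequentially consistent atomics. A minimal sketch using C11 atomics follows; peterson_lock() and peterson_unlock() are hypothetical names, and flag and turn mirror the slide.

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int flag[2];   /* flag[i] == 1: thread i wants to enter */
static atomic_int turn;      /* index of the thread that gets priority on a conflict */

/* Entry section for thread i (i is 0 or 1). */
void peterson_lock(int i) {
    assert(i == 0 || i == 1);
    int j = 1 - i;
    atomic_store(&flag[i], 1);          /* announce interest */
    atomic_store(&turn, j);             /* give the other thread priority */
    while (atomic_load(&flag[j]) == 1 && atomic_load(&turn) == j)
        ;                               /* busy wait */
}

/* Exit section for thread i. */
void peterson_unlock(int i) {
    assert(i == 0 || i == 1);
    atomic_store(&flag[i], 0);
}
```

atomic_store() and atomic_load() default to sequentially consistent ordering, which is what keeps the store to flag[i] from being reordered after the load of flag[j].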
1111
Dekker's Algorithm (1965)
This is the first correct solution proposed for the two-thread (two-process) case
Originally developed by Dekker in a different context, it was applied to the critical section problem by Dijkstra
Dekker adds the idea of a favored thread and allows access to either thread when the request is uncontested
When there is a conflict, one thread is favored, and the priority reverses after successful execution of the critical section
1212
Dekker's Algorithm (contd.)
Shared variables - initialization
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti
do {
    flag[i] = 1;
    while (flag[j]) {
        if (turn == j) {
            flag[i] = 0;
            while (turn == j);
            flag[i] = 1;
        }
    }
    critical section
    turn = j;
    flag[i] = 0;
    remainder section
} while (1);
1313
Bakery Algorithm (Lamport 1979)
A Solution to the Critical Section problem for n threads
Before entering its CS a thread receives a number
Holder of the smallest number enters the CS
If threads Ti and Tj receive the same number: if i < j, then Ti is served first; else Tj is served first
The numbering scheme generates numbers in monotonically non-decreasing order
– i.e., 1, 1, 1, 2, 3, 3, 3, 4, 4, 5
1414
Bakery Algorithm
Notation: "<" establishes lexicographical order among 2-tuples (ticket, thread id)
$(a,b) < (c,d)$ if $a < c$, or if $a = c$ and $b < d$
$\max(a_0, \ldots, a_{n-1})$ is a number $k$ such that $k \ge a_i$ for $i = 0, \ldots, n-1$
Shared data
int choosing[n];
int number[n]; - the ticket
Data structures are initialized to 0
1515
Bakery Algorithm
do {
    choosing[i] = 1;
    number[i] = max(number[0], number[1], ..., number[n-1]) + 1;
    choosing[i] = 0;
    for (j = 0; j < n; j++) {
        while (choosing[j] == 1);
        while ((number[j] != 0) && ((number[j], j) "<" (number[i], i)));
    }
    critical section
    number[i] = 0;
    remainder section
} while (1);
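The bakery algorithm can be written out with C11 atomics as in the sketch below. N, ticket_before(), bakery_lock() and bakery_unlock() are hypothetical names; choosing and number follow the slides.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define N 4   /* hypothetical number of threads */

static atomic_int choosing[N];   /* 1 while thread i is picking a ticket */
static atomic_int number[N];     /* ticket of thread i, 0 = not interested */

/* Lexicographic "<" on (ticket, thread id) pairs: (a,b) < (c,d). */
bool ticket_before(int a, int b, int c, int d) {
    return a < c || (a == c && b < d);
}

/* Entry section for thread i. */
void bakery_lock(int i) {
    assert(i >= 0 && i < N);
    atomic_store(&choosing[i], 1);
    int max = 0;
    for (int k = 0; k < N; k++) {
        int t = atomic_load(&number[k]);
        if (t > max)
            max = t;
    }
    atomic_store(&number[i], max + 1);   /* take the next ticket */
    atomic_store(&choosing[i], 0);
    for (int j = 0; j < N; j++) {
        while (atomic_load(&choosing[j]) == 1)
            ;                            /* wait until j has its ticket */
        while (atomic_load(&number[j]) != 0 &&
               ticket_before(atomic_load(&number[j]), j,
                             atomic_load(&number[i]), i))
            ;                            /* wait while j is ahead of us */
    }
}

/* Exit section for thread i. */
void bakery_unlock(int i) {
    assert(i >= 0 && i < N);
    atomic_store(&number[i], 0);
}
```

Two threads may draw the same ticket because taking the maximum is not atomic; the thread id in the tuple comparison breaks the tie.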
1616
Mutual Exclusion - Hardware Support
Interrupt Disabling
ndash Concurrent threads cannot overlap on a uniprocessor
ndash Thread will run until performing a system call or interrupt happens
Special Atomic Machine Instructions
– Test and Set Instruction - read & write a memory location
– Exchange Instruction - swap register and memory location
Problems with Machine-Instruction Approach
ndash Busy waiting
ndash Starvation is possible
ndash Deadlock is possible
1717
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean &target) {
    boolean rv = target;
    target = true;
    return rv;
}
1818
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti
do {
    while (TestAndSet(lock));
    critical section
    lock = false;
    remainder section
} while (1);
1919
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
2020
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
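The Swap-based entry protocol corresponds to atomic_exchange() in C11: the call stores the new value and hands back the old one, which plays the role of key after the swap. A minimal sketch with hypothetical names lock_word, swap_in_one(), swap_lock() and swap_unlock():

```c
#include <assert.h>
#include <stdatomic.h>

static atomic_int lock_word;   /* 0 = free, 1 = held */

/* One Swap(lock, key) step with key == 1: stores 1 into the lock
   and returns the value the lock held before. */
int swap_in_one(void) {
    return atomic_exchange(&lock_word, 1);
}

/* Entry section: keep swapping until the old value was 0. */
void swap_lock(void) {
    int key = 1;
    while (key == 1)
        key = swap_in_one();
}

/* Exit section: only the holder may release the lock. */
void swap_unlock(void) {
    assert(atomic_load(&lock_word) == 1);
    atomic_store(&lock_word, 0);
}
```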
2121
Semaphores
Semaphore S – integer variable
can only be accessed via two atomic operations
wait (S):
    while (S <= 0);
    S--;
signal (S):
    S++;
2222
Critical Section of n Threads
Shared data
semaphore mutex; initially mutex = 1
Thread Ti
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
2323
Semaphore Implementation
Semaphores may suspend/resume threads
– Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
– suspend() suspends the thread that invokes it
– resume(T) resumes the execution of a blocked thread T
2424
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
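The bookkeeping of this implementation can be traced with a small single-threaded model, in which suspend() and resume() are replaced by return values. This is only a sketch of the counting logic, not a usable semaphore; all names and the fixed queue capacity are assumptions made for the illustration.

```c
#include <assert.h>

#define MAX_WAITERS 8      /* hypothetical queue capacity */
#define NO_THREAD   (-1)

typedef struct {
    int value;                 /* if negative, -value threads are queued */
    int queue[MAX_WAITERS];    /* FIFO of suspended thread ids (the list L) */
    int head, count;
} semaphore;

void sem_init_model(semaphore *s, int initial) {
    s->value = initial;
    s->head = 0;
    s->count = 0;
}

/* Models wait(S): returns 1 if thread tid would be suspended, 0 if it proceeds. */
int sem_wait_model(semaphore *s, int tid) {
    s->value--;
    if (s->value < 0) {
        assert(s->count < MAX_WAITERS);
        s->queue[(s->head + s->count) % MAX_WAITERS] = tid;
        s->count++;
        return 1;
    }
    return 0;
}

/* Models signal(S): returns the id of the thread to resume, or NO_THREAD. */
int sem_signal_model(semaphore *s) {
    s->value++;
    if (s->value <= 0) {
        int tid = s->queue[s->head];
        s->head = (s->head + 1) % MAX_WAITERS;
        s->count--;
        return tid;
    }
    return NO_THREAD;
}
```

With an initial value of 1, the first caller proceeds and later callers are queued; each signal then hands the semaphore to the longest-waiting thread instead of making it available to everyone.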
2525
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code
    Ti              Tj
    A               wait(flag)
    signal(flag)    B
2626
Two Types of Semaphores
Counting semaphore
ndash integer value can range over an unrestricted domain
Binary semaphore
ndash integer value can range only between 0 and 1
ndash can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
2727
Deadlock and Starvation
Deadlock
– two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
    T0              T1
    wait(S)         wait(Q)
    wait(Q)         wait(S)
    signal(S)       signal(Q)
    signal(Q)       signal(S)
2828
Deadlock and Starvation
Starvation – indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution
– all code should acquire/release semaphores in the same order
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events An event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads) and multiprocessing
3131
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
ndash Protects the system from the users
ndash Protects the user (process) from themselves
ndash System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
ndash Does not affect scheduling
ndash Thread context includes info about execution mode (along with registers etc)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc., like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
– interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
3737
Trap dispatching
[Figure: trap handlers route each kind of trap to its dispatcher. Interrupts go to the interrupt dispatcher, which calls interrupt service routines; system service calls go to the system service dispatcher, which calls system services; hardware and software exceptions go to the exception dispatcher, which calls exception handlers; virtual address exceptions go to the virtual memory manager's pager]
Trap: processor's mechanism to capture executing thread
– Switch from user to kernel mode
– Interrupts – asynchronous
– Exceptions - synchronous
Interrupt Dispatching
[Figure: an interrupt arrives while user or kernel mode code is running; the interrupt dispatch routine and the interrupt service routine then run in kernel mode]
Interrupt dispatch routine
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch
3939
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– this masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
4040
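Interrupt Precedence via IRQLs (x86)
IRQL   Name
31     High
30     Power fail
29     Interprocessor Interrupt
28     Clock
27     Profile & Synch (Srv 2003)
...    Device levels, starting with Device 1 at the bottom
2      Dispatch/DPC
1      APC
0      Passive/Low
Device levels and above: hardware interrupts
Levels 1 and 2: deferrable software interrupts
Level 0: normal thread execution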
4141
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library – Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected with one IRQL
4343
Predefined IRQLs
High – used when halting the system (via KeBugCheck())
Power fail – originated in the NT design document, but has never been used
Inter-processor interrupt
– used to request action from other processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update system's clock, allocation of CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler – Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all, normal thread execution
4545
IRQLs on 64-bit Systems
x64
IRQL   Name
15     High/Profile
14     Interprocessor Interrupt/Power
13     Clock
12     Synch (Srv 2003)
...    Device n
3      Device 1
2      Dispatch/DPC
1      APC
0      Passive/Low
IA64
IRQL   Name
15     High/Profile/Power
14     Interprocessor Interrupt
13     Clock
12     Synch (MP only)
...    Device n
4      Device 1
3      Correctable Machine Check
2      Dispatch/DPC & Synch (UP only)
1      APC
0      Passive/Low
4646
Interrupt Prioritization amp Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device
  Can be configured with IntFilter utility in Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source
  Processors are assigned round robin for each interrupt vector
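The two arithmetic rules for deriving an IRQL can be written down directly. The sketch below assumes 16 IRQ lines on an x86 uniprocessor and 256 IDT vectors, and reads the x64/IA64 rule as integer division of the vector number by 16; the function names are hypothetical.

```c
#include <assert.h>

/* x86 uniprocessor rule: IRQL = 27 - IRQ.
   Assumption: a classic interrupt controller with IRQ lines 0..15. */
int irql_x86_up(int irq) {
    assert(irq >= 0 && irq <= 15);
    return 27 - irq;
}

/* x64 and IA64 rule: IRQL = IDT vector number / 16,
   i.e. the upper four bits of an 8-bit vector. */
int irql_from_vector(int vector) {
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

For example, under these assumptions IRQ 0 maps to IRQL 27 on an x86 uniprocessor, and vector 0xD1 maps to IRQL 13 on x64.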
4747
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when kernel is deep within many layers of code
– Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free, or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
5050
Kernel Synchronization
[Figure: processors A and B both access the DPC queue, which is guarded by a spinlock; each processor executes the sequence below]
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin    /* critical section */
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: Checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
– Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the flag of the first waiting processor is modified
– Exactly one processor is being signaled
– Pre-determined wait order
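The idea of waiting on a processor-local flag can be illustrated with the MCS lock, a well-known queued spinlock design. The sketch below uses C11 atomics and is not taken from the Windows kernel; mcs_node, mcs_lock, mcs_acquire() and mcs_release() are names chosen for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One queue entry per waiting processor; each waiter spins on its own flag. */
typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;
} mcs_node;

/* The lock itself is only a pointer to the tail of the queue. */
typedef struct {
    _Atomic(mcs_node *) tail;
} mcs_lock;

void mcs_acquire(mcs_lock *l, mcs_node *me) {
    assert(l != NULL && me != NULL);
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    mcs_node *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);                /* link behind predecessor */
        while (atomic_load(&me->locked))
            ;                                         /* spin on the local flag only */
    }
}

void mcs_release(mcs_lock *l, mcs_node *me) {
    mcs_node *succ = atomic_load(&me->next);
    if (succ == NULL) {
        mcs_node *expected = me;
        /* No visible successor: try to mark the lock free. */
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;
        /* A successor is enqueueing; wait until it has linked itself. */
        while ((succ = atomic_load(&me->next)) == NULL)
            ;
    }
    atomic_store(&succ->locked, false);               /* signal exactly one waiter */
}
```

Each waiter spins on its own node, so releasing the lock touches only the successor's flag, and the queue order fixes who gets the lock next.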
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
– qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU
  Allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks: PFN or Page Frame Database lock
– Some locks completely eliminated
  Charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks)
  Doesn't use spinlocks when no contention
  Used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once
  "All": all objects must be in the signalled state concurrently to resolve the wait
– All wait calls include optional timeout argument
– Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
– Events (may be auto-reset or manual reset, may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signalled upon exit or terminate)
– Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object, and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
[Figure: interaction with thread scheduling. Thread states shown: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. A thread object is created and initialized; a thread that waits on an object handle enters the Waiting state; when the object is set to the signaled state, the wait is complete]
5858
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread – it may preempt a running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled to signaled: Owning thread releases mutex
– signaled to nonsignaled: Resumed thread acquires mutex
– Effect: Kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled to signaled: Owning thread or other thread releases mutex
– signaled to nonsignaled: Resumed thread acquires mutex
– Effect: Kernel resumes one waiting thread
Semaphore
– nonsignaled to signaled: One thread releases the semaphore, freeing a resource
– signaled to nonsignaled: A thread acquires the semaphore; more resources are not available
– Effect: Kernel resumes one or more waiting threads
6161
What signals an object (contd.)
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Event
– nonsignaled to signaled: A thread sets the event
– signaled to nonsignaled: Kernel resumes one or more threads
– Effect: Kernel resumes one or more waiting threads
Event pair
– nonsignaled to signaled: Dedicated thread sets one event in the event pair
– signaled to nonsignaled: Kernel resumes the other dedicated thread
– Effect: Kernel resumes waiting dedicated thread
Timer
– nonsignaled to signaled: Timer expires
– signaled to nonsignaled: A thread (re)initializes the timer
– Effect: Kernel resumes all waiting threads
Thread
– nonsignaled to signaled: Thread terminates
– signaled to nonsignaled: A thread reinitializes the thread object
– Effect: Kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues amp Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: DPC queue, a queue head followed by a list of DPC objects]
6666
Delivering a DPC
[Figure: the interrupt dispatch table with IRQLs from High and Power failure down to Dispatch/DPC, APC and Low, next to the DPC queue and the dispatcher]
1. Timer expires, kernel queues DPC that will release all waiting threads; kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless the thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when the thread is in an "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: thread object with a kernel-mode (K) and a user-mode (U) APC queue, each a list of APC objects]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting:
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 & not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
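The four accounting rules amount to a small decision function. The sketch below is only an illustration of that rule table; `charge_target` and `charge_tick` are hypothetical names, not kernel identifiers, and the `user_mode` flag is an assumption about how the user/kernel split below IRQL 2 would be decided.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* Decide where one clock tick is charged, given the IRQL that the
 * clock interrupt found the processor running at. */
static charge_target charge_tick(int irql, bool in_dpc, bool user_mode)
{
    if (irql > 2)
        return CHARGE_INTERRUPT;                 /* IRQL > 2 */
    if (irql == 2)
        return in_dpc ? CHARGE_DPC               /* IRQL = 2, in a DPC */
                      : CHARGE_THREAD_KERNEL;    /* IRQL = 2, no DPC */
    return user_mode ? CHARGE_THREAD_USER        /* IRQL < 2 */
                     : CHARGE_THREAD_KERNEL;
}
```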
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, a system that is busy while no process appears to be running must be busy with interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add the Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout: common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (Size, Type, State, Wait list head, object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type, and Next link]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    __try {
        EnterCriticalSection( &crit );
        counter += local_value;
    }
    /* leave exactly once per enter, even if the body raises an exception */
    __finally { LeaveCriticalSection( &crit ); }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads:
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
  dwTimeout == 0: no wait, check whether object is signaled
  dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for first signaled object or all objects; function returns index of first signaled object
Side effects
– Mutexes, auto-reset events, and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
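The blocking and polling variants can be contrasted in a few lines of portable C. This is a minimal sketch, not taken from the slides: `demo_mutex`, `demo_counter`, `add_blocking`, and `add_if_free` are hypothetical names, and the mutex is a default (non-recursive) pthread mutex.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;
static int demo_counter = 0;

/* Blocking variant: waits as long as necessary for the mutex. */
static void add_blocking(int value)
{
    pthread_mutex_lock(&demo_mutex);
    demo_counter += value;
    pthread_mutex_unlock(&demo_mutex);
}

/* Polling variant: returns 1 if the update was made, 0 if the mutex
 * was busy and the caller should retry later. */
static int add_if_free(int value)
{
    int rc = pthread_mutex_trylock(&demo_mutex);
    if (rc == EBUSY)
        return 0;
    assert(rc == 0);
    demo_counter += value;
    pthread_mutex_unlock(&demo_mutex);
    return 1;
}
```

One difference worth keeping in mind when porting: a Windows mutex can be re-acquired by the thread that already owns it, whereas a default pthread mutex cannot, so `add_if_free()` reports "busy" even when the caller itself holds the lock.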
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
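Because a condition variable keeps no state of its own, a manual-reset event has to be emulated by pairing it with a mutex and a flag. The following is one possible sketch under that assumption; `manual_event` and the `event_*` functions are hypothetical names, not part of pthreads or the Windows API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;   /* the state a bare condition variable lacks */
} manual_event;

static void event_init(manual_event *e, bool initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent() on a manual-reset event: releases all waiters and
 * stays signaled until event_reset() is called. */
static void event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(). */
static void event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Like WaitForSingleObject(hEvent, INFINITE); the loop also guards
 * against spurious wakeups. */
static void event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

static void event_destroy(manual_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```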
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable. */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process. */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd)
/* Repeat (symmetrically) for the second process. */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles. */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete. */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. means local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile(), ReadFile() sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile(), WriteFile(), ReadFile(), CloseHandle()
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe()
Use DisconnectNamedPipe() to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request. */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out. */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client. */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}   /* End of server operation. */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain
107107
Locate a server via mailslot
[Figure: app clients 0 … n act as mailslot servers; the app server acts as the mailslot client and sends its status message periodically]
Mailslot servers (app client 0 … app client n):
hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile( hMS, &ServStat ); /* connect to server */
Mailslot client (app server):
while (…) { Sleep(…); hMS = CreateFile( "\\*\mailslot\status" ); … WriteFile( hMS, &StatInfo ); }
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
77
Initial Attempts to Solve Problem
Only 2 threads, T0 and T1
General structure of thread Ti (other thread Tj):
do {
    enter section
    critical section
    exit section
    remainder section
} while (1);
Threads may share some common variables to synchronize their actions
88
First Attempt Algorithm 1
Shared variables
– Initialization: int turn = 0;
– turn == i: Ti can enter its critical section
Thread Ti:
do {
    while (turn != i) ;
    critical section
    turn = j;
    remainder section
} while (1);
Satisfies mutual exclusion, but not progress
99
Second Attempt Algorithm 2
Shared variables
– Initialization: int flag[2]; flag[0] = flag[1] = 0;
– flag[i] == 1: Ti is ready to enter its critical section
Thread Ti:
do {
    flag[i] = 1;
    while (flag[j] == 1) ;
    critical section
    flag[i] = 0;
    remainder section
} while (1);
Satisfies mutual exclusion, but not the progress requirement
1010
Algorithm 3 (Peterson's Algorithm - 1981)
Shared variables of algorithms 1 and 2 - initialization:
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti:
do {
    flag[i] = 1;
    turn = j;
    while ((flag[j] == 1) && turn == j) ;
    critical section
    flag[i] = 0;
    remainder section
} while (1);
Solves the critical-section problem for two threads
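On modern compilers and processors the plain `int` variables above can be reordered, so a runnable version needs atomic accesses. The sketch below assumes C11 `<stdatomic.h>` with its default sequentially consistent ordering plus pthreads; `peterson_enter`, `peterson_exit`, `worker`, and `run_peterson_demo` are hypothetical names introduced for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int flag[2];      /* flag[i] == 1: thread i wants to enter */
static atomic_int turn;         /* which thread must yield on a conflict */
static long shared_counter;     /* protected by the algorithm itself */

static void peterson_enter(int i)
{
    int j = 1 - i;
    atomic_store(&flag[i], 1);
    atomic_store(&turn, j);
    while (atomic_load(&flag[j]) == 1 && atomic_load(&turn) == j)
        ;                       /* busy wait */
}

static void peterson_exit(int i)
{
    atomic_store(&flag[i], 0);
}

struct worker_arg { int id; long iterations; };

static void *worker(void *p)
{
    struct worker_arg *arg = p;
    for (long k = 0; k < arg->iterations; k++) {
        peterson_enter(arg->id);
        shared_counter++;       /* critical section */
        peterson_exit(arg->id);
    }
    return NULL;
}

/* Runs both threads and returns the final counter value. */
static long run_peterson_demo(long iterations)
{
    pthread_t t[2];
    struct worker_arg args[2] = { { 0, iterations }, { 1, iterations } };

    shared_counter = 0;
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, &args[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return shared_counter;
}
```

If mutual exclusion holds, no increment is lost, so `run_peterson_demo(n)` should return `2 * n`.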
1111
Dekker's Algorithm (1965)
This is the first correct solution proposed for the two-thread (two-process) case
Originally developed by Dekker in a different context, it was applied to the critical-section problem by Dijkstra
Dekker adds the idea of a favored thread and allows access to either thread when the request is uncontested
When there is a conflict, one thread is favored, and the priority reverses after successful execution of the critical section
1212
Dekker's Algorithm (contd)
Shared variables - initialization:
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti:
do {
    flag[i] = 1;
    while (flag[j]) {
        if (turn == j) {
            flag[i] = 0;
            while (turn == j) ;
            flag[i] = 1;
        }
    }
    critical section
    turn = j;
    flag[i] = 0;
    remainder section
} while (1);
1313
Bakery Algorithm (Lamport 1979)
A solution to the critical-section problem for n threads
Before entering its CS, a thread receives a number
Holder of the smallest number enters the CS
If threads Ti and Tj receive the same number: if i < j, then Ti is served first, else Tj is served first
The numbering scheme generates numbers in monotonically non-decreasing order
– i.e., 1, 1, 1, 2, 3, 3, 3, 4, 4, 5
1414
Bakery Algorithm
Notation: "<" establishes lexicographical order among 2-tuples (ticket, thread id)
(a, b) < (c, d) if a < c, or if a == c and b < d
max(a0, …, an-1) is a number k such that k ≥ ai for i = 0, …, n – 1
Shared data:
int choosing[n];
int number[n]; - the ticket
Data structures are initialized to 0
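The two helper notions above, the lexicographic "<" on (ticket, thread id) pairs and the max-plus-one ticket, can be written as ordinary C functions. `ticket_before` and `next_ticket` are hypothetical names; the sketch covers only these helpers and leaves out the synchronization the full algorithm needs.

```c
#include <assert.h>
#include <stdbool.h>

/* (a, b) "<" (c, d): compare tickets first, thread ids break ties. */
static bool ticket_before(int a, int b, int c, int d)
{
    return a < c || (a == c && b < d);
}

/* Next ticket: one more than the largest ticket currently held.
 * A ticket of 0 means "not competing". */
static int next_ticket(const int number[], int n)
{
    int max = 0;
    for (int k = 0; k < n; k++)
        if (number[k] > max)
            max = number[k];
    return max + 1;
}
```

Two threads that read the array at the same moment can draw the same ticket, which is why the thread id is needed as a tie-breaker.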
1515
Bakery Algorithm
do {
    choosing[i] = 1;
    number[i] = max(number[0], number[1], …, number[n-1]) + 1;
    choosing[i] = 0;
    for (j = 0; j < n; j++) {
        while (choosing[j] == 1) ;
        while ((number[j] != 0) && ((number[j], j) "<" (number[i], i))) ;
    }
    critical section
    number[i] = 0;
    remainder section
} while (1);
1616
Mutual Exclusion - Hardware Support
Interrupt Disabling
– Concurrent threads cannot overlap on a uniprocessor
– Thread will run until it performs a system call or an interrupt happens
Special Atomic Machine Instructions
– Test and Set Instruction: read & write a memory location
– Exchange Instruction: swap register and memory location
Problems with Machine-Instruction Approach
– Busy waiting
– Starvation is possible
– Deadlock is possible
1717
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean &target) {
    boolean rv = target;
    target = true;
    return rv;
}
1818
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti:
do {
    while (TestAndSet(lock)) ;
    critical section
    lock = false;
    remainder section
} while (1);
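C11 exposes a test-and-set primitive directly as `atomic_flag`, so the lock above can be tried out in portable C. This is a minimal sketch assuming C11 atomics and pthreads; `spin_acquire`, `spin_release`, `add_many`, and `run_spin_demo` are hypothetical names, and the thread count of four is arbitrary.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;   /* clear == unlocked */
static long protected_total;

/* atomic_flag_test_and_set() returns the previous value and sets the
 * flag in one atomic step, which is exactly TestAndSet(lock). */
static void spin_acquire(void)
{
    while (atomic_flag_test_and_set(&lock_flag))
        ;   /* busy wait */
}

static void spin_release(void)
{
    atomic_flag_clear(&lock_flag);   /* lock = false */
}

static void *add_many(void *arg)
{
    long n = *(long *)arg;
    for (long k = 0; k < n; k++) {
        spin_acquire();
        protected_total++;           /* critical section */
        spin_release();
    }
    return NULL;
}

/* Four threads each add per_thread to the total; returns the total. */
static long run_spin_demo(long per_thread)
{
    pthread_t t[4];

    protected_total = 0;
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, add_many, &per_thread);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return protected_total;
}
```

The lock guarantees mutual exclusion but not fairness: nothing prevents one thread from winning the flag repeatedly, which is the starvation risk listed among the problems of the machine-instruction approach.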
1919
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
2020
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti:
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
2121
Semaphores
Semaphore S – integer variable
can only be accessed via two atomic operations:
wait(S):
    while (S <= 0) ;
    S--;
signal(S):
    S++;
2222
Critical Section of n Threads
Shared data:
semaphore mutex; initially mutex = 1
Thread Ti:
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
2323
Semaphore Implementation
Semaphores may suspend/resume threads
– Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations:
– suspend() suspends the thread that invokes it
– resume(T) resumes the execution of a blocked thread T
2424
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
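The same wait/signal logic can be tried out on top of pthreads. In this sketch a condition variable stands in for the list S.L and for suspend()/resume(), and an extra `wakeups` counter (an addition that is not in the pseudocode) makes sure each signal releases exactly one waiter despite spurious wakeups. `blocking_sem`, `bsem_*`, `run_b`, and `run_order_demo` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

typedef struct {
    int value;               /* negative: -value threads are suspended */
    int wakeups;             /* resume tokens not yet consumed */
    pthread_mutex_t lock;    /* makes wait/signal atomic */
    pthread_cond_t  queue;   /* stands in for the list S.L */
} blocking_sem;

static void bsem_init(blocking_sem *s, int initial)
{
    s->value = initial;
    s->wakeups = 0;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->queue, NULL);
}

static void bsem_wait(blocking_sem *s)
{
    pthread_mutex_lock(&s->lock);
    s->value--;
    if (s->value < 0) {
        /* suspend(): sleep until a signal hands over a token */
        do {
            pthread_cond_wait(&s->queue, &s->lock);
        } while (s->wakeups < 1);
        s->wakeups--;
    }
    pthread_mutex_unlock(&s->lock);
}

static void bsem_signal(blocking_sem *s)
{
    pthread_mutex_lock(&s->lock);
    s->value++;
    if (s->value <= 0) {
        /* resume(T): release one suspended thread */
        s->wakeups++;
        pthread_cond_signal(&s->queue);
    }
    pthread_mutex_unlock(&s->lock);
}

/* Ordering demo: statement B in the second thread runs only after
 * statement A in the caller, using a semaphore initialized to 0. */
static blocking_sem order_flag;
static int shared_step;

static void *run_b(void *seen)
{
    bsem_wait(&order_flag);
    *(int *)seen = shared_step;   /* B: observes the write made by A */
    return NULL;
}

static int run_order_demo(void)
{
    pthread_t tj;
    int seen = -1;

    bsem_init(&order_flag, 0);
    shared_step = 0;
    pthread_create(&tj, NULL, run_b, &seen);
    shared_step = 1;              /* A */
    bsem_signal(&order_flag);
    pthread_join(tj, NULL);
    return seen;
}
```

`run_order_demo()` should return 1 regardless of which thread the scheduler runs first, because B cannot pass the wait until A has signaled.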
2525
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code:
Ti              Tj
A               wait(flag)
signal(flag)    B
2626
Two Types of Semaphores
Counting semaphore
ndash integer value can range over an unrestricted domain
Binary semaphore
ndash integer value can range only between 0 and 1
ndash can be simpler to implement
Counting semaphore S can be implemented using binary semaphores
2727
Deadlock and Starvation
Deadlock
– two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0            T1
wait(S);      wait(Q);
wait(Q);      wait(S);
signal(S);    signal(Q);
signal(Q);    signal(S);
2828
Deadlock and Starvation
Starvation – indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution
– all code should acquire/release semaphores in the same order
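One common way to enforce a single acquisition order is to sort the locks by address before taking them. The sketch below applies that idea to the S/Q example, using pthread mutexes in place of semaphores initialized to 1; `lock_both`, `unlock_both`, `t0`, `t1`, and `run_ordered_demo` are hypothetical names, and address ordering is just one possible convention.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t S = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t Q = PTHREAD_MUTEX_INITIALIZER;
static long transfers;   /* touched only while both locks are held */

/* Always lock the lower address first, whatever order the caller names
 * the locks in, so two threads can never hold one lock each. */
static void lock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

/* T0 names the locks as (S, Q), T1 as (Q, S), as in the deadlock example. */
static void *t0(void *n)
{
    for (long k = 0; k < *(long *)n; k++) {
        lock_both(&S, &Q);
        transfers++;
        unlock_both(&S, &Q);
    }
    return NULL;
}

static void *t1(void *n)
{
    for (long k = 0; k < *(long *)n; k++) {
        lock_both(&Q, &S);
        transfers++;
        unlock_both(&Q, &S);
    }
    return NULL;
}

static long run_ordered_demo(long n)
{
    pthread_t a, b;

    transfers = 0;
    pthread_create(&a, NULL, t0, &n);
    pthread_create(&b, NULL, t1, &n);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return transfers;
}
```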
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events; an event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
3131
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc., like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
– Interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
3737
Trap dispatching
Trap: processor's mechanism to capture an executing thread
– Switch from user to kernel mode
– Interrupts: asynchronous
– Exceptions: synchronous
[Figure: trap handlers. Interrupt → interrupt dispatcher → interrupt service routines; system service call → system service dispatcher → system services; HW and SW exceptions → exception dispatcher → exception handlers; virtual address exceptions → virtual memory manager's pager]
Interrupt Dispatching
[Figure: an interrupt arrives while user- or kernel-mode code is running; the interrupt dispatch routine and the interrupt service routine run in kernel mode]
Interrupt dispatch routine:
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine:
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch
3939
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
4040
Interrupt Precedence via IRQLs (x86)
[Figure: x86 IRQL levels, highest to lowest]
31 High
30 Power fail
29 Interprocessor Interrupt
28 Clock
Profile & Synch (Srv 2003)
Device n … Device 1 (hardware interrupts)
2 Dispatch/DPC (deferrable software interrupt)
1 APC (deferrable software interrupt)
0 Passive/Low (normal thread execution)
4141
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library, the Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn ISR on/off
Interrupt objects can synchronize access to ISR data
– Multiple instances of an ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected to the same IRQL
4343
Predefined IRQLs
High – used when halting the system (via KeBugCheck())
Power fail – originated in the NT design document, but has never been used
Inter-processor interrupt
– Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update the system's clock and to allocate CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler: Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all; normal thread execution
4545
IRQLs on 64-bit Systems
[Figure: IRQL levels on x64 and IA64, highest to lowest]
x64:
15 High/Profile
14 Interprocessor Interrupt/Power
13 Clock
12 Synch (Srv 2003)
Device n … Device 1
2 Dispatch/DPC
1 APC
0 Passive/Low
IA64:
15 High/Profile/Power
14 Interprocessor Interrupt
13 Clock
12 Synch (MP only)
Device n … Device 1 (Device 1 at 4)
3 Correctable Machine Check
2 Dispatch/DPC & Synch (UP only)
1 APC
0 Passive/Low
4646
Interrupt Prioritization amp Delivery
IRQLs are determined as follows:
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device. Can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source. Processors are assigned round robin for each interrupt vector
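The two fixed IRQL mappings are simple arithmetic, shown here as a short sketch. It assumes the garbled x64/IA64 rule reads "IDT vector number divided by 16" (integer division); `irql_x86_up` and `irql_from_vector` are hypothetical helper names, not kernel routines.

```c
#include <assert.h>

/* x86 uniprocessor systems: IRQL = 27 - IRQ */
static int irql_x86_up(int irq)
{
    return 27 - irq;
}

/* x64 and IA64 systems: IRQL = IDT vector number / 16 (integer division) */
static int irql_from_vector(int idt_vector)
{
    return idt_vector / 16;
}
```

With these rules, IRQ 0 on an x86 uniprocessor maps to IRQL 27 and IRQ 8 to IRQL 19, while IDT vector 0x51 on x64 falls into IRQL 5.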
4747
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when the kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– A spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
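To make the "data cell plus atomic test-and-modify" idea concrete, here is a minimal user-mode sketch using C11 atomics. It is not the Windows KSPIN_LOCK implementation; the type and function names (demo_spinlock, demo_spin_*) are hypothetical and chosen only for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* A spinlock is one memory cell; a single bit of state is enough. */
typedef struct {
    atomic_flag held;
} demo_spinlock;

void demo_spin_init(demo_spinlock *l)
{
    atomic_flag_clear(&l->held);
}

/* Atomic test-and-set: returns true only if the lock was free and is now ours. */
bool demo_spin_try_acquire(demo_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the cell is observed free and claimed in one atomic step. */
void demo_spin_acquire(demo_spinlock *l)
{
    while (!demo_spin_try_acquire(l)) {
        /* spin */
    }
}

void demo_spin_release(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The one-owner-at-a-time property follows from the test-and-set: only one caller can see the previous value "clear".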
Kernel Synchronization
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
[Figure: Processor A and Processor B each execute the sequence below; the spinlock guards the critical section around the shared DPC queue]
do acquire_spinlock(DPC) until (SUCCESS)
begin
    remove DPC from queue
end
release_spinlock(DPC)
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
The first processor acquires the lock; other processors wait on a processor-local flag
– Thus the busy-wait loop requires no access to the memory bus
When the lock is released, the flag of the first waiting processor is modified
– Exactly one processor is signaled
– Pre-determined wait order
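The queueing idea can be illustrated with an MCS-style lock, a classic published design in which every waiter spins on a flag in its own node. This is a sketch in portable C11 atomics, not the Windows kernel's queued spinlock; all names (mcs_lock, mcs_node, mcs_*) are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiting processor/thread; 'locked' is the local flag it spins on. */
typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;
} mcs_node;

/* The lock itself only records the tail of the waiter queue. */
typedef struct {
    _Atomic(mcs_node *) tail;
} mcs_lock;

void mcs_init(mcs_lock *l)
{
    atomic_init(&l->tail, NULL);
}

void mcs_acquire(mcs_lock *l, mcs_node *me)
{
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    /* Append ourselves to the queue in one atomic step. */
    mcs_node *prev = atomic_exchange(&l->tail, me);
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        /* Spin on our own flag only, not on the shared lock word. */
        while (atomic_load(&me->locked)) {
            /* spin */
        }
    }
}

void mcs_release(mcs_lock *l, mcs_node *me)
{
    mcs_node *succ = atomic_load(&me->next);
    if (succ == NULL) {
        /* No known successor: try to mark the queue empty. */
        mcs_node *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;
        /* A successor is in the middle of enqueueing; wait for its link. */
        while ((succ = atomic_load(&me->next)) == NULL) {
            /* spin */
        }
    }
    /* Signal exactly one waiter: the next one in queue order. */
    atomic_store(&succ->locked, false);
}
```

Each caller passes its own node, so the hand-off order is the enqueue order and a release touches only the successor's flag.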
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction in the use of spinlocks & the length of time they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once; "all" means all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include an optional timeout argument
– Waiting threads consume no CPU time
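The "any" versus "all" rule can be modeled in a few lines. This is a toy model of the resolution rule only, not the kernel's wait code; the function name and the return convention (index for "any", 0 for a satisfied "all", -1 for "keep waiting") are assumptions made for the example.

```c
#include <stdbool.h>
#include <stddef.h>

/*
 * Decide whether a wait on `count` objects resolves.
 * wait_all == false: returns the index of the first signaled object, or -1.
 * wait_all == true : returns 0 if every object is signaled at once, else -1.
 */
int wait_resolves(const bool signaled[], size_t count, bool wait_all)
{
    size_t n_signaled = 0;
    int first = -1;

    for (size_t i = 0; i < count; i++) {
        if (signaled[i]) {
            if (first < 0)
                first = (int)i;
            n_signaled++;
        }
    }
    if (wait_all)
        return (count > 0 && n_signaled == count) ? 0 : -1;
    return first;
}
```

With objects {nonsignaled, signaled, signaled}, a wait-any resolves on index 1, while a wait-all keeps waiting until the first object is signaled too.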
Waiting
Waitable objects include
– Events (may be auto-reset or manual-reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and threads (signaled upon exit or terminate)
– Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
[Figure: interaction with thread scheduling. Thread states shown: Initialized (create and initialize thread object), Ready, Standby, Running, Waiting (thread waits on an object handle), Transition, Terminated. The wait is complete when the object is set to the signaled state.]
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread – it may preempt a running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object?
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Mutex (kernel mode)
– Nonsignaled → signaled: owning thread releases mutex
– Signaled → nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– Nonsignaled → signaled: owning thread or other thread releases mutex
– Signaled → nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Semaphore
– Nonsignaled → signaled: one thread releases the semaphore, freeing a resource
– Signaled → nonsignaled: a thread acquires the semaphore; more resources are not available
– Effect: kernel resumes one or more waiting threads
What signals an object? (contd)
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Event
– Nonsignaled → signaled: a thread sets the event
– Signaled → nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– Nonsignaled → signaled: dedicated thread sets one event in the event pair
– Signaled → nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes waiting dedicated thread
Timer
– Nonsignaled → signaled: timer expires
– Signaled → nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– Nonsignaled → signaled: thread terminates
– Signaled → nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: per-processor DPC queue: queue head → DPC object → DPC object → DPC object]
Delivering a DPC
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table (High, Power failure, Dispatch/DPC, APC, Low), the DPC queue, and the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads; kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
– Permission required for user-mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless the thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when the thread is in an "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel-mode (K) and user-mode (U), each linking APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
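The four accounting rules above form a small decision function. The sketch below only mirrors the slide's rules; the enum, the function name, and the in_dpc / in_user_mode parameters are hypothetical stand-ins for state the real clock ISR would read from the processor.

```c
/* Where one clock tick gets charged (hypothetical names). */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

charge_target clock_tick_charge(int irql, int in_dpc, int in_user_mode)
{
    if (irql > 2)
        return CHARGE_INTERRUPT;                 /* IRQL > 2: interrupt time */
    if (irql == 2)                               /* dispatch/DPC level */
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: ordinary thread execution, split by processor mode */
    return in_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```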
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g. one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add the Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– Some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "Signaled" vs. "nonsignaled"
– When signaled, a wait on the object is satisfied
– Different object types differ in terms of what changes their state
– Wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: header (Size, Type, State, Wait list head) followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (header: Size, Type, State, Wait list head; object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type, Next link]
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once, so Enter and Leave calls stay matched */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
– hObject specifies the kernel object
– dwTimeout specifies the wait time in msec
– dwTimeout == 0: no wait, check whether the object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for the first signaled object or for all objects; when waiting for any object, the return value minus WAIT_OBJECT_0 is the index of the signaled object
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to the non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free the mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in the same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
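A short sketch of the polling style named in the comparison: pthread_mutex_trylock() returns immediately, with EBUSY playing the role that WAIT_TIMEOUT plays for WaitForSingleObject() with a zero timeout. The wrapper name try_enter and its return convention are hypothetical.

```c
#include <errno.h>
#include <pthread.h>

/*
 * Poll a mutex without blocking.
 * Returns 1 if the mutex was acquired, 0 if it is currently held, -1 on any other error.
 */
int try_enter(pthread_mutex_t *m)
{
    int rc = pthread_mutex_trylock(m);

    if (rc == 0)
        return 1;       /* acquired; caller must pthread_mutex_unlock() later */
    if (rc == EBUSY)
        return 0;       /* already locked; nothing to undo */
    return -1;
}
```

A caller that gets 0 can do other work and poll again, whereas pthread_mutex_lock() would have blocked until the owner released the mutex.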
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases the semaphore count by 1
– ReleaseSemaphore() may increment the count by any value
– ReleaseSemaphore() reports the old semaphore count (via lpPreviousCount)
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
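Because a condition variable has no memory of past signals, a manual-reset event is usually emulated by pairing it with a mutex and a boolean. The sketch below shows one such emulation; the manual_event type and its functions are hypothetical names, not part of pthreads or the Windows API.

```c
#include <pthread.h>
#include <stdbool.h>

/* Emulated manual-reset event: the flag remembers the state, the condvar wakes waiters. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} manual_event;

void manual_event_init(manual_event *e, bool initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent() on a manual-reset event: stays signaled and releases all waiters. */
void manual_event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(): later waiters block again. */
void manual_event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; does not consume the signal. */
void manual_event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}
```

The while loop around pthread_cond_wait() also guards against spurious wakeups, which POSIX permits.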
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates prog1 and prog2, connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if the pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server, each over its own instance of the same named pipe
– Server can respond to a client using the same instance
Pipe can be accessed over the network
– Location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage);
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
Combines a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
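For comparison with CreateNamedPipe(), here is a minimal sketch of creating a FIFO on UNIX with mkfifo() and confirming the result with stat(). The helper name make_request_fifo, the 0600 permission choice, and any path passed in are assumptions for illustration.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create a FIFO at `path` (readable and writable by the owner only).
 * Returns 1 if the FIFO was created and is visible as a FIFO, 0 otherwise
 * (for example, when something already exists at that path).
 */
int make_request_fifo(const char *path)
{
    struct stat st;

    if (mkfifo(path, 0600) != 0)
        return 0;
    if (stat(path, &st) != 0)
        return 0;
    return S_ISFIFO(st.st_mode) ? 1 : 0;
}
```

Unlike a Windows named pipe, the FIFO is an entry in the file system: clients open it with open() by path, and it persists until unlink() removes it.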
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server.\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain
Locate a server via mailslot
[Figure: app clients 0 … n act as mailslot servers; the app server acts as mailslot client and sends its status message periodically]
Mailslot servers (app client 0 … app client n)
    hMS = CreateMailslot( "\\.\mailslot\status" );
    ReadFile(hMS, &ServStat);  /* connect to server */
Mailslot client (app server)
    while (...) { Sleep(...); hMS = CreateFile( "\\*\mailslot\status" ); ... WriteFile(hMS, &StatInfo); }
Creating a mailslot
HANDLE CreateMailslot(LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa);
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
First Attempt: Algorithm 1
Shared variables
– Initialization: int turn = 0;
– turn == i: Ti can enter its critical section
Thread Ti
do {
    while (turn != i) ;
    critical section
    turn = j;
    remainder section
} while (1);
Satisfies mutual exclusion, but not progress
Second Attempt: Algorithm 2
Shared variables
– Initialization: int flag[2]; flag[0] = flag[1] = 0;
– flag[i] == 1: Ti is ready to enter its critical section
Thread Ti
do {
    flag[i] = 1;
    while (flag[j] == 1) ;
    critical section
    flag[i] = 0;
    remainder section
} while (1);
Satisfies mutual exclusion, but not the progress requirement
Algorithm 3 (Peterson's Algorithm - 1981)
Shared variables of algorithms 1 and 2 - initialization
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti
do {
    flag[i] = 1;
    turn = j;
    while ((flag[j] == 1) && turn == j) ;
    critical section
    flag[i] = 0;
    remainder section
} while (1);
Solves the critical-section problem for two threads
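On modern hardware the plain int variables of the slide are not enough, because compilers and CPUs may reorder the stores and loads. A hedged sketch using C11 sequentially consistent atomics is shown below; the names pt_flag, pt_turn, peterson_lock and peterson_unlock are hypothetical.

```c
#include <stdatomic.h>

/* Shared state for the two threads, i in {0, 1}; zero-initialized. */
atomic_int pt_flag[2];
atomic_int pt_turn;

/* Entry section for thread i; the other thread is j = 1 - i. */
void peterson_lock(int i)
{
    int j = 1 - i;

    atomic_store(&pt_flag[i], 1);   /* announce interest */
    atomic_store(&pt_turn, j);      /* give the other thread priority */
    while (atomic_load(&pt_flag[j]) == 1 && atomic_load(&pt_turn) == j) {
        /* busy-wait while the other thread is interested and has the turn */
    }
}

/* Exit section for thread i. */
void peterson_unlock(int i)
{
    atomic_store(&pt_flag[i], 0);
}
```

atomic_store and atomic_load default to sequential consistency, which is what keeps the "set flag, then read the other flag" order intact.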
Dekker's Algorithm (1965)
This is the first correct solution proposed for the two-thread (two-process) case
Originally developed by Dekker in a different context, it was applied to the critical section problem by Dijkstra
Dekker adds the idea of a favored thread and allows access to either thread when the request is uncontested
When there is a conflict, one thread is favored, and the priority reverses after successful execution of the critical section
Dekker's Algorithm (contd)
Shared variables - initialization
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti
do {
    flag[i] = 1;
    while (flag[j]) {
        if (turn == j) {
            flag[i] = 0;
            while (turn == j) ;
            flag[i] = 1;
        }
    }
    critical section
    turn = j;
    flag[i] = 0;
    remainder section
} while (1);
Bakery Algorithm (Lamport 1979)
A solution to the critical section problem for n threads
Before entering its CS, a thread receives a number
Holder of the smallest number enters the CS
If threads Ti and Tj receive the same number: if i < j, then Ti is served first; else Tj is served first
The numbering scheme generates numbers in monotonically non-decreasing order
– i.e., 1, 1, 1, 2, 3, 3, 3, 4, 4, 5
Bakery Algorithm
Notation: "<" establishes lexicographical order among 2-tuples (ticket #, thread id #)
(a, b) < (c, d) if a < c, or if a == c and b < d
max(a0, …, an-1) is a number k such that k >= ai for i = 0, …, n – 1
Shared data
int choosing[n];
int number[n]; - the ticket
Data structures are initialized to 0
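The notation on this slide translates directly into two helpers: the lexicographic comparison of (ticket, thread id) pairs and the maximum over the ticket array. The function names are hypothetical, and the sketch covers only the notation, not the synchronization itself.

```c
/* (a, b) < (c, d) if a < c, or if a == c and b < d */
int ticket_before(int a, int b, int c, int d)
{
    return a < c || (a == c && b < d);
}

/* max(number[0], ..., number[n-1]); returns 0 when every entry is 0 (no tickets taken) */
int max_ticket(const int number[], int n)
{
    int k = 0;

    for (int i = 0; i < n; i++) {
        if (number[i] > k)
            k = number[i];
    }
    return k;
}
```

A thread choosing a ticket takes max_ticket(number, n) + 1; ties between equal tickets are broken by the smaller thread id.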
Bakery Algorithm
do {
    choosing[i] = 1;
    number[i] = max(number[0], number[1], …, number[n-1]) + 1;
    choosing[i] = 0;
    for (j = 0; j < n; j++) {
        while (choosing[j] == 1) ;
        while ((number[j] != 0) && ((number[j], j) "<" (number[i], i))) ;
    }
    critical section
    number[i] = 0;
    remainder section
} while (1);
Mutual Exclusion - Hardware Support
Interrupt Disabling
– Concurrent threads cannot overlap on a uniprocessor
– Thread will run until it performs a system call or an interrupt happens
Special Atomic Machine Instructions
– Test and Set Instruction: read & write a memory location
– Exchange Instruction: swap register and memory location
Problems with Machine-Instruction Approach
– Busy waiting
– Starvation is possible
– Deadlock is possible
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean &target) {
    boolean rv = target;
    target = true;
    return rv;
}
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti
do {
    while (TestAndSet(lock)) ;
    critical section
    lock = false;
    remainder section
} while (1);
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
Semaphores
Semaphore S – integer variable
Can only be accessed via two atomic operations
wait (S):
    while (S <= 0) ;
    S--;
signal (S):
    S++;
Critical Section of n Threads
Shared data
semaphore mutex; initially mutex = 1
Thread Ti
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
Semaphore Implementation
Semaphores may suspend/resume threads
– Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
– suspend() suspends the thread that invokes it
– resume(T) resumes the execution of a blocked thread T
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
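One way to realize the suspend/resume semaphore in user space is with a pthread mutex and condition variable, where pthread_cond_wait() stands in for suspend() and pthread_cond_signal() for resume(T). This is a sketch under that substitution; the demo_sem names are hypothetical, and unlike the slide's version the counter here never goes negative because the waiter queue lives inside the condition variable.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t lock;     /* makes wait/signal atomic */
    pthread_cond_t  nonzero;  /* blocked threads sleep here instead of spinning */
    int             value;    /* number of available units, always >= 0 */
} demo_sem;

void demo_sem_init(demo_sem *s, int initial)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->value = initial;
}

/* wait(S): block (without busy waiting) until a unit is available, then take it. */
void demo_sem_wait(demo_sem *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->value == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->value--;
    pthread_mutex_unlock(&s->lock);
}

/* signal(S): return a unit and wake one blocked thread, if any. */
void demo_sem_signal(demo_sem *s)
{
    pthread_mutex_lock(&s->lock);
    s->value++;
    pthread_cond_signal(&s->nonzero);
    pthread_mutex_unlock(&s->lock);
}
```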
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code:
Ti              Tj
A               wait(flag)
signal(flag)    B
Two Types of Semaphores
Counting semaphore
– Integer value can range over an unrestricted domain
Binary semaphore
– Integer value can range only between 0 and 1
– Can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
Deadlock and Starvation
Deadlock
– Two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0              T1
wait(S);        wait(Q);
wait(Q);        wait(S);
signal(S);      signal(Q);
signal(Q);      signal(S);
Deadlock and Starvation
Starvation – indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution
– All code should acquire/release semaphores in the same order
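The "same order" rule can be enforced mechanically by always locking the lower-addressed object first. The sketch below applies that idea to two pthread mutexes; lock_pair and unlock_pair are hypothetical helpers, and ordering by address is one common convention, not the only possible global order.

```c
#include <pthread.h>
#include <stdint.h>

/* Acquire two mutexes in a fixed global order (by address), whatever order the caller names them in. */
void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    if (b != a)
        pthread_mutex_lock(b);
}

/* Release both; the release order does not affect deadlock freedom. */
void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    if (b != a)
        pthread_mutex_unlock(b);
}
```

With this helper, T0 calling lock_pair(&S, &Q) and T1 calling lock_pair(&Q, &S) both end up taking the locks in the same order, so the circular wait from the deadlock example cannot form.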
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects, which may act as mutexes and semaphores
Dispatcher objects may also provide events; an event acts much like a condition variable
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons:
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times, mostly in the "System" process
- Scheduled, preempted, etc. like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
- Interrupt dispatcher invokes the interrupt service routine (ISR)
- ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread
Trap dispatching
[Figure: trap handlers. Interrupts go to the interrupt dispatcher, which calls interrupt service routines; system service calls go to the system service dispatcher, which calls system services; hardware and software exceptions go to the exception dispatcher, which calls exception handlers; virtual address exceptions go to the virtual memory manager's pager]
Trap: processor's mechanism to capture an executing thread
- Switch from user to kernel mode
- Interrupts: asynchronous
- Exceptions: synchronous
Interrupt Dispatching
[Figure: an interrupt arrives while user- or kernel-mode code is running; the interrupt dispatch routine and the interrupt service routine run in kernel mode]
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
Interrupt Precedence via IRQLs (x86)
[Figure: x86 IRQL levels, highest to lowest]
Hardware interrupts: High (31), Power fail (30), Interprocessor Interrupt (29), Clock (28), Profile & Synch (Srv 2003), Device levels down to Device 1
Deferrable software interrupts: Dispatch/DPC (2), APC (1)
Normal thread execution: Passive/Low (0)
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library, the Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected to the same IRQL
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update the system's clock and to allocate CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler, Kernprof.exe, Res Kit)
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all; normal thread execution
IRQLs on 64-bit Systems
[Figure: IRQL levels on x64 and IA64, highest to lowest]
x64: High/Profile (15), Interprocessor Interrupt/Power (14), Clock (13), Synch (Srv 2003) (12), Device n ... Device 1, Dispatch/DPC (2), APC (1), Passive/Low (0)
IA64: High/Profile/Power (15), Interprocessor Interrupt (14), Clock (13), Synch (MP only) (12), Device n ... Device 1 (4), Correctable Machine Check (3), Dispatch/DPC & Synch (UP only) (2), APC (1), Passive/Low (0)
Interrupt Prioritization & Delivery
IRQLs are determined as follows:
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device; can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
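The two arithmetic rules for deriving an IRQL can be written out as small functions. This is a sketch of the formulas only, not kernel code; the function names and the assumed input ranges (16 legacy IRQ lines, 256 IDT vectors) are assumptions made for the example.

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ.
 * Assumption: irq is one of the 16 legacy interrupt lines (0..15). */
int irql_x86_up(int irq) {
    assert(irq >= 0 && irq <= 15);
    return 27 - irq;
}

/* x64 / IA64 rule from the slide: IRQL = IDT vector number / 16.
 * Assumption: vector is an index into a 256-entry IDT (0..255). */
int irql_from_vector(int vector) {
    assert(vector >= 0 && vector <= 255);
    return vector / 16;   /* integer division: 16 vectors share one IRQL */
}
```

For example, IRQ 1 maps to IRQL 26 under the x86 rule, and vector 0x93 maps to IRQL 9 under the x64/IA64 rule.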
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when the kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Sync on MP: use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
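The "data cell plus atomic test-and-modify" idea can be sketched in user mode with C11 atomics. This is not the Windows kernel's KSPIN_LOCK code; demo_spinlock, spin_acquire, spin_release and run_spinlock_demo are hypothetical names, and the sched_yield() call is a user-mode concession that a kernel spinlock would not make.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* One memory cell: clear means free, set means owned. */
typedef struct { atomic_flag cell; } demo_spinlock;

static void spin_acquire(demo_spinlock *l) {
    /* Atomic test-and-set: returns the previous value, so the loop
     * ends only for the one thread that changed clear -> set. */
    while (atomic_flag_test_and_set_explicit(&l->cell, memory_order_acquire))
        sched_yield();   /* keep the sketch well-behaved on one core */
}

static void spin_release(demo_spinlock *l) {
    atomic_flag_clear_explicit(&l->cell, memory_order_release);
}

static demo_spinlock lock = { ATOMIC_FLAG_INIT };
static long counter;     /* protected by lock */

static void *spin_worker(void *arg) {
    long n = *(const long *)arg;
    for (long k = 0; k < n; k++) {
        spin_acquire(&lock);
        counter++;       /* critical section */
        spin_release(&lock);
    }
    return NULL;
}

/* Runs four workers and returns the final counter (4 * iterations). */
long run_spinlock_demo(long iterations) {
    pthread_t t[4];
    counter = 0;
    for (int i = 0; i < 4; i++) {
        int rc = pthread_create(&t[i], NULL, spin_worker, &iterations);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return counter;
}
```

Every waiter retries the test-and-set on the same cell, which is the source of the bus contention that queued spinlocks are meant to avoid.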
Kernel Synchronization
[Figure: Processor A and Processor B both try to acquire the spinlock that guards the DPC queue; the holder executes the critical section]
Both processors run the same code:
do acquire_spinlock(DPC) until (SUCCESS)
begin
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
The first processor acquires the lock; other processors wait on a processor-local flag
- Thus the busy-wait loop requires no access to the memory bus
When the lock is released, the flag of the first processor in the queue is modified
- Exactly one processor is signaled
- Pre-determined wait order
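A queue lock of this kind can be sketched in user mode in the style of the well-known MCS lock. This is an illustration of the idea, not the Windows implementation; qnode, queued_lock and the function names are hypothetical, threads stand in for processors, and sched_yield() is added only so the sketch behaves on machines with few cores.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct qnode {
    _Atomic(struct qnode *) next;   /* successor in the wait queue */
    atomic_bool must_wait;          /* the waiter's own, local flag */
} qnode;

typedef struct { _Atomic(qnode *) tail; } queued_lock;

static void queued_acquire(queued_lock *l, qnode *me) {
    atomic_store(&me->next, NULL);
    atomic_store(&me->must_wait, true);
    qnode *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev == NULL)
        return;                                    /* lock was free */
    atomic_store(&prev->next, me);
    while (atomic_load(&me->must_wait))            /* spin on own flag only */
        sched_yield();
}

static void queued_release(queued_lock *l, qnode *me) {
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;                                /* nobody was waiting */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();                         /* successor is still linking in */
    }
    atomic_store(&succ->must_wait, false);         /* signal exactly one waiter */
}

static queued_lock qlock;   /* zero-initialized: tail == NULL means free */
static long qcounter;       /* protected by qlock */

static void *queued_worker(void *arg) {
    long n = *(const long *)arg;
    qnode me;               /* one queue node per thread */
    for (long k = 0; k < n; k++) {
        queued_acquire(&qlock, &me);
        qcounter++;
        queued_release(&qlock, &me);
    }
    return NULL;
}

/* Runs four workers and returns the final counter (4 * iterations). */
long run_queued_lock_demo(long iterations) {
    pthread_t t[4];
    qcounter = 0;
    for (int i = 0; i < 4; i++) {
        int rc = pthread_create(&t[i], NULL, queued_worker, &iterations);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return qcounter;
}
```

Each waiter spins on its own must_wait flag, and the releasing thread clears the flag of its immediate successor only, which gives the pre-determined (FIFO) wait order.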
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length of time they are held
- Scheduling database now per-CPU; allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once; "all" means all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include an optional timeout argument
- Waiting threads consume no CPU time
Waiting
Waitable objects include:
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Figure: thread states (Initialized, Ready, Standby, Running, Waiting, Transition, Terminated) and the interaction with thread scheduling. A thread object is created and initialized; a thread waits on an object handle and enters the Waiting state; when the object is set to the signaled state, the wait is complete]
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to a wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread: it may preempt a running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: owning thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: owning thread or other thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect: kernel resumes one or more waiting threads
What signals an object (contd)
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Event
- nonsignaled -> signaled: a thread sets the event
- signaled -> nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: dedicated thread sets one event in the event pair
- signaled -> nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes waiting dedicated thread
Timer
- nonsignaled -> signaled: timer expires
- signaled -> nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: thread terminates
- signaled -> nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: DPC queue; a queue head linked to a list of DPC objects]
Delivering a DPC
[Figure: interrupt dispatch table with levels High, Power failure, Dispatch/DPC, APC, Low; DPC objects in the DPC queue; the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads; kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, and call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
- Permission required for user-mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make a thread suspend/terminate itself (env. subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless the thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when the thread is in an "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel-mode (K) and user-mode (U), each a list of APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
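The four accounting rules can be read as a small decision function. The sketch below only restates those rules; charge_target, charge_for_tick and the 0..31 IRQL range check (the x86 range) are assumptions for the illustration and are not taken from the kernel.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* Decides where one clock tick is charged, given the IRQL at which the
 * clock interrupt arrived. */
charge_target charge_for_tick(int irql, bool processing_dpc, bool in_user_mode) {
    assert(irql >= 0 && irql <= 31);      /* assumption: x86 IRQL range */
    if (irql > 2)
        return CHARGE_INTERRUPT;          /* IRQL > 2: interrupt time */
    if (irql == 2)                        /* IRQL = 2: DPC or thread kernel time */
        return processing_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: the thread's user or kernel time */
    return in_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

A tick that arrives at IRQL 2 outside a DPC is charged to the thread's kernel time, while the same tick during a DPC is charged to DPC time and to no thread.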
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g. one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add the Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout; a common header (Size, Type, State, wait list head) followed by object-type-specific data (see ntddk\inc\ddk\ntddk.h)]
Any kernel object you can wait for is a "dispatcher object"
- Some exist exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "Signaled" vs. "nonsignaled"
- When signaled, a wait on the object is satisfied
- Different object types differ in terms of what changes their state
- Wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects point to their wait block lists; each wait block (list entry, object, thread, key, type, next link) links a waiting thread to a dispatcher object (Size, Type, State, wait list head, object-type-specific data)]
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted, but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* runs on every exit path, so leave exactly once, here */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads:
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
- dwTimeout == 0: no wait, check whether object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects; when waiting for the first, the function returns the index of the first signaled object (as WAIT_OBJECT_0 + index)
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
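The polling behaviour of pthread_mutex_trylock() can be observed with a short experiment. This is a minimal sketch; try_from_other_thread and trylock_result are hypothetical helper names introduced for the example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t m = PTHREAD_MUTEX_INITIALIZER;

/* Tries the mutex once without blocking and reports the result code. */
static void *try_from_other_thread(void *out) {
    int rc = pthread_mutex_trylock(&m);   /* returns immediately */
    if (rc == 0)
        pthread_mutex_unlock(&m);         /* we got it, so give it back */
    *(int *)out = rc;
    return NULL;
}

/* Returns what trylock reported in a second thread:
 * EBUSY while this thread holds the mutex, 0 when the mutex is free. */
int trylock_result(int hold_lock) {
    pthread_t t;
    int rc = -1;
    int err;

    if (hold_lock)
        pthread_mutex_lock(&m);
    err = pthread_create(&t, NULL, try_from_other_thread, &rc);
    assert(err == 0);
    (void)err;
    pthread_join(t, NULL);
    if (hold_lock)
        pthread_mutex_unlock(&m);
    return rc;
}
```

The EBUSY result plays the role that WAIT_TIMEOUT plays for WaitForSingleObject() with a zero timeout.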
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value
- ReleaseSemaphore() returns old semaphore count
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
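Since a condition variable carries no state of its own, a manual-reset event is usually approximated by pairing it with a flag and a mutex. The sketch below shows one common way to do this; manual_event, event_set, event_wait and run_event_demo are hypothetical names, not part of pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* A flag guarded by a mutex, plus a condition variable to wake waiters. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool signaled;
} manual_event;

static void event_set(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = true;                  /* stays set until reset by hand */
    pthread_cond_broadcast(&e->cond);    /* release all current waiters */
    pthread_mutex_unlock(&e->lock);
}

static void event_wait(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                 /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

static manual_event ev = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_COND_INITIALIZER, false
};
static int released;                     /* protected by ev.lock */

static void *waiter(void *arg) {
    (void)arg;
    event_wait(&ev);
    pthread_mutex_lock(&ev.lock);
    released++;
    pthread_mutex_unlock(&ev.lock);
    return NULL;
}

/* Starts nwaiters threads (1..8), sets the event once, and returns how
 * many waiters were released. */
int run_event_demo(int nwaiters) {
    pthread_t t[8];
    assert(nwaiters >= 1 && nwaiters <= 8);
    released = 0;
    ev.signaled = false;
    for (int i = 0; i < nwaiters; i++) {
        int rc = pthread_create(&t[i], NULL, waiter, NULL);
        assert(rc == 0);
        (void)rc;
    }
    event_set(&ev);
    for (int i = 0; i < nwaiters; i++)
        pthread_join(t[i], NULL);
    return released;
}
```

Threads that arrive after event_set() see the flag already set and do not block, which is the manual-reset behaviour that a bare condition variable does not provide.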
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same pipe name, each over its own instance
- Server can respond to client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. means the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns false if the client connected between the CreateNamedPipe call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many comm.)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
Locate a server via mailslot
[Figure: the application server acts as the mailslot client; application clients 0 ... n act as mailslot servers]
Mailslot servers (app client 0 ... app client n): each calls CreateMailslot() for the status mailslot, then ReadFile(hMS, &ServStat, ...), and connects to the server
Mailslot client (app server): loops over Sleep(), CreateFile() on the status mailslot, and WriteFile(hMS, &StatInfo, ...)
Message is sent periodically
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieve handle for local mailslot
- \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max. message length 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
Second Attempt: Algorithm 2
Shared variables
- initialization: int flag[2]; flag[0] = flag[1] = 0;
- flag[i] == 1 implies Ti can enter its critical section
Thread Ti
do {
    flag[i] = 1;
    while (flag[j] == 1) ;
    critical section
    flag[i] = 0;
    remainder section
} while (1);
Satisfies mutual exclusion, but not the progress requirement
Algorithm 3 (Peterson's Algorithm - 1981)
Shared variables of algorithms 1 and 2 - initialization
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti
do {
    flag[i] = 1;
    turn = j;
    while ((flag[j] == 1) && turn == j) ;
    critical section
    flag[i] = 0;
    remainder section
} while (1);
Solves the critical-section problem for two threads
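A minimal sketch of Peterson's algorithm in portable C follows. It assumes C11 atomics with the default sequentially consistent ordering, because plain int variables may be reordered by the compiler or CPU. The names peterson_enter, peterson_exit and peterson_demo, and the use of sched_yield() inside the busy-wait loop, are illustrative choices and not part of the slides.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Shared variables from the slide, made atomic (sequentially consistent). */
static atomic_int flag[2];
static atomic_int turn;
static long shared_counter;         /* protected by the Peterson lock */
static long iterations_per_thread;  /* hypothetical demo parameter */

static void peterson_enter(int i) {
    int j = 1 - i;
    atomic_store(&flag[i], 1);      /* I want to enter */
    atomic_store(&turn, j);         /* but the other thread may go first */
    while (atomic_load(&flag[j]) == 1 && atomic_load(&turn) == j)
        sched_yield();              /* busy wait, giving up the CPU */
}

static void peterson_exit(int i) {
    atomic_store(&flag[i], 0);
}

static void *peterson_worker(void *arg) {
    int i = *(int *)arg;
    for (long k = 0; k < iterations_per_thread; k++) {
        peterson_enter(i);
        shared_counter++;           /* critical section */
        peterson_exit(i);
    }
    return NULL;
}

/* Runs threads T0 and T1 and returns the final counter value. */
long peterson_demo(long iterations) {
    pthread_t t[2];
    int id[2] = {0, 1};
    shared_counter = 0;
    iterations_per_thread = iterations;
    atomic_store(&flag[0], 0);
    atomic_store(&flag[1], 0);
    atomic_store(&turn, 0);
    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&t[i], NULL, peterson_worker, &id[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return shared_counter;
}
```

If mutual exclusion holds, peterson_demo(n) returns exactly 2 * n, since no increment of the shared counter is lost.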
Dekker's Algorithm (1965)
This is the first correct solution proposed for the two-thread (two-process) case
Originally developed by Dekker in a different context, it was applied to the critical-section problem by Dijkstra
Dekker adds the idea of a favored thread and allows access to either thread when the request is uncontested
When there is a conflict, one thread is favored, and the priority reverses after successful execution of the critical section
Dekker's Algorithm (contd)
Shared variables - initialization
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti
do {
    flag[i] = 1;
    while (flag[j]) {
        if (turn == j) {
            flag[i] = 0;
            while (turn == j) ;
            flag[i] = 1;
        }
    }
    critical section
    turn = j;
    flag[i] = 0;
    remainder section
} while (1);
Bakery Algorithm (Lamport 1974)
A solution to the critical-section problem for n threads
Before entering its CS, a thread receives a number
Holder of the smallest number enters the CS
If threads Ti and Tj receive the same number: if i < j, then Ti is served first; else Tj is served first
The numbering scheme generates numbers in monotonically non-decreasing order
- i.e., 1, 1, 1, 2, 3, 3, 3, 4, 4, 5
Bakery Algorithm
Notation: "<" establishes lexicographical order among 2-tuples (ticket, thread id)
(a, b) < (c, d) if a < c, or if a == c and b < d
max(a_0, ..., a_{n-1}) = k such that k >= a_i for i = 0, ..., n - 1
Shared data
int choosing[n];
int number[n];   - the ticket
Data structures are initialized to 0
Bakery Algorithm
do {
    choosing[i] = 1;
    number[i] = max(number[0], number[1], ..., number[n-1]) + 1;
    choosing[i] = 0;
    for (j = 0; j < n; j++) {
        while (choosing[j] == 1) ;
        while ((number[j] != 0) && ((number[j], j) < (number[i], i))) ;
    }
    critical section
    number[i] = 0;
    remainder section
} while (1);
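The bakery algorithm can be sketched in C as shown next. This is an illustration under assumptions, not a production lock: it uses C11 sequentially consistent atomics, a fixed hypothetical thread count N_THREADS, and int tickets that are assumed not to overflow during the demo. The helper names (bakery_lock, bakery_unlock, bakery_demo) are made up for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define N_THREADS 4   /* hypothetical thread count for this sketch */

static atomic_int choosing[N_THREADS];
static atomic_int number[N_THREADS];   /* tickets; 0 means "not competing" */
static long bakery_counter;            /* protected by the bakery lock */
static long bakery_iterations;

static void bakery_lock(int i) {
    /* Take a ticket one larger than every ticket currently visible. */
    atomic_store(&choosing[i], 1);
    int max = 0;
    for (int k = 0; k < N_THREADS; k++) {
        int v = atomic_load(&number[k]);
        if (v > max) max = v;
    }
    atomic_store(&number[i], max + 1);
    atomic_store(&choosing[i], 0);

    for (int j = 0; j < N_THREADS; j++) {
        while (atomic_load(&choosing[j]) == 1)
            sched_yield();
        for (;;) {
            int nj = atomic_load(&number[j]);
            int ni = atomic_load(&number[i]);
            /* Stop waiting unless (number[j], j) < (number[i], i). */
            if (nj == 0 || nj > ni || (nj == ni && j >= i))
                break;
            sched_yield();
        }
    }
}

static void bakery_unlock(int i) {
    atomic_store(&number[i], 0);
}

static void *bakery_worker(void *arg) {
    int i = *(int *)arg;
    for (long k = 0; k < bakery_iterations; k++) {
        bakery_lock(i);
        bakery_counter++;              /* critical section */
        bakery_unlock(i);
    }
    return NULL;
}

/* Runs N_THREADS workers and returns the final counter value. */
long bakery_demo(long iterations) {
    pthread_t t[N_THREADS];
    int id[N_THREADS];
    bakery_counter = 0;
    bakery_iterations = iterations;
    for (int i = 0; i < N_THREADS; i++) {
        atomic_store(&choosing[i], 0);
        atomic_store(&number[i], 0);
    }
    for (int i = 0; i < N_THREADS; i++) {
        id[i] = i;
        int rc = pthread_create(&t[i], NULL, bakery_worker, &id[i]);
        assert(rc == 0);
    }
    for (int i = 0; i < N_THREADS; i++)
        pthread_join(t[i], NULL);
    return bakery_counter;
}
```

The tie-break on the thread index is what makes the order total even when two threads draw the same ticket.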
Mutual Exclusion - Hardware Support
Interrupt Disabling
- Concurrent threads cannot overlap on a uniprocessor
- A thread will run until it performs a system call or an interrupt happens
Special Atomic Machine Instructions
- Test and Set Instruction - read and write a memory location
- Exchange Instruction - swap register and memory location
Problems with Machine-Instruction Approach
- Busy waiting
- Starvation is possible
- Deadlock is possible
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean &target) {
    boolean rv = target;
    target = true;
    return rv;
}
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti
do {
    while (TestAndSet(lock)) ;
    critical section
    lock = false;
    remainder section
} while (1);
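In standard C, the closest counterpart of a TestAndSet instruction is atomic_flag_test_and_set() from <stdatomic.h>. The sketch below builds a spinlock from it; the names tas_lock, tas_unlock and tas_demo, the limit of 16 threads, and the sched_yield() call in the spin loop are assumptions made for this illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;  /* lock = false */
static long tas_counter;                          /* protected by lock_flag */
static long tas_iterations;

static void tas_lock(void) {
    /* Returns the previous value: true means someone else holds the lock. */
    while (atomic_flag_test_and_set(&lock_flag))
        sched_yield();
}

static void tas_unlock(void) {
    atomic_flag_clear(&lock_flag);                /* lock = false */
}

static void *tas_worker(void *arg) {
    (void)arg;
    for (long k = 0; k < tas_iterations; k++) {
        tas_lock();
        tas_counter++;                            /* critical section */
        tas_unlock();
    }
    return NULL;
}

/* Runs n_threads workers (at most 16) and returns the final counter. */
long tas_demo(int n_threads, long iterations) {
    pthread_t t[16];
    assert(n_threads > 0 && n_threads <= 16);
    tas_counter = 0;
    tas_iterations = iterations;
    for (int i = 0; i < n_threads; i++) {
        int rc = pthread_create(&t[i], NULL, tas_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < n_threads; i++)
        pthread_join(t[i], NULL);
    return tas_counter;
}
```

Unlike the two-thread software solutions, this lock works for any number of threads, but it gives no bound on how long a particular thread may wait.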
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
Semaphores
Semaphore S - integer variable
can only be accessed via two atomic operations
wait(S):
    while (S <= 0) ;
    S--;
signal(S):
    S++;
Critical Section of n Threads
Shared data
semaphore mutex;   initially mutex = 1
Thread Ti
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
Semaphore Implementation
Semaphores may suspend/resume threads
- Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
- suspend() suspends the thread that invokes it
- resume(T) resumes the execution of a blocked thread T
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
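A blocking semaphore without busy waiting can be sketched with a POSIX mutex and condition variable. Note the differences from the slide, which are assumptions of this sketch: the condition variable plays the role of the list L together with suspend()/resume(), and value never becomes negative (waiters block while it is 0). The type name ksem and its functions are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

typedef struct {
    int value;                 /* number of available units, never < 0 here */
    pthread_mutex_t lock;      /* makes wait/signal atomic */
    pthread_cond_t nonzero;    /* threads suspended in ksem_wait sleep here */
} ksem;

void ksem_init(ksem *s, int initial) {
    s->value = initial;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
}

void ksem_wait(ksem *s) {
    pthread_mutex_lock(&s->lock);
    while (s->value == 0)                       /* suspend instead of spinning */
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->value--;
    pthread_mutex_unlock(&s->lock);
}

void ksem_signal(ksem *s) {
    pthread_mutex_lock(&s->lock);
    s->value++;
    pthread_cond_signal(&s->nonzero);           /* resume one waiting thread */
    pthread_mutex_unlock(&s->lock);
}

static ksem demo_mutex;
static long ksem_counter;
static long ksem_iterations;

static void *ksem_worker(void *arg) {
    (void)arg;
    for (long k = 0; k < ksem_iterations; k++) {
        ksem_wait(&demo_mutex);
        ksem_counter++;                         /* critical section */
        ksem_signal(&demo_mutex);
    }
    return NULL;
}

/* Uses the semaphore (initial value 1) as a mutex for n_threads (max 16). */
long ksem_demo(int n_threads, long iterations) {
    pthread_t t[16];
    assert(n_threads > 0 && n_threads <= 16);
    ksem_init(&demo_mutex, 1);
    ksem_counter = 0;
    ksem_iterations = iterations;
    for (int i = 0; i < n_threads; i++) {
        int rc = pthread_create(&t[i], NULL, ksem_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < n_threads; i++)
        pthread_join(t[i], NULL);
    return ksem_counter;
}
```

The while loop around pthread_cond_wait() re-checks the count after every wakeup, which also covers spurious wakeups.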
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code
Ti              Tj
A               wait(flag)
signal(flag)    B
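The ordering pattern above can be tried out with POSIX unnamed semaphores, where sem_wait() and sem_post() correspond to wait() and signal(). This is a sketch assuming a platform that supports sem_init() (for example Linux); the step counter and the function ordering_demo are hypothetical helpers used only to observe the order.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t flag_sem;          /* the semaphore "flag", initialized to 0 */
static int step;                /* counts statements in execution order */
static int a_step, b_step;      /* when A and B were executed */

static void *thread_ti(void *arg) {
    (void)arg;
    a_step = ++step;            /* statement A */
    sem_post(&flag_sem);        /* signal(flag) */
    return NULL;
}

static void *thread_tj(void *arg) {
    (void)arg;
    while (sem_wait(&flag_sem) != 0) {
        /* wait(flag); retry if interrupted by a signal */
    }
    b_step = ++step;            /* statement B */
    return NULL;
}

/* Returns 1 if A ran before B, 0 otherwise. */
int ordering_demo(void) {
    pthread_t ti, tj;
    step = 0;
    a_step = b_step = 0;
    int rc = sem_init(&flag_sem, 0, 0);
    assert(rc == 0);
    /* Start Tj first on purpose: it must still wait for Ti. */
    rc = pthread_create(&tj, NULL, thread_tj, NULL);
    assert(rc == 0);
    rc = pthread_create(&ti, NULL, thread_ti, NULL);
    assert(rc == 0);
    pthread_join(ti, NULL);
    pthread_join(tj, NULL);
    sem_destroy(&flag_sem);
    return a_step == 1 && b_step == 2;
}
```

Because the semaphore starts at 0, Tj blocks until Ti has executed A, regardless of which thread is scheduled first.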
Two Types of Semaphores
Counting semaphore
- integer value can range over an unrestricted domain
Binary semaphore
- integer value can range only between 0 and 1
- can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
Deadlock and Starvation
Deadlock
- two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0            T1
wait(S)       wait(Q)
wait(Q)       wait(S)
signal(S)     signal(Q)
signal(Q)     signal(S)
Deadlock and Starvation
Starvation - indefinite blocking
- A thread may never be removed from the semaphore queue in which it is suspended
Solution
- all code should acquire/release semaphores in the same order
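One way to enforce a common acquisition order is to sort the locks by address before taking them. The sketch below uses POSIX mutexes in place of the semaphores S and Q; lock_both, unlock_both and lock_order_demo are hypothetical names, and ordering by address is just one possible convention.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t S = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t Q = PTHREAD_MUTEX_INITIALIZER;
static long order_counter;          /* protected by holding both S and Q */
static long order_iterations;

/* Always lock the mutex with the lower address first. */
static void lock_both(pthread_mutex_t *a, pthread_mutex_t *b) {
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b) {
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

static void *worker_t0(void *arg) {
    (void)arg;
    for (long k = 0; k < order_iterations; k++) {
        lock_both(&S, &Q);          /* asks for S then Q */
        order_counter++;
        unlock_both(&S, &Q);
    }
    return NULL;
}

static void *worker_t1(void *arg) {
    (void)arg;
    for (long k = 0; k < order_iterations; k++) {
        lock_both(&Q, &S);          /* asks for Q then S, but locks in the same order */
        order_counter++;
        unlock_both(&Q, &S);
    }
    return NULL;
}

/* Returns the final counter; completes only if no deadlock occurs. */
long lock_order_demo(long iterations) {
    pthread_t t0, t1;
    order_counter = 0;
    order_iterations = iterations;
    int rc = pthread_create(&t0, NULL, worker_t0, NULL);
    assert(rc == 0);
    rc = pthread_create(&t1, NULL, worker_t1, NULL);
    assert(rc == 0);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return order_counter;
}
```

With the opposite-order acquisition of the T0/T1 example, the same workload could block forever; with a fixed order, the circular wait cannot form.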
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events An event acts much like a condition variable
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
- Chapter 7 - Process Synchronization
- Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and interrupt dispatching
IRQL levels & interrupt precedence
Spinlocks and kernel synchronization
Executive synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
- Scheduled, preempted, etc. like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
- Interrupt dispatcher invokes the interrupt service routine (ISR)
- ISR runs in the context of the interrupted thread, so-called "arbitrary thread context"
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread
Trap dispatching
Trap: processor's mechanism to capture an executing thread
- Switch from user to kernel mode
- Interrupts - asynchronous
- Exceptions - synchronous
[Figure: trap handlers]
- Interrupt -> interrupt dispatcher -> interrupt service routines
- System service call -> system service dispatcher -> system services
- HW exceptions, SW exceptions -> exception dispatcher -> exception handlers
- Virtual address exceptions -> virtual memory manager's pager
Interrupt Dispatching
[Figure: an interrupt arrives while user- or kernel-mode code is running; the interrupt dispatch routine and the ISR run in kernel mode]
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
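Interrupt Precedence via IRQLs (x86)
IRQL  Level                         Kind
31    High                          Hardware interrupt
30    Power fail                    Hardware interrupt
29    Interprocessor interrupt      Hardware interrupt
28    Clock                         Hardware interrupt
      Profile & Synch (Srv 2003)    Hardware interrupt
      Device levels (from Device 1) Hardware interrupts
2     Dispatch/DPC                  Deferrable software interrupt
1     APC                           Deferrable software interrupt
0     Passive/Low                   Normal thread execution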
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library - Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected with one IRQL
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update the system's clock and for allocation of CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler - Kernprof.exe, Res Kit)
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all; normal thread execution
IRQLs on 64-bit Systems
IRQL  x64                              IA64
15    High/Profile                     High/Profile/Power
14    Interprocessor interrupt/Power   Interprocessor interrupt
13    Clock                            Clock
12    Synch (Srv 2003)                 Synch (MP only)
...   Device n                         Device n
4     ...                              Device 1
3     Device 1                         Correctable Machine Check
2     Dispatch/DPC                     Dispatch/DPC & Synch (UP only)
1     APC                              APC
0     Passive/Low                      Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
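The two closed-form mappings can be written as small C functions. This is only an arithmetic illustration of the formulas on the slide, assuming the division is integer division; the function names are hypothetical and do not correspond to any Windows kernel or HAL routine.

```c
#include <assert.h>

/* x86 uniprocessor systems: IRQL = 27 - IRQ */
int irql_from_irq_x86_up(int irq) {
    assert(irq >= 0);
    return 27 - irq;
}

/* x64 and IA64 systems: IRQL = IDT vector number / 16 (integer division) */
int irql_from_vector_64bit(int vector) {
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

For example, IRQ 7 on an x86 uniprocessor maps to IRQL 20, and IDT vector 0x81 on x64 maps to IRQL 8.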
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device
  Can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source
  Processors are assigned round robin for each interrupt vector
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when the kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- A spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
Kernel Synchronization
[Figure: Processor A and Processor B both try to acquire the spinlock that protects the DPC queue]
Each processor executes
do acquire_spinlock(DPC) until (SUCCESS)
begin
    remove DPC from queue   (critical section)
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
- Thus, the busy-wait loop requires no access to the memory bus
When releasing the lock, the first waiting processor's flag is modified
- Exactly one processor is being signaled
- Pre-determined wait order
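The idea of waiting on a local flag can be illustrated with an MCS-style queue lock in C11. This is a sketch of the general technique, not the Windows kernel's queued spinlock code; the types mcs_node and mcs_lock, the function names, and the use of threads in place of processors are all assumptions of this example.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;   /* successor in the wait queue */
    atomic_bool locked;                /* local flag this waiter spins on */
} mcs_node;

typedef struct {
    _Atomic(mcs_node *) tail;          /* last waiter, NULL if lock is free */
} mcs_lock;

void mcs_acquire(mcs_lock *l, mcs_node *me) {
    atomic_store(&me->next, (mcs_node *)NULL);
    atomic_store(&me->locked, true);
    mcs_node *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->locked))              /* spin on own flag only */
            sched_yield();
    }
}

void mcs_release(mcs_lock *l, mcs_node *me) {
    mcs_node *succ = atomic_load(&me->next);
    if (succ == NULL) {
        mcs_node *expected = me;
        /* No visible successor: try to mark the lock free. */
        if (atomic_compare_exchange_strong(&l->tail, &expected, (mcs_node *)NULL))
            return;
        /* A successor is enqueueing; wait until it links itself. */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();
    }
    atomic_store(&succ->locked, false);               /* signal exactly one waiter */
}

static mcs_lock queue_lock;
static long mcs_counter;               /* protected by queue_lock */
static long mcs_iterations;

static void *mcs_worker(void *arg) {
    (void)arg;
    mcs_node me;                       /* per-thread queue node */
    for (long k = 0; k < mcs_iterations; k++) {
        mcs_acquire(&queue_lock, &me);
        mcs_counter++;
        mcs_release(&queue_lock, &me);
    }
    return NULL;
}

/* Runs n_threads workers (at most 16) and returns the final counter. */
long mcs_demo(int n_threads, long iterations) {
    pthread_t t[16];
    assert(n_threads > 0 && n_threads <= 16);
    atomic_store(&queue_lock.tail, (mcs_node *)NULL);
    mcs_counter = 0;
    mcs_iterations = iterations;
    for (int i = 0; i < n_threads; i++) {
        int rc = pthread_create(&t[i], NULL, mcs_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < n_threads; i++)
        pthread_join(t[i], NULL);
    return mcs_counter;
}
```

Waiters are served in the order in which they joined the queue, and each release writes to a single waiter's flag rather than to a location every waiter is polling.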
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length of time they are held
- Scheduling database now per-CPU
  Allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated
  Charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks)
  Doesn't use spinlocks when there is no contention
  Used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once
  "All": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include an optional timeout argument
- Waiting threads consume no CPU time
Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on dispatcher objects - outside the kernel
[Figure: interaction with thread scheduling. A thread object is created and initialized and moves through the states Initialized, Ready, Standby, Running, Waiting, Transition, and Terminated. A thread that waits on an object handle enters the Waiting state; when the object is set to the signaled state, the wait is complete and the thread can be scheduled again]
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
Dispatcher re-schedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: owning thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: owning thread or other thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect: kernel resumes one or more waiting threads
What signals an object (contd)
Event
- nonsignaled -> signaled: a thread sets the event
- signaled -> nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: dedicated thread sets one event in the event pair
- signaled -> nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes waiting dedicated thread
Timer
- nonsignaled -> signaled: timer expires
- signaled -> nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: thread terminates
- signaled -> nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
  See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: DPC queue: queue head -> DPC object -> DPC object -> DPC object]
Delivering a DPC
[Figure: interrupt dispatch table with levels High, Power failure, ..., Dispatch/DPC, APC, Low; DPC objects in the DPC queue; thread dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, and call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
- Permission required for user-mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserApc
- Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two queues of APC objects, one for kernel-mode (K) and one for user-mode (U) APCs]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
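The four accounting rules form a simple decision function, sketched here in C. The enum and function names are hypothetical and only restate the rules above; they are not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>

enum charge_target {
    CHARGE_THREAD_USER_OR_KERNEL,   /* IRQL < 2 */
    CHARGE_DPC_TIME,                /* IRQL == 2, processing a DPC */
    CHARGE_THREAD_KERNEL,           /* IRQL == 2, not processing a DPC */
    CHARGE_INTERRUPT_TIME           /* IRQL > 2 */
};

/* Decide where one clock tick is charged, given the interrupted IRQL. */
enum charge_target clock_tick_charge(int irql, bool processing_dpc) {
    assert(irql >= 0);
    if (irql < 2)
        return CHARGE_THREAD_USER_OR_KERNEL;
    if (irql == 2)
        return processing_dpc ? CHARGE_DPC_TIME : CHARGE_THREAD_KERNEL;
    return CHARGE_INTERRUPT_TIME;
}
```

For instance, a tick that interrupts a device ISR at IRQL 5 is charged to interrupt time, independent of the DPC flag.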
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus, threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
[Figure: dispatcher object layout (see \ntddk\inc\ddk\ntddk.h): Size, Type, State, wait list head, object-type-specific data]
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (Size, Type, State, wait list head, object-type-specific data) and thread objects (WaitBlockList) are linked through wait blocks; each wait block holds a list entry, Object and Thread pointers, Key, Type, and Next link]
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection(LPCRITICAL_SECTION sec);
VOID DeleteCriticalSection(LPCRITICAL_SECTION sec);
VOID EnterCriticalSection(LPCRITICAL_SECTION sec);
VOID LeaveCriticalSection(LPCRITICAL_SECTION sec);
BOOL TryEnterCriticalSection(LPCRITICAL_SECTION sec);
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection(&crit);
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection(&crit);
    __try {
        counter += local_value;
    }
    __finally {
        LeaveCriticalSection(&crit);   /* exactly one Leave per Enter */
    }
}
DeleteCriticalSection(&crit);
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject(HANDLE hObject, DWORD dwTimeout);
DWORD WaitForMultipleObjects(DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout);
Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
  dwTimeout == 0 - no wait; check whether the object is signaled
  dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles - pointer to array identifying these objects
- bWaitAll - whether to wait for the first signaled object or for all objects
  Function returns index of first signaled object
Side effects
- Mutexes, auto-reset events, and waitable timers will be reset to non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex(LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPTSTR lpszMutexName);
HANDLE OpenMutex(DWORD fdwAccess, BOOL fInherit, LPTSTR lpszMutexName);
BOOL ReleaseMutex(HANDLE hMutex);
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex(NULL, FALSE, NULL);
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject(mutex, INFINITE);
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex(mutex);
}
CloseHandle(mutex);
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block - equivalent to WaitForSingleObject(hMutex)
- pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
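The POSIX side of this comparison can be sketched as follows: a shared counter protected by pthread_mutex_lock()/pthread_mutex_unlock(), plus a small check that pthread_mutex_trylock() returns immediately with EBUSY when the mutex is already held. The function names posix_mutex_demo and trylock_reports_busy are hypothetical, and the sketch assumes a default (non-recursive) mutex.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;
static long posix_counter;          /* global, shared by all threads */
static long posix_iterations;

static void *posix_worker(void *arg) {
    (void)arg;
    for (long k = 0; k < posix_iterations; k++) {
        pthread_mutex_lock(&counter_mutex);     /* blocks until acquired */
        posix_counter++;
        pthread_mutex_unlock(&counter_mutex);
    }
    return NULL;
}

/* Runs n_threads workers (at most 16) and returns the final counter. */
long posix_mutex_demo(int n_threads, long iterations) {
    pthread_t t[16];
    assert(n_threads > 0 && n_threads <= 16);
    posix_counter = 0;
    posix_iterations = iterations;
    for (int i = 0; i < n_threads; i++) {
        int rc = pthread_create(&t[i], NULL, posix_worker, NULL);
        assert(rc == 0);
    }
    for (int i = 0; i < n_threads; i++)
        pthread_join(t[i], NULL);
    return posix_counter;
}

/* Returns 1 if trylock on an already held mutex fails with EBUSY. */
int trylock_reports_busy(void) {
    pthread_mutex_t m;
    pthread_mutex_init(&m, NULL);               /* default mutex type */
    pthread_mutex_lock(&m);
    int rc = pthread_mutex_trylock(&m);         /* polls, never blocks */
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc == EBUSY;
}
```

The trylock call plays the role of a zero-timeout wait: the caller learns whether the mutex was free without ever being suspended.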
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value
- ReleaseSemaphore() returns the old semaphore count
Semaphores
HANDLE CreateSemaphore(LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPTSTR lpszSemName);
HANDLE OpenSemaphore(DWORD fdwAccess, BOOL fInherit, LPTSTR lpszSemName);
BOOL ReleaseSemaphore(HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount);
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- A manual-reset event can signal several threads simultaneously; it must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- An auto-reset event signals a single thread; the event is reset automatically
- fInitialState == TRUE - create event in signaled state
Events
HANDLE CreateEvent(LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPTSTR lpszEventName);
BOOL SetEvent(HANDLE hEvent);
BOOL ResetEvent(HANDLE hEvent);
BOOL PulseEvent(HANDLE hEvent);
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
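Although pthreads has no direct counterpart, a manual-reset event can be approximated with a mutex, a condition variable, and a boolean state. The following is a sketch under that assumption; the type manual_event and its functions are hypothetical names, and behavior such as PulseEvent() is deliberately not modeled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool signaled;              /* stays true until explicitly reset */
} manual_event;

void manual_event_init(manual_event *e, bool initial_state) {
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

void manual_event_set(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);   /* release all waiting threads */
    pthread_mutex_unlock(&e->lock);
}

void manual_event_reset(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

void manual_event_wait(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                /* returns at once if already set */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

static manual_event start_event;
static atomic_int released_threads;

static void *event_waiter(void *arg) {
    (void)arg;
    manual_event_wait(&start_event);
    atomic_fetch_add(&released_threads, 1);
    return NULL;
}

/* Starts n_waiters threads (max 16), sets the event once, and
   returns how many waiters were released. */
int manual_event_demo(int n_waiters) {
    pthread_t t[16];
    assert(n_waiters > 0 && n_waiters <= 16);
    manual_event_init(&start_event, false);
    atomic_store(&released_threads, 0);
    for (int i = 0; i < n_waiters; i++) {
        int rc = pthread_create(&t[i], NULL, event_waiter, NULL);
        assert(rc == 0);
    }
    manual_event_set(&start_event);     /* one set releases every waiter */
    for (int i = 0; i < n_waiters; i++)
        pthread_join(t[i], NULL);
    manual_event_wait(&start_event);    /* still signaled: does not block */
    return atomic_load(&released_threads);
}
```

Keeping the state in a flag is what distinguishes this from a bare condition variable: a thread that starts waiting after the event was set still passes through.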
Anonymous pipes
BOOL CreatePipe(PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe);
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe(&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle(STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle(STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess(NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle(hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle(STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle(STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess(NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle(hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject(ProcInfo1.hProcess, INFINITE);
WaitForSingleObject(ProcInfo2.hProcess, INFINITE);
CloseHandle(ProcInfo1.hThread); CloseHandle(ProcInfo1.hProcess);
CloseHandle(ProcInfo2.hThread); CloseHandle(ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same pipe name, each over its own instance
- Server can respond to a client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. = local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe call creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage);
9999
Named Pipe Client Connections
CreateFile with the named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
Combines a WriteFile, ReadFile sequence into one call
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
Combines CreateFile, WriteFile, ReadFile, CloseHandle into one call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic (for writes of at most PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
107107
Locate a server via mailslot
[Figure: each application client (App client 0 ... App client n) acts as a mailslot server: it creates the "status" mailslot with CreateMailslot(), reads the server status with ReadFile(hMS, &ServStat), and then connects to the server. The application server acts as the mailslot client: in a loop it calls Sleep(), opens the "status" mailslot with CreateFile(), and writes StatInfo with WriteFile(). The message is sent periodically.]
108108
Creating a mailslot
HANDLE CreateMailslot(LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa);
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for this many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
1010
Algorithm 3 (Peterson's Algorithm - 1981)
Shared variables of algorithms 1 and 2 - initialization:
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti
do {
    flag[i] = 1;
    turn = j;
    while ((flag[j] == 1) && turn == j) ;
    critical section
    flag[i] = 0;
    remainder section
} while (1);
Solves the critical-section problem for two threads
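On modern hardware the plain-variable pseudocode above is not enough, because compilers and CPUs may reorder the stores and loads. A minimal runnable sketch, assuming C11 sequentially consistent atomics and pthreads, is given below; `peterson_lock`, `peterson_unlock` and `run_peterson_demo` are names invented for this example.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define ITERATIONS 20000

static atomic_int flag[2];
static atomic_int turn;
static long shared_counter;   /* plain variable, protected only by the lock */

/* Entry section for thread i (i is 0 or 1). */
void peterson_lock(int i) {
    int j = 1 - i;
    atomic_store(&flag[i], 1);          /* I want to enter */
    atomic_store(&turn, j);             /* but the other thread may go first */
    while (atomic_load(&flag[j]) == 1 && atomic_load(&turn) == j)
        sched_yield();                  /* busy wait, politely */
}

/* Exit section for thread i. */
void peterson_unlock(int i) {
    atomic_store(&flag[i], 0);
}

static void *worker(void *arg) {
    int i = *(int *)arg;
    for (int k = 0; k < ITERATIONS; k++) {
        peterson_lock(i);
        shared_counter++;               /* critical section */
        peterson_unlock(i);
    }
    return NULL;
}

/* Runs both threads and returns the final counter value. */
long run_peterson_demo(void) {
    pthread_t threads[2];
    int ids[2] = {0, 1};
    atomic_store(&flag[0], 0);
    atomic_store(&flag[1], 0);
    atomic_store(&turn, 0);
    shared_counter = 0;
    for (int i = 0; i < 2; i++)
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    for (int i = 0; i < 2; i++)
        pthread_join(threads[i], NULL);
    return shared_counter;
}
```

If mutual exclusion holds, no increment is lost and the counter ends at twice the iteration count.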
1111
Dekker's Algorithm (1965)
This is the first correct solution proposed for the two-thread (two-process) case
Originally developed by Dekker in a different context, it was applied to the critical section problem by Dijkstra
Dekker adds the idea of a favored thread and allows access to either thread when the request is uncontested
When there is a conflict, one thread is favored, and the priority reverses after successful execution of the critical section
1212
Dekker's Algorithm (contd)
Shared variables - initialization:
int flag[2]; flag[0] = flag[1] = 0;
int turn = 0;
Thread Ti
do {
    flag[i] = 1;
    while (flag[j]) {
        if (turn == j) {
            flag[i] = 0;
            while (turn == j) ;
            flag[i] = 1;
        }
    }
    critical section
    turn = j;
    flag[i] = 0;
    remainder section
} while (1);
1313
Bakery Algorithm (Lamport 1974)
A solution to the critical-section problem for n threads
Before entering its CS, a thread receives a number
Holder of the smallest number enters the CS
If threads Ti and Tj receive the same number: if i < j, then Ti is served first; else Tj is served first
The numbering scheme generates numbers in monotonically non-decreasing order
- i.e., 1, 1, 1, 2, 3, 3, 3, 4, 4, 5
1414
Bakery Algorithm
Notation: "<" establishes lexicographical order among 2-tuples (ticket #, thread id #)
(a,b) < (c,d) if a < c, or if a == c and b < d
max (a0, ..., an-1) = k such that k >= ai for i = 0, ..., n - 1
Shared data
int choosing[n];
int number[n]; - the ticket
Data structures are initialized to 0
1515
Bakery Algorithm
do {
    choosing[i] = 1;
    number[i] = max(number[0], number[1], ..., number[n-1]) + 1;
    choosing[i] = 0;
    for (j = 0; j < n; j++) {
        while (choosing[j] == 1) ;
        while ((number[j] != 0) && ((number[j], j) "<" (number[i], i))) ;
    }
    critical section
    number[i] = 0;
    remainder section
} while (1);
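The bakery pseudocode can be turned into a small runnable sketch. The version below assumes C11 sequentially consistent atomics and pthreads, fixes n at 3, and uses invented names (`ticket_less`, `bakery_lock`, `bakery_unlock`, `run_bakery_demo`); it illustrates the algorithm and is not a production lock.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define N 3
#define ROUNDS 5000

static atomic_int choosing[N];
static atomic_int number[N];      /* the tickets; 0 means "not interested" */
static long shared_total;         /* protected only by the bakery lock */

/* (a,b) "<" (c,d): lexicographic order on (ticket, thread id). */
int ticket_less(int a, int b, int c, int d) {
    return a < c || (a == c && b < d);
}

void bakery_lock(int i) {
    atomic_store(&choosing[i], 1);
    int max = 0;
    for (int j = 0; j < N; j++) {
        int n = atomic_load(&number[j]);
        if (n > max)
            max = n;
    }
    atomic_store(&number[i], max + 1);
    atomic_store(&choosing[i], 0);

    for (int j = 0; j < N; j++) {
        while (atomic_load(&choosing[j]) == 1)
            sched_yield();              /* wait until Tj has its ticket */
        int nj;
        while ((nj = atomic_load(&number[j])) != 0 &&
               ticket_less(nj, j, atomic_load(&number[i]), i))
            sched_yield();              /* Tj holds a smaller ticket */
    }
}

void bakery_unlock(int i) {
    atomic_store(&number[i], 0);
}

static void *worker(void *arg) {
    int i = *(int *)arg;
    for (int k = 0; k < ROUNDS; k++) {
        bakery_lock(i);
        shared_total++;                 /* critical section */
        bakery_unlock(i);
    }
    return NULL;
}

/* Runs N threads and returns the final total. */
long run_bakery_demo(void) {
    pthread_t threads[N];
    int ids[N];
    shared_total = 0;
    for (int i = 0; i < N; i++) {
        atomic_store(&choosing[i], 0);
        atomic_store(&number[i], 0);
        ids[i] = i;
    }
    for (int i = 0; i < N; i++)
        pthread_create(&threads[i], NULL, worker, &ids[i]);
    for (int i = 0; i < N; i++)
        pthread_join(threads[i], NULL);
    return shared_total;
}
```

Two threads can draw the same ticket number; the thread id in `ticket_less` breaks that tie, exactly as in the tuple ordering defined in the notation.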
1616
Mutual Exclusion - Hardware Support
Interrupt Disabling
- Concurrent threads cannot overlap on a uniprocessor
- A thread runs until it performs a system call or an interrupt occurs
Special Atomic Machine Instructions
- Test and Set instruction - read & write a memory location
- Exchange instruction - swap register and memory location
Problems with Machine-Instruction Approach
- Busy waiting
- Starvation is possible
- Deadlock is possible
1717
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean &target) {
    boolean rv = target;
    target = true;
    return rv;
}
1818
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti
do {
    while (TestAndSet(lock)) ;
    critical section
    lock = false;
    remainder section
} while (1);
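In C11 the hardware test-and-set instruction is exposed as `atomic_flag_test_and_set()`. The following sketch, assuming C11 atomics and pthreads, builds the spinlock from the slide with it; `spin_lock`, `spin_unlock` and `run_tas_demo` are illustrative names only.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stddef.h>

#define THREADS 4
#define ROUNDS 10000

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;   /* clear == lock is free */
static long shared_counter;                        /* protected by lock_flag */

void spin_lock(void) {
    /* test-and-set returns the previous value: true means someone
       else already holds the lock, so keep trying */
    while (atomic_flag_test_and_set(&lock_flag))
        sched_yield();
}

void spin_unlock(void) {
    atomic_flag_clear(&lock_flag);
}

static void *worker(void *arg) {
    (void)arg;
    for (int k = 0; k < ROUNDS; k++) {
        spin_lock();
        shared_counter++;                          /* critical section */
        spin_unlock();
    }
    return NULL;
}

/* Runs THREADS workers and returns the final counter value. */
long run_tas_demo(void) {
    pthread_t threads[THREADS];
    shared_counter = 0;
    for (int i = 0; i < THREADS; i++)
        pthread_create(&threads[i], NULL, worker, NULL);
    for (int i = 0; i < THREADS; i++)
        pthread_join(threads[i], NULL);
    return shared_counter;
}
```

The lock gives mutual exclusion but no fairness: nothing prevents one thread from winning the test-and-set repeatedly while another keeps spinning.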
1919
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
2020
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
2121
Semaphores
Semaphore S - integer variable
can only be accessed via two atomic operations
wait (S):
    while (S <= 0) ;
    S--;
signal (S):
    S++;
2222
Critical Section of n Threads
Shared data:
semaphore mutex; initially mutex = 1
Thread Ti
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
2323
Semaphore Implementation
Semaphores may suspend/resume threads
- Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
- suspend() suspends the thread that invokes it
- resume(T) resumes the execution of a blocked thread T
2424
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
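A blocking semaphore of this shape can be sketched on top of a pthread mutex and condition variable, with the condition variable's wait queue standing in for the list S.L. The `wakeups` counter is an addition not shown on the slide; it is needed because condition variables can wake spuriously. All names (`sema_init`, `sema_wait`, `sema_signal`, `run_semaphore_demo`) are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  queue;    /* plays the role of the list S.L */
    int             value;    /* negative: number of suspended threads */
    int             wakeups;  /* resumes granted but not yet consumed */
} semaphore;

void sema_init(semaphore *s, int initial) {
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->queue, NULL);
    s->value = initial;
    s->wakeups = 0;
}

void sema_wait(semaphore *s) {
    pthread_mutex_lock(&s->lock);
    s->value--;
    if (s->value < 0) {
        do {
            pthread_cond_wait(&s->queue, &s->lock);   /* suspend() */
        } while (s->wakeups < 1);
        s->wakeups--;
    }
    pthread_mutex_unlock(&s->lock);
}

void sema_signal(semaphore *s) {
    pthread_mutex_lock(&s->lock);
    s->value++;
    if (s->value <= 0) {
        s->wakeups++;
        pthread_cond_signal(&s->queue);               /* resume(T) */
    }
    pthread_mutex_unlock(&s->lock);
}

static semaphore slots;
static atomic_int inside, max_inside;

static void *worker(void *arg) {
    (void)arg;
    for (int k = 0; k < 2000; k++) {
        sema_wait(&slots);
        int now = atomic_fetch_add(&inside, 1) + 1;
        int seen = atomic_load(&max_inside);
        while (now > seen &&
               !atomic_compare_exchange_weak(&max_inside, &seen, now)) {
            /* retry until the maximum is recorded */
        }
        atomic_fetch_sub(&inside, 1);
        sema_signal(&slots);
    }
    return NULL;
}

/* Six threads share a semaphore initialized to 2; returns the largest
   number of threads ever observed inside at the same time. */
int run_semaphore_demo(void) {
    pthread_t threads[6];
    sema_init(&slots, 2);
    atomic_store(&inside, 0);
    atomic_store(&max_inside, 0);
    for (int i = 0; i < 6; i++)
        pthread_create(&threads[i], NULL, worker, NULL);
    for (int i = 0; i < 6; i++)
        pthread_join(threads[i], NULL);
    return atomic_load(&max_inside);
}
```

Waiting threads sleep inside `pthread_cond_wait()` instead of spinning, which is the point of the suspend/resume formulation.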
2525
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code:
Ti                  Tj
A                   wait(flag)
signal(flag)        B
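The A-before-B pattern can be tried with an unnamed POSIX semaphore. This sketch assumes a platform where `sem_init()` is supported (Linux, for instance); the thread functions and `run_order_demo` are names made up for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>
#include <string.h>

static sem_t flag;          /* initialized to 0, so B has to wait for A */
static char order[3];       /* records the order in which A and B ran */
static int next_slot;

static void *thread_i(void *arg) {
    (void)arg;
    order[next_slot++] = 'A';       /* statement A */
    sem_post(&flag);                /* signal(flag) */
    return NULL;
}

static void *thread_j(void *arg) {
    (void)arg;
    while (sem_wait(&flag) != 0) {  /* wait(flag); retry if interrupted */
    }
    order[next_slot++] = 'B';       /* statement B */
    return NULL;
}

/* Starts Tj before Ti on purpose and returns the recorded order. */
const char *run_order_demo(void) {
    pthread_t ti, tj;
    sem_init(&flag, 0, 0);
    next_slot = 0;
    memset(order, 0, sizeof order);
    pthread_create(&tj, NULL, thread_j, NULL);
    pthread_create(&ti, NULL, thread_i, NULL);
    pthread_join(ti, NULL);
    pthread_join(tj, NULL);
    sem_destroy(&flag);
    return order;
}
```

Even though Tj is started first, it blocks in `sem_wait()` until Ti has executed A and posted the semaphore.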
2626
Two Types of Semaphores
Counting semaphore
- integer value can range over an unrestricted domain
Binary semaphore
- integer value can range only between 0 and 1
- can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
2727
Deadlock and Starvation
Deadlock
- two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0              T1
wait(S);        wait(Q);
wait(Q);        wait(S);
signal(S);      signal(Q);
signal(Q);      signal(S);
2828
Deadlock and Starvation
Starvation - indefinite blocking
- A thread may never be removed from the semaphore queue in which it is suspended
Solution (to the deadlock above)
- all code should acquire/release semaphores in the same order
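One common way to enforce a single acquisition order is to give every lock a rank and always take the lower rank first. The sketch below uses pthread mutexes in place of the binary semaphores S and Q (an assumption: for this purpose a semaphore initialized to 1 behaves the same way); `lock_both`, `unlock_both` and `run_ordering_demo` are hypothetical helpers.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

enum { S = 0, Q = 1, LOCK_COUNT = 2 };   /* the index doubles as the rank */

static pthread_mutex_t locks[LOCK_COUNT] = {
    PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER
};
static long updates;                     /* protected by holding both locks */

/* Acquires both locks, lower rank first, whatever order the caller names. */
void lock_both(int x, int y) {
    int first  = x < y ? x : y;
    int second = x < y ? y : x;
    pthread_mutex_lock(&locks[first]);
    pthread_mutex_lock(&locks[second]);
}

void unlock_both(int x, int y) {
    pthread_mutex_unlock(&locks[x]);
    pthread_mutex_unlock(&locks[y]);
}

static void *t0(void *arg) {
    (void)arg;
    for (int k = 0; k < 10000; k++) {
        lock_both(S, Q);                 /* asks for S then Q */
        updates++;
        unlock_both(S, Q);
    }
    return NULL;
}

static void *t1(void *arg) {
    (void)arg;
    for (int k = 0; k < 10000; k++) {
        lock_both(Q, S);                 /* asks for Q then S, still gets S first */
        updates++;
        unlock_both(Q, S);
    }
    return NULL;
}

/* Runs both threads to completion and returns the update count. */
long run_ordering_demo(void) {
    pthread_t a, b;
    updates = 0;
    pthread_create(&a, NULL, t0, NULL);
    pthread_create(&b, NULL, t1, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return updates;
}
```

With the naive order from the T0/T1 example the program could hang; with ranked acquisition neither thread can hold Q while waiting for S.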
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events. An event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
3131
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
- Chapter 7 - Process Synchronization
- Chapter 8 - Deadlocks
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
- Scheduled, preempted, etc., like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
- Interrupt dispatcher invokes the interrupt service routine
- ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread
3737
Trap dispatching
[Figure: each kind of trap is routed to its own dispatcher:
Interrupt -> Interrupt dispatcher -> Interrupt service routines
System service call -> System service dispatcher -> System services
HW exceptions, SW exceptions -> Exception dispatcher -> Exception handlers
Virtual address exceptions -> Virtual memory manager's pager]
Trap: processor's mechanism to capture executing thread
- Switch from user to kernel mode
- Interrupts - asynchronous
- Exceptions - synchronous
Interrupt Dispatching
[Figure: an interrupt arrives while user or kernel mode code is running; control passes, in kernel mode, to the interrupt dispatch routine, which calls the interrupt service routine]
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch
3939
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
4040
Interrupt Precedence via IRQLs (x86)
IRQL  Level
31    High
30    Power fail
29    Interprocessor Interrupt
28    Clock
      Profile & Synch (Srv 2003)
      Device 1
2     Dispatch/DPC
1     APC
0     Passive/Low
Device 1 up to High: hardware interrupts
APC and Dispatch/DPC: deferrable software interrupts
Passive/Low: normal thread execution
4141
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library - Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected to the same IRQL
4343
Predefined IRQLs
High - used when halting the system (via KeBugCheck())
Power fail - originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update system's clock, allocation of CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler - Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all, normal thread execution
4545
IRQLs on 64-bit Systems
IRQL  x64                               IA64
15    High/Profile                      High/Profile/Power
14    Interprocessor Interrupt/Power    Interprocessor Interrupt
13    Clock                             Clock
12    Synch (Srv 2003)                  Synch (MP only)
...   Device n                          Device n
4     ...                               Device 1
3     Device 1                          Correctable Machine Check
2     Dispatch/DPC                      Dispatch/DPC & Synch (UP only)
1     APC                               APC
0     Passive/Low                       Passive/Low
4646
Interrupt Prioritization amp Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device; this can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the I/O APIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
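The two deterministic mapping rules above are plain arithmetic. As a small worked example (the function names are invented, and this is not how the kernel or HAL exposes the mapping):

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_x86_up(int irq) {
    return 27 - irq;
}

/* x64 / IA64 rule from the slide: IRQL = IDT vector number / 16
   (integer division), i.e. the upper four bits of an 8-bit vector. */
int irql_from_vector(int vector) {
    return vector / 16;
}
```

For instance, IRQ 0 maps to IRQL 27 and IRQ 8 to IRQL 19 on an x86 UP system, while vector 0xD1 (209) falls into IRQL 13 on x64.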
4747
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when kernel is deep within many layers of code
- Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- A spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
5050
Kernel Synchronization
[Figure: Processor A and Processor B both want the DPC queue, which is guarded by a spinlock. Each processor runs the same sequence:]
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin
    remove DPC from queue    (critical section)
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
- Thus, the busy-wait loop requires no access to the memory bus
When releasing the lock, the flag of the first waiting processor in the queue is modified
- Exactly one processor is being signaled
- Pre-determined wait order
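The idea of spinning on a private flag and handing the lock to exactly one successor can be illustrated with an array-based queue lock (Anderson's lock). This is a simpler relative of the technique, not the Windows queued spinlock implementation; it assumes C11 atomics, 64-byte cache lines, and at most MAX_WAITERS contending threads, and every name in it is hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAITERS 8   /* power of two, >= number of contending threads */

typedef struct {
    atomic_uint next_ticket;
    /* one flag per queue position, each on its own cache line */
    struct { _Alignas(64) atomic_bool may_enter; } slot[MAX_WAITERS];
} queue_lock;

void queue_lock_init(queue_lock *l) {
    atomic_init(&l->next_ticket, 0);
    for (int i = 0; i < MAX_WAITERS; i++)
        atomic_init(&l->slot[i].may_enter, i == 0);  /* slot 0 starts open */
}

/* Takes a place in the queue and spins only on that place's flag.
   Returns the slot index, which must be passed to the release call. */
unsigned queue_lock_acquire(queue_lock *l) {
    unsigned me = atomic_fetch_add(&l->next_ticket, 1) % MAX_WAITERS;
    while (!atomic_load(&l->slot[me].may_enter))
        sched_yield();
    atomic_store(&l->slot[me].may_enter, false);     /* re-arm for reuse */
    return me;
}

/* Signals exactly one successor: the next position in the queue. */
void queue_lock_release(queue_lock *l, unsigned me) {
    atomic_store(&l->slot[(me + 1) % MAX_WAITERS].may_enter, true);
}

static queue_lock dpc_lock;
static long shared_counter;

static void *worker(void *arg) {
    (void)arg;
    for (int k = 0; k < 10000; k++) {
        unsigned me = queue_lock_acquire(&dpc_lock);
        shared_counter++;                            /* critical section */
        queue_lock_release(&dpc_lock, me);
    }
    return NULL;
}

/* Runs four workers and returns the final counter value. */
long run_queue_lock_demo(void) {
    pthread_t threads[4];
    queue_lock_init(&dpc_lock);
    shared_counter = 0;
    for (int i = 0; i < 4; i++)
        pthread_create(&threads[i], NULL, worker, NULL);
    for (int i = 0; i < 4; i++)
        pthread_join(threads[i], NULL);
    return shared_counter;
}
```

Waiters are served in ticket order, and a release touches only the successor's flag, which is what keeps the other spinning processors off the shared cache line.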
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU; allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once; "all": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include optional timeout argument
- Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object, and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
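Waiting on Dispatcher Objects - outside the kernel
[Figure: thread scheduling states (Initialized, Ready, Standby, Running, Waiting, Transition, Terminated). A thread object is created and initialized; when the thread waits on an object handle it moves to Waiting; when the object is set to the signaled state the wait is complete and the thread becomes ready to run again.]
Interaction with thread scheduling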
5858
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread - it may preempt the running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: owning thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: owning thread or other thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect: kernel resumes one or more waiting threads
6161
What signals an object (contd)
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Event
- nonsignaled -> signaled: a thread sets the event
- signaled -> nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: dedicated thread sets one event in the event pair
- signaled -> nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes waiting dedicated thread
Timer
- nonsignaled -> signaled: timer expires
- signaled -> nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: thread terminates
- signaled -> nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: DPC queue: queue head -> DPC object -> DPC object -> DPC object]
6666
Delivering a DPC
[Figure: interrupt dispatch table with levels High, Power failure, ..., Dispatch/DPC, APC, Low; a DPC queue holding DPC objects; the thread dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make a thread suspend/terminate itself (environment subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel (K) and user (U), each linking APC objects]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 & not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
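The four accounting rules amount to a small decision function. The sketch below only restates the slide; the enum and function names are made up, and the real clock ISR is of course not structured like this.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for where one clock tick is charged. */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* irql:      IRQL of the interrupted code when the clock fired
   in_dpc:    true if a DPC routine was being processed
   user_mode: true if the interrupted thread was running in user mode */
charge_target charge_for_tick(int irql, bool in_dpc, bool user_mode) {
    if (irql < 2)
        return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return CHARGE_INTERRUPT;           /* IRQL > 2: servicing an interrupt */
}
```

A tick that lands while a device ISR runs is charged to interrupt time and not to whichever thread happened to be interrupted.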
7373
IRQLs and CPU Time Accounting
Time spent servicing interrupts is NOT charged to the interrupted thread, so if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any threadprocess
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the # of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus, threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout: a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (Size, Type, State, Wait list head, object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type, Next link]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* runs exactly once per Enter, even if the body raises */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
– dwTimeout == 0: no wait, check whether object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for first signaled object or all objects
– Function returns index of first signaled object
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
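The "index of first signaled object" is encoded in the return value as an offset from WAIT_OBJECT_0. The following is a minimal, portable sketch of how a caller might decode it. The DEMO_ constants mirror the Windows SDK values, and classify_wait and wait_kind are hypothetical names introduced only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values mirror the Windows SDK constants; the DEMO_ prefix is an
   assumption so that the sketch compiles on any platform. */
#define DEMO_WAIT_OBJECT_0     0x00000000u
#define DEMO_WAIT_ABANDONED_0  0x00000080u
#define DEMO_WAIT_TIMEOUT      0x00000102u
#define DEMO_WAIT_FAILED       0xFFFFFFFFu

typedef enum { WR_SIGNALED, WR_ABANDONED, WR_TIMEOUT, WR_FAILED } wait_kind;

/* Classify the result of a wait that passed `count` handles.
   For WR_SIGNALED and WR_ABANDONED, *index receives the array position. */
wait_kind classify_wait(uint32_t ret, uint32_t count, uint32_t *index)
{
    assert(index != NULL && count >= 1 && count <= 64);
    if (ret == DEMO_WAIT_TIMEOUT) return WR_TIMEOUT;
    if (ret == DEMO_WAIT_FAILED)  return WR_FAILED;
    if (ret < DEMO_WAIT_OBJECT_0 + count) {
        *index = ret - DEMO_WAIT_OBJECT_0;
        return WR_SIGNALED;
    }
    if (ret >= DEMO_WAIT_ABANDONED_0 && ret < DEMO_WAIT_ABANDONED_0 + count) {
        *index = ret - DEMO_WAIT_ABANDONED_0;   /* abandoned mutex */
        return WR_ABANDONED;
    }
    return WR_FAILED;
}
```

On Windows the same arithmetic would be applied directly to the DWORD returned by WaitForMultipleObjects().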
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done = 0, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same pipe name
– Server can respond to client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
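Because every backslash in a pipe name must be doubled inside a C string literal, client code often builds the name with a small helper. A minimal sketch follows; make_pipe_name is a hypothetical helper, not part of the Windows API, and it only formats the string that would then be passed to CreateFile.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "\\<server>\pipe\<name>" into buf.
   Pass server == NULL for the local machine (".").
   Returns 0 on success, -1 if the name is empty or does not fit. */
int make_pipe_name(char *buf, size_t buflen, const char *server, const char *name)
{
    assert(buf != NULL && name != NULL);
    if (strlen(name) == 0) return -1;
    if (server == NULL) server = ".";
    int n = snprintf(buf, buflen, "\\\\%s\\pipe\\%s", server, name);
    return (n >= 0 && (size_t)n < buflen) ? 0 : -1;
}
```

A local client would call make_pipe_name(buf, sizeof buf, NULL, "status") and obtain \\.\pipe\status, the form the slide notes as the faster one.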
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}   /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
Locate a server via mailslot
[Figure: the application server acts as mailslot client and periodically broadcasts its status; application clients 0 … n act as mailslot servers, read the status and then connect to the server]
Mailslot servers (app client 0 … app client n)
    hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile( hMS, &ServStat ); /* connect to server */
Mailslot client (app server)
    while (…) { Sleep(…); hMS = CreateFile( "\\*\mailslot\status" ); … WriteFile( hMS, &StatInfo ); }
Message is sent periodically
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
Dekker's Algorithm (1965)
This is the first correct solution proposed for the two-thread (two-process) case
Originally developed by Dekker in a different context, it was applied to the critical section problem by Dijkstra
Dekker adds the idea of a favored thread and allows access to either thread when the request is uncontested
When there is a conflict, one thread is favored, and the priority reverses after successful execution of the critical section
Dekker's Algorithm (contd)
Shared variables - initialization
    int flag[2]; flag[0] = flag[1] = 0;
    int turn = 0;
Thread Ti
    do {
        flag[i] = 1;
        while (flag[j]) {
            if (turn == j) {
                flag[i] = 0;
                while (turn == j) ;   /* busy wait */
                flag[i] = 1;
            }
        }
        critical section
        turn = j;
        flag[i] = 0;
        remainder section
    } while (1);
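On modern compilers and processors the plain int variables of the slide are not sufficient, because loads and stores may be reordered. A minimal sketch of the same entry and exit protocol using C11 sequentially consistent atomics follows; the function names and the shared_counter variable are assumptions made for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

atomic_int flag[2];      /* flag[i] == 1: thread i wants to enter */
atomic_int turn;         /* thread that is favored on a conflict */
int shared_counter;      /* data protected by the critical section */

void dekker_enter(int i)
{
    int j = 1 - i;
    assert(i == 0 || i == 1);
    atomic_store(&flag[i], 1);
    while (atomic_load(&flag[j])) {
        if (atomic_load(&turn) == j) {
            atomic_store(&flag[i], 0);           /* back off */
            while (atomic_load(&turn) == j) { }  /* spin until favored */
            atomic_store(&flag[i], 1);
        }
    }
}

void dekker_leave(int i)
{
    atomic_store(&turn, 1 - i);   /* priority passes to the other thread */
    atomic_store(&flag[i], 0);
}

void add_to_counter(int i, int amount)
{
    dekker_enter(i);
    shared_counter += amount;     /* critical section */
    dekker_leave(i);
}
```

Each of the two threads would call add_to_counter() with its own index (0 or 1); uncontested calls enter immediately, as the slide describes.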
Bakery Algorithm (Lamport 1974)
A solution to the critical section problem for n threads
Before entering its CS, a thread receives a number
Holder of the smallest number enters the CS
If threads Ti and Tj receive the same number: if i < j then Ti is served first, else Tj is served first
The numbering scheme generates numbers in monotonically non-decreasing order
– i.e., 1, 1, 1, 2, 3, 3, 3, 4, 4, 5
Bakery Algorithm
Notation: "<" establishes lexicographical order among 2-tuples (ticket, thread id)
(a,b) < (c,d) if a < c, or if a == c and b < d
max(a0, …, an-1) is a number k such that k >= ai for i = 0, …, n – 1
Shared data
    int choosing[n];
    int number[n];   /* the ticket */
Data structures are initialized to 0
Bakery Algorithm
    do {
        choosing[i] = 1;
        number[i] = max(number[0], number[1], …, number[n-1]) + 1;
        choosing[i] = 0;
        for (j = 0; j < n; j++) {
            while (choosing[j] == 1) ;
            while ((number[j] != 0) && ((number[j], j) < (number[i], i))) ;
        }
        critical section
        number[i] = 0;
        remainder section
    } while (1);
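The tuple comparison and the ticket selection can be written out concretely. The following sketch uses C11 atomics so that the shared arrays are read and written in a well-defined order; N_THREADS and the function names are assumptions chosen for illustration.

```c
#include <assert.h>
#include <stdatomic.h>

#define N_THREADS 4   /* hypothetical thread count for this sketch */

atomic_int choosing[N_THREADS];
atomic_int number[N_THREADS];   /* the ticket; 0 means "not interested" */

/* (a,b) < (c,d) in lexicographic order on (ticket, thread id) */
int ticket_less(int a, int b, int c, int d)
{
    return a < c || (a == c && b < d);
}

void bakery_lock(int i)
{
    assert(i >= 0 && i < N_THREADS);
    atomic_store(&choosing[i], 1);
    int max = 0;
    for (int j = 0; j < N_THREADS; j++) {
        int t = atomic_load(&number[j]);
        if (t > max) max = t;
    }
    atomic_store(&number[i], max + 1);
    atomic_store(&choosing[i], 0);
    for (int j = 0; j < N_THREADS; j++) {
        while (atomic_load(&choosing[j])) { }   /* j is still picking a ticket */
        while (atomic_load(&number[j]) != 0 &&
               ticket_less(atomic_load(&number[j]), j,
                           atomic_load(&number[i]), i)) { }
    }
}

void bakery_unlock(int i)
{
    atomic_store(&number[i], 0);
}
```

Thread i brackets its critical section with bakery_lock(i) and bakery_unlock(i).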
Mutual Exclusion - Hardware Support
Interrupt Disabling
– Concurrent threads cannot overlap on a uniprocessor
– Thread will run until it performs a system call or an interrupt happens
Special Atomic Machine Instructions
– Test and Set Instruction: read & write a memory location
– Exchange Instruction: swap register and memory location
Problems with Machine-Instruction Approach
– Busy waiting
– Starvation is possible
– Deadlock is possible
Synchronization Hardware
Test and modify the content of a word atomically
    boolean TestAndSet(boolean &target) {
        boolean rv = target;
        target = true;
        return rv;
    }
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti
    do {
        while (TestAndSet(lock)) ;
        critical section
        lock = false;
        remainder section
    } while (1);
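In C11 the test-and-set primitive is available as atomic_flag_test_and_set(). The following minimal sketch wraps it into a spinlock; the tas_ names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { atomic_flag held; } tas_lock;

void tas_init(tas_lock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);
}

/* One test-and-set attempt: true if the caller now owns the lock. */
bool tas_try_acquire(tas_lock *l)
{
    return !atomic_flag_test_and_set(&l->held);
}

/* Busy-wait until the flag was observed clear and set by this caller. */
void tas_acquire(tas_lock *l)
{
    while (atomic_flag_test_and_set(&l->held)) { }
}

void tas_release(tas_lock *l)
{
    atomic_flag_clear(&l->held);
}
```

The busy-wait loop in tas_acquire() corresponds to the while (TestAndSet(lock)) line of the slide.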
Synchronization Hardware
Atomically swap two variables
    void Swap(boolean &a, boolean &b) {
        boolean temp = a;
        a = b;
        b = temp;
    }
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti
    int key;
    do {
        key = 1;
        while (key == 1) Swap(lock, key);
        critical section
        lock = 0;
        remainder section
    } while (1);
Semaphores
Semaphore S: integer variable
can only be accessed via two atomic operations
    wait(S):
        while (S <= 0) ;   /* busy wait */
        S--;
    signal(S):
        S++;
Critical Section of n Threads
Shared data
    semaphore mutex;   /* initially mutex = 1 */
Thread Ti
    do {
        wait(mutex);
        critical section
        signal(mutex);
        remainder section
    } while (1);
Semaphore Implementation
Semaphores may suspend/resume threads
– Avoid busy waiting
Define a semaphore as a record
    typedef struct {
        int value;
        struct thread *L;
    } semaphore;
Assume two simple operations
– suspend() suspends the thread that invokes it
– resume(T) resumes the execution of a blocked thread T
Implementation
Semaphore operations now defined as
    wait(S):
        S.value--;
        if (S.value < 0) {
            add this thread to S.L;
            suspend();
        }
    signal(S):
        S.value++;
        if (S.value <= 0) {
            remove a thread T from S.L;
            resume(T);
        }
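The bookkeeping above can be traced with a small single-threaded model. This is only a sketch: it does not suspend real threads, it records which thread ids would be suspended or resumed. The model_sem type, the msem_ functions and MAX_WAITERS are assumptions made for illustration.

```c
#include <assert.h>

#define MAX_WAITERS 8   /* hypothetical queue capacity */

typedef struct {
    int value;                 /* when negative, -value threads are queued */
    int queue[MAX_WAITERS];    /* FIFO of suspended thread ids (the list S.L) */
    int head, count;
} model_sem;

void msem_init(model_sem *s, int initial)
{
    s->value = initial;
    s->head = 0;
    s->count = 0;
}

/* wait(S): returns 1 if thread `tid` would be suspended, 0 if it proceeds. */
int msem_wait(model_sem *s, int tid)
{
    s->value--;
    if (s->value < 0) {
        assert(s->count < MAX_WAITERS);
        s->queue[(s->head + s->count) % MAX_WAITERS] = tid;
        s->count++;
        return 1;
    }
    return 0;
}

/* signal(S): returns the id of the thread to resume, or -1 if none waits. */
int msem_signal(model_sem *s)
{
    s->value++;
    if (s->value <= 0) {
        int tid = s->queue[s->head];
        s->head = (s->head + 1) % MAX_WAITERS;
        s->count--;
        return tid;
    }
    return -1;
}
```

With an initial value of 1, the first wait proceeds and later waits queue up; each signal then hands the semaphore to the longest-waiting thread. The FIFO order is a choice of this sketch, since the slide does not prescribe which thread is removed from S.L.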
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code
    Ti                  Tj
    A                   wait(flag)
    signal(flag)        B
Two Types of Semaphores
Counting semaphore
– integer value can range over an unrestricted domain
Binary semaphore
– integer value can range only between 0 and 1
– can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
Deadlock and Starvation
Deadlock
– two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
    T0                  T1
    wait(S)             wait(Q)
    wait(Q)             wait(S)
    signal(S)           signal(Q)
    signal(Q)           signal(S)
Deadlock and Starvation
Starvation: indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution
– all code should acquire/release semaphores in the same order
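One common way to enforce a single acquisition order is to sort the locks by address before taking them. The sketch below uses POSIX mutexes in place of the binary semaphores S and Q; that substitution, and the names order_pair, lock_both and unlock_both, are assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Put two distinct locks into one global order (here: by address). */
void order_pair(pthread_mutex_t *a, pthread_mutex_t *b,
                pthread_mutex_t **first, pthread_mutex_t **second)
{
    assert(a != b);
    if ((uintptr_t)a < (uintptr_t)b) { *first = a; *second = b; }
    else                             { *first = b; *second = a; }
}

/* Acquire both locks in the global order, whatever order the caller names them in. */
void lock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_t *first, *second;
    order_pair(a, b, &first, &second);
    pthread_mutex_lock(first);
    pthread_mutex_lock(second);
}

void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}
```

If T0 calls lock_both(&S, &Q) and T1 calls lock_both(&Q, &S), both threads take the same lock first, so the circular wait from the previous slide cannot form.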
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events An event acts much like a condition variable
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times, mostly in the "System" process
– Scheduled, preempted, etc. like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
– interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread, so-called "arbitrary thread context"
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
Trap dispatching
[Figure: trap handlers. Interrupts go to the interrupt dispatcher, which calls interrupt service routines; system service calls go to the system service dispatcher, which calls system services; HW and SW exceptions go to the exception dispatcher, which calls exception handlers; virtual address exceptions go to the virtual memory manager's pager]
Trap: processor's mechanism to capture executing thread
– Switch from user to kernel mode
– Interrupts: asynchronous
– Exceptions: synchronous
Interrupt Dispatching
[Figure: an interrupt arrives while user- or kernel-mode code is running; the interrupt dispatch routine and the ISR run in kernel mode]
Interrupt dispatch routine
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– this masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
Interrupt Precedence via IRQLs (x86)
    IRQL   Level                          Category
    31     High                           Hardware interrupts
    30     Power fail
    29     Interprocessor Interrupt
    28     Clock
    27     Profile & Synch (Srv 2003)
    3-26   Device 1 …
    2      Dispatch/DPC                   Deferrable software interrupts
    1      APC
    0      Passive/Low                    normal thread execution
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library, the Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected to one IRQL
Predefined IRQLs
High
– used when halting the system (via KeBugCheck())
Power fail
– originated in the NT design document, but has never been used
Inter-processor interrupt
– used to request action from other processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update system's clock, allocation of CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler: Kernprof.exe, Res Kit)
Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all, normal thread execution
IRQLs on 64-bit Systems
    IRQL   x64                               IA64
    15     High/Profile                      High/Profile/Power
    14     Interprocessor Interrupt/Power    Interprocessor Interrupt
    13     Clock                             Clock
    12     Synch (Srv 2003)                  Synch (MP only)
    4-11   Device … Device n                 Device 1 … Device n
    3      Device 1                          Correctable Machine Check
    2      Dispatch/DPC                      Dispatch/DPC & Synch (UP only)
    1      APC                               APC
    0      Passive/Low                       Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
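The two closed-form mappings above are simple integer arithmetic. A minimal sketch, assuming integer division for the 64-bit case; the function names and the input range checks are assumptions made for illustration, not kernel interfaces.

```c
#include <assert.h>

/* x86 uniprocessor: IRQL = 27 - IRQ.
   The range check only keeps the result non-negative in this sketch. */
int irql_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 27);
    return 27 - irq;
}

/* x64 and IA64: IRQL = IDT vector number / 16 (integer division),
   so 256 vectors fall into 16 IRQL bands of 16 vectors each. */
int irql_from_vector_64(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

For example, under these formulas IRQ 7 on an x86 uniprocessor maps to IRQL 20, and vector 0x41 on x64 maps to IRQL 4.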
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when kernel is deep within many layers of code
– Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
Kernel Synchronization
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
[Figure: processor A and processor B both run the code below against the same spinlock, which protects the shared DPC queue]
    do acquire_spinlock(DPC) until (SUCCESS)
    begin                          /* critical section */
        remove DPC from queue
    end
    release_spinlock(DPC)
Queued Spinlocks
Problem: checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires lock; other processors wait on a processor-local flag
– Thus, busy-wait loop requires no access to the memory bus
When releasing the lock, the flag of the first processor in the queue is modified
– Exactly one processor is being signaled
– Pre-determined wait order
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks: PFN or Page Frame Database lock
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks); doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once; "all": all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include optional timeout argument
– Waiting threads consume no CPU time
Waiting
Waitable objects include
– Events (may be auto-reset or manual reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object, and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
[Figure: interaction with thread scheduling. Thread states: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. "Create and initialize thread object" leads to Initialized; "Thread waits on an object handle" leads to Waiting; "Set object to signaled state" completes the wait and leads back to Ready]
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread: it may preempt the running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object?
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled -> signaled: owning thread releases mutex
– signaled -> nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled -> signaled: owning thread or other thread releases mutex
– signaled -> nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Semaphore
– nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
– signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
– Effect: kernel resumes one or more waiting threads
What signals an object? (contd)
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Event
– nonsignaled -> signaled: a thread sets the event
– signaled -> nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled -> signaled: dedicated thread sets one event in the event pair
– signaled -> nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes waiting dedicated thread
Timer
– nonsignaled -> signaled: timer expires
– signaled -> nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– nonsignaled -> signaled: thread terminates
– signaled -> nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: DPC queue, a queue head linked to a chain of DPC objects]
Delivering a DPC
[Figure: interrupt dispatch table with levels from High and Power failure down through Dispatch/DPC and APC to Low; DPC objects wait in the DPC queue until the dispatcher runs them]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
– Permission required for user-mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (environment subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object pointing to two APC queues, kernel (K) and user (U), each a list of APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
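Sketch (not from the original slides): the four accounting rules above read as one decision function. The names tick_charge and classify_tick are hypothetical and chosen only for illustration; they are not Windows interfaces.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for the four buckets named on the slide. */
enum tick_charge {
    CHARGE_THREAD_TIME,        /* IRQL < 2: thread's user or kernel time */
    CHARGE_DPC_TIME,           /* IRQL == 2 while a DPC is being processed */
    CHARGE_THREAD_KERNEL_TIME, /* IRQL == 2, no DPC being processed */
    CHARGE_INTERRUPT_TIME      /* IRQL > 2 */
};

/* Decide which counter one clock tick is charged to. */
enum tick_charge classify_tick(int irql, bool processing_dpc)
{
    assert(irql >= 0);
    if (irql < 2)
        return CHARGE_THREAD_TIME;
    if (irql == 2)
        return processing_dpc ? CHARGE_DPC_TIME : CHARGE_THREAD_KERNEL_TIME;
    return CHARGE_INTERRUPT_TIME;
}
```

The DPC flag only matters at IRQL 2; at every other level the IRQL alone decides the bucket.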
IRQLs and CPU Time Accounting
Time spent servicing interrupts is NOT charged to the interrupted thread, so if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (they are not really processes)
– Context switches shown for these are really counts of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout, a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: each thread object points to its WaitBlockList; every wait block holds List entry, Object, Thread, Key, Type and Next link fields and is also linked into the wait list head of the dispatcher object (Size, Type, State, Wait list head, object-type-specific data) it refers to]
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once per enter, even if the body raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
– dwTimeout == 0: no wait, check whether object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for first signaled object or all objects
– Return value identifies the first signaled object (WAIT_OBJECT_0 + its array index)
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
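Sketch (not from the original slides): the side effect of a satisfied wait can be pictured with a tiny single-threaded model of an event. The type model_event and the function model_try_wait are hypothetical; they imitate a zero-timeout wait and are not the Windows implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of an event object. */
struct model_event {
    bool manual_reset; /* true: stays signaled until reset explicitly */
    bool signaled;
};

/* Models a wait with timeout 0: returns true if the wait is satisfied.
   A satisfied wait on an auto-reset event resets it to nonsignaled. */
bool model_try_wait(struct model_event *ev)
{
    assert(ev != NULL);
    if (!ev->signaled)
        return false;
    if (!ev->manual_reset)
        ev->signaled = false; /* the side effect described above */
    return true;
}
```

A second wait on the auto-reset event fails until someone signals it again, while a manual-reset event keeps satisfying waits.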
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by more than 1 (up to the semaphore's maximum)
– ReleaseSemaphore() reports the old semaphore count through lpPreviousCount
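Sketch (not from the original slides): the counting rules above in a single-threaded model. The names model_sem, model_sem_try_wait and model_sem_release are hypothetical. The refusal to exceed the maximum mirrors the documented behavior of ReleaseSemaphore(), which leaves the count unchanged and fails in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model: current count plus the maximum fixed at creation. */
struct model_sem { long count; long max; };

/* Zero-timeout wait: succeeds and decrements only while count > 0. */
bool model_sem_try_wait(struct model_sem *s)
{
    assert(s != NULL);
    if (s->count <= 0)
        return false;
    s->count--;
    return true;
}

/* Adds n to the count and reports the old count through previous.
   Refuses (count unchanged) when n is not positive or the new count
   would exceed the maximum. */
bool model_sem_release(struct model_sem *s, long n, long *previous)
{
    assert(s != NULL);
    if (n <= 0 || s->count > s->max - n)
        return false;
    if (previous != NULL)
        *previous = s->count;
    s->count += n;
    return true;
}
```

With count 2 and maximum 3, two waits succeed, a third does not, and a release of 2 reports an old count of 0.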
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server under the same pipe name, each over its own instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on remote machine (. means local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence in one call
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence in one call
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if client connected between the CreateNamedPipe() call and ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (Windows 2000: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
Locate a server via mailslot
[Figure: application clients 0..n act as mailslot servers; the application server acts as mailslot client and sends its status message periodically]
App client 0 ... App client n (mailslot servers)
    hMS = CreateMailslot( "\\.\mailslot\status" ... );
    ReadFile( hMS, &ServStat ... );
    /* connect to server */
App server (mailslot client)
    while (...) {
        Sleep(...);
        hMS = CreateFile( "\\*\mailslot\status" ... );
        WriteFile( hMS, &StatInfo ... );
    }
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
Dekker's Algorithm (contd)
Shared variables - initialization
    int flag[2]; flag[0] = flag[1] = 0;
    int turn = 0;
Thread Ti
do {
    flag[i] = 1;
    while (flag[j]) {
        if (turn == j) {
            flag[i] = 0; while (turn == j) ; flag[i] = 1;
        }
    }
    critical section
    turn = j; flag[i] = 0;
    remainder section
} while (1);
Bakery Algorithm (Lamport 1974)
A solution to the critical-section problem for n threads
Before entering its CS, a thread receives a number
Holder of the smallest number enters the CS
If threads Ti and Tj receive the same number: if i < j, then Ti is served first; else Tj is served first
The numbering scheme generates numbers in monotonically non-decreasing order
– i.e., 1, 1, 1, 2, 3, 3, 3, 4, 4, 5
Bakery Algorithm
Notation: "<" establishes lexicographical order among 2-tuples (ticket, thread id)
$(a, b) < (c, d)$ if $a < c$, or if $a = c$ and $b < d$
$\max(a_0, \ldots, a_{n-1})$ is a number $k$ such that $k \ge a_i$ for $i = 0, \ldots, n-1$
Shared data
    int choosing[n];
    int number[n]; /* the ticket */
Data structures are initialized to 0
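Sketch (not from the original slides): the two pieces of notation above as plain C helpers. The function names ticket_less and max_ticket are hypothetical, chosen for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* (ticket_a, id_a) < (ticket_b, id_b) in lexicographical order:
   compare tickets first, break ties with the thread id. */
bool ticket_less(int ticket_a, int id_a, int ticket_b, int id_b)
{
    return ticket_a < ticket_b || (ticket_a == ticket_b && id_a < id_b);
}

/* max(a0, ..., a(n-1)) for a non-empty array of tickets. */
int max_ticket(const int *number, size_t n)
{
    assert(number != NULL && n > 0);
    int k = number[0];
    for (size_t i = 1; i < n; i++)
        if (number[i] > k)
            k = number[i];
    return k;
}
```

Because thread ids are unique, two distinct threads never compare equal, so exactly one of them goes first even when their tickets match.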
Bakery Algorithm
do {
    choosing[i] = 1;
    number[i] = max(number[0], number[1], ..., number[n-1]) + 1;
    choosing[i] = 0;
    for (j = 0; j < n; j++) {
        while (choosing[j] == 1) ;
        while ((number[j] != 0) && ((number[j], j) < (number[i], i))) ;
    }
    critical section
    number[i] = 0;
    remainder section
} while (1);
Mutual Exclusion - Hardware Support
Interrupt Disabling
– Concurrent threads cannot overlap on a uniprocessor
– Thread will run until it performs a system call or an interrupt happens
Special Atomic Machine Instructions
– Test and Set Instruction: read & write a memory location
– Exchange Instruction: swap register and memory location
Problems with Machine-Instruction Approach
– Busy waiting
– Starvation is possible
– Deadlock is possible
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean &target) {
    boolean rv = target;
    target = true;
    return rv;
}
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti
do {
    while (TestAndSet(lock)) ;
    critical section
    lock = false;
    remainder section
} while (1);
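Sketch (not from the original slides): the test-and-set loop expressed with the C11 atomic flag, which is one portable way to get an atomic TestAndSet. The type tas_lock and the functions tas_try_acquire, tas_acquire and tas_release are hypothetical names for illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical spinlock type built on the C11 atomic flag. */
typedef struct { atomic_flag held; } tas_lock;

#define TAS_LOCK_INIT { ATOMIC_FLAG_INIT }

/* One TestAndSet step: returns true if the caller acquired the lock. */
bool tas_try_acquire(tas_lock *l)
{
    assert(l != NULL);
    /* atomic_flag_test_and_set sets the flag and returns its old value */
    return !atomic_flag_test_and_set(&l->held);
}

/* Entry section: spin until the old value was "not held". */
void tas_acquire(tas_lock *l)
{
    while (!tas_try_acquire(l))
        ; /* busy wait */
}

/* Exit section: lock = false. */
void tas_release(tas_lock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);
}
```

The busy-wait loop is the cost noted on the hardware-support slide: a waiting thread burns CPU time until the holder releases the lock.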
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
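Sketch (not from the original slides): the swap-based entry section using C11 atomic_exchange as the atomic swap. The function names swap_try_acquire, swap_acquire and swap_release are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One pass of "key = 1; Swap(lock, key)": returns true when the old
   lock value was 0, i.e. the caller now holds the lock. */
bool swap_try_acquire(atomic_int *lock)
{
    assert(lock != NULL);
    int key = atomic_exchange(lock, 1); /* store 1, fetch old value */
    return key == 0;
}

/* Entry section: repeat the swap while key == 1. */
void swap_acquire(atomic_int *lock)
{
    while (!swap_try_acquire(lock))
        ; /* busy wait */
}

/* Exit section: lock = 0. */
void swap_release(atomic_int *lock)
{
    assert(lock != NULL);
    atomic_store(lock, 0);
}
```

Only the thread that swapped a 0 out of the lock word proceeds; every other thread swaps a 1 for a 1 and keeps spinning.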
Semaphores
Semaphore S: integer variable
can only be accessed via two atomic operations
wait(S):
    while (S <= 0) ;
    S--;
signal(S):
    S++;
Critical Section of n Threads
Shared data
    semaphore mutex; /* initially mutex = 1 */
Thread Ti
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
Semaphore Implementation
Semaphores may suspend/resume threads
– Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
– suspend() suspends the thread that invokes it
– resume(T) resumes the execution of a blocked thread T
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
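Sketch (not from the original slides): a single-threaded model of the blocking semaphore, useful for tracing how the value and the waiting list evolve. Thread ids stand in for suspended threads, a fixed-size FIFO stands in for the list L, and all names (model_semaphore, model_wait, model_signal, MAX_WAITERS) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_WAITERS 8

/* Model of the semaphore record: a value plus a FIFO of waiting ids. */
struct model_semaphore {
    int value;
    int waiters[MAX_WAITERS];
    size_t head, count;
};

/* wait(S): returns 1 if the caller proceeds, 0 if it would be suspended. */
int model_wait(struct model_semaphore *s, int thread_id)
{
    s->value--;
    if (s->value < 0) {
        assert(s->count < MAX_WAITERS);
        s->waiters[(s->head + s->count) % MAX_WAITERS] = thread_id;
        s->count++;
        return 0;
    }
    return 1;
}

/* signal(S): returns the id of the resumed thread, or -1 if none waited. */
int model_signal(struct model_semaphore *s)
{
    s->value++;
    if (s->value <= 0) {
        assert(s->count > 0);
        int resumed = s->waiters[s->head];
        s->head = (s->head + 1) % MAX_WAITERS;
        s->count--;
        return resumed;
    }
    return -1;
}
```

While the value is negative, its magnitude equals the number of suspended threads; the FIFO order is a choice made for this model, since the slide does not say which waiting thread is removed.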
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code
    Ti              Tj
    A               wait(flag)
    signal(flag)    B
Two Types of Semaphores
Counting semaphore
– integer value can range over an unrestricted domain
Binary semaphore
– integer value can range only between 0 and 1
– can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
Deadlock and Starvation
Deadlock
– two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
    T0              T1
    wait(S)         wait(Q)
    wait(Q)         wait(S)
    signal(S)       signal(Q)
    signal(Q)       signal(S)
Deadlock and Starvation
Starvation: indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution (for the deadlock between T0 and T1)
– all code should acquire/release semaphores in the same order
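Sketch (not from the original slides): one common way to enforce a single acquisition order is to give every lock a fixed rank and always take the lower rank first. The type ranked_lock and the function acquisition_order are hypothetical, and the ranking scheme itself is an assumption beyond what the slide states.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock descriptor: rank is its position in the global order. */
struct ranked_lock { int rank; };

/* Writes the two locks into first/second in the order they must be
   acquired, regardless of the order in which the caller names them. */
void acquisition_order(struct ranked_lock *a, struct ranked_lock *b,
                       struct ranked_lock **first,
                       struct ranked_lock **second)
{
    assert(a != NULL && b != NULL && a->rank != b->rank);
    if (a->rank < b->rank) {
        *first = a;
        *second = b;
    } else {
        *first = b;
        *second = a;
    }
}
```

With S ranked below Q, both T0 and T1 would wait on S first, so neither can hold one semaphore while waiting for the other in the opposite order.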
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects, which may act as mutexes and semaphores
Dispatcher objects may also provide events. An event acts much like a condition variable
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads) and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times, mostly in the "System" process
– Scheduled, preempted, etc., like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
– Interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread, so-called "arbitrary thread context"
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
Trap dispatching
Trap: processor's mechanism to capture executing thread
– Switch from user to kernel mode
– Interrupts: asynchronous
– Exceptions: synchronous
[Figure: trap handlers route each kind of trap to its handler]
– Interrupt -> interrupt dispatcher -> interrupt service routines
– System service call -> system service dispatcher -> system services
– HW exceptions, SW exceptions -> exception dispatcher -> exception handlers
– Virtual address exceptions -> virtual memory manager's pager
Interrupt Dispatching
[Figure: an interrupt arrives while user- or kernel-mode code is running; the interrupt dispatch routine and the interrupt service routine run in kernel mode]
Interrupt dispatch routine
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
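Interrupt Precedence via IRQLs (x86)
[Figure: x86 IRQL table, highest precedence first; only the level numbers that survive in the transcript are shown]
– 31 High
– 30 Power fail
– 29 Interprocessor interrupt
– 28 Clock
– Profile & Synch (Srv 2003)
– Device levels, down to Device 1
– 2 Dispatch/DPC
– 1 APC
– 0 Passive/Low
Levels above Dispatch/DPC are hardware interrupts; APC and Dispatch/DPC are deferrable software interrupts; Passive/Low is normal thread execution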
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library, Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected to the same IRQL
Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document but has never been used
Inter-processor interrupt
– Used to request action from other processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update system's clock, allocation of CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler, Kernprof.exe, Res Kit)
Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all, normal thread execution
IRQLs on 64-bit Systems
[Figure: IRQL tables for x64 and IA64, lowest precedence first]
x64
– 0 Passive/Low
– 1 APC
– 2 Dispatch/DPC
– Device 1 ... Device n
– 12 Synch (Srv 2003)
– 13 Clock
– 14 Interprocessor Interrupt / Power
– 15 High / Profile
IA64
– 0 Passive/Low
– 1 APC
– 2 Dispatch/DPC & Synch (UP only)
– 3 Correctable Machine Check
– 4 Device 1 ... Device n
– 12 Synch (MP only)
– 13 Clock
– 14 Interprocessor Interrupt
– 15 High / Profile / Power
Interrupt Prioritization & Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; can be configured with IntFilter utility in Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
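Sketch (not from the original slides): the two arithmetic IRQL rules above as C helpers. The function names are hypothetical, and the accepted input ranges (IRQ 0..27, vector 0..255) are assumptions made only so that the results stay within a plausible IRQL range.

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_from_irq_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 27); /* assumed range, keeps the result >= 0 */
    return 27 - irq;
}

/* x64 / IA64 rule from the slide: IRQL = IDT vector number / 16. */
int irql_from_vector_x64(int vector)
{
    assert(vector >= 0 && vector <= 255); /* assumed 256-entry IDT */
    return vector / 16; /* integer division: 16 vectors share one IRQL */
}
```

Under the second rule the upper four bits of the vector number select the IRQL, which yields the sixteen levels 0..15 listed for 64-bit systems.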
4747
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when the kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– A spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
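The one-owner-at-a-time rule can be illustrated in portable user-mode C. This is only a sketch using C11 `atomic_flag` as the single-bit data cell; the names `demo_spinlock`, `spin_try_acquire`, and so on are hypothetical and are not the kernel's KSPIN_LOCK routines.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a spinlock: one bit of state. */
typedef struct { atomic_flag held; } demo_spinlock;

static void spin_init(demo_spinlock *l) { atomic_flag_clear(&l->held); }

/* Atomic test-and-set: returns true if the lock was free and now belongs to the caller. */
static bool spin_try_acquire(demo_spinlock *l) {
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the test-and-set succeeds. */
static void spin_acquire(demo_spinlock *l) {
    while (!spin_try_acquire(l)) { /* spin */ }
}

static void spin_release(demo_spinlock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Single-threaded walk-through: a held lock cannot be acquired a second time. */
static int spinlock_demo(void) {
    demo_spinlock l;
    spin_init(&l);
    spin_acquire(&l);
    assert(!spin_try_acquire(&l));  /* already owned */
    spin_release(&l);
    assert(spin_try_acquire(&l));   /* free again */
    spin_release(&l);
    return 0;
}
```

Calling `spinlock_demo()` exercises acquire, a failed second acquire, and release on one thread; real contention would require several threads spinning in `spin_acquire`.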
5050
Kernel Synchronization
Processor A and Processor B each execute:
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin    (critical section)
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
The first processor acquires the lock; other processors wait on a processor-local flag
– Thus the busy-wait loop requires no access to the memory bus
When the lock is released, the flag of the first waiting processor is modified
– Exactly one processor is signaled
– Pre-determined wait order
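The queueing idea can be sketched with the well-known MCS lock, written here with C11 atomics. This is an assumption-laden illustration of "wait on a local flag, hand off to exactly one successor", not the Windows implementation; `qnode`, `qlock`, `qlock_acquire`, and `qlock_release` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One queue entry per waiting processor; must_wait is the processor-local flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool must_wait;
} qnode;

typedef struct { _Atomic(qnode *) tail; } qlock;

static void qlock_acquire(qlock *l, qnode *me) {
    atomic_store(&me->next, (qnode *)NULL);
    atomic_store(&me->must_wait, true);
    qnode *prev = atomic_exchange(&l->tail, me);    /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);              /* link behind predecessor */
        while (atomic_load(&me->must_wait)) { }     /* spin on our own flag only */
    }
}

static void qlock_release(qlock *l, qnode *me) {
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        /* Nobody queued behind us: try to mark the lock free. */
        if (atomic_compare_exchange_strong(&l->tail, &expected, (qnode *)NULL))
            return;
        /* A successor is in the middle of linking itself; wait for the link. */
        while ((succ = atomic_load(&me->next)) == NULL) { }
    }
    atomic_store(&succ->must_wait, false);          /* signal exactly one waiter */
}

/* Single-threaded walk-through of the uncontended path. */
static int qlock_demo(void) {
    qlock l;
    qnode a;
    atomic_init(&l.tail, (qnode *)NULL);
    qlock_acquire(&l, &a);
    assert(atomic_load(&l.tail) == &a);   /* a is the only queue entry */
    qlock_release(&l, &a);
    assert(atomic_load(&l.tail) == NULL); /* queue is empty again */
    return 0;
}
```

Because each waiter spins only on its own `must_wait` field, the release writes one flag and wakes one processor, in the order the processors joined the queue.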
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction in the use of spinlocks & the length of time they are held
– Scheduling database is now per-CPU
  Allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated
  Charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks)
  Doesn't use spinlocks when there is no contention
  Used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once
  "All": all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include an optional timeout argument
– Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
– Events (may be auto-reset or manual-reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
Interaction with thread scheduling
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Labeled transitions: "Create and initialize thread object", "Thread waits on an object handle", "Wait is complete", "Set object to signaled state"]
5858
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher reschedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object?
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
– Nonsignaled to signaled: owning thread releases mutex
– Signaled to nonsignaled: resumed thread acquires mutex
– Effect on waiting threads: kernel resumes one waiting thread
Mutex (exported to user mode)
– Nonsignaled to signaled: owning thread or other thread releases mutex
– Signaled to nonsignaled: resumed thread acquires mutex
– Effect on waiting threads: kernel resumes one waiting thread
Semaphore
– Nonsignaled to signaled: one thread releases the semaphore, freeing a resource
– Signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
– Effect on waiting threads: kernel resumes one or more waiting threads
6161
What signals an object? (contd.)
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Event
– Nonsignaled to signaled: a thread sets the event
– Signaled to nonsignaled: kernel resumes one or more threads
– Effect on waiting threads: kernel resumes one or more waiting threads
Event pair
– Nonsignaled to signaled: dedicated thread sets one event in the event pair
– Signaled to nonsignaled: kernel resumes the other dedicated thread
– Effect on waiting threads: kernel resumes waiting dedicated thread
Timer
– Nonsignaled to signaled: timer expires
– Signaled to nonsignaled: a thread (re)initializes the timer
– Effect on waiting threads: kernel resumes all waiting threads
Thread
– Nonsignaled to signaled: thread terminates
– Signaled to nonsignaled: a thread reinitializes the thread object
– Effect on waiting threads: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
  See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: DPC queue, a queue head followed by a chain of DPC objects]
6666
Delivering a DPC
1. Timer expires; kernel queues a DPC that will release all waiting threads; kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table (High, Power failure, Dispatch/DPC, APC, Low), DPC queue holding DPC objects, and the dispatcher]
6767
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when the thread is in an alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two queues of APC objects, labeled K and U]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 & not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
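The accounting rules above amount to a small decision function. The sketch below only restates them in C; the enum, the function name `charge_clock_tick`, and its parameters are hypothetical and do not correspond to kernel symbols.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical buckets a clock tick can be charged to. */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/* irql:      IRQL at which the clock interrupt arrived
   in_dpc:    true if a DPC routine was executing
   user_mode: true if the interrupted thread was running user-mode code */
static charge_t charge_clock_tick(int irql, bool in_dpc, bool user_mode) {
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}

/* Checks one case per rule. */
static int accounting_demo(void) {
    assert(charge_clock_tick(0, false, true)  == CHARGE_THREAD_USER);
    assert(charge_clock_tick(1, false, false) == CHARGE_THREAD_KERNEL);
    assert(charge_clock_tick(2, true,  false) == CHARGE_DPC);
    assert(charge_clock_tick(2, false, false) == CHARGE_THREAD_KERNEL);
    assert(charge_clock_tick(5, false, false) == CHARGE_INTERRUPT);
    return 0;
}
```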
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the cause must be interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (they are not really processes)
– Context switches shown for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add the Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– Some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "Signaled" vs. "nonsignaled"
– When signaled, a wait on the object is satisfied
– Different object types differ in terms of what changes their state
– Wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects, each with a WaitBlockList pointing to a chain of wait blocks (fields: List entry, Object, Thread, Key, Type, Next link); each wait block is also linked into the wait list head of a dispatcher object (fields: Size, Type, State, Wait list head, object-type-specific data)]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once, even if the guarded code raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads:
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
– hObject specifies the kernel object
– dwTimeout specifies the wait time in msec
  dwTimeout == 0: no wait, check whether object is signaled
  dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for the first signaled object or for all objects
  When waiting for any object, the function returns WAIT_OBJECT_0 plus the index of the first signaled object
Side effects
– Mutexes, auto-reset events, and waitable timers will be reset to non-signaled state after completing wait functions
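Interpreting the DWORD returned by the wait functions is a common source of bugs, so a small decoder may help. The sketch below is plain C that repeats the constant values from the Windows SDK headers so that it compiles without windows.h; the enum `wait_kind` and the function `classify_wait` are hypothetical helpers, not part of the API.

```c
#include <assert.h>
#include <stdint.h>

/* Values as defined in the Windows SDK, repeated so the sketch builds anywhere. */
#define WAIT_OBJECT_0     0x00000000u
#define WAIT_ABANDONED_0  0x00000080u
#define WAIT_TIMEOUT      0x00000102u
#define WAIT_FAILED       0xFFFFFFFFu

typedef enum { WR_SIGNALED, WR_ABANDONED, WR_TIMEOUT, WR_FAILED, WR_OTHER } wait_kind;

/* ret:   value returned by the wait call
   count: number of handles passed (1 for WaitForSingleObject)
   index: receives the array position for WR_SIGNALED / WR_ABANDONED */
static wait_kind classify_wait(uint32_t ret, uint32_t count, uint32_t *index) {
    if (ret == WAIT_FAILED)  return WR_FAILED;
    if (ret == WAIT_TIMEOUT) return WR_TIMEOUT;
    if (ret < WAIT_OBJECT_0 + count) {
        *index = ret - WAIT_OBJECT_0;
        return WR_SIGNALED;
    }
    if (ret >= WAIT_ABANDONED_0 && ret < WAIT_ABANDONED_0 + count) {
        *index = ret - WAIT_ABANDONED_0;   /* an abandoned mutex was acquired */
        return WR_ABANDONED;
    }
    return WR_OTHER;
}

static int wait_decode_demo(void) {
    uint32_t idx = 99;
    assert(classify_wait(WAIT_OBJECT_0 + 2, 4, &idx) == WR_SIGNALED && idx == 2);
    assert(classify_wait(WAIT_ABANDONED_0 + 1, 4, &idx) == WR_ABANDONED && idx == 1);
    assert(classify_wait(WAIT_TIMEOUT, 4, &idx) == WR_TIMEOUT);
    assert(classify_wait(WAIT_FAILED, 4, &idx) == WR_FAILED);
    return 0;
}
```

With bWaitAll == TRUE, a value in the WAIT_OBJECT_0 range means that all objects were signaled, so the index carries no extra information in that case.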
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases the semaphore count by 1
– ReleaseSemaphore() may increment the count by any value
– ReleaseSemaphore() reports the old semaphore count through lpPreviousCount
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
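Since pthreads offers no direct counterpart, a manual-reset event is usually emulated with a mutex, a condition variable, and a flag. The following is a minimal sketch under that assumption; the `mr_event` type and its functions are hypothetical names and make no attempt to reproduce PulseEvent() or timeouts.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical manual-reset event: stays signaled until explicitly reset. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} mr_event;

static void mr_event_init(mr_event *e, bool initial_state) {
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent() on a manual-reset event: release every waiter. */
static void mr_event_set(mr_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(): later waiters block again. */
static void mr_event_reset(mr_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; does not consume the signal. */
static void mr_event_wait(mr_event *e) {
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

static bool mr_event_is_set(mr_event *e) {
    pthread_mutex_lock(&e->lock);
    bool s = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return s;
}

/* Single-threaded walk-through: a set event satisfies any number of waits. */
static int mr_event_demo(void) {
    mr_event e;
    mr_event_init(&e, false);
    mr_event_set(&e);
    mr_event_wait(&e);               /* returns immediately */
    mr_event_wait(&e);               /* still signaled: manual reset */
    assert(mr_event_is_set(&e));
    mr_event_reset(&e);
    assert(!mr_event_is_set(&e));
    pthread_cond_destroy(&e.cond);
    pthread_mutex_destroy(&e.lock);
    return 0;
}
```

The flag is what distinguishes this from a bare condition variable: a thread that arrives after `mr_event_set()` still sees the signaled state instead of missing the wakeup.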
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– Location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd.)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain
107107
Locate a server via mailslot
Mailslot servers (app client 0 ... app client n):
    hMS = CreateMailslot( "\\.\mailslot\status" );
    ReadFile( hMS, &ServStat );
    /* connect to server */
Mailslot client (app server):
    while () {
        Sleep();
        hMS = CreateFile( "\\*\mailslot\status" );
        WriteFile( hMS, &StatInfo );
    }
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max. message length 400 bytes
– Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life 意念改变生活
1313
Bakery Algorithm (Lamport 1974)
A solution to the critical-section problem for n threads
Before entering its CS, a thread receives a number
Holder of the smallest number enters the CS
If threads Ti and Tj receive the same number: if i < j, then Ti is served first, else Tj is served first
The numbering scheme generates numbers in monotonically non-decreasing order
– i.e., 1, 1, 1, 2, 3, 3, 3, 4, 4, 5
1414
Bakery Algorithm
Notation: "<" establishes lexicographical order among 2-tuples (ticket #, thread id #)
(a,b) < (c,d) if a < c, or if a == c and b < d
$\max(a_0, \ldots, a_{n-1})$ is a number $k$ such that $k \ge a_i$ for $i = 0, \ldots, n-1$
Shared data
int choosing[n];
int number[n]; - the ticket
Data structures are initialized to 0
1515
Bakery Algorithm
do {
    choosing[i] = 1;
    number[i] = max(number[0], number[1], …, number[n-1]) + 1;
    choosing[i] = 0;
    for (j = 0; j < n; j++) {
        while (choosing[j] == 1) ;
        while ((number[j] != 0) && ((number[j], j) "<" (number[i], i))) ;
    }
    critical section
    number[i] = 0;
    remainder section
} while (1);
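The pseudocode above can be turned into compilable C roughly as follows. This is a sketch that assumes a fixed thread count (`N_THREADS`, a hypothetical constant) and uses sequentially consistent C11 atomics so that the loads and stores are not reordered; `bakery_lock`, `bakery_unlock`, and `ticket_less` are illustrative names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define N_THREADS 4                  /* hypothetical fixed number of threads */

static atomic_int choosing[N_THREADS];   /* zero-initialized */
static atomic_int number[N_THREADS];     /* 0 means "not interested" */

/* Lexicographic order on (ticket, thread id): (a,b) < (c,d). */
static bool ticket_less(int a, int b, int c, int d) {
    return a < c || (a == c && b < d);
}

static void bakery_lock(int i) {
    atomic_store(&choosing[i], 1);
    int max = 0;
    for (int j = 0; j < N_THREADS; j++) {
        int t = atomic_load(&number[j]);
        if (t > max) max = t;
    }
    atomic_store(&number[i], max + 1);       /* take the next ticket */
    atomic_store(&choosing[i], 0);

    for (int j = 0; j < N_THREADS; j++) {
        while (atomic_load(&choosing[j]) == 1) { }   /* j is still picking */
        for (;;) {                                   /* wait while j goes first */
            int t = atomic_load(&number[j]);
            if (t == 0 || !ticket_less(t, j, atomic_load(&number[i]), i))
                break;
        }
    }
}

static void bakery_unlock(int i) {
    atomic_store(&number[i], 0);
}

/* Single-threaded walk-through of ticket assignment and ordering. */
static int bakery_demo(void) {
    assert(ticket_less(1, 3, 2, 0));     /* smaller ticket wins */
    assert(ticket_less(2, 0, 2, 1));     /* tie broken by thread id */
    assert(!ticket_less(2, 1, 2, 1));
    bakery_lock(2);
    assert(atomic_load(&number[2]) == 1);
    bakery_unlock(2);
    assert(atomic_load(&number[2]) == 0);
    return 0;
}
```

In a real program each thread would call `bakery_lock(i)` and `bakery_unlock(i)` with its own index `i` around the critical section.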
1616
Mutual Exclusion - Hardware Support
Interrupt Disabling
– Concurrent threads cannot overlap on a uniprocessor
– A thread will run until it performs a system call or an interrupt happens
Special Atomic Machine Instructions
– Test and Set instruction: read & write a memory location
– Exchange instruction: swap register and memory location
Problems with the Machine-Instruction Approach
– Busy waiting
– Starvation is possible
– Deadlock is possible
1717
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean &target) {
    boolean rv = target;
    target = true;
    return rv;
}
1818
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti
do {
    while (TestAndSet(lock)) ;
    critical section
    lock = false;
    remainder section
} while (1);
1919
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
2020
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
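In C11, the atomic Swap of the slide corresponds closely to `atomic_exchange`, which stores a new value and returns the old one in a single atomic step. The sketch below rewrites the entry and exit sections on that assumption; `swap_lock` and `swap_unlock` are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>

/* Entry section: keep swapping 1 into the lock until the old value was 0. */
static void swap_lock(atomic_int *lock) {
    int key = 1;
    while (key == 1)
        key = atomic_exchange(lock, 1);   /* lock <- 1, key <- old lock value */
}

/* Exit section: mark the lock free. */
static void swap_unlock(atomic_int *lock) {
    atomic_store(lock, 0);
}

/* Single-threaded walk-through of the lock word's values. */
static int swap_lock_demo(void) {
    atomic_int lock;
    atomic_init(&lock, 0);
    swap_lock(&lock);
    assert(atomic_load(&lock) == 1);          /* held */
    assert(atomic_exchange(&lock, 1) == 1);   /* another attempt would keep spinning */
    swap_unlock(&lock);
    assert(atomic_load(&lock) == 0);          /* free */
    return 0;
}
```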
2121
Semaphores
Semaphore S – integer variable
Can only be accessed via two atomic operations
wait (S):
    while (S <= 0) ;
    S--;
signal (S):
    S++;
2222
Critical Section of n Threads
Shared data
semaphore mutex; initially mutex = 1
Thread Ti
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
2323
Semaphore Implementation
Semaphores may suspendresume threads
– Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
– suspend() suspends the thread that invokes it
– resume(T) resumes the execution of a blocked thread T
2424
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
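One way to try this out in user space is to let a pthread condition variable play the role of the waiting list together with suspend() and resume(). The sketch below makes that substitution explicit: `blocking_sem`, `bsem_wait`, and `bsem_signal` are hypothetical names, and the `wakeups` counter is an added detail that guards against spurious wakeups.

```c
#include <assert.h>
#include <pthread.h>

typedef struct {
    int value;               /* when negative: number of blocked threads */
    int wakeups;             /* resume() permits not yet consumed */
    pthread_mutex_t lock;    /* makes wait/signal atomic */
    pthread_cond_t  cond;    /* stands in for the list L plus suspend()/resume() */
} blocking_sem;

static void bsem_init(blocking_sem *s, int initial) {
    s->value = initial;
    s->wakeups = 0;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->cond, NULL);
}

static void bsem_wait(blocking_sem *s) {
    pthread_mutex_lock(&s->lock);
    s->value--;
    if (s->value < 0) {
        do {                                   /* suspend() */
            pthread_cond_wait(&s->cond, &s->lock);
        } while (s->wakeups < 1);
        s->wakeups--;
    }
    pthread_mutex_unlock(&s->lock);
}

static void bsem_signal(blocking_sem *s) {
    pthread_mutex_lock(&s->lock);
    s->value++;
    if (s->value <= 0) {                       /* someone is blocked */
        s->wakeups++;
        pthread_cond_signal(&s->cond);         /* resume(T) */
    }
    pthread_mutex_unlock(&s->lock);
}

/* Single-threaded walk-through that never blocks. */
static int bsem_demo(void) {
    blocking_sem s;
    bsem_init(&s, 2);
    bsem_wait(&s);
    bsem_wait(&s);
    assert(s.value == 0);     /* both resources taken, nobody waiting */
    bsem_signal(&s);
    assert(s.value == 1);
    bsem_signal(&s);
    assert(s.value == 2 && s.wakeups == 0);
    return 0;
}
```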
2525
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code
Ti                  Tj
A                   wait(flag)
signal(flag)        B
2626
Two Types of Semaphores
Counting semaphore
– Integer value can range over an unrestricted domain
Binary semaphore
– Integer value can range only between 0 and 1
– Can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
2727
Deadlock and Starvation
Deadlock
– Two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0              T1
wait(S)         wait(Q)
wait(Q)         wait(S)
signal(S)       signal(Q)
signal(Q)       signal(S)
2828
Deadlock and Starvation
Starvation – indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution
– All code should acquire/release semaphores in the same order
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events. An event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
3131
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc., like any other threads
3636
Getting Into Kernel Mode
3 Interrupts from external devicesndash interrupt dispatcher invokes the interrupt service routine
ndash ISR runs in the context of the interrupted thread so-called ldquoarbitrary thread contextrdquo
ndash ISR often requests the execution of a ldquoDPC routinerdquo which also runs in kernel mode
ndash Time not charged to interrupted thread
3737
Trap dispatching
[Figure: trap handlers. Interrupts go to the interrupt dispatcher, which calls interrupt service routines. System service calls go to the system service dispatcher, which calls system services. Hardware and software exceptions go to the exception dispatcher, which calls exception handlers. Virtual address exceptions go to the virtual memory manager's pager.]
Trap: processor's mechanism to capture an executing thread
- Switch from user to kernel mode
- Interrupts - asynchronous
- Exceptions - synchronous
Interrupt Dispatching
[Figure: an interrupt arrives while user- or kernel-mode code is running; the kernel-mode interrupt dispatch routine runs and calls the interrupt service routine. Note: no thread or process context switch.]
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
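The masking rule on this slide can be modelled in a few lines of C. This is a toy sketch, not kernel code: the type cpu_t and the helpers interrupt_deliverable, raise_irql, lower_irql and may_wait_or_page_fault are hypothetical names chosen for illustration. Only the rules themselves (equal and lower IRQLs are masked, no waits or page faults at DISPATCH_LEVEL or above) come from the slide.

```c
#include <assert.h>
#include <stdbool.h>

/* x86 IRQL values named on the slides. */
enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2, HIGH_LEVEL = 31 };

/* Hypothetical model of one processor: only its current IRQL matters here. */
typedef struct { int irql; } cpu_t;

/* An interrupt is taken only if its IRQL is strictly above the processor's
   current IRQL; equal and lower levels stay masked. */
bool interrupt_deliverable(const cpu_t *cpu, int source_irql)
{
    return source_irql > cpu->irql;
}

/* Servicing an interrupt raises the processor IRQL. The old level is
   returned so that the caller can restore it afterwards. */
int raise_irql(cpu_t *cpu, int new_irql)
{
    assert(new_irql >= cpu->irql && new_irql <= HIGH_LEVEL);
    int old = cpu->irql;
    cpu->irql = new_irql;
    return old;
}

/* After the ISR has run, the IRQL is lowered to the saved level. */
void lower_irql(cpu_t *cpu, int old_irql)
{
    assert(old_irql <= cpu->irql);
    cpu->irql = old_irql;
}

/* Waits and page faults are only legal below DISPATCH_LEVEL. */
bool may_wait_or_page_fault(const cpu_t *cpu)
{
    return cpu->irql < DISPATCH_LEVEL;
}
```

In this model a device interrupt at IRQL 5 that arrives while the processor is already at IRQL 5 stays pending until lower_irql drops the level again.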
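Interrupt Precedence via IRQLs (x86)
IRQL  Level                          Class
31    High                           Hardware interrupts
30    Power fail                     Hardware interrupts
29    Interprocessor interrupt       Hardware interrupts
28    Clock                          Hardware interrupts
      Profile & Synch (Srv 2003)     Hardware interrupts
      Device 1 (and higher devices)  Hardware interrupts
2     Dispatch/DPC                   Deferrable software interrupts
1     APC                            Deferrable software interrupts
0     Passive/Low                    Normal thread execution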
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library, Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected with an IRQL
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update the system's clock and to allocate CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler, Kernprof.exe, Res Kit)
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all, normal thread execution
IRQLs on 64-bit Systems
IRQL  x64                             IA64
15    High/Profile                    High/Profile/Power
14    Interprocessor interrupt/Power  Interprocessor interrupt
13    Clock                           Clock
12    Synch (Srv 2003)                Synch (MP only)
...   Device n                        Device n
4     ...                             Device 1
3     Device 1                        Correctable Machine Check
2     Dispatch/DPC                    Dispatch/DPC & Synch (UP only)
1     APC                             APC
0     Passive/Low                     Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows:
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device. Can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source. Processors are assigned round robin for each interrupt vector
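The two deterministic mapping rules, IRQL = 27 - IRQ on x86 uniprocessor systems and IRQL = IDT vector number / 16 (integer division) on x64 and IA64, can be written out as a small worked example. The function names are hypothetical, and the accepted input ranges (16 IRQ lines, 256 IDT vectors) are assumptions made for the sketch rather than something the slide states.

```c
#include <assert.h>

/* x86 uniprocessor rule: IRQL = 27 - IRQ.
   Assumption: IRQ numbers 0..15, as on a classic dual-PIC PC. */
int irql_from_irq_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 15);
    return 27 - irq;
}

/* x64 / IA64 rule: IRQL = IDT vector number / 16 (integer division).
   Assumption: the IDT has 256 vectors, giving IRQLs 0..15. */
int irql_from_vector_x64(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

Under these assumptions IRQ 0 maps to IRQL 27 and IRQ 15 to IRQL 12 on x86 UP, while vector 0x41 maps to IRQL 4 on x64.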
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
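The "single data cell plus atomic test-and-modify" idea can be sketched in portable C11. This is not the Windows KSPIN_LOCK implementation; spinlock_t and the spin_* functions are hypothetical names, and atomic_flag from the standard header stdatomic.h stands in for the atomic test-and-set operation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock cell: clear = free, set = owned. */
typedef struct { atomic_flag held; } spinlock_t;

void spin_init(spinlock_t *l)
{
    atomic_flag_clear(&l->held);
}

/* One atomic test-and-set attempt. The previous value tells us whether
   the lock was free, so the function returns true if we now own it. */
bool spin_try_acquire(spinlock_t *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the test-and-set finds the lock free. */
void spin_acquire(spinlock_t *l)
{
    while (!spin_try_acquire(l))
        ;  /* spin */
}

/* Releasing the lock is a plain atomic clear of the single bit. */
void spin_release(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Every failed attempt in spin_acquire is another atomic write to the same memory cell, which is the bus contention that queued spinlocks are meant to avoid.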
Kernel Synchronization
[Figure: Processor A and Processor B both access the DPC queue; a spinlock guards the critical section.]
Each processor executes:
    do
        acquire_spinlock(DPC)
    until (SUCCESS)
    begin
        remove DPC from queue
    end
    release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
The first processor acquires the lock; other processors wait on a processor-local flag
- Thus the busy-wait loop requires no access to the memory bus
When the lock is released, the flag of the first waiting processor is modified
- Exactly one processor is being signaled
- Pre-determined wait order
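A queue-based lock of this kind can be sketched in C11 in the style of the classic MCS lock. The slides do not say which algorithm Windows uses internally, so this is a swapped-in textbook technique for illustration only; qnode_t, qlock_t and the qlock_* functions are hypothetical names. Each waiter spins on the locked flag in its own node, and release hands the lock to exactly one successor.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-processor queue node: a waiter spins only on its own flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool locked;
} qnode_t;

/* The lock itself is just a pointer to the tail of the waiter queue. */
typedef struct { _Atomic(qnode_t *) tail; } qlock_t;

void qlock_init(qlock_t *l)
{
    atomic_init(&l->tail, (qnode_t *)NULL);
}

bool qlock_is_free(qlock_t *l)
{
    return atomic_load(&l->tail) == NULL;
}

void qlock_acquire(qlock_t *l, qnode_t *me)
{
    atomic_store(&me->next, (qnode_t *)NULL);
    atomic_store(&me->locked, true);
    /* Append this node; the previous tail (if any) is our predecessor. */
    qnode_t *pred = atomic_exchange(&l->tail, me);
    if (pred == NULL)
        return;                        /* queue was empty: lock acquired */
    atomic_store(&pred->next, me);     /* link in behind the predecessor */
    while (atomic_load(&me->locked))
        ;                              /* spin on our own flag only */
}

void qlock_release(qlock_t *l, qnode_t *me)
{
    qnode_t *succ = atomic_load(&me->next);
    if (succ == NULL) {
        /* No visible successor: try to mark the queue empty. */
        qnode_t *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, (qnode_t *)NULL))
            return;
        /* A successor swapped the tail but has not linked itself yet. */
        while ((succ = atomic_load(&me->next)) == NULL)
            ;
    }
    atomic_store(&succ->locked, false);  /* signal exactly one waiter */
}
```

The wait order is the order in which nodes were appended to the tail, which matches the "pre-determined wait order" property named on the slide.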
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU. Allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks). Doesn't use spinlocks when there is no contention. Used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once. "All": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include optional timeout argument
- Waiting threads consume no CPU time
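The difference between a wait for "any" and a wait for "all" can be stated as a small predicate over the signaled states of the objects. This is a model of the rule only, not the Windows wait implementation; wait_satisfied and WAIT_NOT_SATISFIED are hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define WAIT_NOT_SATISFIED (-1)

/* Given the current signaled state of each object in a wait call,
   decide whether the wait can be resolved.
   - wait_all == false: returns the index of the first signaled object.
   - wait_all == true:  returns 0 only if every object is signaled now.
   Returns WAIT_NOT_SATISFIED if the thread must keep waiting. */
int wait_satisfied(const bool *signaled, size_t count, bool wait_all)
{
    assert(signaled != NULL && count > 0);
    if (wait_all) {
        for (size_t i = 0; i < count; i++)
            if (!signaled[i])
                return WAIT_NOT_SATISFIED;
        return 0;
    }
    for (size_t i = 0; i < count; i++)
        if (signaled[i])
            return (int)i;
    return WAIT_NOT_SATISFIED;
}
```

With the states {nonsignaled, signaled, nonsignaled}, a wait for "any" resolves with index 1, while a wait for "all" stays unresolved until all three objects are signaled at the same time.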
Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Figure: interaction with thread scheduling. Thread states: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. A thread object is created and initialized; when a thread waits on an object handle it enters the Waiting state; when the object is set to the signaled state, the wait is complete.]
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
Dispatcher reschedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object
For each dispatcher object: the system event that sets it to signaled, the event that returns it to nonsignaled, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
- Nonsignaled to signaled: owning thread releases mutex
- Signaled to nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- Nonsignaled to signaled: owning thread or other thread releases mutex
- Signaled to nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Semaphore
- Nonsignaled to signaled: one thread releases the semaphore, freeing a resource
- Signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect: kernel resumes one or more waiting threads
Event
- Nonsignaled to signaled: a thread sets the event
- Signaled to nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- Nonsignaled to signaled: dedicated thread sets one event in the event pair
- Signaled to nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes waiting dedicated thread
Timer
- Nonsignaled to signaled: timer expires
- Signaled to nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- Nonsignaled to signaled: thread terminates
- Signaled to nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU. DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
[Figure: DPC queue: a queue head followed by a linked list of DPC objects]
Delivering a DPC
[Figure: the interrupt dispatch table (IRQLs from High and Power failure down to Dispatch/DPC, APC and Low), the DPC queue, and the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
[Figure: a thread object with two APC queues, kernel (K) and user (U), each holding APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
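The four accounting rules form a simple decision function. The sketch below only restates them in C; charge_t and charge_clock_tick are hypothetical names, and the was_user_mode parameter is an assumption about how the choice between user and kernel time is made at IRQL below 2, which the slide does not spell out.

```c
#include <assert.h>
#include <stdbool.h>

/* Where the clock ISR charges one timer tick. */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/* irql:          processor IRQL when the clock interrupt arrived
   in_dpc:        true if a DPC routine was executing
   was_user_mode: true if the interrupted code ran in user mode
                  (assumed tie-breaker for the IRQL < 2 case) */
charge_t charge_clock_tick(int irql, bool in_dpc, bool was_user_mode)
{
    assert(irql >= 0 && irql <= 31);
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return was_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Because the decision is made only when the clock fires, anything that starts and finishes between two ticks is never seen by this function.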
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really counts of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- Some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
[Figure: dispatcher object layout: header with Size, Type, State and Wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
All dispatcher objects are in one of two states
- "Signaled" vs. "nonsignaled"
- When signaled, a wait on the object is satisfied
- Different object types differ in terms of what changes their state
- Wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: each thread object points to its WaitBlockList; each wait block holds List entry, Object, Thread, Key, Type and Next link; each dispatcher object (Size, Type, State, Wait list head, object-type-specific data) links the wait blocks of the threads waiting on it]
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* runs exactly once per Enter, also if the body raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads:
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
- dwTimeout == 0: no wait, check whether object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for first signaled object or all objects. Function returns index of first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex )
- pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value
- ReleaseSemaphore() returns old semaphore count
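The counting rules can be traced with a single-threaded model of the semaphore's count. This is not the Win32 API and does no blocking; sem_model_t and the sem_model_* functions are hypothetical names. The model follows the slide (signaled when count > 0, each wait decrements by 1, release adds any value and reports the old count) and additionally refuses a release that would push the count above the maximum, which is how ReleaseSemaphore() is documented to behave.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a semaphore's bookkeeping only (no threads). */
typedef struct { long count; long max; } sem_model_t;

bool sem_model_init(sem_model_t *s, long initial, long max)
{
    if (max <= 0 || initial < 0 || initial > max)
        return false;
    s->count = initial;
    s->max = max;
    return true;
}

/* A semaphore is signaled when its count is greater than zero. */
bool sem_model_signaled(const sem_model_t *s)
{
    return s->count > 0;
}

/* Models a wait with timeout 0: succeeds and decrements the count by 1
   only if the semaphore is currently signaled. */
bool sem_model_try_wait(sem_model_t *s)
{
    if (s->count <= 0)
        return false;
    s->count--;
    return true;
}

/* Models a release: adds release_count and reports the previous count.
   Fails without changing anything if the maximum would be exceeded. */
bool sem_model_release(sem_model_t *s, long release_count, long *previous)
{
    if (release_count <= 0 || s->count + release_count > s->max)
        return false;
    if (previous != NULL)
        *previous = s->count;
    s->count += release_count;
    return true;
}
```

Starting from count 2 with maximum 3, two waits succeed and a third fails; a release of 2 then reports a previous count of 0.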
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple, independent instances of a named pipe
- Several clients can communicate with a single server using the same instance
- Server can respond to client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (the "." stands for the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
- WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
- CreateFile, WriteFile, ReadFile, CloseHandle sequence
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns false if client connected between CreateNamedPipe() call and ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf("Request is %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
Locate a server via mailslot
[Figure: application clients 0 to n act as mailslot servers; the application server acts as mailslot client. The message is sent periodically.]
Mailslot servers (app client 0 ... app client n):
    hMS = CreateMailslot( "\\.\mailslot\status" ... );
    ReadFile( hMS, &ServStat ... );
    /* connect to server */
Mailslot client (app server):
    while (...) {
        Sleep(...);
        hMS = CreateFile( "\\*\mailslot\status" ... );
        WriteFile( hMS, &StatInfo ... );
    }
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieve handle for local mailslot
- \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
Bakery Algorithm
Notation: "<" establishes lexicographical order among 2-tuples (ticket, thread id)
(a, b) < (c, d) if a < c, or if a == c and b < d
$\max(a_0, \ldots, a_{n-1})$ is a number $k$ such that $k \ge a_i$ for $i = 0, \ldots, n-1$
Shared data
int choosing[n];
int number[n];   /* the ticket */
Data structures are initialized to 0
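The notation on this slide translates directly into two small C helpers. The names ticket_less and max_ticket are hypothetical; the comparison implements the lexicographic rule (a, b) < (c, d) and the maximum scans the ticket array.

```c
#include <assert.h>
#include <stdbool.h>

/* Lexicographic order on (ticket, thread id) pairs:
   (a, b) < (c, d) if a < c, or if a == c and b < d. */
bool ticket_less(int a, int b, int c, int d)
{
    return a < c || (a == c && b < d);
}

/* Largest ticket currently in the array number[0..n-1]. */
int max_ticket(const int *number, int n)
{
    assert(number != 0 && n > 0);
    int k = number[0];
    for (int i = 1; i < n; i++)
        if (number[i] > k)
            k = number[i];
    return k;
}
```

Two threads can draw the same ticket number, and the thread id then breaks the tie: ticket_less(3, 1, 3, 2) holds, so thread 1 goes before thread 2.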
Bakery Algorithm
do {
    choosing[i] = 1;
    number[i] = max(number[0], number[1], ..., number[n-1]) + 1;
    choosing[i] = 0;
    for (j = 0; j < n; j++) {
        while (choosing[j] == 1) ;
        while ((number[j] != 0) && ((number[j], j) < (number[i], i))) ;
    }
    critical section
    number[i] = 0;
    remainder section
} while (1);
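The pseudocode above can be turned into compilable C11. This is a sketch under stated assumptions: the thread count N_THREADS is fixed at compile time, the function names bakery_lock and bakery_unlock are hypothetical, and the shared arrays use sequentially consistent atomics because plain int arrays would not give the ordering guarantees the algorithm relies on with modern compilers and processors.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

#define N_THREADS 4   /* assumption: number of threads known at compile time */

/* Shared data, zero-initialized as on the slide. */
atomic_bool choosing[N_THREADS];
atomic_int  number[N_THREADS];

/* (a, b) < (c, d) in lexicographic order. */
static bool ticket_before(int a, int b, int c, int d)
{
    return a < c || (a == c && b < d);
}

/* Entry section for thread i. */
void bakery_lock(int i)
{
    assert(i >= 0 && i < N_THREADS);

    /* Take a ticket one larger than every ticket currently visible. */
    atomic_store(&choosing[i], true);
    int max = 0;
    for (int j = 0; j < N_THREADS; j++) {
        int t = atomic_load(&number[j]);
        if (t > max)
            max = t;
    }
    atomic_store(&number[i], max + 1);
    atomic_store(&choosing[i], false);

    for (int j = 0; j < N_THREADS; j++) {
        /* Wait until thread j has finished choosing its ticket. */
        while (atomic_load(&choosing[j]))
            ;
        /* Wait while thread j holds a ticket that goes before ours. */
        for (;;) {
            int tj = atomic_load(&number[j]);
            if (tj == 0 || !ticket_before(tj, j, atomic_load(&number[i]), i))
                break;
        }
    }
}

/* Exit section for thread i: give the ticket back. */
void bakery_unlock(int i)
{
    assert(i >= 0 && i < N_THREADS);
    atomic_store(&number[i], 0);
}
```

Each thread would call bakery_lock(i) before and bakery_unlock(i) after its critical section, with i being its own index in the range 0 to N_THREADS - 1.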
Mutual Exclusion - Hardware Support
Interrupt Disabling
ndash Concurrent threads cannot overlap on a uniprocessor
ndash Thread will run until performing a system call or interrupt happens
Special Atomic Machine Instructions
ndash Test and Set Instruction - read amp write a memory location
ndash Exchange Instruction - swap register and memory location
Problems with Machine-Instruction Approach
ndash Busy waiting
ndash Starvation is possible
ndash Deadlock is possible
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean &target) {
    boolean rv = target;
    target = true;
    return rv;
}
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false
Thread Ti
do {
    while (TestAndSet(lock)) ;
    critical section
    lock = false;
    remainder section
} while (1);
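As a rough illustration of the test-and-set lock, the sketch below uses the C11 `atomic_flag` type, whose `atomic_flag_test_and_set` operation atomically sets the flag and returns its previous value. The function names, thread count and iteration count are assumptions made for this example, and `sched_yield()` is added only so the demo stays fast on machines with few processors.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

#define TAS_THREADS 4      /* hypothetical demo parameters */
#define TAS_ROUNDS  10000

static atomic_flag tas_lock = ATOMIC_FLAG_INIT;  /* clear == "lock = false" */
static long tas_counter;                         /* protected by tas_lock */

static void *tas_worker(void *arg) {
    (void)arg;
    for (int r = 0; r < TAS_ROUNDS; r++) {
        /* Spin until the previous value was "false", i.e. we took the lock. */
        while (atomic_flag_test_and_set(&tas_lock))
            sched_yield();
        tas_counter++;                 /* critical section */
        atomic_flag_clear(&tas_lock);  /* lock = false */
    }
    return NULL;
}

/* Runs the workers and returns the final counter value. */
long tas_demo(void) {
    pthread_t t[TAS_THREADS];
    tas_counter = 0;
    for (int i = 0; i < TAS_THREADS; i++)
        pthread_create(&t[i], NULL, tas_worker, NULL);
    for (int i = 0; i < TAS_THREADS; i++)
        pthread_join(t[i], NULL);
    return tas_counter;
}
```

The lock gives mutual exclusion but no fairness: nothing prevents one thread from winning the flag repeatedly, which is the starvation problem listed for the machine-instruction approach.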
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0
Thread Ti
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
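The swap-based lock maps naturally onto the C11 `atomic_exchange` operation, which stores a new value and returns the old one in a single atomic step. The sketch below is one possible rendering; the names `xchg_acquire`, `xchg_try_acquire`, `xchg_release` and `xlock` are hypothetical.

```c
#include <stdatomic.h>

static atomic_int xlock;   /* 0 = free, 1 = held; zero-initialized */

/* Spin until the value swapped out of the lock is 0 (it was free). */
void xchg_acquire(atomic_int *lock) {
    int key = 1;
    while (key == 1)
        key = atomic_exchange(lock, 1);   /* Swap(lock, key) */
}

/* One attempt only: returns 1 if the lock was taken, 0 if it was busy. */
int xchg_try_acquire(atomic_int *lock) {
    return atomic_exchange(lock, 1) == 0;
}

void xchg_release(atomic_int *lock) {
    atomic_store(lock, 0);                /* lock = 0 */
}
```

A thread brackets its critical section with `xchg_acquire(&xlock)` and `xchg_release(&xlock)`; the try variant shows that a second acquisition attempt fails while the lock is held.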
Semaphores
Semaphore S - integer variable
- can only be accessed via two atomic operations
wait(S):
    while (S <= 0) ;
    S--;
signal(S):
    S++;
Critical Section of n Threads
Shared data
semaphore mutex; initially mutex = 1
Thread Ti
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
Semaphore Implementation
Semaphores may suspend/resume threads
- Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
- suspend() suspends the thread that invokes it
- resume(T) resumes the execution of a blocked thread T
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
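A blocking semaphore in the spirit of the record above can be sketched with a POSIX mutex and condition variable. This is an approximation rather than the slide's exact scheme: the condition variable plays the role of the list L together with suspend() and resume(T), and the value never goes negative because waiters block before decrementing. The type name `bsem` and the function names are hypothetical.

```c
#include <pthread.h>

typedef struct {
    int             value;   /* number of available "permits" */
    pthread_mutex_t m;       /* makes wait/signal atomic */
    pthread_cond_t  c;       /* stands in for the waiting list L */
} bsem;

void bsem_init(bsem *s, int initial) {
    s->value = initial;
    pthread_mutex_init(&s->m, NULL);
    pthread_cond_init(&s->c, NULL);
}

void bsem_wait(bsem *s) {
    pthread_mutex_lock(&s->m);
    while (s->value <= 0)
        pthread_cond_wait(&s->c, &s->m);   /* suspend(): no busy waiting */
    s->value--;
    pthread_mutex_unlock(&s->m);
}

void bsem_signal(bsem *s) {
    pthread_mutex_lock(&s->m);
    s->value++;
    pthread_cond_signal(&s->c);            /* resume one blocked thread */
    pthread_mutex_unlock(&s->m);
}

#define BSEM_THREADS 4      /* hypothetical demo parameters */
#define BSEM_ROUNDS  1000

static bsem bsem_mutex;
static long bsem_counter;

static void *bsem_worker(void *arg) {
    (void)arg;
    for (int r = 0; r < BSEM_ROUNDS; r++) {
        bsem_wait(&bsem_mutex);
        bsem_counter++;                    /* critical section */
        bsem_signal(&bsem_mutex);
    }
    return NULL;
}

/* Uses the semaphore, initialized to 1, as a mutex around a counter. */
long bsem_demo(void) {
    pthread_t t[BSEM_THREADS];
    bsem_counter = 0;
    bsem_init(&bsem_mutex, 1);
    for (int i = 0; i < BSEM_THREADS; i++)
        pthread_create(&t[i], NULL, bsem_worker, NULL);
    for (int i = 0; i < BSEM_THREADS; i++)
        pthread_join(t[i], NULL);
    return bsem_counter;
}
```

Blocked threads sleep inside `pthread_cond_wait` and consume no CPU time, which is the point of replacing the spinning `while (S <= 0)` loop.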
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code
Ti              Tj
A               wait(flag)
signal(flag)    B
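The ordering pattern above can be sketched with POSIX unnamed semaphores (`sem_init`, `sem_wait`, `sem_post`), assuming a platform such as Linux that supports them. The step counter and the function names are hypothetical; they exist only so the order of A and B can be observed.

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t flag;          /* initialized to 0: B must wait for A */
static int   step;          /* advanced by A, then by B */
static int   a_step, b_step;

static void *thread_ti(void *arg) {
    (void)arg;
    a_step = ++step;        /* statement A */
    sem_post(&flag);        /* signal(flag) */
    return NULL;
}

static void *thread_tj(void *arg) {
    (void)arg;
    sem_wait(&flag);        /* wait(flag): blocks until A has run */
    b_step = ++step;        /* statement B */
    return NULL;
}

/* Returns 1 if A ran before B, 0 otherwise. */
int order_demo(void) {
    pthread_t ti, tj;
    step = a_step = b_step = 0;
    sem_init(&flag, 0, 0);
    /* Start Tj first on purpose: it still cannot run B before A. */
    pthread_create(&tj, NULL, thread_tj, NULL);
    pthread_create(&ti, NULL, thread_ti, NULL);
    pthread_join(ti, NULL);
    pthread_join(tj, NULL);
    sem_destroy(&flag);
    return a_step == 1 && b_step == 2;
}
```

Because the semaphore starts at 0, the order in which the two threads are created or scheduled does not matter.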
Two Types of Semaphores
Counting semaphore
- integer value can range over an unrestricted domain
Binary semaphore
- integer value can range only between 0 and 1
- can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
Deadlock and Starvation
Deadlock
- two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0              T1
wait(S)         wait(Q)
wait(Q)         wait(S)
signal(S)       signal(Q)
signal(Q)       signal(S)
Deadlock and Starvation
Starvation - indefinite blocking
- A thread may never be removed from the semaphore queue in which it is suspended
Solution
- all code should acquire/release semaphores in the same order
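One common way to enforce a single acquisition order is to rank the locks, for example by address, and always take the lower-ranked lock first. The sketch below illustrates this with POSIX mutexes rather than semaphores; the `account` type and the function names are hypothetical and only serve the example.

```c
#include <pthread.h>
#include <stdint.h>

typedef struct {
    pthread_mutex_t m;
    long            balance;
} account;

/* Always lock the account with the lower address first. */
static void lock_pair(account *a, account *b) {
    if ((uintptr_t)a < (uintptr_t)b) {
        pthread_mutex_lock(&a->m);
        pthread_mutex_lock(&b->m);
    } else {
        pthread_mutex_lock(&b->m);
        pthread_mutex_lock(&a->m);
    }
}

static void unlock_pair(account *a, account *b) {
    pthread_mutex_unlock(&a->m);
    pthread_mutex_unlock(&b->m);
}

/* Moves amount between two distinct accounts. */
void transfer(account *from, account *to, long amount) {
    lock_pair(from, to);
    from->balance -= amount;
    to->balance   += amount;
    unlock_pair(from, to);
}

static account acct_s = { PTHREAD_MUTEX_INITIALIZER, 1000 };
static account acct_q = { PTHREAD_MUTEX_INITIALIZER, 1000 };

static void *s_to_q(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++) transfer(&acct_s, &acct_q, 1);
    return NULL;
}

static void *q_to_s(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++) transfer(&acct_q, &acct_s, 1);
    return NULL;
}

/* Two threads transfer in opposite directions; returns the total balance. */
long ordering_demo(void) {
    pthread_t t0, t1;
    pthread_create(&t0, NULL, s_to_q, NULL);
    pthread_create(&t1, NULL, q_to_s, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return acct_s.balance + acct_q.balance;
}
```

Had each thread locked its own "from" account first, the two threads could each hold one lock and wait forever for the other, which is exactly the T0/T1 scenario with S and Q.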
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects, which may act as mutexes and semaphores
Dispatcher objects may also provide events. An event acts much like a condition variable
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
- Chapter 7 - Process Synchronization
- Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
- Scheduled, preempted, etc. like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
- Interrupt dispatcher invokes the interrupt service routine
- ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread
Trap dispatching
Trap: processor's mechanism to capture executing thread
- Switch from user to kernel mode
- Interrupts - asynchronous
- Exceptions - synchronous
[Figure: trap handlers]
- Interrupt -> interrupt dispatcher -> interrupt service routines
- System service call -> system service dispatcher -> system services
- HW exceptions, SW exceptions -> exception dispatcher -> exception handlers
- Virtual address exceptions -> virtual memory manager's pager
Interrupt Dispatching
[Figure labels: interrupt; user or kernel mode code; kernel mode]
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
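Interrupt Precedence via IRQLs (x86)
[Figure: x86 IRQLs, highest to lowest]
31  High
30  Power fail
29  Interprocessor Interrupt
28  Clock
    Profile & Synch (Srv 2003)
    Device n ... Device 1
2   Dispatch/DPC
1   APC
0   Passive/Low
Hardware interrupts: the device levels and above
Deferrable software interrupts: APC and Dispatch/DPC
Normal thread execution: Passive/Low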
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library - Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of an ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected to the same IRQL
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update the system's clock and to allocate CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler - Kernprof.exe, Res Kit)
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all; normal thread execution
IRQLs on 64-bit Systems
[Figure: IRQLs 0 to 15 on x64 and IA64, listed here from lowest to highest]
x64: Passive/Low, APC, Dispatch/DPC, Device 1 ... Device n, Synch (Srv 2003), Clock, Interprocessor Interrupt/Power, High/Profile
IA64: Passive/Low, APC, Dispatch/DPC & Synch (UP only), Correctable Machine Check, Device 1 ... Device n, Synch (MP only), Clock, Interprocessor Interrupt, High/Profile/Power
Interrupt Prioritization & Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device; this can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
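The two deterministic mappings above reduce to simple arithmetic, sketched below. The function names are hypothetical, and the code only restates the formulas from the slide (with integer division for the vector case); it is not taken from any Windows header.

```c
/* x86 uniprocessor systems: IRQL = 27 - IRQ */
int irql_from_irq_x86_up(int irq) {
    return 27 - irq;
}

/* x64 and IA64 systems: IRQL = IDT vector number / 16 (integer division) */
int irql_from_vector_64(int idt_vector) {
    return idt_vector / 16;
}
```

For example, IRQ 0 on an x86 UP system maps to IRQL 27 and IRQ 8 to IRQL 19, while vector 0x41 on x64 maps to IRQL 4. With 256 vectors, the second formula yields exactly the sixteen levels 0 to 15.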
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when the kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- A spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
Kernel Synchronization
[Figure: Processor A and Processor B both contend for the spinlock that guards the DPC queue; each processor runs the sequence below]
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin
    remove DPC from queue    (critical section)
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
- Thus, the busy-wait loop requires no access to the memory bus
When releasing the lock, the 1st processor's flag is modified
- Exactly one processor is being signaled
- Pre-determined wait order
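The idea of waiting on a local flag can be illustrated with an MCS-style queue lock, a well-known user-space design in the same family. This is a sketch under that assumption and not the Windows kernel implementation; the `qnode` and `qlock` types, the function names and the demo parameters are hypothetical. The `sched_yield()` calls are only there to keep the demo quick when threads outnumber processors.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct qnode {
    _Atomic(struct qnode *) next;    /* successor in the wait queue */
    atomic_bool             locked;  /* local flag this waiter spins on */
} qnode;

typedef struct {
    _Atomic(qnode *) tail;           /* last waiter, or NULL if free */
} qlock;

void qlock_acquire(qlock *l, qnode *me) {
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    qnode *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->locked))           /* spin on own flag only */
            sched_yield();
    }
}

void qlock_release(qlock *l, qnode *me) {
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        /* No known successor: try to mark the lock free. */
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;
        /* A successor is in the middle of linking itself in. */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();
    }
    atomic_store(&succ->locked, false);            /* signal exactly one waiter */
}

#define Q_THREADS 4      /* hypothetical demo parameters */
#define Q_ROUNDS  1000

static qlock demo_qlock;
static long  q_counter;

static void *q_worker(void *arg) {
    (void)arg;
    qnode me;                                      /* one queue node per thread */
    for (int r = 0; r < Q_ROUNDS; r++) {
        qlock_acquire(&demo_qlock, &me);
        q_counter++;                               /* critical section */
        qlock_release(&demo_qlock, &me);
    }
    return NULL;
}

/* Runs the workers and returns the final counter value. */
long qlock_demo(void) {
    pthread_t t[Q_THREADS];
    q_counter = 0;
    atomic_store(&demo_qlock.tail, NULL);
    for (int i = 0; i < Q_THREADS; i++)
        pthread_create(&t[i], NULL, q_worker, NULL);
    for (int i = 0; i < Q_THREADS; i++)
        pthread_join(t[i], NULL);
    return q_counter;
}
```

Each waiter spins on its own `locked` flag, the releaser clears exactly one successor's flag, and waiters are served in the order in which they joined the queue.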
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU; allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once; "all" means all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include an optional timeout argument
- Waiting threads consume no CPU time
Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Figure: interaction with thread scheduling. Thread states: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Labeled events: create and initialize thread object; thread waits on an object handle; set object to signaled state; wait is complete]
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get a priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread: it may preempt the running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: owning thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: owning thread or other thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect: kernel resumes one or more waiting threads
What signals an object (contd)
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Event
- nonsignaled -> signaled: a thread sets the event
- signaled -> nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: dedicated thread sets one event in the event pair
- signaled -> nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes waiting dedicated thread
Timer
- nonsignaled -> signaled: timer expires
- signaled -> nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: thread terminates
- signaled -> nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
[Figure: DPC queue: queue head -> DPC object -> DPC object -> DPC object]
Delivering a DPC
[Figure: interrupt dispatch table with levels High, Power failure, ..., Dispatch/DPC, APC, Low; DPC queue; dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel (K) and user (U), each holding APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
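The four accounting rules above amount to a small decision function, sketched below. The enum, the function name and the `in_user_mode` parameter are hypothetical; the slide only says that time below IRQL 2 goes to the thread's user or kernel time, so the sketch assumes the choice depends on the mode the thread was executing in.

```c
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/* Decides where one clock tick is charged, following the slide's rules. */
charge_t clock_tick_charge(int irql, int processing_dpc, int in_user_mode) {
    if (irql > 2)
        return CHARGE_INTERRUPT;                 /* IRQL > 2 */
    if (irql == 2)                               /* IRQL = 2 */
        return processing_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: charged to the thread (assumed split by execution mode) */
    return in_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Because the decision is made only when the clock interrupt fires, the whole interval is charged to whatever was running at that instant.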
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g. one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: header with Size, Type, State and wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects (WaitBlockList), dispatcher objects (Size, Type, State, wait list head, object-type-specific data) and wait blocks (List entry, Object, Thread, Key, Type, Next link)]
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted, but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec )
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec )
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once, even if the body raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout )
Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
- dwTimeout == 0: no wait, check whether the object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects; when waiting for the first, the function returns the index of the first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free the mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName )
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName )
BOOL ReleaseMutex( HANDLE hMutex )
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex )
- pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
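The polling behavior of `pthread_mutex_trylock()` can be seen in a short sketch: while one thread holds the mutex, a try from another thread returns `EBUSY` immediately instead of blocking. The helper names are hypothetical.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Tries the mutex once and stores the result code in *out. */
static void *prober(void *out) {
    int rc = pthread_mutex_trylock(&demo_mutex);
    if (rc == 0)
        pthread_mutex_unlock(&demo_mutex);   /* we got it: give it back */
    *(int *)out = rc;
    return NULL;
}

static int probe_from_other_thread(void) {
    pthread_t t;
    int rc = -1;
    pthread_create(&t, NULL, prober, &rc);
    pthread_join(t, NULL);
    return rc;
}

/* Returns 1 if trylock reported EBUSY while held and 0 (success) afterwards. */
int trylock_demo(void) {
    pthread_mutex_lock(&demo_mutex);          /* blocking acquire */
    int while_held = probe_from_other_thread();
    pthread_mutex_unlock(&demo_mutex);
    int after_release = probe_from_other_thread();
    return while_held == EBUSY && after_release == 0;
}
```

The `EBUSY` result plays the same role as `WAIT_TIMEOUT` from `WaitForSingleObject()` called with a zero timeout.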
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value
- ReleaseSemaphore() returns the old semaphore count
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName )
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName )
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount )
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- A manual-reset event can signal several threads simultaneously; it must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- An auto-reset event signals a single thread; the event is reset automatically
- fInitialState == TRUE: create the event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )
BOOL ResetEvent( HANDLE hEvent )
BOOL PulseEvent( HANDLE hEvent )
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
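Although pthreads has no direct counterpart, a manual-reset event can be approximated with a mutex, a condition variable and a boolean flag. The sketch below is one such emulation; the `mr_event` type and its functions are hypothetical names, not part of any POSIX or Windows API.

```c
#include <pthread.h>

typedef struct {
    pthread_mutex_t m;
    pthread_cond_t  c;
    int             signaled;   /* stays set until explicitly reset */
} mr_event;

void mr_event_init(mr_event *e) {
    pthread_mutex_init(&e->m, NULL);
    pthread_cond_init(&e->c, NULL);
    e->signaled = 0;
}

void mr_event_set(mr_event *e) {         /* roughly SetEvent() */
    pthread_mutex_lock(&e->m);
    e->signaled = 1;
    pthread_cond_broadcast(&e->c);       /* release every waiter */
    pthread_mutex_unlock(&e->m);
}

void mr_event_reset(mr_event *e) {       /* roughly ResetEvent() */
    pthread_mutex_lock(&e->m);
    e->signaled = 0;
    pthread_mutex_unlock(&e->m);
}

void mr_event_wait(mr_event *e) {        /* roughly an INFINITE wait */
    pthread_mutex_lock(&e->m);
    while (!e->signaled)
        pthread_cond_wait(&e->c, &e->m);
    pthread_mutex_unlock(&e->m);
}

#define EV_WAITERS 3                     /* hypothetical demo parameter */

static mr_event        demo_event;
static pthread_mutex_t released_lock = PTHREAD_MUTEX_INITIALIZER;
static int             released;

static void *ev_waiter(void *arg) {
    (void)arg;
    mr_event_wait(&demo_event);
    pthread_mutex_lock(&released_lock);
    released++;
    pthread_mutex_unlock(&released_lock);
    return NULL;
}

/* One set releases all waiters; returns how many were released. */
int event_demo(void) {
    pthread_t t[EV_WAITERS];
    released = 0;
    mr_event_init(&demo_event);
    for (int i = 0; i < EV_WAITERS; i++)
        pthread_create(&t[i], NULL, ev_waiter, NULL);
    mr_event_set(&demo_event);
    for (int i = 0; i < EV_WAITERS; i++)
        pthread_join(t[i], NULL);
    return released;
}
```

Because the flag stays set, threads that start waiting after `mr_event_set()` pass straight through, which is the manual-reset behavior that a bare `pthread_cond_broadcast()` does not provide.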
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe )
[Figure: main creates prog1 and prog2, connected by a pipe (prog1 -> pipe -> prog2)]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same named pipe
- Server can respond to a client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa )
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (the "." stands for the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage )
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa )
Equivalent to a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut )
Equivalent to a CreateFile, WriteFile, ReadFile, CloseHandle sequence
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo )
lpo == NULL
- Call will return as soon as there is a client connection
- Returns false if the client connected between the CreateNamedPipe() call and ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to named pipes
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it is easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example Using a Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf (stderr, "Failure to locate server.\n"); exit (3);
}
/* Write the request. */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out. */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf (stderr, "Connect or Read Named Pipe error\n"); exit (4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client. */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen (ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}   /* End of server operation. */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain
Locate a server via mailslot
Mailslot servers (app client 0 ... app client n), each running:
    hMS = CreateMailslot ("\\.\mailslot\status");
    ReadFile (hMS, &ServStat);
    /* connect to server */
Mailslot client (app server); message is sent periodically:
    while (...) {
        Sleep (...);
        hMS = CreateFile ("\\*\mailslot\status");
        ...
        WriteFile (hMS, &StatInfo
    }
Creating a mailslot
HANDLE CreateMailslot (LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa)
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name - retrieve handle for local mailslot
– \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
Bakery Algorithm
do {
    choosing[i] = 1;
    number[i] = max(number[0], number[1], ..., number[n-1]) + 1;
    choosing[i] = 0;
    for (j = 0; j < n; j++) {
        while (choosing[j] == 1) ;
        while ((number[j] != 0) && ((number[j], j) < (number[i], i))) ;
    }
    critical section
    number[i] = 0;
    remainder section
} while (1);
Here (a, b) < (c, d) is lexicographic order: a < c, or a == c and b < d
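The ordering test in the inner loop compares (ticket, thread index) pairs, which plain C cannot express directly. A minimal sketch of that comparison follows; the helper names `ticket_before` and `must_wait_for` are hypothetical and not part of the slides.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: true if (num_a, idx_a) precedes (num_b, idx_b)
   in lexicographic order: lower ticket first, ties broken by the
   lower thread index. */
bool ticket_before(int num_a, int idx_a, int num_b, int idx_b) {
    assert(idx_a >= 0 && idx_b >= 0);
    if (num_a != num_b)
        return num_a < num_b;
    return idx_a < idx_b;
}

/* Thread i keeps waiting for thread j while j holds a ticket
   (number[j] != 0) that precedes i's own ticket. */
bool must_wait_for(const int number[], int i, int j) {
    return number[j] != 0 && ticket_before(number[j], j, number[i], i);
}
```

With equal tickets, the thread with the smaller index goes first, which is what keeps the algorithm correct when two threads pick the same number.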
Mutual Exclusion - Hardware Support
Interrupt Disabling
– Concurrent threads cannot overlap on a uniprocessor
– Thread will run until it performs a system call or an interrupt happens
Special Atomic Machine Instructions
– Test and Set Instruction - read & write a memory location
– Exchange Instruction - swap register and memory location
Problems with Machine-Instruction Approach
– Busy waiting
– Starvation is possible
– Deadlock is possible
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean &target) {
    boolean rv = target;
    target = true;
    return rv;
}
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti
do {
    while (TestAndSet(lock)) ;
    critical section
    lock = false;
    remainder section
} while (1);
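In portable C11, the test-and-set lock can be sketched with `atomic_flag`, whose `atomic_flag_test_and_set` returns the old value and sets the flag in one atomic step. The `tas_*` names below are hypothetical; this is an illustration of the idea, not a production lock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_flag held; } tas_lock;   /* hypothetical type */

void tas_init(tas_lock *l) {
    assert(l);
    atomic_flag_clear(&l->held);                 /* lock = false */
}

/* One test-and-set attempt: true if the caller obtained the lock. */
bool tas_try_acquire(tas_lock *l) {
    return !atomic_flag_test_and_set(&l->held);
}

/* Entry section: spin until the old value was false. */
void tas_acquire(tas_lock *l) {
    while (!tas_try_acquire(l))
        ;                                        /* busy wait */
}

/* Exit section: lock = false. */
void tas_release(tas_lock *l) {
    atomic_flag_clear(&l->held);
}
```

A thread would call `tas_acquire`, run its critical section, then call `tas_release`. The busy wait is exactly the cost the slides list under problems of the machine-instruction approach.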
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
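The swap-based entry protocol maps onto C11's `atomic_exchange`, which stores a new value and returns the old one atomically. A minimal sketch under that assumption (the `xchg_*` names are hypothetical):

```c
#include <assert.h>
#include <stdatomic.h>

typedef struct { atomic_int lock; } xchg_lock;   /* 0 = free, 1 = held */

void xchg_init(xchg_lock *l) {
    assert(l);
    atomic_init(&l->lock, 0);
}

/* One swap attempt: key goes in as 1, the old lock value comes back.
   A result of 0 means the caller now holds the lock. */
int xchg_attempt(xchg_lock *l) {
    int key = 1;
    key = atomic_exchange(&l->lock, key);
    return key;
}

/* Entry section: repeat the swap while key == 1. */
void xchg_acquire(xchg_lock *l) {
    while (xchg_attempt(l) == 1)
        ;
}

/* Exit section: lock = 0. */
void xchg_release(xchg_lock *l) {
    atomic_store(&l->lock, 0);
}
```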
Semaphores
Semaphore S: integer variable
can only be accessed via two atomic operations
wait (S):
    while (S <= 0) ;
    S--;
signal (S):
    S++;
Critical Section of n Threads
Shared data
semaphore mutex; initially mutex = 1
Thread Ti
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
Semaphore Implementation
Semaphores may suspend/resume threads
– Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
– suspend() suspends the thread that invokes it
– resume(T) resumes the execution of a blocked thread T
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
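The bookkeeping of this blocking implementation can be traced with a small single-threaded model: when the value is negative, its magnitude equals the number of suspended threads. The sketch below only records which thread id would be suspended or resumed; `model_sem`, the FIFO queue policy, and the fixed queue size are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_WAITERS 16

typedef struct {
    int value;                 /* if negative, -value threads are queued */
    int queue[MAX_WAITERS];    /* FIFO of suspended thread ids (S.L) */
    int head, count;
} model_sem;

void model_init(model_sem *s, int initial) {
    s->value = initial;
    s->head = 0;
    s->count = 0;
}

/* wait(S): returns true if thread `tid` would be suspended. */
bool model_wait(model_sem *s, int tid) {
    s->value--;
    if (s->value < 0) {
        assert(s->count < MAX_WAITERS);
        s->queue[(s->head + s->count) % MAX_WAITERS] = tid;
        s->count++;
        return true;
    }
    return false;
}

/* signal(S): returns the id of the resumed thread, or -1 if none waits. */
int model_signal(model_sem *s) {
    s->value++;
    if (s->value <= 0) {
        int tid = s->queue[s->head];
        s->head = (s->head + 1) % MAX_WAITERS;
        s->count--;
        return tid;
    }
    return -1;
}
```

Starting from value 1, three waits leave the value at -2 with two queued threads, and each later signal resumes one of them in arrival order.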
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code
Ti              Tj
A               wait(flag)
signal(flag)    B
Two Types of Semaphores
Counting semaphore
– integer value can range over an unrestricted domain
Binary semaphore
– integer value can range only between 0 and 1
– can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
Deadlock and Starvation
Deadlock
– two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0              T1
wait(S);        wait(Q);
wait(Q);        wait(S);
signal(S);      signal(Q);
signal(Q);      signal(S);
Deadlock and Starvation
Starvation: indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution
– all code should acquire/release semaphores in the same order
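One common way to get a single global acquisition order is to sort the locks by some fixed key before taking them. The sketch below orders by address; `lock_t` is a stand-in type and `order_locks` a hypothetical helper, so this shows the ordering idea only, not a real semaphore API.

```c
#include <assert.h>
#include <stdint.h>

typedef struct { int placeholder; } lock_t;   /* stand-in for a semaphore */

/* Decide the acquisition order for two distinct locks by address, so
   that every thread takes them in the same order regardless of the
   order in which it names them. */
void order_locks(lock_t *a, lock_t *b, lock_t **first, lock_t **second) {
    assert(a != b);
    if ((uintptr_t)a < (uintptr_t)b) {
        *first = a;
        *second = b;
    } else {
        *first = b;
        *second = a;
    }
}
```

If T0 and T1 from the deadlock example both ran their two semaphores through such a helper before calling wait, they would request S and Q in the same sequence and the circular wait could not form.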
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects, which may act as mutexes and semaphores
Dispatcher objects may also provide events; an event acts much like a condition variable
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc., like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
– Interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
Trap dispatching
Trap: processor's mechanism to capture an executing thread
– Switch from user to kernel mode
– Interrupts: asynchronous
– Exceptions: synchronous
Figure: trap source -> dispatcher -> handler
– Interrupt -> interrupt dispatcher -> interrupt service routines
– System service call -> system service dispatcher -> system services
– HW exceptions, SW exceptions -> exception dispatcher -> exception handlers
– Virtual address exceptions -> virtual memory manager's pager
Interrupt Dispatching
Interrupt arrives while user or kernel mode code is running; handling runs in kernel mode
Interrupt dispatch routine
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
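Interrupt Precedence via IRQLs (x86)
IRQL  Level                          Class
31    High                           Hardware interrupts
30    Power fail                     Hardware interrupts
29    Interprocessor Interrupt       Hardware interrupts
28    Clock                          Hardware interrupts
      Profile & Synch (Srv 2003)     Hardware interrupts
      ...                            Hardware interrupts
      Device 1                       Hardware interrupts
2     Dispatch/DPC                   Deferrable software interrupts
1     APC                            Deferrable software interrupts
0     Passive/Low                    Normal thread execution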
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library, Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected with an IRQL
Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document, but has never been used
Inter-processor interrupt
– Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update the system's clock, allocation of CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler: Kernprof.exe, Res Kit)
Predefined IRQLs (contd.)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all, normal thread execution
IRQLs on 64-bit Systems
IRQL  x64                              IA64
15    High/Profile                     High/Profile/Power
14    Interprocessor Interrupt/Power   Interprocessor Interrupt
13    Clock                            Clock
12    Synch (Srv 2003)                 Synch (MP only)
      Device n                         Device n
      ...                              ...
4                                      Device 1
3     Device 1                         Correctable Machine Check
2     Dispatch/DPC                     Dispatch/DPC & Synch (UP only)
1     APC                              APC
0     Passive/Low                      Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
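The two deterministic IRQL rules can be written down as arithmetic. The sketch below assumes the x64/IA64 rule reads "IDT vector number divided by 16" (the operator is garbled in the transcript) and uses hypothetical function names; the x86 MP case is omitted because the slide describes it only as bucketized.

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_from_irq_x86_up(int irq) {
    assert(irq >= 0 && irq <= 27);
    return 27 - irq;
}

/* x64 / IA64 rule, assuming "vector number / 16": the 256 IDT vectors
   fall into 16 groups, one group per IRQL 0..15. */
int irql_from_vector_64(int vector) {
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

Under these assumptions a lower IRQ number yields a higher IRQL on x86 UP systems, while on 64-bit systems the upper four bits of the vector number give the IRQL.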
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when the kernel is deep within many layers of code
– Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
Kernel Synchronization
Processor A and Processor B each execute:
    do
        acquire_spinlock(DPC)
    until (SUCCESS)
    begin                        /* critical section */
        remove DPC from queue
    end
    release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires lock; other processors wait on a processor-local flag
– Thus, busy-wait loop requires no access to the memory bus
When releasing the lock, the 1st processor's flag is modified
– Exactly one processor is being signaled
– Pre-determined wait order
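The queueing idea can be illustrated with an MCS-style lock in C11: each waiter enqueues its own node and spins only on a flag inside that node, and the releaser hands the lock to exactly one successor. This is a textbook design chosen for illustration; the slides do not say that the Windows kernel implements queued spinlocks this way, and the `qlock_*` names are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool locked;            /* per-waiter flag the owner spins on */
} qnode;

typedef struct { _Atomic(qnode *) tail; } qlock;

void qlock_init(qlock *l) { atomic_init(&l->tail, NULL); }

bool qlock_is_free(qlock *l) { return atomic_load(&l->tail) == NULL; }

void qlock_acquire(qlock *l, qnode *me) {
    assert(me != NULL);
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    qnode *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev == NULL)
        return;                                    /* lock was free */
    atomic_store(&prev->next, me);
    while (atomic_load(&me->locked))
        ;                                          /* spin on own flag */
}

void qlock_release(qlock *l, qnode *me) {
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        /* No visible successor: try to mark the lock free. */
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;
        /* A successor is mid-enqueue; wait until it links itself. */
        while ((succ = atomic_load(&me->next)) == NULL)
            ;
    }
    atomic_store(&succ->locked, false);            /* signal exactly one */
}
```

Each processor would pass its own `qnode` (for example, one per CPU) to `qlock_acquire` and the same node to `qlock_release`. Waiters are served in the order in which they joined the queue.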
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once; "all": all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include optional timeout argument
– Waiting threads consume no CPU time
Waiting
Waitable objects include
– Events (may be auto-reset or manual reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object, and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects, outside the kernel
Figure: interaction with thread scheduling
– Create and initialize thread object
– Thread waits on an object handle
– Set object to signaled state; wait is complete
– Thread states shown: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get a priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread; it may preempt the running thread if it has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled -> signaled: owning thread releases mutex
– signaled -> nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled -> signaled: owning thread or other thread releases mutex
– signaled -> nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Semaphore
– nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
– signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
– Effect: kernel resumes one or more waiting threads
What signals an object (contd.)
Event
– nonsignaled -> signaled: a thread sets the event
– signaled -> nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled -> signaled: dedicated thread sets one event in the event pair
– signaled -> nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes waiting dedicated thread
Timer
– nonsignaled -> signaled: timer expires
– signaled -> nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– nonsignaled -> signaled: thread terminates
– signaled -> nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) is completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
Figure: queue head -> DPC object -> DPC object -> DPC object
Delivering a DPC
1. Timer expires, kernel queues DPC that will release all waiting threads; kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
Figure: interrupt dispatch table (High, Power failure, Dispatch/DPC, APC, Low), DPC queue, dispatcher
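The delivery steps can be mimicked with a small single-processor simulation: DPCs are appended to a per-CPU FIFO and the queue is drained only once the IRQL is lowered below dispatch level. All types and functions here (`cpu`, `queue_dpc`, `lower_irql`, `count_dpc`) are hypothetical and do not correspond to kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

#define DISPATCH_LEVEL 2

typedef void (*dpc_routine)(void *context);

typedef struct dpc {
    dpc_routine routine;
    void *context;
    struct dpc *next;
} dpc;

typedef struct {
    dpc *head, *tail;   /* per-processor DPC queue */
    int irql;           /* current IRQL of the simulated processor */
} cpu;

void cpu_init(cpu *c) { c->head = c->tail = NULL; c->irql = 0; }

/* Step 1: queue a DPC (for example from an ISR or timer expiration). */
void queue_dpc(cpu *c, dpc *d) {
    d->next = NULL;
    if (c->tail) c->tail->next = d; else c->head = d;
    c->tail = d;
}

/* Steps 2-4: when the IRQL drops below dispatch level, run every queued
   DPC routine at dispatch level in FIFO order. Returns how many ran. */
int lower_irql(cpu *c, int new_irql) {
    int ran = 0;
    assert(new_irql <= c->irql);
    c->irql = new_irql;
    if (new_irql < DISPATCH_LEVEL) {
        c->irql = DISPATCH_LEVEL;
        while (c->head) {
            dpc *d = c->head;
            c->head = d->next;
            if (!c->head) c->tail = NULL;
            d->routine(d->context);
            ran++;
        }
        c->irql = new_irql;
    }
    return ran;
}

/* Example routine: counts how often it was invoked. */
void count_dpc(void *context) { ++*(int *)context; }
```

Lowering the IRQL from a device level to dispatch level runs nothing; only the later drop to passive level drains the queue, which mirrors step 2.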
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode, at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable, but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserApc
– Only deliverable when thread is in "alertable wait"
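The deliverability rules for the three APC kinds reduce to a small predicate. The sketch below encodes only what the slides state; the type and field names are hypothetical, and real kernel state is considerably richer.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { APC_SPECIAL_KERNEL, APC_ORDINARY_KERNEL, APC_USER } apc_kind;

typedef struct {
    int irql;                    /* 0 = passive, 1 = APC level, ... */
    bool kernel_apcs_disabled;   /* e.g. after KeEnterCriticalRegion */
    bool in_alertable_wait;
} thread_state;

bool apc_deliverable(apc_kind kind, const thread_state *t) {
    assert(t->irql >= 0);
    switch (kind) {
    case APC_SPECIAL_KERNEL:     /* unless already at IRQL 1 or above */
        return t->irql < 1;
    case APC_ORDINARY_KERNEL:    /* IRQL 0 and not explicitly disabled */
        return t->irql == 0 && !t->kernel_apcs_disabled;
    case APC_USER:               /* only in an alertable wait */
        return t->in_alertable_wait;
    }
    return false;
}
```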
Asynchronous Procedure Calls (APCs)
Figure: thread object with two APC queues, K (kernel mode) and U (user mode), each linking APC objects
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 & not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
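The four accounting cases form a simple decision function evaluated at each clock tick. A minimal sketch follows; the enum and parameter names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge;

/* Decide which counter the clock ISR charges for the interrupted state. */
charge clock_tick_charge(int irql, bool in_user_mode, bool processing_dpc) {
    assert(irql >= 0 && irql <= 31);
    if (irql < 2)
        return in_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
    if (irql == 2)
        return processing_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return CHARGE_INTERRUPT;
}
```

Because the decision is sampled only when the clock fires, work that starts and ends between two ticks is never charged to anything.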
IRQLs and CPU Time Accounting
Time spent servicing interrupts is NOT charged to the interrupted thread, so if the system is busy but no process appears to be running, it must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really counts of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– Some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– When signaled, a wait on the object is satisfied
– Different object types differ in terms of what changes their state
– Wait and unwait implementation is common to all types of dispatcher objects
Figure: dispatcher object layout (see \ntddk\inc\ddk\ntddk.h)
– Header: Size, Type, State, wait list head
– Object-type-specific data
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
Figure: dispatcher objects, thread objects, and wait blocks
– Dispatcher object: Size, Type, State, wait list head, object-type-specific data
– Thread object: WaitBlockList
– Wait block: list entry, Object, Thread, Key, Type, Next link
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection (LPCRITICAL_SECTION sec)
VOID DeleteCriticalSection (LPCRITICAL_SECTION sec)
VOID EnterCriticalSection (LPCRITICAL_SECTION sec)
VOID LeaveCriticalSection (LPCRITICAL_SECTION sec)
BOOL TryEnterCriticalSection (LPCRITICAL_SECTION sec)
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection (&crit);
/* ... main loop in any of the threads */
while (!done) {
    __try {
        EnterCriticalSection (&crit);
        counter += local_value;
    }
    __finally { LeaveCriticalSection (&crit); }   /* leave exactly once per enter */
}
DeleteCriticalSection (&crit);
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject (HANDLE hObject, DWORD dwTimeout)
DWORD WaitForMultipleObjects (DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout)
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
– dwTimeout == 0: no wait, check whether object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for first signaled object or all objects; function returns index of first signaled object
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
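The difference between waiting for "any" and waiting for "all" can be sketched as a pure function over the signaled states of the objects. This models the decision only; it does not block, time out, or apply side effects. The function name, the `WAIT_NOT_SATISFIED` constant, and the choice of 0 as the result of a satisfied wait-all are assumptions of the sketch.

```c
#include <assert.h>
#include <stdbool.h>

#define WAIT_NOT_SATISFIED (-1)

/* wait_all == false: index of the first signaled object, if any.
   wait_all == true:  0 if every object is signaled.
   Otherwise WAIT_NOT_SATISFIED (the real call would keep waiting). */
int wait_resolution(const bool signaled[], int count, bool wait_all) {
    int first = WAIT_NOT_SATISFIED;
    assert(count > 0 && count <= 64);   /* MAXIMUM_WAIT_OBJECTS */
    for (int i = 0; i < count; i++) {
        if (signaled[i]) {
            if (first < 0)
                first = i;
        } else if (wait_all) {
            return WAIT_NOT_SATISFIED;
        }
    }
    return wait_all ? 0 : first;
}
```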
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex (LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName)
HANDLE OpenMutex (DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName)
BOOL ReleaseMutex (HANDLE hMutex)
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex (NULL, FALSE, NULL);
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject (mutex, INFINITE);
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex (mutex);
}
CloseHandle (mutex);
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject (hMutex, INFINITE)
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() reports the old semaphore count through lpPreviousCount
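The counting rules can be traced with a small model: a wait succeeds only while the count is positive, and a release adds to the count, reports the previous value, and is rejected if it would exceed the maximum. The `msem_*` names are hypothetical and the model is single-threaded, so it illustrates the arithmetic rather than the blocking behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { long count, max; } msem;

bool msem_init(msem *s, long initial, long max) {
    if (max <= 0 || initial < 0 || initial > max)
        return false;
    s->count = initial;
    s->max = max;
    return true;
}

/* Signaled means count > 0. */
bool msem_is_signaled(const msem *s) { return s->count > 0; }

/* A satisfied wait decreases the count by 1; false means the caller
   would have to block. */
bool msem_try_wait(msem *s) {
    if (s->count == 0)
        return false;
    s->count--;
    return true;
}

/* Release by any positive amount; fails without changing the count if
   the maximum would be exceeded. The old count goes to *previous. */
bool msem_release(msem *s, long release_count, long *previous) {
    assert(s != NULL);
    if (release_count <= 0 || s->count + release_count > s->max)
        return false;
    if (previous)
        *previous = s->count;
    s->count += release_count;
    return true;
}
```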
Semaphores
HANDLE CreateSemaphore (LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName)
HANDLE OpenSemaphore (DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName)
BOOL ReleaseSemaphore (HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount)
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
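The contrast between manual-reset and auto-reset events can be captured in a tiny state model: a signaled manual-reset event releases every waiter and stays signaled, while a signaled auto-reset event releases one waiter and resets itself. The `event_model` type and function names are hypothetical, and PulseEvent is not modeled.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool manual_reset;
    bool signaled;
} event_model;

void event_init(event_model *e, bool manual_reset, bool initial_state) {
    e->manual_reset = manual_reset;
    e->signaled = initial_state;
}

void event_set(event_model *e)   { e->signaled = true; }
void event_reset(event_model *e) { e->signaled = false; }

/* Given `waiters` threads waiting on the event, return how many are
   released, applying the auto-reset side effect where it applies. */
int event_release_waiters(event_model *e, int waiters) {
    assert(waiters >= 0);
    if (!e->signaled || waiters == 0)
        return 0;
    if (e->manual_reset)
        return waiters;          /* stays signaled until event_reset */
    e->signaled = false;         /* auto-reset after one release */
    return 1;
}
```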
Events
HANDLE CreateEvent (LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName)
BOOL SetEvent (HANDLE hEvent)
BOOL ResetEvent (HANDLE hEvent)
BOOL PulseEvent (HANDLE hEvent)
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
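Since pthreads has no exact equivalent to a manual-reset event, one common workaround is to pair a condition variable with a flag that remembers the signaled state. The sketch below is one possible emulation, not a pthreads facility: manual_event, event_set, event_reset, event_wait and run_event_demo are hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical type: mutex + condition variable + sticky signaled flag. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} manual_event;

void event_init(manual_event *e, bool initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent() on a manual-reset event: wake all waiters, stay signaled. */
void event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(): later waiters block until the next event_set(). */
void event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Block until the event is signaled; the loop absorbs spurious wakeups. */
void event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

void event_destroy(manual_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}

static manual_event start_gate;
static int released;               /* protected by start_gate.lock */

static void *waiter(void *unused)
{
    (void)unused;
    event_wait(&start_gate);
    pthread_mutex_lock(&start_gate.lock);
    released++;
    pthread_mutex_unlock(&start_gate.lock);
    return NULL;
}

/* Hypothetical demo: release nthreads waiters with a single event_set(). */
int run_event_demo(int nthreads)
{
    pthread_t tid[8];
    int started = 0;

    if (nthreads < 1 || nthreads > 8)
        return -1;
    event_init(&start_gate, false);
    released = 0;
    while (started < nthreads &&
           pthread_create(&tid[started], NULL, waiter, NULL) == 0)
        started++;
    event_set(&start_gate);
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    event_destroy(&start_gate);
    return released;
}
```

Because the flag stays set until event_reset(), a thread that calls event_wait() after event_set() returns immediately, which is the behavior a bare pthread_cond_broadcast() cannot provide.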
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe )
[Figure: main connects prog1 and prog2 through a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
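The one-way, byte-stream behavior can be tried out with the POSIX pipe() call, used here only as an analog of CreatePipe(). The helper pipe_roundtrip is a hypothetical name; the sketch assumes the message fits in the pipe buffer, since a write to a full pipe would block this single-threaded example.

```c
#include <string.h>
#include <unistd.h>

/* Write msg into a fresh pipe and read it back into out.
   Returns the number of bytes read, or -1 on error. */
long pipe_roundtrip(const char *msg, char *out, size_t outsize)
{
    int fd[2];                      /* fd[0] = read end, fd[1] = write end */
    size_t len = strlen(msg);
    size_t total = 0;
    ssize_t n;

    if (outsize == 0 || pipe(fd) != 0)
        return -1;
    /* Assumption: len is smaller than the pipe buffer, so write() cannot block. */
    if (write(fd[1], msg, len) != (ssize_t)len) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);                   /* writer done: the reader will see end of file */
    while (total < outsize - 1 &&
           (n = read(fd[0], out + total, outsize - 1 - total)) > 0)
        total += (size_t)n;
    close(fd[0]);
    out[total] = '\0';
    return (long)total;
}
```

Data flows only from the write end to the read end; a second pipe would be needed for replies in the other direction.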
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same pipe name; each client uses its own instance
- Server can respond to client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa )
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on remote machine (. - local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive - is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage )
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa )
- Combines a WriteFile, ReadFile sequence
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut )
- Combines CreateFile, WriteFile, ReadFile, CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo )
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if client connected between CreateNamedPipe() call and ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many comm.)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
Locate a server via mailslot
[Figure: application clients 0..n act as mailslot servers; the application server acts as mailslot client]
Mailslot servers (App client 0 ... App client n):
hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile(hMS, &ServStat); /* connect to server */
Mailslot client (App server):
while (...) { Sleep(...); hMS = CreateFile( "\\*\mailslot\status" ); WriteFile(hMS, &StatInfo); }
Message is sent periodically
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa )
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum msg size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0 - immediate return
- MAILSLOT_WAIT_FOREVER - infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max. msg. len. 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
Mutual Exclusion - Hardware Support
Interrupt Disabling
- Concurrent threads cannot overlap on a uniprocessor
- Thread will run until performing a system call or interrupt happens
Special Atomic Machine Instructions
- Test and Set Instruction - read & write a memory location
- Exchange Instruction - swap register and memory location
Problems with Machine-Instruction Approach
- Busy waiting
- Starvation is possible
- Deadlock is possible
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean &target) {
    boolean rv = target;
    target = true;
    return rv;
}
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti:
do {
    while (TestAndSet(lock)) ;
    critical section
    lock = false;
    remainder section
} while (1);
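In portable C11 the test-and-set idea maps onto atomic_flag, whose atomic_flag_test_and_set() returns the previous value just as TestAndSet() does. The following is a minimal user-space sketch; spin_lock, spin_unlock, adder and run_spin_demo are hypothetical names, and the cap of 8 threads is an arbitrary assumption.

```c
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

static atomic_flag lock_flag = ATOMIC_FLAG_INIT;  /* clear == lock is free */
static long shared_total;                          /* protected by lock_flag */

/* Entry section: spin until the previous value of the flag was "clear". */
static void spin_lock(void)
{
    while (atomic_flag_test_and_set_explicit(&lock_flag, memory_order_acquire))
        ;                                          /* busy waiting */
}

/* Exit section: corresponds to "lock = false" on the slide. */
static void spin_unlock(void)
{
    atomic_flag_clear_explicit(&lock_flag, memory_order_release);
}

static void *adder(void *arg)
{
    long iterations = *(const long *)arg;
    for (long i = 0; i < iterations; i++) {
        spin_lock();
        shared_total++;                            /* critical section */
        spin_unlock();
    }
    return NULL;
}

/* Hypothetical demo: nthreads threads each add `iterations` to the total. */
long run_spin_demo(int nthreads, long iterations)
{
    pthread_t tid[8];
    int started = 0;

    if (nthreads < 1 || nthreads > 8)
        return -1;
    shared_total = 0;
    while (started < nthreads &&
           pthread_create(&tid[started], NULL, adder, &iterations) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return shared_total;
}
```

The sketch gives mutual exclusion but, like the slide's version, no bounded waiting: a thread can lose the race for the flag indefinitely.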
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti:
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
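The Swap-based entry section can be approximated with C11 atomic_exchange(), which stores a new value and returns the old one in a single atomic step. This is a sketch under that assumption; swap_try_lock, swap_lock and swap_unlock are hypothetical names.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One attempt: swap key = 1 into the lock word; success if it held 0. */
bool swap_try_lock(atomic_int *lock)
{
    int key = 1;
    key = atomic_exchange(lock, key);
    return key == 0;
}

/* Entry section from the slide: keep swapping until key comes back as 0. */
void swap_lock(atomic_int *lock)
{
    int key = 1;
    while (key == 1)
        key = atomic_exchange(lock, 1);
}

/* Exit section: lock = 0. */
void swap_unlock(atomic_int *lock)
{
    atomic_store(lock, 0);
}
```

A thread would call swap_lock() before its critical section and swap_unlock() after it; swap_try_lock() is the non-spinning variant for callers that prefer to do other work when the lock is busy.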
Semaphores
Semaphore S - integer variable
- can only be accessed via two atomic operations
wait (S):
    while (S <= 0) ;
    S--;
signal (S):
    S++;
Critical Section of n Threads
Shared data:
    semaphore mutex; /* initially mutex = 1 */
Thread Ti:
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
Semaphore Implementation
Semaphores may suspend/resume threads
- Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
- suspend() suspends the thread that invokes it
- resume(T) resumes the execution of a blocked thread T
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
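A blocking semaphore of this kind can be sketched in user space with a pthread mutex and condition variable, where the condition variable's wait queue stands in for the list S.L. Note one deliberate difference from the slide: this version never lets the value go negative; waiters simply block while it is 0. The names csem, csem_wait, csem_signal and csem_value are hypothetical.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical counting semaphore; `nonzero` plays the role of the list S.L. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonzero;
    int             value;     /* kept >= 0 in this variant */
} csem;

void csem_init(csem *s, int initial)
{
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->value = initial;
}

/* wait(S): suspend the caller (no busy waiting) until a unit is available. */
void csem_wait(csem *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->value == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);   /* suspend() */
    s->value--;
    pthread_mutex_unlock(&s->lock);
}

/* signal(S): release one unit and resume one suspended thread, if any. */
void csem_signal(csem *s)
{
    pthread_mutex_lock(&s->lock);
    s->value++;
    pthread_cond_signal(&s->nonzero);               /* resume(T) */
    pthread_mutex_unlock(&s->lock);
}

/* Read the current count (for inspection only; it may change right after). */
int csem_value(csem *s)
{
    pthread_mutex_lock(&s->lock);
    int v = s->value;
    pthread_mutex_unlock(&s->lock);
    return v;
}
```

With an initial value of 1 the same type serves as the mutex semaphore of the n-thread critical section example.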
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code:
    Ti              Tj
    A               wait(flag)
    signal(flag)    B
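The ordering pattern can be tried with POSIX unnamed semaphores, assuming a platform where sem_init() is supported (Linux, for example). In this sketch thread_i, thread_j, trace and run_order_demo are hypothetical names; Tj is started first on purpose to show that B still runs after A.

```c
#include <errno.h>
#include <pthread.h>
#include <semaphore.h>
#include <stddef.h>

static sem_t flag;        /* initialized to 0, so Tj must wait for Ti */
static char  trace[2];    /* records the order in which A and B ran */
static int   trace_len;

/* Ti: perform A, then signal(flag). */
static void *thread_i(void *unused)
{
    (void)unused;
    trace[trace_len++] = 'A';
    sem_post(&flag);
    return NULL;
}

/* Tj: wait(flag), then perform B. */
static void *thread_j(void *unused)
{
    (void)unused;
    while (sem_wait(&flag) == -1 && errno == EINTR)
        ;                                  /* retry if interrupted by a signal */
    trace[trace_len++] = 'B';
    return NULL;
}

/* Returns 1 if A ran before B, 0 otherwise, -1 on setup failure. */
int run_order_demo(void)
{
    pthread_t ti, tj;

    trace_len = 0;
    if (sem_init(&flag, 0, 0) != 0)
        return -1;
    if (pthread_create(&tj, NULL, thread_j, NULL) != 0) {
        sem_destroy(&flag);
        return -1;
    }
    if (pthread_create(&ti, NULL, thread_i, NULL) != 0) {
        sem_post(&flag);                   /* unblock Tj so it can be joined */
        pthread_join(tj, NULL);
        sem_destroy(&flag);
        return -1;
    }
    pthread_join(ti, NULL);
    pthread_join(tj, NULL);
    sem_destroy(&flag);
    return trace_len == 2 && trace[0] == 'A' && trace[1] == 'B';
}
```

The write to trace in Ti happens before sem_post(), and the write in Tj happens after sem_wait() returns, so the two writes never overlap.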
Two Types of Semaphores
Counting semaphore
- integer value can range over an unrestricted domain
Binary semaphore
- integer value can range only between 0 and 1
- can be simpler to implement
Counting semaphore S can be implemented using binary semaphores
Deadlock and Starvation
Deadlock
- two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
    T0            T1
    wait(S)       wait(Q)
    wait(Q)       wait(S)
    signal(S)     signal(Q)
    signal(Q)     signal(S)
Deadlock and Starvation
Starvation - indefinite blocking
- A thread may never be removed from the semaphore queue in which it is suspended
Solution to deadlock
- all code should acquire/release semaphores in same order
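One way to enforce a single acquisition order is to sort the locks by address before taking them, so that T0 and T1 can name S and Q in either order without risking the circular wait shown earlier. The sketch below uses pthread mutexes in place of semaphores; lock_pair, unlock_pair and run_ordering_demo are hypothetical names, and lock_pair assumes its two arguments are distinct.

```c
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

static pthread_mutex_t S = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t Q = PTHREAD_MUTEX_INITIALIZER;
static long both_held;     /* updated only while S and Q are both held */

/* Acquire two distinct mutexes in a global order (lower address first). */
static void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *t = a;
        a = b;
        b = t;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

/* T0 names the locks as (S, Q). */
static void *t0(void *arg)
{
    long n = *(const long *)arg;
    for (long i = 0; i < n; i++) {
        lock_pair(&S, &Q);
        both_held++;
        unlock_pair(&S, &Q);
    }
    return NULL;
}

/* T1 names them as (Q, S); lock_pair still acquires them in the same order. */
static void *t1(void *arg)
{
    long n = *(const long *)arg;
    for (long i = 0; i < n; i++) {
        lock_pair(&Q, &S);
        both_held++;
        unlock_pair(&Q, &S);
    }
    return NULL;
}

/* Runs both threads to completion and returns the number of increments. */
long run_ordering_demo(long iterations)
{
    pthread_t a, b;

    both_held = 0;
    if (pthread_create(&a, NULL, t0, &iterations) != 0)
        return -1;
    if (pthread_create(&b, NULL, t1, &iterations) != 0) {
        pthread_join(a, NULL);
        return -1;
    }
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return both_held;
}
```

Without the address comparison in lock_pair, the two threads would reproduce the wait(S)/wait(Q) versus wait(Q)/wait(S) interleaving and could block each other forever.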
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events. An event acts much like a condition variable
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads) and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
- Chapter 7 - Process Synchronization
- Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
- Scheduled, preempted, etc., like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
- Interrupt dispatcher invokes the interrupt service routine
- ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread
Trap dispatching
Trap: processor's mechanism to capture executing thread
- Switch from user to kernel mode
- Interrupts - asynchronous
- Exceptions - synchronous
[Figure: trap handlers]
- Interrupt -> Interrupt dispatcher -> Interrupt service routines
- System service call -> System service dispatcher -> System services
- HW exceptions, SW exceptions -> Exception dispatcher -> Exception handlers
- Virtual address exceptions -> Virtual memory manager's pager
Interrupt Dispatching
[Figure: an interrupt arrives while user or kernel mode code is running; control passes to kernel mode code]
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
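Interrupt Precedence via IRQLs (x86)
IRQL  Name                          Category
31    High                          Hardware interrupts
30    Power fail                    Hardware interrupts
29    Interprocessor Interrupt      Hardware interrupts
28    Clock                         Hardware interrupts
      Profile & Synch (Srv 2003)    Hardware interrupts
      Device 1 ...                  Hardware interrupts
2     Dispatch/DPC                  Deferrable software interrupts
1     APC                           Deferrable software interrupts
0     Passive/Low                   Normal thread execution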
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library - Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected with the same IRQL
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from other processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update system's clock, allocation of CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler - Kernprof.exe, Res Kit)
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all, normal thread execution
IRQLs on 64-bit Systems
[Figure: IRQL tables for x64 and IA64, levels 0 to 15; entries listed here from lowest to highest]
x64
- Passive/Low
- APC
- Dispatch/DPC
- Device 1 ... Device n
- Synch (Srv 2003)
- Clock
- Interprocessor Interrupt/Power
- High/Profile
IA64
- Passive/Low
- APC
- Dispatch/DPC & Synch (UP only)
- Correctable Machine Check
- Device 1 ... Device n
- Synch (MP only)
- Clock
- Interprocessor Interrupt
- High/Profile/Power
Interrupt Prioritization & Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device; can be configured with IntFilter utility in Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
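The two IRQL formulas can be written as plain arithmetic. This is only an illustration of the rules as read from this slide, assuming the x64/IA64 rule is an integer division of the vector number by 16; the function names are hypothetical and are not part of any Windows API.

```c
/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_from_irq_x86_up(int irq)
{
    return 27 - irq;
}

/* x64 / IA64 rule, assumed to be integer division: IRQL = vector / 16. */
int irql_from_vector_x64(int vector)
{
    return vector / 16;
}
```

Under these rules IRQ 0 maps to IRQL 27 on an x86 uniprocessor, and every group of 16 consecutive IDT vectors shares one IRQL on x64 and IA64.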
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when kernel is deep within many layers of code
- Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG_PTR
- To implement synchronization, a single bit is sufficient
Kernel Synchronization
[Figure: Processor A and Processor B both contend for the spinlock that guards the DPC queue]
Each processor runs:
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin /* critical section */
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Queued Spinlocks
Problem: Checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
- Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the flag of the first waiting processor is modified
- Exactly one processor is being signaled
- Pre-determined wait order
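The queueing idea can be illustrated with an MCS-style lock in C11 atomics, in which each waiter spins on a flag inside its own queue node and the releasing owner signals exactly its successor. This is a user-space sketch of the general technique, not the Windows kernel implementation; qnode, qlock, qlock_acquire, qlock_release and run_qlock_demo are hypothetical names, and the sched_yield() calls are a user-space courtesy that a kernel spinlock would not use.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiting thread; `locked` is the thread-local spin flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool             locked;
} qnode;

typedef struct {
    _Atomic(qnode *) tail;     /* last waiter in the queue, NULL if lock is free */
} qlock;

void qlock_acquire(qlock *l, qnode *me)
{
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    qnode *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev == NULL)
        return;                                    /* lock was free: we own it */
    atomic_store(&prev->next, me);                 /* link behind predecessor */
    while (atomic_load(&me->locked))               /* spin on our own flag only */
        sched_yield();
}

void qlock_release(qlock *l, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;                                /* nobody waiting */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();                         /* a waiter is still linking in */
    }
    atomic_store(&succ->locked, false);            /* signal exactly one waiter */
}

static qlock demo_lock;        /* static storage: tail starts as NULL */
static long  demo_total;       /* protected by demo_lock */

static void *qworker(void *arg)
{
    long iterations = *(const long *)arg;
    qnode me;                                      /* this thread's queue node */
    for (long i = 0; i < iterations; i++) {
        qlock_acquire(&demo_lock, &me);
        demo_total++;
        qlock_release(&demo_lock, &me);
    }
    return NULL;
}

/* Hypothetical demo: nthreads threads each add `iterations` to the total. */
long run_qlock_demo(int nthreads, long iterations)
{
    pthread_t tid[8];
    int started = 0;

    if (nthreads < 1 || nthreads > 8)
        return -1;
    demo_total = 0;
    while (started < nthreads &&
           pthread_create(&tid[started], NULL, qworker, &iterations) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return demo_total;
}
```

Because waiters are linked in arrival order, the lock is handed over first-in, first-out, which corresponds to the pre-determined wait order noted above.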
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU; allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once; "all": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include optional timeout argument
- Waiting threads consume no CPU time
Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset, may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Figure: thread state diagram showing the interaction with thread scheduling]
- States: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated
- Events shown: create and initialize thread object; thread waits on an object handle; set object to signaled state; wait is complete
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread - it may preempt the running thread if that thread has lower priority, and issues software interrupt to initiate context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: owning thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: owning thread or other thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect: kernel resumes one or more waiting threads
Event
- nonsignaled -> signaled: a thread sets the event
- signaled -> nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: dedicated thread sets one event in the event pair
- signaled -> nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes waiting dedicated thread
Timer
- nonsignaled -> signaled: timer expires
- signaled -> nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: thread terminates
- signaled -> nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: DPC queue; the queue head links a chain of DPC objects]
Delivering a DPC
[Figure: interrupt dispatch table with IRQLs from High and Power failure down to Dispatch/DPC, APC and Low; DPC objects wait in the DPC queue]
1. Timer expires; kernel queues DPC that will release all waiting threads. Kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel (K) and user (U), each linking APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 & not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
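The four accounting rules amount to a small decision function. The sketch below only restates the slide's rules in C; the enum charge_target and the function clock_tick_charge are hypothetical names and do not correspond to actual kernel code.

```c
#include <stdbool.h>

/* Hypothetical labels for where one clock tick is charged. */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* irql:      IRQL of the processor when the clock interrupt arrived
   in_dpc:    true if a DPC routine was being processed
   user_mode: true if the interrupted thread was running in user mode */
charge_target clock_tick_charge(int irql, bool in_dpc, bool user_mode)
{
    if (irql > 2)
        return CHARGE_INTERRUPT;                   /* IRQL > 2 */
    if (irql == 2)                                 /* IRQL = 2 */
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: charge the thread's user or kernel time */
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

For example, a tick that arrives at IRQL 2 outside a DPC is charged to the current thread's kernel time, while a tick at any device IRQL counts as interrupt time.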
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (Size, Type, State, Wait list head, object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type, Next link]
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    } __finally {
        /* leave exactly once, even if the guarded code raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads:
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
  dwTimeout == 0 - no wait, check whether object is signaled
  dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles - pointer to array identifying these objects
- bWaitAll - whether to wait for first signaled object or all objects
  Function returns index of first signaled object (as an offset from WAIT_OBJECT_0)
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
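A minimal pthreads sketch of the comparison above: a blocking lock around a shared counter, plus a single non-blocking poll. The names run_counter_demo and poll_lock_once are made up for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;               /* protected by `lock` */

static void *worker(void *arg)
{
    long n = *(const long *)arg;
    for (long i = 0; i < n; i++) {
        pthread_mutex_lock(&lock);     /* blocks, like an INFINITE wait */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Runs two workers and returns the final counter value. */
long run_counter_demo(long per_thread)
{
    pthread_t t[2];
    counter = 0;
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, &per_thread);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return counter;
}

/* Polls the lock once: 1 if it was acquired (and released again), 0 if busy. */
int poll_lock_once(void)
{
    int rc = pthread_mutex_trylock(&lock);   /* like a wait with timeout 0 */
    if (rc == EBUSY)
        return 0;
    assert(rc == 0);
    pthread_mutex_unlock(&lock);
    return 1;
}
```

pthread_mutex_trylock() reports a busy lock through its return value (EBUSY) rather than through a timeout code.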
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value
- ReleaseSemaphore() returns old semaphore count
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE - create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
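Since pthreads has no direct manual-reset event, one common workaround pairs a condition variable with a flag. The sketch below is an assumption-laden illustration, not a full equivalent of the Windows object. The struct event type and the event_* functions are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical manual-reset event: mutex + flag + condition variable. */
struct event {
    pthread_mutex_t mu;
    pthread_cond_t cv;
    bool signaled;
};

void event_init(struct event *e)
{
    pthread_mutex_init(&e->mu, NULL);
    pthread_cond_init(&e->cv, NULL);
    e->signaled = false;
}

void event_set(struct event *e)
{
    pthread_mutex_lock(&e->mu);
    e->signaled = true;
    pthread_cond_broadcast(&e->cv);    /* release every waiter */
    pthread_mutex_unlock(&e->mu);
}

void event_reset(struct event *e)
{
    pthread_mutex_lock(&e->mu);
    e->signaled = false;
    pthread_mutex_unlock(&e->mu);
}

void event_wait(struct event *e)
{
    pthread_mutex_lock(&e->mu);
    while (!e->signaled)               /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cv, &e->mu);
    pthread_mutex_unlock(&e->mu);
}

static struct event start;
static int released;                   /* protected by start.mu */

static void *waiter(void *arg)
{
    (void)arg;
    event_wait(&start);
    pthread_mutex_lock(&start.mu);
    released++;
    pthread_mutex_unlock(&start.mu);
    return NULL;
}

/* Starts n waiters (at most 8), sets the event once, returns how many ran. */
int run_event_demo(int n)
{
    pthread_t t[8];
    assert(n >= 0 && n <= 8);
    event_init(&start);
    released = 0;
    for (int i = 0; i < n; i++)
        pthread_create(&t[i], NULL, waiter, NULL);
    event_set(&start);
    for (int i = 0; i < n; i++)
        pthread_join(t[i], NULL);
    return released;
}
```

The flag is what makes the event stay signaled for threads that arrive late. A bare pthread_cond_broadcast() would only wake threads already waiting.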
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates prog1 and prog2, which are connected by a pipe]
Half-duplex character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (cont'd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using instances of the same pipe
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on remote machine ("." - local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (cont'd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive - is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
CreateFile, WriteFile, ReadFile, CloseHandle sequence
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns false if client connected between CreateNamedPipe call and ConnectNamedPipe() (GetLastError() then returns ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
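A small POSIX sketch of the FIFO side of this comparison: create a FIFO with mkfifo() and push one fixed-size record through it. The record length, the function name fifo_round_trip and the single-process setup are assumptions made to keep the example short. A real client and server would be separate processes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_LEN 32   /* fixed-size record, since FIFOs are byte-oriented */

/* Sends one fixed-size record through a FIFO at `path` and reads it back.
   Returns 0 on success, -1 on any error. */
int fifo_round_trip(const char *path, const char *msg, char out[RECORD_LEN])
{
    char rec[RECORD_LEN] = {0};
    int rfd = -1, wfd = -1, rc = -1;

    unlink(path);                      /* drop a stale FIFO from an earlier run */
    if (mkfifo(path, 0600) != 0)
        return -1;
    /* A non-blocking open of the read end returns without waiting for a writer. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd >= 0)
        wfd = open(path, O_WRONLY);    /* succeeds because a reader exists */
    if (wfd >= 0) {
        strncpy(rec, msg, RECORD_LEN - 1);
        if (write(wfd, rec, RECORD_LEN) == RECORD_LEN &&
            read(rfd, out, RECORD_LEN) == RECORD_LEN)
            rc = 0;
    }
    if (wfd >= 0) close(wfd);
    if (rfd >= 0) close(rfd);
    unlink(path);
    return rc;
}
```

Writing whole fixed-size records keeps message boundaries intact without message mode. POSIX guarantees atomicity only for writes of at most PIPE_BUF bytes.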
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many comm.)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile() - open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
Locate a server via mailslot
[Figure: the application clients (App client 0 ... App client n) act as mailslot servers; the application server acts as mailslot client and sends its status message periodically]
Mailslot servers (App client 0 ... App client n):
hMS = CreateMailslot( "\\.\mailslot\status" );
ReadFile( hMS, &ServStat );
/* connect to server */
Mailslot client (App server):
while (...) {
    Sleep(...);
    hMS = CreateFile( "\\*\mailslot\status" );
    WriteFile( hMS, &StatInfo );
}
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is msg size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0 - immediate return
- MAILSLOT_WAIT_FOREVER - infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max. msg. len. 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
Synchronization Hardware
Test and modify the content of a word atomically
boolean TestAndSet(boolean &target) {
    boolean rv = target;
    target = true;
    return rv;
}
Mutual Exclusion with Test-and-Set
Shared data: boolean lock = false;
Thread Ti:
do {
    while (TestAndSet(lock)) ;
    critical section
    lock = false;
    remainder section
} while (1);
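On current compilers the test-and-set idea maps onto C11's atomic_flag. A minimal sketch, assuming a C11 toolchain with pthreads. The demo function and counter names are invented for the illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_flag lock = ATOMIC_FLAG_INIT;   /* clear == lock is free */
static long shared_counter = 0;               /* protected by `lock` */

static void spin_acquire(void)
{
    /* Returns the previous value, exactly like TestAndSet(lock). */
    while (atomic_flag_test_and_set_explicit(&lock, memory_order_acquire))
        ;   /* busy-wait until the previous value was "false" */
}

static void spin_release(void)
{
    atomic_flag_clear_explicit(&lock, memory_order_release);  /* lock = false */
}

static void *worker(void *arg)
{
    long n = *(const long *)arg;
    for (long i = 0; i < n; i++) {
        spin_acquire();
        shared_counter++;          /* critical section */
        spin_release();
    }
    return NULL;
}

/* Runs two threads that each add `n` to the counter; returns the total. */
long run_tas_demo(long n)
{
    pthread_t t[2];
    shared_counter = 0;
    for (int i = 0; i < 2; i++)
        pthread_create(&t[i], NULL, worker, &n);
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return shared_counter;
}
```

Without the lock, the two unsynchronized increments could interleave and lose updates. With it, the total is always 2 * n.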
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti:
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
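The same protocol can be written with C11's atomic_exchange, which stores a new value and hands back the old one in a single atomic step. This is a sketch under the assumption of a C11 compiler with pthreads. run_swap_demo and the variable names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

static atomic_int lock_word = 0;    /* shared: 0 = free, 1 = held */
static long total = 0;              /* protected by lock_word */

static void acquire(void)
{
    int key = 1;
    /* Swap(lock, key): store 1, get the old lock value back into key. */
    while (key == 1)
        key = atomic_exchange_explicit(&lock_word, 1, memory_order_acquire);
}

static void release(void)
{
    atomic_store_explicit(&lock_word, 0, memory_order_release);  /* lock = 0 */
}

static void *worker(void *arg)
{
    long n = *(const long *)arg;
    for (long i = 0; i < n; i++) {
        acquire();
        total++;                    /* critical section */
        release();
    }
    return NULL;
}

/* Runs `threads` workers (at most 8); returns the final total. */
long run_swap_demo(int threads, long n)
{
    pthread_t t[8];
    assert(threads >= 1 && threads <= 8);
    total = 0;
    for (int i = 0; i < threads; i++)
        pthread_create(&t[i], NULL, worker, &n);
    for (int i = 0; i < threads; i++)
        pthread_join(t[i], NULL);
    return total;
}
```

A thread enters only when the exchange returns 0, meaning it was the one that flipped the word from free to held.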
Semaphores
Semaphore S - integer variable
can only be accessed via two atomic operations
wait (S):
    while (S <= 0) ;
    S--;
signal (S):
    S++;
Critical Section of n Threads
Shared data:
semaphore mutex; // initially mutex = 1
Thread Ti:
do {
    wait(mutex);
    critical section
    signal(mutex);
    remainder section
} while (1);
Semaphore Implementation
Semaphores may suspend/resume threads
- Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
- suspend() suspends the thread that invokes it
- resume(T) resumes the execution of a blocked thread T
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
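A blocking semaphore in the spirit of the definition above can be sketched with a pthread mutex and condition variable. Note the deliberate difference: here the value never drops below zero and a separate waiters count stands in for the list L. All names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical blocking semaphore; `waiters` plays the role of the list L. */
struct semaphore {
    pthread_mutex_t mu;
    pthread_cond_t cv;
    int value;       /* never negative in this variant */
    int waiters;     /* threads currently inside sem_down() */
};

void sem_setup(struct semaphore *s, int initial)
{
    pthread_mutex_init(&s->mu, NULL);
    pthread_cond_init(&s->cv, NULL);
    s->value = initial;
    s->waiters = 0;
}

void sem_down(struct semaphore *s)           /* wait(S) */
{
    pthread_mutex_lock(&s->mu);
    s->waiters++;
    while (s->value == 0)
        pthread_cond_wait(&s->cv, &s->mu);   /* suspend() without busy waiting */
    s->waiters--;
    s->value--;
    pthread_mutex_unlock(&s->mu);
}

void sem_up(struct semaphore *s)             /* signal(S) */
{
    pthread_mutex_lock(&s->mu);
    s->value++;
    if (s->waiters > 0)
        pthread_cond_signal(&s->cv);         /* resume(T) for one waiter */
    pthread_mutex_unlock(&s->mu);
}

static struct semaphore guard;
static long hits;                            /* protected by `guard` */

static void *worker(void *arg)
{
    int n = *(const int *)arg;
    for (int i = 0; i < n; i++) {
        sem_down(&guard);
        hits++;                              /* critical section */
        sem_up(&guard);
    }
    return NULL;
}

/* Runs `threads` workers (at most 8) behind a semaphore initialized to 1. */
long run_sem_demo(int threads, int n)
{
    pthread_t t[8];
    assert(threads >= 1 && threads <= 8);
    sem_setup(&guard, 1);
    hits = 0;
    for (int i = 0; i < threads; i++)
        pthread_create(&t[i], NULL, worker, &n);
    for (int i = 0; i < threads; i++)
        pthread_join(t[i], NULL);
    return hits;
}
```

Initialized to 1 the semaphore acts as a mutex. A larger initial value would admit that many threads at once.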
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code:
Ti               Tj
A                wait(flag)
signal(flag)     B
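The A-before-B pattern can be tried out with an unnamed POSIX semaphore. A minimal sketch, assuming a platform where sem_init() is supported (Linux, for example). The trace buffer and function names are invented for the demonstration.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <string.h>

static sem_t flag;          /* initialized to 0 in run_order_demo() */
static char trace[3];       /* records the order in which A and B ran */
static int pos;

static void *thread_i(void *arg)
{
    (void)arg;
    trace[pos++] = 'A';     /* statement A */
    sem_post(&flag);        /* signal(flag) */
    return NULL;
}

static void *thread_j(void *arg)
{
    (void)arg;
    sem_wait(&flag);        /* wait(flag): blocks until A has run */
    trace[pos++] = 'B';     /* statement B */
    return NULL;
}

/* Starts Tj before Ti on purpose; returns the observed execution order. */
const char *run_order_demo(void)
{
    pthread_t ti, tj;
    pos = 0;
    memset(trace, 0, sizeof trace);
    sem_init(&flag, 0, 0);
    pthread_create(&tj, NULL, thread_j, NULL);
    pthread_create(&ti, NULL, thread_i, NULL);
    pthread_join(ti, NULL);
    pthread_join(tj, NULL);
    sem_destroy(&flag);
    return trace;
}
```

Even though Tj is created first, it cannot pass wait(flag) until Ti has executed A and signaled.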
Two Types of Semaphores
Counting semaphore
- integer value can range over an unrestricted domain
Binary semaphore
- integer value can range only between 0 and 1
- can be simpler to implement
Counting semaphore S can be implemented using binary semaphores
Deadlock and Starvation
Deadlock
- two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0               T1
wait(S);         wait(Q);
wait(Q);         wait(S);
signal(S);       signal(Q);
signal(Q);       signal(S);
Deadlock and Starvation
Starvation - indefinite blocking
- A thread may never be removed from the semaphore queue in which it is suspended
Solution
- all code should acquire/release semaphores in same order
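One way to enforce a single acquisition order is to sort the locks by address before taking them. The sketch below applies that to two pthread mutexes. The account type and the transfer scenario are hypothetical and only serve to give the two threads opposite argument orders.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

struct account {
    pthread_mutex_t mu;
    long balance;
};

/* Lock both accounts in one global order (by address), whatever the
   argument order. Requires a != b. */
static void lock_pair(struct account *a, struct account *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        struct account *t = a; a = b; b = t;
    }
    pthread_mutex_lock(&a->mu);
    pthread_mutex_lock(&b->mu);
}

static void unlock_pair(struct account *a, struct account *b)
{
    pthread_mutex_unlock(&a->mu);
    pthread_mutex_unlock(&b->mu);
}

void transfer(struct account *from, struct account *to, long amount)
{
    lock_pair(from, to);
    from->balance -= amount;
    to->balance += amount;
    unlock_pair(from, to);
}

static struct account s = { PTHREAD_MUTEX_INITIALIZER, 1000 };
static struct account q = { PTHREAD_MUTEX_INITIALIZER, 1000 };

static void *s_to_q(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++) transfer(&s, &q, 1);
    return NULL;
}

static void *q_to_s(void *arg)
{
    (void)arg;
    for (int i = 0; i < 10000; i++) transfer(&q, &s, 1);
    return NULL;
}

/* Two threads transfer in opposite directions; returns the combined balance. */
long run_transfer_demo(void)
{
    pthread_t t0, t1;
    pthread_create(&t0, NULL, s_to_q, NULL);
    pthread_create(&t1, NULL, q_to_s, NULL);
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    return s.balance + q.balance;
}
```

If each thread simply locked its own "from" account first, T0 and T1 could each hold one lock and wait forever for the other, which is the S/Q deadlock pattern.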
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events; an event acts much like a condition variable
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
- Chapter 7 - Process Synchronization
- Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons:
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
- Scheduled, preempted, etc., like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
- interrupt dispatcher invokes the interrupt service routine
- ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread
Trap dispatching
Trap: processor's mechanism to capture executing thread
- Switch from user to kernel mode
- Interrupts - asynchronous
- Exceptions - synchronous
[Figure: trap handlers. Interrupt -> interrupt dispatcher -> interrupt service routines; system service call -> system service dispatcher -> system services; HW exceptions / SW exceptions -> exception dispatcher -> exception handlers; virtual address exceptions -> virtual memory manager's pager]
Interrupt Dispatching
[Figure: an interrupt arrives while user or kernel mode code is running; control passes to kernel mode code]
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- this masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
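Interrupt Precedence via IRQLs (x86)
IRQL   Name                          Category
31     High                          Hardware interrupts
30     Power fail                    Hardware interrupts
29     Interprocessor Interrupt      Hardware interrupts
28     Clock                         Hardware interrupts
27     Profile & Synch (Srv 2003)    Hardware interrupts
3-26   Device 1 ... Device n         Hardware interrupts
2      Dispatch/DPC                  Deferrable software interrupts
1      APC                           Deferrable software interrupts
0      Passive/Low                   normal thread execution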
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library - Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
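The vector lookup and the raise/lower of IRQL around the ISR can be modelled in a few lines of C. This is a toy model, not how the Windows kernel is written. The table size, the function names and the choice to drop masked interrupts (real hardware keeps them pending) are simplifying assumptions.

```c
#include <assert.h>
#include <stddef.h>

#define IDT_SIZE 256

typedef void (*isr_t)(void);

struct idt_entry {
    isr_t isr;   /* interrupt service routine, NULL if not connected */
    int irql;    /* IRQL assigned to this interrupt source */
};

static struct idt_entry idt[IDT_SIZE];
static int current_irql = 0;   /* processor state; 0 = passive */
static int irql_seen = -1;     /* IRQL observed inside the last ISR */

static void demo_isr(void)
{
    irql_seen = current_irql;  /* the ISR runs at its interrupt's IRQL */
}

void connect_interrupt(int vector, isr_t isr, int irql)
{
    idt[vector].isr = isr;
    idt[vector].irql = irql;
}

/* Returns 1 if the ISR ran, 0 if the interrupt is masked or unconnected.
   Simplification: a masked interrupt is dropped here instead of pending. */
int dispatch_interrupt(int vector)
{
    struct idt_entry *e = &idt[vector];   /* vector indexes the IDT */
    if (e->isr == NULL || e->irql <= current_irql)
        return 0;                         /* equal and lower IRQLs are masked */
    int saved = current_irql;
    current_irql = e->irql;               /* raise IRQL */
    e->isr();
    current_irql = saved;                 /* lower IRQL to the initial level */
    return 1;
}
```

Usage: connect an ISR to a vector with connect_interrupt(), then call dispatch_interrupt() with that vector. The call is refused while current_irql is at or above the source's IRQL.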
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected to the same IRQL
Predefined IRQLs
High
- used when halting the system (via KeBugCheck())
Power fail
- originated in the NT design document, but has never been used
Inter-processor interrupt
- used to request action from other processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update system's clock, allocation of CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler - Kernprof.exe, Res Kit)
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all, normal thread execution
IRQLs on 64-bit Systems
IRQL   x64                               IA64
15     High/Profile                      High/Profile/Power
14     Interprocessor Interrupt/Power    Interprocessor Interrupt
13     Clock                             Clock
12     Synch (Srv 2003)                  Synch (MP only)
...    Device n                          Device n
4      ...                               Device 1
3      Device 1                          Correctable Machine Check
2      Dispatch/DPC                      Dispatch/DPC & Synch (UP only)
1      APC                               APC
0      Passive/Low                       Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows:
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device
  Can be configured with IntFilter utility in Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source
  Processors are assigned round robin for each interrupt vector
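The two deterministic mapping rules are simple arithmetic. The helper names below are made up, and the division rule is taken to mean integer division of the vector number by 16.

```c
#include <assert.h>

/* x86 uniprocessor rule: IRQL = 27 - IRQ. */
int irql_x86_up(int irq)
{
    return 27 - irq;
}

/* x64 / IA64 rule: IRQL = IDT vector number / 16 (integer division). */
int irql_from_vector(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

For example, IRQ 1 on an x86 uniprocessor maps to IRQL 26, and vector 0x51 (decimal 81) on x64 maps to IRQL 5.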
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Sync on MP: use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
Kernel Synchronization
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
[Figure: Processor A and Processor B contend for the spinlock that guards the DPC queue; each runs the sequence below]
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin    /* critical section */
    remove DPC from queue
end
release_spinlock(DPC)
Queued Spinlocks
Problem: Checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
- Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the flag of the first waiting processor in the queue is modified
- Exactly one processor is being signaled
- Pre-determined wait order
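The idea of spinning on a local flag can be illustrated with an MCS-style queue lock in C11 atomics. This is a swapped-in textbook technique, not the Windows implementation. Threads stand in for processors, all names are hypothetical, and sched_yield() is added only so the demo behaves on machines with few cores.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool locked;                  /* this waiter's local flag */
};

static _Atomic(struct qnode *) tail = NULL;  /* last waiter; NULL = lock free */
static long protected_counter = 0;

static void q_acquire(struct qnode *me)
{
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    struct qnode *prev = atomic_exchange(&tail, me);   /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->locked))
            sched_yield();               /* spin on our own flag only */
    }
}

static void q_release(struct qnode *me)
{
    struct qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        struct qnode *expected = me;
        if (atomic_compare_exchange_strong(&tail, &expected, NULL))
            return;                      /* nobody was waiting */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();               /* a waiter is still linking itself in */
    }
    atomic_store(&succ->locked, false);  /* signal exactly one waiter */
}

static void *worker(void *arg)
{
    long n = *(const long *)arg;
    struct qnode me;                     /* per-thread queue node */
    for (long i = 0; i < n; i++) {
        q_acquire(&me);
        protected_counter++;             /* critical section */
        q_release(&me);
    }
    return NULL;
}

/* Runs `threads` workers (at most 8); returns the final counter. */
long run_queue_lock_demo(int threads, long n)
{
    pthread_t t[8];
    assert(threads >= 1 && threads <= 8);
    protected_counter = 0;
    for (int i = 0; i < threads; i++)
        pthread_create(&t[i], NULL, worker, &n);
    for (int i = 0; i < threads; i++)
        pthread_join(t[i], NULL);
    return protected_counter;
}
```

Waiters are served in the order in which they swapped themselves into tail, which gives the pre-determined wait order. Each release touches only the successor's flag.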
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU
  Allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated
  Charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks)
  Doesn't use spinlocks when no contention
  Used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once
  "All": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include optional timeout argument
- Waiting threads consume no CPU time
Waiting
Waitable objects include
- Events (may be auto-reset or manual reset, may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object, and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
Interaction with thread scheduling
[Figure: thread states Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Create and initialize thread object -> Initialized; thread waits on an object handle -> Waiting; wait is complete (object set to signaled state) -> Ready]
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread - it may preempt the running thread if that thread has lower priority - and issues software interrupt to initiate context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object?
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: owning thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: owning thread or other thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect: kernel resumes one or more waiting threads
What signals an object? (cont'd)
Event
- nonsignaled -> signaled: a thread sets the event
- signaled -> nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: dedicated thread sets one event in the event pair
- signaled -> nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes waiting dedicated thread
Timer
- nonsignaled -> signaled: timer expires
- signaled -> nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: thread terminates
- signaled -> nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
  See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
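The queue-now, run-later pattern behind DPCs can be sketched as a FIFO of routine/context pairs. This models a single CPU only. The types and functions are hypothetical and unrelated to the real KDPC structure or the kernel's DPC routines.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*dpc_routine_t)(void *context);

struct dpc {
    dpc_routine_t routine;   /* procedure to run later at dispatch level */
    void *context;
    struct dpc *next;
};

/* One queue per CPU; this sketch models a single CPU. */
struct dpc_queue {
    struct dpc *head, *tail;
};

/* Called from an ISR: only record the work, keep the ISR short. */
void queue_dpc(struct dpc_queue *q, struct dpc *d)
{
    d->next = NULL;
    if (q->tail)
        q->tail->next = d;
    else
        q->head = d;
    q->tail = d;
}

/* Called once all higher-IRQL work is done: run every queued routine in
   FIFO order. Returns the number of DPCs executed. */
int drain_dpcs(struct dpc_queue *q)
{
    int ran = 0;
    while (q->head) {
        struct dpc *d = q->head;
        q->head = d->next;
        if (!q->head)
            q->tail = NULL;
        d->routine(d->context);
        ran++;
    }
    return ran;
}

/* Example deferred routine: increments the int that context points to. */
void bump(void *context)
{
    ++*(int *)context;
}
```

Queuing itself does no work. The routines run only when drain_dpcs() is called, which corresponds to the point where IRQL has dropped to dispatch level.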
Deferred Procedure Calls (DPCs)
[Figure: DPC queue: queue head -> DPC object -> DPC object -> DPC object]
Delivering a DPC
1. Timer expires, kernel queues DPC that will release all waiting threads; kernel requests SW int.
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table (High, Power failure, ..., Dispatch/DPC, APC, Low), the DPC queue, and the dispatcher]
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
[Figure: a thread object with separate kernel-mode (K) and user-mode (U) queues of APC objects]
Asynchronous Procedure Calls (APCs)
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to the thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
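The accounting rules above can be written as a small decision function. This is only an illustrative sketch, not kernel code: the enum, the function name `classify_tick`, and the `in_dpc` and `user_mode` flags are hypothetical stand-ins for state the real clock ISR would read from the processor.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for the four buckets listed on the slide. */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/* irql:      IRQL the processor was at when the clock interrupt arrived
 * in_dpc:    true if a DPC routine was being processed
 * user_mode: true if the interrupted code was running in user mode */
charge_t classify_tick(int irql, bool in_dpc, bool user_mode)
{
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: ordinary thread execution */
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

For example, `classify_tick(2, true, false)` yields `CHARGE_DPC`, while the same IRQL without a running DPC is charged to the thread's kernel time.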
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, a system that is busy while no process appears to be running must be busy with interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (they are not really processes)
– Context switches shown for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add the Context Switch Delta column
7676
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Looking at Waiting Threads
7777
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout; a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
– some exist exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"; Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects point to their WaitBlockList; each wait block (List entry, Object, Thread, Key, Type, Next link) is also linked into the wait list head of the dispatcher object it refers to]
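To make the "any" versus "all" distinction concrete, here is a heavily simplified model of wait blocks chained to one waiting thread. It is a sketch under stated assumptions, not the kernel's layout: the type names, the single `signaled` flag standing in for the dispatcher header, and the function `wait_satisfied` are all hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { WB_WAIT_ANY, WB_WAIT_ALL } wb_type_t;

/* Simplified stand-in for the common dispatcher header. */
typedef struct {
    bool signaled;
} dispatcher_header_t;

/* One wait block per object named in the wait call. */
typedef struct wait_block {
    dispatcher_header_t *object;  /* what is being waited for */
    int key;                      /* position in the caller's handle array */
    wb_type_t type;               /* same for all blocks of one wait call */
    struct wait_block *next;      /* next block of the same thread */
} wait_block_t;

/* Returns the key of the first signaled object for a wait-any,
 * 0 for a satisfied wait-all, or -1 if the thread must keep waiting. */
int wait_satisfied(const wait_block_t *list)
{
    assert(list != NULL);
    if (list->type == WB_WAIT_ANY) {
        for (const wait_block_t *b = list; b != NULL; b = b->next)
            if (b->object->signaled)
                return b->key;
        return -1;
    }
    for (const wait_block_t *b = list; b != NULL; b = b->next)
        if (!b->object->signaled)
            return -1;
    return 0;
}
```

The key plays the role the slide describes: for a wait-any it tells the caller which position in its handle array was satisfied.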
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* runs on every exit path, so the lock is released exactly once */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
– hObject specifies the kernel object
– dwTimeout specifies the wait time in msec
– dwTimeout == 0: no wait, check whether the object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to an array identifying these objects
– bWaitAll: whether to wait for the first signaled object or for all objects; when waiting for any object, the return value gives the index of the first signaled object (relative to WAIT_OBJECT_0)
Side effects
– Mutexes, auto-reset events, and waitable timers will be reset to the non-signaled state after completing wait functions
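The return value of a wait-any call packs both an outcome and an array index. The portable sketch below decodes it without including any Windows header: the `SK_` constants are local copies of the values documented for WAIT_OBJECT_0, WAIT_ABANDONED_0, WAIT_TIMEOUT and WAIT_FAILED, and `decode_wait` with its result enum is a hypothetical helper, not part of the Windows API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Local mirrors of the documented Windows constants (assumption: this
 * sketch is compiled without <windows.h>, so they are redefined here). */
#define SK_WAIT_OBJECT_0     0x00000000u
#define SK_WAIT_ABANDONED_0  0x00000080u
#define SK_WAIT_TIMEOUT      0x00000102u
#define SK_WAIT_FAILED       0xFFFFFFFFu

typedef enum { SK_SIGNALED, SK_ABANDONED, SK_TIMEOUT, SK_FAILED } sk_result_t;

/* ret:   value returned by a wait-any call on `count` handles
 * index: receives the array position for SK_SIGNALED / SK_ABANDONED */
sk_result_t decode_wait(uint32_t ret, uint32_t count, uint32_t *index)
{
    assert(index != NULL && count >= 1 && count <= 64);
    if (ret < SK_WAIT_OBJECT_0 + count) {
        *index = ret - SK_WAIT_OBJECT_0;
        return SK_SIGNALED;
    }
    if (ret >= SK_WAIT_ABANDONED_0 && ret < SK_WAIT_ABANDONED_0 + count) {
        *index = ret - SK_WAIT_ABANDONED_0;
        return SK_ABANDONED;
    }
    return ret == SK_WAIT_TIMEOUT ? SK_TIMEOUT : SK_FAILED;
}
```

In real Windows code the same decoding would be applied directly to the DWORD returned by WaitForMultipleObjects() with bWaitAll set to FALSE.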
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done = 0, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in the same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
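The polling behaviour of pthread_mutex_trylock() can be shown with a short sketch. The helper names `try_acquire` and `add_if_free` are hypothetical, and the example assumes a default (non-recursive) mutex, for which a second acquisition attempt reports EBUSY instead of blocking.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Polling acquire: returns true if the lock was taken, false if busy. */
bool try_acquire(pthread_mutex_t *m)
{
    int rc = pthread_mutex_trylock(m);
    assert(rc == 0 || rc == EBUSY);
    return rc == 0;
}

/* Adds delta to *counter only if the mutex is free right now. */
bool add_if_free(pthread_mutex_t *m, int *counter, int delta)
{
    if (!try_acquire(m))
        return false;       /* someone else is in the critical section */
    *counter += delta;
    pthread_mutex_unlock(m);
    return true;
}
```

A caller that gets `false` back can do other work and retry later, which is the same pattern as calling WaitForSingleObject() with a zero timeout and checking for a timeout result.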
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases the semaphore count by 1
– ReleaseSemaphore() may increment the count by any value
– ReleaseSemaphore() returns the old semaphore count (through lpPreviousCount)
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– A manual-reset event can signal several threads simultaneously; it must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– An auto-reset event signals a single thread; the event is reset automatically
– fInitialState == TRUE: create the event in the signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
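Although pthreads has no manual-reset event, one common way to approximate it is a boolean flag guarded by a mutex and paired with a condition variable. The following is a minimal sketch of that idea; the type `manual_event_t` and the `event_*` functions are hypothetical names, not part of POSIX or the Windows API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;   /* stays true until event_reset() */
} manual_event_t;

void event_init(manual_event_t *e, bool initial_state)
{
    assert(e != NULL);
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

void event_set(manual_event_t *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);   /* release every waiter */
    pthread_mutex_unlock(&e->lock);
}

void event_reset(manual_event_t *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

void event_wait(manual_event_t *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

bool event_is_set(manual_event_t *e)
{
    pthread_mutex_lock(&e->lock);
    bool s = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return s;
}

void event_destroy(manual_event_t *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

Because the flag persists, a thread that calls `event_wait()` after `event_set()` returns immediately, which is the behaviour a bare condition variable does not provide.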
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on a pipe handle will block if the pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
I/O Redirection Using an Anonymous Pipe
/* Create default size anonymous pipe; handles are inheritable. */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process. */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd)
/* Repeat (symmetrically) for the second process. */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles. */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete. */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message-oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (the "." stands for the local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
– Combines a WriteFile, ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
– Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
– Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to named pipes
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request. */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out. */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client. */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation. */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain
107107
Locate a server via mailslot
[Figure: application clients 0 to n act as mailslot servers; each calls hMS = CreateMailslot("\\.\mailslot\status"), then ReadFile(hMS, &ServStat), and connects to the server. The application server acts as the mailslot client; in a loop it sleeps, calls hMS = CreateFile("\\*\mailslot\status") and WriteFile(hMS, &StatInfo), so the message is sent periodically]
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
1818
Shared data: boolean lock = false;
Thread Ti:
do {
    while (TestAndSet(lock)) ;   /* spin until the lock is free */
    /* critical section */
    lock = false;
    /* remainder section */
} while (1);
Mutual Exclusion with Test-and-Set
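The test-and-set loop maps directly onto C11's `atomic_flag`. The sketch below is one possible rendering, not the slide author's code: the type `tas_lock_t` and the `tas_*` function names are hypothetical, while `atomic_flag_test_and_set_explicit` and `atomic_flag_clear_explicit` are standard C11 library calls.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { atomic_flag held; } tas_lock_t;
#define TAS_LOCK_INIT { ATOMIC_FLAG_INIT }

/* One test-and-set attempt: true if this caller now owns the lock. */
bool tas_trylock(tas_lock_t *l)
{
    assert(l != NULL);
    /* Returns the previous value; false means the lock was free. */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the test-and-set succeeds. */
void tas_lock(tas_lock_t *l)
{
    while (!tas_trylock(l))
        ;   /* spin */
}

/* Equivalent of "lock = false" in the pseudocode. */
void tas_unlock(tas_lock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

A thread would bracket its critical section with `tas_lock(&l)` and `tas_unlock(&l)`; like the pseudocode, this provides mutual exclusion but no bounded waiting.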
1919
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
2020
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti:
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);   /* spin until the old value was 0 */
    /* critical section */
    lock = 0;
    /* remainder section */
} while (1);
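On current hardware the atomic swap is usually reached through an exchange primitive. As a rough sketch, the same protocol can be expressed with C11's `atomic_exchange`; the names `xchg_lock_t`, `xchg_try`, `xchg_acquire` and `xchg_release` are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct { atomic_int word; } xchg_lock_t;   /* 0 = free, 1 = held */

/* One swap attempt: returns 1 if the lock was acquired, 0 otherwise. */
int xchg_try(xchg_lock_t *l)
{
    assert(l != NULL);
    int key = 1;
    key = atomic_exchange(&l->word, key);   /* key receives the old value */
    return key == 0;
}

/* Repeat the swap until the old value was 0 (lock was free). */
void xchg_acquire(xchg_lock_t *l)
{
    while (!xchg_try(l))
        ;   /* spin */
}

/* Equivalent of "lock = 0" in the pseudocode. */
void xchg_release(xchg_lock_t *l)
{
    atomic_store(&l->word, 0);
}
```

The local `key` plays the same role as in the pseudocode: it goes in as 1 and comes back holding whatever the lock word contained before the exchange.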
2121
Semaphores
Semaphore S: integer variable
can only be accessed via two atomic operations
wait(S):
    while (S <= 0) ;   /* busy-wait */
    S--;
signal(S):
    S++;
2222
Critical Section of n Threads
Shared data:
semaphore mutex;   /* initially mutex = 1 */
Thread Ti:
do {
    wait(mutex);
    /* critical section */
    signal(mutex);
    /* remainder section */
} while (1);
2323
Semaphore Implementation
Semaphores may suspend/resume threads
– Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
– suspend() suspends the thread that invokes it
– resume(T) resumes the execution of a blocked thread T
2424
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
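The bookkeeping in these two operations can be traced with a single-threaded simulation. This is a sketch only: real suspend() and resume() need scheduler support, so here "suspending" just records a thread id in a FIFO, and the names `sim_semaphore_t`, `sim_wait` and `sim_signal` are hypothetical. The FIFO order is also an assumption; the pseudocode does not say which waiting thread is removed.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_WAITERS 8

typedef struct {
    int value;                /* when negative, -value threads are suspended */
    int queue[MAX_WAITERS];   /* FIFO of suspended thread ids (the list S.L) */
    int head, count;
} sim_semaphore_t;

void sim_init(sim_semaphore_t *s, int initial)
{
    s->value = initial;
    s->head = 0;
    s->count = 0;
}

/* Returns true if thread tid may continue, false if it was suspended. */
bool sim_wait(sim_semaphore_t *s, int tid)
{
    s->value--;
    if (s->value < 0) {
        assert(s->count < MAX_WAITERS);
        s->queue[(s->head + s->count) % MAX_WAITERS] = tid;
        s->count++;
        return false;
    }
    return true;
}

/* Returns the id of the resumed thread, or -1 if nobody was waiting. */
int sim_signal(sim_semaphore_t *s)
{
    s->value++;
    if (s->value <= 0) {
        assert(s->count > 0);
        int tid = s->queue[s->head];
        s->head = (s->head + 1) % MAX_WAITERS;
        s->count--;
        return tid;
    }
    return -1;
}
```

Starting from value 1, one wait succeeds and two further waits suspend their callers, leaving the value at -2; each signal then resumes one of them until the value is back to 1.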
2525
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code:
Ti:                Tj:
  A                  wait(flag)
  signal(flag)       B
2626
Two Types of Semaphores
Counting semaphore
– integer value can range over an unrestricted domain
Binary semaphore
– integer value can range only between 0 and 1
– can be simpler to implement
Counting semaphore S can be implemented using binary semaphores
2727
Deadlock and Starvation
Deadlock
– two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0:             T1:
  wait(S);        wait(Q);
  wait(Q);        wait(S);
  …               …
  signal(S);      signal(Q);
  signal(Q);      signal(S);
2828
Deadlock and Starvation
Starvation: indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution
– all code should acquire/release semaphores in the same order
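One way to enforce a single acquisition order is to sort the locks by some fixed key before taking them. The sketch below uses pthread mutexes and orders them by address; both the helper names `lock_pair` and `unlock_pair` and the choice of the address as the ordering key are assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Acquire two distinct mutexes in a globally consistent order
 * (lower address first), whatever order the caller names them in. */
void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    assert(a != NULL && b != NULL && a != b);
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

/* Release order does not affect deadlock freedom. */
void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}
```

With this helper, a thread calling `lock_pair(&S, &Q)` and another calling `lock_pair(&Q, &S)` both take the locks in the same underlying order, so the circular wait from the T0/T1 example cannot form.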
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects, which may act as mutexes and semaphores
Dispatcher objects may also provide events; an event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
3131
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Silberschatz, A., Galvin, P. B., Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and interrupt dispatching
IRQL levels & interrupt precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc. like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
– interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread, so-called "arbitrary thread context"
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
3737
Trap dispatching
[Figure: trap handlers route each kind of trap. Interrupts go to the interrupt dispatcher and on to interrupt service routines; system service calls go to the system service dispatcher and on to system services; hardware and software exceptions go to the exception dispatcher and on to exception handlers; virtual address exceptions go to the virtual memory manager's pager]
Trap: processor's mechanism to capture an executing thread
– Switch from user to kernel mode
– Interrupts: asynchronous
– Exceptions: synchronous
Interrupt dispatch routine
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
[Figure: an interrupt arrives while user or kernel mode code is running; the dispatch routine and the ISR run in kernel mode]
Note: no thread or process context switch
Interrupt Dispatching
3939
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– this masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
Interrupt Precedence via IRQLs (x86)
4040
[Figure: x86 IRQL table. Hardware interrupts: High (31), Power fail (30), Interprocessor interrupt (29), Clock (28), Profile & Synch (Srv 2003), then the device levels down to Device 1. Deferrable software interrupts: Dispatch/DPC (2), APC (1). Normal thread execution: Passive/Low (0)]
Interrupt Precedence via IRQLs (x86)
4141
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library, Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected with one IRQL
4343
Predefined IRQLs
High
– used when halting the system (via KeBugCheck())
Power fail
– originated in the NT design document, but has never been used
Inter-processor interrupt
– used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update the system's clock and to allocate CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler, Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all; normal thread execution
4545
IRQLs on 64-bit Systems
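[Figure: IRQL tables for 64-bit systems, levels 0 to 15.
x64: Passive/Low (0), APC (1), Dispatch/DPC (2), Device 1 to Device n, Synch (Srv 2003), Clock (13), Interprocessor Interrupt/Power (14), High/Profile (15).
IA64: Passive/Low (0), APC (1), Dispatch/DPC & Synch (UP only) (2), Correctable Machine Check (3), Device 1 (4) to Device n, Synch (MP only) (12), Clock (13), Interprocessor Interrupt (14), High/Profile/Power (15)]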
4646
Interrupt Prioritization amp Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; this can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the I/O APIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
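The two deterministic mapping rules are simple enough to state as arithmetic. The sketch below assumes that the x64/IA64 rule is integer division of the IDT vector number by 16 and that the x86 uniprocessor rule applies to the 16 legacy IRQ lines; the function names are hypothetical.

```c
#include <assert.h>

/* x86 uniprocessor rule: IRQL = 27 - IRQ.
 * Assumption: IRQ numbers are the 16 legacy PIC lines, 0..15. */
int irql_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 15);
    return 27 - irq;
}

/* x64 / IA64 rule, read as integer division of the vector by 16.
 * Assumption: vectors are 8-bit, 0..255, giving IRQLs 0..15. */
int irql_from_vector(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

Under these assumptions a lower IRQ number gets a higher IRQL on x86 uniprocessor systems, while on 64-bit systems every block of 16 consecutive vectors shares one IRQL.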
4747
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when the kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Sync on MP: use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
Synchronization on SMP Systems
5050
Kernel Synchronization
[Figure: Processor A and Processor B each run the sequence below around the spinlock that guards the DPC queue]
do acquire_spinlock(DPC) until (SUCCESS)
begin
    remove DPC from queue    (critical section)
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
– Thus the busy-wait loop requires no access to the memory bus
When releasing the lock, the flag of the first processor in the queue is modified
– Exactly one processor is being signaled
– Pre-determined wait order
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once; with "all", all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include an optional timeout argument
– Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
– Events (may be auto-reset or manual-reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and threads (signaled upon exit or terminate)
– Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
Interaction with thread scheduling
[Figure: thread scheduling states (Initialized, Ready, Standby, Running, Waiting, Transition, Terminated). A thread object is created and initialized; when the thread waits on an object handle it enters Waiting; when the object is set to the signaled state the wait is complete and the thread leaves Waiting]
5858
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread: it may preempt the running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
For each dispatcher object: system events and resulting state change; effect of signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled to signaled: owning thread releases mutex
– signaled to nonsignaled: resumed thread acquires mutex
– effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled to signaled: owning thread or other thread releases mutex
– signaled to nonsignaled: resumed thread acquires mutex
– effect: kernel resumes one waiting thread
Semaphore
– nonsignaled to signaled: one thread releases the semaphore, freeing a resource
– signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
– effect: kernel resumes one or more waiting threads
6161
What signals an object (contd.)
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Event
– nonsignaled → signaled: a thread sets the event
– signaled → nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled → signaled: dedicated thread sets one event in the event pair
– signaled → nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes waiting dedicated thread
Timer
– nonsignaled → signaled: timer expires
– signaled → nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– nonsignaled → signaled: thread terminates
– signaled → nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004.
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
  See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: a DPC queue: the queue head links to a chain of DPC objects.]
6666
Delivering a DPC
1. Timer expires; kernel queues a DPC that will release all waiting threads, and requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume which process address space is currently mapped
[Figure: interrupt dispatch table with levels from High and Power failure down through Dispatch/DPC and APC to Low, shown next to the DPC queue and the dispatcher.]
6767
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
– Permission required for user-mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when the thread is in an alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel-mode (K) and user-mode (U), each linking APC objects.]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting:
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
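The four accounting rules can be written as one decision function. This is only a sketch of the rule table on this slide, not kernel code; the enum and the function name charge_tick are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: one category per accounting rule on the slide. */
typedef enum {
    CHARGE_THREAD_USER_OR_KERNEL,  /* IRQL < 2 */
    CHARGE_DPC_TIME,               /* IRQL == 2, inside a DPC */
    CHARGE_THREAD_KERNEL,          /* IRQL == 2, not inside a DPC */
    CHARGE_INTERRUPT_TIME          /* IRQL > 2 */
} charge_target;

/* Decide where one clock tick is charged, given the IRQL that the clock
   interrupt preempted and whether a DPC routine was running. */
charge_target charge_tick(int irql, bool in_dpc)
{
    assert(irql >= 0);
    if (irql < 2)
        return CHARGE_THREAD_USER_OR_KERNEL;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC_TIME : CHARGE_THREAD_KERNEL;
    return CHARGE_INTERRUPT_TIME;
}
```

For example, charge_tick(2, true) selects DPC time, while charge_tick(5, false) selects interrupt time.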
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (they are not really processes)
– Context switches shown for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout: a common header (Size, Type, State, wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (Size, Type, State, wait list head, object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds a list entry, Object and Thread pointers, Key, Type, and a next link.]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once per enter, even if an exception occurs */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
  dwTimeout == 0: no wait, check whether object is signaled
  dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for first signaled object or all objects
  Function returns index of first signaled object
Side effects
– Mutexes, auto-reset events, and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
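The difference between the blocking and the polling call can be tried out with a small POSIX sketch. The helper names probe and trylock_result are made up for this illustration; only the pthread_mutex_* calls come from the slide.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t demo_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Runs in a second thread: polls the mutex once and records the result.
   If the poll succeeded, the mutex is released again right away. */
static void *probe(void *result)
{
    *(int *)result = pthread_mutex_trylock(&demo_mutex);
    if (*(int *)result == 0)
        pthread_mutex_unlock(&demo_mutex);
    return NULL;
}

/* Returns what pthread_mutex_trylock() reported in the probe thread.
   hold != 0: the calling thread owns the mutex during the probe. */
int trylock_result(int hold)
{
    pthread_t t;
    int result = -1;

    if (hold)
        pthread_mutex_lock(&demo_mutex);    /* blocking acquire */
    pthread_create(&t, NULL, probe, &result);
    pthread_join(t, NULL);
    if (hold)
        pthread_mutex_unlock(&demo_mutex);
    return result;
}
```

While the mutex is held elsewhere the poll reports EBUSY instead of blocking, which corresponds to a WaitForSingleObject() call with a zero timeout returning WAIT_TIMEOUT.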
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() returns old semaphore count (through lpPreviousCount)
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
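Since there is no exact equivalent, a manual-reset event is usually approximated with a mutex, a condition variable, and a flag that stays set until it is reset. The sketch below is one possible emulation under that assumption; the type manual_event and the event_* functions are hypothetical names, not a POSIX or Windows API.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical type: flag + condition variable, guarded by a mutex. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;    /* stays true until event_reset() */
} manual_event;

void event_init(manual_event *e)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = false;
}

void event_set(manual_event *e)      /* role of SetEvent() */
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);   /* wake every current waiter */
    pthread_mutex_unlock(&e->lock);
}

void event_reset(manual_event *e)    /* role of ResetEvent() */
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

void event_wait(manual_event *e)     /* role of an infinite wait */
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

static manual_event start_event;
static int passed;                      /* protected by passed_lock */
static pthread_mutex_t passed_lock = PTHREAD_MUTEX_INITIALIZER;

static void *waiter(void *arg)
{
    (void)arg;
    event_wait(&start_event);
    pthread_mutex_lock(&passed_lock);
    passed++;
    pthread_mutex_unlock(&passed_lock);
    return NULL;
}

/* Start n waiters (at most 16), set the event once, count who got through. */
int release_waiters(int n)
{
    pthread_t t[16];
    if (n > 16) n = 16;
    passed = 0;
    event_init(&start_event);
    for (int i = 0; i < n; i++) pthread_create(&t[i], NULL, waiter, NULL);
    event_set(&start_event);
    for (int i = 0; i < n; i++) pthread_join(t[i], NULL);
    return passed;
}
```

Because the flag remains set, threads that arrive after event_set() pass straight through, as with a signaled manual-reset event. A reset issued immediately after the set may still strand waiters that have not yet re-checked the flag, which is the same hazard that makes PulseEvent() unreliable.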
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2.]
Half-duplex character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable. */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process. */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd.)
/* Repeat (symmetrically) for the second process. */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles. */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete. */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same pipe name
– Server can respond to client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd.)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive – is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic (writes of up to PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request. */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out. */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client. */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation. */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile() – open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
107107
Locate a server via mailslot
[Figure: application clients 0 … n act as mailslot servers; the application server acts as the mailslot client.]
Mailslot servers (app clients 0 … n): each calls CreateMailslot() for the "status" mailslot, reads the server status with ReadFile( hMS, &ServStat … ), then connects to the server
Mailslot client (app server): loops over Sleep(), CreateFile() on the "status" mailslot, and WriteFile( hMS, &StatInfo … )
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is msg size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max. msg. len.: 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
1919
Synchronization Hardware
Atomically swap two variables
void Swap(boolean &a, boolean &b) {
    boolean temp = a;
    a = b;
    b = temp;
}
2020
Mutual Exclusion with Swap
Shared data (initialized to 0): int lock = 0;
Thread Ti:
int key;
do {
    key = 1;
    while (key == 1) Swap(lock, key);
    critical section
    lock = 0;
    remainder section
} while (1);
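The swap-based protocol can be approximated in portable C11, where atomic_exchange() takes the role of the hardware Swap instruction. This is a sketch under that assumption; the names lock_word, enter_cs, leave_cs, and run_swap_demo are hypothetical.

```c
#include <pthread.h>
#include <stdatomic.h>

static atomic_int lock_word = 0;   /* 0 = free, 1 = held */
static long shared_counter = 0;    /* protected by lock_word */

static void enter_cs(void)
{
    /* atomic_exchange stores 1 and returns the previous value in one
       indivisible step, like Swap(lock, key) with key == 1. */
    while (atomic_exchange(&lock_word, 1) == 1)
        ;  /* busy-wait: someone else still holds the lock */
}

static void leave_cs(void)
{
    atomic_store(&lock_word, 0);   /* lock = 0 */
}

static void *worker(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        enter_cs();
        shared_counter++;          /* critical section */
        leave_cs();
    }
    return NULL;
}

/* Run up to 8 workers, each performing per_thread increments. */
long run_swap_demo(int threads, long per_thread)
{
    pthread_t t[8];
    if (threads > 8) threads = 8;
    shared_counter = 0;
    for (int i = 0; i < threads; i++)
        pthread_create(&t[i], NULL, worker, &per_thread);
    for (int i = 0; i < threads; i++)
        pthread_join(t[i], NULL);
    return shared_counter;
}
```

With the lock in place no increment is lost, so four workers with 10000 increments each leave the counter at 40000.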
2121
Semaphores
Semaphore S – integer variable
can only be accessed via two atomic operations
wait (S):
    while (S <= 0) ;    /* busy-wait */
    S--;
signal (S):
    S++;
2222
Critical Section of n Threads
Shared data:
    semaphore mutex;    /* initially mutex = 1 */
Thread Ti:
    do {
        wait(mutex);
        critical section
        signal(mutex);
        remainder section
    } while (1);
2323
Semaphore Implementation
Semaphores may suspend/resume threads
– Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
– suspend() suspends the thread that invokes it
– resume(T) resumes the execution of a blocked thread T
2424
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
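One way to try out the blocking definition is to let a pthread condition variable stand in for the list L and for suspend()/resume(). The sketch below makes that substitution; the csem type, its wakeups counter, and run_order_demo are assumptions of this example, not part of the slides.

```c
#include <pthread.h>

typedef struct {
    int value;               /* as on the slide: negative means waiters exist */
    int wakeups;             /* resume() tokens that no waiter has consumed yet */
    pthread_mutex_t lock;    /* makes wait/signal atomic */
    pthread_cond_t  cond;    /* stands in for the list L of suspended threads */
} csem;

void csem_init(csem *s, int initial)
{
    s->value = initial;
    s->wakeups = 0;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->cond, NULL);
}

void csem_wait(csem *s)
{
    pthread_mutex_lock(&s->lock);
    s->value--;
    if (s->value < 0) {
        /* "add this thread to S.L; suspend()" */
        do {
            pthread_cond_wait(&s->cond, &s->lock);
        } while (s->wakeups < 1);    /* ignore spurious wakeups */
        s->wakeups--;
    }
    pthread_mutex_unlock(&s->lock);
}

void csem_signal(csem *s)
{
    pthread_mutex_lock(&s->lock);
    s->value++;
    if (s->value <= 0) {
        /* "remove a thread T from S.L; resume(T)" */
        s->wakeups++;
        pthread_cond_signal(&s->cond);
    }
    pthread_mutex_unlock(&s->lock);
}

static csem flag;
static int data_from_a;

static void *thread_j(void *out)
{
    csem_wait(&flag);                /* B may run only after A */
    *(int *)out = data_from_a;       /* statement B */
    return NULL;
}

/* Thread Ti (the caller) runs A, then signals; returns what B observed. */
int run_order_demo(void)
{
    pthread_t tj;
    int seen = -1;
    csem_init(&flag, 0);
    data_from_a = 0;
    pthread_create(&tj, NULL, thread_j, &seen);
    data_from_a = 42;                /* statement A */
    csem_signal(&flag);
    pthread_join(tj, NULL);
    return seen;
}
```

The demo initializes the semaphore to 0 so that the waiting thread can only read the value after the signaling thread has written it.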
2525
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code:
    Ti              Tj
    A               wait(flag)
    signal(flag)    B
2626
Two Types of Semaphores
Counting semaphore
– integer value can range over an unrestricted domain
Binary semaphore
– integer value can range only between 0 and 1
– can be simpler to implement
Counting semaphore S can be implemented using binary semaphores
2727
Deadlock and Starvation
Deadlock
– two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
    T0              T1
    wait(S);        wait(Q);
    wait(Q);        wait(S);
    signal(S);      signal(Q);
    signal(Q);      signal(S);
2828
Deadlock and Starvation
Starvation – indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution
– all code should acquire/release semaphores in the same order
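The same-order rule can be enforced mechanically. The sketch below uses pthread mutexes in place of the semaphores S and Q and sorts every pair by address before locking, which is one common convention; lock_both, unlock_both, and run_ordered_locking are hypothetical helper names.

```c
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t S = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t Q = PTHREAD_MUTEX_INITIALIZER;
static long updates = 0;        /* modified only while holding both S and Q */

/* Acquire two mutexes in one global order (lower address first),
   regardless of the order in which the caller names them. */
static void lock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a; a = b; b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

static void *t0(void *arg)      /* names S first, then Q */
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        lock_both(&S, &Q);
        updates++;
        unlock_both(&S, &Q);
    }
    return NULL;
}

static void *t1(void *arg)      /* names Q first, then S */
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        lock_both(&Q, &S);
        updates++;
        unlock_both(&Q, &S);
    }
    return NULL;
}

/* Run T0 and T1 concurrently; returns the total number of updates. */
long run_ordered_locking(long n)
{
    pthread_t a, b;
    updates = 0;
    pthread_create(&a, NULL, t0, &n);
    pthread_create(&b, NULL, t1, &n);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return updates;
}
```

T0 and T1 name the locks in opposite order, as in the deadlock example, yet both end up acquiring them in the same order, so neither can hold one lock while waiting for the other.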
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events; an event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
3131
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982.
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986.
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003.
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons:
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc., like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
– Interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
3737
Trap dispatching
Trap: processor's mechanism to capture executing thread
– Switch from user to kernel mode
– Interrupts – asynchronous
– Exceptions – synchronous
[Figure: each kind of trap is routed to its dispatcher. Interrupts go to the interrupt dispatcher, which calls interrupt service routines; system service calls go to the system service dispatcher, which calls system services; hardware and software exceptions go to the exception dispatcher, which calls exception handlers; virtual address exceptions go to the virtual memory manager's pager.]
Interrupt Dispatching
Interrupt dispatch routine (kernel mode), entered when an interrupt arrives in user- or kernel-mode code:
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine:
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch
3939
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
4040
Interrupt Precedence via IRQLs (x86)
IRQL  Level                          Category
31    High                           Hardware interrupts
30    Power fail
29    Interprocessor interrupt
28    Clock
      Profile & Synch (Srv 2003)
      …
      Device 1
2     Dispatch/DPC                   Deferrable software interrupts
1     APC
0     Passive/Low                    Normal thread execution
4141
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library – Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected to the same IRQL
4343
Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document, but has never been used
Inter-processor interrupt
– Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update system's clock, allocation of CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler – Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all, normal thread execution
4545
IRQLs on 64-bit Systems
IRQL  x64                               IA64
15    High/Profile                      High/Profile/Power
14    Interprocessor Interrupt/Power    Interprocessor Interrupt
13    Clock                             Clock
12    Synch (Srv 2003)                  Synch (MP only)
…     Device n                          Device n
4     …                                 Device 1
3     Device 1                          Correctable Machine Check
2     Dispatch/DPC                      Dispatch/DPC & Synch (UP only)
1     APC                               APC
0     Passive/Low                       Passive/Low
4646
Interrupt Prioritization & Delivery
IRQLs are determined as follows:
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device
  Can be configured with IntFilter utility in Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source
  Processors are assigned round robin for each interrupt vector
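The two arithmetic rules for deriving an IRQL can be checked with a short sketch. It assumes that the x64/IA64 rule is an integer division of the IDT vector number by 16, and that x86 UP systems use the 16 legacy IRQ lines; both function names are hypothetical.

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_from_irq_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 15);    /* assumption: 16 legacy IRQ lines */
    return 27 - irq;
}

/* x64 / IA64 rule, read as: IRQL = IDT vector number / 16 (integer division). */
int irql_from_vector_x64(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

Under these assumptions IRQ 0 maps to IRQL 27 on an x86 UP system, and vector 0xD1 maps to IRQL 13 on x64.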
4747
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when the kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- A spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
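The "data cell plus atomic test-and-modify" idea can be illustrated in portable user-mode C. This is a sketch under stated assumptions: it uses C11 `atomic_flag`, the `demo_spin_*` names are hypothetical, and it is not how the kernel implements KSPIN_LOCK (which also raises IRQL).

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-mode spinlock: a single flag, either free or owned. */
typedef struct { atomic_flag held; } demo_spinlock;

void demo_spin_init(demo_spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);            /* start in the free state */
}

/* One atomic test-and-set; true means the caller now owns the lock. */
bool demo_spin_try_acquire(demo_spinlock *l)
{
    /* test_and_set returns the previous value: false means it was free */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

void demo_spin_acquire(demo_spinlock *l)
{
    while (!demo_spin_try_acquire(l)) {
        /* busy-wait until the current owner releases the lock */
    }
}

void demo_spin_release(demo_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Every failed attempt in `demo_spin_acquire` repeats the atomic operation on the same shared cell, which is the cost that queued spinlocks are meant to reduce.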
Kernel Synchronization
Processor A and Processor B each execute:
  do acquire_spinlock(DPC) until (SUCCESS)
  begin remove DPC from queue end    (critical section)
  release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
The first processor acquires the lock; other processors wait on a processor-local flag
- Thus the busy-wait loop requires no access to the memory bus
When the lock is released, the flag of the first waiting processor is modified
- Exactly one processor is signaled
- Pre-determined wait order
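A well-known lock with these properties is the MCS queue lock, sketched below with C11 atomics. It is offered as an illustration of waiting on a local flag in FIFO order; the slides do not say that Windows queued spinlocks are implemented this way, and all names here are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiter; each waiter spins only on its own must_wait flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool must_wait;
} qnode;

typedef struct { _Atomic(qnode *) tail; } queued_lock;

void queued_lock_init(queued_lock *l)
{
    atomic_init(&l->tail, (qnode *)NULL);
}

void queued_lock_acquire(queued_lock *l, qnode *me)
{
    assert(l != NULL && me != NULL);
    atomic_store(&me->next, (qnode *)NULL);
    atomic_store(&me->must_wait, true);
    qnode *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev == NULL)
        return;                                    /* lock was free */
    atomic_store(&prev->next, me);                 /* link behind predecessor */
    while (atomic_load(&me->must_wait)) {
        /* spin on our own flag only */
    }
}

void queued_lock_release(queued_lock *l, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        /* No visible successor: try to mark the lock free. */
        if (atomic_compare_exchange_strong(&l->tail, &expected, (qnode *)NULL))
            return;
        /* A successor is in the middle of linking itself; wait for it. */
        while ((succ = atomic_load(&me->next)) == NULL) { }
    }
    atomic_store(&succ->must_wait, false);         /* signal exactly one waiter */
}
```

Release touches only the successor's flag, so exactly one waiter is woken and the hand-over order is the order in which waiters joined the queue.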
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU; allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN, or Page Frame Database, lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once; with "all", all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include an optional timeout argument
- Waiting threads consume no CPU time
Waiting
Waitable objects include:
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
Interaction with thread scheduling
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition and Terminated. Labels: "Create and initialize thread object"; "Thread waits on an object handle"; "Wait is complete; set object to signaled state"]
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object
Dispatcher object | Becomes signaled when | Becomes nonsignaled when | Effect of signaled state on waiting threads
Mutex (kernel mode) | Owning thread releases mutex | Resumed thread acquires mutex | Kernel resumes one waiting thread
Mutex (exported to user mode) | Owning thread or other thread releases mutex | Resumed thread acquires mutex | Kernel resumes one waiting thread
Semaphore | One thread releases the semaphore, freeing a resource | A thread acquires the semaphore; more resources are not available | Kernel resumes one or more waiting threads
What signals an object (contd)
Dispatcher object | Becomes signaled when | Becomes nonsignaled when | Effect of signaled state on waiting threads
Event | A thread sets the event | Kernel resumes one or more threads | Kernel resumes one or more waiting threads
Event pair | Dedicated thread sets one event in the event pair | Kernel resumes the other dedicated thread | Kernel resumes waiting dedicated thread
Timer | Timer expires | A thread (re)initializes the timer | Kernel resumes all waiting threads
Thread | Thread terminates | A thread reinitializes the thread object | Kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
- See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: DPC queue: queue head -> DPC object -> DPC object -> DPC object]
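The queue-then-drain pattern behind DPCs can be modeled with an ordinary FIFO of work items. The sketch below is a single-threaded illustration only: the types and functions are hypothetical stand-ins, not the kernel's KDPC structure or its routines, and there is no IRQL or locking.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*dpc_routine)(void *context);

/* Hypothetical DPC object: a deferred routine plus its argument. */
typedef struct dpc_object {
    struct dpc_object *next;
    dpc_routine routine;
    void *context;
} dpc_object;

/* One queue per (simulated) CPU. */
typedef struct {
    dpc_object *head;
    dpc_object *tail;
} dpc_queue;

/* What an ISR would do: record the work and return quickly. */
void dpc_queue_insert(dpc_queue *q, dpc_object *dpc,
                      dpc_routine routine, void *context)
{
    assert(q != NULL && dpc != NULL && routine != NULL);
    dpc->next = NULL;
    dpc->routine = routine;
    dpc->context = context;
    if (q->tail != NULL)
        q->tail->next = dpc;
    else
        q->head = dpc;
    q->tail = dpc;
}

/* Stand-in for the point where IRQL drops to dispatch level:
   run every queued routine in FIFO order; returns how many ran. */
int dpc_queue_drain(dpc_queue *q)
{
    int ran = 0;
    while (q->head != NULL) {
        dpc_object *dpc = q->head;
        q->head = dpc->next;
        if (q->head == NULL)
            q->tail = NULL;
        dpc->routine(dpc->context);
        ran++;
    }
    return ran;
}

/* Example deferred routine: increments the int that context points to. */
void demo_add_one(void *context)
{
    ++*(int *)context;
}
```

Queuing two `demo_add_one` requests leaves the counter untouched until `dpc_queue_drain` runs, which mirrors how the deferred work waits for higher-IRQL processing to finish.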
Delivering a DPC
1. Timer expires; the kernel queues a DPC that will release all waiting threads and requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with levels High, Power failure, Dispatch/DPC, APC, Low; DPC queue holding DPC objects; dispatcher]
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, and call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
- Permission required for user-mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: thread object with two APC queues, kernel-mode (K) and user-mode (U), each a list of APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 & not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
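The four accounting rules amount to a small decision function. The following sketch restates them in C; the enum, the function name, and the `was_user_mode` parameter (used to pick between user and kernel time below IRQL 2) are hypothetical, and the 0..31 range is an assumption matching the x86 IRQL numbering.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* Decide which counter one clock tick is charged to, given the
   processor state that the clock interrupt found. */
charge_target charge_clock_tick(int irql, bool in_dpc, bool was_user_mode)
{
    assert(irql >= 0 && irql <= 31);   /* assumption: x86 IRQL range */
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return was_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Because the decision is made only when the clock interrupt fires, anything that starts and finishes between two ticks is never observed by this function.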
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header (see \ntddk\inc\ddk\ntddk.h)
- Header fields: Size, Type, State, wait list head; followed by object-type-specific data
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects, each with a WaitBlockList; dispatcher objects (Size, Type, State, wait list head, object-type-specific data); wait blocks linking the two, each with the fields List entry, Object, Thread, Key, Type, Next link]
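How the Type and Key fields resolve a wait can be sketched with simplified structures. Everything below is a hypothetical model for illustration: the structs keep only the fields needed for the check, the wait type is read from the first block on the assumption that all blocks of one call share it, and none of this is the kernel's actual layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Only the State part of the common dispatcher header is modeled. */
typedef struct { bool signaled; } demo_dispatcher_header;

typedef enum { DEMO_WAIT_ANY, DEMO_WAIT_ALL } demo_wait_type;

typedef struct {
    demo_dispatcher_header *object;  /* what the thread waits on */
    int key;                         /* position in the argument list */
    demo_wait_type type;             /* same for all blocks of one call */
} demo_wait_block;

/* Returns the key of the first signaled object for a wait-any,
   0 when a wait-all is satisfied, or -1 when the wait must continue. */
int demo_wait_satisfied(const demo_wait_block *blocks, size_t count)
{
    assert(blocks != NULL && count > 0);
    if (blocks[0].type == DEMO_WAIT_ANY) {
        for (size_t i = 0; i < count; i++)
            if (blocks[i].object->signaled)
                return blocks[i].key;
        return -1;
    }
    for (size_t i = 0; i < count; i++)
        if (!blocks[i].object->signaled)
            return -1;               /* wait-all needs every object signaled */
    return 0;
}
```

The key plays the same role as the index that a wait-any call reports back to its caller.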
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once, on both the normal and the exception path */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads:
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
- dwTimeout == 0: no wait, check whether object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects; when waiting for any, the return value minus WAIT_OBJECT_0 is the index of the first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
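The blocking and polling variants can be put side by side in a short pthreads sketch. The shared counter and the `demo_*` helper names are hypothetical; the mutex is a default (non-recursive) one, for which `pthread_mutex_trylock()` reports EBUSY when the mutex is already locked.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

pthread_mutex_t demo_counter_lock = PTHREAD_MUTEX_INITIALIZER;
int demo_counter = 0;

/* Blocking update: the counterpart of waiting with an INFINITE timeout. */
void demo_add_blocking(int value)
{
    int rc = pthread_mutex_lock(&demo_counter_lock);
    assert(rc == 0);
    (void)rc;
    demo_counter += value;
    pthread_mutex_unlock(&demo_counter_lock);
}

/* Polling update: the counterpart of waiting with a timeout of 0.
   Returns false without touching the counter when the mutex is busy. */
bool demo_add_if_free(int value)
{
    int rc = pthread_mutex_trylock(&demo_counter_lock);
    if (rc == EBUSY)
        return false;
    assert(rc == 0);
    demo_counter += value;
    pthread_mutex_unlock(&demo_counter_lock);
    return true;
}
```

A caller that gets `false` from `demo_add_if_free` can do other work and retry later, which is the polling style the slide refers to.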
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value
- ReleaseSemaphore() returns the old semaphore count (through lpPreviousCount)
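The counting rules can be modeled in a few lines of portable C. This is a hypothetical model, not the Win32 implementation: it covers only a zero-timeout wait, uses a pthread mutex for protection, and refuses a release that would push the count above the maximum.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical counting semaphore: "signaled" means count > 0. */
typedef struct {
    pthread_mutex_t lock;
    long count;
    long max;
} demo_semaphore;

void demo_sem_init(demo_semaphore *s, long initial, long max)
{
    assert(s != NULL && max > 0 && initial >= 0 && initial <= max);
    pthread_mutex_init(&s->lock, NULL);
    s->count = initial;
    s->max = max;
}

/* Zero-timeout wait: takes one unit if the semaphore is signaled. */
bool demo_sem_try_wait(demo_semaphore *s)
{
    bool acquired = false;
    pthread_mutex_lock(&s->lock);
    if (s->count > 0) {
        s->count--;
        acquired = true;
    }
    pthread_mutex_unlock(&s->lock);
    return acquired;
}

/* Adds n units and reports the old count through previous (may be NULL).
   Fails, leaving the count unchanged, if the maximum would be exceeded. */
bool demo_sem_release(demo_semaphore *s, long n, long *previous)
{
    bool ok = false;
    pthread_mutex_lock(&s->lock);
    if (n > 0 && s->count + n <= s->max) {
        if (previous != NULL)
            *previous = s->count;
        s->count += n;
        ok = true;
    }
    pthread_mutex_unlock(&s->lock);
    return ok;
}
```

A blocking wait would additionally need a condition variable or a similar mechanism to put the caller to sleep while the count is zero.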
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
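Since there is no exact equivalent, a manual-reset event is usually emulated by pairing a condition variable with a mutex and a boolean flag. The sketch below shows one such emulation; the `demo_event` type and its functions are hypothetical names, and only set, reset, and an untimed wait are covered.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical manual-reset event: stays signaled until explicitly reset. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool signaled;
} demo_event;

void demo_event_init(demo_event *e, bool initial_state)
{
    assert(e != NULL);
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

void demo_event_set(demo_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);   /* release all current waiters */
    pthread_mutex_unlock(&e->lock);
}

void demo_event_reset(demo_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

void demo_event_wait(demo_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

bool demo_event_is_set(demo_event *e)
{
    pthread_mutex_lock(&e->lock);
    bool state = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return state;
}
```

The flag is what carries the "remains signaled" property: a thread that calls `demo_event_wait` after the event was set returns immediately, which a bare condition variable cannot provide.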
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
[Figure: main creates prog1 and prog2; prog1 -> pipe -> prog2]
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE /* Inherit handles */,
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same instance
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. means local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Use the same flag settings for all instances of a named pipe
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe() to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}
/* End of server operation */
Win32 IPC - Mailslots
Mailslots have some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
Locate a server via mailslot
Mailslot servers (application clients 0 .. n), each:
  hMS = CreateMailslot( "\\.\mailslot\status" );
  ReadFile( hMS, &ServStat );
  /* connect to server */
Mailslot client (application server):
  while (...) { Sleep(...);
    hMS = CreateFile( "\\*\mailslot\status" );
    ...
    WriteFile( hMS, &StatInfo );
  }
Message is sent periodically
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieve handle for local mailslot
- \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
2020
Mutual Exclusion with Swap
Shared data (initialized to 0) int lock = 0
Thread Ti
int key
do
key = 1
while (key == 1) Swap(lockkey)
critical section
lock = 0
remainder section
2121
Semaphores
Semaphore S ndash integer variable
can only be accessed via two atomic operations
wait (S)
while (S lt= 0)S--
signal (S)
S++
2222
Critical Section of n Threads
Shared data
semaphore mutex initially mutex = 1
Thread Ti
do wait(mutex) critical section
signal(mutex) remainder section
while (1)
2323
Semaphore Implementation
Semaphores may suspendresume threads
ndash Avoid busy waiting
Define a semaphore as a record
typedef struct
int value struct thread L semaphore
Assume two simple operations
ndash suspend() suspends the thread that invokes it
ndash resume(T) resumes the execution of a blocked thread T
2424
Implementation
Semaphore operations now defined as wait(S)
Svalue--
if (Svalue lt 0)
add this thread to SLsuspend()
signal(S) Svalue++
if (Svalue lt= 0)
remove a thread T from SLresume(T)
2525
Semaphore as a General Synchronization Tool Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code
Ti Tj
A wait(flag)
signal(flag) B
2626
Two Types of Semaphores
Counting semaphore
ndash integer value can range over an unrestricted domain
Binary semaphore
ndash integer value can range only between 0 and 1
ndash can be simpler to implement
Counting semaphore S can be implemented as a binary semaphore
2727
Deadlock and Starvation
Deadlock ndash
ndash two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0 T1
wait(S) wait(Q)
wait(Q) wait(S)
signal(S) signal(Q)
signal(Q) signal(S)
2828
Deadlock and Starvation
Starvation ndash indefinite blocking
ndash A thread may never be removed from the semaphore queue in which it is suspended
Solution ndash
ndash all code should acquirerelease semaphores in same order
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events An event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking multithreading (including real-time threads) and multiprocessing
3131
Further Reading
Ben-Ari M Principles of Concurrent Programming Prentice Hall 1982
Lamport L The Mutual Exclusion Problem Journal of the ACM April 1986
Abraham Silberschatz Peter B Galvin Operating System Concepts John Wiley amp Sons 6th Ed 2003
ndash Chapter 7 - Process Synchronization
ndash Chapter 8 - Deadlocks
3232
32 Trap Dispatching Interrupts Synchronization Trap and Interrupt dispatching
IRQL levels amp Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
ndash Protects the system from the users
ndash Protects the user (process) from themselves
ndash System is not protected from system
Code regions are tagged ldquono write in any moderdquo
Controls ability to execute privileged instructions
A Windows abstraction
ndash Intel Ring 0 Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
ndash Does not affect scheduling
ndash Thread context includes info about execution mode (along with registers etc)
PerfMon counters
ndash ldquoPrivileged Timerdquo and ldquoUser Timerdquo
ndash 4 levels of granularity thread process processor system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1 Requests from user mode
ndash Via the system service dispatch mechanism
ndash Kernel-mode code runs in the context of the requesting thread
2 Dedicated kernel-mode system threads
ndash Some threads in the system stay in kernel mode at all timesmostly in the ldquoSystemrdquo process
ndash Scheduled preempted etc like any other threads
3636
Getting Into Kernel Mode
3 Interrupts from external devicesndash interrupt dispatcher invokes the interrupt service routine
ndash ISR runs in the context of the interrupted thread so-called ldquoarbitrary thread contextrdquo
ndash ISR often requests the execution of a ldquoDPC routinerdquo which also runs in kernel mode
ndash Time not charged to interrupted thread
3737
Trap dispatching
Interruptdispatcher
Systemservice
dispatcher
Interruptserviceroutines
Interruptserviceroutines
Interruptserviceroutines
System services
System services
System services
Exceptiondispatcher
Exceptionhandlers
Exceptionhandlers
Exceptionhandlers
Virtual memorymanagerlsquos pager
Interrupt
System service call
HW exceptionsSW exceptions
Virtual addressexceptions
Trap processorlsquos mechanism to capture executing thread
ndash Switch from user to kernel mode
ndash Interrupts ndash asynchronous
ndash Exceptions - synchronous
Interrupt dispatch routine
Disable interrupts
Record machine state (trap frame) to allow resume
Mask equal- and lower-IRQL interrupts
Find and call appropriate ISR
Dismiss interrupt
Restore machine state (including mode and enabled interrupts)
Disable interrupts
Record machine state (trap frame) to allow resume
Mask equal- and lower-IRQL interrupts
Find and call appropriate ISR
Dismiss interrupt
Restore machine state (including mode and enabled interrupts)
Tell the device to stop interrupting
Interrogate device state start next operation on device etc
Request a DPC
Return to caller
Tell the device to stop interrupting
Interrogate device state start next operation on device etc
Request a DPC
Return to caller
Interrupt service routine
interrupt
user or kernel mode
codekernel mode
Note no thread or process context switch
Note no thread or process context switch
Interrupt Dispatching
3939
IRQL = Interrupt Request Level
ndash Precedence of the interrupt with respect to other interrupts
ndash Different interrupt sources have different IRQLs
ndash not the same as IRQ
IRQL is also a state of the processor
ndash Servicing an interrupt raises processor IRQL to that interruptrsquos IRQL
ndash this masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL gt= DISPATCH_LEVEL
Interrupt Precedence via IRQLs (x86)
4040
PassiveLowAPC
DispatchDPCDevice 1
Profile amp Synch (Srv 2003)
ClockInterprocessor Interrupt
Power failHigh
normal thread execution
Hardware interrupts
Deferrable software interrupts
012
302928
31
Interrupt Precedence via IRQLs (x86)
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library, the Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected to the same IRQL
Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document, but has never been used
Inter-processor interrupt
– Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update the system's clock and to allocate CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler, Kernprof.exe, Res Kit)
Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all; normal thread execution
IRQLs on 64-bit Systems
IRQL | x64 | IA64
15 | High/Profile | High/Profile/Power
14 | Interprocessor Interrupt/Power | Interprocessor Interrupt
13 | Clock | Clock
12 | Synch (Srv 2003) | Synch (MP only)
4 and up | ... Device n | Device 1 ... Device n
3 | Device 1 | Correctable Machine Check
2 | Dispatch/DPC | Dispatch/DPC & Synch (UP only)
1 | APC | APC
0 | Passive/Low | Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; this can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
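The two fixed mappings can be written as one-line C functions. This is only an arithmetic sketch: the function names are made up, and the x64/IA64 rule is read as integer division of the IDT vector number by 16 (the division sign is an assumption about the slide's formula).

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_from_irq_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 27);
    return 27 - irq;
}

/* x64 and IA64 rule, assumed to be IRQL = IDT vector number / 16.
 * Integer division means 16 consecutive vectors share one IRQL. */
int irql_from_vector_64bit(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

With 256 IDT vectors the second mapping yields exactly the 16 IRQLs (0 to 15) listed for the 64-bit systems.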
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– A spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
Kernel Synchronization
Processor A and Processor B each execute:
do acquire_spinlock(DPC) until (SUCCESS)
begin
    remove DPC from queue    (critical section)
end
release_spinlock(DPC)
The spinlock guards the DPC queue that both processors access
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
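A minimal user-mode sketch of the acquire/release pair, assuming C11 atomics stand in for the processor's atomic test-and-modify instruction. `spinlock_t` and the `spin_*` functions are illustrative names, not the kernel's KSPIN_LOCK routines.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* The lock is a single memory cell: false = free, true = owned. */
typedef struct { atomic_bool owned; } spinlock_t;

void spin_init(spinlock_t *lock)
{
    atomic_init(&lock->owned, false);
}

/* One atomic test-and-set attempt; true if the caller now owns the lock. */
bool spin_try_acquire(spinlock_t *lock)
{
    return !atomic_exchange_explicit(&lock->owned, true, memory_order_acquire);
}

/* Busy-wait until the test-and-set succeeds. */
void spin_acquire(spinlock_t *lock)
{
    while (!spin_try_acquire(lock)) {
        /* spin */
    }
}

void spin_release(spinlock_t *lock)
{
    assert(atomic_load(&lock->owned));   /* releasing a free lock is a bug */
    atomic_store_explicit(&lock->owned, false, memory_order_release);
}
```

The critical section from the slide would sit between `spin_acquire` and `spin_release`, for example around the code that removes a DPC from the queue.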
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
The first processor acquires the lock; other processors wait on a processor-local flag
– Thus, the busy-wait loop requires no access to the memory bus
When the lock is released, the flag of the first waiting processor in the queue is modified
– Exactly one processor is being signaled
– Pre-determined wait order
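The queueing idea can be sketched with an MCS-style lock built on C11 atomics. This is a swapped-in textbook technique used for illustration, not the Windows implementation; `qnode_t`, `qlock_t` and the `qlock_*` functions are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One queue node per waiting processor; each spins on its own flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool             must_wait;
} qnode_t;

typedef struct { _Atomic(qnode_t *) tail; } qlock_t;

void qlock_init(qlock_t *lock)
{
    atomic_init(&lock->tail, (qnode_t *)NULL);
}

void qlock_acquire(qlock_t *lock, qnode_t *me)
{
    atomic_store(&me->next, (qnode_t *)NULL);
    atomic_store(&me->must_wait, true);
    qnode_t *prev = atomic_exchange(&lock->tail, me);   /* join the queue */
    if (prev == NULL)
        return;                        /* queue was empty: lock acquired */
    atomic_store(&prev->next, me);     /* link behind the predecessor */
    while (atomic_load(&me->must_wait)) {
        /* spin on the local flag only */
    }
}

void qlock_release(qlock_t *lock, qnode_t *me)
{
    assert(atomic_load(&lock->tail) != NULL);   /* the holder is always queued */
    qnode_t *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode_t *expected = me;
        if (atomic_compare_exchange_strong(&lock->tail, &expected, (qnode_t *)NULL))
            return;                    /* nobody is waiting */
        while ((succ = atomic_load(&me->next)) == NULL) {
            /* a waiter has swapped the tail but not linked itself yet */
        }
    }
    atomic_store(&succ->must_wait, false);   /* signal exactly one waiter */
}
```

Waiters are released in the order in which they swapped themselves into `tail`, which gives the pre-determined wait order, and each release touches only one waiter's flag.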
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once; "all" means all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include optional timeout argument
– Waiting threads consume no CPU time
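The difference between an "any" wait and an "all" wait can be shown with a small predicate over the signaled states. This is a model of the rule only, not the Windows wait implementation; `wait_resolved` and `MAX_WAIT_OBJECTS` are names invented for the sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MAX_WAIT_OBJECTS 64   /* assumed cap on objects per wait call */

/* Decide whether a multi-object wait can be resolved right now.
 * "any": returns the index of the first signaled object.
 * "all": returns 0 only if every object is signaled at the same time.
 * Returns -1 if the thread has to keep waiting. */
int wait_resolved(const bool *signaled, size_t count, bool wait_all)
{
    assert(count > 0 && count <= MAX_WAIT_OBJECTS);
    if (wait_all) {
        for (size_t i = 0; i < count; i++)
            if (!signaled[i])
                return -1;
        return 0;
    }
    for (size_t i = 0; i < count; i++)
        if (signaled[i])
            return (int)i;
    return -1;
}
```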
Waiting
Waitable objects include
– Events (may be auto-reset or manual-reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
Figure: interaction with thread scheduling
– Thread states: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated
– Labels: create and initialize thread object; thread waits on an object handle; set object to signaled state; wait is complete
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object
For each dispatcher object: system events and resulting state change, then effect of signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled to signaled: owning thread releases mutex
– signaled to nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled to signaled: owning thread or other thread releases mutex
– signaled to nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Semaphore
– nonsignaled to signaled: one thread releases the semaphore, freeing a resource
– signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
– Effect: kernel resumes one or more waiting threads
What signals an object (contd)
For each dispatcher object: system events and resulting state change, then effect of signaled state on waiting threads
Event
– nonsignaled to signaled: a thread sets the event
– signaled to nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled to signaled: dedicated thread sets one event in the event pair
– signaled to nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes waiting dedicated thread
Timer
– nonsignaled to signaled: timer expires
– signaled to nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– nonsignaled to signaled: thread terminates
– signaled to nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
Figure: a queue head linked to a list of DPC objects
Delivering a DPC
1. Timer expires; kernel queues a DPC that will release all waiting threads; kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
Figure labels: interrupt dispatch table (high, power failure, dispatch/DPC, APC, low), DPC queue, dispatcher
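The four delivery steps can be mimicked with a toy per-processor queue that is drained when the IRQL drops below dispatch level. Everything here (`dpc_cpu_t`, `queue_dpc`, `lower_irql`, `bump_counter`, the fixed queue size) is a hypothetical model for illustration, not the kernel's DPC API.

```c
#include <assert.h>
#include <stddef.h>

#define DISPATCH_LEVEL 2
#define DPC_QUEUE_MAX  16

typedef void (*dpc_routine_t)(void *context);
typedef struct { dpc_routine_t routine; void *context; } dpc_t;

/* Toy processor: current IRQL and a FIFO ring buffer of queued DPCs. */
typedef struct {
    int    irql;
    dpc_t  queue[DPC_QUEUE_MAX];
    size_t head, count;
} dpc_cpu_t;

/* Step 1: an ISR or timer queues a DPC. Returns 0, or -1 if the queue is full. */
int queue_dpc(dpc_cpu_t *cpu, dpc_routine_t routine, void *context)
{
    if (cpu->count == DPC_QUEUE_MAX)
        return -1;
    size_t tail = (cpu->head + cpu->count) % DPC_QUEUE_MAX;
    cpu->queue[tail].routine = routine;
    cpu->queue[tail].context = context;
    cpu->count++;
    return 0;
}

/* Steps 2 to 4: when the IRQL is about to drop below dispatch level,
 * run every queued DPC at DISPATCH_LEVEL first. Returns how many ran. */
size_t lower_irql(dpc_cpu_t *cpu, int new_irql)
{
    assert(new_irql <= cpu->irql);
    size_t ran = 0;
    if (new_irql < DISPATCH_LEVEL && cpu->count > 0) {
        cpu->irql = DISPATCH_LEVEL;
        while (cpu->count > 0) {
            dpc_t dpc = cpu->queue[cpu->head];
            cpu->head = (cpu->head + 1) % DPC_QUEUE_MAX;
            cpu->count--;
            dpc.routine(dpc.context);
            ran++;
        }
    }
    cpu->irql = new_irql;
    return ran;
}

/* Example DPC routine: increments the int that context points to. */
void bump_counter(void *context)
{
    (*(int *)context)++;
}
```

Lowering the IRQL to a level that is still at or above dispatch level leaves the queue untouched, which is why DPCs stay deferred while device interrupts are being serviced.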
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable, but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
Figure: a thread object with two APC queues, kernel mode (K) and user mode (U), each linking APC objects
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
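The four accounting rules map directly onto a small decision function. The names below (`charge_t`, `clock_tick_charge`) and the `user_mode` flag, which picks between user and kernel time below IRQL 2, are assumptions made for the sketch.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/* Decide which counter one clock tick is charged to, given the IRQL
 * the processor was running at when the clock interrupt arrived. */
charge_t clock_tick_charge(int irql, bool in_dpc, bool user_mode)
{
    assert(irql >= 0 && irql <= 31);
    if (irql > 2)
        return CHARGE_INTERRUPT;                  /* servicing a device interrupt */
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```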
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g. one or more threads run and enter a wait state before clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– Some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
– Header fields: Size, Type, State, Wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– When signaled, a wait on the object is satisfied
– Different object types differ in terms of what changes their state
– Wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
Figure: each thread object points to its WaitBlockList; each wait block holds List entry, Object, Thread, Key, Type and Next link; each dispatcher object (Size, Type, State, Wait list head, object-type-specific data) links the wait blocks that wait on it
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once per Enter, even if an exception occurs */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
  dwTimeout == 0: no wait, check whether object is signaled
  dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for first signaled object or all objects; when waiting for the first, the function returns the index of the first signaled object
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() reports the old semaphore count (through lpPreviousCount)
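The counting rules can be modelled in a few lines of portable C. This is a single-threaded model of the count only, not the Windows object: `sem_model_t`, `sem_is_signaled`, `sem_try_wait` and `sem_release` are invented names, and the rule that a release beyond the maximum fails without changing the count is stated here as the documented ReleaseSemaphore behavior.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Resource counter with an upper bound, as set at creation time. */
typedef struct { long count; long max; } sem_model_t;

/* A semaphore is signaled while its count is greater than zero. */
bool sem_is_signaled(const sem_model_t *s)
{
    return s->count > 0;
}

/* Models a zero-timeout wait: takes one unit if the semaphore is signaled. */
bool sem_try_wait(sem_model_t *s)
{
    assert(s->count >= 0 && s->count <= s->max);
    if (s->count == 0)
        return false;
    s->count--;
    return true;
}

/* Models a release: adds `release` units and reports the previous count.
 * Fails, leaving the count unchanged, if the maximum would be exceeded. */
bool sem_release(sem_model_t *s, long release, long *previous)
{
    if (release <= 0 || release > s->max - s->count)
        return false;
    if (previous != NULL)
        *previous = s->count;
    s->count += release;
    return true;
}
```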
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
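The difference between the two event types shows up clearly in a small state model. This is a single-threaded sketch that only counts waiters; `event_model_t`, `event_wait`, `event_set` and `event_reset` are hypothetical names and do not call the Windows API.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool manual_reset;   /* true: manual-reset, false: auto-reset */
    bool signaled;
    int  waiters;        /* threads currently blocked on the event */
} event_model_t;

/* Wait: returns true if the caller proceeds at once, false if it blocks. */
bool event_wait(event_model_t *e)
{
    if (e->signaled) {
        if (!e->manual_reset)
            e->signaled = false;     /* auto-reset consumes the signal */
        return true;
    }
    e->waiters++;
    return false;
}

/* Set: returns the number of blocked threads that are released. */
int event_set(event_model_t *e)
{
    assert(e->waiters >= 0);
    int released;
    if (e->manual_reset) {
        released = e->waiters;       /* every waiter is released */
        e->waiters = 0;
        e->signaled = true;          /* stays signaled until reset */
    } else if (e->waiters > 0) {
        released = 1;                /* one waiter; event resets itself */
        e->waiters--;
        e->signaled = false;
    } else {
        released = 0;
        e->signaled = true;          /* remembered for the next waiter */
    }
    return released;
}

void event_reset(event_model_t *e)
{
    e->signaled = false;
}
```

Setting the manual-reset event acts like opening a barrier for all waiters, while each set of the auto-reset event lets exactly one thread through.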
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
Figure: main creates a pipe that connects prog1 to prog2
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– Location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. means local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Use same flag settings for all instances of a named pipe
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns false if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
Locate a server via mailslot
Figure: the application server acts as mailslot client, the application clients act as mailslot servers
Mailslot servers (App client 0 ... App client n) each run:
hMS = CreateMailslot( "\\.\mailslot\status" )
ReadFile( hMS, &ServStat )
/* connect to server */
Mailslot client (App Server) runs:
while (...) {
    Sleep(...)
    hMS = CreateFile( "\\*\mailslot\status" )
    WriteFile( hMS, &StatInfo )
}
Message is sent periodically
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max. message length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life (意念改变生活)
Semaphores
Semaphore S: integer variable
Can only be accessed via two atomic operations
wait (S):
    while (S <= 0) ;    /* busy-wait */
    S--;
signal (S):
    S++;
Critical Section of n Threads
Shared data:
semaphore mutex;    /* initially mutex = 1 */
Thread Ti:
do {
    wait(mutex);
        critical section
    signal(mutex);
        remainder section
} while (1);
Semaphore Implementation
Semaphores may suspend/resume threads
– Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
– suspend() suspends the thread that invokes it
– resume(T) resumes the execution of a blocked thread T
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
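The two operations can be traced with a single-threaded simulation in which suspended threads are just ids in a FIFO list. The names (`semaphore_t`, `sem_init_model`, `sem_wait_model`, `sem_signal_model`), the fixed queue size and the FIFO wake-up order are assumptions of this sketch; a real implementation would also make each operation atomic.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_WAITERS 8

typedef struct {
    int value;                  /* negative: number of suspended threads */
    int waiting[MAX_WAITERS];   /* FIFO of suspended thread ids (S.L) */
    int head, count;
} semaphore_t;

void sem_init_model(semaphore_t *s, int initial)
{
    s->value = initial;
    s->head = 0;
    s->count = 0;
}

/* wait(S): returns true if thread `tid` continues, false if it is suspended. */
bool sem_wait_model(semaphore_t *s, int tid)
{
    s->value--;
    if (s->value < 0) {
        assert(s->count < MAX_WAITERS);
        s->waiting[(s->head + s->count) % MAX_WAITERS] = tid;
        s->count++;
        return false;           /* corresponds to suspend() */
    }
    return true;
}

/* signal(S): returns the id of the resumed thread, or -1 if none waited. */
int sem_signal_model(semaphore_t *s)
{
    s->value++;
    if (s->value <= 0) {
        int tid = s->waiting[s->head];
        s->head = (s->head + 1) % MAX_WAITERS;
        s->count--;
        return tid;             /* corresponds to resume(T) */
    }
    return -1;
}
```

While the value is negative, its magnitude equals the number of suspended threads, which is why signal resumes a thread exactly when the incremented value is still less than or equal to zero.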
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code:
Ti              Tj
A               wait(flag)
signal(flag)    B
Two Types of Semaphores
Counting semaphore
– Integer value can range over an unrestricted domain
Binary semaphore
– Integer value can range only between 0 and 1
– Can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
Deadlock and Starvation
Deadlock
– Two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0              T1
wait(S);        wait(Q);
wait(Q);        wait(S);
signal(S);      signal(Q);
signal(Q);      signal(S);
Deadlock and Starvation
Starvation: indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution (for the deadlock case)
– All code should acquire/release semaphores in the same order
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects, which may act as mutexes and semaphores
Dispatcher objects may also provide events; an event acts much like a condition variable
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons:
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times, mostly in the "System" process
– Scheduled, preempted, etc. like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
– Interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread, so-called "arbitrary thread context"
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
3737
Trap dispatching
[Figure: trap handlers. Interrupt → interrupt dispatcher → interrupt service routines; system service call → system service dispatcher → system services; HW exceptions / SW exceptions → exception dispatcher → exception handlers; virtual address exceptions → virtual memory manager's pager]
Trap: processor's mechanism to capture executing thread
– Switch from user to kernel mode
– Interrupts – asynchronous
– Exceptions – synchronous
Interrupt Dispatching
[Figure: an interrupt arrives while user or kernel mode code is running; control passes to the interrupt dispatch routine and on to the interrupt service routine, both kernel mode code]
Interrupt dispatch routine
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch
3939
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
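The masking rule and the DISPATCH_LEVEL restriction can be written down as two small predicates. This is a sketch in portable C, not kernel code: the enum repeats the three lowest IRQL values used in these slides, and the function names `interrupt_is_masked` and `may_wait_or_page_fault` are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* The three lowest IRQLs as numbered in these slides. */
enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 };

/* An interrupt is held back while the processor runs at an equal or higher IRQL. */
bool interrupt_is_masked(int processor_irql, int interrupt_irql)
{
    assert(processor_irql >= 0 && interrupt_irql >= 0);
    return interrupt_irql <= processor_irql;
}

/* Waits and page faults are only allowed below DISPATCH_LEVEL. */
bool may_wait_or_page_fault(int processor_irql)
{
    assert(processor_irql >= 0);
    return processor_irql < DISPATCH_LEVEL;
}
```

For example, a processor running a DPC at DISPATCH_LEVEL still accepts a device interrupt (higher IRQL) but holds back APC delivery, and code at that level must not block.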
4040
Interrupt Precedence via IRQLs (x86)
IRQL   Level
31     High
30     Power fail
29     Interprocessor Interrupt
28     Clock
       Profile & Synch (Srv 2003)
       Device levels (Device 1 upward)
2      Dispatch/DPC
1      APC
0      Passive/Low
Hardware interrupts: device levels and above
Deferrable software interrupts: APC and Dispatch/DPC
Normal thread execution: Passive/Low
4141
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library – Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected to the same IRQL
4343
Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document, but has never been used
Inter-processor interrupt
– Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update the system's clock and to allocate CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler – Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all, normal thread execution
4545
IRQLs on 64-bit Systems
IRQL   x64                              IA64
15     High/Profile                     High/Profile/Power
14     Interprocessor Interrupt/Power   Interprocessor Interrupt
13     Clock                            Clock
12     Synch (Srv 2003)                 Synch (MP only)
       Device n                         Device n
4                                       Device 1
3      Device 1                         Correctable Machine Check
2      Dispatch/DPC                     Dispatch/DPC & Synch (UP only)
1      APC                              APC
0      Passive/Low                      Passive/Low
4646
Interrupt Prioritization amp Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device
   Can be configured with IntFilter utility in Resource Kit
– On x86 and x64 systems, the I/O APIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source
   Processors are assigned round robin for each interrupt vector
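The two arithmetic rules for deriving an IRQL can be checked with a few lines of C. This sketch assumes the x64/IA64 rule reads "IDT vector number divided by 16" (the division sign appears to have been lost in the transcript); the function names are hypothetical and nothing here calls a real Windows API.

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int x86_up_irql_from_irq(int irq)
{
    assert(irq >= 0);
    return 27 - irq;
}

/* x64 and IA64 rule (assumed reading): IRQL = IDT vector number / 16,
   using integer division, so 16 consecutive vectors share one IRQL. */
int irql_from_idt_vector(int vector)
{
    assert(vector >= 0);
    return vector / 16;
}
```

With these definitions IRQ 1 maps to IRQL 26 on an x86 uniprocessor, and vectors 0xD0 through 0xDF all map to IRQL 13 on x64.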
4747
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when kernel is deep within many layers of code
– Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free, or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
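The idea of "one data cell plus an atomic test-and-modify" can be illustrated in user mode with C11 atomics. This is a hedged sketch, not the kernel's KSPIN_LOCK code: `demo_spinlock` and the helper names are made up, and the `sched_yield()` call is a user-mode courtesy that a kernel spinlock would not have.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>

/* Hypothetical stand-in for a spinlock: a single atomic bit. */
typedef struct { atomic_flag held; } demo_spinlock;

static void demo_spin_acquire(demo_spinlock *lock)
{
    /* test-and-set returns the previous value: true means another owner exists */
    while (atomic_flag_test_and_set_explicit(&lock->held, memory_order_acquire))
        sched_yield();   /* a kernel spinlock would simply keep spinning */
}

static void demo_spin_release(demo_spinlock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}

static demo_spinlock queue_lock = { ATOMIC_FLAG_INIT };
static long guarded_counter;

static void *spin_worker(void *arg)
{
    int rounds = *(int *)arg;
    for (int i = 0; i < rounds; i++) {
        demo_spin_acquire(&queue_lock);
        guarded_counter++;              /* critical section */
        demo_spin_release(&queue_lock);
    }
    return NULL;
}

/* Starts two contending threads and returns the final counter value. */
long run_spin_workers(int rounds)
{
    pthread_t a, b;
    int rc;
    guarded_counter = 0;
    rc = pthread_create(&a, NULL, spin_worker, &rounds);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, spin_worker, &rounds);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return guarded_counter;
}
```

Every failed acquisition attempt writes to the shared flag, which is the bus contention that queued spinlocks are designed to avoid.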
5050
Kernel Synchronization
[Figure: Processor A and Processor B contend for the spinlock that guards the DPC queue; each runs the sequence below, and the begin/end block is the critical section]
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: Checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
– Thus, busy-wait loop requires no access to the memory bus
When releasing the lock, the flag of the first waiting processor is modified
– Exactly one processor is being signaled
– Pre-determined wait order
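One well-known way to get "each waiter spins on its own flag" is an MCS-style queue lock. The sketch below uses C11 atomics in user mode and is only an illustration of the principle, not the Windows implementation; the types `qnode` and `queued_lock` and all function names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct qnode {
    _Atomic(struct qnode *) next;   /* successor in the wait queue */
    atomic_bool must_wait;          /* local flag this waiter spins on */
} qnode;

typedef struct { _Atomic(qnode *) tail; } queued_lock;   /* NULL tail: lock is free */

static void queued_acquire(queued_lock *lock, qnode *me)
{
    atomic_store(&me->next, (qnode *)NULL);
    atomic_store(&me->must_wait, true);
    qnode *prev = atomic_exchange(&lock->tail, me);   /* join the queue */
    if (prev == NULL)
        return;                                       /* queue was empty: lock acquired */
    atomic_store(&prev->next, me);
    while (atomic_load(&me->must_wait))               /* spin on our own node only */
        sched_yield();
}

static void queued_release(queued_lock *lock, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&lock->tail, &expected, (qnode *)NULL))
            return;                                   /* nobody was waiting */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();                            /* successor is still linking in */
    }
    atomic_store(&succ->must_wait, false);            /* signal exactly one waiter */
}

static queued_lock demo_qlock;      /* static storage: tail starts as NULL */
static long queued_counter;

static void *queued_worker(void *arg)
{
    int rounds = *(int *)arg;
    qnode me;                       /* one queue node per thread, on its stack */
    for (int i = 0; i < rounds; i++) {
        queued_acquire(&demo_qlock, &me);
        queued_counter++;
        queued_release(&demo_qlock, &me);
    }
    return NULL;
}

long run_queued_workers(int rounds)
{
    pthread_t a, b;
    int rc;
    queued_counter = 0;
    rc = pthread_create(&a, NULL, queued_worker, &rounds);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, queued_worker, &rounds);
    assert(rc == 0);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return queued_counter;
}
```

The waiters are served in the order in which they joined the queue, which matches the "pre-determined wait order" bullet.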
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003:
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU
   Allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003:
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated
   Charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks)
   Doesn't use spinlocks when no contention
   Used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once
   "All": all objects must be in the signalled state concurrently to resolve the wait
– All wait calls include optional timeout argument
– Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
– Events (may be auto-reset or manual reset, may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signalled upon exit or terminate)
– Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object, and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. "Create and initialize thread object" leads to Initialized; "Thread waits on an object handle" leads to Waiting; "Set object to signaled state" followed by "Wait is complete" leads back toward Ready]
Interaction with thread scheduling
5858
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread – it may preempt a running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object?
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled → signaled: owning thread releases mutex
– signaled → nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled → signaled: owning thread or other thread releases mutex
– signaled → nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Semaphore
– nonsignaled → signaled: one thread releases the semaphore, freeing a resource
– signaled → nonsignaled: a thread acquires the semaphore; more resources are not available
– Effect: kernel resumes one or more waiting threads
6161
What signals an object? (contd.)
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Event
– nonsignaled → signaled: a thread sets the event
– signaled → nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled → signaled: dedicated thread sets one event in the event pair
– signaled → nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes waiting dedicated thread
Timer
– nonsignaled → signaled: timer expires
– signaled → nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– nonsignaled → signaled: thread terminates
– signaled → nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
   See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: DPC queue: queue head → DPC object → DPC object → DPC object]
6666
Delivering a DPC
[Figure: interrupt dispatch table with the levels High, Power failure, Dispatch/DPC, APC, Low; a DPC queue holding DPC objects; the thread dispatcher]
1. Timer expires, kernel queues DPC that will release all waiting threads; kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make a thread suspend/terminate itself (env. subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode, at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable, but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel-mode (K) and user-mode (U), each a list of APC objects]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 & not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
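The four accounting rules amount to a small decision function. The sketch below encodes them as stated on the slide; the enum, the function name `classify_clock_tick` and the `was_user_mode` parameter (used to pick between the thread's user and kernel time below IRQL 2) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* Decides where the clock ISR books the tick that just elapsed. */
charge_target classify_clock_tick(int irql, bool processing_dpc, bool was_user_mode)
{
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT;                       /* device interrupt level or above */
    if (irql == 2)
        return processing_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return was_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

A tick that lands while a DPC runs is therefore never booked against the interrupted thread's CPU time, which is why heavy DPC load does not show up in per-process totals.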
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any threadprocess
– Process Explorer shows these as separate processes (not really processes)
– Context switch counts shown for these are really counts of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1 Dispatcher Objects
[Figure: dispatcher object layout: common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signalled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects each point to a WaitBlockList; every wait block holds List entry, Object, Thread, Key, Type and Next link; dispatcher objects (Size, Type, State, Wait list head, object-type-specific data) link the wait blocks of the threads waiting on them]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once, even if the guarded code raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
   dwTimeout == 0 - no wait, check whether object is signaled
   dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles - pointer to array identifying these objects
– bWaitAll - whether to wait for first signaled object or all objects
   When waiting for the first signaled object, the return value minus WAIT_OBJECT_0 is the index of that object
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done = 0, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
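The blocking and the polling call can be compared side by side in a short POSIX sketch. The helper names `add_blocking`, `add_if_free` and `read_counter` are hypothetical; only the pthread calls themselves come from the list above.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t counter_mutex = PTHREAD_MUTEX_INITIALIZER;
static int shared_counter;

/* Blocking update: waits as long as needed, like a wait with an INFINITE timeout. */
void add_blocking(int value)
{
    pthread_mutex_lock(&counter_mutex);
    shared_counter += value;
    pthread_mutex_unlock(&counter_mutex);
}

/* Polling update: returns 1 if the value was added, 0 if the mutex was busy,
   like a wait with a timeout of 0. */
int add_if_free(int value)
{
    int rc = pthread_mutex_trylock(&counter_mutex);
    if (rc == EBUSY)
        return 0;
    assert(rc == 0);
    shared_counter += value;
    pthread_mutex_unlock(&counter_mutex);
    return 1;
}

int read_counter(void)
{
    pthread_mutex_lock(&counter_mutex);
    int value = shared_counter;
    pthread_mutex_unlock(&counter_mutex);
    return value;
}
```

A caller of `add_if_free` has to decide what to do on a 0 result (retry later, do other work), which is the price of never blocking.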
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() returns old semaphore count (through lpPreviousCount)
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal() - one thread
– pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
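Since a condition variable keeps no state of its own, a manual-reset event has to be emulated with a mutex, a condition variable and a boolean. The following is a minimal sketch of that common pattern; the `manual_event` type and its functions are hypothetical names, not part of pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;   /* the state a bare condition variable lacks */
} manual_event;

void manual_event_init(manual_event *e, bool initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

void manual_event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);   /* release every current waiter */
    pthread_mutex_unlock(&e->lock);
}

void manual_event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

void manual_event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

static manual_event start_gate;
static atomic_int released_threads;

static void *gate_waiter(void *unused)
{
    (void)unused;
    manual_event_wait(&start_gate);
    atomic_fetch_add(&released_threads, 1);
    return NULL;
}

/* Parks nthreads on the event, sets it once, and returns how many got through. */
int run_event_demo(int nthreads)
{
    pthread_t tid[16];
    assert(nthreads > 0 && nthreads <= 16);
    manual_event_init(&start_gate, false);
    atomic_store(&released_threads, 0);
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, gate_waiter, NULL);
    manual_event_set(&start_gate);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return atomic_load(&released_threads);
}
```

Because the flag stays true until `manual_event_reset` is called, a thread that arrives after the set call passes straight through, which is the behaviour a plain `pthread_cond_broadcast()` cannot provide on its own.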
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates prog1 and prog2, connected by a pipe from prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple, independent instances of a named pipe
– Several clients can communicate with a single server using the same instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– Location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive – is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence in one call
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle in one call
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns false if client connected between CreateNamedPipe call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual read/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile() – open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
107107
Locate a server via mailslot
Mailslot servers (App client 0 … App client n), each runs:
    hMS = CreateMailslot( "\\.\mailslot\status" );
    ReadFile( hMS, &ServStat );
    /* connect to server */
Mailslot client (App Server):
    while (…) {
        Sleep( … );
        hMS = CreateFile( "\\*\mailslot\status" );
        …
        WriteFile( hMS, &StatInfo );
    }
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is max. msg size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0 – immediate return
– MAILSLOT_WAIT_FOREVER – infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name - retrieve handle for local mailslot
– \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max. msg len: 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
2222
Critical Section of n Threads
Shared data
    semaphore mutex; /* initially mutex = 1 */
Thread Ti
    do {
        wait(mutex);
            critical section
        signal(mutex);
            remainder section
    } while (1);
2323
Semaphore Implementation
Semaphores may suspend/resume threads
- Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
- suspend() suspends the thread that invokes it
- resume(T) resumes the execution of a blocked thread T
2424
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
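The wait/signal pair above can be sketched in portable C. This is a minimal sketch, assuming POSIX threads are available; the names csem_t, csem_wait, csem_trywait and csem_signal are illustrative and do not come from the slides. Unlike the slide version, the counter here never goes negative: suspended threads sleep on a condition variable, which plays the role of the list S.L.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical type: a counting semaphore built from a mutex and a
 * condition variable. */
typedef struct {
    int value;               /* available permits, never negative here */
    pthread_mutex_t lock;    /* protects value */
    pthread_cond_t nonzero;  /* waiters sleep here instead of busy waiting */
} csem_t;

void csem_init(csem_t *s, int initial) {
    assert(initial >= 0);
    s->value = initial;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
}

/* wait(S): suspend the caller until a permit is available, then take it. */
void csem_wait(csem_t *s) {
    pthread_mutex_lock(&s->lock);
    while (s->value == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->value--;
    pthread_mutex_unlock(&s->lock);
}

/* Non-blocking variant: returns 1 if a permit was taken, 0 otherwise. */
int csem_trywait(csem_t *s) {
    int taken = 0;
    pthread_mutex_lock(&s->lock);
    if (s->value > 0) {
        s->value--;
        taken = 1;
    }
    pthread_mutex_unlock(&s->lock);
    return taken;
}

/* signal(S): return a permit and resume one suspended thread, if any. */
void csem_signal(csem_t *s) {
    pthread_mutex_lock(&s->lock);
    s->value++;
    pthread_cond_signal(&s->nonzero);
    pthread_mutex_unlock(&s->lock);
}
```

Initializing the semaphore with csem_init(&s, 1) gives the mutex-style usage of the "Critical Section of n Threads" slide: csem_wait() before the critical section, csem_signal() after it.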
2525
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code:
Ti                Tj
A                 wait(flag)
signal(flag)      B
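The ordering pattern above, as a small sketch using POSIX unnamed semaphores (an assumption: the slides do not prescribe an API, and unnamed semaphores are not available on every platform). The helper names record and run_ordered are hypothetical. Tj is started first on purpose, yet B can only run after A has signaled.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <string.h>

static sem_t flag;       /* initialized to 0, so B must wait for A */
static char trace[3];    /* records the order in which A and B ran */
static int trace_len;

static void record(char step) {
    assert(trace_len < 2);
    trace[trace_len++] = step;
}

static void *thread_i(void *arg) {
    (void)arg;
    record('A');         /* statement A */
    sem_post(&flag);     /* signal(flag) */
    return NULL;
}

static void *thread_j(void *arg) {
    (void)arg;
    sem_wait(&flag);     /* wait(flag) */
    record('B');         /* statement B runs only after A */
    return NULL;
}

/* Runs Ti and Tj once and returns the observed order of A and B. */
const char *run_ordered(void) {
    pthread_t ti, tj;
    trace_len = 0;
    memset(trace, 0, sizeof trace);
    sem_init(&flag, 0, 0);
    pthread_create(&tj, NULL, thread_j, NULL);  /* start Tj first */
    pthread_create(&ti, NULL, thread_i, NULL);
    pthread_join(ti, NULL);
    pthread_join(tj, NULL);
    sem_destroy(&flag);
    return trace;
}
```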
2626
Two Types of Semaphores
Counting semaphore
- Integer value can range over an unrestricted domain
Binary semaphore
- Integer value can range only between 0 and 1
- Can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
2727
Deadlock and Starvation
Deadlock
- Two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0             T1
wait(S);       wait(Q);
wait(Q);       wait(S);
signal(S);     signal(Q);
signal(Q);     signal(S);
2828
Deadlock and Starvation
Starvation: indefinite blocking
- A thread may never be removed from the semaphore queue in which it is suspended
Solution
- All code should acquire/release semaphores in the same order
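One way to enforce a single acquisition order is to sort the locks by some global key before taking them. The sketch below assumes pthread mutexes and uses the lock's address as that key; first_in_order, lock_both and unlock_both are hypothetical helper names, and address ordering is only one possible convention.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Returns whichever lock comes first in the global order
 * (here: the one at the lower address). */
pthread_mutex_t *first_in_order(pthread_mutex_t *a, pthread_mutex_t *b) {
    return ((uintptr_t)a < (uintptr_t)b) ? a : b;
}

/* Acquires both locks in the global order, regardless of argument order. */
void lock_both(pthread_mutex_t *a, pthread_mutex_t *b) {
    assert(a != b);
    pthread_mutex_t *first = first_in_order(a, b);
    pthread_mutex_t *second = (first == a) ? b : a;
    pthread_mutex_lock(first);
    pthread_mutex_lock(second);
}

/* Release order does not matter for deadlock freedom. */
void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b) {
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}
```

With this helper, T0 calling lock_both(&S, &Q) and T1 calling lock_both(&Q, &S) both take the same lock first, so the circular wait from the deadlock slide cannot form.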
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events. An event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
3131
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
- Chapter 7: Process Synchronization
- Chapter 8: Deadlocks
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and interrupt dispatching
IRQL levels & interrupt precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons:
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
- Scheduled, preempted, etc., like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
- Interrupt dispatcher invokes the interrupt service routine
- ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread
3737
Trap dispatching
[Diagram: trap handlers. Interrupt -> interrupt dispatcher -> interrupt service routines. System service call -> system service dispatcher -> system services. Hardware and software exceptions -> exception dispatcher -> exception handlers. Virtual address exceptions -> virtual memory manager's pager.]
Trap: processor's mechanism to capture an executing thread
- Switch from user to kernel mode
- Interrupts: asynchronous
- Exceptions: synchronous
Interrupt Dispatching
An interrupt arrives while user or kernel mode code is running; the interrupt dispatch routine then runs in kernel mode
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch
3939
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
Interrupt Precedence via IRQLs (x86)
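The masking rule can be written down as two predicates. This is a toy model for illustration, not kernel code: interrupt_is_masked and may_wait_or_page_fault are hypothetical names, while the constants mirror the levels 0, 1 and 2 listed on the slides.

```c
#include <assert.h>
#include <stdbool.h>

enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 };

/* An interrupt is held back while the processor runs at an equal or
 * higher IRQL; it is delivered only when its IRQL is strictly higher. */
bool interrupt_is_masked(int current_irql, int interrupt_irql) {
    assert(current_irql >= 0 && interrupt_irql >= 0);
    return interrupt_irql <= current_irql;
}

/* Waits and page faults are not allowed at or above DISPATCH_LEVEL. */
bool may_wait_or_page_fault(int current_irql) {
    assert(current_irql >= 0);
    return current_irql < DISPATCH_LEVEL;
}
```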
4040
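IRQL levels on x86, highest to lowest:
31   High
30   Power fail
29   Interprocessor interrupt
28   Clock
     Profile & Synch (Srv 2003)
     Device 1
2    Dispatch/DPC
1    APC
0    Passive/Low
Hardware interrupts: Device 1 and above. Deferrable software interrupts: APC and Dispatch/DPC. Normal thread execution: Passive/Low.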
Interrupt Precedence via IRQLs (x86)
4141
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library, the Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected with an IRQL
4343
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update the system's clock and to allocate CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler: Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all; normal thread execution
4545
IRQLs on 64-bit Systems
IRQL   x64                               IA64
15     High/Profile                      High/Profile/Power
14     Interprocessor Interrupt/Power    Interprocessor Interrupt
13     Clock                             Clock
12     Synch (Srv 2003)                  Synch (MP only)
...    Device n                          Device n
4      ...                               Device 1
3      Device 1                          Correctable Machine Check
2      Dispatch/DPC                      Dispatch/DPC & Synch (UP only)
1      APC                               APC
0      Passive/Low                       Passive/Low
4646
Interrupt Prioritization amp Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device. This can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source. Processors are assigned round robin for each interrupt vector
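The two deterministic mappings above are simple arithmetic, sketched here as helper functions. The function names are hypothetical, and the input ranges (IRQ 0 to 15, IDT vector 0 to 255) are assumptions made for this illustration.

```c
#include <assert.h>

/* x86 uniprocessor systems: IRQL = 27 - IRQ.
 * Assumes the 16 IRQ lines (0..15) of a classic interrupt controller. */
int irql_x86_up(int irq) {
    assert(irq >= 0 && irq <= 15);
    return 27 - irq;
}

/* x64 and IA64 systems: IRQL = IDT vector number / 16 (integer division).
 * Assumes a 256-entry IDT. */
int irql_x64_ia64(int idt_vector) {
    assert(idt_vector >= 0 && idt_vector <= 255);
    return idt_vector / 16;
}
```

For example, under these assumptions IRQ 1 on an x86 UP system maps to IRQL 26, and IDT vector 0x51 on x64 maps to IRQL 5.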
4747
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when the kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
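A user-mode sketch of the idea that a spinlock is a single memory cell updated with an atomic test-and-modify operation. It uses C11 atomics rather than the kernel's KSPIN_LOCK routines; the type toy_spinlock and the spin_* functions are hypothetical names for this illustration.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One flag is enough: set means "owned", clear means "free". */
typedef struct {
    atomic_flag held;
} toy_spinlock;

void spin_init(toy_spinlock *l) {
    assert(l != NULL);
    atomic_flag_clear(&l->held);
}

/* Atomic test-and-set: returns true if the caller became the owner. */
bool spin_try_acquire(toy_spinlock *l) {
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-waits until the lock is free, then owns it. */
void spin_acquire(toy_spinlock *l) {
    while (!spin_try_acquire(l))
        ;  /* spin */
}

void spin_release(toy_spinlock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Every failed attempt in spin_acquire() repeats the atomic operation on the shared cell, which is the bus contention that queued spinlocks are meant to reduce.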
5050
Kernel Synchronization
[Diagram: Processor A and Processor B both contend for the spinlock that guards the DPC queue. Each processor runs the sequence below.]
do acquire_spinlock(DPC) until (SUCCESS)
begin  // critical section
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
The first processor acquires the lock; other processors wait on a processor-local flag
- Thus the busy-wait loop requires no access to the memory bus
When releasing the lock, the flag of the first waiting processor is modified
- Exactly one processor is being signaled
- Pre-determined wait order
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU; allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once. "All": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include optional timeout argument
- Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
- Events (may be auto-reset or manual reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object, and only one thread is released (e.g., it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on Dispatcher Objects (outside the kernel)
[Diagram: interaction with thread scheduling. A thread object is created and initialized, then moves through the states Initialized, Ready, Standby, Running, Waiting, Transition and Terminated. A thread that waits on an object handle enters the Waiting state; when the object is set to the signaled state, the wait is complete.]
5858
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get a priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
For each dispatcher object: the system events and resulting state change, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: owning thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: owning thread or other thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect: kernel resumes one or more waiting threads
6161
What signals an object (contd)
For each dispatcher object: the system events and resulting state change, and the effect of the signaled state on waiting threads
Event
- nonsignaled -> signaled: a thread sets the event
- signaled -> nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: dedicated thread sets one event in the event pair
- signaled -> nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes waiting dedicated thread
Timer
- nonsignaled -> signaled: timer expires
- signaled -> nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: thread terminates
- signaled -> nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3: System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues amp Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Diagram: DPC queue. Queue head -> DPC object -> DPC object -> DPC object]
6666
Delivering a DPC
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Diagram: interrupt dispatch table with the levels High, Power failure, Dispatch/DPC, APC and Low; DPC objects wait in the DPC queue until the dispatcher runs them.]
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Diagram: a thread object with two APC queues, kernel mode (K) and user mode (U), each linking APC objects.]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
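The four accounting rules can be read as a single decision function. The sketch below is a toy model of the clock ISR's choice, not the kernel's code; charge_t and clock_tick_charge are hypothetical names, and the user_mode flag is an assumption used to pick between the thread's user and kernel time.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,    /* thread's user time */
    CHARGE_THREAD_KERNEL,  /* thread's kernel time */
    CHARGE_DPC,            /* DPC time */
    CHARGE_INTERRUPT       /* interrupt time */
} charge_t;

/* Decides where one clock tick is charged, given the interrupted state. */
charge_t clock_tick_charge(int irql, bool in_dpc, bool user_mode) {
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: charged to the thread, split by processor mode. */
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```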
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- Some exclusively for synchronization, e.g., events, mutexes ("mutants"), semaphores, queues, timers
- Others can be waited for as a side effect of their prime function, e.g., processes, threads, file objects
- Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header (see \ntddk\inc\ddk\ntddk.h)
[Diagram: dispatcher object layout. A header with Size, Type, State and Wait list head, followed by object-type-specific data.]
All dispatcher objects are in one of two states
- "Signaled" vs. "nonsignaled"
- When signaled, a wait on the object is satisfied
- Different object types differ in terms of what changes their state
- Wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Diagram: each dispatcher object (header with Size, Type, State and Wait list head, plus object-type-specific data) links the wait blocks that wait on it. Each thread object's WaitBlockList chains that thread's own wait blocks. A wait block holds List entry, Object, Thread, Key, Type and Next link.]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once, whether or not an exception occurred */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads:
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
  - dwTimeout == 0: no wait, check whether object is signaled
  - dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for first signaled object or all objects. Function returns index of first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else        /* mutex was abandoned */
        break;  /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex )
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
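The blocking and polling styles side by side, as a minimal sketch on the POSIX side of the comparison. The counter and the helper names add_blocking, add_if_free and current_counter are hypothetical; the mutex has the default type, for which pthread_mutex_trylock() reports EBUSY when the mutex is already locked.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;  /* shared by all threads */

/* Blocking update: waits as long as needed for the mutex. */
void add_blocking(int value) {
    int rc = pthread_mutex_lock(&counter_lock);
    assert(rc == 0);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling update: returns 1 if applied, 0 if the mutex was busy. */
int add_if_free(int value) {
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc == EBUSY)
        return 0;
    assert(rc == 0);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return 1;
}

/* Reads the counter under the lock. */
int current_counter(void) {
    pthread_mutex_lock(&counter_lock);
    int value = counter;
    pthread_mutex_unlock(&counter_lock);
    return value;
}
```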
8888
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value
- ReleaseSemaphore() returns old semaphore count
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
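Since pthreads has no direct counterpart, a manual-reset event is usually emulated with a mutex, a condition variable and a flag. The sketch below shows one such emulation; manual_event and the event_* functions are hypothetical names, not part of any library.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool signaled;  /* stays true until event_reset() is called */
} manual_event;

void event_init(manual_event *e, bool initial_state) {
    assert(e != NULL);
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent() on a manual-reset event: releases all waiters. */
void event_set(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(): later waiters block again. */
void event_reset(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; does not reset it. */
void event_wait(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

bool event_is_set(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    bool state = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return state;
}
```

The flag is what makes the state persistent: a bare pthread_cond_broadcast() only wakes threads that are already waiting, whereas a set manual-reset event also lets later waiters pass.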
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Diagram: main starts prog1 and prog2, which are connected by a pipe.]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
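The same one-way, byte-oriented behaviour can be observed with the POSIX pipe() call, used here in place of CreatePipe() so that the sketch stays portable C. The function pipe_roundtrip is a hypothetical helper; it assumes the message is short enough to fit in the pipe buffer, so the write does not block.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Writes msg into a fresh pipe and reads it back into out.
 * Returns the number of bytes read, or -1 on error. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t out_size) {
    int fds[2];  /* fds[0] = read end, fds[1] = write end */
    size_t len = strlen(msg);
    assert(out_size > len);
    if (pipe(fds) != 0)
        return -1;
    ssize_t written = write(fds[1], msg, len);
    close(fds[1]);  /* closing the write end lets read() see end of file */
    if (written != (ssize_t)len) {
        close(fds[0]);
        return -1;
    }
    size_t total = 0;
    ssize_t n;
    while ((n = read(fds[0], out + total, out_size - 1 - total)) > 0)
        total += (size_t)n;
    close(fds[0]);
    if (n < 0)
        return -1;
    out[total] = '\0';
    return (ssize_t)total;
}
```

Data flows only from the write end to the read end; a reader on an empty pipe whose write end is still open would block, which is why the write end is closed before reading.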
9494
IO Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same named pipe
- Server can respond to client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine ("." means local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage);
9999
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
WriteFile + ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
CreateFile + WriteFile + ReadFile + CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if client connected between the CreateNamedPipe call and ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf (stderr, "Connect or Read Named Pipe error\n"); exit (4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen (ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}   /* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
107107
Locate a server via mailslot
[Figure: mailslot servers (application clients 0 to n) each call hMS = CreateMailslot() on the "status" mailslot and ReadFile(hMS, &ServStat), then connect to the server. The mailslot client (the application server) loops: Sleep(), hMS = CreateFile() on the "status" mailslot, WriteFile(hMS, &StatInfo). The message is sent periodically.]
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa )
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for this many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
2323
Semaphore Implementation
Semaphores may suspend/resume threads
– Avoid busy waiting
Define a semaphore as a record
typedef struct {
    int value;
    struct thread *L;
} semaphore;
Assume two simple operations
– suspend() suspends the thread that invokes it
– resume(T) resumes the execution of a blocked thread T
2424
Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
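The wait/signal definitions above can be sketched with POSIX threads. This is one possible rendering, not the slide author's code: a condition variable stands in for the waiting list S.L, pthread_cond_wait() plays the role of suspend(), and pthread_cond_signal() the role of resume(T). The wakeups counter and all function names are assumptions added so that a spurious wakeup cannot let a thread through without a matching signal.

```c
#include <assert.h>
#include <pthread.h>

typedef struct {
    int value;               /* < 0 means -value threads are suspended */
    int wakeups;             /* resume() calls not yet consumed by a waiter */
    pthread_mutex_t lock;    /* makes wait()/signal() atomic */
    pthread_cond_t  queue;   /* stands in for the waiting list S.L */
} semaphore;

void semaphore_init(semaphore *s, int initial)
{
    assert(initial >= 0);
    s->value = initial;
    s->wakeups = 0;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->queue, NULL);
}

void semaphore_wait(semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    s->value--;
    if (s->value < 0) {
        /* suspend(): sleep until a signal() hands out a wakeup */
        do {
            pthread_cond_wait(&s->queue, &s->lock);
        } while (s->wakeups < 1);
        s->wakeups--;
    }
    pthread_mutex_unlock(&s->lock);
}

void semaphore_signal(semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    s->value++;
    if (s->value <= 0) {
        /* resume(T): wake one suspended thread */
        s->wakeups++;
        pthread_cond_signal(&s->queue);
    }
    pthread_mutex_unlock(&s->lock);
}

/* Demo: a semaphore initialized to 1 guards a shared counter. */
static semaphore guard;
static long counter;

static void *add_many(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        semaphore_wait(&guard);
        counter++;                       /* critical section */
        semaphore_signal(&guard);
    }
    return NULL;
}

long run_semaphore_demo(int nthreads, long iterations)
{
    pthread_t threads[8];
    assert(nthreads >= 1 && nthreads <= 8);
    counter = 0;
    semaphore_init(&guard, 1);
    for (int i = 0; i < nthreads; i++)
        pthread_create(&threads[i], NULL, add_many, &iterations);
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);
    return counter;
}
```

Waiting threads sleep inside pthread_cond_wait() rather than spinning, which is the point of the suspend/resume formulation.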
2525
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code
Ti              Tj
A               wait(flag)
signal(flag)    B
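The A-before-B pattern maps directly onto POSIX unnamed semaphores. The sketch below assumes a platform where sem_init() is supported (for example Linux); the trace buffer and function names are hypothetical and exist only to make the resulting order observable.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>
#include <string.h>

static sem_t flag;        /* initialized to 0: B cannot start yet */
static char trace[3];     /* records the order in which A and B ran */
static int  pos;

static void *thread_i(void *arg)   /* Ti: A, then signal(flag) */
{
    (void)arg;
    trace[pos++] = 'A';
    sem_post(&flag);
    return NULL;
}

static void *thread_j(void *arg)   /* Tj: wait(flag), then B */
{
    (void)arg;
    sem_wait(&flag);
    trace[pos++] = 'B';
    return NULL;
}

const char *run_ordered(void)
{
    pthread_t ti, tj;
    pos = 0;
    memset(trace, 0, sizeof trace);
    int rc = sem_init(&flag, 0, 0);
    assert(rc == 0);
    (void)rc;
    pthread_create(&tj, NULL, thread_j, NULL);  /* start Tj first on purpose */
    pthread_create(&ti, NULL, thread_i, NULL);
    pthread_join(ti, NULL);
    pthread_join(tj, NULL);
    sem_destroy(&flag);
    return trace;
}
```

Even though Tj is created first, it blocks in sem_wait() until Ti has executed A and posted the semaphore, so the trace always reads "AB".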
2626
Two Types of Semaphores
Counting semaphore
– Integer value can range over an unrestricted domain
Binary semaphore
– Integer value can range only between 0 and 1
– Can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
2727
Deadlock and Starvation
Deadlock
– Two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0              T1
wait(S)         wait(Q)
wait(Q)         wait(S)
signal(S)       signal(Q)
signal(Q)       signal(S)
2828
Deadlock and Starvation
Starvation: indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution (for the deadlock shown before)
– All code should acquire/release semaphores in the same order
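A minimal sketch of the same-order rule, using pthread mutexes as stand-ins for the two semaphores S and Q initialized to 1. Ordering by address is one common convention and is an assumption here, as are all names; any fixed global order works.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t S = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t Q = PTHREAD_MUTEX_INITIALIZER;
static long shared_total;

/* Acquire both locks in one global order (lower address first), no matter
 * in which order the caller names them. */
static void lock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    assert(a != b);
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

static void *t0(void *arg)          /* names the locks as S, Q */
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        lock_both(&S, &Q);
        shared_total++;
        unlock_both(&S, &Q);
    }
    return NULL;
}

static void *t1(void *arg)          /* names the locks as Q, S */
{
    (void)arg;
    for (int i = 0; i < 100000; i++) {
        lock_both(&Q, &S);
        shared_total++;
        unlock_both(&Q, &S);
    }
    return NULL;
}

long run_ordered_locks(void)
{
    pthread_t a, b;
    shared_total = 0;
    pthread_create(&a, NULL, t0, NULL);
    pthread_create(&b, NULL, t1, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_total;
}
```

Because both threads end up taking the locks in the same order, the circular wait from the T0/T1 example cannot form and the run always completes.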
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events; an event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
3131
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons:
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc., like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
– Interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
3737
Trap dispatching
[Figure: trap handlers route each event to its handler. Interrupt: interrupt dispatcher, then interrupt service routines. System service call: system service dispatcher, then system services. HW exceptions and SW exceptions: exception dispatcher, then exception handlers. Virtual address exceptions: virtual memory manager's pager.]
Trap: processor's mechanism to capture an executing thread
– Switch from user to kernel mode
– Interrupts: asynchronous
– Exceptions: synchronous
Interrupt Dispatching
An interrupt arrives while user- or kernel-mode code is running; the dispatch routine and the ISR run in kernel mode
Interrupt dispatch routine
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch
3939
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
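The masking rule can be made concrete with a toy model. This is not kernel code: the cpu_model type and every function name below are invented for illustration, and only the rule itself (an interrupt is held pending while the processor runs at an equal or higher IRQL) comes from the text.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define PASSIVE_LEVEL   0
#define DISPATCH_LEVEL  2
#define HIGH_LEVEL     31

typedef struct {
    int      irql;      /* current IRQL of this (simulated) processor */
    uint32_t pending;   /* bit n set: an interrupt at IRQL n is waiting */
} cpu_model;

/* Returns true if the interrupt would be serviced at once; otherwise it is
 * masked (equal or lower IRQL) and remembered as pending. */
bool request_interrupt(cpu_model *cpu, int level)
{
    assert(level > PASSIVE_LEVEL && level <= HIGH_LEVEL);
    if (level > cpu->irql)
        return true;
    cpu->pending |= (uint32_t)1 << level;
    return false;
}

/* Raises the IRQL and returns the previous level so it can be restored. */
int raise_irql(cpu_model *cpu, int new_level)
{
    assert(new_level >= cpu->irql && new_level <= HIGH_LEVEL);
    int old = cpu->irql;
    cpu->irql = new_level;
    return old;
}

/* Lowers the IRQL and returns the highest pending level that is now
 * deliverable (clearing it), or -1 if nothing is deliverable. */
int lower_irql(cpu_model *cpu, int old_level)
{
    assert(old_level <= cpu->irql);
    cpu->irql = old_level;
    for (int level = HIGH_LEVEL; level > cpu->irql; level--) {
        uint32_t bit = (uint32_t)1 << level;
        if (cpu->pending & bit) {
            cpu->pending &= ~bit;
            return level;
        }
    }
    return -1;
}

/* Waits and page faults are only legal below DISPATCH_LEVEL. */
bool may_wait_or_page_fault(const cpu_model *cpu)
{
    return cpu->irql < DISPATCH_LEVEL;
}
```

For instance, after raising the model to level 10, a request at level 5 is held pending while a request at level 12 is serviced immediately; lowering back to 0 then reports level 5 as deliverable.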
4040
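Interrupt Precedence via IRQLs (x86)
[Figure: x86 IRQL table, highest to lowest; only the level numbers that survive in the transcript are shown]
31  High
30  Power fail
29  Interprocessor Interrupt
28  Clock
    Profile & Synch (Srv 2003)
    Device 1 (and further device levels)
2   Dispatch/DPC
1   APC
0   Passive/Low
Levels above Dispatch/DPC: hardware interrupts
Levels 1 and 2: deferrable software interrupts
Level 0: normal thread execution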
4141
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library, the Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn ISR on/off
Interrupt objects can synchronize access to ISR data
– Multiple instances of an ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected with one IRQL
4343
Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document, but has never been used
Inter-processor interrupt
– Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update system's clock, allocation of CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler: Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all; normal thread execution
4545
IRQLs on 64-bit Systems
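[Figure: IRQL tables for x64 and IA64, levels 0 to 15; rows listed from lowest to highest, exact level numbers not recoverable from the transcript]
x64: Passive/Low, APC, Dispatch/DPC, Device 1 ... Device n, Synch (Srv 2003), Clock, Interprocessor Interrupt/Power, High/Profile
IA64: Passive/Low, APC, Dispatch/DPC & Synch (UP only), Correctable Machine Check, Device 1 ... Device n, Synch (MP only), Clock, Interprocessor Interrupt, High/Profile/Power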
4646
Interrupt Prioritization & Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; this can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the I/O APIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
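The two deterministic mappings above are plain arithmetic. The helper names below are hypothetical, and the x64/IA64 rule is read as an integer division of the IDT vector number by 16.

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int x86_up_irql_from_irq(int irq)
{
    assert(irq >= 0);
    return 27 - irq;
}

/* x64 and IA64 rule from the slide: IRQL = IDT vector number / 16. */
int x64_irql_from_vector(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;     /* integer division: 16 vectors per IRQL */
}
```

Under these formulas, IRQ 1 on an x86 uniprocessor system maps to IRQL 26, and IDT vector 0x81 (129) on x64 maps to IRQL 8.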
4747
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when the kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
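A user-space sketch of the "single data cell plus atomic test-and-modify" idea, written with C11 atomic_flag. It illustrates the principle only and is not how KSPIN_LOCK is implemented; the spinlock type and all function names are assumptions.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

typedef struct {
    atomic_flag locked;     /* one bit of state: free or owned */
} spinlock;

static void spin_acquire(spinlock *l)
{
    /* Atomic test-and-set: loop while the previous value was "owned". */
    while (atomic_flag_test_and_set_explicit(&l->locked, memory_order_acquire))
        ;                   /* busy-wait */
}

static void spin_release(spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}

static spinlock queue_lock = { ATOMIC_FLAG_INIT };
static long shared_counter;

static void *worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        spin_acquire(&queue_lock);
        shared_counter++;           /* critical section */
        spin_release(&queue_lock);
    }
    return NULL;
}

long run_spinlock_demo(int nthreads, long iterations)
{
    pthread_t threads[8];
    assert(nthreads >= 1 && nthreads <= 8);
    shared_counter = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&threads[i], NULL, worker, &iterations);
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);
    return shared_counter;
}
```

Every waiting thread hammers the same memory cell with test-and-set, which is exactly the contention that queued spinlocks are designed to avoid.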
5050
Kernel Synchronization
[Figure: Processor A and Processor B both execute the sequence below against one shared DPC queue that is guarded by a spinlock]
do acquire_spinlock(DPC) until (SUCCESS)
begin
    remove DPC from queue    (critical section)
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires lock; other processors wait on a processor-local flag
– Thus, the busy-wait loop requires no access to the memory bus
When releasing the lock, the flag of the first waiting processor is modified
– Exactly one processor is being signaled
– Pre-determined wait order
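The queueing idea can be sketched with an MCS-style lock, a well-known published algorithm with the same properties (each waiter spins on its own flag, FIFO hand-off). It is offered as an analogy and is not claimed to be the Windows implementation. All names are assumptions, and the sched_yield() calls are a user-space courtesy for the case where threads outnumber cores; a kernel spinlock would not yield.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct qnode {
    _Atomic(struct qnode *) next;   /* successor in the wait queue */
    atomic_bool waiting;            /* the "processor-local" flag */
} qnode;

typedef struct {
    _Atomic(qnode *) tail;          /* last waiter, NULL when lock is free */
} queued_lock;

void qlock_acquire(queued_lock *l, qnode *me)
{
    atomic_store(&me->next, (qnode *)NULL);
    atomic_store(&me->waiting, true);
    qnode *prev = atomic_exchange(&l->tail, me);    /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->waiting))           /* spin on own flag only */
            sched_yield();
    }
}

void qlock_release(queued_lock *l, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, (qnode *)NULL))
            return;                                 /* nobody was waiting */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();                          /* successor mid-enqueue */
    }
    atomic_store(&succ->waiting, false);            /* signal exactly one */
}

static queued_lock dpc_lock;    /* static storage: tail starts as NULL */
static long shared_items;

static void *worker(void *arg)
{
    long iterations = *(long *)arg;
    qnode me;                   /* per-thread node, like a per-CPU entry */
    for (long i = 0; i < iterations; i++) {
        qlock_acquire(&dpc_lock, &me);
        shared_items++;
        qlock_release(&dpc_lock, &me);
    }
    return NULL;
}

long run_queued_lock_demo(int nthreads, long iterations)
{
    pthread_t threads[8];
    assert(nthreads >= 1 && nthreads <= 8);
    shared_items = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&threads[i], NULL, worker, &iterations);
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);
    return shared_items;
}
```

Each release touches only the successor's flag, so exactly one waiter is signaled and the hand-off order is the order in which threads joined the queue.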
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
ndash Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once; "all" means all objects must be in the signaled state concurrently to resolve the wait
ndash All wait calls include optional timeout argument
ndash Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
– Events (may be auto-reset or manual-reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
ndash Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on dispatcher objects outside the kernel: interaction with thread scheduling
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Create and initialize thread object leads to Initialized; a thread that waits on an object handle enters Waiting; setting the object to the signaled state completes the wait.]
5858
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread; it may preempt the running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled to signaled: owning thread releases mutex
– signaled to nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled to signaled: owning thread or other thread releases mutex
– signaled to nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Semaphore
– nonsignaled to signaled: one thread releases the semaphore, freeing a resource
– signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
– Effect: kernel resumes one or more waiting threads
6161
What signals an object (contd)
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Event
– nonsignaled to signaled: a thread sets the event
– signaled to nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled to signaled: dedicated thread sets one event in the event pair
– signaled to nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes waiting dedicated thread
Timer
– nonsignaled to signaled: timer expires
– signaled to nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– nonsignaled to signaled: thread terminates
– signaled to nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: DPC queue: a queue head linked to a chain of DPC objects]
6666
Delivering a DPC
[Figure: interrupt dispatch table with IRQLs from High and Power failure down to Dispatch/DPC, APC, and Low, next to a DPC queue holding DPC objects]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode, at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable, but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two queues of APC objects, K (kernel mode) and U (user mode)]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 & not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
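The four accounting rules form a small decision function. The sketch below is illustrative only: the enum, the function name, and the two boolean inputs (whether a DPC is being processed, and whether the interrupted thread was in user mode) are assumptions about what the clock ISR would need to know.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* Decides where one clock tick is charged, given the IRQL that was
 * interrupted by the clock. */
charge_target charge_clock_tick(int irql, bool processing_dpc, bool user_mode)
{
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return processing_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: the thread itself was running */
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Note that only the state at the moment of the tick matters, which is why work that starts and finishes between two ticks is never charged to anything.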
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, a system that is busy while no process appears to be running must be busy with interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– The context switch counts shown for these are really counts of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g. one or more threads run and enter a wait state before the clock fires
– Thus, threads may run but never get charged
View context switch activity with Process Explorer
ndash Add Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1 Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– Some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
[Figure: dispatcher object layout: header with Size, Type, State, and wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– When signaled, a wait on the object is satisfied
– Different object types differ in terms of what changes their state
– Wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (Size, Type, State, wait list head, object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds a list entry, Object and Thread pointers, Key, Type, and Next link]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec )
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec )
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection ( &crit );
/* … main loop in any of the threads */
while (!done) {
    __try {
        EnterCriticalSection ( &crit );
        counter += local_value;
    }
    /* leave exactly once, even if the guarded code raises an exception */
    __finally { LeaveCriticalSection ( &crit ); }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout )
8383
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
– dwTimeout == 0: no wait, check whether object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for the first signaled object or for all objects
– When waiting for the first signaled object, the function returns the index of that object (as an offset from WAIT_OBJECT_0)
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName )
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName )
BOOL ReleaseMutex( HANDLE hMutex )
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
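The polling behaviour of pthread_mutex_trylock() can be wrapped in a small helper. The helper name and its return convention are assumptions chosen for this sketch; the three-way result loosely mirrors how a zero-timeout wait either succeeds, times out, or fails.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Polls the mutex once without blocking.
 * Returns 1 if the lock was acquired, 0 if it is currently held,
 * and -1 on any other error. */
int try_enter(pthread_mutex_t *m)
{
    assert(m != NULL);
    int rc = pthread_mutex_trylock(m);
    if (rc == 0)
        return 1;
    if (rc == EBUSY)
        return 0;
    return -1;
}
```

With a default (non-recursive) mutex, a first try_enter() returns 1, a second call while the lock is still held returns 0 immediately instead of blocking, and after pthread_mutex_unlock() the next call returns 1 again.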
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() returns old semaphore count
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName )
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName )
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount )
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )
BOOL ResetEvent( HANDLE hEvent )
BOOL PulseEvent( HANDLE hEvent )
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
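Although there is no exact equivalent, a manual-reset event can be approximated by pairing a condition variable with a mutex and a boolean that remembers the signaled state. The manual_event type and its functions below are a sketch under that assumption; they are not a POSIX or Windows API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool signaled;          /* stays true until event_reset() is called */
} manual_event;

void event_init(manual_event *e, bool initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

void event_set(manual_event *e)     /* release all current and future waiters */
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

void event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

void event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)            /* the loop absorbs spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

static manual_event start_gate;
static atomic_int released;

static void *waiter(void *arg)
{
    (void)arg;
    event_wait(&start_gate);
    atomic_fetch_add(&released, 1);
    return NULL;
}

/* Starts nthreads waiters, sets the event once, and returns how many
 * threads got through (or -1 if any passed before the event was set). */
int run_event_demo(int nthreads)
{
    pthread_t threads[8];
    assert(nthreads >= 1 && nthreads <= 8);
    atomic_store(&released, 0);
    event_init(&start_gate, false);
    for (int i = 0; i < nthreads; i++)
        pthread_create(&threads[i], NULL, waiter, NULL);
    int before = atomic_load(&released);
    event_set(&start_gate);
    for (int i = 0; i < nthreads; i++)
        pthread_join(threads[i], NULL);
    return before == 0 ? atomic_load(&released) : -1;
}
```

The stored boolean is what a bare condition variable lacks: a thread that calls event_wait() after event_set() still passes, just as with a signaled manual-reset event.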
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe )
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
IO Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf (stderr, "Anon pipe create failed\n"); exit (1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf (stderr, "CreateProc1 failed\n"); exit (2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf (stderr, "CreateProc2 failed\n"); exit (3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– Location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa )
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. means local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)

Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage);

Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo

Convenience Functions
BOOL TransactNamedPipe (HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
Combines a WriteFile, ReadFile sequence in a single call

Convenience Functions
BOOL CallNamedPipe (LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
Combines CreateFile, WriteFile, ReadFile, CloseHandle in a single call
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

Comparison with UNIX
UNIX FIFOs are similar to named pipes
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios

Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf (stderr, "Failure to locate server\n"); exit (3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;

Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf (stderr, "Connect or Read Named Pipe error\n"); exit (4);
    }
    printf ("Request is %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen (ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain

Locate a server via mailslot
[Figure: application clients 0 ... n act as mailslot servers; the application server acts as the mailslot client]
Mailslot servers (app client 0 ... app client n)
    hMS = CreateMailslot ("\\.\mailslot\status" ...);
    ReadFile (hMS, &ServStat ...);
    /* connect to server */
Mailslot client (app server)
    while (...) {
        Sleep (...);
        hMS = CreateFile ("\\*\mailslot\status" ...);
        WriteFile (hMS, &StatInfo ...
    }
Message is sent periodically

Creating a mailslot
HANDLE CreateMailslot (LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa);
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait

Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name - retrieve handle for local mailslot
– \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活

Implementation
Semaphore operations now defined as
wait(S):
    S.value--;
    if (S.value < 0) {
        add this thread to S.L;
        suspend();
    }
signal(S):
    S.value++;
    if (S.value <= 0) {
        remove a thread T from S.L;
        resume(T);
    }
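The wait/signal definitions above can be sketched in C with POSIX threads. This is a minimal sketch under assumptions, not the slides' implementation: the type `semaphore_t`, the `wakeups` counter, and the use of a condition variable in place of the list S.L are all choices made for this illustration.

```c
#include <assert.h>
#include <pthread.h>

typedef struct {
    int value;               /* S.value; when negative, -value threads are waiting */
    int wakeups;             /* resume() calls issued but not yet consumed */
    pthread_mutex_t lock;    /* makes wait and signal atomic */
    pthread_cond_t queue;    /* stands in for the waiting list S.L (assumption) */
} semaphore_t;

void semaphore_init(semaphore_t *s, int initial) {
    assert(initial >= 0);
    s->value = initial;
    s->wakeups = 0;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->queue, NULL);
}

void semaphore_wait(semaphore_t *s) {
    pthread_mutex_lock(&s->lock);
    s->value--;
    if (s->value < 0) {
        /* suspend(): sleep until a signal has granted a wakeup;
           the loop also absorbs spurious condition-variable wakeups */
        do {
            pthread_cond_wait(&s->queue, &s->lock);
        } while (s->wakeups < 1);
        s->wakeups--;
    }
    pthread_mutex_unlock(&s->lock);
}

void semaphore_signal(semaphore_t *s) {
    pthread_mutex_lock(&s->lock);
    s->value++;
    if (s->value <= 0) {
        /* resume(T): hand one wakeup to some waiting thread */
        s->wakeups++;
        pthread_cond_signal(&s->queue);
    }
    pthread_mutex_unlock(&s->lock);
}
```

As on the slide, the value is allowed to go negative so that its magnitude counts the suspended threads.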

Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code
    Ti              Tj
    A               wait(flag)
    signal(flag)    B
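The ordering pattern can be tried out with POSIX unnamed semaphores. A small sketch, assuming a Linux-like system where `sem_init` is available; the helper `run_a_before_b` and the step counters are hypothetical names used only to observe the order.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t flag;             /* the semaphore "flag", initialized to 0 */
static int step;               /* shared counter, only used to record the order */
static int a_ran_at, b_ran_at;

static void *thread_i(void *arg) {
    (void)arg;
    a_ran_at = ++step;         /* statement A */
    sem_post(&flag);           /* signal(flag) */
    return NULL;
}

static void *thread_j(void *arg) {
    (void)arg;
    sem_wait(&flag);           /* wait(flag): blocks until Ti has signalled */
    b_ran_at = ++step;         /* statement B */
    return NULL;
}

/* Runs Ti and Tj once; returns 1 if A executed before B, else 0. */
int run_a_before_b(void) {
    pthread_t ti, tj;
    step = 0;
    a_ran_at = b_ran_at = 0;
    sem_init(&flag, 0, 0);
    pthread_create(&tj, NULL, thread_j, NULL);   /* start the waiter first on purpose */
    pthread_create(&ti, NULL, thread_i, NULL);
    pthread_join(ti, NULL);
    pthread_join(tj, NULL);
    sem_destroy(&flag);
    return a_ran_at == 1 && b_ran_at == 2;
}
```

Even though Tj is started first, B cannot run until Ti has executed A and signalled the semaphore.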

Two Types of Semaphores
Counting semaphore
– integer value can range over an unrestricted domain
Binary semaphore
– integer value can range only between 0 and 1
– can be simpler to implement
A counting semaphore S can be implemented using binary semaphores

Deadlock and Starvation
Deadlock
– two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
    T0            T1
    wait(S)       wait(Q)
    wait(Q)       wait(S)
    signal(S)     signal(Q)
    signal(Q)     signal(S)

Deadlock and Starvation
Starvation – indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution (to the deadlock example)
– all code should acquire/release semaphores in the same order
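The "same order" rule can be enforced mechanically. The sketch below uses pthread mutexes instead of semaphores and orders them by address; both choices, and the names `lock_pair`, `unlock_pair` and `run_ordered`, are assumptions made for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static pthread_mutex_t S = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t Q = PTHREAD_MUTEX_INITIALIZER;
static long shared_total;

/* Acquire both locks in one global order (lower address first),
   regardless of the order in which the caller names them. */
static void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    assert(a != b);
    if ((uintptr_t)a > (uintptr_t)b) { pthread_mutex_t *t = a; a = b; b = t; }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

static void *t0(void *arg) {
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        lock_pair(&S, &Q);           /* names S first, like T0 on the slide */
        shared_total++;
        unlock_pair(&S, &Q);
    }
    return NULL;
}

static void *t1(void *arg) {
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        lock_pair(&Q, &S);           /* names Q first, like T1 on the slide */
        shared_total++;
        unlock_pair(&Q, &S);
    }
    return NULL;
}

/* Runs T0 and T1 concurrently; returns the final total. */
long run_ordered(long iterations) {
    pthread_t a, b;
    shared_total = 0;
    pthread_create(&a, NULL, t0, &iterations);
    pthread_create(&b, NULL, t1, &iterations);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_total;
}
```

Because both threads end up locking in the same order, the circular wait from the T0/T1 example cannot form.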

Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects, which may act as mutexes and semaphores
Dispatcher objects may also provide events
– An event acts much like a condition variable

Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing

Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks

3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization

Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3

Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system

Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc., like any other threads

Getting Into Kernel Mode
3. Interrupts from external devices
– Interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread

Trap dispatching
Trap: processor's mechanism to capture executing thread
– Switch from user to kernel mode
– Interrupts: asynchronous
– Exceptions: synchronous
[Figure: trap handlers route each kind of trap to its dispatcher]
– Interrupt -> interrupt dispatcher -> interrupt service routines
– System service call -> system service dispatcher -> system services
– HW exceptions, SW exceptions -> exception dispatcher -> exception handlers
– Virtual address exceptions -> virtual memory manager's pager

Interrupt Dispatching
[Figure: an interrupt arrives while user or kernel mode code is running; the dispatch routine and the ISR run in kernel mode]
Interrupt dispatch routine
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch

Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
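
Interrupt Precedence via IRQLs (x86)
IRQL  Name                          Category
31    High                          Hardware interrupts
30    Power fail                    Hardware interrupts
29    Interprocessor interrupt      Hardware interrupts
28    Clock                         Hardware interrupts
      Profile & Synch (Srv 2003)    Hardware interrupts
      Device 1                      Hardware interrupts
2     Dispatch/DPC                  Deferrable software interrupts
1     APC                           Deferrable software interrupts
0     Passive/Low                   Normal thread execution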

Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library, the Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level

Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn ISR on/off
Interrupt objects can synchronize access to ISR data
– Multiple instances of an ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected with one IRQL

Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document, but has never been used
Inter-processor interrupt
– Used to request action from other processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update system's clock, allocation of CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler: Kernprof.exe, Res Kit)

Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all, normal thread execution

IRQLs on 64-bit Systems
IRQL  x64                               IA64
15    High/Profile                      High/Profile/Power
14    Interprocessor Interrupt/Power    Interprocessor Interrupt
13    Clock                             Clock
12    Synch (Srv 2003)                  Synch (MP only)
      Device n                          Device n
4                                       Device 1
3     Device 1                          Correctable Machine Check
2     Dispatch/DPC                      Dispatch/DPC & Synch (UP only)
1     APC                               APC
0     Passive/Low                       Passive/Low

Interrupt Prioritization & Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device. Can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the I/O APIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source. Processors are assigned round robin for each interrupt vector
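The two deterministic IRQL rules are simple arithmetic. A tiny sketch: the function names and the accepted ranges (IRQ 0..15, vector 0..255) are assumptions for illustration, and the x64/IA64 rule is read as integer division of the vector number by 16.

```c
#include <assert.h>

/* x86 uniprocessor rule: IRQL = 27 - IRQ.
   The 0..15 range is an assumption (legacy interrupt lines). */
int irql_from_irq_x86_up(int irq) {
    assert(irq >= 0 && irq <= 15);
    return 27 - irq;
}

/* x64 and IA64 rule: IRQL = IDT vector number / 16 (integer division). */
int irql_from_vector_64bit(int vector) {
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

For example, under these rules IRQ 8 maps to IRQL 19 on an x86 UP system, and vector 0x41 maps to IRQL 4 on x64.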

Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations

Flow of Interrupts

Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
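The "data cell plus atomic test-and-modify" idea can be sketched in user space with C11 atomics. This is an illustration only, not the kernel's KSPIN_LOCK code; `spinlock_t`, `spin_acquire`, `spin_release` and `run_spinlock_demo` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* One flag is enough: set = owned, clear = free. */
typedef struct { atomic_flag held; } spinlock_t;

static spinlock_t queue_lock = { ATOMIC_FLAG_INIT };
static long protected_counter;

static void spin_acquire(spinlock_t *l) {
    /* Atomic test-and-set: returns the previous value, so the loop
       exits only for the caller that flipped the flag from clear to set. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* busy-wait */
    }
}

static void spin_release(spinlock_t *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static void *spin_worker(void *arg) {
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        spin_acquire(&queue_lock);
        protected_counter++;          /* critical section */
        spin_release(&queue_lock);
    }
    return NULL;
}

/* Two threads each add `iterations` to the counter under the spinlock. */
long run_spinlock_demo(long iterations) {
    pthread_t a, b;
    assert(iterations >= 0);
    protected_counter = 0;
    pthread_create(&a, NULL, spin_worker, &iterations);
    pthread_create(&b, NULL, spin_worker, &iterations);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return protected_counter;
}
```

Every failed attempt performs another atomic write to the shared flag, which is the bus contention that queued spinlocks are meant to avoid.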

Kernel Synchronization
[Figure: Processor A and Processor B compete for the spinlock that guards the DPC queue; both run the same code]
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin                          /* critical section */
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue

Queued Spinlocks
Problem: checking the status of a spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
– Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the 1st processor's flag is modified
– Exactly one processor is being signaled
– Pre-determined wait order
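A well-known algorithm with these properties is the MCS queue lock. The sketch below is a user-space illustration in C11 and is not taken from the Windows kernel; all names are hypothetical, and `sched_yield()` in the wait loops is an assumption added so the demo behaves when threads outnumber CPUs.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;   /* successor in the wait queue */
    atomic_bool must_wait;             /* local flag: only its owner spins on it */
} mcs_node_t;

typedef struct { _Atomic(mcs_node_t *) tail; } mcs_lock_t;   /* NULL tail = free */

void mcs_acquire(mcs_lock_t *lock, mcs_node_t *me) {
    atomic_store(&me->next, NULL);
    atomic_store(&me->must_wait, true);
    mcs_node_t *prev = atomic_exchange(&lock->tail, me);     /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);                       /* link behind predecessor */
        while (atomic_load(&me->must_wait))
            sched_yield();                                   /* wait on own flag only */
    }
}

void mcs_release(mcs_lock_t *lock, mcs_node_t *me) {
    mcs_node_t *succ = atomic_load(&me->next);
    if (succ == NULL) {
        mcs_node_t *expected = me;
        if (atomic_compare_exchange_strong(&lock->tail, &expected, NULL))
            return;                                          /* nobody was waiting */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();                                   /* a waiter is mid-enqueue */
    }
    atomic_store(&succ->must_wait, false);                   /* signal exactly one waiter */
}

static mcs_lock_t demo_lock;
static long demo_counter;

static void *mcs_worker(void *arg) {
    long n = *(long *)arg;
    mcs_node_t me;                     /* each thread brings its own queue node */
    for (long i = 0; i < n; i++) {
        mcs_acquire(&demo_lock, &me);
        demo_counter++;
        mcs_release(&demo_lock, &me);
    }
    return NULL;
}

long run_mcs_demo(long iterations) {
    pthread_t a, b;
    assert(iterations >= 0);
    demo_counter = 0;
    pthread_create(&a, NULL, mcs_worker, &iterations);
    pthread_create(&b, NULL, mcs_worker, &iterations);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return demo_counter;
}
```

Waiters are served in the order in which they swapped themselves into the tail, which gives the pre-determined wait order mentioned above.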

SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU. Allows thread state transitions in parallel

SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks). Doesn't use spinlocks when there is no contention. Used for object manager and address windowing extensions (AWE) related locks

Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once. "All": all objects must be in the signalled state concurrently to resolve the wait
– All wait calls include optional timeout argument
– Waiting threads consume no CPU time

Waiting
Waitable objects include
– Events (may be auto-reset or manual reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signalled upon exit or terminate)
– Directories (change notification)

Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object, and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable

Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
Interaction with thread scheduling
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition, Terminated]
Labelled transitions in the figure
– Create and initialize thread object
– Thread waits on an object handle
– Wait is complete
– Set object to signaled state

Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost

Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread: it may preempt a running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later

What signals an object
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled -> signaled: owning thread releases mutex
– signaled -> nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled -> signaled: owning thread or other thread releases mutex
– signaled -> nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Semaphore
– nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
– signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
– Effect: kernel resumes one or more waiting threads

What signals an object (contd)
Event
– nonsignaled -> signaled: a thread sets the event
– signaled -> nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled -> signaled: dedicated thread sets one event in the event pair
– signaled -> nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes waiting dedicated thread
Timer
– nonsignaled -> signaled: timer expires
– signaled -> nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– nonsignaled -> signaled: thread terminates
– signaled -> nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads

Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)

3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects

Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU. DPCs are normally queued to the current processor, but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
– Maximum times recommended: ISR: 10 usec, DPC: 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx

Deferred Procedure Calls (DPCs)
[Figure: DPC queue: queue head -> DPC object -> DPC object -> DPC object]

Delivering a DPC
[Figure: IRQL levels (High, Power failure, ..., Dispatch/DPC, APC, Low) index the interrupt dispatch table; DPC objects wait in the DPC queue]
1. Timer expires, kernel queues DPC that will release all waiting threads. Kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped

Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs

Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()

Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode, at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable, but not documented

Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"

Asynchronous Procedure Calls (APCs)
[Figure: a thread object with a kernel-mode (K) and a user-mode (U) APC queue, each a list of APC objects]

IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
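The four accounting rules can be written as one decision function. A sketch only: the enum, the function name and the `was_user_mode` parameter (used to pick between user and kernel time below IRQL 2) are assumptions for illustration, not kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,     /* thread's user time */
    CHARGE_THREAD_KERNEL,   /* thread's kernel time */
    CHARGE_DPC,             /* DPC time */
    CHARGE_INTERRUPT        /* interrupt time */
} charge_t;

/* Decides where one clock tick is charged, given the IRQL that the
   clock interrupt preempted. */
charge_t clock_tick_charge(int irql, bool in_dpc, bool was_user_mode) {
    assert(irql >= 0);
    if (irql < 2)
        return was_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return CHARGE_INTERRUPT;
}
```

A tick that arrives at IRQL 2 outside a DPC is still billed to the current thread, while anything above IRQL 2 is never billed to a thread.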

IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)

Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs

Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g. one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column

Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat

Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
[Figure: dispatcher object layout: header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signalled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects

Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: each thread object points to its WaitBlockList; each wait block holds List entry, Object, Thread, Key, Type and Next link; each dispatcher object (Size, Type, State, Wait list head, object-type-specific data) chains the wait blocks of its waiters]

3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots

Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however, the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection (LPCRITICAL_SECTION sec);
VOID DeleteCriticalSection (LPCRITICAL_SECTION sec);
VOID EnterCriticalSection (LPCRITICAL_SECTION sec);
VOID LeaveCriticalSection (LPCRITICAL_SECTION sec);
BOOL TryEnterCriticalSection (LPCRITICAL_SECTION sec);

Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection (&crit);
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection (&crit);
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once, even if the body raises an exception */
        LeaveCriticalSection (&crit);
    }
}
DeleteCriticalSection (&crit);

Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject (HANDLE hObject, DWORD dwTimeout);
DWORD WaitForMultipleObjects (DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout);

Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
  dwTimeout == 0 - no wait, check whether object is signaled
  dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles - pointer to array identifying these objects
– bWaitAll - whether to wait for first signaled object or all objects. Function returns index of first signaled object
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions

Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object

Mutexes
HANDLE CreateMutex (LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName);
HANDLE OpenMutex (DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName);
BOOL ReleaseMutex (HANDLE hMutex);

Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex (NULL, FALSE, NULL);
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject (mutex, INFINITE);
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex (mutex);
}
CloseHandle (mutex);

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block - equivalent to WaitForSingleObject (hMutex)
– pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
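The blocking and polling variants can be put side by side on the shared counter from the mutex example. A minimal sketch; the function names `add_blocking` and `add_if_free` are hypothetical.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;        /* shared by all threads, guarded by counter_lock */

/* Blocking variant: comparable to WaitForSingleObject(hMutex, INFINITE). */
void add_blocking(int local_value) {
    pthread_mutex_lock(&counter_lock);
    counter += local_value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling variant: comparable to WaitForSingleObject(hMutex, 0).
   Returns 1 if the update happened, 0 if the mutex was busy. */
int add_if_free(int local_value) {
    if (pthread_mutex_trylock(&counter_lock) != 0)
        return 0;              /* mutex is held elsewhere; do not block */
    counter += local_value;
    pthread_mutex_unlock(&counter_lock);
    return 1;
}
```

A caller of `add_if_free` decides for itself what to do on a busy mutex, for example retrying later or doing other work first.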

Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() returns old semaphore count

Semaphores
HANDLE CreateSemaphore (LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName);
HANDLE OpenSemaphore (DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName);
BOOL ReleaseSemaphore (HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount);

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE - create event in signaled state

Events
HANDLE CreateEvent (LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName);
BOOL SetEvent (HANDLE hEvent);
BOOL ResetEvent (HANDLE hEvent);
BOOL PulseEvent (HANDLE hEvent);
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
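Since there is no exact equivalent, a manual-reset event is usually approximated with a flag, a mutex and a condition variable. The following is one possible sketch; the type mr_event and all function names are hypothetical, and it makes no claim to match Windows semantics in every detail (PulseEvent, for instance, is not modeled).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical manual-reset event: flag + mutex + condition variable. */
typedef struct {
    pthread_mutex_t mtx;
    pthread_cond_t  cond;
    bool signaled;
} mr_event;

void mr_event_init(mr_event *e, bool initial_state) {
    pthread_mutex_init(&e->mtx, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent(): the event stays signaled and all waiters are woken. */
void mr_event_set(mr_event *e) {
    pthread_mutex_lock(&e->mtx);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->mtx);
}

/* Like ResetEvent(): back to the nonsignaled state. */
void mr_event_reset(mr_event *e) {
    pthread_mutex_lock(&e->mtx);
    e->signaled = false;
    pthread_mutex_unlock(&e->mtx);
}

/* Blocks until the event is signaled; the loop handles spurious wakeups. */
void mr_event_wait(mr_event *e) {
    pthread_mutex_lock(&e->mtx);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->mtx);
    pthread_mutex_unlock(&e->mtx);
}

void mr_event_destroy(mr_event *e) {
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->mtx);
}

static mr_event ev;
static int released;
static pthread_mutex_t rel_mtx = PTHREAD_MUTEX_INITIALIZER;

static void *waiter(void *arg) {
    (void)arg;
    mr_event_wait(&ev);
    pthread_mutex_lock(&rel_mtx);
    released++;
    pthread_mutex_unlock(&rel_mtx);
    return NULL;
}

/* One set() releases all three waiters; returns how many got through. */
int run_event_demo(void) {
    pthread_t t[3];
    released = 0;
    mr_event_init(&ev, false);
    for (int i = 0; i < 3; i++)
        pthread_create(&t[i], NULL, waiter, NULL);
    mr_event_set(&ev);
    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    mr_event_destroy(&ev);
    return released;
}
```

Because the flag persists, a waiter that arrives after mr_event_set() still passes, which is the behavior a bare pthread_cond_broadcast() cannot provide.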
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe )
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
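The same one-way, blocking behavior can be observed with a POSIX pipe, used here as an analogue of CreatePipe() rather than the Windows call itself. The function pipe_roundtrip is a hypothetical helper for this sketch, and out_size is assumed to be at least 1.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sends msg from a child process to the parent through a one-way pipe.
   Returns the number of bytes the parent read, or -1 on error. */
long pipe_roundtrip(const char *msg, char *out, size_t out_size) {
    int fd[2];                      /* fd[0] = read end, fd[1] = write end */
    if (pipe(fd) != 0)
        return -1;
    pid_t pid = fork();
    if (pid < 0) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    if (pid == 0) {                 /* child: writer only */
        close(fd[0]);
        if (write(fd[1], msg, strlen(msg)) < 0)
            _exit(1);
        close(fd[1]);
        _exit(0);
    }
    close(fd[1]);                   /* parent: reader only; closing the unused
                                       write end lets read() see end-of-file */
    long total = 0;
    ssize_t n;
    while ((size_t)total < out_size - 1 &&
           (n = read(fd[0], out + total, out_size - 1 - (size_t)total)) > 0)
        total += n;                 /* read() blocks while the pipe is empty */
    out[total] = '\0';
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return total;
}
```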
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same pipe name
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa )
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on remote machine (. - local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd.)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive - is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage )
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa )
- WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut )
- CreateFile, WriteFile, ReadFile, CloseHandle sequence
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo )
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many comm.)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 byte)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile() - open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
Locate a server via mailslot
[Figure: App clients 0..n act as mailslot servers; the app server acts as mailslot client]
Mailslot servers (App client 0 ... App client n)
- hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile( hMS, &ServStat ); then connect to server
Mailslot client (App server)
- while (...) { Sleep(...); hMS = CreateFile( "\\*\mailslot\status" ); ... WriteFile( hMS, &StatInfo ) }
- Message is sent periodically
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa )
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is max. msg size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0 - immediate return
- MAILSLOT_WAIT_FOREVER - infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max. msg. len. 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
Semaphore as a General Synchronization Tool
Execute B in Tj only after A executed in Ti
Use semaphore flag initialized to 0
Code:
Ti              Tj
A               wait(flag)
signal(flag)    B
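The ordering pattern above can be tried out with POSIX semaphores, where sem_wait() and sem_post() play the roles of wait() and signal(). This is a sketch assuming a platform with unnamed POSIX semaphores (for example Linux); thread_i, thread_j and run_ordering_demo are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <semaphore.h>

static sem_t flag;      /* initialized to 0, so B has to wait for A */
static int trace[2];    /* order in which A and B actually ran */
static int pos;

static void *thread_i(void *arg) {
    (void)arg;
    trace[pos++] = 'A';     /* statement A */
    sem_post(&flag);        /* signal(flag) */
    return NULL;
}

static void *thread_j(void *arg) {
    (void)arg;
    sem_wait(&flag);        /* wait(flag) */
    trace[pos++] = 'B';     /* statement B */
    return NULL;
}

/* Starts Tj first on purpose; returns 1 if A was still recorded before B. */
int run_ordering_demo(void) {
    pthread_t ti, tj;
    pos = 0;
    sem_init(&flag, 0, 0);
    pthread_create(&tj, NULL, thread_j, NULL);
    pthread_create(&ti, NULL, thread_i, NULL);
    pthread_join(ti, NULL);
    pthread_join(tj, NULL);
    sem_destroy(&flag);
    return trace[0] == 'A' && trace[1] == 'B';
}
```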
Two Types of Semaphores
Counting semaphore
- integer value can range over an unrestricted domain
Binary semaphore
- integer value can range only between 0 and 1
- can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
Deadlock and Starvation
Deadlock
- two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0              T1
wait(S)         wait(Q)
wait(Q)         wait(S)
signal(S)       signal(Q)
signal(Q)       signal(S)
Deadlock and Starvation
Starvation - indefinite blocking
- A thread may never be removed from the semaphore queue in which it is suspended
Solution (for the deadlock between T0 and T1)
- all code should acquire/release semaphores in the same order
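One common way to enforce a single acquisition order is to sort the locks by address before taking them. The sketch below uses pthread mutexes in place of the semaphores S and Q; lock_pair, unlock_pair and run_lock_order_demo are hypothetical helpers, and the two mutexes passed in are assumed to be distinct.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

/* Locks both mutexes in one global order: lower address first. */
void lock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

void unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b) {
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

static pthread_mutex_t S = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t Q = PTHREAD_MUTEX_INITIALIZER;
static int shared;

/* T0 names the locks as (S, Q), T1 as (Q, S), as on the deadlock slide;
   lock_pair() still acquires them in the same order in both threads. */
static void *t0(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        lock_pair(&S, &Q);
        shared++;
        unlock_pair(&S, &Q);
    }
    return NULL;
}

static void *t1(void *arg) {
    (void)arg;
    for (int i = 0; i < 10000; i++) {
        lock_pair(&Q, &S);
        shared++;
        unlock_pair(&Q, &S);
    }
    return NULL;
}

/* Returns the final value of shared once both threads have finished. */
int run_lock_order_demo(void) {
    pthread_t a, b;
    shared = 0;
    pthread_create(&a, NULL, t0, NULL);
    pthread_create(&b, NULL, t1, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared;
}
```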
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events. An event acts much like a condition variable
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
- Chapter 7 - Process Synchronization
- Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons:
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times, mostly in the "System" process
- Scheduled, preempted, etc., like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
- interrupt dispatcher invokes the interrupt service routine
- ISR runs in the context of the interrupted thread, so-called "arbitrary thread context"
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread
Trap dispatching
[Figure: trap sources, their dispatchers and handlers]
- Interrupt -> Interrupt dispatcher -> Interrupt service routines
- System service call -> System service dispatcher -> System services
- HW exceptions, SW exceptions -> Exception dispatcher -> Exception handlers
- Virtual address exceptions -> Virtual memory manager's pager
Trap: processor's mechanism to capture executing thread
- Switch from user to kernel mode
- Interrupts - asynchronous
- Exceptions - synchronous
Interrupt Dispatching
[Figure: an interrupt arrives while user or kernel mode code is running; control passes to kernel mode code]
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- this masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
Interrupt Precedence via IRQLs (x86)
[Figure: x86 IRQL levels, highest to lowest]
31  High
30  Power fail
29  Interprocessor Interrupt
28  Clock
    Profile & Synch (Srv 2003)
    Device 1 ...
2   Dispatch/DPC
1   APC
0   Passive/Low
Levels above Dispatch/DPC: hardware interrupts. Dispatch/DPC and APC: deferrable software interrupts. Passive/Low: normal thread execution
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library - Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected with the same IRQL
Predefined IRQLs
High
- used when halting the system (via KeBugCheck())
Power fail
- originated in the NT design document, but has never been used
Inter-processor interrupt
- used to request action from other processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update system's clock, allocation of CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler - Kernprof.exe, Res Kit)
Predefined IRQLs (contd.)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all, normal thread execution
IRQLs on 64-bit Systems
[Figure: IRQL levels 0 to 15 on x64 and IA64, listed here from lowest to highest]
x64: Passive/Low, APC, Dispatch/DPC, Device 1 ... Device n, Synch (Srv 2003), Clock, Interprocessor Interrupt/Power, High/Profile
IA64: Passive/Low, APC, Dispatch/DPC & Synch (UP only), Correctable Machine Check, Device 1 ... Device n, Synch (MP only), Clock, Interprocessor Interrupt, High/Profile/Power
Interrupt Prioritization & Delivery
IRQLs are determined as follows:
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device. Can be configured with IntFilter utility in Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source. Processors are assigned round robin for each interrupt vector
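The two deterministic mappings on this slide are simple arithmetic and can be written down directly. This sketch reads the x64/IA64 rule as "IDT vector number divided by 16" (integer division), which is an interpretation of the extracted slide text; the function names are hypothetical.

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ.
   Assumes a classic IRQ number in the range 0..15. */
int irql_x86_up(int irq) {
    assert(irq >= 0 && irq <= 15);
    return 27 - irq;
}

/* x64 / IA64 rule as read here: IRQL = IDT vector number / 16,
   so each block of 16 vectors shares one IRQL. */
int irql_from_vector(int vector) {
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

For example, IRQ 1 maps to IRQL 26 on an x86 uniprocessor system, and vectors 0x40 through 0x4F all map to IRQL 4 under the second rule.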
Software interrupts
Initiating thread dispatching
- DPC: allow for scheduling actions when kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Sync on MP: use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
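The idea of a spinlock as a single cell accessed with an atomic test-and-modify operation can be sketched in user space with C11 atomics. This is an illustration only, not the kernel's KSPIN_LOCK code; toy_spinlock, spin_acquire, spin_release and run_spinlock_demo are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* One flag is enough: clear = free, set = owned. */
typedef struct {
    atomic_flag held;
} toy_spinlock;

static void spin_acquire(toy_spinlock *l) {
    /* Atomic test-and-set: returns the previous value, so the loop
       ends only for the caller that changed the flag from clear to set. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;   /* busy-wait */
}

static void spin_release(toy_spinlock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static toy_spinlock lock = { ATOMIC_FLAG_INIT };
static long counter;

static void *worker(void *arg) {
    (void)arg;
    for (int i = 0; i < 50000; i++) {
        spin_acquire(&lock);
        counter += 1;           /* critical section */
        spin_release(&lock);
    }
    return NULL;
}

/* Two threads contend for the lock; returns the final counter. */
long run_spinlock_demo(void) {
    pthread_t a, b;
    counter = 0;
    pthread_create(&a, NULL, worker, NULL);
    pthread_create(&b, NULL, worker, NULL);
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return counter;
}
```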
Kernel Synchronization
[Figure: Processor A and Processor B each run the sequence below against one spinlock that guards the DPC queue]
do acquire_spinlock(DPC) until (SUCCESS)
begin   /* critical section */
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Queued Spinlocks
Problem: Checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
- Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the 1st processor's flag is modified
- Exactly one processor is being signaled
- Pre-determined wait order
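A well-known lock with these properties is the MCS queue lock, sketched below with C11 atomics. It is offered as an illustration of the queued-spinlock idea, not as the Windows implementation; all names are hypothetical, and the sched_yield() calls are a user-space concession that a kernel spinlock would not make.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;         /* each waiter spins only on its own flag */
} mcs_node;

typedef struct {
    _Atomic(mcs_node *) tail;   /* last waiter in the queue, NULL if free */
} mcs_lock;

void mcs_acquire(mcs_lock *l, mcs_node *me) {
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    mcs_node *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);                /* link behind predecessor */
        while (atomic_load(&me->locked))
            sched_yield();                            /* wait on the local flag */
    }
}

void mcs_release(mcs_lock *l, mcs_node *me) {
    mcs_node *succ = atomic_load(&me->next);
    if (succ == NULL) {
        mcs_node *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;                                   /* nobody was waiting */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();                            /* successor is still linking in */
    }
    atomic_store(&succ->locked, false);               /* signal exactly one waiter */
}

static mcs_lock qlock;
static long counter;

static void *worker(void *arg) {
    (void)arg;
    mcs_node me;                /* per-thread queue node */
    for (int i = 0; i < 2000; i++) {
        mcs_acquire(&qlock, &me);
        counter += 1;
        mcs_release(&qlock, &me);
    }
    return NULL;
}

/* Three threads contend for the lock; returns the final counter. */
long run_mcs_demo(void) {
    pthread_t t[3];
    counter = 0;
    for (int i = 0; i < 3; i++)
        pthread_create(&t[i], NULL, worker, NULL);
    for (int i = 0; i < 3; i++)
        pthread_join(t[i], NULL);
    return counter;
}
```

Waiters are served in the order in which they executed the atomic exchange on tail, which corresponds to the pre-determined wait order mentioned above.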
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU. Allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks: PFN or Page Frame Database lock
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks). Doesn't use spinlocks when no contention. Used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once. "All": all objects must be in the signalled state concurrently to resolve the wait
- All wait calls include optional timeout argument
- Waiting threads consume no CPU time
Waiting
Waitable objects include:
- Events (may be auto-reset or manual reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signalled upon exit or terminate)
- Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object, and only one thread is released (e.g., it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Figure: interaction with thread scheduling. Thread states shown: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Create and initialize thread object -> Initialized; thread waits on an object handle -> Waiting; object is set to signaled state -> wait is complete]
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread - it may preempt the running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object?
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: Owning thread releases mutex
- signaled -> nonsignaled: Resumed thread acquires mutex
- Effect on waiting threads: Kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: Owning thread or other thread releases mutex
- signaled -> nonsignaled: Resumed thread acquires mutex
- Effect on waiting threads: Kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: One thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: A thread acquires the semaphore; more resources are not available
- Effect on waiting threads: Kernel resumes one or more waiting threads
What signals an object? (contd.)
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Event
- nonsignaled -> signaled: A thread sets the event
- signaled -> nonsignaled: Kernel resumes one or more threads
- Effect on waiting threads: Kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: Dedicated thread sets one event in the event pair
- signaled -> nonsignaled: Kernel resumes the other dedicated thread
- Effect on waiting threads: Kernel resumes waiting dedicated thread
Timer
- nonsignaled -> signaled: Timer expires
- signaled -> nonsignaled: A thread (re)initializes the timer
- Effect on waiting threads: Kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: Thread terminates
- signaled -> nonsignaled: A thread reinitializes the thread object
- Effect on waiting threads: Kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: DPC queue - queue head followed by a chain of DPC objects]
Delivering a DPC
[Figure: interrupt dispatch table with IRQLs from High and Power failure down through Dispatch/DPC, APC and Low; a DPC queue holding DPC objects; the dispatcher]
1. Timer expires; kernel queues DPC that will release all waiting threads. Kernel requests SW int.
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserApc
- Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel mode (K) and user mode (U), each a list of APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 & not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
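The four rules above amount to a small decision function. The sketch below encodes them as stated on the slide; the enum, the function name charge_tick and the was_user_mode parameter (used to pick between user and kernel time below IRQL 2) are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* Decides where one clock tick is charged, given the IRQL that was
   interrupted, whether a DPC was being processed, and whether the
   interrupted code was running in user mode. */
charge_target charge_tick(int irql, bool in_dpc, bool was_user_mode) {
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: the thread itself is charged */
    return was_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```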
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals NOT accounted
- E.g., one or more threads run and enter a wait state before clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object - common header with Size, Type, State and Wait list head, followed by object-type-specific data (see ntddk\inc\ddk\ntddk.h)]
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signalled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (header with Size, Type, State and Wait list head, plus object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type and Next link]
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however, the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec )
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec )
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection ( &crit );
/* ... main loop in any of the threads */
while (!done) {
    __try {
        EnterCriticalSection ( &crit );
        counter += local_value;
    }
    /* leave exactly once per Enter, even if the body raises an exception */
    __finally { LeaveCriticalSection ( &crit ); }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads:
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout )
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
- dwTimeout == 0 - no wait, check whether object is signaled
- dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles - pointer to array identifying these objects
- bWaitAll - whether to wait for first signaled object or all objects. Function returns index of first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in the same process (by default)
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
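The blocking/polling distinction can be sketched with a small POSIX example. This is only an illustration: the names counter_lock, counter_add, counter_try_add and counter_get are made up for this sketch and do not come from the slides.

```c
#include <errno.h>
#include <pthread.h>

/* Shared counter protected by a statically initialized POSIX mutex. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

/* Blocking update: comparable to WaitForSingleObject(hMutex, INFINITE). */
void counter_add(int value)
{
    pthread_mutex_lock(&counter_lock);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling update: comparable to WaitForSingleObject(hMutex, 0).
 * Returns 0 on success, or the error code (EBUSY when the mutex is held). */
int counter_try_add(int value)
{
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc != 0)
        return rc;              /* mutex not acquired: leave counter alone */
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return 0;
}

/* Read the counter under the lock. */
int counter_get(void)
{
    pthread_mutex_lock(&counter_lock);
    int value = counter;
    pthread_mutex_unlock(&counter_lock);
    return value;
}
```

A caller that must not block (for example a UI loop) would use counter_try_add() and retry later when it sees EBUSY.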
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases the semaphore count by 1
– ReleaseSemaphore() may increment the count by any positive value, as long as the maximum count is not exceeded
– ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– A manual-reset event can signal several threads simultaneously; it must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– An auto-reset event signals a single thread; the event is reset automatically
– fInitialState == TRUE: create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
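Because there is no exact equivalent, a manual-reset event is usually approximated with a mutex, a condition variable and a flag. The sketch below is one possible emulation, not a pthreads facility; the manual_event type and the event_* function names are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical emulation of a manual-reset event. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;   /* stays true until event_reset() */
} manual_event;

void event_init(manual_event *e, bool initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Roughly SetEvent(): wake every waiter and stay signaled. */
void event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Roughly ResetEvent(). */
void event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Roughly WaitForSingleObject(hEvent, INFINITE): the loop guards
 * against spurious wakeups, and returns at once if already signaled. */
void event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

bool event_is_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    bool s = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return s;
}

void event_destroy(manual_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what makes the state persistent: a condition variable alone forgets a signal that arrives when nobody is waiting.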
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2; prog1 writes into a pipe that prog2 reads]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using instances of the same named pipe
– Server can respond to a client using the same instance
Pipe can be accessed over the network
– Location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd.)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if a client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic (for writes of at most PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
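The fixed-size-record advice can be sketched with mkfifo(). This is a minimal single-process illustration, assuming a POSIX system; RECORD_SIZE, fifo_roundtrip and the FIFO path passed in are hypothetical, and a real client and server would be separate processes, each opening its own end.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_SIZE 64   /* hypothetical fixed record size, well below PIPE_BUF */

/* Creates a FIFO at `path`, sends one fixed-size record through it and
 * reads it back into `out`. Returns 0 on success, -1 on any failure. */
int fifo_roundtrip(const char *path, const char *text, char out[RECORD_SIZE])
{
    char record[RECORD_SIZE] = {0};
    int rfd, wfd, rc = -1;

    strncpy(record, text, RECORD_SIZE - 1);   /* pad the record with zeros */
    if (mkfifo(path, 0600) != 0)
        return -1;

    /* A non-blocking read open returns at once; the write open then
     * succeeds immediately because a reader already exists. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd < 0) {
        unlink(path);
        return -1;
    }
    wfd = open(path, O_WRONLY);
    if (wfd >= 0) {
        /* One whole record per write and per read keeps message framing. */
        if (write(wfd, record, RECORD_SIZE) == RECORD_SIZE &&
            read(rfd, out, RECORD_SIZE) == RECORD_SIZE)
            rc = 0;
        close(wfd);
    }
    close(rfd);
    unlink(path);
    return rc;
}
```

Because the FIFO itself has no message boundaries, both sides agree on RECORD_SIZE; that agreement is what a message-mode named pipe provides for free on Windows.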
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain
Locate a server via mailslot
[Figure: the application clients (App client 0 ... App client n) act as mailslot servers; the application server acts as the mailslot client and sends its status message periodically]
Mailslot servers (App client 0 ... App client n):
hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile( hMS, &ServStat ); /* connect to server */
Mailslot client (App server):
while (...) { Sleep(...); hMS = CreateFile( "\\*\mailslot\status" ); ... WriteFile( hMS, &StatInfo ... ); }
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
Two Types of Semaphores
Counting semaphore
– integer value can range over an unrestricted domain
Binary semaphore
– integer value can range only between 0 and 1
– can be simpler to implement
A counting semaphore S can be implemented using binary semaphores
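One textbook way to build a counting semaphore from binary ones uses two binary semaphores and an integer. The sketch below assumes POSIX unnamed semaphores (sem_t) used strictly with values 0 and 1; the counting_sem type and the csem_* names are made up for illustration.

```c
#include <semaphore.h>

/* Hypothetical counting semaphore built from two binary semaphores. */
typedef struct {
    sem_t s1;     /* binary semaphore guarding `count`, starts at 1   */
    sem_t s2;     /* binary semaphore that waiters block on, starts 0 */
    int   count;  /* semaphore value; a negative value counts waiters */
} counting_sem;

int csem_init(counting_sem *cs, int initial)
{
    cs->count = initial;
    if (sem_init(&cs->s1, 0, 1) != 0)
        return -1;
    if (sem_init(&cs->s2, 0, 0) != 0) {
        sem_destroy(&cs->s1);
        return -1;
    }
    return 0;
}

/* wait(S): take a unit, or block until csem_signal() hands one over. */
void csem_wait(counting_sem *cs)
{
    sem_wait(&cs->s1);
    cs->count--;
    if (cs->count < 0) {
        sem_post(&cs->s1);   /* let other threads in while we block      */
        sem_wait(&cs->s2);   /* woken by csem_signal, which still holds s1 */
    }
    sem_post(&cs->s1);       /* release s1 (our own or the handed-over one) */
}

/* signal(S): return a unit; wake one waiter if there is any. */
void csem_signal(counting_sem *cs)
{
    sem_wait(&cs->s1);
    cs->count++;
    if (cs->count <= 0)
        sem_post(&cs->s2);   /* hand s1 over to the woken waiter */
    else
        sem_post(&cs->s1);
}
```

The hand-over in csem_signal() keeps s2 from ever exceeding 1: a second signaler cannot get past s1 until the woken waiter has consumed s2 and released s1.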
Deadlock and Starvation
Deadlock
– two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
T0              T1
wait(S);        wait(Q);
wait(Q);        wait(S);
signal(S);      signal(Q);
signal(Q);      signal(S);
Deadlock and Starvation
Starvation – indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution (to the deadlock case)
– all code should acquire/release semaphores in the same order
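The same-order rule can be sketched with two locks that are always taken in one global order, here by address. The example uses POSIX mutexes in place of binary semaphores; the account type and the transfer, lock_pair and unlock_pair names are hypothetical.

```c
#include <pthread.h>
#include <stdint.h>

/* Hypothetical resource guarded by its own lock. */
typedef struct {
    pthread_mutex_t lock;
    long            balance;
} account;

/* Always lock the lower address first, so two threads that need the
 * same pair can never hold one lock each while waiting for the other. */
static void lock_pair(account *a, account *b)
{
    if ((uintptr_t)a < (uintptr_t)b) {
        pthread_mutex_lock(&a->lock);
        pthread_mutex_lock(&b->lock);
    } else {
        pthread_mutex_lock(&b->lock);
        pthread_mutex_lock(&a->lock);
    }
}

static void unlock_pair(account *a, account *b)
{
    pthread_mutex_unlock(&a->lock);
    pthread_mutex_unlock(&b->lock);
}

/* Moves `amount` between two distinct accounts.
 * Returns 0 on success, -1 if the accounts are identical or funds are short. */
int transfer(account *from, account *to, long amount)
{
    int ok;

    if (from == to)
        return -1;
    lock_pair(from, to);
    ok = from->balance >= amount;
    if (ok) {
        from->balance -= amount;
        to->balance   += amount;
    }
    unlock_pair(from, to);
    return ok ? 0 : -1;
}
```

With T0 calling transfer(&a, &b, ...) and T1 calling transfer(&b, &a, ...), both threads lock the same account first, which removes the circular wait shown in the S/Q example.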
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events. An event acts much like a condition variable
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons:
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc., like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
– Interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
Trap dispatching
[Figure: trap handlers. Interrupts go to the interrupt dispatcher, which calls interrupt service routines; system service calls go to the system service dispatcher, which calls system services; hardware and software exceptions go to the exception dispatcher, which calls exception handlers; virtual address exceptions go to the virtual memory manager's pager]
Trap: processor's mechanism to capture an executing thread
– Switch from user to kernel mode
– Interrupts: asynchronous
– Exceptions: synchronous
Interrupt Dispatching
Interrupt dispatch routine (entered on an interrupt from user- or kernel-mode code; runs in kernel mode)
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
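Interrupt Precedence via IRQLs (x86)
IRQL  Level                         Class
31    High                          Hardware interrupts
30    Power fail                    Hardware interrupts
29    Interprocessor interrupt      Hardware interrupts
28    Clock                         Hardware interrupts
      Profile & Synch (Srv 2003)    Hardware interrupts
      Device 1 ...                  Hardware interrupts
2     Dispatch/DPC                  Deferrable software interrupts
1     APC                           Deferrable software interrupts
0     Passive/Low                   Normal thread execution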
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library – Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected to the same IRQL
Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document, but has never been used
Inter-processor interrupt
– Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update the system's clock and to allocate CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler – Kernprof.exe, Res Kit)
Predefined IRQLs (contd.)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all, normal thread execution
IRQLs on 64-bit Systems
IRQL  x64                              IA64
15    High/Profile                     High/Profile/Power
14    Interprocessor interrupt/Power   Interprocessor interrupt
13    Clock                            Clock
12    Synch (Srv 2003)                 Synch (MP only)
      Device n                         Device n
4     ...                              Device 1
3     Device 1                         Correctable Machine Check
2     Dispatch/DPC                     Dispatch/DPC & Synch (UP only)
1     APC                              APC
0     Passive/Low                      Passive/Low
Interrupt Prioritization amp Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device
  Can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the I/O APIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source
  Processors are assigned round robin for each interrupt vector
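The two deterministic mapping rules can be written down as plain arithmetic. The sketch below assumes the x64/IA64 rule reads as "IDT vector number divided by 16" (integer division); the function names are made up for illustration and are not kernel APIs.

```c
/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_x86_up(int irq)
{
    return 27 - irq;
}

/* x64 / IA64 rule, assumed to be integer division of the vector by 16. */
int irql_x64_ia64(int idt_vector)
{
    return idt_vector / 16;
}
```

Under these rules, IRQ 0 on an x86 UP system lands at IRQL 27, and every block of 16 consecutive IDT vectors on x64/IA64 shares one IRQL.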
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when the kernel is deep within many layers of code
– Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
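The "single bit plus atomic test-and-modify" idea can be sketched in user-mode C11. This is not the kernel's KSPIN_LOCK implementation, only an illustration of the principle; the spinlock type and the spin_* names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One bit of state is enough: set = owned, clear = free. */
typedef struct {
    atomic_flag held;
} spinlock;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Atomic test-and-set returns the previous value of the flag:
 * false means the lock was free and is now ours. */
bool spin_try_acquire(spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the lock is acquired. */
void spin_acquire(spinlock *l)
{
    while (!spin_try_acquire(l)) {
        /* spin: every retry is another atomic access to the shared flag */
    }
}

/* Clearing the flag makes the lock free again. */
void spin_release(spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Note that every waiting CPU keeps hitting the same memory cell in spin_acquire(), which is the cost that motivates other spinlock designs.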
Kernel Synchronization
[Figure: Processor A and Processor B both execute the sequence below against one spinlock that protects the shared DPC queue]
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin    /* critical section */
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
– Thus, the busy-wait loop requires no access to the memory bus
When releasing the lock, the flag of the first waiting processor is modified
– Exactly one processor is being signaled
– Pre-determined wait order
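A well-known algorithm with these properties is the MCS queue lock. The sketch below is a user-mode C11 rendering of that algorithm, offered as an illustration of the idea rather than as the Windows kernel's implementation; the qnode and qlock types and the qlock_* names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiting processor; each waiter spins on its own `locked`. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool             locked;
} qnode;

/* The lock itself is only a pointer to the tail of the waiter queue. */
typedef struct {
    _Atomic(qnode *) tail;
} qlock;

void qlock_acquire(qlock *l, qnode *me)
{
    atomic_store(&me->next, (qnode *)NULL);
    atomic_store(&me->locked, true);

    /* Append ourselves to the queue in one atomic step. */
    qnode *prev = atomic_exchange(&l->tail, me);
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->locked)) {
            /* spin on our own flag only, not on the shared lock word */
        }
    }
    /* prev == NULL: the queue was empty, so we own the lock at once. */
}

void qlock_release(qlock *l, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        /* No known successor: try to mark the lock as free. */
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, (qnode *)NULL))
            return;
        /* A newcomer swapped the tail but has not linked itself yet. */
        while ((succ = atomic_load(&me->next)) == NULL) {
            /* wait for the link to appear */
        }
    }
    atomic_store(&succ->locked, false);   /* signal exactly one waiter */
}
```

Each caller supplies its own qnode (for example one per CPU or per thread), which is what gives every waiter a private flag to spin on and fixes the hand-over order as FIFO.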
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU
  Allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated
  Charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks)
  Doesn't use spinlocks when no contention
  Used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once
  "All": all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include optional timeout argument
– Waiting threads consume no CPU time
Waiting
Waitable objects include
– Events (may be auto-reset or manual-reset, may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
[Figure: interaction with thread scheduling. Thread states: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Creating and initializing a thread object leads to Initialized; a thread that waits on an object handle enters Waiting; when the object is set to the signaled state, the wait is complete and the thread leaves Waiting]
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread: it may preempt the running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object?
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled → signaled: owning thread releases mutex
– signaled → nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled → signaled: owning thread or other thread releases mutex
– signaled → nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Semaphore
– nonsignaled → signaled: one thread releases the semaphore, freeing a resource
– signaled → nonsignaled: a thread acquires the semaphore; more resources are not available
– Effect: kernel resumes one or more waiting threads
What signals an object? (contd.)
Event
– nonsignaled → signaled: a thread sets the event
– signaled → nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled → signaled: dedicated thread sets one event in the event pair
– signaled → nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes waiting dedicated thread
Timer
– nonsignaled → signaled: timer expires
– signaled → nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– nonsignaled → signaled: thread terminates
– signaled → nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) is completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
  See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: DPC queue. Queue head → DPC object → DPC object → DPC object]
Delivering a DPC
[Figure: interrupt dispatch table with IRQLs from High and Power failure down to Dispatch/DPC, APC and Low; a DPC queue holding DPC objects; and the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode, at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable, but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel (K) and user (U), each holding APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
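The accounting rules above amount to a small decision function. The sketch below is an illustration only: the charge_target enum, the clock_tick_charge function and its parameters are hypothetical and do not mirror the actual clock ISR.

```c
/* Who gets charged for the clock tick that just fired (hypothetical names). */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* irql:      IRQL the processor was running at when the clock fired
 * in_dpc:    nonzero if a DPC routine was executing
 * user_mode: nonzero if the interrupted thread was in user mode */
charge_target clock_tick_charge(int irql, int in_dpc, int user_mode)
{
    if (irql > 2)
        return CHARGE_INTERRUPT;                 /* servicing a device interrupt */
    if (irql == 2)
        return in_dpc ? CHARGE_DPC               /* DPC time */
                      : CHARGE_THREAD_KERNEL;    /* dispatch level, no DPC */
    return user_mode ? CHARGE_THREAD_USER        /* IRQL < 2 */
                     : CHARGE_THREAD_KERNEL;
}
```

Only the state sampled at the tick matters, so work that starts and finishes between two ticks is never seen by this function.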
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g. one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout (see \ntddk\inc\ddk\ntddk.h). A common header holds Size, Type, State and the wait list head, followed by object-type-specific data]
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: each thread object points to its WaitBlockList; each wait block holds Key, Type, Next link, List entry, Object and Thread fields; each dispatcher object (Size, Type, State, wait list head, object-type-specific data) links the wait blocks of the threads waiting on it]
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once per Enter, even if the guarded code raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
– hObject specifies the kernel object
– dwTimeout specifies the wait time in msec
– dwTimeout == 0: no wait, only check whether the object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to an array identifying these objects
– bWaitAll: whether to wait for the first signaled object or for all objects
– When waiting for the first signaled object, the return value is WAIT_OBJECT_0 plus the index of that object
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to the non-signaled state after completing wait functions
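The any/all distinction selected by bWaitAll can be illustrated with a small model. This is only a sketch, not the Win32 implementation: model_wait_multiple and MODEL_WAIT_TIMEOUT are hypothetical names, the array of flags stands in for the signaled state of each handle, and only a single poll (the dwTimeout == 0 case) is modeled.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_WAIT_TIMEOUT (-1)  /* hypothetical: the wait was not satisfied */

/* Models one non-blocking poll over n objects.
 * signaled[i] != 0 means object i is currently signaled.
 * wait_all == 0: return the index of the first signaled object.
 * wait_all != 0: return 0 only if every object is signaled.
 * In all other cases return MODEL_WAIT_TIMEOUT. */
static int model_wait_multiple(const int *signaled, size_t n, int wait_all)
{
    assert(signaled != NULL && n > 0);
    if (wait_all) {
        for (size_t i = 0; i < n; i++)
            if (!signaled[i])
                return MODEL_WAIT_TIMEOUT;
        return 0;
    }
    for (size_t i = 0; i < n; i++)
        if (signaled[i])
            return (int)i;
    return MODEL_WAIT_TIMEOUT;
}
```

With the real API the index is recovered by subtracting WAIT_OBJECT_0 from the return value; the model returns the index directly to keep the sketch short.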
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned (or the wait failed) */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in the same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
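The POSIX side of this comparison can be sketched as follows. The names worker, run_counter_demo and poll_lock_once are made up for this illustration; the shared counter mirrors the one in the Windows mutex example, and the mutex is assumed to be a default (non-recursive) one.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;                      /* shared by all threads */

/* Each worker adds 1 to the shared counter `iterations` times. */
static void *worker(void *arg)
{
    int iterations = *(int *)arg;
    for (int i = 0; i < iterations; i++) {
        pthread_mutex_lock(&counter_lock);   /* blocks until the mutex is free */
        counter += 1;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs two workers to completion and returns the final counter value. */
static int run_counter_demo(int iterations)
{
    pthread_t t[2];
    counter = 0;
    for (int i = 0; i < 2; i++) {
        int rc = pthread_create(&t[i], NULL, worker, &iterations);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < 2; i++)
        pthread_join(t[i], NULL);
    return counter;
}

/* Polls the mutex once without blocking.
 * Returns 1 if it was free (it is acquired and released again), 0 if busy. */
static int poll_lock_once(pthread_mutex_t *m)
{
    int rc = pthread_mutex_trylock(m);
    if (rc == EBUSY)
        return 0;
    assert(rc == 0);
    pthread_mutex_unlock(m);
    return 1;
}
```

poll_lock_once plays the role of a WaitForSingleObject() call with a zero timeout: it reports whether the mutex could be taken instead of waiting for it.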
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases the semaphore count by 1
– ReleaseSemaphore() may increment the count by any value
– ReleaseSemaphore() reports the old semaphore count (through lpPreviousCount)
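The counting rules above can be captured in a few lines. The following is a single-threaded model of the count only, not a kernel object and not the Win32 API; all model_sem_* names are hypothetical. It assumes, as the Windows documentation describes for ReleaseSemaphore(), that a release which would push the count past the maximum fails and leaves the count unchanged.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a counting semaphore's state. */
typedef struct {
    long count;   /* current count; the semaphore is "signaled" while count > 0 */
    long max;     /* upper bound fixed at creation */
} model_semaphore;

static int model_sem_is_signaled(const model_semaphore *s)
{
    return s->count > 0;
}

/* Non-blocking wait: takes one unit and returns 1 if signaled, else returns 0
 * (a real wait function would block the thread instead). */
static int model_sem_try_wait(model_semaphore *s)
{
    if (s->count <= 0)
        return 0;
    s->count -= 1;
    return 1;
}

/* Releases n units. On success returns 1 and stores the previous count.
 * Returns 0 and leaves the count unchanged if the maximum would be exceeded. */
static int model_sem_release(model_semaphore *s, long n, long *previous)
{
    assert(n > 0);
    if (s->count + n > s->max)
        return 0;
    if (previous != NULL)
        *previous = s->count;
    s->count += n;
    return 1;
}
```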
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– A manual-reset event can signal several threads simultaneously; it must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– An auto-reset event signals a single thread; the event is reset automatically
– fInitialState == TRUE: create the event in the signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
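Because there is no exact equivalent, a manual-reset event is usually emulated with a flag guarded by a mutex and a condition variable. The sketch below shows one common way to do that; the manual_event type and its functions are hypothetical names, not part of pthreads, and PulseEvent() is deliberately not modeled.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical emulation of a manual-reset event. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             signaled;   /* stays 1 until explicitly reset */
} manual_event;

static void manual_event_init(manual_event *e, int initially_signaled)
{
    int rc = pthread_mutex_init(&e->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&e->cond, NULL);
    assert(rc == 0);
    (void)rc;
    e->signaled = initially_signaled;
}

/* Signals the event and wakes every thread currently waiting on it. */
static void manual_event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 1;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

static void manual_event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 0;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; the state is left signaled afterwards. */
static void manual_event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                 /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

static int manual_event_is_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    int s = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return s;
}

static void manual_event_destroy(manual_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

An auto-reset event could be emulated the same way by clearing the flag inside manual_event_wait and using pthread_cond_signal() instead of the broadcast.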
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on a pipe handle will block if the pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
IO Redirection using an Anonymous Pipe
/* Create default size anonymous pipe; handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server over the same pipe name, each through its own instance
– Server can respond to a client using the same instance
Pipe can be accessed over the network
– Location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (the "." stands for the local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence into one call
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence into one call
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
– Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
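The fixed-size-record advice can be illustrated with a short POSIX sketch. fifo_round_trip and RECORD_LEN are hypothetical names chosen for this example, and both ends of the FIFO are opened in the same process only to keep the sketch short; in a real client/server pair each side would open one end.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_LEN 32   /* hypothetical fixed record size, well below PIPE_BUF */

/* Creates a FIFO at `path`, sends one fixed-size record through it and
 * copies the record that is read back into `out` (RECORD_LEN bytes).
 * Returns 0 on success and -1 on any failure. The FIFO is removed again. */
static int fifo_round_trip(const char *path, const char *text, char *out)
{
    char record[RECORD_LEN] = {0};
    int rfd = -1, wfd = -1, result = -1;

    assert(path != NULL && text != NULL && out != NULL);
    if (mkfifo(path, 0600) != 0)
        return -1;

    /* Open the read end first and non-blocking, so open() does not wait
     * for a writer; the write end can then be opened without blocking. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd < 0)
        goto done;
    wfd = open(path, O_WRONLY);
    if (wfd < 0)
        goto done;

    /* Pad the text into a fixed-size, NUL-terminated record. */
    strncpy(record, text, RECORD_LEN - 1);
    if (write(wfd, record, RECORD_LEN) != RECORD_LEN)
        goto done;
    if (read(rfd, out, RECORD_LEN) != RECORD_LEN)
        goto done;
    result = 0;

done:
    if (wfd >= 0)
        close(wfd);
    if (rfd >= 0)
        close(rfd);
    unlink(path);
    return result;
}
```

Because every record has the same length, the reader always knows how many bytes to ask for, which is what a message-mode named pipe provides automatically on Windows.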
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 byte)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain
107107
Locate a server via mailslot
[Figure: the application clients (App client 0 ... App client n) act as mailslot servers; each one runs
    hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile( hMS, &ServStat ); /* connect to server */
The application server acts as the mailslot client and sends its message periodically:
    While () { Sleep(); hMS = CreateFile( "\\*\mailslot\status" ); WriteFile( hMS, &StatInfo ); }]
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for a local mailslot
– \\host\mailslot\[path]name: retrieve handle for a mailslot on the specified host
– \\domain\mailslot\[path]name: returns a handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns a handle representing mailslots on machines in the system's primary domain; max. message length 400 bytes
– Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
2727
Deadlock and Starvation
Deadlock
– two or more threads are waiting indefinitely for an event that can be caused by only one of the waiting threads
Let S and Q be two semaphores initialized to 1
    T0              T1
    wait(S);        wait(Q);
    wait(Q);        wait(S);
    signal(S);      signal(Q);
    signal(Q);      signal(S);
2828
Deadlock and Starvation
Starvation: indefinite blocking
– A thread may never be removed from the semaphore queue in which it is suspended
Solution
– all code should acquire/release semaphores in the same order
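The same-order rule can be sketched with two pthread mutexes standing in for the binary semaphores S and Q. The helper names lock_both, unlock_both and run_ordered_demo are hypothetical, and ordering by address is just one possible way to fix a global order; any order works as long as every thread uses the same one.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

#define ROUNDS 50000

static pthread_mutex_t S = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t Q = PTHREAD_MUTEX_INITIALIZER;
static long shared_total = 0;   /* protected by holding both S and Q */

/* Acquires two distinct locks in one global order (lower address first),
 * no matter in which order the caller names them. */
static void lock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    assert(a != b);
    if ((uintptr_t)a > (uintptr_t)b) {
        pthread_mutex_t *tmp = a;
        a = b;
        b = tmp;
    }
    pthread_mutex_lock(a);
    pthread_mutex_lock(b);
}

static void unlock_both(pthread_mutex_t *a, pthread_mutex_t *b)
{
    pthread_mutex_unlock(a);
    pthread_mutex_unlock(b);
}

/* T0 names the locks as (S, Q). */
static void *t0(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        lock_both(&S, &Q);
        shared_total++;
        unlock_both(&S, &Q);
    }
    return NULL;
}

/* T1 names them as (Q, S); without lock_both this could deadlock with T0. */
static void *t1(void *arg)
{
    (void)arg;
    for (int i = 0; i < ROUNDS; i++) {
        lock_both(&Q, &S);
        shared_total++;
        unlock_both(&Q, &S);
    }
    return NULL;
}

/* Runs both threads to completion and returns the shared total. */
static long run_ordered_demo(void)
{
    pthread_t a, b;
    shared_total = 0;
    int rc = pthread_create(&a, NULL, t0, NULL);
    assert(rc == 0);
    rc = pthread_create(&b, NULL, t1, NULL);
    assert(rc == 0);
    (void)rc;
    pthread_join(a, NULL);
    pthread_join(b, NULL);
    return shared_total;
}
```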
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events; an event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads) and multiprocessing
3131
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and interrupt dispatching
IRQL levels & interrupt precedence
Spinlocks and kernel synchronization
Executive synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc., like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
– Interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
3737
Trap dispatching
Trap: processor's mechanism to capture an executing thread
– Switch from user to kernel mode
– Interrupts: asynchronous
– Exceptions: synchronous
[Figure: each kind of trap is routed to its handler: an interrupt goes to the interrupt dispatcher and on to the interrupt service routines; a system service call goes to the system service dispatcher and on to the system services; HW and SW exceptions go to the exception dispatcher and on to the exception handlers; virtual address exceptions go to the virtual memory manager's pager]
Interrupt Dispatching
[Figure: an interrupt arrives while user or kernel mode code is running; control passes to the interrupt dispatch routine and from there to the interrupt service routine, both running as kernel mode code]
Interrupt dispatch routine
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch
3939
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises the processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
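The masking rule can be written down as a tiny model. This is only an illustration of the rule stated above, not kernel code; the function names and the MODEL_* constants are hypothetical, with the level values 0, 1 and 2 taken from the x86 numbering used in this unit.

```c
#include <assert.h>

/* Hypothetical names for the three lowest x86 IRQLs. */
enum { MODEL_PASSIVE_LEVEL = 0, MODEL_APC_LEVEL = 1, MODEL_DISPATCH_LEVEL = 2 };

/* An interrupt is delivered only if its IRQL is strictly higher than the
 * processor's current IRQL; equal and lower levels stay masked. */
static int interrupt_is_delivered(int current_irql, int interrupt_irql)
{
    assert(current_irql >= 0 && interrupt_irql >= 0);
    return interrupt_irql > current_irql;
}

/* Servicing a delivered interrupt raises the processor to that interrupt's
 * IRQL; a masked interrupt leaves the level unchanged. */
static int irql_after_interrupt(int current_irql, int interrupt_irql)
{
    return interrupt_is_delivered(current_irql, interrupt_irql)
               ? interrupt_irql
               : current_irql;
}

/* Waits and page faults are only allowed below DISPATCH_LEVEL. */
static int may_wait_or_page_fault(int current_irql)
{
    return current_irql < MODEL_DISPATCH_LEVEL;
}
```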
4040
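Interrupt Precedence via IRQLs (x86)
IRQL   Level                          Class
31     High                           Hardware interrupts
30     Power fail                     Hardware interrupts
29     Interprocessor Interrupt       Hardware interrupts
28     Clock                          Hardware interrupts
27     Profile & Synch (Srv 2003)     Hardware interrupts
3      Device 1 (further devices above)   Hardware interrupts
2      Dispatch/DPC                   Deferrable software interrupts
1      APC                            Deferrable software interrupts
0      Passive/Low                    Normal thread execution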
4141
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library, the Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of an ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected with one IRQL
4343
Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document but has never been used
Inter-processor interrupt
– Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update the system's clock and to allocate CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler, Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all; normal thread execution
4545
IRQLs on 64-bit Systems
IRQL   x64                              IA64
15     High/Profile                     High/Profile/Power
14     Interprocessor Interrupt/Power   Interprocessor Interrupt
13     Clock                            Clock
12     Synch (Srv 2003)                 Synch (MP only)
...    Device n                         Device n
4      (devices)                        Device 1
3      Device 1                         Correctable Machine Check
2      Dispatch/DPC                     Dispatch/DPC & Synch (UP only)
1      APC                              APC
0      Passive/Low                      Passive/Low
4646
Interrupt Prioritization amp Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; this can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
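The two arithmetic rules above are small enough to write out. The function names are hypothetical, and the input ranges (IRQ 0 to 15, vector 0 to 255) are assumptions of this sketch rather than something the slide states.

```c
#include <assert.h>

/* x86 uniprocessor rule: IRQL = 27 - IRQ (IRQ assumed to be 0..15). */
static int irql_x86_up_from_irq(int irq)
{
    assert(irq >= 0 && irq <= 15);
    return 27 - irq;
}

/* x64 and IA64 rule: IRQL = IDT vector number / 16 (integer division,
 * vector assumed to be 0..255, so the result is 0..15). */
static int irql_from_idt_vector(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

For instance, IRQ 0 maps to IRQL 27 on an x86 uniprocessor, and a vector in the range 0xD0 to 0xDF maps to IRQL 13 under the second rule.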
4747
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when the kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Sync on MP: use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– A spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
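A user-mode analogue of such a one-bit lock can be sketched with C11 atomics. This is not the kernel's KSPIN_LOCK implementation; model_spinlock and its functions are hypothetical names, and atomic_flag plays the role of the single bit that is tested and set atomically.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical one-bit spinlock: the flag is set while the lock is owned. */
typedef struct {
    atomic_flag held;
} model_spinlock;

static void model_spin_init(model_spinlock *l)
{
    atomic_flag_clear(&l->held);
}

/* One atomic test-and-set: returns true if the lock was free and is now ours. */
static bool model_spin_try_acquire(model_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Spins (busy-waits) until the test-and-set succeeds. */
static void model_spin_acquire(model_spinlock *l)
{
    while (!model_spin_try_acquire(l)) {
        /* keep retrying; every retry is another atomic access to the flag */
    }
}

static void model_spin_release(model_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

The retry loop in model_spin_acquire is exactly the repeated test-and-set traffic that queued spinlocks are designed to avoid.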
5050
Kernel Synchronization
[Figure: Processor A and Processor B both want to use the DPC queue, which is guarded by a spinlock; each processor runs the same critical section]
do acquire_spinlock(DPC) until (SUCCESS)
begin
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
The first processor acquires the lock; other processors wait on a processor-local flag
– Thus the busy-wait loop requires no access to the memory bus
When releasing the lock, the 1st processor's flag is modified
– Exactly one processor is being signaled
– Pre-determined wait order
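The idea of waiting on a local flag in a fixed order can be sketched with an MCS-style queue lock, a well-known technique that is swapped in here for illustration; it is not claimed to be how Windows implements its queued spinlocks. All mcs_* names are hypothetical, and each waiter supplies its own node.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One queue node per waiter; each waiter spins only on its own must_wait. */
typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool must_wait;
} mcs_node;

/* The lock itself is just a pointer to the last node in the queue. */
typedef struct {
    _Atomic(mcs_node *) tail;
} mcs_lock;

static void mcs_lock_init(mcs_lock *l)
{
    atomic_init(&l->tail, (mcs_node *)NULL);
}

static void mcs_acquire(mcs_lock *l, mcs_node *me)
{
    atomic_store(&me->next, (mcs_node *)NULL);
    atomic_store(&me->must_wait, true);
    /* Append our node; the previous tail (if any) is our predecessor. */
    mcs_node *prev = atomic_exchange(&l->tail, me);
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->must_wait)) {
            /* busy-wait on our own flag only */
        }
    }
}

static void mcs_release(mcs_lock *l, mcs_node *me)
{
    mcs_node *succ = atomic_load(&me->next);
    if (succ == NULL) {
        /* No visible successor: try to mark the lock as free. */
        mcs_node *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, (mcs_node *)NULL))
            return;
        /* A successor is enqueueing; wait until it has linked itself in. */
        while ((succ = atomic_load(&me->next)) == NULL) {
        }
    }
    /* Hand the lock to exactly one waiter, in queue order. */
    atomic_store(&succ->must_wait, false);
}
```

Releasing the lock touches only the successor's flag, which matches the "exactly one processor is being signaled" and "pre-determined wait order" properties listed above.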
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once; for "all", all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include an optional timeout argument
– Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
– Events (may be auto-reset or manual-reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
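Waiting on Dispatcher Objects – outside the kernel
Interaction with thread scheduling
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition and Terminated: a thread object is created and initialized; a running thread that waits on an object handle enters the Waiting state; when the object is set to the signaled state the wait is complete and the thread leaves the Waiting state]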
5858
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get a priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread: it may preempt the running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object?
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled -> signaled: owning thread releases mutex
– signaled -> nonsignaled: resumed thread acquires mutex
– effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled -> signaled: owning thread or other thread releases mutex
– signaled -> nonsignaled: resumed thread acquires mutex
– effect: kernel resumes one waiting thread
Semaphore
– nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
– signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
– effect: kernel resumes one or more waiting threads
6161
What signals an object? (contd)
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Event
– nonsignaled -> signaled: a thread sets the event
– signaled -> nonsignaled: kernel resumes one or more threads
– effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled -> signaled: dedicated thread sets one event in the event pair
– signaled -> nonsignaled: kernel resumes the other dedicated thread
– effect: kernel resumes waiting dedicated thread
Timer
– nonsignaled -> signaled: timer expires
– signaled -> nonsignaled: a thread (re)initializes the timer
– effect: kernel resumes all waiting threads
Thread
– nonsignaled -> signaled: thread terminates
– signaled -> nonsignaled: a thread reinitializes the thread object
– effect: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: DPC queue: a queue head followed by a chain of DPC objects]
6666
Delivering a DPC
1. Timer expires; kernel queues a DPC that will release all waiting threads; kernel requests a SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with the levels High, Power failure, Dispatch/DPC, APC and Low; DPC objects waiting in the DPC queue; the dispatcher]
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when the thread is in an alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel (K) and user (U), each holding a chain of APC objects]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 & not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
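The four accounting rules can be expressed as one decision function. This is a model of the rules as listed, not the clock ISR itself; the enum and function names are hypothetical, and the in_user_mode flag is an assumption used to pick between a thread's user and kernel time.

```c
#include <assert.h>

/* Hypothetical labels for where one clock tick gets charged. */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* Decides where the clock ISR charges a tick, given the IRQL that was
 * interrupted, the interrupted mode, and whether a DPC was executing. */
static charge_target classify_clock_tick(int irql, int in_user_mode,
                                         int processing_dpc)
{
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return processing_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: normal thread execution or APC level */
    return in_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```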
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (they are not really processes)
– Context switches for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g. one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add the Context Switch Delta column
7676
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Looking at Waiting Threads
7777
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects point through WaitBlockList to chains of wait blocks (List entry, Object, Thread, Key, Type, Next link); each wait block is also linked into the wait list head of a dispatcher object (Size, Type, State, Wait list head, object-type-specific data)]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    /* leave exactly once per enter, even if the body raises an exception */
    __finally { LeaveCriticalSection( &crit ); }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
  dwTimeout == 0: no wait, check whether the object is signaled
  dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to an array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects
  When waiting for any object, the return value (WAIT_OBJECT_0 + index) identifies the first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers are reset to the non-signaled state when the wait function completes
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex )
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
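The POSIX side of this comparison, as a minimal sketch that mirrors the shared-counter mutex example. The helper names (worker, run_workers, lock_is_free) and the MAX_WORKERS limit are made up for the illustration; only the pthread_* calls come from the POSIX API.

```c
#include <pthread.h>

#define MAX_WORKERS 16

static long counter = 0;                          /* shared by all threads */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long iterations;                           /* increments per worker */

static void *worker(void *arg)
{
    (void)arg;
    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&lock);      /* blocks, like WaitForSingleObject(hMutex, INFINITE) */
        counter++;
        pthread_mutex_unlock(&lock);    /* like ReleaseMutex(hMutex) */
    }
    return NULL;
}

/* Runs nthreads workers and returns the final counter, or -1 on error. */
long run_workers(int nthreads, long iters)
{
    pthread_t tid[MAX_WORKERS];
    if (nthreads < 1 || nthreads > MAX_WORKERS) return -1;
    counter = 0;
    iterations = iters;
    for (int i = 0; i < nthreads; i++)
        if (pthread_create(&tid[i], NULL, worker, NULL) != 0) return -1;
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return counter;
}

/* Non-blocking attempt: returns 1 if the lock was free (and releases it
 * again), 0 if another thread holds it. */
int lock_is_free(void)
{
    if (pthread_mutex_trylock(&lock) != 0) return 0;   /* like a wait with timeout == 0 */
    pthread_mutex_unlock(&lock);
    return 1;
}
```

With the lock in place, run_workers(4, 10000) always ends at 40000; without it, increments from different threads could be lost.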
8888
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value
- ReleaseSemaphore() returns the old semaphore count
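The counting rules can be modeled without any threads at all. The sketch below is a toy model of the bookkeeping only; toy_sem and its functions are hypothetical names, nothing blocks, and the real object lives in the kernel. It reflects that ReleaseSemaphore() fails and leaves the count unchanged when the increment would exceed the maximum.

```c
#include <stdbool.h>

/* Toy model of a counting semaphore's bookkeeping; no blocking is modeled. */
typedef struct {
    long count;   /* current count; the object is signaled while count > 0 */
    long max;     /* upper bound fixed at creation */
} toy_sem;

/* A satisfied wait decrements the count by one. Returns false when the
 * count is zero, where a real wait would block or time out. */
bool toy_sem_try_wait(toy_sem *s)
{
    if (s->count <= 0) return false;
    s->count--;
    return true;
}

/* Adds n to the count and reports the previous count through prev
 * (prev may be NULL). Fails and leaves the count untouched if the
 * result would exceed max. */
bool toy_sem_release(toy_sem *s, long n, long *prev)
{
    if (n <= 0 || s->count + n > s->max) return false;
    if (prev) *prev = s->count;
    s->count += n;
    return true;
}
```

Starting from a count of 2 with a maximum of 3, two waits succeed, a third would block, and a release of 2 reports a previous count of 0.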
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- A manual-reset event can signal several threads simultaneously; it must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- An auto-reset event signals a single thread; the event is reset automatically
- fInitialState == TRUE: create the event in the signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
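Although there is no exact equivalent, a manual-reset event can be approximated with a mutex, a condition variable and a flag. The following is a sketch under that assumption; manual_event and the event_* functions are illustrative names, not part of pthreads or of the Windows API.

```c
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;   /* stays true until explicitly reset */
} manual_event;

void event_init(manual_event *e, bool initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent() on a manual-reset event: wakes every waiter. */
void event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(): later waits block again. */
void event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; does not reset it. */
void event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                  /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

bool event_is_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    bool s = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return s;
}

void event_destroy(manual_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

One difference from the Windows object: a waiter that was woken by event_set() but has not yet reacquired the mutex when event_reset() runs will go back to waiting, so a set followed immediately by a reset may not release every waiter.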
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
- cbPipe: pipe size in bytes; zero == default
- Read on a pipe handle will block if the pipe is empty
- Write operation to a full pipe will block
- Anonymous pipes are one-way
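The same one-way, byte-stream behavior can be tried out with the POSIX pipe() call, used here as a stand-in because it runs without the Windows API. This is a sketch, not the CreatePipe() interface; pipe_roundtrip is a hypothetical helper that writes into one end and reads from the other within a single process.

```c
#include <string.h>
#include <unistd.h>

/* Sends msg through an anonymous pipe and reads it back into out.
 * Returns the number of bytes read, or -1 on error.
 * Assumes msg is small enough to fit in the pipe buffer, so the write
 * does not block even though nothing is reading yet. */
long pipe_roundtrip(const char *msg, char *out, size_t outsz)
{
    int fd[2];                        /* fd[0] = read end, fd[1] = write end */
    size_t len = strlen(msg);
    if (len == 0 || len > outsz || pipe(fd) != 0) return -1;

    ssize_t w = write(fd[1], msg, len);
    close(fd[1]);                     /* closing the write end lets readers see EOF */
    ssize_t r = (w == (ssize_t)len) ? read(fd[0], out, outsz) : -1;
    close(fd[0]);
    return (long)r;
}
```

Data only flows from the write end to the read end; two-way traffic needs a second pipe.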
9494
I/O Redirection Using an Anonymous Pipe
/* Create default-size anonymous pipe; handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same instance
- Server can respond to client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. means local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
- Performs a WriteFile / ReadFile sequence in one call
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
- Performs a CreateFile / WriteFile / ReadFile / CloseHandle sequence in one call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server.\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
107107
Locate a server via mailslot
[Figure: application clients 0 … n act as mailslot servers; the application server acts as the mailslot client and sends its status message periodically]
Mailslot servers (app client 0 … app client n):
hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile( hMS, &ServStat ); /* connect to server */
Mailslot client (app server):
while (…) { Sleep(…); hMS = CreateFile( "\\*\mailslot\status" ); … WriteFile( hMS, &StatInfo ); }
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for this many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieve handle for a local mailslot
- \\host\mailslot\[path]name: retrieve handle for a mailslot on the specified host
- \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
2828
Deadlock and Starvation
Starvation: indefinite blocking
- A thread may never be removed from the semaphore queue in which it is suspended
Solution
- All code should acquire/release semaphores in the same order
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events An event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads), and multiprocessing
3131
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
- Chapter 7 - Process Synchronization
- Chapter 8 - Deadlocks
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and interrupt dispatching
IRQL levels & interrupt precedence
Spinlocks and kernel synchronization
Executive synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
- Scheduled, preempted, etc., like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
- Interrupt dispatcher invokes the interrupt service routine
- ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to the interrupted thread
3737
Trap dispatching
Trap: processor's mechanism to capture an executing thread
- Switch from user to kernel mode
- Interrupts - asynchronous
- Exceptions - synchronous
[Figure: the trap handlers route each kind of trap: interrupt -> interrupt dispatcher -> interrupt service routines; system service call -> system service dispatcher -> system services; HW/SW exceptions -> exception dispatcher -> exception handlers; virtual address exceptions -> virtual memory manager's pager]
Interrupt Dispatching
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch
[Figure: an interrupt arrives while user- or kernel-mode code is running; the dispatch routine and the ISR execute in kernel mode]
3939
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises the processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
Interrupt Precedence via IRQLs (x86)
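The masking rule can be made concrete with a toy model of one processor. Everything here (toy_cpu, can_deliver, begin_service, end_service, may_wait_or_fault) is a hypothetical illustration of the rules on this slide, not kernel code; the value 2 for DISPATCH_LEVEL is the x86 value.

```c
#include <stdbool.h>

#define PASSIVE_LEVEL  0
#define DISPATCH_LEVEL 2

/* Toy per-processor state: only the current IRQL is modeled. */
typedef struct { int irql; } toy_cpu;

/* An interrupt is delivered only if its IRQL is above the current one;
 * equal and lower levels stay masked (pending). */
bool can_deliver(const toy_cpu *cpu, int interrupt_irql)
{
    return interrupt_irql > cpu->irql;
}

/* Servicing an interrupt raises the processor to that interrupt's IRQL.
 * Returns the previous IRQL so the caller can restore it afterwards,
 * or -1 if the interrupt is masked and nothing changed. */
int begin_service(toy_cpu *cpu, int interrupt_irql)
{
    if (!can_deliver(cpu, interrupt_irql)) return -1;
    int old = cpu->irql;
    cpu->irql = interrupt_irql;
    return old;
}

/* After the ISR returns, the IRQL drops back to the saved level. */
void end_service(toy_cpu *cpu, int old_irql) { cpu->irql = old_irql; }

/* Waits and page faults are only legal below DISPATCH_LEVEL. */
bool may_wait_or_fault(const toy_cpu *cpu) { return cpu->irql < DISPATCH_LEVEL; }
```

While a device interrupt at IRQL 5 is being serviced, another interrupt at IRQL 5 or 3 stays pending, but the clock interrupt at IRQL 28 still gets through.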
4040
[Figure: x86 IRQLs, highest to lowest]
Hardware interrupts: 31 High, 30 Power fail, 29 Interprocessor Interrupt, 28 Clock, then Profile & Synch (Srv 2003) and the device levels down to Device 1
Deferrable software interrupts: 2 Dispatch/DPC, 1 APC
Normal thread execution: 0 Passive/Low
Interrupt Precedence via IRQLs (x86)
4141
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts the processor (single line)
- Processor queries for the interrupt vector; uses the vector as index into the IDT
Alpha
- PAL code (Privileged Architecture Library, the Alpha BIOS) determines the interrupt vector and calls the kernel
- Kernel uses the vector to index the IDT
After ISR execution, IRQL is lowered to its initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls the ISR with the interrupt object as parameter (HW cannot pass parameters to an ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn ISR on/off
Interrupt objects can synchronize access to ISR data
- Multiple instances of an ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected with one IRQL
4343
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update the system's clock and to allocate CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler, Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that the kernel and device drivers generate
Passive
- No interrupt level at all; normal thread execution
4545
IRQLs on 64-bit Systems
[Figure: IRQLs 0 to 15 on 64-bit systems, listed lowest to highest]
x64: Passive/Low, APC, Dispatch/DPC, Device 1 … Device n, Synch (Srv 2003), Clock, Interprocessor Interrupt/Power, High/Profile
IA64: Passive/Low, APC, Dispatch/DPC & Synch (UP only), Correctable Machine Check, Device 1 … Device n, Synch (MP only), Clock, Interprocessor Interrupt, High/Profile/Power
4646
Interrupt Prioritization amp Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device
  Can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source
  Processors are assigned round robin for each interrupt vector
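The two deterministic IRQL mappings at the top of this slide amount to one line of arithmetic each. The sketch below assumes the x64/IA64 rule is the IDT vector number divided by 16 with integer division; the function names are made up for illustration.

```c
/* x86 uniprocessor rule: IRQL = 27 - IRQ. */
int irql_x86_up(int irq) { return 27 - irq; }

/* x64 / IA64 rule (assumed): IRQL = IDT vector number / 16, integer division. */
int irql_from_vector(int idt_vector) { return idt_vector / 16; }
```

So on an x86 uniprocessor IRQ 0 maps to IRQL 27 and IRQ 15 to IRQL 12, while under the assumed 64-bit rule vector 0xD1 falls in IRQL 13.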
4747
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when the kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- A spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
Synchronization on SMP Systems
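A single-bit spinlock built on an atomic test-and-set can be sketched in portable C11. This is not the kernel's KSPIN_LOCK implementation; toy_spinlock and the spin_* functions are illustrative names, and a user-mode busy-wait like this one is only a teaching aid.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One bit of state is enough: set = owned, clear = free. */
typedef struct { atomic_flag held; } toy_spinlock;

void spin_init(toy_spinlock *l) { atomic_flag_clear(&l->held); }

/* Single atomic test-and-set; returns true if the caller now owns the lock. */
bool spin_try_acquire(toy_spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-wait until the test-and-set finds the lock free. */
void spin_acquire(toy_spinlock *l)
{
    while (!spin_try_acquire(l))
        ;   /* spin */
}

/* Clearing the bit hands the lock to whichever processor wins next. */
void spin_release(toy_spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Every failed spin_try_acquire() is another atomic operation on the shared cell, which is where the bus contention of plain test-and-set locks comes from.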
5050
Kernel Synchronization
[Figure: processor A and processor B both need the DPC queue; a spinlock guards the critical section]
Each processor executes:
do acquire_spinlock(DPC) until (SUCCESS)
begin
  remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
- Thus, the busy-wait loop requires no access to the memory bus
When releasing the lock, the first waiting processor's flag is modified
- Exactly one processor is being signaled
- Pre-determined wait order
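The idea of spinning on a processor-local flag is the core of the classic MCS queue lock, sketched here in C11. The slides do not say that Windows uses exactly this algorithm; mcs_lock, mcs_node and the function names are illustrative, and each caller is assumed to pass its own node.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Each waiting processor supplies its own node and spins on node->locked. */
typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool locked;
} mcs_node;

typedef struct { _Atomic(mcs_node *) tail; } mcs_lock;   /* NULL tail = free */

void mcs_init(mcs_lock *l) { atomic_init(&l->tail, NULL); }

void mcs_acquire(mcs_lock *l, mcs_node *me)
{
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    mcs_node *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev == NULL) return;                         /* lock was free */
    atomic_store(&prev->next, me);                    /* link behind predecessor */
    while (atomic_load(&me->locked))
        ;                                             /* spin on our own flag only */
}

void mcs_release(mcs_lock *l, mcs_node *me)
{
    mcs_node *succ = atomic_load(&me->next);
    if (succ == NULL) {
        mcs_node *expected = me;
        /* No known successor: try to mark the lock free. */
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;
        /* Someone is enqueueing; wait until the link appears. */
        while ((succ = atomic_load(&me->next)) == NULL)
            ;
    }
    atomic_store(&succ->locked, false);               /* hand over to exactly one waiter */
}
```

Waiters are released in the order in which they joined the queue, and a release touches only the flag of the next waiter.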
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of the use of spinlocks & the length of time they are held
- Scheduling database now per-CPU
  Allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated
  Charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks)
  Doesn't use spinlocks when there is no contention
  Used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once
  "All": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include an optional timeout argument
- Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and threads (signaled upon exit or terminate)
- Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Figure: interaction with thread scheduling. Thread states shown: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Labels: "Create and initialize thread object", "Thread waits on an object handle", "Set object to signaled state", "Wait is complete"]
5858
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread: it may preempt the running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object?
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: owning thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: owning thread or other thread releases mutex
- signaled -> nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect: kernel resumes one or more waiting threads
6161
What signals an object? (contd.)
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Event
- nonsignaled -> signaled: a thread sets the event
- signaled -> nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: dedicated thread sets one event in the event pair
- signaled -> nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes the waiting dedicated thread
Timer
- nonsignaled -> signaled: timer expires
- signaled -> nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: thread terminates
- signaled -> nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
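The deferral described on this slide can be modeled as a per-processor queue that is drained when the IRQL falls below dispatch level. This is a toy model only: toy_cpu_dpc, queue_dpc, lower_irql and count_call are hypothetical names, and the real kernel delivers DPCs through a software interrupt rather than a direct call.

```c
#include <stddef.h>

#define DISPATCH_LEVEL 2
#define MAX_DPCS 8

typedef void (*dpc_routine)(void *context);

typedef struct { dpc_routine routine; void *context; } toy_dpc;

/* Toy per-processor state: current IRQL plus a FIFO of pending DPCs. */
typedef struct {
    int     irql;
    toy_dpc queue[MAX_DPCS];
    size_t  head, count;
} toy_cpu_dpc;

/* Called from an ISR (high IRQL): record the work, do not run it yet.
 * Returns 1 on success, 0 if the queue is full. */
int queue_dpc(toy_cpu_dpc *cpu, dpc_routine fn, void *ctx)
{
    if (cpu->count == MAX_DPCS) return 0;
    size_t tail = (cpu->head + cpu->count) % MAX_DPCS;
    cpu->queue[tail].routine = fn;
    cpu->queue[tail].context = ctx;
    cpu->count++;
    return 1;
}

/* Lower the IRQL; once it would drop below dispatch level, drain the
 * queue at dispatch level first, then settle at the requested level. */
void lower_irql(toy_cpu_dpc *cpu, int new_irql)
{
    if (new_irql < DISPATCH_LEVEL && cpu->count > 0) {
        cpu->irql = DISPATCH_LEVEL;
        while (cpu->count > 0) {
            toy_dpc d = cpu->queue[cpu->head];
            cpu->head = (cpu->head + 1) % MAX_DPCS;
            cpu->count--;
            d.routine(d.context);
        }
    }
    cpu->irql = new_irql;
}

/* Example DPC routine: counts how often it was called. */
void count_call(void *ctx) { ++*(int *)ctx; }
```

DPCs queued while the processor is at device IRQL stay pending when the IRQL drops to another device level, and only run once it is about to fall below dispatch level.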
6565
[Figure: DPC queue: a queue head followed by a chain of DPC objects]
Deferred Procedure Calls (DPCs)
6666
Delivering a DPC
1. Timer expires; kernel queues a DPC that will release all waiting threads; kernel requests a SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with IRQLs from High and Power failure down to Dispatch/DPC, APC and Low; DPC objects waiting in the DPC queue; the dispatcher]
6767
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
- Permission required for user-mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: thread object with a kernel-mode (K) and a user-mode (U) APC queue, each linking APC objects]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 & not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
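The charging rules on this slide can be sketched as a small decision function. This is only an illustration of the rules as listed, not the kernel's code; the enum, the function name and the `user_mode` flag (used to pick between the thread's user and kernel time) are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>

/* Buckets a clock tick can be charged to (illustrative names). */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/*
 * irql      - IRQL that was current when the clock interrupt fired
 * in_dpc    - true if a DPC routine was executing at that moment
 * user_mode - true if the interrupted code was running in user mode
 */
charge_t charge_clock_tick(int irql, bool in_dpc, bool user_mode)
{
    if (irql > 2)
        return CHARGE_INTERRUPT;            /* servicing a device interrupt */
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: ordinary thread execution */
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

For instance, a tick that arrives at IRQL 2 while a DPC routine runs falls into the DPC bucket, so no thread is charged for it.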
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column
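The sampling effect described above can be illustrated with a small calculation. This is a simplified model, not the kernel's accounting code: it assumes ticks fire at exact multiples of the tick interval and that a thread is charged only if it is running at the instant a tick fires. The function name and the microsecond units are assumptions for the sketch.

```c
#include <assert.h>

/*
 * Counts the clock ticks that fire while a thread runs during
 * [start_us, end_us). Ticks fire at 0, tick_us, 2*tick_us, ...
 * Assumes start_us >= 0.
 */
long charged_ticks(long start_us, long end_us, long tick_us)
{
    if (end_us <= start_us || tick_us <= 0)
        return 0;
    /* first tick at or after start_us */
    long first = ((start_us + tick_us - 1) / tick_us) * tick_us;
    if (first >= end_us)
        return 0;               /* thread ran entirely between two ticks */
    return (end_us - 1 - first) / tick_us + 1;
}
```

With a 10 msec tick, a thread that runs from 1 msec to 9 msec and then waits is charged nothing, while a thread that runs only from 9 msec to 11 msec is charged a full tick.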
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout; a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"; Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects, each with a WaitBlockList, point to their wait blocks (Key, Type, Next link, List entry, Object, Thread); each wait block is also linked into the wait list head of the dispatcher object (Size, Type, State, Wait list head, object-type-specific data) it refers to]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once per enter, even if the body raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
dwTimeout == 0 - no wait, check whether object is signaled
dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles - pointer to array identifying these objects
– bWaitAll - whether to wait for first signaled object or all objects
Function returns index of first signaled object
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
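A minimal POSIX sketch of the shared-counter pattern may help with the comparison. It assumes a pthreads platform; `NUM_THREADS`, `ITERATIONS`, `worker`, `run_counter_demo` and `try_once` are names made up for this example.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 4
#define ITERATIONS  10000

static long counter = 0;                 /* shared by all threads */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&lock);       /* blocks until the mutex is free */
        counter += 1;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Starts the workers, waits for them, and returns the final counter. */
long run_counter_demo(void)
{
    pthread_t tid[NUM_THREADS];
    counter = 0;
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_create(&tid[i], NULL, worker, NULL);
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);
    return counter;
}

/* Polling attempt: returns 1 if the lock was free (taken, then released). */
int try_once(void)
{
    if (pthread_mutex_trylock(&lock) != 0)
        return 0;                        /* busy: returned without blocking */
    pthread_mutex_unlock(&lock);
    return 1;
}
```

Because every increment happens under the lock, `run_counter_demo()` yields `NUM_THREADS * ITERATIONS`; `try_once()` plays the role of a zero-timeout wait.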
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() reports the old semaphore count through lpPreviousCount
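The counting rules can be modeled in a few lines of portable C. This is a single-threaded model of the semantics only (no blocking, no kernel object); the `sem_model` type and its functions are hypothetical names. The rule that a release fails when it would push the count past the maximum follows the documented behavior of ReleaseSemaphore().

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    long count;   /* object counts as signaled while count > 0 */
    long max;     /* upper bound fixed at creation */
} sem_model;

bool sem_model_init(sem_model *s, long initial, long max)
{
    if (max <= 0 || initial < 0 || initial > max)
        return false;
    s->count = initial;
    s->max = max;
    return true;
}

/* Models a zero-timeout wait: succeeds and decrements only if signaled. */
bool sem_model_try_wait(sem_model *s)
{
    if (s->count <= 0)
        return false;
    s->count--;
    return true;
}

/* Adds `release` to the count and reports the old count via *previous
 * (may be NULL). Fails, leaving the count unchanged, if the result
 * would exceed max. */
bool sem_model_release(sem_model *s, long release, long *previous)
{
    if (release <= 0 || s->count + release > s->max)
        return false;
    if (previous)
        *previous = s->count;
    s->count += release;
    return true;
}
```

Starting from a count of 2 with a maximum of 3, two waits succeed, a third would have to block, and a release of 3 brings the count back to the maximum.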
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal() - one thread
– pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
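Since pthreads has no direct counterpart, a manual-reset event is usually emulated with a mutex, a condition variable and a flag. The sketch below shows one common way to do that, assuming a pthreads platform; the `manual_event` type and its functions are hypothetical names, not part of any library.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;   /* stays true until explicitly reset */
} manual_event;

void manual_event_init(manual_event *e, bool initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

void manual_event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);   /* release every waiting thread */
    pthread_mutex_unlock(&e->lock);
}

void manual_event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

void manual_event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

bool manual_event_is_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    bool state = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return state;
}

void manual_event_destroy(manual_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what makes the event "sticky": a thread that calls `manual_event_wait()` after the event was set returns at once, which a bare condition variable would not provide.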
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main process connecting prog1 to prog2 through a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
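For comparison, the same one-way, byte-oriented behavior can be tried with the POSIX `pipe()` call. This is a POSIX analog, not the Windows API described on the slide, and `pipe_round_trip` is a name made up for the sketch. It assumes the message is small enough to fit in the pipe buffer, so the write does not block.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/*
 * Sends msg through a freshly created pipe and reads it back into out.
 * Returns the number of bytes read, or -1 on error.
 * out_size must be at least 1.
 */
long pipe_round_trip(const char *msg, char *out, size_t out_size)
{
    int fds[2];                      /* fds[0] = read end, fds[1] = write end */
    if (pipe(fds) != 0)
        return -1;

    size_t len = strlen(msg);
    ssize_t written = write(fds[1], msg, len);
    close(fds[1]);                   /* reader sees end-of-file after the data */
    if (written != (ssize_t)len) {
        close(fds[0]);
        return -1;
    }

    ssize_t n = read(fds[0], out, out_size - 1);
    close(fds[0]);
    if (n < 0)
        return -1;
    out[n] = '\0';
    return (long)n;
}
```

Data flows only from the write end to the read end; a second pipe would be needed for traffic in the other direction.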
9494
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe; handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd.)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive – is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence in one call
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle in one call
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104104
Client Example Using a Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}
/* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile() – open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
107107
Locate a server via mailslot
[Figure: app clients 0 … n act as mailslot servers; the app server acts as mailslot client; the status message is sent periodically]
Mailslot servers (app client 0 … app client n):
hMS = CreateMailslot( "\\.\mailslot\status" … ); ReadFile( hMS, &ServStat … ); /* connect to server */
Mailslot client (app server):
while ( … ) { Sleep( … ); hMS = CreateFile( "\\*\mailslot\status" … ); … WriteFile( hMS, &StatInfo … ) }
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is msg size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0 – immediate return
– MAILSLOT_WAIT_FOREVER – infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name - retrieve handle for local mailslot
– \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max msg len 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life 意念改变生活
2929
Windows Synchronization
Uses interrupt masks to protect access to global resources on uniprocessor systems
Uses spinlocks on multiprocessor systems
Provides dispatcher objects which may act as mutexes and semaphores
Dispatcher objects may also provide events. An event acts much like a condition variable
3030
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking, multithreading (including real-time threads) and multiprocessing
3131
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and interrupt dispatching
IRQL levels & interrupt precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc., like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
– interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread, so-called "arbitrary thread context"
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
3737
Trap dispatching
Trap: processor's mechanism to capture executing thread
– Switch from user to kernel mode
– Interrupts – asynchronous
– Exceptions - synchronous
[Figure: trap handlers route each kind of trap to its dispatcher: interrupts to the interrupt dispatcher (interrupt service routines); system service calls to the system service dispatcher (system services); HW and SW exceptions to the exception dispatcher (exception handlers); virtual address exceptions to the virtual memory manager's pager]
Interrupt Dispatching
[Figure: an interrupt arrives while user or kernel mode code is running; control passes to kernel mode code]
Interrupt dispatch routine
Disable interrupts
Record machine state (trap frame) to allow resume
Mask equal- and lower-IRQL interrupts
Find and call appropriate ISR
Dismiss interrupt
Restore machine state (including mode and enabled interrupts)
Interrupt service routine
Tell the device to stop interrupting
Interrogate device state, start next operation on device, etc.
Request a DPC
Return to caller
Note: no thread or process context switch
3939
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– this masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
4040
Interrupt Precedence via IRQLs (x86)
[Figure: x86 IRQL table, highest to lowest]
31 High
30 Power fail
29 Interprocessor Interrupt
28 Clock
Profile & Synch (Srv 2003)
Device 1 …
(all of the above: hardware interrupts)
2 Dispatch/DPC
1 APC
(both of the above: deferrable software interrupts)
0 Passive/Low (normal thread execution)
4141
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library – Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected with IRQL
4343
Predefined IRQLs
High – used when halting the system (via KeBugCheck())
Power fail – originated in the NT design document, but has never been used
Inter-processor interrupt
– used to request action from other processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update system's clock, allocation of CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler – Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd.)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all, normal thread execution
4545
IRQLs on 64-bit Systems
[Figure: IRQL tables for x64 and IA64, highest to lowest]
x64: 15 High/Profile; 14 Interprocessor Interrupt/Power; 13 Clock; 12 Synch (Srv 2003); Device n … Device 1 (down to 3); 2 Dispatch/DPC; 1 APC; 0 Passive/Low
IA64: 15 High/Profile/Power; 14 Interprocessor Interrupt; 13 Clock; 12 Synch (MP only); Device n … Device 1 (down to 4); 3 Correctable Machine Check; 2 Dispatch/DPC & Synch (UP only); 1 APC; 0 Passive/Low
4646
Interrupt Prioritization & Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; can be configured with IntFilter utility in Resource Kit
– On x86 and x64 systems, the I/O APIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
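The two fixed mapping rules can be written out as one-line functions. This is only a worked illustration of the arithmetic, reading the x64/IA64 rule as the IDT vector number divided by 16 (integer division); the function names are made up for the example.

```c
#include <assert.h>

/* x86 uniprocessor rule: IRQL = 27 - IRQ */
int irql_from_irq_x86_up(int irq)
{
    return 27 - irq;
}

/* x64 / IA64 rule: IRQL = IDT vector number / 16 (integer division) */
int irql_from_vector_x64(int vector)
{
    return vector / 16;
}
```

Under the x86 rule a lower IRQ number means a higher IRQL: IRQ 0 maps to IRQL 27 and IRQ 8 to IRQL 19. Under the x64 rule, all 16 vectors in the range 0xD0 to 0xDF share IRQL 13.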
4747
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when kernel is deep within many layers of code
– Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
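A user-mode sketch of the idea, using a C11 `atomic_flag` as the single data cell: this is not the kernel's KSPIN_LOCK implementation, and the `spinlock` type and function names are hypothetical. It only shows the one-owner-at-a-time algorithm built on an atomic test-and-set.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
    atomic_flag held;    /* one bit of state is enough */
} spinlock;

void spinlock_init(spinlock *l)
{
    atomic_flag_clear(&l->held);
}

/* One atomic test-and-set; true if the caller now owns the lock. */
bool spinlock_try_acquire(spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

void spinlock_acquire(spinlock *l)
{
    while (!spinlock_try_acquire(l))
        ;                /* busy-wait until the owner releases */
}

void spinlock_release(spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Every failed attempt in `spinlock_acquire()` is another atomic write to the shared cell, so heavy contention turns into traffic on that one memory location.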
5050
Kernel Synchronization
[Figure: processor A and processor B each run the same critical section against the spinlock that guards the DPC queue]
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
– Thus busy-wait loop requires no access to the memory bus
When releasing lock, the 1st processor's flag is modified
– Exactly one processor is being signaled
– Pre-determined wait order
5252
SMP Scalability Improvements
Windows 2000 queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when no contention; used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once; "all": all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include optional timeout argument
– Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
– Events (may be auto-reset or manual reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object, and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
[Figure: thread state diagram (Initialized, Ready, Standby, Running, Waiting, Transition, Terminated) showing the interaction with thread scheduling: a thread object is created and initialized; the thread waits on an object handle and enters Waiting; when the object is set to the signaled state, the wait is complete]
5858
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread – it may preempt running thread if it has lower priority, and issues software interrupt to initiate context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
For each dispatcher object: system events and resulting state change; effect of signaled state on waiting threads
Mutex (kernel mode): becomes signaled when owning thread releases mutex; becomes nonsignaled when resumed thread acquires mutex; kernel resumes one waiting thread
Mutex (exported to user mode): becomes signaled when owning thread or other thread releases mutex; becomes nonsignaled when resumed thread acquires mutex; kernel resumes one waiting thread
Semaphore: becomes signaled when one thread releases the semaphore, freeing a resource; becomes nonsignaled when a thread acquires the semaphore and more resources are not available; kernel resumes one or more waiting threads
6161
What signals an object (contd.)
For each dispatcher object: system events and resulting state change; effect of signaled state on waiting threads
Event: becomes signaled when a thread sets the event; becomes nonsignaled when kernel resumes one or more threads; kernel resumes one or more waiting threads
Event pair: becomes signaled when dedicated thread sets one event in the event pair; becomes nonsignaled when kernel resumes the other dedicated thread; kernel resumes waiting dedicated thread
Timer: becomes signaled when timer expires; becomes nonsignaled when a thread (re)initializes the timer; kernel resumes all waiting threads
Thread: becomes signaled when thread terminates; becomes nonsignaled when a thread reinitializes the thread object; kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually the ISR) queues the request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– The specified procedure executes at dispatch IRQL ("dispatch level", also "DPC level") once all higher-IRQL work (interrupts) has completed
– Recommended maximum times: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: DPC queue, a queue head linked to a chain of DPC objects]
Delivering a DPC
1. Timer expires; kernel queues a DPC that will release all waiting threads, and the kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with levels from High and Power failure down through Dispatch/DPC and APC to Low; the DPC queue feeds the dispatcher]
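The queue-and-drain behaviour in steps 1 and 4 can be modelled in a few lines of portable C. This is only an illustrative sketch: the names `dpc`, `dpc_queue`, `dpc_enqueue`, `dpc_drain` and `count_call` are hypothetical and are not the Windows kernel structures or routines, and IRQL handling and per-CPU locking are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a deferred routine and its argument. */
typedef void (*dpc_routine)(void *context);

typedef struct dpc {
    dpc_routine routine;
    void *context;
    struct dpc *next;
} dpc;

typedef struct {
    dpc *head;
    dpc *tail;
} dpc_queue;

/* Step 1: queue a DPC at the tail so routines run in FIFO order. */
void dpc_enqueue(dpc_queue *q, dpc *d)
{
    assert(q != NULL && d != NULL && d->routine != NULL);
    d->next = NULL;
    if (q->tail != NULL)
        q->tail->next = d;
    else
        q->head = d;
    q->tail = d;
}

/* Step 4: run every queued routine, oldest first; returns how many ran. */
int dpc_drain(dpc_queue *q)
{
    int executed = 0;
    while (q->head != NULL) {
        dpc *d = q->head;
        q->head = d->next;
        if (q->head == NULL)
            q->tail = NULL;
        d->routine(d->context);
        executed++;
    }
    return executed;
}

/* Example routine: counts how often it was called. */
void count_call(void *context)
{
    ++*(int *)context;
}
```

A caller would enqueue from the "interrupt" side and call `dpc_drain()` once the simulated IRQL has dropped below dispatch level.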
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, and call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
– Permission required for user-mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make a thread suspend or terminate itself (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless the thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when the thread is in an "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: thread object with two APC queues, kernel (K) and user (U), each a list of APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to the thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to the thread's kernel time
– If IRQL > 2, charge to interrupt time
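The four accounting rules can be written as a small decision function. The sketch below is a model only: `charge_target` and `charge_tick` are hypothetical names, and the `was_user_mode` flag is an assumption about how the user/kernel split for IRQL < 2 would be decided.

```c
#include <assert.h>
#include <stdbool.h>

/* Buckets a clock tick can be charged to (illustrative names). */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* Decide where one clock tick is charged.
 * irql:          IRQL that the clock interrupt preempted (0..31 on x86)
 * in_dpc:        true if a DPC routine was executing
 * was_user_mode: true if the interrupted code ran in user mode */
charge_target charge_tick(int irql, bool in_dpc, bool was_user_mode)
{
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: ordinary thread execution */
    return was_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```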
IRQLs and CPU Time Accounting
Time spent servicing interrupts is NOT charged to the interrupted thread, so if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread or process
– Process Explorer shows these as separate processes (they are not really processes)
– Context switches shown for these are really counts of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add the Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– Some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header (see \ntddk\inc\ddk\ntddk.h)
All dispatcher objects are in one of two states
– "Signaled" vs. "nonsignaled"
– When signaled, a wait on the object is satisfied
– Different object types differ in terms of what changes their state
– Wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout, a header (Size, Type, State, Wait list head) followed by object-type-specific data]
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (Size, Type, State, Wait list head, object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type, Next link]
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    /* leave exactly once per enter, even if the body raises an exception */
    __finally { LeaveCriticalSection( &crit ); }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
– hObject specifies the kernel object
– dwTimeout specifies the wait time in msec
– dwTimeout == 0: no wait, just check whether the object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to an array identifying these objects
– bWaitAll: whether to wait for the first signaled object or for all objects; when waiting for the first, the function returns the index of the first signaled object (as WAIT_OBJECT_0 + index)
Side effects
– Mutexes, auto-reset events, and waitable timers are reset to the non-signaled state after the wait function completes
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done = 0, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in the same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
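The blocking and polling variants compared above look as follows on the POSIX side. This is a minimal sketch: the shared counter and the helper names `add_blocking`, `add_if_free` and `current_counter` are made up for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
int shared_counter = 0;

/* Blocking acquire: the counterpart of WaitForSingleObject(hMutex, INFINITE). */
void add_blocking(int value)
{
    pthread_mutex_lock(&counter_lock);
    shared_counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling acquire: the counterpart of WaitForSingleObject(hMutex, 0).
 * Returns 1 if the update was made, 0 if the mutex was busy. */
int add_if_free(int value)
{
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc == EBUSY)
        return 0;
    assert(rc == 0);
    shared_counter += value;
    pthread_mutex_unlock(&counter_lock);
    return 1;
}

/* Read the counter under the lock. */
int current_counter(void)
{
    pthread_mutex_lock(&counter_lock);
    int value = shared_counter;
    pthread_mutex_unlock(&counter_lock);
    return value;
}
```

Unlike a Windows mutex, a default pthread mutex has no "abandoned" state, so the error branch of the Windows example has no direct counterpart here.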
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases the semaphore count by 1
– ReleaseSemaphore() may increment the count by any value
– ReleaseSemaphore() returns the old semaphore count
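The counting rules above can be modelled with a pthread mutex and condition variable. The sketch is a portable approximation, not the Windows implementation; `counting_sem`, `csem_init`, `csem_acquire` and `csem_release` are hypothetical names. Like ReleaseSemaphore(), the release function refuses to push the count past the maximum and reports the previous count.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonzero;  /* broadcast whenever the count is raised */
    long count;               /* "signaled" while count > 0 */
    long max;
} counting_sem;

void csem_init(counting_sem *s, long initial, long max)
{
    assert(max > 0 && initial >= 0 && initial <= max);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = initial;
    s->max = max;
}

/* Wait: block while the count is 0, then take one unit. */
void csem_acquire(counting_sem *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

/* Release n units. Stores the old count in *previous (if not NULL).
 * Returns 1 on success, 0 if n is invalid or would exceed the maximum,
 * in which case the count is left unchanged. */
int csem_release(counting_sem *s, long n, long *previous)
{
    int ok = 0;
    pthread_mutex_lock(&s->lock);
    if (n > 0 && n <= s->max - s->count) {
        if (previous != NULL)
            *previous = s->count;
        s->count += n;
        pthread_cond_broadcast(&s->nonzero);
        ok = 1;
    }
    pthread_mutex_unlock(&s->lock);
    return ok;
}
```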
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
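Although pthreads has no exact equivalent, a manual-reset event is commonly approximated by pairing a condition variable with a flag guarded by a mutex. The sketch below shows one way to do it; `manual_event` and the `event_*` functions are hypothetical names, and timeouts and PulseEvent() semantics are deliberately left out.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool signaled;          /* stays true until event_reset() is called */
} manual_event;

void event_init(manual_event *e, bool initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent() on a manual-reset event: release all current waiters. */
void event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(): later waiters will block again. */
void event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Like WaitForSingleObject(hEvent, INFINITE); the flag is not consumed. */
void event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

bool event_is_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    bool state = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return state;
}
```

The flag is what makes the state persistent: a bare condition variable forgets a signal that arrives before anyone waits, whereas a set manual-reset event stays signaled.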
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on a pipe handle will block if the pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
[Figure: main creates a pipe that connects prog1 to prog2]
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message-oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same named pipe
– Server can respond to a client using the same instance
Pipe can be accessed over the network
– Location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. means the local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with the named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
– Combines a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
– Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
– Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
– A client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain
Locate a server via mailslot
Mailslot servers (app client 0 … app client n), each running:
hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile( hMS, &ServStat ); /* connect to server */
Mailslot client (app server); the message is sent periodically:
while (…) { Sleep(…); hMS = CreateFile( "\\*\mailslot\status" ); … WriteFile( hMS, &StatInfo … ) }
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
Linux Synchronization
Kernel disables interrupts for synchronizing access to global data on uniprocessor systems
Uses spinlocks for multiprocessor synchronization
Uses semaphores and readers-writers locks when longer sections of code need access to data
Implements POSIX synchronization primitives to support multitasking multithreading (including real-time threads) and multiprocessing
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
– Chapter 7 - Process Synchronization
– Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc. like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
– Interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
Trap dispatching
Trap: processor's mechanism to capture an executing thread
– Switch from user to kernel mode
– Interrupts: asynchronous
– Exceptions: synchronous
[Figure: trap handlers. An interrupt goes to the interrupt dispatcher and on to interrupt service routines; a system service call goes to the system service dispatcher and on to system services; HW and SW exceptions go to the exception dispatcher and on to exception handlers; virtual address exceptions go to the virtual memory manager's pager]
Interrupt Dispatching
Interrupt dispatch routine
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch
[Figure: an interrupt diverts user- or kernel-mode code to the interrupt dispatch routine, which runs in kernel mode and calls the interrupt service routine]
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises the processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
Interrupt Precedence via IRQLs (x86)
31 High
30 Power fail
29 Interprocessor Interrupt
28 Clock
27 Profile & Synch (Srv 2003)
26 … 3 Device n … Device 1
2 Dispatch/DPC
1 APC
0 Passive/Low
Levels 3 to 31: hardware interrupts
Levels 1 and 2: deferrable software interrupts
Level 0: normal thread execution
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts the processor (single line)
– Processor queries for the interrupt vector and uses the vector as an index into the IDT
Alpha
– PAL code (Privileged Architecture Library, the Alpha BIOS) determines the interrupt vector and calls the kernel
– Kernel uses the vector to index the IDT
After ISR execution, IRQL is lowered to its initial level
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls the ISR with the interrupt object as parameter (HW cannot pass parameters to the ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn ISR on/off
Interrupt objects can synchronize access to ISR data
– Multiple instances of an ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected to one IRQL
Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document, but has never been used
Inter-processor interrupt
– Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update the system's clock and to allocate CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler, Kernprof.exe, Res Kit)
Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all; normal thread execution
IRQLs on 64-bit Systems
x64 (lowest to highest)
– 0 Passive/Low
– 1 APC
– 2 Dispatch/DPC
– Device 1 … Device n
– 12 Synch (Srv 2003)
– 13 Clock
– 14 Interprocessor Interrupt/Power
– 15 High/Profile
IA64 (lowest to highest)
– 0 Passive/Low
– 1 APC
– 2 Dispatch/DPC & Synch (UP only)
– 3 Correctable Machine Check
– 4 and up: Device 1 … Device n
– 12 Synch (MP only)
– 13 Clock
– 14 Interprocessor Interrupt
– 15 High/Profile/Power
Interrupt Prioritization & Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; this can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
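The two arithmetic rules can be checked with a couple of helper functions. This is a sketch for illustration only: the function names are made up, the IRQ range 0..15 is an assumption (legacy interrupt controller lines), and the "/ 16" rule is read as integer division of an 8-bit vector number.

```c
#include <assert.h>

/* x86 uniprocessor rule: IRQL = 27 - IRQ.
 * Assumes IRQ lines 0..15, so the result falls in the device range. */
int irql_from_irq_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 15);
    return 27 - irq;
}

/* x64 / IA64 rule: IRQL = IDT vector number / 16 (integer division),
 * i.e. the upper four bits of an 8-bit vector. */
int irql_from_vector_64(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

For example, under these rules IRQ 8 maps to IRQL 19 on an x86 uniprocessor, and vector 0xD1 maps to IRQL 13 on a 64-bit system.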
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when the kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– A spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
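The "single bit plus atomic test-and-modify" idea maps directly onto C11's `atomic_flag`. The following is a user-mode sketch of a test-and-set spinlock, not the kernel's KSPIN_LOCK code; the type and function names are hypothetical, and raising IRQL (which the kernel does around spinlock use) is not modelled.

```c
#include <stdatomic.h>

/* One bit of state: set = owned, clear = free. */
typedef struct {
    atomic_flag held;
} spinlock;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Spin until the atomic test-and-set finds the flag clear. */
void spin_acquire(spinlock *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;   /* busy-wait: another CPU owns the lock */
}

/* Single attempt; returns 1 if the lock was obtained, 0 otherwise. */
int spin_try_acquire(spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Clear the flag so that one spinning CPU can take over. */
void spin_release(spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Every spinning processor hammers the same memory cell here, which is the bus contention problem that queued spinlocks address.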
Kernel Synchronization
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Processor A and Processor B each execute:
do acquire_spinlock(DPC) until (SUCCESS)
begin /* critical section */
    remove DPC from queue
end
release_spinlock(DPC)
[Figure: both processors contend for the spinlock that guards the shared DPC queue]
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
– Thus the busy-wait loop requires no access to the memory bus
When releasing the lock, the flag of the first processor in the queue is modified
– Exactly one processor is being signaled
– Pre-determined wait order
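A well-known algorithm with exactly these properties is the MCS queue lock, shown below with C11 atomics. It is offered as an illustration of the technique rather than as the Windows implementation; `qnode`, `qlock`, `q_acquire` and `q_release` are hypothetical names, and each caller is assumed to supply its own `qnode` (standing in for the per-processor structure).

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiting processor; each waiter spins only on its own flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool locked;
} qnode;

/* The lock itself is just a pointer to the last node in the queue. */
typedef struct {
    _Atomic(qnode *) tail;
} qlock;

void q_acquire(qlock *l, qnode *me)
{
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    /* Join the queue; the previous tail (if any) is our predecessor. */
    qnode *prev = atomic_exchange(&l->tail, me);
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->locked))
            ;   /* spin on the local flag, not on the shared lock word */
    }
}

void q_release(qlock *l, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        /* No visible successor: try to mark the lock free. */
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;
        /* A waiter is in the middle of enqueueing; wait for its link. */
        while ((succ = atomic_load(&me->next)) == NULL)
            ;
    }
    /* Hand the lock to exactly one waiter, in queue order. */
    atomic_store(&succ->locked, false);
}
```

Because each release clears the flag of the next node only, waiters are served in arrival order and only one of them is woken per release.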
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length of time they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once; for "all", all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include an optional timeout argument
– Waiting threads consume no CPU time
Waiting
Waitable objects include
– Events (may be auto-reset or manual-reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
Interaction with thread scheduling
[Figure: thread states Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Creating and initializing a thread object leads to Initialized; a thread that waits on an object handle enters Waiting; when the object is set to the signaled state, the wait is complete and the thread leaves Waiting]
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread; it may preempt a running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
Dispatcher object
System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
Owning thread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (kernel mode)
nonsignaled signaled
Owning thread or otherthread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (exported to user mode)
nonsignaled signaled
One thread releases thesemaphore freeing a resource
A thread acquires the semaphoreMore resources are not available
Kernel resumes one or more waiting threads
Semaphore
6161
What signals an object (contd)
Event
- Nonsignaled to signaled: a thread sets the event
- Signaled to nonsignaled: kernel resumes one or more threads
- Effect of signaled state on waiting threads: kernel resumes one or more waiting threads
Event pair
- Nonsignaled to signaled: dedicated thread sets one event in the event pair
- Signaled to nonsignaled: kernel resumes the other dedicated thread
- Effect of signaled state on waiting threads: kernel resumes waiting dedicated thread
Timer
- Nonsignaled to signaled: timer expires
- Signaled to nonsignaled: a thread (re)initializes the timer
- Effect of signaled state on waiting threads: kernel resumes all waiting threads
Thread
- Nonsignaled to signaled: thread terminates
- Signaled to nonsignaled: a thread reinitializes the thread object
- Effect of signaled state on waiting threads: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU. DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) is completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: DPC queue. A queue head links a chain of DPC objects.]
Delivering a DPC
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with levels from High and Power failure down through Dispatch/DPC and APC to Low; the DPC queue feeds the dispatcher.]
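The delivery steps can be modeled in a few lines of portable C. This is only a sketch: the types and functions (cpu, dpc, queue_dpc, lower_irql, count_dpc) are hypothetical and are not the kernel's real structures or routines. It assumes a single per-CPU FIFO and shows that queued DPCs run only once IRQL is lowered below dispatch level.

```c
#include <assert.h>
#include <stddef.h>

#define DISPATCH_LEVEL 2   /* dispatch/DPC IRQL, as in the slides */

/* Hypothetical model types; not the real kernel structures. */
typedef void (*dpc_routine)(void *context);

typedef struct dpc {
    dpc_routine routine;   /* deferred procedure to run */
    void *context;         /* argument handed to the routine */
    struct dpc *next;      /* link to the next queued DPC */
} dpc;

typedef struct cpu {
    int irql;              /* current IRQL of this processor */
    dpc *head, *tail;      /* per-CPU FIFO of pending DPCs */
} cpu;

/* An ISR calls this to defer work; nothing runs yet. */
void queue_dpc(cpu *c, dpc *d, dpc_routine fn, void *context)
{
    d->routine = fn;
    d->context = context;
    d->next = NULL;
    if (c->tail) c->tail->next = d; else c->head = d;
    c->tail = d;
}

/* Lowering IRQL below dispatch level drains the queue at dispatch
   level before the processor settles at new_irql. */
void lower_irql(cpu *c, int new_irql)
{
    assert(new_irql <= c->irql);
    if (new_irql < DISPATCH_LEVEL && c->head != NULL) {
        c->irql = DISPATCH_LEVEL;
        while (c->head != NULL) {
            dpc *d = c->head;
            c->head = d->next;
            if (c->head == NULL) c->tail = NULL;
            d->routine(d->context);
        }
    }
    c->irql = new_irql;
}

/* Example DPC routine: counts how often it was run. */
void count_dpc(void *context) { ++*(int *)context; }
```

Usage: queue two DPCs while the model CPU sits at a device IRQL; lowering to DISPATCH_LEVEL runs nothing, lowering to 0 runs both in FIFO order.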
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel mode (K) and user mode (U), each holding APC objects.]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
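The clock ISR's accounting rule is a small decision table, which can be written down directly. The sketch below is illustrative only: the enum, the function name charge_clock_tick and the user_mode flag (standing in for "was the interrupted thread in user mode") are assumptions, not kernel identifiers.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,     /* thread's user time */
    CHARGE_THREAD_KERNEL,   /* thread's kernel time */
    CHARGE_DPC,             /* DPC time, not charged to any thread */
    CHARGE_INTERRUPT        /* interrupt time, not charged to any thread */
} charge_target;

/* Decides where one clock tick is charged, following the four rules
   on the slide. irql is the IRQL that was interrupted by the clock. */
charge_target charge_clock_tick(int irql, bool in_dpc, bool user_mode)
{
    assert(irql >= 0 && irql <= 31);
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Because the decision is made only when the clock fires, anything that starts and finishes between two ticks is never seen by this function.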
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g. one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout. A header with Size, Type, State and Wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h).]
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects, each with a WaitBlockList, point to their chain of wait blocks (List entry, Object, Thread, Key, Type, Next link); each dispatcher object (Size, Type, State, Wait list head, object-type-specific data) links the wait blocks of the threads waiting on it.]
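The relationship between dispatcher headers and wait blocks can be sketched as two small structs plus the "any"/"all" check. Everything here is a simplified, hypothetical model: the field names follow the labels in the figure, but the layouts, the thread_id stand-in and the function wait_satisfied are assumptions, not the definitions in ntddk.h.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { WAIT_ANY, WAIT_ALL } wait_type;

struct wait_block;

/* Common header shared by every waitable object (simplified). */
typedef struct dispatcher_header {
    int type;                     /* event, mutex, semaphore, ... */
    bool signaled;                /* signaled vs. nonsignaled */
    struct wait_block *waiters;   /* head of this object's wait list */
} dispatcher_header;

/* One wait block per handle passed to a wait call. */
typedef struct wait_block {
    dispatcher_header *object;    /* what is being waited for */
    int thread_id;                /* stand-in for the thread pointer */
    int key;                      /* position in the handle array */
    wait_type type;               /* "any" or "all" */
    struct wait_block *next;      /* next block of the same wait call */
} wait_block;

/* Returns the key of the first signaled object for WAIT_ANY, 0 for a
   satisfied WAIT_ALL, or -1 if the thread must keep waiting. */
int wait_satisfied(const wait_block *chain)
{
    assert(chain != NULL);
    if (chain->type == WAIT_ANY) {
        for (const wait_block *b = chain; b != NULL; b = b->next)
            if (b->object->signaled)
                return b->key;
        return -1;
    }
    for (const wait_block *b = chain; b != NULL; b = b->next)
        if (!b->object->signaled)
            return -1;
    return 0;
}
```

The key plays the role described on the slide: for a wait-any it identifies which entry of the caller's handle array resolved the wait.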
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once, even if the guarded code raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
  dwTimeout == 0 - no wait, check whether object is signaled
  dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles - pointer to array identifying these objects
- bWaitAll - whether to wait for first signaled object or all objects
  When waiting for any object, the return value minus WAIT_OBJECT_0 is the index of the first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() closes the handle; the mutex object is freed when its last handle is closed
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done = 0, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
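A short pthreads sketch makes the comparison concrete. It mirrors the shared-counter example from the Windows mutex slide; the names counter_lock, add_blocking, add_if_free and run_workers are made up for this illustration, and the four worker threads are an arbitrary choice.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;   /* shared by all threads */

/* Blocking update: the analogue of waiting with an INFINITE timeout. */
void add_blocking(long delta)
{
    int rc = pthread_mutex_lock(&counter_lock);
    assert(rc == 0);
    (void)rc;
    counter += delta;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling update: the analogue of waiting with a zero timeout.
   Returns false without touching the counter if the mutex is busy. */
bool add_if_free(long delta)
{
    if (pthread_mutex_trylock(&counter_lock) != 0)
        return false;
    counter += delta;
    pthread_mutex_unlock(&counter_lock);
    return true;
}

static void *worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++)
        add_blocking(1);
    return NULL;
}

/* Runs four workers and returns the counter once all have finished. */
long run_workers(long iterations)
{
    pthread_t t[4];
    for (int i = 0; i < 4; i++)
        pthread_create(&t[i], NULL, worker, &iterations);
    for (int i = 0; i < 4; i++)
        pthread_join(t[i], NULL);
    return counter;
}
```

Unlike a Windows mutex handle, a pthread mutex with default attributes is private to one process, which matches the "same process" note above.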
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value, up to the maximum count
- ReleaseSemaphore() returns old semaphore count (through lpPreviousCount)
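The counting rules can be tried out with a small portable model built on a pthread mutex and condition variable. This is a sketch of the semantics listed above, not the Windows implementation: counting_sem, csem_init, csem_wait and csem_release are hypothetical names, and the release function mimics the behavior of reporting the previous count and refusing to exceed the maximum.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t m;
    pthread_cond_t nonzero;   /* signaled when count becomes > 0 */
    long count, max;
} counting_sem;

void csem_init(counting_sem *s, long initial, long max)
{
    assert(0 <= initial && initial <= max);
    pthread_mutex_init(&s->m, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = initial;
    s->max = max;
}

/* Wait: blocks while the count is 0, then takes one unit. */
void csem_wait(counting_sem *s)
{
    pthread_mutex_lock(&s->m);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->m);
    s->count--;
    pthread_mutex_unlock(&s->m);
}

/* Release: adds n units and reports the old count. Fails, changing
   nothing, if the result would exceed the maximum. */
bool csem_release(counting_sem *s, long n, long *previous)
{
    bool ok = false;
    pthread_mutex_lock(&s->m);
    if (n > 0 && s->count + n <= s->max) {
        if (previous != NULL)
            *previous = s->count;
        s->count += n;
        pthread_cond_broadcast(&s->nonzero);
        ok = true;
    }
    pthread_mutex_unlock(&s->m);
    return ok;
}
```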
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE - create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
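Since there is no exact equivalent, a manual-reset event is usually approximated with a condition variable plus a flag protected by a mutex. The sketch below shows one common way to do this; manual_event and the event_* functions are hypothetical names, and it does not attempt to reproduce PulseEvent().

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t m;
    pthread_cond_t cv;
    bool signaled;          /* the state a bare condition variable lacks */
} manual_event;

void event_init(manual_event *e, bool initial_state)
{
    assert(e != NULL);
    pthread_mutex_init(&e->m, NULL);
    pthread_cond_init(&e->cv, NULL);
    e->signaled = initial_state;
}

/* Set: stays signaled and releases every current and future waiter. */
void event_set(manual_event *e)
{
    pthread_mutex_lock(&e->m);
    e->signaled = true;
    pthread_cond_broadcast(&e->cv);
    pthread_mutex_unlock(&e->m);
}

/* Reset: must be called explicitly, as with a manual-reset event. */
void event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->m);
    e->signaled = false;
    pthread_mutex_unlock(&e->m);
}

/* Wait: returns at once if signaled; the loop guards against
   spurious wakeups of pthread_cond_wait(). */
void event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->m);
    while (!e->signaled)
        pthread_cond_wait(&e->cv, &e->m);
    pthread_mutex_unlock(&e->m);
}
```

The flag is what makes the difference: a signal sent to a plain condition variable while nobody waits is lost, whereas here a later event_wait() still sees the signaled state.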
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe.]
Half-duplex character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using pipes with the same name
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on remote machine (. - local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive - is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
CreateFile, WriteFile, ReadFile, CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if client connected between CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
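For comparison, a minimal FIFO exchange on a POSIX system might look like the sketch below. A child process plays the client and sends one fixed-size record; the parent plays the server and reads it. The record layout, the function name fifo_roundtrip and the path passed in by the caller are all assumptions made for this illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fixed-size record, so a byte-oriented FIFO still delivers whole requests. */
typedef struct { int client_id; int value; } request;

/* Sends one request through a FIFO at path from a child process and
   returns the value the parent reads back, or -1 on any failure. */
int fifo_roundtrip(const char *path, int value)
{
    assert(path != NULL);
    unlink(path);                        /* drop a stale FIFO, if any */
    if (mkfifo(path, 0600) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0) {
        unlink(path);
        return -1;
    }
    if (child == 0) {                    /* child acts as the client */
        request out = { (int)getpid(), value };
        int wfd = open(path, O_WRONLY);  /* blocks until a reader opens */
        if (wfd < 0)
            _exit(1);
        ssize_t sent = write(wfd, &out, sizeof out);
        close(wfd);
        _exit(sent == (ssize_t)sizeof out ? 0 : 1);
    }

    request in = { 0, -1 };              /* parent acts as the server */
    ssize_t got = -1;
    int rfd = open(path, O_RDONLY);
    if (rfd >= 0) {
        got = read(rfd, &in, sizeof in);
        close(rfd);
    } else {
        kill(child, SIGKILL);            /* do not leave the client blocked in open() */
    }
    waitpid(child, NULL, 0);
    unlink(path);
    return got == (ssize_t)sizeof in ? in.value : -1;
}
```

A reply would need a second FIFO in the opposite direction, which is the half-duplex limitation noted above.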
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many comm.)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile() - open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
Locate a server via mailslot
[Figure: the app server acts as mailslot client and sends a status message periodically; app clients 0 to n act as mailslot servers and read it to locate the server.]
Mailslot servers (app client 0 ... app client n)
hMS = CreateMailslot( "\\.\mailslot\status" );
ReadFile( hMS, &ServStat ); /* connect to server */
Mailslot client (app server)
while (...) { Sleep(...); hMS = CreateFile( "\\*\mailslot\status" ); ... WriteFile( hMS, &StatInfo ) }
Message is sent periodically
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is max. msg size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0 - immediate return
- MAILSLOT_WAIT_FOREVER - infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max. msg len 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
Further Reading
Ben-Ari, M., Principles of Concurrent Programming, Prentice Hall, 1982
Lamport, L., The Mutual Exclusion Problem, Journal of the ACM, April 1986
Abraham Silberschatz, Peter B. Galvin, Operating System Concepts, John Wiley & Sons, 6th Ed., 2003
- Chapter 7 - Process Synchronization
- Chapter 8 - Deadlocks
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and Interrupt dispatching
IRQL levels & Interrupt Precedence
Spinlocks and Kernel Synchronization
Executive Synchronization
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
- Protects the system from the users
- Protects the user (process) from themselves
- System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
- Intel: Ring 0, Ring 3
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
- Scheduled, preempted, etc. like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
- interrupt dispatcher invokes the interrupt service routine
- ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread
Trap dispatching
Trap: processor's mechanism to capture executing thread
- Switch from user to kernel mode
- Interrupts - asynchronous
- Exceptions - synchronous
[Figure: trap handlers route each kind of trap to its dispatcher. Interrupts go to the interrupt dispatcher, which calls interrupt service routines; system service calls go to the system service dispatcher, which calls system services; hardware and software exceptions go to the exception dispatcher, which calls exception handlers; virtual address exceptions go to the virtual memory manager's pager.]
Interrupt Dispatching
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch
[Figure: an interrupt arrives while user or kernel mode code is running; the dispatch routine and the ISR run in kernel mode.]
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- this masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
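Interrupt Precedence via IRQLs (x86)
IRQL    Level                          Kind
31      High                           hardware interrupt
30      Power fail                     hardware interrupt
29      Interprocessor interrupt       hardware interrupt
28      Clock                          hardware interrupt
27      Profile & Synch (Srv 2003)     hardware interrupt
3-26    Device 1 ... Device n          hardware interrupts
2       Dispatch/DPC                   deferrable software interrupt
1       APC                            deferrable software interrupt
0       Passive/Low                    normal thread execution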
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library - Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected to the same IRQL
Predefined IRQLs
High
- used when halting the system (via KeBugCheck())
Power fail
- originated in the NT design document, but has never been used
Inter-processor interrupt
- used to request action from other processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update system's clock, allocation of CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler - Kernprof.exe, Res Kit)
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all, normal thread execution
IRQLs on 64-bit Systems
IRQL    x64                               IA64
15      High/Profile                      High/Profile/Power
14      Interprocessor Interrupt/Power    Interprocessor Interrupt
13      Clock                             Clock
12      Synch (Srv 2003)                  Synch (MP only)
...     Device n                          Device n
4       ...                               Device 1
3       Device 1                          Correctable Machine Check
2       Dispatch/DPC                      Dispatch/DPC & Synch (UP only)
1       APC                               APC
0       Passive/Low                       Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device
  Can be configured with IntFilter utility in Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source
  Processors are assigned round robin for each interrupt vector
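The two deterministic mapping rules are plain arithmetic and can be checked with a couple of helper functions. The function names and the input ranges asserted below (16 IRQ lines, 256 IDT vectors) are assumptions for this sketch; only the two formulas come from the slide.

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_from_irq_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 15);       /* assumption: 16 IRQ lines */
    return 27 - irq;
}

/* x64 and IA64 rule from the slide: IRQL = IDT vector number / 16. */
int irql_from_vector(int vector)
{
    assert(vector >= 0 && vector <= 255); /* assumption: 256 IDT entries */
    return vector / 16;
}
```

With 256 vectors and a divisor of 16, the second rule yields exactly the 16 levels (0 to 15) used on the 64-bit platforms.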
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when kernel is deep within many layers of code
- Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Sync on MP: use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
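The "single bit plus atomic test-and-modify" idea maps directly onto C11's atomic_flag. The sketch below is a user-mode illustration of the principle, not the kernel's KSPIN_LOCK code; the type spinlock and the spin_* functions are names invented here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { atomic_flag held; } spinlock;   /* one bit of state */

void spin_init(spinlock *l)
{
    assert(l != NULL);
    atomic_flag_clear(&l->held);                 /* start out free */
}

/* One atomic test-and-set; the previous value tells us who won.
   Returns true if the caller now owns the lock. */
bool spin_try_acquire(spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-waits until the lock is free, then owns it. */
void spin_acquire(spinlock *l)
{
    while (!spin_try_acquire(l))
        ;   /* spin */
}

void spin_release(spinlock *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Every failed attempt in spin_acquire() is another atomic write to the same memory cell, which is the bus contention that queued spinlocks are meant to avoid.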
Kernel Synchronization
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Processor A and Processor B each run:
do acquire_spinlock(DPC) until (SUCCESS)
begin   /* critical section */
    remove DPC from queue
end
release_spinlock(DPC)
[Figure: processors A and B contend for the spinlock that guards the DPC queue.]
Queued Spinlocks
Problem: checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
- Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the flag of the first processor in the queue is modified
- Exactly one processor is being signaled
- Pre-determined wait order
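The queueing idea can be illustrated with an MCS-style lock written with C11 atomics. This is a textbook technique chosen here because it has the properties listed above (local spinning, FIFO hand-over to exactly one waiter); it is not taken from the Windows sources, and qnode, queued_lock and the qlock_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiting processor (or thread, in this user-mode sketch). */
typedef struct qnode {
    _Atomic(struct qnode *) next;   /* successor in the wait queue */
    atomic_bool waiting;            /* the processor-local flag */
} qnode;

typedef struct { _Atomic(qnode *) tail; } queued_lock;

void qlock_init(queued_lock *l)
{
    assert(l != NULL);
    atomic_init(&l->tail, NULL);
}

void qlock_acquire(queued_lock *l, qnode *me)
{
    atomic_store(&me->next, NULL);
    atomic_store(&me->waiting, true);
    qnode *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev == NULL)
        return;                                    /* lock was free */
    atomic_store(&prev->next, me);                 /* link behind predecessor */
    while (atomic_load(&me->waiting))
        ;                                          /* spin on our own flag only */
}

void qlock_release(queued_lock *l, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, (qnode *)NULL))
            return;                                /* nobody was waiting */
        while ((succ = atomic_load(&me->next)) == NULL)
            ;                                      /* successor is still linking in */
    }
    atomic_store(&succ->waiting, false);           /* hand over to exactly one waiter */
}
```

Each caller passes its own qnode, typically a local variable, and must pass the same node to qlock_release() that it used to acquire.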
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU
  Allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated
  Charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks)
  Doesn't use spinlocks when no contention
  Used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once
  "All": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include optional timeout argument
- Waiting threads consume no CPU time
Waiting
Waitable objects include
- Events (may be auto-reset or manual reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
ndash If multiple threads are waiting for an object and only one thread is released (eg itrsquos a mutex or auto-reset event) which thread gets released is unpredictable
5757
Executive Synchronization
Thread waitson an object
handle
Create and initialize thread object
Initialized
Ready
Transition
Waiting
Running
Terminated
Standby
Wait is completeSet object to
signaled state
Interaction with thread scheduling
Waiting on Dispatcher Objects ndash outside the kernel
5858
Interaction bet Synchronization amp Dispatching User mode thread waits on an event objectlsquos handle
Kernel changes threadlsquos scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads variable priority threads get priority boost
5959
Interaction bet Synchronization amp Dispatching Dispatcher re-schedules new thread ndash it may preempt running thread it it has lower priority and issues software interrupt to initiate context switch
If no processor can be preempted the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
(for each dispatcher object: system events and the resulting state change, and the effect of the signaled state on waiting threads)
Mutex (kernel mode)
– nonsignaled → signaled: owning thread releases mutex
– signaled → nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled → signaled: owning thread or other thread releases mutex
– signaled → nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Semaphore
– nonsignaled → signaled: one thread releases the semaphore, freeing a resource
– signaled → nonsignaled: a thread acquires the semaphore and no more resources are available
– Effect: kernel resumes one or more waiting threads
6161
What signals an object (contd)
(for each dispatcher object: system events and the resulting state change, and the effect of the signaled state on waiting threads)
Event
– nonsignaled → signaled: a thread sets the event
– signaled → nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled → signaled: dedicated thread sets one event in the event pair
– signaled → nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes waiting dedicated thread
Timer
– nonsignaled → signaled: timer expires
– signaled → nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– nonsignaled → signaled: thread terminates
– signaled → nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually the ISR) queues the request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") once all higher-IRQL work (interrupts) has completed
– Maximum recommended times: ISR 10 usec, DPC 25 usec
– See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: DPC queue, a queue head linked to a chain of DPC objects]
6666
Delivering a DPC
1. Timer expires; kernel queues a DPC that will release all waiting threads; kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table ordered from High and Power failure down through Dispatch/DPC and APC to Low, with the DPC queue and the dispatcher]
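The queue-then-drain pattern of the four steps above can be sketched as a small user-space model. Everything here (struct dpc, dpc_enqueue, dpc_drain, add_one) is a hypothetical name chosen for illustration; the real kernel structures and routines differ.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*dpc_routine)(void *context);

struct dpc {                 /* one deferred call: routine plus its argument */
    dpc_routine routine;
    void *context;
    struct dpc *next;
};

struct dpc_queue {           /* the slides describe one such queue per CPU */
    struct dpc *head;
    struct dpc *tail;
};

/* Steps 1-2: an ISR or timer expiry appends a DPC object to the queue. */
void dpc_enqueue(struct dpc_queue *q, struct dpc *d)
{
    assert(q != NULL && d != NULL);
    d->next = NULL;
    if (q->tail)
        q->tail->next = d;
    else
        q->head = d;
    q->tail = d;
}

/* Steps 3-4: once IRQL has dropped, run every queued routine in FIFO order.
   Returns the number of routines executed. */
int dpc_drain(struct dpc_queue *q)
{
    assert(q != NULL);
    int ran = 0;
    while (q->head) {
        struct dpc *d = q->head;
        q->head = d->next;
        if (!q->head)
            q->tail = NULL;
        d->routine(d->context);
        ran++;
    }
    return ran;
}

/* Example routine used to observe that a DPC ran. */
void add_one(void *context) { (*(int *)context)++; }
```

The model leaves out IRQL entirely; in the real system the drain only happens when the processor's IRQL falls below dispatch/DPC level.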
6767
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
– Permission required for user-mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless the thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when the thread is in an "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, K (kernel mode) and U (user mode), each holding a chain of APC objects]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2: charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC: charge to DPC time
– If IRQL = 2 and not processing a DPC: charge to thread kernel time
– If IRQL > 2: charge to interrupt time
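The four accounting rules map directly onto a small decision function. The sketch below is a model of the rules as listed, with hypothetical names (enum charge, clock_tick_charge); the user_mode flag is an assumption standing in for "was the interrupted thread running user or kernel code".

```c
#include <assert.h>
#include <stdbool.h>

enum charge {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
};

/* Decide which bucket one clock tick is charged to, given the state
   the clock ISR interrupted. */
enum charge clock_tick_charge(int irql, bool in_dpc, bool user_mode)
{
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Because the decision is made only when the clock fires, anything that starts and finishes between two ticks is never charged at all.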
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches shown for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g. one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add the Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects, each with a WaitBlockList, pointing to chains of wait blocks; each wait block holds a list entry, Object and Thread pointers, Key, Type, and a Next link; each dispatcher object (Size, Type, State, Wait list head, object-type-specific data) heads the list of wait blocks waiting on it]
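The double linkage described above (each wait block sits on one object's wait list and on one thread's wait block list) can be sketched with plain C structs. The field and function names below are simplified assumptions for illustration and do not reproduce the definitions in the DDK headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct wait_block;

struct dispatcher_header {          /* simplified common header */
    int type;
    bool signaled;                  /* the "state" field */
    struct wait_block *wait_list;   /* wait blocks waiting on this object */
};

struct thread_obj {
    struct wait_block *wait_block_list; /* all blocks of the current wait call */
};

struct wait_block {
    struct dispatcher_header *object;   /* what is being waited for */
    struct thread_obj *thread;          /* who is waiting */
    int key;                            /* position in the handle array */
    bool wait_all;                      /* "type": any vs. all */
    struct wait_block *next_for_object;
    struct wait_block *next_for_thread;
};

/* Chain one wait block onto both the object's and the thread's list. */
void wait_block_link(struct wait_block *wb, struct thread_obj *t,
                     struct dispatcher_header *obj, int key, bool wait_all)
{
    assert(wb != NULL && t != NULL && obj != NULL);
    wb->object = obj;
    wb->thread = t;
    wb->key = key;
    wb->wait_all = wait_all;
    wb->next_for_object = obj->wait_list;
    obj->wait_list = wb;
    wb->next_for_thread = t->wait_block_list;
    t->wait_block_list = wb;
}

/* wait-any: lowest key of a signaled object, or -1.
   wait-all: 0 if every object is signaled, else -1. */
int thread_wait_satisfied(const struct thread_obj *t)
{
    assert(t != NULL);
    if (!t->wait_block_list)
        return -1;
    bool all = true, wait_all = false;
    int any_key = -1;
    for (const struct wait_block *wb = t->wait_block_list; wb; wb = wb->next_for_thread) {
        wait_all = wb->wait_all;
        if (!wb->object->signaled)
            all = false;
        else if (any_key < 0 || wb->key < any_key)
            any_key = wb->key;
    }
    if (wait_all)
        return all ? 0 : -1;
    return any_key;
}
```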
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection ( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection ( &crit );
    __try {
        counter += local_value;
    }
    /* leave exactly once, even if the guarded code raises an exception */
    __finally { LeaveCriticalSection ( &crit ); }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
– hObject specifies the kernel object
– dwTimeout specifies the wait time in msec
– dwTimeout == 0: no wait, check whether the object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to an array identifying these objects
– bWaitAll: whether to wait for the first signaled object or for all objects
– Function returns the index of the first signaled object
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to the non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPTSTR lpszMutexName );
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in the same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex )
– pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
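The blocking and polling variants compared above can be put side by side in a short pthreads sketch. The names counter_lock, counter, add_blocking and add_if_free are made up for this example; only the pthread_mutex_* calls are the real POSIX API.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
int counter = 0;

/* Blocking acquire: the counterpart of WaitForSingleObject(hMutex, INFINITE). */
void add_blocking(int value)
{
    pthread_mutex_lock(&counter_lock);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling acquire: the counterpart of WaitForSingleObject(hMutex, 0).
   Returns false without touching the counter if the mutex is busy. */
bool add_if_free(int value)
{
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc == EBUSY)
        return false;
    assert(rc == 0);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return true;
}
```

One difference worth keeping in mind: a default pthread mutex is not recursive, whereas a Windows mutex can be re-acquired by its owning thread.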
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases the semaphore count by 1
– ReleaseSemaphore() may increment the count by any value
– ReleaseSemaphore() returns the old semaphore count (through lpPreviousCount)
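The counting behaviour can be modelled without any OS support. The sketch below is a single-threaded model of the count arithmetic only (no blocking, no atomicity); struct sem_model, sem_try_wait and sem_release are hypothetical names, and the rule that a release exceeding the maximum fails and leaves the count unchanged follows the documented ReleaseSemaphore() behaviour.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sem_model {
    long count;   /* signaled while count > 0 */
    long max;     /* upper bound fixed at creation */
};

/* A satisfied wait takes one unit; returns false if the caller would block. */
bool sem_try_wait(struct sem_model *s)
{
    assert(s != NULL);
    if (s->count <= 0)
        return false;
    s->count--;
    return true;
}

/* Add n units and report the previous count; fails, leaving the count
   unchanged, if the result would exceed the maximum. */
bool sem_release(struct sem_model *s, long n, long *previous)
{
    assert(s != NULL);
    if (n <= 0 || s->count > s->max - n)
        return false;
    if (previous != NULL)
        *previous = s->count;
    s->count += n;
    return true;
}
```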
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
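Although there is no exact equivalent, a manual-reset event is commonly approximated with a condition variable plus a flag protected by a mutex. The following is a minimal sketch of that idea; struct manual_event and the event_* functions are invented for this example and only the pthread_* calls are real POSIX API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

struct manual_event {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool signaled;           /* stays true until event_reset() */
};

int event_init(struct manual_event *e, bool initial_state)
{
    assert(e != NULL);
    e->signaled = initial_state;
    if (pthread_mutex_init(&e->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&e->cond, NULL) != 0) {
        pthread_mutex_destroy(&e->lock);
        return -1;
    }
    return 0;
}

/* Like SetEvent() on a manual-reset event: wake every waiter. */
void event_set(struct manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(). */
void event_reset(struct manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Like WaitForSingleObject(hEvent, INFINITE); the loop guards against
   spurious wakeups. */
void event_wait(struct manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

void event_destroy(struct manual_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what makes the event "sticky": a thread that arrives after event_set() still passes through event_wait() immediately, which a bare condition variable would not provide.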
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe connecting prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message-oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}
/* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain
107107
Locate a server via mailslot
[Figure: App client 0 … App client n act as mailslot servers; the App Server acts as the mailslot client and sends a message periodically]
Mailslot servers (App client 0 … App client n):
    hMS = CreateMailslot( "\\.\mailslot\status" … ); ReadFile( hMS, &ServStat … ); /* connect to server */
Mailslot client (App Server):
    while (…) { Sleep(…); hMS = CreateFile( "\\*\mailslot\status" … ); … WriteFile( hMS, &StatInfo … ); }
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
3232
3.2 Trap Dispatching, Interrupts, Synchronization
Trap and interrupt dispatching
IRQL levels & interrupt precedence
Spinlocks and kernel synchronization
Executive synchronization
3333
Kernel Mode Versus User Mode
A processor state
Controls access to memory
Each memory page is tagged to show the required mode for reading and for writing
– Protects the system from the users
– Protects the user (process) from themselves
– System is not protected from system
Code regions are tagged "no write in any mode"
Controls ability to execute privileged instructions
A Windows abstraction
– Intel: Ring 0, Ring 3
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
– Does not affect scheduling
– Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
– "Privileged Time" and "User Time"
– 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc. like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
– Interrupt dispatcher invokes the interrupt service routine
– ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
– ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time not charged to interrupted thread
3737
Trap dispatching
Trap: processor's mechanism to capture an executing thread
– Switch from user to kernel mode
– Interrupts – asynchronous
– Exceptions – synchronous
[Figure: trap handlers route each kind of trap: interrupt → interrupt dispatcher → interrupt service routines; system service call → system service dispatcher → system services; HW exceptions and SW exceptions → exception dispatcher → exception handlers; virtual address exceptions → virtual memory manager's pager]
Interrupt Dispatching
Interrupt dispatch routine
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch
[Figure: an interrupt transfers control from user- or kernel-mode code to the kernel-mode interrupt dispatch routine, which calls the interrupt service routine]
3939
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
4040
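Interrupt Precedence via IRQLs (x86)
[Figure: x86 IRQL table, highest to lowest]
– 31: High
– 30: Power fail
– 29: Interprocessor interrupt
– 28: Clock
– Profile & Synch (Srv 2003)
– Device levels, down to Device 1 (hardware interrupts)
– 2: Dispatch/DPC (deferrable software interrupt)
– 1: APC (deferrable software interrupt)
– 0: Passive/Low (normal thread execution)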
4141
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library – Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of an ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected with one IRQL
4343
Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document, but has never been used
Inter-processor interrupt
– Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update the system's clock and the allocation of CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler – Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all; normal thread execution
4545
IRQLs on 64-bit Systems
[Figure: IRQL tables for x64 and IA64, highest to lowest]
x64
– 15: High/Profile
– 14: Interprocessor interrupt/Power
– 13: Clock
– 12: Synch (Srv 2003)
– Device n … Device 1
– 2: Dispatch/DPC
– 1: APC
– 0: Passive/Low
IA64
– 15: High/Profile/Power
– 14: Interprocessor interrupt
– 13: Clock
– 12: Synch (MP only)
– Device n … Device 1 (4)
– 3: Correctable Machine Check
– 2: Dispatch/DPC & Synch (UP only)
– 1: APC
– 0: Passive/Low
4646
Interrupt Prioritization amp Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
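The two deterministic mappings above are simple arithmetic. The sketch below restates them as C functions; irql_x86_up and irql_from_idt_vector are hypothetical helper names, and the input ranges asserted here (16 legacy IRQ lines, 256 IDT vectors) are assumptions made for the example.

```c
#include <assert.h>

/* x86 uniprocessor: IRQL = 27 - IRQ, so a lower IRQ number means a higher IRQL. */
int irql_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 15);
    return 27 - irq;
}

/* x64 and IA64: IRQL = IDT vector number / 16 (integer division),
   so each IRQL covers a block of 16 consecutive vectors. */
int irql_from_idt_vector(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

For instance, IRQ 0 maps to IRQL 27 on an x86 UP system, and vectors 0x40 through 0x4F all map to IRQL 4 on x64.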
4747
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when the kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous IO operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
5050
Kernel Synchronization
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Processor A and Processor B each execute:
    do acquire_spinlock(DPC) until (SUCCESS)
    begin remove DPC from queue end      (critical section)
    release_spinlock(DPC)
[Figure: Processor A and Processor B contend for the spinlock that guards the DPC queue]
5151
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
– Thus the busy-wait loop requires no access to the memory bus
When releasing the lock, the first processor's flag is modified
– Exactly one processor is being signaled
– Pre-determined wait order
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of the use of spinlocks & the length of time they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and Address Windowing Extensions (AWE) related locks
5454
Waiting
Flexible wait calls
ndash Wait for one or multiple objects in one call
ndash Wait for multiple can wait for ldquoanyrdquo one or ldquoallrdquo at onceldquoAllrdquo all objects must be in the signalled state concurrently to resolve the wait
ndash All wait calls include optional timeout argument
ndash Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
ndash Events (may be auto-reset or manual reset may be set or ldquopulsedrdquo)
ndash Mutexes (ldquomutual exclusionrdquo one-at-a-time)
ndash Semaphores (n-at-a-time)
ndash Timers
ndash Processes and Threads (signalled upon exit or terminate)
ndash Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
ndash If multiple threads are waiting for an object and only one thread is released (eg itrsquos a mutex or auto-reset event) which thread gets released is unpredictable
5757
Executive Synchronization
Thread waitson an object
handle
Create and initialize thread object
Initialized
Ready
Transition
Waiting
Running
Terminated
Standby
Wait is completeSet object to
signaled state
Interaction with thread scheduling
Waiting on Dispatcher Objects ndash outside the kernel

Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up the waiting threads; variable-priority threads get a priority boost

Interaction between Synchronization & Dispatching
Dispatcher reschedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later

What signals an object
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Mutex (kernel mode)
- Nonsignaled to signaled: owning thread releases mutex
- Signaled to nonsignaled: resumed thread acquires mutex
- Effect on waiting threads: kernel resumes one waiting thread
Mutex (exported to user mode)
- Nonsignaled to signaled: owning thread or other thread releases mutex
- Signaled to nonsignaled: resumed thread acquires mutex
- Effect on waiting threads: kernel resumes one waiting thread
Semaphore
- Nonsignaled to signaled: one thread releases the semaphore, freeing a resource
- Signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect on waiting threads: kernel resumes one or more waiting threads

What signals an object (contd)
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Event
- Nonsignaled to signaled: a thread sets the event
- Signaled to nonsignaled: kernel resumes one or more threads
- Effect on waiting threads: kernel resumes one or more waiting threads
Event pair
- Nonsignaled to signaled: dedicated thread sets one event in the event pair
- Signaled to nonsignaled: kernel resumes the other dedicated thread
- Effect on waiting threads: kernel resumes the waiting dedicated thread
Timer
- Nonsignaled to signaled: timer expires
- Signaled to nonsignaled: a thread (re)initializes the timer
- Effect on waiting threads: kernel resumes all waiting threads
Thread
- Nonsignaled to signaled: thread terminates
- Signaled to nonsignaled: a thread reinitializes the thread object
- Effect on waiting threads: kernel resumes all waiting threads

Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)

3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects

Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually the ISR) queues the request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
  - See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx

Deferred Procedure Calls (DPCs)
[Figure: DPC queue, shown as a queue head linked to a chain of DPC objects]

Delivering a DPC
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with levels from High and Power failure down through Dispatch/DPC and APC to Low, a DPC queue, and the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
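The four delivery steps can be sketched as a tiny user-mode simulation. This is a minimal model under stated assumptions, not kernel code: `struct cpu`, `queue_dpc`, `lower_irql`, and `trace_dpc` are hypothetical names, the numeric levels are taken from the x86 IRQL values used in this unit, and a real kernel delivers the DPC through a software interrupt rather than a plain function call.

```c
#include <assert.h>
#include <stddef.h>

enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 };

typedef void (*dpc_routine)(void *context);

struct dpc {
    dpc_routine routine;
    void *context;
    struct dpc *next;
};

/* One simulated processor: current IRQL plus a FIFO queue of pending DPCs. */
struct cpu {
    int irql;
    struct dpc *head;
    struct dpc *tail;
};

/* Step 1: queue a DPC. Nothing runs yet. */
void queue_dpc(struct cpu *cpu, struct dpc *d)
{
    assert(d->routine != NULL);
    d->next = NULL;
    if (cpu->tail != NULL)
        cpu->tail->next = d;
    else
        cpu->head = d;
    cpu->tail = d;
}

/* Steps 2-4: when IRQL drops below dispatch level, run every queued DPC
 * at dispatch level in FIFO order, then settle at the requested IRQL. */
void lower_irql(struct cpu *cpu, int new_irql)
{
    if (new_irql < DISPATCH_LEVEL && cpu->head != NULL) {
        cpu->irql = DISPATCH_LEVEL;
        while (cpu->head != NULL) {
            struct dpc *d = cpu->head;
            cpu->head = d->next;
            if (cpu->head == NULL)
                cpu->tail = NULL;
            d->routine(d->context);
        }
    }
    cpu->irql = new_irql;
}

/* Example DPC routine: appends its digit to a decimal trace of the run order. */
struct trace_ctx { int *trace; int digit; };

void trace_dpc(void *context)
{
    struct trace_ctx *t = context;
    *t->trace = *t->trace * 10 + t->digit;
}
```

Queuing two `trace_dpc` entries while the simulated processor is at a device IRQL leaves the trace untouched; they run only once `lower_irql` is called with a level below `DISPATCH_LEVEL`.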

Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, and call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
- Permission required for user-mode APCs

Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make a thread suspend/terminate itself (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()

Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless the thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented

Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when the thread is in an "alertable wait"
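The deliverability rules for the three APC kinds can be written down as one predicate. This is a sketch that only restates the rules listed on these slides; `apc_deliverable`, `struct thread_state`, and the enum names are hypothetical and do not correspond to kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

enum apc_kind { APC_SPECIAL_KERNEL, APC_ORDINARY_KERNEL, APC_USER };

struct thread_state {
    int  irql;                  /* 0 = passive, 1 = APC level, ... */
    bool kernel_apcs_disabled;  /* e.g. after KeEnterCriticalRegion */
    bool alertable_wait;        /* e.g. blocked in SleepEx or WaitForMultipleObjectsEx */
};

bool apc_deliverable(enum apc_kind kind, const struct thread_state *t)
{
    assert(t->irql >= 0);
    switch (kind) {
    case APC_SPECIAL_KERNEL:
        return t->irql < 1;                               /* blocked only at IRQL 1 or above */
    case APC_ORDINARY_KERNEL:
        return t->irql == 0 && !t->kernel_apcs_disabled;  /* IRQL 0 and not disabled */
    case APC_USER:
        return t->alertable_wait;                         /* only in an alertable wait */
    }
    return false;
}
```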

Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel-mode (K) and user-mode (U), each linking APC objects]

IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to the thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to the thread's kernel time
- If IRQL > 2, charge to interrupt time
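The charging rule applied at each clock tick can be sketched as a single decision function. The function and enum names are hypothetical; the `was_user_mode` flag is an assumption standing in for whatever the clock ISR inspects to choose between user and kernel time.

```c
#include <assert.h>
#include <stdbool.h>

enum time_bucket {
    THREAD_USER_TIME,
    THREAD_KERNEL_TIME,
    DPC_TIME,
    INTERRUPT_TIME
};

/* Decide which counter receives the clock interval that just ended. */
enum time_bucket charge_tick(int irql, bool in_dpc, bool was_user_mode)
{
    assert(irql >= 0);
    if (irql > 2)
        return INTERRUPT_TIME;
    if (irql == 2)
        return in_dpc ? DPC_TIME : THREAD_KERNEL_TIME;
    return was_user_mode ? THREAD_USER_TIME : THREAD_KERNEL_TIME;
}
```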

IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)

Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (they are not really processes)
- Context switches shown for these are really the number of interrupts & DPCs

Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add the Context Switch Delta column
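The quirk follows from sampling: a thread is charged only if it happens to be running when the clock fires. A minimal sketch, assuming ticks fire at exact multiples of the clock interval (`ticks_charged` is a hypothetical helper, with times in microseconds):

```c
#include <assert.h>

/* Number of clock ticks that fire while a thread runs from start_us for
 * run_us microseconds, with ticks at tick_us, 2*tick_us, and so on.
 * Each counted tick charges one whole interval to the thread; a run that
 * fits between two ticks is charged nothing. */
unsigned long ticks_charged(unsigned long start_us, unsigned long run_us,
                            unsigned long tick_us)
{
    assert(tick_us > 0);
    unsigned long end_us = start_us + run_us;
    /* count the multiples of tick_us in the interval (start_us, end_us] */
    return end_us / tick_us - start_us / tick_us;
}
```

With a 10 msec clock, a thread that runs for 3 msec starting 1 msec after a tick is never charged, while a thread that runs for only 2 msec across a tick boundary is charged a full interval.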

Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
- Example: pstat

Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- Some exist exclusively for synchronization, e.g., events, mutexes ("mutants"), semaphores, queues, timers
- Others can be waited for as a side effect of their prime function, e.g., processes, threads, file objects
- Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "Signaled" vs. "nonsignaled"
- When signaled, a wait on the object is satisfied
- Different object types differ in terms of what changes their state
- Wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: a header holding Size, Type, State, and Wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]

Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (each with Size, Type, State, Wait list head, and object-type-specific data) and thread objects (each with a WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type, and Next link]
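The relationships in the wait-block figure can be sketched with plain C structures. This is a simplified, hypothetical layout for illustration only: the field names follow the figure labels, but these are not the definitions in ntddk.h, and the real kernel uses doubly linked and circular lists rather than the singly linked chains below.

```c
#include <assert.h>
#include <stddef.h>

enum wait_type { WAIT_ALL_OBJECTS, WAIT_ANY_OBJECT };

struct wait_block;

/* Common header shared by every dispatcher object (simplified). */
struct dispatcher_header {
    int type;
    int size;
    int signal_state;              /* nonzero = signaled */
    struct wait_block *wait_list;  /* wait blocks of threads waiting on this object */
};

struct thread {
    struct wait_block *wait_block_list;  /* all blocks of the current wait call */
};

struct wait_block {
    struct wait_block *next_on_object;   /* "list entry": next waiter on the same object */
    struct dispatcher_header *object;
    struct thread *thread;
    int key;                             /* position in the wait call's handle array */
    enum wait_type type;
    struct wait_block *next_link;        /* next block of the same wait call */
};

/* Link one wait block per object to both the object and the waiting thread. */
void begin_wait(struct thread *t, struct dispatcher_header *objects[],
                struct wait_block blocks[], size_t count, enum wait_type type)
{
    assert(count > 0);
    t->wait_block_list = &blocks[0];
    for (size_t i = 0; i < count; i++) {
        struct wait_block *b = &blocks[i];
        b->object = objects[i];
        b->thread = t;
        b->key = (int)i;
        b->type = type;
        b->next_link = (i + 1 < count) ? &blocks[i + 1] : NULL;
        b->next_on_object = objects[i]->wait_list;  /* push onto the object's list */
        objects[i]->wait_list = b;
    }
}
```

Because every block points at both its object and its thread, signaling an object lets the kernel find the waiting threads, and a thread's chain lets the kernel remove all of its blocks once the wait resolves.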

3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots

Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section without trying to enter it (TryEnterCriticalSection)
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );

Critical Section Example
    /* counter is global, shared by all threads */
    volatile int counter = 0;
    CRITICAL_SECTION crit;
    InitializeCriticalSection( &crit );
    /* ... main loop in any of the threads */
    while (!done) {
        EnterCriticalSection( &crit );
        __try {
            counter += local_value;
        }
        __finally {
            /* leave exactly once per enter, even if an exception occurs */
            LeaveCriticalSection( &crit );
        }
    }
    DeleteCriticalSection( &crit );

Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );

Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
  - dwTimeout == 0: no wait; check whether the object is signaled
  - dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to an array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects
  - When waiting for any object, the function returns the index of the first signaled object (as WAIT_OBJECT_0 + index)
Side effects
- Mutexes, auto-reset events, and waitable timers will be reset to the non-signaled state after completing wait functions

Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() closes the handle; the mutex object is freed when its last handle is closed

Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );

Mutex Example
    /* counter is global, shared by all threads */
    volatile int done, counter = 0;
    HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
    /* main loop in any of the threads; ret is local */
    DWORD ret;
    while (!done) {
        ret = WaitForSingleObject( mutex, INFINITE );
        if (ret == WAIT_OBJECT_0)
            counter += local_value;
        else            /* mutex was abandoned */
            break;      /* exit the loop */
        ReleaseMutex( mutex );
    }
    CloseHandle( mutex );

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
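For comparison with the Windows mutex example, here is a minimal pthreads sketch of the same shared-counter pattern. The names `run_counter_demo`, `try_increment`, and `worker` are hypothetical helpers introduced for this sketch; only the `pthread_*` calls are part of the POSIX API. Build with your compiler's pthread option (for example `-pthread`).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;   /* shared by all threads, protected by counter_lock */

struct worker_arg { long iterations; };

static void *worker(void *p)
{
    const struct worker_arg *arg = p;
    for (long i = 0; i < arg->iterations; i++) {
        pthread_mutex_lock(&counter_lock);    /* blocks until the mutex is free */
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs `threads` workers and returns the final counter value. */
long run_counter_demo(int threads, long iterations)
{
    enum { MAX_THREADS = 16 };
    assert(threads > 0 && threads <= MAX_THREADS);
    pthread_t ids[MAX_THREADS];
    struct worker_arg arg = { iterations };
    counter = 0;
    for (int i = 0; i < threads; i++) {
        int rc = pthread_create(&ids[i], NULL, worker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < threads; i++)
        pthread_join(ids[i], NULL);
    return counter;
}

/* Non-blocking attempt, comparable to WaitForSingleObject with a zero timeout. */
int try_increment(void)
{
    if (pthread_mutex_trylock(&counter_lock) != 0)
        return 0;                             /* mutex is held by another thread */
    counter++;
    pthread_mutex_unlock(&counter_lock);
    return 1;
}
```

With the lock in place, `run_counter_demo(4, 10000)` yields 40000 regardless of how the threads interleave.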

Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value
- ReleaseSemaphore() reports the old semaphore count through lpPreviousCount
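The counting rules can be sketched as a single-threaded model. This is not a usable synchronization primitive (there is no locking or blocking); `struct sem_model` and its functions are hypothetical names that only mirror the arithmetic. The one behavior added beyond the slide is that a release which would push the count past the maximum fails and leaves the count unchanged, as `ReleaseSemaphore()` does.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sem_model {
    long count;  /* current count; the semaphore is signaled while count > 0 */
    long max;    /* upper bound fixed at creation */
};

bool sem_is_signaled(const struct sem_model *s)
{
    return s->count > 0;
}

/* A satisfied wait consumes one unit; false means a real wait would block. */
bool sem_try_wait(struct sem_model *s)
{
    if (s->count == 0)
        return false;
    s->count--;
    return true;
}

/* Adds release_count units and reports the previous count.
 * Fails, changing nothing, if the maximum would be exceeded. */
bool sem_release(struct sem_model *s, long release_count, long *previous)
{
    assert(release_count > 0);
    if (release_count > s->max - s->count)
        return false;
    if (previous != NULL)
        *previous = s->count;
    s->count += release_count;
    return true;
}
```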

Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- A manual-reset event can signal several threads simultaneously; it must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- An auto-reset event signals a single thread; the event is reset automatically
- fInitialState == TRUE: create the event in the signaled state

Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );

Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
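Although pthreads has no direct counterpart, an event can be approximated by pairing a condition variable with a flag. The following is one possible emulation, not a standard API: `struct event` and the `event_*` functions are hypothetical names, and the sketch omits timeouts and `PulseEvent()` semantics.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

struct event {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool signaled;
    bool manual_reset;
};

void event_init(struct event *e, bool manual_reset, bool initial_state)
{
    assert(e != NULL);
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
    e->manual_reset = manual_reset;
}

void event_set(struct event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    if (e->manual_reset)
        pthread_cond_broadcast(&e->cond);  /* release every waiter */
    else
        pthread_cond_signal(&e->cond);     /* release a single waiter */
    pthread_mutex_unlock(&e->lock);
}

void event_reset(struct event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

void event_wait(struct event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                   /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    if (!e->manual_reset)
        e->signaled = false;               /* auto-reset: consume the signal */
    pthread_mutex_unlock(&e->lock);
}

bool event_is_set(struct event *e)
{
    pthread_mutex_lock(&e->lock);
    bool state = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return state;
}

void event_destroy(struct event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what makes the state persistent: a condition variable by itself does not remember a signal sent while no thread was waiting, which is why it is not an exact equivalent of an event.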

Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on a pipe handle will block if the pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
[Figure: main, with prog1 and prog2 connected by a pipe]

I/O Redirection using an Anonymous Pipe
    /* Create default size anonymous pipe, handles are inheritable */
    if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
        fprintf(stderr, "Anon pipe create failed\n"); exit(1);
    }
    /* Set output handle to pipe handle, create first process */
    StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
    StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    StartInfoCh1.hStdOutput = hWritePipe;
    StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
        fprintf(stderr, "CreateProc1 failed\n"); exit(2);
    }
    CloseHandle (hWritePipe);

Pipe example (contd)
    /* Repeat (symmetrically) for the second process */
    StartInfoCh2.hStdInput = hReadPipe;
    StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
    StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
            0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
        fprintf(stderr, "CreateProc2 failed\n"); exit(3);
    }
    CloseHandle (hReadPipe);
    /* Wait for both processes to complete */
    WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
    WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
    CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
    CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
    return 0;

Named Pipes
Message-oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using distinct instances of the same named pipe
- Server can respond to a client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions

Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. means the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)

Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );

Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo

Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence

Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios

Client Example using Named Pipe
    WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
    hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
            0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hNamedPipe == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "Failure to locate server\n"); exit(3);
    }
    /* Write the request */
    WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
    /* Read each response and send it to std out */
    while (ReadFile (hNamedPipe, Response.Record, MAX_RQRS_LEN, &nRead, NULL))
        printf ("%s", Response.Record);
    CloseHandle (hNamedPipe);
    return 0;

Server Example Using a Named Pipe
    hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
            PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
            1, 0, 0, CS_TIMEOUT, pNPSA);
    while (!Done) {
        printf ("Server is awaiting next request\n");
        if (!ConnectNamedPipe (hNamedPipe, NULL)
                || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
            fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
        }
        printf ("Request is: %s\n", Request.Record);
        /* Send the file, one line at a time, to the client */
        fp = fopen (File, "r");
        while ((fgets (Response.Record, MAX_RQRS_LEN, fp) != NULL))
            WriteFile (hNamedPipe, &Response.Record,
                    (strlen(Response.Record) + 1) * TSIZE, &nXfer, NULL);
        fclose (fp);
        DisconnectNamedPipe (hNamedPipe);
    } /* End of server operation */

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain

Locate a server via mailslot
[Figure: application clients 0 through n act as mailslot servers; the application server acts as the mailslot client and sends its status message periodically]
Mailslot servers (App client 0 ... App client n):
    hMS = CreateMailslot( "\\.\mailslot\status" );
    ReadFile( hMS, &ServStat );
    /* connect to server */
Mailslot client (App Server):
    while (...) {
        Sleep(...);
        hMS = CreateFile( "\\*\mailslot\status" );
        WriteFile( hMS, &StatInfo );
    }
Message is sent periodically

Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait

Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieve handle for the local mailslot
- \\host\mailslot\[path]name: retrieve handle for the mailslot on the specified host
- \\domain\mailslot\[path]name: returns a handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns a handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts

Thoughts Change Life
5353
SMP Scalability Improvements
XP2003
ndash Minimized lock contention for hot locks PFN or Page Frame Database lock
ndash Some locks completely eliminatedCharging nonpagedpaged pool quotas allocating and mapping system page table entries charging commitment of pages allocatingmapping physical memory through AWE functions
ndash New more efficient locking mechanism (pushlocks)Doesnrsquot use spinlocks when no contentionUsed for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
ndash Wait for one or multiple objects in one call
ndash Wait for multiple can wait for ldquoanyrdquo one or ldquoallrdquo at onceldquoAllrdquo all objects must be in the signalled state concurrently to resolve the wait
ndash All wait calls include optional timeout argument
ndash Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
ndash Events (may be auto-reset or manual reset may be set or ldquopulsedrdquo)
ndash Mutexes (ldquomutual exclusionrdquo one-at-a-time)
ndash Semaphores (n-at-a-time)
ndash Timers
ndash Processes and Threads (signalled upon exit or terminate)
ndash Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
ndash If multiple threads are waiting for an object and only one thread is released (eg itrsquos a mutex or auto-reset event) which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Figure: interaction with thread scheduling. Thread states: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. A thread object is created and initialized; the thread waits on an object handle; the object is set to the signaled state; the wait is complete.]
5858
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher reschedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
Dispatcher object | System events and resulting state change | Effect of signaled state on waiting threads
Mutex (kernel mode) | nonsignaled -> signaled: owning thread releases mutex; signaled -> nonsignaled: resumed thread acquires mutex | Kernel resumes one waiting thread
Mutex (exported to user mode) | nonsignaled -> signaled: owning thread or other thread releases mutex; signaled -> nonsignaled: resumed thread acquires mutex | Kernel resumes one waiting thread
Semaphore | nonsignaled -> signaled: one thread releases the semaphore, freeing a resource; signaled -> nonsignaled: a thread acquires the semaphore and more resources are not available | Kernel resumes one or more waiting threads
6161
What signals an object (contd)
Dispatcher object | System events and resulting state change | Effect of signaled state on waiting threads
Event | nonsignaled -> signaled: a thread sets the event; signaled -> nonsignaled: kernel resumes one or more threads | Kernel resumes one or more waiting threads
Event pair | nonsignaled -> signaled: dedicated thread sets one event in the event pair; signaled -> nonsignaled: kernel resumes the other dedicated thread | Kernel resumes waiting dedicated thread
Timer | nonsignaled -> signaled: timer expires; signaled -> nonsignaled: a thread (re)initializes the timer | Kernel resumes all waiting threads
Thread | nonsignaled -> signaled: thread terminates; signaled -> nonsignaled: a thread reinitializes the thread object | Kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues amp Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: DPC queue: a queue head linked to a chain of DPC objects]
6666
Delivering a DPC
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table (IRQLs from High and Power failure down to Dispatch/DPC, APC, and Low), the DPC queue, and the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads; kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: thread object with a kernel-mode (K) and a user-mode (U) APC queue, each linking APC objects]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
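The clock ISR charging rules can be sketched as a small classification function. This is a simplified model, not kernel code: the enum, the function name charge_tick, and its parameters are hypothetical; only the IRQL thresholds come from the slide.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical buckets for one clock tick; names are illustrative only. */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_bucket;

/* Decide where one clock tick is charged, given the interrupted state.
 * irql:      IRQL the processor was at when the clock interrupt fired
 * in_dpc:    true if a DPC routine was executing
 * user_mode: true if the interrupted code was running in user mode */
charge_bucket charge_tick(int irql, bool in_dpc, bool user_mode)
{
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: charge the interrupted thread's user or kernel time. */
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Because the decision is made only when the clock fires, whatever ran between two ticks is invisible to this accounting.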
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted for
- E.g. one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout: common header (Size, Type, State, wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects, each with a WaitBlockList pointing to a chain of wait blocks (list entry, Object, Thread, Key, Type, Next link); each wait block is also linked into the wait list head of the dispatcher object (Size, Type, State, wait list head, object-type-specific data) it refers to]
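How a chain of wait blocks resolves an "any" or "all" wait can be sketched with a toy model. The structures below are hypothetical simplifications (one boolean of state, a singly linked chain); they do not reproduce the real kernel layouts, and wait_satisfied is an illustrative name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { WAIT_TYPE_ANY, WAIT_TYPE_ALL } wait_type;

/* Simplified stand-in for the common dispatcher object header. */
typedef struct {
    bool signaled;
} dispatcher_header;

/* One wait block per object handle passed to the wait call. */
typedef struct wait_block {
    dispatcher_header *object;  /* object being waited on */
    wait_type type;             /* "any" or "all" */
    size_t key;                 /* position in the caller's handle array */
    struct wait_block *next;    /* next block of the same wait call */
} wait_block;

/* Returns true if the wait described by the chain is satisfied.
 * For an "any" wait, *key_out receives the key of the first
 * signaled object found along the chain. */
bool wait_satisfied(const wait_block *head, size_t *key_out)
{
    assert(head != NULL);
    wait_type type = head->type;
    for (const wait_block *b = head; b != NULL; b = b->next) {
        if (type == WAIT_TYPE_ANY && b->object->signaled) {
            if (key_out != NULL)
                *key_out = b->key;
            return true;
        }
        if (type == WAIT_TYPE_ALL && !b->object->signaled)
            return false;
    }
    /* Chain exhausted: "all" succeeded, "any" found nothing signaled. */
    return type == WAIT_TYPE_ALL;
}
```

The key plays the role described on the slide: it lets the caller map the satisfied wait back to a position in the handle array.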
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once, even if the guarded code raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
- dwTimeout == 0: no wait, check whether object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects; when waiting for any, the function returns WAIT_OBJECT_0 plus the index of the first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
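A minimal sketch of the blocking and polling variants on the POSIX side, assuming a default (non-recursive) mutex. The names counter_lock, add_blocking and add_if_free are hypothetical; the pthread calls are the standard ones listed above.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

/* Blocking update: waits until the mutex is available
 * (the analogue of WaitForSingleObject with INFINITE). */
void add_blocking(int value)
{
    pthread_mutex_lock(&counter_lock);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling update: returns false immediately if the mutex is busy
 * (the analogue of WaitForSingleObject with a timeout of 0). */
bool add_if_free(int value)
{
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc == EBUSY)
        return false;
    assert(rc == 0);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return true;
}
```

A caller that gets false from add_if_free can do other work and retry later instead of blocking.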
8888
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value
- ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
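Since there is no exact equivalent, a manual-reset event is usually emulated by pairing a condition variable with a mutex and a state flag. The sketch below is one such emulation under that assumption; the type manual_event and the event_* functions are hypothetical names, not part of pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical manual-reset event built from POSIX primitives. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} manual_event;

void event_init(manual_event *e, bool initial_state)
{
    assert(e != NULL);
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent(): the event stays signaled and every waiter is released. */
void event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(): later waiters block again. */
void event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; the loop guards against
 * spurious wakeups. The event is left signaled (manual reset). */
void event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

/* Non-blocking state check, comparable to a wait with timeout 0. */
bool event_is_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    bool s = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return s;
}

void event_destroy(manual_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what makes the state persistent: a condition variable alone forgets a broadcast that arrives before a thread starts waiting. Note that a waiter woken by event_set can still miss the event if event_reset runs before the waiter reacquires the mutex.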
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates prog1 and prog2, connected by a pipe]
Half-duplex character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same instance
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. means local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe() to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while (fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL)
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (Windows 2000: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
107107
Locate a server via mailslot
[Figure: application clients 0..n act as mailslot servers; the application server acts as the mailslot client and sends its status message periodically]
Mailslot servers (app client 0 .. app client n)
hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile( hMS, &ServStat ); /* connect to server */
Mailslot client (app server)
while (...) { Sleep(...); hMS = CreateFile( "\\.\mailslot\status" ); ... WriteFile( hMS, &StatInfo ); }
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieve handle for local mailslot
- \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
3434
Kernel Mode Versus User Mode
Control flow (a thread) can change from user to kernel mode and back
- Does not affect scheduling
- Thread context includes info about execution mode (along with registers, etc.)
PerfMon counters
- "Privileged Time" and "User Time"
- 4 levels of granularity: thread, process, processor, system
3535
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
- Via the system service dispatch mechanism
- Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
- Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
- Scheduled, preempted, etc. like any other threads
3636
Getting Into Kernel Mode
3. Interrupts from external devices
- Interrupt dispatcher invokes the interrupt service routine
- ISR runs in the context of the interrupted thread, so-called "arbitrary thread context"
- ISR often requests the execution of a "DPC routine", which also runs in kernel mode
- Time not charged to interrupted thread
3737
Trap dispatching
[Figure: trap handlers. Interrupts go to the interrupt dispatcher, which calls interrupt service routines; system service calls go to the system service dispatcher, which calls system services; hardware and software exceptions go to the exception dispatcher, which calls exception handlers; virtual address exceptions go to the virtual memory manager's pager]
Trap: processor's mechanism to capture executing thread
- Switch from user to kernel mode
- Interrupts: asynchronous
- Exceptions: synchronous
Interrupt Dispatching
[Figure: an interrupt arrives while user or kernel mode code is running; control passes to the interrupt dispatch routine, which calls the interrupt service routine, both in kernel mode]
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
Note: no thread or process context switch
3939
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
4040
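Interrupt Precedence via IRQLs (x86)
IRQL | Level | Category
31 | High | Hardware interrupts
30 | Power fail | Hardware interrupts
29 | Interprocessor interrupt | Hardware interrupts
28 | Clock | Hardware interrupts
Between 28 and 2 | Profile & Synch (Srv 2003), then device levels down to Device 1 | Hardware interrupts
2 | Dispatch/DPC | Deferrable software interrupts
1 | APC | Deferrable software interrupts
0 | Passive/Low | Normal thread execution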
4141
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library, the Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected with one IRQL
4343
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update system's clock, allocation of CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler, Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all, normal thread execution
4545
IRQLs on 64-bit Systems
IRQL | x64 | IA64
15 | High/Profile | High/Profile/Power
14 | Interprocessor interrupt/Power | Interprocessor interrupt
13 | Clock | Clock
12 | Synch (Srv 2003) | Synch (MP only)
4 to 11 | Device levels up to Device n | Device 1 .. Device n
3 | Device 1 | Correctable Machine Check
2 | Dispatch/DPC | Dispatch/DPC & Synch (UP only)
1 | APC | APC
0 | Passive/Low | Passive/Low
4646
Interrupt Prioritization amp Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device; can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the I/O APIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
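The two deterministic IRQL rules are simple arithmetic and can be written down directly. The function names are hypothetical, and the x64/IA64 rule is read here as integer division of the IDT vector number by 16, which is an assumption about the intended formula.

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_from_irq_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 27);
    return 27 - irq;
}

/* x64 and IA64 rule, assuming integer division:
 * IRQL = IDT vector number / 16, so each IRQL covers 16 vectors. */
int irql_from_vector_x64(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

Under these rules, IRQ 1 on an x86 UP system maps to IRQL 26, and IDT vector 0x51 on x64 maps to IRQL 5. The bucketized x86 MP assignment has no closed formula and is not modeled.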
4747
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when kernel is deep within many layers of code
- Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
5050
Kernel Synchronization
[Figure: Processor A and Processor B each run the sequence below; a spinlock guards the critical section around the shared DPC queue]
do acquire_spinlock(DPC) until (SUCCESS)
begin
  remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
- Thus the busy-wait loop requires no access to the memory bus
When releasing the lock, the flag of the first waiting processor is modified
- Exactly one processor is signaled
- Pre-determined wait order
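The queueing idea can be sketched with a list-based queue lock in the style of the MCS lock, written with C11 atomics. This shows the general technique, not the Windows implementation; the types qnode and qlock and the qlock_* functions are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-waiter node; each processor spins only on its own 'locked' flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool locked;
} qnode;

/* The lock itself is just a pointer to the tail of the waiter queue. */
typedef struct {
    _Atomic(qnode *) tail;
} qlock;

void qlock_init(qlock *l)
{
    atomic_init(&l->tail, NULL);
}

void qlock_acquire(qlock *l, qnode *me)
{
    assert(me != NULL);
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    /* Append this node to the queue in one atomic step. */
    qnode *prev = atomic_exchange(&l->tail, me);
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->locked)) {
            /* spin on the local flag only */
        }
    }
}

void qlock_release(qlock *l, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        /* No visible successor: try to mark the lock free. */
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;
        /* A successor is enqueueing; wait until it links itself in. */
        while ((succ = atomic_load(&me->next)) == NULL) {
        }
    }
    /* Hand the lock to exactly one waiter, in queue order. */
    atomic_store(&succ->locked, false);
}
```

Each caller passes its own node, for example one per thread or per processor; the release touches only the successor's flag, which gives the single-signal, first-in-first-out behavior described above.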
5252
SMP Scalability Improvements
Windows 2000 queued spinlocks
ndash qlocks in Kernel Debugger
Server 2003
ndash More spinlocks eliminated (context swap system space commit)
ndash Further reduction of use of spinlocks amp length they are held
ndash Scheduling database now per-CPUAllows thread state transitions in parallel
5353
SMP Scalability Improvements
XP2003
ndash Minimized lock contention for hot locks PFN or Page Frame Database lock
ndash Some locks completely eliminatedCharging nonpagedpaged pool quotas allocating and mapping system page table entries charging commitment of pages allocatingmapping physical memory through AWE functions
ndash New more efficient locking mechanism (pushlocks)Doesnrsquot use spinlocks when no contentionUsed for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
ndash Wait for one or multiple objects in one call
ndash Wait for multiple can wait for ldquoanyrdquo one or ldquoallrdquo at onceldquoAllrdquo all objects must be in the signalled state concurrently to resolve the wait
ndash All wait calls include optional timeout argument
ndash Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
ndash Events (may be auto-reset or manual reset may be set or ldquopulsedrdquo)
ndash Mutexes (ldquomutual exclusionrdquo one-at-a-time)
ndash Semaphores (n-at-a-time)
ndash Timers
ndash Processes and Threads (signalled upon exit or terminate)
ndash Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
ndash If multiple threads are waiting for an object and only one thread is released (eg itrsquos a mutex or auto-reset event) which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
[Figure: interaction with thread scheduling. Thread states shown: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Labels: "Create and initialize thread object"; "Thread waits on an object handle"; "Set object to signaled state"; "Wait is complete".]
Interaction between Synchronization & Dispatching
A user mode thread waits on an event object's handle
The kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
The kernel wakes up the waiting threads; variable priority threads get a priority boost
Interaction between Synchronization & Dispatching
The dispatcher re-schedules the new thread; it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object
Dispatcher object | System event: nonsignaled to signaled | System event: signaled to nonsignaled | Effect of signaled state on waiting threads
Mutex (kernel mode) | Owning thread releases mutex | Resumed thread acquires mutex | Kernel resumes one waiting thread
Mutex (exported to user mode) | Owning thread or other thread releases mutex | Resumed thread acquires mutex | Kernel resumes one waiting thread
Semaphore | One thread releases the semaphore, freeing a resource | A thread acquires the semaphore; more resources are not available | Kernel resumes one or more waiting threads
What signals an object (contd)
Dispatcher object | System event: nonsignaled to signaled | System event: signaled to nonsignaled | Effect of signaled state on waiting threads
Event | A thread sets the event | Kernel resumes one or more threads | Kernel resumes one or more waiting threads
Event pair | Dedicated thread sets one event in the event pair | Kernel resumes the other dedicated thread | Kernel resumes waiting dedicated thread
Timer | Timer expires | A thread (re)initializes the timer | Kernel resumes all waiting threads
Thread | Thread terminates | A thread reinitializes the thread object | Kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004.
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
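The queue-now, run-later pattern of DPCs can be modeled in a few lines of C. This is a user-mode sketch under simplifying assumptions (a single queue, no IRQL handling, no locking); dpc, dpc_queue, dpc_enqueue, dpc_drain and count_call are hypothetical names, not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*dpc_routine)(void *context);

/* A deferred request: which routine to run later, and with what argument. */
typedef struct dpc {
    struct dpc *next;
    dpc_routine routine;
    void       *context;
} dpc;

/* Stand-in for the per-CPU DPC queue. */
typedef struct {
    dpc *head;
    dpc *tail;
} dpc_queue;

/* What an ISR would do: link the request and return immediately. */
void dpc_enqueue(dpc_queue *q, dpc *d)
{
    assert(q != NULL && d != NULL);
    d->next = NULL;
    if (q->tail != NULL)
        q->tail->next = d;
    else
        q->head = d;
    q->tail = d;
}

/* What happens once the IRQL drops to dispatch level: run every queued
 * routine in FIFO order. Returns the number of routines executed. */
size_t dpc_drain(dpc_queue *q)
{
    size_t ran = 0;
    while (q->head != NULL) {
        dpc *d = q->head;
        q->head = d->next;
        if (q->head == NULL)
            q->tail = NULL;
        d->routine(d->context);
        ran++;
    }
    return ran;
}

/* Example routine: counts how often it was called. */
void count_call(void *context)
{
    ++*(int *)context;
}
```

Queueing is cheap and does no real work, which is what lets an ISR finish quickly and leave the longer processing for the lower level.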
Deferred Procedure Calls (DPCs)
[Figure: DPC queue: a queue head followed by a chain of DPC objects]
Delivering a DPC
[Figure: interrupt dispatch table (High, Power failure, Dispatch/DPC, APC, Low), the DPC queue, and the dispatcher]
1. Timer expires; the kernel queues a DPC that will release all waiting threads. The kernel requests a software interrupt
2. The DPC interrupt occurs when the IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. The dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, and call system services
The APC queue is thread-specific
User mode & kernel mode APCs
– Permission is required for user mode APCs
Asynchronous Procedure Calls (APCs)
The executive uses APCs to complete work in thread space
– Wait for an asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless the thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when the thread is in an "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with kernel (K) and user (U) APC queues, each linking APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to the thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to the thread's kernel time
– If IRQL > 2, charge to interrupt time
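The four accounting rules above map directly onto a decision function. The sketch below is an illustrative model only; charge_target, charge_clock_tick and the was_user_mode parameter are hypothetical names, and the function encodes nothing beyond the listed rules.

```c
#include <assert.h>
#include <stdbool.h>

enum { DISPATCH_IRQL = 2 };   /* dispatch/DPC level */

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* Decides which counter one clock tick is charged to, given the state
 * the clock interrupt found the processor in. */
charge_target charge_clock_tick(int irql, bool in_dpc, bool was_user_mode)
{
    assert(irql >= 0);
    if (irql > DISPATCH_IRQL)
        return CHARGE_INTERRUPT;
    if (irql == DISPATCH_IRQL)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL below 2: ordinary thread execution. */
    return was_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```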
IRQLs and CPU Time Accounting
Time spent servicing interrupts is NOT charged to the interrupted thread, so if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (they are not really processes)
– Context switch counts shown for these are really counts of interrupts and DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add the Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– Some exist exclusively for synchronization, e.g., events, mutexes ("mutants"), semaphores, queues, timers
– Others can be waited for as a side effect of their prime function, e.g., processes, threads, file objects
– Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "Signaled" vs. "nonsignaled"
– When signaled, a wait on the object is satisfied
– Different object types differ in terms of what changes their state
– The wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: a header holding Size, Type, State, and Wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects point through WaitBlockList to chains of wait blocks. Each wait block holds a list entry, Object and Thread pointers, Key, Type, and a next link. Each dispatcher object consists of a header (Size, Type, State, Wait list head) followed by object-type-specific data.]
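The relationship between dispatcher headers, wait blocks, and threads can be written down as C structures. The layout below is a simplified, hypothetical model based only on the field names in the figure; it is not the definition from ntddk.h, and wait_block_attach and waiters_on are illustrative helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { WAIT_ANY, WAIT_ALL } wait_type;

struct wait_block;

/* Common header shared by every dispatcher object (simplified). */
typedef struct dispatcher_header {
    unsigned char      type;
    unsigned char      size;
    bool               signaled;    /* State */
    struct wait_block *wait_list;   /* Wait list head */
} dispatcher_header;

typedef struct thread_object {
    struct wait_block *wait_block_list;   /* blocks of the current wait call */
} thread_object;

/* One wait block per handle passed to the wait call. */
typedef struct wait_block {
    struct wait_block *list_entry;   /* next waiter on the same object */
    struct wait_block *next_link;    /* next block of the same wait call */
    dispatcher_header *object;
    thread_object     *thread;
    unsigned short     key;          /* position in the argument list */
    wait_type          type;
} wait_block;

/* Links one wait block to both the object it waits on and its thread. */
void wait_block_attach(wait_block *wb, thread_object *t,
                       dispatcher_header *obj, unsigned short key,
                       wait_type type)
{
    assert(wb != NULL && t != NULL && obj != NULL);
    wb->object = obj;
    wb->thread = t;
    wb->key    = key;
    wb->type   = type;
    wb->list_entry = obj->wait_list;       /* queue on the object */
    obj->wait_list = wb;
    wb->next_link = t->wait_block_list;    /* chain to the thread */
    t->wait_block_list = wb;
}

/* Counts the wait blocks queued on one dispatcher object. */
size_t waiters_on(const dispatcher_header *obj)
{
    size_t n = 0;
    for (const wait_block *wb = obj->wait_list; wb != NULL; wb = wb->list_entry)
        n++;
    return n;
}
```

Keeping two links per wait block is what lets the kernel walk from an object to its waiters and from a thread to everything it is waiting for.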
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section without attempting to enter it (TryEnterCriticalSection)
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        LeaveCriticalSection( &crit );  /* leave exactly once, even on an exception */
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
– hObject specifies the kernel object
– dwTimeout specifies the wait time in msec
– dwTimeout == 0 - no wait, check whether the object is signaled
– dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles - pointer to an array identifying these objects
– bWaitAll - whether to wait for the first signaled object or for all objects; when waiting for the first, the return value gives the index of the signaled object (WAIT_OBJECT_0 + index)
Side effects
– Mutexes, auto-reset events, and waitable timers will be reset to the non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
The first thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() frees the mutex object once its last handle is closed
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
The POSIX pthreads specification supports mutexes
– Synchronization among threads in the same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
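The difference between the blocking and the polling call can be seen in a short pthreads sketch. The helper names add_blocking and add_if_free and the shared counter are hypothetical; only the pthread_mutex_* calls are part of the POSIX API.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
int counter = 0;

/* Blocking variant: waits until the mutex becomes available. */
void add_blocking(int value)
{
    int rc = pthread_mutex_lock(&counter_lock);
    assert(rc == 0);
    (void)rc;
    counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling variant: returns false at once if the mutex is busy. */
bool add_if_free(int value)
{
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc == EBUSY)
        return false;
    assert(rc == 0);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return true;
}
```

A caller of add_if_free has to decide what to do on failure (retry later, do other work), which is the same decision a Windows thread faces after a zero-timeout wait returns WAIT_TIMEOUT.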
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases the semaphore count by 1
– ReleaseSemaphore() may increment the count by any value
– ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
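The counting rules can be captured in a small single-threaded model. This sketch only mirrors the bookkeeping, it does not block or synchronize anything; sem_model, sem_is_signaled, sem_try_wait and sem_release are hypothetical names. The refusal to exceed the maximum count follows the documented behavior of ReleaseSemaphore().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    long count;   /* currently available resources */
    long max;     /* upper bound fixed at creation */
} sem_model;

/* A semaphore is signaled while its count is above zero. */
bool sem_is_signaled(const sem_model *s)
{
    return s->count > 0;
}

/* A satisfied wait takes exactly one unit. */
bool sem_try_wait(sem_model *s)
{
    if (s->count <= 0)
        return false;
    s->count--;
    return true;
}

/* Adds release_count units and reports the previous count. Fails and
 * changes nothing if the result would exceed the maximum. */
bool sem_release(sem_model *s, long release_count, long *previous)
{
    assert(s != NULL);
    if (release_count <= 0 || release_count > s->max - s->count)
        return false;
    if (previous != NULL)
        *previous = s->count;
    s->count += release_count;
    return true;
}
```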
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– A manual-reset event can signal several threads simultaneously; it must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– An auto-reset event signals a single thread; the event is reset automatically
– fInitialState == TRUE - create the event in the signaled state
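The difference between manual-reset and auto-reset events is easiest to see as a state model. The sketch below tracks only the signaled flag and how many waiters a call would release; event_model, event_set, event_reset and event_pulse are hypothetical names, and the number of waiting threads is passed in as a plain argument.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool manual_reset;
    bool signaled;
} event_model;

/* Models SetEvent: returns how many of the waiting threads are released. */
size_t event_set(event_model *e, size_t waiters)
{
    assert(e != NULL);
    if (e->manual_reset) {
        e->signaled = true;       /* stays signaled until reset */
        return waiters;           /* every waiter is released */
    }
    if (waiters > 0) {
        e->signaled = false;      /* auto-reset: consumed by one waiter */
        return 1;
    }
    e->signaled = true;           /* nobody waiting: kept for the next waiter */
    return 0;
}

/* Models ResetEvent. */
void event_reset(event_model *e)
{
    e->signaled = false;
}

/* Models PulseEvent: release current waiters, then leave the event reset. */
size_t event_pulse(event_model *e, size_t waiters)
{
    e->signaled = false;
    if (e->manual_reset)
        return waiters;
    return waiters > 0 ? 1 : 0;
}
```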
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal() - one thread
– pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
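Although there is no exact equivalent, a manual-reset event is commonly approximated with a mutex, a condition variable, and a flag. The following is one possible sketch under that assumption; the manual_event type and its functions are hypothetical names, and timeouts and PulseEvent-style behavior are left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} manual_event;

void manual_event_init(manual_event *e, bool initial_state)
{
    assert(e != NULL);
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent on a manual-reset event: wakes all waiters, stays set. */
void manual_event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent. */
void manual_event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the flag is set; the loop guards against spurious wakeups. */
void manual_event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

void manual_event_destroy(manual_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what makes the state persistent: a condition variable alone forgets a signal that arrives before anyone is waiting, whereas a set event remains signaled.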
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main, with prog1 connected to prog2 through a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
A read on the pipe handle will block if the pipe is empty
A write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe; handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
– The reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same pipe name
– The server can respond to a client using the same instance
The pipe can be accessed over the network
– Location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. means local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Use the same flag settings for all instances of a named pipe
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
The first CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with the named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– The first method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– The call will return as soon as there is a client connection
– Returns false if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
– The client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to named pipes
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, Response.Record, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", Response.Record);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while (fgets (Response.Record, MAX_RQRS_LEN, fp) != NULL)
        WriteFile (hNamedPipe, &Response.Record,
            (strlen(Response.Record) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}
/* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (Windows 2000: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– A write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
– A client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– The client will connect to every server in the network domain
Locate a server via mailslot
[Figure: application clients 0 to n act as mailslot servers: each calls CreateMailslot() for the mailslot named "status", reads the server status with ReadFile(hMS, &ServStat), and then connects to the server. The application server acts as the mailslot client: in a loop it calls Sleep(), opens the mailslot with CreateFile(), and sends its status with WriteFile(hMS, &StatInfo). The message is sent periodically.]
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– The name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– A read operation will wait for this many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name - retrieve a handle for a local mailslot
– \\host\mailslot\[path]name - retrieve a handle for a mailslot on the specified host
– \\domain\mailslot\[path]name - returns a handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name - returns a handle representing mailslots on machines in the system's primary domain; maximum message length: 400 bytes
– The client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
Getting Into Kernel Mode
Code is run in kernel mode for one of three reasons
1. Requests from user mode
– Via the system service dispatch mechanism
– Kernel-mode code runs in the context of the requesting thread
2. Dedicated kernel-mode system threads
– Some threads in the system stay in kernel mode at all times (mostly in the "System" process)
– Scheduled, preempted, etc., like any other threads
Getting Into Kernel Mode
3. Interrupts from external devices
– The interrupt dispatcher invokes the interrupt service routine
– The ISR runs in the context of the interrupted thread (so-called "arbitrary thread context")
– The ISR often requests the execution of a "DPC routine", which also runs in kernel mode
– Time is not charged to the interrupted thread
Trap dispatching
Trap: the processor's mechanism to capture an executing thread
– Switch from user to kernel mode
– Interrupts: asynchronous
– Exceptions: synchronous
[Figure: trap handlers. Interrupt -> interrupt dispatcher -> interrupt service routines; system service call -> system service dispatcher -> system services; hardware and software exceptions -> exception dispatcher -> exception handlers; virtual address exceptions -> virtual memory manager's pager.]
Interrupt Dispatching
[Figure: an interrupt arrives while user or kernel mode code is running; the interrupt dispatch routine and the interrupt service routine then run in kernel mode.]
Interrupt dispatch routine
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises the processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
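The masking rule can be modeled with a processor that carries a current IRQL. This is a toy sketch, not kernel code: cpu_model, interrupt_deliverable, raise_irql, lower_irql and wait_allowed are hypothetical names, and DISPATCH_LEVEL is taken to be 2 as in the x86 IRQL numbering used in this unit.

```c
#include <assert.h>
#include <stdbool.h>

enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 };

typedef struct {
    int irql;   /* current IRQL of this processor */
} cpu_model;

/* An interrupt is taken only if its IRQL is above the current one;
 * equal and lower levels stay masked (pending). */
bool interrupt_deliverable(const cpu_model *cpu, int source_irql)
{
    return source_irql > cpu->irql;
}

/* Raises the IRQL and returns the previous level so it can be restored. */
int raise_irql(cpu_model *cpu, int new_irql)
{
    assert(new_irql >= cpu->irql);
    int old = cpu->irql;
    cpu->irql = new_irql;
    return old;
}

/* Restores the level saved by raise_irql(). */
void lower_irql(cpu_model *cpu, int old_irql)
{
    assert(old_irql <= cpu->irql);
    cpu->irql = old_irql;
}

/* Waits and page faults are only legal below dispatch level. */
bool wait_allowed(const cpu_model *cpu)
{
    return cpu->irql < DISPATCH_LEVEL;
}
```

Servicing a device interrupt at IRQL 5 would look like raise_irql(&cpu, 5), the ISR body, then lower_irql(&cpu, old); while at 5, an interrupt at IRQL 3 stays pending.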
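Interrupt Precedence via IRQLs (x86)
IRQL | Level | Category
31 | High | Hardware interrupts
30 | Power fail | Hardware interrupts
29 | Interprocessor interrupt | Hardware interrupts
28 | Clock | Hardware interrupts
27 | Profile & Synch (Srv 2003) | Hardware interrupts
3 | Device 1 | Hardware interrupts
2 | Dispatch/DPC | Deferrable software interrupts
1 | APC | Deferrable software interrupts
0 | Passive/Low | Normal thread execution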
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– The interrupt controller interrupts the processor (single line)
– The processor queries for the interrupt vector and uses the vector as an index into the IDT
Alpha
– PAL code (Privileged Architecture Library, the Alpha BIOS) determines the interrupt vector and calls the kernel
– The kernel uses the vector to index the IDT
After ISR execution, the IRQL is lowered to its initial level
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls the ISR with the interrupt object as parameter (hardware cannot pass parameters to an ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn ISR on/off
Interrupt objects can synchronize access to ISR data
– Multiple instances of an ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected to the same IRQL
Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document but has never been used
Inter-processor interrupt
– Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update the system's clock and to allocate CPU time to threads
Profile
– Used for kernel profiling (see the Kernel profiler, Kernprof.exe, in the Resource Kit)
Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that the kernel and device drivers generate
Passive
– No interrupt level at all; normal thread execution
IRQLs on 64-bit Systems
IRQL | x64 | IA64
15 | High/Profile | High/Profile/Power
14 | Interprocessor Interrupt/Power | Interprocessor Interrupt
13 | Clock | Clock
12 | Synch (Srv 2003) | Synch (MP only)
below 12 | Device n down to Device 1 (IRQL 3) | Device n down to Device 1 (IRQL 4)
3 | Device 1 | Correctable Machine Check
2 | Dispatch/DPC | Dispatch/DPC & Synch (UP only)
1 | APC | APC
0 | Passive/Low | Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; this can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the I/O APIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
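The two arithmetic rules for deriving an IRQL can be checked with a couple of one-line functions. This sketch assumes the x64/IA64 rule reads "IDT vector number divided by 16" (integer division); the function names are hypothetical.

```c
#include <assert.h>

/* x86 uniprocessor rule: IRQL = 27 - IRQ. */
int irql_from_irq_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 27);
    return 27 - irq;
}

/* x64 and IA64 rule: IRQL = IDT vector number / 16, so each group of
 * 16 consecutive vectors shares one IRQL. */
int irql_from_vector_64(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

For example, under the x86 rule IRQ 0 maps to IRQL 27 and IRQ 8 to IRQL 19; under the 64-bit rule, vectors 0x40 through 0x4F all map to IRQL 4.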
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when the kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– A spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
Kernel Synchronization
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
[Figure: Processor A and Processor B both run the sequence below; the spinlock guards the critical section that operates on the shared DPC queue.]
do acquire_spinlock(DPC) until (SUCCESS)
begin
    remove DPC from queue
end
release_spinlock(DPC)
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
The first processor acquires the lock; other processors wait on a processor-local flag
– Thus the busy-wait loop requires no access to the memory bus
When the lock is released, the flag of the first processor in the queue is modified
– Exactly one processor is signaled
– Pre-determined wait order
5252
SMP Scalability Improvements
Windows 2000 queued spinlocks
ndash qlocks in Kernel Debugger
Server 2003
ndash More spinlocks eliminated (context swap system space commit)
ndash Further reduction of use of spinlocks amp length they are held
ndash Scheduling database now per-CPUAllows thread state transitions in parallel
5353
SMP Scalability Improvements
XP2003
ndash Minimized lock contention for hot locks PFN or Page Frame Database lock
ndash Some locks completely eliminatedCharging nonpagedpaged pool quotas allocating and mapping system page table entries charging commitment of pages allocatingmapping physical memory through AWE functions
ndash New more efficient locking mechanism (pushlocks)Doesnrsquot use spinlocks when no contentionUsed for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
ndash Wait for one or multiple objects in one call
ndash Wait for multiple can wait for ldquoanyrdquo one or ldquoallrdquo at onceldquoAllrdquo all objects must be in the signalled state concurrently to resolve the wait
ndash All wait calls include optional timeout argument
ndash Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
ndash Events (may be auto-reset or manual reset may be set or ldquopulsedrdquo)
ndash Mutexes (ldquomutual exclusionrdquo one-at-a-time)
ndash Semaphores (n-at-a-time)
ndash Timers
ndash Processes and Threads (signalled upon exit or terminate)
ndash Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
ndash If multiple threads are waiting for an object and only one thread is released (eg itrsquos a mutex or auto-reset event) which thread gets released is unpredictable
5757
Executive Synchronization
Thread waitson an object
handle
Create and initialize thread object
Initialized
Ready
Transition
Waiting
Running
Terminated
Standby
Wait is completeSet object to
signaled state
Interaction with thread scheduling
Waiting on Dispatcher Objects ndash outside the kernel

Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up the waiting threads; variable-priority threads get a priority boost

Interaction between Synchronization & Dispatching
Dispatcher reschedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later

What signals an object?
For each dispatcher object: the system events that change its state, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: owning thread releases the mutex
- signaled -> nonsignaled: resumed thread acquires the mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: owning thread or other thread releases the mutex
- signaled -> nonsignaled: resumed thread acquires the mutex
- Effect: kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: a thread acquires the semaphore and no more resources are available
- Effect: kernel resumes one or more waiting threads

What signals an object? (contd)
Event
- nonsignaled -> signaled: a thread sets the event
- signaled -> nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: dedicated thread sets one event in the event pair
- signaled -> nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes the waiting dedicated thread
Timer
- nonsignaled -> signaled: timer expires
- signaled -> nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: thread terminates
- signaled -> nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads

Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)

3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects

Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually the ISR) queues the request
- One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
- Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") once all higher-IRQL work (interrupts) has completed
- Maximum recommended times: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx

Deferred Procedure Calls (DPCs)
[Figure: DPC queue, a queue head linked to a chain of DPC objects]

Delivering a DPC
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with IRQL levels from High and Power failure down through Dispatch/DPC and APC to Low, the DPC queue, and the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads; kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue

Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, and call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
- Permission required for user-mode APCs

Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (environment subsystems)
APCs are delivered when a thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()

Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless the thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented

Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when the thread is in an "alertable wait"

Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two queues of APC objects, kernel mode (K) and user mode (U)]

IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to the thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to the thread's kernel time
- If IRQL > 2, charge to interrupt time
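The four accounting rules above can be sketched as a small decision function. This is only an illustration of the rules as listed, not the kernel's actual clock ISR; the names charge_bucket and charge_tick and the user_mode parameter are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names: these buckets mirror the four cases listed above. */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_bucket;

/* Decide which counter one clock tick is charged to.
 * irql:      IRQL of the code that the clock interrupt preempted
 * in_dpc:    true if a DPC routine was being processed
 * user_mode: true if the interrupted thread was running user-mode code */
charge_bucket charge_tick(int irql, bool in_dpc, bool user_mode)
{
    assert(irql >= 0 && irql <= 31);    /* x86 IRQL range */
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: ordinary thread execution (passive or APC level) */
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

For example, charge_tick(2, true, false) yields CHARGE_DPC, while the same IRQL without a DPC in progress is charged to the thread's kernel time.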

IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)

Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs

Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g. one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add the Context Switch Delta column

Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat

Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exist exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
[Figure: dispatcher object layout. Header fields: Size, Type, State, Wait list head; followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects

Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects, each with a WaitBlockList pointing to a chain of wait blocks; each wait block holds a List entry, Object, Thread, Key, Type and Next link; dispatcher objects (Size, Type, State, Wait list head, object-type-specific data) link the wait blocks of the threads waiting on them]

3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots

Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section without attempting to enter it (TryEnterCriticalSection)
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );

Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once per enter, even if the body raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );

Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );

Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
- dwTimeout == 0: no wait, check whether the object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to an array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects; when waiting for the first one, the function returns the index of the first signaled object (as WAIT_OBJECT_0 + index)
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to the non-signaled state after completing wait functions
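A minimal sketch of how a caller might interpret the value returned by a "wait any" call. The four constants carry the values from the Windows SDK headers and are repeated here only so that the fragment compiles without windows.h; wait_kind and classify_wait are hypothetical helper names, not part of the Windows API.

```c
#include <assert.h>

/* Values as defined in the Windows SDK headers, repeated here so the
 * sketch compiles without <windows.h>. */
#define WAIT_OBJECT_0     0x00000000UL
#define WAIT_ABANDONED_0  0x00000080UL
#define WAIT_TIMEOUT      0x00000102UL
#define WAIT_FAILED       0xFFFFFFFFUL

/* Hypothetical classification of a wait result. */
typedef enum { WR_SIGNALED, WR_ABANDONED, WR_TIMEOUT, WR_FAILED } wait_kind;

/* Classify the return value of a "wait any" call on `count` handles.
 * For WR_SIGNALED and WR_ABANDONED, *index receives the position of the
 * handle in the array that was passed to the wait call. */
wait_kind classify_wait(unsigned long ret, unsigned long count,
                        unsigned long *index)
{
    assert(index != 0);
    assert(count >= 1 && count <= 64);      /* MAXIMUM_WAIT_OBJECTS */
    if (ret < WAIT_OBJECT_0 + count) {
        *index = ret - WAIT_OBJECT_0;
        return WR_SIGNALED;
    }
    if (ret >= WAIT_ABANDONED_0 && ret < WAIT_ABANDONED_0 + count) {
        *index = ret - WAIT_ABANDONED_0;    /* an abandoned mutex was acquired */
        return WR_ABANDONED;
    }
    if (ret == WAIT_TIMEOUT)
        return WR_TIMEOUT;
    return WR_FAILED;                       /* WAIT_FAILED or anything unexpected */
}
```

With three handles, a return value of 2 maps to WR_SIGNALED with index 2, and 0x81 maps to WR_ABANDONED with index 1.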

Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free the mutex object

Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );

Mutex Example
/* counter is global, shared by all threads */
volatile int done = 0, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0) {
        counter += local_value;
    } else {
        /* mutex was abandoned (or the wait failed) */
        if (ret == WAIT_ABANDONED)
            ReleaseMutex( mutex );   /* an abandoned mutex is still acquired by the waiter */
        break;                       /* exit the loop */
    }
    ReleaseMutex( mutex );
}
CloseHandle( mutex );

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
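The comparison above can be sketched with the pthreads calls themselves. This is a minimal illustration, assuming a default (non-recursive) mutex; counter_lock, worker, run_counter_demo and poll_lock are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;                    /* shared by all threads */

static void *worker(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        pthread_mutex_lock(&counter_lock);      /* blocks, like a wait with INFINITE */
        counter++;
        pthread_mutex_unlock(&counter_lock);    /* like ReleaseMutex() */
    }
    return NULL;
}

/* Run `nthreads` workers that each add `per_thread` to the shared counter
 * and return the final counter value. */
long run_counter_demo(int nthreads, long per_thread)
{
    pthread_t tid[16];
    assert(nthreads >= 1 && nthreads <= 16);
    counter = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, worker, &per_thread);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return counter;
}

/* Poll the mutex without blocking, like a wait with timeout 0.
 * Returns 1 if the lock was free (it is released again), 0 if it was busy. */
int poll_lock(void)
{
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc == 0) {
        pthread_mutex_unlock(&counter_lock);
        return 1;
    }
    assert(rc == EBUSY);
    return 0;
}
```

One difference worth keeping in mind: a Windows mutex can be re-acquired by the thread that already owns it, whereas a default pthread mutex cannot, so poll_lock() reports "busy" even when the caller itself holds counter_lock.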

Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any positive value, up to the maximum count
- ReleaseSemaphore() returns the old semaphore count (through lpPreviousCount)
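The counting rules above can be modeled in portable C with a mutex and a condition variable. This is a sketch of the semantics only, not how Windows implements semaphore objects; counting_sem and the csem_* functions are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical type modeled on the rules above; not the Windows object. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  nonzero;
    long count;                 /* "signaled" while count > 0 */
    long max;
} counting_sem;

void csem_init(counting_sem *s, long initial, long max)
{
    assert(max > 0 && initial >= 0 && initial <= max);
    pthread_mutex_init(&s->mu, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = initial;
    s->max = max;
}

/* Blocking wait: returns once one unit has been taken (count drops by 1). */
void csem_wait(counting_sem *s)
{
    pthread_mutex_lock(&s->mu);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->mu);
    s->count--;
    pthread_mutex_unlock(&s->mu);
}

/* Zero-timeout wait: returns 1 if a unit was taken, 0 if the count was 0. */
int csem_trywait(counting_sem *s)
{
    pthread_mutex_lock(&s->mu);
    int ok = s->count > 0;
    if (ok)
        s->count--;
    pthread_mutex_unlock(&s->mu);
    return ok;
}

/* Add n units and store the old count in *previous (which may be NULL).
 * Returns 0 and changes nothing if the maximum would be exceeded. */
int csem_release(counting_sem *s, long n, long *previous)
{
    int ok = 0;
    pthread_mutex_lock(&s->mu);
    if (n > 0 && n <= s->max - s->count) {
        if (previous)
            *previous = s->count;
        s->count += n;
        pthread_cond_broadcast(&s->nonzero);    /* several waiters may proceed */
        ok = 1;
    }
    pthread_mutex_unlock(&s->mu);
    return ok;
}
```

A semaphore created with csem_init(&s, 2, 3) lets two waits through before the third would block; csem_release(&s, 2, &prev) then reports the old count in prev, and a further release that would push the count past the maximum is rejected.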

Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- A manual-reset event can signal several threads simultaneously; it must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- An auto-reset event signals a single thread; the event is reset automatically
- fInitialState == TRUE: create the event in the signaled state

Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );

Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
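Because a condition variable carries no state of its own, an event-like object has to pair it with an explicit flag. The sketch below shows one way to approximate manual-reset and auto-reset events under that assumption; sync_event and the event_* functions are hypothetical names, and PulseEvent() is deliberately not modeled.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical event type: a flag guarded by a mutex, plus a condition variable. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    bool signaled;
    bool manual_reset;
} sync_event;

void event_init(sync_event *e, bool manual_reset, bool initial_state)
{
    assert(e != NULL);
    pthread_mutex_init(&e->mu, NULL);
    pthread_cond_init(&e->cv, NULL);
    e->signaled = initial_state;
    e->manual_reset = manual_reset;
}

void event_set(sync_event *e)
{
    pthread_mutex_lock(&e->mu);
    e->signaled = true;
    if (e->manual_reset)
        pthread_cond_broadcast(&e->cv);     /* release every waiter */
    else
        pthread_cond_signal(&e->cv);        /* wake one waiter */
    pthread_mutex_unlock(&e->mu);
}

void event_reset(sync_event *e)
{
    pthread_mutex_lock(&e->mu);
    e->signaled = false;
    pthread_mutex_unlock(&e->mu);
}

/* Block until the event is signaled. An auto-reset event is consumed by the
 * one waiter that gets through; a manual-reset event stays signaled. */
void event_wait(sync_event *e)
{
    pthread_mutex_lock(&e->mu);
    while (!e->signaled)
        pthread_cond_wait(&e->cv, &e->mu);
    if (!e->manual_reset)
        e->signaled = false;
    pthread_mutex_unlock(&e->mu);
}

/* Zero-timeout wait: true if the event was signaled (consuming it if auto-reset). */
bool event_trywait(sync_event *e)
{
    pthread_mutex_lock(&e->mu);
    bool ok = e->signaled;
    if (ok && !e->manual_reset)
        e->signaled = false;
    pthread_mutex_unlock(&e->mu);
    return ok;
}
```

The flag is what lets a manual-reset event stay signaled for threads that arrive after event_set(); a bare pthread_cond_broadcast() would be lost on them.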

Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
A read on a pipe handle will block if the pipe is empty
A write operation to a full pipe will block
Anonymous pipes are one-way
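For comparison, the same one-way, byte-oriented behavior can be observed with the POSIX pipe() call. This is a portable analogue, not the Windows CreatePipe() API; pipe_roundtrip is a hypothetical helper name.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Send `msg` through a freshly created pipe and read it back into `out`.
 * Returns the number of bytes read, or -1 on error. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t outsize)
{
    int fd[2];                      /* fd[0] = read end, fd[1] = write end */
    size_t len = strlen(msg);
    /* Keep the message small so that the write below cannot block. */
    assert(len > 0 && len <= 512 && len < outsize);
    if (pipe(fd) != 0)
        return -1;
    if (write(fd[1], msg, len) != (ssize_t)len) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);                   /* reader sees end-of-file after the data */
    ssize_t n = read(fd[0], out, outsize - 1);
    close(fd[0]);
    if (n >= 0)
        out[n] = '\0';
    return n;
}
```

Data only flows from the write end to the read end; two-way traffic between the same pair of programs would need a second pipe.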

I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);

Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;

Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same pipe name, each over its own instance
- Server can respond to a client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions

Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. = local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)

Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );

Named Pipe Client Connections
CreateFile with the named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo

Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
- Combines a WriteFile, ReadFile sequence

Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
- Combines CreateFile, WriteFile, ReadFile, CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic (for writes of at most PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios

Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;

Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain

Locate a server via mailslot
[Figure, calls abbreviated as on the slide]
Mailslot servers (app client 0 ... app client n), each running:
    hMS = CreateMailslot( "\\.\mailslot\status" );
    ReadFile( hMS, &ServStat );
    /* connect to server */
Mailslot client (app server), sending the message periodically:
    while (...) {
        Sleep(...);
        hMS = CreateFile( "\\*\mailslot\status" );
        WriteFile( hMS, &StatInfo );
    }

Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait

Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieves a handle for a local mailslot
- \\host\mailslot\[path]name: retrieves a handle for a mailslot on the specified host
- \\domain\mailslot\[path]name: returns a handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns a handle representing mailslots on machines in the system's primary domain; max message length: 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts

Thoughts Change Life 意念改变生活
ndash Further reduction of use of spinlocks amp length they are held
ndash Scheduling database now per-CPUAllows thread state transitions in parallel
5353
SMP Scalability Improvements
XP2003
ndash Minimized lock contention for hot locks PFN or Page Frame Database lock
ndash Some locks completely eliminatedCharging nonpagedpaged pool quotas allocating and mapping system page table entries charging commitment of pages allocatingmapping physical memory through AWE functions
ndash New more efficient locking mechanism (pushlocks)Doesnrsquot use spinlocks when no contentionUsed for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
ndash Wait for one or multiple objects in one call
ndash Wait for multiple can wait for ldquoanyrdquo one or ldquoallrdquo at onceldquoAllrdquo all objects must be in the signalled state concurrently to resolve the wait
ndash All wait calls include optional timeout argument
ndash Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
ndash Events (may be auto-reset or manual reset may be set or ldquopulsedrdquo)
ndash Mutexes (ldquomutual exclusionrdquo one-at-a-time)
ndash Semaphores (n-at-a-time)
ndash Timers
ndash Processes and Threads (signalled upon exit or terminate)
ndash Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
ndash If multiple threads are waiting for an object and only one thread is released (eg itrsquos a mutex or auto-reset event) which thread gets released is unpredictable
5757
Executive Synchronization
Thread waitson an object
handle
Create and initialize thread object
Initialized
Ready
Transition
Waiting
Running
Terminated
Standby
Wait is completeSet object to
signaled state
Interaction with thread scheduling
Waiting on Dispatcher Objects ndash outside the kernel
5858
Interaction bet Synchronization amp Dispatching User mode thread waits on an event objectlsquos handle
Kernel changes threadlsquos scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads variable priority threads get priority boost
5959
Interaction bet Synchronization amp Dispatching Dispatcher re-schedules new thread ndash it may preempt running thread it it has lower priority and issues software interrupt to initiate context switch
If no processor can be preempted the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
Dispatcher object | Becomes signaled when | Becomes nonsignaled when | Effect of signaled state on waiting threads
Mutex (kernel mode) | Owning thread releases mutex | Resumed thread acquires mutex | Kernel resumes one waiting thread
Mutex (exported to user mode) | Owning thread or other thread releases mutex | Resumed thread acquires mutex | Kernel resumes one waiting thread
Semaphore | One thread releases the semaphore, freeing a resource | A thread acquires the semaphore and no more resources are available | Kernel resumes one or more waiting threads
6161
What signals an object (contd)
Dispatcher object | Becomes signaled when | Becomes nonsignaled when | Effect of signaled state on waiting threads
Event | A thread sets the event | Kernel resumes one or more threads | Kernel resumes one or more waiting threads
Event pair | Dedicated thread sets one event in the event pair | Kernel resumes the other dedicated thread | Kernel resumes waiting dedicated thread
Timer | Timer expires | A thread (re)initializes the timer | Kernel resumes all waiting threads
Thread | Thread terminates | A thread reinitializes the thread object | Kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually the ISR) queues the request
- One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
- Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
- See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: per-processor DPC queue: queue head -> DPC object -> DPC object -> DPC object]
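The queue in the figure can be modeled in portable C. This is only a sketch of the idea (a linked list of deferred routines drained in FIFO order), not the kernel's implementation; the names dpc, dpc_queue, dpc_enqueue, dpc_drain and dpc_add_one are hypothetical and do not correspond to Windows DDK types.

```c
#include <stddef.h>

/* Hypothetical model of a DPC object: a routine plus its context. */
typedef void (*dpc_routine)(void *context);

typedef struct dpc {
    struct dpc *next;
    dpc_routine routine;
    void       *context;
} dpc;

/* One such queue would exist per processor. */
typedef struct {
    dpc *head;
    dpc *tail;
} dpc_queue;

/* What an ISR would do: append the request and return quickly. */
void dpc_enqueue(dpc_queue *q, dpc *d)
{
    d->next = NULL;
    if (q->tail != NULL)
        q->tail->next = d;
    else
        q->head = d;
    q->tail = d;
}

/* What happens once IRQL drops to dispatch level: run every queued
   routine in FIFO order. Returns the number of routines executed. */
size_t dpc_drain(dpc_queue *q)
{
    size_t executed = 0;
    while (q->head != NULL) {
        dpc *d = q->head;
        q->head = d->next;
        if (q->head == NULL)
            q->tail = NULL;
        d->routine(d->context);
        executed++;
    }
    return executed;
}

/* Example deferred routine: increments an int counter. */
void dpc_add_one(void *context)
{
    (*(int *)context)++;
}
```

The model leaves out everything that makes real DPCs delicate (IRQL changes, the spinlock guarding the queue, targeting other CPUs); it only shows the queue-then-drain pattern.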
6666
Delivering a DPC
[Figure: interrupt dispatch table with IRQLs from High and Power failure down to Dispatch/DPC, APC and Low; a DPC queue holding DPC objects; the thread dispatcher]
1. Timer expires; the kernel queues a DPC that will release all waiting threads, and requests a software interrupt
2. The DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. The dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
6767
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, and call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
- Permission required for user-mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make a thread suspend or terminate itself (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless the thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when the thread is in an "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel-mode (K) and user-mode (U), each linking APC objects]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to the thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to the thread's kernel time
- If IRQL > 2, charge to interrupt time
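The four accounting rules above fit into one small decision function. The sketch below is a portable C model, not kernel code; the enum, the function name charge_tick and the was_user_mode parameter (used to pick between user and kernel time below IRQL 2) are assumptions made for illustration.

```c
#include <stdbool.h>

/* Buckets a clock tick can be charged to (illustrative names). */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/* irql:          IRQL of the code the clock interrupt preempted
   in_dpc:        a DPC routine was being processed
   was_user_mode: the preempted thread was running user-mode code */
charge_t charge_tick(int irql, bool in_dpc, bool was_user_mode)
{
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return was_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Because the decision is made only when the clock fires, anything that starts and finishes between two ticks is never seen by this function at all.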
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread or process
- Process Explorer shows these as separate processes (they are not really processes)
- Context switches for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add the Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g., events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g., processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
[Figure: dispatcher object layout (see \ntddk\inc\ddk\ntddk.h): header fields Size, Type, State and Wait list head, followed by object-type-specific data]
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (header fields Size, Type, State, Wait list head, then object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds a List entry, Object, Thread, Key, Type and Next link]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* runs on every exit path, so the section is left exactly once */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
  dwTimeout == 0: no wait, check whether the object is signaled
  dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to an array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects
  When waiting for any object, the function returns the index of the first signaled object
Side effects
- Mutexes, auto-reset events, and waitable timers will be reset to the non-signaled state after completing wait functions
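The "any" versus "all" rule can be illustrated without the Windows API. The following portable C model checks an array of object states the way a zero-timeout wait would; poll_wait and the MODEL_ constants are hypothetical names, and the model assumes at least one object is passed.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_WAIT_OBJECT_0 0     /* base of the "object index" results */
#define MODEL_WAIT_TIMEOUT  (-1)  /* wait could not be satisfied now    */

/* signaled[i] is the current state of the i-th object.
   wait_all == true : satisfied only if every object is signaled.
   wait_all == false: satisfied by the lowest-index signaled object,
                      whose index is added to MODEL_WAIT_OBJECT_0. */
int poll_wait(const bool *signaled, size_t count, bool wait_all)
{
    if (wait_all) {
        for (size_t i = 0; i < count; i++)
            if (!signaled[i])
                return MODEL_WAIT_TIMEOUT;
        return MODEL_WAIT_OBJECT_0;
    }
    for (size_t i = 0; i < count; i++)
        if (signaled[i])
            return MODEL_WAIT_OBJECT_0 + (int)i;
    return MODEL_WAIT_TIMEOUT;
}
```

A real wait additionally blocks until the condition holds and applies the side effects listed above (for example, taking ownership of a mutex); the model only evaluates the condition once.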
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else  /* mutex was abandoned */
        break;  /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
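A minimal pthreads sketch of the two acquisition styles compared above. The counter and the helper names add_blocking, add_if_free and read_counter are hypothetical; the mutex uses the static initializer instead of pthread_mutex_init() to keep the example short.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

/* Blocking style: like waiting on a mutex handle with INFINITE. */
void add_blocking(int value)
{
    pthread_mutex_lock(&counter_lock);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling style: like a wait with timeout 0.
   Returns 1 if the update was done, 0 if the mutex was busy,
   -1 on any other error. */
int add_if_free(int value)
{
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc == EBUSY)
        return 0;
    if (rc != 0)
        return -1;
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return 1;
}

/* Reads the counter under the lock. */
int read_counter(void)
{
    pthread_mutex_lock(&counter_lock);
    int value = counter;
    pthread_mutex_unlock(&counter_lock);
    return value;
}
```

One difference the comparison hides: a Windows mutex can be re-acquired by its owner, while a default pthreads mutex cannot, so trylock reports it as busy even to the thread that holds it.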
8888
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value
- ReleaseSemaphore() reports the old semaphore count (through lpPreviousCount)
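The counting rules can be captured in a small single-threaded model. This is a sketch of the semantics only (no blocking, no atomicity); model_sem and its functions are hypothetical names. The model also refuses a release that would push the count past its maximum, as ReleaseSemaphore() does.

```c
#include <stdbool.h>
#include <stddef.h>

/* Single-threaded model of a counting semaphore's state. */
typedef struct {
    long count;  /* current count; signaled while count > 0 */
    long max;    /* maximum count fixed at creation         */
} model_sem;

bool model_sem_signaled(const model_sem *s)
{
    return s->count > 0;
}

/* A satisfied wait decreases the count by exactly 1.
   Returns false (no change) if the semaphore is not signaled. */
bool model_sem_try_wait(model_sem *s)
{
    if (s->count <= 0)
        return false;
    s->count--;
    return true;
}

/* Release adds n (n > 0) and reports the previous count.
   Fails without changing the count if the maximum would be exceeded. */
bool model_sem_release(model_sem *s, long n, long *previous)
{
    if (n <= 0 || s->count + n > s->max)
        return false;
    if (previous != NULL)
        *previous = s->count;
    s->count += n;
    return true;
}
```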
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- A manual-reset event can signal several threads simultaneously; it must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- An auto-reset event signals a single thread; the event is reset automatically
- fInitialState == TRUE: create the event in the signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
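Since there is no exact equivalent, a manual-reset event is usually emulated with a mutex, a condition variable and a boolean flag. The sketch below shows one common way to do that; the manual_event type and its functions are hypothetical names, not part of pthreads.

```c
#include <pthread.h>
#include <stdbool.h>

/* Emulated manual-reset event: the flag stays set until reset. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} manual_event;

int manual_event_init(manual_event *e, bool initial_state)
{
    int rc = pthread_mutex_init(&e->lock, NULL);
    if (rc != 0)
        return rc;
    rc = pthread_cond_init(&e->cond, NULL);
    if (rc != 0) {
        pthread_mutex_destroy(&e->lock);
        return rc;
    }
    e->signaled = initial_state;
    return 0;
}

/* Like SetEvent on a manual-reset event: wake every waiter. */
void manual_event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent: later waiters block again. */
void manual_event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; the loop guards against
   spurious wakeups. The flag is left set for other waiters. */
void manual_event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

void manual_event_destroy(manual_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what turns the stateless condition variable into something event-like: a thread that calls manual_event_wait() after the set still passes through, which a bare pthread_cond_broadcast() would not guarantee.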
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe connecting prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on a pipe handle will block if the pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
IO Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same pipe name
- Server can respond to a client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. means the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Equivalent to a WriteFile, ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Equivalent to a CreateFile, WriteFile, ReadFile, CloseHandle sequence
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, Response.Record, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", Response.Record);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (Response.Record, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &Response.Record,
                (strlen(Response.Record) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}
/* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (Windows 2000: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
107107
Locate a server via mailslot
[Figure: application clients 0 to n act as mailslot servers; the application server acts as the mailslot client]
Mailslot servers (App client 0 ... App client n): each creates the mailslot "status" with CreateMailslot(), reads the server status with ReadFile( hMS, &ServStat ... ), then connects to the server
Mailslot client (App server): loops; after each Sleep() it opens the mailslot "status" with CreateFile() and sends its status with WriteFile( hMS, &StatInfo ... )
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieve handle for local mailslot
- \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length: 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
3737
Trap dispatching
Trap: processor's mechanism to capture an executing thread
- Switch from user to kernel mode
- Interrupts: asynchronous
- Exceptions: synchronous
[Figure: trap handlers and their targets. Interrupt -> interrupt dispatcher -> interrupt service routines; system service call -> system service dispatcher -> system services; HW exceptions and SW exceptions -> exception dispatcher -> exception handlers; virtual address exceptions -> virtual memory manager's pager]
Interrupt Dispatching
Interrupt dispatch routine
- Disable interrupts
- Record machine state (trap frame) to allow resume
- Mask equal- and lower-IRQL interrupts
- Find and call appropriate ISR
- Dismiss interrupt
- Restore machine state (including mode and enabled interrupts)
Interrupt service routine
- Tell the device to stop interrupting
- Interrogate device state, start next operation on device, etc.
- Request a DPC
- Return to caller
[Figure: an interrupt arrives while user- or kernel-mode code is running; the dispatch routine and the ISR run in kernel mode]
Note: no thread or process context switch
3939
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
4040
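Interrupt Precedence via IRQLs (x86)
IRQL | Level | Kind
31 | High | Hardware interrupts
30 | Power fail | Hardware interrupts
29 | Interprocessor Interrupt | Hardware interrupts
28 | Clock | Hardware interrupts
27 | Profile & Synch (Srv 2003) | Hardware interrupts
3 | Device 1 | Hardware interrupts
2 | Dispatch/DPC | Deferrable software interrupts
1 | APC | Deferrable software interrupts
0 | Passive/Low | Normal thread execution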
4141
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library, the Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
4242
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn ISR on/off
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected with one IRQL
4343
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update system's clock, allocation of CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler, Kernprof.exe, Res Kit)
4444
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all, normal thread execution
4545
IRQLs on 64-bit Systems
IRQL | x64 | IA64
15 | High/Profile | High/Profile/Power
14 | Interprocessor Interrupt/Power | Interprocessor Interrupt
13 | Clock | Clock
12 | Synch (Srv 2003) | Synch (MP only)
... | Device n | Device n
4 | (device) | Device 1
3 | Device 1 | Correctable Machine Check
2 | Dispatch/DPC | Dispatch/DPC & Synch (UP only)
1 | APC | APC
0 | Passive/Low | Passive/Low
4646
Interrupt Prioritization amp Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device; this can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
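The two deterministic mappings on this slide are plain arithmetic. A tiny C sketch, with hypothetical function names, assuming the x64/IA64 rule is an integer division of the vector number by 16:

```c
/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_from_irq_x86_up(int irq)
{
    return 27 - irq;
}

/* x64 and IA64 rule from the slide: IRQL = IDT vector number / 16,
   so each block of 16 consecutive vectors shares one IRQL. */
int irql_from_vector_x64(int idt_vector)
{
    return idt_vector / 16;
}
```

For example, IRQ 8 on an x86 uniprocessor maps to IRQL 19, and vector 0x41 on x64 maps to IRQL 4.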
4747
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when the kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- A spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
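The "single bit plus atomic test-and-modify" idea can be sketched with a C11 atomic_flag. This is a user-mode illustration of the technique, not the kernel's KSPIN_LOCK routines; the spinlock type and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* One atomic flag is the whole lock: set means "owned". */
typedef struct {
    atomic_flag locked;
} spinlock;

#define SPINLOCK_INIT { ATOMIC_FLAG_INIT }

/* Atomic test-and-set: returns true if this caller became the owner. */
bool spin_try_acquire(spinlock *l)
{
    return !atomic_flag_test_and_set_explicit(&l->locked,
                                              memory_order_acquire);
}

/* Busy-wait until the lock is obtained. */
void spin_acquire(spinlock *l)
{
    while (!spin_try_acquire(l)) {
        /* spin: the owner is expected to release very soon */
    }
}

/* Clearing the flag frees the lock for the next test-and-set. */
void spin_release(spinlock *l)
{
    atomic_flag_clear_explicit(&l->locked, memory_order_release);
}
```

Every failed attempt in spin_acquire() is still an atomic read-modify-write on shared memory, which is the cost that motivates queued spinlocks.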
5050
Kernel Synchronization
[Figure: processors A and B both want to remove a DPC from the DPC queue, which is guarded by a spinlock]
Each processor executes:
    do
        acquire_spinlock(DPC)
    until (SUCCESS)
    begin                      /* critical section */
        remove DPC from queue
    end
    release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
- Thus, the busy-wait loop requires no access to the memory bus
When releasing the lock, the first waiting processor's flag is modified
- Exactly one processor is being signaled
- Pre-determined wait order
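One well-known way to get these properties is an MCS-style queue lock, sketched below with C11 atomics. This illustrates the idea of spinning on a per-waiter flag; it is an assumption that this resembles the Windows implementation, which the slides do not show. The qnode, qlock and function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Per-waiter node: each processor spins only on its own flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool             must_wait;
} qnode;

/* The lock itself is just a pointer to the last waiter in the queue. */
typedef struct {
    _Atomic(qnode *) tail;
} qlock;

void qlock_init(qlock *l)
{
    atomic_init(&l->tail, (qnode *)NULL);
}

void qlock_acquire(qlock *l, qnode *me)
{
    atomic_store(&me->next, (qnode *)NULL);
    atomic_store(&me->must_wait, true);

    /* Append ourselves; the previous tail (if any) is our predecessor. */
    qnode *prev = atomic_exchange(&l->tail, me);
    if (prev == NULL)
        return;                        /* queue was empty: lock is ours */

    atomic_store(&prev->next, me);     /* link behind the predecessor   */
    while (atomic_load(&me->must_wait)) {
        /* spin on our own flag only */
    }
}

void qlock_release(qlock *l, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        /* No visible successor: try to mark the queue empty. */
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected,
                                           (qnode *)NULL))
            return;
        /* A waiter swapped the tail but has not linked itself yet. */
        while ((succ = atomic_load(&me->next)) == NULL) {
            /* wait for the link to appear */
        }
    }
    atomic_store(&succ->must_wait, false);  /* hand over to exactly one */
}
```

Each caller supplies its own qnode (for example, one per processor), which must stay valid from acquire until release. The hand-over in qlock_release() touches only the successor's flag, which is why exactly one waiter is signaled and the order is first come, first served.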
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU; allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
ndash Wait for multiple can wait for ldquoanyrdquo one or ldquoallrdquo at onceldquoAllrdquo all objects must be in the signalled state concurrently to resolve the wait
ndash All wait calls include optional timeout argument
ndash Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
ndash Events (may be auto-reset or manual reset may be set or ldquopulsedrdquo)
ndash Mutexes (ldquomutual exclusionrdquo one-at-a-time)
ndash Semaphores (n-at-a-time)
ndash Timers
ndash Processes and Threads (signalled upon exit or terminate)
ndash Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
ndash If multiple threads are waiting for an object and only one thread is released (eg itrsquos a mutex or auto-reset event) which thread gets released is unpredictable
5757
Executive Synchronization
Thread waitson an object
handle
Create and initialize thread object
Initialized
Ready
Transition
Waiting
Running
Terminated
Standby
Wait is completeSet object to
signaled state
Interaction with thread scheduling
Waiting on Dispatcher Objects ndash outside the kernel
5858
Interaction bet Synchronization amp Dispatching User mode thread waits on an event objectlsquos handle
Kernel changes threadlsquos scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads variable priority threads get priority boost
5959
Interaction bet Synchronization amp Dispatching Dispatcher re-schedules new thread ndash it may preempt running thread it it has lower priority and issues software interrupt to initiate context switch
If no processor can be preempted the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
Dispatcher object
System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
Owning thread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (kernel mode)
nonsignaled signaled
Owning thread or otherthread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (exported to user mode)
nonsignaled signaled
One thread releases thesemaphore freeing a resource
A thread acquires the semaphoreMore resources are not available
Kernel resumes one or more waiting threads
Semaphore
6161
A thread reinitializesthe thread object
What signals an object (contd)
Dispatcher object System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
A thread sets the event
Kernel resumes one or more threads
Kernel resumes one or more waiting threads
Event
nonsignaled signaled
Dedicated thread sets oneevent in the event pair
Kernel resumes theother dedicated thread
Kernel resumes waitingdedicated thread
Event pair
nonsignaled signaled
Timer expires
A thread (re) initializes the timer
Kernel resumes all waiting threads
Timer
nonsignaled signaled
Thread terminates
Kernel resumes all waiting threads
Thread
6262
Further Reading
Mark E Russinovich and David A Solomon Microsoft Windows Internals 4th Edition Microsoft Press 2004
Chapter 3 - System Mechanisms
ndash Trap Dispatching (pp 85 ff)
ndash Synchronization (pp 149 ff)
ndash Kernel Event Tracing (pp 175 ff)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects

Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx

Deferred Procedure Calls (DPCs)
[Figure: per-processor DPC queue; a queue head followed by a chain of DPC objects]

Delivering a DPC
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table (High, Power failure, Dispatch/DPC, APC, Low), the DPC queue holding DPC objects, and the thread dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads; kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
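
The four delivery steps can be sketched as a toy model in portable C. This is not kernel code: `dpc_t`, `cpu_t`, `queue_dpc`, `lower_irql` and `count_dpc` are hypothetical names, the IRQL is a plain integer, and the real dispatcher does considerably more work.

```c
#include <assert.h>
#include <stddef.h>

enum { PASSIVE_LEVEL = 0, APC_LEVEL = 1, DISPATCH_LEVEL = 2 };

typedef struct dpc {
    void (*routine)(void *ctx);   /* the deferred procedure */
    void *ctx;                    /* argument handed to the routine */
    struct dpc *next;             /* link in the per-CPU queue */
} dpc_t;

typedef struct {
    int irql;                     /* current IRQL of this simulated CPU */
    dpc_t *head, *tail;           /* FIFO DPC queue, one per CPU */
} cpu_t;

/* Step 1: code running above dispatch level (e.g. an ISR) queues a DPC. */
void queue_dpc(cpu_t *cpu, dpc_t *dpc) {
    assert(dpc->routine != NULL);
    dpc->next = NULL;
    if (cpu->tail != NULL) cpu->tail->next = dpc;
    else cpu->head = dpc;
    cpu->tail = dpc;
}

/* Steps 2 to 4: when the IRQL is about to drop below dispatch level,
 * every queued DPC routine runs at dispatch level first. */
void lower_irql(cpu_t *cpu, int new_irql) {
    if (new_irql < DISPATCH_LEVEL && cpu->head != NULL) {
        cpu->irql = DISPATCH_LEVEL;
        while (cpu->head != NULL) {
            dpc_t *d = cpu->head;
            cpu->head = d->next;
            if (cpu->head == NULL) cpu->tail = NULL;
            d->routine(d->ctx);
        }
    }
    cpu->irql = new_irql;
}

/* Example deferred routine: counts how many DPCs have run. */
void count_dpc(void *ctx) { ++*(int *)ctx; }
```

Queuing two DPCs while the simulated CPU is at a device IRQL and then lowering to `PASSIVE_LEVEL` runs both routines before the IRQL reaches passive level; lowering only to `DISPATCH_LEVEL` runs none.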

Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs

Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()

Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented

Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"

Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two queues of APC objects, one kernel-mode (K) and one user-mode (U)]

IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 & not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
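
The four accounting rules map directly onto a small decision function. The sketch below is illustrative only; `charge_t` and `charge_tick` are hypothetical names, and the `was_user_mode` flag is an assumption about how the user/kernel split at IRQL < 2 would be decided.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/* Decide which counter one clock tick is charged to, given the IRQL
 * that the clock interrupt preempted. */
charge_t charge_tick(int irql, bool in_dpc, bool was_user_mode) {
    assert(irql >= 0 && irql <= 31);
    if (irql > 2)
        return CHARGE_INTERRUPT;                 /* IRQL > 2 */
    if (irql == 2)
        return in_dpc ? CHARGE_DPC               /* IRQL = 2, in a DPC */
                      : CHARGE_THREAD_KERNEL;    /* IRQL = 2, no DPC */
    return was_user_mode ? CHARGE_THREAD_USER    /* IRQL < 2 */
                         : CHARGE_THREAD_KERNEL;
}
```

Because the decision is made only when the clock fires, work that starts and ends between two ticks is never seen by this function, which is the source of the accounting quirks discussed in this section.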

IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)

Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs

Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column

Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat

Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– Some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "Signaled" vs. "nonsignaled"
– When signaled, a wait on the object is satisfied
– Different object types differ in terms of what changes their state
– Wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]

Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects, each with a WaitBlockList, pointing to their wait blocks (List entry, Object, Thread, Key, Type, Next link); each dispatcher object (Size, Type, State, Wait list head, object-type-specific data) heads the list of wait blocks waiting on it]
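
The double chaining of wait blocks can be sketched with a few C structures. This is a simplified model under stated assumptions, not the layout from ntddk.h: all type and field names are hypothetical, the lists are singly linked and NULL-terminated, and `begin_wait` only builds the links (it does not block).

```c
#include <assert.h>
#include <stddef.h>

typedef enum { WAIT_TYPE_ANY, WAIT_TYPE_ALL } wait_type_t;

struct wait_block;

typedef struct {                     /* simplified dispatcher header */
    int signaled;                    /* State: nonzero when signaled */
    struct wait_block *waiters;      /* Wait list head */
} dispatcher_object_t;

typedef struct {
    struct wait_block *wait_block_list;  /* blocks of the current wait call */
} thread_t;

typedef struct wait_block {
    struct wait_block *list_entry;   /* next block waiting on the same object */
    dispatcher_object_t *object;     /* what is being waited for */
    thread_t *thread;                /* who is waiting */
    int key;                         /* position in the handle array */
    wait_type_t type;                /* wait for "any" or "all" */
    struct wait_block *next_link;    /* next block of the same wait call */
} wait_block_t;

/* Build one wait block per object and chain each block both to the
 * waiting thread and to the object it refers to. */
void begin_wait(thread_t *t, dispatcher_object_t **objs,
                wait_block_t *blocks, int n, wait_type_t type) {
    assert(n > 0);
    t->wait_block_list = NULL;
    for (int i = n - 1; i >= 0; --i) {
        wait_block_t *b = &blocks[i];
        b->object = objs[i];
        b->thread = t;
        b->key = i;
        b->type = type;
        b->next_link = t->wait_block_list;   /* chain to the thread */
        t->wait_block_list = b;
        b->list_entry = objs[i]->waiters;    /* chain to the object */
        objs[i]->waiters = b;
    }
}
```

When an object becomes signaled, walking its `waiters` list finds every affected thread, and each block's `key` tells the wait call which array position was satisfied.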

3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots

Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );

Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally { LeaveCriticalSection( &crit ); }   /* leave exactly once per enter */
}
DeleteCriticalSection( &crit );

Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );

Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
  dwTimeout == 0: no wait, check whether object is signaled
  dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for first signaled object or all objects
  When waiting for the first signaled object, the function returns WAIT_OBJECT_0 plus the index of that object
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
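
The difference between waiting for "any" and for "all" objects can be modeled without any Windows headers. The sketch below is a stand-in, not the Win32 implementation: `poll_objects`, `MODEL_WAIT_OBJECT_0` and `MODEL_WAIT_TIMEOUT` are hypothetical names, the objects are reduced to an array of signaled flags, and the function behaves like a wait with dwTimeout == 0 (it only checks, it never blocks).

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_WAIT_OBJECT_0 0      /* stand-in for WAIT_OBJECT_0 */
#define MODEL_WAIT_TIMEOUT  (-1)   /* stand-in for WAIT_TIMEOUT */

/* Check a set of objects once.
 * wait_all == false: return MODEL_WAIT_OBJECT_0 + index of the first
 *                    signaled object.
 * wait_all == true:  return MODEL_WAIT_OBJECT_0 only if every object
 *                    is signaled at the same time.
 * Otherwise return MODEL_WAIT_TIMEOUT. */
int poll_objects(const bool *signaled, int count, bool wait_all) {
    assert(count > 0 && count <= 64);   /* MAXIMUM_WAIT_OBJECTS */
    if (wait_all) {
        for (int i = 0; i < count; ++i)
            if (!signaled[i]) return MODEL_WAIT_TIMEOUT;
        return MODEL_WAIT_OBJECT_0;
    }
    for (int i = 0; i < count; ++i)
        if (signaled[i]) return MODEL_WAIT_OBJECT_0 + i;
    return MODEL_WAIT_TIMEOUT;
}
```

With flags {false, true, true}, a wait-any poll reports index 1, while a wait-all poll is not satisfied until the first flag is also set.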

Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object

Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );

Mutex Example
/* counter is global, shared by all threads */
volatile int done = 0, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
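
For comparison with the Windows mutex example, the same shared-counter pattern can be written with POSIX mutexes. This is a minimal sketch; `worker` and `run_workers` are hypothetical helper names, and the limit of 16 threads is an arbitrary choice for the example.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static int counter = 0;   /* shared by all threads */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Thread body: increment the shared counter *(int *)arg times. */
void *worker(void *arg) {
    int rounds = *(int *)arg;
    for (int i = 0; i < rounds; ++i) {
        pthread_mutex_lock(&lock);     /* blocks until the mutex is free */
        ++counter;
        pthread_mutex_unlock(&lock);   /* counterpart of ReleaseMutex() */
    }
    return NULL;
}

/* Start nthreads workers, wait for all of them, return the final count. */
int run_workers(int nthreads, int rounds) {
    pthread_t tid[16];
    assert(nthreads > 0 && nthreads <= 16);
    counter = 0;
    for (int i = 0; i < nthreads; ++i)
        pthread_create(&tid[i], NULL, worker, &rounds);
    for (int i = 0; i < nthreads; ++i)
        pthread_join(tid[i], NULL);
    return counter;
}
```

Because every increment happens while the mutex is held, `run_workers(4, 10000)` always yields 40000; without the lock/unlock pair the result could be lower.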

Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
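
The counting rules can be illustrated with a small single-threaded model. It only mirrors the bookkeeping described above and does not block or synchronize anything; `sem_model_t`, `sem_is_signaled`, `sem_try_wait` and `sem_release` are hypothetical names. Refusing a release that would push the count past the maximum follows the documented behavior of ReleaseSemaphore().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { long count, max; } sem_model_t;

/* A semaphore is signaled when count > 0. */
bool sem_is_signaled(const sem_model_t *s) { return s->count > 0; }

/* Models a satisfied wait: succeeds only if signaled, and takes one unit.
 * A real wait function would block instead of returning false. */
bool sem_try_wait(sem_model_t *s) {
    if (s->count <= 0) return false;
    s->count -= 1;
    return true;
}

/* Models a release: adds n units and reports the old count through
 * previous. Fails without changing anything if max would be exceeded. */
bool sem_release(sem_model_t *s, long n, long *previous) {
    assert(n > 0);
    if (s->count + n > s->max) return false;
    if (previous != NULL) *previous = s->count;
    s->count += n;
    return true;
}
```

Starting from count 2 with a maximum of 3, two waits succeed, a third would block, and a release of 3 reports a previous count of 0.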

Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state

Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );

Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
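
A condition variable carries no state of its own, which is why there is no exact equivalent. One common way to approximate a manual-reset event is to pair the condition variable with a mutex and a boolean flag. The sketch below shows that approach; the `mr_event_*` names are hypothetical, and error checking of the pthread calls is omitted for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;   /* the state a bare condition variable lacks */
} mr_event_t;

void mr_event_init(mr_event_t *e, bool initial_state) {
    assert(e != NULL);
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent on a manual-reset event: stays signaled, wakes all waiters. */
void mr_event_set(mr_event_t *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent: back to nonsignaled. */
void mr_event_reset(mr_event_t *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; does not reset it. */
void mr_event_wait(mr_event_t *e) {
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                 /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

bool mr_event_is_set(mr_event_t *e) {
    pthread_mutex_lock(&e->lock);
    bool s = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return s;
}

void mr_event_destroy(mr_event_t *e) {
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

Once set, any number of `mr_event_wait` calls return immediately until `mr_event_reset` is called, which matches the manual-reset behavior described for Windows events.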

Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates two processes, prog1 and prog2, connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way

I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);

Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;

Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server, each over its own instance of the same named pipe
– Server can respond to client using the same instance
Pipe can be accessed over the network
– Location transparency
Convenience and connection functions

Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)

Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );

Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo

Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
WriteFile, ReadFile sequence

Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
CreateFile, WriteFile, ReadFile, CloseHandle sequence
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if client connected between the CreateNamedPipe call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios

Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, Response.Record, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", Response.Record);
CloseHandle (hNamedPipe);
return 0;

Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (Response.Record, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &Response.Record,
                (strlen(Response.Record) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile() – open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain

Locate a server via mailslot
[Figure: an application server acts as mailslot client and periodically broadcasts its status; application clients 0 … n act as mailslot servers and read it]
Mailslot servers (app client 0 … app client n)
– hMS = CreateMailslot( mailslot named "status" … ); ReadFile( hMS, &ServStat … ); then connect to server
Mailslot client (app server)
– while (…) { Sleep( … ); hMS = CreateFile( mailslot named "status" … ); … WriteFile( hMS, &StatInfo … ); }
– Message is sent periodically

Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is msg size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait

Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max msg len: 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life 意念改变生活
Interrupt Dispatching
[Figure: an interrupt arrives while user or kernel mode code is running; control passes, in kernel mode, to the interrupt dispatch routine, which calls the interrupt service routine]
Interrupt dispatch routine
– Disable interrupts
– Record machine state (trap frame) to allow resume
– Mask equal- and lower-IRQL interrupts
– Find and call appropriate ISR
– Dismiss interrupt
– Restore machine state (including mode and enabled interrupts)
Interrupt service routine
– Tell the device to stop interrupting
– Interrogate device state, start next operation on device, etc.
– Request a DPC
– Return to caller
Note: no thread or process context switch

Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
– Precedence of the interrupt with respect to other interrupts
– Different interrupt sources have different IRQLs
– Not the same as IRQ
IRQL is also a state of the processor
– Servicing an interrupt raises processor IRQL to that interrupt's IRQL
– This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL

Interrupt Precedence via IRQLs (x86)
[Figure: x86 IRQL levels, highest first]
31 High
30 Power fail
29 Interprocessor interrupt
28 Clock
27 Profile & Synch (Srv 2003)
26 … 3 Device n … Device 1
2 Dispatch/DPC
1 APC
0 Passive/Low
Levels 3 to 31: hardware interrupts
Levels 1 and 2: deferrable software interrupts
Level 0: normal thread execution

Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library – Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level

Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected to the same IRQL

Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document, but has never been used
Inter-processor interrupt
– Used to request action from other processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update system's clock, allocation of CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler – Kernprof.exe, Res Kit)

Predefined IRQLs (contd)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all; normal thread execution

IRQLs on 64-bit Systems
[Figure: IRQL levels on x64 and IA64, highest first]
x64
15 High/Profile
14 Interprocessor Interrupt/Power
13 Clock
12 Synch (Srv 2003)
… Device n … Device 1
2 Dispatch/DPC
1 APC
0 Passive/Low
IA64
15 High/Profile/Power
14 Interprocessor Interrupt
13 Clock
12 Synch (MP only)
… Device n … Device 1 (down to level 4)
3 Correctable Machine Check
2 Dispatch/DPC & Synch (UP only)
1 APC
0 Passive/Low

Interrupt Prioritization & Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device
  Can be configured with IntFilter utility in Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source
  Processors are assigned round robin for each interrupt vector
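
The two deterministic IRQL rules are simple enough to write down as code. The helpers below are an illustrative sketch with hypothetical names; the input ranges (16 IRQ lines, 256 IDT vectors) and the use of integer division for the vector rule are assumptions, not statements from the slides.

```c
#include <assert.h>

/* x86 uniprocessor rule: IRQL = 27 - IRQ.
 * Assumes the classic 16 interrupt request lines, IRQ 0 to 15. */
int irql_from_irq_x86_up(int irq) {
    assert(irq >= 0 && irq <= 15);
    return 27 - irq;
}

/* x64 and IA64 rule: IRQL = IDT vector number / 16.
 * Assumes integer division, so each group of 16 vectors shares an IRQL. */
int irql_from_vector_64bit(int vector) {
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

Under these assumptions IRQ 0 maps to IRQL 27 and IRQ 15 to IRQL 12 on an x86 uniprocessor, while vectors 0x20 through 0x2F all map to IRQL 2 on a 64-bit system.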

Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations

Flow of Interrupts

Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
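
A user-mode analogue of such a spinlock can be built from a single atomic flag. The sketch below uses C11 `atomic_flag` and pthreads purely for illustration; it is not how KSPIN_LOCK is implemented, and `spinlock_t`, `spin_acquire`, `spin_release`, `spin_worker` and `spin_demo` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

typedef struct { atomic_flag held; } spinlock_t;   /* one bit of state */

void spin_init(spinlock_t *l) { atomic_flag_clear(&l->held); }

void spin_acquire(spinlock_t *l) {
    /* Atomic test-and-set returns the previous value; keep spinning
     * while the flag was already set by another owner. */
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;   /* busy-wait */
}

void spin_release(spinlock_t *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

spinlock_t demo_lock;
long demo_total;   /* protected by demo_lock */

void *spin_worker(void *arg) {
    long rounds = *(long *)arg;
    for (long i = 0; i < rounds; ++i) {
        spin_acquire(&demo_lock);
        ++demo_total;              /* critical section */
        spin_release(&demo_lock);
    }
    return NULL;
}

/* Run nthreads workers against one spinlock and return the total. */
long spin_demo(int nthreads, long rounds) {
    pthread_t tid[8];
    assert(nthreads > 0 && nthreads <= 8);
    spin_init(&demo_lock);
    demo_total = 0;
    for (int i = 0; i < nthreads; ++i)
        pthread_create(&tid[i], NULL, spin_worker, &rounds);
    for (int i = 0; i < nthreads; ++i)
        pthread_join(tid[i], NULL);
    return demo_total;
}
```

Every waiter repeatedly performs the atomic test-and-set on the same memory cell, which is simple but makes all spinning processors compete for that one location.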

Kernel Synchronization
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
[Figure: Processor A and Processor B each run the sequence below; the spinlock guards the DPC queue, and removing a DPC from the queue is the critical section]
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin
    remove DPC from queue
end
release_spinlock(DPC)

Queued Spinlocks
Problem: checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
– Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the flag of the first waiting processor in the queue is modified
– Exactly one processor is being signaled
– Pre-determined wait order
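
The idea of spinning on a processor-local flag can be illustrated with an MCS-style queue lock. The slides do not say which algorithm Windows uses, so this is a stand-in written with C11 atomics; `qnode_t`, `qlock_t` and the `qlock_*` functions are hypothetical names, and the default sequentially consistent ordering is used for simplicity rather than speed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct qnode {
    _Atomic(struct qnode *) next;   /* successor in the wait queue */
    atomic_bool locked;             /* local flag this waiter spins on */
} qnode_t;

typedef struct { _Atomic(qnode_t *) tail; } qlock_t;   /* NULL tail: lock is free */

void qlock_init(qlock_t *l) { atomic_init(&l->tail, (qnode_t *)NULL); }

void qlock_acquire(qlock_t *l, qnode_t *me) {
    assert(me != NULL);
    atomic_store(&me->next, (qnode_t *)NULL);
    atomic_store(&me->locked, true);
    qnode_t *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);               /* link behind predecessor */
        while (atomic_load(&me->locked))             /* spin on own flag only */
            ;
    }
}

void qlock_release(qlock_t *l, qnode_t *me) {
    qnode_t *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode_t *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, (qnode_t *)NULL))
            return;                                  /* nobody waiting: lock is free */
        while ((succ = atomic_load(&me->next)) == NULL)
            ;                                        /* a waiter is still linking in */
    }
    atomic_store(&succ->locked, false);              /* signal exactly one waiter */
}
```

Each caller supplies its own `qnode_t`, so a waiting processor polls only its own `locked` flag, and the release hands the lock to the next node in arrival order.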

SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU
  Allows thread state transitions in parallel

SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated
  Charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks)
  Doesn't use spinlocks when no contention
  Used for object manager and address windowing extensions (AWE) related locks

Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once
  "All": all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include optional timeout argument
– Waiting threads consume no CPU time

Waiting
Waitable objects include
– Events (may be auto-reset or manual-reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)

Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object, and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable

Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
[Figure: interaction with thread scheduling. Thread states: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Labeled transitions: create and initialize thread object (Initialized); thread waits on an object handle (Waiting); wait is complete when the object is set to signaled state]

Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
5959
Interaction bet Synchronization amp Dispatching Dispatcher re-schedules new thread ndash it may preempt running thread it it has lower priority and issues software interrupt to initiate context switch
If no processor can be preempted the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
Dispatcher object
System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
Owning thread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (kernel mode)
nonsignaled signaled
Owning thread or otherthread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (exported to user mode)
nonsignaled signaled
One thread releases thesemaphore freeing a resource
A thread acquires the semaphoreMore resources are not available
Kernel resumes one or more waiting threads
Semaphore
6161
A thread reinitializesthe thread object
What signals an object (contd)
Dispatcher object System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
A thread sets the event
Kernel resumes one or more threads
Kernel resumes one or more waiting threads
Event
nonsignaled signaled
Dedicated thread sets oneevent in the event pair
Kernel resumes theother dedicated thread
Kernel resumes waitingdedicated thread
Event pair
nonsignaled signaled
Timer expires
A thread (re) initializes the timer
Kernel resumes all waiting threads
Timer
nonsignaled signaled
Thread terminates
Kernel resumes all waiting threads
Thread
6262
Further Reading
Mark E Russinovich and David A Solomon Microsoft Windows Internals 4th Edition Microsoft Press 2004
Chapter 3 - System Mechanisms
ndash Trap Dispatching (pp 85 ff)
ndash Synchronization (pp 149 ff)
ndash Kernel Event Tracing (pp 175 ff)
6363
33 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues amp Dispatcher Objects
6464
Used to defer processing from higher (device) interrupt level to a lower (dispatch) levelndash Also used for quantum end and timer expiration
Driver (usually ISR) queues requestndash One queue per CPU DPCs are normally queued to the current processor but can be targeted to other CPUs
ndash Executes specified procedure at dispatch IRQL (or ldquodispatch levelrdquo also ldquoDPC levelrdquo) when all higher-IRQL work (interrupts) completed
ndash Maximum times recommended ISR 10 usec DPC 25 usec
See httpwwwmicrosoftcomwhdcdriverperformmmdrvmspx
Deferred Procedure Calls (DPCs)
6565
queue head DPC object DPC object DPC object
Deferred Procedure Calls (DPCs)
6666
DPC
Delivering a DPC
DPC routines can call kernel functionsbut canlsquot call system services generatepage faults or create or wait on objects
DPC routines canlsquotassume whatprocess addressspace is currentlymapped
Interruptdispatch table
high
Power failure
DispatchDPC
APC
Low
DPC
1 Timer expires kernelqueues DPC that willrelease all waiting threadsKernel requests SW int
DPCDPC
DPC queue
2 DPC interrupt occurswhen IRQL drops belowdispatchDPC level
dispatcher
3 After DPC interruptcontrol transfers tothread dispatcher
4 Dispatcher executes each DPCroutine in DPC queue
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user threadndash APC routines can acquire resources (objects) incur page faultscall system services
APC queue is thread-specific
User mode amp kernel mode APCsndash Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in a thread's address space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend or terminate themselves (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless the thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when the thread is in an "alertable wait"
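The deliverability rules for the three APC kinds reduce to a small predicate. The sketch below is a simplified model and not a kernel interface: `apc_kind`, `thread_state` and `apc_deliverable` are hypothetical names, and the three fields are assumed to capture everything the rules on these slides depend on.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { APC_SPECIAL_KERNEL, APC_ORDINARY_KERNEL, APC_USER } apc_kind;

/* Hypothetical snapshot of the target thread. */
typedef struct {
    int  irql;                 /* current IRQL of the thread */
    bool kernel_apcs_disabled; /* e.g. after KeEnterCriticalRegion */
    bool in_alertable_wait;    /* e.g. blocked in SleepEx or WaitForMultipleObjectsEx */
} thread_state;

/* Can an APC of the given kind be delivered to the thread right now? */
bool apc_deliverable(apc_kind kind, const thread_state *t) {
    assert(t->irql >= 0);
    switch (kind) {
    case APC_SPECIAL_KERNEL:   /* blocked only by IRQL 1 or above */
        return t->irql < 1;
    case APC_ORDINARY_KERNEL:  /* needs IRQL 0 and kernel APCs not disabled */
        return t->irql == 0 && !t->kernel_apcs_disabled;
    case APC_USER:             /* needs an alertable wait */
        return t->in_alertable_wait;
    }
    return false;
}
```

A thread inside a critical region still receives special kernel APCs but not ordinary ones, and a thread that never enters an alertable wait never runs its user-mode APCs.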
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel (K) and user (U), each linking APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to the thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to the thread's kernel time
- If IRQL > 2, charge to interrupt time
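The clock ISR's charging decision can be written as one function. This is an illustrative model only; `charge` and `charge_tick` are hypothetical names, and the `was_user_mode` flag is an assumption standing in for however the ISR learns the interrupted mode.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge;

/* Decide which counter one clock tick is charged to, given the IRQL that
   was interrupted, whether a DPC was being processed, and whether the
   processor was in user mode. */
charge charge_tick(int irql, bool in_dpc, bool was_user_mode) {
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return was_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```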
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread or process
- Process Explorer shows these as separate processes (they are not really processes)
- Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add the Context Switch Delta column
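A small simulation makes the sampling quirk concrete. It assumes a simplified model in which each clock tick charges one full interval to whatever is running at that instant; `run_interval`, `charged_ms` and `actual_ms` are hypothetical names.

```c
#include <assert.h>

/* One stretch of execution of the observed thread: [start_ms, end_ms). */
typedef struct { int start_ms, end_ms; } run_interval;

/* Time charged by tick sampling: every tick (at tick_ms, 2*tick_ms, ...)
   that fires while the thread is running charges one full interval. */
int charged_ms(const run_interval *runs, int n, int tick_ms, int total_ms) {
    int charged = 0;
    assert(tick_ms > 0);
    for (int t = tick_ms; t <= total_ms; t += tick_ms)
        for (int i = 0; i < n; i++)
            if (runs[i].start_ms <= t && t < runs[i].end_ms)
                charged += tick_ms;
    return charged;
}

/* Time the thread really spent on the CPU. */
int actual_ms(const run_interval *runs, int n) {
    int sum = 0;
    for (int i = 0; i < n; i++)
        sum += runs[i].end_ms - runs[i].start_ms;
    return sum;
}
```

With a 10 msec tick, a thread that runs during 1-9, 11-19 and 21-29 msec uses 24 msec of CPU but is charged nothing, while a thread that runs only during 9-11 msec is charged a full 10 msec.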
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
[Figure: dispatcher object layout: header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: each thread object points to its WaitBlockList; every wait block holds List entry, Object, Thread, Key, Type and Next link; each dispatcher object (Size, Type, State, Wait list head, object-type-specific data) links the wait blocks of the threads waiting on it]
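The relationship between wait blocks, the "any"/"all" type and the key can be sketched as follows. The structures are a deliberately reduced, hypothetical model (the real layouts are in the DDK header); `wait_satisfied` and its return convention are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { WAIT_ANY, WAIT_ALL } wait_type;

/* Reduced stand-in for the common dispatcher header: only the state. */
typedef struct { bool signaled; } dispatcher_header;

/* One wait block per object passed to the wait call. */
typedef struct wait_block {
    dispatcher_header *object;  /* what the thread is waiting for */
    int key;                    /* position in the argument list */
    wait_type type;             /* same for all blocks of one call */
    struct wait_block *next;    /* chain of this thread's wait blocks */
} wait_block;

/* Returns the key of the first signaled object for WAIT_ANY, 0 for a
   satisfied WAIT_ALL, and -1 if the wait is not satisfied yet. */
int wait_satisfied(const wait_block *list) {
    assert(list != NULL);
    if (list->type == WAIT_ANY) {
        for (const wait_block *b = list; b; b = b->next)
            if (b->object->signaled)
                return b->key;
        return -1;
    }
    for (const wait_block *b = list; b; b = b->next)
        if (!b->object->signaled)
            return -1;
    return 0;
}
```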
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* runs on every exit path, so the section is left exactly once */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
ndash Processes
ndash Threads
ndash Files
ndash Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
  dwTimeout == 0: no wait, just check whether the object is signaled
  dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to the array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects
  When waiting for any object, the function returns WAIT_OBJECT_0 plus the index of the first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to the non-signaled state after completing wait functions
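The return value of the wait functions packs the outcome and, for multi-object waits, the array index into one DWORD. The sketch below decodes it in portable C. The `X_`-prefixed constants are local copies of the documented Windows SDK values so that the block compiles without windows.h; `wait_result` and `decode_wait` are hypothetical helper names, not part of the API.

```c
#include <assert.h>
#include <stdint.h>

/* Local copies of the Windows SDK values (assumption: renamed to avoid
   clashing with the real macros when windows.h is present). */
#define X_WAIT_OBJECT_0    0x00000000u
#define X_WAIT_ABANDONED_0 0x00000080u
#define X_WAIT_TIMEOUT     0x00000102u
#define X_WAIT_FAILED      0xFFFFFFFFu

typedef enum { R_SIGNALED, R_ABANDONED, R_TIMEOUT, R_FAILED } wait_result;

/* Classify a wait return value for a call on `count` handles.
   For R_SIGNALED and R_ABANDONED, *index receives the array position. */
wait_result decode_wait(uint32_t ret, uint32_t count, uint32_t *index) {
    assert(count >= 1 && count <= 64);   /* MAXIMUM_WAIT_OBJECTS */
    if (ret < X_WAIT_OBJECT_0 + count) {
        *index = ret - X_WAIT_OBJECT_0;
        return R_SIGNALED;
    }
    if (ret >= X_WAIT_ABANDONED_0 && ret < X_WAIT_ABANDONED_0 + count) {
        *index = ret - X_WAIT_ABANDONED_0;
        return R_ABANDONED;
    }
    if (ret == X_WAIT_TIMEOUT)
        return R_TIMEOUT;
    return R_FAILED;                     /* includes X_WAIT_FAILED */
}
```

In real code the same comparisons would be written against WAIT_OBJECT_0, WAIT_ABANDONED_0, WAIT_TIMEOUT and WAIT_FAILED directly.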
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each successful wait decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value
- ReleaseSemaphore() returns the old semaphore count (through lpPreviousCount)
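The counting rules can be checked with a tiny single-threaded model. It is only a sketch of the semantics, not of the Windows implementation: `sem_model`, `sem_is_signaled`, `sem_try_wait` and `sem_release` are hypothetical names, and blocking is left out (the wait corresponds to a zero timeout).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { long count, max; } sem_model;

/* Signaled means at least one unit of the resource is available. */
bool sem_is_signaled(const sem_model *s) { return s->count > 0; }

/* Zero-timeout wait: take one unit if the semaphore is signaled. */
bool sem_try_wait(sem_model *s) {
    if (s->count == 0)
        return false;
    s->count--;
    return true;
}

/* Models ReleaseSemaphore: add n units and report the previous count.
   Fails, leaving the count unchanged, if n is not positive or the
   maximum would be exceeded. */
bool sem_release(sem_model *s, long n, long *previous) {
    assert(s->count >= 0 && s->count <= s->max);
    if (n <= 0 || n > s->max - s->count)
        return false;
    if (previous != NULL)
        *previous = s->count;
    s->count += n;
    return true;
}
```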
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state
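The difference between manual-reset and auto-reset events shows up in how many waiters one signal releases and in the state the event is left in. The model below is a single-threaded sketch of those rules; `event_model`, `event_set` and `event_pulse` are hypothetical names and the waiter count stands in for real blocked threads.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct {
    bool manual_reset;  /* fManualReset at creation */
    bool signaled;      /* current state */
    int  waiters;       /* threads currently blocked on the event */
} event_model;

/* Models SetEvent; returns the number of waiting threads released. */
int event_set(event_model *e) {
    int released;
    assert(e->waiters >= 0);
    if (e->manual_reset) {          /* everyone goes, event stays signaled */
        released = e->waiters;
        e->waiters = 0;
        e->signaled = true;
    } else if (e->waiters > 0) {    /* one thread goes, event resets itself */
        released = 1;
        e->waiters--;
        e->signaled = false;
    } else {                        /* nobody waiting: stays signaled */
        released = 0;
        e->signaled = true;
    }
    return released;
}

/* Models PulseEvent: release what can be released now, then reset. */
int event_pulse(event_model *e) {
    int released = event_set(e);
    e->signaled = false;
    return released;
}
```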
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe size in bytes; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe; handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using pipes with the same name
- Server can respond to a client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (the "." stands for the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd.)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
- Combines a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
- Combines CreateFile, WriteFile, ReadFile, CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
Locate a server via mailslot
[Figure: application clients 0 to n act as mailslot servers; the application server acts as the mailslot client and sends its status message periodically]
Mailslot servers (App client 0 ... App client n)
hMS = CreateMailslot( "\\.\mailslot\status" ... ); ReadFile( hMS, &ServStat ... ); /* connect to server */
Mailslot client (App server)
while (...) { Sleep( ... ); hMS = CreateFile( "\\*\mailslot\status" ... ); ... WriteFile( hMS, &StatInfo ... ); }
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieve handle for local mailslot
- \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life 意念改变生活
Interrupt Precedence via IRQLs (x86)
IRQL = Interrupt Request Level
- Precedence of the interrupt with respect to other interrupts
- Different interrupt sources have different IRQLs
- Not the same as IRQ
IRQL is also a state of the processor
- Servicing an interrupt raises processor IRQL to that interrupt's IRQL
- This masks subsequent interrupts at equal and lower IRQLs
User mode is limited to IRQL 0
No waits or page faults at IRQL >= DISPATCH_LEVEL
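Interrupt Precedence via IRQLs (x86)
IRQL  Name                         Category
31    High                         Hardware interrupts
30    Power fail                   Hardware interrupts
29    Interprocessor interrupt     Hardware interrupts
28    Clock                        Hardware interrupts
...   Profile & Synch (Srv 2003)   Hardware interrupts
...   Device 1                     Hardware interrupts
2     Dispatch/DPC                 Deferrable software interrupts
1     APC                          Deferrable software interrupts
0     Passive/Low                  Normal thread execution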
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library, the Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn ISR on/off
Interrupt objects can synchronize access to ISR data
- Multiple instances of an ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected with one IRQL
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update the system's clock and to allocate CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler, Kernprof.exe, Res Kit)
Predefined IRQLs (contd.)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all; normal thread execution
IRQLs on 64-bit Systems
IRQL  x64                              IA64
15    High/Profile                     High/Profile/Power
14    Interprocessor interrupt/Power   Interprocessor interrupt
13    Clock                            Clock
12    Synch (Srv 2003)                 Synch (MP only)
...   Device n                         Device n
4     ...                              Device 1
3     Device 1                         Correctable Machine Check
2     Dispatch/DPC                     Dispatch/DPC & Synch (UP only)
1     APC                              APC
0     Passive/Low                      Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device; can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the I/O APIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
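The two arithmetic mappings, together with the masking rule that servicing an interrupt masks interrupts at equal and lower IRQLs, fit in a few lines. The function names `irql_x86_up`, `irql_x64` and `is_masked` are hypothetical, and the accepted argument ranges are assumptions chosen only to keep the results within the valid IRQL range.

```c
#include <assert.h>
#include <stdbool.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_x86_up(int irq) {
    assert(irq >= 0 && irq <= 27);
    return 27 - irq;
}

/* x64 and IA64 rule from the slide: IRQL = IDT vector number / 16. */
int irql_x64(int vector) {
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}

/* An interrupt is masked while the processor runs at an equal or higher IRQL. */
bool is_masked(int interrupt_irql, int processor_irql) {
    return interrupt_irql <= processor_irql;
}
```

For instance, IRQ 1 on an x86 uniprocessor maps to IRQL 26, and vector 0xD1 on x64 maps to IRQL 13.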
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when the kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
Kernel Synchronization
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
[Figure: Processor A and Processor B each run the sequence below against the spinlock that guards the shared DPC queue]
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin /* critical section */
    remove DPC from queue
end
release_spinlock(DPC)
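A test-and-set spinlock of this kind can be sketched with C11 atomics. This is a user-mode illustration of the idea, not the kernel's implementation (which also raises IRQL); `spinlock`, `dpc_queue`, `remove_dpc` and the `spin_*` functions are hypothetical names, and the queue is reduced to a length counter.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* The lock is a single atomic cell: set means owned, clear means free. */
typedef struct { atomic_flag held; } spinlock;

void spin_init(spinlock *l) { atomic_flag_clear(&l->held); }

/* One atomic test-and-set; true means the caller now owns the lock. */
bool spin_try_acquire(spinlock *l) {
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* The "do acquire until SUCCESS" loop from the figure. */
void spin_acquire(spinlock *l) {
    while (!spin_try_acquire(l)) {
        /* busy-wait until the owner releases the lock */
    }
}

void spin_release(spinlock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* A global structure guarded by the lock, reduced to a counter. */
typedef struct { spinlock lock; int length; } dpc_queue;

/* Critical section: remove one entry if the queue is not empty. */
bool remove_dpc(dpc_queue *q) {
    bool removed = false;
    spin_acquire(&q->lock);
    assert(q->length >= 0);
    if (q->length > 0) {
        q->length--;
        removed = true;
    }
    spin_release(&q->lock);
    return removed;
}
```

Every failed attempt in `spin_acquire` is another atomic read-modify-write on the shared cell, which is the source of contention that queued spinlocks are meant to avoid.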
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
- Thus the busy-wait loop requires no access to the memory bus
When releasing the lock, the first waiting processor's flag is modified
- Exactly one processor is being signaled
- Pre-determined wait order
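One well-known way to build such a lock is the MCS queue lock, sketched below with C11 atomics. Whether Windows uses exactly this algorithm is not stated on the slides, so treat it as an illustration of the idea only; `qnode`, `qlock` and the `qlock_*` functions are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiting processor; must_wait is the processor-local flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool must_wait;
} qnode;

/* The lock itself only remembers the tail of the waiter queue. */
typedef struct { _Atomic(qnode *) tail; } qlock;

void qlock_init(qlock *l) { atomic_init(&l->tail, (qnode *)NULL); }

/* Join the queue. Returns true if the lock was free and is now owned. */
bool qlock_enqueue(qlock *l, qnode *me) {
    assert(me != NULL);
    atomic_store(&me->next, (qnode *)NULL);
    atomic_store(&me->must_wait, true);
    qnode *prev = atomic_exchange(&l->tail, me);
    if (prev == NULL) {
        atomic_store(&me->must_wait, false);
        return true;
    }
    atomic_store(&prev->next, me);   /* link behind the previous waiter */
    return false;
}

void qlock_acquire(qlock *l, qnode *me) {
    if (!qlock_enqueue(l, me))
        while (atomic_load(&me->must_wait)) {
            /* spin on our own flag only */
        }
}

/* Hand the lock to the next waiter in queue order, if there is one. */
void qlock_release(qlock *l, qnode *me) {
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, (qnode *)NULL))
            return;                  /* nobody was waiting */
        while ((succ = atomic_load(&me->next)) == NULL) {
            /* a waiter is in the middle of linking itself in */
        }
    }
    atomic_store(&succ->must_wait, false);   /* signal exactly one processor */
}
```

Each waiter spins only on its own `must_wait` flag, and the release touches just the successor's flag, which gives the single wake-up and the fixed wait order listed above.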
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length of time they are held
- Scheduling database now per-CPU; allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once
  "All": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include an optional timeout argument
- Waiting threads consume no CPU time
Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
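Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
Interaction with thread scheduling
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition and Terminated. A thread object is created and initialized; when the thread waits on an object handle it enters the Waiting state; when the object is set to the signaled state the wait is complete and the thread leaves the Waiting state]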
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
Dispatcher re-schedules the new thread; it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object
Mutex (kernel mode)
- nonsignaled to signaled: owning thread releases mutex
- signaled to nonsignaled: resumed thread acquires mutex
- Effect of signaled state on waiting threads: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled to signaled: owning thread or other thread releases mutex
- signaled to nonsignaled: resumed thread acquires mutex
- Effect of signaled state on waiting threads: kernel resumes one waiting thread
Semaphore
- nonsignaled to signaled: one thread releases the semaphore, freeing a resource
- signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect of signaled state on waiting threads: kernel resumes one or more waiting threads
What signals an object (contd.)
Event
- nonsignaled to signaled: a thread sets the event
- signaled to nonsignaled: kernel resumes one or more threads
- Effect of signaled state on waiting threads: kernel resumes one or more waiting threads
Event pair
- nonsignaled to signaled: dedicated thread sets one event in the event pair
- signaled to nonsignaled: kernel resumes the other dedicated thread
- Effect of signaled state on waiting threads: kernel resumes waiting dedicated thread
Timer
- nonsignaled to signaled: timer expires
- signaled to nonsignaled: a thread (re)initializes the timer
- Effect of signaled state on waiting threads: kernel resumes all waiting threads
Thread
- nonsignaled to signaled: thread terminates
- signaled to nonsignaled: a thread reinitializes the thread object
- Effect of signaled state on waiting threads: kernel resumes all waiting threads
Further Reading
Mark E Russinovich and David A Solomon Microsoft Windows Internals 4th Edition Microsoft Press 2004
Chapter 3 - System Mechanisms
ndash Trap Dispatching (pp 85 ff)
ndash Synchronization (pp 149 ff)
ndash Kernel Event Tracing (pp 175 ff)
6363
33 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues amp Dispatcher Objects
6464
Used to defer processing from higher (device) interrupt level to a lower (dispatch) levelndash Also used for quantum end and timer expiration
Driver (usually ISR) queues requestndash One queue per CPU DPCs are normally queued to the current processor but can be targeted to other CPUs
ndash Executes specified procedure at dispatch IRQL (or ldquodispatch levelrdquo also ldquoDPC levelrdquo) when all higher-IRQL work (interrupts) completed
ndash Maximum times recommended ISR 10 usec DPC 25 usec
See httpwwwmicrosoftcomwhdcdriverperformmmdrvmspx
Deferred Procedure Calls (DPCs)
6565
queue head DPC object DPC object DPC object
Deferred Procedure Calls (DPCs)
6666
DPC
Delivering a DPC
DPC routines can call kernel functionsbut canlsquot call system services generatepage faults or create or wait on objects
DPC routines canlsquotassume whatprocess addressspace is currentlymapped
Interruptdispatch table
high
Power failure
DispatchDPC
APC
Low
DPC
1 Timer expires kernelqueues DPC that willrelease all waiting threadsKernel requests SW int
DPCDPC
DPC queue
2 DPC interrupt occurswhen IRQL drops belowdispatchDPC level
dispatcher
3 After DPC interruptcontrol transfers tothread dispatcher
4 Dispatcher executes each DPCroutine in DPC queue
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user threadndash APC routines can acquire resources (objects) incur page faultscall system services
APC queue is thread-specific
User mode amp kernel mode APCsndash Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread spacendash Wait for asynchronous IO operation
ndash Emulate delivery of POSIX signals
ndash Make threads suspendterminate itself (env subsystems)
APCs are delivered when thread is in alertable wait statendash WaitForMultipleObjectsEx() SleepEx()
6969
Special kernel APCs
ndash Run in kernel mode at IRQL 1
ndash Always deliverable unless thread is already at IRQL 1 or above
ndash Used for IO completion reporting from ldquoarbitrary thread contextrdquo
ndash Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
7070
ldquoOrdinaryrdquo kernel APCs
ndash Always deliverable if at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
ndash Used for IO completion callback routines (see ReadFileEx WriteFileEx) also QueueUserApc
ndash Only deliverable when thread is in ldquoalertable waitrdquo
Asynchronous Procedure Calls (APCs)
7171
ThreadObject
K
U
APC objects
Asynchronous Procedure Calls (APCs)
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
ndash If IRQLlt2 charge to threadrsquos user or kernel time
ndash If IRQL=2 and processing a DPC charge to DPC time
ndash If IRQL=2 amp not processing a DPC charge to thread kernel time
ndash If IRQLgt2 charge to interrupt time
7373
IRQLs and CPU Time Accounting
Since time servicing interrupts are NOT charged to interrupted thread if system is busy but no process appears to be running must be due to interrupt-related activity
ndash Note time at IRQL 2 or more is charged to the current threadrsquos quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any threadprocess
ndash Process Explorer shows these as separate processesnot really processes
ndash Context switches for these are really of interrupts amp DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
ndash Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals NOT accounted
ndash Eg one or more threads run and enter a wait state before clock fires
ndash Thus threads may run but never get charged
View context switch activity with Process Explorer
ndash Add Context Switch Delta column
7676
For waiting threads user-mode utilities only display the wait reason
Example pstat
Looking at Waiting Threads
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout: a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
– Some exist exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "Signaled" vs. "nonsignaled"
– When signaled, a wait on the object is satisfied
– Different object types differ in terms of what changes their state
– Wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects, each with a WaitBlockList, point to their wait blocks (fields: List entry, Object, Thread, Key, Type, Next link); each wait block is also linked into the wait list head of a dispatcher object (Size, Type, State, Wait list head, object-type-specific data)]
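To make the roles of Key, Type and Next link concrete, here is a deliberately simplified model of a chain of wait blocks and of the check that decides whether the wait is resolved. The struct layout and all names (wait_block, wait_resolved, WB_ANY, WB_ALL) are assumptions for illustration; the real kernel structures carry more fields.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { WB_ANY, WB_ALL } wait_type;

typedef struct dispatcher_object { bool signaled; } dispatcher_object;

typedef struct wait_block {
    dispatcher_object *object;   /* what the thread waits for */
    int key;                     /* argument position in the wait call */
    wait_type type;              /* wait for "any" or for "all" */
    struct wait_block *next;     /* next block of the same wait call */
} wait_block;

/* Returns the key that resolves the wait, or -1 if the thread keeps waiting.
   For WB_ALL the key of the first block is returned once every object is
   signaled at the same time. */
int wait_resolved(const wait_block *list)
{
    assert(list != NULL);
    bool all = true;
    for (const wait_block *b = list; b != NULL; b = b->next) {
        if (b->object->signaled && b->type == WB_ANY)
            return b->key;
        if (!b->object->signaled)
            all = false;
    }
    return (list->type == WB_ALL && all) ? list->key : -1;
}
```

The model keeps only the object-to-thread direction; the second linkage shown in the figure, from each dispatcher object's wait list head to the blocks waiting on it, is left out.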
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once per enter, even if the body raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
– dwTimeout == 0: no wait, check whether object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for first signaled object or all objects
– Function returns WAIT_OBJECT_0 plus the index of the first signaled object
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
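Callers usually have to turn the DWORD result of a wait call back into "which handle" and "what happened". The sketch below shows that decoding in portable C. The SK_ constants repeat the documented numeric values of WAIT_OBJECT_0, WAIT_ABANDONED_0, WAIT_TIMEOUT and MAXIMUM_WAIT_OBJECTS so the example builds without windows.h; decode_wait and wait_outcome are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Documented Windows values, redefined locally for the sketch. */
#define SK_WAIT_OBJECT_0        0x00000000u
#define SK_WAIT_ABANDONED_0     0x00000080u
#define SK_WAIT_TIMEOUT         0x00000102u
#define SK_MAXIMUM_WAIT_OBJECTS 64u

typedef enum { W_SIGNALED, W_ABANDONED, W_TIMEOUT, W_FAILED } wait_outcome;

/* Classify a WaitForMultipleObjects-style return value. *index receives
   the handle position for W_SIGNALED and W_ABANDONED. */
wait_outcome decode_wait(uint32_t ret, uint32_t count, uint32_t *index)
{
    assert(count >= 1 && count <= SK_MAXIMUM_WAIT_OBJECTS && index != 0);
    if (ret < SK_WAIT_OBJECT_0 + count) {
        *index = ret - SK_WAIT_OBJECT_0;
        return W_SIGNALED;
    }
    if (ret >= SK_WAIT_ABANDONED_0 && ret < SK_WAIT_ABANDONED_0 + count) {
        *index = ret - SK_WAIT_ABANDONED_0;
        return W_ABANDONED;
    }
    if (ret == SK_WAIT_TIMEOUT)
        return W_TIMEOUT;
    return W_FAILED;
}
```

In real code the same arithmetic would be applied to the value returned by WaitForMultipleObjects(), using the constants from windows.h.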
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else              /* mutex was abandoned */
        break;        /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() returns old semaphore count
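The counting rules above fit in a few lines of state handling. The following is a single-threaded model of the semantics only, not a usable lock: sem_model and its functions are hypothetical names. One assumption taken from the Windows documentation rather than the slide: a release that would push the count past the maximum fails and leaves the count unchanged.

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { long count, max; } sem_model;

/* A semaphore is signaled while its count is above zero. */
bool sem_is_signaled(const sem_model *s) { return s->count > 0; }

/* Non-blocking wait: succeeds and takes one unit when signaled. */
bool sem_try_wait(sem_model *s)
{
    if (!sem_is_signaled(s))
        return false;
    s->count -= 1;
    return true;
}

/* Release n units, hand back the old count, refuse to exceed max. */
bool sem_release(sem_model *s, long n, long *previous)
{
    assert(n > 0);
    if (s->count + n > s->max)
        return false;
    if (previous)
        *previous = s->count;
    s->count += n;
    return true;
}
```

Starting from count 1 and max 3, one wait succeeds, a second fails, and a release of 2 reports the previous count 0.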
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
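The difference between the two event flavors is easiest to see by counting how many waiters one call releases. This is a bookkeeping model under simplifying assumptions (waiters are just a counter, no real threads); event_model, event_set and event_pulse are hypothetical names that mimic SetEvent() and PulseEvent().

```c
#include <assert.h>
#include <stdbool.h>

typedef struct { bool manual_reset; bool signaled; int waiters; } event_model;

/* SetEvent-like: returns how many waiting threads are released. */
int event_set(event_model *e)
{
    assert(e->waiters >= 0);
    int released;
    if (e->manual_reset) {               /* releases everyone, stays signaled */
        released = e->waiters;
        e->signaled = true;
    } else {                             /* releases one thread, then resets */
        released = e->waiters > 0 ? 1 : 0;
        e->signaled = (released == 0);   /* stays set until a thread consumes it */
    }
    e->waiters -= released;
    return released;
}

/* PulseEvent-like: release whoever waits right now, end up nonsignaled. */
int event_pulse(event_model *e)
{
    assert(e->waiters >= 0);
    int released = e->manual_reset ? e->waiters : (e->waiters > 0 ? 1 : 0);
    e->waiters -= released;
    e->signaled = false;
    return released;
}
```

With three waiters, setting a manual-reset event releases all three, while setting an auto-reset event releases exactly one and leaves two waiting.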
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe; handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same named pipe
– Server can respond to client using the same instance
Pipe can be accessed over the network
– Location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. means local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
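Because every backslash in a pipe name must be doubled inside a C string literal, getting lpszPipeName right is a common stumbling block. The helper below, a hypothetical make_pipe_name that is not part of the Windows API, builds the \\server\pipe\name form in a caller-supplied buffer; passing "." as the server yields the local form required by CreateNamedPipe.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Writes \\server\pipe\name into buf; server "." means the local machine.
   Returns 0 on success, -1 if the buffer is too small. */
int make_pipe_name(char *buf, size_t size, const char *server, const char *name)
{
    assert(buf && server && name && strlen(name) > 0);
    int n = snprintf(buf, size, "\\\\%s\\pipe\\%s", server, name);
    return (n >= 0 && (size_t)n < size) ? 0 : -1;
}
```

A server would pass the result for "." to CreateNamedPipe, while a client may pass either the local form or a name that carries the server's machine name to CreateFile.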
Named Pipes (contd.)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
– Combines a WriteFile, ReadFile sequence
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
– Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if a client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server.\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
Locate a server via mailslot
[Figure: application clients 0 to n act as mailslot servers: each calls hMS = CreateMailslot() on a mailslot named "status", then ReadFile(hMS, &ServStat), and finally connects to the server. The application server acts as mailslot client: in a loop it sleeps, opens the "status" mailslot with hMS = CreateFile() and calls WriteFile(hMS, &StatInfo), so the message is sent periodically.]
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is max. msg size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max. msg length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life (意念改变生活)
Interrupt Precedence via IRQLs (x86)
Hardware interrupts
– 31: High
– 30: Power fail
– 29: Interprocessor Interrupt
– 28: Clock
– 27: Profile & Synch (Srv 2003)
– 3 to 26: Device 1 … Device n
Deferrable software interrupts
– 2: Dispatch/DPC
– 1: APC
Normal thread execution
– 0: Passive/Low
Interrupt processing
Interrupt dispatch table (IDT)
– Links to interrupt service routines
x86
– Interrupt controller interrupts processor (single line)
– Processor queries for interrupt vector; uses vector as index to IDT
Alpha
– PAL code (Privileged Architecture Library, the Alpha BIOS) determines interrupt vector, calls kernel
– Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn on/off ISR
Interrupt objects can synchronize access to ISR data
– Multiple instances of ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected at the same IRQL
Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document, but has never been used
Inter-processor interrupt
– Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update system's clock, allocation of CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler: Kernprof.exe, Res Kit)
Predefined IRQLs (contd.)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that kernel and device drivers generate
Passive
– No interrupt level at all; normal thread execution
IRQLs on 64-bit Systems
IRQL   x64                              IA64
15     High/Profile                     High/Profile/Power
14     Interprocessor Interrupt/Power   Interprocessor Interrupt
13     Clock                            Clock
12     Synch (Srv 2003)                 Synch (MP only)
…      Device n                         Device n
4      …                                Device 1
3      Device 1                         Correctable Machine Check
2      Dispatch/DPC                     Dispatch/DPC & Synch (UP only)
1      APC                              APC
0      Passive/Low                      Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; can be configured with IntFilter utility in Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
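The two arithmetic rules above are small enough to state as code. The functions below are plain illustrations with made-up names. The IRQ bound in the first one is an assumption chosen so the result stays at device level (3 or above), and the x86 MP "bucketized" case is not modeled.

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_from_irq_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 24);   /* bound assumed for this sketch */
    return 27 - irq;
}

/* x64 and IA64 rule from the slide: IRQL = IDT vector number / 16. */
int irql_from_vector_x64(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;              /* integer division drops the low 4 bits */
}
```

For instance, IRQ 8 on an x86 uniprocessor maps to IRQL 19, and IDT vector 0xD1 on x64 maps to IRQL 13.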
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
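The "single data cell plus atomic test-and-modify" idea can be tried out in user mode with C11 atomics. This is not the kernel's KSPIN_LOCK implementation, which also raises IRQL; it is a minimal sketch of the one-owner-at-a-time algorithm, with hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_flag held; } spinlock;   /* one bit of state */

void spin_init(spinlock *l)
{
    assert(l);
    atomic_flag_clear(&l->held);                 /* lock starts out free */
}

/* One atomic test-and-set: true if this caller became the owner. */
bool spin_try_acquire(spinlock *l)
{
    return !atomic_flag_test_and_set(&l->held);
}

void spin_acquire(spinlock *l)
{
    while (!spin_try_acquire(l)) {
        /* busy-wait until the current owner releases the lock */
    }
}

void spin_release(spinlock *l)
{
    atomic_flag_clear(&l->held);
}
```

Every failed attempt in spin_acquire is another atomic write to the shared cell, which is the bus contention that queued spinlocks are designed to avoid.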
5050
do acquire_spinlock(DPC)until (SUCCESS)
begin remove DPC from queueend
release_spinlock(DPC)
do acquire_spinlock(DPC)until (SUCCESS)
begin remove DPC from queueend
release_spinlock(DPC)
Kernel Synchronization
Processor BProcessor A
Critical section
spinlock
DPC DPC
A spinlock is a locking primitive associatedwith a global data structure such as the DPC queue
Queued Spinlocks
Problem: checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
– Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the flag of the first waiting processor is modified
– Exactly one processor is being signaled
– Pre-determined wait order
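A well-known user-mode analogue of this scheme is the MCS queue lock, sketched here with C11 atomics. It is offered as an illustration of the idea (queue of waiters, each spinning on its own flag, release hands over to exactly one successor) and not as the Windows implementation; qlock, qnode and the function names are made up for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct qnode {
    _Atomic(struct qnode *) next;   /* successor in the wait queue */
    atomic_bool granted;            /* the processor-local flag */
} qnode;

typedef struct { _Atomic(qnode *) tail; } qlock;

/* Join the queue; returns true if the lock was free and is now owned. */
bool qlock_enqueue(qlock *l, qnode *me)
{
    atomic_store(&me->next, NULL);
    atomic_store(&me->granted, false);
    qnode *prev = atomic_exchange(&l->tail, me);
    if (prev == NULL) {
        atomic_store(&me->granted, true);
        return true;
    }
    atomic_store(&prev->next, me);  /* link behind the previous waiter */
    return false;
}

/* Busy-wait on our own flag only, never on the shared lock word. */
void qlock_wait(qnode *me)
{
    while (!atomic_load(&me->granted)) { }
}

/* Hand the lock to the first waiter, or mark it free if nobody waits. */
void qlock_release(qlock *l, qnode *me)
{
    assert(atomic_load(&me->granted));
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;                 /* queue was empty */
        while ((succ = atomic_load(&me->next)) == NULL) { }  /* waiter is still linking */
    }
    atomic_store(&succ->granted, true);
}
```

A processor calls qlock_enqueue() and, if that returns false, qlock_wait(); the hand-over in qlock_release() touches only the successor's flag, which gives the pre-determined wait order.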
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once; "all" means all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include optional timeout argument
– Waiting threads consume no CPU time
Waiting
Waitable objects include
– Events (may be auto-reset or manual-reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects: outside the kernel
[Figure: interaction with thread scheduling, drawn as the thread state diagram (Initialized, Ready, Standby, Running, Waiting, Transition, Terminated): a thread object is created and initialized; the thread waits on an object handle and enters Waiting; when the object is set to the signaled state, the wait is complete]
Interaction bet. Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get priority boost
Interaction bet. Synchronization & Dispatching
Dispatcher re-schedules new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object?
For each dispatcher object: the system event that changes its state, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
– Nonsignaled to signaled: owning thread releases mutex
– Effect: kernel resumes one waiting thread
– Signaled to nonsignaled: resumed thread acquires mutex
Mutex (exported to user mode)
– Nonsignaled to signaled: owning thread or other thread releases mutex
– Effect: kernel resumes one waiting thread
– Signaled to nonsignaled: resumed thread acquires mutex
Semaphore
– Nonsignaled to signaled: one thread releases the semaphore, freeing a resource
– Effect: kernel resumes one or more waiting threads
– Signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
Event
– Nonsignaled to signaled: a thread sets the event
– Effect: kernel resumes one or more waiting threads
– Signaled to nonsignaled: kernel resumes one or more threads
Event pair
– Nonsignaled to signaled: dedicated thread sets one event in the event pair
– Effect: kernel resumes waiting dedicated thread
– Signaled to nonsignaled: kernel resumes the other dedicated thread
Timer
– Nonsignaled to signaled: timer expires
– Effect: kernel resumes all waiting threads
– Signaled to nonsignaled: a thread (re)initializes the timer
Thread
– Nonsignaled to signaled: thread terminates
– Effect: kernel resumes all waiting threads
– Signaled to nonsignaled: a thread reinitializes the thread object
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: a DPC queue: the queue head links to a chain of DPC objects]
Delivering a DPC
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
1. Timer expires; kernel queues DPC that will release all waiting threads; kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
[Figure: interrupt dispatch table with levels from High and Power failure down to Dispatch/DPC, APC and Low, shown next to the DPC queue and the dispatcher]
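Steps 1 and 4 amount to appending to, and later draining, a per-processor list of routine and context pairs. The sketch below models just that list in portable C. It assumes plain FIFO order and ignores IRQLs, targeting of other CPUs and DPC importance; all type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

typedef void (*dpc_routine)(void *context);

typedef struct dpc {
    dpc_routine routine;      /* deferred procedure to run */
    void *context;            /* argument handed to the routine */
    struct dpc *next;
} dpc;

typedef struct { dpc *head, *tail; } dpc_queue;   /* one per processor */

/* Step 1: an ISR or the timer code appends a DPC object. */
void dpc_enqueue(dpc_queue *q, dpc *d)
{
    assert(q && d && d->routine);
    d->next = NULL;
    if (q->tail)
        q->tail->next = d;
    else
        q->head = d;
    q->tail = d;
}

/* Step 4: run every queued routine in order; returns how many ran. */
int dpc_drain(dpc_queue *q)
{
    int ran = 0;
    while (q->head) {
        dpc *d = q->head;
        q->head = d->next;
        if (!q->head)
            q->tail = NULL;
        d->routine(d->context);
        ran++;
    }
    return ran;
}

/* Sample routine for trying the queue out: counts its invocations. */
void count_call(void *context)
{
    ++*(int *)context;
}
```

Queueing two DPC objects that both use count_call and then calling dpc_drain() runs both routines and leaves the queue empty.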
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
– Permission required for user-mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode, at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable, but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when the thread is in an "alertable wait"
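The deliverability conditions for the three APC kinds can be collected into one predicate. The sketch restates the rules from the slides and nothing more; thread_state is a made-up summary of the relevant thread conditions, not a kernel structure.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { APC_SPECIAL_KERNEL, APC_ORDINARY_KERNEL, APC_USER } apc_kind;

typedef struct {
    int  irql;                  /* current IRQL of the target thread */
    bool kernel_apcs_disabled;  /* e.g. after KeEnterCriticalRegion */
    bool alertable_wait;        /* e.g. blocked in an alertable SleepEx() */
} thread_state;

/* True if an APC of the given kind could be delivered right now. */
bool apc_deliverable(apc_kind kind, const thread_state *t)
{
    assert(t != 0 && t->irql >= 0);
    switch (kind) {
    case APC_SPECIAL_KERNEL:  return t->irql < 1;
    case APC_ORDINARY_KERNEL: return t->irql == 0 && !t->kernel_apcs_disabled;
    case APC_USER:            return t->alertable_wait;
    }
    return false;
}
```

A thread at IRQL 0 inside a critical region, for example, can still receive special kernel APCs but no ordinary kernel APCs.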
7171
ThreadObject
K
U
APC objects
Asynchronous Procedure Calls (APCs)
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
ndash If IRQLlt2 charge to threadrsquos user or kernel time
ndash If IRQL=2 and processing a DPC charge to DPC time
ndash If IRQL=2 amp not processing a DPC charge to thread kernel time
ndash If IRQLgt2 charge to interrupt time
7373
IRQLs and CPU Time Accounting
Since time servicing interrupts are NOT charged to interrupted thread if system is busy but no process appears to be running must be due to interrupt-related activity
ndash Note time at IRQL 2 or more is charged to the current threadrsquos quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any threadprocess
ndash Process Explorer shows these as separate processesnot really processes
ndash Context switches for these are really of interrupts amp DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
ndash Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals NOT accounted
ndash Eg one or more threads run and enter a wait state before clock fires
ndash Thus threads may run but never get charged
View context switch activity with Process Explorer
ndash Add Context Switch Delta column
7676
For waiting threads user-mode utilities only display the wait reason
Example pstat
Looking at Waiting Threads
7777
Wait Internals 1 Dispatcher Objects
Size TypeState
Wait listhead
Object-type-specific data
DispatcherObject
(see ntddkincddkntddkh)
Any kernel object you can wait for is a ldquodispatcher objectrdquo
ndash some exclusively for synchronizationeg events mutexes (ldquomutantsrdquo) semaphores queues timers
ndash others can be waited for as a side effect of their prime function eg processes threads file objects
ndash non-waitable kernel objects are called ldquocontrol objectsrdquo
All dispatcher objects have a common header
All dispatcher objects are in one of two states
ndash ldquosignaledrdquo vs ldquononsignaledrdquo
ndash when signalled a wait on the object is satisfied
ndash different object types differ in terms of what changes their state
ndash wait and unwait implementation iscommon to all types of dispatcher objects
7878
Object-type-specific data
Wait Internals 2Wait Blocks
Size TypeState
Wait listhead
Size TypeState
Wait listhead
Represent a threadrsquos reference to something itrsquos waiting for (one per handle passed to WaitForhellip)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for ldquoanyrdquo or ldquoallrdquo Key denotes argument list position for
WaitForMultipleObjects
Object-type-specific data
DispatcherObjects
Thread Objects
WaitBlockListWaitBlockList
Wait blocks
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
7979
34 Windows APIs for Synchronization and IPC Windows API constructs for synchronization and interprocess communication
Synchronization
ndash Critical sections
ndash Mutexes
ndash Semaphores
ndash Event objects
Synchronization through interprocess communication
ndash Anonymous pipes
ndash Named pipes
ndash Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec ) VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec )
8181
Critical Section Example
counter is global shared by all threads
volatile int counter = 0
CRITICAL_SECTION crit
InitializeCriticalSection ( ampcrit )
hellip main loop in any of the threads
while (done)
_try
EnterCriticalSection ( ampcrit )
counter += local_value
LeaveCriticalSection ( ampcrit )
_finally LeaveCriticalSection ( ampcrit )
DeleteCriticalSection( ampcrit )
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
  dwTimeout == 0 - no wait, check whether object is signaled
  dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles - pointer to array identifying these objects
- bWaitAll - whether to wait for first signaled object or all objects
  With bWaitAll == FALSE, the function returns WAIT_OBJECT_0 plus the array index of the first signaled object
Side effects
- Mutexes, auto-reset events and auto-reset (synchronization) waitable timers will be reset to non-signaled state after completing wait functions
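The return-value convention described above can be sketched in portable C. This is only an illustration: the helper classify_wait() and the SK_-prefixed constants are hypothetical names introduced here (the constants mirror the documented values of WAIT_OBJECT_0, WAIT_ABANDONED_0, WAIT_TIMEOUT and WAIT_FAILED so that the sketch compiles without <windows.h>).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins for the Windows constants of the same name. */
#define SK_WAIT_OBJECT_0    0x00000000u
#define SK_WAIT_ABANDONED_0 0x00000080u
#define SK_WAIT_TIMEOUT     0x00000102u
#define SK_WAIT_FAILED      0xFFFFFFFFu

typedef enum { WK_SIGNALED, WK_ABANDONED_MUTEX, WK_TIMED_OUT, WK_ERROR } wait_kind;

/* Classify the DWORD returned by a wait on `count` handles with
   bWaitAll == FALSE. On WK_SIGNALED or WK_ABANDONED_MUTEX, *index
   receives the array position of the object that ended the wait. */
wait_kind classify_wait(uint32_t ret, uint32_t count, uint32_t *index)
{
    assert(count >= 1 && count <= 64);   /* MAXIMUM_WAIT_OBJECTS */
    *index = 0;
    if (ret < SK_WAIT_OBJECT_0 + count) {
        *index = ret - SK_WAIT_OBJECT_0;
        return WK_SIGNALED;
    }
    if (ret >= SK_WAIT_ABANDONED_0 && ret < SK_WAIT_ABANDONED_0 + count) {
        *index = ret - SK_WAIT_ABANDONED_0;
        return WK_ABANDONED_MUTEX;
    }
    if (ret == SK_WAIT_TIMEOUT)
        return WK_TIMED_OUT;
    return WK_ERROR;                     /* includes SK_WAIT_FAILED */
}
```

In real code the value passed as ret would come straight from WaitForMultipleObjects(); the abandoned case still grants mutex ownership, so the caller has to release it.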
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done = 0, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0) {
        counter += local_value;
        ReleaseMutex( mutex );
    } else {
        /* mutex was abandoned (or the wait failed) */
        if (ret == WAIT_ABANDONED)
            ReleaseMutex( mutex ); /* an abandoned mutex is still owned by the waiter */
        break; /* exit the loop */
    }
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process (the default)
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any positive value, as long as the result does not exceed the maximum count
- ReleaseSemaphore() returns old semaphore count (through lpPreviousCount)
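The counting rules above can be modeled in a few lines of portable C11. This is a sketch of the semantics only, not the Windows implementation: sem_model and its functions are hypothetical names, and a failed try-wait stands in for a wait with a zero timeout rather than blocking.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a counting semaphore: signaled while count > 0. */
typedef struct {
    atomic_long count;
    long max;
} sem_model;

void sem_model_init(sem_model *s, long initial, long max)
{
    atomic_init(&s->count, initial);
    s->max = max;
}

/* Like a wait with timeout 0: take one unit if the count is positive. */
bool sem_model_try_wait(sem_model *s)
{
    long c = atomic_load(&s->count);
    while (c > 0) {
        /* on failure c is reloaded with the current count and we retry */
        if (atomic_compare_exchange_weak(&s->count, &c, c - 1))
            return true;
    }
    return false;
}

/* Like ReleaseSemaphore(): add n units and report the previous count.
   Fails without changing anything if n <= 0 or max would be exceeded. */
bool sem_model_release(sem_model *s, long n, long *previous)
{
    long c = atomic_load(&s->count);
    do {
        if (n <= 0 || c > s->max - n)
            return false;
    } while (!atomic_compare_exchange_weak(&s->count, &c, c + n));
    if (previous != NULL)
        *previous = c;
    return true;
}
```

With an initial count of 2, two try-waits succeed and a third fails until some thread releases a unit, which matches the "n-at-a-time" behavior described for semaphores.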
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event (Microsoft documents PulseEvent() as unreliable and advises against using it)
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE - create event in signaled state
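The difference between the two event types can be illustrated with a small single-threaded state model in C. The names event_model, event_model_set() and event_model_try_wait() are hypothetical; the model only tracks the signaled flag and how many already-waiting threads a set operation would release.

```c
#include <stdbool.h>

/* Hypothetical model of an event object's state. */
typedef struct {
    bool manual_reset;
    bool signaled;
} event_model;

/* Model of SetEvent(): returns how many of `waiters` blocked threads
   are released by this call. */
int event_model_set(event_model *e, int waiters)
{
    if (e->manual_reset) {
        e->signaled = true;      /* stays signaled until reset manually */
        return waiters;          /* all waiting threads are released */
    }
    if (waiters > 0) {
        e->signaled = false;     /* one waiter consumes the signal */
        return 1;
    }
    e->signaled = true;          /* nobody waiting: signal is remembered */
    return 0;
}

/* Model of a wait with timeout 0 on the event. */
bool event_model_try_wait(event_model *e)
{
    if (!e->signaled)
        return false;
    if (!e->manual_reset)
        e->signaled = false;     /* auto-reset side effect of the wait */
    return true;
}
```

The auto-reset branch is why such an event behaves like a hand-off to exactly one thread, while a manual-reset event acts as a gate that stays open.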
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using pipes with the same name, each client on its own instance
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on remote machine (. means local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage);
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
- Combines a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
- Combines CreateFile, WriteFile, ReadFile, CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
- Call will return as soon as there is a client connection
- Returns false if client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic (for writes, up to PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many comm.)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
Locate a server via mailslot
[Figure: App client 0 ... App client n act as mailslot servers; the App Server acts as mailslot client and sends its status message periodically]
Mailslot Servers (App client 0 ... App client n)
    hMS = CreateMailslot( "\\.\mailslot\status" ... );
    ReadFile( hMS, &ServStat ... ); /* connect to server */
Mailslot Client (App Server)
    while (...) {
        Sleep(...);
        hMS = CreateFile( "\\*\mailslot\status" ... );
        WriteFile( hMS, &StatInfo ... );
    }
Message is sent periodically
Creating a mailslot
HANDLE CreateMailslot(LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa);
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is msg size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
Interrupt processing
Interrupt dispatch table (IDT)
- Links to interrupt service routines
x86
- Interrupt controller interrupts processor (single line)
- Processor queries for interrupt vector; uses vector as index to IDT
Alpha
- PAL code (Privileged Architecture Library, the Alpha BIOS) determines interrupt vector, calls kernel
- Kernel uses vector to index IDT
After ISR execution, IRQL is lowered to initial level
Interrupt object
Allows device drivers to register ISRs for their devices
- Contains dispatch code (initial handler)
- Dispatch code calls ISR with interrupt object as parameter (HW cannot pass parameters to ISR)
Connecting/disconnecting interrupt objects
- Dynamic association between ISR and IDT entry
- Loadable device drivers (kernel modules)
- Turn on/off ISR
Interrupt objects can synchronize access to ISR data
- Multiple instances of ISR may be active simultaneously (MP machine)
- Multiple ISRs may be connected to the same IRQL
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from other processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update system's clock, allocation of CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler: Kernprof.exe, Res Kit)
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all, normal thread execution
IRQLs on 64-bit Systems
IRQL    x64                              IA64
15      High/Profile                     High/Profile/Power
14      Interprocessor Interrupt/Power   Interprocessor Interrupt
13      Clock                            Clock
12      Synch (Srv 2003)                 Synch (MP only)
...     Device n                         Device n
4       ...                              Device 1
3       Device 1                         Correctable Machine Check
2       Dispatch/DPC                     Dispatch/DPC & Synch (UP only)
1       APC                              APC
0       Passive/Low                      Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device
  Can be configured with IntFilter utility in Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source
  Processors are assigned round robin for each interrupt vector
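The two fixed mapping rules above are simple arithmetic, which a short C sketch makes concrete. The function names are hypothetical, and the 64-bit rule is read here as integer division of the vector number by 16; x86 MP systems are left out because their assignment is not a fixed formula.

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ.
   IRQ is assumed to be a legacy interrupt line number, 0..15. */
int irql_from_irq_x86_up(int irq)
{
    assert(irq >= 0 && irq <= 15);
    return 27 - irq;
}

/* x64 / IA64 rule from the slide: IRQL = IDT vector number / 16,
   i.e. each IRQL covers a block of 16 consecutive vectors. */
int irql_from_vector_64bit(int vector)
{
    assert(vector >= 0 && vector <= 255);
    return vector / 16;
}
```

For example, IRQ 7 maps to IRQL 20 on an x86 uniprocessor system, and vector 0x93 (147) falls into the block for IRQL 9 on a 64-bit system.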
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when kernel is deep within many layers of code
- Delayed scheduling decision, one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Sync on MP: use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
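The "data cell plus atomic test-and-modify" idea can be sketched with C11 atomics. This is a user-mode illustration under stated assumptions, not the kernel's KSPIN_LOCK code: spin_cell and the spin_* functions are hypothetical names, and the real kernel routines also raise IRQL, which this sketch does not model.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical spinlock: a single flag is enough to record "owned". */
typedef struct {
    atomic_flag owned;
} spin_cell;

void spin_init(spin_cell *l)
{
    atomic_flag_clear(&l->owned);
}

/* One atomic test-and-set: returns true if the lock was free and is
   now owned by the caller. */
bool spin_try_acquire(spin_cell *l)
{
    return !atomic_flag_test_and_set_explicit(&l->owned, memory_order_acquire);
}

/* Busy-wait until the test-and-set succeeds. */
void spin_acquire(spin_cell *l)
{
    while (!spin_try_acquire(l)) {
        /* spin: every retry is another atomic access to the shared cell */
    }
}

void spin_release(spin_cell *l)
{
    atomic_flag_clear_explicit(&l->owned, memory_order_release);
}
```

Every failed attempt in spin_acquire() touches the same shared memory cell, which is the source of the bus contention that queued spinlocks are designed to avoid.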
Kernel Synchronization
[Figure: Processor A and Processor B contend for the spinlock that guards the DPC queue; each processor runs the same sequence]
do acquire_spinlock(DPC) until (SUCCESS)
begin
    remove DPC from queue    /* critical section */
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Queued Spinlocks
Problem: Checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
- Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the flag of the first processor in the queue is modified
- Exactly one processor is being signaled
- Pre-determined wait order
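The queueing idea can be illustrated with an MCS-style lock written with C11 atomics. This is a swapped-in textbook technique chosen because it has the properties listed above (local spinning, one waiter signaled, FIFO order); it is not claimed to be the Windows implementation, and qlock, qnode and the function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiting processor; must_wait is the processor-local flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool must_wait;
} qnode;

typedef struct {
    _Atomic(qnode *) tail;               /* last waiter, NULL if lock is free */
} qlock;

void qlock_init(qlock *l)
{
    atomic_init(&l->tail, (qnode *)NULL);
}

void qlock_acquire(qlock *l, qnode *me)
{
    atomic_store(&me->next, (qnode *)NULL);
    atomic_store(&me->must_wait, true);
    qnode *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev == NULL)
        return;                                    /* lock was free */
    atomic_store(&prev->next, me);                 /* link behind predecessor */
    while (atomic_load(&me->must_wait)) {
        /* spin on our own flag only */
    }
}

void qlock_release(qlock *l, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, (qnode *)NULL))
            return;                                /* nobody was waiting */
        while ((succ = atomic_load(&me->next)) == NULL) {
            /* a new waiter swapped the tail but has not linked itself yet */
        }
    }
    atomic_store(&succ->must_wait, false);         /* signal exactly one waiter */
}
```

Each caller passes its own qnode, so the only location it polls while waiting is its own must_wait flag; the shared tail pointer is touched once on entry and at most once on exit.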
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU
  Allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated
  Charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks)
  Doesn't use spinlocks when no contention
  Used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once
  "All": all objects must be in the signalled state concurrently to resolve the wait
- All wait calls include optional timeout argument
- Waiting threads consume no CPU time
Waiting
Waitable objects include
- Events (may be auto-reset or manual reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signalled upon exit or terminate)
- Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object, and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Figure: thread state diagram (Initialized, Ready, Standby, Running, Waiting, Transition, Terminated) showing the interaction with thread scheduling: "Create and initialize thread object" leads to Initialized; "Thread waits on an object handle" leads to Waiting; "Set object to signaled state" means the wait is complete]
Interaction bet. Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
Interaction bet. Synchronization & Dispatching
Dispatcher re-schedules new thread; it may preempt a running thread if that thread has lower priority, and issues software interrupt to initiate context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object?
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: Owning thread releases mutex
- signaled -> nonsignaled: Resumed thread acquires mutex
- Effect: Kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: Owning thread or other thread releases mutex
- signaled -> nonsignaled: Resumed thread acquires mutex
- Effect: Kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: One thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: A thread acquires the semaphore; more resources are not available
- Effect: Kernel resumes one or more waiting threads
What signals an object? (contd)
Event
- nonsignaled -> signaled: A thread sets the event
- signaled -> nonsignaled: Kernel resumes one or more threads
- Effect: Kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: Dedicated thread sets one event in the event pair
- signaled -> nonsignaled: Kernel resumes the other dedicated thread
- Effect: Kernel resumes waiting dedicated thread
Timer
- nonsignaled -> signaled: Timer expires
- signaled -> nonsignaled: A thread (re)initializes the timer
- Effect: Kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: Thread terminates
- signaled -> nonsignaled: A thread reinitializes the thread object
- Effect: Kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
  See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: DPC queue; the queue head links a chain of DPC objects]
Delivering a DPC
[Figure: interrupt dispatch table with IRQL levels (High, Power failure, ..., Dispatch/DPC, APC, Low), the DPC queue, and the thread dispatcher]
1. Timer expires; kernel queues DPC that will release all waiting threads. Kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
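Steps 1 and 4 above amount to a per-processor queue of deferred work that is drained in one pass. The following C sketch models that as a plain linked list; dpc, dpc_queue and the functions are hypothetical names, the queue is assumed to be strictly first-in, first-out, and neither IRQL changes nor the software interrupt are modeled.

```c
#include <stddef.h>

typedef void (*dpc_routine)(void *context);

/* Hypothetical DPC object: a routine plus the argument it is called with. */
typedef struct dpc {
    struct dpc *next;
    dpc_routine routine;
    void *context;
} dpc;

/* One queue per processor. */
typedef struct {
    dpc *head;
    dpc *tail;
} dpc_queue;

/* Step 1: an ISR or the timer code appends a DPC to the queue. */
void dpc_enqueue(dpc_queue *q, dpc *d)
{
    d->next = NULL;
    if (q->tail != NULL)
        q->tail->next = d;
    else
        q->head = d;
    q->tail = d;
}

/* Step 4: run every queued routine in order; returns how many ran. */
int dpc_drain(dpc_queue *q)
{
    int ran = 0;
    while (q->head != NULL) {
        dpc *d = q->head;
        q->head = d->next;
        if (q->head == NULL)
            q->tail = NULL;
        d->routine(d->context);
        ran++;
    }
    return ran;
}

/* Sample routine for illustration: counts how often it was invoked. */
void dpc_count_routine(void *context)
{
    ++*(int *)context;
}
```

A caller would enqueue dpc objects whose routine field points at a function such as dpc_count_routine() and later call dpc_drain() once the (modeled) IRQL has dropped below dispatch level.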
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
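The deliverability rules for the three APC kinds can be written down as one predicate. The sketch below is a direct encoding of the bullets above and nothing more; apc_kind, thread_state and apc_deliverable() are hypothetical names, and kernel_apcs_disabled stands for a thread that has entered a critical region.

```c
#include <stdbool.h>

typedef enum { APC_SPECIAL_KERNEL, APC_NORMAL_KERNEL, APC_USER } apc_kind;

/* Hypothetical snapshot of the target thread. */
typedef struct {
    int irql;                    /* 0 = passive, 1 = APC level, ... */
    bool kernel_apcs_disabled;   /* thread is inside a critical region */
    bool alertable_wait;         /* e.g. blocked in SleepEx(..., TRUE) */
} thread_state;

bool apc_deliverable(apc_kind kind, const thread_state *t)
{
    switch (kind) {
    case APC_SPECIAL_KERNEL:
        /* always deliverable unless already at IRQL 1 or above */
        return t->irql < 1;
    case APC_NORMAL_KERNEL:
        /* deliverable at IRQL 0 unless explicitly disabled */
        return t->irql == 0 && !t->kernel_apcs_disabled;
    case APC_USER:
        /* only deliverable during an alertable wait */
        return t->alertable_wait;
    }
    return false;
}
```

The predicate shows why a special kernel APC still runs inside a critical region while an ordinary kernel APC does not.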
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, K (kernel mode) and U (user mode), each linking APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL<2, charge to thread's user or kernel time
- If IRQL=2 and processing a DPC, charge to DPC time
- If IRQL=2 & not processing a DPC, charge to thread kernel time
- If IRQL>2, charge to interrupt time
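The four accounting rules above form a small decision function, sketched here in C. The names charge_target and clock_tick_charge() are hypothetical, and user_mode is an assumed input that says whether the interrupted thread was running user-mode code (it only matters below IRQL 2).

```c
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* Decide where one clock tick is charged, given the IRQL that was
   interrupted by the clock ISR. */
charge_target clock_tick_charge(int irql, bool in_dpc, bool user_mode)
{
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Because the decision is made only when the clock fires, work that starts and finishes between two ticks never reaches this function.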
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before clock fires
- Thus, threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
[Figure: dispatcher object layout: a header with Size, Type, State, and Wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signalled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (header with Size, Type, State, Wait list head, plus object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds Key, Type, Next link, List entry, Object, and Thread fields]
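The relationship between dispatcher headers and wait blocks can be sketched as two C structs and a check that walks one thread's chain of wait blocks. This is a simplified illustration, not the layout in ntddk.h: the struct and function names are hypothetical, only the State, Type, Key, Object and Next link fields are modeled, and the per-object wait lists are omitted.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { WB_WAIT_ANY, WB_WAIT_ALL } wb_wait_type;

/* Reduced dispatcher header: only the signaled state is modeled. */
typedef struct {
    bool signaled;
} dispatcher_header;

/* One wait block per handle passed to the wait call. */
typedef struct wait_block {
    struct wait_block *next;     /* next block of the same wait call */
    dispatcher_header *object;   /* object being waited on */
    wb_wait_type type;           /* "any" or "all" */
    int key;                     /* position in the handle array */
} wait_block;

/* Returns the key of a signaled object for an "any" wait, 0 when an
   "all" wait is satisfied, and -1 if the thread must keep waiting. */
int wait_satisfied(const wait_block *list)
{
    const wait_block *b;
    if (list == NULL)
        return -1;
    if (list->type == WB_WAIT_ANY) {
        for (b = list; b != NULL; b = b->next)
            if (b->object->signaled)
                return b->key;
        return -1;
    }
    for (b = list; b != NULL; b = b->next)
        if (!b->object->signaled)
            return -1;           /* "all": every object must be signaled */
    return 0;
}
```

The key field is what lets an "any" wait report which array position ended the wait, matching the index returned by WaitForMultipleObjects.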
7979
34 Windows APIs for Synchronization and IPC Windows API constructs for synchronization and interprocess communication
Synchronization
ndash Critical sections
ndash Mutexes
ndash Semaphores
ndash Event objects
Synchronization through interprocess communication
ndash Anonymous pipes
ndash Named pipes
ndash Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec ) VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec )
8181
Critical Section Example
counter is global shared by all threads
volatile int counter = 0
CRITICAL_SECTION crit
InitializeCriticalSection ( ampcrit )
hellip main loop in any of the threads
while (done)
_try
EnterCriticalSection ( ampcrit )
counter += local_value
LeaveCriticalSection ( ampcrit )
_finally LeaveCriticalSection ( ampcrit )
DeleteCriticalSection( ampcrit )
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
ndash Processes
ndash Threads
ndash Files
ndash Console input
File change notificationsFile change notifications
MutexesMutexes
Events (auto-reset + manual-reset)Events (auto-reset + manual-reset)
Waitable timersWaitable timers
DWORD WaitForSingleObject( HANDLE hObject DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects LPHANDLE lpHandles BOOL bWaitAll DWORD dwTimeout )
8383
Wait Functions - Details
WaitForSingleObject()ndash hObject specifies kernel object
ndash dwTimeout specifies wait time in msecdwTimeout == 0 - no wait check whether object is signaled
dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()ndash cObjects lt= MAXIMUM_WAIT_OBJECTS (64)
ndash lpHandles - pointer to array identifying these objects
ndash bWaitAll - whether to wait for first signaled object or all objectsFunction returns index of first signaled object
Side effectsndash Mutexes auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
HANDLE OpenMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
BOOL ReleaseMutex( HANDLE hMutex )
8686
Mutex Example
counter is global shared by all threads
volatile int done counter = 0
HANDLE mutex = CreateMutex( NULL FALSE NULL )
main loop in any of the threads ret is local
DWORD ret
while (done)
ret = WaitForSingleObject( mutex INFINITE )
if (ret == WAIT_OBJECT_0)
counter += local_value
else mutex was abandoned
break exit the loop
ReleaseMutex( mutex )
CloseHandle( mutex )
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexesndash Synchronization among threads in same process
Five basic functionsndash pthread_mutex_init()
ndash pthread_mutex_destroy()
ndash pthread_mutex_lock()
ndash pthread_mutex_unlock()
ndash pthread_mutex_trylock()
Comparisonndash pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex )
ndash pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
8888
Semaphores
Semaphore objects are used for resource countingndash A semaphore is signaled when count gt 0
Threadsprocesses use wait functionsndash Each wait function decreases semaphore count by 1
ndash ReleaseSemaphore() may increment count by any value
ndash ReleaseSemaphore() returns old semaphore count
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE OpenSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE ReleaseSemaphore( HANDLE hSemaphoreLONG cReleaseCount LPLONG lpPreviousCount )
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)ndash Manual-reset event can signal several thread simultaneously must be reset manually
ndash PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
ndash Auto-reset event signals a single thread event is reset automatically
ndash fInitialState == TRUE - create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )
BOOL ResetEvent( HANDLE hEvent )
BOOL PulseEvent( HANDLE hEvent )
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal() - one thread
– pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
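Because pthreads has no direct counterpart to a manual-reset event, one is usually built from a mutex, a condition variable, and a flag. The following C sketch shows one such construction; the manual_event type and the event_* functions are hypothetical names introduced here, and error handling is omitted.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical emulation of a manual-reset event. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;   /* stays true until event_reset() */
} manual_event;

void event_init(manual_event *e, bool initial_state) {
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent() on a manual-reset event: wakes every waiter. */
void event_set(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(): later waiters block again. */
void event_reset(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; does not consume the signal. */
void event_wait(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                 /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

bool event_is_set(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    bool state = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return state;
}
```

The flag is what makes the event "sticky": a bare pthread_cond_broadcast() only reaches threads already waiting, whereas here a thread that calls event_wait() after event_set() returns immediately.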
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe )
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe size in bytes; zero == default
A read on the pipe handle blocks if the pipe is empty
A write to a full pipe blocks
Anonymous pipes are one-way
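The one-way, byte-stream behavior described above can be tried out with the POSIX pipe() call, which plays a similar role to CreatePipe(). This C sketch is a POSIX analogue, not Windows code; the function name pipe_roundtrip is hypothetical, and it assumes the message is small enough to fit in the pipe buffer so the write does not block.

```c
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Writes msg into a fresh pipe and reads it back from the other end.
   Returns the number of bytes read, or -1 on failure. */
long pipe_roundtrip(const char *msg, char *out, size_t out_size) {
    int fds[2];                       /* fds[0]: read end, fds[1]: write end */
    if (out_size == 0 || pipe(fds) != 0)
        return -1;

    ssize_t written = write(fds[1], msg, strlen(msg));
    close(fds[1]);                    /* reader sees end-of-file after the data */
    if (written < 0) {
        close(fds[0]);
        return -1;
    }

    ssize_t got = read(fds[0], out, out_size - 1);
    close(fds[0]);
    if (got < 0)
        return -1;
    out[got] = '\0';                  /* pipes carry bytes, not strings */
    return (long)got;
}
```

Data flows only from the write end to the read end; two-way communication needs a second pipe, which is the limitation that named pipes remove.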
I/O Redirection using an Anonymous Pipe
/* Create default-size anonymous pipe; handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message-oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same pipe name
– Server can respond to a client using the same instance
Pipe can be accessed over the network
– Location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa )
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (the "." denotes the local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd.)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS chooses based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
The first CreateNamedPipe call creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage )
Named Pipe Client Connections
CreateFile with the named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa )
– Combines a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut )
– Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo )
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
– Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to named pipes
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic (for writes of at most PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}   /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain
Locate a server via mailslot
[Figure: application clients 0..n act as mailslot servers; the application server acts as the mailslot client]
Mailslot servers (App client 0 .. App client n): create the "status" mailslot with CreateMailslot(), read the server status with ReadFile( hMS, &ServStat ), then connect to the server
Mailslot client (App server): in a loop, calls Sleep(), opens the "status" mailslot with CreateFile(), and writes its status with WriteFile( hMS, &StatInfo )
Message is sent periodically
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa )
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for this many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name - retrieves a handle for a local mailslot
– \\host\mailslot\[path]name - retrieves a handle for a mailslot on the specified host
– \\domain\mailslot\[path]name - returns a handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name - returns a handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life (意念改变生活)
Interrupt object
Allows device drivers to register ISRs for their devices
– Contains dispatch code (initial handler)
– Dispatch code calls the ISR with the interrupt object as parameter (HW cannot pass parameters to an ISR)
Connecting/disconnecting interrupt objects
– Dynamic association between ISR and IDT entry
– Loadable device drivers (kernel modules)
– Turn ISR on/off
Interrupt objects can synchronize access to ISR data
– Multiple instances of an ISR may be active simultaneously (MP machine)
– Multiple ISRs may be connected to the same IRQL
Predefined IRQLs
High
– Used when halting the system (via KeBugCheck())
Power fail
– Originated in the NT design document, but has never been used
Inter-processor interrupt
– Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
– Used to update the system's clock and to allocate CPU time to threads
Profile
– Used for kernel profiling (see Kernel profiler, Kernprof.exe, Res Kit)
Predefined IRQLs (contd.)
Device
– Used to prioritize device interrupts
DPC/dispatch and APC
– Software interrupts that the kernel and device drivers generate
Passive
– No interrupt level at all; normal thread execution
IRQLs on 64-bit Systems
IRQL   x64                               IA64
15     High/Profile                      High/Profile/Power
14     Interprocessor Interrupt/Power    Interprocessor Interrupt
13     Clock                             Clock
12     Synch (Srv 2003)                  Synch (MP only)
...    Device n                          Device n
4      ...                               Device 1
3      Device 1                          Correctable Machine Check
2      Dispatch/DPC                      Dispatch/DPC & Synch (UP only)
1      APC                               APC
0      Passive/Low                       Passive/Low
Interrupt Prioritization & Delivery
IRQLs are determined as follows
– x86 UP systems: IRQL = 27 - IRQ
– x86 MP systems: bucketized (random)
– x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
– By default, any processor can receive an interrupt from any device; this can be configured with the IntFilter utility in the Resource Kit
– On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
– On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
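As a worked example of the two fixed IRQL formulas on this slide, here is a small C sketch. The function names are hypothetical, the x64/IA64 rule is read as integer division of the IDT vector number by 16, and the "bucketized" x86 MP assignment is not modeled.

```c
/* Hypothetical helpers illustrating the IRQL formulas on this slide. */

/* x86 uniprocessor: IRQL = 27 - IRQ, so lower IRQ numbers map to higher IRQLs. */
int irql_from_irq_x86_up(int irq) {
    return 27 - irq;
}

/* x64 / IA64: IRQL = IDT vector number / 16 (integer division),
   i.e. the upper four bits of an 8-bit vector number. */
int irql_from_vector_64(int vector) {
    return vector / 16;
}
```

For instance, IRQ 0 maps to IRQL 27 on an x86 uniprocessor, and IDT vector 0xD1 (209) maps to IRQL 13 under the vector rule.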
Software interrupts
Initiating thread dispatching
– DPCs allow scheduling actions to be deferred when the kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
Flow of Interrupts
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– A spinlock is either free or considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
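The idea of a spinlock as a single memory cell with an atomic test-and-modify operation can be sketched in portable C11. This is a user-mode illustration only, not the kernel's KSPIN_LOCK code; the tas_lock type and tas_* functions are hypothetical names.

```c
#include <stdatomic.h>

/* Hypothetical test-and-set spinlock: one flag, either free or owned. */
typedef struct {
    atomic_flag held;
} tas_lock;

#define TAS_LOCK_INIT { ATOMIC_FLAG_INIT }

/* Tries once; returns 1 if the lock was free and is now owned by the caller. */
int tas_try_acquire(tas_lock *l) {
    /* test_and_set returns the previous value: false means the lock was free */
    return !atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire);
}

/* Busy-waits until the atomic test-and-set finds the flag clear. */
void tas_acquire(tas_lock *l) {
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire)) {
        /* spin: every iteration is an atomic read-modify-write */
    }
}

/* Marks the lock free again so one spinning contender can take it. */
void tas_release(tas_lock *l) {
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}
```

Every contender repeatedly performs an atomic write to the same cell while it spins, which is the source of the memory-bus contention that queued spinlocks are designed to avoid.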
Kernel Synchronization
[Figure: Processor A and Processor B each run the sequence below against the same spinlock-protected DPC queue]
do acquire_spinlock(DPC) until (SUCCESS)
begin (critical section)
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
The first processor acquires the lock; other processors wait on a processor-local flag
– Thus the busy-wait loop requires no access to the memory bus
When releasing the lock, the flag of the next processor in the queue is modified
– Exactly one processor is signaled
– Pre-determined wait order
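The queueing idea can be illustrated with an Anderson-style array-based queue lock in C11. Note that this technique is swapped in for illustration and is not the Windows kernel's queued spinlock implementation: the queued_lock type, the qlock_* functions, and the QLOCK_SLOTS bound are assumptions, and the sketch only works while at most QLOCK_SLOTS threads contend at once.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define QLOCK_SLOTS 8   /* assumed upper bound on simultaneous contenders */

/* Each waiter spins on its own padded slot, so waiters do not share a cell. */
typedef struct {
    struct {
        atomic_bool go;     /* true: the owner of this slot may enter */
        char pad[63];       /* keep slots on separate cache lines (assumed 64 B) */
    } slot[QLOCK_SLOTS];
    atomic_uint next_ticket;
} queued_lock;

void qlock_init(queued_lock *q) {
    for (int i = 0; i < QLOCK_SLOTS; i++)
        atomic_init(&q->slot[i].go, i == 0);   /* first arrival enters at once */
    atomic_init(&q->next_ticket, 0);
}

/* Takes a place in the queue and spins on the private slot.
   Returns the slot index, which must be passed to qlock_release(). */
unsigned qlock_acquire(queued_lock *q) {
    unsigned me = atomic_fetch_add(&q->next_ticket, 1) % QLOCK_SLOTS;
    while (!atomic_load_explicit(&q->slot[me].go, memory_order_acquire)) {
        /* spin on a location no other waiter is polling */
    }
    atomic_store_explicit(&q->slot[me].go, false, memory_order_relaxed);
    return me;
}

/* Hands the lock to exactly one successor: the next slot in queue order. */
void qlock_release(queued_lock *q, unsigned me) {
    atomic_store_explicit(&q->slot[(me + 1) % QLOCK_SLOTS].go, true,
                          memory_order_release);
}
```

The single fetch-and-add fixes the wait order up front, and each release touches only the successor's flag, which matches the "exactly one processor is signaled" and "pre-determined wait order" properties on the slide.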
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of the use of spinlocks & the length of time they are held
– Scheduling database is now per-CPU; allows thread state transitions in parallel
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– A wait for multiple objects can wait for "any" one or "all" at once; for "all", all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include an optional timeout argument
– Waiting threads consume no CPU time
Waiting
Waitable objects include
– Events (may be auto-reset or manual-reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
[Figure: interaction with thread scheduling. Thread states: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Labeled steps: create and initialize thread object; thread waits on an object handle; set object to signaled state; wait is complete]
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread: it may preempt a running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object?
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled to signaled: owning thread releases mutex
– signaled to nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled to signaled: owning thread or other thread releases mutex
– signaled to nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Semaphore
– nonsignaled to signaled: one thread releases the semaphore, freeing a resource
– signaled to nonsignaled: a thread acquires the semaphore and no more resources are available
– Effect: kernel resumes one or more waiting threads
What signals an object? (contd.)
For each dispatcher object: system events and resulting state change, and effect of the signaled state on waiting threads
Event
– nonsignaled to signaled: a thread sets the event
– signaled to nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled to signaled: dedicated thread sets one event in the event pair
– signaled to nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes the waiting dedicated thread
Timer
– nonsignaled to signaled: timer expires
– signaled to nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– nonsignaled to signaled: thread terminates
– signaled to nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
– Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: DPC queue: a queue head followed by a chain of DPC objects]
Delivering a DPC
[Figure: interrupt dispatch table (High, Power failure, ..., Dispatch/DPC, APC, Low), the DPC queue, and the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads, and requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when the thread is in an alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless the thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable, but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when the thread is in an "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two queues of APC objects, kernel-mode (K) and user-mode (U)]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to the thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to the thread's kernel time
– If IRQL > 2, charge to interrupt time
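The accounting rules above amount to a small decision function. The following C sketch encodes them directly; the charge_target enum and classify_tick function are hypothetical names, and the in_user_mode parameter is an assumption standing in for however the clock ISR tells user time from kernel time.

```c
/* Hypothetical buckets a clock tick can be charged to. */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* Decides where one clock tick is charged, following the slide's rules.
   irql:         IRQL that was interrupted by the clock
   in_dpc:       nonzero if a DPC routine was executing
   in_user_mode: nonzero if the thread was running user-mode code (assumed input) */
charge_target classify_tick(int irql, int in_dpc, int in_user_mode) {
    if (irql > 2)
        return CHARGE_INTERRUPT;                  /* IRQL > 2: interrupt time */
    if (irql == 2)                                /* dispatch/DPC level */
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: charged to the thread, split by processor mode */
    return in_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Only the first and third outcomes touch a thread's own counters, which is why interrupt and DPC activity never shows up as CPU time of the interrupted thread.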
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add the Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– Some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– When signaled, a wait on the object is satisfied
– Different object types differ in terms of what changes their state
– Wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (Size, Type, State, Wait list head, object-type-specific data) linked through wait blocks (List entry, Object, Thread, Key, Type, Next link) to thread objects (WaitBlockList)]
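The relationship between dispatcher objects, wait blocks, and the "any"/"all" wait type can be modeled with a few C structs. This is a deliberately simplified sketch: the struct and function names are hypothetical, the real kernel structures carry more fields (wait list heads, thread pointers), and the resolution rule below is an illustration of the semantics rather than the kernel's algorithm.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the common dispatcher header. */
typedef struct dispatcher_object {
    bool signaled;                  /* the State field: signaled or not */
} dispatcher_object;

typedef enum { WAIT_TYPE_ANY, WAIT_TYPE_ALL } wait_type;

/* One wait block per handle passed to the wait call. */
typedef struct wait_block {
    dispatcher_object *object;      /* what is being waited for */
    int                key;         /* position in the argument list */
    wait_type          type;        /* wait for "any" or for "all" */
    struct wait_block *next;        /* next block of the same wait call */
} wait_block;

/* Walks the chain of wait blocks belonging to one wait call.
   Wait-any: returns the key of the first signaled object.
   Wait-all: returns 0 once every object is signaled.
   Returns -1 while the wait cannot be resolved. */
int wait_resolved(const wait_block *first) {
    if (first == NULL)
        return -1;
    if (first->type == WAIT_TYPE_ANY) {
        for (const wait_block *b = first; b != NULL; b = b->next)
            if (b->object->signaled)
                return b->key;
        return -1;
    }
    for (const wait_block *b = first; b != NULL; b = b->next)
        if (!b->object->signaled)
            return -1;
    return 0;
}
```

Returning the key for a wait-any mirrors how WaitForMultipleObjects reports which handle in the argument list satisfied the wait.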
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted, but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec )
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec )
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection ( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection ( &crit );
    __try {
        counter += local_value;
    }
    __finally { LeaveCriticalSection ( &crit ); }   /* leave exactly once per enter */
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Semaphores
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout )
Wait Functions - Details
WaitForSingleObject()
– hObject specifies the kernel object
– dwTimeout specifies the wait time in msec
– dwTimeout == 0: no wait, check whether the object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to an array identifying these objects
– bWaitAll: whether to wait for the first signaled object or for all objects; when waiting for any object, the function returns the index of the first signaled object (as WAIT_OBJECT_0 + index)
Side effects
– Mutexes, auto-reset events, and waitable timers will be reset to the non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
The first thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() frees the mutex object once its last handle is closed
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName )
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName )
BOOL ReleaseMutex( HANDLE hMutex )
8686
Mutex Example
counter is global shared by all threads
volatile int done counter = 0
HANDLE mutex = CreateMutex( NULL FALSE NULL )
main loop in any of the threads ret is local
DWORD ret
while (done)
ret = WaitForSingleObject( mutex INFINITE )
if (ret == WAIT_OBJECT_0)
counter += local_value
else mutex was abandoned
break exit the loop
ReleaseMutex( mutex )
CloseHandle( mutex )
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexesndash Synchronization among threads in same process
Five basic functionsndash pthread_mutex_init()
ndash pthread_mutex_destroy()
ndash pthread_mutex_lock()
ndash pthread_mutex_unlock()
ndash pthread_mutex_trylock()
Comparisonndash pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex )
ndash pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
8888
Semaphores
Semaphore objects are used for resource countingndash A semaphore is signaled when count gt 0
Threadsprocesses use wait functionsndash Each wait function decreases semaphore count by 1
ndash ReleaseSemaphore() may increment count by any value
ndash ReleaseSemaphore() returns old semaphore count
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE OpenSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE ReleaseSemaphore( HANDLE hSemaphoreLONG cReleaseCount LPLONG lpPreviousCount )
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)ndash Manual-reset event can signal several thread simultaneously must be reset manually
ndash PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
ndash Auto-reset event signals a single thread event is reset automatically
ndash fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTE lpsaBOOL fManualReset BOOL fInititalStateLPTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )BOOL ResetEvent( HANDLE hEvent )BOOL PulseEvent( HANDLE hEvent )
9292
Comparison - POSIX condition variables
pthreadrsquos condition variables are comparable to eventsndash pthread_cond_init()
ndash pthread_cond_destroy()
Wait functionsndash pthread_cond_wait()
ndash pthread_cond_timedwait()
Signalingndash pthread_cond_signal() - one thread
ndash pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phReadPHANDLE phWriteLPSECURITY_ATTRIBUTES lpsaDWORD cbPipe )
main
prog1 prog2pipe
Half-duplex character-based IPC
cbPipe pipe byte size zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are oneway
9494
IO Redirection using an Anonymous Pipe
Create default size anonymous pipe handles are inheritable
if (CreatePipe (amphReadPipe amphWritePipe ampPipeSA 0))
fprintf(stderr ldquoAnon pipe create failednrdquo) exit(1)
Set output handle to pipe handle create first processes
StartInfoCh1hStdInput = GetStdHandle (STD_INPUT_HANDLE)
StartInfoCh1hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh1hStdOutput = hWritePipe
StartInfoCh1dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)Command1 NULL NULL TRUE 0 NULL NULL ampStartInfoCh1 ampProcInfo1))
fprintf(stderr ldquoCreateProc1 failednrdquo) exit(2)
CloseHandle (hWritePipe)
9595
Pipe example (contd)
Repeat (symmetrically) for the second process
StartInfoCh2hStdInput = hReadPipe
StartInfoCh2hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh2hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE)
StartInfoCh2dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)targv NULL NULLTRUE Inherit handles
0 NULL NULL ampStartInfoCh2 ampProcInfo2))
fprintf(stderr ldquoCreateProc2 failednrdquo) exit(3)
CloseHandle (hReadPipe)
Wait for both processes to complete
WaitForSingleObject (ProcInfo1hProcess INFINITE)
WaitForSingleObject (ProcInfo2hProcess INFINITE)
CloseHandle (ProcInfo1hThread) CloseHandle (ProcInfo1hProcess)
CloseHandle (ProcInfo2hThread) CloseHandle (ProcInfo2hProcess)
return 0
9696
Named Pipes
Message orientedndash Reading process can read varying-length messages precisely as sent by the writing process
Bi-directionalndash Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipendash Several clients can communicate with a single server using the same instance
ndash Server can respond to client using the same instance
Pipe can be accessed over the networkndash location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeNameDWORD fdwOpenMode DWORD fdwPipModeDWORD nMaxInstances DWORD cbOutBufDWORD cbInBuf DWORD dwTimeOutLPSECURITY_ATTRIBUTES lpsa )
Use same flag settings forall instances of a named pipe
lpszPipeName pipe[path]pipename
ndash Not possible to create a pipe on remote machine ( ndash local machine)
fdwOpenMode
ndash PIPE_ACCESS_DUPLEX PIPE_ACCESS_INBOUND PIPE_ACCESS_OUTBOUND
fdwPipeMode
ndash PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
ndash PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
ndash PIPE_WAIT or PIPE_NOWAIT (will ReadFile block)
9898
Named Pipes (contd)
BOOL PeekNamedPipe (HANDLE hPipeLPVOID lpvBuffer DWORD cbBufferLPDWORD lpcbRead LPDWORD lpcbAvailLPDWORD lpcbMessage)
nMaxInstances
ndash Number of instances
ndash PIPE_UNLIMITED_INSTANCES OS choice based on resources
dwTimeOut
ndash Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
ndash Closing handle to last instance deletes named pipe
Polling a pipe
ndash Nondestructive ndash is there a message waiting for ReadFile
99
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
100
Convenience Functions
BOOL TransactNamedPipe (HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
Combines a WriteFile, ReadFile sequence in one call
101
Convenience Functions
BOOL CallNamedPipe (LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
Combines CreateFile, WriteFile, ReadFile, CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf (stderr, "Failure to locate server\n"); exit (3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf (stderr, "Connect or Read Named Pipe error\n"); exit (4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen (ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
107
Locate a server via mailslot
[Figure: application clients 0..n act as mailslot servers. Each one calls hMS = CreateMailslot("\\.\mailslot\status"), reads the server status with ReadFile(hMS, &ServStat), and then connects to the server. The application server acts as the mailslot client: in a loop it sleeps, opens the status mailslot with CreateFile(), and writes its status with WriteFile(hMS, &StatInfo). The message is sent periodically.]
108
Creating a mailslot
HANDLE CreateMailslot (LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa);
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
109
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
43
Predefined IRQLs
High
- Used when halting the system (via KeBugCheck())
Power fail
- Originated in the NT design document, but has never been used
Inter-processor interrupt
- Used to request action from another processor (dispatching a thread, updating a processor's TLB, system shutdown, system crash)
Clock
- Used to update the system's clock and to allocate CPU time to threads
Profile
- Used for kernel profiling (see Kernel profiler - Kernprof.exe, Res Kit)
44
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all; normal thread execution
45
IRQLs on 64-bit Systems
IRQL  x64                             IA64
15    High/Profile                    High/Profile/Power
14    Interprocessor Interrupt/Power  Interprocessor Interrupt
13    Clock                           Clock
12    Synch (Srv 2003)                Synch (MP only)
...   Device n                        Device n
4     ...                             Device 1
3     Device 1                        Correctable Machine Check
2     Dispatch/DPC                    Dispatch/DPC & Synch (UP only)
1     APC                             APC
0     Passive/Low                     Passive/Low
46
Interrupt Prioritization & Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device; can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
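The two fixed mapping rules on this slide are plain arithmetic, so a small sketch may help. The function names below are made up for illustration and are not kernel routines; the x86 MP case is left out because the slide only describes it as bucketized.

```c
#include <stdio.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_from_irq_x86_up(int irq)
{
    return 27 - irq;
}

/* x64 and IA64 rule from the slide: IRQL = IDT vector number / 16
   (integer division). */
int irql_from_vector_x64(int vector)
{
    return vector / 16;
}

/* Hypothetical helper: print the mapping for two sample inputs. */
void print_irql_examples(void)
{
    printf("x86 UP: IRQ 1 -> IRQL %d\n", irql_from_irq_x86_up(1));
    printf("x64: vector 0xD1 -> IRQL %d\n", irql_from_vector_x64(0xD1));
}
```

Under these rules a lower IRQ number yields a higher IRQL on x86 UP, while on x64 sixteen consecutive vectors share one IRQL.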
47
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when the kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in the context of a particular thread
Support for asynchronous I/O operations
48
Flow of Interrupts
49
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- A spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
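To make the "single data cell plus atomic test-and-modify" idea concrete, here is a minimal user-mode sketch built on C11 atomics. It is not the kernel's KSPIN_LOCK code; the type demo_spinlock and its functions are names invented for this example.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* The whole lock is one memory cell; a single bit of state is enough. */
typedef struct {
    atomic_flag held;
} demo_spinlock;

void demo_spinlock_init(demo_spinlock *lock)
{
    atomic_flag_clear(&lock->held);
}

/* One atomic test-and-set attempt: returns true if the caller
   found the lock free and now owns it. */
bool demo_spinlock_try_acquire(demo_spinlock *lock)
{
    return !atomic_flag_test_and_set_explicit(&lock->held,
                                              memory_order_acquire);
}

/* Spin until the test-and-set succeeds (one owner at a time). */
void demo_spinlock_acquire(demo_spinlock *lock)
{
    while (!demo_spinlock_try_acquire(lock)) {
        /* busy-wait */
    }
}

void demo_spinlock_release(demo_spinlock *lock)
{
    atomic_flag_clear_explicit(&lock->held, memory_order_release);
}
```

A caller brackets the critical section with demo_spinlock_acquire() and demo_spinlock_release(). Every failed attempt is another atomic operation on the shared cell, which is the cost that queued spinlocks try to avoid.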
50
Kernel Synchronization
[Figure: Processor A and Processor B each execute the same sequence around a shared DPC queue that is guarded by a spinlock:
  do acquire_spinlock(DPC) until (SUCCESS)
  begin remove DPC from queue end   (critical section)
  release_spinlock(DPC)]
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
51
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
- Thus, the busy-wait loop requires no access to the memory bus
When releasing the lock, the flag of the first waiting processor is modified
- Exactly one processor is being signaled
- Pre-determined wait order
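The queueing idea can be sketched with an MCS-style lock, a well-known textbook design in which every waiter spins on a flag in its own queue node. This is an illustration of the technique, not the Windows kernel's implementation; qnode, qlock and the function names are assumptions made for the example.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiting processor; each waiter spins on its own flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool must_wait;
} qnode;

typedef struct {
    _Atomic(qnode *) tail;   /* last waiter in the queue, NULL if free */
} qlock;

void qlock_init(qlock *lock)
{
    atomic_init(&lock->tail, (qnode *)NULL);
}

void qlock_acquire(qlock *lock, qnode *me)
{
    atomic_store(&me->next, (qnode *)NULL);
    atomic_store(&me->must_wait, true);
    /* Append ourselves to the queue in one atomic step. */
    qnode *prev = atomic_exchange(&lock->tail, me);
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->must_wait)) {
            /* spin on the local flag only */
        }
    }
}

void qlock_release(qlock *lock, qnode *me)
{
    qnode *next = atomic_load(&me->next);
    if (next == NULL) {
        /* No known successor: try to mark the lock free. */
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&lock->tail, &expected,
                                           (qnode *)NULL))
            return;
        /* A successor is in the middle of enqueueing; wait for its link. */
        while ((next = atomic_load(&me->next)) == NULL) {
        }
    }
    /* Signal exactly one waiter, in queue order. */
    atomic_store(&next->must_wait, false);
}
```

Each caller passes its own qnode, which plays the role of the processor-local flag from the slide; the release path hands the lock to exactly one successor, giving the pre-determined wait order.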
52
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU; allows thread state transitions in parallel
53
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
54
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once; "all": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include an optional timeout argument
- Waiting threads consume no CPU time
55
Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and threads (signaled upon exit or terminate)
- Directories (change notification)
56
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object, and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
57
Executive Synchronization
Waiting on dispatcher objects - outside the kernel
[Figure: interaction with thread scheduling. Thread states shown: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Annotations: "Create and initialize thread object"; "Thread waits on an object handle"; "Set object to signaled state"; "Wait is complete".]
58
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
59
Interaction between Synchronization & Dispatching
Dispatcher re-schedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
60
What signals an object?
For each dispatcher object: the system event that changes its state, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
- Nonsignaled to signaled: owning thread releases mutex
- Effect: kernel resumes one waiting thread
- Signaled to nonsignaled: resumed thread acquires mutex
Mutex (exported to user mode)
- Nonsignaled to signaled: owning thread or other thread releases mutex
- Effect: kernel resumes one waiting thread
- Signaled to nonsignaled: resumed thread acquires mutex
Semaphore
- Nonsignaled to signaled: one thread releases the semaphore, freeing a resource
- Effect: kernel resumes one or more waiting threads
- Signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
61
What signals an object? (contd)
For each dispatcher object: the system event that changes its state, and the effect of the signaled state on waiting threads
Event
- Nonsignaled to signaled: a thread sets the event
- Effect: kernel resumes one or more waiting threads
- Signaled to nonsignaled: kernel resumes one or more threads
Event pair
- Nonsignaled to signaled: dedicated thread sets one event in the event pair
- Effect: kernel resumes waiting dedicated thread
- Signaled to nonsignaled: kernel resumes the other dedicated thread
Timer
- Nonsignaled to signaled: timer expires
- Effect: kernel resumes all waiting threads
- Signaled to nonsignaled: a thread (re)initializes the timer
Thread
- Nonsignaled to signaled: thread terminates
- Effect: kernel resumes all waiting threads
- Signaled to nonsignaled: a thread reinitializes the thread object
62
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004.
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
63
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
64
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
65
Deferred Procedure Calls (DPCs)
[Figure: DPC queue; a queue head linked to a chain of DPC objects]
66
Delivering a DPC
1. Timer expires; kernel queues a DPC that will release all waiting threads; kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with levels High, Power failure, Dispatch/DPC, APC, Low; the Dispatch/DPC entry leads to the dispatcher, which works through the DPC queue]
67
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
- Permission required for user-mode APCs
68
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
User-mode APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
69
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
70
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
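The deliverability rules for the three APC kinds can be summarized as one predicate. The sketch below is only a model of the conditions stated on these slides; apc_kind, apc_deliverable and the parameter names are invented for illustration and do not correspond to kernel interfaces.

```c
#include <stdbool.h>

typedef enum {
    APC_SPECIAL_KERNEL,   /* runs in kernel mode at IRQL 1 */
    APC_NORMAL_KERNEL,    /* "ordinary" kernel APC */
    APC_USER              /* user-mode APC */
} apc_kind;

/* Model of the slide's rules:
   - special kernel APC: deliverable unless the thread is at IRQL 1 or above
   - ordinary kernel APC: deliverable at IRQL 0 unless explicitly disabled
   - user-mode APC: deliverable only in an alertable wait */
bool apc_deliverable(apc_kind kind, int irql,
                     bool kernel_apcs_disabled, bool alertable_wait)
{
    switch (kind) {
    case APC_SPECIAL_KERNEL:
        return irql < 1;
    case APC_NORMAL_KERNEL:
        return irql == 0 && !kernel_apcs_disabled;
    case APC_USER:
        return alertable_wait;
    }
    return false;
}
```

For example, a thread inside a KeEnterCriticalRegion section (modelled here by kernel_apcs_disabled) still accepts special kernel APCs but not ordinary ones.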
71
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel (K) and user (U), each holding a list of APC objects]
72
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 & not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
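The four accounting rules form a small decision function. The following is a sketch of that decision only, under the assumption that the clock ISR knows the interrupted IRQL, whether a DPC was running, and whether the thread was in user mode; charge_t and clock_tick_charge are hypothetical names.

```c
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/* Decide which counter one clock tick is charged to, following the
   slide's rules. Arguments describe the state the tick interrupted. */
charge_t clock_tick_charge(int irql, int processing_dpc, int in_user_mode)
{
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return processing_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* IRQL < 2: the thread's own user or kernel time */
    return in_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Note that the whole tick goes to a single counter, whatever else ran during the interval.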
73
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
74
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
75
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
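A tiny simulation shows why tick-driven accounting can miss a thread completely. It assumes a deliberately simplified model invented for this example: the thread runs for run_us microseconds once every period_us, starting start_us after time zero, and a tick charges whichever thread is running at that instant.

```c
/* Count how many clock ticks find the thread running.
   Ticks fire at tick_us, 2*tick_us, ... up to total_us. */
int ticks_charged(long total_us, long tick_us,
                  long period_us, long start_us, long run_us)
{
    int charged = 0;
    for (long t = tick_us; t <= total_us; t += tick_us) {
        if (t < start_us)
            continue;                     /* thread has not started yet */
        long phase = (t - start_us) % period_us;
        if (phase < run_us)
            charged++;                    /* thread is on the CPU at the tick */
    }
    return charged;
}
```

With a 10 msec tick and a thread that runs 2 msec out of every 10 msec, ticks_charged(1000000, 10000, 10000, 1000, 2000) counts no ticks at all, because the thread always enters its wait before the clock fires. Shifting the start so the thread straddles the tick, ticks_charged(1000000, 10000, 10000, 9000, 2000), charges it for every tick although it still uses only 20% of the CPU.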
76
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
77
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- Some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "Signaled" vs. "nonsignaled"
- When signaled, a wait on the object is satisfied
- Different object types differ in terms of what changes their state
- Wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout; a common header (Size, Type, State, Wait list head) followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
78
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects point through WaitBlockList to their wait blocks; each wait block holds a list entry, Object and Thread pointers, Key, Type, and a next link; each dispatcher object (Size, Type, State, Wait list head, object-type-specific data) links the wait blocks that wait on it]
79
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
80
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted, but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section (other than trying to enter it with TryEnterCriticalSection)
VOID InitializeCriticalSection (LPCRITICAL_SECTION sec);
VOID DeleteCriticalSection (LPCRITICAL_SECTION sec);
VOID EnterCriticalSection (LPCRITICAL_SECTION sec);
VOID LeaveCriticalSection (LPCRITICAL_SECTION sec);
BOOL TryEnterCriticalSection (LPCRITICAL_SECTION sec);
81
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection (&crit);
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection (&crit);
    __try {
        counter += local_value;
    }
    /* leave exactly once per enter, even if the guarded code raises an exception */
    __finally { LeaveCriticalSection (&crit); }
}
DeleteCriticalSection (&crit);
82
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject (HANDLE hObject, DWORD dwTimeout);
DWORD WaitForMultipleObjects (DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout);
83
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
- dwTimeout == 0: no wait, check whether object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for the first signaled object or all objects; when bWaitAll is FALSE, the return value minus WAIT_OBJECT_0 is the index of the first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
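The "any" versus "all" semantics and the reset side effect can be modelled without any Windows calls. The sketch below is a single-threaded model, not the Win32 implementation: wait_object, try_wait_multiple and WAIT_NOT_SATISFIED are names invented here, and a real wait would block instead of returning immediately.

```c
#include <stdbool.h>
#include <stddef.h>

#define WAIT_NOT_SATISFIED (-1)

typedef struct {
    bool signaled;
    bool auto_reset;   /* e.g. a mutex or an auto-reset event */
} wait_object;

/* Try to satisfy a wait on count objects without blocking.
   wait_all == false: returns the index of the first signaled object.
   wait_all == true:  returns 0 only if every object is signaled.
   On success, auto-reset objects consumed by the wait become nonsignaled.
   Returns WAIT_NOT_SATISFIED if the wait cannot complete yet. */
int try_wait_multiple(wait_object *objs, size_t count, bool wait_all)
{
    if (wait_all) {
        for (size_t i = 0; i < count; i++)
            if (!objs[i].signaled)
                return WAIT_NOT_SATISFIED;   /* no side effects at all */
        for (size_t i = 0; i < count; i++)
            if (objs[i].auto_reset)
                objs[i].signaled = false;
        return 0;
    }
    for (size_t i = 0; i < count; i++) {
        if (objs[i].signaled) {
            if (objs[i].auto_reset)
                objs[i].signaled = false;
            return (int)i;
        }
    }
    return WAIT_NOT_SATISFIED;
}
```

The point of the "all" branch is that nothing is consumed until every object is signaled at the same time, which matches the "concurrently" requirement described for wait-all.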
84
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() closes the handle; the mutex object is freed once its last handle is closed
85
Mutexes
HANDLE CreateMutex (LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName);
HANDLE OpenMutex (DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName);
BOOL ReleaseMutex (HANDLE hMutex);
86
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex (NULL, FALSE, NULL);
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject (mutex, INFINITE);
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex (mutex);
}
CloseHandle (mutex);
87
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject (hMutex, INFINITE)
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
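The POSIX side of this comparison can be sketched as the counterpart of the counter example used for Win32 mutexes. The variable and function names (shared_counter, add_blocking, add_if_free) are made up for this sketch; it uses a statically initialized mutex instead of pthread_mutex_init() for brevity.

```c
#include <errno.h>
#include <pthread.h>

/* Shared state guarded by one mutex. */
pthread_mutex_t shared_counter_lock = PTHREAD_MUTEX_INITIALIZER;
int shared_counter = 0;

/* Blocking update: comparable to WaitForSingleObject(hMutex, INFINITE)
   followed by ReleaseMutex(). */
void add_blocking(int value)
{
    pthread_mutex_lock(&shared_counter_lock);
    shared_counter += value;
    pthread_mutex_unlock(&shared_counter_lock);
}

/* Polling update: comparable to a wait with timeout 0.
   Returns 1 if the counter was updated, 0 if the mutex was busy
   (pthread_mutex_trylock() reports EBUSY) or another error occurred. */
int add_if_free(int value)
{
    int rc = pthread_mutex_trylock(&shared_counter_lock);
    if (rc != 0)
        return 0;
    shared_counter += value;
    pthread_mutex_unlock(&shared_counter_lock);
    return 1;
}
```

One difference worth keeping in mind: a default pthread mutex is not recursive, so a thread that already holds it gets EBUSY from trylock, whereas a Win32 mutex can be acquired again by its owner.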
88
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value
- ReleaseSemaphore() reports the old semaphore count (through lpPreviousCount)
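The counting behaviour can be followed with a small single-threaded model. This is not the Win32 implementation; sem_model and its functions are hypothetical names, and the refusal to exceed the maximum mirrors the documented behaviour of ReleaseSemaphore(), which fails without changing the count in that case.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    long count;   /* signaled while count > 0 */
    long max;     /* upper bound given at creation */
} sem_model;

/* A successful wait decreases the count by 1; with count == 0 the
   semaphore is nonsignaled and a real wait would block. */
bool sem_model_try_wait(sem_model *s)
{
    if (s->count > 0) {
        s->count--;
        return true;
    }
    return false;
}

/* Release n resources at once. Reports the old count through *previous
   (if not NULL). Fails, leaving the count unchanged, if n is not
   positive or the maximum would be exceeded. */
bool sem_model_release(sem_model *s, long n, long *previous)
{
    if (n <= 0 || n > s->max - s->count)
        return false;
    if (previous != NULL)
        *previous = s->count;
    s->count += n;
    return true;
}
```

Starting from a count of 2 with a maximum of 3, two waits succeed, a third would block, and a release of 3 reports the previous count 0.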
89
Semaphores
HANDLE CreateSemaphore (LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName);
HANDLE OpenSemaphore (DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName);
BOOL ReleaseSemaphore (HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount);
90
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state
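The difference between the two event types is easiest to see as a state machine. The model below only tracks a signaled flag and a count of waiting threads; event_model, event_model_set and event_model_pulse are names invented for this sketch, and it says nothing about which waiter is chosen.

```c
#include <stdbool.h>

typedef struct {
    bool manual_reset;
    bool signaled;
    int  waiters;      /* threads currently blocked on the event */
} event_model;

/* Model of SetEvent(): returns how many waiters are released. */
int event_model_set(event_model *e)
{
    if (e->manual_reset) {
        int released = e->waiters;   /* everyone goes */
        e->waiters = 0;
        e->signaled = true;          /* stays set until reset manually */
        return released;
    }
    if (e->waiters > 0) {
        e->waiters--;                /* exactly one thread goes */
        e->signaled = false;         /* and the event resets itself */
        return 1;
    }
    e->signaled = true;              /* remembered for the next waiter */
    return 0;
}

/* Model of PulseEvent(): release current waiters, then reset. */
int event_model_pulse(event_model *e)
{
    int released = e->manual_reset ? e->waiters
                                   : (e->waiters > 0 ? 1 : 0);
    e->waiters -= released;
    e->signaled = false;
    return released;
}
```

With three waiters, setting a manual-reset event releases all three and leaves it signaled, while setting an auto-reset event releases one and leaves it nonsignaled.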
91
Events
HANDLE CreateEvent (LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName);
BOOL SetEvent (HANDLE hEvent);
BOOL ResetEvent (HANDLE hEvent);
BOOL PulseEvent (HANDLE hEvent);
92
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
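Since there is no exact equivalent, a manual-reset event is usually emulated by pairing a condition variable with a mutex and a flag. The following is one common way to do that, offered as a sketch; the manual_event type and its functions are names made up here, not part of pthreads.

```c
#include <pthread.h>
#include <stdbool.h>

/* Emulated manual-reset event: the flag holds the state, the
   condition variable only wakes threads that are waiting for it. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} manual_event;

void manual_event_init(manual_event *e, bool initially_signaled)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initially_signaled;
}

void manual_event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);   /* release all waiting threads */
    pthread_mutex_unlock(&e->lock);
}

void manual_event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; returns at once if it already is. */
void manual_event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

bool manual_event_is_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    bool state = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return state;
}

void manual_event_destroy(manual_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what makes the state persistent: a bare pthread_cond_broadcast() is lost if no thread is waiting at that moment, whereas a set manual-reset event lets later waiters pass until it is reset.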
93
Anonymous pipes
BOOL CreatePipe (PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe);
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
94
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf (stderr, "Anon pipe create failed\n"); exit (1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE,
    0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf (stderr, "CreateProc1 failed\n"); exit (2);
}
CloseHandle (hWritePipe);
95
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
    0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf (stderr, "CreateProc2 failed\n"); exit (3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message orientedndash Reading process can read varying-length messages precisely as sent by the writing process
Bi-directionalndash Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipendash Several clients can communicate with a single server using the same instance
ndash Server can respond to client using the same instance
Pipe can be accessed over the networkndash location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeNameDWORD fdwOpenMode DWORD fdwPipModeDWORD nMaxInstances DWORD cbOutBufDWORD cbInBuf DWORD dwTimeOutLPSECURITY_ATTRIBUTES lpsa )
Use same flag settings forall instances of a named pipe
lpszPipeName pipe[path]pipename
ndash Not possible to create a pipe on remote machine ( ndash local machine)
fdwOpenMode
ndash PIPE_ACCESS_DUPLEX PIPE_ACCESS_INBOUND PIPE_ACCESS_OUTBOUND
fdwPipeMode
ndash PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
ndash PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
ndash PIPE_WAIT or PIPE_NOWAIT (will ReadFile block)
9898
Named Pipes (contd)
BOOL PeekNamedPipe (HANDLE hPipeLPVOID lpvBuffer DWORD cbBufferLPDWORD lpcbRead LPDWORD lpcbAvailLPDWORD lpcbMessage)
nMaxInstances
ndash Number of instances
ndash PIPE_UNLIMITED_INSTANCES OS choice based on resources
dwTimeOut
ndash Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
ndash Closing handle to last instance deletes named pipe
Polling a pipe
ndash Nondestructive ndash is there a message waiting for ReadFile
9999
Named Pipe Client Connections
CreateFile with named pipe namendash pipe[path]pipename
ndash servernamepipe[path]pipename
ndash First method gives better performance (local server)
Status Functionsndash GetNamedPipeHandleState
ndash SetNamedPipeHandleState
ndash GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipeLPVOID lpvWriteBuf DWORD cbWriteBufLPVOID lpvReadBuf DWORD cbReadBufLPDOWRD lpcbRead LPOVERLAPPED lpa)
WriteFile ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeNameLPVOID lpvWriteBuf DWORD cbWriteBufLPVOID lpvReadBuf DWORD cbReadBufLPDWORD lpcbRead DWORD dwTimeOut)
CreateFile WriteFile ReadFile CloseHandle
ndash dwTimeOut NMPWAIT_NOWAIT NMPWAIT_WIAT_FOREVER NMPWAIT_USE_DEFAULT_WAIT
102102
Server eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipeLPOVERLAPPED lpo
lpo == NULLndash Call will return as soon as there is a client connection
ndash Returns false if client connected between CreateNamed Pipe calland ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()ndash Client may wait for serverlsquos ConnectNamedPipe()
Security rights for named pipesndash GENERIC_READ GENERIC_WRITE SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
ndash FIFOs are half-duplex
ndash FIFOs are limited to a single machine
ndash FIFOs are still byte-oriented so its easiest to use fixed-size records in clientserver applications
ndash Individual readwrites are atomic
A server using FIFOs must use a separate FIFO for each clientlsquos response although all clients can send requests via a single well known FIFO
Mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked clientserver scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName NMPWAIT_WAIT_FOREVER)
hNamedPipe = CreateFile (ServerPipeName GENERIC_READ | GENERIC_WRITE
0 NULL OPEN_EXISTING FILE_ATTRIBUTE_NORMAL NULL)
if (hNamedPipe == INVALID_HANDLE_VALUE)
fptinf(stderr Failure to locate servern) exit(3)
Write the request
WriteFile (hNamedPipe ampRequest MAX_RQRS_LEN ampnWrite NULL)
Read each response and send it to std out
while (ReadFile (hNamedPipe ResponseRecord MAX_RQRS_LEN ampnRead NULL))
printf (s ResponseRecord)
CloseHandle (hNamedPipe)
return 0
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE PIPE_ACCESS_DUPLEX
PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT
1 0 0 CS_TIMEOUT pNPSA)
while (Done)
printf (Server is awaiting next requestn)
if (ConnectNamedPipe (hNamedPipe NULL)
|| ReadFile (hNamedPipe ampRequest RQ_SIZE ampnXfer NULL))
fprintf(stderr ldquoConnect or Read Named Pipe errornrdquo) exit(4)
printf( ldquoRequest is sn RequestRecord)
Send the file one line at a time to the client
fp = fopen (File r)
while ((fgets (ResponseRecord MAX_RQRS_LEN fp) = NULL))
WriteFile (hNamedPipe ampResponseRecord
(strlen(ResponseRecord) + 1) TSIZE ampnXfer NULL)
fclose (fp)
DisconnectNamedPipe (hNamedPipe)
End of server operation
106106
Win32 IPC - MailslotsMailslots bear some nasty implementation detailsthey are almost never used
Broadcast mechanism
ndash One-directional
ndash Mutliple writersmultiple readers (frequently one-to-many comm)
ndash Message delivery is unreliable
ndash Can be located over a network domain
ndash Message lengths are limited (w2k lt 426 byte) Operations on the mailslot
ndash Each reader (server) creates mailslot with CreateMailslot()
ndash Write-only client opens mailslot with CreateFile() and uses WriteFile() ndash open will fail if there are no waiting readers
ndash Clientlsquos message can be read by all servers (readers) Client lookup mailslotmailslotname
ndash Client will connect to every server in network domain
107107
Locate a server via mailslot
hMS = CreateMailslot( ldquomailslotstatusldquo)ReadFile(hMS ampServStat) connect to server
hMS = CreateMailslot( ldquomailslotstatusldquo)ReadFile(hMS ampServStat) connect to server
App client 0
App client n
Mailslot Servers
While () Sleep() hMS = CreateFile( ldquomailslotstatusldquo)
WriteFile(hMS ampStatInfo
App Server
Mailslot Client
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for this many milliseconds
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieves handle for local mailslot
- \\host\mailslot\[path]name - retrieves handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; maximum message length is 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
4444
Predefined IRQLs (contd)
Device
- Used to prioritize device interrupts
DPC/dispatch and APC
- Software interrupts that kernel and device drivers generate
Passive
- No interrupt level at all; normal thread execution
4545
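IRQLs on 64-bit Systems
[Figure: IRQL levels 0 to 15 on x64 and IA64]
x64, lowest to highest: 0 Passive/Low; 1 APC; 2 Dispatch/DPC; 3 and up Device 1 ... Device n; 12 Synch (Srv 2003); 13 Clock; 14 Interprocessor Interrupt/Power; 15 High/Profile
IA64, lowest to highest: 0 Passive/Low; 1 APC; 2 Dispatch/DPC & Synch (UP only); 3 Correctable Machine Check; 4 and up Device 1 ... Device n; 12 Synch (MP only); 13 Clock; 14 Interprocessor Interrupt; 15 High/Profile/Power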
4646
Interrupt Prioritization & Delivery
IRQLs are determined as follows
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device; this can be configured with the IntFilter utility in the Resource Kit
- On x86 and x64 systems, the I/O APIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source; processors are assigned round robin for each interrupt vector
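The two deterministic IRQL rules on this slide can be written as one-line functions. This is only a sketch: the function names are made up for illustration, and the x64/IA64 rule is taken here as the vector number divided by 16 (integer division). The bucketized x86 MP case is not modeled.

```c
#include <assert.h>

/* x86 uniprocessor rule from the slide: IRQL = 27 - IRQ. */
int irql_from_irq_x86_up(int irq)
{
    return 27 - irq;
}

/* x64 / IA64 rule: IRQL = IDT vector number / 16 (integer division),
   so each block of 16 consecutive vectors shares one IRQL. */
int irql_from_vector_x64(int vector)
{
    return vector / 16;
}
```

For example, IRQ 0 on an x86 UP system maps to IRQL 27, and vector 0xD1 on x64 maps to IRQL 13.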
4747
Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when the kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Synchronization on MP systems uses spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- A spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
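The "data cell plus atomic test-and-modify" idea can be sketched in portable C11. This is a user-mode illustration, not the kernel's KSPIN_LOCK implementation; the type spin_t and the function names are hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a spinlock: a single flag in memory. */
typedef struct { atomic_flag cell; } spin_t;

void spin_init(spin_t *s)
{
    atomic_flag_clear(&s->cell);          /* lock starts out free */
}

/* One atomic test-and-set attempt.
   Returns true if the caller now owns the lock. */
bool spin_try_acquire(spin_t *s)
{
    return !atomic_flag_test_and_set_explicit(&s->cell, memory_order_acquire);
}

/* Busy-wait until the test-and-set succeeds. */
void spin_acquire(spin_t *s)
{
    while (!spin_try_acquire(s)) {
        /* spin: every retry touches the shared cell */
    }
}

void spin_release(spin_t *s)
{
    atomic_flag_clear_explicit(&s->cell, memory_order_release);
}
```

A second acquisition attempt fails until the owner releases the lock, which is the one-owner-at-a-time rule described above.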
5050
Kernel Synchronization
[Figure: Processor A and Processor B both need the DPC queue; a spinlock guards the critical section that removes a DPC from the queue.]
Each processor executes:
do acquire_spinlock(DPC) until (SUCCESS)
begin remove DPC from queue end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
- Thus the busy-wait loop requires no access to the memory bus
When the lock is released, the flag of the first waiting processor is modified
- Exactly one processor is being signaled
- Pre-determined wait order
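A queue lock in the style of the well-known MCS algorithm shows the same idea in portable C11: each waiter spins on a flag in its own queue node, and release hands the lock to exactly one successor in arrival order. This is a sketch under assumptions, not the Windows kernel implementation; all names (qnode, qlock, qlock_demo) are hypothetical, and sched_yield() is used in the spin loops only to keep the user-mode demo friendly to small machines.

```c
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct qnode {
    _Atomic(struct qnode *) next;   /* successor in the wait queue */
    atomic_bool must_wait;          /* waiter-local flag to spin on */
} qnode;

typedef struct { _Atomic(qnode *) tail; } qlock;

void qlock_acquire(qlock *l, qnode *me)
{
    atomic_store(&me->next, (qnode *)NULL);
    atomic_store(&me->must_wait, true);
    qnode *prev = atomic_exchange(&l->tail, me);     /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);
        while (atomic_load(&me->must_wait))          /* spin on own flag only */
            sched_yield();
    }
}

void qlock_release(qlock *l, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        /* No visible successor: try to mark the lock free. */
        if (atomic_compare_exchange_strong(&l->tail, &expected, (qnode *)NULL))
            return;
        /* A successor is enqueueing; wait until it links itself in. */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();
    }
    atomic_store(&succ->must_wait, false);           /* signal exactly one waiter */
}

static qlock demo_lock;
static long demo_counter;                            /* protected by demo_lock */

static void *demo_worker(void *arg)
{
    long iters = *(long *)arg;
    qnode me;
    for (long i = 0; i < iters; i++) {
        qlock_acquire(&demo_lock, &me);
        demo_counter++;
        qlock_release(&demo_lock, &me);
    }
    return NULL;
}

/* Runs nthreads workers (at most 16) and returns the final counter value. */
long qlock_demo(int nthreads, long iters)
{
    pthread_t tid[16];
    if (nthreads > 16) nthreads = 16;
    atomic_store(&demo_lock.tail, (qnode *)NULL);
    demo_counter = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, demo_worker, &iters);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return demo_counter;
}
```

Calling qlock_demo(4, 2000) should return 8000 if mutual exclusion holds, since every increment happens inside the lock.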
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of use of spinlocks & length they are held
- Scheduling database now per-CPU; allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once; "all" means all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include an optional timeout argument
- Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
Interaction with thread scheduling
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition and Terminated. A thread object is created and initialized; a running thread waits on an object handle and enters Waiting; the wait is complete once the object is set to the signaled state.]
5858
Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up waiting threads; variable-priority threads get a priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher reschedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
For each dispatcher object: the system events and resulting state change, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled to signaled: owning thread releases mutex
- signaled to nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled to signaled: owning thread or other thread releases mutex
- signaled to nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread
Semaphore
- nonsignaled to signaled: one thread releases the semaphore, freeing a resource
- signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect: kernel resumes one or more waiting threads
6161
What signals an object (contd)
For each dispatcher object: the system events and resulting state change, and the effect of the signaled state on waiting threads
Event
- nonsignaled to signaled: a thread sets the event
- signaled to nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- nonsignaled to signaled: dedicated thread sets one event in the event pair
- signaled to nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes waiting dedicated thread
Timer
- nonsignaled to signaled: timer expires
- signaled to nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- nonsignaled to signaled: thread terminates
- signaled to nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004.
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: a queue head linked to a chain of DPC objects]
6666
Delivering a DPC
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with the levels High, Power failure, Dispatch/DPC, APC and Low; a DPC queue holding DPC objects; the thread dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads, and requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make a thread suspend/terminate itself (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
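The deliverability rules for the three APC kinds can be condensed into a small predicate. This is a simplified model of the slide text only: the types apc_kind and thread_state and the function apc_deliverable are hypothetical, and the real kernel logic has more cases than shown here.

```c
#include <stdbool.h>

typedef enum { APC_SPECIAL_KERNEL, APC_ORDINARY_KERNEL, APC_USER } apc_kind;

/* Hypothetical snapshot of the target thread's state. */
typedef struct {
    int  irql;                  /* current IRQL: 0 = passive, 1 = APC level */
    bool kernel_apcs_disabled;  /* e.g. after KeEnterCriticalRegion */
    bool in_alertable_wait;     /* e.g. inside SleepEx or WaitForMultipleObjectsEx */
} thread_state;

bool apc_deliverable(apc_kind kind, const thread_state *t)
{
    switch (kind) {
    case APC_SPECIAL_KERNEL:
        /* Always deliverable unless already at IRQL 1 or above. */
        return t->irql < 1;
    case APC_ORDINARY_KERNEL:
        /* Deliverable at IRQL 0 unless explicitly disabled. */
        return t->irql == 0 && !t->kernel_apcs_disabled;
    case APC_USER:
        /* Only deliverable while the thread is in an alertable wait. */
        return t->in_alertable_wait;
    }
    return false;
}
```

For a thread at IRQL 0 inside a critical region and not in an alertable wait, only the special kernel APC is deliverable under this model.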
7171
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, K (kernel mode) and U (user mode), each holding APC objects]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
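The four accounting rules amount to a small decision function evaluated at each clock tick. The enum and function names below are hypothetical; only the IRQL thresholds come from the slide.

```c
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER_OR_KERNEL,  /* IRQL < 2 */
    CHARGE_DPC_TIME,               /* IRQL = 2, processing a DPC */
    CHARGE_THREAD_KERNEL,          /* IRQL = 2, not processing a DPC */
    CHARGE_INTERRUPT_TIME          /* IRQL > 2 */
} charge_bucket;

/* Decide which counter one clock tick is charged to. */
charge_bucket charge_for_clock_tick(int irql, bool processing_dpc)
{
    if (irql < 2)
        return CHARGE_THREAD_USER_OR_KERNEL;
    if (irql == 2)
        return processing_dpc ? CHARGE_DPC_TIME : CHARGE_THREAD_KERNEL;
    return CHARGE_INTERRUPT_TIME;
}
```

Because the decision is made only when the clock fires, whatever the processor did between two ticks is attributed entirely to the state sampled at the tick.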
7373
IRQLs and CPU Time Accounting
Time spent servicing interrupts is NOT charged to the interrupted thread, so if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g. one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add the Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
[Figure: dispatcher object layout (see \ntddk\inc\ddk\ntddk.h): a header with Size, Type, State and the wait list head, followed by object-type-specific data]
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (header with Size, Type, State and wait list head, plus object-type-specific data) and thread objects (WaitBlockList) are linked through wait blocks; each wait block holds a list entry, Object and Thread pointers, Key, Type and a Next link]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection ( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection ( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* Leave exactly once per Enter, even if the body raises an exception */
        LeaveCriticalSection ( &crit );
    }
}
DeleteCriticalSection( &crit );
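The rule that a thread may re-enter a critical section as long as Enter and Leave calls balance can be tried out with a POSIX recursive mutex, which behaves similarly within one process. This is an analogy, not the Windows CRITICAL_SECTION API; the wrapper type csec and its functions are hypothetical, and the depth field exists only so the demo can show that the calls balance.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_RECURSIVE under strict modes */
#endif
#include <pthread.h>

/* Hypothetical wrapper: a recursive mutex plus a nesting counter. */
typedef struct {
    pthread_mutex_t m;
    int depth;              /* touched only while the mutex is held */
} csec;

int csec_init(csec *c)
{
    pthread_mutexattr_t attr;
    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    int rc = pthread_mutex_init(&c->m, &attr);
    pthread_mutexattr_destroy(&attr);
    c->depth = 0;
    return rc;
}

void csec_enter(csec *c) { pthread_mutex_lock(&c->m); c->depth++; }
void csec_leave(csec *c) { c->depth--; pthread_mutex_unlock(&c->m); }
void csec_delete(csec *c) { pthread_mutex_destroy(&c->m); }

/* Enters twice from the same thread and leaves twice.
   Returns the remaining nesting depth, which is 0 when the calls match. */
int add_twice(csec *c, int *counter, int value)
{
    csec_enter(c);
    *counter += value;
    csec_enter(c);          /* re-entry by the owner does not block */
    *counter += value;
    csec_leave(c);
    csec_leave(c);
    return c->depth;
}
```

If one of the two csec_leave calls were dropped, the mutex would stay owned by the calling thread and any other thread calling csec_enter would block forever.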
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
- dwTimeout == 0: no wait, check whether object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects; when waiting for the first, the function returns the index of the first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
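The polling behaviour of pthread_mutex_trylock() can be shown in a few lines. The helper name poll_mutex is hypothetical; the pthread calls are standard POSIX. The demo assumes a default (non-recursive) mutex, for which a second trylock fails with EBUSY even from the owning thread.

```c
#include <errno.h>
#include <pthread.h>

/* Polls the mutex once without blocking.
   Returns 1 if the caller acquired it, 0 if it was busy (EBUSY),
   and -1 on any other error. */
int poll_mutex(pthread_mutex_t *m)
{
    int rc = pthread_mutex_trylock(m);
    if (rc == 0)
        return 1;
    if (rc == EBUSY)
        return 0;
    return -1;
}
```

This mirrors calling WaitForSingleObject() with a timeout of 0: the call reports the current state and returns immediately in both cases.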
8888
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value
- ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
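The counting behaviour can be tried with POSIX unnamed semaphores. The helpers below are hypothetical and only approximate ReleaseSemaphore(): reading the old count and posting are separate steps here, so the reported value is exact only when no other thread touches the semaphore in between, and no maximum count is enforced.

```c
#include <semaphore.h>

/* Non-blocking wait: returns 1 if one unit was taken (count decreased by 1),
   0 if the count was already zero. */
int take_unit(sem_t *s)
{
    return sem_trywait(s) == 0;
}

/* Adds n units and reports the count seen just before the release.
   Returns 0 on success, -1 on error. */
int release_units(sem_t *s, int n, int *previous)
{
    if (sem_getvalue(s, previous) != 0)
        return -1;
    for (int i = 0; i < n; i++) {
        if (sem_post(s) != 0)
            return -1;
    }
    return 0;
}
```

Starting from a count of 2, two calls to take_unit() succeed and a third fails; releasing 3 units then reports a previous count of 0 and leaves the count at 3.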
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
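Since pthreads has no direct counterpart, a manual-reset event is usually built from a mutex, a condition variable and a boolean that stays set until it is explicitly reset. The following is one possible sketch; the type mr_event and all function names are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;   /* stays true until mr_event_reset() */
} mr_event;

void mr_event_init(mr_event *e, bool initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

void mr_event_set(mr_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);   /* release every current waiter */
    pthread_mutex_unlock(&e->lock);
}

void mr_event_reset(mr_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

void mr_event_wait(mr_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

static mr_event demo_event;

static void *demo_waiter(void *arg)
{
    mr_event_wait(&demo_event);
    *(int *)arg = 1;                    /* record that this waiter was released */
    return NULL;
}

/* Starts up to 8 waiters, sets the event once, and returns how many
   waiters were released. */
int mr_event_demo(int nwaiters)
{
    pthread_t tid[8];
    int released[8] = { 0 };
    int total = 0;
    if (nwaiters > 8) nwaiters = 8;
    mr_event_init(&demo_event, false);
    for (int i = 0; i < nwaiters; i++)
        pthread_create(&tid[i], NULL, demo_waiter, &released[i]);
    mr_event_set(&demo_event);
    for (int i = 0; i < nwaiters; i++) {
        pthread_join(tid[i], NULL);
        total += released[i];
    }
    return total;
}
```

Because the flag persists, a thread that calls mr_event_wait() after the event was set returns at once, which is the behaviour that a bare pthread_cond_broadcast() does not provide.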
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable. */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf (stderr, "Anon pipe create failed\n"); exit (1);
}
/* Set output handle to pipe handle, create first process. */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf (stderr, "CreateProc1 failed\n"); exit (2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd)
/* Repeat (symmetrically) for the second process. */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles. */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf (stderr, "CreateProc2 failed\n"); exit (3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete. */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple, independent instances of a named pipe
- Several clients can communicate with a single server using the same instance
- Server can respond to client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. = local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe()
Use DisconnectNamedPipe() to free the handle for a connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf (stderr, "Failure to locate server.\n"); exit (3);
}
/* Write the request. */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out. */
while (ReadFile (hNamedPipe, Response.Record, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", Response.Record);
CloseHandle (hNamedPipe);
return 0;
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
4545
IRQLs on 64-bit Systems
PassiveLowAPC
DispatchDPCDevice 1
Device n
Synch (Srv 2003)Clock
Interprocessor InterruptPower
HighProfile
012
1413
15
34
PassiveLowAPC
DispatchDPC amp Synch (UP only)Correctable Machine Check
Device 1
Device nSynch (MP only)
ClockInterprocessor Interrupt
HighProfilePower
x64 IA64
12
4646
Interrupt Prioritization amp Delivery
IRQLs are determined as followsndash x86 UP systems IRQL = 27 - IRQ
ndash x86 MP systems bucketized (random)
ndash x64 amp IA64 systems IRQL = IDT vector number 16
On MP systems which processor is chosen to deliver an interruptndash By default any processor can receive an interrupt from any deviceCan be configured with IntFilter utility in Resource Kit
ndash On x86 and x64 systems the IOAPIC (IO advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
ndash On IA64 systems the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt sourceProcessors are assigned round robin for each interrupt vector
4747
Software interrupts
Initiating thread dispatching
ndash DPC allow for scheduling actions when kernel is deep within many layers of code
ndash Delayed scheduling decision one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous IO operations
4848
Flow of Interrupts
4949
Sync on MP use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
ndash Spinlock is either free or is considered to be owned by a CPU
ndash Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
ndash Accessed with a test-and-modify operation that is atomic across all processors
ndash KSPIN_LOCK is an opaque data type typedefrsquod as a ULONG
ndash To implement synchronization a single bit is sufficient
Synchronization on SMP Systems
5050
do acquire_spinlock(DPC)until (SUCCESS)
begin remove DPC from queueend
release_spinlock(DPC)
do acquire_spinlock(DPC)until (SUCCESS)
begin remove DPC from queueend
release_spinlock(DPC)
Kernel Synchronization
Processor BProcessor A
Critical section
spinlock
DPC DPC
A spinlock is a locking primitive associatedwith a global data structure such as the DPC queue
5151
Queued Spinlocks
Problem Checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock other processors wait on processor-local flag
ndash Thus busy-wait loop requires no access to the memory bus
When releasing lock the 1st processorrsquos flag is modified
ndash Exactly one processor is being signaled
ndash Pre-determined wait order
5252
SMP Scalability Improvements
Windows 2000 queued spinlocks
ndash qlocks in Kernel Debugger
Server 2003
ndash More spinlocks eliminated (context swap system space commit)
ndash Further reduction of use of spinlocks amp length they are held
ndash Scheduling database now per-CPUAllows thread state transitions in parallel
5353
SMP Scalability Improvements
XP2003
ndash Minimized lock contention for hot locks PFN or Page Frame Database lock
ndash Some locks completely eliminatedCharging nonpagedpaged pool quotas allocating and mapping system page table entries charging commitment of pages allocatingmapping physical memory through AWE functions
ndash New more efficient locking mechanism (pushlocks)Doesnrsquot use spinlocks when no contentionUsed for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
ndash Wait for one or multiple objects in one call
ndash Wait for multiple can wait for ldquoanyrdquo one or ldquoallrdquo at onceldquoAllrdquo all objects must be in the signalled state concurrently to resolve the wait
ndash All wait calls include optional timeout argument
ndash Waiting threads consume no CPU time

Waiting
Waitable objects include
- Events (may be auto-reset or manual reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and Threads (signaled upon exit or terminate)
- Directories (change notification)

Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable

Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Figure: thread state diagram (Initialized, Ready, Standby, Running, Waiting, Transition, Terminated) showing the interaction with thread scheduling: create and initialize thread object; thread waits on an object handle; wait is complete when the object is set to the signaled state]

Interaction between Synchronization & Dispatching
1. User mode thread waits on an event object's handle
2. Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
3. Another thread sets the event
4. Kernel wakes up waiting threads; variable priority threads get priority boost
5. Dispatcher re-schedules new thread: it may preempt a running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
6. If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later

What signals an object
For each dispatcher object: the system event that changes its state, and the effect of the signaled state on waiting threads

Mutex (kernel mode)
- nonsignaled to signaled: owning thread releases mutex
- signaled to nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread

Mutex (exported to user mode)
- nonsignaled to signaled: owning thread or other thread releases mutex
- signaled to nonsignaled: resumed thread acquires mutex
- Effect: kernel resumes one waiting thread

Semaphore
- nonsignaled to signaled: one thread releases the semaphore, freeing a resource
- signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect: kernel resumes one or more waiting threads

Event
- nonsignaled to signaled: a thread sets the event
- signaled to nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads

Event pair
- nonsignaled to signaled: dedicated thread sets one event in the event pair
- signaled to nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes waiting dedicated thread

Timer
- nonsignaled to signaled: timer expires
- signaled to nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads

Thread
- nonsignaled to signaled: thread terminates
- signaled to nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads

Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004.
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)

3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects

Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
  - See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx

Deferred Procedure Calls (DPCs)
[Figure: DPC queue: queue head linked to a chain of DPC objects]

Delivering a DPC
[Figure: interrupt dispatch table with IRQLs from High and Power failure down to Dispatch/DPC, APC and Low, next to the DPC queue and the dispatcher]
1. Timer expires; kernel queues DPC that will release all waiting threads. Kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped

Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs

Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()

Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented

Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"

Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two queues of APC objects, one for kernel mode (K) and one for user mode (U)]

IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 & not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
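
The four accounting rules can be written as one decision function. This is a minimal sketch of the rules as stated on the slide; `charge_t` and `charge_clock_tick` are hypothetical names, and the `was_user_mode` flag is an assumption about how the user/kernel split below IRQL 2 is decided.

```c
/* Where one clock tick gets charged. */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/*
 * irql           - IRQL of the interrupted code when the clock ISR fired
 * processing_dpc - nonzero if a DPC routine was running
 * was_user_mode  - nonzero if the interrupted code ran in user mode
 */
static charge_t charge_clock_tick(int irql, int processing_dpc, int was_user_mode)
{
    if (irql > 2)
        return CHARGE_INTERRUPT;              /* servicing a device interrupt */
    if (irql == 2)
        return processing_dpc ? CHARGE_DPC    /* running a DPC routine */
                              : CHARGE_THREAD_KERNEL;
    /* IRQL 0 or 1: ordinary thread execution */
    return was_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Only the state sampled at the tick matters, which is why work that starts and finishes between two ticks is never charged to anyone.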

IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)

Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs

Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column

Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat

Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]

Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (header with Size, Type, State, Wait list head, then object-type-specific data) and thread objects (WaitBlockList) connected through wait blocks; each wait block holds List entry, Object, Thread, Key, Type and Next link]

3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots

Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
    VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
    VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
    VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
    VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
    BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );

Critical Section Example
    /* counter is global, shared by all threads */
    volatile int counter = 0;
    CRITICAL_SECTION crit;
    InitializeCriticalSection ( &crit );
    /* ... main loop in any of the threads */
    while (!done) {
        EnterCriticalSection ( &crit );
        __try {
            counter += local_value;
        }
        __finally {
            /* leave exactly once per Enter, on normal and exceptional exit */
            LeaveCriticalSection ( &crit );
        }
    }
    DeleteCriticalSection( &crit );

Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads:
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
    DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
    DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );

Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
  - dwTimeout == 0: no wait, check whether object is signaled
  - dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for first signaled object or all objects
  - When waiting for the first signaled object, the return value minus WAIT_OBJECT_0 is the index of that object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
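
Decoding the return value of a wait-any call is easy to get wrong, so here is a portable sketch of the arithmetic. The four constants are redefined locally with the values from the Windows headers so the block compiles without `windows.h`; `wait_kind_t` and `decode_wait_result` are hypothetical names.

```c
#include <stdint.h>

/* Values as defined in the Windows SDK headers, repeated for portability. */
#define WAIT_OBJECT_0     0x00000000u
#define WAIT_ABANDONED_0  0x00000080u
#define WAIT_TIMEOUT      0x00000102u
#define WAIT_FAILED       0xFFFFFFFFu

typedef enum {
    WAIT_KIND_SIGNALED,    /* *index = position in the handle array */
    WAIT_KIND_ABANDONED,   /* *index = position of the abandoned mutex */
    WAIT_KIND_TIMEOUT,
    WAIT_KIND_FAILED
} wait_kind_t;

/* Classifies the DWORD returned by a wait on `count` handles (bWaitAll == FALSE). */
static wait_kind_t decode_wait_result(uint32_t ret, uint32_t count, uint32_t *index)
{
    if (ret == WAIT_TIMEOUT)
        return WAIT_KIND_TIMEOUT;
    if (ret < WAIT_OBJECT_0 + count) {
        *index = ret - WAIT_OBJECT_0;
        return WAIT_KIND_SIGNALED;
    }
    if (ret >= WAIT_ABANDONED_0 && ret < WAIT_ABANDONED_0 + count) {
        *index = ret - WAIT_ABANDONED_0;
        return WAIT_KIND_ABANDONED;
    }
    return WAIT_KIND_FAILED;   /* includes WAIT_FAILED */
}
```

With `bWaitAll == TRUE` the index carries no information, because the call only returns once every object is signaled.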

Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object

Mutexes
    HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
    HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
    BOOL ReleaseMutex( HANDLE hMutex );

Mutex Example
    /* counter is global, shared by all threads */
    volatile int done, counter = 0;
    HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
    /* main loop in any of the threads, ret is local */
    DWORD ret;
    while (!done) {
        ret = WaitForSingleObject( mutex, INFINITE );
        if (ret == WAIT_OBJECT_0)
            counter += local_value;
        else        /* mutex was abandoned */
            break;  /* exit the loop */
        ReleaseMutex( mutex );
    }
    CloseHandle( mutex );

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
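
The blocking and polling styles can be put side by side in a short POSIX sketch. The shared counter mirrors the Windows mutex example; `counter_lock`, `add_blocking` and `add_if_free` are hypothetical names, and the mutex is assumed to be of the default (non-recursive) type.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

/* Blocking style: waits as long as needed, like a wait with INFINITE. */
static void add_blocking(int value)
{
    pthread_mutex_lock(&counter_lock);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/*
 * Polling style: like a wait with timeout 0.
 * Returns 1 if the update happened, 0 if the mutex was busy,
 * -1 on any other error.
 */
static int add_if_free(int value)
{
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc == EBUSY)
        return 0;
    if (rc != 0)
        return -1;
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return 1;
}
```

A caller of `add_if_free` has to decide what to do on a 0 result (retry later, do other work), which is the usual trade-off of polling against blocking.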

Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value (up to the maximum count)
- ReleaseSemaphore() returns old semaphore count through lpPreviousCount
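
The counting rules can be modeled without any operating system support. This is an illustrative single-threaded model of the bookkeeping only (no blocking, no atomicity); `sem_model_t` and its functions are hypothetical names.

```c
/* Bookkeeping of a counting semaphore: current count and fixed maximum. */
typedef struct {
    long count;
    long max;
} sem_model_t;

/* A semaphore is signaled while its count is greater than zero. */
static int sem_model_is_signaled(const sem_model_t *s)
{
    return s->count > 0;
}

/* A successful wait takes one unit. Returns 0 if the caller would block. */
static int sem_model_wait(sem_model_t *s)
{
    if (s->count <= 0)
        return 0;
    s->count--;
    return 1;
}

/*
 * Release adds `n` units at once and reports the old count.
 * Fails (returns 0, count unchanged) if the maximum would be exceeded.
 */
static int sem_model_release(sem_model_t *s, long n, long *previous)
{
    if (n <= 0 || n > s->max - s->count)
        return 0;
    if (previous != 0)
        *previous = s->count;
    s->count += n;
    return 1;
}
```

Releasing several units in one call is what lets one producer wake more than one waiting consumer at a time.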

Semaphores
    HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
    HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
    BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state
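
How many threads each operation releases can be captured in a small state model. This sketch tracks only the signaled flag and a waiter count; `event_model_t` and its functions are hypothetical names, and real events involve scheduling details the model ignores.

```c
typedef struct {
    int manual_reset;   /* nonzero: manual-reset event, zero: auto-reset */
    int signaled;
    int waiters;        /* threads currently blocked on the event */
} event_model_t;

/* Models SetEvent(): returns the number of waiters released. */
static int event_model_set(event_model_t *e)
{
    int released;
    if (e->manual_reset) {
        released = e->waiters;      /* everyone goes, event stays signaled */
        e->waiters = 0;
        e->signaled = 1;
    } else if (e->waiters > 0) {
        released = 1;               /* one thread goes, event resets itself */
        e->waiters--;
        e->signaled = 0;
    } else {
        released = 0;               /* nobody waiting: stay signaled */
        e->signaled = 1;
    }
    return released;
}

/* Models PulseEvent(): release current waiters, then leave the event reset. */
static int event_model_pulse(event_model_t *e)
{
    int released = e->manual_reset ? e->waiters : (e->waiters > 0 ? 1 : 0);
    e->waiters -= released;
    e->signaled = 0;
    return released;
}

/* Models ResetEvent(). */
static void event_model_reset(event_model_t *e)
{
    e->signaled = 0;
}
```

The manual-reset case is the barrier pattern from the slide: all waiters pass, and later arrivals also pass until someone resets the event.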

Events
    HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
    BOOL SetEvent( HANDLE hEvent );
    BOOL ResetEvent( HANDLE hEvent );
    BOOL PulseEvent( HANDLE hEvent );

Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events

Anonymous pipes
    BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2; prog1 writes into the pipe and prog2 reads from it]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way

I/O Redirection using an Anonymous Pipe
    /* Create default size anonymous pipe, handles are inheritable */
    if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
        fprintf(stderr, "Anon pipe create failed\n"); exit(1);
    }
    /* Set output handle to pipe handle, create first process */
    StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
    StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    StartInfoCh1.hStdOutput = hWritePipe;
    StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE,
            0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
        fprintf(stderr, "CreateProc1 failed\n"); exit(2);
    }
    CloseHandle (hWritePipe);

Pipe example (contd.)
    /* Repeat (symmetrically) for the second process */
    StartInfoCh2.hStdInput = hReadPipe;
    StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
    StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL,
            TRUE,   /* Inherit handles */
            0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
        fprintf(stderr, "CreateProc2 failed\n"); exit(3);
    }
    CloseHandle (hReadPipe);
    /* Wait for both processes to complete */
    WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
    WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
    CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
    CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
    return 0;

Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple, independent instances of a named pipe
- Several clients can communicate with a single server through instances of the same named pipe
- Server can respond to client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions

Using Named Pipes
    HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on remote machine (. means local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)

Named Pipes (contd.)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
    BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );

Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo

Convenience Functions
    BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence

Convenience Functions
    BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
    BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns false if client connected between the CreateNamedPipe call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual read/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios

Client Example using Named Pipe
    WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
    hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
            0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hNamedPipe == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "Failure to locate server.\n"); exit(3);
    }
    /* Write the request */
    WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
    /* Read each response and send it to std out */
    while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
        printf ("%s", ResponseRecord);
    CloseHandle (hNamedPipe);
    return 0;

Server Example Using a Named Pipe
    hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
            PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
            1, 0, 0, CS_TIMEOUT, pNPSA);
    while (!Done) {
        printf ("Server is awaiting next request.\n");
        if (!ConnectNamedPipe (hNamedPipe, NULL)
                || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
            fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
        }
        printf ("Request is: %s\n", RequestRecord);
        /* Send the file, one line at a time, to the client */
        fp = fopen (File, "r");
        while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
            WriteFile (hNamedPipe, &ResponseRecord,
                    (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
        fclose (fp);
        DisconnectNamedPipe (hNamedPipe);
    }
    /* End of server operation */

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many comm.)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 byte)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain

Locate a server via mailslot
[Figure: application clients 0 to n act as mailslot servers; the application server acts as mailslot client and sends its status message periodically]
Mailslot servers (App client 0 ... App client n):
    hMS = CreateMailslot( "\\.\mailslot\status" );
    ReadFile( hMS, &ServStat );
    /* connect to server */
Mailslot client (App Server), message is sent periodically:
    while (...) {
        Sleep(...);
        hMS = CreateFile( "\\*\mailslot\status" );
        WriteFile( hMS, &StatInfo );
    }

Creating a mailslot
    HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is msg size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait

Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieve handle for local mailslot
- \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max. msg. len: 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts

Thoughts Change Life

Interrupt Prioritization & Delivery
IRQLs are determined as follows:
- x86 UP systems: IRQL = 27 - IRQ
- x86 MP systems: bucketized (random)
- x64 & IA64 systems: IRQL = IDT vector number / 16
On MP systems, which processor is chosen to deliver an interrupt?
- By default, any processor can receive an interrupt from any device
  - Can be configured with IntFilter utility in Resource Kit
- On x86 and x64 systems, the IOAPIC (I/O advanced programmable interrupt controller) is programmed to interrupt the processor running at the lowest IRQL
- On IA64 systems, the SAPIC (streamlined advanced programmable interrupt controller) is configured to interrupt one processor for each interrupt source
  - Processors are assigned round robin for each interrupt vector
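
The two deterministic mappings can be written down directly. This is a sketch of the arithmetic on the slide only; the function names are hypothetical, and the x86 MP case is left out because the slide describes it as bucketized rather than computed.

```c
/* x86 uniprocessor systems: IRQL = 27 - IRQ. */
static int irql_from_irq_x86_up(int irq)
{
    return 27 - irq;
}

/* x64 and IA64 systems: IRQL = IDT vector number / 16 (integer division). */
static int irql_from_vector(int idt_vector)
{
    return idt_vector / 16;
}
```

On x86 UP a lower IRQ number therefore maps to a higher IRQL, while on x64 all 16 vectors in the same block share one IRQL.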

Software interrupts
Initiating thread dispatching
- DPCs allow for scheduling actions when kernel is deep within many layers of code
- Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations

Flow of Interrupts

Synchronization on SMP Systems
Sync on MP: use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
- Spinlock is either free or is considered to be owned by a CPU
- Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
- Accessed with a test-and-modify operation that is atomic across all processors
- KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
- To implement synchronization, a single bit is sufficient
5050
do acquire_spinlock(DPC)until (SUCCESS)
begin remove DPC from queueend
release_spinlock(DPC)
do acquire_spinlock(DPC)until (SUCCESS)
begin remove DPC from queueend
release_spinlock(DPC)
Kernel Synchronization
Processor BProcessor A
Critical section
spinlock
DPC DPC
A spinlock is a locking primitive associatedwith a global data structure such as the DPC queue
5151
Queued Spinlocks
Problem Checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock other processors wait on processor-local flag
ndash Thus busy-wait loop requires no access to the memory bus
When releasing lock the 1st processorrsquos flag is modified
ndash Exactly one processor is being signaled
ndash Pre-determined wait order
5252
SMP Scalability Improvements
Windows 2000 queued spinlocks
ndash qlocks in Kernel Debugger
Server 2003
ndash More spinlocks eliminated (context swap system space commit)
ndash Further reduction of use of spinlocks amp length they are held
ndash Scheduling database now per-CPUAllows thread state transitions in parallel
5353
SMP Scalability Improvements
XP2003
ndash Minimized lock contention for hot locks PFN or Page Frame Database lock
ndash Some locks completely eliminatedCharging nonpagedpaged pool quotas allocating and mapping system page table entries charging commitment of pages allocatingmapping physical memory through AWE functions
ndash New more efficient locking mechanism (pushlocks)Doesnrsquot use spinlocks when no contentionUsed for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
ndash Wait for one or multiple objects in one call
ndash Wait for multiple can wait for ldquoanyrdquo one or ldquoallrdquo at onceldquoAllrdquo all objects must be in the signalled state concurrently to resolve the wait
ndash All wait calls include optional timeout argument
ndash Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
ndash Events (may be auto-reset or manual reset may be set or ldquopulsedrdquo)
ndash Mutexes (ldquomutual exclusionrdquo one-at-a-time)
ndash Semaphores (n-at-a-time)
ndash Timers
ndash Processes and Threads (signalled upon exit or terminate)
ndash Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
ndash If multiple threads are waiting for an object and only one thread is released (eg itrsquos a mutex or auto-reset event) which thread gets released is unpredictable
5757
Executive Synchronization
Thread waitson an object
handle
Create and initialize thread object
Initialized
Ready
Transition
Waiting
Running
Terminated
Standby
Wait is completeSet object to
signaled state
Interaction with thread scheduling
Waiting on Dispatcher Objects ndash outside the kernel
5858
Interaction bet Synchronization amp Dispatching User mode thread waits on an event objectlsquos handle
Kernel changes threadlsquos scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads variable priority threads get priority boost
5959
Interaction bet Synchronization amp Dispatching Dispatcher re-schedules new thread ndash it may preempt running thread it it has lower priority and issues software interrupt to initiate context switch
If no processor can be preempted the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
Dispatcher object
System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
Owning thread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (kernel mode)
nonsignaled signaled
Owning thread or otherthread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (exported to user mode)
nonsignaled signaled
One thread releases thesemaphore freeing a resource
A thread acquires the semaphoreMore resources are not available
Kernel resumes one or more waiting threads
Semaphore
6161
A thread reinitializesthe thread object
What signals an object (contd)
Dispatcher object System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
A thread sets the event
Kernel resumes one or more threads
Kernel resumes one or more waiting threads
Event
nonsignaled signaled
Dedicated thread sets oneevent in the event pair
Kernel resumes theother dedicated thread
Kernel resumes waitingdedicated thread
Event pair
nonsignaled signaled
Timer expires
A thread (re) initializes the timer
Kernel resumes all waiting threads
Timer
nonsignaled signaled
Thread terminates
Kernel resumes all waiting threads
Thread
6262
Further Reading
Mark E Russinovich and David A Solomon Microsoft Windows Internals 4th Edition Microsoft Press 2004
Chapter 3 - System Mechanisms
ndash Trap Dispatching (pp 85 ff)
ndash Synchronization (pp 149 ff)
ndash Kernel Event Tracing (pp 175 ff)
6363
33 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues amp Dispatcher Objects
6464
Used to defer processing from higher (device) interrupt level to a lower (dispatch) levelndash Also used for quantum end and timer expiration
Driver (usually ISR) queues requestndash One queue per CPU DPCs are normally queued to the current processor but can be targeted to other CPUs
ndash Executes specified procedure at dispatch IRQL (or ldquodispatch levelrdquo also ldquoDPC levelrdquo) when all higher-IRQL work (interrupts) completed
ndash Maximum times recommended ISR 10 usec DPC 25 usec
See httpwwwmicrosoftcomwhdcdriverperformmmdrvmspx
Deferred Procedure Calls (DPCs)
6565
queue head DPC object DPC object DPC object
Deferred Procedure Calls (DPCs)
6666
DPC
Delivering a DPC
DPC routines can call kernel functionsbut canlsquot call system services generatepage faults or create or wait on objects
DPC routines canlsquotassume whatprocess addressspace is currentlymapped
Interruptdispatch table
high
Power failure
DispatchDPC
APC
Low
DPC
1 Timer expires kernelqueues DPC that willrelease all waiting threadsKernel requests SW int
DPCDPC
DPC queue
2 DPC interrupt occurswhen IRQL drops belowdispatchDPC level
dispatcher
3 After DPC interruptcontrol transfers tothread dispatcher
4 Dispatcher executes each DPCroutine in DPC queue
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user threadndash APC routines can acquire resources (objects) incur page faultscall system services
APC queue is thread-specific
User mode amp kernel mode APCsndash Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread spacendash Wait for asynchronous IO operation
ndash Emulate delivery of POSIX signals
ndash Make threads suspendterminate itself (env subsystems)
APCs are delivered when thread is in alertable wait statendash WaitForMultipleObjectsEx() SleepEx()
6969
Special kernel APCs
ndash Run in kernel mode at IRQL 1
ndash Always deliverable unless thread is already at IRQL 1 or above
ndash Used for IO completion reporting from ldquoarbitrary thread contextrdquo
ndash Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
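The deliverability rules for the three APC kinds (special kernel, ordinary kernel, user mode) can be collected into one predicate. The sketch below is a plain C model of those rules as stated on the slides; apc_kind, thread_state and apc_deliverable are hypothetical names, not Windows definitions.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum { APC_SPECIAL_KERNEL, APC_ORDINARY_KERNEL, APC_USER } apc_kind;

typedef struct {
    int  irql;                  /* 0 = passive level, 1 = APC level */
    bool kernel_apcs_disabled;  /* e.g. inside a KeEnterCriticalRegion region */
    bool alertable_wait;        /* e.g. blocked in SleepEx() with alertable set */
} thread_state;

/* Returns true if an APC of the given kind could be delivered right now. */
bool apc_deliverable(apc_kind kind, const thread_state *t) {
    assert(t->irql >= 0);
    switch (kind) {
    case APC_SPECIAL_KERNEL:    /* blocked only by IRQL 1 or above */
        return t->irql < 1;
    case APC_ORDINARY_KERNEL:   /* needs IRQL 0 and must not be disabled */
        return t->irql == 0 && !t->kernel_apcs_disabled;
    case APC_USER:              /* needs an alertable wait */
        return t->alertable_wait;
    }
    return false;
}
```

For a thread at IRQL 0 inside a critical region, the model reports that special kernel APCs still get through while ordinary kernel APCs do not.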
7171
Asynchronous Procedure Calls (APCs)
[Figure: thread object with a kernel-mode (K) and a user-mode (U) APC queue, each linking APC objects]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
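The four accounting rules translate directly into a decision function. This is a minimal C sketch of the rule table only; charge_target and charge_clock_tick are hypothetical names, and the user_mode flag is an assumed input that tells the model whether the interrupted thread was running user or kernel code.

```c
#include <assert.h>
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_target;

/* Decide which counter one clock tick is charged to. */
charge_target charge_clock_tick(int irql, bool processing_dpc, bool user_mode) {
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT;                    /* above dispatch level */
    if (irql == 2)
        return processing_dpc ? CHARGE_DPC          /* running a DPC routine */
                              : CHARGE_THREAD_KERNEL;
    return user_mode ? CHARGE_THREAD_USER           /* IRQL 0 or 1 */
                     : CHARGE_THREAD_KERNEL;
}
```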
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column
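A small calculation shows why tick-driven sampling can miss a thread entirely. The sketch below assumes clock ticks fire at multiples of the period (the first one a full period after time 0) and counts how many ticks land inside a run interval; ticks_charged is a hypothetical helper, not a system call.

```c
#include <assert.h>

/* Number of clock ticks that fire while a thread runs during
   [start_us, end_us). Ticks are assumed at k * period_us for k = 1, 2, ... */
long ticks_charged(long start_us, long end_us, long period_us) {
    long first, last_excl;
    assert(period_us > 0 && start_us >= 0 && start_us <= end_us);
    first = (start_us + period_us - 1) / period_us;     /* first k with k*period >= start */
    if (first == 0)
        first = 1;                                      /* no tick at time 0 */
    last_excl = (end_us + period_us - 1) / period_us;   /* first k with k*period >= end */
    return last_excl > first ? last_excl - first : 0;
}
```

With a 10 msec period, a thread that runs from 1 msec to 9 msec is charged nothing, while one that runs from 9 msec to 11 msec is charged a whole tick for 2 msec of work.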
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header (see \ntddk\inc\ddk\ntddk.h)
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout; common header (Size, Type, State, Wait list head) followed by object-type-specific data]
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects, each with a WaitBlockList, and dispatcher objects (header with Size, Type, State, Wait list head, then object-type-specific data); each wait block holds Key, Type, Next link, List entry, Object, and Thread fields and links one thread to one dispatcher object]
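The relationship between threads, wait blocks and dispatcher objects can be modeled with a few linked structures. This is a simplified sketch, not the layout from ntddk.h: all type and field names below are hypothetical, and the "wait" is only evaluated, never blocked on.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { WAIT_ANY, WAIT_ALL } wait_type;

struct wait_block;
struct thread;

typedef struct {                        /* simplified dispatcher header */
    int type;
    bool signaled;
    struct wait_block *wait_list;       /* blocks of threads waiting on this object */
} dispatcher_object;

typedef struct wait_block {
    dispatcher_object *object;          /* what is being waited for */
    struct thread *waiter;              /* who is waiting */
    int key;                            /* position in the wait call's handle array */
    wait_type type;                     /* wait for "any" or "all" */
    struct wait_block *next_for_thread; /* chain of blocks from one wait call */
    struct wait_block *next_for_object; /* chain on the object's wait list */
} wait_block;

typedef struct thread { wait_block *wait_block_list; } thread;

/* Link one wait block per object to both the thread and the object. */
void wait_on(thread *t, dispatcher_object *objs[], wait_block blocks[],
             int n, wait_type type) {
    assert(n > 0);
    t->wait_block_list = NULL;
    for (int i = n - 1; i >= 0; i--) {
        blocks[i].object = objs[i];
        blocks[i].waiter = t;
        blocks[i].key = i;
        blocks[i].type = type;
        blocks[i].next_for_thread = t->wait_block_list;
        t->wait_block_list = &blocks[i];
        blocks[i].next_for_object = objs[i]->wait_list;
        objs[i]->wait_list = &blocks[i];
    }
}

/* WAIT_ANY: key of the first signaled object; WAIT_ALL: 0 once every
   object is signaled; -1 while the thread must keep waiting. */
int wait_satisfied(const thread *t) {
    const wait_block *b = t->wait_block_list;
    if (b == NULL) return -1;
    if (b->type == WAIT_ANY) {
        for (; b; b = b->next_for_thread)
            if (b->object->signaled) return b->key;
        return -1;
    }
    for (; b; b = b->next_for_thread)
        if (!b->object->signaled) return -1;
    return 0;
}
```

The key field is what lets a wait-any call report which argument position was satisfied.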
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* runs on every exit path, so the section is left exactly once */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
– dwTimeout == 0: no wait, check whether object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for first signaled object or all objects
– When waiting for any object, function returns index of first signaled object (as WAIT_OBJECT_0 + index)
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex() with the same mutex name
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0) {
        counter += local_value;
    } else {
        /* mutex was abandoned */
        break; /* exit the loop */
    }
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
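For comparison with the Windows mutex example, the same shared-counter pattern can be written with pthreads. This is a minimal sketch: run_workers, worker and try_lock_counter are hypothetical helper names, and the limit of 16 threads is an arbitrary choice for the example.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;                /* shared by all threads */

typedef struct { long iterations; } worker_arg;

/* Each worker increments the shared counter under the mutex. */
static void *worker(void *p) {
    long n = ((worker_arg *)p)->iterations;
    for (long i = 0; i < n; i++) {
        pthread_mutex_lock(&counter_lock);      /* blocks until acquired */
        counter++;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Start nthreads workers, wait for them, return the final counter. */
long run_workers(int nthreads, long iterations) {
    pthread_t tid[16];
    worker_arg arg = { iterations };
    int started = 0;
    assert(nthreads > 0 && nthreads <= 16);
    counter = 0;
    for (int i = 0; i < nthreads; i++)
        if (pthread_create(&tid[i], NULL, worker, &arg) == 0)
            tid[started++] = tid[i];
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return counter;
}

/* Nonblocking attempt, analogous to a wait with timeout 0.
   Returns 1 if the lock was acquired, 0 if it was busy. */
int try_lock_counter(void) {
    return pthread_mutex_trylock(&counter_lock) == 0;
}
```

Unlike a Windows mutex, a default pthread mutex is not recursive, so a second try_lock_counter() from the owning thread reports busy instead of succeeding.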
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value, up to the semaphore's maximum
– ReleaseSemaphore() returns old semaphore count through lpPreviousCount
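The counting rules can be captured in a small single-threaded model. The sketch below only mirrors the bookkeeping (signaled when count > 0, wait decrements by 1, release adds any positive amount up to the maximum and reports the old count); sem_model and its functions are hypothetical names and perform no real blocking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct { long count, max; } sem_model;

/* Mirrors the initial and maximum count arguments of semaphore creation. */
bool sem_model_init(sem_model *s, long initial, long max) {
    if (max <= 0 || initial < 0 || initial > max) return false;
    s->count = initial;
    s->max = max;
    return true;
}

/* A semaphore is signaled while its count is greater than zero. */
bool sem_model_signaled(const sem_model *s) { return s->count > 0; }

/* Models a wait with timeout 0: succeeds and decrements if signaled. */
bool sem_model_try_wait(sem_model *s) {
    if (s->count == 0) return false;
    s->count--;
    return true;
}

/* Models release: fails without changing the count if the result would
   exceed the maximum; otherwise stores the old count and adds. */
bool sem_model_release(sem_model *s, long release_count, long *previous) {
    if (release_count <= 0 || s->count + release_count > s->max) return false;
    if (previous != NULL) *previous = s->count;
    s->count += release_count;
    return true;
}
```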
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
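Since condition variables carry no state of their own, an event-like object is usually built from a condition variable plus a mutex and a flag. The following is one possible sketch, not a faithful reimplementation of Windows events: the event type and its functions are hypothetical, and PulseEvent() is deliberately left out.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool signaled;          /* the state a bare condition variable lacks */
    bool manual_reset;
} event;

void event_init(event *e, bool manual_reset, bool initial_state) {
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
    e->manual_reset = manual_reset;
}

void event_destroy(event *e) {
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}

/* Manual-reset: wake every waiter. Auto-reset: wake a single waiter. */
void event_set(event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    if (e->manual_reset) pthread_cond_broadcast(&e->cond);
    else                 pthread_cond_signal(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

void event_reset(event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until signaled; an auto-reset event is consumed by the waiter. */
void event_wait(event *e) {
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    if (!e->manual_reset)
        e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

bool event_is_set(event *e) {
    bool s;
    pthread_mutex_lock(&e->lock);
    s = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return s;
}
```

The while loop around pthread_cond_wait() guards against spurious wakeups, which is why the flag, not the condition variable, decides whether the wait is satisfied.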
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
[Figure: main creates a pipe that connects prog1 to prog2]
9494
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple, independent instances of a named pipe
– Several clients can communicate with a single server using the same instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– Location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on remote machine (. stands for the local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Use same flag settings for all instances of a named pipe
9898
Named Pipes (contd.)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence into one call
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle into one call
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns false if client connected between CreateNamedPipe call and ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
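The fixed-size record advice for byte-oriented channels can be illustrated in POSIX C. In this sketch an ordinary pipe() stands in for a FIFO so the example needs no file system path; RECORD_SIZE, record, send_record and recv_record are hypothetical names chosen for the illustration.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

#define RECORD_SIZE 64      /* every message occupies exactly this many bytes */

typedef struct { char text[RECORD_SIZE]; } record;

/* Pad the message to a full record and write it in a single call.
   Returns 0 on success, -1 on error. */
int send_record(int fd, const char *msg) {
    record r;
    memset(&r, 0, sizeof r);
    strncpy(r.text, msg, RECORD_SIZE - 1);      /* keeps a terminating NUL */
    return write(fd, &r, sizeof r) == (ssize_t)sizeof r ? 0 : -1;
}

/* Read exactly one record, looping because a byte stream may hand
   back fewer bytes than requested. Returns 0 on success, -1 otherwise. */
int recv_record(int fd, record *r) {
    size_t got = 0;
    while (got < sizeof *r) {
        ssize_t n = read(fd, (char *)r + got, sizeof *r - got);
        if (n <= 0) return -1;
        got += (size_t)n;
    }
    return 0;
}
```

Because the reader always consumes RECORD_SIZE bytes, two requests written back to back are recovered as two separate messages even though the channel itself has no message boundaries.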
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, Response.Record, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", Response.Record);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (Response.Record, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &Response.Record,
                (strlen(Response.Record) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 byte)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
107107
Locate a server via mailslot
Mailslot servers (app client 0 ... app client n):
    hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile( hMS, &ServStat ); /* connect to server */
Mailslot client (app server):
    while (...) { Sleep(...); hMS = CreateFile( "\\*\mailslot\status" ); ... WriteFile( hMS, &StatInfo ) }
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max. message length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
4747
Software interrupts
Initiating thread dispatching
– DPCs allow for scheduling actions when kernel is deep within many layers of code
– Delayed scheduling decision; one DPC queue per processor
Handling timer expiration
Asynchronous execution of a procedure in context of a particular thread
Support for asynchronous I/O operations
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Synchronization on MP systems: use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
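The "data cell plus atomic test-and-modify" idea can be shown with C11 atomics. This is a minimal user-space sketch, not the kernel's KSPIN_LOCK code: the spinlock type and the spin_ functions are hypothetical names, and a real kernel spinlock also raises IRQL, which this model omits.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

typedef struct { atomic_flag cell; } spinlock;   /* one bit of state is enough */

void spin_init(spinlock *l) {
    atomic_flag_clear(&l->cell);                 /* clear = free */
}

/* Atomic test-and-set: returns true if the lock was free and is now ours. */
bool spin_try_acquire(spinlock *l) {
    return !atomic_flag_test_and_set_explicit(&l->cell, memory_order_acquire);
}

/* Busy-wait until the test-and-set succeeds. */
void spin_acquire(spinlock *l) {
    while (!spin_try_acquire(l)) {
        /* spin: every retry is another atomic access to the shared cell */
    }
}

void spin_release(spinlock *l) {
    atomic_flag_clear_explicit(&l->cell, memory_order_release);
}
```

Each failed attempt in spin_acquire() touches the same shared memory cell, which is the source of the bus contention that queued spinlocks are meant to avoid.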
5050
Kernel Synchronization
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Processor A and Processor B each execute:
    do acquire_spinlock(DPC) until (SUCCESS)
    begin
        remove DPC from queue    (critical section)
    end
    release_spinlock(DPC)
[Figure: both processors contend for the spinlock that guards the shared DPC queue]
5151
Queued Spinlocks
Problem: checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
– Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the flag of the 1st processor in the queue is modified
– Exactly one processor is being signaled
– Pre-determined wait order
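A well-known algorithm with exactly these properties is the MCS queue lock: each waiter spins on a flag in its own queue node and the releaser hands the lock to its successor. The C11 sketch below uses it as an illustration only; it is an assumption that it resembles the Windows implementation, and qlock, qnode and the qlock_ functions are hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct qnode {
    _Atomic(struct qnode *) next;   /* successor in the wait queue */
    atomic_bool locked;             /* local flag this waiter spins on */
} qnode;

typedef struct { _Atomic(qnode *) tail; } qlock;   /* NULL tail = lock free */

void qlock_init(qlock *l) { atomic_init(&l->tail, NULL); }

/* Append our node to the queue; spin on our own flag if someone is ahead. */
void qlock_acquire(qlock *l, qnode *me) {
    qnode *prev;
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    prev = atomic_exchange(&l->tail, me);
    if (prev == NULL)
        return;                         /* queue was empty: lock acquired */
    atomic_store(&prev->next, me);      /* link behind the predecessor */
    while (atomic_load(&me->locked)) {
        /* spin on a flag only this processor polls */
    }
}

/* Hand the lock to the next waiter, or mark the lock free. */
void qlock_release(qlock *l, qnode *me) {
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;                     /* nobody was waiting */
        while ((succ = atomic_load(&me->next)) == NULL) {
            /* a waiter swapped the tail but has not linked itself yet */
        }
    }
    atomic_store(&succ->locked, false); /* signal exactly one processor */
}
```

Waiters are served in the order in which they swapped themselves into the tail, which gives the pre-determined wait order.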
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when no contention; used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once; "all": all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include optional timeout argument
– Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
– Events (may be auto-reset or manual reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
Interaction with thread scheduling
– Create and initialize thread object
– Thread waits on an object handle
– Set object to signaled state; wait is complete
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition, Terminated]
5858
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread; it may preempt running thread if it has lower priority, and issues software interrupt to initiate context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object?
For each dispatcher object: system events and resulting state change; effect of signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled → signaled: owning thread releases mutex
– signaled → nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled → signaled: owning thread or other thread releases mutex
– signaled → nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Semaphore
– nonsignaled → signaled: one thread releases the semaphore, freeing a resource
– signaled → nonsignaled: a thread acquires the semaphore; more resources are not available
– Effect: kernel resumes one or more waiting threads
6161
What signals an object? (contd.)
For each dispatcher object: system events and resulting state change; effect of signaled state on waiting threads
Event
– nonsignaled → signaled: a thread sets the event
– signaled → nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled → signaled: dedicated thread sets one event in the event pair
– signaled → nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes waiting dedicated thread
Timer
– nonsignaled → signaled: timer expires
– signaled → nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– nonsignaled → signaled: thread terminates
– signaled → nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues requestndash One queue per CPU DPCs are normally queued to the current processor but can be targeted to other CPUs
ndash Executes specified procedure at dispatch IRQL (or ldquodispatch levelrdquo also ldquoDPC levelrdquo) when all higher-IRQL work (interrupts) completed
ndash Maximum times recommended ISR 10 usec DPC 25 usec
See httpwwwmicrosoftcomwhdcdriverperformmmdrvmspx
Deferred Procedure Calls (DPCs)
6565
queue head DPC object DPC object DPC object
Deferred Procedure Calls (DPCs)
6666
DPC
Delivering a DPC
DPC routines can call kernel functionsbut canlsquot call system services generatepage faults or create or wait on objects
DPC routines canlsquotassume whatprocess addressspace is currentlymapped
Interruptdispatch table
high
Power failure
DispatchDPC
APC
Low
DPC
1 Timer expires kernelqueues DPC that willrelease all waiting threadsKernel requests SW int
DPCDPC
DPC queue
2 DPC interrupt occurswhen IRQL drops belowdispatchDPC level
dispatcher
3 After DPC interruptcontrol transfers tothread dispatcher
4 Dispatcher executes each DPCroutine in DPC queue
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user threadndash APC routines can acquire resources (objects) incur page faultscall system services
APC queue is thread-specific
User mode amp kernel mode APCsndash Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread spacendash Wait for asynchronous IO operation
ndash Emulate delivery of POSIX signals
ndash Make threads suspendterminate itself (env subsystems)
APCs are delivered when thread is in alertable wait statendash WaitForMultipleObjectsEx() SleepEx()
6969
Special kernel APCs
ndash Run in kernel mode at IRQL 1
ndash Always deliverable unless thread is already at IRQL 1 or above
ndash Used for IO completion reporting from ldquoarbitrary thread contextrdquo
ndash Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
7070
ldquoOrdinaryrdquo kernel APCs
ndash Always deliverable if at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
ndash Used for IO completion callback routines (see ReadFileEx WriteFileEx) also QueueUserApc
ndash Only deliverable when thread is in ldquoalertable waitrdquo
Asynchronous Procedure Calls (APCs)
7171
ThreadObject
K
U
APC objects
Asynchronous Procedure Calls (APCs)
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
ndash If IRQLlt2 charge to threadrsquos user or kernel time
ndash If IRQL=2 and processing a DPC charge to DPC time
ndash If IRQL=2 amp not processing a DPC charge to thread kernel time
ndash If IRQLgt2 charge to interrupt time
7373
IRQLs and CPU Time Accounting
Since time servicing interrupts are NOT charged to interrupted thread if system is busy but no process appears to be running must be due to interrupt-related activity
ndash Note time at IRQL 2 or more is charged to the current threadrsquos quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any threadprocess
ndash Process Explorer shows these as separate processesnot really processes
ndash Context switches for these are really of interrupts amp DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
ndash Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals NOT accounted
ndash Eg one or more threads run and enter a wait state before clock fires
ndash Thus threads may run but never get charged
View context switch activity with Process Explorer
ndash Add Context Switch Delta column
7676
For waiting threads user-mode utilities only display the wait reason
Example pstat
Looking at Waiting Threads
7777
Wait Internals 1 Dispatcher Objects
Size TypeState
Wait listhead
Object-type-specific data
DispatcherObject
(see ntddkincddkntddkh)
Any kernel object you can wait for is a ldquodispatcher objectrdquo
ndash some exclusively for synchronizationeg events mutexes (ldquomutantsrdquo) semaphores queues timers
ndash others can be waited for as a side effect of their prime function eg processes threads file objects
ndash non-waitable kernel objects are called ldquocontrol objectsrdquo
All dispatcher objects have a common header
All dispatcher objects are in one of two states
ndash ldquosignaledrdquo vs ldquononsignaledrdquo
ndash when signalled a wait on the object is satisfied
ndash different object types differ in terms of what changes their state
ndash wait and unwait implementation iscommon to all types of dispatcher objects
7878
Object-type-specific data
Wait Internals 2Wait Blocks
Size TypeState
Wait listhead
Size TypeState
Wait listhead
Represent a threadrsquos reference to something itrsquos waiting for (one per handle passed to WaitForhellip)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for ldquoanyrdquo or ldquoallrdquo Key denotes argument list position for
WaitForMultipleObjects
Object-type-specific data
DispatcherObjects
Thread Objects
WaitBlockListWaitBlockList
Wait blocks
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
7979
34 Windows APIs for Synchronization and IPC Windows API constructs for synchronization and interprocess communication
Synchronization
ndash Critical sections
ndash Mutexes
ndash Semaphores
ndash Event objects
Synchronization through interprocess communication
ndash Anonymous pipes
ndash Named pipes
ndash Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec ) VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec )
8181
Critical Section Example
counter is global shared by all threads
volatile int counter = 0
CRITICAL_SECTION crit
InitializeCriticalSection ( ampcrit )
hellip main loop in any of the threads
while (done)
_try
EnterCriticalSection ( ampcrit )
counter += local_value
LeaveCriticalSection ( ampcrit )
_finally LeaveCriticalSection ( ampcrit )
DeleteCriticalSection( ampcrit )
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
ndash Processes
ndash Threads
ndash Files
ndash Console input
File change notificationsFile change notifications
MutexesMutexes
Events (auto-reset + manual-reset)Events (auto-reset + manual-reset)
Waitable timersWaitable timers
DWORD WaitForSingleObject( HANDLE hObject DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects LPHANDLE lpHandles BOOL bWaitAll DWORD dwTimeout )
8383
Wait Functions - Details
WaitForSingleObject()ndash hObject specifies kernel object
ndash dwTimeout specifies wait time in msecdwTimeout == 0 - no wait check whether object is signaled
dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()ndash cObjects lt= MAXIMUM_WAIT_OBJECTS (64)
ndash lpHandles - pointer to array identifying these objects
ndash bWaitAll - whether to wait for first signaled object or all objectsFunction returns index of first signaled object
Side effectsndash Mutexes auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
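The comparison above can be sketched in portable C with pthreads. This is a minimal illustration, not code from the slides: the names counter_lock, add_blocking, add_if_free and run_counter_demo are hypothetical, and the worker limit of 16 is an arbitrary assumption.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Shared counter protected by a statically initialised POSIX mutex. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

/* Blocking acquire: comparable to WaitForSingleObject(hMutex, INFINITE). */
void add_blocking(int value)
{
    int rc = pthread_mutex_lock(&counter_lock);
    assert(rc == 0);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling acquire: comparable to WaitForSingleObject(hMutex, 0).
   Returns 1 if the counter was updated, 0 if the mutex was busy. */
int add_if_free(int value)
{
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc == EBUSY)
        return 0;
    assert(rc == 0);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return 1;
}

static void *worker(void *arg)
{
    int n = *(int *)arg;
    for (int i = 0; i < n; i++)
        add_blocking(1);
    return NULL;
}

/* Starts `threads` workers that each add 1 to the counter `iterations`
   times, waits for them, and returns the final counter value. */
int run_counter_demo(int threads, int iterations)
{
    pthread_t tid[16];
    assert(threads > 0 && threads <= 16);
    counter = 0;
    for (int i = 0; i < threads; i++)
        pthread_create(&tid[i], NULL, worker, &iterations);
    for (int i = 0; i < threads; i++)
        pthread_join(tid[i], NULL);
    return counter;
}
```

Calling run_counter_demo(4, 1000) should leave the counter at 4000 because every increment happens while the mutex is held. Unlike a Windows mutex, a default pthread mutex has no notion of an abandoned owner, so the sketch has no counterpart to that branch of the Windows example.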
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() returns old semaphore count through lpPreviousCount
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
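Resource counting with a semaphore can be tried out with POSIX unnamed semaphores, used here as a stand-in for the Windows semaphore object. The sketch assumes a platform that supports sem_init (Linux does); pool_t and the pool_* functions are hypothetical names.

```c
#include <assert.h>
#include <semaphore.h>

/* A pool of identical resources guarded by a counting semaphore. */
typedef struct {
    sem_t free_slots;   /* count > 0 means "signaled": a resource is free */
} pool_t;

/* Creates a pool holding `n` free resources. Returns 0 on success. */
int pool_init(pool_t *p, unsigned n)
{
    return sem_init(&p->free_slots, 0, n);
}

/* Non-blocking acquire, comparable to a wait with timeout 0.
   Returns 1 and decrements the count on success, 0 if nothing is free. */
int pool_try_acquire(pool_t *p)
{
    return sem_trywait(&p->free_slots) == 0 ? 1 : 0;
}

/* Releases `count` resources and returns the count seen just before the
   release, loosely mirroring lpPreviousCount. Reading the value and
   posting are separate steps here, so the result is only exact when no
   other thread touches the semaphore in between. */
int pool_release(pool_t *p, int count)
{
    int previous = 0;
    assert(count > 0);
    sem_getvalue(&p->free_slots, &previous);
    for (int i = 0; i < count; i++)
        sem_post(&p->free_slots);
    return previous;
}

void pool_destroy(pool_t *p)
{
    sem_destroy(&p->free_slots);
}
```

With a pool of two resources, two calls to pool_try_acquire succeed and a third fails until pool_release hands resources back.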
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal() - one thread
– pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
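Although pthreads offers no direct counterpart, a manual-reset event can be approximated with a condition variable, a mutex and a boolean flag. The following is one possible sketch under that assumption; manual_event_t, the event_* functions and release_waiters are hypothetical names, and the limit of 16 threads is arbitrary.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Approximation of a manual-reset event: the flag keeps the state,
   the condition variable wakes the waiters. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} manual_event_t;

void event_init(manual_event_t *e, bool initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent(): stays signaled and releases all waiting threads. */
void event_set(manual_event_t *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(): later waiters block again. */
void event_reset(manual_event_t *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Like WaitForSingleObject(hEvent, INFINITE); the loop guards against
   spurious wakeups. */
void event_wait(manual_event_t *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

static manual_event_t start_gate;
static pthread_mutex_t passed_lock = PTHREAD_MUTEX_INITIALIZER;
static int passed;

static void *waiter(void *arg)
{
    (void)arg;
    event_wait(&start_gate);
    pthread_mutex_lock(&passed_lock);
    passed++;
    pthread_mutex_unlock(&passed_lock);
    return NULL;
}

/* Starts `n` waiting threads, sets the event once, and returns how many
   threads got past the gate. */
int release_waiters(int n)
{
    pthread_t tid[16];
    assert(n > 0 && n <= 16);
    passed = 0;
    event_init(&start_gate, false);
    for (int i = 0; i < n; i++)
        pthread_create(&tid[i], NULL, waiter, NULL);
    event_set(&start_gate);
    for (int i = 0; i < n; i++)
        pthread_join(tid[i], NULL);
    return passed;
}
```

Because the flag stays set, threads that arrive after event_set() pass through immediately, which is the property a bare condition variable lacks.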
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
IO Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message-oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive – is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
WriteFile, ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns false if client connected between CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic (for writes, up to PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
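The fixed-size-record convention for FIFOs might look like the sketch below. It is only an illustration: RECORD_SIZE, fifo_roundtrip and the path passed in are hypothetical, and both ends are opened in one process (read end non-blocking) purely so the example runs without a second program.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_SIZE 32   /* fixed record length, an assumption of this sketch */

/* Creates a FIFO at `path`, writes one fixed-size record containing `text`
   and reads it back into `out` (which must hold RECORD_SIZE bytes).
   Returns 0 on success, -1 on any failure. The FIFO is removed again. */
int fifo_roundtrip(const char *path, const char *text, char *out)
{
    char record[RECORD_SIZE] = {0};
    int rd, wr, ok = -1;

    assert(strlen(text) < RECORD_SIZE);
    strcpy(record, text);

    if (mkfifo(path, 0600) != 0)          /* counterpart to CreateNamedPipe() */
        return -1;

    /* Opening the read end non-blocking succeeds without a writer. */
    rd = open(path, O_RDONLY | O_NONBLOCK);
    if (rd < 0) {
        unlink(path);
        return -1;
    }

    /* A reader exists now, so the blocking open for writing returns at once. */
    wr = open(path, O_WRONLY);
    if (wr >= 0) {
        /* RECORD_SIZE is far below PIPE_BUF, so the write is atomic. */
        if (write(wr, record, RECORD_SIZE) == RECORD_SIZE &&
            read(rd, out, RECORD_SIZE) == RECORD_SIZE)
            ok = 0;
        close(wr);
    }
    close(rd);
    unlink(path);
    return ok;
}
```

In a real client/server setup the two ends would live in different processes, and the server would open one additional FIFO per client for the responses, as described above.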
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile() – open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
107107
Locate a server via mailslot
[Figure: app clients 0 to n act as mailslot servers; each one runs
    hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile( hMS, &ServStat ); /* connect to server */
The app server acts as mailslot client and sends its status message periodically:
    while () { Sleep(); hMS = CreateFile( "\\*\mailslot\status" ); WriteFile( hMS, &StatInfo ); }]
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name - retrieve handle for local mailslot
– \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max. message length: 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
4848
Flow of Interrupts
4949
Synchronization on SMP Systems
Sync on MP: use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
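The idea of a spinlock as a single memory cell accessed with an atomic test-and-set can be illustrated in user-mode C11. This is a teaching sketch and not the Windows kernel implementation: spinlock_t and the spin_* functions are hypothetical names, and threads stand in for processors.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* One flag is enough: set means "owned", clear means "free". */
typedef struct {
    atomic_flag held;
} spinlock_t;

void spin_init(spinlock_t *l)
{
    atomic_flag_clear(&l->held);
}

/* Atomic test-and-set: returns the previous value and sets the flag in
   one indivisible step, so exactly one caller sees "was free". */
void spin_acquire(spinlock_t *l)
{
    while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
        ;   /* busy-wait until the owner clears the flag */
}

void spin_release(spinlock_t *l)
{
    atomic_flag_clear_explicit(&l->held, memory_order_release);
}

static spinlock_t total_lock;
static long shared_total;

static void *spin_worker(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        spin_acquire(&total_lock);
        shared_total++;             /* critical section */
        spin_release(&total_lock);
    }
    return NULL;
}

/* Runs `threads` workers that each increment the shared total
   `iterations` times under the spinlock; returns the final total. */
long run_spin_demo(int threads, long iterations)
{
    pthread_t tid[16];
    assert(threads > 0 && threads <= 16);
    spin_init(&total_lock);
    shared_total = 0;
    for (int i = 0; i < threads; i++)
        pthread_create(&tid[i], NULL, spin_worker, &iterations);
    for (int i = 0; i < threads; i++)
        pthread_join(tid[i], NULL);
    return shared_total;
}
```

Every waiter hammers the same flag, which is the bus-contention problem that queued spinlocks are meant to reduce.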
5050
Kernel Synchronization
[Figure: Processor A and Processor B both execute the sequence below; the spinlock guards the critical section that operates on the DPC queue]
do acquire_spinlock(DPC) until (SUCCESS)
begin
    remove DPC from queue
end
release_spinlock(DPC)
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
5151
Queued Spinlocks
Problem: Checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock; other processors wait on processor-local flag
– Thus, busy-wait loop requires no access to the memory bus
When releasing lock, the flag of the first waiting processor in the queue is modified
– Exactly one processor is being signaled
– Pre-determined wait order
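A well-known textbook lock with these properties is the MCS queue lock, shown here as a user-mode C11 sketch. It is offered as an analogy, not as the Windows implementation; all names are hypothetical, threads stand in for processors, and the sched_yield() calls are a user-mode courtesy that a kernel spinlock would not have.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Each waiter brings its own node and spins only on its own flag. */
typedef struct mcs_node {
    _Atomic(struct mcs_node *) next;
    atomic_bool                locked;
} mcs_node;

typedef struct {
    _Atomic(mcs_node *) tail;   /* last waiter in the queue, NULL if free */
} mcs_lock;

void mcs_acquire(mcs_lock *l, mcs_node *me)
{
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    mcs_node *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);                /* link behind predecessor */
        while (atomic_load(&me->locked))              /* spin on local flag only */
            sched_yield();
    }
}

void mcs_release(mcs_lock *l, mcs_node *me)
{
    mcs_node *succ = atomic_load(&me->next);
    if (succ == NULL) {
        mcs_node *expected = me;
        /* Nobody queued behind us: mark the lock free. */
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;
        /* A successor is in the middle of linking itself; wait for it. */
        while ((succ = atomic_load(&me->next)) == NULL)
            sched_yield();
    }
    atomic_store(&succ->locked, false);   /* signal exactly one waiter */
}

static mcs_lock demo_lock;
static long demo_total;

static void *mcs_worker(void *arg)
{
    long n = *(long *)arg;
    mcs_node me;                          /* per-thread queue node */
    for (long i = 0; i < n; i++) {
        mcs_acquire(&demo_lock, &me);
        demo_total++;
        mcs_release(&demo_lock, &me);
    }
    return NULL;
}

/* Runs `threads` workers, each incrementing the total `iterations` times. */
long run_mcs_demo(int threads, long iterations)
{
    pthread_t tid[16];
    assert(threads > 0 && threads <= 16);
    atomic_store(&demo_lock.tail, NULL);
    demo_total = 0;
    for (int i = 0; i < threads; i++)
        pthread_create(&tid[i], NULL, mcs_worker, &iterations);
    for (int i = 0; i < threads; i++)
        pthread_join(tid[i], NULL);
    return demo_total;
}
```

The queue fixes the wake-up order (first come, first served), and a release writes to one waiter's flag instead of a location that every waiter polls.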
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when no contention; used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once; "all": all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include optional timeout argument
– Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
– Events (may be auto-reset or manual-reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
[Figure: interaction with thread scheduling, shown on the thread state diagram (Initialized, Ready, Standby, Running, Waiting, Transition, Terminated): a thread object is created and initialized; the thread waits on an object handle; once the object is set to the signaled state, the wait is complete]
5858
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread – it may preempt running thread if it has lower priority and issues software interrupt to initiate context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled → signaled: Owning thread releases mutex
– signaled → nonsignaled: Resumed thread acquires mutex
– Effect on waiting threads: Kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled → signaled: Owning thread or other thread releases mutex
– signaled → nonsignaled: Resumed thread acquires mutex
– Effect on waiting threads: Kernel resumes one waiting thread
Semaphore
– nonsignaled → signaled: One thread releases the semaphore, freeing a resource
– signaled → nonsignaled: A thread acquires the semaphore; more resources are not available
– Effect on waiting threads: Kernel resumes one or more waiting threads
6161
What signals an object (contd)
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Event
– nonsignaled → signaled: A thread sets the event
– signaled → nonsignaled: Kernel resumes one or more threads
– Effect on waiting threads: Kernel resumes one or more waiting threads
Event pair
– nonsignaled → signaled: Dedicated thread sets one event in the event pair
– signaled → nonsignaled: Kernel resumes the other dedicated thread
– Effect on waiting threads: Kernel resumes waiting dedicated thread
Timer
– nonsignaled → signaled: Timer expires
– signaled → nonsignaled: A thread (re)initializes the timer
– Effect on waiting threads: Kernel resumes all waiting threads
Thread
– nonsignaled → signaled: Thread terminates
– signaled → nonsignaled: A thread reinitializes the thread object
– Effect on waiting threads: Kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004.
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues amp Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: DPC queue, a queue head followed by a chain of DPC objects]
6666
Delivering a DPC
[Figure: interrupt dispatch table with IRQLs from High and Power failure down through Dispatch/DPC and APC to Low, together with the DPC queue and the dispatcher]
1. Timer expires; kernel queues DPC that will release all waiting threads. Kernel requests SW int.
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode, at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable, but not documented
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: thread object with two APC queues, kernel-mode (K) and user-mode (U), each linking a chain of APC objects]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 & not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
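The accounting rule above can be written down as a small decision function. The sketch is purely illustrative: the enum, the function name account_tick and its parameters are hypothetical and do not correspond to actual kernel routines.

```c
#include <assert.h>
#include <stdbool.h>

/* Where one clock tick is charged (hypothetical names). */
enum charge_target {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC_TIME,
    CHARGE_INTERRUPT_TIME
};

/* Decides where the clock ISR charges a tick, given the IRQL that was
   interrupted, whether a DPC was being processed, and whether the
   interrupted code ran in user mode. */
enum charge_target account_tick(int irql, bool in_dpc, bool was_user_mode)
{
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT_TIME;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC_TIME : CHARGE_THREAD_KERNEL;
    /* IRQL 0 or 1: ordinary thread execution */
    return was_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

For example, account_tick(2, true, false) yields CHARGE_DPC_TIME, while account_tick(5, false, false) yields CHARGE_INTERRUPT_TIME.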
7373
IRQLs and CPU Time Accounting
Time spent servicing interrupts is NOT charged to the interrupted thread, so if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really # of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals NOT accounted
– E.g., one or more threads run and enter a wait state before clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout, a header holding Size, Type, State and the wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects point through their WaitBlockList to chains of wait blocks; each wait block holds a list entry, Object and Thread pointers, Key, Type and a Next link, and is queued on the wait list head of a dispatcher object (Size, Type, State, wait list head, object-type-specific data)]
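How "any" and "all" waits could be resolved from a chain of wait blocks is sketched below. The structures are heavily simplified, hypothetical stand-ins for the kernel's dispatcher header and wait block, keeping only the fields named above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified dispatcher header: only the signaled state is modelled. */
typedef struct {
    bool signaled;
} dispatcher_header;

typedef enum { WAIT_ANY, WAIT_ALL } wait_type;

/* Simplified wait block: one per object passed to the wait call. */
typedef struct wait_block {
    dispatcher_header *object;   /* object being waited on */
    wait_type          type;     /* "any" or "all" */
    int                key;      /* position in the argument list */
    struct wait_block *next;     /* next wait block of the same wait call */
} wait_block;

/* Checks whether the wait described by the chain `list` is satisfied.
   WAIT_ANY: returns the key of the first signaled object in the chain.
   WAIT_ALL: returns 0 once every object is signaled.
   Returns -1 while the wait cannot be resolved. */
int wait_resolved(const wait_block *list)
{
    if (list == NULL)
        return -1;
    if (list->type == WAIT_ANY) {
        for (const wait_block *b = list; b != NULL; b = b->next) {
            assert(b->object != NULL);
            if (b->object->signaled)
                return b->key;
        }
        return -1;
    }
    for (const wait_block *b = list; b != NULL; b = b->next) {
        assert(b->object != NULL);
        if (!b->object->signaled)
            return -1;
    }
    return 0;
}
```

The key returned for an "any" wait plays the same role as the index that WaitForMultipleObjects reports to its caller.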
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec ) VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec )
8181
Critical Section Example
counter is global shared by all threads
volatile int counter = 0
CRITICAL_SECTION crit
InitializeCriticalSection ( ampcrit )
hellip main loop in any of the threads
while (done)
_try
EnterCriticalSection ( ampcrit )
counter += local_value
LeaveCriticalSection ( ampcrit )
_finally LeaveCriticalSection ( ampcrit )
DeleteCriticalSection( ampcrit )
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
ndash Processes
ndash Threads
ndash Files
ndash Console input
File change notificationsFile change notifications
MutexesMutexes
Events (auto-reset + manual-reset)Events (auto-reset + manual-reset)
Waitable timersWaitable timers
DWORD WaitForSingleObject( HANDLE hObject DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects LPHANDLE lpHandles BOOL bWaitAll DWORD dwTimeout )
8383
Wait Functions - Details
WaitForSingleObject()ndash hObject specifies kernel object
ndash dwTimeout specifies wait time in msecdwTimeout == 0 - no wait check whether object is signaled
dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()ndash cObjects lt= MAXIMUM_WAIT_OBJECTS (64)
ndash lpHandles - pointer to array identifying these objects
ndash bWaitAll - whether to wait for first signaled object or all objectsFunction returns index of first signaled object
Side effectsndash Mutexes auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
HANDLE OpenMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
BOOL ReleaseMutex( HANDLE hMutex )
8686
Mutex Example
counter is global shared by all threads
volatile int done counter = 0
HANDLE mutex = CreateMutex( NULL FALSE NULL )
main loop in any of the threads ret is local
DWORD ret
while (done)
ret = WaitForSingleObject( mutex INFINITE )
if (ret == WAIT_OBJECT_0)
counter += local_value
else mutex was abandoned
break exit the loop
ReleaseMutex( mutex )
CloseHandle( mutex )
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexesndash Synchronization among threads in same process
Five basic functionsndash pthread_mutex_init()
ndash pthread_mutex_destroy()
ndash pthread_mutex_lock()
ndash pthread_mutex_unlock()
ndash pthread_mutex_trylock()
Comparisonndash pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex )
ndash pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
8888
Semaphores
Semaphore objects are used for resource countingndash A semaphore is signaled when count gt 0
Threadsprocesses use wait functionsndash Each wait function decreases semaphore count by 1
ndash ReleaseSemaphore() may increment count by any value
ndash ReleaseSemaphore() returns old semaphore count
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE OpenSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE ReleaseSemaphore( HANDLE hSemaphoreLONG cReleaseCount LPLONG lpPreviousCount )
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)ndash Manual-reset event can signal several thread simultaneously must be reset manually
ndash PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
ndash Auto-reset event signals a single thread event is reset automatically
ndash fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTE lpsaBOOL fManualReset BOOL fInititalStateLPTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )BOOL ResetEvent( HANDLE hEvent )BOOL PulseEvent( HANDLE hEvent )
9292
Comparison - POSIX condition variables
pthreadrsquos condition variables are comparable to eventsndash pthread_cond_init()
ndash pthread_cond_destroy()
Wait functionsndash pthread_cond_wait()
ndash pthread_cond_timedwait()
Signalingndash pthread_cond_signal() - one thread
ndash pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phReadPHANDLE phWriteLPSECURITY_ATTRIBUTES lpsaDWORD cbPipe )
main
prog1 prog2pipe
Half-duplex character-based IPC
cbPipe pipe byte size zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are oneway
9494
IO Redirection using an Anonymous Pipe
Create default size anonymous pipe handles are inheritable
if (CreatePipe (amphReadPipe amphWritePipe ampPipeSA 0))
fprintf(stderr ldquoAnon pipe create failednrdquo) exit(1)
Set output handle to pipe handle create first processes
StartInfoCh1hStdInput = GetStdHandle (STD_INPUT_HANDLE)
StartInfoCh1hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh1hStdOutput = hWritePipe
StartInfoCh1dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)Command1 NULL NULL TRUE 0 NULL NULL ampStartInfoCh1 ampProcInfo1))
fprintf(stderr ldquoCreateProc1 failednrdquo) exit(2)
CloseHandle (hWritePipe)
9595
Pipe example (contd)
Repeat (symmetrically) for the second process
StartInfoCh2hStdInput = hReadPipe
StartInfoCh2hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh2hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE)
StartInfoCh2dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)targv NULL NULLTRUE Inherit handles
0 NULL NULL ampStartInfoCh2 ampProcInfo2))
fprintf(stderr ldquoCreateProc2 failednrdquo) exit(3)
CloseHandle (hReadPipe)
Wait for both processes to complete
WaitForSingleObject (ProcInfo1hProcess INFINITE)
WaitForSingleObject (ProcInfo2hProcess INFINITE)
CloseHandle (ProcInfo1hThread) CloseHandle (ProcInfo1hProcess)
CloseHandle (ProcInfo2hThread) CloseHandle (ProcInfo2hProcess)
return 0
9696
Named Pipes
Message orientedndash Reading process can read varying-length messages precisely as sent by the writing process
Bi-directionalndash Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipendash Several clients can communicate with a single server using the same instance
ndash Server can respond to client using the same instance
Pipe can be accessed over the networkndash location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeNameDWORD fdwOpenMode DWORD fdwPipModeDWORD nMaxInstances DWORD cbOutBufDWORD cbInBuf DWORD dwTimeOutLPSECURITY_ATTRIBUTES lpsa )
Use same flag settings forall instances of a named pipe
lpszPipeName pipe[path]pipename
ndash Not possible to create a pipe on remote machine ( ndash local machine)
fdwOpenMode
ndash PIPE_ACCESS_DUPLEX PIPE_ACCESS_INBOUND PIPE_ACCESS_OUTBOUND
fdwPipeMode
ndash PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
ndash PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
ndash PIPE_WAIT or PIPE_NOWAIT (will ReadFile block)
9898
Named Pipes (contd)
BOOL PeekNamedPipe (HANDLE hPipeLPVOID lpvBuffer DWORD cbBufferLPDWORD lpcbRead LPDWORD lpcbAvailLPDWORD lpcbMessage)
nMaxInstances
ndash Number of instances
ndash PIPE_UNLIMITED_INSTANCES OS choice based on resources
dwTimeOut
ndash Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
ndash Closing handle to last instance deletes named pipe
Polling a pipe
ndash Nondestructive ndash is there a message waiting for ReadFile
9999
Named Pipe Client Connections
CreateFile with named pipe namendash pipe[path]pipename
ndash servernamepipe[path]pipename
ndash First method gives better performance (local server)
Status Functionsndash GetNamedPipeHandleState
ndash SetNamedPipeHandleState
ndash GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe (HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa)
– Combines a WriteFile, ReadFile sequence in one call
BOOL CallNamedPipe (LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut)
– Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo)
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe() to free the handle for a connection from another client
WaitNamedPipe()
– Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to named pipes
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
Client Example Using a Named Pipe
/* Wait for a free pipe instance, then open the client end */
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf (stderr, "Failure to locate server.\n"); exit (3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    /* Block until a client connects, then read its request */
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf (stderr, "Connect or Read Named Pipe error\n"); exit (4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while (fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL)
        WriteFile (hNamedPipe, ResponseRecord,
                (strlen (ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain
Locate a server via mailslot
Mailslot servers (application clients 0 … n), each running:
    hMS = CreateMailslot ("\\.\mailslot\status", ...);
    ReadFile (hMS, &ServStat, ...);
    /* connect to server */
Mailslot client (application server), sending its status message periodically:
    while (...) {
        Sleep (...);
        hMS = CreateFile ("\\.\mailslot\status", ...);
        WriteFile (hMS, &StatInfo, ...);
    }
Creating a mailslot
HANDLE CreateMailslot (LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa)
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieves handle for local mailslot
– \\host\mailslot\[path]name: retrieves handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max. message length 400 bytes
– Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
Synchronization on SMP Systems
Sync on MP: use spinlocks to coordinate among processors
Spinlock acquisition and release routines implement a one-owner-at-a-time algorithm
– Spinlock is either free or is considered to be owned by a CPU
– Analogous to using Windows API mutexes from user mode
A spinlock is just a data cell in memory
– Accessed with a test-and-modify operation that is atomic across all processors
– KSPIN_LOCK is an opaque data type, typedef'd as a ULONG
– To implement synchronization, a single bit is sufficient
Kernel Synchronization
A spinlock is a locking primitive associated with a global data structure, such as the DPC queue
Processor A and Processor B each run the same sequence; the spinlock guards the critical section on the shared DPC queue:
do
    acquire_spinlock(DPC)
until (SUCCESS)
begin
    remove DPC from queue
end
release_spinlock(DPC)
Queued Spinlocks
Problem: checking the status of a spinlock via a test-and-set operation creates bus contention
Queued spinlocks maintain a queue of waiting processors
First processor acquires the lock; other processors wait on a processor-local flag
– Thus the busy-wait loop requires no access to the memory bus
When releasing the lock, the flag of the first waiting processor is modified
– Exactly one processor is being signaled
– Pre-determined wait order
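The idea of waiting on a processor-local flag can be illustrated with an MCS-style queue lock in C11. This is a sketch of the general technique under the assumption that each waiter supplies its own queue node; it is not the Windows kernel's implementation, and the names qlock, qnode, qlock_acquire, and qlock_release are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiting processor; each waiter spins only on its own flag. */
typedef struct qnode {
    _Atomic(struct qnode *) next;
    atomic_bool locked;
} qnode;

/* The lock itself is just a pointer to the tail of the waiter queue. */
typedef struct { _Atomic(qnode *) tail; } qlock;

void qlock_init(qlock *l) { atomic_init(&l->tail, NULL); }

void qlock_acquire(qlock *l, qnode *me)
{
    assert(l != NULL && me != NULL);
    atomic_store(&me->next, NULL);
    atomic_store(&me->locked, true);
    qnode *prev = atomic_exchange(&l->tail, me);   /* join the queue */
    if (prev != NULL) {
        atomic_store(&prev->next, me);             /* link behind predecessor */
        while (atomic_load(&me->locked))
            ;                                      /* spin on the local flag */
    }
}

void qlock_release(qlock *l, qnode *me)
{
    qnode *succ = atomic_load(&me->next);
    if (succ == NULL) {
        qnode *expected = me;
        /* No visible successor: try to mark the lock free. */
        if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
            return;
        /* A waiter is enqueueing itself; wait until it has linked in. */
        while ((succ = atomic_load(&me->next)) == NULL)
            ;
    }
    atomic_store(&succ->locked, false);            /* signal exactly one waiter */
}
```

Waiters are released in the order in which they joined the queue, which corresponds to the pre-determined wait order.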
SMP Scalability Improvements
Windows 2000: queued spinlocks
– !qlocks in Kernel Debugger
Server 2003
– More spinlocks eliminated (context swap, system space, commit)
– Further reduction of use of spinlocks & length they are held
– Scheduling database now per-CPU; allows thread state transitions in parallel
XP/2003
– Minimized lock contention for hot locks (PFN or Page Frame Database lock)
– Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
– New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– Wait for multiple can wait for "any" one or "all" at once; "all" means all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include an optional timeout argument
– Waiting threads consume no CPU time
Waitable objects include
– Events (may be auto-reset or manual-reset, may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and Threads (signaled upon exit or terminate)
– Directories (change notification)
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object, and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
Interaction with thread scheduling
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition, and Terminated. A thread object is created and initialized; a running thread waits on an object handle and enters Waiting; once the object is set to the signaled state, the wait is complete.]
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get a priority boost
Dispatcher re-schedules new thread: it may preempt the running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object
For each dispatcher object: the system event that sets it to signaled, the event that returns it to nonsignaled, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
– Nonsignaled to signaled: owning thread releases mutex
– Signaled to nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– Nonsignaled to signaled: owning thread or other thread releases mutex
– Signaled to nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Semaphore
– Nonsignaled to signaled: one thread releases the semaphore, freeing a resource
– Signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
– Effect: kernel resumes one or more waiting threads
Event
– Nonsignaled to signaled: a thread sets the event
– Signaled to nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– Nonsignaled to signaled: dedicated thread sets one event in the event pair
– Signaled to nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes waiting dedicated thread
Timer
– Nonsignaled to signaled: timer expires
– Signaled to nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– Nonsignaled to signaled: thread terminates
– Signaled to nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU. DPCs are normally queued to the current processor, but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: DPC queue, a queue head linked to a chain of DPC objects]
Delivering a DPC
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table ordered by IRQL from High through Power failure, Dispatch/DPC, and APC down to Low, with the DPC queue feeding the dispatcher]
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel mode (K) and user mode (U), each a list of APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
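The four accounting rules amount to a small decision function. The sketch below restates them in C; the enum and function names are illustrative assumptions, and the real clock ISR works on kernel-internal state that is not shown here.

```c
#include <assert.h>

/* Hypothetical labels for where one clock tick gets charged. */
typedef enum {
    CHARGE_THREAD_USER_OR_KERNEL,  /* IRQL < 2 */
    CHARGE_DPC_TIME,               /* IRQL = 2, processing a DPC */
    CHARGE_THREAD_KERNEL,          /* IRQL = 2, not processing a DPC */
    CHARGE_INTERRUPT_TIME          /* IRQL > 2 */
} tick_charge;

/* Decide where a clock tick is charged, given the IRQL that was interrupted
 * and whether a DPC routine was running at that moment. */
tick_charge clock_tick_charge(int irql, int processing_dpc)
{
    assert(irql >= 0);
    if (irql < 2)
        return CHARGE_THREAD_USER_OR_KERNEL;
    if (irql == 2)
        return processing_dpc ? CHARGE_DPC_TIME : CHARGE_THREAD_KERNEL;
    return CHARGE_INTERRUPT_TIME;
}
```

Only the first and third cases add to a thread's CPU time; DPC and interrupt time are accounted separately.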
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g. one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– Some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header (see \ntddk\inc\ddk\ntddk.h)
– Fields: Size, Type, State, Wait list head, followed by object-type-specific data
All dispatcher objects are in one of two states
– "Signaled" vs. "nonsignaled"
– When signaled, a wait on the object is satisfied
– Different object types differ in terms of what changes their state
– Wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (Size, Type, State, Wait list head, object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type, and Next link]
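How a chain of wait blocks resolves an "any" or "all" wait can be modeled in a few lines of C. This is a toy model under stated assumptions: the structure layouts, field names, and the function toy_wait_satisfied are hypothetical simplifications, not the definitions from ntddk.h.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { TOY_WAIT_ANY, TOY_WAIT_ALL } toy_wait_type;

/* Reduced dispatcher header: only the signaled/nonsignaled state. */
typedef struct { int signaled; } toy_dispatcher_object;

/* One wait block per object handle passed to the wait call. */
typedef struct toy_wait_block {
    toy_dispatcher_object *object;  /* object being waited on */
    int key;                        /* position in the argument list */
    struct toy_wait_block *next;    /* next block of the same wait call */
} toy_wait_block;

/* Walks the chain of wait blocks of one wait call.
 * TOY_WAIT_ANY: returns the key of the first signaled object, or -1.
 * TOY_WAIT_ALL: returns 0 if every object is signaled, otherwise -1. */
int toy_wait_satisfied(const toy_wait_block *list, toy_wait_type type)
{
    assert(list != NULL);
    for (const toy_wait_block *b = list; b != NULL; b = b->next) {
        if (type == TOY_WAIT_ANY && b->object->signaled)
            return b->key;
        if (type == TOY_WAIT_ALL && !b->object->signaled)
            return -1;
    }
    return type == TOY_WAIT_ALL ? 0 : -1;
}
```

In this model the key plays the role the slide describes: it tells the caller which argument position satisfied an "any" wait.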
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to query whether another thread is in a critical section; TryEnterCriticalSection() only attempts entry without blocking
VOID InitializeCriticalSection (LPCRITICAL_SECTION sec)
VOID DeleteCriticalSection (LPCRITICAL_SECTION sec)
VOID EnterCriticalSection (LPCRITICAL_SECTION sec)
VOID LeaveCriticalSection (LPCRITICAL_SECTION sec)
BOOL TryEnterCriticalSection (LPCRITICAL_SECTION sec)
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection (&crit);
/* … main loop in any of the threads */
while (!done) {
    __try {
        EnterCriticalSection (&crit);
        counter += local_value;
    }
    /* leave exactly once, even if the guarded code raises an exception */
    __finally { LeaveCriticalSection (&crit); }
}
DeleteCriticalSection (&crit);
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Semaphores
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject (HANDLE hObject, DWORD dwTimeout)
DWORD WaitForMultipleObjects (DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout)
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
– dwTimeout == 0: no wait, check whether object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for the first signaled object or for all objects; when waiting for the first, the return value is WAIT_OBJECT_0 plus the index of the signaled object
Side effects
– Mutexes, auto-reset events, and waitable timers will be reset to non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
HANDLE CreateMutex (LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPTSTR lpszMutexName)
HANDLE OpenMutex (DWORD fdwAccess, BOOL fInherit, LPTSTR lpszMutexName)
BOOL ReleaseMutex (HANDLE hMutex)
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex (NULL, FALSE, NULL);
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject (mutex, INFINITE);
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else      /* mutex was abandoned */
        break;  /* exit the loop */
    ReleaseMutex (mutex);
}
CloseHandle (mutex);
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block: equivalent to WaitForSingleObject (hMutex, INFINITE)
– pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
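The blocking and the polling variant can be put side by side in a short pthreads sketch. The shared counter and the function names add_blocking and add_if_free are illustrative assumptions; the mutex uses the default attributes.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

/* Blocking acquire, comparable to WaitForSingleObject(hMutex, INFINITE). */
void add_blocking(int delta)
{
    int rc = pthread_mutex_lock(&counter_lock);
    assert(rc == 0);
    (void)rc;
    counter += delta;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling acquire, comparable to WaitForSingleObject(hMutex, 0).
 * Returns 1 if the counter was updated, 0 if the mutex was busy. */
int add_if_free(int delta)
{
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc == EBUSY)
        return 0;
    assert(rc == 0);
    counter += delta;
    pthread_mutex_unlock(&counter_lock);
    return 1;
}
```

A caller of add_if_free can do other work and retry later instead of sleeping on the mutex.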
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value, up to the semaphore's maximum
– ReleaseSemaphore() reports the old semaphore count through lpPreviousCount
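The counting rules can be captured in a single-threaded model. The sketch below is not thread-safe and is not the Windows implementation; toy_semaphore and its functions are hypothetical names that only mirror the described behavior of the wait functions and ReleaseSemaphore().

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of a counting semaphore (no locking, single-threaded use only). */
typedef struct { long count; long max; } toy_semaphore;

/* Signaled means: count > 0. */
int toy_sem_is_signaled(const toy_semaphore *s) { return s->count > 0; }

/* Models a wait with timeout 0: takes one unit if available.
 * Returns 1 if the wait was satisfied, 0 otherwise. */
int toy_sem_try_wait(toy_semaphore *s)
{
    if (s->count > 0) {
        s->count--;
        return 1;
    }
    return 0;
}

/* Models ReleaseSemaphore(): adds release_count units and reports the old
 * count. Fails (returns 0, count unchanged) if the maximum would be exceeded. */
int toy_sem_release(toy_semaphore *s, long release_count, long *previous)
{
    assert(s != NULL);
    if (release_count <= 0 || s->count + release_count > s->max)
        return 0;
    if (previous != NULL)
        *previous = s->count;
    s->count += release_count;
    return 1;
}
```

Releasing several units at once makes the semaphore signaled for that many subsequent waits.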
Semaphores
HANDLE CreateSemaphore (LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPTSTR lpszSemName)
HANDLE OpenSemaphore (DWORD fdwAccess, BOOL fInherit, LPTSTR lpszSemName)
BOOL ReleaseSemaphore (HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount)
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
HANDLE CreateEvent (LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPTSTR lpszEventName)
BOOL SetEvent (HANDLE hEvent)
BOOL ResetEvent (HANDLE hEvent)
BOOL PulseEvent (HANDLE hEvent)
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
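Because a condition variable keeps no state of its own, a manual-reset event is commonly approximated by pairing it with a mutex and a flag. The following is one such sketch; the type manual_event and its functions are hypothetical names, and timeouts and error handling are left out.

```c
#include <assert.h>
#include <pthread.h>

/* Approximation of a manual-reset event: mutex + condition variable + flag. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             signaled;
} manual_event;

void event_init(manual_event *e, int initial_state)
{
    assert(e != NULL);
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent(): stays signaled and releases all current waiters. */
void event_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 1;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(): later waiters block again. */
void event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 0;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; does not reset it. */
void event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

int event_is_set(manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    int state = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return state;
}

void event_destroy(manual_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what makes the signal persistent: a thread that calls event_wait after event_set returns at once, whereas a bare pthread_cond_broadcast would be lost to it.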
Anonymous pipes
BOOL CreatePipe (PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe)
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf (stderr, "Anon pipe create failed\n"); exit (1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE,
        0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf (stderr, "CreateProc1 failed\n"); exit (2);
}
CloseHandle (hWritePipe);
Pipe example (cont'd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf (stderr, "CreateProc2 failed\n"); exit (3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
5050
do acquire_spinlock(DPC)until (SUCCESS)
begin remove DPC from queueend
release_spinlock(DPC)
do acquire_spinlock(DPC)until (SUCCESS)
begin remove DPC from queueend
release_spinlock(DPC)
Kernel Synchronization
Processor BProcessor A
Critical section
spinlock
DPC DPC
A spinlock is a locking primitive associatedwith a global data structure such as the DPC queue
5151
Queued Spinlocks
Problem Checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock other processors wait on processor-local flag
ndash Thus busy-wait loop requires no access to the memory bus
When releasing lock the 1st processorrsquos flag is modified
ndash Exactly one processor is being signaled
ndash Pre-determined wait order
5252
SMP Scalability Improvements
Windows 2000 queued spinlocks
ndash qlocks in Kernel Debugger
Server 2003
ndash More spinlocks eliminated (context swap system space commit)
ndash Further reduction of use of spinlocks amp length they are held
ndash Scheduling database now per-CPUAllows thread state transitions in parallel
5353
SMP Scalability Improvements
XP2003
ndash Minimized lock contention for hot locks PFN or Page Frame Database lock
ndash Some locks completely eliminatedCharging nonpagedpaged pool quotas allocating and mapping system page table entries charging commitment of pages allocatingmapping physical memory through AWE functions
ndash New more efficient locking mechanism (pushlocks)Doesnrsquot use spinlocks when no contentionUsed for object manager and address windowing extensions (AWE) related locks

Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- A wait for multiple objects can wait for "any" one or for "all" at once; for "all", all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include an optional timeout argument
- Waiting threads consume no CPU time
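
The "any" versus "all" rule can be expressed as a small pure function. This is only a model of the semantics over an array of signaled flags, not a Windows API call; `model_wait` and `MODEL_WAIT_PENDING` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

#define MODEL_WAIT_PENDING (-1)   /* hypothetical: wait cannot be resolved yet */

/* Returns the lowest signaled index for an "any" wait, 0 for a satisfied
   "all" wait, or MODEL_WAIT_PENDING if the wait is not yet satisfied. */
static int model_wait(const bool signaled[], size_t count, bool wait_all) {
    size_t n_signaled = 0;
    int first = MODEL_WAIT_PENDING;
    for (size_t i = 0; i < count; i++) {
        if (signaled[i]) {
            if (first == MODEL_WAIT_PENDING)
                first = (int)i;
            n_signaled++;
        }
    }
    if (wait_all)
        return (count > 0 && n_signaled == count) ? 0 : MODEL_WAIT_PENDING;
    return first;
}
```

With flags `{false, true, true}`, an "any" wait resolves with index 1, while an "all" wait stays pending until the first object is signaled as well.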

Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and threads (signaled upon exit or termination)
- Directories (change notification)

Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable

Executive Synchronization
Waiting on dispatcher objects outside the kernel: interaction with thread scheduling
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition and Terminated. A thread object is created and initialized; a thread enters Waiting when it waits on an object handle; the wait is complete when the object is set to the signaled state.]

Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up the waiting threads; variable-priority threads get a priority boost

Interaction between Synchronization & Dispatching
Dispatcher reschedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later

What signals an object?
For each dispatcher object: the system event that changes its state, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
- Becomes signaled when: owning thread releases the mutex
- Becomes nonsignaled when: resumed thread acquires the mutex
- Effect on waiting threads: kernel resumes one waiting thread
Mutex (exported to user mode)
- Becomes signaled when: owning thread or other thread releases the mutex
- Becomes nonsignaled when: resumed thread acquires the mutex
- Effect on waiting threads: kernel resumes one waiting thread
Semaphore
- Becomes signaled when: one thread releases the semaphore, freeing a resource
- Becomes nonsignaled when: a thread acquires the semaphore and no more resources are available
- Effect on waiting threads: kernel resumes one or more waiting threads

What signals an object? (contd)
Event
- Becomes signaled when: a thread sets the event
- Becomes nonsignaled when: kernel resumes one or more threads
- Effect on waiting threads: kernel resumes one or more waiting threads
Event pair
- Becomes signaled when: dedicated thread sets one event in the event pair
- Becomes nonsignaled when: kernel resumes the other dedicated thread
- Effect on waiting threads: kernel resumes the waiting dedicated thread
Timer
- Becomes signaled when: timer expires
- Becomes nonsignaled when: a thread (re)initializes the timer
- Effect on waiting threads: kernel resumes all waiting threads
Thread
- Becomes signaled when: thread terminates
- Becomes nonsignaled when: a thread reinitializes the thread object
- Effect on waiting threads: kernel resumes all waiting threads

Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004.
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)

3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects

Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually the ISR) queues the request
- One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
- Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") once all higher-IRQL work (interrupts) has completed
- Recommended maximum times: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx

Deferred Procedure Calls (DPCs)
[Figure: the DPC queue, a queue head followed by a chain of DPC objects]
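
The per-CPU DPC queue can be modeled as a FIFO list of objects that each carry a routine pointer and a context argument. The sketch below is a user-mode illustration under that assumption, not the kernel's data structures; `dpc`, `dpc_queue`, `dpc_enqueue`, `dpc_drain` and `count_call` are hypothetical names.

```c
#include <stddef.h>

typedef void (*dpc_routine)(void *context);

/* One queued request: the deferred routine and its argument. */
typedef struct dpc {
    struct dpc *next;
    dpc_routine routine;
    void *context;
} dpc;

/* One queue per CPU in this model. */
typedef struct {
    dpc *head;
    dpc *tail;
} dpc_queue;

/* Called by the "ISR": append the request and return immediately. */
static void dpc_enqueue(dpc_queue *q, dpc *d) {
    d->next = NULL;
    if (q->tail != NULL)
        q->tail->next = d;
    else
        q->head = d;
    q->tail = d;
}

/* Called once higher-priority work is done: run every queued routine in
   FIFO order and return how many were executed. */
static unsigned dpc_drain(dpc_queue *q) {
    unsigned executed = 0;
    while (q->head != NULL) {
        dpc *d = q->head;
        q->head = d->next;
        if (q->head == NULL)
            q->tail = NULL;
        d->routine(d->context);
        executed++;
    }
    return executed;
}

/* Sample deferred routine: counts how often it was invoked. */
static void count_call(void *context) {
    ++*(int *)context;
}
```

The point of the split is that the enqueue step is short and can run at high interrupt level, while the longer work in the routine runs later from `dpc_drain`.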

Delivering a DPC
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with IRQLs from high to low (Power failure, Dispatch/DPC, APC, Low) and a DPC queue feeding the dispatcher]
1. Timer expires; the kernel queues a DPC that will release all waiting threads and requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue

Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, and call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
- Permission required for user-mode APCs

Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()

Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless the thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented

Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when the thread is in an "alertable wait"
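
The deliverability rules for the three APC kinds can be collected into one predicate. This is a simplified model of the rules as stated on these slides, not kernel code; `apc_kind`, `thread_state` and `apc_deliverable` are hypothetical names.

```c
#include <stdbool.h>

typedef enum { APC_SPECIAL_KERNEL, APC_NORMAL_KERNEL, APC_USER } apc_kind;

/* The parts of a thread's state that the rules depend on (model only). */
typedef struct {
    int irql;                   /* current IRQL of the thread */
    bool kernel_apcs_disabled;  /* e.g. after KeEnterCriticalRegion */
    bool in_alertable_wait;     /* e.g. inside SleepEx(..., TRUE) */
} thread_state;

static bool apc_deliverable(apc_kind kind, const thread_state *t) {
    switch (kind) {
    case APC_SPECIAL_KERNEL:
        /* deliverable unless the thread is already at IRQL 1 or above */
        return t->irql < 1;
    case APC_NORMAL_KERNEL:
        /* deliverable at IRQL 0 unless explicitly disabled */
        return t->irql == 0 && !t->kernel_apcs_disabled;
    case APC_USER:
        /* only deliverable while the thread is in an alertable wait */
        return t->in_alertable_wait;
    }
    return false;
}
```

In this model a thread that has disabled ordinary kernel APCs can still receive special kernel APCs, and a user-mode APC stays queued until the thread performs an alertable wait.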

Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel (K) and user (U), each a chain of APC objects]

IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to the thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to the thread's kernel time
- If IRQL > 2, charge to interrupt time
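
The four accounting rules amount to a small decision function evaluated on each clock tick. The sketch below models only that decision; `charge_bucket` and `charge_tick` are hypothetical names, and the `user_mode` parameter is an assumption standing in for however the clock ISR distinguishes user time from kernel time.

```c
#include <stdbool.h>

typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_bucket;

/* Decide which counter receives the tick that has just elapsed. */
static charge_bucket charge_tick(int irql, bool in_dpc, bool user_mode) {
    if (irql > 2)
        return CHARGE_INTERRUPT;                       /* IRQL > 2 */
    if (irql == 2)
        return in_dpc ? CHARGE_DPC                     /* IRQL = 2, in a DPC */
                      : CHARGE_THREAD_KERNEL;          /* IRQL = 2, no DPC */
    return user_mode ? CHARGE_THREAD_USER              /* IRQL < 2 */
                     : CHARGE_THREAD_KERNEL;
}
```

For example, a tick that arrives while a DPC routine is running goes to DPC time rather than to the thread that happened to be interrupted.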

IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)

Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (they are not really processes)
- Context switches shown for these are really counts of interrupts & DPCs

Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g. one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add the Context Switch Delta column
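
A short simulation makes the sampling effect concrete. It assumes a hypothetical thread that runs for `run_len` msec in every clock interval, starting `offset` msec into the interval, and counts the ticks at which that thread is the one running; `charged_ticks` is a made-up helper, not a system call.

```c
/* Count the clock ticks charged to a periodic thread.
   Assumes run_len <= interval; ticks fire at t = interval, 2*interval, ... */
static unsigned charged_ticks(unsigned interval, unsigned offset,
                              unsigned run_len, unsigned total) {
    unsigned ticks = 0;
    for (unsigned t = interval; t <= total; t += interval) {
        /* The thread is running at time t if t falls inside its run window. */
        if (t >= offset && (t - offset) % interval < run_len)
            ticks++;
    }
    return ticks;
}
```

With a 10 msec interval over 1000 msec, a thread that runs 3 msec starting 1 msec into each interval is never running when the clock fires, so it is charged nothing although it used about 300 msec. The same thread starting 8 msec into each interval straddles every tick and is charged for all 100 of them.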

Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat

Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exist exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout, a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]

Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: each thread object points to its WaitBlockList; each wait block holds a list entry, Object and Thread pointers, Key, Type and a Next link; each dispatcher object (Size, Type, State, Wait list head, object-type-specific data) heads the list of wait blocks waiting on it]
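
The relationship between dispatcher headers and wait blocks can be sketched with two C structures that carry the fields named in the figure. The layout, field types and helper names below are assumptions for illustration; they are not copied from ntddk.h.

```c
#include <stddef.h>

/* Doubly linked list link, as used for the object's wait list. */
typedef struct list_entry {
    struct list_entry *flink;
    struct list_entry *blink;
} list_entry;

/* Common header at the start of every dispatcher object (model). */
typedef struct {
    unsigned char type;
    unsigned char size;
    long signal_state;            /* nonzero means signaled */
    list_entry wait_list_head;    /* wait blocks of all waiting threads */
} dispatcher_header;

typedef enum { WAIT_ANY_MODEL, WAIT_ALL_MODEL } wait_type;

/* One wait block per object a thread waits on. */
typedef struct wait_block {
    list_entry wait_list_entry;   /* links into the object's wait list */
    void *thread;                 /* the waiting thread */
    dispatcher_header *object;    /* the object being waited on */
    struct wait_block *next;      /* next block of the same wait call */
    unsigned short key;           /* position in the handle array */
    wait_type type;               /* wait for "any" or "all" */
} wait_block;

static void list_init(list_entry *head) {
    head->flink = head->blink = head;
}

static void list_insert_tail(list_entry *head, list_entry *entry) {
    entry->blink = head->blink;
    entry->flink = head;
    head->blink->flink = entry;
    head->blink = entry;
}

/* Record that `thread` waits on `obj` through wait block `wb`. */
static void attach_wait(dispatcher_header *obj, wait_block *wb, void *thread,
                        unsigned short key, wait_type type) {
    wb->thread = thread;
    wb->object = obj;
    wb->key = key;
    wb->type = type;
    wb->next = NULL;
    list_insert_tail(&obj->wait_list_head, &wb->wait_list_entry);
}
```

When an object becomes signaled, a kernel built along these lines would walk `wait_list_head`, and for each wait block use `type` and `key` to decide whether the owning thread's wait is now satisfied.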

3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots

Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );

Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once, even if the guarded block raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );

Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );

Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
- dwTimeout == 0: no wait, just check whether the object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to an array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects; when waiting for the first, the function returns the index of the first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers are reset to the nonsignaled state after the wait function completes

Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() closes the handle; the mutex object is freed when its last handle is closed

Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );

Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0) {
        counter += local_value;
    } else {
        /* mutex was abandoned */
        break; /* exit the loop */
    }
    ReleaseMutex( mutex );
}
CloseHandle( mutex );

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
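
The blocking and polling variants can be put side by side in a few lines of pthreads code. This is a minimal sketch assuming a default (non-recursive) mutex; `counter_lock`, `counter`, `add_blocking` and `add_if_free` are hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

/* Blocking variant: waits as long as necessary, like an INFINITE wait. */
static void add_blocking(int value) {
    pthread_mutex_lock(&counter_lock);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling variant: gives up immediately if the mutex is busy,
   like a wait with a timeout of 0. Returns true if the update happened. */
static bool add_if_free(int value) {
    if (pthread_mutex_trylock(&counter_lock) != 0)
        return false;               /* busy: caller may retry later */
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return true;
}
```

A thread that must not stall, for example one servicing a user interface, would use the polling form and do other work whenever it returns false.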

Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value
- ReleaseSemaphore() reports the old semaphore count through lpPreviousCount
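
The counting rules can be modeled with a plain structure. The sketch below is a single-threaded model of the bookkeeping only (it is not thread-safe and does not block); `model_sem`, `model_sem_wait` and `model_sem_release` are hypothetical names.

```c
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    long count;   /* the semaphore is "signaled" while count > 0 */
    long max;     /* upper bound fixed at creation */
} model_sem;

/* A satisfied wait takes one unit; returns false where a real wait would block. */
static bool model_sem_wait(model_sem *s) {
    if (s->count <= 0)
        return false;
    s->count--;
    return true;
}

/* Release n units at once; reports the previous count through *previous.
   Fails without changing the count if the maximum would be exceeded. */
static bool model_sem_release(model_sem *s, long n, long *previous) {
    if (n <= 0 || n > s->max - s->count)
        return false;
    if (previous != NULL)
        *previous = s->count;
    s->count += n;
    return true;
}
```

Starting from a count of 2 with a maximum of 3, two waits succeed, a third would block, and a release of 3 then reports a previous count of 0.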

Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- A manual-reset event can signal several threads simultaneously; it must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- An auto-reset event signals a single thread; the event is reset automatically
- fInitialState == TRUE: create the event in the signaled state

Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );

Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
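
Because a condition variable carries no state of its own, the usual way to approximate a manual-reset event is to pair it with a mutex and a boolean flag. The sketch below shows that common pattern; it is one possible emulation, and `manual_event` with its `event_*` functions are hypothetical names.

```c
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    bool signaled;            /* the state a bare condition variable lacks */
} manual_event;

static void event_init(manual_event *e, bool initial_state) {
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Set: stays signaled until event_reset(), releasing every waiter. */
static void event_set(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

static void event_reset(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Wait: returns at once if already signaled, otherwise blocks. */
static void event_wait(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)      /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

static bool event_is_set(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    bool state = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return state;
}

static void event_destroy(manual_event *e) {
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what lets a thread that arrives after `event_set()` pass straight through `event_wait()`, which a lone `pthread_cond_broadcast()` cannot provide.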

Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on a pipe handle will block if the pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way

I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe; handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);

Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;

Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server, each over its own instance of the same named pipe
- Server can respond to a client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions

Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (the "." stands for the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Use the same flag settings for all instances of a named pipe

Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );

Named Pipe Client Connections
CreateFile with the named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo

Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
- Combines a WriteFile, ReadFile sequence

Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
- Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
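
For comparison, the basic FIFO calls can be exercised in a few lines of C. To keep the sketch self-contained, a single process opens both ends, which a real client/server pair would not do; `fifo_roundtrip` and the path passed to it are hypothetical.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a FIFO at `path`, send `msg` through it and read it back.
   Returns the number of bytes read, or -1 on failure. */
static ssize_t fifo_roundtrip(const char *path, const char *msg,
                              char *out, size_t outlen) {
    unlink(path);                        /* remove a stale FIFO, if any */
    if (mkfifo(path, 0600) != 0)
        return -1;
    /* Open the read end non-blocking so that open() does not wait for a writer. */
    int rd = open(path, O_RDONLY | O_NONBLOCK);
    /* A reader now exists, so opening the write end returns immediately. */
    int wr = (rd >= 0) ? open(path, O_WRONLY) : -1;
    ssize_t n = -1;
    if (wr >= 0 && write(wr, msg, strlen(msg) + 1) > 0)
        n = read(rd, out, outlen);       /* byte stream: no message framing */
    if (wr >= 0)
        close(wr);
    if (rd >= 0)
        close(rd);
    unlink(path);
    return n;
}
```

The read returns whatever bytes are available, with no message boundaries, which is why fixed-size records are the easiest protocol over a FIFO.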

Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;

Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}
/* End of server operation */

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain

Locate a server via mailslot
[Figure: application clients 0 to n act as mailslot servers; the application server acts as the mailslot client and sends a message periodically]
Mailslot servers (app client 0 ... app client n)
hMS = CreateMailslot( "\\.\mailslot\status" ... ); ReadFile(hMS, &ServStat ... ); /* connect to server */
Mailslot client (app server)
while (...) { Sleep(...); hMS = CreateFile( "\\*\mailslot\status" ... ); ... WriteFile(hMS, &StatInfo ...

Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for this many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait

Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieves a handle for a local mailslot
- \\host\mailslot\[path]name: retrieves a handle for a mailslot on the specified host
- \\domain\mailslot\[path]name: returns a handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns a handle representing mailslots on machines in the system's primary domain; max message length: 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
5151
Queued Spinlocks
Problem Checking status of spinlock via test-and-set operation creates bus contention
Queued spinlocks maintain queue of waiting processors
First processor acquires lock other processors wait on processor-local flag
ndash Thus busy-wait loop requires no access to the memory bus
When releasing lock the 1st processorrsquos flag is modified
ndash Exactly one processor is being signaled
ndash Pre-determined wait order
5252
SMP Scalability Improvements
Windows 2000 queued spinlocks
ndash qlocks in Kernel Debugger
Server 2003
ndash More spinlocks eliminated (context swap system space commit)
ndash Further reduction of use of spinlocks amp length they are held
ndash Scheduling database now per-CPUAllows thread state transitions in parallel
5353
SMP Scalability Improvements
XP2003
ndash Minimized lock contention for hot locks PFN or Page Frame Database lock
ndash Some locks completely eliminatedCharging nonpagedpaged pool quotas allocating and mapping system page table entries charging commitment of pages allocatingmapping physical memory through AWE functions
ndash New more efficient locking mechanism (pushlocks)Doesnrsquot use spinlocks when no contentionUsed for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
ndash Wait for one or multiple objects in one call
ndash Wait for multiple can wait for ldquoanyrdquo one or ldquoallrdquo at onceldquoAllrdquo all objects must be in the signalled state concurrently to resolve the wait
ndash All wait calls include optional timeout argument
ndash Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
ndash Events (may be auto-reset or manual reset may be set or ldquopulsedrdquo)
ndash Mutexes (ldquomutual exclusionrdquo one-at-a-time)
ndash Semaphores (n-at-a-time)
ndash Timers
ndash Processes and Threads (signalled upon exit or terminate)
ndash Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
ndash If multiple threads are waiting for an object and only one thread is released (eg itrsquos a mutex or auto-reset event) which thread gets released is unpredictable
5757
Executive Synchronization
Thread waitson an object
handle
Create and initialize thread object
Initialized
Ready
Transition
Waiting
Running
Terminated
Standby
Wait is completeSet object to
signaled state
Interaction with thread scheduling
Waiting on Dispatcher Objects ndash outside the kernel
5858
Interaction bet Synchronization amp Dispatching User mode thread waits on an event objectlsquos handle
Kernel changes threadlsquos scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads variable priority threads get priority boost
5959
Interaction bet Synchronization amp Dispatching Dispatcher re-schedules new thread ndash it may preempt running thread it it has lower priority and issues software interrupt to initiate context switch
If no processor can be preempted the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
Dispatcher object
System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
Owning thread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (kernel mode)
nonsignaled signaled
Owning thread or otherthread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (exported to user mode)
nonsignaled signaled
One thread releases thesemaphore freeing a resource
A thread acquires the semaphoreMore resources are not available
Kernel resumes one or more waiting threads
Semaphore
6161
A thread reinitializesthe thread object
What signals an object (contd)
Dispatcher object System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
A thread sets the event
Kernel resumes one or more threads
Kernel resumes one or more waiting threads
Event
nonsignaled signaled
Dedicated thread sets oneevent in the event pair
Kernel resumes theother dedicated thread
Kernel resumes waitingdedicated thread
Event pair
nonsignaled signaled
Timer expires
A thread (re) initializes the timer
Kernel resumes all waiting threads
Timer
nonsignaled signaled
Thread terminates
Kernel resumes all waiting threads
Thread
6262
Further Reading
Mark E Russinovich and David A Solomon Microsoft Windows Internals 4th Edition Microsoft Press 2004
Chapter 3 - System Mechanisms
ndash Trap Dispatching (pp 85 ff)
ndash Synchronization (pp 149 ff)
ndash Kernel Event Tracing (pp 175 ff)
6363
33 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues amp Dispatcher Objects
6464
Used to defer processing from higher (device) interrupt level to a lower (dispatch) levelndash Also used for quantum end and timer expiration
Driver (usually ISR) queues requestndash One queue per CPU DPCs are normally queued to the current processor but can be targeted to other CPUs
ndash Executes specified procedure at dispatch IRQL (or ldquodispatch levelrdquo also ldquoDPC levelrdquo) when all higher-IRQL work (interrupts) completed
ndash Maximum times recommended ISR 10 usec DPC 25 usec
See httpwwwmicrosoftcomwhdcdriverperformmmdrvmspx
Deferred Procedure Calls (DPCs)
6565
queue head DPC object DPC object DPC object
Deferred Procedure Calls (DPCs)
6666
DPC
Delivering a DPC
DPC routines can call kernel functionsbut canlsquot call system services generatepage faults or create or wait on objects
DPC routines canlsquotassume whatprocess addressspace is currentlymapped
Interruptdispatch table
high
Power failure
DispatchDPC
APC
Low
DPC
1 Timer expires kernelqueues DPC that willrelease all waiting threadsKernel requests SW int
DPCDPC
DPC queue
2 DPC interrupt occurswhen IRQL drops belowdispatchDPC level
dispatcher
3 After DPC interruptcontrol transfers tothread dispatcher
4 Dispatcher executes each DPCroutine in DPC queue
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user threadndash APC routines can acquire resources (objects) incur page faultscall system services
APC queue is thread-specific
User mode amp kernel mode APCsndash Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread spacendash Wait for asynchronous IO operation
ndash Emulate delivery of POSIX signals
ndash Make threads suspendterminate itself (env subsystems)
APCs are delivered when thread is in alertable wait statendash WaitForMultipleObjectsEx() SleepEx()
6969
Special kernel APCs
ndash Run in kernel mode at IRQL 1
ndash Always deliverable unless thread is already at IRQL 1 or above
ndash Used for IO completion reporting from ldquoarbitrary thread contextrdquo
ndash Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
7070
ldquoOrdinaryrdquo kernel APCs
ndash Always deliverable if at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
ndash Used for IO completion callback routines (see ReadFileEx WriteFileEx) also QueueUserApc
ndash Only deliverable when thread is in ldquoalertable waitrdquo
Asynchronous Procedure Calls (APCs)
7171
ThreadObject
K
U
APC objects
Asynchronous Procedure Calls (APCs)
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
ndash If IRQLlt2 charge to threadrsquos user or kernel time
ndash If IRQL=2 and processing a DPC charge to DPC time
ndash If IRQL=2 amp not processing a DPC charge to thread kernel time
ndash If IRQLgt2 charge to interrupt time
7373
IRQLs and CPU Time Accounting
Since time servicing interrupts are NOT charged to interrupted thread if system is busy but no process appears to be running must be due to interrupt-related activity
ndash Note time at IRQL 2 or more is charged to the current threadrsquos quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any threadprocess
ndash Process Explorer shows these as separate processesnot really processes
ndash Context switches for these are really of interrupts amp DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted for
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add the Context Switch Delta column
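A small simulation makes the sampling quirk concrete. It is a sketch under simplifying assumptions: time is measured in microseconds, the clock fires exactly at multiples of the period, and run_interval and ticks_charged are made-up names.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical type: the thread was running during [start_us, end_us). */
typedef struct { long start_us, end_us; } run_interval;

/* Counts the clock ticks (at period_us, 2*period_us, ... up to total_us)
   that fire while the thread is running; only those ticks are charged. */
long ticks_charged(const run_interval *runs, size_t n,
                   long period_us, long total_us)
{
    assert(period_us > 0);
    long charged = 0;
    for (long t = period_us; t <= total_us; t += period_us) {
        for (size_t i = 0; i < n; i++) {
            if (t >= runs[i].start_us && t < runs[i].end_us) {
                charged++;
                break;
            }
        }
    }
    return charged;
}
```

A thread that runs for 8 msec of every 10 msec period but always enters a wait state before the tick is charged nothing in this model, although it used most of the CPU.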
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout: a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
- some exist exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- the wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"; Key denotes the argument list position for WaitForMultipleObjects
[Figure: each thread object points to its WaitBlockList; every wait block holds Key, Type, Next link, a list entry, and Object and Thread pointers; each dispatcher object (header with Size, Type, State, Wait list head, followed by object-type-specific data) links the wait blocks of the threads waiting on it]
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to query whether another thread is in a critical section; TryEnterCriticalSection() can only attempt entry without blocking
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );      /* enter before the guarded block */
    __try {
        counter += local_value;
    }
    __finally {
        LeaveCriticalSection( &crit );  /* leave exactly once, even on an exception */
    }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
  - dwTimeout == 0: no wait, check whether the object is signaled
  - dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to an array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects
  - When waiting for any object, the return value minus WAIT_OBJECT_0 is the index of the first signaled object
Side effects
- Mutexes, auto-reset events and synchronization (auto-reset) waitable timers will be reset to the non-signaled state after completing wait functions
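The "any" versus "all" semantics can be illustrated with a model of a zero-timeout poll over an array of object states. This is a sketch, not the Windows implementation: poll_multiple and MODEL_WAIT_TIMEOUT are invented names, and the model returns plain indices instead of WAIT_OBJECT_0-based values (0 stands for "all objects signaled" in the wait-all case).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_WAIT_TIMEOUT (-1)   /* stand-in for WAIT_TIMEOUT in this model */

/* signaled[i] is the current state of the i-th handle in the array.
   Returns the index a zero-timeout wait would report, or MODEL_WAIT_TIMEOUT. */
int poll_multiple(const bool *signaled, size_t count, bool wait_all)
{
    assert(count >= 1 && count <= 64);   /* MAXIMUM_WAIT_OBJECTS */
    size_t first = count, n_signaled = 0;
    for (size_t i = 0; i < count; i++) {
        if (signaled[i]) {
            if (first == count)
                first = i;               /* remember the lowest signaled index */
            n_signaled++;
        }
    }
    if (wait_all)
        return n_signaled == count ? 0 : MODEL_WAIT_TIMEOUT;
    return first < count ? (int)first : MODEL_WAIT_TIMEOUT;
}
```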
8484
Mutexes
Mutexes work across processes
The first thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() closes the handle; the mutex object is freed when its last handle is closed
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0) {
        counter += local_value;
    } else {                          /* mutex was abandoned, or the wait failed */
        if (ret == WAIT_ABANDONED)
            ReleaseMutex( mutex );    /* an abandoned mutex is still owned by the waiter */
        break;                        /* exit the loop */
    }
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
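For comparison with the Windows mutex example, here is a minimal pthreads sketch of the same shared-counter loop. The helper names (worker, run_counter_demo, try_add) and the cap of 16 threads are choices made for this example only.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;                     /* shared by all threads */

typedef struct { long iterations; long local_value; } worker_arg;

static void *worker(void *p)
{
    const worker_arg *arg = p;
    for (long i = 0; i < arg->iterations; i++) {
        pthread_mutex_lock(&counter_lock);   /* blocks, like an INFINITE wait */
        counter += arg->local_value;
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Non-blocking attempt, comparable to a wait with timeout 0.
   Returns 1 if the value was added, 0 if the lock was busy. */
int try_add(long value)
{
    if (pthread_mutex_trylock(&counter_lock) != 0)
        return 0;
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return 1;
}

/* Runs nthreads workers and returns the final counter value. */
long run_counter_demo(int nthreads, long iterations, long local_value)
{
    assert(nthreads > 0 && nthreads <= 16);
    pthread_t tid[16];
    worker_arg arg = { iterations, local_value };
    counter = 0;
    for (int i = 0; i < nthreads; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, &arg);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return counter;
}
```

With the lock in place the result is deterministic: four threads adding 2 ten thousand times each always end at 80000.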
8888
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each satisfied wait decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value, as long as the maximum count is not exceeded
- ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
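The counting behaviour can be captured in a few lines. The following is a single-threaded toy model with invented names (model_semaphore and friends); it mirrors only the documented counting rules and does no blocking or locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy type: current count and maximum count. */
typedef struct { long count, max; } model_semaphore;

bool model_is_signaled(const model_semaphore *s) { return s->count > 0; }

/* A satisfied wait consumes one unit; returns false where a real wait would block. */
bool model_try_wait(model_semaphore *s)
{
    if (s->count == 0)
        return false;
    s->count--;
    return true;
}

/* Mirrors ReleaseSemaphore(): fails, leaving the count unchanged,
   if the release would push the count past the maximum. */
bool model_release(model_semaphore *s, long release_count, long *previous)
{
    assert(s != NULL);
    if (release_count <= 0 || release_count > s->max - s->count)
        return false;
    if (previous != NULL)
        *previous = s->count;            /* old count, like lpPreviousCount */
    s->count += release_count;
    return true;
}
```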
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- A manual-reset event can signal several threads simultaneously; it must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- An auto-reset event signals a single thread; the event is reset automatically
- fInitialState == TRUE: create the event in the signaled state
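The difference between manual-reset and auto-reset events is easiest to see in terms of how many waiters a call releases and which state the event is left in. The model below is a sketch with hypothetical names (model_event, model_set_event, model_pulse_event); it tracks only a waiter count and does not schedule real threads.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical toy type. */
typedef struct {
    bool manual_reset;   /* corresponds to fManualReset */
    bool signaled;       /* current state */
    int  waiters;        /* threads currently blocked on the event */
} model_event;

/* Returns how many waiting threads a SetEvent() call releases in this model. */
int model_set_event(model_event *e)
{
    assert(e->waiters >= 0);
    int released;
    if (e->manual_reset) {
        released = e->waiters;            /* all waiters go; event stays signaled */
        e->signaled = true;
    } else {
        released = e->waiters > 0 ? 1 : 0;
        e->signaled = (released == 0);    /* consumed at once if a waiter was released */
    }
    e->waiters -= released;
    return released;
}

/* PulseEvent(): release what SetEvent() would, then leave the event nonsignaled. */
int model_pulse_event(model_event *e)
{
    int released = model_set_event(e);
    e->signaled = false;
    return released;
}
```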
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthreads' condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
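Although pthreads has no direct counterpart, a manual-reset event is commonly approximated with a mutex, a condition variable and a boolean flag. The sketch below shows one such emulation; mr_event and its functions are names invented for this example, and error checking is omitted for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} mr_event;   /* hypothetical emulation of a manual-reset event */

void mr_event_init(mr_event *e)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = false;
}

void mr_event_set(mr_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;                  /* stays set until mr_event_reset() */
    pthread_cond_broadcast(&e->cond);    /* wake every waiter */
    pthread_mutex_unlock(&e->lock);
}

void mr_event_reset(mr_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

void mr_event_wait(mr_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                 /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

void mr_event_destroy(mr_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}

static mr_event gate;
static int passed;                       /* protected by gate.lock */

static void *waiter(void *unused)
{
    (void)unused;
    mr_event_wait(&gate);
    pthread_mutex_lock(&gate.lock);
    passed++;
    pthread_mutex_unlock(&gate.lock);
    return NULL;
}

/* Starts n waiters, sets the event once, and returns how many got through. */
int release_waiters_demo(int n)
{
    assert(n > 0 && n <= 8);
    pthread_t tid[8];
    mr_event_init(&gate);
    passed = 0;
    for (int i = 0; i < n; i++)
        pthread_create(&tid[i], NULL, waiter, NULL);
    mr_event_set(&gate);
    for (int i = 0; i < n; i++)
        pthread_join(tid[i], NULL);
    int result = passed;
    mr_event_destroy(&gate);
    return result;
}
```

Because the flag stays set, threads that arrive after the broadcast pass straight through, which is the property a bare condition variable lacks.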
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, with prog1's output connected to prog2's input through the pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
A read on a pipe handle will block if the pipe is empty
A write operation to a full pipe will block
Anonymous pipes are one-way
9494
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server, each over its own instance of the same named pipe
- Server can respond to a client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. denotes the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with the named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Equivalent to a WriteFile(), ReadFile() sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Equivalent to a CreateFile(), WriteFile(), ReadFile(), CloseHandle() sequence
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe() to free the handle for a connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic (for writes of up to PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
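To make the UNIX side of the comparison concrete, here is a minimal sketch that passes one message through a FIFO from a child process to its parent. The function name fifo_roundtrip and the idea of deleting the FIFO afterwards are choices made for this example; a real server would keep a well-known FIFO around and handle errors more carefully.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sends msg through a FIFO at path from a child process to the parent.
   Returns the number of bytes the parent read, or -1 on error. */
ssize_t fifo_roundtrip(const char *path, const char *msg, char *out, size_t outlen)
{
    assert(outlen > 0);
    unlink(path);                            /* drop a stale FIFO from an earlier run */
    if (mkfifo(path, 0600) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {                          /* child: the writing end */
        int wfd = open(path, O_WRONLY);      /* blocks until a reader opens */
        if (wfd < 0)
            _exit(1);
        ssize_t w = write(wfd, msg, strlen(msg));
        close(wfd);
        _exit(w < 0 ? 1 : 0);
    }

    int rfd = open(path, O_RDONLY);          /* parent: the reading end */
    ssize_t total = -1;
    if (rfd >= 0) {
        ssize_t n;
        total = 0;
        while ((size_t)total < outlen - 1 &&
               (n = read(rfd, out + total, outlen - 1 - (size_t)total)) > 0)
            total += n;                      /* byte stream: read until EOF */
        out[total] = '\0';
        close(rfd);
    } else {
        kill(pid, SIGKILL);                  /* the writer would block forever otherwise */
    }
    waitpid(pid, NULL, 0);
    unlink(path);
    return total;
}
```

Note that the reader sees a byte stream and has to loop until end-of-file; nothing preserves the message boundary the way a message-mode named pipe does.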
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (Windows 2000: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
107107
Locate a server via mailslot
[Figure: the application server acts as mailslot client and periodically sends a status message; application clients 0 to n act as mailslot servers and read it]
Mailslot servers (App client 0 ... App client n)
- hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile( hMS, &ServStat ); then connect to the server
Mailslot client (App server)
- while (...) { Sleep(...); hMS = CreateFile( "\\*\mailslot\status" ); ... WriteFile( hMS, &StatInfo ) }
- Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieve handle for local mailslot
- \\host\mailslot\[path]name: retrieve handle for mailslot on the specified host
- \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length is 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
5252
SMP Scalability Improvements
Windows 2000: queued spinlocks
- !qlocks in Kernel Debugger
Server 2003
- More spinlocks eliminated (context swap, system space, commit)
- Further reduction of the use of spinlocks and the length of time they are held
- Scheduling database now per-CPU
  - Allows thread state transitions in parallel
5353
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN or Page Frame Database lock)
- Some locks completely eliminated
  - Charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks)
  - Doesn't use spinlocks when there is no contention
  - Used for object manager and address windowing extensions (AWE) related locks
5454
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once
  - "All": all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include an optional timeout argument
- Waiting threads consume no CPU time
5555
Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and threads (signaled upon exit or terminate)
- Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
5757
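Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
Interaction with thread scheduling
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition and Terminated: a thread object is created and initialized; a running thread waits on an object handle and enters the Waiting state; when the object is set to the signaled state, the wait is complete and the thread becomes ready again]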
5858
Interaction between Synchronization and Dispatching
- A user-mode thread waits on an event object's handle
- The kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
- Another thread sets the event
- The kernel wakes up the waiting threads; variable-priority threads get a priority boost
5959
Interaction between Synchronization and Dispatching
- The dispatcher reschedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
- If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
For each dispatcher object: the system events that change its state, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
- nonsignaled -> signaled: owning thread releases the mutex
- signaled -> nonsignaled: resumed thread acquires the mutex
- Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
- nonsignaled -> signaled: owning thread or other thread releases the mutex
- signaled -> nonsignaled: resumed thread acquires the mutex
- Effect: kernel resumes one waiting thread
Semaphore
- nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
- signaled -> nonsignaled: a thread acquires the semaphore and no more resources are available
- Effect: kernel resumes one or more waiting threads
6161
What signals an object (contd)
For each dispatcher object: the system events that change its state, and the effect of the signaled state on waiting threads
Event
- nonsignaled -> signaled: a thread sets the event
- signaled -> nonsignaled: kernel resumes one or more threads
- Effect: kernel resumes one or more waiting threads
Event pair
- nonsignaled -> signaled: dedicated thread sets one event in the event pair
- signaled -> nonsignaled: kernel resumes the other dedicated thread
- Effect: kernel resumes the waiting dedicated thread
Timer
- nonsignaled -> signaled: timer expires
- signaled -> nonsignaled: a thread (re)initializes the timer
- Effect: kernel resumes all waiting threads
Thread
- nonsignaled -> signaled: thread terminates
- signaled -> nonsignaled: a thread reinitializes the thread object
- Effect: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues amp Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues the request
- One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
- Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: DPC queue: a queue head linked to a chain of DPC objects]
6666
Delivering a DPC
1. Timer expires; the kernel queues a DPC that will release all waiting threads and requests a software interrupt
2. The DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. The dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with levels from High and Power failure down through Dispatch/DPC and APC to Low, next to the DPC queue and the dispatcher]
6767
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User-mode and kernel-mode APCs
- Permission required for user-mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
6969
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless the thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented
7070
ldquoOrdinaryrdquo kernel APCs
ndash Always deliverable if at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
ndash Used for IO completion callback routines (see ReadFileEx WriteFileEx) also QueueUserApc
ndash Only deliverable when thread is in ldquoalertable waitrdquo
Asynchronous Procedure Calls (APCs)
7171
ThreadObject
K
U
APC objects
Asynchronous Procedure Calls (APCs)
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
ndash If IRQLlt2 charge to threadrsquos user or kernel time
ndash If IRQL=2 and processing a DPC charge to DPC time
ndash If IRQL=2 amp not processing a DPC charge to thread kernel time
ndash If IRQLgt2 charge to interrupt time
7373
IRQLs and CPU Time Accounting
Since time servicing interrupts are NOT charged to interrupted thread if system is busy but no process appears to be running must be due to interrupt-related activity
ndash Note time at IRQL 2 or more is charged to the current threadrsquos quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any threadprocess
ndash Process Explorer shows these as separate processesnot really processes
ndash Context switches for these are really of interrupts amp DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
ndash Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals NOT accounted
ndash Eg one or more threads run and enter a wait state before clock fires
ndash Thus threads may run but never get charged
View context switch activity with Process Explorer
ndash Add Context Switch Delta column
7676
For waiting threads user-mode utilities only display the wait reason
Example pstat
Looking at Waiting Threads
7777
Wait Internals 1 Dispatcher Objects
Size TypeState
Wait listhead
Object-type-specific data
DispatcherObject
(see ntddkincddkntddkh)
Any kernel object you can wait for is a ldquodispatcher objectrdquo
ndash some exclusively for synchronizationeg events mutexes (ldquomutantsrdquo) semaphores queues timers
ndash others can be waited for as a side effect of their prime function eg processes threads file objects
ndash non-waitable kernel objects are called ldquocontrol objectsrdquo
All dispatcher objects have a common header
All dispatcher objects are in one of two states
ndash ldquosignaledrdquo vs ldquononsignaledrdquo
ndash when signalled a wait on the object is satisfied
ndash different object types differ in terms of what changes their state
ndash wait and unwait implementation iscommon to all types of dispatcher objects
7878
Object-type-specific data
Wait Internals 2Wait Blocks
Size TypeState
Wait listhead
Size TypeState
Wait listhead
Represent a threadrsquos reference to something itrsquos waiting for (one per handle passed to WaitForhellip)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for ldquoanyrdquo or ldquoallrdquo Key denotes argument list position for
WaitForMultipleObjects
Object-type-specific data
DispatcherObjects
Thread Objects
WaitBlockListWaitBlockList
Wait blocks
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
7979
34 Windows APIs for Synchronization and IPC Windows API constructs for synchronization and interprocess communication
Synchronization
ndash Critical sections
ndash Mutexes
ndash Semaphores
ndash Event objects
Synchronization through interprocess communication
ndash Anonymous pipes
ndash Named pipes
ndash Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec ) VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec )
8181
Critical Section Example
counter is global shared by all threads
volatile int counter = 0
CRITICAL_SECTION crit
InitializeCriticalSection ( ampcrit )
hellip main loop in any of the threads
while (done)
_try
EnterCriticalSection ( ampcrit )
counter += local_value
LeaveCriticalSection ( ampcrit )
_finally LeaveCriticalSection ( ampcrit )
DeleteCriticalSection( ampcrit )
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
ndash Processes
ndash Threads
ndash Files
ndash Console input
File change notificationsFile change notifications
MutexesMutexes
Events (auto-reset + manual-reset)Events (auto-reset + manual-reset)
Waitable timersWaitable timers
DWORD WaitForSingleObject( HANDLE hObject DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects LPHANDLE lpHandles BOOL bWaitAll DWORD dwTimeout )
8383
Wait Functions - Details
WaitForSingleObject()ndash hObject specifies kernel object
ndash dwTimeout specifies wait time in msecdwTimeout == 0 - no wait check whether object is signaled
dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()ndash cObjects lt= MAXIMUM_WAIT_OBJECTS (64)
ndash lpHandles - pointer to array identifying these objects
ndash bWaitAll - whether to wait for first signaled object or all objectsFunction returns index of first signaled object
Side effectsndash Mutexes auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
HANDLE OpenMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
BOOL ReleaseMutex( HANDLE hMutex )
8686
Mutex Example
counter is global shared by all threads
volatile int done counter = 0
HANDLE mutex = CreateMutex( NULL FALSE NULL )
main loop in any of the threads ret is local
DWORD ret
while (done)
ret = WaitForSingleObject( mutex INFINITE )
if (ret == WAIT_OBJECT_0)
counter += local_value
else mutex was abandoned
break exit the loop
ReleaseMutex( mutex )
CloseHandle( mutex )
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexesndash Synchronization among threads in same process
Five basic functionsndash pthread_mutex_init()
ndash pthread_mutex_destroy()
ndash pthread_mutex_lock()
ndash pthread_mutex_unlock()
ndash pthread_mutex_trylock()
Comparisonndash pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex )
ndash pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
8888
Semaphores
Semaphore objects are used for resource countingndash A semaphore is signaled when count gt 0
Threadsprocesses use wait functionsndash Each wait function decreases semaphore count by 1
ndash ReleaseSemaphore() may increment count by any value
ndash ReleaseSemaphore() returns old semaphore count
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE OpenSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE ReleaseSemaphore( HANDLE hSemaphoreLONG cReleaseCount LPLONG lpPreviousCount )
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)ndash Manual-reset event can signal several thread simultaneously must be reset manually
ndash PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
ndash Auto-reset event signals a single thread event is reset automatically
ndash fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTE lpsaBOOL fManualReset BOOL fInititalStateLPTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )BOOL ResetEvent( HANDLE hEvent )BOOL PulseEvent( HANDLE hEvent )
9292
Comparison - POSIX condition variables
pthreadrsquos condition variables are comparable to eventsndash pthread_cond_init()
ndash pthread_cond_destroy()
Wait functionsndash pthread_cond_wait()
ndash pthread_cond_timedwait()
Signalingndash pthread_cond_signal() - one thread
ndash pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phReadPHANDLE phWriteLPSECURITY_ATTRIBUTES lpsaDWORD cbPipe )
main
prog1 prog2pipe
Half-duplex character-based IPC
cbPipe pipe byte size zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are oneway
94
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable. */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf (stderr, "Anon pipe create failed\n"); exit (1);
}
/* Set output handle to pipe handle, create first process. */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf (stderr, "CreateProc1 failed\n"); exit (2);
}
CloseHandle (hWritePipe);
95
Pipe example (contd)
/* Repeat (symmetrically) for the second process. */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles. */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf (stderr, "CreateProc2 failed\n"); exit (3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete. */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
96
Named Pipes
Message oriented
- The reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server through the same pipe name, each over its own instance
- The server can respond to a client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
97
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa )
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (the "." means the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
98
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage )
99
Named Pipe Client Connections
CreateFile with the named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa )
Combines a WriteFile, ReadFile sequence in one call
101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut )
Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence in one call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo )
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103
Comparison with UNIX
UNIX FIFOs are similar to named pipes
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
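The fixed-size record advice can be illustrated on any byte-oriented descriptor. The sketch below is an assumption-laden example rather than part of the original material: `RECORD_LEN`, `Record`, `send_record` and `recv_record` are hypothetical names, and the descriptors could come from pipe() or from a FIFO opened after mkfifo().

```c
#include <string.h>
#include <unistd.h>

#define RECORD_LEN 64   /* hypothetical fixed record size agreed on by both sides */

typedef struct { char text[RECORD_LEN]; } Record;

/* Writes one whole record; returns 0 on success, -1 otherwise. */
static int send_record(int fd, const char *text) {
    Record r;
    memset(&r, 0, sizeof r);                 /* zero padding keeps text terminated */
    strncpy(r.text, text, RECORD_LEN - 1);
    return write(fd, &r, sizeof r) == (ssize_t)sizeof r ? 0 : -1;
}

/* Reads exactly one record, looping because read() may return fewer bytes. */
static int recv_record(int fd, Record *r) {
    size_t got = 0;
    while (got < sizeof *r) {
        ssize_t n = read(fd, (char *)r + got, sizeof *r - got);
        if (n <= 0) return -1;               /* error or end of stream mid-record */
        got += (size_t)n;
    }
    return 0;
}
```

Because every record has the same length, the reader recovers message boundaries that the byte stream itself does not preserve, which is what PIPE_TYPE_MESSAGE gives a Windows named pipe directly.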
104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf (stderr, "Failure to locate server\n"); exit (3);
}
/* Write the request. */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out. */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf (stderr, "Connect or Read Named Pipe error\n"); exit (4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client. */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen (ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}   /* End of server operation. */
106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (Windows 2000: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- A write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- A client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
107
Locate a server via mailslot
[Diagram: the application server acts as the mailslot client and periodically broadcasts its status; application clients 0 to n act as mailslot servers and read it]
Mailslot servers (App client 0 ... App client n):
hMS = CreateMailslot( "\\.\mailslot\status" ... )
ReadFile( hMS, &ServStat ... )
/* connect to server */
Mailslot client (App server), message is sent periodically:
while (...) { Sleep(...); hMS = CreateFile( "\\*\mailslot\status" ... ); ... WriteFile( hMS, &StatInfo ... ) }
108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa )
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
109
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length: 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
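The four name forms differ only in the scope component, so a client can build them with one helper. This is a small sketch in portable C; `mailslot_name` is a hypothetical function, and the resulting string would be passed to CreateFile() on Windows.

```c
#include <stdio.h>

/* Builds \\scope\mailslot\name into out.
   scope is "." (local), "*" (primary domain), or a host or domain name.
   Returns 0 on success, -1 if the buffer is too small. */
static int mailslot_name(char *out, size_t out_len,
                         const char *scope, const char *name) {
    int n = snprintf(out, out_len, "\\\\%s\\mailslot\\%s", scope, name);
    return (n > 0 && (size_t)n < out_len) ? 0 : -1;
}
```

For example, scope "." with name "status" gives the local form, and scope "*" gives the broadcast form for the primary domain.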
Thoughts Change Life
53
SMP Scalability Improvements
XP/2003
- Minimized lock contention for hot locks (PFN, or Page Frame Database, lock)
- Some locks completely eliminated: charging nonpaged/paged pool quotas, allocating and mapping system page table entries, charging commitment of pages, allocating/mapping physical memory through AWE functions
- New, more efficient locking mechanism (pushlocks): doesn't use spinlocks when there is no contention; used for object manager and address windowing extensions (AWE) related locks
54
Waiting
Flexible wait calls
- Wait for one or multiple objects in one call
- Wait for multiple can wait for "any" one or "all" at once; for "all", all objects must be in the signaled state concurrently to resolve the wait
- All wait calls include an optional timeout argument
- Waiting threads consume no CPU time
55
Waiting
Waitable objects include
- Events (may be auto-reset or manual-reset; may be set or "pulsed")
- Mutexes ("mutual exclusion", one-at-a-time)
- Semaphores (n-at-a-time)
- Timers
- Processes and threads (signaled upon exit or terminate)
- Directories (change notification)
56
Waiting
No guaranteed ordering of wait resolution
- If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable
57
Executive Synchronization
Waiting on Dispatcher Objects - outside the kernel
[Diagram: interaction with thread scheduling. Thread states: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Transitions labeled "Create and initialize thread object", "Thread waits on an object handle", "Set object to signaled state", "Wait is complete"]
58
Interaction between Synchronization & Dispatching
A user-mode thread waits on an event object's handle
The kernel changes the thread's scheduling state from ready to waiting and adds the thread to the event's wait list
Another thread sets the event
The kernel wakes up the waiting threads; variable-priority threads get a priority boost
59
Interaction between Synchronization & Dispatching
The dispatcher reschedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
60
What signals an object
Dispatcher object | Event that sets it to signaled | Event that returns it to nonsignaled | Effect of signaled state on waiting threads
Mutex (kernel mode) | Owning thread releases mutex | Resumed thread acquires mutex | Kernel resumes one waiting thread
Mutex (exported to user mode) | Owning thread or other thread releases mutex | Resumed thread acquires mutex | Kernel resumes one waiting thread
Semaphore | One thread releases the semaphore, freeing a resource | A thread acquires the semaphore; more resources are not available | Kernel resumes one or more waiting threads
61
What signals an object (contd)
Dispatcher object | Event that sets it to signaled | Event that returns it to nonsignaled | Effect of signaled state on waiting threads
Event | A thread sets the event | Kernel resumes one or more threads | Kernel resumes one or more waiting threads
Event pair | Dedicated thread sets one event in the event pair | Kernel resumes the other dedicated thread | Kernel resumes waiting dedicated thread
Timer | Timer expires | A thread (re)initializes the timer | Kernel resumes all waiting threads
Thread | Thread terminates | A thread reinitializes the thread object | Kernel resumes all waiting threads
62
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
63
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
64
Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
- Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
65
Deferred Procedure Calls (DPCs)
[Diagram: queue head -> DPC object -> DPC object -> DPC object]
66
Delivering a DPC
[Diagram: interrupt dispatch table with IRQLs from High and Power failure down through Dispatch/DPC and APC to Low; a DPC queue holding DPC objects; the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
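The deferral in the four delivery steps can be mimicked with a toy per-CPU queue. This is a simulation under stated assumptions, not kernel code: `CpuModel`, `dpc_queue`, `cpu_lower_irql` and `count_call` are hypothetical names, the queue is a small fixed ring, and dispatch level is taken as IRQL 2 as on x86.

```c
#include <stddef.h>

#define DISPATCH_LEVEL 2   /* DPC/dispatch IRQL on x86 */
#define MAX_DPCS 8         /* hypothetical capacity of the toy queue */

typedef void (*DpcRoutine)(void *context);

typedef struct {
    DpcRoutine routine[MAX_DPCS];
    void      *context[MAX_DPCS];
    size_t     head, count;   /* FIFO ring buffer */
    int        irql;          /* current IRQL of the simulated CPU */
} CpuModel;

/* Step 1: code running at high IRQL queues a DPC instead of doing the work. */
static int dpc_queue(CpuModel *cpu, DpcRoutine fn, void *ctx) {
    size_t tail;
    if (cpu->count == MAX_DPCS) return -1;
    tail = (cpu->head + cpu->count) % MAX_DPCS;
    cpu->routine[tail] = fn;
    cpu->context[tail] = ctx;
    cpu->count++;
    return 0;
}

/* Steps 2 to 4: once IRQL drops below dispatch level, the queue is drained
   at dispatch level. Returns the number of DPC routines that ran. */
static size_t cpu_lower_irql(CpuModel *cpu, int new_irql) {
    size_t ran = 0;
    cpu->irql = new_irql;
    if (new_irql >= DISPATCH_LEVEL) return 0;   /* still too high: stay deferred */
    cpu->irql = DISPATCH_LEVEL;
    while (cpu->count > 0) {
        size_t i = cpu->head;
        cpu->head = (cpu->head + 1) % MAX_DPCS;
        cpu->count--;
        cpu->routine[i](cpu->context[i]);
        ran++;
    }
    cpu->irql = new_irql;
    return ran;
}

/* Sample DPC routine: counts how often it was called. */
static void count_call(void *context) { ++*(int *)context; }
```

Queuing two routines at a device IRQL and then lowering the IRQL to 5 runs nothing; lowering it to 0 runs both in FIFO order.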
67
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
68
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
69
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless the thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented
70
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when the thread is in an "alertable wait"
71
Asynchronous Procedure Calls (APCs)
[Diagram: a thread object with two APC queues, K (kernel mode) and U (user mode), each linking APC objects]
72
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
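The four accounting rules amount to a small decision function. The sketch below restates them in C; the enum and the name `charge_tick` are illustrative assumptions, not kernel identifiers.

```c
#include <stdbool.h>

/* Where one clock tick is charged (hypothetical names). */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} ChargeBucket;

/* irql: IRQL that was interrupted by the clock ISR.
   in_dpc: whether a DPC routine was executing.
   user_mode: whether the thread was running user-mode code. */
static ChargeBucket charge_tick(int irql, bool in_dpc, bool user_mode) {
    if (irql > 2)  return CHARGE_INTERRUPT;
    if (irql == 2) return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Only the first and last thread cases ever show up as per-thread CPU time; DPC and interrupt time land in system-wide counters.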
73
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
74
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
75
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
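Why a thread can run yet never be charged follows from the sampling nature of tick-based accounting. A minimal sketch, assuming ticks fire at exact multiples of the clock interval and that a tick is charged to whichever thread is running at that instant; `RunInterval` and `ticks_charged` are hypothetical names.

```c
/* One stretch of execution of a thread, in microseconds since time zero. */
typedef struct { long start_us; long end_us; } RunInterval;

/* Counts clock ticks (at tick_us, 2*tick_us, ...) that fall inside any
   half-open interval [start_us, end_us) of the thread. */
static long ticks_charged(const RunInterval *runs, int n, long tick_us) {
    long charged = 0;
    for (int i = 0; i < n; i++) {
        /* index of the first tick at or after the start of the interval */
        long first = (runs[i].start_us + tick_us - 1) / tick_us;
        /* index of the last tick strictly before the end of the interval */
        long last = (runs[i].end_us - 1) / tick_us;
        if (first < 1) first = 1;
        if (last >= first) charged += last - first + 1;
    }
    return charged;
}
```

With a 10 msec tick, a thread that runs from 1 to 9 msec and again from 11 to 19 msec uses 16 msec of CPU but is charged zero ticks, while a thread running from 5 to 25 msec is charged two.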
76
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
77
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- Some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "Signaled" vs. "nonsignaled"
- When signaled, a wait on the object is satisfied
- Different object types differ in terms of what changes their state
- Wait and unwait implementation is common to all types of dispatcher objects
[Diagram: dispatcher object layout (see \ntddk\inc\ddk\ntddk.h): header with Size, Type, State and Wait list head, followed by object-type-specific data]
78
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Diagram: dispatcher objects (header with Size, Type, State and Wait list head, plus object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type and Next link]
79
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
80
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec )
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec )
81
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection ( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection ( &crit );
    __try {
        counter += local_value;
    }
    __finally { LeaveCriticalSection ( &crit ); }   /* leave exactly once per enter */
}
DeleteCriticalSection( &crit );
82
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout )
83
Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
- dwTimeout == 0: no wait, check whether the object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects
- When waiting for any object, the return value minus WAIT_OBJECT_0 is the index of the first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to the non-signaled state after completing wait functions
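How the return value encodes the outcome of a multi-object wait can be modeled without any kernel objects. The sketch below mirrors a zero-timeout poll; the `_MODEL` constants and `wait_poll_model` are hypothetical stand-ins (their values follow WAIT_OBJECT_0 = 0 and WAIT_TIMEOUT = 258), and side effects such as auto-reset are deliberately left out.

```c
#include <stdbool.h>

#define WAIT_OBJECT_0_MODEL 0L     /* stands in for WAIT_OBJECT_0 */
#define WAIT_TIMEOUT_MODEL  258L   /* stands in for WAIT_TIMEOUT */
#define WAIT_FAILED_MODEL   (-1L)  /* stands in for a failed call */

/* signaled[i] is the state of the i-th object in the handle array. */
static long wait_poll_model(const bool *signaled, int count, bool wait_all) {
    int first = -1, n_signaled = 0;
    if (count <= 0) return WAIT_FAILED_MODEL;
    for (int i = 0; i < count; i++) {
        if (signaled[i]) {
            if (first < 0) first = i;   /* lowest index wins for "any" */
            n_signaled++;
        }
    }
    if (wait_all)
        return n_signaled == count ? WAIT_OBJECT_0_MODEL : WAIT_TIMEOUT_MODEL;
    return first >= 0 ? WAIT_OBJECT_0_MODEL + first : WAIT_TIMEOUT_MODEL;
}
```

A caller waiting for "any" subtracts the base value from the result to find which array slot was satisfied; a caller waiting for "all" only needs to distinguish success from timeout.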
84
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free the mutex object
85
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName )
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInherit, LPCTSTR lpszMutexName )
BOOL ReleaseMutex( HANDLE hMutex )
86
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else        /* mutex was abandoned */
        break;  /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
87
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
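The blocking and polling styles look like this with the POSIX calls. A minimal sketch, assuming a default (non-recursive) mutex; `counter_lock`, `counter`, `add_blocking` and `add_if_free` are hypothetical names chosen to echo the counter example used for Windows mutexes.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;   /* shared by all threads */

/* Blocking update: waits as long as needed, like an INFINITE wait. */
static void add_blocking(int value) {
    pthread_mutex_lock(&counter_lock);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling update: like a wait with timeout 0.
   Returns 0 if the update was applied, EBUSY if the mutex was held. */
static int add_if_free(int value) {
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc != 0) return rc;
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return 0;
}
```

Unlike a Windows mutex, a default pthread mutex is not recursive, so a second trylock on a mutex that is already held reports EBUSY even to the holder.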
88
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value, up to the semaphore's maximum
- ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
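The counting rules can be captured in a few lines of single-threaded C. This is a model of the bookkeeping only, not the Win32 object: `SemModel` and its functions are hypothetical, and a failed wait simply returns false where a real thread would block.

```c
#include <stdbool.h>

/* Hypothetical model of a counting semaphore. */
typedef struct {
    long count;   /* current count; "signaled" while count > 0 */
    long max;     /* upper bound fixed at creation */
} SemModel;

static void sem_model_init(SemModel *s, long initial, long max) {
    s->count = initial;
    s->max = max;
}

/* Models a zero-timeout wait: takes one unit if one is available. */
static bool sem_model_try_wait(SemModel *s) {
    if (s->count <= 0) return false;   /* nonsignaled: a real wait would block */
    s->count--;
    return true;
}

/* Models ReleaseSemaphore: adds n units and reports the previous count.
   Fails, leaving the count unchanged, if the maximum would be exceeded. */
static bool sem_model_release(SemModel *s, long n, long *previous) {
    if (n <= 0 || s->count + n > s->max) return false;
    if (previous) *previous = s->count;
    s->count += n;
    return true;
}
```

Starting from a count of 2 with a maximum of 3, two waits succeed, a third fails, and a release of 3 then reports a previous count of 0.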
Waiting
Flexible wait calls
– Wait for one or multiple objects in one call
– A wait on multiple objects can wait for "any" one or for "all" at once
– "All": all objects must be in the signaled state concurrently to resolve the wait
– All wait calls include an optional timeout argument
– Waiting threads consume no CPU time

Waiting
Waitable objects include
– Events (may be auto-reset or manual-reset; may be set or "pulsed")
– Mutexes ("mutual exclusion", one-at-a-time)
– Semaphores (n-at-a-time)
– Timers
– Processes and threads (signaled upon exit or termination)
– Directories (change notification)

Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g. it's a mutex or auto-reset event), which thread gets released is unpredictable

Executive Synchronization
Waiting on Dispatcher Objects – outside the kernel
[Figure: interaction with thread scheduling. Thread states: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Labeled steps: "Create and initialize thread object", "Thread waits on an object handle", "Set object to signaled state", "Wait is complete".]

Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up the waiting threads; variable-priority threads get a priority boost

Interaction between Synchronization & Dispatching
Dispatcher reschedules the new thread – it may preempt a running thread if that thread has lower priority, and issues a software interrupt to initiate a context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later

What signals an object
For each dispatcher object: the system event that sets it to signaled, the event that returns it to nonsignaled, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
– Signaled when: owning thread releases the mutex
– Nonsignaled when: resumed thread acquires the mutex
– Effect on waiting threads: kernel resumes one waiting thread
Mutex (exported to user mode)
– Signaled when: owning thread or other thread releases the mutex
– Nonsignaled when: resumed thread acquires the mutex
– Effect on waiting threads: kernel resumes one waiting thread
Semaphore
– Signaled when: one thread releases the semaphore, freeing a resource
– Nonsignaled when: a thread acquires the semaphore and no more resources are available
– Effect on waiting threads: kernel resumes one or more waiting threads

What signals an object (contd)
Event
– Signaled when: a thread sets the event
– Nonsignaled when: kernel resumes one or more threads
– Effect on waiting threads: kernel resumes one or more waiting threads
Event pair
– Signaled when: dedicated thread sets one event in the event pair
– Nonsignaled when: kernel resumes the other dedicated thread
– Effect on waiting threads: kernel resumes the waiting dedicated thread
Timer
– Signaled when: timer expires
– Nonsignaled when: a thread (re)initializes the timer
– Effect on waiting threads: kernel resumes all waiting threads
Thread
– Signaled when: thread terminates
– Nonsignaled when: a thread reinitializes the thread object
– Effect on waiting threads: kernel resumes all waiting threads

Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)

3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects

Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually the ISR) queues the request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") once all higher-IRQL work (interrupts) has completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx

Deferred Procedure Calls (DPCs)
[Figure: DPC queue: queue head -> DPC object -> DPC object -> DPC object]

Delivering a DPC
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume which process address space is currently mapped
[Figure: interrupt dispatch table with IRQLs from High and Power failure down through Dispatch/DPC and APC to Low, the DPC queue, and the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
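
The delivery steps above can be sketched as a small user-space model. This is not kernel code: `struct cpu`, `dpc_enqueue`, `cpu_lower_irql` and `count_call` are hypothetical names chosen for illustration, and the only behavior modeled is "one FIFO per CPU, drained at dispatch level when the IRQL drops below it".

```c
#include <assert.h>
#include <stddef.h>

#define DISPATCH_LEVEL 2   /* IRQL at which DPC routines run */

typedef void (*dpc_routine)(void *context);

/* One queued deferred call: the routine plus its argument. */
struct dpc {
    dpc_routine routine;
    void *context;
    struct dpc *next;
};

/* Hypothetical per-CPU state: current IRQL and a FIFO of pending DPCs. */
struct cpu {
    int irql;
    struct dpc *head, *tail;
};

/* Example DPC routine: counts how often it was invoked. */
void count_call(void *context)
{
    ++*(int *)context;
}

/* What an ISR would do: append the DPC to this CPU's queue. */
void dpc_enqueue(struct cpu *c, struct dpc *d)
{
    assert(c != NULL && d != NULL);
    d->next = NULL;
    if (c->tail) c->tail->next = d; else c->head = d;
    c->tail = d;
}

/* Lower the IRQL. If it drops below dispatch level, the queue is drained
   at dispatch level first. Returns the number of DPC routines executed. */
int cpu_lower_irql(struct cpu *c, int new_irql)
{
    int ran = 0;
    assert(c != NULL && new_irql <= c->irql);
    if (new_irql < DISPATCH_LEVEL) {
        c->irql = DISPATCH_LEVEL;
        while (c->head) {
            struct dpc *d = c->head;
            c->head = d->next;
            if (!c->head) c->tail = NULL;
            d->routine(d->context);
            ran++;
        }
    }
    c->irql = new_irql;
    return ran;
}
```

In this model, queued DPCs stay pending while the CPU remains at or above dispatch level and run in queue order as soon as the IRQL is lowered past it.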

Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, and call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
– Permission required for user-mode APCs

Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()

Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless the thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented

Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when the thread is in an "alertable wait"
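
The three deliverability rules can be written down as one predicate. The sketch below only restates the bullets above; `struct thread_state` and `apc_deliverable` are made-up names, and real delivery involves more conditions than this model captures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum apc_kind { APC_SPECIAL_KERNEL, APC_ORDINARY_KERNEL, APC_USER };

/* Hypothetical snapshot of the target thread. */
struct thread_state {
    int  irql;                 /* 0 = passive, 1 = APC level, 2 = dispatch ... */
    bool kernel_apcs_disabled; /* e.g. after KeEnterCriticalRegion */
    bool alertable_wait;       /* e.g. blocked in SleepEx(..., TRUE) */
};

/* Returns true if an APC of the given kind could be delivered now. */
bool apc_deliverable(enum apc_kind kind, const struct thread_state *t)
{
    assert(t != NULL);
    switch (kind) {
    case APC_SPECIAL_KERNEL:   /* blocked only by IRQL 1 or above */
        return t->irql < 1;
    case APC_ORDINARY_KERNEL:  /* needs IRQL 0 and APCs not disabled */
        return t->irql == 0 && !t->kernel_apcs_disabled;
    case APC_USER:             /* needs an alertable wait */
        return t->alertable_wait;
    }
    return false;
}
```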

Asynchronous Procedure Calls (APCs)
[Figure: thread object with two APC queues, kernel (K) and user (U), each holding APC objects]

IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to the thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to the thread's kernel time
– If IRQL > 2, charge to interrupt time
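
The clock ISR rule above amounts to a four-way decision per tick. A minimal sketch follows; the function name, the enum and the `was_user_mode` flag (standing in for "user or kernel time") are assumptions made for illustration.

```c
#include <stdbool.h>

enum time_bucket {
    THREAD_USER_TIME,
    THREAD_KERNEL_TIME,
    DPC_TIME,
    INTERRUPT_TIME
};

/* Decide which counter one clock tick is charged to, given the IRQL the
   clock interrupt preempted. */
enum time_bucket charge_tick(int irql, bool in_dpc, bool was_user_mode)
{
    if (irql > 2)
        return INTERRUPT_TIME;
    if (irql == 2)
        return in_dpc ? DPC_TIME : THREAD_KERNEL_TIME;
    /* IRQL < 2: ordinary thread execution */
    return was_user_mode ? THREAD_USER_TIME : THREAD_KERNEL_TIME;
}
```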

IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, a system that is busy while no process appears to be running must be busy with interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)

Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (they are not really processes)
– Context switches for these are really the number of interrupts & DPCs

Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g. one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add the Context Switch Delta column
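
The sampling quirk is easy to reproduce with a toy simulation. Assume a thread that runs for `run_ms` starting `start_ms` into every `period_ms` window, and a clock that charges whichever thread is running when it fires; all names and numbers here are illustrative, not measurements.

```c
#include <assert.h>

/* Count the clock ticks that land while the periodic thread is running.
   Ticks fire at tick_ms, 2*tick_ms, ... up to total_ms. */
int ticks_charged(int total_ms, int tick_ms, int period_ms,
                  int start_ms, int run_ms)
{
    int charged = 0;
    assert(tick_ms > 0 && period_ms > 0 && start_ms >= 0 && run_ms >= 0);
    for (int t = tick_ms; t <= total_ms; t += tick_ms) {
        int phase = t % period_ms;   /* position inside the current window */
        if (phase >= start_ms && phase < start_ms + run_ms)
            charged++;
    }
    return charged;
}
```

With a 10 msec tick and a 10 msec period, a thread that runs from 2 to 7 msec in each window is never charged, although it uses half the CPU: `ticks_charged(1000, 10, 10, 2, 5)` is 0. Shifting the same workload to start at 0 msec makes every tick hit it, so `ticks_charged(1000, 10, 10, 0, 5)` is 100 and the thread appears to use the whole CPU.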

Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat

Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
[Figure: dispatcher object layout: header with Size, Type, State and Wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects

Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (each with Size, Type, State, Wait list head and object-type-specific data) and thread objects (each with a WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type and Next link]
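
A stripped-down model shows how the Type and Key fields resolve a wait. The structures below are simplified stand-ins, not the real kernel layouts (no list links, no thread pointer), and `wait_satisfied` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum wait_type { WAIT_ANY, WAIT_ALL };

/* Simplified dispatcher header: only the state bit is modeled. */
struct dispatcher_header {
    bool signaled;
};

/* One wait block per object passed to the wait call. */
struct wait_block {
    const struct dispatcher_header *object; /* what is being waited on */
    enum wait_type type;                    /* same for all blocks of a call */
    size_t key;                             /* position in the handle array */
};

/* Returns true when the wait described by blocks[0..n) is satisfied.
   For WAIT_ANY, *key_out receives the key of the first signaled object;
   for WAIT_ALL it is set to 0. */
bool wait_satisfied(const struct wait_block *blocks, size_t n, size_t *key_out)
{
    assert(blocks != NULL && n > 0 && key_out != NULL);
    if (blocks[0].type == WAIT_ANY) {
        for (size_t i = 0; i < n; i++) {
            if (blocks[i].object->signaled) {
                *key_out = blocks[i].key;
                return true;
            }
        }
        return false;
    }
    for (size_t i = 0; i < n; i++) {
        if (!blocks[i].object->signaled)
            return false;
    }
    *key_out = 0;
    return true;
}
```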

3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots

Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
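
Why the Enter and Leave counts must match becomes clearer with a single-threaded model of the bookkeeping. `struct cs_model`, `cs_try_enter` and `cs_leave` are hypothetical names; the real CRITICAL_SECTION is an opaque structure, and unlike this model the real LeaveCriticalSection does not report an unmatched call.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Model of a recursive lock: an owner plus a recursion count. */
struct cs_model {
    int owner;      /* id of the owning thread, 0 when free */
    int recursion;  /* how many times the owner has entered */
};

/* Modeled after TryEnterCriticalSection: succeeds if the section is free
   or already owned by the calling thread. */
bool cs_try_enter(struct cs_model *cs, int tid)
{
    assert(cs != NULL && tid != 0);
    if (cs->owner != 0 && cs->owner != tid)
        return false;
    cs->owner = tid;
    cs->recursion++;
    return true;
}

/* Returns false for an unmatched leave (the caller is not the owner).
   The section becomes free only when the count drops back to zero. */
bool cs_leave(struct cs_model *cs, int tid)
{
    assert(cs != NULL);
    if (cs->owner != tid || cs->recursion == 0)
        return false;
    if (--cs->recursion == 0)
        cs->owner = 0;
    return true;
}
```

After two enters by the same thread, one leave is not enough: other threads stay locked out until the second leave brings the count back to zero.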

Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* the only Leave call, so it runs exactly once per Enter */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );

Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );

Wait Functions - Details
WaitForSingleObject()
– hObject specifies the kernel object
– dwTimeout specifies the wait time in msec
– dwTimeout == 0: no wait, check whether the object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to an array identifying these objects
– bWaitAll: whether to wait for the first signaled object or for all objects
– When waiting for any object, the function returns the index of the first signaled object (as WAIT_OBJECT_0 + index)
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to the nonsignaled state after completing wait functions

Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() releases the handle; the mutex object is freed when its last handle is closed

Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInherit, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );

Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in the same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
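
The polling variant can be illustrated with a short pthreads sketch. `try_add` is a hypothetical helper, not part of any API; it assumes a default (non-recursive) mutex and plays the role of a zero-timeout WaitForSingleObject() followed by ReleaseMutex().

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Try to take the mutex without blocking and, if that works, update the
   shared counter. Returns 1 if the update was applied, 0 if the mutex was
   busy, and -1 on any other error. */
int try_add(pthread_mutex_t *m, int *counter, int delta)
{
    int rc;
    assert(m != NULL && counter != NULL);
    rc = pthread_mutex_trylock(m);
    if (rc == EBUSY)
        return 0;              /* someone else holds the lock: do not wait */
    if (rc != 0)
        return -1;
    *counter += delta;         /* critical section */
    pthread_mutex_unlock(m);
    return 1;
}
```

A caller that gets 0 back can do other work and retry later, which is the polling pattern the comparison refers to.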

Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases the semaphore count by 1
– ReleaseSemaphore() may increment the count by any positive value, as long as the maximum count is not exceeded
– ReleaseSemaphore() reports the old semaphore count through lpPreviousCount
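
The counting rules can be captured in a few lines. This is a single-threaded model with hypothetical names (`struct sem_model`, `sem_try_wait`, `sem_release`); it mirrors the documented behavior that a release exceeding the maximum count fails and leaves the count unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct sem_model {
    long count;  /* current count; the semaphore is signaled while > 0 */
    long max;    /* maximum count fixed at creation */
};

/* Zero-timeout wait: takes one unit if the semaphore is signaled. */
bool sem_try_wait(struct sem_model *s)
{
    assert(s != NULL);
    if (s->count <= 0)
        return false;
    s->count--;
    return true;
}

/* Adds n units. Fails without changing the count if n is not positive or
   the maximum would be exceeded. On success *previous (if not NULL)
   receives the count before the release. */
bool sem_release(struct sem_model *s, long n, long *previous)
{
    assert(s != NULL);
    if (n <= 0 || n > s->max - s->count)
        return false;
    if (previous != NULL)
        *previous = s->count;
    s->count += n;
    return true;
}
```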

Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInherit, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– A manual-reset event can signal several threads simultaneously; it must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– An auto-reset event signals a single thread; the event is reset automatically
– fInitialState == TRUE: create the event in the signaled state
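
The difference between the two event kinds shows up in what a satisfied wait does to the state. A minimal single-threaded model, with made-up names and only zero-timeout waits:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct event_model {
    bool manual_reset;  /* true: stays signaled until reset explicitly */
    bool signaled;
};

void event_set(struct event_model *e)
{
    assert(e != NULL);
    e->signaled = true;
}

void event_reset(struct event_model *e)
{
    assert(e != NULL);
    e->signaled = false;
}

/* Zero-timeout wait: returns true if the wait is satisfied. A satisfied
   wait on an auto-reset event consumes the signal. */
bool event_try_wait(struct event_model *e)
{
    assert(e != NULL);
    if (!e->signaled)
        return false;
    if (!e->manual_reset)
        e->signaled = false;
    return true;
}
```

With a manual-reset event every waiter sees the signal until `event_reset` is called; with an auto-reset event only the first waiter does.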

Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );

Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events

Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates prog1 and prog2, connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe size in bytes; zero == default
Read on a pipe handle will block if the pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way

I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);

Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;

Named Pipes
Message-oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bidirectional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same pipe name, each over its own instance
– Server can respond to a client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions

Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. denotes the local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Use the same flag settings for all instances of a named pipe

Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );

Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo

Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
– Combines a WriteFile, ReadFile sequence

Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
– Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic (for writes of at most PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
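
For comparison, a POSIX-only sketch of the FIFO side: one process writes a fixed-size record into a FIFO and another reads it back. `fifo_roundtrip` and `RECORD_SIZE` are names invented for this example, and the single-request flow is a simplification of the client/server pattern described above.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RECORD_SIZE 32   /* fixed-size record, as suggested for FIFOs */

/* Creates a FIFO at path, sends one fixed-size record from a child process
   and reads it back in the parent. Returns 0 on success, -1 on failure. */
int fifo_roundtrip(const char *path, const char *text, char out[RECORD_SIZE])
{
    char record[RECORD_SIZE] = {0};
    ssize_t got = 0;
    int fd, status = 0;
    pid_t pid;

    assert(path != NULL && text != NULL && out != NULL);
    strncpy(record, text, RECORD_SIZE - 1);      /* always NUL-terminated */
    if (mkfifo(path, 0600) != 0)
        return -1;

    pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {                              /* child: the writer */
        int wfd = open(path, O_WRONLY);          /* blocks until a reader opens */
        ssize_t n;
        if (wfd < 0)
            _exit(1);
        n = write(wfd, record, RECORD_SIZE);
        close(wfd);
        _exit(n == RECORD_SIZE ? 0 : 1);
    }

    fd = open(path, O_RDONLY);                   /* parent: the reader */
    if (fd < 0) {
        kill(pid, SIGKILL);                      /* do not leave the writer blocked */
    } else {
        while (got < RECORD_SIZE) {              /* reads may return short counts */
            ssize_t n = read(fd, out + got, (size_t)(RECORD_SIZE - got));
            if (n <= 0)
                break;
            got += n;
        }
        close(fd);
    }
    waitpid(pid, &status, 0);
    unlink(path);
    if (got != RECORD_SIZE || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return 0;
}
```

The FIFO is one-way here; a reply to the client would need a second FIFO, which is the asymmetry with the bidirectional named pipe noted above.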

Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;

Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates the mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain

Locate a server via mailslot
[Figure: mailslot servers (App client 0 … App client n) each run hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile(hMS, &ServStat); /* connect to server */. The mailslot client (App Server) runs while (…) { Sleep(…); hMS = CreateFile( "\\*\mailslot\status" ); … WriteFile(hMS, &StatInfo … }. The message is sent periodically.]

Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait

Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieves a handle for a local mailslot
– \\host\mailslot\[path]name: retrieves a handle for a mailslot on the specified host
– \\domain\mailslot\[path]name: returns a handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns a handle representing mailslots on machines in the system's primary domain; maximum message length 400 bytes
– Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts

Thoughts Change Life
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
5555
Waiting
Waitable objects include
ndash Events (may be auto-reset or manual reset may be set or ldquopulsedrdquo)
ndash Mutexes (ldquomutual exclusionrdquo one-at-a-time)
ndash Semaphores (n-at-a-time)
ndash Timers
ndash Processes and Threads (signalled upon exit or terminate)
ndash Directories (change notification)
5656
Waiting
No guaranteed ordering of wait resolution
ndash If multiple threads are waiting for an object and only one thread is released (eg itrsquos a mutex or auto-reset event) which thread gets released is unpredictable
5757
Executive Synchronization
Thread waitson an object
handle
Create and initialize thread object
Initialized
Ready
Transition
Waiting
Running
Terminated
Standby
Wait is completeSet object to
signaled state
Interaction with thread scheduling
Waiting on Dispatcher Objects ndash outside the kernel
5858
Interaction bet Synchronization amp Dispatching User mode thread waits on an event objectlsquos handle
Kernel changes threadlsquos scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads variable priority threads get priority boost
5959
Interaction bet Synchronization amp Dispatching Dispatcher re-schedules new thread ndash it may preempt running thread it it has lower priority and issues software interrupt to initiate context switch
If no processor can be preempted the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
Dispatcher object
System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
Owning thread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (kernel mode)
nonsignaled signaled
Owning thread or otherthread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (exported to user mode)
nonsignaled signaled
One thread releases thesemaphore freeing a resource
A thread acquires the semaphoreMore resources are not available
Kernel resumes one or more waiting threads
Semaphore
6161
A thread reinitializesthe thread object
What signals an object (contd)
Dispatcher object System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
A thread sets the event
Kernel resumes one or more threads
Kernel resumes one or more waiting threads
Event
nonsignaled signaled
Dedicated thread sets oneevent in the event pair
Kernel resumes theother dedicated thread
Kernel resumes waitingdedicated thread
Event pair
nonsignaled signaled
Timer expires
A thread (re) initializes the timer
Kernel resumes all waiting threads
Timer
nonsignaled signaled
Thread terminates
Kernel resumes all waiting threads
Thread
6262
Further Reading
Mark E Russinovich and David A Solomon Microsoft Windows Internals 4th Edition Microsoft Press 2004
Chapter 3 - System Mechanisms
ndash Trap Dispatching (pp 85 ff)
ndash Synchronization (pp 149 ff)
ndash Kernel Event Tracing (pp 175 ff)
6363
33 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues amp Dispatcher Objects
6464
Used to defer processing from higher (device) interrupt level to a lower (dispatch) levelndash Also used for quantum end and timer expiration
Driver (usually ISR) queues requestndash One queue per CPU DPCs are normally queued to the current processor but can be targeted to other CPUs
ndash Executes specified procedure at dispatch IRQL (or ldquodispatch levelrdquo also ldquoDPC levelrdquo) when all higher-IRQL work (interrupts) completed
ndash Maximum times recommended ISR 10 usec DPC 25 usec
See httpwwwmicrosoftcomwhdcdriverperformmmdrvmspx
Deferred Procedure Calls (DPCs)
6565
queue head DPC object DPC object DPC object
Deferred Procedure Calls (DPCs)
6666
DPC
Delivering a DPC
DPC routines can call kernel functionsbut canlsquot call system services generatepage faults or create or wait on objects
DPC routines canlsquotassume whatprocess addressspace is currentlymapped
Interruptdispatch table
high
Power failure
DispatchDPC
APC
Low
DPC
1 Timer expires kernelqueues DPC that willrelease all waiting threadsKernel requests SW int
DPCDPC
DPC queue
2 DPC interrupt occurswhen IRQL drops belowdispatchDPC level
dispatcher
3 After DPC interruptcontrol transfers tothread dispatcher
4 Dispatcher executes each DPCroutine in DPC queue
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user threadndash APC routines can acquire resources (objects) incur page faultscall system services
APC queue is thread-specific
User mode amp kernel mode APCsndash Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread spacendash Wait for asynchronous IO operation
ndash Emulate delivery of POSIX signals
ndash Make threads suspendterminate itself (env subsystems)
APCs are delivered when thread is in alertable wait statendash WaitForMultipleObjectsEx() SleepEx()
6969
Special kernel APCs
ndash Run in kernel mode at IRQL 1
ndash Always deliverable unless thread is already at IRQL 1 or above
ndash Used for IO completion reporting from ldquoarbitrary thread contextrdquo
ndash Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
7070
ldquoOrdinaryrdquo kernel APCs
ndash Always deliverable if at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
ndash Used for IO completion callback routines (see ReadFileEx WriteFileEx) also QueueUserApc
ndash Only deliverable when thread is in ldquoalertable waitrdquo
Asynchronous Procedure Calls (APCs)
7171
ThreadObject
K
U
APC objects
Asynchronous Procedure Calls (APCs)
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
ndash If IRQLlt2 charge to threadrsquos user or kernel time
ndash If IRQL=2 and processing a DPC charge to DPC time
ndash If IRQL=2 amp not processing a DPC charge to thread kernel time
ndash If IRQLgt2 charge to interrupt time
7373
IRQLs and CPU Time Accounting
Since time servicing interrupts are NOT charged to interrupted thread if system is busy but no process appears to be running must be due to interrupt-related activity
ndash Note time at IRQL 2 or more is charged to the current threadrsquos quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any threadprocess
ndash Process Explorer shows these as separate processesnot really processes
ndash Context switches for these are really of interrupts amp DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
ndash Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals NOT accounted
ndash Eg one or more threads run and enter a wait state before clock fires
ndash Thus threads may run but never get charged
View context switch activity with Process Explorer
ndash Add Context Switch Delta column
7676
For waiting threads user-mode utilities only display the wait reason
Example pstat
Looking at Waiting Threads
7777
Wait Internals 1 Dispatcher Objects
Size TypeState
Wait listhead
Object-type-specific data
DispatcherObject
(see ntddkincddkntddkh)
Any kernel object you can wait for is a ldquodispatcher objectrdquo
ndash some exclusively for synchronizationeg events mutexes (ldquomutantsrdquo) semaphores queues timers
ndash others can be waited for as a side effect of their prime function eg processes threads file objects
ndash non-waitable kernel objects are called ldquocontrol objectsrdquo
All dispatcher objects have a common header
All dispatcher objects are in one of two states
ndash ldquosignaledrdquo vs ldquononsignaledrdquo
ndash when signalled a wait on the object is satisfied
ndash different object types differ in terms of what changes their state
ndash wait and unwait implementation iscommon to all types of dispatcher objects
7878
Object-type-specific data
Wait Internals 2Wait Blocks
Size TypeState
Wait listhead
Size TypeState
Wait listhead
Represent a threadrsquos reference to something itrsquos waiting for (one per handle passed to WaitForhellip)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for ldquoanyrdquo or ldquoallrdquo Key denotes argument list position for
WaitForMultipleObjects
Object-type-specific data
DispatcherObjects
Thread Objects
WaitBlockListWaitBlockList
Wait blocks
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        LeaveCriticalSection( &crit );   /* leave exactly once per enter */
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
ndash Processes
ndash Threads
ndash Files
ndash Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
   dwTimeout == 0: no wait, check whether object is signaled
   dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to array identifying these objects
– bWaitAll: whether to wait for first signaled object or all objects
   Function returns WAIT_OBJECT_0 plus the index of the first signaled object
Side effects
– Mutexes, auto-reset events, and waitable timers will be reset to non-signaled state after completing wait functions
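The return value of WaitForMultipleObjects() encodes both the outcome and the array index. The sketch below decodes it in portable C; the numeric values match the Windows SDK definitions, but they are redefined under `SK_` names so the code compiles anywhere, and `decode_wait_result` and `enum wait_outcome` are hypothetical helpers, not part of the Windows API.

```c
#include <assert.h>
#include <stdint.h>

/* Same values as the Windows SDK constants, renamed for this sketch. */
#define SK_WAIT_OBJECT_0        0x00000000u
#define SK_WAIT_ABANDONED_0     0x00000080u
#define SK_WAIT_TIMEOUT         0x00000102u
#define SK_WAIT_FAILED          0xFFFFFFFFu
#define SK_MAXIMUM_WAIT_OBJECTS 64u

enum wait_outcome { OUT_SIGNALED, OUT_ABANDONED, OUT_TIMEOUT, OUT_FAILED };

/*
 * Classify a wait result for a call that passed `count` handles.
 * For OUT_SIGNALED and OUT_ABANDONED, *index receives the array position
 * of the object that satisfied the wait.
 */
enum wait_outcome decode_wait_result(uint32_t ret, uint32_t count, uint32_t *index)
{
    if (count == 0 || count > SK_MAXIMUM_WAIT_OBJECTS)
        return OUT_FAILED;

    if (ret < SK_WAIT_OBJECT_0 + count) {
        *index = ret - SK_WAIT_OBJECT_0;
        return OUT_SIGNALED;
    }
    if (ret >= SK_WAIT_ABANDONED_0 && ret < SK_WAIT_ABANDONED_0 + count) {
        *index = ret - SK_WAIT_ABANDONED_0;   /* an abandoned mutex */
        return OUT_ABANDONED;
    }
    if (ret == SK_WAIT_TIMEOUT)
        return OUT_TIMEOUT;
    return OUT_FAILED;                        /* includes SK_WAIT_FAILED */
}
```

Because at most 64 handles can be passed, the signaled range (0 to 63) and the abandoned range (0x80 to 0xBF) never overlap.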
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes (contd.)
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else                    /* mutex was abandoned */
        break;              /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
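For comparison with the Windows mutex example, here is a minimal pthreads sketch of the same shared-counter loop. The names `run_counter_demo`, `worker`, and `MAX_THREADS` are assumptions made for this illustration; only the `pthread_*` calls come from the POSIX specification.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16

static long counter;   /* shared by all threads */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

struct worker_arg { long iterations; long local_value; };

static void *worker(void *p)
{
    const struct worker_arg *arg = p;
    for (long i = 0; i < arg->iterations; i++) {
        pthread_mutex_lock(&counter_lock);    /* blocks, like an INFINITE wait */
        counter += arg->local_value;
        pthread_mutex_unlock(&counter_lock);  /* counterpart of ReleaseMutex() */
    }
    return NULL;
}

/* Runs nthreads workers and returns the final counter, or -1 on error. */
long run_counter_demo(int nthreads, long iterations, long local_value)
{
    pthread_t tids[MAX_THREADS];
    struct worker_arg arg = { iterations, local_value };
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    counter = 0;
    for (; started < nthreads; started++) {
        if (pthread_create(&tids[started], NULL, worker, &arg) != 0)
            break;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);

    return started == nthreads ? counter : -1;
}
```

With the lock in place, four threads adding 1 ten thousand times each always yield 40000; removing the lock and unlock calls makes the result unpredictable.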
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any positive value, up to the maximum count
– ReleaseSemaphore() reports the old semaphore count through lpPreviousCount
Semaphores (contd.)
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
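The counting rules for semaphores can be made concrete with a small single-threaded model. This is only a sketch of the bookkeeping, with no blocking and no kernel object behind it; `struct sem_model` and its functions are hypothetical names chosen for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a semaphore's counter. */
struct sem_model {
    long count;   /* current count */
    long max;     /* maximum count fixed at creation */
};

/* The object is "signaled" while count > 0. */
bool sem_model_signaled(const struct sem_model *s)
{
    return s->count > 0;
}

/* A satisfied wait decreases the count by 1.
 * Returns false where a real wait would block. */
bool sem_model_try_wait(struct sem_model *s)
{
    if (s->count <= 0)
        return false;
    s->count--;
    return true;
}

/* Mirrors the ReleaseSemaphore() rules: add `release` to the count and
 * report the old count; fail and leave the count unchanged if the release
 * count is not positive or the maximum would be exceeded. */
bool sem_model_release(struct sem_model *s, long release, long *previous)
{
    if (release <= 0 || s->count + release > s->max)
        return false;
    if (previous != NULL)
        *previous = s->count;
    s->count += release;
    return true;
}
```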
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE: create event in signaled state
Events (contd.)
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
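Although pthreads has no exact equivalent, a manual-reset event is commonly approximated with a mutex, a condition variable, and a flag that stays set until it is reset. The sketch below shows one way to do this; `struct manual_event`, the `manual_event_*` functions, and `release_all_demo` are hypothetical names, not part of any library.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical manual-reset event: the flag stays set until reset. */
struct manual_event {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
};

void manual_event_init(struct manual_event *ev, bool initial_state)
{
    pthread_mutex_init(&ev->lock, NULL);
    pthread_cond_init(&ev->cond, NULL);
    ev->signaled = initial_state;
}

void manual_event_set(struct manual_event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->signaled = true;
    pthread_cond_broadcast(&ev->cond);   /* release every current waiter */
    pthread_mutex_unlock(&ev->lock);
}

void manual_event_reset(struct manual_event *ev)
{
    pthread_mutex_lock(&ev->lock);
    ev->signaled = false;
    pthread_mutex_unlock(&ev->lock);
}

void manual_event_wait(struct manual_event *ev)
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->signaled)                /* loop guards against spurious wakeups */
        pthread_cond_wait(&ev->cond, &ev->lock);
    pthread_mutex_unlock(&ev->lock);
}

void manual_event_destroy(struct manual_event *ev)
{
    pthread_cond_destroy(&ev->cond);
    pthread_mutex_destroy(&ev->lock);
}

static struct manual_event demo_event;
static int woken;                        /* protected by demo_event.lock */

static void *waiter(void *unused)
{
    (void)unused;
    manual_event_wait(&demo_event);
    pthread_mutex_lock(&demo_event.lock);
    woken++;
    pthread_mutex_unlock(&demo_event.lock);
    return NULL;
}

/* Starts up to 8 waiters, sets the event once, and returns how many woke up. */
int release_all_demo(int nwaiters)
{
    pthread_t tids[8];
    int started = 0;

    if (nwaiters < 0 || nwaiters > 8)
        return -1;
    manual_event_init(&demo_event, false);
    woken = 0;
    for (; started < nwaiters; started++) {
        if (pthread_create(&tids[started], NULL, waiter, NULL) != 0)
            break;
    }
    manual_event_set(&demo_event);       /* one set releases all waiters */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    manual_event_destroy(&demo_event);
    return woken;
}
```

Because the flag remains set, a thread that starts waiting after `manual_event_set()` passes straight through, which is the behavior that distinguishes a manual-reset event from a bare `pthread_cond_broadcast()`.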
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same pipe name, each over its own instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on remote machine (. means local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd.)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
– Combines a WriteFile, ReadFile sequence
Convenience Functions (contd.)
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
– Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic (POSIX guarantees this for writes of at most PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
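The fixed-size record advice for byte-oriented channels can be sketched as follows. To stay self-contained, the example uses an anonymous POSIX pipe() instead of a FIFO created with mkfifo(); both are plain byte streams, so the framing logic is the same. `RECORD_SIZE`, `read_full`, `write_record`, and `record_roundtrip` are hypothetical names for this illustration.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_SIZE 32   /* hypothetical fixed record length */

/* Read exactly len bytes; returns 0 on success, -1 on EOF or error. */
static int read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0)
            return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Write one zero-padded record of exactly RECORD_SIZE bytes. */
static int write_record(int fd, const char *text)
{
    char rec[RECORD_SIZE] = {0};
    strncpy(rec, text, RECORD_SIZE - 1);   /* keeps a terminating NUL */
    return write(fd, rec, RECORD_SIZE) == RECORD_SIZE ? 0 : -1;
}

/* Sends two records of different text length through a pipe and reads
 * them back; the second record is returned in `out`. */
int record_roundtrip(char out[RECORD_SIZE])
{
    int fds[2];
    char first[RECORD_SIZE];
    int rc = -1;

    if (pipe(fds) != 0)
        return -1;
    if (write_record(fds[1], "request 1") == 0 &&
        write_record(fds[1], "a longer request 2") == 0 &&
        read_full(fds[0], first, RECORD_SIZE) == 0 &&
        read_full(fds[0], out, RECORD_SIZE) == 0)
        rc = 0;
    close(fds[0]);
    close(fds[1]);
    return rc;
}
```

Since every record occupies the same number of bytes, the reader finds message boundaries without any delimiter, which is what a message-mode named pipe provides directly on Windows.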
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 byte)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
Locate a server via mailslot
[Figure: application clients 0 … n act as mailslot servers and read the status message; the application server acts as the mailslot client and sends the message periodically]
App client 0 … App client n (mailslot servers)
hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile( hMS, &ServStat ); /* connect to server */
App server (mailslot client)
while (…) { Sleep(…); hMS = CreateFile( "\\*\mailslot\status" ); … WriteFile( hMS, &StatInfo … }
Message is sent periodically
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max. message length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
Waiting
No guaranteed ordering of wait resolution
– If multiple threads are waiting for an object and only one thread is released (e.g., it's a mutex or auto-reset event), which thread gets released is unpredictable
Executive Synchronization
Waiting on Dispatcher Objects: outside the kernel
[Figure: interaction with thread scheduling. Thread states: Initialized, Ready, Standby, Running, Waiting, Transition, Terminated. Creating and initializing a thread object puts it in Initialized; a thread that waits on an object handle enters Waiting; when the object is set to the signaled state, the wait is complete]
Interaction between Synchronization & Dispatching
User mode thread waits on an event object's handle
Kernel changes thread's scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads; variable priority threads get priority boost
Interaction between Synchronization & Dispatching (contd.)
Dispatcher re-schedules new thread: it may preempt the running thread if that thread has lower priority, and issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
What signals an object?
For each dispatcher object: the system events that change its state, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled to signaled: owning thread releases mutex
– signaled to nonsignaled: resumed thread acquires mutex
– effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled to signaled: owning thread or other thread releases mutex
– signaled to nonsignaled: resumed thread acquires mutex
– effect: kernel resumes one waiting thread
Semaphore
– nonsignaled to signaled: one thread releases the semaphore, freeing a resource
– signaled to nonsignaled: a thread acquires the semaphore; more resources are not available
– effect: kernel resumes one or more waiting threads
What signals an object? (contd.)
Event
– nonsignaled to signaled: a thread sets the event
– signaled to nonsignaled: kernel resumes one or more threads
– effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled to signaled: dedicated thread sets one event in the event pair
– signaled to nonsignaled: kernel resumes the other dedicated thread
– effect: kernel resumes waiting dedicated thread
Timer
– nonsignaled to signaled: timer expires
– signaled to nonsignaled: a thread (re)initializes the timer
– effect: kernel resumes all waiting threads
Thread
– nonsignaled to signaled: thread terminates
– signaled to nonsignaled: a thread reinitializes the thread object
– effect: kernel resumes all waiting threads
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs) (contd.)
[Figure: DPC queue: a queue head followed by a chain of DPC objects]
Delivering a DPC
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with IRQLs from High and Power failure down through Dispatch/DPC and APC to Low, the DPC queue, and the dispatcher]
1. Timer expires; kernel queues DPC that will release all waiting threads. Kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
– APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
– Permission required for user mode APCs
Asynchronous Procedure Calls (APCs) (contd.)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs) (contd.)
Special kernel APCs
– Run in kernel mode at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable but not documented
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
[Figure: a thread object with two APC queues, kernel mode (K) and user mode (U), each a list of APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
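The clock ISR's charging rule can be written as a small decision function. This is a sketch of the rule as stated on the slide, not actual kernel code; `charge_clock_tick`, `enum time_bucket`, and the `in_user_mode` parameter are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* Where one clock tick gets charged (hypothetical names). */
enum time_bucket {
    THREAD_USER_TIME,
    THREAD_KERNEL_TIME,
    DPC_TIME,
    INTERRUPT_TIME
};

/*
 * irql           - IRQL that was interrupted by the clock tick
 * processing_dpc - true if a DPC routine was running (only relevant at IRQL 2)
 * in_user_mode   - true if the interrupted code ran in user mode
 */
enum time_bucket charge_clock_tick(int irql, bool processing_dpc, bool in_user_mode)
{
    if (irql > 2)
        return INTERRUPT_TIME;
    if (irql == 2)
        return processing_dpc ? DPC_TIME : THREAD_KERNEL_TIME;
    /* IRQL < 2: charge the current thread */
    return in_user_mode ? THREAD_USER_TIME : THREAD_KERNEL_TIME;
}
```

Note that the decision is made once per tick for whatever was running at that instant, which is why work that starts and finishes between two ticks is never charged.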
IRQLs and CPU Time Accounting (contd.)
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted for
– E.g., one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add the Context Switch Delta column
7676
For waiting threads user-mode utilities only display the wait reason
Example pstat
Looking at Waiting Threads
7777
Wait Internals 1 Dispatcher Objects
Size TypeState
Wait listhead
Object-type-specific data
DispatcherObject
(see ntddkincddkntddkh)
Any kernel object you can wait for is a ldquodispatcher objectrdquo
ndash some exclusively for synchronizationeg events mutexes (ldquomutantsrdquo) semaphores queues timers
ndash others can be waited for as a side effect of their prime function eg processes threads file objects
ndash non-waitable kernel objects are called ldquocontrol objectsrdquo
All dispatcher objects have a common header
All dispatcher objects are in one of two states
ndash ldquosignaledrdquo vs ldquononsignaledrdquo
ndash when signalled a wait on the object is satisfied
ndash different object types differ in terms of what changes their state
ndash wait and unwait implementation iscommon to all types of dispatcher objects
7878
Object-type-specific data
Wait Internals 2Wait Blocks
Size TypeState
Wait listhead
Size TypeState
Wait listhead
Represent a threadrsquos reference to something itrsquos waiting for (one per handle passed to WaitForhellip)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for ldquoanyrdquo or ldquoallrdquo Key denotes argument list position for
WaitForMultipleObjects
Object-type-specific data
DispatcherObjects
Thread Objects
WaitBlockListWaitBlockList
Wait blocks
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
7979
34 Windows APIs for Synchronization and IPC Windows API constructs for synchronization and interprocess communication
Synchronization
ndash Critical sections
ndash Mutexes
ndash Semaphores
ndash Event objects
Synchronization through interprocess communication
ndash Anonymous pipes
ndash Named pipes
ndash Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec ) VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec )
8181
Critical Section Example
counter is global shared by all threads
volatile int counter = 0
CRITICAL_SECTION crit
InitializeCriticalSection ( ampcrit )
hellip main loop in any of the threads
while (done)
_try
EnterCriticalSection ( ampcrit )
counter += local_value
LeaveCriticalSection ( ampcrit )
_finally LeaveCriticalSection ( ampcrit )
DeleteCriticalSection( ampcrit )
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
ndash Processes
ndash Threads
ndash Files
ndash Console input
File change notificationsFile change notifications
MutexesMutexes
Events (auto-reset + manual-reset)Events (auto-reset + manual-reset)
Waitable timersWaitable timers
DWORD WaitForSingleObject( HANDLE hObject DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects LPHANDLE lpHandles BOOL bWaitAll DWORD dwTimeout )
8383
Wait Functions - Details
WaitForSingleObject()ndash hObject specifies kernel object
ndash dwTimeout specifies wait time in msecdwTimeout == 0 - no wait check whether object is signaled
dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()ndash cObjects lt= MAXIMUM_WAIT_OBJECTS (64)
ndash lpHandles - pointer to array identifying these objects
ndash bWaitAll - whether to wait for first signaled object or all objectsFunction returns index of first signaled object
Side effectsndash Mutexes auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
HANDLE OpenMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
BOOL ReleaseMutex( HANDLE hMutex )
8686
Mutex Example
counter is global shared by all threads
volatile int done counter = 0
HANDLE mutex = CreateMutex( NULL FALSE NULL )
main loop in any of the threads ret is local
DWORD ret
while (done)
ret = WaitForSingleObject( mutex INFINITE )
if (ret == WAIT_OBJECT_0)
counter += local_value
else mutex was abandoned
break exit the loop
ReleaseMutex( mutex )
CloseHandle( mutex )
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexesndash Synchronization among threads in same process
Five basic functionsndash pthread_mutex_init()
ndash pthread_mutex_destroy()
ndash pthread_mutex_lock()
ndash pthread_mutex_unlock()
ndash pthread_mutex_trylock()
Comparisonndash pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex )
ndash pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
8888
Semaphores
Semaphore objects are used for resource countingndash A semaphore is signaled when count gt 0
Threadsprocesses use wait functionsndash Each wait function decreases semaphore count by 1
ndash ReleaseSemaphore() may increment count by any value
ndash ReleaseSemaphore() returns old semaphore count
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE OpenSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE ReleaseSemaphore( HANDLE hSemaphoreLONG cReleaseCount LPLONG lpPreviousCount )
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)ndash Manual-reset event can signal several thread simultaneously must be reset manually
ndash PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
ndash Auto-reset event signals a single thread event is reset automatically
ndash fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTE lpsaBOOL fManualReset BOOL fInititalStateLPTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )BOOL ResetEvent( HANDLE hEvent )BOOL PulseEvent( HANDLE hEvent )
9292
Comparison - POSIX condition variables
pthreadrsquos condition variables are comparable to eventsndash pthread_cond_init()
ndash pthread_cond_destroy()
Wait functionsndash pthread_cond_wait()
ndash pthread_cond_timedwait()
Signalingndash pthread_cond_signal() - one thread
ndash pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phReadPHANDLE phWriteLPSECURITY_ATTRIBUTES lpsaDWORD cbPipe )
main
prog1 prog2pipe
Half-duplex character-based IPC
cbPipe pipe byte size zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are oneway
9494
IO Redirection using an Anonymous Pipe
Create default size anonymous pipe handles are inheritable
if (CreatePipe (amphReadPipe amphWritePipe ampPipeSA 0))
fprintf(stderr ldquoAnon pipe create failednrdquo) exit(1)
Set output handle to pipe handle create first processes
StartInfoCh1hStdInput = GetStdHandle (STD_INPUT_HANDLE)
StartInfoCh1hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh1hStdOutput = hWritePipe
StartInfoCh1dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)Command1 NULL NULL TRUE 0 NULL NULL ampStartInfoCh1 ampProcInfo1))
fprintf(stderr ldquoCreateProc1 failednrdquo) exit(2)
CloseHandle (hWritePipe)
9595
Pipe example (contd)
Repeat (symmetrically) for the second process
StartInfoCh2hStdInput = hReadPipe
StartInfoCh2hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh2hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE)
StartInfoCh2dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)targv NULL NULLTRUE Inherit handles
0 NULL NULL ampStartInfoCh2 ampProcInfo2))
fprintf(stderr ldquoCreateProc2 failednrdquo) exit(3)
CloseHandle (hReadPipe)
Wait for both processes to complete
WaitForSingleObject (ProcInfo1hProcess INFINITE)
WaitForSingleObject (ProcInfo2hProcess INFINITE)
CloseHandle (ProcInfo1hThread) CloseHandle (ProcInfo1hProcess)
CloseHandle (ProcInfo2hThread) CloseHandle (ProcInfo2hProcess)
return 0
9696
Named Pipes
Message orientedndash Reading process can read varying-length messages precisely as sent by the writing process
Bi-directionalndash Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipendash Several clients can communicate with a single server using the same instance
ndash Server can respond to client using the same instance
Pipe can be accessed over the networkndash location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeNameDWORD fdwOpenMode DWORD fdwPipModeDWORD nMaxInstances DWORD cbOutBufDWORD cbInBuf DWORD dwTimeOutLPSECURITY_ATTRIBUTES lpsa )
Use same flag settings forall instances of a named pipe
lpszPipeName pipe[path]pipename
ndash Not possible to create a pipe on remote machine ( ndash local machine)
fdwOpenMode
ndash PIPE_ACCESS_DUPLEX PIPE_ACCESS_INBOUND PIPE_ACCESS_OUTBOUND
fdwPipeMode
ndash PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
ndash PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
ndash PIPE_WAIT or PIPE_NOWAIT (will ReadFile block)
9898
Named Pipes (contd)
BOOL PeekNamedPipe (HANDLE hPipeLPVOID lpvBuffer DWORD cbBufferLPDWORD lpcbRead LPDWORD lpcbAvailLPDWORD lpcbMessage)
nMaxInstances
ndash Number of instances
ndash PIPE_UNLIMITED_INSTANCES OS choice based on resources
dwTimeOut
ndash Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
ndash Closing handle to last instance deletes named pipe
Polling a pipe
ndash Nondestructive ndash is there a message waiting for ReadFile
9999
Named Pipe Client Connections
CreateFile with named pipe namendash pipe[path]pipename
ndash servernamepipe[path]pipename
ndash First method gives better performance (local server)
Status Functionsndash GetNamedPipeHandleState
ndash SetNamedPipeHandleState
ndash GetNamedPipeInfo

Convenience Functions
BOOL TransactNamedPipe (HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
- Combines a WriteFile, ReadFile sequence in one call

Convenience Functions (contd)
BOOL CallNamedPipe (LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
- Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence in one call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, or NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

Comparison with UNIX
UNIX FIFOs are similar to named pipes
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic (for writes of at most PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
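The FIFO side of this comparison can be sketched in a few lines of POSIX C. This is a minimal illustration, not code from the course: the record size RECORD_LEN, the struct record type, and the function fifo_roundtrip are hypothetical names, and a single process plays both writer and reader so that the sketch stays self-contained.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_LEN 64   /* hypothetical fixed record size */

/* Fixed-size record: the reader always knows how many bytes to expect. */
struct record {
    char text[RECORD_LEN];
};

/* Sends one record through a FIFO at `path` and reads it back.
   Returns 0 on success, -1 on any failure. */
int fifo_roundtrip(const char *path, const char *msg, struct record *out)
{
    struct record rec;
    int rfd, wfd, rc = -1;

    memset(&rec, 0, sizeof rec);
    strncpy(rec.text, msg, RECORD_LEN - 1);

    unlink(path);                       /* remove a stale FIFO, if any */
    if (mkfifo(path, 0600) != 0)
        return -1;

    /* A nonblocking read-only open returns at once, even with no writer. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd < 0) {
        unlink(path);
        return -1;
    }
    wfd = open(path, O_WRONLY);         /* succeeds: a reader already exists */
    if (wfd >= 0) {
        /* A write of at most PIPE_BUF bytes is atomic. */
        if (write(wfd, &rec, sizeof rec) == (ssize_t)sizeof rec &&
            read(rfd, out, sizeof *out) == (ssize_t)sizeof *out)
            rc = 0;
        close(wfd);
    }
    close(rfd);
    unlink(path);
    return rc;
}
```

Because the FIFO carries a plain byte stream, the fixed record size is what lets the reader find message boundaries; a message-mode named pipe keeps those boundaries itself.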

Client Example Using a Named Pipe
/* Wait until a server instance is available, then open the pipe */
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf (stderr, "Failure to locate server.\n"); exit (3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;

Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    /* Both calls return FALSE on failure */
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf (stderr, "Connect or Read Named Pipe error\n"); exit (4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while (fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL)
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen (ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain

Locate a server via mailslot
App client 0 ... App client n (mailslot servers):
    hMS = CreateMailslot ("\\.\mailslot\status");
    ReadFile (hMS, &ServStat);
    /* connect to server */
App server (mailslot client):
    while (...) {
        Sleep (...);
        hMS = CreateFile ("\\*\mailslot\status");
        WriteFile (hMS, &StatInfo);
    }
- Message is sent periodically

Creating a mailslot
HANDLE CreateMailslot (LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa);
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for this many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait

Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieves a handle for a local mailslot
- \\host\mailslot\[path]name: retrieves a handle for a mailslot on the specified host
- \\domain\mailslot\[path]name: returns a handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns a handle representing mailslots on machines in the system's primary domain; max message length is 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活

Executive Synchronization
Waiting on dispatcher objects - outside the kernel
Interaction with thread scheduling
[Figure: thread state diagram with the states Initialized, Ready, Standby, Running, Waiting, Transition, and Terminated. Labeled steps: "Create and initialize thread object"; "Thread waits on an object handle"; "Wait is complete"; "Set object to signaled state".]

Interaction between Synchronization & Dispatching
User-mode thread waits on an event object's handle
Kernel changes the thread's scheduling state from ready to waiting and adds the thread to the wait list
Another thread sets the event
Kernel wakes up the waiting threads; variable-priority threads get a priority boost

Interaction between Synchronization & Dispatching (contd)
Dispatcher reschedules the new thread: it may preempt a running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later

What signals an object
For each dispatcher object: the system events and resulting state change, and the effect of the signaled state on waiting threads
Mutex (kernel mode)
- Nonsignaled -> signaled: owning thread releases mutex
- Signaled -> nonsignaled: resumed thread acquires mutex
- Effect on waiting threads: kernel resumes one waiting thread
Mutex (exported to user mode)
- Nonsignaled -> signaled: owning thread or other thread releases mutex
- Signaled -> nonsignaled: resumed thread acquires mutex
- Effect on waiting threads: kernel resumes one waiting thread
Semaphore
- Nonsignaled -> signaled: one thread releases the semaphore, freeing a resource
- Signaled -> nonsignaled: a thread acquires the semaphore; more resources are not available
- Effect on waiting threads: kernel resumes one or more waiting threads

What signals an object (contd)
Event
- Nonsignaled -> signaled: a thread sets the event
- Signaled -> nonsignaled: kernel resumes one or more threads
- Effect on waiting threads: kernel resumes one or more waiting threads
Event pair
- Nonsignaled -> signaled: dedicated thread sets one event in the event pair
- Signaled -> nonsignaled: kernel resumes the other dedicated thread
- Effect on waiting threads: kernel resumes waiting dedicated thread
Timer
- Nonsignaled -> signaled: timer expires
- Signaled -> nonsignaled: a thread (re)initializes the timer
- Effect on waiting threads: kernel resumes all waiting threads
Thread
- Nonsignaled -> signaled: thread terminates
- Signaled -> nonsignaled: a thread reinitializes the thread object
- Effect on waiting threads: kernel resumes all waiting threads

Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004.
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)

3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects

Deferred Procedure Calls (DPCs)
Used to defer processing from a higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor but can be targeted to other CPUs
- Executes the specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx

Deferred Procedure Calls (DPCs)
[Figure: a DPC queue, shown as a queue head linked to a chain of DPC objects]

Delivering a DPC
1. Timer expires; the kernel queues a DPC that will release all waiting threads, and requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table with IRQLs ordered from High and Power failure down through Dispatch/DPC and APC to Low; DPC objects waiting in the DPC queue are handed to the dispatcher]
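The delivery steps can be modeled in portable C to make the deferral visible. This is a toy model under stated assumptions, not kernel code: struct cpu, queue_dpc, set_irql, and the capacity MAX_DPCS are hypothetical names, and the software interrupt is reduced to a check made whenever the IRQL changes.

```c
#include <stddef.h>

#define DISPATCH_LEVEL 2
#define MAX_DPCS 8                  /* hypothetical queue capacity */

typedef void (*dpc_routine)(void *context);

struct dpc {
    dpc_routine routine;
    void *context;
};

struct cpu {
    int irql;                       /* current IRQL of this CPU */
    struct dpc queue[MAX_DPCS];     /* per-CPU DPC queue, FIFO order */
    size_t count;
};

/* Runs every queued DPC routine at dispatch level, then restores the IRQL. */
static void drain_dpcs(struct cpu *c)
{
    int saved = c->irql;
    c->irql = DISPATCH_LEVEL;
    for (size_t i = 0; i < c->count; i++)
        c->queue[i].routine(c->queue[i].context);
    c->count = 0;
    c->irql = saved;
}

/* Sets a new IRQL; pending DPCs run once the level drops below dispatch. */
void set_irql(struct cpu *c, int new_irql)
{
    c->irql = new_irql;
    if (new_irql < DISPATCH_LEVEL && c->count > 0)
        drain_dpcs(c);
}

/* Queues a DPC; returns 0 on success, -1 if the queue is full. */
int queue_dpc(struct cpu *c, dpc_routine routine, void *context)
{
    if (c->count == MAX_DPCS)
        return -1;
    c->queue[c->count].routine = routine;
    c->queue[c->count].context = context;
    c->count++;
    if (c->irql < DISPATCH_LEVEL)
        drain_dpcs(c);              /* nothing holds the software interrupt back */
    return 0;
}

/* Example DPC routine: counts how often it ran. */
void count_dpc(void *context)
{
    ++*(int *)context;
}
```

An ISR running at a device IRQL would call queue_dpc and return quickly; the queued routines only run after set_irql lowers the level below DISPATCH_LEVEL.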

Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, and call system services
APC queue is thread-specific
User-mode & kernel-mode APCs
- Permission required for user-mode APCs

Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()

Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless the thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable, but not documented

Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when the thread is in an "alertable wait"

Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two queues of APC objects, one for kernel-mode APCs (K) and one for user-mode APCs (U)]
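The rule that user-mode APCs are only delivered in an alertable wait can be illustrated with a small portable model. This is a sketch under stated assumptions rather than the Windows implementation: struct thread_model, queue_user_apc, enter_wait, and MAX_APCS are hypothetical names, and only the user-mode queue of one thread is modeled.

```c
#include <stddef.h>

#define MAX_APCS 8                      /* hypothetical queue capacity */

typedef void (*apc_routine)(void *arg);

struct apc {
    apc_routine routine;
    void *arg;
};

struct thread_model {
    struct apc user_apcs[MAX_APCS];     /* the thread's own user-mode APC queue */
    size_t pending;
};

/* Queues a user-mode APC to one specific thread; -1 if the queue is full. */
int queue_user_apc(struct thread_model *t, apc_routine routine, void *arg)
{
    if (t->pending == MAX_APCS)
        return -1;
    t->user_apcs[t->pending].routine = routine;
    t->user_apcs[t->pending].arg = arg;
    t->pending++;
    return 0;
}

/* Models entering a wait and returns the number of APCs delivered:
   an alertable wait runs every pending APC, any other wait runs none. */
size_t enter_wait(struct thread_model *t, int alertable)
{
    size_t delivered = 0;

    if (!alertable)
        return 0;
    for (size_t i = 0; i < t->pending; i++, delivered++)
        t->user_apcs[i].routine(t->user_apcs[i].arg);
    t->pending = 0;
    return delivered;
}

/* Example APC routine: marks an I/O request as completed. */
void mark_done(void *arg)
{
    *(int *)arg = 1;
}
```

In the model, a completion routine queued with queue_user_apc stays pending through any number of non-alertable waits, which mirrors why ReadFileEx callbers must wait with SleepEx() or WaitForMultipleObjectsEx() in alertable mode.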

IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to the thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to the thread's kernel time
- If IRQL > 2, charge to interrupt time
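The four accounting rules translate directly into a decision function. The sketch below is only an illustration of the rules as listed: the enum charge values and the function charge_tick are hypothetical names, and the in_user_mode flag is an assumption added to split the "user or kernel time" case.

```c
/* Where one clock tick is charged. */
enum charge {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
};

/* Applies the clock ISR rules to the state that was interrupted.
   irql: IRQL at the time of the tick; in_dpc: nonzero while a DPC runs;
   in_user_mode: nonzero if the interrupted thread ran user-mode code. */
enum charge charge_tick(int irql, int in_dpc, int in_user_mode)
{
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return in_user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```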

IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)

Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs

Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add the Context Switch Delta column
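A short calculation shows how tick-based sampling can miss a thread entirely. The function below is a simplified model, not the kernel's accounting code: charged_ticks is a hypothetical name, time is measured in whole milliseconds, ticks are assumed to fire at exact multiples of tick_ms, and tick_ms must be greater than zero.

```c
/* Counts the clock ticks that land while a thread is running.
   The thread runs during [start_ms, end_ms); a tick fires at every multiple
   of tick_ms, and each tick inside the interval charges one full tick. */
unsigned charged_ticks(unsigned start_ms, unsigned end_ms, unsigned tick_ms)
{
    unsigned ticks = 0;
    /* First tick at or after start_ms. */
    unsigned t = ((start_ms + tick_ms - 1) / tick_ms) * tick_ms;

    for (; t < end_ms; t += tick_ms)
        ticks++;
    return ticks;
}
```

With a 10 msec tick, a thread that runs from 1 to 9 msec is charged nothing, while a thread that runs only from 9 to 11 msec is charged a full tick.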

Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat

Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout (see \ntddk\inc\ddk\ntddk.h): a common header holding Size, Type, State, and the wait list head, followed by object-type-specific data]

Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects, each with a WaitBlockList pointing to its chain of wait blocks; dispatcher objects (Size, Type, State, wait list head, object-type-specific data) whose wait list heads link the wait blocks waiting on them. Each wait block holds a list entry, Object and Thread pointers, Key, Type, and a Next link.]
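The roles of the Key and Type fields can be shown with a reduced data structure. This is a simplified sketch, not the layout from ntddk.h: the dispatcher header is cut down to its state, the per-object wait list is left out, and wait_satisfied is a hypothetical helper.

```c
#include <stddef.h>

enum wait_type { WAIT_ANY, WAIT_ALL };

/* Dispatcher object header reduced to the signaled state. */
struct dispatcher_object {
    int signaled;
};

struct wait_block {
    struct dispatcher_object *object;   /* what the thread waits for */
    size_t key;                         /* position in the handle array */
    enum wait_type type;                /* wait for "any" or for "all" */
    struct wait_block *next;            /* next block of the same wait call */
};

/* Checks whether the wait described by a chain of wait blocks is satisfied.
   Returns the key of the first signaled object for WAIT_ANY, 0 for a
   satisfied WAIT_ALL, and -1 if the thread has to keep waiting. */
long wait_satisfied(const struct wait_block *chain)
{
    const struct wait_block *b;

    if (chain == NULL)
        return -1;
    if (chain->type == WAIT_ANY) {
        for (b = chain; b != NULL; b = b->next)
            if (b->object->signaled)
                return (long)b->key;
        return -1;
    }
    for (b = chain; b != NULL; b = b->next)
        if (!b->object->signaled)
            return -1;
    return 0;
}
```

The key is what lets a wait-any call report which handle in the caller's array ended the wait.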

3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots

Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section, other than trying to enter it with TryEnterCriticalSection
VOID InitializeCriticalSection (LPCRITICAL_SECTION sec);
VOID DeleteCriticalSection (LPCRITICAL_SECTION sec);
VOID EnterCriticalSection (LPCRITICAL_SECTION sec);
VOID LeaveCriticalSection (LPCRITICAL_SECTION sec);
BOOL TryEnterCriticalSection (LPCRITICAL_SECTION sec);

Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection (&crit);
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection (&crit);
    __try {
        counter += local_value;
    }
    __finally {
        LeaveCriticalSection (&crit);   /* leave exactly once, even on an exception */
    }
}
DeleteCriticalSection (&crit);
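The matching Enter/Leave rule for nested entry can be tried out with a pthreads recursive mutex, which behaves similarly within one process. This is a portable stand-in rather than the Win32 API: crit_init, add_locked, and add_twice are hypothetical names, and the recursive mutex type is an assumption chosen because a default pthreads mutex does not allow nested locking.

```c
#define _XOPEN_SOURCE 700   /* for PTHREAD_MUTEX_RECURSIVE */
#include <pthread.h>

static pthread_mutex_t crit;
static int counter;

/* Creates a recursive mutex, the closest pthreads match to a critical section.
   Returns 0 on success, -1 on failure. */
int crit_init(void)
{
    pthread_mutexattr_t attr;
    int rc;

    if (pthread_mutexattr_init(&attr) != 0)
        return -1;
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    rc = pthread_mutex_init(&crit, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc == 0 ? 0 : -1;
}

/* Inner helper: enters the section again while the caller already holds it. */
static void add_locked(int value)
{
    pthread_mutex_lock(&crit);      /* nested enter by the same thread */
    counter += value;
    pthread_mutex_unlock(&crit);    /* matching leave */
}

/* Adds `value` twice under one outer lock and returns the new counter. */
int add_twice(int value)
{
    int result;

    pthread_mutex_lock(&crit);      /* first enter */
    add_locked(value);
    add_locked(value);
    result = counter;
    pthread_mutex_unlock(&crit);    /* matching leave */
    return result;
}
```

Every lock call has exactly one unlock call on the same path; dropping one of them would leave the mutex held and block all other threads.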

Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject (HANDLE hObject, DWORD dwTimeout);
DWORD WaitForMultipleObjects (DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout);

Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
  - dwTimeout == 0: no wait, check whether the object is signaled
  - dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to an array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects
  - When waiting for the first signaled object, the function returns WAIT_OBJECT_0 plus the index of that object
Side effects
- Mutexes, auto-reset events, and auto-reset waitable timers will be reset to the nonsignaled state after completing wait functions

Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free the mutex object once its last handle is closed

Mutexes
HANDLE CreateMutex (LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName);
HANDLE OpenMutex (DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName);
BOOL ReleaseMutex (HANDLE hMutex);

Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex (NULL, FALSE, NULL);
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject (mutex, INFINITE);
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else        /* mutex was abandoned */
        break;  /* exit the loop */
    ReleaseMutex (mutex);
}
CloseHandle (mutex);

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block: equivalent to WaitForSingleObject (hMutex, INFINITE)
- pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
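The polling variant from the comparison can be sketched as follows. It is a minimal illustration: try_acquire and release are hypothetical wrapper names, and the mutex is a default (non-recursive) pthreads mutex.

```c
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Polls the mutex once, like WaitForSingleObject() with a timeout of 0.
   Returns 1 if the lock was acquired, 0 if it is busy, -1 on any other error. */
int try_acquire(void)
{
    int rc = pthread_mutex_trylock(&lock);

    if (rc == 0)
        return 1;
    return rc == EBUSY ? 0 : -1;
}

/* Gives the lock up again, the counterpart of ReleaseMutex(). */
void release(void)
{
    pthread_mutex_unlock(&lock);
}
```

One difference is worth keeping in mind: a Windows mutex can be acquired again by the thread that already owns it, whereas a default pthreads mutex reports busy in that case.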

Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value, up to the semaphore's maximum
- ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
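The counting behavior can be tried out with POSIX unnamed semaphores. This is a rough counterpart, not the Win32 API: release_units and try_acquire_unit are hypothetical helpers, and because sem_post() adds only one unit per call, the multi-unit release is a loop and therefore not atomic the way ReleaseSemaphore() is.

```c
#include <semaphore.h>

/* Releases `n` units and returns the count seen before the release
   (the role played by lpPreviousCount). */
int release_units(sem_t *sem, int n)
{
    int previous = 0;

    sem_getvalue(sem, &previous);
    for (int i = 0; i < n; i++)
        sem_post(sem);              /* POSIX posts one unit per call */
    return previous;
}

/* Nonblocking acquire: returns 1 if a unit was taken, 0 if the count was 0. */
int try_acquire_unit(sem_t *sem)
{
    return sem_trywait(sem) == 0;
}
```

A semaphore created with sem_init(&sem, 0, 1) starts with one free unit; after that unit is taken, further try_acquire_unit calls fail until release_units puts units back.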

Semaphores
HANDLE CreateSemaphore (LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName);
HANDLE OpenSemaphore (DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName);
BOOL ReleaseSemaphore (HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount);

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state

Events
HANDLE CreateEvent (LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName);
BOOL SetEvent (HANDLE hEvent);
BOOL ResetEvent (HANDLE hEvent);
BOOL PulseEvent (HANDLE hEvent);

Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
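Although pthreads has no direct counterpart, a manual-reset event can be approximated by pairing a condition variable with a flag. The sketch below is one common construction under stated assumptions, not an exact equivalent: struct manual_event and the event_* functions are hypothetical names, and timeouts and PulseEvent() are left out.

```c
#include <pthread.h>

struct manual_event {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int signaled;               /* stays set until event_reset() is called */
};

/* Initializes the event; returns 0 on success, -1 on failure. */
int event_init(struct manual_event *e, int initial_state)
{
    e->signaled = initial_state;
    if (pthread_mutex_init(&e->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&e->cond, NULL) != 0) {
        pthread_mutex_destroy(&e->lock);
        return -1;
    }
    return 0;
}

/* Signals the event and wakes every waiting thread. */
void event_set(struct manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 1;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Returns the event to the nonsignaled state. */
void event_reset(struct manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 0;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; does not reset it. */
void event_wait(struct manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)        /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}
```

The flag is what carries the "stays signaled" property: a condition variable by itself has no memory, so a thread that calls event_wait after event_set would otherwise block.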

Anonymous pipes
BOOL CreatePipe (PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe);
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
[Figure: main creates a pipe that connects prog1 to prog2]
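The same one-way, byte-stream behavior can be observed with the POSIX pipe() call. This is a portable sketch rather than the Win32 CreatePipe API: pipe_roundtrip is a hypothetical helper, one process holds both ends, and the message is assumed to be small enough to fit in the pipe buffer so that the write does not block.

```c
#include <string.h>
#include <unistd.h>

/* Sends `msg` (including its terminating NUL) through an anonymous pipe and
   reads it back into `out`. Returns the number of bytes read, or -1 on failure. */
long pipe_roundtrip(const char *msg, char *out, size_t out_size)
{
    int fds[2];                 /* fds[0]: read end, fds[1]: write end */
    size_t len = strlen(msg) + 1;
    long n = -1;

    if (len > out_size || pipe(fds) != 0)
        return -1;
    /* Data only flows from the write end to the read end. */
    if (write(fds[1], msg, len) == (ssize_t)len)
        n = (long)read(fds[0], out, out_size);
    close(fds[1]);
    close(fds[0]);
    return n;
}
```

Reading before anything has been written would block, which is why the sketch writes first.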

I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf (stderr, "Anon pipe create failed\n"); exit (1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE,  /* Inherit handles */
        0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf (stderr, "CreateProc1 failed\n"); exit (2);
}
CloseHandle (hWritePipe);

Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE,  /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf (stderr, "CreateProc2 failed\n"); exit (3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
5858
Interaction bet Synchronization amp Dispatching User mode thread waits on an event objectlsquos handle
Kernel changes threadlsquos scheduling state from ready to waiting and adds thread to wait-list
Another thread sets the event
Kernel wakes up waiting threads variable priority threads get priority boost
5959
Interaction bet Synchronization amp Dispatching Dispatcher re-schedules new thread ndash it may preempt running thread it it has lower priority and issues software interrupt to initiate context switch
If no processor can be preempted the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
Dispatcher object
System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
Owning thread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (kernel mode)
nonsignaled signaled
Owning thread or otherthread releases mutex
Resumed thread acquires mutex
Kernel resumes one waiting thread
Mutex (exported to user mode)
nonsignaled signaled
One thread releases thesemaphore freeing a resource
A thread acquires the semaphoreMore resources are not available
Kernel resumes one or more waiting threads
Semaphore
6161
A thread reinitializesthe thread object
What signals an object (contd)
Dispatcher object System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
A thread sets the event
Kernel resumes one or more threads
Kernel resumes one or more waiting threads
Event
nonsignaled signaled
Dedicated thread sets oneevent in the event pair
Kernel resumes theother dedicated thread
Kernel resumes waitingdedicated thread
Event pair
nonsignaled signaled
Timer expires
A thread (re) initializes the timer
Kernel resumes all waiting threads
Timer
nonsignaled signaled
Thread terminates
Kernel resumes all waiting threads
Thread
6262
Further Reading
Mark E Russinovich and David A Solomon Microsoft Windows Internals 4th Edition Microsoft Press 2004
Chapter 3 - System Mechanisms
ndash Trap Dispatching (pp 85 ff)
ndash Synchronization (pp 149 ff)
ndash Kernel Event Tracing (pp 175 ff)
6363
33 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues amp Dispatcher Objects
6464
Used to defer processing from higher (device) interrupt level to a lower (dispatch) levelndash Also used for quantum end and timer expiration
Driver (usually ISR) queues requestndash One queue per CPU DPCs are normally queued to the current processor but can be targeted to other CPUs
ndash Executes specified procedure at dispatch IRQL (or ldquodispatch levelrdquo also ldquoDPC levelrdquo) when all higher-IRQL work (interrupts) completed
ndash Maximum times recommended ISR 10 usec DPC 25 usec
See httpwwwmicrosoftcomwhdcdriverperformmmdrvmspx
Deferred Procedure Calls (DPCs)
6565
queue head DPC object DPC object DPC object
Deferred Procedure Calls (DPCs)
6666
DPC
Delivering a DPC
DPC routines can call kernel functionsbut canlsquot call system services generatepage faults or create or wait on objects
DPC routines canlsquotassume whatprocess addressspace is currentlymapped
Interruptdispatch table
high
Power failure
DispatchDPC
APC
Low
DPC
1 Timer expires kernelqueues DPC that willrelease all waiting threadsKernel requests SW int
DPCDPC
DPC queue
2 DPC interrupt occurswhen IRQL drops belowdispatchDPC level
dispatcher
3 After DPC interruptcontrol transfers tothread dispatcher
4 Dispatcher executes each DPCroutine in DPC queue
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user threadndash APC routines can acquire resources (objects) incur page faultscall system services
APC queue is thread-specific
User mode amp kernel mode APCsndash Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread spacendash Wait for asynchronous IO operation
ndash Emulate delivery of POSIX signals
ndash Make threads suspendterminate itself (env subsystems)
APCs are delivered when thread is in alertable wait statendash WaitForMultipleObjectsEx() SleepEx()
6969
Special kernel APCs
ndash Run in kernel mode at IRQL 1
ndash Always deliverable unless thread is already at IRQL 1 or above
ndash Used for IO completion reporting from ldquoarbitrary thread contextrdquo
ndash Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
7070
ldquoOrdinaryrdquo kernel APCs
ndash Always deliverable if at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
ndash Used for IO completion callback routines (see ReadFileEx WriteFileEx) also QueueUserApc
ndash Only deliverable when thread is in ldquoalertable waitrdquo
Asynchronous Procedure Calls (APCs)
7171
ThreadObject
K
U
APC objects
Asynchronous Procedure Calls (APCs)
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 & not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
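The four rules above can be sketched as a small decision function. This is only an illustration of the rules as stated on the slide, not the kernel's code; the enum, the function name and the user_mode flag are assumptions introduced here.

```c
#include <stdbool.h>

/* Buckets a clock tick can be charged to (illustrative names). */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/* Decide where one clock tick is charged.
 * irql      : IRQL of the code the clock ISR interrupted
 * in_dpc    : true if a DPC routine was executing
 * user_mode : true if the interrupted code ran in user mode */
charge_t charge_tick(int irql, bool in_dpc, bool user_mode)
{
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    /* irql < 2: the tick belongs to the current thread */
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

Because the decision is taken only when the clock fires, whatever ran between two ticks is invisible to this accounting.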
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, a system that is busy while no process appears to be running must be busy with interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
ndash Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before clock fires
ndash Thus threads may run but never get charged
View context switch activity with Process Explorer
ndash Add Context Switch Delta column
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout: a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects point through WaitBlockList to chains of wait blocks; each wait block holds Key, Type, Next link, a list entry, and Object and Thread pointers; each dispatcher object (Size, Type, State, Wait list head, object-type-specific data) links the wait blocks of the threads waiting on it]
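The relationship between dispatcher headers and wait blocks can be sketched with a deliberately simplified model. The structures below are assumptions made for illustration: they are not the layouts from ntddk.h, and they leave out the per-object list of waiters. The model only shows how the Type ("any" or "all") and Key fields decide when a wait is satisfied.

```c
#include <stdbool.h>
#include <stddef.h>

typedef enum { WAIT_ANY, WAIT_ALL } wait_type_t;

/* Simplified common header shared by every waitable object. */
typedef struct {
    int  type;       /* object type tag */
    bool signaled;   /* signaled vs. nonsignaled */
} dispatcher_header_t;

/* One wait block per object named in a wait call. */
typedef struct wait_block {
    dispatcher_header_t *object;   /* what is being waited for */
    int                  key;      /* position in the argument list */
    wait_type_t          type;     /* wait for any or for all */
    struct wait_block   *next;     /* next block of the same wait call */
} wait_block_t;

/* Returns the key of the first signaled object for WAIT_ANY,
 * 0 for a satisfied WAIT_ALL, and -1 if the wait is not yet satisfied. */
int wait_satisfied(const wait_block_t *list)
{
    const wait_block_t *b;

    if (list == NULL)
        return -1;
    if (list->type == WAIT_ANY) {
        for (b = list; b != NULL; b = b->next)
            if (b->object->signaled)
                return b->key;
        return -1;
    }
    for (b = list; b != NULL; b = b->next)
        if (!b->object->signaled)
            return -1;
    return 0;
}
```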
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    /* leave exactly once per enter, even if the update raises an exception */
    __finally { LeaveCriticalSection( &crit ); }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
– dwTimeout == 0 - no wait, check whether object is signaled
– dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles - pointer to array identifying these objects
– bWaitAll - whether to wait for first signaled object or all objects
– When waiting for any object, the function returns WAIT_OBJECT_0 plus the index of the first signaled object
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
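Decoding the return value of a wait call is easy to get wrong, so a small helper may be useful. The sketch below repeats the SDK's numeric values so that it compiles without windows.h; the enum and the function name classify_wait are assumptions introduced for this illustration.

```c
#include <stdint.h>

/* Values as defined in the Windows SDK headers, repeated here so the
 * sketch compiles on its own. */
#define WAIT_OBJECT_0     0x00000000u
#define WAIT_ABANDONED_0  0x00000080u
#define WAIT_TIMEOUT      0x00000102u
#define WAIT_FAILED       0xFFFFFFFFu

typedef enum {
    WR_SIGNALED,    /* *index names the object that satisfied the wait */
    WR_ABANDONED,   /* *index names a mutex whose owner exited without releasing it */
    WR_TIMEOUT,
    WR_FAILED,
    WR_OTHER
} wait_result_t;

/* Classify the DWORD returned by a wait on `count` handles. */
wait_result_t classify_wait(uint32_t ret, uint32_t count, uint32_t *index)
{
    if (ret == WAIT_FAILED)
        return WR_FAILED;
    if (ret == WAIT_TIMEOUT)
        return WR_TIMEOUT;
    if (ret < WAIT_OBJECT_0 + count) {      /* WAIT_OBJECT_0 is 0 */
        *index = ret - WAIT_OBJECT_0;
        return WR_SIGNALED;
    }
    if (ret >= WAIT_ABANDONED_0 && ret < WAIT_ABANDONED_0 + count) {
        *index = ret - WAIT_ABANDONED_0;
        return WR_ABANDONED;
    }
    return WR_OTHER;
}
```

A caller would pass the value returned by WaitForMultipleObjects() together with cObjects, then switch on the result.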
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
8686
Mutex Example
/* counter is global, shared by all threads */
volatile int done = 0, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
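The two styles of acquisition compared above might look as follows on the POSIX side. This is a minimal sketch that mirrors the shared counter used in the Windows examples; the names counter_lock, add_blocking and add_if_free are hypothetical.

```c
#include <pthread.h>
#include <stdbool.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

/* Blocking update: comparable to WaitForSingleObject(hMutex, INFINITE). */
void add_blocking(int value)
{
    pthread_mutex_lock(&counter_lock);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling update: comparable to WaitForSingleObject(hMutex, 0).
 * Returns false without touching the counter if the mutex is busy. */
bool add_if_free(int value)
{
    if (pthread_mutex_trylock(&counter_lock) != 0)   /* EBUSY when already locked */
        return false;
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return true;
}
```

Unlike a Windows mutex, a default POSIX mutex is not recursive, so a thread that already holds it gets EBUSY from pthread_mutex_trylock().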
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each successful wait decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any positive value, as long as the maximum count is not exceeded
– ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
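The counting rules can be made concrete with a single-threaded model. It is not a synchronization primitive and does not call the Windows API; the type sem_model_t and both functions are assumptions used only to show how the count, the maximum and the previous count interact.

```c
#include <stdbool.h>

typedef struct {
    long count;   /* current count; "signaled" while count > 0 */
    long max;     /* upper bound fixed at creation */
} sem_model_t;

/* A successful wait consumes one unit.
 * Returns false where a real wait would block. */
bool sem_model_try_wait(sem_model_t *s)
{
    if (s->count <= 0)
        return false;
    s->count--;
    return true;
}

/* Release n units and report the previous count.
 * Fails, leaving the count unchanged, if n is not positive
 * or the new count would exceed the maximum. */
bool sem_model_release(sem_model_t *s, long n, long *previous)
{
    if (n <= 0 || n > s->max - s->count)
        return false;
    if (previous)
        *previous = s->count;
    s->count += n;
    return true;
}
```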
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal() - one thread
– pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
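Although pthreads has no direct counterpart, a manual-reset event can be approximated by pairing a condition variable with a mutex and a flag. The sketch below is one possible emulation under that assumption; the type manual_event_t and the event_* functions are hypothetical names, not part of any library.

```c
#include <pthread.h>
#include <stdbool.h>

typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} manual_event_t;

void event_init(manual_event_t *e, bool initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent on a manual-reset event: wake every waiter, stay signaled. */
void event_set(manual_event_t *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent: later waiters block again. */
void event_reset(manual_event_t *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Blocks until the event is signaled; the loop guards against spurious wakeups. */
void event_wait(manual_event_t *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

void event_destroy(manual_event_t *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what makes the state persistent: a condition variable alone forgets a signal that arrives when nobody is waiting, whereas a manual-reset event stays signaled until it is reset.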
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe; handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (cont'd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same pipe name, each over its own instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine ("." means local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (cont'd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence in one call
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle in one call
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
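The fixed-size-record advice for byte-oriented FIFOs can be illustrated with two helpers that frame messages on any byte-stream descriptor. This is a sketch under assumptions: RECORD_LEN, write_record and read_record are hypothetical names, and EINTR handling is left out for brevity.

```c
#include <stddef.h>
#include <string.h>
#include <unistd.h>

#define RECORD_LEN 64   /* hypothetical fixed record size */

/* Write one zero-padded record of exactly RECORD_LEN bytes.
 * Text longer than RECORD_LEN - 1 bytes is truncated. Returns 0 on success. */
int write_record(int fd, const char *text)
{
    char rec[RECORD_LEN] = {0};
    size_t done = 0;

    strncpy(rec, text, RECORD_LEN - 1);
    while (done < RECORD_LEN) {
        ssize_t n = write(fd, rec + done, RECORD_LEN - done);
        if (n <= 0)
            return -1;
        done += (size_t)n;
    }
    return 0;
}

/* Read exactly one record. Returns 0 on success,
 * -1 on error or if the stream ends before a full record arrived. */
int read_record(int fd, char rec[RECORD_LEN])
{
    size_t done = 0;

    while (done < RECORD_LEN) {
        ssize_t n = read(fd, rec + done, RECORD_LEN - done);
        if (n <= 0)
            return -1;
        done += (size_t)n;
    }
    return 0;
}
```

Because every record has the same length, the reader never has to guess where one request ends and the next begins, which is the property a message-mode named pipe provides directly.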
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}
/* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
107107
Locate a server via mailslot
[Figure: application clients 0 … n act as mailslot servers; the application server acts as the mailslot client and sends its status message periodically]
Mailslot servers (app client 0 … app client n):
hMS = CreateMailslot( "\\.\mailslot\status", … ); ReadFile( hMS, &ServStat, … ); /* connect to server */
Mailslot client (app server):
while (…) { Sleep(…); hMS = CreateFile( "\\*\mailslot\status", … ); … WriteFile( hMS, &StatInfo, … ); }
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name - retrieve handle for local mailslot
– \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max. message length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
5959
Interaction between Synchronization & Dispatching
Dispatcher re-schedules new thread: it may preempt the running thread if that thread has lower priority, and it issues a software interrupt to initiate the context switch
If no processor can be preempted, the dispatcher places the ready thread in the dispatcher ready queue to be scheduled later
6060
What signals an object
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Mutex (kernel mode)
– nonsignaled → signaled: owning thread releases mutex
– signaled → nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Mutex (exported to user mode)
– nonsignaled → signaled: owning thread or other thread releases mutex
– signaled → nonsignaled: resumed thread acquires mutex
– Effect: kernel resumes one waiting thread
Semaphore
– nonsignaled → signaled: one thread releases the semaphore, freeing a resource
– signaled → nonsignaled: a thread acquires the semaphore; more resources are not available
– Effect: kernel resumes one or more waiting threads
6161
What signals an object (cont'd)
For each dispatcher object: system events and resulting state change, and effect of signaled state on waiting threads
Event
– nonsignaled → signaled: a thread sets the event
– signaled → nonsignaled: kernel resumes one or more threads
– Effect: kernel resumes one or more waiting threads
Event pair
– nonsignaled → signaled: dedicated thread sets one event in the event pair
– signaled → nonsignaled: kernel resumes the other dedicated thread
– Effect: kernel resumes waiting dedicated thread
Timer
– nonsignaled → signaled: timer expires
– signaled → nonsignaled: a thread (re)initializes the timer
– Effect: kernel resumes all waiting threads
Thread
– nonsignaled → signaled: thread terminates
– signaled → nonsignaled: a thread reinitializes the thread object
– Effect: kernel resumes all waiting threads
6262
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004
Chapter 3 - System Mechanisms
– Trap Dispatching (pp. 85 ff.)
– Synchronization (pp. 149 ff.)
– Kernel Event Tracing (pp. 175 ff.)
6363
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
6464
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
– Also used for quantum end and timer expiration
Driver (usually ISR) queues request
– One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
– Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) completed
– Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
6565
Deferred Procedure Calls (DPCs)
[Figure: DPC queue: a queue head followed by a chain of DPC objects]
6666
DPC
Delivering a DPC
DPC routines can call kernel functionsbut canlsquot call system services generatepage faults or create or wait on objects
DPC routines canlsquotassume whatprocess addressspace is currentlymapped
Interruptdispatch table
high
Power failure
DispatchDPC
APC
Low
DPC
1 Timer expires kernelqueues DPC that willrelease all waiting threadsKernel requests SW int
DPCDPC
DPC queue
2 DPC interrupt occurswhen IRQL drops belowdispatchDPC level
dispatcher
3 After DPC interruptcontrol transfers tothread dispatcher
4 Dispatcher executes each DPCroutine in DPC queue
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user threadndash APC routines can acquire resources (objects) incur page faultscall system services
APC queue is thread-specific
User mode amp kernel mode APCsndash Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread spacendash Wait for asynchronous IO operation
ndash Emulate delivery of POSIX signals
ndash Make threads suspendterminate itself (env subsystems)
APCs are delivered when thread is in alertable wait statendash WaitForMultipleObjectsEx() SleepEx()
6969
Special kernel APCs
ndash Run in kernel mode at IRQL 1
ndash Always deliverable unless thread is already at IRQL 1 or above
ndash Used for IO completion reporting from ldquoarbitrary thread contextrdquo
ndash Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
7070
ldquoOrdinaryrdquo kernel APCs
ndash Always deliverable if at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
ndash Used for IO completion callback routines (see ReadFileEx WriteFileEx) also QueueUserApc
ndash Only deliverable when thread is in ldquoalertable waitrdquo
Asynchronous Procedure Calls (APCs)
7171
ThreadObject
K
U
APC objects
Asynchronous Procedure Calls (APCs)
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
ndash If IRQLlt2 charge to threadrsquos user or kernel time
ndash If IRQL=2 and processing a DPC charge to DPC time
ndash If IRQL=2 amp not processing a DPC charge to thread kernel time
ndash If IRQLgt2 charge to interrupt time
7373
IRQLs and CPU Time Accounting
Since time servicing interrupts are NOT charged to interrupted thread if system is busy but no process appears to be running must be due to interrupt-related activity
ndash Note time at IRQL 2 or more is charged to the current threadrsquos quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any threadprocess
ndash Process Explorer shows these as separate processesnot really processes
ndash Context switches for these are really of interrupts amp DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
ndash Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals NOT accounted
ndash Eg one or more threads run and enter a wait state before clock fires
ndash Thus threads may run but never get charged
View context switch activity with Process Explorer
ndash Add Context Switch Delta column
7676
For waiting threads user-mode utilities only display the wait reason
Example pstat
Looking at Waiting Threads
7777
Wait Internals 1 Dispatcher Objects
Size TypeState
Wait listhead
Object-type-specific data
DispatcherObject
(see ntddkincddkntddkh)
Any kernel object you can wait for is a ldquodispatcher objectrdquo
ndash some exclusively for synchronizationeg events mutexes (ldquomutantsrdquo) semaphores queues timers
ndash others can be waited for as a side effect of their prime function eg processes threads file objects
ndash non-waitable kernel objects are called ldquocontrol objectsrdquo
All dispatcher objects have a common header
All dispatcher objects are in one of two states
ndash ldquosignaledrdquo vs ldquononsignaledrdquo
ndash when signalled a wait on the object is satisfied
ndash different object types differ in terms of what changes their state
ndash wait and unwait implementation iscommon to all types of dispatcher objects
7878
Object-type-specific data
Wait Internals 2Wait Blocks
Size TypeState
Wait listhead
Size TypeState
Wait listhead
Represent a threadrsquos reference to something itrsquos waiting for (one per handle passed to WaitForhellip)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for ldquoanyrdquo or ldquoallrdquo Key denotes argument list position for
WaitForMultipleObjects
Object-type-specific data
DispatcherObjects
Thread Objects
WaitBlockListWaitBlockList
Wait blocks
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
7979
34 Windows APIs for Synchronization and IPC Windows API constructs for synchronization and interprocess communication
Synchronization
ndash Critical sections
ndash Mutexes
ndash Semaphores
ndash Event objects
Synchronization through interprocess communication
ndash Anonymous pipes
ndash Named pipes
ndash Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec ) VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec )
8181
Critical Section Example
counter is global shared by all threads
volatile int counter = 0
CRITICAL_SECTION crit
InitializeCriticalSection ( ampcrit )
hellip main loop in any of the threads
while (done)
_try
EnterCriticalSection ( ampcrit )
counter += local_value
LeaveCriticalSection ( ampcrit )
_finally LeaveCriticalSection ( ampcrit )
DeleteCriticalSection( ampcrit )
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
ndash Processes
ndash Threads
ndash Files
ndash Console input
File change notificationsFile change notifications
MutexesMutexes
Events (auto-reset + manual-reset)Events (auto-reset + manual-reset)
Waitable timersWaitable timers
DWORD WaitForSingleObject( HANDLE hObject DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects LPHANDLE lpHandles BOOL bWaitAll DWORD dwTimeout )
8383
Wait Functions - Details
WaitForSingleObject()ndash hObject specifies kernel object
ndash dwTimeout specifies wait time in msecdwTimeout == 0 - no wait check whether object is signaled
dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()ndash cObjects lt= MAXIMUM_WAIT_OBJECTS (64)
ndash lpHandles - pointer to array identifying these objects
ndash bWaitAll - whether to wait for first signaled object or all objectsFunction returns index of first signaled object
Side effectsndash Mutexes auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
HANDLE OpenMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
BOOL ReleaseMutex( HANDLE hMutex )
8686
Mutex Example
counter is global shared by all threads
volatile int done counter = 0
HANDLE mutex = CreateMutex( NULL FALSE NULL )
main loop in any of the threads ret is local
DWORD ret
while (done)
ret = WaitForSingleObject( mutex INFINITE )
if (ret == WAIT_OBJECT_0)
counter += local_value
else mutex was abandoned
break exit the loop
ReleaseMutex( mutex )
CloseHandle( mutex )
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexesndash Synchronization among threads in same process
Five basic functionsndash pthread_mutex_init()
ndash pthread_mutex_destroy()
ndash pthread_mutex_lock()
ndash pthread_mutex_unlock()
ndash pthread_mutex_trylock()
Comparisonndash pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex )
ndash pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
8888
Semaphores
Semaphore objects are used for resource countingndash A semaphore is signaled when count gt 0
Threadsprocesses use wait functionsndash Each wait function decreases semaphore count by 1
ndash ReleaseSemaphore() may increment count by any value
ndash ReleaseSemaphore() returns old semaphore count
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE OpenSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE ReleaseSemaphore( HANDLE hSemaphoreLONG cReleaseCount LPLONG lpPreviousCount )
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)ndash Manual-reset event can signal several thread simultaneously must be reset manually
ndash PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
ndash Auto-reset event signals a single thread event is reset automatically
ndash fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTE lpsaBOOL fManualReset BOOL fInititalStateLPTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )BOOL ResetEvent( HANDLE hEvent )BOOL PulseEvent( HANDLE hEvent )
9292
Comparison - POSIX condition variables
pthreadrsquos condition variables are comparable to eventsndash pthread_cond_init()
ndash pthread_cond_destroy()
Wait functionsndash pthread_cond_wait()
ndash pthread_cond_timedwait()
Signalingndash pthread_cond_signal() - one thread
ndash pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phReadPHANDLE phWriteLPSECURITY_ATTRIBUTES lpsaDWORD cbPipe )
main
prog1 prog2pipe
Half-duplex character-based IPC
cbPipe pipe byte size zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are oneway
9494
IO Redirection using an Anonymous Pipe
Create default size anonymous pipe handles are inheritable
if (CreatePipe (amphReadPipe amphWritePipe ampPipeSA 0))
fprintf(stderr ldquoAnon pipe create failednrdquo) exit(1)
Set output handle to pipe handle create first processes
StartInfoCh1hStdInput = GetStdHandle (STD_INPUT_HANDLE)
StartInfoCh1hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh1hStdOutput = hWritePipe
StartInfoCh1dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)Command1 NULL NULL TRUE 0 NULL NULL ampStartInfoCh1 ampProcInfo1))
fprintf(stderr ldquoCreateProc1 failednrdquo) exit(2)
CloseHandle (hWritePipe)
95
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
96
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same pipe name, each over its own instance
- Server can respond to a client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
97
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (the "." stands for the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
98
Named Pipes (contd.)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
99
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence into one call
101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle into one call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe() to free the handle for a connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103
Comparison with UNIX
UNIX FIFOs are similar to named pipes
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic (for writes, up to PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
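For comparison with the named pipe examples, here is a minimal POSIX sketch of the FIFO side: one process creates the FIFO with mkfifo(), a child writes a single fixed-size record, and the parent reads it back. The function name fifo_round_trip, the RECORD_SIZE constant and the caller-supplied path are assumptions made for illustration.

```c
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RECORD_SIZE 32   /* fixed-size records: a FIFO carries bytes, not messages */

/* Creates a FIFO at fifo_path, lets a child process write one fixed-size
 * record into it, and reads that record back in the parent.
 * Returns 0 on success, -1 on failure. */
int fifo_round_trip(const char *fifo_path, const char *msg, char reply[RECORD_SIZE])
{
    pid_t   pid;
    int     fd, status = 0;
    ssize_t n;

    unlink(fifo_path);                         /* drop a stale FIFO, if any */
    if (mkfifo(fifo_path, 0600) != 0)          /* counterpart of CreateNamedPipe() */
        return -1;

    pid = fork();
    if (pid < 0) {
        unlink(fifo_path);
        return -1;
    }
    if (pid == 0) {                            /* child: the writing side */
        char record[RECORD_SIZE] = {0};
        strncpy(record, msg, RECORD_SIZE - 1);
        fd = open(fifo_path, O_WRONLY);        /* blocks until a reader opens */
        if (fd < 0)
            _exit(1);
        n = write(fd, record, RECORD_SIZE);    /* well below PIPE_BUF, so atomic */
        close(fd);
        _exit(n == RECORD_SIZE ? 0 : 1);
    }

    fd = open(fifo_path, O_RDONLY);            /* parent: the reading side */
    if (fd < 0) {
        kill(pid, SIGKILL);
        waitpid(pid, &status, 0);
        unlink(fifo_path);
        return -1;
    }
    n = read(fd, reply, RECORD_SIZE);
    close(fd);
    waitpid(pid, &status, 0);
    unlink(fifo_path);
    return (n == RECORD_SIZE && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The FIFO is one-way here; a reply from the reader to the writer would need a second FIFO, which is the half-duplex limitation listed above.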
104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (Windows 2000: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
107
Locate a server via mailslot
[Figure: application clients 0..n act as mailslot servers; the application server acts as the mailslot client]
App client 0 ... App client n (mailslot servers):
hMS = CreateMailslot( "\\.\mailslot\status", ... );
ReadFile( hMS, &ServStat, ... );
/* connect to server */
App server (mailslot client):
while (...) {
    Sleep( ... );
    hMS = CreateFile( "\\.\mailslot\status", ... );
    ...
    WriteFile( hMS, &StatInfo, ... );
}
Message is sent periodically
108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
109
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life (意念改变生活)
60
What signals an object
Dispatcher object | System event: nonsignaled -> signaled | System event: signaled -> nonsignaled | Effect of signaled state on waiting threads
Mutex (kernel mode) | Owning thread releases mutex | Resumed thread acquires mutex | Kernel resumes one waiting thread
Mutex (exported to user mode) | Owning thread or other thread releases mutex | Resumed thread acquires mutex | Kernel resumes one waiting thread
Semaphore | One thread releases the semaphore, freeing a resource | A thread acquires the semaphore; more resources are not available | Kernel resumes one or more waiting threads
61
What signals an object (contd.)
Dispatcher object | System event: nonsignaled -> signaled | System event: signaled -> nonsignaled | Effect of signaled state on waiting threads
Event | A thread sets the event | Kernel resumes one or more threads | Kernel resumes one or more waiting threads
Event pair | Dedicated thread sets one event in the event pair | Kernel resumes the other dedicated thread | Kernel resumes waiting dedicated thread
Timer | Timer expires | A thread (re)initializes the timer | Kernel resumes all waiting threads
Thread | Thread terminates | A thread reinitializes the thread object | Kernel resumes all waiting threads
62
Further Reading
Mark E. Russinovich and David A. Solomon, Microsoft Windows Internals, 4th Edition, Microsoft Press, 2004.
Chapter 3 - System Mechanisms
- Trap Dispatching (pp. 85 ff.)
- Synchronization (pp. 149 ff.)
- Kernel Event Tracing (pp. 175 ff.)
63
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
64
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
65
Deferred Procedure Calls (DPCs)
[Figure: DPC queue: queue head -> DPC object -> DPC object -> DPC object]
66
Delivering a DPC
[Figure: interrupt dispatch table with IRQLs from high to low (High, Power failure, ..., Dispatch/DPC, APC, Low), the DPC queue, and the thread dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
67
Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
68
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (environment subsystems)
APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
69
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless the thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented
70
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when the thread is in an "alertable wait"
71
Asynchronous Procedure Calls (APCs)
[Figure: thread object with two APC queues, kernel mode (K) and user mode (U), each a list of APC objects]
72
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
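The four accounting rules can be written down as one decision function. This is a user-space model for illustration only, not kernel code; the type charge_t, the function classify_tick and its parameters are hypothetical names.

```c
/* Where one clock tick is charged, following the rules on the slide. */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/* irql:      IRQL the processor was at when the clock interrupt arrived
 * in_dpc:    nonzero if a DPC routine was being processed
 * user_mode: nonzero if the interrupted thread was executing in user mode */
charge_t classify_tick(int irql, int in_dpc, int user_mode)
{
    if (irql > 2)
        return CHARGE_INTERRUPT;                    /* servicing a device interrupt */
    if (irql == 2)
        return in_dpc ? CHARGE_DPC                  /* DPC time */
                      : CHARGE_THREAD_KERNEL;       /* dispatch level, no DPC */
    return user_mode ? CHARGE_THREAD_USER           /* IRQL 0 or 1 */
                     : CHARGE_THREAD_KERNEL;
}
```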
73
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
74
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs
75
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
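The quirk follows directly from sampling: whichever thread is running when the clock fires is charged the whole interval. A small simulation makes this concrete. It is a simplified model, not how the kernel stores anything; run_t, usec_really_run and usec_charged are illustrative names, and the 10 msec tick is the value from the slide.

```c
#include <stddef.h>

#define TICK_USEC 10000L   /* 10 msec clock interval */

/* One uninterrupted run of a thread on the CPU. */
typedef struct {
    int  thread_id;
    long start_usec;   /* start of the run (inclusive) */
    long end_usec;     /* end of the run (exclusive) */
} run_t;

/* CPU time the thread really used. */
long usec_really_run(const run_t *runs, size_t n, int thread_id)
{
    long   total = 0;
    size_t i;
    for (i = 0; i < n; i++)
        if (runs[i].thread_id == thread_id)
            total += runs[i].end_usec - runs[i].start_usec;
    return total;
}

/* CPU time the thread is charged: one whole tick for every clock interrupt
 * that fires while the thread happens to be running. */
long usec_charged(const run_t *runs, size_t n, int thread_id, long total_usec)
{
    long   charged = 0, t;
    size_t i;
    for (t = TICK_USEC; t < total_usec; t += TICK_USEC)
        for (i = 0; i < n; i++)
            if (runs[i].thread_id == thread_id &&
                runs[i].start_usec <= t && t < runs[i].end_usec)
                charged += TICK_USEC;
    return charged;
}
```

With a schedule in which thread 0 runs from 1 to 9 msec and from 11 to 19 msec, and thread 1 runs only around each tick (9.5 to 10.5 msec and 19.5 to 20.5 msec), thread 0 uses 16 msec of CPU and is charged nothing, while thread 1 uses 2 msec and is charged 20 msec.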
76
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
77
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout: common header (Size, Type, State, Wait list head) followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
78
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (header: Size, Type, State, Wait list head; object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type, Next link]
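The linkage between dispatcher objects, wait blocks and threads can be modeled in a few lines of C. This is a deliberately simplified sketch and not the kernel's actual structure definitions; all type, field and function names below are invented for illustration, and the caller supplies the storage for the wait blocks.

```c
#include <stddef.h>

typedef enum { WB_WAIT_ANY, WB_WAIT_ALL } wb_type_t;

struct wait_block;

/* Common header shared by every waitable object (reduced to two fields). */
typedef struct dispatcher_object {
    int                signaled;    /* state: 1 = signaled, 0 = nonsignaled */
    struct wait_block *wait_list;   /* wait blocks of all threads waiting on it */
} dispatcher_object_t;

typedef struct thread_object {
    struct wait_block *wait_block_list;   /* wait blocks of the current wait call */
} thread_object_t;

typedef struct wait_block {
    struct wait_block   *next_on_object;  /* "list entry": next waiter on the object */
    struct wait_block   *next_of_thread;  /* "next link": next block of this wait call */
    dispatcher_object_t *object;
    thread_object_t     *thread;
    int                  key;             /* position in the handle array */
    wb_type_t            type;            /* wait for any or for all */
} wait_block_t;

/* Sets up one wait block per object and chains them to the thread and to the objects. */
void begin_wait(thread_object_t *thr, dispatcher_object_t **objs,
                wait_block_t *blocks, int count, wb_type_t type)
{
    int i;
    thr->wait_block_list = NULL;
    for (i = count - 1; i >= 0; i--) {    /* push front, so the list ends up in key order */
        blocks[i].object = objs[i];
        blocks[i].thread = thr;
        blocks[i].key    = i;
        blocks[i].type   = type;
        blocks[i].next_on_object = objs[i]->wait_list;
        objs[i]->wait_list       = &blocks[i];
        blocks[i].next_of_thread = thr->wait_block_list;
        thr->wait_block_list     = &blocks[i];
    }
}

/* Wait-any: returns the key of the first signaled object, or -1 if none.
 * Wait-all: returns 0 once every object is signaled, otherwise -1. */
int wait_satisfied(const thread_object_t *thr)
{
    const wait_block_t *wb = thr->wait_block_list;
    if (wb == NULL)
        return -1;
    if (wb->type == WB_WAIT_ANY) {
        for (; wb != NULL; wb = wb->next_of_thread)
            if (wb->object->signaled)
                return wb->key;
        return -1;
    }
    for (; wb != NULL; wb = wb->next_of_thread)
        if (!wb->object->signaled)
            return -1;
    return 0;
}
```

The key is what lets a wait-any call report which array position was satisfied, mirroring the return value of WaitForMultipleObjects.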
79
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
80
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section, other than trying to enter it with TryEnterCriticalSection()
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
81
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* runs exactly once per Enter, even if the guarded code raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
82
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
83
Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
- dwTimeout == 0: no wait, check whether the object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to an array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects
- When waiting for the first signaled object, the function returns WAIT_OBJECT_0 plus the index of that object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
84
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free the mutex object once its last handle is closed
85
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
86
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
87
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
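The POSIX counterpart of the Windows mutex example might look like the sketch below: several threads add to a shared counter under pthread_mutex_lock(), and a second function shows the polling behavior of pthread_mutex_trylock(). The names worker, run_counter_demo and trylock_demo, as well as the thread and iteration counts, are chosen for illustration.

```c
#include <errno.h>
#include <pthread.h>

#define NUM_THREADS 4
#define INCREMENTS  10000

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;   /* shared by all threads, guarded by counter_lock */

static void *worker(void *arg)
{
    int i;
    (void)arg;
    for (i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);     /* blocks, like an INFINITE wait */
        counter++;
        pthread_mutex_unlock(&counter_lock);   /* like ReleaseMutex() */
    }
    return NULL;
}

/* Runs NUM_THREADS workers and returns the final counter value. */
long run_counter_demo(void)
{
    pthread_t tid[NUM_THREADS];
    int i;

    counter = 0;
    for (i = 0; i < NUM_THREADS; i++)
        pthread_create(&tid[i], NULL, worker, NULL);
    for (i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);
    return counter;
}

/* Returns 1 if trylock reports EBUSY on a held mutex and succeeds on a free one. */
int trylock_demo(void)
{
    int busy, free_ok;

    pthread_mutex_lock(&counter_lock);
    busy = (pthread_mutex_trylock(&counter_lock) == EBUSY);   /* never blocks */
    pthread_mutex_unlock(&counter_lock);

    free_ok = (pthread_mutex_trylock(&counter_lock) == 0);
    if (free_ok)
        pthread_mutex_unlock(&counter_lock);
    return busy && free_ok;
}
```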
88
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value, up to the maximum count
- ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
89
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
90
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event (Microsoft's documentation calls PulseEvent() unreliable and advises against using it)
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state
91
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
6161
A thread reinitializesthe thread object
What signals an object (contd)
Dispatcher object System events and resultingstate change
Effect of signaled stateon waiting threads
nonsignaled signaled
A thread sets the event
Kernel resumes one or more threads
Kernel resumes one or more waiting threads
Event
nonsignaled signaled
Dedicated thread sets oneevent in the event pair
Kernel resumes theother dedicated thread
Kernel resumes waitingdedicated thread
Event pair
nonsignaled signaled
Timer expires
A thread (re) initializes the timer
Kernel resumes all waiting threads
Timer
nonsignaled signaled
Thread terminates
Kernel resumes all waiting threads
Thread
6262
Further Reading
Mark E Russinovich and David A Solomon Microsoft Windows Internals 4th Edition Microsoft Press 2004
Chapter 3 - System Mechanisms
ndash Trap Dispatching (pp 85 ff)
ndash Synchronization (pp 149 ff)
ndash Kernel Event Tracing (pp 175 ff)
6363
33 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues amp Dispatcher Objects
6464
Used to defer processing from higher (device) interrupt level to a lower (dispatch) levelndash Also used for quantum end and timer expiration
Driver (usually ISR) queues requestndash One queue per CPU DPCs are normally queued to the current processor but can be targeted to other CPUs
ndash Executes specified procedure at dispatch IRQL (or ldquodispatch levelrdquo also ldquoDPC levelrdquo) when all higher-IRQL work (interrupts) completed
ndash Maximum times recommended ISR 10 usec DPC 25 usec
See httpwwwmicrosoftcomwhdcdriverperformmmdrvmspx
Deferred Procedure Calls (DPCs)
6565
queue head DPC object DPC object DPC object
Deferred Procedure Calls (DPCs)
6666
DPC
Delivering a DPC
DPC routines can call kernel functionsbut canlsquot call system services generatepage faults or create or wait on objects
DPC routines canlsquotassume whatprocess addressspace is currentlymapped
Interruptdispatch table
high
Power failure
DispatchDPC
APC
Low
DPC
1 Timer expires kernelqueues DPC that willrelease all waiting threadsKernel requests SW int
DPCDPC
DPC queue
2 DPC interrupt occurswhen IRQL drops belowdispatchDPC level
dispatcher
3 After DPC interruptcontrol transfers tothread dispatcher
4 Dispatcher executes each DPCroutine in DPC queue
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user threadndash APC routines can acquire resources (objects) incur page faultscall system services
APC queue is thread-specific
User mode amp kernel mode APCsndash Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread spacendash Wait for asynchronous IO operation
ndash Emulate delivery of POSIX signals
ndash Make threads suspendterminate itself (env subsystems)
APCs are delivered when thread is in alertable wait statendash WaitForMultipleObjectsEx() SleepEx()
6969
Special kernel APCs
ndash Run in kernel mode at IRQL 1
ndash Always deliverable unless thread is already at IRQL 1 or above
ndash Used for IO completion reporting from ldquoarbitrary thread contextrdquo
ndash Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two APC queues, kernel mode (K) and user mode (U), each linking APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
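The four accounting rules amount to a small decision function. The sketch below is only an illustration: `charge_bucket` and `charge_tick` are hypothetical names, and the assumption that the clock ISR knows the interrupted IRQL, whether a DPC was active, and the processor mode is taken from the rules as stated on the slide.

```c
#include <assert.h>

/* Hypothetical names; the rules are the four listed on the slide. */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_bucket;

/* Decide which counter one clock tick is charged to.
 * irql:        IRQL that was interrupted by the clock ISR
 * in_dpc:      nonzero if a DPC routine was being processed
 * kernel_mode: nonzero if the thread was executing in kernel mode */
static charge_bucket charge_tick(int irql, int in_dpc, int kernel_mode)
{
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return kernel_mode ? CHARGE_THREAD_KERNEL : CHARGE_THREAD_USER;
}
```

Because the decision is made once per tick, whatever happens to be running when the clock fires is charged for the whole interval.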
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the load must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout. Common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (Size, Type, State, Wait list head, object-type-specific data) and thread objects (WaitBlockList) linked through wait blocks; each wait block holds List entry, Object, Thread, Key, Type, and Next link]
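The relation between dispatcher objects, wait blocks, and the "any"/"all" type can be sketched with simplified structures. The layouts and names below (`toy_dispatcher_object`, `toy_wait_block`, `toy_wait_satisfied`) are hypothetical and far smaller than the real definitions in ntddk.h; the sketch assumes one chain of wait blocks per wait call, as described above.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified, hypothetical layouts; the real structures are in ntddk.h. */
typedef enum { TOY_WAIT_ANY, TOY_WAIT_ALL } toy_wait_type;

typedef struct {
    int signaled;                    /* State field of the common header */
} toy_dispatcher_object;

typedef struct toy_wait_block {
    toy_dispatcher_object *object;   /* what the thread waits for */
    unsigned key;                    /* argument list position */
    toy_wait_type type;              /* wait for "any" or "all" */
    struct toy_wait_block *next;     /* next block of the same wait call */
} toy_wait_block;

/* Walk the chain of wait blocks of one wait call.
 * Returns the key of the first signaled object for a wait-any,
 * 0 for a satisfied wait-all, and -1 if the wait is not satisfied. */
static int toy_wait_satisfied(const toy_wait_block *list)
{
    assert(list != NULL);
    toy_wait_type type = list->type;
    for (const toy_wait_block *wb = list; wb != NULL; wb = wb->next) {
        if (type == TOY_WAIT_ANY && wb->object->signaled)
            return (int)wb->key;
        if (type == TOY_WAIT_ALL && !wb->object->signaled)
            return -1;
    }
    return type == TOY_WAIT_ALL ? 0 : -1;
}
```

The key stored in each block is what lets a wait-any report which position in the handle array was satisfied.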
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section without trying to enter it (TryEnterCriticalSection)
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* runs exactly once per Enter, even if the guarded block raises */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
– dwTimeout == 0 - no wait, check whether object is signaled
– dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles - pointer to array identifying these objects
– bWaitAll - whether to wait for first signaled object or all objects
– When bWaitAll is FALSE, the return value minus WAIT_OBJECT_0 is the index of the first signaled object
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
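Decoding the return value of a wait-any call is a common source of mistakes, so a small helper may be useful. This is a sketch, not part of the Win32 API: `decode_wait_any` is a hypothetical name, and the fallback constants exist only so the code also compiles off Windows (they mirror the values documented for the Windows headers).

```c
#include <assert.h>
#include <stddef.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Fallback definitions so the sketch also compiles off Windows;
 * the values mirror those documented for the Windows headers. */
#include <stdint.h>
typedef uint32_t DWORD;
#define WAIT_OBJECT_0    0x00000000u
#define WAIT_ABANDONED_0 0x00000080u
#define WAIT_TIMEOUT     0x00000102u
#define WAIT_FAILED      0xFFFFFFFFu
#endif

/* Hypothetical helper: classify the result of
 * WaitForMultipleObjects(count, handles, FALSE, timeout).
 * Returns the array index of the signaled (or abandoned) object,
 * -1 on timeout, -2 on failure. *abandoned is set to 1 when the
 * wait ended because a mutex was abandoned by its owner. */
static int decode_wait_any(DWORD ret, DWORD count, int *abandoned)
{
    assert(abandoned != NULL);
    *abandoned = 0;
    if (ret == WAIT_TIMEOUT)
        return -1;
    if (ret == WAIT_FAILED)
        return -2;
    if (ret < WAIT_OBJECT_0 + count)
        return (int)(ret - WAIT_OBJECT_0);
    if (ret >= WAIT_ABANDONED_0 && ret < WAIT_ABANDONED_0 + count) {
        *abandoned = 1;
        return (int)(ret - WAIT_ABANDONED_0);
    }
    return -2;
}
```

Since the handle count is limited to 64, the signaled range and the abandoned range (starting at 0x80) cannot overlap.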
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, a second thread (process) calls CreateMutex() or OpenMutex() with the same name
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() releases the handle; the mutex object is freed when its last handle is closed
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
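The blocking and polling variants can be compared side by side in a short pthreads sketch. The names `counter_lock`, `add_blocking` and `add_if_free` are made up for this illustration; the shared counter mirrors the one in the Windows mutex example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;

/* Blocking update: pthread_mutex_lock() waits until the mutex is free,
 * like WaitForSingleObject(hMutex, INFINITE). */
static void add_blocking(int value)
{
    int rc = pthread_mutex_lock(&counter_lock);
    assert(rc == 0);
    (void)rc;
    counter += value;
    pthread_mutex_unlock(&counter_lock);
}

/* Polling update: pthread_mutex_trylock() returns EBUSY instead of
 * waiting, like WaitForSingleObject(hMutex, 0) returning WAIT_TIMEOUT.
 * Returns 1 if the counter was updated, 0 if the mutex was busy. */
static int add_if_free(int value)
{
    int rc = pthread_mutex_trylock(&counter_lock);
    if (rc == EBUSY)
        return 0;
    assert(rc == 0);
    counter += value;
    pthread_mutex_unlock(&counter_lock);
    return 1;
}
```

One difference to keep in mind: a Windows mutex can be re-acquired by the thread that already owns it, whereas a default pthread mutex is not recursive, so `add_if_free()` reports "busy" even when the caller itself holds the lock.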
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any positive value, up to the maximum count
– ReleaseSemaphore() returns old semaphore count (through lpPreviousCount)
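The counting rules can be tried out without Windows by emulating them on top of pthreads. The type `toy_semaphore` and its functions are hypothetical and are not the Win32 API; the sketch assumes the rules listed above (signaled while count > 0, each wait takes one unit, a release adds several units and reports the previous count, and a release that would exceed the maximum fails without changing the count).

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical emulation of the counting rules; not the Win32 API. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonzero;
    long count;
    long max;
} toy_semaphore;

static void toy_sem_init(toy_semaphore *s, long initial, long max)
{
    assert(s != NULL && max > 0 && initial >= 0 && initial <= max);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = initial;
    s->max = max;
}

/* Wait: block while count == 0, then decrease the count by 1. */
static void toy_sem_wait(toy_semaphore *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

/* Release: add n to the count and report the previous count.
 * Returns 0 and changes nothing if n <= 0 or the maximum would be exceeded. */
static int toy_sem_release(toy_semaphore *s, long n, long *previous)
{
    int ok = 0;
    pthread_mutex_lock(&s->lock);
    if (n > 0 && n <= s->max - s->count) {
        if (previous != NULL)
            *previous = s->count;
        s->count += n;
        pthread_cond_broadcast(&s->nonzero);
        ok = 1;
    }
    pthread_mutex_unlock(&s->lock);
    return ok;
}
```

With an initial count of 2 and a maximum of 3, two waits succeed at once, a release of 2 restores the count, and a further release of 2 is refused.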
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event (Microsoft documents PulseEvent() as unreliable; avoid it in new code)
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE - create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal() - one thread
– pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
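Because a condition variable keeps no state of its own, a manual-reset event has to be emulated with an explicit flag guarded by a mutex. The sketch below shows one common way to do this; `toy_event` and its functions are hypothetical names, and the emulation covers only the set/reset/wait behavior described on the Events slide.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical emulation of a manual-reset event with pthreads. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int signaled;              /* stays set until toy_event_reset() */
} toy_event;

static void toy_event_init(toy_event *e, int initial_state)
{
    assert(e != NULL);
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state != 0;
}

/* Like SetEvent(): release every waiter and stay signaled. */
static void toy_event_set(toy_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 1;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(): later waiters block again. */
static void toy_event_reset(toy_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 0;
    pthread_mutex_unlock(&e->lock);
}

/* Like WaitForSingleObject(hEvent, INFINITE) on a manual-reset event. */
static void toy_event_wait(toy_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)       /* the loop also absorbs spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}
```

The flag is what makes the event "sticky": a thread that starts waiting after `toy_event_set()` still passes, which a bare `pthread_cond_broadcast()` would not guarantee.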
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable. */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process. */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process. */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles. */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete. */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server, each through its own instance of the same named pipe
– Server can respond to client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive – is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
– Combines a WriteFile, ReadFile sequence
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
– Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic (for writes of at most PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
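For comparison, the FIFO side can be shown in a few lines of POSIX C. This is a sketch under stated assumptions: `fifo_roundtrip` and `RECORD_SIZE` are names invented here, both ends of the FIFO are opened in one process only to keep the example short, and the record size of 32 bytes is arbitrary. It uses a fixed-size record, as recommended above for a byte-oriented channel.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_SIZE 32   /* fixed-size records, as suggested above */

/* Create a FIFO at `path`, push one fixed-size record through it and read
 * it back. Both ends live in one process only to keep the sketch short.
 * Returns 0 on success, -1 on error. `out` must hold RECORD_SIZE bytes. */
static int fifo_roundtrip(const char *path, const char *msg, char *out)
{
    char record[RECORD_SIZE] = {0};
    int rfd, wfd, rc = -1;

    assert(path != NULL && msg != NULL && out != NULL);
    strncpy(record, msg, RECORD_SIZE - 1);   /* record stays NUL-terminated */

    unlink(path);                            /* remove a stale FIFO, if any */
    if (mkfifo(path, 0600) != 0)
        return -1;

    /* Open the read end non-blocking so open() does not wait for a writer. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd < 0) {
        unlink(path);
        return -1;
    }
    wfd = open(path, O_WRONLY);
    if (wfd >= 0) {
        /* A write of at most PIPE_BUF bytes is atomic. */
        if (write(wfd, record, RECORD_SIZE) == RECORD_SIZE &&
            read(rfd, out, RECORD_SIZE) == RECORD_SIZE)
            rc = 0;
        close(wfd);
    }
    close(rfd);
    unlink(path);
    return rc;
}
```

In a real client/server setup the two ends would be opened by different processes, and the server would need one additional FIFO per client for its responses.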
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server.\n"); exit(3);
}
/* Write the request. */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out. */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client. */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation. */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile() – open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
Locate a server via mailslot
[Figure: the application server acts as mailslot client; application clients 0 … n act as mailslot servers]
Mailslot servers (app client 0 … app client n)
hMS = CreateMailslot( "\\.\mailslot\status" );
ReadFile( hMS, &ServStat ); /* connect to server */
Mailslot client (app server)
while (...) {
    Sleep( ... );
    hMS = CreateFile( "\\*\mailslot\status" );
    WriteFile( hMS, &StatInfo );
}
Message is sent periodically
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0 – immediate return
– MAILSLOT_WAIT_FOREVER – infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name - retrieve handle for local mailslot
– \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max. message length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName NMPWAIT_WAIT_FOREVER)
hNamedPipe = CreateFile (ServerPipeName GENERIC_READ | GENERIC_WRITE
0 NULL OPEN_EXISTING FILE_ATTRIBUTE_NORMAL NULL)
if (hNamedPipe == INVALID_HANDLE_VALUE)
fptinf(stderr Failure to locate servern) exit(3)
Write the request
WriteFile (hNamedPipe ampRequest MAX_RQRS_LEN ampnWrite NULL)
Read each response and send it to std out
while (ReadFile (hNamedPipe ResponseRecord MAX_RQRS_LEN ampnRead NULL))
printf (s ResponseRecord)
CloseHandle (hNamedPipe)
return 0
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE PIPE_ACCESS_DUPLEX
PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT
1 0 0 CS_TIMEOUT pNPSA)
while (Done)
printf (Server is awaiting next requestn)
if (ConnectNamedPipe (hNamedPipe NULL)
|| ReadFile (hNamedPipe ampRequest RQ_SIZE ampnXfer NULL))
fprintf(stderr ldquoConnect or Read Named Pipe errornrdquo) exit(4)
printf( ldquoRequest is sn RequestRecord)
Send the file one line at a time to the client
fp = fopen (File r)
while ((fgets (ResponseRecord MAX_RQRS_LEN fp) = NULL))
WriteFile (hNamedPipe ampResponseRecord
(strlen(ResponseRecord) + 1) TSIZE ampnXfer NULL)
fclose (fp)
DisconnectNamedPipe (hNamedPipe)
End of server operation
106106
Win32 IPC - MailslotsMailslots bear some nasty implementation detailsthey are almost never used
Broadcast mechanism
ndash One-directional
ndash Mutliple writersmultiple readers (frequently one-to-many comm)
ndash Message delivery is unreliable
ndash Can be located over a network domain
ndash Message lengths are limited (w2k lt 426 byte) Operations on the mailslot
ndash Each reader (server) creates mailslot with CreateMailslot()
ndash Write-only client opens mailslot with CreateFile() and uses WriteFile() ndash open will fail if there are no waiting readers
ndash Clientlsquos message can be read by all servers (readers) Client lookup mailslotmailslotname
ndash Client will connect to every server in network domain
107107
Locate a server via mailslot
hMS = CreateMailslot( ldquomailslotstatusldquo)ReadFile(hMS ampServStat) connect to server
hMS = CreateMailslot( ldquomailslotstatusldquo)ReadFile(hMS ampServStat) connect to server
App client 0
App client n
Mailslot Servers
While () Sleep() hMS = CreateFile( ldquomailslotstatusldquo)
WriteFile(hMS ampStatInfo
App Server
Mailslot Client
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot(LPCTSTR lpszNameDWORD cbMaxMsgDWORD dwReadTimeoutLPSECURITY_ATTRIBUTES lpsa)
lpszName points to a name of the formndash mailslot[path]name
ndash Name must be unique mailslot is created locally
cbMaxMsg is msg size in byte
dwReadTimeout ndash Read operation will wait for so many msec
ndash 0 ndash immediate return
ndash MAILSLOT_WAIT_FOREVER ndash infinite wait
109109
Opening a mailslot
CreateFile with the following namesndash mailslot[path]name - retrieve handle for local mailslot
ndash hostmailslot[path]name - retrieve handlefor mailslot on specified host
ndash domainmailslot[path]name - returns handle representing all mailslots on machines in the domain
ndash mailslot[path]name - returns handle representing mailslots on machines in the systemlsquos primary domain max mesg len 400 bytes
ndash Client must specifiy FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
3.3 Advanced Windows Synchronization
Deferred and Asynchronous Procedure Calls
IRQLs and CPU Time Accounting
Wait Queues & Dispatcher Objects
Deferred Procedure Calls (DPCs)
Used to defer processing from higher (device) interrupt level to a lower (dispatch) level
- Also used for quantum end and timer expiration
Driver (usually ISR) queues request
- One queue per CPU; DPCs are normally queued to the current processor, but can be targeted to other CPUs
- Executes specified procedure at dispatch IRQL (or "dispatch level", also "DPC level") when all higher-IRQL work (interrupts) has completed
- Maximum times recommended: ISR 10 usec, DPC 25 usec
See http://www.microsoft.com/whdc/driver/perform/mmdrv.mspx
Deferred Procedure Calls (DPCs)
[Figure: DPC queue; a queue head linked to a chain of DPC objects]
Delivering a DPC
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table ordered from high to low IRQL (high, power failure, dispatch/DPC, APC, low), with the DPC queue feeding the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads. Kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue
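The four delivery steps can be mimicked with a small user-mode model. This is only a sketch: the names `dpc_object`, `dpc_queue`, `dpc_enqueue`, `dpc_drain` and `count_dpc` are hypothetical and are not the kernel's own types or routines. It shows nothing more than a FIFO queue of deferred routines that is drained once the IRQL has dropped below dispatch/DPC level.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical user-mode model of a per-CPU DPC queue (not kernel code). */
typedef void (*dpc_routine)(void *context);

typedef struct dpc_object {
    dpc_routine routine;       /* deferred procedure to run later */
    void *context;             /* argument handed to the routine */
    struct dpc_object *next;   /* next DPC object in the queue */
} dpc_object;

typedef struct {
    dpc_object *head;
    dpc_object *tail;
} dpc_queue;

/* Step 1: an ISR (or the kernel on timer expiry) appends a DPC object. */
void dpc_enqueue(dpc_queue *q, dpc_object *dpc) {
    assert(dpc != NULL && dpc->routine != NULL);
    dpc->next = NULL;
    if (q->tail != NULL)
        q->tail->next = dpc;
    else
        q->head = dpc;
    q->tail = dpc;
}

/* Steps 2-4: once IRQL has dropped below dispatch level, run every queued
 * routine in FIFO order. Returns the number of routines executed. */
int dpc_drain(dpc_queue *q) {
    int executed = 0;
    while (q->head != NULL) {
        dpc_object *dpc = q->head;
        q->head = dpc->next;
        if (q->head == NULL)
            q->tail = NULL;
        dpc->routine(dpc->context);
        executed++;
    }
    return executed;
}

/* Example routine: a DPC that increments the counter it is given. */
void count_dpc(void *context) {
    ++*(int *)context;
}
```

Queueing two `dpc_object` entries that share one counter and then calling `dpc_drain` runs both routines in the order they were queued and leaves the queue empty.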
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: thread object with a kernel-mode (K) and a user-mode (U) APC queue, each holding APC objects]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
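The charging rules of the clock ISR reduce to a small decision function. A minimal sketch, assuming the three inputs are known at the moment the clock interrupt arrives; `charge_bucket` and `charge_tick` are hypothetical names chosen for illustration, not Windows identifiers.

```c
#include <assert.h>

/* Buckets a clock tick can be charged to, following the rules listed above. */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_bucket;

/* Hypothetical helper: decide where one clock tick is charged.
 *   irql      - IRQL the CPU was running at when the clock interrupt arrived
 *   in_dpc    - nonzero if a DPC routine was executing
 *   user_mode - nonzero if the interrupted thread was running in user mode */
charge_bucket charge_tick(int irql, int in_dpc, int user_mode) {
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT;                          /* IRQL > 2 */
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL; /* IRQL = 2 */
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL; /* IRQL < 2 */
}
```

Only the IRQL > 2 and DPC cases are charged away from the thread; every other tick lands in the interrupted thread's user or kernel time.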
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, a system that is busy while no process appears to be running must be busy with interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (they are not really processes)
- Context switches shown for these are really the number of interrupts & DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
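The quirk is easy to reproduce in a toy model of tick-based accounting. A minimal sketch under stated assumptions: time is measured in microseconds, each tick is charged entirely to whichever thread is running at the tick instant, and the names `run_interval` and `account_by_ticks` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* One stretch of CPU time given to a thread, in microseconds: [start, end). */
typedef struct {
    int thread_id;
    long start_us;
    long end_us;
} run_interval;

/* Hypothetical model of tick-based accounting: at every clock tick the whole
 * tick is charged to the thread that happens to be running at that instant.
 * ticks_charged[] is indexed by thread_id and must hold n_threads zeroed slots. */
void account_by_ticks(const run_interval *runs, size_t n_runs,
                      long tick_us, long total_us,
                      int *ticks_charged, int n_threads) {
    assert(tick_us > 0);
    for (long t = tick_us; t <= total_us; t += tick_us) {
        for (size_t i = 0; i < n_runs; i++) {
            if (runs[i].start_us <= t && t < runs[i].end_us &&
                runs[i].thread_id >= 0 && runs[i].thread_id < n_threads) {
                ticks_charged[runs[i].thread_id]++;
                break;   /* only one thread runs on this CPU at time t */
            }
        }
    }
}
```

With a 10 msec tick, a thread that runs for 3 msec right after one tick and for 2.5 msec right after the next always enters its wait state before the clock fires, so it is never charged, while the thread that happens to be running at each tick is charged for all of them.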
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout; a header with Size, Type, State and wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects point through WaitBlockList to chains of wait blocks (fields: list entry, object, thread, key, type, next link); each wait block is also linked into the wait list head of the dispatcher object it waits on (header with Size, Type, State and wait list head, then object-type-specific data)]
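The relationship between wait blocks, keys and the "any"/"all" type can be sketched with simplified structures. These are not the kernel's definitions: `dispatcher_header`, `wait_block`, `chain_wait_blocks` and `check_wait` are hypothetical, and the header is reduced to its signaled state.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for the common dispatcher header (state only). */
typedef struct {
    int signaled;
} dispatcher_header;

typedef enum { WAIT_TYPE_ANY, WAIT_TYPE_ALL } wait_type;

/* One wait block per handle passed to the wait call. */
typedef struct wait_block {
    dispatcher_header *object;   /* object this block waits on */
    int key;                     /* position in the argument list */
    wait_type type;              /* wait for "any" or for "all" */
    struct wait_block *next;     /* next wait block of the same thread */
} wait_block;

/* Chain `count` wait blocks for one wait call; returns the list head. */
wait_block *chain_wait_blocks(wait_block *blocks, dispatcher_header **objects,
                              int count, wait_type type) {
    assert(count > 0);
    for (int i = 0; i < count; i++) {
        blocks[i].object = objects[i];
        blocks[i].key = i;
        blocks[i].type = type;
        blocks[i].next = (i + 1 < count) ? &blocks[i + 1] : NULL;
    }
    return &blocks[0];
}

/* Returns -1 while the wait is not satisfied. For a wait-any it returns the
 * key of the first signaled object; for a satisfied wait-all it returns 0. */
int check_wait(const wait_block *head) {
    if (head->type == WAIT_TYPE_ANY) {
        for (const wait_block *wb = head; wb != NULL; wb = wb->next)
            if (wb->object->signaled)
                return wb->key;
        return -1;
    }
    for (const wait_block *wb = head; wb != NULL; wb = wb->next)
        if (!wb->object->signaled)
            return -1;
    return 0;
}
```

The key is what lets a wait-any report which handle in the argument list ended the wait, which is why it is stored per wait block rather than per thread.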
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section without trying to enter it (TryEnterCriticalSection)
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
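The rule that Enter and Leave operations must match can be tried out on a POSIX system with a recursive mutex, which is the closest pthreads analogue to a re-enterable critical section. This is a stand-in, not the Windows API; `init_recursive` and `extra_leave_result` are hypothetical helper names. Unlike a Windows critical section, a recursive pthread mutex reports the unmatched leave as an error instead of silently corrupting its state.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Initialize `m` as a recursive mutex, so the owning thread may lock it
 * several times. Returns 0 on success or a pthreads error code. */
int init_recursive(pthread_mutex_t *m) {
    pthread_mutexattr_t attr;
    int rc = pthread_mutexattr_init(&attr);
    if (rc != 0)
        return rc;
    rc = pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    if (rc == 0)
        rc = pthread_mutex_init(m, &attr);
    pthread_mutexattr_destroy(&attr);
    return rc;
}

/* Enter `depth` times, leave `depth` times, then attempt one unmatched
 * leave and return its result code. */
int extra_leave_result(int depth) {
    pthread_mutex_t m;
    int rc = init_recursive(&m);
    assert(rc == 0 && depth > 0);
    for (int i = 0; i < depth; i++)
        pthread_mutex_lock(&m);      /* nested "enter" */
    for (int i = 0; i < depth; i++)
        pthread_mutex_unlock(&m);    /* matching "leave" */
    rc = pthread_mutex_unlock(&m);   /* unmatched "leave": not the owner */
    pthread_mutex_destroy(&m);
    return rc;
}
```

POSIX specifies `EPERM` for unlocking a recursive mutex that the calling thread does not own, so the extra leave is rejected rather than accepted.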
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection ( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection ( &crit );
    __try {
        counter += local_value;
    }
    __finally { LeaveCriticalSection ( &crit ); } /* leave exactly once per enter */
}
DeleteCriticalSection ( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
- dwTimeout == 0: no wait, check whether object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for first signaled object or all objects
- When waiting for any object, function returns index of first signaled object (as WAIT_OBJECT_0 + index)
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free the mutex object once its last handle is closed
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
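The POSIX side of this comparison can be shown with the shared-counter pattern used in the Windows mutex example. A minimal sketch: `add_to_counter`, `run_counter_demo` and `try_once` are hypothetical helpers, and the limit of 16 threads is an arbitrary choice for the example.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static int counter = 0;   /* shared by all threads */

/* Thread body: add 1 to the shared counter `iterations` times. */
static void *add_to_counter(void *arg) {
    int iterations = *(int *)arg;
    for (int i = 0; i < iterations; i++) {
        pthread_mutex_lock(&counter_lock);    /* blocks, like an INFINITE wait */
        counter++;
        pthread_mutex_unlock(&counter_lock);  /* gives up ownership */
    }
    return NULL;
}

/* Run n_threads workers to completion and return the final counter value. */
int run_counter_demo(int n_threads, int iterations) {
    pthread_t threads[16];
    assert(n_threads > 0 && n_threads <= 16);
    counter = 0;
    for (int i = 0; i < n_threads; i++) {
        int rc = pthread_create(&threads[i], NULL, add_to_counter, &iterations);
        assert(rc == 0);
    }
    for (int i = 0; i < n_threads; i++)
        pthread_join(threads[i], NULL);
    return counter;
}

/* Polling, like a wait with timeout 0: returns 1 if the lock was free
 * (it is taken and released again), 0 if it was busy. */
int try_once(pthread_mutex_t *m) {
    int rc = pthread_mutex_trylock(m);
    if (rc == EBUSY)
        return 0;
    assert(rc == 0);
    pthread_mutex_unlock(m);
    return 1;
}
```

Because every increment happens under the lock, four workers with 1000 iterations each always end at 4000; without the lock the total could come out lower.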
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any positive value, as long as the maximum count is not exceeded
- ReleaseSemaphore() returns old semaphore count (through lpPreviousCount)
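The counting rules can be written down as a tiny single-threaded model. It is not an implementation of the Windows semaphore object and does no blocking; `sem_model`, `model_try_wait` and `model_release` are hypothetical names that only track how the count moves.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a semaphore's bookkeeping (no blocking, no threads). */
typedef struct {
    long count;   /* current count; the semaphore is signaled while count > 0 */
    long max;     /* maximum count fixed at creation */
} sem_model;

/* Models a wait with timeout 0: if signaled, decrement and return 1;
 * otherwise leave the count alone and return 0. */
int model_try_wait(sem_model *s) {
    if (s->count > 0) {
        s->count--;
        return 1;
    }
    return 0;
}

/* Models a release: adds release_count and reports the old count through
 * `previous`. Fails (returns 0, count unchanged) if release_count is not
 * positive or the maximum would be exceeded. */
int model_release(sem_model *s, long release_count, long *previous) {
    assert(s->max > 0);
    if (release_count <= 0 || s->count + release_count > s->max)
        return 0;
    if (previous != NULL)
        *previous = s->count;
    s->count += release_count;
    return 1;
}
```

A semaphore created with count 2 and maximum 3 satisfies two waits, refuses the third, and a release of 3 from count 0 reports the old count 0 and brings it back to the maximum.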
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
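Since pthreads has no exact equivalent, a manual-reset event is usually emulated with a mutex, a condition variable and a state flag. One possible sketch follows; the `manual_event` type and its functions are hypothetical names, not part of pthreads.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical emulation of a manual-reset event on top of pthreads. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    int signaled;          /* stays set until explicitly reset */
} manual_event;

void manual_event_init(manual_event *e, int initial_state) {
    int rc = pthread_mutex_init(&e->lock, NULL);
    assert(rc == 0);
    rc = pthread_cond_init(&e->cond, NULL);
    assert(rc == 0);
    e->signaled = initial_state;
}

/* Like SetEvent on a manual-reset event: wake every waiter, stay signaled. */
void manual_event_set(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = 1;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent: later waiters block again. */
void manual_event_reset(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = 0;
    pthread_mutex_unlock(&e->lock);
}

/* Block until the event is signaled; does not consume the signal. */
void manual_event_wait(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                        /* guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

void manual_event_destroy(manual_event *e) {
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The state flag is the part a bare condition variable lacks: a thread that calls `manual_event_wait` after the event was set returns immediately, whereas a plain `pthread_cond_wait` would miss a signal sent before it started waiting.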
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf (stderr, "Anon pipe create failed\n"); exit (1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf (stderr, "CreateProc1 failed\n"); exit (2);
}
CloseHandle (hWritePipe);
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf (stderr, "CreateProc2 failed\n"); exit (3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same named pipe (one instance per client connection)
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine ("." means local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence into one call
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle into one call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic (for writes of at most PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
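The fixed-size-record advice for FIFOs can be illustrated with a short POSIX sketch. The assumptions are loud ones: a 64-byte record size, a caller-supplied FIFO path such as /tmp/demo_request_fifo, and a forked child playing the client; `read_record` and `fifo_roundtrip` are hypothetical helpers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RECORD_SIZE 64   /* fixed record length assumed for this example */

/* A FIFO is a byte stream, so keep reading until one full record arrived. */
static int read_record(int fd, char *record) {
    size_t got = 0;
    while (got < RECORD_SIZE) {
        ssize_t n = read(fd, record + got, RECORD_SIZE - got);
        if (n <= 0)
            return -1;
        got += (size_t)n;
    }
    return 0;
}

/* Create a FIFO at `path`, let a child (the client) write one fixed-size
 * record holding `text`, and read it back in the parent (the server).
 * `out` must have room for RECORD_SIZE bytes. Returns 0 on success. */
int fifo_roundtrip(const char *path, const char *text, char *out) {
    char record[RECORD_SIZE] = {0};
    assert(strlen(text) < RECORD_SIZE);
    unlink(path);                          /* remove a stale FIFO, if any */
    if (mkfifo(path, 0600) != 0)
        return -1;

    pid_t child = fork();
    if (child < 0) {
        unlink(path);
        return -1;
    }
    if (child == 0) {                      /* client: write one record */
        int wfd = open(path, O_WRONLY);    /* blocks until the reader opens */
        if (wfd < 0)
            _exit(1);
        strcpy(record, text);
        ssize_t n = write(wfd, record, RECORD_SIZE);
        close(wfd);
        _exit(n == RECORD_SIZE ? 0 : 1);
    }

    int rfd = open(path, O_RDONLY);        /* server: read one record */
    int rc = (rfd >= 0) ? read_record(rfd, out) : -1;
    if (rfd >= 0)
        close(rfd);
    int status = 0;
    waitpid(child, &status, 0);
    unlink(path);
    return (rc == 0 && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Writing whole records of a fixed size below PIPE_BUF keeps each request atomic, so several clients can share the request FIFO without their bytes interleaving; the reader still loops because a single read may return fewer bytes than a record.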
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf (stderr, "Failure to locate server\n"); exit (3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf (stderr, "Connect or Read Named Pipe error\n"); exit (4);
    }
    printf ("Request is %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen (ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many comm.)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 byte)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
Locate a server via mailslot
[Figure: App client 0 through App client n are the mailslot servers; each creates the mailslot "status" with CreateMailslot(), reads the server status with ReadFile( hMS, &ServStat ), then connects to the server. The App Server is the mailslot client; in a loop it sleeps, opens the mailslot "status" with CreateFile(), and sends &StatInfo with WriteFile(). Message is sent periodically.]
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum msg size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieve handle for local mailslot
- \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max. msg. len. 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life 意念改变生活
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
6464
Used to defer processing from higher (device) interrupt level to a lower (dispatch) levelndash Also used for quantum end and timer expiration
Driver (usually ISR) queues requestndash One queue per CPU DPCs are normally queued to the current processor but can be targeted to other CPUs
ndash Executes specified procedure at dispatch IRQL (or ldquodispatch levelrdquo also ldquoDPC levelrdquo) when all higher-IRQL work (interrupts) completed
ndash Maximum times recommended ISR 10 usec DPC 25 usec
See httpwwwmicrosoftcomwhdcdriverperformmmdrvmspx
Deferred Procedure Calls (DPCs)
6565
queue head DPC object DPC object DPC object
Deferred Procedure Calls (DPCs)
6666
DPC
Delivering a DPC
DPC routines can call kernel functionsbut canlsquot call system services generatepage faults or create or wait on objects
DPC routines canlsquotassume whatprocess addressspace is currentlymapped
Interruptdispatch table
high
Power failure
DispatchDPC
APC
Low
DPC
1 Timer expires kernelqueues DPC that willrelease all waiting threadsKernel requests SW int
DPCDPC
DPC queue
2 DPC interrupt occurswhen IRQL drops belowdispatchDPC level
dispatcher
3 After DPC interruptcontrol transfers tothread dispatcher
4 Dispatcher executes each DPCroutine in DPC queue
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user threadndash APC routines can acquire resources (objects) incur page faultscall system services
APC queue is thread-specific
User mode amp kernel mode APCsndash Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread spacendash Wait for asynchronous IO operation
ndash Emulate delivery of POSIX signals
ndash Make threads suspendterminate itself (env subsystems)
APCs are delivered when thread is in alertable wait statendash WaitForMultipleObjectsEx() SleepEx()
6969
Special kernel APCs
ndash Run in kernel mode at IRQL 1
ndash Always deliverable unless thread is already at IRQL 1 or above
ndash Used for IO completion reporting from ldquoarbitrary thread contextrdquo
ndash Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
7070
ldquoOrdinaryrdquo kernel APCs
ndash Always deliverable if at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
ndash Used for IO completion callback routines (see ReadFileEx WriteFileEx) also QueueUserApc
ndash Only deliverable when thread is in ldquoalertable waitrdquo
Asynchronous Procedure Calls (APCs)
7171
ThreadObject
K
U
APC objects
Asynchronous Procedure Calls (APCs)
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
ndash If IRQLlt2 charge to threadrsquos user or kernel time
ndash If IRQL=2 and processing a DPC charge to DPC time
ndash If IRQL=2 amp not processing a DPC charge to thread kernel time
ndash If IRQLgt2 charge to interrupt time
7373
IRQLs and CPU Time Accounting
Since time servicing interrupts are NOT charged to interrupted thread if system is busy but no process appears to be running must be due to interrupt-related activity
ndash Note time at IRQL 2 or more is charged to the current threadrsquos quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any threadprocess
ndash Process Explorer shows these as separate processesnot really processes
ndash Context switches for these are really of interrupts amp DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
ndash Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals NOT accounted
ndash Eg one or more threads run and enter a wait state before clock fires
ndash Thus threads may run but never get charged
View context switch activity with Process Explorer
ndash Add Context Switch Delta column
7676
For waiting threads user-mode utilities only display the wait reason
Example pstat
Looking at Waiting Threads
7777
Wait Internals 1 Dispatcher Objects
Size TypeState
Wait listhead
Object-type-specific data
DispatcherObject
(see ntddkincddkntddkh)
Any kernel object you can wait for is a ldquodispatcher objectrdquo
ndash some exclusively for synchronizationeg events mutexes (ldquomutantsrdquo) semaphores queues timers
ndash others can be waited for as a side effect of their prime function eg processes threads file objects
ndash non-waitable kernel objects are called ldquocontrol objectsrdquo
All dispatcher objects have a common header
All dispatcher objects are in one of two states
ndash ldquosignaledrdquo vs ldquononsignaledrdquo
ndash when signalled a wait on the object is satisfied
ndash different object types differ in terms of what changes their state
ndash wait and unwait implementation iscommon to all types of dispatcher objects
7878
Object-type-specific data
Wait Internals 2Wait Blocks
Size TypeState
Wait listhead
Size TypeState
Wait listhead
Represent a threadrsquos reference to something itrsquos waiting for (one per handle passed to WaitForhellip)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for ldquoanyrdquo or ldquoallrdquo Key denotes argument list position for
WaitForMultipleObjects
Object-type-specific data
DispatcherObjects
Thread Objects
WaitBlockListWaitBlockList
Wait blocks
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
7979
34 Windows APIs for Synchronization and IPC Windows API constructs for synchronization and interprocess communication
Synchronization
ndash Critical sections
ndash Mutexes
ndash Semaphores
ndash Event objects
Synchronization through interprocess communication
ndash Anonymous pipes
ndash Named pipes
ndash Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec ) VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec )

Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once per Enter, even if the body raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );

Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );

Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
  dwTimeout == 0 - no wait, check whether object is signaled
  dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles - pointer to array identifying these objects
- bWaitAll - whether to wait for first signaled object or all objects
  When waiting for the first signaled object, the function returns that object's index in the array (as an offset from WAIT_OBJECT_0)
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
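
The return-value convention described above can be sketched in portable C. This is only an illustration: the DEMO_ constants mirror the values that the Windows headers define for WAIT_OBJECT_0, WAIT_ABANDONED_0, WAIT_TIMEOUT and WAIT_FAILED, and the names decode_wait and wait_outcome are hypothetical, not part of the Windows API.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: these mirror the values in the Windows SDK headers. */
#define DEMO_WAIT_OBJECT_0    0x00000000u
#define DEMO_WAIT_ABANDONED_0 0x00000080u
#define DEMO_WAIT_TIMEOUT     0x00000102u
#define DEMO_WAIT_FAILED      0xFFFFFFFFu

/* Hypothetical classification of a wait result. */
enum wait_outcome {
    OUTCOME_SIGNALED,   /* *index = position of the signaled handle   */
    OUTCOME_ABANDONED,  /* *index = position of the abandoned mutex   */
    OUTCOME_TIMEOUT,    /* the time-out elapsed first                 */
    OUTCOME_FAILED      /* WAIT_FAILED or an unexpected value         */
};

/* Decode the DWORD returned by a wait on `count` handles (bWaitAll == FALSE). */
enum wait_outcome decode_wait(uint32_t ret, uint32_t count, uint32_t *index)
{
    if (ret < DEMO_WAIT_OBJECT_0 + count) {
        *index = ret - DEMO_WAIT_OBJECT_0;
        return OUTCOME_SIGNALED;
    }
    if (ret >= DEMO_WAIT_ABANDONED_0 && ret < DEMO_WAIT_ABANDONED_0 + count) {
        *index = ret - DEMO_WAIT_ABANDONED_0;
        return OUTCOME_ABANDONED;
    }
    if (ret == DEMO_WAIT_TIMEOUT)
        return OUTCOME_TIMEOUT;
    return OUTCOME_FAILED;
}
```

On Windows, the value passed as ret would be the result of WaitForMultipleObjects() and count would be cObjects.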

Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() closes the handle; the mutex object is freed when its last handle is closed

Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );

Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
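
For comparison with the Windows mutex example, the same shared-counter pattern can be sketched with POSIX mutexes. This is a minimal illustration rather than code from the slides: run_counter_demo, worker, worker_arg and MAX_THREADS are hypothetical names, and the mutex is statically initialized instead of calling pthread_mutex_init().

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_THREADS 16            /* hypothetical upper bound for the demo */

static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;          /* shared by all threads */

struct worker_arg { int iterations; long local_value; };

static void *worker(void *p)
{
    const struct worker_arg *arg = p;
    for (int i = 0; i < arg->iterations; i++) {
        pthread_mutex_lock(&counter_lock);    /* blocks, like a wait with INFINITE */
        counter += arg->local_value;
        pthread_mutex_unlock(&counter_lock);  /* counterpart of ReleaseMutex() */
    }
    return NULL;
}

/* Runs nthreads workers; returns the final counter value, or -1 on error. */
long run_counter_demo(int nthreads, int iterations)
{
    pthread_t tid[MAX_THREADS];
    struct worker_arg arg = { iterations, 1 };
    int started = 0;

    if (nthreads < 1 || nthreads > MAX_THREADS)
        return -1;

    pthread_mutex_lock(&counter_lock);
    counter = 0;
    pthread_mutex_unlock(&counter_lock);

    for (int i = 0; i < nthreads; i++) {
        if (pthread_create(&tid[i], NULL, worker, &arg) != 0)
            break;
        started++;
    }
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);   /* join makes the final read safe */

    return started == nthreads ? counter : -1;
}
```

Without the lock/unlock pair, the increments of different threads could interleave and the final value would usually fall short of nthreads * iterations.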

Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any positive value, up to the semaphore's maximum count
- ReleaseSemaphore() returns old semaphore count (through lpPreviousCount)
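
The counting rules above can be illustrated with a small single-threaded model. It is a sketch of the semantics only, not the Windows implementation and not thread-safe: sem_model and its functions are hypothetical names, and a wait is modeled as a non-blocking attempt (comparable to a time-out of 0).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a counting semaphore's state. */
typedef struct {
    long count;   /* current count; "signaled" while count > 0 */
    long max;     /* maximum count fixed at creation            */
} sem_model;

bool sem_model_init(sem_model *s, long initial, long max)
{
    if (max <= 0 || initial < 0 || initial > max)
        return false;
    s->count = initial;
    s->max = max;
    return true;
}

/* Non-blocking wait: succeeds only while signaled, and decrements by 1. */
bool sem_model_try_wait(sem_model *s)
{
    if (s->count <= 0)
        return false;
    s->count--;
    return true;
}

/* Release: adds n, reports the old count, and fails without any change
 * if the result would exceed the maximum. */
bool sem_model_release(sem_model *s, long n, long *previous)
{
    if (n <= 0 || n > s->max - s->count)
        return false;
    if (previous != NULL)
        *previous = s->count;
    s->count += n;
    return true;
}
```

A real semaphore would block the waiting thread instead of returning false; the model only shows how the count moves.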

Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE - create event in signaled state

Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );

Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
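
Because a condition variable keeps no state of its own, a manual-reset event has to be approximated by pairing it with a mutex and a flag. The following is one possible sketch under that assumption; manual_event and the event_* functions are hypothetical names, not part of pthreads or the Windows API.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical type: a manual-reset event built from POSIX primitives. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;   /* persists until event_reset() */
} manual_event;

int event_init(manual_event *ev, bool initial_state)
{
    ev->signaled = initial_state;
    if (pthread_mutex_init(&ev->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&ev->cond, NULL) != 0) {
        pthread_mutex_destroy(&ev->lock);
        return -1;
    }
    return 0;
}

void event_set(manual_event *ev)        /* roughly SetEvent() */
{
    pthread_mutex_lock(&ev->lock);
    ev->signaled = true;
    pthread_cond_broadcast(&ev->cond);  /* release all current waiters */
    pthread_mutex_unlock(&ev->lock);
}

void event_reset(manual_event *ev)      /* roughly ResetEvent() */
{
    pthread_mutex_lock(&ev->lock);
    ev->signaled = false;
    pthread_mutex_unlock(&ev->lock);
}

void event_wait(manual_event *ev)       /* roughly a wait with INFINITE */
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->signaled)               /* loop guards against spurious wakeups */
        pthread_cond_wait(&ev->cond, &ev->lock);
    pthread_mutex_unlock(&ev->lock);
}

bool event_is_set(manual_event *ev)     /* roughly a wait with timeout 0 */
{
    pthread_mutex_lock(&ev->lock);
    bool state = ev->signaled;
    pthread_mutex_unlock(&ev->lock);
    return state;
}

void event_destroy(manual_event *ev)
{
    pthread_cond_destroy(&ev->cond);
    pthread_mutex_destroy(&ev->lock);
}
```

The flag is what makes the event "stay signaled": a thread that calls event_wait() after event_set() returns at once, which a bare pthread_cond_wait() would not do.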

Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way

I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE,
        0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);

Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;

Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server through the same pipe name, each over its own instance
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions

Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on remote machine (. - local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)

Named Pipes (contd.)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive - is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );

Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo

Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence into one call

Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence into one call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
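
The fixed-size-record approach mentioned above can be sketched with mkfifo(). This is a simplified, single-process illustration under stated assumptions: fifo_roundtrip, struct record and RECORD_LEN are hypothetical names, and a real client/server pair would keep the two ends of the FIFO in different processes.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define RECORD_LEN 64                 /* hypothetical fixed record size */

struct record { char text[RECORD_LEN]; };

/* Sends one fixed-size record through a FIFO at `path` and reads it back.
 * Returns 0 on success, -1 on error; `out` needs room for RECORD_LEN bytes. */
int fifo_roundtrip(const char *path, const char *msg, char *out, size_t outlen)
{
    struct record tx, rx;
    int rfd = -1, wfd = -1, rc = -1;

    if (outlen < RECORD_LEN)
        return -1;
    memset(&tx, 0, sizeof tx);
    strncpy(tx.text, msg, RECORD_LEN - 1);   /* stays NUL-terminated */

    unlink(path);                            /* drop a stale FIFO, if any */
    if (mkfifo(path, 0600) != 0)
        return -1;

    /* A non-blocking read-only open returns at once, even with no writer. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd < 0)
        goto done;
    wfd = open(path, O_WRONLY);              /* a reader exists, so no blocking */
    if (wfd < 0)
        goto done;

    /* Writes of at most PIPE_BUF bytes are atomic, so whole records
     * from several writers never interleave. */
    if (write(wfd, &tx, sizeof tx) != (ssize_t)sizeof tx)
        goto done;
    if (read(rfd, &rx, sizeof rx) != (ssize_t)sizeof rx)
        goto done;

    memcpy(out, rx.text, RECORD_LEN);
    rc = 0;
done:
    if (wfd >= 0) close(wfd);
    if (rfd >= 0) close(rfd);
    unlink(path);
    return rc;
}
```

Since every record has the same length, the reader always knows how many bytes make up one message, which is what a message-mode named pipe provides on Windows without extra effort.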

Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;

Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many comm.)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 byte)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile() - open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain

Locate a server via mailslot
[Figure: the app server acts as the mailslot client and sends a status message periodically; app clients 0 ... n act as mailslot servers and read it]
Mailslot servers (app client 0 ... app client n):
hMS = CreateMailslot( "\\.\mailslot\status" ... );
ReadFile( hMS, &ServStat ... );
/* connect to server */
Mailslot client (app server):
while (...) {
    Sleep( ... );
    hMS = CreateFile( "\\.\mailslot\status" ... );
    ...
    WriteFile( hMS, &StatInfo ... );
}

Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is max. msg size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0 - immediate return
- MAILSLOT_WAIT_FOREVER - infinite wait

Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max. msg len: 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活

Deferred Procedure Calls (DPCs)
[Figure: DPC queue head linking a chain of DPC objects]

Delivering a DPC
DPC routines can call kernel functions but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
1. Timer expires, kernel queues DPC that will release all waiting threads; kernel requests SW interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After DPC interrupt, control transfers to thread dispatcher
4. Dispatcher executes each DPC routine in DPC queue
[Figure: interrupt dispatch table ordered from high to low IRQL (power failure, dispatch/DPC, APC, low), the DPC queue, and the dispatcher]

Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, call system services
APC queue is thread-specific
User mode & kernel mode APCs
- Permission required for user mode APCs

Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()

Asynchronous Procedure Calls (APCs)
Special kernel APCs
- Run in kernel mode, at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented

Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"

Asynchronous Procedure Calls (APCs)
[Figure: thread object with kernel-mode (K) and user-mode (U) queues of APC objects]

IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 & not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
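
The four accounting rules above amount to a small decision function. The sketch below only restates them in C; classify_tick and the charge_bucket enumerators are hypothetical names and do not correspond to actual kernel routines.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical labels for where one clock tick is charged. */
enum charge_bucket {
    CHARGE_THREAD_USER_OR_KERNEL,  /* IRQL < 2                     */
    CHARGE_DPC_TIME,               /* IRQL == 2, processing a DPC  */
    CHARGE_THREAD_KERNEL,          /* IRQL == 2, no DPC in progress */
    CHARGE_INTERRUPT_TIME          /* IRQL > 2                     */
};

/* Decide which counter a clock tick is charged to, given the IRQL that
 * was current when the clock interrupt arrived. */
enum charge_bucket classify_tick(unsigned irql, bool in_dpc)
{
    if (irql < 2)
        return CHARGE_THREAD_USER_OR_KERNEL;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC_TIME : CHARGE_THREAD_KERNEL;
    return CHARGE_INTERRUPT_TIME;
}
```

For example, a tick that arrives while a DPC routine runs at IRQL 2 yields CHARGE_DPC_TIME, so the interrupted thread's own user and kernel times stay unchanged.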

IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)

Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts & DPCs

Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column

Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat

Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: Size, Type, State; wait list head; object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]

Delivering a DPC
DPC routines can call kernel functions, but can't call system services, generate page faults, or create or wait on objects
DPC routines can't assume what process address space is currently mapped
[Figure: interrupt dispatch table ordered from high (power failure) through dispatch/DPC and APC to low; a DPC queue feeds the dispatcher]
1. Timer expires; kernel queues a DPC that will release all waiting threads; kernel requests a software interrupt
2. DPC interrupt occurs when IRQL drops below dispatch/DPC level
3. After the DPC interrupt, control transfers to the thread dispatcher
4. Dispatcher executes each DPC routine in the DPC queue

Asynchronous Procedure Calls (APCs)
Execute code in the context of a particular user thread
- APC routines can acquire resources (objects), incur page faults, and call system services
APC queue is thread-specific
User-mode and kernel-mode APCs
- Permission required for user-mode APCs

Asynchronous Procedure Calls (APCs) (contd)
Executive uses APCs to complete work in thread space
- Wait for asynchronous I/O operation
- Emulate delivery of POSIX signals
- Make threads suspend/terminate themselves (environment subsystems)
User-mode APCs are delivered when the thread is in an alertable wait state
- WaitForMultipleObjectsEx(), SleepEx()

Asynchronous Procedure Calls (APCs) (contd)
Special kernel APCs
- Run in kernel mode at IRQL 1
- Always deliverable unless thread is already at IRQL 1 or above
- Used for I/O completion reporting from "arbitrary thread context"
- Kernel-mode interface is linkable but not documented

Asynchronous Procedure Calls (APCs) (contd)
"Ordinary" kernel APCs
- Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User-mode APCs
- Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
- Only deliverable when thread is in "alertable wait"

Asynchronous Procedure Calls (APCs) (contd)
[Figure: a thread object with a kernel-mode (K) and a user-mode (U) APC queue, each linking APC objects]

IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
- If IRQL < 2, charge to thread's user or kernel time
- If IRQL = 2 and processing a DPC, charge to DPC time
- If IRQL = 2 and not processing a DPC, charge to thread kernel time
- If IRQL > 2, charge to interrupt time
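
The clock ISR charging rules above can be written as a small decision function. This is only an illustrative sketch: the enum, the function name charge_bucket and the in_dpc flag are assumptions made here and are not kernel identifiers; only the IRQL thresholds come from the list above.

```c
#include <assert.h>

/* Hypothetical accounting buckets; the names are illustrative, not kernel symbols. */
typedef enum {
    CHARGE_THREAD_USER_OR_KERNEL,  /* IRQL < 2 */
    CHARGE_DPC_TIME,               /* IRQL == 2 while a DPC routine runs */
    CHARGE_THREAD_KERNEL,          /* IRQL == 2, no DPC in progress */
    CHARGE_INTERRUPT_TIME          /* IRQL > 2 */
} charge_bucket_t;

/* Decide which counter one clock tick is charged to. */
charge_bucket_t charge_bucket(int irql, int in_dpc)
{
    assert(irql >= 0);
    if (irql < 2)
        return CHARGE_THREAD_USER_OR_KERNEL;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC_TIME : CHARGE_THREAD_KERNEL;
    return CHARGE_INTERRUPT_TIME;
}
```

For example, a tick that arrives at IRQL 2 while a DPC routine is running maps to CHARGE_DPC_TIME, while the same IRQL without a DPC in progress maps to CHARGE_THREAD_KERNEL.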

IRQLs and CPU Time Accounting (contd)
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
- Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)

Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
- Process Explorer shows these as separate processes (not really processes)
- Context switches for these are really the number of interrupts and DPCs

Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add Context Switch Delta column
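
The quirk described above follows from sampling: whichever thread is running when the clock fires is charged the whole interval. The following is a minimal simulation of that idea, not a model of the real kernel; the function name sampled_ticks and all of its parameters are hypothetical and chosen only for this illustration.

```c
#include <assert.h>

/*
 * Count how many clock ticks land while a periodic thread is running.
 * Ticks fire every tick_us microseconds. The thread repeats every
 * thread_period_us microseconds and, inside each cycle, runs for run_us
 * microseconds starting at offset_us. All names are illustrative.
 */
long sampled_ticks(long tick_us, long thread_period_us,
                   long offset_us, long run_us, long nticks)
{
    long charged = 0;
    assert(tick_us > 0 && thread_period_us > 0);
    assert(offset_us >= 0 && run_us >= 0 &&
           offset_us + run_us <= thread_period_us);
    for (long k = 0; k < nticks; k++) {
        /* where tick k falls inside the thread's cycle */
        long phase = (k * tick_us) % thread_period_us;
        if (phase >= offset_us && phase < offset_us + run_us)
            charged++;  /* the running thread is charged a whole tick */
    }
    return charged;
}
```

With a 10 msec tick, a thread that runs 5 msec out of every 10 msec but always starts 1 msec after the tick is never charged: sampled_ticks(10000, 10000, 1000, 5000, 100) returns 0. A thread that runs only 1 msec but happens to be running at every tick is charged all 100 ticks.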

Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat

Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
[Figure: dispatcher object layout: header with Size, Type, State and wait list head, followed by object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects

Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects, each with a WaitBlockList, point to their wait blocks (fields: List entry, Object, Thread, Key, Type, Next link); each wait block is also linked from the wait list head of a dispatcher object (Size, Type, State, wait list head, object-type-specific data)]
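
The relationship between dispatcher objects and wait blocks can be sketched with two C structures. This is a simplified, hypothetical layout for illustration only: the type names, field names and the helper wait_satisfied are assumptions made here and do not reproduce the real definitions in ntddk.h.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { WAIT_TYPE_ALL, WAIT_TYPE_ANY } wait_type_t;

struct wait_block;

/* Common header shared by all dispatcher objects (simplified). */
typedef struct dispatcher_header {
    int type;                      /* event, mutex, semaphore, ... */
    int signaled;                  /* nonzero when the object is signaled */
    struct wait_block *wait_list;  /* wait blocks of threads waiting on it */
} dispatcher_header_t;

/* One wait block per handle passed to a wait call. */
typedef struct wait_block {
    dispatcher_header_t *object;   /* object being waited on */
    void *thread;                  /* waiting thread (opaque in this sketch) */
    int key;                       /* position in the wait call's handle array */
    wait_type_t type;              /* wait for any object or for all of them */
    struct wait_block *next;       /* next wait block of the same wait call */
} wait_block_t;

/*
 * Return the key of the first signaled object for a wait-any,
 * 0 for a satisfied wait-all, or -1 if the wait is not yet satisfied.
 */
int wait_satisfied(const wait_block_t *blocks)
{
    int all = 1;
    if (blocks == NULL)
        return -1;
    for (const wait_block_t *b = blocks; b != NULL; b = b->next) {
        if (b->object->signaled) {
            if (b->type == WAIT_TYPE_ANY)
                return b->key;
        } else {
            all = 0;
        }
    }
    return (blocks->type == WAIT_TYPE_ALL && all) ? 0 : -1;
}
```

The key field is what lets a wait-any call report which handle in the argument list was satisfied, which is the role the slide assigns to Key for WaitForMultipleObjects.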

3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots

Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );

Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once, even if the guarded block raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );

Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );

Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
- dwTimeout == 0: no wait, check whether object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for first signaled object or all objects
- Function returns index of first signaled object (as WAIT_OBJECT_0 + index)
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions

Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() closes the handle; the mutex object is freed when its last handle is closed

Mutexes (contd)
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );

Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else            /* mutex was abandoned */
        break;      /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block; equivalent to WaitForSingleObject( hMutex )
- pthread_mutex_trylock() is nonblocking (polling); equivalent to WaitForSingleObject() with timeout == 0
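
For comparison with the Windows mutex example, here is a minimal pthreads sketch of the same shared-counter pattern. The helper names worker, run_counter and try_enter, and the limit of 16 threads, are assumptions made for this illustration; only the pthread_mutex_* calls are part of the POSIX API.

```c
#include <assert.h>
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;            /* shared by all threads */

/* Each worker adds 1 to the counter n times, one lock/unlock per update. */
static void *worker(void *arg)
{
    long n = *(long *)arg;
    for (long i = 0; i < n; i++) {
        pthread_mutex_lock(&lock);    /* blocks, like WaitForSingleObject(hMutex, INFINITE) */
        counter++;
        pthread_mutex_unlock(&lock);  /* like ReleaseMutex(hMutex) */
    }
    return NULL;
}

/* Run nthreads workers and return the final counter value. */
long run_counter(int nthreads, long per_thread)
{
    pthread_t tid[16];
    assert(nthreads > 0 && nthreads <= 16);
    counter = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, worker, &per_thread);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return counter;
}

/* Non-blocking attempt: 1 if the lock was free and is now held, 0 if busy. */
int try_enter(void)
{
    return pthread_mutex_trylock(&lock) == 0;
}
```

Because every update happens under the lock, run_counter(4, 10000) yields 40000 regardless of scheduling. try_enter() plays the role of WaitForSingleObject() with a zero timeout; a caller that gets 1 back owns the lock and must unlock it later.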

Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value
- ReleaseSemaphore() returns old semaphore count (through lpPreviousCount)

Semaphores (contd)
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state

Events (contd)
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );

Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
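
Although pthreads has no direct counterpart, the behavior of a manual-reset event can be approximated with a mutex, a condition variable and a flag. The sketch below is one possible emulation under that assumption; the type event_t and the event_* functions are hypothetical names introduced here, not part of POSIX or Win32.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical manual-reset event: a flag guarded by a mutex and condition variable. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int             signaled;
} event_t;

void event_init(event_t *e, int initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

void event_set(event_t *e)            /* similar to SetEvent() */
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 1;
    pthread_cond_broadcast(&e->cond); /* release every waiting thread */
    pthread_mutex_unlock(&e->lock);
}

void event_reset(event_t *e)          /* similar to ResetEvent() */
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 0;
    pthread_mutex_unlock(&e->lock);
}

void event_wait(event_t *e)           /* similar to WaitForSingleObject(hEvent, INFINITE) */
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)              /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

void event_destroy(event_t *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what makes the state persistent: a thread that calls event_wait() after event_set() returns immediately, which a bare condition variable would not guarantee. An auto-reset variant could clear the flag inside event_wait() and use pthread_cond_signal() instead of the broadcast.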

Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way

I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);

Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;

Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same instance
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions

Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. means local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)

Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );

Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo

Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence

Convenience Functions (contd)
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios

Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;

Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain

Locate a server via mailslot
[Figure: application clients 0 … n act as mailslot servers; the application server acts as the mailslot client and sends its status message periodically]
Mailslot servers (app client 0 … app client n)
    hMS = CreateMailslot( "\\.\mailslot\status" );
    ReadFile(hMS, &ServStat);
    /* connect to server */
Mailslot client (app server)
    while (…) {
        Sleep(…);
        hMS = CreateFile( "\\*\mailslot\status" );
        …
        WriteFile(hMS, &StatInfo …
    }
Message is sent periodically

Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait

Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max. message length 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts

Thoughts Change Life (意念改变生活)
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
6767
Asynchronous Procedure Calls (APCs)
Execute code in context of a particular user threadndash APC routines can acquire resources (objects) incur page faultscall system services
APC queue is thread-specific
User mode amp kernel mode APCsndash Permission required for user mode APCs
6868
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread spacendash Wait for asynchronous IO operation
ndash Emulate delivery of POSIX signals
ndash Make threads suspendterminate itself (env subsystems)
APCs are delivered when thread is in alertable wait statendash WaitForMultipleObjectsEx() SleepEx()
6969
Special kernel APCs
ndash Run in kernel mode at IRQL 1
ndash Always deliverable unless thread is already at IRQL 1 or above
ndash Used for IO completion reporting from ldquoarbitrary thread contextrdquo
ndash Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
7070
ldquoOrdinaryrdquo kernel APCs
ndash Always deliverable if at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
ndash Used for IO completion callback routines (see ReadFileEx WriteFileEx) also QueueUserApc
ndash Only deliverable when thread is in ldquoalertable waitrdquo
Asynchronous Procedure Calls (APCs)
7171
ThreadObject
K
U
APC objects
Asynchronous Procedure Calls (APCs)
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
ndash If IRQLlt2 charge to threadrsquos user or kernel time
ndash If IRQL=2 and processing a DPC charge to DPC time
ndash If IRQL=2 amp not processing a DPC charge to thread kernel time
ndash If IRQLgt2 charge to interrupt time
7373
IRQLs and CPU Time Accounting
Since time servicing interrupts are NOT charged to interrupted thread if system is busy but no process appears to be running must be due to interrupt-related activity
ndash Note time at IRQL 2 or more is charged to the current threadrsquos quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any threadprocess
ndash Process Explorer shows these as separate processesnot really processes
ndash Context switches for these are really of interrupts amp DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
ndash Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals NOT accounted
ndash Eg one or more threads run and enter a wait state before clock fires
ndash Thus threads may run but never get charged
View context switch activity with Process Explorer
ndash Add Context Switch Delta column
7676
For waiting threads user-mode utilities only display the wait reason
Example pstat
Looking at Waiting Threads
7777
Wait Internals 1 Dispatcher Objects
Size TypeState
Wait listhead
Object-type-specific data
DispatcherObject
(see ntddkincddkntddkh)
Any kernel object you can wait for is a ldquodispatcher objectrdquo
ndash some exclusively for synchronizationeg events mutexes (ldquomutantsrdquo) semaphores queues timers
ndash others can be waited for as a side effect of their prime function eg processes threads file objects
ndash non-waitable kernel objects are called ldquocontrol objectsrdquo
All dispatcher objects have a common header
All dispatcher objects are in one of two states
ndash ldquosignaledrdquo vs ldquononsignaledrdquo
ndash when signalled a wait on the object is satisfied
ndash different object types differ in terms of what changes their state
ndash wait and unwait implementation iscommon to all types of dispatcher objects
7878
Object-type-specific data
Wait Internals 2Wait Blocks
Size TypeState
Wait listhead
Size TypeState
Wait listhead
Represent a threadrsquos reference to something itrsquos waiting for (one per handle passed to WaitForhellip)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for ldquoanyrdquo or ldquoallrdquo Key denotes argument list position for
WaitForMultipleObjects
Object-type-specific data
DispatcherObjects
Thread Objects
WaitBlockListWaitBlockList
Wait blocks
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
7979
34 Windows APIs for Synchronization and IPC Windows API constructs for synchronization and interprocess communication
Synchronization
ndash Critical sections
ndash Mutexes
ndash Semaphores
ndash Event objects
Synchronization through interprocess communication
ndash Anonymous pipes
ndash Named pipes
ndash Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec ) VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec )
8181
Critical Section Example
counter is global shared by all threads
volatile int counter = 0
CRITICAL_SECTION crit
InitializeCriticalSection ( ampcrit )
hellip main loop in any of the threads
while (done)
_try
EnterCriticalSection ( ampcrit )
counter += local_value
LeaveCriticalSection ( ampcrit )
_finally LeaveCriticalSection ( ampcrit )
DeleteCriticalSection( ampcrit )
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
ndash Processes
ndash Threads
ndash Files
ndash Console input
File change notificationsFile change notifications
MutexesMutexes
Events (auto-reset + manual-reset)Events (auto-reset + manual-reset)
Waitable timersWaitable timers
DWORD WaitForSingleObject( HANDLE hObject DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects LPHANDLE lpHandles BOOL bWaitAll DWORD dwTimeout )
8383
Wait Functions - Details
WaitForSingleObject()ndash hObject specifies kernel object
ndash dwTimeout specifies wait time in msecdwTimeout == 0 - no wait check whether object is signaled
dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()ndash cObjects lt= MAXIMUM_WAIT_OBJECTS (64)
ndash lpHandles - pointer to array identifying these objects
ndash bWaitAll - whether to wait for first signaled object or all objectsFunction returns index of first signaled object
Side effectsndash Mutexes auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
HANDLE OpenMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
BOOL ReleaseMutex( HANDLE hMutex )
8686
Mutex Example
counter is global shared by all threads
volatile int done counter = 0
HANDLE mutex = CreateMutex( NULL FALSE NULL )
main loop in any of the threads ret is local
DWORD ret
while (done)
ret = WaitForSingleObject( mutex INFINITE )
if (ret == WAIT_OBJECT_0)
counter += local_value
else mutex was abandoned
break exit the loop
ReleaseMutex( mutex )
CloseHandle( mutex )
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexesndash Synchronization among threads in same process
Five basic functionsndash pthread_mutex_init()
ndash pthread_mutex_destroy()
ndash pthread_mutex_lock()
ndash pthread_mutex_unlock()
ndash pthread_mutex_trylock()
Comparisonndash pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex )
ndash pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
8888
Semaphores
Semaphore objects are used for resource countingndash A semaphore is signaled when count gt 0
Threadsprocesses use wait functionsndash Each wait function decreases semaphore count by 1
ndash ReleaseSemaphore() may increment count by any value
ndash ReleaseSemaphore() returns old semaphore count
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE OpenSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE ReleaseSemaphore( HANDLE hSemaphoreLONG cReleaseCount LPLONG lpPreviousCount )
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)ndash Manual-reset event can signal several thread simultaneously must be reset manually
ndash PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
ndash Auto-reset event signals a single thread event is reset automatically
ndash fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTE lpsaBOOL fManualReset BOOL fInititalStateLPTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )BOOL ResetEvent( HANDLE hEvent )BOOL PulseEvent( HANDLE hEvent )
9292
Comparison - POSIX condition variables
pthreadrsquos condition variables are comparable to eventsndash pthread_cond_init()
ndash pthread_cond_destroy()
Wait functionsndash pthread_cond_wait()
ndash pthread_cond_timedwait()
Signalingndash pthread_cond_signal() - one thread
ndash pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phReadPHANDLE phWriteLPSECURITY_ATTRIBUTES lpsaDWORD cbPipe )
main
prog1 prog2pipe
Half-duplex character-based IPC
cbPipe pipe byte size zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are oneway
9494
IO Redirection using an Anonymous Pipe
Create default size anonymous pipe handles are inheritable
if (CreatePipe (amphReadPipe amphWritePipe ampPipeSA 0))
fprintf(stderr ldquoAnon pipe create failednrdquo) exit(1)
Set output handle to pipe handle create first processes
StartInfoCh1hStdInput = GetStdHandle (STD_INPUT_HANDLE)
StartInfoCh1hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh1hStdOutput = hWritePipe
StartInfoCh1dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)Command1 NULL NULL TRUE 0 NULL NULL ampStartInfoCh1 ampProcInfo1))
fprintf(stderr ldquoCreateProc1 failednrdquo) exit(2)
CloseHandle (hWritePipe)
9595
Pipe example (contd)
Repeat (symmetrically) for the second process
StartInfoCh2hStdInput = hReadPipe
StartInfoCh2hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh2hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE)
StartInfoCh2dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)targv NULL NULLTRUE Inherit handles
0 NULL NULL ampStartInfoCh2 ampProcInfo2))
fprintf(stderr ldquoCreateProc2 failednrdquo) exit(3)
CloseHandle (hReadPipe)
Wait for both processes to complete
WaitForSingleObject (ProcInfo1hProcess INFINITE)
WaitForSingleObject (ProcInfo2hProcess INFINITE)
CloseHandle (ProcInfo1hThread) CloseHandle (ProcInfo1hProcess)
CloseHandle (ProcInfo2hThread) CloseHandle (ProcInfo2hProcess)
return 0
9696
Named Pipes
Message orientedndash Reading process can read varying-length messages precisely as sent by the writing process
Bi-directionalndash Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipendash Several clients can communicate with a single server using the same instance
ndash Server can respond to client using the same instance
Pipe can be accessed over the networkndash location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa);
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive – is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage);
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
– Combines a WriteFile, ReadFile sequence
Convenience Functions (contd)
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
– Combines CreateFile, WriteFile, ReadFile, CloseHandle
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if client connected between the CreateNamedPipe call and ConnectNamedPipe() (GetLastError() then reports ERROR_PIPE_CONNECTED)
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic (POSIX guarantees this for writes of at most PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
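The fixed-size-record advice above can be sketched on the UNIX side roughly as follows. This is a minimal illustration, not code from the slides: `record_t`, `read_full` and `fifo_roundtrip` are hypothetical names, and the child process stands in for a client sending requests to a server through one well-known FIFO.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

typedef struct { int id; char text[28]; } record_t;   /* fixed-size record */

/* Read exactly len bytes; returns 1 on success, 0 on EOF or error. */
static int read_full(int fd, void *buf, size_t len)
{
    size_t have = 0;
    while (have < len) {
        ssize_t r = read(fd, (char *)buf + have, len - have);
        if (r <= 0) return 0;
        have += (size_t)r;
    }
    return 1;
}

/* Child writes n records into the FIFO at path, parent reads them back.
   Returns the number of records received, or -1 on error. */
int fifo_roundtrip(const char *path, const record_t *in, int n, record_t *out)
{
    assert(n > 0);
    unlink(path);                                   /* remove a stale FIFO, if any */
    if (mkfifo(path, 0600) == -1) return -1;
    pid_t pid = fork();
    if (pid == -1) { unlink(path); return -1; }
    if (pid == 0) {                                 /* child: the "client" */
        int wfd = open(path, O_WRONLY);             /* blocks until a reader opens */
        if (wfd == -1) _exit(1);
        for (int i = 0; i < n; i++)
            if (write(wfd, &in[i], sizeof in[i]) != (ssize_t)sizeof in[i]) _exit(1);
        close(wfd);
        _exit(0);
    }
    int got = 0;
    int rfd = open(path, O_RDONLY);                 /* blocks until the writer opens */
    if (rfd == -1) {
        kill(pid, SIGKILL);                         /* do not leave the child blocked */
        got = -1;
    } else {
        while (got < n && read_full(rfd, &out[got], sizeof out[got])) got++;
        close(rfd);
    }
    waitpid(pid, NULL, 0);
    unlink(path);
    return got;
}
```

Because the FIFO is a byte stream, the reader recovers message boundaries only from the agreed record size; a Windows named pipe opened with PIPE_TYPE_MESSAGE would preserve them without this convention.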
Client Example Using a Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile() – open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
Locate a server via mailslot
[Figure: App clients 0 to n act as mailslot servers (readers); the app server acts as mailslot client (writer). Message is sent periodically]
App client 0 … App client n (mailslot servers):
    hMS = CreateMailslot( "\\.\mailslot\status" … );
    ReadFile( hMS, &ServStat … );
    /* connect to server */
App server (mailslot client):
    while (…) {
        Sleep(…);
        hMS = CreateFile( "\\*\mailslot\status" … );
        WriteFile( hMS, &StatInfo … );
    }
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0 – immediate return
– MAILSLOT_WAIT_FOREVER – infinite wait
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name - retrieve handle for local mailslot
– \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max. message length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
Asynchronous Procedure Calls (APCs)
Executive uses APCs to complete work in thread space
– Wait for asynchronous I/O operation
– Emulate delivery of POSIX signals
– Make threads suspend/terminate themselves (env. subsystems)
APCs are delivered when thread is in alertable wait state
– WaitForMultipleObjectsEx(), SleepEx()
Asynchronous Procedure Calls (APCs)
Special kernel APCs
– Run in kernel mode, at IRQL 1
– Always deliverable unless thread is already at IRQL 1 or above
– Used for I/O completion reporting from "arbitrary thread context"
– Kernel-mode interface is linkable, but not documented
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with two lists of APC objects, labeled K and U]
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
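The four charging rules can be written down as a small decision function. This is only a model of the rules as listed on the slide, not kernel code; `charge_t` and `clock_tick_charge` are hypothetical names, and the `user_mode` flag is an assumption standing in for "the thread was interrupted in user mode".

```c
#include <assert.h>
#include <stdbool.h>

/* Buckets a clock tick can be charged to (illustrative names). */
typedef enum {
    CHARGE_THREAD_USER,
    CHARGE_THREAD_KERNEL,
    CHARGE_DPC,
    CHARGE_INTERRUPT
} charge_t;

/* irql:      IRQL that the clock interrupt preempted
   in_dpc:    a DPC routine was executing at that moment
   user_mode: the current thread was running user-mode code */
charge_t clock_tick_charge(int irql, bool in_dpc, bool user_mode)
{
    assert(irql >= 0);
    if (irql > 2)
        return CHARGE_INTERRUPT;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return user_mode ? CHARGE_THREAD_USER : CHARGE_THREAD_KERNEL;
}
```

For example, `clock_tick_charge(2, true, false)` yields `CHARGE_DPC`, while the same IRQL without an active DPC is charged to the thread's kernel time.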
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts and DPCs
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column
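The "run but never get charged" effect follows directly from sampling at clock ticks, and a toy model makes it concrete. The sketch assumes a thread is charged one whole tick whenever a tick instant falls inside one of its run intervals; `run_t` and `charged_ticks` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { int start_ms; int len_ms; } run_t;   /* thread runs in [start, start+len) */

/* Count the clock ticks (at tick_ms, 2*tick_ms, ... up to total_ms)
   that fire while the thread is running. */
int charged_ticks(const run_t *runs, size_t n, int tick_ms, int total_ms)
{
    assert(tick_ms > 0);
    int charged = 0;
    for (int t = tick_ms; t <= total_ms; t += tick_ms) {
        for (size_t i = 0; i < n; i++) {
            if (t >= runs[i].start_ms && t < runs[i].start_ms + runs[i].len_ms) {
                charged++;          /* the tick landed inside this run */
                break;
            }
        }
    }
    return charged;
}
```

With a 10 msec tick, a thread that runs for 8 msec starting 1 msec after each tick uses 80% of the CPU yet is charged nothing, whereas a thread that runs only from 9 to 11 msec is charged a full tick.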
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout: a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: dispatcher objects (Size, Type, State, Wait list head, object-type-specific data) whose wait lists link wait blocks (List entry, Object, Thread, Key, Type, Next link); each thread object's WaitBlockList chains that thread's wait blocks]
34 Windows APIs for Synchronization and IPC Windows API constructs for synchronization and interprocess communication
Synchronization
ndash Critical sections
ndash Mutexes
ndash Semaphores
ndash Event objects
Synchronization through interprocess communication
ndash Anonymous pipes
ndash Named pipes
ndash Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec ) VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec )
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection ( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection ( &crit );
    __try {
        counter += local_value;
    }
    __finally { LeaveCriticalSection ( &crit ); }   /* leave exactly once per enter */
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
  dwTimeout == 0 - no wait, check whether object is signaled
  dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles - pointer to array identifying these objects
– bWaitAll - whether to wait for first signaled object or all objects
  When bWaitAll is FALSE, function returns index of first signaled object (as WAIT_OBJECT_0 + index)
Side effects
– Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
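The wait-any versus wait-all rule and the reset side effect can be modeled in a few lines of portable C. This is a simplified model, not the Windows implementation: `wait_obj_t` and `try_wait_multiple` are hypothetical names, the function never blocks (it behaves like a zero timeout), and it returns a plain index rather than WAIT_OBJECT_0 + index.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool signaled;
    bool auto_reset;   /* true for objects that reset when a wait completes */
} wait_obj_t;

/* Returns the index of the first signaled object for a wait-any,
   0 for a satisfied wait-all, or -1 if the wait would have to block. */
int try_wait_multiple(wait_obj_t *objs, size_t n, bool wait_all)
{
    assert(n > 0 && n <= 64);
    if (wait_all) {
        for (size_t i = 0; i < n; i++)
            if (!objs[i].signaled)
                return -1;                  /* nothing is consumed on failure */
        for (size_t i = 0; i < n; i++)
            if (objs[i].auto_reset)
                objs[i].signaled = false;   /* side effect of a completed wait */
        return 0;
    }
    for (size_t i = 0; i < n; i++) {
        if (objs[i].signaled) {
            if (objs[i].auto_reset)
                objs[i].signaled = false;
            return (int)i;
        }
    }
    return -1;
}
```

Note that a failed wait-all leaves every object untouched: state is only changed once all objects are signaled together.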
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
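For comparison with the Windows mutex example, the same shared-counter pattern might look like this with pthreads. It is a sketch under assumptions: `run_counter` and `try_enter` are hypothetical helpers, the mutex is a statically initialized default mutex, and error returns from the pthread calls are ignored for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;                       /* shared by all threads */

static void *worker(void *arg)
{
    long iters = *(long *)arg;
    for (long i = 0; i < iters; i++) {
        pthread_mutex_lock(&lock);         /* blocks until the mutex is free */
        counter++;
        pthread_mutex_unlock(&lock);       /* counterpart of ReleaseMutex() */
    }
    return NULL;
}

/* Start nthreads workers, each adding iters to the counter; returns the total. */
long run_counter(int nthreads, long iters)
{
    assert(nthreads > 0 && nthreads <= 16);
    pthread_t tid[16];
    counter = 0;
    for (int i = 0; i < nthreads; i++)
        pthread_create(&tid[i], NULL, worker, &iters);
    for (int i = 0; i < nthreads; i++)
        pthread_join(tid[i], NULL);
    return counter;
}

/* Polling attempt: returns 1 if the mutex was free and is now held, 0 if busy. */
int try_enter(void)
{
    return pthread_mutex_trylock(&lock) == 0;
}
```

Without the lock/unlock pair the increments from different threads could interleave and the final count would usually fall short of `nthreads * iters`.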
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() returns old semaphore count
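The counting behaviour described above (wait takes one unit, release may add several and reports the old count) can be imitated with a mutex and a condition variable. This is a portable model and not the Windows implementation; `csem_t` and the `csem_*` functions are hypothetical names, and the rule that a release beyond the maximum fails without changing the count is an assumption of this sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  nonzero;
    long count, max;
} csem_t;

void csem_init(csem_t *s, long initial, long max)
{
    assert(max > 0 && initial >= 0 && initial <= max);
    pthread_mutex_init(&s->mu, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = initial;
    s->max = max;
}

/* Blocking wait: sleeps until count > 0, then takes one unit. */
void csem_wait(csem_t *s)
{
    pthread_mutex_lock(&s->mu);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->mu);
    s->count--;
    pthread_mutex_unlock(&s->mu);
}

/* Zero-timeout wait: takes a unit only if one is available right now. */
bool csem_trywait(csem_t *s)
{
    pthread_mutex_lock(&s->mu);
    bool ok = s->count > 0;
    if (ok)
        s->count--;
    pthread_mutex_unlock(&s->mu);
    return ok;
}

/* Adds n units and stores the old count in *previous (if not NULL).
   Fails, leaving the count unchanged, if the maximum would be exceeded. */
bool csem_release(csem_t *s, long n, long *previous)
{
    pthread_mutex_lock(&s->mu);
    bool ok = n > 0 && s->count + n <= s->max;
    if (ok) {
        if (previous)
            *previous = s->count;
        s->count += n;
        pthread_cond_broadcast(&s->nonzero);   /* several waiters may proceed */
    }
    pthread_mutex_unlock(&s->mu);
    return ok;
}
```

Releasing by more than one is what distinguishes this from a plain POSIX `sem_post()`, which always adds exactly one unit.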
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE - create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal() - one thread
– pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
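A condition variable carries no state of its own, which is why there is no direct counterpart to a manual-reset event. One common way to approximate one is to pair the condition variable with a flag protected by a mutex, as sketched below. `event_t` and the `event_*` functions are hypothetical names, PulseEvent() is deliberately not modeled, and the auto-reset behavior here is an approximation rather than a faithful copy of Windows semantics.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    bool signaled;        /* the state a bare condition variable lacks */
    bool manual_reset;
} event_t;

void event_init(event_t *e, bool manual_reset, bool initial_state)
{
    assert(e != NULL);
    pthread_mutex_init(&e->mu, NULL);
    pthread_cond_init(&e->cv, NULL);
    e->signaled = initial_state;
    e->manual_reset = manual_reset;
}

void event_set(event_t *e)
{
    pthread_mutex_lock(&e->mu);
    e->signaled = true;
    if (e->manual_reset)
        pthread_cond_broadcast(&e->cv);   /* release all waiting threads */
    else
        pthread_cond_signal(&e->cv);      /* release a single waiting thread */
    pthread_mutex_unlock(&e->mu);
}

void event_reset(event_t *e)
{
    pthread_mutex_lock(&e->mu);
    e->signaled = false;
    pthread_mutex_unlock(&e->mu);
}

void event_wait(event_t *e)
{
    pthread_mutex_lock(&e->mu);
    while (!e->signaled)                  /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cv, &e->mu);
    if (!e->manual_reset)
        e->signaled = false;              /* auto-reset: consumed by this waiter */
    pthread_mutex_unlock(&e->mu);
}

bool event_is_set(event_t *e)
{
    pthread_mutex_lock(&e->mu);
    bool s = e->signaled;
    pthread_mutex_unlock(&e->mu);
    return s;
}
```

A manual-reset event stays signaled across any number of `event_wait()` calls until `event_reset()` is called, while an auto-reset event lets exactly one wait through per `event_set()`.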
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main, with prog1 and prog2 connected by a pipe]
Half-duplex character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
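The one-way, blocking behavior listed above is the same on the UNIX side, where `pipe()` plays the role of CreatePipe(). The following sketch is a POSIX analogue, not the Windows code of the slides; `pipe_roundtrip` is a hypothetical helper in which a child process writes a message and the parent reads until end of file.

```c
#include <assert.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Child writes msg into the pipe, parent reads it into out (NUL-terminated).
   Returns the number of bytes read, or -1 on error. */
ssize_t pipe_roundtrip(const char *msg, char *out, size_t cap)
{
    assert(cap > 0);
    int fd[2];                          /* fd[0] = read end, fd[1] = write end */
    if (pipe(fd) == -1)
        return -1;
    pid_t pid = fork();
    if (pid == -1) {
        close(fd[0]); close(fd[1]);
        return -1;
    }
    if (pid == 0) {                     /* child: the writer */
        close(fd[0]);
        size_t len = strlen(msg);
        ssize_t w = write(fd[1], msg, len);
        close(fd[1]);
        _exit(w == (ssize_t)len ? 0 : 1);
    }
    close(fd[1]);                       /* parent drops its write end, else no EOF */
    size_t total = 0;
    ssize_t n;
    while (total < cap - 1 &&
           (n = read(fd[0], out + total, cap - 1 - total)) > 0)
        total += (size_t)n;             /* read blocks while the pipe is empty */
    out[total] = '\0';
    close(fd[0]);
    waitpid(pid, NULL, 0);
    return (ssize_t)total;
}
```

Closing the unused write end in the reader matters for the same reason the Windows example closes hWritePipe in the parent: as long as any write handle stays open, the reader never sees end of file.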
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
6969
Special kernel APCs
ndash Run in kernel mode at IRQL 1
ndash Always deliverable unless thread is already at IRQL 1 or above
ndash Used for IO completion reporting from ldquoarbitrary thread contextrdquo
ndash Kernel-mode interface is linkable but not documented
Asynchronous Procedure Calls (APCs)
7070
ldquoOrdinaryrdquo kernel APCs
ndash Always deliverable if at IRQL 0 unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
ndash Used for IO completion callback routines (see ReadFileEx WriteFileEx) also QueueUserApc
ndash Only deliverable when thread is in ldquoalertable waitrdquo
Asynchronous Procedure Calls (APCs)
7171
ThreadObject
K
U
APC objects
Asynchronous Procedure Calls (APCs)
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
ndash If IRQLlt2 charge to threadrsquos user or kernel time
ndash If IRQL=2 and processing a DPC charge to DPC time
ndash If IRQL=2 amp not processing a DPC charge to thread kernel time
ndash If IRQLgt2 charge to interrupt time
7373
IRQLs and CPU Time Accounting
Since time servicing interrupts are NOT charged to interrupted thread if system is busy but no process appears to be running must be due to interrupt-related activity
ndash Note time at IRQL 2 or more is charged to the current threadrsquos quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any threadprocess
ndash Process Explorer shows these as separate processesnot really processes
ndash Context switches for these are really of interrupts amp DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
ndash Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals NOT accounted
ndash Eg one or more threads run and enter a wait state before clock fires
ndash Thus threads may run but never get charged
View context switch activity with Process Explorer
ndash Add Context Switch Delta column
7676
For waiting threads user-mode utilities only display the wait reason
Example pstat
Looking at Waiting Threads
7777
Wait Internals 1 Dispatcher Objects
Size TypeState
Wait listhead
Object-type-specific data
DispatcherObject
(see ntddkincddkntddkh)
Any kernel object you can wait for is a ldquodispatcher objectrdquo
ndash some exclusively for synchronizationeg events mutexes (ldquomutantsrdquo) semaphores queues timers
ndash others can be waited for as a side effect of their prime function eg processes threads file objects
ndash non-waitable kernel objects are called ldquocontrol objectsrdquo
All dispatcher objects have a common header
All dispatcher objects are in one of two states
ndash ldquosignaledrdquo vs ldquononsignaledrdquo
ndash when signalled a wait on the object is satisfied
ndash different object types differ in terms of what changes their state
ndash wait and unwait implementation iscommon to all types of dispatcher objects
7878
Object-type-specific data
Wait Internals 2Wait Blocks
Size TypeState
Wait listhead
Size TypeState
Wait listhead
Represent a threadrsquos reference to something itrsquos waiting for (one per handle passed to WaitForhellip)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for ldquoanyrdquo or ldquoallrdquo Key denotes argument list position for
WaitForMultipleObjects
Object-type-specific data
DispatcherObjects
Thread Objects
WaitBlockListWaitBlockList
Wait blocks
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
Key TypeNext link
List entry
ObjectThread
7979
34 Windows APIs for Synchronization and IPC Windows API constructs for synchronization and interprocess communication
Synchronization
ndash Critical sections
ndash Mutexes
ndash Semaphores
ndash Event objects
Synchronization through interprocess communication
ndash Anonymous pipes
ndash Named pipes
ndash Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec ) VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec )
8181
Critical Section Example
counter is global shared by all threads
volatile int counter = 0
CRITICAL_SECTION crit
InitializeCriticalSection ( ampcrit )
hellip main loop in any of the threads
while (done)
_try
EnterCriticalSection ( ampcrit )
counter += local_value
LeaveCriticalSection ( ampcrit )
_finally LeaveCriticalSection ( ampcrit )
DeleteCriticalSection( ampcrit )
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
ndash Processes
ndash Threads
ndash Files
ndash Console input
File change notificationsFile change notifications
MutexesMutexes
Events (auto-reset + manual-reset)Events (auto-reset + manual-reset)
Waitable timersWaitable timers
DWORD WaitForSingleObject( HANDLE hObject DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects LPHANDLE lpHandles BOOL bWaitAll DWORD dwTimeout )
8383
Wait Functions - Details
WaitForSingleObject()ndash hObject specifies kernel object
ndash dwTimeout specifies wait time in msecdwTimeout == 0 - no wait check whether object is signaled
dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()ndash cObjects lt= MAXIMUM_WAIT_OBJECTS (64)
ndash lpHandles - pointer to array identifying these objects
ndash bWaitAll - whether to wait for first signaled object or all objectsFunction returns index of first signaled object
Side effectsndash Mutexes auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
HANDLE OpenMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
BOOL ReleaseMutex( HANDLE hMutex )
8686
Mutex Example
counter is global shared by all threads
volatile int done counter = 0
HANDLE mutex = CreateMutex( NULL FALSE NULL )
main loop in any of the threads ret is local
DWORD ret
while (done)
ret = WaitForSingleObject( mutex INFINITE )
if (ret == WAIT_OBJECT_0)
counter += local_value
else mutex was abandoned
break exit the loop
ReleaseMutex( mutex )
CloseHandle( mutex )
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexesndash Synchronization among threads in same process
Five basic functionsndash pthread_mutex_init()
ndash pthread_mutex_destroy()
ndash pthread_mutex_lock()
ndash pthread_mutex_unlock()
ndash pthread_mutex_trylock()
Comparisonndash pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex )
ndash pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
8888
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases semaphore count by 1
– ReleaseSemaphore() may increment count by any value
– ReleaseSemaphore() returns old semaphore count
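The counting behaviour described above can be modelled outside Windows. The sketch below is not the Win32 implementation: it rebuilds the same semantics (signaled while count > 0, release by n, old count reported, release refused when the maximum would be exceeded) on a pthread mutex and condition variable. All `csem_*` names are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical counting semaphore with a maximum count. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  nonzero;   /* signaled when count becomes > 0 */
    long count;
    long max;
} counting_sem;

/* Returns 0 on success, -1 for an invalid initial/max combination. */
int csem_init(counting_sem *s, long initial, long max)
{
    assert(s != NULL);
    if (max <= 0 || initial < 0 || initial > max)
        return -1;
    pthread_mutex_init(&s->mu, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = initial;
    s->max = max;
    return 0;
}

/* Blocking wait: sleeps until count > 0, then takes one unit. */
void csem_wait(counting_sem *s)
{
    pthread_mutex_lock(&s->mu);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->mu);
    s->count--;
    pthread_mutex_unlock(&s->mu);
}

/* Zero-timeout wait: returns 1 if a unit was taken, 0 if count was 0. */
int csem_trywait(counting_sem *s)
{
    int taken = 0;
    pthread_mutex_lock(&s->mu);
    if (s->count > 0) {
        s->count--;
        taken = 1;
    }
    pthread_mutex_unlock(&s->mu);
    return taken;
}

/* Adds n units and stores the old count in *previous (if non-NULL).
   Returns -1 and leaves the count unchanged if max would be exceeded. */
int csem_release(counting_sem *s, long n, long *previous)
{
    int rc = -1;
    pthread_mutex_lock(&s->mu);
    if (n > 0 && n <= s->max - s->count) {
        if (previous != NULL)
            *previous = s->count;
        s->count += n;
        pthread_cond_broadcast(&s->nonzero);   /* several waiters may proceed */
        rc = 0;
    }
    pthread_mutex_unlock(&s->mu);
    return rc;
}
```

A pool of three identical resources would use `csem_init(&s, 3, 3)`, with each user calling `csem_wait()` before taking a resource and `csem_release(&s, 1, NULL)` after returning it.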
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal() - one thread
– pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
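Since there is no exact equivalent, a manual-reset event is usually approximated by pairing a condition variable with a flag that remembers the signaled state. The following is a minimal sketch under that assumption; `manual_event`, the `event_*` functions, and `release_waiters` are hypothetical names, not part of pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical manual-reset event: the flag keeps the state,
   the condition variable wakes the waiters. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    int signaled;
} manual_event;

void event_init(manual_event *e, int initial_state)
{
    assert(e != NULL);
    pthread_mutex_init(&e->mu, NULL);
    pthread_cond_init(&e->cv, NULL);
    e->signaled = (initial_state != 0);
}

void event_destroy(manual_event *e)
{
    pthread_cond_destroy(&e->cv);
    pthread_mutex_destroy(&e->mu);
}

/* Like SetEvent() on a manual-reset event: stays signaled, wakes everyone. */
void event_set(manual_event *e)
{
    pthread_mutex_lock(&e->mu);
    e->signaled = 1;
    pthread_cond_broadcast(&e->cv);
    pthread_mutex_unlock(&e->mu);
}

/* Like ResetEvent(): later waiters block again. */
void event_reset(manual_event *e)
{
    pthread_mutex_lock(&e->mu);
    e->signaled = 0;
    pthread_mutex_unlock(&e->mu);
}

/* Like a wait with INFINITE timeout; the loop guards against spurious wakeups. */
void event_wait(manual_event *e)
{
    pthread_mutex_lock(&e->mu);
    while (!e->signaled)
        pthread_cond_wait(&e->cv, &e->mu);
    pthread_mutex_unlock(&e->mu);
}

static void *waiter(void *p)
{
    event_wait(p);
    return p;                          /* reports that the wait was satisfied */
}

/* Starts nthreads waiters, signals once, returns how many were released. */
int release_waiters(int nthreads)
{
    enum { MAX_WAITERS = 16 };
    pthread_t tid[MAX_WAITERS];
    manual_event ev;
    int started = 0, released = 0;

    if (nthreads < 1 || nthreads > MAX_WAITERS)
        return -1;
    event_init(&ev, 0);
    while (started < nthreads &&
           pthread_create(&tid[started], NULL, waiter, &ev) == 0)
        started++;
    event_set(&ev);                    /* one signal releases all waiters */
    for (int i = 0; i < started; i++) {
        void *ret = NULL;
        pthread_join(tid[i], &ret);
        if (ret == &ev)
            released++;
    }
    event_destroy(&ev);
    return released;
}
```

Because the flag stays set, a thread that starts waiting after `event_set()` still passes, which is the manual-reset behaviour a bare `pthread_cond_broadcast()` cannot provide.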
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates prog1 and prog2 and connects them through a pipe]
Half-duplex, character-based IPC
cbPipe: pipe size in bytes; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
9494
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
9595
Pipe example (cont'd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
9696
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same pipe name, each over its own instance
– Server can respond to client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (the . stands for the local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (cont'd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
– Closing handle to last instance deletes named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile(), ReadFile() sequence on an open pipe handle into one call
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile(), WriteFile(), ReadFile(), CloseHandle() into one call
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if client connected between CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe() to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked clientserver scenarios
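For comparison with the named-pipe client and server, a one-way FIFO exchange on UNIX can be sketched as follows. This is an illustration only: `fifo_roundtrip` is a hypothetical helper, the FIFO path is chosen by the caller, and a forked child plays the writing client while the parent plays the reading server.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Sends msg through a FIFO at path from a child process to the parent.
   Returns the number of bytes stored in buf (NUL-terminated), or -1 on error. */
ssize_t fifo_roundtrip(const char *path, const char *msg, char *buf, size_t buflen)
{
    ssize_t total = 0, n = 0;
    int fd, status = 0;
    pid_t pid;

    assert(path != NULL && msg != NULL && buf != NULL);
    if (buflen == 0)
        return -1;
    unlink(path);                          /* drop a stale FIFO, errors ignored */
    if (mkfifo(path, 0600) != 0)           /* counterpart of CreateNamedPipe() */
        return -1;

    pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {                        /* child: the writing client */
        size_t len = strlen(msg);
        int wfd = open(path, O_WRONLY);    /* blocks until a reader has opened */
        if (wfd < 0)
            _exit(1);
        ssize_t w = write(wfd, msg, len);
        close(wfd);
        _exit(w == (ssize_t)len ? 0 : 1);
    }

    fd = open(path, O_RDONLY);             /* parent: the reading server */
    if (fd < 0) {
        kill(pid, SIGKILL);                /* child may be stuck in open() */
        waitpid(pid, NULL, 0);
        unlink(path);
        return -1;
    }
    /* The FIFO is a byte stream: read until the writer closes its end. */
    while ((size_t)total < buflen - 1 &&
           (n = read(fd, buf + total, buflen - 1 - (size_t)total)) > 0)
        total += n;
    buf[total] = '\0';
    close(fd);
    waitpid(pid, &status, 0);
    unlink(path);
    if (n < 0 || !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return total;
}
```

The data flows in one direction only; a reply would need a second FIFO, which is the half-duplex restriction listed above.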
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many comm.)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates mailslot with CreateMailslot()
– Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in network domain
107107
Locate a server via mailslot
[Figure: application clients 0 to n act as mailslot servers; the application server acts as the mailslot client and sends its status message periodically]
Mailslot servers (app client 0 ... app client n):
hMS = CreateMailslot( "\\.\mailslot\status" ... );
ReadFile( hMS, &ServStat ... ); /* connect to server */
Mailslot client (app server):
while (...) { Sleep(...);
    hMS = CreateFile( "\\*\mailslot\status" ... );
    ...
    WriteFile( hMS, &StatInfo ... );
}
108108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name - retrieve handle for local mailslot
– \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max. message length 400 bytes
– Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life
7070
Asynchronous Procedure Calls (APCs)
"Ordinary" kernel APCs
– Always deliverable if at IRQL 0, unless explicitly disabled (disable with KeEnterCriticalRegion)
User mode APCs
– Used for I/O completion callback routines (see ReadFileEx, WriteFileEx); also QueueUserAPC
– Only deliverable when thread is in "alertable wait"
7171
Asynchronous Procedure Calls (APCs)
[Figure: a thread object with its kernel-mode (K) and user-mode (U) queues of APC objects]
7272
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
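The four rules above amount to a small decision function. The sketch below only restates them in C; it is not kernel code, and `charge_bucket`, `classify_tick`, and `tally_ticks` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

/* Where one clock tick is charged, following the four rules above. */
typedef enum {
    CHARGE_THREAD_USER_OR_KERNEL,   /* IRQL < 2 */
    CHARGE_DPC,                     /* IRQL = 2, processing a DPC */
    CHARGE_THREAD_KERNEL,           /* IRQL = 2, not processing a DPC */
    CHARGE_INTERRUPT,               /* IRQL > 2 */
    CHARGE_BUCKET_COUNT
} charge_bucket;

/* irql: interrupt request level when the clock fired;
   in_dpc: nonzero if a DPC was being processed at that moment. */
charge_bucket classify_tick(int irql, int in_dpc)
{
    assert(irql >= 0);
    if (irql < 2)
        return CHARGE_THREAD_USER_OR_KERNEL;
    if (irql == 2)
        return in_dpc ? CHARGE_DPC : CHARGE_THREAD_KERNEL;
    return CHARGE_INTERRUPT;
}

/* Adds one tick per sample to totals[], indexed by charge_bucket. */
void tally_ticks(const int *irqls, const int *in_dpc, size_t n,
                 long totals[CHARGE_BUCKET_COUNT])
{
    for (size_t i = 0; i < n; i++)
        totals[classify_tick(irqls[i], in_dpc[i])]++;
}
```

Only the first and third buckets end up in a thread's CPU time, so a system that spends most ticks in the other two looks busy without any process appearing to run.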
7373
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, it must be due to interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts & DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add Context Switch Delta column
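The quirk can be reproduced with a toy model of tick-based accounting. The model below is an assumption made for illustration, not how the kernel is written: a thread runs for `run_len` time units starting `offset` units after every clock tick, and the whole tick period is charged to the thread only if it is running at the instant the clock fires. All function names are hypothetical.

```c
#include <assert.h>

/* Nonzero if the modelled thread is on the CPU at time t. */
static int is_running(long t, long period, long offset, long run_len)
{
    long phase = t % period;
    return phase >= offset && phase < offset + run_len;
}

/* CPU time the thread really used in [0, total). */
long actual_run_time(long period, long offset, long run_len, long total)
{
    long used = 0;
    assert(period > 0 && offset >= 0 && run_len >= 0 && offset + run_len <= period);
    for (long t = 0; t < total; t++)
        if (is_running(t, period, offset, run_len))
            used++;
    return used;
}

/* CPU time charged by sampling only at clock ticks (t = 0, period, 2*period, ...). */
long sampled_charge(long period, long offset, long run_len, long total)
{
    long charged = 0;
    assert(period > 0 && offset >= 0 && run_len >= 0 && offset + run_len <= period);
    for (long t = 0; t < total; t += period)
        if (is_running(t, period, offset, run_len))
            charged += period;
    return charged;
}
```

With a period of 10, a thread that starts one unit after each tick and runs for 8 units uses 80 of 100 units but is charged 0; a thread that runs only for the first unit of each period uses 10 units and is charged all 100.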
7676
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
7777
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in terms of what changes their state
– wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
7878
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor…)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects point to their WaitBlockList; each wait block (List entry, Object, Thread, Key, Type, Next link) links the waiting thread to a dispatcher object, whose header (Size, Type, State, Wait list head) is followed by object-type-specific data]
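The roles of Type and Key can be illustrated with a simplified model. The structures below are hypothetical stand-ins that keep only the fields named on the slides; the real kernel definitions differ, and `build_wait_chain` and `wait_satisfied` are invented for this sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef enum { WAIT_TYPE_ANY, WAIT_TYPE_ALL } wait_type;

/* Simplified dispatcher header: only the state matters for this model. */
typedef struct {
    int type;        /* object type, unused here */
    int signaled;    /* state: nonzero = signaled */
} dispatcher_header;

/* Simplified wait block: one per object passed to a wait call. */
typedef struct wait_block {
    dispatcher_header *object;   /* what is being waited for */
    int thread_id;               /* stands in for the thread pointer */
    int key;                     /* position in the wait call's argument list */
    wait_type type;              /* wait for "any" or "all" */
    struct wait_block *next;     /* next block from the same wait call */
} wait_block;

/* Chains blocks[0..n-1] for one wait call on objects[0..n-1]. */
wait_block *build_wait_chain(wait_block *blocks, dispatcher_header *objects,
                             int n, wait_type type, int thread_id)
{
    assert(blocks != NULL && objects != NULL);
    if (n <= 0)
        return NULL;
    for (int i = 0; i < n; i++) {
        blocks[i].object = &objects[i];
        blocks[i].thread_id = thread_id;
        blocks[i].key = i;
        blocks[i].type = type;
        blocks[i].next = (i + 1 < n) ? &blocks[i + 1] : NULL;
    }
    return blocks;
}

/* Returns the key of the first signaled object for an "any" wait,
   0 for a satisfied "all" wait, and -1 if the thread must keep waiting. */
int wait_satisfied(const wait_block *chain)
{
    if (chain == NULL)
        return -1;
    if (chain->type == WAIT_TYPE_ANY) {
        for (const wait_block *b = chain; b != NULL; b = b->next)
            if (b->object->signaled)
                return b->key;
        return -1;
    }
    for (const wait_block *b = chain; b != NULL; b = b->next)
        if (!b->object->signaled)
            return -1;
    return 0;
}
```

The key returned for an "any" wait corresponds to the index that WaitForMultipleObjects reports relative to WAIT_OBJECT_0.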
7979
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
– Critical sections
– Mutexes
– Semaphores
– Event objects
Synchronization through interprocess communication
– Anonymous pipes
– Named pipes
– Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
8181
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once per enter, even if the body raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
ndash Processes
ndash Threads
ndash Files
ndash Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
8383
Wait Functions - Details
WaitForSingleObject()
– hObject specifies kernel object
– dwTimeout specifies wait time in msec
– dwTimeout == 0 - no wait, check whether object is signaled
– dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles - pointer to array identifying these objects
– bWaitAll - whether to wait for the first signaled object or for all objects
– With bWaitAll == FALSE, the function returns WAIT_OBJECT_0 plus the index of the first signaled object
Side effects
– Mutexes, auto-reset events, and waitable timers will be reset to non-signaled state after completing wait functions
34 Windows APIs for Synchronization and IPC Windows API constructs for synchronization and interprocess communication
Synchronization
ndash Critical sections
ndash Mutexes
ndash Semaphores
ndash Event objects
Synchronization through interprocess communication
ndash Anonymous pipes
ndash Named pipes
ndash Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec ) VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec )
81
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* … main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once, even if the guarded code raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
82
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
– Processes
– Threads
– Files
– Console input
– File change notifications
– Mutexes
– Events (auto-reset + manual-reset)
– Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
83
Wait Functions - Details
WaitForSingleObject()
– hObject specifies the kernel object
– dwTimeout specifies the wait time in msec
– dwTimeout == 0: no wait, check whether the object is signaled
– dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
– cObjects <= MAXIMUM_WAIT_OBJECTS (64)
– lpHandles: pointer to an array identifying these objects
– bWaitAll: whether to wait for the first signaled object or for all objects
– When waiting for the first signaled object, the return value minus WAIT_OBJECT_0 is the array index of the object that satisfied the wait
Side effects
– Mutexes, auto-reset events, and waitable timers will be reset to the non-signaled state after completing wait functions
84
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex() with the same name
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() closes the handle; the mutex object is freed once its last handle is closed
85
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
86
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else          /* mutex was abandoned */
        break;    /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
87
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in the same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
88
Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases the semaphore count by 1
– ReleaseSemaphore() may increment the count by any positive value, up to the semaphore's maximum
– ReleaseSemaphore() returns the old semaphore count through lpPreviousCount
89
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
90
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– A manual-reset event can signal several threads simultaneously; it must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event (Microsoft's documentation describes PulseEvent() as unreliable and advises against using it)
– An auto-reset event signals a single thread; the event is reset automatically
– fInitialState == TRUE: create the event in the signaled state
91
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
92
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal(): one thread
– pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
93
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe size in bytes; zero == default
Read on a pipe handle will block if the pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
94
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
95
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
96
Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same pipe name, each over its own instance
– Server can respond to a client using the same instance
Pipe can be accessed over the network
– location transparency
Convenience and connection functions
97
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. means the local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
98
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
99
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence on an open pipe handle
101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
– FIFOs are half-duplex
– FIFOs are limited to a single machine
– FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
– Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– A write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
– A client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain
107
Locate a server via mailslot
[Figure: application clients acting as mailslot servers, and an application server acting as the mailslot client]
App client 0 … App client n (mailslot servers):
hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile( hMS, &ServStat ); /* connect to server */
App server (mailslot client):
while (...) { Sleep (...); hMS = CreateFile( "\\*\mailslot\status" ); ... WriteFile( hMS, &StatInfo ); }
Message is sent periodically
108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for so many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieve handle for local mailslot
– \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
– \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
72
IRQLs and CPU Time Accounting
Interval clock timer ISR keeps track of time
Clock ISR time accounting
– If IRQL < 2, charge to thread's user or kernel time
– If IRQL = 2 and processing a DPC, charge to DPC time
– If IRQL = 2 and not processing a DPC, charge to thread kernel time
– If IRQL > 2, charge to interrupt time
73
IRQLs and CPU Time Accounting
Since time spent servicing interrupts is NOT charged to the interrupted thread, if the system is busy but no process appears to be running, the cause must be interrupt-related activity
– Note: time at IRQL 2 or more is charged to the current thread's quantum (to be described)
74
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any thread/process
– Process Explorer shows these as separate processes (not really processes)
– Context switches for these are really the number of interrupts and DPCs
75
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
– Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted
– E.g., one or more threads run and enter a wait state before the clock fires
– Thus threads may run but never get charged
View context switch activity with Process Explorer
– Add the Context Switch Delta column
76
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
77
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
– some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
– others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
– non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
[Figure: dispatcher object layout: Size, Type, State, wait list head, object-type-specific data (see \ntddk\inc\ddk\ntddk.h)]
All dispatcher objects are in one of two states
– "signaled" vs. "nonsignaled"
– when signaled, a wait on the object is satisfied
– different object types differ in what changes their state
– the wait and unwait implementation is common to all types of dispatcher objects
Thoughts Change Life意念改变生活
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
7373
IRQLs and CPU Time Accounting
Since time servicing interrupts are NOT charged to interrupted thread if system is busy but no process appears to be running must be due to interrupt-related activity
ndash Note time at IRQL 2 or more is charged to the current threadrsquos quantum (to be described)
7474
Interrupt Time Accounting
Task Manager includes interrupt and DPC time with the Idle process time
Interrupt activity is not charged to any threadprocess
ndash Process Explorer shows these as separate processesnot really processes
ndash Context switches for these are really of interrupts amp DPCs
7575
Time Accounting Quirks
Looking at total CPU time for each process may not reveal where system has spent its time
CPU time accounting is driven by programmable interrupt timer
ndash Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals NOT accounted
ndash Eg one or more threads run and enter a wait state before clock fires
ndash Thus threads may run but never get charged
View context switch activity with Process Explorer
ndash Add Context Switch Delta column
7676
For waiting threads user-mode utilities only display the wait reason
Example pstat
Looking at Waiting Threads
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
[Figure: dispatcher object layout: a header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: each thread object points to its WaitBlockList; each wait block holds a list entry, Object and Thread pointers, Key, Type and a Next link; each dispatcher object (Size, Type, State, Wait list head, object-type-specific data) chains the wait blocks that refer to it]
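
The relationship between dispatcher headers and wait blocks can be sketched in plain C. This is a simplified model for illustration only, not the real kernel layout from ntddk.h: the struct names, field names and the two helper functions (begin_wait, wait_satisfied) are assumptions made for this sketch.

```c
#include <stddef.h>

enum wait_type { WAIT_ANY, WAIT_ALL };

struct wait_block;

/* Common header shared by every waitable ("dispatcher") object. */
struct dispatcher_header {
    int type;                      /* object type tag */
    int signaled;                  /* nonzero = signaled state */
    struct wait_block *wait_list;  /* wait blocks referring to this object */
};

/* One wait block per handle passed to a wait call. */
struct wait_block {
    struct dispatcher_header *object;   /* what is being waited for */
    int thread_id;                      /* who is waiting */
    int key;                            /* position in the argument list */
    enum wait_type type;                /* wait for any vs. all */
    struct wait_block *next_for_thread; /* chain owned by the waiting thread */
    struct wait_block *next_for_object; /* chain hanging off the object */
};

/* Fill in n wait blocks for one wait call and link them to the thread's
 * chain and to each object's wait list. Returns the head of the chain. */
struct wait_block *begin_wait(struct wait_block *blocks,
                              struct dispatcher_header **objects,
                              size_t n, enum wait_type type, int thread_id)
{
    for (size_t i = 0; i < n; i++) {
        blocks[i].object = objects[i];
        blocks[i].thread_id = thread_id;
        blocks[i].key = (int)i;
        blocks[i].type = type;
        blocks[i].next_for_thread = (i + 1 < n) ? &blocks[i + 1] : NULL;
        blocks[i].next_for_object = objects[i]->wait_list;
        objects[i]->wait_list = &blocks[i];
    }
    return n ? &blocks[0] : NULL;
}

/* Returns the key of the first signaled object for a wait-any, 0 for a
 * satisfied wait-all, and -1 if the thread has to keep waiting. */
int wait_satisfied(const struct wait_block *chain)
{
    int all = 1;
    if (chain == NULL)
        return -1;
    for (const struct wait_block *wb = chain; wb; wb = wb->next_for_thread) {
        if (wb->object->signaled) {
            if (wb->type == WAIT_ANY)
                return wb->key;
        } else {
            all = 0;
        }
    }
    return (chain->type == WAIT_ALL && all) ? 0 : -1;
}
```

The key returned for a wait-any plays the role of the index that WaitForMultipleObjects reports for the first signaled object.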
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
    VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
    VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
    VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
    VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
    BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
    /* counter is global, shared by all threads */
    volatile int counter = 0;
    CRITICAL_SECTION crit;
    InitializeCriticalSection( &crit );
    /* ... main loop in any of the threads */
    while (!done) {
        EnterCriticalSection( &crit );
        __try {
            counter += local_value;
        }
        __finally {
            /* leave exactly once per Enter, even if an exception occurs */
            LeaveCriticalSection( &crit );
        }
    }
    DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
    DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
    DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
  - dwTimeout == 0 - no wait, check whether object is signaled
  - dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles - pointer to array identifying these objects
- bWaitAll - whether to wait for first signaled object or all objects
  - Function returns index of first signaled object (as an offset from WAIT_OBJECT_0)
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
Mutexes
    HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
    HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName );
    BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
    /* counter is global, shared by all threads */
    volatile int done, counter = 0;
    HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
    /* main loop in any of the threads, ret is local */
    DWORD ret;
    while (!done) {
        ret = WaitForSingleObject( mutex, INFINITE );
        if (ret == WAIT_OBJECT_0)
            counter += local_value;
        else /* mutex was abandoned */
            break; /* exit the loop */
        ReleaseMutex( mutex );
    }
    CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
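
For comparison with the Windows mutex example, the same shared-counter pattern can be written with the pthread functions listed above. This is a minimal sketch: NUM_THREADS, ITERATIONS, run_counter_demo and try_lock_once are names chosen for the illustration, and the statically initialized mutex stands in for a pthread_mutex_init() call.

```c
#include <pthread.h>

#define NUM_THREADS 4
#define ITERATIONS  100000

static long counter = 0;   /* shared by all threads */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;

/* Each worker adds 1 to the shared counter ITERATIONS times. */
static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&lock);    /* blocks, like a wait with INFINITE */
        counter++;
        pthread_mutex_unlock(&lock);  /* like ReleaseMutex() */
    }
    return NULL;
}

/* Runs the workers and returns the final counter value, or -1 if a
 * thread could not be created. */
long run_counter_demo(void)
{
    pthread_t tid[NUM_THREADS];
    int started = 0;
    counter = 0;
    while (started < NUM_THREADS &&
           pthread_create(&tid[started], NULL, worker, NULL) == 0)
        started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return started == NUM_THREADS ? counter : -1;
}

/* Non-blocking attempt, like a wait with timeout == 0.
 * Returns 1 if the mutex was free, 0 if another thread holds it. */
int try_lock_once(void)
{
    if (pthread_mutex_trylock(&lock) == 0) {
        pthread_mutex_unlock(&lock);
        return 1;
    }
    return 0;
}
```

One difference worth keeping in mind: a default pthread mutex is not recursive, whereas a Windows mutex may be acquired repeatedly by the thread that already owns it.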
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value
- ReleaseSemaphore() returns old semaphore count
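
The counting rules above can be modeled with a mutex and a condition variable. The sketch below is a portable approximation written with pthreads, not the Windows implementation; struct counting_sem and the csem_* functions are hypothetical names. It mirrors three points from the slide: the semaphore is "signaled" while count > 0, each wait takes one unit, and a release may add several units and reports the previous count.

```c
#include <pthread.h>

/* A counting semaphore with an upper bound. */
struct counting_sem {
    pthread_mutex_t lock;
    pthread_cond_t  nonzero;
    long count;   /* "signaled" while count > 0 */
    long max;
};

/* Returns 1 on success, 0 if the arguments are inconsistent. */
int csem_init(struct counting_sem *s, long initial, long max)
{
    if (initial < 0 || max <= 0 || initial > max)
        return 0;
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = initial;
    s->max = max;
    return 1;
}

/* Wait: block until count > 0, then decrease it by 1. */
void csem_wait(struct counting_sem *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

/* Zero-timeout wait: returns 1 if a unit was taken, 0 otherwise. */
int csem_trywait(struct counting_sem *s)
{
    int taken = 0;
    pthread_mutex_lock(&s->lock);
    if (s->count > 0) {
        s->count--;
        taken = 1;
    }
    pthread_mutex_unlock(&s->lock);
    return taken;
}

/* Release: add n to the count and report the previous count.
 * Returns 0 and leaves the count unchanged if max would be exceeded. */
int csem_release(struct counting_sem *s, long n, long *previous)
{
    int ok = 0;
    pthread_mutex_lock(&s->lock);
    if (n > 0 && s->count + n <= s->max) {
        if (previous)
            *previous = s->count;
        s->count += n;
        pthread_cond_broadcast(&s->nonzero);  /* several waiters may proceed */
        ok = 1;
    }
    pthread_mutex_unlock(&s->lock);
    return ok;
}
```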
Semaphores
    HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
    HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName );
    BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE - create event in signaled state
Events
    HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
    BOOL SetEvent( HANDLE hEvent );
    BOOL ResetEvent( HANDLE hEvent );
    BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
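
Since condition variables carry no state of their own, a manual-reset event has to be approximated with a flag protected by a mutex. The following is one possible sketch, not a standard API: struct manual_event, the mevent_* functions and release_waiters_demo are hypothetical names introduced here.

```c
#include <pthread.h>

struct manual_event {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    int signaled;   /* stays set until mevent_reset() */
};

void mevent_init(struct manual_event *e, int initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent() on a manual-reset event: wake all waiters, stay signaled. */
void mevent_set(struct manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 1;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(). */
void mevent_reset(struct manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = 0;
    pthread_mutex_unlock(&e->lock);
}

/* Block until signaled; the loop guards against spurious wakeups. */
void mevent_wait(struct manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

/* Non-blocking check, like a wait with a zero timeout. */
int mevent_is_set(struct manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    int s = e->signaled;
    pthread_mutex_unlock(&e->lock);
    return s;
}

static void *waiter(void *arg)
{
    mevent_wait((struct manual_event *)arg);
    return arg;
}

/* Start up to 8 waiters, set the event once, count how many return. */
int release_waiters_demo(int n)
{
    struct manual_event e;
    pthread_t tid[8];
    int started = 0, done = 0;
    if (n > 8)
        n = 8;
    mevent_init(&e, 0);
    while (started < n &&
           pthread_create(&tid[started], NULL, waiter, &e) == 0)
        started++;
    mevent_set(&e);   /* one set releases every waiter */
    for (int i = 0; i < started; i++)
        if (pthread_join(tid[i], NULL) == 0)
            done++;
    return done;
}
```

Because the flag stays set, a waiter that arrives after mevent_set() passes straight through, which is the behavior that distinguishes a manual-reset event from a bare pthread_cond_broadcast().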
Anonymous pipes
    BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection using an Anonymous Pipe
    /* Create default size anonymous pipe, handles are inheritable */
    if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
        fprintf(stderr, "Anon pipe create failed\n"); exit(1);
    }
    /* Set output handle to pipe handle, create first process */
    StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
    StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    StartInfoCh1.hStdOutput = hWritePipe;
    StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
        fprintf(stderr, "CreateProc1 failed\n"); exit(2);
    }
    CloseHandle (hWritePipe);
Pipe example (contd)
    /* Repeat (symmetrically) for the second process */
    StartInfoCh2.hStdInput = hReadPipe;
    StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
    StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
            0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
        fprintf(stderr, "CreateProc2 failed\n"); exit(3);
    }
    CloseHandle (hReadPipe);
    /* Wait for both processes to complete */
    WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
    WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
    CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
    CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
    return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using instances of the same pipe
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
Using Named Pipes
    HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. = local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe() creates the named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile()?
    BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
    BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence in one call
Convenience Functions
    BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines CreateFile, WriteFile, ReadFile, CloseHandle in one call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
    BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns false if client connected between CreateNamedPipe() call and ConnectNamedPipe()
Use DisconnectNamedPipe() to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
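
The fixed-size-record advice can be sketched with mkfifo(). The example below is illustrative only: RECORD_LEN, fifo_roundtrip and the FIFO path passed by the caller are assumptions, and a forked child stands in for the client so that the snippet is self-contained. Because a FIFO is a byte stream, the reader loops until a whole record has arrived.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RECORD_LEN 64   /* assumed fixed record size */

/*
 * Create a FIFO at `path`, fork a child that writes one fixed-size
 * record, and read that record back in the parent.
 * Returns 0 on success, -1 on failure. `out` must hold RECORD_LEN bytes.
 */
int fifo_roundtrip(const char *path, const char *msg, char *out)
{
    char record[RECORD_LEN] = {0};
    strncpy(record, msg, RECORD_LEN - 1);   /* pad the record with zeros */

    unlink(path);                           /* ignore "does not exist" */
    if (mkfifo(path, 0600) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {                         /* child: the writer */
        int wfd = open(path, O_WRONLY);     /* blocks until a reader opens */
        if (wfd < 0)
            _exit(1);
        ssize_t w = write(wfd, record, RECORD_LEN);
        close(wfd);
        _exit(w == RECORD_LEN ? 0 : 1);
    }

    int rc = -1;
    int rfd = open(path, O_RDONLY);         /* blocks until the writer opens */
    if (rfd >= 0) {
        size_t got = 0;
        while (got < RECORD_LEN) {          /* a read may return fewer bytes */
            ssize_t r = read(rfd, out + got, RECORD_LEN - got);
            if (r <= 0)
                break;
            got += (size_t)r;
        }
        close(rfd);
        if (got == RECORD_LEN)
            rc = 0;
    }
    waitpid(pid, NULL, 0);
    unlink(path);
    return rc;
}
```

A real server would keep the request FIFO open and create one response FIFO per client, as the slide describes; this sketch covers only the single-record transfer.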
Client Example using Named Pipe
    WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
    hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hNamedPipe == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "Failure to locate server\n"); exit(3);
    }
    /* Write the request */
    WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
    /* Read each response and send it to std out */
    while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
        printf ("%s", ResponseRecord);
    CloseHandle (hNamedPipe);
    return 0;
Server Example Using a Named Pipe
    hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
    while (!Done) {
        printf ("Server is awaiting next request\n");
        if (!ConnectNamedPipe (hNamedPipe, NULL)
                || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
            fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
        }
        printf ("Request is: %s\n", RequestRecord);
        /* Send the file, one line at a time, to the client */
        fp = fopen (File, "r");
        while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
            WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
        fclose (fp);
        DisconnectNamedPipe (hNamedPipe);
    } /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
Locate a server via mailslot
[Figure: the application server acts as mailslot client and periodically broadcasts its status; application clients 0..n act as mailslot servers and read it]
Mailslot servers (App client 0 ... App client n):
    hMS = CreateMailslot( "\\.\mailslot\status" );
    ReadFile( hMS, &ServStat );
    /* connect to server */
Mailslot client (App Server); message is sent periodically:
    while (...) {
        Sleep(...);
        hMS = CreateFile( "\\*\mailslot\status" );
        WriteFile( hMS, &StatInfo );
    }
Creating a mailslot
    HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0 - immediate return
- MAILSLOT_WAIT_FOREVER - infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
Time Accounting Quirks
Looking at the total CPU time for each process may not reveal where the system has spent its time
CPU time accounting is driven by the programmable interrupt timer
- Normally 10 msec (15 msec on some MP Pentiums)
Thread execution and context switches between clock intervals are NOT accounted for
- E.g., one or more threads run and enter a wait state before the clock fires
- Thus threads may run but never get charged
View context switch activity with Process Explorer
- Add the Context Switch Delta column
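The accounting gap described above can be illustrated with a toy simulation. This is a minimal sketch, not Windows code: the `run_interval` record and the helpers `charge_by_tick` and `actual_us` are hypothetical names, and the only figure taken from the slide is the 10 msec clock interval. One whole tick is charged to whichever thread happens to be running when the clock fires.

```c
#include <stddef.h>

/* Toy model (an assumption for illustration, not Windows code):
 * a thread occupies the CPU during [start_us, end_us). */
struct run_interval {
    int  thread_id;
    long start_us;
    long end_us;
};

#define TICK_US 10000L  /* 10 msec clock interval, as on the slide */

/* Charge one full tick to whichever thread is running at each clock
 * interrupt (t = TICK_US, 2*TICK_US, ...) up to and including total_us.
 * charged_us[] is indexed by thread_id and must be zeroed by the caller. */
void charge_by_tick(const struct run_interval *runs, size_t n,
                    long total_us, long *charged_us)
{
    for (long t = TICK_US; t <= total_us; t += TICK_US) {
        for (size_t i = 0; i < n; i++) {
            if (runs[i].start_us <= t && t < runs[i].end_us) {
                charged_us[runs[i].thread_id] += TICK_US;
                break;
            }
        }
    }
}

/* CPU time one thread really consumed, for comparison with the charge. */
long actual_us(const struct run_interval *runs, size_t n, int thread_id)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        if (runs[i].thread_id == thread_id)
            sum += runs[i].end_us - runs[i].start_us;
    return sum;
}
```

A thread that always finishes its work and enters a wait state before the next tick is charged nothing in this model, no matter how much CPU time it really used.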
Looking at Waiting Threads
For waiting threads, user-mode utilities only display the wait reason
Example: pstat
Wait Internals 1: Dispatcher Objects
Any kernel object you can wait for is a "dispatcher object"
- Some exist exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- Others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- Non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "Signaled" vs. "nonsignaled"
- When signaled, a wait on the object is satisfied
- Different object types differ in terms of what changes their state
- The wait and unwait implementation is common to all types of dispatcher objects
[Figure: a dispatcher object consists of a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Wait Internals 2: Wait Blocks
Wait blocks represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates a wait for "any" or "all"
Key denotes the argument list position for WaitForMultipleObjects
[Figure: each thread object points to its WaitBlockList; every wait block holds a list entry, Object and Thread pointers, Key, Type and a Next link, and is queued on the wait list head of a dispatcher object (header: Size, Type, State, Wait list head, followed by object-type-specific data)]
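The relationship between dispatcher headers and wait blocks can be sketched with a simplified data model. The structures below are illustrative assumptions only: they are not the definitions from ntddk.h, and every name (`dispatcher_header`, `wait_block`, `wait_satisfied`) is hypothetical. The sketch shows how the Key and Type fields let one generic routine decide whether an "any" or "all" wait is satisfied.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified illustration only: these are NOT the real kernel
 * definitions; all names and fields here are hypothetical. */
enum wait_type { WAIT_TYPE_ANY, WAIT_TYPE_ALL };

struct wait_block;

/* Common header shared by every waitable ("dispatcher") object. */
struct dispatcher_header {
    bool signaled;                 /* signaled vs. nonsignaled */
    struct wait_block *wait_list;  /* wait blocks queued on this object */
};

/* One wait block per handle passed to a wait call. */
struct wait_block {
    struct dispatcher_header *object;   /* what is being waited for */
    int thread_id;                      /* who is waiting */
    int key;                            /* position in the argument list */
    enum wait_type type;                /* wait for "any" or "all" */
    struct wait_block *next_for_thread; /* chains the blocks of one wait call */
    struct wait_block *next_on_object;  /* chains the waiters on one object */
};

/* Decide whether the wait described by a thread's chain of wait blocks
 * is satisfied. Returns the key of the first signaled object for an
 * "any" wait, 0 once every object is signaled for an "all" wait, and
 * -1 if the thread must keep waiting. */
int wait_satisfied(const struct wait_block *chain)
{
    if (chain == NULL)
        return -1;
    if (chain->type == WAIT_TYPE_ANY) {
        for (const struct wait_block *b = chain; b; b = b->next_for_thread)
            if (b->object->signaled)
                return b->key;
        return -1;
    }
    for (const struct wait_block *b = chain; b; b = b->next_for_thread)
        if (!b->object->signaled)
            return -1;
    return 0;
}
```

Because the check only touches the common header, the same routine works for events, mutexes, processes and any other waitable object type.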
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
    EnterCriticalSection( &crit );
    __try {
        counter += local_value;
    }
    __finally {
        /* leave exactly once, even if the guarded block raises an exception */
        LeaveCriticalSection( &crit );
    }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
  - dwTimeout == 0: no wait, only check whether the object is signaled
  - dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to an array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects
  - When waiting for any object, the function returns WAIT_OBJECT_0 plus the array index of the signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to the non-signaled state after completing wait functions
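Interpreting the return value of a wait on several handles is a common source of mistakes, so a small decoder may help. The sketch below is plain C that compiles without windows.h: the numeric values are mirrored from the Windows SDK constants WAIT_OBJECT_0, WAIT_ABANDONED_0, WAIT_TIMEOUT and WAIT_FAILED, while the `_VAL` macro names, the `wait_outcome` enum and `decode_wait_result` are hypothetical helpers introduced only for this illustration.

```c
#include <stdint.h>

/* Values mirrored from the Windows SDK headers so that this sketch
 * compiles without <windows.h>; on Windows, use the SDK macros. */
#define WAIT_OBJECT_0_VAL     0x00000000u
#define WAIT_ABANDONED_0_VAL  0x00000080u
#define WAIT_TIMEOUT_VAL      0x00000102u
#define WAIT_FAILED_VAL       0xFFFFFFFFu

enum wait_outcome {
    OUTCOME_SIGNALED,
    OUTCOME_ABANDONED,
    OUTCOME_TIMEOUT,
    OUTCOME_FAILED
};

/* Classify the return value of a wait on `count` handles (count <= 64)
 * made with bWaitAll == FALSE. For the SIGNALED and ABANDONED outcomes,
 * *index receives the position of the handle in the array. */
enum wait_outcome decode_wait_result(uint32_t ret, uint32_t count,
                                     uint32_t *index)
{
    if (ret < WAIT_OBJECT_0_VAL + count) {
        *index = ret - WAIT_OBJECT_0_VAL;
        return OUTCOME_SIGNALED;
    }
    if (ret >= WAIT_ABANDONED_0_VAL && ret < WAIT_ABANDONED_0_VAL + count) {
        *index = ret - WAIT_ABANDONED_0_VAL;
        return OUTCOME_ABANDONED;
    }
    if (ret == WAIT_TIMEOUT_VAL)
        return OUTCOME_TIMEOUT;
    return OUTCOME_FAILED;
}
```

On Windows the same logic would take the DWORD returned by the wait call directly; the point is that the index has to be recovered by subtracting the base constant.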
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free the mutex object
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
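For comparison with the Windows mutex example, the same shared-counter pattern can be written with pthreads. This is a minimal sketch under stated assumptions: the `worker_args` structure, the `worker` routine and the `run_counter_demo` driver are hypothetical names chosen for illustration, and the limit of 16 threads is arbitrary.

```c
#include <pthread.h>
#include <stddef.h>

/* Shared counter protected by a mutex, mirroring the Windows example. */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;
static long counter = 0;

struct worker_args {
    long local_value;  /* amount added per iteration */
    int  iterations;
};

static void *worker(void *p)
{
    const struct worker_args *a = p;
    for (int i = 0; i < a->iterations; i++) {
        pthread_mutex_lock(&counter_lock);    /* blocks, like an INFINITE wait */
        counter += a->local_value;
        pthread_mutex_unlock(&counter_lock);  /* like ReleaseMutex() */
    }
    return NULL;
}

/* Run up to 16 workers and return the final counter value,
 * or -1 if not every thread could be created. */
long run_counter_demo(int nthreads, long local_value, int iterations)
{
    pthread_t tids[16];
    struct worker_args args = { local_value, iterations };
    int started = 0;

    if (nthreads > 16)
        nthreads = 16;
    counter = 0;
    for (; started < nthreads; started++)
        if (pthread_create(&tids[started], NULL, worker, &args) != 0)
            break;
    /* Join every thread that was started before touching the result. */
    for (int i = 0; i < started; i++)
        pthread_join(tids[i], NULL);
    return started == nthreads ? counter : -1;
}
```

Without the lock and unlock calls, concurrent increments could be lost and the final value would typically fall short of nthreads * iterations * local_value.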
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value
- ReleaseSemaphore() reports the old semaphore count through lpPreviousCount
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- A manual-reset event can signal several threads simultaneously; it must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- An auto-reset event signals a single thread; the event is reset automatically
- fInitialState == TRUE: create the event in the signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
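Although pthreads has no direct counterpart, a manual-reset event can be approximated by combining a mutex, a condition variable and a flag. The following is one possible sketch, not a standard API: the `manual_event` type and its functions are hypothetical names introduced here, and error checking of the pthread calls is omitted for brevity.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical emulation of a manual-reset event; the type and function
 * names are made up for this sketch and are not part of POSIX. */
struct manual_event {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
};

void manual_event_init(struct manual_event *e, bool initial_state)
{
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

/* Like SetEvent(): the event stays signaled until explicitly reset. */
void manual_event_set(struct manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);  /* wake every waiter */
    pthread_mutex_unlock(&e->lock);
}

/* Like ResetEvent(). */
void manual_event_reset(struct manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

/* Like WaitForSingleObject(h, INFINITE) on the event. */
void manual_event_wait(struct manual_event *e)
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)               /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

void manual_event_destroy(struct manual_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}
```

The flag is what makes the state persistent: a thread that calls `manual_event_wait` after the event was set returns immediately, which a bare condition variable would not provide.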
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe size in bytes; zero == default
A read on the pipe handle will block if the pipe is empty
A write operation to a full pipe will block
Anonymous pipes are one-way
I/O Redirection Using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
Named Pipes
Message oriented
- The reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same pipe name
- The server can respond to a client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (the "." stands for the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd.)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage);
Named Pipe Client Connections
CreateFile with the named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- The first method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
Combines a WriteFile, ReadFile sequence into a single call
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
Combines CreateFile, WriteFile, ReadFile and CloseHandle into a single call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic (for writes, up to PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
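The fixed-size record convention for FIFOs can be sketched as follows. This is an illustration under stated assumptions rather than a complete server: the `request` layout, the helper name `fifo_roundtrip` and the use of a forked child as the client are all choices made for this example. The parent plays the server and reads exactly one record from the FIFO.

```c
#include <fcntl.h>
#include <signal.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fixed-size request record: because a FIFO is a byte stream, a fixed
 * record size lets the reader find message boundaries. The layout is an
 * assumption made for this sketch. */
struct request {
    int  client_id;
    char text[60];
};

/* Create a FIFO at `path`, let a forked child act as the client that
 * sends one record, and have the parent act as the server that reads it.
 * Returns 0 on success and -1 on any error. */
int fifo_roundtrip(const char *path, const struct request *out,
                   struct request *in)
{
    unlink(path);                        /* ignore errors: may not exist */
    if (mkfifo(path, 0600) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {                      /* child: the client */
        int wfd = open(path, O_WRONLY);  /* blocks until a reader opens */
        if (wfd < 0)
            _exit(1);
        ssize_t n = write(wfd, out, sizeof *out);
        close(wfd);
        _exit(n == (ssize_t)sizeof *out ? 0 : 1);
    }

    int ok = -1;
    int rfd = open(path, O_RDONLY);      /* parent: the server */
    if (rfd < 0) {
        kill(pid, SIGKILL);              /* do not leave the client blocked */
    } else {
        size_t got = 0;
        /* A read may return fewer bytes than requested; loop until the
         * whole record has arrived or the writer closes the FIFO. */
        while (got < sizeof *in) {
            ssize_t n = read(rfd, (char *)in + got, sizeof *in - got);
            if (n <= 0)
                break;
            got += (size_t)n;
        }
        if (got == sizeof *in)
            ok = 0;
        close(rfd);
    }
    waitpid(pid, NULL, 0);
    unlink(path);
    return ok;
}
```

A reply would need a second FIFO opened in the opposite direction, which is the per-client response FIFO the comparison above refers to.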
Client Example Using a Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
}
/* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- A write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- A client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
Locate a server via mailslot
[Figure: application clients 0 to n act as mailslot servers; each creates the mailslot named "status" with CreateMailslot(), reads the server status with ReadFile(hMS, &ServStat) and then connects to the application server. The application server acts as the mailslot client: in a loop it sleeps, opens the "status" mailslot with CreateFile() and writes its status with WriteFile(hMS, &StatInfo). The message is sent periodically.]
Creating a mailslot
HANDLE CreateMailslot(LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa);
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieves a handle for a local mailslot
- \\host\mailslot\[path]name: retrieves a handle for a mailslot on the specified host
- \\domain\mailslot\[path]name: returns a handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns a handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
Wait Internals 1: Dispatcher Objects
[Figure: dispatcher object layout: a common header (Size, Type, State, Wait list head) followed by object-type-specific data; see \ntddk\inc\ddk\ntddk.h]
Any kernel object you can wait for is a "dispatcher object"
- some exclusively for synchronization, e.g. events, mutexes ("mutants"), semaphores, queues, timers
- others can be waited for as a side effect of their prime function, e.g. processes, threads, file objects
- non-waitable kernel objects are called "control objects"
All dispatcher objects have a common header
All dispatcher objects are in one of two states
- "signaled" vs. "nonsignaled"
- when signaled, a wait on the object is satisfied
- different object types differ in terms of what changes their state
- wait and unwait implementation is common to all types of dispatcher objects
Wait Internals 2: Wait Blocks
Represent a thread's reference to something it's waiting for (one per handle passed to WaitFor...)
All wait blocks from a given wait call are chained to the waiting thread
Type indicates wait for "any" or "all"
Key denotes argument list position for WaitForMultipleObjects
[Figure: thread objects point through WaitBlockList to their chain of wait blocks (fields: List entry, Object, Thread, Key, Type, Next link); each wait block is also queued on the wait list head of the dispatcher object (Size, Type, State, Wait list head, object-type-specific data) it refers to]
3.4 Windows APIs for Synchronization and IPC
Windows API constructs for synchronization and interprocess communication
Synchronization
- Critical sections
- Mutexes
- Semaphores
- Event objects
Synchronization through interprocess communication
- Anonymous pipes
- Named pipes
- Mailslots
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times; however, the number of Enter and Leave operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section, other than attempting entry with TryEnterCriticalSection()
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec );
VOID DeleteCriticalSection( LPCRITICAL_SECTION sec );
VOID EnterCriticalSection( LPCRITICAL_SECTION sec );
VOID LeaveCriticalSection( LPCRITICAL_SECTION sec );
BOOL TryEnterCriticalSection( LPCRITICAL_SECTION sec );
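The rule that Enter and Leave operations must match can be tried out with a small sketch. Win32 calls only run on Windows, so this one swaps in a POSIX recursive mutex as a stand-in for a CRITICAL_SECTION; that substitution, and the names `section`, `probe` and `busy_after`, are assumptions made for illustration and are not part of the slides. A second thread probes with `pthread_mutex_trylock()`, which plays the role of TryEnterCriticalSection().

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <pthread.h>

/* POSIX stand-in for a CRITICAL_SECTION: a recursive mutex. */
static pthread_mutex_t section;

/* Runs in a second thread: try to enter without blocking. */
static void *probe(void *out)
{
    int rc = pthread_mutex_trylock(&section);
    if (rc == 0)
        pthread_mutex_unlock(&section);
    *(int *)out = (rc == EBUSY);   /* 1 when the owner still holds the lock */
    return NULL;
}

/* Enter `enters` times, leave `leaves` times (leaves <= enters), then
 * report whether another thread finds the section busy (1) or free (0). */
int busy_after(int enters, int leaves)
{
    pthread_mutexattr_t attr;
    pthread_t t;
    int busy = 0, i;

    pthread_mutexattr_init(&attr);
    pthread_mutexattr_settype(&attr, PTHREAD_MUTEX_RECURSIVE);
    pthread_mutex_init(&section, &attr);
    pthread_mutexattr_destroy(&attr);

    for (i = 0; i < enters; i++) pthread_mutex_lock(&section);
    for (i = 0; i < leaves; i++) pthread_mutex_unlock(&section);

    pthread_create(&t, NULL, probe, &busy);
    pthread_join(t, NULL);

    /* Balance the remaining entries before destroying the mutex. */
    for (i = leaves; i < enters; i++) pthread_mutex_unlock(&section);
    pthread_mutex_destroy(&section);
    return busy;
}
```

Calling `busy_after(2, 1)` leaves one entry outstanding, so the probing thread is turned away; `busy_after(2, 2)` balances the calls and the probe gets in.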
Critical Section Example
/* counter is global, shared by all threads */
volatile int counter = 0;
CRITICAL_SECTION crit;
InitializeCriticalSection( &crit );
/* ... main loop in any of the threads */
while (!done) {
  EnterCriticalSection( &crit );   /* enter before the guarded block */
  __try {
    counter += local_value;
  }
  __finally {
    /* leave exactly once, even if the guarded block raises an exception */
    LeaveCriticalSection( &crit );
  }
}
DeleteCriticalSection( &crit );
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
- dwTimeout == 0: no wait, check whether object is signaled
- dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to array identifying these objects
- bWaitAll: whether to wait for first signaled object or all objects
- When bWaitAll is FALSE, function returns WAIT_OBJECT_0 plus the array index of the first signaled object
Side effects
- Mutexes, auto-reset events and auto-reset waitable timers will be reset to non-signaled state after completing wait functions
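The return value of WaitForMultipleObjects() packs both an outcome and an array index, which is easy to misread. The sketch below decodes it in portable C. The numeric values mirror the documented Win32 constants WAIT_OBJECT_0, WAIT_ABANDONED_0, WAIT_TIMEOUT and WAIT_FAILED, but they are redefined under hypothetical `WR_` names so the code builds without <windows.h>; `classify_wait` and `wr_kind` are illustrative names, not Win32 API.

```c
#include <stdint.h>

/* Values mirror the documented Win32 constants; the WR_ names are
 * hypothetical so the sketch compiles without <windows.h>. */
#define WR_WAIT_OBJECT_0    0x00000000u
#define WR_WAIT_ABANDONED_0 0x00000080u
#define WR_WAIT_TIMEOUT     0x00000102u
#define WR_WAIT_FAILED      0xFFFFFFFFu

typedef enum { WR_SIGNALED, WR_ABANDONED, WR_TIMEOUT, WR_FAILED, WR_OTHER } wr_kind;

/* Classify a wait result for an array of `count` handles.
 * On WR_SIGNALED or WR_ABANDONED, *index receives the array position. */
wr_kind classify_wait(uint32_t ret, uint32_t count, uint32_t *index)
{
    if (ret == WR_WAIT_FAILED)
        return WR_FAILED;
    if (ret == WR_WAIT_TIMEOUT)
        return WR_TIMEOUT;
    if (ret < WR_WAIT_OBJECT_0 + count) {
        *index = ret - WR_WAIT_OBJECT_0;        /* object that satisfied the wait */
        return WR_SIGNALED;
    }
    if (ret >= WR_WAIT_ABANDONED_0 && ret < WR_WAIT_ABANDONED_0 + count) {
        *index = ret - WR_WAIT_ABANDONED_0;     /* abandoned mutex at this position */
        return WR_ABANDONED;
    }
    return WR_OTHER;                            /* e.g. an alertable-wait result */
}
```

On Windows the same checks would be written against the real constants, with the DWORD returned by the wait function passed in as `ret`.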
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() releases the handle; the mutex object is freed once its last handle is closed
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
Mutex Example
/* counter is global, shared by all threads */
volatile int done = 0, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads; ret is local */
DWORD ret;
while (!done) {
  ret = WaitForSingleObject( mutex, INFINITE );
  if (ret == WAIT_OBJECT_0)
    counter += local_value;
  else        /* mutex was abandoned (or the wait failed) */
    break;    /* exit the loop */
  ReleaseMutex( mutex );
}
CloseHandle( mutex );
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process (by default)
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block: equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling): equivalent to WaitForSingleObject() with timeout == 0
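For comparison with the Win32 mutex example, here is a minimal pthreads sketch of the same shared-counter pattern. The function names `worker` and `run_counter`, the limit of 16 threads, and the increment of 1 per iteration are assumptions chosen for illustration.

```c
#include <pthread.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static long counter;                    /* shared by all threads */

static void *worker(void *arg)
{
    long iterations = *(long *)arg;
    for (long i = 0; i < iterations; i++) {
        pthread_mutex_lock(&lock);      /* blocks, like a wait with INFINITE */
        counter += 1;
        pthread_mutex_unlock(&lock);    /* counterpart of ReleaseMutex() */
    }
    return NULL;
}

/* Run up to 16 threads that each add `iterations` to the counter and
 * return the final value, or -1 for an unsupported thread count. */
long run_counter(int nthreads, long iterations)
{
    pthread_t tid[16];
    int started = 0;

    if (nthreads < 1 || nthreads > 16)
        return -1;
    counter = 0;
    for (int i = 0; i < nthreads; i++)
        if (pthread_create(&tid[started], NULL, worker, &iterations) == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return counter;
}
```

Because every increment happens under the lock, the result equals the number of started threads times `iterations`; without the lock, updates could be lost.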
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any positive value, up to the maximum count
- ReleaseSemaphore() returns old semaphore count through lpPreviousCount
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
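The counting rules above (signaled while count > 0, each wait takes one unit, a release adds several units and reports the old count) can be modelled in portable C. This is a sketch built from a pthread mutex and condition variable, not the Windows implementation; the `csem` type and its functions are hypothetical names. Following the documented ReleaseSemaphore() behaviour, a release that would exceed the maximum fails and leaves the count unchanged.

```c
#include <pthread.h>

/* Hypothetical counting semaphore modelling the rules on the slides. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  nonzero;
    long count, max;
} csem;

/* Returns 1 on success, 0 if the counts are inconsistent. */
int csem_init(csem *s, long initial, long max)
{
    if (max <= 0 || initial < 0 || initial > max)
        return 0;
    pthread_mutex_init(&s->mu, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = initial;
    s->max = max;
    return 1;
}

/* Wait: block until count > 0 (signaled), then take one unit. */
void csem_wait(csem *s)
{
    pthread_mutex_lock(&s->mu);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->mu);
    s->count--;
    pthread_mutex_unlock(&s->mu);
}

/* Release n units and report the previous count. Fails (returns 0),
 * leaving the count unchanged, if n <= 0 or max would be exceeded. */
int csem_release(csem *s, long n, long *previous)
{
    int ok = 0;
    pthread_mutex_lock(&s->mu);
    if (n > 0 && n <= s->max - s->count) {
        if (previous)
            *previous = s->count;
        s->count += n;
        pthread_cond_broadcast(&s->nonzero);   /* wake every blocked waiter */
        ok = 1;
    }
    pthread_mutex_unlock(&s->mu);
    return ok;
}

void csem_destroy(csem *s)
{
    pthread_cond_destroy(&s->nonzero);
    pthread_mutex_destroy(&s->mu);
}
```

A typical use initializes the semaphore to the number of free resources, calls `csem_wait()` before taking one, and `csem_release(&s, 1, NULL)` when handing it back.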
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE: create event in signaled state
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal(): one thread
- pthread_cond_broadcast(): all waiting threads
No exact equivalent to manual-reset events
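A condition variable keeps no state of its own, which is why it has no exact counterpart to a manual-reset event. One common way to approximate such an event is to pair the condition variable with a flag and a mutex, as sketched below. The `mr_event` type, its functions, and the `release_waiters` driver are hypothetical names for illustration; this is one possible emulation, not a definitive one.

```c
#include <pthread.h>

/* Hypothetical manual-reset event: a flag guarded by a mutex, plus a
 * condition variable used only to wake sleepers when the flag is set. */
typedef struct {
    pthread_mutex_t mu;
    pthread_cond_t  cv;
    int signaled;
} mr_event;

void mr_event_init(mr_event *e, int initial_state)
{
    pthread_mutex_init(&e->mu, NULL);
    pthread_cond_init(&e->cv, NULL);
    e->signaled = initial_state;
}

void mr_event_set(mr_event *e)          /* like SetEvent() */
{
    pthread_mutex_lock(&e->mu);
    e->signaled = 1;
    pthread_cond_broadcast(&e->cv);     /* release every current waiter */
    pthread_mutex_unlock(&e->mu);
}

void mr_event_reset(mr_event *e)        /* like ResetEvent() */
{
    pthread_mutex_lock(&e->mu);
    e->signaled = 0;
    pthread_mutex_unlock(&e->mu);
}

void mr_event_wait(mr_event *e)         /* like a wait with INFINITE */
{
    pthread_mutex_lock(&e->mu);
    while (!e->signaled)                /* the flag also covers late arrivals */
        pthread_cond_wait(&e->cv, &e->mu);
    pthread_mutex_unlock(&e->mu);
}

static mr_event gate;
static pthread_mutex_t tally_mu = PTHREAD_MUTEX_INITIALIZER;
static int tally;

static void *waiter(void *unused)
{
    (void)unused;
    mr_event_wait(&gate);
    pthread_mutex_lock(&tally_mu);
    tally++;
    pthread_mutex_unlock(&tally_mu);
    return NULL;
}

/* Start n (1..8) waiters, set the event once, and return how many got through. */
int release_waiters(int n)
{
    pthread_t tid[8];
    int started = 0;

    if (n < 1 || n > 8)
        return -1;
    tally = 0;
    mr_event_init(&gate, 0);
    for (int i = 0; i < n; i++)
        if (pthread_create(&tid[started], NULL, waiter, NULL) == 0)
            started++;
    mr_event_set(&gate);
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    pthread_cond_destroy(&gate.cv);
    pthread_mutex_destroy(&gate.mu);
    return tally;
}
```

Because the flag stays set until `mr_event_reset()` is called, a thread that reaches `mr_event_wait()` after the set still passes, which a bare `pthread_cond_broadcast()` would not guarantee.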
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, which are connected by a pipe]
Half-duplex, character-based IPC
cbPipe: pipe size in bytes; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
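The one-way, byte-stream behaviour of an anonymous pipe can be seen in a few lines. Since CreatePipe() is Windows-only, this sketch swaps in the POSIX `pipe()` call, which likewise yields a read end and a write end; `pipe_roundtrip` and its 512-byte limit are assumptions made so the single write cannot block.

```c
#include <string.h>
#include <unistd.h>

/* Write msg into the write end, close it, then read from the read end
 * until end-of-file. Returns the number of bytes read, or -1 on error. */
long pipe_roundtrip(const char *msg, char *out, size_t outsz)
{
    int fd[2];                          /* fd[0] = read end, fd[1] = write end */
    size_t len = strlen(msg), got = 0;
    ssize_t n;

    /* Keep the message small so one write fits in the pipe buffer. */
    if (outsz == 0 || len >= outsz || len > 512)
        return -1;
    if (pipe(fd) != 0)
        return -1;

    if (write(fd[1], msg, len) != (ssize_t)len) {
        close(fd[0]);
        close(fd[1]);
        return -1;
    }
    close(fd[1]);                       /* reader sees EOF once the data is drained */

    while (got < outsz - 1 &&
           (n = read(fd[0], out + got, outsz - 1 - got)) > 0)
        got += (size_t)n;
    close(fd[0]);
    out[got] = '\0';
    return (long)got;
}
```

Data only ever flows from the write end to the read end; closing the write end is what lets the reader detect the end of the stream instead of blocking on an empty pipe.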
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe; handles are inheritable */
if (!CreatePipe( &hReadPipe, &hWritePipe, &PipeSA, 0 )) {
  fprintf( stderr, "Anon pipe create failed\n" ); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle( STD_INPUT_HANDLE );
StartInfoCh1.hStdError = GetStdHandle( STD_ERROR_HANDLE );
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess( NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1 )) {
  fprintf( stderr, "CreateProc1 failed\n" ); exit(2);
}
CloseHandle( hWritePipe );
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle( STD_ERROR_HANDLE );
StartInfoCh2.hStdOutput = GetStdHandle( STD_OUTPUT_HANDLE );
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess( NULL, (LPTSTR)targv, NULL, NULL, TRUE /* Inherit handles */,
    0, NULL, NULL, &StartInfoCh2, &ProcInfo2 )) {
  fprintf( stderr, "CreateProc2 failed\n" ); exit(3);
}
CloseHandle( hReadPipe );
/* Wait for both processes to complete */
WaitForSingleObject( ProcInfo1.hProcess, INFINITE );
WaitForSingleObject( ProcInfo2.hProcess, INFINITE );
CloseHandle( ProcInfo1.hThread ); CloseHandle( ProcInfo1.hProcess );
CloseHandle( ProcInfo2.hThread ); CloseHandle( ProcInfo2.hProcess );
return 0;
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same pipe name
- Server can respond to client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on remote machine (. means local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
Named Pipes (contd.)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe() creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence into one call
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence into one call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe() to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
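The advice to use fixed-size records on a byte-oriented channel can be sketched as follows. A FIFO and an anonymous POSIX pipe share the same byte-stream semantics, so this example uses `pipe()` to stay self-contained; the record length `REC_LEN` and the helper names are hypothetical. Two records are written back to back and then recovered one at a time, which works only because both sides agree on the record size.

```c
#include <string.h>
#include <unistd.h>

#define REC_LEN 32   /* hypothetical fixed record size agreed by both sides */

/* Read exactly len bytes unless EOF or an error intervenes. */
static size_t read_full(int fd, void *buf, size_t len)
{
    size_t got = 0;
    ssize_t n;
    while (got < len && (n = read(fd, (char *)buf + got, len - got)) > 0)
        got += (size_t)n;
    return got;
}

/* Pad (or truncate) text into one zero-filled REC_LEN-byte record and write it. */
static int write_record(int fd, const char *text)
{
    char rec[REC_LEN] = {0};
    strncpy(rec, text, REC_LEN - 1);
    return write(fd, rec, REC_LEN) == REC_LEN;
}

/* Send two records back to back, then read them back separately.
 * Returns 1 if both records were recovered in full, else 0. */
int records_survive(const char *a, const char *b,
                    char out_a[REC_LEN], char out_b[REC_LEN])
{
    int fd[2], ok;

    if (pipe(fd) != 0)
        return 0;
    ok = write_record(fd[1], a) && write_record(fd[1], b);
    close(fd[1]);
    ok = ok && read_full(fd[0], out_a, REC_LEN) == REC_LEN
            && read_full(fd[0], out_b, REC_LEN) == REC_LEN;
    close(fd[0]);
    return ok;
}
```

A message-mode Windows named pipe preserves message boundaries by itself; on a byte stream the fixed record length is what lets the reader find them.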
Client Example using Named Pipe
WaitNamedPipe( ServerPipeName, NMPWAIT_WAIT_FOREVER );
hNamedPipe = CreateFile( ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL );
if (hNamedPipe == INVALID_HANDLE_VALUE) {
  fprintf( stderr, "Failure to locate server\n" ); exit(3);
}
/* Write the request */
WriteFile( hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL );
/* Read each response and send it to std out */
while (ReadFile( hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL ))
  printf( "%s", ResponseRecord );
CloseHandle( hNamedPipe );
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe( SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA );
while (!Done) {
  printf( "Server is awaiting next request\n" );
  if (!ConnectNamedPipe( hNamedPipe, NULL )
      || !ReadFile( hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL )) {
    fprintf( stderr, "Connect or Read Named Pipe error\n" ); exit(4);
  }
  printf( "Request is %s\n", RequestRecord );
  /* Send the file, one line at a time, to the client */
  fp = fopen( File, "r" );
  while (fgets( ResponseRecord, MAX_RQRS_LEN, fp ) != NULL)
    WriteFile( hNamedPipe, &ResponseRecord,
        (strlen( ResponseRecord ) + 1) * TSIZE, &nXfer, NULL );
  fclose( fp );
  DisconnectNamedPipe( hNamedPipe );
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (Windows 2000: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
Locate a server via mailslot
[Figure: application clients 0 ... n act as mailslot servers; the application server acts as mailslot client]
Mailslot servers (App client 0 ... App client n), each:
  hMS = CreateMailslot( "\\.\mailslot\status" );
  ReadFile( hMS, &ServStat );   /* connect to server */
Mailslot client (App Server); message is sent periodically:
  while (...) {
    Sleep(...);
    hMS = CreateFile( "\\*\mailslot\status" );
    WriteFile( hMS, &StatInfo ... );
  }
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieve handle for local mailslot
- \\host\mailslot\[path]name: retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name: returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
Thoughts Change Life意念改变生活
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
7979
34 Windows APIs for Synchronization and IPC Windows API constructs for synchronization and interprocess communication
Synchronization
ndash Critical sections
ndash Mutexes
ndash Semaphores
ndash Event objects
Synchronization through interprocess communication
ndash Anonymous pipes
ndash Named pipes
ndash Mailslots
8080
Critical Sections
Only usable from within the same process
Critical sections are initialized and deleted but do not have handles
Only one thread at a time can be in a critical section
A thread can enter a critical section multiple times - however the number of Enter- and Leave-operations must match
Leaving a critical section before entering it may cause deadlocks
No way to test whether another thread is in a critical section
VOID InitializeCriticalSection( LPCRITICAL_SECTION sec )VOID DeleteCriticalSection( LPCRITICAL_SECTION sec )
VOID EnterCriticalSection( LPCRITICAL_SECTION sec ) VOID LeaveCriticalSection( LPCRITICAL_SECTION sec )BOOL TryEnterCriticalSection ( LPCRITICAL_SECTION sec )
8181
Critical Section Example
counter is global shared by all threads
volatile int counter = 0
CRITICAL_SECTION crit
InitializeCriticalSection ( ampcrit )
hellip main loop in any of the threads
while (done)
_try
EnterCriticalSection ( ampcrit )
counter += local_value
LeaveCriticalSection ( ampcrit )
_finally LeaveCriticalSection ( ampcrit )
DeleteCriticalSection( ampcrit )
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
ndash Processes
ndash Threads
ndash Files
ndash Console input
File change notificationsFile change notifications
MutexesMutexes
Events (auto-reset + manual-reset)Events (auto-reset + manual-reset)
Waitable timersWaitable timers
DWORD WaitForSingleObject( HANDLE hObject DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects LPHANDLE lpHandles BOOL bWaitAll DWORD dwTimeout )
83
Wait Functions - Details
WaitForSingleObject()
- hObject specifies kernel object
- dwTimeout specifies wait time in msec
  dwTimeout == 0 - no wait, check whether object is signaled
  dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles - pointer to array identifying these objects
- bWaitAll - whether to wait for first signaled object or all objects
  With bWaitAll == FALSE, the return value minus WAIT_OBJECT_0 is the index of the first signaled object
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
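The return-value rule above can be sketched as a small classifier. This is only an illustration: the names wait_kind and classify_wait are hypothetical, and the fallback constants are the documented Windows values, repeated here so the sketch also compiles where <windows.h> is not available.

```c
#include <assert.h>
#include <stdint.h>

/* On Windows these come from <windows.h>; the fallback values are the
   documented constants so the sketch also compiles elsewhere. */
#ifdef _WIN32
#include <windows.h>
#else
typedef uint32_t DWORD;
#define WAIT_OBJECT_0    0x00000000u
#define WAIT_ABANDONED_0 0x00000080u
#define WAIT_TIMEOUT     0x00000102u
#define WAIT_FAILED      0xFFFFFFFFu
#endif

/* Hypothetical result categories, not part of the Windows API. */
typedef enum { WR_SIGNALED, WR_ABANDONED, WR_TIMEOUT, WR_FAILED } wait_kind;

/* Classify the value returned by a WaitForMultipleObjects() call made with
   bWaitAll == FALSE. For WR_SIGNALED and WR_ABANDONED, *index receives the
   position of the object in the handle array. */
wait_kind classify_wait(DWORD ret, DWORD cObjects, DWORD *index)
{
    if (ret == WAIT_TIMEOUT) return WR_TIMEOUT;
    if (ret == WAIT_FAILED)  return WR_FAILED;
    if (ret < WAIT_OBJECT_0 + cObjects) {          /* WAIT_OBJECT_0 is 0 */
        *index = ret - WAIT_OBJECT_0;
        return WR_SIGNALED;
    }
    if (ret >= WAIT_ABANDONED_0 && ret < WAIT_ABANDONED_0 + cObjects) {
        *index = ret - WAIT_ABANDONED_0;           /* owner exited without releasing */
        return WR_ABANDONED;
    }
    return WR_FAILED;                              /* unexpected value */
}
```

A caller would pass the raw return value together with the cObjects it used, then switch on the result instead of comparing against magic numbers inline.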
84
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, second thread (process) calls CreateMutex() or OpenMutex() with the same name
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() closes the handle; the mutex object is freed when its last handle is closed
85
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
86
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else        /* mutex was abandoned */
        break;  /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
87
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in same process (the default)
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
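As a rough POSIX counterpart to the Windows mutex example, the five functions listed above can be exercised in one short routine. This is a minimal sketch, not code from the slides; count_with_mutex, worker and MAX_THREADS are illustrative names.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

#define MAX_THREADS 16            /* illustrative limit for this sketch */

static pthread_mutex_t lock;
static long counter;              /* shared by all threads */

static void *worker(void *arg)
{
    long n = *(const long *)arg;
    for (long i = 0; i < n; i++) {
        pthread_mutex_lock(&lock);      /* blocks, like an INFINITE wait */
        counter++;
        pthread_mutex_unlock(&lock);
    }
    return NULL;
}

/* Runs nthreads workers that each add `increments` to the shared counter.
   Returns the final counter. *try_rc receives the result of one
   pthread_mutex_trylock() made while the mutex is free: 0 on success,
   EBUSY would mean another thread still held it. */
long count_with_mutex(int nthreads, long increments, int *try_rc)
{
    pthread_t tid[MAX_THREADS];
    int started = 0;

    if (nthreads > MAX_THREADS) nthreads = MAX_THREADS;
    counter = 0;
    pthread_mutex_init(&lock, NULL);

    for (int i = 0; i < nthreads; i++)
        if (pthread_create(&tid[started], NULL, worker, &increments) == 0)
            started++;
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);

    *try_rc = pthread_mutex_trylock(&lock);   /* nonblocking, like timeout == 0 */
    if (*try_rc == 0)
        pthread_mutex_unlock(&lock);

    pthread_mutex_destroy(&lock);
    return counter;
}
```

Without the lock/unlock pair around counter++, the final value would usually fall short of nthreads * increments, which is the race the mutex is there to prevent.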
88
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each successful wait decreases semaphore count by 1
- ReleaseSemaphore() may increment count by any value, up to the maximum count
- ReleaseSemaphore() returns old semaphore count through lpPreviousCount
89
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL fInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
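The counting rule can be made concrete with a toy model. To be clear about the assumptions: this is not the Windows kernel object and calls no Windows API. The names sem_model, model_try_wait and model_release are hypothetical, and the code only mirrors the documented behaviour of a zero-timeout wait and of ReleaseSemaphore().

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model of a counting semaphore's state; NOT the kernel object. */
typedef struct {
    long count;   /* current count; "signaled" means count > 0 */
    long max;     /* maximum count fixed at creation (cSemMax) */
} sem_model;

/* Mirrors a wait with timeout 0: succeeds and decrements only when signaled. */
bool model_try_wait(sem_model *s)
{
    if (s->count <= 0)
        return false;          /* non-signaled: a real wait would block or time out */
    s->count--;
    return true;
}

/* Mirrors ReleaseSemaphore(): adds n, reports the previous count, and fails
   without changing the count if the maximum would be exceeded. */
bool model_release(sem_model *s, long n, long *previous)
{
    if (n <= 0 || s->count + n > s->max)
        return false;
    if (previous)
        *previous = s->count;
    s->count += n;
    return true;
}
```

With an initial count of 2 and a maximum of 3, two waits succeed, a third fails, and a release of 2 reports a previous count of 0.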
90
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event (Microsoft documents PulseEvent() as unreliable; it exists mainly for backward compatibility)
- Auto-reset event signals a single thread; event is reset automatically
- fInitialState == TRUE - create event in signaled state
91
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
92
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
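Since there is no exact equivalent, a manual-reset event is usually approximated with a mutex, a condition variable and a flag. The sketch below shows one common way to do this; the mr_event type and its functions are hypothetical names, not part of pthreads.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical manual-reset event built from pthreads primitives. */
typedef struct {
    pthread_mutex_t mtx;
    pthread_cond_t  cond;
    bool            signaled;   /* stays true until mr_event_reset() */
} mr_event;

void mr_event_init(mr_event *e, bool initial)
{
    pthread_mutex_init(&e->mtx, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial;
}

void mr_event_set(mr_event *e)      /* roughly SetEvent() */
{
    pthread_mutex_lock(&e->mtx);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);   /* wake all current waiters */
    pthread_mutex_unlock(&e->mtx);
}

void mr_event_reset(mr_event *e)    /* roughly ResetEvent() */
{
    pthread_mutex_lock(&e->mtx);
    e->signaled = false;
    pthread_mutex_unlock(&e->mtx);
}

void mr_event_wait(mr_event *e)     /* roughly an INFINITE wait */
{
    pthread_mutex_lock(&e->mtx);
    while (!e->signaled)                /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->mtx);
    pthread_mutex_unlock(&e->mtx);
}

void mr_event_destroy(mr_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->mtx);
}

/* Small demonstration: n threads block on one event and are released together. */
static mr_event gate;
static pthread_mutex_t released_mtx = PTHREAD_MUTEX_INITIALIZER;
static int released;

static void *waiter(void *arg)
{
    (void)arg;
    mr_event_wait(&gate);
    pthread_mutex_lock(&released_mtx);
    released++;
    pthread_mutex_unlock(&released_mtx);
    return NULL;
}

int release_waiters(int n)
{
    pthread_t tid[8];
    int started = 0;
    if (n > 8) n = 8;
    released = 0;
    mr_event_init(&gate, false);
    for (int i = 0; i < n; i++)
        if (pthread_create(&tid[started], NULL, waiter, NULL) == 0)
            started++;
    mr_event_set(&gate);                /* one set releases every waiter */
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    mr_event_destroy(&gate);
    return released;
}
```

The flag is what makes the event "manual-reset": a thread that arrives after mr_event_set() still passes straight through, which a bare pthread_cond_broadcast() would not give you.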
93
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Figure: main starts prog1 and prog2, connected by a pipe]
Half-duplex character-based IPC
cbPipe: pipe byte size; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
94
I/O Redirection using an Anonymous Pipe
/* Create default size anonymous pipe, handles are inheritable */
if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
    fprintf(stderr, "Anon pipe create failed\n"); exit(1);
}
/* Set output handle to pipe handle, create first process */
StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh1.hStdOutput = hWritePipe;
StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
    fprintf(stderr, "CreateProc1 failed\n"); exit(2);
}
CloseHandle (hWritePipe);
95
Pipe example (contd)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
96
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same pipe name; each client connects to its own instance
- Server can respond to client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
97
Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. means local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
98
Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
- Closing handle to last instance deletes named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage );
99
Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
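The two name forms differ only in the machine component, so a client can build either from the same helper. The function below is a hypothetical convenience (make_pipe_name is not a Windows API), shown mainly because the backslashes must be doubled inside a C string literal.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "\\<server>\pipe\<name>" into buf. A NULL server means "." (the
   local machine). Returns the number of characters written, or -1 if buf
   is too small. Hypothetical helper, not part of the Windows API. */
int make_pipe_name(char *buf, size_t size, const char *server, const char *name)
{
    int n = snprintf(buf, size, "\\\\%s\\pipe\\%s", server ? server : ".", name);
    return (n < 0 || (size_t)n >= size) ? -1 : n;
}
```

The resulting string is what a client would pass to CreateFile() or WaitNamedPipe(); a server calling CreateNamedPipe() can only use the local form.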
100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa );
Combines a WriteFile, ReadFile sequence into one call
101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut );
Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence into one call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102
Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo );
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
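For comparison, a FIFO round trip on the UNIX side can be sketched in a few lines. The function name fifo_roundtrip and the path passed to it are assumptions for illustration; both ends are opened in one process only to keep the example short, whereas a real client and server would each open one end.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a FIFO at `path`, send `msg` through it and read it back.
   Returns the number of bytes read, or -1 on error. */
ssize_t fifo_roundtrip(const char *path, const char *msg, char *out, size_t outsz)
{
    ssize_t n = -1;
    size_t len = strlen(msg);

    unlink(path);                           /* remove a stale FIFO, if any */
    if (mkfifo(path, 0600) != 0)            /* counterpart of CreateNamedPipe() */
        return -1;

    /* A nonblocking read-only open returns at once even with no writer;
       the write-only open then succeeds because a reader already exists. */
    int rfd = open(path, O_RDONLY | O_NONBLOCK);
    int wfd = (rfd >= 0) ? open(path, O_WRONLY) : -1;

    if (wfd >= 0 && write(wfd, msg, len) == (ssize_t)len)
        n = read(rfd, out, outsz);          /* byte stream: no message boundaries */

    if (wfd >= 0) close(wfd);
    if (rfd >= 0) close(rfd);
    unlink(path);                           /* the FIFO is a file-system entry */
    return n;
}
```

Because the FIFO carries a plain byte stream, the reader has to know how many bytes make up one record, which is why the slide recommends fixed-size records.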
104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates mailslot with CreateMailslot()
- Write-only client opens mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in network domain
107
Locate a server via mailslot
[Figure: the application clients (App client 0 ... App client n) act as mailslot servers; the application server acts as the mailslot client and sends its status message periodically]
Mailslot servers (App client 0 ... App client n):
    hMS = CreateMailslot( "\\.\mailslot\status" ... );
    ReadFile( hMS, &ServStat ... );
    /* connect to server */
Mailslot client (App Server):
    while (...) {
        Sleep( ... );
        hMS = CreateFile( "\\*\mailslot\status" ... );
        WriteFile( hMS, &StatInfo ... );
    }
108
Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa );
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
109
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life 意念改变生活
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
8181
Critical Section Example
counter is global shared by all threads
volatile int counter = 0
CRITICAL_SECTION crit
InitializeCriticalSection ( ampcrit )
hellip main loop in any of the threads
while (done)
_try
EnterCriticalSection ( ampcrit )
counter += local_value
LeaveCriticalSection ( ampcrit )
_finally LeaveCriticalSection ( ampcrit )
DeleteCriticalSection( ampcrit )
8282
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
ndash Processes
ndash Threads
ndash Files
ndash Console input
File change notificationsFile change notifications
MutexesMutexes
Events (auto-reset + manual-reset)Events (auto-reset + manual-reset)
Waitable timersWaitable timers
DWORD WaitForSingleObject( HANDLE hObject DWORD dwTimeout )
DWORD WaitForMultipleObjects( DWORD cObjects LPHANDLE lpHandles BOOL bWaitAll DWORD dwTimeout )
8383
Wait Functions - Details
WaitForSingleObject()ndash hObject specifies kernel object
ndash dwTimeout specifies wait time in msecdwTimeout == 0 - no wait check whether object is signaled
dwTimeout == INFINITE - wait forever
WaitForMultipleObjects()ndash cObjects lt= MAXIMUM_WAIT_OBJECTS (64)
ndash lpHandles - pointer to array identifying these objects
ndash bWaitAll - whether to wait for first signaled object or all objectsFunction returns index of first signaled object
Side effectsndash Mutexes auto-reset events and waitable timers will be reset to non-signaled state after completing wait functions
8484
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() will free mutex object
8585
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
HANDLE OpenMutex( LPSECURITY_ATTRIBUTE lpsaBOOL fInitialOwner LPTSTR lpszMutexName )
BOOL ReleaseMutex( HANDLE hMutex )
8686
Mutex Example
counter is global shared by all threads
volatile int done counter = 0
HANDLE mutex = CreateMutex( NULL FALSE NULL )
main loop in any of the threads ret is local
DWORD ret
while (done)
ret = WaitForSingleObject( mutex INFINITE )
if (ret == WAIT_OBJECT_0)
counter += local_value
else mutex was abandoned
break exit the loop
ReleaseMutex( mutex )
CloseHandle( mutex )
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexesndash Synchronization among threads in same process
Five basic functionsndash pthread_mutex_init()
ndash pthread_mutex_destroy()
ndash pthread_mutex_lock()
ndash pthread_mutex_unlock()
ndash pthread_mutex_trylock()
Comparisonndash pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex )
ndash pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
8888
Semaphores
Semaphore objects are used for resource countingndash A semaphore is signaled when count gt 0
Threadsprocesses use wait functionsndash Each wait function decreases semaphore count by 1
ndash ReleaseSemaphore() may increment count by any value
ndash ReleaseSemaphore() returns old semaphore count
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE OpenSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE ReleaseSemaphore( HANDLE hSemaphoreLONG cReleaseCount LPLONG lpPreviousCount )
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)ndash Manual-reset event can signal several thread simultaneously must be reset manually
ndash PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
ndash Auto-reset event signals a single thread event is reset automatically
ndash fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTE lpsaBOOL fManualReset BOOL fInititalStateLPTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )BOOL ResetEvent( HANDLE hEvent )BOOL PulseEvent( HANDLE hEvent )
9292
Comparison - POSIX condition variables
pthreadrsquos condition variables are comparable to eventsndash pthread_cond_init()
ndash pthread_cond_destroy()
Wait functionsndash pthread_cond_wait()
ndash pthread_cond_timedwait()
Signalingndash pthread_cond_signal() - one thread
ndash pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phReadPHANDLE phWriteLPSECURITY_ATTRIBUTES lpsaDWORD cbPipe )
main
prog1 prog2pipe
Half-duplex character-based IPC
cbPipe pipe byte size zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are oneway
9494
IO Redirection using an Anonymous Pipe
Create default size anonymous pipe handles are inheritable
if (CreatePipe (amphReadPipe amphWritePipe ampPipeSA 0))
fprintf(stderr ldquoAnon pipe create failednrdquo) exit(1)
Set output handle to pipe handle create first processes
StartInfoCh1hStdInput = GetStdHandle (STD_INPUT_HANDLE)
StartInfoCh1hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh1hStdOutput = hWritePipe
StartInfoCh1dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)Command1 NULL NULL TRUE 0 NULL NULL ampStartInfoCh1 ampProcInfo1))
fprintf(stderr ldquoCreateProc1 failednrdquo) exit(2)
CloseHandle (hWritePipe)
95
Pipe example (contd.)
/* Repeat (symmetrically) for the second process */
StartInfoCh2.hStdInput = hReadPipe;
StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
        0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
    fprintf(stderr, "CreateProc2 failed\n"); exit(3);
}
CloseHandle (hReadPipe);
/* Wait for both processes to complete */
WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
return 0;
96
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server using the same named pipe
- Server can respond to a client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
97
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (the "." stands for the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (determines whether ReadFile will block)
98
Named Pipes (contd.)
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage);
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: the OS chooses based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe with PeekNamedPipe()
- Nondestructive: is there a message waiting for ReadFile?
99
Named Pipe Client Connections
CreateFile with the named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
Combines a WriteFile, ReadFile sequence into one call
101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
Combines the sequence CreateFile, WriteFile, ReadFile, CloseHandle into one call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic (POSIX guarantees this for writes of at most PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
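The fixed-size-record advice can be illustrated with a short mkfifo() sketch. It is a simplified demonstration, not a client/server design: both ends of the FIFO are opened in one process (the read end non-blocking, so that neither open() call blocks), whereas a real server and client would be separate processes. RECORD_LEN, fifo_roundtrip() and the temporary path under /tmp are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

#define RECORD_LEN 32   /* hypothetical fixed record size shared by both sides */

/* Sends one fixed-size record through a FIFO and reads it back.
   Returns 0 on success, -1 on failure. */
int fifo_roundtrip(const char *text, char out[RECORD_LEN]) {
    char dir[] = "/tmp/fifo_demo_XXXXXX";
    char path[64];
    char record[RECORD_LEN] = {0};       /* zero padding up to the record size */
    int rfd = -1, wfd = -1, ok = -1;

    assert(strlen(text) < RECORD_LEN);
    if (mkdtemp(dir) == NULL)            /* private directory for the FIFO */
        return -1;
    snprintf(path, sizeof path, "%s/requests", dir);
    if (mkfifo(path, 0600) != 0) {
        rmdir(dir);
        return -1;
    }

    /* Demo only: open the read end non-blocking first, then the write end. */
    rfd = open(path, O_RDONLY | O_NONBLOCK);
    if (rfd >= 0)
        wfd = open(path, O_WRONLY);
    if (rfd >= 0 && wfd >= 0) {
        strcpy(record, text);
        /* RECORD_LEN <= PIPE_BUF, so the write is atomic. */
        if (write(wfd, record, RECORD_LEN) == RECORD_LEN &&
            read(rfd, out, RECORD_LEN) == RECORD_LEN)
            ok = 0;
    }
    if (wfd >= 0) close(wfd);
    if (rfd >= 0) close(rfd);
    unlink(path);
    rmdir(dir);
    return ok;
}
```

Because every record has the same length, the reader always knows where one request ends and the next begins, which is what message mode provides automatically on a Windows named pipe.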
104
Client Example Using a Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf(stderr, "Failure to locate server\n"); exit(3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
107
Locate a server via mailslot
[Figure: the application clients (App client 0 ... App client n) act as mailslot servers; the application server acts as the mailslot client and sends its status message periodically]
Mailslot servers (App client 0 ... App client n):
hMS = CreateMailslot( "\\.\mailslot\status" ); ReadFile(hMS, &ServStat); /* connect to server */
Mailslot client (App server):
while () { Sleep(); hMS = CreateFile( "\\*\mailslot\status" ); WriteFile(hMS, &StatInfo); }
108
Creating a mailslot
HANDLE CreateMailslot(LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa);
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for this many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
109
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieves a handle for a local mailslot
- \\host\mailslot\[path]name - retrieves a handle for a mailslot on the specified host
- \\domain\mailslot\[path]name - returns a handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns a handle representing mailslots on machines in the system's primary domain; max message length: 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
82
Synchronizing Threads with Kernel Objects
The following kernel objects can be used to synchronize threads
- Processes
- Threads
- Files
- Console input
- File change notifications
- Mutexes
- Events (auto-reset + manual-reset)
- Waitable timers
DWORD WaitForSingleObject( HANDLE hObject, DWORD dwTimeout );
DWORD WaitForMultipleObjects( DWORD cObjects, LPHANDLE lpHandles, BOOL bWaitAll, DWORD dwTimeout );
83
Wait Functions - Details
WaitForSingleObject()
- hObject specifies the kernel object
- dwTimeout specifies the wait time in msec
  - dwTimeout == 0: no wait, check whether the object is signaled
  - dwTimeout == INFINITE: wait forever
WaitForMultipleObjects()
- cObjects <= MAXIMUM_WAIT_OBJECTS (64)
- lpHandles: pointer to an array identifying these objects
- bWaitAll: whether to wait for the first signaled object or for all objects
  - When waiting for the first signaled object, the function returns its index (as WAIT_OBJECT_0 + index)
Side effects
- Mutexes, auto-reset events and waitable timers will be reset to the non-signaled state after completing wait functions
84
Mutexes
Mutexes work across processes
First thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() closes the handle; the mutex object is freed once its last handle is closed
85
Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName );
HANDLE OpenMutex( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszMutexName );
BOOL ReleaseMutex( HANDLE hMutex );
86
Mutex Example
/* counter is global, shared by all threads */
volatile int done, counter = 0;
HANDLE mutex = CreateMutex( NULL, FALSE, NULL );
/* main loop in any of the threads, ret is local */
DWORD ret;
while (!done) {
    ret = WaitForSingleObject( mutex, INFINITE );
    if (ret == WAIT_OBJECT_0)
        counter += local_value;
    else /* mutex was abandoned */
        break; /* exit the loop */
    ReleaseMutex( mutex );
}
CloseHandle( mutex );
87
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
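As a rough POSIX counterpart to the Win32 mutex example, the sketch below protects a shared counter with pthread_mutex_lock() and shows a one-shot poll with pthread_mutex_trylock(). NUM_THREADS, ITERATIONS, worker(), run_counter_demo() and try_enter() are hypothetical names chosen for this illustration.

```c
#include <assert.h>
#include <pthread.h>

#define NUM_THREADS 4
#define ITERATIONS  10000

static long counter;                                   /* shared by all threads */
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

static void *worker(void *arg) {
    long local_value = *(const long *)arg;
    for (int i = 0; i < ITERATIONS; i++) {
        pthread_mutex_lock(&counter_lock);    /* blocks, like WaitForSingleObject(mutex, INFINITE) */
        counter += local_value;
        pthread_mutex_unlock(&counter_lock);  /* like ReleaseMutex(mutex) */
    }
    return NULL;
}

/* Runs NUM_THREADS workers that each add local_value ITERATIONS times. */
long run_counter_demo(long local_value) {
    pthread_t tid[NUM_THREADS];
    counter = 0;
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&tid[i], NULL, worker, &local_value);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(tid[i], NULL);
    return counter;
}

/* Polls once, like WaitForSingleObject(mutex, 0).
   Returns 1 if the lock was acquired, 0 if it is already held. */
int try_enter(pthread_mutex_t *m) {
    return pthread_mutex_trylock(m) == 0;
}
```

Without the lock, the concurrent `counter += local_value` updates could be lost; with it, the final value is always NUM_THREADS * ITERATIONS * local_value.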
88
Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value, up to the semaphore's maximum
- ReleaseSemaphore() returns the old semaphore count (through lpPreviousCount)
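The counting rules above can be modeled in portable C with a mutex and a condition variable. This is a sketch of the semantics only, not of how Windows implements semaphore objects; counting_sem and the csem_* functions are hypothetical names introduced for the example.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical type modeling a counting semaphore with a maximum count. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  nonzero;
    long count;                  /* "signaled" while count > 0 */
    long max;
} counting_sem;

void csem_create(counting_sem *s, long initial, long max) {
    assert(0 <= initial && initial <= max);
    pthread_mutex_init(&s->lock, NULL);
    pthread_cond_init(&s->nonzero, NULL);
    s->count = initial;
    s->max = max;
}

/* Like a wait function with INFINITE: blocks until count > 0, then takes one unit. */
void csem_acquire(counting_sem *s) {
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;
    pthread_mutex_unlock(&s->lock);
}

/* Like a wait function with timeout 0: returns 1 if a unit was taken, else 0. */
int csem_try_acquire(counting_sem *s) {
    int got = 0;
    pthread_mutex_lock(&s->lock);
    if (s->count > 0) {
        s->count--;
        got = 1;
    }
    pthread_mutex_unlock(&s->lock);
    return got;
}

/* Like ReleaseSemaphore(): adds n units and reports the previous count.
   Returns 0 and changes nothing if the maximum would be exceeded. */
int csem_release(counting_sem *s, long n, long *previous) {
    int ok = 0;
    pthread_mutex_lock(&s->lock);
    if (n > 0 && s->count + n <= s->max) {
        if (previous != NULL)
            *previous = s->count;
        s->count += n;
        pthread_cond_broadcast(&s->nonzero);   /* several waiters may now proceed */
        ok = 1;
    }
    pthread_mutex_unlock(&s->lock);
    return ok;
}
```

A pool of five identical resources, two of them initially free, would start as csem_create(&s, 2, 5); each user calls csem_acquire() before taking a resource and csem_release(&s, 1, NULL) when done.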
89
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName );
HANDLE OpenSemaphore( DWORD dwDesiredAccess, BOOL bInheritHandle, LPCTSTR lpszSemName );
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount );
90
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event can signal several threads simultaneously; it must be reset manually
- PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event (Microsoft's documentation describes PulseEvent() as unreliable and advises against using it)
- Auto-reset event signals a single thread; the event is reset automatically
- fInitialState == TRUE: create the event in the signaled state

Mutexes
Mutexes work across processes
The first thread has to call CreateMutex()
When sharing a mutex, the second thread (process) calls CreateMutex() or OpenMutex()
fInitialOwner == TRUE gives the creator immediate ownership
Threads acquire mutex ownership using WaitForSingleObject() or WaitForMultipleObjects()
ReleaseMutex() gives up ownership
CloseHandle() closes the handle; the mutex object is destroyed when its last handle is closed

Mutexes
HANDLE CreateMutex( LPSECURITY_ATTRIBUTES lpsa, BOOL fInitialOwner, LPCTSTR lpszMutexName )
HANDLE OpenMutex( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszMutexName )
BOOL ReleaseMutex( HANDLE hMutex )

Mutex Example
    /* counter is global, shared by all threads */
    volatile int done = 0, counter = 0;
    HANDLE mutex = CreateMutex( NULL, FALSE, NULL );

    /* main loop in any of the threads; ret is local */
    DWORD ret;
    while (!done) {
        ret = WaitForSingleObject( mutex, INFINITE );
        if (ret == WAIT_OBJECT_0)
            counter += local_value;
        else            /* mutex was abandoned */
            break;      /* exit the loop */
        ReleaseMutex( mutex );
    }
    CloseHandle( mutex );

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
- Synchronization among threads in the same process (by default)
Five basic functions
- pthread_mutex_init()
- pthread_mutex_destroy()
- pthread_mutex_lock()
- pthread_mutex_unlock()
- pthread_mutex_trylock()
Comparison
- pthread_mutex_lock() blocks - equivalent to WaitForSingleObject( hMutex, INFINITE )
- pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
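
The POSIX side of this comparison can be sketched in C as follows. This is a minimal illustration, not part of the original slides: the names shared_counter, worker_args, add_under_lock, run_workers and try_add_once are hypothetical, and the limit of 16 threads is an arbitrary assumption.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical names for illustration only. */
static int shared_counter = 0;
static pthread_mutex_t counter_lock = PTHREAD_MUTEX_INITIALIZER;

struct worker_args {
    int iterations;   /* how many times each thread adds */
    int local_value;  /* amount added per iteration */
};

static void *add_under_lock(void *p)
{
    const struct worker_args *a = p;
    for (int i = 0; i < a->iterations; i++) {
        pthread_mutex_lock(&counter_lock);    /* blocks, like WaitForSingleObject(hMutex, INFINITE) */
        shared_counter += a->local_value;
        pthread_mutex_unlock(&counter_lock);  /* like ReleaseMutex(hMutex) */
    }
    return NULL;
}

/* Starts nthreads workers (at most 16, an arbitrary limit), waits for all of
 * them, and returns the final counter value, or -1 on error. */
int run_workers(int nthreads, int iterations, int local_value)
{
    pthread_t tid[16];
    struct worker_args args = { iterations, local_value };
    int started = 0;

    if (nthreads < 0 || nthreads > 16)
        return -1;
    shared_counter = 0;
    for (; started < nthreads; started++)
        if (pthread_create(&tid[started], NULL, add_under_lock, &args) != 0)
            break;
    assert(started <= nthreads);
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    return (started == nthreads) ? shared_counter : -1;
}

/* Non-blocking attempt, like WaitForSingleObject(hMutex, 0).
 * Returns 1 if the lock was free and the add happened, 0 otherwise. */
int try_add_once(int value)
{
    if (pthread_mutex_trylock(&counter_lock) != 0)
        return 0;                 /* lock busy: the caller may poll again later */
    shared_counter += value;
    pthread_mutex_unlock(&counter_lock);
    return 1;
}
```

Calling run_workers(4, 10000, 1) should return 40000 on every run, because each increment happens while the lock is held.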

Semaphores
Semaphore objects are used for resource counting
- A semaphore is signaled when count > 0
Threads/processes use wait functions
- Each wait function decreases the semaphore count by 1
- ReleaseSemaphore() may increment the count by any value, up to the maximum count
- ReleaseSemaphore() returns the old semaphore count through lpPreviousCount

Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName )
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName )
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount )
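
The counting rules above can be modelled in portable C with a mutex and a condition variable. This is only a sketch of the semantics under stated assumptions, not the Win32 implementation: the type counting_sem and the functions csem_init, csem_wait, csem_release and csem_destroy are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical model of a counting semaphore with a maximum count. */
struct counting_sem {
    pthread_mutex_t lock;
    pthread_cond_t  nonzero;  /* waited on while count == 0 */
    long count;               /* "signaled" when count > 0 */
    long max;
};

int csem_init(struct counting_sem *s, long initial, long max)
{
    if (max <= 0 || initial < 0 || initial > max)
        return -1;
    s->count = initial;
    s->max = max;
    if (pthread_mutex_init(&s->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&s->nonzero, NULL) != 0) {
        pthread_mutex_destroy(&s->lock);
        return -1;
    }
    return 0;
}

/* Wait: block until count > 0, then decrease the count by 1. */
void csem_wait(struct counting_sem *s)
{
    pthread_mutex_lock(&s->lock);
    while (s->count == 0)
        pthread_cond_wait(&s->nonzero, &s->lock);
    s->count--;
    assert(s->count >= 0);
    pthread_mutex_unlock(&s->lock);
}

/* Release n units and report the old count through *previous (if non-NULL).
 * Returns -1 and leaves the count unchanged if it would exceed the maximum. */
int csem_release(struct counting_sem *s, long n, long *previous)
{
    int rc = -1;
    pthread_mutex_lock(&s->lock);
    if (n > 0 && n <= s->max - s->count) {
        if (previous != NULL)
            *previous = s->count;
        s->count += n;
        pthread_cond_broadcast(&s->nonzero);  /* several waiters may now proceed */
        rc = 0;
    }
    pthread_mutex_unlock(&s->lock);
    return rc;
}

void csem_destroy(struct counting_sem *s)
{
    pthread_cond_destroy(&s->nonzero);
    pthread_mutex_destroy(&s->lock);
}
```

The release function broadcasts rather than signals because a release of n units can satisfy up to n waiting threads at once.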

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- A manual-reset event can signal several threads simultaneously; it must be reset manually
- PulseEvent() releases all threads waiting on a manual-reset event and automatically resets the event
- An auto-reset event signals a single thread; the event is reset automatically
- fInitialState == TRUE - create the event in the signaled state

Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )
BOOL ResetEvent( HANDLE hEvent )
BOOL PulseEvent( HANDLE hEvent )

Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
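
Although pthreads has no exact equivalent, a manual-reset event can be approximated by pairing a condition variable with a mutex-protected flag. The sketch below is one possible approach, not taken from the slides: the type mr_event, its functions, and the demo helper release_waiters (limited to 8 threads as an arbitrary assumption) are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical emulation of a manual-reset event. */
struct mr_event {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool signaled;            /* stays true until mr_event_reset() */
};

int mr_event_init(struct mr_event *e, bool initial_state)
{
    e->signaled = initial_state;
    if (pthread_mutex_init(&e->lock, NULL) != 0)
        return -1;
    if (pthread_cond_init(&e->cond, NULL) != 0) {
        pthread_mutex_destroy(&e->lock);
        return -1;
    }
    return 0;
}

void mr_event_set(struct mr_event *e)      /* roughly SetEvent() */
{
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);      /* wake all waiting threads */
    pthread_mutex_unlock(&e->lock);
}

void mr_event_reset(struct mr_event *e)    /* roughly ResetEvent() */
{
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

void mr_event_wait(struct mr_event *e)     /* roughly WaitForSingleObject(hEvent, INFINITE) */
{
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)                   /* the flag guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

void mr_event_destroy(struct mr_event *e)
{
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}

static struct mr_event gate;
static int passed;                         /* protected by gate.lock */

static void *waiter(void *unused)
{
    (void)unused;
    mr_event_wait(&gate);
    pthread_mutex_lock(&gate.lock);
    passed++;
    pthread_mutex_unlock(&gate.lock);
    return NULL;
}

/* Demo: start n waiters (at most 8), set the event once, and return how many
 * threads got through, or -1 on error. */
int release_waiters(int n)
{
    pthread_t tid[8];
    int started = 0;

    if (n < 0 || n > 8 || mr_event_init(&gate, false) != 0)
        return -1;
    passed = 0;
    for (; started < n; started++)
        if (pthread_create(&tid[started], NULL, waiter, NULL) != 0)
            break;
    mr_event_set(&gate);                   /* one set releases current and later waiters */
    for (int i = 0; i < started; i++)
        pthread_join(tid[i], NULL);
    assert(passed == started);
    int result = passed;
    mr_event_destroy(&gate);
    return (started == n) ? result : -1;
}
```

Because the flag stays set until it is reset, a thread that starts waiting after mr_event_set() still passes immediately, which is the behaviour a bare pthread_cond_broadcast() does not provide.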

Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe )
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex, character-based IPC
cbPipe: pipe size in bytes; zero == default
A read on a pipe handle blocks if the pipe is empty
A write to a full pipe blocks
Anonymous pipes are one-way

IO Redirection using an Anonymous Pipe
    /* Create default size anonymous pipe, handles are inheritable */
    if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
        fprintf(stderr, "Anon pipe create failed\n"); exit(1);
    }
    /* Set output handle to pipe handle, create first process */
    StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
    StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    StartInfoCh1.hStdOutput = hWritePipe;
    StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE,
            0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
        fprintf(stderr, "CreateProc1 failed\n"); exit(2);
    }
    CloseHandle (hWritePipe);

Pipe example (contd)
    /* Repeat (symmetrically) for the second process */
    StartInfoCh2.hStdInput = hReadPipe;
    StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
    StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles */
            0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
        fprintf(stderr, "CreateProc2 failed\n"); exit(3);
    }
    CloseHandle (hReadPipe);
    /* Wait for both processes to complete */
    WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
    WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
    CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
    CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
    return 0;

Named Pipes
Message oriented
- The reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server through the same pipe name, each on its own instance
- The server can respond to a client using the same instance
Pipe can be accessed over the network
- location transparency
Convenience and connection functions

Using Named Pipes
HANDLE CreateNamedPipe( LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa )
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (. means local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)

Named Pipes (contd)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: the OS chooses based on available resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
The first CreateNamedPipe call creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive - is there a message waiting for ReadFile?
BOOL PeekNamedPipe( HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage )

Named Pipe Client Connections
CreateFile with named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo

Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa )
- Combines a WriteFile, ReadFile sequence

Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut )
- Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
BOOL ConnectNamedPipe( HANDLE hNamedPipe, LPOVERLAPPED lpo )
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe() to free the handle for a connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE

Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
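
The fixed-size-record advice for FIFOs can be illustrated with a short C sketch. It is an assumption-laden example rather than anything from the slides: the helper fifo_roundtrip, the record size RECORD_LEN and the FIFO path passed by the caller are all hypothetical. The helper removes any existing file at the given path, so it should only be pointed at a scratch location.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RECORD_LEN 32   /* assumed fixed record size, chosen for illustration */

/* Hypothetical helper: a child process (the client) writes one fixed-size
 * record into the FIFO at `path`; the parent (the server) reads it into `out`.
 * Returns 0 on success, -1 on failure. */
int fifo_roundtrip(const char *path, const char *text, char out[RECORD_LEN])
{
    char record[RECORD_LEN] = {0};
    strncpy(record, text, RECORD_LEN - 1);    /* zero-padded, always terminated */

    unlink(path);                             /* remove a stale FIFO, if any */
    if (mkfifo(path, 0600) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {                           /* child: the writer */
        int wfd = open(path, O_WRONLY);       /* blocks until a reader opens */
        if (wfd < 0)
            _exit(1);
        ssize_t n = write(wfd, record, RECORD_LEN);
        close(wfd);
        _exit(n == RECORD_LEN ? 0 : 1);
    }

    int rc = -1;
    int rfd = open(path, O_RDONLY);           /* blocks until the writer opens */
    if (rfd >= 0) {
        size_t got = 0;
        while (got < RECORD_LEN) {            /* a FIFO is a byte stream: loop until the record is complete */
            ssize_t n = read(rfd, out + got, RECORD_LEN - got);
            if (n <= 0)
                break;
            got += (size_t)n;
        }
        close(rfd);
        assert(got <= RECORD_LEN);
        if (got == RECORD_LEN)
            rc = 0;
    }
    int status = 0;
    waitpid(pid, &status, 0);                 /* reap the child */
    unlink(path);
    return rc;
}
```

Reading in a loop until RECORD_LEN bytes have arrived is what lets the reader recover record boundaries that the FIFO itself does not preserve.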

Client Example using Named Pipe
    WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
    hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
        0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hNamedPipe == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "Failure to locate server\n"); exit(3);
    }
    /* Write the request */
    WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
    /* Read each response and send it to std out */
    while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
        printf ("%s", ResponseRecord);
    CloseHandle (hNamedPipe);
    return 0;

Server Example Using a Named Pipe
    hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
        PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
        1, 0, 0, CS_TIMEOUT, pNPSA);
    while (!Done) {
        printf ("Server is awaiting next request\n");
        if (!ConnectNamedPipe (hNamedPipe, NULL)
            || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
            fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
        }
        printf ("Request is %s\n", Request.Record);
        /* Send the file, one line at a time, to the client */
        fp = fopen (File, "r");
        while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
            WriteFile (hNamedPipe, &ResponseRecord,
                (strlen(ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
        fclose (fp);
        DisconnectNamedPipe (hNamedPipe);
    } /* End of server operation */

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (w2k: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- A write-only client opens the mailslot with CreateFile() and uses WriteFile() - the open will fail if there are no waiting readers
- The client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain

Locate a server via mailslot
[Figure: App client 0 ... App client n act as mailslot servers; the app server acts as the mailslot client]
Mailslot servers (App client 0 ... App client n)
    hMS = CreateMailslot( "\\.\mailslot\status" );
    ReadFile( hMS, &ServStat );
    /* connect to server */
Mailslot client (App Server)
    while (...) {
        Sleep( ... );
        hMS = CreateFile( "\\*\mailslot\status" );
        ...
        WriteFile( hMS, &StatInfo );
    }
Message is sent periodically

Creating a mailslot
HANDLE CreateMailslot( LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa )
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0 - immediate return
- MAILSLOT_WAIT_FOREVER - infinite wait

Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length: 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts

Thoughts Change Life
8686
Mutex Example
counter is global shared by all threads
volatile int done counter = 0
HANDLE mutex = CreateMutex( NULL FALSE NULL )
main loop in any of the threads ret is local
DWORD ret
while (done)
ret = WaitForSingleObject( mutex INFINITE )
if (ret == WAIT_OBJECT_0)
counter += local_value
else mutex was abandoned
break exit the loop
ReleaseMutex( mutex )
CloseHandle( mutex )
8787
Comparison - POSIX mutexes
POSIX pthreads specification supports mutexesndash Synchronization among threads in same process
Five basic functionsndash pthread_mutex_init()
ndash pthread_mutex_destroy()
ndash pthread_mutex_lock()
ndash pthread_mutex_unlock()
ndash pthread_mutex_trylock()
Comparisonndash pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex )
ndash pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
8888
Semaphores
Semaphore objects are used for resource countingndash A semaphore is signaled when count gt 0
Threadsprocesses use wait functionsndash Each wait function decreases semaphore count by 1
ndash ReleaseSemaphore() may increment count by any value
ndash ReleaseSemaphore() returns old semaphore count
8989
Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE OpenSemaphore( LPSECURITY_ATTRIBUTE lpsaLONG cSemInit LONG cSemMaxLPTSTR lpszSemName )
HANDLE ReleaseSemaphore( HANDLE hSemaphoreLONG cReleaseCount LPLONG lpPreviousCount )
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)ndash Manual-reset event can signal several thread simultaneously must be reset manually
ndash PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
ndash Auto-reset event signals a single thread event is reset automatically
ndash fInitialState == TRUE - create event in signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTE lpsaBOOL fManualReset BOOL fInititalStateLPTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )BOOL ResetEvent( HANDLE hEvent )BOOL PulseEvent( HANDLE hEvent )
9292
Comparison - POSIX condition variables
pthreadrsquos condition variables are comparable to eventsndash pthread_cond_init()
ndash pthread_cond_destroy()
Wait functionsndash pthread_cond_wait()
ndash pthread_cond_timedwait()
Signalingndash pthread_cond_signal() - one thread
ndash pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phReadPHANDLE phWriteLPSECURITY_ATTRIBUTES lpsaDWORD cbPipe )
main
prog1 prog2pipe
Half-duplex character-based IPC
cbPipe pipe byte size zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are oneway
9494
IO Redirection using an Anonymous Pipe
Create default size anonymous pipe handles are inheritable
if (CreatePipe (amphReadPipe amphWritePipe ampPipeSA 0))
fprintf(stderr ldquoAnon pipe create failednrdquo) exit(1)
Set output handle to pipe handle create first processes
StartInfoCh1hStdInput = GetStdHandle (STD_INPUT_HANDLE)
StartInfoCh1hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh1hStdOutput = hWritePipe
StartInfoCh1dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)Command1 NULL NULL TRUE 0 NULL NULL ampStartInfoCh1 ampProcInfo1))
fprintf(stderr ldquoCreateProc1 failednrdquo) exit(2)
CloseHandle (hWritePipe)
9595
Pipe example (contd)
Repeat (symmetrically) for the second process
StartInfoCh2hStdInput = hReadPipe
StartInfoCh2hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh2hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE)
StartInfoCh2dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)targv NULL NULLTRUE Inherit handles
0 NULL NULL ampStartInfoCh2 ampProcInfo2))
fprintf(stderr ldquoCreateProc2 failednrdquo) exit(3)
CloseHandle (hReadPipe)
Wait for both processes to complete
WaitForSingleObject (ProcInfo1hProcess INFINITE)
WaitForSingleObject (ProcInfo2hProcess INFINITE)
CloseHandle (ProcInfo1hThread) CloseHandle (ProcInfo1hProcess)
CloseHandle (ProcInfo2hThread) CloseHandle (ProcInfo2hProcess)
return 0
9696
Named Pipes
Message orientedndash Reading process can read varying-length messages precisely as sent by the writing process
Bi-directionalndash Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipendash Several clients can communicate with a single server using the same instance
ndash Server can respond to client using the same instance
Pipe can be accessed over the networkndash location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeNameDWORD fdwOpenMode DWORD fdwPipModeDWORD nMaxInstances DWORD cbOutBufDWORD cbInBuf DWORD dwTimeOutLPSECURITY_ATTRIBUTES lpsa )
Use same flag settings forall instances of a named pipe
lpszPipeName pipe[path]pipename
ndash Not possible to create a pipe on remote machine ( ndash local machine)
fdwOpenMode
ndash PIPE_ACCESS_DUPLEX PIPE_ACCESS_INBOUND PIPE_ACCESS_OUTBOUND
fdwPipeMode
ndash PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
ndash PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
ndash PIPE_WAIT or PIPE_NOWAIT (will ReadFile block)
9898
Named Pipes (contd)
BOOL PeekNamedPipe (HANDLE hPipeLPVOID lpvBuffer DWORD cbBufferLPDWORD lpcbRead LPDWORD lpcbAvailLPDWORD lpcbMessage)
nMaxInstances
ndash Number of instances
ndash PIPE_UNLIMITED_INSTANCES OS choice based on resources
dwTimeOut
ndash Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
ndash Closing handle to last instance deletes named pipe
Polling a pipe
ndash Nondestructive ndash is there a message waiting for ReadFile
9999
Named Pipe Client Connections
CreateFile with named pipe namendash pipe[path]pipename
ndash servernamepipe[path]pipename
ndash First method gives better performance (local server)
Status Functionsndash GetNamedPipeHandleState
ndash SetNamedPipeHandleState
ndash GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipeLPVOID lpvWriteBuf DWORD cbWriteBufLPVOID lpvReadBuf DWORD cbReadBufLPDOWRD lpcbRead LPOVERLAPPED lpa)
WriteFile ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeNameLPVOID lpvWriteBuf DWORD cbWriteBufLPVOID lpvReadBuf DWORD cbReadBufLPDWORD lpcbRead DWORD dwTimeOut)
CreateFile WriteFile ReadFile CloseHandle
ndash dwTimeOut NMPWAIT_NOWAIT NMPWAIT_WIAT_FOREVER NMPWAIT_USE_DEFAULT_WAIT
102102
Server eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipeLPOVERLAPPED lpo
lpo == NULLndash Call will return as soon as there is a client connection
ndash Returns false if client connected between CreateNamed Pipe calland ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()ndash Client may wait for serverlsquos ConnectNamedPipe()
Security rights for named pipesndash GENERIC_READ GENERIC_WRITE SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
ndash FIFOs are half-duplex
ndash FIFOs are limited to a single machine
ndash FIFOs are still byte-oriented so its easiest to use fixed-size records in clientserver applications
ndash Individual readwrites are atomic
A server using FIFOs must use a separate FIFO for each clientlsquos response although all clients can send requests via a single well known FIFO
Mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked clientserver scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName NMPWAIT_WAIT_FOREVER)
hNamedPipe = CreateFile (ServerPipeName GENERIC_READ | GENERIC_WRITE
0 NULL OPEN_EXISTING FILE_ATTRIBUTE_NORMAL NULL)
if (hNamedPipe == INVALID_HANDLE_VALUE)
fptinf(stderr Failure to locate servern) exit(3)
Write the request
WriteFile (hNamedPipe ampRequest MAX_RQRS_LEN ampnWrite NULL)
Read each response and send it to std out
while (ReadFile (hNamedPipe ResponseRecord MAX_RQRS_LEN ampnRead NULL))
printf (s ResponseRecord)
CloseHandle (hNamedPipe)
return 0
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE PIPE_ACCESS_DUPLEX
PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT
1 0 0 CS_TIMEOUT pNPSA)
while (Done)
printf (Server is awaiting next requestn)
if (ConnectNamedPipe (hNamedPipe NULL)
|| ReadFile (hNamedPipe ampRequest RQ_SIZE ampnXfer NULL))
fprintf(stderr ldquoConnect or Read Named Pipe errornrdquo) exit(4)
printf( ldquoRequest is sn RequestRecord)
Send the file one line at a time to the client
fp = fopen (File r)
while ((fgets (ResponseRecord MAX_RQRS_LEN fp) = NULL))
WriteFile (hNamedPipe ampResponseRecord
(strlen(ResponseRecord) + 1) TSIZE ampnXfer NULL)
fclose (fp)
DisconnectNamedPipe (hNamedPipe)
End of server operation

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (Windows 2000: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain

Locate a server via mailslot
[Figure: application clients 0..n act as mailslot servers. Each creates the "status" mailslot with CreateMailslot(), blocks in ReadFile (hMS, &ServStat), and then connects to the server. The application server acts as the mailslot client: in a loop it calls Sleep(), opens the "status" mailslot with CreateFile(), and sends its status with WriteFile (hMS, &StatInfo, ...), so the message is sent periodically.]

Creating a mailslot
HANDLE CreateMailslot (LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa)
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for this many milliseconds
– 0 – immediate return
– MAILSLOT_WAIT_FOREVER – infinite wait

Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name - retrieves a handle for a local mailslot
– \\host\mailslot\[path]name - retrieves a handle for a mailslot on the specified host
– \\domain\mailslot\[path]name - returns a handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name - returns a handle representing mailslots on machines in the system's primary domain; maximum message length 400 bytes
– Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活

Comparison - POSIX mutexes
POSIX pthreads specification supports mutexes
– Synchronization among threads in the same process
Five basic functions
– pthread_mutex_init()
– pthread_mutex_destroy()
– pthread_mutex_lock()
– pthread_mutex_unlock()
– pthread_mutex_trylock()
Comparison
– pthread_mutex_lock() will block - equivalent to WaitForSingleObject( hMutex, INFINITE )
– pthread_mutex_trylock() is nonblocking (polling) - equivalent to WaitForSingleObject() with timeout == 0
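The five pthread functions listed above can be seen together in a short sketch. The names run_counter_demo, trylock_on_held_mutex, NUM_THREADS and INCREMENTS are hypothetical and chosen only for this example; the comments pointing at Win32 calls are rough correspondences, not exact equivalences.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

#define NUM_THREADS 4       /* hypothetical demo sizes */
#define INCREMENTS  10000

static pthread_mutex_t counter_lock;
static long counter;        /* shared data guarded by counter_lock */

static void *worker(void *arg)
{
    (void)arg;
    for (int i = 0; i < INCREMENTS; i++) {
        pthread_mutex_lock(&counter_lock);    /* blocks until available */
        counter++;                            /* critical section */
        pthread_mutex_unlock(&counter_lock);
    }
    return NULL;
}

/* Runs the workers and returns the final counter value. */
long run_counter_demo(void)
{
    pthread_t threads[NUM_THREADS];

    counter = 0;
    pthread_mutex_init(&counter_lock, NULL);
    for (int i = 0; i < NUM_THREADS; i++) {
        int rc = pthread_create(&threads[i], NULL, worker, NULL);
        assert(rc == 0);
        (void)rc;
    }
    for (int i = 0; i < NUM_THREADS; i++)
        pthread_join(threads[i], NULL);
    pthread_mutex_destroy(&counter_lock);
    return counter;
}

/* Polls a mutex that is already held and returns the trylock result. */
int trylock_on_held_mutex(void)
{
    pthread_mutex_t m;

    pthread_mutex_init(&m, NULL);
    pthread_mutex_lock(&m);
    int rc = pthread_mutex_trylock(&m);   /* returns at once instead of blocking */
    pthread_mutex_unlock(&m);
    pthread_mutex_destroy(&m);
    return rc;                            /* EBUSY for a default mutex */
}
```

One difference worth keeping in mind: a Win32 mutex is recursive for its owning thread, so a zero-timeout wait by the owner succeeds there, whereas the default pthread mutex in this sketch reports EBUSY.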

Semaphores
Semaphore objects are used for resource counting
– A semaphore is signaled when count > 0
Threads/processes use wait functions
– Each wait function decreases the semaphore count by 1
– ReleaseSemaphore() may increment the count by any positive value, up to the semaphore's maximum
– ReleaseSemaphore() returns the old semaphore count through lpPreviousCount

Semaphores
HANDLE CreateSemaphore( LPSECURITY_ATTRIBUTES lpsa, LONG cSemInit, LONG cSemMax, LPCTSTR lpszSemName )
HANDLE OpenSemaphore( DWORD fdwAccess, BOOL fInherit, LPCTSTR lpszSemName )
BOOL ReleaseSemaphore( HANDLE hSemaphore, LONG cReleaseCount, LPLONG lpPreviousCount )

Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
– Manual-reset event can signal several threads simultaneously; must be reset manually
– PulseEvent() will release all threads waiting on a manual-reset event and automatically reset the event
– Auto-reset event signals a single thread; event is reset automatically
– fInitialState == TRUE - create event in signaled state

Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )
BOOL ResetEvent( HANDLE hEvent )
BOOL PulseEvent( HANDLE hEvent )

Comparison - POSIX condition variables
pthread's condition variables are comparable to events
– pthread_cond_init()
– pthread_cond_destroy()
Wait functions
– pthread_cond_wait()
– pthread_cond_timedwait()
Signaling
– pthread_cond_signal() - one thread
– pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
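Since pthreads has no direct counterpart to a manual-reset event, one common workaround is to pair a condition variable with a flag and a mutex. The sketch below assumes that approach; the manual_event type and the event_* and run_event_demo names are hypothetical and belong to neither pthreads nor Win32.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical type: a manual-reset event emulated with a flag, a mutex
 * and a condition variable. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} manual_event;

void event_init(manual_event *ev, bool initial_state)
{
    pthread_mutex_init(&ev->lock, NULL);
    pthread_cond_init(&ev->cond, NULL);
    ev->signaled = initial_state;
}

void event_destroy(manual_event *ev)
{
    pthread_cond_destroy(&ev->cond);
    pthread_mutex_destroy(&ev->lock);
}

void event_set(manual_event *ev)            /* roughly SetEvent() */
{
    pthread_mutex_lock(&ev->lock);
    ev->signaled = true;                    /* stays set until event_reset() */
    pthread_cond_broadcast(&ev->cond);      /* wake all waiting threads */
    pthread_mutex_unlock(&ev->lock);
}

void event_reset(manual_event *ev)          /* roughly ResetEvent() */
{
    pthread_mutex_lock(&ev->lock);
    ev->signaled = false;
    pthread_mutex_unlock(&ev->lock);
}

void event_wait(manual_event *ev)           /* roughly an infinite wait */
{
    pthread_mutex_lock(&ev->lock);
    while (!ev->signaled)                   /* loop guards against spurious wakeups */
        pthread_cond_wait(&ev->cond, &ev->lock);
    pthread_mutex_unlock(&ev->lock);
}

#define WAITERS 3                           /* hypothetical demo size */
static manual_event start_gate;
static pthread_mutex_t released_lock = PTHREAD_MUTEX_INITIALIZER;
static int released;                        /* guarded by released_lock */

static void *waiter(void *arg)
{
    (void)arg;
    event_wait(&start_gate);
    pthread_mutex_lock(&released_lock);
    released++;
    pthread_mutex_unlock(&released_lock);
    return NULL;
}

/* Starts WAITERS threads, sets the event once, returns how many got through. */
int run_event_demo(void)
{
    pthread_t t[WAITERS];

    released = 0;
    event_init(&start_gate, false);
    for (int i = 0; i < WAITERS; i++) {
        int rc = pthread_create(&t[i], NULL, waiter, NULL);
        assert(rc == 0);
        (void)rc;
    }
    event_set(&start_gate);                 /* one signal releases every waiter */
    for (int i = 0; i < WAITERS; i++)
        pthread_join(t[i], NULL);
    event_destroy(&start_gate);
    return released;
}
```

The flag is what makes the event "manual reset": a thread that arrives after event_set() still passes, because the state persists until event_reset() is called. A bare pthread_cond_broadcast() without the flag would be lost for late arrivals.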

Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe )
[Figure: main creates a pipe that connects prog1 to prog2]
Half-duplex character-based IPC
cbPipe: pipe size in bytes; zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are one-way
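The same one-way, byte-oriented behaviour can be observed with the POSIX pipe() call, shown here as a rough counterpart to CreatePipe() rather than as the Win32 API itself. The function name pipe_echo is hypothetical; the sketch stays in one process and assumes the message is small enough to fit in the pipe buffer.

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>

/* Writes `msg` into a fresh pipe and reads it back through the other end.
 * Returns the number of bytes read, or -1 on error. The message must fit
 * in the pipe buffer, otherwise write() would block with no reader running. */
ssize_t pipe_echo(const char *msg, char *out, size_t outlen)
{
    int fds[2];                          /* fds[0]: read end, fds[1]: write end */
    size_t len = strlen(msg) + 1;        /* include the terminating '\0' */

    assert(out != NULL && outlen >= len);
    if (pipe(fds) != 0)
        return -1;

    if (write(fds[1], msg, len) != (ssize_t)len) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }
    close(fds[1]);                       /* no writers left: reader can see EOF */

    ssize_t n = read(fds[0], out, outlen);   /* data is waiting, so no blocking */
    char extra;
    ssize_t eof = read(fds[0], &extra, 1);   /* 0 signals end of file */
    close(fds[0]);
    return (n >= 0 && eof == 0) ? n : -1;
}
```

Data flows only from the write end to the read end; two-way communication would need a second pipe, which is the limitation that named pipes remove.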

I/O Redirection Using an Anonymous Pipe
    /* Create default size anonymous pipe; handles are inheritable. */
    if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
        fprintf (stderr, "Anon pipe create failed\n"); exit (1);
    }
    /* Set output handle to pipe handle, create first process. */
    StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
    StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    StartInfoCh1.hStdOutput = hWritePipe;
    StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
        fprintf (stderr, "CreateProc1 failed\n"); exit (2);
    }
    CloseHandle (hWritePipe);

Pipe example (cont'd)
    /* Repeat (symmetrically) for the second process. */
    StartInfoCh2.hStdInput = hReadPipe;
    StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
    StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles. */
            0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
        fprintf (stderr, "CreateProc2 failed\n"); exit (3);
    }
    CloseHandle (hReadPipe);
    /* Wait for both processes to complete. */
    WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
    WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
    CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
    CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
    return 0;

Named Pipes
Message oriented
– Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
– Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
– Several clients can communicate with a single server using the same pipe name, each client over its own instance
– Server can respond to a client using the same instance
Pipe can be accessed over the network
– Location transparency
Convenience and connection functions

Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa)
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
– Not possible to create a pipe on a remote machine (. – local machine)
fdwOpenMode
– PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
– PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
– PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
– PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)

Named Pipes (cont'd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive – is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage)

Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo

Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa )
Combines a WriteFile, ReadFile sequence into a single call

Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut )
Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence into a single call
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo)
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe() call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
– Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
– GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
9090
Events
Multiple threads can be released when a single event is signaled (barrier synchronization)
- Manual-reset event: can signal several threads simultaneously; must be reset manually
- PulseEvent() releases all threads currently waiting on a manual-reset event and then automatically resets the event
- Auto-reset event: signals a single thread; the event is reset automatically
- fInitialState == TRUE: create the event in the signaled state
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTES lpsa, BOOL fManualReset, BOOL fInitialState, LPCTSTR lpszEventName );
BOOL SetEvent( HANDLE hEvent );
BOOL ResetEvent( HANDLE hEvent );
BOOL PulseEvent( HANDLE hEvent );
9292
Comparison - POSIX condition variables
pthread's condition variables are comparable to events
- pthread_cond_init()
- pthread_cond_destroy()
Wait functions
- pthread_cond_wait()
- pthread_cond_timedwait()
Signaling
- pthread_cond_signal() - one thread
- pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
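Since POSIX offers no direct counterpart to a manual-reset event, one common workaround is to pair a condition variable with a mutex-protected flag. The sketch below illustrates that idea only; it is not code from the lecture. The names manual_event, event_init, event_set, event_reset, event_wait, and demo_release_all are hypothetical, and error checking on the pthread calls is omitted for brevity.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical emulation of a manual-reset event: a flag guarded by a
   mutex, with a condition variable used to wake the waiters. */
typedef struct {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            signaled;
} manual_event;

static void event_init(manual_event *e, bool initial_state) {
    pthread_mutex_init(&e->lock, NULL);
    pthread_cond_init(&e->cond, NULL);
    e->signaled = initial_state;
}

static void event_destroy(manual_event *e) {
    pthread_cond_destroy(&e->cond);
    pthread_mutex_destroy(&e->lock);
}

/* Comparable to SetEvent on a manual-reset event: the flag stays set
   and every current waiter is released. */
static void event_set(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = true;
    pthread_cond_broadcast(&e->cond);
    pthread_mutex_unlock(&e->lock);
}

/* Comparable to ResetEvent: later waiters block again. */
static void event_reset(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    e->signaled = false;
    pthread_mutex_unlock(&e->lock);
}

static void event_wait(manual_event *e) {
    pthread_mutex_lock(&e->lock);
    while (!e->signaled)               /* loop guards against spurious wakeups */
        pthread_cond_wait(&e->cond, &e->lock);
    pthread_mutex_unlock(&e->lock);
}

#define N_WAITERS 4

static manual_event g_event;
static int g_released;                 /* protected by g_event.lock */

static void *waiter(void *arg) {
    (void)arg;
    event_wait(&g_event);
    pthread_mutex_lock(&g_event.lock);
    g_released++;
    pthread_mutex_unlock(&g_event.lock);
    return NULL;
}

/* Starts N_WAITERS threads, signals the event once, and returns how
   many threads were released by that single signal. */
int demo_release_all(void) {
    pthread_t t[N_WAITERS];
    g_released = 0;
    event_init(&g_event, false);
    for (int i = 0; i < N_WAITERS; i++)
        pthread_create(&t[i], NULL, waiter, NULL);
    event_set(&g_event);
    for (int i = 0; i < N_WAITERS; i++)
        pthread_join(t[i], NULL);
    event_reset(&g_event);             /* back to the non-signaled state */
    assert(!g_event.signaled);
    event_destroy(&g_event);
    return g_released;
}
```

Calling demo_release_all() from a main function should report that all four waiters were released by one event_set() call. The flag is what makes the signal "sticky": a thread that calls event_wait() after event_set() does not block, which a bare pthread_cond_broadcast() cannot guarantee.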
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phRead, PHANDLE phWrite, LPSECURITY_ATTRIBUTES lpsa, DWORD cbPipe );
[Diagram: main creates the pipe; prog1 -> pipe -> prog2]
Half-duplex, character-based IPC
cbPipe: pipe size in bytes; zero == default
A read on the pipe handle blocks if the pipe is empty
A write to a full pipe blocks
Anonymous pipes are one-way
9494
I/O Redirection using an Anonymous Pipe
    /* Create default size anonymous pipe, handles are inheritable. */
    if (!CreatePipe (&hReadPipe, &hWritePipe, &PipeSA, 0)) {
        fprintf(stderr, "Anon pipe create failed\n"); exit(1);
    }
    /* Set output handle to pipe handle, create first process. */
    StartInfoCh1.hStdInput = GetStdHandle (STD_INPUT_HANDLE);
    StartInfoCh1.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    StartInfoCh1.hStdOutput = hWritePipe;
    StartInfoCh1.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, (LPTSTR)Command1, NULL, NULL, TRUE, 0, NULL, NULL, &StartInfoCh1, &ProcInfo1)) {
        fprintf(stderr, "CreateProc1 failed\n"); exit(2);
    }
    /* The parent no longer needs the write end. */
    CloseHandle (hWritePipe);
9595
Pipe example (contd.)
    /* Repeat (symmetrically) for the second process. */
    StartInfoCh2.hStdInput = hReadPipe;
    StartInfoCh2.hStdError = GetStdHandle (STD_ERROR_HANDLE);
    StartInfoCh2.hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE);
    StartInfoCh2.dwFlags = STARTF_USESTDHANDLES;
    if (!CreateProcess (NULL, (LPTSTR)targv, NULL, NULL, TRUE, /* Inherit handles. */
            0, NULL, NULL, &StartInfoCh2, &ProcInfo2)) {
        fprintf(stderr, "CreateProc2 failed\n"); exit(3);
    }
    CloseHandle (hReadPipe);
    /* Wait for both processes to complete. */
    WaitForSingleObject (ProcInfo1.hProcess, INFINITE);
    WaitForSingleObject (ProcInfo2.hProcess, INFINITE);
    CloseHandle (ProcInfo1.hThread); CloseHandle (ProcInfo1.hProcess);
    CloseHandle (ProcInfo2.hThread); CloseHandle (ProcInfo2.hProcess);
    return 0;
9696
Named Pipes
Message oriented
- Reading process can read varying-length messages precisely as sent by the writing process
Bi-directional
- Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipe
- Several clients can communicate with a single server through the same named pipe, each over its own instance
- Server can respond to a client using the same instance
Pipe can be accessed over the network
- Location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeName, DWORD fdwOpenMode, DWORD fdwPipeMode, DWORD nMaxInstances, DWORD cbOutBuf, DWORD cbInBuf, DWORD dwTimeOut, LPSECURITY_ATTRIBUTES lpsa );
Use the same flag settings for all instances of a named pipe
lpszPipeName: \\.\pipe\[path]pipename
- Not possible to create a pipe on a remote machine (the "." stands for the local machine)
fdwOpenMode
- PIPE_ACCESS_DUPLEX, PIPE_ACCESS_INBOUND, PIPE_ACCESS_OUTBOUND
fdwPipeMode
- PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
- PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
- PIPE_WAIT or PIPE_NOWAIT (will ReadFile block?)
9898
Named Pipes (contd.)
nMaxInstances
- Number of instances
- PIPE_UNLIMITED_INSTANCES: OS chooses based on resources
dwTimeOut
- Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
- Closing the handle to the last instance deletes the named pipe
Polling a pipe
- Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage);
9999
Named Pipe Client Connections
CreateFile with the named pipe name
- \\.\pipe\[path]pipename
- \\servername\pipe\[path]pipename
- First method gives better performance (local server)
Status Functions
- GetNamedPipeHandleState
- SetNamedPipeHandleState
- GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
Combines a WriteFile, ReadFile sequence into a single call
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
Combines CreateFile, WriteFile, ReadFile, CloseHandle into a single call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic (for writes of at most PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single, well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
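The FIFO points above can be illustrated with a small POSIX sketch. It is not code from the lecture: the function name fifo_roundtrip, the 64-byte record size, and the FIFO path supplied by the caller are all assumptions made for the example. A forked child plays the client and sends one fixed-size record; the parent plays the server and reads it.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RECORD_SIZE 64   /* fixed-size record, since a FIFO is a byte stream */

/* Creates a FIFO at `path`, forks a writer that sends `request` as one
   fixed-size record, and reads that record back into `out`.
   Returns 0 on success, -1 on failure. */
int fifo_roundtrip(const char *path, const char *request, char out[RECORD_SIZE]) {
    assert(path != NULL && request != NULL && out != NULL);

    unlink(path);                               /* drop a stale FIFO, if any */
    if (mkfifo(path, 0600) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {                             /* child: the client */
        char record[RECORD_SIZE] = {0};         /* zero padding keeps it a C string */
        strncpy(record, request, RECORD_SIZE - 1);
        int wfd = open(path, O_WRONLY);         /* blocks until a reader opens */
        if (wfd < 0)
            _exit(1);
        ssize_t n = write(wfd, record, RECORD_SIZE);
        close(wfd);
        _exit(n == RECORD_SIZE ? 0 : 1);
    }

    /* parent: the server */
    ssize_t got = 0;
    int rfd = open(path, O_RDONLY);             /* blocks until a writer opens */
    if (rfd < 0) {
        kill(pid, SIGKILL);                     /* do not leave the child blocked */
    } else {
        /* keep reading until the whole record arrived or the writer closed */
        while (got < RECORD_SIZE) {
            ssize_t n = read(rfd, out + got, (size_t)(RECORD_SIZE - got));
            if (n <= 0)
                break;
            got += n;
        }
        close(rfd);
    }

    int status = 0;
    waitpid(pid, &status, 0);
    unlink(path);
    return (got == RECORD_SIZE && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

A caller might invoke fifo_roundtrip("/tmp/demo_fifo", "GET status", buf) with a 64-byte buffer. Because the FIFO carries no message boundaries, the reader loops until exactly RECORD_SIZE bytes have arrived. A Windows named pipe in message mode makes that loop unnecessary.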
104104
Client Example using Named Pipe
    WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
    hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
            0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (hNamedPipe == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "Failure to locate server\n"); exit(3);
    }
    /* Write the request. */
    WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
    /* Read each response and send it to std out. */
    while (ReadFile (hNamedPipe, Response.Record, MAX_RQRS_LEN, &nRead, NULL))
        printf ("%s", Response.Record);
    CloseHandle (hNamedPipe);
    return 0;
105105
Server Example Using a Named Pipe
    hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
            PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
            1, 0, 0, CS_TIMEOUT, pNPSA);
    while (!Done) {
        printf ("Server is awaiting next request.\n");
        if (!ConnectNamedPipe (hNamedPipe, NULL)
                || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
            fprintf(stderr, "Connect or Read Named Pipe error\n"); exit(4);
        }
        printf ("Request is %s\n", Request.Record);
        /* Send the file, one line at a time, to the client. */
        fp = fopen (File, "r");
        while ((fgets (Response.Record, MAX_RQRS_LEN, fp) != NULL))
            WriteFile (hNamedPipe, &Response.Record,
                    (strlen(Response.Record) + 1) * TSIZE, &nXfer, NULL);
        fclose (fp);
        DisconnectNamedPipe (hNamedPipe);
    } /* End of server operation. */
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
107107
Locate a server via mailslot
[Diagram: App clients 0 through n act as mailslot servers; the app server acts as the mailslot client]
Mailslot servers (App client 0 ... App client n):
    hMS = CreateMailslot( "\\.\mailslot\status" );
    ReadFile(hMS, &ServStat);
    /* connect to server */
Mailslot client (App server):
    while (...) {
        Sleep(...);
        hMS = CreateFile( "\\*\mailslot\status" );
        WriteFile(hMS, &StatInfo);
    }
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot(LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa);
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieves a handle for the local mailslot
- \\host\mailslot\[path]name - retrieves a handle for the mailslot on the specified host
- \\domain\mailslot\[path]name - returns a handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns a handle representing mailslots on machines in the system's primary domain; max message length: 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
9191
Events
HANDLE CreateEvent( LPSECURITY_ATTRIBUTE lpsaBOOL fManualReset BOOL fInititalStateLPTSTR lpszEventName )
BOOL SetEvent( HANDLE hEvent )BOOL ResetEvent( HANDLE hEvent )BOOL PulseEvent( HANDLE hEvent )
9292
Comparison - POSIX condition variables
pthreadrsquos condition variables are comparable to eventsndash pthread_cond_init()
ndash pthread_cond_destroy()
Wait functionsndash pthread_cond_wait()
ndash pthread_cond_timedwait()
Signalingndash pthread_cond_signal() - one thread
ndash pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phReadPHANDLE phWriteLPSECURITY_ATTRIBUTES lpsaDWORD cbPipe )
main
prog1 prog2pipe
Half-duplex character-based IPC
cbPipe pipe byte size zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are oneway
9494
IO Redirection using an Anonymous Pipe
Create default size anonymous pipe handles are inheritable
if (CreatePipe (amphReadPipe amphWritePipe ampPipeSA 0))
fprintf(stderr ldquoAnon pipe create failednrdquo) exit(1)
Set output handle to pipe handle create first processes
StartInfoCh1hStdInput = GetStdHandle (STD_INPUT_HANDLE)
StartInfoCh1hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh1hStdOutput = hWritePipe
StartInfoCh1dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)Command1 NULL NULL TRUE 0 NULL NULL ampStartInfoCh1 ampProcInfo1))
fprintf(stderr ldquoCreateProc1 failednrdquo) exit(2)
CloseHandle (hWritePipe)
9595
Pipe example (contd)
Repeat (symmetrically) for the second process
StartInfoCh2hStdInput = hReadPipe
StartInfoCh2hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh2hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE)
StartInfoCh2dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)targv NULL NULLTRUE Inherit handles
0 NULL NULL ampStartInfoCh2 ampProcInfo2))
fprintf(stderr ldquoCreateProc2 failednrdquo) exit(3)
CloseHandle (hReadPipe)
Wait for both processes to complete
WaitForSingleObject (ProcInfo1hProcess INFINITE)
WaitForSingleObject (ProcInfo2hProcess INFINITE)
CloseHandle (ProcInfo1hThread) CloseHandle (ProcInfo1hProcess)
CloseHandle (ProcInfo2hThread) CloseHandle (ProcInfo2hProcess)
return 0
9696
Named Pipes
Message orientedndash Reading process can read varying-length messages precisely as sent by the writing process
Bi-directionalndash Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipendash Several clients can communicate with a single server using the same instance
ndash Server can respond to client using the same instance
Pipe can be accessed over the networkndash location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeNameDWORD fdwOpenMode DWORD fdwPipModeDWORD nMaxInstances DWORD cbOutBufDWORD cbInBuf DWORD dwTimeOutLPSECURITY_ATTRIBUTES lpsa )
Use same flag settings forall instances of a named pipe
lpszPipeName pipe[path]pipename
ndash Not possible to create a pipe on remote machine ( ndash local machine)
fdwOpenMode
ndash PIPE_ACCESS_DUPLEX PIPE_ACCESS_INBOUND PIPE_ACCESS_OUTBOUND
fdwPipeMode
ndash PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
ndash PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
ndash PIPE_WAIT or PIPE_NOWAIT (will ReadFile block)
9898
Named Pipes (contd)
BOOL PeekNamedPipe (HANDLE hPipeLPVOID lpvBuffer DWORD cbBufferLPDWORD lpcbRead LPDWORD lpcbAvailLPDWORD lpcbMessage)
nMaxInstances
ndash Number of instances
ndash PIPE_UNLIMITED_INSTANCES OS choice based on resources
dwTimeOut
ndash Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
ndash Closing handle to last instance deletes named pipe
Polling a pipe
ndash Nondestructive ndash is there a message waiting for ReadFile
9999
Named Pipe Client Connections
CreateFile with named pipe namendash pipe[path]pipename
ndash servernamepipe[path]pipename
ndash First method gives better performance (local server)
Status Functionsndash GetNamedPipeHandleState
ndash SetNamedPipeHandleState
ndash GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipeLPVOID lpvWriteBuf DWORD cbWriteBufLPVOID lpvReadBuf DWORD cbReadBufLPDOWRD lpcbRead LPOVERLAPPED lpa)
WriteFile ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeNameLPVOID lpvWriteBuf DWORD cbWriteBufLPVOID lpvReadBuf DWORD cbReadBufLPDWORD lpcbRead DWORD dwTimeOut)
CreateFile WriteFile ReadFile CloseHandle
ndash dwTimeOut NMPWAIT_NOWAIT NMPWAIT_WIAT_FOREVER NMPWAIT_USE_DEFAULT_WAIT
102102
Server eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipeLPOVERLAPPED lpo
lpo == NULLndash Call will return as soon as there is a client connection
ndash Returns false if client connected between CreateNamed Pipe calland ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()ndash Client may wait for serverlsquos ConnectNamedPipe()
Security rights for named pipesndash GENERIC_READ GENERIC_WRITE SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
ndash FIFOs are half-duplex
ndash FIFOs are limited to a single machine
ndash FIFOs are still byte-oriented so its easiest to use fixed-size records in clientserver applications
ndash Individual readwrites are atomic
A server using FIFOs must use a separate FIFO for each clientlsquos response although all clients can send requests via a single well known FIFO
Mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked clientserver scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName NMPWAIT_WAIT_FOREVER)
hNamedPipe = CreateFile (ServerPipeName GENERIC_READ | GENERIC_WRITE
0 NULL OPEN_EXISTING FILE_ATTRIBUTE_NORMAL NULL)
if (hNamedPipe == INVALID_HANDLE_VALUE)
fptinf(stderr Failure to locate servern) exit(3)
Write the request
WriteFile (hNamedPipe ampRequest MAX_RQRS_LEN ampnWrite NULL)
Read each response and send it to std out
while (ReadFile (hNamedPipe ResponseRecord MAX_RQRS_LEN ampnRead NULL))
printf (s ResponseRecord)
CloseHandle (hNamedPipe)
return 0
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE PIPE_ACCESS_DUPLEX
PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT
1 0 0 CS_TIMEOUT pNPSA)
while (Done)
printf (Server is awaiting next requestn)
if (ConnectNamedPipe (hNamedPipe NULL)
|| ReadFile (hNamedPipe ampRequest RQ_SIZE ampnXfer NULL))
fprintf(stderr ldquoConnect or Read Named Pipe errornrdquo) exit(4)
printf( ldquoRequest is sn RequestRecord)
Send the file one line at a time to the client
fp = fopen (File r)
while ((fgets (ResponseRecord MAX_RQRS_LEN fp) = NULL))
WriteFile (hNamedPipe ampResponseRecord
(strlen(ResponseRecord) + 1) TSIZE ampnXfer NULL)
fclose (fp)
DisconnectNamedPipe (hNamedPipe)
End of server operation
106106
Win32 IPC - MailslotsMailslots bear some nasty implementation detailsthey are almost never used
Broadcast mechanism
ndash One-directional
ndash Mutliple writersmultiple readers (frequently one-to-many comm)
ndash Message delivery is unreliable
ndash Can be located over a network domain
ndash Message lengths are limited (w2k lt 426 byte) Operations on the mailslot
ndash Each reader (server) creates mailslot with CreateMailslot()
ndash Write-only client opens mailslot with CreateFile() and uses WriteFile() ndash open will fail if there are no waiting readers
ndash Clientlsquos message can be read by all servers (readers) Client lookup mailslotmailslotname
ndash Client will connect to every server in network domain
107107
Locate a server via mailslot
hMS = CreateMailslot( ldquomailslotstatusldquo)ReadFile(hMS ampServStat) connect to server
hMS = CreateMailslot( ldquomailslotstatusldquo)ReadFile(hMS ampServStat) connect to server
App client 0
App client n
Mailslot Servers
While () Sleep() hMS = CreateFile( ldquomailslotstatusldquo)
WriteFile(hMS ampStatInfo
App Server
Mailslot Client
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot(LPCTSTR lpszNameDWORD cbMaxMsgDWORD dwReadTimeoutLPSECURITY_ATTRIBUTES lpsa)
lpszName points to a name of the formndash mailslot[path]name
ndash Name must be unique mailslot is created locally
cbMaxMsg is msg size in byte
dwReadTimeout ndash Read operation will wait for so many msec
ndash 0 ndash immediate return
ndash MAILSLOT_WAIT_FOREVER ndash infinite wait
109109
Opening a mailslot
CreateFile with the following namesndash mailslot[path]name - retrieve handle for local mailslot
ndash hostmailslot[path]name - retrieve handlefor mailslot on specified host
ndash domainmailslot[path]name - returns handle representing all mailslots on machines in the domain
ndash mailslot[path]name - returns handle representing mailslots on machines in the systemlsquos primary domain max mesg len 400 bytes
ndash Client must specifiy FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
9292
Comparison - POSIX condition variables
pthreadrsquos condition variables are comparable to eventsndash pthread_cond_init()
ndash pthread_cond_destroy()
Wait functionsndash pthread_cond_wait()
ndash pthread_cond_timedwait()
Signalingndash pthread_cond_signal() - one thread
ndash pthread_cond_broadcast() - all waiting threads
No exact equivalent to manual-reset events
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phReadPHANDLE phWriteLPSECURITY_ATTRIBUTES lpsaDWORD cbPipe )
main
prog1 prog2pipe
Half-duplex character-based IPC
cbPipe pipe byte size zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are oneway
9494
IO Redirection using an Anonymous Pipe
Create default size anonymous pipe handles are inheritable
if (CreatePipe (amphReadPipe amphWritePipe ampPipeSA 0))
fprintf(stderr ldquoAnon pipe create failednrdquo) exit(1)
Set output handle to pipe handle create first processes
StartInfoCh1hStdInput = GetStdHandle (STD_INPUT_HANDLE)
StartInfoCh1hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh1hStdOutput = hWritePipe
StartInfoCh1dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)Command1 NULL NULL TRUE 0 NULL NULL ampStartInfoCh1 ampProcInfo1))
fprintf(stderr ldquoCreateProc1 failednrdquo) exit(2)
CloseHandle (hWritePipe)
9595
Pipe example (contd)
Repeat (symmetrically) for the second process
StartInfoCh2hStdInput = hReadPipe
StartInfoCh2hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh2hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE)
StartInfoCh2dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)targv NULL NULLTRUE Inherit handles
0 NULL NULL ampStartInfoCh2 ampProcInfo2))
fprintf(stderr ldquoCreateProc2 failednrdquo) exit(3)
CloseHandle (hReadPipe)
Wait for both processes to complete
WaitForSingleObject (ProcInfo1hProcess INFINITE)
WaitForSingleObject (ProcInfo2hProcess INFINITE)
CloseHandle (ProcInfo1hThread) CloseHandle (ProcInfo1hProcess)
CloseHandle (ProcInfo2hThread) CloseHandle (ProcInfo2hProcess)
return 0
9696
Named Pipes
Message orientedndash Reading process can read varying-length messages precisely as sent by the writing process
Bi-directionalndash Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipendash Several clients can communicate with a single server using the same instance
ndash Server can respond to client using the same instance
Pipe can be accessed over the networkndash location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeNameDWORD fdwOpenMode DWORD fdwPipModeDWORD nMaxInstances DWORD cbOutBufDWORD cbInBuf DWORD dwTimeOutLPSECURITY_ATTRIBUTES lpsa )
Use same flag settings forall instances of a named pipe
lpszPipeName pipe[path]pipename
ndash Not possible to create a pipe on remote machine ( ndash local machine)
fdwOpenMode
ndash PIPE_ACCESS_DUPLEX PIPE_ACCESS_INBOUND PIPE_ACCESS_OUTBOUND
fdwPipeMode
ndash PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
ndash PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
ndash PIPE_WAIT or PIPE_NOWAIT (will ReadFile block)
9898
Named Pipes (contd)
BOOL PeekNamedPipe (HANDLE hPipeLPVOID lpvBuffer DWORD cbBufferLPDWORD lpcbRead LPDWORD lpcbAvailLPDWORD lpcbMessage)
nMaxInstances
ndash Number of instances
ndash PIPE_UNLIMITED_INSTANCES OS choice based on resources
dwTimeOut
ndash Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
ndash Closing handle to last instance deletes named pipe
Polling a pipe
ndash Nondestructive ndash is there a message waiting for ReadFile
9999
Named Pipe Client Connections
CreateFile with named pipe namendash pipe[path]pipename
ndash servernamepipe[path]pipename
ndash First method gives better performance (local server)
Status Functionsndash GetNamedPipeHandleState
ndash SetNamedPipeHandleState
ndash GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipeLPVOID lpvWriteBuf DWORD cbWriteBufLPVOID lpvReadBuf DWORD cbReadBufLPDOWRD lpcbRead LPOVERLAPPED lpa)
WriteFile ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeNameLPVOID lpvWriteBuf DWORD cbWriteBufLPVOID lpvReadBuf DWORD cbReadBufLPDWORD lpcbRead DWORD dwTimeOut)
CreateFile WriteFile ReadFile CloseHandle
ndash dwTimeOut NMPWAIT_NOWAIT NMPWAIT_WIAT_FOREVER NMPWAIT_USE_DEFAULT_WAIT
102102
Server eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipeLPOVERLAPPED lpo
lpo == NULLndash Call will return as soon as there is a client connection
ndash Returns false if client connected between CreateNamed Pipe calland ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()ndash Client may wait for serverlsquos ConnectNamedPipe()
Security rights for named pipesndash GENERIC_READ GENERIC_WRITE SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
ndash FIFOs are half-duplex
ndash FIFOs are limited to a single machine
ndash FIFOs are still byte-oriented so its easiest to use fixed-size records in clientserver applications
ndash Individual readwrites are atomic
A server using FIFOs must use a separate FIFO for each clientlsquos response although all clients can send requests via a single well known FIFO
Mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked clientserver scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName NMPWAIT_WAIT_FOREVER)
hNamedPipe = CreateFile (ServerPipeName GENERIC_READ | GENERIC_WRITE
0 NULL OPEN_EXISTING FILE_ATTRIBUTE_NORMAL NULL)
if (hNamedPipe == INVALID_HANDLE_VALUE)
fptinf(stderr Failure to locate servern) exit(3)
Write the request
WriteFile (hNamedPipe ampRequest MAX_RQRS_LEN ampnWrite NULL)
Read each response and send it to std out
while (ReadFile (hNamedPipe ResponseRecord MAX_RQRS_LEN ampnRead NULL))
printf (s ResponseRecord)
CloseHandle (hNamedPipe)
return 0
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE PIPE_ACCESS_DUPLEX
PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT
1 0 0 CS_TIMEOUT pNPSA)
while (Done)
printf (Server is awaiting next requestn)
if (ConnectNamedPipe (hNamedPipe NULL)
|| ReadFile (hNamedPipe ampRequest RQ_SIZE ampnXfer NULL))
fprintf(stderr ldquoConnect or Read Named Pipe errornrdquo) exit(4)
printf( ldquoRequest is sn RequestRecord)
Send the file one line at a time to the client
fp = fopen (File r)
while ((fgets (ResponseRecord MAX_RQRS_LEN fp) = NULL))
WriteFile (hNamedPipe ampResponseRecord
(strlen(ResponseRecord) + 1) TSIZE ampnXfer NULL)
fclose (fp)
DisconnectNamedPipe (hNamedPipe)
End of server operation
106106
Win32 IPC - MailslotsMailslots bear some nasty implementation detailsthey are almost never used
Broadcast mechanism
ndash One-directional
ndash Mutliple writersmultiple readers (frequently one-to-many comm)
ndash Message delivery is unreliable
ndash Can be located over a network domain
ndash Message lengths are limited (w2k lt 426 byte) Operations on the mailslot
ndash Each reader (server) creates mailslot with CreateMailslot()
ndash Write-only client opens mailslot with CreateFile() and uses WriteFile() ndash open will fail if there are no waiting readers
ndash Clientlsquos message can be read by all servers (readers) Client lookup mailslotmailslotname
ndash Client will connect to every server in network domain
107107
Locate a server via mailslot
hMS = CreateMailslot( ldquomailslotstatusldquo)ReadFile(hMS ampServStat) connect to server
hMS = CreateMailslot( ldquomailslotstatusldquo)ReadFile(hMS ampServStat) connect to server
App client 0
App client n
Mailslot Servers
While () Sleep() hMS = CreateFile( ldquomailslotstatusldquo)
WriteFile(hMS ampStatInfo
App Server
Mailslot Client
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot(LPCTSTR lpszNameDWORD cbMaxMsgDWORD dwReadTimeoutLPSECURITY_ATTRIBUTES lpsa)
lpszName points to a name of the formndash mailslot[path]name
ndash Name must be unique mailslot is created locally
cbMaxMsg is msg size in byte
dwReadTimeout ndash Read operation will wait for so many msec
ndash 0 ndash immediate return
ndash MAILSLOT_WAIT_FOREVER ndash infinite wait
109109
Opening a mailslot
CreateFile with the following namesndash mailslot[path]name - retrieve handle for local mailslot
ndash hostmailslot[path]name - retrieve handlefor mailslot on specified host
ndash domainmailslot[path]name - returns handle representing all mailslots on machines in the domain
ndash mailslot[path]name - returns handle representing mailslots on machines in the systemlsquos primary domain max mesg len 400 bytes
ndash Client must specifiy FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
Thoughts Change Life意念改变生活
- Unit 3 Concurrency
- Interrupt Dispatching
- Thoughts Change Life 意念改变生活
-
9393
Anonymous pipes
BOOL CreatePipe( PHANDLE phReadPHANDLE phWriteLPSECURITY_ATTRIBUTES lpsaDWORD cbPipe )
main
prog1 prog2pipe
Half-duplex character-based IPC
cbPipe pipe byte size zero == default
Read on pipe handle will block if pipe is empty
Write operation to a full pipe will block
Anonymous pipes are oneway
9494
IO Redirection using an Anonymous Pipe
Create default size anonymous pipe handles are inheritable
if (CreatePipe (amphReadPipe amphWritePipe ampPipeSA 0))
fprintf(stderr ldquoAnon pipe create failednrdquo) exit(1)
Set output handle to pipe handle create first processes
StartInfoCh1hStdInput = GetStdHandle (STD_INPUT_HANDLE)
StartInfoCh1hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh1hStdOutput = hWritePipe
StartInfoCh1dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)Command1 NULL NULL TRUE 0 NULL NULL ampStartInfoCh1 ampProcInfo1))
fprintf(stderr ldquoCreateProc1 failednrdquo) exit(2)
CloseHandle (hWritePipe)
9595
Pipe example (contd)
Repeat (symmetrically) for the second process
StartInfoCh2hStdInput = hReadPipe
StartInfoCh2hStdError = GetStdHandle (STD_ERROR_HANDLE)
StartInfoCh2hStdOutput = GetStdHandle (STD_OUTPUT_HANDLE)
StartInfoCh2dwFlags = STARTF_USESTDHANDLES
if (CreateProcess (NULL (LPTSTR)targv NULL NULLTRUE Inherit handles
0 NULL NULL ampStartInfoCh2 ampProcInfo2))
fprintf(stderr ldquoCreateProc2 failednrdquo) exit(3)
CloseHandle (hReadPipe)
Wait for both processes to complete
WaitForSingleObject (ProcInfo1hProcess INFINITE)
WaitForSingleObject (ProcInfo2hProcess INFINITE)
CloseHandle (ProcInfo1hThread) CloseHandle (ProcInfo1hProcess)
CloseHandle (ProcInfo2hThread) CloseHandle (ProcInfo2hProcess)
return 0
9696
Named Pipes
Message orientedndash Reading process can read varying-length messages precisely as sent by the writing process
Bi-directionalndash Two processes can exchange messages over the same pipe
Multiple independent instances of a named pipendash Several clients can communicate with a single server using the same instance
ndash Server can respond to client using the same instance
Pipe can be accessed over the networkndash location transparency
Convenience and connection functions
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeNameDWORD fdwOpenMode DWORD fdwPipModeDWORD nMaxInstances DWORD cbOutBufDWORD cbInBuf DWORD dwTimeOutLPSECURITY_ATTRIBUTES lpsa )
Use same flag settings forall instances of a named pipe
lpszPipeName pipe[path]pipename
ndash Not possible to create a pipe on remote machine ( ndash local machine)
fdwOpenMode
ndash PIPE_ACCESS_DUPLEX PIPE_ACCESS_INBOUND PIPE_ACCESS_OUTBOUND
fdwPipeMode
ndash PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
ndash PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
ndash PIPE_WAIT or PIPE_NOWAIT (will ReadFile block)
9898
Named Pipes (contd)
BOOL PeekNamedPipe (HANDLE hPipeLPVOID lpvBuffer DWORD cbBufferLPDWORD lpcbRead LPDWORD lpcbAvailLPDWORD lpcbMessage)
nMaxInstances
ndash Number of instances
ndash PIPE_UNLIMITED_INSTANCES OS choice based on resources
dwTimeOut
ndash Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
ndash Closing handle to last instance deletes named pipe
Polling a pipe
ndash Nondestructive ndash is there a message waiting for ReadFile
9999
Named Pipe Client Connections
CreateFile with named pipe namendash pipe[path]pipename
ndash servernamepipe[path]pipename
ndash First method gives better performance (local server)
Status Functionsndash GetNamedPipeHandleState
ndash SetNamedPipeHandleState
ndash GetNamedPipeInfo

Convenience Functions
BOOL TransactNamedPipe (HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa);
Combines a WriteFile, ReadFile sequence into a single call

Convenience Functions (contd)
BOOL CallNamedPipe (LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut);
Combines CreateFile, WriteFile, ReadFile, CloseHandle into a single call
- dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT

Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo);
lpo == NULL
- Call will return as soon as there is a client connection
- Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then reports ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()
- Client may wait for the server's ConnectNamedPipe()
Security rights for named pipes
- GENERIC_READ, GENERIC_WRITE, SYNCHRONIZE
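
Because a FALSE return from ConnectNamedPipe does not always mean failure, a server usually has to look at the error code as well. The following is a small sketch of that decision only. The function name client_is_connected is made up for illustration, and the stand-in typedefs exist solely so the fragment compiles off Windows (the value 535 is the one winerror.h assigns to ERROR_PIPE_CONNECTED).

```c
#include <assert.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins so the sketch builds without the Windows headers. */
typedef int BOOL;
typedef unsigned long DWORD;
#define ERROR_PIPE_CONNECTED 535L
#endif

/* Interprets the outcome of ConnectNamedPipe(hNamedPipe, NULL).
 * connect_result: the BOOL returned by ConnectNamedPipe.
 * last_error:     GetLastError(), sampled right after a FALSE return.
 * Returns 1 if a client is connected and the server may call ReadFile. */
int client_is_connected(BOOL connect_result, DWORD last_error)
{
    if (connect_result)
        return 1;                                  /* normal case */
    return last_error == ERROR_PIPE_CONNECTED;     /* client was early */
}
```

On Windows the two inputs should be obtained in separate statements (first the ConnectNamedPipe result, then GetLastError() only if that result is FALSE), since C does not fix the order in which function arguments are evaluated.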

Comparison with UNIX
UNIX FIFOs are similar to a named pipe
- FIFOs are half-duplex
- FIFOs are limited to a single machine
- FIFOs are still byte-oriented, so it's easiest to use fixed-size records in client/server applications
- Individual reads/writes are atomic (POSIX guarantees this for writes of at most PIPE_BUF bytes)
A server using FIFOs must use a separate FIFO for each client's response, although all clients can send requests via a single well-known FIFO
mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked client/server scenarios
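
The fixed-size-record advice can be made concrete with a short POSIX sketch. Everything named here (RECORD_LEN, read_record, fifo_round_trip) is an assumption for illustration: a child process plays the client and writes one 64-byte record into a FIFO, and the parent plays the server and reads exactly one record back.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define RECORD_LEN 64   /* assumed record size, well below PIPE_BUF */

/* Reads exactly RECORD_LEN bytes; a single read() may return fewer. */
static int read_record(int fd, char rec[RECORD_LEN])
{
    size_t got = 0;
    while (got < RECORD_LEN) {
        ssize_t n = read(fd, rec + got, RECORD_LEN - got);
        if (n <= 0)
            return -1;               /* error, or writer closed early */
        got += (size_t)n;
    }
    return 0;
}

/* Creates a FIFO at path, lets a child write one zero-padded record
 * holding request, and reads that record into out. Returns 0 on success. */
int fifo_round_trip(const char *path, const char *request,
                    char out[RECORD_LEN])
{
    assert(path != NULL && request != NULL && out != NULL);
    unlink(path);                    /* stale FIFO may or may not exist */
    if (mkfifo(path, 0600) != 0)
        return -1;

    pid_t pid = fork();
    if (pid < 0) {
        unlink(path);
        return -1;
    }
    if (pid == 0) {                  /* child: the "client" */
        char rec[RECORD_LEN] = {0};
        strncpy(rec, request, RECORD_LEN - 1);
        int wfd = open(path, O_WRONLY);   /* blocks until a reader opens */
        if (wfd < 0)
            _exit(1);
        ssize_t n = write(wfd, rec, RECORD_LEN);
        close(wfd);
        _exit(n == RECORD_LEN ? 0 : 1);
    }

    int rfd = open(path, O_RDONLY);  /* parent: the "server" */
    int rc = -1;
    if (rfd >= 0) {
        rc = read_record(rfd, out);
        close(rfd);
    } else {
        kill(pid, SIGKILL);          /* child would wait forever otherwise */
    }
    waitpid(pid, NULL, 0);
    unlink(path);
    return rc;
}
```

Because every record has the same length, the reader never needs to search for message boundaries, which is the service a message-mode named pipe provides on Windows.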

Client Example Using a Named Pipe
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf (stderr, "Failure to locate server\n"); exit (3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;

Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request\n");
    /* Simplified: a client that connected early makes ConnectNamedPipe
       return FALSE with ERROR_PIPE_CONNECTED, which is treated as an error here */
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf (stderr, "Connect or Read Named Pipe error\n"); exit (4);
    }
    printf ("Request is: %s\n", Request.Record);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while ((fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL))
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen (ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */

Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many comm.)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain

Locate a server via mailslot
Mailslot servers (App client 0 ... App client n), each running:
    hMS = CreateMailslot ("\\.\mailslot\status" ...);
    ReadFile (hMS, &ServStat ...);
    /* connect to server */
Mailslot client (App Server), message is sent periodically:
    while (...) {
        Sleep (...);
        hMS = CreateFile ("\\*\mailslot\status" ...);
        WriteFile (hMS, &StatInfo ...);
    }

Creating a mailslot
HANDLE CreateMailslot (LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa);
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for so many msec
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait

Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name - retrieve handle for local mailslot
- \\host\mailslot\[path]name - retrieve handle for mailslot on specified host
- \\domain\mailslot\[path]name - returns handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name - returns handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
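
The four name forms map naturally onto a small selector. The sketch below only formats the string a client would hand to CreateFile. The enum ms_scope and the function make_mailslot_name are hypothetical names introduced for this example, not Win32 identifiers.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical selector for the four mailslot name forms. */
typedef enum {
    MS_LOCAL,           /* "." : mailslot on the local machine      */
    MS_HOST,            /* host name supplied in target             */
    MS_DOMAIN,          /* domain name supplied in target           */
    MS_PRIMARY_DOMAIN   /* "*" : all machines in the primary domain */
} ms_scope;

/* Builds "\\<where>\mailslot\<name>" into buf.
 * target is used only for MS_HOST and MS_DOMAIN.
 * Returns 0 on success, -1 on a missing target or a too-small buffer. */
int make_mailslot_name(char *buf, size_t bufsize, ms_scope scope,
                       const char *target, const char *name)
{
    const char *where;
    assert(buf != NULL && name != NULL);
    switch (scope) {
    case MS_LOCAL:          where = ".";    break;
    case MS_PRIMARY_DOMAIN: where = "*";    break;
    case MS_HOST:
    case MS_DOMAIN:         where = target; break;
    default:                return -1;
    }
    if (where == NULL || where[0] == '\0')
        return -1;
    int n = snprintf(buf, bufsize, "\\\\%s\\mailslot\\%s", where, name);
    return (n >= 0 && (size_t)n < bufsize) ? 0 : -1;
}
```

A reader creating the mailslot would always use the MS_LOCAL form, since a mailslot can only be created on the local machine; the other three forms are for writers.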

Thoughts Change Life 意念改变生活
9797
Using Named Pipes
HANDLE CreateNamedPipe (LPCTSTR lpszPipeNameDWORD fdwOpenMode DWORD fdwPipModeDWORD nMaxInstances DWORD cbOutBufDWORD cbInBuf DWORD dwTimeOutLPSECURITY_ATTRIBUTES lpsa )
Use same flag settings forall instances of a named pipe
lpszPipeName pipe[path]pipename
ndash Not possible to create a pipe on remote machine ( ndash local machine)
fdwOpenMode
ndash PIPE_ACCESS_DUPLEX PIPE_ACCESS_INBOUND PIPE_ACCESS_OUTBOUND
fdwPipeMode
ndash PIPE_TYPE_BYTE or PIPE_TYPE_MESSAGE
ndash PIPE_READMODE_BYTE or PIPE_READMODE_MESSAGE
ndash PIPE_WAIT or PIPE_NOWAIT (will ReadFile block)
9898
Named Pipes (contd)
BOOL PeekNamedPipe (HANDLE hPipeLPVOID lpvBuffer DWORD cbBufferLPDWORD lpcbRead LPDWORD lpcbAvailLPDWORD lpcbMessage)
nMaxInstances
ndash Number of instances
ndash PIPE_UNLIMITED_INSTANCES OS choice based on resources
dwTimeOut
ndash Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates named pipe
ndash Closing handle to last instance deletes named pipe
Polling a pipe
ndash Nondestructive ndash is there a message waiting for ReadFile
9999
Named Pipe Client Connections
CreateFile with named pipe namendash pipe[path]pipename
ndash servernamepipe[path]pipename
ndash First method gives better performance (local server)
Status Functionsndash GetNamedPipeHandleState
ndash SetNamedPipeHandleState
ndash GetNamedPipeInfo
100100
Convenience Functions
BOOL TransactNamedPipe( HANDLE hNamedPipeLPVOID lpvWriteBuf DWORD cbWriteBufLPVOID lpvReadBuf DWORD cbReadBufLPDOWRD lpcbRead LPOVERLAPPED lpa)
WriteFile ReadFile sequence
101101
Convenience Functions
BOOL CallNamedPipe( LPCTSTR lpszPipeNameLPVOID lpvWriteBuf DWORD cbWriteBufLPVOID lpvReadBuf DWORD cbReadBufLPDWORD lpcbRead DWORD dwTimeOut)
CreateFile WriteFile ReadFile CloseHandle
ndash dwTimeOut NMPWAIT_NOWAIT NMPWAIT_WIAT_FOREVER NMPWAIT_USE_DEFAULT_WAIT
102102
Server eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipeLPOVERLAPPED lpo
lpo == NULLndash Call will return as soon as there is a client connection
ndash Returns false if client connected between CreateNamed Pipe calland ConnectNamedPipe()
Use DisconnectNamedPipe to free the handle for connection from another client
WaitNamedPipe()ndash Client may wait for serverlsquos ConnectNamedPipe()
Security rights for named pipesndash GENERIC_READ GENERIC_WRITE SYNCHRONIZE
103103
Comparison with UNIX
UNIX FIFOs are similar to a named pipe
ndash FIFOs are half-duplex
ndash FIFOs are limited to a single machine
ndash FIFOs are still byte-oriented so its easiest to use fixed-size records in clientserver applications
ndash Individual readwrites are atomic
A server using FIFOs must use a separate FIFO for each clientlsquos response although all clients can send requests via a single well known FIFO
Mkfifo() is the UNIX counterpart to CreateNamedPipe()
Use sockets for networked clientserver scenarios
104104
Client Example using Named Pipe
WaitNamedPipe (ServerPipeName NMPWAIT_WAIT_FOREVER)
hNamedPipe = CreateFile (ServerPipeName GENERIC_READ | GENERIC_WRITE
0 NULL OPEN_EXISTING FILE_ATTRIBUTE_NORMAL NULL)
if (hNamedPipe == INVALID_HANDLE_VALUE)
fptinf(stderr Failure to locate servern) exit(3)
Write the request
WriteFile (hNamedPipe ampRequest MAX_RQRS_LEN ampnWrite NULL)
Read each response and send it to std out
while (ReadFile (hNamedPipe ResponseRecord MAX_RQRS_LEN ampnRead NULL))
printf (s ResponseRecord)
CloseHandle (hNamedPipe)
return 0
105105
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE PIPE_ACCESS_DUPLEX
PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT
1 0 0 CS_TIMEOUT pNPSA)
while (Done)
printf (Server is awaiting next requestn)
if (ConnectNamedPipe (hNamedPipe NULL)
|| ReadFile (hNamedPipe ampRequest RQ_SIZE ampnXfer NULL))
fprintf(stderr ldquoConnect or Read Named Pipe errornrdquo) exit(4)
printf( ldquoRequest is sn RequestRecord)
Send the file one line at a time to the client
fp = fopen (File r)
while ((fgets (ResponseRecord MAX_RQRS_LEN fp) = NULL))
WriteFile (hNamedPipe ampResponseRecord
(strlen(ResponseRecord) + 1) TSIZE ampnXfer NULL)
fclose (fp)
DisconnectNamedPipe (hNamedPipe)
End of server operation
106106
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
– One-directional
– Multiple writers/multiple readers (frequently one-to-many communication)
– Message delivery is unreliable
– Can be located over a network domain
– Message lengths are limited (Windows 2000: < 426 bytes)
Operations on the mailslot
– Each reader (server) creates a mailslot with CreateMailslot()
– Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
– Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
– Client will connect to every server in the network domain
107107
Locate a server via mailslot
Mailslot servers (App client 0 ... App client n), each running:
    hMS = CreateMailslot ("\\.\mailslot\status");
    ReadFile (hMS, &ServStat);
    /* connect to server */
Mailslot client (App server):
    while (...) { Sleep (...); hMS = CreateFile ("\\*\mailslot\status");
    WriteFile (hMS, &StatInfo, ...); }
Message is sent periodically
108108
Creating a mailslot
HANDLE CreateMailslot (LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa)
lpszName points to a name of the form
– \\.\mailslot\[path]name
– Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
– Read operation will wait for this many msec
– 0: immediate return
– MAILSLOT_WAIT_FOREVER: infinite wait
109109
Opening a mailslot
CreateFile with the following names
– \\.\mailslot\[path]name: retrieves a handle for the local mailslot
– \\host\mailslot\[path]name: retrieves a handle for the mailslot on the specified host
– \\domain\mailslot\[path]name: returns a handle representing all mailslots on machines in the domain
– \\*\mailslot\[path]name: returns a handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
– Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
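The four name forms above differ only in the component that follows the leading double backslash. A small portable C helper can make that concrete; the enum and function names below are hypothetical and exist only for this illustration, while the name formats themselves are the ones listed on the slide. The helper builds a string and does not call any Win32 API.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical scope selector; only the resulting name formats are Win32. */
enum mailslot_scope { MS_LOCAL, MS_HOST, MS_DOMAIN, MS_PRIMARY_DOMAIN };

/* Writes the mailslot name into buf. `target` is the host or domain name and
   is ignored for MS_LOCAL and MS_PRIMARY_DOMAIN. Returns 0 on success, -1 if
   an argument is missing or buf is too small. */
int mailslot_name(char *buf, size_t len, enum mailslot_scope scope,
                  const char *target, const char *name)
{
    const char *where;

    assert(buf != NULL);
    switch (scope) {
    case MS_LOCAL:          where = ".";    break;   /* \\.\mailslot\name */
    case MS_PRIMARY_DOMAIN: where = "*";    break;   /* \\*\mailslot\name */
    case MS_HOST:                                    /* \\host\mailslot\name */
    case MS_DOMAIN:         where = target; break;   /* \\domain\mailslot\name */
    default:                return -1;
    }
    if (where == NULL || name == NULL || strlen(name) == 0) return -1;

    int n = snprintf(buf, len, "\\\\%s\\mailslot\\%s", where, name);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

A server would pass the MS_LOCAL form to CreateMailslot(), since a mailslot is always created locally, while a client that wants to reach every server in its primary domain would pass the MS_PRIMARY_DOMAIN form to CreateFile().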
Thoughts Change Life意念改变生活
9898
Named Pipes (contd)
nMaxInstances
– Number of instances
– PIPE_UNLIMITED_INSTANCES: OS choice based on resources
dwTimeOut
– Default time-out period (in msec) for WaitNamedPipe()
First CreateNamedPipe creates the named pipe
– Closing the handle to the last instance deletes the named pipe
Polling a pipe
– Nondestructive: is there a message waiting for ReadFile?
BOOL PeekNamedPipe (HANDLE hPipe, LPVOID lpvBuffer, DWORD cbBuffer, LPDWORD lpcbRead, LPDWORD lpcbAvail, LPDWORD lpcbMessage)
9999
Named Pipe Client Connections
CreateFile with named pipe name
– \\.\pipe\[path]pipename
– \\servername\pipe\[path]pipename
– First method gives better performance (local server)
Status Functions
– GetNamedPipeHandleState
– SetNamedPipeHandleState
– GetNamedPipeInfo
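Since the local form is the faster of the two, a client may want to check which form a configured pipe name uses before opening it. The following portable C sketch classifies a name by its prefix. The function name is hypothetical, and the comparison is case-sensitive for brevity, which is an assumption of the sketch rather than a statement about how Windows matches pipe names.

```c
#include <assert.h>
#include <string.h>

/* Classifies a pipe name by its prefix.
   Returns  1 for the local form   \\.\pipe\[path]pipename,
            0 for the remote form  \\servername\pipe\[path]pipename,
           -1 if the name matches neither form. */
int pipe_name_is_local(const char *name)
{
    assert(name != NULL);
    if (strncmp(name, "\\\\", 2) != 0) return -1;    /* must start with \\ */

    const char *server = name + 2;
    const char *sep = strchr(server, '\\');           /* end of server part */
    if (sep == NULL || sep == server) return -1;      /* empty server part */

    /* The server part must be followed by \pipe\ and a non-empty pipe name. */
    if (strncmp(sep, "\\pipe\\", 6) != 0 || sep[6] == '\0') return -1;

    return (sep - server == 1 && server[0] == '.') ? 1 : 0;
}
```

For example, `pipe_name_is_local("\\\\.\\pipe\\demo")` reports the local form, while a name that begins with a server name reports the remote form.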
100100
Convenience Functions
BOOL TransactNamedPipe (HANDLE hNamedPipe, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, LPOVERLAPPED lpa)
– Combines a WriteFile, ReadFile sequence into one call
101101
Convenience Functions
BOOL CallNamedPipe (LPCTSTR lpszPipeName, LPVOID lpvWriteBuf, DWORD cbWriteBuf, LPVOID lpvReadBuf, DWORD cbReadBuf, LPDWORD lpcbRead, DWORD dwTimeOut)
– Combines a CreateFile, WriteFile, ReadFile, CloseHandle sequence into one call
– dwTimeOut: NMPWAIT_NOWAIT, NMPWAIT_WAIT_FOREVER, NMPWAIT_USE_DEFAULT_WAIT
102102
Server: eliminate the polling loop
BOOL ConnectNamedPipe (HANDLE hNamedPipe, LPOVERLAPPED lpo)
lpo == NULL
– Call will return as soon as there is a client connection
– Returns FALSE if the client connected between the CreateNamedPipe call and ConnectNamedPipe(); GetLastError() then returns ERROR_PIPE_CONNECTED
Use DisconnectNamedPipe to free the handle for a connection from another client
WaitNamedPipe()
– Client may wait for the server's ConnectNamedPipe()
Client Example Using a Named Pipe
/* Block until an instance of the server pipe is available */
WaitNamedPipe (ServerPipeName, NMPWAIT_WAIT_FOREVER);
hNamedPipe = CreateFile (ServerPipeName, GENERIC_READ | GENERIC_WRITE,
    0, NULL, OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
if (hNamedPipe == INVALID_HANDLE_VALUE) {
    fprintf (stderr, "Failure to locate server.\n"); exit (3);
}
/* Write the request */
WriteFile (hNamedPipe, &Request, MAX_RQRS_LEN, &nWrite, NULL);
/* Read each response and send it to std out */
while (ReadFile (hNamedPipe, ResponseRecord, MAX_RQRS_LEN, &nRead, NULL))
    printf ("%s", ResponseRecord);
CloseHandle (hNamedPipe);
return 0;
Server Example Using a Named Pipe
hNamedPipe = CreateNamedPipe (SERVER_PIPE, PIPE_ACCESS_DUPLEX,
    PIPE_READMODE_MESSAGE | PIPE_TYPE_MESSAGE | PIPE_WAIT,
    1, 0, 0, CS_TIMEOUT, pNPSA);
while (!Done) {
    printf ("Server is awaiting next request.\n");
    /* Both calls return FALSE on failure */
    if (!ConnectNamedPipe (hNamedPipe, NULL)
        || !ReadFile (hNamedPipe, &Request, RQ_SIZE, &nXfer, NULL)) {
        fprintf (stderr, "Connect or Read Named Pipe error\n"); exit (4);
    }
    printf ("Request is: %s\n", RequestRecord);
    /* Send the file, one line at a time, to the client */
    fp = fopen (File, "r");
    while (fgets (ResponseRecord, MAX_RQRS_LEN, fp) != NULL)
        WriteFile (hNamedPipe, &ResponseRecord,
            (strlen (ResponseRecord) + 1) * TSIZE, &nXfer, NULL);
    fclose (fp);
    DisconnectNamedPipe (hNamedPipe);
} /* End of server operation */
Win32 IPC - Mailslots
Mailslots bear some nasty implementation details; they are almost never used
Broadcast mechanism
- One-directional
- Multiple writers/multiple readers (frequently one-to-many communication)
- Message delivery is unreliable
- Can be located over a network domain
- Message lengths are limited (W2K: < 426 bytes)
Operations on the mailslot
- Each reader (server) creates a mailslot with CreateMailslot()
- Write-only client opens the mailslot with CreateFile() and uses WriteFile(); the open will fail if there are no waiting readers
- Client's message can be read by all servers (readers)
Client lookup: \\*\mailslot\mailslotname
- Client will connect to every server in the network domain
Locate a server via mailslot
App client 0 ... App client n (mailslot servers):
    hMS = CreateMailslot ("\\.\mailslot\status", ...);
    ReadFile (hMS, &ServStat, ...);
    /* connect to server */
App server (mailslot client):
    while (...) {
        Sleep (...);
        hMS = CreateFile ("\\*\mailslot\status", ...);
        WriteFile (hMS, &StatInfo, ...);
    }
Message is sent periodically
Creating a mailslot
HANDLE CreateMailslot (LPCTSTR lpszName, DWORD cbMaxMsg, DWORD dwReadTimeout, LPSECURITY_ATTRIBUTES lpsa)
lpszName points to a name of the form
- \\.\mailslot\[path]name
- Name must be unique; the mailslot is created locally
cbMaxMsg is the maximum message size in bytes
dwReadTimeout
- Read operation will wait for this many milliseconds
- 0: immediate return
- MAILSLOT_WAIT_FOREVER: infinite wait
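The parameters above can be tried out with a small reader. The following is a minimal sketch, not the deck's own code: read_timeout_ms() is a hypothetical helper, the mailslot name "status" and the 5-second timeout are assumptions, and the non-Windows typedef and constant are stand-ins that only let the helper compile elsewhere. The Win32 part is built only when _WIN32 is defined.

```c
#include <assert.h>
#include <stdio.h>

#ifdef _WIN32
#include <windows.h>
#else
/* Stand-ins (assumption) so the helper compiles off Windows. */
typedef unsigned int DWORD;
#define MAILSLOT_WAIT_FOREVER ((DWORD)-1)
#endif

/* Hypothetical helper: maps a timeout in seconds to a dwReadTimeout value.
   Negative means "wait forever"; 0 means "return immediately". */
static DWORD read_timeout_ms(int seconds)
{
    if (seconds < 0)
        return MAILSLOT_WAIT_FOREVER;
    return (DWORD)seconds * 1000u;
}

#ifdef _WIN32
int main(void)
{
    char msg[400];
    DWORD nRead = 0;

    /* Create a local mailslot; messages longer than 399 bytes are rejected. */
    HANDLE hMS = CreateMailslotA("\\\\.\\mailslot\\status",
                                 (DWORD)(sizeof msg - 1),
                                 read_timeout_ms(5), NULL);
    if (hMS == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "CreateMailslot failed: %lu\n", GetLastError());
        return 1;
    }

    /* ReadFile blocks for at most dwReadTimeout milliseconds. */
    if (ReadFile(hMS, msg, (DWORD)(sizeof msg - 1), &nRead, NULL)) {
        msg[nRead] = '\0';
        printf("received %lu bytes: %s\n", nRead, msg);
    } else {
        DWORD err = GetLastError();
        if (err == ERROR_SEM_TIMEOUT)
            puts("no message arrived within the timeout");
        else
            fprintf(stderr, "ReadFile failed: %lu\n", err);
    }
    CloseHandle(hMS);
    return 0;
}
#endif
```

Passing a negative value to read_timeout_ms() selects MAILSLOT_WAIT_FOREVER, so the same reader can be switched between polling (0), a bounded wait, and an unbounded wait.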
Opening a mailslot
CreateFile with the following names
- \\.\mailslot\[path]name: retrieves a handle for a local mailslot
- \\host\mailslot\[path]name: retrieves a handle for a mailslot on the specified host
- \\domain\mailslot\[path]name: returns a handle representing all mailslots on machines in the domain
- \\*\mailslot\[path]name: returns a handle representing mailslots on machines in the system's primary domain; max message length 400 bytes
- Client must specify the FILE_SHARE_READ flag
GetMailslotInfo() and SetMailslotInfo() are similar to their named pipe counterparts
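The name forms listed for CreateFile differ only in the scope component, which a small helper can make explicit. This is a sketch under stated assumptions: build_mailslot_name() is a hypothetical helper (not a Win32 function), and the mailslot name "status" and the message text are made up for illustration. The writer in main() is compiled only on Windows.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#include <windows.h>
#endif

/* Hypothetical helper: formats \\scope\mailslot\name, where scope is ".",
   a host name, a domain name, or "*" (primary domain).
   Returns 0 on success, -1 if the buffer is too small. */
static int build_mailslot_name(char *buf, size_t size,
                               const char *scope, const char *name)
{
    int n = snprintf(buf, size, "\\\\%s\\mailslot\\%s", scope, name);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

#ifdef _WIN32
/* Write-only client: sends one message to every mailslot named "status"
   in the primary domain. Delivery is not guaranteed. */
int main(void)
{
    char name[MAX_PATH];
    const char msg[] = "server alive";
    DWORD written = 0;

    if (build_mailslot_name(name, sizeof name, "*", "status") != 0)
        return 1;

    HANDLE h = CreateFileA(name, GENERIC_WRITE, FILE_SHARE_READ, NULL,
                           OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (h == INVALID_HANDLE_VALUE) {
        fprintf(stderr, "CreateFile failed: %lu\n", GetLastError());
        return 2;
    }
    if (!WriteFile(h, msg, (DWORD)sizeof msg, &written, NULL))
        fprintf(stderr, "WriteFile failed: %lu\n", GetLastError());
    CloseHandle(h);
    return 0;
}
#endif
```

Swapping the scope argument from "*" to "." or to a host name changes the message from a domain-wide broadcast to a local or single-host write without touching the rest of the client.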
Thoughts Change Life