The Need to Save State
Many of the fault-tolerant (FT) systems we have discussed need a way to restart processes from previous points in their computation
A checkpoint is just a ‘snapshot’ of a process (or system) at a certain point in time
A checkpointing system provides a way to take these snapshots, and to restart from them
Types of Ckpt Systems
Kernel Level
  OS supports ckpt & recovery
  Transparent to the application and developer
User Level
  Application linked against (user) library
  Library functions perform ckpt and recovery
  Transparent to application
  Limitations (cannot restore PID, PPID, etc.)
Application Level
  Applications coded to ckpt themselves, and to restart from a checkpoint
Comparison of Levels
Kernel & User (System) Level
  Easy to add checkpointing to existing code
  Works with (almost) any program
  General, ‘coarse’ approach
Application Level
  Could require complete re-write, or extensive modifications
  Specific, ‘fine-grained’ solutions
System Level Checkpointing
Libckpt (1994)
  Plank, Beck, Kingsley (UTK), Li (Princeton)
  User level library for UNIX
Libckpt: User Level Checkpoint Library
Goals
Transparent
  Requires minimal modifications to code and re-linking
Low Overhead
  Automatic optimizations to reduce ckpt file size
  Allow user directed checkpointing
Libckpt Overview
Taking the ‘snapshot’
  Suspend the process
  Write process’ memory and registers to a file
Recovery
  Reload executable from original file
  Reconstruct memory and register state from checkpoint file
Libckpt Operation
Application main() is re-named ckpt_target()
Library main() checks if in restore mode (specified using command line option); otherwise reads checkpoint parameters from file
Libckpt Operation (2)
main() sets a timer to interrupt application every n seconds
On signal
  Uses setjmp to record registers, pc, etc.
  Writes the stack and heap segments to file
  Resumes application code
Libckpt Operation (3)
If application started with =recover as command line option
  Application begins, recovering text segments
  Open checkpoint file
  Recover heap from file
  Recover stack from file
  Restores register file (using longjmp)
Checkpoint And Recovery Algorithms
main()
    if (recovery)
        restore stack
        restore heap
        pos = top of stack
        longjmp(pos, 1)      // restore regs.
    else
        run usual code

signal_handler()
    jmp_buf pos
    if (setjmp(pos) == 0)    // saved reg. in known position on stack
        write stack
        write heap
    else                     // process recovered
        return
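
The key trick in the algorithm above is that setjmp returns twice: 0 when the checkpoint is taken, and non-zero when longjmp later resumes at the same point. The sketch below shows only that control flow, inside a single process. It is an assumption-laden analogue, not libckpt's recovery path: the name run_with_rollback is hypothetical, a static variable stands in for the checkpoint file, and a simulated failure stands in for a crash (libckpt restores the stack and heap from the file in a new process before calling longjmp).

```c
#include <setjmp.h>

static jmp_buf ckpt_pos;     /* saved registers and program counter */
static int saved_progress;   /* stands in for the checkpoint file */

/* Count from 0 to total. The first time progress reaches fail_at, a
 * failure is simulated and execution jumps back to the checkpoint.
 * Returns the number of loop steps actually executed. */
int run_with_rollback(int total, int fail_at) {
    /* volatile: these locals must keep their values across longjmp */
    volatile int progress = 0;
    volatile int steps_run = 0;
    volatile int failed_once = 0;

    if (setjmp(ckpt_pos) == 0) {
        /* First return: checkpoint path, save the state. */
        saved_progress = progress;
    } else {
        /* Second return, via longjmp: recovery path, restore the state. */
        progress = saved_progress;
    }

    while (progress < total) {
        if (progress == fail_at && !failed_once) {
            failed_once = 1;
            longjmp(ckpt_pos, 1);   /* "restart" from the checkpoint */
        }
        progress++;
        steps_run++;
    }
    return steps_run;
}
```

With total = 10 and fail_at = 4, the first four steps are repeated after the rollback, so 14 steps run in total. That repeated work is the cost of restarting from an older checkpoint.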
Illustration
[Figure: call stacks at checkpoint and at recovery]
Checkpoint: main() -> user_main() -> fun1() -> fun2() -> signal: save regs on stack, save stack to file, save heap to file, resume
Recovery: main() -> restore(): restore stack, restore heap, take jump
Optimization: Incremental Checkpointing
Observation: between taking two checkpoints, only a portion of the memory has actually been changed
Optimization: save only what has changed since the last ckpt; the rest can be read from previous ckpts
Taking Incremental Ckpts.
After taking a ckpt (and after init.), set protection on all pages to ‘read-only’
Write to page will cause a protection violation
Libckpt library catches that signal, and sets page protection to ‘read-write’; page is marked as dirty
When writing checkpoint file, only write dirty pages
Drawbacks to Incremental Ckpt
Required to keep multiple copies of the checkpoint file
On recovery, will unnecessarily restore old copies of data
Optimization: Asynchronous Checkpointing
Observation: the process must be suspended while the checkpoint file is written
Optimization: a separate thread can write the checkpoint file while the main thread is allowed to continue
Asynchronous Checkpointing
Make a copy of the process space
2nd thread writes the copy to disk
1st thread continues without halting
Asynchronous Checkpointing (2)
Unix fork() provides the necessary behavior
When about to take ckpt, process forks
OS makes a complete copy of the original process’ space
Clone writes ckpt file, then dies
Original continues computing
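
The fork-based scheme can be sketched as follows. This is a minimal illustration, not libckpt's implementation: the names async_checkpoint and finish_checkpoint are hypothetical, and the "process space" is reduced to a single caller-supplied buffer written to a caller-supplied path.

```c
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Fork a clone that writes buf[0..len) to path and then exits.
 * Returns the clone's pid (or -1 if fork failed); the caller
 * continues computing immediately. */
pid_t async_checkpoint(const void *buf, size_t len, const char *path) {
    fflush(NULL);               /* avoid duplicated stdio output in the clone */
    pid_t pid = fork();
    if (pid != 0) return pid;   /* original process, or fork failure */

    /* Clone: it sees memory exactly as it was at the time of fork(). */
    FILE *f = fopen(path, "wb");
    if (f == NULL) _exit(1);
    size_t written = fwrite(buf, 1, len, f);
    int ok = (written == len) && (fclose(f) == 0);
    _exit(ok ? 0 : 1);          /* clone dies once the file is written */
}

/* Wait for the clone; returns 0 if the checkpoint file was written. */
int finish_checkpoint(pid_t pid) {
    int status;
    if (waitpid(pid, &status, 0) != pid) return -1;
    return (WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

The original process may overwrite buf right after async_checkpoint returns; the file still holds the values from the moment of the fork, because the clone has its own copy of the address space.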
Copy-On-Write Checkpointing
Like asynchronous checkpointing, but only copy page if the two versions are about to differ
Some (most?) OSes implement fork() in this manner, so the benefit is automatic
Checkpoint Compression
Use a standard data compression algorithm to shrink the size of the checkpoint file
Only improves overhead if the speed of compression is faster than the speed of disk writes, and compression is significant
“For uniprocessor checkpointing, this is not the case”
Not implemented in libckpt
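
The trade-off can be made concrete with a simple cost model. The model is an assumption for illustration, not taken from the libckpt work: the whole checkpoint is compressed first and then written, throughputs are constant, and ratio means compressed size divided by original size. All function names are hypothetical.

```c
/* Seconds to write size_mb megabytes straight to disk. */
double plain_write_time(double size_mb, double disk_mb_per_s) {
    return size_mb / disk_mb_per_s;
}

/* Seconds to compress the checkpoint and then write the smaller file.
 * ratio = compressed size / original size, with 0 < ratio <= 1. */
double compressed_write_time(double size_mb, double disk_mb_per_s,
                             double compress_mb_per_s, double ratio) {
    return size_mb / compress_mb_per_s
         + (size_mb * ratio) / disk_mb_per_s;
}

/* Returns 1 if compressing first is faster under this model. */
int compression_helps(double size_mb, double disk_mb_per_s,
                      double compress_mb_per_s, double ratio) {
    return compressed_write_time(size_mb, disk_mb_per_s,
                                 compress_mb_per_s, ratio)
         < plain_write_time(size_mb, disk_mb_per_s);
}
```

Under this model compression pays off only when compress_mb_per_s exceeds disk_mb_per_s / (1 - ratio): the compressor has to be faster than the disk, and by a larger margin the less the data shrinks.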
User Directed Checkpointing
As described so far, libckpt is (almost) entirely transparent to the programmer
Compare to application level checkpoint requiring extensive code changes
Is there a middle ground?
Libckpt allows programmers to annotate application code with directives that guide the checkpointing
Memory Exclusion
Certain areas of memory can be excluded from the checkpoint
  Dead memory: will never be read or written
  Clean memory: values have not changed since previous checkpoint
Incremental Ckpt provides clean memory opt. at a coarse level (page size)
Only writing the ‘active’ areas of the stack and heap provides dead memory opt.
User Directed Memory Exclusion
Libckpt provides the app. programmer with two functions
exclude_bytes(ptr, length, usage)
  Specify an area of memory to exclude from future checkpoints
include_bytes(ptr, length)
  Add a previously excluded area of memory to future checkpoints
Clean Memory
If mem is clean
  exclude_bytes(mem, …, CKPT_READONLY)
  Include mem in next checkpoint, but exclude in all subsequent
  Cannot write to mem until after call to include_bytes(mem)
  On recovery, the last saved version of mem is restored
Clean Memory: Example
for (…) {
    A = init_A()
    exclude_bytes(A, …, CKPT_READONLY)
    do_stuff(A)          // assuming A does not change
    include_bytes(A…)
}
Dead Memory
If mem is dead
  exclude_bytes(mem, …, CKPT_DEAD)
  Do not checkpoint mem
  Cannot read mem until after include_bytes(mem)
  mem will not be restored on recovery
Dead Memory: Example
for (…) {
    A = init_A()
    do_stuff(A)
    exclude_bytes(A…DEAD)
    do_other_stuff()     // assumes will not read A
    include_bytes(A)
}
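
To see how exclusion shrinks a checkpoint, here is a toy model of the bookkeeping. It is not libckpt's implementation or API: toy_exclude, toy_include and toy_checkpoint are hypothetical stand-ins, the usage argument is dropped, only dead-style exclusion is modelled (excluded bytes are simply skipped), and the "checkpoint file" is an output buffer.

```c
#include <stddef.h>
#include <stdint.h>

enum { TOY_MAX_REGIONS = 16 };

struct toy_region { const char *start; size_t len; int active; };
static struct toy_region excluded[TOY_MAX_REGIONS];

/* Toy counterpart of exclude_bytes: remember [ptr, ptr+len) as excluded.
 * Returns 0 on success, -1 if the table is full. */
int toy_exclude(const void *ptr, size_t len) {
    for (int i = 0; i < TOY_MAX_REGIONS; i++) {
        if (!excluded[i].active) {
            excluded[i].start = ptr;
            excluded[i].len = len;
            excluded[i].active = 1;
            return 0;
        }
    }
    return -1;
}

/* Toy counterpart of include_bytes: forget a previously excluded area.
 * Returns 0 on success, -1 if no such area was excluded. */
int toy_include(const void *ptr, size_t len) {
    for (int i = 0; i < TOY_MAX_REGIONS; i++) {
        if (excluded[i].active && excluded[i].start == (const char *)ptr
            && excluded[i].len == len) {
            excluded[i].active = 0;
            return 0;
        }
    }
    return -1;
}

/* Is the byte at p inside any excluded area? */
static int is_excluded(const char *p) {
    uintptr_t a = (uintptr_t)p;
    for (int i = 0; i < TOY_MAX_REGIONS; i++) {
        uintptr_t s = (uintptr_t)excluded[i].start;
        if (excluded[i].active && a >= s && a - s < excluded[i].len)
            return 1;
    }
    return 0;
}

/* Copy every non-excluded byte of [base, base+len) into out, which must
 * have room for len bytes. Returns the number of bytes "checkpointed". */
size_t toy_checkpoint(const void *base, size_t len, char *out) {
    const char *p = base;
    size_t n = 0;
    for (size_t i = 0; i < len; i++)
        if (!is_excluded(p + i)) out[n++] = p[i];
    return n;
}
```

Excluding half of a 100-byte region halves the checkpoint. The excluded bytes are not in the file at all, so nothing can bring them back on restart.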
Using Memory Exclusion
There can be a dramatic reduction in the size of the checkpoint file
Must be used very carefully
  Inadvertently excluding a live region from a checkpoint could cause erroneous behavior on restart
Synchronous Checkpointing
At different points in the program’s execution the amount of ‘live’ state varies widely
  The stack might be much smaller (shallower call graph)
  Heap items might have been deallocated
  Regions of memory might be dead or clean
Synchronous Ckpt (2)
If checkpoints are taken at times where there is relatively little live state, the checkpoint file size (and overhead) will be smaller
Allow user to specify where in a program a checkpoint should be taken
  Independent of timers (signals)
Synchronous Ckpt (3)
To avoid checkpointing too frequently, mintime parameter specifies the minimal amount of time between two checkpoints
If checkpoint_here() is called less than mintime seconds after the last checkpoint, the call is ignored
Synchronous Ckpt (4)
To ensure that checkpoints are taken frequently enough to be of use, maxtime parameter specifies the maximum time allowed to elapse between two checkpoints
If maxtime passes, an asynchronous checkpoint is taken
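
The interplay of mintime and maxtime can be modelled in a few lines. This is a sketch of the decision logic only, under assumptions: the struct and function names are hypothetical, time is passed in explicitly as seconds, the clock starts at the value of last, and taking a checkpoint is reduced to updating a counter.

```c
struct ckpt_clock {
    double mintime;   /* minimum seconds between two checkpoints */
    double maxtime;   /* maximum seconds allowed between two checkpoints */
    double last;      /* time of the most recent checkpoint */
    int taken;        /* number of checkpoints taken so far */
};

/* Model of a checkpoint_here() call made at time now.
 * Returns 1 if a checkpoint is taken, 0 if the call is ignored. */
int toy_checkpoint_here(struct ckpt_clock *c, double now) {
    if (now - c->last < c->mintime) return 0;   /* too soon: ignore */
    c->last = now;
    c->taken++;
    return 1;
}

/* Model of the timer check made at time now: once maxtime has passed
 * without a checkpoint, one is forced. Returns 1 if taken. */
int toy_timer_check(struct ckpt_clock *c, double now) {
    if (now - c->last < c->maxtime) return 0;   /* still recent enough */
    c->last = now;
    c->taken++;
    return 1;
}
```

With mintime = 10 and maxtime = 60, calls at 5 s and 15 s are ignored, a call at 12 s takes a checkpoint, and the timer forces one at 72 s if no further call arrives.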