Chapter 19 Recovery and Fault Tolerance Copyright © 2008.

Chapter 19

Recovery and Fault ToleranceCopyright © 2008

Operating Systems, by Dhananjay Dhamdhere Copyright © 2008 19.2Operating Systems, by Dhananjay Dhamdhere 2

Introduction

• Faults, Failures, and Recovery• Byzantine Faults and Agreement Protocols• Recovery• Fault Tolerance Techniques• Resiliency


Faults, Failures, and Recovery

• A fault may damage the state of a system– Error: a part of the system state that is erroneous

• Failure: unexpected behavior or situation


Faults, Failures, and Recovery (continued)

• Recovery: for reliable operation, system is restored to a consistent state, and operation resumed– A recovery is performed when a failure is noticed


Classes of Faults

• Fault model: properties that determine the kinds of errors/failures that might result from a fault

• Classes of faults:– System fault system crash

• Amnesia and partial amnesia faults

• A fail-stop fault brings a system to a halt

– Process fault • Byzantine faults: malicious or arbitrary actions

– Storage fault amnesia faults

– Communication fault nonamnesia faults


Overview of Recovery Techniques

• For non-Byzantine faults, recovery involves restoring system or application to a consistent state– Involves reexecuting some actions


Overview of Recovery Techniques (continued)

• Recovery approaches are classified into:– Backward recovery: resetting state of entity affected by

fault to a prior state and resuming its operation• Involves reexecution of some actions

– Forward recovery: repairing erroneous state of a system so system can continue its operation

• Repair cost depends on the nature of the computation

• May involve a certain amount of reexecution

• Backward recovery is simpler to implement– But, requires a practical method of producing a consistent

state recording of a system


Byzantine Faults and Agreement Protocols

• Byzantine faults have been studied only in the restricted context of agreement between processes– Byzantine generals problem: Attack or retreat

• Agreement protocols used for:– Byzantine agreement problem

– Consensus problem

– Interactive consistency problem

• Impossibility result: a group of three (m) processes containing one (k) faulty cannot reach agreement– If m > 3, agreement is possible if m > 3 k


Recovery

• A recovery scheme consists of two components:– Checkpointing algorithm

• Decides when to take a checkpoint for a process

– Recovery algorithm• Uses checkpoints to roll back processes such that new

process states are mutually consistent


Recovery (continued)

• State of a process cannot be recovered in isolation– Must restore state of computation S’ in which states of all

pairs of processes are mutually consistent

• Goal of recovery algorithm is to decide:– Whether a process Pi should be rolled back

– Identify checkpoint to which Pi should be rolled back

• Asynchronous or synchronous checkpointing


Fault Tolerance Techniques

• Basic principle in fault tolerance is to ensure that a fault either does not cause an error– Or that the error can be removed easily

• Two facets of the tolerance of system faults that follow the fail-stop model:– Fault tolerance for replicated data

– Fault tolerance for distributed data


Logs, Forward Recovery, and Backward Recovery

• A log is a record of actions or activities in a process– Do logs (also called redo logs)

• Used to implement forward recovery

– Undo logs• Used to implement backward recovery

• A write-ahead logging principle is used• A log could be an operation log or a value log

– Example: intentions list (for atomic actions) is a value log used as a redo log

• Value logs provide idempotency


Handling Replicated Data

• Availability of data D provided through replication– At least one copy of D would be accessible from any

node despite anticipated faults in the system

• If D may be modified, it is essential to use rules to ensure correctness of data access and updates:1. Many processes can concurrently read D

2. Only one process can write into D at any time

3. Reading and writing can’t be performed concurrently

4. A process reading D must see the latest value of D


Handling Replicated Data (continued)

• Quorum: number of copies of D that must be accessed to perform a specific operation on D

• Quorum algorithms enforce Rules 1–4 by specifying a read quorum Qr and a write quorum Qw

– 2 x Qw > n and Qr + Qw > n

– Two kinds of locks used on D: read and write locks

– If a system is required to tolerate faults in up to k nodes, we could choose:

• Qr = k + 1

• Qw = n − k

• n > 2 × k


Handling Distributed Data

• Distributed transaction: facility for manipulating files located in different nodes of a distributed system in a mutually consistent manner– Also called a multisite transaction

– Each node contains a transaction manager

– Originating node contains a transaction coordinator• Coordinator implements the all-or-nothing property of

transactions with two-phase commit (2PC) protocol– Depending on responses from participating nodes, decides

whether to commit or abort transaction

Operating Systems, by Dhananjay Dhamdhere Copyright © 2008 19.16

Two-Phase Commit Protocol

• Phase 11. Actions of transaction coordinator:

a. Write a ‘Prepare Ti’ record in the log

b. Set a time-out and send a ‘Prepare Ti’ message to all nodes participating in the transaction

2. Actions of a participating node:a. If it is ready to commit, write updates in stable storage and a

‘Prepared Ti’ record in the log, and send a ‘Prepared Ti’ message to coordinator

b. Otherwise, write an ‘Abandoned Ti’ record in the log and send an ‘Abandoned Ti’ message to coordinator

Operating Systems, by Dhananjay Dhamdhere 16


Two-Phase Commit Protocol

• Phase 21. Actions of transaction coordinator:

a. If it receives a ‘Prepared’ reply from all nodes before time-out occurs, write a ‘Commit Ti’ record in the log and send ‘Commit Ti’ messages to all nodes

b. Otherwise, write an ‘Abort Ti’ record in the log and send ‘Abort Ti’ messages to all nodes

c. Wait for an acknowledgment from each node and write a ‘Complete Ti’ message in the log

2. Actions of a participating node: Depending on the coordinator’s message,a. Write a ‘Commit Ti’ record in the log and perform commit

processing

b. Write an ‘Abort Ti’ record in log and abandon updates of Ti

17Operating Systems, by Dhananjay Dhamdhere


Resiliency

• Resiliency techniques focus on minimizing the cost of reexecution when faults occur– Basis for resiliency: failures in a distributed system are

partial, so some parts of a distributed computation may survive a failure

• For example, distributed transactions use resiliency techniques:– Nested transactions

– Tentative commits


Nested Transaction

• A nested transaction Tik is a part of an atomic transaction Ti

– It commits only if its parent transaction Ti commits

– It is implemented as follows• When Tik completes, it is said to have reached a tentative

commit

• When Ti wishes to commit, it checks whether all nested transactions have reached a tentative commit and can participate in commit processing

– It is implemented using a 2PC protocol



Nested Transaction

• Resiliency is implemented as follows:– If a nested transaction Tik does not respond to a ‘Prepare’

message, the coordinator can retry Tik in the same node or in some other node

• If Tik had reached tentative commit and its node had failed when ‘Prepare’ message was sent

• If the failed node recovers and the coordinator retries Tik in it

• The results of Tik, computed before the failure, can be used



Summary

• Recovery and fault tolerance are two approaches to reliability of a computer system– Generically called recovery

• A third recovery approach is resiliency• A fault causes an error in the state of the system, which

leads to a failure– A fail-stop fault brings the system to a halt

– An amnesia fault makes it lose a part of its state

– A Byzantine fault makes it behave in an unpredictable manner


Summary (continued)

• Recovery from non-Byzantine faults can be performed by using two approaches:– Backward recovery and forward recovery

• Fault tolerance implemented by maintaining logs– E.g., undo or do logs

– Logs used to implement atomic transactions• Two-phase commit protocol (2PC protocol) is used

• Nested transactions are a resiliency technique – Used when transaction involves data in many nodes

Chapter 19 Recovery and Fault Tolerance Copyright © 2008.

Documents

Transcript of Chapter 19 Recovery and Fault Tolerance Copyright © 2008.