Exception Handling

17/8/2015 Topic: Exception Handling

http://users.ece.cmu.edu/~koopman/des_s99/exceptions/

Carnegie Mellon University

18-849b Dependable Embedded Systems, Spring 1999

Author: Charles P. Shelton

Abstract:

Exceptional conditions are events that occur in a system that are not expected or are not part of normal system operation. When the system handles these exceptional conditions improperly, the result can be failures and system crashes. Exception failures are estimated to cause two thirds of system crashes and fifty percent of computer system security vulnerabilities. Exception handling is especially important in embedded and real-time computer systems because software in these systems cannot easily be fixed or replaced, and they must deal with the unpredictability of the real world. Robust exception handling in software can improve software fault tolerance and fault avoidance, but no structured techniques exist for implementing dependable exception handling. However, many exceptional conditions can be anticipated when the system is designed, and protection against these conditions can be incorporated into the system. Traditional software engineering techniques such as code walkthroughs and software testing can illuminate more exceptional conditions to be caught, such as bad input for functions and memory and data errors. However, it is impossible to cover all exceptional cases. It is also difficult to design a dependable system that can tolerate truly unexpected conditions. In these cases, some form of graceful degradation is necessary to safely bring down the system without causing major hazards.

Contents:

Introduction
Key Concepts
    Programmed Exception Handling
    Default Exception Handling
    Real-Time System Constraints
Available tools, techniques, and metrics
    Dependability Cases
    Xept
    Ballista
Relationship to other topics
Conclusions
Annotated Reference List

Introduction

Exception handling is the method of building a system to detect and recover from exceptional conditions. Exceptional conditions are any unexpected occurrences that are not accounted for in a system's normal operation. It is difficult to protect a system from the effects of exceptional conditions because, by nature, not all unusual occurrences can be anticipated when the system is designed. Some examples of exceptional conditions are incorrect inputs from the user, bit-level memory or data corruption, software design defects that cause a system to enter an undefined state, and environmental anomalies. If these exceptional conditions are not properly caught and handled, they can cause an error or failure in the system. Failures due to exceptions are estimated to account for two thirds of system crashes and fifty percent of system security vulnerabilities [Maxion98].

Exception handling is different from fault tolerance. Fault tolerance focuses on keeping known error states from causing system failures. Exception handling deals with the undefined and unanticipated conditions that, if left unchecked, can propagate through the system and cause a fault. Exception handling is more like fault avoidance or fault containment. I submit that exception handling is more difficult than fault tolerance because it must deal with all the unpredictabilities of the system.

When designing an embedded system, exception handling is usually focused on software. In fact, more than two thirds of code written for systems is devoted to properly detecting and handling exceptions. However, most software testing efforts focus on exercising the correct operation of code, not on determining how robust it is to exceptional conditions [Cristian80]. Therefore, exception handling code is the least tested and most susceptible to bugs.

Exception handling should also not be ignored in system components other than software. Hardware and user interface components should have some built-in protection from exceptional conditions in addition to system-level protection. This was one of the problems with the Therac-25 medical device. The Therac-20 had hardware interlocks to prevent lethal doses of radiation, and these interlocks were removed in the Therac-25. Thus, unknown software defects that were effectively neutralized in the Therac-20 were exposed in the Therac-25 and caused several deaths (both machines used the same basic software). This illustrates the need for system-level as well as component-level exception handling mechanisms.

Unfortunately, no well-defined techniques exist for building robust exception handling into a system. Most methods are ad hoc and limited to what the design team can anticipate the system will encounter. Luckily, many of the most common problems can easily be avoided as long as code is written to check for them. Many exception failures in commercial libraries are linked to missing simple checks, such as verifying that a pointer is not null before dereferencing it, or that a file is open before attempting to read from or write to it. Good software engineering practices such as code reviews, code walkthroughs, and thorough testing can illuminate many of these exceptional conditions, but are limited to the software of the system. It is also difficult to model the complex interactions of system components at the design phase to determine where other problems lie.
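The two simple checks mentioned above can be sketched in a few lines. This is only an illustration in Python, where `None` stands in for a null pointer; the helper name `read_record` and its error messages are assumptions made for this example and do not come from any particular library.

```python
import io


def read_record(stream, size):
    """Read `size` bytes from `stream`, rejecting exceptional inputs first.

    `read_record` is a hypothetical helper used only for illustration.
    """
    # Analogue of a null-pointer check: refuse a missing stream object.
    if stream is None:
        raise ValueError("stream must not be None")
    # Check that the file is still open before attempting to read from it.
    if stream.closed:
        raise ValueError("stream is closed")
    # Reject a nonsensical length instead of passing it on to the read call.
    if not isinstance(size, int) or size < 0:
        raise ValueError("size must be a non-negative integer")
    return stream.read(size)
```

Each check turns an undefined failure deep inside a library call into a well-defined error at the module boundary, where the caller can handle it.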

It is unrealistic to build a system that is completely bulletproof to exceptional conditions because we cannot anticipate all possible situations. Therefore it is necessary to build in default exception handlers that will attempt to recover from any of these unanticipated conditions. If the application is somewhat safety critical or has real-time deadlines, some form of graceful degradation must be put in place to reduce the harm or damage done by any system failures.

Key Concepts

Exception handling techniques can be separated into two broad categories: programmed exception handling and default exception handling. In some cases programmed exception handling is capable of doing forward error recovery, but both programmed and default exception handling methods can perform backward error recovery. Forward error recovery can mask any exceptional occurrences and continue normal operation. Backward error recovery must halt normal system execution and attempt to return to a previous normal state to continue execution and retry the operation. Checkpointing and recovery is a technique of backward error recovery for tolerating transient or intermittent conditions.
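Backward error recovery with checkpointing can be illustrated with a short sketch. The names `TransientError` and `run_with_checkpoint` are hypothetical, the state is assumed to be a plain dictionary, and the retry limit is an arbitrary choice; a real system would checkpoint to storage that survives the failure.

```python
import copy


class TransientError(Exception):
    """Stands in for a transient or intermittent exceptional condition."""


def run_with_checkpoint(state, operation, max_retries=3):
    """Apply `operation` to the dict `state`, rolling back and retrying on failure.

    All names here are hypothetical; this only illustrates backward recovery.
    """
    checkpoint = copy.deepcopy(state)      # save a known-good state
    for _ in range(max_retries + 1):
        try:
            operation(state)               # may partially modify `state`
            return state                   # success: normal operation continues
        except TransientError:
            # Backward recovery: discard partial updates, restore the checkpoint.
            state.clear()
            state.update(copy.deepcopy(checkpoint))
    raise RuntimeError("operation still failing after retries")
```

If the condition really is transient, a retry from the restored state succeeds and the caller never sees the failure. If it persists, the sketch gives up with the state restored to the checkpoint rather than left half-updated.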

Programmed Exception Handling

Programmed exception handling modules are mechanisms built into software for specific exceptional cases that are known to be likely to occur. Since these occurrences are relatively well understood, protection for them can be incorporated into the system. When a program is executing, if one of the exceptional conditions is detected, control is passed from the main process block to the special exception handling block. This code will deviate from normal execution to compensate for the exceptional condition and will attempt to mask it to prevent propagating an error condition to higher levels in the software hierarchy.

If the handler cannot recover from the condition, it may call checkpointing recovery code to return the system to a known state from before the exception occurred and retry the operation.
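A programmed handler that masks an anticipated condition might look like the following sketch. The sensor interface, the `SensorFault` exception, and the plausible temperature range are all assumptions chosen for illustration.

```python
class SensorFault(Exception):
    """Anticipated exceptional condition: the sensor returned no usable value."""


def read_temperature(sensor, last_good):
    """Return a temperature, masking an anticipated sensor fault.

    `sensor` is any callable returning a float and `last_good` is the most
    recent valid reading. Both are assumptions made for this sketch.
    """
    try:
        value = sensor()                      # main process block
        if not -40.0 <= value <= 125.0:       # assumed plausible range
            raise SensorFault(f"out of range: {value}")
        return value
    except SensorFault:
        # Exception handling block: forward recovery by substituting the last
        # good reading, so callers higher up never see the fault.
        return last_good
```

Only the anticipated `SensorFault` is masked here. Any other exception still propagates, which is where a default handler takes over.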

Default Exception Handling

For all the exceptional conditions that are not anticipated by the system designers, default exception handlers must be built. The default handlers may be within the programming language or operating environment itself, transparent to the application developer. They must be a catch-all for any unexpected exceptions, and must also be responsible for containing exceptions due to design defects.

Exceptional conditions due to design defects are especially dangerous because they will always be present. If all design defects in a system were known a priori, they would have been eliminated before the system was built. Since we have not yet learned how to design perfect systems, it is important that exception handlers reduce the impact of design defects as much as possible.

In most cases, default exception handlers cannot do much to continue system operation. In the best cases they can use the checkpointing and recovery system to mask transient errors, but for truly exceptional conditions that cause error states, the best that can be hoped for is a graceful program termination.
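A catch-all default handler that aims for graceful termination can be sketched as follows. The callables `task`, `enter_safe_state`, and `log` are hypothetical hooks that an application would supply; what counts as a safe state depends entirely on the system.

```python
def run_with_default_handler(task, enter_safe_state, log):
    """Run `task`; on any unanticipated exception, degrade gracefully.

    `task`, `enter_safe_state`, and `log` are hypothetical callables supplied
    by the application; nothing here comes from the original text.
    """
    try:
        task()
        return True                       # normal completion
    except Exception as exc:              # catch-all for unanticipated conditions
        log(f"unhandled {type(exc).__name__}: {exc}")
        enter_safe_state()                # e.g. switch outputs off before stopping
        return False                      # caller terminates in an orderly way
```

The handler does not try to continue the failed operation. It records what happened, moves the system to a safe state, and reports the failure so that shutdown is orderly.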

Robust operation calls for as much exception handling as possible. However, exception handling overhead may be too great for real-time systems and can make timing and scheduling difficult.

Real-Time System Constraints

In real-time systems, timing and meeting deadlines are the first priority, especially for safety critical systems. However, if exceptional conditions occur, there must be some detection and recovery mechanisms in place to prevent error propagation. The extent and complexity of the exception handling mechanisms make it difficult to calculate and meet timing constraints [Colnaric93]. Either the scheduling will have to assume the worst case, making performance worst-case, or exception handling will have to be sacrificed. This is a tradeoff between getting results on time and getting correct results. Some research is being done in constructing models that use object-oriented techniques to account for both real-time constraints and exception handling mechanisms, so that they can be more easily and compatibly designed [Romanovsky98].
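The cost of worst-case scheduling can be made concrete with a small calculation. The simple additive model and all numbers below are assumptions chosen for illustration; they are not taken from [Colnaric93] or [Romanovsky98].

```python
def worst_case_time(normal_ms, handler_ms, max_exceptions):
    """Worst-case execution time if every budgeted exception actually occurs."""
    return normal_ms + handler_ms * max_exceptions


def meets_deadline(normal_ms, handler_ms, max_exceptions, deadline_ms):
    """True if the task meets its deadline even on the worst-case path."""
    return worst_case_time(normal_ms, handler_ms, max_exceptions) <= deadline_ms


# Illustrative numbers only: a 4 ms task with a 3 ms handler and a 10 ms deadline.
print(meets_deadline(4, 3, 0, 10))   # no handling budgeted
print(meets_deadline(4, 3, 2, 10))   # two handler invocations budgeted
print(meets_deadline(4, 3, 3, 10))   # three handler invocations budgeted
```

With these numbers the task fits its deadline with up to two handler invocations (4 + 2 × 3 = 10 ms) but not three (13 ms). The scheduler must therefore reserve 10 ms for a task that normally takes 4 ms, or budget for fewer exceptions.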

Available tools, techniques, and metrics

As discussed above, there are no mature methods for generating robust exception handlers or ensuring that all exceptions have been accounted for, but there is research being done in these areas. One technique extends traditional software engineering practices with dependability cases for generating exceptional conditions. Another technique, called Xept, provides an instrumentation language for structured generation of wrappers for exceptional inputs to software library modules. A further problem is that there are no accepted ways of measuring how robust a system is to exceptional conditions. The Ballista project has developed a methodology for automatically testing and comparing the relative robustness of software modules.

Dependability Cases

It is hypothesized that exceptional conditions are not guarded in software because designers do not think of them. Dependability cases aim to provide a general framework and methodology for generating scenarios of exceptional conditions so the system designer can build exception handlers for them into the system. This technique, when used in conjunction with good software engineering processes, is supposed to improve software robustness. Hazard analysis techniques such as fault trees and fishbone diagrams are used to aid the designer in anticipating exceptional conditions. Using dependability cases, a taxonomy of exceptional conditions can be developed. For example, [Maxion98] describes the CHILDREN mnemonic for exceptions:

Computational problem
Hardware problem
I/O and file problems
Library function problem
Data input problem
Return-value problem: function or procedure call
External user/client problem
Null pointer and memory problems

However, since it is impossible to anticipate and cover all exceptional conditions, it is unclear how much of an improvement dependability cases can make in the system's software. Whatever taxonomy of exceptional conditions we develop, it may exclude a key class of exceptions, leaving the system vulnerable. Even so, this more structured approach is better than ad hoc methods.

Xept

Xept is a method of generating wrappers for software modules. Using an instrumentation language, you can generate code to check for exceptional inputs before passing parameters to library functions [Vo97]. This is particularly useful for Commercial Off-The-Shelf (COTS) software where source code may not be available and the programmer only has access to the module interface. Many COTS software modules are not as robust as they could be, and extra protection must be built into the system if you use these components in your software. Xept provides a structured method of instrumenting application code to mask and handle exceptions in library code. However, in order to generate these exception handlers, the conditions to be protected against must already be known. Xept does not detect exceptional conditions; it only provides a way of correcting for them.
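The wrapper idea can be illustrated with a short sketch. This is not Xept's instrumentation language, which is described in [Vo97]; it is a plain Python decorator, and the names `guarded`, `fragile_length`, and `safe_length` are hypothetical.

```python
import functools


def guarded(check, on_reject):
    """Build a wrapper that validates arguments before calling a library function.

    `check(*args)` returns True for acceptable inputs; `on_reject(*args)`
    supplies the value returned instead of calling the library. Both are
    hypothetical hooks, not part of Xept.
    """
    def decorate(library_fn):
        @functools.wraps(library_fn)
        def wrapper(*args):
            if not check(*args):
                return on_reject(*args)    # mask the exceptional input
            return library_fn(*args)       # inputs look safe: call through
        return wrapper
    return decorate


def fragile_length(text):
    """Stand-in for a library routine that fails on a None argument."""
    return len(text)


# Wrap the routine without touching its source, as one would for COTS code.
safe_length = guarded(lambda text: text is not None, lambda text: 0)(fragile_length)
```

As with Xept, the wrapper only protects against conditions the developer already knows about. Here that is a `None` argument; any other bad input still reaches the library routine.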

Ballista

The Ballista software testing methodology focuses on passing exceptional inputs at the module level and recording the results. Ballista is completely automated and can demonstrate repeatable, atomic responses to exceptional conditions from unexpected parameters. It is scalable because testing is based on the parameters passed to the function, not the function's operation [Kropp98]. Therefore, once test cases for data types are developed, any function that uses those data types can be tested. This is ideal for testing COTS software and making comparisons between different implementations of the same application programming interface (API). Since Ballista focuses on repeatable results, it is only useful for component testing and cannot detect exceptional conditions due to complex interactions between system components. Also, when testing modules, the tester must come up with the exceptional inputs for the data types to be tested. However, as the system grows, a database of exceptional values is being kept and can be reused for the same data type.
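The data-type-based approach can be illustrated with a toy harness. This is not the Ballista tool itself; the test-value table, the function `robustness_test`, and the module under test `char_at` are all assumptions made for this sketch.

```python
import itertools

# Exceptional test values per parameter data type (a tiny, assumed database).
TEST_VALUES = {
    "int": [0, -1, 2**31 - 1],
    "str": ["", "abc", None],
}


def robustness_test(func, param_types):
    """Call `func` with every combination of test values for its parameter types.

    Returns a list of (args, outcome) pairs, where outcome is "ok" or the name
    of the exception raised.
    """
    results = []
    pools = [TEST_VALUES[t] for t in param_types]
    for args in itertools.product(*pools):
        try:
            func(*args)
            outcome = "ok"
        except Exception as exc:
            outcome = type(exc).__name__
        results.append((args, outcome))
    return results


def char_at(text, index):
    """Hypothetical module under test."""
    return text[index]
```

Calling `robustness_test(char_at, ["str", "int"])` exercises all nine combinations of the string and integer values. Because the test values are keyed by data type, the same table can be reused for any other function that takes a string or an integer, which is the property that makes this style of testing scale.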

Relationship to other topics

Exception handling is a method of achieving system robustness, and is also related to fault tolerance and error recovery.

Robustness - Exception handling is a technique for designing a robust system. Robustness is defined as the degree to which a system can function in the presence of invalid inputs or stressful environmental conditions. These are exceptional conditions.

Software Testing - Testing is currently the only metric we have for measuring how well a system can handle exceptional conditions. It is also used to uncover any cases previously unanticipated. Unfortunately, the problem of completely testing any system for all possible occurrences is intractable.

Fault Tolerant Computing - Fault tolerant computing is similar to robustness and exception handling, but deals with controlling and containing system or component errors after they have occurred. Exception handling attempts to keep unanticipated conditions from causing faults.

Software Fault Tolerance - Fault tolerance in software is especially important since software is quickly becoming the most complex and integral part of any embedded system. Software exception handling can improve software fault tolerance by preventing exceptional conditions from becoming software faults.

Checkpoint/Recovery - Checkpoint/Recovery is a method that can recover from some transient and intermittent failures and can mask exceptional occurrences.

Security - Many security vulnerabilities are caused by not properly containing exceptional conditions. For example, many security holes are caused by race conditions and by failing to detect a memory buffer overflow. These vulnerabilities can be exploited to gain access to and tamper with restricted systems.

Human Interface/Human Error - Since input from a human user is one of the most likely places that exceptional and invalid inputs can be generated in an embedded system, the user interface should be able to prevent the operator from causing a fault condition. The interface should constrain the user to entering only valid inputs into the system.

Conclusions

The following ideas are the important ones to take away from reading about this topic:

Exception handling differs from fault tolerance, but they are related. Fault tolerance deals with correcting for known error conditions. Exception handling can be seen as fault avoidance or fault containment. Unexpected conditions must be masked before they can cause a fault in the system.

It is not possible to cover every exception within a closed system. There are unanticipated situations that the system cannot compensate for.

Where you draw the system boundary determines the level of exception handling you can do. For example, if you only look at the software, environmental exceptional conditions cannot be sufficiently handled. If a human operator is part of the system, there may be more exceptions that can be covered, but with less certainty.

Coverage is a major problem. It is unrealistic to cover all exceptional conditions because they are not predictable.

It is difficult to develop strategies to safely handle exceptions for unanticipated situations. Most methods are ad hoc and based on previous experience.

In real-time systems, there is a tension between developing robust exception handlers for safety and correctness, and meeting timing constraints.

Annotated Reference List

[Colnaric93] Colnaric, Matjaz; Halang, Wolfgang A., "Exception Handling and Predictability in Hard Real-Time Systems." SAFECOMP 93. 12th International Conference on Computer Safety, Reliability and Security, October 1993, p. 371-378.

This paper discusses the concerns of implementing exception handling and accounting for unpredictability in the face of the timing constraints in hard real-time systems.

[Cristian80] Cristian, Flaviu, "Exception Handling And Software-Fault Tolerance." 10th International Symposium on Fault-Tolerant Computing, October 1980, p. 97-103.

Basic concepts in software exception handling and mathematical definitions.

[Kropp98] Kropp, Nathan P.; Koopman, Philip J.; Siewiorek, Daniel P., "Automated robustness testing of off-the-shelf software components." Twenty-Eighth Annual International Symposium on Fault-Tolerant Computing, June 1998, p. 230-239.

Motivation, methodology, and results of applying the Ballista software testing technology to POSIX operating systems.

[Maxion98] Maxion, Roy A.; Olszewski, Robert T., "Improving Software Robustness With Dependability Cases." Twenty-Eighth Annual International Symposium on Fault-Tolerant Computing, June 1998, p. 346-355.

Introduces the technique of dependability cases and how it can help improve exception handling.

[Romanovsky98] Romanovsky, Alexander; Xu, Jie; Randell, Brian, "Exception Handling in Object-Oriented Real-Time Distributed Systems." First International Symposium on Object-Oriented Real-Time Distributed Computing (ISORC '98), April 1998, p. 32-42.

Research into using object-oriented programming techniques to build structured exception handling into real-time systems.

[Vo97] Vo, Kiem-Phong; Wang, Yi-Min; Chung, P. Emerald; Huang, Yennun, "Xept: A Software Instrumentation Method For Exception Handling." Eighth International Symposium on Software Reliability Engineering, November 1997, p. 60-69.

Information about Xept, the motivation, methodology, and the instrumentation language developed.
