
A Methodology for Automated Insertion of Concurrent Error Detection Hardware in Synthesizable Verilog RTL

Kartik Mohanram, C.V. Krishna, and Nur A. Touba
University of Texas at Austin
Austin, TX 78712-1084

ABSTRACT

This paper describes a methodology for automated insertion of concurrent error detection (CED) circuitry in synthesizable Verilog RTL that allows easy tradeoff between overhead and coverage. The insertion is done at the front-end of the synthesis process, which is highly advantageous. Computer-aided design (CAD) tools that support the proposed methodology have been implemented and are described. The CAD tools allow the designer to select from different CED schemes that provide various cost/coverage tradeoffs. The CED circuitry is then automatically inserted into the Verilog design. Experimental results are shown which demonstrate the broad range of design points that the proposed methodology offers the designer to choose from to satisfy cost/coverage requirements. The methodology can be seamlessly integrated into the typical design flow used in industry.

1. INTRODUCTION

The move towards deep-submicron VLSI technologies with higher frequencies, lower voltage levels, and smaller noise margins is increasing the susceptibility of systems to transient and intermittent faults resulting in soft errors. Early detection of errors is crucial for preserving the state of the system and maintaining data integrity. Circuit-level techniques for concurrent error detection (CED) permit early detection and containment of errors before they can propagate to other parts of the system and corrupt data. This reduces the complexity and increases the effectiveness of system-level fault tolerance features.

The general approach for performing CED with a systematic error detecting code is illustrated in Fig. 1. The function logic is augmented with a check symbol generator. The function logic has n output bits, and the check symbol generator provides k check bits. Together, they form an (n,k) systematic code. There is a checker which monitors the encoded outputs and gives an error indication if a non-codeword occurs. The checker can be designed so that it is self-checking, which means that it is guaranteed to give an error indication if a fault occurs in the checker itself.

Efficient schemes have been developed for CED in circuits with regular structures, e.g., adders [Lo 92], [Gorshe 96], multipliers [Pradhan 86], PLAs [Mak 82], [Nicolaidis 89], etc. However, the problem is much more difficult for control logic, which typically has an irregular multilevel structure [Touba 97].

As process technology scales below 100 nanometers, current studies indicate that circuits will become increasingly sensitive to noise, particularly to atmospheric neutrons and alpha particles. It is projected that this will result in unacceptable soft error rates even in mainstream commercial electronics.
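To make the general approach concrete, the following minimal sketch models the three blocks of the scheme (function logic, check symbol generator, checker) for the simplest case of a single parity check bit (k = 1). The toy function logic, the function names, and the use of even parity are assumptions made for illustration only; they are not taken from the tools described in this paper.

```python
def parity(bits):
    """Return the XOR of all bits in the sequence (0 or 1)."""
    p = 0
    for b in bits:
        p ^= b
    return p


def function_logic(x):
    """Hypothetical 3-bit function logic: bitwise complement of the inputs."""
    return [1 - b for b in x]


def check_symbol_generator(x):
    """Predict the output parity directly from the inputs.

    Complementing three bits flips the parity, so the predicted
    check bit is the input parity XOR 1.
    """
    return parity(x) ^ 1


def checker(outputs, check_bit):
    """Return True (error indication) if outputs plus check bit form a non-codeword."""
    return parity(outputs) != check_bit


x = [1, 0, 1]
outputs = function_logic(x)
check_bit = check_symbol_generator(x)
fault_free_error = checker(outputs, check_bit)

faulty_outputs = outputs.copy()
faulty_outputs[0] ^= 1  # model a single-bit error on one output
single_bit_error = checker(faulty_outputs, check_bit)
```

In this model the checker stays silent for the fault-free outputs and raises the error indication when one output bit is flipped.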

[Figure 1: block diagram with blocks labeled Function Logic, Check Symbol Generator, and Checker, and signals labeled outputs and error detection]

Figure 1. General Approach for Internal Error Detection

Previous research in CED has focused on mission critical systems where very high levels of dependability are required. In these applications, the main goal is to provide high levels of coverage, while the cost of the error detection circuitry is of less importance. As CED becomes increasingly important in more mainstream commercial electronics, there is a need for more cost-effective techniques. The two main sources of cost in incorporating error detection circuitry are:

1. Design time - The time required to properly insert the error detection circuitry in a design and to verify that it works correctly and meets coverage requirements.
2. Overhead - The error detection circuitry requires additional silicon area and can impact the performance, power consumption, and other design criteria.

There is a need for computer-aided design (CAD) tools and methodologies to automate the process of inserting error detection circuitry and to allow easy tradeoffs between overhead and coverage.

In this paper, we describe a methodology for automated insertion of CED in synthesizable Verilog RTL that allows easy tradeoff between overhead and coverage. This methodology was developed as part of a joint project between researchers at the University of Texas-Austin and Hewlett Packard.

A common design flow that is used at Hewlett Packard and elsewhere is for a designer to describe a design in synthesizable Verilog RTL and then use a synthesis tool to generate an implementation. An advantage of having a Verilog RTL description of the design is that it can be used for fast RTL simulation, and it provides high-level, easy to understand documentation of the design's function. We insert the CED circuitry at the RTL level rather than at the gate level because insertion at the front-end of the synthesis process has several advantages. It allows the error detection circuitry to be optimized along with the functional circuitry. The synthesis tool can take the error detection circuitry into account when satisfying timing constraints (as well as other constraints on power, testability, etc.). This avoids having to reiterate the synthesis process because back-end insertion of error detection circuitry caused timing or other constraints to be violated. The approach of inserting the error detection circuitry at the RTL level can be easily and seamlessly incorporated into the standard design flow.

0-7803-7448-7/02/$17.00 ©2002 IEEE



We have developed CAD tools to support this methodology and have evaluated it on existing Verilog designs at Hewlett Packard. This paper describes the methodology and discusses the results that have been obtained.

2. OVERVIEW OF METHODOLOGY

A flowchart for the methodology is shown in Fig. 2. Given a synthesizable Verilog description, the Verilog is parsed and the CED circuitry is inserted in the design at the appropriate locations. The CED scheme to be used is selected by the designer. Different schemes can be used depending on the coverage/cost requirements. After the CED circuitry has been inserted in the design, the resulting Verilog description can then be passed to a commercial synthesis tool. The synthesis tool generates a gate-level implementation. Fault simulation is then performed on the gate-level implementation to evaluate the coverage that is provided by the CED circuitry. Other design criteria such as area, delay, power, etc., for the implementation can also be evaluated. If the coverage that is provided by the CED circuitry meets requirements and the other design criteria are also satisfactory, then the process is complete. However, if the coverage or cost is not satisfactory, then the designer can repeat the process with a different CED scheme. Thus, this methodology supports rapid evaluation of the design space to converge on the best implementation.
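The iteration described above can be sketched as a simple loop. This is only an illustration: the function `choose_scheme`, the `evaluate` callback (standing in for CED insertion, synthesis, and fault simulation), and all numbers in `ILLUSTRATIVE_RESULTS` are hypothetical and are not measurements from this paper.

```python
def choose_scheme(schemes, evaluate, required_coverage, max_area):
    """Try schemes in order and return the first one meeting both requirements.

    `evaluate(scheme)` stands in for: insert CED, synthesize, fault simulate.
    Returns (scheme, result) or None if no scheme is satisfactory.
    """
    for scheme in schemes:
        result = evaluate(scheme)
        if result["coverage"] >= required_coverage and result["area"] <= max_area:
            return scheme, result
    return None


# Made-up values purely for illustration (normalized area, percent coverage).
ILLUSTRATIVE_RESULTS = {
    "parity": {"area": 120, "coverage": 75.0},
    "one-hot": {"area": 125, "coverage": 78.0},
    "hybrid": {"area": 190, "coverage": 93.0},
    "duplication": {"area": 220, "coverage": 100.0},
}

choice = choose_scheme(
    ["parity", "one-hot", "hybrid", "duplication"],
    ILLUSTRATIVE_RESULTS.__getitem__,
    required_coverage=90.0,
    max_area=200,
)
```

Ordering the candidate schemes from cheapest to most expensive means the loop stops at the lowest-overhead scheme that satisfies the coverage requirement.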

[Figure 2: flowchart with boxes labeled Insert CED Circuitry; Synthesize Circuit; Estimate Operating Parameters and Failure Rates; Required Coverage; Estimate Coverage, Area, Delay, Power; Coverage vs Cost Analysis]

Figure 2. Flowchart for Methodology

We have developed two CAD tools to support this methodology. The first is a CED insertion tool. This tool takes as an input a synthesizable Verilog description, and automatically inserts the CED circuitry based on the options specified by the designer. The tool outputs a modified synthesizable Verilog description with the appropriate CED circuitry in place. The modified Verilog description can then be used for RTL simulation and passed on to the synthesis tool. The second CAD tool is a coverage evaluation tool. This tool takes as an input the gate-level implementation after synthesis and evaluates the coverage that is provided by the CED circuitry. The techniques used in the CED insertion tool and coverage evaluation tool are described in Sections 3 and 4.

3. CED INSERTION TOOL

The goal of the CED insertion tool is to provide the designer with various options in terms of which CED schemes to use, and then automatically insert the error detection circuitry into an existing synthesizable Verilog design. Synthesis is generally not used for generating an implementation for the data path portion of a design. The data path logic has a regular structure and is generally constructed from highly optimized components. For these components which have a regular structure, very efficient CED schemes exist (e.g., self-checking adder or multiplier designs [Lo 92], [Gorshe 96]). Synthesis is generally used for obtaining an implementation of the control logic portion of a design, which has an irregular multilevel structure. The CED insertion tool described here is targeted towards the control logic portion of a design.

3.1 CED Schemes Supported

The control logic consists of finite state machines (FSMs) which control the data path. The CED insertion tool must insert error detection circuitry to detect errors in these FSMs. An FSM consists of next-state logic and output logic. Our CED insertion tool provides the designer with four different options in terms of the CED scheme that is used for each FSM:

1. Duplication - In this scheme, the FSM is simply duplicated and the outputs are compared with an equality checker. The area overhead incurred is greater than 100%. If the FSM has a lot of outputs, the equality checker can become very large.

2. One-Hot - In this scheme, a one-hot state assignment is used for the FSM (i.e., each state is encoded such that only one bit is a 1 while the rest are all 0). A one-hot checker is used to check the present state and a parity checker may be used to check the outputs. One limitation of applying this scheme to an existing design is that the states need to be re-encoded. If the design uses special status bits or decodes a subset of the bits in the state register for some operations, then this scheme cannot be applied.

3. Parity - In this scheme, a parity bit is appended to the state encoding. Unlike the one-hot scheme, the states do not need to be re-encoded. Another advantage is that if parity prediction is used on the outputs, then a single parity checker can be used to check both the present state and output parity. This is done by generating the parity check bit so that it is the XOR of both the present state parity and output parity. The advantage of the parity scheme is that it requires less overhead than the duplication scheme, but this comes at the cost of losing coverage of errors that affect an even number of bits. However, in some designs, this loss of coverage is not substantial.

4. Hybrid Parity - In this scheme, a parity bit is added to the state encoding; however, the output logic is duplicated to provide higher error coverage. This approach is a hybrid between duplication and parity. It has less overhead than using full duplication, but better coverage than parity.

Each of the CED schemes has advantages and disadvantages. The designer can choose the scheme that is best suited for each FSM in the design. For FSMs with a small number of states, the one-hot scheme may provide better coverage at lower cost than parity. For larger FSMs, parity may be the most cost effective. If the coverage that is provided by parity is not sufficient, then hybrid parity or duplication can be used.

3.2 Verilog Modification

The CED insertion tool automatically modifies an existing synthesizable Verilog design to insert the CED circuitry. This is done by having the user identify each FSM and specify the CED scheme that is to be used for that FSM. The FSMs are identified by specifying the module name and the present state and next state variables. The tool then parses the module, modifies the statements containing the present state and next state variables accordingly, and adds the necessary checkers.

For the duplication scheme, the tool simply duplicates the entire FSM and adds an equality checker. For both the parity and one-hot schemes, the Verilog source has to be modified in places where the present and next state variables occur. This is explained in greater detail with respect to the example presented in Fig. 3:

1. Register Declarations - The register declarations of the present and next state variables are modified by increasing the width of the register to reflect a parity or one-hot encoded state. In the example in Fig. 3, the parity scheme requires that the present and next state registers change from 2 to 3 bit-wide registers. The parity bit is always the most significant bit, and the value is chosen to reflect an odd parity under normal operation. In a similar manner, the one-hot scheme would require that the registers be modified to be 4 bit-wide registers. Obviously, modifying the register width for the one-hot scheme can prove to be wasteful if all the states are not used in the FSM. Our CED insertion tool provides the user with an option to specify the exact number of states that are used in the FSM so that the present and next state registers are expanded to the optimum width when the one-hot encoding scheme is inserted.

2. Equality Operator - The present state variable can also occur in conditional statements, for example, when the equality operator is used to check that the FSM is in a certain state. Here, the state encoding provided in the original design has to be appropriately modified to reflect the parity or one-hot encoding scheme. In the example, 2'b10 is modified to 3'b010, and the parity bit is made the most significant bit. Similarly, 2'b10 would be expanded to 4'b0100 if the one-hot encoding scheme were chosen.

3. Blocking Assignments - The next state variable can occur in blocking assignments when the FSM changes state depending on the combination of inputs and the present state. The changes to the Verilog source are similar to the ones described above for conditional statements.

Original design:

    reg [1:0] PS;
    reg [1:0] NS;
    assign sigX = ( sigY & ( PS == 2'b10 ) );
    always @ ( PS or reset or ... ) begin
      if ( reset ) NS = 2'b00;
      else begin
        case ( PS )
          2'b00: NS = 2'b01;
          ...
          default: NS = 2'b11;
        endcase
      end
    end

Design after insertion of the parity scheme:

    reg [2:0] PS;
    reg [2:0] NS;
    assign sigX = ( sigY & ( PS == 3'b010 ) );
    always @ ( PS or reset or ... ) begin
      if ( reset ) NS = 3'b100;
      else begin
        case ( PS )
          3'b100: NS = 3'b001;
          ...
          default: NS = 3'b111;
        endcase
      end
    end

Figure 3. Example of Modifying Verilog Design

4. Output Declarations - The parity encoding scheme has significant advantages over the one-hot encoding scheme in two cases. The first is when the contents of the present state register are exported outside the module in which the FSM is contained, by declaring all (or part) of the present state register as primary outputs of the module. The second case is when all (or part) of the present state register's contents are accessed in, say, a blocking assignment statement within or outside the module in which the FSM occurs. In both these cases, the addition of the parity bit in the most significant position does not alter the functionality in any manner, and automation of the parity scheme poses no difficulties. In other words, since the parity encoding scheme is a separable encoding scheme, the modifications to the Verilog source are transparent to the entire design. Another advantage of parity encoding under these circumstances is that the functionality cannot be inadvertently altered during a post-CED injection optimization phase.

Lastly, when more than a single FSM occurs in the design, the error indication signals are stitched together during the write-back phase of the parser, and finally result in two global error indication signals that are encoded in a two-rail code. While the checker for the parity encoding scheme is straightforward to introduce, care has to be taken to ensure that the checker for the one-hot encoding scheme is self-testing.
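The re-encoding of state constants described in this section can be illustrated with a short sketch. It assumes odd parity with the parity bit as the most significant bit, as stated above, and a one-hot mapping in which state value i sets bit i; all function names are hypothetical and this is not the parser used in the CED insertion tool.

```python
def parity_encode(state, width):
    """Prepend an odd-parity bit as the MSB; return (value, new_width)."""
    ones = bin(state).count("1")
    parity_bit = 0 if ones % 2 == 1 else 1  # make the total number of 1s odd
    return (parity_bit << width) | state, width + 1


def one_hot_encode(state, num_states):
    """Map state value i to a num_states-bit word with only bit i set."""
    return 1 << state, num_states


def verilog_literal(value, width):
    """Format a value as a sized Verilog binary constant such as 3'b010."""
    return f"{width}'b{value:0{width}b}"


def parity_ok(word):
    """Odd-parity check: a valid word contains an odd number of 1s."""
    return bin(word).count("1") % 2 == 1


def one_hot_ok(word):
    """One-hot check: exactly one bit of the word is set."""
    return word != 0 and (word & (word - 1)) == 0


# Constants from the example: 2'b10 and the reset state 2'b00.
parity_10 = verilog_literal(*parity_encode(0b10, 2))
parity_00 = verilog_literal(*parity_encode(0b00, 2))
one_hot_10 = verilog_literal(*one_hot_encode(0b10, 4))

# A single-bit error is caught by the parity check; a double-bit error is not.
single_flip_detected = not parity_ok(0b010 ^ 0b001)
double_flip_detected = not parity_ok(0b010 ^ 0b011)
```

With these definitions, 2'b10 maps to 3'b010 under parity and to 4'b0100 under one-hot, matching the constants quoted in the text. The last two lines show why the parity scheme loses coverage for errors that affect an even number of bits.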

4. COVERAGE EVALUATION TOOL

The goal of the coverage evaluation tool is to evaluate the coverage that the CED circuitry provides. This is done at the gate level after synthesis. The synthesis tool generates a gate-level implementation of the design which contains the error detection circuitry.

We implemented the coverage evaluation tool using a commercial fault simulator. The coverage evaluation tool first generates scripts for the commercial fault simulator. The commercial fault simulator is then invoked to execute the scripts and produce output files. The coverage evaluation tool then parses the output files and computes statistics about the coverage that the error detection circuitry provides.

There are two options for specifying the input patterns that are used by the coverage evaluation tool. One is for the user to supply the set of input patterns. The other is for the tool to internally generate a set of random input patterns.

The coverage evaluation tool performs three runs of fault simulation. On each run, all single stuck-at faults are fault simulated. If the effect of a single stuck-at fault is propagated to an observation point, then the fault is considered detected. The location of the observation points is different for each of the runs. This is necessary to discount all faults that are within the checker circuitry itself, and hence propagate only to the error indication signals and not to the functional outputs. We use the fault dictionary option of the commercial fault simulator to report all the faults that are propagated to the observation points on each run. On the first run, both the error indication signals and the functional outputs are considered as observation points. On the second run, only the functional outputs are considered as observation points. Finally, on the third run, only the error indication signals are considered as observation points.

Let f1, f2 and f3 be the faults propagated to the observation points on the first, second and third runs respectively. The coverage evaluation tool computes the fault coverage as (f3 - (f1 - f2)) / f2. Note that (f1 - f2) is the set of faults that propagate errors only to the error indication signals and not to the functional outputs. These are faults that occur in the error detection circuitry itself.
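The coverage expression can be sketched with Python sets, reading the subtraction as set difference and the ratio as a ratio of set sizes. The function name and the fault identifiers below are hypothetical; in the actual flow the three sets would come from the fault dictionaries reported by the fault simulator.

```python
def ced_coverage(f1, f2, f3):
    """Compute |f3 - (f1 - f2)| / |f2| from the three fault sets.

    f1: faults seen at the functional outputs or the error indication signals
    f2: faults seen at the functional outputs
    f3: faults seen at the error indication signals
    """
    if not f2:
        raise ValueError("no fault propagates to the functional outputs")
    checker_only = f1 - f2                    # faults seen only at the error signals
    detected_functional = f3 - checker_only   # functional faults flagged by the CED
    return len(detected_functional) / len(f2)


# Hypothetical fault identifiers, for illustration only.
f2 = {"a", "b", "c", "d"}          # reach the functional outputs
f3 = {"a", "b", "c", "x", "y"}     # reach the error indication signals
f1 = f2 | f3                       # reach either kind of observation point

coverage = ced_coverage(f1, f2, f3)
```

In this toy data, faults x and y reach only the error indication signals and are discounted, so three of the four faults that corrupt the functional outputs are detected and the coverage is 0.75.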
(f3 - (f1 - f2)) gives the set of faults that cause errors in the functional outputs, but are detected by the error detection circuitry. These correspond to the faults in the original functional circuit that would have gone undetected without the presence of the error detection circuitry. The ratio of these detected faults to all faults that cause errors in the functional outputs gives the coverage that is provided by the error detection circuitry. This is the final coverage number that is reported by the coverage evaluation tool. If this coverage does not meet requirements, then the designer can use the CED insertion tool to insert a different CED scheme that will provide better coverage.

5. EXPERIMENTAL RESULTS

Experiments were performed using the above CAD tools on some synthesizable Verilog control logic designs. Designs 1, 2, and 3 were taken from the Torch processor. Design 4 was extracted from a Hewlett-Packard industrial design.

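Table 1. Normalized Results for Designs
(cells marked [...] did not survive extraction)

DESIGN 1
Scheme        Area    Delay   Power   Coverage
No CED        100     100     100     0
Parity        [...]   [...]   [...]   75.3
One-Hot       [...]   [...]   [...]   77.0
Hybrid        190     152     165     93.0
Duplication   219     166     231     100
Surviving values whose column is uncertain: Parity 101, 140; One-Hot 101, 145; unplaced 112.

DESIGN 2
Scheme        Area    Delay   Power   Coverage
No CED        100     100     100     0
Parity        [...]   [...]   [...]   [...]
One-Hot       [...]   [...]   [...]   [...]
Hybrid        220     215     240     94.0
Duplication   244     232     248     [...]

DESIGN 3
Scheme        Area    Delay   Power   Coverage
No CED        100     100     100     0
Parity        [...]   [...]   [...]   79.4
One-Hot       [...]   [...]   [...]   85.0
Hybrid        215     150     230     94.1
Duplication   270     177     280     100
Surviving values whose column is uncertain: Parity 130, 120; One-Hot 190, 133.

DESIGN 4
Scheme        Area    Delay   Power   Coverage
No CED        100     100     100     0
Remaining rows [...]; surviving fragments: 92.3; 225 2nn 1nn.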

For each design, we used the CED insertion tool to automatically insert each of the four supported CED schemes: parity, one-hot, hybrid, and duplication. We then used a commercial synthesis tool with generic libraries to generate a gate-level implementation. We measured the coverage that was provided by the CED circuitry for each of the schemes using the coverage evaluation tool. We also compared the area, delay, and power for the designs with the CED circuitry versus the original designs with no CED. The results are shown in Table 1. Note that the values for the area, delay, and power are normalized with respect to the original designs with no CED.

From the results, it can be seen that the four different schemes provide a broad range of cost/coverage tradeoffs that the designer can choose from. The parity scheme has the smallest area overhead in all cases, but also has the lowest coverage in all cases except for Design 4, where it provided slightly better coverage than the one-hot scheme. In general, the one-hot scheme requires a little more overhead than parity and provides a little better coverage (Design 4 is an exception). To boost the coverage higher, the hybrid scheme can be used. It provided over 92% coverage in all cases with less overhead than full duplication. If 100% coverage is required, then the duplication scheme can be used, although it has the largest overhead.

6. SUMMARY AND CONCLUSIONS

This paper presented four CED schemes that can be automatically inserted into synthesizable Verilog RTL. A CED insertion tool was described for modifying the Verilog source to add the error detection circuitry. A coverage evaluation tool was also described for measuring the fault coverage that the CED circuitry provides in the gate-level implementation after synthesis. Using these two tools, the designer can rapidly explore various design points to find the best cost/coverage tradeoff that meets requirements. The experimental results demonstrated that the four CED schemes provide a broad range of cost/coverage tradeoffs to choose from.

The methodology described here can be easily incorporated into the typical design flow used in industry and provides a quick solution for designers to use to satisfy coverage requirements. As soft error rates continue to escalate with each new generation of VLSI technology, automated design techniques for inserting CED circuitry will become very important.

7. ACKNOWLEDGEMENTS

The authors would like to acknowledge the contributions to this work by Mike Ziegler, David Fotland, Bill Bryg, Gary Gostin, Harry Foster and Lionel Bening from Hewlett-Packard and Sanjay Ramnath, Debaleena Das and Jayabrata Ghosh-Dastidar from University of Texas-Austin. This research was supported by a grant from the Hewlett-Packard Company.

8. REFERENCES

[Gorshe 96] Gorshe, S., and B. Bose, "A Self-checking ALU Design with Efficient Codes," Proc. of VLSI Test Symposium, pp. 157-161, 1996.

[Lo 92] Lo, J.-C., S. Thanawastein, and M. Nicolaidis, "An SFS Berger Check Prediction ALU and Its Application to Self-Checking Processor Designs," IEEE Trans. on Computer-Aided Design, Vol. 11, No. 4, pp. 525-540, Apr. 1992.

[Mak 82] Mak, G.P., J.A. Abraham, and E.S. Davidson, "The Design of PLAs with Concurrent Error Detection," Proc. FTCS, pp. 303-310, Jun. 1982.

[Nicolaidis 89] Nicolaidis, M., "Self-Exercising Checkers for Unified Built-In Self-Test (UBIST)," IEEE Trans. on Computer-Aided Design, Vol. 8, No. 3, pp. 203-218, Mar. 1989.

[Pradhan 86] Pradhan, D.K., Fault Tolerant Computing: Theory and Techniques, Vol. 1, Englewood Cliffs, NJ: Prentice-Hall, 1986, Chap. 5.

[Touba 97] Touba, N.A., and E.J. McCluskey, "Logic Synthesis of Multilevel Circuits with Concurrent Error Detection," IEEE Transactions on Computer-Aided Design, Vol. 16, No. 7, pp. 783-789, Jul. 1997.
