
SAS 03/ GSFC/SATC-NSWC-DD

System and Software Reliability

Dolores R. Wallace
SRS Technologies
Software Assurance Technology Center
http://satc.gsfc.nasa.gov/

Dr. William H. Farr, Dr. John R. Crigler
Naval Surface Warfare Center Dahlgren Division

NASA OSMA

SAS '03

Overview of the Problem

• Reliability Measurement is a critical objective for NASA systems

• Systems are assessed from the software/hardware/systems perspective

• Methodologies for hardware reliability assessment have been developed and utilized over the past several decades

• Methodologies for software reliability assessment have been developed since the 1970s and have been utilized over the last twenty years

• Methodologies for system reliability assessment have only been addressed over the last 10 years with little application experience

• Need for a tool that integrates all aspects of reliability data (software, hardware, and systems perspectives)

Project Objectives

• Enhance the capability for NASA to assess software reliability by identifying and incorporating recent models into the tool Statistical Modeling and Estimation of Reliability Functions for Systems (SMERFS^3)

– First Year Initiative

– Perform a detailed literature search (1990 and beyond)

• Enhance the capability for NASA to assess system reliability by updating SMERFS^3

– Second Year Initiative

– Identify system models for incorporation

• Apply the identified methodologies to project data sets within the NASA/DoD environments

FY03 Research Plan

• Literature search

• Selection of new models

• Build new models into SMERFS^3

• Test new models with Goddard project data

• Make latest version of SMERFS^3 available

Literature Search

• Articles from 1990 forward

• Journals (sample)

– IEEE TSE

– IEEE Reliability

– Software Testing, Verification, and Reliability

– IEEE Software

– IEEE Computer

• Conferences

– ISSRE

– ICSE

– Reliability & Maintainability

– High-Assurance Systems Eng.

– Various others

• Model selection criteria

– Model assumptions

– Fit within current SMERFS^3

– Type of system

– Data availability

• Domain Experts

Characteristics of the Software Based Systems

• Software

– Real-time

– Large-scale

– Time-critical

– Embedded

– Possibly heavy use of COTS

– Distributed

• System

– Safety-critical components

– Heterogeneous

– Fault tolerant

– Costly to develop

– Long lifetime, evolutionary

SMERFS^3

• Current Version features:

– 6 software reliability models

– 2D and 3D plots of the input data and of the fit to each model

– Various reliability estimates

– User queries for predictions

• Constraints on updates:

– Employ data from integration, system test, or operational phase

– Use existing graphics of SMERFS^3

– Integrate with existing user interfaces, goodness-of-fit tests, and prediction capabilities

Available Data

• Large GSFC project, but confidentiality required

• GSFC person invaluable in explaining the system and the data

• Several subsystems

• Data supplied as flat files; much effort went into moving them into a spreadsheet/database

• Operational failures only

• Remove specific faults and sort others

• Apply IntervalCounter

Bottom line: organizing the data required substantial effort, which would be minimized if project personnel prepared the data
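The slides do not describe what IntervalCounter does internally. As a rough illustration of the kind of step involved, the sketch below assumes that operational failure dates are binned into consecutive intervals of equal length to produce error-count data. The function name, the 30-day interval, and the sample dates are hypothetical and are not taken from the GSFC project or from the tool.

```python
from datetime import date


def count_per_interval(failure_dates, start, interval_days):
    """Count failures in consecutive intervals of equal length.

    Assumption: interval k covers days [k * interval_days, (k + 1) * interval_days)
    measured from `start`. This is an illustrative stand-in, not IntervalCounter itself.
    """
    if interval_days <= 0:
        raise ValueError("interval_days must be positive")
    # Convert each failure date to a day offset from the start of observation.
    offsets = [(d - start).days for d in failure_dates]
    if any(offset < 0 for offset in offsets):
        raise ValueError("failure date earlier than start of observation")
    if not offsets:
        return []
    # One slot per interval up to the interval holding the last failure.
    counts = [0] * (max(offsets) // interval_days + 1)
    for offset in offsets:
        counts[offset // interval_days] += 1
    return counts


# Hypothetical failure dates, binned into 30-day intervals.
sample = [date(2003, 1, 1), date(2003, 1, 15), date(2003, 2, 5), date(2003, 4, 1)]
print(count_per_interval(sample, date(2003, 1, 1), 30))
```

Intervals with no failures are kept as zeros, because count-based models expect one value per interval.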

Identified Models

• Hypergeometric

• Schneidewind (enhancements)

• Log-logistic

• Extended Execution Time (EET)

• The first two models require error count failure data; the last two require time-between-failure data

• Only error count data has been captured in the GSFC project database available for analysis

• Hence, software reliability additions to SMERFS^3 in this task will be limited to the hypergeometric model and the metrics enhancements to the Schneidewind model

Hypergeometric Model Assumptions

Test instance, t(i): A collection of input test data.

N: Total number of initial faults in the software.

• Faults detected by a test instance are removed before the next test instance is exercised

• No new fault is inserted into the software in the removal of the detected fault.

• A test instance t(i) senses w(i) initial faults. w(i) may vary with the condition of test instances over i. It is sometimes referred to in the authors' papers as a "sensitivity" factor.

• The initial faults actually sensed by t(i) depend upon t(i) itself. The w(i) initial faults are taken randomly from the N initial faults.
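These assumptions can be turned into a probability with a short sketch. It assumes the usual reading of the model: if C faults were already detected and removed before t(i), then the w(i) sensed faults are a random draw from the N initial faults, so the number of newly detected faults follows a hypergeometric distribution. The function names and sample values are illustrative and are not taken from SMERFS^3.

```python
from math import comb


def new_fault_pmf(x, n_initial, detected_so_far, w):
    """P(exactly x of the w sensed faults are newly detected).

    Assumption: the w sensed faults are drawn at random, without replacement,
    from the n_initial faults, of which detected_so_far were already removed.
    """
    remaining = n_initial - detected_so_far
    if x < 0 or x > w:
        return 0.0
    # comb(n, k) returns 0 when k > n, which covers impossible outcomes.
    return comb(remaining, x) * comb(detected_so_far, w - x) / comb(n_initial, w)


def expected_cumulative(n_initial, sensitivities):
    """Expected cumulative number of detected faults after each test instance."""
    cumulative = 0.0
    history = []
    for w in sensitivities:
        # Each sensed fault is new with probability (remaining / n_initial).
        cumulative += (n_initial - cumulative) * w / n_initial
        history.append(cumulative)
    return history


# Hypothetical example: 10 initial faults, each test instance senses 5 of them.
print(expected_cumulative(10, [5, 5]))
```

Because w(i) may differ per test instance, the sensitivities are passed as a list rather than as a single constant.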

Hypergeometric Model

• Meets many of our selection criteria:

– Data type

– Fits within the framework of the SMERFS^3 software

– Research shows that it appears to perform well against other models

• Allows for a testing intensity factor (for example: number of test cases, number of testing personnel, debug time)

• Scheduled for implementation in the last quarter of FY03

Schneidewind Model

• There are three versions:

– Model 1: All of the fault counts for each testing period are treated the same.

– Model 2: Ignore the first s-1 testing periods and their associated fault counts. Only use the data from s to n.

– Model 3: Combine the fault counts of intervals 1 to s-1 into the first data point and keep the counts for intervals s to n. Thus there are n-s+2 data points.
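The three data treatments can be sketched as a small preprocessing step applied before the model is fitted. The function name and the sample counts below are hypothetical; the sketch only shows how the input series changes under each version, not how SMERFS^3 implements it.

```python
def apply_treatment(counts, s, model):
    """Prepare per-interval fault counts for one of the three model versions.

    counts: fault counts for intervals 1..n (list index 0 holds interval 1)
    s:      first interval treated individually in Models 2 and 3
    model:  1, 2, or 3
    """
    if not 1 <= s <= len(counts):
        raise ValueError("s must lie between 1 and the number of intervals")
    if model == 1:
        # Use every interval as it is.
        return list(counts)
    if model == 2:
        # Drop intervals 1..s-1 entirely.
        return list(counts[s - 1:])
    if model == 3:
        # Merge intervals 1..s-1 into one point, then keep intervals s..n.
        return [sum(counts[:s - 1])] + list(counts[s - 1:])
    raise ValueError("model must be 1, 2, or 3")


# Hypothetical counts for five intervals, with s = 3.
weekly_counts = [5, 4, 3, 2, 1]
for version in (1, 2, 3):
    print(version, apply_treatment(weekly_counts, 3, version))
```

With s = 1, all three versions return the same series, since there are no earlier intervals to drop or merge apart from an empty combined point in Model 3.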

Schneidewind Assumptions

• The numbers of faults detected in the respective intervals are independent.

• The fault correction rate is proportional to the number of faults to be corrected.

• The intervals over which the software is tested are all taken to be of the same length.

• The cumulative number of faults by time t, M(t), follows a Poisson process with mean value function μ(t). The mean value function is such that the expected number of fault occurrences for any time period is proportional to the expected number of undetected faults at that time.

• The failure intensity function, λ(t), is assumed to be an exponentially decreasing function of time; that is, λ(t) = α exp(-βt) for some α, β > 0.
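Integrating the stated intensity gives the mean value function, assuming μ(0) = 0: μ(t) = (α/β)(1 - exp(-βt)). The sketch below evaluates it, along with the expected count in a unit-length interval and the expected number of faults still undetected at time t. The parameter values are made up for illustration; in practice α and β would be estimated from the fault counts.

```python
from math import exp


def failure_intensity(t, alpha, beta):
    """lambda(t) = alpha * exp(-beta * t), as stated in the assumptions."""
    return alpha * exp(-beta * t)


def mean_value(t, alpha, beta):
    """mu(t): integral of lambda from 0 to t, assuming mu(0) = 0."""
    return (alpha / beta) * (1.0 - exp(-beta * t))


def expected_in_interval(i, alpha, beta):
    """Expected faults detected in the i-th unit-length interval, (i-1, i]."""
    return mean_value(i, alpha, beta) - mean_value(i - 1, alpha, beta)


def expected_remaining(t, alpha, beta):
    """Expected faults still undetected at time t: mu(infinity) - mu(t)."""
    return (alpha / beta) * exp(-beta * t)


# Hypothetical parameters, not estimates from any project data.
alpha, beta = 10.0, 0.5
for i in range(1, 5):
    print(i, round(expected_in_interval(i, alpha, beta), 3))
```

Under these formulas λ(t) equals β times the expected remaining faults, which matches the proportionality stated in the fourth assumption.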

Schneidewind Model Enhancements

• Meets many of our selection criteria:

– Data type

– Basic model already in the SMERFS^3 software

– It has been shown to perform well against other models

• Allows learning curve effect

• Updates are being implemented this quarter

– Risk measures

• Operational quality at time t

• Risk criterion metric for the remaining faults at time t

• Risk criterion metric for the time to next failure at time t

– Confidence intervals

Data Analysis of NASA Three-Month Fault Counts

Proposed Next Steps

• FY03 – Focused on software

– Complete implementation and testing

– Prepare a paper describing the research, model selection, implementation, and conclusions

• Apply the enhancements to the Goddard data set

– Prepare SMERFS^3 for distribution

• FY04

– Conduct similar research effort for System Reliability

• University of Connecticut will participate

– Enhance and validate system models