Common Testing Problems – Pitfalls to Prevent and Mitigate

Descriptions, Symptoms, Consequences, Causes, and Recommendations

Donald G. Firesmith

© 2013 by Carnegie Mellon University


This is an obsolete review draft of Common Testing Problems – Pitfalls to Prevent and Mitigate: Descriptions, Symptoms, Consequences, Causes, and Recommendations. The material was subsequently published as the book Common System and Software Testing Pitfalls (Addison-Wesley, December 2013).


Common Testing Problems: Pitfalls to Prevent and Mitigate
25 January 2013
Descriptions, Symptoms, Consequences, Causes, and Recommendations

Table of Contents

1 Introduction............................................................................................................................5

1.1 Usage.................................................................................................................................5

1.2 Problem Specifications.....................................................................................................6

1.3 Problem Interpretation......................................................................................................6

2 Testing Problems...................................................................................................................8

2.1 General Testing Problems.................................................................................................8

2.1.1 Test Planning and Scheduling Problems...................................................................8

2.1.2 Stakeholder Involvement and Commitment Problems............................................17

2.1.3 Management-related Testing Problems...................................................................21

2.1.4 Test Organization and Professionalism Problems...................................................28

2.1.5 Test Process Problems.............................................................................................32

2.1.6 Test Tools and Environments Problems..................................................................45

2.1.7 Test Communication Problems................................................................................54

2.1.8 Requirements-related Testing Problems..................................................................60

2.2 Test Type Specific Problems..........................................................................................70

2.2.1 Unit Testing Problems.............................................................................................71

2.2.2 Integration Testing Problems...................................................................................72

2.2.3 Specialty Engineering Testing Problems.................................................................74

2.2.4 System Testing Problems........................................................................................82

2.2.5 System of Systems (SoS) Testing Problems............................................................84

2.2.6 Regression Testing Problems..................................................................................89

3 Conclusion............................................................................................................................97

3.1 Testing Problems.............................................................................................................97

3.2 Common Consequences..................................................................................................97

3.3 Common Solutions..........................................................................................................98

4 Potential Future Work......................................................................................................100

5 Acknowledgements............................................................................................................101

© 2012-2013 by Carnegie Mellon University Page 2 of 108


Abstract

This special report documents the different types of problems that commonly occur when testing software-reliant systems. These 77 problems are organized into 14 categories. Each problem is given a title, a description, a set of potential symptoms by which it can be recognized, a set of potential negative consequences that can result if the problem occurs, a set of potential causes, and recommendations for avoiding the problem or solving it should it occur.


1 Introduction

Many testing problems can occur during the development or maintenance of software-reliant systems and software applications. While no project is likely to be so poorly managed and executed as to experience the majority of these problems, most projects will suffer several of them. Similarly, while these testing problems do not guarantee failure, they definitely pose serious risks that need to be managed.

Based on over 30 years of experience developing systems and software as well as performing numerous independent technical assessments, this technical report documents 77 problems that have been observed to commonly occur during testing. These problems have been categorized as follows:

• General Testing Problems
  - Test Planning and Scheduling Problems
  - Stakeholder Involvement and Commitment Problems
  - Management-related Testing Problems
  - Test Organization and Professionalism Problems
  - Test Process Problems
  - Test Tools and Environments Problems
  - Test Communication Problems
  - Requirements-related Testing Problems
• Testing Type Specific Problems
  - Unit Testing Problems
  - Integration Testing Problems
  - Specialty Engineering Testing Problems
  - System Testing Problems
  - System of Systems (SoS) Testing Problems
  - Regression Testing Problems

1.1 Usage

The information describing each of the commonly occurring testing problems can be used:
• To improve communication regarding commonly occurring testing problems
• As training materials for testers and the stakeholders of testing
• As checklists when:
  - Developing and reviewing an organizational or project testing process or strategy
  - Developing and reviewing test plans, the testing sections of system engineering management plans (SEMPs), and software development plans (SDPs)
  - Evaluating the testing-related parts of contractor proposals
  - Evaluating test plans and related documentation (quality control)


  - Evaluating the actual as-performed testing process during oversight[1] (quality assurance)
  - Identifying testing risks and appropriate risk mitigation approaches
• To categorize testing problems for metrics collection, analysis, and reporting
• As an aid to identify testing areas potentially needing improvement during project post-mortems (post-implementation reviews)

Although each of these testing problems has been observed on multiple projects, it is entirely possible that you may have testing problems not addressed by this document.

1.2 Problem Specifications

The following tables document each testing problem with the following information:
• Title – a short descriptive name of the problem
• Description – a brief definition of the problem
• Potential Symptoms (how you will know) – potential symptoms that indicate possible existence of the problem
• Potential Consequences (why you should care) – potential negative consequences to expect if the problem is not avoided or solved[2]
• Potential Causes – potential root and proximate causes of the problem[3]
• Recommendations (what you should do) – recommended (prepare, enable, perform, and verify) actions to take to avoid or solve the problem[4]
• Related Problems – a list of links to other related testing problems

1.3 Problem Interpretation

The goal of testing is not to prove that something works, but rather to demonstrate that it does not.[5] A good tester assumes that there are always defects (an extremely safe assumption) and

[1] Not all testing problems have the same probability or harm severity. These problem specifications are not intended to be used as part of a quantitative scoring scheme based on the number of problems found. Instead, they are offered to support qualitative review and planning.

[2] Note that the occurrence of a potential consequence may be a symptom by which the problem is recognized.
[3] Causes are important because recommendations should be based on the causes. Also, recommendations that address root causes may be more important than those that address only proximate causes, because recommendations addressing proximate causes may not combat the root cause and therefore may not prevent the problem under all circumstances.

[4] Some of the recommendations may no longer be practical after the problem rears its ugly head. It is usually much easier to avoid the problem or nip it in the bud than to fix it when the project is well along or near completion. For example, several possible ways exist to deal with inadequate time to complete testing, including (1) delay the test completion date and reschedule testing, or (2) keep the test completion date and (a) reduce the scope of delivered capabilities, (b) reduce the amount of testing, (c) add testers, and/or (d) perform more parallel testing (e.g., different types of testing simultaneously). Selection of the appropriate recommendations to follow therefore depends on the actual state of the project.

[5] Although tests that pass are often used as evidence that the system (or subsystem) under test meets its (derived and allocated) requirements, testing can never be exhaustive for even a simple system and therefore cannot “prove” that all requirements are met. However, system and operational testing can provide evidence that the system under test is “fit for purpose” and ready to be placed into operation. For example, certain types of testing


seeks to uncover them. Thus, a good test is one that causes the thing being tested to fail so that the underlying defect(s) can be found and fixed.[6]

Defects are not restricted to violations of specified (or unspecified) requirements. Some of the other important types of defects are:
• inconsistencies between the architecture, design, and implementation
• violations of coding standards
• lack of input checking (i.e., of unexpected data)
• the inclusion of safety or security vulnerabilities (e.g., the use of inherently unsafe language features or lack of verification of input data)
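A “show it fails” test deliberately aims the kinds of unexpected data listed above at the code. A minimal sketch in Python (the `parse_age` function and its range limits are hypothetical, invented for illustration):

```python
# A hypothetical input parser with the validation that is often forgotten.
def parse_age(text):
    """Convert user input to an age, rejecting unexpected data."""
    if not text.strip().isdigit():
        raise ValueError(f"not a number: {text!r}")
    age = int(text)
    if not 0 <= age <= 130:
        raise ValueError(f"out of range: {age}")
    return age

# A "show it works" check exercises only the sunny-day path...
assert parse_age("42") == 42

# ...whereas a "show it fails" test hunts for the missing input checks
# described above by feeding deliberately unexpected data.
for bad in ["", "forty", "-1", "999", "42; DROP TABLE users"]:
    try:
        parse_age(bad)
        raise AssertionError(f"defect: accepted {bad!r}")
    except ValueError:
        pass  # rejected as expected; the hunted defect is absent
```

If the validation were missing, the loop would surface the defect immediately, which is exactly the mindset the text describes.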

may provide evidence required for safety and security accreditation and certification. Nevertheless, a tester must take a “show it fails” rather than a “show it works” mindset to be effective.

[6] Note that testing cannot identify all defects because some defects (e.g., the failure to implement missing requirements) do not cause the system to fail in a manner detectable by testing.


2 Testing Problems

The commonly occurring testing problems documented in this section are categorized as either general testing problems or testing type specific problems.

2.1 General Testing Problems

The following testing problems can occur regardless of the type of testing being performed:
• Test Planning and Scheduling Problems
• Stakeholder Involvement and Commitment Problems
• Management-related Testing Problems
• Test Organization and Professionalism Problems
• Test Process Problems
• Test Tools and Environments Problems
• Test Communication Problems
• Requirements-related Testing Problems

2.1.1 Test Planning and Scheduling Problems

The following testing problems are related to test planning and scheduling:
• GEN-TPS-1 No Separate Test Plan
• GEN-TPS-2 Incomplete Test Planning
• GEN-TPS-3 Test Plans Ignored
• GEN-TPS-4 Test Case Documents rather than Test Plans
• GEN-TPS-5 Inadequate Test Schedule
• GEN-TPS-6 Testing is Postponed

2.1.1.1 GEN-TPS-1 No Separate Test Plan

Description: There are no separate testing-specific planning document(s).

Potential Symptoms:
• There is no separate Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP).
• There are only incomplete high-level overviews of testing in the System Engineering Master Plan (SEMP) and System/Software Development Plan (SDP).

Potential Consequences:
• The test planning parts of these other documents are not written by testers.
• Testing is not adequately planned.
• The test plans are not adequately documented.
• It is difficult or impossible to evaluate the planned testing process.


• Testing is inefficiently and ineffectively performed.

Potential Causes:
• The customer has not specified the development and delivery of a separate test plan.
• The system engineering, software engineering, or testing process has not included the development of a separate test plan.
• There was no template for the content and format of a separate test plan.
• Management, the customer representative, or the testers did not understand the:
  - scope, complexity, and importance of testing
  - value of a separate test plan

Recommendations:
• Prepare:
  - Reuse or create a standard template and content/format standard for test plans.
  - Include one or more separate TEMPs and/or STPs as deliverable work products in the contract.
  - Include the development and delivery of test planning documents in the project master schedule (e.g., as part of major milestones).
• Enable:
  - Provide sufficient resources (staffing and schedule) for the development of one or more separate test plans.
• Perform:
  - Develop and deliver one or more separate TEMPs and/or STPs.
• Verify:
  - Verify the existence and delivery of one or more separate test planning documents.
  - Do not accept incomplete high-level overviews of testing in the SEMP and/or SDP as the only test planning documentation.

2.1.1.2 GEN-TPS-2 Incomplete Test Planning

Description: The test planning documents are incomplete.

Potential Symptoms:
• The test planning documents are incomplete, missing some or all[7] of the:
  - references – listing of all relevant documents influencing testing
  - test goals and objectives – listing the high-level goals and subordinate objectives of the testing program
  - scope of testing – listing the component(s), functionality, and/or capabilities to be tested (and any that are not to be tested)
  - test levels – listing and describing the relevant levels of testing (e.g., unit, subsystem

[7] This does not mean that every test plan must include all of this information; test plans should include only the information that is relevant for the current project. It is quite reasonable to reuse much or most of this information in multiple test plans; just because it is highly reusable does not mean that it is meaningless boilerplate that can be ignored. Test plans can be used to estimate the amount of test resources (e.g., time and tools) needed as well as the skills/expertise that the testers need.


    integration, system integration, system, and system of systems testing)
  - test types – listing and describing the types of testing, such as:
    - blackbox, graybox, and whitebox testing
    - developmental vs. acceptance testing
    - initial vs. regression testing
    - manual vs. automated testing
    - mode-based testing (system start-up[8], operational mode, degraded mode, training mode, and system shutdown)
    - normal vs. abnormal behavior (i.e., nominal vs. off-nominal, sunny-day vs. rainy-day use case paths)
    - quality-criteria-based testing such as availability, capacity (e.g., load and stress testing), interoperability, performance, reliability, robustness[9], safety, security (e.g., penetration testing), and usability testing
    - static vs. dynamic testing
    - time- or date-based testing

  - testing methods and techniques – listing and describing the planned testing methods and techniques (e.g., boundary value testing, penetration testing, fuzz testing, alpha and beta testing) to be used, including the associated:
    - test case selection criteria – listing and describing the criteria to be used to select test cases (e.g., interface-based, use-case path, boundary value testing, and error guessing)
    - test entrance criteria – listing the criteria that must hold before testing should begin
    - test exit/completion criteria – listing the test completion criteria (e.g., based on different levels of code coverage such as statement, branch, and condition coverage)
    - test suspension and resumption criteria
  - test completeness and rigor – describing how the rigor and completeness of the testing varies as a function of mission-, safety-, and security-criticality
  - resources:
    - staffing – listing the different testing roles and teams, their responsibilities, their associated qualifications (e.g., expertise, training, and experience), and their numbers
    - environments – listing and describing the required computers (e.g., laptops and servers), test tools (e.g., debuggers and test management tools), test environments (software and hardware test beds), and test facilities
  - testing work products – listing and describing the testing work products to be produced or obtained, such as test documents (e.g., plans and reports), test software (e.g., test drivers and stubs), test data (e.g., inputs and expected outputs), test hardware, and test environments
  - testing tasks – listing and describing the major testing tasks (e.g., name, objective, preconditions, inputs, steps, postconditions, and outputs)

[8] This includes combinations such as the testing of system start-up when hardware/software components fail.
[9] This includes the testing of error, fault, and failure tolerance.


  - testing schedule – listing and describing the major testing milestones and activities in the context of the project development cycle, schedule, and major project milestones
  - reviews, metrics, and status reporting – listing and describing the test-related reviews (e.g., Test Readiness Review), test metrics (e.g., number of tests developed and run), and status reports (e.g., content, frequency, and distribution)
  - dependencies of testing on other project activities – such as the need to incorporate certain hardware and software components into test beds before testing using those environments can begin
  - acronym list and glossary

Potential Consequences:
• Testers and stakeholders in testing may not understand the primary objective of testing (i.e., to find defects so that they can be fixed).
• Some levels and types of tests may not be performed, allowing certain types of residual defects to remain in the system.
• Some testing may be ad hoc and therefore inefficient and ineffectual.
• Mission-, safety-, and security-critical software may not be sufficiently tested to the appropriate level of rigor.
• Certain types of test cases may be ignored, resulting in related residual defects in the tested system.
• Test completion criteria may be based more on schedule deadlines than on the required degree of freedom from defects.
• Adequate amounts of test resources (e.g., testers, test tools, environments, and test facilities) may not be made available because they are not in the budget.
• Some testers may not have adequate expertise, experience, and skills to perform all of the types of testing that need to be performed.

Potential Causes:
• There were no templates or content and format standards for separate test plans.
• The associated templates or content and format standards were incomplete.
• The test planning documents were written by people (e.g., managers or developers) who did not understand the scope, complexity, and importance of testing.

Recommendations:
• Prepare:
  - Reuse or create a standard template and/or content/format standard for test plans.
• Enable:
  - Provide sufficient resources (staffing and schedule) to develop complete test plan(s).
• Perform:
  - Use a proper template and/or content/format standard to develop the test plans (i.e., ones that are derived from test plan standards and tailored if necessary for the specific project).
• Verify:
  - Verify during inspections/reviews that all test plans are sufficiently complete.


  - Do not accept incomplete test plans.

Related Problems: GEN-TOP-2 Unclear Testing Responsibilities, GEN-PRO-8 Inadequate Test Evaluations, GEN-TTE-7 Tests Not Delivered, TTS-SPC-1 Inadequate Capacity Requirements, TTS-SPC-2 Inadequate Concurrency Requirements, TTS-SPC-3 Inadequate Performance Requirements, TTS-SPC-4 Inadequate Reliability Requirements, TTS-SPC-5 Inadequate Robustness Requirements, TTS-SPC-6 Inadequate Safety Requirements, TTS-SPC-7 Inadequate Security Requirements, TTS-SPC-8 Inadequate Usability Requirements, TTS-SoS-1 Inadequate SoS Planning, TTS-REG-5 Disagreement over Maintenance Test Resources
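One plan element listed above, coverage-based test exit/completion criteria, lends itself to a mechanical check. A sketch (the thresholds and measured values are invented for illustration; a real project would take the numbers from a coverage tool such as coverage.py):

```python
# Sketch: evaluate coverage-based test exit criteria from a test plan.
# Threshold values are invented; a real plan would state its own.
EXIT_CRITERIA = {"statement": 0.90, "branch": 0.80, "condition": 0.70}

def exit_criteria_met(measured, criteria=EXIT_CRITERIA):
    """Return the coverage levels still below their required threshold."""
    return [level for level, required in criteria.items()
            if measured.get(level, 0.0) < required]

# Hypothetical measured coverage from the latest test run.
shortfalls = exit_criteria_met({"statement": 0.93, "branch": 0.78,
                                "condition": 0.75})
assert shortfalls == ["branch"]  # branch coverage is still below 0.80
```

Writing the criteria down as data like this keeps the exit decision tied to the plan rather than to the schedule deadline.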

2.1.1.3 GEN-TPS-3 Test Plans Ignored

Description: The test plans are ignored once developed and delivered.

Potential Symptoms:
• The way the testers perform testing is not consistent with the relevant test plan(s).
• The test plan(s) are never updated after their initial delivery shortly after the start of the project.

Potential Consequences:
• Management may not have budgeted sufficient funds to pay for the necessary test resources (e.g., testers, test tools, environments, and test facilities).
• Management may not have made adequate amounts of test resources available because they are not in the budget.
• Testers will not have an approved document that justifies:
  - their request for additional needed resources when they need them
  - their insistence that certain types of testing are necessary and must not be dropped when the schedule becomes tight
• Some testers may not have adequate expertise, experience, and skills to perform all of the types of testing that need to be performed.
• The test plan may not be maintained.
• Some levels and types of tests may not be performed, allowing certain types of residual defects to remain in the system.
• Some important test cases may not be developed and executed.
• Mission-, safety-, and security-critical software may not be sufficiently tested to the appropriate level of rigor.
• Test completion criteria may be based more on schedule deadlines than on the required degree of freedom from defects.

Potential Causes:
• The testers may have forgotten some of the test plan contents.
• The testers may have thought that the only reason a test plan was developed was because it was a deliverable in the contract that needed to be checked off.
• The test plan(s) may be so incomplete and at such a generic, high level of abstraction as to be relatively useless.


Recommendations:
• Prepare:
  - Have project management (both administrative and technical), testers, and quality assurance personnel read and review the test plan.
  - Have management (acquisition and project) sign off on the completed test plan document.
  - Use the test plan as input to the project master schedule and work breakdown structure (WBS).
• Enable:
  - Develop a short checklist from the test plan(s) for use when assessing the performance of testing.
• Perform:
  - Have the test manager periodically review the test work products and as-performed test process against the test plan(s).
  - Have the test team update the test plan(s) as needed.
• Verify:
  - Have the testers present their work and status at project and test-team status meetings.
  - Have quality engineering periodically review the test work products (quality control) and as-performed test process (quality assurance).
  - Have progress, productivity, and quality test metrics collected, analyzed, and reported to project and customer management.

Related Problems: GEN-TPS-2 Incomplete Test Planning

2.1.1.4 GEN-TPS-4 Test Case Documents rather than Test Plans

Description: Test case documents, which document specific test cases, are labeled as test plans.

Potential Symptoms:
• The “test plan(s)” contain specific test cases including inputs, test steps, expected outputs, and sources such as specific requirements (blackbox testing) or design decisions (whitebox testing).
• The test plans do not contain the type of general planning information listed in GEN-TPS-2 Incomplete Test Planning.

Potential Consequences:
• All of the negative consequences of GEN-TPS-2 Incomplete Test Planning may occur.
• The test case documents may not be maintained.

Potential Causes:
• There may have been no template or content and format standard for the test case documents.
• The test plan authors may not have had adequate expertise, experience, and skills to develop test plans or know their proper content.

Recommendations:


• Prepare:
  - Provide the test manager and testers with at least minimal training in test planning.
• Enable:
  - Provide a proper test plan template.
  - Provide a proper content and format standard for test plans.
  - Add the terms test plan and test case document to the project technical glossary.
• Perform:
  - Develop the test plan in accordance with the test plan template or content and format standard.
  - Develop the test case documents in accordance with the test case document template and/or content and format standard.
  - Where practical, automate the test cases so that the resulting tests (extended with comments) replace the test case documents, making the distinction clear (i.e., the test plan is a document meant to be read, whereas the test case is meant to be executable).
• Verify:
  - Have the test plan(s) reviewed against the associated template or content and format standard prior to acceptance.
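The recommendation to let automated, commented tests replace test case documents might look like the following sketch; the `discount` function, its boundary, and the requirement ID are all hypothetical:

```python
# Hypothetical unit under test.
def discount(total):
    """Apply a 10% discount to orders of 100.00 or more."""
    return round(total * 0.9, 2) if total >= 100.0 else total

def test_discount_boundary():
    """Executable test case replacing a test case document.

    Source:   requirement REQ-017 (hypothetical) - blackbox
    Inputs:   order totals around the 100.00 boundary
    Expected: the discount applies at exactly 100.00, not below it
    """
    assert discount(99.99) == 99.99   # just below the boundary
    assert discount(100.0) == 90.0    # exactly on the boundary
    assert discount(150.0) == 135.0   # above the boundary

test_discount_boundary()  # a runner such as pytest would collect this
```

The docstring carries the information a test case document would (source, inputs, expected outputs), while the assertions make the case executable and regression-ready.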

Related Problems: GEN-TPS-2 Incomplete Test Planning

2.1.1.5 GEN-TPS-5 Inadequate Test Schedule

Description: The testing schedule is inadequate to permit proper testing.

Potential Symptoms:
• Testing is significantly incomplete and behind schedule.
• Insufficient time is allocated in the project master schedule to perform all:
  - test activities (e.g., automating testing, configuring test environments, and developing test data, test scripts/drivers, and test stubs)
  - appropriate tests (e.g., abnormal behavior, quality requirements, regression testing)[10]
• Testers are working excessively and unsustainably long hours and days per week in an attempt to meet schedule deadlines.

Potential Consequences:
• Testers are exhausted and therefore making an unacceptably large number of mistakes.
• Tester productivity (e.g., importance of defects found and number of defects found per unit time) is decreasing.
• Customer representatives, managers, and developers have a false sense of security that the system functions properly.
• There is a significant probability that the system or software will be delivered late with an

[10] Note that an agile (i.e., iterative, incremental, and concurrent) development/life cycle greatly increases the amount of regression testing needed (although this increase in testing can be largely offset by highly automating regression tests). Although testing can never be exhaustive, more time is typically needed for adequate testing unless testing can be made more efficient. For example, fewer defects could be produced, and these defects could be found and fixed earlier and thereby be prevented from reaching the current iteration.


unacceptably large number of residual defects.

Potential Causes:
• The overall project schedule was insufficient.
• The size and complexity of the system were underestimated.
• The project master plan was written by people (e.g., managers, chief engineers, or technical leads) who do not understand the scope, complexity, and importance of testing.
• The project master plan was developed without input from the test team(s).

Recommendations:
• Prepare:
  - Provide evidence-based estimates of the amount of testing and associated test effort that will be needed.
  - Ensure that adequate time for testing is included in the program master schedule and test team schedules, including the testing of abnormal behavior and the specialty engineering testing of quality requirements (e.g., load testing for capacity requirements and penetration testing for security requirements).11
  - Provide adequate time for testing in change request estimates.
• Enable:
  - Deliver inputs to the testing process (e.g., requirements, architecture, design, and implementation) earlier and more often (e.g., as part of an incremental, iterative, parallel – agile – development cycle).
  - Provide sufficient test resources (e.g., number of testers, test environments, and test tools).
  - If at all possible, do not reduce the testing effort in order to meet a delivery deadline.
• Perform:
  - Automate as much of the regression testing as is practical, and allocate sufficient resources to maintain the automated tests.12
• Verify:
  - Verify that the amount of time scheduled for testing is consistent with the evidence-based estimates of needed time.
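As a minimal sketch of what automated regression testing can look like in practice, consider pinning each fixed defect with a small test that stays in the suite. The `normalize_phone` function and the bug number below are hypothetical, and `unittest` is just one possible framework:

```python
# Sketch of an automated regression test for a hypothetical, previously
# fixed defect. The point is that the test remains in the suite so later
# changes cannot silently reintroduce the defect.
import unittest


def normalize_phone(raw: str) -> str:
    """Hypothetical function under test: keep only the digits of a phone number."""
    return "".join(ch for ch in raw if ch.isdigit())


class RegressionTests(unittest.TestCase):
    def test_bug_123_punctuation_handled(self):
        # Hypothetical bug #123: punctuation and spaces once broke normalization.
        self.assertEqual(normalize_phone("+1 (412) 268-5800"), "14122685800")


if __name__ == "__main__":
    unittest.main(exit=False)  # run the suite when executed directly
```

Because such tests run unattended, they can be executed on every build, which is what makes the large volume of regression testing in an iterative development cycle affordable.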

Related Problems: TTS-SoS-5 SoS Testing Not Properly Scheduled

2.1.1.6 GEN-TPS-6 Testing is Postponed

Description: Testing is postponed until late in the development schedule.

Potential Symptoms:
• Testing is scheduled to be performed late in the development cycle on the project master schedule.
• Little or no unit or integration testing:
  - is planned
  - is being performed during the early and middle stages of the development cycle

11 Also integrate the testing process into the software development process.

12 When there is insufficient time to perform manual testing, it may be difficult to justify the automation of these tests. However, automating regression testing is not just a maintenance issue. Even during initial development, there should typically be a large amount of regression testing, especially if an iterative and incremental development cycle is used. Thus, ignoring the automation of regression testing is often a case of being penny wise and pound foolish.

Potential Consequences:
• There is insufficient time left in the schedule to correct any major defects found.13
• It is difficult to show the required degree of test coverage.
• Because so much of the system has been integrated before the beginning of testing, it is very difficult to find and localize defects that remain hidden within the internals of the system.

Potential Causes:
• The project is using a strictly-interpreted traditional sequential Waterfall development cycle.
• Management was not able to staff the testing team early during the development cycle.
• Management was primarily interested in system testing and did not recognize the need for lower-level (e.g., unit and integration) testing.

Recommendations:
• Prepare:
  - Plan and schedule testing to be performed iteratively, incrementally, and in a parallel manner (i.e., agile) starting early during development.
  - Provide training in incremental iterative testing.
  - Incorporate iterative and incremental testing into the project’s system/software engineering process.
• Enable:
  - Provide adequate testing resources (staffing, tools, budget, and schedule) early during development.
• Perform:
  - Perform testing in an iterative, incremental, and parallel manner starting early during the development cycle.
• Verify:
  - Verify in an ongoing manner (or at the very least during major project milestones) that testing is being performed iteratively, incrementally, and in parallel with design, implementation, and integration.
  - Use testing metrics to verify status and ongoing progress.

Related Problems: GEN-PRO-1 Testing and Engineering Process not Integrated

13 An interesting example of this is the Hubble telescope. Testing of the mirror’s focusing was postponed until after launch, resulting in an incredibly expensive repair mission.


2.1.2 Stakeholder Involvement and Commitment Problems

The following testing problems are related to stakeholder involvement in and commitment to the testing effort:
• GEN-SIC-1 Wrong Testing Mindset
• GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security
• GEN-SIC-3 Lack of Stakeholder Commitment

2.1.2.1 GEN-SIC-1 Wrong Testing Mindset

Description: Some of the testers and other testing stakeholders have the wrong testing mindset.

Potential Symptoms:
• Some testers and other testing stakeholders begin testing assuming that the system/software works.
• Testers believe that their job is to verify or “prove” that the system/software works.14
• Testing is used to demonstrate that the system/software works properly rather than to determine where and how it fails.
• Only normal (“sunny day”, “happy path”, or “golden path”) behavior is being tested.
• There is little or no testing of:
  - exceptional or fault/failure tolerant (“rainy day”) behavior
  - input data (e.g., range testing to identify incorrect handling of invalid input values)
• Test inputs only include middle-of-the-road values rather than boundary values and corner cases.
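To illustrate the difference between middle-of-the-road inputs and boundary/range testing, here is a minimal sketch; the `validate_age` function and its 0..130 range are hypothetical, not from the report:

```python
# Hypothetical input validator used to contrast "sunny day" testing with
# boundary-value testing.

def validate_age(age: int) -> bool:
    """Accept ages in the inclusive range 0..130; reject everything else."""
    return 0 <= age <= 130

# A middle-of-the-road test -- the only kind this pitfall's testers write:
assert validate_age(35) is True

# Boundary values and corner cases, where defects are most likely to hide:
assert validate_age(0) is True      # lower boundary
assert validate_age(130) is True    # upper boundary
assert validate_age(-1) is False    # just below the lower boundary
assert validate_age(131) is False   # just above the upper boundary
```

An off-by-one defect (e.g., writing `0 < age` instead of `0 <= age`) would pass the middle-of-the-road test but be caught immediately by the boundary tests.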

Potential Consequences:
• There is a high probability that:
  - the delivered system or software will contain a significant number of residual defects, especially related to abnormal behavior (e.g., exceptional use case paths)
  - these defects will unacceptably reduce its reliability and robustness (e.g., error, fault, and failure tolerance)
• Customer representatives, managers, and developers have a false sense of security that the system functions properly.

Potential Causes:
• Testers were taught or explicitly told that their job is to verify or “prove” that the system/software works.
• Developers are testing their own software15 so that there is a “conflict of interest” (i.e., they are asked both to build software that works and to show that their software does not work). This is especially a problem with small, cross-functional development organizations/teams that “cannot afford” to have separate testers (i.e., professional testers who specialize in testing).

14 Using testing to “prove” that their software works is most likely to become a problem when developers test their own software (e.g., with unit testing and with small cross-functional or agile teams).

15 Developers typically do their own unit-level (i.e., lowest-level) testing. With small, cross-functional (e.g., agile) teams, it is becoming more common for developers to also do integration and subsystem testing.


• Insufficient schedule was allocated for testing, leaving only enough time to test the normal behavior (e.g., use case paths).
• The organizational culture is very success-oriented, so that looking “too hard” for problems is (implicitly) discouraged.
• Management gave the testers the strong impression that they do not want to hear any “bad” news (i.e., that significant defects are being found in the system).

Recommendations:
• Prepare:
  - Explicitly state in the project test plan that the primary goal of testing is to:
    - find defects by causing system faults and failures rather than to demonstrate that there are no defects
    - break the system rather than to prove that it works
• Enable:
  - Provide test training that emphasizes uncovering defects by causing faults or failures.
  - Provide sufficient time in the schedule for testing beyond the basic success paths.
  - Hire new testers who exhibit a strong “destructive” mindset to testing.
• Perform:
  - In addition to test cases that verify all normal behavior, emphasize looking for defects where they are most likely to hide (e.g., boundary values, corner cases, and input type/range verification).16
  - Incentivize testers based more on the number of significant defects they uncover than merely on the number of requirements “verified” or test cases run.17
  - Foster a healthy competition between developers (who seek to avoid inserting defects) and testers (who seek to find those defects).
• Verify:
  - Verify that the testers exhibit a testing mindset.

Related Problems: GEN-MGMT-2 Inappropriate External Pressures, GEN-COM-4 Inadequate Communication Concerning Testing, TTS-UNT-3 Unit Testing Considered Unimportant

2.1.2.2 GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security

Description: Testers and other testing stakeholders have unrealistic testing expectations that generate a false sense of security.

16 Whereas tests that verify nominal behavior are essential, testers must keep in mind that there are typically many more ways for the system/software under test to fail than to work properly. Also, nominal tests must remain part of the regression test suite even after all known defects are fixed because changes could introduce new defects that cause nominal behavior to fail.

17 Take care to avoid incentivizing developers to insert defects into their own software so that they can then find them during testing.

Potential Symptoms:
• Testing stakeholders (e.g., managers and customer representatives) and some testers falsely believe that:
  - Testing detects all (or even the majority of) defects.18
  - Testing proves that there are no remaining defects and that the system therefore works as intended.
  - Testing can be, for all practical purposes, exhaustive.
  - Testing can be relied on for all verification. (Note that some requirements are better verified via analysis, demonstration, certification, and inspection.)
  - Testing (if it is automated) will guarantee the quality of the tests and reduce the testing effort.19
• Managers and other testing stakeholders may not understand that:
  - Test automation requires specialized expertise and needs to be budgeted for the effort required to develop, verify, and maintain the automated tests.
  - A passed test could result from a weak/incorrect test rather than a lack of defects.
  - A truly successful/useful test is one that finds one or more defects, whereas a passed test only shows that the system worked in that single specific instance.

Potential Consequences:
• Testers and other testing stakeholders have a false sense of security that the system or software will work properly on delivery and deployment.
• Non-testing forms of verification (e.g., analysis, demonstration, inspection, and simulation) are not given adequate emphasis.

Potential Causes:
• Testing stakeholders and testers were not exposed to research results that document the relatively large percentage of residual defects that typically remain after testing.
• Testers and testing stakeholders have not been trained in verification approaches (e.g., analysis, demonstration, inspection) other than testing and their relative pros and cons.
• Project testing metrics do not include estimates of residual defects.

Recommendations:
• Prepare:
  - Collect information on the limitations of testing.
  - Collect information on when and how to augment testing with other types of verification.
• Enable:
  - Provide basic training in verification methods including their associated strengths and limitations.
• Perform:
  - Train and mentor managers, customer representatives, testers, and other test stakeholders concerning the limits of testing:
    - Testing will not detect all (or even a majority of) defects.
    - No testing is truly exhaustive.
    - Testing cannot prove (or demonstrate) that the system works under all combinations of preconditions and trigger events.
    - A passed test could result from a weak test rather than a lack of defects.
    - A truly successful test is one that finds one or more defects.
  - Do not rely on testing for the verification of all requirements, especially architecturally significant quality requirements.
  - Collect, analyze, and report testing metrics that estimate the number of defects remaining after testing.
• Verify:
  - Verify that testing stakeholders understand the limitations of testing.
  - Verify that testing is not the only type of verification being used.
  - Verify that the number of defects remaining is estimated and reported.

18 Testing typically finds less than half of all latent defects and is not the most efficient way of detecting many defects.

19 This depends on the development cycle and the volatility of the system’s requirements, architecture, design, and implementation.
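One well-known, model-independent way to estimate the number of defects remaining is two-party capture-recapture (the Lincoln-Petersen estimator). This is a generic technique, not the COQUALMO model mentioned elsewhere in this document, and the numbers below are invented for illustration:

```python
# Sketch: estimating residual defects via two-party capture-recapture.
# Two independent test activities (e.g., two test teams) each find some
# defects; the overlap lets us estimate the total defect population.

def estimate_residual_defects(found_by_a: int, found_by_b: int, found_by_both: int) -> float:
    """Estimate defects remaining after two independent test activities."""
    if found_by_both == 0:
        raise ValueError("capture-recapture needs at least one common defect")
    # Lincoln-Petersen estimate of the total defect population:
    estimated_total = found_by_a * found_by_b / found_by_both
    # Distinct defects actually found so far:
    distinct_found = found_by_a + found_by_b - found_by_both
    return estimated_total - distinct_found

# Team A found 30 defects, Team B found 40, and 20 were found by both,
# so the estimated total is 30*40/20 = 60, of which 50 distinct defects
# were found, leaving an estimated 10 residual defects:
print(estimate_residual_defects(30, 40, 20))  # -> 10.0
```

The estimate assumes the two activities find defects independently and with similar difficulty, so it is a rough indicator rather than a precise count; its main value is giving stakeholders a concrete, non-zero estimate of what testing has not found.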

Related Problems: GEN-MGMT-2 Inappropriate External Pressures, GEN-COM-4 Inadequate Communication Concerning Testing, TTS-REG-2 Regression Testing not Performed

2.1.2.3 GEN-SIC-3 Lack of Stakeholder Commitment

Description: There is a lack of adequate stakeholder commitment to the testing effort.

Potential Symptoms:
• Stakeholders (especially customers and management) are not providing sufficient resources (e.g., people, schedule, tools, funding) for the testing effort.
• Stakeholders are unavailable for the review of test assets such as test plans and important test cases.
• Stakeholders (e.g., customer representatives) point out defects in test assets after they have been reviewed.
• Stakeholders do not support testing when resources must be cut (e.g., due to schedule slippages and budget overruns).

Potential Consequences:
• Testing is less effective due to inadequate resources.
• Stakeholders (e.g., customer representatives) reject reviewed test assets.
• The testing effort loses needed resources when the schedule slips or the budget overruns.

Potential Causes:
• Stakeholders did not understand the scope, complexity, and importance of testing.
• Stakeholders were not provided adequate estimates of the resources needed to properly perform testing.
• Stakeholders were extremely busy with other duties.
• The overall project schedule and budget estimates were inadequate, thereby forcing cuts in testing.


Recommendations:
• Prepare:
  - Convey the scope, complexity, and importance of testing to the testing stakeholders.
• Enable:
  - Provide stakeholders with adequate estimates of the resources needed to properly perform testing.
• Perform:
  - Officially request sufficient testing resources from the testing stakeholders.
  - Obtain commitments of support from authoritative stakeholders at the beginning of the project.
• Verify:
  - Verify that the testing stakeholders are providing sufficient resources (e.g., people, schedule, tools, funding) for the testing effort.

Related Problems: GEN-MGMT-1 Inadequate Test Resources, GEN-MGMT-5 Test Lessons Learned Ignored, GEN-MGMT-2 Inappropriate External Pressures, GEN-COM-4 Inadequate Communication Concerning Testing, TTS-SoS-4 Inadequate Funding for SoS Testing, TTS-SoS-6 Inadequate Test Support from Individual Systems

2.1.3 Management-related Testing Problems

The following testing problems are related to the management of the testing effort:
• GEN-MGMT-1 Inadequate Test Resources
• GEN-MGMT-2 Inappropriate External Pressures
• GEN-MGMT-3 Inadequate Test-related Risk Management
• GEN-MGMT-4 Inadequate Test Metrics
• GEN-MGMT-5 Test Lessons Learned Ignored

2.1.3.1 GEN-MGMT-1 Inadequate Test Resources

Description: Management allocates an inadequate amount of resources to testing.

Potential Symptoms:
• The test planning documents and schedules fail to provide for adequate test resources such as:
  - test time in the schedule with adequate schedule reserves
  - trained and experienced testers and reviewers
  - funding
  - test tools and environments (e.g., integration test beds and repositories of test data)

Potential Consequences:
• Adequate test resources will likely not be provided to perform sufficient testing within schedule and budget limitations.
• An unnecessarily large number of defects may make it through testing and into the deployed system.

Potential Causes:
• Testing stakeholders may not understand the scope, complexity, and importance of testing, and therefore its impact on the resources needed to properly perform testing.
• Estimates of needed testing resources may not be based on any evidence-based cost/effort models.
• Resource estimates may be informally made by management without input from the testing organization, especially those testers who will actually be performing the testing tasks.
• Resource estimates may be based on available resources rather than resource needs.
• Management may believe that the testers have padded their estimates and therefore cut the testers’ estimates.
• Testers and testing stakeholders may be overly optimistic, so that their informal estimates of needed resources are based on best-case scenarios rather than most-likely or worst-case scenarios.

Recommendations:
• Prepare:
  - Ensure that testing stakeholders understand the scope, complexity, and importance of testing, and therefore its impact on the resources needed to properly perform testing.
• Enable:
  - Begin test planning at project inception (e.g., at contract award or during proposal development).
  - Train testers in the use of evidence-based cost/effort models to estimate the amount of testing resources needed.
• Perform:
  - Use evidence-based cost/effort models to estimate the needed testing resources.
  - Officially request sufficient testing resources from the testing stakeholders.
  - Ensure that the test planning documents, schedules, and project work breakdown structure (WBS) provide for adequate levels of these test resources.
  - Obtain commitments of support from authoritative stakeholders at the beginning of the project.
• Verify:
  - Verify that the testing stakeholders are providing sufficient resources (e.g., people, schedule, tools, funding) for the testing effort.

Related Problems: GEN-SIC-3 Lack of Stakeholder Commitment, GEN-TOP-3 Inadequate Testing Expertise

2.1.3.2 GEN-MGMT-2 Inappropriate External Pressures

Description: Testers are subject to inappropriate external pressures, primarily from managers.

Potential Symptoms:


• Managers (or possibly customers or developers) are dictating to the testers what constitutes a bug or a defect worth reporting.
• Managerial pressure exists to:
  - inappropriately cut corners (e.g., only perform “sunny day” testing in order to meet schedule deadlines)
  - inappropriately lower the severity and priority of reported defects
  - not find defects (e.g., until after delivery because the project is so far behind schedule that there is no time to fix any defects found)

Potential Consequences:
• If the testers yield to this pressure, then the test metrics do not accurately reflect either the true state of the system/software or the status of the testing process.
• The delivered system or software contains an unacceptably large number of residual defects.

Potential Causes:
• The project is significantly behind schedule and/or over budget.
• There is insufficient time until the delivery/release date to fix a significant number of defects that were found via testing.
• The project is in danger of being cancelled due to lack of performance.
• Management is highly risk averse and therefore did not want to officially label any testing risk as a risk.

Recommendations:
• Prepare:
  - Establish criteria for determining the priority and severity of reported defects.
• Enable:
  - Ensure that trained testers determine what constitutes a bug or a defect worth reporting.
  - Place the manager of the testing organization at the same or higher level as the project manager in the organizational hierarchy (i.e., have the test manager report independently of the project manager).20
• Perform:
  - Support testers when they oppose any inappropriate managerial pressure that would have them violate their professional ethics.
  - Customer representatives must insist on proper testing.
• Verify:
  - Verify that the testers are the ones who decide what constitutes a reportable defect.
  - Verify that the testing manager reports independently of the project manager.

Related Problems: GEN-SIC-1 Wrong Testing Mindset, GEN-TOP-1 Lack of Independence

2.1.3.3 GEN-MGMT-3 Inadequate Test-related Risk Management

Description: There are too few test-related risks identified in the project’s official risk repository.21

20 Note that this will only help if the test manager is not below the manager applying improper pressure.

Potential Symptoms:
• Managers are highly risk averse, treating risk as if it were a “four-letter word”.
• Because adding risks to the risk repository is looked on as a symptom of management failure, risks (including testing risks) are mislabeled as issues or concerns so that they need not be reported as official risks.
• There are few if any test-related risks identified in the project’s official risk repository.
• The number of test-related risks is unrealistically low.
• The identified test-related risks have inappropriately low probabilities, low harm severities, and low priorities.
• The identified test risks have no:
  - associated risk mitigation approaches
  - one assigned as being responsible for the risk
• The test risks are never updated (e.g., additions or modifications) over the course of the project.
• Testing risks are not addressed in either the test plan(s) or the risk management plan.

Potential Consequences:
• Testing risks are not reported.
• Management and acquirer representatives are unaware of their existence.
• Testing risks are not being managed.
• The management of testing risks is not given sufficiently high priority.

Potential Causes:
• Management is highly risk averse.
• Managers strongly communicate their preference that only a small number of the most critical risks be entered into the project risk repository.
• The people responsible for risk management and managing the risk repository have never been trained in or exposed to the many potential test-related risks (e.g., those associated with the commonly occurring testing problems addressed in this document).
• The risk management process strongly emphasizes system-specific or system-level (as opposed to software-level) risks and tends not to address any development activity risks (such as those associated with testing).
• It is early in the development cycle, before sufficient testing has begun.
• There have been few if any evaluations of the testing process.
• There has been little if any oversight of the testing process.

21 These potential testing problems can be viewed as generic testing risks.


Recommendations:
• Prepare:
  - Determine management’s degree of risk aversion and attitude regarding inclusion of risks in the project risk repository.
• Enable:
  - Ensure that the people responsible for risk management and managing the risk repository are aware of the many potential test-related risks.
• Perform:
  - Identify test-related risks and incorporate them into the official project risk repository.
  - Provide test-related risks with realistic probabilities, harm severities, and priorities.
• Verify:
  - Verify that the risk repository contains an appropriate number of testing risks.
  - Verify that there is sufficient management and quality assurance oversight and evaluation of the testing process.

Related Problems: GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security

2.1.3.4 GEN-MGMT-4 Inadequate Test Metrics

Description: Insufficient test metrics are being produced, analyzed, and reported.

Potential Symptoms:
• Insufficient or no test metrics are being produced, analyzed, and reported.
• The primary test metrics (e.g., number of tests22, number of tests needed to meet adequate or required test coverage levels, number of tests passed/failed, number of defects found) show neither the productivity of the testers nor their effectiveness at finding defects (e.g., defects found per test or per day).
• The number of latent undiscovered defects remaining is not being estimated (e.g., using COQUALMO23).
• Management measures tester productivity strictly in terms of defects found per unit time, ignoring the importance or severity of the defects found.

Potential Consequences:
• Managers, testers, and other stakeholders in testing do not accurately know the quality of testing, the importance of the defects being found, or the number of residual defects in the delivered system or software.
• Managers do not know the productivity of the testers or their effectiveness at finding important defects, thereby making it difficult to improve the testing process.
• Testers concentrate on finding lots of (unimportant) defects rather than finding critical defects (e.g., those with mission-critical, safety-critical, or security-critical ramifications).
• Customer representatives, managers, and developers have a false sense of security that the system functions properly.

22 Note that the number-of-tests metric does not indicate the effort or complexity of identifying, analyzing, and fixing defects.

23 COQUALMO (COnstructive QUALity Model) is an estimation model that can be used for predicting the number of residual defects per KSLOC (thousands of source lines of code) or defects per FP (Function Point) in a software product.

Potential Causes:
• Project management (including the managers/leaders of test organizations/teams) is not familiar with the different types of testing metrics (e.g., quality, status, and productivity) that could be useful.
• Metrics collection, analysis, and reporting is at such a high level that individual disciplines (such as testing) are rarely assigned more than one or two highly generic metrics (e.g., “Inadequate testing is a risk”).
• Project management (and testers) are only aware of backward-looking metrics (e.g., defects found and fixed) as opposed to forward-looking metrics (e.g., residual defects remaining to be found).

Recommendations:
• Prepare:
  - Provide testers and testing stakeholders with basic training in metrics with an emphasis on test metrics.
• Enable:
  - Incorporate a robust metrics program in the test plan that covers leading indicators.
  - Emphasize the finding of important defects.
• Perform:
  - Consider using some of the following representative examples of useful testing metrics:
    - number of defects found per test (test effectiveness metric)
    - number of defects found per tester day (tester productivity metric)
    - number of defects that slip through each verification milestone / inch pebble (e.g., reviews, inspections, tests)24
    - estimated number of latent undiscovered defects remaining in the delivered system (e.g., estimated using COQUALMO)
  - Regularly collect, analyze, and report an appropriate set of testing metrics.
• Verify:
  - Important: Evaluate and maintain visibility into the as-performed testing process to ensure that it does not become metrics-driven.
  - Watch out for signs that testers worry more about looking good (e.g., by concentrating on only the defects that are easy to find) than on finding the most important defects.
  - Verify that sufficient testing metrics are collected, analyzed, and reported.

Related Problems: None

2.1.3.5 GEN-MGMT-5 Test Lessons Learned Ignored

Description: Lessons that are learned regarding testing are not placed into practice.

24 For example, what are the percentages of defects that manage to slip by architecture reviews, design reviews, implementation inspections, unit testing, integration testing, and system testing without being detected?


Potential Symptoms:
• Management, the test teams, or customer representatives ignore lessons learned during previous projects or during the testing of previous increments of the system under test.

Potential Consequences:
• The test process is not being continually improved.
• The same problems continue to occur.
• Customer representatives, managers, and developers have a false sense of security that the system functions properly.

Potential Causes:
• Lessons learned were not documented.
• The capturing of lessons learned was postponed until after the project was over, when the people who had learned the lessons were no longer available, having scattered to new projects.
• The only usage of lessons learned is informal and solely based on the experience that the individual developers and testers bring to new projects.
• Lessons learned from previous projects are not reviewed before starting new projects.

Recommendations:
• Prepare:
  - Make the documentation of lessons learned an explicit part of the testing process.
  - Review previous lessons learned as an initial step in determining the testing process.
• Enable:
  - Capture (and implement) lessons learned as they are learned. Do not wait until a project postmortem, when project staff members’ memories are fading and they are moving (or have moved) on to their next projects.
• Perform:
  - Incorporate previously learned testing lessons into the current testing process and test plans.
• Verify:
  - Verify that previously learned testing lessons have been incorporated into the current testing process and test plans.
  - Verify that testing lessons learned are captured (and implemented) as they are learned.

Related Problems: GEN-SIC-3 Lack of Stakeholder Commitment

2.1.4 Test Organization and Professionalism Problems

The following testing problems are related to the test organization and the professionalism of the testers:
• GEN-TOP-1 Lack of Independence
• GEN-TOP-2 Unclear Testing Responsibilities
• GEN-TOP-3 Inadequate Testing Expertise


2.1.4.1 GEN-TOP-1 Lack of Independence

Description: The test organization or team lacks adequate independence to properly perform its testing tasks.

Potential Symptoms:
• The manager of the test organization reports to the development manager.
• The lead of the project test team reports to the project manager.
• The test organization manager or test team leader does not have sufficient authority to raise and manage testing-related risks.

Potential Consequences:
• A lack of sufficient independence forces the test organization or team to select an inappropriate test process or tool.
• Members of the test organization or team are intimidated into withholding objective and timely information from the testing stakeholders.
• The test organization or team has insufficient budget and schedule to be effective.
• The project manager inappropriately overrules or pressures the testers to violate their principles.

Potential Causes:
• Management does not see the value of or need for independent reporting.
• Management does not see the similarity between quality assurance and testing with regard to independence.

Recommendations:
• Prepare:
 Determine reporting structures.
 Identify potential independence problems.
• Enable:
 Clarify to testing stakeholders (especially project management) the value of independent reporting for the test organization manager and project test team leader.
• Perform:
 Ensure that the test organization or team has:
  Technical independence, so that they can select the most appropriate test process and tools for the job
  Managerial independence, so that they can provide objective and timely information about the test program and results without fear of intimidation due to business considerations or project-internal politics
  Financial independence, so that their budget (and schedule) is sufficient to enable them to be effective and efficient
 Have the test organization manager report at the same or a higher level than the development organization manager.
 Have the project test team leader report independently of the project manager to the test organization manager or equivalent (e.g., quality assurance manager).


• Verify:
 Verify that the test organization manager reports at the same or a higher level than the development organization manager.
 Verify that the project test team leader reports independently of the project manager to the test organization manager or equivalent (e.g., quality assurance manager).

Related Problems: GEN-MGMT-2 Inappropriate External Pressures

2.1.4.2 GEN-TOP-2 Unclear Testing Responsibilities

Description: The testing responsibilities are unclear.

Potential Symptoms:
• The test planning documents do not adequately address testing responsibilities in terms of which organizations, teams, and people:
 will perform which types of testing on what [types of] components
 are responsible for procuring, building, configuring, and maintaining the test environments
 are the ultimate decision makers regarding testing risks, test completion criteria, test completion, and the status/priority of defects

Potential Consequences:
• Certain tests are not performed, while other tests are performed redundantly by multiple organizations or people.
• Incomplete testing enables some defects to make it through testing and into the deployed system.
• Redundant testing wastes test resources and causes testing deadlines to slip.

Potential Causes:
• The test plan template did not clearly address responsibilities.
• The project team is very small, with everyone wearing multiple hats and therefore performing testing on an as-available / as-needed basis.

Recommendations:
• Prepare:
 Obtain documents describing current testing responsibilities.
 Identify potential testing responsibility problems (e.g., missing or vague responsibilities).
• Enable:
 Obtain organizational agreement as to the testing responsibilities.
• Perform:
 Clearly and completely document the responsibilities for testing in the test plans as well as in the charters of the teams who will be performing the tests.
 Have managers clearly communicate these responsibilities to the relevant organizations and people.
• Verify:


 Verify that testing responsibilities are clearly and completely documented in the test plans as well as in the charters of the teams who will be performing the tests.
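One way to make such responsibility documentation checkable is to record the assignments in machine-readable form and audit them for gaps (tests no one performs) and overlaps (tests performed redundantly). The following is a hypothetical sketch; the component, test-type, and team names are invented for the example.

```python
# A minimal sketch: detect unassigned and redundantly assigned testing
# responsibilities, the two consequences named for this pitfall.
# All component, test-type, and team names below are invented.

assignments = {
    ("flight_control", "unit"): ["dev_team_a"],
    ("flight_control", "integration"): ["dev_team_a", "itt"],  # assigned twice
    ("navigation", "unit"): ["dev_team_b"],
    ("navigation", "system"): [],  # assigned to no one
}

def audit(assignments):
    """Return (unassigned, redundant) lists of (component, test_type) keys."""
    unassigned = [key for key, teams in assignments.items() if not teams]
    redundant = [key for key, teams in assignments.items() if len(teams) > 1]
    return unassigned, redundant

unassigned, redundant = audit(assignments)
print("No team assigned:", unassigned)
print("Multiple teams assigned:", redundant)
```

An audit like this could be run whenever the test plan's responsibility tables change, turning the Verify step above into a repeatable check.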

Related Problems: GEN-TPS-2 Incomplete Test Planning, GEN-PRO-7 Too Immature for Testing, GEN-COM-2 Inadequate Test Documentation, TTS-SoS-3 Unclear SoS Testing Responsibilities

2.1.4.3 GEN-TOP-3 Inadequate Testing Expertise

Description: Too many people have inadequate testing expertise, experience, and training.

Potential Symptoms:
• Testers and/or those who oversee them (e.g., managers and customer representatives) have inadequate testing expertise, experience, or training.
• Developers who are not professional testers have been tasked to perform testing.
• Little or no classroom or on-the-job training in testing has taken place.
• Testing is ad hoc, without any proper process.
• Industry best practices are not followed.

Potential Consequences:
• Testing is not effective at detecting defects, especially the less obvious ones.
• There are unusually large numbers of false-positive and false-negative test results.
• The productivity of the testers is needlessly low.
• There is a high probability that the system or software will be delivered late with an unacceptably large number of residual defects.
• During development, managers, developers, and customer representatives have a false sense of security that the system functions properly.25

Potential Causes:
• Management did not understand the scope and complexity of testing.
• Management did not understand the required qualifications of a professional tester.
• There was insufficient funding to hire fully qualified professional testers.
• The project team is very small, with everyone wearing multiple hats and therefore performing testing on an as-available / as-needed basis.
• An agile development method is being followed that emphasizes cross-functional development teams.

Recommendations:
• Prepare:
 Provide proper test processes, including procedures, standards, guidelines, and templates, for on-the-job training.
 Ensure that the required qualifications of a professional tester are documented in the tester job description.

25 This false sense of security is likely to be replaced by a sense of panic when the system begins to frequently fail operational testing or real-world usage after deployment.


• Enable:
 Convey the required qualifications of the different types of testers to those technically evaluating prospective testers.
 Provide appropriate amounts of test training (both classroom and on-the-job) for both testers and those overseeing testing.
 Ensure that the testers who will be automating testing have the necessary specialized expertise and training.26
 Obtain independent support for those overseeing testing.
• Perform:
 Hire full-time (i.e., professional) testers who have sufficient expertise and experience in testing.
 Use an independent test organization staffed with experienced, trained testers for system/acceptance testing, whereby the head of this organization is at the same (or a higher) level as the project manager.
• Verify:
 Verify that those technically evaluating prospective testers understand the required qualifications of the different types of testers.
 Verify that the testers have adequate testing expertise, experience, and training.

Related Problems: GEN-MGMT-1 Inadequate Test Resources

2.1.5 Test Process Problems

The following testing problems are related to the processes and techniques being used to perform testing:
• GEN-PRO-1 Testing and Engineering Process Not Integrated
• GEN-PRO-2 One-Size-Fits-All Testing
• GEN-PRO-3 Inadequate Test Prioritization
• GEN-PRO-4 Functionality Testing Overemphasized
• GEN-PRO-5 Black-box System Testing Overemphasized
• GEN-PRO-6 White-box Unit and Integration Testing Overemphasized
• GEN-PRO-7 Too Immature for Testing
• GEN-PRO-8 Inadequate Test Evaluations
• GEN-PRO-9 Inadequate Test Maintenance

2.1.5.1 GEN-PRO-1 Testing and Engineering Process Not Integrated

Description: The testing process is not adequately integrated into the overall system/software engineering process.

26 Note that these recommendations apply regardless of whether the project uses separate testing teams or cross-functional teams including testers.

Potential Symptoms:
• There is little or no discussion of testing in the system/software engineering documentation:


System Engineering Master Plan (SEMP), Software Development Plan (SDP), Work Breakdown Structure (WBS), Project Master Schedule (PMS), and system/software development cycle (SDC).

• All or most of the testing is being done as a completely independent activity performed by staff members who are not part of the project engineering team.

• Testing is treated as a separate specialty-engineering activity with only limited interfaces with the primary engineering activities.

• Testers are not included in the requirements teams, architecture teams, and any cross-functional engineering teams.

Potential Consequences:
• There is inadequate communication between testers and other system/software engineers (e.g., requirements engineers, architects, designers, and implementers).
• Few testing outsiders understand the scope, complexity, and importance of testing.
• Testers do not understand the work being performed by other engineers.
• There are incompatibilities between outputs and associated inputs at the interfaces between testers and other engineers.
• Testing is less effective and takes longer than necessary.

Potential Causes:
• Testers are not involved in the determination and documentation of the overall engineering process.
• The people determining and documenting the overall engineering process do not have significant testing expertise, training, or experience.

Recommendations:
• Prepare:
 Obtain the SEMP, SDP, WBS, and project master schedule.
• Enable:
 Provide a top-level briefing/training in testing to the chief system engineer, system architect, and system/software process engineer.
• Perform:
 Have test subject matter experts and project testers collaborate closely with the project chief engineer / technical lead and process engineer when they develop the engineering process descriptions and associated process documents.
 In addition to covering testing in test plans such as the Test and Evaluation Master Plan (TEMP) or Software Test Plan (STP) as well as in other process documents, provide high-level overviews of testing in the SEMP(s) and SDP(s).
 Document how testing is integrated into the system/software development/life cycle, regardless of whether it is traditional waterfall, agile (iterative, incremental, and parallel), or anything in between. For example, document handover points in the development cycle when testing input and output work products are delivered from one project organization or group to another.
 Incorporate testing into the Project Master Schedule.


 Incorporate testing into the project’s work breakdown structure (WBS).
• Verify:
 Verify that testing is incorporated into the project’s:
 system/software engineering process
 SEMP and SDP
 WBS
 PMS
 SDC

Related Problems: GEN-COM-4 Inadequate Communication Concerning Testing

2.1.5.2 GEN-PRO-2 One-Size-Fits-All Testing

Description: All testing is to be performed to the same level of rigor, regardless of its criticality.

Potential Symptoms:
• The test planning documents may contain only generic boilerplate rather than appropriate system-specific information.
• Mission-, safety-, and security-critical software may not be required to be tested more completely and rigorously than other, less-critical software.
• Only general techniques suitable for testing functional requirements/behavior may be documented; for example, there is no description of the special types of testing needed for quality requirements (e.g., availability, capacity, performance, reliability, robustness, safety, security, and usability requirements).

Potential Consequences:
• Mission-, safety-, and security-critical software may not be adequately tested.
• When there are insufficient resources to adequately test all of the software, some of these limited resources may be misapplied to lower-priority software instead of being concentrated on the testing of more critical capabilities.
• Some defects may not be found, and an unnecessary number of these defects may make it through testing and into the deployed system.
• The system may not be sufficiently safe or secure.

Potential Causes:
• Test plan templates and content/format standards may be incomplete and may not address the impact of mission/safety/security criticality on testing.
• Test engineers may not be familiar with the impact of safety and security on testing (e.g., the higher level of testing rigor required to achieve accreditation and certification).
• Safety and security engineers may not have input into the test planning process.

Recommendations:
• Prepare:
 Provide training to those writing system/software development plans and system/software test plans concerning the need to include project-specific testing information, including potential content.
 Tailor the templates for test plans and development methods to address the need for project/system-specific information.
• Enable:
 Update (if needed) the templates for test plans and development methods to address the type, completeness, and rigor of testing.
• Perform:

 Address the following in the system/software test plans and system/software development plans:
 Differences in testing types, degrees of completeness, rigor, etc. as a function of mission/safety/security criticality.
 Specialty engineering testing methods and techniques for testing the quality requirements (e.g., penetration testing for security requirements).
 Test mission-, safety-, and security-critical software more completely and rigorously than other, less-critical software.
• Verify:

 Verify that the completeness, type, and rigor of testing:
 are addressed in the system/software development plans and system/software test plans
 are a function of the criticality of the system/subsystem/software being tested
 are sufficient based on the degree of criticality of the system/subsystem/software being tested
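The idea of tying testing rigor to criticality can be made concrete with a simple lookup table that gates acceptance on criticality-dependent thresholds. The sketch below is purely illustrative; the criticality levels, coverage thresholds, and review flags are invented examples, not prescribed values.

```python
# A minimal sketch: required testing rigor as a function of criticality,
# instead of one-size-fits-all. All levels and thresholds are invented.

RIGOR_BY_CRITICALITY = {
    "safety-critical":  {"branch_coverage": 1.00, "independent_review": True},
    "mission-critical": {"branch_coverage": 0.90, "independent_review": True},
    "routine":          {"branch_coverage": 0.70, "independent_review": False},
}

def meets_rigor(criticality, measured_branch_coverage, reviewed):
    """Check measured rigor against the minimum for this criticality level."""
    req = RIGOR_BY_CRITICALITY[criticality]
    return (measured_branch_coverage >= req["branch_coverage"]
            and (reviewed or not req["independent_review"]))

# A safety-critical component with 85% branch coverage fails the gate,
# while the same coverage is acceptable for a routine component:
print(meets_rigor("safety-critical", 0.85, reviewed=True))   # False
print(meets_rigor("routine", 0.85, reviewed=False))          # True
```

A table like this makes the criticality-dependent rigor verifiable rather than a matter of boilerplate in the test plan.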

Related Problems: GEN-PRO-3 Inadequate Test Prioritization

2.1.5.3 GEN-PRO-3 Inadequate Test Prioritization

Description: Testing is not being adequately prioritized.

Potential Symptoms:
• All types of testing may have the same priority.
• All test cases for the system or one of its subsystems may have the same priority.
• The most important tests of a given type may not be being performed first.
• Testing may begin with the easy testing of “low-hanging fruit”.
• Difficult testing or the testing of high-risk functionality/components may be postponed until late in the schedule.
• Testing ignores the order of integration and delivery; for example, unit testing before integration testing before system testing, and the testing of the functionality of the current increment before the testing of future increments.27

27 While the actual testing of future capabilities must wait until those capabilities are delivered to the testers, one can begin to develop black-box test cases based on requirements allocated to future builds (i.e., tests that are currently not needed and may never be needed if the associated requirements change or are deleted).

Potential Consequences:


• Limited testing resources may be wasted or ineffectively used.
• Some of the most critical defects (in terms of failure consequences) may not be discovered until after the system/software is delivered and placed into operation.
• Specifically, defects with mission, safety, and security ramifications may not be found.

Potential Causes:
• The system/software test plans and the testing parts of the system/software development plans do not address the prioritization of testing.
• Any prioritization of testing is not used to schedule testing.
• Evaluations of the individual testers and test teams:
 are based [totally] on the number of tests performed per unit time
 ignore the importance of capabilities, subsystems, or defects found

Recommendations:
• Prepare:
 Update the following documents to address the prioritization of testing:
 system/software test plans
 testing parts of the system/software development plans
 Define the different types and levels/categories of criticality.
• Enable:
 Perform a mission analysis to determine the mission-criticality of the different capabilities and subsystems.
 Perform a safety (hazard) analysis to determine the safety-criticality of the different capabilities and subsystems.
 Perform a security (threat) analysis to determine the security-criticality of the different capabilities and subsystems.
• Perform:
 Work with the developers, management, and stakeholders to prioritize testing according to the:
 criticality (e.g., mission, safety, and security) of the system/subsystem/software being tested
 potential importance of the potential defects identified via test failure
 probability that the test is likely to elicit important failures
 potential level of risk incurred if the defects are not identified via test failure
 delivery schedules
 integration/dependency order
 Use the prioritization of testing to schedule testing so that the highest priority tests are performed first.
 Collect test metrics based on the number and importance of the defects found.
 Base the performance evaluations of the individual testers and test teams on test effectiveness (e.g., the number and importance of defects found) rather than merely on the number of tests written and performed.

• Verify:


 Evaluate the system/software test plans and the testing parts of the system/software development plans to verify that they properly address test prioritization.
 Verify that mission, safety, and security analyses have been performed and that the results are used to prioritize testing.
 Verify that testing is properly prioritized.
 Verify that testing is in fact being performed in accordance with the prioritization.
 Verify that testing metrics address test prioritization.
 Verify that performance evaluations are based on test effectiveness rather than merely on the number of tests performed.
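Risk-based prioritization of the kind recommended above is often implemented by scoring each test on the likelihood that it elicits an important failure and the impact if the underlying defect escapes, then running the highest-scoring tests first. A minimal sketch, with invented test names and values:

```python
# A minimal sketch of risk-based test ordering: score = likelihood x impact,
# run the highest-scoring tests first. All names and values are invented.

tests = [
    {"name": "login_happy_path",    "failure_likelihood": 0.2, "impact": 3},
    {"name": "failover_under_load", "failure_likelihood": 0.6, "impact": 9},
    {"name": "report_formatting",   "failure_likelihood": 0.5, "impact": 1},
    {"name": "safety_interlock",    "failure_likelihood": 0.3, "impact": 10},
]

def risk_score(test):
    """Expected harm of skipping this test: likelihood times impact."""
    return test["failure_likelihood"] * test["impact"]

schedule = sorted(tests, key=risk_score, reverse=True)
for t in schedule:
    print(f'{t["name"]}: {risk_score(t):.1f}')
```

Note that the high-impact but moderate-likelihood safety test outranks the easy "low-hanging fruit" tests, which is exactly the ordering this pitfall says is usually missing.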

Related Problems: GEN-PRO-2 One-Size-Fits-All Testing

2.1.5.4 GEN-PRO-4 Functionality Testing Overemphasized

Description: There is an overemphasis on testing functionality as opposed to quality characteristics, data, and interfaces.

Potential Symptoms:
• The vast majority of testing may be concerned with verifying functional behavior.
• Little testing may be performed to verify adequate levels of the quality characteristics (e.g., availability, reliability, robustness, safety, security, and usability).
• Inadequate levels of various quality characteristics and their attributes are only recognized after the system has been delivered and placed into operation.

Potential Consequences:
• The system may not have adequate levels of important quality characteristics and thereby fail to meet all of its quality requirements.
• Failures to meet data and interface requirements (e.g., due to a lack of verification of input data and message contents) may not be recognized until late during integration or after delivery.
• Testers and developers may have a harder time localizing the defects that the system tests reveal.
• The system or software may be delivered late and fail to meet an unacceptably large number of non-functional requirements.

Potential Causes:
• The test plans and process documents do not adequately address the testing of non-functional requirements.
• There are no process requirements (e.g., in the development contract) mandating the specialized testing of non-functional requirements.
• Managers, developers, and/or testers believe:
 Testing other types of requirements (i.e., data, interface, quality, and architecture/design/implementation/configuration constraints) is too hard.
 Testing the non-functional requirements will take too long.28

28 Note that adequately testing quality requirements requires significantly more time to prepare for and perform than testing typical functional requirements.


 The non-functional requirements are not as important as the functional requirements.
 Testing of the non-functional requirements will naturally occur as a byproduct of the testing of the functional requirements.29
• The other types of requirements (especially quality requirements) are:
 poorly specified (e.g., “The system shall be secure.” or “The system shall be easy to use.”)
 not specified
 therefore not testable
• Functional testing may be the only testing mandated by the development contract, and therefore the testing of the non-functional requirements is out of scope or unimportant to the acquisition organization.

Recommendations:
• Prepare:
 Adequately address the testing of non-functional requirements in the test plans and process documents.
 Include process requirements mandating the specialized testing of non-functional requirements in the contract.
• Enable:
 Ensure that managers, developers, and/or testers understand the importance of testing non-functional requirements as well as conformance to the architecture and design (e.g., via white-box testing).
• Perform:
 Adequately perform the other types of testing.
• Verify:
 Verify that the managers, developers, and/or testers understand the importance of testing non-functional requirements and conformance to the architecture and design.
 Have quality engineers verify that the testers are testing the quality, data, and interface requirements as well as the architecture/design/implementation/configuration constraints.
 Review the test plans and process documents to ensure that they adequately address the testing of non-functional behavior.
 Measure, analyze, and report the types of non-functional defects and when they are being detected.
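To illustrate the difference between functional and non-functional testing, the sketch below asserts on a quality attribute (latency) rather than on a computed result. The function under test and the latency budget are invented for the example; real performance requirements would come from the quality requirements specification.

```python
# A minimal sketch of a non-functional (performance) test: it asserts on a
# quality attribute (elapsed time), not on functional correctness.
# The function and the 0.5-second budget are invented for illustration.

import time

def lookup(key, table):
    """Function under test: a plain dictionary lookup."""
    return table.get(key)

def test_lookup_latency_budget():
    table = {i: i * i for i in range(100_000)}
    start = time.perf_counter()
    for i in range(10_000):
        lookup(i, table)
    elapsed = time.perf_counter() - start
    # Invented quality requirement: 10,000 lookups within 0.5 seconds.
    assert elapsed < 0.5, f"latency budget exceeded: {elapsed:.3f}s"

test_lookup_latency_budget()
print("latency budget met")
```

A purely functional test of `lookup` would pass even if it took minutes; only a test of this shape can reveal a failure to meet the performance requirement.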

Related Problems: None

2.1.5.5 GEN-PRO-5 Black-box System Testing Overemphasized

Description: There is an overemphasis on black-box system testing for requirements conformance.

29 Note that this can be largely true for some of the non-functional requirements (e.g., interface requirements and performance requirements).

Potential Symptoms:


• The vast majority of testing is occurring at the system level for purposes of verifying conformance to requirements.
• There is very little white-box unit and integration testing.
• System testing is detecting many defects that could have been more easily identified during unit or integration testing.
• Similar residual defects may also be causing faults and failures after the system has been delivered and placed into operation.

Potential Consequences:
• Defects that could have been found during unit or integration testing are harder to detect, localize, analyze, and fix.
• System testing is unlikely to be completed on schedule.
• It is harder to develop sufficient system-level tests to meet code coverage criteria.
• The system or software may be delivered late with an unacceptably large number of residual defects that will only rarely be executed and thereby cause faults or failures.

Potential Causes:
• The test plans and process documents do not adequately address unit and integration testing.
• There are no process requirements (e.g., in the development contract) mandating unit and integration testing.
• The developers believe that black-box system testing is all that is necessary to detect the defects.
• Developers believe that testing is totally the responsibility of the independent test team, which is only planning on performing system-level testing.
• The schedule does not contain adequate time for unit and integration testing. Note that this may really be an underemphasis on unit and integration testing rather than an overemphasis on system testing.
• Independent testers rather than developers are performing the testing.

Recommendations:
• Prepare:
 Adequately address in the test plans, test process documents, and contract:
 white-box and gray-box testing
 unit and integration testing
• Enable:
 Ensure that managers, developers, and/or testers understand the importance of these lower-level types of testing.
 Use a test plan template or content and format standard that addresses these lower-level types of testing.
• Perform:
 Increase the amount and effectiveness of these lower-level types of testing.
• Verify:
 Review the test plans and process documents to ensure that they adequately address


these lower-level types of tests.
 Verify that the managers, developers, and/or testers understand the importance of these lower-level types of testing.
 Have quality engineers verify that the testers are actually performing these lower-level types of testing and at an appropriate percentage of total tests.
 Measure the number of defects slipping past unit and integration testing.
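One reason lower-level testing complements black-box system testing is that boundary defects are far easier to reach directly in a unit test than through an elaborate end-to-end scenario. A hypothetical sketch (the discount function and its threshold are invented):

```python
# A minimal sketch of unit-level boundary testing: the threshold at 100.00
# is exercised directly, which a system-level test might never hit.
# The function under test is invented for illustration.

def discount(order_total):
    """Apply a 10% discount to orders of 100.00 or more."""
    if order_total >= 100.00:
        return round(order_total * 0.90, 2)
    return order_total

# Unit tests exercise the boundary and both sides of it directly:
assert discount(99.99) == 99.99     # just below the threshold: no discount
assert discount(100.00) == 90.00    # exactly at the threshold: discounted
assert discount(150.00) == 135.00   # well above the threshold
print("boundary cases pass")
```

A defect such as writing `>` instead of `>=` would be caught immediately here, but could survive system testing if no end-to-end scenario happens to produce an order totaling exactly 100.00.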

Related Problems: GEN-PRO-6 White-box Unit and Integration Testing Overemphasized

2.1.5.6 GEN-PRO-6 White-box Unit and Integration Testing Overemphasized

Description: There is an overemphasis on white-box unit and integration testing.

Potential Symptoms:
• The vast majority of testing is occurring at the unit and integration level.
• Very little time is being spent on black-box system testing to verify conformance to the requirements.
• People are stating that significant system testing is not necessary because “lower-level tests have already verified the system requirements” or “there is insufficient time left to perform significant system testing”.
• There is little or no testing of quality requirements (because the associated quality attributes are system-level characteristics).

Potential Consequences:
• The delivered system may fail to meet some of its system requirements, especially quality requirements and those functional requirements that require the collaboration of the integrated subsystems.

Potential Causes:
• The test plans and process documents do not adequately address black-box system testing.
• There are no process requirements (e.g., in the development contract) mandating black-box system/software testing.
• The developers believe that if the components work properly, then the system will work properly when they are integrated.
• No black-box testing metrics are being collected, analyzed, and reported.
• The schedule does not contain adequate time for black-box system testing (e.g., due to schedule slippages and a firm release date). Note that this may really be an underemphasis on black-box testing rather than an overemphasis on white-box unit and integration testing.
• Developers rather than independent testers are performing much/most of the testing.

Recommendations:
• Prepare:


 TBD
• Enable:
 TBD
• Perform:
 TBD
• Verify:
 Verify that TBD
• Increase the amount and effectiveness of system testing.
• Review the test plans and process documents to ensure that they adequately address black-box system testing.
• When appropriate, improve the test plans and process documents with regard to system testing.
• Measure, analyze, and report the number of requirements that have been verified by system testing.

Related Problems: GEN-PRO-5 Black-box System Testing Overemphasized

2.1.5.7 GEN-PRO-7 Too Immature for Testing

Description: Some of the products being tested are immature, containing too many defects.

Potential Symptoms:
• Large numbers of requirements, architecture, and design defects are being found that should have been discovered (during reviews) and fixed prior to the current testing.
• The product may be delivered for testing when it is not ready for testing because:
 Schedule pressures cause corners to be cut during earlier testing.
 Test readiness criteria do not exist or are not enforced.
 Management, customer/user representatives, and developers do not understand the impact on testing of immature products.

Potential Consequences:
• Testing may find many defects that should have been detected during previous levels of testing.
• Encapsulation due to integration may make it unnecessarily difficult to localize the defect that caused the test failure.
• Testing may not be completed on schedule.

Potential Causes:
• There is insufficient time for proper design and implementation prior to testing.
• There is insufficient staff for proper design and implementation.
• There are no completion criteria for design, implementation, and lower-level testing.
• Lower-level tests have not been properly performed.

Recommendations:
• Prepare:


 Set reasonable criteria for test readiness.
• Enable:
 Enforce the following of reasonable criteria for test readiness.
 TBD
• Perform:
 TBD
• Verify:
 Verify that TBD
• Increase the amount of earlier verification of the requirements, architecture, and design (e.g., with peer-level reviews and inspections).
• Improve the effectiveness of earlier disciplines and types of testing (e.g., by improving methods and providing training).
• Measure the number of defects slipping through multiple disciplines and types of testing (e.g., where the defect was introduced and where it was found).
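Test readiness criteria of the kind recommended above can be enforced mechanically as an entry gate into testing. The sketch below is a hypothetical illustration; the criteria names, thresholds, and build metrics are invented examples.

```python
# A minimal sketch of an enforced test readiness gate: a build enters
# system testing only if it meets explicit entry criteria.
# All criteria names and thresholds below are invented for illustration.

READINESS_CRITERIA = {
    "unit_test_pass_rate":  lambda v: v >= 0.98,  # lower-level tests done
    "open_blocker_defects": lambda v: v == 0,     # no known blockers
    "reviews_completed":    lambda v: v is True,  # design/code reviewed
}

def ready_for_test(build_metrics):
    """Return the list of criteria the build fails (empty means ready)."""
    return [name for name, ok in READINESS_CRITERIA.items()
            if not ok(build_metrics[name])]

build = {"unit_test_pass_rate": 0.93,
         "open_blocker_defects": 2,
         "reviews_completed": True}

failures = ready_for_test(build)
print("Not ready:", failures)  # fails the pass-rate and blocker criteria
```

Rejecting the build with an explicit list of failed criteria makes the readiness decision visible to management and developers, rather than leaving immature products to be discovered mid-test.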

Related Problems: GEN-TOP-2 Unclear Testing Responsibilities

2.1.5.8 GEN-PRO-8 Inadequate Test Evaluations

Description: The quality of the test assets is not being adequately evaluated prior to their use.

Potential Symptoms:
• Little or no [peer-level] inspections, walk-throughs, or reviews of the test assets (e.g., test inputs, preconditions, trigger events, expected test outputs, and postconditions) are being performed prior to actual testing.

Potential Consequences:
• Test plans, procedures, test cases, and other testing work products contain defects that could have been found during these evaluations.
• There will be an increase in false positive and false negative test results.
• Unnecessary effort will be wasted identifying and fixing problems.
• Some defects may not be found, and an unnecessary number of these defects may make it through testing and into the deployed system.

Potential Causes:
• Evaluating the test assets is not addressed in the:
  – Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP)
  – high-level overviews of testing in the System Engineering Master Plan (SEMP) and System/Software Development Plan (SDP)
  – Quality Engineering/Assurance plans
  – Master Project Schedule
  – Work Breakdown Structure (WBS)
• There is insufficient time and staff to evaluate the deliverable system/software and the test assets.


• The test assets are not deemed to be sufficiently important to evaluate.
• The developers believe that the test assets will automatically be verified during the actual testing.

Recommendations:
• Prepare:
  – Incorporate test evaluations into the:
    · system/software development plans
    · system/software test plans
    · project schedules (master and team)
    · project work breakdown structure (WBS)
  – Ensure that the following test assets are reviewed prior to actual testing: test inputs, preconditions (pre-test state), and the test oracle, including expected test outputs and postconditions.
• Enable:
  – To the extent practical, ensure that the test evaluation team includes other testers, requirements engineers, user representatives, subject matter experts, architects, and implementers.
• Perform:
  – Have the testers perform peer-level reviews of the testing work products.
  – Have quality engineering perform evaluations of the testing work products (quality control) and test process (quality assurance).
  – Have stakeholders in testing perform technical evaluations of the major testing work products.
  – Have the results of these technical evaluations presented at major project status meetings and major formal reviews.
• Verify:
  – Verify that these evaluations do in fact occur.
  – Verify that the results of these evaluations are reported to the proper stakeholders.
  – Verify that problems discovered are assigned for fixing and are in fact fixed.

Related Problems: GEN-TPS-2 Incomplete Test Planning, GEN-TOP-2 Unclear Testing Responsibilities

2.1.5.9 GEN-PRO-9 Inadequate Test Maintenance

Description: Testing assets are not being properly maintained.

Potential Symptoms:
• Testing assets (e.g., test software and documents such as test cases, test procedures, test drivers, test stubs, and tracings between requirements and tests30) may not be adequately updated and iterated as defects are found and the system/software is changed (e.g., due to refactoring, change requests, or the use of an agile – incremental and iterative – development cycle).

30 Although requirements traceability matrices can be used when the number of requirements and test cases is quite small, a requirements management, test management, or configuration management tool is usually needed to document and maintain tracings between requirements and tests.
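A traceability check of the kind such tools automate can be sketched in a few lines. This is a minimal illustration, not any particular tool's API; the requirement and test-case IDs are hypothetical.

```python
# Minimal sketch (hypothetical IDs): detect requirements with no tests and
# tests tracing to requirements that no longer exist -- the two gaps that
# appear when tracings are not maintained as the system changes.

def check_traceability(requirements, trace):
    """trace maps each test ID to the requirement IDs it verifies."""
    covered = {req for reqs in trace.values() for req in reqs}
    untested = sorted(set(requirements) - covered)   # no test coverage
    dangling = sorted(covered - set(requirements))   # stale tracings
    return untested, dangling

requirements = ["REQ-1", "REQ-2", "REQ-3"]
trace = {
    "TC-10": ["REQ-1"],
    "TC-11": ["REQ-1", "REQ-4"],   # REQ-4 was deleted: a dangling trace
}

untested, dangling = check_traceability(requirements, trace)
print(untested)   # -> ['REQ-2', 'REQ-3']
print(dangling)   # -> ['REQ-4']
```

Run after every requirements or test-suite change, such a check turns silent traceability decay into an immediate, reviewable report.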

Potential Consequences:
• Testing assets (e.g., automated regression tests) may no longer be consistent with the current requirements, architecture, design, and implementation.
• Test productivity may decrease as the number of false negative test results increases (i.e., as tests fail due to test defects).
• The amount of productive regression testing may decrease as effort is redirected to identifying and fixing test defects.

Potential Causes:
• Maintenance of the testing assets may not be an explicit part of the testing process.
• Maintenance of the testing assets may not be explicitly documented in the Test and Evaluation Master Plan (TEMP), the System/Software Test Plan (STP), or the testing sections of the System Engineering Master Plan (SEMP) and Software Development Plan (SDP).
• The test resources (e.g., schedule and staffing) provided by management may be insufficient to properly maintain the testing assets.
• The project master schedule may not have included (sufficient) time for test asset maintenance.
• Testing stakeholders may not understand the importance of maintaining the testing assets.
• There may be no (requirements management, test management, or configuration management) tool support for maintaining the tracing between requirements and tests.

Recommendations:
• Prepare:
  – Explicitly address the maintenance of the testing assets in the:
    · Test and Evaluation Master Plan (TEMP), the System/Software Test Plan (STP), or the testing sections of the System Engineering Master Plan (SEMP) and Software Development Plan (SDP)
    · testing process documents (e.g., procedures and guidelines)
    · project work breakdown structure (WBS)
  – Include adequate time for test asset maintenance in the project master schedule.
  – Clearly communicate the importance of maintaining the testing assets to the testing stakeholders.
  – Ensure that the maintenance testers are adequately trained and experienced.31

• Enable:
  – Provide sufficient test resources (e.g., schedule and staffing) to properly maintain the testing assets.
  – Provide tool support (e.g., via a requirements management, test management, or configuration management tool) for maintaining the tracing between requirements and tests.

31 This will help combat the loss of project expertise due to the fact that many or most of the testers who are members of the development staff tend to move on after delivery.

• Perform:
  – Keep the testing assets consistent with the current requirements, architecture, design, and implementation.32
  – Properly maintain the testing assets as defects are found and system changes are introduced.
• Verify:
  – Verify that the test plans address maintaining testing assets.
  – Verify that the project master schedule includes time for maintaining testing assets.
  – Verify that the testing assets are in fact being maintained (e.g., via quality assurance and quality control).

Related Problems: GEN-TPS-1 No Separate Test Plan, GEN-TPS-2 Incomplete Test Planning, GEN-TOP-2 Unclear Testing Responsibilities

2.1.6 Test Tools and Environments Problems

The following testing problems are related to the test tools and environments:
• GEN-TTE-1 Over-reliance on Manual Testing
• GEN-TTE-2 Over-reliance on Testing Tools
• GEN-TTE-3 Insufficient Test Environments
• GEN-TTE-4 Poor Fidelity of Test Environments
• GEN-TTE-5 Inadequate Test Environment Quality
• GEN-TTE-6 System/Software Under Test Behaves Differently
• GEN-TTE-7 Tests not Delivered
• GEN-TTE-8 Inadequate Test Configuration Management (CM)

2.1.6.1 GEN-TTE-1 Over-reliance on Manual Testing

Description: Testers are placing too much reliance on manual testing.

Potential Symptoms:
• All, or the majority of, testing is being performed manually without adequate support of test tools or test scripts.33

Potential Consequences:
• Testing will be very labor intensive.
• Any non-trivial amount of regression testing will likely be impractical.

32 While this is useful with regard to any product that undergoes multiple internal or external releases, it is especially a good idea when an agile (iterative and incremental) development cycle produces numerous short duration increments.

33 This may not be a problem if test automation is not practical for some reason (e.g., the quick and dirty testing of a UI-heavy rapid prototype that will not be maintained).


• Testing will likely be subject to significant human error, especially with regard to test inputs and the interpretation and recording of test outputs.

Potential Causes:
• Test automation is not addressed in the Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP).
• Test automation is not addressed in the testing parts of the System Engineering Master Plan (SEMP) and Software Development Plans (SDP).
• Test automation is not included in the project Work Breakdown Structure (WBS).
• Time for test automation is not included in the project master schedule and test team schedules.
• Testers do not have adequate training and experience in test automation.

Recommendations:
• Prepare:
  – Address test automation in the TEMP and/or STP.
  – Address test automation in the testing parts of the SEMP and/or SDP.
  – Address test automation in the project WBS.
  – Include time for test automation in the project master schedule and test team schedules.
  – Provide sufficient funding for the evaluation, selection, purchase, and maintenance of test tools.
  – Provide sufficient staff and funding for the automation of testing.
• Enable:
  – Evaluate, select, purchase, and maintain test tools.
  – Where needed, provide training in automated testing.
• Perform:
  – Limit manual testing to only the testing for which it is most appropriate.
  – Automate regression testing.
  – Maintain regression tests (e.g., scripts, inputs, expected outputs).
  – Use test tools and scripts to automate appropriate parts of the testing process (e.g., to ensure that testing provides adequate code coverage).
• Verify:
  – Evaluate the test planning documentation for the inclusion of automated testing.
  – Evaluate the schedules for inclusion of test automation.
  – Verify that sufficient tests are being automated and maintained.
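The "automate regression testing" recommendation can be as simple as a table-driven suite: stored inputs and expected outputs replayed without manual steps. The sketch below is illustrative only; the function under test (format_price) and its cases are hypothetical, and a real project would use its test framework of choice.

```python
# Minimal sketch of a maintained, table-driven regression suite: the system
# function under test (format_price, hypothetical) is exercised against
# stored inputs and expected outputs, so reruns need no manual effort.

def format_price(cents):
    """Example function under test: format a cent amount as dollars."""
    return f"${cents // 100}.{cents % 100:02d}"

# Regression table: kept under configuration control alongside the code
# and updated whenever requirements change.
REGRESSION_CASES = [
    (0, "$0.00"),
    (5, "$0.05"),
    (1999, "$19.99"),
]

def run_regression():
    """Return the list of failing cases as (input, expected, actual)."""
    return [(inp, exp, format_price(inp))
            for inp, exp in REGRESSION_CASES
            if format_price(inp) != exp]

print(run_regression())  # an empty list means every regression case passes
```

Because the cases live in data rather than in a tester's memory, the same suite runs identically during initial development and years later during maintenance.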

Related Problems: GEN-TTE-2 Over-reliance on Testing Tools, TTS-REG-1 Insufficient Regression Test Automation

2.1.6.2 GEN-TTE-2 Over-reliance on Testing Tools

Description: Testers and other testing stakeholders are placing too much reliance on (COTS and home-grown) testing tools.34


Potential Symptoms:
• Testers and other testing stakeholders are relying on testing tools to do far more than merely generate sufficient white-box test cases to ensure code coverage.
• Testers are relying on the tools to automate test case creation, including test case selection and completion (“coverage”) criteria.
• Testers are relying on the test tools as their test oracle (to determine the expected – correct – test result).
• Testers let the tool drive the test methodology rather than the other way around.

Potential Consequences:
• Testing may emphasize white-box (design-driven) testing and may include inadequate black-box (requirements-driven) testing.
• Many design defects may not be found during testing and thus remain in the delivered system.

Potential Causes:
• The tool vendor’s marketing information may:
  – be overly optimistic (e.g., promise that it covers everything)
  – equate the tool with the method (so that no additional methodology needs to be addressed)
• Management may equate the test tools with the testing method.
• The testers may be sufficiently inexperienced in testing to not recognize what the tool does not cover.

Recommendations:
• Prepare:
  – Ensure that manual testing, including its scope (when and for what), is documented in the test plans and test process documents.
• Enable:
  – Provide sufficient resources to perform the tests that should or must be performed manually.
  – Ensure that testers understand (e.g., via training and test planning) the limits of testing tools and the automation of test case creation.
• Perform:
  – Let the test methodology drive tool selection.
  – Ensure that testers (when appropriate) use the requirements, architecture, and design as the test oracle (to determine the correct test result).
• Verify:
  – Verify that the testers are not relying 100% on test tools to automate test case selection and set the test completion (“coverage”) criteria.

Related Problems: GEN-TTE-1 Over-reliance on Manual Testing

34 This problem does not imply that regression testing need not be automated (it should be, to the extent practical), but rather that the tester needs to use the requirements, architecture, and design as the oracle rather than the tool and the code.
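The difference between a tool-derived oracle and a requirements-derived oracle can be shown concretely. In this hedged sketch the requirement, the function names, and the bug are all hypothetical: the oracle computes the expected result directly from the stated rule instead of capturing whatever the code under test happens to emit.

```python
# Sketch: derive the expected result from the requirement, not from the
# tool/code. Hypothetical requirement: "tax is 8% of the amount, rounded
# to the nearest cent." Capturing the code's own output as the oracle
# would silently bless the truncation bug below.

from decimal import Decimal, ROUND_HALF_UP

def tax_under_test(amount_cents):
    """Implementation under test (deliberately buggy: it truncates)."""
    return amount_cents * 8 // 100

def requirements_oracle(amount_cents):
    """Expected result computed straight from the stated requirement."""
    tax = Decimal(amount_cents) * Decimal("0.08")
    return int(tax.quantize(Decimal("1"), rounding=ROUND_HALF_UP))

failures = [n for n in (100, 1050, 1199)
            if tax_under_test(n) != requirements_oracle(n)]
print(failures)  # -> [1199]: truncation disagrees with required rounding
```

A tool that records the implementation's output as the "expected" value would report all three cases as passing; the independent oracle exposes the defect.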


2.1.6.3 GEN-TTE-3 Insufficient Test Environments

Description: There are too few test environments.

Potential Symptoms:
• The types of test environments needed are not [adequately or completely] addressed in the Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP).
• The types of test environments needed are not [adequately or completely] addressed in the testing parts of the System Engineering Master Plan (SEMP) and Software Development Plans (SDP).
• There are not sufficient test environments of one or more types (listed in order of increasing fidelity), such as:
  – software-only test environments hosted on a basic general-purpose platform such as desktop and laptop computers
  – software-only test environments on the appropriate computational environment (e.g., correct processors, busses, operating system, middleware, databases)
  – software with prototype hardware (e.g., sensors and actuators)
  – software with an early/previous version of the actual hardware
  – an initial integrated test system with partial functionality (e.g., an initial test aircraft for ground testing or flight testing of existing functionality)
  – an integrated test system with full functionality (e.g., an operational aircraft for flight testing or operational evaluation testing)
• There is an excessive amount of competition among the integration testers and other testers for time on the test environments.

Potential Consequences:
• It may be difficult to optimally schedule the allocation of test teams to test environments, resulting in scheduling conflicts.
• Too much time may be wasted reconfiguring the test environments for the next team’s use. Testing may not be completed on schedule.
• Certain types of testing may not be able to be performed.
• Defects that should be found during testing on earlier test environments may not be found until later test environments, when it becomes harder to cause and reproduce test failures and localize defects.

Potential Causes:
• Lack of adequate planning (e.g., test environments missing from test plans)
• Lack of experience with the different types of test environments
• Lack of funding for the creation, testing, and maintenance of test environments
• Underestimation of the amount of testing to be performed
• Lack of adequate hardware for test environments, for example because hardware is needed for (1) initial prototype systems or (2) initial systems during Low Rate of Initial Production (LRIP)


Recommendations:
• Prepare:
  – Ensure that the test team, and especially the test managers, understand the different types of test environments, their uses, their costs, and their benefits.
  – Determine/estimate the amount and types of tests needed.
  – Determine the number of testers needed.
  – Determine the test environment requirements in terms of the types of test environments and the numbers of each type.
• Enable:
  – Address all of the types of test environments needed in the Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP).
  – Address (e.g., list) the types of test environments needed in the testing parts of the System Engineering Master Plan (SEMP) and Software Development Plans (SDP).
  – Include the development of the test environments in the project master schedule (and the schedules of individual test teams), and ensure that it is consistent with the test schedule.
  – Include the development and maintenance of the test environments in the project Work Breakdown Structure (WBS).
  – Ensure that the necessary software (e.g., test tools) is available when needed.
  – Ensure that sufficient hardware/systems to build/support the test environments are available when needed.
  – Do not transfer the hardware/systems needed for the test environments to other uses, leaving insufficient hardware for the test environments.
  – Create and use a process for scheduling the test teams’ use of the test environments.
• Perform:
  – Develop all of the needed test environments.
  – Maintain all of the needed test environments.
• Verify:
  – Verify that the development of the test environments is properly addressed in the TEMPs, STPs, schedules, and WBS.
  – Verify that sufficient test environments are available and reliable (e.g., via testing metrics).

Related Problems: GEN-TTE-5 Inadequate Test Environment Quality, GEN-TTE-8 Inadequate Test Configuration Management (CM)

2.1.6.4 GEN-TTE-4 Poor Fidelity of Test Environments

Description: Testing is problematic due to the test environment having poor fidelity related to the operational system/software.

Potential Symptoms:
• Testing is being performed using test environments incorporating:
  – a different computing platform (or a different version of it) than that used by the delivered software:
    · compiler or programming language
    · class library
    · operating system, middleware, or database(s)
    · network software
  – different computer or system hardware (or a different version of it):
    · processor(s), memory, motherboard, or graphics card
    · network devices (e.g., routers and firewalls)
    · sensors, actuators, etc.
  – software that poorly simulates hardware (e.g., stimulators/drivers and stubs)
• A significant number of tests fail due to the low fidelity of the testing environment.

Potential Consequences:
• Testing will experience many false negatives. It will be more difficult to localize and fix defects.
• Test cases will need to be repeated when the fidelity problems are solved by:
  – fixing defects in the test environment
  – using a different test environment that better conforms to the operational system and its environment (e.g., by replacing software simulation with hardware or by replacing prototype hardware with actual operational hardware)

Potential Causes:
• Poor configuration management of the hardware or software.
• Lack of availability of the correct version of the hardware or software.

Recommendations:
• Prepare:
  – Include how the testers are going to address test environment fidelity in the test plans.
• Enable:
  – Provide good configuration management of the components under test and the test environments.
  – Provide tools to evaluate the fidelity of the test environment’s behavior.
  – Provide the test labs with sufficient numbers of prototype and Low Rate of Initial Production (LRIP) system components (subsystems, software, and hardware).
• Perform:
  – To the extent practical, use the same versions of development tools and system components during testing as during operation.
• Verify:
  – Verify, to the extent practical, that test stimulators and stubs, when used, have the same characteristics as the eventual components they replace during testing.

Related Problems: GEN-TTE-5 Inadequate Test Environment Quality, GEN-TTE-6 Software Under Test Behaves Differently, TTS-INT-2 Unavailable Components
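One lightweight way to act on the "use the same versions during testing as during operation" recommendation is to diff a manifest of the test environment against the operational one before trusting test results. The sketch below is a hypothetical illustration; the component names and version strings are invented.

```python
# Sketch (hypothetical components/versions): compare a test environment's
# configuration manifest against the operational one and report fidelity
# mismatches -- missing components or version drift -- before testing.

def fidelity_mismatches(operational, test_env):
    """Return components that are missing or at a different version,
    mapped to (operational_version, test_env_version_or_None)."""
    issues = {}
    for component, version in operational.items():
        found = test_env.get(component)
        if found != version:
            issues[component] = (version, found)
    return issues

operational = {"os": "RHEL 7.4", "db": "Oracle 12c", "middleware": "MQ 8.0"}
test_env = {"os": "RHEL 7.2", "db": "Oracle 12c"}  # drifted and incomplete

print(fidelity_mismatches(operational, test_env))
# -> {'os': ('RHEL 7.4', 'RHEL 7.2'), 'middleware': ('MQ 8.0', None)}
```

An empty result does not prove fidelity (it only checks what the manifests record), but a non-empty one is a cheap early warning that test failures may be environment artifacts.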


2.1.6.5 GEN-TTE-5 Inadequate Test Environment Quality

Description: The quality of the test environments is inadequate.

Potential Symptoms:
• One or more test environments contain an excessive number of defects.35

Potential Consequences:
• There may be numerous false negative test results.36
• It will be more difficult to determine whether test failures are due to the system/software under test or the test environments.
• Testing will take a needlessly long time to perform.
• The system may be delivered late and with an unacceptably large number of residual defects.

Potential Causes:
• TBD

Recommendations:
• Prepare:
  – TBD
• Enable:
  – TBD
• Perform:
  – TBD
• Verify:
  – Verify that TBD
• Ensure that the quality of the test environment is as good as that of the system/software under test, especially when testing mission-, safety-, or security-critical software.
• Ensure that the test environments are of sufficient quality (e.g., via good development practices, adequate testing, and careful tool selection).

Related Problems: GEN-TTE-4 Poor Fidelity of Test Environment, GEN-TTE-6 System/Software Under Test Behaves Differently

2.1.6.6 GEN-TTE-6 System/Software Under Test Behaves Differently

Description: The system or software under test (SUT) and the operational software behave differently.

Potential Symptoms:
• A fault or failure that occurs during testing is not repeatable during normal operation.
• The SUT that behaved correctly during testing causes a fault or failure during operation.

35 This is primarily a problem with test environments and their components that are developed in-house.
36 A false negative test result is a test indicating that the system/software under test fails when it is actually the test environment that failed to perform properly.


• The SUT contains test software that is removed (e.g., physically or via compiler switch) before the system is placed in operation.

Potential Consequences:
• Extra time is spent localizing the defect.
• Correct behavior due to the existence of integrated test software leads to a false sense of security.

Potential Causes:
• TBD

Recommendations:
• Prepare:
  – TBD
• Enable:
  – TBD
• Perform:
  – TBD
• Verify:
  – Verify that TBD
• Perform black-box regression testing after removing the test software.
• Consider incorporating the test software as deliverable built-in-test (BIT) software.37

Related Problems: GEN-TTE-4 Poor Fidelity of Test Environments, GEN-TTE-5 Inadequate Test Environment Quality

2.1.6.7 GEN-TTE-7 Tests not Delivered

Description: Test assets are not being delivered along with the system / software.

Potential Symptoms:
• The delivery of test assets (e.g., test cases, test oracles, test drivers/scripts, test stubs, and test environments) is neither required nor planned.
• Test assets are not delivered along with the system/software.

Potential Consequences:
• It may be unnecessarily difficult to perform testing during maintenance.
• There may be inadequate regression testing as the delivered system/software is updated.
• Some post-delivery testing may not be performed, so some post-delivery defects may not be found and fixed.

Potential Causes:
• The delivery of test assets is not an explicit part of the testing process.

37 Note that it may not be practical (e.g., for performance reasons or code size) or permitted (e.g., for safety or security reasons) to deliver the system with embedded test software. For example, embedded test software could provide an attacker with a back door capability.


• The delivery of test assets is not mentioned in the System Engineering Management Plan (SEMP), System/Software Development Plan (SDP), or Test and Evaluation Master Plan (TEMP).

Recommendations:
• Prepare:
  – TBD
• Enable:
  – TBD
• Perform:
  – TBD
• Verify:
  – Verify that TBD
• Ensure that the migration to maintenance section of the system development contract or the associated list of deliverables includes the delivery of all test assets needed to perform testing after delivery.

Related Problems: GEN-TPS-2 Incomplete Test Planning, GEN-TOP-2 Unclear Testing Responsibilities

2.1.6.8 GEN-TTE-8 Inadequate Test Configuration Management (CM)

Description: Testing work products are not under configuration control.

Potential Symptoms:
• Test environments, test cases, and other testing work products are not under configuration control.
• Inconsistencies are found between the current versions of the system/software under test and the test cases and test environments.

Potential Consequences:
• Test environments, test cases, and other testing work products may cease to be consistent with the system/software being tested and with each other.38
• It may be impossible to reproduce tests (i.e., to get the same test results given the same preconditions and test stimuli).
• It may be much more difficult to know that the correct versions of the system, test environment, and tests are being used when testing.

• There may be an increase in false positive and false negative test results.
• False positive test results due to incorrect version control may lead to incorrect fixes and the resulting insertion of defects into the system/software.
• Unnecessary effort may be wasted identifying and fixing CM problems.
• Some defects may not be found, and an unnecessary number of these defects may make it through testing and into the deployed system.

38 Note that a closely related problem would be that the subsystem/software under test (SUT) is not under configuration control. Incompatibilities will also occur if the SUT is informally updated with undocumented and uncontrolled “fixes” without the test team being aware.

Potential Causes:
• Placing the test assets under configuration management is not an explicit part of either the configuration management or the testing process.
• The CM of test assets is not mentioned in the System Engineering Management Plan (SEMP), System/Software Development Plan (SDP), Test and Evaluation Master Plan (TEMP), or System/Software Test Plan (STP).

Recommendations:
• Prepare:
  – TBD
• Enable:
  – TBD
• Perform:
  – TBD
• Verify:
  – Verify that TBD
• Ensure that all test plans, procedures, test cases, test environments, and other testing work products are placed under configuration control before they are used.
• Ensure that the right versions of test environments are restored to the correct known state prior to each new testing cycle.
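One concrete way to catch uncontrolled changes to test assets is to record checksums when the assets are baselined and verify them before each test cycle. This is a minimal sketch under stated assumptions: the file names and contents are hypothetical, and a real project would rely on its CM tool rather than hand-rolled checks.

```python
# Sketch: verify that test assets still match their configuration-
# controlled baseline (recorded SHA-256 checksums) before a test cycle,
# so silent edits to test cases or scripts are caught early.

import hashlib

def checksum(data: bytes) -> str:
    """Return the SHA-256 hex digest of an asset's content."""
    return hashlib.sha256(data).hexdigest()

def verify_baseline(assets: dict, baseline: dict) -> list:
    """Return names of assets whose content differs from the baseline."""
    return sorted(name for name, data in assets.items()
                  if checksum(data) != baseline.get(name))

# Baseline recorded when the assets were placed under configuration control.
assets = {"test_case_001.txt": b"inputs...", "driver.py": b"print('run')"}
baseline = {name: checksum(data) for name, data in assets.items()}

assets["driver.py"] = b"print('edited')"   # an uncontrolled change
print(verify_baseline(assets, baseline))   # -> ['driver.py']
```

A non-empty result means the test environment is no longer in its known baselined state and must be restored before results can be trusted.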

Related Problems: None

2.1.7 Test Communication Problems

The following testing problems are related to communication, including documentation and reporting:
• GEN-COM-1 Inadequate Defect Reports
• GEN-COM-2 Inadequate Test Documentation
• GEN-COM-3 Source Documents Not Maintained
• GEN-COM-4 Inadequate Communication Concerning Testing

2.1.7.1 GEN-COM-1 Inadequate Defect Reports

Description: Some defect (a.k.a. bug and trouble) reports are incomplete or contain incorrect information.39

Potential Symptoms:
• Some defect reports are incomplete (e.g., do not contain some of the following information):
  – Summary – a one-sentence summary of the fault/failure
  – Detailed Description – a relatively comprehensive detailed description of the failure
  – Author – the name and contact information of the person reporting the defect
  – System – the version (build) or variant of the system
  – Environment – the software infrastructure, such as OS, middleware, and database types and versions
  – Location – the author’s assessment of the subsystem or module that contains the defect that caused the fault/failure
  – Priority and Severity – the author’s assessment of the priority and severity of the defect
  – Steps – the steps to be followed to replicate the fault/failure (if reproducible by the report’s author), including:
    · preconditions (e.g., system mode or state and stored data values)
    · trigger events, including input data
  – Actual Behavior – including fault/failure warnings, cautions, or advisories
  – Expected Behavior
  – Comments
  – Attachments (e.g., screen shots or logs)
• Some defect reports contain incorrect information.
• Defect reports are returned with comments such as “not clear” and “need more information”.
• Developers/testers contact the defect report author for more information.
• Different individuals or teams use different defect report templates, content/format standards, or test management tools.40

39 This is especially a problem when the fault/failure is intermittent and inherently difficult to reproduce.

Potential Consequences:
• Testers will be unable to reproduce the faults/failures and thereby identify the underlying defects.
• It will take longer for developers/testers to identify and diagnose the underlying defects.

Potential Causes:
• TBD

Recommendations:
• Prepare:
  – TBD
• Enable:
  – TBD
• Perform:
  – TBD
• Verify:
  – Verify that TBD
• Use templates and standards to specify the content and format of the defect reports. Use a test management tool to enter and manage the defect reports.
• To the extent practical,41 ensure that all defect reports are reviewed for completeness, duplication, and scope by the test manager or the change control board (CCB) before being assigned to:42
  – individual testers for testing and analysis (for defect reports not authored by testers)
  – developers for analysis and fixing (for defect/test reports authored by testers)

40 This is especially likely when a prime contractor/system integrator and subcontractors are involved in development.

Related Problems: None

2.1.7.2 GEN-COM-2 Inadequate Test Documentation

Description: Some test documents are incomplete or contain incorrect information.

Potential Symptoms:
• Some test documentation is inadequate for defect identification and analysis, regression testing, test automation, reuse, and quality assurance of the testing process.43
• Some test documentation templates or format/content standards are either missing or incomplete.
• Test scripts/cases do not completely describe test preconditions, test trigger events, test input data, expected/mandatory test outputs (data and commands), and expected/mandatory test post-conditions.
• An agile approach is being used by developers with little testing expertise and experience.
• Testing documents are not maintained or placed under configuration management.

Potential Consequences:
• Testing assets (e.g., test documents, environments, and test cases) may not be sufficiently documented to be used by:
 – testers to drive test automation
 – testers to perform regression testing, either during initial development or during maintenance
 – quality assurance personnel and customer representatives during evaluation and oversight of the testing process
 – testers other than the original test developer (e.g., those performing integration, system, system of systems, and maintenance testing)
 – test teams from other projects developing/maintaining related systems within a product family or product line
• Tests may not be reproducible. It may take longer to identify and fix some of the underlying defects, thereby causing some test deadlines to be missed.
• Maintenance costs may be needlessly high. Insufficient regression testing may be performed.

41 It is critical to ensure that this review not become a bottleneck.
42 This review should not happen more than once, and the correct time to perform this review may well depend on who authored the defect report and on the defect resolution process.
43 This is often caused by managers attempting to decrease the testing effort and thereby meet schedule deadlines, or by processes developed by people who do not have adequate testing training and experience.


• The reuse of testing assets may be needlessly low, thereby unacceptably increasing the costs, schedule, and effort that will be spent recreating testing assets.

Potential Causes:
• The content (and format) of test documents may not be an explicit part of the testing process and thus not addressed in document templates or content/format standards.
• Testers may not appreciate the need for good test documentation, which tends to occur if an agile development method is used (e.g., because of the emphasis on minimizing documentation and on having each development team determine its own documentation needs on a case-by-case basis).

Recommendations:
• Use the contract, test plans, test training, test process documents, and test standards to specify the required test documents and ensure that test work products are adequately documented.
• Ensure that test cases completely describe test preconditions, test trigger events, test input data, mandatory/expected test outputs (data and commands), and mandatory/expected system post-conditions.
• When using an iterative, incremental, and parallel – agile – development cycle in which components under test will frequently change, concentrate on making their associated executable testing work products self-documenting (rather than using separate testing documentation) so that the components and their testing work products are more likely to be changed together and thereby remain consistent.
• Use common standard templates for test documents (e.g., test plans, test cases, test procedures, and test reports).
• Use a test documentation tool or database to record test reports.
• When using a database to store test results, make sure that its schema supports easy searches.
• Clearly identify the versions of the software, test environment, test cases, etc. to ensure consistency.
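A self-documenting test of the kind recommended above might look like the following sketch; the `Account` class and its `withdraw` method are hypothetical stand-ins for a component under test, not anything from this report:

```python
# Sketch of a self-documenting test case whose comments record the precondition,
# trigger event, test input, expected output, and expected postcondition.
# The Account class is a hypothetical stand-in for the component under test.

class Account:
    """Minimal stand-in for the component under test."""
    def __init__(self, balance):
        self.balance = balance

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount
        return self.balance


def test_withdraw_reduces_balance():
    # Precondition: an account in a known state (balance = 100).
    account = Account(balance=100)
    # Trigger event / test input: a withdrawal of 30.
    result = account.withdraw(30)
    # Expected output: the new balance is returned.
    assert result == 70
    # Expected postcondition: the stored balance reflects the withdrawal.
    assert account.balance == 70
```

Because the test names its precondition, trigger, input, expected output, and postcondition inline, it can evolve together with the component, as the agile-oriented recommendation suggests.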

Related Problems: GEN-TOP-2 Unclear Testing Responsibilities, GEN-TTE-8 Inadequate Test Configuration Management, GEN-PRO-9 Inadequate Test Maintenance

2.1.7.3 GEN-COM-3 Source Documents Not Maintained

Description:


• The requirements specifications, architecture documents, design documents, and other developmental documents that are needed as inputs to the development of tests (e.g., needed to determine test inputs, preconditions, steps, expected outputs, expected postconditions) are not maintained.

Potential Symptoms:
• The requirements specifications, architecture documents, design documents, and developmental documents that are needed to drive the development of tests are obsolete.
• The test drivers and code get out of sync with each other.
• The testers are unaware that their tests have been made obsolete by undocumented changes to the requirements, architecture, design, and implementation.

Potential Consequences:
• The regression tests44 run during maintenance begin to produce large numbers of false negative results.
• The effort spent running these tests is wasted.
• Testing takes longer because the testers must determine the true current state of the requirements, architecture, design, and implementation.

Potential Causes:
• TBD

Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Ensure that the requirements specifications, architecture documents, design documents, and other developmental documents that are needed as inputs to the development of tests are properly maintained.45
• Ensure that testers are notified when these changes occur.
• Testers should report the occurrence of this problem to project management, including the test manager, the project manager, and the technical leader.

Related Problems: GEN-TTE-8 Inadequate Test Configuration Management (CM), TTS-UNT-1 Unstable Design

44 Although this is primarily a problem with regression tests that are made obsolete by changes to the requirements, architecture, design, and software, it can also be a problem when developing new tests for new capabilities if those capabilities are not documented properly.

45 As stated in GEN-TTE-8 Inadequate Test Configuration Management (CM), these test documents also need to be placed under configuration control.


2.1.7.4 GEN-COM-4 Inadequate Communication Concerning Testing

Description: There is inadequate communication concerning testing among testers and other testing stakeholders.

Potential Symptoms:
• There is inadequate testing-related communication between:
 – Teams within large or geographically-distributed programs
 – Contractually separated teams (prime contractor vs. subcontractor, system of systems)
 – Testers and:
  · Other developers (requirements engineers, architects, designers, and implementers)
  · Other testers
  · Customer representatives, user representatives, and subject matter experts (SMEs)
• For example, the developers fail to notify the testers of bug fixes and their consequences.
• Testers fail to notify other testers of test environment changes (e.g., configurations and uses of different versions of hardware and software).

Potential Consequences:
• Some of the requirements may not be testable.
• Some architectural decisions may make certain types of testing more difficult or impossible.
• Safety and security concerns may not influence the level of testing of safety- and security-critical functionality.
• Different test teams may have difficulty coordinating their testing and scheduling their use of common test environments.

Potential Causes:
• TBD

Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Ensure that there is sufficient testing-related communication between and among the testers and the stakeholders in testing.

Related Problems: GEN-PRO-1 Testing Process Not Integrated Into Engineering Process


2.1.8 Requirements-related Testing Problems

Good requirements are complete, consistent, correct, feasible, mandatory,46 testable, and unambiguous.47 Requirements that are deficient in any of these criteria decrease the testability of systems and software. Given poor requirements, black-box testing is relatively inefficient and ineffective, so testers may rely on higher-risk strategies including white-box testing (e.g., structural testing such as path testing for code coverage).48

The following testing problems are directly related to requirements:49

• GEN-REQ-1 Ambiguous Requirements
• GEN-REQ-2 Missing Requirements
• GEN-REQ-3 Incomplete Requirements
• GEN-REQ-4 Incorrect Requirements
• GEN-REQ-5 Unstable Requirements
• GEN-REQ-6 Poor Derived Requirements
• GEN-REQ-7 Verification Methods Not Specified
• GEN-REQ-8 Lack of Requirements Tracing

2.1.8.1 GEN-REQ-1 Ambiguous Requirements

Description: Testing fails to expose certain defects because some of the requirements are ambiguous.

Potential Symptoms:
• Some of the requirements are ambiguous due to the use of:
 – inherently ambiguous words
 – undefined technical jargon (e.g., application domain-specific terminology) and acronyms
 – misuses of contractual words such as "shall", "should", "may", "recommended", and "optional"
 – required quantities without associated units of measure
 – unclear synonyms, near synonyms, and false synonyms
• Inconsistencies result when requirements engineers and testers interpret the same requirement differently.

Potential Consequences:
• Testers may misinterpret the requirements, leading to incorrect black-box testing.

46 They are truly needed and are not unnecessary architectural or design constraints.
47 There are actually quite a few characteristics that good requirements should exhibit. These are merely some of the more important ones.
48 At least, this will help to get the system to where it will run without crashing and thereby provide a stable system that can be modified when the customer finally determines what the true requirements are.
49 While the first five problem types below are violations of the characteristics of good requirements, there are many other such characteristics that are not listed below. This inconsistency is because the first five below tend to cause the most frequent and severe testing problems.


• Numerous false positive and false negative test results are observed because the tests were developed in accordance with the testers’, rather than the requirements engineers’, interpretation of the associated requirements.

• Specifically, ambiguous requirements will often give rise to incorrect test inputs and incorrect expected outputs (i.e., the test oracle is incorrect).

• Testers may have to spend significant time meeting with requirements engineers, customer/user representatives, and subject matter experts to clarify ambiguities so that testing can proceed.

Potential Causes:
• The people (e.g., requirements engineers and business analysts) who are engineering the requirements have not been adequately trained in how to recognize and avoid ambiguous requirements.
• The requirements team does not include anyone with testing expertise.
• The requirements have not been reviewed for ambiguity. The testers have not reviewed the requirements for ambiguity.
• The requirements reviewers are not using a requirements review checklist, or the checklist does not address ambiguous requirements.
• The textual requirements have not been analyzed by a tool for inherently ambiguous words.
• The requirements engineers are rushed because insufficient resources (e.g., time and staffing) have been allocated to properly engineer the requirements.

Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Improve the requirements engineering process and associated training.
• Consider adding a senior test engineer to the requirements engineering team to ensure that the requirements are unambiguous.50
• Promote testability by ensuring that requirements are clear and unambiguous.
• Require that one or more testers review the requirements documents and each requirement for verifiability (esp. testability) before it is approved for use.
• Encourage testers to request clarification for all ambiguous requirements, and encourage that the requirements be updated based on the clarification given.
• Verify that the requirements do not include words that are inherently ambiguous, undefined technical terms and acronyms, quantities without associated units of measure, or unclear synonyms.
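The cause noted earlier – that textual requirements have not been analyzed by a tool for inherently ambiguous words – can be sketched as a small checker; the word list is a tiny illustrative sample, not an authoritative lexicon, and the requirement texts are invented:

```python
# Sketch of a checker that flags inherently ambiguous words in requirements.
# The word list below is a small illustrative sample, not a complete lexicon.
AMBIGUOUS_WORDS = {
    "appropriate", "adequate", "user-friendly", "fast", "flexible",
    "as required", "if necessary", "and/or",
}

def flag_ambiguities(requirements):
    """Return {requirement_id: [ambiguous words found]} for offending requirements."""
    findings = {}
    for req_id, text in requirements.items():
        lowered = text.lower()
        hits = sorted(w for w in AMBIGUOUS_WORDS if w in lowered)
        if hits:
            findings[req_id] = hits
    return findings

# Illustrative usage with invented requirement IDs and texts:
reqs = {
    "REQ-001": "The system shall respond to user queries within 2 seconds.",
    "REQ-002": "The system shall provide an appropriate and user-friendly display.",
}
print(flag_ambiguities(reqs))  # only REQ-002 is flagged
```

A real tool would also handle word boundaries, inflections, and project-specific jargon; the point is only that such a scan is cheap to automate and fits the review checklist recommended above.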

50 Because testers often begin outlining black-box test cases during requirements engineering based on initial requirements, they are in a good position to identify requirements ambiguity.


• Ensure that (1) the project has both a glossary and acronym list and (2) the requirements include technical jargon and acronyms only if they are defined therein.

Related Problems: GEN-REQ-3 Incomplete Requirements, GEN-COM-4 Inadequate Communication Concerning Testing, TTS-SoS-2 Poor or Missing SoS Requirements

2.1.8.2 GEN-REQ-2 Missing Requirements

Description: Testing fails to expose certain defects because some of the requirements are missing.

Potential Symptoms:
• Some requirements are missing, such as:
 – Requirements specifying mandatory responses to abnormal conditions (e.g., error, fault, and failure detection and reaction)51
 – Quality requirements (e.g., availability, interoperability, maintainability, performance, portability, reliability, robustness, safety, security, and usability)
 – Data requirements
 – Requirements specifying system behavior during non-operational modes (e.g., start-up, degraded mode, training, and shut-down)

Potential Consequences:
• Tests cannot be developed for missing requirements.
• Requirements-based testing will not reveal missing behavior and characteristics.
• Customer representatives and developers will have a false sense of security that the system will function properly on delivery and deployment.
• Testers may have to spend a sizable amount of time meeting with requirements engineers and customer/user representatives in order to identify missing requirements whose existence is implied by failed tests.
• Defects associated with missing requirements may not be found and therefore make it through testing and into the deployed system.

Potential Causes:
• Use cases only define normal paths (a.k.a. sunny day, happy path, and golden path) and not fault-tolerant and failure paths (a.k.a. rainy day paths or alternative flows).52
• The stakeholders have not reviewed the set of requirements for missing requirements.
• The requirements have not been reviewed to ensure that they contain robustness requirements that mandate the detection of and proper reaction to input errors, system faults (e.g., incorrect system-internal modes, states, or data), and system failures.
• The requirements engineers are rushed because insufficient resources (e.g., time and staffing) have been allocated to properly engineer the requirements.

51 These requirements are often critical for achieving adequate reliability, robustness, and safety. While requirements often specify how the system should behave, they rarely specify how the system should behave in those cases where it does not or cannot behave as it should. It is equally critical that testing verify that the system does not do what it should not do (i.e., that the system meets its negative as well as its positive requirements).
52 This is not to imply that the testing of normal paths is not necessary. However, it is often incorrectly assumed that it is sufficient. Software can misbehave in many more ways than it can work properly. Defects are also more likely to be triggered by boundary conditions or rainy day paths than by sunny day paths. Thus, there should typically be more boundary and invalid condition test cases than normal behavior test cases.

Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Improve the requirements engineering process and associated training.
• Consider adding a tester to the requirements engineering team to ensure that the requirements specify rainy day situations that must be addressed to achieve error, fault, and failure tolerance.
• Promote testability by ensuring that use case analysis adequately addresses error, fault, and failure (i.e., rainy day) tolerant paths as well as normal (sunny day or golden) paths.
• Ensure that the requirements repository includes an appropriate number of the quality and data requirements.
• Ensure that one or more requirements stakeholders (e.g., customer representatives, user representatives, subject matter experts) review the requirements documents and requirements repository contents for missing requirements before they are accepted and approved for use.
• Ensure that higher-level requirements are traced to lower-level (derived) requirements so that it is possible to verify that the lower-level requirements, if met, are sufficient to meet the higher-level requirements.

Related Problems: GEN-COM-4 Inadequate Communication Concerning Testing, TTS-SoS-2 Poor or Missing SoS Requirements

2.1.8.3 GEN-REQ-3 Incomplete Requirements

Description: Testing fails to expose certain defects because some of the individual requirements are incomplete.

Potential Symptoms:
• Individual requirements are incomplete and lack (where appropriate) some of the following components:53
 – trigger events
 – preconditions
 – mandatory quantitative thresholds
 – mandatory postconditions
 – mandatory outputs

53 Not all of these components are needed for each requirement. However, stakeholders and requirements engineers often assume them to be implicitly part of the requirements and thus unnecessary to state explicitly. Tests that ignore these missing parts of incomplete requirements can easily yield incorrect results.

Potential Consequences:
• Testing will be incomplete or may return incorrect (i.e., false negative and false positive) results.
• Some defects associated with incomplete requirements may not be found and therefore make it through testing and into the deployed system.

Potential Causes:
• The people (e.g., requirements engineers and business analysts) who are engineering the requirements have not been adequately trained in how to recognize and avoid incomplete requirements.
• The requirements team does not include anyone with testing expertise.
• The individual requirements have not been reviewed for completeness. The testers have not reviewed each requirement for completeness.
• The requirements reviewers are not using a requirements review checklist, or the checklist does not address incomplete requirements.
• The requirements engineers are rushed because insufficient resources (e.g., time and staffing) have been allocated to properly engineer the requirements.

Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Improve the requirements engineering process and associated training.
• Consider adding a tester to the requirements engineering team to ensure that the requirements are sufficiently complete to enable testers to develop test inputs and determine correct associated outputs.
• Ensure that the individual requirements are complete (e.g., via templates, guidelines, and inspections).
• Ensure that the testers and one or more requirements stakeholders review the requirements documents and requirements repository contents for incomplete requirements before they are accepted and approved for use.

Related Problems: GEN-REQ-1 Ambiguous Requirements, GEN-COM-4 Inadequate Communication Concerning Testing


2.1.8.4 GEN-REQ-4 Incorrect Requirements

Description: Testing fails to expose certain defects because some of the requirements are incorrect.

Potential Symptoms:
• Requirements are determined to be incorrect after the associated black-box tests have been developed and run.

Potential Consequences:
• Testing results include many false positive and false negative results.
• The tests associated with incorrect requirements must be modified or replaced and then rerun, potentially from scratch.
• Some defects caused by incorrect requirements may not be found and therefore make it through testing and into the deployed system.

Potential Causes:
• The stakeholders have not reviewed the requirements for correctness.
• Stakeholders are not available to validate the requirements.
• Insufficient resources (e.g., time and staffing) are allocated to properly engineer the requirements.

Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Improve the requirements engineering process.
• Ensure that the requirements are sufficiently validated by requirements stakeholders (e.g., customer representatives, user representatives, subject matter experts) before they are accepted by the testers and large numbers of associated test cases are developed based on them.

Related Problems: GEN-SIC-3 Lack of Stakeholder Commitment, TTS-SoS-2 Poor or Missing SoS Requirements

2.1.8.5 GEN-REQ-5 Unstable Requirements

Description: Testing is problematic because of the volatility of many of the requirements.54

54 This testing problem is similar to, but more general than, the preceding problem (Incorrect Requirements), because fixing incorrect requirements is one potential reason that the requirements may be volatile. Other reasons include engineering missing requirements and changing stakeholder needs.


Potential Symptoms:
• The requirements are continually changing: new requirements are being added, and existing requirements are being modified and deleted.
• The requirements selected for implementation are not frozen, especially during a short-duration increment (e.g., a Scrum sprint) when using an incremental, iterative, and parallel (a.k.a. agile) development cycle.

Potential Consequences:
• Test cases (test inputs, preconditions, and expected test outputs) and automated regression tests are made obsolete by requirements changes.
• Significant time originally scheduled for the development and running of new tests is spent on testing churn (fixing and rerunning broken tests).
• As testing schedules fall further behind, regression tests are not maintained and rerun.
• Broken tests may be abandoned.

Potential Causes:
• The requirements are not well understood by the requirements stakeholders.
• Many of the requirements are being rapidly iterated because they do not exhibit the characteristics of good requirements.
• The actual requirements are rapidly changing due to changes in the system's environment (e.g., new competing systems, rapidly changing threats, and changing markets).
• These potential causes can be exacerbated when using a development cycle with many short-duration iterative increments.
• The requirements engineers are rushed because insufficient resources (e.g., time and staffing) have been allocated to properly engineer the requirements.

Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Where practical, ensure that the requirements are reasonably stable before developing test cases (test scripts, test inputs, preconditions, and expected test outputs) and automating regression tests.55
• Consider adding a tester to the requirements engineering team as a liaison to the testers so that the testers know which requirements are most likely to be sufficiently stable to enable the testers to begin developing the associated tests.

Related Problems: GEN-COM-4 Inadequate Communication Concerning Testing

55 Note that this may be impossible or impractical due to delivery schedules and the amount of testing required.


2.1.8.6 GEN-REQ-6 Poor Derived Requirements

Description: Testing is problematic due to problems with derived requirements.

Potential Symptoms:
• Derived requirements merely restate their associated parent requirements.
• Newly derived requirements are not at the proper level of abstraction (e.g., subsystem requirements at the same level of abstraction as the system requirements from which they were derived). Note that the first symptom is often an example of this second symptom.
• The set of lower-level requirements derived from a higher-level requirement is necessary but not sufficient (i.e., meeting the lower-level requirements does not imply meeting the higher-level requirement).
• A derived requirement is not actually implied by its "source" requirement.
• Restrictions implied by architecture and design decisions are not being used to derive requirements in the form of derived architecture or design constraints.

Potential Consequences:
• It will be difficult to produce tests at the correct level of abstraction.
• Testing at the unit and subsystem level for these derived requirements may be incomplete.
• Associated lower-level defects may not be detected during testing.

Potential Causes:
• The people (e.g., requirements engineers and business analysts) who are engineering the requirements have not been adequately trained in how to derive new requirements at the appropriate level of abstraction.
• The requirements engineers are rushed because insufficient resources (e.g., time and staffing) have been allocated to properly engineer the requirements.

Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Review the derived and allocated requirements to ensure that they are at the proper level of abstraction and exhibit all of the standard characteristics of good requirements (e.g., complete, consistent, correct, feasible, mandatory, testable, and unambiguous).

Related Problems: GEN-REQ-8 Lack of Requirements Tracing

2.1.8.7 GEN-REQ-7 Verification Methods Not Specified

Description: Each requirement has not been allocated one or more verification methods (e.g., analysis, demonstration, inspection, simulation, or testing).56

Potential Symptoms:
• The requirements specifications do not specify the verification method(s) for their associated requirements.
• The requirements repository does not include verification method(s) as requirements metadata.57

Potential Consequences:
• Testers and testing stakeholders may incorrectly assume that all requirements must be verified via testing, even though other verification methods may be adequate, more appropriate, or require less time or effort.
• Time may be spent testing requirements that should have been verified using another, more appropriate verification method.
• Requirements stakeholders may incorrectly assume that if a requirement is not testable, then it is also not verifiable.

Potential Causes:
• The requirements repository (or requirements management tool) schema does not include metadata for specifying verification methods.
• Specifying verification methods for each requirement is not an explicit part of the requirements engineering process.
• The requirements engineers are rushed because insufficient resources (e.g., time and staffing) have been allocated to properly engineer the requirements.

Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Ensure that each requirement (or set of similar requirements) has one or more appropriate verification methods assigned to it.
• Check the appropriateness of these verification methods during requirements inspections, walk-throughs, and reviews.
• Ensure that the verification methods actually used are consistent with the specified requirements verification methods, updating the requirements specifications and repositories when necessary.
• Consider adding a tester to the requirements engineering team to ensure that the requirements verification methods are properly specified.

56 Note that it may be adequate for a group of requirements to be allocated one or more verification methods; the individual requirements in the group are then indirectly assigned verification methods via the group. This approach can potentially save time and effort and is therefore not an example of this problem.
57 There are multiple ways to specify verification methods, and the appropriate one(s) to use will depend on the requirements engineering process.
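The recommendation to record verification methods as requirements metadata could be sketched as follows; the enum values mirror the methods named in the description, while the repository layout and identifiers are invented for illustration:

```python
# Sketch of verification-method metadata in a requirements repository schema;
# the repository layout and sample entries are illustrative assumptions.
from enum import Enum

class VerificationMethod(Enum):
    ANALYSIS = "analysis"
    DEMONSTRATION = "demonstration"
    INSPECTION = "inspection"
    SIMULATION = "simulation"
    TESTING = "testing"

# requirement id -> set of assigned verification methods
repository = {
    "REQ-010": {VerificationMethod.TESTING},
    "REQ-011": {VerificationMethod.ANALYSIS, VerificationMethod.SIMULATION},
    "REQ-012": set(),   # the pitfall: no verification method specified
}

def unverified_requirements(repo):
    """List requirements that have no verification method assigned."""
    return sorted(r for r, methods in repo.items() if not methods)

print(unverified_requirements(repository))  # ['REQ-012']
```

With such metadata in place, the review steps recommended above reduce to a query, and testers can see at a glance which requirements are meant to be verified by methods other than testing.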

Related Problems: GEN-COM-4 Inadequate Communication Concerning Testing

2.1.8.8 GEN-REQ-8 Lack of Requirements Tracing

Description: The requirements are not traced to the individual test cases.

Potential Symptoms:
• There is no documented tracing from individual requirements to their associated test cases.
• The mapping from the requirements to the test cases is not stored in any project repository (e.g., a requirements management, test management, or configuration tool).
• There may only be a backwards trace from the individual test cases to the requirement(s) they test.
• Any tracing that was originally created is not maintained as the requirements change.

Potential Consequences:
• There will be no easy way to plan testing tasks, determine whether all requirements have been tested, or determine what needs to be regression tested after changes occur.
• If requirements change, there will be no way of knowing which test cases need to be created, modified, or deleted.

Potential Causes:
• The requirements repository (or requirements management tool) schema does not include metadata for specifying the requirements trace to test cases.
• Specifying the requirements trace to testing is not an explicit part of the requirements engineering process.
• There is insufficient staffing and time allocated to tracing requirements.
• The tool support for tracing requirements is inadequate or nonexistent.

Recommendations:
• Prepare: TBD
• Enable: TBD
• Perform: TBD
• Verify: Verify that TBD
• Create a tracing between the requirements and the test cases.
• Include the tracing from requirements to tests as a test asset in the appropriate repository.
• Include generating and maintaining the tracing from requirements to test cases in the test plan(s).


• Evaluate the testing process and work products to ensure that this tracing is being properly performed.

• Allocate time in the project master schedule to perform this tracing.
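The forward trace these recommendations call for can be as simple as a mapping held in the project repository. The following is a minimal sketch (the requirement and test case IDs are hypothetical) that answers the two key questions: which requirements are untested, and which test cases a requirements change impacts.

```python
# Sketch of a forward requirements-to-test-case trace (hypothetical IDs).
# A missing entry or an empty list flags an untested requirement.

trace = {
    "REQ-001": ["TC-101", "TC-102"],  # requirement verified by two test cases
    "REQ-002": ["TC-103"],
    "REQ-003": [],                    # traced, but no test cases yet
}

def untested(trace, all_requirements):
    """Return the requirements that have no associated test cases."""
    return sorted(r for r in all_requirements if not trace.get(r))

def impacted_tests(trace, changed_requirements):
    """Return the test cases to revisit after a requirements change."""
    return sorted({tc for r in changed_requirements for tc in trace.get(r, [])})
```

In practice the same queries would be run against the schema of the requirements management or test management tool rather than a hand-maintained dictionary.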

Related Problems: None

2.2 Test Type Specific Problems

The following types of testing problems are related to the type of testing being performed:
• Unit Testing Problems
• Integration Testing Problems
• Specialty Engineering Testing Problems
• System Testing Problems
• System of Systems (SoS) Testing Problems
• Regression Testing Problems

2.2.1 Unit Testing Problems

The following testing problems are related to unit testing:
• TTS-UNT-1 Unstable Design
• TTS-UNT-2 Inadequate Design Detail
• TTS-UNT-3 Unit Testing Considered Unimportant

2.2.1.1 TTS-UNT-1 Unstable Design

Description: Unit testing is problematic due to design volatility.

Potential Symptoms:
• Design changes (e.g., refactoring and new capabilities) cause the test cases to be constantly updated and test hooks to be lost.

Potential Consequences:
• Unit tests will be unstable, requiring numerous changes and unit-level regression testing.
• Unit testing will take an unacceptably long time to perform.

Potential Causes:
• TBD

Recommendations:
• Promote testability by ensuring that the design is reasonably stable so that test cases do not need to be constantly updated and test hooks are not lost due to refactoring and new capabilities.

58 Note that because unit testing is typically the responsibility of the developers instead of professional testers, the general problem of inadequate testing expertise, experience, and training often applies.

59 This is especially true with agile development cycles with many short-duration increments and with projects where abnormal behavior is postponed until late increments.


Related Problems: GEN-COM-3 Source Documents Not Maintained

2.2.1.2 TTS-UNT-2 Inadequate Design Detail

Description: Unit testing is problematic due to an inadequate level of design detail.

Potential Symptoms:
• There is insufficient design detail to drive the testing.
• Specifically, there is insufficient detail to support black-box (interface) and white-box (implementation) unit and integration testing.

Potential Consequences:
• Unit testing (especially regression testing during maintenance by someone other than the original developer) will be difficult to perform and repeat.
• Unit testing will take an unacceptably long time to perform.
• Unit-level defects may not be found.

Potential Causes:
• TBD

Recommendations:
• Ensure that the designers/programmers provide sufficient, well-documented design details to drive the unit testing.

Related Problems: None

2.2.1.3 TTS-UNT-3 Unit Testing Considered Unimportant

Description: Unit testing is poorly and incompletely done because the developers consider it to be unimportant.

Potential Symptoms:
• Developers consider unit testing to be unimportant, especially in relationship to the actual development of the software.
• Developers feel that the testers will catch any defects they miss.

Potential Consequences:
• Unit testing is poorly or incompletely done.
• An unacceptably large number of defects that should have been found during unit testing pass through to integration and system testing, which are thereby slowed down.

Potential Causes:
• TBD

60 This is especially important with agile development cycles with many short-duration increments and with projects where abnormal behavior is postponed until late increments.

61 This problem is exacerbated by schedule pressures on the developers and their tendency to try to show that their software works rather than to find the defects they have incorporated into their software (see GEN-SIC-1 Wrong Testing Mindset).


Recommendations:
• Ensure that the developers are clear as to their testing responsibilities (see GEN-TOP-2 Unclear Testing Responsibilities).
• Ensure that they understand the importance of finding defects during unit testing, when the defects are still highly localized and therefore much easier to analyze and fix.
• Establish clear unit testing success criteria that must be met before the unit can be delivered for integration and integration testing.

Related Problems: GEN-SIC-1 Wrong Testing Mindset

2.2.2 Integration Testing Problems

The following testing problems are related to integration testing:
• TTS-INT-1 Defect Localization
• TTS-INT-2 Unavailable Components
• TTS-INT-3 Inadequate Self-Test

2.2.2.1 TTS-INT-1 Defect Localization

Description: Localizing defects is problematic due to encapsulation caused by integration.

Potential Symptoms:
• It is difficult to determine the location of the defect: in the new or updated operational software under test, in the operational hardware under test, in the COTS OS and middleware, in the software test bed (e.g., in software simulations of hardware), in the hardware test beds (e.g., in pre-production hardware), in the tests themselves (e.g., in the test inputs, preconditions, expected outputs, and expected postconditions), or in a configuration/version mismatch among them.

Potential Consequences:
• Defect localization will take an unacceptably large amount of time and effort to perform.
• Errors in defect localization may cause the wrong fix (e.g., the wrong changes or changes to the wrong software) to be made.

Potential Causes:
• TBD

Recommendations:
• Ensure that the architecture and design adequately support testability (i.e., provide the testers with sufficient visibility and control to develop and execute adequate tests).
• Ensure that the design and implementation (with exception handling, BIT, and test hooks), the tests, and the test tools make it relatively easy to determine the location of defects.
• Where appropriate, incorporate a test mode that logs information about errors, faults, and failures to support defect identification and localization.
• Because a single type of defect often occurs in multiple locations, check similar locations for the same type of defect once a defect has been localized.
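The test-mode logging recommendation can be sketched as a thin wrapper that attributes each fault to the layer that raised it, so a failure observed at the system boundary already points at a component before debugging starts. This is an illustrative pattern, not a prescribed design; the component names are hypothetical.

```python
import logging

# Test-mode fault log: each success and each fault is recorded against the
# named layer (operational software, middleware, test bed, etc.) that was
# being exercised, to narrow down where a defect actually lives.
logging.basicConfig(level=logging.DEBUG,
                    format="%(asctime)s %(name)s %(levelname)s %(message)s")

def checked(component, operation, *args):
    """Run an operation, logging any fault against the component that raised it."""
    log = logging.getLogger(component)
    try:
        result = operation(*args)
        log.debug("ok: %s%r returned %r", operation.__name__, args, result)
        return result
    except Exception:
        # Full traceback in the test log: the first localization clue.
        log.exception("fault in %s%r", operation.__name__, args)
        raise
```

A real test mode would typically write such records to a persistent, timestamped log that also captures the configuration/version of every element of the test environment.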


Related Problems: GEN-SIC-1 Wrong Testing Mindset

2.2.2.2 TTS-INT-2 Unavailable Components

Description: Integration testing is problematic due to unavailability of needed system, software, or test environment components.

Potential Symptoms:
• The operational software, simulation software, test hardware, and actual hardware components (e.g., sensors, actuators, and network devices) are not available for integration into the test environments prior to scheduled integration testing.

Potential Consequences:
• Testing will not be able to begin until the missing components are available and have been integrated into the test environments.
• Testing may not be completed on schedule.

Potential Causes:
• TBD

Recommendations:
• Ensure that the operational software, simulation software, test hardware, and actual hardware components are available for integration into the test environments prior to scheduled integration testing.
• Include in the project budget and schedule the effort and time required to develop and install the simulation software and test hardware.
• If necessary:
  - Obtain components with lower fidelity for initial testing.
  - Develop simulators for the missing components.

Related Problems: GEN-TTE-4 Poor Fidelity of Test Environments

2.2.2.3 TTS-INT-3 Inadequate Self-Test

Description: Testing is problematic due to a lack of system- or software-internal self-tests.

Potential Symptoms:
• The operational subsystem or software does not contain sufficient test hooks, built-in-test (BIT), or prognostics and health management (PHM) software.

Potential Consequences:
• Failures will be difficult to cause, reproduce, and localize.
• Testing will take an unacceptably long time to perform, potentially exceeding the test schedule.

Potential Causes:
• TBD

Recommendations:
• Ensure that the operational software or subsystem contains sufficient test hooks, built-in-test (BIT), or prognostics and health management (PHM) software so that failures are reasonably easy to cause, reproduce, and localize.
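One common shape for such built-in-test support is a per-subsystem self-test hook whose results are aggregated into a pass/fail report, so a fault can be localized to a subsystem and check without external instrumentation. A minimal sketch, with hypothetical subsystem names and checks:

```python
# Sketch of a built-in-test (BIT) pattern: each subsystem exposes a
# self_test() that exercises its own critical paths and reports per-check
# results. The subsystems and checks here are hypothetical stand-ins.

class Subsystem:
    def __init__(self, name, checks):
        self.name = name
        self.checks = checks  # list of (check_name, zero-arg callable -> bool)

    def self_test(self):
        """Run every check; return a mapping of check name to pass/fail."""
        return {check: bool(fn()) for check, fn in self.checks}

def run_bit(subsystems):
    """Aggregate self-test results; return failures as (subsystem, check) pairs."""
    failures = []
    for sub in subsystems:
        for check, ok in sub.self_test().items():
            if not ok:
                failures.append((sub.name, check))
    return failures
```

In an embedded system the same pattern would run at power-on or on demand, with the failure list feeding the fault log or PHM function.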

Related Problems: None

2.2.3 Specialty Engineering Testing Problems

The following testing problems are related to the specialty engineering testing of quality characteristics and attributes:
• TTS-SPC-1 Inadequate Capacity Testing
• TTS-SPC-2 Inadequate Concurrency Testing
• TTS-SPC-3 Inadequate Performance Testing
• TTS-SPC-4 Inadequate Reliability Testing
• TTS-SPC-5 Inadequate Robustness Testing
• TTS-SPC-6 Inadequate Safety Testing
• TTS-SPC-7 Inadequate Security Testing
• TTS-SPC-8 Inadequate Usability Testing

Note that specialty engineering tests tend to find the kinds of defects that are both difficult and costly to fix (e.g., because they often involve making architectural changes). Even though these are system-level quality characteristics, waiting until system testing is generally a bad idea. These types of testing (or other verification approaches) should begin relatively early during development.

2.2.3.1 TTS-SPC-1 Inadequate Capacity Testing

Description: An inadequate level of capacity testing is being performed.

Potential Symptoms:
• Not all capacity requirements are identified and specified.
• There is little or no testing to determine whether performance degrades gracefully as capacity limits are approached, reached, and exceeded.
• There is little or no verification of adequate capacity-related computational resources (e.g., memory utilization or processor utilization).

Potential Consequences:
• Testing is less likely to detect some defects causing violations of capacity requirements.
• The system may not meet its capacity requirements.

Potential Causes:
• TBD

Recommendations:
• Ensure that all capacity requirements are properly specified.
• Specify how capacity requirements will be verified (and tested) in a project test planning document.
• Ensure that all capacity requirements are adequately tested to determine performance as capacity limits are approached, reached, and exceeded.
• Use tools that simulate large numbers of simultaneous users.

62 Note that analogous testing problems could also exist for other quality characteristics.
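A capacity sweep can then probe loads that approach, reach, and exceed the stated limit and check for graceful degradation. In the sketch below, the capacity limit is assumed and response_time() is an analytic stand-in for the system under test; a real harness would instead measure latency while simulated users generate the offered load.

```python
# Sketch of a capacity sweep around an assumed capacity limit.
CAPACITY_LIMIT = 100  # hypothetical specified capacity (e.g., requests/second)

def response_time(offered_load):
    """Stand-in for the measured response time (ms) at a given offered load."""
    base = 20.0
    if offered_load <= CAPACITY_LIMIT:
        return base * (1.0 + offered_load / CAPACITY_LIMIT)
    # Beyond the limit the model slows down, but smoothly rather than abruptly.
    return 2.0 * base * (offered_load / CAPACITY_LIMIT) ** 2

def capacity_sweep(loads):
    """Collect (load, response time) samples across the interesting region."""
    return [(load, response_time(load)) for load in loads]

def degrades_gracefully(samples, ceiling_ms):
    """Pass only if response time stays under a hard ceiling, even past the limit."""
    return all(ms <= ceiling_ms for _, ms in samples)
```

The point of the sweep is the shape of the curve: a system that fails this check near the limit is degrading abruptly rather than gracefully, which is exactly the defect this pitfall leaves undetected.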

Related Problems: GEN-TPS-2 Incomplete Test Planning

2.2.3.2 TTS-SPC-2 Inadequate Concurrency Testing

Description: An inadequate level of concurrency testing is being performed.

Potential Symptoms:
• The testing of concurrent behavior is not addressed in any test planning or process description documents.
• There is little or no testing being performed explicitly to identify the defects that cause the common types of concurrency faults and failures: deadlock, livelock, starvation, priority inversion, race conditions, inconsistent views of shared memory, and unintentional infinite loops.
• Any concurrency testing that is being performed is based on a random rather than systematic approach to test case identification (e.g., based on the interleaving of threads).
• Any concurrency testing is being performed manually.
• Concurrency faults and failures are only being identified when they happen to occur while unrelated testing is being performed.
• Concurrency faults and failures occur infrequently and intermittently and are difficult to reproduce.
• Concurrency testing is performed using an environment with low fidelity with regard to concurrency:
  - threads rather than processes
  - a single processor rather than multiple processors
  - deterministic rather than probabilistic drivers and stubs
  - hardware simulation rather than actual hardware

Potential Consequences:
• Any concurrency testing is both ineffectual and labor intensive.
• Many defects that can cause concurrency faults and failures are not found and fixed until final system testing, operational testing, or system operation, when they are much more difficult to reproduce, localize, and understand.

Potential Causes:
• TBD

Recommendations:
• Provide testers with training in concurrency defects, faults, and failures.
• Use concurrency testing techniques that enable the systematic selection of a reasonable number of test cases (e.g., ways of interleaving the threads) from the impractically large number of potential test cases.
• For testing of threads sharing a single processor, use a concurrency testing tool that provides control over thread creation and scheduling.
• When such tools are unavailable or inadequate, develop scripts that:
  - automate the testing of deadlock and race conditions
  - enable the reproducibility of test inputs
  - record test results for analysis
• To the extent possible, do not rely on:
  - merely throwing large numbers of simultaneous inputs/requests at the system
  - performing manual testing
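The systematic selection recommended above can be illustrated on the smallest interesting case: two threads each performing a read-then-write increment of a shared counter. The sketch below simulates interleavings rather than running real threads, enumerating every legal ordering so that any lost-update race it finds is reproducible by construction, in contrast to random load generation.

```python
from itertools import permutations

def run_interleaving(order):
    """Simulate 'counter += 1' by two threads under one interleaving of
    their read (R) and write (W) steps."""
    counter = 0
    local = {}
    for thread, step in order:
        if step == "R":
            local[thread] = counter          # thread reads the shared counter
        else:
            counter = local[thread] + 1      # thread writes back its (possibly stale) read + 1
    return counter

def all_interleavings():
    """Every ordering of T1:R, T1:W, T2:R, T2:W in which each thread
    performs its own read before its own write."""
    steps = [("T1", "R"), ("T1", "W"), ("T2", "R"), ("T2", "W")]
    orders = []
    for order in permutations(steps):
        if (order.index(("T1", "R")) < order.index(("T1", "W"))
                and order.index(("T2", "R")) < order.index(("T2", "W"))):
            orders.append(order)
    return orders

def racy_interleavings():
    """Interleavings that lose an update: two increments should yield 2."""
    return [o for o in all_interleavings() if run_interleaving(o) != 2]
```

Of the six legal interleavings, four lose an update, which is why random scheduling may or may not expose the defect on any given run while exhaustive enumeration always does. Tools that control thread scheduling apply the same idea to real threads.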

Related Problems: GEN-TPS-2 Incomplete Test Planning, TTS-SPC-3 Inadequate Performance Testing

2.2.3.3 TTS-SPC-3 Inadequate Performance Testing

Description: An inadequate level of performance testing is being performed.

Potential Symptoms:
• Performance requirements are not specified for all of the performance quality attributes: event schedulability, jitter, latency, response time, and throughput.
• There is little or no performance testing or testing to determine whether performance degrades gracefully.
• There is little or no verification of adequate performance-related computational resources.
• Performance testing is performed using a low-fidelity environment.

Potential Consequences:
• Testing is less likely to detect some performance defects.
• The system may not meet its performance requirements.
• Developers may have a false sense of security based on adequate performance under normal testing involving nominal loads and a subset of operational profiles.

Potential Causes:
• TBD

Recommendations:
• Specify how performance requirements will be verified (and tested) in a project test planning document.
• Ensure that all performance requirements are properly identified and specified.
• Create realistic workload models under all relevant operational profiles.

63 Such tests may redundantly test the same interleaving of threads while leaving many interleavings untested. Unexpected determinism may even result in the exact same interleaving being performed over and over again.

64 Examples include network and disk I/O, bus/network bandwidth, processor utilization, memory (RAM and disk) utilization, or database performance.


• Create or use existing (COTS) performance tools, such as a System Level Exerciser (SLE), to manage, schedule, perform, monitor, and report the results of performance tests.

• Measure performance under nominal conditions and exceptional (i.e., fault and failure tolerance) conditions as well as conditions of peak loading and graceful degradation.

• As appropriate, run single-thread, multi-thread, and multi-processor/core tests.
• Ensure that all performance requirements are adequately tested, including all relevant performance attributes, operational profiles, and credible workloads.
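Because response-time requirements are usually stated as bounds on high percentiles rather than on the average, a measurement harness should report both. A minimal sketch follows; the timed operation is a placeholder for a call into the system under test, and the nearest-rank percentile rule is one common choice among several.

```python
import statistics
import time

def measure_latency(operation, repetitions=1000):
    """Time repeated calls; return per-call latencies in ms, sorted ascending."""
    samples = []
    for _ in range(repetitions):
        start = time.perf_counter()
        operation()
        samples.append((time.perf_counter() - start) * 1000.0)
    return sorted(samples)

def percentile(sorted_samples, p):
    """Nearest-rank percentile of an already-sorted sample list."""
    rank = round(p / 100.0 * len(sorted_samples)) - 1
    return sorted_samples[max(0, min(len(sorted_samples) - 1, rank))]

def report(sorted_samples):
    """Summarize the distribution the way a response-time requirement is stated."""
    return {
        "mean_ms": statistics.mean(sorted_samples),
        "p95_ms": percentile(sorted_samples, 95),
        "max_ms": sorted_samples[-1],
    }
```

A requirement such as "the 95th percentile of response time shall not exceed N ms" then becomes a direct assertion against the reported p95 value under each relevant workload.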

Related Problems: GEN-TPS-2 Incomplete Test Planning, TTS-SPC-2 Inadequate Concurrency Testing

2.2.3.4 TTS-SPC-4 Inadequate Reliability Testing

Description: An inadequate level of reliability testing is being performed.

Potential Symptoms:
• There is little or no long-duration reliability testing (a.k.a. stability testing) under operational profiles.

Potential Consequences:
• Testing is less likely to detect some defects causing violations of reliability requirements (and data to enable the estimation of system reliability will not be collected).
• The system may not meet its reliability requirements.

Potential Causes:
• TBD

Recommendations:
• Ensure that all reliability requirements are properly identified and specified.
• Specify how reliability requirements will be verified (or tested) in a project test planning document.
• To the degree that testing (as opposed to analysis) is practical as a verification method, ensure that all reliability requirements undergo sufficient long-duration reliability testing (a.k.a. soak testing) under operational profiles to estimate the system's reliability.
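A soak-test driver can be sketched in a few lines: draw operations according to the operational profile, run for a long duration (measured here in cycles rather than hours, for brevity), and count failures so that reliability can be estimated. The profile and its weights are hypothetical stand-ins; a real driver would run against the actual system for days or weeks.

```python
import random

def soak(operations, weights, cycles, seed=0):
    """Drive the system per its operational profile; return the failure count."""
    rng = random.Random(seed)  # fixed seed makes the whole run reproducible
    failures = 0
    for _ in range(cycles):
        op = rng.choices(operations, weights=weights, k=1)[0]
        try:
            op()
        except Exception:
            failures += 1  # a real driver would also log the failure details
    return failures

def mean_cycles_between_failures(cycles, failures):
    """Crude reliability estimate from a completed soak run."""
    return cycles / failures if failures else float("inf")
```

The estimate is only meaningful if the weights genuinely reflect the operational profile; soaking an unrepresentative workload is a subtler form of the same pitfall.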

Related Problems: GEN-TPS-2 Incomplete Test Planning, TTS-SPC-5 Inadequate Robustness Testing

2.2.3.5 TTS-SPC-5 Inadequate Robustness Testing

Description: An inadequate level of robustness testing is being performed.

Potential Symptoms:
• Robustness testing is not based on robustness analysis such as abnormal (i.e., fault, degraded-mode, and failure) use case paths, Event Tree Analysis (ETA), Fault Tree Analysis (FTA), or Failure Modes and Effects Criticality Analysis (FMECA).

65 Note that reliability (load and stability) testing is nominal in the sense that it is executed within the performance envelope of the System Under Test (SUT). Capacity (stress) testing, where one tests for graceful degradation, is outside the scope of performance testing.

• There is little or no robustness testing:
  - Error Tolerance Testing, the goal of which is to show that the system does not detect or react properly to input errors (a subtype of which is Fuzz Testing)
  - Fault Tolerance Testing, the goal of which is to show that the system does not detect or react properly to system faults (bad internal states)
  - Failure Tolerance Testing, the goal of which is to show that the system does not detect or react properly to system failures (i.e., failures to meet requirements)
  - Environmental Tolerance Testing, the goal of which is to show that the system does not detect or react properly to dangerous environmental conditions

Potential Consequences:
• Testing is less likely to detect some defects causing violations of robustness requirements. Some error, fault, failure, and environmental tolerance defects will not be found.
• The system may exhibit inadequate robustness.

Potential Causes:
• TBD

Recommendations:
• Ensure that all robustness requirements are properly identified and specified.
• Specify how robustness requirements will be verified (and tested) in a project test planning document.
• Ensure that there is sufficient testing of all robustness requirements to verify adequate error, fault, failure, and environmental tolerance.
• Ensure that this testing is based on proper robustness analysis such as abnormal (i.e., fault, degraded-mode, and failure) use case paths, Event Tree Analysis (ETA), Fault Tree Analysis (FTA), or Failure Modes and Effects Criticality Analysis (FMECA).
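Error tolerance testing in particular lends itself to a scripted fuzz harness: generate many random inputs from a fixed seed (so every failure is reproducible) and require that the component either succeeds or rejects the input with its documented exception. In the sketch below, Python's built-in int merely stands in for the component under test, and ValueError for its documented rejection.

```python
import random
import string

def fuzz_inputs(count, seed=0, max_len=8):
    """Yield reproducible random strings; the fixed seed makes reruns identical."""
    rng = random.Random(seed)
    alphabet = string.digits + string.ascii_letters + string.punctuation + " "
    for _ in range(count):
        yield "".join(rng.choice(alphabet)
                      for _ in range(rng.randint(0, max_len)))

def fuzz(parse, inputs, tolerated=(ValueError,)):
    """Return the inputs that provoked an untolerated reaction (i.e., a defect)."""
    defects = []
    for text in inputs:
        try:
            parse(text)
        except tolerated:
            pass                    # clean, documented rejection: tolerance works
        except Exception:
            defects.append(text)    # crash outside the contract: robustness defect
    return defects
```

An empty defect list means the component tolerated every malformed input it saw; a non-empty list is a reproducible robustness bug report, since rerunning with the same seed replays the same inputs.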

Related Problems: GEN-TPS-2 Incomplete Test Planning, TTS-SPC-4 Inadequate Reliability Testing

2.2.3.6 TTS-SPC-6 Inadequate Safety Testing

Description: An inadequate level of safety testing is being performed.

Potential Symptoms:
• There is little or no:
  - Testing based on safety analysis (e.g., abuse/mishap cases, ETA, or FTA)
  - Testing of safeguards (e.g., interlocks)
  - Testing of fail-safe behavior
  - Safety-specific testing:
    - Vulnerability Testing, the goal of which is to expose a system vulnerability (i.e., a defect or weakness)
    - Hazard Testing, the goal of which is to make the system cause a hazard to come into existence
    - Mishap Testing, the goal of which is to make the system cause an accident or near miss

Potential Consequences:
• Testing is less likely to detect some defects causing violations of safety requirements. Some defects with safety ramifications will not be found.
• The system may exhibit inadequate safety.

Potential Causes:
• TBD

Recommendations:
• Ensure that all safety-related requirements are properly identified and specified.
• Specify how safety requirements will be verified (and tested) in a project test planning document.
• Ensure that there is sufficient black-box testing of all safety requirements and sufficient white-box testing of safeguards (e.g., interlocks) and fail-safe behavior.
• Ensure that this testing is based on adequate safety analysis (e.g., abuse/mishap cases) as well as the safety architecture and design.

Related Problems: GEN-TPS-2 Incomplete Test Planning, TTS-SPC-7 Inadequate Security Testing

2.2.3.7 TTS-SPC-7 Inadequate Security Testing

Description: An inadequate level of security testing is being performed.

Potential Symptoms:
• There is little or no:
  - Testing based on security analysis (e.g., attack trees or abuse/misuse cases)
  - Testing of security controls (e.g., access control, encryption/decryption, or intrusion detection)
  - Testing of fail-secure behavior
  - Security-specific testing:
    - Penetration Testing, the goal of which is to penetrate the system's defenses
    - Fuzz Testing, the goal of which is to cause the system to fail due to random input
    - Vulnerability Testing, the goal of which is to expose a system vulnerability (i.e., a defect or weakness)

66 Note that the term vulnerability (meaning a weakness in the system/software) applies to both safety and security. Vulnerabilities can be exploited by an abuser [either unintentional (safety) or intentional (security)] and contribute to the occurrence of an abuse [either mishap (safety) or misuse (security)].


Potential Consequences:
• Testing is less likely to detect some defects causing violations of security requirements.
• Some vulnerabilities and other defects having security ramifications will not be found.
• The system may exhibit inadequate security.

Potential Causes:
• TBD

Recommendations:
• Ensure that all security-related requirements are properly identified and specified.
• Specify how security requirements will be verified (and tested) in a project test planning document.
• Ensure that all system actors are documented (e.g., profiled).
• Ensure that there is sufficient security testing (e.g., penetration testing) of all security requirements, security features, security controls, and fail-secure behavior.
• Ensure that this testing is based on adequate security analysis (e.g., attack trees and abuse/misuse cases).
• Use static vulnerability analysis tools to identify commonly occurring security vulnerabilities.

Related Problems: GEN-TPS-2 Incomplete Test Planning, TTS-SPC-6 Inadequate Safety Testing

2.2.3.8 TTS-SPC-8 Inadequate Usability Testing

Description: An inadequate level of usability testing is being performed.

Potential Symptoms:
• There is little or no explicit usability testing of the system's or software's human interfaces.
• Specifically, there is no testing of quality attributes of usability such as accessibility, attractiveness (a.k.a. engagability, preference, and stickiness), credibility (a.k.a. trustworthiness), differentiation, ease of entry, ease of location, ease of remembering, effectiveness, effort minimization, error minimization, predictability, learnability, navigability, retrievability, suitability (a.k.a. appropriateness), understandability, and user satisfaction.

Potential Consequences:
• Testing is less likely to detect some defects causing violations of usability requirements.
• Some defects with usability ramifications will not be found.
• The system may exhibit inadequate usability.

67 Warning: although it is a bad idea, security requirements are sometimes specified in a security document rather than in the requirements specification/repository. Similarly, security testing is sometimes documented in security documents rather than in testing documents.

68 The term “stickiness” is typically used with reference to web pages and refers to how long users remain at (that is, remain “stuck” to) given web pages.


Potential Causes:
• TBD

Recommendations:
• Ensure that all usability requirements are properly identified and specified.
• Specify how usability requirements will be verified (and tested) in a project test planning document.
• Ensure that there is sufficient usability testing of the human interfaces. Include usability testing for all relevant usability attributes, such as accessibility, attractiveness (also known as engagability, preference, and stickiness), credibility (also known as trustworthiness), differentiation, ease of entry, ease of location, ease of remembering, effectiveness, effort minimization, error minimization, learnability, navigability, retrievability, suitability (also known as appropriateness), understandability, and user satisfaction.

Related Problems: GEN-TPS-2 Incomplete Test Planning

2.2.4 System Testing Problems

The very nature of system testing often ensures that these problems cannot be eliminated. At best, the recommended solutions can only mitigate them.

The following testing problems are related to system testing:
• TTS-SYS-1 Testing Robustness Requirements is Difficult
• TTS-SYS-2 Lack of Test Hooks
• TTS-SYS-3 Testing Code Coverage is Difficult

2.2.4.1 TTS-SYS-1 Testing Robustness Requirements is Difficult

Description: The testing of robustness requirements (specifying error, fault, and failure tolerance) is difficult.

Potential Symptoms:
• It is difficult for tests of the integrated system to cause local faults (i.e., faults internal to a subsystem) in order to test for fault tolerance.

Potential Consequences:
• The system or software is less testable because it is less controllable (e.g., with respect to causing local faults).
• Less robustness testing will be done, and the delivered system will contain an unacceptably large number of defects that lessen error, fault, and failure tolerance.

69 An error is bad input (from a human, another system, or hardware). A fault is an encapsulated (information hiding) incorrect state or incorrect stored data. A failure is an externally visible incorrect response (e.g., output data or control) that typically is a violation of some requirement. An error may or may not result in a fault depending on whether it is stored and there is error tolerance. A fault may or may not cause a failure depending on whether it is executed and there is fault tolerance.


Potential Causes:
• TBD

Recommendations:
• Ensure that robustness requirements are specified and that the associated architecture/design decisions are documented.
• Ensure adequate test tool support, or ensure that sufficient robustness support (including error, fault, and failure logging) is incorporated into the system to enable adequate testing for tolerance (e.g., by causing encapsulated errors and faults and observing the resulting robustness).
• Where appropriate, incorporate test hooks, built-in test (BIT), fault logging (possibly triggered by exception handling), a prognostics and health management (PHM) function or subsystem, or some other way to overcome information hiding in order to verify test case preconditions and postconditions.
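Where the design permits it, one way to cause encapsulated faults is fault injection: temporarily replacing an internal operation so that it raises, letting a test exercise a tolerance path that external inputs alone cannot reach. A sketch using Python's standard mock library follows; the component and its fallback behavior are hypothetical.

```python
from unittest import mock

class Store:
    """Stand-in internal dependency that the system normally hides."""
    def read(self, key):
        return {"mode": "normal"}[key]

class Controller:
    """Hypothetical component that must degrade gracefully when its store faults."""
    def __init__(self, store):
        self.store = store

    def current_mode(self):
        try:
            return self.store.read("mode")
        except Exception:
            return "safe-default"  # the fault-tolerant, degraded-mode behavior

def test_fault_tolerance():
    controller = Controller(Store())
    assert controller.current_mode() == "normal"
    # Inject a local fault that no external input could cause, then
    # verify the tolerance path instead of the nominal path.
    with mock.patch.object(Store, "read", side_effect=IOError("injected")):
        assert controller.current_mode() == "safe-default"
```

In compiled systems the same effect is achieved with link-time stubs, fault-injecting test harnesses, or BIT modes that deliberately corrupt an internal state under test control.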

Related Problems: TTS-SPC-5 Inadequate Robustness Testing

2.2.4.2 TTS-SYS-2 Lack of Test Hooks

Description: System testing is difficult because temporary test hooks have been removed.

Potential Symptoms:
• Internal test hooks and testing software have been removed prior to system testing (e.g., for security or performance reasons).

Potential Consequences:
• It will be difficult to test locally implemented requirements.
• Such requirements will not be verified at the system level because of decreased testability due to low controllability and observability.

Potential Causes:
• TBD

Recommendations:
• Ensure that unit and integration testing have adequately tested locally implemented and encapsulated requirements that are difficult to verify during system testing.
• Use a test/logging system mode (if one exists).

Related Problems: None

2.2.4.3 TTS-SYS-3 Testing Code Coverage is Difficult

Description: Ensuring that tests provide adequate code coverage is difficult.

Potential Symptoms:
• It is difficult for tests of the integrated system to demonstrate code coverage.70

70 Code coverage is typically very important for software with safety or security ramifications. When software is categorized by safety or security significance, the mandatory rigor of testing (including the completeness of coverage) increases as the safety and security risk increases (e.g., from function coverage through statement coverage, decision or branch coverage, and condition coverage to path coverage).


Potential Consequences:
• Adequate code coverage as mandated for mission-, safety-, and security-critical software will not be verified.
• The system will not receive its safety and security accreditation and certification until code coverage is verified.

Potential Causes:
• TBD

Recommendations:
• Ensure that unit and integration testing (including regression testing) have demonstrated sufficient code coverage so that code coverage need not be demonstrated at the system level.
• Use software test tools or probes to measure and report code coverage.

Related Problems: None
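To illustrate measuring coverage with a probe, here is a small sketch using Python's standard-library trace module (the function under test is a hypothetical stand-in for any unit):

```python
import trace

def absolute(x):
    # Two branches; a single test input exercises only one of them.
    if x < 0:
        return -x
    return x

tracer = trace.Trace(count=1, trace=0)
tracer.runfunc(absolute, 5)  # covers only the non-negative branch

# counts maps (filename, line number) -> execution count
covered_lines = {line for (_, line) in tracer.results().counts}

tracer.runfunc(absolute, -5)  # the negative branch adds newly covered lines
covered_after = {line for (_, line) in tracer.results().counts}
```

Comparing the covered-line sets before and after adding a test input shows exactly which lines the new input exercises, which is the raw data behind any statement-coverage report.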

2.2.5 System of Systems (SoS) Testing Problems

Note that system of systems means the integration of separately developed, funded, and scheduled systems having independent governance. This is not referring to a system developed by a prime contractor or integrated by a system integrator consisting of subsystems developed by subcontractors or vendors.

The following testing problems are related to system of systems testing:
• TTS-SoS-1 Inadequate SoS Test Planning
• TTS-SoS-2 Unclear SoS Testing Responsibilities
• TTS-SoS-3 Inadequate Funding for SoS Testing
• TTS-SoS-4 SoS Testing not Properly Scheduled
• TTS-SoS-5 Poor or Missing SoS Requirements
• TTS-SoS-6 Inadequate Test Support from Individual Systems
• TTS-SoS-7 Inadequate Defect Tracking Across Projects
• TTS-SoS-8 Finger-Pointing

2.2.5.1 TTS-SoS-1 Inadequate SoS Test Planning

Description: An inadequate amount of SoS test planning is being performed.

Potential Symptoms:
• There may be no SoS Test and Evaluation Master Plan (TEMP) or SoS Test Plan.
• There may be no SoS System Engineering Master Plan (SEMP) and SoS Development Plan (SoSDP).
• There may only be incomplete high-level overviews of testing in the SoS SEMP or Test Plan.



Potential Consequences:
• There may be no clear test responsibilities, objectives, methods and techniques, and completion/acceptance criteria at the SoS level.
• It may be unclear which project, organization, team, or individual is responsible for performing the different SoS testing tasks.
• Adequate resources (funding, staffing, and schedule) may not be made available for SoS testing.
• SoS testing may be inadequate.
• There may be numerous system-to-system interface defects causing the failure of end-to-end mission threads.

Potential Causes:
• There may be (probably is) little if any governance at the SoS level.
• Little or no planning may have occurred for testing above the individual system level.
• The SoS testing tasks may not have been determined, planned for, or documented.

Recommendations:
• Prepare:
  – Determine the level of testing that is taking place at the system level.
  – Reuse or create a standard template and content/format standard for the SoS TEMP or Test Plan.
  – Include the SoS TEMP or Test Plan as a deliverable work product in the SoS integration project's contract.
  – Include the delivery of the SoS TEMP or Test Plan in the SoS project's master schedule (e.g., as part of major milestones).
• Enable:
  – To the extent practical, ensure close and regular communication (e.g., via status/working meetings and participation in major reviews) between the various system-level test organizations/teams.
• Perform:
  – Perform sufficient test planning at the SoS level.
  – Create a SoS TEMP or Test Plan in order to ensure that this planning is documented.
• Verify:
  – Verify the existence and completeness of the SoS TEMP or Test Plan.

Related Problems: GEN-TPS-1 No Separate Test Plan, GEN-TPS-2 Incomplete Test Planning

2.2.5.2 TTS-SoS-2 Unclear SoS Testing Responsibilities

Description: The responsibilities for performing end-to-end system of systems testing are unclear.

Potential Symptoms:
• No project is explicitly tasked with testing end-to-end SoS behavior.


Potential Consequences:
• No project will have planned to provide the resources (e.g., staffing, budget, schedule) needed to perform SoS testing.
• Adequate SoS testing is unlikely to be performed, and the SoS will be unlikely to meet its schedule for deployment of new/updated capabilities.

Potential Causes:
• TBD

Recommendations:
• Ensure that responsibilities for testing the end-to-end SoS behavior are clearly assigned to some organization and project.
• To the extent practical, ensure close and regular communication (e.g., via status/working meetings and participation in major reviews) between the various system-level test organizations/teams.

Related Problems: GEN-TOP-2 Unclear Testing Responsibilities

2.2.5.3 TTS-SoS-3 Inadequate Funding for SoS Testing

Description: The funding for system of systems (SoS) testing is not adequate for the performance of sufficient testing.

Potential Symptoms:
• Little or no funding has been provided to perform end-to-end SoS testing.
• None of the system-level projects have been funded to perform end-to-end SoS testing.

Potential Consequences:
• Little or no end-to-end SoS testing will be performed.
• It is likely that residual system-to-system interface defects will cause the failure of end-to-end mission threads.

Potential Causes:
• TBD

Recommendations:
• Ensure that adequate funding for testing the end-to-end SoS behavior is clearly supplied to the responsible organization and project.

Related Problems: GEN-SIC-3 Lack of Stakeholder Commitment

2.2.5.4 TTS-SoS-4 SoS Testing not Properly Scheduled

Description: System of systems testing is not properly scheduled.

Potential Symptoms:
• SoS testing is not in the individual systems' integrated master schedules, and there is no SoS-level master schedule.
• SoS testing must be fit into the uncoordinated schedules of the individual systems comprising the SoS.

Potential Consequences:
• SoS testing that is not scheduled is unlikely to be performed.
• If performed, the testing is likely to be rushed, incomplete, and inadequate, with more mistakes than typical.
• The operational SoS is likely to contain more SoS integration defects and end-to-end mission thread defects than is appropriate.

Potential Causes:
• TBD

Recommendations:
• Ensure that SoS testing is on the SoS master schedule.
• Ensure that SoS testing is also on the individual systems' integrated master schedules so that support for SoS testing can be planned.
• Ensure that SoS testing is coordinated with the schedules of the individual systems.
• To the extent practical, ensure close and regular communication (e.g., via status/working meetings and participation in major reviews) between the various system-level test organizations/teams.

Related Problems: GEN-TPS-3 Inadequate Test Schedule

2.2.5.5 TTS-SoS-5 Poor or Missing SoS Requirements

Description: Many system of systems requirements are either missing or of poor quality.

Potential Symptoms:
• Few if any requirements exist above the system level.
• Those SoS requirements that do exist do not exhibit all of the characteristics of good requirements.

Potential Consequences:
• Requirements-based SoS testing will be difficult to perform because there are no officially approved SoS requirements to verify.
• It will be hard to develop test cases and to determine the corresponding expected test outputs.
• It is likely that system-to-system interface defects will cause the failure of end-to-end mission threads.

Potential Causes:
• TBD

Recommendations:
• Ensure that there are sufficient officially approved SoS requirements to drive requirements-based SoS testing.

Related Problems: GEN-REQ-1 Ambiguous Requirements, GEN-REQ-2 Missing Requirements, GEN-REQ-4 Incorrect Requirements

2.2.5.6 TTS-SoS-6 Inadequate Test Support from Individual Systems

Description: Test support from individual system development/maintenance projects is inadequate to perform system of systems testing.

Potential Symptoms:
• All available system-level test resources (e.g., staffing, funding, and test environments) are already committed to system testing.

Potential Consequences:
• It will be difficult or impossible to obtain the necessary test resources from individual projects to support SoS testing.

Potential Causes:
• TBD

Recommendations:
• Ensure that the individual projects provide adequate test resources (e.g., people and test beds) to support SoS testing.
• Ensure that these resources are not already committed elsewhere.

Related Problems: GEN-SIC-3 Lack of Stakeholder Commitment

2.2.5.7 TTS-SoS-7 Inadequate Defect Tracking Across Projects

Description: Defect tracking across individual system development or maintenance projects is inadequate to support system of systems testing.

Potential Symptoms:
• There is little or no coordination of defect tracking and associated regression testing across multiple projects.
• Different projects collect different types and amounts of information concerning defects identified during testing.

Potential Consequences:
• It will be unnecessarily difficult to synchronize system- and SoS-level activities.
• Defect localization and the allocation of defects to individual systems or sets of systems will be difficult to perform.

Potential Causes:
• TBD

Recommendations:
• Develop a consensus concerning how to address defect reporting and tracking across the systems making up the SoS.
• Document this consensus in all relevant testing plans (SoS and individual systems).


• Verify that defect tracking and associated regression testing across the individual projects of the systems making up the SoS are adequately coordinated and reported.
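Such a consensus can be made concrete by agreeing on a minimal shared defect-record structure that every project's tracker can export. A sketch (the field and system names are hypothetical):

```python
from dataclasses import dataclass, field, asdict

@dataclass
class SoSDefectRecord:
    """Minimal defect record agreed across the projects making up an SoS."""
    defect_id: str
    detected_by_system: str            # which system's testing found it
    suspected_systems: list = field(default_factory=list)
    severity: str = "minor"            # e.g., critical / major / minor
    status: str = "open"               # e.g., open / assigned / fixed / verified
    regression_tests_rerun: bool = False

record = SoSDefectRecord(
    defect_id="SOS-0042",
    detected_by_system="command-and-control",
    suspected_systems=["sensor-net", "command-and-control"],
    severity="major",
)
exported = asdict(record)  # common dictionary format for cross-project exchange
```

Because every project exports the same fields, defect localization across systems and SoS-level regression-testing status can be reported uniformly.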

Related Problems: None

2.2.5.8 TTS-SoS-8 Finger-Pointing

Description: Different system development/maintenance projects assign the responsibility for defects and fixing them to other projects.

Potential Symptoms:
• There is a significant amount of finger-pointing across project boundaries regarding whether something is a defect (or feature) or where defects lie (i.e., in which systems and in which project's testing).

Potential Consequences:
• Time and effort will be wasted in the allocation of defects to individual systems or sets of systems.
• Defects will take longer to be fixed, and these fixes will take longer to be verified.

Potential Causes:
• TBD

Recommendations:
• Ensure that representatives of the individual systems are on the SoS change control board (CCB) and are involved in SoS defect triage.
• Work to develop a SoS mindset among the members of the SoS CCB.

Related Problems: None

2.2.6 Regression Testing Problems

The following problems are specific to the performance of regression testing, including testing during maintenance:
• TTS-REG-1 Insufficient Regression Test Automation
• TTS-REG-2 Regression Testing not Performed
• TTS-REG-3 Inadequate Scope of Regression Testing
• TTS-REG-4 Only Low-Level Regression Tests
• TTS-REG-5 Disagreement over Maintenance Test Resources

2.2.6.1 TTS-REG-1 Insufficient Regression Test Automation

Description: Too few of the regression tests are automated.71

71 The automation of regression testing is especially important when an agile (iterative, incremental, and parallel) development cycle is used. The resulting numerous, short-duration increments of the system must be retested because of changes due to iteration (e.g., refactoring and defect correction) and the integration of additional components with existing components.


Potential Symptoms:
• Many or even most of the tests may be being performed manually.

Potential Consequences:
• Manual regression testing may take so much time and effort that it is not done.
• If performed, regression testing may be rushed, incomplete, and inadequate to uncover a sufficient number of defects.
• Testers may be making an excessive number of mistakes while manually performing the tests.
• Defects introduced into previously tested subsystems/software while making changes may remain in the operational system.

Potential Causes:
• Testing stakeholders (e.g., managers and the developers of unit tests) may:
  – mistakenly believe that performing regression testing is neither necessary nor cost effective because:
    · of the minor scope of most changes
    · system testing will catch any inadvertently introduced integration defects
    · they are overconfident that changes have not introduced any new defects
  – not be aware of the:
    · importance of regression testing
    · value of automating regression testing
• Automated regression testing may not be an explicit part of the testing process.
• Automated regression testing may not be incorporated into the Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP).
• The schedule may contain little or no time for the development and maintenance of automated tests.
• Tool support for automated regression testing may be lacking (e.g., due to insufficient test budget) or impractical to use.
• The initially developed automated tests may not be maintained.
• The initially developed automated tests may not be delivered with the system/software.

Recommendations:
• Prepare:
  – Explicitly address automated regression testing in the project's:
    · test process documentation (e.g., procedures and guidelines)
    · TEMP or STP
    · master schedule
    · work breakdown structure (WBS)
• Enable:
  – Provide training/mentoring to the testing stakeholders in the importance and value of automated regression testing.
  – Provide sufficient time in the schedule for automating and maintaining the tests.


  – Provide sufficient funding to pay for automated test tools.
  – Ensure that adequate resources (staffing, budget, and schedule) are planned and available for automating and maintaining the tests.
• Perform:
  – Automate as many of the regression tests as is practical.
  – Where appropriate, use commercially available test tools to automate testing.
  – Ensure that both automated and manual test results are integrated into the same overall test results database so that test reporting and monitoring are seamless.
  – Maintain the automated tests as the system/software changes.
  – Deliver the automated tests with the system/software.
• Verify:
  – Verify that the test process documentation addresses automated regression testing.
  – Verify that the TEMP/STP and WBS address automated regression testing.
  – Verify that the schedule provides sufficient time to automate and maintain the tests.
  – Verify that a sufficient number of the tests have been automated.
  – Verify that the automated tests function properly.
  – Verify that the automated tests are properly maintained.
  – Verify that the automated tests are delivered with the system/software.
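As a minimal illustration of an automated regression test, the sketch below uses Python's standard unittest framework; the function under test is a hypothetical stand-in for any previously verified unit:

```python
import unittest

def discount(price, rate):
    # Hypothetical previously tested unit.
    return round(price * (1 - rate), 2)

class DiscountRegressionTests(unittest.TestCase):
    """Pins down previously verified behavior so changes that break it fail fast."""

    def test_known_good_values(self):
        # Expected values captured when the behavior was last verified.
        self.assertEqual(discount(100.0, 0.2), 80.0)
        self.assertEqual(discount(19.99, 0.0), 19.99)

    def test_full_discount_boundary(self):
        self.assertEqual(discount(50.0, 1.0), 0.0)
```

Running such a suite automatically (e.g., `python -m unittest` in every build) makes the cost of rerunning regression tests after each change close to zero.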

Related Problems: GEN-TPS-1 No Separate Test Plan, GEN-TPS-2 Incomplete Test Planning, GEN-TPS-3 Inadequate Test Schedule, GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security, GEN-MGMT-1 Inadequate Test Resources, GEN-PRO-9 Inadequate Test Maintenance, GEN-TTE-1 Over-reliance on Manual Testing, GEN-TTE-7 Tests not Delivered, GEN-TTE-8 Inadequate Test Configuration Management (CM)

2.2.6.2 TTS-REG-2 Regression Testing not Performed

Description: Insufficient72 regression testing is being performed after changes are made to the system/software.

Potential Symptoms:
• There may be no regression testing being performed.
• Parts of the system/software may not be being retested after they are changed (e.g., refactoring and fixes).
• Appropriate parts of the system/software may not be being retested after interfacing parts are changed (e.g., additions, modifications, deletions).
• Defects may be being traced to previously tested components.

72 The proper amount of regression testing depends on many factors including the criticality of the system/software, the potential risks associated with introducing new defects, the potential costs of fixing these defects, the potential costs of performing regression testing, and the resources available to perform regression testing. There is a natural tension between managers who want to minimize regression testing and testers who want to perform as much testing as practical.

Potential Consequences:
• Defects introduced into previously tested subsystems/software while making changes may:
  – not be found during regression testing
  – remain in the operational system

Potential Causes:
• Testing stakeholders (e.g., managers and the developers of unit tests) may:
  – mistakenly believe that performing regression testing is neither necessary nor cost effective because:
    · of the minor scope of most changes
    · the change will only have local effects and thus can't affect the rest of the system
    · system testing will catch any inadvertently introduced integration defects
    · they are overconfident that changes have not introduced any new defects
  – not be aware of the:
    · importance of regression testing
    · value of automating regression testing
• Regression testing may not be an explicit part of the testing process.
• Regression testing may not be incorporated into the Test and Evaluation Master Plan (TEMP) or System/Software Test Plan (STP).
• The schedule may contain little or no time for the performance and maintenance of automated tests.
• Regression tests may not be automated.
• The initially developed automated tests may not be maintained.
• The initially developed automated tests may not be delivered with the system/software.
• There may be insufficient time and staffing to perform regression testing, especially if it must be performed manually.
• Change impact analysis may not:
  – be performed (e.g., because of inadequate configuration management)
  – address the impact on regression testing
• The architecture and design of the system/software may be overly complex, with excessive coupling and insufficient encapsulation between components, thereby hiding interactions that may be broken by the changes.

Recommendations:
• Prepare:
  – Explicitly address regression testing in the project's:
    · test process documentation (e.g., procedures and guidelines)
    · TEMP or STP
    · master schedule
    · work breakdown structure (WBS)
  – Provide sufficient time in the schedule for performing and maintaining the regression tests.
• Enable:
  – Provide training/mentoring to the testing stakeholders in the importance and value of automated regression testing.
  – Automate as many of the regression tests as is practical.
  – Maintain the regression tests.
  – Deliver the regression tests with the system/software.
  – Provide sufficient time in the schedule to perform the regression testing.
  – Collect, analyze, and distribute the results of metrics concerning the performance of regression testing.
• Perform:
  – Perform change impact analysis to determine what part of the system/software needs to be regression tested.
  – Perform regression testing on the potentially impacted parts of the system/software.
  – Resist efforts to skip regression testing unless a change impact analysis has determined that retesting is not necessary.
• Verify:
  – Verify that the test process documentation addresses automated regression testing.
  – Verify that the TEMP/STP and WBS address automated regression testing.
  – Verify that the schedule provides sufficient time to automate and maintain the tests.
  – Verify that a sufficient number of the tests have been automated.
  – Verify that the automated tests function properly.
  – Verify that the automated tests are properly maintained.
  – Verify that the automated tests are delivered with the system/software.
  – Verify that change impact analysis is being performed and addresses the impact of the change on regression testing.
  – Verify that sufficient regression testing is being performed.

Related Problems: GEN-TPS-1 No Separate Test Plan, GEN-TPS-2 Incomplete Test Planning, GEN-TPS-3 Inadequate Test Schedule, GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security, GEN-MGMT-1 Inadequate Test Resources, GEN-PRO-9 Inadequate Test Maintenance, GEN-TTE-1 Over-reliance on Manual Testing, GEN-TTE-7 Tests not Delivered, GEN-TTE-8 Inadequate Test Configuration Management (CM)

2.2.6.3 TTS-REG-3 Inadequate Scope of Regression Testing

Description: The scope of regression testing is not sufficiently broad.

Potential Symptoms:
• Regression testing may be restricted to only the subsystem/software that changed.73
• Appropriate parts of the system/software may not be retested after interfacing parts are changed (e.g., additions, modifications, deletions).
• Defects may be found to trace to previously tested components.

73 Unfortunately, changes in one part of the system/software can sometimes impact apparently unrelated parts of the system/software. Defects also often unexpectedly propagate faults and failures beyond their local scope.


Potential Consequences:
• Defects introduced into previously tested subsystems/software while making changes may:
  – not be found during regression testing
  – remain in the operational system

Potential Causes:
• Testing stakeholders (e.g., managers and the developers of unit tests) may:
  – mistakenly believe that performing regression testing is neither necessary nor cost effective because:
    · of the minor scope of most changes
    · the change will only have local effects and thus can't affect the rest of the system
    · system testing will catch any inadvertently introduced integration defects
    · they are overconfident that changes have not introduced any new defects
  – be under significant cost and schedule pressure to minimize regression testing
• Determining the proper scope of regression testing may not be an explicit part of the testing process.
• The schedule may contain little or no time for the performance and maintenance of regression tests.
• Regression tests may not be automated.
• The initially developed automated tests may not be maintained.
• The initially developed automated tests may not be delivered with the system/software.
• There may be insufficient time and staffing to perform regression testing, especially if it must be performed manually.
• Change impact analysis may not:
  – be performed (e.g., because of inadequate configuration management)
  – address the impact on regression testing
• The architecture and design of the system/software may be overly complex, with excessive coupling and insufficient encapsulation between components, thereby hiding interactions that may be broken by the changes.

Recommendations:
• Prepare:
  – Explicitly address the proper scope of regression testing in the project's test process documentation (e.g., procedures and guidelines).
  – Provide sufficient time in the schedule for performing and maintaining the regression tests.
• Enable:
  – Provide training/mentoring to the testers in the proper scope of regression testing.
  – Automate as many of the regression tests as is practical.
  – Maintain the regression tests.
  – Deliver the regression tests with the system/software.
  – Provide sufficient time in the schedule to perform the regression testing.


  – Collect, analyze, and distribute the results of metrics concerning the performance of regression testing.
• Perform:
  – Perform change impact analysis to determine what part of the system/software needs to be regression tested.
  – Perform regression testing on the potentially impacted parts of the system/software.
  – Resist efforts to skip regression testing unless a change impact analysis has determined that retesting is not necessary.
• Verify:
  – Verify that the test process documentation addresses the proper scope of regression testing.
  – Verify that the schedule provides sufficient time to automate and maintain the tests.
  – Verify that a sufficient number of the tests have been automated.
  – Verify that the automated tests function properly.
  – Verify that the automated tests are properly maintained.
  – Verify that the automated tests are delivered with the system/software.
  – Verify that change impact analysis is being performed and addresses the impact of the change on regression testing.
  – Verify that sufficient regression testing is being performed.

Related Problems: GEN-TPS-1 No Separate Test Plan, GEN-TPS-2 Incomplete Test Planning, GEN-TPS-3 Inadequate Test Schedule, GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security, GEN-MGMT-1 Inadequate Test Resources, GEN-PRO-9 Inadequate Test Maintenance, GEN-TTE-1 Over-reliance on Manual Testing, GEN-TTE-7 Tests not Delivered, GEN-TTE-8 Inadequate Test Configuration Management (CM), TTS-REG-2 Regression Testing not Performed, TTS-REG-4 Only Low-Level Regression Tests

2.2.6.4 TTS-REG-4 Only Low-Level Regression Tests

Description: Only low-level (e.g., unit level) regression tests are rerun.

Potential Symptoms:
• Regression testing may be restricted to unit testing (and possibly some integration testing).
• Regression testing may not include system and/or SoS testing.

Potential Consequences:
• Integration defects introduced while changing existing previously tested subsystems/software will remain in the operational system because they will not be found during regression testing.

Potential Causes:
• TBD

Recommendations:
• Ensure that all relevant levels of regression testing (e.g., unit, integration, system, specialty, and SoS) are rerun when changes are made.
• Automate as many of these regression tests as practical so that rerunning them remains feasible.

Related Problems: TTS-REG-3 Inadequate Scope of Regression Testing

2.2.6.5 TTS-REG-5 Disagreement over Maintenance Test Resources

Description: The development and maintenance projects disagree over who is responsible for providing the test resources (e.g., staffing, budget, test work products) during maintenance.

Potential Symptoms:
• There is disagreement as to whether the resources for maintenance testing should be provided by the development or maintenance projects.

Potential Consequences:
• Insufficient resources will be made available to adequately support maintenance testing.
• Testing will be delayed while the source of these resources is negotiated.

Potential Causes:
• TBD

Recommendations:
• Ensure that the funding for maintenance testing is clearly assigned to either the development project or the sustainment project.
• Include funding responsibilities in the transition plan (if there is one).

Related Problems: GEN-TPS-2 Incomplete Test Planning


3 Conclusion

3.1 Testing Problems

There are many testing problems that can occur during the development or maintenance of software-reliant systems and software applications. While no project is likely to be so poorly managed and executed as to experience the majority of these problems, most projects will suffer several of them. Similarly, while exhibiting these testing problems does not guarantee failure, these problems are definitely risks that need to be managed.

The 77 common problems involving how testing is performed have been grouped into the following 14 categories:
• General Testing Problems
  – Requirements Testing Problems
  – Test Planning and Scheduling Problems
  – Stakeholder Involvement and Commitment Problems
  – Management-related Testing Problems
  – Test Organization and Professionalism Problems
  – Test Process Problems
  – Test Tools and Environments Problems
  – Test Communication Problems
• Testing Type Specific Problems
  – Unit Testing Problems
  – Integration Testing Problems
  – Specialty Engineering Testing Problems
  – System Testing Problems
  – System of Systems (SoS) Testing Problems
  – Regression Testing Problems

3.2 Common Consequences

While different testing problems have different proximate negative consequences, they all tend to contribute to the following overall ultimate results:
• The testing effort is less effective and efficient.
• Some defects are discovered later than they should be, when they are more difficult to localize and fix.
• The testers must work unsustainably long hours, causing them to become exhausted and therefore make excessive numbers of mistakes.
• The software-reliant system or software application is delivered late and over budget because of extra unplanned time and effort spent finding and fixing defects late during development.


• In spite of this extra budget and schedule, the software-reliant system or software application is still delivered and placed into operation with more residual defects than either expected or necessary.

3.3 Common Solutions

In addition to the individual problem-specific recommendations provided in the preceding problem specifications, the following general solutions are applicable to most of the common testing problems:

• Prevention Solutions – The following solutions can prevent the problems from occurring in the first place:
  – Formally require the solutions – Customer representatives formally require the solutions to the testing problems in the appropriate documentation, such as the Request for Proposals and Contract.
  – Mandate the solutions – Managers, chief engineers (development team leaders), or chief testers (test team leaders) explicitly mandate the solutions to the testing problems in the appropriate documentation, such as the System Engineering Management Plan (SEMP), System/Software Development Plan (SDP), Test Plan(s), and/or Test Strategy.
  – Provide training – Chief testers or trainers provide appropriate amounts and levels of test training to relevant personnel (such as acquisition staff, management, testers, and quality assurance) that covers the potential testing problems and how to prevent, detect, and react to them.
  – Management support – Managers explicitly state (and provide) their support for testing and the need to avoid the commonly occurring test problems.

• Detection Solutions – The following solutions enable existing problems to be identified and diagnosed:
  – Evaluate documentation – Review, inspect, or walk through the test-related documentation (e.g., the Test Plan and the test sections of development plans).
  – Oversight – Provide acquirer, management, quality assurance, and peer oversight of the testing process as it is performed.
  – Metrics – Collect, analyze, and report relevant test metrics to stakeholders (e.g., acquirers, managers, technical leads or chief engineers, and chief testers).
• Reaction Solutions – The following solutions help to solve existing problems once they are detected:
  – Reject test documentation – Customer representatives, managers, and chief engineers refuse to accept test-related documentation until identified problems are solved.
  – Fail the test – Customer representatives, managers, and chief engineers refuse to accept the system/subsystem/software under test until identified problems (e.g., in test environments, test procedures, or test cases) are solved. Rerun the tests after prioritizing and fixing the associated defects.
  – Provide training – Chief testers or trainers provide appropriate amounts and levels of remedial test training to relevant personnel (such as acquisition staff, management, testers, and quality assurance) that covers the observed testing problems and how to prevent, detect, and react to them.
  – Update process – Chief engineers, chief testers, and/or process engineers update the test process documentation to minimize the likelihood of reoccurrence of the observed testing problems.
  – Formally raise risk – Raise existing test problems as formal risks and inform both project management and the customer representative.
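As a minimal illustration of the kind of test metric worth collecting and reporting to stakeholders, the sketch below computes Defect Detection Percentage (the share of all known defects that testing caught before release); the function name and the "report 100% when no defects are known" convention are assumptions for this example, not from the text:

```python
# Hypothetical sketch: Defect Detection Percentage (DDP), one commonly
# reported test effectiveness metric.
# DDP = defects found by testing / (defects found by testing + defects
# that escaped to operation), expressed as a percentage.

def defect_detection_percentage(found_in_test: int, found_after_release: int) -> float:
    total = found_in_test + found_after_release
    if total == 0:
        # Convention assumed here: with no known defects anywhere, report 100%.
        return 100.0
    return 100.0 * found_in_test / total

# Example: testing found 90 defects; 10 more escaped into operation.
print(defect_detection_percentage(90, 10))  # 90.0
```

A falling DDP across releases is one concrete signal that several of the testing problems above (e.g., inadequate regression testing or test environments) may be occurring.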

4 Potential Future Work

The contents of this document were not the result of a formal academic study. Rather, they were derived largely from the author’s 30+ years of experience assessing and taking part in numerous projects, as well as from numerous discussions with testing subject matter experts.

As such, the current qualitative document leaves several important quantitative questions unanswered:

• Frequency. What is the probability distribution of these problems? Which problems occur most often? Which problems tend to cluster together?
• Impact. Which problems have the largest negative consequences? What are the probability distributions of harm caused by each problem?
• Risk. Based on the above frequencies and impacts, which of these problems cause the greatest risks? Given these risks, how should one prioritize the identification and resolution of these problems?
• Distribution. Do different problems tend to occur with different probabilities in different application domains (such as commercial vs. governmental vs. military, or web vs. IT vs. embedded systems)?

Provided sufficient funding, it is the author’s intent to turn this document into an industry survey and to perform a formal study to answer these questions.
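The risk question above amounts to ranking problems by exposure (estimated frequency times estimated impact). The sketch below illustrates that arithmetic; the frequency and impact numbers are invented for illustration only, not findings of this document:

```python
# Hypothetical sketch of risk-based prioritization of testing problems:
# risk exposure = estimated frequency (probability) x estimated impact,
# then sort highest exposure first. All numbers are illustrative guesses.

problems = {
    "GEN-TPS-5 Inadequate Test Schedule": (0.8, 7),          # (frequency, impact)
    "TTS-REG-2 Regression Testing not Performed": (0.5, 9),
    "GEN-COM-1 Inadequate Defect Reports": (0.6, 4),
}

by_risk = sorted(problems.items(),
                 key=lambda kv: kv[1][0] * kv[1][1],
                 reverse=True)

# Exposures: 0.8*7 = 5.6, 0.5*9 = 4.5, 0.6*4 = 2.4
for name, (freq, impact) in by_risk:
    print(f"{freq * impact:4.1f}  {name}")
```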


5 Acknowledgements

This paper has been provided for review to over 191 professionals and academics from 33 countries and incorporates comments and recommendations received from the following individuals, whom I would like to acknowledge:

1. Vince Alcalde, Independent Consultant, Australia
2. Laxmi Bhat, Minerva Networks, USA
3. Robert V. Binder, System Verification Associates, USA
4. Peter Bolin, Revolution IT Pty Ltd, Australia
5. Alexandru Cosma, ISDC, Romania
6. Jorge Alberto De Flon, Servicio de Administración Tributara (SAT), Mexico
7. Lee Eldridge, Independent Consultant, Australia
8. Eliazar Elisha, University of Liverpool, UK
9. Sam Harbaugh, Integrated Software Inc., USA
10. M. E. Hom, Compass360 Consulting, USA
11. Thanh Cong Huynh, LogiGear, Vietnam
12. Ronald Kohl, Independent Consultant, USA
13. Wido Kunde, Baker Hughes, Germany
14. Philippe Lebacq, Toyota Europe, Belgium
15. Stephen Masters, Software Engineering Institute, USA
16. Ken Niddefer, Software Engineering Institute, USA
17. Anne Nieberding, Independent Consultant, USA
18. William Novak, Software Engineering Institute, USA
19. Mahesh Palan, Calypso Technology, USA
20. Dan Pautler, Elekta, USA
21. Mark Powel, Attwater Consulting, USA
22. James Redpath, Sogeti, USA
23. Sudip Saha, Navigators Software, India
24. Alejandro Salado, Kayser – Threde GmbH, Germany
25. Matt Sheranko, Knowledge Code, USA
26. Oleg Spozito, Independent Consultant, Canada
27. Barry Stanly, Independent Consultant, USA
28. Lou Wheatcraft, Requirements Experts, USA
29. Thomas Zalewski, Texas State Government, USA

Appendix A: Glossary

Analysis (verification) – the verification method in which established technical or mathematical models or simulations, algorithms, or other scientific principles and procedures are used to provide evidence that a work product (e.g., document, software application, or system) meets its specified requirements

Black-box testing (a.k.a., interface testing) – any method of testing the externally visible behavior and characteristics of software without regard to its internal structures or workings
Boundary value testing – the testing technique in which test cases are selected just inside, on, and just outside of each boundary of an equivalence class of potential test cases74
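To make the boundary value definition concrete, here is a hypothetical sketch (the validator and its 0–120 range are invented for illustration) in which test cases sit just outside, on, and just inside each boundary of the valid equivalence class:

```python
# Hypothetical illustration of boundary value testing for a validator
# whose equivalence class of valid inputs is the inclusive range [0, 120].

def is_valid_age(age: int) -> bool:
    return 0 <= age <= 120

# Test cases chosen just outside, on, and just inside each boundary.
boundary_cases = {
    -1: False,   # just outside the lower boundary
    0: True,     # on the lower boundary
    1: True,     # just inside the lower boundary
    119: True,   # just inside the upper boundary
    120: True,   # on the upper boundary
    121: False,  # just outside the upper boundary
}

for age, expected in boundary_cases.items():
    assert is_valid_age(age) == expected
```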

Branch coverage – the type of code coverage in which test cases have executed each branch of each control structure (e.g., If-Then-Else and Case statement) in the software under test
Code coverage – a measure of the degree to which testing executes the source code within a program
Condition coverage (a.k.a., predicate coverage) – the type of code coverage in which test cases have caused each Boolean sub-expression to be evaluated both to true and to false in the software under test75
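The difference between these two criteria can be seen in a hypothetical sketch (the function and its inputs are invented for illustration): branch coverage needs only the True and False branches to execute, while condition coverage needs each Boolean sub-expression to take both values:

```python
# Hypothetical illustration contrasting branch coverage with condition
# coverage for one If-Then-Else guarded by a compound Boolean expression.

def grant_access(is_admin: bool, has_token: bool) -> str:
    if is_admin or has_token:   # two sub-expressions, two branches
        return "granted"
    return "denied"

# Branch coverage: the True branch and the False branch both execute.
branch_suite = [(True, False), (False, False)]

# Condition coverage: each sub-expression evaluates to both True and
# False somewhere in the suite (note Python's `or` short-circuits, so
# has_token is only evaluated when is_admin is False).
condition_suite = [(True, False), (False, True), (False, False)]

for args in branch_suite + condition_suite:
    assert grant_access(*args) in ("granted", "denied")
```

As footnote 75 notes, condition coverage does not necessarily imply decision coverage, which is why the two criteria are defined separately.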

Decision coverage – the type of code coverage in which test cases have met as well as not met each condition of each branch of each control structure (e.g., If-Then-Else and Case statement) in the software under test
Defect – any flaw resulting from an error made during development that will cause the system to perform in an unintended or unanticipated manner if executed (possibly only under certain circumstances)76

Demonstration – the verification method in which a system or subsystem is observed during operation under specific scenarios to provide visual evidence of whether it behaves properly
Derived requirement – any requirement that is implied or inferred from other requirements or from applicable standards, laws, policies, common practices, management and business decisions, or constraints
Entry/exit coverage – the type of code coverage in which test cases have executed every possible call and return of the function in the software under test
Error – any human mistake (e.g., an incorrect, missing, extra, or improperly timed action) that can cause erroneous input or a defect77

Erroneous input – any incorrect input values (i.e., those that do not match the actual or required values)
Error tolerance – the degree to which the system detects erroneous input (e.g., from a human or failed sensor) and responds properly to avoid faults and failures

74 All test cases within the equivalence class are considered equivalent because they all follow the same path through the code with regard to branching.

75 Condition coverage does not necessarily imply decision coverage.

76 The defect could be in software (e.g., incorrect statements or declarations), in hardware (e.g., a flaw in material or workmanship, manufacturing defects), or in data (e.g., incorrect hardcoded values in configuration files). A software defect (a.k.a., bug) is the concrete manifestation within the software of one or more errors. One error may cause several defects, and multiple errors may cause the same defect.

77 If an error occurs during development, it can create a defect. If the error occurs during operation, it can produce erroneous input that can cause a fault.

Failure – the system ceases to meet one or more of its requirements (i.e., fails to exhibit a mandatory behavior or characteristic)78

Failure tolerance – the degree to which the system detects the existence of failures and reacts appropriately to avoid harm (e.g., by going into a degraded mode or failing into a safe and secure state)
False negative test result – the test result implies that no underlying defect exists although a defect actually exists (i.e., the test fails to expose the defect)79

False positive test result – the test result implies the existence of an underlying defect although no such defect actually exists80

Fault – any abnormal system-internal condition (e.g., incorrect stored data value, incorrect subsystem state, or execution of the wrong block of code) which may cause the system to fail81

Fault tolerance – the degree to which the system detects the existence of faults and reacts appropriately to avoid failures
Function coverage – the type of code coverage in which test cases have called each function (a.k.a., procedure, method, or subroutine) in the software under test
Functional requirement – any requirement that specifies a mandatory behavior of a system or subsystem
Functional testing – any testing intended to cause the implementation of a system function to fail in order to identify associated defects, as well as to provide some information that the function is correctly implemented
Fuzz testing – the testing technique in which random inputs are used to cause the system to fail
Incremental development cycle – any development cycle in which the development process (including testing) is repeated to add additional capabilities
Inspection – the verification method in which a static work product is observed using one or more of the five senses, simple physical manipulation, and mechanical and electrical gauging and measurement to determine if it contains defects
Integration testing – the incremental testing of larger and larger subsystems as they are integrated to form the overall system
Iterative development cycle – any development cycle in which all or part of the development process (including testing) is repeated to modify an existing subsystem or software component, typically to correct defects or make improvements (e.g., refactoring the architecture/design or replacing existing components)
Load testing – TBD
Loop coverage – the type of code coverage in which test cases have executed every loop zero times, once, and more than once in the software under test
Metadata – TBD
Path coverage – the type of code coverage in which test cases have executed every possible route through the software under test82

78 Failure often refers to both the condition of not meeting requirements as well as the event that causes this condition to occur.

79 There are many reasons for false negative test results. They are most often caused by selecting test inputs and preconditions that do not exercise the underlying defect.

80 A false positive test result could be caused by bad test input data, incorrect test preconditions, incorrect test oracles (outputs and postconditions), defects in a test driver or test stub, an improperly configured test environment, etc.

81 A fault can be caused by erroneous input or execution of a defect. Unless properly handled, a fault can cause a failure.

Parallel development cycle – TBD
Penetration testing – the testing technique in which a tester plays the role of an attacker and tries to penetrate the system’s defenses
Post-condition – any assertion that must hold following the successful execution of the associated function (e.g., use case path)
Precondition – any assertion that must hold prior to the successful execution of the associated function (e.g., use case path)
Quality requirement – any requirement that specifies a mandatory quality characteristic in terms of a minimum acceptable level of some associated quality attribute
Regression testing – the repetition of testing after a change has been made to ensure that the change did not inadvertently introduce any defects
Requirement – any requirement that specifies a mandatory capability of a specific product or type of product (e.g., system or subsystem)
Requirements management tool – TBD
Requirements metadata – TBD
Requirements trace – TBD
Statement coverage – the type of code coverage in which test cases have executed each statement in the software under test
Structural testing – synonym for white-box testing
System testing – TBD
Test – TBD
Testability – TBD
Test asset – TBD
Test case – TBD
Test case selection criteria – the criteria that are used to determine the actual test cases to create and run
Test completion criteria – TBD
Test driver – TBD

82 This level of code coverage is usually impractical or impossible.
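The loop coverage criterion defined above can be made concrete with a hypothetical sketch (the summing function is invented for illustration): the suite drives the loop through zero, exactly one, and more than one iteration:

```python
# Hypothetical illustration of loop coverage: each loop is executed
# zero times, exactly once, and more than once by the test suite.

def total(values):
    s = 0
    for v in values:   # the loop whose iteration counts we vary
        s += v
    return s

loop_suite = {
    (): 0,           # zero iterations
    (5,): 5,         # exactly one iteration
    (1, 2, 3): 6,    # more than one iteration
}

for inputs, expected in loop_suite.items():
    assert total(inputs) == expected
```

Note that this same suite also achieves statement coverage of `total`, since every statement executes at least once.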


Test engineer – TBD
Test environment – TBD
Tester – TBD
Testing – the verification method in which a system/subsystem is executed under controlled preconditions (e.g., inputs and pretest mode and states) and actual postconditions (e.g., outputs and post-test mode and states) are compared with expected/required postconditions
Testing method – TBD
Test input – TBD
Test oracle – any source of the information defining correct and expected system behavior and test postconditions
Test output – TBD
Test plan – TBD
Test script – TBD
Test stakeholder – TBD
Test stub – TBD
Test tool – TBD
Trigger event – TBD
Unit testing – TBD
Use case – TBD
Use case path – TBD
Validation – TBD
Verification – TBD
Vulnerability – any system-internal weakness that can increase the likelihood or harm severity of one or more abuses (i.e., mishaps or misuses)
Vulnerability testing – the testing technique the goal of which is to expose a system vulnerability (i.e., a defect or weakness) that can be exploited to cause a mishap or misuse
White-box testing (a.k.a., structural and implementation testing) – any method of testing the internal, typically encapsulated structures or workings of software, as opposed to its externally visible behavior, often performed to meet some kind of code coverage criteria83

83 Typical code coverage criteria include branch, decision, path, and statement coverage.

Testing Problems | Potential Symptom(s) Observed | Potential Consequence(s) Observed | Potential Cause(s) Identified | Recommendations Implemented

Test Planning and Scheduling Problems
GEN-TPS-1 No Separate Test Plan
GEN-TPS-2 Incomplete Test Planning
GEN-TPS-3 Test Plans Ignored
GEN-TPS-4 Test Case Documents rather than Test Plans
GEN-TPS-5 Inadequate Test Schedule
GEN-TPS-6 Testing is Postponed

Stakeholder Involvement and Commitment Problems
GEN-SIC-1 Wrong Testing Mindset
GEN-SIC-2 Unrealistic Testing Expectations / False Sense of Security
GEN-SIC-3 Lack of Stakeholder Commitment

Management-related Testing Problems
GEN-MGMT-1 Inadequate Test Resources
GEN-MGMT-2 Inappropriate External Pressures
GEN-MGMT-3 Inadequate Test-related Risk Management
GEN-MGMT-4 Inadequate Test Metrics
GEN-MGMT-5 Test Lessons Learned Ignored

Test Organization and Professionalism Problems
GEN-TOP-1 Lack of Independence
GEN-TOP-2 Unclear Testing Responsibilities
GEN-TOP-3 Inadequate Testing Expertise


Test Process Problems
GEN-PRO-1 Testing Process Not Integrated Into Engineering Process
GEN-PRO-2 One-Size-Fits-All Testing
GEN-PRO-3 Inadequate Test Prioritization
GEN-PRO-4 Functionality Testing Overemphasized
GEN-PRO-5 Black-box System Testing Overemphasized
GEN-PRO-6 White-box Unit and Integration Testing Overemphasized
GEN-PRO-7 Too Immature for Testing
GEN-PRO-8 Inadequate Test Evaluations
GEN-PRO-9 Inadequate Test Maintenance

Test Tools and Environments Problems
GEN-TTE-1 Over-reliance on Manual Testing
GEN-TTE-2 Over-reliance on Testing Tools
GEN-TTE-3 Insufficient Test Environments
GEN-TTE-4 Poor Fidelity of Test Environments
GEN-TTE-5 Inadequate Test Environment Quality
GEN-TTE-6 System/Software Under Test Behaves Differently
GEN-TTE-7 Tests not Delivered
GEN-TTE-8 Inadequate Test Configuration Management (CM)


Test Communication Problems


GEN-COM-1 Inadequate Defect Reports
GEN-COM-2 Inadequate Test Documentation
GEN-COM-3 Source Documents Not Maintained
GEN-COM-4 Inadequate Communication Concerning Testing

Requirements-related Testing Problems
GEN-REQ-1 Ambiguous Requirements
GEN-REQ-2 Missing Requirements
GEN-REQ-3 Incomplete Requirements
GEN-REQ-4 Incorrect Requirements
GEN-REQ-5 Unstable Requirements
GEN-REQ-6 Poor Derived Requirements
GEN-REQ-7 Verification Methods Not Specified
GEN-REQ-8 Lack of Requirements Tracing

Unit Testing Problems
TTS-UNT-1 Unstable Design
TTS-UNT-2 Inadequate Design Detail
TTS-UNT-3 Unit Testing Considered Unimportant


Integration Testing Problems
TTS-INT-1 Defect Localization
TTS-INT-2 Unavailable Components
TTS-INT-3 Inadequate Self-Test

Specialty Engineering Testing Problems


TTS-SPC-1 Inadequate Capacity Testing
TTS-SPC-2 Inadequate Concurrency Testing
TTS-SPC-3 Inadequate Performance Testing
TTS-SPC-4 Inadequate Reliability Testing
TTS-SPC-5 Inadequate Robustness Testing
TTS-SPC-6 Inadequate Safety Testing
TTS-SPC-7 Inadequate Security Testing
TTS-SPC-8 Inadequate Usability Testing

System Testing Problems
TTS-SYS-1 Testing Robustness Requirements is Difficult
TTS-SYS-2 Lack of Test Hooks
TTS-SYS-3 Testing Code Coverage is Difficult


System of Systems (SoS) Testing Problems
TTS-SoS-1 Inadequate SoS Planning
TTS-SoS-2 Unclear SoS Testing Responsibilities
TTS-SoS-3 Inadequate Funding for SoS Testing
TTS-SoS-4 SoS Testing not Properly Scheduled
TTS-SoS-5 Poor or Missing SoS Requirements
TTS-SoS-6 Inadequate Test Support from Individual Systems
TTS-SoS-7 Inadequate Defect Tracking Across Projects
TTS-SoS-8 Finger-Pointing

Regression Testing Problems


TTS-REG-1 Insufficient Regression Test Automation
TTS-REG-2 Regression Testing not Performed
TTS-REG-3 Inadequate Scope of Regression Testing
TTS-REG-4 Only Low-Level Regression Tests
TTS-REG-5 Disagreement over Maintenance Test Resources
