Optimizing test cases

Algorithms For Optimizing Test Cases Presented by team 4 Jim Kile Don Little Samir Shah


Page 1: Optimizing test cases

Algorithms For Optimizing Test Cases

Presented by team 4: Jim Kile, Don Little, Samir Shah

Page 2: Optimizing test cases

Software Testing – So What?

July 22, 1962 – Mariner I space probe
  Mission control destroys the rocket
1985-1987 – Therac-25 medical accelerator
  At least five patients die
  Others seriously injured
November 2000 – National Cancer Institute, Panama City
  At least eight patients die
  Another 20 receive significant overdoses
  Physicians indicted for murder

Page 3: Optimizing test cases

What Is Test Case Optimization?

Typically applies to unit test cases where coverage approaches 100%
Implies ordering execution such that:
  Rate of fault detection is increased
  Amount of time to perform regression is reduced
Elimination of unnecessary test cases during regression runs

Page 4: Optimizing test cases

What Are Goals For Test Case Execution?

Increase the rate of early fault detection and correction
  Find bugs early so they can be corrected early
Regression test only those areas that have changed
Reduce the amount of time to execute full unit regression test suites

Page 5: Optimizing test cases

Why Is Test Case Optimization Important?

Execution complexity
Non-linear problem
Elapsed time to execute
Computing resources
Human resources
  Wait time for test completion
  Late defect detection and correction

Page 6: Optimizing test cases

Who Develops Test Cases?

Developers

Dedicated quality assurance staff

Automated test generation techniques

Page 7: Optimizing test cases

What Are Benefits Of Automated Test Generation Techniques?

No cognitive bias
Better able to:
  Generate test cases that concentrate on error-prone areas
  Produce highly novel test cases
Better coverage overall

Page 8: Optimizing test cases

Why Automated Test Optimization?

Generating a set of basic test cases is easy
  Easily cover 50–70% of faults
Improving a test set’s quality is hard
  Improving to 90–100%
  Time consuming and expensive

Page 9: Optimizing test cases

What Is Difference Between Optimization And Test Case Generation?

Optimization problem
  Single goal is sought
Test case generation
  No single goal
  Optimized coverage of code under test

Page 10: Optimizing test cases

What Are Benefits Of GA Vs. Random Test Generation?

Random test generation
  Uniform distribution
GA-based search process
  More focused test set
  Set focuses on identified flaws
  Some highly novel test cases

Page 11: Optimizing test cases

How Do Genetic Algorithms Work?

A GA operates on strings of digits called chromosomes

Each digit that makes up the chromosome is called a gene

Collection of such chromosomes makes up a population

Page 12: Optimizing test cases

How Do Genetic Algorithms Work?

Each chromosome has a fitness value
Fitness value determines probability of survival in next generation

Page 13: Optimizing test cases

How Do Genetic Algorithms Work?

Algorithm begins with random population
Algorithm evolves incrementally, generation by generation
Produces a structure from iterative development
Reproduction
  Combine with another chromosome (crossover)
  Adjusted slightly (mutation)
Original chromosome may have a poor/low fitness
  Create offspring with much higher fitness
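The generational loop described on these slides can be sketched in a few lines. This is a minimal illustration, not the presenters' implementation: the function name `evolve`, the toy fitness (count of 1 genes), the population size, the mutation rate, and single-point crossover are all assumptions chosen for brevity.

```python
import random

def evolve(fitness, length=36, pop_size=20, generations=30, p_mut=0.01, seed=0):
    """Toy generational GA over bit-list chromosomes (illustrative only)."""
    rng = random.Random(seed)
    # Random initial population: each chromosome is a list of 0/1 genes.
    pop = [[rng.randint(0, 1) for _ in range(length)] for _ in range(pop_size)]
    for _ in range(generations):
        # Fitness-weighted choice of parents (+1 avoids an all-zero weight list).
        weights = [fitness(c) + 1 for c in pop]
        nxt = []
        while len(nxt) < pop_size:
            a, b = rng.choices(pop, weights=weights, k=2)
            cut = rng.randrange(1, length)           # single-point crossover
            child = a[:cut] + b[cut:]
            # Mutation: flip each gene with a small probability.
            child = [g ^ 1 if rng.random() < p_mut else g for g in child]
            nxt.append(child)
        pop = nxt
    return max(pop, key=fitness)

best = evolve(fitness=sum)   # hypothetical fitness: number of 1 genes
```

In a testing setting the fitness would instead score how well a chromosome, decoded into test inputs, exposes faults.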

Page 14: Optimizing test cases

Uses of GAs In Testing

1) Generate a range of effective test data with fault-revealing power
  Both papers used this technique
2) Introduce faults in software under test to determine effectiveness of test cases
  Only one paper used this technique

Page 15: Optimizing test cases

How Is Quality Of Test Cases Evaluated?

Number of detected injected faults
  A fault detected by a test case is killed
  Otherwise, it’s alive
Develop a score
  Errors killed by test case
  Divided by test set

Page 16: Optimizing test cases

How Does GA Work Specifically? Example

Genetic algorithm
  Generates random population of binary digits
  For example each chromosome may be 36 bits long
  Each twelve-bit segment represents one of the sides of a triangle
Chromosomes cross between gene 5 and 32

Page 17: Optimizing test cases

How Does GA Work Specifically? Crossover

Before crossover between gene 5 and 32

1) 111001110101 100101100110 001010111000

2) 111101011010 100101101010 101110110100

After crossover

1) 111001011010 100101101010 101110111000

2) 111101110101 100101100110 001010110100
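The before/after strings on this slide are consistent with swapping genes 5 through 32 (counted from 1, inclusive) between the two parents. A small sketch that reproduces them, with the helper name `two_point_crossover` being an assumption:

```python
def two_point_crossover(a, b, first, last):
    """Swap genes first..last (1-indexed, inclusive) between two chromosomes."""
    i, j = first - 1, last
    child_a = a[:i] + b[i:j] + a[j:]
    child_b = b[:i] + a[i:j] + b[j:]
    return child_a, child_b

# The two parent chromosomes from the slide, with the spaces removed.
p1 = "111001110101100101100110001010111000"
p2 = "111101011010100101101010101110110100"
c1, c2 = two_point_crossover(p1, p2, 5, 32)
print(c1)  # 111001011010100101101010101110111000
print(c2)  # 111101110101100101100110001010110100
```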

Page 18: Optimizing test cases

How Does Mutation Work Specifically?

Mutation will randomly switch genes in population

Gene 23 in chromosome 1 was switched from a 1 to 0

Page 19: Optimizing test cases

How Does Mutation Work?

Before mutation of gene 23 chromosome 1

1) 111001110101 100101100110 001010111000

After mutation

1) 111001110101 100101100100 001010111000
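The single-gene mutation shown here can be reproduced as follows. The helper name `flip_gene` is an assumption; in a running GA the position would be chosen at random rather than passed in.

```python
def flip_gene(chromosome, position):
    """Flip the gene at a 1-indexed position in a bit-string chromosome."""
    i = position - 1
    flipped = "1" if chromosome[i] == "0" else "0"
    return chromosome[:i] + flipped + chromosome[i + 1:]

before = "111001110101100101100110001010111000"
after = flip_gene(before, 23)
print(after)  # 111001110101100101100100001010111000
```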

Page 20: Optimizing test cases

Generation Of Next Population

Based on a “roulette wheel”
Where fitness determines the probability of selection
  Those with higher fitness have more chance of offspring in the next generation than their less fit companions
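A minimal sketch of roulette-wheel selection, assuming non-negative fitness values. The function name and the three-individual population are illustrative only.

```python
import random

def roulette_select(population, fitnesses, rng):
    """Pick one individual with probability proportional to its fitness."""
    total = sum(fitnesses)
    pick = rng.uniform(0, total)          # where the "ball" lands on the wheel
    running = 0.0
    for individual, fit in zip(population, fitnesses):
        running += fit
        if pick <= running:
            return individual
    return population[-1]                 # guard against floating-point round-off

rng = random.Random(1)
pop = ["A", "B", "C"]
fits = [1.0, 3.0, 6.0]                    # C owns 60% of the wheel
counts = {p: 0 for p in pop}
for _ in range(10_000):
    counts[roulette_select(pop, fits, rng)] += 1
```

Over many spins the selection counts track the fitness shares, so fitter chromosomes contribute more offspring without the less fit ones being excluded outright.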

Page 21: Optimizing test cases

Dynamic Software Testing Techniques

Structural – first paper
  Code coverage
  Boundary conditions
  Individual or combined statement traversal
  Path coverage
Functional – second paper
  Confirms that a function from specification is correctly implemented
  No analysis of the structure of the program

Page 22: Optimizing test cases

First paper

Automatic test case optimization: A bacteriologic algorithm
Benoit Baudry, Franck Fleurey, Jean-Marc Jézéquel and Yves Le Traon

Page 23: Optimizing test cases

Contribution

Finding an optimal set of test cases through revealing a test case’s “fault revealing power”

Building confidence in the test suite through “mutation analysis”

Page 24: Optimizing test cases

Bacteriologic Algorithm: Theoretical Basis

Adapted from genetic algorithms
Inspired by evolutionary ecology and bacteriologic adaptation
Similarities in this problem domain
Can’t generate a single perfect test suite

Page 25: Optimizing test cases

Bacteriologic Algorithm: Framework

Page 26: Optimizing test cases

Bacteriologic Algorithm: Basic Functions

Initialization
Iterate incrementally creating new generation
Limitation
  Only works on test cases of similar size

Page 27: Optimizing test cases

Bacteriologic Algorithm: Initialization

Initial test cases either written by hand or automatically generated

For the experiment test cases were randomly generated

Initial size set to 25 nodes

Page 28: Optimizing test cases

Bacteriologic Algorithm: Computing Fitness

Tool used to generate test case mutants
Uses the mutation score of a set of test cases as that set’s fitness function
  MS(T) = 100(d/(m - equiv))
  where d is the number of mutants killed, m the total number of mutants, and equiv the number of equivalent mutants
Test cases are executed to determine how many mutants they can kill
  Global mutation score computed
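The mutation score formula MS(T) = 100(d/(m - equiv)) is simple enough to compute directly. In this sketch the parameter names are assumptions, read in the usual mutation-analysis sense: `killed` is d, `total_mutants` is m, and `equivalent` is the number of mutants that cannot be distinguished from the original program.

```python
def mutation_score(killed, total_mutants, equivalent):
    """MS(T) = 100 * d / (m - equiv), as a percentage."""
    killable = total_mutants - equivalent
    if killable <= 0:
        raise ValueError("no non-equivalent mutants to score against")
    return 100.0 * killed / killable

# Hypothetical numbers: 55 mutants, 5 of them equivalent, 45 killed by the test set.
score = mutation_score(killed=45, total_mutants=55, equivalent=5)
print(score)  # 90.0
```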

Page 29: Optimizing test cases

Bacteriologic Algorithm: Memorization

Used to compute relative fitness
  Test case mutation score relative to the solution set’s mutation score
Test cases are selected whose relative score exceeds the memorization threshold

$\mathrm{relMS}(TCS, tc) = \mathrm{MS}(TCS \cup \{tc\}) - \mathrm{MS}(TCS)$
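One way to read the relative mutation score is as the gain in mutation score obtained by adding a test case to the current solution set. The sketch below follows that reading. The representation (each test case as the set of mutant ids it kills), the function names, and the threshold value are all assumptions made for illustration.

```python
def ms(killed_sets, n_mutants):
    """Mutation score of a set of test cases, each given as the set of mutants it kills."""
    killed = set().union(*killed_sets)
    return 100.0 * len(killed) / n_mutants

def rel_ms(solution, candidate, n_mutants):
    """Gain in mutation score from adding `candidate` to `solution`."""
    return ms(solution + [candidate], n_mutants) - ms(solution, n_mutants)

n = 10                                  # hypothetical count of non-equivalent mutants
solution = [{1, 2, 3}, {3, 4}]          # already kills mutants 1..4 (score 40)
gain_new = rel_ms(solution, {4, 5, 6}, n)   # kills 5 and 6 in addition
gain_none = rel_ms(solution, {1, 4}, n)     # kills nothing new

threshold = 10.0                        # hypothetical memorization threshold
if gain_new > threshold:
    solution.append({4, 5, 6})          # memorize the useful test case
```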

Page 30: Optimizing test cases

Bacteriologic Algorithm: Mutation

Randomly selects test cases
  Selection is weighted by relative fitness of the test case
Selected cases and code are mutated to create new cases for the next generation
Code is represented by an abstract syntax tree
  Nodes in the tree are replaced

Page 31: Optimizing test cases

Bacteriologic Algorithm: Mutation – Abstract Syntax Tree

A finite, labeled directed tree
  Nodes are labeled by operators
  Edges represent operands
  Leaves contain variables or constants
Used in a parser
Range of all possible structures defined by the syntax
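Replacing a node in an abstract syntax tree can be illustrated with Python's standard `ast` module. This is a stand-in for whatever tooling the paper used, which the slides do not show. The transformer class name and the choice of swapping `+` for `*` are assumptions.

```python
import ast

class SwapAddForMult(ast.NodeTransformer):
    """Mutation operator: replace every '+' operator node with '*'."""
    def visit_BinOp(self, node):
        self.generic_visit(node)          # mutate nested expressions first
        if isinstance(node.op, ast.Add):
            node.op = ast.Mult()          # replace the operator node
        return node

tree = ast.parse("x = a + b")             # operator node with leaves a and b
mutant = SwapAddForMult().visit(tree)
mutated_source = ast.unparse(mutant)      # ast.unparse needs Python 3.9+
print(mutated_source)  # x = a * b
```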

Page 32: Optimizing test cases

Bacteriologic Algorithm: Mutation – Abstract Syntax Tree Example

x = a + b;

y = a * b;

while (y > a) {

a++;

x = a + b;

}

Page 33: Optimizing test cases

Bacteriologic Algorithm: Filtering

Filtering = removing
Two different implementations
  Delete any test case whose relative mutation score is equal to 0
    That is, the test case kills no mutant that the test cases in the solution set haven’t killed
  Reduce the coverage matrix by deleting redundant test cases
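A rough sketch of the first filtering idea: drop any test case that kills nothing beyond what the kept cases already kill. The greedy largest-first ordering, the dictionary representation, and the function name are assumptions; the slides do not say how the paper orders or stores test cases.

```python
def filter_redundant(test_cases):
    """test_cases maps a name to the set of mutant ids that test case kills."""
    kept, covered = [], set()
    # Visit the strongest test cases first so weaker duplicates are the ones dropped.
    for name in sorted(test_cases, key=lambda n: len(test_cases[n]), reverse=True):
        if test_cases[name] - covered:     # kills at least one new mutant
            kept.append(name)
            covered |= test_cases[name]
    return kept

cases = {"t1": {1, 2, 3}, "t2": {2, 3}, "t3": {4}, "t4": set()}
kept = filter_redundant(cases)
print(kept)  # ['t1', 't3']
```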

Page 34: Optimizing test cases

Bacteriologic Algorithm: Results

Page 35: Optimizing test cases

Bacteriologic Algorithm: Results

Comparison with genetic algorithm
  Both ran 50 times
Genetic algorithm results
  200 generations created
  Average mutation score of 85
  Required executing an average of 480,000 test cases

Page 36: Optimizing test cases

Bacteriologic Algorithm: Results

Bacteriologic algorithm results
  30 generations created
  Average mutation score of 96
  Required executing an average of 46,375 test cases

Page 37: Optimizing test cases

Second paper

Breeding software test cases with genetic algorithms
D. Berndt, J. Fisher, L. Johnson, J. Pinglikar and A. Watkins

Page 38: Optimizing test cases

Focus

Breeding software test cases using genetic algorithms as part of a software testing cycle
Uses automated test generation techniques
Evolving fitness function
  Relies on fossil record of organisms
Search behaviors
  Novelty
  Proximity
  Severity

Page 39: Optimizing test cases

Genetic Algorithm: Simple Triangle Classification Program (TRITYP)

Classify triangle by type
Three sides of the triangle
  Parameters x, y, and z
  Range 0 – 2000
Search space
  Illegal / legal triangle

Page 40: Optimizing test cases

Genetic Algorithm: Approach

Flaws were intentionally introduced into data for testing purposes
Errors introduced for specific ranges of x and y parameters
  X coordinate between 500 – 1000
  Y coordinate between 0 – 500
  Result in error
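To make the seeded-fault idea concrete, here is a small triangle classifier wrapped with a fault that fires in the x and y ranges named on the slide. The function names, the inclusive bounds, and the choice to signal the fault with an exception are assumptions; the original TRITYP program and its seeded flaws are not shown in the slides.

```python
def classify(x, y, z):
    """Classify a triangle from its three side lengths."""
    if min(x, y, z) <= 0 or x + y <= z or x + z <= y or y + z <= x:
        return "illegal"
    if x == y == z:
        return "equilateral"
    if x == y or y == z or x == z:
        return "isosceles"
    return "scalene"

def classify_with_seeded_fault(x, y, z):
    """Same classifier, but failing inside the seeded range (bounds assumed inclusive)."""
    if 500 <= x <= 1000 and 0 <= y <= 500:
        raise RuntimeError("seeded range error")
    return classify(x, y, z)

print(classify_with_seeded_fault(3, 4, 5))   # scalene
```

A test generator succeeds when it produces inputs such as (600, 100, 650) that land inside the faulty region.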

Page 41: Optimizing test cases

Genetic Algorithm: Search Space Illegal/Legal Triangle

Page 42: Optimizing test cases

Genetic Algorithm: How does it work specifically?

Generates random population of x, y and z coordinates as binary digits
Each chromosome is 36 bits long
  Each twelve-bit segment represents one of the sides of a triangle
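Decoding a 36-bit chromosome into three sides might look like the sketch below. The function name is an assumption. Note that twelve bits can encode 0 to 4095, while the stated parameter range is 0 to 2000; the slides do not say how values are mapped into that range, so this sketch returns the raw integers.

```python
def decode_sides(chromosome):
    """Split a 36-character bit string into three unsigned 12-bit integers."""
    if len(chromosome) != 36 or set(chromosome) - {"0", "1"}:
        raise ValueError("expected a 36-character string of 0s and 1s")
    return tuple(int(chromosome[i:i + 12], 2) for i in range(0, 36, 12))

sides = decode_sides("111001110101100101100110001010111000")
print(sides)  # (3701, 2406, 696)
```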

Page 43: Optimizing test cases

Genetic Algorithm: Fitness

Relative fitness function
Compares
  Particular chromosome’s fitness
  Historical information from the fossil record

Page 44: Optimizing test cases

Genetic Algorithm: Generating Software Test Cases

Variety of sources into test case breeding with genetic algorithms
  Powerful evolutionary
  Naturally parallel computational engine
Balance fitness with diversity
  Wide variety of test cases can be bred
Concepts of novelty, proximity and severity
  Used to create a relative or changing fitness function

Page 45: Optimizing test cases

Genetic Algorithm: Breeding Software Test Cases

Using genetic algorithms
Evolving fitness function
  Fossil record of organisms
Interesting search behaviors
  Novelty
  Proximity
  Severity

Page 46: Optimizing test cases

Genetic Algorithm: Novelty

Measure of the uniqueness of particular test case
Quantified by measuring distance in parameter space from previous invocations stored in the fossil record

$k_n \cdot \sum (c_{ij} - f_{ij})^2$

Page 47: Optimizing test cases

Genetic Algorithm: Proximity

Measure of closeness to other test cases that resulted in system failures

$k_p \cdot \sum (c_{ij} - e_{ij})^2$
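The symbols that survive on the novelty and proximity slides (weights k_n and k_p, a current test case c, fossil-record entries f, error entries e, and a square) suggest a weighted sum of squared parameter differences. The sketch below implements that reading only. Whether the paper also takes a square root or normalizes the sum is not recoverable from the slides, and the function and variable names are assumptions.

```python
def weighted_sq_distance(candidate, records, k):
    """k times the sum of squared parameter differences to every record."""
    return k * sum((c - r) ** 2
                   for record in records
                   for c, r in zip(candidate, record))

fossil_record = [(0, 0, 0), (10, 0, 0)]   # hypothetical past test cases (x, y, z)
error_record = [(10, 0, 0)]               # the subset that caused failures
candidate = (3, 4, 0)

novelty_term = weighted_sq_distance(candidate, fossil_record, k=1.0)
proximity_term = weighted_sq_distance(candidate, error_record, k=0.5)
print(novelty_term, proximity_term)  # 90.0 32.5
```

How the two terms are combined with severity into a single fitness value is left open here, since the slides describe the factors but not the combining rule.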

Page 48: Optimizing test cases

Genetic Algorithm: Severity

Measure of the seriousness of a system error

Page 49: Optimizing test cases

Genetic Algorithm: Diversity

Used to avoid being trapped by local maxima
In generating test cases, diversity means
  Emphasizing novelty
  Downplaying proximity
Simple rules, complex behavior
  Explorers
  Prospectors
  Miners

Page 50: Optimizing test cases

Genetic Algorithm: Explorer

Highly novel test case
Spread across the lightly populated regions of the test space
Once an error is discovered, the fitness function encourages more thorough testing of the region

Page 51: Optimizing test cases

Genetic Algorithm: Prospectors

Somewhat unique test cases
Near newfound errors
Both novelty and proximity
Prospectors uncover additional errors
Fitness function will reward points that are simply near other errors

Page 52: Optimizing test cases

Genetic Algorithm: Miners

More fully probe error areas

Page 53: Optimizing test cases

Genetic Algorithm: Explorer, Prospectors & Miners

Page 54: Optimizing test cases

Genetic Algorithm: Fossil Record

Contains all previously generated test cases
Information about type of error generated, if any
Provides a context for changing notions of fitness
Relative rather than absolute fitness function

Page 55: Optimizing test cases

Genetic Algorithm: Fitness Functions and Fossil Records

Genetic algorithm is used to generate good test cases
Compared to fossil records
Points awarded for novelty & proximity
Causes population to evolve

Page 56: Optimizing test cases

Genetic Algorithm: Visualizing the fossil record

Three dimensional representation
Using 200-by-200 cells to divide up the x-y parameter space
Bars represent the number of organisms
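The bar heights in such a plot are per-cell counts of fossil-record entries over the x-y plane. A minimal sketch of the binning step, assuming integer parameters: the slide does not make clear whether "200-by-200" is the size of each cell or the number of cells, so the cell size is left as a parameter here, and the function name and sample points are hypothetical.

```python
from collections import Counter

def cell_counts(points, cell_size):
    """Count fossil-record entries per (x, y) grid cell; z is ignored."""
    return Counter((x // cell_size, y // cell_size) for x, y, *_ in points)

# Hypothetical fossil record of (x, y, z) test cases.
fossils = [(0, 0, 5), (150, 199, 7), (250, 10, 1)]
counts = cell_counts(fossils, cell_size=200)
print(counts[(0, 0)], counts[(1, 0)])  # 2 1
```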

Page 57: Optimizing test cases

Genetic Algorithm: High Novelty & High Proximity

Fairly high novelty
  Forces the search to reward exploratory behavior
High proximity
  Fosters the collection of points where errors are detected
Three seeded range errors

Page 58: Optimizing test cases

Genetic Algorithm: Fossil Record

Page 59: Optimizing test cases

Genetic Algorithm: Low Novelty & Low Proximity

Weights are reversed
Little reward for novelty or proximity
More uniform distribution

Page 60: Optimizing test cases

Genetic Algorithm: Fossil Record

Page 61: Optimizing test cases

Genetic Algorithm: Results

Describes preliminary results from genetic algorithm based approach to software test case breeding
Project’s reliance on relative or changing fitness function using fossil record
Fossil record records past organisms allowing any current fitness calculations to be influenced by past generations
Three factors are developed for the fitness function: novelty, proximity and severity
Interplay of these factors produces fairly complex search behaviors

Page 62: Optimizing test cases

Conclusions

Structural versus functional
  Structural – Code coverage – First Paper
    Boundary conditions
    Individual or combined statement traversal
    Path coverage
  Functional – specification – Second Paper
    Confirms that a function from specification is correctly implemented
    No analysis of the structure of the program
GA algorithm generation
  Generate effective test data – Both Papers
  Introduction of software faults – First Paper