(Quickly) Testing the Tester via Path Coverage
![Page 1: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/1.jpg)
(Quickly) Testing the Tester via Path Coverage
Alex Groce
Oregon State University
(formerly NASA/JPL Laboratory for Reliable Software)
![Page 2: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/2.jpg)
A Sad Software Story
A Very Important Space Mission
A Critical Module: Multiplier FOR MARS
Test Engineer
![Page 3: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/3.jpg)
A Sad Software Story
A Very Important Space Mission
A Critical Module: Multiplier FOR MARS
Test Engineer
“If this fails, we could lose the mission!”
Automated testing!
![Page 4: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/4.jpg)
A Sad Software Story
Multiplier FOR MARS
Test Engineer
Complex automated test framework
![Page 5: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/5.jpg)
A Sad Software Story
Multiplier FOR MARS
Test Engineer
Complex automated test framework
6 months…
8,976,423,124 tests…
Improvements… Bug fixes… Tester changes…
1,000,000,000 tests with NO failures!
![Page 6: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/6.jpg)
A Sad Software Story
Multiplier FOR MARS
Test Engineer
Launch!
Mission Day 9
6 x 9 =
42…
42???
![Page 7: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/7.jpg)
A Sad Software Story
• “We found three very subtle bugs. Manual testing would never have found them. We assumed it would find all the important bugs.”
• “The automated tests had very high branch coverage.”
• “We ran the tester for six days in a row, and found no bugs.”
Congressional hearings
![Page 8: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/8.jpg)
Automated Software Testing
• Powerful, effective, important, but…
• Relies on a large code base, may be nearly as complex as the module to be tested!
• Behavior too complex to really understand
• Configuration management can be a nightmare
• Invites complacency about testing, neglect of manual tests
• When a bug is introduced into the tester, the result may be lots of passing tests
• Very hard to know when something is wrong
Congressional hearings: conclusions
![Page 9: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/9.jpg)
The Problem
• Very hard to know when something is wrong
• How do we know when an automated tester is producing false negatives (no failed tests) due to a bug in the tester?
  – Bug may mean a coding error, configuration foul-up, or a fundamentally bogus assumption
![Page 10: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/10.jpg)
The Problem
• Automated testers are highly complex software systems with behavior that is
  – Particularly hard to specify (“find all the bugs” is not a nice clean LTL property or assertion)
  – Pretty much impossible for humans to understand (how do you summarize 100,000,000 tests?)
  – Easy to get wrong
  – Potentially mission or safety critical
![Page 11: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/11.jpg)
Possible Solutions?
• Traditional Regression Testing
• Differential Testing (“bakeoff”)
• Coverage Measures
![Page 12: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/12.jpg)
Traditional Regression Testing
• Run latest tester on old (known buggy) versions of the SUT
• Good:
  – Good for detecting regressions of the tester
  – Easy to understand results (“Yesterday, my tester caught this bug; today, it does not”)
![Page 13: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/13.jpg)
Traditional Regression Testing
• Bad:
  – Changes to interface of SUT require lots of work
  – Very coarse, very slow: need full run to compare
  – Old bugs may be easier to find
    • As software becomes more mature, remaining bugs are (almost by definition) lower probability
![Page 14: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/14.jpg)
Differential Testing
• A variation: compare to a different tester on current software version
• Problems:
  – Where do we get another effective automated tester? These things are hard to write!
  – If it’s better, why not just use that one?
    • Why bother with the copper tester when we have a gold standard available?
![Page 15: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/15.jpg)
Coverage
• Branch and statement coverage
  – Good, minimal checks: know why lines that aren’t covered aren’t covered
  – RED ALERT if a previously covered branch isn’t covered by latest version of the tester
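The "RED ALERT" check can be sketched as a comparison of two per-branch coverage maps, one from the previous tester version and one from the latest. This is only an illustration: the array layout and the name `find_lost_branches` are assumptions, not part of the talk or of any particular coverage tool.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: old_cov[i] and new_cov[i] are nonzero when branch i
   was covered by the previous and the latest tester version respectively.
   Writes up to lost_cap indices of "lost" branches into lost[] (if non-NULL)
   and returns the total number of branches that were covered before but are
   not covered now. */
static size_t find_lost_branches(const uint8_t *old_cov,
                                 const uint8_t *new_cov,
                                 size_t n_branches,
                                 size_t *lost, size_t lost_cap)
{
    assert(old_cov != NULL && new_cov != NULL);
    size_t n_lost = 0;
    for (size_t i = 0; i < n_branches; i++) {
        if (old_cov[i] && !new_cov[i]) {
            if (lost != NULL && n_lost < lost_cap) {
                lost[n_lost] = i;   /* remember which branch regressed */
            }
            n_lost++;
        }
    }
    return n_lost;
}
```

A nonzero return value is the alert condition; newly covered branches are deliberately ignored, since only lost coverage suggests a tester regression.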
![Page 16: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/16.jpg)
Coverage
• Branch and/or statement coverage
  – Coarse: random testing and model checking perform similarly, even in cases where model checking is known to be better for fault detection
  – Slow: may take full test period to find a difference in branch coverage
    • Full automated test runs often take a day or two
    • When do we declare the coverage worse, given the all/nothing nature of covering branches?
![Page 17: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/17.jpg)
Path Coverage
• Fine grained
  – Therefore often quick
  – Exposes differences between test approaches that aren’t detected with branch coverage
![Page 18: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/18.jpg)
Another Software Story
• File system modules for JPL’s Mars Science Laboratory mission
• Automated testing system based on explicit-state model checking [VMCAI 08, WODA 08, CFV 08, ASE 08]
• Weeks of “no bugs” testing
  – Developer of file system happened to stumble across some bugs while testing new functionality
• “How did we miss this stuff???”
![Page 19: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/19.jpg)
Path Coverage
• Instrument with CIL
  – Track path bitvector, function entry

    if (x == 3) {
        x++;
        if (y > 0) {
            y++;
        }
    } else {
        x--;
    }

becomes

    if (x == 3) {
        add_to_bv(pathBV, 1);
        x++;
        if (y > 0) {
            add_to_bv(pathBV, 1);
            y++;
        } else {
            add_to_bv(pathBV, 0);
        }
    } else {
        add_to_bv(pathBV, 0);
        x--;
    }
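The slide does not show how `add_to_bv` or `pathBV` are implemented. One possible sketch, assuming a fixed-capacity bit vector that is reset at each function entry, is given here; the `path_bv` type, `reset_bv`, `bv_equal`, the capacity, and the truncation policy are all assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define PATH_BV_MAX_BITS 256  /* hypothetical capacity */

typedef struct {
    uint8_t bits[PATH_BV_MAX_BITS / 8];
    size_t  len;  /* number of branch outcomes recorded so far */
} path_bv;

/* Called at function entry to start a fresh path. */
static void reset_bv(path_bv *bv)
{
    memset(bv, 0, sizeof *bv);
}

/* Append one branch outcome (1 = branch taken, 0 = not taken). */
static void add_to_bv(path_bv *bv, int outcome)
{
    assert(bv != NULL);
    if (bv->len >= PATH_BV_MAX_BITS) {
        return;  /* assumption: silently truncate over-long paths */
    }
    if (outcome) {
        bv->bits[bv->len / 8] |= (uint8_t)(1u << (bv->len % 8));
    }
    bv->len++;
}

/* Two paths match only if they have the same length and the same bits. */
static int bv_equal(const path_bv *a, const path_bv *b)
{
    return a->len == b->len &&
           memcmp(a->bits, b->bits, sizeof a->bits) == 0;
}

/* The instrumented fragment from the slide, wrapped in a function. */
static void example(path_bv *pathBV, int x, int y)
{
    reset_bv(pathBV);
    if (x == 3) {
        add_to_bv(pathBV, 1);
        x++;
        if (y > 0) {
            add_to_bv(pathBV, 1);
            y++;
        } else {
            add_to_bv(pathBV, 0);
        }
    } else {
        add_to_bv(pathBV, 0);
        x--;
    }
    (void)x;  /* values are unused in this toy fragment */
    (void)y;
}
```

With this layout, `example(&bv, 3, 1)` records the outcomes 1,1, `example(&bv, 3, 0)` records 1,0, and `example(&bv, 4, 1)` records the single outcome 0, so each of the three paths through the fragment yields a distinct vector.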
![Page 20: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/20.jpg)
Path Coverage
• Coverage here is per entry function, not whole program paths
  – Our application is a file system
  – Testing of a library: therefore we care about top-level function entry paths, not whole test-case paths
  – Takes less storage, still guarantees unique path
• Overhead is acceptable (~15%) because it does not change model checking storage time, which dominates test runtime
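One way to picture per-entry-function path coverage is as a set of (function, path) records, where a path is stored only the first time it is seen. The sketch below is an assumption about how such a set might look, not the talk's implementation: paths are packed into a 64-bit word with a leading sentinel bit so that paths of different lengths never collide, which limits this toy version to 63 branch outcomes per call. All names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_RECORDS 1024  /* hypothetical capacity */

typedef struct {
    int      func_id;  /* which top-level entry function was called */
    uint64_t path;     /* branch outcomes behind a leading sentinel 1 bit */
} path_record;

typedef struct {
    path_record records[MAX_RECORDS];
    size_t      count;
} path_set;

/* An empty path is just the sentinel bit. */
static uint64_t path_start(void)
{
    return 1u;
}

/* Append one branch outcome to a packed path. */
static uint64_t path_append(uint64_t path, int outcome)
{
    return (path << 1) | (outcome ? 1u : 0u);
}

/* Record a path for an entry function.
   Returns 1 if it is new, 0 if already seen or the table is full. */
static int path_set_add(path_set *set, int func_id, uint64_t path)
{
    assert(set != NULL);
    for (size_t i = 0; i < set->count; i++) {
        if (set->records[i].func_id == func_id &&
            set->records[i].path == path) {
            return 0;
        }
    }
    if (set->count == MAX_RECORDS) {
        return 0;
    }
    set->records[set->count].func_id = func_id;
    set->records[set->count].path = path;
    set->count++;
    return 1;
}

/* Number of distinct paths seen so far for one entry function. */
static size_t path_set_count_for(const path_set *set, int func_id)
{
    size_t n = 0;
    for (size_t i = 0; i < set->count; i++) {
        if (set->records[i].func_id == func_id) {
            n++;
        }
    }
    return n;
}
```

The linear scan keeps the sketch short; a real tester would more plausibly use a hash table, since the number of distinct paths can grow large.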
![Page 21: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/21.jpg)
Traditional Regression Testing
Ten minutes of testing (x 6 processors)
![Page 22: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/22.jpg)
Ten minutes of testing (x 6 processors)
![Page 23: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/23.jpg)
Swarm Model Checking
• Standard Depth First Search on a very large model gets lost somewhere in a branch of a branch of a very big tree
• Heuristics? But we have no idea
  – Where the bugs are
  – The structure of the state space
• So, generate a vast array of different search configurations, transition orderings
  – And let parallelism (multicore desktops) have at it!
• Most effective method we know for testing programs with very large state spaces
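As a rough illustration of "different transition orderings", each swarm member could be given its own seed and derive a different permutation of the model's transitions from it. This sketch is not how any particular model checker does it; the xorshift generator, the Fisher-Yates shuffle, and the names `next_rand` and `make_transition_order` are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Small deterministic PRNG (xorshift32) so that every configuration
   can be reproduced from its seed. */
static uint32_t next_rand(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Fill order[0..n-1] with a seed-dependent permutation of the
   transition indices 0..n-1 (Fisher-Yates shuffle). */
static void make_transition_order(uint32_t seed, size_t *order, size_t n)
{
    assert(order != NULL);
    uint32_t state = seed ? seed : 1u;  /* xorshift must not start at 0 */
    for (size_t i = 0; i < n; i++) {
        order[i] = i;
    }
    for (size_t i = n; i > 1; i--) {
        size_t j = next_rand(&state) % i;
        size_t tmp = order[i - 1];
        order[i - 1] = order[j];
        order[j] = tmp;
    }
}
```

Running one search per seed, each on its own core, gives the kind of cheap diversity the slide describes; the modulo step introduces a slight bias that does not matter for this purpose.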
![Page 24: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/24.jpg)
Test Focus
• Worse overall path coverage doesn’t always mean the tester is buggy
  – Can get better coverage of some functions if we don’t cover other functions at all
  – But we don’t want to cover only some functions…
    • Bugs may only arise when both are called
  – Or build 500 different configurations…
  – Automatic generation of a diverse set of focuses
    • Swarm for test focus
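A diverse set of test focuses could be generated by giving each configuration a random subset of the library's top-level functions to call. The following is a minimal sketch under that assumption; the bitmask representation, the 32-function limit, and the names `make_focus_mask` and `in_focus` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Small deterministic PRNG (xorshift32), seeded per configuration. */
static uint32_t next_rand(uint32_t *state)
{
    uint32_t x = *state;
    x ^= x << 13;
    x ^= x >> 17;
    x ^= x << 5;
    *state = x;
    return x;
}

/* Pick a subset of n_funcs top-level functions (1..32) for one
   configuration. Bit i set means function i may be called.
   The result is never empty. */
static uint32_t make_focus_mask(uint32_t seed, unsigned n_funcs)
{
    assert(n_funcs >= 1 && n_funcs <= 32);
    uint32_t state = seed ? seed : 1u;  /* xorshift must not start at 0 */
    uint32_t valid = (n_funcs == 32) ? UINT32_MAX : ((1u << n_funcs) - 1u);
    uint32_t mask = next_rand(&state) & valid;
    if (mask == 0) {
        mask = 1u << (seed % n_funcs);  /* fall back to a single function */
    }
    return mask;
}

/* Is function func_index (0..31) part of this configuration's focus? */
static int in_focus(uint32_t mask, unsigned func_index)
{
    return (int)((mask >> func_index) & 1u);
}
```

Generating, say, 500 masks from seeds 1 to 500 gives configurations that concentrate on different function subsets, while any given pair of functions still appears together in many of them.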
![Page 25: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/25.jpg)
Is Path Coverage the Solution?
• Not really
• It’s helpful, and it finds some problems
• Branch/path coverage measures should be seen as basic due diligence for critical systems testing
• But testing the tester is still very difficult
![Page 26: (Quickly) Testing the Tester via Path Coverage](https://reader035.fdocuments.net/reader035/viewer/2022062316/56816933550346895de087ae/html5/thumbnails/26.jpg)
Questions? Suggestions?
• How do you test your automated testers?