Laurent Bossavit - Skeptical Self-Defense for the Serious Tester - EuroSTAR 2013
Skeptical Self-Defense For The Serious Tester
Laurent Bossavit, Twitter: @Morendil, http://institut-agile.fr/
Programmer. (Homo sapiens geekus.)
@Morendil
FINISHED FILES ARE THE RESULT OF YEARS OF SCIENTIFIC STUDY COMBINED WITH THE EXPERIENCE OF YEARS.
How many F’s do you count?
How we view testing
• Execute test cases?
• Obsess over details?
• Look for bugs?
• Solve complex puzzles?
• Reveal (sometimes unpleasant) truth about the quality of a product?
What we have in common...
Bullying?
• Telling me how to do my job, when I do not assess you as competent
• Telling my manager how I should do my job
• Belittling my or my community’s contributions to theory or practice
Mainstay of the bully’s arsenal
How to be a bully
• Use “studies show” or “research proves”
• Use citations liberally (Bossavit 2013)
• Rely on unsound “metrics”
• Claim large-scale surveys (“thousands of projects”), exaggerate sample sizes
• Claim large monetary impacts
The claim:
"It can cost you up to 110 times more to correct a requirements defect found in production than if that same defect were found during requirements development (Grady, 1999)"
-- EGB Consulting Web site
(“Cost of defects curve”)
The problem:
• Justifies “Requirements-Based Testing”, any number of Software Lifecycles
• Supposedly supported by “decades of research”
• Actually very hard to find corroborating data, best available data contradicts claim
The evidence:
Relative cost to fix errors: Boehm (1976)
Relative cost to fix errors: U.S. Fed. Hwy. (2007), after McConnell
Hughes Aircraft (1998)
"This study didn't accurately record the engineering times to fix the defects, so we will use average times summarized from several other studies to weight the defect origins"
-- R.B. Grady, 1989
"An important distinction that emerged was that the large effort multiplier holds for severe defects; many defects with lesser impact will not cost appreciably more to change after delivery than before."
-- Shull et al., What We Have Learned About Fighting Defects
The claim:
High ROI (Return on Investment) for IV&V (Independent Verification & Validation)
(“Big money” claims)
83 x IV&V budget = $1.6 billion!
NASA IV&V Facility Report, 2006
The problem:
• GIGO - bad assumptions in, nonsense results out
• Missing a key variable for ROI calculation (the actual cost of fixing defects)
• Suffers from “counting to one” issues
• Suffers from “Flaubert math”, from the following puzzle by the French writer:
What is the captain’s age?
A ship sails the ocean. It left Boston with a cargo of cotton. It grosses 200 tons [...], there are 12 passengers aboard, the wind is blowing East-North-East, the clock points to a quarter past three in the afternoon.
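Back to the ROI claim: the "missing key variable" point can be made concrete with a small sketch. Every number below is hypothetical (the investment, the defect count and the three candidate costs to fix); none comes from the NASA report. The simplification that every defect found would otherwise have cost the same amount to fix later is also an assumption. The point is only that the result is driven almost entirely by the one input nobody measured.

```python
def roi(investment, defects_found, cost_to_fix_later):
    """Return on investment under a deliberately naive model.

    Assumes every defect found would otherwise have cost
    `cost_to_fix_later` to fix (hypothetical simplification).
    """
    avoided_cost = defects_found * cost_to_fix_later
    return (avoided_cost - investment) / investment


# Hypothetical inputs, chosen only to show the sensitivity.
investment = 1_000_000
defects_found = 500

for assumed_cost in (1_000, 10_000, 100_000):
    print(assumed_cost, roi(investment, defects_found, assumed_cost))
```

With the same investment and the same defect count, the "ROI" swings from a loss to a spectacular gain depending on which cost-to-fix figure is plugged in.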
The claim:
Any number of statements of the form “46% of project failures originate from poor requirements” (e.g. the infamous Chaos Report)
The problem:
• These “studies” are typically opinion surveys, interviewing executives over the phone or by email
• Similar surveys find that 46% of Americans believe in creationism
The trouble with opinions...
The claim:
A 1990s survey by U.S. Defense showed a 75% failure rate for waterfall
projects
Biggest bluff of all
$35.7 billion - 75% waste (DoD projects)
“Requirements Risks Can Drown Software Projects”
(Leishman, 2002)
The problem:
• No such study ever existed (!)
"At the 5th Annual Joint Aerospace Weapons Systems Support, Sensors, and Simulation Symposium in 1999, the results of a study of 1995 Department of Defense (DoD) software spending were presented. (Jarzombek 1999)"
-- Leishman, 2002
"A 1999 review of failure rates in a sample of earlier DoD projects drew grave conclusions: Of a total $37 billion for the sample set, 75% of the projects failed or were never used, and only 2% were used without extensive modification."
-- C. Larman, 2003
Also cited by Benefield, Korhonen, Knoernschild, Gilb, Sutherland, Leitass, Leinonen, Martin (ObjectMentor), etc.
(indirect cite via Larman in many cases)
"I do not have readily available access to that information..."
-- Lt. Col. (Ret.) Jarzombek (personal email, 2013)
1979    1995
2 %     2 %
2 %     3 %
19 %    20 %
29 %    29 %
46 %    47 %
Two-tailed chi-squared test, under reasonable (but debatable) assumptions, of the hypothesis “H0 = These results come from two samplings of the same population, rather than one being a copy of the other”: rejected at p < 0.001
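A rough sketch of this kind of check, not a reconstruction of the calculation behind the slide. The slide does not state the sample sizes, so the code treats the percentages as counts out of roughly 100 projects per study. That is a hypothetical assumption, and the result depends heavily on it. It computes the usual chi-squared statistic for a 2 x 5 table, then asks how likely two independent samples would be to agree at least this closely (the lower tail, which is the tail that matters when the suspicion is "too similar"). Several expected counts are below 5, so the approximation is crude.

```python
import math

# Percentages from the two columns above, read as counts.
# Sample size of ~100 per study is a hypothetical assumption.
counts_1979 = [2, 2, 19, 29, 46]
counts_1995 = [2, 3, 20, 29, 47]


def chi_squared_statistic(row_a, row_b):
    """Pearson chi-squared statistic for a 2 x k contingency table."""
    total_a, total_b = sum(row_a), sum(row_b)
    grand_total = total_a + total_b
    stat = 0.0
    for a, b in zip(row_a, row_b):
        column_total = a + b
        for observed, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * column_total / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat


def chi_squared_cdf_df4(x):
    """Closed-form CDF of the chi-squared distribution with 4 degrees
    of freedom, i.e. (2 - 1) * (5 - 1) for a 2 x 5 table."""
    return 1.0 - math.exp(-x / 2.0) * (1.0 + x / 2.0)


stat = chi_squared_statistic(counts_1979, counts_1995)
lower_tail = chi_squared_cdf_df4(stat)
print(f"statistic = {stat:.3f}, P(agreeing this closely) = {lower_tail:.4f}")
```

Under these assumed counts the lower-tail probability comes out well below 1%, the same direction as the slide's conclusion. Other assumed sample sizes give very different numbers, which is what "debatable" means here.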
More claims...
• 10x software engineers
• Cone of Uncertainty
• “Only x% of a product’s features are used”
• Various revisionist histories
For more details...
Bullies exploit:
• Reciprocity
• Commitment
• Social proof
• Liking
• Authority
• Scarcity
You can fight back!
Self-defense skills
• “Citation needed”
• Look for primary sources
• Google-fu
• Academic search: Google Scholar, ACM, IEEE
• “Search by date” to identify spread pattern; Google n-Grams
Curiosity
Skepticism
Tenacity
Don’ts
• Don’t let bullies turn you away from “unstructured” testing
• Don’t turn into a bully, i.e. rely on borrowed authority, plagiarism, claims of dubious status
Do’s
• Dare to be a scientist
• Question: claims, assumptions, authority
• Take power back from bullies:
• Occupy Software Engineering,
• Occupy Testing!
Remember, remember, the fifth of November,
the misleading statistics and plot
I know of no reason
to blindly trust another’s plot
Remember, remember...