Laurent Bossavit - Skeptical Self-Defense for the Serious Tester - EuroSTAR 2013

Skeptical Self-Defense For The Serious Tester

Laurent Bossavit | Twitter: @Morendil | http://institut-agile.fr/

Programmer. (Homo sapiens geekus.)

@Morendil

FINISHED FILES ARE THE RESULT OF YEARS OF SCIENTIFIC STUDY COMBINED WITH THE EXPERIENCE OF YEARS.

How many F’s do you count?

How we view testing

• Execute test cases?

• Obsess over details?

• Look for bugs?

• Solve complex puzzles?

• Reveal (sometimes unpleasant) truth about the quality of a product?

What we have in common...

Bullying?

• Telling me how to do my job, when I do not assess you as competent

• Telling my manager how I should do my job

• Belittling my or my community’s contributions to theory or practice

Mainstay of the bully’s arsenal

How to be a bully

• Use “studies show” or “research proves”

• Use citations liberally (Bossavit 2013)

• Rely on unsound “metrics”

• Claim large-scale surveys (“thousands of projects”), exaggerate sample sizes

• Claim large monetary impacts

The claim:

"It can cost you up to 110 times more to correct a requirements defect found in production than if that same defect were found during requirements development (Grady, 1999)"

-- EGB Consulting Web site

(“Cost of defects curve”)

The problem:

• Justifies “Requirements-Based Testing”, any number of Software Lifecycles

• Supposedly supported by “decades of research”

• Actually very hard to find corroborating data, best available data contradicts claim

The evidence:

Relative cost to fix errors: Boehm (1976)

Relative cost to fix errors: U.S. Fed. Hwy. (2007), after McConnell

Hughes Aircraft (1998)

"This study didn't accurately record the engineering times to fix the defects, so we will use average times summarized from several other studies to weight the defect origins"

-- R.B. Grady, 1989

"An important distinction that emerged was that the large effort multiplier holds for severe defects; many defects with lesser impact will not cost appreciably more to change after delivery than before."

-- Shull et al., What We Have Learned About Fighting Defects

The claim:

High ROI (Return on Investment) for IV&V (Independent Verification & Validation)

(“Big money” claims)

83 x IV&V budget = $1.6 billion!

NASA IV&V Facility Report, 2006

The problem:

• GIGO - bad assumptions in, nonsense results out

• Missing a key variable for ROI calculation (the actual cost of fixing defects)

• Suffers from “counting to one” issues

• Suffers from “Flaubert math”, named after the following puzzle by the French writer Gustave Flaubert:

What is the captain’s age?

A ship sails the ocean. It left Boston with a cargo of cotton. It grosses 200 tons [...], there are 12 passengers aboard, the wind is blowing East-North-East, the clock points to a quarter past three in the afternoon.
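Back to the ROI claim: the bullet about the missing variable can be illustrated with a small sketch. Only the two headline figures (83 and $1.6 billion) come from the slide. The defect count and the per-defect costs below are made-up placeholders, and treating "ROI" as a simple benefit-to-budget ratio is an assumption, not necessarily how the NASA report computed it.

```python
# Figures taken from the slide: claimed benefit and claimed multiple of the budget.
CLAIMED_BENEFIT_USD = 1.6e9
CLAIMED_MULTIPLE = 83


def implied_budget(benefit, multiple):
    """Budget implied by 'benefit = multiple x budget'."""
    return benefit / multiple


def roi_multiple(defects_found, assumed_cost_per_defect, budget):
    """Benefit-to-budget ratio when the benefit is simply
    defects found x an ASSUMED cost avoided per defect."""
    return defects_found * assumed_cost_per_defect / budget


budget = implied_budget(CLAIMED_BENEFIT_USD, CLAIMED_MULTIPLE)

# Hypothetical defect count, chosen only for illustration.
HYPOTHETICAL_DEFECTS = 1000

# The result is driven entirely by the assumed (unmeasured) cost per defect.
for assumed_cost in (10_000, 100_000, 1_600_000):
    ratio = roi_multiple(HYPOTHETICAL_DEFECTS, assumed_cost, budget)
    print(f"assumed cost per defect ${assumed_cost:,}: {ratio:.1f} x budget")
```

Under these assumptions the ratio is a straight multiple of whatever cost per defect is plugged in, so without measured fix costs the headline multiple says more about the input assumption than about IV&V.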

The claim:

Any number of statements of the form “46% of project failures originate from poor requirements” (e.g. the infamous Chaos Report)

The problem:

• These “studies” are typically opinion surveys, interviewing executives over the phone or by email

• Similar surveys find that 46% of Americans believe in creationism

The trouble with opinions...

The claim:

A 1990s survey by the U.S. Department of Defense showed a 75% failure rate for waterfall projects

Biggest bluff of all

$35.7 billion - 75% waste (DoD projects)

“Requirements Risks Can Drown Software Projects”

(Leishman, 2002)

The problem:

• No such study ever existed (!)

"At the 5th Annual Joint Aerospace Weapons Systems Support, Sensors, and Simulation Symposium in 1999, the results of a study of 1995 Department of Defense (DoD) software spending were presented. (Jarzombek 1999)"

-- Leishman, 2002

"A 1999 review of failure rates in a sample of earlier DoD projects drew grave conclusions: Of a total $37 billion for the sample set, 75% of the projects failed or were never used, and only 2% were used without extensive modification."

-- C. Larman, 2003

Also cited by Benefield, Korhonen, Knoernschild, Gilb, Sutherland, Leitass, Leinonen, Martin (ObjectMentor), etc.

(indirect cite via Larman in many cases)

"I do not have readily available access to that information..."

-- Lt. Col. (Ret.) Jarzombek (personal email, 2013)

1979    1995
2 %     2 %
2 %     3 %
19 %    20 %
29 %    29 %
46 %    47 %

Two-tailed chi-squared test, under reasonable (but debatable) assumptions, of the hypothesis “H0 = These results come from two samplings of the same population, rather than one being a copy of the other” - rejected at p < 0.001
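The kind of check described above can be sketched as follows. The slide gives neither the sample sizes nor the exact procedure, so this is not a reproduction of the author's test: the sample sizes are hypothetical, the percentages are taken as printed on the slide, and only the lower tail (the chance of two independent samples agreeing at least this closely) is computed. With cell counts this small the chi-squared approximation is itself rough.

```python
import math

# Percentages as they appear on the slide, in slide order.
PCT_1979 = [2, 2, 19, 29, 46]
PCT_1995 = [2, 3, 20, 29, 47]


def chi2_statistic(row_a, row_b):
    """Pearson chi-squared statistic for a 2 x k table of counts."""
    total_a, total_b = sum(row_a), sum(row_b)
    grand = total_a + total_b
    stat = 0.0
    for a, b in zip(row_a, row_b):
        column_total = a + b
        for observed, row_total in ((a, total_a), (b, total_b)):
            expected = row_total * column_total / grand
            stat += (observed - expected) ** 2 / expected
    return stat


def chi2_cdf_df4(x):
    """Closed-form chi-squared CDF for 4 degrees of freedom,
    i.e. (2 rows - 1) * (5 columns - 1)."""
    return 1.0 - math.exp(-x / 2.0) * (1.0 + x / 2.0)


def lower_tail(projects_per_100_points):
    """P(statistic <= observed) under H0, ASSUMING each study's percentages
    describe roughly this many projects (a hypothetical sample size)."""
    scale = projects_per_100_points / 100.0
    counts_1979 = [p * scale for p in PCT_1979]
    counts_1995 = [p * scale for p in PCT_1995]
    return chi2_cdf_df4(chi2_statistic(counts_1979, counts_1995))


# Hypothetical sample sizes: the smaller the samples, the more
# suspicious such close agreement becomes.
for n in (25, 50, 100, 1000):
    print(n, lower_tail(n))
```

The conclusion depends heavily on the assumed sample size, which is one reason the slide calls its assumptions debatable.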

More claims...

• 10x software engineers

• Cone of Uncertainty

• “Only x% of a product’s features are used”

• Various revisionist histories

For more details...

Bullies exploit:

• Reciprocity

• Commitment

• Social proof

• Liking

• Authority

• Scarcity

You can fight back!

Self-defense skills

• “Citation needed”

• Look for primary sources

• Google-fu

• Academic search: Google Scholar, ACM, IEEE

• “Search by date” to identify spread pattern; Google n-Grams

Curiosity

Skepticism

Tenacity

Don’ts

• Don’t let bullies turn you away from “unstructured” testing

• Don’t turn into a bully, i.e. rely on borrowed authority, plagiarism, claims of dubious status

Do’s

• Dare to be a scientist

• Question: claims, assumptions, authority

• Take power back from bullies:

• Occupy Software Engineering,

• Occupy Testing!

Remember, remember, the fifth of November,
the misleading statistics and plot
I know of no reason
to blindly trust another’s plot

Remember, remember...
