Msr13 mistake


Tim Menzies, MSR'13


Page 2: Msr13 mistake

Inevitable, due to the complexity & novelty of our work

(But rarely reported, which is… suspicious)

What can we learn from those mistakes?

Page 3: Msr13 mistake

An MSR’13 paper: cross-company learning. Can “us” learn from “them”?

• Provided “us” selects the right data from “them”
– Relevancy filtering: [Turhan09] (and many others)
– Selection guided by structure of “us”

• If “we” is small and “them” is many:
– Selection guided using kernel functions learned from “them”
– Result #1: out-performed [Turhan09].

• Result #2: Result #1 was a coding error

Page 4: Msr13 mistake

Houston, we have a problem

• Mar 15: paper accepted to MSR
– “Better cross-company defect prediction”

• Mar 29: camera-ready submitted

• ?Apr 10: pre-prints go on-line

• April 29: Hyeongmin Jeon, graduate student at Pusan Natl. Univ.,
– Emailed us: can’t reproduce result

• May 4: Peters, checking code, found error
– Manic week of experiments ….

• May 11: results definitely wrong
– Emails to MSR organizers

Btw, < 3 weeks. Wow…

Page 5: Msr13 mistake

Coding error

• Distance between test & training instance
– Removed the classes
– Ran a distance function
– Re-inserted the classes

• But…. bad re-insert
– Used the training class
– Not the test class
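
The remove / measure / re-insert pattern above can be sketched as follows. This is only an illustration of the kind of slip described, not the code from the paper: the row layout (class label in the last cell) and every function name here are assumptions made for the example.

```python
import math

def split_class(row):
    """Return (features, label). Assumes the class label is the last cell."""
    return row[:-1], row[-1]

def euclidean(a, b):
    """Plain Euclidean distance between two equal-length feature lists."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_training_row(test_row, training_rows):
    """Find the training row whose features are closest to the test row."""
    test_features, _ = split_class(test_row)
    return min(training_rows,
               key=lambda r: euclidean(test_features, split_class(r)[0]))

def reinsert_buggy(test_row, training_rows):
    """Bad re-insert: test features come back carrying the TRAINING class."""
    features, _ = split_class(test_row)
    _, train_label = split_class(nearest_training_row(test_row, training_rows))
    return features + [train_label]

def reinsert_fixed(test_row, training_rows):
    """Correct re-insert: the test row gets its own class back."""
    features, test_label = split_class(test_row)
    nearest_training_row(test_row, training_rows)  # distance step still runs
    return features + [test_label]

# Hypothetical data: two numeric features, then the class.
training = [[1.0, 1.0, "defective"], [9.0, 9.0, "clean"]]
test_row = [1.2, 0.9, "clean"]
buggy = reinsert_buggy(test_row, training)
fixed = reinsert_fixed(test_row, training)
```

In this toy case the buggy version silently relabels the test row with the class of its nearest training neighbour, so nothing crashes and the output still looks plausible.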


Page 6: Msr13 mistake

Pull the paper?

• In the internet age, is that even possible?
– X people now have local copies of that paper
– Which Google might easily stumble across

Old pre-print, found

May 15


Page 7: Msr13 mistake

Authors: report your mistakes, openly and honestly

• We need to expect, and allow, papers with sections: “clarifications”, “errata”, “retractions”

• E.g. Murphy-Hill, Parnin, Black. IEEE TSE, Jan 2012


Page 8: Msr13 mistake

Conference organizers: encourage research honesty

• Need CFPs with text that encourages
– Repeating, testing, and challenging old results


Page 9: Msr13 mistake

Researchers: Share data, check each other’s conclusions

• Reinhart & Rogoff [2010]
– “countries with debt over 90% of GDP suffer notably lower economic growth.”

• Thomas Herndon, 3rd-year Ph.D. student, U.Mass.
– Unable to replicate with publicly available data
– Asked Reinhart & Rogoff for their data
– Got it (their spreadsheet)
– Found errors in data on economic growth vs debt levels.

• A triumph for open science
– Sadly, reported in media as grave mistake
– E.g. http://goo.gl/HGugL
– Immature view of the nature of science


Page 10: Msr13 mistake

Supervisors: encourage a culture of research honesty

• What will you tell others about this paper?
– A failure? Or a success of the open science method?
– It’s up to you, but understand the implications

• If we don’t let grad students report mistakes
– Then they won’t

• Students graduate,
• Leave you,
• The error emerges
• And you are left with the problem


Page 11: Msr13 mistake

Specific lessons

• Data mining experiments are complex software prototypes
– Version control (of code and data)
– Code inspections
– Trap and log your random number seeds
– Rewrite data rarely

• Pull out the class, process, put it back?
• Fuhgeddaboudit
• Have data headers of different types
– So (say) distance measures can skip over classes
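
Two of these lessons, logging the random seed and typing the data headers, can be sketched in a few lines. This is a minimal illustration under stated assumptions: the `!` prefix for class columns, the logger name, and the helper names are all hypothetical conventions chosen for the example, not anything taken from the paper's code.

```python
import logging
import math
import random
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("experiment")  # hypothetical logger name

CLASS_MARK = "!"  # assumed convention: a header starting with "!" is a class

def seeded_rng(seed=None):
    """Create a random generator and log its seed so the run can be repeated."""
    if seed is None:
        seed = time.time_ns() % (2 ** 32)
    log.info("random seed = %d", seed)
    return random.Random(seed), seed

def feature_columns(headers):
    """Indexes of the columns a distance measure is allowed to read."""
    return [i for i, name in enumerate(headers)
            if not name.startswith(CLASS_MARK)]

def distance(headers, row1, row2):
    """Euclidean distance over non-class columns. Rows are never rewritten."""
    cols = feature_columns(headers)
    return math.sqrt(sum((row1[i] - row2[i]) ** 2 for i in cols))

# Hypothetical table: two numeric features and one class column.
headers = ["loc", "complexity", "!defects"]
d = distance(headers, [3.0, 4.0, "yes"], [0.0, 0.0, "no"])

# Re-using a logged seed reproduces the same random stream.
rng1, seed = seeded_rng(42)
rng2, _ = seeded_rng(seed)
same_stream = rng1.random() == rng2.random()
```

Because the distance function reads the headers to decide which columns to skip, the class never has to be pulled out of a row and put back, which removes the step where the re-insert went wrong.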

The above error does not affect Peters & Menzies, ICSE’12 and TSE’13

Page 12: Msr13 mistake

Open access science

• Repeatable, improvable,
– and sometimes even refutable

• We should not celebrate the failed paper

• But we should celebrate
– The open science community that finds such errors
• MSR, PROMISE, etc
– The grad students that struggle to reproduce results
• Hyeongmin Jeon
– The integrity of grad students whose first response on finding an error was to report it
• Fayola Peters


Page 13: Msr13 mistake

Was this a “useful” mistake?

• Is there insight within this mistake?

• What does it mean if using more experience makes the defect predictor worse?

• International workshop on Transfer Learning in Software Engineering
– Nov, ASE’13
