Automatic Identification of Bug-Introducing Changes

72
Automatic Identification of Bug-Introducing Changes Sunghun Kim Kai Pan E. James Whitehead, Jr. University of California, Santa Cruz, USA Tom Zimmermann Saarland University, Saarbrücken, Germany

description

Presented at ASE 2006.

Transcript of Automatic Identification of Bug-Introducing Changes

Page 1: Automatic Identification of Bug-Introducing Changes

Automatic Identification of Bug-Introducing Changes

Sunghun KimKai Pan E. James Whitehead, Jr.

University of California, Santa Cruz, USATom Zimmermann

Saarland University, Saarbrücken, Germany

Page 2: Automatic Identification of Bug-Introducing Changes

Bugs

Motivation

Page 3: Automatic Identification of Bug-Introducing Changes

Bugs

Motivation

Statistics

Page 4: Automatic Identification of Bug-Introducing Changes

Bugs

Motivation

DetectionStatistics

Page 5: Automatic Identification of Bug-Introducing Changes

Bugs

Motivation

DetectionStatistics

Prediction

Page 6: Automatic Identification of Bug-Introducing Changes

Bugs

Motivation

DetectionStatistics

Analysis

Prediction

Page 7: Automatic Identification of Bug-Introducing Changes

Bugs

Motivation

DetectionStatistics

Observation

Analysis

Prediction

Page 8: Automatic Identification of Bug-Introducing Changes

So far: Focus on fixes

fixes issues mentioned in bug 45635: [hovering] rollover hovers- mouse exit detection is safer and should not allow for loopholes any more, except for shell deactiviation- hovers behave like normal ones: - tooltips pop up below the control - they move with subjectArea - once a popup is showing, they will show up instantly

teicher 2003-10-29 16:11:01

Page 9: Automatic Identification of Bug-Introducing Changes

So far: Focus on fixes

fixes issues mentioned in bug 45635: [hovering] rollover hovers- mouse exit detection is safer and should not allow for loopholes any more, except for shell deactiviation- hovers behave like normal ones: - tooltips pop up below the control - they move with subjectArea - once a popup is showing, they will show up instantly

teicher 2003-10-29 16:11:01

Fixes give only the location of a defect,not when it was introduced.

Page 10: Automatic Identification of Bug-Introducing Changes

Bug-introducing changes

...if (foo==null) { foo.bar();...

BUG-INTRODUCING

if (foo==null) {

Page 11: Automatic Identification of Bug-Introducing Changes

Bug-introducing changes

...if (foo!=null) { foo.bar();...

FIX

if (foo!=null) {...if (foo==null) { foo.bar();...

BUG-INTRODUCING

if (foo==null) { later fixed

Page 12: Automatic Identification of Bug-Introducing Changes

Bug-introducing changes

Bug-introducing changes are changes that lead to problems as indicated by later fixes.

...if (foo!=null) { foo.bar();...

FIX

if (foo!=null) {...if (foo==null) { foo.bar();...

BUG-INTRODUCING

if (foo==null) { later fixed

Page 13: Automatic Identification of Bug-Introducing Changes

Life-cycle of a “bug”

Page 14: Automatic Identification of Bug-Introducing Changes

Life-cycle of a “bug”

BUG-INTRODUCINGCHANGE

Page 15: Automatic Identification of Bug-Introducing Changes

Life-cycle of a “bug”

fixes issues mentioned in bug 45635: [hovering] rollover hovers- mouse exit detection is safer and should not allow for loopholes any more, except for shell deactiviation- hovers behave like normal ones: - tooltips pop up below the control - they move with subjectArea - once a popup is showing, they will show up instantly

BUG REPORT

BUG-INTRODUCINGCHANGE

Page 16: Automatic Identification of Bug-Introducing Changes

Life-cycle of a “bug”

fixes issues mentioned in bug 45635: [hovering] rollover hovers- mouse exit detection is safer and should not allow for loopholes any more, except for shell deactiviation- hovers behave like normal ones: - tooltips pop up below the control - they move with subjectArea - once a popup is showing, they will show up instantly

BUG REPORT

FIXCHANGE

BUG-INTRODUCINGCHANGE

Page 17: Automatic Identification of Bug-Introducing Changes

The SZZ algorithm

1.18

FIXED BUG42233

Page 18: Automatic Identification of Bug-Introducing Changes

The SZZ algorithm

1.18

FIXED BUG42233

$ cvs annotate -r 1.17 Foo.java ...20: 1.11 (john 12-Feb-03): return i/0; ...40: 1.14 (kate 23-May-03): return 42; ...60: 1.16 (mary 10-Jun-03): int i=0;

Page 19: Automatic Identification of Bug-Introducing Changes

The SZZ algorithm

1.18

FIXED BUG42233

$ cvs annotate -r 1.17 Foo.java ...20: 1.11 (john 12-Feb-03): return i/0; ...40: 1.14 (kate 23-May-03): return 42; ...60: 1.16 (mary 10-Jun-03): int i=0;

Page 20: Automatic Identification of Bug-Introducing Changes

1.14 1.161.11

The SZZ algorithm

1.18

FIXED BUG42233

$ cvs annotate -r 1.17 Foo.java ...20: 1.11 (john 12-Feb-03): return i/0; ...40: 1.14 (kate 23-May-03): return 42; ...60: 1.16 (mary 10-Jun-03): int i=0;

Page 21: Automatic Identification of Bug-Introducing Changes

1.14 1.161.111.11

The SZZ algorithm

1.18

FIXED BUG42233

BUGINTRO

$ cvs annotate -r 1.17 Foo.java ...20: 1.11 (john 12-Feb-03): return i/0; ...40: 1.14 (kate 23-May-03): return 42; ...60: 1.16 (mary 10-Jun-03): int i=0;

Page 22: Automatic Identification of Bug-Introducing Changes

1.14 1.161.111.11 1.14

The SZZ algorithm

1.18

FIXED BUG42233

BUGINTRO

BUGINTRO

$ cvs annotate -r 1.17 Foo.java ...20: 1.11 (john 12-Feb-03): return i/0; ...40: 1.14 (kate 23-May-03): return 42; ...60: 1.16 (mary 10-Jun-03): int i=0;

Page 23: Automatic Identification of Bug-Introducing Changes

1.14 1.161.111.11 1.14 1.16

The SZZ algorithm

1.18

FIXED BUG42233

BUGINTRO

BUGINTRO

BUGINTRO

$ cvs annotate -r 1.17 Foo.java ...20: 1.11 (john 12-Feb-03): return i/0; ...40: 1.14 (kate 23-May-03): return 42; ...60: 1.16 (mary 10-Jun-03): int i=0;

Page 24: Automatic Identification of Bug-Introducing Changes

1.14 1.16

The SZZ algorithm

1.14 1.16 1.18

FIXED BUG42233

BUGINTRO

BUGINTRO

BUGINTRO

1.11 1.14 1.16

BUGINTRO

BUGINTRO

Page 25: Automatic Identification of Bug-Introducing Changes

fixes issues mentioned in bug 45635: [hovering] rollover hovers- mouse exit detection is safer and should not allow for loopholes any more, except for shell deactiviation- hovers behave like normal ones: - tooltips pop up below the control - they move with subjectArea - once a popup is showing, they will show up instantly

BUG REPORT

closedsubmitted

1.14 1.16

The SZZ algorithm

1.14 1.16 1.18

FIXED BUG42233

BUGINTRO

BUGINTRO

BUGINTRO

1.11 1.14 1.16

BUGINTRO

BUGINTRO

Page 26: Automatic Identification of Bug-Introducing Changes

fixes issues mentioned in bug 45635: [hovering] rollover hovers- mouse exit detection is safer and should not allow for loopholes any more, except for shell deactiviation- hovers behave like normal ones: - tooltips pop up below the control - they move with subjectArea - once a popup is showing, they will show up instantly

BUG REPORT

closedsubmitted

1.14 1.16

The SZZ algorithm

1.14 1.16 1.18

FIXED BUG42233

BUGINTRO

BUGINTRO

BUGINTRO

1.11

REMOVE FALSE POSITIVES

Page 27: Automatic Identification of Bug-Introducing Changes

Drawbacks of SZZ

Annotation by SCMs is insufficient.(line number in bug-introducing revision is missing)

Page 28: Automatic Identification of Bug-Introducing Changes

Drawbacks of SZZ

Not all modifications are fixes.(blank lines, comments, etc.)

Annotation by SCMs is insufficient.(line number in bug-introducing revision is missing)

Page 29: Automatic Identification of Bug-Introducing Changes

False negatives and positives

BUGSBUG-INTRODUCING

False negatives

False positives

Correct

Page 30: Automatic Identification of Bug-Introducing Changes

An example

void bar() { if (val==null) { println(val); }...

BUG-INTRODUCING

Revision 7: tom- introduces the defects

Page 31: Automatic Identification of Bug-Introducing Changes

An example

void bar() { if (val==null) { println(val); }...

BUG-INTRODUCING

Revision 7: tom- introduces the defects

void foo() { // print val if (val==null) { println(val); }...Revision 23: jim- inserts a comment- reformats if statement

void foo() { // print value if (val!=null) { println(val); }...

BUG FIX

Revision 42: kim- changes comment- corrects defect

Page 32: Automatic Identification of Bug-Introducing Changes

An example

void bar() { if (val==null) { println(val); }...

BUG-INTRODUCING

Revision 7: tom- introduces the defects

void foo() { // print val if (val==null) { println(val); }...Revision 23: jim- inserts a comment- reformats if statement

void foo() { // print value if (val!=null) { println(val); }...

BUG FIX

Revision 42: kim- changes comment- corrects defect

// print val if (val!=null)

Page 33: Automatic Identification of Bug-Introducing Changes

An example

void bar() { if (val==null) { println(val); }...

BUG-INTRODUCING

Revision 7: tom- introduces the defects

void foo() { // print val if (val==null) { println(val); }...Revision 23: jim- inserts a comment- reformats if statement

void foo() { // print value if (val!=null) { println(val); }...

BUG FIX

Revision 42: kim- changes comment- corrects defect

// print val if (val==null) {

// print val if (val!=null)

Page 34: Automatic Identification of Bug-Introducing Changes

An example

void bar() { if (val==null) { println(val); }...

BUG-INTRODUCING

Revision 7: tom- introduces the defects

void foo() { // print val if (val==null) { println(val); }...Revision 23: jim- inserts a comment- reformats if statement

void foo() { // print value if (val!=null) { println(val); }...

BUG FIX

Revision 42: kim- changes comment- corrects defect

// print val if (val==null) {

// print val if (val!=null)

The originial SZZ algorithm has too many false positives (rev 23) and false negatives (rev 7).

Page 35: Automatic Identification of Bug-Introducing Changes

Our study

COLUMBA

Software type

Number of revisions

Investigated period

Email client

500

11/2002-06/2003

Project

Number of fixes

Average LOC

ECLIPSE (jdt.core)

143 (29%)

48,135

IDE

1000

06/2001-03/2002

158 (16%)

111,059

Page 36: Automatic Identification of Bug-Introducing Changes

Annotation graphs

Page 37: Automatic Identification of Bug-Introducing Changes

Annotation graphs

fixed

revisions

Page 38: Automatic Identification of Bug-Introducing Changes

foo()

Annotation graphs

fixed

revisions

Page 39: Automatic Identification of Bug-Introducing Changes

foo()

Annotation graphs

fixed

revisions

bar() renamed to foo()

Page 40: Automatic Identification of Bug-Introducing Changes

foo()

bar()

Annotation graphs

fixed

revisions

bar() renamed to foo()

Page 41: Automatic Identification of Bug-Introducing Changes

foo()

bar()

Annotation graphs

SZZ reports a bug-introducing change for foo (false positive) but not for bar (false negative).

fixed

revisions

bar() renamed to foo()

Page 42: Automatic Identification of Bug-Introducing Changes

foo()

bar()

Annotation graphs

fixed

revisions

bar() renamed to foo()

Using annotation graphs we can remove 2% as false positives and identifies further 1%~4%.

Page 43: Automatic Identification of Bug-Introducing Changes

Comments & blank lines

public void notifySourceElementRequestor() {- + if (reportReferenceInfo) {+ notifyAllUnknownReferences();+ } // collect the top level ast nodes int length = 0;

Page 44: Automatic Identification of Bug-Introducing Changes

Comments & blank lines

public void notifySourceElementRequestor() {- + if (reportReferenceInfo) {+ notifyAllUnknownReferences();+ } // collect the top level ast nodes int length = 0;

Ignoring comments and blank linesremoves 14%~20% as false positives.

Page 45: Automatic Identification of Bug-Introducing Changes

Format changes

if ( a==true ) return;BUG-INTRODUCING

Page 46: Automatic Identification of Bug-Introducing Changes

Format changes

if ( a==true ) return;BUG-INTRODUCING

if (a==true) return;

if (a==false) return;

BUG FIX

Page 47: Automatic Identification of Bug-Introducing Changes

Format changes

if ( a==true ) return;BUG-INTRODUCING

if (a==true) return;

if (a==false) return;

BUG FIX

FALSE POSITIVE

Page 48: Automatic Identification of Bug-Introducing Changes

Format changes

if ( a==true ) return;BUG-INTRODUCING

if (a==true) return;

if (a==false) return;

BUG FIX

FALSE POSITIVE

FALSE NEGATIVE

Page 49: Automatic Identification of Bug-Introducing Changes

Format changes

if ( a==true ) return;BUG-INTRODUCING

if (a==true) return;

if (a==false) return;

BUG FIX

FALSE POSITIVE

FALSE NEGATIVE

Ignoring format changes removes 18%~25% as false positives and identifies further 13%~14%.

Page 50: Automatic Identification of Bug-Introducing Changes

Fixes that affect many files

- public boolean visit(TypeDeclaration - typeDeclaration, BlockScope scope){+ public boolean visit(LocalTypeDeclaration+ typeDeclaration, BlockScope scope){

Most large fixes are refactoring

Page 51: Automatic Identification of Bug-Introducing Changes

Fixes that affect many files

- public boolean visit(TypeDeclaration - typeDeclaration, BlockScope scope){+ public boolean visit(LocalTypeDeclaration+ typeDeclaration, BlockScope scope){

Most large fixes are refactoring

Ignoring fixes that affect many files (=more than five times the median) removes 7%~16% as false positives

Page 52: Automatic Identification of Bug-Introducing Changes
Page 53: Automatic Identification of Bug-Introducing Changes

Manual inspection of fixes

deleteResources(actualNonJavaResources,fForce);- IResource[] remaingFiles; + IResource[] remainingFiles;  try { - remaingFiles=((IFolder)res).members(); + remainingFiles=((IFolder)res).members(); }

Two judges check whether a fix is actually a fix.

Page 54: Automatic Identification of Bug-Introducing Changes

Manual inspection of fixes

deleteResources(actualNonJavaResources,fForce);- IResource[] remaingFiles; + IResource[] remainingFiles;  try { - remaingFiles=((IFolder)res).members(); + remainingFiles=((IFolder)res).members(); }

Manual inspection of fixes removes only 4%~5% as false positives.

Two judges check whether a fix is actually a fix.

Page 55: Automatic Identification of Bug-Introducing Changes

Putting it together

BUGS

SZZ

ASE

Page 56: Automatic Identification of Bug-Introducing Changes

Putting it together

BUGS

SZZ

False negativesof SZZ: 14%

ASE

Page 57: Automatic Identification of Bug-Introducing Changes

Putting it together

BUGS

SZZ

False positives of SZZ: 38%~51%

False negativesof SZZ: 14%

ASE

Page 58: Automatic Identification of Bug-Introducing Changes

Putting it together

BUGS

SZZ

False positives of SZZ: 38%~51%

False negativesof SZZ: 14%

ASE

Automatically we can remove 36%~48% as false positive.

Page 59: Automatic Identification of Bug-Introducing Changes

Don’t program on Fridays ;-)

Sunday

Saturday

Friday

Thursday

Wednesday

Tuesday

Monday

0% 2.5% 5.0% 7.5% 10.0% 12.5% 15.0%

11.3%

10.4%

11.1%

12.1%

12.2%

11.7%

11.6%

Percentage of bug-introducing changes

Page 60: Automatic Identification of Bug-Introducing Changes

Don’t program on Fridays ;-)

Sunday

Saturday

Friday

Thursday

Wednesday

Tuesday

Monday

0% 2.5% 5.0% 7.5% 10.0% 12.5% 15.0%

11.3%

10.4%

11.1%

12.1%

12.2%

11.7%

11.6%

Percentage of bug-introducing changes

Page 61: Automatic Identification of Bug-Introducing Changes

Defect-prone authors

Amy

Bender

Leela

Fry

0% 10% 20% 30% 40% 50% 60% 70% 80%

Bug-Introducing changesFix changes

Authors(anonymized)

Frequency

Page 62: Automatic Identification of Bug-Introducing Changes

Risk awareness

“Safe” Location(green)

Risky Location(dark red)

Page 63: Automatic Identification of Bug-Introducing Changes

Change classification

Page 64: Automatic Identification of Bug-Introducing Changes

Change classification

X X X X

bug-introducing (“bad”)

Page 65: Automatic Identification of Bug-Introducing Changes

Change classification

X X X X

BUILD A CLASSIFIER

bug-introducing (“bad”)

Page 66: Automatic Identification of Bug-Introducing Changes

Change classification

X X X X

BUILD A CLASSIFIER

new change

bug-introducing (“bad”)

Page 67: Automatic Identification of Bug-Introducing Changes

Change classification

X X X X

BUILD A CLASSIFIER

PREDICT QUALITY

new change

bug-introducing (“bad”)

Page 68: Automatic Identification of Bug-Introducing Changes

Conclusions

Page 69: Automatic Identification of Bug-Introducing Changes

Conclusions

• Bug-introducing changes tell when a defect was introduced, not only its location.

Page 70: Automatic Identification of Bug-Introducing Changes

Conclusions

• Bug-introducing changes tell when a defect was introduced, not only its location.

• We can automatically identify – 36%~48% of SZZ as false positives and – further 14% of missed bug-introductions.

Page 71: Automatic Identification of Bug-Introducing Changes

Conclusions

• Bug-introducing changes tell when a defect was introduced, not only its location.

• We can automatically identify – 36%~48% of SZZ as false positives and – further 14% of missed bug-introductions.

• Bug-introducing changes are useful for defect prediction and software evolution.

Page 72: Automatic Identification of Bug-Introducing Changes

Conclusions

• Bug-introducing changes tell when a defect was introduced, not only its location.

• We can automatically identify – 36%~48% of SZZ as false positives and – further 14% of missed bug-introductions.

• Bug-introducing changes are useful for defect prediction and software evolution.

More mining @ ASE:Friday morning.