Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.


Transcript of Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Page 1: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Automatic Identification of Bug-Introducing Changes.

Presenter: Haroon Malik

Page 2: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Abstract

Bug-fixes do not contain information about the change that initially introduced the bug.
Extraction of bug-introducing changes is challenging.
Proposed: an algorithm to automatically and accurately identify bug-introducing changes.
The algorithm can remove 38%~51% of false positives and 14%~15% of false negatives compared to the previous algorithm.

Page 3: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Introduction

Software projects control their changes using software configuration management (SCM) systems and capture bug reports using bug tracking software, e.g. Bugzilla.
Projects record which change in the SCM system fixes a specific bug in the change tracking system.

Bug progression:
1) A programmer makes a change that introduces the bug: the bug-introducing change.
2) The bug manifests itself in some undesirable external behavior and is recorded in the bug tracking system.
3) A developer modifies the code to fix the bug: the bug-fix change.

Page 4: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Introduction (Cont’d)

With the widespread use of SCM, data concerning bug-fix changes is readily available.

It is easy to mine an SCM repository for changes that have repaired a bug, by linking keywords in the change log with bug report references.

E.g.: “Bug” or “Fixed” #902340
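The keyword linking above can be sketched in a few lines of Python. This is only an illustration: the two regular expressions and the function name `find_bug_references` are assumptions, not the patterns used in the presented work.

```python
import re

# Assumed patterns: the keywords from the slide ("bug", "fixed") plus a
# numeric bug-report reference written as "#<number>".
KEYWORD_RE = re.compile(r"\b(?:bug|fix(?:ed|es)?)\b", re.IGNORECASE)
BUG_ID_RE = re.compile(r"#(\d+)")


def find_bug_references(log_message):
    """Return the bug IDs referenced by a change log message.

    The message only counts as a bug fix if it contains a keyword;
    otherwise an empty list is returned.
    """
    if not KEYWORD_RE.search(log_message):
        return []
    return [int(bug_id) for bug_id in BUG_ID_RE.findall(log_message)]


print(find_bug_references("Fixed #902340: null check in parser"))
print(find_bug_references("Add new toolbar icons"))
```

A real miner would also check the extracted number against the bug tracking system, since a "#" followed by digits is not always a bug report.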

Page 5: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Major Problems with Bug-Fix Data

It sheds no light on when a bug was injected.
The person who fixes a bug is not always the one who caused it.
It cannot determine where a bug occurred.

Page 6: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Background

How the SZZ algorithm works:
1) It locates keywords in change logs to mark bug-fix changes.
2) It runs a diff tool to determine what changed in the bug-fix. The diff tool returns “hunks”.
3) It uses the annotate feature of the SCM to find, for the lines in each hunk, the most recent revision that changed them and who made the change.
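A minimal sketch of these steps, assuming an in-memory list of revisions in place of a real SCM. Python's `difflib` stands in for the diff tool, and the hypothetical helper `annotate` stands in for the SCM's annotate feature. The Java-like lines are made up for illustration. Indices are 0-based, so index 0 plays the role of the first revision.

```python
import difflib


def annotate(history, upto):
    """For each line of history[upto], return the index of the revision
    that last added or changed that line."""
    origins = [0] * len(history[0])
    for rev in range(1, upto + 1):
        old, new = history[rev - 1], history[rev]
        # Default: the line was added or changed in this revision.
        new_origins = [rev] * len(new)
        matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                # Unchanged lines keep the origin they already had.
                new_origins[j1:j2] = origins[i1:i2]
        origins = new_origins
    return origins


def bug_introducing_revisions(history, fix_rev):
    """Return the revisions that last touched the lines a bug-fix changed."""
    before, after = history[fix_rev - 1], history[fix_rev]
    origins = annotate(history, fix_rev - 1)
    matcher = difflib.SequenceMatcher(None, before, after, autojunk=False)
    suspects = set()
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):  # a hunk modified or removed by the fix
            suspects.update(origins[i1:i2])
    return suspects


# Toy history (assumed): bug introduced, function renamed, bug fixed.
history = [
    ["void bar() {", "  if (report == null) {", "    println(report);", "  }", "}"],
    ["void foo() {", "  if (report == null) {", "    println(report);", "  }", "}"],
    ["void foo() {", "  if (report != null) {", "    println(report);", "  }", "}"],
]

print(bug_introducing_revisions(history, fix_rev=2))
```

Here the fix in the last revision changes only the condition line, and annotate traces that line back to index 0, the revision that introduced it.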

Page 7: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Background (Cont’d)

Revision 1: Origin of bug (Line 3).
Revision 2: Function name changed (bar to foo).
Revision 3: Bug removal.

Page 8: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

SZZ Limitations

Blank spaces and comments.
Formatting changes (Line 3).
Name of the function containing the bug.

Page 9: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Proposed Approach

Applied at the method level to two Java open source projects, Columba and Eclipse.
Two human judges manually verified all hunks in a series of bug-fix changes to ensure the corresponding hunks are real bug fixes.

Page 10: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Proposed Approach

Steps (1-5) remove 38%~51% of false positives and 14%~15% of false negatives as compared to SZZ.

Page 11: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Experimental Setup

History extraction: used Kenyon to extract histories from SCM systems.

Page 12: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Experimental Setup (Cont’d)

Accuracy measures:
A bug-introducing change set consists of all the changes within specific project revisions that have been identified as bug-introducing.
Assuming R is the more accurate bug-introducing change set, the false positives and false negatives for a set P are computed relative to R.
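The formulas themselves did not survive in this transcript. A minimal sketch, assuming the usual set-difference definitions: false positives are the changes in P that are not in R, as a fraction of P, and false negatives are the changes in R that P missed, as a fraction of R. The function names and the change identifiers are hypothetical.

```python
def false_positive_rate(P, R):
    """Fraction of changes in P that the more accurate set R does not contain."""
    return len(P - R) / len(P) if P else 0.0


def false_negative_rate(P, R):
    """Fraction of changes in R that P failed to identify."""
    return len(R - P) / len(R) if R else 0.0


# Hypothetical change identifiers, for illustration only.
P = {"c1", "c2", "c3", "c4"}   # changes flagged by the approach under test
R = {"c2", "c3", "c5"}         # changes in the more accurate set

print(false_positive_rate(P, R))  # c1 and c4 are not in R: 2 of 4
print(false_negative_rate(P, R))  # c5 was missed: 1 of 3
```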

Page 13: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Annotation Graph

A graph which contains information on the cross-revision mappings of individual lines.

Major improvement over SZZ.
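To make the idea of cross-revision line mappings concrete, here is a small sketch. It is an illustration under assumptions, not the construction used in the presented work: nodes are (revision, line index) pairs, edges are derived with Python's `difflib`, and the names `build_annotation_graph` and `trace_origin` are hypothetical.

```python
import difflib


def build_annotation_graph(history):
    """Map each (revision, line) node to its predecessors in the previous revision.

    Each edge is labeled "unchanged" for identical lines or "modified" for
    lines inside a changed hunk. Newly inserted lines have no predecessors.
    """
    graph = {}
    for rev in range(1, len(history)):
        old, new = history[rev - 1], history[rev]
        matcher = difflib.SequenceMatcher(None, old, new, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            for j in range(j1, j2):
                if tag == "equal":
                    graph[(rev, j)] = [((rev - 1, i1 + (j - j1)), "unchanged")]
                elif tag == "replace":
                    graph[(rev, j)] = [((rev - 1, i), "modified") for i in range(i1, i2)]
                else:  # "insert": the line is new in this revision
                    graph[(rev, j)] = []
    return graph


def trace_origin(graph, node):
    """Follow "unchanged" edges back to where the line's content first appeared."""
    while True:
        preds = graph.get(node, [])
        if len(preds) == 1 and preds[0][1] == "unchanged":
            node = preds[0][0]
        else:
            return node


# Toy history (assumed): a comment is added and the function is renamed,
# which shifts the condition line from index 1 to index 2.
history = [
    ["void bar() {", "  if (report == null) {", "    println(report);", "  }", "}"],
    ["// print report", "void foo() {", "  if (report == null) {",
     "    println(report);", "  }", "}"],
]
graph = build_annotation_graph(history)

print(trace_origin(graph, (1, 2)))  # the condition line maps back to (0, 1)
print(trace_origin(graph, (1, 1)))  # the renamed signature originates at (1, 1)
```

Because the mapping is per line, the condition line is still traced to its original revision even though its position and the name of the enclosing function changed.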

Page 14: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Experimental Setup (Cont’d)

Non-behavior changes: code format, comments and blank lines.

14%~20% of false positives.

Format changes
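One way to recognize such non-behavior changes is to compare the code after stripping comments and whitespace. The sketch below is a rough approximation for Java-like source; the function names are hypothetical. It ignores string literals that contain comment markers, and removing all whitespace can in rare cases merge two distinct tokens.

```python
import re


def normalize(source):
    """Strip comments and all whitespace so only behavior-relevant text remains."""
    source = re.sub(r"/\*.*?\*/", "", source, flags=re.DOTALL)  # block comments
    source = re.sub(r"//[^\n]*", "", source)                    # line comments
    return re.sub(r"\s+", "", source)                           # spaces, tabs, blank lines


def is_non_behavior_change(old_source, new_source):
    """True if the text differs but the normalized code is identical."""
    return old_source != new_source and normalize(old_source) == normalize(new_source)


old = "if (a == null) {\n    foo(); // call foo\n}\n"
reformatted = "if (a == null)\n{\n\n  foo();\n}\n"
fixed = "if (a != null) {\n    foo(); // call foo\n}\n"

print(is_non_behavior_change(old, reformatted))  # only layout and a comment changed
print(is_non_behavior_change(old, fixed))        # the condition itself changed
```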

Page 15: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Manual Verification

If a change log indicates that a revision is a bug-fix, it is assumed that all the hunks in the revision are bug fixes.
Two human judges marked each bug-fix hunk for both projects.
Used a bug-fix hunk verification tool.

Notes:
1) How many of them are true bug-fixes.
3) For one developer it might be a bug fix, and for another it might be code cleanup or the addition of a new feature.
4) Renaming of files was observed by the human judges, but it does not affect the execution behaviour.
Page 16: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Real Bugs?

Notes:
1) Example of == null.
2) IEEE: "any condition that departs from the expected
Page 17: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Validation Hurdles

Non-representative systems.
Open source.
Bug-fix data is incomplete.
Manual verification.

Notes:
1) Open source systems and their development methodology are different from commercial ones. Commercial projects can be better in the sense that they have different departments working on code improvement. Moreover, they can have worse results and different patterns of bugs due to load pressure and targets to achieve, such as release deadlines.
Page 18: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Bug-Introduction Statistics

Eclipse Columba

Page 19: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Conclusion

Refined the SZZ approach by introducing the annotation graph.
Experiments showed removal of 38%~51% of false positives and 14% of false negatives as compared to SZZ.

Page 20: Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.

Thank You

Notes:
1) Repetition.
2) Reduces false negatives and false positives, but real bugs.
3) %