Automatic Identification of Bug-Introducing Changes. Presenter: Haroon Malik.
Abstract
Bug-fix changes do not record which change originally introduced the bug.
Extracting bug-introducing changes is therefore challenging.
This work presents an algorithm that automatically and accurately identifies bug-introducing changes.
Compared with the previous algorithm (SZZ), it removes 38%~51% of false positives and 14%~15% of false negatives.
Introduction
Software projects control their changes with software configuration management (SCM) systems and capture bug reports with bug tracking software, e.g. Bugzilla.
Projects can record which change in the SCM system fixes a specific bug in the change tracking system.
Bug progression:
  A programmer makes a change that introduces the bug: the bug-introducing change.
  The bug manifests itself in some undesirable external behavior and is recorded in the bug tracking system.
  A developer modifies the code to fix the bug: the bug-fix change.
Introduction (Cont’d)
With the widespread use of SCM, data on bug-fix changes is readily available.
It is easy to mine an SCM repository for changes that repaired a bug, by linking keywords in the change log with bug report references.
E.g. “Bug” or “Fixed” together with a report number such as #902340.
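The keyword linking described above can be sketched as follows. This is a minimal illustration, not the exact heuristic used in the work: the two regular expressions and the function name `linked_bug_ids` are assumptions chosen for the example.

```python
import re

# Hypothetical patterns: a bug-fix keyword, and a bug report number of 4+ digits.
KEYWORD = re.compile(r"\b(bug|fix(e[sd])?)\b", re.IGNORECASE)
BUG_ID = re.compile(r"#?(\d{4,})")


def linked_bug_ids(log_message):
    """Return the bug report numbers in a change log that looks like a bug fix."""
    if not KEYWORD.search(log_message):
        return []  # no keyword: do not treat the change as a bug fix
    return [int(number) for number in BUG_ID.findall(log_message)]


print(linked_bug_ids("Fixed #902340: null check in report printer"))
print(linked_bug_ids("Refactor parser"))
```

Real change logs are noisier than this, so a pattern of this kind would normally be tuned per project and checked against the bug tracker.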
Major problems with bug-fix data
It sheds no light on when a bug was injected.
The person who fixes a bug is not always the person who caused it.
It cannot determine where the bug occurred.
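Background
How the SZZ algorithm works:
First, it locates keywords in change logs to mark bug-fix changes.
Second, it runs a diff tool to determine what changed in the bug fix. The diff tool returns the changed regions, each called a “hunk”.
It then uses the annotate feature of the SCM system to find, for each line in a hunk:
  the most recent revision in which the line was changed
  who made the change
Background (Cont’d)
Revision 1: origin of the bug (line 3).
Revision 2: function name changed (bar to foo).
Revision 3: bug removal.
SZZ Limitations
SZZ can blame the wrong revision when the most recent change to a line was:
  a blank-space or comment change
  a formatting change (line 3)
  a change to the name of the function containing the bug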
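The blame step and its formatting limitation can be illustrated with a small sketch. Revisions are modeled as lists of lines, Python's `difflib` stands in for the SCM diff and annotate features, and the three-revision example is hypothetical; the function names `annotate` and `szz_blame` are assumptions made for this illustration.

```python
import difflib


def annotate(revisions):
    """For each line of the last revision, return the revision (1-based) that last touched it."""
    origins = [1] * len(revisions[0])
    for rev_no in range(2, len(revisions) + 1):
        old, new = revisions[rev_no - 2], revisions[rev_no - 1]
        new_origins = [rev_no] * len(new)  # default: line is new in this revision
        matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":  # unchanged lines keep their earlier origin
                new_origins[j1:j2] = origins[i1:i2]
        origins = new_origins
    return origins


def szz_blame(revisions_before_fix, fixed):
    """Return the revisions blamed for the lines that the fix modified or deleted."""
    buggy = revisions_before_fix[-1]
    origins = annotate(revisions_before_fix)
    blamed = set()
    matcher = difflib.SequenceMatcher(a=buggy, b=fixed, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            blamed.update(origins[i1:i2])
    return blamed


# Hypothetical history: the wrong condition is written in revision 1,
# revision 2 only renames the function and moves a brace, revision 3 is the fix.
rev1 = ["void bar() {", "  if (report == null) {", "    println(report);", "  }", "}"]
rev2 = ["void foo() {", "  if (report == null)", "  {", "    println(report);", "  }", "}"]
rev3 = ["void foo() {", "  if (report != null)", "  {", "    println(report);", "  }", "}"]

print(szz_blame([rev1, rev2], rev3))
```

In this sketch the blame lands on revision 2, because the brace move was the last edit to the condition line, even though the faulty condition dates from revision 1.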
Proposed Approach
Applied at the method level to two open source Java projects, Columba and Eclipse.
Two human judges manually verified all hunks in a series of bug-fix changes to ensure that the corresponding hunks are real bug fixes.
Proposed Approach (Cont’d)
Steps 1-5 remove 38%~51% of false positives and 14%~15% of false negatives compared to SZZ.
Experimental setup
History extraction
Used Kenyon to extract histories from the SCM systems.
Experimental setup (Cont’d)
Accuracy measures
A bug-introducing change set consists of all the changes within specific project revisions that have been identified as bug-introducing.
Assuming R is the more accurate bug-introducing change set, the false positives and false negatives of another set P are computed relative to R.
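A minimal sketch of these accuracy measures, assuming they are set-difference ratios: the false positive rate is |P − R| / |P| and the false negative rate is |R − P| / |R|. These definitions, the function names, and the change identifiers below are assumptions made for illustration.

```python
def false_positive_rate(p, r):
    """Share of changes in P that the more accurate set R does not confirm."""
    return len(p - r) / len(p) if p else 0.0


def false_negative_rate(p, r):
    """Share of changes in R that P failed to identify."""
    return len(r - p) / len(r) if r else 0.0


# Hypothetical change identifiers.
P = {"c1", "c2", "c3", "c4"}        # set produced by the approach under test
R = {"c2", "c3", "c4", "c5", "c6"}  # more accurate reference set

print(false_positive_rate(P, R), false_negative_rate(P, R))
```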
Annotation Graph
A graph that contains information on the cross-revision mappings of individual lines.
A major improvement over SZZ.
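One way to picture such a graph is as a mapping from each line of a revision to the lines of the previous revision it came from. The sketch below is an assumption-laden illustration rather than the construction used in the work: revisions are 0-based lists of lines, `difflib` supplies the hunks, and every new line in a modified hunk is conservatively linked to every old line in that hunk.

```python
import difflib


def build_annotation_graph(revisions):
    """Map (revision, line) to a list of (previous revision, line, changed?) edges."""
    graph = {}
    for rev in range(1, len(revisions)):
        old, new = revisions[rev - 1], revisions[rev]
        matcher = difflib.SequenceMatcher(a=old, b=new, autojunk=False)
        for tag, i1, i2, j1, j2 in matcher.get_opcodes():
            if tag == "equal":
                for i, j in zip(range(i1, i2), range(j1, j2)):
                    graph[(rev, j)] = [(rev - 1, i, False)]
            elif tag == "replace":
                # Conservative choice: any old line in the hunk may be the predecessor.
                for j in range(j1, j2):
                    graph[(rev, j)] = [(rev - 1, i, True) for i in range(i1, i2)]
            # Inserted lines get no entry: they have no predecessor.
    return graph


def ancestors(graph, node):
    """Return every earlier (revision, line) reachable from the given node."""
    seen, stack = set(), [node]
    while stack:
        for prev_rev, prev_line, _changed in graph.get(stack.pop(), []):
            if (prev_rev, prev_line) not in seen:
                seen.add((prev_rev, prev_line))
                stack.append((prev_rev, prev_line))
    return seen


# Hypothetical history: a rename in revision 1, an added line in revision 2.
history = [
    ["void bar() {", "  check();", "}"],
    ["void foo() {", "  check();", "}"],
    ["void foo() {", "  check();", "  log();", "}"],
]
graph = build_annotation_graph(history)
print(sorted(ancestors(graph, (2, 0))))
```

Unlike a plain annotate, the graph keeps the link through the rename, so a line can be followed back past a revision that modified it.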
Experimental setup (Cont’d)
Non-behavior changes
Changes to code format, comments, and blank lines.
These account for 14%~20% of false positives.
Format changes
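A rough sketch of how non-behavior changes might be filtered out before blaming a revision. It assumes Java-style `//` comments and treats all whitespace as insignificant, which is cruder than a real tokenizer (string literals and `/* */` blocks are not handled); the function names are hypothetical.

```python
import re


def normalize(line):
    """Drop a trailing // comment and all whitespace from one line of code."""
    code = line.split("//", 1)[0]
    return re.sub(r"\s+", "", code)


def behavior_changed(old_lines, new_lines):
    """Return True if the hunk differs in more than comments, blank lines, or layout."""
    old_code = "".join(normalize(line) for line in old_lines)
    new_code = "".join(normalize(line) for line in new_lines)
    return old_code != new_code


# Moving a brace to its own line is a format change only.
print(behavior_changed(["if (report == null) {"], ["if (report == null)", "{"]))
# Changing the operator alters behavior.
print(behavior_changed(["if (report == null)"], ["if (report != null)"]))
```

Hunks for which `behavior_changed` returns False would then be skipped when searching for the bug-introducing change.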
Manual Verification
If a change log indicates that a revision is a bug fix, all the hunks in the revision are assumed to be bug fixes.
Two human judges marked each bug-fix hunk for both projects.
They used a bug-fix hunk verification tool.
Real Bugs?
Validation Hurdles
Non-representative systems.
Open source.
Bug-fix data is incomplete.
Manual verification.
Bug-Introduction Statistics
Eclipse Columba
Conclusion
Refined the SZZ approach by introducing the annotation graph.
Experiments showed removal of 38%~51% of false positives and 14%~15% of false negatives compared to SZZ.
Thank You