Software Analytics: Data Analytics for Software Engineering and Security
Software Analytics: Towards Software Mining that Matters
Transcript of Software Analytics: Towards Software Mining that Matters
![Page 1: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/1.jpg)
Software Analytics:
Towards Software Mining
that Matters Tao Xie
University of Illinois at Urbana-Champaign http://www.cs.illinois.edu/homes/taoxie/
![Page 2: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/2.jpg)
Should I test\review my?
©A. Hassan
![Page 3: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/3.jpg)
©A. Hassan
![Page 4: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/4.jpg)
©A. Hassan
![Page 5: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/5.jpg)
©A. Hassan
![Page 6: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/6.jpg)
Software analytics is to enable software practitioners to perform data exploration and analysis in order to obtain insightful and actionable information for data-driven tasks around software and services.
[MALETS’11 Zhang et al.]
![Page 7: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/7.jpg)
Software Intelligence & Analytics for Software Development
http://people.engr.ncsu.edu/txie/publications/foser10-si.pdf http://thomas-zimmermann.com/publications/files/buse-foser-2010.pdf
![Page 8: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/8.jpg)
• use Data Exploration and Analysis: Mining Software Repositories (MSR)
• for Software Practitioners: Beyond Software Developers
• obtain Insightful and Actionable info: Need to get real as well
• Analytic Techniques
• Producing Impact on Practice
![Page 9: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/9.jpg)
Look through your software data
©A. Hassan
![Page 13: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/13.jpg)
Mine through the data!
http://msrconf.org
An international effort to make software repositories actionable
http://promisedata.org
Promise Data Repository
©A. Hassan
![Page 14: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/14.jpg)
Mining Software Repositories (MSR)
• Transforms static record-keeping repositories to active repositories
• Makes repository data actionable by uncovering hidden patterns and trends
Repositories: mailing lists, Bugzilla, crashes, field logs, CVS/SVN
©A. Hassan
![Page 20: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/20.jpg)
• Historical Repositories: Source Control (CVS/SVN), Bugzilla, Mailing lists
• Runtime Repos: Field Logs, Crash Repos
• Code Repos: Sourceforge, GoogleCode
©A. Hassan
![Page 24: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/24.jpg)
MSR researchers analyze and cross-link repositories
• Bugzilla: fixed bug
• Mailing list: discussions
• CVS/SVN: buggy change & fixing change
• Crashes: field crashes
New Bug Report: estimate fix effort, mark duplicates, suggest experts and fix!
©A. Hassan
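One common way to cross-link a source-control repository with a bug tracker is to scan commit messages for bug-report identifiers. The sketch below is only an illustration: it assumes commit messages mention bugs as "bug 1234", "Bug #1234", or "#1234" (a convention the slides do not specify), and the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class BugLinker {
    // Assumed convention: "bug 1234", "Bug #1234", or "#1234" in the commit message.
    private static final Pattern BUG_REF =
            Pattern.compile("(?:bug\\s*#?|#)(\\d+)", Pattern.CASE_INSENSITIVE);

    /** Returns the bug IDs mentioned in a commit message, in order of appearance. */
    public static List<Integer> bugIds(String commitMessage) {
        List<Integer> ids = new ArrayList<>();
        Matcher m = BUG_REF.matcher(commitMessage);
        while (m.find()) {
            ids.add(Integer.parseInt(m.group(1)));
        }
        return ids;
    }

    public static void main(String[] args) {
        // Each returned ID is a candidate link from this commit to a bug report.
        System.out.println(bugIds("Fix NPE in parser (bug 4711), see also #4712"));
    }
}
```

Links found this way are heuristic: a number after "#" is not always a bug ID, so results would normally be checked against the tracker before use.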
![Page 25: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/25.jpg)
• use Data Exploration and Analysis: Mining Software Repositories (MSR)
• for Software Practitioners: Beyond Software Developers
• obtain Insightful and Actionable info: Need to get real as well
• Analytic Techniques
• Producing Impact on Practice
![Page 26: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/26.jpg)
We continue to help practitioners (esp. developers)
©A. Hassan
![Page 27: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/27.jpg)
©A. Hassan
![Page 28: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/28.jpg)
©A. Hassan
![Page 29: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/29.jpg)
©A. Hassan
![Page 30: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/30.jpg)
©A. Hassan
![Page 31: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/31.jpg)
©A. Hassan
![Page 32: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/32.jpg)
©A. Hassan
![Page 33: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/33.jpg)
©A. Hassan
![Page 34: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/34.jpg)
©A. Hassan
![Page 35: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/35.jpg)
©A. Hassan
![Page 36: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/36.jpg)
Detection and Management of Code Clones
©A. Hassan
![Page 37: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/37.jpg)
Support Logs
Source Code
©A. Hassan
![Page 38: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/38.jpg)
©A. Hassan
![Page 39: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/39.jpg)
• use Data Exploration and Analysis: Mining Software Repositories (MSR)
• for Software Practitioners: Beyond Software Developers
• obtain Insightful and Actionable info: Need to get real as well
• Analytic Techniques
• Case Studies
![Page 40: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/40.jpg)
Predicting Bugs
• Studies have shown that most complexity metrics correlate well with LOC!
– Graves et al. 2000 on commercial systems
– Herraiz et al. 2007 on open source systems
• Noteworthy findings:
– Previous bugs are good predictors of future bugs
– The more a file changes, the more likely it will have bugs in it
– Recent changes affect the bug potential of a file more than older changes do (weighted time damp models)
– Number of developers is of little help in predicting bugs
– Hard to generalize bug predictors across projects unless in similar domains [Nagappan, Ball et al. 2006]
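To make the "weighted time damp" idea concrete, the sketch below scores a file by summing one contribution per change and discounting each by its age. The exponential half-life form and the `halfLifeDays` parameter are assumptions chosen for illustration; they are not the formula from the cited studies.

```java
import java.util.List;

public class TimeDampedScore {
    /**
     * Sums one contribution per change, discounted by the age of the change.
     * halfLifeDays is a hypothetical tuning parameter: a change that old counts half.
     */
    public static double score(List<Double> changeAgesInDays, double halfLifeDays) {
        double total = 0.0;
        for (double age : changeAgesInDays) {
            total += Math.pow(0.5, age / halfLifeDays);
        }
        return total;
    }

    public static void main(String[] args) {
        // Two files with the same number of changes but different recency (made-up ages).
        double recentlyChanged = score(List.of(1.0, 3.0, 7.0), 30.0);
        double changedLongAgo = score(List.of(200.0, 300.0, 400.0), 30.0);
        System.out.println(recentlyChanged > changedLongAgo);
    }
}
```

With plain change counts the two files would tie; the damping is what lets recent activity dominate the ranking.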
![Page 41: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/41.jpg)
Using Imports in Eclipse to Predict Bugs
import org.eclipse.jdt.internal.compiler.lookup.*;
import org.eclipse.jdt.internal.compiler.*;
import org.eclipse.jdt.internal.compiler.ast.*;
import org.eclipse.jdt.internal.compiler.util.*;
...
import org.eclipse.pde.core.*;
import org.eclipse.jface.wizard.*;
import org.eclipse.ui.*;
14% of all files that import ui packages had to be fixed later on.
71% of files that import compiler packages had to be fixed later on.
[Schröter et al. 06]
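Percentages like these can be computed by grouping files by the packages they import and counting how many were fixed later. The sketch below is a minimal illustration under assumptions: the `SourceFile` schema, the prefix matching, and the toy data in `main` are all made up and do not reproduce the study's data or method.

```java
import java.util.List;

public class ImportFixRate {
    /** One source file: its imports and whether it was fixed later (hypothetical schema). */
    public static final class SourceFile {
        final List<String> imports;
        final boolean fixedLater;

        public SourceFile(List<String> imports, boolean fixedLater) {
            this.imports = imports;
            this.fixedLater = fixedLater;
        }
    }

    /** Fraction of files importing anything under packagePrefix that were fixed later; NaN if none import it. */
    public static double fixRate(List<SourceFile> files, String packagePrefix) {
        int importing = 0;
        int fixed = 0;
        for (SourceFile f : files) {
            boolean uses = f.imports.stream().anyMatch(i -> i.startsWith(packagePrefix));
            if (uses) {
                importing++;
                if (f.fixedLater) {
                    fixed++;
                }
            }
        }
        return importing == 0 ? Double.NaN : (double) fixed / importing;
    }

    public static void main(String[] args) {
        // Made-up toy data, not taken from the Eclipse study.
        List<SourceFile> files = List.of(
                new SourceFile(List.of("org.eclipse.jdt.internal.compiler.ast.*"), true),
                new SourceFile(List.of("org.eclipse.jdt.internal.compiler.util.*", "org.eclipse.ui.*"), false),
                new SourceFile(List.of("org.eclipse.ui.*"), false),
                new SourceFile(List.of("org.eclipse.jface.wizard.*"), true));
        System.out.println(fixRate(files, "org.eclipse.jdt.internal.compiler"));
        System.out.println(fixRate(files, "org.eclipse.ui"));
    }
}
```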
![Page 42: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/42.jpg)
Percentage of bug-introducing changes for Eclipse
Don’t program on Fridays ;-)
[Zimmermann et al. 05]
![Page 43: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/43.jpg)
Failure is a 4-letter Word
[PROMISE’11 Zeller et al.]
![Page 44: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/44.jpg)
Actionable Alone is not Enough!
[PROMISE’11 Zeller et al.]
![Page 45: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/45.jpg)
Who produces more buggy code?
©A. Hassan
![Page 47: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/47.jpg)
• use Data Exploration and Analysis: Mining Software Repositories (MSR)
• for Software Practitioners: Beyond Software Developers
• obtain Insightful and Actionable info: Need to get real as well
• Analytic Techniques
• Producing Impact on Practice
![Page 48: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/48.jpg)
Analytic Techniques in SE
• Association rules and frequent patterns
• Classification
• Clustering
• Text mining/Natural language processing
• Visualization
More details are at https://sites.google.com/site/xsoftanalytics/
![Page 49: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/49.jpg)
Solution-Driven: Where can I apply X miner?
• Basic mining algorithms, e.g., association rule, frequent itemset mining…
Problem-Driven: What patterns do we really need?
• Advanced mining algorithms, e.g., frequent partial order mining [ESEC/FSE 07]
• New/adapted mining algorithms, e.g., [ICSE 09], [ASE 09]
![Page 52: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/52.jpg)
Mining (traditional approaches)
• Mine patterns from a few code repositories (1, 2), e.g., Eclipse, Linux, …
• Often lack sufficient relevant data points (e.g., API call sites)
Searching + Mining (our new approaches)
• Search many code repositories (1, 2, …, N) through a code search engine, e.g., open source code on the web
• Mine patterns from the search results
![Page 56: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/56.jpg)
• Existing approaches produce high % of false positives
• One major observation: programmers often write code in different ways for achieving the same task
• Some ways are more frequent than others
[Diagram: frequent ways, infrequent ways; mine patterns → Mined Patterns → detect violations]
S. Thummalapenta and T. Xie. Alattin: Mining Alternative Patterns for Detecting Neglected Conditions. ASE 2009.
![Page 58: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/58.jpg)
Example: java.util.Iterator.next()

    void PrintEntries1(ArrayList<String> entries) {
        …
        Iterator it = entries.iterator();
        if (it.hasNext()) {
            String last = (String) it.next();
        }
        …
    }

Code Sample 1

    void PrintEntries2(ArrayList<String> entries) {
        …
        if (entries.size() > 0) {
            Iterator it = entries.iterator();
            String last = (String) it.next();
        }
        …
    }

Code Sample 2
java.util.Iterator.next() throws NoSuchElementException when invoked on a list without any elements
S. Thummalapenta and T. Xie. Alattin: Mining Alternative Patterns for Detecting Neglected Conditions. ASE 2009.
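The two guards and the neglected condition can be exercised directly. The following is a small runnable sketch; the class and method names are hypothetical and only mirror the two code samples above.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class IteratorGuards {
    /** Guard in the style of Code Sample 1: ask the iterator itself. */
    public static String firstViaHasNext(List<String> entries) {
        Iterator<String> it = entries.iterator();
        if (it.hasNext()) {
            return it.next();
        }
        return null;
    }

    /** Guard in the style of Code Sample 2: check the collection size first. */
    public static String firstViaSize(List<String> entries) {
        if (entries.size() > 0) {
            Iterator<String> it = entries.iterator();
            return it.next();
        }
        return null;
    }

    /** No guard at all: the neglected condition. */
    public static String firstUnguarded(List<String> entries) {
        return entries.iterator().next();
    }

    public static void main(String[] args) {
        List<String> empty = new ArrayList<>();
        System.out.println(firstViaHasNext(empty)); // guarded: no exception
        System.out.println(firstViaSize(empty));    // guarded: no exception
        try {
            firstUnguarded(empty);
        } catch (NoSuchElementException e) {
            System.out.println("unguarded call threw NoSuchElementException");
        }
    }
}
```

Both guarded variants are safe on an empty list, which is why a checker that accepts only the `hasNext` guard would wrongly flag code written like Code Sample 2.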
![Page 63: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/63.jpg)
Example: java.util.Iterator.next()

    void PrintEntries1(ArrayList<String> entries) {
        …
        Iterator it = entries.iterator();
        if (it.hasNext()) {
            String last = (String) it.next();
        }
        …
    }

Code Sample 1

    void PrintEntries2(ArrayList<String> entries) {
        …
        if (entries.size() > 0) {
            Iterator it = entries.iterator();
            String last = (String) it.next();
        }
        …
    }

Code Sample 2
1243 code examples
Sample 1 (1218 / 1243)
Sample 2 (6 / 1243)
Mined Pattern from existing approaches:
“boolean check on return of Iterator.hasNext before Iterator.next”
S. Thummalapenta and T. Xie. Alattin: Mining Alternative Patterns for Detecting Neglected Conditions. ASE 2009.
![Page 71: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/71.jpg)
Example: java.util.Iterator.next()
Require more general patterns (alternative patterns): P1 or P2
P1: boolean check on return of Iterator.hasNext before Iterator.next
P2: boolean check on return of ArrayList.size before Iterator.next
Cannot be mined by existing approaches, since alternative P2 is infrequent

    void PrintEntries1(ArrayList<String> entries) {
        …
        Iterator it = entries.iterator();
        if (it.hasNext()) {
            String last = (String) it.next();
        }
        …
    }

Code Sample 1

    void PrintEntries2(ArrayList<String> entries) {
        …
        if (entries.size() > 0) {
            Iterator it = entries.iterator();
            String last = (String) it.next();
        }
        …
    }

Code Sample 2
S. Thummalapenta and T. Xie. Alattin: Mining Alternative Patterns for Detecting Neglected Conditions. ASE 2009.
![Page 74: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/74.jpg)
Our Solution: ImMiner Algorithm
Mines alternative patterns of the form P1 or P2
Based on the observation that infrequent alternatives such as P2 are frequent among code examples that do not support P1
1243 code examples
Sample 1 (1218 / 1243)
Sample 2 (6/1243)
P2 is infrequent among entire 1243 code examples
[ASE 09]
S.Thummalapenta and T. Xie. Alattin: Mining Alternative Patterns for Detecting Neglected Conditions. ASE 2009.
![Page 75: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/75.jpg)
Our Solution: ImMiner Algorithm
Mines alternative patterns of the form P1 or P2
Based on the observation that infrequent alternatives such as P2 are frequent among code examples that do not support P1
1243 code examples
Sample 1 (1218 / 1243)
Sample 2 (6/1243)
P2 is frequent among code examples not supporting P1
[ASE 09]
S.Thummalapenta and T. Xie. Alattin: Mining Alternative Patterns for Detecting Neglected Conditions. ASE 2009.
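The observation can be checked with the counts on the slide. The sketch below assumes that none of the 6 examples following Sample 2 also supports P1, so all of them fall among the 1243 − 1218 = 25 examples that do not support P1; the class name is hypothetical. Under that assumption P2 has support 6/1243 (about 0.5%) over the whole set but 6/25 (24%) among the examples not supporting P1.

```java
public class SupportCounts {
    /** Support = fraction of the examples in a database that follow a pattern. */
    public static double support(int matching, int total) {
        return (double) matching / total;
    }

    public static void main(String[] args) {
        int all = 1243;          // code examples gathered for Iterator.next()
        int supportingP1 = 1218; // examples guarded by hasNext (Sample 1)
        int supportingP2 = 6;    // examples guarded by size (Sample 2)

        // Assumption: the P2 examples are disjoint from the P1 examples.
        int notSupportingP1 = all - supportingP1;

        System.out.println("P2 over all examples:    " + support(supportingP2, all));
        System.out.println("P2 among non-P1 examples: " + support(supportingP2, notSupportingP1));
    }
}
```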
![Page 76: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/76.jpg)
Alternative Patterns
ImMiner mines three kinds of alternative patterns of the general form “P1 or P2”
• Balanced: all alternatives (both P1 and P2) are frequent
• Imbalanced: some alternatives (P1) are frequent and others are infrequent (P2). Represented as “P1 or P^2”
• Single: only one alternative
S. Thummalapenta and T. Xie. Alattin: Mining Alternative Patterns for Detecting Neglected Conditions. ASE 2009.
![Page 77: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/77.jpg)
ImMiner Algorithm
Uses frequent-itemset mining [Burdick et al. ICDE 01] iteratively
An input database with the following APIs for Iterator.next()
Input database
Mapping of IDs to APIs
![Page 78: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/78.jpg)
ImMiner Algorithm: Frequent Alternatives
Input database
Frequent itemset mining (min_sup 0.5)
Frequent item: 1
P1: boolean-check on the return of Iterator.hasNext() before Iterator.next()
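The frequent-alternative step can be sketched as follows. The database on the original slide is an image and is not recoverable from this transcript, so `DATABASE` and `ID_TO_API` below are hypothetical. The sketch also counts single items only, which is a simplification of the maximal frequent-itemset miner cited on the previous slide.

```python
from collections import Counter

# Hypothetical mapping of item IDs to condition checks around Iterator.next().
ID_TO_API = {
    1: "boolean check on return of Iterator.hasNext() before Iterator.next()",
    2: "const check on return of ArrayList.size() before Iterator.next()",
    3: "null check on the receiver before Iterator.next()",
}

# Hypothetical input database: one set of item IDs per code example.
DATABASE = [
    {1}, {1, 3}, {1}, {1}, {1, 3}, {1}, {1}, {2}, {2, 3}, {2},
]


def frequent_items(database, min_sup):
    """Return {item: support} for items whose relative support >= min_sup."""
    counts = Counter(item for transaction in database for item in transaction)
    n = len(database)
    return {item: c / n for item, c in counts.items() if c / n >= min_sup}


frequent = frequent_items(DATABASE, min_sup=0.5)
for item, sup in sorted(frequent.items()):
    print(f"item {item} (support {sup:.1f}): {ID_TO_API[item]}")
```

With this made-up data only item 1 clears `min_sup`, mirroring the slide's result that P1 is the single frequent alternative.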
![Page 79: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/79.jpg)
ImMiner: Infrequent Alternatives of P1
![Page 80: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/80.jpg)
ImMiner: Infrequent Alternatives of P1
Split input database into two databases: Positive and Negative
![Page 81: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/81.jpg)
ImMiner: Infrequent Alternatives of P1
Positive database (PSD)
Split input database into two databases: Positive and Negative
![Page 82: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/82.jpg)
ImMiner: Infrequent Alternatives of P1
Positive database (PSD)
Negative database (NSD)
Split input database into two databases: Positive and Negative
![Page 83: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/83.jpg)
ImMiner: Infrequent Alternatives of P1
Positive database (PSD)
Negative database (NSD)
Split input database into two databases: Positive and Negative
Mine patterns that are frequent in NSD and are infrequent in PSD
Reason: Only such patterns serve as alternatives for P1
![Page 84: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/84.jpg)
ImMiner: Infrequent Alternatives of P1
Positive database (PSD)
Negative database (NSD)
Split input database into two databases: Positive and Negative
Mine patterns that are frequent in NSD and are infrequent in PSD
Reason: Only such patterns serve as alternatives for P1
Alternative Pattern: P2 “const check on the return of ArrayList.size() before Iterator.next()”
Alattin applies the ImMiner algorithm to detect neglected conditions
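The split-and-mine step described on this slide can be sketched as below. This is not the authors' implementation: the database, the item IDs, and the two threshold parameters (`min_sup_nsd`, `max_sup_psd`) are hypothetical, and the sketch considers single items rather than full itemsets.

```python
from collections import Counter


def relative_support(database):
    """Map each item to the fraction of transactions that contain it."""
    n = len(database)
    if n == 0:
        return {}
    counts = Counter(item for transaction in database for item in transaction)
    return {item: c / n for item, c in counts.items()}


def infrequent_alternatives(database, p1, min_sup_nsd=0.5, max_sup_psd=0.5):
    """Find items that are frequent in NSD and infrequent in PSD for pattern p1.

    PSD (positive database) holds the transactions that support p1;
    NSD (negative database) holds the transactions that do not.
    """
    psd = [t for t in database if p1 in t]
    nsd = [t for t in database if p1 not in t]
    sup_psd = relative_support(psd)
    sup_nsd = relative_support(nsd)
    return sorted(
        item
        for item, sup in sup_nsd.items()
        if sup >= min_sup_nsd and sup_psd.get(item, 0.0) < max_sup_psd
    )


# Hypothetical database: 1 = hasNext() check, 2 = ArrayList.size() check,
# 3 = an unrelated check that appears with and without item 1.
DATABASE = [
    {1}, {1, 3}, {1}, {1}, {1, 3}, {1}, {1}, {2}, {2, 3}, {2},
]

alternatives = infrequent_alternatives(DATABASE, p1=1)
print(alternatives)
```

Requiring the item to be infrequent in PSD is what filters out checks that merely co-occur with P1; an item that is frequent in both databases accompanies P1 rather than replacing it.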
![Page 85: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/85.jpg)
Neglected Conditions
Neglected conditions refer to
• Missing conditions that check the arguments or receiver of the API call before the API call
• Missing conditions that check the return or receiver of the API call after the API call
One primary reason for many fatal issues, such as security or buffer-overflow vulnerabilities [Chang et al. ISSTA 07]
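Once an alternative pattern such as “P1 or P2” has been mined, a neglected condition is a call site that satisfies none of the alternatives. The sketch below shows only that final check; the call-site names and item IDs are hypothetical, and the real tool derives the checks by analysing code rather than reading them from a dictionary.

```python
def find_neglected_conditions(call_sites, alternatives):
    """Return the call sites whose checks include none of the alternatives."""
    return [
        name
        for name, checks in call_sites.items()
        if not (checks & alternatives)
    ]


# Hypothetical call sites of Iterator.next(), each with the IDs of the
# condition checks found before the call:
#   1 = hasNext() check (P1), 2 = ArrayList.size() check (P2), 3 = unrelated.
CALL_SITES = {
    "Report.printAll": {1},
    "Cache.first": {2},
    "Parser.readToken": {3},
    "Loader.peek": set(),
}

violations = find_neglected_conditions(CALL_SITES, alternatives={1, 2})
print(violations)
```

Checking against both alternatives is the point of mining P2: with P1 alone, `Cache.first` would be reported as a violation even though it guards the call with the size check.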
![Page 86: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/86.jpg)
• use Data Exploration and Analysis Mining Software Repositories (MSR)
• for Software Practitioners Beyond Software Developers
• obtain Insightful and Actionable info Need to get real as well
• Analytic Techniques
• Producing Impact on Practice
![Page 87: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/87.jpg)
Machine Learning that Matters
http://arxiv.org/ftp/arxiv/papers/1206/1206.4656.pdf
[ICML’12 Wagstaff]
![Page 88: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/88.jpg)
• Hyper-Focus on Benchmark Data Sets
• Hyper-Focus on Abstract Metrics
• Lack of Follow-Through
![Page 89: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/89.jpg)
• Meaningful Evaluation Methods
• Involvement of the World Outside ML
• Eyes on the Prize
![Page 90: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/90.jpg)
MSRA Software Analytics Group
Utilize a data-driven approach to help create high-performing, user-friendly, and efficiently developed and operated software and services.
Contact: Dongmei Zhang
http://research.microsoft.com/groups/sa/
![Page 91: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/91.jpg)
MSRA Software Analytics Group
Research Topics
• Software Development Process
• Software Systems
• Software Users
![Page 92: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/92.jpg)
MSRA Software Analytics Group
Research Topics
• Software Development Process
• Software Systems
• Software Users
Technology Pillars
• Information Visualization
• Analysis Algorithms
• Large-scale Computing
![Page 93: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/93.jpg)
![Page 94: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/94.jpg)
MSRA Software Analytics Group
Research Topics (Vertical)
• Software Development Process
• Software Systems
• Software Users
Technology Pillars (Horizontal)
• Information Visualization
• Analysis Algorithms
• Large-scale Computing
![Page 95: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/95.jpg)
Software Analytics in Practice
![Page 96: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/96.jpg)
Adoption Challenges for Software Analytics
Must show value before data quality improves
Correlation vs. Causation
![Page 97: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/97.jpg)
ICSE Papers: Industry vs. Academia
Source© Carlo Ghezzi
![Page 98: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/98.jpg)
ICSE Papers: Industry vs. Academia
Source© Carlo Ghezzi
Developers, Programmers, Architects among all attendees: OSDI 2008 26% vs. xSE ?%
![Page 99: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/99.jpg)
ICSE Papers: Industry vs. Academia
Source© Carlo Ghezzi
Developers, Programmers, Architects among all attendees: OSDI 2008 26% vs. xSE ?%
ICSM 11 Keynote
ICSE 09 Keynote
MSR 12 Keynote
MSR 11 Keynote
SCAM 12 Keynote
![Page 100: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/100.jpg)
"Are Automated Debugging [Research] Techniques Actually Helping Programmers?"
• 50 years of automated debugging research
– N papers, only 5 evaluated with actual programmers
[ISSTA11 Parnin&Orso]
![Page 101: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/101.jpg)
Are Regression Testing [Research] Techniques Actually Helping Industry?
• Likely among the most studied testing problems
– N papers
[STVR11 Yoo&Harman]
![Page 102: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/102.jpg)
Are [Some] Failure-Proneness Prediction [Research] Techniques Actually Helping?
• Empirical software engineering (on prediction)
– N papers
[PROMISE11 Zeller et al.]
![Page 103: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/103.jpg)
A Researcher's Observation in HCI Research Community
• “The reviewers simply do not value the difficulty of building real systems and how hard controlled studies are to run on real systems for real tasks. This is in contrast with how easy it is to build new interaction techniques and then to run tight, controlled studies on these new techniques with small, artificial tasks”
“I give up on CHI/UIST” by James Landay http://dubfuture.blogspot.com/2009/11/i-give-up-on-chiuist.html Source©J. Landay
![Page 104: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/104.jpg)
A Researcher's Observation in HCI Research Community
• “This attitude is a joke and it offers researchers no incentive to do systems work. Why should they? Why should we put 3-4 person years into every CHI publication? Instead we can do 8 weeks of work on an idea piece or create a new interaction technique and test it tightly in 8-12 weeks and get a full CHI paper.”
![Page 105: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/105.jpg)
A Researcher's Observation in HCI Research Community
• “When will this community wake up and understand that they are going to run out any work on creating new systems (rather than small pieces of systems) and cede that important endeavor to industry?”
• “We are our own worst enemies. I think we have been blinded by the perception that "true scientific" research is only found in controlled experiments and nice statistics.”
![Page 106: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/106.jpg)
A Researcher's Observation in HCI Research Community
• “When will this community wake up and understand that they are going to run out any work on creating new systems (rather than small pieces of systems) and cede that important endeavor to industry?”
• “We are our own worst enemies. I think we have been blinded by the perception that "true scientific" research is only found in controlled experiments and nice statistics.”
Does our research community have similar issues??
![Page 107: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/107.jpg)
MS Academic Search: “Pointer Analysis”
![Page 108: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/108.jpg)
“Pointer Analysis: Haven’t We Solved This Problem Yet?” [Hind PASTE’01]
“During the past 21 years, over 75 papers and 9 Ph.D. theses have been published on pointer analysis. Given the tomes of work on this topic one may wonder, “Haven't we solved this problem yet?” With input from many researchers in the field, this paper describes issues related to pointer analysis and remaining open problems.”
Michael Hind. Pointer analysis: haven't we solved this problem yet?. In Proc. ACM SIGPLAN-SIGSOFT Workshop on Program Analysis for Software Tools and Engineering (PASTE 2001)
Source©M. Hind
![Page 109: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/109.jpg)
“Pointer Analysis: Haven’t We Solved This Problem Yet?” [Hind PASTE’01]
Section 4.3 Designing an Analysis for a Client’s Needs
“Barbara Ryder expands on this topic: “… We can all write an unbounded number of papers that compare different pointer analysis approximations in the abstract. However, this does not accomplish the key goal, which is to design and engineer pointer analyses that are useful for solving real software problems for realistic programs.”
Source©M. Hind&B. Ryder
![Page 110: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/110.jpg)
MS Academic Search: “Clone Detection”
![Page 111: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/111.jpg)
MS Academic Search: “Clone Detection”
Research typically focuses on and evaluates intermediate steps (e.g., clone detection) instead of ultimate tasks (e.g., bug detection or refactoring), even when the field has already matured after n years of effort on intermediate steps
![Page 112: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/112.jpg)
Some Success Stories of Applying Clone Detection [Focus on Ultimate Tasks]
Zhenmin Li, Shan Lu, Suvda Myagmar, and Yuanyuan Zhou. CP-Miner: a tool for finding copy-paste and related bugs in operating system code. In Proc. OSDI 2004.
MSRA XIAO
Yingnong Dang, Dongmei Zhang, Song Ge, Chengyun Chu, Yingjun Qiu, and Tao Xie. XIAO: Tuning Code Clones at Hands of Engineers in Practice. In Proc. ACSAC 2012.
http://patterninsight.com/
http://www.blackducksoftware.com/
http://research.microsoft.com/en-us/groups/sa/
![Page 113: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/113.jpg)
Suggested Actions Tech Adoption
• Get research problems from real practice
• Get feedback from real practice
• Collaborate across disciplines
• Collaborate with industry
![Page 114: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/114.jpg)
• Software Analytics
– Data Exploration and Analysis
– For Software Practitioners
– Obtain Insightful and Actionable info
– With Analytic Techniques
• Producing Impact on Practice
![Page 115: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/115.jpg)
Acknowledgments
• Microsoft Research Asia Software Analytics Group
• Ahmed Hassan, Lin Tan, Jian Pei
• Many other colleagues
![Page 116: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/116.jpg)
Q&A
![Page 117: Software Analytics: Towards Software Mining that Matters](https://reader034.fdocuments.net/reader034/viewer/2022051514/54892a66b47959140d8b58f3/html5/thumbnails/117.jpg)