Code Coverage for MSR Research [Work in Progress]
Uploaded by mauricio-aniche
What is Code Coverage?
• Describes how much of the production code is exercised by the test suite.
• In its basic form, it is the number of lines executed when running the test suite divided by the total number of lines.
Monday, November 18, 13
Why Do We Need This?
• It is hard to calculate code coverage when studying a large number of repositories.
• Compiled code needed
• Test suite execution needed
• As we know, every project has its own way to compile and run.
Static Analysis
• Static analysis would solve this problem.
• But static analysis cannot execute the code.
• We need heuristics!
Our idea
• A production method contains a certain level of complexity (which can be measured by McCabe’s number)
• public int a() { if (x) return 1; else return 2; }
• If a method contains 2 different paths, then it probably needs two different tests.
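As a rough illustration of how McCabe's number could be estimated without running anything, here is a minimal sketch. It is an assumption for illustration, not the tool's implementation: the class name McCabeSketch and the method complexity are hypothetical, and counting keywords with a regular expression ignores comments, string literals, and generic wildcards, which a real parser would handle.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: approximates McCabe's number of a method body
// by counting decision points in the source text.
class McCabeSketch {
    // Decision points counted here: branching keywords, the ternary
    // operator, and the short-circuit operators && and ||.
    private static final Pattern DECISION =
        Pattern.compile("\\b(if|for|while|case|catch)\\b|\\?|&&|\\|\\|");

    static int complexity(String methodBody) {
        Matcher m = DECISION.matcher(methodBody);
        int count = 1; // a method without branches has exactly one path
        while (m.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        // Body of the example method a() from the slide.
        String body = "if (x) return 1; else return 2;";
        System.out.println(complexity(body)); // prints 2
    }
}
```

The single if adds one decision point to the base path, which gives the two paths mentioned on the slide.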
Our formula
• Method level: number of tests / McCabe’s number
• Class level: Sum(number of tests per method) / Sum(McCabe’s number per method)
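The two ratios can be sketched as follows. The names StaticCoverageSketch and MethodStats are hypothetical, and returning 0 for a class without methods is an assumption. The slides do not say whether the ratio is capped, so this sketch lets it exceed 1 when a method has more tests than paths.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch of the method-level and class-level ratios.
class StaticCoverageSketch {
    // Holds the two numbers the formula needs for one production method.
    static final class MethodStats {
        final String name;
        final int tests;  // number of test methods found to invoke this method
        final int mccabe; // McCabe's number of this method (at least 1)

        MethodStats(String name, int tests, int mccabe) {
            this.name = name;
            this.tests = tests;
            this.mccabe = mccabe;
        }
    }

    // Method level: number of tests / McCabe's number.
    static double methodLevel(MethodStats m) {
        return (double) m.tests / m.mccabe;
    }

    // Class level: sum of tests per method / sum of McCabe's numbers per method.
    static double classLevel(List<MethodStats> methods) {
        int tests = 0;
        int mccabe = 0;
        for (MethodStats m : methods) {
            tests += m.tests;
            mccabe += m.mccabe;
        }
        // Assumption: a class without methods gets a ratio of 0.
        return mccabe == 0 ? 0.0 : (double) tests / mccabe;
    }

    public static void main(String[] args) {
        MethodStats a = new MethodStats("a", 1, 2); // two paths, one test
        MethodStats b = new MethodStats("b", 1, 1); // one path, one test
        System.out.println(methodLevel(a)); // prints 0.5
        System.out.println(classLevel(Arrays.asList(a, b)));
    }
}
```

Note that the class-level value divides the two sums. It is not the average of the per-method ratios, so complex methods weigh more.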
Identifying test methods
@Test
public void testaOMetodo2() {
    A a = new A();
    int resultado = a.fazAlgo();
    Assert.assertEquals(1, resultado);
}

@Test
public void testaOMetodo() {
    A a = new A();
    int resultado = a.getB().fazAlgo();
    Assert.assertEquals(1, resultado);
}

1st impl. tests fazAlgo()
2nd impl. tests getB(), fazAlgo()
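One way to recover the invoked production methods from a test body is sketched below. This is an assumption for illustration only: the class InvokedMethodsSketch is hypothetical, it matches ".name(" with a regular expression instead of walking a parsed syntax tree, and it treats any call whose name starts with "assert" as test infrastructure rather than production code.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical sketch: lists the methods a test body invokes.
class InvokedMethodsSketch {
    // Matches a method call made through a dot, e.g. ".fazAlgo(".
    private static final Pattern CALL = Pattern.compile("\\.(\\w+)\\s*\\(");

    static List<String> invokedMethods(String testBody) {
        Set<String> names = new LinkedHashSet<>(); // keeps first-seen order
        Matcher m = CALL.matcher(testBody);
        while (m.find()) {
            String name = m.group(1);
            // Assumption: assertion calls are not production methods.
            if (!name.startsWith("assert")) {
                names.add(name);
            }
        }
        return new ArrayList<>(names);
    }

    public static void main(String[] args) {
        String first = "A a = new A(); int resultado = a.fazAlgo(); "
            + "Assert.assertEquals(1, resultado);";
        String second = "A a = new A(); int resultado = a.getB().fazAlgo(); "
            + "Assert.assertEquals(1, resultado);";
        System.out.println(invokedMethods(first));  // prints [fazAlgo]
        System.out.println(invokedMethods(second)); // prints [getB, fazAlgo]
    }
}
```

A text-based match like this cannot tell which class a method belongs to, which is one reason inheritance and polymorphism are hard to handle statically.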
Comparing the solution
• I want to compare our results with Emma (a tool that measures coverage through dynamic analysis)
• I don’t want to replace the tool (that would not make sense)
• I want to find the average error
• If it is small, then we can use the heuristic.
Calculating the difference
• All charts show the difference: our calculated number minus Emma’s number.
• A “0” means that the two numbers were the same.
• A negative number indicates that our tool calculates a smaller code coverage than Emma.
• A positive number indicates the opposite.
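The sign convention above can be sketched as follows. The class CoverageDifferenceSketch and the coverage values are made up for illustration, and using the mean absolute difference as the "average error" is an assumption, since the slides do not say how the average is computed.

```java
// Hypothetical sketch of the difference shown in the charts.
class CoverageDifferenceSketch {
    // Our number minus Emma's number: negative means we report less coverage.
    static double difference(double ours, double emma) {
        return ours - emma;
    }

    // Assumption: the average error is the mean of the absolute differences.
    static double meanAbsoluteDifference(double[] ours, double[] emma) {
        if (ours.length != emma.length) {
            throw new IllegalArgumentException("arrays must have the same length");
        }
        if (ours.length == 0) {
            return 0.0;
        }
        double sum = 0.0;
        for (int i = 0; i < ours.length; i++) {
            sum += Math.abs(difference(ours[i], emma[i]));
        }
        return sum / ours.length;
    }

    public static void main(String[] args) {
        // Made-up coverage values for three classes, as fractions of 1.
        double[] ours = {0.5, 0.75, 1.0};
        double[] emma = {0.75, 0.75, 0.75};
        System.out.println(difference(ours[0], emma[0])); // prints -0.25
        System.out.println(meanAbsoluteDifference(ours, emma));
    }
}
```

Taking absolute values keeps underestimates and overestimates from cancelling each other out in the average.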
A few examples
Spearman Correlation
Metric Miner
Gnarus
Caelum Web
Discussion
• It looks like the tool can differ from dynamic analysis by 25% to 30%.
• Questions:
• How can I eliminate big mistakes?
• How can I determine if the tool is valid or not?
Advantages
• Really fast. It does not need to compile and run the tests.
• If a test fails, dynamic analysis may fail. Static analysis does not.
Disadvantages
• It is a heuristic.
• The implementation is very complicated.
• There might be bugs in the implementation.
• A few things are pretty hard to identify, mainly inheritance and polymorphism.
• AOP code.
Wanna help?
• github.com/mauricioaniche/gelato2
• github.com/metricminer-msr/codemetrics
• My goal: MSR MSR MSR!
Thanks!