Code coverage for MSR Researchers [Work in Progress]

Code Coverage for MSR Researchers Mauricio Aniche [email protected] Monday, November 18, 13

description

My talk about the tool and heuristic I am developing to calculate code coverage statically.

Transcript of Code coverage for MSR Researchers [Work in Progress]

Page 1: Code coverage for MSR Researchers [Work in Progress]

Code Coverage for MSR Researchers

Mauricio Aniche [email protected]

Monday, November 18, 13

Page 2: Code coverage for MSR Researchers [Work in Progress]

What is Code Coverage?

• Describes how much of the production code is exercised by the test suite.

• In its simplest form, it is the number of lines executed when running the test suite divided by the total number of lines.

Page 3: Code coverage for MSR Researchers [Work in Progress]

Why Do We Need This?

• It is hard to calculate code coverage when studying a large number of repositories.

• Compiled code needed

• Test suite execution needed

• As we know, every project has its own way to compile and run.

Page 4: Code coverage for MSR Researchers [Work in Progress]

Static Analysis

• Static analysis would solve this problem: it needs neither compiled code nor a test run.

• But static analysis cannot execute the code, so it cannot observe which lines actually run.

• We need heuristics!

Page 5: Code coverage for MSR Researchers [Work in Progress]

Our idea

• A production method has a certain level of complexity (which can be measured by McCabe’s number, i.e. cyclomatic complexity)

• public int a() { if (x) return 1; else return 2; }

• If a method contains two different paths, then it probably needs two different tests.
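A rough sketch of how McCabe's number could be approximated from source text: 1 plus the number of decision points. The class name `McCabeSketch` and the keyword list are assumptions made for illustration, not the tool's actual implementation. A real implementation would walk a parsed syntax tree, since plain text matching is fooled by comments, string literals, and generic wildcards.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only; not taken from the tool in the talk.
class McCabeSketch {
    // Decision points: branching keywords plus short-circuit and ternary operators.
    private static final Pattern DECISION =
        Pattern.compile("\\b(if|for|while|case|catch)\\b|&&|\\|\\||\\?");

    // Approximates McCabe's number as 1 + the number of decision points found.
    static int complexity(String methodSource) {
        Matcher matcher = DECISION.matcher(methodSource);
        int count = 1; // a method with no branches has exactly one path
        while (matcher.find()) {
            count++;
        }
        return count;
    }

    public static void main(String[] args) {
        String source = "public int a() { if (x) return 1; else return 2; }";
        System.out.println(complexity(source)); // prints 2
    }
}
```

For the method on this slide, the single `if` gives a McCabe's number of 2, which matches the "two paths, two tests" intuition.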

Page 6: Code coverage for MSR Researchers [Work in Progress]

Our formula

• Method-level: Qty of tests / McCabe’s number

• Class-Level: Sum(Qty of tests per method) / Sum(McCabe’s number per method)
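The two formulas can be sketched as below. The class name `StaticCoverageSketch`, the method names, and the numbers in `main` are made up for illustration. The slides do not say whether the ratio is capped at 1 when a method has more tests than paths, so this sketch leaves it uncapped.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical names; a sketch of the formulas on this slide, not the tool's code.
class StaticCoverageSketch {
    // Method-level: number of tests / McCabe's number.
    static double methodCoverage(int tests, int mccabe) {
        return (double) tests / mccabe;
    }

    // Class-level: sum of tests per method / sum of McCabe's number per method.
    // Each map value is a pair {tests, mccabe} for one method.
    static double classCoverage(Map<String, int[]> methods) {
        int totalTests = 0;
        int totalMcCabe = 0;
        for (int[] method : methods.values()) {
            totalTests += method[0];
            totalMcCabe += method[1];
        }
        return (double) totalTests / totalMcCabe;
    }

    public static void main(String[] args) {
        // Made-up numbers: a() has McCabe's number 2 and one test,
        // b() has McCabe's number 1 and one test.
        Map<String, int[]> methods = new LinkedHashMap<>();
        methods.put("a", new int[] {1, 2});
        methods.put("b", new int[] {1, 1});

        System.out.println(methodCoverage(1, 2));   // prints 0.5
        System.out.println(classCoverage(methods)); // 2 tests over 3 paths
    }
}
```

Note that the class-level value is a ratio of sums, not an average of the per-method ratios, so methods with more paths weigh more.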

Page 7: Code coverage for MSR Researchers [Work in Progress]

Identifying test methods

@Test public void testaOMetodo2() { A a = new A(); int resultado = a.fazAlgo(); Assert.assertEquals(1, resultado); }

@Test public void testaOMetodo() { A a = new A(); int resultado = a.getB().fazAlgo(); Assert.assertEquals(1, resultado); }

1st impl.: tests fazAlgo()

2nd impl.: tests getB(), fazAlgo()
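One way to list the methods a test body invokes, sketched with a regular expression. This is an assumption about the general idea, not the tool's implementation: the class name `InvokedMethodsSketch` is hypothetical, and a real tool would resolve calls on a parsed syntax tree rather than on text. Calls on `Assert` are skipped because they belong to the test framework.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for illustration only.
class InvokedMethodsSketch {
    // Matches "<receiver>.<method>(" where the receiver is an identifier
    // or the closing parenthesis of the previous call in a chain.
    private static final Pattern CALL =
        Pattern.compile("(\\w+|\\))\\.(\\w+)\\s*\\(");

    static List<String> invokedMethods(String testBody) {
        List<String> names = new ArrayList<>();
        Matcher matcher = CALL.matcher(testBody);
        while (matcher.find()) {
            if (matcher.group(1).equals("Assert")) {
                continue; // skip JUnit assertions
            }
            names.add(matcher.group(2));
        }
        return names;
    }

    public static void main(String[] args) {
        String first = "A a = new A(); int resultado = a.fazAlgo(); "
            + "Assert.assertEquals(1, resultado);";
        String second = "A a = new A(); int resultado = a.getB().fazAlgo(); "
            + "Assert.assertEquals(1, resultado);";

        System.out.println(invokedMethods(first));  // prints [fazAlgo]
        System.out.println(invokedMethods(second)); // prints [getB, fazAlgo]
    }
}
```

The text-based match cannot tell which class `fazAlgo()` belongs to in the chained call; that needs type information.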

Page 8: Code coverage for MSR Researchers [Work in Progress]

Comparing the solution

• I want to compare it against Emma (a tool that measures coverage dynamically, by running the test suite)

• I don’t want to replace the tool (that would not make sense)

• I want to find out the average error

• If it is small, then we can use the heuristic.

Page 9: Code coverage for MSR Researchers [Work in Progress]

Calculating the difference

• All charts are based on the difference: our calculated number minus Emma’s number.

• A “0” means that the two numbers were the same.

• A negative number indicates that our tool calculates a lower code coverage than Emma.

• A positive number indicates the opposite.
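The difference described above, plus one possible way to summarize it, can be sketched as follows. The class name `CoverageDifferenceSketch` is hypothetical and the numbers in `main` are made up, not results from the study. Using the mean absolute difference as the "average error" is an assumption, since the slides do not define it.

```java
// Hypothetical helper for illustration only.
class CoverageDifferenceSketch {
    // Negative: our tool reports lower coverage than Emma. Positive: higher.
    static double difference(double ours, double emma) {
        return ours - emma;
    }

    // Average size of the difference, ignoring its sign (an assumed summary).
    static double meanAbsoluteDifference(double[] ours, double[] emma) {
        if (ours.length != emma.length || ours.length == 0) {
            throw new IllegalArgumentException("need two non-empty arrays of equal length");
        }
        double sum = 0;
        for (int i = 0; i < ours.length; i++) {
            sum += Math.abs(difference(ours[i], emma[i]));
        }
        return sum / ours.length;
    }

    public static void main(String[] args) {
        // Made-up coverage percentages for two classes.
        double[] ours = {50.0, 100.0};
        double[] emma = {75.0, 90.0};

        System.out.println(difference(ours[0], emma[0]));       // prints -25.0
        System.out.println(meanAbsoluteDifference(ours, emma)); // prints 17.5
    }
}
```

Taking the absolute value keeps under- and over-estimates from cancelling each other out in the average.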

Page 10: Code coverage for MSR Researchers [Work in Progress]

A few examples

Page 11: Code coverage for MSR Researchers [Work in Progress]

Spearman Correlation

Page 12: Code coverage for MSR Researchers [Work in Progress]

Metric Miner

Page 13: Code coverage for MSR Researchers [Work in Progress]

Gnarus

Page 14: Code coverage for MSR Researchers [Work in Progress]

Caelum Web

Page 15: Code coverage for MSR Researchers [Work in Progress]

Discussion

• It looks like the tool can differ from dynamic analysis by 25% to 30%.

• Questions:

• How can I eliminate the large errors?

• How can I determine if the tool is valid or not?

Page 16: Code coverage for MSR Researchers [Work in Progress]

Advantages

• Really fast. It does not need to compile the code or run the tests.

• If a test fails, dynamic analysis may fail. Static analysis does not.

Page 17: Code coverage for MSR Researchers [Work in Progress]

Disadvantages

• It is a heuristic.

• The implementation is very complicated.

• There might be bugs in the implementation.

• A few things are pretty hard to identify, mainly inheritance and polymorphism.

• AOP code.

Page 18: Code coverage for MSR Researchers [Work in Progress]

Wanna help?

• github.com/mauricioaniche/gelato2

• github.com/metricminer-msr/codemetrics

• My goal: MSR MSR MSR!

Page 19: Code coverage for MSR Researchers [Work in Progress]

Thanks!

[email protected]
