Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha...

67
Revisiting the Experimental Design Choices for Approaches for the Automated Retrieval of Duplicate Issue Reports Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense

Transcript of Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha...

Page 1: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Revisiting the Experimental Design Choices for Approaches for the Automated Retrieval

of Duplicate Issue Reports

Ph.D. : Mohamed Sami Rakha

Supervisor: Prof. Ahmed E. Hassan

Ph.D. Thesis Defense

Page 2: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.
Page 3: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Software issues are common

Page 4: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

There are many issue tracking systems

Page 5: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Issue tracking systems receive a large number of issue reports

New Issue Reports

Over 1.2 MillionIssue Reports

Page 6: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Different persons may report the same issue

User #2

User #1

Page 7: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Different persons may report the same issue

User #2

User #1

Duplicate

Master

Page 8: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

The identification of duplicate issue reports is an annoying and timewasting task

User #2

User #1

Duplicate

20-30% Duplicates! Master

Bettenburg et al. Duplicate bug reports considered harmful… really? [ICSM 2008:]

Page 9: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Information retrieval is leveraged by prior research

Duplicate

AutomatedApproach

Search

Most similar previously reported issues

Ranked list

Master

Page 10: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Developer choose the accurate candidate for the newly reported duplicate… If there is any

Duplicate

AutomatedApproach

Search

Most similar previously reported issues

Ranked list

Developer

Master

Master

Page 11: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Several automated approaches for duplicates retrieval are proposed by prior research

Prior Research

2007: Runeson et al.

2008: Wang et al.

2010: Sun et al.

2010: Sureka et al.

2011: Sun et al.

2012: Zhou al.

2012: Nguyen al.

2015: Aggarwal al.

2016: Hindle al.2016: Jie et al.

Page 12: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Prior research focuses on improving the performance

AutomatedApproach

Performance

Recall Rate

Duplicate

The goal of newly proposed approach is to have higher performance than prior ones

Prior Research

2007: Runeson et al.

2008: Wang et al.

2010: Sun et al.

2010: Sureka et al.

2011: Sun et al.

2012: Zhou al.

2012: Nguyen al.

2015: Aggarwal al.

2016: Hindle al.2016: Jie et al.

Page 13: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

In my thesis, I revisit the experimental design choices from four perspectives

AutomatedApproach

Performance

Recall Rate

Duplicate

The goal of newly proposed approach is to have better performance than prior ones

A BRecall Rate = 0.6 Recall Rate = 0.5

>Example:

Prior Research

2007: Runeson et al.

2008: Wang et al.

2010: Sun et al.

2010: Sureka et al.

2011: Sun et al.

2012: Zhou al.

2012: Nguyen al.

2015: Aggarwal al.

2016: Hindle al.2016: Jie et al.

Revisiting the experimental design choices can help better understand the benefits and limitations of prior approaches evaluation.

Needed Effort

Data Filtration

Data Changes

Evaluation Process

Page 14: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Evaluation Process Data Filtration Data ChangesNeeded Effort

Prior research treats all duplicate reports equally

Duplicate reports differ based on the effortneeded for their manual identification

Page 15: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Evaluation Process Data Filtration Data ChangesNeeded Effort

Real Example: Duplicate (B) needs more much effort to be identified

Timereported

BA

Page 16: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Evaluation Process Data Filtration Data ChangesNeeded Effort

16

Identified

A

Real Example: Duplicate (B) needs more much effort to be identified

15 minutes

Timereported

BA

Page 17: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Evaluation Process Data Filtration Data ChangesNeeded Effort

17

Identified

A BIdentified

Real Example: Duplicate (B) needs more much effort to be identified

15 minutes

44 days

Timereported

BA

Page 18: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Evaluation Process Data Filtration Data ChangesNeeded Effort

A duplicate can be hard or easybased on the effort needed for identification

HARDEASY

Page 19: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Evaluation Process Data Filtration Data ChangesNeeded Effort

50%

Within 1 Day

WithoutDiscussions

With onePerson

50%EASY HARD

Studied Systems:

Page 20: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Evaluation Process Data Filtration Data ChangesNeeded Effort

Random Forest Model“Binary Classification”

AUCs0.68 – 0.8

50%50%EASY HARD

Page 21: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Evaluation Process Data Filtration Data ChangesNeeded Effort

Random Forest Model“Binary Classification”

AUCs0.68 – 0.8

50%50%EASY HARD

This model is used to explain the most important factors on the effort needed for identification

Factors such as Description Similarity is one of the top important ones

Page 22: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Evaluation Process Data Filtration Data ChangesNeeded Effort

Thesis Contribution #1:Show the importance of considering the neededeffort for retrieving duplicate reports in theperformance measurement of automated duplicateretrieval approaches

[Published in EMSE 2015]

Page 23: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

Testing period (1 Year)

AA Time

Not considered in the evaluation

Many prior research evaluate their approaches on short testing periods

Page 24: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

AA B Time

Testing period (1 Year)Not considered in the evaluation

Many prior research evaluate their approaches on short testing periods

Page 25: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

AA BB Time

Testing period (1 Year)

Duplicate (B) match is outside the testing period

Not considered in the evaluation

Page 26: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

TimeB AA B

Ignored

Testing period (1 Year)

Duplicate (B) match is outside the testing period

Not considered in the evaluation

Page 27: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

TimeB AA B

Ignored

Testing period (1 Year)Not considered in the evaluation

ClassicalEvaluation

I refer to this evaluation by “Classical Evaluation”

Page 28: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

Time

ClassicalEvaluationRealisticEvaluation

Testing period (1 Year)

Instead, I propose “Realistic Evaluation”

Page 29: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

Time

ClassicalEvaluationRealisticEvaluation

Testing period (1 Year)

Instead, I propose “Realistic Evaluation”

AB A B

Page 30: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

Mozilla 2010

Eclipse2008

OpenOffice 2008 - 2010

ClassicalEvaluationRealisticEvaluation

Problem #1

Number of considered duplicates

Problem #2

Performance overestimation

I compare the two evaluations on frequently used testing data in prior research

Page 31: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Time

Data ChangesData FiltrationNeeded Effort Evaluation Process

Classical evaluation is missing 23.5% of

duplicates 10,043 Duplicates

7,551 Duplicates

ClassicalEvaluationRealisticEvaluation

Mozilla 2010

Problem #1

Number of considered duplicates

Testing period

Page 32: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Time

Data ChangesData FiltrationNeeded Effort Evaluation Process

Recall rate is overestimated by 39%

for REP approach Recall: 0.43

Recall: 0.58

Mozilla 2010

ClassicalEvaluationRealisticEvaluation

Problem #2

Performance overestimation

Testing period

Page 33: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

Mozilla 2010

Eclipse2008

OpenOffice 2008 - 2010

ClassicalEvaluationRealisticEvaluation

Problem #1

Number of considered duplicates

Problem #2

Performance overestimation

Page 34: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

Mozilla 2010

Eclipse2008

OpenOffice 2008 - 2010

ClassicalEvaluationRealisticEvaluation

Problem #1

Number of considered duplicates

Problem #2

Performance overestimation

Is the difference in performance between the two evaluations generalize to other testing periods within the tested ITSs

Page 35: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

Randomly select 100 chunks as “Testing Periods”

Studying the statistical difference between the classical and realistic

evaluations

ClassicalEvaluationRealisticEvaluation

Chunk = 1 Year Time

Page 36: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

ClassicalEvaluationRealisticEvaluation

Chunk 1 (Dec2004 – Dec2005)

Chunk 2 (Nov2003 – Nov2004)

Chunk 100 (Sep2007 – Sep2008)

Results 1

Results 2

Results 100

AutomatedApproach

Page 37: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

Classical evaluation significantly overestimates the performance of the automated approaches (p-value<0.001)

Page 38: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

Thesis Contribution #2:Propose a realistic evaluation for the automatedretrieval of duplicate reports, and analyze priorfindings using such realistic evaluation

[Published in TSE 2017]

Page 39: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Time

Data ChangesNeeded Effort Evaluation Process Data Filtration

RealisticEvaluation

Do we need all these issue reports to search in?

Filtering out old issue reports should lead to reduced search time

Testing period (1 Year)

No Duplicates are Ignored

Page 40: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Time

Data ChangesNeeded Effort Evaluation Process Data Filtration

RealisticEvaluation

The possible solution is to filter out the old issue reports using a time window based on the report date

Testing period (1 Year)

No Duplicates are Ignored

Page 41: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Time

Data ChangesNeeded Effort Evaluation Process Data Filtration

RealisticEvaluation

2 Years

Example (1)

AA BB

Testing period (1 Year)

The possible solution is to filter out the old issue reports using a time window based on the report date

Page 42: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Time

Data ChangesNeeded Effort Evaluation Process Data Filtration

RealisticEvaluation

4 months

Example (2)

AA BB

Testing period (1 Year)

The possible solution is to filter out the old issue reports using a time window based on the report date

Page 43: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesNeeded Effort Evaluation Process Data Filtration

Page 44: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesNeeded Effort Evaluation Process Data Filtration

RealisticEvaluation

Page 45: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesNeeded Effort Evaluation Process Data Filtration

RealisticEvaluation

Results of the limitation by (n) months

It is not possible to limit the issue reports that are searched without decreasing

the performance of automated approaches

Page 46: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesNeeded Effort Evaluation Process Data Filtration

A B Time

Testing periodreported

Page 47: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesNeeded Effort Evaluation Process Data Filtration

A B Time

(A) is resolved 1

month ago

Testing periodreported

Page 48: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesNeeded Effort Evaluation Process Data Filtration

A B Time

(A) is resolved 1

month ago(B) is resolved 2

years ago

Testing periodreported

Page 49: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesNeeded Effort Evaluation Process Data Filtration

A B Time

Testing period

A new duplicate is more likely

to be duplicate of (A)

(A) is resolved 1

month ago(B) is resolved 2

years ago

reported

Page 50: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesNeeded Effort Evaluation Process Data Filtration

I propose filtration approach based on

time threshold from resolution date for

each resolution valueand rank

The thresholds are optimized using

Genetic Algorithm (NSGA-II)

Page 51: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesNeeded Effort Evaluation Process Data Filtration

The proposed filtration approach

improves the performance by up

to 60%

Page 52: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesNeeded Effort Evaluation Process Data Filtration

Thesis Contribution #3:Propose a genetic algorithm based approach forfiltering out the old issue reports to improve theperformance of duplicate retrieval approaches.

[Published in TSE 2017]

Page 53: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Time

Needed Effort Evaluation Process Data Filtration Data Changes

2011

Just-in-time (JIT) duplicate retrieval feature is introduced

Page 54: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Needed Effort Evaluation Process Data Filtration Data Changes

Page 55: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Time

Needed Effort Evaluation Process Data Filtration Data Changes

2011

Significantly lessduplicates but more

identification effort is needed

Page 56: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Time

Needed Effort Evaluation Process Data Filtration Data Changes

2011

AutomatedApproach

AutomatedApproach

How the JIT duplicate retrieval feature impact the performance evaluation of the automated approaches?

Page 57: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Needed Effort Evaluation Process Data Filtration Data Changes

Approach (1): BM25F Approach (2): REP

The performance of both approaches is significantly lower for after-JIT duplicates

Firefox

Page 58: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Needed Effort Evaluation Process Data Filtration Data Changes

The relative improvement of the performance of REPover BM25F is significantly larger after the activation of the JIT

Firefox

Page 59: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Needed Effort Evaluation Process Data Filtration Data Changes

Thesis Contribution #4:Highlight the impact of the “JIT” feature on theperformance evaluation of duplicate retrievalapproaches

[Submitted to EMSE] – under review

Page 60: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

1 2

3 4

Summary

Page 61: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Prior research focuses on improving the performance

AutomatedApproach

Performance

Recall Rate

Duplicate

The goal of newly proposed approach is to have higher performance than prior ones

Prior Research

2007: Runeson et al.

2008: Wang et al.

2010: Sun et al.

2010: Sureka et al.

2011: Sun et al.

2012: Zhou al.

2012: Nguyen al.

2015: Aggarwal al.

2016: Hindle al.2016: Jie et al.

Page 62: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

In my thesis, I revisit the experimental design choices from four perspectives

AutomatedApproach

Performance

Recall Rate

Duplicate

The goal of newly proposed approach is to have better performance than prior ones

A BRecall Rate = 0.6 Recall Rate = 0.5

>Example:

Prior Research

2007: Runeson et al.

2008: Wang et al.

2010: Sun et al.

2010: Sureka et al.

2011: Sun et al.

2012: Zhou al.

2012: Nguyen al.

2015: Aggarwal al.

2016: Hindle al.2016: Jie et al.

Revisiting the experimental design choices can help better understand the benefits and limitations of prior approaches evaluation.

Needed Effort

Data Filtration

Data Changes

Evaluation Process

Page 63: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Evaluation Process Data Filtration Data ChangesNeeded Effort

Thesis Contribution #1:Show the importance of considering the neededeffort for retrieving duplicate reports in theperformance measurement of automated duplicateretrieval approaches

[Published in EMSE 2015]

Page 64: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesData FiltrationNeeded Effort Evaluation Process

Thesis Contribution #2:Propose a realistic evaluation for the automatedretrieval of duplicate reports, and analyze priorfindings using such realistic evaluation

[Published in TSE 2017]

Page 65: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Data ChangesNeeded Effort Evaluation Process Data Filtration

Thesis Contribution #3:Propose a genetic algorithm based approach forfiltering out the old issue reports to improve theperformance of duplicate retrieval approaches

[Published in TSE 2017]

Page 66: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

Needed Effort Evaluation Process Data Filtration Data Changes

Thesis Contribution #4:Highlight the impact of the “JIT” feature on theperformance evaluation of duplicate retrievalapproaches

[EMSE 2017] – under review

Page 67: Revisiting the Experimental Design Choices for Approaches ...€¦ · Ph.D. : Mohamed Sami Rakha Supervisor: Prof. Ahmed E. Hassan Ph.D. Thesis Defense. Software issues are common.

1 2

3 4

Summary