VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

A Longitudinal Study of Programmers’ Backtracking YoungSeok Yoon ([email protected]) Institute for Software Research Carnegie Mellon University Brad Myers ([email protected]) Human-Computer Interaction Institute Carnegie Mellon University


Page 1: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

A Longitudinal Study of Programmers’ Backtracking

YoungSeok Yoon ([email protected])

Institute for Software Research Carnegie Mellon University

Brad Myers ([email protected])

Human-Computer Interaction Institute Carnegie Mellon University

Page 2: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Background

VL/HCC 2014 2

Page 3: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

What is Backtracking?

•  Reverting code fragments to an earlier state

•  Examples
   –  Reverting a parameter to a previously used value
   –  Removing debugging statements after fixing a bug
   –  Restoring some deleted code
   –  …

Page 4: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Previous Studies of Backtracking

•  Two qualitative studies of backtracking [Yoon+, CHASE’12]

1.  Preliminary lab study (12 programmers)

2.  Online survey (48 respondents)


Page 5: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Previous Studies of Backtracking

•  Observation

–  Programmers face challenges when backtracking
   •  locating the right code to be backtracked
   •  restoring some deleted code correctly
   •  reverting inter-related code fragments together

–  Programmers backtrack relatively often (75% answered at least “sometimes”)

Page 6: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Limitations of the Previous Studies

•  Lab study tasks required participants to backtrack

•  Survey results may not correctly reflect the reality (e.g., programmers might backtrack unconsciously)

•  The analyses were mostly qualitative


Page 7: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

A Longitudinal Study of Backtracking

As a follow-up:


Page 8: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Longitudinal Study of Backtracking

•  Two main goals

– Obtain backtracking statistics in order to quantify the need for backtracking tools

–  Identify backtracking situations that are not very well supported by existing programming tools


Page 9: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Data Collection – Fluorite Logger http://www.cs.cmu.edu/~fluorite/

•  Eclipse logger for fine-grained code editing data [Yoon+, PLATEAU’11]

•  Information collected:
   –  Initial snapshot of each source file
   –  All edit operations (insert, delete, or replace)
   –  Timestamps, executed editor commands, etc.

•  Distributed to programmers since April 2012
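
An initial snapshot plus an ordered list of edit operations is enough to rebuild every intermediate version of a file. A minimal Python sketch of that idea follows. The `Edit` record and its field names are hypothetical and are not Fluorite's actual log schema; the timestamp unit is also an assumption.

```python
from dataclasses import dataclass


@dataclass
class Edit:
    """One logged edit. Field names are illustrative, not Fluorite's schema."""
    timestamp: int     # assumed: milliseconds since logging started
    offset: int        # character offset where the edit begins
    removed_len: int   # characters deleted at that offset (0 for an insert)
    inserted: str      # text inserted at that offset ("" for a delete)


def apply_edit(text: str, edit: Edit) -> str:
    """Apply a single insert, delete, or replace to the file contents."""
    end = edit.offset + edit.removed_len
    return text[:edit.offset] + edit.inserted + text[end:]


def replay(snapshot: str, edits: list[Edit]) -> list[str]:
    """Return every version of the file, starting with the snapshot."""
    versions = [snapshot]
    for edit in sorted(edits, key=lambda e: e.timestamp):
        versions.append(apply_edit(versions[-1], edit))
    return versions


snapshot = "int value = rect.getHeight();"
edits = [
    Edit(timestamp=1000, offset=20, removed_len=6, inserted="Width"),   # Height -> Width
    Edit(timestamp=2000, offset=20, removed_len=5, inserted="Height"),  # Width -> Height
]
history = replay(snapshot, edits)
```

Here `history` holds three versions, and the last one equals the snapshot again, which is the kind of return to an earlier state that the study looks for.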


[Image Src: Attribution: Rob Lavinsky, iRocks.com - CC-BY-SA-3.0]

Page 10: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Study Participants

Group Description                                 No. of Participants   Coding Time (hours) [min / avg / max / sum]
The first author (myself)                          1                    294 / 294 / 294 / 294
Graduate students @ CMU                           13                      3 /  40 / 216 / 520
Research programmers / System scientists @ CMU     5                      6 / 118 / 446 / 588
Graduate students @ UPitt                          2                      6 /  29 /  51 /  57
Total                                             21 people             1,460 hours

Page 11: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Analysis Process

•  The data was too big for manual inspection
   –  1,345,241 coding events in the logs

•  Key idea of the automated analysis
   –  Keep the evolution history of individual AST nodes of interest throughout the lifetime of the nodes
   –  Detect backtracking instances within each node

Page 12: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Analysis Process Illustrated

[Example Source Code Being Processed]

    package example;

    public class Example {
        public void printRectangleInfo() {
            Rectangle rect = getEnclosingRect();      // S1
            int value = rect.getHeight();             // S2
            System.out.println("Value:" + value);     // S3
        }

        public Rectangle getEnclosingRect() {
            // return some rectangle here
            // actual code omitted
            // ...
        }
    }

[Memory of the Analyzer]

Change history of S1
    [v1] Rectangle rect = getEnclosingRect();

Change history of S2
    [v1] int value = rect.getHeight();
    [v2] int value = rect.getWidth();
    [v3] int value = rect.getSize();
    [v4] int value = rect.getHeight();

Change history of S3
    [v1] System.out.println(value);
    [v2] System.out.println("Value:" + value);

Backtracking Detected! (S2: v4 is identical to v1)
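
The analyzer memory illustrated above can be sketched as a per-node version list that is checked whenever a new version arrives. This is a simplified illustration, not the study's implementation: the class name `NodeHistories` is hypothetical, nodes are identified by the slide's labels (S1, S2, S3), and the matching of AST nodes across edits is assumed to happen elsewhere.

```python
from collections import defaultdict


class NodeHistories:
    """Remembers each tracked node's versions and flags a return to an earlier one."""

    def __init__(self):
        self.versions = defaultdict(list)   # node id -> list of source strings

    def record(self, node_id, source):
        """Store a new version of a node.

        Returns the 1-based number of the nearest earlier version with the
        same text, or None if this text is new for the node.
        """
        history = self.versions[node_id]
        if history and history[-1] == source:
            return None                      # nothing changed, nothing to store
        match = None
        for index in range(len(history) - 1, -1, -1):   # nearest match first
            if history[index] == source:
                match = index + 1
                break
        history.append(source)
        return match


analyzer = NodeHistories()
analyzer.record("S1", "Rectangle rect = getEnclosingRect();")
s2_results = [
    analyzer.record("S2", f"int value = rect.{name}();")
    for name in ("getHeight", "getWidth", "getSize", "getHeight")
]
s3_results = [
    analyzer.record("S3", "System.out.println(value);"),
    analyzer.record("S3", 'System.out.println("Value:" + value);'),
]
```

Only the fourth version of S2 reports a match (with v1), mirroring the "Backtracking Detected!" marker on the slide; S1 and S3 never return to an earlier state.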

Page 13: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Backtracking Instance

Version   State   Code
v1        A       getHeight();
v2        B       getWidth();
v3        C       getSize();
v4        A       getHeight();
v5        B       getWidth();
v6        A       getHeight();

(versions are ordered by time)

Three Backtracking Instances:
•  v1..v4
•  v2..v5
•  v4..v6

NOTE: v1..v6 is NOT a backtracking instance
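
The slide does not spell out the pairing rule. One reading that is consistent with its example is that each version is paired with the next later version that has the same content, so v1 pairs with v4 rather than v6. The function below sketches that reading; the name `backtracking_instances` is hypothetical.

```python
def backtracking_instances(states):
    """Pair each version with the next later version that has the same content.

    `states` is the time-ordered list of a node's contents.
    Returns 1-based (start, end) version numbers.
    """
    last_seen = {}    # content -> version number of its most recent occurrence
    instances = []
    for number, content in enumerate(states, start=1):
        previous = last_seen.get(content)
        # Skip identical neighbours: no change happened between them.
        if previous is not None and previous != number - 1:
            instances.append((previous, number))
        last_seen[content] = number
    return instances


instances = backtracking_instances(["A", "B", "C", "A", "B", "A"])
```

For the slide's sequence this yields (1, 4), (2, 5) and (4, 6), and it never produces (1, 6), because v4 already brought the node back to state A.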


Page 14: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Research Questions

1.  How frequently do programmers backtrack in reality?

2.  How large are the backtrackings?

3.  How exactly do programmers perform backtracking? Are they backtracking manually?

4.  Is there evidence of exploratory programming?

5.  Are there backtrackings performed across multiple editing sessions?

6.  Are there selective backtrackings, which cannot be performed by the undo command?

7.  Do programmers backtrack to the same code repeatedly?


Page 15: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

1. Frequency of Backtracking “How frequently do programmers backtrack in reality?”

•  A total of 15,095 backtracking instances detected

•  10.3 instances/hour on average

[Figure: Backtracking Instances per Hour for each participant (P0 to P20), axis from 0 to 30; 3.8 (min), 28.4 (max), Average: 10.3/h]

Rate varied across participants (min=3.8/h, max=28.4/h), but all of them backtracked frequently
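
As a quick check, the reported average is consistent with dividing the total number of instances by the total coding time from the participants table. The slide does not say whether the average was pooled this way or averaged per participant, so the pooled calculation below is an assumption.

```python
total_instances = 15_095   # backtracking instances detected (from the slide)
total_hours = 1_460        # total coding time (from the participants table)

# Pooled rate: all instances divided by all hours.
pooled_rate = total_instances / total_hours
print(f"{pooled_rate:.1f} instances/hour")   # prints "10.3 instances/hour"
```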

Page 16: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

2. Size of Backtracking “How large are the backtrackings?”

•  How did we define the size of a backtracking?
   –  Measured the edit distance (Levenshtein distance) between the original version and the other versions
   –  Took the maximum value as the size of backtracking instance

[Figure: versions v1 to v6 over time with states A B C D E A. v1 is the original version; forward changes lead away from it to the farthest version (max edit distance), and backward changes lead back to A at v6.]
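
The size definition can be sketched directly: compute the Levenshtein distance from the original version to every other version in the instance and take the maximum. The helper names below are hypothetical, and the distance function is a textbook dynamic-programming version rather than the study's code.

```python
def levenshtein(a: str, b: str) -> int:
    """Classic edit distance: insertions, deletions and substitutions cost 1."""
    previous = list(range(len(b) + 1))
    for i, ch_a in enumerate(a, start=1):
        current = [i]
        for j, ch_b in enumerate(b, start=1):
            cost = 0 if ch_a == ch_b else 1
            current.append(min(
                previous[j] + 1,          # delete ch_a
                current[j - 1] + 1,       # insert ch_b
                previous[j - 1] + cost,   # substitute (or keep if equal)
            ))
        previous = current
    return previous[-1]


def backtracking_size(versions):
    """versions[0] is the original; the last version returns to it."""
    original = versions[0]
    return max(levenshtein(original, other) for other in versions[1:])


versions = ["getHeight();", "getWidth();", "getSize();", "getHeight();"]
size = backtracking_size(versions)
```

Taking the maximum rather than the sum means the size reflects how far the code drifted from the original, not how many edits it took to get there.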


Page 17: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

2. Size of Backtracking “How large are the backtrackings?”

[Figure: Number of Backtracking Instances by Backtracking Size (No. of Characters)]

Backtracking Size (No. of Characters)   Number of Backtracking Instances
1                                       1304
2-9                                     3752
10-49                                   5269
50-99                                   2026
100-499                                 2259
500-999                                  265
≥1000                                    220
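
The bar values can be cross-checked against the total reported earlier in the talk (15,095 instances). The sketch below only does arithmetic on the numbers from the chart; the share of instances under 50 characters is a derived figure, not one stated on the slides.

```python
# Bar heights from the chart, keyed by size range in characters.
buckets = {
    "1": 1304,
    "2-9": 3752,
    "10-49": 5269,
    "50-99": 2026,
    "100-499": 2259,
    "500-999": 265,
    ">=1000": 220,
}

total = sum(buckets.values())

# Share of all instances that fall in each size range.
shares = {label: count / total for label, count in buckets.items()}

# Instances smaller than 50 characters (first three ranges).
under_50 = buckets["1"] + buckets["2-9"] + buckets["10-49"]
under_50_share = under_50 / total
```

The buckets sum to 15,095, matching the reported total, and roughly two thirds of the instances are smaller than 50 characters.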

Page 18: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

2. Size of Backtracking “How large are the backtrackings?”

•  Method / variable names

•  String literals

•  Number literals


Page 19: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

2. Size of Backtracking “How large are the backtrackings?”

•  Simple parameter changes

•  Reverting renaming changes on methods or variables


Page 20: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

2. Size of Backtracking “How large are the backtrackings?”

•  Single statement changes

•  Surrounding existing code (e.g., try-catch) then reverting


Page 21: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

2. Size of Backtracking “How large are the backtrackings?”

•  Adding, removing, or modifying multiple statements and then reverting them altogether


Page 22: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

2. Size of Backtracking “How large are the backtrackings?”

•  Significant algorithmic changes

•  Adding / removing / modifying multiple methods and then reverting


Page 23: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

2. Size of Backtracking “How large are the backtrackings?”

Programmers backtrack at varying granularities, from simple name changes to significant algorithmic changes

Page 24: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

3. Backtracking Tactics “How exactly do programmers perform backtracking?”

How were the backtrackings performed?

[Pie chart: Manually 38%, Using Existing Tools 49%, Others 13%]

Page 25: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

3. Backtracking Tactics “How exactly do programmers perform backtracking?”

How were the backtrackings performed?

[Pie chart: Manually 38%, Using Existing Tools 49%, Others 13%]

Using Existing Tools (49%):
•  Undo (37%)
•  Paste (6%)
•  Redo (3%)
•  Content Assist (2%)
•  Toggle Comment (1%)

Page 26: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

3. Backtracking Tactics “How exactly do programmers perform backtracking?”

How were the backtrackings performed?

[Pie chart: Manually 38%, Using Existing Tools 49%, Others 13%]

Others (13%):
•  Unidentified (9%)
•  Multiple (4%)

Page 27: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

3. Backtracking Tactics “How exactly do programmers perform backtracking?”

How were the backtrackings performed?

[Pie chart: Manually 38%, Using Existing Tools 49%, Others 13%]

Manually (38%):
•  Manual Deletion (25%)
•  Manual Typing (13%)

38% of the backtracking instances were NOT supported by existing tools, indicating programmers need better backtracking tools

Page 28: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

4. Cross-Run Backtracking “Is there evidence of exploratory programming?”

•  Make some changes → run the application → revert the code back to the way it was before

•  20.4% of all instances were cross-run instances on average.

[Figure: Cross-Run Backtracking Percentage for each participant (P0 to P20), axis from 0% to 50%; Average: 20.4%]

This provides support that programmers do this kind of exploratory programming.
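
The change, run, revert pattern described above suggests a simple check: an instance is cross-run if the application was launched at some point between the start and the end of the instance. The sketch below assumes each instance is a (start, end) pair of timestamps and that run events are logged as timestamps; both the representation and the sample values are hypothetical.

```python
def is_cross_run(instance_start, instance_end, run_times):
    """True if the application was run strictly between the two timestamps."""
    return any(instance_start < t < instance_end for t in run_times)


# Hypothetical data: (start, end) timestamps of three backtracking instances
# and the times at which the application was launched.
instances = [(10, 40), (50, 55), (60, 90)]
run_times = [30, 70, 95]

cross_run = [inst for inst in instances if is_cross_run(*inst, run_times)]
share = len(cross_run) / len(instances)
```

In this made-up data, two of the three instances contain a run, so they would count toward the cross-run percentage.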

Page 29: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

5. Cross-Session Backtracking “Are there backtrackings performed across multiple editing sessions?”

[Figure: Cumulative Percentage of All BIs (backtracking instances) by Editing Session Distance]

Editing Session Distance   Cumulative Percentage of All BIs
Same Session               96.7%
≤1                         98.2%
≤2                         98.8%
≤3                         99.0%
≤4                         99.2%
≤5                         99.3%

A backtracking tool would work for 97% of the cases with only the history within the same editing session.

Page 30: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

6. Selective Backtracking “Are there backtrackings that could not have been done by regular undo?”

•  Selective backtracking?
   –  There are edits in the middle of a backtracking that change other parts of the same file and that are not backtracked together

[Figure: Selective Backtracking Percentage for each participant (P0 to P20), axis from 0% to 20%; Average: 9.5%]
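
A linear undo would also roll back every edit made in between, so an instance whose intermediate edits to other code survive cannot have been done with undo alone. One way to sketch the check is to compare the rest of the file at the start and at the end of the instance. The function name and the dictionary representation (node id to source text) are assumptions for illustration.

```python
def is_selective(target, before, after):
    """True if any node other than `target` differs between the two states.

    `before` and `after` map node ids to source text at the moments the
    backtracking instance starts and ends. The target node has the same
    text in both, since that is what makes it a backtracking instance.
    """
    other_nodes = (before.keys() | after.keys()) - {target}
    return any(before.get(node) != after.get(node) for node in other_nodes)


start = {
    "S2": "int value = rect.getHeight();",
    "S3": "System.out.println(value);",
}
# S2 went back to its original text, but the change to S3 was kept.
kept = {
    "S2": "int value = rect.getHeight();",
    "S3": 'System.out.println("Value:" + value);',
}
# Everything is back to where it started.
reverted = dict(start)

selective = is_selective("S2", start, kept)
plain = is_selective("S2", start, reverted)
```

Here `selective` is true because S3 changed and stayed changed while S2 was backtracked, and `plain` is false because nothing else differs.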

Page 31: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

6. Selective Backtracking “Are there backtrackings that could not have been done by regular undo?”

On average, 9.5% of all backtracking instances were selective, supporting that programmers need better selective backtracking tools

Page 32: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

7. Repeat Count “Do programmers backtrack to the same code repeatedly?”

[Figure: Percentage of Backtracked Nodes by Repeat Count]

Repeat Count   Percentage of Backtracked Nodes
1              85.0%
2              11.1%
3               2.7%
4               0.7%
≥5              0.6%

Most (85%) of the time, programmers backtrack once and then never get back to the same state after diverging from it

Page 33: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Wrapping Up


Page 34: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Limitations of the Analysis

•  Only exact and successful backtracking instances were detected

•  Only for Java / Eclipse

•  Could not determine the semantic relationships among the backtracking instances


Page 35: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Main Takeaways

•  Programmers backtrack quite frequently (10.3/hr)

•  38% of the backtrackings are done purely manually

•  9.5% of the backtrackings are selective, meaning that they are not supported by conventional undo

•  Programmers would benefit from better backtracking tools!


Page 36: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Azurite – Selective Undo Tool http://www.cs.cmu.edu/~azurite/

•  A selective undo plug-in for Eclipse IDE
   –  can handle the 9.5% of selective backtrackings

•  Presented at VL/HCC
   –  Initial User Interfaces of the Tool: Yoon, Myers, & Koo, “Visualization of Fine-Grained Code Change History”, Full Paper at VL/HCC’13
   –  Tool Demonstration (yesterday): Yoon & Myers, “A Demonstration of Azurite: Backtracking Tool for Programmers”, Showpiece at VL/HCC’14

[Image Src: Attribution: cobalt, flickr.com - CC-BY-SA-2.0 ]

Page 37: VL/HCC 2014 - A Longitudinal Study of Programmers' Backtracking

Thank You!

•  FLUORITE: A logging plug-in for Eclipse (Full of Low-level User Operations Recorded In The Editor) available at: http://www.cs.cmu.edu/~fluorite/

•  AZURITE: A selective undo plug-in for Eclipse (Adding Zest to Undoing and Restoring Improves Textual Exploration) available at: http://www.cs.cmu.edu/~azurite/

•  Thanks for funding from:
