IMPROVING PROGRAMS THROUGH SOURCE CODE TRANSFORMATIONS
Dissertation Proposal
Jonathan Dorn
April 19, 2016


Page 1: IMPROVINGPROGRAMS THROUGHSOURCECODE ...web.eecs.umich.edu/~weimerw/students/dorn-proposal...NonFunctional)Properties • “Howgood ”+instead+of+“what”[Paech2004]. • “More”+or+“less;”+“higher”+or+“lower.”+


Beyond Functional Correctness

• Software development involves tradeoffs.
• “Fast, good, or cheap. Pick any two.”
• Time vs. memory.
• Maintainability vs. efficiency.
• …


Non-Functional Properties

• “How good” instead of “what” [Paech 2004].
  • “More” or “less;” “higher” or “lower.”
• Difficult to reason about (e.g., security).
• Characterize implementations by how much of a property they possess.


Non-Functional Tradeoffs

[Figure: tradeoff plot; axes: Visual Error, Runtime.]


Quantifying Non-Functionality

• Different metrics for different properties.
• Image quality: RGB distance (e.g., L2), SSIM, EMD.
• Runtime: seconds, speedup/slowdown.
• Energy efficiency: joules, watts.
• Maintainability: bug fix time, defect density.
• Correctness: % error, accuracy, precision, PSNR.
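
Two of the metrics listed above can be illustrated with a short sketch. The image representation (a flat list of RGB tuples) and the function names `l2_distance` and `psnr` are assumptions made for this example, not part of the proposal.

```python
import math

def l2_distance(img_a, img_b):
    """Mean per-pixel Euclidean (L2) distance between two RGB images.

    Each image is a list of (r, g, b) tuples with channels in [0, 255].
    """
    if len(img_a) != len(img_b):
        raise ValueError("images must have the same number of pixels")
    total = 0.0
    for pa, pb in zip(img_a, img_b):
        total += math.sqrt(sum((ca - cb) ** 2 for ca, cb in zip(pa, pb)))
    return total / len(img_a)

def psnr(img_a, img_b, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    n = 3 * len(img_a)  # three channels per pixel
    mse = sum((ca - cb) ** 2
              for pa, pb in zip(img_a, img_b)
              for ca, cb in zip(pa, pb)) / n
    if mse == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / mse)
```

A lower L2 distance and a higher PSNR both indicate that an approximate image is closer to the reference.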


Local Changes

• Small changes can have large effects.
  • E.g., bubble_sort(a) → quick_sort(a).
• Option of fine-grained control.
• Program lines, statements, AST nodes.
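
The bubble_sort to quick_sort example can be expressed as a single AST node replacement. The sketch below uses Python's standard `ast` module purely for illustration; the class `ReplaceCall` and the helper `apply_local_change` are hypothetical names, and the proposal itself does not target Python.

```python
import ast

class ReplaceCall(ast.NodeTransformer):
    """Redirect calls to one function name so they target another."""

    def __init__(self, old_name, new_name):
        self.old_name = old_name
        self.new_name = new_name

    def visit_Call(self, node):
        self.generic_visit(node)  # rewrite nested calls first
        if isinstance(node.func, ast.Name) and node.func.id == self.old_name:
            new_func = ast.Name(id=self.new_name, ctx=ast.Load())
            node.func = ast.copy_location(new_func, node.func)
        return node

def apply_local_change(source, old_name, new_name):
    """Parse the source, apply the local change, and return new source text."""
    tree = ReplaceCall(old_name, new_name).visit(ast.parse(source))
    ast.fix_missing_locations(tree)
    return ast.unparse(tree)  # requires Python 3.9 or later
```

Only the call node changes; every other node in the tree is left as it was, which is what makes the transformation local.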


Proposal Thesis

By applying local software transformations, we can select better tradeoffs between non-functional properties of existing software artifacts.


The rest of this proposal

• Overview of the proposed research thrusts
  • Visual error and runtime performance
  • Energy usage
  • Coding style
• Proposed research timeline
• Conclusion


Computer Generated Imagery

• 11% of all tickets in 2015 went to computer animated movies.*
  • Does not include live movies with CGI.
• Video games topped $90B in 2015.**

* http://www.boxofficemojo.com/
** http://www.gamesindustry.biz/articles/2015-04-22-gaming-will-hit-usd91-5-billion-this-year-newzoo


Visual Error and Runtime Performance

[Figure: tradeoff plot with axes Visual Error and Runtime; labeled points: Original Program, Existing Technique, Desired Program.]


Hypothesis

• We can apply local changes to the abstract syntax tree of a graphics program to produce an approximation that is:
  • Visually faithful to the original and
  • Efficient to compute.
• Evaluate both image quality and runtime.


Simple Example

[Figure: abstract syntax tree with node labels return, (color), floor, -, +, p.x, and 0.5, built up over three slides; the final slide adds panels labeled Before and After.]


Approach

• Transformation: Replace node N with N’.
• Determine replacements offline (manual).
• Genetic search to select nodes to replace.
  • Use image quality as fitness function.
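
The search step above can be sketched as a small genetic algorithm over replacement decisions. This is a minimal illustration under stated assumptions, not the proposal's implementation: a variant is encoded as one boolean per AST node, `fitness` is a caller-supplied error score (lower is better), and the selection, crossover, and mutation operators are generic textbook choices.

```python
import random

def genetic_search(num_nodes, fitness, generations=30, pop_size=20, seed=0):
    """Search for a set of AST nodes to replace.

    variant[i] is True when node i is swapped for its precomputed
    replacement N'. `fitness` maps a variant to an error score.
    Assumes pop_size >= 3 so that tournaments can be sampled.
    """
    rng = random.Random(seed)
    population = [tuple(rng.random() < 0.5 for _ in range(num_nodes))
                  for _ in range(pop_size)]
    best = min(population, key=fitness)
    for _ in range(generations):
        next_gen = [best]  # elitism: keep the best variant seen so far
        while len(next_gen) < pop_size:
            # Tournament selection of two parents.
            p1 = min(rng.sample(population, 3), key=fitness)
            p2 = min(rng.sample(population, 3), key=fitness)
            # One-point crossover.
            cut = rng.randrange(1, num_nodes) if num_nodes > 1 else 0
            child = list(p1[:cut] + p2[cut:])
            # Mutation: flip one randomly chosen replacement decision.
            i = rng.randrange(num_nodes)
            child[i] = not child[i]
            next_gen.append(tuple(child))
        population = next_gen
        best = min(population, key=fitness)
    return best
```

In a real setting the fitness function would render the transformed program and compare its output to a reference image, which is far more expensive than the search bookkeeping shown here.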


Experimental Setup

• Benchmarks chosen from previous work.
• Record runtime and image quality.
• Three data points for each benchmark:
  1. Original program.
  2. Baseline “slower but less error” approach.
  3. Best transformed variant from our search.


Runtime Results

[Figure: chart of Normalized Runtime (axis 0 to 20) comparing Baseline and Our Approach.]


Image Quality Results

[Figure: chart of Normalized Error (axis 0 to 1) comparing Baseline and Our Approach.]


Summary

• We can apply local changes to produce programs that:
  • Are significantly faster than the baseline approach,
  • Have less error than the original program, and
  • Often have less error than the baseline.


Data Center Energy Use

[Figure: stacked chart of data center electricity use (billion kWh/year, axis 0 to 100) for 2000, 2005, and 2010, broken down into Infrastructure, Communications, Storage, High-end servers, Mid-range servers, and Volume servers. Labels 0.8%, 1.5%, and 2.0% give percentages of US electricity use in a given year.]

Reproduced from J. Koomey. Growth in data center electricity use 2005 to 2010. Analytics Press, Oakland, CA, 2011.


Genetic Optimization Algorithm

• Local changes to assembly code.
• Tradeoff between reduced energy and relaxed semantics.
  • Validated with test suite.

[Figure: chart of Energy Reduction (%) (axis 0 to 100) for blackscholes, bodytrack, ferret, fluidanimate, freqmine, swaptions, vips, x264, and the average.]


Scaling to Larger Programs

[Figure: plot of Energy Reduction (%) (axis 0 to 100) against Lines of Code (Assembly) (axis 0 to 1,000,000).]


Hypothesis

By directing the genetic search more effectively and reducing the search space, we can achieve larger energy optimizations faster.

Evaluate both magnitude of optimization and search time.


Intuition

• Optimizations on different paths through the program are likely to be independent.
  • Combine optimizations from separate searches.
• Optimizations on frequently executed paths are likely to have larger impact.
  • Profile the program to target hot paths.
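
Both intuitions can be made concrete with a small sketch. The representation is an assumption made for illustration: an edit set is a dict from an assembly line number to its replacement text, and a profile is a dict from line number to execution count. The names `combine_edits` and `hot_lines` are hypothetical.

```python
def combine_edits(edits_a, edits_b):
    """Merge two edit sets found by separate searches.

    Edits at distinct lines are treated as independent and kept together.
    Lines edited differently by both searches are reported as conflicts
    and left out of the merged result.
    """
    merged, conflicts = dict(edits_a), []
    for line, replacement in edits_b.items():
        if line in merged and merged[line] != replacement:
            conflicts.append(line)
            del merged[line]
        else:
            merged[line] = replacement
    return merged, sorted(conflicts)

def hot_lines(profile, fraction=0.9):
    """Most frequently executed lines that together cover `fraction` of all executions."""
    total = sum(profile.values())
    chosen, covered = [], 0
    for line, count in sorted(profile.items(), key=lambda kv: -kv[1]):
        if covered >= fraction * total:
            break
        chosen.append(line)
        covered += count
    return chosen
```

Treating edits at different lines as independent is only a heuristic; a combined variant would still need to be validated against the test suite and measured again.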


Algorithm Overview

[Figure: search loop diagram with stages INPUT, MUTATE, EVALUATE FITNESS, and OUTPUT, and labels Eject and Report.]


Optimizing Two Workloads

[Figure: two copies of the search loop diagram, each with stages INPUT, MUTATE, EVALUATE FITNESS, and OUTPUT, and labels Eject and Report.]


Option 1: Share Variants During Search

[Figure: two search loop diagrams, each with stages INPUT, MUTATE, EVALUATE FITNESS, and OUTPUT, and labels Eject and Report.]


Option 2: Combine Best After Search

[Figure: two search loop diagrams, each with stages INPUT, MUTATE, and EVALUATE FITNESS, labels Eject and Report, and its own OUTPUT.]


Experimental Setup

Benchmarks
• Collect HPC and data center benchmarks.
• Collect multiple workloads for each benchmark.

Baseline: GOA search
1. Only one workload.
2. All workloads in single fitness function.


Metrics for Energy Optimization

• Energy measured at the wall.
• Wall time before best variant.
  • Latest best variant if combining after search.
• Fitness evaluations before best variant.
• Success if searching separately produces larger energy reduction across all workloads.


Preliminary Results

[Figure: bar chart of Energy Improvement (axis 0% to 50%) for GOA, combined, set 1, and set 0.]


Programmer Time

• Programmer salaries in the U.S. exceed $100B.
• Programmers spend much more time reading code than writing it.

[Figure: chart of programmer time split among Writing, Modifying, and Understanding.]

Reproduced from P. Hallam. What do programmers really do anyway? (aka part 2 of the yardstick saga). http://blogs.msdn.com/b/peterhal/archive/2006/01/04/509302.aspx. Accessed: 2016-02-01.


Stylish Code

• Broad consensus for standardized coding style.
• Persistent disagreement on specifics.
  • E.g., tabs vs. spaces.
• “Every major open source project has its own style guide.” – Google’s style guide.


Beacons

for (i=0; i<n; i++) {
  for (j=0; j<n; j++) {
    …
    temp = a[i];
    a[i] = a[j];
    a[j] = temp;
    …
}}

[Callouts added to the code on later slides: “Possible sort implementation?” and “End of scope.”]

• Indicate likely structure or functionality.
• Semantic or syntactic.
• May vary
  • Between programmers,
  • And over time.


Classification of Coding Style

• Typographic and Structural [Oman 1988].
• Typographic: whitespace, line length, identifier length, layout.
• Structural: modularity, level of nesting, control and information flow.


Hypothesis

• We can apply local changes to the typographic elements of source code to
  • Match a programmer’s expected style and
  • Improve their understanding of the code.
• Evaluate time and accuracy on tests of understanding.


Modeling Typographic Style

• N-gram language model.
• Uses previous n-1 tokens to predict next token.
• Learn probabilities from existing code.
• NATURALIZE framework [Allamanis 2014].
• Can predict or suggest whitespace.
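
The n-gram idea above can be sketched in a few lines. This is a minimal maximum-likelihood model written for illustration; it is not the NATURALIZE implementation, and the class name `NGramModel`, the `<s>` padding token, and the choice to treat whitespace as ordinary tokens are assumptions.

```python
from collections import Counter, defaultdict

class NGramModel:
    """Maximum-likelihood n-gram model over a token sequence."""

    def __init__(self, n):
        self.n = n
        self.counts = defaultdict(Counter)  # context tuple -> next-token counts

    def train(self, tokens):
        # Pad the start so the first tokens also have a full context.
        padded = ["<s>"] * (self.n - 1) + list(tokens)
        for i in range(self.n - 1, len(padded)):
            context = tuple(padded[i - self.n + 1:i])
            self.counts[context][padded[i]] += 1

    def probability(self, context, token):
        """Estimated P(token | previous n-1 tokens); 0.0 for unseen contexts."""
        seen = self.counts.get(tuple(context))
        if not seen:
            return 0.0
        return seen[token] / sum(seen.values())

    def predict(self, context):
        """Most frequent next token for the context, or None if unseen."""
        seen = self.counts.get(tuple(context))
        return seen.most_common(1)[0][0] if seen else None
```

When whitespace and newlines are kept as tokens, the predicted next token after a context such as `)` can be a space or a line break, which is how a model of this kind can suggest formatting.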


Similarity of Typographic Style

• Measure similarity of N-gram models.
  • N-gram models are probability distributions.
• Measure similarity of style-checker rules.
  • Allamanis et al. generate rules from n-gram models.
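
Because n-gram models are probability distributions, any divergence between distributions can serve as a similarity measure. The slide does not name one; the sketch below uses Jensen-Shannon divergence as an assumed choice, with distributions given as dicts from n-grams to probabilities.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two distributions.

    `p` and `q` map outcomes (for example, n-grams) to probabilities.
    The result is 0 for identical distributions and 1 for distributions
    with no outcomes in common.
    """
    def kl(a, m):
        # Kullback-Leibler divergence of a from the mixture m.
        return sum(pa * math.log2(pa / m[x]) for x, pa in a.items() if pa > 0)

    support = set(p) | set(q)
    mix = {x: 0.5 * (p.get(x, 0.0) + q.get(x, 0.0)) for x in support}
    return 0.5 * kl(p, mix) + 0.5 * kl(q, mix)
```

Jensen-Shannon divergence is symmetric and stays finite when one model has never seen an n-gram that the other has, which plain Kullback-Leibler divergence does not.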


Experimental Setup

Benchmarks
1. Reformat the same code in different ways.
2. Collect similar code from different authors (e.g., textbook examples).

Participants
• Undergraduate student volunteers from upper level electives.
• Amazon Mechanical Turk workers who pass a screening test.


Human Study

1. Identify written style.
  • Participants write code to accomplish simple tasks.
  • E.g., check that a list is sorted.
2. Perform maintenance tasks.
  • Participants answer questions about code examples.
  • E.g., what is the value of x on line 5?


Metrics for Program Understanding

• Collect similarity between code participants wrote and the code samples.
• Collect time and accuracy in answering questions.
• Measure correlation between similarity and time and between similarity and accuracy.
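
The correlation step can be sketched with a plain Pearson coefficient. The slide does not say which correlation statistic is intended, so Pearson is an assumption here; a rank-based statistic would be a reasonable alternative.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length samples.

    Raises ZeroDivisionError if either sample has zero variance.
    """
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Here `xs` would hold the per-participant style similarity and `ys` either the time or the accuracy on the comprehension questions. A negative coefficient for time together with a positive one for accuracy would be consistent with the hypothesis.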


Research Timeline

[Figure: timeline from Jan '13 to Jan '18 with markers for Today and Graduation; the legend distinguishes Research Period and Publication Lag. Projects shown:]
• Energy Optimization (RT2)
• Coding Style of Tests (RT3)
• Coding Style in Guava (RT3)
• Shader Acceptability (RT1)
• Energy Optimization at Scale (RT2)
• Coding Style Acceptability (RT3)


Conclusion

Enable better tradeoffs between non-functional properties through local software transformations.
1. Visual error and runtime performance.
2. Energy usage.
3. Coding style.


BACKUP  



Does Coding Style Matter?

IF A>B THEN S := 1;
IF (A=B) AND (C>D) THEN S := 2;
IF (A=B) AND (C<=D) THEN S := 3;
IF (A<B) AND (C>D) THEN S := 4;
IF (A<B) AND (C=D) THEN S := 5;
IF (A<B) AND (C<D) THEN S := 6;

IF A > B THEN S := 1
ELSE IF A = B THEN IF C > D THEN S := 2 ELSE S := 3
ELSE IF C > D THEN S := 4
ELSE IF C = D THEN S := 5
ELSE S := 6;

Avg. Score: 4.90  Avg. Time: 1.93
Avg. Score: 3.36  Avg. Time: 2.87

Reproduced from P. W. Oman and C. R. Cook. A paradigm for programming style research. ACM Sigplan Notices, 23(12):69–78, 1988.