Copyright by Xinyu Wang 2019 xwangsd/pubs/WANG-DISSERTATION-2019.pdf to talk. I truly feel extremely

download Copyright by Xinyu Wang 2019 xwangsd/pubs/WANG-DISSERTATION-2019.pdf to talk. I truly feel extremely

of 142

  • date post

    11-Aug-2020
  • Category

    Documents

  • view

    1
  • download

    0

Embed Size (px)

Transcript of Copyright by Xinyu Wang 2019 xwangsd/pubs/WANG-DISSERTATION-2019.pdf to talk. I truly feel extremely

  • Copyright

    by

    Xinyu Wang

    2019

  • The Dissertation Committee for Xinyu Wang certifies that this is the approved version of the following dissertation:

    An Efficient Programming-by-Example Framework

    Committee:

    Isil Dillig, Supervisor

    Gregory Durrett

    Keshav Pingali

    Ranjit Jhala

    Mayur Naik

  • An Efficient Programming-by-Example Framework

    by

    Xinyu Wang

    DISSERTATION

    Presented to the Faculty of the Graduate School of

    The University of Texas at Austin

    in Partial Fulfillment

    of the Requirements

    for the Degree of

    DOCTOR OF PHILOSOPHY

    THE UNIVERSITY OF TEXAS AT AUSTIN

    August 2019

  • Acknowledgments

    I will always be in the debt of my advisor, Isil Dillig, for the support

    and guidance I have received over the past six years. Isil introduced me to

    the field of programming languages and taught me how to conduct scientific

    research hand in hand. She taught me how to crystallize a mess of ideas into a

    simple and precise solution and everything else to become a good researcher.

    Besides teaching me how to do good research, Isil also showed me how to be

    a supportive and caring advisor. She always has time for me whenever I need

    to talk. I truly feel extremely lucky to be her student and have the privilege

    to work with her during the past few years.

    I also want to thank Rishabh Singh for helping me throughout my PhD.

    Rishabh was my mentor during an internship at Microsoft Research in 2015,

    where I did my first project on program synthesis with him. I fell in love with

    Rishabh’s passion about research the first time I met him. He is always ready

    to listen to me and brainstorm ideas with me no matter how crazy they are.

    Thank you Rishabh for always being so encouraging.

    I am also extremely grateful to Sumit Gulwani for his guidance during

    my PhD career. The first paper on program synthesis that I read thoroughly is

    his FlashFill paper, and I was extremely privileged to have him as my other

    mentor during my MSR internship. This dissertation is largely influenced by

    iv

  • those ideas in FlashFill, and I hope I was able to advance the state-of-the-

    art on top of that. Thank you Sumit for encouraging me to always focus on

    doing great work and pursue what I truly feel happy about.

    I would like to thank Mayur Naik and Ranjit Jhala for being on my

    dissertation committee and supporting me during my job search. I would also

    like to thank Keshav Pingali and Greg Durrett for being on my dissertation

    committee as well. I have recently started collaborating with Greg and it has

    been quite a refreshing experience. I have learned immensely from him.

    I want to thank Yu Feng for teaching me the basics of program analysis

    and collaborating with me in a couple of projects during our first couple of

    years at UT. I will never forget the days and nights that we have been working

    together as well as what Austin looks like at 4 am in the morning. Thank you

    my dear friend. I wish you all the best and success one can hope for.

    I was fortunate to get to collaborate with many great researchers: Greg

    Anderson, Osbert Bastani, Jocelyn Chen, Isil Dillig, Thomas Dillig, Greg Dur-

    rett, Yu Feng, Sumit Gulwani, Calvin Lin, Ken McMillan, Hovav Shacham,

    Rishabh Singh, Shankara Pailoor, Yuepeng Wang, Navid Yaghmazadeh, and

    Xi Ye. I learned a lot about how to conduct good research and improve myself

    as a researcher and presenter while working with them. Thank y’all!

    My colleagues in the UToPiA group made this long PhD journey much

    shorter and much more enjoyable, and I wholeheartedly thank them for that.

    Our UToPiA family started with Yu and me (and also Isil, of course). Then,

    v

  • our hacker Oswaldo and awesome Navid joined. After that, we had Yuepeng,

    Kostas, Ruben, Valentin, and Jacob join the party. I will always miss the good

    old days that we play board games at Isil’s house (and the pool parties). Now,

    our UToPiA family has many more members: Jia, Greg, Jiayi, Jocelyn, Jon,

    Rong, and Shankara. I really enjoy the past several years with all of you, and

    I will miss every one of you in the future. Ciao Utopians!

    Besides Utopians, I also want to thank my many other friends at UT

    who made my everyday life full of joy: Bo, Chunzhi, Hangchen, Jian, Jianyu,

    Wenguang, Ye, Yuanzhong, Zhiting, and many others.

    Finally, I would like to thank my parents who support me all the way

    until I finish my PhD. I dedicate this dissertation to them.

    vi

  • An Efficient Programming-by-Example Framework

    Publication No.

    Xinyu Wang, Ph.D.

    The University of Texas at Austin, 2019

    Supervisor: Isil Dillig

    Due to the ubiquity of computing, programming has started to become

    an essential skill for an increasing number of people, including data scientists,

    financial analysts, and spreadsheet users. While it is well known that building

    any complex and reliable software is difficult, writing even simple scripts is

    challenging for novices with no formal programming background. Therefore,

    there is an increasing need for technology that can provide basic programming

    support to non-expert computer end-users.

    Program synthesis, as a technique for generating programs from high-

    level specifications such as input-output examples, has been used to automate

    many real-world programming tasks in a number of application domains such

    as spreadsheet programming and data science. However, developing special-

    ized synthesizers for these application domains is notoriously hard.

    This dissertation aims to make the development of program synthesizers

    easier so that we can expand the applicability of program synthesis to more

    vii

  • application domains. In particular, this dissertation describes a programming-

    by-example framework that is both generic and efficient. This framework can

    be applied broadly to automating tasks across different application domains.

    It is also efficient and achieves orders of magnitude improvement in terms of

    the synthesis speed compared to existing state-of-the-art techniques.

    viii

  • Table of Contents

    Acknowledgments iv

    Abstract vii

    List of Figures xi

    Chapter 1. Introduction 1

    Chapter 2. Program Synthesis using Finite Tree Automata 9

    2.1 Background on Finite Tree Automata . . . . . . . . . . . . . . 9

    2.2 Program Synthesis using Finite Tree Automata . . . . . . . . . 10

    2.3 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . 16

    2.4 Application . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17

    2.5 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26

    Chapter 3. Improving Efficiency using Abstract Interpretation 35

    3.1 Program Synthesis using Abstract Finite Tree Automata . . . 36

    3.1.1 Abstractions . . . . . . . . . . . . . . . . . . . . . . . . 36

    3.1.2 Abstract Finite Tree Automata . . . . . . . . . . . . . . 39

    3.2 Program Synthesis using Abstraction Refinement . . . . . . . . 43

    3.2.1 Algorithm Architecture . . . . . . . . . . . . . . . . . . 44

    3.2.2 Constructing Incorrectness Proofs . . . . . . . . . . . . 48

    3.3 A Working Example . . . . . . . . . . . . . . . . . . . . . . . . 56

    3.4 Implementation . . . . . . . . . . . . . . . . . . . . . . . . . . 59

    3.5 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59

    3.5.1 String Processing . . . . . . . . . . . . . . . . . . . . . . 60

    3.5.2 Tensor Reshaping . . . . . . . . . . . . . . . . . . . . . 62

    3.6 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64

    3.6.1 String Processing . . . . . . . . . . . . . . . . . . . . . . 65

    ix

  • 3.6.2 Tensor Reshaping . . . . . . . . . . . . . . . . . . . . . 71

    3.6.3 Discussion . . . . . . . . . . . . . . . . . . . . . . . . . 75

    Chapter 4. Learning Abstractions for Program Synthesis 76

    4.1 Overview . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77

    4.2 An Illustrative Example . . . . . . . . . . . . . . . . . . . . . . 79

    4.3 Overall Abstraction Learning Algorithm . . . . . . . . . . . . . 81

    4.4 Synthesis of Predicate Templates . . . . . . . . . . . . . . . . . 82

    4.5 Synthesis of Abstract Transformers . . . . . . . . . . . . . . . 87

    4.5.1 Example Generation . . . . . . . . . . . . . . . . . . . . 92

    4.6 Evaluation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95

    4.6.1 Abstraction Learning . . . . . . . . . . . . . . . . . . . 96

    4.6.2 Evaluating the Usefulness of Learned Abstractions . . . 98

    Chapter 5. Related Work 100

    Chapter 6. Conclusion 107

    Appendix 108

    Bibliography 117

    Vita 130

    x

  • List of Figures

    1.1 Workflow of our synthesis framework. . . . . . . . . . . . . . . 4

    2.1 A finite tree automaton example. . . . . . . . . . . . . . . . . 11

    2.2 CFTA construction rules. . . . . . . . . . . . . . . . . . . . . . 13

    2.3 A CFT