How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40...
How Can We Write Large Programs
Without Thinking?
Percy Liang
Neural Abstract Machines & Program Induction Workshop
Dec. 10, 2016
Linux: 15 million lines of code
Windows 7: 40 million lines of code
Google: 2 billion lines of code
Where we are...
End-user programming
Concatenate(ToLower(Substring(v, WordToken, 1)), " ",
ToLower(Substring(v, WordToken, 2)))
[Harris & Gulwani, 2011]
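The expression above can be read as a small string-transformation program. Below is a minimal Python sketch of one possible reading, assuming that `Substring(v, WordToken, k)` selects the k-th word of `v` (1-indexed). The helper names mirror the slide, but their implementations are hypothetical and are not taken from Harris & Gulwani.

```python
import re

def substring_word(v: str, k: int) -> str:
    """Assumed reading of Substring(v, WordToken, k): the k-th word of v (1-indexed)."""
    words = re.findall(r"[A-Za-z]+", v)
    return words[k - 1]

def to_lower(s: str) -> str:
    return s.lower()

def concatenate(*parts: str) -> str:
    return "".join(parts)

def program(v: str) -> str:
    # Concatenate(ToLower(Substring(v, WordToken, 1)), " ", ToLower(Substring(v, WordToken, 2)))
    return concatenate(
        to_lower(substring_word(v, 1)),
        " ",
        to_lower(substring_word(v, 2)),
    )

print(program("Alan TURING"))  # alan turing
```

In end-user programming the user supplies only input/output pairs such as `"Alan TURING" -> "alan turing"`, and the system searches for a program of this shape.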
Sketching
Set up the skeleton, only fill in ??
[Solar-Lezama, 2008]
Stochastic superoptimization (STOKE)
Montgomery multiplication:
STOKE code 16 lines shorter, 1.6x faster than gcc
[Schkufza/Sharma/Aiken, 2013]
How do we scale up?
Modularity (model)
Natural language specifications
Modularity (search)
Final remarks
A property of programs
def min(x, y): return x if x < y else y
def max(x, y): return x if x > y else y
Programs share common subprograms
Problems:
• Part that changes embedded deeply in program
• Variables make it hard to extract subprograms
[Liang et al., 2010, ICML]
A new representation of programs
• Combinators B, C, S, I encode how arguments get routed down
• Now subproblems are subtrees
• Extension of classic combinatory logic (Schönfinkel, 1924)
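To make the routing idea concrete, here is a minimal Python sketch using the classic curried combinators only; it does not reproduce the extended representation from Liang et al. The primitives `ite`, `lt`, `gt` and the helper `select` are hypothetical names introduced for illustration. The point is that once variables are eliminated, min and max become the same tree except for one leaf.

```python
# Classic curried combinators (standard definitions).
I = lambda x: x
B = lambda f: lambda g: lambda x: f(g(x))      # route x to the right subtree
C = lambda f: lambda g: lambda x: f(x)(g)      # route x to the left subtree
S = lambda f: lambda g: lambda x: f(x)(g(x))   # route x to both subtrees

# Curried primitives (hypothetical names for this sketch).
ite = lambda c: lambda a: lambda b: a if c else b
lt = lambda x: lambda y: x < y
gt = lambda x: lambda y: x > y

def select(cmp):
    """Variable-free form of: lambda x: lambda y: ite(cmp(x)(y))(x)(y)."""
    return C(B(S)(S(B(C)(B(B(ite))(cmp)))(I)))(I)

# min and max share the whole tree; only the comparison leaf differs.
min_c = select(lt)
max_c = select(gt)

print(min_c(3)(5), max_c(3)(5))  # 3 5
```

Because the shared part is now a single subtree (`select` with a hole for the comparison), it can be extracted and reused without any variable renaming.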
Representation of min and max
Adaptor grammars [Johnson, 2007]
Inference: MCMC
Experimental results
24 text editing tasks [Lau, 2003]
Cardinals 5, Pirates 2
⇓
GameScore[winner 'Cardinals'; loser 'Pirates'; scores [5, 2]]
Summary so far
• Multi-task learning: if we induce one program, it should be easier to induce a related one
• Need to expose shared subprograms
• Use adaptor grammar over combinatory logic
• Cache becomes a library of useful primitives
Modularity (model)
Natural language specifications
Modularity (search)
Final remarks
[3, 5, 2] ⇒ 38
[1, 1, 6, 5, 3, 9] ⇒ 34
[2, 2, 4, 3, 7] ⇒ 66
sum of squares of primes in the list
People code from partial specifications, not from examples!
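Once the one-line specification "sum of squares of primes in the list" is given, the program is short. A minimal Python sketch follows; the function names are illustrative, not from the talk.

```python
def is_prime(n: int) -> bool:
    """Trial division; sufficient for the small integers in the examples."""
    if n < 2:
        return False
    return all(n % d != 0 for d in range(2, int(n ** 0.5) + 1))

def sum_of_squares_of_primes(xs):
    return sum(x * x for x in xs if is_prime(x))

# The three input/output pairs from the slide.
print(sum_of_squares_of_primes([3, 5, 2]))           # 38
print(sum_of_squares_of_primes([1, 1, 6, 5, 3, 9]))  # 34
print(sum_of_squares_of_primes([2, 2, 4, 3, 7]))     # 66
```

Inferring this function from the three pairs alone is hard because many unrelated programs fit them; the natural language description narrows the search far more than additional examples would.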
Language to programs
What is the largest city in Europe by population?
semantic parsing
argmax(Cities ∩ ContainedBy(Europe), Population)
execute
Istanbul
[database]
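The "execute" step can be sketched in Python against a toy database. Everything in the data below is a hypothetical placeholder: the entity set is tiny and the population values are made-up ranks, not real figures. Only the shape of the program follows the slide.

```python
# Toy knowledge base (hypothetical; population values are placeholders, not real data).
CITIES = {"Istanbul", "London", "Tokyo"}
CONTAINED_BY = {"Istanbul": "Europe", "London": "Europe", "Tokyo": "Asia"}
POPULATION = {"Istanbul": 3, "London": 2, "Tokyo": 5}

def contained_by(region):
    """ContainedBy(region): all entities located in the region."""
    return {entity for entity, r in CONTAINED_BY.items() if r == region}

def argmax(entities, key):
    """The entity with the largest value under the given attribute table."""
    return max(entities, key=lambda e: key[e])

# argmax(Cities ∩ ContainedBy(Europe), Population)
answer = argmax(CITIES & contained_by("Europe"), POPULATION)
print(answer)  # Istanbul
```

The intersection matters here: Tokyo has the largest placeholder value overall but is filtered out before the argmax is taken.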
Language to programs
Remind me to buy milk after my last meeting on Monday.
semantic parsing
Add(Buy(Milk), argmax(Meetings ∩ HasDate(2016-07-18), EndTime))
execute
[reminder added]
[calendar]
Language to programs
[sentence]
semantic parsing
[program]
execute
[behavior]
[context]
A brief history of semantic parsing
GeoQuery [Zelle & Mooney 1996]
Inductive logic programming [Tang & Mooney 2001]
CCG [Zettlemoyer & Collins 2005]
String kernels [Kate & Mooney 2006]
Synchronous grammars [Wong & Mooney 2007]
Relaxed CCG [Zettlemoyer & Collins 2007]
Learning from world [Clarke et al. 2010]
Higher-order unification [Kwiatkowski et al. 2011]
Learning from answers [Liang et al. 2011]
Language + vision [Matuszek et al. 2012]
Large-scale KBs [Berant et al.; Kwiatkowski et al. 2013]
Regular expressions [Kushman et al. 2013]
Instruction following [Artzi & Zettlemoyer 2013]
Reduction to paraphrasing [Berant & Liang 2014]
Compositionality on tables [Pasupat & Liang 2015]
Dataset from logical forms [Wang et al. 2015]
QA on Freebase
100M entities (nodes), 1B assertions (edges)
[Figure: Freebase subgraph around BarackObama, with entities such as Honolulu, Hawaii, UnitedStates, MichelleObama, and Chicago linked by relations such as Type, Profession, DateOfBirth, PlaceOfBirth, ContainedBy, Spouse, PlacesLived, and Location]
WebQuestions [Berant et al., 2013]
[Bollacker, 2008; Google, 2013]
QA on semi-structured data
154 million tables on the web [Cafarella et al. 2008]
WikiTableQuestions dataset
Statistics:
• 22000 question/answers
• 2100 tables
• 6.3 columns and 27.5 rows per table
Challenges:
• High logical complexity (conjunction, disjunction, superlatives, comparatives, aggregation, arithmetic)
• Tables are unnormalized
• Train and test tables are distinct; need to generalize!
[Pasupat & Liang, 2015]
Model framework
Greece held its last Summer Olympics in which year?
R[Date].R[Year].argmax(Country.Greece, Index)
2004
Model details
Generate programs recursively of increasing size:
Greece
City
Country
Nations
Year
Country.Greece
R[City].Country.Greece
R[Nations].Country.Greece
R[Year].Country.Greece
argmax(Country.Greece, Index)
argmax(Country.Greece,Nations)
argmax(Country.Greece,Year)
...
R[Date].R[Year].argmax(Country.Greece, Index)
Training:
$$\max_\theta \; \log \sum_{\text{program}\,:\,\mathrm{Exec}(\text{program}) = \text{answer}} p_\theta(\text{program} \mid \text{question})$$
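The training objective marginalizes over every candidate program whose execution matches the answer. The Python sketch below computes that quantity for one question, assuming a softmax distribution over a finite candidate list. That parameterization and the names `scores` and `denotations` are assumptions made for illustration; the slide only states the objective.

```python
import math

def logsumexp(values):
    """Numerically stable log(sum(exp(v) for v in values))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))

def log_marginal_likelihood(scores, denotations, answer):
    """log of the total probability of candidates with Exec(program) == answer.

    scores[i] is the unnormalized model score of candidate i (assumed softmax model);
    denotations[i] is the result of executing candidate i.
    """
    consistent = [s for s, d in zip(scores, denotations) if d == answer]
    if not consistent:
        return float("-inf")  # no candidate executes to the answer
    return logsumexp(consistent) - logsumexp(scores)

# Four equally scored candidates, two of which execute to 2004.
value = log_marginal_likelihood([0.0, 0.0, 0.0, 0.0], [2004, 1896, 2004, 14], 2004)
print(math.isclose(value, math.log(0.5)))  # True
```

Note that the objective rewards any program reaching the answer, so the model receives no signal about which consistent program is the intended one.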
Results on WikiTableQuestions
Answer accuracy:
• Predict table cell: 12.7
• Semantic parser: 37.1
• Neural programmer [Neelakantan+, 2016]: 37.2
Error analysis
Unhandled operations:
• Was there more gold medals won than silver?
• Which movies were number 1 for at least two consecutive weeks?
• How many titles had the same author listed as the illustrator?
Table normalization:
• In what city did Piotr’s last 1st place finish occur? ...[Bangkok,Thailand]...
• How long does the show defcon 3 last? ...[2pm-3pm]...
Modularity (model)
Natural language specifications
Modularity (search)
Final remarks
Searching...
Greece held its last Summer Olympics in which year?
2004

| Year | City | Country | Nations |
|---|---|---|---|
| 1896 | Athens | Greece | 14 |
| 1900 | Paris | France | 24 |
| 1904 | St. Louis | USA | 12 |
| ... | ... | ... | ... |
| 2004 | Athens | Greece | 201 |
| 2008 | Beijing | China | 204 |
| 2012 | London | UK | 204 |

Candidate programs explored:
R[Index].Country.Greece
R[Nations].Country.Greece
argmax(Country.Greece, Nations)
argmax(Country.Greece, Index)
... (thousands of logical forms later) ...
R[Date].R[Year].argmax(Country.Greece, Index)
Oracle accuracy
How many times did Greece hold the Summer Olympics?
2
How often can the system even generate a set of 200 candidate programs containing the right answer?
76.6%
Right for the wrong reasons
How many times did Greece hold the Summer Olympics?
2
Programs that all return 2:
count(Country.Greece)
count(Country.Greece) − count(Country.Norway)
R[Index].R[Next].R[Next].argmin(Country.Greece, Index)
System generates list of 200 candidate programs
Any gets correct answer: 76.6%
Any gets correct program: 53.5%
Recovering program is unsupervised problem
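The three programs for the Greece question can be executed on the Olympics table to see why the answer alone cannot tell them apart. This Python sketch uses only the six rows visible on the slide (the elided rows are omitted) and assumes that Index is the 0-based row position and that Next moves to the following row; the helper names are hypothetical.

```python
# The six rows visible on the slide; the elided middle rows are left out.
ROWS = [
    {"Year": 1896, "City": "Athens", "Country": "Greece", "Nations": 14},
    {"Year": 1900, "City": "Paris", "Country": "France", "Nations": 24},
    {"Year": 1904, "City": "St. Louis", "Country": "USA", "Nations": 12},
    {"Year": 2004, "City": "Athens", "Country": "Greece", "Nations": 201},
    {"Year": 2008, "City": "Beijing", "Country": "China", "Nations": 204},
    {"Year": 2012, "City": "London", "Country": "UK", "Nations": 204},
]

def rows_where(column, value):
    """Column.value: indices of rows whose cell in `column` equals `value`."""
    return {i for i, row in enumerate(ROWS) if row[column] == value}

def count(rows):
    return len(rows)

def argmin_index(rows):
    """argmin(rows, Index): the earliest matching row (assumes 0-based Index)."""
    return min(rows)

def next_row(i):
    """Assumed reading of Next: the row directly below row i."""
    return i + 1

greece = rows_where("Country", "Greece")
candidates = {
    "count(Country.Greece)": count(greece),
    "count(Country.Greece) - count(Country.Norway)":
        count(greece) - count(rows_where("Country", "Norway")),
    "R[Index].R[Next].R[Next].argmin(Country.Greece, Index)":
        next_row(next_row(argmin_index(greece))),
}
for program, value in candidates.items():
    print(program, "=>", value)
```

All three candidates evaluate to 2 on this table, yet only the first expresses the question; the other two are spurious and would fail on a different table.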
Challenge
How many times did Greece hold the Summer Olympics?
2
Can we efficiently generate all programs (up to some size) that produce the correct answer?
Intuition: dynamic programming
PopulationOf.CapitalOf.Colorado
PopulationOf.argmax(Type.City ⊓ ContainedBy.Colorado, Population)
PopulationOf.Denver
[Pasupat & Liang, 2016]
Dynamic programming on denotations
Step 1: compute all reachable denotations
Colorado Denver United States ... 649,495
Step 2: discard denotations that don’t reach the correct answer
(in practice: 99% reduction)
Step 3: enumerate all programs on remaining denotations
PopulationOf.CapitalOf.Colorado
PopulationOf.argmax(Type.City ⊓ ContainedBy.Colorado, Population)
[Pasupat & Liang, 2016]
35
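The three steps can be sketched on a toy knowledge base. This is a simplified illustration under stated assumptions, not the algorithm of [Pasupat & Liang, 2016]: programs are restricted to chains of unary relations, cells are keyed by (size, denotation), and the names `RELATIONS`, `ENTITIES`, and `LargestCityIn` (a stand-in for the argmax program) are hypothetical.

```python
# Toy knowledge base: relation -> {argument: value}. Assumed for illustration.
RELATIONS = {
    "CapitalOf": {"Colorado": "Denver"},
    "LargestCityIn": {"Colorado": "Denver"},
    "CountryOf": {"Colorado": "United States"},
    "PopulationOf": {"Denver": 649495},
}
ENTITIES = ["Colorado", "Denver"]


def reachable_cells(entities, relations, max_size):
    """Step 1: map (size, denotation) -> set of (operator, child denotation)."""
    cells = {(1, e): {(e, None)} for e in entities}
    for size in range(2, max_size + 1):
        previous = [d for (s, d) in cells if s == size - 1]
        for d in previous:
            for rel, mapping in relations.items():
                if d in mapping:
                    cells.setdefault((size, mapping[d]), set()).add((rel, d))
    return cells


def prune(cells, target):
    """Step 2: keep only cells that lie on a path to the target denotation."""
    keep, stack = set(), [c for c in cells if c[1] == target]
    while stack:
        cell = stack.pop()
        if cell in keep:
            continue
        keep.add(cell)
        for _, child in cells[cell]:
            if child is not None:
                stack.append((cell[0] - 1, child))
    return {c: cells[c] for c in keep}


def programs(cells, cell):
    """Step 3: expand back-pointers into concrete programs."""
    for op, child in cells[cell]:
        if child is None:
            yield op
        else:
            for sub in programs(cells, (cell[0] - 1, child)):
                yield f"{op}.{sub}"


def consistent_programs(entities, relations, target, max_size):
    """Run all three steps and list every program that yields the target."""
    cells = prune(reachable_cells(entities, relations, max_size), target)
    roots = [c for c in cells if c[1] == target]
    return sorted(p for root in roots for p in programs(cells, root))
```

Because `CapitalOf.Colorado` and `LargestCityIn.Colorado` share the denotation `Denver`, they occupy one cell, and anything built on top of that cell is computed once. The cell for `United States` is discarded in step 2 because no path leads from it to the answer.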
![Page 77: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/77.jpg)
Results: number of items explored
36
![Page 79: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/79.jpg)
Results: oracle accuracy
53.5% ⇒ 76.6%
37
![Page 80: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/80.jpg)
Learning macros
the same number of #0 → !=.cell1 AND R[col2].col0.R[col0].col2.cell1
how many more #0 → R[Number].R[col0].col2.cell1 - R[Number].R[col0].col2.cell3
at least NUM #0 → NUM >= @num #col:0
when was the last → max(R[Date]. )
after #0 → R[col1].R[Next].col1.cell0
Ongoing work: use these in learning the semantic parser
38
![Page 81: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/81.jpg)
Summary so far
What is the largest city in Europe by population?
semantic parsing
argmax(Cities ∩ ContainedBy(Europe),Population)
execute
Istanbul
• Semantic parsing converts language to programs
• When we have learned the language, search is easy!
• Until then, search is hard; use dynamic programming
39
![Page 82: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/82.jpg)
Modularity (model)
Natural language specifications
Modularity (search)
Final remarks
40
![Page 83: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/83.jpg)
Where’s the neural stuff?
41
![Page 87: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/87.jpg)
Neural semantic parsing
[Sequence-to-sequence diagram: input “what is the largest city”, output tokens “argmax Type City , Population”]
• Learn semantic composition without predefined grammar
• Encode compositionality through data recombination
Original examples:
what’s the capital of Germany? ⇒ CapitalOf(Germany)
what countries border France? ⇒ Borders(France)
Recombined examples:
what’s the capital of France? ⇒ CapitalOf(France)
what countries border Germany? ⇒ Borders(Germany)
[Robin Jia, ACL 2016]
42
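The entity-swapping form of data recombination shown on this slide can be sketched in a few lines. This is a minimal illustration, not the procedure of [Robin Jia, ACL 2016]: the `recombine` function and the `SLOT` marker are hypothetical, each example is assumed to come with its entity already identified, and all entities are assumed to share one type (here, countries).

```python
SLOT = "<ENTITY>"  # hypothetical placeholder marking the abstracted entity


def recombine(examples):
    """Create new (utterance, program) pairs by swapping entities.

    `examples` is a list of (utterance, program, entity) triples in which
    the entity string occurs in both the utterance and the program.
    """
    templates = [(u.replace(e, SLOT), p.replace(e, SLOT)) for u, p, e in examples]
    entities = [e for _, _, e in examples]
    seen = {(u, p) for u, p, _ in examples}
    generated = []
    for utt_template, prog_template in templates:
        for entity in entities:
            pair = (utt_template.replace(SLOT, entity),
                    prog_template.replace(SLOT, entity))
            if pair not in seen:  # skip the originals and repeats
                seen.add(pair)
                generated.append(pair)
    return generated


EXAMPLES = [
    ("what’s the capital of Germany?", "CapitalOf(Germany)", "Germany"),
    ("what countries border France?", "Borders(France)", "France"),
]
NEW_EXAMPLES = recombine(EXAMPLES)
```

With the two slide examples as input, `NEW_EXAMPLES` holds the two recombined pairs from the slide, which can then be added to the training data for the sequence-to-sequence model.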
![Page 88: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/88.jpg)
Neural semantic parsing
Dataset: US Geography dataset (Zelle & Mooney, 1996)
What is the highest point in Florida?
[Bar chart: accuracy (y-axis 70 to 100) for WM07, ZC07, KZGS11, RNN, RNN+recomb; bar labels as extracted: 86.1, 86.6, 88.9, 85, 89.3]
state-of-the-art, simpler
[Robin Jia, ACL 2016]
43
![Page 89: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/89.jpg)
Point 1/2: factorization
question answering
44
![Page 94: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/94.jpg)
Point 1/2: factorization
understanding knowing
What is the largest city in Missouri?
argmax(Type.City ⊓ ContainedBy.Missouri, Population)
Kansas City
Generalize robustly to all worlds!
44
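The factorization can be made concrete by executing one fixed program against different worlds. This is a minimal sketch with assumed names: the `largest_city_in` function and the dictionary layout of a world are hypothetical, and the population values are arbitrary placeholders rather than real figures.

```python
def largest_city_in(world, region):
    """Execute argmax(Type.City ⊓ ContainedBy.<region>, Population) in a world."""
    cities = {e for e, t in world["Type"].items() if t == "City"}
    inside = {e for e, r in world["ContainedBy"].items() if r == region}
    return max(cities & inside, key=lambda e: world["Population"][e])


# Two toy worlds that differ only in the (placeholder) population values.
WORLD_A = {
    "Type": {"Kansas City": "City", "St. Louis": "City", "Missouri": "State"},
    "ContainedBy": {"Kansas City": "Missouri", "St. Louis": "Missouri"},
    "Population": {"Kansas City": 5, "St. Louis": 3},
}
WORLD_B = dict(WORLD_A, Population={"Kansas City": 3, "St. Louis": 5})
```

The program (understanding) is the same in both cases; only the world it is executed against (knowing) changes, so the answer changes with the facts while the mapping from question to program is left untouched.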
![Page 96: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/96.jpg)
Point 2/2: discrete execution
What does differentiable execution buy you?
• Soft reasoning, perhaps (happens in language)
• Avoid combinatorial search
But you get a nasty non-convex optimization problem instead!
45
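As a rough illustration of the contrast (not taken from the talk), a discrete argmax selects one item, while a softmax-weighted average is a differentiable stand-in that blends all items. The function names and the `temperature` parameter below are assumptions made for this sketch.

```python
import math


def hard_argmax(scores):
    """Discrete execution: index of the highest-scoring item."""
    return max(range(len(scores)), key=scores.__getitem__)


def soft_selection(scores, values, temperature=1.0):
    """Soft execution: softmax-weighted average of values."""
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp((s - top) / temperature) for s in scores]
    total = sum(weights)
    return sum(w / total * v for w, v in zip(weights, values))


SCORES = [1.0, 3.0, 2.0]
VALUES = [10.0, 20.0, 60.0]
```

At a low temperature the soft result approaches the value picked by `hard_argmax`; at a high temperature it approaches the plain average of the values. The soft version needs no search over discrete choices, but training through it is the non-convex optimization problem the slide refers to.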
![Page 97: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/97.jpg)
Point 2/2: discrete execution
• Real gains might come from overprovisioning (many hidden units)
• This happens in discrete search too (many registers) [Schkufza+, 2013]
46
![Page 101: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/101.jpg)
Modularity: subprograms yield larger search operators
Modularity: cache denotations (dynamic programming)
Specifications: language tells you what the program is!
47
![Page 103: How Can We Write Large Programs Without Thinking? · Linux: 15 million lines of code Windows 7: 40 million lines of code Google: 2 billion lines of code 1](https://reader034.fdocuments.net/reader034/viewer/2022050218/5f9214fa932b686ed70d8491/html5/thumbnails/103.jpg)
Collaborators
Ice Pasupat Robin Jia Michael Jordan Dan Klein
Code, data, experiments
worksheets.codalab.org
Funding
Google Microsoft DARPA
Thank you!
48