CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down...
![Page 1: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/1.jpg)
CMPT 755 Compilers
Anoop Sarkar
http://www.cs.sfu.ca/~anoop
![Page 2: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/2.jpg)
Parsing - Roadmap
• Parser:
  – decision procedure: builds a parse tree
• Top-down vs. bottom-up
• LL(1) – Deterministic Parsing
  – recursive-descent
  – table-driven
• LR(k) – Deterministic Parsing
  – LR(0), SLR(1), LR(1), LALR(1)
• Parsing arbitrary CFGs – Polynomial time parsing
![Page 3: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/3.jpg)
Top-Down vs. Bottom-Up
Grammar:
S → A B
A → c | ε
B → cbB | ca
Input String: ccbca
Top-Down/leftmost:
  S ⇒ AB ⇒ cB ⇒ ccbB ⇒ ccbca
  Rules applied in order: S → AB, A → c, B → cbB, B → ca
Bottom-Up/rightmost:
  ccbca ⇐ Acbca ⇐ AcbB ⇐ AB ⇐ S
  Reductions applied in order: A → c, B → ca, B → cbB, S → AB
![Page 4: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/4.jpg)
Top-Down: Backtracking
S → A B
A → c | ε
B → cbB | ca
True/False: S ⇒* cbca?
Sentential form   Input   Action
S                 cbca    try S → AB
AB                cbca    try A → c
cB                cbca    match c
B                 bca     dead-end, try A → ε
εB                cbca    try B → cbB
cbB               cbca    match c
bB                bca     match b
B                 ca      try B → cbB
cbB               ca      match c
bB                a       dead-end, try B → ca
ca                ca      match c
a                 a       match a, Done!
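The trace above can be reproduced with a small backtracking recognizer. This is a minimal sketch rather than the course's implementation: the names `GRAMMAR` and `derives` are hypothetical, the grammar is the one on this slide, and the approach only terminates for grammars without left recursion.

```python
# Grammar from the slide; [] stands for the ε alternative.
GRAMMAR = {
    "S": [["A", "B"]],
    "A": [["c"], []],
    "B": [["c", "b", "B"], ["c", "a"]],
}

def derives(symbols, text):
    """Return True if the list of symbols can derive exactly `text`."""
    if not symbols:
        return text == ""
    head, rest = symbols[0], symbols[1:]
    if head not in GRAMMAR:
        # Terminal: it must match the next input character.
        return text.startswith(head) and derives(rest, text[len(head):])
    # Nonterminal: try each alternative in order, backtracking on failure.
    return any(derives(alt + rest, text) for alt in GRAMMAR[head])

print(derives(["S"], "cbca"))
```

Because the rest of the sentential form is passed along with each alternative, a choice such as A → c is undone when the remainder fails, which is exactly the "dead-end, try A → ε" step in the trace.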
![Page 5: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/5.jpg)
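Backtracking
Input: cad
Grammar 1: S → cAd | c, A → ad | a
  S ⇒ c A d with A → ad: Failure (A consumes "ad", so the final d cannot be matched)
Grammar 2: S → cAd | c, A → a | ad
  S ⇒ c A d with A → a: Success
Recursive descent parser does not backtrack into rules that succeed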
![Page 6: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/6.jpg)
Transition Diagram
S → cAa
A → cB | B
B → bcB | ε
[Diagram: one transition diagram per nonterminal]
S: c A a
A: c B, or B
B: b c B, or ε
![Page 7: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/7.jpg)
Predictive Top-Down Parser
• Knows which production to choose based on a single lookahead symbol
• Needs LL(1) grammars
  – First L: reads input Left to right
  – Second L: produces Leftmost derivation
  – 1: one symbol of lookahead
• Can’t have left-recursion
• Must be left-factored (no two alternatives of a nonterminal share a common prefix)
• Not all grammars can be made LL(1)
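A predictive parser can be written as one procedure per nonterminal, each choosing its production from the single lookahead token. The sketch below assumes the expression grammar T → F T’, T’ → * F T’ | ε, F → id | ( T ) that these slides use for the parsing-table examples; the class and method names are hypothetical.

```python
class Parser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]   # "$" marks end of input
        self.pos = 0

    def look(self):
        return self.tokens[self.pos]

    def eat(self, tok):
        if self.look() != tok:
            raise SyntaxError(f"expected {tok}, saw {self.look()}")
        self.pos += 1

    def T(self):                 # T → F T'
        self.F()
        self.T_rest()

    def T_rest(self):            # T' → * F T' | ε
        if self.look() == "*":
            self.eat("*")
            self.F()
            self.T_rest()
        elif self.look() not in (")", "$"):   # ε only on Follow(T')
            raise SyntaxError(f"unexpected {self.look()}")

    def F(self):                 # F → id | ( T )
        if self.look() == "id":
            self.eat("id")
        elif self.look() == "(":
            self.eat("(")
            self.T()
            self.eat(")")
        else:
            raise SyntaxError(f"unexpected {self.look()}")

def accepts(tokens):
    parser = Parser(tokens)
    try:
        parser.T()
        parser.eat("$")
        return True
    except SyntaxError:
        return False

print(accepts(["(", "id", ")", "*", "id"]))
```

No procedure ever undoes a choice, which is why the grammar must be free of left recursion and left-factored before this style applies.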
![Page 8: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/8.jpg)
Leftmost derivation for id + id * id
Grammar:
E → E + E
E → E * E
E → ( E )
E → - E
E → id
Derivation:
E ⇒ E + E
  ⇒ id + E
  ⇒ id + E * E
  ⇒ id + id * E
  ⇒ id + id * id
E ⇒*lm id + E * E
![Page 9: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/9.jpg)
Predictive Parsing Table
Productions:
1  T → F T’
2  T’ → ε
3  T’ → * F T’
4  F → id
5  F → ( T )

      *              (            )         id          $
T                    T → F T’               T → F T’
T’    T’ → * F T’                 T’ → ε                T’ → ε
F                    F → ( T )              F → id
![Page 10: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/10.jpg)
Trace “(id)*id”
(top of stack is at the right)
Stack        Input       Output
$T           (id)*id$
$T’F         (id)*id$    T → F T’
$T’)T(       (id)*id$    F → ( T )
$T’)T        id)*id$
$T’)T’F      id)*id$     T → F T’
$T’)T’id     id)*id$     F → id
$T’)T’       )*id$
$T’)         )*id$       T’ → ε
![Page 11: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/11.jpg)
Trace “(id)*id”
(top of stack is at the right)
Stack     Input    Output
$T’       *id$
$T’F*     *id$     T’ → * F T’
$T’F      id$
$T’id     id$      F → id
$T’       $
$         $        T’ → ε
![Page 12: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/12.jpg)
Table-Driven Parsing
stack.push($); stack.push(S);
a = input.read();
forever do begin
  X = stack.peek();
  if X = a and a = $ then return SUCCESS;
  elsif X = a and a != $ then
    pop X; a = input.read();
  elsif X != a and X ∈ N and M[X,a] then
    pop X; push right-hand side of M[X,a] in reverse order (leftmost symbol ends on top);
  else ERROR!
end
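The driver loop above translates almost line for line into Python. This is a sketch under stated assumptions: `TABLE` encodes the predictive parsing table for the T / T’ / F grammar from the earlier slide (with T’ spelled `"T'"`), and the function name `ll1_parse` is hypothetical.

```python
# M[X, a] from the "Predictive Parsing Table" slide; [] is the ε right-hand side.
TABLE = {
    ("T", "("): ["F", "T'"],
    ("T", "id"): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", ")"): [],
    ("T'", "$"): [],
    ("F", "id"): ["id"],
    ("F", "("): ["(", "T", ")"],
}
NONTERMINALS = {"T", "T'", "F"}

def ll1_parse(tokens, start="T"):
    """Return the productions applied, in order, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    pos = 0
    stack = ["$", start]                 # top of stack is the last element
    output = []
    while True:
        top, look = stack[-1], tokens[pos]
        if top == look == "$":
            return output
        if top == look:                  # terminal on top matches the input
            stack.pop()
            pos += 1
        elif top in NONTERMINALS and (top, look) in TABLE:
            rhs = TABLE[(top, look)]
            stack.pop()
            stack.extend(reversed(rhs))  # leftmost symbol ends up on top
            output.append((top, rhs))
        else:
            raise SyntaxError(f"unexpected {look!r} with {top!r} on the stack")

for lhs, rhs in ll1_parse(["(", "id", ")", "*", "id"]):
    print(lhs, "→", " ".join(rhs) or "ε")
```

Running it on `( id ) * id` prints the same sequence of productions as the Output column of the trace slides.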
![Page 13: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/13.jpg)
Predictive Parsing table
• Given a grammar, produce the predictive parsing table
• We need to know, for all rules A → α | β, the lookahead symbols that select each alternative
• Based on the lookahead symbol, the table can be used to pick which rule to push onto the stack
• This can be done using two sets: FIRST and FOLLOW
![Page 14: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/14.jpg)
FIRST and FOLLOW
![Page 15: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/15.jpg)
Conditions for LL(1)
• Necessary conditions:
  – no ambiguity
  – no left recursion
  – Left factored grammar
• A grammar G is LL(1) iff, whenever A → α | β:
  1. First(α) ∩ First(β) = ∅
  2. α ⇒* ε implies !(β ⇒* ε)
  3. α ⇒* ε implies First(β) ∩ Follow(A) = ∅
![Page 16: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/16.jpg)
proc First(α: string of symbols)   // assume α = X1 X2 X3 … Xn
  if X1 ∈ T then
    First(α) := {X1}
  else begin
    i := 1;
    First(α) := First(X1) \ {ε};
    while i <= n and Xi ⇒* ε do begin
      if i < n then
        First(α) := First(α) ∪ First(Xi+1) \ {ε};
      else
        First(α) := First(α) ∪ {ε};
      i := i + 1;
    end
  end
![Page 17: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/17.jpg)
proc First(X);   (modified)
  foreach X ∈ T do First(X) := {X};
  foreach p ∈ P : X → ε do First(X) := {ε};
  repeat
    foreach X ∈ N, p : X → Y1 Y2 Y3 … Yn do begin
      i := 1;
      while i <= n and Yi ⇒* ε do begin
        First(X) := First(X) ∪ First(Yi) \ {ε};
        i := i + 1;
      end
      if i = n+1 then First(X) := First(X) ∪ {ε};
      else First(X) := First(X) ∪ First(Yi);
    end
  until no change in any First(X);
![Page 18: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/18.jpg)
proc Follow(N: non-terminal)
  Follow(S) := {$};
  repeat
    foreach p ∈ P do
      case p = A → αBβ
        begin
          Follow(B) := Follow(B) ∪ First(β) \ {ε};
          if ε ∈ First(β) then Follow(B) := Follow(B) ∪ Follow(A);
        end
      case p = A → αB
        Follow(B) := Follow(B) ∪ Follow(A);
  until no change in any Follow(N)
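Both fixed-point procedures can be sketched compactly in Python. The representation is an assumption made for illustration: a grammar is a dict from nonterminal to a list of alternatives, `[]` is the ε alternative, and the function names are hypothetical.

```python
EPS, END = "ε", "$"

def first_of(symbols, first):
    """FIRST of a symbol string, given FIRST sets of the single symbols."""
    result = set()
    for sym in symbols:
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)            # every symbol was nullable, or the string is empty
    return result

def first_follow(grammar, start):
    nonterminals = set(grammar)
    terminals = {s for alts in grammar.values() for alt in alts for s in alt}
    terminals -= nonterminals
    first = {t: {t} for t in terminals}
    first.update({n: set() for n in nonterminals})
    changed = True
    while changed:             # repeat until no FIRST set grows
        changed = False
        for lhs, alts in grammar.items():
            for alt in alts:
                new = first_of(alt, first)
                if not new <= first[lhs]:
                    first[lhs] |= new
                    changed = True
    follow = {n: set() for n in nonterminals}
    follow[start].add(END)
    changed = True
    while changed:             # repeat until no FOLLOW set grows
        changed = False
        for lhs, alts in grammar.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in nonterminals:
                        continue
                    tail = first_of(alt[i + 1:], first)
                    new = tail - {EPS}
                    if EPS in tail:        # β is empty or nullable
                        new |= follow[lhs]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return first, follow

grammar = {"S": [["A", "B"]], "A": [["c"], []], "B": [["c", "b", "B"], ["c", "a"]]}
first, follow = first_follow(grammar, "S")
print(first["A"], follow["A"])
```

Treating the empty tail as "nullable" folds the two cases of the Follow procedure (A → αBβ and A → αB) into one loop.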
![Page 19: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/19.jpg)
Example First/Follow
S → AB
A → c | ε
B → cbB | ca
First(A) = {c, ε}    Follow(A) = {c}
First(B) = {c}       Follow(B) = {$}
First(S) = {c}       Follow(S) = {$}
Follow(A) ∩ First(c) = {c}: A → ε is nullable, so condition 3 fails for A → c | ε
First(cbB) = First(ca) = {c}: condition 1 fails for B → cbB | ca
Not an LL(1) grammar
![Page 20: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/20.jpg)
Converting to LL(1)
Original grammar:
S → AB
A → c | ε
B → cbB | ca
Note that the grammar is regular: c? (cb)* ca
  c (c b c b … c b) c a
    (c b c b … c b) c a
same as: c c? (bc)* a
  c c (b c b … c b c) a
  c   (b c b … c b c) a
LL(1) grammar:
S → cAa
A → cB | B
B → bcB | ε
![Page 21: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/21.jpg)
Verifying LL(1) using F/F sets
S → cAa
A → cB | B
B → bcB | ε
First(S) = {c}          Follow(S) = {$}
First(A) = {b, c, ε}    Follow(A) = {a}
First(B) = {b, ε}       Follow(B) = {a}
![Page 22: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/22.jpg)
Building the Parse Table
• Compute First and Follow sets
• For each production A → α
  – foreach terminal a ∈ First(α) add A → α to M[A,a]
  – If ε ∈ First(α) add A → α to M[A,b] for each b in Follow(A)
  – If ε ∈ First(α) add A → α to M[A,$] if $ ∈ Follow(A)
  – All undefined entries are errors
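The table-filling rules can be sketched as follows. To stay self-contained, the FIRST and FOLLOW sets are typed in by hand from the example slides instead of being computed; `build_table` and the dict-based grammar encoding are hypothetical names for illustration. A cell that would receive two productions is recorded as a conflict, which is how a non-LL(1) grammar shows up.

```python
EPS = "ε"

def first_of(symbols, first):
    """FIRST of a symbol string, given FIRST sets of the single symbols."""
    result = set()
    for sym in symbols:
        result |= first[sym] - {EPS}
        if EPS not in first[sym]:
            return result
    result.add(EPS)
    return result

def build_table(grammar, first, follow):
    table, conflicts = {}, []
    for lhs, alts in grammar.items():
        for alt in alts:
            f = first_of(alt, first)
            lookaheads = f - {EPS}
            if EPS in f:
                lookaheads |= follow[lhs]     # includes $ when $ ∈ Follow(lhs)
            for a in lookaheads:
                if (lhs, a) in table:
                    conflicts.append((lhs, a))   # more than one entry per field
                else:
                    table[(lhs, a)] = alt
    return table, conflicts

terminals = {t: {t} for t in "abc"}

# Converted grammar: S → cAa, A → cB | B, B → bcB | ε
ll1 = {"S": [["c", "A", "a"]], "A": [["c", "B"], ["B"]], "B": [["b", "c", "B"], []]}
ll1_first = dict(terminals, S={"c"}, A={"b", "c", EPS}, B={"b", EPS})
ll1_follow = {"S": {"$"}, "A": {"a"}, "B": {"a"}}

# Original grammar: S → AB, A → c | ε, B → cbB | ca
bad = {"S": [["A", "B"]], "A": [["c"], []], "B": [["c", "b", "B"], ["c", "a"]]}
bad_first = dict(terminals, S={"c"}, A={"c", EPS}, B={"c"})
bad_follow = {"S": {"$"}, "A": {"c"}, "B": {"$"}}

print(build_table(ll1, ll1_first, ll1_follow)[1])
print(build_table(bad, bad_first, bad_follow)[1])
```

The first grammar fills six cells without a clash; the second clashes on M[A,c] and M[B,c], matching the "Not an LL(1) grammar" verdict.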
![Page 23: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/23.jpg)
Revisit conditions for LL(1)
• A grammar G is LL(1) iff, whenever A → α | β:
  1. First(α) ∩ First(β) = ∅
  2. α ⇒* ε implies !(β ⇒* ε)
  3. α ⇒* ε implies First(β) ∩ Follow(A) = ∅
• No more than one entry per table field
![Page 24: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/24.jpg)
Error Handling
• Reporting & Recovery
  – Report as soon as possible
  – Suitable error messages
  – Resume after error
  – Avoid cascading errors
• Phrase-level vs. Panic-mode recovery
![Page 25: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/25.jpg)
Panic-Mode Recovery
• Skip tokens until synchronizing set is seen
  – Follow(A)
    • garbage or missing things after
  – Higher-level start symbols
  – First(A)
    • garbage before
  – Epsilon
    • if nullable
  – Pop/Insert terminal
    • “auto-insert”
• Add “synch” actions to table
![Page 26: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/26.jpg)
Summary so far
• LL(1) grammars
  – necessary conditions
    • No left recursion
    • Left-factored
• Not all languages can be generated by an LL(1) grammar
• LL(1) grammars can be parsed by a simple predictive recursive-descent parser
  – Alternative: table-driven top-down parser
![Page 27: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/27.jpg)
Bottom-up parsing overview
• Start from terminal symbols, search for a path to the start symbol
• Apply shift and reduce actions: postpone decisions
• LR parsing:
  – L: left to right parsing
  – R: rightmost derivation (in reverse or bottom-up)
• LR(0) → SLR(1) → LR(1) → LALR(1)
  – 0 or 1 or k lookahead symbols
![Page 28: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/28.jpg)
Actions in Shift-Reduce Parsing
• Shift
  – add terminal to parse stack, advance input
• Reduce
  – If αw on stack, and A → w, and there is a β ∈ T* such that S ⇒*rm αAβ ⇒rm αwβ then we can prune the handle w; we reduce αw to αA on the stack
  – αw is a viable prefix
• Error
• Accept
![Page 29: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/29.jpg)
Questions
• When to shift/reduce?
  – What are valid handles?
  – Ambiguity: Shift/reduce conflict
• If reducing, using which production?
  – Ambiguity: Reduce/reduce conflict
![Page 30: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/30.jpg)
Rightmost derivation for id + id * id
Grammar:
E → E + E
E → E * E
E → ( E )
E → - E
E → id
Derivation:
E ⇒ E * E
  ⇒ E * id
  ⇒ E + E * id
  ⇒ E + id * id
  ⇒ id + id * id
E ⇒*rm E + E * id
Bottom-up, starting from id + id * id: shift, then reduce with E → id
![Page 31: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/31.jpg)
LR Parsing
• Table-based parser
  – Creates rightmost derivation (in reverse)
  – For “less massaged” grammars than LL(1)
• Data structures:
  – Stack of states/symbols {s}
  – Action table: action[s, a]; a ∈ T
  – Goto table: goto[s, X]; X ∈ N
![Page 32: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/32.jpg)
Action/Goto Table
Productions:
1  T → F
2  T → T*F
3  F → id
4  F → (T)

State   *    (    )    id   $      T   F
0            S5        S8          2   1
1       R1   R1   R1   R1   R1
2       S3                  Acc!
3            S5        S8              4
4       R2   R2   R2   R2   R2
5            S5        S8          6   1
6       S3        S7
7       R4   R4   R4   R4   R4
8       R3   R3   R3   R3   R3
![Page 33: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/33.jpg)
Trace “(id)*id”
Stack      Input            Action
0          ( id ) * id $    Shift S5
0 5        id ) * id $      Shift S8
0 5 8      ) * id $         Reduce 3 F → id, pop 8, goto[5,F] = 1
0 5 1      ) * id $         Reduce 1 T → F, pop 1, goto[5,T] = 6
0 5 6      ) * id $         Shift S7
0 5 6 7    * id $           Reduce 4 F → (T), pop 7 6 5, goto[0,F] = 1
0 1        * id $           Reduce 1 T → F, pop 1, goto[0,T] = 2
![Page 35: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/35.jpg)
Trace “(id)*id”
Stack      Input     Action
0 1        * id $    Reduce 1 T → F, pop 1, goto[0,T] = 2
0 2        * id $    Shift S3
0 2 3      id $      Shift S8
0 2 3 8    $         Reduce 3 F → id, pop 8, goto[3,F] = 4
0 2 3 4    $         Reduce 2 T → T * F, pop 4 3 2, goto[0,T] = 2
0 2        $         Accept
![Page 37: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/37.jpg)
Tracing LR: action[s, a]
• case shift u:
  – push state u
  – read new a
• case reduce r:
  – lookup production r: X → Y1..Yk;
  – pop k states, find state u
  – push goto[u, X]
• case accept: done
• no entry in action table: error
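The three cases can be sketched as a short driver loop. The `ACTION` and `GOTO` dictionaries below are transcribed from the Action/Goto table slide for the grammar T → F | T*F, F → id | (T); the encoding as tuples and the name `lr_parse` are assumptions made for this illustration.

```python
# production number -> (left-hand side, length of right-hand side)
PRODUCTIONS = {1: ("T", 1), 2: ("T", 3), 3: ("F", 1), 4: ("F", 3)}

ACTION = {}
for state, prod in [(1, 1), (4, 2), (7, 4), (8, 3)]:
    for tok in ["*", "(", ")", "id", "$"]:      # LR(0): reduce on every token
        ACTION[(state, tok)] = ("reduce", prod)
for state in (0, 3, 5):
    ACTION[(state, "(")] = ("shift", 5)
    ACTION[(state, "id")] = ("shift", 8)
ACTION[(2, "*")] = ("shift", 3)
ACTION[(2, "$")] = ("accept", None)
ACTION[(6, "*")] = ("shift", 3)
ACTION[(6, ")")] = ("shift", 7)

GOTO = {(0, "T"): 2, (0, "F"): 1, (3, "F"): 4, (5, "T"): 6, (5, "F"): 1}

def lr_parse(tokens):
    """Return the production numbers used in reductions, or raise SyntaxError."""
    tokens = list(tokens) + ["$"]
    pos = 0
    stack = [0]                                  # stack of states
    reductions = []
    while True:
        entry = ACTION.get((stack[-1], tokens[pos]))
        if entry is None:                        # no entry in action table
            raise SyntaxError(f"unexpected {tokens[pos]!r} in state {stack[-1]}")
        kind, arg = entry
        if kind == "shift":
            stack.append(arg)
            pos += 1
        elif kind == "reduce":
            lhs, length = PRODUCTIONS[arg]
            del stack[-length:]                  # pop k states, exposing state u
            stack.append(GOTO[(stack[-1], lhs)]) # push goto[u, X]
            reductions.append(arg)
        else:
            return reductions

print(lr_parse(["(", "id", ")", "*", "id"]))
```

The reductions come out in the order 3, 1, 4, 1, 3, 2, which is the rightmost derivation of `(id)*id` read in reverse, as in the trace slides.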
![Page 38: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/38.jpg)
Configuration set
• Each set is a parser state
• Consider
    T → T * • F
    F → • ( T )
    F → • id
• Like NFA-to-DFA conversion
![Page 39: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/39.jpg)
Closure
Closure property:
• If T → X1 … Xi • Xi+1 … Xn is in set, and Xi+1 is a nonterminal, then Xi+1 → • Y1 … Ym is in the set as well for all productions Xi+1 → Y1 … Ym
• Compute as fixed point
![Page 40: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/40.jpg)
Starting Configuration
• Augment Grammar with S’
• Add production S’ → S
• Initial configuration set is closure(S’ → • S)
![Page 41: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/41.jpg)
Grammar:
S’ → T
T → F | T * F
F → id | ( T )
Example: I = closure(S’ → • T)
S’ → • T
T → • T * F
T → • F
F → • id
F → • ( T )
![Page 42: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/42.jpg)
Successor(I, X)
Informally: “move by symbol X”
1. move dot to the right in all items where dot is before X
2. remove all other items (viable prefixes only!)
3. compute closure
![Page 43: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/43.jpg)
Successor Example
S’ → T
T → F | T * F
F → id | ( T )
I = { S’ → • T, T → • F, T → • T * F, F → • id, F → • ( T ) }
Compute Successor(I, “(”):
{ F → ( • T ), T → • F, T → • T * F, F → • id, F → • ( T ) }
![Page 44: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/44.jpg)
Sets-of-Items Construction
Family of configuration sets
function items(G’)
  C = { closure({S’ → • S}) };
  do
    foreach I ∈ C do
      foreach X ∈ (N ∪ T) do
        if Successor(I, X) is non-empty then
          C = C ∪ { Successor(I, X) };
  while C changes;
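Closure, Successor, and the sets-of-items loop fit in a few lines of Python. This is a sketch under assumptions: items are encoded as (production index, dot position) pairs, the grammar is the augmented T / F grammar from the surrounding slides, and all names are hypothetical.

```python
GRAMMAR = [                       # production 0 is the augmented S' → T
    ("S'", ("T",)),
    ("T", ("F",)),
    ("T", ("T", "*", "F")),
    ("F", ("id",)),
    ("F", ("(", "T", ")")),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Add B → • γ for every item whose dot stands before nonterminal B."""
    result = set(items)
    work = list(items)
    while work:                                   # fixed point via a worklist
        prod, dot = work.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for i, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (i, 0) not in result:
                    result.add((i, 0))
                    work.append((i, 0))
    return frozenset(result)

def successor(items, symbol):
    """Move the dot over `symbol`, drop all other items, then close."""
    moved = {(p, d + 1) for p, d in items
             if d < len(GRAMMAR[p][1]) and GRAMMAR[p][1][d] == symbol}
    return closure(moved)

def items():
    symbols = {s for _, rhs in GRAMMAR for s in rhs}
    family = [closure({(0, 0)})]
    for state in family:                          # the list grows while we iterate
        for sym in sorted(symbols):
            nxt = successor(state, sym)
            if nxt and nxt not in family:         # skip empty successor sets
                family.append(nxt)
    return family

print(len(items()))
```

For this grammar the construction yields nine configuration sets, one per state of the Action/Goto table, although the numbering produced here need not match the slides.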
![Page 45: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/45.jpg)
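Productions:
1  T → F
2  T → T*F
3  F → id
4  F → (T)
Configuration sets:
0: S’ → • T, T → • F, T → • T * F, F → • id, F → • ( T )
1: T → F •                                    Reduce 1
2: S’ → T •, T → T • * F                      on $: Accept
3: T → T * • F, F → • id, F → • ( T )
4: T → T * F •                                Reduce 2
5: F → ( • T ), T → • F, T → • T * F, F → • id, F → • ( T )
6: F → ( T • ), T → T • * F
7: F → ( T ) •                                Reduce 4
8: F → id •                                   Reduce 3
Transitions:
0 on F → 1, 0 on T → 2, 0 on ( → 5, 0 on id → 8
2 on * → 3
3 on F → 4, 3 on ( → 5, 3 on id → 8
5 on F → 1, 5 on T → 6, 5 on ( → 5, 5 on id → 8
6 on * → 3, 6 on ) → 7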
![Page 47: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/47.jpg)
LR(0) Construction
1. Construct F = {I0, I1, … In}
2. a) if {A → α•} ∈ Ii and A != S’ then action[i, _] := reduce A → α
   b) if {S’ → S•} ∈ Ii then action[i, $] := accept
   c) if {A → α•aβ} ∈ Ii and Successor(Ii, a) = Ij then action[i, a] := shift j
3. if Successor(Ii, A) = Ij then goto[i, A] := j
![Page 48: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/48.jpg)
LR(0) Construction (cont’d)
4. All entries not defined are errors
5. Make sure I0 is the initial state
• Note: LR(0) always reduces if {A → α•} ∈ Ii, no lookahead
• Shift and reduce items can’t be in the same configuration set
  – Accepting state doesn’t count as reduce item
• At most one reduce item per set
![Page 49: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/49.jpg)
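Set-of-items with Epsilon rules
Grammar:
S → AaAb
S → BaBb
A → ε
B → ε
Configuration sets (indentation shows transitions):
Start: S’ → •S, S → •AaAb, S → •BaBb, A → ε•, B → ε•
  on S: S’ → S•
  on A: S → A•aAb
    on a: S → Aa•Ab, A → ε•
      on A: S → AaA•b
        on b: S → AaAb•
  on B: S → B•aBb
    on a: S → Ba•Bb, B → ε•
      on B: S → BaB•b
        on b: S → BaBb•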
![Page 50: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/50.jpg)
LR(0) conflicts:
S’ → F
F → id | ( T )
F → id = T ;
T → T * F
T → id
5: F → id •
   F → id • = T
Shift/reduce conflict
2: F → id •
   T → id •
Reduce/Reduce conflict
Need more lookahead: SLR(1)
![Page 51: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/51.jpg)
SLR(1) : Simple LR(1) Parsing
S’ → T
T → F | T * F | C ( T )
F → id | id ++ | ( T )
C → id
0: S’ → • T
   T → • F
   T → • T * F
   T → • C ( T )
   F → • id
   F → • id ++
   F → • ( T )
   C → • id
1 (from 0 on id):
   F → id •
   F → id • ++
   C → id •
Follow(F) = { *, ), $ }
Follow(C) = { ( }
action[1,*] = action[1,)] = action[1,$] = Reduce F → id
action[1,(] = Reduce C → id
action[1,++] = Shift
![Page 52: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/52.jpg)
SLR(1) Construction
1. Construct F = {I0, I1, … In}
2. a) if {A → α•} ∈ Ii and A != S’ then action[i, b] := reduce A → α for all b ∈ Follow(A)
   b) if {S’ → S•} ∈ Ii then action[i, $] := accept
   c) if {A → α•aβ} ∈ Ii and Successor(Ii, a) = Ij then action[i, a] := shift j
3. if Successor(Ii, A) = Ij then goto[i, A] := j
![Page 53: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/53.jpg)
SLR(1) Construction (cont’d)
4. All entries not defined are errors
5. Make sure I0 is the initial state
• Note: SLR(1) only reduces {A → α•} if lookahead in Follow(A)
• Shift and reduce items or more than one reduce item can be in the same configuration set as long as lookaheads are disjoint
![Page 54: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/54.jpg)
SLR(1) Conditions
• A grammar is SLR(1) if for each configuration set:
  – For any item {A → α•xβ: x ∈ T} there is no {B → γ•: x ∈ Follow(B)}
  – For any two items {A → α•} and {B → β•}, Follow(A) ∩ Follow(B) = ∅
LR(0) Grammars ⊂ SLR(1) Grammars
![Page 55: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/55.jpg)
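Is this grammar SLR(1)?
Grammar:
S → AaAb
S → BaBb
A → ε
B → ε
Configuration sets (indentation shows transitions):
Start: S’ → •S, S → •AaAb, S → •BaBb, A → ε•, B → ε•
  on S: S’ → S•
  on A: S → A•aAb
    on a: S → Aa•Ab, A → ε•
      on A: S → AaA•b
        on b: S → AaAb•
  on B: S → B•aBb
    on a: S → Ba•Bb, B → ε•
      on B: S → BaB•b
        on b: S → BaBb•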
![Page 56: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/56.jpg)
SLR limitation: lack of context
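Grammar:
S’ → S
S → L = R | R
L → *R | id
R → L
0: S’ → • S, S → • L = R, S → • R, L → • * R, L → • id, R → • L
1 (from 0 on id): L → id •
2 (from 0 on L): S → L • = R, R → L •
3 (from 2 on =): S → L = • R, R → • L, L → • * R, L → • id
Follow(R) = { =, $ }
Input: id = id
In state 2 with lookahead =, SLR(1) can shift (S → L • = R) or reduce R → L (since = ∈ Follow(R)): a shift/reduce conflict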
![Page 57: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/57.jpg)
Solution: Canonical LR(1)
• Extend definition of configuration
  – Remember lookahead
• New closure method
• Extend definition of Successor
![Page 58: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/58.jpg)
LR(1) Configurations
• [A → α•β, a] for a ∈ T is valid for a viable prefix δα if there is a rightmost derivation S ⇒*rm δAη ⇒rm δαβη and (η = aγ) or (η = ε and a = $)
• Notation: [A → α•β, a/b/c]
  – if [A → α•β, a], [A → α•β, b], [A → α•β, c] are valid configurations
![Page 59: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/59.jpg)
LR(1) Configurations
S → B B
B → a B | b
• S ⇒*rm aaBab ⇒rm aaaBab
• Item [B → a • B, a] is valid for viable prefix aaa
• S ⇒*rm BaB ⇒rm BaaB
• Also, item [B → a • B, $] is valid for viable prefix Baa
![Page 60: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/60.jpg)
LR(1) Closure
Closure property:
• If [A → α • Bβ, a] is in set, then for every production B → γ, [B → • γ, b] is in set if b ∈ First(βa)
• Compute as fixed point
• Only include contextually valid lookaheads to guide reducing to B
![Page 61: CMPT 755 Compilers - cs.sfu.caanoop/courses/CMPT-755-Fall-2004/parsing… · Predictive Top-Down Parser •Knows which production to choose based on single lookahead symbol •Need](https://reader034.fdocuments.net/reader034/viewer/2022042810/5f9f24619b30dd3a34035b2c/html5/thumbnails/61.jpg)
Starting Configuration
• Augment grammar with S’ just like for LR(0), SLR(1)
• Initial configuration set is I = closure([S’ → • S, $])
Example: closure([S’ → • S, $])
[S’ → • S, $]
[S → • L = R, $]
[S → • R, $]
[L → • * R, =]
[L → • id, =]
[R → • L, $]
[L → • *R, $]
[L → • id, $]
Grammar:
S’ → S
S → L = R | R
L → *R | id
R → L
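The closure rule can be sketched as a worklist computation. This is a minimal illustration, not the course's reference implementation: the item encoding `(lhs, rhs, dot, lookahead)`, the names `GRAMMAR`, `first_sets`, and `closure` are all assumptions made here, and the FIRST computation assumes the grammar has no ε-productions (true for this example).

```python
# Grammar from the slide; "S'" is the augmented start symbol.
GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}

def first_sets(grammar):
    """FIRST sets of the nonterminals (assumes no epsilon-productions)."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:  # iterate to a fixed point
        changed = False
        for nt, rhss in grammar.items():
            for rhs in rhss:
                x = rhs[0]
                new = first[x] if x in grammar else {x}
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def closure(items, grammar):
    """LR(1) closure of a set of (lhs, rhs, dot, lookahead) items."""
    first = first_sets(grammar)
    result = set(items)
    work = list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot == len(rhs) or rhs[dot] not in grammar:
            continue  # dot at the end, or in front of a terminal
        beta_a = rhs[dot + 1:] + (la,)  # the string "beta a" from the slide
        head = beta_a[0]
        lookaheads = first[head] if head in grammar else {head}
        for prod in grammar[rhs[dot]]:
            for b in lookaheads:
                item = (rhs[dot], prod, 0, b)
                if item not in result:
                    result.add(item)
                    work.append(item)
    return result

I0 = closure({("S'", ("S",), 0, "$")}, GRAMMAR)
for item in sorted(I0):
    print(item)
```

Running it on the initial item yields the eight configurations listed above, with `L` items carrying both `=` and `$` as lookaheads.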
LR(1) Successor(C, X)
• Let I = [A → α • Bβ, a]
• Successor(I, B) = closure([A → αB • β, a])
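A sketch of Successor applied to a whole configuration set rather than a single item: move the dot over the symbol in every item where that is possible, then close the result. The item encoding `(lhs, rhs, dot, lookahead)` and the helper names `first_of`, `closure`, and `successor` are assumptions for illustration, and `first_of` assumes there are no ε-productions. The example grammar is the S → L = R | R grammar used in these slides.

```python
GRAMMAR = {
    "S'": [("S",)],
    "S": [("L", "=", "R"), ("R",)],
    "L": [("*", "R"), ("id",)],
    "R": [("L",)],
}

def first_of(symbol, grammar, seen=frozenset()):
    """FIRST of a single symbol (assumes no epsilon-productions)."""
    if symbol not in grammar:
        return {symbol}          # a terminal, or the end marker $
    if symbol in seen:
        return set()             # already being expanded on this path
    out = set()
    for rhs in grammar[symbol]:
        out |= first_of(rhs[0], grammar, seen | {symbol})
    return out

def closure(items, grammar):
    result, work = set(items), list(items)
    while work:
        lhs, rhs, dot, la = work.pop()
        if dot < len(rhs) and rhs[dot] in grammar:
            beta_a = rhs[dot + 1:] + (la,)
            for prod in grammar[rhs[dot]]:
                for b in first_of(beta_a[0], grammar):
                    item = (rhs[dot], prod, 0, b)
                    if item not in result:
                        result.add(item)
                        work.append(item)
    return frozenset(result)

def successor(item_set, symbol, grammar):
    """Advance the dot over `symbol` wherever possible, then take the closure."""
    moved = {(lhs, rhs, dot + 1, la)
             for lhs, rhs, dot, la in item_set
             if dot < len(rhs) and rhs[dot] == symbol}
    return closure(moved, grammar)

I0 = closure({("S'", ("S",), 0, "$")}, GRAMMAR)
I2 = successor(I0, "L", GRAMMAR)   # S -> L . = R, $  and  R -> L ., $
I3 = successor(I2, "=", GRAMMAR)   # S -> L = . R, $ plus its closure
```

Returning a `frozenset` lets the configuration sets be used as dictionary keys when collecting the states of the automaton.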
LR(1) Example: *id = id
0: S’ → • S, $
S → • L = R, $
S → • R, $
L → • * R, =/$
L → • id, =/$
R → • L, $
1: L → id •, $/=
2: S → L • = R, $
R → L •, $
3: S → L = • R, $
R → • L, $
L → • *R, $
L → • id, $
4: L → id •, $
5: R → L •, $
6: S → L = R •, $
7: S’ → S •, $
Transitions: 0 –id→ 1, 0 –L→ 2, 0 –S→ 7, 2 –=→ 3, 3 –id→ 4, 3 –L→ 5, 3 –R→ 6
LR(1) Example: *id = id
0: S’ → • S, $
S → • L = R, $
S → • R, $
L → • * R, =/$
L → • id, =/$
R → • L, $
8: L → * • R, =/$
R → • L, =/$
L → • *R, =/$
L → • id, =/$
1: L → id •, =/$
10: R → L •, =/$
11: L → *R •, =/$
12: S → R •, $
Transitions: 0 –*→ 8, 8 –*→ 8, 8 –id→ 1, 8 –L→ 10, 8 –R→ 11, 0 –R→ 12
LR(1) Construction
1. Construct F = {I0, I1, …, In}
2. a) if [A → α•, a] ∈ Ii and A != S’ then action[i, a] := reduce A → α
b) if [S’ → S•, $] ∈ Ii then action[i, $] := accept
c) if [A → α•aβ, b] ∈ Ii and Successor(Ii, a) = Ij then action[i, a] := shift j
3. if Successor(Ii, A) = Ij then goto[i, A] := j
LR(1) Construction (cont’d)
4. All entries not defined are errors
5. Make sure I0 is the initial state
• Note: LR(1) only reduces using A → α for [A → α•, a] if the next input symbol is a
• LR(1) states remember context by virtue of lookahead
• Possibly many states!
– LALR(1) combines some states
LR(1) Conditions
• A grammar is LR(1) if the following holds for each configuration set:
– For any item [A → α•xβ, a] with x ∈ T there is no [B → γ•, x]
– For any two complete items [A → γ•, a] and [B → β•, b] it follows that a != b.
• Grammars:
– LR(0) ⊂ SLR(1) ⊂ LR(1) ⊂ LR(k)
• Languages expressible by grammars:
– LR(0) ⊂ SLR(1) = LR(1) = LR(k)
Canonical LR(1) Recap
• LR(1) uses left context, current handle and lookahead to decide when to reduce or shift
• Most powerful parser so far
• LALR(1) is a practical simplification with fewer states
Merging States in LALR(1)
• S’ → S
S → XX
X → aX
X → b
• Same Core Set
• Different lookaheads
6: X → a • X, $
X → • a X, $
X → • b, $
3: X → a • X, a/b
X → • a X, a/b
X → • b, a/b
36: X → a • X, a/b/$
X → • a X, a/b/$
X → • b, a/b/$
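The merge of states 3 and 6 can be sketched as grouping states by core and taking the union of their items. This is the brute-force view of LALR(1) merging; the item encoding `(lhs, rhs, dot, lookahead)` and the names `core`, `merge_by_core`, and `items` are assumptions for illustration.

```python
def core(state):
    """The core of a state: its items with the lookaheads dropped."""
    return frozenset((lhs, rhs, dot) for lhs, rhs, dot, _ in state)

def merge_by_core(states):
    """states: dict state_id -> set of (lhs, rhs, dot, lookahead) items.

    Returns a dict mapping a tuple of merged state ids to the union of items.
    """
    merged = {}
    for sid, state in states.items():
        ids, union = merged.setdefault(core(state), ([], set()))
        ids.append(sid)
        union |= state  # in-place union of the lookahead-annotated items
    return {tuple(ids): union for ids, union in merged.values()}

def items(lookaheads):
    """The three items shared by states 3 and 6, with the given lookaheads."""
    rows = [("X", ("a", "X"), 1), ("X", ("a", "X"), 0), ("X", ("b",), 0)]
    return {(l, r, d, la) for l, r, d in rows for la in lookaheads}

states = {3: items("ab"), 6: items("$")}
merged = merge_by_core(states)
print(merged.keys())  # a single merged state built from 3 and 6
```

Each of the three core items ends up with lookaheads a, b and $, which is state 36 above.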
R/R conflicts when merging
• B → d
B → f X g
X → …
• If R/R conflicts are introduced, grammar is not LALR(1)!
4: B → d •, g
B → f X g •, c
2: B → d •, c
B → f X g •, e
24: B → d •, c/g
B → f X g •, c/e
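A small check for reduce/reduce conflicts makes the problem with state 24 concrete: two different complete items share the lookahead c. The sketch below assumes items are encoded as `(lhs, rhs, dot, lookahead)`; the function name `reduce_reduce_conflicts` is hypothetical.

```python
def reduce_reduce_conflicts(state):
    """Map each lookahead to the productions that would all reduce on it,
    keeping only lookaheads claimed by more than one production."""
    by_lookahead = {}
    for lhs, rhs, dot, la in state:
        if dot == len(rhs):  # complete item: dot at the end
            by_lookahead.setdefault(la, set()).add((lhs, rhs))
    return {la: prods for la, prods in by_lookahead.items() if len(prods) > 1}

D = ("B", ("d",), 1)              # B -> d .
FXG = ("B", ("f", "X", "g"), 3)   # B -> f X g .

state4 = {D + ("g",), FXG + ("c",)}
state2 = {D + ("c",), FXG + ("e",)}
state24 = state2 | state4         # merged as on the slide

print(reduce_reduce_conflicts(state2))   # no conflict before merging
print(reduce_reduce_conflicts(state4))   # no conflict before merging
print(reduce_reduce_conflicts(state24))  # conflict on lookahead "c"
```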
LALR(1)
• LALR(1) Condition:
– Merging in this way does not introduce reduce/reduce conflicts
– Shift/reduce can’t be introduced
• Merging brute force or step-by-step
• More compact than canonical LR, like SLR(1)
• More powerful than SLR(1)
– Lookaheads do not always merge to the full Follow set
S/R & ambiguous grammars
• Lx(k) Grammar vs. Language
– Grammar is Lx(k) if it can be parsed by Lx(k) method according to criteria that are specific to the method.
– An Lx(k) grammar may or may not exist for a language.
• Even if a given grammar is not LR(k), a shift/reduce parser can sometimes handle it by accounting for ambiguities
– Example: ‘dangling’ else
• Preferring shift to reduce means matching inner ‘if’
Dangling ‘else’
1. S → if E then S
2. S → if E then S else S
• Viable prefix “if E then if E then S”
– Then read else
• Shift “else” (means go for 2)
• Reduce (reduce using production #1)
• NB: dangling else as written above is ambiguous
– NB: Ambiguity can be resolved by the parser, but the grammar as written is still not LR(k) for any k
Precedence & Associativity
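• Consider E → E - E | E * E | id
[Figure: three parser configurations, each with E - E • on the stack]
Input id - id * id, lookahead *: Reduce groups the input as (id - id) * id
Input id - id * id, lookahead *: Shift groups the input as id - (id * id)
Input id - id - id, lookahead -: Reduce groups the input as (id - id) - id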
Precedence Relations
• Let A → w be a rule in the grammar
• And b is a terminal
• In some state q of the LR(1) parser there is a shift-reduce conflict:
– either reduce with A → w or shift on b
• Write down a rule, either: A → w, < b or A → w, > b
Precedence Relations
• A → w, < b means rule has less precedence and so we shift if we see b in the lookahead
• A → w, > b means rule has higher precedence and so we reduce if we see b in the lookahead
• If there are multiple terminals with shift-reduce conflicts, then we list them all: A → w, > b, < c, > d
Precedence Relations
• Consider the grammar E → E + E | E * E | ( E ) | a
• Assume left-association so that E+E+E is interpreted as (E+E)+E
• Assume multiplication has higher precedence than addition
• Then we can write precedence rules/relns:
E → E + E, > +, < *
E → E * E, > +, > *
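The relations above can be derived from a precedence level and an associativity per operator, in the style of yacc declarations. The sketch below is illustrative only: the table `PRECEDENCE` and the function `resolve` are hypothetical names, and the rule is identified by the operator it contains.

```python
# Higher level binds tighter; both operators are left-associative here.
PRECEDENCE = {"+": (1, "left"), "*": (2, "left")}

def resolve(rule_operator, lookahead):
    """Resolve a shift-reduce conflict between a completed rule E -> E op E
    and a lookahead operator. Returns "reduce" (rule > lookahead) or
    "shift" (rule < lookahead)."""
    rule_level, _ = PRECEDENCE[rule_operator]
    la_level, assoc = PRECEDENCE[lookahead]
    if rule_level != la_level:
        return "reduce" if rule_level > la_level else "shift"
    # Same precedence level: associativity decides.
    return "reduce" if assoc == "left" else "shift"

for op in "+*":
    for la in "+*":
        print(f"E -> E {op} E, lookahead {la}: {resolve(op, la)}")
```

With these declarations the only shift is for E → E + E with lookahead *, matching E → E + E, < * above; the other three cases reduce.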
Precedence & Associativity
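[Figure: the two LR states with shift-reduce conflicts (numbered 7 and 10 in the figure) and their action table entries]
State containing 1: E → E + E •, 1: E → E • + E, 2: E → E • * E: on + reduce by rule 1 (R1), on * Shift
State containing 2: E → E * E •, 1: E → E • + E, 2: E → E • * E: on + and on * reduce by rule 2 (R2)
E → E + E, > +, < *
E → E * E, > +, > *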
Handling S/R & R/R Conflicts
• Have a conflict?
– No? – Done, grammar is compliant.
• Already using most powerful parser available?
– No? – Upgrade and go back to the first step
• Can the grammar be rearranged so that the conflict disappears?
– While preserving the language!
Conflicts revisited (cont’d)
• Can the grammar be rearranged so that the conflict disappears?
– No?
  • Is the conflict S/R and does preferring shift to reduce yield the desired result?
  – Yes: Done. (Example: dangling else)
  • Else: Bad luck
– Yes: Is it worth it?
  • Yes, resolve conflict.
  • No: live with default or specified conflict resolution (precedence, associativity)
Compiler (parser) compilers
• Rather than build a parser for a particular grammar (e.g. recursive descent), write down a grammar as a text file
• Run through a compiler compiler which produces a parser for that grammar
• The parser is a program that can be compiled and accepts input strings and produces user-defined output
Compiler (parser) compilers
• For LR parsing, all it needs to do is produce the action/goto table
– Yacc (yet another compiler compiler), distributed with Unix, is the most popular tool. It uses LALR(1).
– Many variants of yacc exist for many languages
• As we will see later, translation of the parse tree into machine code (or anything else) can also be written down with the grammar
• Handling errors and interaction with the lexical analyzer have to be precisely defined
Parsing CFGs
• Consider the problem of parsing with arbitrary CFGs
• For any input string, the parser has to produce a parse tree
• The simpler problem: print yes if the input string is generated by the grammar, print no otherwise
• This problem is called recognition
CKY Recognition Algorithm
• The Cocke-Kasami-Younger algorithm
• As we shall see it runs in time that is polynomial in the size of the input
• It takes space polynomial in the size of the input
• Remarkable fact: it can find all possible parse trees (exponentially many) in polynomial time
Chomsky Normal Form
• Before we can see how CKY works, we need to convert the input CFG into Chomsky Normal Form
• CNF means that the input CFG G is converted to a new CFG G’ in which all rules are of the form:
A → B C
A → a
Epsilon Removal
• First step, remove epsilon rules
A → B C
C → ε | C D | a
D → b
B → b
• After ε-removal:
A → B C | B
C → C D | D | a
D → b
B → b
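One standard way to carry out ε-removal is to find the nullable non-terminals and then, for every rule, emit each variant obtained by keeping or dropping its nullable symbols. The sketch below assumes a grammar encoded as a dict from non-terminal to a set of right-hand-side tuples, with `()` standing for ε; the function name `remove_epsilon` is hypothetical. If the start symbol itself is nullable, the empty string is lost from the language and has to be handled separately.

```python
from itertools import product

def remove_epsilon(grammar):
    """grammar: dict nonterminal -> set of RHS tuples; () is epsilon."""
    # 1. Fixed-point computation of the nullable nonterminals.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for nt, rhss in grammar.items():
            if nt not in nullable and any(
                    all(s in nullable for s in rhs) for rhs in rhss):
                nullable.add(nt)
                changed = True
    # 2. For each rule, every nullable symbol is either kept or dropped.
    result = {}
    for nt, rhss in grammar.items():
        new = set()
        for rhs in rhss:
            choices = [((s,), ()) if s in nullable else ((s,),) for s in rhs]
            for pick in product(*choices):
                candidate = tuple(sym for part in pick for sym in part)
                if candidate:  # never emit an epsilon rule
                    new.add(candidate)
        result[nt] = new
    return result

G = {
    "A": {("B", "C")},
    "C": {(), ("C", "D"), ("a",)},
    "D": {("b",)},
    "B": {("b",)},
}
for nt, rhss in remove_epsilon(G).items():
    print(nt, "->", " | ".join(" ".join(r) for r in sorted(rhss)))
```

Under this formulation the example grammar becomes A → B C | B and C → C D | D | a, with D and B unchanged. The new rule C → D is a chain rule, which the next step removes.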
Removal of Chain Rules
• Second step, remove chain rules
A → B C | C D C
C → D | a
D → d
B → b
• After removal of chain rules:
A → B a | B D | a D a | a D D | D D a | D D D
D → d
B → b
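The slide removes the chain rule C → D by substituting C away in every rule for A. A common alternative, sketched here, keeps C and instead gives it the non-chain rules of every non-terminal it can reach through chain rules, so C → D becomes C → d. Note that this is a different technique from the one on the slide, although both preserve the language. The encoding (dict from non-terminal to a set of right-hand-side tuples, no ε-rules) and the name `remove_chain_rules` are assumptions.

```python
def remove_chain_rules(grammar):
    """grammar: dict nonterminal -> set of RHS tuples (no epsilon rules)."""
    def is_chain(rhs):
        return len(rhs) == 1 and rhs[0] in grammar

    result = {}
    for nt in grammar:
        # Nonterminals reachable from nt using chain rules only.
        reachable, work = {nt}, [nt]
        while work:
            for rhs in grammar[work.pop()]:
                if is_chain(rhs) and rhs[0] not in reachable:
                    reachable.add(rhs[0])
                    work.append(rhs[0])
        # nt inherits every non-chain rule of the reachable nonterminals.
        result[nt] = {rhs for r in reachable
                      for rhs in grammar[r] if not is_chain(rhs)}
    return result

G = {
    "A": {("B", "C"), ("C", "D", "C")},
    "C": {("D",), ("a",)},
    "D": {("d",)},
    "B": {("b",)},
}
for nt, rhss in remove_chain_rules(G).items():
    print(nt, "->", " | ".join(" ".join(r) for r in sorted(rhss)))
```

This variant leaves the rules for A untouched and yields C → a | d, which keeps the grammar smaller than full substitution when C occurs in many rules.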
Eliminate terminals from RHS
• Third step, remove terminals from the rhs of rules
A → B a C d
• After removal of terminals from the rhs:
A → B N1 C N2
N1 → a
N2 → d
Binarize RHS with Nonterminals
• Fourth step, convert the rhs of each rule to have two non-terminals
A → B N1 C N2
N1 → a
N2 → d
• After converting to binary form:
A → B N3
N3 → N1 N4
N4 → C N2
N1 → a
N2 → d
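Binarization can be sketched as peeling one symbol at a time off the front of a long right-hand side and introducing a fresh non-terminal for the remainder. The function name `binarize`, the grammar encoding, and the naming scheme for fresh symbols are assumptions; `start_index=3` is chosen only so the fresh names line up with N3 and N4 above, and the sketch assumes those names do not already occur in the grammar.

```python
def binarize(grammar, prefix="N", start_index=3):
    """grammar: dict nonterminal -> list of RHS tuples.
    Splits every RHS longer than two symbols into a chain of binary rules."""
    result = {}
    counter = start_index
    for nt, rhss in grammar.items():
        for rhs in rhss:
            head = nt
            while len(rhs) > 2:
                fresh = f"{prefix}{counter}"  # assumed not to clash
                counter += 1
                result.setdefault(head, []).append((rhs[0], fresh))
                head, rhs = fresh, rhs[1:]
            result.setdefault(head, []).append(rhs)
    return result

G = {
    "A": [("B", "N1", "C", "N2")],
    "N1": [("a",)],
    "N2": [("d",)],
}
for nt, rhss in binarize(G).items():
    for rhs in rhss:
        print(nt, "->", " ".join(rhs))
```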
CKY algorithm
• We will consider the working of the algorithm on an example CFG and input string
• Example CFG:
S → A X | Y B
X → A B | B A
Y → B A
A → a
B → a
• Example input string: aaa
CKY Algorithm
Chart for input a a a (string positions 0 1 2 3):
chart[0][1] = chart[1][2] = chart[2][3] = {A, B} (A → a, B → a)
chart[0][2] = chart[1][3] = {X, Y} (X → A B | B A, Y → B A)
chart[0][3] = {S} (S → A(0,1) X(1,3), S → Y(0,2) B(2,3))
Parse trees
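[Figure: the three parse trees for a a a, written here in bracket notation]
[S [Y [B a] [A a]] [B a]]
[S [A a] [X [A a] [B a]]]
[S [A a] [X [B a] [A a]]]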
CKY Algorithm
Input string input of size n
Create a 2D table chart of size O(n²) (indices 0..n in each dimension); each cell holds a set of non-terminals
for i=0 to n-1
  add A to chart[i][i+1] if there is a rule A → a and input[i]=a
for j=2 to n
  for i=j-2 downto 0
    for k=i+1 to j-1
      add A to chart[i][j] if there is a rule A → B C and chart[i][k] contains B and chart[k][j] contains C
return yes if chart[0][n] has the start symbol
else return no
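The pseudocode above translates almost line for line into a recognizer. This is a minimal sketch assuming the grammar is already in Chomsky Normal Form and is given as two lists, `unary` with pairs (A, a) for rules A → a and `binary` with triples (A, B, C) for rules A → B C; the function name `cky_recognize` and this encoding are assumptions. The example is the grammar and input string from the earlier slide.

```python
def cky_recognize(words, unary, binary, start="S"):
    """Return (accepted, chart); chart[i][j] is the set of nonterminals
    that derive words[i:j]."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    # Spans of length 1: rules A -> a.
    for i, w in enumerate(words):
        chart[i][i + 1] = {a for a, t in unary if t == w}
    # Longer spans, same loop order as the pseudocode.
    for j in range(2, n + 1):
        for i in range(j - 2, -1, -1):
            for k in range(i + 1, j):
                for a, b, c in binary:
                    if b in chart[i][k] and c in chart[k][j]:
                        chart[i][j].add(a)
    return start in chart[0][n], chart

UNARY = [("A", "a"), ("B", "a")]
BINARY = [("S", "A", "X"), ("S", "Y", "B"),
          ("X", "A", "B"), ("X", "B", "A"), ("Y", "B", "A")]

ok, chart = cky_recognize(list("aaa"), UNARY, BINARY)
print(ok)           # whether S spans the whole input
print(chart[0][2])  # nonterminals deriving the first two symbols
```

The innermost loop over all binary rules is the simplest choice; indexing the rules by their right-hand side would avoid scanning the whole grammar for every split point.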
CKY algorithm summary
• Parsing arbitrary CFGs
• For the CKY algorithm, the time complexity is O(|G|² n³)
• The space requirement is O(n²)
• The CKY algorithm handles arbitrary ambiguous CFGs
• All ambiguous choices are stored in the chart
• For compilers we consider parsing algorithms for CFGs that do not handle ambiguous grammars
GLR – Generalized LR Parsing
• Works for any CFG (just like CKY algorithm)
– Masaru Tomita [1986]
• If you have a shift/reduce conflict, just clone your stack and shift in one clone, reduce in the other clone
– proceed in lockstep
– parsers that get into error states die
– merge parsers that lead to identical reductions (graph structured stack)
Parsing - Summary
• Parsing arbitrary CFGs: O(n³) time complexity
• Top-down vs. bottom-up
• Lookahead: FIRST and FOLLOW sets
• LL(1) – Parsing: O(n) time complexity
– recursive-descent and table-driven predictive parsing
• LR(k) – Parsing: O(n) time complexity
– LR(0), SLR(1), LR(1), LALR(1)
• Resolving shift/reduce conflicts
– using precedence, associativity