Syntax Analysis - LR(0) Parsing 66.648 Compiler Design Lecture (02/04/98) Computer Science...


Syntax Analysis - LR(0) Parsing

66.648 Compiler Design Lecture (02/04/98)

Computer Science

Rensselaer Polytechnic

Lecture Outline

• LR(0) Parsing Algorithm

• Parse Tables

• Examples

• Administration

LR(k) Parsing Algorithms

LR(k) parsers are an efficient class of bottom-up parsing algorithms. Other bottom-up parsers include operator precedence parsers.

The name LR(k) means:

L - Left-to-right scanning of the input

R - Constructing a rightmost derivation in reverse

k - Number of lookahead input symbols used to select a parser action

Yet Another Example

Consider a grammar that generates palindromes over {a, b} with a center marker c.

1) S --> P

2) P --> a P a

3) P --> b P b

4) P --> c

LR parsers work with an augmented grammar in which the start symbol never appears on the right side of a production. If the start symbol of a given grammar appears on a RHS, we add a production S’ --> S (S’ is the new start symbol and S was the old start symbol).

STACK     INPUT BUFFER   ACTION

$         abcba$         shift
$a        bcba$          shift
$ab       cba$           shift
$abc      ba$            reduce
$abP      ba$            shift
$abPb     a$             reduce
$aP       a$             shift
$aPa      $              reduce
$P        $              reduce
$S        $              accept
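The trace above can be reproduced with a small sketch. It is not an LR parser: it simply reduces whenever the top of the stack matches a right-hand side, which happens to be enough for this particular grammar. The names `P_HANDLES` and `trace` are made up for this illustration.

```python
# Right-hand sides of the three P-productions (assumed names, for illustration).
P_HANDLES = [("a", "P", "a"), ("b", "P", "b"), ("c",)]

def trace(text):
    """Return (stack, input, action) rows for a greedy shift-reduce run."""
    stack, rest, rows = [], list(text), []
    while True:
        snapshot = ("$" + "".join(stack), "".join(rest) + "$")
        # Look for a P-production whose RHS sits on top of the stack.
        match = next((rhs for rhs in P_HANDLES
                      if tuple(stack[-len(rhs):]) == rhs), None)
        if match:
            rows.append(snapshot + ("reduce",))
            del stack[-len(match):]
            stack.append("P")
        elif rest:
            rows.append(snapshot + ("shift",))
            stack.append(rest.pop(0))
        elif stack == ["P"]:
            # Input exhausted: reduce by S --> P.
            rows.append(snapshot + ("reduce",))
            stack[:] = ["S"]
        elif stack == ["S"]:
            rows.append(snapshot + ("accept",))
            return rows
        else:
            rows.append(snapshot + ("error",))
            return rows

for row in trace("abcba"):
    print("{:8} {:8} {}".format(*row))
```

For a general grammar this greedy rule is not sufficient, which is why the parser needs the state machine described next.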

Example Cont...

Qn: How does the parser select its actions (namely shift, reduce, accept and error)?

Ans:

1) By constructing a DFA that encodes all parser states, with transitions on terminals and nonterminals. The transitions on terminals give the parser actions (the action table), and the transitions on nonterminals give the new state after a reduction (the goto table).

2) By keeping a stack to simulate the PDA. This stack maintains the list of states.

LR(0) Parsers

LR(0) Items and Closure

An LR(0) parser state needs to capture how much of the right-hand side of a given production has been scanned so far, much as a state of an FSA records how much of the input has been matched.

For example, consider the production:

P --> a P a

An LR(0) item is a production with a mark/dot on the RHS. So the items for this production are P --> . a P a, P --> a . P a, P --> a P . a, and P --> a P a .
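As a small illustration, the items of a production can be enumerated by placing the dot at each position of the right-hand side. The helper name `items_of` is an assumption made for this sketch.

```python
def items_of(lhs, rhs):
    """Return every LR(0) item of the production lhs --> rhs as a string."""
    result = []
    for dot in range(len(rhs) + 1):
        # Insert the dot before position `dot` of the right-hand side.
        marked = list(rhs[:dot]) + ["."] + list(rhs[dot:])
        result.append(f"{lhs} --> {' '.join(marked)}")
    return result

items = items_of("P", ("a", "P", "a"))
for item in items:
    print(item)
```

A production with n symbols on its RHS always has n + 1 items.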

Items and Closure Contd

Intuitively, the symbols to the left of the dot have already been seen: either as input symbols or as derivations of nonterminals.

There are two kinds of items: kernel items and nonkernel items.

Kernel items - The initial item S’ --> .S and all items in which the dot is not at the leftmost position.

Nonkernel items - All other items, that is, those with the dot at the leftmost position.

Closure of Items

Let I be a set of items. Then Closure(I) is the set of items constructed as follows:

1) Every item in I is also in Closure(I) (reflexive).

2) If A --> alpha . B beta is in Closure(I) and B --> gamma is a production, then add the item B --> .gamma to Closure(I) if it is not already a member. Repeat this until no more items can be added.

Intuition

Closure represents an equivalent state: it collects all the possible ways that you could have reached that state.

Example: I = { S --> .P }

Closure(I) = { S --> .P, P --> .a P a, P --> .b P b, P --> .c }
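The closure example can be checked with a short sketch. The representation is an assumption made for illustration: an item is a pair (production index, dot position), and the list index of a production is its number in the grammar listing minus 1.

```python
# Palindrome grammar; list index = listing number - 1.
GRAMMAR = [
    ("S", ("P",)),
    ("P", ("a", "P", "a")),
    ("P", ("b", "P", "b")),
    ("P", ("c",)),
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    """Add B --> .gamma for every nonterminal B that follows a dot."""
    result = set(items)
    worklist = list(items)
    while worklist:
        prod, dot = worklist.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    worklist.append((index, 0))
    return frozenset(result)

def show(item):
    """Format an item in the dotted notation used on the slides."""
    prod, dot = item
    lhs, rhs = GRAMMAR[prod]
    return f"{lhs} --> {' '.join(rhs[:dot] + ('.',) + rhs[dot:])}"

start = closure({(0, 0)})   # I = { S --> .P }
for item in sorted(start):
    print(show(item))
```

The worklist stops growing once every candidate item is already a member, which is the "repeat until no more items can be added" step.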

In Arithmetic Expression: I = { S’ --> .E }

Closure(I) = { }

GOTO Operation

Let I be a set of items and let X be a grammar symbol (nonterminal/terminal). Then

GOTO(I,X) = Closure({A --> alpha X . beta | A --> alpha . X beta is in I})

It is the new set of items obtained by moving the dot over X. Intuitively, we have either seen the input symbol X (a terminal) or seen a derivation starting from the nonterminal X.
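A minimal sketch of GOTO for the palindrome grammar follows. The item encoding (production index, dot position) and the function names are assumptions for illustration; `closure` is repeated so the block runs on its own.

```python
GRAMMAR = [
    ("S", ("P",)),            # index 0
    ("P", ("a", "P", "a")),   # index 1
    ("P", ("b", "P", "b")),   # index 2
    ("P", ("c",)),            # index 3
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}

def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        prod, dot = worklist.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    worklist.append((index, 0))
    return frozenset(result)

def goto(items, symbol):
    """Move the dot over `symbol` in every item that allows it, then close."""
    moved = {(prod, dot + 1) for prod, dot in items
             if dot < len(GRAMMAR[prod][1]) and GRAMMAR[prod][1][dot] == symbol}
    return closure(moved)

start = closure({(0, 0)})
after_a = goto(start, "a")   # P --> a . P a plus the items added by closure
after_P = goto(start, "P")   # S --> P .
print(sorted(after_a), sorted(after_P))
```

When no item has the dot in front of the symbol, the result is the empty set, which the state construction treats as "no transition".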

Canonical set of Items (states)

Enumerate the possible states of an LR(0) parser. Each state is a canonical set of items.

Algorithm:

1) Start with the canonical set Closure({S’ --> .S}).

2) If I is a canonical set and X is a grammar symbol such that I’ = GOTO(I,X) is nonempty, then make I’ a new canonical set (if it is not already one). Keep repeating this until no more canonical sets can be created.

The algorithm terminates, because a grammar has only finitely many items and hence only finitely many distinct sets of items.

Example

S0: S --> .P, P --> .a P a, P --> .b P b, P --> .c

S1: S --> P.

S2: P --> a.P a, P --> .a P a, P --> .b P b, P --> .c

S3: P --> b.P b, P --> .a P a, P --> .b P b, P --> .c

S4: P --> c.

S5: P --> a P.a

S6: P --> b P.b

S7: P --> a P a.

S8: P --> b P b.
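The nine states can be regenerated with a short sketch of the construction. The encoding (items as pairs of production index and dot position) and the symbol order P, a, b, c are assumptions; that order is chosen only so that the state numbers come out as in the example.

```python
GRAMMAR = [
    ("S", ("P",)),            # index 0
    ("P", ("a", "P", "a")),   # index 1
    ("P", ("b", "P", "b")),   # index 2
    ("P", ("c",)),            # index 3
]
NONTERMINALS = {lhs for lhs, _ in GRAMMAR}
SYMBOLS = ["P", "a", "b", "c"]

def closure(items):
    result, worklist = set(items), list(items)
    while worklist:
        prod, dot = worklist.pop()
        rhs = GRAMMAR[prod][1]
        if dot < len(rhs) and rhs[dot] in NONTERMINALS:
            for index, (lhs, _) in enumerate(GRAMMAR):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    worklist.append((index, 0))
    return frozenset(result)

def goto(items, symbol):
    moved = {(prod, dot + 1) for prod, dot in items
             if dot < len(GRAMMAR[prod][1]) and GRAMMAR[prod][1][dot] == symbol}
    return closure(moved)

def canonical_collection():
    """Return (states, transitions); states[i] is the item set of state i."""
    states = [closure({(0, 0)})]
    transitions = {}
    current = 0
    while current < len(states):          # states grows while we scan it
        for symbol in SYMBOLS:
            target = goto(states[current], symbol)
            if not target:
                continue                  # empty set: no transition
            if target not in states:
                states.append(target)
            transitions[(current, symbol)] = states.index(target)
        current += 1
    return states, transitions

states, transitions = canonical_collection()
print(len(states), "states")
for (source, symbol), target in sorted(transitions.items()):
    print(f"S{source} --{symbol}--> S{target}")
```

The `transitions` dictionary is the FSA of the next slide: its entries on a, b, c become shift actions and its entries on P become goto entries.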

Finite State Machine

Draw the FSA. The major difference from an ordinary FSA is that transitions can be on both terminal and nonterminal symbols.

Key Idea in Canonical states

If a state contains an item of the form A --> beta ., then the state prompts a reduce action (provided the correct symbols follow).

If a state contains A --> alpha . delta, with delta nonempty, then the state prompts the parser to perform a shift action (on the right symbols, of course).

If a state contains S’ --> S. and there are no more input symbols left, then the parser is prompted to accept.

Otherwise the parser reports an error.

Parsing Table

state    Input symbol              goto
         a     b     c     $       P

0        s2    s3    s4            1
1                          acc
2        s2    s3    s4            5
3        s2    s3    s4            6
4        r4    r4    r4    r4
5        s7
6              s8
7        r2    r2    r2    r2
8        r3    r3    r3    r3

Reduce entries use the production numbers from the grammar listing: r2 is P --> a P a, r3 is P --> b P b, and r4 is P --> c.

Parsing Table Contd

si means shift the input symbol and go to state i.

rj means reduce by the jth production. Note that we are not storing all the items of a state in our table.
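As a sketch of how such a table can be filled in, the code below starts from the transitions between states S0 to S8 and from the states that hold a completed item. Both dictionaries are typed in by hand from the example states, and the reduce entries use the production numbers of the grammar listing (1: S --> P, 2: P --> a P a, 3: P --> b P b, 4: P --> c). All names are assumptions made for this illustration.

```python
# (state, symbol) -> state, read off the canonical sets S0..S8.
TRANSITIONS = {
    (0, "P"): 1, (0, "a"): 2, (0, "b"): 3, (0, "c"): 4,
    (2, "P"): 5, (2, "a"): 2, (2, "b"): 3, (2, "c"): 4,
    (3, "P"): 6, (3, "a"): 2, (3, "b"): 3, (3, "c"): 4,
    (5, "a"): 7, (6, "b"): 8,
}
# State -> production number of the completed item it contains.
COMPLETED = {1: 1, 4: 4, 7: 2, 8: 3}
TERMINALS = ["a", "b", "c", "$"]

def build_tables():
    """Return (action, goto) dictionaries keyed by (state, symbol)."""
    action, goto = {}, {}

    def set_action(state, terminal, entry):
        # Two different entries for one cell would be an LR(0) conflict.
        if action.get((state, terminal), entry) != entry:
            raise ValueError(f"conflict in state {state} on {terminal}")
        action[(state, terminal)] = entry

    for (state, symbol), target in TRANSITIONS.items():
        if symbol in TERMINALS:
            set_action(state, symbol, f"s{target}")
        else:
            goto[(state, symbol)] = target
    for state, production in COMPLETED.items():
        if production == 1:
            set_action(state, "$", "acc")       # S --> P. with input exhausted
        else:
            for terminal in TERMINALS:          # LR(0): reduce on every input
                set_action(state, terminal, f"r{production}")
    return action, goto

ACTION, GOTO = build_tables()
for state in range(9):
    row = [ACTION.get((state, t), "") for t in TERMINALS]
    print(state, *(f"{cell:4}" for cell in row), GOTO.get((state, "P"), ""))
```

Only the table entries are kept; the item sets themselves are no longer needed once the table is built.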

example: abcba$

If we go through the parsing algorithm, we get

Example Contd

State           input     action

$S0             abcba$    shift
$S0aS2          bcba$     shift
$S0aS2bS3       cba$      shift
$S0aS2bS3cS4    ba$       reduce
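The run can be carried to completion with a small table-driven sketch. The tables are hard-coded for the palindrome grammar, with reduce entries numbered as in the grammar listing (2: P --> a P a, 3: P --> b P b, 4: P --> c); the stack here holds state numbers only. The function name `parse` and the table layout are assumptions for illustration.

```python
ACTION = {
    (0, "a"): "s2", (0, "b"): "s3", (0, "c"): "s4",
    (1, "$"): "acc",
    (2, "a"): "s2", (2, "b"): "s3", (2, "c"): "s4",
    (3, "a"): "s2", (3, "b"): "s3", (3, "c"): "s4",
    (5, "a"): "s7",
    (6, "b"): "s8",
}
for terminal in "abc$":
    ACTION[(4, terminal)] = "r4"
    ACTION[(7, terminal)] = "r2"
    ACTION[(8, terminal)] = "r3"
GOTO = {(0, "P"): 1, (2, "P"): 5, (3, "P"): 6}
# Production number -> (left-hand side, length of right-hand side).
PRODUCTIONS = {2: ("P", 3), 3: ("P", 3), 4: ("P", 1)}

def parse(text):
    """Return (accepted, list of actions taken) for the input string."""
    tokens = list(text) + ["$"]
    stack, position, actions = [0], 0, []
    while True:
        entry = ACTION.get((stack[-1], tokens[position]))
        if entry is None:
            actions.append("error")
            return False, actions
        if entry == "acc":
            actions.append("accept")
            return True, actions
        number = int(entry[1:])
        if entry[0] == "s":
            stack.append(number)           # shift: push the new state
            position += 1
            actions.append(f"shift {number}")
        else:
            lhs, length = PRODUCTIONS[number]
            del stack[-length:]            # pop one state per RHS symbol
            stack.append(GOTO[(stack[-1], lhs)])
            actions.append(f"reduce {number}")

accepted, actions = parse("abcba")
print(accepted, actions)
```

Note that the reduction by S --> P never appears in this run: the parser accepts as soon as it is in state 1 with only $ left.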

Shift/Reduce Conflicts

An LR(0) state contains a conflict if its canonical set has two items that recommend conflicting actions.

shift/reduce conflict - one item prompts a shift action and another prompts a reduce action.

reduce/reduce conflict - two items prompt reduce actions by different productions.

A grammar is said to be an LR(0) grammar if its table does not have any conflicts.
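A rough sketch of a conflict check follows. It builds the canonical sets for a grammar and reports states that mix a completed item with a shift item, or that hold two completed items. The second grammar, `EXPR`, is a hypothetical example that is not part of the lecture; it is included only because the palindrome grammar has no conflicts to show.

```python
def closure(grammar, items):
    nonterminals = {lhs for lhs, _ in grammar}
    result, worklist = set(items), list(items)
    while worklist:
        prod, dot = worklist.pop()
        rhs = grammar[prod][1]
        if dot < len(rhs) and rhs[dot] in nonterminals:
            for index, (lhs, _) in enumerate(grammar):
                if lhs == rhs[dot] and (index, 0) not in result:
                    result.add((index, 0))
                    worklist.append((index, 0))
    return frozenset(result)

def goto(grammar, items, symbol):
    moved = {(p, d + 1) for p, d in items
             if d < len(grammar[p][1]) and grammar[p][1][d] == symbol}
    return closure(grammar, moved)

def states_of(grammar):
    symbols = sorted({s for _, rhs in grammar for s in rhs})
    states = [closure(grammar, {(0, 0)})]
    current = 0
    while current < len(states):
        for symbol in symbols:
            target = goto(grammar, states[current], symbol)
            if target and target not in states:
                states.append(target)
        current += 1
    return states

def conflicts(grammar):
    """Return a list of (state number, kind of conflict)."""
    nonterminals = {lhs for lhs, _ in grammar}
    found = []
    for number, state in enumerate(states_of(grammar)):
        complete = [p for p, d in state if d == len(grammar[p][1])]
        shifts = [p for p, d in state
                  if d < len(grammar[p][1]) and grammar[p][1][d] not in nonterminals]
        if complete and shifts:
            found.append((number, "shift/reduce"))
        if len(complete) > 1:
            found.append((number, "reduce/reduce"))
    return found

PALINDROME = [("S", ("P",)), ("P", ("a", "P", "a")),
              ("P", ("b", "P", "b")), ("P", ("c",))]
# Hypothetical grammar, not from the lecture: E --> T + E | T, T --> x.
EXPR = [("S'", ("E",)), ("E", ("T", "+", "E")), ("E", ("T",)), ("T", ("x",))]

print(conflicts(PALINDROME))
print(conflicts(EXPR))
```

In the hypothetical grammar, the state reached after T contains both E --> T . + E (shift on +) and E --> T . (reduce), so it is not LR(0).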

LALR Grammar

Grammars for programming languages are generally not LR(0). We usually use a lookahead symbol to determine which action the parser will be prompted for.

These lookaheads refine states and actions.

Comments and Feedback

Project 2 will be on the web by this Friday.

Please keep reading chapter 4 and make sure you understand the material. Work out as many exercises as you can.