Download - LR Parsing – The Items Lecture 10 Mon, Feb 14, 2005.

LR Parsing – The Items

Lecture 10

Mon, Feb 14, 2005

LR Parsers

A bottom-up parser follows a rightmost derivation from the bottom up.

Such parsers typically use the LR algorithm and are called LR parsers. L means process tokens from Left to right. R means follow a Rightmost derivation.

LR Parsers

Furthermore, in LR parsing, the production is applied only after the pattern has been matched.

In LL (predictive) parsing, the production was selected, and then the tokens were matched to it.

Rightmost Derivations

Let the grammar be

E E + T | T

T T * F | F

F (E) | id | num

Rightmost Derivations A rightmost derivation of (id + num)*id is

E T T*F T*id F*id (E)*id (E + T)*id (E + F)*id (E + num)*id (T + num)*id (F + num)*id (id + num)*id.

LR Parsers

An LR parser uses a parse table, an input buffer, and a stack of “states.”

It performs three operations. Shift a token from the input buffer to the stack. Reduce the content of the stack by applying a

production. Go to a new state.

LR(0) Items

To build an LR parse table, we must first find the LR(0) items.

An LR(0) item is a production with a special marker () marking a position within the string on the right side of the production.

LR(0) parsing is also called SLR parsing (“simple” LR).

Example: LR(0) Items

If the production is

E E + T,

then the possible LR(0) items are [E E + T] [E E + T] [E E + T] [E E + T ]

LR(0) Items

The interpretation of [A ] is

“We have processed and

we might process next.” Whether we do actually process will be

borne out by the subsequent tokens.

LR Parsing

We will build a PDA whose states are sets of LR(0) items.

First we augment the grammar with a new start symbol S'.

S' S. This guarantees that the start symbol will not

recurse.

States of the PDA

The initial state is called I0 (item 0). State I0 is the closure of the set

{[S' S]}. To form the closure of a set of items

For each item [A B] in the set and for each production B in the grammar, add the item [B ] to the set.

Let us call [B ] an initial B-item. Continue in this manner until there is no further

change.

Example: LR Parsing

Continuing with our standard example, the augmented grammar is

E' E

E E + T | T

T T * F | F

F (E) | id | num

Example: LR Parsing

The state I0 consists of the items in the closure of item [E' E].

[E' E]

[E E + T][E T][T T * F][T F][F (E)][F id][F num]

Transitions

There will be a transition from one state to another state for each grammar symbol in an item that immediately follows the marker in an item in that state.

If an item in the state is [A X], then The transition from that state occurs when the

symbol X is processed. The transition is to the state that is the closure of

the item [A X ].

Example: LR Parsing

Thus, from the state I0, there will be transitions for the symbols E, T, F, (, id, and num.

For example, on processing E, the items

[E' E] and [E E + T]

become

[E' E ] and [E E + T].

Example: LR Parsing

Let state I1 be the closure of these items.

I1: [E' E ]

[E E + T] Thus the PDA has the transition

I0 I1

E

Example: LR Parsing

Similarly we determine the other transitions from I0.

Process T:

I2: [E T ]

[T T * F] Process F:

I3: [T F ]

Example: LR Parsing

Process (:

I4: [F ( E)]

[E E + T]

[E T]

[T T + F]

[T F]

[F (E)]

[F id]

[F num]

Example: LR Parsing

Process id:

I5: [F id ] Process num:

I6: [F num ]

Example: LR Parsing

Now find the transitions from states I1 through I6 to other states, and so on, until no new states appear.

Example: LR Parsing

I7: [E E + T]

[T T * F]

[T F]

[F (E)]

[F id]

[F num]

Example: LR Parsing

I8: [T T * F]

[F (E)]

[F id]

[F num] I9: [F (E )]

[E E + T]

Example: LR Parsing

I10: [E E + T ]

[T T * F] I11: [T T * F ] I12: [F (E) ]