Chapter 3


Context-Free Grammar and Parsing

Dr.Manal Abdulaziz CS302 Ch3 2

The Parsing Process

• Parsing is the task of determining the syntax, or structure, of a program; for this reason it is also called syntax analysis.
• The syntax of a programming language is usually given by the grammar rules of a context-free grammar.
• The rules of a context-free grammar are recursive.
• The data structure used to represent the syntactic structure of a program is called a parse tree or syntax tree.
• There are two general categories of parsing algorithms: top-down parsing and bottom-up parsing.
• The parsing process may be viewed as:

sequence of tokens → parser → syntax tree

Context-Free Grammar Terminology

A context-free grammar consists of:

• Terminals: an alphabet, or set of basic symbols (as with regular expressions, except that the symbols are now whole tokens, not characters).
• Non-terminals: a set of names for structures (such as statement, expression, definition).
• Productions: a set of grammar rules expressing the structure of each name.
• A start symbol: the name of the most general structure (for example, compilation-unit in C).

Context-Free Grammars

• Example:
  exp → exp op exp | ( exp ) | number
  op → + | - | *
• Names are written in italics.
• Choice and concatenation work as they do in regular expressions.
• Repetition is represented by recursion.
• An arrow replaces the equal sign.
• Grammar rules in this form are said to be in Backus-Naur Form, or BNF notation.

Example

In what way does such a context-free grammar differ from a regular expression?

digit = 0|1|…|9
number = digit digit*

Recursion!

exp → exp op exp | ( exp ) | number
op → + | - | *

• 2 non-terminals
• 6 terminals
• 6 productions (3 on each line)
• Recursive rules: exp → exp op exp and exp → ( exp ). "Base" rule: exp → number.
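The counts above can be checked mechanically. The following is a minimal sketch, assuming a hypothetical representation in which a grammar is a Python dict from each non-terminal to its list of alternatives; the names `GRAMMAR`, `terminals`, `production_count`, and `is_recursive` are illustrative and not part of the slides.

```python
# Hypothetical representation: each non-terminal maps to a list of
# alternatives, and each alternative is a list of grammar symbols.
GRAMMAR = {
    "exp": [["exp", "op", "exp"], ["(", "exp", ")"], ["number"]],
    "op": [["+"], ["-"], ["*"]],
}


def nonterminals(grammar):
    """The names defined by the grammar (left-hand sides of the rules)."""
    return set(grammar)


def terminals(grammar):
    """Every symbol used on a right-hand side that is not itself a name."""
    symbols = {s for alts in grammar.values() for alt in alts for s in alt}
    return symbols - set(grammar)


def production_count(grammar):
    """Each alternative of each rule counts as one production."""
    return sum(len(alts) for alts in grammar.values())


def is_recursive(head, alternative):
    """A production is directly recursive if its own name appears in its body."""
    return head in alternative


print(sorted(nonterminals(GRAMMAR)))
print(sorted(terminals(GRAMMAR)))
print(production_count(GRAMMAR))
for alt in GRAMMAR["exp"]:
    print(" ".join(alt), "recursive" if is_recursive("exp", alt) else "base")
```

Treating "terminal" as "any symbol that never appears on a left-hand side" is a convenience of this sketch; a real compiler would take the terminal set from the scanner's token definitions.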

Dr.Manal Abdulaziz CS302 Ch3 6

Derivations

• A derivation is a sequence of replacements of structure names by choices on the right-hand sides of grammar rules.
• The arithmetic expression (34 - 3)*42 corresponds to the legal string (number - number)*number, which is derived as follows (the rule applied at each step is shown in brackets):

(1) exp ⇒ exp op exp                      [exp → exp op exp]
(2)     ⇒ exp op number                   [exp → number]
(3)     ⇒ exp * number                    [op → *]
(4)     ⇒ ( exp ) * number                [exp → ( exp )]
(5)     ⇒ ( exp op exp ) * number         [exp → exp op exp]
(6)     ⇒ ( exp op number ) * number      [exp → number]
(7)     ⇒ ( exp - number ) * number       [op → -]
(8)     ⇒ ( number - number ) * number    [exp → number]
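The eight steps of this derivation can be replayed in code. This is a sketch only, assuming sentential forms are Python lists of symbols; `rewrite_rightmost`, `derive`, and `STEPS` are hypothetical names introduced for illustration.

```python
GRAMMAR = {
    "exp": [["exp", "op", "exp"], ["(", "exp", ")"], ["number"]],
    "op": [["+"], ["-"], ["*"]],
}

# The rule applied at each of the eight steps, as (name, chosen right-hand side).
STEPS = [
    ("exp", ["exp", "op", "exp"]),
    ("exp", ["number"]),
    ("op", ["*"]),
    ("exp", ["(", "exp", ")"]),
    ("exp", ["exp", "op", "exp"]),
    ("exp", ["number"]),
    ("op", ["-"]),
    ("exp", ["number"]),
]


def rewrite_rightmost(form, head, body, grammar=GRAMMAR):
    """Replace the rightmost non-terminal of `form`, which must be `head`, by `body`."""
    if body not in grammar[head]:
        raise ValueError(f"{head} -> {' '.join(body)} is not a rule")
    for i in range(len(form) - 1, -1, -1):
        if form[i] in grammar:
            if form[i] != head:
                raise ValueError(f"rightmost non-terminal is {form[i]}, not {head}")
            return form[:i] + body + form[i + 1:]
    raise ValueError("no non-terminal left to replace")


def derive(start, steps):
    """Return the list of sentential forms, beginning with the start symbol."""
    forms = [[start]]
    for head, body in steps:
        forms.append(rewrite_rightmost(forms[-1], head, body))
    return forms


for n, form in enumerate(derive("exp", STEPS)[1:], start=1):
    print(f"({n})", " ".join(form))
```

Because every step replaces the rightmost remaining structure name, the function also confirms that this particular derivation is a rightmost one.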

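Abstract the Structure of Derivation to a Parse Tree

[Figure: parse tree for ( number - number ) * number. Each interior node is numbered with the derivation step (1 to 8) in which it is replaced: the root exp is 1; its children exp, op, exp are 4, 3, 2; the inner exp under the parentheses is 5; and its children exp, op, exp are 8, 7, 6.]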

Definitions

• The start symbol is the left-hand side of the first grammar rule of the language; every derivation begins with it.
• Nonterminals are structure names that must be replaced further in a derivation.
• Terminals are symbols of the alphabet; they terminate a derivation because they cannot be replaced.
• Left recursion: A → A α | β
• Right recursion: A → α A | β

Repetition and Recursion

• Left recursion: A → A x | y
  – yxx: A ⇒ A x ⇒ A x x ⇒ y x x
  [Figure: parse tree for yxx, growing down its left branch.]
• Right recursion: A → x A | y
  – xxy: A ⇒ x A ⇒ x x A ⇒ x x y
  [Figure: parse tree for xxy, growing down its right branch.]

Parsing Algorithms

• Top-down
  – Recursive descent (hand choice)
  – "Predictive" table-driven, "LL"
• Bottom-up
  – "LR" and its cousin "LALR" (machine-generated choice [Yacc / Bison])
  – Operator precedence

Languages Generated by Grammars

1- G: E → ( E ) | a
   L(G) = { a, (a), ((a)), (((a))), … }
   Derivation for the input string ((a)): E ⇒ (E) ⇒ ((E)) ⇒ ((a))

2- G: E → ( E )
   L(G) = { }; the grammar yields no strings, because every replacement of E introduces another E.

3- G: E → E + a | a
   L(G) = { a, a + a, a + a + a, … }
   Derivation for the input string a + a + a: E ⇒ E + a ⇒ E + a + a ⇒ a + a + a
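The three languages can be enumerated by exhaustively deriving strings up to a length bound. This is a rough sketch under two assumptions: grammars are dicts from non-terminal to alternatives, and no rule has an empty right-hand side (so sentential forms never shrink and pruning by length is safe). The function name `language` is hypothetical.

```python
from collections import deque


def language(grammar, start, max_len):
    """Return every terminal string of at most `max_len` symbols derivable from `start`."""
    seen = {(start,)}
    queue = deque(seen)
    sentences = set()
    while queue:
        form = queue.popleft()
        # Locate the leftmost non-terminal, if any remains.
        index = next((i for i, s in enumerate(form) if s in grammar), None)
        if index is None:
            sentences.add(" ".join(form))
            continue
        for body in grammar[form[index]]:
            new_form = form[:index] + tuple(body) + form[index + 1:]
            # Safe to prune: without empty rules a form can only grow.
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sentences


G1 = {"E": [["(", "E", ")"], ["a"]]}
G2 = {"E": [["(", "E", ")"]]}
G3 = {"E": [["E", "+", "a"], ["a"]]}

print(sorted(language(G1, "E", 5), key=len))
print(sorted(language(G2, "E", 5), key=len))
print(sorted(language(G3, "E", 5), key=len))
```

For the second grammar the search ends with an empty set: every form still contains E when the length bound is reached, which is the code-level view of "the grammar yields no strings".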


Parse Tree

• A parse tree corresponding to a derivation is a labeled tree in which the interior nodes are labeled by nonterminals, the leaf nodes are labeled by terminals, and the children of each interior node represent the replacement of the associated nonterminal in one step of the derivation.
• Example: exp ⇒ exp op exp ⇒ number op exp ⇒ number + exp ⇒ number + number

          exp
        /  |  \
     exp   op   exp
      |    |     |
   number  +   number

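Rightmost and Leftmost Derivation

Leftmost derivation (corresponds to a preorder numbering of the interior nodes of the parse tree):

(1) exp ⇒ exp op exp
(2)     ⇒ number op exp
(3)     ⇒ number + exp
(4)     ⇒ number + number

[Figure: parse tree for number + number with the root exp numbered 1 and its children exp, op, exp numbered 2, 3, 4.]

Rightmost derivation (corresponds to the reverse of a postorder numbering):

(1) exp ⇒ exp op exp
(2)     ⇒ exp op number
(3)     ⇒ exp + number
(4)     ⇒ number + number

[Figure: the same parse tree with the root exp numbered 1 and its children exp, op, exp numbered 4, 3, 2.]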
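Both derivations can be read off the same parse tree by choosing which unexpanded node to replace next. The sketch below assumes a hypothetical encoding in which an interior node is a `(label, children)` tuple and a leaf is a plain string; `derivation` and `render` are illustrative names.

```python
# Parse tree for number + number: interior nodes are (label, children),
# leaves are terminal strings.
TREE = ("exp", [
    ("exp", ["number"]),
    ("op", ["+"]),
    ("exp", ["number"]),
])


def render(form):
    """Show a sentential form: subtrees by their label, terminals as they are."""
    return " ".join(item if isinstance(item, str) else item[0] for item in form)


def derivation(tree, leftmost=True):
    """List the sentential forms obtained by repeatedly expanding the
    leftmost (or rightmost) interior node that has not been expanded yet."""
    form = [tree]
    steps = [render(form)]
    while True:
        pending = [i for i, item in enumerate(form) if not isinstance(item, str)]
        if not pending:
            return steps
        i = pending[0] if leftmost else pending[-1]
        _, children = form[i]
        form = form[:i] + list(children) + form[i + 1:]
        steps.append(render(form))


print(" => ".join(derivation(TREE, leftmost=True)))
print(" => ".join(derivation(TREE, leftmost=False)))
```

One tree yields exactly one leftmost and one rightmost derivation, which is why either kind of derivation can stand in for the tree itself.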

Example

A leftmost derivation of (number - number) * number (the derivation on Slide 6 was a rightmost one):

(1) exp ⇒ exp op exp                      [exp → exp op exp]
(2)     ⇒ ( exp ) op exp                  [exp → ( exp )]
(3)     ⇒ ( exp op exp ) op exp           [exp → exp op exp]
(4)     ⇒ ( number op exp ) op exp        [exp → number]
(5)     ⇒ ( number - exp ) op exp         [op → -]
(6)     ⇒ ( number - number ) op exp      [exp → number]
(7)     ⇒ ( number - number ) * exp       [op → *]
(8)     ⇒ ( number - number ) * number    [exp → number]

Abstract Syntax Trees

• An abstract syntax tree, or syntax tree, is a tree representation of a shorthand notation for the structure of the ordinary (concrete) syntax.
• Grammar:
  statement → if-stmt | other
  if-stmt → if ( exp ) statement | if ( exp ) statement else statement
  exp → 0 | 1
• Input: if (0) other else other

[Figure: parse tree. statement → if-stmt → if ( exp ) statement else statement, with exp → 0 and each statement → other.]

[Figure: syntax tree. An if node with three children: 0, other, other.]
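The step from parse tree to syntax tree can be sketched as a small tree transformation. The encoding is an assumption made for illustration: interior nodes are `(label, children)` tuples, leaves are strings, and the resulting syntax tree is a flat tuple headed by `"if"`. The name `to_syntax_tree` is hypothetical.

```python
# Parse tree for: if (0) other else other
PARSE_TREE = ("statement", [
    ("if-stmt", [
        "if", "(", ("exp", ["0"]), ")",
        ("statement", ["other"]),
        "else",
        ("statement", ["other"]),
    ]),
])


def to_syntax_tree(node):
    """Drop the tokens that only guide parsing and collapse single-child chains."""
    if isinstance(node, str):
        return node
    label, children = node
    if label == "if-stmt":
        # Keep the test and the branch(es); discard if, (, ), and else.
        parts = [to_syntax_tree(c) for c in children if not isinstance(c, str)]
        return ("if", *parts)
    # In this grammar, statement and exp always have exactly one child.
    (only_child,) = children
    return to_syntax_tree(only_child)


print(to_syntax_tree(PARSE_TREE))
```

An if-statement without an else part comes out as a three-element tuple, so the presence of the fourth element is what records the else branch.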

Examples

• G: stmt-sequence → stmt ; stmt-sequence | stmt
     stmt → s
• Input string: s ; s ; s

[Figure: parse tree and syntax tree for s ; s ; s.]

Ambiguous Grammars

• Parse trees and syntax trees uniquely express the structure of syntax, as do leftmost and rightmost derivations, but not derivations in general.
• A grammar that generates a string with two distinct parse trees is called an ambiguous grammar.
• Consider again the string number - number * number.

[Figure: two parse trees for number - number * number. In the first, the root operator is * and its left operand is number - number. In the second, the root operator is - and its right operand is number * number; this is the correct one.]
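The ambiguity can be demonstrated by counting parse trees. The sketch below is specific to the grammar exp → exp op exp | ( exp ) | number with op → + | - | *, and assumes the input is already a list of tokens; `count_parse_trees` is a hypothetical helper, not a general parsing algorithm.

```python
from functools import lru_cache

OPERATORS = {"+", "-", "*"}


def count_parse_trees(tokens):
    """Count the distinct parse trees that derive `tokens` from exp."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count_exp(i, j):
        """Number of parse trees deriving tokens[i:j] from exp."""
        total = 0
        # exp -> number
        if j - i == 1 and tokens[i] == "number":
            total += 1
        # exp -> ( exp )
        if j - i >= 3 and tokens[i] == "(" and tokens[j - 1] == ")":
            total += count_exp(i + 1, j - 1)
        # exp -> exp op exp, trying every operator position as the root
        for k in range(i + 1, j - 1):
            if tokens[k] in OPERATORS:
                total += count_exp(i, k) * count_exp(k + 1, j)
        return total

    return count_exp(0, len(tokens))


print(count_parse_trees("number - number * number".split()))
print(count_parse_trees("( number - number ) * number".split()))
```

A count above one for any string is enough to show the grammar is ambiguous; the parenthesized string has a single tree because the parentheses fix the grouping.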

Ambiguity

• Sources of ambiguity
  – Associativity and precedence of operators.
  – Extent of a substructure (dangling else).
• Dealing with ambiguity
  – Disambiguating rules: state a rule that specifies, in each ambiguous case, which of the parse trees is the correct one.
  – Change the grammar (but not the language): rewrite the grammar into a form that forces the construction of the correct parse tree.

Precedence and Associativity

• Example: integer arithmetic

exp → exp addop term | term
addop → + | -
term → term mulop factor | factor
mulop → *
factor → ( exp ) | number

[Figure: parse tree for number - number * number under this grammar. The root exp has children exp, addop (-), and term; the left exp derives term → factor → number; the right term has children term (→ factor → number), mulop (*), and factor (→ number).]
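A hand-written parser shows how the layering of exp, term, and factor produces precedence and associativity. The following is a sketch only: it swaps the left-recursive rules for loops (the usual adjustment for recursive descent, since a function for exp cannot begin by calling itself), represents numbers as Python ints, and returns nested tuples. `parse_exp` and its inner helpers are hypothetical names.

```python
def parse_exp(tokens):
    """Parse a token list and return a syntax tree of nested (op, left, right) tuples."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def exp():
        # exp -> exp addop term | term, read here as: term followed by (addop term)*
        tree = term()
        while peek() in ("+", "-"):
            op = advance()
            tree = (op, tree, term())      # grows to the left: left associative
        return tree

    def term():
        # term -> term mulop factor | factor
        tree = factor()
        while peek() == "*":
            op = advance()
            tree = (op, tree, factor())
        return tree

    def factor():
        # factor -> ( exp ) | number
        token = peek()
        if token == "(":
            advance()
            tree = exp()
            if peek() != ")":
                raise SyntaxError("expected )")
            advance()
            return tree
        if isinstance(token, int):
            return advance()
        raise SyntaxError(f"unexpected token {token!r}")

    tree = exp()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree


print(parse_exp([34, "-", 3, "*", 42]))
print(parse_exp([34, "-", 3, "-", 42]))
print(parse_exp(["(", 34, "-", 3, ")", "*", 42]))
```

Because `*` is only recognized inside `term`, a multiplication is always completed before the enclosing `exp` sees the next `+` or `-`; that is the precedence cascade of the grammar expressed as call depth.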

Dangling else Ambiguity

• Example:

statement → if-stmt | other
if-stmt → if ( exp ) statement
        | if ( exp ) statement else statement
exp → 0 | 1

The following string has two parse trees:

if(0) if(1) other else other

Parse Trees for Dangling else

[Figure: two parse trees for if(0) if(1) other else other. In the first, the else part belongs to the outer if-stmt (test 0), whose then-part is the inner if-stmt if(1) other. In the second, the outer if-stmt (test 0) has no else part, and the else belongs to the inner if-stmt (test 1).]

The second tree is the correct one, using the most closely nested disambiguating rule.
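The most closely nested rule falls out naturally in a recursive-descent parser: the call that is parsing the innermost if gets the first chance to consume an else. The sketch below assumes pre-split tokens and a tuple encoding `("if", test, then_part[, else_part])`; `parse_statement` is a hypothetical name.

```python
def parse_statement(tokens):
    """Parse statement -> if-stmt | other, attaching each else to the nearest if."""
    pos = 0

    def expect(token):
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != token:
            raise SyntaxError(f"expected {token!r} at position {pos}")
        pos += 1

    def statement():
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "other":
            pos += 1
            return "other"
        expect("if")
        expect("(")
        if pos >= len(tokens) or tokens[pos] not in ("0", "1"):
            raise SyntaxError(f"expected 0 or 1 at position {pos}")
        test = tokens[pos]
        pos += 1
        expect(")")
        then_part = statement()
        # Most closely nested rule: this call belongs to the innermost if that
        # is still open, and it takes the else whenever one is present.
        if pos < len(tokens) and tokens[pos] == "else":
            pos += 1
            return ("if", test, then_part, statement())
        return ("if", test, then_part)

    tree = statement()
    if pos != len(tokens):
        raise SyntaxError("unexpected trailing tokens")
    return tree


print(parse_statement("if ( 0 ) if ( 1 ) other else other".split()))
```

The outer if comes back as a three-element tuple (no else part) and the inner if as a four-element tuple, matching the tree marked as correct.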

Changing the Grammar Rule for Dangling else Problem

The grammar becomes:

statement → matched-stmt | unmatched-stmt
matched-stmt → if ( exp ) matched-stmt else matched-stmt | other
unmatched-stmt → if ( exp ) statement
               | if ( exp ) matched-stmt else unmatched-stmt
exp → 0 | 1
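One way to check that the rewritten grammar removes the ambiguity is to count parse trees under both grammars. The sketch below is a brute-force counter, not a practical parser; it assumes grammars are dicts from non-terminal to alternatives, with no empty right-hand sides and no cycles of single-symbol rules. `AMBIGUOUS`, `REVISED`, and `count_parse_trees` are hypothetical names.

```python
from functools import lru_cache

AMBIGUOUS = {
    "statement": [["if-stmt"], ["other"]],
    "if-stmt": [["if", "(", "exp", ")", "statement"],
                ["if", "(", "exp", ")", "statement", "else", "statement"]],
    "exp": [["0"], ["1"]],
}

REVISED = {
    "statement": [["matched-stmt"], ["unmatched-stmt"]],
    "matched-stmt": [["if", "(", "exp", ")", "matched-stmt", "else", "matched-stmt"],
                     ["other"]],
    "unmatched-stmt": [["if", "(", "exp", ")", "statement"],
                       ["if", "(", "exp", ")", "matched-stmt", "else", "unmatched-stmt"]],
    "exp": [["0"], ["1"]],
}


def count_parse_trees(grammar, start, tokens):
    """Count the parse trees that derive `tokens` from `start`."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def derive(symbol, i, j):
        """Number of ways `symbol` derives tokens[i:j]."""
        if symbol not in grammar:  # a terminal matches exactly one equal token
            return 1 if j - i == 1 and tokens[i] == symbol else 0
        return sum(match(tuple(body), i, j) for body in grammar[symbol])

    @lru_cache(maxsize=None)
    def match(body, i, j):
        """Number of ways the symbol sequence `body` derives tokens[i:j]."""
        if not body:
            return 1 if i == j else 0
        first, rest = body[0], body[1:]
        # Each symbol covers at least one token, since no rule is empty.
        return sum(derive(first, i, k) * match(rest, k, j)
                   for k in range(i + 1, j - len(rest) + 1))

    return derive(start, 0, len(tokens))


TOKENS = "if ( 0 ) if ( 1 ) other else other".split()
print(count_parse_trees(AMBIGUOUS, "statement", TOKENS))
print(count_parse_trees(REVISED, "statement", TOKENS))
```

A single tree for this one string does not prove the revised grammar unambiguous in general; it only confirms that the string which exposed the problem now has one reading.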

Parse Tree for the Solution

Input string: if(0) if(1) other else other

[Figure: parse tree under the revised grammar. statement → unmatched-stmt → if ( exp ) statement, with exp → 0; the inner statement → matched-stmt → if ( exp ) matched-stmt else matched-stmt, with exp → 1 and each matched-stmt → other.]

Extended BNF Notation

• Extended BNF (EBNF):
  – New metasymbols: [...] for optional parts and {...} for repetition.
• Repetition:
  A → A α | β (left recursion)
  A → α A | β (right recursion)
• This is equivalent to:
  A → β α*
  A → α* β
• Using EBNF notation:
  A → β {α}
  A → {α} β

Extended BNF Notation

• Example:
  stmt-sequence → stmt ; stmt-sequence | stmt
• Using EBNF:
  stmt-sequence → { stmt ; } stmt (right recursion)
  stmt-sequence → stmt { ; stmt } (left recursion)
• Optional: using the previous example
  stmt-sequence → stmt [ ; stmt-sequence ]
• Example: exp → exp addop term | term
  using EBNF: exp → [ exp addop ] term
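EBNF braces translate directly into a loop in a hand-written parser. This is a minimal sketch for stmt-sequence → stmt { ; stmt } with stmt → s, assuming pre-split tokens; `parse_stmt_sequence` is a hypothetical name.

```python
def parse_stmt_sequence(tokens):
    """Parse stmt-sequence -> stmt { ; stmt } and return the list of statements."""
    pos = 0

    def stmt():
        # stmt -> s
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "s":
            raise SyntaxError(f"expected 's' at position {pos}")
        pos += 1
        return "s"

    statements = [stmt()]
    # The braces { ; stmt } mean "zero or more times", which is a while loop.
    while pos < len(tokens) and tokens[pos] == ";":
        pos += 1
        statements.append(stmt())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return statements


print(parse_stmt_sequence("s ; s ; s".split()))
```

An optional part written with [...] would become an `if` in the same position where the repetition uses `while`.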

Syntax Diagram

• Example: factor → ( exp ) | number
• Repetition: A → { B }
• Optional: A → [ B ]

[Figure: syntax diagrams for factor (one path through ( exp ) and one through number), for the repetition A → { B }, and for the optional A → [ B ].]