Chapter 3
Dr.Manal Abdulaziz CS302 Ch3 2
The Parsing Process
• Parsing is the task of determining the syntax, or structure, of a program; for this reason it is also called syntax analysis.
• The syntax of a programming language is usually given by the grammar rules of a context-free grammar.
• The rules of a context-free grammar may be recursive.
• The data structure used to represent the syntactic structure of a program is called a parse tree or syntax tree.
• There are two general categories of parsing algorithms: top-down parsing and bottom-up parsing.
• The parsing process may be viewed as:
Sequence of Tokens → Parser → Syntax Tree
Dr.Manal Abdulaziz CS302 Ch3 3
Context-Free Grammar Terminology
An alphabet or set of basic symbols (like regular expressions, only now the symbols are whole tokens, not chars), including . (Terminals)
A set of names for structures (like statement, expression, definition). (Non-terminals)
A set of grammar rules expressing the structure of each name. (Productions)
A start symbol (the name of the most general structure, for example compilation unit in C).
Dr.Manal Abdulaziz CS302 Ch3 4
Context-Free Grammars
• Example:
  exp → exp op exp | ( exp ) | number
  op → + | - | *
• Names are written in italic.
• Choice and concatenation are written as in regular expressions.
• Repetition is represented by recursion.
• An arrow replaces the equal sign.
• Grammar rules in this form are said to be in Backus-Naur Form, or BNF notation.
Dr.Manal Abdulaziz CS302 Ch3 5
Example
In what way does such a context-free grammar differ from a regular expression?
  digit = 0|1|…|9
  number = digit digit*
Answer: recursion!
  exp → exp op exp | ( exp ) | number
  op → + | - | *
• 2 non-terminals
• 6 terminals
• 6 productions (3 on each line)
• In the exp rule, exp op exp and ( exp ) are the recursive alternatives; number is the "base" alternative.
Dr.Manal Abdulaziz CS302 Ch3 6
Derivations
• A derivation is a sequence of replacements of structure names by choices on the right-hand sides of grammar rules.
• The arithmetic expression (34 - 3)*42 corresponds to the legal string (number - number)*number
• A derivation of this string (the rule applied at each step is shown in brackets):
(1) exp ⇒ exp op exp [exp → exp op exp]
(2) ⇒ exp op number [exp → number]
(3) ⇒ exp * number [op → *]
(4) ⇒ (exp) * number [exp → ( exp )]
(5) ⇒ (exp op exp) * number [exp → exp op exp]
(6) ⇒ (exp op number) * number [exp → number]
(7) ⇒ (exp - number) * number [op → -]
(8) ⇒ (number - number) * number [exp → number]
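The eight steps can be replayed in code. This is a sketch only: the list STEPS transcribes the bracketed rules from the derivation, while the function names and the list-of-symbols representation are assumptions made for illustration. Each step replaces the rightmost nonterminal of the current sentential form.

```python
NONTERMINALS = {"exp", "op"}

# The rule applied at each of the eight steps, as (left-hand side, right-hand side).
STEPS = [
    ("exp", ["exp", "op", "exp"]),
    ("exp", ["number"]),
    ("op", ["*"]),
    ("exp", ["(", "exp", ")"]),
    ("exp", ["exp", "op", "exp"]),
    ("exp", ["number"]),
    ("op", ["-"]),
    ("exp", ["number"]),
]


def rightmost_step(form, lhs, rhs):
    """Replace the rightmost nonterminal of `form`, which must be `lhs`, by `rhs`."""
    for i in reversed(range(len(form))):
        if form[i] in NONTERMINALS:
            if form[i] != lhs:
                raise ValueError(f"rightmost nonterminal is {form[i]}, not {lhs}")
            return form[:i] + rhs + form[i + 1:]
    raise ValueError("no nonterminal left to replace")


def derive(start, steps):
    """Return the list of sentential forms, beginning with the start symbol."""
    forms = [[start]]
    for lhs, rhs in steps:
        forms.append(rightmost_step(forms[-1], lhs, rhs))
    return forms


forms = derive("exp", STEPS)
for form in forms:
    print(" ".join(form))
```

The last line printed is ( number - number ) * number, the legal string given above.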
Dr.Manal Abdulaziz CS302 Ch3 7
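Abstract the Structure of Derivation to a Parse Tree
Parse tree for (number - number) * number. Children are indented under their parent, and each interior node carries the number of the derivation step that expands it:
1 exp
  4 exp
    (
    5 exp
      8 exp
        number
      7 op
        -
      6 exp
        number
    )
  3 op
    *
  2 exp
    number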
Dr.Manal Abdulaziz CS302 Ch3 8
Definitions
• The start symbol is the nonterminal on the left-hand side of the first grammar rule of the language; every derivation begins with it.
• Nonterminals are structure names that must be replaced further in the derivation.
• Terminals are symbols of the alphabet; they cannot be replaced, so they terminate the derivation.
• Left recursion: A → A α | β
• Right recursion: A → α A | β
Dr.Manal Abdulaziz CS302 Ch3 9
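Repetition and Recursion
• Left recursion: A → A x | y
  – yxx: A ⇒ A x ⇒ A x x ⇒ y x x (the parse tree grows down its left branch, with y at the bottom)
• Right recursion: A → x A | y
  – xxy: A ⇒ x A ⇒ x x A ⇒ x x y (the parse tree grows down its right branch, with y at the bottom)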
Dr.Manal Abdulaziz CS302 Ch3 10
Parsing Algorithms
• Top down
  – Recursive descent (the usual choice for hand-written parsers)
  – "Predictive" table-driven, "LL"
• Bottom up
  – "LR" and its cousin "LALR" (the usual choice for machine-generated parsers [Yacc / Bison])
  – Operator precedence
Dr.Manal Abdulaziz CS302 Ch3 11
Languages Generated by Grammars
1- G: E → ( E ) | a
   L(G) = { a, (a), ((a)), (((a))), … }
   Derivation for the input string ((a)):
   E ⇒ (E) ⇒ ((E)) ⇒ ((a))
2- G: E → ( E )
   L(G) = { }; the grammar yields no strings, because the recursion has no base case.
3- G: E → E + a | a
   L(G) = { a, a + a, a + a + a, … }
   Derivation for the input string a + a + a:
   E ⇒ E + a ⇒ E + a + a ⇒ a + a + a
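The three languages can be explored by brute force. The sketch below is one possible approach, not a standard algorithm from the course: it expands the leftmost nonterminal of every sentential form breadth-first and discards forms longer than a bound. It assumes no production has an empty right-hand side, so forms never shrink; the function name sentences and the bound max_len are illustrative.

```python
from collections import deque


def sentences(grammar, start, max_len):
    """Return every terminal string of at most max_len symbols derivable from start.

    Assumes no production has an empty right-hand side, so a sentential form
    that is already too long can never lead to a short enough sentence.
    """
    seen = set()
    results = set()
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Index of the leftmost nonterminal, or None if the form is all terminals.
        idx = next((i for i, s in enumerate(form) if s in grammar), None)
        if idx is None:
            results.add(" ".join(form))
            continue
        for rhs in grammar[form[idx]]:
            new_form = form[:idx] + tuple(rhs) + form[idx + 1:]
            if len(new_form) <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return sorted(results, key=lambda s: (len(s), s))


G1 = {"E": [["(", "E", ")"], ["a"]]}
G2 = {"E": [["(", "E", ")"]]}
G3 = {"E": [["E", "+", "a"], ["a"]]}

print(sentences(G1, "E", 5))  # ['a', '( a )', '( ( a ) )']
print(sentences(G2, "E", 5))  # []
print(sentences(G3, "E", 5))  # ['a', 'a + a', 'a + a + a']
```

For the second grammar the search never reaches a form made only of terminals, which matches L(G) = { }.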
Dr.Manal Abdulaziz CS302 Ch3 13
Parse Tree
• A parse tree corresponding to a derivation is a labeled tree in which the interior nodes are labeled by nonterminals, the leaf nodes are labeled by terminals, and the children of each interior node represent the replacement of the associated nonterminal in one step of the derivation.
• exp ⇒ exp op exp ⇒ number op exp ⇒ number + exp ⇒ number + number
exp
  exp
    number
  op
    +
  exp
    number
Dr.Manal Abdulaziz CS302 Ch3 14
Rightmost and Leftmost Derivation
Leftmost derivation (corresponds to a preorder numbering of the interior nodes):
1 exp ⇒ exp op exp
2 ⇒ number op exp
3 ⇒ number + exp
4 ⇒ number + number
1 exp
  2 exp
    number
  3 op
    +
  4 exp
    number
Rightmost derivation (corresponds to the reverse of a postorder numbering of the interior nodes):
1 exp ⇒ exp op exp
2 ⇒ exp op number
3 ⇒ exp + number
4 ⇒ number + number
1 exp
  4 exp
    number
  3 op
    +
  2 exp
    number
Dr.Manal Abdulaziz CS302 Ch3 15
Example
A leftmost derivation (Slide 6 was a rightmost):
(1) exp ⇒ exp op exp [exp → exp op exp]
(2) ⇒ (exp) op exp [exp → ( exp )]
(3) ⇒ (exp op exp) op exp [exp → exp op exp]
(4) ⇒ (number op exp) op exp [exp → number]
(5) ⇒ (number - exp) op exp [op → -]
(6) ⇒ (number - number) op exp [exp → number]
(7) ⇒ (number - number) * exp [op → *]
(8) ⇒ (number - number) * number [exp → number]
Dr.Manal Abdulaziz CS302 Ch3 16
Abstract Syntax Trees
• An abstract syntax tree, or syntax tree, is a tree representation of a shorthand notation for the structure of ordinary syntax.
• Grammar:
  statement → if-stmt | other
  if-stmt → if ( exp ) statement | if ( exp ) statement else statement
  exp → 0 | 1
• Input: if (0) other else other
Parse tree:
statement
  if-stmt
    if
    (
    exp
      0
    )
    statement
      other
    else
    statement
      other
Syntax tree:
if
  0
  other
  other
Dr.Manal Abdulaziz CS302 Ch3 17
Examples
• G: stmt-sequence → stmt ; stmt-sequence | stmt
     stmt → s
• Input string: s ; s ; s
Parse tree:
stmt-sequence
  stmt
    s
  ;
  stmt-sequence
    stmt
      s
    ;
    stmt-sequence
      stmt
        s
Syntax tree:
;
  s
  ;
    s
    s
Dr.Manal Abdulaziz CS302 Ch3 18
Ambiguous Grammars
• Parse trees and syntax trees uniquely express the structure of syntax, as do leftmost and rightmost derivations, but not derivations in general.
• A grammar that generates a string with two distinct parse trees is called an ambiguous grammar.
• Consider again the string number - number * number. The grammar exp → exp op exp | ( exp ) | number gives it two parse trees.
Tree 1, grouping the string as (number - number) * number:
exp
  exp
    exp
      number
    op
      -
    exp
      number
  op
    *
  exp
    number
Tree 2, grouping the string as number - (number * number); this is the correct one:
exp
  exp
    number
  op
    -
  exp
    exp
      number
    op
      *
    exp
      number
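The effect of the ambiguity shows up as soon as the trees are evaluated. The sketch below is an illustration under assumptions: it enumerates every parse of a parenthesis-free token list under exp → exp op exp | number by choosing each operator in turn as the root, and it uses the values 34, 3 and 42 from the earlier arithmetic example. The names OPS and parses are hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def parses(tokens):
    """Yield (tree, value) for every parse of `tokens` under
    exp -> exp op exp | number (parentheses are left out of this sketch)."""
    if len(tokens) == 1:
        yield tokens[0], tokens[0]
        return
    # Operators sit at the odd positions; each one can be the root of a tree.
    for i in range(1, len(tokens), 2):
        for left_tree, left_value in parses(tokens[:i]):
            for right_tree, right_value in parses(tokens[i + 1:]):
                tree = (left_tree, tokens[i], right_tree)
                yield tree, OPS[tokens[i]](left_value, right_value)


results = list(parses([34, "-", 3, "*", 42]))
for tree, value in results:
    print(tree, "=", value)
# (34, '-', (3, '*', 42)) = -92
# ((34, '-', 3), '*', 42) = 1302
```

Two trees, two different values: only the tree in which * binds tighter than - gives the conventional result, -92.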
Dr.Manal Abdulaziz CS302 Ch3 19
Ambiguity
• Sources of ambiguity
  – Associativity and precedence of operators.
  – Extent of a substructure (dangling else).
• Dealing with ambiguity
  – Disambiguating rules: state a rule that specifies, in each ambiguous case, which of the parse trees is the correct one.
  – Change the grammar (but not the language): rewrite the grammar into a form that forces the construction of the correct parse tree.
Dr.Manal Abdulaziz CS302 Ch3 20
Precedence and Associativity
• Example: integer arithmetic
  exp → exp addop term | term
  addop → + | -
  term → term mulop factor | factor
  mulop → *
  factor → ( exp ) | number
Parse tree for number - number * number:
exp
  exp
    term
      factor
        number
  addop
    -
  term
    term
      factor
        number
    mulop
      *
    factor
      number
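To see that this grammar forces the intended grouping, here is a minimal recursive-descent evaluator, offered as a sketch rather than as the course's implementation. Each nonterminal becomes one method, and the left-recursive rules are coded as loops, which keeps - and * left associative. The class name Parser, the helper names, and the regular-expression tokenizer are assumptions; the tokenizer silently skips characters it does not recognize.

```python
import re


def tokenize(text):
    """Split the input into numbers, operators and parentheses."""
    return re.findall(r"\d+|[-+*()]", text)


class Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def exp(self):
        # exp -> exp addop term | term, coded as a loop over addop term
        value = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            rhs = self.term()
            value = value + rhs if op == "+" else value - rhs
        return value

    def term(self):
        # term -> term mulop factor | factor, coded as a loop over mulop factor
        value = self.factor()
        while self.peek() == "*":
            self.advance()
            value *= self.factor()
        return value

    def factor(self):
        # factor -> ( exp ) | number
        token = self.advance()
        if token == "(":
            value = self.exp()
            if self.advance() != ")":
                raise SyntaxError("expected ')'")
            return value
        if token is not None and token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")


def evaluate(text):
    parser = Parser(tokenize(text))
    value = parser.exp()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected token {parser.peek()!r}")
    return value


print(evaluate("34 - 3 * 42"))    # -92
print(evaluate("(34 - 3) * 42"))  # 1302
print(evaluate("10 - 4 - 3"))     # 3
```

Because term is called from inside exp, a multiplication is always completed before the surrounding subtraction, which mirrors the shape of the parse tree above.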
Dr.Manal Abdulaziz CS302 Ch3 21
Dangling else Ambiguity
• Example:
  statement → if-stmt | other
  if-stmt → if ( exp ) statement
          | if ( exp ) statement else statement
  exp → 0 | 1
The following string has two parse trees:
  if(0) if(1) other else other
Dr.Manal Abdulaziz CS302 Ch3 22
Parse Trees for Dangling else
Tree 1 (the else is attached to the outer if):
statement
  if-stmt
    if
    (
    exp
      0
    )
    statement
      if-stmt
        if
        (
        exp
          1
        )
        statement
          other
    else
    statement
      other
Tree 2 (the else is attached to the inner if); this is the correct one under the most closely nested disambiguating rule:
statement
  if-stmt
    if
    (
    exp
      0
    )
    statement
      if-stmt
        if
        (
        exp
          1
        )
        statement
          other
        else
        statement
          other
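A hand-written recursive-descent parser applies the most closely nested rule almost for free: after parsing the statement that follows the condition, it checks at once whether the next token is else and, if so, attaches it to the if it is currently working on. The following sketch assumes a tuple-based syntax tree and hypothetical function names; it is an illustration, not the course's parser.

```python
import re


def tokenize(text):
    """Split the input into the tokens if, else, other, 0, 1, ( and )."""
    return re.findall(r"if|else|other|[01()]", text)


def parse_statement(tokens, pos=0):
    """Parse one statement starting at `pos`; return (syntax tree, next position)."""
    if tokens[pos:pos + 1] == ["other"]:
        return "other", pos + 1
    header_ok = (
        tokens[pos:pos + 2] == ["if", "("]
        and tokens[pos + 2:pos + 3] in (["0"], ["1"])
        and tokens[pos + 3:pos + 4] == [")"]
    )
    if not header_ok:
        raise SyntaxError(f"unexpected input at token {pos}")
    test = tokens[pos + 2]
    then_part, pos = parse_statement(tokens, pos + 4)
    # Most closely nested rule: an else that follows immediately belongs
    # to the if being parsed right now, which is the innermost open one.
    if tokens[pos:pos + 1] == ["else"]:
        else_part, pos = parse_statement(tokens, pos + 1)
        return ("if", test, then_part, else_part), pos
    return ("if", test, then_part), pos


tokens = tokenize("if(0) if(1) other else other")
tree, end = parse_statement(tokens)
print(tree)  # ('if', '0', ('if', '1', 'other', 'other'))
```

The result has the shape of Tree 2: the outer if has no else part, and the inner if carries both branches.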
Dr.Manal Abdulaziz CS302 Ch3 23
Changing the Grammar Rule for Dangling else Problem
The grammar becomes:
  statement → matched-stmt | unmatched-stmt
  matched-stmt → if ( exp ) matched-stmt else matched-stmt | other
  unmatched-stmt → if ( exp ) statement
                 | if ( exp ) matched-stmt else unmatched-stmt
  exp → 0 | 1
Dr.Manal Abdulaziz CS302 Ch3 24
Parse Tree for the Solution
Input string: if(0) if(1) other else other
statement
  unmatched-stmt
    if
    (
    exp
      0
    )
    statement
      matched-stmt
        if
        (
        exp
          1
        )
        matched-stmt
          other
        else
        matched-stmt
          other
Dr.Manal Abdulaziz CS302 Ch3 25
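Extended BNF Notation
• Extended BNF (EBNF):
  – New metasymbols […] (optional part) and {…} (repetition); these largely eliminate the need for explicit recursion and empty alternatives.
• Repetition:
  A → A α | β (left recursion)
  A → α A | β (right recursion)
• These are equivalent to:
  A = β α* (left recursion)
  A = α* β (right recursion)
• Using EBNF notation:
  A → β {α} (left recursion)
  A → {α} β (right recursion)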
Dr.Manal Abdulaziz CS302 Ch3 26
Extended BNF Notation
• Example:
  stmt-sequence → stmt ; stmt-sequence | stmt
• Using EBNF:
  stmt-sequence → { stmt ; } stmt (from the right-recursive rule above)
  stmt-sequence → stmt { ; stmt } (from the left-recursive form stmt-sequence → stmt-sequence ; stmt | stmt)
• Optional: using the previous example
  stmt-sequence → stmt [ ; stmt-sequence ]
• Example: exp → exp addop term | term
  using EBNF: exp → [ exp addop ] term
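One practical benefit of the {…} form is that it maps directly onto a loop. The sketch below parses stmt-sequence → stmt { ; stmt } with stmt → s; the function name and the choice to return a flat list of statements are assumptions made for illustration.

```python
def parse_stmt_sequence(tokens):
    """Parse stmt-sequence -> stmt { ; stmt } with stmt -> s.

    Returns the list of statements found, or raises SyntaxError.
    """
    pos = 0

    def stmt():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "s":
            raise SyntaxError(f"expected 's' at token {pos}")
        pos += 1
        return "s"

    statements = [stmt()]
    # The braces { ; stmt } become a while loop: repeat as long as a ';' follows.
    while pos < len(tokens) and tokens[pos] == ";":
        pos += 1
        statements.append(stmt())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return statements


print(parse_stmt_sequence("s ; s ; s".split()))  # ['s', 's', 's']
```

A trailing semicolon, as in s ;, is rejected, because every ; inside the braces must be followed by another stmt.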