Simple One-Pass Compiler
Natawut Nupairoj, Ph.D.
Department of Computer Engineering, Chulalongkorn University
Outline
Translation Scheme. Annotated Parse Tree. Parsing Fundamentals. Top-Down Parsers. Abstract Stack Machine. Simple Code Generation.
Simple One-Pass Compiler
Source program (text stream) -> Scanner -> Parser -> Object code (text stream)
Example input characters: m a i n ( ) {
Sample Grammar
expr -> expr + term
expr -> expr - term
expr -> term
term -> 0 | 1 | 2 | ... | 9
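Derivation
String: 9 - 5 + 2
expr => expr + term
     => expr - term + term
     => term - term + term
     => 9 - term + term
     => 9 - 5 + term
     => 9 - 5 + 2
Leftmost/rightmost derivation: the steps above always expand the leftmost nonterminal, so this is a leftmost derivation.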
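The derivation can be replayed mechanically. The following is a minimal Python sketch, not taken from the slides: the helper name `leftmost_derivation` and the encoding of productions as `(head, body)` pairs are assumptions made for illustration. It applies each production to the leftmost nonterminal and records every sentential form.

```python
NONTERMINALS = {"expr", "term"}

def leftmost_derivation(start, productions):
    """Apply each (head, body) production to the leftmost nonterminal.

    Returns the list of sentential forms, beginning with [start].
    """
    form = [start]
    steps = [form]
    for head, body in productions:
        # Locate the leftmost nonterminal in the current sentential form.
        index = next(i for i, sym in enumerate(form) if sym in NONTERMINALS)
        if form[index] != head:
            raise ValueError(f"leftmost nonterminal is {form[index]}, not {head}")
        form = form[:index] + body + form[index + 1:]
        steps.append(form)
    return steps

# Productions used to derive 9 - 5 + 2, listed in leftmost order.
STEPS = leftmost_derivation("expr", [
    ("expr", ["expr", "+", "term"]),
    ("expr", ["expr", "-", "term"]),
    ("expr", ["term"]),
    ("term", ["9"]),
    ("term", ["5"]),
    ("term", ["2"]),
])

for form in STEPS:
    print(" ".join(form))
```

The last line printed is the input string itself; a rightmost derivation would use the same productions but always expand the rightmost nonterminal.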
Parse Tree
              expr
            /  |   \
        expr   +   term
      /  |  \        |
  expr   -  term     2
    |         |
  term        5
    |
    9
Translation Scheme
A context-free grammar with embedded semantic actions; the actions emit a translation.
expr ::= expr + term   { print('+'); }
expr ::= expr - term   { print('-'); }
expr ::= term
term ::= 0             { print('0'); }
term ::= 1             { print('1'); }
...
term ::= 9             { print('9'); }
Parse Tree with Semantic Actions
expr
  expr
    expr
      term
        9
        { print('9') }
    -
    term
      5
      { print('5') }
    { print('-') }
  +
  term
    2
    { print('2') }
  { print('+') }
Depth-first traversal
Input: 9 - 5 + 2
Output: 9 5 - 2 +
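The depth-first traversal can be sketched in a few lines of Python. This is an illustrative sketch only: the tuple encoding of the tree and the helper names `node`, `action`, `digit`, and `traverse` are assumptions, not something the slides define. Semantic actions are stored as extra leaves and run when the walk reaches them.

```python
def node(label, *children):
    """An interior parse-tree node: (label, list of children)."""
    return (label, list(children))

def action(text):
    """A semantic-action leaf that emits `text` when visited."""
    return ("action", text)

def digit(d):
    """term ::= d { print(d) }"""
    return node("term", d, action(d))

def traverse(tree, out):
    """Depth-first, left-to-right walk; actions run in the order they are met."""
    label, payload = tree
    if label == "action":
        out.append(payload)
        return
    for child in payload:
        if isinstance(child, tuple):
            traverse(child, out)
        # Plain strings are terminals (tokens); they emit nothing themselves.

# Annotated parse tree for 9 - 5 + 2, actions placed last on each right side.
TREE = node("expr",
            node("expr",
                 node("expr", digit("9")),
                 "-", digit("5"), action("-")),
            "+", digit("2"), action("+"))

output = []
traverse(TREE, output)
print(" ".join(output))  # 9 5 - 2 +
```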
Location of Semantic Actions
Semantic actions can be placed anywhere on the RHS.
expr ::= {print('+');} expr + term
expr ::= {print('-');} expr - term
expr ::= term
term ::= 0 {print('0');}
term ::= 1 {print('1');}
...
term ::= 9 {print('9');}
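Moving an action changes the order of the emitted symbols. With the operator actions placed first, as in the productions above, the input 9 - 5 + 2 comes out in prefix order; with the actions placed last it comes out in postfix order. A small Python sketch under stated assumptions (the `(left, op, right)` tuple encoding and the function `translate` are hypothetical, chosen only to show the effect):

```python
def translate(tree, action_first):
    """Emit symbols for an expression tree.

    `tree` is either a digit string or a tuple (left, op, right).
    action_first=True  mimics  expr ::= {print(op)} expr op term
    action_first=False mimics  expr ::= expr op term {print(op)}
    """
    if isinstance(tree, str):
        return [tree]  # term ::= d {print(d)}
    left, op, right = tree
    body = translate(left, action_first) + translate(right, action_first)
    return [op] + body if action_first else body + [op]

TREE = (("9", "-", "5"), "+", "2")  # parse of 9 - 5 + 2

print(" ".join(translate(TREE, action_first=True)))   # + - 9 5 2
print(" ".join(translate(TREE, action_first=False)))  # 9 5 - 2 +
```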
Parsing Approaches
Top-down parsing
  builds the parse tree from the start symbol
  matches the resulting terminal string against the input stream
  simple but limited in power
Bottom-up parsing
  starts from the input token stream
  builds the parse tree from the terminals until it reaches the start symbol
  complex but powerful
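Top Down vs. Bottom Up
[Figure: two diagrams side by side. Top-down parsing: start at the top of the tree; the result is matched against the input token stream. Bottom-up parsing: start at the input token stream; the result is the top of the tree.]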
Example
type ::= simple
       | ^id
       | array [ simple ] of type
simple ::= integer
         | char
         | num dotdot num
Input Token String
array [ num dotdot num ] of integer
Top-Down Parsing with Left-to-Right Scanning of the Input Stream
type
  array [ simple ] of type
Input: array [ num dotdot num ] of integer
lookahead token
Backtracking (Recursive-Descent Parsing)
simple
  integer   char   num
Input: array [ num dotdot num ] of integer
lookahead token
Predictive Parsing
type ::= simple
       | ^id
       | array [ simple ] of type
simple ::= integer
         | char
         | num dotdot num
type
  array [ simple ] of type
Input: array [ num dotdot num ] of integer
lookahead token
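The Program for Predictive Parser
[Figure: the input text stream (a r r a y [ ...) feeds match (the scanner); the predictive parser calls, for example, match('array'), the scanner answers OK, and the parser produces the output.]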
The Program for Predictive Parsing
{ Advance past token t if it is the lookahead; otherwise report an error. }
procedure match ( t : token );
begin
  if lookahead = t then
    lookahead := nexttoken
  else error
end;

{ simple ::= integer | char | num dotdot num }
procedure simple;
begin
  if lookahead = integer then
    match ( integer )
  else if lookahead = char then
    match ( char )
  else if lookahead = num then begin
    match ( num ); match ( dotdot ); match ( num )
  end
  else error
end;

{ type ::= simple | ^id | array [ simple ] of type }
procedure type;
begin
  if lookahead is in { integer, char, num } then
    simple
  else if lookahead = '^' then begin
    match ( '^' ); match ( id )
  end
  else if lookahead = array then begin
    match ( array ); match ( '[' ); simple; match ( ']' ); match ( of ); type
  end
  else error
end;
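For readers who want to run the parser, here is a minimal Python sketch of the same three procedures. Several details are assumptions that the slides do not specify: tokens are whitespace-separated strings, a `"$"` marker is appended to signal end of input, errors are reported with a hypothetical `ParseError` exception, and the procedures are renamed `parse_simple` and `parse_type`.

```python
class ParseError(Exception):
    """Raised when the lookahead token does not fit the grammar."""

class PredictiveParser:
    def __init__(self, tokens):
        self.tokens = list(tokens) + ["$"]  # "$" = end of input (assumption)
        self.pos = 0

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        """Consume token t if it is the lookahead, else fail."""
        if self.lookahead != t:
            raise ParseError(f"expected {t!r}, found {self.lookahead!r}")
        self.pos += 1

    def parse_simple(self):
        # simple ::= integer | char | num dotdot num
        if self.lookahead in ("integer", "char"):
            self.match(self.lookahead)
        elif self.lookahead == "num":
            self.match("num"); self.match("dotdot"); self.match("num")
        else:
            raise ParseError(f"unexpected {self.lookahead!r} in simple")

    def parse_type(self):
        # type ::= simple | ^id | array [ simple ] of type
        if self.lookahead in ("integer", "char", "num"):
            self.parse_simple()
        elif self.lookahead == "^":
            self.match("^"); self.match("id")
        elif self.lookahead == "array":
            self.match("array"); self.match("["); self.parse_simple()
            self.match("]"); self.match("of"); self.parse_type()
        else:
            raise ParseError(f"unexpected {self.lookahead!r} in type")

def accepts(text):
    """True if the whole token string is derivable from `type`."""
    parser = PredictiveParser(text.split())
    try:
        parser.parse_type()
        parser.match("$")
        return True
    except ParseError:
        return False

print(accepts("array [ num dotdot num ] of integer"))  # True
```

Each branch is chosen by looking only at the current lookahead token, so the parser never backtracks.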
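Mapping Between Productions and Parser Code
type -> array [ simple ] of type
match(array); match('['); simple; match(']'); match(of); type
Each terminal on the right side becomes a match call, which is served by the scanner. Each nonterminal becomes a call to its own parser procedure; the call to simple performs the parsing (recognition) of simple.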
Lookahead Symbols
A -> α
FIRST(α) = set of first tokens in strings generated from α
FIRST(simple) = { integer, char, num }
FIRST( ^id ) = { ^ }
FIRST(array [ simple ] of type) = { array }
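FIRST sets can be computed by iterating until nothing changes. The sketch below is one common way to do it, written in Python under stated assumptions: the grammar is a dictionary from nonterminal to a list of right sides, an empty right side stands for an ε-production, and the names `GRAMMAR`, `first_of_string`, and `compute_first` are illustrative only.

```python
EPSILON = "ε"

# The type/simple grammar; each right side is a list of symbols.
GRAMMAR = {
    "type": [["simple"], ["^", "id"],
             ["array", "[", "simple", "]", "of", "type"]],
    "simple": [["integer"], ["char"], ["num", "dotdot", "num"]],
}

def first_of_string(symbols, first):
    """FIRST of a symbol sequence, given the FIRST sets of the nonterminals."""
    result = set()
    for sym in symbols:
        if sym not in first:           # a terminal starts the string
            result.add(sym)
            return result
        result |= first[sym] - {EPSILON}
        if EPSILON not in first[sym]:  # this nonterminal cannot vanish
            return result
    result.add(EPSILON)                # every symbol can vanish (or empty body)
    return result

def compute_first(grammar):
    """Fixed-point computation of FIRST for every nonterminal."""
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            for body in bodies:
                new = first_of_string(body, first)
                if not new <= first[head]:
                    first[head] |= new
                    changed = True
    return first

FIRST = compute_first(GRAMMAR)
print(sorted(FIRST["simple"]))                    # ['char', 'integer', 'num']
print(first_of_string(["^", "id"], FIRST))        # {'^'}
```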
Rules for Predictive Parser
If A -> α and A -> β, then FIRST(α) and FIRST(β) must be disjoint.
ε-production
stmt -> begin opt_stmts end
opt_stmts -> stmt_list opt_stmts | ε
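The disjointness rule is easy to check once the FIRST sets of the alternatives are known. A minimal Python sketch, assuming the FIRST sets are supplied by hand as Python sets (the function name `alternatives_are_disjoint` is hypothetical):

```python
from itertools import combinations

def alternatives_are_disjoint(first_sets):
    """True if no token appears in the FIRST sets of two alternatives."""
    return all(a.isdisjoint(b) for a, b in combinations(first_sets, 2))

# type ::= simple | ^id | array [ simple ] of type
TYPE_ALTERNATIVES = [{"integer", "char", "num"}, {"^"}, {"array"}]

# expr -> expr + term | term : both alternatives start with a digit.
DIGITS = set("0123456789")
EXPR_ALTERNATIVES = [DIGITS, DIGITS]

print(alternatives_are_disjoint(TYPE_ALTERNATIVES))  # True
print(alternatives_are_disjoint(EXPR_ALTERNATIVES))  # False
```

When the check succeeds, one lookahead token is enough to select an alternative; an ε-alternative is then used as the default when the lookahead matches none of the others.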
Left Recursion
Left recursion => parser loops forever
A -> A α | β
expr -> expr + term | term
Rewriting...
A -> β R
R -> α R | ε
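The rewriting rule can be applied mechanically. The Python sketch below handles immediate left recursion only, and its details are assumptions for illustration: right sides are lists of symbols, the new nonterminal's name is passed in by the caller, and the function name `eliminate_left_recursion` is hypothetical.

```python
EPSILON = "ε"

def eliminate_left_recursion(head, bodies, new_name):
    """Rewrite  A -> A α | β  as  A -> β R,  R -> α R | ε.

    Handles immediate left recursion only. `bodies` is the list of
    right sides of `head`; `new_name` is the name chosen for R.
    """
    # α parts: right sides that start with the head itself, minus the head.
    alphas = [body[1:] for body in bodies if body and body[0] == head]
    # β parts: every other right side.
    betas = [body for body in bodies if not body or body[0] != head]
    if not alphas:
        return {head: bodies}  # nothing to rewrite
    return {
        head: [beta + [new_name] for beta in betas],
        new_name: [alpha + [new_name] for alpha in alphas] + [[EPSILON]],
    }

REWRITTEN = eliminate_left_recursion(
    "expr",
    [["expr", "+", "term"], ["expr", "-", "term"], ["term"]],
    "rest",
)
for head, bodies in REWRITTEN.items():
    print(head, "->", " | ".join(" ".join(body) for body in bodies))
```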
Example
expr -> expr + term
expr -> expr - term
expr -> term
term -> 0 | 1 | 2 | ... | 9
Rewritten without left recursion:
expr -> term rest
rest -> + term rest
      | - term rest
      | ε
term -> 0 | 1 | 2 | ... | 9
Semantic Actions
expr -> term rest
rest -> + term {print('+');} rest
      | - term {print('-');} rest
      | ε
term -> 0 {print('0');}
      | 1 {print('1');}
      ...
procedure rest;
begin
  if lookahead = '+' then begin
    match('+'); term(); print('+'); rest();
  end
  else if lookahead = '-' then begin
    match('-'); term(); print('-'); rest();
  end;
  { otherwise rest -> ε: consume nothing }
end;

procedure expr;
begin
  term(); rest();
end;
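Put together with a procedure for term, these routines form a complete infix-to-postfix translator. Here is a runnable Python sketch of that idea. The slides do not show term, the tokenizer, or error handling, so the following parts are assumptions: each non-blank character is one token, `"$"` marks end of input, output is collected in a list instead of printed, and the names `PostfixTranslator`, `TranslationError`, and `to_postfix` are hypothetical.

```python
class TranslationError(Exception):
    """Raised when the input is not a valid expression."""

class PostfixTranslator:
    def __init__(self, text):
        # One token per non-blank character; "$" marks end of input (assumption).
        self.tokens = [c for c in text if not c.isspace()] + ["$"]
        self.pos = 0
        self.out = []  # emitted symbols, in order

    @property
    def lookahead(self):
        return self.tokens[self.pos]

    def match(self, t):
        if self.lookahead != t:
            raise TranslationError(f"expected {t!r}, found {self.lookahead!r}")
        self.pos += 1

    def expr(self):
        # expr -> term rest
        self.term()
        self.rest()

    def rest(self):
        # rest -> + term {print('+')} rest | - term {print('-')} rest | ε
        if self.lookahead in ("+", "-"):
            op = self.lookahead
            self.match(op)
            self.term()
            self.out.append(op)
            self.rest()
        # otherwise: the ε alternative, consume nothing

    def term(self):
        # term -> 0 {print('0')} | ... | 9 {print('9')}
        if not self.lookahead.isdigit():
            raise TranslationError(f"digit expected, found {self.lookahead!r}")
        d = self.lookahead
        self.match(d)
        self.out.append(d)

def to_postfix(text):
    translator = PostfixTranslator(text)
    translator.expr()
    translator.match("$")  # the whole input must be consumed
    return " ".join(translator.out)

print(to_postfix("9 - 5 + 2"))  # 9 5 - 2 +
```

Because rest calls itself as its last step, the recursion could be replaced by a loop without changing the output.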