Copyright © 2009 Elsevier Chapter 2 :: Programming Language Syntax Programming Language Pragmatics...
Copyright © 2009 Elsevier
Chapter 2 :: Programming Language Syntax
Programming Language Pragmatics
Michael L. Scott
Parsing: recap
• There are large classes of grammars for which we can build parsers that run in linear time
  – The two most important classes are called LL and LR
• LL stands for 'Left-to-right, Leftmost derivation'
• LR stands for 'Left-to-right, Rightmost derivation'
Parsing
• LL parsers are also called 'top-down', or 'predictive' parsers & LR parsers are also called 'bottom-up', or 'shift-reduce' parsers
• There are several important sub-classes of LR parsers
  – SLR
  – LALR
• (We won't be going into detail on the differences between them.)
Parsing
• You commonly see LL or LR (or whatever) written with a number in parentheses after it
  – This number indicates how many tokens of look-ahead are required in order to parse
  – Almost all real compilers use one token of look-ahead
• The expression grammar (with precedence and associativity) you saw before is LR(1), but not LL(1)
Parsing
• Every LL(1) grammar is also LR(1), though right recursion in productions tends to require very deep stacks and complicates semantic analysis
• Every CFL that can be parsed deterministically has an SLR(1) grammar (which is LR(1))
• Every deterministic CFL with the prefix property (no valid string is a prefix of another valid string) has an LR(0) grammar
LL Parsing
• Here is an LL(1) grammar that we saw last time in class (based on Fig 2.15 in book):
1. program → stmt_list $$$
2. stmt_list → stmt stmt_list
3. | ε
4. stmt → id := expr
5. | read id
6. | write expr
7. expr → term term_tail
8. term_tail → add_op term term_tail
9. | ε
LL Parsing
• LL(1) grammar (continued)
10. term → factor fact_tail
11. fact_tail → mult_op factor fact_tail
12. | ε
13. factor → ( expr )
14. | id
15. | number
16. add_op → +
17. | -
18. mult_op → *
19. | /
LL Parsing
• Like the bottom-up grammar, this one captures associativity and precedence, but most people don't find it as pretty
  – for one thing, the operands of a given operator aren't in a RHS together!
  – however, the simplicity of the parsing algorithm makes up for this weakness
• How do we parse a string with this grammar?
  – by building the parse tree incrementally
LL Parsing
• Example (average program)
read A
read B
sum := A + B
write sum
write sum / 2
• We start at the top and predict needed productions on the basis of the current left-most non-terminal in the tree and the current input token
LL Parsing
• Parse tree for the average program (Figure 2.17)
LL Parsing: actual implementation
• Table-driven LL parsing: you have a big loop in which you repeatedly look up an action in a two-dimensional table based on current leftmost non-terminal and current input token. The actions are
(1) match a terminal
(2) predict a production
(3) announce a syntax error
LL Parsing
• LL(1) parse table for the calculator language
LL Parsing
• To keep track of the left-most non-terminal, you push the as-yet-unseen portions of productions onto a stack
  – for details see Figure 2.20
• The key thing to keep in mind is that the stack contains all the stuff you expect to see between now and the end of the program – what you predict you will see
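The driver loop described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it covers just the expression part of the grammar (starting from expr), the table was filled in by hand for that sub-grammar, and the names `TABLE`, `add`, and `ll1_parse` are made up for this sketch rather than taken from the book.

```python
# Hand-built LL(1) table for the expression sub-grammar (an assumption of
# this sketch). Keys: (non-terminal, lookahead token); values: RHS to predict.
TABLE = {}

def add(nonterminal, lookaheads, rhs):
    for tok in lookaheads:
        TABLE[(nonterminal, tok)] = rhs

STARTERS = ["id", "number", "("]
add("expr", STARTERS, ["term", "term_tail"])
add("term", STARTERS, ["factor", "fact_tail"])
add("term_tail", ["+", "-"], ["add_op", "term", "term_tail"])
add("term_tail", [")", "$$"], [])                     # epsilon production
add("fact_tail", ["*", "/"], ["mult_op", "factor", "fact_tail"])
add("fact_tail", ["+", "-", ")", "$$"], [])           # epsilon production
add("factor", ["("], ["(", "expr", ")"])
add("factor", ["id"], ["id"])
add("factor", ["number"], ["number"])
add("add_op", ["+"], ["+"])
add("add_op", ["-"], ["-"])
add("mult_op", ["*"], ["*"])
add("mult_op", ["/"], ["/"])

NONTERMINALS = {nt for nt, _ in TABLE}

def ll1_parse(tokens):
    """Return True if the token list is a valid expression, else False."""
    tokens = list(tokens) + ["$$"]
    stack = ["$$", "expr"]        # top of stack is the end of the list
    pos = 0
    while stack:
        top = stack.pop()
        look = tokens[pos]
        if top in NONTERMINALS:
            rhs = TABLE.get((top, look))
            if rhs is None:
                return False      # announce a syntax error
            stack.extend(reversed(rhs))   # predict a production
        elif top == look:
            pos += 1              # match a terminal
        else:
            return False          # announce a syntax error
    return pos == len(tokens)
```

At every step the stack holds exactly what the parser still expects to see, so a call such as `ll1_parse(["id", "+", "number"])` succeeds while `ll1_parse(["id", "+"])` fails when the table has no entry for (term, $$).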
LL Parsing: when it isn’t LL
• Problems trying to make a grammar LL(1)
  – left recursion
    • example:
      id_list → id | id_list , id
      equivalently
      id_list → id id_list_tail
      id_list_tail → , id id_list_tail
                   | epsilon
    • we can get rid of all left recursion mechanically in any grammar
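As a rough illustration of what "mechanically" means here, the sketch below rewrites A → A α | β as A → β A_tail, A_tail → α A_tail | ε. It handles only immediate left recursion in a single non-terminal (the general procedure also has to deal with indirect recursion), and the function name and the list-of-lists grammar encoding are assumptions made for this example.

```python
def remove_immediate_left_recursion(nonterminal, alternatives):
    """Rewrite A -> A alpha | beta as A -> beta A_tail, A_tail -> alpha A_tail | epsilon.

    Each alternative is a list of symbols; the empty list stands for epsilon.
    Returns a dict mapping non-terminals to their new alternatives.
    """
    tail = nonterminal + "_tail"
    # alpha parts: what follows the left-recursive occurrence of A
    recursive = [alt[1:] for alt in alternatives if alt and alt[0] == nonterminal]
    # beta parts: alternatives that do not start with A
    other = [alt for alt in alternatives if not alt or alt[0] != nonterminal]
    if not recursive:
        return {nonterminal: alternatives}   # nothing to do
    return {
        nonterminal: [alt + [tail] for alt in other],
        tail: [alt + [tail] for alt in recursive] + [[]],
    }

new_rules = remove_immediate_left_recursion(
    "id_list", [["id"], ["id_list", ",", "id"]]
)
```

Applied to the id_list example, `new_rules` contains id_list → id id_list_tail and id_list_tail → , id id_list_tail | ε, which matches the hand-written version above.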
LL Parsing
• Problems trying to make a grammar LL(1)
  – common prefixes: another thing that LL parsers can't handle
    • solved by "left-factoring"
    • example:
      stmt → id := expr | id ( arg_list )
      equivalently
      stmt → id id_stmt_tail
      id_stmt_tail → := expr
                   | ( arg_list )
    • we can eliminate common prefixes mechanically
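Left-factoring can be sketched the same way. The version below is deliberately minimal: it factors out only a single shared leading symbol, and the function name, the grammar encoding, and the way the new non-terminal is named are all assumptions chosen so that the output lines up with the stmt example.

```python
def left_factor(nonterminal, alternatives):
    """Factor out a shared first symbol among the alternatives of one non-terminal.

    Each alternative is a list of symbols. Only one leading symbol is factored;
    a fuller version would find the longest common prefix and repeat.
    """
    groups = {}
    for alt in alternatives:
        first = alt[0] if alt else None
        groups.setdefault(first, []).append(alt)

    result = {nonterminal: []}
    for first, alts in groups.items():
        if first is None or len(alts) == 1:
            result[nonterminal].extend(alts)        # no common prefix here
        else:
            tail = f"{first}_{nonterminal}_tail"    # hypothetical naming scheme
            result[nonterminal].append([first, tail])
            result[tail] = [alt[1:] for alt in alts]
    return result

factored = left_factor(
    "stmt", [["id", ":=", "expr"], ["id", "(", "arg_list", ")"]]
)
```

For the example above this yields stmt → id id_stmt_tail and id_stmt_tail → := expr | ( arg_list ).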
LL Parsing
• Note that eliminating left recursion and common prefixes does NOT make a grammar LL
  – there are infinitely many non-LL LANGUAGES, and the mechanical transformations work on their grammars just fine (without producing an LL grammar)
  – the few that arise in practice, however, can generally be handled with kludges
LL Parsing
• Problems trying to make a grammar LL(1)
  – the "dangling else" problem prevents grammars from being LL(1) (or in fact LL(k) for any k)
  – the following natural grammar fragment is ambiguous (from Pascal)
    stmt → if cond then_clause else_clause
         | other_stuff
    then_clause → then stmt
    else_clause → else stmt
                | epsilon
LL Parsing
• The following less natural grammar fragment can be parsed bottom-up (so LR) but not top-down (so not LL)
stmt → balanced_stmt | unbalanced_stmt
balanced_stmt → if cond then balanced_stmt else balanced_stmt
              | other_stuff
unbalanced_stmt → if cond then stmt
                | if cond then balanced_stmt else unbalanced_stmt
LL Parsing
• The usual approach, whether top-down OR bottom-up, is to use the ambiguous grammar together with a disambiguating rule that says
  – else goes with the closest then, or
  – more generally, the first of two possible productions is the one to predict (or reduce)
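A recursive-descent parser gets the "closest then" rule almost for free: after parsing the then-branch, it greedily takes an else if one is next. The sketch below assumes a toy token stream in which any token other than if, then, and else stands for a condition or for other_stuff; the function name and the tuple-based tree are illustrative choices, not the book's.

```python
def parse_stmt(tokens, pos=0):
    """Parse one statement starting at pos; return (tree, next_pos).

    An if-statement becomes ("if", cond, then_tree, else_tree_or_None);
    any other token is treated as an atomic statement (other_stuff).
    """
    if pos >= len(tokens):
        raise SyntaxError("unexpected end of input")
    if tokens[pos] != "if":
        return tokens[pos], pos + 1
    if pos + 2 >= len(tokens) or tokens[pos + 2] != "then":
        raise SyntaxError("expected 'then' after condition")
    cond = tokens[pos + 1]
    then_part, pos = parse_stmt(tokens, pos + 3)
    else_part = None
    # Disambiguating rule: if an else is next, the innermost open 'if'
    # (the one being parsed right now) claims it.
    if pos < len(tokens) and tokens[pos] == "else":
        else_part, pos = parse_stmt(tokens, pos + 1)
    return ("if", cond, then_part, else_part), pos

tree, _ = parse_stmt("if c1 then if c2 then s1 else s2".split())
```

Here `tree` is `("if", "c1", ("if", "c2", "s1", "s2"), None)`: the else has attached to the inner if, as the rule requires.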
LL Parsing
• Better yet, languages (since Pascal) generally employ explicit end-markers, which eliminate this problem
• In Modula-2, for example, one says:
if A = B then
    if C = D then E := F end
else
    G := H
end
• Ada says 'end if'; other languages say 'fi'
LL Parsing
• One problem with end markers is that they tend to bunch up. In Pascal you say
if A = B then …
else if A = C then …
else if A = D then …
else if A = E then …
else ...;
• With end markers this becomes
if A = B then …
else if A = C then …
else if A = D then …
else if A = E then …
else ...;
end; end; end; end;
LL Parsing
• The algorithm to build predict sets is tedious (for a "real" sized grammar), but relatively simple
• It consists of three stages:
  – (1) compute FIRST sets for symbols
  – (2) compute FOLLOW sets for non-terminals (this requires computing FIRST sets for some strings)
  – (3) compute predict sets or table for all productions
LL Parsing
• It is conventional in general discussions of grammars to use
  – lower case letters near the beginning of the alphabet for terminals
  – lower case letters near the end of the alphabet for strings of terminals
  – upper case letters near the beginning of the alphabet for non-terminals
  – upper case letters near the end of the alphabet for arbitrary symbols
  – Greek letters for arbitrary strings of symbols
LL Parsing
• Algorithm First/Follow/Predict:
  – FIRST(α) == {a : α ⇒* a β} ∪ (if α ⇒* ε then {ε} else ∅)
  – FOLLOW(A) == {a : S ⇒+ α A a β} ∪ (if S ⇒* α A then {ε} else ∅)
  – PREDICT(A → X1 ... Xm) == (FIRST(X1 ... Xm) - {ε}) ∪ (if X1 ... Xm ⇒* ε then FOLLOW(A) else ∅)
• Details following…
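The three definitions translate fairly directly into a fixed-point computation. The following is a sketch, not the book's algorithm verbatim: it uses only the expression part of the grammar with expr as the start symbol, it marks end of input with an explicit "$$" token in FOLLOW instead of ε, and all identifiers (`GRAMMAR`, `first_of_string`, and so on) are names invented for this example.

```python
EPS = "ε"
END = "$$"   # assumption: end of input is an explicit token in FOLLOW sets

# Expression sub-grammar; each alternative is a list of symbols, [] is epsilon.
GRAMMAR = {
    "expr": [["term", "term_tail"]],
    "term_tail": [["add_op", "term", "term_tail"], []],
    "term": [["factor", "fact_tail"]],
    "fact_tail": [["mult_op", "factor", "fact_tail"], []],
    "factor": [["(", "expr", ")"], ["id"], ["number"]],
    "add_op": [["+"], ["-"]],
    "mult_op": [["*"], ["/"]],
}

def first_of_string(symbols, first):
    """FIRST of a string of symbols, given FIRST sets for the non-terminals."""
    result = set()
    for sym in symbols:
        sym_first = first[sym] if sym in first else {sym}   # terminal: itself
        result |= sym_first - {EPS}
        if EPS not in sym_first:
            return result
    result.add(EPS)          # every symbol in the string can derive epsilon
    return result

def compute_first(grammar):
    first = {nt: set() for nt in grammar}
    changed = True
    while changed:           # iterate until no set grows
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                new = first_of_string(alt, first)
                if not new <= first[nt]:
                    first[nt] |= new
                    changed = True
    return first

def compute_follow(grammar, first, start):
    follow = {nt: set() for nt in grammar}
    follow[start].add(END)
    changed = True
    while changed:
        changed = False
        for nt, alts in grammar.items():
            for alt in alts:
                for i, sym in enumerate(alt):
                    if sym not in grammar:
                        continue                 # only non-terminals have FOLLOW
                    rest = first_of_string(alt[i + 1:], first)
                    new = rest - {EPS}
                    if EPS in rest:              # what follows can vanish
                        new |= follow[nt]
                    if not new <= follow[sym]:
                        follow[sym] |= new
                        changed = True
    return follow

def predict(nt, alt, first, follow):
    f = first_of_string(alt, first)
    return (f - {EPS}) | (follow[nt] if EPS in f else set())

FIRST = compute_first(GRAMMAR)
FOLLOW = compute_follow(GRAMMAR, FIRST, "expr")
```

For instance, `predict("term_tail", [], FIRST, FOLLOW)` comes out as the set containing `)` and `$$`, which is what lets the parser choose the ε production for term_tail when it sees a closing parenthesis or end of input.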
LL Parsing
• If any token belongs to the predict set of more than one production with the same LHS, then the grammar is not LL(1)
• A conflict can arise because
  – the same token can begin more than one RHS
  – it can begin one RHS and can also appear after the LHS in some valid program, and one possible RHS is ε
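The LL(1) test itself is a pairwise intersection of predict sets. The sketch below assumes the predict sets have already been computed and are supplied as a nested dictionary; the helper name `ll1_conflicts`, the string encoding of right-hand sides, and the `;` used as the token that follows an id_list are all assumptions for illustration.

```python
from itertools import combinations

def ll1_conflicts(predict_sets):
    """Find tokens that predict more than one production with the same LHS.

    predict_sets maps each LHS to a dict {rhs_text: set_of_tokens}.
    Returns a list of (lhs, rhs1, rhs2, shared_tokens) tuples.
    """
    conflicts = []
    for lhs, by_rhs in predict_sets.items():
        for (rhs1, set1), (rhs2, set2) in combinations(by_rhs.items(), 2):
            shared = set1 & set2
            if shared:
                conflicts.append((lhs, rhs1, rhs2, shared))
    return conflicts

# Left-recursive version: both productions are predicted by 'id'.
left_recursive = {"id_list": {"id": {"id"}, "id_list , id": {"id"}}}

# Transformed version, assuming ';' is what can follow an id_list.
transformed = {
    "id_list": {"id id_list_tail": {"id"}},
    "id_list_tail": {", id id_list_tail": {","}, "ε": {";"}},
}
```

`ll1_conflicts(left_recursive)` reports one conflict on `id`, while `ll1_conflicts(transformed)` returns an empty list.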
LR Parsing
• LR parsers are almost always table-driven:
  – like a table-driven LL parser, an LR parser uses a big loop in which it repeatedly inspects a two-dimensional table to find out what action to take
  – unlike the LL parser, however, the LR driver has non-trivial state (like a DFA), and the table is indexed by current input token and current state
  – the stack contains a record of what has been seen SO FAR (NOT what is expected)
LR Parsing
• A scanner is a DFA
  – it can be specified with a state diagram
• An LL or LR parser is a pushdown automaton, or PDA
  – a PDA can be specified with a state diagram and a stack
    • the state diagram looks just like a DFA state diagram, except the arcs are labeled with <input symbol, top-of-stack symbol> pairs, and in addition to moving to a new state the PDA has the option of pushing or popping a finite number of symbols onto/off the stack
LR Parsing
• An LL(1) PDA has only one state!
  – well, actually two; it needs a second one to accept with, but that's all
  – all the arcs are self loops; the only difference between them is the choice of whether to push or pop
  – the final state is reached by a transition that sees EOF on the input and the stack
LR Parsing
• An LR (or SLR/LALR) PDA has multiple states
  – it is a "recognizer," not a "predictor"
  – it builds a parse tree from the bottom up
  – the states keep track of which productions we might be in the middle of
• Parsing with the Characteristic Finite State Machine (CFSM) is based on two actions
  – Shift
  – Reduce
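To make shift and reduce concrete, here is a small table-driven driver. It does not use the calculator grammar; instead it assumes a deliberately tiny grammar, E → E + id | id, whose five-state table can be written out by hand. The table contents and the names `ACTION`, `GOTO`, and `lr_parse` are part of this sketch, not of the book.

```python
# Productions of the toy grammar: number -> (LHS, length of RHS)
PRODUCTIONS = {
    1: ("E", 3),   # E → E + id
    2: ("E", 1),   # E → id
}

# Hand-built SLR(1) table for the toy grammar (an assumption of this sketch).
ACTION = {
    (0, "id"): ("shift", 1),
    (1, "+"): ("reduce", 2), (1, "$$"): ("reduce", 2),
    (2, "+"): ("shift", 3),  (2, "$$"): ("accept", None),
    (3, "id"): ("shift", 4),
    (4, "+"): ("reduce", 1), (4, "$$"): ("reduce", 1),
}
GOTO = {(0, "E"): 2}

def lr_parse(tokens):
    """Return the list of productions used (in reduction order), or None on error."""
    tokens = list(tokens) + ["$$"]
    stack = [0]                  # stack of states: a record of what has been seen
    pos = 0
    reductions = []
    while True:
        action = ACTION.get((stack[-1], tokens[pos]))
        if action is None:
            return None          # syntax error
        kind, arg = action
        if kind == "shift":
            stack.append(arg)    # consume the token and enter the new state
            pos += 1
        elif kind == "reduce":
            lhs, length = PRODUCTIONS[arg]
            del stack[-length:]  # pop one state per RHS symbol
            stack.append(GOTO[(stack[-1], lhs)])
            reductions.append(arg)
        else:
            return reductions    # accept
```

On `["id", "+", "id", "+", "id"]` the driver returns `[2, 1, 1]`: it first recognizes the leftmost id as an E, then repeatedly folds `E + id` into E, which is the bottom-up (and left-associative) construction of the tree.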
LR Parsing
• To illustrate LR parsing, consider the grammar (from Figure 2.24):
1. program → stmt_list $$$
2. stmt_list → stmt_list stmt
3. | stmt
4. stmt → id := expr
5. | read id
6. | write expr
7. expr → term
8. | expr add_op term
LR Parsing
• LR grammar (continued):
9. term → factor
10. | term mult_op factor
11. factor → ( expr )
12. | id
13. | number
14. add_op → +
15. | -
16. mult_op → *
17. | /
LR Parsing
• This grammar is SLR(1), a particularly nice class of bottom-up grammar
  – it isn't exactly what we saw originally
  – we've eliminated the epsilon production to simplify the presentation
• When parsing, we mark the current position within a production with a ".", and use a similar sort of table to decide which state to go to next
Syntax Errors
• When parsing a program, the parser will often detect a syntax error
  – Generally when the next token/input doesn't form a valid possible transition.
• What should we do?
  – Halt and find closest rule that does match.
  – Recover and continue parsing if possible.
• Most compilers don't just halt; this would mean ignoring all code past the error.
  – Instead, goal is to find and report as many errors as possible.
Syntax Errors: approaches
• Method 1: Panic mode:
• Define a small set of "safe symbols".
  – In C++, start from just after next semicolon
  – In Python, jump to next newline and continue
• When an error occurs, the compiler jumps back to the last safe symbol, and tries to compile from the next safe symbol on.
  – (Ever notice that errors often point to the line before or after the actual error?)
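Panic mode is easy to sketch for a toy language. The example below assumes a program is a sequence of statements of the form `id := number ;`, uses `;` as the only safe symbol, and works on token kinds rather than real source text; `parse_assignment` and `parse_program` are hypothetical helpers written for this illustration.

```python
SAFE_SYMBOL = ";"

def parse_assignment(tokens, pos):
    """Parse `id := number ;` starting at pos and return the next position."""
    expected = ["id", ":=", "number", ";"]
    for offset, want in enumerate(expected):
        i = pos + offset
        got = tokens[i] if i < len(tokens) else "<eof>"
        if got != want:
            raise SyntaxError(f"token {i}: expected {want!r}, got {got!r}")
    return pos + len(expected)

def parse_program(tokens):
    """Parse a whole token list, collecting every error instead of halting."""
    errors = []
    pos = 0
    while pos < len(tokens):
        try:
            pos = parse_assignment(tokens, pos)
        except SyntaxError as exc:
            errors.append(str(exc))
            # Panic mode: discard tokens up to and including the next safe
            # symbol, then resume with the following statement.
            while pos < len(tokens) and tokens[pos] != SAFE_SYMBOL:
                pos += 1
            pos += 1
    return errors

program = "id := number ; id number ; id := ; id := number ;".split()
errors = parse_program(program)
```

With this input the parser reports two errors (the missing `:=` and the missing `number`) and still checks the final, correct statement, which is the point of recovering instead of halting.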
Syntax Errors: approaches
• Method 2: Phase-level recovery
  – Refine panic mode with different safe symbols for different states
  – Ex: expression -> ), statement -> ;
• Method 3: Context specific look-ahead:
  – Improves on 2 by checking various contexts in which the production might appear in a parse tree
  – Improves error messages, but costs in terms of speed and complexity
Beyond Parsing: Ch. 4
• We also need to define rules to connect the productions to actual operations or concepts.
• Example grammar:
E → E + T
E → E – T
E → T
T → T * F
T → T / F
T → F
F → - F
• Question: Is it LL or LR?
Attribute Grammars
• We can turn this into an attribute grammar as follows (similar to Figure 4.1):
E1 → E2 + T     E1.val = E2.val + T.val
E1 → E2 – T     E1.val = E2.val - T.val
E → T           E.val = T.val
T1 → T2 * F     T1.val = T2.val * F.val
T1 → T2 / F     T1.val = T2.val / F.val
T → F           T.val = F.val
F1 → - F2       F1.val = - F2.val
F → ( E )       F.val = E.val
F → const       F.val = const.val
Attribute Grammars
• The attribute grammar serves to define the semantics of the input program
• Attribute rules are best thought of as definitions, not assignments
• They are not necessarily meant to be evaluated at any particular time, or in any particular order, though they do define their left-hand side in terms of the right-hand side
Evaluating Attributes
• The process of evaluating attributes is called annotation, or DECORATION, of the parse tree [see next slide]
  – When a parse tree under this grammar is fully decorated, the value of the expression will be in the val attribute of the root
• The code fragments for the rules are called SEMANTIC FUNCTIONS
  – Strictly speaking, they should be cast as functions, e.g., E1.val = sum (E2.val, T.val), cf., Figure 4.1
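Decoration with purely synthesized attributes amounts to a bottom-up walk in which each node's val is computed from its children's. The sketch below makes several simplifying assumptions: it works on a condensed tree of nested tuples (operators and operands only) instead of the full parse tree, constants carry their own value as the scanner would provide it, and the semantic functions are taken from Python's `operator` module.

```python
import operator

# Semantic functions for the binary operators, cast as real functions.
SEMANTIC_FUNCTIONS = {
    "+": operator.add,
    "-": operator.sub,
    "*": operator.mul,
    "/": operator.truediv,
}

def val(node):
    """Compute the synthesized val attribute of a tree node.

    A node is a number (const), ("-", child) for F → - F,
    or (op, left, right) for the binary productions.
    """
    if isinstance(node, (int, float)):
        return node                       # F → const: value from the scanner
    if len(node) == 2:
        _, child = node
        return -val(child)                # F1.val = - F2.val
    op, left, right = node
    return SEMANTIC_FUNCTIONS[op](val(left), val(right))

root_value = val(("*", ("+", 1, 3), 2))   # tree for (1+3)*2
```

Because every rule depends only on attributes of children, any evaluation order that visits children before parents works; the recursion here is just the simplest such order.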
Evaluating Attributes
• This is a very simple attribute grammar:
  – Each symbol has at most one attribute
    • the punctuation marks have no attributes
• These attributes are all so-called SYNTHESIZED attributes:
  – They are calculated only from the attributes of things below them in the parse tree
Evaluating Attributes
• In general, we are allowed both synthesized and INHERITED attributes:
  – Inherited attributes may depend on things above or to the side of them in the parse tree
  – Tokens have only synthesized attributes, initialized by the scanner (name of an identifier, value of a constant, etc.).
  – Inherited attributes of the start symbol constitute run-time parameters of the compiler
Evaluating Attributes
• The grammar above is called S-ATTRIBUTED because it uses only synthesized attributes
• Its ATTRIBUTE FLOW (attribute dependence graph) is purely bottom-up
  – It is SLR(1), but not LL(1)
• An equivalent LL(1) grammar requires inherited attributes:
Evaluating Attributes – Example
• Attribute grammar in Figure 4.3:
E → T TT            E.v = TT.v
                    TT.st = T.v
TT1 → + T TT2       TT1.v = TT2.v
                    TT2.st = TT1.st + T.v
TT1 → - T TT2       TT1.v = TT2.v
                    TT2.st = TT1.st - T.v
TT → ε              TT.v = TT.st
T → F FT            T.v = FT.v
                    FT.st = F.v
Evaluating Attributes– Example
• Attribute grammar in Figure 4.3 (continued):
FT1 → * F FT2       FT1.v = FT2.v
                    FT2.st = FT1.st * F.v
FT1 → / F FT2       FT1.v = FT2.v
                    FT2.st = FT1.st / F.v
FT → ε              FT.v = FT.st
F1 → - F2           F1.v = - F2.v
F → ( E )           F.v = E.v
F → const           F.v = const.v
• Figure 4.4 – parse tree for (1+3)*2
Evaluating Attributes– Example
• Attribute grammar in Figure 4.3:
  – This attribute grammar is a good bit messier than the first one, but it is still L-ATTRIBUTED, which means that the attributes can be evaluated in a single left-to-right pass over the input
  – In fact, they can be evaluated during an LL parse
  – Each synthesized attribute of a LHS symbol (by definition of synthesized) depends only on attributes of its RHS symbols
Evaluating Attributes – Example
• Attribute grammar in Figure 4.3:
  – Each inherited attribute of a RHS symbol (by definition of L-attributed) depends only on
    • inherited attributes of the LHS symbol, or
    • synthesized or inherited attributes of symbols to its left in the RHS
  – L-attributed grammars are the most general class of attribute grammars that can be evaluated during an LL parse
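One way to see why L-attributed grammars fit an LL parse is to write the Figure 4.3 rules as a recursive-descent evaluator, with each inherited attribute (st) passed down as a parameter and each synthesized attribute (v) handed back as a return value. The sketch below assumes the input is already a list of tokens in which constants are Python numbers; the function names simply mirror the non-terminals.

```python
def evaluate(tokens):
    """Evaluate an expression during a recursive-descent (LL) parse."""
    tokens = list(tokens) + ["$$"]
    pos = 0

    def peek():
        return tokens[pos]

    def advance():
        nonlocal pos
        tok = tokens[pos]
        pos += 1
        return tok

    def E():                       # E → T TT
        return TT(T())             # TT.st = T.v ; E.v = TT.v

    def TT(st):                    # st is the inherited attribute
        if peek() == "+":
            advance()
            return TT(st + T())    # TT2.st = TT1.st + T.v
        if peek() == "-":
            advance()
            return TT(st - T())    # TT2.st = TT1.st - T.v
        return st                  # TT → ε : TT.v = TT.st

    def T():                       # T → F FT
        return FT(F())             # FT.st = F.v ; T.v = FT.v

    def FT(st):
        if peek() == "*":
            advance()
            return FT(st * F())    # FT2.st = FT1.st * F.v
        if peek() == "/":
            advance()
            return FT(st / F())    # FT2.st = FT1.st / F.v
        return st                  # FT → ε : FT.v = FT.st

    def F():
        tok = advance()
        if tok == "-":
            return -F()            # F1.v = - F2.v
        if tok == "(":
            v = E()                # F.v = E.v
            if advance() != ")":
                raise SyntaxError("expected ')'")
            return v
        if isinstance(tok, (int, float)):
            return tok             # F.v = const.v
        raise SyntaxError(f"unexpected token {tok!r}")

    result = E()
    if peek() != "$$":
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result

value = evaluate(["(", 1, "+", 3, ")", "*", 2])
```

Passing st down the chain of TT calls is what restores left associativity: `evaluate([10, "-", 4, "-", 3])` computes (10 - 4) - 3 even though the grammar itself is right-recursive.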
Evaluating Attributes
• There are certain tasks, such as generation of code for short-circuit Boolean expression evaluation, that are easiest to express with non-L-attributed attribute grammars
• Because of the potential cost of complex traversal schemes, however, most real-world compilers insist that the grammar be L-attributed