Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design...
CSCI 565 - Compiler Design, Spring 2016
Pedro Diniz, pedro@isi.edu
Lexical Analysis
Regular Expressions & DFA
Copyright 2016, Pedro C. Diniz, all rights reserved. Students enrolled in the Compilers class at the University of Southern California have explicit permission to make copies of these materials for their personal use.
Outline
• What is a Lexical Analyzer?
• Regular Expressions
• Matching regular expressions using Nondeterministic Finite Automata (NFA)
• Transforming an NFA to a DFA
What is a Lexical Analyzer?
• Examples of tokens
– Operators: = + - > ( { := == <>
– Keywords: if while for int double
– Numeric literals: 43 5.65 -3.6e10 0x13F3A
– Character literals: ‘a’ ‘~’ ‘\’’
– String literals: “565” “Fall 10” “\”\” = empty”
• Examples of non-tokens
– White space: space (‘ ’) tab (‘\t’) end-of-line (‘\n’)
– Comments: /*this is not a token*/
[Figure: Source Program Text → Tokens]
Lexical Analyzer in Action
Input characters: for var1 = 10 var1 <=
Token stream: for_key ID(“var1”) eq_op Num(10) ID(“var1”) leq_op
Lexical Analyzer Needs To...
• Partition Input Program Text into Subsequences of Characters Corresponding to Tokens
• Attach the Corresponding Attributes to the Tokens
• Eliminate White Space and Comments
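The three duties listed above can be sketched with Python's `re` module. This is only an illustration, not the course's lexer: the token names follow the slides' example (`for_key`, `ID`, `Num`, `eq_op`, `leq_op`), while `TOKEN_SPEC`, `MASTER` and `tokenize` are hypothetical names chosen here, and the token set is deliberately tiny.

```python
import re

# Hypothetical token table; order matters because the first alternative that matches wins.
TOKEN_SPEC = [
    ("COMMENT", r"/\*.*?\*/"),     # non-token: C-style comment
    ("WS",      r"\s+"),           # non-token: spaces, tabs, newlines
    ("for_key", r"for\b"),         # keyword, tried before identifiers
    ("Num",     r"\d+"),
    ("ID",      r"[A-Za-z_]\w*"),
    ("leq_op",  r"<="),            # tried before the single-character "="
    ("eq_op",   r"="),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pat})" for name, pat in TOKEN_SPEC), re.DOTALL)

def tokenize(text):
    """Partition text into (token_type, attribute) pairs, dropping white space and comments."""
    tokens = []
    pos = 0
    while pos < len(text):
        m = MASTER.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {text[pos]!r} at offset {pos}")
        kind = m.lastgroup
        if kind not in ("WS", "COMMENT"):
            if kind == "Num":
                attr = int(m.group())      # attribute: numeric value
            elif kind == "ID":
                attr = m.group()           # attribute: identifier name
            else:
                attr = None                # keywords and operators carry no attribute
            tokens.append((kind, attr))
        pos = m.end()
    return tokens

print(tokenize("for var1 = 10 var1 <="))
```

On the slides' input this yields the sequence for_key, ID("var1"), eq_op, Num(10), ID("var1"), leq_op.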
Lexical Analyzer Needs To...
• Precisely identify the type of token that matches the input string
– 603 → Num(603)
– CSCI565 → ID(“CSCI565”)
• Precisely describe different types of tokens
– FORTRAN: DO I=1,10
– C++: for(int i=1; i <= 10; i++)
– C-shell: foreach i (1 2 3 4 5 6 7 8 9 10)
• Use Regular Expressions to precisely describe which strings each type of token matches
Examples of Regular Expressions
Regular Expression                  Strings Matched
a                                   “a”
a · b                               “ab”
a | b                               “a” “b”
ε                                   “”
a*                                  “” “a” “aa” “aaa” …
(a | ε) · b                         “ab” “b”
num = 0|1|2|3|4|5|6|7|8|9           “0”, “1”, …
posint = num · num*                 “8” “6035” …
int = (ε | -) · posint              “-42” “1024” …
real = int · (ε | (. · posint))     “-12.56” “12” “1.414” ...
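As a quick check of the named expressions in the table, they can be transcribed into Python `re` syntax. The constant names `NUM`, `POSINT`, `INT`, `REAL` and the helper `matches` are assumptions made for this sketch; the slides only give the abstract notation.

```python
import re

NUM = r"[0-9]"                       # num = 0|1|...|9
POSINT = rf"{NUM}{NUM}*"             # posint = num · num*
INT = rf"(?:-)?{POSINT}"             # int = (ε | -) · posint
REAL = rf"{INT}(?:\.{POSINT})?"      # real = int · (ε | (. · posint))

def matches(pattern, s):
    """True when the whole string s is in the language of pattern."""
    return re.fullmatch(pattern, s) is not None

for s in ["-12.56", "12", "1.414", "12.", ".5"]:
    print(s, matches(REAL, s))
```

Note that `real` as defined requires at least one digit after the dot, so “12.” is rejected by this transcription.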
Definition: Formal Languages
• Alphabet Σ = finite set of symbols
– Σ = { 0, 1, 2, 3, 4, 5, 6, 7, 8, 9 }
• String s = finite sequence of symbols from the alphabet
– s = 6004
• Empty string ε = special string of length zero
• Language L = set of strings over an alphabet
– L = { 6001, 6002, 6003, 6004, 6035, 6891, … }
• For a regular expression r, the language L(r) = { all the strings that match r }
– L((a | ε) · b) = { “ab”, “b” }
• Suppose r and s are regular expressions denoting the languages L(r) and L(s)
– L(r | s) = L(r) ∪ L(s)
– L(r · s) = { xy | x ∈ L(r) and y ∈ L(s) }
– L(r*) = { x1 x2 ... xk | xi ∈ L(r) and k >= 0 }
– L(ε) = { “” }, the set containing only the empty string
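The set definitions above translate almost directly into Python sets of strings. This is a minimal sketch: `union`, `concat` and `star` are names invented here, and because the true Kleene closure is infinite, `star` takes an assumed `max_reps` bound and only enumerates up to that many repetitions.

```python
def union(L1, L2):
    """L(r | s) = L(r) ∪ L(s)"""
    return L1 | L2

def concat(L1, L2):
    """L(r · s) = { xy | x ∈ L(r) and y ∈ L(s) }"""
    return {x + y for x in L1 for y in L2}

def star(L, max_reps):
    """L(r*) truncated to k <= max_reps repetitions."""
    result = {""}            # k = 0 contributes the empty string
    layer = {""}
    for _ in range(max_reps):
        layer = concat(layer, L)
        result |= layer
    return result

EPS = {""}                   # L(ε): only the empty string
A, B = {"a"}, {"b"}

print(concat(union(A, EPS), B))   # the language of (a | ε) · b
print(sorted(star(A, 3)))
```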
More Regular Expressions
• We know:
– L(r | s) is the union of L(r) and L(s)
– L(r · s) is the concatenation of L(r) and L(s)
– L(r*) is the Kleene closure of L(r): “zero or more occurrences of”
• A few additional ones:
– “one or more occurrences of”: r+ = r · r*
– “zero or one occurrence of”: r? = r | ε
Question
• What regular expression best identifies USC course numbers?
num = 0|1|2|3|4|5|6|7|8|9
1) class = num · num*
2) class = num · . · num*
3) class = num | . | num*
4) class = (num · . · num)*
Reg. Expression to NFA Construction
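[Figure: base-case NFAs. The expression a is a single transition labeled a; the expression ε is a single ε-transition. If r and s are regular expressions, their NFAs are drawn as boxes labeled r and s.]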
[Figures: NFA constructions for r · s, r | s, r*, r? and r+, each built from the NFAs for r and s joined by ε-transitions.]
Reg. Expression to NFA Construction
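Since the diagrams did not survive extraction, here is a rough sketch of one common way to build such NFAs (a Thompson-style construction). It is an assumption that the slides use this exact shape; the class `NFABuilder`, its method names, and the helper `accepts` are all hypothetical. Each fragment is a `(start, accept)` pair, and the label `None` stands for ε.

```python
from collections import defaultdict
from itertools import count

class NFABuilder:
    """Builds NFA fragments; each fragment is a (start, accept) pair of state ids."""
    def __init__(self):
        self.edges = defaultdict(list)     # state -> [(label, target)], label None means ε
        self._ids = count(1)

    def _new(self):
        return next(self._ids)

    def symbol(self, a):                   # base case: one edge labeled a (None for ε)
        s, t = self._new(), self._new()
        self.edges[s].append((a, t))
        return s, t

    def concat(self, r, s):                # r · s: ε-edge from accept of r to start of s
        self.edges[r[1]].append((None, s[0]))
        return r[0], s[1]

    def alt(self, r, s):                   # r | s: new start and accept around both
        start, accept = self._new(), self._new()
        for frag in (r, s):
            self.edges[start].append((None, frag[0]))
            self.edges[frag[1]].append((None, accept))
        return start, accept

    def star(self, r):                     # r*: allow skipping r and looping back into it
        start, accept = self._new(), self._new()
        self.edges[start] += [(None, r[0]), (None, accept)]
        self.edges[r[1]] += [(None, r[0]), (None, accept)]
        return start, accept

def accepts(builder, frag, text):
    """Simulate the NFA fragment on text by tracking the set of reachable states."""
    def closure(states):
        seen, stack = set(states), list(states)
        while stack:
            for label, target in builder.edges[stack.pop()]:
                if label is None and target not in seen:
                    seen.add(target)
                    stack.append(target)
        return seen
    current = closure({frag[0]})
    for ch in text:
        current = closure({t for s in current for label, t in builder.edges[s] if label == ch})
    return frag[1] in current

b = NFABuilder()
frag = b.concat(b.alt(b.symbol("a"), b.symbol(None)), b.symbol("b"))   # (a | ε) · b
print(accepts(b, frag, "ab"), accepts(b, frag, "b"), accepts(b, frag, "a"))
```

The derived operators follow from the identities given earlier: r? is `alt(r, symbol(None))`, and r+ is r followed by a fresh copy of r under `star`.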
Construction Example
(-| ε)· (0|1|2|3|4|5|6|7|8|9)+ · (. ·(0|1|2|3|4|5|6|7|8|9)*)?
[Figure sequence: the NFA for this expression is assembled step by step: first the optional sign (- | ε), then the digit loop for (0|1|...|9)+, then the optional fraction, a “.” transition followed by a digit loop for (0|1|...|9)*, all joined by ε-transitions.]
String Matching
[Figure sequence: the slides animate the NFA from the construction example on the input - 1 2 . 8, advancing through the automaton as each character is consumed.]
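The matching animation can be approximated in code by tracking the set of NFA states that are active after each input character. The edge list below is an assumed reconstruction of the 8-state NFA for the construction example (the diagram itself is lost); it was chosen to agree with the closures the slides state later, and `EDGES`, `edge`, `eps_closure` and `nfa_matches` are names made up for this sketch.

```python
DIGITS = "0123456789"
EPS = None   # label used for ε-transitions

# Assumed reconstruction: (source, label, target) triples, start state 1, accept state 8.
EDGES = [(1, "-", 2), (1, EPS, 2),        # (- | ε)
         (3, EPS, 2), (3, EPS, 4),        # loop for (...)+, then move on
         (4, EPS, 8), (4, ".", 5),        # optional fraction: skip it or read "."
         (5, EPS, 6), (7, EPS, 6), (6, EPS, 8)]
EDGES += [(2, d, 3) for d in DIGITS] + [(6, d, 7) for d in DIGITS]

def edge(s, label):
    """States reachable from s over one transition with the given label."""
    return {dst for src, lab, dst in EDGES if src == s and lab == label}

def eps_closure(states):
    closure, stack = set(states), list(states)
    while stack:
        for t in edge(stack.pop(), EPS):
            if t not in closure:
                closure.add(t)
                stack.append(t)
    return closure

def nfa_matches(text, start=1, accept=8):
    current = eps_closure({start})
    for ch in text:
        moved = set()
        for s in current:
            moved |= edge(s, ch)          # consume one character
        current = eps_closure(moved)      # then follow ε-transitions
        print(repr(ch), sorted(current))
    return accept in current

print(nfa_matches("-12.8"))
```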
Implementing a Lexical Analyzer
• Need to find which strings match a Regular Expression
• Create an NFA to match the Regular Expression
• Unfortunately, an NFA does not have a simple implementation: a state may have several possible next states for the same input
• Need to create a Deterministic Finite Automaton (DFA) from the NFA
Constructing a DFA from an NFA
• Why do we need a DFA?
– Easy to implement
– Current state + input symbol uniquely identifies the next state
• How do you construct a DFA from an NFA?
– The DFA keeps track of which states the NFA would be in
– Each state of the DFA is in fact a subset of the states of the NFA
[Figure: an NFA and a DFA side by side; the labels that survive extraction are states 1 to 5, states A and B, and edges labeled a and ε.]
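The idea that each DFA state is a subset of NFA states can be sketched as a worklist algorithm. The NFA here is an assumed 8-state reconstruction for the expression (-| ε)· (0|...|9)+ · (. ·(0|...|9)*)? used elsewhere in the deck, not the small a/ε example in the figure, and all function names are hypothetical.

```python
DIGITS = "0123456789"
EPS = None   # label used for ε-transitions

# Assumed reconstruction of the NFA: start state 1, accept state 8.
EDGES = [(1, "-", 2), (1, EPS, 2), (3, EPS, 2), (3, EPS, 4), (4, EPS, 8),
         (4, ".", 5), (5, EPS, 6), (7, EPS, 6), (6, EPS, 8)]
EDGES += [(2, d, 3) for d in DIGITS] + [(6, d, 7) for d in DIGITS]

def edge(s, label):
    return {dst for src, lab, dst in EDGES if src == s and lab == label}

def eps_closure(S):
    T = set(S)
    while True:
        T_prev = set(T)
        for s in T_prev:
            T |= edge(s, EPS)
        if T == T_prev:
            return T

def dfa_edge(S, c):
    moved = set()
    for s in S:
        moved |= edge(s, c)
    return eps_closure(moved)

def subset_construction(start=1, accept=8, alphabet=DIGITS + "-."):
    start_set = frozenset(eps_closure({start}))
    table, worklist = {}, [start_set]
    while worklist:
        S = worklist.pop()
        if S in table:
            continue
        table[S] = {}
        for c in alphabet:
            T = frozenset(dfa_edge(S, c))
            if T:                          # an empty set means "no transition"
                table[S][c] = T
                worklist.append(T)
    accepting = {S for S in table if accept in S}
    return start_set, table, accepting

def dfa_matches(text):
    state, table, accepting = subset_construction()
    for ch in text:
        state = table[state].get(ch)       # current state + symbol gives one next state
        if state is None:
            return False
    return state in accepting

start_set, table, accepting = subset_construction()
print(len(table), "DFA states;", dfa_matches("-12.8"))
```

Once the table exists, matching is a single dictionary lookup per input character, which is the "easy to implement" property the slide refers to.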
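State ε-closure
• The ε-closure of a state s is the set of states that can be reached from that state without consuming any of the input
– ε-Closure(S) is the smallest set T such that
$$T = S \cup \Big( \bigcup_{s \in T} \mathrm{edge}(s, \varepsilon) \Big)$$
• Algorithm (fixed-point)
$$
\begin{array}{l}
T \leftarrow S \\
\textbf{repeat} \\
\quad T' \leftarrow T \\
\quad T \leftarrow T' \cup \Big( \bigcup_{s \in T'} \mathrm{edge}(s, \varepsilon) \Big) \\
\textbf{until } T = T'
\end{array}
$$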
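The fixed-point loop maps line by line onto a few lines of Python. As before, the edge list is an assumed reconstruction of the deck's 8-state NFA for (-| ε)· (0|...|9)+ · (. ·(0|...|9)*)?, and `edge` / `eps_closure` are illustrative names.

```python
DIGITS = "0123456789"
EPS = None   # label used for ε-transitions

# Assumed reconstruction of the NFA: start state 1, accept state 8.
EDGES = [(1, "-", 2), (1, EPS, 2), (3, EPS, 2), (3, EPS, 4), (4, EPS, 8),
         (4, ".", 5), (5, EPS, 6), (7, EPS, 6), (6, EPS, 8)]
EDGES += [(2, d, 3) for d in DIGITS] + [(6, d, 7) for d in DIGITS]

def edge(s, label):
    """edge(s, c): states reachable from s over one transition labeled c."""
    return {dst for src, lab, dst in EDGES if src == s and lab == label}

def eps_closure(S):
    T = set(S)                         # T <- S
    while True:                        # repeat
        T_prev = set(T)                #   T' <- T
        for s in T_prev:
            T |= edge(s, EPS)          #   T <- T' ∪ (∪ edge(s, ε) for s in T')
        if T == T_prev:                # until T = T'
            return T

print(sorted(eps_closure({1})))
print(sorted(eps_closure({3})))
```

The loop terminates because T only grows and the NFA has finitely many states.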
ε-closure({1})
(-| ε) ·(0|1|2|3|4|5|6|7|8|9)+ · (. ·(0|1|2|3|4|5|6|7|8|9)*)?
[Figure sequence: the NFA for this expression, with states numbered 1 to 8; the slides step through the fixed-point algorithm one statement at a time.]
• Initially: S = {1}, T = {}, T’ = {}
• After T ← S: T = {1}, T’ = {}
• After T’ ← T: T = {1}, T’ = {1}
• After adding the ε-edges out of T’: T = {1, 2}, T’ = {1}
• T ≠ T’, so the loop repeats; after T’ ← T: T = {1, 2}, T’ = {1, 2}
• Adding the ε-edges out of T’ contributes nothing new: T = {1, 2}, T’ = {1, 2}
• T = T’, so the algorithm stops with ε-closure({1}) = {1, 2}
What is ε-closure({3})?
ε
-ε . εε
ε
1 4 8
ε0
123
4567
8
9
32
ε0
123
4567
8
9
76
ε
S = {3}T = ??
(-| ε) ·(0|1|2|3|4|5|6|7|8|9)+ · (. ·(0|1|2|3|4|5|6|7|8|9)*)?
5
![Page 62: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/62.jpg)
What is ε-closure({3})?
[Figure: NFA diagram for the regular expression, states 1–8]
(-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
S = {3}, T = {2, 3, 4, 8}
![Page 63: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/63.jpg)
DFAedge
• Given a symbol c and a set of states S, what states can you reach?
• First, find the states you can reach on the symbol c.
• Then, compute the ε-closure to determine what other states are reachable from each new state by following ε-transitions.

$$
\mathrm{DFAedge}(S, c) = \varepsilon\text{-closure}\Big( \bigcup_{s \in S} \mathrm{edge}(s, c) \Big)
$$
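As a hedged illustration of the definition above, the following Python sketch computes DFAedge for the example NFA. The `EPS` and `EDGE` tables are assumptions reconstructed from the results quoted in this deck, and the function names are hypothetical.

```python
DIGITS = "0123456789"

# Assumed NFA, reconstructed from the closures and edges quoted on the slides.
EPS = {1: {2}, 3: {2, 4}, 4: {8}, 5: {6, 8}, 7: {6, 8}}
EDGE = {(1, "-"): {2}, (4, "."): {5}}
for d in DIGITS:
    EDGE[(2, d)] = {3}   # digits of the integer part
    EDGE[(6, d)] = {7}   # digits after the decimal point


def eps_closure(S):
    """States reachable from S using only ε-transitions."""
    T = set(S)
    while True:
        T_prev = set(T)
        for s in T_prev:
            T |= EPS.get(s, set())
        if T == T_prev:
            return T


def dfa_edge(S, c):
    """First follow edges labelled c from every state in S, then take the ε-closure."""
    moved = set()
    for s in S:
        moved |= EDGE.get((s, c), set())
    return eps_closure(moved)


print(sorted(dfa_edge({4}, ".")))  # [5, 6, 8]
```

An empty result means no state in S has an edge on c, which the later table records as "error".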
![Page 64: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/64.jpg)
What is DFAedge({1}, 3)?
[Figure: NFA diagram for the regular expression, states 1–8]
(-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
S = {1}
DFAedge({1}, 3) = ??
![Page 65: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/65.jpg)
What is DFAedge({1}, 3)?
[Figure: NFA diagram for the regular expression, states 1–8]
(-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
S = {1}
edge({1}, 3) = {}
DFAedge({1}, 3) = {}
![Page 66: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/66.jpg)
What is DFAedge({2}, 3)?
[Figure: NFA diagram for the regular expression, states 1–8]
(-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
S = {2}
edge({2}, 3) = {3}
DFAedge({2}, 3) = ε-closure({3})
![Page 67: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/67.jpg)
What is DFAedge({4}, .)?
[Figure: NFA diagram for the regular expression, states 1–8]
(-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
S = {4}
DFAedge({4}, .) = ??
![Page 68: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/68.jpg)
What is DFAedge({4}, .)?
[Figure: NFA diagram for the regular expression, states 1–8]
(-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
S = {4}
DFAedge({4}, .) = {5, 6, 8}
![Page 69: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/69.jpg)
What is DFAedge({2}, 3)?
[Figure: NFA diagram for the regular expression, states 1–8]
(-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
S = {2}
ε-closure({3}) = {2, 3, 4, 8}
DFAedge({2}, 3) = {2, 3, 4, 8}
![Page 70: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/70.jpg)
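NFA to DFA: the Subset Construction
• Approach
– Use Subset Construction
– Each DFA state mimics the set of states the NFA could be in as it operates non-deterministically
– Label a DFA state as accepting if it contains at least one of the accepting states of the NFA

$$
\begin{aligned}
& \mathrm{states}[0] = s_1 \\
& \mathrm{states}[1] = \varepsilon\text{-closure}(\{s_1\}) \\
& p = 1 \\
& j = 0 \\
& \textbf{while } (j \le p) \textbf{ do} \\
& \quad \textbf{foreach } c \in \Sigma \textbf{ do} \\
& \quad\quad e = \mathrm{DFAedge}(\mathrm{states}[j], c) \\
& \quad\quad \textbf{if } (e = \mathrm{states}[i] \text{ for some } i \le p) \textbf{ then} \\
& \quad\quad\quad \mathrm{trans}[j, c] = i \\
& \quad\quad \textbf{else} \\
& \quad\quad\quad p = p + 1 \\
& \quad\quad\quad \mathrm{states}[p] = e \\
& \quad\quad\quad \mathrm{trans}[j, c] = p \\
& \quad\quad \textbf{end if} \\
& \quad \textbf{end foreach} \\
& \quad j = j + 1 \\
& \textbf{end while}
\end{aligned}
$$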
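The subset construction can be sketched in Python as follows. This is a sketch under stated assumptions rather than the course's implementation: the NFA tables are reconstructed from the results quoted in this deck, digits are collapsed into one hypothetical symbol class `"d"`, DFA states are numbered from 0 starting at ε-closure({1}), and an empty target set is left out of `trans` to stand for "error".

```python
# Assumed NFA (reconstructed from the slides); "d" stands for any digit 0-9.
EPS = {1: {2}, 3: {2, 4}, 4: {8}, 5: {6, 8}, 7: {6, 8}}
EDGE = {(1, "-"): {2}, (2, "d"): {3}, (4, "."): {5}, (6, "d"): {7}}
ALPHABET = ["d", "-", "."]
START, NFA_ACCEPTING = 1, {8}


def eps_closure(S):
    """States reachable from S using only ε-transitions (worklist version)."""
    T = set(S)
    work = list(T)
    while work:
        for t in EPS.get(work.pop(), ()):
            if t not in T:
                T.add(t)
                work.append(t)
    return frozenset(T)


def dfa_edge(S, c):
    """ε-closure of the states reachable from S on symbol c."""
    return eps_closure({t for s in S for t in EDGE.get((s, c), ())})


def subset_construction():
    states = [eps_closure({START})]      # DFA state 0
    trans = {}
    j = 0
    while j < len(states):               # states[j] not yet processed
        for c in ALPHABET:
            e = dfa_edge(states[j], c)
            if not e:
                continue                 # no NFA state reachable: error
            if e not in states:
                states.append(e)         # new combination: add a DFA state
            trans[(j, c)] = states.index(e)
        j += 1
    accepting = {i for i, S in enumerate(states) if S & NFA_ACCEPTING}
    return states, trans, accepting


states, trans, accepting = subset_construction()
for i, S in enumerate(states):
    print(i, sorted(S), "accepting" if i in accepting else "")
```

Using `frozenset` lets sets of NFA states be compared and stored directly, so "is e already a known DFA state?" is a simple membership test.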
![Page 71: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/71.jpg)
[Figure: NFA diagram for the regular expression, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
![Page 72: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/72.jpg)
[Figure: NFA diagram for the regular expression, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
Let's simplify the diagram first...
![Page 73: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/73.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
0...9 just means any character in the range from 0 to 9
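In code, the same simplification amounts to mapping each input character to an edge label before looking up a transition. A small sketch, where the function name `symbol_class` is hypothetical:

```python
def symbol_class(ch):
    """Map an input character to the edge label used in the simplified diagram."""
    if len(ch) == 1 and ch in "0123456789":
        return "0...9"      # any digit takes the single collapsed edge
    if ch in ("-", "."):
        return ch           # these symbols keep their own edges
    return None             # not in the alphabet of this NFA


print([symbol_class(ch) for ch in "-3.6"])
```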
![Page 74: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/74.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
ε-closure(1) = ?
![Page 75: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/75.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
ε-closure(1) = {1, 2}
![Page 76: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/76.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
ε-closure(1) = {1, 2}
This corresponds to the first state in the DFA...
![Page 77: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/77.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | | | |

The simplest approach to subset construction is to build a table
![Page 78: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/78.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | | | |

From NFA{1, 2}, if input is 0...9
![Page 79: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/79.jpg)
![Page 80: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/80.jpg)
![Page 81: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/81.jpg)
![Page 82: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/82.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | | |

From NFA{1, 2}, if input is 0...9
![Page 83: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/83.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | | |
| 1 | {2, 3, 4, 8} | | | |

NFA{2, 3, 4, 8} is a new combination, so add a DFA state
![Page 84: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/84.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | | |
| 1 | {2, 3, 4, 8} | | | |

From NFA{1, 2}, if input is -
![Page 85: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/85.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | |
| 1 | {2, 3, 4, 8} | | | |

ε-closure doesn't lead anywhere else
![Page 86: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/86.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | |
| 1 | {2, 3, 4, 8} | | | |
| 2 | {2} | | | |

NFA{2} is a new combination, so add a DFA state
![Page 87: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/87.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | |
| 1 | {2, 3, 4, 8} | | | |
| 2 | {2} | | | |

From NFA{1, 2}, if input is .
![Page 88: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/88.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | | | |
| 2 | {2} | | | |

From NFA{1, 2}, if input is .
![Page 89: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/89.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | | | |
| 2 | {2} | | | |

From NFA{2, 3, 4, 8}, if input is 0...9
![Page 90: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/90.jpg)
![Page 91: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/91.jpg)
![Page 92: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/92.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | | |
| 2 | {2} | | | |

From NFA{2, 3, 4, 8}, if input is 0...9
![Page 93: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/93.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | |
| 2 | {2} | | | |

From NFA{2, 3, 4, 8}, if input is -
![Page 94: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/94.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | |
| 2 | {2} | | | |

From NFA{2, 3, 4, 8}, if input is .
![Page 95: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/95.jpg)
![Page 96: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/96.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | | | |

From NFA{2, 3, 4, 8}, if input is .
![Page 97: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/97.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | | | |
| 3 | {5, 6, 8} | | | |

NFA{5, 6, 8} is a new combination, so add a DFA state
![Page 98: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/98.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | | | |
| 3 | {5, 6, 8} | | | |

From NFA{2}, if input is 0...9
![Page 99: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/99.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | | |
| 3 | {5, 6, 8} | | | |

From NFA{2}, if input is 0...9
![Page 100: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/100.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | |
| 3 | {5, 6, 8} | | | |

From NFA{2}, if input is -
![Page 101: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/101.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | error |
| 3 | {5, 6, 8} | | | |

From NFA{2}, if input is .
![Page 102: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/102.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | error |
| 3 | {5, 6, 8} | | | |

From NFA{5, 6, 8}, if input is 0...9
![Page 103: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/103.jpg)
![Page 104: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/104.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | error |
| 3 | {5, 6, 8} | {6, 7, 8} | | |

From NFA{5, 6, 8}, if input is 0...9
![Page 105: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/105.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | error |
| 3 | {5, 6, 8} | {6, 7, 8} | | |
| 4 | {6, 7, 8} | | | |

NFA{6, 7, 8} is a new combination, so add a DFA state
![Page 106: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/106.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | error |
| 3 | {5, 6, 8} | {6, 7, 8} | error | |
| 4 | {6, 7, 8} | | | |

From NFA{5, 6, 8}, if input is -
![Page 107: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/107.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | error |
| 3 | {5, 6, 8} | {6, 7, 8} | error | error |
| 4 | {6, 7, 8} | | | |

From NFA{5, 6, 8}, if input is .
![Page 108: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/108.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | error |
| 3 | {5, 6, 8} | {6, 7, 8} | error | error |
| 4 | {6, 7, 8} | | | |

From NFA{6, 7, 8}, if input is 0...9
![Page 109: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/109.jpg)
![Page 110: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/110.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | error |
| 3 | {5, 6, 8} | {6, 7, 8} | error | error |
| 4 | {6, 7, 8} | {6, 7, 8} | | |

From NFA{6, 7, 8}, if input is 0...9
![Page 111: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/111.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | error |
| 3 | {5, 6, 8} | {6, 7, 8} | error | error |
| 4 | {6, 7, 8} | {6, 7, 8} | error | error |

Last two are errors...
![Page 112: Lexical Analysis - Information Sciences Institute · 2016-01-03 · 2 CSCI 565 - Compiler Design Spring 2016 Pedro Diniz pedro@isi.edu 2 Outline • What is a Lexical Analyzer? •](https://reader033.fdocuments.net/reader033/viewer/2022060307/5f09b68f7e708231d42828c5/html5/thumbnails/112.jpg)
[Figure: simplified NFA diagram with the digit edges collapsed to 0...9, states 1–8]
RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

| DFA State | NFA States | ε-closure after transition on 0...9 | on - | on . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | error |
| 3 | {5, 6, 8} | {6, 7, 8} | error | error |
| 4 | {6, 7, 8} | {6, 7, 8} | error | error |

No cells left to fill – DONE!
[Figure: NFA with states 1 to 8 for the regular expression below]

RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

ε-closure after transition on each input symbol (0...9, −, .):

| DFA State | NFA States | 0...9 | − | . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1 | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | error |
| 3 | {5, 6, 8} | {6, 7, 8} | error | error |
| 4 | {6, 7, 8} | {6, 7, 8} | error | error |

In this case, NFA state 8 is an accepting state, so any DFA state which contains NFA state 8 should also be accepting.
[Figure: NFA with states 1 to 8 for the regular expression below]

RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?

ε-closure after transition on each input symbol (0...9, −, .); accepting DFA states are marked with *:

| DFA State | NFA States | 0...9 | − | . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1* | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | error |
| 3* | {5, 6, 8} | {6, 7, 8} | error | error |
| 4* | {6, 7, 8} | {6, 7, 8} | error | error |

In this case, NFA state 8 is an accepting state, so any DFA state which contains NFA state 8 should also be accepting.
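The marking rule amounts to a set intersection: a DFA state is accepting when its set of NFA states shares at least one member with the NFA's accepting set. A small sketch, with the DFA state sets copied from the table and the hypothetical helper name `accepting_dfa_states`:

```python
# NFA accepting states and the DFA states (as sets of NFA states) from the table.
NFA_ACCEPTING = {8}
DFA_STATES = {
    0: {1, 2},
    1: {2, 3, 4, 8},
    2: {2},
    3: {5, 6, 8},
    4: {6, 7, 8},
}


def accepting_dfa_states(dfa_states, nfa_accepting):
    """DFA states whose NFA-state set contains an accepting NFA state."""
    return sorted(d for d, nfa_set in dfa_states.items() if nfa_set & nfa_accepting)


print(accepting_dfa_states(DFA_STATES, NFA_ACCEPTING))  # [1, 3, 4]
```

The result matches the starred rows 1, 3 and 4.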
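ε-closure after transition on each input symbol (0...9, −, .); accepting DFA states are marked with *:

| DFA State | NFA States | 0...9 | − | . |
|---|---|---|---|---|
| 0 | {1, 2} | {2, 3, 4, 8} | {2} | error |
| 1* | {2, 3, 4, 8} | {2, 3, 4, 8} | error | {5, 6, 8} |
| 2 | {2} | {2, 3, 4, 8} | error | error |
| 3* | {5, 6, 8} | {6, 7, 8} | error | error |
| 4* | {6, 7, 8} | {6, 7, 8} | error | error |

Now use the table as a guide to construct the DFA diagram

[Figure: DFA diagram with states 0, 1, 2, 3, 4 (start state 0; accepting states 1, 3, 4). Edges, read off the table: 0 → 1 on 0...9; 0 → 2 on −; 1 → 1 on 0...9; 1 → 3 on .; 2 → 1 on 0...9; 3 → 4 on 0...9; 4 → 4 on 0...9]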
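Running the resulting DFA is a single table lookup per input character. The sketch below encodes the transitions read off the table. The dictionary layout, the symbol `"d"` for any digit, the ASCII `-` for the minus sign, and the function name `accepts` are illustrative choices rather than anything prescribed by the slides.

```python
DIGITS = set("0123456789")

# Transitions read off the table; a missing entry is an "error" cell.
DFA = {
    0: {"d": 1, "-": 2},
    1: {"d": 1, ".": 3},
    2: {"d": 1},
    3: {"d": 4},
    4: {"d": 4},
}
ACCEPTING = {1, 3, 4}


def accepts(text):
    """Return True if the whole string drives the DFA into an accepting state."""
    state = 0
    for ch in text:
        sym = "d" if ch in DIGITS else ch   # collapse 0...9 into one symbol class
        state = DFA[state].get(sym)
        if state is None:                   # fell into an "error" cell
            return False
    return state in ACCEPTING


for sample in ["42", "-3.14", "7.", "-", ".5", ""]:
    print(repr(sample), accepts(sample))
```

Note that `"7."` is accepted because the fractional part in the RE is `(0|...|9)*`, which allows zero digits after the dot.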
[Figure: the resulting DFA (states 0 to 4) shown next to the original NFA (states 1 to 8) for the regular expression below]

RE = (-|ε) · (0|1|2|3|4|5|6|7|8|9)+ · (. · (0|1|2|3|4|5|6|7|8|9)*)?
NFA vs DFA: Complexity
• Matching time and space depend on the length of the regular expression, |r|, and the length of the input string, |x|
• NFA: matching time is O(|r| × |x|) and space used is O(|r|)
• DFA: matching time is O(|x|) and space used is O(2^|r|)
  – The number of states may grow exponentially (cf. subset construction)
  – Example: (a|b)*a (a|b) (a|b) … (a|b)
• With lazy transition evaluation, only the states actually used in practice are computed
  – An optimization that overcomes or mitigates the space issue
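The blow-up and the lazy-evaluation remedy can both be observed on the example above. The sketch below assumes a hand-built, ε-free NFA for "the n-th symbol from the end is a", with states 0 to n and state n accepting. That NFA is an illustrative choice, not the one Thompson's construction would produce, and the function names are hypothetical. `full_dfa_size` performs the eager subset construction and counts reachable DFA states. `lazy_match` computes a DFA transition only the first time the input actually needs it.

```python
def nfa_step(states, ch, n):
    """One NFA move for (a|b)*a(a|b)^(n-1), using states 0..n (n accepts)."""
    nxt = set()
    for s in states:
        if s == 0:
            nxt.add(0)            # (a|b)* self-loop on the start state
            if ch == "a":
                nxt.add(1)        # guess that this 'a' is n-th from the end
        elif s < n:
            nxt.add(s + 1)        # consume one of the trailing (a|b) symbols
    return frozenset(nxt)


def full_dfa_size(n):
    """Eager subset construction: count every reachable DFA state."""
    start = frozenset({0})
    seen, work = {start}, [start]
    while work:
        cur = work.pop()
        for ch in "ab":
            target = nfa_step(cur, ch, n)
            if target not in seen:
                seen.add(target)
                work.append(target)
    return len(seen)


def lazy_match(text, n):
    """Match `text`, building DFA transitions only when first needed.

    Returns (accepted, number_of_DFA_states_materialized).
    """
    cache = {}                     # (dfa_state, ch) -> dfa_state
    start = frozenset({0})
    state = start
    for ch in text:
        key = (state, ch)
        if key not in cache:
            cache[key] = nfa_step(state, ch, n)
        state = cache[key]
    states_built = {start} | set(cache.values())
    return n in state, len(states_built)


for n in (2, 5, 10):
    print(n, full_dfa_size(n))     # grows as 2**n for this NFA
print(lazy_match("abba", 10))
```

For this NFA the eager count doubles with each increment of n, while the lazy matcher never materializes more than len(text) + 1 states, since each input character creates at most one new state.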
Summary
• A lexical analyzer creates tokens out of a text stream
• Tokens are defined using Regular Expressions (REs)
• Regular Expressions can be mapped to Nondeterministic Finite Automata (NFA)
  – by Thompson's construction, which is simple
• An NFA is transformed into a DFA
  – Transformation algorithm: the subset construction
  – Executing a DFA is straightforward
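To tie the summary together, here is a rough sketch of how a scanner might use the number DFA from the worked example to cut tokens out of a text stream. The longest-match ("maximal munch") policy, the token names `NUM` and `OTHER`, and the whitespace handling are assumptions made for illustration; they are not taken from the slides.

```python
DIGITS = set("0123456789")
DFA = {                      # the number DFA from the worked example
    0: {"d": 1, "-": 2},
    1: {"d": 1, ".": 3},
    2: {"d": 1},
    3: {"d": 4},
    4: {"d": 4},
}
ACCEPTING = {1, 3, 4}


def longest_number(text, start):
    """End index of the longest number starting at `start`, or None."""
    state, last_accept, i = 0, None, start
    while i < len(text):
        sym = "d" if text[i] in DIGITS else text[i]
        state = DFA[state].get(sym)
        if state is None:          # no transition: stop and fall back
            break
        i += 1
        if state in ACCEPTING:
            last_accept = i        # remember the longest match so far
    return last_accept


def tokenize(text):
    """Split text into NUM tokens and single-character OTHER tokens."""
    tokens, i = [], 0
    while i < len(text):
        end = longest_number(text, i)
        if end is not None:
            tokens.append(("NUM", text[i:end]))
            i = end
        elif text[i].isspace():
            i += 1                 # skip whitespace between tokens
        else:
            tokens.append(("OTHER", text[i]))
            i += 1
    return tokens


print(tokenize("12 -3.5 x 7."))
```

Remembering the last accepting position lets the scanner back up when the DFA runs into an error after a partial match, for example a lone `-` that is not followed by a digit.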