Computational morphology.
Day 1. Theory of formal languages.
Alexey Sorokin¹,²
¹Moscow State University, ²Moscow Institute of Physics and Technology
European Summer School
in Logic, Language and Information,
Toulouse, 24-28 July, 2017
Outline of the course
Day 1: What is computational morphology? Theory of formal languages: regular expressions and finite automata.
Day 2: Finite transducers. Their application to natural languages.
Day 3: Context-based morphology. Hidden Markov models.
Day 4: Applying hidden Markov models to morphological analysis.
Day 5: Other methods and models for morphological analysis.
Day 1 outline
What is computational morphology?
Regular expressions.
Finite automata.
Finite automata for linguistic phenomena.
What is morphology?
"Morphology is the study of the forms of words, and the ways in which words are related to other words of the same language." (R. Andersen).
"Morphology is the part of linguistics which studies the word in all its relevant aspects." (I. A. Melchuk).
Informally, morphology studies:
How the word changes in different contexts (word inflection).
What factors determine these changes (morphological categories).
What parts of the word reflect these changes (morpheme analysis).
Tasks of computational morphology
Basic tasks of computational morphology:
Morphological analysis (tagging):
lirons ("(we will) read") ↦ lire+Fut+Pl+1
Morphological synthesis:
lire+Fut+Pl+1 ↦ lirons
Lemmatization:
parent ↦ parent "parent", parer "(to) block"
Morpheme segmentation:
overcomed ↦ over + com(e) + ed
Paradigm detection:
parler ↦ parl-er, parl-e, parl-es, parl-e, parl-ons, parl-ez, parl-ent
parler ↦ 1+er, 1+e, 1+es, 1+e, 1+ons, 1+ez, 1+ent
trouver ↦ 1+er, 1+e, 1+es, 1+e, 1+ons, 1+ez, 1+ent
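The paradigm detection example can be illustrated with a minimal sketch. It assumes, as a simplification not stated in the slides, that the stem is the longest common prefix of all forms. The function name `abstract_paradigm` and the form lists are hypothetical, and real paradigm detection generally needs a more careful alignment than a common prefix.

```python
import os


def abstract_paradigm(forms):
    """Replace the shared stem of the forms with the placeholder '1'.

    Assumption: the stem is the longest common prefix of all forms.
    """
    stem = os.path.commonprefix(forms)
    return [f"1+{form[len(stem):]}" for form in forms]


# Form lists copied from the slide (infinitive plus present-tense forms).
parler = ["parler", "parle", "parles", "parle", "parlons", "parlez", "parlent"]
trouver = ["trouver", "trouve", "trouves", "trouve",
           "trouvons", "trouvez", "trouvent"]

print(abstract_paradigm(parler))
# Both verbs reduce to the same abstract paradigm.
print(abstract_paradigm(parler) == abstract_paradigm(trouver))
```

Under this assumption both verbs map to the same list of abstract patterns, which is what lets them be grouped into one inflection class.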
Context-dependent morphology
Morphological synthesis and paradigm detection do not depend on context.
But lemmatization and analysis DO!
parent ↦ parent+NOUN+Masc+Sg:
Mon parent est grand ("My parent is tall")
parent ↦ parer+VERB+Pres+Pl+3:
Les défenseurs parent tous les tirs ("The defenders block all the shots")
The effect of context is far stronger in highly inflected languages (Russian, Czech, etc.).
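The asymmetry between synthesis and analysis can be sketched with a toy lexicon. The dictionary `SYNTHESIS` and the helper names below are hypothetical, and a real analyzer would not be a lookup table. The tag strings are the ones used on the slides.

```python
from collections import defaultdict

# Hypothetical toy lexicon: analysis (lemma + tags) -> surface form.
SYNTHESIS = {
    "parent+NOUN+Masc+Sg": "parent",
    "parer+VERB+Pres+Pl+3": "parent",
    "lire+Fut+Pl+1": "lirons",
}

# Invert the lexicon: one surface form may have several analyses.
ANALYSIS = defaultdict(list)
for analysis, form in SYNTHESIS.items():
    ANALYSIS[form].append(analysis)


def synthesize(analysis):
    """Synthesis is a function: each analysis yields exactly one form."""
    return SYNTHESIS[analysis]


def analyze(form):
    """Analysis returns every candidate; context must choose among them."""
    return ANALYSIS.get(form, [])


print(synthesize("lire+Fut+Pl+1"))
print(analyze("parent"))
```

In this sketch `analyze("parent")` returns two candidates, and nothing in the word itself says which one is right. That choice is what the context-based methods of the later days address.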
Applications
Machine translation:
Pete bought a book ↦ Petya kupil knigu
bought ↦ buy+Past (with a singular masculine subject) ↦ kupit'+Past+Sg+3+Masc ↦ kupil
Information retrieval.
Language modelling: making a probability model more sparse.
Actually, morphological tagging is a preprocessing step for almost all NLP tasks.
Theory of formal languages
Regular languages
Regular languages: first example
How to describe phonological conditions formally?
A syllable is a sequence of letters containing one vowel (V) and an arbitrary number of consonants (C).
A syllable can be described as:
An arbitrary number of consonants (possibly zero).
Followed by one vowel.
Followed by an arbitrary number of consonants (possibly zero).
Formally, a syllable is C*VC*, where * stands for an arbitrary number of repetitions (possibly zero) of the preceding symbol.
Now let us describe a word...
A word includes at least one vowel and an arbitrary number of consonants.
Answer: (C|V)*V(C|V)*, where | stands for OR.
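The two expressions can be tried out with Python's `re` module. This is a sketch under a simplifying assumption: C and V are modelled as classes of Latin letters, and the vowel inventory `aeiouy` is an illustrative choice, not something the slides specify.

```python
import re

# Assumption: a simplified orthographic vowel/consonant split.
VOWELS = "aeiouy"
CONSONANTS = "bcdfghjklmnpqrstvwxz"

C = f"[{CONSONANTS}]"
V = f"[{VOWELS}]"

# Syllable: C*VC*
SYLLABLE = re.compile(f"{C}*{V}{C}*")
# Word: (C|V)*V(C|V)*
WORD = re.compile(f"(?:{C}|{V})*{V}(?:{C}|{V})*")


def is_syllable(s):
    """True if the whole string matches C*VC*."""
    return SYLLABLE.fullmatch(s) is not None


def is_word(s):
    """True if the whole string matches (C|V)*V(C|V)*."""
    return WORD.fullmatch(s) is not None


print(is_syllable("strat"), is_syllable("lire"))
print(is_word("lire"), is_word("tsk"))
```

`fullmatch` is used rather than `search` because the expressions describe the whole string, not a substring of it.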
More complex examples
We wish to describe the syllable structure of the word more carefully.
We add the condition that exactly one syllable is stressed (its vowel is written V0) and that the syllables are separated by hyphens (-).
Then a stressed syllable is C*V0C*.
Let us separate two cases.
First case: the stressed syllable is the last one.
All unstressed syllables are followed by a hyphen, that is, C*VC*- (V stands for an unstressed vowel).
We have an arbitrary number of such groups, (C*VC*-)*, followed by a stressed syllable C*V0C*.
Concatenating, we obtain (C*VC*-)*C*V0C*.
Second case: the stressed syllable is not the last one.
The answer is ((C*VC*-)*C*V0C*)|((C*VC*-)*C*V0C*-(C*VC*-)*C*VC*).
Regrouping (? means "can be present or not"):
((C*VC*-)*C*V0C*)((-(C*VC*-)*C*VC*)?).
Another variant:
(C*VC*-)*C*V0C*(-C*VC*)*.
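One way to gain confidence that the three expressions describe the same language is to compare them on all short strings. The sketch below assumes a four-symbol alphabet in which the letter `S` stands in for the stressed vowel V0. The names `BY_CASES`, `REGROUPED`, `VARIANT` and `disagreements` are hypothetical. An exhaustive check up to a bounded length is evidence of equivalence, not a proof.

```python
import re
from itertools import product

U = "C*VC*"  # unstressed syllable
S = "C*SC*"  # stressed syllable; 'S' stands in for V0 (an assumption)

# The three expressions from the slide, transcribed symbol by symbol.
BY_CASES = f"(({U}-)*{S})|(({U}-)*{S}-({U}-)*{U})"
REGROUPED = f"(({U}-)*{S})((-({U}-)*{U})?)"
VARIANT = f"({U}-)*{S}(-{U})*"

PATTERNS = [re.compile(p) for p in (BY_CASES, REGROUPED, VARIANT)]


def accepts(s):
    """Return one boolean per expression: does it match all of s?"""
    return [p.fullmatch(s) is not None for p in PATTERNS]


def disagreements(max_len, alphabet="CVS-"):
    """List the strings up to max_len on which the expressions differ."""
    bad = []
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if len(set(accepts(s))) != 1:
                bad.append(s)
    return bad


print(accepts("CV-CSC-V"))
print(disagreements(6))
```

The last variant is equivalent because (C*VC*-)*C*VC* and C*VC*(-C*VC*)* denote the same set of strings, so the optional tail -(C*VC*-)*C*VC* is just zero or more repetitions of -C*VC*.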
![Page 38: Computational morphology. Day 1. Theory of formal languages.tipl.philol.msu.ru/application/files/4015/0084/... · Computational morphology. Dya 1. Theory of formal languages. Theory](https://reader030.fdocuments.net/reader030/viewer/2022041120/5f348db4d863c072683a3c7d/html5/thumbnails/38.jpg)
Computational morphology. Day 1. Theory of formal languages.
Theory of formal languages
Regular languages
More complex examplesWe wish to describe the syllable structure of the word more
carefully.We add the condition that exactly one syllable is stressed V0
and the syllables are separated by hyphens (−).Then a stressed syllable is C ∗V0C
∗.Let us separate two cases. First case: stressed syllable is the
last one.All unstressed syllables are followed by a hyphen. That is C∗VC∗−(V stands for unstressed).
We have an arbitrary number of such groups (C∗VC∗−)∗ followedby a stressed syllable C∗V0C
∗.Concatenating, we obtain (C∗VC∗−)∗C∗V0C
∗.
Second case: stressed syllable is not the last one.The answer is ((C ∗VC ∗−)∗C ∗V0C
∗)|((C ∗VC ∗−)∗C ∗V0C∗ −
(C ∗VC ∗−)∗C ∗VC ∗).Regrouping (? is �can be present or not�):
((C ∗VC ∗−)∗C ∗V0C∗)((−(C ∗VC ∗−)∗C ∗VC ∗)?).
Another variant:
(C ∗VC ∗−)∗C ∗V0C∗(−C ∗VC ∗)∗.
Second case: the stressed syllable is not the last one.
An arbitrary number of hyphenated unstressed syllables is followed by a hyphenated stressed syllable, then by an arbitrary number of hyphenated unstressed syllables, and finally by an unstressed syllable without a hyphen.
Together: $(C^*VC^*-)^*C^*V_0C^*-(C^*VC^*-)^*C^*VC^*$.
The answer is the union of the two cases: $((C^*VC^*-)^*C^*V_0C^*)|((C^*VC^*-)^*C^*V_0C^*-(C^*VC^*-)^*C^*VC^*)$.
Regrouping (? means "can be present or not"):
$((C^*VC^*-)^*C^*V_0C^*)((-(C^*VC^*-)^*C^*VC^*)?)$.
Another variant:
$(C^*VC^*-)^*C^*V_0C^*(-C^*VC^*)^*$.
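The three forms of the answer can be compared mechanically. The following is a minimal sketch, assuming words are encoded over the four symbols `C`, `V`, `S` and `-`, where `S` is a stand-in chosen here for the stressed vowel $V_0$. The helper names are illustrative only. It checks by brute force that the union of the two cases, the regrouped form and the shorter variant accept the same words up to a given length.

```python
import re
from itertools import product

U = "C*VC*"  # unstressed syllable
S = "C*SC*"  # stressed syllable; the letter S stands for V_0 (an assumed encoding)

ANSWER = f"(({U}-)*{S})|(({U}-)*{S}-({U}-)*{U})"  # union of the two cases
REGROUPED = f"(({U}-)*{S})((-({U}-)*{U})?)"       # optional tail
VARIANT = f"({U}-)*{S}(-{U})*"                    # shorter variant


def accepts(pattern, word):
    """True if the whole word belongs to the language of the pattern."""
    return re.fullmatch(pattern, word) is not None


def all_words(alphabet, max_len):
    """Yield every word over the alphabet with at most max_len letters."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def equivalent_up_to(max_len):
    """Check that the three expressions agree on all short words."""
    return all(
        accepts(ANSWER, w) == accepts(REGROUPED, w) == accepts(VARIANT, w)
        for w in all_words("CVS-", max_len)
    )
```

Such a check is not a proof of equivalence, since it only covers words up to a bounded length, but it catches regrouping mistakes quickly.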
Examples for morphology
A Spanish verb infinitive ends with -ar, -ir or -er, which is followed by -se in the case of reflexive verbs.
This is simple: $(C|V)^*(a|i|e)r(se)?$.
Here $C$ is an arbitrary consonant (just join all consonants with |) and $V$ is an arbitrary vowel.
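The same expression can be tried out with an ordinary regex engine. This is a minimal sketch, assuming lowercase input and approximating $(C|V)$ by a single character class of Spanish letters. The class contents and the function name are assumptions made for illustration.

```python
import re

# (C|V) is approximated by one character class of lowercase Spanish letters.
LETTER = "[a-zñáéíóúü]"

# (C|V)*(a|i|e)r(se)?
INFINITIVE = re.compile(f"{LETTER}*[aie]r(?:se)?")


def looks_like_infinitive(word):
    """True if the whole word matches the infinitive pattern."""
    return INFINITIVE.fullmatch(word) is not None
```

The pattern checks only the shape of the ending, so it also accepts non-verbs such as "mujer".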
A more complex example: the plural form of English nouns.
-es follows a sibilant (s, x, z, ch, sh).
-s cannot appear after y preceded by a consonant (sky ↦ skies).
For this task it is easier to parse witches as witche+s than to deal with -es separately.
But -s must then be avoided after s, x, z, ch, sh and Cy, where C is an arbitrary consonant.
Regular expressions have no negation operator, so such negative patterns cannot be written directly.
Solution: list everything that is allowed.
A plural form is a stem followed by -s, where a stem can be anything that:
Ends with a vowel not equal to y: $(C|V)^*(a|e|i|o|u)$.
Ends with vowel+y: $(C|V)^*Vy$.
Contains a vowel and ends with a consonant not equal to s, x, z, h (let $C'$ denote the complete list of such consonants): $(C|V)^*V(C|V)^*C'$.
Contains a vowel and ends with h preceded directly by a vowel or by $C''$, where $C''$ stands for all consonants except s, c: $(C|V)^*V((C|V)^*C'')?h$. The group $(C|V)^*C''$ is optional as a whole, so that $C''$ cannot be skipped when another consonant precedes h.
Grouping all together: $(C|V)^*((a|e|i|o|u|Vy)|V((C|V)^*C'|((C|V)^*C'')?h))s$.
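The "list everything that is allowed" idea can be sketched as an executable pattern. The sketch assumes lowercase ASCII words and counts y as a vowel, as the stem description does. The h-branch is written as $V((C|V)^*C'')?h$, so that $C''$ cannot be skipped when another consonant precedes h. All names below are illustrative.

```python
import re

VOWELS = "aeiouy"                    # y counts as a vowel here
CONSONANTS = "bcdfghjklmnpqrstvwxz"

V = f"[{VOWELS}]"
ANY = "[a-z]"                        # plays the role of (C|V)
# C': consonants other than s, x, z, h
C1 = "[" + "".join(c for c in CONSONANTS if c not in "sxzh") + "]"
# C'': consonants other than s, c
C2 = "[" + "".join(c for c in CONSONANTS if c not in "sc") + "]"

# (C|V)* ( a|e|i|o|u | Vy | V( (C|V)*C' | ((C|V)*C'')?h ) ) s
PLURAL = re.compile(
    f"{ANY}*(?:[aeiou]|{V}y|{V}(?:{ANY}*{C1}|(?:{ANY}*{C2})?h))s"
)


def is_regular_plural(word):
    """True if the word is an allowed stem followed by -s."""
    return PLURAL.fullmatch(word) is not None
```

With this encoding "witches" is accepted as witche+s and "skies" as skie+s, while "skys", "witchs", "bushs" and "boxs" are rejected.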
Formal definitions
Alphabet: an arbitrary finite set $\Sigma$; its elements are called letters.
Words: finite sequences of letters; the set of all words is $\Sigma^*$.
$\varepsilon$: the empty word.
$\cdot$: concatenation of words, $ad \cdot bc = adbc$.
Languages: sets of words, $L \subseteq \Sigma^*$.
Operations on languages:
Boolean operations: $L_1 \cup L_2$, $L_1 \cap L_2$, $L_1 - L_2$, $\overline{L}$ (complement).
Concatenation: $L_1 \cdot L_2 = \{w_1 \cdot w_2 \mid w_1 \in L_1, w_2 \in L_2\}$.
Power: $L^k = \underbrace{L \cdot \ldots \cdot L}_{k \text{ times}}$, with $L^0 = \{\varepsilon\}$ and $L^1 = L$.
Iteration (Kleene star): $L^* = \bigcup_{k=0}^{\infty} L^k$.
Example: $\{a, b\}^* = \{a, b\}^0 \cup \{a, b\}^1 \cup \{a, b\}^2 \cup \ldots = \{\varepsilon, a, b, aa, ab, ba, bb, \ldots\}$.
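For finite languages these operations can be tried out directly. The following is a minimal sketch, assuming languages are represented as Python sets of strings and the empty word $\varepsilon$ as the empty string. Since $L^*$ is infinite for any $L$ containing a non-empty word, the helper `star_up_to` (a name chosen here) only takes the union of the powers $L^0, \ldots, L^n$.

```python
def concat(l1, l2):
    """L1 · L2: every word of L1 followed by every word of L2."""
    return {w1 + w2 for w1 in l1 for w2 in l2}


def power(lang, k):
    """L^k: L concatenated with itself k times; L^0 = {ε}."""
    result = {""}
    for _ in range(k):
        result = concat(result, lang)
    return result


def star_up_to(lang, max_power):
    """Finite approximation of L*: the union of L^0, ..., L^max_power."""
    result = set()
    for k in range(max_power + 1):
        result |= power(lang, k)
    return result
```

For example, `star_up_to({"a", "b"}, 2)` yields the seven words listed explicitly in the expansion of $\{a, b\}^*$ above.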
Regular expressions: what they are formally
We distinguish a regular expression $\alpha$ from its language $L(\alpha)$.
For example, if $\alpha = (a|b)(a|c)$, then $L(\alpha) = \{aa, ac, ba, bc\}$.
Let some alphabet $\Sigma$ be fixed.
Regular expressions ($Reg(\Sigma)$):
Any $a \in \Sigma$ is a regular expression, $L(a) = \{a\}$.
$0$, $1$ are regular expressions, $L(0) = \emptyset$, $L(1) = \{\varepsilon\}$.
For all $\alpha, \beta \in Reg(\Sigma)$ also $(\alpha|\beta) \in Reg(\Sigma)$, $L((\alpha|\beta)) = L(\alpha) \cup L(\beta)$.
For all $\alpha, \beta \in Reg(\Sigma)$ also $(\alpha \cdot \beta) \in Reg(\Sigma)$, $L((\alpha \cdot \beta)) = L(\alpha) \cdot L(\beta)$.
If $\alpha \in Reg(\Sigma)$, then $\alpha^* \in Reg(\Sigma)$, $L(\alpha^*) = L(\alpha)^*$.
Priority of operations: $^*$, $\cdot$, $|$, so $\alpha^*\beta|\gamma = ((\alpha^*) \cdot \beta)|\gamma$.
Common conventions: $\alpha^+ = \alpha\alpha^*$ (positive iteration), $\alpha? = (\alpha|1)$ (optionality).
Regular languages: languages that can be expressed by regular expressions.
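The inductive definition translates almost line by line into a recursive function. This is a minimal sketch, assuming one small class per construction rule (the class names `Sym`, `Zero`, `One`, `Alt`, `Cat`, `Star` are chosen here). Because $L(\alpha)$ may be infinite, `language` returns only the words with at most `max_len` letters.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sym:
    letter: str      # a single letter a of the alphabet


@dataclass(frozen=True)
class Zero:          # the expression 0, L(0) is the empty set
    pass


@dataclass(frozen=True)
class One:           # the expression 1, L(1) = {ε}
    pass


@dataclass(frozen=True)
class Alt:           # (α|β)
    left: object
    right: object


@dataclass(frozen=True)
class Cat:           # (α · β)
    left: object
    right: object


@dataclass(frozen=True)
class Star:          # α*
    inner: object


def language(expr, max_len):
    """Return the words of L(expr) that have at most max_len letters."""
    if isinstance(expr, Sym):
        return {expr.letter} if max_len >= 1 else set()
    if isinstance(expr, Zero):
        return set()
    if isinstance(expr, One):
        return {""}
    if isinstance(expr, Alt):
        return language(expr.left, max_len) | language(expr.right, max_len)
    if isinstance(expr, Cat):
        left = language(expr.left, max_len)
        right = language(expr.right, max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if isinstance(expr, Star):
        base = language(expr.inner, max_len)
        result, frontier = {""}, {""}
        while frontier:  # keep appending base words until nothing new appears
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {expr!r}")
```

With `alpha = Cat(Alt(Sym("a"), Sym("b")), Alt(Sym("a"), Sym("c")))`, the call `language(alpha, 2)` reproduces the set $\{aa, ac, ba, bc\}$ from the example, and optionality $\alpha?$ can be written as `Alt(alpha, One())`.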
![Page 70: Computational morphology. Day 1. Theory of formal languages.tipl.philol.msu.ru/application/files/4015/0084/... · Computational morphology. Dya 1. Theory of formal languages. Theory](https://reader030.fdocuments.net/reader030/viewer/2022041120/5f348db4d863c072683a3c7d/html5/thumbnails/70.jpg)
Computational morphology. Day 1. Theory of formal languages.
Theory of formal languages
Regular languages
Regular expressions: what is it formallyWe distinguish regular expression α and its language L(α).For example, if α = (a|b)(a|c), then L(α) = {aa, ac, ba, bc}.Let some alphabet Σ be �xed.Regular expressions (Reg(Σ)):
Any a ∈ Σ is a regular expression, L(a) = {a}.
0, 1 are regular expressions, L(0) = ∅, L(1) = {ε}.For all α, β ∈ Reg(Σ) also (α|β) ∈ Reg(Σ),L((α|β)) = L(α) ∪ L(β).For all α, β ∈ Reg(Σ) also (α · β) ∈ Reg(Σ),L((α · β)) = L(α) · L(β).If α ∈ Reg(Σ), then α∗ ∈ Reg(Σ), L(α∗) = L(α)∗.
Priority of operations: ∗, ·, |, so α∗β|γ = ((α∗) · β)|γ.Common conventions: α+ = αα∗ (positive iteration),
α? = (α|1) (optionality).Regular languages: languages that can be expressed by regular
expressions.
![Page 71: Computational morphology. Day 1. Theory of formal languages.tipl.philol.msu.ru/application/files/4015/0084/... · Computational morphology. Dya 1. Theory of formal languages. Theory](https://reader030.fdocuments.net/reader030/viewer/2022041120/5f348db4d863c072683a3c7d/html5/thumbnails/71.jpg)
Computational morphology. Day 1. Theory of formal languages.
Theory of formal languages
Regular languages
Regular expressions: what is it formallyWe distinguish regular expression α and its language L(α).For example, if α = (a|b)(a|c), then L(α) = {aa, ac, ba, bc}.Let some alphabet Σ be �xed.Regular expressions (Reg(Σ)):
Any a ∈ Σ is a regular expression, L(a) = {a}.0, 1 are regular expressions, L(0) = ∅, L(1) = {ε}.
For all α, β ∈ Reg(Σ) also (α|β) ∈ Reg(Σ),L((α|β)) = L(α) ∪ L(β).For all α, β ∈ Reg(Σ) also (α · β) ∈ Reg(Σ),L((α · β)) = L(α) · L(β).If α ∈ Reg(Σ), then α∗ ∈ Reg(Σ), L(α∗) = L(α)∗.
Priority of operations: ∗, ·, |, so α∗β|γ = ((α∗) · β)|γ.Common conventions: α+ = αα∗ (positive iteration),
α? = (α|1) (optionality).Regular languages: languages that can be expressed by regular
expressions.
Examples of regular languages
Words with exactly two a's (alphabet a, b): b∗ab∗ab∗.
Words with an even number of a's (alphabet a, b): (b∗ab∗a)∗b∗.
Words with an odd number of a's (alphabet a, b): Exercise.
Every a is immediately followed by b (alphabet a, b, c): (b|c|ab)∗.
Every a is immediately preceded by b: Exercise.
After every a, b occurs earlier than c (alphabet a, b, c, d): (ad∗b|b|c|d)∗.
To the left of every a, b occurs closer than c: Exercise.
No repeating letters, i.e. no two equal adjacent letters (alphabet a, b): b?(ab)∗a?.
Non-empty word with no repetitions: H = a(ba)∗b?|b(ab)∗a?.
No repeating letters (alphabet a, b, c): H?(cH)∗c?.
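Expressions like these are easy to get subtly wrong, so it helps to compare them against a direct statement of the property on all short words. The sketch below is an illustration added here, not the author's method: it uses Python's `re` module (whose syntax happens to coincide with the slide notation for these expressions), and the helper names `words`, `agrees`, `a_followed_by_b`, and `no_repeats` are assumptions. Agreement up to a length bound is evidence, not a proof.

```python
import re
from itertools import product


def words(alphabet, max_len):
    """Yield every word over the alphabet of length at most max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def agrees(pattern, predicate, alphabet, max_len=7):
    """True if the pattern and the predicate accept the same short words."""
    regex = re.compile(pattern)
    return all(bool(regex.fullmatch(w)) == predicate(w)
               for w in words(alphabet, max_len))


def a_followed_by_b(w):
    # Every a must have a b directly after it (so a word cannot end in a).
    return all(w[i + 1:i + 2] == "b" for i, ch in enumerate(w) if ch == "a")


def no_repeats(w):
    # No two adjacent letters are equal.
    return all(x != y for x, y in zip(w, w[1:]))


# H from the slide, wrapped in a group so it can be reused as a sub-expression.
H = "(?:a(?:ba)*b?|b(?:ab)*a?)"

results = {
    "every a followed by b": agrees("(b|c|ab)*", a_followed_by_b, "abc"),
    "no repeats over a, b": agrees("b?(ab)*a?", no_repeats, "ab"),
    "no repeats over a, b, c": agrees(f"{H}?(?:c{H})*c?", no_repeats, "abc"),
}
print(results)
```

The same harness can be pointed at a candidate answer for any of the exercises: write the property as a predicate and see whether a counterexample shows up among the short words.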
Exercise: vowel harmony
Words that contain at least one letter among V, V1, V2, but not V1 and V2 together.
Explanation: V1 and V2 are disharmonic types of vowels (say, soft and round); V stands for neutral vowels and C for consonants.
C∗(V|V1)(C|V|V1)∗ | C∗(V|V2)(C|V|V2)∗.
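The vowel-harmony expression above can be checked mechanically on short words. In this sketch each class is encoded by one letter, which is purely an assumption made for the illustration: `c` for C, `v` for V, `x` for V1, and `y` for V2. The names `HARMONY`, `is_harmonic`, and `mismatches` are likewise hypothetical.

```python
import re
from itertools import product

# Hypothetical one-letter encoding of the classes:
# c = consonant C, v = neutral vowel V, x = vowel of type V1, y = vowel of type V2.
HARMONY = re.compile(r"c*(v|x)(c|v|x)*|c*(v|y)(c|v|y)*")


def is_harmonic(word):
    """At least one vowel, and never V1 and V2 in the same word."""
    has_vowel = any(ch in "vxy" for ch in word)
    return has_vowel and not ("x" in word and "y" in word)


def mismatches(max_len=6):
    """Words up to max_len on which the expression and the property disagree."""
    found = []
    for n in range(max_len + 1):
        for letters in product("cvxy", repeat=n):
            word = "".join(letters)
            if bool(HARMONY.fullmatch(word)) != is_harmonic(word):
                found.append(word)
    return found


print(mismatches())
```

An empty list means the expression and the stated property agree on every encoded word of length at most 6.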
Exercise: Turkish infinitives.
In Turkish there are 8 vowels:
        Front   Back
Soft    e i     a ı
Round   ü ö     u o
The infinitive is formed by attaching the suffix -mek/-mak to the verb stem: e appears if the last vowel of the stem is front, and a if it is back. Write a regular expression for Turkish infinitives.
Finite automata
Regular expressions are convenient for describing patterns.
But an expression by itself gives no procedure for checking whether a word matches it.
Example: a(a|b|c)∗b(a|b|c).
How we can process it:
Read the first letter and check that it is a; otherwise reject.
Read letters until the penultimate letter appears.
Check that it is b.
Check that exactly one letter remains.
Schematically:
q0 --a--> q1 --b--> q2 --a, b, c--> q3, with a loop labelled a, b, c on q1.
This is a finite automaton.
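The schematic automaton can be simulated directly. Note that on reading b in q1 it may either stay in q1 or move to q2, so the sketch below tracks the set of all states the automaton could be in. This is a minimal Python illustration under stated assumptions: q0 is taken as the initial state and q3 as the only final state (the transcript of the diagram does not mark them), and the names `TRANSITIONS` and `accepts` are hypothetical.

```python
# Transition table for a(a|b|c)*b(a|b|c); missing entries mean "no move".
TRANSITIONS = {
    ("q0", "a"): {"q1"},
    ("q1", "a"): {"q1"},
    ("q1", "b"): {"q1", "q2"},   # stay in the loop, or guess this b is penultimate
    ("q1", "c"): {"q1"},
    ("q2", "a"): {"q3"},
    ("q2", "b"): {"q3"},
    ("q2", "c"): {"q3"},
}
START = "q0"      # assumed initial state
FINAL = {"q3"}    # assumed set of final states


def accepts(word):
    """Follow every possible path at once and accept if one ends in a final state."""
    current = {START}
    for letter in word:
        current = {target
                   for state in current
                   for target in TRANSITIONS.get((state, letter), set())}
    return bool(current & FINAL)


for example in ["abc", "aabba", "ab", "abcc", "bba"]:
    print(example, accepts(example))
```

Tracking a set of states avoids having to know in advance which letter is the penultimate one: the simulation keeps both possibilities alive and only the correct guess reaches q3 at the end of the word.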
Finite automata
A finite automaton consists of:
A finite set of states Q.
An alphabet Σ.
A set of transitions (edges) ∆ ⊆ Q × Σ∗ × Q; an edge from q1 to q2 labelled w is written 〈q1, w〉 → q2.
An initial state q0.
A set of (possibly multiple) final states F ⊆ Q.
Every edge has a label. The label of a path is the concatenation of the labels of its edges.
An automaton A accepts the language L(A) of all words that label paths from the initial state to some final state.
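The definition above allows edge labels to be arbitrary words from Σ∗, including the empty word. A minimal Python sketch of such an automaton follows; the class name `Automaton`, its field names, and the search strategy are assumptions made for illustration. Acceptance is decided by searching over pairs (state, position in the word), which corresponds to looking for a path from the initial state to a final state whose label is the whole word.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Automaton:
    states: frozenset        # Q
    alphabet: frozenset      # Σ
    transitions: frozenset   # ∆: triples (source, label, target), label in Σ*
    initial: str             # q0
    finals: frozenset        # F

    def accepts(self, word):
        """Search for a path from the initial state to a final one labelled by word."""
        start = (self.initial, 0)
        seen = {start}
        stack = [start]
        while stack:
            state, pos = stack.pop()
            if pos == len(word) and state in self.finals:
                return True
            for source, label, target in self.transitions:
                # The edge is usable if its label occurs in the word at position pos.
                if source == state and word.startswith(label, pos):
                    nxt = (target, pos + len(label))
                    if nxt not in seen:   # also prevents looping on empty labels
                        seen.add(nxt)
                        stack.append(nxt)
        return False


# One state with edges labelled b, c and ab: the language of (b|c|ab)*.
a_then_b = Automaton(
    states=frozenset({"q"}),
    alphabet=frozenset("abc"),
    transitions=frozenset({("q", "b", "q"), ("q", "c", "q"), ("q", "ab", "q")}),
    initial="q",
    finals=frozenset({"q"}),
)
print(a_then_b.accepts("cabb"), a_then_b.accepts("ca"))
```

Since there are at most |Q| · (|w| + 1) pairs to visit, the search always terminates, even when some edges carry the empty label.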
Finite automata: examples
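A syllable: states check vowel presence.
[Diagram: states q0 and q1; edge labels V, C and C, V.]
Even number of a-s, alphabet {a, b}. States check the parity of a-s.
[Diagram: edge labels a, b, b, a.]
Every a is immediately followed by b, alphabet {a, b, c}.
[Diagram: edge labels a; b, c; b.]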
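One possible reading of these three examples as transition tables is sketched below. The state names (other than q0 and q1), the choice of final states and the helper run are assumptions made for illustration; the tables are reconstructed from the descriptions, not copied from the diagrams.

```python
def run(delta, initial, finals, word):
    """Run a partial deterministic automaton; a missing edge means rejection."""
    state = initial
    for letter in word:
        if (state, letter) not in delta:
            return False
        state = delta[(state, letter)]
    return state in finals

# A syllable written as class symbols C (consonant) and V (vowel):
# q1 is reached once a vowel has been seen (assumed to be the final state).
HAS_VOWEL = {("q0", "C"): "q0", ("q0", "V"): "q1",
             ("q1", "C"): "q1", ("q1", "V"): "q1"}

# Even number of a-s over {a, b}: the state is the parity of a-s read so far.
EVEN_A = {("even", "a"): "odd", ("odd", "a"): "even",
          ("even", "b"): "even", ("odd", "b"): "odd"}

# Every a is immediately followed by b over {a, b, c}:
# in state "wait" the last letter was a, so only b may come next.
A_THEN_B = {("ok", "b"): "ok", ("ok", "c"): "ok",
            ("ok", "a"): "wait", ("wait", "b"): "ok"}
```

For instance, run(EVEN_A, "even", {"even"}, "abab") holds, and run(A_THEN_B, "ok", {"ok"}, "ca") fails because the final a is not followed by b.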
Every a is immediately preceded by b, alphabet {a, b, c}.
[Diagram: edge labels b; b, c; a.]
To the right of every a occurs b with no a, c between them, alphabet {a, b, c, d}.
[Diagram: edge labels a; b, c, d; b; d.]
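A hedged sketch of these two examples follows. The first automaton is naturally nondeterministic (on b it may either stay or move to a state that expects a), so the simulation tracks the set of possible states. State names "0" and "1", the final states and the function name accepts are assumptions for illustration.

```python
def accepts(delta, initial, finals, word):
    """Simulate a nondeterministic automaton by tracking the set of possible states."""
    current = {initial}
    for letter in word:
        current = {t for s in current for t in delta.get((s, letter), ())}
    return bool(current & finals)

# Every a is immediately preceded by b, alphabet {a, b, c}:
# state "1" is entered on b and is the only state from which a may be read.
B_BEFORE_A = {("0", "b"): {"0", "1"}, ("0", "c"): {"0"}, ("1", "a"): {"0"}}

# To the right of every a occurs b with no a, c between them, alphabet {a, b, c, d}:
# after a, only d may be skipped until the required b arrives.
A_D_STAR_B = {("0", "b"): {"0"}, ("0", "c"): {"0"}, ("0", "d"): {"0"},
              ("0", "a"): {"1"}, ("1", "d"): {"1"}, ("1", "b"): {"0"}}
```

In both sketches state "0" is taken as initial and final, for example accepts(B_BEFORE_A, "0", {"0"}, "cbab").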
No repeating letters, alphabet {a, b, c}. States correspond to letters:
[Diagram: states 0, A, B, C; nine edges, three labeled a, three labeled b, three labeled c.]
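The nine edges can be generated rather than listed. The sketch below assumes that "no repeating letters" means no two equal adjacent letters, that state 0 is initial, and that every state is final; the names DELTA and no_repeats are illustrative only.

```python
ALPHABET = "abc"

# From the start state "0" every letter x leads to the state X named after it.
DELTA = {("0", x): x.upper() for x in ALPHABET}
# From state X (the last letter was x) every letter except x is allowed.
DELTA.update({(x.upper(), y): y.upper()
              for x in ALPHABET for y in ALPHABET if x != y})

def no_repeats(word):
    """Accept words over {a, b, c} in which no letter is immediately repeated."""
    state = "0"
    for letter in word:
        if (state, letter) not in DELTA:
            return False  # missing edge: the letter repeats or is not in the alphabet
        state = DELTA[(state, letter)]
    return True  # assumption: every state is final
```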
Word syllabification: each syllable contains exactly one vowel
and exactly one vowel is stressed, syllables are separated by
hyphens.
States check two conditions:
There was a vowel in the current syllable (the first coordinate).
There was a stressed vowel (the second coordinate).
[Diagram: states NN, YN, YY, NY; edge labels V, C and −.]
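The two-coordinate state can be sketched as a pair of booleans. In this illustration the input is a string of class symbols: C for a consonant, V for an unstressed vowel, S for a stressed vowel and "-" for the syllable boundary. The symbol S and the function names are assumptions; the extracted slide text does not show how the stressed vowel is written on the edges.

```python
def step(state, symbol):
    """One transition; the state is (vowel in current syllable, stressed vowel seen)."""
    vowel, stressed = state
    if symbol == "C":
        return state
    if symbol in ("V", "S"):
        # A second vowel in the syllable or a second stressed vowel is an error.
        if vowel or (symbol == "S" and stressed):
            return None
        return (True, stressed or symbol == "S")
    if symbol == "-":
        # A hyphen may only close a syllable that already has its vowel.
        return (False, stressed) if vowel else None
    return None

def well_formed(symbols):
    state = (False, False)  # state NN
    for symbol in symbols:
        state = step(state, symbol)
        if state is None:
            return False
    return state == (True, True)  # state YY
```

For example, well_formed("CS-CVC") holds, while "CV-CV" is rejected because no vowel is stressed.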
Finite automata: English plural
All plural forms can be decomposed as stem + s, where
a stem is anything with at least one vowel, but not ending with:
-s, -x, -z, -sh, -ch, -zh (sibilants).
Cy (a consonant followed by y).
Automaton for all possible stems (C0 = C − {s, x, z, c, h}, C1 = C0 ∪ {s, x, z}):
[Diagram: edge labels V0, y, C, C1, s, c and h.]
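The textual rule for stems can be checked directly, without the automaton. The sketch below is not a transcription of the diagram: it implements only the stated conditions, and it assumes that V0 stands for the vowels a, e, i, o, u and that y also counts as a vowel for the "at least one vowel" condition.

```python
V0 = set("aeiou")      # assumption: the vowels other than y
VOWELS = V0 | {"y"}    # assumption: y counts as a vowel
SIBILANT_ENDINGS = ("s", "x", "z", "sh", "ch", "zh")

def is_possible_stem(stem):
    """Check the stem conditions stated above for the pattern stem + s."""
    if not any(ch in VOWELS for ch in stem):
        return False   # no vowel at all
    if stem.endswith(SIBILANT_ENDINGS):
        return False   # ends with a sibilant
    if len(stem) >= 2 and stem[-1] == "y" and stem[-2] not in VOWELS:
        return False   # ends with consonant + y
    return True
```

Under these assumptions "cat" and "boy" are possible stems, while "bus", "church" and "city" are not.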
Properties of finite automata
Theorem
Every automata language is recognized by an automaton with single-letter labels.
Sketch of the proof
Split all labels of length ≥ 2 by inserting additional states.
Now we have only letters and ε as labels.
Add an edge 〈q1, a〉 → q2 if there exist states q3, q4 such that
(〈q3, a〉 → q4) ∈ ∆ and there are (possibly empty) ε-paths from q1 to q3 and
from q4 to q2.
Mark as terminal all states from which terminal states are ε-reachable.
Now remove all ε-edges.
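The steps of this proof sketch can be written out as follows. This is a minimal illustration under assumed conventions: edges are (source, label, target) triples, ε is the empty string, and the auxiliary states are tuples ("aux", n) that are assumed not to clash with existing state names.

```python
def split_labels(edges):
    """Replace every edge whose label has two or more letters by a chain of one-letter edges."""
    result, fresh = [], 0
    for source, label, target in edges:
        prev = source
        for letter in label[:-1]:
            aux = ("aux", fresh)  # a new intermediate state
            fresh += 1
            result.append((prev, letter, aux))
            prev = aux
        result.append((prev, label[-1:], target))  # last letter, or "" for an epsilon edge
    return result

def eps_closure(state, edges):
    """States reachable from `state` by epsilon edges only (including `state` itself)."""
    closure, stack = {state}, [state]
    while stack:
        s = stack.pop()
        for source, label, target in edges:
            if source == s and label == "" and target not in closure:
                closure.add(target)
                stack.append(target)
    return closure

def remove_epsilon(edges, finals):
    """Return (edges, final states) of an equivalent automaton without epsilon edges."""
    states = {s for s, _, _ in edges} | {t for _, _, t in edges} | set(finals)
    closure = {q: eps_closure(q, edges) for q in states}
    new_edges = {(q1, a, q2)
                 for q1 in states
                 for q3, a, q4 in edges if a != "" and q3 in closure[q1]
                 for q2 in closure[q4]}
    new_finals = {q for q in states if closure[q] & set(finals)}
    return new_edges, new_finals
```

The initial state is unchanged by both steps, so it is not passed around here.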
Definition
An automaton with one-letter labels is deterministic if no state has two outgoing edges with the same label.
Theorem
Every automata language can be recognized by a deterministic automaton.
Sketch of the proof
New automaton states are sets of old states.
An edge labeled by a leads from set Q1 to Q2 if Q2 contains
exactly the states reachable from Q1 by a.
Start state Q0 = {q0} (only the old start state).
Final states: subsets containing at least one old final state.
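The subset construction from this sketch might look as follows in Python. The representation is an assumption: delta maps (state, letter) to a set of states and has no ε-edges. Unlike the proof sketch, this version builds only the subsets reachable from the start state and leaves out edges into the empty set, so the result is a partial deterministic automaton. The example NFA is hypothetical.

```python
def determinize(delta, initial, finals, alphabet):
    """Subset construction; returns (delta, start, finals) over frozensets of old states."""
    start = frozenset([initial])
    dfa_delta, todo, seen = {}, [start], {start}
    while todo:
        subset = todo.pop()
        for a in alphabet:
            # Exactly the old states reachable from `subset` by the letter a.
            target = frozenset(t for s in subset for t in delta.get((s, a), ()))
            if not target:
                continue  # no edge into the empty set: the automaton stays partial
            dfa_delta[(subset, a)] = target
            if target not in seen:
                seen.add(target)
                todo.append(target)
    dfa_finals = {s for s in seen if s & finals}
    return dfa_delta, start, dfa_finals

# Hypothetical NFA over {a, b, c}: on b it may stay in "0" or move to "1".
NFA = {("0", "b"): {"0", "1"}, ("0", "c"): {"0"}, ("1", "a"): {"0"}}
DFA_DELTA, DFA_START, DFA_FINALS = determinize(NFA, "0", {"0"}, "abc")
```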
Kleene theorem
Theorem
The classes of automata and regular languages are the same.
Sketch of the proof
We should transform every finite automaton into a regular
expression and every regular expression into a finite automaton.
Automaton → expression: difficult, we will not prove it.
Expression → automaton: simple proof by induction:
Regular languages are constructed from primitives by means of
concatenation, union and iteration.
Primitive regular languages (singletons and the empty language) are
certainly automata languages.
We should prove that regular operations preserve automata
languages.
Concatenation: L1 = L(M1), L2 = L(M2) → L1 · L2 = L(M)
[Diagram: M is obtained by connecting M1 to M2 with an ε-edge.]
Union: L1 = L(M1), L2 = L(M2) → L1 ∪ L2 = L(M)
[Diagram: M is obtained from M1 and M2 by adding two ε-edges.]
Iteration: L1 = L(M1) → L1* = L(M)
[Diagram: M is obtained from M1 by adding two ε-edges.]
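The three constructions can be sketched with ε-edges as below. The details are assumptions in the spirit of the diagrams rather than a transcription of them: an automaton is a triple (edges, initial, finals), ε is the empty string, the state sets of M1 and M2 are assumed to be disjoint, and the fresh state names "u0" and "s0" are assumed to be unused.

```python
from collections import deque

def accepts(automaton, word):
    """Search over (state, letters read) pairs; handles epsilon edges."""
    edges, initial, finals = automaton
    seen, queue = {(initial, 0)}, deque([(initial, 0)])
    while queue:
        state, pos = queue.popleft()
        if pos == len(word) and state in finals:
            return True
        for source, label, target in edges:
            if source == state and word.startswith(label, pos):
                nxt = (target, pos + len(label))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return False

def concat(m1, m2):
    """Epsilon edges lead from every final state of m1 to the initial state of m2."""
    (e1, i1, f1), (e2, i2, f2) = m1, m2
    return e1 + e2 + [(f, "", i2) for f in f1], i1, set(f2)

def union(m1, m2, new_initial="u0"):
    """A fresh initial state has epsilon edges to both old initial states."""
    (e1, i1, f1), (e2, i2, f2) = m1, m2
    edges = e1 + e2 + [(new_initial, "", i1), (new_initial, "", i2)]
    return edges, new_initial, set(f1) | set(f2)

def star(m1, new_initial="s0"):
    """A fresh state that is both initial and final; old final states return to it."""
    e1, i1, f1 = m1
    edges = e1 + [(new_initial, "", i1)] + [(f, "", new_initial) for f in f1]
    return edges, new_initial, {new_initial}

# Hypothetical one-word automata for the languages {a} and {b}.
M_A = ([("a0", "a", "a1")], "a0", {"a1"})
M_B = ([("b0", "b", "b1")], "b0", {"b1"})
```

For instance, concat(M_A, M_B) accepts only "ab", union(M_A, M_B) accepts "a" and "b", and star(M_A) accepts the empty word and every string of a-s.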
Properties of automata languages
Theorem
The class of automata languages is closed under complement.
Sketch of the proof
Consider the deterministic automaton for language L.
Complete it: add a new sink state q′.
If a state q1 does not have an outgoing edge labeled by letter a, add an edge 〈q1, a〉 → q′.
Add an edge 〈q′, a〉 → q′ for every letter a.
Now for every q1 ∈ Q, a ∈ Σ there is an edge of the form
〈q1, a〉 → q2.
Consequently, every word w leads from q0 to exactly one state:
terminal if w ∈ L and non-terminal if w ∉ L.
Switching non-terminal and terminal states yields an automaton for
the complement.
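Completion and complementation can be sketched as follows. The dictionary representation, the sink name "sink" (assumed not to be an existing state) and the example automaton are assumptions for illustration.

```python
def complete(delta, states, alphabet, sink="sink"):
    """Add a sink state and route every missing edge (including the sink's own) to it."""
    total = dict(delta)
    all_states = set(states) | {sink}
    for q in all_states:
        for a in alphabet:
            total.setdefault((q, a), sink)
    return total, all_states

def complement(delta, states, initial, finals, alphabet):
    """Complete the deterministic automaton, then swap final and non-final states."""
    total, all_states = complete(delta, states, alphabet)
    return total, initial, all_states - set(finals)

def run(delta, initial, finals, word):
    """Run a complete deterministic automaton on a word over its alphabet."""
    state = initial
    for letter in word:
        state = delta[(state, letter)]
    return state in finals

# Hypothetical example: every a is immediately followed by b, alphabet {a, b, c}.
A_THEN_B = {("ok", "b"): "ok", ("ok", "c"): "ok",
            ("ok", "a"): "wait", ("wait", "b"): "ok"}
COMPLEMENT = complement(A_THEN_B, {"ok", "wait"}, "ok", {"ok"}, "abc")
```

The complement automaton accepts "ca" and "aa", which the original rejects, and rejects "cab". The sink state becomes final, which is why the completion step cannot be skipped.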
Properties of automata languages
Theorem
The class of automata languages is closed under intersection.
Sketch of the proof
Easy variant: by De Morgan's law, $L_1 \cap L_2 = \overline{\overline{L_1} \cup \overline{L_2}}$, and automata languages are closed under union and complement.
Complex (but effective) variant: consider complete deterministic automata M1 for L1 and M2 for L2.
Let Q1, Q2 be their sets of states, q01, q02 be their initial states and F1, F2 be their sets of final states.
Consider a new automaton whose states are pairs 〈q1, q2〉, q1 ∈ Q1, q2 ∈ Q2.
Its start state is 〈q01, q02〉.
On the first coordinate it operates like M1, on the second like M2.
Final states are pairs of final states (the automaton accepts iff it accepts on both coordinates).
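The product construction from the second variant can be sketched as follows. The tuple layout `(states, delta, start, finals)` and the names `intersect` and `accepts` are assumptions made for this illustration; both input automata are assumed to be complete and deterministic, as in the proof.

```python
from itertools import product


def intersect(dfa1, dfa2, alphabet):
    """Product automaton of two complete DFAs given as (states, delta, start, finals)."""
    states1, delta1, start1, finals1 = dfa1
    states2, delta2, start2, finals2 = dfa2
    states = set(product(states1, states2))
    # Each coordinate moves independently, following its own automaton.
    delta = {
        ((q1, q2), a): (delta1[(q1, a)], delta2[(q2, a)])
        for (q1, q2) in states
        for a in alphabet
    }
    # A pair is final only if both of its coordinates are final.
    finals = {(q1, q2) for (q1, q2) in states if q1 in finals1 and q2 in finals2}
    return states, delta, (start1, start2), finals


def accepts(dfa, word):
    _, delta, q, finals = dfa
    for a in word:
        q = delta[(q, a)]
    return q in finals


alphabet = {"a", "b"}
# M1: even number of a's; M2: the word ends with b.
even_a = ({0, 1}, {(0, "a"): 1, (1, "a"): 0, (0, "b"): 0, (1, "b"): 1}, 0, {0})
ends_b = ({0, 1}, {(0, "a"): 0, (1, "a"): 0, (0, "b"): 1, (1, "b"): 1}, 0, {1})
both = intersect(even_a, ends_b, alphabet)

print(accepts(both, "aab"))  # True
print(accepts(both, "ab"))   # False: odd number of a's
print(accepts(both, "aa"))   # False: does not end with b
```

The product has |Q1| · |Q2| states, whereas the De Morgan route goes through a union automaton that generally has to be determinized before it can be complemented.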
Recursive construction of automata
Finite automata are closed under several operations.
Moreover, this closure is effective: the corresponding automata are built algorithmically.
Therefore we may combine automata just as regular expressions, but with more operations.
For example, the language of English plural forms can be expressed as:
$(L_{sib} \cdot es) \cup (((\overline{L_{sib}} \cap L_C) \cup L_{Cy} \cup L_V) \cdot s)$,
where
$L_{sib}$: words ending with a sibilant.
$L_C$: words ending with a consonant.
$L_{Cy}$: words ending with consonant+y.
$L_V$: words ending with a vowel (not y).
The basic languages are automata languages, so the automaton for the whole expression can be constructed recursively.
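To illustrate the recursive construction, here is a small Python sketch that turns a nested expression over basic languages into a complete deterministic automaton. It covers only complement, intersection and union (concatenation with a suffix would need a nondeterministic step and is left out), and it builds just one subexpression: words ending with a non-sibilant consonant. The tuple layout, the function names and the reduction of sibilants to the single letters s, x, z are assumptions made for the example.

```python
import string
from itertools import product

ALPHABET = string.ascii_lowercase


def ends_with(letters):
    """Complete two-state DFA: state 1 iff the last letter read is in `letters`."""
    delta = {(q, a): int(a in letters) for q in (0, 1) for a in ALPHABET}
    return {0, 1}, delta, 0, {1}


def combine(m1, m2, keep):
    """Product automaton; `keep` decides which pairs of states are final."""
    s1, d1, q1, f1 = m1
    s2, d2, q2, f2 = m2
    states = set(product(s1, s2))
    delta = {((p, q), a): (d1[(p, a)], d2[(q, a)]) for (p, q) in states for a in ALPHABET}
    finals = {(p, q) for (p, q) in states if keep(p in f1, q in f2)}
    return states, delta, (q1, q2), finals


def build(expr, basic):
    """Recursively turn a nested-tuple expression into a complete DFA."""
    if isinstance(expr, str):
        return basic[expr]
    op, *args = expr
    if op == "not":
        states, delta, start, finals = build(args[0], basic)
        return states, delta, start, states - finals
    if op in ("and", "or"):
        left, right = (build(arg, basic) for arg in args)
        keep = (lambda x, y: x and y) if op == "and" else (lambda x, y: x or y)
        return combine(left, right, keep)
    raise ValueError(f"unknown operation: {op}")


def accepts(dfa, word):
    _, delta, q, finals = dfa
    for a in word:
        q = delta[(q, a)]
    return q in finals


basic = {
    "sib": ends_with("sxz"),  # simplification: single-letter sibilants only
    "C": ends_with(set(ALPHABET) - set("aeiouy")),  # y is treated separately
}
non_sib_consonant = build(("and", ("not", "sib"), "C"), basic)

print(accepts(non_sib_consonant, "cat"))   # True
print(accepts(non_sib_consonant, "bus"))   # False: ends with a sibilant
print(accepts(non_sib_consonant, "idea"))  # False: ends with a vowel
```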
Recursive construction of automata
Turkish infinitive
Construct a finite automaton for the Turkish infinitive.
The infinitive has the form stem+mEk.
The placeholder E is filled by e if the last vowel of the stem is e, i, ö, ü and by a if it is a, ı, o, u.
M1 is the automaton for the expression C∗V(C|V)∗m(a|e)k (it is easy to construct).
M2 checks the condition for vowels:
[Figure: transition diagram of M2; edge labels: C, V; e, i, ö, ü; a, ı, o, u; e; a.]
M1 ∩ M2 is the required automaton.
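The idea of intersecting a shape check with a vowel check can be tried out in Python. The sketch below is not the automaton from the slide: M1 is delegated to Python's `re` module, and M2 is replaced by a small hand-written deterministic check whose state is the pair (class of the last vowel read, whether that vowel agreed with the one before it). The consonant inventory and all names are assumptions made for the example.

```python
import re

FRONT = "eiöü"
BACK = "aıou"
VOWELS = FRONT + BACK
CONSONANTS = "bcçdfgğhjklmnprsştvyz"

# M1: the regular expression C*V(C|V)*m(a|e)k.
M1 = re.compile(f"[{CONSONANTS}]*[{VOWELS}][{CONSONANTS}{VOWELS}]*m[ae]k")


def m2_accepts(word):
    """Deterministic check of the vowel condition.

    cls is the class of the last vowel read; ok says whether that vowel
    was e after a front vowel or a after a back vowel.
    """
    cls, ok = None, False
    for ch in word:
        if ch in VOWELS:
            ok = (ch == "e" and cls == "front") or (ch == "a" and cls == "back")
            cls = "front" if ch in FRONT else "back"
    return ok


def is_infinitive(word):
    """Membership in the intersection: both checks must accept."""
    return M1.fullmatch(word) is not None and m2_accepts(word)


for word in ["gelmek", "almak", "görmek", "okumak", "gelmak", "almek"]:
    print(word, is_infinitive(word))
```

The first four words are accepted and the last two are rejected. Since M1 guarantees that the word ends in m, a vowel, and k, the last vowel read by the second check is always the suffix vowel.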
Recursive construction of automata
Turkish infinitive
Construct a finite automaton for the Turkish passive infinitive.
The infinitive has the form stem+X+mEk.
The placeholder E is filled by e if the last vowel of the stem is e, i, ö, ü and by a if it is a, ı, o, u.
The suffix X is -n if the stem ends with a vowel, -In if the stem ends with l, and -Il otherwise.
The placeholder I equals ı after a, ı; u after u, o; i after e, i; ü after ü, ö.
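Before building the automaton, it can help to check the rules on a few verbs. The following Python sketch generates the passive infinitive directly from the rules above; it is a generator, not the requested automaton, and the function name, the lookup tables and the assumption that every stem contains at least one vowel are introduced only for this illustration.

```python
VOWELS = "aıoueiöü"
# Four-way harmony for placeholder I, keyed by the last vowel of the stem.
I_FORM = {"a": "ı", "ı": "ı", "u": "u", "o": "u", "e": "i", "i": "i", "ü": "ü", "ö": "ü"}
# Two-way harmony for placeholder E: a after back vowels, e after front vowels.
E_FORM = {v: ("a" if v in "aıou" else "e") for v in VOWELS}


def passive_infinitive(stem):
    """Generate stem+X+mEk; assumes the stem contains at least one vowel."""
    last_vowel = next(ch for ch in reversed(stem) if ch in VOWELS)
    if stem[-1] in VOWELS:
        suffix = "n"
    elif stem[-1] == "l":
        suffix = I_FORM[last_vowel] + "n"
    else:
        suffix = I_FORM[last_vowel] + "l"
    return stem + suffix + "m" + E_FORM[last_vowel] + "k"


for stem in ["oku", "bul", "gel", "yap", "gör"]:
    print(stem, passive_infinitive(stem))
```

This prints okunmak, bulunmak, gelinmek, yapılmak and görülmek. Because the vowel chosen for I always belongs to the same front or back class as the last stem vowel, choosing E from the stem or from the suffix X gives the same result.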
Where to get presentations
https://www.irit.fr/esslli2017/courses/33
http://tipl.philol.msu.ru/~otipl/index.php/department/faculty/AAS/esslli
For the next day:
Install (simply download and unpack) the finite-state compiler FOMA
from https://code.google.com/archive/p/foma/.