Learning Restricted Restarting Automata


Page 1: Learning Restricted Restarting Automata

Learning Restricted Restarting Automata

Presentation for the ABCD workshop in Prague, March 27–29, 2009. Peter Černo


Page 5: Learning Restricted Restarting Automata

About the presentation

PART I: We present a specialized program that allows easy design and testing of restarting automata and provides specialized tools for learning finite automata and defining languages.

PART II: We introduce two restricted models of restarting automata, RACL and RAΔCL, together with their properties and limitations.

PART III: We demonstrate learning of these restricted models with another specialized program.

PART IV: We give a list of open problems and topics for future investigation.

Page 6: Learning Restricted Restarting Automata

Restarting Automaton

A restarting automaton is a system M = (Σ, Γ, I), where:
Σ is an input alphabet,
Γ is a working alphabet,
I is a finite set of meta-instructions:

Rewriting meta-instruction (EL, x → y, ER), where x, y ∊ Γ* such that |x| > |y|, and EL, ER ⊆ Γ* are regular languages called left and right constraints.

Accepting meta-instruction (E, Accept), where E ⊆ Γ* is a regular language.


Page 9: Learning Restricted Restarting Automata

Language of Restarting Automaton

The rewriting meta-instructions of M induce a reducing relation ⊢M ⊆ Γ* × Γ*: for each u, v ∊ Γ*, u ⊢M v if and only if there exist an instruction i = (EL, x → y, ER) in I and words u1, u2 ∊ Γ* such that u = u1xu2, v = u1yu2, u1 ∊ EL and u2 ∊ ER.

The accepting meta-instructions of M define the set of simple sentential forms SM: the set of all words u ∊ Γ* for which there exists an instruction i = (E, Accept) in I such that u ∊ E.

The input language of M is defined as L(M) = {u ∊ Σ* | ∃v ∊ SM: u ⊢M* v}, where ⊢M* is the reflexive and transitive closure of ⊢M.

Page 10: Learning Restricted Restarting Automata

Example

How to create a restarting automaton that recognizes the language L = {a^i b^i c^j d^j | i, j > 0}?

Accepting meta-instructions:

Name  Accepting Language
A0    ^abcd$

Reducing (rewriting) meta-instructions:

Name  Left Language  From Word  To Word  Right Language
R0    ^a*$           ab         λ        ^b*c*d*$
R1    ^a*b*c*$       cd         λ        ^d*$


Page 14: Learning Restricted Restarting Automata

Example

Suppose that we have the word aaabbbccdd:
aaabbbccdd ‒R0→ aabbccdd ‒R1→ aabbcd ‒R0→ abcd

The word abcd is accepted by A0, so the whole word aaabbbccdd is accepted. Note that abcd is a simple sentential form.

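The example above can be simulated with a short sketch (an illustrative reimplementation, not the program presented in Part I; the ^ and $ anchors follow the regex conventions of the tables above):

```python
import re
from collections import deque

# A sketch of the restarting automaton above.  An accepting
# meta-instruction is a regular expression E; a rewriting meta-instruction
# (EL, x -> y, ER) rewrites u x v to u y v whenever u matches EL
# and v matches ER.

ACCEPT = [re.compile(r"^abcd$")]                               # A0
REWRITE = [
    (re.compile(r"^a*$"), "ab", "", re.compile(r"^b*c*d*$")),  # R0
    (re.compile(r"^a*b*c*$"), "cd", "", re.compile(r"^d*$")),  # R1
]

def reductions(w):
    """All words obtainable from w by one rewriting meta-instruction."""
    out = set()
    for el, x, y, er in REWRITE:
        for i in range(len(w) - len(x) + 1):
            if w[i:i + len(x)] == x and el.match(w[:i]) and er.match(w[i + len(x):]):
                out.add(w[:i] + y + w[i + len(x):])
    return out

def accepts(w):
    """Search for a simple sentential form reachable from w."""
    seen, queue = {w}, deque([w])
    while queue:
        u = queue.popleft()
        if any(a.match(u) for a in ACCEPT):
            return True
        for v in reductions(u):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

print(accepts("aaabbbccdd"))  # True: aaabbbccdd -> aabbccdd -> aabbcd -> abcd
print(accepts("aabbbccdd"))   # False
```

Since every rewriting meta-instruction shortens the word, the search space is finite and the breadth-first search always terminates.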

Page 15: Learning Restricted Restarting Automata

PART I: RestartingAutomaton.exe

Page 16: Learning Restricted Restarting Automata

Capabilities and features

Design a restarting automaton. The design consists of the stepwise design of accepting and reducing meta-instructions. You can save (load) a restarting automaton to (from) an XML file.

Test a correctly defined restarting automaton: the system can list all words that can be obtained by reductions from a given word w, and all reduction paths from one given word to another given word.

Start a server mode, in which client applications can use services provided by the server application.

Use specialized tools to define formal languages. You can also save, load, copy, paste and view an XML representation of the current state of every tool.

Page 17: Learning Restricted Restarting Automata

Learning Languages

There are several tools for defining languages:

DFA Modeler: allows you to enter a regular language by specifying its underlying deterministic finite automaton.

LStar Algorithm: encapsulates Dana Angluin's L* algorithm, a machine learning algorithm that learns a deterministic finite automaton using membership and equivalence queries.

RPNI Algorithm: encapsulates a machine learning algorithm that learns a deterministic finite automaton from a given set of labeled examples.

Regular Expression: allows you to enter a regular language by specifying a regular expression.

SLT Language: allows you to design a regular language by specifying a positive integer k and positive examples, using the algorithm for learning k-SLT languages.

Page 18: Learning Restricted Restarting Automata

Pros and Cons

Pros:
The application is written in C# using .NET Framework 2.0. It works both on Win32 and UNIX platforms.
The application demonstrates that it is easy to design and work with restarting automata.
Any component of the application can easily be reused in other projects. It saves your work.

Cons:
The application is a toy that only allows you to design simple restarting automata recognizing simple formal languages over small alphabets of a few letters. On large inputs the computation can take a long time and can produce a huge output.


Page 22: Learning Restricted Restarting Automata

PART II: RACL

k-local clearing restarting automaton (k-RACL): M = (Σ, I)

Σ is a finite nonempty alphabet, ¢, $ ∉ Σ
I is a finite set of instructions (x, z, y), where x ∊ LCk, y ∊ RCk, z ∊ Σ+
Left context: LCk = Σ^k ∪ ¢.Σ^(≤k−1); right context: RCk = Σ^k ∪ Σ^(≤k−1).$

A word w = uzv can be rewritten to uv (uzv → uv) if and only if there exists an instruction i = (x, z, y) ∊ I such that x ⊒ ¢.u (x is a suffix of ¢.u) and y ⊑ v.$ (y is a prefix of v.$).

A word w is accepted if and only if w →* λ, where →* is the reflexive and transitive closure of →.

We define the class RACL as ⋃k≥1 k-RACL.

Page 23: Learning Restricted Restarting Automata

Why RACL?

This model was inspired by the Associative Language Descriptions (ALD) model by Alessandra Cherubini, Stefano Crespi-Reghizzi, Matteo Pradella and Pierluigi San Pietro. See: http://home.dei.polimi.it/sanpietr/ALD/ALD.html

The more restricted the model, the easier the investigation of its properties. Moreover, the learning methods are much simpler and more straightforward.


Page 30: Learning Restricted Restarting Automata

Example

Language L = {a^n b^n | n ≥ 0}. A 1-RACL M = ({a, b}, I), where the instructions in I are:
R1 = (a, ab, b)
R2 = (¢, ab, $)

For instance: aaaabbbb ‒R1→ aaabbb ‒R1→ aabb ‒R1→ ab ‒R2→ λ

Now we see that the word aaaabbbb is accepted, because aaaabbbb →* λ.

Note that λ is always accepted, because λ →* λ.
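The clearing behaviour of this 1-RACL can be sketched directly (illustrative code; searching all reduction orders is needed because the model is nondeterministic):

```python
from collections import deque

# A sketch of the 1-RACL above.  An instruction (x, z, y) clears z from
# w = u z v whenever x is a suffix of ¢u and y is a prefix of v$; the
# sentinels ¢ and $ are ordinary characters here, which is safe because
# they are not in the alphabet {a, b}.

INSTRUCTIONS = [
    ("a", "ab", "b"),   # R1
    ("¢", "ab", "$"),   # R2
]

def step(w):
    """All words obtainable from w by one clearing instruction."""
    out = set()
    for x, z, y in INSTRUCTIONS:
        for i in range(len(w) - len(z) + 1):
            if (w[i:i + len(z)] == z
                    and ("¢" + w[:i]).endswith(x)
                    and (w[i + len(z):] + "$").startswith(y)):
                out.add(w[:i] + w[i + len(z):])
    return out

def accepts(w):
    """w is accepted iff w ->* λ; all reduction orders are searched."""
    seen, queue = {w}, deque([w])
    while queue:
        u = queue.popleft()
        if u == "":
            return True
        for v in step(u):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

print(accepts("aaaabbbb"))  # True
print(accepts("aabbb"))     # False
```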


Page 34: Learning Restricted Restarting Automata

Some Theorems

Theorem: For every finite L ⊆ Σ* there exists a 1-RACL M such that L(M) = L ∪ {λ}.
Proof. Suppose L = {w1, …, wn}; consider I = {(¢, w1, $), …, (¢, wn, $)}. ∎

Theorem: For all k ≥ 1, ℒ(k-RACL) ⊆ ℒ((k+1)-RACL).

Theorem: For each regular language L there exists a k-RACL M such that L(M) = L ∪ {λ}.
Proof. Based on the pumping lemma for regular languages. For each z ∊ Σ* with |z| = n there exist u, v, w with z = uvw, |v| ≥ 1 and δ(q0, uv) = δ(q0, u); the factor v can be crossed out, so we add the corresponding instruction iz = (¢u, v, w). For each accepted z ∊ Σ* with |z| < n we add the instruction iz = (¢, z, $). ∎
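The finite-language construction admits a minimal sanity check (illustrative code; with whole-word contexts ¢ and $, an instruction can only clear the entire tape contents at once):

```python
# Sketch of the finite-language construction: for a finite L, take
# I = {(¢, w, $) | w in L}.  Each w in L is cleared to λ in one step,
# and λ itself is always accepted because λ ->* λ.

L = {"ab", "ba", "abc"}
I = [("¢", w, "$") for w in L]

def accepts(word):
    # With contexts ¢ and $, an instruction applies only to the whole word.
    return word == "" or any(z == word for (_, z, _) in I)

print(sorted(w for w in ["", "ab", "ba", "abc", "aab"] if accepts(w)))
# → ['', 'ab', 'abc', 'ba']
```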


Page 38: Learning Restricted Restarting Automata

Some Theorems

Lemma: Let M be a RACL, i = (x, z, y) one of its instructions, and w = uv such that x ⊒ ¢.u and y ⊑ v.$. Then uv ∊ L(M) ⇒ uzv ∊ L(M).
Proof. uzv ―i→ uv →* λ. ∎

Theorem: The languages L1 ∪ L2 and L1.L2, where L1 = {a^n b^n | n ≥ 0} and L2 = {a^n b^(2n) | n ≥ 0}, are not accepted by any RACL.
Proof by contradiction, based on the previous lemma.

Corollary: RACL is not closed under union and concatenation.

Corollary: RACL is not closed under homomorphism.
Proof. Consider {a^n b^n | n ≥ 0} ∪ {c^n d^(2n) | n ≥ 0} and the homomorphism defined by a ↦ a, b ↦ b, c ↦ a, d ↦ b. ∎


Page 42: Learning Restricted Restarting Automata

Some Theorems

Theorem: The language L1 = {a^n c b^n | n ≥ 0} ∪ {λ} is not accepted by any RACL.

Theorem: The languages
L2 = {a^n c b^n | n ≥ 0} ∪ {a^m b^m | m ≥ 0},
L3 = {a^n c b^m | n, m ≥ 0} ∪ {λ},
L4 = {a^m b^m | m ≥ 0}
are recognized by 1-RACL.

Corollary: RACL is not closed under intersection. Proof. L1 = L2 ∩ L3. ∎

Corollary: RACL is not closed under intersection with a regular language. Proof. L3 is a regular language. ∎

Corollary: RACL is not closed under difference. Proof. L1 = (L2 – L4) ∪ {λ}. ∎


Page 45: Learning Restricted Restarting Automata

Parentheses

The following instruction of a 1-RACL M is enough for recognizing the language of correct parentheses:
(λ, ( ), λ)

Note: This instruction represents a set of instructions ({¢} ∪ Σ, ( ), Σ ∪ {$}), where Σ = {(, )} and (A, w, B) = {(a, w, b) | a ∊ A, b ∊ B}.

Note: We use the following shorthand notation for (A, w, B):
A w B


Page 47: Learning Restricted Restarting Automata

Arithmetic expressions

Suppose that we want to check the correctness of arithmetic expressions over the alphabet Σ = {α, +, *, (, )}. For example, α+(α*α+α) is correct, while α*+α is not. The priority of the operations is taken into account.

The following 1-RACL M is sufficient:

[Table of instructions in the A w B notation; the individual instructions appear in the worked example on the next slide.]

Page 48: Learning Restricted Restarting Automata

Arithmetic expressions: Example

Expression                        Instruction
α*α + ((α + α) + (α + α*α))*α     (¢, α*, α)
α + ((α + α) + (α + α*α))*α       (α, +α, ) )
α + ((α) + (α + α*α))*α           ( ), *α, $)
α + ((α) + (α + α*α))             (+, α*, α)
α + ((α) + (α + α))               ( (, α+, α)
α + ((α) + (α))                   ( (, α, ) )
α + (( ) + (α))                   ( (, ( )+, ( )
α + ((α))                         ( (, α, ) )
α + (( ))                         ( (, ( ), ) )
α + ( )                           (¢, α+, ( )
( )                               (¢, ( ), $)
λ                                 accept


Page 52: Learning Restricted Restarting Automata

Nondeterminism

Assume the following instructions:
R1 = (bb, a, bbbb)
R2 = (bb, bb, $)
R3 = (¢, cbb, $)
and the word cbbabbbb. Then:
cbbabbbb ―R1→ cbbbbbb ―R2→ cbbbb ―R2→ cbb ―R3→ λ.

But if we had started with R2, cbbabbbb ―R2→ cbbabb, it would not be possible to continue.

⇒ The order in which the instructions are applied matters!
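This sensitivity to instruction order can be demonstrated with a sketch (illustrative code; `greedy` commits to the first applicable reduction in a given preference order, while `accepts` searches all orders):

```python
from collections import deque

# A sketch contrasting exhaustive search with a greedy strategy on the
# instructions above (k = 2; ¢ and $ are the sentinel characters).

R1 = ("bb", "a", "bbbb")
R2 = ("bb", "bb", "$")
R3 = ("¢", "cbb", "$")

def apply_inst(w, inst):
    """All one-step reductions of w by a single instruction."""
    x, z, y = inst
    out = []
    for i in range(len(w) - len(z) + 1):
        if (w[i:i + len(z)] == z
                and ("¢" + w[:i]).endswith(x)
                and (w[i + len(z):] + "$").startswith(y)):
            out.append(w[:i] + w[i + len(z):])
    return out

def accepts(w, insts=(R1, R2, R3)):
    """Exhaustive search: accepted iff some reduction order reaches λ."""
    seen, queue = {w}, deque([w])
    while queue:
        u = queue.popleft()
        if u == "":
            return True
        for inst in insts:
            for v in apply_inst(u, inst):
                if v not in seen:
                    seen.add(v)
                    queue.append(v)
    return False

def greedy(w, order):
    """Always take the first reduction in the preference order; may get stuck."""
    while w:
        nxt = [v for inst in order for v in apply_inst(w, inst)]
        if not nxt:
            return False
        w = nxt[0]
    return True

print(accepts("cbbabbbb"))               # True  (e.g. R1, R2, R2, R3)
print(greedy("cbbabbbb", (R2, R1, R3)))  # False: taking R2 first dead-ends
```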


Page 57: Learning Restricted Restarting Automata

Hardest CFL H

Defined by S. A. Greibach; the definition below follows Section 10.5 of M. Harrison, Introduction to Formal Language Theory, Addison-Wesley, Reading, MA, 1978.

Let D2 be the semi-Dyck set on {a1, a2, a1', a2'} generated by the grammar S → a1Sa1'S | a2Sa2'S | λ.
Let Σ = {a1, a2, a1', a2', b, c} and d ∉ Σ.
Then H = {λ} ∪ {∏i=1..n xi c yi c zi d | n ≥ 1, y1y2…yn ∊ bD2, xi, zi ∊ Σ*}.

Each CFL can be obtained as an inverse homomorphic image of H.

Page 58: Learning Restricted Restarting Automata

Hardest CFL H

Theorem: H is not accepted by any RACL.
Proof by contradiction.

However, if we slightly extend the definition of RACL, we will be able to recognize H.

Page 59: Learning Restricted Restarting Automata

RAΔCL

k-local Δ-clearing restarting automaton (k-RAΔCL): M = (Σ, I)

Σ is a finite nonempty alphabet, ¢, $, Δ ∉ Σ, Γ = Σ ∪ {Δ}
I is a finite set of instructions of two forms:
(1) (x, z → λ, y)
(2) (x, z → Δ, y)
where x ∊ LCk, y ∊ RCk, z ∊ Γ+.
Left context: LCk = Γ^k ∪ ¢.Γ^(≤k−1); right context: RCk = Γ^k ∪ Γ^(≤k−1).$


Page 64: Learning Restricted Restarting Automata

RAΔCL

A word w = uzv can be rewritten to uv (uzv → uv) if there exists an instruction i = (x, z → λ, y) ∊ I, or to uΔv (uzv → uΔv) if there exists an instruction i = (x, z → Δ, y) ∊ I, in both cases such that x ⊒ ¢.u and y ⊑ v.$.

A word w is accepted if and only if w →* λ, where →* is the reflexive and transitive closure of →.

We define the class RAΔCL as ⋃k≥1 k-RAΔCL.

Page 65: Learning Restricted Restarting Automata

Hardest CFL H revisited

Theorem: H is recognized by a 1-RAΔCL.

Idea. Suppose that we have w ∊ H:
w = ¢ x1cy1cz1d x2cy2cz2d … xncyncznd $

In the first phase we delete letters (from the alphabet Σ = {a1, a2, a1', a2', b, c}) to the right of ¢ and to the left and right of the letters d.

As soon as we think that we have the word ¢ cy1cd cy2cd … cyncd $, we introduce the Δ symbols: ¢ Δy1Δy2Δ…ΔynΔ $.

In the second phase we check whether y1y2…yn ∊ bD2.

Page 66: Learning Restricted Restarting Automata

Instructions of M recognizing the CFL H

Suppose Σ = {a1, a2, a1', a2', b, c}, d ∉ Σ, Γ = Σ ∪ {d, Δ}.

Instructions for the first phase:
(1) (¢, Σ → λ, Σ)
(2) (Σ, Σ → λ, d)
(3) (d, Σ → λ, Σ)
(4) (¢, c → Δ, Σ ∪ {Δ})
(5) (Σ ∪ {Δ}, cdc → Δ, Σ ∪ {Δ})
(6) (Σ ∪ {Δ}, cd → Δ, $)

Instructions for the second phase:
(7) (Γ, a1a1' → λ, Γ – {b})
(8) (Γ, a2a2' → λ, Γ – {b})
(9) (Γ, a1Δa1' → Δ, Γ – {b})
(10) (Γ, a2Δa2' → Δ, Γ – {b})
(11) (Σ – {c}, Δ → λ, Δ)
(12) (¢, ΔbΔ → λ, $)

In fact, there is no such thing as a first phase or a second phase; we only have instructions.

Theorem: H ⊆ L(M). Theorem: H ⊇ L(M).
Idea. We describe all words that are generated by the instructions.
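These instructions can be exercised with a sketch, under the encoding assumption a1 = '(', a1' = ')', a2 = '[', a2' = ']' (so bD2 is "b" followed by a balanced two-bracket string) and with the ASCII letter D standing for Δ; the code is illustrative, not the program from Part III:

```python
from collections import deque

# A sketch of the 1-RAΔCL for the hardest CFL H.  Contexts are sets of
# single symbols ("¢" and "$" stand for the sentinels); an instruction is
# (left set, cleared word, replacement, right set).

S = set("()[]bc")        # the alphabet Σ, with a1 a1' a2 a2' as ( ) [ ]
G = S | {"d", "D"}       # the working symbols Γ (D stands for Δ)

INSTS = (
      [({"¢"}, z, "", S) for z in S]       # (1) (¢, Σ → λ, Σ)
    + [(S, z, "", {"d"}) for z in S]       # (2) (Σ, Σ → λ, d)
    + [({"d"}, z, "", S) for z in S]       # (3) (d, Σ → λ, Σ)
    + [({"¢"}, "c", "D", S | {"D"}),       # (4) (¢, c → Δ, Σ ∪ {Δ})
       (S | {"D"}, "cdc", "D", S | {"D"}), # (5) (Σ ∪ {Δ}, cdc → Δ, Σ ∪ {Δ})
       (S | {"D"}, "cd", "D", {"$"}),      # (6) (Σ ∪ {Δ}, cd → Δ, $)
       (G, "()", "", G - {"b"}),           # (7) (Γ, a1a1' → λ, Γ – {b})
       (G, "[]", "", G - {"b"}),           # (8) (Γ, a2a2' → λ, Γ – {b})
       (G, "(D)", "D", G - {"b"}),         # (9) (Γ, a1Δa1' → Δ, Γ – {b})
       (G, "[D]", "D", G - {"b"}),         # (10) (Γ, a2Δa2' → Δ, Γ – {b})
       (S - {"c"}, "D", "", {"D"}),        # (11) (Σ – {c}, Δ → λ, Δ)
       ({"¢"}, "DbD", "", {"$"})]          # (12) (¢, ΔbΔ → λ, $)
)

def step(w):
    """All one-step rewritings of w."""
    out = set()
    bordered = "¢" + w + "$"
    for left, z, r, right in INSTS:
        for i in range(len(w) - len(z) + 1):
            if (w[i:i + len(z)] == z
                    and bordered[i] in left
                    and bordered[i + len(z) + 1] in right):
                out.add(w[:i] + r + w[i + len(z):])
    return out

def accepts(w):
    """w is accepted iff w ->* λ (rewritings never lengthen the word)."""
    seen, queue = {w}, deque([w])
    while queue:
        u = queue.popleft()
        if u == "":
            return True
        for v in step(u):
            if v not in seen:
                seen.add(v)
                queue.append(v)
    return False

# x1 c y1 c z1 d with x1 = "(", y1 = "b()" ∊ bD2, z1 = "b":
print(accepts("(cb()cbd"))   # True
print(accepts("cd"))         # False (y1 = λ ∉ bD2)
```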


Page 69: Learning Restricted Restarting Automata

The power of RACL

Theorem: There exists a k-RACL M recognizing a language that is not a CFL.

Idea. We create a k-RACL M such that L(M) ∩ {(ab)^n | n > 0} = {(ab)^(2^m) | m ≥ 0}. If L(M) were a CFL, then its intersection with a regular language would also be a CFL; in our case the intersection is not a CFL, so L(M) is not a CFL.


Page 75: Learning Restricted Restarting Automata

How does it work?

Example:
¢ abababababababab $ → ¢ abababababababb $ →
¢ abababababbabb $ → ¢ abababbabbabb $ →
¢ abbabbabbabb $ → ¢ abbabbabbab $ →
¢ abbabbabab $ → ¢ abbababab $ →
¢ abababab $ → ¢ abababb $ → ¢ abbabb $ →
¢ abbab $ → ¢ abab $ → ¢ abb $ → ¢ ab $ → ¢ λ $ → accept.


Page 78: Learning Restricted Restarting Automata

The power of RACL: Instructions

If we infer instructions from the previous example, then for k = 4 we get the following 4-RACL M:

[Table of instructions in the A w B notation.]

Theorem: L(M) ∩ {(ab)^n | n > 0} = {(ab)^(2^m) | m ≥ 0}.
Idea. We describe the whole language L(M).

Note that this technique does not work for k < 4.

Page 79: Learning Restricted Restarting Automata

PART III: RACL.exe

Page 80: Learning Restricted Restarting Automata

RACL.exe: Reduce/Generate


Page 86: Learning Restricted Restarting Automata

PART IV: Open Problems

We can restrict our model to a k-RASIMPLE, which is the same as the k-RACL except that it does not use the symbols ¢ and $. I think that this model is useless, because it is not able to recognize even finite languages.

We can generalize our model to a k-RAΔnCL, which is the same as the k-RAΔCL except that it uses n Δ-symbols Δ1, Δ2, …, Δn. This model is able to recognize each CFL.

We can study closure properties of these models.

We can study decidability questions: L(M) = ∅, L(M) = Σ*, etc.

We can study the differences between the language classes of RACL and RAΔCL (for different values of k), etc.

We can study whether these models are applicable to real problems: for example, whether we can recognize the Pascal language, etc.

Page 87: Learning Restricted Restarting Automata

References

A. Cherubini, S. Crespi Reghizzi, P. L. San Pietro: Associative Language Descriptions. Theoretical Computer Science 270, 2002, 463–491.

P. Jančar, F. Mráz, M. Plátek, J. Vogel: On Monotonic Automata with a Restart Operation. Journal of Automata, Languages and Combinatorics 4(4), 1999, 287–311.

F. Mráz, F. Otto, M. Plátek: Learning Analysis by Reduction from Positive Data. In: Y. Sakakibara, S. Kobayashi, K. Sato, T. Nishino, E. Tomita (Eds.): Proceedings of ICGI 2006, LNCS 4201, Springer, Berlin, 2006, 125–136.

http://www.petercerno.wz.cz/ra.html