Closure, Kleene’s Theorem, and Kleene’s...

Post on 16-Mar-2020

10 views 0 download

Transcript of Closure, Kleene’s Theorem, and Kleene’s...

DEGefWb5wiGH2XgYMEzUKajEmtCDUsRQ4d1A nice paper relevant to this course is titled “The Glory of the Past”2NICTA Researcher,

Adjunct at the Australian National University and Griffith University

ε-Closure, Kleene’s Theorem,and Kleene’s Algebra

Charles Gretton2

24 February 2014

NICTA Funding and Supporting Members and Partners

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 1/30

First, a Quick Summary of ε-Closure

• I’ve previously waived my hands about ε-labelled arcs.

• Recall, ε is our notation for the empty string.

• How can we formally treat ε-labelled arcs in an NFA?

• Using a function called ECLOSE...

• A construction similar to that for NFA←→ DFA is used to proveε-NFA←→ DFA

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 2/30

Quick Summary of ε-Closure

• ε-NFA is 〈Q,Σ, δ, q0,F 〉.

• ECLOSE : Q → 2Q – i.e. gives an element in the powerset of states.

• It is defined inductively, as follows.

• BASIS: q ∈ ECLOSE(q) – i.e. the argument is in the basis.

• INDUCTION: Suppose p ∈ ECLOSE(q) and r ∈ δ(p, ε), thenr ∈ ECLOSE(q).

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 3/30

Quick Example of ε-Closure

• ECLOSE : Q → 2Q – i.e. gives an element in the powerset of states.

• It is defined inductively, as follows.

• BASIS: q ∈ ECLOSE(q) – i.e. the argument is in the basis.

• INDUCTION: Suppose p ∈ ECLOSE(q) and r ∈ δ(p, ε), thenr ∈ ECLOSE(q).

???

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 4/30

Quick Example of ε-Closure

• ECLOSE : Q → 2Q – i.e. gives an element in the powerset of states.

• It is defined inductively, as follows.

• BASIS: q ∈ ECLOSE(q) – i.e. the argument is in the basis.

• INDUCTION: Suppose p ∈ ECLOSE(q) and r ∈ δ(p, ε), thenr ∈ ECLOSE(q).

ECLOSE(1) = {1, 2, 3, 4, 6}

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 5/30

ε-Closure Subset Construction

• For DFA −→ ε-NFA, just add δ(q, ε) = ∅ transitions to all states –i.e. ε is not an alphabet symbol.

• ε-NFA −→ DFA

• Same as for NFA −→ DFA,

• Each DFA state corresponds to a unique set of ε-NFA states.

• Only take the ε-closure at ε-NFA transitions.

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 6/30

Summarizing

• NFA←→ DFA

• The equivalent NFA can be exponentially smaller than thecorresponding DFA.

• ε-NFA←→ DFA

• From transitivity: ε-NFA←→ NFA

• All formalisms so far are expressively equivalent.

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 7/30

Languages and Regular Expressions

• Regular expression implementations in many UNIX tools: grep,sed, awk, emacs and vi

• Following Kleene’s seminal work, we deal with regularity formally.

• We describe the language intended by a given regular expression.

• We consider the class of languages that can be expressed.

• We examine algebraic laws which inform re-writing rules forsimplifying expressions.

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 8/30

Operations on Languages

• L1,L2 ⊆ Σ∗

• product of languages

L1.L2 = L1L2 = {WX |W ∈ L1,X ∈ L2}

• inductive i th product

L0 = {ε} ∀i > 0, Li = LLi−1

L∗ =⋃

i≥0 Li

• positive closure

L+ = LL∗

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 9/30

Regular Expressions

• A regular expression is a well formed word over a given alphabet Σ

• We use the following symbols:

Σ ∪ {ε ∅ ( ) + . ∗}

• alphabet• empty string• empty set• open parenthesis• close parenthesis• “union” or “choice”• concatenation – X.Y often written as XY• Kleene closure – the “star” of this show

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 10/30

Regular Expressions

• A regular expression is a well formed word over a given alphabet Σ

• We use the following symbols:

Σ ∪ {ε ∅ ( ) + . ∗}

• “symbol” :: alphabet• constant (or 0-ary operator) :: empty string• constant (or 0-ary operator) :: empty set• punctuation :: open parenthesis• punctuation :: close parenthesis• infix binary operator :: “union” or “choice”• infix binary operator :: concatenation – X.Y often written as XY• postfix unary operator :: Kleene closure – the “star” of this show

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 11/30

Regular Expressions

• We use the following symbols:

Σ ∪ {ε ∅ ( ) + . ∗}

• Taking a ∈ Σ ∪ {ε, ∅}, regular expressions satisfy the followinggrammar:

regxp ::= a|(regxp)|regxp + regxp|regxp.regxp|regxp∗

• The Kleene closure operator, *, has the highest precedence.• Concatenation, . , is usually represented by juxtaposition – e.g.

concatenation a.b is written ab• Concatenation has the next highest precedence.• Union/choice has the least precedence.

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 12/30

Regular Expression Examples

• Taking a ∈ Σ ∪ {ε, ∅}, regular expressions satisfy the followinggrammar:

regxp ::= a|(regxp)|regxp + regxp|regxp.regxp|regxp∗

• So, which of the following is correct?

ab∗ + a = (a(b∗ + a))ab∗ + a = (ab)∗ + aab∗ + a = (a(b∗)) + a

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 13/30

Regular Expression Examples

• Taking a ∈ Σ ∪ {ε, ∅}, regular expressions satisfy the followinggrammar:

regxp ::= a|(regxp)|regxp + regxp|regxp.regxp|regxp∗

• So, which of the following is correct?

ab∗ + a = (a(b∗ + a))ab∗ + a = (ab)∗ + aab∗ + a = (a(b∗)) + a

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 14/30

Semantics of Regular Expressions

• BASISLanguage of Expression SemanticsL(ε) = {ε}L(∅) = ∅L(a), a ∈ Σ = {a}

• INDUCTION

• Below, L1 and L2 are well formed regular expressions.

Language of Expression SemanticsL(L1 + L2) = L(L1) ∪ L(L2)L(L1L2) = L(L1)L(L2)L(L∗1) = L(L1)∗

L((L1)) = L(L1)

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 15/30

Every Finite Language is Regular

• A regular language is a language that can be given by a regularexpression.

• A finite language is a finite set of strings:

{X1,X2, ..,Xn}

• The regular expression that gives that finite language is:

X1 + X2 + ..+ Xn

• QED

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 16/30

Algebra and Regular Expressions

• Commutativity laws:

L1 + L2 = L2 + L1

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 17/30

Algebra and Regular Expressions

• Cancellation laws:• Identities

L + ∅ = L∅+ L = LεL = LLε = L

• Annihilators

L∅ = ∅∅L = ∅

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 18/30

Algebra and Regular Expressions

• Associative laws:

L1 + (L2 + L3) = (L1 + L2) + L3

L1(L2L3) = (L1L2)L3

• Power associativity

L1(L1L1) = (L1L1)L1

... ... ...(L1L1)(L1L1) = (L1(L1L1))L1

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 19/30

Algebra and Regular Expressions

• Left and right distributive laws:

L1(L2 + L3) = L1L2 + L1L3

(L2 + L3)L1 = L2L1 + L3L1

• Idempotence:

L + L = L

• Closure laws – There are exactly two languages whose closures arefinite. What are they?

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 20/30

Algebra and Regular Expressions

• Left and right distributive laws:

L1(L2 + L3) = L1L2 + L1L3

(L2 + L3)L1 = L2L1 + L3L1

• Idempotence:

L + L = L

• Closure laws

ε∗ = ε∅∗ = ε(L∗)∗ = L∗

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 21/30

Application of Algebraic Laws

• a + ab∗

• precedence: (a + (a(b∗)))

• annihilators: (aε+ (a(b∗)))

• left distributive law: a(ε+ b∗)

• cancellation law: a(b∗)

• CONCLUSION: ab∗ = a + ab∗

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 22/30

Relating Concrete and Lifted Expressions

• When I write (L∗)∗ = L∗, I intended that L could be any regularexpression.

• For example, the following concrete expressions are instances ofthis identity.

1 ((a)∗)∗ = a∗

2 ((ε(a + bab + ∅) + bab)∗)∗ = (ε(a + bab + ∅) + bab)∗

• The above are called concrete regular expressions, because theycontain no variables.

• Our textbook talks about a specific class of concrete expression(above, Type 1), where each variable symbol from the liftedexpression is replaced with an alphabet constant.

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 23/30

Theorem Relating Concrete and LiftedExpressions

• Let E be a regular expression with variable symbols Li , i ∈ [1..m].

• For example taking m = 2, E could be (L1 + L2)∗, (L1L∗2)∗, etc

• Let E ′ be an expression where each variable Li is replaced by aconstant ai .

• Every X ∈ L(E) can be written as X1X2..Xk , where each substringXj is in one of the languages L(Li).

• Writing L(Xj) = i if Xj is from the i th language,

• aL(X1)aL(X2)..aL(Xk )∈ L(E ′)

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 24/30

Theorem Relating Concrete and LiftedExpressions

• BASIS: E ∈ {ε, ∅, L1}.

• The cases E ∈ {ε, ∅} are trivial, because the concrete and liftedexpressions are identical.

• Taking E = L1, then L(E) = L(L1). Also E ′ = a1 andL(E ′) = L(a1) = {a1}.

• Above, the L1/a1 correspondence is clearly satisfied.

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 25/30

Theorem Relating Concrete and LiftedExpressions

• INDUCTION: E ∈ {E1 + E2,E1E2,E∗1}.

• E ′1 and E ′2 are constructed so that if a variable appears in bothsubexpressions, then that is substituted with the same concretesymbol.

• For example, take E = L1L2 + L2L1.

• Then a valid concrete expression is: E = a1a2 + a2a1.

• An invalid concrete expression would be: E = a1a2 + a3a4.

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 26/30

Theorem Relating Concrete and LiftedExpressions

• INDUCTION: E ∈ {E1 + E2,E1E2,E∗1}.

• E ′1 and E ′2 are constructed so that if a variable appears in bothsubexpressions, then that is substituted with the same concretesymbol.

• L(E ′) = L(E ′1 + E ′2) = L(E ′1) ∪ L(E ′2)

• L(E) = L(E1 + E2) = L(E1) ∪ L(E2)

• Our inductive hypothesis gives us the correspondence for strings inL(E1)/L(E ′1).

• Also for strings in L(E2)/L(E ′2).

• Following the definition of union, we have proved this step case.

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 27/30

Theorem Relating Concrete and LiftedExpressions

• INDUCTION: E ∈ {E1 + E2,E1E2,E∗1}.• E ′1 and E ′2 are constructed so that if a variable appears in both

subexpressions, then that is substituted with the same concretesymbol.

• L(E ′) = L(E ′1.E′2) = L(E ′1).L(E ′2)

• L(E) = L(E1.E2) = L(E1).L(E2)

• Our inductive hypothesis gives us the correspondence for strings inL(E1)/L(E ′1).

• Also for strings in L(E2)/L(E ′2).

• Following the definition of concatenation, we have proved this stepcase.

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 28/30

Theorem Relating Concrete and LiftedExpressions

• INDUCTION: E ∈ {E1 + E2,E1E2,E∗1}.

• L((E ′1)∗) = L(ε) ∪ L(E ′1)+

• L(E∗1 ) = L(ε) ∪ L(E1)+

• Our inductive hypothesis gives us the correspondence for strings inL(E1)/L(E ′1).

• Because closure yields zero or more occurrences of the languagestrings, we also have a correspondence between L(E1)∗/L(E ′1)∗.

• QED

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 29/30

Theorem Relating Concrete and LiftedExpressions

• What if we added to the step cases:E ∈ {E1 ∩ E2,E1 + E2,E1E2,E∗1}?

• Let’s propose a law L1 ∩ L2 ∩ L3 = L1 ∩ L2

• The law holds for a concretization: L(a ∩ b ∩ c) = L(a ∩ b) = ∅

• But fails generally: L(a ∩ a ∩ ∅) 6= L(a ∩ a)

• So, the theorem we just proved is rather specific to regularexpressions.

Kleene’s Theorem Copyright NICTA 2014 Charles Gretton2 30/30