Recursive Definitions & Regular Expressions (RE).

Transcript of Recursive Definitions & Regular Expressions (RE).

Page 1: Recursive Definitions & Regular Expressions (RE).

Recursive Definitions & Regular Expressions (RE)

Page 2: Recursive Definitions & Regular Expressions (RE).

Recursive Language Definition

A recursive definition is characteristically a three-step process:

1. First, we specify some basic objects in the set. The number of basic objects specified must be finite.

2. Second, we give a finite number of rules for constructing more objects in the set from the ones we already know.

3. Third, we declare that no objects except those constructed in this way are allowed in the set.

Page 3: Recursive Definitions & Regular Expressions (RE).

Example

Consider the set P-EVEN, which is the set of positive even numbers. We can define the set P-EVEN in several different ways:

• We can define P-EVEN to be the set of all positive integers that are evenly divisible by 2.

• P-EVEN is the set of all 2n, where n = 1, 2, . . .

• P-EVEN is defined by these three rules:

Rule 1: 2 is in P-EVEN.
Rule 2: If x is in P-EVEN, then so is x + 2.
Rule 3: The only elements in the set P-EVEN are those that can be produced from the two rules above.
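The three rules for P-EVEN can be read as a small program. The sketch below is only an illustration; the function names `p_even_up_to` and `in_p_even` are invented here and do not come from the slides.

```python
def p_even_up_to(limit):
    """Return the members of P-EVEN that do not exceed `limit`."""
    members = []
    x = 2                      # Rule 1: 2 is in P-EVEN
    while x <= limit:
        members.append(x)
        x = x + 2              # Rule 2: if x is in P-EVEN, then so is x + 2
    return members             # Rule 3: nothing else is ever produced


def in_p_even(n):
    """Decide membership by undoing Rule 2 until Rule 1 applies or fails."""
    if n == 2:
        return True            # reached the basic object of Rule 1
    if n < 2:
        return False           # fell below 2 without hitting it
    return in_p_even(n - 2)    # n is in P-EVEN exactly when n - 2 is


print(p_even_up_to(10))
print(in_p_even(14), in_p_even(7))
```

To show that 14 is in P-EVEN with these rules, the recursion has to step back through 12, 10, 8, 6, 4 and 2, which is why the closed form 2n is often more convenient for proofs about a specific number.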

Page 4: Recursive Definitions & Regular Expressions (RE).

Example

Let PALINDROME be the set of all strings over the alphabet Σ = {a, b} that are the same spelled forward as backwards; i.e., PALINDROME = {w : w = reverse(w)} = {Λ, a, b, aa, bb, aaa, aba, bab, bbb, aaaa, abba, . . .}.

Page 5: Recursive Definitions & Regular Expressions (RE).

Recursive Definition of PALINDROME

A recursive definition for PALINDROME is as follows:

Rule 1: Λ, a, and b are in PALINDROME.
Rule 2: If w ∈ PALINDROME, then so are awa and bwb.
Rule 3: No other string is in PALINDROME unless it can be produced by rules 1 and 2.
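The recursive definition of PALINDROME translates almost line by line into a membership test. This is a minimal sketch, assuming that the null string Λ is represented by the empty Python string; the name `in_palindrome` is chosen here for illustration.

```python
def in_palindrome(w):
    """Recursive membership test for PALINDROME over the alphabet {a, b}."""
    if any(ch not in "ab" for ch in w):
        return False                     # not a string over {a, b} at all
    if len(w) <= 1:
        return True                      # Rule 1: the null string, a, and b
    if w[0] == w[-1]:
        return in_palindrome(w[1:-1])    # Rule 2: w has the form a v a or b v b
    return False                         # Rule 3: no other string qualifies


for word in ["", "a", "abba", "bab", "ab", "abb"]:
    print(repr(word), in_palindrome(word))
```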

Page 6: Recursive Definitions & Regular Expressions (RE).

Arithmetic Expressions (AE)

We recursively define AE using the following rules.

What are the rules?

Page 7: Recursive Definitions & Regular Expressions (RE).

Theory Of Automata

Recursive Definition of AE

Rule 1: Any number (positive, negative, or zero) is in AE.

Rule 2: If x is in AE, then so are
(i) (x)
(ii) -x (provided that x does not already start with a minus sign)

Rule 3: If x and y are in AE, then so are
(i) x + y (if the first symbol in y is not + or -)
(ii) x - y (if the first symbol in y is not + or -)
(iii) x * y
(iv) x / y
(v) x ** y (our notation for exponentiation)
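The rules for AE can be tried out directly by asking, for a given string, whether some rule could have produced it. The sketch below rests on assumptions not stated in the slides: a "number" is taken to be a digit string with an optional leading minus sign and optional decimal part, blanks are ignored, and the names `in_ae` and `_derivable` are invented for this illustration.

```python
import re
from functools import lru_cache

NUMBER = re.compile(r"-?\d+(\.\d+)?")   # assumed spelling of a "number"
OPERATORS = ("**", "+", "-", "*", "/")


@lru_cache(maxsize=None)
def _derivable(s):
    """True if the blank-free string s can be produced by Rules 1 to 3."""
    if NUMBER.fullmatch(s):
        return True                                    # Rule 1
    if len(s) >= 3 and s[0] == "(" and s[-1] == ")" and _derivable(s[1:-1]):
        return True                                    # Rule 2 (i): (x)
    if s.startswith("-") and not s.startswith("--") and _derivable(s[1:]):
        return True                                    # Rule 2 (ii): -x
    for i in range(1, len(s) - 1):                     # Rule 3: x op y
        for op in OPERATORS:
            if s.startswith(op, i):
                x, y = s[:i], s[i + len(op):]
                if op in ("+", "-") and y[:1] in ("+", "-"):
                    continue                           # proviso on y for + and -
                if y and _derivable(x) and _derivable(y):
                    return True
    return False


def in_ae(text):
    """Membership test for AE, ignoring blanks."""
    return _derivable(text.replace(" ", ""))


print(in_ae("(2 + 4) * (7 * (9 - 3)/4)/4 * (2 + 8) - 1"))
print(in_ae("(3 + 5) + 6)"), in_ae("2(/8 + 9)"), in_ae("(3 + (4-)8)"))
```

The function never consults a list of forbidden substrings; a string is rejected only because no rule can produce it.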

Page 8: Recursive Definitions & Regular Expressions (RE).

The above definition is the most natural, because it is the method we use to recognize valid arithmetic expressions in real life.

For instance, we wish to determine if the following expression is valid:

(2 + 4) * (7 * (9 - 3)/4)/4 * (2 + 8) - 1

We do not really scan over the string, looking for forbidden substrings or counting the parentheses.

We actually imagine the expression in our mind broken down into components:

Is (2 + 4) OK? Yes
Is (9 - 3) OK? Yes

Page 9: Recursive Definitions & Regular Expressions (RE).

Arithmetic Expression AE

Obviously, the following expressions are not valid:

(3 + 5) + 6)    2(/8 + 9)    (3 + (4-)8)

The first contains unbalanced parentheses; the second contains the forbidden substring (/; the third contains the forbidden substring -).

Are there more rules? The substrings // and */ are also forbidden.

Are there still more? The most natural way of defining a valid AE is by using a recursive definition, rather than a long list of forbidden substrings.

Page 10: Recursive Definitions & Regular Expressions (RE).

Regular Expressions

Page 11: Recursive Definitions & Regular Expressions (RE).

Defining Languages Using Regular Expressions

Previously, we defined the languages:

• L1 = {xⁿ for n = 1, 2, 3, . . .}

• L2 = {x, xxx, xxxxx, . . .}

But these are not very precise ways of defining languages. So we now want to be very precise about how we define languages, and we will do this using regular expressions.

Page 12: Recursive Definitions & Regular Expressions (RE).

Regular Expressions

Regular expressions are written in boldface letters and are a way of specifying a language.

• A formal way to define the lexical specification of a language
• Remove ambiguity altogether
• Called expressions on account of their similarity with arithmetic expressions
• Use *, + and ()
  * shows repetition
  + represents choice or disjunction
  () is used for grouping

Page 13: Recursive Definitions & Regular Expressions (RE).

Language-Defining Symbols

We now introduce the use of the Kleene star, applied not to a set, but directly to the letter x and written as a superscript: x*.

This simple expression indicates some sequence of x’s (maybe none at all):

x* = Λ or x or x² or x³ … = xⁿ for some n = 0, 1, 2, 3, …

The letter x is intentionally written in boldface type to distinguish it from an alphabet character.

We can think of the star as an unknown power. That is, x* stands for a string of x’s, but we do not specify how many, and it may be the null string Λ.

Page 14: Recursive Definitions & Regular Expressions (RE).

The notation x* can be used to define languages by writing, say, L4 = language(x*).

Since x* is any string of x’s, L4 is then the language of all possible strings of x’s of any length (including Λ).

We should not confuse x* (which is a language-defining symbol) with L4 (which is the name we have given to a certain language).

Page 15: Recursive Definitions & Regular Expressions (RE).

Given the alphabet Σ = {a, b}, suppose we wish to define the language L that contains all words of the form one a followed by some number of b’s (maybe no b’s at all); that is,

L = {a, ab, abb, abbb, abbbb, …}

Using the language-defining symbol, we may write

L = language(ab*)

This equation means that L is the language in which the words are the concatenation of an initial a with some or no b’s.

From now on, for convenience, we will simply say “some b’s” to mean “some or no b’s”. When we want to mean some positive number of b’s, we will explicitly say so.

Page 16: Recursive Definitions & Regular Expressions (RE).

We can apply the Kleene star to the whole string ab if we want:

(ab)* = Λ or ab or abab or ababab …

Observe that

(ab)* ≠ a*b*

because the language defined by the expression on the left contains the word abab, whereas the language defined by the expression on the right does not.
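The difference between (ab)* and a*b* can be observed with Python's `re` module, whose star and parentheses behave like the ones used here when the whole word must match (`fullmatch`). This is only an illustrative sketch; the variable names are chosen here.

```python
import re

STAR_OF_AB = re.compile(r"(ab)*")   # the expression (ab)*
AS_THEN_BS = re.compile(r"a*b*")    # the expression a*b*

for word in ["", "ab", "abab", "aabb"]:
    in_left = STAR_OF_AB.fullmatch(word) is not None
    in_right = AS_THEN_BS.fullmatch(word) is not None
    print(repr(word), in_left, in_right)
```

The word abab is accepted only by the first pattern, and aabb only by the second, so neither language contains the other.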

Page 17: Recursive Definitions & Regular Expressions (RE).

If we want to define the language L1 = {x, xx, xxx, …} using the language-defining symbol, we can write

L1 = language(xx*)

which means that each word of L1 must start with an x followed by some (or no) x’s.

Note that we can also define L1 using the notation + (as an exponent) introduced in Chapter 2:

L1 = language(x⁺)

which means that each word of L1 is a string of some positive number of x’s.

Page 18: Recursive Definitions & Regular Expressions (RE).

Plus Sign

Let us introduce another use of the plus sign. By the expression

x + y

where x and y are strings of characters from an alphabet, we mean either x or y.

Care should be taken so as not to confuse this notation with the notation + (as an exponent).

Page 19: Recursive Definitions & Regular Expressions (RE).

Example

Consider the language T over the alphabet Σ = {a, b, c}:

T = {a, c, ab, cb, abb, cbb, abbb, cbbb, abbbb, cbbbb, …}

In other words, all the words in T begin with either an a or a c and then are followed by some number of b’s.

Using the above plus sign notation, we may write this as

T = language((a + c)b*)

Page 20: Recursive Definitions & Regular Expressions (RE).

Example

Consider a finite language L that contains all the strings of a’s and b’s of length exactly three:

L = {aaa, aab, aba, abb, baa, bab, bba, bbb}

Thus, we may write

L = language((a + b)(a + b)(a + b))

or for short,

L = language((a + b)³)
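Reading (a + b)³ as "choose a or b, three times in a row" corresponds to a Cartesian product of the alphabet with itself. A small sketch using the standard library; the helper name `words_of_length` is an assumption of this example.

```python
from itertools import product


def words_of_length(alphabet, n):
    """All words of exactly n letters, one independent choice per position."""
    return ["".join(letters) for letters in product(alphabet, repeat=n)]


L = words_of_length("ab", 3)
print(L)
print(len(L))   # 2 choices in each of 3 positions gives 2**3 words
```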

Page 21: Recursive Definitions & Regular Expressions (RE).

Example

In general, if we want to refer to the set of all possible strings of a’s and b’s of any length whatsoever, we could write

language((a + b)*)

This is the set of all possible strings of letters from the alphabet Σ = {a, b}, including the null string.

This is powerful notation. For instance, we can describe all the words that begin with an a, followed by anything (i.e., as many choices as we want of either a or b) as

a(a + b)*

Page 22: Recursive Definitions & Regular Expressions (RE).

Regular Expressions

Given Σ = {a, b}:

a* = {Λ, a, aa, aaa, aaaa, aaaaa, …}
ab* = {a, ab, abb, abbb, abbbb, …}
a + b = {a, b}
(ab)* = {Λ, ab, abab, ababab, …}
(a + b)* = {Λ, any string of a’s and b’s}

Page 23: Recursive Definitions & Regular Expressions (RE).

Regular Expressions

The symbols that appear in regular expressions are the letters of the alphabet Σ, the symbol for Λ, parentheses, the star operator, and the plus sign.

Page 24: Recursive Definitions & Regular Expressions (RE).

Formal Definition of Regular Expressions

The set of regular expressions is defined by the following rules:

1. Every letter of Σ is a regular expression, and so is Λ.

2. If r1 and r2 are regular expressions, then so are

(r1)
r1r2
r1 + r2
r1*

3. Nothing else is a regular expression.

Page 25: Recursive Definitions & Regular Expressions (RE).

Regular Expressions

Are the following regular expressions? If so, what languages do they generate?

a(b + a)*
bb(a + b)
(a + b)(a + b)(a + b)
(a + b)*ba
(a + b)*a(a + b)*
(a + b)*aa(a + b)*

Page 26: Recursive Definitions & Regular Expressions (RE).

Regular Expressions

Write an RE for each of the following languages:

• All words ending with b
• All words that start with a
• All words that start with a double letter
• All words that contain at least one double letter
• All words that start and end with a double letter
• All words of length >= 3
• All words that contain exactly one a or exactly one b
• All words that don’t end in ba

Page 27: Recursive Definitions & Regular Expressions (RE).

Language (Set) Operations

If L1 and L2 are two languages (sets of words):

• L1L2 is the product set that contains all combinations of a string from L1 concatenated with a string from L2.

• L1 + L2 is the union set (equivalently L1 ∪ L2) containing all words of L1 and L2.

Examples

Page 28: Recursive Definitions & Regular Expressions (RE).

Product Set

If S and T are sets of strings of letters, we define the product set of strings of letters to be

ST = {all combinations of a string from S concatenated with a string from T, in that order}

Page 29: Recursive Definitions & Regular Expressions (RE).

Example

If S = {a, aa, aaa} and T = {bb, bbb}, then

ST = {abb, abbb, aabb, aabbb, aaabb, aaabbb}

Using regular expressions, we can write this example as

(a + aa + aaa)(bb + bbb) = abb + abbb + aabb + aabbb + aaabb + aaabbb
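For finite sets the product set can be computed with a single set comprehension. The following is a minimal sketch; `product_set` is a name chosen for this illustration.

```python
def product_set(S, T):
    """Every string of S followed by every string of T, in that order."""
    return {s + t for s in S for t in T}


S = {"a", "aa", "aaa"}
T = {"bb", "bbb"}
ST = product_set(S, T)
print(sorted(ST, key=lambda w: (len(w), w)))

# Different pairs can yield the same word, so a product set may have
# fewer than len(S) * len(T) members.
print(sorted(product_set({"a", "aa"}, {"a", "aa"})))
```

Because the result is a set, a word that arises from two different pairs, such as aaa in the second call, is counted only once.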

Page 30: Recursive Definitions & Regular Expressions (RE).

Example

If M = {Λ, x, xx} and N = {Λ, y, yy, yyy, yyyy, …}, then

MN = {Λ, y, yy, yyy, yyyy, …, x, xy, xyy, xyyy, xyyyy, …, xx, xxy, xxyy, xxyyy, xxyyyy, …}

Using regular expressions,

(Λ + x + xx)(y*) = y* + xy* + xxy*

Page 31: Recursive Definitions & Regular Expressions (RE).

Regular Languages

The languages defined by regular expressions are called regular languages.

Or alternatively: any language that can be represented by a regular expression is a regular language.

Page 32: Recursive Definitions & Regular Expressions (RE).

Languages Associated with Regular Expressions

Page 33: Recursive Definitions & Regular Expressions (RE).

Definition

The following rules define the language associated with any regular expression:

Rule 1: The language associated with the regular expression that is just a single letter is that one-letter word alone, and the language associated with Λ is just {Λ}, a one-word language.

Rule 2: If r1 is a regular expression associated with the language L1 and r2 is a regular expression associated with the language L2, then:

(i) The regular expression (r1)(r2) is associated with the product L1L2, that is, the language L1 times the language L2:

language(r1r2) = L1L2

Page 34: Recursive Definitions & Regular Expressions (RE).

Definition contd.

Rule 2 (cont.):

(ii) The regular expression r1 + r2 is associated with the language formed by the union of L1 and L2:

language(r1 + r2) = L1 + L2

(iii) The language associated with the regular expression (r1)* is L1*, the Kleene closure of the set L1 as a set of words:

language(r1*) = L1*
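Rules 1 and 2 amount to a recursive procedure for computing the language of an expression. Since such a language may be infinite, the sketch below lists only the words up to a chosen length. The tuple encoding of expressions (`"letter"`, `"null"`, `"cat"`, `"union"`, `"star"`) and the function name `language` are assumptions made for this illustration.

```python
def language(r, max_len):
    """Words of length <= max_len in the language associated with r."""
    kind = r[0]
    if kind == "letter":                       # Rule 1: a single letter
        return {r[1]} if max_len >= 1 else set()
    if kind == "null":                         # Rule 1: the null string
        return {""}
    if kind == "cat":                          # Rule 2 (i): product L1L2
        L1, L2 = language(r[1], max_len), language(r[2], max_len)
        return {u + v for u in L1 for v in L2 if len(u + v) <= max_len}
    if kind == "union":                        # Rule 2 (ii): union L1 + L2
        return language(r[1], max_len) | language(r[2], max_len)
    if kind == "star":                         # Rule 2 (iii): Kleene closure
        L1 = language(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:                        # keep appending words of L1
            frontier = {u + v for u in frontier for v in L1
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown expression kind: {kind!r}")


a, b = ("letter", "a"), ("letter", "b")
print(sorted(language(("cat", a, ("star", b)), 3)))      # ab*
print(sorted(language(("star", ("cat", a, b)), 4)))      # (ab)*
```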

Page 35: Recursive Definitions & Regular Expressions (RE).

Languages Associated with REs

• r1 = a, r1 = b, r1 = Λ
• If L1 is associated with r1 and L2 is associated with r2:
  Language(r1r2) = L1L2
  Language(r1 + r2) = L1 + L2 = L1 ∪ L2
  Language(r1*) = L1* (Kleene closure of L1)

Page 36: Recursive Definitions & Regular Expressions (RE).

Regular Languages

How to tell whether a language is regular: try to define an RE for it. If that is possible, the language is regular; otherwise it is non-regular.

Definition

The language generated by any regular expression is called a regular language.

It is to be noted that if r1 and r2 are regular expressions corresponding to the languages L1 and L2, then the languages generated by r1 + r2, r1r2 (or r2r1), and r1* (or r2*) are also regular languages.

Page 37: Recursive Definitions & Regular Expressions (RE).

Regular Languages

Example

Consider the language L, defined over Σ = {a, b}, of strings of length 2 starting with a. Then L = {aa, ab} may be expressed by the regular expression aa + ab. Hence L, by definition, is a regular language.

Page 38: Recursive Definitions & Regular Expressions (RE).

Regular Languages

All finite languages are regular.

For example, the language L = {aa, ab} of strings of length 2 over Σ = {a, b} starting with a is finite and is expressed by the regular expression aa + ab, so L, by definition, is a regular language.

Page 39: Recursive Definitions & Regular Expressions (RE).

Theorem

If L is a finite language (a language with only finitely many words), then L can be defined by a regular expression. In other words, all finite languages are regular.

Proof

Let L be a finite language. To make one regular expression that defines L, we turn all the words in L into boldface type and insert plus signs between them.

For example, the regular expression that defines the language L = {baa, abbba, bababa} is

baa + abbba + bababa

This algorithm only works for finite languages because an infinite language would become a regular expression that is infinitely long, which is forbidden.
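The construction in the proof is short enough to write out. The sketch below targets Python's `re` syntax, where the union written + in these slides is spelled `|`; the function name `finite_language_to_regex` is invented for this example, and the case of a language with no words at all is deliberately left out because the construction needs at least one word.

```python
import re


def finite_language_to_regex(words):
    """Join the words of a finite, non-empty language with the union operator."""
    if not words:
        raise ValueError("the construction needs at least one word")
    # sorted() only makes the output deterministic; re.escape protects
    # any characters that have a special meaning in Python patterns.
    return "|".join(re.escape(w) for w in sorted(words))


L = {"baa", "abbba", "bababa"}
pattern = finite_language_to_regex(L)
print(pattern)
print(all(re.fullmatch(pattern, w) for w in L))
print(re.fullmatch(pattern, "ba") is None)
```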

Page 40: Recursive Definitions & Regular Expressions (RE).

More REs

EVEN-EVEN (Σ = {a, b})

The language of all words having an even number of a’s and an even number of b’s.

Partitions/sets:

• Even a’s, even b’s (valid)
• Even a’s, odd b’s (need to adjust b’s)
• Odd a’s, odd b’s (need to adjust a’s and b’s)
• Odd a’s, even b’s (need to adjust a’s)

Page 41: Recursive Definitions & Regular Expressions (RE).

Regular Expressions

EVEN-EVEN (Σ = {a, b})

RE sets:

(aa + bb)*
((ab + ba)(ab + ba))*

(aa + bb + (ab + ba)(aa + bb)*(ab + ba))*

This expression represents all the words that are made up of:

type1 = aa
type2 = bb
type3 = (ab + ba)(aa + bb)*(ab + ba)
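One way to gain confidence in the EVEN-EVEN expression is to compare it, word by word, with a direct count of the letters. The sketch below does this for all words up to a chosen length using Python's `re` module (union is written `|` there). It is a finite check, not a proof, and the names used are assumptions of this example.

```python
import re
from itertools import product

# (aa + bb + (ab + ba)(aa + bb)*(ab + ba))* in Python's pattern syntax
EVEN_EVEN = re.compile(r"(aa|bb|(ab|ba)(aa|bb)*(ab|ba))*")


def has_even_counts(word):
    """Direct definition: an even number of a's and an even number of b's."""
    return word.count("a") % 2 == 0 and word.count("b") % 2 == 0


def agree_up_to(max_len):
    """True if the expression and the direct definition agree on every word."""
    for n in range(max_len + 1):
        for letters in product("ab", repeat=n):
            word = "".join(letters)
            if (EVEN_EVEN.fullmatch(word) is not None) != has_even_counts(word):
                return False
    return True


print(agree_up_to(8))
```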

Page 42: Recursive Definitions & Regular Expressions (RE).

Equivalent Regular Expressions

Definition

Two regular expressions are said to be equivalent if they generate the same language.

Example

Consider the following regular expressions:

r1 = (a + b)*(aa + bb)
r2 = (a + b)*aa + (a + b)*bb

Both regular expressions define the language of strings ending in aa or bb.

Page 43: Recursive Definitions & Regular Expressions (RE).

Equivalent Regular Expressions

Note

If r1 = (aa + bb) and r2 = (a + b), then

r1 + r2 = (aa + bb) + (a + b)
r1r2 = (aa + bb)(a + b) = (aaa + aab + bba + bbb)
(r1)* = (aa + bb)*

Page 44: Recursive Definitions & Regular Expressions (RE).

Example

Consider the language defined by the expression

(a + b)*a(a + b)*

At the beginning of any word in this language we have (a + b)*, which is any string of a’s and b’s, then comes an a, then another arbitrary string.

For example, the word abbaab can be considered to come from this expression by 3 different choices:

(Λ)a(bbaab) or (abb)a(ab) or (abba)a(b)
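Each occurrence of a in a word gives one way of reading the word as (a + b)*a(a + b)*, so the decompositions can be listed mechanically. A small sketch; the function name `decompositions` is chosen here, and Λ appears as the empty string.

```python
def decompositions(word):
    """Every way to split `word` as prefix + 'a' + suffix."""
    return [(word[:i], word[i + 1:]) for i, ch in enumerate(word) if ch == "a"]


for prefix, suffix in decompositions("abbaab"):
    print(f"({prefix or 'Λ'})a({suffix or 'Λ'})")
```

A word with no a at all yields an empty list, which matches the fact that such words are not in the language.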

Page 45: Recursive Definitions & Regular Expressions (RE).

Example contd.

This language is the set of all words over the alphabet Σ = {a, b} that have at least one a.

The only words left out are those that have only b’s and the word Λ.

These left-out words are exactly the language defined by the expression b*.

If we combine these two languages, we should produce the language of all strings over the alphabet Σ = {a, b}. That is,

(a + b)* = (a + b)*a(a + b)* + b*
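An identity such as this one can be sanity-checked by comparing both sides on every word up to some length. The helper below is a sketch using Python's `re` module, where union is written `|` and binds more loosely than concatenation; the names `words_up_to` and `same_language_up_to` are assumptions of this example. A passing check is evidence, not a proof, since only finitely many words are tried.

```python
import re
from itertools import product


def words_up_to(max_len, alphabet="ab"):
    """Yield every word over `alphabet` of length 0 .. max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def same_language_up_to(pattern_1, pattern_2, max_len):
    """True if both patterns accept exactly the same words up to max_len."""
    r1, r2 = re.compile(pattern_1), re.compile(pattern_2)
    return all((r1.fullmatch(w) is None) == (r2.fullmatch(w) is None)
               for w in words_up_to(max_len))


ALL_WORDS = r"(a|b)*"                       # (a + b)*
WITH_A_OR_ONLY_BS = r"(a|b)*a(a|b)*|b*"     # (a + b)*a(a + b)* + b*

print(same_language_up_to(ALL_WORDS, WITH_A_OR_ONLY_BS, 8))
print(same_language_up_to(ALL_WORDS, r"(a|b)*a(a|b)*", 8))   # b* left out
```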

Page 46: Recursive Definitions & Regular Expressions (RE).

Example

The language of all words that have at least two a’s can be defined by the expression:

(a + b)*a(a + b)*a(a + b)*

Another expression that defines all the words with at least two a’s is

b*ab*a(a + b)*

Hence, we can write

(a + b)*a(a + b)*a(a + b)* = b*ab*a(a + b)*

where by the equal sign we mean that these two expressions are equivalent in the sense that they describe the same language.

Page 47: Recursive Definitions & Regular Expressions (RE).

Example

The language of all words that have at least one a and at least one b is somewhat trickier. If we write

(a + b)*a(a + b)*b(a + b)*

then we are requiring that an a must precede a b in the word. Such words as ba and bbaaaa are not included in this language.

Since we know that either the a comes before the b or the b comes before the a, we can define the language by the expression

(a + b)*a(a + b)*b(a + b)* + (a + b)*b(a + b)*a(a + b)*

Note that the only words containing both letters that are omitted by the first term (a + b)*a(a + b)*b(a + b)* are the words of the form some b’s followed by some a’s. They are defined by the expression bb*aa*.
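The claim about the omitted words can be checked by enumeration. The following Python sketch assumes the same translation of `+` into `|`; the names `FIRST_TERM`, `EXCEPTIONS`, and `missed` are hypothetical. It collects the short words that contain both letters but are not matched by the first term, and tests whether each has the form bb*aa*.

```python
import re
from itertools import product

FIRST_TERM = re.compile("(a|b)*a(a|b)*b(a|b)*")  # some a comes before some b
EXCEPTIONS = re.compile("bb*aa*")                # some b's followed by some a's

def words(max_len, alphabet="ab"):
    """Yield every word over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

# Words that contain both letters but are missed by the first term.
missed = [
    w for w in words(6)
    if "a" in w and "b" in w and not FIRST_TERM.fullmatch(w)
]

print(missed[:5])
print(all(EXCEPTIONS.fullmatch(w) for w in missed))
```

The words ba and bbaaaa from the text both appear in `missed`, since each has all of its b’s before all of its a’s.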

Page 48: Recursive Definitions & Regular Expressions (RE).

Example

We can add these specific exceptions. So, the language of all words over the alphabet Σ = {a, b} that contain at least one a and at least one b is defined by the expression:

(a + b)*a(a + b)*b(a + b)* + bb*aa*

Thus, we have proved that

(a + b)*a(a + b)*b(a + b)* + (a + b)*b(a + b)*a(a + b)*
= (a + b)*a(a + b)*b(a + b)* + bb*aa*

Page 49: Recursive Definitions & Regular Expressions (RE).

Example

In the above example, the language of all words that contain both an a and a b is defined by the expression

(a + b)*a(a + b)*b(a + b)* + bb*aa*

The only words that do not contain both an a and a b are the words of all a’s, all b’s, or Λ.

When these are included, we get everything. Hence, the expression

(a + b)*a(a + b)*b(a + b)* + bb*aa* + a* + b*

defines all possible strings of a’s and b’s, including Λ (accounted for in both a* and b*).

Page 50: Recursive Definitions & Regular Expressions (RE).

Thus

(a + b)* = (a + b)*a(a + b)*b(a + b)* + bb*aa* + a* + b*
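As a rough check of this decomposition, the Python sketch below reports which of the four terms on the right-hand side match a given word. It assumes the slide's `+` is translated into `|` for the `re` module; the dictionary `TERMS`, its labels, and the function `matching_terms` are illustrative names. The terms are allowed to overlap, so a word may be matched by more than one of them.

```python
import re
from itertools import product

# The four terms of the right-hand side, in Python's re syntax.
TERMS = {
    "a before b": re.compile("(a|b)*a(a|b)*b(a|b)*"),
    "b's then a's": re.compile("bb*aa*"),
    "all a's": re.compile("a*"),
    "all b's": re.compile("b*"),
}

def matching_terms(word):
    """Return the labels of the terms whose language contains `word`."""
    return [name for name, pattern in TERMS.items() if pattern.fullmatch(word)]

# Every word up to length 7 should be covered by at least one term.
uncovered = [
    "".join(letters)
    for n in range(8)
    for letters in product("ab", repeat=n)
    if not matching_terms("".join(letters))
]
print(uncovered)           # an empty list means no word was missed
print(matching_terms(""))  # the empty word is matched by both a* and b*
```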

Page 51: Recursive Definitions & Regular Expressions (RE).

Example

The following equivalences show that we should not treat expressions as algebraic polynomials:

(a + b)* = (a + b)* + (a + b)*
(a + b)* = (a + b)* + a*
(a + b)* = (a + b)*(a + b)*
(a + b)* = a(a + b)* + b(a + b)* + Λ
(a + b)* = (a + b)*ab(a + b)* + b*a*

The last equivalence may need some explanation:

The first term in the right hand side, (a + b)*ab(a + b)*, describes all the words that contain the substring ab.

The second term, b*a*, describes all the words that do not contain the substring ab (i.e., all a’s, all b’s, Λ, or some b’s followed by some a’s).
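The explanation of the last equivalence says the two terms split the words into those that contain ab and those that do not. A minimal Python sketch of that check follows. The names `B_THEN_A`, `contains_ab`, and `exactly_one` are hypothetical, and the bound of length 8 is arbitrary. The sketch tests that, for each short word, exactly one of the two descriptions applies.

```python
import re
from itertools import product

B_THEN_A = re.compile("b*a*")  # the second term of the right-hand side

def contains_ab(word):
    """True when `word` is described by (a + b)*ab(a + b)*."""
    return "ab" in word

# For every word up to length 8, exactly one of the two terms should apply.
exactly_one = all(
    contains_ab(w) != bool(B_THEN_A.fullmatch(w))
    for n in range(9)
    for w in map("".join, product("ab", repeat=n))
)
print(exactly_one)
```

Together the two terms cover every word, and here they also happen not to overlap, which the other equivalences in the list do not require.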

Page 52: Recursive Definitions & Regular Expressions (RE).

Homework Practice…

Make a regular expression for the words that do not end in a double letter.

Make a regular expression for the words that do not contain both of the substrings bba and abb.

Make a regular expression where each word must contain an odd number of a’s and an odd number of b’s.

Make a regular expression where each word contains 3, 6, 9, 12, 15, 18, ... a’s.

Page 53: Recursive Definitions & Regular Expressions (RE).

Homework Practice…

Language of all those words that contain bbb.

Language of all those strings whose length is a multiple of 5.

Language of all those strings that contain at least two b’s.

Language of all those strings that contain a double letter and have even length.