
arXiv:1010.5318v3 [cs.FL] 30 Dec 2010

Minimization of automata

Jean Berstel 1, Luc Boasson 2, Olivier Carton 2, Isabelle Fagnot 1,∗

1 Laboratoire d'Informatique Gaspard-Monge, Université Paris-Est Marne-la-Vallée
5, boulevard Descartes, Champs-sur-Marne, F-77454 Marne-la-Vallée Cedex 2

2 LIAFA, Université Paris Diderot
Case 7014, F-75205 Paris Cedex 13

email: {berstel,fagnot}@univ-mlv.fr, {boasson,carton}@liafa.jussieu.fr

January 4, 2011

2010 Mathematics Subject Classification: 68Q45

Key words: Finite automata, minimization, Hopcroft’s algorithm.

Contents

1 Introduction
2 Definitions and notation
3 Brzozowski's algorithm
4 Moore's algorithm
   4.1 Description
   4.2 Average complexity
5 Hopcroft's algorithm
   5.1 Outline
   5.2 Behavior
   5.3 Complexity
   5.4 Miscellaneous remarks
   5.5 Moore versus Hopcroft
6 Slow automata
   6.1 Definition and equivalence
   6.2 Examples
7 Minimization by fusion
   7.1 Local automata
   7.2 Bottom-up minimization
   7.3 Revuz's algorithm
   7.4 The algorithm of Almeida and Zeitoun
   7.5 Incremental minimization: the algorithm of Daciuk et al.
8 Dynamic minimization
9 Extensions and special cases
   9.1 Special automata
   9.2 Nondeterministic automata
References

1 Introduction

This chapter is concerned with the design and analysis of algorithms for minimizing finite automata. Getting a minimal automaton is a fundamental issue in the use and implementation of finite automata tools in frameworks like text processing, image analysis, linguistic computer science, and many other applications.

There are two main families of minimization algorithms. The first proceeds by a sequence of refinements of a partition of the set of states, the second by a sequence of fusions or merges of states. Among the algorithms of the first family, we mention a simple algorithm described in the book [35]. It operates by a traversal of the product of the automaton with itself, and therefore has time and space complexity O(n²). Other algorithms are Hopcroft's and Moore's algorithms, which will be considered in depth later. The linear-time minimization of acyclic automata of Revuz belongs to the second family. Brzozowski's algorithm stands quite isolated and fits in neither of these two classes.

The algorithm for the minimization of complete deterministic finite-state automata given by Hopcroft [34] runs in worst-case time O(n log n). It is, up to now, the most efficient algorithm known in the general case. It has recently been extended to incomplete deterministic finite automata [50], [9].

Hopcroft's algorithm is related to Moore's partition refinement algorithm [44], although it is different. One of the purposes of this text is the comparison of the nature of Moore's and Hopcroft's algorithms. This gives some new insight into both algorithms. As we shall see, these algorithms are quite different both in behavior and in complexity. In particular, we show that it is not possible to simulate the computations of one algorithm by the other.

Moore's partition refinement algorithm is much simpler than Hopcroft's algorithm. It has been shown [7] that, although its worst-case behavior is quadratic, its average running time is O(n log n). No evaluation of the average is known for Hopcroft's algorithm.

The family of algorithms based on fusion of states is important in practice for the construction of minimal automata representing finite sets, such as dictionaries in natural language processing. A linear-time implementation of such an algorithm for cycle-free automata was given by Revuz [47]. This algorithm has been extended to a more general class of automata by Almeida and Zeitoun [3], namely to automata where all strongly connected components are simple cycles. It has been demonstrated in [8] that minimization by state fusion, which is not always possible, works well for local automata.


There is another efficient incremental algorithm for finite sets, by Daciuk et al. [27]. The advantage of this algorithm is that it does not build the intermediate trie, which is rather space consuming.

We also consider updating a minimal automaton when a word is added to or removed from the set it recognizes.

Finally, we discuss briefly the case of nondeterministic automata. It is well-known thatminimal nondeterministic automata are not unique. However, there are several subclasseswhere the minimal automaton is unique.

We do not consider here the problem of constructing a minimal automaton starting from another description of the regular language, such as the synthesis of an automaton from a regular expression. We also do not consider devices that may be more space efficient, such as alternating automata or two-way automata. Other cases not considered here concern sets of infinite words and the minimization of their accepting devices.

The chapter is organized as follows. The first section just fixes notation, the next describes briefly Brzozowski's algorithm. In Section 4, we give basic facts on Moore's minimization algorithm. Section 5 is a detailed description of Hopcroft's algorithm, with the proof of correctness and running time. It also contains the comparison of Moore's and Hopcroft's algorithms. The next section is devoted to so-called slow automata. Some material in these two sections is new.

Sections 7 and 8 are devoted to the family of algorithms working by fusion. We describe in particular Revuz's algorithm and its generalization by Almeida and Zeitoun, the incremental algorithm of Daciuk et al., and dynamic minimization. The last section contains miscellaneous results on special cases and a short discussion of nondeterministic minimal automata.

2 Definitions and notation

It appears to be useful, for a synthetic presentation of the minimization algorithms of Moore and of Hopcroft, to introduce some notation for partitions of the set of states. This section just fixes this notation.

Partitions and equivalence relations. A partition of a set E is a family P of nonempty, pairwise disjoint subsets of E such that E = ⋃_{P∈P} P. The index of the partition is the number of its elements. A partition defines an equivalence relation ≡P on E. Conversely, the set of all equivalence classes [x], for x ∈ E, of an equivalence relation on E defines a partition of E. This is the reason why all terms defined for partitions have the same meaning for equivalence relations and vice versa.

A subset F of E is saturated by P if it is the union of classes of P. Let Q be another partition of E. Then Q is a refinement of P, or P is coarser than Q, if each class of Q is contained in some class of P. If this holds, we write Q ≤ P. The index of Q is then larger than the index of P.

Given two partitions P and Q of a set E, we denote by U = P ∧ Q the coarsest partition which refines P and Q. The classes of U are the nonempty sets P ∩ Q, for P ∈ P and Q ∈ Q. The notation is extended to a set of partitions in the usual way: we write P = P1 ∧ · · · ∧ Pn for the common refinement of P1, . . . , Pn. If n = 0, then P is the universal partition of E composed of the single class E. This partition is the neutral element for the ∧-operation.
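In code, the common refinement of two partitions is simply the family of nonempty pairwise intersections of their classes. A minimal Python sketch (the function name `meet` is ours, not from the text):

```python
def meet(P, Q):
    """Coarsest common refinement P ∧ Q of two partitions of the same
    set, given as lists of sets: the nonempty intersections of classes."""
    return [p & q for p in P for q in Q if p & q]
```

For instance, `meet([{1,2},{3,4}], [{1,3},{2,4}])` yields the four singleton classes, the finest partition of {1, 2, 3, 4}.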

Let F be a subset of E. A partition P of E induces a partition P′ of F by intersection: P′ is composed of the nonempty sets P ∩ F, for P ∈ P. If P and Q are partitions of E and Q ≤ P, then the restrictions P′ and Q′ to F still satisfy Q′ ≤ P′.

If P and P′ are partitions of disjoint sets E and E′, we denote by P ∨ P′ the partition of E ∪ E′ whose restrictions to E and E′ are P and P′ respectively. So, one may write

P = ∨_{P∈P} {P} ,

where {P} denotes the partition of the set P into a single class.

Minimal automaton. We consider a deterministic automaton A = (Q, i, F) over the alphabet A with set of states Q, initial state i, and set of final states F. To each state q corresponds a subautomaton of A obtained when q is chosen as the initial state. We call it the subautomaton rooted at q, or simply the automaton at q. Usually, we consider only the trim part of this automaton. To each state q corresponds a language Lq(A) which is the set of words recognized by the subautomaton rooted at q, that is

Lq(A) = { w ∈ A∗ | q · w ∈ F } .

This language is called the future of the state q, or also the right language of this state. Similarly one defines the past of q, also called the left language, as the set { w ∈ A∗ | i · w = q }. The automaton A is minimal if Lp(A) ≠ Lq(A) for each pair of distinct states p, q. The equivalence relation ≡ defined by

p ≡ q if and only if Lp(A) = Lq(A)

is a congruence, that is, p ≡ q implies p · a ≡ q · a for all letters a. It is called the Nerode congruence. Note that the Nerode congruence saturates the set of final states. Thus an automaton is minimal if and only if its Nerode equivalence is the identity.

Minimizing an automaton is the problem of computing the Nerode equivalence. Indeed, the quotient automaton A/≡, obtained by taking for set of states the set of equivalence classes of the Nerode equivalence, for initial state the class of the initial state i, for set of final states the set of equivalence classes of states in F, and by defining the transition function by [p] · a = [p · a], accepts the same language, and its Nerode equivalence is the identity. The minimal automaton recognizing a given language is unique.
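As an illustration, the quotient construction can be sketched in Python, assuming a congruence given as a map from states to class names (the function name and representation are ours):

```python
def quotient(states, alphabet, delta, init, finals, class_of):
    """Quotient automaton A/≡: states are the classes of the congruence
    class_of, with transitions [p]·a = [p·a].  This is well defined
    because equivalent states have equivalent successors."""
    qdelta = {class_of[p]: {a: class_of[delta[p][a]] for a in alphabet}
              for p in states}
    qfinals = {class_of[q] for q in finals}
    return qdelta, class_of[init], qfinals
```

Applied with the Nerode congruence, this yields the minimal automaton; applied with any coarser-than-identity congruence saturating F, it yields an equivalent smaller automaton.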

Partitions and automata. Again, we fix a deterministic automaton A = (Q, i, F) over the alphabet A. It is convenient to use the shorthand Pᶜ for Q \ P when P is a subset of the set Q.

Given a set P ⊂ Q of states and a letter a, we denote by a⁻¹P the set of states q such that q · a ∈ P. Given sets P, R ⊂ Q and a ∈ A, we denote by

(P, a)|R

the partition of R composed of the nonempty sets among the two sets

R ∩ a⁻¹P = { q ∈ R | q · a ∈ P }   and   R \ a⁻¹P = { q ∈ R | q · a ∉ P } .

Note that R \ a⁻¹P = R ∩ (a⁻¹P)ᶜ = R ∩ a⁻¹(Pᶜ), so the definition is symmetric in P and Pᶜ. In particular

(P, a)|R = (Pᶜ, a)|R .   (2.1)

The pair (P, a) is called a splitter. Observe that (P, a)|R = {R} if either R · a ⊂ P or R · a ∩ P = ∅, and (P, a)|R is composed of two classes if both R · a ∩ P ≠ ∅ and R · a ∩ Pᶜ ≠ ∅, or equivalently if R · a ⊄ Pᶜ and R · a ⊄ P. If (P, a)|R contains two classes, then we say that (P, a) splits R. Note that the pair S = (P, a) is called a splitter even if it does not split.

It is useful to extend the notation above to words. Given a word w and sets P, R ⊂ Q of states, we denote by w⁻¹P the set of states q such that q · w ∈ P, and by (P, w)|R the partition of R composed of the nonempty sets among

R ∩ w⁻¹P = { q ∈ R | q · w ∈ P }   and   R \ w⁻¹P = { q ∈ R | q · w ∉ P } .

As an example, the partition (F, w)|Q is the partition of Q into the set of those states from which w is accepted, and the other ones. A state q in one of the sets and a state q′ in the other are sometimes called separated by w.
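The splitter operation translates directly into code. A Python sketch for a complete deterministic transition table (function names are ours):

```python
def preimage(delta, a, P):
    """The set a⁻¹P of states q with q·a in P."""
    return {q for q in delta if delta[q][a] in P}

def split(delta, P, a, R):
    """The partition (P, a)|R: the nonempty sets among
    R ∩ a⁻¹P and R \\ a⁻¹P."""
    inv = preimage(delta, a, P)
    return [part for part in (R & inv, R - inv) if part]
```

The splitter (P, a) splits R exactly when the returned list has two elements.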

The Nerode equivalence is the coarsest equivalence relation on the set of states that is a (right) congruence saturating F. With the notation of splitters, this can be rephrased as follows.

Proposition 2.1. The partition corresponding to the Nerode equivalence is the coarsest partition P such that no splitter (P, a), with P ∈ P and a ∈ A, splits a class of P, that is, such that (P, a)|R = {R} for all P, R ∈ P and a ∈ A.

We use later the following lemma, which is already given in Hopcroft's paper [34]. It is the basic observation that ensures that Hopcroft's algorithm works correctly.

Lemma 2.2. Let P be a set of states, and let P = {P1, P2} be a partition of P. For any letter a and for any set of states R, one has

(P, a)|R ∧ (P1, a)|R = (P, a)|R ∧ (P2, a)|R = (P1, a)|R ∧ (P2, a)|R ,

and consequently

(P, a)|R ≥ (P1, a)|R ∧ (P2, a)|R ,   (2.2)
(P1, a)|R ≥ (P, a)|R ∧ (P2, a)|R .   (2.3)

3 Brzozowski’s algorithm

The minimization algorithm given by Brzozowski [18] is quite different from the two families of iterative algorithms (by refinement and by fusion) that we consider in this chapter. Although its worst-case behavior is exponential, it is conceptually simple, easy


Figure 1. The automaton on the left recognizes the language A∗aAⁿ. It has n + 1 states, and the minimal deterministic automaton for this language has 2ⁿ states. The automaton on the right is its reversal. It is minimal and recognizes AⁿaA∗.

to implement, and it is quite efficient in many cases. Moreover, it does not require the automaton to be deterministic, contrary to the algorithms described later.

Given an automaton A = (Q, I, F, E) over an alphabet A, its reversal is the automaton denoted Aᴿ obtained by exchanging the initial and the final states, and by inverting the orientation of the edges. Formally, Aᴿ = (Q, F, I, Eᴿ), where Eᴿ = { (p, a, q) | (q, a, p) ∈ E }. The basis for Brzozowski's algorithm is the following result.

Proposition 3.1. Let A be a finite deterministic automaton, and let A∼ be the deterministic trim automaton obtained by determinizing and trimming the reversal Aᴿ. Then A∼ is minimal.

For a proof of this proposition, see for instance Sakarovitch's book [48]. The minimization algorithm now is just a double application of the operation. Observe that the automaton one starts with need not be deterministic.

Corollary 3.2. Let A be a finite automaton. Then (A∼)∼ is the minimal automaton of A.

Example 3.1. We consider the automata given in Figure 1 over the alphabet A = {a, b}. Each automaton is the reversal of the other. However, determinization of the automaton on the left requires exponential time and space.
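Corollary 3.2 can be sketched in a few lines of Python, assuming a complete automaton given as nested dictionaries (all names and the representation are ours; trimming is implicit, since the subset construction only builds accessible states):

```python
def reverse(delta, init, finals):
    """Reversal: invert edges, exchange initial and final states.
    Returns an NFA as (edges, initial set, final set), where edges
    maps (state, letter) to the set of reversed successors."""
    edges = {}
    for p, row in delta.items():
        for a, q in row.items():
            edges.setdefault((q, a), set()).add(p)
    return edges, set(finals), set(init)

def determinize(alphabet, edges, init, finals):
    """Accessible subset construction on an NFA; states are frozensets."""
    start = frozenset(init)
    delta, todo = {}, [start]
    while todo:
        S = todo.pop()
        delta[S] = {}
        for a in alphabet:
            T = frozenset(q for s in S for q in edges.get((s, a), ()))
            delta[S][a] = T
            if T not in delta and T not in todo:
                todo.append(T)
    return delta, start, {S for S in delta if S & finals}

def brzozowski(alphabet, delta, init, finals):
    """Corollary 3.2 as code: reverse-then-determinize, applied twice."""
    for _ in range(2):
        edges, i, f = reverse(delta, init, finals)
        delta, start, finals = determinize(alphabet, edges, i, f)
        init = {start}
    return delta, start, finals
```

For example, a three-state automaton for the words ending with a (with two equivalent states) is reduced to the two-state minimal automaton.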

4 Moore’s algorithm

The minimization algorithm given by Moore [44] computes the Nerode equivalence by a stepwise refinement of some initial equivalence. All automata are assumed to be deterministic.

4.1 Description

Let A = (Q, i, F) be an automaton over an alphabet A. Define, for q ∈ Q and h ≥ 0, the set

Lq^(h)(A) = { w ∈ A∗ | |w| ≤ h, q · w ∈ F } .

The Moore equivalence of order h is the equivalence ≡h on Q defined by

p ≡h q ⟺ Lp^(h)(A) = Lq^(h)(A) .


MOORE(A)
    ⊲ The initial partition
    P ← {F, Fᶜ}
    repeat
        ⊲ P′ is the old partition, P is the new one
        P′ ← P
        for all a ∈ A do
            Pa ← ∧_{P∈P} (P, a)|Q
        P ← P ∧ ∧_{a∈A} Pa
    until P = P′

Figure 2. Moore’s minimization algorithm.

Using the notation of partitions introduced above, one can rephrase the definitions of the Nerode equivalence and of the Moore equivalence of order h. These are the equivalences defined by

∧_{w∈A∗} (F, w)|Q   and   ∧_{w∈A∗, |w|≤h} (F, w)|Q .

Since the set of states is finite, there is a smallest h such that the Moore equivalence ≡h equals the Nerode equivalence ≡. We call this integer the depth of Moore's algorithm on the finite automaton A, or the depth of A for short. The depth depends in fact only on the language recognized by the automaton, and not on the particular automaton under consideration. Indeed, each state of an automaton recognizing a language L represents in fact a left quotient u⁻¹L for some word u.

The depth is the smallest h such that ≡h equals ≡h+1. This leads to the refinement algorithm that computes successively ≡0, ≡1, . . . , ≡h, . . . , halting as soon as two consecutive equivalences are equal. The next property gives a method to compute the Moore equivalences efficiently.

Proposition 4.1. For two states p, q ∈ Q, and h ≥ 0, one has

p ≡h+1 q ⟺ p ≡h q and p · a ≡h q · a for all a ∈ A .   (4.1)

We use this proposition in a slightly different formulation. Denote by Mh the partition corresponding to the Moore equivalence ≡h. Then the following equations hold.

Proposition 4.2. For h ≥ 0, one has

Mh+1 = Mh ∧ ∧_{a∈A} ∧_{P∈Mh} (P, a)|Q = ∨_{R∈Mh} ( ∧_{a∈A} ∧_{P∈Mh} (P, a)|R ) .

The computation is described in Figure 2. It is realized by a loop that refines the current partition. The computation of the common refinement of k partitions of a set with n elements can be done in time O(kn²) by brute force. A radix sort improves the running time to O(kn). With k = Card(A), each iteration of the loop is realized in time O(kn), so the total time is O(ℓkn), where ℓ is the number of refinement steps in the computation of the Nerode equivalence ≡, that is, the depth of the automaton.
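The loop of Figure 2 can be rendered directly in Python, using successor-class signatures in place of explicit partition objects (this non-optimized representation is ours):

```python
def moore(states, alphabet, delta, finals):
    """Moore's refinement: start from {F, F^c}; refine by successor
    classes (Proposition 4.1) until the partition is stable.
    Returns a map state -> class number (the Nerode partition)."""
    class_of = {q: int(q in finals) for q in states}
    while True:
        # the new class of q is determined by its old class and the
        # old classes of its successors q·a, for a ∈ A
        sig = {q: (class_of[q],)
                  + tuple(class_of[delta[q][a]] for a in alphabet)
               for q in states}
        names = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {q: names[sig[q]] for q in states}
        # same number of classes means the refinement step changed nothing
        if len(names) == len(set(class_of.values())):
            return new
        class_of = new
```

The number of iterations is the depth ℓ discussed above; each iteration costs O(kn) up to the sorting of signatures.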


The worst-case behavior is obtained for ℓ = n − 2. We say that automata having maximal depth are slow, and more precisely slow for Moore. These automata are investigated later. We will show that they are equivalent to automata we call slow for Hopcroft.

4.2 Average complexity

The average-case behavior of Moore's algorithm has recently been studied in several papers. We report here some results given in [7, 28]. The authors make a detailed analysis of the distribution of the number ℓ of refinement steps in Moore's algorithm, that is, of the depth of automata, and they prove that there are only a few automata for which this depth is larger than log n.

More precisely, we fix some alphabet and consider deterministic automata over this alphabet.

A semi-automaton K is an automaton whose set of final states is not specified. Thus, an automaton is a pair (K, F), where K is a semi-automaton and F is the set of final states. The following theorem gives an upper bound on the average complexity of Moore's algorithm over all automata derived from a given semi-automaton.

Theorem 4.3 (Bassino, David, Nicaud [7]). Let K be a semi-automaton with n states. The average complexity of Moore's algorithm for the automata (K, F), for the uniform probability distribution over the sets F of final states, is O(n log n).

The result also holds for Bernoulli distributions for final states. The result remainsvalid for subfamilies of automata such as strongly connected automata or group automata.

When all semi-automata are considered to be equally likely, the following bound is valid.

Theorem 4.4 (David [28]). The average complexity of Moore's algorithm, for the uniform probability over all complete automata with n states, is O(n log log n).

This result is remarkable in view of the lower bound which is given in the followingstatement [6].

Theorem 4.5. If the underlying alphabet has at least two letters, then Moore's algorithm, applied to a minimal automaton with n states, requires Ω(n log log n) operations.

5 Hopcroft’s algorithm

Hopcroft [34] has given an algorithm that computes the minimal automaton of a given deterministic automaton. The running time of the algorithm is O(kn log n), where k is the cardinality of the alphabet and n is the number of states of the given automaton. The algorithm has been described and re-described several times [32, 1, 10, 16, 40].


HOPCROFT(A)
 1: P ← {F, Fᶜ}                          ⊲ the initial partition
 2: W ← ∅                                ⊲ the waiting set
 3: for all a ∈ A do
 4:     ADD((min(F, Fᶜ), a), W)          ⊲ initialization of the waiting set
 5: while W ≠ ∅ do
 6:     (W, a) ← TAKESOME(W)             ⊲ take and remove some splitter
 7:     for each P ∈ P which is split by (W, a) do
 8:         P′, P″ ← (W, a)|P            ⊲ compute the split
 9:         REPLACE P by P′ and P″ in P  ⊲ refine the partition
10:         for all b ∈ A do             ⊲ update the waiting set
11:             if (P, b) ∈ W then
12:                 REPLACE (P, b) by (P′, b) and (P″, b) in W
13:             else
14:                 ADD((min(P′, P″), b), W)

Figure 3. Hopcroft’s minimization algorithm.

5.1 Outline

The algorithm is outlined in the function HOPCROFT given in Figure 3. We denote by min(P, P′) the smaller of the two sets P and P′, and any one of them if they have the same size.

Given a deterministic automaton A, Hopcroft's algorithm computes the coarsest congruence which saturates the set F of final states. It starts from the partition {F, Fᶜ}, which obviously saturates F, and refines it until it gets a congruence. These refinements of the partition are always obtained by splitting some class into two classes.

The algorithm proceeds as follows. It maintains a current partition P = {P1, . . . , Pn} and a current set W of splitters, that is, of pairs (W, a) that remain to be processed, where W is a class of P and a is a letter. The set W is called the waiting set. The algorithm stops when the waiting set W becomes empty. When it stops, the partition P is the coarsest congruence that saturates F. The starting partition is the partition {F, Fᶜ}, and the starting set W contains all pairs (min(F, Fᶜ), a) for a ∈ A.

The main loop of the algorithm removes one splitter (W, a) from the waiting set W and performs the following actions. Each class P of the current partition (including the class W) is checked as to whether it is split by the pair (W, a). If (W, a) does not split P, then nothing is done. On the other hand, if (W, a) splits P into, say, P′ and P″, the class P is replaced in the partition P by P′ and P″. Next, for each letter b, if the pair (P, b) is in W, it is replaced in W by the two pairs (P′, b) and (P″, b); otherwise only the pair (min(P′, P″), b) is added to W.

It should be noted that the algorithm is not really deterministic, because it has not been specified which pair (W, a) is taken from W to be processed at each iteration of the main loop. This means that for a given automaton, there are many executions of the algorithm. It turns out that all of them produce the right partition of the states. However, different executions may give rise to different sequences of splittings and also to different


running times. Hopcroft has proved that the running time of any execution is bounded by O(|A| n log n).
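A straightforward Python rendering of Figure 3 follows (the representation is ours, and it omits the data structures needed for the O(kn log n) bound; the waiting set is processed here in FIFO order, one of the many admissible strategies):

```python
from collections import deque

def hopcroft(states, alphabet, delta, finals):
    """Hopcroft's refinement loop: maintain a partition (a set of
    frozensets) and a waiting set of splitters (class, letter)."""
    finals = frozenset(finals)
    nonfinals = frozenset(states) - finals
    partition = {p for p in (finals, nonfinals) if p}
    smaller = min(finals, nonfinals, key=len)
    waiting = deque((smaller, a) for a in alphabet)
    while waiting:
        W, a = waiting.popleft()                 # take some splitter
        pre = {q for q in states if delta[q][a] in W}   # a⁻¹W
        for P in [p for p in partition if p & pre and p - pre]:
            P1, P2 = frozenset(P & pre), frozenset(P - pre)
            partition.remove(P)                  # refine the partition
            partition.update((P1, P2))
            for b in alphabet:                   # update the waiting set
                if (P, b) in waiting:
                    waiting.remove((P, b))
                    waiting.extend([(P1, b), (P2, b)])
                else:
                    waiting.append((min(P1, P2, key=len), b))
    return partition
```

The returned partition is the Nerode partition (Corollary 5.2 below), whatever order the splitters are taken in.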

5.2 Behavior

The pair (P, W) composed of the current partition and the current waiting set in some execution of Hopcroft's algorithm is called a configuration. The following proposition describes the evolution of the current partition in Hopcroft's algorithm. Formula (5.1) is the key inequality for the proofs of correctness and termination. We will use it in the special case where the set R is a class of the current partition.

Proposition 5.1. Let (P, W) be a configuration in some execution of Hopcroft's algorithm on an automaton A over A. For any P ∈ P, any subset R of a class of P, and a ∈ A, one has

(P, a)|R ≥ ∧_{(W,a)∈W} (W, a)|R ,   (5.1)

that is, the partition (P, a)|R is coarser than the partition ∧_{(W,a)∈W} (W, a)|R.

Proof. The proof is by induction on the steps of an execution. The initial configuration (P, W) is composed of the initial partition P = {F, Fᶜ}, and the initial waiting set W is either W = { (F, a) | a ∈ A } or W = { (Fᶜ, a) | a ∈ A }. Since the partitions (F, a)|R and (Fᶜ, a)|R are equal for any set R, the proposition is true.

Assume now that (P, W) is not the initial configuration. Let (P̄, W̄) be a configuration that immediately precedes (P, W). Thus (P, W) is obtained from (P̄, W̄) in one step of Hopcroft's algorithm, by choosing one splitter S in W̄, and by performing the required operations on P̄ and W̄.

First we observe that, by the algorithm and by Lemma 2.2, one has, for any set of states R,

∧_{(W̄,a)∈W̄\{S}} (W̄, a)|R ≥ ∧_{(W,a)∈W} (W, a)|R .   (5.2)

Indeed, the set W contains all of W̄ \ {S} with the exception of those pairs (P, a) for which P is split into two parts, and in this case the relation follows from (2.2).

Next, consider a subset R of a class of P. Since R was not split by S, that is, since S|R = {R}, we have

∧_{(W̄,a)∈W̄} (W̄, a)|R = ∧_{(W̄,a)∈W̄\{S}} (W̄, a)|R .   (5.3)

Moreover, by the induction hypothesis, and since R is also a subset of a class of P̄, we have, for any P̄ ∈ P̄,

(P̄, a)|R ≥ ∧_{(W̄,a)∈W̄} (W̄, a)|R .   (5.4)


Consequently, for any subset R of a class of P and any P̄ ∈ P̄, in view of (5.2), (5.3) and (5.4), we have

(P̄, a)|R ≥ ∧_{(W,a)∈W} (W, a)|R .   (5.5)

Let now P ∈ P and a ∈ A, and let again R be a subset of a class of P. We consider the partition (P, a)|R. We distinguish two cases.

Case 1. Assume P ∈ P̄. The claim follows directly from (5.5).

Case 2. Assume P ∉ P̄. Then there exists P̄ ∈ P̄ such that S|P̄ = {P, P′}.

If (P̄, a) ∈ W̄ \ {S}, then both (P, a) and (P′, a) are in W. Consequently

(P, a)|R ≥ ∧_{(W,a)∈W} (W, a)|R .

If, on the contrary, (P̄, a) ∉ W̄ \ {S}, then by the algorithm one of (P, a) or (P′, a) is in W.

Case 2a. Assume that (P, a) ∈ W. Then obviously

(P, a)|R ≥ ∧_{(W,a)∈W} (W, a)|R .

Case 2b. Assume (P′, a) ∈ W. By Lemma 2.2, we have

(P, a)|R ≥ (P′, a)|R ∧ (P̄, a)|R .

Since (P′, a) ∈ W, we obviously have

(P′, a)|R ≥ ∧_{(W,a)∈W} (W, a)|R ,

and by use of (5.5), we obtain

(P, a)|R ≥ ∧_{(W,a)∈W} (W, a)|R .

This completes the proof.

Corollary 5.2. The current partition at the end of an execution of Hopcroft's algorithm on an automaton A is the Nerode partition of A.

Proof. Let P be the partition obtained at the end of an execution of Hopcroft's algorithm on an automaton A. By Proposition 2.1, it suffices to check that no splitter splits a class of P. Since the waiting set W is empty, the right-hand side of (5.1) evaluates to {R} for each triple (P, a, R). This means that (P, a) indeed does not split R.

5.3 Complexity

Proposition 5.3. Hopcroft's algorithm can be implemented to have worst-case time complexity O(kn log n) for an automaton with n states over a k-letter alphabet.

12 J. Berstel, L. Boasson, O. Carton, I. Fagnot

To achieve the claimed bound, a partition P of a set Q should be implemented in a way that allows the following operations:

• accessing the class to which a state belongs, in constant time;
• enumerating the elements of a class, in time proportional to its size;
• adding and removing an element in a class, in constant time.

The computation of all splittings P′, P″ of classes P by a given splitter (W, a) is done in time O(Card(a⁻¹W)) as follows.

(1) One enumerates the states q in a⁻¹W. For each state q, the class P of q is marked as a candidate for splitting, the state q is added to a list of states to be removed from P, and a counter for the number of states in the list is incremented.

(2) Each class that is marked is a candidate for splitting. It is split if the number of states to be removed differs from the size of the class. If this holds, the states in the list of P are removed to build a new class. The other states remain in P.
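Steps (1) and (2) can be sketched in Python as follows (the representation by two dictionaries, and the integer class ids, are ours):

```python
def split_touched(preimage, class_of, members):
    """Split every class met by `preimage` (the set a⁻¹W), in time
    proportional to len(preimage).  class_of maps state -> class id;
    members maps class id -> set of states; both are updated in place.
    Returns the (old id, new id) pairs of the classes actually split."""
    touched = {}
    for q in preimage:                      # (1) mark candidate classes
        touched.setdefault(class_of[q], []).append(q)
    result = []
    for c, hit in touched.items():          # (2) split proper subsets
        if len(hit) < len(members[c]):
            new = max(members) + 1          # fresh integer class id
            members[new] = set(hit)
            members[c] -= members[new]
            for q in hit:
                class_of[q] = new
            result.append((c, new))
    return result
```

A class fully contained in the preimage is marked but not split, exactly as in step (2).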

The waiting set W is implemented such that membership can be tested in constant time, and splitters can be added and removed in constant time. This allows the replacement of a splitter (P, b) by the two splitters (P′, b) and (P″, b) in constant time, since in fact P′ is just the modified class P, and it suffices to add the splitter (P″, b).

Several implementations of partitions that satisfy the time requirements exist. Hopcroft [34] describes such a data structure, reported in [14]. Knuutila [40] gives a different implementation.

Proof of Proposition 5.3. For a given state q, a splitter (W, a) such that q ∈ W is called a q-splitter.

Consider some q-splitter. When it is removed from the waiting set W, it may be smaller than when it was added, because it may have been split during its stay in the waiting set. On the contrary, when a q-splitter is added to the waiting set, its size is at most one half of the size it had when it was previously removed. Thus, for a fixed state q, the number of q-splitters (W, a) which are removed from W is at most k log n, since at each removal, the number of states in W is at most one half of what it was at the previous addition.

The total number of elements of the sets a⁻¹W, where (W, a) is in W, is O(kn log n). Indeed, for a fixed state q ∈ Q, a state p such that p · a = q is exactly in those sets a⁻¹W where (W, a) is a q-splitter in W. There are at most O(log n) such sets for each letter a, so at most k log n sets for each fixed q. Since there are n states, the claim follows. This completes the proof, since the running time is bounded by the total size of the sets a⁻¹W, where (W, a) is in W.

5.4 Miscellaneous remarks

There are some degrees of freedom in Hopcroft's algorithm. In particular, the way the waiting set is represented may influence the efficiency of the algorithm. This issue has been considered in [40]. In [5] some practical experiments are reported. In [45] it is shown that the worst case reported in [13] for de Bruijn words remains of this complexity when the waiting set is implemented as a queue (FIFO), whereas this complexity is never reached with an implementation as a stack (LIFO). See this paper for other discussions, in particular in relation with cover automata.

Minimization of automata 13

q      0  1  2  3  4  5  6  7  8  9
q·a    2  1  3  3  5  7  8  1  7  6
q·b    9  1  5  7  8  7  8  1  7  4

Figure 4. Next-state function of the example automaton.

Hopcroft's algorithm, as reported here, requires the automaton to be complete. This may be a serious drawback in the case where the automaton has only a few transitions. For instance, if a dictionary is represented by a trie, then the average number of edges per state is about 1.5 for the French dictionary (personal communication of Dominique Revuz), see Table 2 below. Recently, two generalizations of Hopcroft's algorithm to incomplete automata were presented, in [9] and [50], with running time O(m log n), where n is the number of states and m is the number of transitions. Since m ≤ kn, where k is the size of the alphabet, the algorithm achieves the same time complexity.

5.5 Moore versus Hopcroft

We present an example which illustrates the fact that Hopcroft's algorithm is not just a refinement of Moore's algorithm. This is proved by checking that one of the partitions computed by Moore's algorithm in the example does not appear as a partition in any of the executions of Hopcroft's algorithm on this automaton.

The automaton we consider is over the alphabet A = {a, b}. Its set of states is Q = {0, 1, 2, . . . , 9}, and the set of final states is F = {7, 8, 9}. The next-state function and the graph are given in Figure 4.

The Moore partitions are easily computed. The partition M1 of order 1 is composed of the five classes

{0, 3, 4}, {1, 2}, {5, 6}, {7, 9}, {8} .
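This computation is easy to check mechanically: the level-one classes distinguish states by their own finality and by the finality of their a- and b-successors. A short Python script (the encoding of the next-state table of Figure 4 is ours):

```python
# Next-state table of the example automaton (Figure 4)
a = [2, 1, 3, 3, 5, 7, 8, 1, 7, 6]
b = [9, 1, 5, 7, 8, 7, 8, 1, 7, 4]
F = {7, 8, 9}

# level-1 signature of a state: is it final, are its successors final?
sig = {q: (q in F, a[q] in F, b[q] in F) for q in range(10)}
classes = {}
for q, s in sig.items():
    classes.setdefault(s, []).append(q)
print(sorted(classes.values()))
# → [[0, 3, 4], [1, 2], [5, 6], [7, 9], [8]]
```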

The Moore partition of order 2 is the identity.

The initial partition for Hopcroft's algorithm is {F, Fᶜ}, where F = 789 (we will represent a set of states by the sequence of its elements). The initial waiting set is composed of (789, a) and (789, b). There are two cases, according to the choice of the first splitter.

Case 1. The first splitter is (789, a). Since a⁻¹789 = 568, each of the classes F and F^c is split. The new partition is 01234|56|79|8. The new waiting set is

(79, b), (8, b), (8, a), (56, a), (56, b) .


P      56   8    79    12
a⁻¹P   49   6    58    017
b⁻¹P   2    46   0358  17

Table 1. The sets c⁻¹P, with (P, c) in a waiting set.

The first three columns in Table 1 contain the sets c⁻¹P, for (P, c) in this waiting set. By inspection, one sees that each entry in these columns cuts off at least one singleton class which is not in the Moore equivalence M1. This implies that M1 cannot be obtained by Hopcroft’s algorithm in this case.

Case 2. The first splitter is (789, b). Since b⁻¹789 = 034568, the new partition is 12|03456|79|8. The new waiting set is

(79, a), (8, a), (8, b), (12, a), (12, b) .

Again, each entry in the last three columns of Table 1 cuts off at least one singleton class which is not in the Moore equivalence M1. This implies that, also in this case, M1 cannot be obtained by Hopcroft’s algorithm.
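The Moore partitions used in this argument can be checked mechanically. The sketch below (names are ours) is Moore’s refinement by successor signatures in its naive quadratic form, recording each level of the partition:

```python
def moore_partitions(states, alphabet, delta, finals):
    """Moore partitions M0, M1, ... of a complete DFA, by iterated
    signature refinement; stops when the number of classes is stable."""
    def classes_of(block):
        by = {}
        for q in states:
            by.setdefault(block[q], set()).add(q)
        return sorted(sorted(c) for c in by.values())

    block = {q: int(q in finals) for q in states}        # encodes M0
    history = [classes_of(block)]
    while True:
        # Signature: own class plus the classes of all successors.
        sig = {q: (block[q],) + tuple(block[delta[(q, a)]] for a in alphabet)
               for q in states}
        ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new_block = {q: ids[sig[q]] for q in states}
        if len(set(new_block.values())) == len(set(block.values())):
            return history                               # partition is stable
        block = new_block
        history.append(classes_of(block))
```

Run on the example automaton of Figure 4, it reproduces M0, M1 and the identity partition M2.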

Despite the difference illustrated by this example, there are similarities between Moore’s and Hopcroft’s algorithms that have been exploited by Julien David in his thesis [29] to give an upper bound on the average running time of Hopcroft’s algorithm for a particular strategy.

In this strategy, there are two waiting sets, the current set W and a future waiting set F. Initially, F is empty. Hopcroft’s algorithm works as usual, except for line 14: here, the splitter (min(P′, P′′), b) is added to F and not to W. When W is empty, the contents of F and W are swapped. The algorithm stops when both sets W and F are empty.

Proposition 5.4 (David [29]). There is a strategy for Hopcroft’s algorithm such that its average complexity, for the uniform probability over all complete automata with n states, is O(n log log n).

Julien David shows that at the end of each cycle, that is, when W becomes empty, the current partition P of the set of states is in fact a refinement of the corresponding level in Moore’s algorithm. This shows that the number of cycles in Hopcroft’s algorithm, for this strategy, is bounded by the depth of the automaton. Thus Theorem 4.4 applies.

6 Slow automata

We are concerned in this section with automata that behave badly for Hopcroft’s and Moore’s minimization algorithms. In other words, we look for automata for which Moore’s algorithm requires the maximal number of steps, and similarly for Hopcroft’s algorithm.


6.1 Definition and equivalence

Recall that an automaton with n states is called slow for Moore if the number ℓ of steps in Moore’s algorithm is n − 2. A slow automaton is minimal. It is equivalent to say that each Moore equivalence ≡h has exactly h + 2 equivalence classes for h ≤ n − 2. This is due to the fact that, at each step, just one class of ≡h is split, and that this class is split into exactly two classes of the equivalence ≡h+1.

Proposition 4.2 takes the following special form for slow automata.

Proposition 6.1. Let A be an automaton with n states which is slow for Moore. For all h with n − 2 ≥ h ≥ 0, there is exactly one class R in Mh which is split, and moreover, if (P, a) and (P′, a′) split R, with P, P′ ∈ Mh, then (P, a)|R = (P′, a′)|R.

An automaton is slow for Hopcroft if, for all executions of Hopcroft’s algorithm, the splitters in the current waiting set either do not split or split in the same way: there is a unique class that is split into two classes, and always into the same two classes.

More formally, at each step (W, P) of an execution, there is at most one class R in the current partition P that is split, and for all splitters (P, a) and (P′, a′) in W that split R, one has (P, a)|R = (P′, a′)|R.

The definition is close to the statement in Proposition 6.1 above, and indeed, one has the following property.

Theorem 6.2. An automaton is slow for Moore if and only if it is slow for Hopcroft.

Proof. Let A be a finite automaton. We first suppose that A is slow for Moore. We consider an execution of Hopcroft’s algorithm, and we prove that each step of the execution that changes the partition produces a Moore partition.

This holds for the initial configuration (W, P), since P = M0. Assume that one has P = Mh for some configuration (W, P) and some h ≥ 0. Let R be the class of Mh split by Moore’s algorithm.

Let S ∈ W be the splitter chosen in Hopcroft’s algorithm. Then either S splits no class, and the partition remains equal to Mh, or by Proposition 6.1 it splits the class R. In the second case, this class is split by S into two new classes, say R′ and R′′. The partition P′ = P \ {R} ∪ {R′, R′′} is equal to Mh+1.

Conversely, suppose that A is slow for Hopcroft. We show that it is also slow for Moore by showing that the partition Mh+1 has only one class more than Mh. For this, we use Proposition 4.2, which states that each class R in Mh is refined in Mh+1 into the partition ℛ given by

ℛ = ⋀_{a∈A} ⋀_{P∈Mh} (P, a)|R .

We show by induction on the number of steps that, in any execution of Hopcroft’s algorithm, P = Mh for some configuration (P, W). This holds for the initial configuration. Let (W, a) be some splitter in W. It follows from Proposition 5.1 that

ℛ ≥ ⋀_{a∈A} ⋀_{(W,a)∈W} (W, a)|R .


Figure 5. An automaton over one letter recognizing the set of words of length at least n.

Figure 6. An automaton recognizing the Dyck words of “height” at most n.

Thus the partition ℛ is coarser than the partition of R obtained by Hopcroft’s algorithm. Since the automaton is slow for Hopcroft, the partition on the right-hand side has at most two elements. More precisely, there is exactly one class R that is split by Hopcroft’s algorithm into two classes. Since Moore’s partition Mh+1 is coarser, it contains precisely these classes. This proves the claim.

6.2 Examples

Example 6.1. The simplest example is perhaps the automaton given in Figure 5. For 0 ≤ h ≤ n − 1, the partition Mh is composed of the class {0, . . . , n − h − 1} and of the singleton classes {n − h}, {n − h + 1}, . . . , {n}. At each step, the last state is split off from the class {0, . . . , n − h − 1}.

Example 6.2. The automaton of Figure 6 recognizes the set D(n) of Dyck words w over {a, b} such that 0 ≤ |u|a − |u|b ≤ n for all prefixes u of w. The partition Mh, for 0 ≤ h ≤ n − 1, is composed of {0}, . . . , {h} and {∞, h + 1, . . . , n}. At h = n, the state ∞ is separated from state n.

Example 6.3. Let w = b1 · · · bn be a word of lengthn over the binary alphabet0, 1.We define an automatonAw over the unary alphabeta as follows. The state set ofAw

is 1, . . . , n and the next state function is defined byi ·a = i+1 for i < n andn ·a = 1.Note that the underlying labeled graph ofAw is just a cycle of lengthn. The final statesreally depend onw. The set of final states ofAw is F = 1 6 i 6 n | bi = 1. We callsuch an automaton acyclic automaton.

For a binary word u, we define Qu to be the set of states of Aw which are the starting positions of circular occurrences of u in w. If u is the empty word, then Qu is by convention the set Q of all states of Aw. By definition, the set F of final states of Aw is Q1, while its complement F^c is Q0.

Consider the automaton Aw for w = 01001010 given in Figure 7. The sets Q1, Q01 and Q11 of states are respectively {2, 5, 7}, {1, 4, 6} and ∅. If w is a Sturmian word, then the automaton Aw is slow.
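Under these conventions, the automaton Aw and the sets Qu can be recomputed directly. A small sketch (function names are ours):

```python
def cyclic_automaton(w):
    """The cyclic automaton A_w: states 1..n on a cycle i -> i+1 (mod n),
    final states at the positions of the letter 1 in w."""
    n = len(w)
    delta = {i: i % n + 1 for i in range(1, n + 1)}
    finals = {i for i in range(1, n + 1) if w[i - 1] == "1"}
    return delta, finals

def circular_positions(w, u):
    """Q_u: 1-based starting positions of circular occurrences of u in w."""
    if not u:
        return set(range(1, len(w) + 1))
    doubled = w + w[:len(u) - 1]           # wrap-around for circular reading
    return {i + 1 for i in range(len(w)) if doubled[i:i + len(u)] == u}
```

For w = 01001010 this reproduces Q1 = {2, 5, 7}, Q01 = {1, 4, 6} and Q11 = ∅.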


Figure 7. Cyclic automaton Aw for w = 01001010. Final states are circled.

Slow automata are closely related to binary Sturmian trees. Consider indeed a finite automaton, for instance over a binary alphabet. To this automaton corresponds an infinite binary tree, composed of all paths in the automaton. The nodes of the tree are labeled with the states encountered on the path. For each integer h ≥ 0, the number of distinct subtrees of height h is equal to the number of classes in the Moore partition Mh. It follows that the automaton is slow if and only if there are h + 1 distinct subtrees of height h for all h: this is precisely the definition of Sturmian trees, as given in [12].

We consider now the problem of showing that the running time O(n log n) for Hopcroft’s algorithm on n-state automata is tight. The algorithm has a degree of freedom because, in each step of its main loop, it allows one to choose the splitter to be processed. Berstel and Carton [13] introduced a family of finite automata based on de Bruijn words. These are exactly the cyclic automata Aw of Example 6.3 where w is a binary de Bruijn word. They showed that there exist “unlucky” sequences of choices that slow down the computation to achieve the lower bound Ω(n log n).

In the papers [20] and [21], Castiglione, Restivo and Sciortino replace de Bruijn words by Fibonacci words. They observe that for these words, and more generally for all circular standard Sturmian words, there is no more choice in Hopcroft’s algorithm. Indeed, the waiting set always contains only one element. The uniqueness of the execution of Hopcroft’s algorithm implies by definition that the associated cyclic automata for Sturmian words are slow.

They show that, for Fibonacci words, the unique execution of Hopcroft’s algorithm runs in time Ω(n log n), so that the worst-case behavior is achieved for the cyclic automata of Fibonacci words. The computation is carried out explicitly, using connections between Fibonacci numbers and Lucas numbers. In [22], they give a detailed analysis of the reduction process that is the basis of their computation, and they show that this process is isomorphic, for all standard Sturmian words, to the refinement process in Hopcroft’s algorithm.

In [11], the analysis of the running time of Hopcroft’s algorithm is extended to cyclic automata of standard Sturmian words. It is shown that the directive sequences for which Hopcroft’s algorithm has worst-case running time are those sequences (d1, d2, d3, . . .) for which the sequence of geometric means ((pn)^{1/n})_{n≥1}, where pn = d1 d2 · · · dn, is bounded.


7 Minimization by fusion

In this section, we consider the minimization of automata by fusion of states. An important application of this method is the computation of the minimal automaton recognizing a given finite set of words. This is widely used in computational linguistics for the space-efficient representation of dictionaries.

Let A be a deterministic automaton over the alphabetA, with set of statesQ. Thesignatureof a statep is the set of pairs(a, q) ∈ A×Q such thatp · a = q, together with aBoolean value denoting whetherp is final or not. Two statesp andq are calledmergeableif and only if they have the same signature. Thefusionor mergeof two mergeable statesp andq consists in replacingp andq by a single state. The state obtained by fusion of twomergeable states has the same signature.

Minimization of an automaton by a sequence of fusions of states with the same signature is not always possible. Consider the two-state automaton over the single letter a given in Figure 8, which recognizes a∗. It is not minimal. The signature of state 1 is +, (a, 2) and the signature of state 2 is +, (a, 1) (here “+” denotes an accepting state), so the states have different signatures and are not mergeable.
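The obstruction can be replayed directly from the definition. A sketch of the signature computation, with a representation chosen by us:

```python
def signature(delta, finals, p):
    """Signature of state p: its acceptance flag together with the ordered
    sequence of (letter, target) pairs.  `delta` maps (state, letter) to
    the target state."""
    moves = sorted((a, q) for (s, a), q in delta.items() if s == p)
    return (p in finals, tuple(moves))

# The two-state automaton of Figure 8 recognizing a*: both states are
# accepting and each sends the letter a to the other one.
delta = {(1, "a"): 2, (2, "a"): 1}
finals = {1, 2}
```

Both states are accepting, yet their signatures differ, so no fusion is possible even though the automaton is not minimal.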

Figure 8. An automaton recognizing the set a∗ which cannot be minimized by fusion of its states.

7.1 Local automata

M.-P. Béal and M. Crochemore [8] designed an algorithm for minimizing a special class of deterministic automata by a sequence of state mergings. These automata are called irreducible local automata. They occur quite naturally in symbolic dynamics. The running time of the algorithm is O(min(m(n − r + 1), m log n)), where m is the number of edges, n is the number of states, and r is the number of states of the minimized automaton. In particular, the algorithm is linear when the automaton is already minimal. Hopcroft’s algorithm has running time O(kn log n), where k is the size of the alphabet, and since kn ≥ m, it is worse than Béal and Crochemore’s algorithm. Moreover, their algorithm does not require the automaton to be complete.

The automata considered here have several particular features. First, all states are both initial and final. Next, they are irreducible, that is, their underlying graph is strongly connected. Finally, the automata are local. By definition, this means that two distinct cycles carry different labels. This implies that the label of a cycle is a primitive word, since otherwise there exist different traversals of the cycle which have the same label. In [8], the constructions and proofs are done for a more general family of automata called AFT (for automata of almost finite type). We sketch here the easier case of local automata. Since all states are final, two states p and q of an automaton are mergeable if and only if, for all letters a ∈ A, p · a is defined if and only if q · a is defined and, if this is the case, then p · a = q · a.


Figure 9. The automaton on the left is local; the automaton on the right is not, because of the two loops labeled a, and because the label bb of the cycle through 1 and 2 is not a primitive word.

The basic proposition is the following. It shows that an irreducible local automaton can be minimized by a sequence of fusions of states.

Proposition 7.1. If an irreducible local automaton is not minimal, then at least two of its states are mergeable.

The minimization algorithm assumes that the alphabet is totally ordered. It uses the notion of partial signature. First, with each state q is associated the signature σ(q) = a1p1a2p2 · · · ampm, where (a1, p1), . . . , (am, pm) is the signature of q, and the sequence is ordered by increasing value of the letters. Since all states are final, the Boolean indicator reporting this is omitted. A partial signature is any prefix a1p1a2p2 · · · aipi of a signature.

A first step consists in building a signature tree which represents the sets of states sharing a common partial signature. The root of the tree represents the set of all states, associated to the empty signature. A node representing the set of states with a partial signature a1p1a2p2 · · · aipi is the parent of the nodes representing the sets of states with a partial signature a1p1a2p2 · · · aipiai+1pi+1. As a consequence, leaves represent full signatures. All states that correspond to the same leaf are mergeable.

When mergeable states are detected in the signature tree, they can be merged. Then the signature tree has to be updated, and this is the difficult part of the algorithm.

7.2 Bottom-up minimization

In this section, all automata are finite, acyclic, deterministic and trim. A state p is called confluent if there are at least two edges in A ending in p.

A trie is an automaton whose underlying graph is a tree. Thus an automaton is a trie if and only if it has no confluent state.

Bottom-up minimization is the process of minimizing an acyclic automaton by a bottom-up traversal. In such a traversal, children of a node are treated before the node itself. During the traversal, equivalent states are detected and merged. The basic property of bottom-up minimization is that the check for (Nerode) equivalence reduces to equality of signatures. The critical point is to organize the states that are candidates in order to do this check efficiently.

The bottom-up traversal itself may be organized in several ways, for instance as a depth-first search with the order of traversal of the children determined by the order on the labels of the edges. Another traversal is by increasing height, as done in Revuz’s algorithm given next.


REVUZ(A)
    for h = 0 to HEIGHT(A) do
        ⊲ Compute states of height h
        S ← GETSTATESFORHEIGHT(h)
        ⊲ Compute and sort signatures
        SORTSIGNATURES(S)
        ⊲ Merge mergeable states
        for s ∈ S do
            if s and s.next have the same signature then
                ⊲ Mergeable states are consecutive
                MERGE(s, s.next)

Figure 10. Revuz’s minimization algorithm.

One popular method for the construction of a minimal automaton for a given finite set of words consists in first building a trie for this set and then minimizing it. Daciuk et al. [27] propose an incremental version which avoids this intermediate construction.

Recall that the signature of a state p is the set of pairs (a, q) such that p · a = q, together with a Boolean value indicating whether the state is final or not. It is tacitly understood that the alphabet of the automaton is ordered. The signature of a state is usually considered as the ordered sequence of pairs, where the order is determined by the letters. It is important to observe that the signature of a state evolves when states are merged. As an example, the state 6 of the automaton on the left of Figure 11 has signature +, (a, 3), (b, 10), and the same state has signature +, (a, 3), (b, 7) in the automaton on the right of the figure.

As already mentioned, if the minimization of the children of two states p and q has been done, then p and q are (Nerode) equivalent if and only if they have the same signature. So the problem to be considered is the bookkeeping of signatures, that is, the problem of detecting whether the signature of the currently considered state has already occurred before. In practical implementations, this is done by hash coding the signatures. This allows one to perform the test in constant average time. One remarkable exception is Revuz’s algorithm, to be presented now, and its extension by Almeida and Zeitoun that we describe later.

7.3 Revuz’s algorithm

Revuz [47] was the first to give an explicit description of a linear-time implementation of the bottom-up minimization algorithm. The principle of the algorithm was also described by [41].

Define the height of a state p in an acyclic automaton to be the length of the longest path starting at p. It is also the length of the longest word in the language of the subautomaton at p. Two equivalent states have the same height. Revuz’s algorithm operates by increasing height. It is outlined in Figure 10. Heights may be computed in linear time by a bottom-up traversal. The lists of states of a given height are collected during this traversal. The signature of a state is easy to compute provided the edges starting in a state have been sorted (by a bucket sort, for instance, to remain within the linear time constraint). Sorting the states by their signatures is again done by a lexicographic sort. As for Moore’s algorithm, the last step can be done by a simple scan of the list of states, since states with equal signatures are consecutive in the sorted list.
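The height-driven scheme of Figure 10 can be sketched as follows. Dictionaries stand in for the bucket and lexicographic sorts, so this sketch is not linear time, but it processes states in the same bottom-up order; names and the state representation are ours.

```python
def revuz_minimize(states, delta, finals):
    """Merge equivalent states of an acyclic deterministic automaton by
    increasing height.  `delta` maps (state, letter) -> state; returns a
    map state -> representative state."""
    succ = {p: sorted((a, q) for (s, a), q in delta.items() if s == p)
            for p in states}
    memo = {}
    def height(p):                      # length of the longest path from p
        if p not in memo:
            memo[p] = 1 + max((height(q) for _, q in succ[p]), default=-1)
        return memo[p]
    by_height = {}
    for p in states:
        by_height.setdefault(height(p), []).append(p)
    rep, seen = {}, {}
    for h in sorted(by_height):         # bottom-up: children already merged
        for p in by_height[h]:
            sig = (p in finals, tuple((a, rep[q]) for a, q in succ[p]))
            rep[p] = seen.setdefault(sig, p)
    return rep
```

On the trie of {aa, ab, b}, for instance, the three accepting leaves collapse into one state.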


Figure 11. On the left, a simple automaton: its nontrivial strongly connected components are the cycles 2, 4 and 3, 5, 9, 6. The minimization starts by merging 11, 12 and 8, and also 10 and 7. This gives the automaton on the right.

The whole algorithm can be implemented to run in time O(m) for an automaton with m edges.

Revuz’s algorithm relies on a particular bottom-up traversal of the trie. This traversal is defined by increasing height of states, and it makes the check for equivalent states easier. With another method for checking signatures in mind, like hash coding, the algorithm may be organized in a different way. For instance, the traversal by heights can be replaced by a traversal in lexicographic order. The last item in the algorithm may be replaced by another check. Whenever a state has been found which must be in the minimal automaton, its hash code is registered. When the signature of a state is computed, one checks whether its hash code is registered. If not, it is added to the register; otherwise the state is merged with the state already registered.

Several implementations have been given in various packages. A comparison has been given in [25]. See also Table 2 below for numerical data.

7.4 The algorithm of Almeida and Zeitoun

Almeida and Zeitoun [3] consider an extension of the bottom-up minimization algorithm to automata which contain only simple cycles. They describe a linear-time algorithm for these automata.

Let A be a finite trim automaton. We call itsimpleif every nontrivial strongly con-nected component is a simple cycle, that is if every vertex ofthe component has exactlyone successor vertex in this component. The automaton givenon the left of Figure 11is simple. Simple automata are interesting because they recognize exactly the boundedregular languages or, equivalently, the languages with polynomial growth. These are thesimplest infinite regular languages.

The starting point of the investigation of [3] is the observation that minimization can be split into two parts: minimization of an acyclic automaton and minimization of the set of strongly connected components. There are three subproblems, namely (1) minimization of each strongly connected component, (2) identification and fusion of isomorphic minimized strongly connected components, and (3) wrapping, which consists in merging states which are equivalent to a state in a strongly connected component, but which are not in this component. The authors show that if these subproblems can be solved in linear time, then, by a bottom-up algorithm which is a sophistication of Revuz’s algorithm, the whole automaton can be minimized in linear time. Almeida and Zeitoun show how this can be done for simple automata. The outline of the algorithm is given in Figure 12.

The algorithm works as Revuz’s algorithm as long as no nontrivial strongly connected component occurs. In our example automaton, the states 8, 11 and 12 are merged, and the states 10 and 7 are also merged. This gives the automaton on the right of Figure 11.

Then a cycle which has all its descendants minimized is checked for possible minimization. This is done as follows: the weak signature of a state p of a cycle is the signature obtained by replacing the name of its successor in the cycle by a dummy symbol, say □. In our example, the weak signatures of the states 3, 5, 9, 6 are respectively

−a8b□ , +a□b7 , −a8b□ , +a□b7 .

Here we write ‘+’ when the state is final, and ‘−’ otherwise.

It is easily seen that the cycle is minimal if and only if the word composed of the sequence of signatures is primitive. In our example, the word is not primitive since it is a square, and the cycle can be reduced by identifying states that are at corresponding positions in the word; that is, states 5 and 6 can be merged, and states 3 and 9. This gives the automaton on the left of Figure 13.
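The primitivity test on the word of weak signatures is short to sketch (we write the dummy symbol as '#'; names are ours):

```python
def smallest_period(seq):
    """Length of the shortest u such that seq = u^k for some k >= 1."""
    n = len(seq)
    return next(d for d in range(1, n + 1)
                if n % d == 0 and seq == seq[:d] * (n // d))

def is_primitive(seq):
    """A word is primitive iff it is not a proper power."""
    return smallest_period(seq) == len(seq)
```

On the signatures of the states 3, 5, 9, 6, the word is a square, so the cycle folds onto a cycle of half the length.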

Similarly, in order to check whether two (primitive) cycles can be merged, one checks whether the words of their weak signatures are conjugate. In our example, the cycles 2, 4 and 3, 5 have the signatures

+a□b7 , −a8b□   and   −a8b□ , +a□b7 .

These words are conjugate and the corresponding states can be merged. This gives the automaton on the right of Figure 13. This automaton is minimal.
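Conjugacy of the two signature words can be tested with the classical criterion: u and v are conjugate if and only if they have the same length and u is a factor of vv. A sketch (names are ours):

```python
def are_conjugate(u, v):
    """True iff u is a rotation (conjugate) of v; works on strings and
    tuples alike, since both support concatenation and slicing."""
    return len(u) == len(v) and any((v + v)[i:i + len(u)] == u
                                    for i in range(len(v) + 1))
```

This is quadratic as written; in the linear-time algorithm the comparison is done through canonical representatives of the cycles, as explained below.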

A basic argument for preserving the linearity of the algorithm is the fact that the minimal conjugate of a word can be computed in linear time. This can be done for instance by Booth’s algorithm (see [24]). Thus, testing whether a cycle is minimized takes time proportional to its length, and for each minimized cycle, a canonical representative, namely the unique minimal conjugate, which is a Lyndon word, can be computed in time proportional to its length. The equality of two cycles then reduces to the equality of the two

ALMEIDAZEITOUN(A)
    ⊲ States and cycles of height 0
    S ← ZEROHEIGHT(A)
    while S ≠ ∅ do
        ⊲ Minimize each cycle
        MINIMIZECYCLES(S)
        ⊲ Compute and sort signatures and merge
        MERGEISOMORPHICCYCLES(S)
        ⊲ Search states to wrap
        WRAP(S)
        ⊲ Compute states for next height
        S ← NEXTHEIGHT(A, S)

Figure 12. The algorithm of Almeida and Zeitoun.


Figure 13. The minimization continues by merging the states 5 and 6, and the states 3 and 9. This gives the automaton on the left. The last step of minimization merges the states 2 and 5, and the states 3 and 4.

Figure 14. The automaton on the left has one minimal cycle of height 1. By wrapping, states 6 and 8, and states 5 and 7 are merged, respectively, giving the automaton on the right.

associated words. Finding isomorphic cycles is accomplished by a lexicographic ordering of the associated words, followed by a simple scan for equal words.
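Booth’s algorithm itself fits in a few lines. This is the classical failure-function formulation, returning the starting index of the least conjugate in linear time (variable names are ours):

```python
def least_rotation(s):
    """Booth's algorithm: index of the lexicographically least rotation
    of s, computed in linear time over the doubled word."""
    s2 = s + s
    f = [-1] * len(s2)        # failure function over the doubled word
    k = 0                     # start of the least rotation found so far
    for j in range(1, len(s2)):
        c = s2[j]
        i = f[j - k - 1]
        while i != -1 and c != s2[k + i + 1]:
            if c < s2[k + i + 1]:
                k = j - i - 1
            i = f[i]
        if c != s2[k + i + 1]:
            if c < s2[k]:     # i == -1 at this point
                k = j
            f[j - k] = -1
        else:
            f[j - k] = i + 1
    return k
```

For instance, the least rotation of "baca" starts at index 3 and yields the Lyndon word "abac".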

A few words on wrapping: it may happen that states of distinct heights in a simple automaton are equivalent. An example is given in Figure 14. Indeed, states 6 and 8 have the same signature and therefore are mergeable, but have height 1 and 0, respectively. This situation is typical: when states s and t are mergeable and have distinct heights, and t belongs to a minimized component of current height, then s is a singleton component on a path to the cycle of t. Wrapping consists in detecting these states, and in “winding” them around the cycle. In our example, both 6 and 5 are wrapped in the component of 7 and 8. In the algorithm given above, a wrapping step is performed at each iteration, after the minimization of the states and the cycles and before computing the states and cycles of the next height. In our example, after the first iteration, states 3 and 4 are mergeable. A second wrapping step merges 3 and 4. These operations are reported in Figure 15. A careful implementation can realize all these operations in global linear time.


Figure 15. The automaton on the left has one minimal cycle of height 1. By wrapping, states 6 and 8, and states 5 and 7 are merged, respectively, giving the automaton on the right.

7.5 Incremental minimization: the algorithm of Daciuk et al.

The algorithm presented in [27] is an incremental algorithm for the construction of a minimal automaton for a given set of words that is lexicographically sorted.

The algorithm is easy to implement and it is efficient: the construction of an automaton recognizing a typical dictionary is done in a few seconds. Table 2 was kindly communicated by Sébastien Paumier. It contains the space savings and the computation times for dictionaries of various languages.

The algorithm described here is simple because the words are sorted. There exist other incremental algorithms for the case of unsorted sets. One of them will be described in the next section. Another algorithm, called semi-incremental because it requires a final minimization step, is given in [51].

We start with some notation. Let A = (Q, i, F) be a finite, acyclic, deterministic and trim automaton. We say that a word x is in the automaton A if i · x is defined. In other words, x is in A if x is a prefix of some word recognized by A. Let w be a word to be added to the set recognized by an automaton A. The factorization

w = x · y ,

where x is the longest prefix of w which is in A, is called the prefix-suffix decomposition of w. The word x (resp. y) is the common prefix (resp. corresponding suffix) of w.

One has x = ε if either w = ε or i · a is undefined, where a is the initial letter of w. Similarly, y = ε if w itself is in A. If y ≠ ε and starts with the letter b, then i · xb = ⊥.

The insertion of a word y at state p is an operation that is performed provided y = ε or p · b = ⊥, where b is the initial letter of y. If y = ε, the insertion simply consists in adding state p to the set F of final states. If y ≠ ε, set y = b1b2 · · · bm. The insertion consists in adding new states p1, . . . , pm to Q, with the next-state function defined by p · b1 = p1 and pi−1 · bi = pi for i = 2, . . . , m. Furthermore, pm is added to the set F of final states.

Assume that the language recognized by A is not empty, and that the word w is lexicographically greater than all words in A. Then w is not in A. So the common prefix x of w is strictly shorter than w, and the corresponding suffix y is nonempty.

The incremental algorithm works as follows. At each step, a new word w that is lexicographically greater than all previous ones is inserted in the current automaton A. First, the prefix-suffix decomposition w = xy of w and the state q = i · x are computed. Then the segment starting at q of the path carrying the suffix y′ of the previously inserted


File       Lines     Text file   States    Trans.    Size     Revuz    Daciuk
delaf-de   189878    12.5Mb      57165     103362    1.45Mb   4.22s    4.44s
delaf-en   296637    13.2Mb      109965    224268    2.86Mb   5.94s    6.77s
delaf-es   638785    35.4Mb      56717     117417    1.82Mb   10.61s   11.28s
delaf-fi   256787    24.6Mb      124843    133288    4.14Mb   6.40s    7.02s
delaf-fr   687645    38.7Mb      109466    240409    3.32Mb   13.03s   14.14s
delaf-gr   1288218   83.8Mb      228405    442977    7.83Mb   28.33s   31.02s
delaf-it   611987    35.9Mb      64581     161718    1.95Mb   10.43s   11.46s
delaf-no   366367    23.3Mb      75104     166387    2.15Mb   6.86s    7.44s
delaf-pl   59468     3.8Mb       14128     20726     502Kb    1.19s    1.30s
delaf-pt   454241    24.8Mb      47440     115694    1.4Mb    7.87s    8.45s
delaf-ru   152565    10.8Mb      23867     35966     926Kb    2.95s    3.17s
delaf-th   33551     851Kb       36123     61357     925Kb    0.93s    1.14s

Table 2. Running time and space requirement for the computation of minimal automata (communication of Sébastien Paumier).

DACIUKETAL(A)
    ⊲ Words are given in lexicographic order
    for all w do
        ⊲ x is the longest prefix of w in A
        (x, y) ← PREFSUFFDECOMP(w)
        ⊲ q is the state reached by reading x
        q ← i · x
        ⊲ Minimize the states on this path
        MINIMIZELASTPATH(q)
        ⊲ Add a path starting in q and carrying y
        ADDPATH(q, y)

Figure 16. The incremental algorithm of Daciuk et al.

word w′ is minimized by merging states with the same signature. Finally, the suffix y is inserted at state q. The algorithm is given in Figure 16.

The second step deserves a more detailed description. We observe first that the word x of the prefix-suffix decomposition w = xy of w is in fact the greatest common prefix of w′ and w. Indeed, the word x is a prefix of some word recognized by A (here A is the automaton before adding w), and since w′ is the greatest word in A, the word x is a prefix of w′. Thus x is a common prefix of w′ and w. Next, if x′ is a common prefix of w′ and w, then x′ is in A because it is a prefix of w′, and consequently x′ is a prefix of x because x is the longest prefix of w in A. This shows the claim.

There are two cases for the merge. If w′ is a prefix of w, then w′ = x. In this case, there is no minimization to be performed.

If w′ is not a prefix of w, then the paths for w′ and for w share a common initial segment carrying the prefix x, from the initial state to the state q = i · x. The minimization concerns the states on the path from q to t′ carrying the suffix y′ of the factorization w′ = xy′ of w′. Each of the states on this path, except the state q, will never be visited again in any insertion that may follow, so they can be merged with previous states.


Figure 17. (a) The automaton for aa, aba. (b) The automaton for aa, aba, ba. Here state 4 has been merged with state 2.

Figure 18. (a) The automaton for aa, aba, ba, bba. (b) The automaton for aa, aba, ba, bba, bc. After inserting bc, states 8 and 2 are merged, and then states 7 and 2.

Example 7.1. We consider the sequence of words (aa, aba, ba, bba, bc). The first two words give the automaton of Figure 17(a). Adding the word ba permits the merge of states 2 and 4. The resulting automaton is given in Figure 17(b). After inserting bba, there is a merge of states 6 and 2; see Figure 18(a).
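The run of Example 7.1 can be replayed with a compact sketch of the incremental algorithm. The register maps signatures to representative states, and merging redirects the single edge entering the merged state; naming and representation are ours, and no attention is paid to the constant factors of a serious implementation.

```python
def build_minimal_dfa(words):
    """Incremental construction of the minimal automaton of a sorted list
    of distinct words, following the scheme of Figure 16."""
    delta, finals, register = {}, set(), {}
    initial, fresh = 0, 1
    path, last = [0], ""         # states along the previously inserted word

    def signature(p):
        return (p in finals,
                tuple(sorted((a, q) for (s, a), q in delta.items() if s == p)))

    def minimize_last_path(down_to):
        # Merge the states of the last path lying below the common prefix.
        while len(path) - 1 > down_to:
            p = path.pop()
            rep = register.setdefault(signature(p), p)
            if rep != p:
                delta[(path[-1], last[len(path) - 1])] = rep  # redirect edge
                for key in [k for k in delta if k[0] == p]:
                    del delta[key]                            # drop dead state
                finals.discard(p)

    for w in words:
        assert w > last, "words must be sorted and distinct"
        x = 0                    # length of the common prefix of last and w
        while x < min(len(w), len(last)) and w[x] == last[x]:
            x += 1
        minimize_last_path(x)
        for a in w[x:]:          # append the remaining suffix as a chain
            delta[(path[-1], a)] = fresh
            path.append(fresh)
            fresh += 1
        finals.add(path[-1])
        last = w
    minimize_last_path(0)
    return delta, finals, initial
```

On the word sequence of Example 7.1 this produces a five-state automaton, which agrees with the Nerode classes of the language {aa, aba, ba, bba, bc}.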

8 Dynamic minimization

Dynamic minimization is the process of maintaining an automaton minimal when insertions or deletions are performed.

A solution for adding and for removing a word was proposed by Carrasco and Forcada [19]. It consists in an adaptation of the usual textbook constructions for intersection and complement to the special case where one of the languages is a single word. It appears that the finiteness of the language L plays no special role, so we assume here that it is regular, not necessarily finite. The construction for adding a word has also been proposed in [49], and in [27] for acyclic automata. An extension to general automata, and several other issues, are discussed in [26].

We consider here, for lack of space, only deletion of a word from the set recognized by an automaton, and minimization of the new automaton.

Let A = (Q, i, T) be the minimal automaton recognizing a language L over the alphabet A, and let w be a word in L. Denote by Aw the minimal automaton recognizing



Figure 19. The minimal automaton recognizing the complement of the word abab. Only the state abab is not final.

the complement A∗ \ {w}. The automaton has n + 2 states, with n = |w|. Among them, there are n + 1 states that are identified with the set P of prefixes of w. The last state is a sink state denoted ⊥. An example is given in Figure 19.
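The automaton Aw is straightforward to build explicitly from this description; here is a small Python sketch (all identifiers are ours).

```python
# Sketch of the automaton A_w recognizing A* \ {w}: its n+2 states are the
# n+1 prefixes of w plus a sink state, and every state except w itself is
# final. All identifiers are ours, chosen for illustration.

SINK = "<sink>"

def complement_of_word(w, alphabet):
    prefixes = {w[:i] for i in range(len(w) + 1)}
    delta = {}
    for p in prefixes:
        for a in alphabet:
            # follow the prefix path while the input spells out w; any
            # deviation falls into the (final) sink state
            delta[p, a] = p + a if p + a in prefixes else SINK
    for a in alphabet:
        delta[SINK, a] = SINK
    finals = (prefixes - {w}) | {SINK}
    return delta, "", finals   # (transitions, initial state, final states)

def accepts(delta, init, finals, word):
    q = init
    for a in word:
        q = delta[q, a]
    return q in finals
```

For w = abab this yields the six-state automaton of Figure 19, in which only the state abab is not final.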

The language L \ {w} is equal to L ∩ (A∗ \ {w}), so it is recognized by the trimmed part B of the product automaton A × Aw. Its initial state is (i, ε), and its states are of three kinds.

• intact states: these are states of the form (q, ⊥) with q ∈ Q. They are called so because the language recognized at (q, ⊥) in B is the same as the language recognized at q in A: LB(q, ⊥) = LA(q).

• cloned states: these are accessible states (q, x) with x ∈ P, so x is a prefix of w. Since we require these states to be accessible, one has q = i · x in A, and there is one such state for each prefix. The next-state function on these states is defined by (q, x) · a = (q · a, xa) if xa ∈ P, and (q, x) · a = (q · a, ⊥) otherwise. Observe that (i · w, ⊥) is an intact state because w is assumed to be recognized by A.

• useless states: these are all states that are removed when trimming the product automaton.

Trimming consists here in removing the state (i, ⊥) if it is no longer accessible, and the states reachable only from this state. For this, one follows the path defined by w and starting in (i, ⊥) and removes the states until one reaches a confluent state (one that has at least two incoming edges). The automaton obtained is minimal.

The whole construction finally consists in keeping the initial automaton, by renaming a state q as (q, ⊥), adding a cloned path, and removing the state (i, ⊥) if it is no longer accessible, together with the states reachable only from this state.

Of course, one may also use the textbook construction directly, that is, without taking advantage of the existence of the automaton given at the beginning. For this, one starts at the new initial state (i, ε) and one builds only the accessible part of the product automaton. This method has complexity O(n + |w|), where n is the number of states of the initial automaton, whereas the previous method has complexity only O(|w|).

Example 8.1. The automaton given in Figure 21 recognizes the language L = (ab)+ ∪ {abc, acb}. The direct product with the automaton of Figure 19 is shown in Figure 22. Observe that there are intact states that are not accessible from the new initial state (0, ε).


REMOVEINCREMENTAL(w, A)
  ⊲ Add a fresh path for w in A
  A′ ← ADDCLONEDPATH(w, A)
  ⊲ Return the trimmed automaton
  TRIM(A′)

ADDCLONEDPATH(a1 · · · an, A)
  ⊲ Add a fresh initial state q0
  p0 ← INITIAL(A); q0 ← CLONE(p0)
  for i = 1 to n do
    ⊲ qi inherits the transitions of pi
    pi ← pi−1 · ai; qi ← CLONE(pi)
    ⊲ This edge is redirected
    qi−1 · ai ← qi
  SETFINAL(qn, false)

Figure 20. Removing the word w from the language recognized by A.


Figure 21. The minimal automaton recognizing the language L = (ab)+ ∪ {abc, acb}.

The minimal automaton is shown in Figure 23.

9 Extensions and special cases

In this section, we consider extensions of the minimization problem to other classes of automata. The most important problem is to find a minimal nondeterministic automaton


Figure 22. The automaton recognizing the language L = ((ab)+ ∪ {abc, acb}) \ {abab}. There are still unreachable states (shown in gray).



Figure 23. The minimal automaton for the language L = ((ab)+ ∪ {abc, acb}) \ {abab}.

recognizing a given regular language. Other problems, not considered here, concern sets of infinite words and the minimization of their accepting devices, and the use of other kinds of automata known to be equivalent with respect to their accepting capability, such as two-way automata or alternating automata, see for instance [46].

9.1 Special automata

We briefly mention here special cases where minimization plays a role. It is well known that string matching is closely related to the construction of particular automata. If w is a nonempty word over an alphabet A, then searching for all occurrences of w as a factor in a text t is equivalent to computing all prefixes of t ending in w, and hence to determining all prefixes of t which are in the regular language A∗w. The minimal automaton recognizing A∗w has n + 1 states, where n = |w|, and can be constructed in linear time. The automaton has many interesting properties. For instance there are at most 2n edges, when one does not count edges ending in the initial state. This is due to Imre Simon, see also [33]. For a general exposition, see e.g. [24]. Extension of string matching to a finite set X of patterns has been done by Aho and Corasick. The associated automaton is called the pattern matching machine; it can be computed in time linear in the sum of the lengths of the words in X. See again [24]. However, this automaton is not minimal in general. Indeed, the number of states is the number of distinct prefixes of words in X, and this may be greater than the number of states in the minimal automaton (consider for example the set X = {ab, bb} over the alphabet A = {a, b}). There are some investigations on the complexity of minimizing Aho–Corasick automata, see [2].
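The minimal automaton of A∗w can be computed with the classical failure-function technique, in the style of Knuth–Morris–Pratt. The following Python sketch (identifiers ours) builds its transition table: state k means that the longest suffix of the input that is a prefix of w has length k, and state n is the unique final state.

```python
def string_matching_automaton(w, alphabet):
    """Transition table of the minimal (n+1)-state DFA recognizing A*w.
    Identifiers are ours, chosen for illustration."""
    n = len(w)
    delta = [dict() for _ in range(n + 1)]
    for a in alphabet:
        delta[0][a] = 1 if n > 0 and a == w[0] else 0
    x = 0   # state reached on w[1:k]: the longest proper border of w[:k]
    for k in range(1, n + 1):
        for a in alphabet:
            # extend the match on w[k], otherwise fall back via the border
            delta[k][a] = (k + 1) if (k < n and a == w[k]) else delta[x][a]
        if k < n:
            x = delta[x][w[k]]
    return delta

def accepts(delta, word):
    q = 0
    for a in word:
        q = delta[q][a]
    return q == len(delta) - 1   # the input has w as a suffix
```

Each state falls back through borders exactly as in string matching, which is where the linear-time construction and the bound on the number of edges come from.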

Another famous minimal automaton is the suffix automaton. This is the minimal automaton recognizing all suffixes of a given word. The number of states of the suffix automaton of a word of length n is less than 2n, and the number of its edges is less than 3n. Algorithms for constructing suffix automata in linear time have been given in [23] and [17], see again [24] for details.
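As a sketch of how such an automaton can be obtained online, here is a compact Python version of the classical construction (in the style of [17]); all identifiers are ours.

```python
def suffix_automaton(s):
    """Online construction of the suffix automaton of s.
    Each state has a transition map, a suffix link, and a length.
    Identifiers are ours, chosen for illustration."""
    trans, link, length = [{}], [-1], [0]
    last = 0
    for c in s:
        cur = len(trans)
        trans.append({}); link.append(-1); length.append(length[last] + 1)
        p = last
        while p != -1 and c not in trans[p]:
            trans[p][c] = cur
            p = link[p]
        if p == -1:
            link[cur] = 0
        else:
            q = trans[p][c]
            if length[q] == length[p] + 1:
                link[cur] = q
            else:
                # split q: create a clone with the shorter length
                clone = len(trans)
                trans.append(dict(trans[q])); link.append(link[q])
                length.append(length[p] + 1)
                while p != -1 and trans[p].get(c) == q:
                    trans[p][c] = clone
                    p = link[p]
                link[q] = clone
                link[cur] = clone
        last = cur
    # final states: those on the suffix-link path from the last state
    finals, p = set(), last
    while p != -1:
        finals.add(p); p = link[p]
    return trans, finals

def is_suffix(trans, finals, w):
    q = 0
    for c in w:
        if c not in trans[q]:
            return False
        q = trans[q][c]
    return q in finals
```

The state count stays below 2n because at most one clone is created per input letter.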

9.2 Nondeterministic automata

A nondeterministic automaton is minimal if it has the minimal number of states among all automata recognizing the same language. Nondeterministic automata are not unique. In Figure 24, we give two non-isomorphic nondeterministic automata which are both smaller



Figure 24. Two non-isomorphic nondeterministic automata recognizing the set a(b∗ ∪ c∗)ab+.

than the minimal deterministic automaton recognizing the same language. This language is a(b∗ ∪ c∗)ab+. The example is derived from an automaton given in [4].

One might ask if there are simple conditions on the automata or on the language that ensure that the minimal nondeterministic automaton is unique. For instance, the automata of Figure 25 both recognize the same language, but the second has a particular property that we will describe now. The uniqueness of the minimal automaton in the deterministic

Figure 25. Two nondeterministic automata recognizing the set of words ending with the letter a.

case is related to the fact that the futures of the states of such an automaton are pairwise distinct, and that each future is some left quotient of the language: for each state q, the language Lq(A) is equal to a set y−1L, for some word y.

This characterization has been the starting point for investigating similar properties of nondeterministic automata. Let us call a (nondeterministic) automaton a residual automaton if the futures of its states are left quotients of the language; it has been shown in [30] that, among all residual automata recognizing a given language, there is a unique residual automaton having a minimal number of states; moreover, this automaton is characterized by the fact that the set of its futures is the set of the prime left quotients of the language, a left quotient being prime if it is not the union of other nonempty left quotients. For instance, the automaton on the right of Figure 25 has this property, since L0 = {a, b}∗a and L1 = a−1L0 = {ε} ∪ {a, b}∗a and there are no other nonempty left quotients. The automaton on the left of Figure 25 is not residual since the future of state 1 is not a left quotient.
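In a deterministic automaton, the left quotient y−1L is simply the future of the state i · y, so the quotients of the example above are easy to test in code. The sketch below (identifiers ours) checks, on a finite sample of words, that the minimal DFA for L = {a, b}∗a has exactly the two quotients L0 and L1 = a−1L0.

```python
from itertools import product

# Sketch: in a deterministic automaton the left quotient y^{-1}L is the
# future of the state i.y. We test this on L = {a,b}*a, whose minimal DFA
# has two states. Identifiers are ours, chosen for illustration.

delta = {(0, "a"): 1, (0, "b"): 0, (1, "a"): 1, (1, "b"): 0}
finals = {1}   # state 1: the input read so far ends with the letter a

def run(q, w):
    for c in w:
        q = delta[q, c]
    return q

def future_accepts(q, w):
    return run(q, w) in finals

# all words over {a,b} of length at most 3, as a finite sample of A*
sample = ["".join(p) for n in range(4) for p in product("ab", repeat=n)]
L0 = {w for w in sample if future_accepts(0, w)}            # L itself
L1 = {w for w in sample if future_accepts(run(0, "a"), w)}  # a^{-1}L
```

On the sample, L1 is exactly {ε} ∪ L0, and every word y leads to one of only two states, so L has only two nonempty left quotients.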

The problem of converting a given nondeterministic automaton into a minimal nondeterministic automaton is NP-hard, even over a unary alphabet [36]. This applies also to unambiguous automata [43]. In [15], these results have been extended as follows. The authors define a class δNFA of automata that are unambiguous, have at most two computations for each string, and have at most one state with two outgoing transitions carrying the same letter. They show that minimization is NP-hard for all classes of finite automata that include δNFA, and they show that these hardness results can also be adapted to the


setting of unambiguous automata that can nondeterministically choose between two start states, but are deterministic everywhere else.

Even approximating minimization of nondeterministic automata is intractable, see [31]. There is an algebraic framework that allows one to represent and to compute all automata recognizing a given regular language. The state of this theory, which goes back to Kameda and Weiner [39], has been described by Lombardy and Sakarovitch in a recent survey paper [42].

There is a well-known exponential blow-up from nondeterministic automata to deterministic ones. The usual textbook example, already given in the first section (the automaton on the left in Figure 1), shows that this blow-up holds also for unambiguous automata, even if there is only one edge that causes the nondeterminism.
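The blow-up can be observed directly with the subset construction. The following Python sketch (identifiers ours) determinizes the classical witness, the (n+1)-state NFA for "the n-th letter from the end is an a", and counts 2^n reachable subsets.

```python
# Sketch of the exponential blow-up: the (n+1)-state NFA for "the n-th
# letter from the end is an a" determinizes into a DFA with 2^n reachable
# states. All identifiers are ours, chosen for illustration.

def nfa_nth_from_end(n):
    # state 0 loops and guesses the decisive a; states 1..n count it down
    delta = {(0, "a"): {0, 1}, (0, "b"): {0}}
    for i in range(1, n):
        delta[i, "a"] = {i + 1}
        delta[i, "b"] = {i + 1}
    return delta, {0}   # (transitions, set of initial states)

def reachable_subsets(nfa):
    """Subset construction: return the set of reachable subsets of states."""
    delta, init = nfa
    start = frozenset(init)
    seen, stack = {start}, [start]
    while stack:
        S = stack.pop()
        for a in "ab":
            T = frozenset(t for q in S for t in delta.get((q, a), ()))
            if T not in seen:
                seen.add(T)
                stack.append(T)
    return seen
```

Every reachable subset contains state 0 together with an arbitrary subset of {1, …, n}, which records the positions of the a's among the last n letters, so all 2^n subsets occur and are pairwise inequivalent.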

It has been shown that any value of blow-up can be obtained, in the following sense [37]: for all integers n, N with n ⩽ N ⩽ 2^n, there exists a minimal nondeterministic automaton with n states over a four-letter alphabet whose equivalent minimal deterministic automaton has exactly N states. This was improved to ternary alphabets [38].

Acknowledgements

We had several helpful discussions with Marie-Pierre Béal, Julien David, Sylvain Lombardy, Wim Martens, Cyril Nicaud, Sébastien Paumier, Jean-Éric Pin and Jacques Sakarovitch. We thank Narad Rampersad for his careful reading of the text.

References

[1] A. Aho, J. E. Hopcroft, and J. D. Ullman. The design and analysis of computer algorithms. Addison-Wesley, 1974.

[2] O. AitMous, F. Bassino, and C. Nicaud. Building the minimal automaton of A∗X in linear time, when X is of bounded cardinality. In A. Amir and L. Parida, editors, Combinatorial Pattern Matching, 21st Annual Symposium, CPM 2010, volume 6129 of Lecture Notes in Comput. Sci., pages 275–287. Springer-Verlag, 2010.

[3] J. Almeida and M. Zeitoun. Description and analysis of a bottom-up DFA minimization algorithm. Inform. Process. Lett., 107(2):52–59, 2008.

[4] A. Arnold, A. Dicky, and M. Nivat. A note about minimal non-deterministic automata. Bull. European Assoc. Theor. Comput. Sci., 47:166–169, 1992.

[5] M. Baclet and C. Pagetti. Around Hopcroft's algorithm. In 11th Conference on Implementation and Application of Automata (CIAA), volume 4094 of Lecture Notes in Comput. Sci., pages 114–125. Springer-Verlag, 2006.

[6] F. Bassino, J. David, and C. Nicaud. Average case analysis of Moore's state minimization algorithm. Submitted.

[7] F. Bassino, J. David, and C. Nicaud. On the average complexity of Moore's state minimization algorithm. In S. Albers and J.-Y. Marion, editors, STACS 2009, Proc. 26th Symp. Theoretical Aspects of Comp. Sci., volume 09001 of Dagstuhl Seminar Proceedings, pages 123–134. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2009.

[8] M.-P. Béal and M. Crochemore. Minimizing local automata. In G. Caire and M. Fossorier, editors, IEEE International Symposium on Information Theory (ISIT'07), number 07CH37924C, pages 1376–1380. IEEE, 2007.

[9] M.-P. Béal and M. Crochemore. Minimizing incomplete automata. In Workshop on Finite-State Methods and Natural Language Processing 2008. Ispra, September 2008.

[10] D. Beauquier, J. Berstel, and P. Chrétienne. Éléments d'algorithmique. Masson, 1992.

[11] J. Berstel, L. Boasson, and O. Carton. Continuant polynomials and worst-case behavior of Hopcroft's minimization algorithm. Theoret. Comput. Sci., 410(30-32):2811–2822, 2009.

[12] J. Berstel, L. Boasson, O. Carton, and I. Fagnot. Sturmian trees. Theory Comput. Systems, 46:443–478, 2010.

[13] J. Berstel and O. Carton. On the complexity of Hopcroft's state minimization algorithm. In 9th Conference on Implementation and Application of Automata (CIAA), volume 3317 of Lecture Notes in Comput. Sci., pages 35–44. Springer-Verlag, 2004.

[14] J. Berstel and D. Perrin. Algorithms on words. In M. Lothaire, editor, Algebraic combinatorics on words, volume 90 of Encyclopedia of Mathematics and its Applications, chapter 1, pages 1–100. Cambridge University Press, 2002.

[15] H. Björklund and W. Martens. The tractability frontier for NFA minimization. In L. Aceto, I. Damgård, L. A. Goldberg, M. M. Halldórsson, A. Ingólfsdóttir, and I. Walukiewicz, editors, Automata, Languages and Programming, 35th International Colloquium, ICALP 2008, volume 5126 of Lecture Notes in Comput. Sci., pages 27–38. Springer-Verlag, 2008.

[16] N. Blum. An O(n log n) implementation of the standard method for minimizing n-state finite automata. Inform. Process. Lett., 57:65–69, 1996.

[17] A. Blumer, J. A. Blumer, D. Haussler, A. Ehrenfeucht, M. T. Chen, and J. I. Seiferas. The smallest automaton recognizing the subwords of a text. Theoret. Comput. Sci., 40:31–55, 1985.

[18] J. A. Brzozowski. Canonical regular expressions and minimal state graphs for definite events. In Proc. Sympos. Math. Theory of Automata (New York, 1962), pages 529–561. Polytechnic Press of Polytechnic Inst. of Brooklyn, Brooklyn, N.Y., 1963.

[19] R. C. Carrasco and M. L. Forcada. Incremental construction and maintenance of minimal finite-state automata. Comput. Linguist., 28(2), 2002.

[20] G. Castiglione, A. Restivo, and M. Sciortino. Circular words and automata minimization. In P. Arnoux, N. Bédaride, and J. Cassaigne, editors, Words 2007, pages 79–89. Institut de Mathématiques de Luminy, 17–21 September 2007.

[21] G. Castiglione, A. Restivo, and M. Sciortino. Hopcroft's algorithm and cyclic automata. In C. Martín-Vide, F. Otto, and H. Fernau, editors, Language and Automata Theory and Applications, volume 5196 of Lecture Notes in Comput. Sci., pages 172–183. Springer-Verlag, 2008. Tarragona, Spain, March 13–19, 2008. Revised Papers.

[22] G. Castiglione, A. Restivo, and M. Sciortino. Circular Sturmian words and Hopcroft's algorithm. Theoret. Comput. Sci., 410:4372–4381, 2009.

[23] M. Crochemore. Transducers and repetitions. Theoret. Comput. Sci., 45:63–86, 1986.

[24] M. Crochemore, C. Hancart, and T. Lecroq. Algorithms on strings. Cambridge University Press, 2007.

[25] J. Daciuk. Comparison of construction algorithms for minimal, acyclic, deterministic finite-state automata from sets of strings. In J.-M. Champarnaud and D. Maurel, editors, 7th Conference on Implementation and Application of Automata (CIAA), volume 2608 of Lecture Notes in Comput. Sci., pages 255–261. Springer-Verlag, 2002.

[26] J. Daciuk. Comments on "Incremental construction and maintenance of minimal finite-state automata" by Rafael C. Carrasco and Mikel L. Forcada. Comput. Linguist., 30(2):227–235, 2004.

[27] J. Daciuk, S. Mihov, B. W. Watson, and R. E. Watson. Incremental construction of minimal acyclic finite-state automata. Comput. Linguist., 26(1):3–16, April 2000.

[28] J. David. The average complexity of Moore's state minimization algorithm is o(n log log n). In P. Hliněný and A. Kučera, editors, Mathematical Foundations of Computer Science 2010, 35th International Symposium, MFCS 2010, volume 6281 of Lecture Notes in Comput. Sci., pages 318–329. Springer-Verlag, 2010.

[29] J. David. Génération aléatoire d'automates et analyse d'algorithmes de minimisation. Doctorat d'université, University Paris-Est, September 2010.

[30] F. Denis, A. Lemay, and A. Terlutte. Residual finite state automata. In STACS 2001, Proc. 18th Symp. Theoretical Aspects of Comp. Sci., volume 2010 of Lecture Notes in Comput. Sci., pages 144–157. Springer-Verlag, 2001.

[31] G. Gramlich and G. Schnitger. Minimizing NFA's and regular expressions. In STACS 2005, Proc. 22nd Symp. Theoretical Aspects of Comp. Sci., volume 3404 of Lecture Notes in Comput. Sci., pages 399–411. Springer-Verlag, 2005.

[32] D. Gries. Describing an algorithm by Hopcroft. Acta Informatica, 2:97–109, 1973.

[33] C. Hancart. On Simon's string searching algorithm. Inform. Process. Lett., 47(2):95–99, 1993.

[34] J. E. Hopcroft. An n log n algorithm for minimizing states in a finite automaton. In Z. Kohavi and A. Paz, editors, Theory of Machines and Computations, pages 189–196. Academic Press, 1971.

[35] J. E. Hopcroft and J. D. Ullman. Introduction to automata theory, languages, and computation. Addison-Wesley Publishing Co., Reading, Mass., 1979. Addison-Wesley Series in Computer Science.

[36] T. Jiang and B. Ravikumar. Minimal NFA problems are hard. SIAM J. Comput., 22(6):1117–1141, 1993.

[37] J. Jirásek, G. Jirásková, and A. Szabari. Deterministic blow-up of minimal nondeterministic finite automata over a fixed alphabet. In J. Karhumäki and A. Lepistö, editors, Developments in Language Theory, volume 4588 of Lecture Notes in Comput. Sci., pages 254–265. Springer-Verlag, 2007.

[38] G. Jirásková. Magic numbers and ternary alphabet. In V. Diekert and D. Nowotka, editors, Developments in Language Theory, volume 5583 of Lecture Notes in Comput. Sci., pages 300–311. Springer-Verlag, 2009.

[39] T. Kameda and P. Weiner. On the state minimization of nondeterministic finite automata. IEEE Trans. Comput., C-19(7):617–627, 1970.

[40] T. Knuutila. Re-describing an algorithm by Hopcroft. Theoret. Comput. Sci., 250:333–363, 2001.

[41] S. L. Krivol. Algorithms for minimization of finite acyclic automata and pattern matching in terms. Cybernetics, 27:324–331, 1991. Translated from Kibernetika, No. 3, May–June 1991, pp. 11–16.

[42] S. Lombardy and J. Sakarovitch. The universal automaton. In E. Grädel, J. Flum, and T. Wilke, editors, Logic and automata: History and perspectives, volume 2 of Texts in Logic and Games, pages 467–514. Amsterdam University Press, 2007.

[43] A. Malcher. Minimizing finite automata is computationally hard. Theoret. Comput. Sci., 327(3):375–390, 2004.

[44] E. F. Moore. Gedanken-experiments on sequential machines. In C. E. Shannon and J. McCarthy, editors, Automata Studies, pages 129–153. Princeton University Press, 1956.

[45] A. Păun, M. Păun, and A. Rodríguez-Patón. On Hopcroft's minimization technique for DFA and DFCA. Theoret. Comput. Sci., 410:2424–2430, 2009.

[46] D. Perrin and J.-É. Pin. Infinite words, automata, semigroups, logic and games. Elsevier, 2004.

[47] D. Revuz. Minimisation of acyclic deterministic automata in linear time. Theoret. Comput. Sci., 92:181–189, 1992.

[48] J. Sakarovitch. Elements of automata theory. Cambridge University Press, 2009.

[49] K. N. Sgarbas, N. D. Fakotakis, and G. K. Kokkinakis. Optimal insertion in deterministic DAWGs. Theoret. Comput. Sci., 301:103–117, 2003.

[50] A. Valmari and P. Lehtinen. Efficient minimization of DFAs with partial transition functions. In S. Albers and P. Weil, editors, STACS 2008, Proc. 25th Symp. Theoretical Aspects of Comp. Sci., volume 08001 of Dagstuhl Seminar Proceedings, pages 645–656. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2008.

[51] B. W. Watson. A new algorithm for the construction of minimal acyclic DFAs. Sci. Comput. Program., 48(2-3):81–97, 2003.


Abstract. This chapter is concerned with the design and analysis of algorithms for minimizing finite automata. Getting a minimal automaton is a fundamental issue in the use and implementation of finite automata tools in frameworks like text processing, image analysis, linguistic computer science, and many other applications.

Although minimization algorithms are rather old, there have recently been some new developments, which are explained or sketched in this chapter.

There are two main families of minimization algorithms. The first proceed by a sequence of refinements of a partition of the set of states, the second by a sequence of fusions or merges of states. Hopcroft's and Moore's algorithms belong to the first family. The linear-time minimization of acyclic automata of Revuz belongs to the second family.

We describe, prove and analyze Moore's and Hopcroft's algorithms. One of our studies is the comparison of these algorithms. It appears that they are quite different both in behavior and in complexity. In particular, we show that it is not possible to simulate the computations of one of the algorithms by the other. We also state the known results about the average complexity of both algorithms.

We describe the minimization algorithm by fusion for acyclic automata, and for a more general class of automata, namely those recognizing regular languages having a polynomially bounded number of words.

Finally, we briefly consider incremental algorithms for building minimal automata for finite sets of words. We consider the updating of a minimal automaton when a word is added or removed.
