Cerebellar Model Classifier for Data Mining With Linear Time Complexity


International Journal of Computational Intelligence and Applications, Vol. 6, No. 3 (2006) 299–313. © Imperial College Press.

    A CEREBELLAR MODEL CLASSIFIER FOR DATA MINING

    WITH LINEAR TIME COMPLEXITY

    DAVID CORNFORTH

    School of Information Technology and Electrical Engineering

University of New South Wales, Australian Defence Force Academy

    Northcott Drive, Canberra, ACT 2600, Australia

Received 14 October 2003; Revised 24 July 2006

    Accepted 7 August 2006

Techniques for automated classification need to be efficient when applied to large datasets. Machine learning techniques such as neural networks have been successfully applied to this class of problem, but training times can blow out as the size of the database increases. Some of the desirable features of classification algorithms for large databases are linear time complexity, training with only a single pass of the data, and accountability for class assignment decisions. A new training algorithm for classifiers based on the Cerebellar Model Articulation Controller (CMAC) possesses these features. An empirical investigation of this algorithm has found it to be superior to the traditional CMAC training algorithm, both in accuracy and time required to learn mappings between input vectors and class labels.

    Keywords: Cerebellar model articulation controller; classification; training.

    1. Introduction

    A well-studied class of machine learning problems is that of categorization, or clas-

    sification. Here, the key is to determine some relationship between a set of input

    vectors that represent stimuli, and a corresponding set of values on a nominal scale

    that represent category or class. The relationship is obtained by applying an algo-

rithm to training samples that are 2-tuples (u, c), each consisting of an input vector u

and a class label c. The learned relationship can then be applied to instances of u

not included in the training set, in order to discover the corresponding class label c.1

A number of machine learning techniques, including genetic algorithms2 and neural

networks,3 have been shown to be very effective in solving such problems.

There are many large databases in existence that could yield valuable informa-

tion if efficient and scalable methods of automated classification could be found.4


    Many algorithms for automated classification have an inherently non-linear rela-

    tionship between time taken by the algorithm to run and the number of train-

ing examples. Analysis methods that work well for small data sets are completely impractical when applied to larger data sets. For example, training of a neural

    network using back-propagation is known to be NP-complete.5 Some studies sug-

    gest that evolutionary algorithms have polynomial time complexity.6 The work

    presented here investigates classification algorithms based on the Cerebellar Model

    Articulation Controller (CMAC),7 which have linear time complexity.

    Global error minimization techniques, such as back-propagation, require mul-

    tiple traversals of the data set during training. If the training set is very large,

    it cannot fit inside the memory of the machine. This will result in multiple disk

    read/write operations, which are relatively costly in time and can contribute greatly

    to data processing time. Current approaches include compression or summary of

    the data set before processing, and redesign of analysis tools so that analysis can be

    completed with only one pass of the data. This paper shows how the original CMAC

    training algorithm, which normally uses an iterative global error minimization tech-

    nique, may be adapted so that the training set only needs to be accessed once.

    The usefulness of a classification algorithm may be enhanced by providing an

    explanation for each class assignment decision. This could take the form of a set of

    rules that contribute to the assignment, or a probability for each class, given the

    input. Black box methods such as neural networks do not naturally lend themselves

    to this form of analysis. The new algorithm described here provides accountability

    for class assignment decisions in the form of class probabilities.

    In this paper, I propose the Kernel Addition Training Algorithm (KATA) as

    a more effective learning algorithm for the CMAC when used as a classifier. The

    proposed method requires only a single pass of the data and provides a probability

    model for class assignment decisions.

    The organization of the remainder of this paper is as follows. Section 2 briefly

reviews the architecture of the CMAC, and introduces the proposed modifications. Section 3 provides an empirical investigation of the new fast learning algorithm and

    the traditional error minimization methods. Section 4 provides a discussion of the

    results and implications arising from them.

    2. Cerebellar Model Articulation Controller

    The CMAC, or Albus perceptron, is a sparse coarse-coded associative memory

    algorithm that mimics the functionality of the mammalian cerebellum.8 Originally,

the CMAC was proposed as a function modeler for robotic controllers,7 but has

been extensively used in reinforcement learning9,10 and also as a classifier.11–14

The training method used by Albus is an iterative algorithm based on global error minimization, and


    requires multiple passes of the training data. The method proposed in this paper

    requires only a single pass of the data. Furthermore, it provides a probability model

for class assignment decisions.

The CMAC is able to accept real-valued inputs. An input vector u with d

    components may be visualized as a point in d-dimensional space. The input space

    is quantized using a set of q overlapping tiles as shown in Fig. 1(a), where q = 2.

    For input spaces of high dimensionality, the tiles form hyper-rectangular regions.



    A query is performed by first activating all the tiles that contain a query point.

    The activated tiles address memory cells, which contain stored values. These are

the weights of the system, as shown in Fig. 1(b). The summing of these values produces an overall output. The CMAC output is therefore stored in a distributed

    fashion, such that the output corresponding to any point in input space is derived

    from the value stored in a number of memory cells.

    A change of the input vector results in a change in the set of activated tiles, and

    therefore a change in the set of memory cells participating in the CMAC output.

The memory size required by the CMAC depends on the number of tilings

    and the size of tiles. If the tiles are large, such that each tile covers a large proportion

    of the input space, a coarse division of input space is achieved, but local phenomena

    have a wide area of influence. If the tiles are small, a fine division of input space is

    achieved and local phenomena have a small area of influence. The number of tiles

    in the input space, and therefore the number of memory cells, is usually sufficiently

    large to become prohibitive due to memory constraints. Many of these tiles are

    never used due to the sparse coverage of the input space. One solution is to employ

    a consistent random hash function to collapse the large tiling space into a smaller

memory cell space.16 This reduces the memory use, but the memory requirements

for a classifier remain relatively large. An alternative and more comprehensive

    solution is the hierarchical CMAC.17 Here, several low-dimensional CMACs are

    connected to form a multi-layer tree structure. Training is accomplished by min-

    imizing the output error, and back-propagating errors to hidden layers. The tree

    structure can also be pruned to reduce redundant nodes.18 This method cannot be

    employed here because it is not compatible with the training rule presented.
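As an illustration of the hashing approach, the sketch below (Python) collapses the large space of tile addresses into a fixed-size weight table with a consistent hash; the table size and the CRC32 hash are illustrative assumptions, not the configuration used in the paper.

    import zlib

    MEMORY_CELLS = 2 ** 16  # illustrative table size, not the paper's setting

    def hash_tile(layer, tile_coords):
        """Map a (tiling layer, tile coordinates) pair to a memory cell index.

        A consistent hash guarantees that the same tile always addresses the
        same cell, so colliding tiles simply share a weight.
        """
        key = (str(layer) + ":" + ",".join(map(str, tile_coords))).encode()
        return zlib.crc32(key) % MEMORY_CELLS

    # example: the active tile at coordinates (3, 7) in tiling layer 1
    print(hash_tile(1, (3, 7)))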

The CMAC learns a mapping from input space U ⊆ R^d to output space Z ⊆ R,

    where d is the number of dimensions, or the size of the input vector. Following

    existing convention, this can be broken into three mappings12:

The input space to multi-layer tiling system mapping E : u → x.

The multi-layer tiling system to memory table mapping H : x → y.

The memory table to output mapping (weighted summation) W : y → z.

    The mapping E can be implemented using simple integer division in each dimen-

    sion. The integer values for each dimension are combined to form one address for

    each tiling layer. Addresses for the other tiling layers are calculated in a similar

    way. The mapping H receives q addresses that must be mapped to memory cells.

    This mapping is usually implemented by a hashing function. The mapping W is a

    weighted summation of the contents of the memory cells. These values are set dur-

ing training. An improvement over the Albus CMAC is the widely adopted practice of embedding kernel functions into the quantizing regions.19–21 This modifies the output so that each stored weight is scaled by its kernel value and the sum is normalized by the total kernel activation:

z = \frac{\sum_{i=1}^{q} k(dist_i) \, w[addr_i]}{\sum_{i=1}^{q} k(dist_i)}    (1)
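To make the three mappings concrete, the sketch below is one possible realization of a query: integer division per tiling layer (mapping E), a sparse dictionary standing in for the hashed memory table (mapping H), and the kernel-weighted, normalized sum of Eq. (1) (mapping W). The number of layers, tile width, layer offsets and linear kernel are illustrative assumptions rather than the settings used in the experiments.

    from collections import defaultdict

    Q = 4          # number of tiling layers (illustrative)
    RES = 0.25     # tile width in each dimension (illustrative)

    weights = defaultdict(float)   # sparse memory table, keyed by (layer, tile coordinates)

    def tile(u, layer):
        """Mapping E: quantize the input for one tiling layer by integer division;
        each layer is offset by a fraction of the tile width so that layers overlap."""
        offset = layer * RES / Q
        return tuple(int((x + offset) // RES) for x in u)

    def kernel(dist):
        """Linear kernel value for a normalized distance in [0, 1]."""
        return max(0.0, 1.0 - dist)

    def distance(u, layer, coords):
        """Normalized distance of the query point from the centre of the active tile."""
        offset = layer * RES / Q
        centre = [(c + 0.5) * RES - offset for c in coords]
        return max(abs(x - c) / (RES / 2) for x, c in zip(u, centre))

    def query(u):
        """Mapping W: kernel-weighted, normalized sum over the q active cells (Eq. (1))."""
        num = den = 0.0
        for layer in range(Q):
            coords = tile(u, layer)
            k = kernel(distance(u, layer, coords))
            num += k * weights[(layer, coords)]
            den += k
        return num / den if den > 0.0 else 0.0

    print(query([0.3, 0.7]))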


Fig. 2. How kernel functions may be embedded into a 2-dimensional tiling grid. (a) Step kernel function. (b) Linear kernel function.

    Each weight y is indexed by address a, and the kernel function k is applied to

    some distance measure of the query point from the centre of the tile. The number

    of tiling layers is q. Some common kernel functions are illustrated in Fig. 2.
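A minimal sketch of the two kernels of Fig. 2, assuming the distance has already been normalized so that the tile edge corresponds to 1:

    def step_kernel(dist):
        """Step kernel: every point inside the tile contributes equally."""
        return 1.0 if dist <= 1.0 else 0.0

    def linear_kernel(dist):
        """Linear kernel: the contribution decays linearly towards the tile edge."""
        return max(0.0, 1.0 - dist)

    for d in (0.0, 0.5, 1.0):
        print(d, step_kernel(d), linear_kernel(d))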

    2.1. Output mapping

    The CMAC may be used as a classifier by adopting a suitable mapping between the

    real valued output variable z and the nominal variable class label c. One possible

    mapping8 interprets positive values of z as one class, and negative values of z as

another class. This is sufficient for two-class problems, and is the most often cited in

the literature.12,13,22,23 For problems with more than two classes, one could define threshold values so as to divide the scalar range of z into the number of classes

    to be represented:

c = v : t_v^{low} < z < t_v^{high}    (2)

where threshold t_v^{low} > t_{v-1}^{high}. Equation (2) represents a scalar mapping. Using this

    mapping, the CMAC can be used as a classifier if, during the training phase, weights

    are adjusted to make the output z approach a suitable target value. For example,

    the target for a given class could be a value equidistant from the thresholds corre-

    sponding to that class.
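As a sketch of the scalar mapping of Eq. (2), the thresholds below divide an assumed output range evenly among the classes (contiguous thresholds, for simplicity), and the training target for each class is the midpoint of its interval; the range and class count are illustrative.

    M = 3                         # number of classes (illustrative)
    Z_MIN, Z_MAX = 0.0, 1.0       # assumed range of the output z

    width = (Z_MAX - Z_MIN) / M
    thresholds = [Z_MIN + v * width for v in range(M + 1)]    # class v occupies (t[v], t[v+1]]
    targets = [Z_MIN + (v + 0.5) * width for v in range(M)]   # midpoints, equidistant from the thresholds

    def scalar_class(z):
        """Assign the class v whose interval contains z (Eq. (2))."""
        for v in range(M):
            if thresholds[v] < z <= thresholds[v + 1]:
                return v
        return 0 if z <= Z_MIN else M - 1    # clamp outputs that fall outside the range

    print(scalar_class(0.55), targets)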

    2.2. Albus training algorithm

    The Albus CMAC is trained by evaluating the error as the difference between

    desired output zd and actual output z, and updating the active weights at each

    time step t:

w_i(t+1) = w_i(t) + \beta \, (z_d - z) \, \frac{k(d_i)}{\sum_{j=1}^{q} k(d_j)}    (3)

where \beta is the gain term.
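A self-contained sketch of one Albus update per Eq. (3); the step kernel, tiling parameters and fixed gain are illustrative assumptions, and the full algorithm repeats this update over the whole training set for several epochs.

    from collections import defaultdict

    Q, RES = 4, 0.25                 # tiling layers and tile width (illustrative)
    BETA = 1.0                       # gain term; the paper reduces it during training
    weights = defaultdict(float)     # sparse memory table

    def active_cells(u):
        """Return (cell key, kernel value) for each tiling layer; step kernel assumed."""
        cells = []
        for layer in range(Q):
            offset = layer * RES / Q
            coords = tuple(int((x + offset) // RES) for x in u)
            cells.append(((layer, coords), 1.0))   # step kernel: k = 1 inside the tile
        return cells

    def output(u):
        cells = active_cells(u)
        return sum(k * weights[key] for key, k in cells) / sum(k for _, k in cells)

    def albus_update(u, z_desired):
        """One Albus update (Eq. (3)): distribute the output error over the active cells."""
        cells = active_cells(u)
        k_sum = sum(k for _, k in cells)
        error = z_desired - output(u)
        for key, k in cells:
            weights[key] += BETA * error * k / k_sum

    # one pass over a toy training set; the real algorithm iterates until convergence
    for u, z_d in [([0.1, 0.2], 1.0), ([0.8, 0.9], -1.0)]:
        albus_update(u, z_d)
    print(output([0.1, 0.2]))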


    2.3. Kernel addition training algorithm

The scalar mapping above is not ideal, as it represents a nominal variable using a continuous scale, and there is no information about the degree of membership of a

    class. Consider an alternative output mapping, using a CMAC for each class:

c = v : z_v = \max(z_1, z_2, \ldots, z_m)    (4)

    where m is the number of classes. Equation (4) represents a vector mapping. This

    may be used to assess the decision of the classifier to assign any particular input to a

    class. For example, it is possible to discover if two classes have high activation, or if

    one class is the clear winner. A desirable property is that the output activations are

proportional to the probability of the class, given the input, so that z_v represents a relative probability of selecting class c. Then, it is possible to take account of

a priori probability using Bayes' law:

P(c_i \mid x) = \frac{P(c_i) \, P(x \mid c_i)}{\sum_i P(c_i) \, P(x \mid c_i)}    (5)

    where P represents probability.3 The frequency of samples occurring in each class

may be used to estimate P(c_i). The goal of training then is to provide an output z_i that can be used to estimate P(x|c_i). There is no need to calculate the denominator,

    as assignment to the highest probability class requires only comparison.

    The new training algorithm, the KATA,25 uses a vector class mapping. As each

    training vector is presented, a kernel function value for each activated tile is added to

    the value of the corresponding memory cell. Assuming n training points distributed

    uniformly over a tile, the expected value of the corresponding cell after training will

be n·k_e, where k_e is the expected value of the kernel function. If the kernel function

    is the step function, the value of each memory cell after training is a count of

    the number of times the corresponding tile was accessed during training. If the

    kernel function is not the step function, then training amounts to estimation of a

histogram, using as a weight some function of the distance of the input from the centre of the histogram bin. From the well-known properties of histograms, one

    concludes that:

    The value of any tile after training is proportional to the probability of inputs

    activating that tile.

    A histogram improves its estimate of the underlying distribution as the number

    of training samples increases, so the algorithm will converge.

    It is only necessary to present the training data once.

    There is no value in repeated presentation of the same training data.

    After training, the output z for each class will be proportional to the numerator

of Eq. (5). This may be observed by considering a classification problem where, if the number of training samples in one class is doubled, the


    output for this class is also doubled. Assume that the number of samples in a class

is a good estimator of P(c_i). Then the CMAC output after training is proportional

to P(c_i) P(x|c_i).

After training, each CMAC forms a piecewise model of the probability density

    function for the corresponding class. There is no need to normalize the output as

    in Eq. (1), so the output is given by:

z = \sum_{i=1}^{q} k(dist_i) \, w[addr_i]    (6)

    The KATA CMAC is trained using the value of the kernel:

w_i(t+1) = w_i(t) + k(dist_i)    (7)

    In contrast to the Albus training algorithm, the KATA is not an iterative algo-

    rithm. The weights are updated during a single presentation of the training data

    at the inputs. From this, it follows that the KATA is not sensitive to the order in

    which input samples are presented. Also, the KATA is robust to outliers, as outliers

    occur with low frequency, and so will have minimal effect on the CMAC output.
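The following sketch puts the pieces together: one sparse weight table per class, a single training pass that adds the kernel value to each active cell (Eq. (7)), and classification by the largest per-class output (Eqs. (4) and (6)). Tiling parameters and the step kernel are illustrative assumptions.

    from collections import defaultdict

    Q, RES = 4, 0.25                       # tiling layers and tile width (illustrative)

    def active_cells(u):
        """One active tile per tiling layer, with its kernel value (step kernel assumed)."""
        cells = []
        for layer in range(Q):
            offset = layer * RES / Q
            coords = tuple(int((x + offset) // RES) for x in u)
            cells.append(((layer, coords), 1.0))
        return cells

    class KataCmacClassifier:
        def __init__(self, n_classes):
            # one sparse weight table per class
            self.tables = [defaultdict(float) for _ in range(n_classes)]

        def train(self, samples):
            """Single pass over (u, c) pairs: add the kernel value to each active cell (Eq. (7))."""
            for u, c in samples:
                for key, k in active_cells(u):
                    self.tables[c][key] += k

        def outputs(self, u):
            """Per-class outputs z_v (Eq. (6)); each is proportional to P(c_v) P(u | c_v)."""
            return [sum(k * table[key] for key, k in active_cells(u))
                    for table in self.tables]

        def predict(self, u):
            """Vector mapping (Eq. (4)): assign the class with the largest output."""
            z = self.outputs(u)
            return max(range(len(z)), key=z.__getitem__)

    clf = KataCmacClassifier(n_classes=2)
    clf.train([([0.1, 0.2], 0), ([0.15, 0.25], 0), ([0.8, 0.9], 1)])
    print(clf.predict([0.12, 0.22]), clf.outputs([0.12, 0.22]))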

    3. Experiments and Results

    Comparing Eq. (3) with Eq. (7), it can be seen that the KATA can be completed in

    less time than one iteration of the Albus algorithm. The speed advantage will not

    be as great as suggested simply by comparing these equations due to the different

    software overheads. However, one would expect that the KATA would be faster than

    the Albus training algorithm. This conclusion was tested using computer models

    of the two algorithms for comparison purposes. The experiments were designed

    to demonstrate the linear relationship between number of training samples and

    training time.

    3.1. Artificial test problem

    The two CMAC learning algorithms were tested using the parity problem. This

    problem was chosen because of its low spatial frequency, ensuring that there will be

    enough samples to discriminate classes in tests with a high number of dimensions

    or a high number of classes. In this problem, the input space is partitioned into m

    regions in each dimension, where m is the number of classes and d the number of

dimensions. Given an input vector x = {x_1, ..., x_d}, 0 < x_i < r, the class label

is given by:

c = \left( \sum_{i=1}^{d} \lfloor m x_i / r \rfloor \right) \bmod m    (8)
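A sketch of generating such a data set, assuming the label is the sum of the per-dimension region indices taken modulo m as in Eq. (8); the range r = 1 and the use of NumPy are illustrative.

    import numpy as np

    def parity_dataset(n_samples, d, m, r=1.0, seed=0):
        """Generate the parity problem: uniform inputs in (0, r)^d, class label per Eq. (8)."""
        rng = np.random.default_rng(seed)
        x = rng.uniform(0.0, r, size=(n_samples, d))
        # region index of each coordinate (0 .. m-1), summed over dimensions modulo m
        c = np.floor(m * x / r).astype(int).sum(axis=1) % m
        return x, c

    x, c = parity_dataset(n_samples=1_000_000, d=3, m=2)
    print(x.shape, np.bincount(c))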


Fig. 3. The parity problem for a three-dimensional input space. White represents class 0, black represents class 1.

    Data sets were generated using randomly generated x values, and assigning a

    class label to each record according to Eq. (8). Seven data sets were generated,

    containing from 2 to 5 dimensions and from 2 to 5 classes. Each database consisted

    of 1 million samples, with input variables drawn from a uniform distribution. The

    classifiers were tested using different numbers of samples.

    3.2. Natural test problem

    The two CMAC algorithms were tested using a natural data set, derived from the

    1998 DARPA Intrusion Detection Evaluation Program.26 The dataset was originally

    collected to establish the efficacy of intrusion detection and includes a variety of

    simulated intrusions of a military computer network. A version of this was used for

    the KDDcup99 contest. There are 24 classes representing different types of attack

    and 41 measurements, or features used as inputs to the classifier. Some of these are

    discrete and some are numeric. The datasets used in these tests contained 494,020

    records.

The dataset was adapted for testing the time complexity of the CMAC algorithms as follows. Features with a small number of integer values were removed,

    as the CMAC uses continuous inputs only. Also, features that are zero most of

    the time were removed. Thus, 12 features are left. The algorithms were tested on

different numbers of records by extracting, at random, subsets containing different

numbers of samples.
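A sketch of this preprocessing, assuming the KDDcup99 CSV with 41 feature columns followed by a label column; the cut-offs used to drop low-cardinality integer features and mostly-zero features are illustrative assumptions, since the paper does not state them.

    import pandas as pd

    def prepare_kdd(path, n_records, seed=0):
        """Keep continuous, informative features and draw a random subset of records."""
        df = pd.read_csv(path, header=None)
        label = df.iloc[:, -1]              # attack type (24 classes in the full set)
        features = df.iloc[:, :-1]

        keep = []
        for col in features.columns:
            col_data = features[col]
            if not pd.api.types.is_numeric_dtype(col_data):
                continue                     # drop symbolic features
            if col_data.nunique() <= 10 and (col_data == col_data.astype(int)).all():
                continue                     # drop features with a small number of integer values
            if (col_data == 0).mean() > 0.9:
                continue                     # drop features that are zero most of the time
            keep.append(col)

        sample = df.sample(n=n_records, random_state=seed).index
        return features.loc[sample, keep], label.loc[sample]

    # X, y = prepare_kdd("kddcup.data_10_percent", n_records=100_000)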

    3.3. Test methodology

Both versions of the CMAC used the same parameters. Input space was uniformly quantized in all dimensions. Tile spacing was based on the work of Parks and

Militzer.27 A hashing function with sufficient memory was used to avoid hashing collisions in both implementations, and


    thereby enable meaningful comparisons of running time. The Albus CMAC used a

    scalar output mapping, while the KATA CMAC used a vector output mapping.

The gain term \beta for the Albus training algorithm was set to 1.0 at the start of training, and reduced during training, as this guarantees quick convergence.28,29

This was implemented by setting \beta to the value of the normalized training error.

The number of epochs used must be sufficient to allow convergence, but not so

many as to cause overfitting. After each epoch, the accuracy was compared to

    that from the previous epoch. If the accuracy had increased by less than 0.1%, then

    training would be terminated.
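A sketch of this schedule and stopping rule; train_epoch and accuracy are placeholders for one pass of Eq. (3) updates and for evaluation of percentage accuracy, and the toy trajectory at the end is purely illustrative.

    def fit_albus(train_epoch, accuracy, max_epochs=100):
        """Albus training epochs with an adaptive gain and an early-stopping rule.

        train_epoch(gain) performs one pass of updates and returns the normalized
        training error; accuracy() returns percent correct.
        """
        gain = 1.0                          # gain set to 1.0 at the start of training
        previous = current = accuracy()
        for _ in range(max_epochs):
            error = train_epoch(gain)
            gain = error                    # gain follows the normalized training error
            current = accuracy()
            if current - previous < 0.1:    # improvement below 0.1%: stop training
                break
            previous = current
        return current

    # toy demonstration with a fabricated error/accuracy trajectory
    errors = iter([0.4, 0.2, 0.1, 0.05])
    accs = iter([50.0, 70.0, 80.0, 80.05, 80.06])
    print(fit_albus(lambda g: next(errors), lambda: next(accs)))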

    The performance of the two algorithms was tested using three-fold cross valida-

    tion, so that accuracy was always tested only on unseen data. The data sets used

    were each divided into three parts at random. Training was performed using two

    parts of the data, and the trained model was tested on the remaining one part.

    This was done three times using a different part for testing. In this manner, the

    model was tested on all data, and reported a number of correctly classified samples,

    which was divided by the size of the data set to obtain percentage accuracy. The

    choice of the fraction one third is a compromise between using all data to train,

which may result in overfitting the model, and using less data to be computation-

    ally efficient.30 For each test, the time taken to train and the resulting accuracy

were measured.
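A sketch of the three-fold protocol, assuming a classifier object with train and predict methods such as the KATA sketch above; the random split and percentage calculation follow the description in the text.

    import random

    def three_fold_accuracy(samples, make_classifier, seed=0):
        """Three-fold cross-validation: train on two parts, test on the remaining part."""
        shuffled = samples[:]
        random.Random(seed).shuffle(shuffled)
        folds = [shuffled[i::3] for i in range(3)]

        correct = 0
        for i in range(3):
            test = folds[i]
            train = [s for j in range(3) if j != i for s in folds[j]]
            clf = make_classifier()
            clf.train(train)
            correct += sum(1 for u, c in test if clf.predict(u) == c)
        return 100.0 * correct / len(samples)

    # e.g. combining the earlier sketches:
    # acc = three_fold_accuracy(list(zip(x.tolist(), c.tolist())), lambda: KataCmacClassifier(2))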

    3.4. Performance comparison

    In order to put these results in context, some other classifier algorithms were com-

    pared to CMAC. For this purpose, the Weka toolbox was used.31 As this toolbox

consists of programs written in Java using a common framework, it is possible

to make direct comparisons of running time.

    CMAC algorithm was also coded in Java using the same libraries in order to pro-

vide the most realistic comparison. Initially, 12 algorithms were considered from the wide range provided by Weka. Some of these were discarded during tests because

    their long running time did not provide a fair comparison with CMAC. Of the

    available algorithms, the three fastest were selected: functions.RBFNetwork (place-

    ment of Gaussian kernels using clustering), functions.SMO(the Sequential Minimal

    Optimization version of Support Vector Classifiers), and trees.J48 (an implemen-

    tation of the C4.5 decision tree algorithm). These three, as well as KATA CMAC,

    were tested on the same data sets from the Parity problem described earlier, using

    10-fold cross validation.
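The published comparison used the Weka implementations named above; purely to illustrate the protocol, the sketch below times two broadly analogous scikit-learn classifiers (SVC as an SMO-style support vector classifier, and a CART decision tree standing in for C4.5/J48) under 10-fold cross-validation on a toy parity data set. These are substitutes, not the classes used in the paper.

    import numpy as np
    from sklearn.model_selection import cross_validate
    from sklearn.svm import SVC
    from sklearn.tree import DecisionTreeClassifier

    def time_classifiers(X, y):
        """Report fit time, score time and accuracy under 10-fold cross-validation."""
        for name, clf in [("SVC (SMO-style)", SVC()),
                          ("CART tree (J48 analogue)", DecisionTreeClassifier())]:
            res = cross_validate(clf, X, y, cv=10, scoring="accuracy")
            print(name,
                  "fit %.1fs" % res["fit_time"].sum(),
                  "score %.1fs" % res["score_time"].sum(),
                  "acc %.3f" % res["test_score"].mean())

    # small demonstration on a toy parity data set (2 dimensions, 5 classes)
    rng = np.random.default_rng(0)
    X = rng.uniform(0, 1, size=(2000, 2))
    y = np.floor(5 * X).astype(int).sum(axis=1) % 5
    time_classifiers(X, y)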

    3.5. Results







Fig. 6. Comparisons for the Parity problem using 2 dimensions and 5 classes. (a) showing training time in seconds against number of samples (in thousands), (b) showing accuracy against number of samples (in thousands).

    the CMAC algorithm. Figure 6 shows the results for the Intrusion Detection prob-

    lem. This supports the belief that CMAC classifiers have a linear time complexity

    for training. In all the tests, the KATA algorithm was about 2.5 to 3 times as fast

    as Albus. Note that in all these tests, the Albus training algorithm used a variable

    number of training epochs, which explains in part the occasional outliers. So the

    relative speed advantage of the KATA depends on the number of iterations of the

    Albus algorithm. The accuracy obtained by training with the KATA is consistently

    superior to that obtained using the Albus technique. There are two possible expla-

    nations for this. First, when the problem becomes more difficult, using more classes

    or dimensions, the performance of the classifier is bound to deteriorate, because the

number of samples available for each homogeneous block of the input space decreases.

    Since the Albus technique uses an error minimization, this is an inherently biased

    model, whereas the KATA uses an unbiased model of input space. Therefore, the

accuracy of the classifier trained using the KATA degrades more slowly. Second, the Albus method suffers from the difficulty of correctly setting the gain parameter,

    which is not necessary for the KATA.

    Figure 7 shows results for the comparison between four classifier methods. Here,

    all the methods chosen show evidence of linear time complexity. The algorithm

    taking the longest time to build the model was J48, taking up to 10,000 seconds to

    train on datasets near one million samples. The next slowest was SMO, taking up to

    4000 seconds to train. CMAC was the algorithm with the fastest training time, and

    RBF was very close. The slowest classifier for testing was RBF, taking 70 seconds

to classify. The next slowest was CMAC, taking up to 20 seconds to classify. The other two methods, SMO and J48, were much quicker, classifying unknown cases

in less than 10 seconds.



Fig. 7. Comparisons for the Parity problem using 2 dimensions and 5 classes. (a) training time in seconds against number of samples (in thousands), (b) testing time in seconds against number of samples (in thousands), (c) accuracy in percent correct against number of samples (in thousands), (d) legend.

    SMO, and the close superposition of the boundaries in J48. The accuracy of the

RBF classifier is similar to that of CMAC. This is expected, since the similarity of CMAC to RBF is well known.

    It is clear that CMAC compares well with the other methods examined. It should

    be noted that only the fastest classifier methods were examined, so it was possible

    that CMAC would outperform other classifier methods on speed and accuracy.

    4. Conclusions

    There are three main results from this work. First, the training of CMAC-based

classifiers has linear time complexity. This is a highly desirable property of machine learning techniques, as it makes the processing of large databases more computa-

tionally feasible. Second, the new training algorithm allows the CMAC to be trained with only a single pass of the data. Third, the vector output mapping provides


    accountability for class assignment decisions and allows a priori probability to be

    accounted for. This has potential in applications that require estimation of the risk

of incorrect classification.

These three main results are supported by empirical evidence presented here.

    Other results may be inferred from the nature of the algorithm, namely, that the

    KATA is not sensitive to the order of training data, and is robust to outliers.

    Comparative trials suggest that different classifiers have advantages in different

    areas, but CMAC with KATA has the characteristic of fast training. This new

    training algorithm has great potential for application in data mining and automated

    knowledge discovery.

    Acknowledgments

    The author wishes to thank the New South Wales Centre for Parallel Computing

    (NSWCPC) for the use of their SGI Power Challenge machine upon which the

    calculations for this paper were performed. Part of this work was supported by a

    Faculty Seed Grant from Charles Sturt University, and part was supported by a

Rector's Start-up Grant from the University of New South Wales.

    References

1. T. G. Dietterich and G. Bakiri, Solving multiclass learning problems via error-correcting output codes, J. Artif. Intell. Res. 2 (1995) 263–286.

2. J. Holland, Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence (MIT Press, 1992).

3. R. O. Duda and P. E. Hart, Pattern Classification and Scene Analysis (John Wiley and Sons, New York, 1973).

4. J. Han and M. Kamber, Data Mining: Concepts and Techniques (Morgan Kaufman, 2001).

5. A. Roy, S. Govil and R. Miranda, A neural-network learning theory and a polynomial time RBF algorithm, IEEE Trans. Neural Networks 8(6) (1997) 1301–1313.

6. J. He and X. Yao, Drift analysis and average time complexity of evolutionary algorithms, Artif. Intell. 127(1) (2001) 57–85.

7. J. S. Albus, A new approach to manipulator control: The Cerebellar Model Articulation Controller (CMAC), J. Dynam. Syst. Measurement Contr. 97 (1975) 220–233.

8. J. S. Albus, Mechanisms of planning and problem solving in the brain, Math. Biosci. 45 (1979) 247–293.

9. J. C. Santamaria, R. S. Sutton and A. Ram, Experiments with reinforcement learning in problems with continuous state and action spaces, Technical Report UM-CS-1996-088, Department of Computer Science, University of Massachusetts, Amherst, MA (1996).

10. M. Wiering, R. Salustowicz and J. Schmidhuber, Reinforcement learning soccer teams


12. Z. J. Geng and W. Shen, Fingerprint classification using fuzzy cerebellar model arithmetic computer neural networks, J. Electron. Imag. 6(3) (1997) 311–318.

13. H. Fashandi and M. Moin, Face detection using CMAC neural network, in Proc. 7th Int. Conf. Artif. Intell. Soft Comput. ICAISC, eds. L. Rutkowski, J. Siekmann, R. Tadeusiewicz and L. Zadeh, Lecture Notes in Computer Science, Vol. 3070 (Springer, 2004) 724–729.

14. W. Xu, S. Xia and H. Xie, Application of CMAC-based networks on medical image classification, in Proc. Int. Symp. Neural Networks, eds. F. Yin, J. Wang and C. Guo, Lecture Notes in Computer Science, Vol. 3173 (Springer, 2004) 953–958.

15. D. Cornforth, Classifiers for machine intelligence, PhD thesis, Nottingham University, UK (1994).

16. T. H. Cormen, C. E. Leiserson and R. L. Rivest, Introduction to Algorithms (McGraw-Hill, 1986).

17. H. Lee, C. Chen and Y. Lu, A self-organizing HCMAC neural-network classifier, IEEE Trans. Neural Networks 14(1) (2003) 15–27.

18. C. Chen, C. Hong and Y. Lu, A pruning structure of self-organising HCMAC neural network classifier, in Proc. 2004 IEEE Int. Joint Conf. Neural Networks 2 (2004) 861–866.

19. P. C. E. An, W. T. Miller and P. C. Parks, Design improvements in associative memories for cerebellar model articulation controllers, Proc. ICANN (1991), pp. 1207–1210.

20. S. H. Lane, D. A. Handelman and J. J. Gelfand, Theory and development of higher-order CMAC neural networks, IEEE Cont. Syst. (1992) 23–30.

21. F. J. Gonzalez-Serrano, A. R. Figueiras-Vidal and A. Artes-Rodriguez, Generalizing CMAC architecture and training, IEEE Trans. Neural Networks 9(6) (1998) 1509–1514.

22. H. Xu, C. Kwan, L. Haynes and J. Pryor, Real-time adaptive on-line traffic incident detection, Proc. IEEE Int. Symp. Intell. Contr. (1996), pp. 200–205.

23. J. Geng and T. Lee, Freeway traffic incident detection using fuzzy CMAC neural networks, Proc. IEEE World Congress Comput. Intell. 2 (1998) 1164–1169.

24. Y. Wong, CMAC learning is governed by a single parameter, in Proc. IEEE Int. Conf. Neural Networks, San Francisco (1993), pp. 1439–1443.

25. D. Cornforth and D. Newth, The kernel addition training algorithm: Faster training for CMAC based neural networks, in Proc. Conf. Artif. Neural Networks Expert Syst. (University of Otago, 2001), pp. 34–39.

26. S. Hettich and S. D. Bay, The UCI KDD Archive, University of California, Department of Information and Computer Science, Irvine, CA (1999), http://kdd.ics.uci.edu.

27. P. C. Parks and J. Militzer, Improved allocation of weights for associative memory storage in learning control systems, in Proc. IFAC Design Meth. Contr. Syst., Zurich, Switzerland (1991), pp. 507–512.

28. C. Lin and C. Chiang, Learning convergence of CMAC technique, IEEE Trans. Neural Networks 8(6) (1997) 1282–1292.

29. S. Yao and B. Zhang, The learning convergence of CMAC in cyclic learning, Proc. Int. Joint Conf. Neural Networks 3 (1993) 2583–2586.

    30. S. Weiss and C. A. Kulikowski (eds.), Computer Systems That Learn: Classification

and Prediction Methods From Statistics, Neural Nets, Machine Learning, and Expert Systems (Morgan Kaufman, San Mateo, CA, 1991).

31. I. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques.
