
Optimum design for artificial neural networks: an example in a bicycle derailleur system

T.Y. Lin, C.H. Tseng*

Department of Mechanical Engineering, National Chiao Tung University, Hsinchu 30050, Taiwan

Received 1 June 1998; accepted 1 September 1999

* Corresponding author. Tel.: +886-35-712-121; fax: +886-35-720-634. E-mail address: chtseng@cc.nctu.edu.tw (C.H. Tseng).

    Abstract

The integration of neural networks and optimization provides a tool for designing network parameters and improving network performance. In this paper, the Taguchi method and the Design of Experiments (DOE) methodology are used to optimize network parameters. Users have to recognize the application problem and choose a suitable Artificial Neural Network model; optimization problems can then be defined according to the model. The Taguchi method is first applied to a problem to find the more important factors, and the DOE methodology is then used for further analysis and forecasting. A Learning Vector Quantization example is shown for an application to bicycle derailleur systems. © 2000 Elsevier Science Ltd. All rights reserved.

    Keywords: Neural networks; Optimization; Taguchi method; Design of experiments; Bicycle derailleur systems

    1. Introduction

Artificial Neural Networks (ANNs) are currently receiving much attention because of their wide applicability in research, medicine, business, and engineering. ANNs provide better and more reasonable solutions for many problems, including some that cannot be solved by conventional technologies. Especially in engineering applications, ANNs offer improved performance in areas such as pattern recognition, signal processing, control, and forecasting.

In the past few years, many ANN models with different strengths have been introduced for various applications. According to the different ANN models used, many training algorithms have been developed to improve the accuracy and convergence of the models. Although much research is concentrated in these two fields, there is still a conventional problem in ANN design: users have to choose the architecture and determine many of the parameters in a selected network. For instance, in a "Multilayer Feedforward (MLFF) Neural Network", the architecture, such as the number of layers and the number of units in each layer, has to be determined. If a "Backpropagation with Momentum" training algorithm is selected, many parameters, such as the learning rate, momentum term, and weight initialization range, have to be selected. It is not easy to choose a suitable network, even for an experienced designer. The "trial-and-error" technique is the usual way to find a better combination of network architecture and parameters.
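As a minimal sketch of this trial-and-error search in Python (the train_and_evaluate routine below is a hypothetical stand-in for one full training run, not a function from the paper):

    import itertools

    # Hypothetical stand-in for one full training run: returns a
    # validation error for a single combination of settings. A real
    # version would build, train and score an MLFF network.
    def train_and_evaluate(learning_rate, momentum, hidden_units):
        return abs(learning_rate - 0.1) + abs(momentum - 0.9) + hidden_units / 100.0

    # Trial-and-error: exhaustively score a hand-picked grid of settings.
    grid = itertools.product(
        [0.01, 0.1, 0.5],   # learning rate (continuous, sampled)
        [0.0, 0.5, 0.9],    # momentum term
        [4, 8, 16],         # units in the single hidden layer
    )
    best = min(grid, key=lambda combo: train_and_evaluate(*combo))
    print("best (learning rate, momentum, units):", best)

Even this small grid already costs 27 full training runs, which is why the search quickly becomes expensive.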

Therefore, there must be an easier and more efficient way to overcome this disadvantage. Especially in engineering applications, an engineer, with or without an ANN background, should not have to spend so much time optimizing the network. In recent years, the Taguchi method (Taguchi, 1986; Peace, 1993) has become a new approach that can be used for solving the optimization problems in this field.


The parameters and architectures of an MLFF network were selected by using the Taguchi method in Khaw et al. (1995). This can improve the original network design to obtain a better performance. The same technique has been used to optimize Neocognitron networks (Teo and Sim, 1995) and another MLFF network (Lin and Tseng, 1998). The Taguchi method is a type of optimization technique that is well suited to solving problems with continuous, discrete, and qualitative design variables; therefore, any ANN model can be optimized by this method. Another method, the genetic algorithm, which requires a large computational cost, has been applied to populations of descriptions of networks in order to learn the most appropriate architecture (Miller et al., 1989).

In this study, a systematic process is introduced to obtain the optimum design of a neural network. The Taguchi method and the Design of Experiments (DOE) technique (Montgomery, 1991) are the main techniques used. Unlike previous studies, the Taguchi method is used here to simplify the optimization problem, after which DOE is more easily performed. Because of the stronger statistical basis of the DOE methodology, many analyses can be executed. Finally, a Learning Vector Quantization (LVQ) network is demonstrated as an example. The method proposed in this paper can also be applied to any ANN model. The integration of optimization and ANNs in this paper is implemented in a computer program that can be executed automatically and easily.

    2. Optimization process

Optimization techniques are used to obtain an improved solution under given circumstances. In ANN design, it is helpful to improve the original settings of a network in order to get a better performance. For the convenience of further analysis, the parameters in ANNs must be classified as follows.

2.1. Design parameter classification

ANNs are defined by a set of quantities, some of which are viewed as variables during the design process. These parameters are classified into three parts according to their numerical character. For an n-vector $x = (x_1, x_2, \ldots, x_n)$, there are:

1. Continuous design parameters:

$$x_{kl} \le x_k \le x_{ku}, \quad k = 1, 2, \ldots, n \quad (1)$$

where $x_k \in \mathbb{R}$, $x_{kl}$ is the lower bound of $x_k$, and $x_{ku}$ is the upper bound of $x_k$.

2. Discrete design parameters: $x_k \in \{x_{k1}, x_{k2}, \ldots, x_{km}\}$, where $m$ is the size of the discrete set.

3. Qualitative design parameters: $x_k$ is a qualitative variable which cannot be described by a numerical expression.

For example, consider an MLFF neural network with a "backpropagation with momentum" training method. The continuous design parameters are the learning rate, momentum term and weight initialization range. The discrete design parameters are the number of hidden layers, the number of units in each layer and the number of training data items. The qualitative design parameters are the activation function type, the network topology and the numerical method, such as gradient descent, conjugate gradient and BFGS (Arora, 1989).
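As a concrete sketch, the MLFF parameters above might be recorded in a structure such as the following (the particular bounds and candidate sets are illustrative assumptions, not values from the paper):

    # Three classes of design parameters for the MLFF example.
    design_parameters = {
        "continuous": {                      # x_kl <= x_k <= x_ku
            "learning_rate": (0.001, 1.0),
            "momentum":      (0.0, 0.99),
            "weight_init":   (0.01, 1.0),
        },
        "discrete": {                        # x_k drawn from a finite set
            "hidden_layers":   [1, 2, 3],
            "units_per_layer": [4, 8, 16, 32],
            "training_items":  [100, 500, 1000],
        },
        "qualitative": {                     # no numerical ordering at all
            "activation": ["sigmoid", "tanh"],
            "topology":   ["fully-connected", "sparse"],
            "method":     ["gradient descent", "conjugate gradient", "BFGS"],
        },
    }

Keeping the three classes separate matters later, because only the continuous block is even a candidate for gradient-based methods.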

    Fig. 1. Optimization process.


2.2. Optimization problem

In order to obtain an optimum design for a neural network, an optimization process is proposed in Fig. 1. First, choose a suitable ANN model for the application. The optimization problem can then be formulated as follows.

Find an n-vector $x = (x_1, x_2, \ldots, x_n)$ of design variables to minimize a vector objective function

$$F(x) = [f_1(x), f_2(x), \ldots, f_q(x)] \quad (2)$$

subject to the constraints

$$h_j(x) = 0, \quad j = 1, 2, \ldots, p; \qquad g_i(x) \le 0, \quad i = 1, 2, \ldots, m. \quad (3)$$

The design variables x can be classified into three parts, continuous, discrete and qualitative design variables, as defined above. The objective functions represent criteria that are used to evaluate different designs. In ANN design, the objective function can be the training error, the learning efficiency, the grouping error, etc. For engineering design problems there are some limitations, called constraints, so the design variables cannot be selected completely freely. Equality as well as inequality constraints often exist in a problem.
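As a minimal Python sketch of this formulation in Eqs. (2) and (3) (the particular objective and constraint expressions are illustrative assumptions, not taken from the paper):

    # Sketch: a design x is scored by a vector of objectives F(x) and
    # checked against equality constraints h_j(x) = 0 and inequality
    # constraints g_i(x) <= 0.
    def objectives(x):
        # e.g. training error and grouping error; dummy expressions here
        return (x["learning_rate"] ** 2, 1.0 / x["units_per_layer"])

    def feasible(x, tol=1e-6):
        h = []                              # equality constraints h_j(x) = 0
        g = [x["learning_rate"] - 1.0]      # inequality: learning rate <= 1
        return all(abs(v) <= tol for v in h) and all(v <= 0.0 for v in g)

    x = {"learning_rate": 0.1, "units_per_layer": 8}
    if feasible(x):
        print("F(x) =", objectives(x))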

    2.3. Traditional optimization method

Numerical methods, such as Sequential Linear Programming (SLP) and Sequential Quadratic Programming (SQP) (Arora, 1989), which are employed to solve optimization problems, are usually referred to as "traditional methods". In ANN design, it is not appropriate to use these schemes, for the following reasons.

1. There exist qualitative design parameters, and these parameters cannot be described by a numerical expression. Therefore, they cannot be handled by numerical methods.

2. There exist non-pseudo-discrete design parameters. Discrete parameters that occur when the solution to a continuous problem is perfectly meaningful but cannot be accepted due to extraneous restrictions are termed "pseudo-discrete parameters", and these can be solved by traditional methods (Gill et al., 1981). For instance, a variable in a design problem could be the diameter of a pipe. The diameter is a continuous variable, but only specific values, such as 1 in, 1.5 in and 2 in, can be found on the market; this kind of variable is called a "pseudo-discrete" design parameter. In ANN design, however, many non-pseudo-discrete parameters that are intrinsically discrete, such as the number of units and layers, have to be determined.

3. The objective function is complicated. In applying the traditional methods, first-order or second-order differentials of the objective function have to be checked before using SLP or SQP. However, in ANN design it is difficult or impossible to write numerical expressions for the objective function. For example, when the grouping error is treated as the objective function, the grouping error of every training process may be calculated by software or a user subroutine, which is seen as a "black box". Therefore, only the implicit form of the objective function can be obtained; there is no explicit form available for checking.

For the above reasons, traditional optimization methods cannot perform well in ANN design. On the other hand, there are no such limitations when using other kinds of optimization methods, such as the Taguchi method and the DOE methodology.

    Fig. 2. The Taguchi method and DOE methodology.


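To make the contrast concrete, the following Python sketch screens three two-level factors with an L4(2^3) orthogonal array and a smaller-the-better signal-to-noise ratio; note that the qualitative activation factor poses no difficulty. The factor levels and the run_experiment stand-in are illustrative assumptions, not the paper's actual experiment:

    import math

    # L4(2^3) orthogonal array: four runs screen three two-level factors.
    L4 = [(1, 1, 1), (1, 2, 2), (2, 1, 2), (2, 2, 1)]
    levels = {
        "learning_rate": {1: 0.01, 2: 0.1},
        "momentum":      {1: 0.0,  2: 0.9},
        "activation":    {1: "sigmoid", 2: "tanh"},  # qualitative factor
    }

    # Hypothetical stand-in for one training run returning an error measure.
    def run_experiment(lr, mom, act):
        return lr + (0.5 if act == "sigmoid" else 0.3) - 0.2 * mom

    # Smaller-the-better signal-to-noise ratio for a single observation.
    def sn_ratio(y):
        return -10.0 * math.log10(y ** 2)

    sn = []
    for a, b, c in L4:
        y = run_experiment(levels["learning_rate"][a],
                           levels["momentum"][b],
                           levels["activation"][c])
        sn.append(sn_ratio(y))

    # Main effect of each factor: mean S/N at level 2 minus at level 1.
    for col, name in enumerate(levels):
        lvl1 = [s for row, s in zip(L4, sn) if row[col] == 1]
        lvl2 = [s for row, s in zip(L4, sn) if row[col] == 2]
        print(name, "effect:", sum(lvl2) / 2 - sum(lvl1) / 2)

The factors with the largest effects would then be carried into a fuller DOE analysis, which matches the two-stage strategy this paper adopts.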

2.4. The Taguchi method