Fundamentals of Artificial Neural Networks

Mohamad H. Hassoun

A Bradford Book
The MIT Press
Cambridge, Massachusetts
London, England

  • 7/30/2019 Fundamentals of Artificial Neural Networks

    2/7

Contents

Preface xiii
Acknowledgments xix
Abbreviations xxi
Symbols xxiii

1 Threshold Gates 1
1.1 Threshold Gates 2

1.1.1 Linear Threshold Gates 2
1.1.2 Quadratic Threshold Gates 7
1.1.3 Polynomial Threshold Gates 8

1.2 Computational Capabilities of Polynomial Threshold Gates 9
1.3 General Position and the Function Counting Theorem 15
1.3.1 Weierstrass's Approximation Theorem 15

1.3.2 Points in General Position 16
1.3.3 Function Counting Theorem 17
1.3.4 Separability in φ-Space 20

1.4 Minimal PTG Realization of Arbitrary Switching Functions 21
1.5 Ambiguity and Generalization 24
1.6 Extreme Points 27
1.7 Summary 29

Problems 30

2 Computational Capabilities of Artificial Neural Networks 35
2.1 Some Preliminary Results on Neural Network Mapping Capabilities 35

2.1.1 Network Realization of Boolean Functions 35
2.1.2 Bounds on the Number of Functions Realizable by a Feedforward Network of LTG's 38
2.2 Necessary Lower Bounds on the Size of LTG Networks 41

2.2.1 Two-Layer Feedforward Networks 41
2.2.2 Three-Layer Feedforward Networks 44
2.2.3 Generally Interconnected Networks with No Feedback 45

2.3 Approximation Capabilities of Feedforward Neural Networks for Continuous Functions 46
2.3.1 Kolmogorov's Theorem 46
2.3.2 Single-Hidden-Layer Neural Networks Are Universal Approximators 47
2.3.3 Single-Hidden-Layer Neural Networks Are Universal Classifiers 50

  • 7/30/2019 Fundamentals of Artificial Neural Networks

    3/7

    VU l Contents

2.4 Computational Effectiveness of Neural Networks 51
2.4.1 Algorithmic Complexity 51
2.4.2 Computational Energy 52
2.5 Summary 53
Problems 54

3 Learning Rules 57
3.1 Supervised Learning in a Single-Unit Setting 57

3.1.1 Error Correction Rules 58
3.1.2 Other Gradient-Descent-Based Learning Rules 67
3.1.3 Extension of the μ-LMS Rule to Units with Differentiable Activation Functions: Delta Rule 76
3.1.4 Adaptive Ho-Kashyap (AHK) Learning Rules 78
3.1.5 Other Criterion Functions 82
3.1.6 Extension of Gradient-Descent-Based Learning to Stochastic Units 87
3.2 Reinforcement Learning 88

3.2.1 Associative Reward-Penalty Reinforcement Learning Rule 89
3.3 Unsupervised Learning 90

3.3.1 Hebbian Learning 90
3.3.2 Oja's Rule 92
3.3.3 Yuille et al. Rule 92
3.3.4 Linsker's Rule 95
3.3.5 Hebbian Learning in a Network Setting: Principal-Component Analysis (PCA) 97
3.3.6 Nonlinear PCA 101

3.4 Competitive Learning 103
3.4.1 Simple Competitive Learning 103
3.4.2 Vector Quantization 109

3.5 Self-Organizing Feature Maps: Topology-Preserving Competitive Learning 112
3.5.1 Kohonen's SOFM 113
3.5.2 Examples of SOFMs 114

3.6 Summary 126
Problems 134

  • 7/30/2019 Fundamentals of Artificial Neural Networks

    4/7

    Contents ix

4 Mathematical Theory of Neural Learning 143
4.1 Learning as a Search/Approximation Mechanism 143
4.2 Mathematical Theory of Learning in a Single-Unit Setting 145
4.2.1 General Learning Equation 146

4.2.2 Analysis of the Learning Equation 147
4.2.3 Analysis of Some Basic Learning Rules 148

4.3 Characterization of Additional Learning Rules 152
4.3.1 Simple Hebbian Learning 154
4.3.2 Improved Hebbian Learning 155
4.3.3 Oja's Rule 156
4.3.4 Yuille et al. Rule 158
4.3.5 Hassoun's Rule 161
4.4 Principal-Component Analysis (PCA) 163

4.5 Theory of Reinforcement Learning 165
4.6 Theory of Simple Competitive Learning 166

4.6.1 Deterministic Analysis 167
4.6.2 Stochastic Analysis 168

4.7 Theory of Feature Mapping 171
4.7.1 Characterization of Kohonen's Feature Map 171
4.7.2 Self-Organizing Neural Fields 173

4.8 Generalization 180
4.8.1 Generalization Capabilities of Deterministic Networks 180
4.8.2 Generalization in Stochastic Networks 185

4.9 Complexity of Learning 187
4.10 Summary 190

Problems 190

5 Adaptive Multilayer Neural Networks I 197
5.1 Learning Rule for Multilayer Feedforward Neural Networks 197

5.1.1 Error Backpropagation Learning Rule 199
5.1.2 Global-Descent-Based Error Backpropagation 206
5.2 Backprop Enhancements and Variations 210

5.2.1 Weights Initialization 210
5.2.2 Learning Rate 211
5.2.3 Momentum 213

  • 7/30/2019 Fundamentals of Artificial Neural Networks

    5/7

    X Contents

5.2.4 Activation Function 219
5.2.5 Weight Decay, Weight Elimination, and Unit Elimination 221
5.2.6 Cross-Validation 226
5.2.7 Criterion Functions 230

5.3 Applications 234
5.3.1 NETtalk 234
5.3.2 Glove-Talk 236
5.3.3 Handwritten ZIP Code Recognition 240
5.3.4 ALVINN: A Trainable Autonomous Land Vehicle 244
5.3.5 Medical Diagnosis Expert Net 246
5.3.6 Image Compression and Dimensionality Reduction 247

5.4 Extensions of Backprop for Temporal Learning 253
5.4.1 Time-Delay Neural Networks 254
5.4.2 Backpropagation Through Time 259
5.4.3 Recurrent Backpropagation 267
5.4.4 Time-Dependent Recurrent Backpropagation 271
5.4.5 Real-Time Recurrent Learning 274

5.5 Summary 275
Problems 276

6 Adaptive Multilayer Neural Networks II 285
6.1 Radial Basis Function (RBF) Networks 285

6.1.1 RBF Networks versus Backprop Networks 294
6.1.2 RBF Network Variations 296

6.2 Cerebellar Model Articulation Controller (CMAC) 301
6.2.1 CMAC Relation to Rosenblatt's Perceptron and Other Models 304
6.3 Unit-Allocating Adaptive Networks 310

6.3.1 Hyperspherical Classifiers 311
6.3.2 Cascade-Correlation Network 318
6.4 Clustering Networks 322
6.4.1 Adaptive Resonance Theory (ART) Networks 323
6.4.2 Autoassociative Clustering Network 328

6.5 Summary 337
Problems 339

  • 7/30/2019 Fundamentals of Artificial Neural Networks

    6/7

    Contents

7 Associative Neural Memories 345
7.1 Basic Associative Neural Memory Models 345

7.1.1 Simple Associative Memories and Their Associated Recording Recipes 346
7.1.2 Dynamic Associative Memories (DAMs) 353

7.2 DAM Capacity and Retrieval Dynamics 363
7.2.1 Correlation DAMs 363
7.2.2 Projection DAMs 369

7.3 Characteristics of High-Performance DAMs 374
7.4 Other DAM Models 375

7.4.1 Brain-State-in-a-Box (BSB) DAM 375
7.4.2 Nonmonotonic Activations DAM 381
7.4.3 Hysteretic Activations DAM 386
7.4.4 Exponential-Capacity DAM 389
7.4.5 Sequence-Generator DAM 391
7.4.6 Heteroassociative DAM 392

7.5 The DAM as a Gradient Net and Its Application to Combinatorial Optimization 394

7.6 Summary 400
Problems 401

8 Global Search Methods for Neural Networks 417
8.1 Local versus Global Search 417

8.1.1 A Gradient Descent/Ascent Search Strategy 419
8.1.2 Stochastic Gradient Search: Global Search via Diffusion 421

8.2 Simulated Annealing-Based Global Search 424
8.3 Simulated Annealing for Stochastic Neural Networks 428

8.3.1 Global Convergence in a Stochastic Recurrent Neural Net: The Boltzmann Machine 429

8.3.2 Learning in Boltzmann Machines 431
8.4 Mean-Field Annealing and Deterministic Boltzmann Machines 436
8.4.1 Mean-Field Retrieval 437
8.4.2 Mean-Field Learning 438

8.5 Genetic Algorithms in Neural Network Optimization 439
8.5.1 Fundamentals of Genetic Algorithms 439
8.5.2 Application of Genetic Algorithms to Neural Networks 452

  • 7/30/2019 Fundamentals of Artificial Neural Networks

    7/7

    xii Contents

8.6 Genetic Algorithm-Assisted Supervised Learning 454
8.6.1 Hybrid GA/Gradient-Descent Method for Feedforward Multilayer Net Training 455
8.6.2 Simulations 458
8.7 Summary 462

Problems 463
References 469
Index 501