Neural Networks and Learning Machines
For graduate-level neural network courses offered in the departments of Computer Engineering, Electrical Engineering, and Computer Science. Renowned for its thoroughness and readability, this well-organized and completely up-to-date text remains the most comprehensive treatment of neural networks from an engineering perspective.

Library of Congress Cataloging-in-Publication Data
Haykin, Simon
Neural networks and learning machines / Simon Haykin. — 3rd ed.
p. cm.
Rev. ed. of: Neural networks. 2nd ed. 1999.
Includes bibliographical references and index.
ISBN-13: 978-0-13-147139-9
ISBN-10: 0-13-147139-2
1. Neural networks (Computer science) 2. Adaptive filters. I. Haykin, Simon. Neural networks. II. Title.
QA76.87.H39 2008
006.3--dc22
2008034079

Vice President and Editorial Director, ECS: Marcia J. Horton
Associate Editor: Alice Dworkin
Supervisor/Editorial Assistant: Dolores Mars
Editorial Assistant: William Opaluch
Director of Team-Based Project Management: Vince O'Brien
Senior Managing Editor: Scott Disanno
A/V Production Editor: Greg Dulles
Art Director: Jayne Conte
Cover Designer: Bruce Kenselaar
Manufacturing Manager: Alan Fischer
Manufacturing Buyer: Lisa McDowell
Marketing Manager: Tim Galligan

Copyright © 2009 by Pearson Education, Inc., Upper Saddle River, New Jersey 07458. Pearson Prentice Hall. All rights reserved. Printed in the United States of America. This publication is protected by copyright, and permission should be obtained from the publisher prior to any prohibited reproduction, storage in a retrieval system, or transmission in any form or by any means, electronic, mechanical, photocopying, recording, or likewise. For information regarding permission(s), write to: Rights and Permissions Department.

Pearson is a registered trademark of Pearson plc
Pearson Education Ltd.
Pearson Education Australia Pty. Limited
Pearson Education Singapore, Pte. Ltd.
Pearson Education North Asia Ltd.
Pearson Education Canada, Ltd.
Pearson Educación de México, S.A. de C.V.
Pearson Education—Japan
Pearson Education Malaysia, Pte. Ltd.

Prentice Hall is an imprint of PEARSON
10 9 8 7 6 5 4 3 2 1
ISBN-13: 978-0-13-147139-9
ISBN-10: 0-13-147139-2

To my wife, Nancy, for her patience and tolerance, and to the countless researchers in neural networks for their original contributions, the many reviewers for their critical inputs, and many of my graduate students for their keen interest.

Contents

Introduction 1
1. What Is a Neural Network? 1
2. The Human Brain 6
3. Models of a Neuron 10
4. Neural Networks Viewed As Directed Graphs 15
5. Feedback 18
6. Network Architectures
7. Knowledge Representation
8. Learning Processes 34
9. Learning Tasks 38
10. Concluding Remarks 45
Notes and References 46

Chapter 1 Rosenblatt's Perceptron 47
1.1 Introduction 47
1.2 Perceptron 48
1.3 The Perceptron Convergence Theorem 50
1.4 Relation Between the Perceptron and Bayes Classifier for a Gaussian Environment 55
1.5 Computer Experiment: Pattern Classification 60
1.6 The Batch Perceptron Algorithm 62
1.7 Summary and Discussion 65
Notes and References 66
Problems 66

Chapter 2 Model Building through Regression 68
2.1 Introduction 68
2.2 Linear Regression Model: Preliminary Considerations 69
2.3 Maximum a Posteriori Estimation of the Parameter Vector 71
2.4 Relationship Between Regularized Least-Squares Estimation and MAP Estimation 76
2.5 Computer Experiment: Pattern Classification 77
2.6 The Minimum-Description-Length Principle 79
2.7 Finite Sample-Size Considerations 82
2.8 The Instrumental-Variables Method 86
2.9 Summary and Discussion 88
Notes and References 89
Problems 89

Chapter 3 The Least-Mean-Square Algorithm 91
3.1 Introduction 91
3.2 Filtering Structure of the LMS Algorithm 92
3.3 Unconstrained Optimization: A Review 94
3.4 The Wiener Filter 100
3.5 The Least-Mean-Square Algorithm 102
3.6 Markov Model Portraying the Deviation of the LMS Algorithm from the Wiener Filter 104
3.7 The Langevin Equation: Characterization of Brownian Motion 106
3.8 Kushner's Direct-Averaging Method 107
3.9 Statistical LMS Learning Theory for Small Learning-Rate Parameter 108
3.10 Computer Experiment I: Linear Prediction 110
3.11 Computer Experiment II: Pattern Classification 112
3.12 Virtues and Limitations of the LMS Algorithm 113
3.13 Learning-Rate Annealing Schedules 115
3.14 Summary and Discussion 117
Notes and References 118
Problems 119

Chapter 4 Multilayer Perceptrons 122
4.1 Introduction 123
4.2 Some Preliminaries 124
4.3 Batch Learning and On-Line Learning 126
4.4 The Back-Propagation Algorithm 129
4.5 XOR Problem 141
4.6 Heuristics for Making the Back-Propagation Algorithm Perform Better 144
4.7 Computer Experiment: Pattern Classification 150
4.8 Back Propagation and Differentiation 153
4.9 The Hessian and Its Role in On-Line Learning 155
4.10 Optimal Annealing and Adaptive Control of the Learning Rate 157
4.11 Generalization 164
4.12 Approximations of Functions 166
4.13 Cross-Validation 171
4.14 Complexity Regularization and Network Pruning 175
4.15 Virtues and Limitations of Back-Propagation Learning 180
4.16 Supervised Learning Viewed as an Optimization Problem 186
4.17 Convolutional Networks 201
4.18 Nonlinear Filtering 203
4.19 Small-Scale Versus Large-Scale Learning Problems 209
4.20 Summary and Discussion 217
Notes and References 219
Problems 221

Chapter 5 Kernel Methods and Radial-Basis Function Networks 230
5.1 Introduction 230
5.2 Cover's Theorem on the Separability of Patterns 231
5.3 The Interpolation Problem 236
5.4 Radial-Basis-Function Networks 239
5.5 K-Means Clustering 242
5.6 Recursive Least-Squares Estimation of the Weight Vector 245
5.7 Hybrid Learning Procedure for RBF Networks 249
5.8 Computer Experiment: Pattern Classification 250
5.9 Interpretations of the Gaussian Hidden Units 252
5.10 Kernel Regression and Its Relation to RBF Networks 255
5.11 Summary and Discussion 259
Notes and References 261
Problems 263

Chapter 6 Support Vector Machines 268
6.1 Introduction 268
6.2 Optimal Hyperplane for Linearly Separable Patterns 269
6.3 Optimal Hyperplane for Nonseparable Patterns 276
6.4 The Support Vector Machine Viewed as a Kernel Machine 281
6.5 Design of Support Vector Machines 284
6.6 XOR Problem 286
6.7 Computer Experiment: Pattern Classification 289
6.8 Regression: Robustness Considerations 289
6.9 Optimal Solution of the Linear Regression Problem 293
6.10 The Representer Theorem and Related Issues 296
6.11 Summary and Discussion 302
Notes and References 304
Problems 307

Chapter 7 Regularization Theory 313
7.1 Introduction 313
7.2 Hadamard's Conditions for Well-Posedness 314
7.3 Tikhonov's Regularization Theory 315
7.4 Regularization Networks 326
7.5 Generalized Radial-Basis-Function Networks 327
7.6 The Regularized Least-Squares Estimator: Revisited 331
7.7 Additional Notes of Interest on Regularization 335
7.8 Estimation of the Regularization Parameter 336
7.9 Semisupervised Learning 342
7.10 Manifold Regularization: Preliminary Considerations 343
7.11 Differentiable Manifolds 345
7.12 Generalized Regularization Theory 348
7.13 Spectral Graph Theory 350
7.14 Generalized Representer Theorem 352
7.15 Laplacian Regularized Least-Squares Algorithm 354
7.16 Experiments on Pattern Classification Using Semisupervised Learning 356
7.17 Summary and Discussion 359
Notes and References 361
Problems 363

Chapter 8 Principal-Components Analysis 367
8.1 Introduction 367
8.2 Principles of Self-Organization 368
8.3 Self-Organized Feature Analysis 372
8.4 Principal-Components Analysis: Perturbation Theory 373
8.5 Hebbian-Based Maximum Eigenfilter 383
8.6 Hebbian-Based Principal-Components Analysis 392
8.7 Case Study: Image Coding 398
8.8 Kernel Principal-Components Analysis 401
8.9 Basic Issues Involved in the Coding of Natural Images 406
8.10 Kernel Hebbian Algorithm 407
8.11 Summary and Discussion 412
Notes and References 415
Problems 418

Chapter 9 Self-Organizing Maps 425
9.1 Introduction 425
9.2 Two Basic Feature-Mapping Models 426
9.3 Self-Organizing Map 428
9.4 Properties of the Feature Map 437
9.5 Computer Experiments I: Disentangling Lattice Dynamics Using SOM 445
9.6 Contextual Maps 447
9.7 Hierarchical Vector Quantization 450
9.8 Kernel Self-Organizing Map 454
9.9 Computer Experiment II: Disentangling Lattice Dynamics Using Kernel SOM 462
9.10 Relationship Between Kernel SOM and Kullback-Leibler Divergence 464
9.11 Summary and Discussion 466
Notes and References 468
Problems 470

Chapter 10 Information-Theoretic Learning Models 475
10.1 Introduction 476
10.2 Entropy 477
10.3 Maximum-Entropy Principle 481
10.4 Mutual Information 484
10.5 Kullback-Leibler Divergence 486
10.6 Copulas 489
10.7 Mutual Information as an Objective Function to Be Optimized 493
10.8 Maximum Mutual Information Principle 494
10.9 Infomax and Redundancy Reduction 499
10.10 Spatially Coherent Features 501
10.11 Spatially Incoherent Features 504
10.12 Independent-Components Analysis 508
10.13 Sparse Coding of Natural Images and Comparison with ICA Coding 514
10.14 Natural-Gradient Learning for Independent-Components Analysis 516
10.15 Maximum-Likelihood Estimation for Independent-Components Analysis 526
10.16 Maximum-Entropy Learning for Blind Source Separation 529
10.17 Maximization of Negentropy for Independent-Components Analysis 534
10.18 Coherent Independent-Components Analysis 541
10.19 Rate Distortion Theory and Information Bottleneck 549
10.20 Optimal Manifold Representation of Data 553
10.21 Computer Experiment: Pattern Classification 560
10.22 Summary and Discussion 561
Notes and References 564
Problems 572

Chapter 11 Stochastic Methods Rooted in Statistical Mechanics 579
11.1 Introduction 580
11.2 Statistical Mechanics 580
11.3 Markov Chains 582
11.4 Metropolis Algorithm 591
11.5 Simulated Annealing 594
11.6 Gibbs Sampling 596
11.7 Boltzmann Machine 598
11.8 Logistic Belief Nets 604
11.9 Deep Belief Nets 606
11.10 Deterministic Annealing 610
11.11 Analogy of Deterministic Annealing with Expectation-Maximization Algorithm 616
11.12 Summary and Discussion 617
Notes and References 619
Problems 621

Chapter 12 Dynamic Programming 627
12.1 Introduction 627
12.2 Markov Decision Process 629
12.3 Bellman's Optimality Criterion 631
12.4 Policy Iteration 635
12.5 Value Iteration 637
12.6 Approximate Dynamic Programming: Direct Methods 642
12.7 Temporal-Difference Learning 643
12.8 Q-Learning 648
12.9 Approximate Dynamic Programming: Indirect Methods 652
12.10 Least-Squares Policy Evaluation 655
12.11 Approximate Policy Iteration 660
12.12 Summary and Discussion 663
Notes and References 665
Problems 668

Chapter 13 Neurodynamics 672
13.1 Introduction 672
13.2 Dynamic Systems 674
13.3 Stability of Equilibrium States 678
13.4 Attractors 684
13.5 Neurodynamic Models 686
13.6 Manipulation of Attractors as a Recurrent Network Paradigm 689
13.7 Hopfield Model 690
13.8 The Cohen-Grossberg Theorem 703
13.9 Brain-State-In-A-Box Model 705
13.10 Strange Attractors and Chaos 711
13.11 Dynamic Reconstruction of a Chaotic Process 716
13.12 Summary and Discussion 722
Notes and References 724
Problems 727

Chapter 14 Bayesian Filtering for State Estimation of Dynamic Systems 731
14.1 Introduction 731
14.2 State-Space Models 732
14.3 Kalman Filters 736
14.4 The Divergence Phenomenon and Square-Root Filtering 744
14.5 The Extended Kalman Filter 750
14.6 The Bayesian Filter 755
14.7 Cubature Kalman Filter: Building on the Kalman Filter 759
14.8 Particle Filters 765
14.9 Computer Experiment: Comparative Evaluation of Extended Kalman and Particle Filters 775
14.10 Kalman Filtering in Modeling of Brain Functions 777
14.11 Summary and Discussion 780
Notes and References 782
Problems 784