Fundamentals of Neural Networks (Laurene Fausett)


Contents

PREFACE

ACKNOWLEDGMENTS

CHAPTER 1 INTRODUCTION

1.1 Why Neural Networks, and Why Now?

1.2 What Is a Neural Net?
    1.2.1 Artificial Neural Networks
    1.2.2 Biological Neural Networks

1.3 Where Are Neural Nets Being Used?
    1.3.1 Signal Processing
    1.3.2 Control
    1.3.3 Pattern Recognition
    1.3.4 Medicine
    1.3.5 Speech Production
    1.3.6 Speech Recognition
    1.3.7 Business

1.4 How Are Neural Networks Used?
    1.4.1 Typical Architectures
    1.4.2 Setting the Weights
    1.4.3 Common Activation Functions
    1.4.4 Summary of Notation


Preface

There has been a resurgence of interest in artificial neural networks over the last few years, as researchers from diverse backgrounds have produced a firm theoretical foundation and demonstrated numerous applications of this rich field of study. However, the interdisciplinary nature of neural networks complicates the development of a comprehensive, but introductory, treatise on the subject. Neural networks are useful tools for solving many types of problems. These problems may be characterized as mapping (including pattern association and pattern classification), clustering, and constrained optimization. There are several neural networks available for each type of problem. In order to use these tools effectively it is important to understand the characteristics (strengths and limitations) of each.

This book presents a wide variety of standard neural networks, with diagrams of the architecture, detailed statements of the training algorithm, and several examples of the application for each net. In keeping with our intent to show neural networks in a fair but objective light, typical results of simple experiments are included (rather than the best possible). The emphasis is on computational characteristics, rather than psychological interpretations. To illustrate the similarities and differences among the neural networks discussed, similar examples are used wherever it is appropriate.

Fundamentals of Neural Networks has been written for students and for researchers in academia, industry, and government who are interested in using neural networks. It has been developed both as a textbook for a one-semester, or two-quarter, Introduction to Neural Networks course at Florida Institute of Technology, and as a resource book for researchers. Our course has been developed jointly by neural networks researchers from applied mathematics, computer science, and computer and electrical engineering. Our students are seniors, or graduate students, in science and engineering; many work in local industry.

It is assumed that the reader is familiar with calculus and some vector-matrix notation and operations. The mathematical treatment has been kept at a minimal level, consistent with the primary aims of clarity and correctness. Derivations, theorems, and proofs are included when they serve to illustrate the important features of a particular neural network. For example, the mathematical derivation of the backpropagation training algorithm makes clear the correct order of the operations. The level of mathematical sophistication increases somewhat in the later chapters, as is appropriate for the networks presented in Chapters 5, 6, and 7. However, derivations and proofs (when included) are presented at the end of a section or chapter, so that they can be skipped without loss of continuity.
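(A minimal sketch, not from the book, of the order of operations that the backpropagation derivation pins down: one training step for a single hidden layer with sigmoid activations. The network sizes, learning rate, and training pattern are arbitrary illustrative choices, and biases are omitted for brevity. The essential point is the sequence: the forward pass completes first, the output-layer error terms are computed next, the hidden-layer error terms are computed from them, and only then are the weights updated.)

    import numpy as np

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    rng = np.random.default_rng(0)
    V = rng.uniform(-0.5, 0.5, (2, 3))   # input-to-hidden weights (2 inputs, 3 hidden units)
    W = rng.uniform(-0.5, 0.5, (3, 1))   # hidden-to-output weights (1 output unit)
    alpha = 0.25                         # learning rate

    x = np.array([1.0, 0.0])             # one training pattern
    t = np.array([1.0])                  # its target

    # Step 1: forward pass -- all activations are computed first.
    z = sigmoid(x @ V)                   # hidden activations
    y = sigmoid(z @ W)                   # output activations

    # Step 2: output-layer error terms.
    delta_out = (t - y) * y * (1.0 - y)

    # Step 3: hidden-layer error terms; these depend on delta_out,
    # so they cannot be computed before step 2.
    delta_hid = (delta_out @ W.T) * z * (1.0 - z)

    # Step 4: weight updates, applied only after all error terms are known.
    W += alpha * np.outer(z, delta_out)
    V += alpha * np.outer(x, delta_hid)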

The order of presentation of the topics was chosen to reflect increasing complexity of the networks. The material in each chapter is largely independent, so that the chapters (after the first chapter) may be used in almost any order desired. The McCulloch-Pitts neuron discussed at the end of Chapter 1 provides a simple example of an early neural net. Single layer nets for pattern classification and pattern association, covered in Chapters 2 and 3, are two of the earliest applications of neural networks with adaptive weights. More complex networks, discussed in later chapters, are also used for these types of problems, as well as for more general mapping problems. Chapter 6, backpropagation, can logically follow Chapter 2, although the networks in Chapters 3-5 are somewhat simpler in structure. Chapters 4 and 5 treat networks for clustering problems (and mapping networks that are based on these clustering networks). Chapter 7 presents a few of the most widely used of the many other neural networks, including two for constrained optimization problems.
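(As a concrete taste of that simple starting point, here is a minimal sketch, not from the book, of a McCulloch-Pitts neuron: inputs and outputs are binary, the weights are fixed by hand rather than trained, and the unit fires exactly when the weighted sum of its inputs reaches a threshold. The AND gate below is a standard illustration; the weights and threshold are hand-chosen for this example.)

    def mcculloch_pitts(inputs, weights, threshold):
        # Fire (output 1) if the weighted sum of the binary inputs
        # reaches the threshold; the weights are fixed, not learned.
        net = sum(w * x for w, x in zip(weights, inputs))
        return 1 if net >= threshold else 0

    # Hand-chosen weights and threshold realize logical AND:
    # the unit fires only when both inputs are 1.
    for x1 in (0, 1):
        for x2 in (0, 1):
            print(x1, x2, "->", mcculloch_pitts((x1, x2), weights=(1, 1), threshold=2))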

Algorithms, rather than computer codes, are provided to encourage the reader to develop a thorough understanding of the mechanisms of training and applying the neural network, rather than fostering the more superficial familiarity that sometimes results from using completely developed software packages. For many applications, the formulation of the problem for solution by a neural network (and choice of an appropriate network) requires the detailed understanding of the networks that comes from performing both hand calculations and developing computer codes for extremely simple examples.

Acknowledgments

Many people have helped to make this book a reality. I can only mention a few of them here.

I have benefited either directly or indirectly from short courses on neural networks taught by Harold Szu, Robert Hecht-Nielsen, Steven Rogers, Bernard Widrow, and Tony Martinez.

My thanks go also to my colleagues for stimulating discussions and encouragement, especially Harold K. Brown, Barry Grossman, Fred Ham, Demetrios Lainiotis, Moti Schneider, Nazif Tepedelenlioglu, and Mike Thursby.

My students have assisted in the development of this book in many ways; several of the examples are based on student work. Joe Vandeville, Alan Lindsay, and Francisco Gomez performed the computations for many of the examples in Chapter 2. John Karp provided the results for Example 4.8. Judith Lipofsky did Examples 4.9 and 4.10. Fred Parker obtained the results shown in Examples 4.12 and 4.13. Joseph Oslakovic performed the computations for several of the examples in Chapter 5. Laurie Walker assisted in the development of the backpropagation program for several of the examples in Chapter 6; Ti-Cheng Shih did the computations for Example 6.5; Abdallah Said developed the logarithmic activation function used in Examples 6.7 and 6.8. Todd Kovach, Robin Schumann, and Hong-wei Du assisted with the Boltzmann machine and Hopfield net examples in Chapter 7; Ki-suck Yoo provided Example 7.8.

Several of the network architecture diagrams are adapted from the original publications as referenced in the text. The spanning tree test data (Figures 4.11, 4.12, 5.11, and 5.12) are used with permission from Springer-Verlag. The illustrations of modified Hebbian learning have been adapted from the original publications: Figure 7.10 has been adapted from Hertz, Krogh, Palmer, Introduction to the Theory of Neural Computation, © 1991 by Addison-Wesley Publishing Company, Inc. Figure 7.11 has been adapted and reprinted from Neural Networks, Vol. 5, Xu, Oja, and Suen, Modified Hebbian Learning for Curve and Surface Fitting, pp. 441-457, 1992, with permission from Pergamon Press Ltd, Headington Hill Hall, Oxford OX3 0BW, UK. Several of the figures for the neocognitron are adapted from (Fukushima, et al., 1983); they are used with permission of IEEE. The diagrams of the ART2 architecture are used with permission of the Optical Society of America, and Carpenter and Grossberg. The diagrams of the simple recurrent net for learning a context-sensitive grammar (Servan-Schreiber, et al., 1989) are used with the permission of the authors.

The preparation of the manuscript and software for the examples has been greatly facilitated by the use of a Macintosh IIci furnished by Apple Computers under the AppleSeed project. I thank Maurice Kurtz for making it available to me.

I appreciate the constructive and encouraging comments of the manuscript reviewers: Stanley Ahalt, The Ohio State University; Peter Anderson, Rochester Institute of Technology; and Nirmal Bose, Penn State University.

I would like to thank the Prentice-Hall editorial staff, and especially Rick DeLorenzo, for their diligent efforts to produce an accurate and attractive product within the inevitable time and budget constraints.

But first, last, and always, I would like to thank my husband and colleague, Don Fausett, for introducing me to neural networks, and for his patience, encouragement, and advice when asked, during the writing of this book (as well as other times).

FUNDAMENTALS OF NEURAL NETWORKS

1.1 WHY NEURAL NETWORKS, AND WHY NOW?

As modern computers become ever more powerful, scientists continue to be challenged to use machines effectively for tasks that are relatively simple for humans. Based on examples, together with some feedback from a "teacher," we learn easily to recognize the letter A or distinguish a cat from a bird. More experience allows us to refine our responses and improve our performance. Although eventually we may be able to describe rules by which we can make such decisions, these do not necessarily reflect the actual process we use. Even without a teacher, we can group similar patterns together. Yet another common human activity is trying to achieve a goal that involves maximizing a resource (time with one's family, for example) while satisfying certain constraints (such as the need to earn a living). Each of these types of problems illustrates tasks for which computer solutions may be sought.

Traditional, sequential, logic-based digital computing excels in many areas, but has been less successful for other types of problems. The development of artificial neural networks began approximately 50 years ago, motivated by a desire to try both to understand the brain and to emulate some of its strengths. Early

