Neural Networks: Introduction

CHAPTER 00

INTRODUCTION

CSC445: Neural Networks

Prof. Dr. Mostafa Gadal-Haqq M. Mostafa

Computer Science Department

Faculty of Computer & Information Sciences

AIN SHAMS UNIVERSITY

(All figures in this presentation are copyrighted to Pearson Education, Inc.)


Course Objectives

Introduce the main concepts and techniques of

neural network systems.

Investigate the principal neural network

models and applications.



Textbooks

Recommended

S. Haykin. Neural Networks and Learning Machines, 3rd ed., Prentice Hall (Pearson), 2009.

Comprehensive and up-to-date

L. Fausett. Fundamentals of Neural Networks: Architectures, Algorithms, and Applications. Prentice Hall, 1995.

Intermediate



Course Assessment

Assessment

Homework, Quizzes, Computer Assignments (10 points)

Midterm Exam (15 Points)

Project + Lab Exam (10 Points)

Final Exam (65 Points)

Programming and homework assignments

Late answers are NOT accepted!



Good Practice

How to pass this course?

Don’t accumulate …

Do it yourself …

Ask for help …



Resources

Webpages:

The Neural Networks FAQs

ftp://ftp.sas.com/pub/neural/FAQ.html

The Neural Networks Resources

http://www.neoxi.com/NNR/index.html



What is a Neural Network?

Benefits of Neural Networks

The Human Brain

Models of A Neuron

Neural Networks and Graphs

Feedback

Network Architectures

Knowledge Representation

Learning Processes

Learning Tasks


Introduction



What is a Neural Network?

A neural network is a massively parallel distributed

processor made up of simple processing units

(neurons) that has a natural tendency for storing

experiential knowledge and making it available for

use.

It resembles the brain in two respects:

Knowledge is acquired by the network from its environment

through the learning process.

Interneuron connection strengths, known as synaptic weights,

are used to store the acquired knowledge.



Benefits of Neural Networks

Nonlinearity (NN could be linear or nonlinear)

A highly important property, particularly if the underlying physical mechanism responsible for generation of the input signal (e.g. speech signal) is inherently nonlinear.

Input-Output Mapping

A neural network learns a mapping between an input signal and the desired output: a learning mechanism adjusts the weights so as to minimize the difference between the actual response and the desired one (nonparametric statistical inference).

Adaptivity

A neural network could be designed to change its weights in real time, which enables the system to operate in a nonstationary environment.
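As a minimal sketch of the input-output mapping idea above, the Python snippet below adjusts the weights of a single linear neuron so as to shrink the difference between its actual and desired responses. The function name `train_linear_neuron`, the learning rate `eta`, and the toy data are illustrative assumptions, not part of the course material; the update used is the standard error-correction (delta) rule.

```python
import numpy as np

def train_linear_neuron(X, d, eta=0.1, epochs=500):
    """Fit weights w and bias b so that X @ w + b approximates d."""
    rng = np.random.default_rng(0)
    w = rng.normal(scale=0.1, size=X.shape[1])  # small random initial weights
    b = 0.0
    for _ in range(epochs):
        for x, target in zip(X, d):
            y = x @ w + b        # actual response
            e = target - y       # error: desired minus actual response
            w += eta * e * x     # move the weights to reduce the error
            b += eta * e
    return w, b

# Toy data (assumed): the desired response is a linear function of the input.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
d = 2.0 * X[:, 0] - 1.0 * X[:, 1] + 0.5
w, b = train_linear_neuron(X, d)
print(w, b)
```

Because the toy targets are exactly linear in the inputs, the learned `w` and `b` should approach the generating values 2, -1, and 0.5.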




Evidential Response

In the context of pattern classification, a network can be designed to provide information not only about which particular pattern to select, but also about the confidence in the decision made.

Contextual Information

Every neuron in the network is potentially affected by the global activity of all other neurons. Consequently, contextual information is dealt with naturally by a neural network.

Fault tolerance

A neural network is inherently fault tolerant, or capable of robust computation, in the sense that its performance degrades gracefully under adverse operating conditions.




VLSI Implementation

The massively parallel nature of a neural network makes it potentially fast for certain computational tasks, and also makes it well suited for implementation using VLSI technology.

Uniformity of Analysis and Design

Neural networks enjoy universality as information processors:

Neurons, in one form or another, represent an ingredient common to all neural networks, which makes it possible to share theories and learning algorithms in different applications.

Modular networks can be built through a seamless integration of modules

Neurobiology Analogy

The design of a neural network is motivated by analogy with the brain (the fastest and most powerful fault-tolerant parallel processor).



Application of Neural Networks

Neural networks are tractable and easy to implement, especially in hardware. This makes them attractive for a wide range of applications:

Pattern Classifications

Medical Applications

Forecasting

Adaptive Filtering

Adaptive Control



The Human Nervous System

The human nervous system may be viewed as a three-stage system:

The brain, represented by the neural net, is central to the system. It continually receives information, perceives it, and makes appropriate decisions.

The receptors convert stimuli from the human body or the external environment into electrical impulses that convey information to the brain.

The effectors convert electrical impulses generated by the brain into distinct responses as system output.

Figure 1 Block diagram representation of nervous system.




The Human Brain

There are approximately 10 billion ($10^{10}$) neurons in the human cortex, compared with tens of thousands ($10^{4}$) of processors in the most powerful parallel computers.

Each biological neuron is connected to several thousand other neurons.

The typical operating speed of biological neurons is measured in milliseconds ($10^{-3}$ s), while a silicon chip can operate in nanoseconds ($10^{-9}$ s). (The smaller number of processing units in computers can be compensated for by speed.)

The human brain is extremely energy efficient, using approximately $10^{-16}$ joules per operation per second, whereas the best computers today use around $10^{-6}$ joules per operation per second.



The Natural Neural Cell

The neuron’s cell body (soma) processes the incoming activations and converts them into output activations.

Dendrites are fibers which emanate from the cell body and provide the receptive zones that receive activation from other neurons.

Axons are fibers acting as transmission lines that send activation to other neurons.

The junctions that allow signal transmission between the axons and dendrites are called synapses. Transmission takes place by diffusion of chemicals called neurotransmitters across the synaptic cleft.

Figure 2 The pyramidal cell.



Organization of the Human Brain

Neural Microcircuit

An assembly of synapses organized into patterns of connectivity to produce a functional operation of interest.

Local circuits

Groups of neurons with similar or different properties that perform operations characteristic of a localized region in the brain.

Interregional circuits

Pathways, columns, and topographic maps, which involve multiple regions located in different parts of the brain.

Topographic maps are organized to respond to

incoming sensory information


Figure 3 Structural organization of levels in the brain.



Modularity of the Human Brain


Figure 4 Cytoarchitectural map of the cerebral cortex. The different areas are identified by

the thickness of their layers and types of cells within them. Some of the key sensory areas

are as follows: Motor cortex: motor strip, area 4; premotor area, area 6; frontal eye fields,

area 8. Somatosensory cortex: areas 3, 1, and 2. Visual cortex: areas 17, 18, and 19.

Auditory cortex: areas 41 and 42. (From A. Brodal, 1981; with permission of Oxford

University Press.)



Brief History of ANN

1943: McCulloch and Pitts proposed the McCulloch-Pitts neuron model.

1949: Hebb published his book The Organization of Behavior, in which the Hebbian learning rule was proposed.

1958: Rosenblatt introduced the simple single-layer networks now called Perceptrons.

1969: Minsky and Papert’s book Perceptrons demonstrated the limitations of single-layer perceptrons, and almost the whole field went into hibernation.

1982: Hopfield published a series of papers on Hopfield networks.

1982: Kohonen developed the Self-Organising Maps that now bear his name.

1986: The Back-Propagation learning algorithm for Multi-Layer Perceptrons was rediscovered and the whole field took off again.

1990s: The sub-field of Radial Basis Function Networks was developed.

2000s: The power of Ensembles of Neural Networks and Support Vector Machines becomes apparent.



Models of a Neuron

The artificial neuron is made up of three basic elements:

A set of synapses, or connecting links, each of which is characterized by a weight or strength of its own

An adder for summing the input signals, weighted by the respective synaptic weight.

An activation function for limiting the amplitude of the output of a neuron. (Also called squashing function)


Figure 5 Nonlinear model of a

neuron, labeled k.



Models of a Neuron

In mathematical terms:

The output of the summing function is the linear combiner output:

$$u_k = \sum_{j=1}^{m} w_{kj} x_j$$

and the final output signal of the neuron:

$$y_k = \varphi(u_k + b_k)$$

where $\varphi(\cdot)$ is the activation function.
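The two equations for the linear combiner and the neuron output can be sketched in a few lines of Python. The function name `neuron_output`, the choice of `np.tanh` as a default activation, and the sample numbers are assumptions made only for illustration.

```python
import numpy as np

def neuron_output(x, w, b, phi=np.tanh):
    """Output y_k of neuron k for inputs x, synaptic weights w, and bias b."""
    u = np.dot(w, x)      # linear combiner output u_k = sum_j w_kj * x_j
    return phi(u + b)     # y_k = phi(u_k + b_k)

# Illustrative values (assumed): m = 3 inputs.
x = np.array([0.5, -1.0, 2.0])
w = np.array([0.2, 0.4, 0.1])
b = 0.6
y = neuron_output(x, w, b)
print(y)
```

Here the linear combiner output is -0.1, so the activation function is applied to 0.5.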



Effect of a Bias

The use of a bias $b_k$ has the effect of applying an affine transformation to the output $u_k$ of the linear combiner:

$$v_k = u_k + b_k$$

That is, the bias changes the relation between the induced local field (or activation potential) $v_k$ and the linear combiner output $u_k$, as shown in Fig. 6.

Figure 6 Affine transformation produced by the presence of a bias; note that $v_k = b_k$ at $u_k = 0$.



Models of a Neuron

Then we may write:

$$v_k = \sum_{j=0}^{m} w_{kj} x_j$$

and

$$y_k = \varphi(v_k)$$

where

$$x_0 = +1 \quad \text{and} \quad w_{k0} = b_k$$

Figure 7 Another nonlinear model of a neuron; $w_{k0}$ accounts for the bias $b_k$.
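A short check of this reformulation: treating the bias as an extra weight $w_{k0} = b_k$ on a fixed input $x_0 = +1$ should give the same induced local field as adding the bias explicitly. The function names and numbers below are hypothetical and serve only as a sketch.

```python
import numpy as np

def local_field_explicit_bias(x, w, b):
    """v_k = sum_{j=1..m} w_kj * x_j + b_k."""
    return np.dot(w, x) + b

def local_field_augmented(x, w, b):
    """v_k = sum_{j=0..m} w_kj * x_j with x_0 = +1 and w_k0 = b_k."""
    x_aug = np.concatenate(([1.0], x))   # prepend the fixed input x_0 = +1
    w_aug = np.concatenate(([b], w))     # prepend the weight w_k0 = b_k
    return np.dot(w_aug, x_aug)

x = np.array([0.5, -1.0, 2.0])
w = np.array([0.2, 0.4, 0.1])
b = 0.6
v_explicit = local_field_explicit_bias(x, w, b)
v_augmented = local_field_augmented(x, w, b)
print(v_explicit, v_augmented)
```

The augmented form is convenient in code because the bias can then be learned by the same update rule as the other weights.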



Types of Activation Function

Threshold function, also called the Heaviside function:

$$\varphi(v) = \begin{cases} 1 & \text{if } v \ge 0 \\ 0 & \text{if } v < 0 \end{cases}$$

This model is the McCulloch and Pitts neuron model. That is, the neuron produces an output signal only if its activation potential is non-negative, a property known as all-or-none.

Figure 8 (a) Threshold function.
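The all-or-none behaviour can be illustrated with a small sketch of a McCulloch-Pitts neuron. The weights and bias below are assumptions chosen so that the neuron behaves like a logical AND of two binary inputs; they are not taken from the slides.

```python
import numpy as np

def threshold(v):
    """Heaviside threshold function: 1 if v >= 0, otherwise 0."""
    return np.where(np.asarray(v) >= 0, 1, 0)

def mcculloch_pitts(x, w, b):
    """All-or-none neuron: thresholds the induced local field w.x + b."""
    return int(threshold(np.dot(w, x) + b))

# Assumed parameters: with unit weights and bias -1.5 the neuron
# fires only when both inputs are 1.
w = np.array([1.0, 1.0])
b = -1.5
patterns = [(0, 0), (0, 1), (1, 0), (1, 1)]
outputs = [mcculloch_pitts(np.array(p, dtype=float), w, b) for p in patterns]
print(outputs)
```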




Sigmoid function

The most commonly used activation function. It is a strictly increasing function that exhibits a graceful balance between linear and nonlinear behavior:

$$\varphi(v) = \frac{1}{1 + e^{-av}}$$

where $a$ is the slope parameter.

This function is differentiable, which is an important feature for neural network theory.

Figure 8(b) Sigmoid function for varying slope parameter a.
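The effect of the slope parameter $a$, and the differentiability of the sigmoid, can be explored with the sketch below. The function names are hypothetical; the derivative formula $a\,\varphi(v)\,(1 - \varphi(v))$ follows directly from differentiating the logistic function.

```python
import numpy as np

def sigmoid(v, a=1.0):
    """Logistic sigmoid with slope parameter a."""
    return 1.0 / (1.0 + np.exp(-a * v))

def sigmoid_derivative(v, a=1.0):
    """Derivative of the sigmoid with respect to v: a * phi * (1 - phi)."""
    s = sigmoid(v, a)
    return a * s * (1.0 - s)

v = np.linspace(-5.0, 5.0, 11)
for a in (0.5, 1.0, 5.0):
    # Larger a gives a steeper transition around v = 0.
    print(a, np.round(sigmoid(v, a), 3))
```

As $a$ grows, the sigmoid approaches the threshold function, which is one way to see it as a smooth, differentiable relative of the McCulloch-Pitts model.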




Other activation functions

The signum function, which is an odd function of its activation potential $v$:

$$\varphi(v) = \begin{cases} 1 & \text{if } v > 0 \\ 0 & \text{if } v = 0 \\ -1 & \text{if } v < 0 \end{cases}$$

The hyperbolic tangent function, which allows for an odd sigmoid-type function:

$$\varphi(v) = \tanh(v)$$
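Both of these functions take values in the range -1 to +1 and are odd, that is, $\varphi(-v) = -\varphi(v)$. A minimal numerical check, using NumPy's built-in `np.sign` and `np.tanh` and an arbitrary set of test points:

```python
import numpy as np

def signum(v):
    """Signum function: +1 for v > 0, 0 for v = 0, -1 for v < 0."""
    return np.sign(v)

v = np.array([-2.0, -0.5, 0.0, 0.5, 2.0])   # arbitrary test points

print(signum(v))     # signum values at the test points
print(np.tanh(v))    # smooth, odd, sigmoid-shaped values in (-1, 1)

# Oddness check: phi(-v) == -phi(v) for both functions.
signum_is_odd = np.array_equal(signum(-v), -signum(v))
tanh_is_odd = np.allclose(np.tanh(-v), -np.tanh(v))
print(signum_is_odd, tanh_is_odd)
```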



Stochastic Model of a Neuron

The previous neuron models are deterministic, that is, their output is precisely known for any input signal.

In some applications it is desirable to have a

stochastic neuron model.

That is, the neuron is permitted to reside in only one

of two states; +1 and -1, say. The decision for a

neuron to fire (i.e., switch its state from “off” to

“on”) is probabilistic.



Stochastic Model of a Neuron

Let $x$ denote the state of the neuron and $P(v)$ denote the probability of firing, where $v$ is the activation potential. Then we may write:

$$x = \begin{cases} +1 & \text{with probability } P(v) \\ -1 & \text{with probability } 1 - P(v) \end{cases}$$

A standard choice of $P(v)$ is the sigmoid function

$$P(v) = \frac{1}{1 + e^{-v/T}}$$

where $T$ is a parameter used to control the noise level and therefore the uncertainty in firing. When $T \to 0$, the model reduces to the deterministic model.
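The stochastic model can be simulated as follows. This is only a sketch: the function names, the seed, and the sample size are assumptions, and the random number generator is NumPy's `default_rng`.

```python
import numpy as np

def firing_probability(v, T=1.0):
    """P(v) = 1 / (1 + exp(-v / T)); T controls the noise level."""
    return 1.0 / (1.0 + np.exp(-v / T))

def stochastic_neuron(v, T=1.0, rng=None):
    """Return state +1 with probability P(v), otherwise -1."""
    rng = np.random.default_rng() if rng is None else rng
    return 1 if rng.random() < firing_probability(v, T) else -1

rng = np.random.default_rng(0)   # fixed seed so the run is repeatable
v = 0.5
states = np.array([stochastic_neuron(v, T=1.0, rng=rng) for _ in range(10000)])
fraction_on = np.mean(states == 1)
print(fraction_on, firing_probability(v, T=1.0))

# For small T the firing probability approaches 0 or 1 (deterministic limit).
print(firing_probability(v, T=0.01), firing_probability(-v, T=0.01))
```

The observed fraction of +1 states should be close to $P(v)$, and shrinking $T$ pushes $P(v)$ toward a hard threshold on the sign of $v$.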



Neural Networks Viewed as Directed Graph

Figures 5 and 7 are functional descriptions of a neuron. We can use a signal-flow graph (a network of directed links that are interconnected at points called nodes) to describe the network using these rules:

A signal flows along a link only in the direction defined by the arrow on the link.

A node signal equals the algebraic sum of all signals entering the node.

The signal at a node is transmitted to each outgoing link originating from that node.

Figure 9 Illustrating basic rules for the construction of signal-flow graphs.




A Signal-Flow graph of a neuron:


Figure 10 Signal-flow graph of a neuron.




An Architectural graph of a neuron:


Figure 11 Architectural graph of a neuron.




We have three graphical representations of a neural network.

Block diagram (Functional)

Complete directed graph (Signal-Flow)

Partially Complete directed graph (Architectural)




Feedback

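Feedback is said to exist in a dynamic system whenever the output of a node influences in part the input of that particular node. (Very important in the study of recurrent networks.)

For the single-loop feedback system of Fig. 12, with forward operator $\mathbf{A}$ and feedback operator $\mathbf{B}$:

$$y_k(n) = \mathbf{A}[x_j'(n)]$$

and

$$x_j'(n) = x_j(n) + \mathbf{B}[y_k(n)]$$

Eliminating $x_j'(n)$ gives the closed-loop relation:

$$y_k(n) = \frac{\mathbf{A}}{1 - \mathbf{A}\mathbf{B}}[x_j(n)]$$

Figure 12 Signal-flow graph of a single-loop feedback system.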
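To make the closed-loop relation concrete, the sketch below assumes one common special case that is not stated on the slide: the forward operator $\mathbf{A}$ is a fixed weight `w` and the feedback operator $\mathbf{B}$ is a unit delay. Under that assumption the loop becomes $y_k(n) = w\,[x_j(n) + y_k(n-1)]$, and expanding $\mathbf{A}/(1 - \mathbf{A}\mathbf{B})$ as a power series gives $y_k(n) = \sum_{l=0}^{n} w^{l+1} x_j(n-l)$ for a signal that starts at $n = 0$. The function names are hypothetical.

```python
import numpy as np

def feedback_response(x, w):
    """Simulate the loop with A = fixed weight w and B = unit delay (assumed)."""
    y = np.zeros(len(x))
    y_prev = 0.0                      # y_k(-1) = 0: the system starts at rest
    for n, x_n in enumerate(x):
        x_prime = x_n + y_prev        # x'_j(n) = x_j(n) + B[y_k(n)]
        y[n] = w * x_prime            # y_k(n) = A[x'_j(n)]
        y_prev = y[n]
    return y

def series_response(x, w):
    """Same output from the series y_k(n) = sum_l w**(l+1) * x_j(n - l)."""
    return np.array([
        sum(w ** (l + 1) * x[n - l] for l in range(n + 1))
        for n in range(len(x))
    ])

x = np.ones(20)                       # unit-step input (assumed test signal)
y_loop = feedback_response(x, w=0.5)
y_series = series_response(x, w=0.5)
print(y_loop[-1])                     # approaches w / (1 - w) for |w| < 1
```

With $|w| < 1$ the response settles, which illustrates why the magnitude of the feedback weight matters for stability.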




Single-layer Feedforward Networks

Note that: we do not count the input layer of the source nodes because no computation is performed there.


Figure 15 Feedforward network

with a single layer of neurons.




Multilayer Feedforward Networks

Input layer (source nodes)

One or more hidden layers

One output layer

The network in Fig. 16 is referred to as a 10-4-2 network: 10 source nodes, 4 hidden neurons, and 2 output neurons.

Could be fully or partially connected.

By adding one or more hidden layers, the network is enabled to extract higher-order statistics from its input.

Figure 16 Fully connected feedforward network with one hidden layer and one output layer.
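A forward pass through a fully connected 10-4-2 network can be sketched as below. The random weights, zero biases, and the use of `np.tanh` in both layers are assumptions for illustration only; the slides do not specify them.

```python
import numpy as np

def forward(x, W1, b1, W2, b2, phi=np.tanh):
    """Forward pass: source nodes -> hidden layer -> output layer."""
    h = phi(W1 @ x + b1)       # outputs of the 4 hidden neurons
    return phi(W2 @ h + b2)    # outputs of the 2 output neurons

rng = np.random.default_rng(0)
W1 = rng.normal(size=(4, 10))  # hidden layer: 4 neurons, 10 inputs each
b1 = np.zeros(4)
W2 = rng.normal(size=(2, 4))   # output layer: 2 neurons, 4 inputs each
b2 = np.zeros(2)

x = rng.normal(size=10)        # one input pattern on the 10 source nodes
y = forward(x, W1, b1, W2, b2)
n_weights = W1.size + W2.size  # synaptic weights in the fully connected case
print(y.shape, n_weights)
```

Note that the source nodes perform no computation, so only the hidden and output layers carry weights.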




Recurrent Networks

A recurrent neural network has at least one feedback loop.

Self-feedback refers to a situation where the output of a node is fed back into its own input.

Figure 17 Recurrent network with no self-

feedback loops and no hidden neurons.




Recurrent Networks

A recurrent neural network with self-feedback loops and with hidden neurons.

Figure 18 Recurrent network with hidden neurons.




Knowledge refers to

stored information or models used by a person or

machine to interpret, predict, and approximately respond

to the outside world.

Characteristics of knowledge representation:

What information is actually made explicit?

How is the information physically encoded for subsequent use?




The major task for a neural network is

to learn a model of the world (environment) in which it is embedded, and to maintain the model sufficiently consistent with the real world so as to achieve the specific goals of the application of interest.

Knowledge of the world consists of:

The known world state, represented by facts (prior information).

Observations of the world, obtained by sensors (measurements). These observations are used to train the neural network.




(Commonsense) Rules of Knowledge Representation

Figure 19 Illustrating the relationship between inner product and Euclidean distance as measures of similarity between patterns.

Rule 1: Similar inputs from similar classes should usually produce similar representations.

Rule 2: Items to be categorized as separate classes should be given widely different representations.

Rule 3: If a particular feature is important, then a large number of neurons should be involved in its representation.

Rule 4: Prior information and invariances should be built into the design of a NN when they are available.
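The two similarity measures named in the Figure 19 caption are closely related: for unit-length vectors, $\|x_i - x_j\|^2 = 2 - 2\,x_i^T x_j$, so a small Euclidean distance corresponds to a large inner product. The sketch below checks this numerically; the example vectors and helper names are assumptions.

```python
import numpy as np

def unit(x):
    """Scale a vector to unit Euclidean length."""
    return x / np.linalg.norm(x)

def euclidean_distance(xi, xj):
    """d(xi, xj) = ||xi - xj||; a smaller value means more similar."""
    return np.linalg.norm(xi - xj)

def inner_product(xi, xj):
    """xi^T xj; a larger value means more similar (for unit vectors)."""
    return float(np.dot(xi, xj))

xi = unit(np.array([1.0, 2.0, 3.0]))
xj = unit(np.array([1.0, 2.5, 2.5]))

d_squared = euclidean_distance(xi, xj) ** 2
ip = inner_product(xi, xj)
print(d_squared, 2.0 - 2.0 * ip)   # the two values should agree
```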




How to Build Prior Information into Neural Network Design?

Figure 20 Illustrating the combined use of a receptive field and weight sharing. All four hidden neurons share exactly the same set of weights for their six synaptic connections.

Currently, there are no well-defined rules, but rather some ad hoc procedures:

Restricting the network architecture, which is achieved through the use of local connections (receptive fields).

Constraining the choice of synaptic weights, which is implemented through the use of weight sharing.

NNs that make use of these two techniques are called Convolutional Networks.
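The combination of receptive fields and weight sharing can be sketched as a sliding weighted sum. Following the Figure 20 caption, the example uses four hidden neurons that share one set of six weights; the input length of 9 (chosen so that four six-input windows shifted by one node fit exactly), the weight values, and the function name are assumptions.

```python
import numpy as np

def shared_weight_layer(x, w):
    """Induced local fields of hidden neurons that share the weight vector w.

    Hidden neuron j sees only the window x[j : j + len(w)] (its receptive
    field) and applies the same weights as every other hidden neuron.
    """
    n_hidden = len(x) - len(w) + 1
    return np.array([np.dot(w, x[j:j + len(w)]) for j in range(n_hidden)])

x = np.arange(1.0, 10.0)                          # 9 source nodes (assumed)
w = np.array([0.1, -0.2, 0.3, 0.0, 0.5, -0.1])    # 6 shared weights (assumed)
v = shared_weight_layer(x, w)
print(v)

# The same result via NumPy's sliding correlation.
v_check = np.correlate(x, w, mode="valid")
```

Only six free parameters are learned here, instead of the 36 that four unconstrained six-input neurons would need.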




How to Build Invariance into Neural Network Design?

Invariance by Structure

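For example, enforcing invariance to in-plane rotation of an image about its center by making $w_{ji} = w_{jk}$ for all pixels $i$ and $k$ that lie at equal distances from the center.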

Invariance by Training

Including different examples of each class.

Invariant Feature Space

Transforming the data to invariant feature space (Fig. 21).


Figure 21 Block diagram of an invariant-feature-space type of system.
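As one hedged illustration of an invariant feature space, the sketch below uses the magnitude of the discrete Fourier transform, which does not change when a signal is circularly shifted. This particular feature extractor is an assumption chosen for the example; the slides do not say which transformation is used.

```python
import numpy as np

def shift_invariant_features(x):
    """Magnitude spectrum: unchanged by circular shifts of x."""
    return np.abs(np.fft.fft(x))

x = np.array([0.0, 1.0, 3.0, 2.0, 0.0, 0.0, 0.0, 0.0])
x_shifted = np.roll(x, 3)            # same pattern, moved by three samples

f = shift_invariant_features(x)
f_shifted = shift_invariant_features(x_shifted)
print(np.allclose(x, x_shifted))     # the raw inputs differ
print(np.allclose(f, f_shifted))     # the extracted features match
```

A classifier fed with such features no longer has to learn the shift invariance itself, which is the point of placing an invariant feature extractor in front of the network.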



Neural Networks vs. Pattern Classifiers

Pattern Classifiers: We should have prior knowledge about the data to be able to model it explicitly.

Neural Networks: The data set speaks for itself. The NN learns the model implicit in the data.

Pattern classifier: Data -> (Explicit) Model Building -> Training

Neural network: Data -> Training



Learning Processes

Supervised Learning: Learning with a Teacher


Figure 24 Block diagram of learning with a teacher; the part of the

figure printed in red constitutes a feedback loop.



Learning Processes

Unsupervised Learning: Learning without a Teacher


Figure 26 Block diagram of unsupervised learning.



Learning Processes

Reinforcement Learning: Learning without a Teacher


Figure 25 Block diagram of reinforcement learning; the learning

system and the environment are both inside the feedback loop.



Learning Tasks

Pattern Association


Figure 27 Input–output relation of pattern

associator.



Learning Tasks

Pattern Recognition


Figure 28 Illustration of the classical approach to pattern classification.



Learning Tasks

Function Approximation: System Identification


Figure 29 Block diagram of system identification:

The neural network, doing the identification, is

part of the feedback loop.



Learning Tasks

Function Approximation: Inverse Modeling


Figure 30 Block diagram of inverse system modeling. The neural network,

acting as the inverse model, is part of the feedback loop.



Learning Tasks

Control


Figure 31 Block diagram of feedback control system.

Next Time

Rosenblatt’s Perceptron
