CS621: Artificial Intelligence

28
CS621: Artificial Intelligence Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 45– Backpropagation issues; applications 11 th Nov, 2010

description

CS621: Artificial Intelligence. Pushpak Bhattacharyya CSE Dept., IIT Bombay Lecture 45– Backpropagation issues; applications 11 th Nov, 2010. Backpropagation algorithm. …. Output layer (m o/p neurons). j. w ji. Fully connected feed forward network - PowerPoint PPT Presentation

Transcript of CS621: Artificial Intelligence

Page 1: CS621: Artificial Intelligence

CS621: Artificial IntelligencePushpak Bhattacharyya

CSE Dept., IIT Bombay

Lecture 45– Backpropagation issues; applications11th Nov, 2010

Page 2: CS621: Artificial Intelligence

Backpropagation algorithm

Fully connected feed forward network Pure FF network (no jumping of

connections over layers)

Hidden layers

Input layer (n i/p neurons)

Output layer (m o/p neurons)

j

iwji

….….….….

Page 3: CS621: Artificial Intelligence

Gradient Descent Equations

iji

jji

j

thj

ji

j

jji

jiji

jownet

jw

jnetE

netwnet

netE

wE

wEw

)layer j at theinput (

)10 rate, learning(

Page 4: CS621: Artificial Intelligence

Backpropagation – for outermost layer

ijjjjji

jjjj

m

ppp

thj

j

j

jj

ooootw

oootj

otE

netneto

oE

netEj

)1()(

))1()(( Hence,

)(21

)layer j at theinput (

1

2

Page 5: CS621: Artificial Intelligence

Backpropagation for hidden layers

Hidden layers

Input layer (n i/p neurons)

Output layer (m o/p neurons)j

i

….….….….

k

k is propagated backwards to find value of j

Page 6: CS621: Artificial Intelligence

Backpropagation – for hidden layers

ijjk

kkj

jjk

kjkj

jjk j

k

k

jjj

j

j

jj

iji

ooow

oow

ooonet

netE

oooE

neto

oE

netEj

jow

)1()(

)1()( Hence,

)1()(

)1(

layernext

layernext

layernext

Page 7: CS621: Artificial Intelligence

General Backpropagation Rule

ijjk

kkj ooow )1()(layernext

)1()( jjjjj ooot

iji jow • General weight updating rule:

• Where

for outermost layer

for hidden layers

Page 8: CS621: Artificial Intelligence

How does it work? Input propagation forward and

error propagation backward (e.g. XOR)

w2=1w1=1θ = 0.5

x1x2 x1x2

-1x1 x2

-11.51.5

1 1

Page 9: CS621: Artificial Intelligence

Local MinimaDue to the Greedy

nature of BP, it can get stuck in local minimum m and will never be able to reach the global minimum g as the error can only decrease by weight change.

Page 10: CS621: Artificial Intelligence

Momentum factor1. Introduce momentum factor.

Accelerates the movement out of the trough.

Dampens oscillation inside the trough. Choosing β : If β is large, we may jump

over the minimum.

iterationthnjiijiterationnthji wOw )1()()(

Page 11: CS621: Artificial Intelligence

Symmetry breaking If mapping demands different weights, but we

start with the same weights everywhere, then BP will never converge.

w2=1w1=1θ = 0.5

x1x2 x1x2

-1x1 x2

-11.51.5

1 1

XOR n/w: if we sstarted with identicalweight everywhere, BPwill not converge

Page 12: CS621: Artificial Intelligence

Backpropagation Applications

Page 13: CS621: Artificial Intelligence

Problem defined

Decided by trial error

Problem defined

O/P layer

Hidden layer

I/P layer

Feed Forward Network Architecture

Page 14: CS621: Artificial Intelligence

Digit Recognition Problem Digit recognition:

7 segment display Segment being on/off defines a digit

1

2

3

6

7

45

Page 15: CS621: Artificial Intelligence

9O 8O 7O . . . 2O 1O

Hidden layer

Full connection

Full connection

7O 6O 5O . . . 2O 1OSeg-7 Seg-6 Seg-5 Seg-2 Seg-1

Page 16: CS621: Artificial Intelligence

Example - Character Recognition Output layer – 26 neurons (all

capital) First output neuron has the

responsibility of detecting all forms of ‘A’

Centralized representation of outputs

In distributed representations, all output neurons participate in output

Page 17: CS621: Artificial Intelligence

An application in Medical Domain

Page 18: CS621: Artificial Intelligence

Expert System for Skin Diseases Diagnosis Bumpiness and scaliness of skin Mostly for symptom gathering and

for developing diagnosis skills Not replacing doctor’s diagnosis

Page 19: CS621: Artificial Intelligence

Architecture of the FF NN 96-20-10 96 input neurons, 20 hidden layer

neurons, 10 output neurons Inputs: skin disease symptoms and their

parameters Location, distribution, shape, arrangement,

pattern, number of lesions, presence of an active norder, amount of scale, elevation of papuls, color, altered pigmentation, itching, pustules, lymphadenopathy, palmer thickening, results of microscopic examination, presence of herald pathc, result of dermatology test called KOH

Page 20: CS621: Artificial Intelligence

Output 10 neurons indicative of the

diseases: psoriasis, pityriasis rubra pilaris,

lichen planus, pityriasis rosea, tinea versicolor, dermatophytosis, cutaneous T-cell lymphoma, secondery syphilis, chronic contact dermatitis, soberrheic dermatitis

Page 21: CS621: Artificial Intelligence

Training data Input specs of 10 model diseases

from 250 patients 0.5 is some specific symptom

value is not knoiwn Trained using standard error

backpropagation algorithm

Page 22: CS621: Artificial Intelligence

Testing Previously unused symptom and disease data

of 99 patients Result: Correct diagnosis achieved for 70% of

papulosquamous group skin diseases Success rate above 80% for the remaining

diseases except for psoriasis psoriasis diagnosed correctly only in 30% of

the cases Psoriasis resembles other diseases within the

papulosquamous group of diseases, and is somewhat difficult even for specialists to recognise.

Page 23: CS621: Artificial Intelligence

Explanation capability Rule based systems reveal the

explicit path of reasoning through the textual statements

Connectionist expert systems reach conclusions through complex, non linear and simultaneous interaction of many units

Analysing the effect of a single input or a single group of inputs would be difficult and would yield incor6rect results

Page 24: CS621: Artificial Intelligence

Explanation contd. The hidden layer re-represents the

data Outputs of hidden neurons are

neither symtoms nor decisions

Page 25: CS621: Artificial Intelligence

Figure : Explanation of dermatophytosis diagnosis using the DESKNET expert system.

5 (Dermatophytosis node)

0( Psoriasis node )

Disease diagnosis

-2.71

-2.48-3.46

-2.68

19

14

13

0

1.621.43

2.131.68

1.581.22

Symptoms & parametersDuration of lesions : weeks 0

1

6

10

36

171

95

96

Duration of lesions : weeks

Minimal itching

Positive KOH test

Lesions locatedon feet

Minimalincrease

in pigmentation

Positive test forpseudohyphae

And spores

Bias

Internalrepresentation

1.46

20 Bias

-2.86

-3.31

9 (Seborrheic dermatitis node)

Page 26: CS621: Artificial Intelligence

Discussion Symptoms and parameters

contributing to the diagnosis found from the n/w

Standard deviation, mean and other tests of significance used to arrive at the importance of contributing parameters

The n/w acts as apprentice to the expert

Page 27: CS621: Artificial Intelligence

Our work at IIT Bombay

Page 28: CS621: Artificial Intelligence

LanguageProcessing & Understanding

Information Extraction: Part of Speech tagging Named Entity Recognition Shallow Parsing Summarization

Machine Learning: Semantic Role labeling Sentiment Analysis Text Entailment (web 2.0 applications)Using graphical models, support vector machines, neural networks

IR: Cross Lingual Search Crawling Indexing Multilingual Relevance Feedback

Machine Translation: Statistical Interlingua Based EnglishIndian languages Indian languagesIndian languages Indowordnet

Resources: http://www.cfilt.iitb.ac.inPublications: http://www.cse.iitb.ac.in/~pb

Linguistics is the eye and computation thebody