
Slide 1

EE3J2 Data Mining

Lecture 15: Introduction to Artificial Neural Networks

Martin Russell

Slide 2

Objectives

Unsupervised and supervised learning

Modelling and discrimination

Introduction to Artificial Neural Networks (ANNs)

Slide 3

Unsupervised learning

So far we have looked at techniques which try to discover structure in ‘raw’ data – data with no information about classes:
– Gaussian Mixture Modelling
– Clustering

We treat the whole data set as a single entity, and try to discover underlying structure.

The analysis is unsupervised, and automatic learning of the structure of the data is unsupervised learning.

Slide 4

Supervised learning

In some cases additional information is available.

For example, for speech data we might know who was speaking, or what he or she said.

This is information about the class of each piece of data.

When the analysis is driven by class labels, it is called supervised learning.

Slide 5

Modelling and Discrimination

In supervised learning we can:
– Analyse the data for each class separately
– Try to discover how to distinguish between classes

We could apply GMM or clustering separately to model each class.

Alternatively, we could try to find a method to discriminate between the classes.

Slide 6

Modelling and Discrimination

[Figure: class models versus a decision boundary]

Slide 7

Discrimination

In the simplest cases we can discriminate between two classes using a class boundary.

Allocation of a point to a class depends on which side of the boundary it lies.

[Figure: a linear decision boundary and a non-linear decision boundary]

Slide 8

Artificial Neural Networks

There are many approaches to discrimination.

A common class of approaches is based on the idea of Artificial Neural Networks (ANNs).

Inspiration for the basic element of an ANN (the artificial neuron) comes from biology…

…but the analogy really stops there. ANNs are just a computational device for processing patterns – not “artificial brains”.

Slide 9

A model of a neuron

Slide 10

An Artificial Neuron

Simple artificial neuron. Basic idea:
– If the input to unit u4 is big enough, then the neuron ‘fires’
– Otherwise nothing happens

How do we calculate the input to u4?

[Figure: unit u4 with inputs i1, i2, i3 arriving on connections with weights w1,4, w2,4, w3,4]

Slide 11

Artificial Neuron (2)

Suppose that the inputs to units 1, 2 and 3 are i1, i2 and i3.

Then the input to u4 is:

$i_4 = i_1 w_{1,4} + i_2 w_{2,4} + i_3 w_{3,4}$

In general, for an artificial neuron with N input units, the input to unit k is:

$i_k = \sum_{n=1}^{N} i_n w_{n,k}$

[Figure: unit u4 with inputs i1, i2, i3 and weights w1,4, w2,4, w3,4]
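As a small illustrative sketch (mine, not from the slides – the function name is my own), this weighted sum is one line of Python:

```python
def unit_input(inputs, weights):
    """i_k = sum over n of i_n * w_(n,k): the weighted sum feeding unit k."""
    return sum(i * w for i, w in zip(inputs, weights))

print(unit_input([1.0, 2.0, 3.0], [0.5, -1.0, 2.0]))  # 0.5 - 2.0 + 6.0 = 4.5
```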

Slide 12

The ‘threshold’ activation function

The activation function decides whether the neuron should “fire”. A suitable activation function is the threshold function g:

$g(x) = \begin{cases} 1 & \text{if } x \ge 0 \\ 0 & \text{if } x < 0 \end{cases}$

The output of u4 is then:

$o_4 = g(i_4)$

[Figure: unit u4 with inputs i1, i2, i3 and weights w1,4, w2,4, w3,4]
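Continuing the earlier Python sketch (again my own illustration), a complete threshold unit is then:

```python
def g(x):
    """Threshold activation: 1 if x >= 0, otherwise 0."""
    return 1.0 if x >= 0 else 0.0

def neuron_output(inputs, weights):
    """o = g(sum of i_n * w_n) for a threshold unit."""
    return g(sum(i * w for i, w in zip(inputs, weights)))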

Slide 13

Other activation functions

Linear:

$g(x) = x$

Sigmoid:

$g(x) = \frac{1}{1 + e^{-kx}}$

[Figure: plot of the sigmoid activation function]
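Both functions are easy to sketch in Python; the gain parameter k defaults to 1 here, which is an assumption on my part, but it matches the numbers used in the worked examples later:

```python
import math

def linear(x):
    """Linear activation: g(x) = x."""
    return x

def sigmoid(x, k=1.0):
    """Sigmoid activation: g(x) = 1 / (1 + e^(-k*x))."""
    return 1.0 / (1.0 + math.exp(-k * x))

print(sigmoid(0.0))    # 0.5  (the sigmoid is centred on 0)
print(sigmoid(10.0))   # ~1.0 (saturates towards 1 for large inputs)
print(sigmoid(-10.0))  # ~0.0 (and towards 0 for large negative inputs)
```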

Slide 14

The ‘bias’

As described, the neuron will ‘fire’ only if its input is greater than or equal to 0.

We can change the point at which the neuron fires by introducing a bias.

This is an additional input unit whose input is fixed at 1.

[Figure: unit u4 with inputs i1, i2, i3, weights w1,4, w2,4, w3,4, plus a bias input fixed at 1 with weight wb,4]

Slide 15

How the bias works…

The artificial neuron ‘fires’ if the input to u4 is greater than or equal to 0, i.e. if:

$i_4 = i_1 w_{1,4} + i_2 w_{2,4} + i_3 w_{3,4} + w_{b,4} \ge 0$

Or, equivalently, if:

$i_1 w_{1,4} + i_2 w_{2,4} + i_3 w_{3,4} \ge -w_{b,4}$

So the bias weight moves the firing point from 0 to $-w_{b,4}$.
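Extending the earlier sketch with a bias term (again my own illustrative names):

```python
def neuron_fires(inputs, weights, bias_weight):
    """Fires iff i1*w1 + ... + iN*wN + bias_weight >= 0,
    equivalently iff the weighted sum >= -bias_weight."""
    total = sum(i * w for i, w in zip(inputs, weights)) + bias_weight
    return total >= 0
```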

Slide 16

Example (2D)

Suppose u has a threshold or sigmoid activation function, with inputs x and y, connection weights 3 and 1, and bias weight -2.

u will ‘fire’ if:

$3x + y - 2 \ge 0, \quad\text{i.e.}\quad y \ge -3x + 2$

[Figure: unit u with inputs x and y, weights 3 and 1, and a bias input fixed at 1 with weight -2]
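A quick numerical check of this firing condition, using a threshold activation (an illustrative sketch):

```python
def u_fires(x, y):
    """Threshold unit: weights 3 and 1, bias weight -2."""
    return 3 * x + 1 * y - 2 >= 0

print(u_fires(1, 1))   # True:  3 + 1 - 2 = 2 >= 0 (above the line y = -3x + 2)
print(u_fires(0, 0))   # False: -2 < 0 (below the line)
```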

Slide 17

Example (continued)

[Figure: the same unit drawn as a network – inputs x and y feed units u1 and u2, a bias unit u3 has input fixed at 1, and u4 receives them with weights 3, 1 and -2 – next to a plot of the boundary line y = -3x + 2, which crosses the x-axis at 2/3 and the y-axis at 2]

Slide 18

Example (continued)

Assume:
– Linear activation functions for units u1, u2 and u3
– Sigmoid activation function for u4

If the input to u1 is 2 and the input to u2 is 2, then:
– Input to u4 is 2 × 3 + 2 × 1 + 1 × (-2) = 6
– Hence output from u4 is g(6) = 0.998

If the input to u1 is -2 and the input to u2 is -2, then:
– Input to u4 is (-2) × 3 + (-2) × 1 + 1 × (-2) = -10
– Hence output from u4 is g(-10) = 4.54 × 10⁻⁵
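These values can be reproduced in a few lines (assuming a sigmoid with k = 1, which matches the printed outputs):

```python
import math

def g(x):
    # Sigmoid with k = 1 (an assumption; it matches the slide's values)
    return 1.0 / (1.0 + math.exp(-x))

def u4_input(i1, i2):
    # Weights 3 and 1, bias input fixed at 1 with bias weight -2
    return i1 * 3 + i2 * 1 + 1 * (-2)

print(g(u4_input(2, 2)))    # 0.99752... ~= 0.998
print(g(u4_input(-2, -2)))  # 4.5397...e-05 ~= 4.54e-05
```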

Slide 19

Example 2

A second unit, u4, has inputs x and y, connection weights 2 and -1, and bias weight -1.

It will ‘fire’ if:

$2x - y - 1 \ge 0, \quad\text{i.e.}\quad y \le 2x - 1$

[Figure: the unit with inputs x and y, weights 2 and -1, and a bias input fixed at 1 with weight -1, next to the boundary line y = 2x - 1, which crosses the x-axis at 1/2 and the y-axis at -1]
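For later reuse when the two units are combined, both firing conditions can be written as small predicates (my own framing, anticipating the u4/u5 naming of slide 21):

```python
def u4_fires(x, y):
    """Example 1: weights 3 and 1, bias weight -2 (fires above y = -3x + 2)."""
    return 3 * x + 1 * y - 2 >= 0

def u5_fires(x, y):
    """Example 2: weights 2 and -1, bias weight -1 (fires below y = 2x - 1)."""
    return 2 * x - 1 * y - 1 >= 0
```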

Slide 20

Combining 2 Artificial Neurons

[Figure: the two example units side by side – the first with inputs x and y, weights 3 and 1, and bias weight -2 (boundary intercepts 2/3 and 2), the second with weights 2 and -1 and bias weight -1 (boundary intercepts 1/2 and -1)]

Slide 21

Combining neurons – artificial neural networks

[Figure: a network in which inputs x and y enter units u1 and u2; hidden unit u4 receives them with weights 3 and 1 and bias weight -2; hidden unit u5 receives them with weights 2 and -1 and bias weight -1; output unit u6 receives the hidden outputs with weights 20 and -20 and bias weight -2]

Slide 22

Combining neurons

Input to u4 is 3 × x + 1 × y - 2

Input to u5 is 2 × x + (-1) × y - 1

When x = 3, y = 0:
– Input to u4 is 7, input to u5 is 5
– Output from u4 is g(7) ≈ 0.999, output from u5 is g(5) ≈ 0.993
– Input to u6 is 0.999 × 20 + 0.993 × (-20) - 2 ≈ -1.88
– Output from u6 is g(-1.88) ≈ 0.13

Slide 23

Outputs

  i1     i2     o6
   3      0    0.13
   0.5    2    1.00
   0.5   -2    0.00
  -1      0    0.06
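A short forward-pass sketch reproduces this table and the numbers on the previous slide (assuming sigmoid activations with k = 1 throughout, and linear input units):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def network_output(x, y):
    """Forward pass through the network on slide 21."""
    o4 = sigmoid(3 * x + 1 * y - 2)        # hidden unit u4
    o5 = sigmoid(2 * x - 1 * y - 1)        # hidden unit u5
    return sigmoid(20 * o4 - 20 * o5 - 2)  # output unit u6

for x, y in [(3, 0), (0.5, 2), (0.5, -2), (-1, 0)]:
    print(x, y, round(network_output(x, y), 2))
# 3 0 0.13 | 0.5 2 1.0 | 0.5 -2 0.0 | -1 0 0.06
```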

Slide 24

Combining neurons

[Figure: plot of the ‘firing region’ of the combined network – the region lying above both boundary lines, y = -3x + 2 and y = 2x - 1 (marked intercepts 2, 2/3 and -1)]

Slide 25

Single-layer Multi-Layer Perceptron (MLP)

[Figure: an MLP with an input layer, a single hidden layer and an output layer]

Slide 26

Single-Layer MLP

Can characterise arbitrary convex regions

Defines the region using linear decision boundaries

Slide 27

Two-layer MLP

[Figure: an MLP with two hidden layers]

Slide 28

Two-Layer MLP

An MLP with two hidden layers can characterise arbitrary shapes:
– The first hidden layer characterises convex regions
– The second hidden layer combines these convex regions

There is no advantage in having more than two hidden layers.

Slide 29

MLP training

To define an MLP we must decide:
– The number of layers
– The number of input units
– The number of hidden units
– The number of output units

Once these are defined, the properties of the MLP are completely determined by the values of the weights.

How do we choose the weight values?

Slide 30

MLP training (continued)

MLP weights are learnt automatically from training data.

We have already seen computational techniques for estimating:
– The parameters of GMMs
– Centroid positions in clustering

Similarly, there is an iterative computational technique for estimating the MLP weights – ‘Error Back-Propagation’.

Slide 31

Error Back-Propagation (EBP)

EBP is a ‘gradient descent’ method, like others we have seen.

The first stage is to choose initial values for the weights. The EBP algorithm then changes the weights incrementally to identify the class boundaries.

It is only guaranteed to find a local optimum.
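The slides do not give the EBP update equations, but the gradient-descent flavour can be sketched for a single sigmoid neuron trained on labelled 2D points. This is entirely my own illustration, with an arbitrary (hypothetical) learning rate lr:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_neuron(data, epochs=1000, lr=0.5):
    """Gradient descent on squared error for one sigmoid neuron.

    data: list of ((x, y), target) pairs with target 0 or 1.
    Returns the learnt weights [w1, w2, w_bias].
    """
    w = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        for (x, y), t in data:
            o = sigmoid(w[0] * x + w[1] * y + w[2])
            # dE/dw for E = (o - t)^2 / 2, via the chain rule:
            delta = (o - t) * o * (1 - o)
            w[0] -= lr * delta * x
            w[1] -= lr * delta * y
            w[2] -= lr * delta  # bias input is fixed at 1
    return w
```

Starting from initial weights and repeatedly applying small corrections is exactly the incremental behaviour described above; like the slide says, such a procedure can settle in a local optimum.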

Slide 32

Other types of ANN

Multi-Layer Perceptrons (MLP) are not the only types of ANNs

There are lots of others:– Radial Basis Function (RBF) networks

– Support Vector Machines (SVMs)

– …

There are also ANN interpretations of other methods

Slide 33

Summary

– Discrimination versus modelling
– Brief introduction to neural networks
– Definition of an ‘artificial neuron’
– Activation functions – linear and sigmoid
– Linear boundary defined by a single neuron
– Convex region defined by a single-layer MLP
– Two-layer MLPs