Hybrid Intelligence System for Data Imputation for Final Review


  • 8/9/2019 Hybrid Intelligence System for Data Imputation for Final Review


Hybrid Intelligence Systems For Data Imputation

    Chandan Gautam

    (12MCMB03)

    Under the guidance of

    Prof. V. Ravi


Outline

    Problem Statement

    Missing Data and their causes

    Data Imputation

    Literature Survey

    Proposed Method

    Results

    Conclusions

    References


Problem Statement

Developing Hybrid Intelligence Systems for Data Imputation based on Statistical and Machine Learning Techniques.


What is missing data?

In real-world scenarios, missing data is an inevitable and common problem across disciplines. It limits the ability of researchers to draw conclusions; even if results are obtained by simply deleting the records with missing values, those results may be biased and inappropriate. So, the missing values have to be imputed.

Age  Salary  Incentive
25   4000    ??
??   500     0
27   ??      50
82   2000    150
42   6500    1000

Literature Survey

N. Ankaiah, V. Ravi, A novel soft computing hybrid for data imputation, In Proceedings of the 7th International Conference on Data Mining (DMIN), Las Vegas, USA, 2011.

Mistry, J., Nelwamondo, F. V., & Marwala, T. (2009). Data estimation using principal component analysis and auto-associative neural networks, Journal of Systemics, Cybernetics and Informatics, Volume 7, pp. 72-79.

I. B. Aydilek, A. Arslan, A hybrid method for imputation of missing values using optimized fuzzy c-means with support vector regression and a genetic algorithm, Information Sciences, vol. 233, pp. 25-35, 2013.

Shichao Zhang, Nearest neighbor selection for iteratively kNN imputation, The Journal of Systems and Software, vol. 85(11), pp. 2541-2552, 2012.


Mean Imputation

Creating Missing Values and Mean Imputation

Initially, no missing data:

Age  Salary  Incentive
25   4000    200
34   500     0
27   1000    50
82   2000    150
42   6500    1000

After creating missing values (?? marks a deleted entry):

Age  Salary  Incentive
25   4000    ??
??   500     0
27   ??      50
82   2000    150
42   6500    1000

Mean imputation replaces each ?? with the mean of the observed values in its column: Age = 44, Salary = 3250, Incentive = 300.
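The column-mean step above can be sketched in a few lines of NumPy (an illustrative sketch, not the thesis code); note that the means are taken over the observed values only:

```python
import numpy as np

# The slide's toy table with missing entries (np.nan stands for "??").
data = np.array([
    [25.0,   4000.0, np.nan],
    [np.nan, 500.0,  0.0],
    [27.0,   np.nan, 50.0],
    [82.0,   2000.0, 150.0],
    [42.0,   6500.0, 1000.0],
])

# Column means computed over the observed (non-missing) values only.
col_means = np.nanmean(data, axis=0)

# Replace every missing entry with its column mean.
imputed = np.where(np.isnan(data), col_means, data)
print(col_means)  # column means: 44, 3250, 300
```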


Mean Imputation

Compute the mean absolute percentage error (MAPE) value (Flores, 1986):

    MAPE = (100 / n) · Σ_{i=1..n} |x_i − x̂_i| / x_i

where
n: number of missing values in a given dataset,
x̂_i: the value predicted by mean imputation for the missing value,
x_i: the actual value.
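A direct transcription of the MAPE formula (an illustrative sketch, not the original implementation):

```python
def mape(actual, predicted):
    """Mean absolute percentage error: (100/n) * sum(|x_i - xhat_i| / x_i)."""
    n = len(actual)
    return 100.0 / n * sum(abs(x - xh) / x for x, xh in zip(actual, predicted))

# A 10% error and a 20% error average to roughly 15%.
print(mape([100.0, 50.0], [110.0, 40.0]))  # ~15.0
```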


Result of Mean Imputation

Average MAPE value over 10 folds (Mean Imputation):

Auto mpg        59.7
Body fat        11.61
Boston Housing  37.77
Forest fires    24.728
Iris            23.57
Prima Indian    24.022
Spanish         55.53
Spectf          14.85
Turkish         66.007
UK bankruptcy   37.07
UK Credit       28.43
Wine            29.99

The error value is too high for most of the datasets, so other methods are needed.


Proposed Methods

Module I: PCA-AAELM Imputation; ECM-Imputation; ECM-AAELM Imputation
Module II: PSO-ECM Imputation; PSO-ECM + ECM-AAELM
Module III: CPAANN Imputation; Gray+PCA-AAELM; Gray+CPAANN

Overview of Extreme Learning Machine (ELM)


Architecture of ELM

Output of the hidden nodes: g(a_i · x + b_i)
a_i: the weight vector connecting the input nodes to the i-th hidden node.
b_i: the threshold of the i-th hidden node.

Output of the SLFN: f_m(x) = Σ_{i=1..m} β_i g(a_i · x + b_i)
β_i: the weight vector connecting the i-th hidden node to the output nodes.

Output weight: β = H†O, where H† is the Moore-Penrose inverse of the hidden-layer output matrix H.

Training: compute H = g(a·x) and solve H·β = O for the unknown β.
Testing: H_T = g(y·a); Output = H_T·β.
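The training and testing steps above can be sketched as follows; this is a minimal illustrative ELM on hypothetical toy data, not the implementation used in the thesis:

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(X, T, n_hidden=20):
    """ELM training: random input weights/biases, analytic output weights."""
    a = rng.normal(size=(X.shape[1], n_hidden))   # random input weights a_i
    b = rng.normal(size=n_hidden)                 # random thresholds b_i
    H = sigmoid(X @ a + b)                        # hidden-layer output matrix
    beta = np.linalg.pinv(H) @ T                  # Moore-Penrose solution of H beta = T
    return a, b, beta

def elm_predict(X, a, b, beta):
    return sigmoid(X @ a + b) @ beta

# Toy regression target: learn y = x1 + x2 from random inputs.
X = rng.uniform(-1, 1, size=(200, 2))
y = X.sum(axis=1, keepdims=True)
a, b, beta = elm_train(X, y, n_hidden=30)
err = np.mean((elm_predict(X, a, b, beta) - y) ** 2)
```

Because the output weights come from a single least-squares solve rather than iterative backpropagation, training is essentially instantaneous; the randomness of a and b is exactly what the later slides on PCA-AAELM and ECM-AAELM aim to stabilize.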

Table of Activation Functions


Architecture of AAELM (Auto-associative ELM)

Auto-encoders are feed-forward neural networks trained to recall the input space.


Ensembling of AAELM

Run AAELM 10 times independently on the same dataset to generate the ensemble.

Use three different probability distribution functions (uniform, normal, and logistic) to generate the weights, and two different activation functions (sigmoid and Gaussian) at the hidden layer.

The AAELM ensemble thus covers all six combinations of probability distribution and activation function.

Result of Ensembled AAELM

Average MAPE value over 10 folds (Ensembled AAELM)


Problems and Solutions of Ensembled AAELM

Drawbacks of AAELM:
The dependency of AAELM on randomness is high and significant, because each run of ELM yields different results; the results can sometimes fluctuate wildly.

Remedy:
We propose two new hybrid methods to stabilize the randomness of AAELM:
PCA-AAELM
ECM-AAELM

Proposed Method 1: PCA-AAELM

Architecture of PCA-AAELM

(Diagram comparing the traditional ELM architecture with PCA-AAELM.)

Results

Average MAPE value over 10 folds (PCA-AAELM)

Proposed Method 2: Evolving Clustering Method (ECM) based Imputation


Block Diagram of the Proposed Method

1. Split the dataset with missing values into complete and incomplete records.
2. Apply ECM clustering to the complete records to obtain the cluster centers.
3. For each incomplete record, find the nearest cluster center.
4. Impute the incomplete features with the corresponding features of the nearest cluster center, yielding a dataset without missing values.


How are missing values calculated with the help of cluster centers?

Cluster centers:
(1, 4, 2)
(1, 8, 9)
(3, 6, 1)
(5, 7, 3)
(0, 1, 2)

Incomplete record: (2, ?, 3). Squared distances over the observed features:
to (1, 4, 2): (2−1)² + (3−2)² = 2
to (1, 8, 9): (2−1)² + (3−9)² = 37
to (3, 6, 1): (2−3)² + (3−1)² = 5
to (5, 7, 3): (2−5)² + (3−3)² = 9
to (0, 1, 2): (2−0)² + (3−2)² = 5

The nearest center is (1, 4, 2), so the missing value is imputed as 4.
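The worked example can be reproduced with a short sketch (the helper `impute_from_centers` is hypothetical, not the thesis code); distances are computed over the observed features only:

```python
import numpy as np

def impute_from_centers(record, centers):
    """Fill np.nan entries of `record` with the corresponding features
    of the nearest cluster center (distance over observed features only)."""
    record = np.asarray(record, dtype=float)
    observed = ~np.isnan(record)
    # Squared Euclidean distance to each center using the observed features.
    d2 = ((centers[:, observed] - record[observed]) ** 2).sum(axis=1)
    nearest = centers[np.argmin(d2)]
    out = record.copy()
    out[~observed] = nearest[~observed]
    return out

centers = np.array([[1, 4, 2], [1, 8, 9], [3, 6, 1], [5, 7, 3], [0, 1, 2]], float)
print(impute_from_centers([2, np.nan, 3], centers))  # [2. 4. 3.]
```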


Results: Average MAPE value over 10 folds (ECM-Imputation)

Dataset         Mean    K-Means+MLP [Ankaiah & Ravi]  ECM-Imputation
Auto mpg        59.7    23.75   18.03
Body fat        11.61   7.83    6.31
Boston Housing  37.77   21.01   17.84
Forest fires    24.728  26.61   22.29
Iris            23.57   9.41    5.27
Prima Indian    24.022  29.7    27.16
Spanish         55.53   39.91   31.98
Spectf          14.85   12.14   10.21
Turkish         66.007  33.01   27.90
UK bankruptcy   37.07   30.96   46.14
UK Credit       28.43   32.17   27.40
Wine            29.99   21.58   15.61

Proposed Method 3: ECM-AAELM

Architecture of ECM-AAELM

(Diagram comparing the traditional ELM architecture with ECM-AAELM.)

Results

Average MAPE value over 10 folds (ECM-AAELM)

Behavior of PCA/ECM-AAELM on different activation functions

(Two bar charts of average MAPE per dataset, one for ECM-AAELM and one for PCA-AAELM, across the activation functions Sigmoid, Sinh, Cloglogm, Bsigmoid, Sine, Hardlim, Tribas, Radbas, Softplus, Gaussian and Rectifier.)

Influence of the Dthr value on MAPE results: ECM-AAELM

(Line chart of average MAPE against the Dthr value for each dataset: Auto MPG, Body fat, Boston housing, Forest fires, Iris, Prima indian, Spanish, Turkish, Spectf, UK Credit, UK bankruptcy, Wine.)

Module II:

Proposed Method 4: PSO-ECM

Block Diagram of the Proposed Method

1. Split the dataset containing incomplete records into complete records and incomplete records.
2. Initialize the PSO parameters and apply ECM with the initialized Dthr value; perform ECM imputation based on the nearest cluster center.
3. Compute the covariance matrix of the complete records (Ccov) and of the total records after imputation (Tcov), along with the determinants of Ccov and Tcov.
4. Compute the MSE between Ccov and Tcov and the absolute difference between det(Ccov) and det(Tcov). If the error is not yet minimal, invoke PSO to select a new Dthr value and repeat from step 2.
5. Once the parameter is optimized, apply ECM imputation with the Dthr value yielded by PSO; the dataset no longer contains incomplete records.
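The parameter search in step 4 can be illustrated with a bare-bones global-best PSO over a scalar Dthr. The objective below is a stand-in placeholder; in the proposed method it would be the covariance-based error described above:

```python
import numpy as np

rng = np.random.default_rng(1)

def pso_minimize(objective, lo=0.001, hi=0.999, n_particles=15, n_iter=60,
                 w=0.7, c1=1.5, c2=1.5):
    """Plain global-best PSO over a single scalar parameter (here: Dthr)."""
    x = rng.uniform(lo, hi, n_particles)          # particle positions
    v = np.zeros(n_particles)                     # particle velocities
    pbest = x.copy()                              # personal bests
    pbest_f = np.array([objective(p) for p in x])
    g = pbest[np.argmin(pbest_f)]                 # global best
    for _ in range(n_iter):
        r1, r2 = rng.random(n_particles), rng.random(n_particles)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (g - x)
        x = np.clip(x + v, lo, hi)
        f = np.array([objective(p) for p in x])
        better = f < pbest_f
        pbest[better], pbest_f[better] = x[better], f[better]
        g = pbest[np.argmin(pbest_f)]
    return g

# Stand-in objective with a known minimum at 0.3; in the thesis this would be
# the MSE between Ccov and Tcov plus |det(Ccov) - det(Tcov)| after imputation.
best = pso_minimize(lambda d: (d - 0.3) ** 2)
```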

Results: Average MAPE value over 10 folds (PSO-ECM based Imputation)

Dataset         Mean    K-Means+MLP [Ankaiah & Ravi]  ECM-Imputation  PSO-ECM
Auto mpg        59.7    23.75   18.03   15.34844
Body fat        11.61   7.83    6.31    4.96008
Boston Housing  37.77   21.01   17.84   14.49978
Forest fires    24.728  26.61   22.29   18.33909
Iris            23.57   9.41    5.27    4.82263
Prima Indian    24.022  29.7    27.16   24.57587
Spanish         55.53   39.91   31.98   20.73123
Spectf          14.85   12.14   10.21   9.85382
Turkish         66.007  33.01   27.9    19.28137
UK bankruptcy   37.07   30.96   46.14   30.97627
UK Credit       28.43   32.17   27.4    24.61695
Wine            29.99   21.58   15.61   12.75819

Proposed Method 5: PSO-ECM + ECM-AAELM

Proposed Model

(Architecture diagram: PSO-ECM + ECM-AAELM.)

Comparison of the results before and after selection of the optimal Dthr value

Dataset         Mean    K-Means+MLP [Ankaiah & Ravi]  ECM-AAELM (before)  PSO-ECM + ECM-AAELM (after)
Auto mpg        59.7    23.75   17.38   14.69
Body fat        11.61   7.83    5.33    4.64
Boston Housing  37.77   21.01   16.48   14.44
Forest fires    24.728  26.61   21.54   18.17
Iris            23.57   9.41    5.10    4.83
Prima Indian    24.022  29.7    23.95   23.96
Spanish         55.53   39.91   22.09   18.53
Spectf          14.85   12.14   8.05    8.18
Turkish         66.007  33.01   21.49   18.97
UK bankruptcy   37.07   30.96   40.06   28.66
UK Credit       28.43   32.17   26.85   24.79
Wine            29.99   21.58   14.88   12.60

Module III:

Proposed Method 6: CPAANN

Introduction to CPNN

CPNN combines a Kohonen SOM layer, trained by unsupervised competitive learning, with a supervised Grossberg outstar layer.

Semi-supervised learning: we added the concept of auto-associativity to CPNN and created the Counter Propagation Auto-associative Neural Network (CPAANN).


Proposed Method 7: Gray + PCA-AAELM

Proposed Method

Stage I: Gray distance based nearest neighbor imputation.
Stage II: PCA-AAELM based imputation.

Results of PCA-AAELM with Mean Imputation and with Gray Distance based Imputation

Dataset         Mean    K-Means+MLP [Ankaiah & Ravi]  PCA-AAELM with Mean Imputation  PCA-AAELM with Gray Distance based Imputation
Auto mpg        59.7    23.75   28.63   16.92
Body fat        11.61   7.83    6.01    5.41
Boston Housing  37.77   21.01   20.9    17.46
Forest fires    24.728  26.61   19.41   20.89
Iris            23.57   9.41    10.23   5.79
Prima Indian    24.022  29.7    22.06   22.03
Spanish         55.53   39.91   30.09   28.06
Spectf          14.85   12.14   9.11    8.38
Turkish         66.007  33.01   30.18   27.38
UK bankruptcy   37.07   30.96   37.7    37.95
UK Credit       28.43   32.17   25.27   27.79
Wine            29.99   21.58   16.6    14.78

Proposed Method 8: Gray + CPAANN

Proposed Method

Stage I: Gray distance based nearest neighbour imputation.
Stage II: CPAANN based imputation.

Results of CPAANN with Mean Imputation and with Gray Distance based Imputation

Dataset         Mean    K-Means+MLP [Ankaiah & Ravi]  CPAANN with Mean Imputation  CPAANN with Gray Distance based Imputation
Auto mpg        59.7    23.75   18.32   15.31
Body fat        11.61   7.83    5.25    4.71
Boston Housing  37.77   21.01   14.86   15.01
Forest fires    24.728  26.61   16.97   17.91
Iris            23.57   9.41    6.51    4.03
Prima Indian    24.022  29.7    18.21   19.34
Spanish         55.53   39.91   17.13   14.21
Spectf          14.85   12.14   8.61    8.53
Turkish         66.007  33.01   16.07   17.37
UK bankruptcy   37.07   30.96   21.96   20.58
UK Credit       28.43   32.17   22.88   13.70
Wine            29.99   21.58   11.56   11.72

Comparison Between All Proposed Methods based on Average MAPE value over 10 folds

Dataset         PCA-AAELM  ECM-Imputation  ECM-AAELM  PSO-ECM  PSO-ECM+ECM-AAELM  Gray+PCA-AAELM  CPAANN  Gray+CPAANN
Auto mpg        28.63   18.03   17.38   15.35   14.39   16.92   18.32   15.31
Body fat        6.01    6.31    5.33    4.96    4.61    5.41    5.25    4.71
Boston Housing  20.90   17.84   16.48   14.50   14.18   17.46   14.86   15.01
Forest fires    19.41   22.29   21.54   18.34   17.66   20.89   16.97   17.91
Iris            10.23   5.27    5.10    4.82    4.75    5.79    6.51    4.03
Prima Indian    22.06   27.16   23.95   24.58   23.38   22.03   18.21   19.34
Spanish         30.09   31.98   22.09   20.73   16.99   28.06   17.13   14.21
Spectf          9.11    10.21   8.05    9.85    8.18    8.38    8.61    8.53
Turkish         30.18   27.90   21.49   19.28   16.49   27.38   16.07   17.37
UK bankruptcy   37.70   46.14   40.06   30.98   26.89   37.95   21.96   20.58
UK Credit       25.27   27.40   26.85   24.62   23.66   27.79   22.88   13.70
Wine            16.60   15.61   14.88   12.76   12.21   14.78   11.56   11.72

Conclusions

The results indicate that all the proposed methods provide significantly improved results compared to K-Means+MLP.

ECM-Imputation alone outperformed K-Means+MLP, which shows the powerful local learning capability of ECM.

ECM-AAELM yields higher accuracy than PCA-AAELM.

The output of ECM-AAELM depends primarily on the threshold value of ECM; it does not fluctuate wildly across activation functions.

In our experiments, selecting the optimal Dthr value always produced better imputation.

For PCA-AAELM, the Softplus activation function is recommended because it performed better than the other activation functions.

Gray distance based imputation performed better than mean imputation as a preprocessing step for most of the datasets.

List of Published and Communicated Research Papers

C. Gautam, V. Ravi, Evolving Clustering Based Data Imputation, 3rd IEEE Conference ICCPCT, Kanyakumari, Mar 21-22, 2014.

C. Gautam, V. Ravi, Data Imputation via Evolutionary Computation, Clustering and a Neural Network, to be communicated to IEEE Computational Intelligence Magazine (CIM).

A Hybrid Data Imputation Method based on Gray System Theory and Counterpropagation Auto-associative Neural Network, to be communicated to Neurocomputing.

Imputation of Missing Data Using PCA, Extreme Learning Machine and Gray System Theory, to be communicated to the 5th Joint International Conference on Swarm, Evolutionary and Memetic Computing (SEMCCO 2014).

    References

  • 8/9/2019 Hybrid Intelligence System for Data Imputation for Final Review

    50/77

    Data Imputation

    References

    Abdella, M., & Marwala, T. (2005). The use of genetic algorithms and neural

    networks to approximate missing data in database, IEEE 3rd International

    Conference on Computational Cybernetics, Mauritius, pp. 207-212.

    Mistry, J., Nelwamondo, F., V., & Marwala, T. (2009). Data estimation using

    principal component analysis and Auto associative neural networks, Journal ofSystemics, Cybernetics and Informatics, Volume 7, pp. 72-79 .

    Ankaiah, N., & Ravi, V. (2011). A novel soft computing hybrid for data

    imputation, International Conference on Data Mining, Las Vegas, USA.

    Vriens, M., & Melton, E. (2002). Managing missing data. Marketing Research,Volume 14, Issue 3, pp.1217.

    Naveen, N., Ravi, V., & Rao, C. R. (2010). Differential evolution trained radial

    basis function network: application to bankruptcy prediction in banks, International

    Journal of Bio-Inspired Computation (IJBIC), Volume 2, Issue 3, pp. 222-232.50

    References

    Data Imputation (Cont )

Data Imputation (Cont.)

Nelwamondo, F. V., Golding, D., & Marwala, T. (2013). A dynamic programming approach to missing data estimation using neural networks, Information Sciences, Volume 237, pp. 49-58.

Nishanth, K. J., Ankaiah, N., Ravi, V., & Bose, I. (2012). Soft computing based imputation and hybrid data and text mining: The case of predicting the severity of phishing alerts, Expert Systems with Applications, Volume 39, Issue 12, pp. 10583-10589.

K. J. Nishanth, V. Ravi, A Computational Intelligence Based Online Data Imputation Method: An Application For Banking, Journal of Information Processing Systems, vol. 9 (4), pp. 633-650, 2013.

M. Krishna, V. Ravi, Particle swarm optimization and covariance matrix based data imputation, IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), Enathi, 2013.

V. Ravi, M. Krishna, A new online data imputation method based on general regression auto associative neural network, Neurocomputing, vol. 138, pp. 207-212, 2014.

Extreme Learning Machine (ELM)

Huang, G. B., Zhu, Q., & Siew, C. (2006). Extreme Learning Machine: Theory and Applications, Neurocomputing, Volume 70, pp. 489-501.

Rajesh, R., & Siva, J. (2011). Extreme Learning Machine: A Review and State of Art, International Journal of Wisdom Based Computing, Volume 1, pp. 35-49.

Huang, G., Wang, D., & Lan, Y. (2011). Extreme Learning Machine: A Survey, International Journal of Machine Learning and Cybernetics, Volume 2, Issue 2, pp. 107-122.

Bartlett, P. (1997). For Valid Generalization, the Size of the Weights is More Important than the Size of the Network, Advances in Neural Information Processing Systems, Volume 9, pp. 134-140.

Huang, G., Chen, L., & Siew, C. (2006). Universal Approximation Using Incremental Constructive Feedforward Networks with Random Hidden Nodes, IEEE Transactions on Neural Networks, Volume 17, Issue 4, pp. 879-892.

Extreme Learning Machine (ELM) (Cont.)

Zhu, Q., Qin, A. K., Suganthan, P. N., & Huang, G. (2005). Evolutionary Extreme Learning Machine, Pattern Recognition, Volume 38, Issue 10, pp. 1759-1763.

Castaño, A., Fernández-Navarro, F., & Hervás-Martínez, C. (2013). PCA-ELM: A Robust and Pruned ELM Approach Based on PCA, Neural Processing Letters, Springer, Volume 37, Issue 3, pp. 377-392.

Huang, G. B., Zhou, H., Ding, X., & Zhang, R. (2012). Extreme Learning Machine for Regression and Multiclass Classification, IEEE Transactions on Systems, Man and Cybernetics, Volume 42, Issue 2, pp. 513-529.


Activation Function

Sibi, P., Jones, S., & Siddarth, P. (2013). Analysis of Different Activation Functions Using Back Propagation Neural Networks, Journal of Theoretical and Applied Information Technology, Volume 47, Issue 3, pp. 1264-1268.

Peng, J., Li, L., & Tang (2013). Combination of Activation Functions in Extreme Learning Machines for Multivariate Calibration, Chemometrics and Intelligent Laboratory Systems, Volume 120, pp. 53-58.

Gomes, G. S. S., Ludermir, T. B., & Lima, L. M. M. R. (2011). Comparison of new activation functions in neural network for forecasting financial time series, Neural Computing and Applications, Springer, Volume 20, Issue 3, pp. 417-439.

Asaduzzaman, Md., Shahjahan, M., & Murase, K. (2009). Faster Training Using Fusion of Activation Functions for Feed Forward Neural Networks, International Journal of Neural Systems, Volume 19, Issue 06, pp. 437-448.

Karlik, B., & Olgac, A. V. (2010). Performance Analysis of Various Activation Functions in Generalized MLP Architectures of Neural Networks, International Journal of Artificial Intelligence and Expert Systems, Volume 1, Issue 4, pp. 111-122.

Activation Function (Cont.), ECM, Cross Validation & PCA

Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep Sparse Rectifier Neural Networks, International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, USA, Volume 15, pp. 315-323.

Song, Q., & Kasabov, N. (2001). ECM: A Novel On-line, Evolving Clustering Method and Its Applications, Proceedings of the Fifth Biannual Conference on Artificial Neural Networks and Expert Systems, Berlin, pp. 87-92.

Refaeilzadeh, P., Tang, L., & Liu, H. (2009). Cross Validation, in Encyclopedia of Database Systems (EDBS), Springer, Volume 1, pp. 532-538.

Smith, L. (2002). A Tutorial on Principal Components Analysis.

Thank You


Experimental Design for PCA-AAELM and ECM-AAELM

10-fold cross validation was used in our experiments.

Both PCA-AAELM and ECM-AAELM have one user-defined parameter: for PCA it is the variance to retain (i.e., the eigenvalues), and for ECM it is the threshold value.

We fixed the activation function and varied the variance from 1 to 99 in PCA-AAELM and the threshold from 0.001 to 0.999 in ECM-AAELM, for each activation function, on the whole dataset.

We used 11 activation functions and compared their performances.
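The 10-fold split used throughout the experiments can be sketched as follows (illustrative only):

```python
import numpy as np

def kfold_indices(n, k=10, seed=0):
    """Shuffle n sample indices and split them into k folds."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)

folds = kfold_indices(100, k=10)
# Each fold in turn serves as the test set; the remaining folds form the training set.
for i, test_idx in enumerate(folds):
    train_idx = np.concatenate([f for j, f in enumerate(folds) if j != i])
```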

Steps of PCA-AAELM

The following steps are required for the training process:
1. Perform PCA on the training dataset.
2. Select the optimal number of hidden nodes and use the principal components (PC × training data) as the input weights of the hidden layer.
3. Perform the non-linear transformation at the hidden layer.
4. Compute the output weights using the Moore-Penrose generalized inverse.
The result is the trained neural network model.
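A minimal sketch of these steps, assuming the principal-component loadings serve as the deterministic hidden-layer input weights of an auto-associative ELM (one plausible reading of the slide, not the thesis code):

```python
import numpy as np

rng = np.random.default_rng(2)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def pca_aaelm_train(X, variance=0.95):
    """Auto-associative ELM whose input weights are PCA loadings,
    replacing ELM's random input weights with a deterministic choice."""
    Xc = X - X.mean(axis=0)
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    var_ratio = (s ** 2) / (s ** 2).sum()
    k = np.searchsorted(np.cumsum(var_ratio), variance) + 1  # components kept
    W = Vt[:k].T                    # PCA loadings as hidden-layer input weights
    H = sigmoid(X @ W)              # non-linear transformation
    beta = np.linalg.pinv(H) @ X    # output weights: network recalls its input
    return W, beta

X = rng.normal(size=(60, 4))
W, beta = pca_aaelm_train(X, variance=0.99)
recon = sigmoid(X @ W) @ beta       # auto-associative reconstruction
```

Because the input weights come from PCA rather than a random draw, repeated runs produce identical results, which is exactly the stabilization motive stated earlier.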

Evolving Clustering Method

(Diagram of ECM: examples x1, x2, ... arrive on-line; each new example either falls inside an existing cluster's radius, causes a cluster centre Cj and radius Rj to be updated, or creates a new cluster with radius 0.)

Steps of ECM-AAELM (Cont.)

5) The normalized Euclidean distance formula is:

    d(x, y) = sqrt( (1/q) · Σ_{i=1..q} (x_i − y_i)² ),  where q is the number of features.

6) After this, perform the Moore-Penrose generalized inverse on the output of the previous step and multiply it by the dataset to calculate the output weights.

7) Finally, multiply the output weights by the hidden node outputs to get the final output.

Why are we using the Moore-Penrose Generalized Inverse?

The Moore-Penrose inverse provides a solution of a linear system Ax = y such that the error ||Ax − y|| and the norm ||x|| are both minimized simultaneously, giving the unique solution

    x = A†y

For ELM, the output weight is β = H†T; when HᵀH is invertible, H† = (HᵀH)⁻¹Hᵀ.
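A quick numerical check of this property on a small overdetermined system (illustrative):

```python
import numpy as np

# Overdetermined system A x = y: the pseudoinverse gives the least-squares
# solution, and among all minimizers the one with minimum norm ||x||.
A = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [1.0, 1.0]])
y = np.array([1.0, 2.0, 3.0])

x = np.linalg.pinv(A) @ y                   # x = A† y
# For a full-column-rank A, A† coincides with (AᵀA)⁻¹Aᵀ:
x_normal = np.linalg.inv(A.T @ A) @ A.T @ y
```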

Flow of the CPNN Algorithm

1. Initialize the network.
2. Get an input.
3. Find the winner.
4. Update the winner and its neighbourhood.
5. Update the nodes at the Grossberg outstar layer.
6. Repeat for all inputs, for N epochs.

Architecture of Forward-only CPNN

Input layer (x1 ... xm) → hidden layer (h1 ... hn) → output layer (y1 ... yp).
The input-to-hidden weights are trained by simple competitive learning; the hidden-to-output weights are trained by the outstar rule.
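A compact sketch of a forward-only CPNN under these two rules, on hypothetical toy data (winner selection by smallest Euclidean distance is an assumption, since the slide does not name the similarity measure):

```python
import numpy as np

rng = np.random.default_rng(3)

def train_fcpnn(X, Y, n_hidden=4, epochs=50, alpha=0.3, beta=0.3):
    """Forward-only counter-propagation net: winner-take-all competitive
    learning on the input side, Grossberg outstar on the output side."""
    W = rng.uniform(0, 1, (n_hidden, X.shape[1]))   # competitive-layer weights
    V = np.zeros((n_hidden, Y.shape[1]))            # outstar weights
    for _ in range(epochs):
        for x, y in zip(X, Y):
            win = np.argmin(((W - x) ** 2).sum(axis=1))  # find winner
            W[win] += alpha * (x - W[win])               # update winner
            V[win] += beta * (y - V[win])                # outstar update
    return W, V

def predict_fcpnn(x, W, V):
    return V[np.argmin(((W - x) ** 2).sum(axis=1))]

# Toy mapping: two well-separated input clusters with distinct target outputs.
X = np.vstack([rng.normal(0.2, 0.02, (20, 2)), rng.normal(0.8, 0.02, (20, 2))])
Y = np.vstack([np.tile([0.0], (20, 1)), np.tile([1.0], (20, 1))])
W, V = train_fcpnn(X, Y)
```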

    How weights are updating ?


[Fig.: 9 hidden nodes (red) and 10 input samples (blue)]


How to Calculate Gray Distance*:

Gray Relational Coefficient (between the record with the missing value, x_i^mis, and the n complete records x_k, over the m observed attributes):

    GRC(x_ip^mis, x_kp) = ( min_k min_p |x_ip^mis - x_kp| + ρ * max_k max_p |x_ip^mis - x_kp| )
                          / ( |x_ip^mis - x_kp| + ρ * max_k max_p |x_ip^mis - x_kp| ),

    p = 1, 2, ..., m;   k = 1, 2, ..., n

Gray Relational Grade (average of the coefficients over the m attributes):

    GRG(x_i^mis, x_k) = (1/m) * sum_{p=1}^{m} GRC(x_ip^mis, x_kp),   k = 1, 2, ..., n

The coefficient ρ controls the level of differences with respect to the relational coefficient.


Example *

          attr1   attr2   attr3   attr4   attr5     Abs.Diff1  Abs.Diff2  Abs.Diff3  Abs.Diff4   Min    Max
    R1    0.2     ?       0.9     0.6     0.5
    R2    0.1     0.3     0.9     0.4     0.6       0.1        0          0.2        0.1         0      0.2
    R3    0.1     0.4     0.8     0.5     0.6       0.1        0.1        0.1        0.1         0.1    0.1
    R4    0.8     0.2     0.5     0.3     0.2       0.6        0.4        0.3        0.3         0.3    0.6
    R5    0.5     0.8     0.3     0.9     0.7       0.3        0.6        0.3        0.2         0.2    0.6

    Overall: Min = 0, Max = 0.6

          GRC1      GRC2      GRC3   GRC4   GRG
    R2    0.75      1         0.6    0.75   0.775
    R3    0.75      0.75      0.75   0.75   0.75
    R4    0.333333  0.428571  0.5    0.5    0.440476
    R5    0.5       0.333333  0.5    0.6    0.483333

Actual value = 0.3

Imputation by Gray Distance = 0.3
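The worked example can be reproduced with a short sketch (assuming NumPy; ρ = 0.5 is an assumption inferred from the example's numbers, not stated on the slide):

```python
import numpy as np

rho = 0.5                                         # distinguishing coefficient (assumed)
r1 = np.array([0.2, np.nan, 0.9, 0.6, 0.5])       # R1, attr2 missing
complete = np.array([
    [0.1, 0.3, 0.9, 0.4, 0.6],                    # R2
    [0.1, 0.4, 0.8, 0.5, 0.6],                    # R3
    [0.8, 0.2, 0.5, 0.3, 0.2],                    # R4
    [0.5, 0.8, 0.3, 0.9, 0.7],                    # R5
])

obs = ~np.isnan(r1)                               # observed attributes of R1
diff = np.abs(complete[:, obs] - r1[obs])         # the Abs.Diff columns
dmin, dmax = diff.min(), diff.max()               # global Min = 0, Max = 0.6

grc = (dmin + rho * dmax) / (diff + rho * dmax)   # gray relational coefficients
grg = grc.mean(axis=1)                            # gray relational grade per record

best = int(np.argmax(grg))                        # most related record (R2)
imputed = complete[best, ~obs][0]                 # take its attr2 value: 0.3
```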

Results


Average MAPE value over 10 folds - Gray Distance Based Imputation *

    Dataset          Mean     K-Means+MLP   Gray Distance Based Imputation
    Auto mpg         59.7     23.75         16.73
    Body fat         11.61    7.83          7.65
    Boston Housing   37.77    21.01         19.28
    Forest fires     24.728   26.61         22.89
    Iris             23.57    9.41          5.34
    Pima Indian      24.022   29.7          28.06
    Spanish          55.53    39.91         36.29
    Spectf           14.85    12.14         11.60
    Turkish          66.007   33.01         36.63
    UK bankruptcy    37.07    30.96         39.75
    UK Credit        28.43    32.17         28.90
    Wine             29.99    21.58         17.58
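For reference, MAPE (mean absolute percentage error) here measures how far the imputed values fall from the actual ones; a minimal sketch (the function name is illustrative):

```python
def mape(actual, imputed):
    """Mean Absolute Percentage Error, in percent."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, imputed)) / len(actual)

# Two imputations off by 50% and 25% average to a MAPE of 37.5.
print(mape([2.0, 4.0], [1.0, 5.0]))  # 37.5
```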


    Literature Survey


    Extreme Learning Machine (ELM)


Huang, G.B., Zhu, Q., & Siew, C. (2006). Extreme Learning Machine: Theory and Applications. Neurocomputing, Volume 70 (7th Brazilian Symposium on Neural Networks), pp. 489-501.

Huang, G., Wang, D., & Lan, Y. (2011). Extreme Learning Machine: A Survey. International Journal of Machine Learning and Cybernetics, Volume 2, Issue 2, pp. 107-122.

Rajesh, R., & Siva, J. (2011). Extreme Learning Machine: A Review and State of the Art. International Journal of Wisdom Based Computing, Volume 1, pp. 35-49.

Huang, G.B., Zhou, H., Ding, X., & Zhang, R. (2012). Extreme Learning Machine for Regression and Multiclass Classification. IEEE Transactions on Systems, Man, and Cybernetics, Volume 42, Issue 2, pp. 513-529.


    ECM & CPNN


    Evolving Clustering Method*

Song, Q., & Kasabov, N. (2001). ECM: A Novel On-line, Evolving Clustering Method and Its Applications. Proceedings of the Fifth Biannual Conference on Artificial Neural Networks and Expert Systems, Berlin, pp. 87-92.

Counter Propagation Neural Network

Kuzmanovski, I., & Novič, M. (2008). Counter-propagation neural networks in MATLAB. Chemometrics and Intelligent Laboratory Systems, pp. 84-91.

Ballabio, D., Consonni, V., & Todeschini, R. (2009). The Kohonen and CP-ANN toolbox: A collection of MATLAB modules for Self Organizing Maps and Counterpropagation Artificial Neural Networks. Chemometrics and Intelligent Laboratory Systems, pp. 115-122.

Sivanandam, S. N., & Deepa, S. N. Introduction to Neural Networks Using MATLAB 6.0.

