Hybrid Intelligence System for Data Imputation for Final Review
8/9/2019 Hybrid Intelligence System for Data Imputation for Final Review
Hybrid Intelligence Systems for Data Imputation
Chandan Gautam
(12MCMB03)
Under the guidance of
Prof. V. Ravi
Outline
Problem Statement
Missing Data and their causes
Data Imputation
Literature Survey
Proposed Method
Results
Conclusions
References
Problem Statement
Developing Hybrid Intelligence Systems for Data Imputation
Based on Statistical and Machine Learning Techniques.
What is missing data?
In real-world settings, missing data is a common and unavoidable problem across many disciplines.
It limits the ability of researchers to draw conclusions: even when a result is obtained by deleting the records with missing data, that result may be biased or inappropriate.
The missing values therefore have to be imputed.
Age Salary Incentive
25 4000 ??
?? 500 0
27 ?? 50
82 2000 150
42 6500 1000
Literature Survey *
N. Ankaiah, V. Ravi, A novel soft computing hybrid for data imputation, In Proceedings of the 7th International Conference on Data Mining (DMIN), Las Vegas, USA, 2011.
Mistry, J., Nelwamondo, F. V., & Marwala, T. (2009). Data estimation using principal component analysis and auto-associative neural networks, Journal of Systemics, Cybernetics and Informatics, Volume 7, pp. 72-79.
I. B. Aydilek, A. Arslan, A hybrid method for imputation of missing values using optimized fuzzy c-means with support vector regression and a genetic algorithm, Information Sciences, vol. 233, pp. 25-35, 2013.
Shichao Zhang, Nearest neighbor selection for iteratively kNN imputation, The Journal of Systems and Software (2012), vol. 85(11), pp. 2541-2552.
Mean Imputation
Creating Missing Values and Mean Imputation
Initially, no missing data:
Age Salary Incentive
25 4000 200
34 500 0
27 1000 50
82 2000 150
42 6500 1000
After creating missing values:
Age Salary Incentive
25 4000 ??
?? 500 0
27 ?? 50
82 2000 150
42 6500 1000
Mean imputation (column means of the observed values):
Age Salary Incentive
44 3250 300
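The mean imputation step above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `mean_impute` and the use of `np.nan` to stand for the "??" cells are assumptions made here.

```python
import numpy as np

# Example table from the slide; np.nan stands for the "??" cells (an assumption).
data = np.array([
    [25.0, 4000.0, np.nan],
    [np.nan, 500.0, 0.0],
    [27.0, np.nan, 50.0],
    [82.0, 2000.0, 150.0],
    [42.0, 6500.0, 1000.0],
])

def mean_impute(x):
    """Replace each NaN with the mean of the observed values in its column."""
    col_means = np.nanmean(x, axis=0)          # means ignore the missing cells
    filled = x.copy()
    rows, cols = np.where(np.isnan(filled))    # positions of the missing cells
    filled[rows, cols] = col_means[cols]
    return filled, col_means

imputed, column_means = mean_impute(data)
print(column_means)
print(imputed)
```

The column means computed this way are the values 44, 3250 and 300 shown on the slide.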
Compute the mean absolute percentage error (MAPE) (Flores, 1986):
$$\mathrm{MAPE} = \frac{100}{n} \sum_{i=1}^{n} \left| \frac{x_i - \hat{x}_i}{x_i} \right|$$
where
$n$ is the number of missing values in a given dataset,
$\hat{x}_i$ is the value predicted by mean imputation for the missing value,
$x_i$ is the actual value.
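As a rough illustration of the MAPE formula, the sketch below evaluates it on the three cells that were removed in the Age/Salary/Incentive example. The function name `mape` is hypothetical, and the formula assumes that no actual value is zero.

```python
import numpy as np

def mape(actual, predicted):
    """Mean absolute percentage error over the imputed cells.

    Assumes every actual value is non-zero, since the formula divides by it.
    """
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return 100.0 / actual.size * np.sum(np.abs((actual - predicted) / actual))

# Removed cells from the example: Incentive 200, Age 34, Salary 1000,
# imputed by the column means 300, 44 and 3250.
actual = [200.0, 34.0, 1000.0]
predicted = [300.0, 44.0, 3250.0]
score = mape(actual, predicted)
print(round(score, 2))
```

On this tiny example the error is above 100 percent, which is in line with the slide's observation that mean imputation can be very inaccurate.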
Result of Mean Imputation
Average MAPE value over 10 folds - Mean Imputation
Dataset         Mean Imputation
Auto mpg        59.7
Body fat        11.61
Boston Housing  37.77
Forest fires    24.728
Iris            23.57
Pima Indian     24.022
Spanish         55.53
Spectf          14.85
Turkish         66.007
UK bankruptcy   37.07
UK Credit       28.43
Wine            29.99
The error value is too high for most of the datasets, so other methods are needed.
Proposed Methods
Module I: PCA-AAELM Imputation, ECM-Imputation, ECM-AAELM Imputation
Module II: PSO-ECM Imputation, PSO-ECM + ECM-AAELM
Module III: CPAANN Imputation, Gray+PCA-AAELM, Gray+CPAANN
Overview of Extreme Learning Machine (ELM)
Architecture of ELM *
Output of the $i$-th hidden node: $g(a_i \cdot x + b_i)$
$a_i$: the weight vector of the connection between the $i$-th hidden node and the input nodes.
$b_i$: the threshold of the $i$-th hidden node.
Output of SLFNs: $f_m(x) = \sum_{i=1}^{m} \beta_i \, g(a_i \cdot x + b_i)$
$\beta_i$: the weight vector of the connection between the $i$-th hidden node and the output nodes.
Output weight: $\beta = H^{\dagger} T$, where $H^{\dagger}$ is the Moore-Penrose inverse of $H$.
Training: $H = g(a \cdot x)$; solve $H \beta = O$ for $\beta$, which gives $\beta = H^{\dagger} O$.
Testing: $H_T = g(y \cdot a)$, Output $= H_T \beta$.
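The training and testing steps of ELM can be sketched as follows. This is a minimal sketch under stated assumptions, not the code used in the study: the function names, the uniform range for the random weights, the sigmoid activation and the toy data are all chosen here for illustration. The target is set equal to the input, which is the auto-associative setting used later for AAELM.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def elm_train(x, t, n_hidden, rng):
    """Random hidden layer, then output weights via the Moore-Penrose inverse."""
    a = rng.uniform(-1.0, 1.0, size=(x.shape[1], n_hidden))  # input weights a_i
    b = rng.uniform(-1.0, 1.0, size=n_hidden)                # thresholds b_i
    h = sigmoid(x @ a + b)                                   # hidden output H
    beta = np.linalg.pinv(h) @ t                             # beta = pinv(H) T
    return a, b, beta

def elm_predict(x, a, b, beta):
    """Hidden output for new data multiplied by the learned output weights."""
    return sigmoid(x @ a + b) @ beta

rng = np.random.default_rng(0)
x_train = rng.uniform(0.0, 1.0, size=(50, 3))   # toy data (hypothetical)
a, b, beta = elm_train(x_train, x_train, n_hidden=20, rng=rng)
reconstruction = elm_predict(x_train, a, b, beta)
train_error = np.mean((reconstruction - x_train) ** 2)
print(train_error)
```

Because `a` and `b` are drawn at random, a different seed gives a different `beta`, which is the source of the run-to-run variation discussed in the later slides.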
Table of Activation Functions *
Architecture of AAELM (Auto-associative ELM)
Autoencoders are feed-forward neural networks trained to recall the input space.
Ensembling of AAELM
Run AAELM 10 times independently on the same dataset to generate AAELF.
Use three different probability distributions (Uniform, Normal and Logistic) to generate the weights and two different activation functions (Sigmoid and Gaussian) at the hidden layer.
Build an AAELM ensemble for each of the six combinations of probability distribution and activation function.
Result of Ensembled AAELM
Average MAPE value over 10 folds Ensembled AAELM *
Problems and Solutions of Ensembled AAELM
Drawbacks of AAELM
AAELM depends heavily on randomness: each run of ELM yields different results.
The results can sometimes fluctuate wildly.
Remedy for the above problem
We proposed two new hybrid methods to stabilize the randomness of AAELM:
PCA-AAELM
ECM-AAELM
Proposed Method 1: PCA-AAELM
Architecture of PCA-AAELM *
[Figure: architecture of traditional ELM compared with PCA-AAELM]
Results
Average MAPE value over 10 folds - PCA-AAELM *
Proposed Method 2: Evolving Clustering Method (ECM) based Imputation
Block Diagram of the Proposed Method
1) Split the dataset with missing values into complete and incomplete records.
2) Apply ECM clustering to the complete records to obtain the cluster centers.
3) For each incomplete record, find the nearest cluster center.
4) Impute the incomplete features with the corresponding features of the nearest cluster center.
5) The result is a dataset without missing values.
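How to calculate missing values with the help of cluster centers?
Incomplete record: (2, ?, 3)
Cluster centers and their squared Euclidean distances to the record, computed over the observed features only:
(1, 4, 2): $(2-1)^2 + (3-2)^2 = 2$
(1, 8, 9): $(2-1)^2 + (3-9)^2 = 37$
(3, 6, 1): $(2-3)^2 + (3-1)^2 = 5$
(5, 7, 3): $(2-5)^2 + (3-3)^2 = 9$
(0, 1, 2): $(2-0)^2 + (3-2)^2 = 5$
The nearest cluster center is (1, 4, 2), so the missing value is imputed as 4.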
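The nearest-center imputation in this worked example can be sketched as below. The function name `impute_from_nearest_center` and the use of `np.nan` for the missing feature are assumptions made for illustration; the cluster centers are taken as given, since the ECM clustering step itself is not shown here.

```python
import numpy as np

def impute_from_nearest_center(record, centers):
    """Fill the missing features of `record` from its nearest cluster center.

    The distance is the squared Euclidean distance over observed features only.
    """
    record = np.asarray(record, dtype=float)
    centers = np.asarray(centers, dtype=float)
    observed = ~np.isnan(record)
    sq_dist = np.sum((centers[:, observed] - record[observed]) ** 2, axis=1)
    nearest = int(np.argmin(sq_dist))
    filled = record.copy()
    filled[~observed] = centers[nearest, ~observed]
    return filled, sq_dist, nearest

# Cluster centers and incomplete record from the slide.
centers = [[1, 4, 2], [1, 8, 9], [3, 6, 1], [5, 7, 3], [0, 1, 2]]
record = [2, np.nan, 3]
filled, sq_dist, nearest = impute_from_nearest_center(record, centers)
print(sq_dist, nearest, filled)
```

The squared distances reproduce the values 2, 37, 5, 9 and 5 from the slide, and the record is completed from the first center.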
Results
Average MAPE value over 10 folds - ECM-Imputation *
K-Means+MLP results are from [Ankaiah & Ravi].
Dataset         Mean    K-Means+MLP  ECM-Imputation
Auto mpg        59.7    23.75        18.03
Body fat        11.61   7.83         6.31
Boston Housing  37.77   21.01        17.84
Forest fires    24.728  26.61        22.29
Iris            23.57   9.41         5.27
Pima Indian     24.022  29.7         27.16
Spanish         55.53   39.91        31.98
Spectf          14.85   12.14        10.21
Turkish         66.007  33.01        27.90
UK bankruptcy   37.07   30.96        46.14
UK Credit       28.43   32.17        27.40
Wine            29.99   21.58        15.61
Proposed Method 3: ECM-AAELM
Architecture of ECM-AAELM *
[Figure: architecture of traditional ELM compared with ECM-AAELM]
Results
Average MAPE value over 10 folds - ECM-AAELM
Behavior of PCA/ECM-AAELM on different activation functions
[Figure: two charts, one for ECM-AAELM and one for PCA-AAELM, showing results per dataset (Auto mpg, Body fat, Boston Housing, Forest fires, Iris, Pima Indian, Spanish, Spectf, Turkish, UK bankruptcy, UK Credit, Wine) for each activation function: Sigmoid, Sinh, Cloglogm, Bsigmoid, Sine, Hardlim, Tribas, Radbas, Softplus, Gaussian, Rectifier]
Influence of Dthr value on MAPE results: ECM-AAELM
[Figure: MAPE plotted against the Dthr value for each dataset: Auto_MPG, Body_Fat, Boston_housing, Forest_Fire, Iris, Pima_indian, Spanish, Turkish, Spectf, UK_Credit, UK_Bankruptcy, Wine]
Module II
Proposed Method 4: PSO-ECM
Block Diagram of the Proposed Method
1) Split the dataset containing incomplete records into complete records and incomplete records.
2) Initialize the PSO parameters and apply ECM with the initialized Dthr value.
3) Perform ECM imputation based on the nearest cluster center.
4) Compute the covariance matrix of the complete records (Ccov) and of the total records after imputation (Tcov), and the determinants of Ccov and Tcov.
5) Compute the MSE between Ccov and Tcov and the absolute difference between Det(Ccov) and Det(Tcov).
6) If the error is not minimum, invoke PSO to select a new Dthr value, apply ECM with the Dthr value yielded by PSO, and repeat from step 3.
7) Once the parameter is optimized, perform ECM imputation with the optimized Dthr value. The resulting dataset does not contain incomplete records.
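The two error terms computed in steps 4 and 5 can be sketched as follows. This is only an illustration: the function name `covariance_fitness` and the toy records are hypothetical, and because the slides do not state how the two terms are combined into a single PSO objective, the sketch returns them separately.

```python
import numpy as np

def covariance_fitness(complete_records, total_records):
    """Return (MSE between Ccov and Tcov, |det(Ccov) - det(Tcov)|).

    complete_records: rows that never had missing values.
    total_records: all rows after imputation (complete plus imputed).
    """
    ccov = np.cov(complete_records, rowvar=False)   # features are columns
    tcov = np.cov(total_records, rowvar=False)
    mse = float(np.mean((ccov - tcov) ** 2))
    det_diff = float(abs(np.linalg.det(ccov) - np.linalg.det(tcov)))
    return mse, det_diff

# Toy data (hypothetical): four complete rows plus one imputed row.
complete = np.array([[1.0, 2.0], [3.0, 5.0], [4.0, 4.0], [6.0, 9.0]])
imputed_row = np.array([[10.0, 0.0]])
total = np.vstack([complete, imputed_row])

same_mse, same_det = covariance_fitness(complete, complete)
mse, det_diff = covariance_fitness(complete, total)
print(same_mse, same_det, mse, det_diff)
```

An imputation that preserves the covariance structure of the complete records drives both terms toward zero, which is what the PSO search over Dthr is trying to achieve.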
Results
Average MAPE value over 10 folds - PSO-ECM based Imputation *
K-Means+MLP results are from [Ankaiah & Ravi].
Dataset         Mean    K-Means+MLP  ECM-Imputation  PSO-ECM
Auto mpg        59.7    23.75        18.03           15.34844
Body fat        11.61   7.83         6.31            4.96008
Boston Housing  37.77   21.01        17.84           14.49978
Forest fires    24.728  26.61        22.29           18.33909
Iris            23.57   9.41         5.27            4.82263
Pima Indian     24.022  29.7         27.16           24.57587
Spanish         55.53   39.91        31.98           20.73123
Spectf          14.85   12.14        10.21           9.85382
Turkish         66.007  33.01        27.9            19.28137
UK bankruptcy   37.07   30.96        46.14           30.97627
UK Credit       28.43   32.17        27.4            24.61695
Wine            29.99   21.58        15.61           12.75819
Proposed Method 5: PSO-ECM + ECM-AAELM
Proposed Model
Comparison
Results before and after selection of the optimal Dthr value. "Before" is ECM-AAELM and "After" is PSO-ECM + ECM-AAELM. K-Means+MLP results are from [Ankaiah & Ravi].
Dataset         Mean    K-Means+MLP  Before  After
Auto mpg        59.7    23.75        17.38   14.69
Body fat        11.61   7.83         5.33    4.64
Boston Housing  37.77   21.01        16.48   14.44
Forest fires    24.728  26.61        21.54   18.17
Iris            23.57   9.41         5.10    4.83
Pima Indian     24.022  29.7         23.95   23.96
Spanish         55.53   39.91        22.09   18.53
Spectf          14.85   12.14        8.05    8.18
Turkish         66.007  33.01        21.49   18.97
UK bankruptcy   37.07   30.96        40.06   28.66
UK Credit       28.43   32.17        26.85   24.79
Wine            29.99   21.58        14.88   12.60
Module III
Proposed Method 6: CPAANN
Introduction of CPNN *
Semi-supervised learning: CPNN combines a Kohonen SOM layer trained by competitive learning (unsupervised) with a Grossberg Outstar layer (supervised).
We added the concept of auto-associativity to CPNN and created the Counter Propagation Auto-associative Neural Network (CPAANN).
Proposed Method 7: Gray + PCA-AAELM
Proposed Method *:
Stage I: Gray Distance Based Nearest Neighbor Imputation
Stage II: PCA-AAELM Based Imputation
Comparison
Results of PCA-AAELM with Mean Imputation and Gray Distance based Imputation. The last two columns are PCA-AAELM with the named preprocessing step. K-Means+MLP results are from [Ankaiah & Ravi].
Dataset         Mean    K-Means+MLP  Mean Imputation  Gray Distance based Imputation
Auto mpg        59.7    23.75        28.63            16.92
Body fat        11.61   7.83         6.01             5.41
Boston Housing  37.77   21.01        20.9             17.46
Forest fires    24.728  26.61        19.41            20.89
Iris            23.57   9.41         10.23            5.79
Pima Indian     24.022  29.7         22.06            22.03
Spanish         55.53   39.91        30.09            28.06
Spectf          14.85   12.14        9.11             8.38
Turkish         66.007  33.01        30.18            27.38
UK bankruptcy   37.07   30.96        37.7             37.95
UK Credit       28.43   32.17        25.27            27.79
Wine            29.99   21.58        16.6             14.78
Proposed Method 8: Gray + CPAANN
Proposed Method *:
Stage I: Gray Distance Based Nearest Neighbour Imputation
Stage II: CPAANN Based Imputation
Comparison
Results of CPAANN with Mean Imputation and Gray Distance based Imputation. The last two columns are CPAANN with the named preprocessing step. K-Means+MLP results are from [Ankaiah & Ravi].
Dataset         Mean    K-Means+MLP  Mean Imputation  Gray Distance based Imputation
Auto mpg        59.7    23.75        18.32            15.31
Body fat        11.61   7.83         5.25             4.71
Boston Housing  37.77   21.01        14.86            15.01
Forest fires    24.728  26.61        16.97            17.91
Iris            23.57   9.41         6.51             4.03
Pima Indian     24.022  29.7         18.21            19.34
Spanish         55.53   39.91        17.13            14.21
Spectf          14.85   12.14        8.61             8.53
Turkish         66.007  33.01        16.07            17.37
UK bankruptcy   37.07   30.96        21.96            20.58
UK Credit       28.43   32.17        22.88            13.70
Wine            29.99   21.58        11.56            11.72
Comparison Between All Proposed Methods based on Average MAPE value over 10 folds
Dataset         PCA-AAELM  ECM-Imputation  ECM-AAELM  PSO-ECM  PSO-ECM+ECM-AAELM  Gray+PCA-AAELM  CPAANN  Gray+CPAANN
Auto mpg        28.63      18.03           17.38      15.35    14.39              16.92           18.32   15.31
Body fat        6.01       6.31            5.33       4.96     4.61               5.41            5.25    4.71
Boston Housing  20.90      17.84           16.48      14.50    14.18              17.46           14.86   15.01
Forest fires    19.41      22.29           21.54      18.34    17.66              20.89           16.97   17.91
Iris            10.23      5.27            5.10       4.82     4.75               5.79            6.51    4.03
Pima Indian     22.06      27.16           23.95      24.58    23.38              22.03           18.21   19.34
Spanish         30.09      31.98           22.09      20.73    16.99              28.06           17.13   14.21
Spectf          9.11       10.21           8.05       9.85     8.18               8.38            8.61    8.53
Turkish         30.18      27.90           21.49      19.28    16.49              27.38           16.07   17.37
UK bankruptcy   37.70      46.14           40.06      30.98    26.89              37.95           21.96   20.58
UK Credit       25.27      27.40           26.85      24.62    23.66              27.79           22.88   13.70
Wine            16.60      15.61           14.88      12.76    12.21              14.78           11.56   11.72
Conclusions
The results indicate that all the proposed methods provide significantly improved results compared to K-Means+MLP.
ECM-Imputation alone outperformed K-Means+MLP, which shows the powerful local learning capability of ECM.
ECM-AAELM yields more accurate results than PCA-AAELM.
The output of ECM-AAELM depends primarily on the threshold value of ECM; it does not fluctuate wildly across activation functions.
Our experiments indicate that selecting the optimal Dthr value improves imputation.
For PCA-AAELM, the Softplus activation function is recommended because it performed better than the other activation functions.
Gray distance based imputation performed better than mean imputation as a preprocessing step for most of the datasets.
List of Published and Communicated Research Papers
C. Gautam, V. Ravi, Evolving Clustering Based Data Imputation, 3rd IEEE Conference, ICCPCT, Kanyakumari, Mar 21-22, 2014.
C. Gautam, V. Ravi, Data Imputation via Evolutionary Computation, Clustering and a Neural Network, to be communicated in IEEE Computational Intelligence Magazine (CIM).
A Hybrid Data Imputation method based on Gray System Theory and Counterpropagation Auto-associative Neural Network, to be communicated in Neurocomputing.
Imputation of Missing Data Using PCA, Extreme Learning Machine and Gray System Theory, to be communicated in The 5th Joint International Conference on Swarm, Evolutionary and Memetic Computing (SEMCCO 2014).
References
Data Imputation
Abdella, M., & Marwala, T. (2005). The use of genetic algorithms and neural networks to approximate missing data in database, IEEE 3rd International Conference on Computational Cybernetics, Mauritius, pp. 207-212.
Mistry, J., Nelwamondo, F. V., & Marwala, T. (2009). Data estimation using principal component analysis and auto-associative neural networks, Journal of Systemics, Cybernetics and Informatics, Volume 7, pp. 72-79.
Ankaiah, N., & Ravi, V. (2011). A novel soft computing hybrid for data imputation, International Conference on Data Mining, Las Vegas, USA.
Vriens, M., & Melton, E. (2002). Managing missing data, Marketing Research, Volume 14, Issue 3, pp. 12-17.
Naveen, N., Ravi, V., & Rao, C. R. (2010). Differential evolution trained radial basis function network: application to bankruptcy prediction in banks, International Journal of Bio-Inspired Computation (IJBIC), Volume 2, Issue 3, pp. 222-232.
Nelwamondo, F. V., Golding, D., & Marwala, T. (2013). A dynamic programming approach to missing data estimation using neural networks, Information Sciences, Elsevier, Volume 237, pp. 49-58.
Nishanth, K. J., Ankaiah, N., Ravi, V., & Bose, I. (2012). Soft computing based imputation and hybrid data and text mining: The case of predicting the severity of phishing alerts, Expert Systems with Applications, Volume 39, Issue 12, pp. 10583-10589.
K. J. Nishanth, V. Ravi, A Computational Intelligence Based Online Data Imputation Method: An Application For Banking, Journal of Information Processing Systems, vol. 9 (4), pp. 633-650, 2013.
M. Krishna, V. Ravi, Particle swarm optimization and covariance matrix based data imputation, IEEE International Conference on Computational Intelligence and Computing Research (ICCIC), Enathi, 2013.
V. Ravi, M. Krishna, A new online data imputation method based on general regression auto associative neural network, Neurocomputing, vol. 138, pp. 207-212, 2014.
Extreme Learning Machine (ELM)
Huang, G.B., Zhu, Q., & Siew, C. (2006). Extreme Learning Machine: Theory and Applications, Neurocomputing, Elsevier, 7th Brazilian Symposium on Neural Networks, Volume 70, pp. 489-501.
Rajesh, R., & Siva, J. (2011). Extreme Learning Machine - A Review and State of Art, International Journal of Wisdom Based Computing, Volume 1, pp. 35-49.
Huang, G., Wang, D., & Lan, Y. (2011). Extreme Learning Machine: A Survey, International Journal of Machine Learning and Cybernetics, June 2011, Volume 2, Issue 2, pp. 107-122.
Bartlett, P. (1997). For Valid Generalization, The Size of the Weights is more important than the Size of the Network, Advances in Neural Information Processing Systems, Volume 9, pp. 134-140.
Huang, G., Chen, L., & Siew, C. (2006). Universal Approximation Using Incremental Constructive Feedforward Networks with Random Hidden Nodes, IEEE Transactions on Neural Networks, Volume 17, Issue 4, pp. 879-892.
Zhu, Q., Qin, A. K., Suganthan, P. N., & Huang, G. (2005). Evolutionary Extreme Learning Machine, Pattern Recognition, Elsevier, Volume 38, Issue 10, pp. 1759-1763.
Castaño, A., Fernández-Navarro, F., & Hervás-Martínez, C. (2013). PCA-ELM - A Robust and Pruned ELM Approach Based on PCA, Neural Processing Letters, Springer, Volume 37, Issue 3, pp. 377-392.
Huang, G.B., Zhou, H., Ding, X., & Zhang, R. (2012). Extreme Learning Machine for Regression and Multiclass Classification, IEEE Transactions on Systems, Man and Cybernetics, Volume 42, Issue 2, pp. 513-529.
Activation Function
Sibi, P., Jones, S., & Siddarth, P. (2013). Analysis of Different Activation Functions Using Back Propagation Neural Networks, Journal of Theoretical and Applied Information Technology, Volume 47, Issue 3, pp. 1264-1268.
Peng, J., Li, L., & Tang (2013). Combination of Activation Functions in Extreme Learning Machines for Multivariate Calibration, Chemometrics and Intelligent Laboratory Systems, Elsevier, Volume 120, pp. 53-58.
Gomes, G. S. S., Ludermir, T. B., & Lima, L. M. M. R. (2011). Comparison of new activation functions in neural network for forecasting financial time series, Neural Computing and Applications, Springer, Volume 20, Issue 3, pp. 417-439.
Asaduzzaman, Md., Shahjahan, M., & Murase, K. (2009). Faster Training Using Fusion of Activation Functions for Feed Forward Neural Networks, International Journal of Neural Systems, Volume 19, Issue 06, pp. 437-448.
Karlik, B., & Olgac, A. V. (2010). Performance Analysis of Various Activation Functions in Generalized MLP Architectures of Neural Networks, International Journal of Artificial Intelligence and Expert Systems, Volume 1, Issue 4, pp. 111-122.
Activation Function (Cont.), ECM, Cross Validation & PCA
Glorot, X., Bordes, A., & Bengio, Y. (2011). Deep Sparse Rectifier Neural Networks, International Conference on Artificial Intelligence and Statistics, Fort Lauderdale, USA, Volume 15, pp. 315-323.
Song, Q., & Kasabov, N. (2001). ECM - A Novel On-line, Evolving Clustering Method and Its Applications, Proceedings of the Fifth Biannual Conference on Artificial Neural Networks and Expert Systems, Berlin, pp. 87-92.
Refaeilzadeh, P., Tang, L., & Liu, H. (2009). "Cross Validation", in Encyclopedia of Database Systems (EDBS), Springer, Volume 1, pp. 532-538.
Smith, L. (2002). A tutorial on Principal Components Analysis.
Thank You
Experimental Design for PCA-AAELM and ECM-AAELM
10-fold cross-validation was used in our experiments.
Both PCA-AAELM and ECM-AAELM have one user-defined parameter: PCA has the variance (i.e. eigenvalues) and ECM has the threshold value.
For each activation function, we fixed the activation function and varied the variance from 1 to 99 in PCA-AAELM and the threshold from 0.001 to 0.999 in ECM-AAELM on the whole dataset.
We used 11 activation functions and compared their performance.
Steps of PCA-AAELM
The following steps are required for the training process: *
1) Start from the training dataset.
2) Perform the PCA.
3) Select the optimal number of hidden nodes and the values of the hidden nodes as input weights.
4) Compute PC * Training Data.
5) Perform the non-linear transformation.
6) Compute the output weight by performing the Moore-Penrose generalized inverse.
7) The result is the neural network model.
Evolving Clustering Method *
[Figure: ECM clustering example showing samples x1 to x9 and the evolving cluster centers and radii of clusters C1, C2 and C3]
Steps of ECM-AAELM (Cont.)
5) The normalized Euclidean distance formula is:
$$\|x - y\| = \left( \sum_{i=1}^{q} |x_i - y_i|^2 \right)^{1/2} \Big/ \, q^{1/2}$$
where $q$ is the number of features.
6) After this, compute the Moore-Penrose generalized inverse of the output of the previous step and multiply it by the dataset to calculate the output weight.
7) Finally, multiply the hidden node output by the output weight to get the final output.
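A minimal sketch of the normalized Euclidean distance in step 5 follows. The function name `normalized_euclidean` is hypothetical; the formula is the one given above, with q taken as the number of features of the two vectors.

```python
import numpy as np

def normalized_euclidean(x, y):
    """Euclidean distance between x and y divided by sqrt(q), q = number of features."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    q = x.size
    return np.sqrt(np.sum(np.abs(x - y) ** 2)) / np.sqrt(q)

d_unit = normalized_euclidean([0, 0, 0, 0], [1, 1, 1, 1])
d_small = normalized_euclidean([0.2, 0.9], [0.2, 0.5])
print(d_unit, d_small)
```

For features scaled to the range [0, 1], this distance also stays within [0, 1], which is consistent with the threshold range of 0.001 to 0.999 used in the experimental design.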
Why are we using the Moore-Penrose Generalized Inverse? *
The Moore-Penrose inverse provides a solution of a linear system
$Ax = y$
such that the error $\|Ax - y\|$ and the norm $\|x\|$ are both minimized, and it gives a unique solution: $x = A^{\dagger} y$.
Formula: $H^{\dagger} = (H^T H)^{-1} H^T$ (valid when $H^T H$ is invertible).
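The formula can be checked numerically on a small overdetermined system. The matrix and right-hand side below are toy values chosen for illustration; `np.linalg.pinv` and `np.linalg.lstsq` are used only as independent references.

```python
import numpy as np

# Toy overdetermined system H x = y with full column rank (hypothetical values).
h = np.array([[1.0, 0.0],
              [1.0, 1.0],
              [1.0, 2.0]])
y = np.array([1.0, 2.0, 2.0])

# Closed form from the slide, valid because H^T H is invertible here.
h_pinv_formula = np.linalg.inv(h.T @ h) @ h.T
# NumPy's SVD-based pseudoinverse, which also covers the rank-deficient case.
h_pinv_numpy = np.linalg.pinv(h)

x = h_pinv_formula @ y
x_lstsq = np.linalg.lstsq(h, y, rcond=None)[0]
print(x, x_lstsq)
```

Both routes give the same least-squares solution. In practice the SVD-based `pinv` is usually preferred because it does not require $H^T H$ to be invertible.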
Flow of CPNN Algorithm
1) Initialize the network.
2) Get an input.
3) Find the winner.
4) Update the winner and its neighbourhood.
5) Update the nodes at the Grossberg Outstar layer.
6) Repeat for all inputs.
7) Repeat for N epochs.
Architecture of Forward only CPNN
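Input layer: x1, x2, ..., xm
Hidden layer: h1, h2, ..., hn
Output layer: y1, y2, ..., yp
The weights between the input and hidden layers are trained by simple competitive learning.
The weights between the hidden and output layers are trained by the Outstar rule.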
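The forward-only CPNN training loop can be sketched as below. This is a simplified illustration under stated assumptions rather than the authors' CPAANN: only the winning node is updated (the neighbourhood update is omitted), the learning rates `alpha` and `beta`, the initial weights and the toy samples are hypothetical, and the target is set equal to the input to mimic the auto-associative use.

```python
import numpy as np

def nearest_hidden_node(x, w):
    """Index of the hidden node whose weight vector is closest to x (the winner)."""
    return int(np.argmin(np.sum((w - x) ** 2, axis=1)))

def train_forward_only_cpnn(x, y, w_init, epochs=20, alpha=0.5, beta=0.5):
    """Winner-take-all training; the neighbourhood update is left out for brevity."""
    w = np.array(w_init, dtype=float)            # input-to-hidden weights
    v = np.zeros((w.shape[0], y.shape[1]))       # hidden-to-output weights
    for _ in range(epochs):
        for xi, yi in zip(x, y):
            j = nearest_hidden_node(xi, w)
            w[j] += alpha * (xi - w[j])          # simple competitive learning
            v[j] += beta * (yi - v[j])           # Outstar rule
    return w, v

def cpnn_predict(x, w, v):
    """Output weights of the winning hidden node."""
    return v[nearest_hidden_node(x, w)]

# Two well-separated toy clusters; target equals input (auto-associative).
samples = np.array([[0, 0], [0, 1], [1, 0],
                    [10, 10], [10, 11], [11, 10]], dtype=float)
w, v = train_forward_only_cpnn(samples, samples, w_init=[[2.0, 2.0], [8.0, 8.0]])
low = cpnn_predict(np.array([0.5, 0.5]), w, v)
high = cpnn_predict(np.array([10.5, 10.5]), w, v)
print(low, high)
```

Each hidden node ends up representing one cluster, and a query is answered with the output weights of its winning node.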
How are the weights updated?
[Figure: 9 hidden nodes (red) and 10 input samples (blue)]
How to Calculate Gray Distance *:
Gray Relational Coefficient:
$$GRC(x_{kp}^{mis}, x_{ip}) = \frac{\min_{\forall i} \min_{\forall p} |x_{kp}^{mis} - x_{ip}| + \rho \, \max_{\forall i} \max_{\forall p} |x_{kp}^{mis} - x_{ip}|}{|x_{kp}^{mis} - x_{ip}| + \rho \, \max_{\forall i} \max_{\forall p} |x_{kp}^{mis} - x_{ip}|}$$
where $\rho \in [0, 1]$, $i = 1, 2, \ldots, o$, $p = 1, 2, \ldots, m$ and $k = 1, 2, \ldots, n$.
$\rho$ controls the level of differences with respect to the relational coefficient.
Gray Relational Grade:
$$GRG(x_{k}^{mis}, x_{i}) = \frac{1}{m} \sum_{p=1}^{m} GRC(x_{kp}^{mis}, x_{ip}), \quad i = 1, 2, \ldots, o, \quad k = 1, 2, \ldots, n$$
Example *
Record R1 has a missing value in attr2:
     attr1  attr2  attr3  attr4  attr5
R1   0.2    ?      0.9    0.6    0.5
R2   0.1    0.3    0.9    0.4    0.6
R3   0.1    0.4    0.8    0.5    0.6
R4   0.8    0.2    0.5    0.3    0.2
R5   0.5    0.8    0.3    0.9    0.7
Absolute differences from R1 over the observed attributes:
     Abs. Diff1  Abs. Diff2  Abs. Diff3  Abs. Diff4  Min  Max
R2   0.1         0           0.2         0.1         0    0.2
R3   0.1         0.1         0.1         0.1         0.1  0.1
R4   0.6         0.4         0.3         0.3         0.3  0.6
R5   0.3         0.6         0.3         0.2         0.2  0.6
Overall: Min = 0, Max = 0.6
     GRC1      GRC2      GRC3  GRC4  GRG
R2   0.75      1         0.6   0.75  0.775
R3   0.75      0.75      0.75  0.75  0.75
R4   0.333333  0.428571  0.5   0.5   0.440476
R5   0.5       0.333333  0.5   0.6   0.483333
Actual value = 0.3
Imputation by Gray Distance = 0.3
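The worked example can be reproduced with a short sketch of the gray relational coefficient and grade. The function name `gray_relational_grades` is hypothetical, `np.nan` stands for the missing attribute, and the value rho = 0.5 is an assumption: it is not stated on the slide but it reproduces the GRC and GRG numbers shown there.

```python
import numpy as np

def gray_relational_grades(target, candidates, rho=0.5):
    """GRC per observed attribute and GRG per candidate record.

    Assumes the candidates are complete and that at least one absolute
    difference is non-zero, so the denominator never vanishes.
    """
    target = np.asarray(target, dtype=float)
    candidates = np.asarray(candidates, dtype=float)
    observed = ~np.isnan(target)
    diff = np.abs(candidates[:, observed] - target[observed])
    d_min, d_max = diff.min(), diff.max()        # global min and max
    grc = (d_min + rho * d_max) / (diff + rho * d_max)
    return grc, grc.mean(axis=1)

r1 = [0.2, np.nan, 0.9, 0.6, 0.5]
others = [[0.1, 0.3, 0.9, 0.4, 0.6],   # R2
          [0.1, 0.4, 0.8, 0.5, 0.6],   # R3
          [0.8, 0.2, 0.5, 0.3, 0.2],   # R4
          [0.5, 0.8, 0.3, 0.9, 0.7]]   # R5
grc, grg = gray_relational_grades(r1, others)
nearest = int(np.argmax(grg))            # highest grade = nearest neighbour
imputed_value = others[nearest][1]       # copy attr2 from that record
print(grg, nearest, imputed_value)
```

The record with the highest grade is R2, so the missing attr2 of R1 is filled with R2's value, matching the imputed value on the slide.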
Gray Distance Based Imputation Results
Average MAPE value over 10 folds - Gray Distance Based Imputation *
Dataset         Mean    K-Means+MLP  Gray Distance Based Imputation
Auto mpg        59.7    23.75        16.73
Body fat        11.61   7.83         7.65
Boston Housing  37.77   21.01        19.28
Forest fires    24.728  26.61        22.89
Iris            23.57   9.41         5.34
Pima Indian     24.022  29.7         28.06
Spanish         55.53   39.91        36.29
Spectf          14.85   12.14        11.60
Turkish         66.007  33.01        36.63
UK bankruptcy   37.07   30.96        39.75
UK Credit       28.43   32.17        28.90
Wine            29.99   21.58        17.58
Counter Propagation Neural Network
Kuzmanovski, I., & Novič, M. (2008). Counter-propagation neural networks in Matlab, Chemometrics and Intelligent Laboratory Systems, pp. 84-91.
Ballabio, D., Consonni, V., & Todeschini, R. (2009). The Kohonen and CP-ANN toolbox: A collection of MATLAB modules for Self Organizing Maps and Counterpropagation Artificial Neural Networks, Chemometrics and Intelligent Laboratory Systems, pp. 115-122.
Sivanandam, S. N., & Deepa, S. N. Introduction to neural networks Using MATLAB 6.0.