International Journal of Advances in Engineering & Technology, May 2013.
IJAET ISSN: 2231-1963
573 Vol. 6, Issue 2, pp. 573-582
EXAMINING OUTLIER DETECTION PERFORMANCE FOR PRINCIPAL COMPONENTS ANALYSIS METHOD AND ITS ROBUSTIFICATION METHODS
Nada Badr, Noureldien A. Noureldien
Department of Computer Science
University of Science and Technology, Omdurman, Sudan
ABSTRACT
Intrusion detection has gained the attention of both commercial institutions and the academic research community. In this paper PCA (Principal Components Analysis) is used as an unsupervised technique to detect multivariate outliers in a dataset covering one hour of traffic. PCA is sensitive to outliers since it depends on non-robust estimators. This led us to use MCD (Minimum Covariance Determinant) and PP (Projection Pursuit) as two different robustification techniques for PCA. The results obtained from the experiments show that PCA generates a high rate of false alarms due to masking and swamping effects, while the MCD and PP detection rates are much more accurate, and both reveal the masking and swamping effects that the PCA method suffers from.
KEYWORDS: Multivariate Techniques, Robust Estimators, Principal Components, Minimum Covariance Determinant, Projection Pursuit.
I. INTRODUCTION
Principal Components Analysis (PCA) is a multivariate statistical method concerned with analyzing and understanding data in high dimensions; that is, PCA analyzes data sets that represent observations described by several dependent variables that are inter-correlated. PCA is one of the best known and most widely used multivariate exploratory analysis techniques [5].
Several robust competitors to the classical PCA estimators have been proposed in the literature. A natural way to robustify PCA is to use robust location and scatter estimators instead of the sample mean and sample covariance matrix when estimating the eigenvalues and eigenvectors of the population covariance matrix. The Minimum Covariance Determinant (MCD) method is a highly robust estimator of multivariate location and scatter. Its objective is to find the h observations out of n whose covariance matrix has the lowest determinant. The MCD location estimate is then the mean of these h points, and the scatter estimate is their covariance matrix. Another robust method for principal component analysis uses the Projection-Pursuit (PP) principle. Here, one projects the data onto a lower-dimensional space such that a robust measure of the variance of the projected data is maximized.
In this paper we investigate the effectiveness of the robust estimators provided by MCD and PP by applying PCA to an Abilene dataset and comparing its outlier detection performance to that of MCD and PP.
The rest of this paper is organized as follows. Section 2 is an overview of related work. Section 3 is dedicated to classical PCA. The PCA robustification methods, MCD and PP, are discussed in Section 4. The experimental results are presented in Section 5, and conclusions and future work are drawn in Section 6.
II. RELATED WORK
A number of researchers have utilized principal components analysis to reduce dimensionality and to detect anomalous network traffic. The use of PCA to structure network traffic flows was introduced
3.1 PCA Advantages
PCA common advantages are:
3.1.1 Exploratory Data Analysis
PCA is mostly used for making two-dimensional plots of the data for visual examination and interpretation. For this purpose, the data is projected onto factorial planes that are spanned by pairs of principal components chosen among the first (that is, the most significant) ones. From these plots, one tries to extract information about the data structure, such as the detection of outliers (observations that are very different from the bulk of the data).
According to most research [8][11], PCA detects two types of outliers: type (1), outliers that inflate variance, which are detected by the major PCs; and type (2), outliers that violate structure, which are detected by the minor PCs.
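The distinction between the two outlier types can be sketched with a small synthetic example (hypothetical data, not the Abilene traffic used later): a point far along the dominant trend inflates variance and scores high on the major PC, while a point that breaks the correlation structure scores high on the minor PC.

```python
import numpy as np

rng = np.random.default_rng(0)
# Strongly correlated 2-D stand-in data: PC1 lies along the diagonal trend,
# PC2 across it (hypothetical data for illustration only).
z = rng.normal(size=200)
X = np.c_[z, z + 0.1 * rng.normal(size=200)]

x_type1 = np.array([6.0, 6.0])    # type (1): far along the trend, inflates variance
x_type2 = np.array([2.0, -2.0])   # type (2): violates the correlation structure

mu = X.mean(axis=0)
_, _, Vt = np.linalg.svd(X - mu, full_matrices=False)
pc1, pc2 = Vt[0], Vt[1]           # major and minor principal directions

s1 = np.abs((x_type1 - mu) @ np.c_[pc1, pc2])   # scores of the type (1) point
s2 = np.abs((x_type2 - mu) @ np.c_[pc1, pc2])   # scores of the type (2) point
# The type (1) point stands out on the major PC, the type (2) point on the minor PC.
```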
3.1.2 Data Reduction Technique
All multivariate techniques are prone to the bias-variance tradeoff, which states that the number of variables entering a model should be severely restricted. Data is often described by many more variables than necessary for building the best model. PCA improves on other statistical reduction techniques in that it selects and feeds the model with a reduced number of variables.
3.1.3 Low Computational Requirement
PCA has low computational requirements, since its algorithm consists of simple calculations.
3.2 PCA Disadvantages
It may be noted that PCA is based on the assumptions that the dimensionality of the data can be efficiently reduced by a linear transformation, and that most of the information is contained in those directions where the variance of the input data is maximal. As is evident, these conditions are by no means always met. For example, if the points of an input set are positioned on the surface of a hypersphere, no linear transformation can reduce the dimension (a nonlinear transformation, however, can easily cope with this task). From the above, the following disadvantages of PCA are concluded.
3.2.1 Dependence on Linear Algebra
PCA relies on simple linear algebra as its main mathematical engine and is quite easy to interpret geometrically. But this strength is also a weakness, for it might very well be that other synthetic variables, more complex than just linear combinations of the original variables, would lead to a better description of the data.
3.2.2 Smallest Principal Components Receive No Attention in Statistical Techniques
The lack of interest is due to the fact that, compared with the largest principal components, which contain most of the total variance in the data, the smallest principal components only contain the noise of the data and therefore appear to contribute minimal information. However, because outliers are a common source of noise, the smallest principal components should be useful for outlier detection.
3.2.3 High False Alarms
Principal components are sensitive to outliers, since the directions of the principal components are calculated from classical estimators such as the classical mean and the classical covariance or correlation matrices.
IV. PCA ROBUSTIFICATION
In real datasets it often happens that some observations are different from the majority; such observations are called outliers, intrusions, discordants, etc. However, the classical PCA method can be
affected by outliers to the extent that the PCA model cannot detect all the actually deviating observations; this is known as the masking effect. In addition, some good data points might even appear to be outliers, which is known as the swamping effect.
Masking and swamping cause PCA to generate high false alarms. To reduce these false alarms, the use of robust estimators was proposed, since outlying points are less likely to enter into the calculation of robust estimators.
The well-known PCA robustification methods are the Minimum Covariance Determinant (MCD) and the Projection-Pursuit (PP) principle. The objective of the raw MCD is to find h > n/2 observations out of n whose covariance matrix has the smallest determinant. Its breakdown value is bn = (n - h + 1)/n; hence the number h determines the robustness of the estimator. In the Projection-Pursuit principle [3], one projects the data onto a lower-dimensional space such that a robust measure of the variance of the projected data is maximized. PP is applied where the number of variables or dimensions is very large, so PP has an advantage over MCD, since MCD requires the dimension of the dataset not to exceed about 50. Principal Component Analysis (PCA) is itself an example of the PP approach, because both search for directions with maximal dispersion of the data projected on them; but instead of using the variance as the measure of dispersion, robust PP uses a robust scale estimator [4].
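The breakdown value bn = (n - h + 1)/n above is simple to tabulate; a minimal sketch, using the paper's n = 144 observations (the particular choices of h below are illustrative assumptions, not values stated in the paper):

```python
# Breakdown value of the raw MCD, bn = (n - h + 1) / n: the fraction of
# contamination the estimator can withstand for a given subset size h.
def mcd_breakdown(n, h):
    return (n - h + 1) / n

n = 144                      # observations in the paper's dataset
h_max_robust = n // 2 + 1    # h = 73 gives the maximal 50% breakdown
h_default = int(0.75 * n)    # h = 108, a common choice trading robustness for efficiency
```

Larger h yields a more efficient but less robust estimate; h just above n/2 maximizes the breakdown value at essentially 50%.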
V. EXPERIMENTS AND RESULTS
In this section we show how we tested PCA and its robustification methods, MCD and PP, on a dataset. The data used consists of OD (Origin-Destination) flows collected and made available by Zhang [1]. The dataset is an extraction of sixty minutes of traffic flows from the first week of the traffic matrix of 2004-03-01, the traffic matrix Yin Zhang built from the Abilene network. The dataset is available in offline mode, extracted from an offline traffic matrix.
5.1 PCA on Dataset
At first, the dataset (the traffic matrix) is arranged into the data matrix X, where rows represent observations and columns represent variables or dimensions:

   X(144x12) = [x1, x2, ..., x144]T

The following steps are considered in applying the PCA method to the dataset.

Centering the dataset to have zero mean: the mean vector is calculated from the equation

   m = (1/n) SUM xi, i = 1, ..., n      (1)

and the mean is subtracted off each dimension. The product of this step is a centered data matrix Y, of the same size as the original dataset:

   yi = xi - m, i = 1, ..., n      (2)

The covariance matrix is calculated from the equation

   C = (1/(n-1)) YT Y      (3)

Finding the eigenvectors and eigenvalues of the covariance matrix, where the eigenvalues are the diagonal elements of the matrix L, by using the eigen-decomposition technique in equation (4):

   C E = E L      (4)

where E is the matrix of eigenvectors and L the diagonal matrix of eigenvalues.

Ordering the eigenvalues in decreasing order and sorting the eigenvectors according to the ordered eigenvalues; the sorted eigenvector matrix is the loadings matrix.

Calculating the scores matrix (the dataset projected on the principal components), which states the relations between the principal components and the observations. The scores matrix is calculated from the equation

   S = Y E      (5)
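The centering, covariance, eigen-decomposition and scores steps above can be sketched in NumPy; the 144x12 matrix below is random stand-in data, not the actual Abilene traffic matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
# Random 144x12 stand-in for the traffic matrix X; rows are observations,
# columns are variables (the real Abilene data is not reproduced here).
X = rng.normal(size=(144, 12))
n = len(X)

mu = X.mean(axis=0)               # mean vector, equation (1)
Y = X - mu                        # centered data matrix, equation (2)
C = Y.T @ Y / (n - 1)             # covariance matrix, equation (3)

eigvals, eigvecs = np.linalg.eigh(C)          # eigen-decomposition, equation (4)
order = np.argsort(eigvals)[::-1]             # decreasing eigenvalue order
eigvals, loadings = eigvals[order], eigvecs[:, order]

scores = Y @ loadings             # scores matrix, equation (5)

# Fraction of total variance carried by the first two PCs (about 98% on the
# paper's dataset; much lower on this random stand-in).
explained = eigvals[:2].sum() / eigvals.sum()
```

Because the loadings are orthonormal, the covariance of the scores is exactly the diagonal matrix of eigenvalues, which is a convenient sanity check on the decomposition.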
Applying the 97.5% tolerance ellipse to the bivariate dataset (the data projected on the first PCs, and the data projected on the minor PCs) reveals the outliers automatically. The ellipse is defined by the data points whose distance is equal to the square root of the chi-square 97.5% quantile with 2 degrees of freedom; the cutoff distance takes the form

   d = sqrt(chi2(2, 0.975))      (6)

The screeplot was studied: the first and the second principal components accounted for 98% of the total variance of the dataset, so the first two principal components were retained to represent the dataset as a whole. Figure (1) shows the screeplot; the plot of the data projected onto the first two principal components, used to reveal the outliers in the dataset visually, is shown in figure (2).
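The tolerance-ellipse cutoff of equation (6) needs no lookup table for 2 degrees of freedom, since the chi-square quantile has a closed form there; a minimal sketch on stand-in scores (not the paper's data):

```python
import numpy as np

# For 2 degrees of freedom the chi-square CDF is F(q) = 1 - exp(-q/2), so the
# 97.5% quantile is q = -2 ln(0.025) and the cutoff of equation (6) is its
# square root.
cutoff = np.sqrt(-2.0 * np.log(1.0 - 0.975))   # about 2.716

rng = np.random.default_rng(1)
scores2 = rng.normal(size=(144, 2))            # stand-in for data on two PCs

center = scores2.mean(axis=0)
inv_cov = np.linalg.inv(np.cov(scores2, rowvar=False))
diff = scores2 - center
d = np.sqrt(np.einsum('ij,jk,ik->i', diff, inv_cov, diff))   # Mahalanobis distances
outliers = np.flatnonzero(d > cutoff)          # points outside the tolerance ellipse
```

On clean data about 2.5% of the points fall outside the ellipse by construction, which is why the robust methods later matter: the cutoff itself flags points, but non-robust center and covariance estimates distort the distances.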
Figure 1: PCA Screeplot Figure 2: PCA Visual outliers
Figure (3) shows the tolerance ellipse on the major PCs, and figures (4) and (5) show respectively the visual recording of outliers from scatter plots of the data projected on the minor principal components, and the outliers detected by the minor principal components tuned by the tolerance ellipse.
Figure 3: PCA Tolerance Ellipse Figure 4: PCA type2 Outliers
Figure 5: Tuned Minor PCs
5.2 MCD on Dataset
Testing robust statistics MCD (Minimum Covariance Determinant) estimator yields robust locationmeasure Tmcdand robust dispersion mcd.The following steps are applied to test MCD on the dataset in order to reach the robust principalcomponents.
MCD measure is calculated from the formula:R=(xi-Tmcd(X))T.inv(mcd(X)).(xi-Tmcd(X) ) for i=1 to n (7)Tmcd or mcd =1.0e+006 *
From robust covariance matrix mcd calculating the followings:C(X)mcd or (x)mcd = 1.0e+012 *
* find robust eigenvalues as diagonal matrix as in equation (4) by replacing n with h* find robust eigenvectors as loading matrix as in equation (5).
Calculating robust scores matrix as in the following form, , , (8)The robust screeplot retaining the first two robust principal components which accounted above of98% of total variance is shown in figure (6). Figures (7) and (8) shows respectively the visual
recording of outliers from scatter plots of data projected on robust major principal components, andthe outliers detected by robust major principal components tuned by tolerance ellipse, and Figures (9)and (10) shows the visual recording of outliers from scatter plots of data projected on robust minorprincipal components and the outliers detected by robust minor principal components tuned bytolerance ellipse respectively.
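The MCD estimate can be approximated with a few concentration (C-) steps; the sketch below is a simplified illustration of that idea, not the full FAST-MCD algorithm, and it runs on random stand-in data with planted outliers rather than the paper's traffic matrix:

```python
import numpy as np

def simple_mcd(X, h, n_starts=20, n_csteps=10, seed=0):
    # Minimal C-step sketch of the MCD idea: from a random elemental start,
    # repeatedly keep the h points closest in Mahalanobis distance, and keep
    # the subset whose covariance determinant is smallest.
    rng = np.random.default_rng(seed)
    n, p = X.shape
    best_det, best_T, best_S = np.inf, None, None
    for _ in range(n_starts):
        idx = rng.choice(n, size=p + 1, replace=False)
        for _ in range(n_csteps):
            T = X[idx].mean(axis=0)
            S = np.cov(X[idx], rowvar=False)
            d2 = np.einsum('ij,jk,ik->i', X - T, np.linalg.pinv(S), X - T)
            idx = np.argsort(d2)[:h]            # C-step: keep the h closest points
        T, S = X[idx].mean(axis=0), np.cov(X[idx], rowvar=False)
        det = np.linalg.det(S)
        if det < best_det:
            best_det, best_T, best_S = det, T, S
    return best_T, best_S

rng = np.random.default_rng(1)
X = rng.normal(size=(144, 4))          # stand-in data, 4 dimensions for speed
X[:8] += 10.0                          # plant 8 gross outliers

T_mcd, S_mcd = simple_mcd(X, h=int(0.75 * 144))

# Robust distances as in equation (7); the planted outliers get very large values.
diff = X - T_mcd
rd = np.sqrt(np.einsum('ij,jk,ik->i', diff, np.linalg.inv(S_mcd), diff))

# Robust PCA: eigen-decompose the MCD scatter instead of the sample covariance.
eigvals, eigvecs = np.linalg.eigh(S_mcd)
robust_loadings = eigvecs[:, np.argsort(eigvals)[::-1]]
robust_scores = diff @ robust_loadings
```

Because the outliers never enter the h-subset, they cannot mask themselves: their robust distances stand far above those of the clean points.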
Figure 6: MCD screeplot Figure 7: MCD Visual Outliers
Figure 8: MCD Tolerance Ellipse Figure 9: MCD type2 Outliers
Figure 10: MCD Tuned Minor PCs
5.3 Projection Pursuit on Dataset
Testing the projection pursuit method on the dataset includes the following steps:

Center the data matrix X(n,p) around the L1-median to reach the centralized data matrix Y(n,p):

   yi = xi - L1(X), i = 1, ..., n      (9)

where L1(X) is a highly robust estimator of multivariate data location that resists up to 50% of outliers [11].

Construct the candidate directions pi as the normalized rows of the matrix Y, computed through its singular value decomposition (SVD):

   pi = yi / ||yi||, i = 1, ..., n      (10)-(12)

Project the whole dataset on all possible directions:

   zj = Y pj      (13)

Calculate the robust scale estimator qn for all the projections and find the direction that maximizes the qn estimator:

   v = argmax over p of qn(Y p)      (14)

qn is a scale estimator; essentially it is the first quartile of all pairwise distances between two data points [5]. The results of these steps yield the robust eigenvectors (PCs), and the squared value of the robust scale estimator gives the eigenvalues.

Project all the data on the selected direction v to obtain the robust principal component scores:

   s = Y v      (15)

Update the data matrix by its orthogonal complement:

   Y <- Y (I - v vT)      (16)
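The PP steps above can be sketched directly; the code below is a simplified illustration in which the coordinatewise median stands in for the L1-median of equation (9) and a first-quartile-of-pairwise-differences scale stands in for qn, run on stand-in data:

```python
import numpy as np

def qn_like_scale(z):
    # Robust scale in the spirit of the paper's qn: the first quartile of all
    # pairwise absolute differences between the projected points [5].
    pair = np.abs(z[:, None] - z[None, :])[np.triu_indices(len(z), k=1)]
    return np.percentile(pair, 25)

def pp_robust_pca(X, k=2):
    # Center around the coordinatewise median (a simple stand-in for the
    # L1-median of equation (9)).
    Y = X - np.median(X, axis=0)
    components = []
    for _ in range(k):
        norms = np.linalg.norm(Y, axis=1)
        keep = norms > 1e-12
        D = Y[keep] / norms[keep, None]            # candidate directions, step (10)
        scales = [qn_like_scale(Y @ d) for d in D] # robust scale of each projection
        v = D[int(np.argmax(scales))]              # maximizing direction, step (14)
        components.append(v)
        Y = Y - np.outer(Y @ v, v)                 # deflate to the orthogonal complement, step (16)
    return np.array(components)

rng = np.random.default_rng(3)
# Stand-in data whose dominant dispersion lies along the first axis.
X = rng.normal(size=(60, 5)) * np.array([5.0, 1.0, 1.0, 1.0, 1.0])
V = pp_robust_pca(X, k=2)   # rows are the robust PC directions
```

Restricting the candidate directions to the normalized data points keeps the search cheap, which is what makes PP practical when the dimension is too large for MCD.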
Project all the data on the orthogonal complement and repeat the search for the next direction.      (17)

The plot of the data projected on the first two robust principal components, used to detect outliers visually, is shown in figure (11), and the tuning of the first two robust principal components by the tolerance ellipse is shown in figure (12). Figures (13) and (14) show respectively the plot of the data projected on the minor robust principal components, used to detect outliers visually, and the tuning of the last robust principal components by the tolerance ellipse.
Figure 11: PP Visual Outliers Figure 12: PP Tolerance Ellipse
Figure 13: PP type2 Outliers Figure 14: PP Tuned Minor PCs
5.4 Results
Table (1) summarizes the outliers detected by each method. The table shows that PCA suffers from both masking and swamping. The MCD and PP results reveal the masking and swamping effects of the PCA method. The PP results are similar to those of MCD, with slight differences, since we use 12 dimensions of the dataset.
Table 1: Outliers Detection

Outliers detected by major and minor PCs                  False alarm effects
PCA              MCD              PP              Masking   Swamping
66               66               66              No        No
99               99               99              No        No
100              100              100             No        No
116              116              116             No        No
117              117              117             No        No
118              118              118             No        No
119              119              119             No        No
120              120              120             No        No
129              129              129             No        No
131              131              131             No        No
135              135              135             No        No
Normal           Normal           69              Yes       No
Normal           Normal           70              Yes       No
71               Normal           Normal          No        Yes
76               Normal           Normal          No        Yes
81               Normal           Normal          No        Yes
101              Normal           Normal          No        Yes
104              Normal           Normal          No        Yes
111              Normal           Normal          No        Yes
144              Normal           Normal          No        Yes
Normal           84               Normal          Yes       No
Normal           96               Normal          Yes       No
Normal           97               97              Yes       No
Normal           98               98              Yes       No
VI. CONCLUSION AND FUTURE WORK
The study has examined the performance of PCA and its robustification methods (MCD and PP) for intrusion detection by presenting the bi-plots and extracting outlying observations that are very different from the bulk of the data. The study showed that the tuned results are identical to the visualized ones, and it attributes the false-alarm shortcomings of PCA to the masking and swamping effects. The comparison showed that the PP results are similar to those of MCD, with slight differences in type 2 outliers, since these are considered a source of noise. Our future work will go into applying the hybrid method (ROBPCA), which takes PP as a reduction technique and MCD as a robust measure, for further performance, and into applying a dynamic robust PCA model with regard to online intrusion detection.
REFERENCES
[1]. Abilene TMs, collected by Y. Zhang. http://www.cs.utexas.edu/yzhang/research, visited on 13/07/2012.
[2]. Khalid Labib and V. Rao Vemuri, "An application of principal components analysis to the detection and visualization of computer network attacks". Annals of Telecommunications, pages 218-234, 2005.
[3]. C. Croux and A. Ruiz-Gazen, "A fast algorithm for robust principal components based on projection pursuit". COMPSTAT: Proceedings in Computational Statistics, Physica-Verlag, Heidelberg, 1996, pp. 211-217.
[4]. Mei-Ling Shyu, Shu-Ching Chen, Kanoksri Sarinnapakorn, and LiWu Chang, "A novel anomaly detection scheme based on principal component classifier". In Proceedings of the IEEE Foundations and New Directions of Data Mining Workshop, in conjunction with the Third IEEE International Conference on Data Mining (ICDM'03).
[5]. J. Edward Jackson, "A User's Guide to Principal Components". Wiley-Interscience, 1st edition, 2003.
[6]. Anukool Lakhina, Mark Crovella, and Christophe Diot, "Diagnosing network-wide traffic anomalies". In Proceedings of the 2004 Conference on Applications, Technologies, Architectures, and Protocols for Computer Communication. ACM, 2004.
[7]. Yacine Bouzida, Frederic Cuppens, Nora Cuppens-Boulahia, and Sylvain Gombault, "Efficient intrusion detection using principal component analysis". La Londe, France, June 2004.
[8]. R. Gnanadesikan, "Methods for Statistical Data Analysis of Multivariate Observations". Wiley-Interscience, New York, 2nd edition, 1997.
[9]. J. Terrell, K. Jeffay, L. Zhang, H. Shen, Zhu, and A. Nobel, "Multivariate SVD analysis for network anomaly detection". In Proceedings of the ACM SIGCOMM Conference, 2005.
[10]. Challa S. Sastry, Sanjay Rawat, Arun K. Pujari, and V. P. Gulati, "Network traffic analysis using singular value decomposition and multiscale transforms". Information Sciences: An International Journal, 2007.
[11]. I. T. Jolliffe, "Principal Component Analysis". Springer Series in Statistics, Springer, New York, 2nd edition, 2007.
[12]. Wei Wang, Xiaohong Guan, and Xiangliang Zhang, "Processing of massive audit data streams for real-time anomaly intrusion detection". Computer Communications, Elsevier, 2008.
[13]. A. Lakhina, K. Papagiannaki, M. Crovella, C. Diot, E. Kolaczyk, and N. Taft, "Structural analysis of network traffic flows". In Proceedings of SIGMETRICS, New York, NY, USA, 2004.
AUTHORS BIOGRAPHIES
Nada Badr earned her B.Sc. in Mathematical and Computer Science at the University of Gezira, Sudan. She received her M.Sc. in Computer Science at the University of Science and Technology, and is pursuing her Ph.D. in Computer Science at the University of Science and Technology, Omdurman, Sudan. She is currently serving as a lecturer at the University of Science and Technology, Faculty of Computer Science and Information Technology.
Noureldien A. Noureldien is an associate professor in Computer Science, Department of Computer Science and Information Technology, University of Science and Technology, Omdurman, Sudan. He received his B.Sc. and M.Sc. from the School of Mathematical Sciences, University of Khartoum, and received his Ph.D. in Computer Science in 2001 from the University of Science and Technology, Khartoum, Sudan. He has many papers published in journals of repute. He is currently working as the dean of the Faculty of Computer Science and Information Technology at the University of Science and Technology, Omdurman, Sudan.