Research Article

Optimized K-Means Algorithm

    Samir Brahim Belhaouari,1 Shahnawaz Ahmed,1 and Samer Mansour2

1Department of Mathematics & Computer Science, College of Science and General Studies, Alfaisal University, P.O. Box 50927, Riyadh, Saudi Arabia

    2Department of Software Engineering, College of Engineering, Alfaisal University, P.O. Box 50927, Riyadh, Saudi Arabia

    Correspondence should be addressed to Shahnawaz Ahmed; [email protected]

    Received 18 May 2014; Revised 9 August 2014; Accepted 13 August 2014; Published 7 September 2014

    Academic Editor: Gradimir Milovanović

Copyright © 2014 Samir Brahim Belhaouari et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

The localization of the region of interest (ROI), which contains the face, is the first step in any automatic recognition system, which is a special case of the face detection. However, face localization from an input image is a challenging task due to possible variations in location, scale, pose, occlusion, illumination, facial expressions, and clutter background. In this paper we introduce a new optimized k-means algorithm that finds the optimal centers for each cluster which corresponds to the global minimum of the k-means cluster. This method was tested to locate the faces in the input image based on image segmentation. It separates the input image into two classes: faces and nonfaces. To evaluate the proposed algorithm, MIT-CBCL, BioID, and Caltech datasets are used. The results show significant localization accuracy.

    1. Introduction

The face detection problem is to determine the face existence in the input image and then return its location if it exists. But in the face localization problem it is given that the input image contains a face, and the goal is to determine the location of this face. The automatic recognition system, which is a special case of the face detection, locates, in its earliest phase, the region of interest (ROI) that contains the face. Face localization in an input image is a challenging task due to possible variations in location, scale, pose, occlusion, illumination, facial expressions, and clutter background. Various methods were proposed to detect and/or localize faces in an input image; however, there is still a need to improve the performance of localization and detection methods. A survey on some face recognition and detection techniques can be found in [1]. A more recent survey, mainly on face detection, was written by Yang et al. [2].

One of the ROI detection trends is the idea of segmenting the image pixels into a number of groups or regions based on similarity properties. It gained more attention in the recognition technologies, which rely on grouping the features vector to distinguish between the image regions, and then concentrate on a particular region, which contains the face. One of the earliest surveys on image segmentation techniques was done by Fu and Mui [3]. They classify the techniques into three classes: features shareholding (clustering), edge detection, and region extraction. A later survey by N. R. Pal and S. K. Pal [4] concentrated only on fuzzy, nonfuzzy, and colour images techniques. Many efforts had been done in ROI detection, which may be divided into two types: region growing and region clustering. The difference between the two types is that the region clustering searches for the clusters without prior information while the region growing needs initial points called seeds to detect the clusters. The main problem in the region growing approach is to find the suitable seeds, since the clusters will grow from the neighbouring pixels of these seeds based on a specified deviation. For seeds selection, Wan and Higgins [5] defined a number of criteria to select insensitive initial points for the subclass of region growing. To reduce the region growing time, Chang and Li [6] proposed a fast region growing method by using parallel segmentation.

As mentioned before, the region clustering approach searches for clusters without prior information. Pappas and Jayant [7] generalized the K-means algorithm to group the



image pixels based on spatial constraints and their intensity variations. These constraints include the following: first, the region intensity to be close to the data and, second, to have a spatial continuity. Their generalization algorithm is adaptive and includes the previous spatial constraints. Like the K-means clustering algorithm, their algorithm is iterative. These spatial constraints are included using a Gibbs random field model, and they concluded that each region is characterized by a slowly varying intensity function. Because of unimportant features, the algorithm is not that much accurate. Later, to improve their method, they proposed using a caricature of the original image to keep only the most significant features and remove the unimportant features [8]. Besides the problem of time costly segmentation, there are three basic problems that occur during the clustering process, which are center redundancy, dead centers, and trapped centers at local minima. Moving K-mean (MKM) proposed in [9] can overcome the three basic problems by minimizing the center redundancy and the dead center as well as reducing indirectly the effect of trapped centers at local minima. In spite of that, MKM is sensitive to noise, centers are not located in the middle or centroid of a group of data, and some members of the centers with the largest fitness are assigned as members of a center with the smallest fitness. To reduce the effects of these problems, Isa et al. [10] proposed three modified versions of the MKM clustering algorithm for image segmentation: fuzzy moving k-means, adaptive moving k-means, and adaptive fuzzy moving k-means. After that Sulaiman and Isa [11] introduced a new segmentation method based on a new clustering algorithm, which is the Adaptive Fuzzy k-means Clustering Algorithm (AFKM). On the other hand, the Fuzzy C-Means (FCM) algorithm was used in image segmentation, but it still has some drawbacks, namely, the lack of enough robustness to noise and outliers, the crucial parameter α being selected generally based on experience, and the time of segmenting an image being dependent on its size. To overcome the drawbacks of the FCM algorithm, Cai et al. [12] integrated both local spatial and gray information to propose a fast and robust Fuzzy C-means algorithm called Fast Generalized Fuzzy C-Means Algorithm (FGFCM) for image segmentation. Moreover, many researchers have provided a definition for some data characteristics that have a significant impact on the K-means clustering analysis, including the scattering of the data, noise, high dimensionality, the size of the data and outliers in the data, datasets, types of attributes, and scales of attributes [13]. However, there is a need for more investigation to understand how data distributions affect K-means clustering performance. In [14] Xiong et al. provided a formal and organized study of the effect of skewed data distributions on K-means clustering. Previous research found the impact of high dimensionality on K-means clustering performance. The traditional Euclidean notion of proximity is not very useful on real-world high-dimensional datasets, such as document datasets and gene-expression datasets. To overcome this challenge, researchers turn to make use of dimensionality reduction techniques, such as multidimensional scaling [15]. Second, K-means has difficulty in detecting the "natural" clusters with nonspherical shapes [13]. Modified K-means

clustering algorithm is a direction to solve this problem. Guha et al. [16] proposed the CURE method, which makes use of multiple representative points to obtain the "natural" clusters shape information. The problem of outliers and noise in the data can also reduce clustering algorithms performance [17], especially for prototype-based algorithms such as K-means. The direction to solve this kind of problem is by combining some outlier removal techniques before conducting K-means clustering. For example, a simple method [9] of detecting outliers is based on the distance measure. On the other hand, many modified K-means clustering algorithms that work well for small or medium-size datasets are unable to deal with large datasets. Bradley et al. in [18] introduced a discussion of scaling K-means clustering to large datasets.

Arthur and Vassilvitskii implemented a preliminary version, k-means++, in C++ to evaluate the performance of k-means. K-means++ [19] uses a careful seeding method to optimize the speed and the accuracy of k-means. Experiments were performed on four datasets and results showed that this algorithm is O(log k)-competitive with the optimal clustering. Experiments also proved its better speed (70% faster) and accuracy (potential value obtained is better by factors of 10 to 1000) than k-means. Kanungo et al. [20] also proposed an O(n^3 ε^{-d}) algorithm for the k-means problem. It is (9 + ε)-competitive but quite slow in practice. Xiong et al. investigated the measures that best reflect the performance of K-means clustering [14]. An organized study was performed to understand the impact of data distributions on the performance of K-means clustering. Research work also has improved the traditional K-means clustering so that it can handle datasets with large variation of cluster sizes. This formal study illustrates that the entropy sometimes provided misleading information on the clustering performance. Based on this, a coefficient of variation (CV) is proposed to validate the clustering outcome. Experimental results proved that, for datasets with great variation in "true" cluster sizes, K-means lessens (less than 1.0) the variation in resultant cluster sizes. For datasets with small variation in "true" cluster sizes, K-means increases (greater than 0.3) the variation in resultant cluster sizes.

Scalability of the k-means for large datasets is one of the major limitations. Chiang et al. [21] proposed a simple and easy to implement algorithm for reducing the computation time of k-means. This pattern reduction algorithm compresses and/or removes iteration patterns that are not likely to change their membership. The pattern is compressed by selecting one of the patterns to be removed and setting its value to the average of all patterns removed. If D[r_i^l] is the pattern to be removed, then we can write it as follows:

\[
D[r_i^l] = \sum_{j=1}^{|R_i^l|} D[r_{ij}^l], \quad \text{where } r_i^l \in R_i^l. \qquad (1)
\]

This algorithm can also be applied to other clustering algorithms like population-based clustering algorithms. The computation time of many clustering algorithms can be reduced using this technique, as it is especially effective for large and high-dimensional datasets. In [22] Bradley


et al. proposed Scalable k-means that uses buffering and a two-stage compression scheme to compress or remove patterns in order to improve the performance of k-means. The algorithm is slightly faster than k-means but does not always give the same result. The factors that degrade the performance of scalable k-means are the compression processes and compression ratio. Farnstrom et al. also proposed a simple single pass k-means algorithm [23], which reduces the computation time of k-means. Ordonez and Omiecinski proposed relational k-means [24] that uses the block and incremental concept to improve the stability of scalable k-means. The computation time of k-means can also be reduced using Parallel bisecting k-means [25] and triangle inequality [26] methods.

Although K-means is the most popular and simple clustering algorithm, the major difficulty with this process is that it cannot ensure the global optimum results because the initial cluster centers are selected randomly. Reference [27] presents a novel technique that selects the initial cluster centers by using a Voronoi diagram. This method automates the selection of the initial cluster centers. To validate the performance, experiments were performed on a range of artificial and real world datasets. The quality of the algorithm is assessed using the following error rate (ER) equation:

\[
\text{ER} = \frac{\text{Number of misclassified objects}}{\text{Total number of objects}} \times 100\%. \qquad (2)
\]

The lower the error rate, the better the clustering. The proposed algorithm is able to produce better clustering results than the traditional K-means. Experimental results proved a 60% reduction in iterations for defining the centroids.

All previous solutions and efforts to increase the performance of the K-means algorithm still need more investigation since they are all looking for a local minimum. Searching for the global minimum will certainly improve the performance of the algorithm. The rest of the paper is organized as follows. In Section 2, the proposed method is introduced that finds the optimal centers for each cluster which corresponds to the global minimum of the k-means cluster. Section 3 describes face localization using the proposed method, and in Section 4 the results are discussed and analyzed using two sets from MIT-CBCL as well as the BioID and Caltech datasets. Finally, in Section 5, conclusions are drawn.

    2. K-Means Clustering

K-means, an unsupervised learning algorithm first proposed by MacQueen, 1967 [28], is a popular method of cluster analysis. It is a process of organizing the specified objects into uniform classes called clusters based on similarities among objects according to certain criteria. It solves the well-known clustering problem by considering certain attributes and performing an iterative alternating fitting process. This algorithm partitions a dataset X into k disjoint clusters such that each observation belongs to the cluster with the nearest mean. Let x_i be the centroid of cluster c_i and let d(x_j, x_i) be the dissimilarity between x_i and object x_j ∈ c_i. Then the function minimized by the k-means is given by the following equation:

\[
\min_{x_1, \ldots, x_k} E = \sum_{i=1}^{k} \sum_{x_j \in c_i} d(x_j, x_i). \qquad (3)
\]

[Figure 1: Comparison of k-means: clustering error versus the number of clusters k for global k-means, fast global k-means, k-means, and min k-means.]

K-means clustering is utilized in a vast number of applications including machine learning, fault detection, pattern recognition, image processing, statistics, and artificial intelligence [11, 29, 30]. The k-means algorithm is considered as one of the fastest clustering algorithms, with a number of variants that are sensitive to the selection of initial points and are intended for solving many issues of k-means like the evaluation of the clusters number [31], the method of initialization of the clusters centroids [32], and the speed of the algorithm [33].

2.1. Modifications in K-Means Clustering. The global k-means algorithm [34] is a proposal to improve the global search properties of the k-means algorithm and its performance on very large datasets by computing clusters successively. The main weakness of k-means clustering, that is, its sensitivity to initial positions of the cluster centers, is overcome by the global k-means clustering algorithm, which consists of a deterministic global optimization technique. It does not select initial values randomly; instead, an incremental approach is applied to optimally add one new cluster center at each stage. Global k-means also proposes a method to reduce the computational load while maintaining the solution quality. Experiments were performed to compare the performance of k-means, global k-means, min k-means, and fast global k-means as shown in Figure 1. Numerical results show that the global k-means algorithm considerably outperforms other k-means algorithms.


Bagirov [35] proposed a new version of the global k-means algorithm for minimum sum-of-squares clustering problems. He also compared three different versions of the k-means algorithm to propose the modified version of the global k-means algorithm. The proposed algorithm computes clusters incrementally, and k − 1 cluster centers from the previous iteration are used to compute the k-partition of a dataset. The starting point of the kth cluster center is computed by minimizing an auxiliary cluster function. Given a finite set of points A in the n-dimensional space, the global k-means computes the centroid X_1 of the set A as in the following equation:

\[
X_1 = \frac{1}{m} \sum_{i=1}^{m} a_i, \quad a_i \in A, \ i = 1, \ldots, m. \qquad (4)
\]

Numerical experiments performed on 14 datasets demonstrate the efficiency of the modified global k-means algorithm in comparison to the multistart k-means (MS k-means) and the GKM algorithms when the number of clusters k > 5. The modified global k-means however requires more computational time than the global k-means algorithm. Xie and Jiang [36] also proposed a novel version of the global k-means algorithm. The method of creating the next cluster center in the global k-means algorithm is modified, and that showed a better execution time without affecting the performance of the global k-means algorithm. Another extension of the standard k-means clustering is the Global Kernel k-means [37], which optimizes the clustering error in feature space by employing kernel k-means as a local search procedure. A kernel function is used in order to solve the M-clustering problem, and near-optimal solutions are used to optimize the clustering error in the feature space. This incremental algorithm has high ability to identify nonlinearly separable clusters in input space and no dependence on cluster initialization. It can handle weighted data points, making it suitable for graph partitioning applications. Two major modifications were performed in order to reduce the computational cost with no major effect on the solution quality.

Video imaging and image segmentation are important applications for k-means clustering. Hung et al. [38] modified the k-means algorithm for color image segmentation, where a weight selection procedure in the W-k-means algorithm is proposed. The evaluation function F(I) is used for comparison:

    𝐹 (𝐼) = βˆšπ‘…

    𝑅

    βˆ‘

    𝑖=1

    (𝑒𝑖/256)2

    βˆšπ΄π‘–

    , (5)

    where 𝐼 = segmented image, 𝑅 = the number of regions in thesegmented image, 𝐴

    𝑖= the area, or the number of pixels of

    the 𝑖th region, and 𝑒𝑖= the color error of region 𝑖.

    𝑒𝑖is the sum of the Euclidean distance of the color vectors

    between the original image and the segmented image. Smallervalues of 𝐹(𝐼) give better segmentation results. Resultsfrom color image segmentation illustrate that the proposedprocedure produces better segmentation than the randominitialization. Maliatski and Yadid-Pecht [39] also proposean adaptive k-means clustering, hardware-driven approach.

[Figure 2: The separation points in two dimensions (classes C1 and C2 with separators Mx, Mx1, Mx2 and My, My1, My2 along the X and Y axes).]

The algorithm uses both pixel intensity and pixel distance in the clustering process. In [40] also an FPGA implementation of real-time k-means clustering is done for colored images. A filtering algorithm is used for hardware implementation. Suliman and Isa [11] presented a unique clustering algorithm called adaptive fuzzy-K-means clustering for image segmentation. It can be used for different types of images including medical images, microscopic images, and CCD camera images. The concepts of fuzziness and belongingness are applied to produce more adaptive clustering as compared to other conventional clustering algorithms. The proposed adaptive fuzzy-K-means clustering algorithm provides better segmentation and visual quality for a range of images.

2.2. The Proposed Methodology. In this method, the face location is determined automatically by using an optimized K-means algorithm. First, the input image is reshaped into vectors of pixel values, followed by clustering the pixels, based on applying a certain threshold determined by the algorithm, into two classes. Pixels with values below the threshold are put in the nonface class. The rest of the pixels are put in the face class. However, some unwanted parts may get clustered into this face class. Therefore, the algorithm is applied again to the face class to obtain only the face part.

2.2.1. Optimized K-Means Algorithm. The proposed K-means modified algorithm is intended to solve the limitations of the standard version by using differential equations to determine the optimum separation point. This algorithm finds the optimal centers for each cluster which corresponds to the global minimum of the k-means cluster. As an example, say we would like to cluster an image into two subclasses: face and nonface. So we look for a separator which will separate the image into two different clusters in two dimensions. Then the means of the clusters can be found easily. Figure 2 shows the separation points and the mean of each class. To find these


points we proposed the following modification in continuous and discrete cases.

Let {x_1, x_2, x_3, ..., x_N} be a dataset and let x_i be an observation. Let f(x) be the probability mass function (PMF) of x:

\[
f(x) = \sum_{i=-\infty}^{\infty} \frac{1}{N} \, \delta(x - x_i), \quad x_i \in \mathbb{R}, \qquad (6)
\]

    where 𝛿(π‘₯) is the Dirac function.When the size of the data is very large, the question of the

    cost of the minimization ofβˆ‘π‘˜π‘–=1βˆ‘π‘₯π‘–βˆˆπ‘ π‘–

    |π‘₯π‘–βˆ’ πœ‡π‘–|2 arises, where

    π‘˜ is the number of clusters and πœ‡π‘–is the mean of cluster 𝑠

    𝑖.

    Let

    Var π‘˜ =π‘˜

    βˆ‘

    𝑖=1

    βˆ‘

    π‘₯π‘–βˆˆπ‘ π‘–

    π‘₯π‘–βˆ’ πœ‡π‘–

    2. (7)

Then it is enough to minimize Var 1 without losing the generality since

\[
\min_{s_i} \sum_{i=1}^{k} \sum_{x_j \in s_i} \left\| x_j - \mu_i \right\|^2
\;\ge\;
\sum_{l=1}^{d} \min_{s_i} \left( \sum_{i=1}^{k} \sum_{x_j \in s_i} \left| x_j^l - \mu_i^l \right|^2 \right), \qquad (8)
\]

where d is the dimension of x_j = (x_j^1, ..., x_j^d) and μ_j = (μ_j^1, ..., μ_j^d).

2.2.2. Continuous Case. For the continuous case, f(x) is like the probability density function (PDF). Then for 2 classes we need to minimize the following equation:

\[
\operatorname{Var} 2 = \int_{-\infty}^{M} (x - \mu_1)^2 f(x)\, dx + \int_{M}^{\infty} (x - \mu_2)^2 f(x)\, dx. \qquad (9)
\]

If d ≥ 2 we can use (8):

\[
\min \operatorname{Var} 2 \ge \sum_{k=1}^{d} \min \left( \int_{-\infty}^{M} (x^k - \mu_1^k)^2 f(x^k)\, dx + \int_{M}^{\infty} (x^k - \mu_2^k)^2 f(x^k)\, dx \right). \qquad (10)
\]

    Let π‘₯ ∈ R be any random variable with probabilitydensity function 𝑓(π‘₯); we want to find πœ‡

    1and πœ‡

    2such that

    it minimizes (7) as follows:

    (πœ‡βˆ—

    1, πœ‡βˆ—

    2) = arg(πœ‡1 ,πœ‡2)

    [ min𝑀,πœ‡1,πœ‡2

    [∫

    𝑀

    βˆ’βˆž

    (π‘₯ βˆ’ πœ‡1)2𝑓 (π‘₯) 𝑑π‘₯

    +∫

    ∞

    𝑀

    (π‘₯ βˆ’ πœ‡2)2𝑓 (π‘₯) 𝑑π‘₯]] ,

    (11)

    min𝑀,πœ‡1,πœ‡2

    Var 2 = min𝑀

    [minπœ‡1

    ∫

    𝑀

    βˆ’βˆž

    (π‘₯ βˆ’ πœ‡1)2𝑓 (π‘₯) 𝑑π‘₯

    +minπœ‡2

    ∫

    ∞

    𝑀

    (π‘₯ βˆ’ πœ‡2)2𝑓 (π‘₯) 𝑑π‘₯] ,

    (12)

    min𝑀,πœ‡1,πœ‡2

    Var 2 = min𝑀

    [minπœ‡1

    ∫

    𝑀

    βˆ’βˆž

    (π‘₯ βˆ’ πœ‡1(𝑀))2𝑓 (π‘₯) 𝑑π‘₯

    +minπœ‡2

    ∫

    ∞

    𝑀

    (π‘₯ βˆ’ πœ‡2(𝑀))2𝑓 (π‘₯) 𝑑π‘₯] .

    (13)

We know that

\[
\min_{x} E\left[(X - x)^2\right] = E\left[(X - E(X))^2\right] = \operatorname{Var}(X). \qquad (14)
\]

    So we can find πœ‡1and πœ‡2in terms of𝑀. Let us define Var

    1

    and Var2as follows:

    Var1= ∫

    𝑀

    βˆ’βˆž

    (π‘₯ βˆ’ πœ‡1)2𝑓 (π‘₯) 𝑑π‘₯,

    Var2= ∫

    ∞

    𝑀

    (π‘₯ βˆ’ πœ‡2)2𝑓 (π‘₯) 𝑑π‘₯.

    (15)

After simplification we get

\[
\operatorname{Var}_1 = \int_{-\infty}^{M} x^2 f(x)\, dx + \mu_1^2 \int_{-\infty}^{M} f(x)\, dx - 2 \mu_1 \int_{-\infty}^{M} x f(x)\, dx,
\]
\[
\operatorname{Var}_2 = \int_{M}^{\infty} x^2 f(x)\, dx + \mu_2^2 \int_{M}^{\infty} f(x)\, dx - 2 \mu_2 \int_{M}^{\infty} x f(x)\, dx. \qquad (16)
\]

To find the minimum, Var_1 and Var_2 are both partially differentiated with respect to μ_1 and μ_2, respectively. After simplification, we conclude that the minimization occurs when

\[
\mu_1 = \frac{E(X^-)}{P(X^-)}, \qquad \mu_2 = \frac{E(X^+)}{P(X^+)}, \qquad (17)
\]

where X^- = X · 1_{X < M} and X^+ = X · 1_{X ≥ M}, so that P(X^-) = P(X < M) and P(X^+) = P(X ≥ M).


To find the minimum of Var 2, we need to find ∂μ/∂M:

\[
\mu_1 = \frac{\int_{-\infty}^{M} x f(x)\, dx}{\int_{-\infty}^{M} f(x)\, dx}
\;\Rightarrow\;
\frac{\partial \mu_1}{\partial M} = \frac{M f(M) P(X^-) - f(M) E(X^-)}{P^2(X^-)},
\]
\[
\mu_2 = \frac{\int_{M}^{\infty} x f(x)\, dx}{\int_{M}^{\infty} f(x)\, dx}
\;\Rightarrow\;
\frac{\partial \mu_2}{\partial M} = \frac{M f(M) P(X^+) - f(M) E(X^+)}{P^2(X^+)}. \qquad (18)
\]

After simplification we get

\[
\frac{\partial \mu_1}{\partial M} = \frac{f(M)}{P^2(X^-)} \left[ M P(X^-) - E(X^-) \right],
\qquad
\frac{\partial \mu_2}{\partial M} = \frac{f(M)}{P^2(X^+)} \left[ M P(X^+) - E(X^+) \right]. \qquad (19)
\]

In order to find the minimum of Var 2, we will find ∂Var_1/∂M and ∂Var_2/∂M separately and add them as follows:

\[
\frac{\partial}{\partial M} (\operatorname{Var} 2) = \frac{\partial}{\partial M} (\operatorname{Var}_1) + \frac{\partial}{\partial M} (\operatorname{Var}_2),
\]
\[
\frac{\partial \operatorname{Var}_1}{\partial M}
= \frac{\partial}{\partial M} \left( \int_{-\infty}^{M} x^2 f(x)\, dx + \mu_1^2 \int_{-\infty}^{M} f(x)\, dx - 2 \mu_1 \int_{-\infty}^{M} x f(x)\, dx \right)
\]
\[
= M^2 f(M) + 2 \mu_1 \frac{\partial \mu_1}{\partial M} P(X^-) + \mu_1^2 f(M) - 2 \frac{\partial \mu_1}{\partial M} E(X^-) - 2 \mu_1 M f(M). \qquad (20)
\]

After simplification we get

\[
\frac{\partial \operatorname{Var}_1}{\partial M} = f(M) \left[ M^2 + \frac{E^2(X^-)}{P(X^-)^2} - 2M \frac{E(X^-)}{P(X^-)} \right]. \qquad (21)
\]

Similarly,

\[
\frac{\partial \operatorname{Var}_2}{\partial M}
= -M^2 f(M) + 2 \mu_2 \frac{\partial \mu_2}{\partial M} P(X^+) - \mu_2^2 f(M) - 2 \frac{\partial \mu_2}{\partial M} E(X^+) + 2 \mu_2 M f(M). \qquad (22)
\]

After simplification we get

\[
\frac{\partial \operatorname{Var}_2}{\partial M} = f(M) \left[ -M^2 - \frac{E^2(X^+)}{P(X^+)^2} + 2M \frac{E(X^+)}{P(X^+)} \right]. \qquad (23)
\]

Adding both these differentials and after simplifying, it follows that

\[
\frac{\partial}{\partial M} (\operatorname{Var} 2) = f(M) \left[ \left( \frac{E^2(X^-)}{P(X^-)^2} - \frac{E^2(X^+)}{P(X^+)^2} \right) - 2M \left( \frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)} \right) \right]. \qquad (24)
\]

Equating this with 0 gives

\[
\frac{\partial}{\partial M} (\operatorname{Var} 2) = 0,
\]
\[
\left[ \left( \frac{E^2(X^-)}{P(X^-)^2} - \frac{E^2(X^+)}{P(X^+)^2} \right) - 2M \left( \frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)} \right) \right] = 0,
\]
\[
\left[ \frac{E(X^-)}{P(X^-)} - \frac{E(X^+)}{P(X^+)} \right] \left[ \frac{E(X^-)}{P(X^-)} + \frac{E(X^+)}{P(X^+)} - 2M \right] = 0. \qquad (25)
\]

    But [(𝐸(π‘‹βˆ’)/𝑃(𝑋

    βˆ’))βˆ’(𝐸(𝑋

    +)/𝑃(𝑋

    +))] ΜΈ= 0, as in that case

    πœ‡1= πœ‡2, which contradicts the initial assumption. Therefore

    [

    𝐸 (π‘‹βˆ’)

    𝑃 (π‘‹βˆ’)

    +

    𝐸 (𝑋+)

    𝑃 (𝑋+)

    βˆ’ 2𝑀] = 0

    β‡’ 𝑀 = [

    𝐸 (π‘‹βˆ’)

    𝑃 (π‘‹βˆ’)

    +

    𝐸 (𝑋+)

    𝑃 (𝑋+)

    ] β‹…

    1

    2

    ,

    𝑀 =

    πœ‡1+ πœ‡2

    2

    .

    (26)

Therefore, we have to find all values of M that satisfy (26), which is easier than minimizing Var 2 directly.
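As an illustration (assuming SciPy is available; the function name, integration bounds, and iteration scheme below are our own choices, not part of the paper), condition (26) can be solved numerically for an arbitrary density f by iterating M ← (μ1(M) + μ2(M))/2:

```python
import numpy as np
from scipy.integrate import quad

def two_class_separator(f, lo, hi, iters=200, tol=1e-10):
    """Find M satisfying (26), M = (mu1(M) + mu2(M)) / 2, for a density f
    whose mass is (numerically) contained in [lo, hi]."""
    def mu1(M):  # E(X^-) / P(X^-), the mean of the left class
        return quad(lambda x: x * f(x), lo, M)[0] / quad(f, lo, M)[0]
    def mu2(M):  # E(X^+) / P(X^+), the mean of the right class
        return quad(lambda x: x * f(x), M, hi)[0] / quad(f, M, hi)[0]
    M = 0.5 * (lo + hi)
    for _ in range(iters):
        M_new = 0.5 * (mu1(M) + mu2(M))
        if abs(M_new - M) < tol:
            break
        M = M_new
    return M, mu1(M), mu2(M)

# Hypothetical usage: standard normal density; expect M = 0 and centers near -sqrt(2/pi), +sqrt(2/pi).
pdf = lambda x: np.exp(-x ** 2 / 2) / np.sqrt(2 * np.pi)
print(two_class_separator(pdf, -8.0, 8.0))
```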

To find the minimum in terms of M, it is enough to derive

\[
\frac{\partial \operatorname{Var} 2 \left( \mu_1(M), \mu_2(M) \right)}{\partial M}
= f(M) \left[ \frac{E^2(X^-)}{P(X < M)^2} - \frac{E^2(X^+)}{P(X \ge M)^2}
- 2M \left[ \frac{E(X^-)}{P(X \le M)} - \frac{E(X^+)}{P(X > M)} \right] \right]. \qquad (27)
\]

Then we find

\[
M = \left[ \frac{E(X^+)}{P(X^+)} + \frac{E(X^-)}{P(X^-)} \right] \cdot \frac{1}{2}. \qquad (28)
\]

2.2.3. Finding the Centers of Some Common Probability Density Functions

(1) Uniform Distribution. The probability density function of the continuous uniform distribution is

\[
f(x) =
\begin{cases}
\dfrac{1}{b - a}, & a < x < b, \\[4pt]
0, & \text{otherwise},
\end{cases}
\qquad (29)
\]


    where π‘Ž and 𝑏 are the two parameters of the distribution.Theexpected values for𝑋 β‹… 1

    π‘₯β‰₯ 𝑀 and𝑋 β‹… 1

    π‘₯< 𝑀 are

    𝐸 (𝑋+) =

    1

    2

    (

    𝑏2βˆ’π‘€2

    𝑏 βˆ’ π‘Ž

    ) ,

    𝐸 (π‘‹βˆ’) =

    1

    2

    (

    𝑀2βˆ’ π‘Ž2

    𝑏 βˆ’ π‘Ž

    ) .

    (30)

Putting all these in (28) and solving, we get

\[
M = \frac{1}{2}(a + b), \qquad
\mu_1(M) = \frac{E(X^-)}{P(X^-)} = \frac{1}{4}(3a + b), \qquad
\mu_2(M) = \frac{E(X^+)}{P(X^+)} = \frac{1}{4}(a + 3b). \qquad (31)
\]

(2) Log-Normal Distribution. The probability density function of a log-normal distribution is

\[
f(x; \mu, \sigma) = \frac{1}{x \sigma \sqrt{2\pi}} \, e^{-(\ln x - \mu)^2 / 2\sigma^2}, \quad x > 0. \qquad (32)
\]

The probabilities in the left and right of M are, respectively,

\[
P(X^+) = \frac{1}{2} - \frac{1}{2} \operatorname{erf}\left( \frac{\ln M - \mu}{\sqrt{2}\, \sigma} \right), \qquad
P(X^-) = \frac{1}{2} + \frac{1}{2} \operatorname{erf}\left( \frac{\ln M - \mu}{\sqrt{2}\, \sigma} \right). \qquad (33)
\]

The expected values for X · 1_{x ≥ M} and X · 1_{x < M} are

\[
E(X^+) = \frac{1}{2} e^{\mu + \sigma^2/2} \left\{ 1 + \operatorname{erf}\left( \frac{\mu + \sigma^2 - \ln M}{\sqrt{2}\, \sigma} \right) \right\},
\]
\[
E(X^-) = \frac{1}{2} e^{\mu + \sigma^2/2} \left\{ 1 + \operatorname{erf}\left( \frac{-\mu - \sigma^2 + \ln M}{\sqrt{2}\, \sigma} \right) \right\}. \qquad (34)
\]

Putting all these in (28), assuming μ = 0 and σ = 1, and solving, we get

\[
M = 4.244942583, \qquad
\mu_1(M) = \frac{E(X^-)}{P(X^-)} = 1.196827816, \qquad
\mu_2(M) = \frac{E(X^+)}{P(X^+)} = 7.293057348. \qquad (35)
\]

(3) Normal Distribution. The probability density function of a normal distribution is

\[
f(x; \mu, \sigma) = \frac{1}{\sigma \sqrt{2\pi}} \, e^{-(x - \mu)^2 / 2\sigma^2}. \qquad (36)
\]

The probabilities in the left and right of M are, respectively,

\[
P(X^+) = \frac{1}{2} - \frac{1}{2} \operatorname{erf}\left( \frac{M - \mu}{\sqrt{2}\, \sigma} \right), \qquad
P(X^-) = \frac{1}{2} + \frac{1}{2} \operatorname{erf}\left( \frac{M - \mu}{\sqrt{2}\, \sigma} \right). \qquad (37)
\]

The expected values for X · 1_{x ≥ M} and X · 1_{x < M} are

\[
E(X^+) = -\frac{1}{2} \mu \left\{ -1 + \operatorname{erf}\left( \frac{M - \mu}{\sqrt{2}\, \sigma} \right) \right\} + \frac{\sigma}{\sqrt{2\pi}} \, e^{-(M - \mu)^2 / 2\sigma^2},
\]
\[
E(X^-) = \frac{1}{2} \mu \left\{ 1 + \operatorname{erf}\left( \frac{M - \mu}{\sqrt{2}\, \sigma} \right) \right\} - \frac{\sigma}{\sqrt{2\pi}} \, e^{-(M - \mu)^2 / 2\sigma^2}. \qquad (38)
\]

Putting all these in (28) and solving, we get M = μ. Assuming μ = 0 and σ = 1 and solving, we get

\[
M = 0, \qquad
\mu_1(M) = \frac{E(X^-)}{P(X^-)} = -\sqrt{\frac{2}{\pi}}, \qquad
\mu_2(M) = \frac{E(X^+)}{P(X^+)} = \sqrt{\frac{2}{\pi}}. \qquad (39)
\]

2.2.4. Discrete Case. Let X be a discrete random variable and assume we have a set of N observations from X. Let p_j be the mass density function for an observation x_j.

Let x_{i−1} < M_{i−1} < x_i and x_i < M_i < x_{i+1}. We define μ_{x ≤ x_i} as the mean for all x ≤ x_i, and μ_{x ≥ x_i} as the mean for all x ≥ x_i. Define Var 2(M_{i−1}) as the variance of the two centers forced by M_{i−1} as a separator:

    Var (π‘€π‘–βˆ’1) = βˆ‘

    π‘₯𝑗≀π‘₯π‘–βˆ’1

    (π‘₯π‘—βˆ’ πœ‡π‘₯𝑗≀π‘₯π‘–βˆ’1

    )

    2

    𝑝𝑗

    + βˆ‘

    π‘₯𝑗β‰₯π‘₯𝑖

    (π‘₯π‘—βˆ’ πœ‡π‘₯𝑗β‰₯π‘₯𝑖

    )

    2

    𝑝𝑗,

    (40)

    Var (𝑀𝑖) = βˆ‘

    π‘₯𝑗≀π‘₯𝑖

    (π‘₯π‘—βˆ’ πœ‡π‘₯𝑗≀π‘₯𝑖

    )

    2

    𝑝𝑗

    + βˆ‘

    π‘₯𝑗β‰₯π‘₯𝑖+1

    (π‘₯π‘—βˆ’ πœ‡π‘₯𝑗β‰₯π‘₯𝑖+1

    )

    2

    𝑝𝑗.

    (41)


    The first part of Var(π‘€π‘–βˆ’1) simplifies as follows:

    βˆ‘

    π‘₯𝑗≀π‘₯π‘–βˆ’1

    (π‘₯π‘—βˆ’ πœ‡π‘₯𝑗≀π‘₯π‘–βˆ’1

    )

    2

    𝑝𝑗

    = βˆ‘

    π‘₯𝑗≀π‘₯π‘–βˆ’1

    π‘₯2

    π‘—π‘π‘—βˆ’ 2πœ‡π‘₯𝑗≀π‘₯π‘–βˆ’1

    Γ— βˆ‘

    π‘₯𝑗≀π‘₯π‘–βˆ’1

    π‘₯𝑗𝑝𝑗+ πœ‡2

    π‘₯𝑗≀π‘₯π‘–βˆ’1βˆ‘

    π‘₯𝑗≀π‘₯π‘–βˆ’1

    𝑝𝑗

    = 𝐸 (𝑋2β‹… 1π‘₯𝑗≀π‘₯π‘–βˆ’1

    ) βˆ’

    𝐸2(𝑋 β‹… 1

    π‘₯𝑗≀π‘₯π‘–βˆ’1)

    𝑃 (𝑋 ≀ π‘₯π‘–βˆ’1)

    .

    (42)

Similarly,

\[
\sum_{x_j \ge x_i} \left( x_j - \mu_{x_j \ge x_i} \right)^2 p_j
= E\left( X^2 \cdot 1_{x_j \ge x_i} \right) - \frac{E^2\left( X \cdot 1_{x_j \ge x_i} \right)}{P(X \ge x_i)},
\]
\[
\operatorname{Var} 2 (M_{i-1}) = E\left( X^2 \cdot 1_{x_j \le x_{i-1}} \right) - \frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})}
+ E\left( X^2 \cdot 1_{x_j \ge x_i} \right) - \frac{E^2\left( X \cdot 1_{x_j \ge x_i} \right)}{P(X \ge x_i)},
\]
\[
\operatorname{Var} 2 (M_i) = E\left( X^2 \cdot 1_{x_j \le x_i} \right) - \frac{E^2\left( X \cdot 1_{x_j \le x_i} \right)}{P(X \le x_i)}
+ E\left( X^2 \cdot 1_{x_j \ge x_{i+1}} \right) - \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})}. \qquad (43)
\]

Define ΔVar 2 = Var 2(M_i) − Var 2(M_{i−1}):

\[
\Delta \operatorname{Var} 2 =
\left[ - \frac{E^2\left( X \cdot 1_{x_j \le x_i} \right)}{P(X \le x_i)} - \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right]
- \left[ - \frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E^2\left( X \cdot 1_{x_j \ge x_i} \right)}{P(X \ge x_i)} \right]. \qquad (44)
\]

Now, in order to simplify the calculation, we rearrange ΔVar 2 as follows:

\[
\Delta \operatorname{Var} 2 =
\left[ \frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E^2\left( X \cdot 1_{x_j \le x_i} \right)}{P(X \le x_i)} \right]
+ \left[ \frac{E^2\left( X \cdot 1_{x_j \ge x_i} \right)}{P(X \ge x_i)} - \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right]. \qquad (45)
\]

The first part is simplified as follows:

\[
\frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E^2\left( X \cdot 1_{x_j \le x_i} \right)}{P(X \le x_i)}
= \frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})}
- \frac{\left[ E\left( X \cdot 1_{x_j \le x_{i-1}} \right) + x_i p_i \right]^2}{P(X \le x_{i-1}) + p_i}
\]
\[
= \frac{1}{P(X \le x_{i-1}) \left[ P(X \le x_{i-1}) + p_i \right]}
\left\{ E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right) P(X \le x_{i-1})
+ p_i E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right)
- E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right) P(X \le x_{i-1}) \right.
\]
\[
\left. \qquad\quad - 2 x_i p_i E\left( X \cdot 1_{x_j \le x_{i-1}} \right) P(X \le x_{i-1})
- x_i^2 p_i^2 P(X \le x_{i-1}) \right\}
\]
\[
= \frac{1}{P(X \le x_{i-1}) \left[ P(X \le x_{i-1}) + p_i \right]}
\left\{ p_i E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right)
- 2 x_i p_i E\left( X \cdot 1_{x_j \le x_{i-1}} \right) P(X \le x_{i-1})
- x_i^2 p_i^2 P(X \le x_{i-1}) \right\}. \qquad (46)
\]

Similarly, simplifying the second part yields

\[
\frac{E^2\left( X \cdot 1_{x_j \ge x_i} \right)}{P(X \ge x_i)} - \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})}
= \frac{\left[ E\left( X \cdot 1_{x_j \ge x_{i+1}} \right) + x_i p_i \right]^2}{P(X \ge x_{i+1}) + p_i}
- \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})}
\]
\[
= \frac{1}{P(X \ge x_{i+1}) \left[ P(X \ge x_{i+1}) + p_i \right]}
\left\{ E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right) P(X \ge x_{i+1})
- p_i E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right)
- E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right) P(X \ge x_{i+1}) \right.
\]
\[
\left. \qquad\quad + 2 x_i p_i E\left( X \cdot 1_{x_j \ge x_{i+1}} \right) P(X \ge x_{i+1})
+ x_i^2 p_i^2 P(X \ge x_{i+1}) \right\}
\]
\[
= \frac{1}{P(X \ge x_{i+1}) \left[ P(X \ge x_{i+1}) + p_i \right]}
\left\{ - p_i E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right)
+ 2 x_i p_i E\left( X \cdot 1_{x_j \ge x_{i+1}} \right) P(X \ge x_{i+1})
+ x_i^2 p_i^2 P(X \ge x_{i+1}) \right\}. \qquad (47)
\]

In general, these types of optimization arise when the dataset is very big.

If we consider N to be very large, then p_k ≪ max(∑_{i ≥ k+1} p_i; ∑_{i ≤ k−1} p_i), P(X ≤ x_{i−1}) + p_i ≈ P(X ≤ x_i), and P(X ≥ x_{i+1}) + p_i ≈ P(X ≥ x_i):

\[
\Delta \operatorname{Var} 2 \approx
\left[ \frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right) p_i}{P^2(X \le x_{i-1})} - \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right) p_i}{P^2(X \ge x_{i+1})} \right]
- 2 p_i x_i \left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right]
- p_i^2 x_i^2 \left[ \frac{1}{P(X \le x_{i-1})} - \frac{1}{P(X \ge x_{i+1})} \right], \qquad (48)
\]

and, as lim_{N→∞} { p_i² x_i² [ (1/P(X ≤ x_{i−1})) − (1/P(X ≥ x_{i+1})) ] } = 0, we can further simplify ΔVar 2 as follows:

\[
\Delta \operatorname{Var} 2 \approx
\left[ \frac{E^2\left( X \cdot 1_{x_j \le x_{i-1}} \right) p_i}{P^2(X \le x_{i-1})} - \frac{E^2\left( X \cdot 1_{x_j \ge x_{i+1}} \right) p_i}{P^2(X \ge x_{i+1})} \right]
- 2 p_i x_i \left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right]
\]
\[
= p_i \left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right]
\left[ \left( \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} + \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right) - 2 x_i \right]. \qquad (49)
\]

To find the minimum of ΔVar 2, let us locate when ΔVar 2 = 0:

\[
p_i \left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} - \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right]
\left[ \left( \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} + \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right) - 2 x_i \right] = 0. \qquad (50)
\]

[Figure 3: Discrete random variables (the two-class variance over the observations x_1, x_2, ..., x_n with the separator M).]

But p_i ≠ 0, and [(E(X · 1_{x_j ≤ x_{i−1}})/P(X ≤ x_{i−1})) − (E(X · 1_{x_j ≥ x_{i+1}})/P(X ≥ x_{i+1}))] < 0, since x_i is an increasing sequence, which implies that μ_{i−1} < μ_i, where

\[
\mu_{i-1} = \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})},
\qquad
\mu_i = \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})}. \qquad (51)
\]

Therefore,

\[
\left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} + \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right] - 2 x_i = 0. \qquad (52)
\]

Let us define

\[
F(x_i) = 2 x_i - \left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} + \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right],
\]
\[
F(x_i) = 0
\;\Rightarrow\;
x_i = \left[ \frac{E\left( X \cdot 1_{x_j \le x_{i-1}} \right)}{P(X \le x_{i-1})} + \frac{E\left( X \cdot 1_{x_j \ge x_{i+1}} \right)}{P(X \ge x_{i+1})} \right] \cdot \frac{1}{2}
\;\Rightarrow\;
x_i = \frac{\mu_{i-1} + \mu_i}{2}, \qquad (53)
\]

    where πœ‡π‘–βˆ’1

    and πœ‡π‘–are the means of the first and second parts,

    respectively.To find the minimum for the total variation, it is enough

    to find the good separator𝑀 between π‘₯𝑖and π‘₯

    π‘–βˆ’1such that

    𝐹(π‘₯𝑖) β‹… 𝐹(π‘₯

    𝑖+1) < 0 and 𝐹(π‘₯

    𝑖) < 0 (see Figure 3).

    After finding few local minima, we can get the globalminimum by finding the smallest among them.
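To make the discrete procedure concrete, here is a minimal sketch for sorted one-dimensional data with equal weights p_j = 1/N (an illustrative reading of the search, not the authors' implementation): cumulative sums give every left/right mean in one pass, candidate separators are the splits where the midpoint of the two means falls between consecutive observations (the zero crossing of F in (53)), and the candidate with the smallest two-class variance is kept.

```python
import numpy as np

def optimal_two_split(x):
    """Global two-class split of 1-D data: scan the splits where
    M = (mu_left + mu_right) / 2 lies between consecutive sorted points
    and keep the split with the smallest value of Var 2."""
    x = np.sort(np.asarray(x, dtype=float))
    n = len(x)
    csum, csq = np.cumsum(x), np.cumsum(x ** 2)
    best_var, best_m = np.inf, None
    for i in range(1, n):                      # left part x[:i], right part x[i:]
        mu_l = csum[i - 1] / i
        mu_r = (csum[-1] - csum[i - 1]) / (n - i)
        m = 0.5 * (mu_l + mu_r)
        var2 = (csq[i - 1] - i * mu_l ** 2) + (csq[-1] - csq[i - 1] - (n - i) * mu_r ** 2)
        if x[i - 1] <= m <= x[i] and var2 < best_var:
            best_var, best_m = var2, m
    return best_m, best_var

# Hypothetical usage on a sample drawn from two well-separated components.
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(40, 5, 500), rng.normal(180, 10, 500)])
print(optimal_two_split(data))
```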

    3. Face Localization Using Proposed Method

Now it becomes clear that the K-means modified algorithm is superior to the standard algorithm in terms of finding


    Figure 4: The original image.

    Figure 5: The face with shade only.

a global minimum, and we can explain our solution based on the modified algorithm. Suppose we have an image with pixel values from 0 to 255. First, we reshaped the input image into a vector of pixels; Figure 4 shows an input image before the operation of the K-means modified algorithm. The image contains a face part, a normal background, and a background with illumination. Then we split the image pixels into two classes: one from 0 to x_1 and another from x_1 to 255, by applying a certain threshold determined by the algorithm. Let [0, ..., x_1] be class 1, which represents the nonface part, and let [x_1, ..., 255] be class 2, which represents the face with the shade.

All pixels with values less than the threshold belong to the first class and will be assigned to 0 values. This will keep the pixels with values above the threshold, and these pixels belong to both the face part and the illuminated part of the background, as shown in Figure 5.

In order to remove the illuminated background part, another threshold will be applied to the pixels using the algorithm, meant to separate the face part and the illuminated background part. New classes will be obtained, from x_1 to x_2 and from x_2 to 255. Then, the pixels with values less than the threshold will be assigned a value of 0, and the nonzero pixels left represent the face part. Let [x_1, ..., x_2] be class 1, which represents the illuminated background part, and let [x_2, ..., 255] be class 2, which represents the face part.

The result of the removal of the illuminated background part is shown in Figure 6.

    Figure 6: The face without illumination.

    Figure 7: The filtered image.

One can see that some face pixels were assigned to 0 due to the effects of the illumination. Therefore, there is a need to return to the original image in Figure 4 to fully extract the image window corresponding to the face. A filtering process is then performed to reduce the noise, and the result is shown in Figure 7.

Figure 8 shows the final result, which is a window containing the face extracted from the original image.
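To make the two-pass procedure concrete, a minimal sketch of the localization pipeline is given below; the grayscale image array, the use of a median filter, the filter size, and the bounding-box crop are our own illustrative choices rather than details taken from the paper.

```python
import numpy as np
from scipy.ndimage import median_filter  # assumed available for the filtering step

def two_class_threshold(values):
    """Optimal two-class separator M = (mu1 + mu2) / 2 of 1-D values, cf. (26)/(53)."""
    v = np.sort(np.asarray(values, dtype=float).ravel())
    n = len(v)
    csum, csq = np.cumsum(v), np.cumsum(v ** 2)
    best_var, best_m = np.inf, float(v[n // 2])
    for i in range(1, n):
        mu_l, mu_r = csum[i - 1] / i, (csum[-1] - csum[i - 1]) / (n - i)
        var2 = (csq[i - 1] - i * mu_l ** 2) + (csq[-1] - csq[i - 1] - (n - i) * mu_r ** 2)
        if var2 < best_var:
            best_var, best_m = var2, 0.5 * (mu_l + mu_r)
    return best_m

def localize_face(gray):
    """Two-pass clustering of pixel values (0-255) followed by a crop of the face window."""
    t1 = two_class_threshold(gray)                  # pass 1: nonface vs. face + shade
    pass1 = np.where(gray > t1, gray, 0)
    t2 = two_class_threshold(pass1[pass1 > 0])      # pass 2: illuminated background vs. face
    face_mask = median_filter((pass1 > t2).astype(np.uint8), size=5)
    rows, cols = np.nonzero(face_mask)
    r0, r1, c0, c1 = rows.min(), rows.max(), cols.min(), cols.max()
    return gray[r0:r1 + 1, c0:c1 + 1]               # window from the original image
```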

    4. Results

In this section, the experimental results of the proposed algorithm will be analyzed and discussed. First, it starts with presenting a description of the datasets used in this work, followed by the results of the previous part of the work. Three popular databases were used to test the proposed face localization method. The MIT-CBCL database has image size 115 × 115, and 300 images were selected from this dataset based on different conditions such as light variation, pose, and background variations. The second dataset is the BioID dataset with image size 384 × 286, and 300 images from the database were selected based on different conditions such as indoor/outdoor. The last dataset is the Caltech dataset with image size 896 × 592, and the 300 images from the database were selected based on different conditions such as indoor/outdoor.

Figure 8: The corresponding window in the original image.

Figure 9: Samples from MIT-CBCL dataset.

4.1. MIT-CBCL Dataset. This dataset was established by the Center for Biological and Computational Learning (CBCL) at Massachusetts Institute of Technology (MIT) [41]. It has 10 persons with 200 images per person and an image size of 115 × 115 pixels. The images of each person are with different light conditions, clutter background, and scale and different poses. A few examples of these images are shown in Figure 9.

4.2. BioID Dataset. This dataset [42] consists of 1521 gray level images with a resolution of 384 × 286 pixels. Each one shows the frontal view of a face of one out of 23 different test persons. The images were taken under different lighting conditions, with a complex background, and contain tilted and rotated faces. This dataset is considered as the most difficult one for eye detection (see Figure 10).

4.3. Caltech Dataset. This database [43] has been collected by Markus Weber at the California Institute of Technology. The database contains 450 face images of 27 subjects under different lighting conditions, different face expressions, and complex backgrounds (see Figure 11).

4.4. Proposed Method Result. The proposed method is evaluated on these datasets, where the result was 100% with a localization time of 4 s.

    Figure 10: Sample of faces from BioID dataset for localization.

    Figure 11: Sample of faces from Caltech dataset for localization.

Table 1: Face localization using the K-mean modified algorithm on MIT-CBCL, BioID, and Caltech; the sets contain faces with different poses and faces with clutter background as well as indoors/outdoors.

Dataset      Images    Accuracy (%)
MIT-Set 1    150       100
MIT-Set 2    150       100
BioID        300       100
Caltech      300       100

Table 1 shows the proposed method results with the MIT-CBCL, Caltech, and BioID databases. In this test, we have focused on the use of images of faces from different angles, from 30 degrees left to 30 degrees right, and the variation in the background, as well as the cases of indoor and outdoor images. Two sets of images are selected from the MIT-CBCL dataset and two other sets from the other databases to evaluate the proposed method. The first set from MIT-CBCL contains the images with different poses and the second set contains images with different backgrounds. Each set contains 150 images. The BioID and Caltech sets contain images with indoor and outdoor settings. The results showed the efficiency of the proposed K-mean modified algorithm to locate the faces from the image with high background and poses changes. In addition, another parameter was considered in this test, which is the localization time. From the results, the proposed method can locate the face in a very short time because it has less complexity due to the use of differential equations.


    Figure 12: Examples of face localization using k-mean modified algorithm on MIT-CBCL dataset.


Figure 13: Example of face localization using k-mean modified algorithm on BioID dataset: (a) the original image, (b) face position, and (c) filtered image.

Figure 12 shows some examples of face localization on MIT-CBCL.

In Figure 13(a), an original image from the BioID dataset is shown. The localization position is shown in Figure 13(b). But there is a need to do a filtering operation in order to remove the unwanted parts, and this is shown in Figure 13(c); then the area of the ROI will be determined.

Figure 14 shows the face position on an image from the Caltech dataset.

    5. Conclusion

In this paper, we focus on developing the ROI detection by suggesting a new method for face localization. In this method of localization, the input image is reshaped into a vector and then the optimized k-means algorithm is applied twice to cluster the image pixels into classes. At the end, the block window corresponding to the class of pixels containing the face only is extracted from the input image. Three sets of images taken from the MIT, BioID, and Caltech datasets, differing in illumination and backgrounds as well as indoor/outdoor settings, were used to evaluate the proposed method. The results show that the proposed method achieved significant localization accuracy. Our future research direction includes finding the optimal centers for higher dimension data, which we believe is just a generalization of our approach to higher dimensions (8) and to higher subclusters.

    Conflict of Interests

The authors declare that there is no conflict of interests regarding the publication of this paper.



Figure 14: Example of face localization using k-mean modified algorithm on Caltech dataset: (a) the original image, (b) face position, and (c) filtered image.

    Acknowledgments

This study is a part of the Internal Grant Project no. 419011811122. The authors gratefully acknowledge the continued support from Alfaisal University and its Office of Research.

    References

[1] R. Chellappa, C. L. Wilson, and S. Sirohey, "Human and machine recognition of faces: a survey," Proceedings of the IEEE, vol. 83, no. 5, pp. 705–740, 1995.

[2] M.-H. Yang, D. J. Kriegman, and N. Ahuja, "Detecting faces in images: a survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 1, pp. 34–58, 2002.

[3] K. S. Fu and J. K. Mui, "A survey on image segmentation," Pattern Recognition, vol. 13, no. 1, pp. 3–16, 1981.

[4] N. R. Pal and S. K. Pal, "A review on image segmentation techniques," Pattern Recognition, vol. 26, no. 9, pp. 1277–1294, 1993.

[5] S.-Y. Wan and W. E. Higgins, "Symmetric region growing," IEEE Transactions on Image Processing, vol. 12, no. 9, pp. 1007–1015, 2003.

[6] Y.-L. Chang and X. Li, "Fast image region growing," Image and Vision Computing, vol. 13, no. 7, pp. 559–571, 1995.

[7] T. N. Pappas and N. S. Jayant, "An adaptive clustering algorithm for image segmentation," in Proceedings of the International Conference on Acoustics, Speech, and Signal Processing (ICASSP '89), vol. 3, pp. 1667–1670, May 1989.

[8] T. N. Pappas, "An adaptive clustering algorithm for image segmentation," IEEE Transactions on Signal Processing, vol. 40, no. 4, pp. 901–914, 1992.

[9] M. Y. Mashor, "Hybrid training algorithm for RBF network," International Journal of the Computer, the Internet and Management, vol. 8, no. 2, pp. 50–65, 2000.

[10] N. A. M. Isa, S. A. Salamah, and U. K. Ngah, "Adaptive fuzzy moving K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 55, no. 4, pp. 2145–2153, 2009.

[11] S. N. Sulaiman and N. A. M. Isa, "Adaptive fuzzy-K-means clustering algorithm for image segmentation," IEEE Transactions on Consumer Electronics, vol. 56, no. 4, pp. 2661–2668, 2010.

[12] W. Cai, S. Chen, and D. Zhang, "Fast and robust fuzzy c-means clustering algorithms incorporating local information for image segmentation," Pattern Recognition, vol. 40, no. 3, pp. 825–838, 2007.

[13] P. Tan, M. Steinbach, and V. Kumar, Introduction to Data Mining, Pearson Addison-Wesley, 2006.

[14] H. Xiong, J. Wu, and J. Chen, "K-means clustering versus validation measures: a data-distribution perspective," IEEE Transactions on Systems, Man, and Cybernetics B: Cybernetics, vol. 39, no. 2, pp. 318–331, 2009.

[15] I. Borg and P. J. F. Groenen, Modern Multidimensional Scaling: Theory and Applications, Springer Science + Business Media, 2005.

[16] S. Guha, R. Rastogi, and K. Shim, "CURE: an efficient clustering algorithm for large databases," ACM SIGMOD Record, vol. 27, no. 2, pp. 73–84, 1998.

[17] E. M. Knorr, R. T. Ng, and V. Tucakov, "Distance-based outliers: algorithms and applications," VLDB Journal, vol. 8, no. 3-4, pp. 237–253, 2000.

[18] P. S. Bradley, U. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the 4th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.

[19] D. Arthur and S. Vassilvitskii, "k-means++: the advantages of careful seeding," in Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, pp. 1027–1035, 2007.

[20] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "A local search approximation algorithm for k-means clustering," Computational Geometry: Theory and Applications, vol. 28, no. 2-3, pp. 89–112, 2004.


[21] M. C. Chiang, C. W. Tsai, and C. S. Yang, "A time-efficient pattern reduction algorithm for k-means clustering," Information Sciences, vol. 181, no. 4, pp. 716–731, 2011.

[22] P. S. Bradley, U. M. Fayyad, and C. Reina, "Scaling clustering algorithms to large databases," in Proceedings of the ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '98), pp. 9–15, 1998.

[23] F. Farnstrom, J. Lewis, and C. Elkan, "Scalability for clustering algorithms revisited," SIGKDD Explorations, vol. 2, no. 1, pp. 51–57, 2000.

[24] C. Ordonez and E. Omiecinski, "Efficient disk-based K-means clustering for relational databases," IEEE Transactions on Knowledge and Data Engineering, vol. 16, no. 8, pp. 909–921, 2004.

[25] Y. Li and S. M. Chung, "Parallel bisecting k-means with prediction clustering algorithm," Journal of Supercomputing, vol. 39, no. 1, pp. 19–37, 2007.

[26] C. Elkan, "Using the triangle inequality to accelerate k-means," in Proceedings of the 20th International Conference on Machine Learning, pp. 147–153, August 2003.

[27] D. Reddy and P. K. Jana, "Initialization for K-means clustering using Voronoi diagram," Procedia Technology, vol. 4, pp. 395–400, 2012.

[28] J. B. MacQueen, "Some methods for classification and analysis of multivariate observations," in Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability, pp. 281–297, University of California Press, 1967.

[29] R. Ebrahimpour, R. Rasoolinezhad, Z. Hajiabolhasani, and M. Ebrahimi, "Vanishing point detection in corridors: using Hough transform and K-means clustering," IET Computer Vision, vol. 6, no. 1, pp. 40–51, 2012.

[30] P. S. Bishnu and V. Bhattacherjee, "Software fault prediction using quad tree-based K-means clustering algorithm," IEEE Transactions on Knowledge and Data Engineering, vol. 24, no. 6, pp. 1146–1150, 2012.

[31] T. Elomaa and H. Koivistoinen, "On autonomous K-means clustering," in Proceedings of the International Symposium on Methodologies for Intelligent Systems, pp. 228–236, May 2005.

[32] J. M. Peña, J. A. Lozano, and P. Larrañaga, "An empirical comparison of four initialization methods for the K-means algorithm," Pattern Recognition Letters, vol. 20, no. 10, pp. 1027–1040, 1999.

[33] T. Kanungo, D. M. Mount, N. S. Netanyahu, C. D. Piatko, R. Silverman, and A. Y. Wu, "An efficient k-means clustering algorithm: analysis and implementation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 881–892, 2002.

[34] A. Likas, N. Vlassis, and J. J. Verbeek, "The global k-means clustering algorithm," Pattern Recognition, vol. 36, no. 2, pp. 451–461, 2003.

[35] A. M. Bagirov, "Modified global k-means algorithm for minimum sum-of-squares clustering problems," Pattern Recognition, vol. 41, no. 10, pp. 3192–3199, 2008.

[36] J. Xie and S. Jiang, "A simple and fast algorithm for global K-means clustering," in Proceedings of the 2nd International Workshop on Education Technology and Computer Science (ETCS '10), vol. 2, pp. 36–40, Wuhan, China, March 2010.

[37] G. F. Tzortzis and A. C. Likas, "The global kernel k-means algorithm for clustering in feature space," IEEE Transactions on Neural Networks, vol. 20, no. 7, pp. 1181–1194, 2009.

[38] W.-L. Hung, Y.-C. Chang, and E. Stanley Lee, "Weight selection in W-K-means algorithm with an application in color image segmentation," Computers and Mathematics with Applications, vol. 62, no. 2, pp. 668–676, 2011.

[39] B. Maliatski and O. Yadid-Pecht, "Hardware-driven adaptive k-means clustering for real-time video imaging," IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 1, pp. 164–166, 2005.

[40] T. Saegusa and T. Maruyama, "An FPGA implementation of real-time K-means clustering for color images," Journal of Real-Time Image Processing, vol. 2, no. 4, pp. 309–318, 2007.

[41] "CBCL Face Recognition Database," 2010, http://cbcl.mit.edu/software-datasets/heisele/facerecognition-database.html.

[42] BioID: BioID Support: Downloads: BioID Face Database, 2011.

[43] http://www.vision.caltech.edu/html-files/archive.html.
