Research Article Ant Colony Optimization with...

12
Hindawi Publishing Corporation Mathematical Problems in Engineering Volume 2013, Article ID 510167, 11 pages http://dx.doi.org/10.1155/2013/510167 Research Article Ant Colony Optimization with Three Stages for Independent Test Cost Attribute Reduction Zilong Xu, Hong Zhao, Fan Min, and William Zhu Lab of Granular Computing, Minnan Normal University, Zhangzhou 363000, China Correspondence should be addressed to Fan Min; [email protected] Received 3 January 2013; Revised 1 May 2013; Accepted 23 May 2013 Academic Editor: Lu Zhen Copyright © 2013 Zilong Xu et al. is is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Minimal test cost attribute reduction is an important problem in cost-sensitive learning. Recently, heuristic algorithms including the information gain-based algorithm and the genetic algorithm have been designed for this problem. However, in many cases these algorithms cannot find the optimal solution. In this paper, we develop an ant colony optimization algorithm to tackle this problem. e attribute set is represented as a graph with each vertex corresponding to an attribute and weight of each edge to pheromone. Our algorithm contains three stages, namely, the addition stage, the deletion stage, and the filtration stage. In the addition stage, each ant starts from the initial position and traverses edges probabilistically until the stopping criterion is satisfied. e pheromone of the traveled path is also updated in this process. In the deletion stage, each ant deletes redundant attributes. Two strategies, called the centralized deletion strategy and the distributed deletion strategy, are proposed. Finally, the ant with minimal test cost is selected to construct the reduct in the filtration stage. Experimental results on UCI datasets indicate that the algorithm is significantly better than the information gain-based one. It also outperforms the genetic algorithm on medium-sized dataset Mushroom. 1. Introduction Cost-sensitive attribute reduction has gained much research interest in rough sets. People have considered three types of costs, namely, test cost [1], misclassification cost [2], and delay cost [3, 4]. Test cost is the money and/or time that is required to obtain attribute values. In real applications, only part of tests is needed to maintain enough information for classification. We would like to choose an attribute reduct [5] that minimizes the total test cost. is issue is called the minimal test cost reduct (MTR) problem [1]. We explain the MTR problem through a simple example. ere is a medical decision system with five attributes, Mcv, Alkphos, Sgpt, Sgot, and Gammagt. e test costs of these five attributes are $55, $10, $15, $20, and $25, respectively. e decision system has three reducts, {Mcv, Alkphos}, {Alkphos, Sgpt, Gammagt}, and {Alkphos, Sgot, Gammagt}. Alkphos is the core attribute. e minimal reduct is {Mcv, Alkphos} including only two attributes. e minimal test cost reduct is {Alkphos, Sgpt, Gammagt} with only $50. e minimal test cost attribute reduction (MTR) is proposed for saving resources, and the problem is meaningful in applications. e MTR problem is more general than the minimal reduct problem which is NP-hard; hence the MTR problem is at least NP-hard. Consequently, heuristic algorithms are needed to deal with such problem. Heuristic algorithms including information gain-based -weighted reduction algorithm [1] and genetic algorithm [6] have been designed to deal with MTR problem. Unfortunately, they oſten do not find the optimal solution for medium-sized datasets. Although the competition approach [1] helps to improve the performance through constructing a population of reducts, the results are still unsatisfactory. ere is still room to obtain more sophisticated algorithms through other techniques. In this paper, we develop an algorithm based on ant colony optimization for the MTR problem. e attribute set is represented as a complete graph with each vertex corre- sponding to one attribute. en batches of ants are generated for attribute subset selection. Our algorithm contains three stages, namely, the addition stage, the deletion stage, and the filtration stage. First, in the addition stage, core vertexes are compressed into one vertex as the initial position of all ants. From the initial position, each ant selects next

Transcript of Research Article Ant Colony Optimization with...

Hindawi Publishing CorporationMathematical Problems in EngineeringVolume 2013 Article ID 510167 11 pageshttpdxdoiorg1011552013510167

Research ArticleAnt Colony Optimization with Three Stages for IndependentTest Cost Attribute Reduction

Zilong Xu Hong Zhao Fan Min and William Zhu

Lab of Granular Computing Minnan Normal University Zhangzhou 363000 China

Correspondence should be addressed to Fan Min minfanphd163com

Received 3 January 2013 Revised 1 May 2013 Accepted 23 May 2013

Academic Editor Lu Zhen

Copyright copy 2013 Zilong Xu et al This is an open access article distributed under the Creative Commons Attribution Licensewhich permits unrestricted use distribution and reproduction in any medium provided the original work is properly cited

Minimal test cost attribute reduction is an important problem in cost-sensitive learning Recently heuristic algorithms includingthe information gain-based algorithm and the genetic algorithm have been designed for this problemHowever inmany cases thesealgorithms cannot find the optimal solution In this paper we develop an ant colony optimization algorithm to tackle this problemThe attribute set is represented as a graph with each vertex corresponding to an attribute and weight of each edge to pheromoneOur algorithm contains three stages namely the addition stage the deletion stage and the filtration stage In the addition stageeach ant starts from the initial position and traverses edges probabilistically until the stopping criterion is satisfiedThe pheromoneof the traveled path is also updated in this process In the deletion stage each ant deletes redundant attributes Two strategies calledthe centralized deletion strategy and the distributed deletion strategy are proposed Finally the ant withminimal test cost is selectedto construct the reduct in the filtration stage Experimental results on UCI datasets indicate that the algorithm is significantly betterthan the information gain-based one It also outperforms the genetic algorithm on medium-sized dataset Mushroom

1 Introduction

Cost-sensitive attribute reduction has gained much researchinterest in rough sets People have considered three typesof costs namely test cost [1] misclassification cost [2] anddelay cost [3 4] Test cost is the money andor time that isrequired to obtain attribute values In real applications onlypart of tests is needed to maintain enough information forclassification We would like to choose an attribute reduct[5] that minimizes the total test cost This issue is called theminimal test cost reduct (MTR) problem [1]

We explain the MTR problem through a simple exampleThere is a medical decision system with five attributes McvAlkphos Sgpt Sgot and Gammagt The test costs of thesefive attributes are $55 $10 $15 $20 and $25 respectivelyThe decision system has three reducts McvAlkphosAlkphos SgptGammagt and Alkphos SgotGammagtAlkphos is the core attribute The minimal reduct isMcvAlkphos including only two attributes The minimaltest cost reduct is Alkphos SgptGammagt with only $50

The minimal test cost attribute reduction (MTR) isproposed for saving resources and the problem ismeaningful

in applications The MTR problem is more general thanthe minimal reduct problem which is NP-hard hence theMTR problem is at least NP-hard Consequently heuristicalgorithms are needed to deal with such problem Heuristicalgorithms including information gain-based 120582-weightedreduction algorithm [1] and genetic algorithm [6] have beendesigned to deal with MTR problem Unfortunately theyoften do not find the optimal solution for medium-sizeddatasets Although the competition approach [1] helps toimprove the performance through constructing a populationof reducts the results are still unsatisfactory There is stillroom to obtain more sophisticated algorithms through othertechniques

In this paper we develop an algorithm based on antcolony optimization for the MTR problem The attribute setis represented as a complete graph with each vertex corre-sponding to one attribute Then batches of ants are generatedfor attribute subset selection Our algorithm contains threestages namely the addition stage the deletion stage andthe filtration stage First in the addition stage core vertexesare compressed into one vertex as the initial position ofall ants From the initial position each ant selects next

2 Mathematical Problems in Engineering

vertex according to test costs of each adjacent vertex andpheromone of each adjacent edge If the attribute vertexesan ant has traveled satisfy the positive region conditionthe ant stops otherwise continues to add attributes Secondin the deletion stage redundant attributes are deleted fromthe obtained attribute subsets and the pheromone of eachedge is updated Two strategies including centralized deletionand distributed deletion are designed for this stage Thirdthe ant with the least test cost is selected and the attributesubset corresponding to its path is output as the result in thefiltration stage

To evaluate the performance of algorithms we adoptthree measures namely finding optimal factor (FOF) max-imal exceeding factor (MEF) and average exceeding factor(AEF) [1] Experimental results indicate that our algorithmoutperforms the information gain-based algorithm in mostdatasets with different test cost distributions It can obtainbetter results than the genetic algorithm except some smalldatasets One possible reason is that the ant colony opti-mization algorithm produces more diverse solutions than theexisting ones The distributed deletion strategy is superior tothe centralized one on medium-sized dataset Mushroom

The rest of the paper is organized as follows Section 2reviews the basic concepts in rough sets and decision systemSection 3 proposes the ant colony optimization to tackle theminimal test cost reduction problem In Section 4 we presentour experiment schemes and show the results We also give asimple analysis of our experimental results Finally Section 5presents the conclusion

2 Preliminaries

This section reviews basic knowledge test-cost-independentdecision systems relative reduct minimal test cost reductproblem genetic algorithm and ant colony optimization

21 Test-Cost-Independent Decision Systems Most super-vised learning approaches are based on decision systems Adecision system is often denoted by 119878 = (119880 119862119863 119881

119886| 119886 isin

119862cup119863 119868119886| 119886 isin 119862cup119863) where119880 is a finite set of objects called

the universe 119862 is the set of conditional attributes also calledthe set of tests 119863 is the set of decision attributes also calledthe decision 119881

119886is the set of values for each 119886 isin 119862 cup 119863 and

119868119886 119880 rarr 119881

119886is an information function for each 119886 isin 119862 cup 119863

We often denote 119881119886| 119886 isin 119862 cup 119863 and 119868

119886| 119886 isin 119862 cup 119863 by 119881

and 119868 respectively A decision system is often represented bya decision table as shown in Table 1

We consider the simplest case though most widely usedtype of cost-sensitive decision systems as follows

Definition 1 (see [7]) A test-cost-independent decision sys-tem (TCI-DS) 119878 is the 6-tuple

119878 = (119880 119862119863 119881 119868 119888) (1)

where 119880 119862 119863 119881 and 119868 have the same meanings as in adecision system and 119888 119862 rarr R+ cup 0 is the test costfunction Test costs are independent of one another that is119888(119861) = sum

119886isin119861119888(119886) for any 119861 sube 119862

Table 1 An exemplary decision table

Patient Headache Muscle Temperature Snivel Flu1199091

No Yes Normal No No1199092

Yes Yes High Yes Yes1199093

Yes Yes Very high No Yes1199094

No Yes Normal No No1199095

No No High No No1199096

No Yes Very high Yes Yes

Table 2 The test cost vector

119886 Headache Muscle pain Temperature Snivel119888(119886) 50 25 40 20

We usually use a vector 119888 = [119888(1198861) 119888(1198862) 119888(119886

|119862|)] to

represent the cost function An exemplary cost vector isshown in Table 2 In fact cost-sensitive decision systems aremore general than decision systems If all elements in 119888 are0 a TCI-DS coincides with a DS For simplicity free tests arenot considered in this work This consideration is reasonablesince we always need some cost to obtain data

22 The Relative Reduct The relative reduct is a crucial con-cept in rough sets and there are many different definitionssuch as positive region reducts [5] maximum distributionreducts [8] fuzzy reducts [9] and 120573-reduct [10] Thesedefinitions are equivalent if the decision table is consistentFurthermore there are some extended concepts such asdynamic reducts [11] parallel reducts [12 13] andM-relativereducts [14] In some extensions of rough sets that iscovering-based rough sets [15] there are other definitions ofa reduct (see eg [16ndash18])

Most existing reduct problems aim at finding theminimaldescription of the data And the objectives include findingattribute subsets with the minimal size [5 19] the minimalspace [20] or a covering with the minimal number of subsets[16] Since the test cost issue is the focus of this paper we areinterested in reducts with the minimal test cost This type ofreducts is defined as follows

Definition 2 (see [1]) Let 119878 be a TCI-DS and Red(119878) the setof all reducts of 119878 Any 119877 isin Red(119878) where 119888(119877) = min119888(1198771015840) |1198771015840

isin Red(119878) is called a minimal test cost reduct

The set of all minimal test cost reducts is denoted byMTR(119878) And the problem of constructing MTR(119878) is calledthe minimal test cost reduct (MTR) problem As indicated in[1] the time complexity of computing MTR(119878) is the same asRed(119878)

23 The Minimal Test Cost Reduct Problem Attribute reduc-tion is a key issue of the rough sets research The classical[5] covering-based [16 21] decision-theoretical [3] variable-precision [10] dominance-based [22] and neighborhood[23] rough sets models address the reduction problem fromdifferent perspectives A number of definitions of relative

Mathematical Problems in Engineering 3

reducts exist [5 19 23 24] for different rough sets modelsThis paper employs the definition based on the positiveregion

24 The Genetic Algorithm In the computer science field ofartificial intelligence a genetic algorithm (GA) is a searchheuristic that mimics the process of natural evolution [25]This heuristic (also sometimes called a metaheuristic) isroutinely used to generate useful solutions to optimizationand search problems Genetic algorithms belong to thelarger class of evolutionary algorithms (EA) which generatesolutions to optimization problems using techniques inspiredby natural evolution such as inheritancemutation selectionand crossover [25] In [26] the genetic algorithm is employedto evolve the cost-sensitive decision trees Recently thegenetic algorithm has been employed to tackle the minimaltest cost reduction problem [6] and attribute reduction withtest cost constraint [27]

25 The Ant Colony Optimization Swarm intelligence is arelatively new approach to problem solving that takes inspi-ration from the social behaviors of insects and other animals[28] The ant colony optimization (ACO) algorithm is aprobabilistic technique for solving computational problemswhich can be reduced to finding good paths through graphsACO algorithms are state-of-the-art for the sequential order-ing problem [29] the vehicle routing problem with timewindow constraints [30] the quadratic assignment problem[31] the arc-weighted l-cardinality tree problem [32] and theshortest common supersequence problem [33] In rough setsthe classical attribute reduction problem has been tackled bythe ACO [34 35]

26 Evaluation Measures For evaluating the experimentresults we adopt evaluation measures proposed in [1] Thenew algorithm is compared with the information gain-basedheuristic algorithm on four UCI datasets [36]

We need a measure to evaluate the quality of one partic-ular reduction Since an algorithm can run on many datasetsor one dataset with different test cost settings we adopt threemetrics from a statistical viewpointThey are finding optimalfactor (FOF) maximal exceeding factor (MEF) and averageexceeding factor (AEF) [1]

261 Finding Optimal Factor Let the number of experimentsbe 119870 and the number of successful searches of an optimalreduct 119896 The finding optimal factor is defined as

op =119896

119870 (2)

262 Exceeding Factor For a dataset with a particular testcost setting let 1198771015840 be an optimal reductThe exceeding factorof a reduct 119877 is

ef (119877) =119888lowast

(119877) minus 119888lowast

(1198771015840

)

119888lowast (1198771015840) (3)

The exceeding factor provides a quantitative metric to eval-uate the performance of a reduct It indicates the badnessof a reduct when it is not optimal Naturally if 119877 is anoptimal reduct the exceeding factor is 0 To demonstrate theperformance of an algorithm statistical metrics are neededLet the number of experiments be 119870 In the 119894th experiment(1 le 119894 le 119870) the reduct computed by the algorithm is denotedby 119877119894 The maximal exceeding factor is defined as

max1le119894le119870

ef (119877119894) (4)

This shows the worst case of the algorithm given somedataset Although it relates to the performance of one partic-ular reduct it should be viewed as a statistical rather than anindividual metric The average exceeding factor is defined as

119870

sum119894=1

ef (119877119894)

119870 (5)

Since it is averaged on119870 different test-cost-sensitive decisionsystems it shows the overall performance of the algorithmsolely from a statistical perspective

3 The Algorithm

In this section one hand we revise the classical ant colonyoptimization On the other hand we present our ant colonyoptimization with different techniques to tackle the minimaltest cost reduct problem

31 The Problem Representation and Algorithm FrameworkIn order to apply ant colony optimization to tackle theminimal test cost reduction problem we adopt the followingmodel

Graph The decision system is represented as a graph

Vertex An attribute is represented as a vertex Eachfeature vertex has information about test cost

EdgeThere is an edge between any two vertexes Eachedge has information on pheromone density

Adjacent Matrix The values of the matrix representpheromone density of each edge

Because our objective is attribute reduction not classi-fication which produces rule sets we do not adopt the treestructure such as the ant colony decision tree In this sectionwe employ the first simplified ant colony optimizationmdashASWe represent the general algorithm framework as follows

Stage 1 (addition stage) Compute the core of the dataset 119870batches of ants are created with each batch containing 119896 antstherefore giving a total of 119870 times 119896 ants Each ant takes the coreas the starting position From the initial positions each ant

4 Mathematical Problems in Engineering

traverses edges probabilistically until the stopping criterionis satisfied

Stage 2 (deletion stage) Delete redundant attributes andupdate the pheromone of the traveled path

Stage 3 (filtration stage)Gather the attribute subsets obtainedby ants and compute their test cost Choose the attributesubset with minimal test cost and output it as the result

Throughout the paper the stopping criterion is thepositive region condition The selecting probability dependson the test cost of each adjacent attribute vertex and thepheromone of each adjacent edgeTheprobabilistic transitionrule is

119901119896

119894119895=

[120591119895]120572

[120578119894119895]120573

sum119897isinN119896119894

[120591119897]120572

[120578119894119897]120573

119895 isin N119896

119894 (6)

where 119896 is the number of the ant 120591119895

= 1000119905119888(119886119895) as the

heuristic information 120578119894119895means the pheromone density of

the edge (119894 119895) 120572 is the exponent of 120591119895 120573 is the exponent of

120578119894119895 andN119896

119894is the set of all unvisited adjacent vertexes of the

attribute 119894The difference between the centralized deletion strategy

and the distributed strategy is the time to delete attributesThe deletion of redundant attributes follows the finish of thejourney of all ants in the case of the centralized deletionstrategy When using the distributed deletion strategy thealgorithm deletes redundant attributes after each ant travelsa path and then it adjusts the path In this situation thepheromone of edges adjoining to redundant attribute ver-texes will not be updated The adjusting method is illustratedin Figure 2 Monte Carlo methods are a broad class of com-putational algorithms that rely on repeated random samplingto obtain numerical results that is by running simulationsmany times over in order to calculate those same probabilitiesheuristically just like actually playing and recording yourresults in a real casino situation hence the name We employMonte Carlo method to simulate the process that an artificialant selects the next attribute vertex probabilistically

32 The ACO with Distributed Deletion We represent thesubstantial algorithm of the ant colony optimization withdistributed deletion strategy as Algorithm 1 The algorithmwith centralized deletion strategy is similar

We explain the algorithm in details as follows Lines 1through 8 correspond to Stage 1 in the algorithm frameworkIn lines 2 through 4 119870 batches of ants are created with eachbatch containing 119896 ants therefore giving a total of119870times119896 antsEach ant takes the core as indicated by line 5 Lines 9 through21 represent Stage 2 Select vertexes until all ants in the batchmeet the positive region condition Whenever an ant stopsit removes the redundant attributes and adjusts the traveledpath In line 18 the ants release pheromone to the adjustedpaths After all ants finish their journey select the best reductas lines 22 through 27 have shown Obviously these linesrelate to Stage 3 If we adopt the centralized deletion strategythe algorithm deletes redundant attributes after all ants finishtheir journey

33 A Running Example The TCI-DS is given by Tables 1and 2 We illustrate the process of an ant in Figure 1 whereattributes Headache Muscle Temperature and Snivelare represented by 0 1 2 and 3 respectively

After an ant travels a path the redundant attributes areremoved After deletion the algorithm reconstructs the pathThis adjusting process is represented in Figure 2

Each ant is placed on an attribute randomly For simplic-ity assume 119896 = 1 namely there is one artificial ant in onebatch Consider Core(119878) = 0 hence any ant initially takes novertex

Stage 1 (addition stage)

Stage 11 (artificial ants start) Assume ant[0] starts from theattribute[0]

Stage 12 (artificial ants select attributes)

Iteration 1 Ant[0] is at the attribute Headache Ant[0]has three attributes to select and they are Muscle painTemperature and Snivel

According to (6) the denominator of selecting probabilityis (100025)2 sdot 12 + (100040)

2

sdot 12

+ (100020)2

sdot 12

= 1600 +

625 + 2500 = 4725We compute the probability of three selections

11990101

= 16004725 = 0339

11990102

=625

4725= 0132

11990103

=2500

4725= 0529

(7)

Because 11990103

gt 11990101

gt 11990102 the ant will select this as next

feature with high probability That is to say the ant does notselect the attribute necessarily In this case we assume the antchoose the feature Snivel At the same time the ant will addthe selected attribute to the visited attribute set We find theselected attribute set the ant has visited does not satisfy thepositive region condition so it will continue to select nextattribute

Iteration 2 Ant[0] is at the attribute Snivel currentlyThe candidate features are Muscle pain and

Temperature According to the probability function thedenominator of selecting probability is (100025)

2

sdot 12

+

(100040)2

sdot 12

= 1600 + 625 = 2225Consider

11990131

=1600

2225= 0719

11990132

=625

2225= 0281

(8)

More over 11990131

gt 11990132 then the artificial ant[0] almost

selects the attribute Muscle pain

Mathematical Problems in Engineering 5

Input 119878 = (119880 119862119863 119881 119868 119888)

Output119872 a reduct of 119878Method acomtr(1)119872 = 0the best reduct obtained by ants(2) for (119894 = 0 119894 lt 119870 119894 ++) do(3) for (119895 = 0 119895 lt 119896 119895 + +) do(4) create ant[119894 times 119896 + 119895](5) 119861

119894times119896+119895= core(119878)takes the set of core attributes

(6) end forEach ant selects attributes

(7) for (each119898 isin [119894 times 119896 (119894 + 1) times 119896] where ant[119898] has not stopped) do(8) MonteCarlo(119905119888 119901ℎ)select the next vertex(9) if (119875119874119878

119861119898(119863) == 119875119874119878

119862(119863)) then

(10) ant[119898] stops Delete redundant vertexes(11) 119898119905119888 = 10000minimal test cost(12) for (119895 = |119861

119894| minus 1 119895 gt= |Core(119878)| 119895--) do

(13) if (119875119874119878(119861119894minus119886119895)

(119863) == 119875119874119878119861119894(119863)) then

(14) 119861119894= 119861119894minus 119886119895

(15) end if(16) end for(17) Adjust the traveled path(18) Update the pheromone of edges ant[119898] traveled(19) end if(20) end for(21) end for

Choose the minimal test cost reduction from the last 119899 ants(22) for (119894 = 119870 times 119896 minus 119899 119894 lt= 119870 times 119896 minus 1 119894 ++) do(23) if 119905119888(119861

119894) le 119898119905119888 then

(24) 119898119905119888 = 119905119888(119861119894)

(25) 119872 = 119861119894

(26) end if(27) end for(28) return119872

Algorithm 1 ACO with distributed deletion to MTR problem

Start

0 1

2 3

(a)

p01 = 0339

tc = 202 3

p02 = 0132

tc = 250 1

p03 = 0529

tc = 40

(b)

tc = 250 1

tc = 402 3

p32 = 0281

p31 = 0719

(c)

0 1

2 3

(d)

Figure 1 (a) Start from the attribute Headache (b) Select second attribute (c) Select third attribute (d) Stop

6 Mathematical Problems in Engineering

10

32

(a)

10

32

(b)

2 3

0

(c)

Figure 2 (a) The path which an ant traveled (b) Delete theredundant attribute (c) Reconstruct the path

Now the attribute set obtained by ant[0] has satisfiedthe positive region constraint The ant[0] stops working andupdates pheromone information

Stage 13 (pheromone updating) After an artificial ant stopsworking it will update the pheromone density of edgestraveled by itself In our algorithm when one path is crossedby an ant the pheromone diffusion will increase by one Ofcourse the rule of pheromone updating may be designed inother methods We adopt the way in this paper

In this example the sequence attribute selection of ant[0]is Headache Snivel and Muscle pain The pheromoneof edge (0 3) and edge (3 1) increases by one

Each artificial ant runs in above three stages

Stage 2 (deletion stage) After all ants have stopped workingthe algorithm will delete redundant attributes from theselected attribute sets of last few ants using positive regionIf an attribute does not contribute to the positive region weremove it

Stage 3 (filtration stage) After deletion we choose the reductobtained from last 119899 ants with minimal test cost as the result

In this example we adopt the centralized deletion strat-egy Figure 2 illustrates the distributed deletion strategy in aniteration In the figure the ant travels the paths 0 1 2 and 3Suppose the attribute 1 is a redundant attribute we remove itfrom the attribute subset and adjust the path The adjustingprocess is shown in Figure 2(c)

4 Experiments

In this section we try to answer the following questions byexperimentation

(1) How does the number of ants in the ant colonyinfluence the result

(2) How does the strategy of deleting redundantattributes influence the quality of the result

(3) Does our ant colony optimization algorithm outper-form the existing one

The UCI datasets we used are Zoo Voting Tic-tac-toeandMushroom Since these datasets have no test cost settingsfor statistical purposes we apply three common distributionsto generate random test cost The three distributions areuniform distribution normal distribution and Pareto distri-bution In this paper the test cost is a random integer ranging

from 1 to 100The exponent 120573 in (6) is set to be 2 since undermany circumstances this value is a good setup [28]

We do not design the parameter learning mechanismThe employed competition approach has the selection mech-anism of the parameter Different applications use the samerange and users do not specify the value of the parameterThestrategy is more straightforward than many other parametertuning strategies However we may design other strategiesto save time Finding optimal factor minimal exceedingfactor and average exceeding factor are employed to measurethe effectiveness of the algorithms We run each algorithmon datasets with 1000 times The results have statisticalcharacteristics

41 The Influence of Ant Counts on Experiment Results Inorder to find the relationship between the number of ants andthe quality of the result we conduct this experiment We runour algorithmwith 100 ants 150 ants 200 ants 400 ants Inthis section the number of experiments is set to be 100 Theexponent 120572 in (6) is set to be 2 Results are shown in Tables 3and 4

42 Comparison with Existing Heuristic Algorithms Accord-ing to the result of the above experiments we find 100 is theoptimal setting of the number of the ants We conduct anempirical study to examine the effect of our algorithm Toimprove the performance we use the competition approach[1] to enhance our algorithm The exponent 120572 in (6) is set asintegers ranging from 1 to 4 We illustrate the results amongthe information gain-based algorithm GA-1 GA-2 ACOwith centralized deletion and the ACO with distributed dele-tion in Table 5 For clarity the results on dataset Mushroomare shown in Figure 5

Our algorithm is tested on the UCI datasets 1000 timesrespectively The new algorithm with different techniques iscompared with the information gain-based algorithm [1] andthe genetic algorithm [6] Experimental results are listed inFigures 3 4 and 5 and Tables 3 4 and 5

43 Experimental Results Now we can answer the questionsproposed at the beginning of this section

(1) Experiment results indicate that the effect of thealgorithm becomes worse with the increment of theant counts We find that 100 is the optimal settingof the number of ants So we run our algorithmwith 100 ants to compare with the existing heuristicalgorithms

(2) From Figures 3 and 4 we infer that the distributeddeletion outperforms the centralized one on themedium-sized dataset Mushroom without the com-petition approach When we use the competitionapproach the centralized deletion strategy and thedistributed strategy produce the similar quality ofresults shown in Table 5 and Figure 5

Mathematical Problems in Engineering 7

Table 3 The finding optimal factor of ACO with different numbers of ants without competition method using centralized deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 066 070 073 068 062 069 061Iris Normal 053 045 044 043 037 040 043

Pareto 095 093 094 096 087 093 094Uniform 075 067 070 068 070 068 063

Zoo Normal 066 048 038 029 030 033 034Pareto 099 100 096 094 097 095 084Uniform 090 094 084 080 086 081 083

Voting Normal 100 094 091 095 095 090 093Pareto 098 099 098 099 099 099 094Uniform 096 091 089 078 068 069 068

Tic-tac-toe Normal 100 096 096 096 089 086 083Pareto 100 100 100 099 099 099 100Uniform 075 072 069 064 059 060 049

Mushroom Normal 061 055 052 051 039 041 031Pareto 097 097 096 095 095 096 094

Table 4 The finding optimal factor of ACO with different numbers of ants without competition method using distributed deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 072 070 068 067 072 063 072Iris Normal 042 040 037 040 048 041 042

Pareto 090 095 094 093 092 091 097Uniform 072 069 069 066 063 064 055

Zoo Normal 059 050 039 033 033 036 027Pareto 099 095 093 093 097 091 094Uniform 093 084 087 086 081 079 083

Voting Normal 100 100 097 098 096 096 097Pareto 100 098 099 097 097 097 098Uniform 090 078 074 077 071 074 066

Tic-tac-toe Normal 099 099 092 090 090 091 077Pareto 100 100 100 099 100 100 099Uniform 080 071 070 062 053 059 058

Mushroom Normal 062 054 053 034 033 027 029Pareto 099 099 096 095 095 094 094

Table 5 Results of 120582-information gain algorithm and the test cost based ant colony optimization with centralized deletion and distributeddeletion

Dataset Distribution Information gain GA-1 GA-2 ACO centralized ACO distributedFOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF

Uniform 0915 0193 0004 0966 0118 0001 0947 0224 0002 0973 0177 0001 0975 0216 0001Zoo Normal 0833 0028 0001 0970 0109 0001 0924 0101 0002 0849 0040 0002 0844 0048 0002

Pareto 0999 0167 0000 0998 0167 0000 0993 0200 0001 1000 0000 0000 1000 0000 0000Uniform 0998 0053 0000 1000 0000 0000 0999 0009 0000 0997 0065 0000 1000 0000 0000

Voting Normal 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 1000 0000 0000 1000 0000 0000 0995 0067 0000 1000 0000 0000 1000 0000 0000Uniform 0877 0055 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000

Tic-tac-toe Normal 0408 0018 0003 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 0986 0125 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Uniform 0762 0523 0023 0448 1000 0067 0766 0543 0019 0948 0085 0002 0946 0144 0002

Mushroom Normal 0176 0306 0010 0321 0401 0063 0551 0340 0022 0958 0020 0000 0966 0020 0000Pareto 0733 0500 0064 0935 0400 0015 0949 0500 0012 1000 0000 0000 1000 0000 0000

8 Mathematical Problems in Engineering

1

09

08

07

06

05

04100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

08

06

04

02100 200 300 400

Number of antsFi

ndin

g op

timal

fact

or (F

OF)

(b)

1

09

08100 200 300 400

Number of ants

IrisZooVoting

Tic-tac-toeMushroom

Find

ing

optim

al fa

ctor

(FO

F)

095

085

(c)

Figure 3 ACO with centralized deletion on different distributions (a) uniform (b) normal (c) pareto

(3) The ant colony optimization provides better perfor-mance than the information gain one and the geneticoneOur algorithmovercomes the shortcoming of thegenetic algorithmwhich performs badly on the largerdataset On medium-sized dataset Mushroom theACO outperforms the genetic one significantly Forexample when the test cost distribution is normal theFOF of the ant colony optimization using centralized

deletion and distributed deletion is 958 966respectively outperforming the information gain oneGA-1 and GA-2 with 176 321 and 521 respec-tively Figure 5 gives a more intuitive understandingof results on the dataset Mushroom One possiblereason is that the ant colony optimization algorithmproduces more diverse solutions than the existingone

Mathematical Problems in Engineering 9

1

09

08

07

06

05100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

09

08

07

06

05

04

100 200 300 400Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

03

02150 250 350

(b)

100 200 300 400Number of ants

IrisZooVoting

Tic-tac-toeMushroom

1

098

096

094

092

09

Find

ing

optim

al fa

ctor

(FO

F)

(c)

Figure 4 ACO with distributed deletion on different distributions (a) uniform (b) normal (c) Pareto

5 Conclusions

In this paper we have pointed out the shortcoming of theexisting heuristic algorithms including the information gain-based one and the genetic one We have designed a newalgorithm based on ant colony optimization to tackle theMTR problem In deletion stage our algorithm containscentralized deletion strategy and distributed deletion strat-egy We have tested the new one with three representativetest cost distributions on four UCI datasets Experimentalresults show that the new algorithm outperforms the existing

ones significantly especially 1lsquoon medium-sized dataset suchas Mushroom Moreover when distribution is normal thedistributed deletion strategy obtains better results than thecentralized one on Mushroom dataset

Acknowledgments

This work is in part supported by the National ScienceFoundation of China under Grant no 61170128 the NaturalScience Foundation of Fujian Province China under Grantno 2012J01294 State Key Laboratory of Management and

10 Mathematical Problems in Engineering

Uniform Normal Pareto0

02

04

06

08

1

Test cost distribution

Find

ing

optim

al fa

ctor

(FO

F)12

(a)

1

09

08

07

04

03

02

01

0Uniform Normal Pareto

Test cost distribution

Max

imal

exce

edin

g fa

ctor

05

06

(b)

01

009

008

007

006

005

004

003

002

001

0Uniform Normal Pareto

Test cost distribution

Information gainGA-1GA-2

ACO centralized deletionACO distributed deletion

Aver

age e

xcee

ding

fact

or

(c)

Figure 5 Result on Mushroom (a) finding optimal factor (b) maximal exceeding factor (c) average exceeding factor

Control for Complex Systems Open Project under Grantno 20110106 and Fujian Province Foundation of HigherEducation under Grant no JK2012028

References

[1] F Min H He Y Qian and W Zhu ldquoTest-cost-sensitiveattribute reductionrdquo Information Sciences vol 181 no 22 pp4928ndash4942 2011

[2] F Min and W Zhu ldquoMinimal cost attribute reduction throughbacktrackingrdquo in Proceedings of the International Conference onDatabase Theory and Application vol 258 of FGIT-DTABSBTpp 100ndash107 CCIS 2011

[3] Y Y Yao and Y Zhao ldquoAttribute reduction in decision-theoreticrough set modelsrdquo Information Sciences vol 178 no 17 pp3356ndash3373 2008

[4] X Jia W Liao Z Tang and L Shang ldquoMinimum cost attributereduction in decision-theoretic rough set modelsrdquo InformationSciences vol 219 pp 151ndash167 2013

Mathematical Problems in Engineering 11

[5] Z Pawlak ldquoRough setsrdquo International Journal of Computer andInformation Sciences vol 11 no 5 pp 341ndash356 1982

[6] G Pan F Min and W Zhu ldquoA genetic algorithm to theminimal test cost reduct problemrdquo in Proceedings of the IEEEInternational Conference on Granular Computing (GrC rsquo11) pp539ndash544 Kaohsiung Taiwan November 2011

[7] F Min and Q Liu ldquoA hierarchical model for test-cost-sensitivedecision systemsrdquo Information Sciences vol 179 no 14 pp2442ndash2452 2009

[8] W X Zhang J S Mi and W Z Wu ldquoKnowledge reductions ininconsistent information systemsrdquo Chinese Journal of Comput-ers vol 26 no 1 pp 12ndash18 2003

[9] Q Liu L Chen J Zhang and F Min ldquoKnowledge reduction ininconsistent decision tablesrdquo in Proceedings of Advanced DataMining and Applications vol 4093 of Lecture Notes in ComputerScience pp 626ndash635 2006

[10] W Ziarko ldquoVariable precision rough set modelrdquo Journal ofComputer and System Sciences vol 46 no 1 pp 39ndash59 1993

[11] J G Bazan and A Skowron ldquoDynamic reducts as a tool forextracting laws from decision tablesrdquo in Proceedings of the8th International Symposium on Methodologies for IntelligentSystems pp 346ndash355 1994

[12] D Deng ldquoParallel reduct and its propertiesrdquo in Proceedings ofthe International Conference on Granular Computing (GRC rsquo09)pp 121ndash125 Nanchang China August 2009

[13] D Deng J Wang and X Li ldquoParallel reducts in a seriesof decision subsystemsrdquo in Proceedings of the InternationalJoint Conference on Computational Sciences and Optimization(CSO rsquo09) pp 377ndash380 Hainan China April 2009

[14] F Min Z Bai M He and Q Liu ldquoThe reduct problemwith specified attributesrdquo in Proceedings of the Rough Setsand Soft Computing in Intelligent Agent and Web TechnologyInternational Workshop at WI-IAT pp 36ndash42 2005

[15] W Zhu and F Wang ldquoOn three types of covering rough setsrdquoIEEE Transactions on Knowledge and Data Engineering vol 19no 8 pp 1131ndash1144 2007

[16] W Zhu and F Y Wang ldquoReduction and axiomization ofcovering generalized rough setsrdquo Information Sciences vol 152pp 217ndash230 2003

[17] C C Aggarwal ldquoOn density based transforms for uncertaindata miningrdquo in Proceedings of the 23rd IEEE InternationalConference on Data Engineering pp 866ndash875 2007

[18] H Zhao F Min and W Zhu ldquoTest-cost-sensitive attributereduction of data with normal distribution measurementerrorsrdquoMathematical Problems in Engineering vol 2013 ArticleID 946070 12 pages 2013

[19] D Slęzak ldquoApproximate entropy reductsrdquo Fundamenta Infor-maticae vol 53 no 3-4 pp 365ndash390 2002

[20] F Min X Du H Qiu and Q Liu ldquoMinimal attribute spacebias for attribute reductionrdquo in Proceedings of the Rough Set andKnowledge Technology pp 379ndash386 2007

[21] W Zhu ldquoTopological approaches to covering rough setsrdquoInformation Sciences vol 177 no 6 pp 1499ndash1508 2007

[22] S Greco B Matarazzo R Slowinski and J StefanowskildquoVariable consistency model of dominance-based rough setsapproachrdquo in Proceedings of the Rough Sets and Current Trendsin Computing vol 2005 of Lecture Notes in Computer Sciencepp 170ndash181 2000

[23] Q Hu D Yu J Liu and CWu ldquoNeighborhood rough set basedheterogeneous feature subset selectionrdquo Information Sciencesvol 178 no 18 pp 3577ndash3594 2008

[24] YQian J LiangW Pedrycz andCDang ldquoPositive approxima-tion an accelerator for attribute reduction in rough set theoryrdquoArtificial Intelligence vol 174 no 9-10 pp 597ndash618 2010

[25] Wikipedia ldquoGenetic algorithmrdquo httpenwikipediaorgwikiGeneticalgorithm

[26] P D Turney ldquoCost-sensitive classification empirical evaluationof a hybrid genetic decision tree induction algorithmrdquo Journalof Artificial Intelligence Research vol 2 pp 369ndash409 1995

[27] J Liu F Min S Liao and W Zhu ldquoA genetic algorithm toattribute reduction with test cost constraintrdquo in Proceedings ofthe 6th IEEE International Conference on Computer Sciences andConvergence Information Technology (ICCIT rsquo11) pp 751ndash7542011

[28] M Dorigo and T Stutzle Eds Ant Colony Optimization MITPress Boston Mass USA 2004

[29] L Gambardella and M Dorigo ldquoHas-sop hybrid ant systemfor the sequential ordering problemrdquo Tech Rep (IDSIA-11-97)1997

[30] L M Gambardella E Taillard G Agazzi I D Corne MDorigo and F Glover ldquoMacs-vrptw a multiple ant colonysystem for vehicle routing problemswith timewindowsrdquo inNewIdeas in Optimization pp 63ndash76 McGraw-Hill New York NYUSA 1999

[31] V Maniezzo and A Colorni ldquoThe ant system applied to thequadratic assignment problemrdquo IEEE Transactions on Knowl-edge and Data Engineering vol 11 no 5 pp 769ndash778 1999

[32] C Blum M Blesa and U P de Catalunya Departament deLlenguatges i Sistemes InformaticsMetaheuristics for the edge-weighted k-cardinality tree problem 2003

[33] R Michel and M Middendorf ldquoAn aco algorithm for theshortest common supersequence problemrdquo in New Ideas inOptimization pp 51ndash62 McGraw-Hill London UK 1999

[34] Y ChenDMiao andRWang ldquoA rough set approach to featureselection based on ant colony optimizationrdquoPatternRecognitionLetters vol 31 no 3 pp 226ndash233 2010

[35] L Ke Z Feng and Z Ren ldquoAn efficient ant colony optimizationapproach to attribute reduction in rough set theoryrdquo PatternRecognition Letters vol 29 no 9 pp 1351ndash1357 2008

[36] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998 httpwwwicsuciedusimmlearnmlrepositoryhtml

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

2 Mathematical Problems in Engineering

vertex according to test costs of each adjacent vertex andpheromone of each adjacent edge If the attribute vertexesan ant has traveled satisfy the positive region conditionthe ant stops otherwise continues to add attributes Secondin the deletion stage redundant attributes are deleted fromthe obtained attribute subsets and the pheromone of eachedge is updated Two strategies including centralized deletionand distributed deletion are designed for this stage Thirdthe ant with the least test cost is selected and the attributesubset corresponding to its path is output as the result in thefiltration stage

To evaluate the performance of algorithms we adoptthree measures namely finding optimal factor (FOF) max-imal exceeding factor (MEF) and average exceeding factor(AEF) [1] Experimental results indicate that our algorithmoutperforms the information gain-based algorithm in mostdatasets with different test cost distributions It can obtainbetter results than the genetic algorithm except some smalldatasets One possible reason is that the ant colony opti-mization algorithm produces more diverse solutions than theexisting ones The distributed deletion strategy is superior tothe centralized one on medium-sized dataset Mushroom

The rest of the paper is organized as follows Section 2reviews the basic concepts in rough sets and decision systemSection 3 proposes the ant colony optimization to tackle theminimal test cost reduction problem In Section 4 we presentour experiment schemes and show the results We also give asimple analysis of our experimental results Finally Section 5presents the conclusion

2 Preliminaries

This section reviews basic knowledge test-cost-independentdecision systems relative reduct minimal test cost reductproblem genetic algorithm and ant colony optimization

21 Test-Cost-Independent Decision Systems Most super-vised learning approaches are based on decision systems Adecision system is often denoted by 119878 = (119880 119862119863 119881

119886| 119886 isin

119862cup119863 119868119886| 119886 isin 119862cup119863) where119880 is a finite set of objects called

the universe 119862 is the set of conditional attributes also calledthe set of tests 119863 is the set of decision attributes also calledthe decision 119881

119886is the set of values for each 119886 isin 119862 cup 119863 and

119868119886 119880 rarr 119881

119886is an information function for each 119886 isin 119862 cup 119863

We often denote 119881119886| 119886 isin 119862 cup 119863 and 119868

119886| 119886 isin 119862 cup 119863 by 119881

and 119868 respectively A decision system is often represented bya decision table as shown in Table 1

We consider the simplest case though most widely usedtype of cost-sensitive decision systems as follows

Definition 1 (see [7]) A test-cost-independent decision sys-tem (TCI-DS) 119878 is the 6-tuple

119878 = (119880 119862119863 119881 119868 119888) (1)

where 119880 119862 119863 119881 and 119868 have the same meanings as in adecision system and 119888 119862 rarr R+ cup 0 is the test costfunction Test costs are independent of one another that is119888(119861) = sum

119886isin119861119888(119886) for any 119861 sube 119862

Table 1 An exemplary decision table

Patient Headache Muscle Temperature Snivel Flu1199091

No Yes Normal No No1199092

Yes Yes High Yes Yes1199093

Yes Yes Very high No Yes1199094

No Yes Normal No No1199095

No No High No No1199096

No Yes Very high Yes Yes

Table 2 The test cost vector

119886 Headache Muscle pain Temperature Snivel119888(119886) 50 25 40 20

We usually use a vector 119888 = [119888(1198861) 119888(1198862) 119888(119886

|119862|)] to

represent the cost function An exemplary cost vector isshown in Table 2 In fact cost-sensitive decision systems aremore general than decision systems If all elements in 119888 are0 a TCI-DS coincides with a DS For simplicity free tests arenot considered in this work This consideration is reasonablesince we always need some cost to obtain data

22 The Relative Reduct The relative reduct is a crucial con-cept in rough sets and there are many different definitionssuch as positive region reducts [5] maximum distributionreducts [8] fuzzy reducts [9] and 120573-reduct [10] Thesedefinitions are equivalent if the decision table is consistentFurthermore there are some extended concepts such asdynamic reducts [11] parallel reducts [12 13] andM-relativereducts [14] In some extensions of rough sets that iscovering-based rough sets [15] there are other definitions ofa reduct (see eg [16ndash18])

Most existing reduct problems aim at finding theminimaldescription of the data And the objectives include findingattribute subsets with the minimal size [5 19] the minimalspace [20] or a covering with the minimal number of subsets[16] Since the test cost issue is the focus of this paper we areinterested in reducts with the minimal test cost This type ofreducts is defined as follows

Definition 2 (see [1]) Let 119878 be a TCI-DS and Red(119878) the setof all reducts of 119878 Any 119877 isin Red(119878) where 119888(119877) = min119888(1198771015840) |1198771015840

isin Red(119878) is called a minimal test cost reduct

The set of all minimal test cost reducts is denoted byMTR(119878) And the problem of constructing MTR(119878) is calledthe minimal test cost reduct (MTR) problem As indicated in[1] the time complexity of computing MTR(119878) is the same asRed(119878)

23 The Minimal Test Cost Reduct Problem Attribute reduc-tion is a key issue of the rough sets research The classical[5] covering-based [16 21] decision-theoretical [3] variable-precision [10] dominance-based [22] and neighborhood[23] rough sets models address the reduction problem fromdifferent perspectives A number of definitions of relative

Mathematical Problems in Engineering 3

reducts exist [5 19 23 24] for different rough sets modelsThis paper employs the definition based on the positiveregion

24 The Genetic Algorithm In the computer science field ofartificial intelligence a genetic algorithm (GA) is a searchheuristic that mimics the process of natural evolution [25]This heuristic (also sometimes called a metaheuristic) isroutinely used to generate useful solutions to optimizationand search problems Genetic algorithms belong to thelarger class of evolutionary algorithms (EA) which generatesolutions to optimization problems using techniques inspiredby natural evolution such as inheritancemutation selectionand crossover [25] In [26] the genetic algorithm is employedto evolve the cost-sensitive decision trees Recently thegenetic algorithm has been employed to tackle the minimaltest cost reduction problem [6] and attribute reduction withtest cost constraint [27]

25 The Ant Colony Optimization Swarm intelligence is arelatively new approach to problem solving that takes inspi-ration from the social behaviors of insects and other animals[28] The ant colony optimization (ACO) algorithm is aprobabilistic technique for solving computational problemswhich can be reduced to finding good paths through graphsACO algorithms are state-of-the-art for the sequential order-ing problem [29] the vehicle routing problem with timewindow constraints [30] the quadratic assignment problem[31] the arc-weighted l-cardinality tree problem [32] and theshortest common supersequence problem [33] In rough setsthe classical attribute reduction problem has been tackled bythe ACO [34 35]

26 Evaluation Measures For evaluating the experimentresults we adopt evaluation measures proposed in [1] Thenew algorithm is compared with the information gain-basedheuristic algorithm on four UCI datasets [36]

We need a measure to evaluate the quality of one partic-ular reduction Since an algorithm can run on many datasetsor one dataset with different test cost settings we adopt threemetrics from a statistical viewpointThey are finding optimalfactor (FOF) maximal exceeding factor (MEF) and averageexceeding factor (AEF) [1]

261 Finding Optimal Factor Let the number of experimentsbe 119870 and the number of successful searches of an optimalreduct 119896 The finding optimal factor is defined as

op =119896

119870 (2)

262 Exceeding Factor For a dataset with a particular testcost setting let 1198771015840 be an optimal reductThe exceeding factorof a reduct 119877 is

ef (119877) =119888lowast

(119877) minus 119888lowast

(1198771015840

)

119888lowast (1198771015840) (3)

The exceeding factor provides a quantitative metric to eval-uate the performance of a reduct It indicates the badnessof a reduct when it is not optimal Naturally if 119877 is anoptimal reduct the exceeding factor is 0 To demonstrate theperformance of an algorithm statistical metrics are neededLet the number of experiments be 119870 In the 119894th experiment(1 le 119894 le 119870) the reduct computed by the algorithm is denotedby 119877119894 The maximal exceeding factor is defined as

max1le119894le119870

ef (119877119894) (4)

This shows the worst case of the algorithm given somedataset Although it relates to the performance of one partic-ular reduct it should be viewed as a statistical rather than anindividual metric The average exceeding factor is defined as

119870

sum119894=1

ef (119877119894)

119870 (5)

Since it is averaged on119870 different test-cost-sensitive decisionsystems it shows the overall performance of the algorithmsolely from a statistical perspective

3 The Algorithm

In this section one hand we revise the classical ant colonyoptimization On the other hand we present our ant colonyoptimization with different techniques to tackle the minimaltest cost reduct problem

31 The Problem Representation and Algorithm FrameworkIn order to apply ant colony optimization to tackle theminimal test cost reduction problem we adopt the followingmodel

Graph The decision system is represented as a graph

Vertex An attribute is represented as a vertex Eachfeature vertex has information about test cost

EdgeThere is an edge between any two vertexes Eachedge has information on pheromone density

Adjacent Matrix The values of the matrix representpheromone density of each edge

Because our objective is attribute reduction not classi-fication which produces rule sets we do not adopt the treestructure such as the ant colony decision tree In this sectionwe employ the first simplified ant colony optimizationmdashASWe represent the general algorithm framework as follows

Stage 1 (addition stage) Compute the core of the dataset 119870batches of ants are created with each batch containing 119896 antstherefore giving a total of 119870 times 119896 ants Each ant takes the coreas the starting position From the initial positions each ant

4 Mathematical Problems in Engineering

traverses edges probabilistically until the stopping criterionis satisfied

Stage 2 (deletion stage) Delete redundant attributes andupdate the pheromone of the traveled path

Stage 3 (filtration stage)Gather the attribute subsets obtainedby ants and compute their test cost Choose the attributesubset with minimal test cost and output it as the result

Throughout the paper the stopping criterion is thepositive region condition The selecting probability dependson the test cost of each adjacent attribute vertex and thepheromone of each adjacent edgeTheprobabilistic transitionrule is

119901119896

119894119895=

[120591119895]120572

[120578119894119895]120573

sum119897isinN119896119894

[120591119897]120572

[120578119894119897]120573

119895 isin N119896

119894 (6)

where 119896 is the number of the ant 120591119895

= 1000119905119888(119886119895) as the

heuristic information 120578119894119895means the pheromone density of

the edge (119894 119895) 120572 is the exponent of 120591119895 120573 is the exponent of

120578119894119895 andN119896

119894is the set of all unvisited adjacent vertexes of the

attribute 119894The difference between the centralized deletion strategy

and the distributed strategy is the time to delete attributesThe deletion of redundant attributes follows the finish of thejourney of all ants in the case of the centralized deletionstrategy When using the distributed deletion strategy thealgorithm deletes redundant attributes after each ant travelsa path and then it adjusts the path In this situation thepheromone of edges adjoining to redundant attribute ver-texes will not be updated The adjusting method is illustratedin Figure 2 Monte Carlo methods are a broad class of com-putational algorithms that rely on repeated random samplingto obtain numerical results that is by running simulationsmany times over in order to calculate those same probabilitiesheuristically just like actually playing and recording yourresults in a real casino situation hence the name We employMonte Carlo method to simulate the process that an artificialant selects the next attribute vertex probabilistically

32 The ACO with Distributed Deletion We represent thesubstantial algorithm of the ant colony optimization withdistributed deletion strategy as Algorithm 1 The algorithmwith centralized deletion strategy is similar

We explain the algorithm in details as follows Lines 1through 8 correspond to Stage 1 in the algorithm frameworkIn lines 2 through 4 119870 batches of ants are created with eachbatch containing 119896 ants therefore giving a total of119870times119896 antsEach ant takes the core as indicated by line 5 Lines 9 through21 represent Stage 2 Select vertexes until all ants in the batchmeet the positive region condition Whenever an ant stopsit removes the redundant attributes and adjusts the traveledpath In line 18 the ants release pheromone to the adjustedpaths After all ants finish their journey select the best reductas lines 22 through 27 have shown Obviously these linesrelate to Stage 3 If we adopt the centralized deletion strategythe algorithm deletes redundant attributes after all ants finishtheir journey

33 A Running Example The TCI-DS is given by Tables 1and 2 We illustrate the process of an ant in Figure 1 whereattributes Headache Muscle Temperature and Snivelare represented by 0 1 2 and 3 respectively

After an ant travels a path the redundant attributes areremoved After deletion the algorithm reconstructs the pathThis adjusting process is represented in Figure 2

Each ant is placed on an attribute randomly For simplic-ity assume 119896 = 1 namely there is one artificial ant in onebatch Consider Core(119878) = 0 hence any ant initially takes novertex

Stage 1 (addition stage)

Stage 11 (artificial ants start) Assume ant[0] starts from theattribute[0]

Stage 12 (artificial ants select attributes)

Iteration 1 Ant[0] is at the attribute Headache Ant[0]has three attributes to select and they are Muscle painTemperature and Snivel

According to (6) the denominator of selecting probabilityis (100025)2 sdot 12 + (100040)

2

sdot 12

+ (100020)2

sdot 12

= 1600 +

625 + 2500 = 4725We compute the probability of three selections

11990101

= 16004725 = 0339

11990102

=625

4725= 0132

11990103

=2500

4725= 0529

(7)

Because 11990103

gt 11990101

gt 11990102 the ant will select this as next

feature with high probability That is to say the ant does notselect the attribute necessarily In this case we assume the antchoose the feature Snivel At the same time the ant will addthe selected attribute to the visited attribute set We find theselected attribute set the ant has visited does not satisfy thepositive region condition so it will continue to select nextattribute

Iteration 2 Ant[0] is at the attribute Snivel currentlyThe candidate features are Muscle pain and

Temperature According to the probability function thedenominator of selecting probability is (100025)

2

sdot 12

+

(100040)2

sdot 12

= 1600 + 625 = 2225Consider

11990131

=1600

2225= 0719

11990132

=625

2225= 0281

(8)

More over 11990131

gt 11990132 then the artificial ant[0] almost

selects the attribute Muscle pain

Mathematical Problems in Engineering 5

Input 119878 = (119880 119862119863 119881 119868 119888)

Output119872 a reduct of 119878Method acomtr(1)119872 = 0the best reduct obtained by ants(2) for (119894 = 0 119894 lt 119870 119894 ++) do(3) for (119895 = 0 119895 lt 119896 119895 + +) do(4) create ant[119894 times 119896 + 119895](5) 119861

119894times119896+119895= core(119878)takes the set of core attributes

(6) end forEach ant selects attributes

(7) for (each119898 isin [119894 times 119896 (119894 + 1) times 119896] where ant[119898] has not stopped) do(8) MonteCarlo(119905119888 119901ℎ)select the next vertex(9) if (119875119874119878

119861119898(119863) == 119875119874119878

119862(119863)) then

(10) ant[119898] stops Delete redundant vertexes(11) 119898119905119888 = 10000minimal test cost(12) for (119895 = |119861

119894| minus 1 119895 gt= |Core(119878)| 119895--) do

(13) if (119875119874119878(119861119894minus119886119895)

(119863) == 119875119874119878119861119894(119863)) then

(14) 119861119894= 119861119894minus 119886119895

(15) end if(16) end for(17) Adjust the traveled path(18) Update the pheromone of edges ant[119898] traveled(19) end if(20) end for(21) end for

Choose the minimal test cost reduction from the last 119899 ants(22) for (119894 = 119870 times 119896 minus 119899 119894 lt= 119870 times 119896 minus 1 119894 ++) do(23) if 119905119888(119861

119894) le 119898119905119888 then

(24) 119898119905119888 = 119905119888(119861119894)

(25) 119872 = 119861119894

(26) end if(27) end for(28) return119872

Algorithm 1 ACO with distributed deletion to MTR problem

Start

0 1

2 3

(a)

p01 = 0339

tc = 202 3

p02 = 0132

tc = 250 1

p03 = 0529

tc = 40

(b)

tc = 250 1

tc = 402 3

p32 = 0281

p31 = 0719

(c)

0 1

2 3

(d)

Figure 1 (a) Start from the attribute Headache (b) Select second attribute (c) Select third attribute (d) Stop

6 Mathematical Problems in Engineering

10

32

(a)

10

32

(b)

2 3

0

(c)

Figure 2 (a) The path which an ant traveled (b) Delete theredundant attribute (c) Reconstruct the path

Now the attribute set obtained by ant[0] has satisfiedthe positive region constraint The ant[0] stops working andupdates pheromone information

Stage 13 (pheromone updating) After an artificial ant stopsworking it will update the pheromone density of edgestraveled by itself In our algorithm when one path is crossedby an ant the pheromone diffusion will increase by one Ofcourse the rule of pheromone updating may be designed inother methods We adopt the way in this paper

In this example the sequence attribute selection of ant[0]is Headache Snivel and Muscle pain The pheromoneof edge (0 3) and edge (3 1) increases by one

Each artificial ant runs in above three stages

Stage 2 (deletion stage) After all ants have stopped workingthe algorithm will delete redundant attributes from theselected attribute sets of last few ants using positive regionIf an attribute does not contribute to the positive region weremove it

Stage 3 (filtration stage) After deletion we choose the reductobtained from last 119899 ants with minimal test cost as the result

In this example we adopt the centralized deletion strat-egy Figure 2 illustrates the distributed deletion strategy in aniteration In the figure the ant travels the paths 0 1 2 and 3Suppose the attribute 1 is a redundant attribute we remove itfrom the attribute subset and adjust the path The adjustingprocess is shown in Figure 2(c)

4 Experiments

In this section we try to answer the following questions byexperimentation

(1) How does the number of ants in the ant colonyinfluence the result

(2) How does the strategy of deleting redundantattributes influence the quality of the result

(3) Does our ant colony optimization algorithm outper-form the existing one

The UCI datasets we used are Zoo Voting Tic-tac-toeandMushroom Since these datasets have no test cost settingsfor statistical purposes we apply three common distributionsto generate random test cost The three distributions areuniform distribution normal distribution and Pareto distri-bution In this paper the test cost is a random integer ranging

from 1 to 100The exponent 120573 in (6) is set to be 2 since undermany circumstances this value is a good setup [28]

We do not design the parameter learning mechanismThe employed competition approach has the selection mech-anism of the parameter Different applications use the samerange and users do not specify the value of the parameterThestrategy is more straightforward than many other parametertuning strategies However we may design other strategiesto save time Finding optimal factor minimal exceedingfactor and average exceeding factor are employed to measurethe effectiveness of the algorithms We run each algorithmon datasets with 1000 times The results have statisticalcharacteristics

41 The Influence of Ant Counts on Experiment Results Inorder to find the relationship between the number of ants andthe quality of the result we conduct this experiment We runour algorithmwith 100 ants 150 ants 200 ants 400 ants Inthis section the number of experiments is set to be 100 Theexponent 120572 in (6) is set to be 2 Results are shown in Tables 3and 4

42 Comparison with Existing Heuristic Algorithms Accord-ing to the result of the above experiments we find 100 is theoptimal setting of the number of the ants We conduct anempirical study to examine the effect of our algorithm Toimprove the performance we use the competition approach[1] to enhance our algorithm The exponent 120572 in (6) is set asintegers ranging from 1 to 4 We illustrate the results amongthe information gain-based algorithm GA-1 GA-2 ACOwith centralized deletion and the ACO with distributed dele-tion in Table 5 For clarity the results on dataset Mushroomare shown in Figure 5

Our algorithm is tested on the UCI datasets 1000 timesrespectively The new algorithm with different techniques iscompared with the information gain-based algorithm [1] andthe genetic algorithm [6] Experimental results are listed inFigures 3 4 and 5 and Tables 3 4 and 5

43 Experimental Results Now we can answer the questionsproposed at the beginning of this section

(1) Experiment results indicate that the effect of thealgorithm becomes worse with the increment of theant counts We find that 100 is the optimal settingof the number of ants So we run our algorithmwith 100 ants to compare with the existing heuristicalgorithms

(2) From Figures 3 and 4 we infer that the distributeddeletion outperforms the centralized one on themedium-sized dataset Mushroom without the com-petition approach When we use the competitionapproach the centralized deletion strategy and thedistributed strategy produce the similar quality ofresults shown in Table 5 and Figure 5

Mathematical Problems in Engineering 7

Table 3 The finding optimal factor of ACO with different numbers of ants without competition method using centralized deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 066 070 073 068 062 069 061Iris Normal 053 045 044 043 037 040 043

Pareto 095 093 094 096 087 093 094Uniform 075 067 070 068 070 068 063

Zoo Normal 066 048 038 029 030 033 034Pareto 099 100 096 094 097 095 084Uniform 090 094 084 080 086 081 083

Voting Normal 100 094 091 095 095 090 093Pareto 098 099 098 099 099 099 094Uniform 096 091 089 078 068 069 068

Tic-tac-toe Normal 100 096 096 096 089 086 083Pareto 100 100 100 099 099 099 100Uniform 075 072 069 064 059 060 049

Mushroom Normal 061 055 052 051 039 041 031Pareto 097 097 096 095 095 096 094

Table 4 The finding optimal factor of ACO with different numbers of ants without competition method using distributed deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 072 070 068 067 072 063 072Iris Normal 042 040 037 040 048 041 042

Pareto 090 095 094 093 092 091 097Uniform 072 069 069 066 063 064 055

Zoo Normal 059 050 039 033 033 036 027Pareto 099 095 093 093 097 091 094Uniform 093 084 087 086 081 079 083

Voting Normal 100 100 097 098 096 096 097Pareto 100 098 099 097 097 097 098Uniform 090 078 074 077 071 074 066

Tic-tac-toe Normal 099 099 092 090 090 091 077Pareto 100 100 100 099 100 100 099Uniform 080 071 070 062 053 059 058

Mushroom Normal 062 054 053 034 033 027 029Pareto 099 099 096 095 095 094 094

Table 5 Results of 120582-information gain algorithm and the test cost based ant colony optimization with centralized deletion and distributeddeletion

Dataset Distribution Information gain GA-1 GA-2 ACO centralized ACO distributedFOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF

Uniform 0915 0193 0004 0966 0118 0001 0947 0224 0002 0973 0177 0001 0975 0216 0001Zoo Normal 0833 0028 0001 0970 0109 0001 0924 0101 0002 0849 0040 0002 0844 0048 0002

Pareto 0999 0167 0000 0998 0167 0000 0993 0200 0001 1000 0000 0000 1000 0000 0000Uniform 0998 0053 0000 1000 0000 0000 0999 0009 0000 0997 0065 0000 1000 0000 0000

Voting Normal 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 1000 0000 0000 1000 0000 0000 0995 0067 0000 1000 0000 0000 1000 0000 0000Uniform 0877 0055 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000

Tic-tac-toe Normal 0408 0018 0003 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 0986 0125 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Uniform 0762 0523 0023 0448 1000 0067 0766 0543 0019 0948 0085 0002 0946 0144 0002

Mushroom Normal 0176 0306 0010 0321 0401 0063 0551 0340 0022 0958 0020 0000 0966 0020 0000Pareto 0733 0500 0064 0935 0400 0015 0949 0500 0012 1000 0000 0000 1000 0000 0000

8 Mathematical Problems in Engineering

1

09

08

07

06

05

04100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

08

06

04

02100 200 300 400

Number of antsFi

ndin

g op

timal

fact

or (F

OF)

(b)

1

09

08100 200 300 400

Number of ants

IrisZooVoting

Tic-tac-toeMushroom

Find

ing

optim

al fa

ctor

(FO

F)

095

085

(c)

Figure 3 ACO with centralized deletion on different distributions (a) uniform (b) normal (c) pareto

(3) The ant colony optimization provides better perfor-mance than the information gain one and the geneticoneOur algorithmovercomes the shortcoming of thegenetic algorithmwhich performs badly on the largerdataset On medium-sized dataset Mushroom theACO outperforms the genetic one significantly Forexample when the test cost distribution is normal theFOF of the ant colony optimization using centralized

deletion and distributed deletion is 958 966respectively outperforming the information gain oneGA-1 and GA-2 with 176 321 and 521 respec-tively Figure 5 gives a more intuitive understandingof results on the dataset Mushroom One possiblereason is that the ant colony optimization algorithmproduces more diverse solutions than the existingone

Mathematical Problems in Engineering 9

1

09

08

07

06

05100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

09

08

07

06

05

04

100 200 300 400Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

03

02150 250 350

(b)

100 200 300 400Number of ants

IrisZooVoting

Tic-tac-toeMushroom

1

098

096

094

092

09

Find

ing

optim

al fa

ctor

(FO

F)

(c)

Figure 4 ACO with distributed deletion on different distributions (a) uniform (b) normal (c) Pareto

5 Conclusions

In this paper we have pointed out the shortcoming of theexisting heuristic algorithms including the information gain-based one and the genetic one We have designed a newalgorithm based on ant colony optimization to tackle theMTR problem In deletion stage our algorithm containscentralized deletion strategy and distributed deletion strat-egy We have tested the new one with three representativetest cost distributions on four UCI datasets Experimentalresults show that the new algorithm outperforms the existing

ones significantly especially 1lsquoon medium-sized dataset suchas Mushroom Moreover when distribution is normal thedistributed deletion strategy obtains better results than thecentralized one on Mushroom dataset

Acknowledgments

This work is in part supported by the National ScienceFoundation of China under Grant no 61170128 the NaturalScience Foundation of Fujian Province China under Grantno 2012J01294 State Key Laboratory of Management and

10 Mathematical Problems in Engineering

Uniform Normal Pareto0

02

04

06

08

1

Test cost distribution

Find

ing

optim

al fa

ctor

(FO

F)12

(a)

1

09

08

07

04

03

02

01

0Uniform Normal Pareto

Test cost distribution

Max

imal

exce

edin

g fa

ctor

05

06

(b)

01

009

008

007

006

005

004

003

002

001

0Uniform Normal Pareto

Test cost distribution

Information gainGA-1GA-2

ACO centralized deletionACO distributed deletion

Aver

age e

xcee

ding

fact

or

(c)

Figure 5 Result on Mushroom (a) finding optimal factor (b) maximal exceeding factor (c) average exceeding factor

Control for Complex Systems Open Project under Grantno 20110106 and Fujian Province Foundation of HigherEducation under Grant no JK2012028

References

[1] F Min H He Y Qian and W Zhu ldquoTest-cost-sensitiveattribute reductionrdquo Information Sciences vol 181 no 22 pp4928ndash4942 2011

[2] F Min and W Zhu ldquoMinimal cost attribute reduction throughbacktrackingrdquo in Proceedings of the International Conference onDatabase Theory and Application vol 258 of FGIT-DTABSBTpp 100ndash107 CCIS 2011

[3] Y Y Yao and Y Zhao ldquoAttribute reduction in decision-theoreticrough set modelsrdquo Information Sciences vol 178 no 17 pp3356ndash3373 2008

[4] X Jia W Liao Z Tang and L Shang ldquoMinimum cost attributereduction in decision-theoretic rough set modelsrdquo InformationSciences vol 219 pp 151ndash167 2013

Mathematical Problems in Engineering 11

[5] Z Pawlak ldquoRough setsrdquo International Journal of Computer andInformation Sciences vol 11 no 5 pp 341ndash356 1982

[6] G Pan F Min and W Zhu ldquoA genetic algorithm to theminimal test cost reduct problemrdquo in Proceedings of the IEEEInternational Conference on Granular Computing (GrC rsquo11) pp539ndash544 Kaohsiung Taiwan November 2011

[7] F Min and Q Liu ldquoA hierarchical model for test-cost-sensitivedecision systemsrdquo Information Sciences vol 179 no 14 pp2442ndash2452 2009

[8] W X Zhang J S Mi and W Z Wu ldquoKnowledge reductions ininconsistent information systemsrdquo Chinese Journal of Comput-ers vol 26 no 1 pp 12ndash18 2003

[9] Q Liu L Chen J Zhang and F Min ldquoKnowledge reduction ininconsistent decision tablesrdquo in Proceedings of Advanced DataMining and Applications vol 4093 of Lecture Notes in ComputerScience pp 626ndash635 2006

[10] W Ziarko ldquoVariable precision rough set modelrdquo Journal ofComputer and System Sciences vol 46 no 1 pp 39ndash59 1993

[11] J G Bazan and A Skowron ldquoDynamic reducts as a tool forextracting laws from decision tablesrdquo in Proceedings of the8th International Symposium on Methodologies for IntelligentSystems pp 346ndash355 1994

[12] D Deng ldquoParallel reduct and its propertiesrdquo in Proceedings ofthe International Conference on Granular Computing (GRC rsquo09)pp 121ndash125 Nanchang China August 2009

[13] D Deng J Wang and X Li ldquoParallel reducts in a seriesof decision subsystemsrdquo in Proceedings of the InternationalJoint Conference on Computational Sciences and Optimization(CSO rsquo09) pp 377ndash380 Hainan China April 2009

[14] F Min Z Bai M He and Q Liu ldquoThe reduct problemwith specified attributesrdquo in Proceedings of the Rough Setsand Soft Computing in Intelligent Agent and Web TechnologyInternational Workshop at WI-IAT pp 36ndash42 2005

[15] W Zhu and F Wang ldquoOn three types of covering rough setsrdquoIEEE Transactions on Knowledge and Data Engineering vol 19no 8 pp 1131ndash1144 2007

[16] W Zhu and F Y Wang ldquoReduction and axiomization ofcovering generalized rough setsrdquo Information Sciences vol 152pp 217ndash230 2003

[17] C C Aggarwal ldquoOn density based transforms for uncertaindata miningrdquo in Proceedings of the 23rd IEEE InternationalConference on Data Engineering pp 866ndash875 2007

[18] H Zhao F Min and W Zhu ldquoTest-cost-sensitive attributereduction of data with normal distribution measurementerrorsrdquoMathematical Problems in Engineering vol 2013 ArticleID 946070 12 pages 2013

[19] D Slęzak ldquoApproximate entropy reductsrdquo Fundamenta Infor-maticae vol 53 no 3-4 pp 365ndash390 2002

[20] F Min X Du H Qiu and Q Liu ldquoMinimal attribute spacebias for attribute reductionrdquo in Proceedings of the Rough Set andKnowledge Technology pp 379ndash386 2007

[21] W Zhu ldquoTopological approaches to covering rough setsrdquoInformation Sciences vol 177 no 6 pp 1499ndash1508 2007

[22] S Greco B Matarazzo R Slowinski and J StefanowskildquoVariable consistency model of dominance-based rough setsapproachrdquo in Proceedings of the Rough Sets and Current Trendsin Computing vol 2005 of Lecture Notes in Computer Sciencepp 170ndash181 2000

[23] Q Hu D Yu J Liu and CWu ldquoNeighborhood rough set basedheterogeneous feature subset selectionrdquo Information Sciencesvol 178 no 18 pp 3577ndash3594 2008

[24] YQian J LiangW Pedrycz andCDang ldquoPositive approxima-tion an accelerator for attribute reduction in rough set theoryrdquoArtificial Intelligence vol 174 no 9-10 pp 597ndash618 2010

[25] Wikipedia ldquoGenetic algorithmrdquo httpenwikipediaorgwikiGeneticalgorithm

[26] P D Turney ldquoCost-sensitive classification empirical evaluationof a hybrid genetic decision tree induction algorithmrdquo Journalof Artificial Intelligence Research vol 2 pp 369ndash409 1995

[27] J Liu F Min S Liao and W Zhu ldquoA genetic algorithm toattribute reduction with test cost constraintrdquo in Proceedings ofthe 6th IEEE International Conference on Computer Sciences andConvergence Information Technology (ICCIT rsquo11) pp 751ndash7542011

[28] M Dorigo and T Stutzle Eds Ant Colony Optimization MITPress Boston Mass USA 2004

[29] L Gambardella and M Dorigo ldquoHas-sop hybrid ant systemfor the sequential ordering problemrdquo Tech Rep (IDSIA-11-97)1997

[30] L M Gambardella E Taillard G Agazzi I D Corne MDorigo and F Glover ldquoMacs-vrptw a multiple ant colonysystem for vehicle routing problemswith timewindowsrdquo inNewIdeas in Optimization pp 63ndash76 McGraw-Hill New York NYUSA 1999

[31] V Maniezzo and A Colorni ldquoThe ant system applied to thequadratic assignment problemrdquo IEEE Transactions on Knowl-edge and Data Engineering vol 11 no 5 pp 769ndash778 1999

[32] C Blum M Blesa and U P de Catalunya Departament deLlenguatges i Sistemes InformaticsMetaheuristics for the edge-weighted k-cardinality tree problem 2003

[33] R Michel and M Middendorf ldquoAn aco algorithm for theshortest common supersequence problemrdquo in New Ideas inOptimization pp 51ndash62 McGraw-Hill London UK 1999

[34] Y ChenDMiao andRWang ldquoA rough set approach to featureselection based on ant colony optimizationrdquoPatternRecognitionLetters vol 31 no 3 pp 226ndash233 2010

[35] L Ke Z Feng and Z Ren ldquoAn efficient ant colony optimizationapproach to attribute reduction in rough set theoryrdquo PatternRecognition Letters vol 29 no 9 pp 1351ndash1357 2008

[36] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998 httpwwwicsuciedusimmlearnmlrepositoryhtml

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 3

reducts exist [5 19 23 24] for different rough sets modelsThis paper employs the definition based on the positiveregion

24 The Genetic Algorithm In the computer science field ofartificial intelligence a genetic algorithm (GA) is a searchheuristic that mimics the process of natural evolution [25]This heuristic (also sometimes called a metaheuristic) isroutinely used to generate useful solutions to optimizationand search problems Genetic algorithms belong to thelarger class of evolutionary algorithms (EA) which generatesolutions to optimization problems using techniques inspiredby natural evolution such as inheritancemutation selectionand crossover [25] In [26] the genetic algorithm is employedto evolve the cost-sensitive decision trees Recently thegenetic algorithm has been employed to tackle the minimaltest cost reduction problem [6] and attribute reduction withtest cost constraint [27]

25 The Ant Colony Optimization Swarm intelligence is arelatively new approach to problem solving that takes inspi-ration from the social behaviors of insects and other animals[28] The ant colony optimization (ACO) algorithm is aprobabilistic technique for solving computational problemswhich can be reduced to finding good paths through graphsACO algorithms are state-of-the-art for the sequential order-ing problem [29] the vehicle routing problem with timewindow constraints [30] the quadratic assignment problem[31] the arc-weighted l-cardinality tree problem [32] and theshortest common supersequence problem [33] In rough setsthe classical attribute reduction problem has been tackled bythe ACO [34 35]

26 Evaluation Measures For evaluating the experimentresults we adopt evaluation measures proposed in [1] Thenew algorithm is compared with the information gain-basedheuristic algorithm on four UCI datasets [36]

We need a measure to evaluate the quality of one partic-ular reduction Since an algorithm can run on many datasetsor one dataset with different test cost settings we adopt threemetrics from a statistical viewpointThey are finding optimalfactor (FOF) maximal exceeding factor (MEF) and averageexceeding factor (AEF) [1]

261 Finding Optimal Factor Let the number of experimentsbe 119870 and the number of successful searches of an optimalreduct 119896 The finding optimal factor is defined as

op =119896

119870 (2)

262 Exceeding Factor For a dataset with a particular testcost setting let 1198771015840 be an optimal reductThe exceeding factorof a reduct 119877 is

ef (119877) =119888lowast

(119877) minus 119888lowast

(1198771015840

)

119888lowast (1198771015840) (3)

The exceeding factor provides a quantitative metric to eval-uate the performance of a reduct It indicates the badnessof a reduct when it is not optimal Naturally if 119877 is anoptimal reduct the exceeding factor is 0 To demonstrate theperformance of an algorithm statistical metrics are neededLet the number of experiments be 119870 In the 119894th experiment(1 le 119894 le 119870) the reduct computed by the algorithm is denotedby 119877119894 The maximal exceeding factor is defined as

max1le119894le119870

ef (119877119894) (4)

This shows the worst case of the algorithm given somedataset Although it relates to the performance of one partic-ular reduct it should be viewed as a statistical rather than anindividual metric The average exceeding factor is defined as

119870

sum119894=1

ef (119877119894)

119870 (5)

Since it is averaged on119870 different test-cost-sensitive decisionsystems it shows the overall performance of the algorithmsolely from a statistical perspective

3 The Algorithm

In this section one hand we revise the classical ant colonyoptimization On the other hand we present our ant colonyoptimization with different techniques to tackle the minimaltest cost reduct problem

31 The Problem Representation and Algorithm FrameworkIn order to apply ant colony optimization to tackle theminimal test cost reduction problem we adopt the followingmodel

Graph The decision system is represented as a graph

Vertex An attribute is represented as a vertex Eachfeature vertex has information about test cost

EdgeThere is an edge between any two vertexes Eachedge has information on pheromone density

Adjacent Matrix The values of the matrix representpheromone density of each edge

Because our objective is attribute reduction not classi-fication which produces rule sets we do not adopt the treestructure such as the ant colony decision tree In this sectionwe employ the first simplified ant colony optimizationmdashASWe represent the general algorithm framework as follows

Stage 1 (addition stage) Compute the core of the dataset 119870batches of ants are created with each batch containing 119896 antstherefore giving a total of 119870 times 119896 ants Each ant takes the coreas the starting position From the initial positions each ant

4 Mathematical Problems in Engineering

traverses edges probabilistically until the stopping criterionis satisfied

Stage 2 (deletion stage) Delete redundant attributes andupdate the pheromone of the traveled path

Stage 3 (filtration stage)Gather the attribute subsets obtainedby ants and compute their test cost Choose the attributesubset with minimal test cost and output it as the result

Throughout the paper the stopping criterion is thepositive region condition The selecting probability dependson the test cost of each adjacent attribute vertex and thepheromone of each adjacent edgeTheprobabilistic transitionrule is

119901119896

119894119895=

[120591119895]120572

[120578119894119895]120573

sum119897isinN119896119894

[120591119897]120572

[120578119894119897]120573

119895 isin N119896

119894 (6)

where 119896 is the number of the ant 120591119895

= 1000119905119888(119886119895) as the

heuristic information 120578119894119895means the pheromone density of

the edge (119894 119895) 120572 is the exponent of 120591119895 120573 is the exponent of

120578119894119895 andN119896

119894is the set of all unvisited adjacent vertexes of the

attribute 119894The difference between the centralized deletion strategy

and the distributed strategy is the time to delete attributesThe deletion of redundant attributes follows the finish of thejourney of all ants in the case of the centralized deletionstrategy When using the distributed deletion strategy thealgorithm deletes redundant attributes after each ant travelsa path and then it adjusts the path In this situation thepheromone of edges adjoining to redundant attribute ver-texes will not be updated The adjusting method is illustratedin Figure 2 Monte Carlo methods are a broad class of com-putational algorithms that rely on repeated random samplingto obtain numerical results that is by running simulationsmany times over in order to calculate those same probabilitiesheuristically just like actually playing and recording yourresults in a real casino situation hence the name We employMonte Carlo method to simulate the process that an artificialant selects the next attribute vertex probabilistically

32 The ACO with Distributed Deletion We represent thesubstantial algorithm of the ant colony optimization withdistributed deletion strategy as Algorithm 1 The algorithmwith centralized deletion strategy is similar

We explain the algorithm in details as follows Lines 1through 8 correspond to Stage 1 in the algorithm frameworkIn lines 2 through 4 119870 batches of ants are created with eachbatch containing 119896 ants therefore giving a total of119870times119896 antsEach ant takes the core as indicated by line 5 Lines 9 through21 represent Stage 2 Select vertexes until all ants in the batchmeet the positive region condition Whenever an ant stopsit removes the redundant attributes and adjusts the traveledpath In line 18 the ants release pheromone to the adjustedpaths After all ants finish their journey select the best reductas lines 22 through 27 have shown Obviously these linesrelate to Stage 3 If we adopt the centralized deletion strategythe algorithm deletes redundant attributes after all ants finishtheir journey

33 A Running Example The TCI-DS is given by Tables 1and 2 We illustrate the process of an ant in Figure 1 whereattributes Headache Muscle Temperature and Snivelare represented by 0 1 2 and 3 respectively

After an ant travels a path the redundant attributes areremoved After deletion the algorithm reconstructs the pathThis adjusting process is represented in Figure 2

Each ant is placed on an attribute randomly For simplic-ity assume 119896 = 1 namely there is one artificial ant in onebatch Consider Core(119878) = 0 hence any ant initially takes novertex

Stage 1 (addition stage)

Stage 11 (artificial ants start) Assume ant[0] starts from theattribute[0]

Stage 12 (artificial ants select attributes)

Iteration 1 Ant[0] is at the attribute Headache Ant[0]has three attributes to select and they are Muscle painTemperature and Snivel

According to (6) the denominator of selecting probabilityis (100025)2 sdot 12 + (100040)

2

sdot 12

+ (100020)2

sdot 12

= 1600 +

625 + 2500 = 4725We compute the probability of three selections

11990101

= 16004725 = 0339

11990102

=625

4725= 0132

11990103

=2500

4725= 0529

(7)

Because 11990103

gt 11990101

gt 11990102 the ant will select this as next

feature with high probability That is to say the ant does notselect the attribute necessarily In this case we assume the antchoose the feature Snivel At the same time the ant will addthe selected attribute to the visited attribute set We find theselected attribute set the ant has visited does not satisfy thepositive region condition so it will continue to select nextattribute

Iteration 2 Ant[0] is at the attribute Snivel currentlyThe candidate features are Muscle pain and

Temperature According to the probability function thedenominator of selecting probability is (100025)

2

sdot 12

+

(100040)2

sdot 12

= 1600 + 625 = 2225Consider

11990131

=1600

2225= 0719

11990132

=625

2225= 0281

(8)

More over 11990131

gt 11990132 then the artificial ant[0] almost

selects the attribute Muscle pain

Mathematical Problems in Engineering 5

Input 119878 = (119880 119862119863 119881 119868 119888)

Output119872 a reduct of 119878Method acomtr(1)119872 = 0the best reduct obtained by ants(2) for (119894 = 0 119894 lt 119870 119894 ++) do(3) for (119895 = 0 119895 lt 119896 119895 + +) do(4) create ant[119894 times 119896 + 119895](5) 119861

119894times119896+119895= core(119878)takes the set of core attributes

(6) end forEach ant selects attributes

(7) for (each119898 isin [119894 times 119896 (119894 + 1) times 119896] where ant[119898] has not stopped) do(8) MonteCarlo(119905119888 119901ℎ)select the next vertex(9) if (119875119874119878

119861119898(119863) == 119875119874119878

119862(119863)) then

(10) ant[119898] stops Delete redundant vertexes(11) 119898119905119888 = 10000minimal test cost(12) for (119895 = |119861

119894| minus 1 119895 gt= |Core(119878)| 119895--) do

(13) if (119875119874119878(119861119894minus119886119895)

(119863) == 119875119874119878119861119894(119863)) then

(14) 119861119894= 119861119894minus 119886119895

(15) end if(16) end for(17) Adjust the traveled path(18) Update the pheromone of edges ant[119898] traveled(19) end if(20) end for(21) end for

Choose the minimal test cost reduction from the last 119899 ants(22) for (119894 = 119870 times 119896 minus 119899 119894 lt= 119870 times 119896 minus 1 119894 ++) do(23) if 119905119888(119861

119894) le 119898119905119888 then

(24) 119898119905119888 = 119905119888(119861119894)

(25) 119872 = 119861119894

(26) end if(27) end for(28) return119872

Algorithm 1 ACO with distributed deletion to MTR problem

Start

0 1

2 3

(a)

p01 = 0339

tc = 202 3

p02 = 0132

tc = 250 1

p03 = 0529

tc = 40

(b)

tc = 250 1

tc = 402 3

p32 = 0281

p31 = 0719

(c)

0 1

2 3

(d)

Figure 1 (a) Start from the attribute Headache (b) Select second attribute (c) Select third attribute (d) Stop

6 Mathematical Problems in Engineering

10

32

(a)

10

32

(b)

2 3

0

(c)

Figure 2 (a) The path which an ant traveled (b) Delete theredundant attribute (c) Reconstruct the path

Now the attribute set obtained by ant[0] has satisfiedthe positive region constraint The ant[0] stops working andupdates pheromone information

Stage 13 (pheromone updating) After an artificial ant stopsworking it will update the pheromone density of edgestraveled by itself In our algorithm when one path is crossedby an ant the pheromone diffusion will increase by one Ofcourse the rule of pheromone updating may be designed inother methods We adopt the way in this paper

In this example the sequence attribute selection of ant[0]is Headache Snivel and Muscle pain The pheromoneof edge (0 3) and edge (3 1) increases by one

Each artificial ant runs in above three stages

Stage 2 (deletion stage) After all ants have stopped workingthe algorithm will delete redundant attributes from theselected attribute sets of last few ants using positive regionIf an attribute does not contribute to the positive region weremove it

Stage 3 (filtration stage) After deletion we choose the reductobtained from last 119899 ants with minimal test cost as the result

In this example we adopt the centralized deletion strat-egy Figure 2 illustrates the distributed deletion strategy in aniteration In the figure the ant travels the paths 0 1 2 and 3Suppose the attribute 1 is a redundant attribute we remove itfrom the attribute subset and adjust the path The adjustingprocess is shown in Figure 2(c)

4 Experiments

In this section we try to answer the following questions byexperimentation

(1) How does the number of ants in the ant colonyinfluence the result

(2) How does the strategy of deleting redundantattributes influence the quality of the result

(3) Does our ant colony optimization algorithm outper-form the existing one

The UCI datasets we used are Zoo Voting Tic-tac-toeandMushroom Since these datasets have no test cost settingsfor statistical purposes we apply three common distributionsto generate random test cost The three distributions areuniform distribution normal distribution and Pareto distri-bution In this paper the test cost is a random integer ranging

from 1 to 100The exponent 120573 in (6) is set to be 2 since undermany circumstances this value is a good setup [28]

We do not design the parameter learning mechanismThe employed competition approach has the selection mech-anism of the parameter Different applications use the samerange and users do not specify the value of the parameterThestrategy is more straightforward than many other parametertuning strategies However we may design other strategiesto save time Finding optimal factor minimal exceedingfactor and average exceeding factor are employed to measurethe effectiveness of the algorithms We run each algorithmon datasets with 1000 times The results have statisticalcharacteristics

41 The Influence of Ant Counts on Experiment Results Inorder to find the relationship between the number of ants andthe quality of the result we conduct this experiment We runour algorithmwith 100 ants 150 ants 200 ants 400 ants Inthis section the number of experiments is set to be 100 Theexponent 120572 in (6) is set to be 2 Results are shown in Tables 3and 4

42 Comparison with Existing Heuristic Algorithms Accord-ing to the result of the above experiments we find 100 is theoptimal setting of the number of the ants We conduct anempirical study to examine the effect of our algorithm Toimprove the performance we use the competition approach[1] to enhance our algorithm The exponent 120572 in (6) is set asintegers ranging from 1 to 4 We illustrate the results amongthe information gain-based algorithm GA-1 GA-2 ACOwith centralized deletion and the ACO with distributed dele-tion in Table 5 For clarity the results on dataset Mushroomare shown in Figure 5

Our algorithm is tested on the UCI datasets 1000 timesrespectively The new algorithm with different techniques iscompared with the information gain-based algorithm [1] andthe genetic algorithm [6] Experimental results are listed inFigures 3 4 and 5 and Tables 3 4 and 5

43 Experimental Results Now we can answer the questionsproposed at the beginning of this section

(1) Experiment results indicate that the effect of thealgorithm becomes worse with the increment of theant counts We find that 100 is the optimal settingof the number of ants So we run our algorithmwith 100 ants to compare with the existing heuristicalgorithms

(2) From Figures 3 and 4 we infer that the distributeddeletion outperforms the centralized one on themedium-sized dataset Mushroom without the com-petition approach When we use the competitionapproach the centralized deletion strategy and thedistributed strategy produce the similar quality ofresults shown in Table 5 and Figure 5

Mathematical Problems in Engineering 7

Table 3 The finding optimal factor of ACO with different numbers of ants without competition method using centralized deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 066 070 073 068 062 069 061Iris Normal 053 045 044 043 037 040 043

Pareto 095 093 094 096 087 093 094Uniform 075 067 070 068 070 068 063

Zoo Normal 066 048 038 029 030 033 034Pareto 099 100 096 094 097 095 084Uniform 090 094 084 080 086 081 083

Voting Normal 100 094 091 095 095 090 093Pareto 098 099 098 099 099 099 094Uniform 096 091 089 078 068 069 068

Tic-tac-toe Normal 100 096 096 096 089 086 083Pareto 100 100 100 099 099 099 100Uniform 075 072 069 064 059 060 049

Mushroom Normal 061 055 052 051 039 041 031Pareto 097 097 096 095 095 096 094

Table 4 The finding optimal factor of ACO with different numbers of ants without competition method using distributed deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 072 070 068 067 072 063 072Iris Normal 042 040 037 040 048 041 042

Pareto 090 095 094 093 092 091 097Uniform 072 069 069 066 063 064 055

Zoo Normal 059 050 039 033 033 036 027Pareto 099 095 093 093 097 091 094Uniform 093 084 087 086 081 079 083

Voting Normal 100 100 097 098 096 096 097Pareto 100 098 099 097 097 097 098Uniform 090 078 074 077 071 074 066

Tic-tac-toe Normal 099 099 092 090 090 091 077Pareto 100 100 100 099 100 100 099Uniform 080 071 070 062 053 059 058

Mushroom Normal 062 054 053 034 033 027 029Pareto 099 099 096 095 095 094 094

Table 5 Results of 120582-information gain algorithm and the test cost based ant colony optimization with centralized deletion and distributeddeletion

Dataset Distribution Information gain GA-1 GA-2 ACO centralized ACO distributedFOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF

Uniform 0915 0193 0004 0966 0118 0001 0947 0224 0002 0973 0177 0001 0975 0216 0001Zoo Normal 0833 0028 0001 0970 0109 0001 0924 0101 0002 0849 0040 0002 0844 0048 0002

Pareto 0999 0167 0000 0998 0167 0000 0993 0200 0001 1000 0000 0000 1000 0000 0000Uniform 0998 0053 0000 1000 0000 0000 0999 0009 0000 0997 0065 0000 1000 0000 0000

Voting Normal 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 1000 0000 0000 1000 0000 0000 0995 0067 0000 1000 0000 0000 1000 0000 0000Uniform 0877 0055 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000

Tic-tac-toe Normal 0408 0018 0003 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 0986 0125 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Uniform 0762 0523 0023 0448 1000 0067 0766 0543 0019 0948 0085 0002 0946 0144 0002

Mushroom Normal 0176 0306 0010 0321 0401 0063 0551 0340 0022 0958 0020 0000 0966 0020 0000Pareto 0733 0500 0064 0935 0400 0015 0949 0500 0012 1000 0000 0000 1000 0000 0000

8 Mathematical Problems in Engineering

1

09

08

07

06

05

04100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

08

06

04

02100 200 300 400

Number of antsFi

ndin

g op

timal

fact

or (F

OF)

(b)

1

09

08100 200 300 400

Number of ants

IrisZooVoting

Tic-tac-toeMushroom

Find

ing

optim

al fa

ctor

(FO

F)

095

085

(c)

Figure 3 ACO with centralized deletion on different distributions (a) uniform (b) normal (c) pareto

(3) The ant colony optimization provides better perfor-mance than the information gain one and the geneticoneOur algorithmovercomes the shortcoming of thegenetic algorithmwhich performs badly on the largerdataset On medium-sized dataset Mushroom theACO outperforms the genetic one significantly Forexample when the test cost distribution is normal theFOF of the ant colony optimization using centralized

deletion and distributed deletion is 958 966respectively outperforming the information gain oneGA-1 and GA-2 with 176 321 and 521 respec-tively Figure 5 gives a more intuitive understandingof results on the dataset Mushroom One possiblereason is that the ant colony optimization algorithmproduces more diverse solutions than the existingone

Mathematical Problems in Engineering 9

1

09

08

07

06

05100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

09

08

07

06

05

04

100 200 300 400Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

03

02150 250 350

(b)

100 200 300 400Number of ants

IrisZooVoting

Tic-tac-toeMushroom

1

098

096

094

092

09

Find

ing

optim

al fa

ctor

(FO

F)

(c)

Figure 4 ACO with distributed deletion on different distributions (a) uniform (b) normal (c) Pareto

5 Conclusions

In this paper we have pointed out the shortcoming of theexisting heuristic algorithms including the information gain-based one and the genetic one We have designed a newalgorithm based on ant colony optimization to tackle theMTR problem In deletion stage our algorithm containscentralized deletion strategy and distributed deletion strat-egy We have tested the new one with three representativetest cost distributions on four UCI datasets Experimentalresults show that the new algorithm outperforms the existing

ones significantly especially 1lsquoon medium-sized dataset suchas Mushroom Moreover when distribution is normal thedistributed deletion strategy obtains better results than thecentralized one on Mushroom dataset

Acknowledgments

This work is in part supported by the National ScienceFoundation of China under Grant no 61170128 the NaturalScience Foundation of Fujian Province China under Grantno 2012J01294 State Key Laboratory of Management and

10 Mathematical Problems in Engineering

Uniform Normal Pareto0

02

04

06

08

1

Test cost distribution

Find

ing

optim

al fa

ctor

(FO

F)12

(a)

1

09

08

07

04

03

02

01

0Uniform Normal Pareto

Test cost distribution

Max

imal

exce

edin

g fa

ctor

05

06

(b)

01

009

008

007

006

005

004

003

002

001

0Uniform Normal Pareto

Test cost distribution

Information gainGA-1GA-2

ACO centralized deletionACO distributed deletion

Aver

age e

xcee

ding

fact

or

(c)

Figure 5 Result on Mushroom (a) finding optimal factor (b) maximal exceeding factor (c) average exceeding factor

Control for Complex Systems Open Project under Grantno 20110106 and Fujian Province Foundation of HigherEducation under Grant no JK2012028

References

[1] F Min H He Y Qian and W Zhu ldquoTest-cost-sensitiveattribute reductionrdquo Information Sciences vol 181 no 22 pp4928ndash4942 2011

[2] F Min and W Zhu ldquoMinimal cost attribute reduction throughbacktrackingrdquo in Proceedings of the International Conference onDatabase Theory and Application vol 258 of FGIT-DTABSBTpp 100ndash107 CCIS 2011

[3] Y Y Yao and Y Zhao ldquoAttribute reduction in decision-theoreticrough set modelsrdquo Information Sciences vol 178 no 17 pp3356ndash3373 2008

[4] X Jia W Liao Z Tang and L Shang ldquoMinimum cost attributereduction in decision-theoretic rough set modelsrdquo InformationSciences vol 219 pp 151ndash167 2013

Mathematical Problems in Engineering 11

[5] Z Pawlak ldquoRough setsrdquo International Journal of Computer andInformation Sciences vol 11 no 5 pp 341ndash356 1982

[6] G Pan F Min and W Zhu ldquoA genetic algorithm to theminimal test cost reduct problemrdquo in Proceedings of the IEEEInternational Conference on Granular Computing (GrC rsquo11) pp539ndash544 Kaohsiung Taiwan November 2011

[7] F Min and Q Liu ldquoA hierarchical model for test-cost-sensitivedecision systemsrdquo Information Sciences vol 179 no 14 pp2442ndash2452 2009

[8] W X Zhang J S Mi and W Z Wu ldquoKnowledge reductions ininconsistent information systemsrdquo Chinese Journal of Comput-ers vol 26 no 1 pp 12ndash18 2003

[9] Q Liu L Chen J Zhang and F Min ldquoKnowledge reduction ininconsistent decision tablesrdquo in Proceedings of Advanced DataMining and Applications vol 4093 of Lecture Notes in ComputerScience pp 626ndash635 2006

[10] W Ziarko ldquoVariable precision rough set modelrdquo Journal ofComputer and System Sciences vol 46 no 1 pp 39ndash59 1993

[11] J G Bazan and A Skowron ldquoDynamic reducts as a tool forextracting laws from decision tablesrdquo in Proceedings of the8th International Symposium on Methodologies for IntelligentSystems pp 346ndash355 1994

[12] D Deng ldquoParallel reduct and its propertiesrdquo in Proceedings ofthe International Conference on Granular Computing (GRC rsquo09)pp 121ndash125 Nanchang China August 2009

[13] D Deng J Wang and X Li ldquoParallel reducts in a seriesof decision subsystemsrdquo in Proceedings of the InternationalJoint Conference on Computational Sciences and Optimization(CSO rsquo09) pp 377ndash380 Hainan China April 2009

[14] F Min Z Bai M He and Q Liu ldquoThe reduct problemwith specified attributesrdquo in Proceedings of the Rough Setsand Soft Computing in Intelligent Agent and Web TechnologyInternational Workshop at WI-IAT pp 36ndash42 2005

[15] W Zhu and F Wang ldquoOn three types of covering rough setsrdquoIEEE Transactions on Knowledge and Data Engineering vol 19no 8 pp 1131ndash1144 2007

[16] W Zhu and F Y Wang ldquoReduction and axiomization ofcovering generalized rough setsrdquo Information Sciences vol 152pp 217ndash230 2003

[17] C C Aggarwal ldquoOn density based transforms for uncertaindata miningrdquo in Proceedings of the 23rd IEEE InternationalConference on Data Engineering pp 866ndash875 2007

[18] H Zhao F Min and W Zhu ldquoTest-cost-sensitive attributereduction of data with normal distribution measurementerrorsrdquoMathematical Problems in Engineering vol 2013 ArticleID 946070 12 pages 2013

[19] D Slęzak ldquoApproximate entropy reductsrdquo Fundamenta Infor-maticae vol 53 no 3-4 pp 365ndash390 2002

[20] F Min X Du H Qiu and Q Liu ldquoMinimal attribute spacebias for attribute reductionrdquo in Proceedings of the Rough Set andKnowledge Technology pp 379ndash386 2007

[21] W Zhu ldquoTopological approaches to covering rough setsrdquoInformation Sciences vol 177 no 6 pp 1499ndash1508 2007

[22] S Greco B Matarazzo R Slowinski and J StefanowskildquoVariable consistency model of dominance-based rough setsapproachrdquo in Proceedings of the Rough Sets and Current Trendsin Computing vol 2005 of Lecture Notes in Computer Sciencepp 170ndash181 2000

[23] Q Hu D Yu J Liu and CWu ldquoNeighborhood rough set basedheterogeneous feature subset selectionrdquo Information Sciencesvol 178 no 18 pp 3577ndash3594 2008

[24] YQian J LiangW Pedrycz andCDang ldquoPositive approxima-tion an accelerator for attribute reduction in rough set theoryrdquoArtificial Intelligence vol 174 no 9-10 pp 597ndash618 2010

[25] Wikipedia ldquoGenetic algorithmrdquo httpenwikipediaorgwikiGeneticalgorithm

[26] P D Turney ldquoCost-sensitive classification empirical evaluationof a hybrid genetic decision tree induction algorithmrdquo Journalof Artificial Intelligence Research vol 2 pp 369ndash409 1995

[27] J Liu F Min S Liao and W Zhu ldquoA genetic algorithm toattribute reduction with test cost constraintrdquo in Proceedings ofthe 6th IEEE International Conference on Computer Sciences andConvergence Information Technology (ICCIT rsquo11) pp 751ndash7542011

[28] M Dorigo and T Stutzle Eds Ant Colony Optimization MITPress Boston Mass USA 2004

[29] L Gambardella and M Dorigo ldquoHas-sop hybrid ant systemfor the sequential ordering problemrdquo Tech Rep (IDSIA-11-97)1997

[30] L M Gambardella E Taillard G Agazzi I D Corne MDorigo and F Glover ldquoMacs-vrptw a multiple ant colonysystem for vehicle routing problemswith timewindowsrdquo inNewIdeas in Optimization pp 63ndash76 McGraw-Hill New York NYUSA 1999

[31] V Maniezzo and A Colorni ldquoThe ant system applied to thequadratic assignment problemrdquo IEEE Transactions on Knowl-edge and Data Engineering vol 11 no 5 pp 769ndash778 1999

[32] C Blum M Blesa and U P de Catalunya Departament deLlenguatges i Sistemes InformaticsMetaheuristics for the edge-weighted k-cardinality tree problem 2003

[33] R Michel and M Middendorf ldquoAn aco algorithm for theshortest common supersequence problemrdquo in New Ideas inOptimization pp 51ndash62 McGraw-Hill London UK 1999

[34] Y ChenDMiao andRWang ldquoA rough set approach to featureselection based on ant colony optimizationrdquoPatternRecognitionLetters vol 31 no 3 pp 226ndash233 2010

[35] L Ke Z Feng and Z Ren ldquoAn efficient ant colony optimizationapproach to attribute reduction in rough set theoryrdquo PatternRecognition Letters vol 29 no 9 pp 1351ndash1357 2008

[36] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998 httpwwwicsuciedusimmlearnmlrepositoryhtml

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

4 Mathematical Problems in Engineering

traverses edges probabilistically until the stopping criterionis satisfied

Stage 2 (deletion stage) Delete redundant attributes andupdate the pheromone of the traveled path

Stage 3 (filtration stage)Gather the attribute subsets obtainedby ants and compute their test cost Choose the attributesubset with minimal test cost and output it as the result

Throughout the paper the stopping criterion is thepositive region condition The selecting probability dependson the test cost of each adjacent attribute vertex and thepheromone of each adjacent edgeTheprobabilistic transitionrule is

119901119896

119894119895=

[120591119895]120572

[120578119894119895]120573

sum119897isinN119896119894

[120591119897]120572

[120578119894119897]120573

119895 isin N119896

119894 (6)

where 119896 is the number of the ant 120591119895

= 1000119905119888(119886119895) as the

heuristic information 120578119894119895means the pheromone density of

the edge (119894 119895) 120572 is the exponent of 120591119895 120573 is the exponent of

120578119894119895 andN119896

119894is the set of all unvisited adjacent vertexes of the

attribute 119894The difference between the centralized deletion strategy

and the distributed strategy is the time to delete attributesThe deletion of redundant attributes follows the finish of thejourney of all ants in the case of the centralized deletionstrategy When using the distributed deletion strategy thealgorithm deletes redundant attributes after each ant travelsa path and then it adjusts the path In this situation thepheromone of edges adjoining to redundant attribute ver-texes will not be updated The adjusting method is illustratedin Figure 2 Monte Carlo methods are a broad class of com-putational algorithms that rely on repeated random samplingto obtain numerical results that is by running simulationsmany times over in order to calculate those same probabilitiesheuristically just like actually playing and recording yourresults in a real casino situation hence the name We employMonte Carlo method to simulate the process that an artificialant selects the next attribute vertex probabilistically

32 The ACO with Distributed Deletion We represent thesubstantial algorithm of the ant colony optimization withdistributed deletion strategy as Algorithm 1 The algorithmwith centralized deletion strategy is similar

We explain the algorithm in details as follows Lines 1through 8 correspond to Stage 1 in the algorithm frameworkIn lines 2 through 4 119870 batches of ants are created with eachbatch containing 119896 ants therefore giving a total of119870times119896 antsEach ant takes the core as indicated by line 5 Lines 9 through21 represent Stage 2 Select vertexes until all ants in the batchmeet the positive region condition Whenever an ant stopsit removes the redundant attributes and adjusts the traveledpath In line 18 the ants release pheromone to the adjustedpaths After all ants finish their journey select the best reductas lines 22 through 27 have shown Obviously these linesrelate to Stage 3 If we adopt the centralized deletion strategythe algorithm deletes redundant attributes after all ants finishtheir journey

33 A Running Example The TCI-DS is given by Tables 1and 2 We illustrate the process of an ant in Figure 1 whereattributes Headache Muscle Temperature and Snivelare represented by 0 1 2 and 3 respectively

After an ant travels a path the redundant attributes areremoved After deletion the algorithm reconstructs the pathThis adjusting process is represented in Figure 2

Each ant is placed on an attribute randomly For simplic-ity assume 119896 = 1 namely there is one artificial ant in onebatch Consider Core(119878) = 0 hence any ant initially takes novertex

Stage 1 (addition stage)

Stage 11 (artificial ants start) Assume ant[0] starts from theattribute[0]

Stage 12 (artificial ants select attributes)

Iteration 1 Ant[0] is at the attribute Headache Ant[0]has three attributes to select and they are Muscle painTemperature and Snivel

According to (6) the denominator of selecting probabilityis (100025)2 sdot 12 + (100040)

2

sdot 12

+ (100020)2

sdot 12

= 1600 +

625 + 2500 = 4725We compute the probability of three selections

11990101

= 16004725 = 0339

11990102

=625

4725= 0132

11990103

=2500

4725= 0529

(7)

Because 11990103

gt 11990101

gt 11990102 the ant will select this as next

feature with high probability That is to say the ant does notselect the attribute necessarily In this case we assume the antchoose the feature Snivel At the same time the ant will addthe selected attribute to the visited attribute set We find theselected attribute set the ant has visited does not satisfy thepositive region condition so it will continue to select nextattribute

Iteration 2 Ant[0] is at the attribute Snivel currentlyThe candidate features are Muscle pain and

Temperature According to the probability function thedenominator of selecting probability is (100025)

2

sdot 12

+

(100040)2

sdot 12

= 1600 + 625 = 2225Consider

11990131

=1600

2225= 0719

11990132

=625

2225= 0281

(8)

More over 11990131

gt 11990132 then the artificial ant[0] almost

selects the attribute Muscle pain

Mathematical Problems in Engineering 5

Input 119878 = (119880 119862119863 119881 119868 119888)

Output119872 a reduct of 119878Method acomtr(1)119872 = 0the best reduct obtained by ants(2) for (119894 = 0 119894 lt 119870 119894 ++) do(3) for (119895 = 0 119895 lt 119896 119895 + +) do(4) create ant[119894 times 119896 + 119895](5) 119861

119894times119896+119895= core(119878)takes the set of core attributes

(6) end forEach ant selects attributes

(7) for (each119898 isin [119894 times 119896 (119894 + 1) times 119896] where ant[119898] has not stopped) do(8) MonteCarlo(119905119888 119901ℎ)select the next vertex(9) if (119875119874119878

119861119898(119863) == 119875119874119878

119862(119863)) then

(10) ant[119898] stops Delete redundant vertexes(11) 119898119905119888 = 10000minimal test cost(12) for (119895 = |119861

119894| minus 1 119895 gt= |Core(119878)| 119895--) do

(13) if (119875119874119878(119861119894minus119886119895)

(119863) == 119875119874119878119861119894(119863)) then

(14) 119861119894= 119861119894minus 119886119895

(15) end if(16) end for(17) Adjust the traveled path(18) Update the pheromone of edges ant[119898] traveled(19) end if(20) end for(21) end for

Choose the minimal test cost reduction from the last 119899 ants(22) for (119894 = 119870 times 119896 minus 119899 119894 lt= 119870 times 119896 minus 1 119894 ++) do(23) if 119905119888(119861

119894) le 119898119905119888 then

(24) 119898119905119888 = 119905119888(119861119894)

(25) 119872 = 119861119894

(26) end if(27) end for(28) return119872

Algorithm 1 ACO with distributed deletion to MTR problem

Start

0 1

2 3

(a)

p01 = 0339

tc = 202 3

p02 = 0132

tc = 250 1

p03 = 0529

tc = 40

(b)

tc = 250 1

tc = 402 3

p32 = 0281

p31 = 0719

(c)

0 1

2 3

(d)

Figure 1 (a) Start from the attribute Headache (b) Select second attribute (c) Select third attribute (d) Stop

6 Mathematical Problems in Engineering

10

32

(a)

10

32

(b)

2 3

0

(c)

Figure 2 (a) The path which an ant traveled (b) Delete theredundant attribute (c) Reconstruct the path

Now the attribute set obtained by ant[0] has satisfiedthe positive region constraint The ant[0] stops working andupdates pheromone information

Stage 13 (pheromone updating) After an artificial ant stopsworking it will update the pheromone density of edgestraveled by itself In our algorithm when one path is crossedby an ant the pheromone diffusion will increase by one Ofcourse the rule of pheromone updating may be designed inother methods We adopt the way in this paper

In this example the sequence attribute selection of ant[0]is Headache Snivel and Muscle pain The pheromoneof edge (0 3) and edge (3 1) increases by one

Each artificial ant runs in above three stages

Stage 2 (deletion stage) After all ants have stopped workingthe algorithm will delete redundant attributes from theselected attribute sets of last few ants using positive regionIf an attribute does not contribute to the positive region weremove it

Stage 3 (filtration stage) After deletion we choose the reductobtained from last 119899 ants with minimal test cost as the result

In this example we adopt the centralized deletion strat-egy Figure 2 illustrates the distributed deletion strategy in aniteration In the figure the ant travels the paths 0 1 2 and 3Suppose the attribute 1 is a redundant attribute we remove itfrom the attribute subset and adjust the path The adjustingprocess is shown in Figure 2(c)

4 Experiments

In this section we try to answer the following questions byexperimentation

(1) How does the number of ants in the ant colonyinfluence the result

(2) How does the strategy of deleting redundantattributes influence the quality of the result

(3) Does our ant colony optimization algorithm outper-form the existing one

The UCI datasets we used are Zoo Voting Tic-tac-toeandMushroom Since these datasets have no test cost settingsfor statistical purposes we apply three common distributionsto generate random test cost The three distributions areuniform distribution normal distribution and Pareto distri-bution In this paper the test cost is a random integer ranging

from 1 to 100The exponent 120573 in (6) is set to be 2 since undermany circumstances this value is a good setup [28]

We do not design the parameter learning mechanismThe employed competition approach has the selection mech-anism of the parameter Different applications use the samerange and users do not specify the value of the parameterThestrategy is more straightforward than many other parametertuning strategies However we may design other strategiesto save time Finding optimal factor minimal exceedingfactor and average exceeding factor are employed to measurethe effectiveness of the algorithms We run each algorithmon datasets with 1000 times The results have statisticalcharacteristics

41 The Influence of Ant Counts on Experiment Results Inorder to find the relationship between the number of ants andthe quality of the result we conduct this experiment We runour algorithmwith 100 ants 150 ants 200 ants 400 ants Inthis section the number of experiments is set to be 100 Theexponent 120572 in (6) is set to be 2 Results are shown in Tables 3and 4

42 Comparison with Existing Heuristic Algorithms Accord-ing to the result of the above experiments we find 100 is theoptimal setting of the number of the ants We conduct anempirical study to examine the effect of our algorithm Toimprove the performance we use the competition approach[1] to enhance our algorithm The exponent 120572 in (6) is set asintegers ranging from 1 to 4 We illustrate the results amongthe information gain-based algorithm GA-1 GA-2 ACOwith centralized deletion and the ACO with distributed dele-tion in Table 5 For clarity the results on dataset Mushroomare shown in Figure 5

Our algorithm is tested on the UCI datasets 1000 timesrespectively The new algorithm with different techniques iscompared with the information gain-based algorithm [1] andthe genetic algorithm [6] Experimental results are listed inFigures 3 4 and 5 and Tables 3 4 and 5

43 Experimental Results Now we can answer the questionsproposed at the beginning of this section

(1) Experiment results indicate that the effect of thealgorithm becomes worse with the increment of theant counts We find that 100 is the optimal settingof the number of ants So we run our algorithmwith 100 ants to compare with the existing heuristicalgorithms

(2) From Figures 3 and 4 we infer that the distributeddeletion outperforms the centralized one on themedium-sized dataset Mushroom without the com-petition approach When we use the competitionapproach the centralized deletion strategy and thedistributed strategy produce the similar quality ofresults shown in Table 5 and Figure 5

Mathematical Problems in Engineering 7

Table 3 The finding optimal factor of ACO with different numbers of ants without competition method using centralized deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 066 070 073 068 062 069 061Iris Normal 053 045 044 043 037 040 043

Pareto 095 093 094 096 087 093 094Uniform 075 067 070 068 070 068 063

Zoo Normal 066 048 038 029 030 033 034Pareto 099 100 096 094 097 095 084Uniform 090 094 084 080 086 081 083

Voting Normal 100 094 091 095 095 090 093Pareto 098 099 098 099 099 099 094Uniform 096 091 089 078 068 069 068

Tic-tac-toe Normal 100 096 096 096 089 086 083Pareto 100 100 100 099 099 099 100Uniform 075 072 069 064 059 060 049

Mushroom Normal 061 055 052 051 039 041 031Pareto 097 097 096 095 095 096 094

Table 4 The finding optimal factor of ACO with different numbers of ants without competition method using distributed deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 072 070 068 067 072 063 072Iris Normal 042 040 037 040 048 041 042

Pareto 090 095 094 093 092 091 097Uniform 072 069 069 066 063 064 055

Zoo Normal 059 050 039 033 033 036 027Pareto 099 095 093 093 097 091 094Uniform 093 084 087 086 081 079 083

Voting Normal 100 100 097 098 096 096 097Pareto 100 098 099 097 097 097 098Uniform 090 078 074 077 071 074 066

Tic-tac-toe Normal 099 099 092 090 090 091 077Pareto 100 100 100 099 100 100 099Uniform 080 071 070 062 053 059 058

Mushroom Normal 062 054 053 034 033 027 029Pareto 099 099 096 095 095 094 094

Table 5 Results of 120582-information gain algorithm and the test cost based ant colony optimization with centralized deletion and distributeddeletion

Dataset Distribution Information gain GA-1 GA-2 ACO centralized ACO distributedFOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF

Uniform 0915 0193 0004 0966 0118 0001 0947 0224 0002 0973 0177 0001 0975 0216 0001Zoo Normal 0833 0028 0001 0970 0109 0001 0924 0101 0002 0849 0040 0002 0844 0048 0002

Pareto 0999 0167 0000 0998 0167 0000 0993 0200 0001 1000 0000 0000 1000 0000 0000Uniform 0998 0053 0000 1000 0000 0000 0999 0009 0000 0997 0065 0000 1000 0000 0000

Voting Normal 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 1000 0000 0000 1000 0000 0000 0995 0067 0000 1000 0000 0000 1000 0000 0000Uniform 0877 0055 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000

Tic-tac-toe Normal 0408 0018 0003 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 0986 0125 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Uniform 0762 0523 0023 0448 1000 0067 0766 0543 0019 0948 0085 0002 0946 0144 0002

Mushroom Normal 0176 0306 0010 0321 0401 0063 0551 0340 0022 0958 0020 0000 0966 0020 0000Pareto 0733 0500 0064 0935 0400 0015 0949 0500 0012 1000 0000 0000 1000 0000 0000

8 Mathematical Problems in Engineering

1

09

08

07

06

05

04100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

08

06

04

02100 200 300 400

Number of antsFi

ndin

g op

timal

fact

or (F

OF)

(b)

1

09

08100 200 300 400

Number of ants

IrisZooVoting

Tic-tac-toeMushroom

Find

ing

optim

al fa

ctor

(FO

F)

095

085

(c)

Figure 3 ACO with centralized deletion on different distributions (a) uniform (b) normal (c) pareto

(3) The ant colony optimization provides better perfor-mance than the information gain one and the geneticoneOur algorithmovercomes the shortcoming of thegenetic algorithmwhich performs badly on the largerdataset On medium-sized dataset Mushroom theACO outperforms the genetic one significantly Forexample when the test cost distribution is normal theFOF of the ant colony optimization using centralized

deletion and distributed deletion is 958 966respectively outperforming the information gain oneGA-1 and GA-2 with 176 321 and 521 respec-tively Figure 5 gives a more intuitive understandingof results on the dataset Mushroom One possiblereason is that the ant colony optimization algorithmproduces more diverse solutions than the existingone

Mathematical Problems in Engineering 9

1

09

08

07

06

05100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

09

08

07

06

05

04

100 200 300 400Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

03

02150 250 350

(b)

100 200 300 400Number of ants

IrisZooVoting

Tic-tac-toeMushroom

1

098

096

094

092

09

Find

ing

optim

al fa

ctor

(FO

F)

(c)

Figure 4 ACO with distributed deletion on different distributions (a) uniform (b) normal (c) Pareto

5 Conclusions

In this paper we have pointed out the shortcoming of theexisting heuristic algorithms including the information gain-based one and the genetic one We have designed a newalgorithm based on ant colony optimization to tackle theMTR problem In deletion stage our algorithm containscentralized deletion strategy and distributed deletion strat-egy We have tested the new one with three representativetest cost distributions on four UCI datasets Experimentalresults show that the new algorithm outperforms the existing

ones significantly especially 1lsquoon medium-sized dataset suchas Mushroom Moreover when distribution is normal thedistributed deletion strategy obtains better results than thecentralized one on Mushroom dataset

Acknowledgments

This work is in part supported by the National ScienceFoundation of China under Grant no 61170128 the NaturalScience Foundation of Fujian Province China under Grantno 2012J01294 State Key Laboratory of Management and

10 Mathematical Problems in Engineering

Uniform Normal Pareto0

02

04

06

08

1

Test cost distribution

Find

ing

optim

al fa

ctor

(FO

F)12

(a)

1

09

08

07

04

03

02

01

0Uniform Normal Pareto

Test cost distribution

Max

imal

exce

edin

g fa

ctor

05

06

(b)

01

009

008

007

006

005

004

003

002

001

0Uniform Normal Pareto

Test cost distribution

Information gainGA-1GA-2

ACO centralized deletionACO distributed deletion

Aver

age e

xcee

ding

fact

or

(c)

Figure 5 Result on Mushroom (a) finding optimal factor (b) maximal exceeding factor (c) average exceeding factor

Control for Complex Systems Open Project under Grantno 20110106 and Fujian Province Foundation of HigherEducation under Grant no JK2012028

References

[1] F Min H He Y Qian and W Zhu ldquoTest-cost-sensitiveattribute reductionrdquo Information Sciences vol 181 no 22 pp4928ndash4942 2011

[2] F Min and W Zhu ldquoMinimal cost attribute reduction throughbacktrackingrdquo in Proceedings of the International Conference onDatabase Theory and Application vol 258 of FGIT-DTABSBTpp 100ndash107 CCIS 2011

[3] Y Y Yao and Y Zhao ldquoAttribute reduction in decision-theoreticrough set modelsrdquo Information Sciences vol 178 no 17 pp3356ndash3373 2008

[4] X Jia W Liao Z Tang and L Shang ldquoMinimum cost attributereduction in decision-theoretic rough set modelsrdquo InformationSciences vol 219 pp 151ndash167 2013

Mathematical Problems in Engineering 11

[5] Z Pawlak ldquoRough setsrdquo International Journal of Computer andInformation Sciences vol 11 no 5 pp 341ndash356 1982

[6] G Pan F Min and W Zhu ldquoA genetic algorithm to theminimal test cost reduct problemrdquo in Proceedings of the IEEEInternational Conference on Granular Computing (GrC rsquo11) pp539ndash544 Kaohsiung Taiwan November 2011

[7] F Min and Q Liu ldquoA hierarchical model for test-cost-sensitivedecision systemsrdquo Information Sciences vol 179 no 14 pp2442ndash2452 2009

[8] W X Zhang J S Mi and W Z Wu ldquoKnowledge reductions ininconsistent information systemsrdquo Chinese Journal of Comput-ers vol 26 no 1 pp 12ndash18 2003

[9] Q Liu L Chen J Zhang and F Min ldquoKnowledge reduction ininconsistent decision tablesrdquo in Proceedings of Advanced DataMining and Applications vol 4093 of Lecture Notes in ComputerScience pp 626ndash635 2006

[10] W Ziarko ldquoVariable precision rough set modelrdquo Journal ofComputer and System Sciences vol 46 no 1 pp 39ndash59 1993

[11] J G Bazan and A Skowron ldquoDynamic reducts as a tool forextracting laws from decision tablesrdquo in Proceedings of the8th International Symposium on Methodologies for IntelligentSystems pp 346ndash355 1994

[12] D Deng ldquoParallel reduct and its propertiesrdquo in Proceedings ofthe International Conference on Granular Computing (GRC rsquo09)pp 121ndash125 Nanchang China August 2009

[13] D Deng J Wang and X Li ldquoParallel reducts in a seriesof decision subsystemsrdquo in Proceedings of the InternationalJoint Conference on Computational Sciences and Optimization(CSO rsquo09) pp 377ndash380 Hainan China April 2009

[14] F Min Z Bai M He and Q Liu ldquoThe reduct problemwith specified attributesrdquo in Proceedings of the Rough Setsand Soft Computing in Intelligent Agent and Web TechnologyInternational Workshop at WI-IAT pp 36ndash42 2005

[15] W Zhu and F Wang ldquoOn three types of covering rough setsrdquoIEEE Transactions on Knowledge and Data Engineering vol 19no 8 pp 1131ndash1144 2007

[16] W Zhu and F Y Wang ldquoReduction and axiomization ofcovering generalized rough setsrdquo Information Sciences vol 152pp 217ndash230 2003

[17] C C Aggarwal ldquoOn density based transforms for uncertaindata miningrdquo in Proceedings of the 23rd IEEE InternationalConference on Data Engineering pp 866ndash875 2007

[18] H Zhao F Min and W Zhu ldquoTest-cost-sensitive attributereduction of data with normal distribution measurementerrorsrdquoMathematical Problems in Engineering vol 2013 ArticleID 946070 12 pages 2013

[19] D Slęzak ldquoApproximate entropy reductsrdquo Fundamenta Infor-maticae vol 53 no 3-4 pp 365ndash390 2002

[20] F Min X Du H Qiu and Q Liu ldquoMinimal attribute spacebias for attribute reductionrdquo in Proceedings of the Rough Set andKnowledge Technology pp 379ndash386 2007

[21] W Zhu ldquoTopological approaches to covering rough setsrdquoInformation Sciences vol 177 no 6 pp 1499ndash1508 2007

[22] S Greco B Matarazzo R Slowinski and J StefanowskildquoVariable consistency model of dominance-based rough setsapproachrdquo in Proceedings of the Rough Sets and Current Trendsin Computing vol 2005 of Lecture Notes in Computer Sciencepp 170ndash181 2000

[23] Q Hu D Yu J Liu and CWu ldquoNeighborhood rough set basedheterogeneous feature subset selectionrdquo Information Sciencesvol 178 no 18 pp 3577ndash3594 2008

[24] YQian J LiangW Pedrycz andCDang ldquoPositive approxima-tion an accelerator for attribute reduction in rough set theoryrdquoArtificial Intelligence vol 174 no 9-10 pp 597ndash618 2010

[25] Wikipedia ldquoGenetic algorithmrdquo httpenwikipediaorgwikiGeneticalgorithm

[26] P D Turney ldquoCost-sensitive classification empirical evaluationof a hybrid genetic decision tree induction algorithmrdquo Journalof Artificial Intelligence Research vol 2 pp 369ndash409 1995

[27] J Liu F Min S Liao and W Zhu ldquoA genetic algorithm toattribute reduction with test cost constraintrdquo in Proceedings ofthe 6th IEEE International Conference on Computer Sciences andConvergence Information Technology (ICCIT rsquo11) pp 751ndash7542011

[28] M Dorigo and T Stutzle Eds Ant Colony Optimization MITPress Boston Mass USA 2004

[29] L Gambardella and M Dorigo ldquoHas-sop hybrid ant systemfor the sequential ordering problemrdquo Tech Rep (IDSIA-11-97)1997

[30] L M Gambardella E Taillard G Agazzi I D Corne MDorigo and F Glover ldquoMacs-vrptw a multiple ant colonysystem for vehicle routing problemswith timewindowsrdquo inNewIdeas in Optimization pp 63ndash76 McGraw-Hill New York NYUSA 1999

[31] V Maniezzo and A Colorni ldquoThe ant system applied to thequadratic assignment problemrdquo IEEE Transactions on Knowl-edge and Data Engineering vol 11 no 5 pp 769ndash778 1999

[32] C Blum M Blesa and U P de Catalunya Departament deLlenguatges i Sistemes InformaticsMetaheuristics for the edge-weighted k-cardinality tree problem 2003

[33] R Michel and M Middendorf ldquoAn aco algorithm for theshortest common supersequence problemrdquo in New Ideas inOptimization pp 51ndash62 McGraw-Hill London UK 1999

[34] Y ChenDMiao andRWang ldquoA rough set approach to featureselection based on ant colony optimizationrdquoPatternRecognitionLetters vol 31 no 3 pp 226ndash233 2010

[35] L Ke Z Feng and Z Ren ldquoAn efficient ant colony optimizationapproach to attribute reduction in rough set theoryrdquo PatternRecognition Letters vol 29 no 9 pp 1351ndash1357 2008

[36] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998 httpwwwicsuciedusimmlearnmlrepositoryhtml

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 5

Input 119878 = (119880 119862119863 119881 119868 119888)

Output119872 a reduct of 119878Method acomtr(1)119872 = 0the best reduct obtained by ants(2) for (119894 = 0 119894 lt 119870 119894 ++) do(3) for (119895 = 0 119895 lt 119896 119895 + +) do(4) create ant[119894 times 119896 + 119895](5) 119861

119894times119896+119895= core(119878)takes the set of core attributes

(6) end forEach ant selects attributes

(7) for (each119898 isin [119894 times 119896 (119894 + 1) times 119896] where ant[119898] has not stopped) do(8) MonteCarlo(119905119888 119901ℎ)select the next vertex(9) if (119875119874119878

119861119898(119863) == 119875119874119878

119862(119863)) then

(10) ant[119898] stops Delete redundant vertexes(11) 119898119905119888 = 10000minimal test cost(12) for (119895 = |119861

119894| minus 1 119895 gt= |Core(119878)| 119895--) do

(13) if (119875119874119878(119861119894minus119886119895)

(119863) == 119875119874119878119861119894(119863)) then

(14) 119861119894= 119861119894minus 119886119895

(15) end if(16) end for(17) Adjust the traveled path(18) Update the pheromone of edges ant[119898] traveled(19) end if(20) end for(21) end for

Choose the minimal test cost reduction from the last 119899 ants(22) for (119894 = 119870 times 119896 minus 119899 119894 lt= 119870 times 119896 minus 1 119894 ++) do(23) if 119905119888(119861

119894) le 119898119905119888 then

(24) 119898119905119888 = 119905119888(119861119894)

(25) 119872 = 119861119894

(26) end if(27) end for(28) return119872

Algorithm 1 ACO with distributed deletion to MTR problem

Start

0 1

2 3

(a)

p01 = 0339

tc = 202 3

p02 = 0132

tc = 250 1

p03 = 0529

tc = 40

(b)

tc = 250 1

tc = 402 3

p32 = 0281

p31 = 0719

(c)

0 1

2 3

(d)

Figure 1 (a) Start from the attribute Headache (b) Select second attribute (c) Select third attribute (d) Stop

6 Mathematical Problems in Engineering

10

32

(a)

10

32

(b)

2 3

0

(c)

Figure 2 (a) The path which an ant traveled (b) Delete theredundant attribute (c) Reconstruct the path

Now the attribute set obtained by ant[0] has satisfiedthe positive region constraint The ant[0] stops working andupdates pheromone information

Stage 13 (pheromone updating) After an artificial ant stopsworking it will update the pheromone density of edgestraveled by itself In our algorithm when one path is crossedby an ant the pheromone diffusion will increase by one Ofcourse the rule of pheromone updating may be designed inother methods We adopt the way in this paper

In this example the sequence attribute selection of ant[0]is Headache Snivel and Muscle pain The pheromoneof edge (0 3) and edge (3 1) increases by one

Each artificial ant runs in above three stages

Stage 2 (deletion stage) After all ants have stopped workingthe algorithm will delete redundant attributes from theselected attribute sets of last few ants using positive regionIf an attribute does not contribute to the positive region weremove it

Stage 3 (filtration stage) After deletion we choose the reductobtained from last 119899 ants with minimal test cost as the result

In this example we adopt the centralized deletion strat-egy Figure 2 illustrates the distributed deletion strategy in aniteration In the figure the ant travels the paths 0 1 2 and 3Suppose the attribute 1 is a redundant attribute we remove itfrom the attribute subset and adjust the path The adjustingprocess is shown in Figure 2(c)

4 Experiments

In this section we try to answer the following questions byexperimentation

(1) How does the number of ants in the ant colonyinfluence the result

(2) How does the strategy of deleting redundantattributes influence the quality of the result

(3) Does our ant colony optimization algorithm outper-form the existing one

The UCI datasets we used are Zoo Voting Tic-tac-toeandMushroom Since these datasets have no test cost settingsfor statistical purposes we apply three common distributionsto generate random test cost The three distributions areuniform distribution normal distribution and Pareto distri-bution In this paper the test cost is a random integer ranging

from 1 to 100The exponent 120573 in (6) is set to be 2 since undermany circumstances this value is a good setup [28]

We do not design the parameter learning mechanismThe employed competition approach has the selection mech-anism of the parameter Different applications use the samerange and users do not specify the value of the parameterThestrategy is more straightforward than many other parametertuning strategies However we may design other strategiesto save time Finding optimal factor minimal exceedingfactor and average exceeding factor are employed to measurethe effectiveness of the algorithms We run each algorithmon datasets with 1000 times The results have statisticalcharacteristics

41 The Influence of Ant Counts on Experiment Results Inorder to find the relationship between the number of ants andthe quality of the result we conduct this experiment We runour algorithmwith 100 ants 150 ants 200 ants 400 ants Inthis section the number of experiments is set to be 100 Theexponent 120572 in (6) is set to be 2 Results are shown in Tables 3and 4

42 Comparison with Existing Heuristic Algorithms Accord-ing to the result of the above experiments we find 100 is theoptimal setting of the number of the ants We conduct anempirical study to examine the effect of our algorithm Toimprove the performance we use the competition approach[1] to enhance our algorithm The exponent 120572 in (6) is set asintegers ranging from 1 to 4 We illustrate the results amongthe information gain-based algorithm GA-1 GA-2 ACOwith centralized deletion and the ACO with distributed dele-tion in Table 5 For clarity the results on dataset Mushroomare shown in Figure 5

Our algorithm is tested on the UCI datasets 1000 timesrespectively The new algorithm with different techniques iscompared with the information gain-based algorithm [1] andthe genetic algorithm [6] Experimental results are listed inFigures 3 4 and 5 and Tables 3 4 and 5

43 Experimental Results Now we can answer the questionsproposed at the beginning of this section

(1) Experiment results indicate that the effect of thealgorithm becomes worse with the increment of theant counts We find that 100 is the optimal settingof the number of ants So we run our algorithmwith 100 ants to compare with the existing heuristicalgorithms

(2) From Figures 3 and 4 we infer that the distributeddeletion outperforms the centralized one on themedium-sized dataset Mushroom without the com-petition approach When we use the competitionapproach the centralized deletion strategy and thedistributed strategy produce the similar quality ofresults shown in Table 5 and Figure 5

Mathematical Problems in Engineering 7

Table 3 The finding optimal factor of ACO with different numbers of ants without competition method using centralized deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 066 070 073 068 062 069 061Iris Normal 053 045 044 043 037 040 043

Pareto 095 093 094 096 087 093 094Uniform 075 067 070 068 070 068 063

Zoo Normal 066 048 038 029 030 033 034Pareto 099 100 096 094 097 095 084Uniform 090 094 084 080 086 081 083

Voting Normal 100 094 091 095 095 090 093Pareto 098 099 098 099 099 099 094Uniform 096 091 089 078 068 069 068

Tic-tac-toe Normal 100 096 096 096 089 086 083Pareto 100 100 100 099 099 099 100Uniform 075 072 069 064 059 060 049

Mushroom Normal 061 055 052 051 039 041 031Pareto 097 097 096 095 095 096 094

Table 4 The finding optimal factor of ACO with different numbers of ants without competition method using distributed deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 072 070 068 067 072 063 072Iris Normal 042 040 037 040 048 041 042

Pareto 090 095 094 093 092 091 097Uniform 072 069 069 066 063 064 055

Zoo Normal 059 050 039 033 033 036 027Pareto 099 095 093 093 097 091 094Uniform 093 084 087 086 081 079 083

Voting Normal 100 100 097 098 096 096 097Pareto 100 098 099 097 097 097 098Uniform 090 078 074 077 071 074 066

Tic-tac-toe Normal 099 099 092 090 090 091 077Pareto 100 100 100 099 100 100 099Uniform 080 071 070 062 053 059 058

Mushroom Normal 062 054 053 034 033 027 029Pareto 099 099 096 095 095 094 094

Table 5 Results of 120582-information gain algorithm and the test cost based ant colony optimization with centralized deletion and distributeddeletion

Dataset Distribution Information gain GA-1 GA-2 ACO centralized ACO distributedFOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF

Uniform 0915 0193 0004 0966 0118 0001 0947 0224 0002 0973 0177 0001 0975 0216 0001Zoo Normal 0833 0028 0001 0970 0109 0001 0924 0101 0002 0849 0040 0002 0844 0048 0002

Pareto 0999 0167 0000 0998 0167 0000 0993 0200 0001 1000 0000 0000 1000 0000 0000Uniform 0998 0053 0000 1000 0000 0000 0999 0009 0000 0997 0065 0000 1000 0000 0000

Voting Normal 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 1000 0000 0000 1000 0000 0000 0995 0067 0000 1000 0000 0000 1000 0000 0000Uniform 0877 0055 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000

Tic-tac-toe Normal 0408 0018 0003 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 0986 0125 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Uniform 0762 0523 0023 0448 1000 0067 0766 0543 0019 0948 0085 0002 0946 0144 0002

Mushroom Normal 0176 0306 0010 0321 0401 0063 0551 0340 0022 0958 0020 0000 0966 0020 0000Pareto 0733 0500 0064 0935 0400 0015 0949 0500 0012 1000 0000 0000 1000 0000 0000

8 Mathematical Problems in Engineering

1

09

08

07

06

05

04100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

08

06

04

02100 200 300 400

Number of antsFi

ndin

g op

timal

fact

or (F

OF)

(b)

1

09

08100 200 300 400

Number of ants

IrisZooVoting

Tic-tac-toeMushroom

Find

ing

optim

al fa

ctor

(FO

F)

095

085

(c)

Figure 3 ACO with centralized deletion on different distributions (a) uniform (b) normal (c) pareto

(3) The ant colony optimization provides better perfor-mance than the information gain one and the geneticoneOur algorithmovercomes the shortcoming of thegenetic algorithmwhich performs badly on the largerdataset On medium-sized dataset Mushroom theACO outperforms the genetic one significantly Forexample when the test cost distribution is normal theFOF of the ant colony optimization using centralized

deletion and distributed deletion is 958 966respectively outperforming the information gain oneGA-1 and GA-2 with 176 321 and 521 respec-tively Figure 5 gives a more intuitive understandingof results on the dataset Mushroom One possiblereason is that the ant colony optimization algorithmproduces more diverse solutions than the existingone

Mathematical Problems in Engineering 9

1

09

08

07

06

05100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

09

08

07

06

05

04

100 200 300 400Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

03

02150 250 350

(b)

100 200 300 400Number of ants

IrisZooVoting

Tic-tac-toeMushroom

1

098

096

094

092

09

Find

ing

optim

al fa

ctor

(FO

F)

(c)

Figure 4 ACO with distributed deletion on different distributions (a) uniform (b) normal (c) Pareto

5 Conclusions

In this paper we have pointed out the shortcoming of theexisting heuristic algorithms including the information gain-based one and the genetic one We have designed a newalgorithm based on ant colony optimization to tackle theMTR problem In deletion stage our algorithm containscentralized deletion strategy and distributed deletion strat-egy We have tested the new one with three representativetest cost distributions on four UCI datasets Experimentalresults show that the new algorithm outperforms the existing

ones significantly especially 1lsquoon medium-sized dataset suchas Mushroom Moreover when distribution is normal thedistributed deletion strategy obtains better results than thecentralized one on Mushroom dataset

Acknowledgments

This work is in part supported by the National ScienceFoundation of China under Grant no 61170128 the NaturalScience Foundation of Fujian Province China under Grantno 2012J01294 State Key Laboratory of Management and

10 Mathematical Problems in Engineering

Uniform Normal Pareto0

02

04

06

08

1

Test cost distribution

Find

ing

optim

al fa

ctor

(FO

F)12

(a)

1

09

08

07

04

03

02

01

0Uniform Normal Pareto

Test cost distribution

Max

imal

exce

edin

g fa

ctor

05

06

(b)

01

009

008

007

006

005

004

003

002

001

0Uniform Normal Pareto

Test cost distribution

Information gainGA-1GA-2

ACO centralized deletionACO distributed deletion

Aver

age e

xcee

ding

fact

or

(c)

Figure 5 Result on Mushroom (a) finding optimal factor (b) maximal exceeding factor (c) average exceeding factor

Control for Complex Systems Open Project under Grantno 20110106 and Fujian Province Foundation of HigherEducation under Grant no JK2012028

References

[1] F Min H He Y Qian and W Zhu ldquoTest-cost-sensitiveattribute reductionrdquo Information Sciences vol 181 no 22 pp4928ndash4942 2011

[2] F Min and W Zhu ldquoMinimal cost attribute reduction throughbacktrackingrdquo in Proceedings of the International Conference onDatabase Theory and Application vol 258 of FGIT-DTABSBTpp 100ndash107 CCIS 2011

[3] Y Y Yao and Y Zhao ldquoAttribute reduction in decision-theoreticrough set modelsrdquo Information Sciences vol 178 no 17 pp3356ndash3373 2008

[4] X Jia W Liao Z Tang and L Shang ldquoMinimum cost attributereduction in decision-theoretic rough set modelsrdquo InformationSciences vol 219 pp 151ndash167 2013

Mathematical Problems in Engineering 11

[5] Z Pawlak ldquoRough setsrdquo International Journal of Computer andInformation Sciences vol 11 no 5 pp 341ndash356 1982

[6] G Pan F Min and W Zhu ldquoA genetic algorithm to theminimal test cost reduct problemrdquo in Proceedings of the IEEEInternational Conference on Granular Computing (GrC rsquo11) pp539ndash544 Kaohsiung Taiwan November 2011

[7] F Min and Q Liu ldquoA hierarchical model for test-cost-sensitivedecision systemsrdquo Information Sciences vol 179 no 14 pp2442ndash2452 2009

[8] W X Zhang J S Mi and W Z Wu ldquoKnowledge reductions ininconsistent information systemsrdquo Chinese Journal of Comput-ers vol 26 no 1 pp 12ndash18 2003

[9] Q Liu L Chen J Zhang and F Min ldquoKnowledge reduction ininconsistent decision tablesrdquo in Proceedings of Advanced DataMining and Applications vol 4093 of Lecture Notes in ComputerScience pp 626ndash635 2006

[10] W Ziarko ldquoVariable precision rough set modelrdquo Journal ofComputer and System Sciences vol 46 no 1 pp 39ndash59 1993

[11] J G Bazan and A Skowron ldquoDynamic reducts as a tool forextracting laws from decision tablesrdquo in Proceedings of the8th International Symposium on Methodologies for IntelligentSystems pp 346ndash355 1994

[12] D Deng ldquoParallel reduct and its propertiesrdquo in Proceedings ofthe International Conference on Granular Computing (GRC rsquo09)pp 121ndash125 Nanchang China August 2009

[13] D Deng J Wang and X Li ldquoParallel reducts in a seriesof decision subsystemsrdquo in Proceedings of the InternationalJoint Conference on Computational Sciences and Optimization(CSO rsquo09) pp 377ndash380 Hainan China April 2009

[14] F Min Z Bai M He and Q Liu ldquoThe reduct problemwith specified attributesrdquo in Proceedings of the Rough Setsand Soft Computing in Intelligent Agent and Web TechnologyInternational Workshop at WI-IAT pp 36ndash42 2005

[15] W Zhu and F Wang ldquoOn three types of covering rough setsrdquoIEEE Transactions on Knowledge and Data Engineering vol 19no 8 pp 1131ndash1144 2007

[16] W Zhu and F Y Wang ldquoReduction and axiomization ofcovering generalized rough setsrdquo Information Sciences vol 152pp 217ndash230 2003

[17] C C Aggarwal ldquoOn density based transforms for uncertaindata miningrdquo in Proceedings of the 23rd IEEE InternationalConference on Data Engineering pp 866ndash875 2007

[18] H Zhao F Min and W Zhu ldquoTest-cost-sensitive attributereduction of data with normal distribution measurementerrorsrdquoMathematical Problems in Engineering vol 2013 ArticleID 946070 12 pages 2013

[19] D Slęzak ldquoApproximate entropy reductsrdquo Fundamenta Infor-maticae vol 53 no 3-4 pp 365ndash390 2002

[20] F Min X Du H Qiu and Q Liu ldquoMinimal attribute spacebias for attribute reductionrdquo in Proceedings of the Rough Set andKnowledge Technology pp 379ndash386 2007

[21] W Zhu ldquoTopological approaches to covering rough setsrdquoInformation Sciences vol 177 no 6 pp 1499ndash1508 2007

[22] S Greco B Matarazzo R Slowinski and J StefanowskildquoVariable consistency model of dominance-based rough setsapproachrdquo in Proceedings of the Rough Sets and Current Trendsin Computing vol 2005 of Lecture Notes in Computer Sciencepp 170ndash181 2000

[23] Q Hu D Yu J Liu and CWu ldquoNeighborhood rough set basedheterogeneous feature subset selectionrdquo Information Sciencesvol 178 no 18 pp 3577ndash3594 2008

[24] YQian J LiangW Pedrycz andCDang ldquoPositive approxima-tion an accelerator for attribute reduction in rough set theoryrdquoArtificial Intelligence vol 174 no 9-10 pp 597ndash618 2010

[25] Wikipedia ldquoGenetic algorithmrdquo httpenwikipediaorgwikiGeneticalgorithm

[26] P D Turney ldquoCost-sensitive classification empirical evaluationof a hybrid genetic decision tree induction algorithmrdquo Journalof Artificial Intelligence Research vol 2 pp 369ndash409 1995

[27] J Liu F Min S Liao and W Zhu ldquoA genetic algorithm toattribute reduction with test cost constraintrdquo in Proceedings ofthe 6th IEEE International Conference on Computer Sciences andConvergence Information Technology (ICCIT rsquo11) pp 751ndash7542011

[28] M Dorigo and T Stutzle Eds Ant Colony Optimization MITPress Boston Mass USA 2004

[29] L Gambardella and M Dorigo ldquoHas-sop hybrid ant systemfor the sequential ordering problemrdquo Tech Rep (IDSIA-11-97)1997

[30] L M Gambardella E Taillard G Agazzi I D Corne MDorigo and F Glover ldquoMacs-vrptw a multiple ant colonysystem for vehicle routing problemswith timewindowsrdquo inNewIdeas in Optimization pp 63ndash76 McGraw-Hill New York NYUSA 1999

[31] V Maniezzo and A Colorni ldquoThe ant system applied to thequadratic assignment problemrdquo IEEE Transactions on Knowl-edge and Data Engineering vol 11 no 5 pp 769ndash778 1999

[32] C Blum M Blesa and U P de Catalunya Departament deLlenguatges i Sistemes InformaticsMetaheuristics for the edge-weighted k-cardinality tree problem 2003

[33] R Michel and M Middendorf ldquoAn aco algorithm for theshortest common supersequence problemrdquo in New Ideas inOptimization pp 51ndash62 McGraw-Hill London UK 1999

[34] Y ChenDMiao andRWang ldquoA rough set approach to featureselection based on ant colony optimizationrdquoPatternRecognitionLetters vol 31 no 3 pp 226ndash233 2010

[35] L Ke Z Feng and Z Ren ldquoAn efficient ant colony optimizationapproach to attribute reduction in rough set theoryrdquo PatternRecognition Letters vol 29 no 9 pp 1351ndash1357 2008

[36] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998 httpwwwicsuciedusimmlearnmlrepositoryhtml

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

6 Mathematical Problems in Engineering

10

32

(a)

10

32

(b)

2 3

0

(c)

Figure 2 (a) The path which an ant traveled (b) Delete theredundant attribute (c) Reconstruct the path

Now the attribute set obtained by ant[0] has satisfiedthe positive region constraint The ant[0] stops working andupdates pheromone information

Stage 13 (pheromone updating) After an artificial ant stopsworking it will update the pheromone density of edgestraveled by itself In our algorithm when one path is crossedby an ant the pheromone diffusion will increase by one Ofcourse the rule of pheromone updating may be designed inother methods We adopt the way in this paper

In this example the sequence attribute selection of ant[0]is Headache Snivel and Muscle pain The pheromoneof edge (0 3) and edge (3 1) increases by one

Each artificial ant runs in above three stages

Stage 2 (deletion stage) After all ants have stopped workingthe algorithm will delete redundant attributes from theselected attribute sets of last few ants using positive regionIf an attribute does not contribute to the positive region weremove it

Stage 3 (filtration stage) After deletion we choose the reductobtained from last 119899 ants with minimal test cost as the result

In this example we adopt the centralized deletion strat-egy Figure 2 illustrates the distributed deletion strategy in aniteration In the figure the ant travels the paths 0 1 2 and 3Suppose the attribute 1 is a redundant attribute we remove itfrom the attribute subset and adjust the path The adjustingprocess is shown in Figure 2(c)

4 Experiments

In this section we try to answer the following questions byexperimentation

(1) How does the number of ants in the ant colonyinfluence the result

(2) How does the strategy of deleting redundantattributes influence the quality of the result

(3) Does our ant colony optimization algorithm outper-form the existing one

The UCI datasets we used are Zoo Voting Tic-tac-toeandMushroom Since these datasets have no test cost settingsfor statistical purposes we apply three common distributionsto generate random test cost The three distributions areuniform distribution normal distribution and Pareto distri-bution In this paper the test cost is a random integer ranging

from 1 to 100The exponent 120573 in (6) is set to be 2 since undermany circumstances this value is a good setup [28]

We do not design the parameter learning mechanismThe employed competition approach has the selection mech-anism of the parameter Different applications use the samerange and users do not specify the value of the parameterThestrategy is more straightforward than many other parametertuning strategies However we may design other strategiesto save time Finding optimal factor minimal exceedingfactor and average exceeding factor are employed to measurethe effectiveness of the algorithms We run each algorithmon datasets with 1000 times The results have statisticalcharacteristics

41 The Influence of Ant Counts on Experiment Results Inorder to find the relationship between the number of ants andthe quality of the result we conduct this experiment We runour algorithmwith 100 ants 150 ants 200 ants 400 ants Inthis section the number of experiments is set to be 100 Theexponent 120572 in (6) is set to be 2 Results are shown in Tables 3and 4

42 Comparison with Existing Heuristic Algorithms Accord-ing to the result of the above experiments we find 100 is theoptimal setting of the number of the ants We conduct anempirical study to examine the effect of our algorithm Toimprove the performance we use the competition approach[1] to enhance our algorithm The exponent 120572 in (6) is set asintegers ranging from 1 to 4 We illustrate the results amongthe information gain-based algorithm GA-1 GA-2 ACOwith centralized deletion and the ACO with distributed dele-tion in Table 5 For clarity the results on dataset Mushroomare shown in Figure 5

Our algorithm is tested on the UCI datasets 1000 timesrespectively The new algorithm with different techniques iscompared with the information gain-based algorithm [1] andthe genetic algorithm [6] Experimental results are listed inFigures 3 4 and 5 and Tables 3 4 and 5

43 Experimental Results Now we can answer the questionsproposed at the beginning of this section

(1) Experiment results indicate that the effect of thealgorithm becomes worse with the increment of theant counts We find that 100 is the optimal settingof the number of ants So we run our algorithmwith 100 ants to compare with the existing heuristicalgorithms

(2) From Figures 3 and 4 we infer that the distributeddeletion outperforms the centralized one on themedium-sized dataset Mushroom without the com-petition approach When we use the competitionapproach the centralized deletion strategy and thedistributed strategy produce the similar quality ofresults shown in Table 5 and Figure 5

Mathematical Problems in Engineering 7

Table 3 The finding optimal factor of ACO with different numbers of ants without competition method using centralized deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 066 070 073 068 062 069 061Iris Normal 053 045 044 043 037 040 043

Pareto 095 093 094 096 087 093 094Uniform 075 067 070 068 070 068 063

Zoo Normal 066 048 038 029 030 033 034Pareto 099 100 096 094 097 095 084Uniform 090 094 084 080 086 081 083

Voting Normal 100 094 091 095 095 090 093Pareto 098 099 098 099 099 099 094Uniform 096 091 089 078 068 069 068

Tic-tac-toe Normal 100 096 096 096 089 086 083Pareto 100 100 100 099 099 099 100Uniform 075 072 069 064 059 060 049

Mushroom Normal 061 055 052 051 039 041 031Pareto 097 097 096 095 095 096 094

Table 4 The finding optimal factor of ACO with different numbers of ants without competition method using distributed deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 072 070 068 067 072 063 072Iris Normal 042 040 037 040 048 041 042

Pareto 090 095 094 093 092 091 097Uniform 072 069 069 066 063 064 055

Zoo Normal 059 050 039 033 033 036 027Pareto 099 095 093 093 097 091 094Uniform 093 084 087 086 081 079 083

Voting Normal 100 100 097 098 096 096 097Pareto 100 098 099 097 097 097 098Uniform 090 078 074 077 071 074 066

Tic-tac-toe Normal 099 099 092 090 090 091 077Pareto 100 100 100 099 100 100 099Uniform 080 071 070 062 053 059 058

Mushroom Normal 062 054 053 034 033 027 029Pareto 099 099 096 095 095 094 094

Table 5 Results of 120582-information gain algorithm and the test cost based ant colony optimization with centralized deletion and distributeddeletion

Dataset Distribution Information gain GA-1 GA-2 ACO centralized ACO distributedFOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF

Uniform 0915 0193 0004 0966 0118 0001 0947 0224 0002 0973 0177 0001 0975 0216 0001Zoo Normal 0833 0028 0001 0970 0109 0001 0924 0101 0002 0849 0040 0002 0844 0048 0002

Pareto 0999 0167 0000 0998 0167 0000 0993 0200 0001 1000 0000 0000 1000 0000 0000Uniform 0998 0053 0000 1000 0000 0000 0999 0009 0000 0997 0065 0000 1000 0000 0000

Voting Normal 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 1000 0000 0000 1000 0000 0000 0995 0067 0000 1000 0000 0000 1000 0000 0000Uniform 0877 0055 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000

Tic-tac-toe Normal 0408 0018 0003 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 0986 0125 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Uniform 0762 0523 0023 0448 1000 0067 0766 0543 0019 0948 0085 0002 0946 0144 0002

Mushroom Normal 0176 0306 0010 0321 0401 0063 0551 0340 0022 0958 0020 0000 0966 0020 0000Pareto 0733 0500 0064 0935 0400 0015 0949 0500 0012 1000 0000 0000 1000 0000 0000

8 Mathematical Problems in Engineering

1

09

08

07

06

05

04100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

08

06

04

02100 200 300 400

Number of antsFi

ndin

g op

timal

fact

or (F

OF)

(b)

1

09

08100 200 300 400

Number of ants

IrisZooVoting

Tic-tac-toeMushroom

Find

ing

optim

al fa

ctor

(FO

F)

095

085

(c)

Figure 3 ACO with centralized deletion on different distributions (a) uniform (b) normal (c) pareto

(3) The ant colony optimization provides better perfor-mance than the information gain one and the geneticoneOur algorithmovercomes the shortcoming of thegenetic algorithmwhich performs badly on the largerdataset On medium-sized dataset Mushroom theACO outperforms the genetic one significantly Forexample when the test cost distribution is normal theFOF of the ant colony optimization using centralized

deletion and distributed deletion is 958 966respectively outperforming the information gain oneGA-1 and GA-2 with 176 321 and 521 respec-tively Figure 5 gives a more intuitive understandingof results on the dataset Mushroom One possiblereason is that the ant colony optimization algorithmproduces more diverse solutions than the existingone

Mathematical Problems in Engineering 9

1

09

08

07

06

05100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

09

08

07

06

05

04

100 200 300 400Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

03

02150 250 350

(b)

100 200 300 400Number of ants

IrisZooVoting

Tic-tac-toeMushroom

1

098

096

094

092

09

Find

ing

optim

al fa

ctor

(FO

F)

(c)

Figure 4 ACO with distributed deletion on different distributions (a) uniform (b) normal (c) Pareto

5 Conclusions

In this paper we have pointed out the shortcoming of theexisting heuristic algorithms including the information gain-based one and the genetic one We have designed a newalgorithm based on ant colony optimization to tackle theMTR problem In deletion stage our algorithm containscentralized deletion strategy and distributed deletion strat-egy We have tested the new one with three representativetest cost distributions on four UCI datasets Experimentalresults show that the new algorithm outperforms the existing

ones significantly especially 1lsquoon medium-sized dataset suchas Mushroom Moreover when distribution is normal thedistributed deletion strategy obtains better results than thecentralized one on Mushroom dataset

Acknowledgments

This work is in part supported by the National ScienceFoundation of China under Grant no 61170128 the NaturalScience Foundation of Fujian Province China under Grantno 2012J01294 State Key Laboratory of Management and

10 Mathematical Problems in Engineering

Uniform Normal Pareto0

02

04

06

08

1

Test cost distribution

Find

ing

optim

al fa

ctor

(FO

F)12

(a)

1

09

08

07

04

03

02

01

0Uniform Normal Pareto

Test cost distribution

Max

imal

exce

edin

g fa

ctor

05

06

(b)

01

009

008

007

006

005

004

003

002

001

0Uniform Normal Pareto

Test cost distribution

Information gainGA-1GA-2

ACO centralized deletionACO distributed deletion

Aver

age e

xcee

ding

fact

or

(c)

Figure 5 Result on Mushroom (a) finding optimal factor (b) maximal exceeding factor (c) average exceeding factor

Control for Complex Systems Open Project under Grantno 20110106 and Fujian Province Foundation of HigherEducation under Grant no JK2012028

References

[1] F Min H He Y Qian and W Zhu ldquoTest-cost-sensitiveattribute reductionrdquo Information Sciences vol 181 no 22 pp4928ndash4942 2011

[2] F Min and W Zhu ldquoMinimal cost attribute reduction throughbacktrackingrdquo in Proceedings of the International Conference onDatabase Theory and Application vol 258 of FGIT-DTABSBTpp 100ndash107 CCIS 2011

[3] Y Y Yao and Y Zhao ldquoAttribute reduction in decision-theoreticrough set modelsrdquo Information Sciences vol 178 no 17 pp3356ndash3373 2008

[4] X Jia W Liao Z Tang and L Shang ldquoMinimum cost attributereduction in decision-theoretic rough set modelsrdquo InformationSciences vol 219 pp 151ndash167 2013

Mathematical Problems in Engineering 11

[5] Z Pawlak ldquoRough setsrdquo International Journal of Computer andInformation Sciences vol 11 no 5 pp 341ndash356 1982

[6] G Pan F Min and W Zhu ldquoA genetic algorithm to theminimal test cost reduct problemrdquo in Proceedings of the IEEEInternational Conference on Granular Computing (GrC rsquo11) pp539ndash544 Kaohsiung Taiwan November 2011

[7] F Min and Q Liu ldquoA hierarchical model for test-cost-sensitivedecision systemsrdquo Information Sciences vol 179 no 14 pp2442ndash2452 2009

[8] W X Zhang J S Mi and W Z Wu ldquoKnowledge reductions ininconsistent information systemsrdquo Chinese Journal of Comput-ers vol 26 no 1 pp 12ndash18 2003

[9] Q Liu L Chen J Zhang and F Min ldquoKnowledge reduction ininconsistent decision tablesrdquo in Proceedings of Advanced DataMining and Applications vol 4093 of Lecture Notes in ComputerScience pp 626ndash635 2006

[10] W Ziarko ldquoVariable precision rough set modelrdquo Journal ofComputer and System Sciences vol 46 no 1 pp 39ndash59 1993

[11] J G Bazan and A Skowron ldquoDynamic reducts as a tool forextracting laws from decision tablesrdquo in Proceedings of the8th International Symposium on Methodologies for IntelligentSystems pp 346ndash355 1994

[12] D Deng ldquoParallel reduct and its propertiesrdquo in Proceedings ofthe International Conference on Granular Computing (GRC rsquo09)pp 121ndash125 Nanchang China August 2009

[13] D Deng J Wang and X Li ldquoParallel reducts in a seriesof decision subsystemsrdquo in Proceedings of the InternationalJoint Conference on Computational Sciences and Optimization(CSO rsquo09) pp 377ndash380 Hainan China April 2009

[14] F Min Z Bai M He and Q Liu ldquoThe reduct problemwith specified attributesrdquo in Proceedings of the Rough Setsand Soft Computing in Intelligent Agent and Web TechnologyInternational Workshop at WI-IAT pp 36ndash42 2005

[15] W Zhu and F Wang ldquoOn three types of covering rough setsrdquoIEEE Transactions on Knowledge and Data Engineering vol 19no 8 pp 1131ndash1144 2007

[16] W Zhu and F Y Wang ldquoReduction and axiomization ofcovering generalized rough setsrdquo Information Sciences vol 152pp 217ndash230 2003

[17] C C Aggarwal ldquoOn density based transforms for uncertaindata miningrdquo in Proceedings of the 23rd IEEE InternationalConference on Data Engineering pp 866ndash875 2007

[18] H Zhao F Min and W Zhu ldquoTest-cost-sensitive attributereduction of data with normal distribution measurementerrorsrdquoMathematical Problems in Engineering vol 2013 ArticleID 946070 12 pages 2013

[19] D Slęzak ldquoApproximate entropy reductsrdquo Fundamenta Infor-maticae vol 53 no 3-4 pp 365ndash390 2002

[20] F Min X Du H Qiu and Q Liu ldquoMinimal attribute spacebias for attribute reductionrdquo in Proceedings of the Rough Set andKnowledge Technology pp 379ndash386 2007

[21] W Zhu ldquoTopological approaches to covering rough setsrdquoInformation Sciences vol 177 no 6 pp 1499ndash1508 2007

[22] S Greco B Matarazzo R Slowinski and J StefanowskildquoVariable consistency model of dominance-based rough setsapproachrdquo in Proceedings of the Rough Sets and Current Trendsin Computing vol 2005 of Lecture Notes in Computer Sciencepp 170ndash181 2000

[23] Q Hu D Yu J Liu and CWu ldquoNeighborhood rough set basedheterogeneous feature subset selectionrdquo Information Sciencesvol 178 no 18 pp 3577ndash3594 2008

[24] YQian J LiangW Pedrycz andCDang ldquoPositive approxima-tion an accelerator for attribute reduction in rough set theoryrdquoArtificial Intelligence vol 174 no 9-10 pp 597ndash618 2010

[25] Wikipedia ldquoGenetic algorithmrdquo httpenwikipediaorgwikiGeneticalgorithm

[26] P D Turney ldquoCost-sensitive classification empirical evaluationof a hybrid genetic decision tree induction algorithmrdquo Journalof Artificial Intelligence Research vol 2 pp 369ndash409 1995

[27] J Liu F Min S Liao and W Zhu ldquoA genetic algorithm toattribute reduction with test cost constraintrdquo in Proceedings ofthe 6th IEEE International Conference on Computer Sciences andConvergence Information Technology (ICCIT rsquo11) pp 751ndash7542011

[28] M Dorigo and T Stutzle Eds Ant Colony Optimization MITPress Boston Mass USA 2004

[29] L Gambardella and M Dorigo ldquoHas-sop hybrid ant systemfor the sequential ordering problemrdquo Tech Rep (IDSIA-11-97)1997

[30] L M Gambardella E Taillard G Agazzi I D Corne MDorigo and F Glover ldquoMacs-vrptw a multiple ant colonysystem for vehicle routing problemswith timewindowsrdquo inNewIdeas in Optimization pp 63ndash76 McGraw-Hill New York NYUSA 1999

[31] V Maniezzo and A Colorni ldquoThe ant system applied to thequadratic assignment problemrdquo IEEE Transactions on Knowl-edge and Data Engineering vol 11 no 5 pp 769ndash778 1999

[32] C Blum M Blesa and U P de Catalunya Departament deLlenguatges i Sistemes InformaticsMetaheuristics for the edge-weighted k-cardinality tree problem 2003

[33] R Michel and M Middendorf ldquoAn aco algorithm for theshortest common supersequence problemrdquo in New Ideas inOptimization pp 51ndash62 McGraw-Hill London UK 1999

[34] Y ChenDMiao andRWang ldquoA rough set approach to featureselection based on ant colony optimizationrdquoPatternRecognitionLetters vol 31 no 3 pp 226ndash233 2010

[35] L Ke Z Feng and Z Ren ldquoAn efficient ant colony optimizationapproach to attribute reduction in rough set theoryrdquo PatternRecognition Letters vol 29 no 9 pp 1351ndash1357 2008

[36] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998 httpwwwicsuciedusimmlearnmlrepositoryhtml

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 7

Table 3 The finding optimal factor of ACO with different numbers of ants without competition method using centralized deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 066 070 073 068 062 069 061Iris Normal 053 045 044 043 037 040 043

Pareto 095 093 094 096 087 093 094Uniform 075 067 070 068 070 068 063

Zoo Normal 066 048 038 029 030 033 034Pareto 099 100 096 094 097 095 084Uniform 090 094 084 080 086 081 083

Voting Normal 100 094 091 095 095 090 093Pareto 098 099 098 099 099 099 094Uniform 096 091 089 078 068 069 068

Tic-tac-toe Normal 100 096 096 096 089 086 083Pareto 100 100 100 099 099 099 100Uniform 075 072 069 064 059 060 049

Mushroom Normal 061 055 052 051 039 041 031Pareto 097 097 096 095 095 096 094

Table 4 The finding optimal factor of ACO with different numbers of ants without competition method using distributed deletion

Dataset Distribution Number of ants100 150 200 250 300 350 400

Uniform 072 070 068 067 072 063 072Iris Normal 042 040 037 040 048 041 042

Pareto 090 095 094 093 092 091 097Uniform 072 069 069 066 063 064 055

Zoo Normal 059 050 039 033 033 036 027Pareto 099 095 093 093 097 091 094Uniform 093 084 087 086 081 079 083

Voting Normal 100 100 097 098 096 096 097Pareto 100 098 099 097 097 097 098Uniform 090 078 074 077 071 074 066

Tic-tac-toe Normal 099 099 092 090 090 091 077Pareto 100 100 100 099 100 100 099Uniform 080 071 070 062 053 059 058

Mushroom Normal 062 054 053 034 033 027 029Pareto 099 099 096 095 095 094 094

Table 5 Results of 120582-information gain algorithm and the test cost based ant colony optimization with centralized deletion and distributeddeletion

Dataset Distribution Information gain GA-1 GA-2 ACO centralized ACO distributedFOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF FOF MEF AEF

Uniform 0915 0193 0004 0966 0118 0001 0947 0224 0002 0973 0177 0001 0975 0216 0001Zoo Normal 0833 0028 0001 0970 0109 0001 0924 0101 0002 0849 0040 0002 0844 0048 0002

Pareto 0999 0167 0000 0998 0167 0000 0993 0200 0001 1000 0000 0000 1000 0000 0000Uniform 0998 0053 0000 1000 0000 0000 0999 0009 0000 0997 0065 0000 1000 0000 0000

Voting Normal 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 1000 0000 0000 1000 0000 0000 0995 0067 0000 1000 0000 0000 1000 0000 0000Uniform 0877 0055 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000

Tic-tac-toe Normal 0408 0018 0003 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Pareto 0986 0125 0002 1000 0000 0000 1000 0000 0000 1000 0000 0000 1000 0000 0000Uniform 0762 0523 0023 0448 1000 0067 0766 0543 0019 0948 0085 0002 0946 0144 0002

Mushroom Normal 0176 0306 0010 0321 0401 0063 0551 0340 0022 0958 0020 0000 0966 0020 0000Pareto 0733 0500 0064 0935 0400 0015 0949 0500 0012 1000 0000 0000 1000 0000 0000

8 Mathematical Problems in Engineering

1

09

08

07

06

05

04100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

08

06

04

02100 200 300 400

Number of antsFi

ndin

g op

timal

fact

or (F

OF)

(b)

1

09

08100 200 300 400

Number of ants

IrisZooVoting

Tic-tac-toeMushroom

Find

ing

optim

al fa

ctor

(FO

F)

095

085

(c)

Figure 3 ACO with centralized deletion on different distributions (a) uniform (b) normal (c) pareto

(3) The ant colony optimization provides better perfor-mance than the information gain one and the geneticoneOur algorithmovercomes the shortcoming of thegenetic algorithmwhich performs badly on the largerdataset On medium-sized dataset Mushroom theACO outperforms the genetic one significantly Forexample when the test cost distribution is normal theFOF of the ant colony optimization using centralized

deletion and distributed deletion is 958 966respectively outperforming the information gain oneGA-1 and GA-2 with 176 321 and 521 respec-tively Figure 5 gives a more intuitive understandingof results on the dataset Mushroom One possiblereason is that the ant colony optimization algorithmproduces more diverse solutions than the existingone

Mathematical Problems in Engineering 9

1

09

08

07

06

05100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

09

08

07

06

05

04

100 200 300 400Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

03

02150 250 350

(b)

100 200 300 400Number of ants

IrisZooVoting

Tic-tac-toeMushroom

1

098

096

094

092

09

Find

ing

optim

al fa

ctor

(FO

F)

(c)

Figure 4 ACO with distributed deletion on different distributions (a) uniform (b) normal (c) Pareto

5 Conclusions

In this paper we have pointed out the shortcoming of theexisting heuristic algorithms including the information gain-based one and the genetic one We have designed a newalgorithm based on ant colony optimization to tackle theMTR problem In deletion stage our algorithm containscentralized deletion strategy and distributed deletion strat-egy We have tested the new one with three representativetest cost distributions on four UCI datasets Experimentalresults show that the new algorithm outperforms the existing

ones significantly especially 1lsquoon medium-sized dataset suchas Mushroom Moreover when distribution is normal thedistributed deletion strategy obtains better results than thecentralized one on Mushroom dataset

Acknowledgments

This work is in part supported by the National ScienceFoundation of China under Grant no 61170128 the NaturalScience Foundation of Fujian Province China under Grantno 2012J01294 State Key Laboratory of Management and

10 Mathematical Problems in Engineering

Uniform Normal Pareto0

02

04

06

08

1

Test cost distribution

Find

ing

optim

al fa

ctor

(FO

F)12

(a)

1

09

08

07

04

03

02

01

0Uniform Normal Pareto

Test cost distribution

Max

imal

exce

edin

g fa

ctor

05

06

(b)

01

009

008

007

006

005

004

003

002

001

0Uniform Normal Pareto

Test cost distribution

Information gainGA-1GA-2

ACO centralized deletionACO distributed deletion

Aver

age e

xcee

ding

fact

or

(c)

Figure 5 Result on Mushroom (a) finding optimal factor (b) maximal exceeding factor (c) average exceeding factor

Control for Complex Systems Open Project under Grantno 20110106 and Fujian Province Foundation of HigherEducation under Grant no JK2012028

References

[1] F Min H He Y Qian and W Zhu ldquoTest-cost-sensitiveattribute reductionrdquo Information Sciences vol 181 no 22 pp4928ndash4942 2011

[2] F Min and W Zhu ldquoMinimal cost attribute reduction throughbacktrackingrdquo in Proceedings of the International Conference onDatabase Theory and Application vol 258 of FGIT-DTABSBTpp 100ndash107 CCIS 2011

[3] Y Y Yao and Y Zhao ldquoAttribute reduction in decision-theoreticrough set modelsrdquo Information Sciences vol 178 no 17 pp3356ndash3373 2008

[4] X Jia W Liao Z Tang and L Shang ldquoMinimum cost attributereduction in decision-theoretic rough set modelsrdquo InformationSciences vol 219 pp 151ndash167 2013

Mathematical Problems in Engineering 11

[5] Z Pawlak ldquoRough setsrdquo International Journal of Computer andInformation Sciences vol 11 no 5 pp 341ndash356 1982

[6] G Pan F Min and W Zhu ldquoA genetic algorithm to theminimal test cost reduct problemrdquo in Proceedings of the IEEEInternational Conference on Granular Computing (GrC rsquo11) pp539ndash544 Kaohsiung Taiwan November 2011

[7] F Min and Q Liu ldquoA hierarchical model for test-cost-sensitivedecision systemsrdquo Information Sciences vol 179 no 14 pp2442ndash2452 2009

[8] W X Zhang J S Mi and W Z Wu ldquoKnowledge reductions ininconsistent information systemsrdquo Chinese Journal of Comput-ers vol 26 no 1 pp 12ndash18 2003

[9] Q Liu L Chen J Zhang and F Min ldquoKnowledge reduction ininconsistent decision tablesrdquo in Proceedings of Advanced DataMining and Applications vol 4093 of Lecture Notes in ComputerScience pp 626ndash635 2006

[10] W Ziarko ldquoVariable precision rough set modelrdquo Journal ofComputer and System Sciences vol 46 no 1 pp 39ndash59 1993

[11] J G Bazan and A Skowron ldquoDynamic reducts as a tool forextracting laws from decision tablesrdquo in Proceedings of the8th International Symposium on Methodologies for IntelligentSystems pp 346ndash355 1994

[12] D Deng ldquoParallel reduct and its propertiesrdquo in Proceedings ofthe International Conference on Granular Computing (GRC rsquo09)pp 121ndash125 Nanchang China August 2009

[13] D Deng J Wang and X Li ldquoParallel reducts in a seriesof decision subsystemsrdquo in Proceedings of the InternationalJoint Conference on Computational Sciences and Optimization(CSO rsquo09) pp 377ndash380 Hainan China April 2009

[14] F Min Z Bai M He and Q Liu ldquoThe reduct problemwith specified attributesrdquo in Proceedings of the Rough Setsand Soft Computing in Intelligent Agent and Web TechnologyInternational Workshop at WI-IAT pp 36ndash42 2005

[15] W Zhu and F Wang ldquoOn three types of covering rough setsrdquoIEEE Transactions on Knowledge and Data Engineering vol 19no 8 pp 1131ndash1144 2007

[16] W Zhu and F Y Wang ldquoReduction and axiomization ofcovering generalized rough setsrdquo Information Sciences vol 152pp 217ndash230 2003

[17] C C Aggarwal ldquoOn density based transforms for uncertaindata miningrdquo in Proceedings of the 23rd IEEE InternationalConference on Data Engineering pp 866ndash875 2007

[18] H Zhao F Min and W Zhu ldquoTest-cost-sensitive attributereduction of data with normal distribution measurementerrorsrdquoMathematical Problems in Engineering vol 2013 ArticleID 946070 12 pages 2013

[19] D Slęzak ldquoApproximate entropy reductsrdquo Fundamenta Infor-maticae vol 53 no 3-4 pp 365ndash390 2002

[20] F Min X Du H Qiu and Q Liu ldquoMinimal attribute spacebias for attribute reductionrdquo in Proceedings of the Rough Set andKnowledge Technology pp 379ndash386 2007

[21] W Zhu ldquoTopological approaches to covering rough setsrdquoInformation Sciences vol 177 no 6 pp 1499ndash1508 2007

[22] S Greco B Matarazzo R Slowinski and J StefanowskildquoVariable consistency model of dominance-based rough setsapproachrdquo in Proceedings of the Rough Sets and Current Trendsin Computing vol 2005 of Lecture Notes in Computer Sciencepp 170ndash181 2000

[23] Q Hu D Yu J Liu and CWu ldquoNeighborhood rough set basedheterogeneous feature subset selectionrdquo Information Sciencesvol 178 no 18 pp 3577ndash3594 2008

[24] YQian J LiangW Pedrycz andCDang ldquoPositive approxima-tion an accelerator for attribute reduction in rough set theoryrdquoArtificial Intelligence vol 174 no 9-10 pp 597ndash618 2010

[25] Wikipedia ldquoGenetic algorithmrdquo httpenwikipediaorgwikiGeneticalgorithm

[26] P D Turney ldquoCost-sensitive classification empirical evaluationof a hybrid genetic decision tree induction algorithmrdquo Journalof Artificial Intelligence Research vol 2 pp 369ndash409 1995

[27] J Liu F Min S Liao and W Zhu ldquoA genetic algorithm toattribute reduction with test cost constraintrdquo in Proceedings ofthe 6th IEEE International Conference on Computer Sciences andConvergence Information Technology (ICCIT rsquo11) pp 751ndash7542011

[28] M Dorigo and T Stutzle Eds Ant Colony Optimization MITPress Boston Mass USA 2004

[29] L Gambardella and M Dorigo ldquoHas-sop hybrid ant systemfor the sequential ordering problemrdquo Tech Rep (IDSIA-11-97)1997

[30] L M Gambardella E Taillard G Agazzi I D Corne MDorigo and F Glover ldquoMacs-vrptw a multiple ant colonysystem for vehicle routing problemswith timewindowsrdquo inNewIdeas in Optimization pp 63ndash76 McGraw-Hill New York NYUSA 1999

[31] V Maniezzo and A Colorni ldquoThe ant system applied to thequadratic assignment problemrdquo IEEE Transactions on Knowl-edge and Data Engineering vol 11 no 5 pp 769ndash778 1999

[32] C Blum M Blesa and U P de Catalunya Departament deLlenguatges i Sistemes InformaticsMetaheuristics for the edge-weighted k-cardinality tree problem 2003

[33] R Michel and M Middendorf ldquoAn aco algorithm for theshortest common supersequence problemrdquo in New Ideas inOptimization pp 51ndash62 McGraw-Hill London UK 1999

[34] Y ChenDMiao andRWang ldquoA rough set approach to featureselection based on ant colony optimizationrdquoPatternRecognitionLetters vol 31 no 3 pp 226ndash233 2010

[35] L Ke Z Feng and Z Ren ldquoAn efficient ant colony optimizationapproach to attribute reduction in rough set theoryrdquo PatternRecognition Letters vol 29 no 9 pp 1351ndash1357 2008

[36] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998 httpwwwicsuciedusimmlearnmlrepositoryhtml

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

8 Mathematical Problems in Engineering

1

09

08

07

06

05

04100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

08

06

04

02100 200 300 400

Number of antsFi

ndin

g op

timal

fact

or (F

OF)

(b)

1

09

08100 200 300 400

Number of ants

IrisZooVoting

Tic-tac-toeMushroom

Find

ing

optim

al fa

ctor

(FO

F)

095

085

(c)

Figure 3 ACO with centralized deletion on different distributions (a) uniform (b) normal (c) pareto

(3) The ant colony optimization provides better perfor-mance than the information gain one and the geneticoneOur algorithmovercomes the shortcoming of thegenetic algorithmwhich performs badly on the largerdataset On medium-sized dataset Mushroom theACO outperforms the genetic one significantly Forexample when the test cost distribution is normal theFOF of the ant colony optimization using centralized

deletion and distributed deletion is 958 966respectively outperforming the information gain oneGA-1 and GA-2 with 176 321 and 521 respec-tively Figure 5 gives a more intuitive understandingof results on the dataset Mushroom One possiblereason is that the ant colony optimization algorithmproduces more diverse solutions than the existingone

Mathematical Problems in Engineering 9

1

09

08

07

06

05100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

09

08

07

06

05

04

100 200 300 400Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

03

02150 250 350

(b)

100 200 300 400Number of ants

IrisZooVoting

Tic-tac-toeMushroom

1

098

096

094

092

09

Find

ing

optim

al fa

ctor

(FO

F)

(c)

Figure 4 ACO with distributed deletion on different distributions (a) uniform (b) normal (c) Pareto

5 Conclusions

In this paper we have pointed out the shortcoming of theexisting heuristic algorithms including the information gain-based one and the genetic one We have designed a newalgorithm based on ant colony optimization to tackle theMTR problem In deletion stage our algorithm containscentralized deletion strategy and distributed deletion strat-egy We have tested the new one with three representativetest cost distributions on four UCI datasets Experimentalresults show that the new algorithm outperforms the existing

ones significantly especially 1lsquoon medium-sized dataset suchas Mushroom Moreover when distribution is normal thedistributed deletion strategy obtains better results than thecentralized one on Mushroom dataset

Acknowledgments

This work is in part supported by the National ScienceFoundation of China under Grant no 61170128 the NaturalScience Foundation of Fujian Province China under Grantno 2012J01294 State Key Laboratory of Management and

10 Mathematical Problems in Engineering

Uniform Normal Pareto0

02

04

06

08

1

Test cost distribution

Find

ing

optim

al fa

ctor

(FO

F)12

(a)

1

09

08

07

04

03

02

01

0Uniform Normal Pareto

Test cost distribution

Max

imal

exce

edin

g fa

ctor

05

06

(b)

01

009

008

007

006

005

004

003

002

001

0Uniform Normal Pareto

Test cost distribution

Information gainGA-1GA-2

ACO centralized deletionACO distributed deletion

Aver

age e

xcee

ding

fact

or

(c)

Figure 5 Result on Mushroom (a) finding optimal factor (b) maximal exceeding factor (c) average exceeding factor

Control for Complex Systems Open Project under Grantno 20110106 and Fujian Province Foundation of HigherEducation under Grant no JK2012028

References

[1] F Min H He Y Qian and W Zhu ldquoTest-cost-sensitiveattribute reductionrdquo Information Sciences vol 181 no 22 pp4928ndash4942 2011

[2] F Min and W Zhu ldquoMinimal cost attribute reduction throughbacktrackingrdquo in Proceedings of the International Conference onDatabase Theory and Application vol 258 of FGIT-DTABSBTpp 100ndash107 CCIS 2011

[3] Y Y Yao and Y Zhao ldquoAttribute reduction in decision-theoreticrough set modelsrdquo Information Sciences vol 178 no 17 pp3356ndash3373 2008

[4] X Jia W Liao Z Tang and L Shang ldquoMinimum cost attributereduction in decision-theoretic rough set modelsrdquo InformationSciences vol 219 pp 151ndash167 2013

Mathematical Problems in Engineering 11

[5] Z Pawlak ldquoRough setsrdquo International Journal of Computer andInformation Sciences vol 11 no 5 pp 341ndash356 1982

[6] G Pan F Min and W Zhu ldquoA genetic algorithm to theminimal test cost reduct problemrdquo in Proceedings of the IEEEInternational Conference on Granular Computing (GrC rsquo11) pp539ndash544 Kaohsiung Taiwan November 2011

[7] F Min and Q Liu ldquoA hierarchical model for test-cost-sensitivedecision systemsrdquo Information Sciences vol 179 no 14 pp2442ndash2452 2009

[8] W X Zhang J S Mi and W Z Wu ldquoKnowledge reductions ininconsistent information systemsrdquo Chinese Journal of Comput-ers vol 26 no 1 pp 12ndash18 2003

[9] Q Liu L Chen J Zhang and F Min ldquoKnowledge reduction ininconsistent decision tablesrdquo in Proceedings of Advanced DataMining and Applications vol 4093 of Lecture Notes in ComputerScience pp 626ndash635 2006

[10] W Ziarko ldquoVariable precision rough set modelrdquo Journal ofComputer and System Sciences vol 46 no 1 pp 39ndash59 1993

[11] J G Bazan and A Skowron ldquoDynamic reducts as a tool forextracting laws from decision tablesrdquo in Proceedings of the8th International Symposium on Methodologies for IntelligentSystems pp 346ndash355 1994

[12] D Deng ldquoParallel reduct and its propertiesrdquo in Proceedings ofthe International Conference on Granular Computing (GRC rsquo09)pp 121ndash125 Nanchang China August 2009

[13] D Deng J Wang and X Li ldquoParallel reducts in a seriesof decision subsystemsrdquo in Proceedings of the InternationalJoint Conference on Computational Sciences and Optimization(CSO rsquo09) pp 377ndash380 Hainan China April 2009

[14] F Min Z Bai M He and Q Liu ldquoThe reduct problemwith specified attributesrdquo in Proceedings of the Rough Setsand Soft Computing in Intelligent Agent and Web TechnologyInternational Workshop at WI-IAT pp 36ndash42 2005

[15] W Zhu and F Wang ldquoOn three types of covering rough setsrdquoIEEE Transactions on Knowledge and Data Engineering vol 19no 8 pp 1131ndash1144 2007

[16] W Zhu and F Y Wang ldquoReduction and axiomization ofcovering generalized rough setsrdquo Information Sciences vol 152pp 217ndash230 2003

[17] C C Aggarwal ldquoOn density based transforms for uncertaindata miningrdquo in Proceedings of the 23rd IEEE InternationalConference on Data Engineering pp 866ndash875 2007

[18] H Zhao F Min and W Zhu ldquoTest-cost-sensitive attributereduction of data with normal distribution measurementerrorsrdquoMathematical Problems in Engineering vol 2013 ArticleID 946070 12 pages 2013

[19] D Slęzak ldquoApproximate entropy reductsrdquo Fundamenta Infor-maticae vol 53 no 3-4 pp 365ndash390 2002

[20] F Min X Du H Qiu and Q Liu ldquoMinimal attribute spacebias for attribute reductionrdquo in Proceedings of the Rough Set andKnowledge Technology pp 379ndash386 2007

[21] W Zhu ldquoTopological approaches to covering rough setsrdquoInformation Sciences vol 177 no 6 pp 1499ndash1508 2007

[22] S Greco B Matarazzo R Slowinski and J StefanowskildquoVariable consistency model of dominance-based rough setsapproachrdquo in Proceedings of the Rough Sets and Current Trendsin Computing vol 2005 of Lecture Notes in Computer Sciencepp 170ndash181 2000

[23] Q Hu D Yu J Liu and CWu ldquoNeighborhood rough set basedheterogeneous feature subset selectionrdquo Information Sciencesvol 178 no 18 pp 3577ndash3594 2008

[24] YQian J LiangW Pedrycz andCDang ldquoPositive approxima-tion an accelerator for attribute reduction in rough set theoryrdquoArtificial Intelligence vol 174 no 9-10 pp 597ndash618 2010

[25] Wikipedia ldquoGenetic algorithmrdquo httpenwikipediaorgwikiGeneticalgorithm

[26] P D Turney ldquoCost-sensitive classification empirical evaluationof a hybrid genetic decision tree induction algorithmrdquo Journalof Artificial Intelligence Research vol 2 pp 369ndash409 1995

[27] J Liu F Min S Liao and W Zhu ldquoA genetic algorithm toattribute reduction with test cost constraintrdquo in Proceedings ofthe 6th IEEE International Conference on Computer Sciences andConvergence Information Technology (ICCIT rsquo11) pp 751ndash7542011

[28] M Dorigo and T Stutzle Eds Ant Colony Optimization MITPress Boston Mass USA 2004

[29] L Gambardella and M Dorigo ldquoHas-sop hybrid ant systemfor the sequential ordering problemrdquo Tech Rep (IDSIA-11-97)1997

[30] L M Gambardella E Taillard G Agazzi I D Corne MDorigo and F Glover ldquoMacs-vrptw a multiple ant colonysystem for vehicle routing problemswith timewindowsrdquo inNewIdeas in Optimization pp 63ndash76 McGraw-Hill New York NYUSA 1999

[31] V Maniezzo and A Colorni ldquoThe ant system applied to thequadratic assignment problemrdquo IEEE Transactions on Knowl-edge and Data Engineering vol 11 no 5 pp 769ndash778 1999

[32] C Blum M Blesa and U P de Catalunya Departament deLlenguatges i Sistemes InformaticsMetaheuristics for the edge-weighted k-cardinality tree problem 2003

[33] R Michel and M Middendorf ldquoAn aco algorithm for theshortest common supersequence problemrdquo in New Ideas inOptimization pp 51ndash62 McGraw-Hill London UK 1999

[34] Y ChenDMiao andRWang ldquoA rough set approach to featureselection based on ant colony optimizationrdquoPatternRecognitionLetters vol 31 no 3 pp 226ndash233 2010

[35] L Ke Z Feng and Z Ren ldquoAn efficient ant colony optimizationapproach to attribute reduction in rough set theoryrdquo PatternRecognition Letters vol 29 no 9 pp 1351ndash1357 2008

[36] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998 httpwwwicsuciedusimmlearnmlrepositoryhtml

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 9

1

09

08

07

06

05100 200 300 400

Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

(a)

1

09

08

07

06

05

04

100 200 300 400Number of ants

Find

ing

optim

al fa

ctor

(FO

F)

03

02150 250 350

(b)

100 200 300 400Number of ants

IrisZooVoting

Tic-tac-toeMushroom

1

098

096

094

092

09

Find

ing

optim

al fa

ctor

(FO

F)

(c)

Figure 4 ACO with distributed deletion on different distributions (a) uniform (b) normal (c) Pareto

5 Conclusions

In this paper we have pointed out the shortcoming of theexisting heuristic algorithms including the information gain-based one and the genetic one We have designed a newalgorithm based on ant colony optimization to tackle theMTR problem In deletion stage our algorithm containscentralized deletion strategy and distributed deletion strat-egy We have tested the new one with three representativetest cost distributions on four UCI datasets Experimentalresults show that the new algorithm outperforms the existing

ones significantly especially 1lsquoon medium-sized dataset suchas Mushroom Moreover when distribution is normal thedistributed deletion strategy obtains better results than thecentralized one on Mushroom dataset

Acknowledgments

This work is in part supported by the National ScienceFoundation of China under Grant no 61170128 the NaturalScience Foundation of Fujian Province China under Grantno 2012J01294 State Key Laboratory of Management and

10 Mathematical Problems in Engineering

Uniform Normal Pareto0

02

04

06

08

1

Test cost distribution

Find

ing

optim

al fa

ctor

(FO

F)12

(a)

1

09

08

07

04

03

02

01

0Uniform Normal Pareto

Test cost distribution

Max

imal

exce

edin

g fa

ctor

05

06

(b)

01

009

008

007

006

005

004

003

002

001

0Uniform Normal Pareto

Test cost distribution

Information gainGA-1GA-2

ACO centralized deletionACO distributed deletion

Aver

age e

xcee

ding

fact

or

(c)

Figure 5 Result on Mushroom (a) finding optimal factor (b) maximal exceeding factor (c) average exceeding factor

Control for Complex Systems Open Project under Grantno 20110106 and Fujian Province Foundation of HigherEducation under Grant no JK2012028

References

[1] F Min H He Y Qian and W Zhu ldquoTest-cost-sensitiveattribute reductionrdquo Information Sciences vol 181 no 22 pp4928ndash4942 2011

[2] F Min and W Zhu ldquoMinimal cost attribute reduction throughbacktrackingrdquo in Proceedings of the International Conference onDatabase Theory and Application vol 258 of FGIT-DTABSBTpp 100ndash107 CCIS 2011

[3] Y Y Yao and Y Zhao ldquoAttribute reduction in decision-theoreticrough set modelsrdquo Information Sciences vol 178 no 17 pp3356ndash3373 2008

[4] X Jia W Liao Z Tang and L Shang ldquoMinimum cost attributereduction in decision-theoretic rough set modelsrdquo InformationSciences vol 219 pp 151ndash167 2013

Mathematical Problems in Engineering 11

[5] Z Pawlak ldquoRough setsrdquo International Journal of Computer andInformation Sciences vol 11 no 5 pp 341ndash356 1982

[6] G Pan F Min and W Zhu ldquoA genetic algorithm to theminimal test cost reduct problemrdquo in Proceedings of the IEEEInternational Conference on Granular Computing (GrC rsquo11) pp539ndash544 Kaohsiung Taiwan November 2011

[7] F Min and Q Liu ldquoA hierarchical model for test-cost-sensitivedecision systemsrdquo Information Sciences vol 179 no 14 pp2442ndash2452 2009

[8] W X Zhang J S Mi and W Z Wu ldquoKnowledge reductions ininconsistent information systemsrdquo Chinese Journal of Comput-ers vol 26 no 1 pp 12ndash18 2003

[9] Q Liu L Chen J Zhang and F Min ldquoKnowledge reduction ininconsistent decision tablesrdquo in Proceedings of Advanced DataMining and Applications vol 4093 of Lecture Notes in ComputerScience pp 626ndash635 2006

[10] W Ziarko ldquoVariable precision rough set modelrdquo Journal ofComputer and System Sciences vol 46 no 1 pp 39ndash59 1993

[11] J G Bazan and A Skowron ldquoDynamic reducts as a tool forextracting laws from decision tablesrdquo in Proceedings of the8th International Symposium on Methodologies for IntelligentSystems pp 346ndash355 1994

[12] D Deng ldquoParallel reduct and its propertiesrdquo in Proceedings ofthe International Conference on Granular Computing (GRC rsquo09)pp 121ndash125 Nanchang China August 2009

[13] D Deng J Wang and X Li ldquoParallel reducts in a seriesof decision subsystemsrdquo in Proceedings of the InternationalJoint Conference on Computational Sciences and Optimization(CSO rsquo09) pp 377ndash380 Hainan China April 2009

[14] F Min Z Bai M He and Q Liu ldquoThe reduct problemwith specified attributesrdquo in Proceedings of the Rough Setsand Soft Computing in Intelligent Agent and Web TechnologyInternational Workshop at WI-IAT pp 36ndash42 2005

[15] W Zhu and F Wang ldquoOn three types of covering rough setsrdquoIEEE Transactions on Knowledge and Data Engineering vol 19no 8 pp 1131ndash1144 2007

[16] W Zhu and F Y Wang ldquoReduction and axiomization ofcovering generalized rough setsrdquo Information Sciences vol 152pp 217ndash230 2003

[17] C C Aggarwal ldquoOn density based transforms for uncertaindata miningrdquo in Proceedings of the 23rd IEEE InternationalConference on Data Engineering pp 866ndash875 2007

[18] H Zhao F Min and W Zhu ldquoTest-cost-sensitive attributereduction of data with normal distribution measurementerrorsrdquoMathematical Problems in Engineering vol 2013 ArticleID 946070 12 pages 2013

[19] D Slęzak ldquoApproximate entropy reductsrdquo Fundamenta Infor-maticae vol 53 no 3-4 pp 365ndash390 2002

[20] F Min X Du H Qiu and Q Liu ldquoMinimal attribute spacebias for attribute reductionrdquo in Proceedings of the Rough Set andKnowledge Technology pp 379ndash386 2007

[21] W Zhu ldquoTopological approaches to covering rough setsrdquoInformation Sciences vol 177 no 6 pp 1499ndash1508 2007

[22] S Greco B Matarazzo R Slowinski and J StefanowskildquoVariable consistency model of dominance-based rough setsapproachrdquo in Proceedings of the Rough Sets and Current Trendsin Computing vol 2005 of Lecture Notes in Computer Sciencepp 170ndash181 2000

[23] Q Hu D Yu J Liu and CWu ldquoNeighborhood rough set basedheterogeneous feature subset selectionrdquo Information Sciencesvol 178 no 18 pp 3577ndash3594 2008

[24] YQian J LiangW Pedrycz andCDang ldquoPositive approxima-tion an accelerator for attribute reduction in rough set theoryrdquoArtificial Intelligence vol 174 no 9-10 pp 597ndash618 2010

[25] Wikipedia ldquoGenetic algorithmrdquo httpenwikipediaorgwikiGeneticalgorithm

[26] P D Turney ldquoCost-sensitive classification empirical evaluationof a hybrid genetic decision tree induction algorithmrdquo Journalof Artificial Intelligence Research vol 2 pp 369ndash409 1995

[27] J Liu F Min S Liao and W Zhu ldquoA genetic algorithm toattribute reduction with test cost constraintrdquo in Proceedings ofthe 6th IEEE International Conference on Computer Sciences andConvergence Information Technology (ICCIT rsquo11) pp 751ndash7542011

[28] M Dorigo and T Stutzle Eds Ant Colony Optimization MITPress Boston Mass USA 2004

[29] L Gambardella and M Dorigo ldquoHas-sop hybrid ant systemfor the sequential ordering problemrdquo Tech Rep (IDSIA-11-97)1997

[30] L M Gambardella E Taillard G Agazzi I D Corne MDorigo and F Glover ldquoMacs-vrptw a multiple ant colonysystem for vehicle routing problemswith timewindowsrdquo inNewIdeas in Optimization pp 63ndash76 McGraw-Hill New York NYUSA 1999

[31] V Maniezzo and A Colorni ldquoThe ant system applied to thequadratic assignment problemrdquo IEEE Transactions on Knowl-edge and Data Engineering vol 11 no 5 pp 769ndash778 1999

[32] C Blum M Blesa and U P de Catalunya Departament deLlenguatges i Sistemes InformaticsMetaheuristics for the edge-weighted k-cardinality tree problem 2003

[33] R Michel and M Middendorf ldquoAn aco algorithm for theshortest common supersequence problemrdquo in New Ideas inOptimization pp 51ndash62 McGraw-Hill London UK 1999

[34] Y ChenDMiao andRWang ldquoA rough set approach to featureselection based on ant colony optimizationrdquoPatternRecognitionLetters vol 31 no 3 pp 226ndash233 2010

[35] L Ke Z Feng and Z Ren ldquoAn efficient ant colony optimizationapproach to attribute reduction in rough set theoryrdquo PatternRecognition Letters vol 29 no 9 pp 1351ndash1357 2008

[36] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998 httpwwwicsuciedusimmlearnmlrepositoryhtml

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

10 Mathematical Problems in Engineering

Uniform Normal Pareto0

02

04

06

08

1

Test cost distribution

Find

ing

optim

al fa

ctor

(FO

F)12

(a)

1

09

08

07

04

03

02

01

0Uniform Normal Pareto

Test cost distribution

Max

imal

exce

edin

g fa

ctor

05

06

(b)

01

009

008

007

006

005

004

003

002

001

0Uniform Normal Pareto

Test cost distribution

Information gainGA-1GA-2

ACO centralized deletionACO distributed deletion

Aver

age e

xcee

ding

fact

or

(c)

Figure 5 Result on Mushroom (a) finding optimal factor (b) maximal exceeding factor (c) average exceeding factor

Control for Complex Systems Open Project under Grantno 20110106 and Fujian Province Foundation of HigherEducation under Grant no JK2012028

References

[1] F Min H He Y Qian and W Zhu ldquoTest-cost-sensitiveattribute reductionrdquo Information Sciences vol 181 no 22 pp4928ndash4942 2011

[2] F Min and W Zhu ldquoMinimal cost attribute reduction throughbacktrackingrdquo in Proceedings of the International Conference onDatabase Theory and Application vol 258 of FGIT-DTABSBTpp 100ndash107 CCIS 2011

[3] Y Y Yao and Y Zhao ldquoAttribute reduction in decision-theoreticrough set modelsrdquo Information Sciences vol 178 no 17 pp3356ndash3373 2008

[4] X Jia W Liao Z Tang and L Shang ldquoMinimum cost attributereduction in decision-theoretic rough set modelsrdquo InformationSciences vol 219 pp 151ndash167 2013

Mathematical Problems in Engineering 11

[5] Z Pawlak ldquoRough setsrdquo International Journal of Computer andInformation Sciences vol 11 no 5 pp 341ndash356 1982

[6] G Pan F Min and W Zhu ldquoA genetic algorithm to theminimal test cost reduct problemrdquo in Proceedings of the IEEEInternational Conference on Granular Computing (GrC rsquo11) pp539ndash544 Kaohsiung Taiwan November 2011

[7] F Min and Q Liu ldquoA hierarchical model for test-cost-sensitivedecision systemsrdquo Information Sciences vol 179 no 14 pp2442ndash2452 2009

[8] W X Zhang J S Mi and W Z Wu ldquoKnowledge reductions ininconsistent information systemsrdquo Chinese Journal of Comput-ers vol 26 no 1 pp 12ndash18 2003

[9] Q Liu L Chen J Zhang and F Min ldquoKnowledge reduction ininconsistent decision tablesrdquo in Proceedings of Advanced DataMining and Applications vol 4093 of Lecture Notes in ComputerScience pp 626ndash635 2006

[10] W Ziarko ldquoVariable precision rough set modelrdquo Journal ofComputer and System Sciences vol 46 no 1 pp 39ndash59 1993

[11] J G Bazan and A Skowron ldquoDynamic reducts as a tool forextracting laws from decision tablesrdquo in Proceedings of the8th International Symposium on Methodologies for IntelligentSystems pp 346ndash355 1994

[12] D Deng ldquoParallel reduct and its propertiesrdquo in Proceedings ofthe International Conference on Granular Computing (GRC rsquo09)pp 121ndash125 Nanchang China August 2009

[13] D Deng J Wang and X Li ldquoParallel reducts in a seriesof decision subsystemsrdquo in Proceedings of the InternationalJoint Conference on Computational Sciences and Optimization(CSO rsquo09) pp 377ndash380 Hainan China April 2009

[14] F Min Z Bai M He and Q Liu ldquoThe reduct problemwith specified attributesrdquo in Proceedings of the Rough Setsand Soft Computing in Intelligent Agent and Web TechnologyInternational Workshop at WI-IAT pp 36ndash42 2005

[15] W Zhu and F Wang ldquoOn three types of covering rough setsrdquoIEEE Transactions on Knowledge and Data Engineering vol 19no 8 pp 1131ndash1144 2007

[16] W Zhu and F Y Wang ldquoReduction and axiomization ofcovering generalized rough setsrdquo Information Sciences vol 152pp 217ndash230 2003

[17] C C Aggarwal ldquoOn density based transforms for uncertaindata miningrdquo in Proceedings of the 23rd IEEE InternationalConference on Data Engineering pp 866ndash875 2007

[18] H Zhao F Min and W Zhu ldquoTest-cost-sensitive attributereduction of data with normal distribution measurementerrorsrdquoMathematical Problems in Engineering vol 2013 ArticleID 946070 12 pages 2013

[19] D Slęzak ldquoApproximate entropy reductsrdquo Fundamenta Infor-maticae vol 53 no 3-4 pp 365ndash390 2002

[20] F Min X Du H Qiu and Q Liu ldquoMinimal attribute spacebias for attribute reductionrdquo in Proceedings of the Rough Set andKnowledge Technology pp 379ndash386 2007

[21] W Zhu ldquoTopological approaches to covering rough setsrdquoInformation Sciences vol 177 no 6 pp 1499ndash1508 2007

[22] S Greco B Matarazzo R Slowinski and J StefanowskildquoVariable consistency model of dominance-based rough setsapproachrdquo in Proceedings of the Rough Sets and Current Trendsin Computing vol 2005 of Lecture Notes in Computer Sciencepp 170ndash181 2000

[23] Q Hu D Yu J Liu and CWu ldquoNeighborhood rough set basedheterogeneous feature subset selectionrdquo Information Sciencesvol 178 no 18 pp 3577ndash3594 2008

[24] YQian J LiangW Pedrycz andCDang ldquoPositive approxima-tion an accelerator for attribute reduction in rough set theoryrdquoArtificial Intelligence vol 174 no 9-10 pp 597ndash618 2010

[25] Wikipedia ldquoGenetic algorithmrdquo httpenwikipediaorgwikiGeneticalgorithm

[26] P D Turney ldquoCost-sensitive classification empirical evaluationof a hybrid genetic decision tree induction algorithmrdquo Journalof Artificial Intelligence Research vol 2 pp 369ndash409 1995

[27] J Liu F Min S Liao and W Zhu ldquoA genetic algorithm toattribute reduction with test cost constraintrdquo in Proceedings ofthe 6th IEEE International Conference on Computer Sciences andConvergence Information Technology (ICCIT rsquo11) pp 751ndash7542011

[28] M Dorigo and T Stutzle Eds Ant Colony Optimization MITPress Boston Mass USA 2004

[29] L Gambardella and M Dorigo ldquoHas-sop hybrid ant systemfor the sequential ordering problemrdquo Tech Rep (IDSIA-11-97)1997

[30] L M Gambardella E Taillard G Agazzi I D Corne MDorigo and F Glover ldquoMacs-vrptw a multiple ant colonysystem for vehicle routing problemswith timewindowsrdquo inNewIdeas in Optimization pp 63ndash76 McGraw-Hill New York NYUSA 1999

[31] V Maniezzo and A Colorni ldquoThe ant system applied to thequadratic assignment problemrdquo IEEE Transactions on Knowl-edge and Data Engineering vol 11 no 5 pp 769ndash778 1999

[32] C Blum M Blesa and U P de Catalunya Departament deLlenguatges i Sistemes InformaticsMetaheuristics for the edge-weighted k-cardinality tree problem 2003

[33] R Michel and M Middendorf ldquoAn aco algorithm for theshortest common supersequence problemrdquo in New Ideas inOptimization pp 51ndash62 McGraw-Hill London UK 1999

[34] Y ChenDMiao andRWang ldquoA rough set approach to featureselection based on ant colony optimizationrdquoPatternRecognitionLetters vol 31 no 3 pp 226ndash233 2010

[35] L Ke Z Feng and Z Ren ldquoAn efficient ant colony optimizationapproach to attribute reduction in rough set theoryrdquo PatternRecognition Letters vol 29 no 9 pp 1351ndash1357 2008

[36] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998 httpwwwicsuciedusimmlearnmlrepositoryhtml

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Mathematical Problems in Engineering 11

[5] Z Pawlak ldquoRough setsrdquo International Journal of Computer andInformation Sciences vol 11 no 5 pp 341ndash356 1982

[6] G Pan F Min and W Zhu ldquoA genetic algorithm to theminimal test cost reduct problemrdquo in Proceedings of the IEEEInternational Conference on Granular Computing (GrC rsquo11) pp539ndash544 Kaohsiung Taiwan November 2011

[7] F Min and Q Liu ldquoA hierarchical model for test-cost-sensitivedecision systemsrdquo Information Sciences vol 179 no 14 pp2442ndash2452 2009

[8] W X Zhang J S Mi and W Z Wu ldquoKnowledge reductions ininconsistent information systemsrdquo Chinese Journal of Comput-ers vol 26 no 1 pp 12ndash18 2003

[9] Q Liu L Chen J Zhang and F Min ldquoKnowledge reduction ininconsistent decision tablesrdquo in Proceedings of Advanced DataMining and Applications vol 4093 of Lecture Notes in ComputerScience pp 626ndash635 2006

[10] W Ziarko ldquoVariable precision rough set modelrdquo Journal ofComputer and System Sciences vol 46 no 1 pp 39ndash59 1993

[11] J G Bazan and A Skowron ldquoDynamic reducts as a tool forextracting laws from decision tablesrdquo in Proceedings of the8th International Symposium on Methodologies for IntelligentSystems pp 346ndash355 1994

[12] D Deng ldquoParallel reduct and its propertiesrdquo in Proceedings ofthe International Conference on Granular Computing (GRC rsquo09)pp 121ndash125 Nanchang China August 2009

[13] D Deng J Wang and X Li ldquoParallel reducts in a seriesof decision subsystemsrdquo in Proceedings of the InternationalJoint Conference on Computational Sciences and Optimization(CSO rsquo09) pp 377ndash380 Hainan China April 2009

[14] F Min Z Bai M He and Q Liu ldquoThe reduct problemwith specified attributesrdquo in Proceedings of the Rough Setsand Soft Computing in Intelligent Agent and Web TechnologyInternational Workshop at WI-IAT pp 36ndash42 2005

[15] W Zhu and F Wang ldquoOn three types of covering rough setsrdquoIEEE Transactions on Knowledge and Data Engineering vol 19no 8 pp 1131ndash1144 2007

[16] W Zhu and F Y Wang ldquoReduction and axiomization ofcovering generalized rough setsrdquo Information Sciences vol 152pp 217ndash230 2003

[17] C C Aggarwal ldquoOn density based transforms for uncertaindata miningrdquo in Proceedings of the 23rd IEEE InternationalConference on Data Engineering pp 866ndash875 2007

[18] H Zhao F Min and W Zhu ldquoTest-cost-sensitive attributereduction of data with normal distribution measurementerrorsrdquoMathematical Problems in Engineering vol 2013 ArticleID 946070 12 pages 2013

[19] D Slęzak ldquoApproximate entropy reductsrdquo Fundamenta Infor-maticae vol 53 no 3-4 pp 365ndash390 2002

[20] F Min X Du H Qiu and Q Liu ldquoMinimal attribute spacebias for attribute reductionrdquo in Proceedings of the Rough Set andKnowledge Technology pp 379ndash386 2007

[21] W Zhu ldquoTopological approaches to covering rough setsrdquoInformation Sciences vol 177 no 6 pp 1499ndash1508 2007

[22] S Greco B Matarazzo R Slowinski and J StefanowskildquoVariable consistency model of dominance-based rough setsapproachrdquo in Proceedings of the Rough Sets and Current Trendsin Computing vol 2005 of Lecture Notes in Computer Sciencepp 170ndash181 2000

[23] Q Hu D Yu J Liu and CWu ldquoNeighborhood rough set basedheterogeneous feature subset selectionrdquo Information Sciencesvol 178 no 18 pp 3577ndash3594 2008

[24] YQian J LiangW Pedrycz andCDang ldquoPositive approxima-tion an accelerator for attribute reduction in rough set theoryrdquoArtificial Intelligence vol 174 no 9-10 pp 597ndash618 2010

[25] Wikipedia ldquoGenetic algorithmrdquo httpenwikipediaorgwikiGeneticalgorithm

[26] P D Turney ldquoCost-sensitive classification empirical evaluationof a hybrid genetic decision tree induction algorithmrdquo Journalof Artificial Intelligence Research vol 2 pp 369ndash409 1995

[27] J Liu F Min S Liao and W Zhu ldquoA genetic algorithm toattribute reduction with test cost constraintrdquo in Proceedings ofthe 6th IEEE International Conference on Computer Sciences andConvergence Information Technology (ICCIT rsquo11) pp 751ndash7542011

[28] M Dorigo and T Stutzle Eds Ant Colony Optimization MITPress Boston Mass USA 2004

[29] L Gambardella and M Dorigo ldquoHas-sop hybrid ant systemfor the sequential ordering problemrdquo Tech Rep (IDSIA-11-97)1997

[30] L M Gambardella E Taillard G Agazzi I D Corne MDorigo and F Glover ldquoMacs-vrptw a multiple ant colonysystem for vehicle routing problemswith timewindowsrdquo inNewIdeas in Optimization pp 63ndash76 McGraw-Hill New York NYUSA 1999

[31] V Maniezzo and A Colorni ldquoThe ant system applied to thequadratic assignment problemrdquo IEEE Transactions on Knowl-edge and Data Engineering vol 11 no 5 pp 769ndash778 1999

[32] C Blum M Blesa and U P de Catalunya Departament deLlenguatges i Sistemes InformaticsMetaheuristics for the edge-weighted k-cardinality tree problem 2003

[33] R Michel and M Middendorf ldquoAn aco algorithm for theshortest common supersequence problemrdquo in New Ideas inOptimization pp 51ndash62 McGraw-Hill London UK 1999

[34] Y ChenDMiao andRWang ldquoA rough set approach to featureselection based on ant colony optimizationrdquoPatternRecognitionLetters vol 31 no 3 pp 226ndash233 2010

[35] L Ke Z Feng and Z Ren ldquoAn efficient ant colony optimizationapproach to attribute reduction in rough set theoryrdquo PatternRecognition Letters vol 29 no 9 pp 1351ndash1357 2008

[36] C L Blake and C J Merz ldquoUCI repository of machine learningdatabasesrdquo 1998 httpwwwicsuciedusimmlearnmlrepositoryhtml

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of

Submit your manuscripts athttpwwwhindawicom

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical Problems in Engineering

Hindawi Publishing Corporationhttpwwwhindawicom

Differential EquationsInternational Journal of

Volume 2014

Applied MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Probability and StatisticsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Mathematical PhysicsAdvances in

Complex AnalysisJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

OptimizationJournal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

CombinatoricsHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Operations ResearchAdvances in

Journal of

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Function Spaces

Abstract and Applied AnalysisHindawi Publishing Corporationhttpwwwhindawicom Volume 2014

International Journal of Mathematics and Mathematical Sciences

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

The Scientific World JournalHindawi Publishing Corporation httpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Algebra

Discrete Dynamics in Nature and Society

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Decision SciencesAdvances in

Discrete MathematicsJournal of

Hindawi Publishing Corporationhttpwwwhindawicom

Volume 2014 Hindawi Publishing Corporationhttpwwwhindawicom Volume 2014

Stochastic AnalysisInternational Journal of