Using Clustering and DEA for evaluation and ranking university majors



Peyman Gholami, Azadeh Bazleh*, Mahdis Salehi 

Abstract — Although all university majors are important and the need for them is not in question, they may not have the same priority given the different resources and strategies of a country. Their priorities are also likely to change over time; that is, different majors are desirable at different times. A government that knows which majors can address the current problems of the world and of its own country will value those majors more highly. This paper considers the problem of clustering and ranking university majors in Iran. To do so, a model is presented to clarify the procedure. Eight criteria are determined, and 177 existing university majors are compared on these criteria. First, the university majors are clustered by the k-means algorithm based on their similarities and differences. Then, using DEA, we evaluate and rank the central point (centroid) of each cluster and calculate the efficiency of each cluster.

Index Terms — Clustering, Data mining, Data Envelopment Analysis (DEA), K-means algorithm, Multi-criteria decision making, University major ranking problem

——————————    ——————————

1 INTRODUCTION

University major choice is an important decision for anybody seeking professional or higher education. It is a decision that will influence the way people look at the world around them (Porter & Umbach, 2006). People's future occupations are closely related to their education. Given this importance, guidance on which major to select is always of interest. It is known that students should draw on available resources to ultimately pick a path that is right for them (Boudarbat, 2008). Nowadays, with the creation of numerous undergraduate majors, the need for a more precise approach keeps growing. Besides individuals, governments are another stakeholder in university major choice. They look for ways to supply their professional workforce, one of the most influential factors in a nation's future. To manage this and to find which majors will be more important in the future, they require a systematic approach that gives a deeper view of the majors. For example, they need to know which areas each major affects, how it affects them, and to what extent each major is influential in a given area. Although all university majors are important and the need for them is not in question, they may not have the same priority given the different strategies of a country. Their priorities are also likely to change over time; that is, different majors are desirable at different times. A government that knows which majors can address the current problems of the world and of its own country will value those majors more highly. By investing more in those majors, or by providing larger grants to those who study them, it can motivate more talented students to choose them.

Therefore, given the explanations above, constructing a model for such a decision-making process is a useful contribution. To this end, we define eight different main specialization groups (MSGs). We first group university majors based on their similarities and differences, which are obtained from their magnitude of influence on the MSGs. The values of the different major groups can then be calculated and evaluated to provide useful decision information that helps the government use its resources rationally. Among the available grouping methods, data mining approaches have attracted the most attention. Among the different data mining models, clustering is regarded as the art of systematically finding groups in a data set (Fayyad, Piatetsky-Shapiro, & Smyth, 1996). In this paper, to cluster the university majors, we use the k-means algorithm, the most widely used method, which has been successful in applications such as market segmentation, pattern recognition, and information retrieval (Cheung, 2003; Kuo, Ho, & Hu, 2002). Besides its high performance, it is a very popular clustering approach because of its simplicity of implementation and fast execution.

Ranking university majors is a multi-criteria problem; that is, different criteria should be taken into account. For example, one major might be very important in an industrial setting while another improves social culture. For this reason, we apply data envelopment analysis (DEA) as a simple multi-criteria decision making (MCDM) method for dealing with unstructured, multi-attribute problems.

DEA is a non-parametric, linear programming based technique for measuring the relative efficiency of a set of similar units, usually referred to as decision making units (DMUs). Because of its successful applications and case studies, DEA has gained considerable attention and widespread use among business and academic researchers. Examples of the use of DEA in various areas include the evaluation of data warehouse operations (Mannino, Hong, & Choi, 2008), the selection of flexible manufacturing systems (Liu, 2008), the assessment of bank branch performance (Camanho & Dyson, 2005), the examination of bank efficiency (Chen, Skully, & Brown, 2005), the analysis of firms' financial statements (Edirisinghe & Zhang, 2007), the measurement of the efficiency of higher education institutions by solving only one LP (Johnes, 2006), the facility layout design (FLD) problem (Ertay, Ruan, & Tuzkaya, 2006), and the measurement of the efficiency of organizational investments in information technology (Shafer & Byrd, 2000).

————————————————

•  Department of Industrial Engineering, Arak Branch, Islamic Azad University, Arak, Iran

•  * Department of Industrial Engineering, Arak Branch, Islamic Azad University, Arak, Iran

•  Department of Industrial Engineering, Arak Branch, Islamic Azad University, Arak, Iran

JOURNAL OF COMPUTING, VOLUME 3, ISSUE 8, AUGUST 2011, ISSN 2151-9617

HTTPS://SITES.GOOGLE.COM/SITE/JOURNALOFCOMPUTING/

WW.JOURNALOFCOMPUTING.ORG 18

© 2011 Journal of Computing Press, NY, USA, ISSN 2151-9617

A review of the literature shows no published paper that deals with major choice as a nationwide problem. Existing studies mostly treat it as a problem of assisting individuals. They usually propose regression models that guide a student toward the best major given her/his personal conditions, characteristics and interests (Porter & Umbach, 2006; Boudarbat, 2008; Berger, 1988; Crampton, Walstrom, & Schambach, 2006). To the best of our knowledge, this paper is the first work to explore the problem as a nationwide one and to cluster university majors using a data mining method, k-means. Moreover, the university major clusters are ranked by an MCDM method, DEA. The rest of the paper is organized as follows. Section 2 presents the conceptual model of university major ranking. Section 3 clusters the university majors. Section 4 applies DEA to rank the groups of university majors. Section 5 concludes the paper.

2 UNIVERSITY MAJOR RANKING MODEL

This section presents a conceptual model that describes the decision-making procedure of university major clustering and ranking. We employ a flow chart (FC) model to show the whole procedure; the diagram clarifies each step without going into its details. Fig. 1 presents the FC model. The procedure can be divided into three main phases: data gathering, data preparation, and decision making. In the first phase, the list of existing university majors is obtained from the Iranian Ministry of Science, Research and Technology. University majors in Iran are offered in five main groups, each of which corresponds to an educational background from high school. These five groups are: (1) Fine Art, (2) Mathematics and Physics, (3) Empirical Sciences, (4) Human Sciences, and (5) Foreign Languages. In total, 177 university majors offered in Iran are identified.

Fig. 1. The proposed Flow Chart (FC) model

TABLE 1
THE RESULTS OF UNIVERSITY MAJOR RANKING

Then, the MSGs are determined. This paper considers eight main specialization groups, chosen with due consideration of Iran's own attributes and of the special areas needed to ease the design process of sustainable development. These eight MSGs were extracted after a review of the literature on the problem and of the reports published by the local government on achieving sustainable development, and their validity and reliability were verified and confirmed through a number of structured interviews. At this stage, additional rules and constraints taken from Iran's strategies and views should be considered as well. Finally, the following eight MSGs are considered as decision criteria:

1. Financial/Economical group
2. Social/Religious group
3. Industrial group
4. Political group
5. Service group
6. Agricultural group
7. Therapy/Health group
8. Environmental/Natural resource group

In the second phase, based on the data gathered in the previous phase, two questionnaires are designed. The first compares the university majors on their magnitude of influence on the above-mentioned MSGs. The second compares the importance (weight) of each MSG for present-day Iran. The questionnaires are sent to 64 experts, where an expert is defined in this research as a person who holds at least a Master of Science degree in one of the official university majors and has at least three years of working experience in his/her specialization. Once the data are collected, they are prepared.

If all the requirements are met, the third phase starts. First, we employ a well-known data mining approach to cluster the university majors based on their similarities and differences in the results. This step is explained in more detail in Section 3. Then, we rank the university majors by means of an MCDM algorithm. There are two options: (1) multi-objective decision making (MODM) or (2) multi-attribute decision making (MADM) approaches.

MODM models search a continuous or integer space to find optimal solutions. The most commonly used type of these models is linear programming. Although an MODM model is not a good choice for the problem studied here, we formulate the university major ranking problem as a linear programming model. The following notation is defined.

n      the number of majors (n = 177)
m      the number of MSGs (m = 8)
i      index for majors; i = 1, 2, . . . , n
j      index for MSGs; j = 1, 2, . . . , m
b_ij   the magnitude of influence of major i on MSG j

The decision variables are as follows.

X_i    continuous variable for the importance rate of major i

The proposed model is given below:

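Maximize

\[ Z = \sum_{i=1}^{n} \sum_{j=1}^{m} b_{ij} X_i \tag{1} \]

Subject to:

\[ \sum_{i=1}^{n} X_i = 1 \tag{2} \]

\[ X_i \ge 0, \quad i = 1, 2, \ldots, n \tag{3} \]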

Eq. (1) is the objective function: the sum, over all majors and all eight MSGs, of the importance rate given to each major multiplied by its influence on each MSG. Constraint (2) sets the total importance of all the majors equal to 1. Constraint set (3) defines the decision variables.

Since ranking university majors is not a continuous problem, the MODM model is not the best choice; our purpose in presenting the model is to characterize the problem mathematically. An MADM model could be more effective. Among the MADM approaches, DEA has been applied successfully to many such ranking problems. Therefore, we rank the university majors with DEA. The details and results are presented in Section 4.
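One way to see why the linear program (1)-(3) does not by itself produce a ranking: its objective is linear and its feasible region is the unit simplex, so an optimum is always attained at a vertex, that is, by giving all of the importance to the single major with the largest total influence. The minimal Python sketch below illustrates this. The influence matrix is hypothetical (it is not taken from the questionnaires), and the function name `solve_modm` is an assumption made for the example.

```python
def solve_modm(b):
    """Maximize sum_i sum_j b[i][j] * X[i] subject to sum_i X[i] = 1, X[i] >= 0.

    A linear objective over the unit simplex attains its maximum at a vertex,
    so it suffices to compare the total influence (row sum) of each major.
    """
    totals = [sum(row) for row in b]                    # sum_j b_ij for each major i
    best = max(range(len(b)), key=lambda i: totals[i])  # major with the largest total
    x = [0.0] * len(b)
    x[best] = 1.0                                       # all importance goes to one major
    return x, float(totals[best])


# Hypothetical influence scores: 3 majors (rows) on 3 MSGs (columns).
b = [
    [3, 1, 2],
    [5, 4, 0],
    [2, 2, 2],
]
x, z = solve_modm(b)
print(x, z)
```

Every other major receives an importance rate of 0, so the solution singles out one major rather than ordering all of them.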

3 UNIVERSITY MAJOR CLUSTERING PROBLEM (UMCP) IN IRAN

3.1 THE BACKGROUND OF CLUSTERING AND K-MEANS ALGORITHM

In today's world, data are considered one of the most valuable assets. With the dramatic increase in the amount of available data and the low cost of storage, discovering knowledge in these data has become attractive, and the question of how to process and use data effectively grows ever more important. This calls for new techniques to help analyze and understand the huge amounts of stored data (Liao & Chen, 2004). Among the new techniques developed, data mining is the non-trivial extraction of hidden and potentially useful information from large sets of data. In other words, data mining is the process of discovering significant knowledge, such as patterns, associations, changes, anomalies and significant structures, from large amounts of data stored in databases, data warehouses, or other information repositories (Liao, Chen, & Wu, 2008). The literature contains many data mining models, such as classification, estimation, predictive modeling, clustering, affinity grouping or association rules, description and visualization, and sequential modeling.

TABLE 2
THE CENTROIDS OF UNIVERSITY MAJOR CLUSTERS IN EACH MSG IN PERCENTAGE

Cluster  Financial/  Social/    Industry  Politics  Services  Therapy/  Agriculture  Environment/
         Economical  Religious                                Health                 Natural Resources
1        44          28         18        4         11        2         5            10
2        59          9          70        6         41        18        21           16
3        11          14         0         0         47        4         58           15
4        16          54         4         5         25        8         37           27
5        7           37         4         41        8         1         2            1
6        18          5          18        1         20        8         10           10
7        29          2          23        4         16        44        35           54
8        6           20         4         10        8         2         5            3
9        41          59         19        50        24        6         5            6
10       4           2          3         2         6         1         3            3

Clustering is a widely used technique whose goal is to provide insight into the data by partitioning the objects into disjoint and homogeneous groups (clusters), such that objects in a cluster are more similar to each other than to objects in other clusters. According to Boutsinas and Gnardellis (2002), clustering algorithms have been studied frequently in various fields, including machine learning, neural networks and statistics, among others (Fensel, 2001; Corcho, Lopez, & Perez, 2003; Davies & Fensel, 2003).

The k-means algorithm, first proposed by MacQueen (1967), is the most popular partition-clustering method and has attracted great interest in the literature. Its goal is to partition the objects into k clusters so that the within-group similarity is maximized. The procedure of the k-means method can be described as follows.

1. Place k points into the space represented by the objects that are being clustered. These points represent the initial group centroids.

2. Assign each object to the group that has the closest centroid.

3. When all objects have been assigned, recalculate the positions of the k centroids.

4. Repeat Steps 2 and 3 until the centroids no longer move. This produces a separation of the objects into groups from which the metric to be minimized can be calculated.
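The four steps above can be sketched in a few lines of pure Python. This is an illustrative sketch, not the implementation used in the study: the function name `kmeans`, the toy two-dimensional points, and the seed are hypothetical. Two choices are assumptions of the sketch: the initial centroids are drawn from the data points themselves, and a cluster that ends up empty keeps its previous centroid. The loop stops when no assignment changes, at which point the centroids no longer move either.

```python
import math
import random


def kmeans(points, k, seed=0, max_iter=100):
    """Cluster `points` (a list of equal-length tuples) into k groups."""
    rng = random.Random(seed)
    # Step 1: choose k initial centroids (here: k distinct data points).
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = None
    for _ in range(max_iter):
        # Step 2: assign each object to the group with the closest centroid.
        new_labels = [
            min(range(k), key=lambda l: math.dist(p, centroids[l]))
            for p in points
        ]
        # Step 4: stop once the assignments (and hence the centroids) are stable.
        if new_labels == labels:
            break
        labels = new_labels
        # Step 3: recompute each centroid as the mean of its members.
        for l in range(k):
            members = [p for p, lab in zip(points, labels) if lab == l]
            if members:  # an empty cluster keeps its previous centroid
                centroids[l] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(points, k=2)
print(labels, centroids)
```

Because the result of k-means can depend on the initial centroids, fixing the seed makes a run reproducible; for data that are less cleanly separated than this toy set, different seeds may give different partitions.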

3.2 THE APPLICATIONS OF K-MEANS FOR UNIVERSITY MAJOR CLUSTERING PROBLEM

This study employs k-means in cluster analysis and partitions the 177 university majors into 10 clusters. Let X = {UM_1, UM_2, . . . , UM_n}, n = 177, be the set of majors to be clustered, where UM_i denotes university major i. Using the vector-space model, each major is measured with respect to a set of m = 8 attributes A_1, A_2, . . . , A_8, where A_j denotes MSG j. Each major is therefore described by an 8-dimensional vector UM_i = (UM_i1, . . . , UM_i8), where UM_ij is the magnitude of influence of UM_i on MSG j; UM_ij ∈ ℝ, 1 ≤ i ≤ 177, 1 ≤ j ≤ 8. The 10 initial centroids C_l, l = 1, 2, . . . , 10, are selected randomly. Let C_lj denote the value of dimension j of centroid l. Then, the distance between each major and each centroid is calculated using the Euclidean distance, the most commonly used distance measure in the k-means method (Huang, Chang, & Wu, 2009).

That is, the distance between major UM_i and centroid C_l is obtained by the following formula:

\[ d(UM_i, C_l) = \sqrt{\sum_{j=1}^{8} \left( UM_{ij} - C_{lj} \right)^2} \]

Each major is then assigned to the nearest cluster, and dimension j of the new centroid of cluster l is the arithmetic mean of UM_ij over the majors belonging to cluster l. This procedure iterates until re-assigning the majors produces no new cluster. The algorithm was coded in MATLAB 7. The results of the UMCP and the final centroids of the 10 clusters are presented in Tables 1 and 2, respectively. The centroid of each cluster is its average magnitude of influence on each MSG. Cluster 1 includes the majors that bear mainly on the financial and economical MSG and, to a lesser degree, on the social and religious MSG. Cluster 2 consists of engineering majors; they therefore clearly focus on the industrial MSG. Majors in Cluster 3 relate to individual therapy and health, whereas majors in Cluster 4 focus on public health. Cluster 5 covers majors that teach socially related subjects such as social, political, religious and military affairs. Cluster 6 involves majors that provide services for civilians. Cluster 7 includes majors relating to the Agriculture and Natural Resources MSGs. Cluster 8 consists of majors whose social-service aspects are more influential than their other aspects. Cluster 9 involves majors that can affect almost all eight criteria, although they matter most on the economical, social and religious, political, and service criteria. Cluster 10 covers majors that were given lower values by the decision makers involved; based on the results collected, they may be comparatively less influential.
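As an illustration of the assignment step with the Euclidean distance, the sketch below stores the ten final centroids as they are listed in Table 2 and returns the cluster whose centroid is closest to a given 8-dimensional major vector. The major vector is hypothetical (the per-major scores are not reproduced in this text), and the names `distance` and `nearest_cluster` are assumptions made for the example; this is not the authors' MATLAB code.

```python
import math

# Final centroids of the 10 clusters as listed in Table 2 (percentages per MSG).
CENTROIDS = {
    1: (44, 28, 18, 4, 11, 2, 5, 10),
    2: (59, 9, 70, 6, 41, 18, 21, 16),
    3: (11, 14, 0, 0, 47, 4, 58, 15),
    4: (16, 54, 4, 5, 25, 8, 37, 27),
    5: (7, 37, 4, 41, 8, 1, 2, 1),
    6: (18, 5, 18, 1, 20, 8, 10, 10),
    7: (29, 2, 23, 4, 16, 44, 35, 54),
    8: (6, 20, 4, 10, 8, 2, 5, 3),
    9: (41, 59, 19, 50, 24, 6, 5, 6),
    10: (4, 2, 3, 2, 6, 1, 3, 3),
}


def distance(major, centroid):
    """Euclidean distance between a major vector and a centroid."""
    return math.sqrt(sum((u - c) ** 2 for u, c in zip(major, centroid)))


def nearest_cluster(major, centroids=CENTROIDS):
    """Return the label l of the centroid C_l closest to the major."""
    return min(centroids, key=lambda l: distance(major, centroids[l]))


# Hypothetical major with a strong industrial profile (not a row of the study's data).
hypothetical_major = (55, 10, 65, 5, 40, 15, 20, 15)
print(nearest_cluster(hypothetical_major))
```

With this profile the closest centroid is that of Cluster 2, the cluster described above as consisting of engineering majors.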

4 UNIVERSITY MAJOR RANKING PROBLEM (UMRP) IN IRAN

4.1 THE BACKGROUND OF DATA ENVELOPMENT ANALYSIS (DEA)

DEA is a technique for determining the relative efficiency of a decision making unit (DMU) by comparing it with linear combinations of other DMUs engaged in providing the same outputs from the same inputs [23]. Charnes, Cooper, and Rhodes developed data envelopment analysis (DEA) to evaluate the efficiency of DMUs by identifying the efficiency frontier and comparing each DMU with the frontier [20]. Since DEA is able to estimate efficiency with minimal prior assumptions [28], it has a comparative advantage over approaches that require a priori assumptions, such as standard forms of statistical regression analysis [4]. During the past thirty years, various DEA extensions and models have been developed and have established themselves as powerful analytical tools [23]. The original DEA model presented by Charnes, Cooper, and Rhodes [20] is called the "CCR ratio model"; it uses the ratio of outputs to inputs to measure the efficiency of DMUs. Assume that there are n DMUs with m inputs to produce s outputs. x_ij and y_rj represent the amount of input i and output r for DMU j (j = 1, 2, . . . , n), respectively. Then the ratio form of DEA can be represented as:

\[ \max \; h_o = \frac{\sum_{r=1}^{s} u_r y_{ro}}{\sum_{i=1}^{m} v_i x_{io}} \]

\[ \text{subject to} \quad \frac{\sum_{r=1}^{s} u_r y_{rj}}{\sum_{i=1}^{m} v_i x_{ij}} \le 1, \quad j = 1, 2, \ldots, n; \qquad u_r, v_i \ge 0, \quad r = 1, \ldots, s, \; i = 1, \ldots, m \]

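where the u_r's and the v_i's are the variables and the y_ro's and x_io's are the observed output and input values of the DMU to be evaluated (i.e., DMU_o), respectively [23]. The equivalent linear programming problem using the Charnes–Cooper transformation is [23]:

\[ \max \; \sum_{r=1}^{s} u_r y_{ro} \]

\[ \text{subject to} \quad \sum_{i=1}^{m} v_i x_{io} = 1, \]

\[ \sum_{r=1}^{s} u_r y_{rj} - \sum_{i=1}^{m} v_i x_{ij} \le 0, \quad j = 1, 2, \ldots, n, \]

\[ u_r, v_i \ge 0, \quad r = 1, \ldots, s, \; i = 1, \ldots, m \]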

Banker, Charnes, and Cooper introduced the BCC model by adding a constraint to the CCR model [5]. These models can be solved with the simplex method for each DMU. DMUs with a value of 1 are efficient and the others are inefficient. Nakhaeizadeh and Schnabl proposed using the DEA approach for the selection of data mining algorithms [26]. They argued that an objective evaluation of data mining algorithms must consider all the available positive and negative properties of the algorithms, and that DEA models are able to take both aspects into account. Positive and negative properties of data mining algorithms can be treated as output and input components in DEA, respectively. For example, the overall accuracy rate of a classification algorithm is an output component, and the computation time of an algorithm is an input component. Using existing DEA models, it is possible to give a comprehensive evaluation of data mining algorithms. In this study, the input components are equal. The output components are the scores that each centroid obtained on the criteria (MSGs).

After the LP is solved, the clusters with an efficiency value h_o = 1 (100%) are the efficient clusters. The other clusters do not belong to the efficiency frontier and remain outside of it. As already mentioned, the definition of efficiency is more general than interestingness as suggested by Fayyad et al. (1996). To rank the clusters, one can use the approach suggested by Andersen and Petersen (1993) (the AP-model), whose criterion we call the AP-value. In input-oriented models, the AP-value measures how much an efficient cluster can radially enlarge its input levels while still remaining efficient (the output-oriented case is analogous).

For example, for an input-oriented model, an AP-value of 1.5 means that the cluster remains efficient when its input values are all enlarged by 50%. If the cluster is inefficient, the AP-value is equal to the efficiency value. In this paper, the CCR and BCC models are utilized.
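The following Python sketch shows how CCR efficiencies and AP-values of this kind could be computed for the cluster centroids. It rests on stated assumptions and is not the authors' implementation. The text says only that the inputs are equal; the sketch assumes a single input equal to 1 for every DMU, so that the CCR linear program reduces to maximizing u·y_o subject to u·y_j ≤ 1 for every DMU j and u ≥ 0. For the AP-value, the constraint of the evaluated DMU itself is dropped. The outputs are the centroids as listed in Table 2. The small exact simplex solver and the names `simplex_max` and `dea_scores` are hypothetical helpers written for this example; the BCC model is not covered.

```python
from fractions import Fraction


def simplex_max(c, A, b):
    """Maximize c.x subject to A x <= b, x >= 0, assuming b >= 0.

    Dense tableau simplex with Bland's rule, in exact rational arithmetic.
    """
    m, n = len(A), len(c)
    # Constraint rows: coefficients, slack identity, right-hand side.
    T = [[Fraction(v) for v in A[i]]
         + [Fraction(int(k == i)) for k in range(m)]
         + [Fraction(b[i])] for i in range(m)]
    # Objective row encodes z - c.x = 0.
    T.append([-Fraction(v) for v in c] + [Fraction(0)] * (m + 1))
    basis = [n + i for i in range(m)]  # slack variables start in the basis
    while True:
        # Entering variable: lowest index with a negative reduced cost.
        col = next((j for j in range(n + m) if T[-1][j] < 0), None)
        if col is None:
            return T[-1][-1]  # optimal objective value
        # Leaving variable: minimum ratio, ties broken by lowest basis index.
        ratios = [(T[i][-1] / T[i][col], basis[i], i)
                  for i in range(m) if T[i][col] > 0]
        if not ratios:
            raise ValueError("LP is unbounded")
        row = min(ratios)[2]
        pivot = T[row][col]
        T[row] = [v / pivot for v in T[row]]
        for i in range(m + 1):
            if i != row and T[i][col] != 0:
                f = T[i][col]
                T[i] = [vi - f * vr for vi, vr in zip(T[i], T[row])]
        basis[row] = col


def dea_scores(outputs, super_efficiency=False):
    """CCR score of each DMU under the assumed single constant input of 1.

    With super_efficiency=True the evaluated DMU is left out of the
    constraints, which gives the AP-value.
    """
    scores = []
    for o, y_o in enumerate(outputs):
        rows = [y for j, y in enumerate(outputs)
                if not (super_efficiency and j == o)]
        scores.append(simplex_max(y_o, rows, [1] * len(rows)))
    return scores


# Centroids of the 10 clusters as listed in Table 2, used as DMU outputs.
CENTROIDS = [
    (44, 28, 18, 4, 11, 2, 5, 10),
    (59, 9, 70, 6, 41, 18, 21, 16),
    (11, 14, 0, 0, 47, 4, 58, 15),
    (16, 54, 4, 5, 25, 8, 37, 27),
    (7, 37, 4, 41, 8, 1, 2, 1),
    (18, 5, 18, 1, 20, 8, 10, 10),
    (29, 2, 23, 4, 16, 44, 35, 54),
    (6, 20, 4, 10, 8, 2, 5, 3),
    (41, 59, 19, 50, 24, 6, 5, 6),
    (4, 2, 3, 2, 6, 1, 3, 3),
]
ccr = dea_scores(CENTROIDS)
ap = dea_scores(CENTROIDS, super_efficiency=True)
for cluster, (e, a) in enumerate(zip(ccr, ap), start=1):
    print(f"cluster {cluster}: CCR = {float(e):.3f}, AP-value = {float(a):.3f}")
```

Under these assumptions an efficient cluster has a CCR score of exactly 1 and an AP-value of at least 1, while an inefficient cluster has the same value in both columns. Exact fractions are used so that the comparison with 1 does not depend on floating-point tolerances.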

4.2  THE APPLICATION OF DEA FOR UMRP

In this section, the central point (centroid) of each cluster is treated as a DMU, and the eight criteria mentioned above are applied to evaluate the efficiency of each cluster. The results are summarized in Table 3.

As Table 3 shows, more than one DMU has an efficiency equal to 1. Therefore, we recalculate their efficiency using the A-P model, which yields the final ranking of the clusters. The results are presented in Table 4. As shown in Table 4, Cluster 2 ranks first.

TABLE 3
THE EFFICIENCY OF UNIVERSITY MAJOR CLUSTERS

TABLE 4
THE EFFICIENCY OF UNIVERSITY MAJOR CLUSTERS OBTAINED FROM A-P MODEL

5 CONCLUSION

This paper dealt with the university major ranking problem (UMRP). The UMRP is important because, although all university majors are valuable, they may not have the same priority given the different resources and strategies of a country. The UMRP is a dynamic problem; therefore, a general model is needed to clarify the whole procedure. For this purpose, we employed a flow chart model with three phases: data gathering, data preparation, and decision making. In the first two phases, all the data needed for the third phase are collected and checked against the necessary requirements. In the third phase, all the majors are clustered according to their similarities and differences by the k-means algorithm. Since the UMRP is an MADM problem, we employ DEA to rank the university majors.

As a direction for future research, one might apply other multi-objective decision making procedures.

REFERENCES 

[1]  Berger, M. C. (1988). Predicted future earnings and choice of college

major. Industrial and Labor Relations Review, 41, 418–429.

[2]  Boudarbat, B. (2008). Job-search strategies and the unemployment of

university graduates in Morocco. International Research Journal ofFinance and Economics, 14, 15–33.

[3]  Boutsinas, B., & Gnardellis, T. (2002). On distributing the clustering

process. Pattern Recognition Letters, 23, 999–1008.[4]  Camanho, A. S., & Dyson, R. G. (2005). Cost efficiency

measurement with price uncertainty: A DEA application to  bank branch assessments. European Journal of OperationalResearch, 161, 432–446.

[5]  Chen, X., Skully, M., & Brown, K. (2005). Banking efficiency inChina: Application of DEA to pre- and post-deregulation eras:1993–2000. China Economic Review, 16, 229–245.

[6]  Chen, Y. L., & Cheng, L. C. (2009). Mining maximum consensus sequences from group ranking data. European Journal of Operational Research, 198, 241–251.

[7]  Cheung, Y. M. (2003). K-means: A new generalized K-means clustering algorithm. Pattern Recognition Letters, 24, 2883–2893.

[8]  Corcho, O., Lopez, M. F., & Perez, A. G. (2003). Methodologies, tools and languages for building ontologies. Where is their meeting point? Data and Knowledge Engineering, 46, 41–64.

[9]  Crampton, W. J., Walstrom, K. A., & Schambach, T. P. (2006). Factors influencing major selection by college of business students. Information Systems, 7(1), 226–230.

[10]  Davies, J., & Fensel, D. (2003). Toward the Semantic Web: Ontology-driven knowledge management. John Wiley & Sons Ltd.

[11]  Edirisinghe, N. C. P., & Zhang, X. (2007). Generalized DEA model of fundamental analysis and its application to portfolio optimization. Journal of Banking & Finance, 31, 3311–3335.

[12]  Ertay, T., Ruan, D., & Tuzkaya, U. R. (2006). Integrating data envelopment analysis and analytic hierarchy for the facility layout design in manufacturing systems. Information Sciences, 176, 237–262.

[13]  Fayyad, U. M., Piatetsky-Shapiro, G., & Smyth, P. (1996). Advances in knowledge discovery and data mining. Cambridge: AAAI Press/MIT Press.

[14]  Fensel, D. (2001). Ontologies: A silver bullet for knowledge management and electronic commerce. New York: Springer.

[15]  Huang, S. C., Chang, E. C., & Wu, H. H. (2009). A case study of applying data mining techniques in an outfitter’s customer value analysis. Expert Systems with Applications, 36, 5909–5915.

[16]  Johnes, J. (2006). Measuring teaching efficiency in higher education: An application of data envelopment analysis to economics graduates from UK Universities. European Journal of Operational Research, 174, 443–456.

[17]  Kuo, R. J., Ho, L. M., & Hu, C. M. (2002). Integration of self-organizing feature map and K-means algorithm for market segmentation. Computers and Operations Research, 29(11), 1475–1493.

[18]  Liao, S. H., & Chen, Y. J. (2004). Mining customer knowledge for electronic catalog marketing. Expert Systems with Applications, 27, 521–532.

[19]  Liao, S. H., Chen, C. M., & Wu, C. H. (2008). Mining customer knowledge for product line and brand extension in retailing. Expert Systems with Applications, 35(3), 1763–1776.

[20]  Lipovetsky, S., & Conklin, W. M. (2002). Decision aiding. Robust estimation of priorities in the AHP. European Journal of Operational Research, 137, 110–122.

[21]  Liu, S. T. (2008). A fuzzy DEA/AR approach to the selection of flexible manufacturing systems. Computers & Industrial Engineering, 54, 66–76.

[22]  Kablan, M. M. (2004). Decision support for energy conservation promotion: An analytic hierarchy process approach. Energy Policy, 32, 1151–1158.

[23]  Koksalan, M., & Tuncer, C. (2009). A DEA-based approach to ranking multi-criteria alternatives. International Journal of Information Technology and Decision Making, 8(1), 29–54.

[24]  MacQueen, J. B. (1967). Some methods for classification and analysis of multivariate observations. Proceedings of the 5th Berkeley Symposium on Mathematical Statistics and Probability (Vol. 1, pp. 281–297). Berkeley: University of California Press.

[25]  Mannino, M., Hong, S. N., & Choi, I. J. (2008). Efficiency evaluation of data warehouse operations. Decision Support Systems, 44, 883–898.

[26]  Nakhaeizadeh, G., & Schnabl, A. (1997). Development of multi-criteria metrics for evaluation of data mining algorithms. In Proceedings of the Third International Conference on Knowledge Discovery and Data Mining (KDD’97), Newport Beach, California, August 14–17, 1997 (pp. 37–42).

[27]  Porter, S. R., & Umbach, P. D. (2006). College major choice: An analysis of person–environment fit. Research in Higher Education, 47(4), 429–449.

[28]  Shafer, S. M., & Byrd, T. A. (2000). A framework for measuring the efficiency of organizational investments in information technology using data envelopment analysis. Omega, 28, 125–141.

[29]  Steuer, R. E. (2003). Multiple criteria decision making combined with finance: A categorized bibliographic study. European Journal of Operational Research, 150, 496–515.
