Survey on data mining techniques in heart disease prediction
Presented by
S. Sivagowry
Research Scholar
Bharathidasan University,
Trichy
Under the Guidance of
Dr. M.Durai Raj,
Assistant Professor
School of Computer Science and Engineering,
Bharathidasan University,
Trichy
Data Mining
• Exploration of large data sets to extract hidden and previously unknown patterns.
• Two tasks: Predictive Tasks and Descriptive Tasks
• Predictive tasks predict the value of a specific attribute based on other attributes
– Classification, Regression and Deviation Detection
Contd..
• Descriptive Tasks
– Derive patterns that summarize the relationships in the data
– Clustering, Association Rule Mining and Sequential Pattern Discovery
• Steps in Data Mining
– Data Cleaning, Data Integration, Data Selection, Data Transformation, Data Mining, Pattern Evaluation and Knowledge Representation
Contd.. Medical Data Mining
• Involves a lot of uncertainty, while demanding accuracy
• Quality service at affordable cost is a major challenge
• Data is massive
• Decisions based on a doctor’s experience may fail in some cases
• Data mining in health care: an intelligent diagnostic tool
Heart Disease
• 29.2% of deaths are due to Cardio Vascular Disease (CVD)
• CVD is a leading cause of death in developing countries
Data sets
• Collected from the University of California, Irvine (UCI)
• Cleveland, Hungary, Switzerland, Long Beach and Statlog data sets
• 76 attributes, of which 14 are used
Data Mining Techniques in Heart Disease Prediction
• Clustering
• Classification
• Regression
• Association Rule Mining
Data Mining and Association Rules
• Carlos Ordonez et al. [7] used a simple mapping algorithm that treats numerical and categorical attributes uniformly.
• A decision tree is less suitable: it automatically splits numerical values, and medical data are largely numerical.
• Interpreting experimental results from a decision tree (D.T) is difficult.
• Clustering medical data deserves further research.
• The authors justify the use of association rules (A.R) on medical data.
Contd..
• Deepika [11] used the Pruning Classification Association Rule (PCAR).
• PCAR is derived from the Apriori algorithm.
• Deletes minimum-frequency items along with minimum-frequency item sets.
• Deletes infrequent items from item sets.
• Classifies items based on the frequency of item sets and discovers frequent item sets.
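The level-wise search that Apriori-style methods such as PCAR build on can be sketched as follows. This is a minimal illustration of plain frequent item set discovery, not the PCAR algorithm from [11]; the function name `frequent_itemsets`, the `min_support` count and the sample records are assumptions made up for the example.

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support):
    """Apriori-style level-wise search; min_support is a minimum count."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items and delete the infrequent ones.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    current = {s: c for s, c in counts.items() if c >= min_support}
    frequent = dict(current)
    k = 2
    while current:
        # Join step: build size-k candidates from frequent (k-1)-item sets.
        candidates = set()
        for a, b in combinations(list(current), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(sub) in current for sub in combinations(union, k - 1)
            ):
                # Kept only if every (k-1)-subset is itself frequent.
                candidates.add(union)
        # Count support and delete infrequent candidates.
        current = {}
        for cand in candidates:
            support = sum(1 for t in transactions if cand <= t)
            if support >= min_support:
                current[cand] = support
        frequent.update(current)
        k += 1
    return frequent

# Hypothetical records, invented for illustration only.
records = [
    {"chest_pain", "high_bp", "smoker"},
    {"chest_pain", "high_bp"},
    {"high_bp", "smoker"},
    {"chest_pain", "high_bp", "smoker"},
]
result = frequent_itemsets(records, min_support=3)
```

Here the pair {chest_pain, smoker} occurs in only two records, so it is dropped and no three-item candidate survives the pruning step.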
Data Mining and Classification
• Usha Rani [38] applied an ANN to heart disease data using the feed forward and back propagation algorithms.
• Experiments were run with single and multi layered neural network models.
• Parallelism is implemented to speed up the learning process.
• The neural network provides satisfactory results.
Contd..
• In [3], classification is based on supervised machine learning algorithms.
• The Tanagra tool is used to classify the data.
• Evaluation uses 10 fold cross validation.
• The performance is analysed based on accuracy and the time taken to build the model.
• Naïve Bayes is the better algorithm.
• The table below shows the performance study of the algorithms.
Contd..
Algorithm used    Accuracy    Time taken
Naïve Bayes       52.33%      609 ms
Decision List     52%         719 ms
KNN               45.67%      1000 ms
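The 10 fold cross validation used for the accuracies above can be sketched roughly as follows. This is an illustrative sketch, not the Tanagra procedure from [3]; the helper names (`k_fold_indices`, `cross_validate`) and the majority-class baseline are assumptions introduced for the example.

```python
import random
from collections import Counter

def k_fold_indices(n_samples, k=10, seed=0):
    """Split range(n_samples) into k shuffled folds of near-equal size."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(samples, labels, train_fn, predict_fn, k=10):
    """Mean accuracy over k rounds; assumes len(samples) >= k."""
    accuracies = []
    for test_idx in k_fold_indices(len(samples), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(samples)) if i not in held_out]
        # Train on k-1 folds, then score on the held-out fold.
        model = train_fn([samples[i] for i in train_idx],
                         [labels[i] for i in train_idx])
        correct = sum(predict_fn(model, samples[i]) == labels[i]
                      for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / k

# Hypothetical baseline classifier: always predict the majority class.
def train_majority(samples, labels):
    return Counter(labels).most_common(1)[0][0]

def predict_majority(model, sample):
    return model

folds = k_fold_indices(25, k=10)
mean_accuracy = cross_validate(list(range(20)), [1] * 20,
                               train_majority, predict_majority)
```

Any classifier exposing a train function and a predict function can be passed in place of the baseline, so several algorithms can be compared on identical folds.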
Contd..
• In [24], a novel neuro fuzzy technique is used.
• Preprocessing uses a Genetic Algorithm (GA).
• A four layered fuzzy neural network is used.
• A Radial Basis Function neural network is constructed with 5 inputs, training and normalization in the hidden layer, and an output layer with 1 node.
• In [25], an Intelligent Heart Disease Prediction System (IHDPS) is proposed using Decision Tree, NB and Neural Network.
• NB is the most effective one.
Contd..
• In [1], GA is used to determine the number of attributes.
• NB, D.T. and Classification by Clustering (CVC) are compared.
• DT takes more time to build the model.
• NB performs consistently before and after the reduction of attributes.
• CVC performs poorly.
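Since Naïve Bayes (NB) recurs across these studies, a minimal sketch of the idea for categorical attributes may help. It is not the implementation used in [1] or [25]; the Laplace smoothing, the function names and the tiny data set are assumptions made up for illustration.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Collect class counts and per-attribute value counts."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    values_seen = defaultdict(set)       # attribute index -> distinct values
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            values_seen[j].add(value)
    return class_counts, value_counts, values_seen

def predict_naive_bayes(model, row):
    """Return the class with the highest Laplace-smoothed log-posterior."""
    class_counts, value_counts, values_seen = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_label in class_counts.items():
        score = math.log(n_label / total)  # log prior
        for j, value in enumerate(row):
            count = value_counts[(j, label)][value]
            # Attributes are treated as independent given the class.
            score += math.log((count + 1) / (n_label + len(values_seen[j])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical records (chest pain type, blood pressure), invented for illustration.
rows = [("typical", "high"), ("typical", "high"), ("atypical", "normal"),
        ("atypical", "normal"), ("typical", "normal")]
labels = ["disease", "disease", "healthy", "healthy", "disease"]
model = train_naive_bayes(rows, labels)
```

Training only counts occurrences, which is one plausible reason NB builds its model quickly compared with a decision tree.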
Contd..
• In [30], the k-means clustering algorithm is used.
• The Maximal Frequent Item Set Algorithm (MAFIA) is used.
• A multilayer perceptron network is used, with back propagation as the training algorithm.
• Pseudo code for MAFIA [29]:
MAFIA(C, MFI, Boolean IsHUT) {
    name HUT = C.head ∪ C.tail;
    if HUT is in MFI
        stop generation of children and return
    Count all children, use PEP to trim the tail, and reorder by increasing support
    For each item i in C.trimmed_tail {
        IsHUT = whether i is the first item in the tail
        newNode = C ∪ i
        MAFIA(newNode, MFI, IsHUT)
    }
    if (IsHUT and all extensions are frequent)
        stop search and go back up subtree
    if (C is a leaf and C.head is not in MFI)
        add C.head to MFI
}
Contd..
• In [35], Naïve Bayes is used in a Decision Support in Heart Disease Prediction System.
• NB is found to be the best in heart disease prediction.
• It can be used as a tool for training nurses and medical students in diagnosis.
• It provides new ways of understanding and exploring the data.
• In [6], NB classification is reported to work as the best decision support system.
• In [10], hybridization is used: the neural network is trained using GA, with feed forward and back propagation as the learning algorithm.
• When two more attributes are added to the existing attributes, the neural network shows better performance in both cases.
Contd..
• RIPPER, SVM, Decision Tree and ANN are compared based on Sensitivity, Specificity, Accuracy, Error Rate, TP Rate and FP Rate [20].
• SVM predicts with the lowest error rate and the highest accuracy.
• DM with Fuzzy Logic reduces the number of attributes and the number of tests for the patients [21].
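The comparison measures named above all follow from the four cells of a binary confusion matrix. The sketch below uses the standard definitions; the function name `binary_metrics` and the counts are hypothetical and are not results from [20].

```python
def binary_metrics(tp, fp, tn, fn):
    """Standard measures from true/false positive and negative counts."""
    total = tp + fp + tn + fn
    return {
        "sensitivity": tp / (tp + fn),      # same as the TP rate
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / total,
        "error_rate": (fp + fn) / total,    # equals 1 - accuracy
        "fp_rate": fp / (fp + tn),          # equals 1 - specificity
    }

# Hypothetical counts, invented for illustration only.
metrics = binary_metrics(tp=40, fp=10, tn=45, fn=5)
```

Reporting sensitivity and specificity alongside accuracy matters for medical data, where missing a diseased patient and raising a false alarm carry different costs.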
Data Mining and Clustering
• The K-means clustering algorithm is used for the prediction of heart disease [4].
• The Euclidean distance formula is used.
• NB is slow, and the neural network takes a number of iterations.
• The performance of clustering and classification algorithms is compared [28].
• NB predicts with higher accuracy than the clustering algorithm.
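K-means with the Euclidean distance can be sketched as below. This is a generic illustration rather than the algorithm developed in [4]; taking the first k points as initial centroids is an assumption made to keep the run deterministic (random initialization is more usual), and the sample points are invented.

```python
import math

def euclidean(a, b):
    """Euclidean distance between two equal-length points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def k_means(points, k, iterations=100):
    # Assumption: the first k points serve as the initial centroids.
    centroids = [tuple(p) for p in points[:k]]
    assignment = None
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        new_assignment = [
            min(range(k), key=lambda c: euclidean(p, centroids[c]))
            for p in points
        ]
        if new_assignment == assignment:
            break  # converged: no point changed cluster
        assignment = new_assignment
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:  # an empty cluster keeps its previous centroid
                centroids[c] = tuple(sum(col) / len(members)
                                     for col in zip(*members))
    return centroids, assignment

# Hypothetical two-attribute records, invented for illustration only.
points = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8),
          (8.2, 7.9), (0.9, 1.1), (7.9, 8.1)]
centroids, assignment = k_means(points, k=2)
```

The result depends on the initial centroids, which is one reason clustering outcomes can vary between runs on the same data.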
CONCLUSION
• The Classification task plays a vital role when compared with Clustering, Association Rule Mining and Regression.
• In Classification, each technique has its own merits and demerits.
• Reduction of attributes is considered.
• Hybridization of Classification with Fuzzy Logic can predict with the highest accuracy.
REFERENCES
1. Anbarasi.M, Anupriya and Iyengar “Enhanced Prediction of Heart Disease with Feature Subset Selection using Genetic Algorithm”, International Journal of Engineering and Technology, Vol 2(10), 2010, pp 5370-5376.
2. Annoj P.K., “Clinical decision support system: Risk level prediction of heart disease using Data Mining Algorithms”, Journal of King Saud University - Computer and Information Sciences, 2012, pp 27-40.
3. Asha Rajkumar and Mrs. Sophia Reena, “Diagnosis of Heart Disease using Data Mining Algorithms”, Global Journal of Computer Science and Technology, vol. 10(10), 2010, pp 38-43.
4. Bala Sundar V, “Development of Data Clustering Algorithm for predicting Heart”, IJCA, Vol 48(7), June 2012, pp 8-13.
5. Bhagyashree Ambulkar and Vaishali Borkar “Data Mining in Cloud Computing”, MPGINMC, Recent Trends in Computing, ISSN 0975-8887, 2012, pp 23-26.
6. Bhuvaneswari. R, “Naïve Bayesian Classification Approach in Health Care Application”, International Journal of Computer Science and Telecommunication, vol 3(1), Jan 2012, pp 106-112.
7. Carlos Ordonez, Edward Omincenski and Levien de Braal “Mining Constraint Association Rules to Predict Heart Disease”, Proceeding of 2001, IEEE International Conference of Data Mining, IEEE Computer Society, ISBN-0-7695-1119-8, 2001, pp: 433-440.
8. Cengiz colak.M , Cemiz colak and Hasan Kocatruk “Predicting coronary artery disease using different artificial neural network models”, CAD and Artificial neural network, pp 249-254, 2008.
9. Chaltrali S. Dangare and Sulabha, “Improved Study of Heart Disease Prediction System using Data Mining Classification Techniques”, IJCA, Vol 47(10), pp 44-48, June 2012.
10. Chen A.H., “HDPS: Heart Disease Prediction System”, Computing in Cardiology, ISSN 0276-6574, pp 557-560, 2011.
11. Deepika. N, “Association Rule for Classification of Heart Attack patients”, IJAEST, Vol 11(2), pp 253-257, 2011.
12. Jabbar M.A., “Knowledge discovery from mining association rules for Heart disease Prediction”, JATIT, Vol 41(2), pp 166-174, 2012.
13. Jyothi Soni, Uzma ansari and Dipesh Ansari “Intelligent and Effective Heart Disease Prediction System using Weighted Associate Classifer”, IJCSE, Vol 3(6), pp 2385-2392, June 2011.
14. K.Rajeswari, “Prediction of Risk Score for Heart Disease in India using Machine Intelligence”,IPCSIT, Vol 4, 2011.
15. Kavitha K.S, “Modeling and designing of evolutionary neural network for heart disease prediction”, IJCSI, Vol 7(5), pp 272-283, September 2010.
16. Latha Parthiban and R.Subramanian, “Intelligent Heart Disease Prediction System using CANFIS and Genetic Algorithm”, International Journal of Biological and Life Sciences, Vol 3(3), pp157-160,2007.
17. Liangxiao. J, Harry.Z, Zhihua.C and Jiang.S “One Dependency Augmented Naïve Bayes”, ADMA, pp 186-194, 2005.
18. Mia Shouman, “Using data mining techniques in heart disease diagnosis and treatment”, 978-1-4673-0483-2, Japan-Egypt Conference on Electronics, Communications and Computers, pp 189-193, 2012.
19. Milan Kumari and Sunila Godara, “Review of Data Mining Classification Model in Cardio Vascular Disease diagnosis”, IJCA, 2011.
20. Milan Kumari and Sunila Godara, “Comparative Study of Data Mining Classification Methods in Cardio-Vascular Disease Prediction”, IJCST, Vol 2(2), June 2011.
21. Nidhi Bhatia and Kiran Jyothi, “A Novel Approach for heart disease diagnosis using Data Mining and Fuzzy logic”, IJCA, Vol 54(17), pp 16-21, September 2012.
22. Nithya N.S, Sarumathi. S and Dr. Duraisamy. K “Assessment of the risk factors of Heart Attack using frequent feature Selection Method”, International Journal of Communications and Engineering, Vol 1(1), ISSN 0988-0382, pp 127-133, March 2012.
23. Qeethara Kadhim Al. Shayea, “Artificial neural network in Medical Diagnosis”, IJCSI, Vol 3(2), March 2011.
24. R. Setthukkarase and Kannan “An Intelligent System for mining Temporal rules in Clinical database using Fuzzy neural network”,European Journal of Scientific Research, ISSN 1450-216, Vol 70(3), pp 386-395, 2012.
25. Rafiah Awang and Palaniappan. S “Intelligent Heart Disease Prediction System Using Data Mining techniques”, IJCSNS, Vol 8(8), pp 343-350, Aug 2008.
26. Rafiah Awang and Palaniappan. S “Web based Heart Disease Decision Support System using Data Mining Classification Modeling techniques” , Proceedings of iiWAS, pp 177-187, 2007.
27. Raghu. D.Dr, “Probability Based Heart Disease Prediction using Data Mining Techniques”, IJCST, Vol 2(4), pp 66-68, Dec 2011.
28. Santhi. P, “Improving the performance of Data Mining Algorithm in Health Care data”, IJCST, Vol 2(3), 2011.
29. Setiawan N.A, “ Rule Selection for Coronary Artery Disease Diagnosis Based on Rough Set” ,International Journal of Recent Trends in Engineering, Vol 2(5), pp 198-202, Dec 2009.
30. Shantakumar B.Patil, “Intelligent and Effective Heart Attack Prediction System using Data Mining and Artificial Neural Network”, European Journal of Scientific Research, Vol 31(4), pp 642-656, 2009.
31. Shanthakumar B. Patil, “Extraction of Significant patterns from Heart Disease Ware Houses for Heart Attack Prediction”, IJCSNS, Vol 9(2), pp 228-235, Feb 2009.
32. Shouman.M, Turner.T and Stocker.R, “Applying K-Nearest Neighbour in diagnosing Heart Disease Patients”, International Journal of Information and Education Technology, Vol 2(3), June 2012.
33. Siri Krishnan Wasan, Vasutha Bhatnagar and Harleen Kaur “The Impact of Data Mining techniques on medical diagnostics”, Data Science Journal, Vol 5(19), pp 119-126, October 2006.
34. Srinivas, Kavitha Rani and Dr. Govarthan, “Application of Data Mining Techniques in Health Care and Prediction of Heart Attack”, IJCSE, Vol 2(2), pp 250-255, 2010.
35. Subbulakshmi, Ramesh and Chinna Rao “Decision Support in Heart Disease Prediction System using Naïve Bayes”, IJCSE, ISSN 0976-5166, Vol 2(2), May 2011.
36. Sudha.A, Gayathri.p and Jaishankar. N “Utilization of Data Mining Approaches for prediction of life Threatening Disease Survivability”, IJAC (0975-8887), Vol 14(17), March 2012.
37. Jyothi. S, Ujma.A, Dipesh. S and Sunita. S “Predictive Data Mining for Medical Diagnosis : An Overview of Heart Disease Prediction”, IJCA, Vol 17(8), pp 43-48, March 2011.
38. Usha. K Dr, “Analysis of Heart Disease Dataset using Neural network approach”, IJDKP, Vol 1(5), Sep 2011.