heart disease diagnosis by artificial neural networks

download heart  disease  diagnosis  by  artificial  neural networks

of 5

Transcript of heart disease diagnosis by artificial neural networks

  • 8/17/2019 heart disease diagnosis by artificial neural networks

    1/5

    Journal of Cardiology (2012) 59, 190—194

     Available online at www.sciencedirect.com

     journal homepage: www.elsevier .com/ locate / j jcc

    Original article

    Coronary heart disease diagnosis by artificial neuralnetworks including genetic polymorphisms andclinical parameters

    Oleg Yu. Atkov (MD, PhD) a, Svetlana G. Gorokhova (MD, PhD)b,∗,Alexandr G. Sboev (PhD) c, Eduard V. Generozov (PhD)d,Elena V. Muraseyeva (MD, PhD)e, Svetlana Y. Moroshkinad,Nadezhda N. Cherniyc

    a N.I. Pirogov Russian State Medical University, Moscow, Russiab I.M. Sechenov First Moscow State Medical University, Moscow, Russiac MEPhI National Research Nuclear University, Moscow, Russiad Research Institute of Physical-Chemical Medicine, Moscow, Russiae Central Clinical Hospital No. 2 of Russian Railways JSC, Moscow, Russia

    Received 18 November 2011; accepted 21 November 2011

    Available online 2 January 2012

    KEYWORDSCoronary heartdisease;Risk factors;Artificial neuralnetworks

    Summary The aim of  this study was to develop an artificial neural networks-based (ANNs)

    diagnostic model for coronary heart disease (CHD) using a complex of  traditional and genetic

    factors of this disease. The original database for ANNs included clinical, laboratory, functional,

    coronary angiographic, and genetic [single nucleotide polymorphisms (SNPs)] characteristics of 

    487 patients (327 with CHD caused by coronary atherosclerosis, 160 without CHD). By changing

    the types of ANN and the number of input factors applied, we created models that demonstrated

    64—94% accuracy. The best accuracy was obtained with a neural networks topology of multilayer

    perceptron with two hidden layers for models included by both genetic and non-genetic CHD

    risk factors.

    © 2011 Japanese College of Cardiology. Published by Elsevier Ltd. All rights reserved.

    ∗ Corresponding author at: 43 ul. Losinoostrovskaya, Moscow,107150, Russia I.M. Sechenov First Moscow State Medical Univer-sity, Cardiology Branch of the Department of Family Medicine.Tel.: +7 903 597 91 95; fax: +7 499 160 12 05.

    E-mail address: [email protected] (S.G. Gorokhova).

    Introduction

    The mortality rate caused by coronary heart disease (CHD)has been changing since the 1990s and, in some industrial-ized countries, shows a decline. However, morbidity frommyocardial infarction and angina/CHD remains high in somesubgroups, the highest being male workers and the elderly[1—6]. It poses a serious problem, and the development of 

    0914-5087/$ — see front matter © 2011 Japanese College of Cardiology. Published by Elsevier Ltd. All rights reserved.doi:10.1016/j.jjcc.2011.11.005

    http://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005http://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005http://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005http://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005http://www.sciencedirect.com/science/journal/09145087http://www.elsevier.com/locate/jjccmailto:[email protected]://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005http://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005mailto:[email protected]://www.elsevier.com/locate/jjcchttp://www.sciencedirect.com/science/journal/09145087http://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005

  • 8/17/2019 heart disease diagnosis by artificial neural networks

    2/5

    coronary heart disease diagnostics 191

    methods for CHD prediction is of  immediate scientific andpractical interest.

    Several algorithms of  risk stratification and diagnosticmodels for CHD have already been created. They are basedon different sets of risk factors, established in epidemiologicstudies and randomized controlled trials, such as arterialhypertension, hypercholesterolemia, diabetes mellitus, andsmoking [1,7—12]. Some authors suggest using a coronary

    calcium score [13,14] and retinal vascular signs [15] etc. asadditional factors. Recent developments focus on geneticmarkers for prediction of  CHD and recommend includingsingle nucleotide polymorphisms (SNPs) for risk assessment[16—19].

    It seems that objective difficulties in CHD detection arecaused by the multiplicity of  risk factors to be taken intoconsideration. This makes it necessary to survey the struc-ture of  the variable risk factors and to create an efficientclassification system. Therefore, the development of  algo-rithms for correct classification of  CHD risk factors is animportant problem. Nowadays, computer methods of  intel-ligent data processing are available and applied for thispurpose, and, on this basis, expert medical systems havebeen created.

    One of  these promising methods is artificial neural net-works (ANNs), a highly effective tool used in classificationtasks, as well as to solve many important problems, such assignal enhancement, identification, and prediction of signalsand factors. The important feature of  ANNs is their adap-tivity. This enables them to be applied in cases when it isimpossible to create a strict mathematical model but wherethere is a sufficiently representative set of  samples. Theother important characteristic of  neural networks is theircapacity to generalize input information and to give correctanswers for ‘‘unfamiliar’’ data, which makes them effec-tive in solving complicated classification problems [20,21].

    Today, ANNs are applied in clinical and genetic research.Attempts have been made to create diagnostic models forvarious diseases with the use of  ANNs of  different topolo-gies [22—25]. It is assumed that using complexes of signs of CHD will allow ANNs to not only diagnose, but also to predictclinically significant events, myocardial infarction being thefirst.

    The aim of this study was to develop an ANNs-based diag-nostic model for CHD using the complex of  traditional andgenetic factors of this disease.

    Materials and methods

    The study included 487 patients (males — 425, females —62, mean age: 51.25±9.74 years) hospitalized in CentralClinical Hospital No. 2 of  Russian Railways JSC for a coro-nary angiography to diagnose CHD. All patients underwentuniform standard clinical examinations (laboratory tests,electrocardiogram, Holter monitoring, stress tests, echocar-diography etc.), coronary angiography (Advantx, GeneralElectric, Waukesha, WI, USA), and genetic analysis. Thediagnosis of  CHD was made following both the clinical andcoronarography results. The information obtained from test-ing and genotyping allowed us to create a database of patients that was subsequently used to diagnose CHD usingANNs.

    Risk factors

    CHD risk factors included in the analysis were: age,gender, total cholesterol, high-density lipoprotein (HDL)cholesterol, low-density lipoprotein (LDL) cholesterol, very-low-density lipoprotein (VLDL) cholesterol, triglycerides,cholesterol ratio, fasting plasma glucose, arterial hyper-tension, diabetes mellitus, current tobacco smoking status,

    obesity (Quetelet index, body mass index BMI > 30), and afamily history of CHD. Additional factors taken into accountwere profession (locomotive driving), risk of  fatal cardio-vascular disease according to the European SCORE projectscale (SCORE index) [1], left ventricular ejection fraction,and coronary angiography data.

    Genotyping

    Genotyping was performed by an allele-specific primerextension of  multiplex amplified products and detectionwith a matrix-assisted laser desorption/ionization time-of-flight mass spectroscopy on an AutoPhlex II MALDI-TOF MS(Bruker Daltonics, Billerica, MA, USA). The analyzed panelincludes 14 SNPs localized in genes involved in CHD patho-genesis: lipoprotein lipase [LIPC 250G/A (rs2070895) andLIPC 514C/T (rs1800588)], nitric oxide synthase [NOS E298D(rs1799983)], methylenetetrahydrofolate reductase [MTHFRA223V (rs1801133)], angiotensin-converting enzyme [ ACE Alu Ins/Del I>D (rs4646994)], angiotensinogen [ AGT  M235T(rs699)], and  AGT  T174M (rs4762) variants, angiotensinII type 1 receptor [ AGTR A1166C (rs5186)], plasminogenactivator inhibitor-1 [PAI-1 5G/4G (rs1799889)], and C-reactive protein [CRP-1 (rs1800947)], CRP-2 (rs1417938),CRP-3 (rs1205), CRP-4 (rs3093068), and CRP-5 (rs1130864)].

    Artificial neural network

    The neural network model was created using a multilayerperceptron, the multi-level neural feedforward networktaught by the statistical backpropagation of  error. The setsof  variable parameters were selected in order to adjustANN models by pairwise correlation between the databaseparameters and CHD diagnosis.

    Accuracy of  models was improved by a genetic algo-rithm with different optimization parameters [26] includingnumber of  neurons in the hidden layer, number of  inputsto the neural network, and slope coefficient of  activationfunctions. NeuroSolutions 5.0 development environment

    (NeuroDimension Inc., Gainesville, FL, USA) was used tocheck the possibilities of optimization.

    The ANN was created by inputting the specified variablefactors. The task to solve was of  two-class classification:‘‘1’’ — CHD, ‘‘0’’ — healthy. A total of  287 examples wereused for teaching, 100 for cross-validation, and 100 for test-ing.

    Results

    Based on the results of  the clinical instrumental research,CHD caused by coronary atherosclerosis was diagnosed in

  • 8/17/2019 heart disease diagnosis by artificial neural networks

    3/5

    192 O.Yu. Atkov et al.

    Table 1 Accuracy of ANN models for CHD diagnosis.a

    Model Factors Accuracy (%)

    I Age, profession, diabetes, arterial hypertension, smoking, obesity, family anamnesis of 

    CHD, glucose, cholesterol

    64

    II Age, profession, diabetes, arterial hypertension, smoking, obesity, family anamnesis of 

    CHD, glucose, cholesterol

    77

    III Age, profession, diabetes, arterial hypertension, smoking, obesity, family anamnesis of CHD, glucose, total cholesterol, HDL, LDL, VLDL, triglycerides, cholesterol ratio 83

    IV Age, profession, diabetes, arterial hypertension, smoking, obesity, heredity, glucose,

    total cholesterol, HDL, LDL, VLDL, triglycerides, cholesterol ratio, coronary angiography

    data

    91

    V NOS,  ACE ,  AGT-235,  AGT-174,  AGTR, CRP-1, CRP-2, CRP-3 SNPs 90

    VI NOS,  ACE ,  AGT-235,  AGT-174,  AGTR, CRP-1, CRP-2, CRP-3 SNPs, SCORE index 83

    VII   NOS,  ACE ,  AGT-235,  AGT-174,  AGTR, CRP-1, CRP-2, CRP-3 SNPs, coronary angiography

    data

    89

    VIII NOS,  ACE ,  AGT-235,  AGT-174,  AGTR, CRP-1, CRP-2, CRP-3 SNPs, SCORE index, coronary

    angiography data

    93

    IX NOS,  ACE ,  AGT-235,  AGT-174,  AGTR, CRP-1, CRP-2, CRP-3 SNPs, HDL, LDL, glucose 90

    X NOS,  ACE ,  AGT-235,  AGT-174,  AGTR, CRP-1, CRP-2, CRP-3 SNPs, age, smoking, obesity,

    family anamnesis of CHD, HDL, LDL

    88

    CHD, coronary heart disease; HDL, high-density lipoprotein cholesterol; LDL, low-density lipoprotein cholesterol; VLDL, very-low-densitylipoprotein cholesterol; SNP, single nucleotide polymorphisms.

    a In all cases, the multilayer perceptron neuronal network topology with two buried 4-neuron layers was used.

    327 (67.2%) patients. A total of 160 (32.8%) patients had anintact coronary artery wall and no evidence of CHD.

    Correlation analysis between the specified variable fac-tors and CHD allowed us to choose 32 factors associated withthe disease. These factors were analyzed by ANNs.

    Approaches using different ANN types and a variablenumber of  input factors (from 5 to 10) led to the mod-els with 64—94% diagnostic accuracy. The best result (94%)was achieved in a multilayer perceptron (MLP) model withtwo buried layers and 10 factors (profession, LDL, HDL,triglycerides, cholesterol rate, SCORE index, left ventricularejection fraction, family CHD history, coronary arteriogra-phy data, PAI gene). On the other hand, the same ANNtype with 5 factors (coronary arteriography data, choles-terol rate, SCORE index, left ventricular ejection fraction,PAI gene) had a lower diagnostic accuracy (78%). How-ever, the same 10 factors analyzed by other ANN typesyielded a 79% result. This suggests that the diagnostic accu-racy depends on the ANN type and the number of  variablefactors.

    The next step included an analysis of  10 models of  CHDprediction formed by different combinations of examinationblocks (demographic characteristics, CHD history, laboratorytests, echocardiography and coronary angiography data,genes) by the MLP with two buried 4-neuron layers. The num-ber of variable input factors ranged from 8 (models I and V)to 15 (model IV). The results are shown in Table 1. The mini-mal accuracy of 64% was obtained in model I, which included8 non-genetic factors. Models IV, V, VIII, and IX, which hadsignificantly different sets of  variables, showed ≥90% diag-nostic accuracy. Model IV included only non-genetic factors;model V included only eight SNPs (NOS,  ACE ,  AGT-235,  AGT-174, AGTR, CRP-1, CRP-2 and CRP-3); model VIII included the

    same SNPs in combination with coronary angiography dataand SCORE index; model IX included the same SNPs and HDL,LDL, and glucose. Thereby the optimal accuracy was moreoften achieved when a diagnostic model included geneticfactors.

    Discussion

    One of  the modern approaches to the solution of  classi-fication problems is intelligent data processing based onresolving optimization tasks using ANNs. Our results suggestthat ANNs may also be used to create a highly accurate andeffective model for CHD prediction.

    In this research we have examined the influence of  dif-ferent parameters, teaching methods, and neural networktopologies on the accuracy of  CHD diagnosis. We revealedthat the optimal topology to resolve this classification taskis a MLP with two buried layers. The optimal input parame-ters are 8—10 of  the most significant factors: the use of  all

    factors seems to excessively complicate the model, while,at the same time, smaller numbers do not provide essentialinformation for resolving this problem. It is of  interest tonote that, in some models, the accuracy of  CHD diagnosiswas lower than 90% in spite of  including coronary arteriog-raphy data, which are considered the ‘‘gold standard’’ forCHD diagnosis in clinical practice.

    Publications on the possibilities of  cardiologic applica-tion of  artificial intelligence, ANN, first of  all, are usuallylimited to different combinations of  traditional laboratoryand instrumental methods [22—25]. Therefore, the noveltyof our study is the inclusion of genes as risk factors in order toestimate capabilities of CHD prediction. In this connection,

  • 8/17/2019 heart disease diagnosis by artificial neural networks

    4/5

    coronary heart disease diagnostics 193

    it is important to note that the accuracy of  the diagnosisbased on genetic factors only was found to be almost equalto that in the model combining 15 non-genetic factors (90%and 91%, respectively). The most informative (93%) was themodel that included 8 SNPs, SCORE index, and coronary arte-riography data. Taking into account the fact that geneticinformation is a permanent parameter, it can be assumedthat analysis of NOS,  ACE ,  AGT-3,  AGT-4, AGTR, CRP-1, CRP-

    2, and CRP-3 SNPs enables prediction of  CHD with a highaccuracy.

    The diagnostic accuracy can be improved not only byincreasing the number of genetic markers, but also by theirprecise selection. Good candidates for consideration maybe genes involved in vascular cell growth, apoptosis, andinflammation, or others associated with CHD, e.g.  ABCA1(ATP-binding cassette-transporter A1), CYP1A2 (cytochromeP450),  ADRB group (-adrenoreceptors), HSF1 (heat shockfactors) [27], CLOCK , and BMAL1 [28]. The last ones areof  interest because they are associated with the regulationof  circadian rhythms, including those in the cardiovascularsystem [29].

    The obtained experience in the development of  neuralnetwork models for CHD prediction constitutes a basis forthe design of applied software products for the diagnosis of cardiovascular diseases, including screening tests.

    Conclusion

    The purpose of  this study was to create CHD diagnos-tic models with appropriate analytical characteristics usingMLP ANNs. The best accuracy was obtained in models thatincluded both genetic and non-genetic factors associatedwith the disease. The models of >90% accuracy may serve asthe basis for the development of software tools for diagnosisand prediction of CHD.

    References

    [1] Graham I, Atar D, Borch-Johnsen K, Boysen G, Burell G,Cifkova R, Dallongeville J, De Backer G, Ebrahim S, GjelsvikB, Herrmann-Lingen C, Hoes A, Humphries S, Knapton M, PerkJ, et al. European guidelines on cardiovascular disease preven-tion in clinical practice: executive summary. Fourth Joint TaskForce of the European Society of Cardiology and other soci-eties on cardiovascular disease prevention in clinical practice(constituted by representatives of nine societies and by invitedexperts). Eur Heart J 2007;28:2375—414.

    [2] Rosamond W, Flegal K, Friday G, Furie K, Go A, Greenlund K,Haase N, Ho M, Howard V, Kissela B, Kittner S, Lloyd-JonesD, McDermott M, Meigs J, Moy C, et al. Heart disease andstroke statistics — 2007 update: a report from the AmericanHeart Association Statistics Committee and Stroke StatisticsSubcommittee. Circulation 2007;115:e69—171.

    [3] Li C, Balluz LS, Okoro CA, Strine TW, Lin JM, Town M, GarvinW, Murphy W, Bartoli W, Valluru B. Centers for Disease Controland Prevention (CDC). Division of Behavioral Surveillance, Pub-lic Health Surveillance Program Office, Office of Surveillance,Epidemiology, and Laboratory Services. Surveillance of certainhealth behaviors and conditions among States and selectedlocal areas — behavioral risk factor surveillance system, UnitedStates, 2009. MMWR Surveill Summ 2011;60:1—250.

    [4] Ford ES, Capewell S. Coronary heart disease mortality amongyoung adults in the U.S. from 1980 through 2002: concealedleveling of mortality rates. J Am Coll Cardiol 2007;50:2128—32.

    [5] Rumana N, Kita Y, Turin TC, Murakami Y, Sugihara H, MoritaY, Tomioka N, Okayama A, Nakamura Y, Abbott RD, UeshimaH. Trend of increase in the incidence of acute myocardialinfarction in a Japanese population: Takashima AMI Registry,1990—2001. Am J Epidemiol 2008;167:1358—64.

    [6] Pogosova GV, Oganov RG, Koltunov IE, Sokolova OIu, Pozdni-

    akov IuM, Vygodin VA, Sapunova ID, Ryzhikova IB, Karpova AV,Eliseeva NA. Monitoring of secondary prevention of ischemicheart disease in Russia and European countries: results of international multicenter study EUROASPIRE III. Kardiologiia2011;51:34—40.

    [7] Assmann G, Cullen P, Schulte H. Simple scoring scheme for cal-culating the risk of acute coronary events based on the 10-yearfollow-up of the prospective cardiovascular Munster (PROCAM)study. Circulation 2002;105:310—5.

    [8] Wilson PWF, Castelli WP, Kannel WB. Coronary risk predic-tion in adults (the Framingham Heart Study). Am J Cardiol1987;59:91—4.

    [9] Conroy RM, Pyörälä K, Fitzgerald AP, Sans S, Menotti A, DeBacker G, De Bacquer D, Ducimetière P, Jousilahti P, Keil U,Njølstad I, Oganov RG, Thomsen T, Tunstall-Pedoe H, Tverdal A,

    et al. Estimation of ten-year risk of fatal cardiovascular diseasein Europe: the SCORE project. Eur Heart J 2003;24:987—1003.

    [10] Hippisley-Cox J, Coupland C, Vinogradova Y, Robson J, May M,Brindle P. Derivation and validation of QRISK, a new cardiovas-cular disease risk score for the United Kingdom: prospectiveopen cohort study. BMJ 2007;335:136.

    [11] Ridker PM, Buring JE, Rifai N, Cook NR. Development and val-idation of improved algorithms for the assessment of globalcardiovascular risk in women: the Reynolds Risk Score. JAMA2007;297:611—9.

    [12] Brindle P, Beswick A, Fahey T, Ebrahim S. Accuracy and impactof risk assessment in the primary prevention of cardiovasculardisease: a systematic review. Heart 2006;92:1752—9.

    [13] Yamamoto H, Ohashi N, Ishibashi K, Utsunomiya H, Kunita E,Oka T, Horiguchi J, Kihara Y. Coronary calcium score as a predic-

    tor for coronary artery disease and cardiac events in Japanesehigh-risk patients. Circ J 2011;75:2424—31.

    [14] Williams M, Shaw LJ, Raggi P, Morris D, Vaccarino V, Liu ST,Weinstein SR, Mosler TP, Tseng PH, Flores FR, Nasir K, Budoff M. Prognostic value of number and site of calcified coronarylesions compared with the total score. JACC Cardiovasc Imag-ing 2008;1:61—9.

    [15] Tedeschi-Reiner E, Strozzi M, Skoric B, Reiner Z. Relation of atherosclerotic changes in retinal arteries to the extent of coronary artery disease. Am J Cardiol 2005;96:1107—9.

    [16] Casas J, Cooper J, Miller GJ, Hingorani AD, Humphries SE.Investigating the genetic determinants of cardiovascular dis-ease using candidate genes and meta-analysis of associationstudies. Ann Hum Genet 2006;70:145—69.

    [17] Brautbar A, Ballantyne CM, Lawson K, Nambi V, ChamblessL, Folsom AR, Willerson JT, Boerwinkle E. Impact of addinga single allele in the 9p21 locus to traditional risk factorson reclassification of coronary heart disease risk and impli-cations for lipid-modifying therapy in the Atherosclerosis Riskin Communities (ARIC) study. Circ Cardiovasc Genet 2009;2:279—85.

    [18] Kullo IJ, Cooper LT. Early identification of cardiovascu-lar risk using genomics and proteomics. Nat Rev Cardiol2010;7:309—17.

    [19] Ding K, Kullo IJ. Genome-wide association studies foratherosclerotic vascular disease and its risk factors. Circ Car-diovasc Genet 2009;2:63—72.

    [20] Haykin S. Neural networks: a comprehensive foundation. NewYork: Macmillan College Publishing Company; 1994.

  • 8/17/2019 heart disease diagnosis by artificial neural networks

    5/5

    194 O.Yu. Atkov et al.

    [21] Obenshain MK. Application of data mining techniques to health-care data. Infect Control Hosp Epidemiol 2004;25:690—5.

    [22] Allison JS, Heo J, Iskandrian AE. Artificial neural network mod-eling of stress single-photon emission computed tomographicimaging for detecting extensive coronary artery disease. Am JCardiol 2005;95:178—81.

    [23] Colak MC, Colak C, Kocatürk H, Sağiroğlu S, Barutçu I.Predicting coronary artery disease using different artifi-cial neural network models. Anadolu Kardiyol Derg 2008;8:

    249—54.[24] Papaloukas C, Fotiadis DI, Likas A, Michalis LK. An ischemia

    detection method based on artificial neural networks. Artif Intell Med 2002;24:167—78.

    [25] Scott JA, Aziz K, Yasuda T, Gewirtz H. Integration of clinicaland imaging data to predict the presence of coronary arterydisease with the use of neural networks. Coron Artery Dis2004;15:427—34.

    [26] Goldberg DE. Genetic algorithms in search, optimization andmachine learning. New York: Addison Wesley; 1989.

    [27] Young ME. Anticipating anticipation: pursuing identificationof cardiomyocyte circadian clock function. J Appl Physiol2009;107:1339—47.

    [28] Yu S, Li G. MicroRNA expression and function in cardiacischemic injury. J Cardiovasc Transl Res 2010;3:241—5.

    [29] Takeda N, Maemura K. Circadian clock and cardiovascular dis-ease. J Cardiol 2011;57:249—56.