heart disease diagnosis by artificial neural networks
-
Upload
jatin-gera -
Category
Documents
-
view
215 -
download
0
Transcript of heart disease diagnosis by artificial neural networks
-
8/17/2019 heart disease diagnosis by artificial neural networks
1/5
Journal of Cardiology (2012) 59, 190—194
Available online at www.sciencedirect.com
journal homepage: www.elsevier .com/ locate / j jcc
Original article
Coronary heart disease diagnosis by artificial neuralnetworks including genetic polymorphisms andclinical parameters
Oleg Yu. Atkov (MD, PhD) a, Svetlana G. Gorokhova (MD, PhD)b,∗,Alexandr G. Sboev (PhD) c, Eduard V. Generozov (PhD)d,Elena V. Muraseyeva (MD, PhD)e, Svetlana Y. Moroshkinad,Nadezhda N. Cherniyc
a N.I. Pirogov Russian State Medical University, Moscow, Russiab I.M. Sechenov First Moscow State Medical University, Moscow, Russiac MEPhI National Research Nuclear University, Moscow, Russiad Research Institute of Physical-Chemical Medicine, Moscow, Russiae Central Clinical Hospital No. 2 of Russian Railways JSC, Moscow, Russia
Received 18 November 2011; accepted 21 November 2011
Available online 2 January 2012
KEYWORDSCoronary heartdisease;Risk factors;Artificial neuralnetworks
Summary The aim of this study was to develop an artificial neural networks-based (ANNs)
diagnostic model for coronary heart disease (CHD) using a complex of traditional and genetic
factors of this disease. The original database for ANNs included clinical, laboratory, functional,
coronary angiographic, and genetic [single nucleotide polymorphisms (SNPs)] characteristics of
487 patients (327 with CHD caused by coronary atherosclerosis, 160 without CHD). By changing
the types of ANN and the number of input factors applied, we created models that demonstrated
64—94% accuracy. The best accuracy was obtained with a neural networks topology of multilayer
perceptron with two hidden layers for models included by both genetic and non-genetic CHD
risk factors.
© 2011 Japanese College of Cardiology. Published by Elsevier Ltd. All rights reserved.
∗ Corresponding author at: 43 ul. Losinoostrovskaya, Moscow,107150, Russia I.M. Sechenov First Moscow State Medical Univer-sity, Cardiology Branch of the Department of Family Medicine.Tel.: +7 903 597 91 95; fax: +7 499 160 12 05.
E-mail address: [email protected] (S.G. Gorokhova).
Introduction
The mortality rate caused by coronary heart disease (CHD)has been changing since the 1990s and, in some industrial-ized countries, shows a decline. However, morbidity frommyocardial infarction and angina/CHD remains high in somesubgroups, the highest being male workers and the elderly[1—6]. It poses a serious problem, and the development of
0914-5087/$ — see front matter © 2011 Japanese College of Cardiology. Published by Elsevier Ltd. All rights reserved.doi:10.1016/j.jjcc.2011.11.005
http://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005http://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005http://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005http://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005http://www.sciencedirect.com/science/journal/09145087http://www.elsevier.com/locate/jjccmailto:[email protected]://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005http://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005mailto:[email protected]://www.elsevier.com/locate/jjcchttp://www.sciencedirect.com/science/journal/09145087http://localhost/var/www/apps/conversion/tmp/scratch_3/dx.doi.org/10.1016/j.jjcc.2011.11.005
-
8/17/2019 heart disease diagnosis by artificial neural networks
2/5
coronary heart disease diagnostics 191
methods for CHD prediction is of immediate scientific andpractical interest.
Several algorithms of risk stratification and diagnosticmodels for CHD have already been created. They are basedon different sets of risk factors, established in epidemiologicstudies and randomized controlled trials, such as arterialhypertension, hypercholesterolemia, diabetes mellitus, andsmoking [1,7—12]. Some authors suggest using a coronary
calcium score [13,14] and retinal vascular signs [15] etc. asadditional factors. Recent developments focus on geneticmarkers for prediction of CHD and recommend includingsingle nucleotide polymorphisms (SNPs) for risk assessment[16—19].
It seems that objective difficulties in CHD detection arecaused by the multiplicity of risk factors to be taken intoconsideration. This makes it necessary to survey the struc-ture of the variable risk factors and to create an efficientclassification system. Therefore, the development of algo-rithms for correct classification of CHD risk factors is animportant problem. Nowadays, computer methods of intel-ligent data processing are available and applied for thispurpose, and, on this basis, expert medical systems havebeen created.
One of these promising methods is artificial neural net-works (ANNs), a highly effective tool used in classificationtasks, as well as to solve many important problems, such assignal enhancement, identification, and prediction of signalsand factors. The important feature of ANNs is their adap-tivity. This enables them to be applied in cases when it isimpossible to create a strict mathematical model but wherethere is a sufficiently representative set of samples. Theother important characteristic of neural networks is theircapacity to generalize input information and to give correctanswers for ‘‘unfamiliar’’ data, which makes them effec-tive in solving complicated classification problems [20,21].
Today, ANNs are applied in clinical and genetic research.Attempts have been made to create diagnostic models forvarious diseases with the use of ANNs of different topolo-gies [22—25]. It is assumed that using complexes of signs of CHD will allow ANNs to not only diagnose, but also to predictclinically significant events, myocardial infarction being thefirst.
The aim of this study was to develop an ANNs-based diag-nostic model for CHD using the complex of traditional andgenetic factors of this disease.
Materials and methods
The study included 487 patients (males — 425, females —62, mean age: 51.25±9.74 years) hospitalized in CentralClinical Hospital No. 2 of Russian Railways JSC for a coro-nary angiography to diagnose CHD. All patients underwentuniform standard clinical examinations (laboratory tests,electrocardiogram, Holter monitoring, stress tests, echocar-diography etc.), coronary angiography (Advantx, GeneralElectric, Waukesha, WI, USA), and genetic analysis. Thediagnosis of CHD was made following both the clinical andcoronarography results. The information obtained from test-ing and genotyping allowed us to create a database of patients that was subsequently used to diagnose CHD usingANNs.
Risk factors
CHD risk factors included in the analysis were: age,gender, total cholesterol, high-density lipoprotein (HDL)cholesterol, low-density lipoprotein (LDL) cholesterol, very-low-density lipoprotein (VLDL) cholesterol, triglycerides,cholesterol ratio, fasting plasma glucose, arterial hyper-tension, diabetes mellitus, current tobacco smoking status,
obesity (Quetelet index, body mass index BMI > 30), and afamily history of CHD. Additional factors taken into accountwere profession (locomotive driving), risk of fatal cardio-vascular disease according to the European SCORE projectscale (SCORE index) [1], left ventricular ejection fraction,and coronary angiography data.
Genotyping
Genotyping was performed by an allele-specific primerextension of multiplex amplified products and detectionwith a matrix-assisted laser desorption/ionization time-of-flight mass spectroscopy on an AutoPhlex II MALDI-TOF MS(Bruker Daltonics, Billerica, MA, USA). The analyzed panelincludes 14 SNPs localized in genes involved in CHD patho-genesis: lipoprotein lipase [LIPC 250G/A (rs2070895) andLIPC 514C/T (rs1800588)], nitric oxide synthase [NOS E298D(rs1799983)], methylenetetrahydrofolate reductase [MTHFRA223V (rs1801133)], angiotensin-converting enzyme [ ACE Alu Ins/Del I>D (rs4646994)], angiotensinogen [ AGT M235T(rs699)], and AGT T174M (rs4762) variants, angiotensinII type 1 receptor [ AGTR A1166C (rs5186)], plasminogenactivator inhibitor-1 [PAI-1 5G/4G (rs1799889)], and C-reactive protein [CRP-1 (rs1800947)], CRP-2 (rs1417938),CRP-3 (rs1205), CRP-4 (rs3093068), and CRP-5 (rs1130864)].
Artificial neural network
The neural network model was created using a multilayerperceptron, the multi-level neural feedforward networktaught by the statistical backpropagation of error. The setsof variable parameters were selected in order to adjustANN models by pairwise correlation between the databaseparameters and CHD diagnosis.
Accuracy of models was improved by a genetic algo-rithm with different optimization parameters [26] includingnumber of neurons in the hidden layer, number of inputsto the neural network, and slope coefficient of activationfunctions. NeuroSolutions 5.0 development environment
(NeuroDimension Inc., Gainesville, FL, USA) was used tocheck the possibilities of optimization.
The ANN was created by inputting the specified variablefactors. The task to solve was of two-class classification:‘‘1’’ — CHD, ‘‘0’’ — healthy. A total of 287 examples wereused for teaching, 100 for cross-validation, and 100 for test-ing.
Results
Based on the results of the clinical instrumental research,CHD caused by coronary atherosclerosis was diagnosed in
-
8/17/2019 heart disease diagnosis by artificial neural networks
3/5
192 O.Yu. Atkov et al.
Table 1 Accuracy of ANN models for CHD diagnosis.a
Model Factors Accuracy (%)
I Age, profession, diabetes, arterial hypertension, smoking, obesity, family anamnesis of
CHD, glucose, cholesterol
64
II Age, profession, diabetes, arterial hypertension, smoking, obesity, family anamnesis of
CHD, glucose, cholesterol
77
III Age, profession, diabetes, arterial hypertension, smoking, obesity, family anamnesis of CHD, glucose, total cholesterol, HDL, LDL, VLDL, triglycerides, cholesterol ratio 83
IV Age, profession, diabetes, arterial hypertension, smoking, obesity, heredity, glucose,
total cholesterol, HDL, LDL, VLDL, triglycerides, cholesterol ratio, coronary angiography
data
91
V NOS, ACE , AGT-235, AGT-174, AGTR, CRP-1, CRP-2, CRP-3 SNPs 90
VI NOS, ACE , AGT-235, AGT-174, AGTR, CRP-1, CRP-2, CRP-3 SNPs, SCORE index 83
VII NOS, ACE , AGT-235, AGT-174, AGTR, CRP-1, CRP-2, CRP-3 SNPs, coronary angiography
data
89
VIII NOS, ACE , AGT-235, AGT-174, AGTR, CRP-1, CRP-2, CRP-3 SNPs, SCORE index, coronary
angiography data
93
IX NOS, ACE , AGT-235, AGT-174, AGTR, CRP-1, CRP-2, CRP-3 SNPs, HDL, LDL, glucose 90
X NOS, ACE , AGT-235, AGT-174, AGTR, CRP-1, CRP-2, CRP-3 SNPs, age, smoking, obesity,
family anamnesis of CHD, HDL, LDL
88
CHD, coronary heart disease; HDL, high-density lipoprotein cholesterol; LDL, low-density lipoprotein cholesterol; VLDL, very-low-densitylipoprotein cholesterol; SNP, single nucleotide polymorphisms.
a In all cases, the multilayer perceptron neuronal network topology with two buried 4-neuron layers was used.
327 (67.2%) patients. A total of 160 (32.8%) patients had anintact coronary artery wall and no evidence of CHD.
Correlation analysis between the specified variable fac-tors and CHD allowed us to choose 32 factors associated withthe disease. These factors were analyzed by ANNs.
Approaches using different ANN types and a variablenumber of input factors (from 5 to 10) led to the mod-els with 64—94% diagnostic accuracy. The best result (94%)was achieved in a multilayer perceptron (MLP) model withtwo buried layers and 10 factors (profession, LDL, HDL,triglycerides, cholesterol rate, SCORE index, left ventricularejection fraction, family CHD history, coronary arteriogra-phy data, PAI gene). On the other hand, the same ANNtype with 5 factors (coronary arteriography data, choles-terol rate, SCORE index, left ventricular ejection fraction,PAI gene) had a lower diagnostic accuracy (78%). How-ever, the same 10 factors analyzed by other ANN typesyielded a 79% result. This suggests that the diagnostic accu-racy depends on the ANN type and the number of variablefactors.
The next step included an analysis of 10 models of CHDprediction formed by different combinations of examinationblocks (demographic characteristics, CHD history, laboratorytests, echocardiography and coronary angiography data,genes) by the MLP with two buried 4-neuron layers. The num-ber of variable input factors ranged from 8 (models I and V)to 15 (model IV). The results are shown in Table 1. The mini-mal accuracy of 64% was obtained in model I, which included8 non-genetic factors. Models IV, V, VIII, and IX, which hadsignificantly different sets of variables, showed ≥90% diag-nostic accuracy. Model IV included only non-genetic factors;model V included only eight SNPs (NOS, ACE , AGT-235, AGT-174, AGTR, CRP-1, CRP-2 and CRP-3); model VIII included the
same SNPs in combination with coronary angiography dataand SCORE index; model IX included the same SNPs and HDL,LDL, and glucose. Thereby the optimal accuracy was moreoften achieved when a diagnostic model included geneticfactors.
Discussion
One of the modern approaches to the solution of classi-fication problems is intelligent data processing based onresolving optimization tasks using ANNs. Our results suggestthat ANNs may also be used to create a highly accurate andeffective model for CHD prediction.
In this research we have examined the influence of dif-ferent parameters, teaching methods, and neural networktopologies on the accuracy of CHD diagnosis. We revealedthat the optimal topology to resolve this classification taskis a MLP with two buried layers. The optimal input parame-ters are 8—10 of the most significant factors: the use of all
factors seems to excessively complicate the model, while,at the same time, smaller numbers do not provide essentialinformation for resolving this problem. It is of interest tonote that, in some models, the accuracy of CHD diagnosiswas lower than 90% in spite of including coronary arteriog-raphy data, which are considered the ‘‘gold standard’’ forCHD diagnosis in clinical practice.
Publications on the possibilities of cardiologic applica-tion of artificial intelligence, ANN, first of all, are usuallylimited to different combinations of traditional laboratoryand instrumental methods [22—25]. Therefore, the noveltyof our study is the inclusion of genes as risk factors in order toestimate capabilities of CHD prediction. In this connection,
-
8/17/2019 heart disease diagnosis by artificial neural networks
4/5
coronary heart disease diagnostics 193
it is important to note that the accuracy of the diagnosisbased on genetic factors only was found to be almost equalto that in the model combining 15 non-genetic factors (90%and 91%, respectively). The most informative (93%) was themodel that included 8 SNPs, SCORE index, and coronary arte-riography data. Taking into account the fact that geneticinformation is a permanent parameter, it can be assumedthat analysis of NOS, ACE , AGT-3, AGT-4, AGTR, CRP-1, CRP-
2, and CRP-3 SNPs enables prediction of CHD with a highaccuracy.
The diagnostic accuracy can be improved not only byincreasing the number of genetic markers, but also by theirprecise selection. Good candidates for consideration maybe genes involved in vascular cell growth, apoptosis, andinflammation, or others associated with CHD, e.g. ABCA1(ATP-binding cassette-transporter A1), CYP1A2 (cytochromeP450), ADRB group (-adrenoreceptors), HSF1 (heat shockfactors) [27], CLOCK , and BMAL1 [28]. The last ones areof interest because they are associated with the regulationof circadian rhythms, including those in the cardiovascularsystem [29].
The obtained experience in the development of neuralnetwork models for CHD prediction constitutes a basis forthe design of applied software products for the diagnosis of cardiovascular diseases, including screening tests.
Conclusion
The purpose of this study was to create CHD diagnos-tic models with appropriate analytical characteristics usingMLP ANNs. The best accuracy was obtained in models thatincluded both genetic and non-genetic factors associatedwith the disease. The models of >90% accuracy may serve asthe basis for the development of software tools for diagnosisand prediction of CHD.
References
[1] Graham I, Atar D, Borch-Johnsen K, Boysen G, Burell G,Cifkova R, Dallongeville J, De Backer G, Ebrahim S, GjelsvikB, Herrmann-Lingen C, Hoes A, Humphries S, Knapton M, PerkJ, et al. European guidelines on cardiovascular disease preven-tion in clinical practice: executive summary. Fourth Joint TaskForce of the European Society of Cardiology and other soci-eties on cardiovascular disease prevention in clinical practice(constituted by representatives of nine societies and by invitedexperts). Eur Heart J 2007;28:2375—414.
[2] Rosamond W, Flegal K, Friday G, Furie K, Go A, Greenlund K,Haase N, Ho M, Howard V, Kissela B, Kittner S, Lloyd-JonesD, McDermott M, Meigs J, Moy C, et al. Heart disease andstroke statistics — 2007 update: a report from the AmericanHeart Association Statistics Committee and Stroke StatisticsSubcommittee. Circulation 2007;115:e69—171.
[3] Li C, Balluz LS, Okoro CA, Strine TW, Lin JM, Town M, GarvinW, Murphy W, Bartoli W, Valluru B. Centers for Disease Controland Prevention (CDC). Division of Behavioral Surveillance, Pub-lic Health Surveillance Program Office, Office of Surveillance,Epidemiology, and Laboratory Services. Surveillance of certainhealth behaviors and conditions among States and selectedlocal areas — behavioral risk factor surveillance system, UnitedStates, 2009. MMWR Surveill Summ 2011;60:1—250.
[4] Ford ES, Capewell S. Coronary heart disease mortality amongyoung adults in the U.S. from 1980 through 2002: concealedleveling of mortality rates. J Am Coll Cardiol 2007;50:2128—32.
[5] Rumana N, Kita Y, Turin TC, Murakami Y, Sugihara H, MoritaY, Tomioka N, Okayama A, Nakamura Y, Abbott RD, UeshimaH. Trend of increase in the incidence of acute myocardialinfarction in a Japanese population: Takashima AMI Registry,1990—2001. Am J Epidemiol 2008;167:1358—64.
[6] Pogosova GV, Oganov RG, Koltunov IE, Sokolova OIu, Pozdni-
akov IuM, Vygodin VA, Sapunova ID, Ryzhikova IB, Karpova AV,Eliseeva NA. Monitoring of secondary prevention of ischemicheart disease in Russia and European countries: results of international multicenter study EUROASPIRE III. Kardiologiia2011;51:34—40.
[7] Assmann G, Cullen P, Schulte H. Simple scoring scheme for cal-culating the risk of acute coronary events based on the 10-yearfollow-up of the prospective cardiovascular Munster (PROCAM)study. Circulation 2002;105:310—5.
[8] Wilson PWF, Castelli WP, Kannel WB. Coronary risk predic-tion in adults (the Framingham Heart Study). Am J Cardiol1987;59:91—4.
[9] Conroy RM, Pyörälä K, Fitzgerald AP, Sans S, Menotti A, DeBacker G, De Bacquer D, Ducimetière P, Jousilahti P, Keil U,Njølstad I, Oganov RG, Thomsen T, Tunstall-Pedoe H, Tverdal A,
et al. Estimation of ten-year risk of fatal cardiovascular diseasein Europe: the SCORE project. Eur Heart J 2003;24:987—1003.
[10] Hippisley-Cox J, Coupland C, Vinogradova Y, Robson J, May M,Brindle P. Derivation and validation of QRISK, a new cardiovas-cular disease risk score for the United Kingdom: prospectiveopen cohort study. BMJ 2007;335:136.
[11] Ridker PM, Buring JE, Rifai N, Cook NR. Development and val-idation of improved algorithms for the assessment of globalcardiovascular risk in women: the Reynolds Risk Score. JAMA2007;297:611—9.
[12] Brindle P, Beswick A, Fahey T, Ebrahim S. Accuracy and impactof risk assessment in the primary prevention of cardiovasculardisease: a systematic review. Heart 2006;92:1752—9.
[13] Yamamoto H, Ohashi N, Ishibashi K, Utsunomiya H, Kunita E,Oka T, Horiguchi J, Kihara Y. Coronary calcium score as a predic-
tor for coronary artery disease and cardiac events in Japanesehigh-risk patients. Circ J 2011;75:2424—31.
[14] Williams M, Shaw LJ, Raggi P, Morris D, Vaccarino V, Liu ST,Weinstein SR, Mosler TP, Tseng PH, Flores FR, Nasir K, Budoff M. Prognostic value of number and site of calcified coronarylesions compared with the total score. JACC Cardiovasc Imag-ing 2008;1:61—9.
[15] Tedeschi-Reiner E, Strozzi M, Skoric B, Reiner Z. Relation of atherosclerotic changes in retinal arteries to the extent of coronary artery disease. Am J Cardiol 2005;96:1107—9.
[16] Casas J, Cooper J, Miller GJ, Hingorani AD, Humphries SE.Investigating the genetic determinants of cardiovascular dis-ease using candidate genes and meta-analysis of associationstudies. Ann Hum Genet 2006;70:145—69.
[17] Brautbar A, Ballantyne CM, Lawson K, Nambi V, ChamblessL, Folsom AR, Willerson JT, Boerwinkle E. Impact of addinga single allele in the 9p21 locus to traditional risk factorson reclassification of coronary heart disease risk and impli-cations for lipid-modifying therapy in the Atherosclerosis Riskin Communities (ARIC) study. Circ Cardiovasc Genet 2009;2:279—85.
[18] Kullo IJ, Cooper LT. Early identification of cardiovascu-lar risk using genomics and proteomics. Nat Rev Cardiol2010;7:309—17.
[19] Ding K, Kullo IJ. Genome-wide association studies foratherosclerotic vascular disease and its risk factors. Circ Car-diovasc Genet 2009;2:63—72.
[20] Haykin S. Neural networks: a comprehensive foundation. NewYork: Macmillan College Publishing Company; 1994.
-
8/17/2019 heart disease diagnosis by artificial neural networks
5/5
194 O.Yu. Atkov et al.
[21] Obenshain MK. Application of data mining techniques to health-care data. Infect Control Hosp Epidemiol 2004;25:690—5.
[22] Allison JS, Heo J, Iskandrian AE. Artificial neural network mod-eling of stress single-photon emission computed tomographicimaging for detecting extensive coronary artery disease. Am JCardiol 2005;95:178—81.
[23] Colak MC, Colak C, Kocatürk H, Sağiroğlu S, Barutçu I.Predicting coronary artery disease using different artifi-cial neural network models. Anadolu Kardiyol Derg 2008;8:
249—54.[24] Papaloukas C, Fotiadis DI, Likas A, Michalis LK. An ischemia
detection method based on artificial neural networks. Artif Intell Med 2002;24:167—78.
[25] Scott JA, Aziz K, Yasuda T, Gewirtz H. Integration of clinicaland imaging data to predict the presence of coronary arterydisease with the use of neural networks. Coron Artery Dis2004;15:427—34.
[26] Goldberg DE. Genetic algorithms in search, optimization andmachine learning. New York: Addison Wesley; 1989.
[27] Young ME. Anticipating anticipation: pursuing identificationof cardiomyocyte circadian clock function. J Appl Physiol2009;107:1339—47.
[28] Yu S, Li G. MicroRNA expression and function in cardiacischemic injury. J Cardiovasc Transl Res 2010;3:241—5.
[29] Takeda N, Maemura K. Circadian clock and cardiovascular dis-ease. J Cardiol 2011;57:249—56.