European Polymer Journal 43 (2007) 156–163

www.elsevier.com/locate/europolj

An approach to predict the 13C NMR chemical shifts of acrylonitrile copolymers using artificial neural network

Jaspreet Kaur, Ajaib S. Brar *

Department of Chemistry, Indian Institute of Technology Delhi, New Delhi 110016, India

Received 19 June 2006; received in revised form 17 July 2006; accepted 19 September 2006
Available online 3 November 2006

Abstract

An artificial neural network has been used to simulate the 13C{1H} NMR chemical shifts of hydrogen-terminated fragments of acrylonitrile copolymers, and the results were compared with the carbon-13 chemical shift values predicted by partial least squares regression (PLSR). Structural descriptors were linked to the chemical shift values using the back-propagation learning algorithm as well as PLSR. The descriptors offered a useful formal tool for describing the environment of the carbon atoms in the copolymers. Principal component analysis was shown to simplify 13C{1H} NMR chemical shift prediction. The 13C{1H} chemical shift values of the methine and methylene carbon atoms of acrylonitrile/butyl methacrylate and acrylonitrile/ethyl acrylate copolymers were predicted with a mean absolute error that varied between 0.4 and 1.4 ppm for the various carbons. The calculated chemical shift values correlate well with the experimental values. The partial least squares regression method, by comparison, gave errors between 2.0 and 5.5 ppm.

© 2006 Elsevier Ltd. All rights reserved.

Keywords: Neural networks; PLSR; NMR; 13C{1H} NMR chemical shift

1. Introduction

13C{1H} chemical shift information plays an important role in the structure elucidation of polymers [1]. 13C{1H} chemical shift data are especially sensitive to the compositional and configurational sequences of copolymers. Unlike protons, carbon is involved in interatomic interactions only to a limited extent, so its chemical shifts represent almost pure and noise-free connectivity information.

0014-3057/$ - see front matter © 2006 Elsevier Ltd. All rights reserved.

doi:10.1016/j.eurpolymj.2006.09.014

* Corresponding author. Tel.: +91 11 26591377/26596536; fax: +91 11 26581579.

E-mail address: [email protected] (A.S. Brar).

13C{1H} NMR spectral simulation techniques can assist in solving complex structure elucidation problems. They are based on the existence of a direct yet complex relationship between the observed chemical shift of a carbon atom and its environment. The basic approaches for the prediction of carbon-13 NMR chemical shifts are ab initio [2–4], semi-empirical [4] and empirical calculations [5]. In the ab initio and semi-empirical approaches, the need to predetermine both constitution and conformation restricts the applicability of the method, because multiple conformations have to be extensively optimized, which makes the method time consuming and expensive. The empirical approach known as the "increment method" for chemical shift prediction was first introduced by Grant and Paul [6] and by Lindman and Adams [7]. These approaches rely on knowledge of the chemical shifts of a large set of known structures. Their advantage is simplicity: they can be applied to nearly every class of molecules, and the calculation of the chemical shift values is straightforward.

Cheng and Bennett [8] proposed empirical additive parameters for 13C{1H} chemical shift prediction of vinyl polymers. Later, Matlengiewicz et al. [9,10] and Cheng [11,12] applied additivity in predicting 13C{1H} chemical shift values.

One active area of research involves replacing linear regression analysis with other statistical and numerical techniques, e.g. artificial neural networks [13,14] and support vector machines [15]. Recently, there has been a great deal of interest in neural networks because of their simplicity and their relevance to many areas of science and technology, such as speech recognition, autonomous vehicle navigation and handwritten digit recognition [16]. Applications of neural networks have also appeared recently in various areas of chemistry [17], including studies of organic reactions [18], structure–activity relationships [19], NMR spectroscopy [20–22], prediction of protein structure [23] and nonlinear multivariate mapping of chemical data [24]. The Jurs and Meiler groups have applied neural networks to the prediction of chemical shifts in proteins [22], steroids [25] and trisaccharides [26]. Svozil et al. [27] predicted carbon-13 NMR chemical shifts in alkanes using artificial neural networks, with the carbon atoms described through embedded frequencies.

In the present work, the 13C{1H} NMR chemical shifts of polymers were predicted by means of an artificial neural network and partial least squares regression. The outcomes of the two methods were compared in an attempt to improve 13C{1H} NMR spectral prediction capabilities. To the best of our knowledge, this is the first time that a neural network has been applied to copolymers. Here, we present a three-layered neural network to predict the chemical shift values of acrylonitrile copolymers. For PLSR, the typical algorithm [28] was used. In both methods, the copolymers were represented by various hydrogen-terminated fragments, which in turn were described through descriptors; these descriptors were then indirectly linked to the chemical shift values. The models were validated using the n-fold cross-validation method for the methine and methylene carbon atoms of various copolymers, and after reasonable results were obtained the neural network was applied to the test copolymer systems (acrylonitrile/butyl methacrylate and acrylonitrile/ethyl acrylate copolymers).

2. Methodology

2.1. Approach to predict the 13C chemical shift values

Macroscopic properties of a molecule, such as its 13C{1H} chemical shift values, are governed by its molecular structure [4]. Here, an indirect approach was used to link molecular structure with the chemical shift. This approach consists of three main components:

(a) Portraying each copolymer's molecular structure with various hydrogen-terminated fragments.

(b) Representing each fragment with numerical descriptors that describe the chemical environment.

(c) Choosing subsets of the descriptors and building good models that can predict the chemical shift value.

The chosen descriptors were then used to predict the chemical shift values using a multilayer perceptron (artificial neural network) and partial least squares regression.

2.2. Various steps followed in the approach to predict the 13C chemical shift values

2.2.1. First step: entry of molecular structures and generation of 3D models

The molecular structures of the copolymers, represented by hydrogen-terminated fragments, were entered into the computer by sketching. These structures were then energy minimized using the AM1 semi-empirical method in the MOPAC software [29].

2.2.2. Second step: generation of information rich descriptors

The descriptors that encode the chemical environment around the carbon center of interest vary from simple to complex and are generally classified as topological, electronic and geometrical [30–32].

Fig. 2. Schematic representation of an M × N matrix reduced to M′ × N by applying PCA.

In the present work, the following electronic descriptors were used to represent the copolymer fragments, which in turn portray the molecular structure of the copolymer.

(1) Partial σ charge on the carbon center.

(2) Average σ charge for the atoms one to four bonds away from the carbon center, divided by the average bond length from the carbon center.

(3) Most positive charge.

(4) Most negative charge.

(5) HOMO energy.

(6) Dipole moment.
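The shell-type descriptor in item (2) can be illustrated with a short Python sketch. It assumes the literal reading of the list (mean σ charge of a shell divided by mean bond length of that shell); the function name `shell_descriptor` and all numeric values are hypothetical and are not taken from the paper.

```python
def shell_descriptor(sigma_charges, bond_lengths):
    """Mean sigma charge of one shell divided by the mean bond length of that shell."""
    if not sigma_charges or not bond_lengths:
        raise ValueError("each shell needs at least one charge and one bond length")
    mean_charge = sum(sigma_charges) / len(sigma_charges)
    mean_length = sum(bond_lengths) / len(bond_lengths)
    return mean_charge / mean_length


# Hypothetical charges and bond lengths for the atoms one bond away
# from the carbon center (illustrative numbers only).
shell_1_charges = [-0.12, -0.08, 0.10, 0.06]
shell_1_lengths = [1.5, 1.5, 1.0, 1.0]

d2 = shell_descriptor(shell_1_charges, shell_1_lengths)
```

When a shell has as many charges as bond lengths, this ratio of means is numerically the same as the ratio of the two sums.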

One of the fragments of the acrylonitrile/butyl methacrylate copolymer, terminated by hydrogen atoms, is shown in Fig. 1 to illustrate the calculation of the descriptors used. The set of descriptors must be information rich, yet small enough to save computational time. To address this problem, principal component analysis (PCA) [33] was introduced. This technique has three main utilities: (a) it orthogonalizes the components of the input vectors so that they are uncorrelated with each other, (b) it orders the resulting orthogonal components (principal components) so that those with the largest variation come first and (c) it eliminates those components that contribute least to the variation in the data set. Principal components that contributed less than 10% to the total variation in the data set were eliminated (Fig. 2). PCA reduced the number of descriptors, making learning easier and less time consuming.

Fig. 1. Hydrogen-terminated fragment of acrylonitrile/butyl methacrylate copolymer. D1: charge on C*, the central methylene carbon atom whose chemical shift is known/to be predicted (charge = −0.2501). D2: average σ charge for the atoms one bond away from the carbon center divided by the average bond length from C*, $D2 = (\sigma_{C_a} + \sigma_{C_{a'}} + \sigma_{H^{*}} + \sigma_{H^{*\prime}}) / (l_a^{1} + l_b^{1} + l_c^{1} + l_d^{1})$; similarly, $D3 = (\sigma_{C_b} + \sigma_{C_{b'}} + \sigma_{C_{b''}} + \sigma_{C_{b'''}} + \sigma_{H_a} + \sigma_{H_{a'}}) / (l_a^{2} + l_b^{2} + l_c^{2} + l_d^{2} + l_e^{2} + l_f^{2})$, and likewise D4 and D5 were calculated for the atoms three and four bonds away from C*, respectively. D6: net dipole moment = 7.57. D7: highest positive charge = 0.3520. D8: highest negative charge = −0.3767. D9: HOMO energy level = −11.1311. Descriptors are purely numeric values. D = descriptor.

2.2.3. Third step: generation of model using descriptor sets

Neural network: A set of known chemical shift values and the related descriptors of various copolymers was fed into the three-layer computational neural network shown in Fig. 3 for training. In the neural network, each neuron in the hidden layer receives the signal from all the neurons in the layer above it (input layer). After performing its function, the neuron passes its output to all the neurons in the layer below it (output layer), providing a feed-forward path to the output. These lines of communication from one neuron to another are an important aspect of the neural network, as they provide variable weights to an input.

Fig. 3. Architecture of the three-layer computational neural network for calculating the chemical shift. Σ represents $\sum_{j=1}^{n_{l-1}} w^{l}_{i,j} V^{l-1}_{j,p} + w^{l}_{i,0}$ and f(·) represents $f(\mathrm{net}^{l}_{i,p})$ [13], which is shown in the highlighted neuron and is a constituent of all neurons in the network.
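The weighted sum and transfer function named in the Fig. 3 caption can be sketched in plain Python. This is only an illustrative forward pass: the layer sizes, the weights and the function names (`layer_output`, `forward`, `bipolar_sigmoid`) are assumptions made for the example, and the bipolar sigmoid is used as a stand-in for f(·).

```python
import math


def bipolar_sigmoid(net):
    """Stand-in transfer function f(net) with outputs in (-1, +1)."""
    return 2.0 / (1.0 + math.exp(-net)) - 1.0


def layer_output(weights, biases, inputs, transfer):
    """For each neuron i: net_i = sum_j w[i][j] * V[j] + w_i0, then apply f."""
    outputs = []
    for w_row, w0 in zip(weights, biases):
        net = sum(w * v for w, v in zip(w_row, inputs)) + w0
        outputs.append(transfer(net))
    return outputs


def forward(descriptors, hidden_w, hidden_b, out_w, out_b):
    """Feed descriptors through one hidden layer and a single linear output neuron."""
    hidden = layer_output(hidden_w, hidden_b, descriptors, bipolar_sigmoid)
    return layer_output(out_w, out_b, hidden, lambda net: net)[0]


# Hypothetical 3-2-1 network (three descriptors, two hidden neurons, one output).
hidden_w = [[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]]
hidden_b = [0.0, 0.1]
out_w = [[1.0, -1.0]]
out_b = [0.2]

y = forward([0.2, -0.4, 0.6], hidden_w, hidden_b, out_w, out_b)
```

Each hidden neuron sees every input and the output neuron sees every hidden neuron, which is the feed-forward path described in the text.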

Before training (using the back-propagation algorithm [13]), the inputs and targets were normalized between −1 and +1, because the transfer function used in the neural network allows only values between −1 and +1, and noise was added to the target values. The tan-sigmoid transfer function, which calculates the output according to $f(\mathrm{net}) = \frac{2}{1 + \exp(-\mathrm{net})} - 1$ (description given in Fig. 3), was used for the input and hidden layer neurons and was found to give good results. The number of neurons in the hidden layer was optimized to obtain the best outcome. The output neuron uses a linear transfer function; it gives output values from −1 to +1.
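As a small illustration of the preprocessing step, the sketch below rescales a list of values linearly to the interval [−1, +1], maps a scaled value back to ppm, and evaluates the transfer function in the form written above. Min–max scaling is an assumption (the paper does not state how the normalization was done), and the helper names are hypothetical; the shift values are the experimental column of Table 2.

```python
import math


def scale_to_unit_range(values):
    """Linearly map values so that min -> -1 and max -> +1 (assumed scheme)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("cannot scale a constant series")
    scaled = [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]
    return scaled, lo, hi


def unscale(scaled_value, lo, hi):
    """Inverse of scale_to_unit_range for a single value."""
    return (scaled_value + 1.0) * (hi - lo) / 2.0 + lo


def transfer(net):
    """f(net) = 2 / (1 + exp(-net)) - 1, bounded between -1 and +1."""
    return 2.0 / (1.0 + math.exp(-net)) - 1.0


# Experimental methylene shifts (ppm) from Table 2.
shifts_ppm = [54.6, 51.6, 46.8, 43.6, 42.1, 40.6, 37.5, 35.4, 33.5]
scaled, lo, hi = scale_to_unit_range(shifts_ppm)
restored = unscale(scaled[3], lo, hi)
```

The transfer function in this form is algebraically identical to tanh(net/2), which is why its outputs stay strictly inside (−1, +1).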

The set of data that enables the training is called the training set. The data sets were randomly arranged before being fed into the neural network. To start the training process, the initial weights were chosen randomly; the training, or learning, then begins. In supervised learning, both the inputs and the outputs are provided. The network processes the inputs and compares its resulting outputs against the desired outputs, here the experimental chemical shift values. If there is a difference, the weights are adjusted to reduce the difference for each training pattern. This training procedure continues until the difference between the desired output and the actual output reaches a predetermined acceptable level.

One of the problems that occur during neural network training is overfitting. The error on the training set is driven to a very small value, but when new data are presented to the network the error is large. The network has memorized the training set, but it has not learned to generalize to new situations. One method for improving network generalization is to use a network that is just large enough to provide an adequate fit. The larger the network, the more complex the functions it can create. If a small enough network is used, it will not have enough power to overfit the data. Unfortunately, it is difficult to know beforehand how large the network should be. If the training set is large, overfitting is rarely encountered. In our case the amount of data was limited, so the chances of overfitting were greater. Regularization and early stopping were used to avoid overfitting. A common approach to regularization is the Bayesian framework, in which the weights and biases of the network are assumed to be random variables with specified distributions. Bayesian regularization minimizes a linear combination of squared errors and weights. It also modifies the linear combination so that at the end of training the resulting network has good generalization qualities [34,35]. This Bayesian regularization takes place within the Levenberg–Marquardt algorithm [36,37]. Back-propagation is used to calculate the Jacobian jX of performance with respect to the weight and bias variables X. Each variable is adjusted according to Levenberg–Marquardt,

$$jj = jX^{T}\, jX$$

$$je = jX^{T}\, E$$

$$dX = -(jj + I\,\mathrm{mu})^{-1}\, je$$

where E is all errors, I is the identity matrix and mu is the damping parameter of the Levenberg–Marquardt algorithm. One feature of the algorithm is that it provides a measure of how many network parameters are effective in the network. The iterative training was stopped when the sum squared error and the sum squared weights were relatively constant over several iterations.
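The update above can be tried on a toy problem. The following is a minimal sketch, not the MATLAB routine used in the paper: it assumes E holds residuals (model output minus target) and that jX is the Jacobian of those residuals with respect to the parameters. The helper names `solve` and `lm_step`, and the straight-line fitting problem, are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting (small dense systems)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def lm_step(jacobian, errors, mu):
    """Return dX = -(J^T J + mu I)^(-1) J^T E."""
    n = len(jacobian[0])
    jj = [[sum(row[i] * row[j] for row in jacobian) for j in range(n)] for i in range(n)]
    je = [sum(row[i] * e for row, e in zip(jacobian, errors)) for i in range(n)]
    damped = [[jj[i][j] + (mu if i == j else 0.0) for j in range(n)] for i in range(n)]
    return [-d for d in solve(damped, je)]


# Toy problem: fit y = a*x + b to points generated with a = 2, b = 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
params = [0.0, 0.0]
errors = [params[0] * x + params[1] - y for x, y in zip(xs, ys)]
jacobian = [[x, 1.0] for x in xs]  # d(residual)/da = x, d(residual)/db = 1

step = lm_step(jacobian, errors, mu=0.0)
params = [p + d for p, d in zip(params, step)]
damped_step = lm_step(jacobian, errors, mu=100.0)
```

With mu = 0 the step reduces to a Gauss–Newton step, which solves this linear problem in one iteration; a large mu shortens the step and makes it behave more like gradient descent.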

Partial least squares regression: Partial least squares regression generalizes and combines features from principal component analysis and multiple regression. Three latent variables were used in the partial least squares regression, with the same input and output values as were used for the neural network. A detailed description of the algorithm and its mathematics is given by Geladi and Kowalski [28].

Table 1
Copolymers used to train and test the neural network

Acrylonitrile/hexyl methacrylate
Vinylidene chloride/methyl methacrylate
Methyl acrylate/N-vinyl-2-pyrrolidone
Acrylonitrile/heptyl methacrylate
Acrylonitrile/methyl methacrylate
Vinylidene chloride/acrylonitrile
Vinylidene chloride/methacrylonitrile
Acrylonitrile/hexyl acrylate
Acrylonitrile/glycidyl methacrylate
Methyl methacrylate/styrene
Methacrylonitrile/N-vinyl-2-pyrrolidone
Acrylonitrile/ethyl acrylate
Acrylonitrile/methyl acrylate
Acrylonitrile/ethyl methacrylate
Butyl acrylate/N-vinyl-2-pyrrolidone
Methyl methacrylate/N-vinyl-2-pyrrolidone
Glycidyl methacrylate/N-vinyl-2-pyrrolidone
Methacrylonitrile/trans-4-acryloyloxyazobenzene
Acrylonitrile/trans-4-acryloyloxyazobenzene
Glycidyl methacrylate/vinyl acetate
Vinylidene chloride/methyl acrylate
Acrylonitrile/methacrylic acid
Acrylonitrile/pentyl methacrylate
Glycidyl methacrylate/styrene
Acrylonitrile/butyl methacrylate
Acrylonitrile/pentyl acrylate
Acrylonitrile/heptyl acrylate
Acrylonitrile/acrylic acid
Acrylonitrile/N-vinyl-2-pyrrolidone
Acrylonitrile/butyl acrylate
Acrylonitrile/styrene

2.2.4. Fourth step: validation of model

This step demonstrated the ability of the neural network and partial least squares regression to predict the 13C{1H} chemical shifts of copolymers using the n-fold cross-validation method [38], which gives a more accurate assessment of the actual error. The chemical shift values for the unknown copolymers (test data) were calculated after validating the model.

MATLAB™ was used to construct, train and analyze the neural network and PLSR (for step 3 and step 4).

2.3. Copolymer data set

The copolymers used in the data set were compiled from published data [39–67] and are listed in Table 1.

3. Results and discussion

The aim of the present work is to analyze the capability of multilayer neural networks to predict the 13C{1H} NMR chemical shifts of copolymers represented by hydrogen-terminated fragments of acrylonitrile copolymers. To approach this problem, the neural network was validated by subdividing 127 methine and methylene carbon atoms from all acrylonitrile copolymers into two subsets. The first subset, comprising 10 randomly selected carbon atoms whose chemical shifts were treated as unknown, was used as the validation data set. The remaining 117 carbon atoms, with unambiguously assigned chemical shift values, were used to train the neural network. The trained network used approximately 24 parameters (weights and biases) in 107 iterations. The network gave a good correlation between the target and output values. Fig. 4 shows a correlation coefficient (R) of 0.99 and a mean absolute error (mae) of 1.26 ppm for the 10 randomly selected carbon atoms.

n-Fold cross-validation is a much better option, although it is not uncommon for studies based on a single validation set to make their way into respectable journals, even though it is well known that the underlying predictors may perform well on one particular test set and abysmally on another. However, n-fold cross-validation can also fall prey to the random number generator. A much better approach is to perform a sufficiently large number of n-fold cross-validation runs, which provides a better estimate of the true generalization error of the model. Keeping this in mind, the neural network and PLSR were validated using the n-fold cross-validation method (where n = 1) for the methine and methylene carbon atoms from all acrylonitrile copolymers (Table 1).

The descriptors were fed into a 3–9–1 (input–hidden–output neurons) architecture of the neural network. The training data were randomly arranged, and each data point was systematically removed from the training set; a model was built from the remaining data and used, with its optimized weights, to predict the chemical shift value of the removed data point. This was done for each data point in the original training set.
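The leave-one-out procedure described here can be sketched independently of the model being validated. In the Python sketch below, `fit` and `predict` are placeholder callables, and a one-variable least-squares line stands in for the neural network or PLSR model; the function names and the toy data are assumptions for illustration only.

```python
def leave_one_out_mae(inputs, targets, fit, predict):
    """Remove each point in turn, refit on the rest, and average the absolute errors."""
    abs_errors = []
    for i in range(len(inputs)):
        train_x = inputs[:i] + inputs[i + 1:]
        train_y = targets[:i] + targets[i + 1:]
        model = fit(train_x, train_y)
        abs_errors.append(abs(predict(model, inputs[i]) - targets[i]))
    return sum(abs_errors) / len(abs_errors)


def fit_line(xs, ys):
    """Ordinary least-squares line; a stand-in for the real model."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return slope, my - slope * mx


def predict_line(model, x):
    slope, intercept = model
    return slope * x + intercept


# Toy data lying exactly on y = 2x + 1, so the held-out error is essentially zero.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
mae = leave_one_out_mae(xs, ys, fit_line, predict_line)
```

Because the model is refit for every held-out point, the cost grows with the size of the data set, which is acceptable for a set of roughly a hundred carbon atoms.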

Fig. 4. (a) The training of the network with each epoch; the training error initially decreases and finally becomes constant. (b) The sum squared weights (ssw) are relatively constant over several epochs, which is one indication that the algorithm has truly converged. (c) The correlation of the target and output values of the methine and methylene carbon atoms in the AEAEA, AGAGA, HAH, AAS, GAAG, AAAE, HHA, AGGA, AGGG and MMMMAA units, where A = acrylonitrile, M = methyl methacrylate, G = glycidyl methacrylate, H = hexyl acrylate, E = ethyl methacrylate and S = styrene.

Table 2
Experimental and calculated chemical shift values in ppm of the methylene carbon atoms for hydrogen-terminated fragments of acrylonitrile–butyl methacrylate (A–B) copolymer

Fragment   Experimental (ppm)   Calculated, neural network (ppm)   Calculated, PLSR (ppm)
BBBB       54.6                 54.0                               43.6
BBBA       51.6                 53.9                               43.2
ABBA       46.8                 44.1                               43.8
BBAB       43.6                 43.1                               43.8
BBAA       42.1                 41.5                               44.0
ABAA       40.6                 40.5                               44.5
BAAB       37.5                 36.9                               43.9
AAAB       35.4                 33.2                               44.3
AAAA       33.5                 32.0                               35.3

Each model was validated 50 times in order to obtain reliable statistics and establish the true generalization capability of the resulting model. The neural network gave a mean absolute error (mae) of 1.4 ppm (partial least squares regression gave mae = 5.0 ppm), indicating that the neural network can be used to predict 13C{1H} NMR chemical shift information and hence the microstructure of the copolymers. It gave a good correlation between the structure of the copolymer, represented by the electronic descriptors, and the chemical shift values of the copolymer, signifying that the chemical shift values can be predicted appropriately by representing the copolymer fragments through descriptors that encode the electronic environment of the atoms.

After validating the model, the network was tested on predicting the chemical shift values of different copolymers. The neural network was applied to predict the chemical shift values of the acrylonitrile–butyl methacrylate and acrylonitrile–ethyl acrylate copolymers (test data). The number of neurons in the input and output layers was fixed, while the number of neurons in the hidden layer was varied. Three neurons in the input layer and one neuron in the output layer correspond to the number of descriptors and the chemical shift value, respectively. The best results were achieved with 4–18 neurons in the hidden layer. The correlation coefficient for all the test data sets of the copolymers was always more than 0.9, and the mean absolute error varied from 0.4 to 1.3 ppm. The results obtained by the neural network were compared with partial least squares regression [28], for which the mean absolute error varied from 2.3 ppm to 5.5 ppm. For the methylene carbons of acrylonitrile/butyl methacrylate (Table 2), the neural network gave a mean absolute error of 1.23 ppm, whereas for the same carbon atoms partial least squares regression gave an error of 5.05 ppm, which is very large, indicating that the neural network predicts values that are closer to the experimental 13C chemical shift values than partial least squares regression does. The comparatively large error for acrylonitrile/butyl methacrylate might be due to the fact that it is conformationally sensitive.
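The mean absolute errors quoted for the methylene carbons can be recomputed directly from the values listed in Table 2. The short Python sketch below does this; only the helper name `mean_absolute_error` is introduced here, the numbers are those of the table.

```python
def mean_absolute_error(observed, predicted):
    """Average of |observed - predicted| over all fragments."""
    if len(observed) != len(predicted) or not observed:
        raise ValueError("need two non-empty series of equal length")
    return sum(abs(o - p) for o, p in zip(observed, predicted)) / len(observed)


# Table 2: methylene carbons of acrylonitrile-butyl methacrylate fragments (ppm).
experimental = [54.6, 51.6, 46.8, 43.6, 42.1, 40.6, 37.5, 35.4, 33.5]
neural_network = [54.0, 53.9, 44.1, 43.1, 41.5, 40.5, 36.9, 33.2, 32.0]
plsr = [43.6, 43.2, 43.8, 43.8, 44.0, 44.5, 43.9, 44.3, 35.3]

mae_nn = mean_absolute_error(experimental, neural_network)
mae_plsr = mean_absolute_error(experimental, plsr)
print(f"neural network MAE: {mae_nn:.2f} ppm")
print(f"PLSR MAE: {mae_plsr:.2f} ppm")
```

The two averages come out at about 1.23 ppm and 5.06 ppm, in line with the 1.23 ppm and 5.05 ppm reported in the text.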

The 13C chemical shift values were conformationally averaged. It is well known that different arrangements of the functional groups flanking the main chain can influence the chemical shift values. Here, the structure of the copolymers was described only via electronic descriptors; the dependence on conformation was not considered, which could be achieved with topological descriptors such as the dihedral angle. Acrylonitrile/butyl methacrylate appears to be more sensitive toward conformation than acrylonitrile/ethyl acrylate because of the presence of the methyl group. This might explain why the error for acrylonitrile/ethyl acrylate is relatively small. The methine carbons of acrylonitrile–ethyl acrylate (Table 3) were generally less sensitive to conformation than those of acrylonitrile–butyl methacrylate (Table 4), giving a mean absolute error of 0.48 ppm and a correlation coefficient of 0.99, compared with a mean absolute error of 1.08 ppm and R = 0.95 for the latter. With partial least squares regression, the mean absolute errors for the prediction of the 13C NMR chemical shift values of the methine carbons of the acrylonitrile/ethyl acrylate and acrylonitrile–butyl methacrylate copolymers were 2.3 ppm and 5.4 ppm, respectively. It can therefore be inferred from these mean absolute errors that the PLSR method is not suitable for providing microstructure information on copolymers with tolerable precision.

Table 3
Experimental vs calculated chemical shift values in ppm of the methine carbons for hydrogen-terminated fragments of acrylonitrile–ethyl acrylate (A–E) copolymer

Fragment   Experimental (ppm)   Calculated, neural network (ppm)   Calculated, PLSR (ppm)
MMM        41.4                 41.7                               38.1
AMM        41.3                 41.3                               38.2
AMA        41.2                 40.6                               38.0
AAA        28.1                 28.8                               27.1
AAM        28.0                 27.2                               29.8
MAM        27.9                 28.4                               29.1

Table 4
Experimental vs calculated chemical shift values in ppm of the methine carbons for hydrogen-terminated fragments of acrylonitrile–butyl methacrylate (A–B) copolymer

Fragment   Experimental (ppm)   Calculated, neural network (ppm)   Calculated, PLSR (ppm)
AAABA      25.8                 26.9                               29.4
AAABB      25.0                 26.7                               32.7
ABABA      23.0                 24.9                               29.7
ABABB      22.6                 22.4                               28.5
BBABB      21.3                 20.8                               18.0
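The correlation coefficient quoted for the methine carbons of acrylonitrile–butyl methacrylate can be checked against the values in Table 4. The sketch below assumes that R denotes the ordinary Pearson correlation coefficient between experimental and calculated shifts, which the paper does not state explicitly; `pearson_r` is a hypothetical helper name.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Table 4: methine carbons of acrylonitrile-butyl methacrylate fragments (ppm).
experimental = [25.8, 25.0, 23.0, 22.6, 21.3]
neural_network = [26.9, 26.7, 24.9, 22.4, 20.8]

r = pearson_r(experimental, neural_network)
```

Under that assumption the value is roughly 0.95, consistent with the R = 0.95 reported for this copolymer.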

The model developed with artificial neural networks gives a good correlation between the structure and the chemical shift values. It is fast and simple, and can serve as an additional restraint in the structure elucidation process.

4. Conclusions

In the present work, an indirect correlation between the structure of the copolymer and the chemical shift was employed. The structure of the copolymer was represented by descriptors in order to predict the 13C{1H} NMR chemical shifts of the carbon atoms of various copolymers using artificial neural networks. Principal component analysis was applied to reduce the number of descriptors, which made learning easier and less time consuming. The neural network was first validated using the n-fold cross-validation method and subsequently tested on acrylonitrile/ethyl acrylate and acrylonitrile/butyl methacrylate copolymers. The models, generated without the use of geometrical descriptors, were free from conformational difficulties. The neural network predictions of the 13C{1H} NMR chemical shift values were reasonably close to the experimental values and significantly superior to the partial least squares regression results. Hence, the approach is suitable for the prediction of 13C{1H} NMR chemical shifts of copolymers. The approach is simple and easy to use, and should find general applicability in structure–property relationship analysis of polymers. Increasing the size of the training data set and exploiting topological descriptors could further reduce the error.

Acknowledgement

One of the authors (Jaspreet Kaur) thanks the Council of Scientific and Industrial Research (CSIR), India, for financial support.

References

[1] Ibbett RN. NMR spectroscopy of polymers. Glasgow: Blackie Academic and Professional; 1993.
[2] Asirvatham PS, Subramanian V, Balakrishnan R, Ramasami T. Macromolecules 2003;36:921.
[3] Born R, Spiess HW. NMR – Basic Principles Prog 1997;35.
[4] Ando I, Kuroki S, Kurosu H, Yamanobe T. Prog NMR Spectrosc 2001;39:79.
[5] Furst A, Pretsch E. Anal Chim Acta 1990;229:17.
[6] Grant DM, Paul G. J Am Chem Soc 1964;86:2984.
[7] Lindman LP, Adams JQ. Anal Chem 1971;43:1245.
[8] Cheng HN, Bennett MA. Anal Chem 1984;56:2320.
[9] Matlengiewicz M, Nguyen G, Nicole D, Itenzeu N. J Polym Sci: Part A: Polym Chem 2000;38:2147.
[10] Nguyen G, Nicole D, Swistek M, Matlengiewicz M, Wiegert B. Polymer 1997;38:3455.
[11] Cheng HN. J Chem Inf Comput Sci 1987;27:8.
[12] Cheng HN. Polym News 2000;25:114.
[13] Zurada JM. Introduction to artificial neural systems. St. Paul, MN: West Publishing Company; 2003.
[14] Bose NK, Liang P. Neural network fundamentals with graphs, algorithms and applications. New York: Tata McGraw Hill; 1998.
[15] Scholkopf B, Burges JC, Smola AJ. Advances in kernel methods – support vector learning. Cambridge, Massachusetts, London, England: MIT Press; 1998.
[16] Bose NK, Liang P. Neural network fundamentals with graphs, algorithms and applications. New York: Tata McGraw Hill; 1998 [chapter 10]. p. 407.
[17] Zupan J, Gasteiger J. Anal Chim Acta 1991;248:1.
[18] Elrod DW, Maggiora GM, Trenary RG. J Chem Inf Comput Sci 1990;30:477.
[19] Aoyama T, Suzuki Y, Ichikawa H. J Med Chem 1990;33:2583.
[20] Meiler J, Meusinger R, Will M. J Chem Inf Comput Sci 2000;40:1169.
[21] Meiler J, Maier W, Will M, Meusinger R. J Magn Reson 2002;158:242.
[22] Meiler J. J Biomol NMR 2003;26:25.
[23] Kneller DG, Cohen FE, Langridge R. J Mol Biol 1990;214:171.
[24] Blank TB, Brown SD. Anal Chem 1993;65:3081.
[25] Anker LS, Jurs PC. Anal Chem 1992;64:1157.
[26] Clouser DL, Jurs PC. Carbohydr Res 1995;271:65.
[27] Svozil D, Pospíchal J, Kvasnicka V. J Chem Inf Comput Sci 1995;35:924.
[28] Geladi P, Kowalski BR. Anal Chim Acta 1986;185:1.
[29] Leach AR. Molecular modelling – principles and applications. England: Addison Wesley Longman Ltd.; 1997 [chapter 2]. p. 102.
[30] Small GW, Jurs PC. Anal Chem 1983;55:1121.
[31] Small GW, Jurs PC. Anal Chem 1984;56:1314.
[32] Jurs PC, Sutton GP, Ranc ML. Anal Chem 1989;61:1115A.
[33] Oja E. Int J Neural Syst 1989;1:61.
[34] MacKay DJC. Neural Comput 1992;4:415.
[35] Foresee FD, Hagan MT. In: Proceedings of the 1997 international joint conference on neural networks, 1997. p. 1930–5.
[36] Levenberg K. Quart Appl Math 1944;2:164.
[37] Marquardt D. SIAM J Appl Math 1963;11:431.
[38] Agrafiotis DK, Cedeno W, Lobanov VS. J Chem Inf Comput Sci 2002;42:903.
[39] Brar AS, Kaur S. J Polym Sci: Part A: Polym Chem 2005;43:1100.
[40] Brar AS, Yadav A, Kaur M. Polym Prepr (Am Chem Soc Div, Polym Chem) 2003;44:387.
[41] Brar AS, Hooda S, Kumar R. J Polym Sci: Part A: Polym Chem 2003;41:313.
[42] Brar AS, Kaur M. Eur Polym J 2003;39:705.
[43] Brar AS, Kaur M. J Appl Polym Sci 2003;88:3005.
[44] Brar AS, Yadav A. Polymer J 2003;35:37.
[45] Brar AS, Yadav A. Indian J Chem 2002;41A:2008.
[46] Hooda S, Brar AS. J Appl Polym Sci 2003;88:3232.
[47] Brar AS, Kumar R. J Mol Struct 2002;616:37.
[48] Brar AS, Yadav A. Eur Polym J 2003;39:15.
[49] Brar AS, Kumar R. J Appl Polym Sci 2002;84:50.
[50] Brar AS, Pradhan DR. Indian J Chem 2002;41A:950.
[51] Brar AS, Yadav A. J Mol Struct 2002;602:29.
[52] Brar AS, Yadav A, Hooda S. Eur Polym J 2002;38:1683.
[53] Brar AS, Kumar R. Eur Polym J 2001;37:1827.
[54] Brar AS, Pradhan DR. Polym J 2001;33:602.
[55] Brar AS, Yadav A. J Polym Sci: Part A: Polym Chem 2001;39:4051.
[56] Brar AS, Dutta K, Pandey D. Polym J 1999;31:396.
[57] Mukherjee M, Chaterjee SK, Brar AS. J Appl Polym Sci 1999;73:55.
[58] Brar AS, Hekmatyar SK. J Polym Sci: Part A: Polym Chem 1999;37:721.
[59] Brar AS, Dutta K. J Polym Sci: Part A: Polym Chem 1999;37:533.
[60] Brar AS, Dutta K. Macromolecules 1998;31:4695.
[61] Brar AS, Dutta K. J Appl Polym Sci 1998;69:2507.
[62] Brar AS, Dutta K. Macromol Chem Phys 1998;199:2005.
[63] Brar AS, Dutta K, Hekmatyar SK. J Polym Sci: Part A: Polym Chem 1998;36:1081.
[64] Brar AS, Dutta K. Eur Polym J 1998;31:1585.
[65] Brar AS, Dutta K, Kapur GS. Macromolecules 1995;28:8735.
[66] Brar AS, Jayaram B, Dutta K. J Polym Mater 1994;11:171.
[67] Brar AS, Jayaram B, Dutta K. J Polym Mater 1993;10:269.