
Non ionic organic pesticide environmental behaviour: ranking and classification

F. Consolaro and P. Gramatica

QSAR Research Unit, Dept. of Structural and Functional Biology, University of Insubria, Varese, Italy.

E-mail: [email protected]

Web: http://fisio.varbio2.unimi.it/dbsf/home.html

INTRODUCTION

Pesticides play an important role in maintaining agricultural productivity, but they may also cause air, water, soil and food contamination, with possible adverse effects on human and animal health. Knowledge of pesticide environmental behaviour, i.e. adsorption in the soil, leaching into groundwater and volatilisation into the atmosphere, is of primary concern. Data for the most relevant physico-chemical properties, namely the soil adsorption coefficient (Koc) [1], the n-octanol/water partition coefficient (Kow), water solubility, vapour pressure and Henry's law constant, are modelled and predicted by the QSPR (Quantitative Structure-Property Relationships) approach.

[1] P. Gramatica, M. Corradi, V. Consonni, Modelling and prediction of soil sorption coefficients of non-ionic organic pesticides by different sets of molecular descriptors, Chemosphere, 41/5 (2000) 763-777.

STRUCTURAL DESCRIPTION OF COMPOUNDS

The molecular descriptors used are: molecular weight and count descriptors (1D-descriptors), obtained from simple knowledge of the molecular formula; graph-invariant descriptors (2D-descriptors), both topological and information indices, obtained from knowledge of the molecular topology; and WHIM descriptors [2], which contain information about the whole 3D molecular structure in terms of size, symmetry and atom distribution. The WHIM indices are calculated from the (x,y,z)-coordinates of three-dimensional molecular structures, usually from a minimum-energy conformation (obtained by the molecular mechanics method of Allinger, MM+), using the software WHIM-3D/QSAR [3].

[2] R. Todeschini and P. Gramatica, 3D-modelling and prediction by WHIM descriptors. Part 5. Theory development and chemical meaning of the WHIM descriptors, Quant. Struct.-Act. Relat., 16 (1997) 113-119.

[3] web download: http://www.disat.unimib.it/chm
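As an illustration of how count descriptors follow from the molecular formula alone, here is a minimal sketch. It is not the software used by the authors: the function names and the rounded atomic-weight table are assumptions, and nX is taken to mean the number of halogen atoms. Atrazine (C8H14ClN5), a triazine, is used only as an example input.

```python
import re

# Rounded standard atomic weights for elements common in pesticides
# (an illustrative table, not taken from the poster).
ATOMIC_WEIGHT = {
    "H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999, "F": 18.998,
    "P": 30.974, "S": 32.06, "Cl": 35.45, "Br": 79.904, "I": 126.904,
}
HALOGENS = {"F", "Cl", "Br", "I"}


def atom_counts(formula):
    """Parse a simple molecular formula such as 'C8H14ClN5' into atom counts."""
    counts = {}
    for symbol, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        if symbol not in ATOMIC_WEIGHT:
            raise ValueError(f"unsupported element: {symbol}")
        counts[symbol] = counts.get(symbol, 0) + (int(number) if number else 1)
    return counts


def count_descriptors(formula):
    """Return molecular weight and a few count descriptors (1D-descriptors)."""
    counts = atom_counts(formula)
    return {
        "MW": sum(ATOMIC_WEIGHT[s] * n for s, n in counts.items()),
        "nC": counts.get("C", 0),
        "nN": counts.get("N", 0),
        "nO": counts.get("O", 0),
        "nS": counts.get("S", 0),
        "nP": counts.get("P", 0),
        # Assumption: nX counts halogen atoms.
        "nX": sum(n for s, n in counts.items() if s in HALOGENS),
    }


atrazine = count_descriptors("C8H14ClN5")
```

Descriptors that need the molecular graph or the 3D geometry (topological indices, WHIM) cannot be obtained this way and require the connectivity or the optimised coordinates.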

STUDIED PESTICIDES

QSPR studies were performed on a data set composed of 185 pesticides of the following chemical classes:

6 acetanilides, 40 carbamates, 12 dinitroanilines, 9 organochlorides, 47 organophosphates, 45 phenylureas, 17 triazines and 9 di- and triazoles.

Experimental data for all 5 of the above-mentioned physico-chemical properties are available for only 54 of these chemicals.

CONCLUSIONS

RANKING and CLASSIFICATION

The experimental data of the studied properties for 54 pesticides of different chemical classes are combined in Principal Component Analysis and Hierarchical Cluster Analysis to rank the pesticides into 4 a priori classes according to their environmental behaviour (sorbed, soluble, volatile and a mixed class).

PCA ON CHEMICAL-PHYSICAL PROPERTIES OF 54 PESTICIDES

Cum. E.V. = 94.6% (PC1 = 70.1%)

[Figure: plot of PC1 versus PC2 for the 54 pesticides (points numbered 1-54), grouped into four classes CLA 1-CLA 4. 1 = soluble compounds; 2 = medium compounds; 3 = volatile compounds; 4 = sorbed compounds.]
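The PCA step described above can be sketched in a few lines. This is only an illustration under stated assumptions: the properties are autoscaled before the analysis (so the PCA works on the correlation matrix), the eigenvectors are found by simple power iteration with deflation, and the small data matrix is hypothetical rather than taken from the poster.

```python
import math


def autoscale(rows):
    """Centre each column to mean 0 and scale it to unit sample standard deviation."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    sds = [math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
           for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def correlation_matrix(z):
    """Correlation matrix of already autoscaled data."""
    n, p = len(z), len(z[0])
    return [[sum(row[i] * row[j] for row in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenpair(a, iterations=1000):
    """Largest eigenvalue and its eigenvector of a symmetric matrix (power iteration)."""
    p = len(a)
    v = [1.0 / (k + 1) for k in range(p)]           # arbitrary non-zero start
    length = math.sqrt(sum(x * x for x in v))
    v = [x / length for x in v]
    for _ in range(iterations):
        w = [sum(a[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm < 1e-12:                            # nothing left to extract
            return 0.0, v
        v = [x / norm for x in w]
    eigenvalue = sum(v[i] * a[i][j] * v[j] for i in range(p) for j in range(p))
    return eigenvalue, v


def pca(rows, n_components=2):
    """Return explained-variance fractions and scores for the first components."""
    z = autoscale(rows)
    a = correlation_matrix(z)
    p = len(a)
    explained, scores = [], []
    for _ in range(n_components):
        lam, v = leading_eigenpair(a)
        explained.append(lam / p)                   # trace of a correlation matrix is p
        scores.append([sum(zi * vi for zi, vi in zip(row, v)) for row in z])
        # Deflation: remove the component just found.
        a = [[a[i][j] - lam * v[i] * v[j] for j in range(p)] for i in range(p)]
    return explained, scores


# Hypothetical log-transformed properties for six compounds (illustrative only).
toy = [
    [1.2, 3.5, -2.0],
    [2.0, 2.8, -3.1],
    [2.9, 1.9, -1.5],
    [3.8, 1.1, -4.2],
    [4.5, 0.2, -2.7],
    [5.3, -0.6, -3.6],
]
explained, scores = pca(toy)
cumulative = sum(explained)
```

The sum of the first two explained-variance fractions corresponds to the "Cum. E.V." figure quoted for the plot, and the two score vectors are the coordinates of each compound in the PC1-PC2 plane.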

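CLUSTER ANALYSIS OF PHYSICO-CHEMICAL PROPERTIES

[Figure: cluster analysis of the 54 pesticides. 1: soluble compounds; 2: medium compounds; 3: volatile compounds; 4: sorbed compounds.]

CLASSIFICATION TREE

[Figure: classification tree built on the descriptors MW, BAL and nHD (split values shown: 317.69, 2.53 and 0.50); the terminal nodes carry the assigned classes 2, 3, 1 and 4.]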

Classification by different classification methods (CART, K-NN, LDA and RDA) using only theoretical molecular structure descriptors allows a fast screening of both existing and new pesticides with regard to their behaviour in the environment, based simply on the knowledge of molecular structure.
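To make the K-NN case concrete, the following is a minimal sketch of a 1-nearest-neighbour classifier validated by leave-one-out. It assumes Euclidean distance on autoscaled descriptors, which the poster does not state, and the descriptor values (MW, BAL, nHD) and class labels are hypothetical.

```python
import math


def autoscale(rows):
    """Centre each descriptor to mean 0 and scale it to unit sample standard deviation."""
    n = len(rows)
    columns = list(zip(*rows))
    means = [sum(col) / n for col in columns]
    sds = [math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
           for col, m in zip(columns, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, sds)] for row in rows]


def loo_error_rate_1nn(descriptors, classes):
    """Leave-one-out error rate (%) of a 1-NN classifier.

    Each object is assigned the class of its nearest neighbour among
    all the other objects; the object itself is never used.
    """
    z = autoscale(descriptors)
    errors = 0
    for i, zi in enumerate(z):
        others = (j for j in range(len(z)) if j != i)
        nearest = min(others, key=lambda j: math.dist(zi, z[j]))
        if classes[nearest] != classes[i]:
            errors += 1
    return 100.0 * errors / len(z)


# Hypothetical (MW, BAL, nHD) values for six compounds and their classes.
descriptors = [
    (250.0, 2.1, 0), (260.0, 2.2, 0),
    (340.0, 2.9, 1), (350.0, 3.0, 1),
    (200.0, 1.8, 2), (205.0, 1.9, 2),
]
classes = [1, 1, 4, 4, 3, 3]
error_rate_cv = loo_error_rate_1nn(descriptors, classes)
```

With K = 1 every training object is its own nearest neighbour, so a fitted (non-validated) error rate is not informative; only a cross-validated figure is meaningful for this method.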

METHOD           N. OBJ.   NO-MODEL E.R.%   E.R.%   E.R.% (cv)   MOLEC. DESCRIPTORS
CART             54        62.96            11.11   22.22        MW - BAL - nHD
KNN (K=1)        54        62.96            /       16.67        MW - BAL - nHD
LDA              54        62.96            16.67   24.07        MW - BAL - nHD
RDA (=0.25 =0)   54        62.96            14.81   16.67        MW - BAL - nHD

CLASSIFICATION MODELS RESULTS

ONLY 54 EXPERIMENTAL DATA
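The error rates in the table can be read as object counts. The short sketch below converts each percentage back to a number of misclassified pesticides out of 54. It assumes that the no-model error rate is the error obtained by assigning every object to the most populated class; the class sizes passed to `no_model_error_rate` are hypothetical, chosen only so that the largest class has 20 objects.

```python
def misclassified(error_rate_percent, n_objects):
    """Number of misclassified objects implied by an error rate in percent."""
    return round(error_rate_percent * n_objects / 100)


def no_model_error_rate(class_sizes):
    """Error rate (%) when every object is assigned to the largest class.

    Assumption: this is the definition behind the NO-MODEL E.R.% column.
    """
    total = sum(class_sizes)
    return 100.0 * (total - max(class_sizes)) / total


N_OBJECTS = 54
# (fitted E.R.%, cross-validated E.R.%) as reported in the table; None = not given.
reported = {
    "CART": (11.11, 22.22),
    "KNN": (None, 16.67),
    "LDA": (16.67, 24.07),
    "RDA": (14.81, 16.67),
}
counts = {
    method: tuple(None if er is None else misclassified(er, N_OBJECTS) for er in pair)
    for method, pair in reported.items()
}

# Hypothetical class sizes; only the size of the largest class (20) matters here.
baseline = no_model_error_rate([20, 14, 10, 10])
```

Under that assumption, 62.96% corresponds to 34 of 54 objects, so every model in the table misclassifies far fewer pesticides than the trivial assignment does.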

QSPR PREDICTION OF MISSING DATA and COMPARTMENT-ASSIGNMENT OF 173 PESTICIDES

Because experimental data for the most relevant physico-chemical properties are missing for many compounds, the above-mentioned molecular descriptors have been used to obtain OLS regression models with satisfactory predictive power (Q2(LOO) = 68-83%); the reliability of the predicted data has always been checked by the leverage approach. The experimental and predicted data of these properties for a total of 173 pesticides of different chemical classes are then combined in Principal Component Analysis, as shown in the graph below:

PCA ON EXP. AND PRED. DATA OF PHYSICO-CHEMICAL PROPERTIES OF 173 PESTICIDES

Cum. E.V. = 88.2% (PC1 = 63.1%)

[Figure: plot of PC1 versus PC2 for the 173 pesticides (numbered points), with the variables LogKow, LogKoc, LogSol, Logvp and LogH marked and three labelled regions: SOLUBILITY, SORPTION and VOLATILITY.]

PROPERTY                   OBJ. N.   VAR. N.   R2     Q2(LOO)   Q2(LMO)   SDEP   SDEC   SELECTED VARIABLES
Log Koc                    141       6         84.3   82.4      82.1      0.36   0.34   MW - nNO - nHA - CIC - MAXDP - Ts
Log Kow                    158       6         81.2   79.5      79.1      0.65   0.62   nN - nO - nNO - IDDM - MAXDP - Am
Log Water solubility       142       6         76.7   74.4      74.1      0.92   0.87   MW - nN - nO - nNO - Bic - UI
Log Vapour pressure        119       6         72.4   68.5      67.8      0.93   0.87   nHD - nHA - CHI1 - Se - DELS - 2p
Log Henry's law constant   63        6         79.9   74.5      74.0      0.91   0.81   nP - nR06 - ZM2 - MAXDP - HY - E2s

REGRESSION MODELS RESULTS
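The statistics reported for the regression models (R2, Q2(LOO), SDEC, SDEP) and the leverage check can be illustrated with a deliberately reduced sketch: OLS with a single descriptor instead of six, on hypothetical data. The definitions used are assumptions, since the poster does not spell them out: SDEC = sqrt(RSS/n), SDEP = sqrt(PRESS/n), Q2(LOO) = 1 - PRESS/TSS, and a leverage warning value h* = 3p'/n, where p' is the number of model parameters.

```python
import math


def ols_validation(x, y):
    """Fit y = b0 + b1*x by OLS and return fit, leave-one-out and leverage statistics."""
    n = len(x)
    x_mean, y_mean = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean

    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Leverage of each object (diagonal of the hat matrix for one descriptor).
    leverages = [1.0 / n + (xi - x_mean) ** 2 / sxx for xi in x]
    # Exact leave-one-out residuals for OLS: e_i / (1 - h_i), no refitting needed.
    loo_residuals = [e / (1.0 - h) for e, h in zip(residuals, leverages)]

    rss = sum(e * e for e in residuals)
    press = sum(e * e for e in loo_residuals)
    tss = sum((yi - y_mean) ** 2 for yi in y)
    return {
        "R2": 1.0 - rss / tss,
        "Q2_LOO": 1.0 - press / tss,
        "SDEC": math.sqrt(rss / n),      # assumed definition
        "SDEP": math.sqrt(press / n),    # assumed definition
        "leverages": leverages,
    }


# Hypothetical descriptor values and log-property values (illustrative only).
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 20.0]
y = [1.1, 1.9, 3.2, 3.9, 5.1, 6.0, 6.8, 20.5]
stats = ols_validation(x, y)

n_parameters = 2                          # intercept + one slope
h_star = 3 * n_parameters / len(x)        # assumed warning leverage, 3p'/n
flagged = [i for i, h in enumerate(stats["leverages"]) if h > h_star]
```

A prediction for a compound whose leverage exceeds h* is an extrapolation outside the descriptor range covered by the training set and is usually treated as less reliable. Q2(LOO) can never exceed R2, and SDEP can never be smaller than SDEC, which matches the pattern in the table.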

The distribution of the 173 chemicals in the PC1-PC2 space shows that the PC1 scores separate sorbed from volatile pesticides, while the PC2 scores separate volatile from non-volatile pesticides.

Regression models by the OLS method are therefore built on the PC1 and PC2 scores, with the aim of predicting these characteristics (volatility, sorption and solubility) for each pesticide starting only from a few theoretical descriptors of molecular structure (the count descriptors nC, nS, nX, nCO and nCIC; the unsaturation index UI; the hydrophobicity index HYF; and the WHIM descriptor of global dimension Tm).

PC1 MODEL: Q2(LOO): 84.9%; R2: 85.8%; Q2(LMO): 84.7%; SDEP: 0.69; SDEC: 0.67. Selected descriptors: nC - nS - nX - nCO

PC2 MODEL: Q2(LOO): 78.3%; R2: 80.3%; Q2(LMO): 77.7%; SDEP: 0.52; SDEC: 0.47. Selected descriptors: nCIC - UI - HYF - Tm

The classification and regression models presented allow an easy “a priori” prediction of the environmental compartment in which existing and new pesticides are likely to be found, starting only from the knowledge of a few molecular structure descriptors.

MW : molecular weight

BAL : Balaban distance connectivity index

nHD : number of donor atoms for H-bonds

EXPERIMENTAL and PREDICTED DATA (173 compounds)