The Quantitative Structure – Insecticidal Activity Relationships from plant derived compounds against chikungunya and zika Aedes aegypti (Diptera:Culicidae) vector

Laura M. Saavedra a,*, Gustavo P. Romanelli b, c, Ciro E. Rozo d, Pablo R. Duchowicz a,*

a Instituto de Investigaciones Fisicoquímicas Teóricas y Aplicadas (INIFTA). CONICET. UNLP. Diag. 113 y 64. C.C. 16. Sucursal 4. 1900 La Plata. Argentina.

b Centro de Investigación y Desarrollo en Ciencias Aplicadas “Dr. J.J. Ronco” (CINDECA). Departamento de Química. Facultad de Ciencias Exactas. CONICET. UNLP. Calle 47 No. 257. B1900AJK La Plata. Argentina.

c Cátedra de Química Orgánica. Centro de Investigación en Sanidad Vegetal (CISaV). Facultad de Ciencias Agrarias y Forestales. Universidad Nacional de La Plata. Calles 60 y 119 s/n. B1904AAN La Plata. Argentina.

d Grupo de Investigaciones Ambientales para el Desarrollo Sostenible (GIADS). Universidad Santo Tomas, seccional Bucaramanga. Carrera 18 No. 9-27. 680011 Bucaramanga. Colombia.

*Corresponding authors: Tel.: +54 221 425 7430. +54 221 425 7291; fax: +54 221 425 4642.

E-mail addresses: laurasaa0913@gmail.com (L.M.S.); [email protected] (P.R.D.)

(Total 26 pages including this cover page)

The Quantitative Structure – Insecticidal Activity Relationships from plant derived compounds against chikungunya and zika Aedes aegypti (Diptera:Culicidae) vector Supplementary material

Supplementary Material

Table 1S. The experimental insecticidal activity of plant-derived compounds used in this QSAR study

Table 2S. List of mathematical equations used in the present study

Table 3S. A brief description for different molecular descriptors

Table 4S. Correlation matrix for the best QSAR model established

Table 5S. The values of the descriptors included in the best QSAR model (Eq. 1)

Table 6S. Experimental, predicted (Eq. 1) and residual values for the insecticidal activity

Table 7S. The best QSAR models found considering all descriptors blocks for the ‘set A’ and ‘set B’

Figure 1S. A. Predicted and experimental values for the model of the set A. B. Dispersion plot of residuals

Figure 2S. A. Predicted and experimental values for the model of the set B. B. Dispersion plot of residuals

Table 8S. The values of the descriptors included in the best QSAR model of the Set A

Table 9S. The values of the descriptors included in the best QSAR model of the Set B

Table 10S. The best QSAR models found considering 233 simpler Dragon descriptors

Table 11S. Correlation matrix of the simpler descriptors of Eq. 2

Table 12S. The values of the simpler descriptors included in the QSAR model (Eq. 2)

The Replacement Method (RM) procedure

The Balanced Subsets Method (BSM) procedure


Table 1S. The experimental insecticidal activity of plant-derived compounds used in this QSAR study.

ID SMILES code LC50 (µg/mL) exp. Reference

1 C1[C@@H]([C@@H](C[C@@H](C1)C)O)C(C)C 404 Santos et al. (2011)

2 O=C1C[C@@H](CC=C1C)C(=C)C 152 Santos et al. (2011)

3 O=C1C[C@H](CC=C1C)C(=C)C 124 Santos et al. (2011)

4 [C@@H]1(CC(=O)[C@@H](CC1)C)C(=C)C 118 Santos et al. (2011)

5 O1[C@@]2([C@H]1C[C@H](CC2=O)C(=C)C)C 219 Santos et al. (2011)

6 O1[C@]2([C@H]1C[C@@H](CC2)C(=C)C)C 517 Santos et al. (2011)

7 [C@@H]1(CCC(=CC1)C)C(=C)C 27 Santos et al. (2011)

8 [C@@H]1(CCC(=CC1)C)C(=C)C 30 Santos et al. (2011)

9 C(C1=CCC(=CC1)C)(C)C 56 Santos et al. (2011)

10 C1(=CC[C@H]2C(C)(C)[C@@H]2C1)C 150 Santos et al. (2011)

11 O[C@H]1[C@@H](CC[C@H](C1)C)C(=C)C 297 Santos et al. (2011)

12 [C@@H]1([C@H](C[C@@H](CC1)C)O)C(=C)C 554 Santos et al. (2011)

13 [C@@H]12C(C(=C)[C@@H](C1)CC2)(C)C 220.7 Scotti et al. (2014)

14 CC(c1ccc(c(c1)O)C)C 50.8 Scotti et al. (2014)

15 CC(c1ccc(c(c1)OC(=O)C)C)C 92 Scotti et al. (2014)

16 CC(c1ccc(c(c1)OC(=O)C(Cl)Cl)C)C 51.9 Scotti et al. (2014)

17 CC(c1ccc(c(c1)OC(=O)C(Cl)(Cl)Cl)C)C 66.5 Scotti et al. (2014)

18 CC(c1ccc(c(c1)OC(=O)CC)C)C 66.6 Scotti et al. (2014)

19 c1cc(cc(c1C(C)C)O)C 33.8 Scotti et al. (2014)

20 c1cc(cc(c1C(C)C)OC(=O)C)C 91.9 Scotti et al. (2014)

21 CC(c1ccc(cc1OC(=O)C(Cl)(Cl)Cl)C)C 41.7 Scotti et al. (2014)

22 CC(c1ccc(cc1OC(=O)CC)C)C 66.6 Scotti et al. (2014)

23 O(c1cc(CC=C)ccc1OC(=O)C)C 108.1 Scotti et al. (2014)

24 O(c1cc(CC=C)ccc1O)C 73.2 Scotti et al. (2014)

25 O(c1cc(CC=C)ccc1OC)C 102.5 Scotti et al. (2014)

26 O(c1c(OC)cc(CC=C)cc1)C(=O)CC 61.9 Scotti et al. (2014)

27 O(c1cc(CC=C)ccc1OC(=O)c1ccccc1)C 706.8 Barbosa et al. (2012)

28 O(c1cc(CC=C)ccc1OCC)C 67.2 Barbosa et al. (2012)

29 Oc1ccccc1 194 Santos et al. (2010)

30 O(c1cc(CC=C)ccc1O[Si](C)(C)C(C)(C)C)C 279 Barbosa et al. (2012)

31 [C@]12(C([C@@H](C[C@H]1O)CC2)(C)C)C 598 Scotti et al. (2014)

32 Oc1c(O)cccc1 240 Scotti et al. (2014)

33 O(c1c(O)cccc1)C 380.7 Scotti et al. (2014)

34 O=C(CCCCCCC)C 200.5 Scotti et al. (2014)

35 O=C(CCCCCCCCC)C 52.6 Scotti et al. (2014)

36 C(c1ccc(cc1)C)(C)C 51 Scotti et al. (2014)

37 Oc1cc(O)ccc1 577 Scotti et al. (2014)

38 O(c1cc(ccc1O)C=O)C 514.2 Scotti et al. (2014)

39 O[C@@H]1[C@]2(C([C@@H](C1)CC2)(C)C)C 605.9 Scotti et al. (2014)

40 [C@H]12C(C(=C)[C@H](C1)CC2)(C)C 406 Santos et al. (2010)

41 O=C1[C@@H](CC[C@H](C1)C)C(C)C 508 Santos et al. (2011)

42 C1(=O)C[C@@H](C(=C)C)C[C@@H](C#N)[C@@H]1C 412 Santos et al. (2011)

43 O=C1[C@]2(C([C@H](CC2)C1)(C)C)C 657 Santos et al. (2010)

44 O1[C@@]2(CC[C@]1(CC2)C)C(C)C 751 Santos et al. (2010)

45 O1C([C@@H]2CC[C@]1(C=C2)C)(C)C 1419 Santos et al. (2010)

46 CC(c1ccc(c(c1)OC(=O)c1ccccc1)C)C 55.4 Scotti et al. (2014)

47 CC(c1ccc(c(c1)OCC(=O)O)C)C 169.1 Scotti et al. (2014)

48 CC(c1ccc(c(c1C=O)O)C)C 66.1 Scotti et al. (2014)

49 CC(c1ccc(c(c1O)C=O)C)C 33.9 Scotti et al. (2014)

50 c1cc(cc(c1C(C)C)OC(=O)CCl)C 45.1 Scotti et al. (2014)

51 c1cc(cc(c1C(C)C)OC(=O)c1ccccc1)C 88 Scotti et al. (2014)

52 c1cc(cc(c1C(C)C)OCC)C 123.2 Scotti et al. (2014)

53 c1cc(cc(c1C(C)C)OCC(=O)O)C 464.4 Scotti et al. (2014)

54 O(c1cc(CC=C)ccc1OCC(=O)O)C 202.7 Scotti et al. (2014)

55 O(c1cc(CCCO)ccc1OC(=O)c1ccccc1)C 166 Barbosa et al. (2012)

56 O(c1cc(CCCO)ccc1O)C 1415.1 Barbosa et al. (2012)

57 C1=C[C@@H]2C[C@H]1C(CCO)(C2)CCO 758 Santos et al. (2010)

58 Oc1c(cccc1)C=O 136 Scotti et al. (2014)

59 C[CH2]=[CH2]CCCC 40 Scotti et al. (2014)

60 C1=C[C@@H]2C[C@H]1[C@@H](C2)O 759 Santos et al. (2010)

61 C1=C[C@@H]2C[C@H]1[C@@H](CCO)[C@H]2CCO 1407 Santos et al. (2010)

62 C1=C[C@@H]2C[C@H]1[C@H](CCO)[C@@H]2CCO 717 Santos et al. (2010)

Table 2S. List of mathematical equations used in the present study.

The quantities that appear in the equations are: S, standard deviation; N, number of molecules; the experimental activity for compound i; the predicted activity for compound i; d, number of descriptors; RMSD, root mean square deviation; the number of molecules in the training set; the number of molecules in the test set; the calculated activities for compound i in the test set from the two regressions through the origin between experimental and predicted activities (each regressed against the other); the squared correlation coefficients of these two regressions through the origin in the test set; the average values of the experimental and of the predicted activities in the test set; the modified squared correlation coefficient; and the squared correlation coefficient between observed and predicted values for the test set.
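The statistics named in the legend can be illustrated with a minimal Python sketch. The formulas below are common textbook forms and are assumptions, since the equations of Table 2S are not reproduced here: S uses N − d − 1 degrees of freedom, the regression through the origin follows one widely used convention, and the modified coefficient is taken as R²(1 − √|R² − R0²|). All function names are hypothetical.

```python
import math

def mean(values):
    return sum(values) / len(values)

def rmsd(y_obs, y_pred):
    """Root mean square deviation between observed and predicted values."""
    sq = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    return math.sqrt(sq / len(y_obs))

def std_dev(y_obs, y_pred, d):
    """Model standard deviation S, assuming N - d - 1 degrees of freedom."""
    sq = sum((o - p) ** 2 for o, p in zip(y_obs, y_pred))
    return math.sqrt(sq / (len(y_obs) - d - 1))

def r_squared(y_obs, y_pred):
    """Squared Pearson correlation between observed and predicted values."""
    mo, mp = mean(y_obs), mean(y_pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(y_obs, y_pred))
    var_o = sum((o - mo) ** 2 for o in y_obs)
    var_p = sum((p - mp) ** 2 for p in y_pred)
    return cov ** 2 / (var_o * var_p)

def r0_squared(y_obs, y_pred):
    """R0^2 for the regression through the origin of observed against
    predicted values (assumed convention: slope k, no intercept)."""
    k = sum(o * p for o, p in zip(y_obs, y_pred)) / sum(p * p for p in y_pred)
    mo = mean(y_obs)
    ss_res = sum((o - k * p) ** 2 for o, p in zip(y_obs, y_pred))
    ss_tot = sum((o - mo) ** 2 for o in y_obs)
    return 1.0 - ss_res / ss_tot

def rm_squared(y_obs, y_pred):
    """Modified squared correlation coefficient (assumed form)."""
    r2 = r_squared(y_obs, y_pred)
    return r2 * (1.0 - math.sqrt(abs(r2 - r0_squared(y_obs, y_pred))))

# Usage with the first five rows of Table 6S (experimental, predicted).
exp_vals = [2.61, 2.18, 2.09, 2.07, 2.34]
pred_vals = [2.57, 2.16, 2.16, 2.33, 2.75]
print(rmsd(exp_vals, pred_vals), r_squared(exp_vals, pred_vals))
```

The five-compound example only demonstrates the calls; it is not a validation of the published model.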


Table 3S. A brief description of different molecular descriptors calculated by means of the Dragon 6 commercial program.

Each class is followed by its descriptors (symbol, then description).

CATS 2D
  CATS2D_03_DA   CATS2D Donor-Acceptor at lag 03
  CATS2D_05_LL   CATS2D Lipophilic-Lipophilic at lag 05
  CATS2D_06_LL   CATS2D Lipophilic-Lipophilic at lag 06
  CATS2D_08_DD   CATS2D Donor-Donor at lag 08
  CATS2D_08_DA   CATS2D Donor-Acceptor at lag 08
  CATS2D_09_AL   CATS2D Acceptor-Lipophilic at lag 09

Connectivity indices
  X4             connectivity index of order 4
  X5v            valence connectivity index of order 5
  X4v            valence connectivity index of order 4
  X4sol          solvation connectivity index of order 4

Molecular properties
  BLTF96         Verhaar Fish base-line toxicity from MLOGP (mmol/l)
  BLTD48         Verhaar Daphnia base-line toxicity from MLOGP (mmol/l)
  MLOGP          Moriguchi octanol-water partition coeff. (logP)
  Uc             unsaturation count
  Ui             unsaturation index

Constitutional indices
  Se             sum of atomic Sanderson electronegativities (scaled on Carbon atom)

2D autocorrelations
  MATS5e         Moran autocorrelation of lag 5 weighted by Sanderson electronegativity
  MATS5i         Moran autocorrelation of lag 5 weighted by ionization potential
  MATS7i         Moran autocorrelation of lag 7 weighted by ionization potential
  MATS1v         Moran autocorrelation of lag 1 weighted by van der Waals volume
  GATS5i         Geary autocorrelation of lag 5 weighted by ionization potential
  GATS7p         Geary autocorrelation of lag 7 weighted by polarizability
  GATS7i         Geary autocorrelation of lag 7 weighted by ionization potential
  ATSC5s         Centred Broto-Moreau autocorrelation of lag 5 weighted by I-state
  JGI7           mean topological charge index of order 7
  JGI3           mean topological charge index of order 3
  ATS2m          Broto-Moreau autocorrelation of lag 2 (log function) weighted by mass

2D matrix-based descriptors
  SpMAD_B(m)     spectral mean absolute deviation from Burden matrix weighted by mass
  J.Dz(i)        Balaban-like index from Barysz matrix weighted by ionization potential
  SpMax2.Bh(m)   largest eigenvalue n. 2 of Burden matrix weighted by mass

Atom-centred fragments
  H.052          H attached to C0(sp3) with 1X attached to next C
  C-026          R--CX--R

Edge adjacency indices
  Chi1_EA(bo)    connectivity-like index of order 1 from edge adjacency mat. weighted by bond order
  Eig13_AEA(dm)  eigenvalue n. 13 from augmented edge adjacency mat. weighted by dipole moment
  Eig08_EA(dm)   eigenvalue n. 8 from edge adjacency mat. weighted by dipole moment
  Eig10_EA(dm)   eigenvalue n. 10 from edge adjacency mat. weighted by dipole moment
  Eig10_EA(bo)   eigenvalue n. 10 from edge adjacency mat. weighted by bond order

2D Atom Pairs
  F10[C-C]       Frequency of C - C at topological distance 10
  F02[C-O]       Frequency of C - O at topological distance 2
  F05[C-O]       Frequency of C - O at topological distance 5
  F06[C-O]       Frequency of C - O at topological distance 6
  B01[C-O]       Presence/absence of C - O at topological distance 1
  B06[C-C]       Presence/absence of C - C at topological distance 6
  B10[C-O]       Presence/absence of C - O at topological distance 10

Functional group counts
  nCt            number of total tertiary C(sp3)
  nRCOOR         number of esters (aliphatic)
  nHAcc          number of acceptor atoms for H-bonds (N,O,F)

Ring descriptors
  Rbrid          ring bridge count
  nR06           number of 6-membered rings
  TRS            total ring size

Topological indices
  ICR            radial centric information index
  TIE            E-state topological parameter

Drug-like indices
  LLS_02         modified lead-like score from Monge et al. (8 rules)

Table 4S. Correlation matrix for the best QSAR model established.

Eq. 1
  1      0.154  0.243  0.034  0.413
         1      0.101  0.389  0.184
                1      0.514  0.122
                       1      0.027
                              1

Table 5S. The values of the descriptors included in the best QSAR model (Eq. 1).

ID
1   2.46  33.29  0     3.27  0
2   2.71  30.88  0     3.42  0
3   2.71  30.88  0     3.42  0
4   2.54  30.88  0     3.37  0
5   2.15  35.53  0     3.49  5
6   2.00  13.13  0     3.49  7
7   2.50  8.67   0     3.40  0
8   2.50  8.67   0     3.40  0
9   2.54  6.89   0     3.38  0
10  2.13  5.62   0     3.38  0
11  2.53  31.14  0     3.40  3
12  2.53  31.14  0     3.40  3
13  2.34  6.09   0     3.20  0
14  3.09  19.04  0     3.37  0
15  3.18  22.94  0     3.45  0
16  3.15  28.64  0.02  3.88  0
17  3.20  33.31  0.03  3.89  0
18  3.12  22.71  0.02  3.51  0
19  3.13  33.29  0     3.36  0
20  3.26  28.86  0     3.41  0
21  3.28  39.67  0.01  3.90  0
22  3.20  27.92  0.01  3.49  0
23  3.18  35.40  0.01  3.46  0
24  3.07  24.26  0     3.38  0
25  3.13  22.11  0.01  3.39  0
26  3.13  34.88  0.01  3.51  0
27  2.36  47.78  0.01  3.81  0
28  3.09  21.49  0.01  3.41  3
29  3.10  6.27   0     2.83  0
30  2.95  34.80  0.02  3.85  9
31  2.42  25.91  0     3.09  2
32  3.27  15.96  0     2.87  0
33  3.21  20.52  0     3.06  0
34  2.85  11.08  0.02  3.46  0
35  2.87  11.80  0.01  3.52  0
36  2.89  7.41   0     3.36  0
37  3.21  20.91  0     2.90  0
38  3.28  41.21  0     3.23  0
39  2.42  25.91  0     3.09  2
40  2.34  6.09   0     3.20  0
41  2.52  33.43  0     3.38  0
42  2.78  75.99  0     3.45  0
43  2.49  34.17  0     3.12  0
44  2.30  7.66   0     3.34  12
45  2.45  9.69   0     3.26  12
46  2.37  34.63  0.01  3.82  0
47  3.03  22.95  0.03  3.48  0
48  3.42  40.40  0     3.37  0
49  3.40  67.87  0     3.40  0
50  3.21  31.26  0.01  3.77  0
51  2.42  40.74  0.01  3.81  0
52  3.16  18.12  0     3.36  3
53  3.10  29.23  0.01  3.46  0
54  3.04  29.85  0.02  3.48  0
55  2.28  49.97  0.01  3.81  0
56  2.84  28.95  0.02  3.26  0
57  2.13  42.00  0     3.39  4
58  3.23  30.54  0     3.13  0
59  2.69  2.48   0     3.35  0
60  2.27  4.25   0     2.96  3
61  2.05  26.55  0     3.33  4
62  2.05  26.55  0     3.33  4

Table 6S. Experimental, predicted (Eq. 1) and residual values for the insecticidal activity.

Values are log10 LC50 (LC50 in µg/mL).

ID  exp.  pred.  Residual
1   2.61  2.57   0.04
2   2.18  2.16   0.02
3   2.09  2.16   -0.07
4   2.07  2.33   -0.25
5   2.34  2.75   -0.41
6   2.71  2.55   0.17
7   1.43  1.88   -0.44
8   1.48  1.88   -0.40
9   1.75  1.87   -0.12
10  2.18  2.03   0.15
11  2.47  2.51   -0.04
12  2.74  2.51   0.23
13  2.34  2.23   0.12
14  1.71  1.86   -0.15
15  1.96  1.76   0.20
16  1.72  1.62   0.09
17  1.82  1.75   0.08
18  1.82  1.99   -0.16
19  1.53  2.13   -0.60
20  1.96  1.90   0.07
21  1.62  1.43   0.19
22  1.82  1.91   -0.09
23  2.03  2.22   -0.19
24  1.87  1.95   -0.09
25  2.01  2.07   -0.06
26  1.79  2.17   -0.37
27  2.85  2.20   0.65
28  1.83  2.22   -0.40
29  2.29  2.50   -0.22
30  2.45  2.49   -0.04
31  2.78  2.89   -0.11
32  2.38  2.55   -0.17
33  2.58  2.34   0.24
34  2.30  2.08   0.22
35  1.72  1.81   -0.09
36  1.71  1.75   -0.04
37  2.76  2.62   0.14
38  2.71  2.43   0.29
39  2.78  2.89   -0.11
40  2.61  2.23   0.38
41  2.71  2.36   0.35
42  2.65  2.93   -0.28
43  2.82  2.83   -0.01
44  2.88  2.93   -0.05
45  3.15  3.03   0.13
46  1.74  1.98   -0.23
47  2.23  2.38   -0.15
48  1.82  2.11   -0.29
49  2.97  2.59   0.38
50  1.65  1.51   0.14
51  1.94  2.07   -0.13
52  2.09  2.05   0.04
53  2.67  2.04   0.63
54  2.31  2.20   0.11
55  1.60  2.32   -0.71
56  3.15  2.79   0.36
57  2.88  2.97   -0.09
58  2.13  2.41   -0.27
59  1.60  1.76   -0.16
60  2.88  2.85   0.03
61  3.15  2.83   0.32
62  2.86  2.83   0.03
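The link between Table 1S and Table 6S can be checked with a short Python sketch. It assumes that the experimental column of Table 6S is the base-10 logarithm of the LC50 value in Table 1S and that the residual is the experimental minus the predicted value; both assumptions reproduce the tabulated numbers for the three compounds used here. The dictionary and function names are hypothetical.

```python
import math

# LC50 values (µg/mL) for compounds 1-3, copied from Table 1S.
lc50 = {1: 404.0, 2: 152.0, 3: 124.0}
# Predicted log10(LC50) values for the same compounds, copied from Table 6S.
predicted = {1: 2.57, 2: 2.16, 3: 2.16}

def residual_row(compound_id):
    """Return (experimental log10 LC50, predicted value, residual),
    each rounded to two decimals as in Table 6S."""
    exp_log = round(math.log10(lc50[compound_id]), 2)
    res = round(exp_log - predicted[compound_id], 2)
    return exp_log, predicted[compound_id], res

for cid in sorted(lc50):
    print(cid, *residual_row(cid))
```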


Table 7S. The best QSAR models found considering all descriptors blocks for the ‘set A’ and ‘set B’.

Set A

d                                   molecular descriptors
1   0.27  0.40  0.71  0.20  0
2   0.44  0.36  0.71  0.24  0.01
3   0.56  0.33  0.53  0.32  0.16
4   0.66  0.29  0.60  0.38  0.35
5   0.77  0.25  0.71  0.49  0.94

Set B

d                                   molecular descriptors
1   0.62  0.32  0.75  0.27  0
2   0.83  0.22  0.80  0.33  0.25
3   0.93  0.14  0.70  0.50  0.19
4   0.96  0.11  0.89  0.59  0.18
5   0.97  0.10  0.90  N/A*  0.46

* Not available.

Figure 1S. A. Predicted and experimental values for the model of the set A. B. Dispersion plot of residuals.

Figure 2S. A. Predicted and experimental values for the model of the set B. B. Dispersion plot of residuals.

Table 8S. The values of the descriptors included in the best QSAR model of the Set A.

ID
14  0.86  0.00   1  0
15  1.05  -1.55  1  0
16  1.04  -1.00  1  0
17  1.03  -0.92  1  0
18  1.05  -1.32  1  0
19  0.94  0.00   1  0
20  0.99  -1.53  1  0
21  0.97  -0.95  1  0
22  0.99  -1.38  1  0
23  1.12  -1.29  2  0
24  1.01  0.00   2  1
25  1.01  -2.00  2  0
26  1.10  -1.14  2  0
27  0.89  -0.57  2  0
28  1.12  -1.54  2  0
29  0.31  0.00   1  0
30  0.71  -1.00  2  0
32  0.38  0.00   2  0
33  0.88  0.00   2  1
36  0.88  0.00   0  0
37  0.42  0.00   2  0
38  0.83  0.00   2  1
46  0.85  -0.92  1  0
47  1.08  -1.31  1  1
48  1.23  -2.00  1  0
49  0.86  -2.00  1  0
50  0.98  -1.37  1  0
51  0.80  -0.95  1  0
52  1.03  -2.00  1  0
53  1.02  -1.37  1  1
54  1.11  -1.12  2  1
55  0.88  -0.40  2  0
56  1.00  -2.00  2  1
58  0.54  0.00   1  0

Table 9S. The values of the descriptors included in the best QSAR model of the Set B.

ID
1   2.91  -1.77
2   2.92  -2.00
3   2.92  -2.00
4   2.92  -2.00
5   3.12  -1.56
6   3.01  -1.72
7   2.78  -2.43
8   2.78  -2.43
9   2.78  -2.62
10  2.88  -1.86
11  2.96  -1.94
12  2.96  -1.94
13  3.06  -2.00
31  3.22  -1.74
34  2.59  0.00
35  2.77  -1.79
39  3.22  -1.74
40  3.06  -2.00
41  2.96  -1.92
42  3.09  -1.75
43  3.21  -1.74
44  3.24  -1.72
45  3.22  -1.60
57  3.21  -1.45
59  2.11  0.00
60  2.81  0.00
61  3.18  -1.46
62  3.18  -1.46

Table 10S. The best QSAR models found considering 233 simpler Dragon descriptors. The selected model appears in bold.

d                                   molecular descriptors
1   0.26  0.42  0.79  0.41  0.00
2   0.49  0.35  0.79  0.33  0.03
3   0.54  0.34  0.85  0.32  0.42
4   0.60  0.32  0.87  0.33  0.21
5   0.62  0.31  0.87  0.35  0.21
6   0.69  0.29  0.82  0.40  0.56

Table 11S. Correlation matrix of the simpler descriptors of Eq. 2.

Eq. 2
  1      0.013  0.026  0.03
         1      0.452  0.46
                1      0.13
                       1

Table 12S. The values of the simpler descriptors included in the QSAR model (Eq. 2).

ID  TRS  TIE     Ui     BLTF96
1   6    59.99   1.43   1.73
2   6    59.99   1.428  1.65
3   6    59.99   1.428  1.65
4   6    59.99   1.428  1.65
5   9    92.19   1.461  1.37
6   9    80.43   1.426  1.33
7   6    50.00   1.389  1.60
8   6    50.00   1.389  1.60
9   6    50.00   1.389  1.60
10  9    72.00   1.395  1.63
11  6    59.99   1.43   1.73
12  6    59.99   1.43   1.73
13  10   72.00   1.404  2.13
14  6    59.99   1.428  1.65
15  6    77.98   1.513  1.42
16  6    113.42  1.555  1.31
17  6    140.00  1.573  1.31
18  6    83.65   1.534  1.34
19  6    59.99   1.43   1.73
20  6    77.98   1.516  1.53
21  6    140.00  1.576  1.40
22  6    83.65   1.538  1.44
23  6    83.08   1.533  1.30
24  6    65.09   1.458  1.45
25  6    71.09   1.487  1.41
26  6    88.75   1.553  1.24
27  12   118.41  1.618  0.81
28  6    75.42   1.511  1.33
29  6    32.66   1.241  2.02
30  6    146.76  1.607  1.28
31  10   87.33   1.443  2.29
32  6    43.10   1.305  2.09
33  6    49.09   1.354  1.89
34  0    35.33   1.373  1.49
35  0    43.33   1.439  1.31
36  6    50.00   1.389  1.60
37  6    41.32   1.303  1.95
38  6    62.42   1.43   1.63
39  10   87.33   1.443  2.29
40  10   72.00   1.404  2.13
41  6    59.99   1.43   1.73
42  6    73.66   1.492  1.65
43  10   87.33   1.443  2.29
44  10   83.32   1.436  1.83
45  12   80.65   1.438  1.84
46  12   113.31  1.604  0.86
47  6    83.87   1.532  1.27
48  6    75.65   1.494  1.78
49  6    75.65   1.493  1.74
50  6    92.75   1.538  1.44
51  12   113.31  1.607  0.91
52  6    70.32   1.491  1.58
53  6    83.87   1.536  1.36
54  6    88.97   1.552  1.19
55  12   123.41  1.632  0.77
56  6    70.09   1.483  1.29
57  10   83.99   1.492  1.40
58  6    47.32   1.354  1.89
59  0    20.00   1.222  1.73
60  10   53.99   1.309  2.22
61  10   80.99   1.49   1.27
62  10   80.99   1.49   1.27

The Replacement Method (RM) procedure

The procedure of the RM technique is as follows: choose d descriptors at random from the descriptor pool and do a linear regression. Choose one of the descriptors of this set, called Xi, and replace it by each of the descriptors of the pool (except itself), keeping the best resulting set [21]. Since one can start replacing any of the d descriptors in the initial model, a regression equation with d variables has d possible paths to achieve the final result; the choice above develops into path i. Next, choose the variable with the greatest relative error in its coefficient (except the one replaced in the previous step) and replace it with all the descriptors of the pool (except itself), again keeping the best set. Replace each of the remaining variables in the same way, then start again with the variable having the greatest relative error in its coefficient and repeat the whole process. The process is repeated as many times as needed until the set of descriptors remains unchanged. At the end, we have the best model for path i. Proceed in exactly the same way for all d possible paths, compare the resulting models and keep the best one. Our numerical experiments show that in this way one obtains a model almost as good as the best one found by a full search, with far fewer linear regressions, when the combinatorial number of possible descriptor subsets is large [29].
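The steps above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the quality criterion is taken to be the model standard deviation S with N − d − 1 degrees of freedom, the relative error of a coefficient is its standard error divided by its absolute value, and all function names are hypothetical.

```python
import math
import random

def invert(a):
    """Invert a small square matrix by Gauss-Jordan elimination with pivoting."""
    n = len(a)
    m = [row[:] + [float(i == j) for j in range(n)] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular matrix")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [vr - f * vc for vr, vc in zip(m[r], m[col])]
    return [row[n:] for row in m]

def fit(X, y, cols):
    """Least-squares fit of y on the chosen columns plus an intercept.
    Returns (S, relative errors of the slope coefficients)."""
    rows = [[1.0] + [x[c] for c in cols] for x in X]
    p = len(cols) + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(p)]
    try:
        inv = invert(xtx)
    except ValueError:
        return math.inf, [math.inf] * len(cols)
    beta = [sum(inv[i][j] * xty[j] for j in range(p)) for i in range(p)]
    rss = sum((yi - sum(b * v for b, v in zip(beta, r))) ** 2
              for r, yi in zip(rows, y))
    s = math.sqrt(rss / (len(y) - p))
    rel = [s * math.sqrt(max(inv[i][i], 0.0)) / abs(beta[i]) if beta[i] else math.inf
           for i in range(1, p)]
    return s, rel

def replace_position(X, y, cols, pos):
    """Try every descriptor outside the model at position pos; keep the best set."""
    best_cols, best_s = cols, fit(X, y, cols)[0]
    for cand in range(len(X[0])):
        if cand in cols:
            continue
        trial = cols[:pos] + [cand] + cols[pos + 1:]
        s = fit(X, y, trial)[0]
        if s < best_s:
            best_cols, best_s = trial, s
    return best_cols

def run_path(X, y, start_cols, first_pos):
    """Follow one path: replace first_pos, then sweep the other variables in
    order of greatest relative error until the set stops changing."""
    cols = replace_position(X, y, list(start_cols), first_pos)
    previous = first_pos
    while True:
        before = list(cols)
        pending = [i for i in range(len(cols)) if i != previous]
        while pending:
            rel = fit(X, y, cols)[1]
            pos = max(pending, key=lambda i: rel[i])
            cols = replace_position(X, y, cols, pos)
            pending.remove(pos)
            previous = pos
        if cols == before:
            return cols, fit(X, y, cols)[0]

def replacement_method(X, y, d, seed=0):
    """Run all d paths from one random start and keep the model with lowest S."""
    start = random.Random(seed).sample(range(len(X[0])), d)
    return min((run_path(X, y, start, i) for i in range(d)), key=lambda r: r[1])
```

Here `X` is a list of rows (one per compound, one column per descriptor) and `y` the list of activities; `replacement_method(X, y, d)` returns the selected column indices and the corresponding S. Each accepted replacement strictly lowers S, so every path terminates.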

Dataset Partitioning

Several standard techniques for designing a rational partition of a data set are currently available in the literature, namely traditional procedures such as principal component analysis (PCA), discriminant analysis (DA), cluster analysis (CA), the Kennard-Stone method and D-optimal design, which offer different possibilities and objectives. CA is used in many different application fields and is divided into two families of methods: hierarchical and partitional. The latter relocates cases by moving them from one cluster to another, starting from an initial partition, and requires the number of clusters to be defined by the user. Reaching the global optimum of partition-based clustering would require enumerating all possible partitions, so heuristic algorithms are used in practice; the most common one, which uses a squared error criterion, is the k-means algorithm. This method partitions the data into k clusters, where each cluster is represented by its centroid or centre point (the centroid of each cluster is calculated as the mean of all the instances belonging to that cluster). The algorithm is initialized with a set of cluster centroids chosen by a heuristic process. In each iteration, every instance is assigned to its nearest cluster centroid according to the Euclidean distance between them, and then the cluster centroids are recalculated [31, 32].
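The k-means iteration described above can be sketched as follows. This is a generic illustration, not the k-MCA implementation used in the study; the initial centroids are drawn at random from the data (one possible heuristic), the best of several runs is kept by the squared error criterion, and the function and parameter names are hypothetical.

```python
import math
import random

def assign(points, centroids):
    """Label each point with the index of its nearest centroid (Euclidean)."""
    return [min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points]

def kmeans(points, k, runs=10, max_iter=100, seed=0):
    """Return (labels, centroids) of the run with the lowest squared error."""
    rng = random.Random(seed)
    best = None
    for _ in range(runs):
        # Heuristic initialization: k distinct data points as starting centroids.
        centroids = [list(p) for p in rng.sample(points, k)]
        for _ in range(max_iter):
            labels = assign(points, centroids)
            updated = []
            for c in range(k):
                members = [p for p, lab in zip(points, labels) if lab == c]
                if members:
                    # Centroid = mean of all instances in the cluster.
                    updated.append([sum(col) / len(members) for col in zip(*members)])
                else:
                    updated.append(centroids[c])  # keep an empty cluster's centroid
            if updated == centroids:
                break
            centroids = updated
        labels = assign(points, centroids)
        sse = sum(math.dist(p, centroids[lab]) ** 2 for p, lab in zip(points, labels))
        if best is None or sse < best[0]:
            best = (sse, labels, centroids)
    return best[1], best[2]

# Usage on two well-separated toy groups.
toy = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
labels, centroids = kmeans(toy, k=2)
print(labels)
```

Repeating the algorithm from several starting points is a common way to reduce the dependence of k-means on its initialization.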


The Balanced Subsets Method (BSM) procedure

The procedure applied to the median lethal concentration (LC50) data set includes the following steps:

A. Prepare a matrix (C) that includes the experimental property and the 4,885 molecular descriptors selected.

B. Remove the linearly dependent variables from the previous matrix, which reduces its size.

C. Standardize C by centering and scaling its matrix elements, so that its elements can be compared more easily.

D. Create clusters with the 62 compounds through the k-MCA method. For this, the C matrix is used together with the Euclidean metric and 90 runs of the numerical optimization algorithm of k-MCA in order to achieve the best solution. This computes the cluster centroid locations; the number of clusters depends on the number of compounds in the training set and on the number of compounds that have the minimum or maximum values of the experimental property.

E. The training set is designed by including one compound per cluster, which is the compound closest to the centroid in each cluster. It also includes the compounds that have the minimum or maximum values of the experimental property.

F. The validation set is created by including one compound per cluster, which is the compound nearest to the centroid among those remaining in each cluster.

G. Finally, the test set includes the remaining compounds.
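Steps C and E to G can be sketched in Python, taking the cluster labels and centroids of step D as given inputs. This is a minimal sketch under assumptions that the text does not state explicitly: the columns are scaled with the sample standard deviation, the compounds with extreme activity are set aside before the per-cluster selection, and the validation compound is the second closest to the centroid. Function names are hypothetical.

```python
import math

def standardize(matrix):
    """Centre and scale each column to zero mean and unit standard deviation."""
    cols = list(zip(*matrix))
    means = [sum(c) / len(c) for c in cols]
    # A constant column has zero deviation; divide by 1.0 in that case.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / (len(c) - 1)) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in matrix]

def balanced_subsets(rows, labels, centroids, activity):
    """Split compound indices into training, validation and test sets."""
    n = len(rows)
    # Compounds with the minimum and maximum activity always go to training.
    extremes = {min(range(n), key=activity.__getitem__),
                max(range(n), key=activity.__getitem__)}
    train, val = set(extremes), set()
    for c in range(len(centroids)):
        members = sorted(
            (i for i in range(n) if labels[i] == c and i not in extremes),
            key=lambda i: math.dist(rows[i], centroids[c]))
        if members:
            train.add(members[0])   # closest to the centroid
        if len(members) > 1:
            val.add(members[1])     # next closest to the centroid
    test = set(range(n)) - train - val
    return sorted(train), sorted(val), sorted(test)

# Usage on a toy one-descriptor data set with two clusters.
rows = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.3], [9.0]]
labels = [0, 0, 0, 1, 1, 1, 1]
centroids = [[0.12], [5.1]]
activity = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 0.5]
print(balanced_subsets(rows, labels, centroids, activity))
```

With one training and one validation compound per cluster, the sizes of the three subsets follow from the number of clusters chosen in step D.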
