TOXICITY MODELLING OF “EEC PRIORITY LIST 1” COMPOUNDS
INTRODUCTION
Council Directive 76/464/EEC of the European Communities (EEC 1976a) includes the so-called “List 1 compounds”: compounds that are dangerous for aquatic environments, selected mainly on the basis of their toxicity, persistence and bioaccumulation. It is therefore very important to obtain all the information and data relevant to these substances in living aquatic organisms. If no data are available to make an appropriate judgement on a specific substance, the substance is considered a candidate for List 1 until such data become available. For many chemicals there is little reliable information on their relative toxicity, so molecular descriptors and chemometric methods are applied in Quantitative Structure-Activity Relationship (QSAR) studies to predict toxicological data for different aquatic organisms.
All toxicity data are expressed in mmol/l and on a logarithmic scale as log (1/response). The values used for the calculations were selected by Prof. Marco Vighi (Dept. of Environmental Sciences, Milano) from among the most reliable data of all the available sets. The selected data were produced with comparable, officially accepted testing methods (e.g. standard OECD or EEC Guidelines).
Bacteria: 30 min EC50 for the light emitted by a luminescent bacterium (Photobacterium phosphoreum), obtained by a standard automated method (Microtox). Available for 33 homogeneous molecules.
Algae: 96 h EC50 for unicellular chlorophyceans (Selenastrum, Chlorella or comparable species), obtained with standard methods. Available for 45 molecules.
Daphnia: 48 h EC50 obtained with standard methods. Available for 94 molecules.
Fish: 96 h LC50 obtained with standard methods on Oncorhynchus mykiss, Poecilia reticulata or Pimephales promelas. Available for 88 molecules.
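The log (1/response) scale used for the toxicity data can be illustrated with a minimal sketch. It assumes a base-10 logarithm and an input concentration in mg/l that is converted to mmol/l through the molecular weight; the function name and the example numbers are hypothetical and are not taken from the data set.

```python
import math


def log_inverse_response(conc_mg_per_l: float, mol_weight: float) -> float:
    """Return log10(1/C), with C the EC50 or LC50 expressed in mmol/l.

    Assumptions (not stated in the poster): base-10 logarithm and an
    input concentration given in mg/l.
    """
    if conc_mg_per_l <= 0 or mol_weight <= 0:
        raise ValueError("concentration and molecular weight must be positive")
    # mg/l divided by g/mol (= mg/mmol) gives mmol/l
    conc_mmol_per_l = conc_mg_per_l / mol_weight
    return -math.log10(conc_mmol_per_l)


# Hypothetical compound: EC50 = 10 mg/l, MW = 100 g/mol, i.e. 0.1 mmol/l
example = log_inverse_response(10.0, 100.0)
print(f"log(1/EC50) = {example:.2f}")
```

On this scale a larger value means a lower effective concentration, that is, a more toxic compound.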
[Figure: PCA score plots (PC 1 vs. PC 2) of the experimental toxicity data, with compounds labelled by number and PC 1 running from low to high toxicity. Three panels: Daphnia and fish; algae, Daphnia and fish; bacteria, algae, Daphnia and fish. Loadings shown: T. Daphnia, T. Fish, T. Algae, T. Bacteria.]
METHODS
The minimum energy conformations of all the compounds were obtained by the molecular mechanics method of Allinger (MM+), using the HyperChem package. All descriptors were calculated from the resulting coordinates using the WHIM-3D/QSAR package.
Principal Component Analysis (PCA) was performed with STATISTICA.
The best subset of variables for modelling the toxicity was selected by Variable Subset Selection (VSS) with a Genetic Algorithm (GA-VSS), in which the response is modelled by Ordinary Least Squares (OLS) regression, using the Moby Digs package.
All calculations were performed using leave-one-out cross-validation, maximising the cross-validated R squared (Q2) (Quick rule). To avoid overestimating the predictive capability of the models, a leave-more-out procedure (with N cross-validation groups, i.e. 30% of the objects left out at each step) was also performed (Q2LMO). The Standard Deviation Error in Prediction (SDEP) and the Standard Deviation Error in Calculation (SDEC) are also reported, together with the multiple correlation coefficient (R2). For the resulting models, the leverage approach was applied to estimate the reliability of the predicted data, so that only reliable predictions were considered.
MOLECULAR DESCRIPTORS
The molecular structure has been represented by different sets of descriptors: mono-dimensional (count), two-dimensional (graph-invariants) and three-dimensional (3D-WHIM, 3D-Weighted Holistic Invariant Molecular), calculated by the software produced by the Milano Chemometric Research Group of Prof. Roberto Todeschini(1).
Count descriptors (38) directly encode particular features of molecular structure and are obtained simply from the chemical structural formula of the molecules, by counting defined elements such as atoms (nAT), bonds (nBT), rings (nCIC), H-bond acceptors (nHA) and H-bond donors (nHD); atom type counts are also obtained, such as the number of hydrogens, carbons and halogens (nH, nC and nX respectively). The second set consists of the 34 most frequently used graph-invariant descriptors (topological and information indices). The molecular weight (MW) is always used.
For the 3D representation of the molecules, the WHIM descriptors, recently proposed and widely applied by Todeschini and Gramatica(2), have been used: a set consisting of the 33 non-directional WHIM and the 66 directional WHIM. WHIM descriptors are molecular indices that represent different sources of chemical information about the whole 3D molecular structure in terms of size, shape, symmetry and atom distribution. These indices are calculated in a straightforward manner from the (x,y,z)-coordinates of a 3D structure of the molecule, usually a minimum energy conformation, within different weighting schemes, and represent a very general approach to describing molecules in a unitary conceptual framework.
(1) R. Todeschini, WHIM-3D/QSAR - Software for the calculation of the WHIM descriptors, rel. 4.1 for Windows, Talete srl, Milano (Italy) 1996. Download: http://www.disat.unimi.it/chm.
(2) R. Todeschini and P. Gramatica, 3D-modelling and prediction by WHIM descriptors. Part 5. Theory development and chemical meaning of the WHIM descriptors, Quant. Struct.-Act. Relat., 16 (1997) 113-119; Part 6. Applications in QSAR Studies, ibid., 120-125.
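The idea of deriving size and shape indices from weighted atomic coordinates can be sketched as below. This is not the WHIM-3D/QSAR implementation: it assumes the usual published construction (eigenvalues of the weighted covariance matrix of the centred coordinates), covers only the size (T, A) and shape (P, K) indices, and omits the symmetry and atom distribution indices. The function name, the unit default weights and the test geometries are hypothetical.

```python
import numpy as np


def whim_like_indices(coords, weights=None):
    """Size and shape indices from a weighted PCA of atomic coordinates.

    coords  : (n_atoms, 3) array of x, y, z coordinates
    weights : per-atom weights (e.g. atomic masses); unit weights if None
    """
    coords = np.asarray(coords, dtype=float)
    n_atoms = coords.shape[0]
    w = np.ones(n_atoms) if weights is None else np.asarray(weights, dtype=float)

    # Centre the coordinates on the weighted centroid
    centroid = (w[:, None] * coords).sum(axis=0) / w.sum()
    centred = coords - centroid

    # Weighted covariance matrix and its eigenvalues, largest first
    cov = (w[:, None] * centred).T @ centred / w.sum()
    lam = np.clip(np.sort(np.linalg.eigvalsh(cov))[::-1], 0.0, None)
    total = lam.sum()
    if total == 0.0:
        raise ValueError("all atoms coincide; shape indices are undefined")

    return {
        "T": total,                                                 # size
        "A": lam[0] * lam[1] + lam[0] * lam[2] + lam[1] * lam[2],  # size
        "P1": lam[0] / total,                                       # directional shape
        "P2": lam[1] / total,                                       # directional shape
        "K": 0.75 * np.sum(np.abs(lam / total - 1.0 / 3.0)),        # global shape
    }


# Hypothetical geometries: three collinear atoms and an octahedral arrangement
linear = whim_like_indices([[0, 0, 0], [1, 0, 0], [2, 0, 0]])
octahedral = whim_like_indices(
    [[1, 0, 0], [-1, 0, 0], [0, 1, 0], [0, -1, 0], [0, 0, 1], [0, 0, -1]]
)
print(round(float(linear["K"]), 3), round(float(octahedral["K"]), 3))
```

In this sketch the global shape index ranges from 0 for an arrangement with equal spread along all three axes to 1 for a perfectly linear one, and changing the weights changes the indices without changing the procedure.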
[Figure: plots of calculated vs. experimental toxicity (axes: Exp. toxicity, Calc. toxicity) for the four models, with compounds labelled by number.]
F. Consolaro and P. Gramatica QSAR Research Unit, Dept. of Structural and Functional Biology, University of Insubria, Varese, Italy.
E-mail: [email protected] Web-site: http://andromeda.varbio.unimi.it/~QSAR/
EXPERIMENTAL DATA
Uniform dimension Principal Component Analyses were performed on all the experimental toxicity data with the aim of highlighting the distribution of the studied compounds. Along the first principal component the compounds are well separated by their global toxicity, while along the second principal component they are separated by their specific toxicity.
Toxicity in Bacteria (33 objects)
Tox = 6.83 + 1.21 nBO - 0.32 nO - 4.84 WIA - 2.90 P2s - 7.70 Ke
R2 = 89.8%   Q2LOO = 86.1%   Q2LMO = 82.0%
SDEP = 0.26   SDEC = 0.22   F(5,27) = 47.65   S = 0.23
nBO: number of skeleton bonds; nO: number of oxygen atoms; WIA: average Wiener index; P2s: shape directional WHIM descriptor; Ke: shape global WHIM descriptor
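Applying a fitted model of this kind is a plain linear combination of descriptor values. The sketch below uses the coefficients of the bacteria model as printed; the dictionary layout, the function name and the descriptor values in the example are hypothetical and do not describe any compound of the data set.

```python
# Coefficients of the bacteria model as printed on the poster
BACTERIA_MODEL = {
    "intercept": 6.83,
    "nBO": 1.21,
    "nO": -0.32,
    "WIA": -4.84,
    "P2s": -2.90,
    "Ke": -7.70,
}


def predict_toxicity(model, descriptors):
    """Evaluate Tox = intercept + sum(coefficient * descriptor value)."""
    names = [name for name in model if name != "intercept"]
    missing = [name for name in names if name not in descriptors]
    if missing:
        raise KeyError(f"missing descriptor values: {missing}")
    return model["intercept"] + sum(model[name] * descriptors[name] for name in names)


# Hypothetical descriptor values, for illustration only
hypothetical = {"nBO": 7, "nO": 1, "WIA": 1.5, "P2s": 0.3, "Ke": 0.6}
print(f"predicted log(1/EC50) = {predict_toxicity(BACTERIA_MODEL, hypothetical):.2f}")
```

The same function would serve the other three models once their coefficients are entered in the same layout.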
Toxicity in Algae (45 objects)
Tox = - 0.21 - 0.66 nS - 0.63 nOH - 0.53 nNH2 - 4.12 P1s + 1.02 Tm - 0.23 As
R2 = 70.6%   Q2LOO = 61.5%   Q2LMO = 58.1%
SDEP = 0.56   SDEC = 0.49   F(6,38) = 15.20   S = 0.52
nS: number of sulphur atoms; nOH: number of OH groups; nNH2: number of NH2 groups; P1s: shape directional WHIM descriptor; Tm and As: dimensional global WHIM descriptors
Toxicity in Daphnia (94 objects)
Tox = - 3.57 + 4.05 nP - 0.39 nHA + 1.02 IDM + 0.67 E1m
R2 = 84.2%   Q2LOO = 82.1%   Q2LMO = 81.7%
SDEP = 0.68   SDEC = 0.64   F(4,89) = 118.66   S = 0.65
nP: number of phosphorus atoms; nHA: number of H-bond acceptors; IDM: mean information content on the distance magnitude; E1m: atom distribution directional WHIM descriptor
Toxicity in Fish (88 objects)
Tox = - 2.29 - 0.66 nNO - 0.91 nHD + 0.94 IDM - 10.39 Du + 7.39 De + 2.01 Ds
R2 = 81.5%   Q2LOO = 78.1%   Q2LMO = 77.8%
SDEP = 0.58   SDEC = 0.53   F(6,81) = 59.55   S = 0.55
nNO: number of NO groups; nHD: number of H-bond donors; IDM: mean information content on the distance magnitude; Du, De and Ds: atom distribution global WHIM descriptors
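The reported F values can be checked against the reported R2 values with the standard relation for an OLS model with k descriptors and n objects, F = (R2/k) / ((1 - R2)/(n - k - 1)). The sketch below applies it to the four models using only numbers printed above; the function and variable names are hypothetical.

```python
def f_from_r2(r2, n_objects, n_descriptors):
    """F statistic of an OLS model with intercept, computed from R2."""
    df_model = n_descriptors
    df_residual = n_objects - n_descriptors - 1
    return (r2 / df_model) / ((1.0 - r2) / df_residual)


# (R2, objects, descriptors, reported F) as printed for each model
REPORTED = {
    "Bacteria": (0.898, 33, 5, 47.65),
    "Algae": (0.706, 45, 6, 15.20),
    "Daphnia": (0.842, 94, 4, 118.66),
    "Fish": (0.815, 88, 6, 59.55),
}

for organism, (r2, n, k, f_reported) in REPORTED.items():
    f_calc = f_from_r2(r2, n, k)
    print(f"{organism}: F({k},{n - k - 1}) recomputed {f_calc:.2f}, reported {f_reported:.2f}")
```

The recomputed values agree with the reported ones to within about 1%, the residual difference being compatible with the rounding of R2 to one decimal place.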
PCA on toxicity of bacteria, algae, Daphnia and fish. Training set: 15 molecules; cumulative explained variance (Cum. E.V.) = 86.6% (PC1 = 72.8%)
PCA on toxicity of algae, Daphnia and fish. Training set: 37 molecules; Cum. E.V. = 90.0% (PC1 = 69.0%)
PCA on toxicity of Daphnia and fish. Training set: 79 molecules; Cum. E.V. = 100% (PC1 = 86.5%)
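How the explained variance figures arise can be sketched with a PCA computed by singular value decomposition. The sketch assumes autoscaled (mean-centred, unit-variance) toxicity columns, which the poster does not state, and uses synthetic data; it does not reproduce the STATISTICA analysis, and the function name is hypothetical.

```python
import numpy as np


def pca_explained_variance(data):
    """Return explained variance ratios, scores and loadings of autoscaled data."""
    data = np.asarray(data, dtype=float)
    # Autoscaling (assumed): centre each column and scale to unit variance
    z = (data - data.mean(axis=0)) / data.std(axis=0, ddof=1)
    _, singular_values, vt = np.linalg.svd(z, full_matrices=False)
    variances = singular_values ** 2 / (z.shape[0] - 1)
    scores = z @ vt.T  # object coordinates on the principal components
    return variances / variances.sum(), scores, vt


# Synthetic two-endpoint example (rows: compounds; columns: two toxicity endpoints)
rng_pca = np.random.default_rng(0)
common = rng_pca.normal(size=40)  # shared "global toxicity" factor
tox = np.column_stack([
    common + 0.5 * rng_pca.normal(size=40),
    common + 0.5 * rng_pca.normal(size=40),
])
ratios, scores, loadings = pca_explained_variance(tox)
print("explained variance ratios:", np.round(ratios, 3))
```

With only two endpoints the two components necessarily account for all of the variance, which matches the 100% cumulative value reported for the Daphnia and fish analysis. Under the autoscaling assumption the PC1 share equals (1 + |r|)/2, where r is the correlation between the two endpoints, so a PC1 share of 86.5% would correspond to |r| of about 0.73.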
CONCLUSIONS
The procedures used have confirmed the quite satisfactory predictive capability of the models obtained. The role of the descriptors in predicting the toxic effects can be explained, though a few uncertainties remain. Count descriptors play an important role in all models because of their ability to explain particular features of some groups of chemicals; the shape (P, K) and the density factors (E, D) are also determinant in predicting the toxicity of the studied compounds.
Using the reliable predicted data, it was possible to add many toxicological data to the available experimental values. The graphics below and the annexed table report all the available experimental data and, in addition, the values predicted by our models (pink data).
[Figure: PCA score plots (PC 1 vs. PC 2, with PC 1 running from low to high toxicity) combining experimental and predicted toxicity data, with compounds labelled by number and experimental and predicted values distinguished in the legend. Three panels: Daphnia and fish; algae, Daphnia and fish; bacteria, algae, Daphnia and fish. Loadings shown: T. Daphnia, T. Fish, T. Algae, T. Bacteria.]
PCA on toxicity of bacteria, algae, Daphnia and fish. Total number of molecules: 54; Cum. E.V. = 81.5% (PC1 = 58.2%)
PCA on toxicity of algae, Daphnia and fish. Total number of molecules: 97; Cum. E.V. = 93.7% (PC1 = 77.0%)
PCA on toxicity of Daphnia and fish. Total number of molecules: 125; Cum. E.V. = 100% (PC1 = 88.3%)