carcinogens - caesar




Carcinogenicity prediction for Regulatory Use

Natalja Fjodorova, Marjana Novič, Marjan Vračko, Marjan Tušar

National Institute of Chemistry, Ljubljana, Slovenia





Kemijske Dnevi, 25-27 September 2008



UNIVERZA MARIBOR

Overview


1. The EU project CAESAR, aimed at developing QSAR models for predicting toxicological properties of substances for regulatory purposes.

2. The principles of validation of QSARs that will be used for chemical regulation.

3. Carcinogenicity models using a Counter Propagation Artificial Neural Network.





It is estimated that over 30,000 industrial chemicals used in Europe require additional safety testing to meet the requirements of the new chemical regulation, REACH.

If conducted on animals, this testing would require an extra 10-20 million animal experiments.

Quantitative Structure-Activity Relationships (QSAR) are one major prospect among the alternative testing methods to be used in a regulatory context.






FP6 CAESAR European Project

Computer Assisted Evaluation of industrial chemical Substances According to Regulations

CAESAR aimed to develop (Q)SARs as non-animal alternative tools for the assessment of chemical toxicity under REACH.

Coordinator: Emilio Benfenati, Istituto di Ricerche Farmacologiche “Mario Negri”


The general aims of CAESAR are:

1. To produce QSAR models for toxicity prediction of chemical substances, to be used for regulatory purposes under REACH in a transparent manner, by applying new and unique modelling and validation methods.

2. To reduce animal testing and its associated costs, in accordance with Council Directive 86/609/EEC and the Cosmetics Directive (Council Directive 2003/15/EC).


CAESAR addresses several problems:

Ethical - saving animal lives;

Economic - reducing the cost of testing;

Political - implementation of REACH, the new chemical legislation.



CAESAR aimed to develop new (Q)SAR models for 5 endpoints:

Bioaccumulation (BCF)

Skin sensitisation

Mutagenicity

Carcinogenicity

Teratogenicity




The characterization of the QSAR models follows the general scheme of the 5 OECD principles:

1. A defined endpoint

2. An unambiguous algorithm

3. A defined domain of applicability

4. Appropriate measures of goodness-of-fit, robustness and predictivity

5. A mechanistic interpretation, if possible.





Principle 1 - A defined endpoint

An endpoint is the property or biological activity determined in an experimental protocol (OECD Test Guideline).

Carcinogenicity is a defined endpoint addressed by an officially recognized test method (Method B.32, Carcinogenicity test, Annex V to Directive 67/548/EEC).



Principle 2 - An unambiguous algorithm

The algorithm is the form of the relationship between chemical structure and the property or biological activity being modelled.

Examples:

1. Statistically (regression) based QSARs

2. Neural network models, which include both a learning process and a prediction process.






Transparency in the (Q)SAR algorithm can be provided by means of the following information:

a) Definition of the mathematical form of a QSAR model, or of the decision rule (e.g. in the case of a SAR)

b) Definitions of all descriptors in the algorithm, and a description of their derivation

c) Details of the training set used to develop the algorithm.

Principle 3 - A defined domain of applicability

The definition of the Applicability Domain (AD) is based on the assumption that a model is capable of making reliable predictions only within the structural, physicochemical and response space that is known from its training set.

List of basic structures (for example, aniline, fluorene)

The range of chemical descriptor values.
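The descriptor-range part of the applicability domain can be illustrated with a minimal sketch. It assumes the simplest possible definition (a query compound is inside the domain when each of its descriptors lies between the minimum and maximum seen in the training set); the function names and the three-descriptor toy data are hypothetical and are not taken from the CAESAR models.

```python
def descriptor_ranges(training_rows):
    """Return the (min, max) of every descriptor column in the training set."""
    columns = list(zip(*training_rows))
    return [(min(col), max(col)) for col in columns]


def in_domain(query, ranges):
    """True when every descriptor of `query` lies inside the training range."""
    return all(lo <= value <= hi for value, (lo, hi) in zip(query, ranges))


# Hypothetical training set: three compounds, three descriptors each.
training = [
    [1.2, 30.0, 0.5],
    [2.8, 45.0, 0.1],
    [2.0, 38.5, 0.9],
]
ranges = descriptor_ranges(training)

inside = in_domain([2.5, 40.0, 0.3], ranges)   # all values within range
outside = in_domain([3.5, 40.0, 0.3], ranges)  # first descriptor too large
print(inside, outside)  # True False
```

A range check of this kind only covers the descriptor space; the structural part of the domain (the list of basic structures) would need a separate substructure comparison.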




Principle 4 - Appropriate measures of goodness-of-fit, robustness and predictivity

The assessment of model performance is sometimes called statistical validation. It covers:

goodness-of-fit

robustness (internal performance)

predictivity (external performance)

Principle 5 - A mechanistic interpretation, if possible

Mechanistic interpretation of a (Q)SAR provides a ground for interaction and dialogue between model developers, toxicologists and regulators, and permits the integration of the (Q)SAR results into a wider regulatory framework, where different types of evidence and data concur with or complement each other as a basis for making decisions and taking actions.

Example: enhancement or inhibition of the metabolic activation of substances may be discussed.


The National Institute of Chemistry in Ljubljana (NIC-LJU) is responsible for the development of models for the prediction of carcinogenicity.



DATA ON CARCINOGENICITY


1. Studies of carcinogenicity in humans

2. Carcinogenicity studies in animals

3. Other relevant data: additional evidence related to the possible carcinogenicity

Genetic Toxicology

Structure-Activity Comparisons


Pharmacokinetics and Metabolism


Pathology


Cancer Risk Assessment


IARC: International Agency for Research on Cancer

| Group | IARC classification | Explanation | Classification for animals |
|---|---|---|---|
| Group A | Human Carcinogen | sufficient human evidence for causal association between exposure and cancer | |
| Group B1 | Probable Human Carcinogen | limited evidence in humans | |
| Group B2 | Probable Human Carcinogen | inadequate evidence in humans and sufficient evidence in animals | clear evidence |
| Group C | Possible Human Carcinogen | limited evidence in animals | some evidence |
| Group D | Not Classifiable as to Human Carcinogenicity | inadequate evidence in animals | equivocal |
| Group E | No Evidence of Carcinogenicity in Humans | at least two adequate animal tests, or both negative epidemiology and animal studies | no evidence |

Predictive Toxicology Approaches

1. Quantitative models (QSARs): continuous data prediction on the basis of experimental evidence of rodent carcinogenic potential (TD50, tumorigenic dose).

2. Categorical models based on YES/NO data (P - positive; NP - not positive).

Dataset:

805 chemicals were filtered from 1481 compounds taken from the Distributed Structure-Searchable Toxicity (DSSTox) Public Database Network, http://www.epa.gov/ncct/dsstox/sdf_cpdbas.html, which was derived from the Lois Gold Carcinogenic Potency Database (CPDBAS).

The chemicals involved in the study belong to different chemical classes (noncongeneric substances).

Descriptors:

1. 252 MDL descriptors were calculated in the program MDL QSAR.

2. The descriptor dataset was reduced to 27 MDL descriptors using a Kohonen map and Principal Component Analysis.


Counter Propagation Artificial Neural Network

Step 1: mapping of the molecule Xs (vector representing structure) into the Kohonen layer

Step 2: correction of weights in both the Kohonen and the Output layer

Step 3: prediction of the four-dimensional target (toxicity) Ts
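The three steps can be sketched in a few lines of NumPy. This is an illustrative toy, not the network used in the study: the class name, the 5x5 grid, the triangular neighbourhood function, the linear decay of learning rate and radius, and the two-cluster toy data with a one-dimensional target are all assumptions made for the example.

```python
import numpy as np


class CounterPropagationANN:
    """Minimal counter-propagation network: a Kohonen layer plus an output layer."""

    def __init__(self, grid=(5, 5), n_inputs=2, n_targets=1, seed=0):
        rng = np.random.default_rng(seed)
        self.rows, self.cols = grid
        n_neurons = self.rows * self.cols
        self.kohonen = rng.random((n_neurons, n_inputs))   # structure weights
        self.output = rng.random((n_neurons, n_targets))   # target weights
        # Grid coordinates of every neuron, used for neighbourhood distances.
        self.coords = np.array(
            [(r, c) for r in range(self.rows) for c in range(self.cols)]
        )

    def winner(self, x):
        """Step 1: index of the Kohonen neuron closest to the input vector."""
        return int(np.argmin(np.sum((self.kohonen - x) ** 2, axis=1)))

    def fit(self, X, T, epochs=50, eta_start=0.5, eta_end=0.01):
        X = np.asarray(X, dtype=float)
        T = np.asarray(T, dtype=float)
        max_radius = max(self.rows, self.cols)
        for epoch in range(epochs):
            frac = epoch / max(epochs - 1, 1)
            eta = eta_start + (eta_end - eta_start) * frac  # learning rate decays
            radius = max_radius * (1.0 - frac)              # neighbourhood shrinks
            for x, t in zip(X, T):
                w = self.winner(x)
                # Chebyshev distance of each neuron to the winner on the grid.
                d = np.max(np.abs(self.coords - self.coords[w]), axis=1)
                a = np.maximum(0.0, 1.0 - d / (radius + 1.0))[:, None]
                # Step 2: correct weights in both layers with the same factor.
                self.kohonen += eta * a * (x - self.kohonen)
                self.output += eta * a * (t - self.output)
        return self

    def predict(self, X):
        """Step 3: read the output-layer weights at the winning neuron."""
        X = np.asarray(X, dtype=float)
        return np.array([self.output[self.winner(x)] for x in X])


# Hypothetical toy data: two well-separated clusters with targets 0 and 1.
X = np.array([[0.0, 0.0], [0.1, 0.0], [0.0, 0.1],
              [1.0, 1.0], [0.9, 1.0], [1.0, 0.9]])
T = np.array([[0.0], [0.0], [0.0], [1.0], [1.0], [1.0]])

model = CounterPropagationANN(grid=(5, 5), n_inputs=2, n_targets=1).fit(X, T)
pred = model.predict(X)
print(np.round(pred, 2).ravel())
```

Setting `n_targets=4` would give a four-dimensional target as on the slide; the update rule is unchanged because the output layer is corrected with the same neighbourhood factor as the Kohonen layer.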

Investigation of quantitative models shows poor results

Response: TD50 (mmol)

The correlation coefficient in the external validation is lower than 0.5.

Continuous data models (quantitative models)

| Model | Descriptor reduction method | R_train | RMSE (training) | R_test | RMSE (test) |
|---|---|---|---|---|---|
| CP ANN_model, 250 MDL descriptors | | 0.74 | 1.51 | 0.47 | 1.78 |
| CP ANN_model, 86 MDL descriptors | Kohonen map | 0.72 | 1.54 | 0.42 | 1.90 |
| CP ANN_model, 27 MDL descriptors | PCA | 0.74 | 1.52 | 0.45 | 1.80 |
| SVM_model (Thomas Ferrary), 86 MDL descriptors | | 0.82 | 1.23 | 0.47 | 1.81 |
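For readers who want to reproduce the two statistics in the table, here is a minimal sketch. It assumes that R is the Pearson correlation coefficient between experimental and predicted values and that RMSE is the root mean squared error; the slides do not state the formulas, and the four data points are hypothetical.

```python
import math


def pearson_r(y_true, y_pred):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((a - mean_t) * (b - mean_p) for a, b in zip(y_true, y_pred))
    var_t = sum((a - mean_t) ** 2 for a in y_true)
    var_p = sum((b - mean_p) ** 2 for b in y_pred)
    return cov / math.sqrt(var_t * var_p)


def rmse(y_true, y_pred):
    """Root mean squared error of the predictions."""
    n = len(y_true)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / n)


# Hypothetical experimental and predicted values.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 1.5, 3.5, 3.5]

r_value = pearson_r(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
print(f"R = {r_value:.2f}, RMSE = {rmse_value:.2f}")  # R = 0.89, RMSE = 0.50
```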


Investigation of categorical models shows satisfactory results

YES/NO principle

RESPONSE:

P - positive - active

NP - not positive - inactive

Characteristics used for validation of the categorical model

true positive (TP), true negative (TN), false positive (FP), false negative (FN)

Accuracy (AC): AC = (TN+TP)/(TN+TP+FN+FP)

TP rate = Sensitivity (SE) = TP/(TP+FN)

TN rate = Specificity (SP) = TN/(TN+FP)

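Categorical model for the dataset of 805 chemicals (Training=644 and Test=161), using 27 MDL descriptors

| Model | Training ACC, % | Training SE, % | Training SP, % | Test ACC, % | Test SE, % | Test SP, % |
|---|---|---|---|---|---|---|
| Model_1 | 88 | 90 | 86 | 68 | 69 | 67 |
| Model_2 | 92 | 99 | 85 | 68 | 73 | 63 |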




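Confusion matrix for the training set TR (644) and the test set TE (161), classes Positive/Negative. Values are given as TR (TE).

| Class | Positive (predicted) | Negative (predicted) | Number TR (TE) |
|---|---|---|---|
| Positive (experimental) | TP: 329 (65) | FN: 3 (24) | 332 (89) |
| Negative (experimental) | FP: 47 (27) | TN: 265 (45) | 312 (72) |
| Total | | | 644 (161) |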
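As a worked check, the accuracy, sensitivity and specificity formulas given earlier can be applied to the confusion-matrix counts. The sketch below is only an illustration; the function name is hypothetical, while the counts are the ones in the matrix.

```python
def classification_metrics(tp, fn, fp, tn):
    """Return (accuracy, sensitivity, specificity) from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    sensitivity = tp / (tp + fn)   # TP rate
    specificity = tn / (tn + fp)   # TN rate
    return accuracy, sensitivity, specificity


# Counts taken from the confusion matrix: training set, then test set.
train = classification_metrics(tp=329, fn=3, fp=47, tn=265)
test = classification_metrics(tp=65, fn=24, fp=27, tn=45)

for name, (acc, se, sp) in (("training", train), ("test", test)):
    print(f"{name}: ACC={acc:.1%} SE={se:.1%} SP={sp:.1%}")
# training: ACC=92.2% SE=99.1% SP=84.9%
# test: ACC=68.3% SE=73.0% SP=62.5%
```

Up to rounding, these values agree with the Model_2 figures reported for the categorical model (92/99/85 for training and 68/73/63 for test).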

How we find the optimal model using a threshold

Selected threshold = 0.45: Accuracy = 0.68, SE = 0.73, SP = 0.63

Changing the threshold allows us to obtain models with different statistical performance.

| Threshold | SE | SP | ACC |
|---|---|---|---|
| 0.05 | 0.91 | 0.15 | 0.57 |
| 0.1 | 0.83 | 0.36 | 0.62 |
| 0.15 | 0.8 | 0.47 | 0.65 |
| 0.2 | 0.79 | 0.47 | 0.65 |
| 0.25 | 0.79 | 0.47 | 0.65 |
| 0.3 | 0.79 | 0.53 | 0.67 |
| 0.35 | 0.78 | 0.57 | 0.68 |
| 0.4 | 0.73 | 0.6 | 0.67 |
| 0.45 | 0.73 | 0.63 | 0.68 |
| 0.5 | 0.65 | 0.63 | 0.64 |
| 0.55 | 0.62 | 0.72 | 0.66 |
| 0.6 | 0.62 | 0.74 | 0.67 |
| 0.65 | 0.6 | 0.76 | 0.67 |
| 0.7 | 0.58 | 0.76 | 0.66 |
| 0.75 | 0.54 | 0.78 | 0.65 |
| 0.8 | 0.52 | 0.79 | 0.64 |
| 0.85 | 0.45 | 0.83 | 0.62 |
| 0.9 | 0.31 | 0.89 | 0.57 |
| 0.95 | 0.24 | 0.93 | 0.55 |
| 1 | 0 | 1 | 0.45 |
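How a threshold turns a continuous model output into a YES/NO prediction can be sketched as follows. The decision rule (positive when the output is strictly greater than the threshold), the function name and the eight toy scores are assumptions made for the illustration; they are not the outputs of the CAESAR model.

```python
def metrics_at_threshold(scores, labels, threshold):
    """Classify as positive when score > threshold; return (SE, SP, ACC)."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s > threshold and y == 1)
    fn = sum(1 for s, y in pairs if s <= threshold and y == 1)
    fp = sum(1 for s, y in pairs if s > threshold and y == 0)
    tn = sum(1 for s, y in pairs if s <= threshold and y == 0)
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


# Hypothetical model outputs and experimental classes (1 = P, 0 = NP).
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

sweep = {t: metrics_at_threshold(scores, labels, t) for t in (0.25, 0.5, 0.75)}
for t, (se, sp, acc) in sweep.items():
    print(f"threshold={t:.2f} SE={se:.2f} SP={sp:.2f} ACC={acc:.2f}")
# threshold=0.25 SE=1.00 SP=0.50 ACC=0.75
# threshold=0.50 SE=0.75 SP=0.75 ACC=0.75
# threshold=0.75 SE=0.50 SP=1.00 ACC=0.75
```

Lowering the threshold raises sensitivity at the expense of specificity, which is the trade-off visible in the table.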

ROC (receiver operating characteristic) curves for the training set and the test set

The area under the curve is 0.988 and 0.699 in the training and test sets, respectively.
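The test-set area can be approximated from the threshold table with the trapezoidal rule. The sketch below assumes that the table describes the test set (its 0.45 row matches the test-set statistics) and adds the end point (1, 1) for a threshold of zero; because the tabulated SE and SP values are rounded, the result is only an approximation.

```python
# (threshold, SE, SP) rows copied from the threshold table.
rows = [
    (0.05, 0.91, 0.15), (0.10, 0.83, 0.36), (0.15, 0.80, 0.47),
    (0.20, 0.79, 0.47), (0.25, 0.79, 0.47), (0.30, 0.79, 0.53),
    (0.35, 0.78, 0.57), (0.40, 0.73, 0.60), (0.45, 0.73, 0.63),
    (0.50, 0.65, 0.63), (0.55, 0.62, 0.72), (0.60, 0.62, 0.74),
    (0.65, 0.60, 0.76), (0.70, 0.58, 0.76), (0.75, 0.54, 0.78),
    (0.80, 0.52, 0.79), (0.85, 0.45, 0.83), (0.90, 0.31, 0.89),
    (0.95, 0.24, 0.93), (1.00, 0.00, 1.00),
]

# ROC points are (false positive rate, true positive rate) = (1 - SP, SE).
# The point (1, 1) is an assumed end point: at threshold 0 everything is positive.
points = sorted({(round(1 - sp, 2), se) for _, se, sp in rows} | {(1.0, 1.0)})

# Trapezoidal rule over consecutive ROC points.
auc = sum(
    (x1 - x0) * (y0 + y1) / 2
    for (x0, y0), (x1, y1) in zip(points, points[1:])
)
print(f"approximate AUC = {auc:.2f}")  # approximate AUC = 0.70
```

The value obtained this way is close to the 0.699 reported for the test set.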

How the requirements of REACH are reflected in the development of models

Focus the model on high sensitivity in the prediction of carcinogenicity.

From a regulatory perspective, higher sensitivity in predicting carcinogens is more desirable than high specificity.

Sensitivity - percentage of correct predictions of carcinogens

Specificity - percentage of correct predictions of non-carcinogens

Conclusion


1. We have built the carcinogenicity models in accordance with the 5 OECD principles of validation.

2. We have obtained satisfactory results for the categorical models, with an accuracy of 68%, which is good for carcinogenicity as it meets the level of uncertainty of the test data.

3. Our future investigation will be dedicated to the relationship between the results of carcinogenicity tests and the presence of genotoxic and non-genotoxic alerts, using the Toxtree program.

Acknowledgements


The financial support of the European Union through the CAESAR project (SSPI-022674) as well as of the Slovenian Ministry of Higher Education, Science and Technology (grant P1-017) is gratefully acknowledged.





THANK YOU