Discriminating between Drugs and Nondrugs by Prediction of Activity Spectra for Substances (PASS)
Soheila Anzali, Gerhard Barnickel, Bertram Cezanne, Michael Krug, Dmitrii Filimonov, and Vladimir Poroikov (collaboration between Merck and an academic institution)
Max Shneider – Case study
Overview
Goal – better drug/nondrug classification at the beginning of the drug discovery process (ADMET)
Method – used PASS, a computer system that predicts more than 500 biological activities using regression
PASS has a mean prediction accuracy of about 86%
2D compound representation – includes information on each atom and its neighbors
Training set – 5,000 drugs from WDI database and 5,000 nondrugs from ACD database
Test set filtering – removed items that were already in training set, had errors in structural formulas, etc.
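The idea of a 2D representation that records each atom together with its neighbors can be illustrated with a small sketch. This is not the descriptor format PASS actually uses; the function name `neighborhood_descriptors`, the string layout, and the atom/bond input format are assumptions made only for illustration.

```python
from collections import defaultdict

def neighborhood_descriptors(atoms, bonds):
    """Return one string per atom: its element followed by the
    sorted elements of the atoms it is bonded to.

    atoms: list of element symbols, indexed by atom number
    bonds: list of (i, j) index pairs
    (Hypothetical input format, not the PASS file format.)
    """
    neighbors = defaultdict(list)
    for a, b in bonds:
        neighbors[a].append(atoms[b])
        neighbors[b].append(atoms[a])
    return [
        f"{element}({','.join(sorted(neighbors[i]))})"
        for i, element in enumerate(atoms)
    ]

# Heavy atoms of ethanol: C-C-O
ethanol_atoms = ["C", "C", "O"]
ethanol_bonds = [(0, 1), (1, 2)]
descriptors = neighborhood_descriptors(ethanol_atoms, ethanol_bonds)
print(descriptors)  # ['C(C)', 'C(C,O)', 'O(C)']
```

A compound is then described by the set of such strings, which a classifier can treat as presence/absence features.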
Results
Leave-one-out (LOO) cross-validation – mean prediction accuracy of 79.9%
PASS vs Drugs – 864 launched and registered compounds from the Cipsline database; predicted 78.5% drugs, 21.5% nondrugs
PASS vs Nondrugs – 9,484 compounds with reactive groups, low molecular weight, etc.; predicted 83.8% nondrugs, 16.2% drugs
PASS vs TOP-100 Drugs – 88 compounds from the top-100 prescription pharmaceuticals list; predicted 87.5% drugs, 12.5% nondrugs
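For readers unfamiliar with leave-one-out cross-validation, the following is a minimal sketch of the procedure: each compound is held out once, the model is trained on the rest, and the held-out compound is predicted. The nearest-neighbor classifier and the two-feature points are stand-ins chosen for brevity; they are assumptions, not the PASS algorithm or data from the study.

```python
def nearest_neighbor_predict(train_x, train_y, query):
    """Toy classifier (assumption, not PASS): return the label of the
    closest training point by squared Euclidean distance."""
    def dist(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    best = min(range(len(train_x)), key=lambda i: dist(train_x[i], query))
    return train_y[best]

def leave_one_out_accuracy(xs, ys, predict):
    """Hold out each sample once, train on the remainder, and return
    the fraction of held-out samples predicted correctly."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)

# Made-up feature vectors, not data from the study.
xs = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (1.0, 0.9), (0.9, 1.1), (0.4, 0.5)]
ys = ["nondrug", "nondrug", "nondrug", "drug", "drug", "drug"]

accuracy = leave_one_out_accuracy(xs, ys, nearest_neighbor_predict)
print(f"LOO accuracy: {accuracy:.3f}")
```

In this toy data the last point lies closer to the nondrug cluster than to the other drugs, so it is misclassified when held out. That is the kind of error a LOO estimate is meant to expose.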
Evaluating PASS with Cleaned Training Set
Used the filtered “Drugs” and “Nondrugs” test sets from above as training sets instead of WDI and ACD
LOO cross-validation – mean prediction accuracy of 89.9%
vs TOP-100 Drugs – predicted 94.5% drugs, 4.5% nondrugs
Discussion
Chemical descriptors and algorithms in PASS provide highly robust structure-activity relationships and reliable predictions
PASS is in good accordance with other approaches (Sadowski and Kubinyi, Ajay)
PASS is relatively successful on new compounds that have nontraditional structures and/or belong to new chemical classes
Computation is fast – one compound can be predicted in 4 ms on a 300 MHz computer
Using PASS out of the box gives good results, but better discrimination might be possible with more specific drug information
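To put the quoted speed of 4 ms per compound in perspective, the arithmetic below converts it into throughput and total screening time. The helper name `screening_seconds` is hypothetical, and the only figures taken from the slides are the 4 ms timing and the compound counts; the estimate assumes the per-compound time stays constant.

```python
MS_PER_COMPOUND = 4  # from the slides: 4 ms per compound on a 300 MHz computer

def screening_seconds(n_compounds, ms_per_compound=MS_PER_COMPOUND):
    """Estimated wall-clock seconds to predict n_compounds,
    assuming a constant per-compound time."""
    return n_compounds * ms_per_compound / 1000

compounds_per_second = 1000 / MS_PER_COMPOUND

print(compounds_per_second)       # 250.0 compounds per second
print(screening_seconds(10_000))  # 40.0 s for a set the size of the training data
print(screening_seconds(9_484))   # about 37.9 s for the nondrug test set
```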