Convolutional Neural Networks for Protein-Ligand Scoring


Matt Ragoza 1,2, Elisa Idrobo 3,6, Joshua Hochuli 2,4, Jocelyn Sunseri 5, and David Koes 5

1 Department of Neuroscience, 2 Department of Computer Science, 3 TECBio REU @ Pitt, 4 Department of Biological Sciences, 5 Department of Computational and Systems Biology, University of Pittsburgh, Pittsburgh, PA 15260; 6 The College of New Jersey, Ewing, NJ 08618

Source: bits.csb.pitt.edu/files/CNN_ACS_Philly.pdf


Conclusion

We have shown that a convolutional neural network with a well-optimized model and an appropriate training dataset has great potential for aiding drug discovery. Creating variety in the poses through rotation, translation, and shuffling during training is important for training the model. Other parameters, such as network depth and width, can also reduce overfitting. Visualizations highlight the pose sensitivity of the CNN model and can emphasize regions of interest in the protein-ligand complex.

Our best model performs better than AutoDock Vina at pose selection when evaluated for pose prediction performance (CSAR) and virtual screening performance (DUD-E), although the nature of the training data greatly influences the result.


Abstract

Computational approaches to drug discovery reduce the time and cost associated with experimental assays and enable the screening of novel chemotypes. Structure-based drug design methods rely on scoring functions to rank and predict binding affinities and poses. The ever-expanding amount of protein-ligand binding and structural data enables deep machine learning techniques for protein-ligand scoring.

We describe a convolutional neural network (CNN) scoring function that takes as input a comprehensive 3D representation of a protein-ligand interaction. A CNN scoring function automatically learns the key features of protein-ligand interactions that determine binding. We train and optimize our CNN scoring functions to discriminate between correct and incorrect binding poses and known binders and nonbinders. We find that our CNN scoring function outperforms the AutoDock Vina scoring function when ranking poses both for pose prediction and virtual screening.

Background

Protein-ligand scoring provides a metric of binding strength between small molecules and target proteins. This has a wide array of uses, such as virtual screening, which filters large databases of candidate molecules for potential hits, and docking, which predicts the binding pose of a ligand.

Machine learning strategies have been used to treat scoring as a classification problem between “good” and “bad” binding states, though this often requires manually selecting the properties that the model uses for discrimination, for example pairwise interactions and global counts of typical chemical interactions. However, other machine learning models can learn the most important features directly from data.

Neural networks are supervised machine learning algorithms inspired by the nervous system. A basic network consists of an input layer, one or more hidden layers, and an output layer of interconnected nodes. Each hidden node computes a feature that is a function of the weighted input it receives from the nodes of the previous layer.

Input data are fed forward through the network, and a prediction is output by the last layer. A neural network is trained by iteratively updating its weights by minimization of an objective function, for example, the mean squared deviation between predictions and their ground truth labels.
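As a toy illustration of this training loop (a minimal sketch on synthetic data, not the poster's model or framework), the following fits a single linear node by gradient descent on a mean-squared-error objective:

```python
# Minimal sketch: iterative weight updates that minimize the mean squared
# deviation between predictions and ground-truth labels (synthetic data).
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 4))      # 8 training examples, 4 input features
y = rng.normal(size=(8, 1))      # ground-truth labels
W = np.zeros((4, 1))             # weights of a single linear node

for _ in range(100):
    pred = X @ W                             # feed data forward
    grad = 2 * X.T @ (pred - y) / len(X)     # gradient of the MSE objective
    W -= 0.1 * grad                          # step against the gradient

print(float(np.mean((X @ W - y) ** 2)))      # objective decreases over iterations
```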

Within the last decade, convolutional neural networks have become the state of the art in image classification. Convolutional layers only have connection weights to small spatial subsets of the previous layer, and apply these weight kernels across the entire input to produce feature maps.

The fact that convolutional layers learn local features and apply them across the entire input space leads to faster training and improved accuracy on data with a spatial structure.
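To make the weight-sharing idea concrete, here is a minimal sketch (2D rather than the 3D grids used in this work) of one small kernel being applied across an entire input to produce a feature map:

```python
# Minimal sketch of a convolutional layer's weight sharing: the same few
# kernel weights are reused at every spatial position of the input.
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

kernel = np.array([[1.0, -1.0]])                 # only two learned weights
feature_map = conv2d(np.random.rand(48, 48), kernel)
print(feature_map.shape)                          # (48, 47)
```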

The rise of GPU computing, in combination with other advances, has made training networks with many more layers feasible, leading to the surge of research in deep learning. Each successive layer in a deep neural network learns features at a higher level of abstraction.

Protein-ligand scoring is a natural generalization of image recognition where the full 3D “images” of protein-ligand complexes are used for training. Convolutional neural nets trained on protein-ligand interactions have the potential to provide substantially more accurate scoring functions for improved docking and virtual screening.

Datasets

All poses generated with smina/AutoDock Vina.

Community Structure-Activity Resource (CSAR)
§ CSAR version 2010 with 2011 update
§ 337 targets, eliminate weak binders
§ Crystal structures → reference poses
§ Poses < 2 Å RMSD → actives
§ Poses > 4 Å RMSD → decoys (labelling rule sketched below)

Database of Useful Decoys-Enhanced (DUD-E)
§ 101 targets
§ Active and decoy ligands
§ Unknown correct poses
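The CSAR active/decoy labelling rule reduces to a pair of RMSD thresholds. Below is a hedged sketch; `rmsd_to_crystal` is a hypothetical helper (e.g., a symmetry-corrected RMSD to the reference crystal pose), and treating 2-4 Å poses as excluded from training is an assumption:

```python
# Sketch of the CSAR pose-labelling rule: poses within 2 Å RMSD of the crystal
# reference are actives, poses beyond 4 Å are decoys; intermediate poses are
# treated here as excluded (an assumption).
def label_pose(rmsd_to_crystal):
    if rmsd_to_crystal < 2.0:
        return 1        # active (correct pose)
    if rmsd_to_crystal > 4.0:
        return 0        # decoy (incorrect pose)
    return None         # ambiguous 2-4 Å pose

print([label_pose(r) for r in (0.8, 3.1, 6.5)])   # [1, None, 0]
```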

Input Format

Voxel grid centered at the active site, calculated from molecular data.

34 atom type channels
§ 16 receptor atom types
§ 18 ligand atom types

23.5 Å³ Gaussian atom grid (voxelization sketch below)
§ 0.5 Å resolution
§ 48³ points
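A hedged sketch of the voxelization step: a 48×48×48 grid at 0.5 Å resolution with one channel per atom type, where each atom contributes a density around its position. The exact density function used here (a truncated Gaussian scaled by the atomic radius) and the helper names are assumptions for illustration; the poster only states that atoms are placed on a Gaussian grid.

```python
# Sketch: convert atoms into a multi-channel Gaussian voxel grid.
import numpy as np

N, RES = 48, 0.5          # 48^3 points at 0.5 Å spacing spans ~23.5 Å per side
NUM_TYPES = 34            # 16 receptor + 18 ligand atom type channels

def voxelize(coords, radii, type_ids, center):
    """coords: (A, 3) Å; radii: (A,); type_ids: (A,) ints in [0, 34); center: (3,)."""
    grid = np.zeros((NUM_TYPES, N, N, N), dtype=np.float32)
    axis = (np.arange(N) - (N - 1) / 2.0) * RES          # grid axis centered on the site
    gx, gy, gz = np.meshgrid(axis, axis, axis, indexing="ij")
    points = np.stack([gx, gy, gz], axis=-1) + center    # (N, N, N, 3) grid coordinates
    for xyz, r, t in zip(coords, radii, type_ids):
        d = np.linalg.norm(points - xyz, axis=-1)
        density = np.exp(-2.0 * d**2 / r**2)             # assumed Gaussian falloff
        density[d > 1.5 * r] = 0.0                       # truncate far from the atom
        grid[t] += density
    return grid
```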

Training

Caffe Deep Learning Framework; train to 10,000 iterations.

Layer-wise model definition
§ N-dimensional input layer
§ Convolutional layers
§ Non-linear layers (rectified linear units)
§ Fully-connected layers
§ Softmax (convert to probabilities)
§ Multinomial logistic loss (2-class)

Performance
§ Mini-batch parallelism (batch size = 10)
§ Multi-GPU support

Model Evaluation

Receiver operating characteristic (ROC)
§ False positive vs. true positive rate
§ Area under the ROC curve (AUC)

Clustered 3-fold cross-validation
§ Split targets into 3 balanced folds
§ DUD-E targets with 80% similarity were grouped into the same fold
§ Train 3 models, leaving one fold out
§ Combine performance on the test sets
§ Avoids testing on targets/ligands similar to the training set

Bootstrapped AUC (resampling sketch below)
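A hedged sketch of the bootstrapped AUC computation; resampling individual examples (rather than whole targets) and the 95% interval are assumptions, since the poster does not state the resampling unit:

```python
# Sketch: bootstrap the ROC AUC of pooled test-set predictions.
import numpy as np
from sklearn.metrics import roc_auc_score

def bootstrap_auc(labels, scores, n_boot=1000, seed=0):
    rng = np.random.default_rng(seed)
    labels, scores = np.asarray(labels), np.asarray(scores)
    aucs = []
    for _ in range(n_boot):
        idx = rng.integers(0, len(labels), len(labels))   # sample with replacement
        if labels[idx].min() == labels[idx].max():
            continue                                      # need both classes for an AUC
        aucs.append(roc_auc_score(labels[idx], scores[idx]))
    lo, hi = np.percentile(aucs, [2.5, 97.5])
    return roc_auc_score(labels, scores), (lo, hi)        # point estimate and 95% interval
```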

Optimization

CSAR set. Change a single parameter relative to a reference model:
§ Width of network
§ Depth of network
§ Resolution of grid
§ Rotating, translating, balancing, and shuffling during training
§ Types of pooling layers
§ Numerical vs. binary occupancies
§ Values of radius multiplier
§ Fully connected layer at the end

Evaluate by accuracy and training time. Combine the best changes into a new reference model and repeat.

DUD-E Evaluation

Cross-validation was done with the best model. Different ratios of DUD-E to CSAR data were used to train. Multiple poses of ligands vs. single poses were tested:
§ The maximum score of a ligand's poses was taken as the ligand's score (sketch below)
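The multi-pose scoring rule reduces to a per-ligand maximum; a minimal sketch with made-up ligand identifiers:

```python
# Sketch: a ligand's virtual-screening score is the maximum CNN score
# over all of its docked poses.
def ligand_scores(pose_scores):
    return {lig: max(scores) for lig, scores in pose_scores.items()}

print(ligand_scores({"ZINC0001": [0.12, 0.87, 0.45],
                     "ZINC0002": [0.05, 0.21]}))
# {'ZINC0001': 0.87, 'ZINC0002': 0.21}
```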

§ Each round of optimization increased accuracy and decreased training time
§ Rotations, small translations, balancing between actives and decoys, and shuffling order during training reduce overfitting to data
§ Reducing dimensions lowers training time
§ Higher resolution increases accuracy but also significantly increases training time
§ Optimization increased the AUC of the best model from 0.78 to 0.82
§ Best model has a better AUC than Vina's scoring function



This work is supported by R01GM108340 from the National Institute of General Medical Sciences. TECBio REU @ Pitt is supported by the National Science Foundation under Grant DBI-1263020 and is co-funded by the Department of Defense in partnership with the NSF REU program.

Future Work
• Explore alternative network topologies, such as residual neural networks
• Evaluate the use of noise models when training
• Investigate the use of CNN scoring for affinity prediction
• More informative visualizations from backpropagated gradients
• Extract positional gradients from the neural network to support energy minimization
• Use CNN energy minimization to implement CNN-based pose generation
• Use reinforcement learning to iteratively refine CNN models for pose generation
• Deploy an open-source comprehensive CNN-based molecular docking and energy minimization software package (http://github.com/gnina)

GPU Accelerated Molecular Gridding
§ Parallelize over atoms to obtain a mask of atoms that overlap each grid region
§ Use an exclusive scan to obtain a list of atom indices from the mask
§ Parallelize over grid points, using the reduced atom list to avoid the O(N_atoms) check (see the sketch below)
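A CPU-side sketch of the three steps above, with NumPy standing in for the GPU kernels; the function and variable names are illustrative and not taken from the gnina code:

```python
# Sketch of the gridding strategy: build a per-region atom-overlap mask,
# compact it with an exclusive scan into an index list, then let each grid
# point scan only that short list instead of all N atoms. On the GPU, each
# step maps to one parallel kernel.
import numpy as np

def atoms_per_region(atom_xyz, atom_radii, region_min, region_max):
    # Step 1 ("parallelize over atoms"): mask of atoms overlapping the region.
    lo = region_min - atom_radii[:, None]
    hi = region_max + atom_radii[:, None]
    mask = np.all((atom_xyz >= lo) & (atom_xyz <= hi), axis=1).astype(np.int32)

    # Step 2: an exclusive scan gives each overlapping atom its output slot.
    offsets = np.cumsum(mask) - mask                 # exclusive prefix sum
    atom_list = np.empty(mask.sum(), dtype=np.int64)
    atom_list[offsets[mask == 1]] = np.nonzero(mask)[0]

    # Step 3 ("parallelize over grid points") would loop only over atom_list
    # for every voxel in this region, avoiding the O(N_atoms) check per point.
    return atom_list
```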

Ligand binding prediction with DUD-E
• CNN scoring trained with DUD-E has better performance than Vina for 90% of targets
• Training with a DUD-E/CSAR 2:1 mix is better for 81% of targets
• Training with CSAR and testing on DUD-E, or vice versa, is not very effective
• Scoring multiple poses, instead of the pose top-ranked by Vina, and taking the maximum score increases accuracy for virtual screening, suggesting the CNN model can select better poses
• Using both CSAR and DUD-E to train increases this gain

Visualization Method

Ligand
§ Single atom decomposition
• Remove atoms one by one and score
• Compute the score difference
• Assign the score difference to the removed atom
§ Fragment decomposition
• Fragment the ligand with RDKit
• For each fragment: score the ligand without the fragment and accumulate the score difference on the fragment's atoms
§ Average the single atom and fragment scores

Protein
§ Remove whole residues at a time
§ Compute the score difference
§ Set all residue atoms to the score difference

Store the differences in the B-factor field of a .pdb file and visualize with PyMOL. (A sketch of this procedure follows.)
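A hedged sketch of the masking-and-rescoring procedure above. `cnn_score` is a hypothetical stand-in for the trained CNN applied to a receptor/ligand pose, and atoms are plain records rather than RDKit objects; the fragment-decomposition branch is omitted for brevity.

```python
# Sketch: per-atom and per-residue score contributions by removal and rescoring.
# The resulting values would be written into the B-factor column of a .pdb file
# and colored in PyMOL.
def single_atom_scores(cnn_score, receptor, ligand_atoms):
    base = cnn_score(receptor, ligand_atoms)
    contributions = []
    for i in range(len(ligand_atoms)):
        # Remove one ligand atom at a time and rescore what remains.
        reduced = ligand_atoms[:i] + ligand_atoms[i + 1:]
        contributions.append(base - cnn_score(receptor, reduced))
    return contributions            # one score difference per ligand atom

def residue_scores(cnn_score, receptor_residues, ligand_atoms):
    # Protein side: remove whole residues at a time and assign the score
    # difference to every atom of the removed residue.
    full = [a for res in receptor_residues for a in res]
    base = cnn_score(full, ligand_atoms)
    out = []
    for i, res in enumerate(receptor_residues):
        rest = [a for j, r in enumerate(receptor_residues) if j != i for a in r]
        out.append((i, base - cnn_score(rest, ligand_atoms)))
    return out                      # (residue index, score difference) pairs
```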

Insights

Spatially accurate: When compared with crystal poses, portions of the test poses that are in the same location score well. Atoms far removed from their crystal location often score poorly.

Ligand and receptor agreement: Effects are often shown on both molecules at interacting locations.

Ligand more significant than receptor: Ligands often show more significant negative and positive values than receptors, except in the presence of metal ions, which often dominate the score.

Little consensus on carbon atoms: Models from different training sets disagree on which carbons significantly contribute to the overall score.

Visualization examples (PDB): 2byi, 3ejt, 2fxu, 2idz, 4ubp, 2vw5

