Paweł Aleksander Siedlecki Affinity prediction of low molecular … · 2019. 9. 10. · Maciej...
Paweł Aleksander Siedlecki
Affinity prediction of low molecular weight compounds to
protein receptors. Application to high throughput
screening.
1. Name and surname
Paweł Aleksander Siedlecki
2. Education and obtained scientific titles, degrees:
Doctor of Philosophy degree in biological sciences in biology, Institute of
Biochemistry and Biophysics (IBB), Polish Academy of Sciences (27.06.2006). Title
of PhD thesis: “New inhibitors of human DNMT1 methyltransferase - computer
design and evaluation”. Supervisor: prof. dr hab. Piotr Zielenkiewicz, Department of
Bioinformatics, Institute of Biochemistry and Biophysics, Polish Academy of
Sciences, Warsaw, Poland.
Reviewers:
- prof. dr hab. Andrzej Jerzmanowski, Faculty of Biology, University of Warsaw
- prof. dr hab. Grzegorz Grynkiewicz, Pharmaceutical Research Institute, Warsaw
- prof. dr Sandor Suhai, Deutsches Krebsforschungszentrum (DKFZ), Heidelberg, Germany
Master of Science in biology, specialization microbiology, Faculty of Biology,
University of Warsaw, Poland (2.11.2000). Title of M.Sc. thesis: “Evolution of
archaeal TBP proteins; modeling structural features responsible for thermostability”.
Supervisor: prof. Piotr Zielenkiewicz.
3. Academic appointments:
● 2008 – present: adiunkt (assistant professor) at the Department of Systems Biology, Institute of
Experimental Plant Biology, Faculty of Biology, University of Warsaw,
Poland
● 2006 – present: adiunkt (assistant professor) at the Department of Bioinformatics, Institute of
Biochemistry and Biophysics, Polish Academy of Sciences, Warsaw, Poland
● 2005-2006: Biologist, employed at the Institute of Biochemistry and
Biophysics, Polish Academy of Sciences, Warsaw, Poland
● 2002-2005: Internship at Deutsches Krebsforschungszentrum (DKFZ),
Heidelberg, Germany (2 years total)
● 2000-2004: PhD student (scholarship) at the School of Molecular Biology,
Institute of Biochemistry and Biophysics, Polish Academy of Sciences,
Warsaw, Poland
4. Scientific achievement according to the current regulations (article 16,
paragraph 2 of the act of March 14, 2003, on scientific degrees and the
scientific title and on degrees and the title in arts; Journal of Laws (Dz. U.) of 2016,
item 882, as amended in Dz. U. of 2016, item 1311):
a. The title of the scientific achievement:
“Affinity prediction of low molecular weight compounds to protein receptors.
Application to high throughput screening.”
b. Publications included in the scientific achievement
● The scientific achievement consists of 7 publications published in journals listed in
the Journal Citation Reports (JCR).
● Total IF (impact factor) of journals in which publications included in the scientific
achievement appeared, according to the year of publication and to Web of Science
(WoS) - 32
● The number of citations of publications included in the scientific achievement until
the date of submitting the application, according to WoS - 38
● Total MSHE points (according to the list of the Ministry of Science and Higher
Education, MSHE), all from A category - 280
1. Maciej Wójcikowski, Michał Kukiełka, Marta Stepniewska-Dziubinska and Paweł Siedlecki, 2018,
“Development of a Protein-Ligand Extended Connectivity (PLEC) fingerprint and its application
for binding affinity predictions”, Bioinformatics. 2018 Sep 8.
IF: 5.481, MNiSW: 45
2. Marta Stepniewska-Dziubinska, Piotr Zielenkiewicz and Paweł Siedlecki, 2018,
“Development and evaluation of a deep learning model for protein-ligand binding affinity
prediction”, Bioinformatics. 2018 Nov 1;34(21):3666-3674.
IF: 5.481, MNiSW: 45
3. Maciej Wójcikowski, Pedro J. Ballester and Paweł Siedlecki, 2017,
“Performance of machine-learning scoring functions in structure-based virtual screening”,
Sci Rep. 2017 Apr 25;7:46710.
IF: 4.259, MNiSW: 40
4. Marta Stepniewska-Dziubinska, Piotr Zielenkiewicz and Paweł Siedlecki, 2017,
“DeCAF-Discrimination, Comparison, Alignment Tool for 2D PHarmacophores.”,
Molecules. 2017 Jul 6;22(7).
IF: 2.861, MNiSW: 30
5. Maciej Wójcikowski, Piotr Zielenkiewicz and Paweł Siedlecki, 2015,
“Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field.”,
J Cheminform. 2015 Jun 22;7:26.
IF: 4.547, MNiSW: 45
6. Maciej Wójcikowski, Piotr Zielenkiewicz and Paweł Siedlecki, 2014,
“DiSCuS: an open platform for (not only) virtual screening results management.”,
J Chem Inf Model. 2014 Jan 27;54(1):347-54.
IF: 4.068, MNiSW: 40
7. Szymon Kaczanowski*, Paweł Siedlecki* and Piotr Zielenkiewicz, 2009,
“The High Throughput Sequence Annotation Service (HT-SAS) - the shortcut from sequence to
true Medline words.”, BMC Bioinformatics. 2009 May 16;10:148.
IF: 3.781, MNiSW: 35
Corresponding author
* Joint first author
The above scientific achievement has been documented in the form of a cycle of
thematic publications. It consists of seven scientific articles. In each of them a significant
part of the work was accomplished in cooperation with PhD students for whom I was the
didactic and scientific supervisor. I am the main author or corresponding author
of all these publications except publication no. 7, of which I am a joint first author.
c. Presentation of the scientific objectives of the publications listed above,
the obtained results, and possible applications
Introduction
The search for low molecular weight compounds capable of modulating selected cellular
functions, influencing the activity of proteins and/or their interactions with each other, is an
important element directing researchers towards a given class of chemical compounds.
Hillisch et al., in a paper from 2015 [1], state that over half of the new compounds then being
tested in phase I clinical trials were developed with the aid of in silico methods.
These methods fall into two main branches: 1) those based on the characteristics of known
ligands (so-called ligand-based) and 2) those based on the structural features of receptors
(receptor-based). In my work, I have tried to develop both types of methodologies and to apply
them in practice in my research projects. In my opinion, particularly interesting results are obtained
when predictions are based on the structure of protein targets [2,3].
The key elements of ligand-receptor affinity prediction are the generation of the spatial
conformation of the complex and the way the complex is evaluated. For both elements
there are a number of methods, approximations and limitations related to the
properties of the complexes themselves and to computational constraints [4]. Current
methodologies focus on evaluating complexes obtained through experimental and/or in silico
methods, including comparative modeling [5] or de novo modeling [6,7]. When using the receptor
structure, it may be difficult to obtain the correct "native" conformation of the ligand bound
to the receptor. This in turn may lead to an incorrect assessment of its potential activity [8,9].
Unfortunately, this problem results from the properties of the biological targets (receptors)
themselves, whose conformation can change during binding. To some extent, this problem is
addressed by molecular dynamics [10], ensemble docking [11] or fully flexible docking [12],
but these methods are sensitive to the correct parameterization of systems and are still
computationally costly, which greatly limits their use in screening.
Against this background, molecular docking has a range of applications. In short, the method
allows one to develop a model of favorable interactions between the ligand and receptor, starting
from uncomplexed entities. It begins with defining a space on the macromolecule (receptor),
e.g. the active site of an enzyme. This space is then searched with various algorithms (genetic
[13], anchor-and-grow [14], and others [11]) in order to find a ligand conformation which fits
sterically and electrostatically within the defined constraints. Docking can be applied to different
types of ligands: a small organic compound, a peptide or a nucleic acid fragment. The term
“fitting” refers to generating several or more diverse conformations of a given compound in
which it interacts favorably with the receptor. In comparative studies in which native
complexes of small molecule compounds were reconstructed, molecular docking achieved
a success rate of 70-80% [15].
A more difficult and challenging task is the assessment of the complex
[4], i.e. the evaluation of the strength of the ligand's interaction with the receptor. The
evaluation of generated conformations is currently the most critical element of in silico
screening; it directly affects its effectiveness and determines the success
rate. The evaluation must make it possible to select the most probable of the conformations
generated for a single ligand (choosing the best complex out of many), but also
to compare conformations of different ligands, i.e. to produce a ranking list of different
ligand-receptor complexes. In in silico screening (high throughput virtual screening, HTVS)
this is the job of scoring functions, which are responsible for indicating compounds that
may be active and are worth testing experimentally.
In HTVS campaigns, a library of many hundreds of thousands or millions of chemical
compounds is searched, most often to pick only a small percentage, or even per mille, of molecules
that potentially bind to the receptor. Unfortunately, an estimation of binding energy that is both
quick and accurate is not possible [16]; speed comes at the cost of accuracy [2,17]. Thus,
simplifications and approximations are used to speed up scoring calculations. Scoring
functions are developed based on complexes solved by experimental methods, where the
ligand's “fit” to the receptor is very high. In the case of docking, however, one often obtains
many sub-optimal conformations (not fully aligned to the receptor structure), which are
challenging for such functions [4,18]. Another methodological disadvantage is the use of a
limited number of complexes to create the scoring function [15]. As a result, not all
components of ligand-receptor interactions are represented frequently enough in the training
set [19]. Scoring functions can be built in several different ways, e.g. using force fields,
statistical potentials or all kinds of hybrids of these categories [20]. Regardless of
the type, they are characterized by well-defined linear equations whose elements (types of
interactions and their weights) are constant [2]. Such functions, in addition to their undoubted
advantages such as speed and ease of interpretation, have basic disadvantages in the form of
low accuracy and sensitivity [17].
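The fixed, linear form of a classical scoring function can be illustrated with a minimal Python sketch. The interaction terms, their names and the weights below are hypothetical placeholders chosen only for illustration; they are not taken from any published scoring function.

```python
# Hypothetical, fixed weights for a few interaction terms (illustrative values only).
WEIGHTS = {
    "hydrogen_bonds": -0.5,
    "hydrophobic_contacts": -0.2,
    "rotatable_bonds": 0.1,  # crude penalty for ligand flexibility
}


def linear_score(terms, weights=WEIGHTS):
    """Return the weighted sum of interaction terms for one complex.

    The weights are constants: they do not depend on the complex being scored,
    which makes such a function fast and easy to interpret.
    """
    return sum(weights[name] * value for name, value in terms.items())


# A made-up pose with 3 hydrogen bonds, 10 hydrophobic contacts and 4 rotatable bonds.
pose = {"hydrogen_bonds": 3, "hydrophobic_contacts": 10, "rotatable_bonds": 4}
score = linear_score(pose)
```

Because every term enters with a constant weight, an interaction type that is missing from the equation, or rare in the data used to fit the weights, cannot influence the score.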
The aim of my research after obtaining the PhD degree was to increase the sensitivity and
specificity of in silico high throughput screening campaigns by developing new descriptions
of the ligand-receptor complex. These new methods of encoding the ligand-receptor complex
would allow the use of much larger datasets of structures, paving the way to include specific
ligand-receptor interactions that occur rarely, or that are not currently considered to
influence binding affinity. I investigated the possibility of using structural data available in
public databases to answer two main questions: 1) which small molecule compounds would
be active for a given receptor structure, and 2) which molecular target(s) a new low molecular
weight compound could bind to. Answering these questions is highly complex but at the
same time extremely important from the scientific and application points of view. One can
approach them in many ways depending on the type of information used, e.g. only the
structure of the ligand itself [21], the full 3D coordinates of ligand-receptor complexes [3,22],
or various combinations of the above [2,23]. During my work I developed bio- and
cheminformatics methods that determine how elements of structural
information can be used to predict the affinity of a given ligand to a receptor. I was particularly
interested in making my research applicable to screening, where the speed of the process is
important, but the accuracy at the top of the compound ranking list matters even more. Below I
describe some of my findings published so far and comment on future perspectives of my ongoing research.
DiSCuS
My research began with classical molecular docking experiments, dealing with a more
practical aspect of screening. As part of the PBS grant "New drugs for targeted multiple
myeloma therapy", in which I directed the screening task, I was looking for new
low molecular weight compounds that could specifically and selectively bind to the PIN domain of
the human DIS3 protein. I searched for two types of compounds: those capable of chelating metal
ions, and competitor compounds that prevent the metal ion from binding, thereby disturbing
the chemical reaction. It was necessary to generate a number of DIS3 structures and to
conduct a separate screening for each of them. Given the number of screening tasks,
my goal was to create a system able to integrate in silico and experimental data,
correcting my predictions based on ongoing biochemical affinity assays. Such a system would
in a way be “taught” which combinations of scoring functions, and with what weights, yield
results that are closest to the experimental data. The DiSCuS system [22] was created for this
purpose.
From the scientific perspective, the most important element of DiSCuS is the RankScore
module, used to find the optimal consensus model of scoring functions. It does so by
adjusting the individual components (scoring functions) to experimental evidence in a
normalized assessment. When experimental activities are available, DiSCuS calculates the
AUC values for the ROC curves [24] and uses them to measure the performance of each
function. The system can then semi-automatically adjust the evaluation procedure by
applying different weights to selected scoring functions and/or disabling them completely. The ROC
curve is a graphical representation of the performance of a predictive method; it allows one to
evaluate the correctness of the model (a classifier) by describing its sensitivity and
specificity.
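The weighting step can be sketched in Python as follows. This is not the RankScore implementation; it is a minimal illustration which assumes that the scores of each function are min-max normalized and that the weights (for example derived from ROC AUC values) are supplied by the user. A weight of zero disables a function. All function and variable names are hypothetical.

```python
def min_max_normalize(values):
    """Scale a list of scores to the interval [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def consensus_scores(scores_by_function, weights):
    """Combine normalized scores from several scoring functions.

    scores_by_function: dict name -> list of scores (higher = better), one per compound.
    weights: dict name -> non-negative weight; a weight of 0 disables the function.
    """
    active = {name: w for name, w in weights.items() if w > 0}
    total = sum(active.values())
    if total == 0:
        raise ValueError("at least one scoring function must have a positive weight")
    n_compounds = len(next(iter(scores_by_function.values())))
    combined = [0.0] * n_compounds
    for name, weight in active.items():
        normalized = min_max_normalize(scores_by_function[name])
        for i, value in enumerate(normalized):
            combined[i] += weight * value / total
    return combined


# Made-up scores of three compounds from three hypothetical scoring functions.
scores = {"f1": [1.0, 2.0, 3.0], "f2": [30.0, 10.0, 20.0], "f3": [5.0, 5.0, 5.0]}
weights = {"f1": 0.75, "f2": 0.25, "f3": 0.0}  # f3 is switched off
combined = consensus_scores(scores, weights)
```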
Each point of such a curve corresponds to a confusion matrix for a given cut-off level (threshold) at
which the performance of the method is measured. For example, if we assume a sensitivity of
0.8 (the method correctly predicted 80% of active compounds), the ROC curve allows one to
determine how many inactive compounds were incorrectly considered active by the
predictive model. By calculating the area under the ROC curve (ROC AUC) we obtain a
single value in the interval [0,1], allowing comparison of prediction models with each other
[25]. The ROC AUC can be interpreted as the probability that a predictive model will rank
a random element of the positive class above a random element of the negative
class. It is worth noting that there is no single threshold value above which a model can be considered
"good"; it depends on the type of data and the specifics of the problem. However, when
comparing different predictive models on the same screening data, the ROC AUC is a very
useful tool.
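The probabilistic interpretation of the ROC AUC can be made concrete with a short Python sketch that computes it directly as the fraction of (active, inactive) pairs ranked in the right order. It assumes that higher scores mean "more likely active" and that labels are 1 for active and 0 for inactive; the data are made up. The pairwise loop is chosen for clarity, not speed.

```python
def roc_auc(scores, labels):
    """ROC AUC as the fraction of (active, inactive) pairs in which the
    active compound receives the higher score; ties count as one half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes are required")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Two actives and two inactives; one active is scored below one inactive.
auc = roc_auc([0.9, 0.2, 0.8, 0.1], [1, 1, 0, 0])
```

In this example three of the four active-inactive pairs are ordered correctly, so the value is 0.75; a model that ranks compounds at random would be expected to score around 0.5.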
DiSCuS can be used to analyze simple docking experiments with a single target, although
many of its advantages become apparent during the analysis of big data, i.e. large screening
campaigns against many targets. Within the framework of the said grant, approximately 1.9
million small molecule compounds from various databases were docked into
five receptor models using 3 different docking programs. Each compound had on average 5
different conformations for a single receptor. Ultimately, approximately 140 million
ligand-receptor complexes were obtained and analyzed in the DiSCuS system.
In addition to the analysis of the screening experiments, a new way of describing the
interactions of the ligand-receptor complex, called the "Binding Profile", was developed. It allows
one to find a wide range of physical interactions present in the complex and save them as
one-dimensional (1D) strings. Such profiles can be used for filtering or for enriching ligand
libraries. Importantly, they can be compared between ligands or between conformations of a single ligand.
This concept, developed early in DiSCuS, inspired me to experiment further with novel
methods of describing ligand-receptor complexes, one of them being PLEC [23], which will
be described later. Several ways to create binding profiles have since been described in
the literature [26–28], and the interaction profile itself has become an important
cheminformatics tool.
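The idea of storing interactions as a 1D string and comparing such strings can be sketched in Python. The encoding below (one bit per residue and interaction type, compared with the Tanimoto coefficient) is an assumption made for illustration; the actual Binding Profile format of DiSCuS is not reproduced here, and the residue names are hypothetical.

```python
# Hypothetical, fixed order of (residue, interaction type) positions in the profile.
POSITIONS = [
    ("ASP67", "hbond"),
    ("ASP67", "salt_bridge"),
    ("PHE112", "pi_stacking"),
    ("LEU45", "hydrophobic"),
]


def binding_profile(interactions, positions=POSITIONS):
    """Encode a set of observed (residue, interaction) pairs as a '0'/'1' string."""
    return "".join("1" if p in interactions else "0" for p in positions)


def tanimoto(profile_a, profile_b):
    """Tanimoto similarity of two equal-length bit strings."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must have the same length")
    both = sum(a == "1" and b == "1" for a, b in zip(profile_a, profile_b))
    either = sum(a == "1" or b == "1" for a, b in zip(profile_a, profile_b))
    # Two empty profiles are treated as identical.
    return both / either if either else 1.0


pose_a = binding_profile({("ASP67", "hbond"), ("LEU45", "hydrophobic")})
pose_b = binding_profile(
    {("ASP67", "hbond"), ("PHE112", "pi_stacking"), ("LEU45", "hydrophobic")}
)
similarity = tanimoto(pose_a, pose_b)
```

With a fixed position order, the same string comparison works between two ligands and between two conformations of one ligand, and a required substring pattern can serve as a library filter.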
DiSCuS is built as a modular system, designed with the integration of various
external tools in mind. It is important to think about it not as a replacement for known tools,
but rather as an information hub that allows one to select the relevant features from different
programs and integrate them into a unified decision platform. More information about the
interface, installation, user documentation and sources can be found on the DiSCuS website:
http://discus.ibb.waw.pl.
Figure 1. The two main functionalities of the DiSCuS system highlighted in the text. On the left, the "Binding Profile" module, on the right docking results assessment with RankScore module.
ODDT
Development of the DiSCuS system and its use in scientific projects (NCBiR grants: PBS and
Leader) and commercial projects (startups: Metheor Corp. and NooTech Ltd.) made me
realize that a cheminformatic toolchain would be required in order to start using more
advanced techniques of ligand-receptor interaction analysis and to test hypotheses efficiently.
This was the motivation to develop the Open Drug Discovery Toolkit (ODDT) [29], a
set of tools and algorithms adapted to work with structural data of ligand-receptor complexes.
ODDT integrates two of the most comprehensive toolkits: OpenBabel, designed for work with
biomolecules (receptors), and RDKit, with many functions directed towards small-molecule
chemical compounds (ligands). Among the many methods implemented in ODDT, both designed
in-house and developed by other researchers, three modules have proved the most important
over time: analysis of protein-ligand interactions, a module for docking and
scoring, and a library that allows one to design novel high throughput screening (HTS) protocols.
All three will be discussed below.
The interaction module is a set of tools for analyzing receptor-ligand interactions.
The full list of interactions consists of hydrogen bonds, salt bridges, hydrophobic contacts,
halogen bonds, pi stacking (face-to-face and edge-to-face), pi-cation, pi-metal and ion
coordination. In addition, directional interactions, such as hydrogen bonds and salt bridges,
have two modes of operation: the "strict" mode, which indicates whether both the angular and
distance parameters are within the limit values, and the "crude" mode, in which only
distance criteria are taken into account. This functionality is particularly useful when working
with comparative models, where the receptor structure may not be accurate, or with docking
results, where the ligand does not fit perfectly to the rigid structure of the receptor. Interactions
are detected using functions developed in-house and can be analyzed for characteristic
binding patterns or used as descriptors for a prediction function.
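The difference between the "strict" and "crude" modes can be sketched in Python for a single hydrogen bond. This is not the ODDT implementation: the explicit hydrogen position, the 3.5 Å distance limit and the 120° angle limit are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math


def angle_deg(a, b, c):
    """Angle at point b (in degrees) formed by the points a-b-c."""
    v1 = [a[i] - b[i] for i in range(3)]
    v2 = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(v1, v2))
    norm = math.sqrt(sum(x * x for x in v1)) * math.sqrt(sum(x * x for x in v2))
    cosine = max(-1.0, min(1.0, dot / norm))  # guard against rounding errors
    return math.degrees(math.acos(cosine))


def is_hbond(donor, hydrogen, acceptor, mode="strict", max_dist=3.5, min_angle=120.0):
    """Crude mode checks only the donor-acceptor distance;
    strict mode additionally checks the donor-hydrogen-acceptor angle."""
    close = math.dist(donor, acceptor) <= max_dist
    if mode == "crude":
        return close
    if mode == "strict":
        return close and angle_deg(donor, hydrogen, acceptor) >= min_angle
    raise ValueError("mode must be 'strict' or 'crude'")


# A well-aligned contact and a bent one with the same donor-acceptor distance.
linear = is_hbond((0, 0, 0), (1, 0, 0), (2.9, 0, 0), mode="strict")
bent_strict = is_hbond((0, 0, 0), (0, 1, 0), (2.9, 0, 0), mode="strict")
bent_crude = is_hbond((0, 0, 0), (0, 1, 0), (2.9, 0, 0), mode="crude")
```

The bent contact is rejected in strict mode but accepted in crude mode, which is the behavior one wants when the geometry comes from a comparative model or an imperfect docking pose.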
The docking and scoring module provides a uniform tool for the preparation of input
data (e.g. ligand datasets), independent of the requirements of a specific docking software, and
performs the docking procedure with selected docking algorithms. It also provides
in-house implementations of two important machine-learning models (scoring functions):
NNscore v2 [30] and three versions of RFscore [31]. ODDT uses the sklearn
package [32] as the main mechanism underlying machine learning and scoring evaluation,
with ffnet [33] for the construction of neural networks. The module also supports
multithreading, even if the docking program itself does not have such functionality, which
significantly improves the use of available computing resources.
For my research interests the most important feature was the ability to quickly prototype new
ways of assessing ligand-receptor conformations, with new descriptors and machine
learning capabilities. The two main types of machine learning models are regressors, for
continuous data such as IC50, EC50, Ki/Kd, and classifiers, used for categorical data, e.g.
ligands marked as active or inactive. ODDT handles both types of data by providing
a set of predictive models such as random forests, support vector machines (SVMs) and
artificial neural networks (single and multilayered). These models have been shown to be
useful in the evaluation of protein-ligand complexes [30,31,34] and in SAR and QSAR
methodologies [35,36]. Moreover, ODDT provides a built-in mechanism for assessing the
predictive power of generated models. In a single step one can calculate many metrics,
including ROC AUC and the enrichment factor (EF) at a given percentage
of the ranking list.
The enrichment factor EF [37] is a particularly useful measure in screening. It indicates how
many times more active compounds are present in the selected top percentage of the ranking list
than would be expected from a random distribution for a set of the given size. In other words,
it shows how much better the predictive model is than a random model. In screening, EF suggests
what percentage of the list of compounds should be subjected to experimental tests to find active
compounds. For example, EF0.1% = 10 means that among the best-rated 0.1% of all compounds
analyzed, there are 10 times more active compounds than would result from a random distribution.
This may mean that the method which obtained such a result is a better alternative to "blind"
experimental testing of compounds [38]. It should be noted, however, that in practice there is
no fully random library of compounds in which all possible combinations of features are present in
a uniform distribution. The enrichment factor of a given predictive model for two different
datasets of compounds may thus be different. However, if the results achieved by a predictive
model (e.g. a scoring function) differ significantly between datasets or, even worse, are
significantly poorer for new data, attention should be paid to the problem of
model overfitting [39].
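The definition of the enrichment factor can be written down as a short Python sketch. It assumes that higher scores mean "more likely active", that labels are 1 for active and 0 for inactive, and that the size of the top fraction is rounded to a whole number of compounds; the function name and the data are illustrative and this is not the ODDT routine.

```python
def enrichment_factor(scores, labels, fraction):
    """Ratio of the active rate in the top `fraction` of the ranking
    to the active rate in the whole set."""
    n = len(scores)
    n_top = max(1, int(round(n * fraction)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    actives_top = sum(label for _, label in ranked[:n_top])
    actives_all = sum(labels)
    if actives_all == 0:
        raise ValueError("the dataset contains no active compounds")
    return (actives_top / n_top) / (actives_all / n)


# 100 compounds, 10 of them active; 5 actives are ranked in the top 10.
scores = list(range(100, 0, -1))
labels = [1] * 5 + [0] * 45 + [1] * 5 + [0] * 45
ef_10 = enrichment_factor(scores, labels, 0.10)
```

Half of the top 10 compounds are active compared with 10% overall, so EF10% is 5 in this made-up example.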
Overfitting is a situation in which the model does not reproduce the trends present in the data
but reproduces the data itself. For example, if the model contains too many parameters in
relation to the size of the data on which it is trained, minimizing errors will generate a
formula describing every element of the input data. This results in a nearly perfect fit of the
model to the training data, but poor generalization (the model's ability to describe new,
unknown data [40]). To control and avoid such a situation, a number of validation methods
can be used; several variants of cross-validation are implemented in ODDT: k-fold
cross-validation and LOO / LPO (Leave-One-Out and Leave-P-Out). Cross-validation
is a method in which input data are divided into subsets; some of them are
used to train the model while the remaining part is used to test its performance.
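How the data are divided in k-fold cross-validation can be shown with a minimal Python sketch. It only generates the index splits and assumes no shuffling; the function name is hypothetical, and in practice a library routine (such as those in sklearn) would normally be used.

```python
def k_fold_indices(n_samples, k):
    """Yield (train_indices, test_indices) for each of the k folds.

    Every sample is used for testing exactly once. Setting k equal to
    n_samples gives Leave-One-Out cross-validation.
    """
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and the number of samples")
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in fold_sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size


folds = list(k_fold_indices(10, 3))
```

The model is trained k times, each time on the training indices of one fold, and the k test results are averaged to estimate how well it generalizes.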
In summary, ODDT covers all elements related to the construction of new predictive models
based on classical scoring functions and/or machine learning: from input operations
(preparation of biomolecule structures available from PDBbind [15], DUD-E [41] and CASF
[15]), through training, testing and validating the model, up to the assessment of its
predictive performance. One can think of ODDT as a workshop or laboratory, where a set of
tools and methods allows one to design experiments and analyze results.
More information on ODDT can be found at https://github.com/oddt/oddt.
Figure 2. Overview of the most important features of the Open Drug Discovery Toolkit (ODDT). On the left, a graphical representation of data analysis possibilities for CK2 kinase active and inactive ligands. On the right, actual ODDT code that docks (using the Autodock Vina program) a set of active ligands with given physicochemical parameters and evaluates them using the RF-score v1 function.
RF-Score-VS
As mentioned earlier, one of the basic problems in evaluating in silico screening experiments
is the use of sub-optimal conformations from docking to predict interactions. The three most
important elements introducing noise to the dataset are 1) inaccurate ligand conformation
with respect to the receptor, 2) rigid receptor structure, and 3) biophysical effects, such as
desolvation or entropy, that are not taken into consideration. Direct simulation of these
elements, e.g. of complex flexibility with molecular dynamics, leads to a significant increase in
the cost of affinity calculations, making it impractical in screening.
In my research, I assumed that the first two problems, which are linked to each other, may be
solved to some extent by using a representation of the complex that is less restrictive than
classical Cartesian coordinates. The biophysical effects, however, can be taken into account only
indirectly, by using a larger and more diverse set of structural data than was previously
done.
Looking for my own solution for data representation, I came across the work of Dr. Pedro
Ballester, in particular [31], who proposed a description of the ligand-receptor complex based
on the number of atoms forming the surroundings of a given ligand. This was a very interesting
solution from my point of view, primarily because the description of the complex relied to a much
lesser extent on the perfect matching of molecules, allowing a more favorable
description of the sub-optimal conformations occurring in molecular docking. In this method,
a sphere with a given radius is created around a ligand atom, encapsulating nearby atoms of the
receptor. Then the receptor atoms in such spheres are counted by type and the counts are stored in
the form of a one-dimensional sequence of sums (a feature vector). Passing successively through
the ligand atoms, descriptions of local environments are constructed for the whole small molecule
compound, ultimately creating a new representation of the complex.
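The counting procedure can be sketched in Python as follows. This is an illustration of the general idea only: the element lists, the toy coordinates and the function names are hypothetical, and the 12 Å cutoff is taken from the example shown in Figure 3.

```python
import math
from collections import Counter


def environment_counts(ligand_atoms, receptor_atoms, cutoff=12.0):
    """Count receptor atoms, by element, found within `cutoff` of each ligand atom.

    Atoms are given as (element, (x, y, z)). Counts are accumulated per
    (ligand element, receptor element) pair over all ligand atoms.
    """
    counts = Counter()
    for lig_element, lig_xyz in ligand_atoms:
        for rec_element, rec_xyz in receptor_atoms:
            if math.dist(lig_xyz, rec_xyz) <= cutoff:
                counts[(lig_element, rec_element)] += 1
    return counts


def feature_vector(counts, ligand_elements=("C", "N", "O", "F"),
                   receptor_elements=("C", "N", "O", "S")):
    """Flatten the pair counts into a fixed-order one-dimensional vector."""
    return [counts.get((l, r), 0) for l in ligand_elements for r in receptor_elements]


# Toy complex: two ligand atoms and three receptor atoms, one of them far away.
ligand = [("F", (0.0, 0.0, 0.0)), ("C", (1.5, 0.0, 0.0))]
receptor = [("C", (3.0, 0.0, 0.0)), ("O", (0.0, 4.0, 0.0)), ("N", (20.0, 0.0, 0.0))]
vector = feature_vector(environment_counts(ligand, receptor))
```

Because only counts within a fairly large sphere are stored, small shifts of the ligand pose change the vector very little, which is what makes this description tolerant of sub-optimal docking conformations.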
The procedure described above can be modified, e.g. by dividing the sphere into smaller
sub-spheres and assigning different weights depending on the distance from the center, or by adding
information such as scores obtained by the complex from external functions. By utilising
various ways of describing the complexes, we created our own Random Forest [42,43]
predictive models, which are capable of predicting affinity values based on a conformation
obtained from molecular docking. What distinguishes our solution and makes it unique
is the use of negative data in the process of training the model. Our models have been trained
on 102 diverse protein targets, including GPCR receptors, chemokines, kinases and viral
proteases, to which about 20,000 active and about 800,000 inactive compounds from the
DUD-E database [41] were docked. Negative data, i.e. complexes of proteins with inactive
ligands, therefore account for about 97.5% of our entire dataset. Such data are not normally used,
and are even avoided, when training predictive models; it is assumed that they introduce noise
into the training set [44]. However, in the case of screening results, it is exactly this kind of
proportion that a predictive model will have to analyze. It is the ability to discriminate
between active and inactive compounds that a scoring function should possess.
Following this line of thinking, we built a prediction model called RF-Score-VS [2], whose
main application is the evaluation of ligand-receptor complexes in terms of their potential
affinity. One of our main results is the striking improvement in distinguishing between active
and inactive compounds at the top of the ranking list. The enrichment factor EF1%,
calculated as the average over all 102 protein targets, was 39 for the general model and 43.43
for models built for each target separately. The best performing classical scoring function
(Dock 3.6) obtained 16.86, which compared to our method gives about 2.2 times fewer active
compounds in the top 1% of the list. This is a substantial improvement in the
screening process. Compared to the widely popular Autodock Vina scoring
function, RF-Score-VS provides much improved correlation with measured activity (Pearson's
correlation Rp = 0.56 vs Rp = -0.18, respectively). Both these results became the basis for a
very well-received publication, cited and reused in a short time by many researchers [2].
The proposed combination of a less restrictive description of the ligand-receptor complex
with a much larger, diversified set of targets enriched with “negative”
complexes turned out to be a very interesting solution. It is worth noting that the most
numerous class in our data is inactive ligands in complex with receptors (the negative data),
while the efficiency of our method is measured as the ability to find active ligands.
In short, our idea of data preparation and negative data augmentation, combined with random
forests, made it possible to create a much improved model, tailored specifically for assessing
the results of in silico screening, with high sensitivity and specificity, several times better
than the solutions used so far. Our work was appreciated and found its way onto the list of the 100
most-read articles published in 2018 in Scientific Reports
(https://www.nature.com/collections/zzcpmcdkqp/content/76-100).
Figure 3. Results obtained with RF-Score-VS. Top panel: comparison of the scatter and correlation between real affinity values of active compounds and the predictions made by the widely used scoring function Vina (left) and RF-Score-VS (right). Bottom left panel: enrichment factor for various popular scoring functions and three versions of RF-Score-VS. Bottom right panel: a schematic representation of a ligand-receptor complex from PDB: 2p33; for the fluorine atom in the ligand, a sphere of 12Å is created, then all types of receptor atoms in the sphere are counted and stored in a one-dimensional vector. A detailed description of the methods and results can be found in [2].
Pafnucy
The success of RF-Score-VS confirmed that a more efficient predictor can be built by using a
less restrictive representation of ligand-receptor complexes. However, this has been
confirmed for a limited number of receptors, i.e. 102 structures. Currently, public databases
contain over 12,000 experimentally solved ligand-receptor complexes [45,46]. How would
the use of a much larger set of complexes influence the performance of affinity estimation? Does
the representation of complexes used in RF-Score-VS still limit the performance? Trying to
answer these questions, I wanted to build a model that would itself choose the
elements that are important for the prediction of interactions; in other words, to limit
as much as possible the engineering of the features used to train the model. The solution was
to create a neural network that could serve both as a feature selector and as a scoring function.
Neural networks have already been shown to be able to classify ligands as active or inactive
[47,48]. We set ourselves the goal of creating a network capable of returning the affinity value
for the ligand-receptor complex, so that it could be fully used in screening.
To increase the number of structures, we used ligand-receptor complexes available in the
PDBbind database [15]. The database was divided into 3 sets of data: a training set, test sets
and a validation set used to control the learning process. The training set included 11,906
complexes. The two test sets contained 195 complexes from the PDBbind subset "core set 2013"
and 290 complexes from the "core set 2016" collection. We used these test sets to quickly
compare our method with established scoring functions. The validation set
consisted of 1000 randomly selected complexes from the PDBbind database. Of course, none of the
complexes were present in both the training and test collections, so as to avoid any data leakage
problems.
In our approach, a complex is described as a cube with 20Å sides, built around the geometric
center of the ligand. The atoms inside the cube are then mapped onto a three-dimensional
grid with a resolution of 1Å, allowing the input to be represented in the form of a fixed-size
matrix. The input data (a 3D complex) is thus a four-dimensional tensor, in which
three dimensions are Cartesian coordinates, while the fourth is a vector describing the
"features" of the atom. We used 19 features to describe each atom:
● 9 bits (1 if present) corresponding to the types of atoms: B, C, N, O, P, S, Se, halogen
and metal.
● 1 integer corresponding to hybridization.
● 1 integer corresponding to the sum of bonds with heavy atoms.
● 1 integer corresponding to the sum of the heteroatom bonds.
● 5 bits (1 if present) corresponding to one of the five features defined by the SMARTS
pattern: hydrophobic, aromatic, acceptor, donor and ring.
● 1 number corresponding to partial charge.
● 1 integer to distinguish the ligand (1) from the receptor (-1).
The above representation is a very neutral description of the complex, in
which the receptor and the ligand share the same atom types (they differ only in the
ligand/receptor flag). This approach also acts as regularization [49] because it forces the
network to detect the interaction between the receptor atoms and the ligand.
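The gridding step described above can be illustrated with a minimal Python sketch. It assumes that atom coordinates and their 19-element feature vectors have already been computed elsewhere. The function names (centroid, make_grid), the nested-list layout, the inclusion of both faces of the cube (which gives 21 grid points per axis) and the summing of features for atoms that fall into the same cell are choices of this illustration, not a description of the published implementation.

```python
from typing import List, Sequence

N_FEATURES = 19   # length of the per-atom feature vector listed above
BOX_SIDE = 20.0   # cube side in Å
RESOLUTION = 1.0  # grid spacing in Å


def centroid(coords: Sequence[Sequence[float]]) -> List[float]:
    """Geometric center of a set of 3D points (here: the ligand atoms)."""
    n = len(coords)
    return [sum(c[k] for c in coords) / n for k in range(3)]


def make_grid(coords, features, center):
    """Map atoms onto a fixed-size grid centered on `center`.

    Returns a nested list of shape [n][n][n][N_FEATURES]. Atoms outside the
    cube are dropped; atoms sharing a cell have their features summed
    (an assumption of this sketch).
    """
    half = BOX_SIDE / 2.0
    n = int(BOX_SIDE / RESOLUTION) + 1  # both cube faces included -> 21 points
    grid = [[[[0.0] * N_FEATURES for _ in range(n)] for _ in range(n)]
            for _ in range(n)]
    for xyz, feat in zip(coords, features):
        idx = [int(round((x - c + half) / RESOLUTION)) for x, c in zip(xyz, center)]
        if all(0 <= i < n for i in idx):
            cell = grid[idx[0]][idx[1]][idx[2]]
            for k, value in enumerate(feat):
                cell[k] += value
    return grid
```

In this sketch the ligand centroid would be passed as `center`, and the coordinates and features of both ligand and receptor atoms would be passed together, so that one tensor describes the whole binding site.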
The TensorFlow library was used to build the model [50]. The input layer is followed by
3 convolutional layers and 3 dense layers. The output layer consists of one linear neuron
returning the affinity value. To improve learning, we used two types of regularization. The
first was "dropout" at a rate of 0.5 for the dense layers, which means that 50% of the neurons
were randomly masked at each training step and did not participate in the prediction. The
second was an L2 penalty on the magnitude of the weights.
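The two regularization techniques named above can be sketched in a few lines of plain Python. This is an illustration of the general techniques only: the "inverted" scaling of the surviving activations, the function names and the value of `lam` are assumptions of this sketch and are not taken from the model described here.

```python
import random


def dropout(activations, rate=0.5, rng=None):
    """Inverted dropout for one training step.

    Each activation is zeroed with probability `rate`; survivors are scaled
    by 1 / (1 - rate) so that the expected sum stays unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    rng = rng or random.Random()
    keep = 1.0 - rate
    return [a / keep if rng.random() < keep else 0.0 for a in activations]


def l2_penalty(weights, lam):
    """L2 penalty added to the loss: lam times the sum of squared weights."""
    return lam * sum(w * w for w in weights)


def regularized_loss(squared_error, weights, lam=0.001):
    """Data term plus weight penalty; `lam` is an illustrative value."""
    return squared_error + l2_penalty(weights, lam)
```

Dropout is applied only while training; at prediction time all neurons are used. The L2 term grows with the squared weights, so large weights are kept only when they reduce the data term enough to pay for it.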
The constructed model was trained on the ligand-receptor complexes of the training set
(described above). In evaluation, it achieved much better accuracy (measured as the
correlation between the experimental and predicted affinity values) than all 20 commonly
used scoring functions. In this evaluation the best of these functions achieved a Pearson
correlation coefficient of 0.6, while our neural network obtained R = 0.7 for the 2013 core set
and R = 0.78 for the 2016 core set [3]. Our research thus confirmed the hypothesis that using a
larger amount of structural data is possible and increases the efficiency of the predictive model.
In addition, it seems that the most important features necessary to predict affinity can be found
in the structural data. In other words, the structure of the ligand-receptor complex, assuming its
correct conformation, carries enough information that the affinity prediction task can be
solved with a sufficiently good approximation.
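The accuracy measure used above, the Pearson correlation coefficient between experimental and predicted affinities, can be computed as in the following minimal sketch. The function name and the input lists are hypothetical; no values from the study are reproduced here.

```python
from math import sqrt


def pearson_r(measured, predicted):
    """Pearson correlation coefficient between two equally long sequences."""
    if len(measured) != len(predicted) or len(measured) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(measured)
    mean_x = sum(measured) / n
    mean_y = sum(predicted) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(measured, predicted))
    var_x = sum((x - mean_x) ** 2 for x in measured)
    var_y = sum((y - mean_y) ** 2 for y in predicted)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)
```

A value of 1 means that predictions rise perfectly linearly with the measured affinities; the coefficient measures linear association and is insensitive to a constant offset or scaling of the predictions.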
An important goal of our research was also to understand how the model selects the features
that it uses to predict affinity, how it distinguishes signal from noise, and how stable the
results are. In the case of neural networks, this is not an easy task. Here we examined
the weights associated with the individual atomic features that the network
analyzed. Their range indirectly shows the impact a given feature had on the model; if the
weights differ significantly from the initial "0", the feature must carry information relevant
to the model and the prediction being returned. The atomic feature with the widest range was
the one that distinguishes the receptor from the ligand. This result indicates that the binding
affinity depends on the relationship between the two molecules and that their recognition by
the network is crucial. In addition, the weights for selenium and boron atom types (Se and B,
respectively) have changed only slightly and are close to zero. This result can be interpreted
in two ways: either the network found other features of ligand-receptor complexes, more
important for binding affinity, or because of the rare occurrence of these types of atoms in
ligands, the network was unable to find general patterns for their effect on binding affinity.
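The idea of ranking input features by the range of the weights attached to them can be sketched as follows. The function name and the data layout (one flat list of first-layer weights per input feature) are assumptions of this illustration; extracting such lists from a trained network is not shown.

```python
def feature_weight_ranges(weights_by_feature):
    """Rank features by the spread (max - min) of their associated weights.

    `weights_by_feature` maps a feature name to a flat list of all weights
    connected to that input feature (hypothetical layout).
    """
    ranges = {name: max(w) - min(w) for name, w in weights_by_feature.items()}
    return sorted(ranges.items(), key=lambda item: item[1], reverse=True)


# Made-up numbers for illustration only, not results from the model.
example = {
    "ligand/receptor flag": [-0.9, 0.1, 0.8],
    "C": [-0.3, 0.0, 0.4],
    "Se": [-0.01, 0.0, 0.02],
}
ranking = feature_weight_ranges(example)
```

A feature whose weights stay close to their initial values ends up at the bottom of the ranking, which corresponds to the two possible interpretations given above for the Se and B atom types.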
To inspect more closely how the network uses input data, we analyzed the impact of missing data on
prediction accuracy. For this purpose, we chose the complex of PDE10A with a
benzimidazole inhibitor (PDB ID: 3WS8, ligand ID: X4C). We then generated
343 "damaged" complexes with missing data. The missing data was produced by removing a
5Å cube from the original data and systematically moving it with a 3Å step in all directions.
Next, we rotated the complex by 180° around the X axis and performed the same procedure,
obtaining another 343 damaged inputs. For each of the two orientations, we analyzed the 15
damaged inputs that had the largest negative impact on predicted affinity, to determine which
missing atoms of the complex caused the largest decreases in predictions. For both
orientations, the same region, containing the ligand and its closest neighborhood, was
identified. It contains the amino acids involved in interactions with the ligand, i.e. Gln726
and Tyr693, which form hydrogen bonds with the ligand, Phe729, which forms a π-π interaction,
and Met713, which forms hydrophobic contacts. The methodology presented above can be
applied to other complexes to identify the specific ligand-receptor interactions with the strongest
effect on the prediction.
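The missing-data experiment can be sketched as below. The sketch assumes a grid of 21 points per axis (a 20Å box at 1Å resolution with both faces included), atoms given as integer grid indices, and a hypothetical `predict` callable standing in for the trained network. Under these assumptions a 5-point cube moved in steps of 3 has 7 start positions per axis, i.e. 7 × 7 × 7 = 343 damaged inputs, which matches the number quoted above.

```python
from itertools import product


def occlusion_windows(grid_size=21, box=5, step=3):
    """Yield ((x0, x1), (y0, y1), (z0, z1)) index ranges of every removed cube."""
    starts = range(0, grid_size, step)  # 0, 3, ..., 18 -> 7 positions per axis
    for corner in product(starts, repeat=3):
        yield tuple((s, min(s + box, grid_size)) for s in corner)


def occlude(atoms, window):
    """Return the atoms whose grid indices lie outside the removed cube."""
    def inside(atom):
        return all(lo <= i < hi for i, (lo, hi) in zip(atom, window))
    return [atom for atom in atoms if not inside(atom)]


def most_damaging(atoms, predict, top=15):
    """Rank removed cubes by the change they cause in the prediction."""
    baseline = predict(atoms)
    changes = [(predict(occlude(atoms, w)) - baseline, w)
               for w in occlusion_windows()]
    changes.sort(key=lambda item: item[0])  # most negative change first
    return changes[:top]
```

The cubes returned by `most_damaging` point to the regions of the input whose removal lowers the predicted affinity the most; repeating the procedure on a rotated copy of the complex, as described above, is a check that the result does not depend on the orientation.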
Overall, our model is able not only to distinguish active compounds from inactive ones but,
importantly, also provides affinity values. It can therefore be useful in many applications,
including virtual screening. One of our reviewers even stated that "I would like to praise the
authors for the great work they should be proud of, which will have a significant benefit for
the wider community and perhaps will initiate a new revolution in scoring functions".
The source code and software are available as a git repository at:
http://gitlab.com/cheminfIBB/pafnucy.
Figure 4. The use of a deep convolutional neural network [3] to predict the affinity of ligand-receptor complexes. Top panel: Pearson's correlation (Rp) for two sets of data (core set 2013 and core set 2016). Bottom left panel: a graphical representation of the weights for individual atomic features, indicating which features were important for the model. Bottom right panel: example of the prediction for the complex of the PDE10 protein with a benzimidazole inhibitor (PDB ID: 3WS8, ligand ID: X4C). By analyzing which deleted data fragments were responsible for reduced prediction efficiency, we recreated the pattern of binding.
DeCAF
Looking for new solutions for predicting ligand-receptor affinity, I also explored
methods in which the structure of the small molecule (ligand) is the sole source of
information analyzed, while the receptor structure is not considered. Such a
solution has a fundamental advantage: it is not necessary to generate a conformation of the
ligand-receptor complex [51]. In such methods, however, the problem lies in accounting for
the different possible spatial conformations of ligands [52]. These conformations can significantly
affect ligand properties that are important for potential receptor binding. Often a small difference
between conformations of the same compound leads to very different comparison results [53].
Nevertheless, I postulated that including spatial features in the ligand representation should
increase the predictive power of the newly developed ligand-based methodology.
Generating a large number of ligand conformations and comparing them results in a
significant increase in computation time. To solve this problem, we developed our own
extended representation of the molecule, which is less complex than the full 3D model. The
proposed solution takes into account the spatial distribution of features and is based on the
relative distances between individual ligand atoms. The ligand is
described as a graph in which the edge lengths between nodes are equal to the number of
bonds separating the corresponding atoms. The atoms themselves are replaced by
pharmacophore points, which makes it possible to assign "properties" to nodes (e.g. hydrogen
donor / acceptor). The use of the graph made it possible to bypass the generation of conformations
and enabled quick and efficient comparison of compounds. An additional element enriching
the representation was the introduction of weights for pharmacophore features. These weights
correspond to the frequency of a given element in the compared molecules from which the
pharmacophore was created; they can also be modified manually, thus introducing additional
information into the model [21].
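The graph representation described above can be illustrated with a minimal sketch in which pharmacophore points are connected by edges whose length is the number of bonds on the shortest path between the corresponding atoms. The function names, the input layout and the choice to connect every pair of points are simplifications made for this illustration; they are not taken from the DeCAF source code, and feature weights are omitted.

```python
from collections import deque


def bond_distances(bonds, source):
    """Breadth-first search: number of bonds from `source` to each reachable atom."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        atom = queue.popleft()
        for neighbor in bonds[atom]:
            if neighbor not in dist:
                dist[neighbor] = dist[atom] + 1
                queue.append(neighbor)
    return dist


def pharmacophore_graph(bonds, features):
    """Build (nodes, edges) from a molecular graph.

    `bonds` maps an atom index to the indices of its bonded neighbors;
    `features` maps an atom index to its pharmacophore labels. Atoms without
    labels do not become nodes but still contribute to path lengths.
    """
    nodes = {atom: set(labels) for atom, labels in features.items() if labels}
    edges = {}
    for a in nodes:
        dist = bond_distances(bonds, a)
        for b in nodes:
            if a < b and b in dist:
                edges[(a, b)] = dist[b]
    return nodes, edges


# 2-aminoethanol heavy atoms: N(0)-C(1)-C(2)-O(3)
bonds = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
features = {0: {"donor"}, 3: {"donor", "acceptor"}}
nodes, edges = pharmacophore_graph(bonds, features)
```

Because bond counts do not change when the molecule changes conformation, two such graphs can be compared without generating conformers, which is the property the text relies on.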
Our newly developed representation makes it possible to compare ligands by aligning them
efficiently and finding common substructures. It therefore offers a measure of molecular
similarity based on physico-chemical and spatial characteristics. It can be used to search for
molecules similar to a given ligand or to generate a more complex model describing a whole
group of active molecules. Thanks to these properties, our method, called DeCAF
(Discrimination, Comparison, Alignment Tool for 2D PHarmacophores), can be used to
predict the activity of new small molecule compounds in screening.
We tested the methodology in several different ways on two sets of data: 1) a set developed
by [54], consisting of 88 protein targets (receptors), for comparison with currently used 2D
methods (so-called fingerprints), and 2) a set of 73 receptors recreated from [55]. This set
was used for comparison with a much more sophisticated method, SEA (Similarity
Ensemble Approach) [55], which reduces the number of false positive results. Finally, we
compared our solution with 3) USRCAT, one of the leading fully 3D methods using shape
recognition.
Our experiments showed that overall our method is neither significantly better nor significantly
worse than the 14 tested types of fingerprints or the SEA methodology. However, its
advantage is revealed in early enrichment (enrichment factor, EF). In the upper range of the
ranking list, our solution provides a higher number of true positive results with high confidence
scores, while returning a smaller number of false positive predictions with high scores. Such a
combination is not available for any of the fingerprints tested [21].
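For readers unfamiliar with the measure, the enrichment factor at a given fraction of the ranked list can be computed as in the sketch below. It uses the common definition (hit rate in the top fraction divided by the hit rate in the whole set); the function name, the rounding of the cut-off and the convention that higher scores are better are assumptions of this illustration.

```python
def enrichment_factor(scores, labels, fraction=0.01):
    """EF at `fraction`: hit rate in the top of the ranking / overall hit rate.

    `scores` are predicted scores (higher = more likely active) and `labels`
    are 1 for active and 0 for inactive compounds.
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and equally long")
    total_actives = sum(labels)
    if total_actives == 0:
        raise ValueError("EF is undefined without any active compound")
    n = len(scores)
    n_top = max(1, int(round(n * fraction)))
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits_top = sum(label for _, label in ranked[:n_top])
    return (hits_top / n_top) / (total_actives / n)
```

For example, if 5 of the 10 actives in a library of 100 compounds are ranked within the top 10, the hit rate in the top 10% is 0.5 against an overall hit rate of 0.1, so EF at 10% equals 5.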
Comparisons with USRCAT also gave interesting results. We chose USRCAT because it is
considered an accurate and effective algorithm for molecule comparison. Its only
time-consuming stage is the generation of conformers. Our results showed that the
effectiveness of DeCAF was comparable to or better than that of USRCAT. In addition, the lack
of a conformer generation stage in our method allowed DeCAF to be applied to much larger
datasets.
In conclusion, DeCAF has shown that it is possible to create a fast and effective tool for assessing
the activity of chemical molecules using ligands as the sole source of information. By
including spatial information in the ligand representation we were able to create a method that
shows advantages especially in screening campaigns (EF). The developed method has many
more potential applications related to computer-aided drug design. The software can be downloaded
from the repository: https://bitbucket.org/marta-sd/decaf/
Figure 5. Construction of the DeCAF model and the comparison results obtained. Top panel: a schematic representation of the method with pharmacophore features and atomic distances. Bottom left panel: comparison of the DeCAF model with the SEA method on the same set of 35 receptors. Bottom right panel: a detailed comparison of the DeCAF model and various 2D methods. A more detailed description of the methodology and results can be found in [21].
PLEC FP
Continuing my search for novel ways to represent the protein-ligand complex, with the aim
of relaxing the strict nature of the Cartesian coordinates currently used, I explored the field of
interaction fingerprints (IFP). Fingerprints (FPs) are one of the key concepts in
cheminformatics, allowing for effective representation of molecules with fixed-length vectors
of booleans or integers. FPs can be used to represent intermolecular interactions as well.
Some notable examples include SiFTs (Structural Interaction Fingerprints [56]), PyPLIFs
(Protein–Ligand Interaction Fingerprints [27]) and the more advanced SPLIF (Structural
Protein–Ligand Interaction Fingerprint [26]), which all use explicitly defined, well-known
interaction types such as hydrogen bonds, halogen bonds or π stacking. There are also
variants of IFPs that group interactions by residue type, e.g. SILIRID (Simple
Ligand–Receptor Interaction Descriptor [28]).
From my previous work, especially with RF-Score-VS and the deep learning experiments, I
concluded that it is not necessary to explicitly define interactions and apply them to
ligand-receptor complex representations. This is especially true for noisy
data, in my case high throughput screening results [2], or when using large sets of structural
data obtained in experiments of varying accuracy, physico-chemical conditions and
methodology [3]. For affinity prediction it was enough to provide a simple representation of
3D information in which the interactions are not defined explicitly but emerge implicitly
as a result of statistical learning. Here, I tried to merge this idea with the IFP concept to
provide a simple, unified way to describe the protein-ligand complex, rich enough to
implicitly encode ligand-receptor interactions.
Our approach (called PLEC FP, Protein Ligand Extended Connectivity Fingerprint [23])
builds upon the ECFP fingerprint presented by [57] and uses atom surroundings (i.e.
environments) rather than defined functional groups or substructure patterns. In contrast to
ECFP, however, only atoms in contact with the other molecule are used in our approach. PLEC
FP stores environments of atoms from both interacting entities, which are encoded into a
fingerprint representation that is highly efficient to process and compare.
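The general idea of hashing paired atom environments of contacting atoms can be illustrated with the toy sketch below. It is a strong simplification written for this text and is not the ODDT implementation: the `Molecule` container, the CRC32-based hashing, the way layers are paired and all parameter values (`depth_ligand`, `depth_protein`, `cutoff`, `size`) are assumptions chosen for illustration.

```python
import zlib
from math import dist
from typing import List, NamedTuple, Set, Tuple


class Molecule(NamedTuple):
    types: List[str]                           # one label per heavy atom
    bonds: List[List[int]]                     # bonds[i] = atoms bonded to atom i
    coords: List[Tuple[float, float, float]]   # positions in Å


def stable_hash(*parts) -> int:
    """Deterministic hash (Python's built-in hash of str varies between runs)."""
    return zlib.crc32(repr(parts).encode("utf-8"))


def environments(mol: Molecule, depth: int) -> List[List[int]]:
    """layers[d][i] identifies atom i together with its neighbors up to d bonds."""
    layers = [[stable_hash(t) for t in mol.types]]
    for _ in range(depth):
        prev = layers[-1]
        layers.append([
            stable_hash(prev[i], tuple(sorted(prev[j] for j in mol.bonds[i])))
            for i in range(len(mol.types))
        ])
    return layers


def contact_fingerprint(ligand: Molecule, protein: Molecule, depth_ligand: int = 1,
                        depth_protein: int = 2, cutoff: float = 4.5,
                        size: int = 1024) -> Set[int]:
    """Indices of the bits switched on by ligand-protein atom pairs in contact."""
    lig_env = environments(ligand, depth_ligand)
    prot_env = environments(protein, depth_protein)
    on_bits = set()
    for i, a in enumerate(ligand.coords):
        for j, b in enumerate(protein.coords):
            if dist(a, b) > cutoff:
                continue  # only atoms in contact contribute
            for depth in range(max(depth_ligand, depth_protein) + 1):
                lig_id = lig_env[min(depth, depth_ligand)][i]
                prot_id = prot_env[min(depth, depth_protein)][j]
                on_bits.add(stable_hash(lig_id, prot_id) % size)
    return on_bits
```

No interaction type is defined anywhere in the sketch: a bit only records that a particular ligand environment was found close to a particular receptor environment, and it is left to the statistical model to learn which of these co-occurrences matter for affinity.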
To assess its strengths and weaknesses, we tested its performance in binding affinity
prediction. We used the PDBbind general set for training and the core sets v.2013 and v.2016 for
testing [46]. Additionally, the CASF-2013 benchmark [15] was used to compare our results with 20
currently used scoring functions. Three types of machine learning models were trained using
PLEC FP to predict ligand-receptor affinities on these datasets: 1) linear regression, 2)
random forest, and 3) a dense neural network with three hidden layers.
An important conclusion from the results obtained is that the performance of the three different
models trained on the PLEC FP is quite similar. This consistent predictive power across all the
models (i.e. the stability of the results) is most probably due to important global features being
encoded in the PLEC representation. Although a slight performance gain is possible by switching
from a linear model to a more complex one such as a random forest or a neural network, linear
regression is preferred due to its simplicity. Also, its coefficients can be directly interpreted,
highlighting the impact of a given feature on the ligand affinity prediction. Importantly, each
bit in the FP can be traced to the parent substructure, which expands the range of possible
applications of the PLEC FP.
The results obtained by comparing the predictive power of models trained on the PLEC FP
representation were highly promising. The PLEC linear model tested on the v.2016 core set
achieved Rp = 0.817. The PLEC neural network scoring function did equally well, with Rp = 0.817.
On the v.2013 core set, the PLEC linear model and the neural network scored
Rp = 0.771 and Rp = 0.764, respectively. The linear model also slightly outperformed the
latest, best ML scoring function, RF-Score v3 (Rp = 0.803), while providing a much simpler
and easier to interpret result. Additionally, results obtained from the CASF-2013 benchmark
showed that the PLEC linear model outperformed all 20 scoring functions, with
a significant improvement over the best of them, X-Score (Rp = 0.614 vs Rp = 0.757 for
PLEC FP). To the best of our knowledge, the linear model trained on the PLEC FP
representation is the best model tested on those datasets published to date, in addition to
being the least complex one.
When compared with other methods used to represent receptor-ligand complexes, our solution
also performed consistently well. Again, this consistency is most likely a
consequence of the descriptive power of the features; even the simplest linear model built on PLEC
outperforms the best performing ML models trained with other interaction fingerprints. On the
v.2016 core set, the SILIRID-based linear model scored Rp = 0.36, while the neural network achieved
Rp = 0.52. SPLIF performed much better, yielding Rp = 0.78 for both the linear and the
neural network models.
In conclusion, we showed that PLEC FP is accurate and does exceptionally well even with a
simple linear regression model. The coefficients of the linear equation can show the impact of a
given contact on the predicted ligand affinity. Although a number of attempts have been
made to develop versatile interaction fingerprints, we still lack a general, descriptive and
easily interpretable solution. I believe our results allow us to present PLEC FP as a candidate
for this task. The PLEC fingerprint is implemented in ODDT, the Open Drug Discovery
Toolkit, which is free to use and available on GitHub (https://github.com/oddt/oddt).
Additionally, PLEC FP and other functionalities implemented in ODDT can be easily tested
in a web browser using MyBinder, see https://github.com/oddt/notebooks.
Figure 6. Protein Ligand Extended Connectivity Fingerprint (PLEC FP). Top left panel: construction of the PLEC fingerprint. Depicted is the schematic 2D representation of a 3D complex. Atoms in close contact (green) are identified, followed by the subsequent generation and hashing of corresponding layers on the ligand and the protein side. Top right panel: detailed view of the prediction accuracy for the two testing sets (v2013 and v2016 core sets from PDBbind). Each dot represents a prediction for a single ligand–receptor complex. Bottom panel: a single Pearson correlation coefficient value (Rp) for each model built on the PDBbind v2016 core set using the SILIRID, SPLIF and PLEC fingerprints. SILIRID is an explicit interaction fingerprint and has a fixed size.
HTSAS
In addition to predicting the affinity of small-molecule compounds to protein targets, I also
looked for methods for finding new, biologically interesting molecular targets
(receptors). I focused on mining the scientific literature, in
particular on the automatic functional annotation of proteins. The result of
these interests was published in two papers, in BMC Bioinformatics [58] (I am one of the
first two authors) and in Bioinformatics [59] (second co-author). These works allowed me to
develop my statistical and programming skills, primarily in extracting signals from
noisy datasets. Thanks to the results obtained I became interested in many molecular targets. I
established cooperation with a number of laboratories, which resulted in the work that I
discuss in the chapter "other scientific achievements".
Conclusions
Predicting the affinity of low molecular weight compounds to protein targets
(receptors) is a complex and multifaceted problem that many researchers are working on.
There is a belief that structural data do not provide enough information to effectively
solve this problem. To some extent this is true; it is clear that data obtained by X-ray
crystallography, NMR, CryoEM or in silico modeling do not describe many important
features, e.g. ADME properties (absorption, distribution, metabolism, excretion).
However, my experiments and published results indicate that structural data contain much
more information than is normally analyzed. The solutions, methodologies and experiments
proposed by me and my co-workers show new ways in which ligand-receptor complexes can
be used.
Abstract
In silico techniques have become one of the fastest growing areas of the broadly understood
drug design process [60–67]. In my research, I tried to develop my own methods and
solutions related to the prediction of low molecular weight compound activity that can be
used in the in silico screening process. I created solutions related to the management and
analysis of chemical information on ligand-receptor complexes [22] and a set of statistical
tools and methods allowing, among others, development of advanced predictive models [29].
Based on these results, I developed a new way of using structural data from screening
experiments, thanks to which I significantly improved the efficiency of affinity predictions
[2]. I also developed a new method of describing the complex, which, combined with a deep,
convolutional neural network, allowed very accurate prediction of the activity of small
molecule compounds, returning its value - which is the first published example of such a
function, efficient optimization, and multithreading. J Comput Chem. 2010;31: 455–461.
14. Brozell SR, Mukherjee S, Balius TE, Roe DR, Case DA, Rizzo RC. Evaluation of DOCK 6 as a pose generation and database enrichment tool. J Comput Aided Mol Des. 2012;26: 749–773.
15. Li Y, Han L, Liu Z, Wang R. Comparative assessment of scoring functions on an updated benchmark: 2. Evaluation methods and general results. J Chem Inf Model. 2014;54: 1717–1736.
16. Gilson MK, Given JA, Bush BL, McCammon JA. The statistical-thermodynamic basis for computation of binding affinities: a critical review. Biophys J. 1997;72: 1047–1069.
17. Huang S-Y, Grinter SZ, Zou X. Scoring functions and their evaluation methods for protein-ligand docking: recent advances and future directions. Phys Chem Chem Phys. 2010;12: 12899–12908.
18. Waszkowycz B, Clark DE, Gancia E. Outstanding challenges in protein-ligand docking and structure-based virtual screening. WIREs Comput Mol Sci. 2011;1: 229–259.
19. Voth AR, Khuu P, Oishi K, Ho PS. Halogen bonds as orthogonal molecular interactions to hydrogen bonds. Nat Chem. 2009;1: 74–79.
20. Xu W, Lucke AJ, Fairlie DP. Comparing sixteen scoring functions for predicting biological activities of ligands for protein targets. J Mol Graph Model. 2015;57: 76–88.
21. Stepniewska-Dziubinska MM, Zielenkiewicz P, Siedlecki P. DeCAF-Discrimination, Comparison, Alignment Tool for 2D PHarmacophores. Molecules. 2017;22. doi:10.3390/molecules22071128
22. Wójcikowski M, Zielenkiewicz P, Siedlecki P. DiSCuS: an open platform for (not only) virtual screening results management. J Chem Inf Model. 2014;54: 347–354.
23. Wójcikowski M, Kukiełka M, Stepniewska-Dziubinska M, Siedlecki P. Development of a Protein-Ligand Extended Connectivity (PLEC) Fingerprint and Its Application for Binding Affinity Predictions. 2018; doi:10.26434/chemrxiv.5928406.v1
24. Metz CE. Basic principles of ROC analysis. Semin Nucl Med. 1978;8: 283–298.
25. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics. 1988;44: 837–845.
26. Da C, Kireev D. Structural protein-ligand interaction fingerprints (SPLIF) for structure-based virtual screening: method and benchmark study. J Chem Inf Model. 2014;54: 2555–2561.
27. Radifar M, Yuniarti N, Istyastono EP. PyPLIF: Python-based Protein-Ligand Interaction Fingerprinting. Bioinformation. 2013;9: 325–328.
28. Chupakhin V, Marcou G, Gaspar H, Varnek A. Simple Ligand-Receptor Interaction Descriptor (SILIRID) for alignment-free binding site comparison. Comput Struct Biotechnol J. 2014;10: 33–37.
29. Wójcikowski M, Zielenkiewicz P, Siedlecki P. Open Drug Discovery Toolkit (ODDT): a new open-source player in the drug discovery field. J Cheminform. 2015;7: 26.
30. Durrant JD, McCammon JA. NNScore 2.0: a neural-network receptor-ligand scoring function. J Chem Inf Model. 2011;51: 2897–2903.
31. Ballester PJ, Mitchell JBO. A machine learning approach to predicting protein-ligand binding affinity with applications to molecular docking. Bioinformatics. 2010;26: 1169–1175.
32. Pedregosa F, Varoquaux G, Gramfort A, Michel V, Thirion B, Grisel O, et al. Scikit-learn: Machine Learning in Python. J Mach Learn Res. 2011;12: 2825–2830.
33. Wojciechowski M. Feed-forward neural network for python. Technical University of Lodz (Poland), Department of Civil Engineering, Architecture and Environmental Engineering. 2011. Available: http://ffnet.sourceforge.net
34. Zilian D, Sotriffer CA. SFCscore(RF): a random forest-based scoring function for improved affinity prediction of protein-ligand complexes. J Chem Inf Model. 2013;53: 1923–1933.
35. Varnek A, Baskin I. Machine learning methods for property prediction in chemoinformatics: Quo Vadis? J Chem Inf Model. 2012;52: 1413–1437.
36. Cruz-Monteagudo M, Medina-Franco JL, Perera-Sardiña Y, Borges F, Tejera E, Paz-Y-Miño C, et al. Probing the Hypothesis of SAR Continuity Restoration by the Removal of Activity Cliffs Generators in QSAR. Curr Pharm Des. 2016;22: 5043–5056.
37. Bender A, Glen RC. A Discussion of Measures of Enrichment in Virtual Screening: Comparing the Information Content of Descriptors with Increasing Levels of Sophistication. J Chem Inf Model. 2005;45: 1369–1375.
38. Truchon J-F, Bayly CI. Evaluating virtual screening methods: good and bad metrics for the “early recognition” problem. J Chem Inf Model. 2007;47: 488–508.
39. Empereur-Mot C, Guillemain H, Latouche A, Zagury J-F, Viallon V, Montes M. Predictiveness curves in virtual screening. J Cheminform. 2015;7: 52.
40. Tetko IV, Livingstone DJ, Luik AI. Neural network studies. 1. Comparison of overfitting and overtraining. J Chem Inf Comput Sci. 1995;35: 826–833.
41. Mysinger MM, Carchia M, Irwin JJ, Shoichet BK. Directory of useful decoys, enhanced (DUD-E): better ligands and decoys for better benchmarking. J Med Chem. 2012;55: 6582–6594.
42. Ho TK. The random subspace method for constructing decision forests. IEEE Trans Pattern Anal Mach Intell. 1998;20: 832–844.
43. Breiman L. Random Forests. Mach Learn. 2001;45: 5–32.
44. Chawla NV. Data Mining for Imbalanced Datasets: An Overview. In: Maimon O, Rokach L, editors. Data Mining and Knowledge Discovery Handbook. Boston, MA: Springer US; 2005. pp. 853–867.
45. Wang R, Fang X, Lu Y, Yang C-Y, Wang S. The PDBbind database: methodologies and updates. J Med Chem. 2005;48: 4111–4119.
46. Liu Z, Su M, Han L, Liu J, Yang Q, Li Y, et al. Forging the Basis for Developing Protein-Ligand Interaction Scoring Functions. Acc Chem Res. 2017;50: 302–309.
47. Wallach I, Dzamba M, Heifets A. AtomNet: A Deep Convolutional Neural Network for Bioactivity Prediction in Structure-based Drug Discovery [Internet]. arXiv [cs.LG]. 2015. Available: http://arxiv.org/abs/1510.02855
48. Ragoza M, Hochuli J, Idrobo E, Sunseri J, Koes DR. Protein-Ligand Scoring with Convolutional Neural Networks. J Chem Inf Model. 2017;57: 942–957.
49. Hinton GE, Srivastava N, Krizhevsky A, Sutskever I, Salakhutdinov RR. Improving neural networks by preventing co-adaptation of feature detectors [Internet]. arXiv [cs.NE]. 2012. Available: http://arxiv.org/abs/1207.0580
50. Abadi M, Agarwal A, Barham P, Brevdo E, Chen Z, Citro C, et al. TensorFlow: Large-Scale Machine