A deep learning algorithm to translate and classify ... · 9/28/2020 · We showed that a deep...
A deep learning algorithm to translate and classify cardiac
electrophysiology: From iPSC-CMs to adult cardiac cells
Parya Aghasafari PhD1, Pei-Chi Yang PhD1, Divya C. Kernik PhD2, Kauho Sakamoto PhD4, Yasunari
Kanda PhD5, Junko Kurokawa PhD4, Igor Vorobyov PhD1,6 and Colleen E. Clancy PhD1,3
1 Department of Physiology and Membrane Biology
University of California Davis, Davis, CA
2 Washington University in St. Louis
4Department of Bio-Informational Pharmacology
School of Pharmaceutical Sciences
University of Shizuoka
Shizuoka, Japan
5Division of Pharmacology
National Institute of Health Sciences
Kanagawa, Japan
6Department of Pharmacology
University of California, Davis, CA
3 Correspondence: Colleen E. Clancy, Ph.D.
University of California, Davis
Tupper Hall, RM 4303
Davis, CA 95616-8636
Email: [email protected]
Phone: 530-754-0254
Word count:
Subject code: Artificial Intelligence, Machine Learning, Deep Learning, Computational Biology,
Arrhythmias, Pharmacology.
Short title: Translation of cardiac action potential from immature to mature
bioRxiv preprint doi: https://doi.org/10.1101/2020.09.28.317461; this version posted September 29, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder. All rights reserved. No reuse allowed without permission.
Abstract
Exciting developments in both in vitro and in silico technologies have led to new ways to identify
patient-specific cardiac mechanisms. The development of induced pluripotent stem cell-derived
cardiomyocytes (iPSC-CMs) has been a critical in vitro advance in the study of patient-specific
physiology, pathophysiology and response to drugs. However, the iPSC-CM methodology is limited
by the low throughput and high variability of resulting electrophysiological measurements.
Moreover, the iPSC-CMs generate immature action potentials, and it is not clear if observations in
the iPSC-CM model system can be confidently interpreted to reflect impact in human adults.
There has been no demonstrated method to allow reliable translation of results from the iPSC-CM
to a mature adult cardiac response. Here, we demonstrate a new computational approach
intended to address the current shortcomings of the iPSC-CM platform by developing and
deploying a multitask network that was trained and tested using simulated data and then applied
to experimental data. We showed that a deep learning network can be applied to classify cells
into the drugged and drug free categories and can be used to predict the impact of
electrophysiological perturbation across the continuum of aging from the immature iPSC-CM
action potential to the adult ventricular myocyte action potential. We validated the output of the
model with experimental data. The method can be applied broadly, not only across a spectrum of
aging but also to translate data between species.
Introduction
The development of novel technologies has resulted in new ways to study cardiac function and
rhythm disorders [1]. One such technology is the induced pluripotent stem cell-derived
cardiomyocyte (iPSC-CM) in vitro model system [2]. The iPSC-CM system constitutes a powerful
in vitro tool for preclinical assessment of cardiac electrophysiological impact and drug safety
liabilities in a human physiological context [3-8]. Moreover, because iPSC-CMs can be cultured
from patient-specific cells, they are one of the new model systems for patient-based medicine.
While in vitro iPSC-CM utilization allows for observation of a variety of responses to drugs and
other perturbations [9-12], there is still a major inherent limitation: the complex differentiation
process used to create iPSC-CMs results in a model of cardiac electrical behavior that is relatively
immature, resembling fetal cardiomyocytes. Hallmarks of the immature phenotype include
spontaneous beating, immature calcium handling, presence of developmental currents, and
significant differences in the relative contributions of repolarizing potassium currents compared
to adult ventricular myocytes [13-15]. The profound differences between the immature iPSC-CM
and the adult cardiac myocyte have led to persistent questions about the utility of the iPSC-CM
action potential to predict relevant impact on adult human electrophysiology [16, 17]. In this
study, we set out to take a first step to bridge the gap.
Here we describe a new way to use data from patient-derived iPSC-CMs to inform the
development of computational models and the application of machine learning techniques
that predict, classify and translate changes observed in the cardiac activity of in vitro
iPSC-CMs into their expected effects on adult cardiomyocytes (adult-CMs). The iPSC-CM and
adult-CM action potential (AP) populations were paced with physiological noise and subjected to
0-50% IKr block to generate a robust simulated data set to train and test the deep learning algorithm. We
then showed how the algorithm can be applied even to scarce experimental data, which were also used to
validate the model.
Deep learning techniques are increasingly used to advance personalized medicine [18]. Long
short-term memory (LSTM) based networks are capable of learning order dependence in
sequence prediction problems [19] and have been widely used for cardiac monitoring purposes.
They have been used to extract important biomarkers from raw ECG signals [20-22] and help
clinicians to accurately detect common heart failure features in ECG screenings [20, 23-27]. LSTM
classifiers have also been employed to automatically classify arrhythmias using ECG features [28-
32]. In addition, modeling and simulation, and multi-task networks have been widely implemented
to identify hERG blockers from sets of small molecules in drug discovery and screening [33-38].
Here, we developed and applied a multitask network leveraging LSTM architecture to classify and
translate observations from cardiac activity of in vitro iPSC-CMs to predict corresponding adult
ventricular myocyte cardiac activity. We show that developments in iPSC-CM experimental
technology and in cardiac electrophysiological modeling and simulation of iPSC-CMs
can be leveraged in a new machine learning model to interrogate disease and drug response
in cardiac myocytes from immaturity to maturation.
Results
In this study, we set out to develop and apply a deep learning multitask network that would
perform two distinct tasks: A) The first task is cellular level action potential classification, which
distinguishes drug free action potential waveforms from action potential waveforms
following application of IKr block (drug-induced hERG channel block). B) The second task is cellular
level action potential translation, which converts immature action potential waveforms into predicted
adult waveforms. We first simulated a population of 542 in silico iPSC-CMs and O’Hara-Rudy in
silico human adult ventricular action potentials (adult-CMs). An average cell AP from the
population is shown in Figure 1A for iPSC-CMs and Figure 1B for adult-CMs. The ionic currents
underlying the in silico iPSC-CM APs and adult-CMs APs are markedly different as shown in panels
C and D, respectively. In Figure 1E, the population of iPSC-CM APs is shown and the adult-CM AP
population is in Figure 1F, generated by applying physiological noise as we have described
previously [39-41]. In Figure 1 panels G and H, the impact of perturbation of the in silico iPSC-CM
APs and adult-CMs APs by 1-50% IKr block is shown. Panel I shows the parameter comparison for
iPSC-CMs and adult-CMs APs without and with perturbation by 1-50% IKr block.
Figure 1. Cellular action potential (AP) and ionic currents for induced pluripotent stem cell-derived cardiomyocytes (iPSC-CMs) and adult-CMs (O’Hara-Rudy human ventricular action
potentials). A - B Comparison of Cellular APs in the baseline model of iPSC-CMs and control case of adult-CMs at cycle length of 982 ms. C. Simulated ionic currents (ICaL, IKr, IKs, Ito, IK1) during
baseline iPSC-CMs AP compared to the simulated currents profiles during control case of adult-CMs AP in D. E. iPSC-CMs APs of spontaneously beating cells (n = 304) were simulated after incorporating physiological noise. F. Adult-CMs APs were simulated with physiological noise
currents at matching cycle lengths of iPSC-CMs in panel E. G. iPSC-CMs APs were simulated with perturbation by 1-50% IKr block after incorporating physiological noise. H. Adult-CMs APs were
simulated with perturbation by 1-50% IKr block and physiological noise currents at matching cycle lengths of iPSC-CMs in panel G. I. Comparison between iPSC-CMs and adult-CMs for upstroke
velocity, maximum diastolic potential (MDP) and action potential durations (APD).
We next developed a multitask deep learning network (Figure 2A), for the independent translation
and classification tasks to be performed. The goal of the translation task (Figure 2B) is to use an
immature cardiac action potential waveform dataset and convert (translate) these data to predict
the resulting effect on mature cardiac action potential waveforms. Figure 2C shows the
classification task that is intended to be used to classify action potential waveforms into drug free
(iPSC-CMs APs with less than a determined threshold of (threshold%) IKr block) and drugged (iPSC-
CMs APs with greater than or equal to a determined threshold of (threshold%) IKr block) categories.
We combined these two tasks to demonstrate a positive relationship between them and
to enable the multitask network to perform both simultaneously while predicting mature cardiac
action potential waveforms with better accuracy. Both tasks are performed by a network comprising two long short-term
memory (LSTM) layers (Figure 2D) followed by independent fully connected layers for each task
(Figure 2E). Notably, the LSTM layers are shared by both tasks (see Methods for details on
the LSTM layers). The features extracted by the LSTM layers are then used as input for the
independent fully connected neural network layers with two hidden layers. The network
outputs both the mature cardiac action potential waveform and the category of the action
potential waveform (drug free or drugged).
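The class definition used by the classification task can be written down directly. The following is a minimal sketch, assuming that each simulated trace carries its IKr block percentage as metadata; the function names and the integer encoding (0 for drug free, 1 for drugged) are illustrative choices, not taken from the study.

```python
from typing import List, Sequence


def label_ap(ikr_block_percent: float, threshold_percent: float) -> int:
    """Return 0 (drug free) below the threshold and 1 (drugged) at or above it.

    The 0/1 encoding is an assumption made for this sketch.
    """
    if not 0.0 <= ikr_block_percent <= 100.0:
        raise ValueError("IKr block must be a percentage between 0 and 100")
    return 1 if ikr_block_percent >= threshold_percent else 0


def label_population(blocks: Sequence[float], threshold_percent: float) -> List[int]:
    """Label every trace in a population from its IKr block percentage."""
    return [label_ap(block, threshold_percent) for block in blocks]


if __name__ == "__main__":
    # Hypothetical block percentages, labeled with a 25% threshold.
    print(label_population([0.0, 10.0, 24.9, 25.0, 50.0], threshold_percent=25.0))
```

With a threshold of 1%, the same rule reproduces the initial split into drug-free (0% block) and drugged (1-50% block) cases.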
We used normalized and labeled in silico iPSC-CM APs and adult-CMs APs without (0% IKr block)
(Figure 1 E-F) and with perturbation by 1-50% IKr block (Figure 1 G-H) as input and output for
training the multitask network (Figure 2A). First, we trained the network considering drug-free
(0% block) and drugged (1-50% IKr block) cases for the classification task. Then, we tested if training the
multitask network with cellular action potential waveforms that had been subject to various IKr
blocking conditions (range 1-50% block) could improve the accuracy of the network for
reconstructing adult-CM APs from iPSC-CM APs. The long short-term memory (LSTM) layer
representation is shown as a schematic in Figure 2D. The computation module and gating
mechanism for each training iteration are also depicted in Figure 2D.
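For readers unfamiliar with the repeating LSTM module, the sketch below implements one time step of the standard LSTM cell (forget, input and output gates plus a candidate state) using only the Python standard library. It is a generic textbook formulation, not the authors' implementation; the parameter layout (`params` mapping gate names to weight matrices and a bias) is an assumption made for illustration.

```python
import math
from typing import Dict, List, Tuple

Vector = List[float]
Matrix = List[List[float]]
GateParams = Tuple[Matrix, Matrix, Vector]  # (input weights, recurrent weights, bias)


def sigmoid(x: float) -> float:
    """Numerically stable logistic function."""
    if x >= 0.0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)


def affine(W: Matrix, U: Matrix, b: Vector, x: Vector, h: Vector) -> Vector:
    """Compute W x + U h + b with plain lists."""
    return [
        sum(w * xi for w, xi in zip(w_row, x))
        + sum(u * hi for u, hi in zip(u_row, h))
        + bias
        for w_row, u_row, bias in zip(W, U, b)
    ]


def lstm_step(
    x: Vector, h_prev: Vector, c_prev: Vector, params: Dict[str, GateParams]
) -> Tuple[Vector, Vector]:
    """Advance a standard LSTM cell by one time step.

    params maps "f", "i", "o", "g" to (W, U, b) for the forget gate,
    input gate, output gate and candidate state respectively.
    """
    f = [sigmoid(v) for v in affine(*params["f"], x, h_prev)]    # forget gate
    i = [sigmoid(v) for v in affine(*params["i"], x, h_prev)]    # input gate
    o = [sigmoid(v) for v in affine(*params["o"], x, h_prev)]    # output gate
    g = [math.tanh(v) for v in affine(*params["g"], x, h_prev)]  # candidate state
    # New cell state: keep part of the old memory, add part of the candidate.
    c = [f_k * c_k + i_k * g_k for f_k, c_k, i_k, g_k in zip(f, c_prev, i, g)]
    # New hidden state: gated view of the squashed cell state.
    h = [o_k * math.tanh(c_k) for o_k, c_k in zip(o, c)]
    return h, c
```

Applying `lstm_step` to each voltage sample of an AP trace in order, and carrying `h` and `c` forward, is what lets the shared layers capture the time dependence of the waveform.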
Figure 2. The multitask network architecture. A. The general overview of the multitask network presented in this study. B. The translation task to reconstruct adult-CMs APs from corresponding
iPSC-CMs APs. C. The classification task is shown with IKr block threshold. D. The repeating module in the implemented LSTM layers in the presented multitask network. E. The architecture
of the implemented fully connected layers in the presented multitask network.
We next explored whether the translation and classification tasks in this study are compatible with
each other, which would allow them to be combined into a single multitask network. The benefits
of combining the tasks are to save computation time and also potentially to improve each task
performance. Therefore, we applied the multitask network to different scenarios in this study as
shown in the workflow schematic in Figure 3. The data generated through the various scenarios
are described in the subsequent figures in the study. We first trained the single task networks
independently and compared the performance for the individual translation (purple, left) and
classification (pink, second from left) tasks to the combined multitask network (yellow in middle).
We applied statistical measures for the translation and classification tasks to evaluate task
alignments. Next, we applied a forward and backward digital filter technique (light blue, second
from right) to iPSC-CMs and adult-CMs APs traces and retrained the multitask network to explore
the effect of noise filtering on the multitask network performance. In the last scenario, we
investigated the importance of accounting for time dependency by removing the LSTM layers
for both tasks (gray, right) and retrained the network with various thresholds for the IKr block
classification task (red block) to determine the optimal threshold for the best performance of the
multitask network.
Figure 3. Workflow schematic describing different scenarios in this study: Exploring the task alignment by comparing the performance for the individual translation (purple, left) and
classification (pink, second from left) tasks to the combined multitask network (yellow in middle);
Inspecting the influence of applying noise-filtering techniques (light blue, second from right) to the iPSC-CMs and adult-CMs APs on the multitask network performance for the translation and classification tasks; Investigating the importance of considering the existing time dependency within simulated iPSC-CMs and adult-CMs APs for training the multitask network (gray, right); Tracking the performance of the multitask network for the translation and classification tasks
with various thresholds for the IKr block (red block) to identify the optimal threshold for the classification task.
From the population of 542 computer model generated iPSC-CM and adult-CM action potentials,
which contained both drug free and drugged cases, we randomly selected 80% of the virtual cell
waveforms to train the multitask network. The
procedure was as follows: The multitask network utilized the iPSC-CM AP training data with
physiological noise as an input to the network. The network was then optimized to best produce
a translation of these data to match model generated adult-CM action potentials that were
produced under the same pacing conditions and degree of IKr block (i.e. the immature and adult
AP traces were generated through model simulations under the same pacing frequency conditions
and extent of IKr reduction for both). These are shown as the “train” data set in Figure 4 A (iPSC-
CMs) and B (adult-CMs training data in red). The optimized network was able to accurately
translate the training data by generating a similar matched set of adult-CM action potentials as
shown in blue in B (adult-CMs translated from iPSC-CMs by the network). The histogram in Figure
4C shows the good agreement between the model generated adult action potentials and iPSC-CM
APs translated to adult-CM APs in terms of the frequency of virtual cells with similar APD.
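The random 80% selection can be sketched as a split over cell indices, so that each iPSC-CM trace stays paired with the adult-CM trace generated under the same pacing and IKr block conditions. The function name, the fixed seed and the rounding rule are assumptions for illustration; the study does not report how the selection was implemented.

```python
import random
from typing import List, Tuple


def split_indices(
    n_cells: int, train_fraction: float = 0.8, seed: int = 0
) -> Tuple[List[int], List[int]]:
    """Randomly split cell indices into a training part and a held-out part.

    Splitting indices (rather than traces) keeps every iPSC-CM input
    matched to its adult-CM target in both subsets.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must lie strictly between 0 and 1")
    indices = list(range(n_cells))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_train = int(round(train_fraction * n_cells))
    return sorted(indices[:n_train]), sorted(indices[n_train:])


if __name__ == "__main__":
    train_idx, held_out_idx = split_indices(542)
    print(len(train_idx), len(held_out_idx))
```

Using `train_idx` to index both the iPSC-CM and adult-CM arrays yields matched input and target sets; the indices that remain are never seen during optimization.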
The remaining 20% of virtual cell waveforms for both immature and mature APs were designated
as the test set (data “unseen” to the network during training) and used to evaluate the
performance of the multitask network as shown in Figure 4D (iPSC-CMs) and 4E for adult-CMs in
red with superimposed translated data in blue. As indicated by the near superimposition of the
histogram distribution in Figure 4 panel F, the translation from iPSC-CMs to adult-CMs AP across
the range of pacing frequencies was successful. Since drug free and drugged data sets are mixed
in the training and test sets, we also plotted the simulated iPSC-CMs AP (black) and adult-CMs AP
(red) for drug free (Figure 4 G-H) and drugged AP waveforms (Figure 4J-K) separately to clarify the
performance of the multitask network for translating iPSC-CMs to adult-CMs AP for categories
considered in the classification task. Superimposed translated adult-CMs AP waveforms are also
illustrated in panels H and K in blue for drug free and drugged categories, respectively. Panels I and
L illustrate the superimposition of the histogram distributions of APD90 values for drug free and
drugged categories.
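Because the histograms compare APD90 values, it may help to see how APD90 can be extracted from a sampled trace. The sketch below assumes one common convention: APD90 is the time from the steepest upstroke to the point where the voltage has recovered 90% of the way from its peak to its minimum. The study does not state which convention was used, so this definition and the function name are assumptions.

```python
from typing import Sequence


def apd90(time_ms: Sequence[float], voltage_mv: Sequence[float]) -> float:
    """Estimate APD90 (ms) for a single action potential trace.

    Assumed convention: start at the interval of maximum dV/dt, stop at
    90% repolarization from peak toward the minimum voltage of the trace.
    """
    n = len(time_ms)
    if n != len(voltage_mv) or n < 3:
        raise ValueError("time and voltage must have equal length of at least 3")
    # Locate the upstroke as the sample interval with the steepest slope.
    slopes = [
        (voltage_mv[k + 1] - voltage_mv[k]) / (time_ms[k + 1] - time_ms[k])
        for k in range(n - 1)
    ]
    k_up = max(range(n - 1), key=slopes.__getitem__)
    k_peak = max(range(n), key=voltage_mv.__getitem__)
    v_peak, v_rest = voltage_mv[k_peak], min(voltage_mv)
    v90 = v_peak - 0.9 * (v_peak - v_rest)  # voltage at 90% repolarization
    # Walk forward from the peak and interpolate the first downward crossing.
    for k in range(k_peak, n - 1):
        v0, v1 = voltage_mv[k], voltage_mv[k + 1]
        if v0 > v90 >= v1:
            frac = (v0 - v90) / (v0 - v1)
            t90 = time_ms[k] + frac * (time_ms[k + 1] - time_ms[k])
            return t90 - time_ms[k_up]
    raise ValueError("trace does not reach 90% repolarization")
```

Applying the same function to simulated and translated adult-CM traces gives the two APD90 populations whose histograms are overlaid.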
To evaluate the network performance for the classification task, we used the area under the receiver operating
characteristic curve (AUROC) [42], recall, precision and F1-score [43] as statistical measures (Table 1). We
compared the values of these measures for both training and test sets and observed that the
network can categorize APs into drug-free and drugged waveforms with approximately 90%
accuracy (Table 1).
Figure 4. The performance of multitask network for translating iPSC-CMs APs into adult-CMs APs. A. The iPSC-CMs APs used for training the multitask network contained a variety of action potential
morphologies without and with IKr block (Train set). B. Comparison between simulated (red) and translated adult-CM APs (blue) in the training set. C. Comparison between the histogram distribution of APD90 values for simulated and translated adult-CM APs in the training set. D. iPSC-CM APs reserved for testing the performance of the multitask network on AP data unseen by the network (Test set). E. Comparison between simulated (red) and translated adult-CM APs (blue) in the test set. F. Comparison between histogram distribution of APD90 values for simulated and translated adult-CM APs in the test set. G. Population of iPSC-CM APs without IKr block. H. Comparison between simulated (red) and translated adult-CM APs (blue) without IKr block. I. Comparison between histogram distribution of APD90 values for simulated and translated adult-CM APs without IKr block. J. Population of iPSC-CMs APs with IKr block. K. Comparison between simulated (red) and translated adult-CM APs (blue) with IKr block. L. Comparison between histogram distribution of APD90 values for simulated and translated adult-CM APs with IKr block.
Table 1. Statistical measures for evaluating the performance of the multitask network for classifying AP traces into ones with and without 1-50% IKr block for both training and test sets.

| Measures | AUROC | Recall | Precision | F1-score |
| --- | --- | --- | --- | --- |
| Training set performance | 0.85 | 0.75 | 0.96 | 0.84 |
| Test set performance | 0.90 | 0.85 | 0.95 | 0.90 |
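The measures in Table 1 follow standard definitions, sketched below with the standard library only. The sketch assumes that the drugged class is the positive class (label 1) and that AUROC is computed from a continuous network score; neither detail is stated in the text, and the function names are illustrative.

```python
from typing import Sequence, Tuple


def precision_recall(y_true: Sequence[int], y_pred: Sequence[int]) -> Tuple[float, float]:
    """Precision and recall for the positive class (assumed to be label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    total = precision + recall
    return 2.0 * precision * recall / total if total else 0.0


def auroc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Probability that a random positive scores above a random negative.

    Ties count as one half (the Mann-Whitney formulation of AUROC).
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

As a consistency check, `f1_score(0.95, 0.85)` rounds to 0.90 and `f1_score(0.96, 0.75)` rounds to 0.84, matching the F1-scores reported for the test and training sets.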
To evaluate the task alignment, we trained two single task networks for the classification and
translation tasks separately and compared the single task network performance with the multitask
network performance (Table 2). We compared the statistical measures for both translation and
classification tasks. Interestingly, the multitask network performed better in drugged
cases (Figure 4L) than in drug-free cases (Figure 4I). As can be seen in the presented
measures, the multitask network has a lower mean-squared-error (MSE) and APD90 calculation error
than the single task network for translating the traces with 1-50% IKr block (Table 2). The implication is that addition of the
classification task can improve the reliability of the network for translating iPSC-CMs APs into
adult-CMs APs.
In contrast to the benefit of the multitask network for improving translation accuracy, no significant
improvement was observed in the statistical measures for the classification task (Table 2). Notably, the
statistical measures are close in value for the single task and multitask networks, which indicates
good task alignment in the multitask network. Therefore, combining the two tasks may indeed
save computation time and also improve the translation task performance (Table 2).
Table 2. Statistical measures for evaluating the performance of the multitask network for both translation and classification tasks to explore task alignment.

Translation

| Networks | MSE | R2_score | APD90 error with IKr block |
| --- | --- | --- | --- |
| Single task (Translation) | 0.0014 | 0.995 | 3.40% |
| Single task (Classification) | - | - | - |
| Multitask | 0.0013 | 0.995 | 3.25% |

Classification

| Networks | AUROC | Recall | Precision | F1-score |
| --- | --- | --- | --- | --- |
| Single task (Translation) | - | - | - | - |
| Single task (Classification) | 0.90 | 0.85 | 0.95 | 0.90 |
| Multitask | 0.90 | 0.85 | 0.95 | 0.90 |
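The translation measures in Table 2 can be computed from a simulated (reference) trace and its translated counterpart as sketched below. These are the standard definitions of MSE and the coefficient of determination; how the study aggregated them across traces, and the exact form of the APD90 error, are not stated, so the per-trace formulation and the relative-error helper are assumptions.

```python
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error between a reference and a predicted trace."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("traces must be non-empty and of equal length")
    return sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - mean_true) ** 2 for a in y_true)
    if ss_tot == 0.0:
        raise ValueError("reference trace is constant; R2 is undefined")
    return 1.0 - ss_res / ss_tot


def percent_error(reference: float, estimate: float) -> float:
    """Absolute relative error in percent, e.g. for APD90 values."""
    if reference == 0.0:
        raise ValueError("reference value must be non-zero")
    return 100.0 * abs(estimate - reference) / abs(reference)
```

For example, a translated APD90 of 206.5 ms against a simulated value of 200 ms gives `percent_error(200.0, 206.5)`, which is 3.25%; these two durations are hypothetical and chosen only to illustrate the scale of the errors in the table.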
Next, in order to demonstrate how the multitask network can be extended to apply even to noisy
experimental data, we applied a digital filter forward-backward technique [44] to iPSC-CM and
adult-CM AP traces without and with perturbation by 1-50% IKr block (Figure 5). Panel A shows
simulated drug-free iPSC-CM with physiological noise in green and after applying a noise filtering
technique in purple. To assess the phase distortion resulting from noise filtering for APD90 values,
we then plotted a histogram distribution of APD90 values in Figure 5B, which shows near
superimposition of the histogram distributions. This indicates that noise filtering does not change
the AP waveform shape and only removes existing vertical noise. Panels C and D show the same
process as described in panels A and B, but for the adult-CM APs (again, with physiological noise
in green and after applying noise filtering techniques in purple in panel C, and histogram
distributions of APD90 in panel D).
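The reason a forward-backward filter leaves APD90 undistorted is that the delay introduced by the forward pass is cancelled by the backward pass. The sketch below illustrates this with a one-pole low-pass filter; the filter type, the smoothing factor `alpha` and the function names are assumptions chosen to keep the example dependency-free, since the text does not specify the filter used.

```python
from typing import List, Sequence


def low_pass(signal: Sequence[float], alpha: float) -> List[float]:
    """Causal one-pole low-pass: y[n] = alpha * x[n] + (1 - alpha) * y[n - 1]."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not signal:
        return []
    out: List[float] = []
    prev = signal[0]  # start at the first sample to avoid a start-up transient
    for x in signal:
        prev = alpha * x + (1.0 - alpha) * prev
        out.append(prev)
    return out


def forward_backward(signal: Sequence[float], alpha: float) -> List[float]:
    """Filter forward in time, then backward, so the phase delays cancel."""
    forward = low_pass(signal, alpha)
    backward = low_pass(forward[::-1], alpha)
    return backward[::-1]


if __name__ == "__main__":
    centre = 200
    pulse = [max(0.0, 1.0 - abs(n - centre) / 20.0) for n in range(401)]
    one_pass = low_pass(pulse, 0.2)
    zero_phase = forward_backward(pulse, 0.2)
    # The single pass shifts the peak later; the forward-backward pass does not.
    print(max(range(401), key=one_pass.__getitem__))
    print(max(range(401), key=zero_phase.__getitem__))
```

A single causal pass delays the peak of the pulse, which would bias any duration measured on the trace, whereas the forward-backward result stays centred on the original peak.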
In Figure 5E and F, we show the drugged AP waveforms with physiological noise in green, noise-
filtered in purple and without physiological noise in pink for iPSC-CMs and corresponding
histograms indicating the frequency of APD90 values, respectively. The adult-CM (Figure 5G) is
shown for drugged AP waveforms with physiological noise in green, noise-filtered in purple and
without physiological noise in pink along with the overlaid APD90 histograms in Figure 5H. We also
undertook retraining of the multitask network with noise-filtered traces to explore the effect of
noise filtering on the network performance. We compared the statistical measures for both
translation and classification tasks for the trained network based on AP waveforms with
physiological noise and noise-filtered traces (Table 3) and observed that applying noise filtering
techniques could slightly improve multitask network performance for both tasks.
Table 3. Statistical measures for evaluating the influence of applying noise-filtering techniques and considering the existing time dependency within simulated iPSC-CMs and adult-CMs APs on the multitask network performance.

Translation

| Networks | MSE | R2_score | APD90 error with IKr block |
| --- | --- | --- | --- |
| Multitask without noise-filtering | 0.0013 | 0.995 | 3.25% |
| Multitask with noise-filtering | 0.0013 | 0.995 | 3.17% |
| Multitask with noise-filtering and removing LSTM layers | 0.0016 | 0.980 | 3.6% |

Classification

| Networks | AUROC | Recall | Precision | F1-score |
| --- | --- | --- | --- | --- |
| Multitask without noise-filtering | 0.90 | 0.85 | 0.95 | 0.90 |
| Multitask with noise-filtering | 0.91 | 0.87 | 0.95 | 0.91 |
| Multitask with noise-filtering and removing LSTM layers | 0.88 | 0.85 | 0.87 | 0.88 |
Figure 5. Applying a digital filter forward and backward technique to iPSC-CM and adult-CM APs without and with IKr block indicates zero phase distortion for APD90 values. A. Simulated drug-
free iPSC-CMs APs with physiological noise in green and after applying the noise filtering technique in purple. B. Comparison between histogram distributions of APD90 values for noisy
and noise-filtered iPSC-CMs APs. C. Simulated drug-free adult-CMs APs with physiological noise in green and after applying noise filtering technique in purple. D. Comparison between
histogram distribution of APD90 values for noisy and noise-filtered adult-CMs APs. E. Simulated iPSC-CM APs with IKr block with physiological noise in green, after applying noise filtering
technique on simulated traces in purple, and without physiological noise in pink. F. Comparison
between histogram distribution of APD90 values for noisy and noise-filtered versus iPSC-CM APs without physiological noise. G. Simulated adult-CM APs with IKr block with physiological noise in
green, after applying noise filtering technique on simulated traces in purple, and without physiological noise in pink. H. Comparison between histogram distribution of APD90 values for
noisy and noise-filtered versus simulated adult-CMs APs without physiological noise.
We next explored the importance of the time dependency within the simulated iPSC-CM
and adult-CM APs for the multitask network performance. We did this by removing the LSTM
layers and comparing the network performance with and without them for both the
translation and classification tasks (Table 3). Removing the LSTM layers worsened the
network performance for both tasks. In summary, we determined that training the multitask
network with noise-filtered traces while taking the existing time dependence into consideration is
the best approach to design and perform our desired translation and classification tasks.
To test whether training the multitask network with cellular action potential waveforms that had
been subject to a variety of IKr blocking conditions (range 1-50% block) could improve the
performance of the network, we retrained the multitask network with various IKr block
percentages for the classification task and compared the performance of the multitask network
to identify the optimal threshold for the classification task (Figure 6). For our test set, we recorded
the MSE (Figure 6A) and the error in calculation of APD90 (Figure 6B) to investigate its influence on
the translation task and monitored values of AUROC (Figure 6C) and precision (Figure 6D) for the
classification task. As can be seen in Figure 6A-D, perturbation by 25% IKr block resulted in the
best performance for both the translation and classification tasks when the selected threshold
defined two classes: class 1 (iPSC-CMs APs with less than 25% IKr block) and class 2 (iPSC-CMs APs
with greater than or equal to 25% IKr block). In summary, the multitask network trained with
noise-filtered iPSC-CMs APs and adult-CMs APs, with LSTM layers, and with a 25% IKr block threshold for the
classification task emerged as the best proposed network.
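The APD90 error tracked in Figure 6B can be illustrated with a minimal sketch of how APD90 and a percentage error between a simulated and a translated trace might be computed. The triangular `toy_ap` waveform, the use of the maximum upstroke velocity as the start of the AP, and the linear interpolation of the repolarization crossing are assumptions made for this example; they are not taken from the published code.

```python
def toy_ap(times, repol_ms):
    """Hypothetical triangular AP: rest at -80 mV, peak of +20 mV at t = 10 ms."""
    trace = []
    for t in times:
        if t < 10:
            trace.append(-80.0)
        else:
            trace.append(max(-80.0, 20.0 - 100.0 * (t - 10) / repol_ms))
    return trace


def apd90(times, voltages):
    """Time from maximum upstroke velocity to 90% repolarization (assumed definition)."""
    peak = max(range(len(voltages)), key=lambda i: voltages[i])
    v_rest, v_peak = min(voltages), voltages[peak]
    v_90 = v_peak - 0.9 * (v_peak - v_rest)
    # Start of the AP: sample with the largest voltage jump up to the peak.
    up = max(range(1, peak + 1), key=lambda i: voltages[i] - voltages[i - 1])
    for j in range(peak + 1, len(voltages)):
        if voltages[j] <= v_90:
            # Linear interpolation between the two samples around the crossing.
            frac = (voltages[j - 1] - v_90) / (voltages[j - 1] - voltages[j])
            t_repol = times[j - 1] + frac * (times[j] - times[j - 1])
            return t_repol - times[up]
    raise ValueError("trace never reaches 90% repolarization")


times = list(range(0, 400))              # 1 ms sampling, toy values
apd_sim = apd90(times, toy_ap(times, 300))    # stands in for a simulated adult-CM AP
apd_trans = apd90(times, toy_ap(times, 290))  # stands in for a translated adult-CM AP
pct_error = 100.0 * abs(apd_trans - apd_sim) / apd_sim
print(apd_sim, apd_trans, round(pct_error, 2))
```

The numbers produced here come from the toy waveform only and say nothing about the network's reported accuracy.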
Figure 6. Tracking the performance of the multitask network for translation and classification tasks with different IKr block percentage thresholds to identify an optimal one. A. Calculated the
mean-squared-error (MSE) between simulated and translated adult-CM AP waveforms by changing IKr block threshold for the classification task for test set. B. Calculated error between
APD90 values for simulated and translated adult-CM APs for test set. C. Multitask network AUROC measure for classifying iPSC-CM APs with IKr block into class 1 (iPSC-CM APs with less
than threshold % IKr block) and class 2 (iPSC-CM APs with greater than or equal to threshold % IKr block). D. Multitask network precision measure for classifying iPSC-CM APs with IKr block into
class 1 and 2.
We next set out to demonstrate the real-world utility of the multitask classification and translation
network by applying the network to experimental data. We utilized experimental iPSC-CM APs
from the Kurokawa lab (Figure 7A) and applied the translation task resulting in the predicted adult-
CM APs as shown in Figure 7B. The translation notably resulted in a reduction in variability in APD
in the adult translated cells, consistent with our simulated results and with previous experimental
observations [16, 45]. In an additional validation of the deep learning network, we first simulated
iPSC-CM APs with 50% block of IKr. We then used these simulated APs as an input for the multitask
network and utilized the output from the translation task as a prediction of the effect of 50% block
on adult-CMs. In Figure 7C, the translated drugged APD90 values are shown as turquoise asterisks
plotted with directly simulated O’Hara-Rudy adult-CMs APs with 50% IKr block (red curve) and
additional validation by experimental 50% block of IKr by 1 µM E-4031 (blue squares) [40]. These
data strongly suggest that the effects of drug block in iPSC-CMs can successfully be translated to
predict its effect on adult human APs.
Figure 7. Translation of experimentally recorded iPSC-CMs APs into adult-CMs APs to validate the multitask network performance. A. Experimentally recorded iPSC-CMs APs from the Kurokawa lab. B. Translated adult-CMs APs from experimentally recorded iPSC-CMs APs using presented
multitask network. C. Comparing translated adult-CMs AP APD90 values with 50% IKr block (turquoise asterisks) with previously published simulated (red curve for drugged and black for drug-free control) and experimental (blue squares) values in O’Hara-Rudy study [40] indicates
successful model validation.
Methods
Generation of the in silico data for training and testing the deep learning model:
iPSC-CM and adult-CM baseline Action potentials with or without IKr block
iPSC-CMs baseline cells were paced from steady-state. The model formulation used for the
baseline adult-CMs was the O’Hara-Rudy endocardial model cell [40]. The control adult-CMs was
paced at cycle length of 982 ms to match the cycle length of the last beat of iPSC-CMs AP. The
block case of adult-CMs (50% inhibition) was paced at cycle length of 1047 ms to match the cycle
length of the last beat of iPSC-CMs AP with 50% IKr block.
iPSC-CMs and adult-CMs baseline simulations with physiological noise currents
Data set 1: Without IKr block: iPSC-CMs AP populations (n = 304) were generated after
incorporating physiological noise. The adult-CMs were paced with noise for 100 beats after
reaching steady state at matching cycle length of the last beat of iPSC-CMs AP populations.
Data set 2: With IKr block: The adult-CMs with noise were paced with 1-50% IKr block. The model
was simulated at five varying beating rates for each percentage of block, matching the last
beat of iPSC-CMs with 1-50% IKr block (n = 250). The numerical method used for updating the
voltage was the forward Euler method [46].
Simulated physiological noise currents:
Simulated noise current was added to the last 100 paced beats in the simulated AP models, and
simulated action potentials (APs) were recorded at the 2000th paced beat in single cells. This
noise current was modeled using the equation from [39],
$$V_{t+\Delta t} = V_t - \frac{I(V_t)\,\Delta t}{C_m} + \xi\, n \sqrt{\Delta t} \qquad (1)$$
where n is a random number drawn from a Gaussian distribution N(0,1), ∆t is the time step, and
ξ = 0.3 is the diffusion coefficient, which sets the amplitude of the noise. The noise current was generated
and applied to the membrane potential Vt throughout the last 100 beats of the simulated time course.
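The update in Eq. 1 is a forward Euler step with an added Gaussian noise term. The sketch below shows one way such a step could be written. The linear `leak_current`, its parameters, the time step, and the starting voltage are hypothetical stand-ins chosen for illustration; the actual simulations use the full iPSC-CM and O'Hara-Rudy ionic models.

```python
import math
import random


def noisy_euler_step(v, current_fn, dt, c_m=1.0, xi=0.3, rng=random):
    """One update of Eq. 1: V - I(V) dt / C_m + xi * n * sqrt(dt), with n ~ N(0, 1)."""
    n = rng.gauss(0.0, 1.0)
    return v - current_fn(v) * dt / c_m + xi * n * math.sqrt(dt)


def leak_current(v, g=0.1, e_rev=-85.0):
    """Hypothetical linear leak current, used only to make the example runnable."""
    return g * (v - e_rev)


def simulate(v0, n_steps, dt, xi, seed=0):
    """Integrate the toy membrane equation and return the voltage trace."""
    rng = random.Random(seed)
    v = v0
    trace = [v]
    for _ in range(n_steps):
        v = noisy_euler_step(v, leak_current, dt, xi=xi, rng=rng)
        trace.append(v)
    return trace


noiseless = simulate(v0=-60.0, n_steps=2000, dt=0.01, xi=0.0)
noisy = simulate(v0=-60.0, n_steps=2000, dt=0.01, xi=0.3)
print(noiseless[-1], noisy[-1])
```

Because the noise term scales with the square root of the time step, halving ∆t does not halve the noise added per step, which is why the amplitude ξ is specified separately from the deterministic current.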
Experimental iPSC-CMs:
Human iPSC-CMs (201B7, RIKEN BRC, Tsukuba, Japan) were cultured and subcultured on SNL76/7
feeder cells as described in detail previously [47]. Cardiomyocyte differentiation was performed
by the EB method with slight modifications [47]. Commercially available iCell-cardiomyocytes
(FUJIFILM Cellular Dynamics, Inc., Tokyo, Japan) were cultured according to the
manufacturer's manual. Action potentials were recorded with the perforated
configuration of the patch-clamp technique as described in detail previously [47]. Measurements
were performed at 36 ± 1 °C with the external solution composed of (in mM): NaCl (135), NaH2PO4
(0.33), KCl (5.4), CaCl2 (1.8), MgCl2 (0.53), glucose (5.5), HEPES, pH 7.4. To achieve patch
perforation (10-20 MΩ; series resistances), amphotericin B (0.3-0.6 µg/mL) was added to the
internal solution composed of (in mM): aspartic acid (110), KCl (30), CaCl2 (1), adenosine-5’-
triphosphate magnesium salt (5), creatine phosphate disodium salt (5), HEPES (5), EGTA (11), pH
7.25. In quiescent cardiomyocytes, action potentials were elicited by passing depolarizing current
pulses (2 ms in duration) of suprathreshold intensity (120% of the minimum input needed to elicit action
potentials) at a frequency of 1 Hz unless noted otherwise.
The multitask network architecture:
The proposed multitask network comprises two LSTM layers followed by independent fully
connected layers (Figure 2A) for the classification and translation tasks. The LSTM layers
memorize the important time-dependent information and then transfer the extracted
information (features) into the subsequent fully connected layers to translate immature cardiac
action potential waveforms into mature cardiac action potential waveforms (Figure 2B) and
classify iPSC-CM APs into class 1 (iPSC-CM APs with less than the threshold % IKr block) and class 2
(iPSC-CM APs with greater than or equal to the threshold % IKr block) (Figure 2C).
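The class definition above amounts to a simple thresholding rule on the percentage of IKr block. A minimal sketch follows; the function name `ikr_class`, the 1% grid of block levels, and the candidate thresholds are hypothetical and chosen only to show how the class sizes shift as the threshold moves.

```python
def ikr_class(block_percent, threshold_percent=25.0):
    """Return 1 for block below the threshold, 2 for block at or above it."""
    return 1 if block_percent < threshold_percent else 2


block_levels = list(range(0, 51))  # 0-50% IKr block in 1% steps (illustrative grid)
labels_25 = [ikr_class(b) for b in block_levels]

# Number of traces that fall into class 2 for several candidate thresholds.
class_sizes = {
    thr: sum(ikr_class(b, thr) == 2 for b in block_levels)
    for thr in (1, 10, 25, 40)
}
print(class_sizes)
```

For the loss function, class 1 and class 2 would correspond to the binary labels 0 and 1, respectively.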
Long short-term memory (LSTM) layers (Figure 2D):
We used LSTMs as the first two layers of the multitask network so that the network learns
which data in a sequence are important to keep and which to discard. At each time step, the LSTM cell
takes in three different pieces of information, the current input data (𝐴𝑃𝑖𝑃𝑆𝐶𝑡), the short-term
memory (hidden state) from the previous cell (ℎ𝑡−1) and the long-term memory (cell state)
(𝐶𝑡−1). The LSTM cells contain internal mechanisms called gates. The gates are neural networks
(with weights (w) and bias terms (b)) that regulate the flow of information at each time step before
passing on the long-term and short-term information to the next cell [48]. These gates are called
the input gate, the forget gate, and the output gate (Figure 2D).
The forget gate, as the name implies, determines which information from the long-term memory
should be kept or discarded. This is done by multiplying the incoming long-term memory by a
forget vector generated by the current input and incoming short-term memory. To obtain the
forget vector, the short-term memory and current input are passed through a sigmoid function
(σ) [49]. The output vector of the sigmoid function has entries between 0 (discard) and 1 (keep) and is then multiplied
element-wise by the long-term memory to choose which parts of the long-term memory to retain (Eq. 2).
$$F_t = \sigma_f\left(w_f\, AP_{iPSC_t} + w_f\, h_{t-1} + b_f\right) \qquad (2)$$
The input gate decides what new information will be stored in the long-term memory. It considers
the current input and the short-term memory from the previous time step and transforms the
values to be between 0 (unimportant) and 1 (important) using a sigmoid activation function (Eq.
3). The second layer in input gate takes the short-term memory and current input and passes it
through a hyperbolic tangent (tanh) activation function to regulate the network (Eq. 4).
$$I_t = \sigma_i\left(w_i\, AP_{iPSC_t} + w_i\, h_{t-1} + b_i\right) \qquad (3)$$
$$S_t = \tanh_i\left(w_s\, AP_{iPSC_t} + w_s\, h_{t-1} + b_s\right) \qquad (4)$$
The outputs from the forget and input gates then undergo a pointwise addition to give a new
version of the long-term memory (Eq. 5), which is then passed on to the next cell.
$$C_t = F_t * C_{t-1} + I_t * S_t \qquad (5)$$
Finally, the output gate utilizes current input and previous short-term memory and passes them
into a sigmoid function (Eq. 6). Then the new computed long-term memory passes through a tanh
activation function and the outputs from these two processes are multiplied to produce the new
short-term memory (Eq. 7).
$$O_t = \sigma_o\left(w_o\, AP_{iPSC_t} + w_o\, h_{t-1} + b_o\right) \qquad (6)$$
$$h_t = O_t * \tanh_o(C_t) \qquad (7)$$
The short-term and long-term memory produced by these gates is carried over to the next cell for
the process to be repeated. The output of each time step is obtained from the short-term memory,
also known as the hidden state, and is subsequently passed into fully connected layers to perform
the translation and classification tasks.
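The gate computations in Eqs. 2-7 can be traced with a scalar sketch of a single LSTM cell. This is an illustration only: it uses one-dimensional states, arbitrary toy weights, and separate input and recurrent weights per gate (the usual LSTM convention, and an assumption relative to the shorthand in Eqs. 2-6). The network itself would rely on a library LSTM layer with vector-valued states.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def lstm_step(x_t, h_prev, c_prev, p):
    """One scalar LSTM step following Eqs. 2-7.

    p maps a gate name to (input weight, recurrent weight, bias).
    """
    def pre(gate):
        w_x, w_h, b = p[gate]
        return w_x * x_t + w_h * h_prev + b

    f_t = sigmoid(pre("f"))          # forget gate, Eq. 2
    i_t = sigmoid(pre("i"))          # input gate, Eq. 3
    s_t = math.tanh(pre("s"))        # candidate values, Eq. 4
    c_t = f_t * c_prev + i_t * s_t   # new long-term memory, Eq. 5
    o_t = sigmoid(pre("o"))          # output gate, Eq. 6
    h_t = o_t * math.tanh(c_t)       # new short-term memory, Eq. 7
    return h_t, c_t


# Toy weights (hypothetical): the same values for all four gates.
params = {gate: (0.5, 0.5, 0.0) for gate in "fiso"}

h, c = 0.0, 0.0
hidden_states = []
for x in [0.0, 0.5, 1.0, 0.5, 0.0]:  # toy normalized AP samples
    h, c = lstm_step(x, h, c, params)
    hidden_states.append(h)
print(hidden_states)
```

The list `hidden_states` plays the role of the per-time-step output that is handed to the fully connected layers.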
Fully connected layers (Figure 2E):
The fully connected neural network layers contain input, hidden and output layers (Figure 2E),
which may have various numbers of neurons. Every neuron in a layer is connected to the neurons in
another layer [50]. Fully connected layers receive the output of LSTM layers as input. The fully
connected layers calculate a weighted sum of LSTM outputs and add a bias term to the outputs.
These data are then passed to an activation function (f) to define the output for each node (Eq.
8,9) [51].
$$a_j^n = f\left(Z_j^n\right) \qquad (8)$$
$$Z_j^n = W_{i,j}^n * a_j^{n-1} + b^n \qquad (9)$$
where $a^0$ is the input data and $a^{n+1}$ is the output data $\hat{y} \in \{y_{t_i}, y_{c_i}\}$, where $y_{t_i}$ and $y_{c_i}$ are the outputs for the
translation and classification tasks, respectively. We first assign random values to all network
parameters ($\theta_t$: node weights ($W_{i,j}$) and bias terms ($b$)) and to the network hyperparameters, and select the
best network architecture: the number of hidden layers, the number of neurons, and the activation function
for each hidden layer. Next, we estimate the network errors using the mean-squared-error (Eq. 10)
and cross-entropy (Eq. 11) loss functions for the translation and classification tasks [52, 53],
respectively.
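$$MSE = \frac{1}{n} \sum_{i=1}^{n} \left\| y_{t_i} - \hat{y}_{t_i} \right\|^2 \qquad (10)$$
$$CrossEntropy = -\left( y_{c_i} \log(\hat{y}_{c_i}) + (1 - y_{c_i}) \log(1 - \hat{y}_{c_i}) \right) \qquad (11)$$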
where n is the total number of input samples and $y_{t_i}$ and $\hat{y}_{t_i}$ are the simulated and translated
adult-CM APs (the network output for the translation task). $y_{c_i}$ is the binary indicator of the class label
for iPSC-CM APs (0 for iPSC-CM APs with less than the threshold % IKr block or 1 for iPSC-CM APs
with greater than or equal to the threshold % IKr block) and $\hat{y}_{c_i}$ is the predicted probability that the AP
belongs to the class labeled 1. We used the sum of both loss functions (Eq. 12) to calculate the
overall network error (J) for both translation and classification tasks during the network training
process. We updated the network parameters ($\theta_{t+1}$) using the adaptive moment estimation (Adam)
optimization algorithm [54] based on the average gradient of overall loss function with respect to
the network parameters for 128 randomly selected simulated AP traces (mini-batch=128) at each
training iteration (Eqs. 13-15).
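$$J(\theta_{t,i}) = CrossEntropy_{Classification}(\theta_{t,i}) + MSE_{Translation}(\theta_{t,i}) \qquad (12)$$
$$\theta_{t+1,i} = \theta_{t,i} - \frac{\alpha\, \hat{m}_t}{\sqrt{\hat{v}_t} + \epsilon}, \quad \theta_{t,i} \in \{W_{i,j}^n, b_j^n\} \qquad (13)$$
$$\hat{m}_t = \frac{m_t}{1 - \beta_1^t}, \text{ where } m_t = (1 - \beta_1)\,\nabla J(\theta_{t,i}) + \beta_1 m_{t-1} \qquad (14)$$
$$\hat{v}_t = \frac{v_t}{1 - \beta_2^t}, \text{ where } v_t = (1 - \beta_2)\left(\nabla J(\theta_{t,i})\right)^2 + \beta_2 v_{t-1} \qquad (15)$$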
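The combined objective in Eq. 12 can be sketched in a few lines of plain Python. The averaging of the cross-entropy over samples, the clipping constant `eps`, and the toy inputs are assumptions for this example. Note also that Eq. 10 as written sums the squared error over all time points of a trace, whereas library MSE functions often average over every element instead.

```python
import math


def mse(targets, predictions):
    """Eq. 10: mean over samples of the squared L2 norm of the trace error."""
    n = len(targets)
    return sum(
        sum((y - y_hat) ** 2 for y, y_hat in zip(t, p))
        for t, p in zip(targets, predictions)
    ) / n


def binary_cross_entropy(labels, probs, eps=1e-12):
    """Eq. 11 averaged over samples; eps guards against log(0)."""
    total = 0.0
    for y, p in zip(labels, probs):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(labels)


def multitask_loss(ap_true, ap_pred, cls_true, cls_prob):
    """Eq. 12: unweighted sum of the classification and translation losses."""
    return binary_cross_entropy(cls_true, cls_prob) + mse(ap_true, ap_pred)


# Toy data: two "traces" of two samples each and two class labels.
ap_true = [[0.0, 1.0], [1.0, 0.0]]
ap_pred = [[0.0, 0.5], [1.0, 0.0]]
loss = multitask_loss(ap_true, ap_pred, cls_true=[0, 1], cls_prob=[0.5, 0.5])
print(loss)
```

Because the two terms are summed without weights, their relative scale determines how strongly each task drives the shared LSTM layers.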
We used the rectified linear unit (ReLU) activation function for hidden layers and dropout
regularization [55], with a probability of 0.25 of eliminating any hidden unit in the fully connected layers.
Model parameters were updated to search for the global minimum of the loss function based on a
predefined learning rate ($\alpha$), first and second moment decay terms ($\beta_1$, $\beta_2$), and a small term
preventing division by zero ($\epsilon$).
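The parameter update in Eqs. 13-15 can be written as a short scalar routine. The sketch below applies it to a toy quadratic objective; the objective, the starting point, and the hyperparameter values (which follow commonly used Adam defaults) are assumptions, since the values used for training are not listed here.

```python
import math


def adam_step(theta, grad, m, v, t, alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter (Eqs. 13-15); t starts at 1."""
    m = beta1 * m + (1.0 - beta1) * grad        # first-moment estimate, Eq. 14
    v = beta2 * v + (1.0 - beta2) * grad ** 2   # second-moment estimate, Eq. 15
    m_hat = m / (1.0 - beta1 ** t)              # bias correction
    v_hat = v / (1.0 - beta2 ** t)
    theta = theta - alpha * m_hat / (math.sqrt(v_hat) + eps)  # Eq. 13
    return theta, m, v


# Toy objective J(theta) = (theta - 3)^2 with gradient 2 * (theta - 3).
theta, m, v = 5.0, 0.0, 0.0
for t in range(1, 2001):
    grad = 2.0 * (theta - 3.0)
    theta, m, v = adam_step(theta, grad, m, v, t, alpha=0.01)
print(theta)
```

In the first iteration the bias-corrected ratio has magnitude close to 1, so the step size is approximately the learning rate regardless of the gradient scale.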
We first trained the network considering drug-free (0% block) and drugged (1-50% IKr block)
cases for the classification task. Then, we tested the hypothesis that training the multitask network
to classify cellular action potential waveforms that had been subject to various IKr blocking
conditions (range 1-50% block) could improve the performance of the network. We used MSE (Eq.
10), R2_score (Eqs. 16-17 below) and the histogram distribution of APD90 as statistical measures to
evaluate the performance of network for translation task and area under receiver operating
characteristic curve (AUROC), recall, precision and F1-score to measure capability of network for
classification task as described below.
We applied a forward and backward digital filter technique [44] to normalized and labeled
simulated iPSC-CM APs and adult-CM APs without and with perturbation by 1-50% IKr block and
utilized them as inputs and outputs in the network. The network architecture was implemented using
the PyTorch platform [56]. 80% of the simulated APs were used for training the network and the
remaining 20% were used to test the performance of the network on an unseen dataset during
training. The network code is publicly available on GitHub
(https://github.com/ClancyLabUCD/Multitask_network).
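Forward and backward filtering removes phase distortion because the delay introduced by the forward pass is cancelled by the pass over the reversed signal. The sketch below demonstrates this with a first-order low-pass filter in plain Python. The filter type and the smoothing factor `a` are assumptions for illustration, and the design used in [44] may differ; in practice a library routine such as `scipy.signal.filtfilt` provides this technique.

```python
def lowpass(signal, a):
    """First-order low-pass filter: y[n] = a * x[n] + (1 - a) * y[n - 1]."""
    out = []
    prev = signal[0]  # start from the first sample to avoid an initial jump
    for x in signal:
        prev = a * x + (1.0 - a) * prev
        out.append(prev)
    return out


def forward_backward(signal, a=0.3):
    """Filter forward, then filter the reversed result, then reverse again."""
    forward = lowpass(signal, a)
    backward = lowpass(forward[::-1], a)
    return backward[::-1]


# Unit impulse in the middle of the record.
impulse = [0.0] * 101
impulse[50] = 1.0

one_pass = lowpass(impulse, 0.3)           # response is smeared to the right only
two_pass = forward_backward(impulse, 0.3)  # response is symmetric about sample 50
print(one_pass[49], one_pass[51], two_pass[49], two_pass[51])
```

The symmetric response of the two-pass version means that features such as the timing of repolarization are smoothed without being shifted in time, which is consistent with the zero phase distortion for APD90 noted in Figure 5.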
Statistical measures
As discussed above, we used MSE and cross-entropy as criteria for performance evaluation of the
translation and classification tasks. In addition to MSE, we computed the R2_score [57] (Eqs. 16, 17) to
measure how close the translated adult-CM APs ($\hat{y}_{t_i}$) are to the simulated adult-CM APs ($y_{t_i}$). We
compared the histogram distributions of simulated and translated adult-CM APD90 values to
investigate the ability of the network to predict them accurately.
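$$\bar{y}_t = \frac{1}{n} \sum_{i=1}^{n} y_{t_i} \qquad (16)$$
$$R^2 = \frac{\sum_i \left(\hat{y}_{t_i} - \bar{y}_t\right)^2}{\sum_i \left(y_{t_i} - \bar{y}_t\right)^2} \qquad (17)$$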
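A small numerical sketch of Eqs. 16-17 follows, assuming that Eq. 17 is the ratio of squared deviations from the mean of the simulated values. For comparison it also computes the residual-based form, 1 - SS_res / SS_tot, which is the definition used by common library implementations of an R2 score; the two forms need not agree for an arbitrary predictor. The data values are toy numbers.

```python
def mean(values):
    return sum(values) / len(values)  # Eq. 16


def r2_ratio(y_true, y_pred):
    """Eq. 17 read as explained sum of squares over total sum of squares (assumption)."""
    y_bar = mean(y_true)
    explained = sum((p - y_bar) ** 2 for p in y_pred)
    total = sum((y - y_bar) ** 2 for y in y_true)
    return explained / total


def r2_residual(y_true, y_pred):
    """Residual-based form: 1 - SS_res / SS_tot."""
    y_bar = mean(y_true)
    residual = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    total = sum((y - y_bar) ** 2 for y in y_true)
    return 1.0 - residual / total


y_true = [1.0, 2.0, 3.0, 4.0]   # stands in for simulated adult-CM values
y_pred = [1.1, 1.9, 3.2, 3.8]   # stands in for translated values
print(r2_ratio(y_true, y_pred), r2_residual(y_true, y_pred))
```

Both forms equal 1 for a perfect prediction, so either reading is consistent with values near 1 indicating close agreement.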
We used the area under the receiver operating characteristic curve (AUROC) to measure the capability of the model in
distinguishing between classes [42]. The receiver operating characteristic (ROC) curve is a plot of the true positive
rate (TPR), or recall (Eq. 19), against the false positive rate (FPR, the probability that the network classifies an
iPSC-CM AP with less than 25% IKr block as having greater than or equal to 25% IKr block) (Eq. 18).
An AUROC close to 1 represents an excellent model with a good measure of separability, an AUROC
near 0.5 indicates no ability to separate the classes, and an AUROC near 0 means that the model
inverts the class predictions [58]. In addition, the confusion matrix elements were used to compute statistical
metrics (recall, precision and F1-score) that describe the performance of a classification model [13],
where recall (Eq. 19) is the proportion of actual positives that are correctly identified as such.
Precision measures the proportion of positive identifications that are correct (Eq. 20) and F1-score is the
harmonic mean of precision and recall (Eq. 21).
$$FPR = \frac{FP}{FP + TN} \qquad (18)$$
$$Recall = \frac{TP}{TP + FN} \qquad (19)$$
$$Precision = \frac{TP}{TP + FP} \qquad (20)$$
$$F1 = 2 \times \frac{Precision \times Recall}{Precision + Recall} \qquad (21)$$
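The metrics in Eqs. 18-21 and the AUROC can be computed directly from labels and predicted probabilities, as in the following sketch. The labels, scores, and the 0.5 decision cutoff are toy values chosen for illustration. The AUROC is computed here through its rank interpretation, the probability that a randomly chosen positive receives a higher score than a randomly chosen negative, rather than by integrating the ROC curve.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, FP, TN, FN) for binary labels with 1 as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, tn, fn


def classification_metrics(y_true, y_pred):
    tp, fp, tn, fn = confusion_counts(y_true, y_pred)
    fpr = fp / (fp + tn)                                  # Eq. 18
    recall = tp / (tp + fn)                               # Eq. 19
    precision = tp / (tp + fp)                            # Eq. 20
    f1 = 2.0 * precision * recall / (precision + recall)  # Eq. 21
    return {"fpr": fpr, "recall": recall, "precision": precision, "f1": f1}


def auroc(y_true, scores):
    """Fraction of positive-negative pairs ranked correctly (ties count one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        (1.0 if p > n else (0.5 if p == n else 0.0)) for p in pos for n in neg
    )
    return wins / (len(pos) * len(neg))


y_true = [0, 0, 0, 1, 1, 1, 1, 0]
scores = [0.1, 0.4, 0.35, 0.8, 0.65, 0.9, 0.3, 0.7]   # toy predicted probabilities
y_pred = [1 if s >= 0.5 else 0 for s in scores]        # assumed 0.5 decision cutoff

metrics = classification_metrics(y_true, y_pred)
area = auroc(y_true, scores)
print(metrics, area)
```

Unlike precision and recall, the AUROC does not depend on the decision cutoff, which makes it convenient for comparing networks trained with different IKr block thresholds.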
Discussion
In this study, we have developed a data-driven algorithm intended to address well known
shortcomings in the induced pluripotent stem cell-derived cardiomyocyte (iPSC-CMs) platform. A
known concern with iPSC-CMs is that the data collection results in measurements from immature
action potentials, and it is unclear if these data reliably indicate impact in the adult cardiac
environment. Here, we set out to demonstrate a new deep learning algorithm to allow reliable
translation of results from the iPSC-CM to a mature adult cardiac response. The translation task
yielded an MSE of 0.0012, an R2_score of 0.99 and less than 3% calculation error in
predicting adult APD90 values from iPSC-CM inputs.
The multi-task network we present here notably conferred additional benefit over considering the
translation and classification tasks separately. For example, we noted that adding the classification
task to distinguish data from action potentials with and without the application of an IKr blocking
drug could improve the performance of the network. Adding the IKr blocking threshold to the
translation task resulted in a decrease of up to 7% in the MSE value. To pick the best threshold, we tested
a range of thresholds for classification and retrained the network with each of them. The threshold
defined as 25% IKr block led to the highest accuracy for both the translation and classification tasks.
The high values of the classification task statistical measures, including AUROC and precision equal
to 0.993 and 0.98, respectively, also support the ability of the multitask network to separate
iPSC-CM APs with greater than or equal to 25% IKr block from iPSC-CM APs with less than
25% IKr block.
Importantly, the multitask network presented here performed well even in the setting of the noted
variability in measurements from iPSC-CMs. We utilized a modeling and simulation approach from
our recent study [41] to generate a population of iPSC-CM action potentials that incorporate
variability comparable to that in experimental measurements. Utilizing simulated data presented
a unique opportunity: We were able to generate large amounts of data that were used both to
train and optimize the network and then to test the network with specifically designated distinct
simulated data sets. Utilizing simulated data to train a deep learning network may constitute a
more widely applicable approach that could be used to train a variety of networks to perform
multiple functions where access to comparable experimental data is not feasible.
Following the optimization and demonstration of the network as an accurate tool for both
translating and classifying data, we then used the same network to translate experimentally
obtained data. We showed that the proposed network can effectively take experimental data as
an input from immature iPSC-CMs and translate those data to produce adult action potential
waveforms. It is notable that the variation observed in the adult-CM AP duration is smaller
compared to iPSC-CM AP (Figure 7A-B). This has been observed both experimentally [16, 45] and
in our simulated cell environment [41, 59]. Although the simulated iPSC-CM has a large initial
calcium current (Figure 1B) compared to the simulated adult-CM (Figure 1C), the amplitude of
currents flowing during the adult-CM action potential plateau is notably larger. The immature
iPSC-CM cells have low conductance during the AP plateau and therefore a comparably higher membrane resistance.
For this reason, small perturbations to the iPSC-CMs have a larger impact on the resulting AP
duration than observed in adult cells [60]. We also used simulated iPSC-CMs subject to 50% block
of IKr. We translated those data to adult-CM APs, compared them with the previously reported
experimental impact of 50% IKr block on adult-CM APs, and noted excellent agreement, thereby
providing validation using experimental data from adult human cells.
In this study, we show that a deep learning network can be applied to classify cells into the drugged
and drug free categories and can be used to predict the impact of electrophysiological
perturbation across the continuum of aging from the immature iPSC-CM action potential to the
adult ventricular myocyte action potential. We translated experimental immature APs into mature
APs using the proposed network and validated the output of some key model simulations with
experimental data. The multitask network in this study was used for translation of iPSC-CMs to
adult APs, but could be readily extended and applied to translate data across species and classify
data from a variety of systems. Another extension of the technology presented here is to
predict the impact of naturally occurring mutations and other genetic anomalies [61].
References
1. Shaheen, N., et al., Human induced pluripotent stem cell-derived cardiac cell sheets
expressing genetically encoded voltage indicator for pharmacological and arrhythmia
studies. Stem cell reports, 2018. 10(6): p. 1879-1894.
2. Leyton-Mange, J.S., et al., Rapid cellular phenotyping of human pluripotent stem cell-
derived cardiomyocytes using a genetically encoded fluorescent voltage sensor. Stem cell
reports, 2014. 2(2): p. 163-170.
3. Sun, N., et al., Patient-specific induced pluripotent stem cells as a model for familial
dilated cardiomyopathy. Sci Transl Med, 2012. 4(130): p. 130ra47.
4. Lan, F., et al., Abnormal calcium handling properties underlie familial hypertrophic
cardiomyopathy pathology in patient-specific induced pluripotent stem cells. Cell Stem
Cell, 2013. 12(1): p. 101-13.
5. Burridge, P.W., et al., Human induced pluripotent stem cell-derived cardiomyocytes
recapitulate the predilection of breast cancer patients to doxorubicin-induced
cardiotoxicity. Nat Med, 2016. 22(5): p. 547-56.
6. Doss, M.X. and A. Sachinidis, Current challenges of iPSC-based disease modeling and
therapeutic implications. Cells, 2019. 8(5): p. 403.
7. Collins, T.A., M.G. Rolf, and A. Pointon, Current and future approaches to nonclinical
cardiovascular safety assessment. Drug Discovery Today, 2020.
8. Wu, J.C., et al., Towards precision medicine with human iPSCs for cardiac
channelopathies. Circulation research, 2019. 125(6): p. 653-658.
9. Tveito, A., et al., Inversion and computational maturation of drug response using human
stem cell derived cardiomyocytes in microphysiological systems. Scientific reports, 2018.
8(1): p. 1-14.
10. Tveito, A., et al., Computational translation of drug effects from animal experiments to
human ventricular myocytes. Scientific Reports, 2020. 10(1): p. 1-11.
11. Sube, R. and E.A. Ertel, Cardiomyocytes Derived from Human Induced Pluripotent Stem
Cells: An In-Vitro Model to Predict Cardiac Effects of Drugs. Journal of Biomedical
Science and Engineering, 2017. 10(11): p. 527.
12. Navarrete, E.G., et al., Screening drug-induced arrhythmia using human induced
pluripotent stem cell–derived cardiomyocytes and low-impedance microelectrode arrays.
Circulation, 2013. 128(11_suppl_1): p. S3-S13.
13. Lieu, D.K., et al., Mechanism-based facilitated maturation of human pluripotent stem
cell-derived cardiomyocytes. Circ Arrhythm Electrophysiol, 2013. 6(1): p. 191-201.
14. Veerman, C.C., et al., Immaturity of human stem-cell-derived cardiomyocytes in culture:
fatal flaw or soluble problem? Stem Cells Dev, 2015. 24(9): p. 1035-52.
15. Tu, C., B.S. Chao, and J.C. Wu, Strategies for Improving the Maturity of Human Induced
Pluripotent Stem Cell-Derived Cardiomyocytes. Circ Res, 2018. 123(5): p. 512-514.
16. Blinova, K., et al., International multisite study of human-induced pluripotent stem cell-
derived cardiomyocytes for drug proarrhythmic potential assessment. Cell reports, 2018.
24(13): p. 3582-3592.
17. Sala, L., M. Bellin, and C.L. Mummery, Integrating cardiomyocytes from human
pluripotent stem cells in safety pharmacology: has the time come? British journal of
pharmacology, 2017. 174(21): p. 3749-3765.
18. Alhusseini, M.I., et al., Machine Learning to Classify Intracardiac Electrical Patterns
During Atrial Fibrillation: Machine Learning of Atrial Fibrillation. Circulation:
Arrhythmia and Electrophysiology, 2020. 13(8): p. e008160.
19. Hochreiter, S. and J. Schmidhuber, Long short-term memory. Neural computation, 1997.
9(8): p. 1735-1780.
20. Ballinger, B., et al. DeepHeart: semi-supervised sequence learning for cardiovascular
risk prediction. in Thirty-Second AAAI Conference on Artificial Intelligence. 2018.
21. He, R., et al., Automatic cardiac arrhythmia classification using combination of deep
residual network and bidirectional LSTM. IEEE Access, 2019. 7: p. 102119-102135.
22. Hou, B., et al., LSTM Based Auto-Encoder Model for ECG Arrhythmias Classification.
IEEE Transactions on Instrumentation and Measurement, 2019.
23. Warrick, P. and M.N. Homsi. Cardiac arrhythmia detection from ECG combining
convolutional and long short-term memory networks. in 2017 Computing in Cardiology
(CinC). 2017. IEEE.
24. Oh, S.L., et al., Automated diagnosis of arrhythmia using combination of CNN and LSTM
techniques with variable length heart beats. Computers in biology and medicine, 2018.
102: p. 278-287.
25. Chen, C., et al., Automated arrhythmia classification based on a combination network of
CNN and LSTM. Biomedical Signal Processing and Control, 2020. 57: p. 101819.
26. Wang, L. and X. Zhou, Detection of congestive heart failure based on LSTM-based deep
network via short-term RR intervals. Sensors, 2019. 19(7): p. 1502.
27. Bian, M., et al. An accurate lstm based video heart rate estimation method. in Chinese
Conference on Pattern Recognition and Computer Vision (PRCV). 2019. Springer.
28. Yildirim, O., et al., A new approach for arrhythmia classification using deep coded
features and LSTM networks. Computer methods and programs in biomedicine, 2019.
176: p. 121-133.
29. Wang, E.K., X. Zhang, and L. Pan, Automatic classification of CAD ECG signals with
SDAE and bidirectional long short-term network. IEEE Access, 2019. 7: p. 182873-
182880.
30. Martis, R.J., et al., Application of higher order cumulant features for cardiac health
diagnosis using ECG signals. International journal of neural systems, 2013. 23(04): p.
1350014.
31. Liu, F., et al. A LSTM and CNN based assemble neural network framework for
arrhythmias classification. in ICASSP 2019-2019 IEEE International Conference on
Acoustics, Speech and Signal Processing (ICASSP). 2019. IEEE.
32. Yildirim, Ö., A novel wavelet sequence based on deep bidirectional LSTM network model
for ECG signal classification. Computers in biology and medicine, 2018. 96: p. 189-202.
33. Cai, C., et al., Deep learning-based prediction of drug-induced cardiotoxicity. Journal of
chemical information and modeling, 2019. 59(3): p. 1073-1084.
34. Zhang, Y., et al., Prediction of hERG K+ channel blockage using deep neural networks.
Chemical Biology & Drug Design, 2019. 94(5): p. 1973-1985.
35. Dickson, C.J., C. Velez-Vega, and J.S. Duca, Revealing molecular determinants of hERG
blocker and activator binding. Journal of chemical information and modeling, 2019.
60(1): p. 192-203.
36. Ryu, J.Y., et al., DeepHIT: a deep learning framework for prediction of hERG-induced
cardiotoxicity. Bioinformatics, 2020. 36(10): p. 3049-3055.
37. Yang, P.-C., et al., A computational pipeline to predict cardiotoxicity: From the atom to
the rhythm. Circulation research, 2020. 126(8): p. 947-964.
38. Li, Z., et al., General principles for the validation of proarrhythmia risk prediction
models: an extension of the CiPA in silico strategy. Clinical Pharmacology &
Therapeutics, 2020. 107(1): p. 102-111.
39. Tanskanen, A.J. and L.H. Alvarez, Voltage noise influences action potential duration in
cardiac myocytes. Mathematical biosciences, 2007. 208(1): p. 125-146.
40. O'Hara, T., et al., Simulation of the undiseased human cardiac ventricular action
potential: model formulation and experimental validation. PLoS computational biology,
2011. 7(5): p. e1002061.
41. Kernik, D.C., et al., A computational model of induced pluripotent stem‐cell derived
cardiomyocytes incorporating experimental variability from multiple data sources. The
Journal of physiology, 2019.
42. Fawcett, T., An introduction to ROC analysis. Pattern recognition letters, 2006. 27(8): p.
861-874.
43. Powers, D.M., Evaluation: from precision, recall and F-measure to ROC, informedness,
markedness and correlation. 2011.
44. Gustafsson, F., Determining the initial states in forward-backward filtering. IEEE
Transactions on signal processing, 1996. 44(4): p. 988-992.
45. Fabbri, A., et al., Required GK1 to suppress automaticity of iPSC-CMs depends strongly
on IK1 model structure. Biophysical Journal, 2019. 117(12): p. 2303-2315.
46. Atkinson, K.E., An introduction to numerical analysis. 2008: John Wiley & Sons.
47. Li, M., et al., Overexpression of KCNJ2 in induced pluripotent stem cell-derived
cardiomyocytes for the assessment of QT-prolonging drugs. Journal of Pharmacological
Sciences, 2017. 134(2): p. 75-85.
48. Cheng, J., L. Dong, and M. Lapata, Long short-term memory-networks for machine
reading. arXiv preprint arXiv:1601.06733, 2016.
49. Olah, C., Understanding LSTM Networks. Aug. 2015. URL
https://colah.github.io/posts/2015-08-Understanding-LSTMs, 2017.
50. Krogh, A., What are artificial neural networks? Nature biotechnology, 2008. 26(2): p.
195-197.
51. Carugo, O. and F. Eisenhaber, Data mining techniques for the life sciences. Vol.
609. 2010: Springer.
52. Goodfellow, I., Y. Bengio, and A. Courville, Deep learning. 2016: MIT Press.
53. Murphy, K.P., Machine learning: a probabilistic perspective. 2012: MIT Press.
54. Kingma, D.P. and J. Ba, Adam: A method for stochastic optimization. arXiv preprint
arXiv:1412.6980, 2014.
55. Zaremba, W., I. Sutskever, and O. Vinyals, Recurrent neural network regularization.
arXiv preprint arXiv:1409.2329, 2014.
56. Ketkar, N., Introduction to pytorch, in Deep learning with python. 2017, Springer. p.
195-208.
57. Devore, J.L., Probability and Statistics for Engineering and the Sciences. 2011: Cengage
Learning.
58. Mason, S.J. and N.E. Graham, Areas beneath the relative operating characteristics
(ROC) and relative operating levels (ROL) curves: Statistical significance and
interpretation. Quarterly Journal of the Royal Meteorological Society: A journal of the
atmospheric sciences, applied meteorology and physical oceanography, 2002. 128(584):
p. 2145-2166.
59. Kernik, D.C., et al., A computational model of induced pluripotent stem-cell derived
cardiomyocytes for high throughput risk stratification of KCNQ1 genetic variants. PLOS
Computational Biology, 2020. 16(8): p. e1008109.
60. Yang, P.C., et al., A computational modelling approach combined with cellular
electrophysiology data provides insights into the therapeutic benefit of targeting the late
Na+ current. The Journal of physiology, 2015. 593(6): p. 1429-1442.
61. Yoshinaga, D., et al., Phenotype-based high-throughput classification of long QT
syndrome subtypes using human induced pluripotent stem cells. Stem cell reports, 2019.
13(2): p. 394-404.