Machine Learning for Personalized Medicine
Jean-Philippe Vert
MascotNum 2014, ETH Zurich, April 24, 2014
Complexity of life
1 body = 10^14 cells; 1 cell = 6 × 10^9 ACGT coding for ~20,000 genes
Sequencing revolution
A flood of omics data
Genome
Interactome
Mutations, structural variations
Smoothing parameter chosen to maximize agreement with breakpoint annotations, to learn breakpoints on other chromosomes
[Figure: per-chromosome copy-number profiles (LogRatio vs. genomic position, chromosomes 1-22, X, Y), with detected breakpoints ("breakpoint"/"normal" labels), copy-number status (Trisomy, Normal, Monosomy, Loss) and outlier/inlier calls.]
Epigenome
Phenome
Transcriptome
Publications
Cancer
A cancer cell
Opportunities
What is your risk of developing a cancer? (prevention)
After diagnosis and treatment, what is the risk of relapse? (prognosis)
What specific treatment will cure your cancer? (personalized medicine)
Outline
1 Learning molecular classifiers with network information
2 Kernel bilinear regression for toxicogenomics
Breast cancer prognosis
Learning with regularization
Given a training set (x_i, y_i)_{i=1,...,n} where x_i ∈ R^p (typically n = 200, p = 20,000), we estimate a linear predictor

f_β(x) = β^⊤x

by solving

min_{β ∈ R^p} R(β) + λΩ(β)

where:
- R(β) is a convex empirical risk, typically R(β) = (1/n) Σ_{i=1}^n ℓ(β^⊤x_i, y_i) for some loss function ℓ (squared error, logistic loss, hinge loss...)
- Ω(β) is a regularization term, typically ‖β‖² (ridge regression, SVM...) or ‖β‖₁ (lasso...)
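As a minimal numerical sketch (my own toy code, not from the talk; assumes numpy), the ridge case (squared loss, Ω(β) = ‖β‖²) has a closed-form solution obtained by setting the gradient of the objective to zero:

```python
import numpy as np

# min_beta (1/n)||X beta - y||^2 + lambda ||beta||^2 has the closed form
#   beta = (X^T X / n + lambda I)^{-1} X^T y / n.
rng = np.random.default_rng(0)
n, p = 200, 50                      # n samples, p features (p = 20,000 in the talk)
X = rng.standard_normal((n, p))
beta_true = np.zeros(p)
beta_true[:5] = 1.0                 # sparse ground truth
y = X @ beta_true + 0.1 * rng.standard_normal(n)

lam = 0.1
beta = np.linalg.solve(X.T @ X / n + lam * np.eye(p), X.T @ y / n)

# Sanity check: the gradient of the objective vanishes at the solution.
grad = 2 * (X.T @ (X @ beta - y) / n + lam * beta)
print(np.abs(grad).max() < 1e-6)    # True
```

The lasso (Ω(β) = ‖β‖₁) has no such closed form and is typically solved by coordinate descent or proximal methods.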
Gene selection, molecular signature
The idea: we look for a limited set of genes that are sufficient for prediction. Selected genes should inform us about the underlying biology.
Lack of stability of signatures
[Figure: stability vs. AUC of gene signatures for several selection methods (Random, T-test, Entropy, Bhattacharyya, Wilcoxon, RFE, GFS, Lasso, E-Net), each run single-run or with ensemble aggregation (mean, exp, ss). Haury et al. (2011)]
Gene networks
[Figure: global KEGG gene network, with annotated modules: glycan biosynthesis; protein kinases; DNA and RNA polymerase subunits; glycolysis/gluconeogenesis; sulfur metabolism; porphyrin and chlorophyll metabolism; riboflavin metabolism; folate biosynthesis; biosynthesis of steroids, ergosterol metabolism; lysine biosynthesis; phenylalanine, tyrosine and tryptophan biosynthesis; purine metabolism; oxidative phosphorylation, TCA cycle; nitrogen, asparagine metabolism.]
Gene networks and expression data
Motivation: basic biological functions usually involve the coordinated action of several proteins:
- formation of protein complexes
- activation of metabolic, signalling or regulatory pathways
Many pathways and protein-protein interactions are already known.
Hypothesis: the weights of the classifier should be "coherent" with respect to this prior knowledge.
Graph based penalty
f_β(x) = β^⊤x,  min_β R(f_β) + λΩ(β)

Prior hypothesis: genes near each other on the graph should have similar weights.

An idea (Rapaport et al., 2007):

Ω(β) = Σ_{i∼j} (β_i − β_j)²,

min_{β ∈ R^p} R(f_β) + λ Σ_{i∼j} (β_i − β_j)².
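This penalty is exactly the Laplacian quadratic form β^⊤Lβ, a standard identity that a quick numpy check confirms (illustrative code, using the 5-node example graph that appears later in the slides):

```python
import numpy as np

# Identity: sum over edges i~j of (beta_i - beta_j)^2 == beta^T L beta,
# where L = D - A is the graph Laplacian.
edges = [(0, 2), (1, 2), (2, 3), (3, 4)]       # 5-node example graph
p = 5
A = np.zeros((p, p))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A                 # L = D - A

beta = np.array([1.0, -2.0, 0.5, 3.0, 0.0])
penalty = sum((beta[i] - beta[j]) ** 2 for i, j in edges)
print(np.isclose(penalty, beta @ L @ beta))    # True
```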
Graph Laplacian
Definition: the Laplacian of the graph is the matrix L = D − A.

[Figure: graph with nodes 1-5 and edges 1-3, 2-3, 3-4, 4-5.]

L = D − A =
[ 1  0 -1  0  0]
[ 0  1 -1  0  0]
[-1 -1  3 -1  0]
[ 0  0 -1  2 -1]
[ 0  0  0 -1  1]
Spectral penalty as a kernel
Theorem: the function f(x) = β^⊤x where β is a solution of

min_{β ∈ R^p} (1/n) Σ_{i=1}^n ℓ(β^⊤x_i, y_i) + λ Σ_{i∼j} (β_i − β_j)²

is equal to g(x) = γ^⊤Φ(x) where γ is a solution of

min_{γ ∈ R^p} (1/n) Σ_{i=1}^n ℓ(γ^⊤Φ(x_i), y_i) + λ γ^⊤γ,

and where

Φ(x)^⊤Φ(x′) = x^⊤ K_G x′

for K_G = L⁺, the pseudo-inverse of the graph Laplacian.
Example
[Figure: same 5-node graph as above.]

L⁺ =
[ 0.88 -0.12  0.08 -0.32 -0.52]
[-0.12  0.88  0.08 -0.32 -0.52]
[ 0.08  0.08  0.28 -0.12 -0.32]
[-0.32 -0.32 -0.12  0.48  0.28]
[-0.52 -0.52 -0.32  0.28  1.08]
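These values can be reproduced with numpy (a sketch; the explicit feature map Φ below uses the standard square-root-of-kernel construction, which the slides do not spell out):

```python
import numpy as np

# Build the Laplacian of the 5-node example, take K_G = pinv(L), and check
# that Phi(x) = K_G^{1/2} x reproduces the kernel: Phi(x)^T Phi(x') = x^T K_G x'.
edges = [(0, 2), (1, 2), (2, 3), (3, 4)]
p = 5
A = np.zeros((p, p))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A
KG = np.linalg.pinv(L)                     # K_G = L^+, as in the theorem

w, U = np.linalg.eigh(KG)                  # K_G is PSD, so K_G^{1/2} exists
sqrtKG = U @ np.diag(np.sqrt(np.clip(w, 0, None))) @ U.T
Phi = lambda x: sqrtKG @ x

rng = np.random.default_rng(0)
x, xp = rng.standard_normal(p), rng.standard_normal(p)
print(np.isclose(Phi(x) @ Phi(xp), x @ KG @ xp))  # True
print(np.round(KG, 2))                     # matches the L+ matrix shown above
```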
Classifiers (Rapaport et al.)

[Figure: the same global KEGG connection map, now colored by the weights of the decision function; see the Fig. 4 caption below.]
Fig. 4. Global connection map of KEGG with mapped coefficients of the decision function obtained by applying a customary linear SVM
(left) and using high-frequency eigenvalue attenuation (80% of high-frequency eigenvalues have been removed) (right). Spectral filtering
divided the whole network into modules having coordinated responses, with the activation of low-frequency eigen modes being determined by
microarray data. Positive coefficients are marked in red, negative coefficients are in green, and the intensity of the colour reflects the absolute
values of the coefficients. Rhombuses highlight proteins participating in the Glycolysis/Gluconeogenesis KEGG pathway. Some other parts of
the network are annotated including big highly connected clusters corresponding to protein kinases and DNA and RNA polymerase sub-units.
5 DISCUSSION
Our algorithm groups predictor variables according to highly connected "modules" of the global gene network. We assume that the genes within a tightly connected network module are likely to contribute similarly to the prediction function because of the interactions between the genes. This motivates the filtering of the gene expression profile to remove the noisy high-frequency modes of the network.

Such grouping of variables is a very useful feature of the resulting classification function because the function becomes meaningful for interpreting and suggesting biological factors that cause the class separation. This allows classifications based on functions, pathways and network modules rather than on individual genes. This can lead to a more robust behaviour of the classifier in independent tests and to equal if not better classification results. Our results on the dataset we analysed show only a slight improvement, although this may be due to its limited size. Therefore we are currently extending our work to larger data sets.
An important remark to bear in mind when analyzing pictures such as Figs. 4 and 5 is that the colors represent the weights of the classifier, and not gene expression levels. There is of course a relationship between the classifier weights and the typical expression levels of genes in irradiated and non-irradiated samples: irradiated samples tend to have expression profiles positively correlated with the classifier, while non-irradiated samples tend to be negatively correlated. Roughly speaking, the classifier tries to find a smooth function that has this property. If more samples were available, a better non-smooth classifier might be learned by the algorithm, but constraining the smoothness of the classifier is a way to reduce the complexity of the learning problem when a limited number of samples are available. This means in particular that the pictures provide virtually no information regarding the over- or under-expression of individual genes, which is the cost to pay to obtain instead an interpretation in terms of more global pathways. Constraining the classifier to rely on just a few genes would have a similar effect of reducing the complexity of the problem, but would lead to a more difficult interpretation in terms of pathways.

Fig. 5. The glycolysis/gluconeogenesis pathways of KEGG with mapped coefficients of the decision function obtained by applying a customary linear SVM (a) and using high-frequency eigenvalue attenuation (b). The pathways are mutually exclusive in a cell, as clearly highlighted by our algorithm.
An advantage of our approach over other pathway-based clustering methods is that we consider the network modules that naturally appear from spectral analysis rather than a historically defined separation of the network into pathways. Thus, cross-talk between pathways is taken into account, which is difficult to do using other approaches. It can however be noticed that the implicit decomposition into pathways that we obtain is biased by the very incomplete knowledge of the network, and that certain regions of the network are better understood, leading to a higher connection concentration.

Like most approaches aiming at comparing expression data with gene networks such as KEGG, the scope of this work is limited by two important constraints. First, the gene network we use is only a convenient but rough approximation to describe complex biochemical processes; second, the transcriptional analysis of a sample cannot give any information regarding post-transcriptional regulation and modifications. Nevertheless, we believe that our basic assumptions remain valid, in that we assume that the expression of the genes belonging to the same metabolic pathway module is coordinately regulated. Our interpretation of the results supports this assumption.

Another important caveat is that we simplify the network description as an undirected graph of interactions. Although this would seem to be relevant for simplifying the description of metabolic networks, real gene regulation networks are influenced by the direction, sign and importance of the interaction. Although the incorporation of weights into the Laplacian (equation 1) is straightforward and allows the extension of the approach to weighted undirected graphs, the incorporation of directions and signs to represent signalling or regulatory pathways requires more work but could lead to important advances for the interpretation of microarray data in cancer studies, for example.
ACKNOWLEDGMENTS
This work was supported by the grant ACI-IMPBIO-2004-47
of the French Ministry for Research and New Technologies.
Other penalties with kernels
Φ(x)^⊤Φ(x′) = x^⊤ K_G x′

with:
- K_G = (cI + L)⁻¹ leads to Ω(β) = c Σ_{i=1}^p β_i² + Σ_{i∼j} (β_i − β_j)².
- The diffusion kernel K_G = exp(−2tL) (matrix exponential) penalizes high frequencies of β in the Fourier domain.
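A small numerical illustration of the diffusion kernel (my own sketch, assuming scipy for the matrix exponential): since L1 = 0, the constant vector is an eigenvector of K_G with eigenvalue 1, so zero-frequency (smooth) components pass unchanged while high-frequency components are shrunk by exp(−2tλ_i):

```python
import numpy as np
from scipy.linalg import expm

# Diffusion kernel K_G = exp(-2 t L) on the 5-node example graph.
edges = [(0, 2), (1, 2), (2, 3), (3, 4)]
A = np.zeros((5, 5))
for i, j in edges:
    A[i, j] = A[j, i] = 1
L = np.diag(A.sum(axis=1)) - A

t = 0.5
KG = expm(-2 * t * L)
print(np.allclose(KG @ np.ones(5), np.ones(5)))  # True: constants pass unchanged
print(np.all(np.linalg.eigvalsh(KG) > 0))        # True: positive definite
```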
Other penalties without kernels
Gene selection + piecewise constant on the graph:

Ω(β) = Σ_{i∼j} |β_i − β_j| + Σ_{i=1}^p |β_i|

Gene selection + smooth on the graph:

Ω(β) = Σ_{i∼j} (β_i − β_j)² + Σ_{i=1}^p |β_i|
Graph lasso
Two solutions:

Ω_intersection(β) = Σ_{i∼j} √(β_i² + β_j²),

Ω_union(β) = sup { α^⊤β : α ∈ R^p, ∀ i∼j, α_i² + α_j² ≤ 1 }.
Generalization: group lasso with overlapping groups

Ω^G_latent(w) ≜ min_v { Σ_{g∈G} ‖v_g‖₂ : w = Σ_{g∈G} v_g, supp(v_g) ⊆ g }

Properties:
- the resulting support is a union of groups in G
- it is possible to select one variable without selecting all the groups containing it
- equivalent to the group lasso when there is no overlap
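To make the definition concrete, here is an illustrative computation of Ω^G_latent (my own example, not from the slides; assumes scipy) for two overlapping groups g1 = {0,1}, g2 = {1,2} and w = (1, 2, 3). Any admissible decomposition has the form v1 = (1, b, 0), v2 = (0, 2−b, 3), so the minimization reduces to a one-dimensional problem over the overlap split b:

```python
import numpy as np
from scipy.optimize import minimize_scalar

# Omega_latent(w) = min_b sqrt(1 + b^2) + sqrt((2-b)^2 + 9)
obj = lambda b: np.sqrt(1 + b**2) + np.sqrt((2 - b)**2 + 9)
res = minimize_scalar(obj)

# The optimum splits the overlap at b = 0.5, with value 2*sqrt(5) ~ 4.472,
# strictly better than assigning the overlap entirely to either group:
print(res.x, res.fun)
print(res.fun < min(obj(0.0), obj(2.0)))  # True
```

This is the mechanism behind the "select one variable without selecting all the groups containing it" property: the overlap coordinate is shared between the latent vectors rather than forced into a single group.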
Theoretical results

Consistency in group support (Jacob et al., 2009): let w̄ be the true parameter vector. Assume that there exists a unique decomposition v̄_g such that w̄ = Σ_g v̄_g and Ω^G_latent(w̄) = Σ_g ‖v̄_g‖₂. Consider the regularized empirical risk minimization problem L(w) + λΩ^G_latent(w). Then, under appropriate mutual incoherence conditions on X, as n → ∞, with very high probability, the optimal solution ŵ admits a unique decomposition (v̂_g)_{g∈G} such that

{g ∈ G : v̂_g ≠ 0} = {g ∈ G : v̄_g ≠ 0}.
Experiments
Synthetic data: overlapping groups. 10 groups of 10 variables, with 2 variables of overlap between two successive groups: {1,...,10}, {9,...,18}, ..., {73,...,82}. Support: union of the 4th and 5th groups. Learn from 100 training points.

[Figure: frequency of selection of each variable with the lasso (left) and Ω^G_latent (middle); comparison of the RMSE of both methods (right).]

Lasso signature (accuracy 0.61) vs. graph lasso signature (accuracy 0.64).
Outline
1 Learning molecular classifiers with network information
2 Kernel bilinear regression for toxicogenomics
Pharmacogenomics / Toxicogenomics
DREAM8 Toxicogenetics challenge
Genotypes from the 1000 Genomes Project; RNA-seq from the Geuvadis project.
Bilinear regression
Cell line X, chemical Y, toxicity Z.

Bilinear regression model:

Z = f(X, Y) + b(Y) + ε

Estimation by kernel ridge regression:

min_{f ∈ H, b ∈ R^p} Σ_{i=1}^n Σ_{j=1}^p (f(x_i, y_j) + b_j − z_ij)² + λ‖f‖²

Solving in O(max(n, p)³).
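A finite-dimensional sketch may help before the kernel version (my own toy code, assuming the finite-dimensional form f(x, y) = x^⊤My mentioned in the accompanying write-up): the problem in (M, b) is a single ridge regression in θ = (vec(M), b), with the penalty applied to M only.

```python
import numpy as np

#   min_{M, b} sum_ij (x_i^T M y_j + b_j - z_ij)^2 + lam * ||M||_F^2
rng = np.random.default_rng(0)
n, p, dx, dy = 6, 4, 3, 2          # cell lines, chemicals, descriptor dims
X = rng.standard_normal((n, dx))
Y = rng.standard_normal((p, dy))
Z = rng.standard_normal((n, p))
lam = 0.5

rows, targets = [], []
for i in range(n):
    for j in range(p):
        feat = np.outer(X[i], Y[j]).ravel()    # x_i^T M y_j = feat . vec(M)
        onehot = np.eye(p)[j]                  # picks out b_j
        rows.append(np.concatenate([feat, onehot]))
        targets.append(Z[i, j])
A, z = np.array(rows), np.array(targets)

D = np.diag([1.0] * (dx * dy) + [0.0] * p)     # penalize M only, not b
theta = np.linalg.solve(A.T @ A + lam * D, A.T @ z)
M, b = theta[: dx * dy].reshape(dx, dy), theta[dx * dy:]

# Check first-order optimality of the ridge objective.
grad = 2 * (A.T @ (A @ theta - z) + lam * D @ theta)
print(np.abs(grad).max() < 1e-7)   # True
```

The kernel version below replaces x^⊤My by a function in the product RKHS and exploits the eigendecompositions of the two Gram matrices instead of forming the explicit design matrix.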
2 The kernel bilinear regression model

Let X and Y denote abstract vector spaces representing, respectively, cell lines and chemicals. For example, if each cell line is characterized by a measure of d genetic markers, then we may take X = R^d to represent each cell line as a vector of markers, but to keep generality we will simply assume that X and Y are endowed with positive definite kernels, respectively K_X and K_Y. Given a set of n cell lines x_1, ..., x_n ∈ X and p chemicals y_1, ..., y_p ∈ Y, we assume that a quantitative measure of toxicity response z_{i,j} ∈ R has been measured when cell line x_i is exposed to chemical y_j, for i = 1, ..., n and j = 1, ..., p. Our goal is to estimate, from this data, a function h : X × Y → R to predict the response h(x, y) if a cell line x is exposed to a chemical y.

We propose to model the response with a simple bilinear regression model of the form:

Z = f(X, Y) + b(Y) + ε,  (1)

where f is a bilinear function, b is a chemical-specific bias term and ε is some Gaussian noise. We add the chemical-specific bias term to adjust for the large differences in absolute toxicity response values between chemicals, while the bilinear term f(X, Y) can capture some patterns of variations between cell lines shared by different chemicals. We will only focus on the problem of predicting the action of known and tested chemicals on new cell lines, meaning that we will not try to estimate b(Y) for new chemicals.

If x and y are finite-dimensional vectors, then the bilinear term f(x, y) has the simple form x^⊤My for some matrix M, with Frobenius norm ‖M‖² = Tr(M^⊤M). The natural generalization of this bilinear model to possibly infinite-dimensional spaces X and Y is to consider a function f in the product reproducing kernel Hilbert space H associated to the product kernel K_X K_Y, with Hilbert–Schmidt norm ‖f‖². To estimate model (1), we solve a standard ridge regression problem:

min_{f ∈ H, b ∈ R^p} Σ_{i=1}^n Σ_{j=1}^p (f(x_i, y_j) + b_j − z_{ij})² + λ‖f‖²,  (2)

where λ is a regularization parameter to be optimized. As shown in the next theorem, (2) has an analytical solution. Note that 1_n refers to the n-dimensional vector of ones, Diag(u) for a vector u ∈ R^n refers to the n × n diagonal matrix whose diagonal is u, and A ∘ B for two matrices of the same size refers to their Hadamard (or entrywise) product.

Theorem 1. Let Z ∈ R^{n×p} be the response matrix, and K_X ∈ R^{n×n} and K_Y ∈ R^{p×p} be the kernel Gram matrices of the n cell lines and p chemicals, with respective eigenvalue decompositions K_X = U_X D_X U_X^⊤ and K_Y = U_Y D_Y U_Y^⊤. Let γ = U_X^⊤ 1_n and S ∈ R^{n×p} be defined by S_{ij} = 1 / (λ + D_X^i D_Y^j), where D_X^i (resp. D_Y^j) denotes the i-th (resp. j-th) diagonal term of D_X (resp. D_Y). Then the solution (f, b) of (2) is given by

b = U_Y Diag(S^⊤ γ²)⁻¹ (S^⊤ ∘ (U_Y^⊤ Z^⊤ U_X)) γ  (3)

(with γ² the entrywise square of γ) and

∀(x, y) ∈ X × Y,  f(x, y) = Σ_{i=1}^n Σ_{j=1}^p α_{i,j} K_X(x_i, x) K_Y(y_j, y),  (4)

where

α = U_X (S ∘ (U_X^⊤ (Z − 1_n b^⊤) U_Y)) U_Y^⊤.  (5)
Kernel Trick

cell line descriptors → Kcell (kernelized)
drug descriptors → Kdrug (kernelized)
→ kernel bilinear regression f

Kernel choice? Descriptors, data integration, missing data.
Kernel choice

1. Kcell: 29 cell line kernels tested ⇒ 1 kernel that integrates all information ⇒ deals with missing data
2. Kdrug: 48 drug kernels tested ⇒ multi-task kernels
Cell line data integration

- Covariates: linear kernel
- SNPs: 10 Gaussian kernels
- RNA-seq: 10 Gaussian kernels
⇒ Integrated kernel
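One simple way to realize this integration step is to average the base kernels (a hedged sketch: the slides do not specify the combination rule, and the data shapes below are invented for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8                                            # toy number of cell lines
covariates = rng.standard_normal((n, 3))
snps = rng.integers(0, 3, size=(n, 20)).astype(float)
rnaseq = rng.standard_normal((n, 50))

def gaussian_kernel(F, sigma):
    """Gaussian (RBF) kernel between rows of the feature matrix F."""
    d2 = ((F[:, None, :] - F[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma**2))

kernels = [covariates @ covariates.T]                        # linear kernel
kernels += [gaussian_kernel(snps, s) for s in 2.0 ** np.arange(10)]
kernels += [gaussian_kernel(rnaseq, s) for s in 2.0 ** np.arange(10)]

K_int = sum(kernels) / len(kernels)                          # integrated kernel
# A sum of positive definite kernels is itself positive definite:
print(np.all(np.linalg.eigvalsh(K_int) > -1e-8))  # True
```

Averaging also gives a natural way to handle missing data: for a cell line missing one data type, the average can be taken over the available kernels only.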
Multi-task drug kernels

1. Dirac: independent regression for each drug
2. Multi-task: sharing information across drugs
3. Feature-based: linear kernel and 10 Gaussian kernels based on features: CDK (160 descriptors) and SIRMS (9,272 descriptors); graph kernel for molecules (2D walk kernel); fingerprint of 2D substructures (881 descriptors); ability to bind human proteins (1,554 descriptors)
4. Empirical: empirical correlation between drug responses [Figure: correlation heatmap, values 0-0.8]
5. Integrated: combines all information on drugs
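The Dirac/multi-task trade-off can be sketched as follows (a hedged illustration: the interpolated form K = (1−a)·Dirac + a·Uniform is in the spirit of standard multi-task kernels, not taken verbatim from the slides):

```python
import numpy as np

p = 4                                    # number of drugs
K_dirac = np.eye(p)                      # independent regression per drug
K_uniform = np.ones((p, p))              # one shared regression for all drugs

def K_multitask(a):
    """Interpolate between independent (a=0) and fully shared (a=1) tasks."""
    return (1 - a) * K_dirac + a * K_uniform

n = 3
K_cell = np.full((n, n), 0.5) + 0.5 * np.eye(n)   # toy PSD cell line kernel

# The bilinear model uses the product kernel K_cell(x,x') * K_drug(y,y').
K_prod = np.kron(K_multitask(0.0), K_cell)
# With the Dirac kernel the product is block-diagonal: no sharing across drugs.
print(np.allclose(K_prod, np.kron(np.eye(p), K_cell)))  # True
```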
29x48 kernel combinations: CV results
Km
ultit
ask1
1K
subs
truc
ture
.txt
Kch
emcp
p.tx
tK
mul
titas
k1K
sirm
sRbf
1.tx
tK
cdkR
bf1.
txt
Kpr
edta
rget
Rbf
1.tx
tK
sirm
sRbf
2.tx
tK
mul
titas
k3K
mul
titas
k2K
pred
targ
etR
bf2.
txt
Kcd
kRbf
2.tx
tK
pred
targ
etR
bf3.
txt
Kcd
kRbf
3.tx
tK
mul
titas
k4K
sirm
sRbf
3.tx
tK
mul
titas
k5K
cdkR
bf8.
txt
Ksi
rmsR
bf8.
txt
Kpr
edta
rget
Rbf
8.tx
tK
pred
targ
etR
bf6.
txt
Kpr
edta
rget
Rbf
7.tx
tK
mul
titas
k7K
mul
titas
k10
Kpr
edta
rget
Rbf
4.tx
tK
pred
targ
etR
bf5.
txt
Kcd
kRbf
4.tx
tK
cdkR
bf6.
txt
Kcd
kRbf
5.tx
tK
cdkR
bf7.
txt
Ksi
rmsR
bf6.
txt
Ksi
rmsR
bf7.
txt
Kem
piric
alK
int
Km
ultit
ask6
Km
ultit
ask8
Km
ultit
ask9
Ksi
rmsR
bf4.
txt
Ksi
rmsR
bf5.
txt
Kpr
edta
rget
Mea
nK
cdkM
ean
Ksi
rmsM
ean
Kcovariates.txtKintKcovariatesPopulation.txtKcovariatesBatch.txtKrnaseqMeanKrnaseqRbf4.txtKrnaseqRbf9.txtKrnaseqRbf10.txtKrnaseqRbf8.txtKrnaseqRbf5.txtKrnaseqRbf7.txtKrnaseqRbf6.txtdiracuniformKsnpMeanKrnaseqRbf2.txtKrnaseqRbf3.txtKsnpRbf1.txtKsnpRbf3.txtKsnpRbf2.txtKcovariatesSex.txtKsnpRbf4.txtKsnpRbf5.txtKrnaseqRbf1.txtKsnpRbf8.txtKsnpRbf7.txtKsnpRbf6.txt
CI
0.5 0.51 0.52 0.53Value
015
0Color Key
and HistogramC
ount
29x48 kernel combinations: CV results
Kmultitask11
Ksubstructure.txt
Kchemcpp.txt
Kmultitask1
Ksirm
sRbf1.txt
KcdkRbf1.txt
KpredtargetRbf1.txt
Ksirm
sRbf2.txt
Kmultitask3
Kmultitask2
KpredtargetRbf2.txt
KcdkRbf2.txt
KpredtargetRbf3.txt
KcdkRbf3.txt
Kmultitask4
Ksirm
sRbf3.txt
Kmultitask5
KcdkRbf8.txt
Ksirm
sRbf8.txt
KpredtargetRbf8.txt
KpredtargetRbf6.txt
KpredtargetRbf7.txt
Kmultitask7
Kmultitask10
KpredtargetRbf4.txt
KpredtargetRbf5.txt
KcdkRbf4.txt
KcdkRbf6.txt
KcdkRbf5.txt
KcdkRbf7.txt
Ksirm
sRbf6.txt
Ksirm
sRbf7.txt
Kempirical
Kint
Kmultitask6
Kmultitask8
Kmultitask9
Ksirm
sRbf4.txt
Ksirm
sRbf5.txt
KpredtargetMean
KcdkMean
Ksirm
sMean
Kcovariates.txtKintKcovariatesPopulation.txtKcovariatesBatch.txtKrnaseqMeanKrnaseqRbf4.txtKrnaseqRbf9.txtKrnaseqRbf10.txtKrnaseqRbf8.txtKrnaseqRbf5.txtKrnaseqRbf7.txtKrnaseqRbf6.txtdiracuniformKsnpMeanKrnaseqRbf2.txtKrnaseqRbf3.txtKsnpRbf1.txtKsnpRbf3.txtKsnpRbf2.txtKcovariatesSex.txtKsnpRbf4.txtKsnpRbf5.txtKrnaseqRbf1.txtKsnpRbf8.txtKsnpRbf7.txtKsnpRbf6.txt
CI
0.5 0.51 0.52 0.53Value
0150
Color Keyand Histogram
Count
integrated and
covariateskernels
jeudi 7 novembre 13
29x48 kernel combinations: CV results
Kmultitask11
Ksubstructure.txt
Kchemcpp.txt
Kmultitask1
Ksirm
sRbf1.txt
KcdkRbf1.txt
KpredtargetRbf1.txt
Ksirm
sRbf2.txt
Kmultitask3
Kmultitask2
KpredtargetRbf2.txt
KcdkRbf2.txt
KpredtargetRbf3.txt
KcdkRbf3.txt
Kmultitask4
Ksirm
sRbf3.txt
Kmultitask5
KcdkRbf8.txt
Ksirm
sRbf8.txt
KpredtargetRbf8.txt
KpredtargetRbf6.txt
KpredtargetRbf7.txt
Kmultitask7
Kmultitask10
KpredtargetRbf4.txt
KpredtargetRbf5.txt
KcdkRbf4.txt
KcdkRbf6.txt
KcdkRbf5.txt
KcdkRbf7.txt
Ksirm
sRbf6.txt
Ksirm
sRbf7.txt
Kempirical
Kint
Kmultitask6
Kmultitask8
Kmultitask9
Ksirm
sRbf4.txt
Ksirm
sRbf5.txt
KpredtargetMean
KcdkMean
Ksirm
sMean
Kcovariates.txtKintKcovariatesPopulation.txtKcovariatesBatch.txtKrnaseqMeanKrnaseqRbf4.txtKrnaseqRbf9.txtKrnaseqRbf10.txtKrnaseqRbf8.txtKrnaseqRbf5.txtKrnaseqRbf7.txtKrnaseqRbf6.txtdiracuniformKsnpMeanKrnaseqRbf2.txtKrnaseqRbf3.txtKsnpRbf1.txtKsnpRbf3.txtKsnpRbf2.txtKcovariatesSex.txtKsnpRbf4.txtKsnpRbf5.txtKrnaseqRbf1.txtKsnpRbf8.txtKsnpRbf7.txtKsnpRbf6.txt
CI
0.5 0.51 0.52 0.53Value
0150
Color Keyand Histogram
Count
covariates kernelon cell lines
sightly multi-taskon drugs
jeudi 7 novembre 13
Kernel on cell lines: CV results

[Figure: mean CI (0.50-0.54) per cell line kernel, from the dirac/uniform kernels (worst) to the covariates and integrated kernels (best); the covariates kernels capture a batch effect.]
Kernel on drugs: CV results

[Figure: mean CI (0.50-0.54) per drug kernel; predicted-target and multi-task kernels rank among the best.]
Final Submission (ranked 2nd)

Empirical kernel on drugs, integrated kernel on cell lines.

[Figure: ranking of drug kernels by mean CI (0.50-0.54), and the empirical correlation heatmap used for the drug kernel.]
Thanks