Doubly Decomposing Nonparametric Tensor Regression
Tensor regression problem · Our Approach · Convergence Analysis · Experiments · Summary
Doubly Decomposing Nonparametric Tensor Regression
M. Imaizumi (Univ. of Tokyo), K. Hayashi (AIST / JST ERATO)
2016/06/21 (ICML 2016)
Outline
Topic
- Regression with tensor input
- Nonparametric approach

Method
- Propose a nonparametric regression model with a Bayes estimator
- It avoids "the curse of dimensionality" of tensor regression
1 Tensor regression problem
2 Our Approach
3 Convergence Analysis
4 Experiments
5 Summary
Tensor Regression Problem
Tensor data: X ∈ R^{I_1 × ... × I_K}
- K: number of modes of tensor X
- I_k: dimension of the k-th mode

Tensor regression: n observations D_n = {(X_i, Y_i)}_{i=1}^n
- Input (tensor): X_i ∈ R^{I_1 × ... × I_K}
- Output (real): Y_i ∈ R

D_n is generated with a function f : R^{I_1 × ... × I_K} → R as

    Y_i = f(X_i) + ε_i,    for i = 1, ..., n,

where ε_i is Gaussian noise.
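The data-generating model above can be sketched in a few lines; the tensor sizes, sample size, noise level, and the stand-in for the unknown f below are illustrative assumptions, not settings from the talk.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative sizes: a K = 3 mode tensor with (I_1, I_2, I_3) = (5, 4, 3)
shape = (5, 4, 3)
n = 100  # number of observations

# Hypothetical stand-in for the unknown regression function f
def f_true(X):
    return np.tanh(X.sum())

X = rng.standard_normal((n, *shape))            # inputs X_i
eps = 0.1 * rng.standard_normal(n)              # Gaussian noise eps_i
Y = np.array([f_true(Xi) for Xi in X]) + eps    # Y_i = f(X_i) + eps_i
```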
Application of Tensor Regression Problem
- Predict health conditions from medical 3D images: X_i is the medical 3D image of patient i, Y_i is the health condition of patient i
- Predict the spread of epidemics on networks: X_i is the adjacency matrix of network i, Y_i is the number of infected nodes
The curse of dimensionality
The performance of an estimator of f degrades with tensor input.

A naive estimator f̃_n satisfies

    ‖f̃_n − f‖² = O(n^{−2β/(2β+∏_k I_k)}),

where β is the smoothness of f and ∏_k I_k is the number of elements in X.

We propose a method that obtains a better convergence rate.
Double Decomposition
Idea
- Reduce the complexity of f by decomposing f into nonparametric local functions
- Control the bias-variance trade-off through the estimation

We decompose both the function f and the input tensor X.
Double Decomposition 1
1. Tensor (CP) Decomposition

Consider X ∈ R^{I_1 × ... × I_K}. There exists a set of normalized vectors {x_r^{(k)} ∈ R^{I_k}}_{r=1,k=1}^{R*,K} and scale terms λ_r for r = 1, ..., R*, such that

    X = Σ_{r=1}^{R*} λ_r x_r^{(1)} ⊗ x_r^{(2)} ⊗ ... ⊗ x_r^{(K)}.

R* is the tensor rank.
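The CP form above is easy to build directly; the rank, dimensions, and random factors in this sketch are illustrative choices.

```python
import numpy as np

rng = np.random.default_rng(0)

R, dims = 2, (5, 4, 3)  # tensor rank R* and mode dimensions I_1, I_2, I_3

# Normalized factor vectors x_r^(k) and scale terms lambda_r
factors = [[v / np.linalg.norm(v)
            for v in (rng.standard_normal(I) for I in dims)]
           for _ in range(R)]
lam = rng.standard_normal(R)

# X = sum_r lambda_r * x_r^(1) (x) x_r^(2) (x) x_r^(3)
X = sum(lam[r] * np.einsum("i,j,k->ijk", *factors[r]) for r in range(R))
```

The mode-1 unfolding of such a tensor has matrix rank at most R*, which is a quick sanity check on the construction.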
Double Decomposition 2
2. Functional Decomposition

For each r, consider a function f(x_r^{(1)}, x_r^{(2)}, ..., x_r^{(K)}), where the x_r^{(k)} are I_k-dimensional vectors. There exist M* ∈ Z_+ ∪ {∞} and a set of local functions {f_m^{(k)}}_{k=1,m=1}^{K,M*} satisfying

    f(x_r^{(1)}, x_r^{(2)}, ..., x_r^{(K)}) = Σ_{m=1}^{M*} ∏_{k=1}^{K} f_m^{(k)}(x_r^{(k)}).

M* is the model complexity.
Proposed Framework
Assumption: f is additively separable with respect to r = 1, ..., R*.

Consider the doubly decomposed form of f:

    f(X) = Σ_{m=1}^{M*} Σ_{r=1}^{R*} λ_r ∏_{k=1}^{K} f_m^{(k)}(x_r^{(k)})

Additive-Multiplicative Nonparametric Regression (AMNR)
- Composed of f_m^{(k)} for all m = 1, ..., M* and k = 1, ..., K
- f_m^{(k)} takes an I_k-dimensional vector as input
- M* (model complexity) and R* (tensor rank) are tuning parameters
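A minimal sketch of evaluating the AMNR form numerically; the local functions, factors, and scales below are hypothetical toy choices, not estimated quantities.

```python
import numpy as np

def amnr(factors, lam, local_fns):
    """Evaluate f(X) = sum_m sum_r lam[r] * prod_k f_m^(k)(x_r^(k)).

    factors[r][k]  : factor vector x_r^(k) of the input tensor
    lam[r]         : CP scale lambda_r
    local_fns[m][k]: local function f_m^(k), mapping a vector to a scalar
    """
    return sum(
        lam[r] * np.prod([fns_m[k](x) for k, x in enumerate(x_r)])
        for fns_m in local_fns
        for r, x_r in enumerate(factors)
    )

# Toy instance with M* = 2, R* = 1, K = 2 (hypothetical local functions)
factors = [[np.array([1.0, 0.0]), np.array([0.0, 1.0])]]
lam = [2.0]
local_fns = [
    [lambda x: np.tanh(x.sum()), lambda x: x @ x],
    [lambda x: np.cos(x[0]),     lambda x: np.sin(x[0])],
]
y = amnr(factors, lam, local_fns)  # 2 * tanh(1) here, since the m = 2 term vanishes
```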
Estimation Method
The Bayes method with a Gaussian process prior.

Prior:

    π(f) = ∏_m ∏_k GP^{(k)}(f_m^{(k)})

Posterior:

    π(f | D_n) = [ exp(−Σ_{i=1}^n (Y_i − G[f](X_i))²) / ∫ exp(−Σ_{i=1}^n (Y_i − G[f'](X_i))²) π(df') ] π(f),

where

    G[f](X_i) := Σ_{m=1}^{M} Σ_{r=1}^{R} ∏_{k=1}^{K} f_m^{(k)}(x_{r,i}^{(k)}).

Implementation: the estimation is based on Gibbs sampling.
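The GP prior over each local function can be sketched by drawing one sample at finitely many inputs; the squared-exponential kernel and its length-scale are assumptions here, as the slides do not specify the covariance.

```python
import numpy as np

rng = np.random.default_rng(0)

def rbf_kernel(A, B, ell=1.0):
    # Squared-exponential covariance between rows of A and B
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2.0 * ell ** 2))

Ik, n_pts = 3, 50
xs = rng.standard_normal((n_pts, Ik))             # I_k-dimensional inputs x_r^(k)
Kmat = rbf_kernel(xs, xs) + 1e-8 * np.eye(n_pts)  # jitter for numerical stability
f_sample = rng.multivariate_normal(np.zeros(n_pts), Kmat)  # one draw of f_m^(k)
```

A Gibbs sweep would alternate such conditional GP draws over the m and k indices given the other local functions.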
Analyze Convergence Theoretically
In the AMNR form, M* controls the size of the bias: as M* increases, the bias decreases.

We focus on the distance between f̂_n and f* for a given M*. We start with the case where a finite M* is sufficient to represent f; then we consider the case where M* is infinite.
Finite M* Case

In the following, we assume that
(i) f belongs to a Sobolev space with order β;
(ii) the parameters of the prior are appropriately selected.

Theorem 1
Let M* < ∞. Then, with some finite constant C > 0,

    E‖f̂_n − f*‖²_n ≤ C n^{−2β/(2β+max_k I_k)}.

Recall that the naive estimator f̃_n has convergence rate n^{−2β/(2β+∏_k I_k)}.
Infinite M* Case

With infinite M*, we estimate the first M components under an additional assumption.

Theorem 2
Assume that, with some constant γ ≥ 1,

    ‖Σ_r λ_r ∏_k f_m^{(k)}‖_2 = o(m^{−γ−1}),  as m → ∞.

Suppose we construct the estimator with complexity M chosen such that

    M ≍ (n^{−2β/(2β+max_k I_k)})^{1/(1+γ)}.

Then, with some finite constant C > 0,

    E‖f̂_n − f*‖²_n ≤ C (n^{−2β/(2β+max_k I_k)})^{γ/(1+γ)}.
Convergence Rate
Comparison of nonparametric methods for tensor regression. For the example column, we set K = 3, I_k = 100, β = γ = 2.

    Method                Conv. Rate                             Example
    Naive                 n^{−2β/(2β+∏_k I_k)}                   n^{−1/2501}
    AMNR (finite M*)      n^{−2β/(2β+max_k I_k)}                 n^{−1/26}
    AMNR (infinite M*)    (n^{−2β/(2β+max_k I_k)})^{γ/(1+γ)}     n^{−1/39}
AMNR achieves a better convergence rate by reducing the size of the model space via the double decomposition.
Experiment Outline
We present three experiments:
1. Prediction performance
2. Convergence analysis
3. Real data analysis

Methods
- AMNR (our method)
- TGP (Tensor Gaussian Process; Zhou et al. (2014)): close to the naive estimator
- TLR (Tensor Linear Regression; Zhao et al. (2013)): not a nonparametric method
Prediction Performance
Generate synthetic data with a low-rank tensor as

    f(X) = Σ_{r=1}^{2} λ_r ∏_{k=1}^{K} (1 + exp(−γ^T x_r^{(k)}))^{−1}
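A sketch of this generator, under assumed dimensions, scales λ_r, and coefficient vector γ (illustrative choices; the talk does not report its exact settings).

```python
import numpy as np

rng = np.random.default_rng(0)

K_modes, Ik, R = 3, 10, 2
lam = np.array([1.0, 0.5])           # hypothetical scales lambda_r
gamma = rng.standard_normal(Ik)      # hypothetical coefficient vector

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def f(factors):
    """factors[r][k]: CP factor vector x_r^(k) of the input tensor X."""
    return sum(
        lam[r] * np.prod([sigmoid(gamma @ factors[r][k]) for k in range(K_modes)])
        for r in range(R)
    )

factors = [[rng.standard_normal(Ik) for _ in range(K_modes)] for _ in range(R)]
y = f(factors)  # lies in (0, lam.sum()) since each sigmoid is in (0, 1)
```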
[Figure: test MSE vs. sample size n (100-500) for each method; a full view (MSE up to 2.5) and an enlarged view (MSE 0.04-0.20).]
Convergence analysis
Generate synthetic data with a smoothness-controlled process:

    f(X) = Σ_{r=1}^{R} ∏_{k=1}^{K} Σ_l μ_l φ_l(γ^T x)
[Figure: log MSE vs. log n for TGP and AMNR, on a symmetric tensor (10×10×10) and an asymmetric tensor (10×3×3).]
Real Data Analysis
Epidemic spreading data
- X_i: adjacency matrix of network i
- Y_i: total number of infected nodes in network i
[Figure: testing MSE vs. n (up to 200) for each method.]
Conclusion
- Proposed a nonparametric regression model with tensor input
- AMNR controls the complexity of the regression function by decomposing both the function and the input tensor
- This controls the bias-variance trade-off and avoids the curse of dimensionality of tensor input
- Its main limitation is computational complexity