SLIM, University of British Columbia
Curt Da Silva and Felix J. Herrmann, June 3, 2015
Off the grid tensor completion for seismic data interpolation
Motivation
3D seismic experiments -> 5D data
• expensive to acquire and store
• sampled at sub-Nyquist rates

Data exhibits low-rank structure
• exploit this structure for interpolation

Fully sampled data enables
• simultaneous sources in wave-equation based inversion
• mitigating multiples
Motivation
Standard matrix/tensor completion
• defined on a regularly spaced grid
• sampling removes points from this regular grid

Degraded results in practice
• very difficult to sample regularly in practice
• irregular "off-the-grid" sampling destroys low-rank behaviour
[Figure: 7.34 Hz, 90% missing receivers, irregular sampling. Common source gather: true data vs. subsampled data (receiver x vs. receiver y).]
[Figure: 7.34 Hz, 90% missing receivers, irregular sampling. Common source gather: true data vs. regularized tensor completion, SNR 11 dB.]
Context
Low-rank matrix/tensor completion via nuclear norm projection [1]
• requires SVDs on huge data matrices
• not scalable to large problem sizes

Data completion via Toeplitz embedding [2]
• problem size grows as (# data points)^2
• ad-hoc windowing can degrade quality, as demonstrated in [3]
[1] Kreimer and Sacchi, "A tensor higher-order singular value decomposition for prestack seismic data noise reduction and interpolation." (2012)
[2] Gao, Vicente, and Sacchi, "Evaluation of a fast algorithm for the eigen-decomposition of large block Toeplitz matrices with application to 5D seismic data interpolation." (2011)
[3] Da Silva, Kumar, et al., "Efficient matrix completion for seismic data reconstruction." (To appear)
Goals
Review the Hierarchical Tucker tensor format and the principles of low-rank tensor recovery

Compensate for the lack of low-rank behaviour in the irregular sampling domain
• introduce a new domain in which the data is low rank
Compressive sensing with sparsity promotion

Successful reconstruction scheme

Signal structure
• sparsity

Sampling
• subsampling decreases sparsity

Optimization
• look for the sparsest solution
Multidimensional interpolation with Hierarchical Tucker

Successful reconstruction scheme

Signal structure
• Hierarchical Tucker

Sampling
• subsampling and noise increase the hierarchical rank

Optimization
• fit the data in the Hierarchical Tucker format
Hierarchical Tucker format

An "SVD"-like decomposition for a tensor X ∈ R^(n1 × n2 × n3 × n4):

  X^(1,2) = U_12 B_1234 U_34^T,

where X^(1,2) is the n1 n2 × n3 n4 matricization of X, U_12 is n1 n2 × k12, B_1234 is k12 × k34, and U_34 is n3 n4 × k34.

Each intermediate factor is recursively factored in turn; with column-major vectorization,

  U_12 = (U_2 ⊗ U_1) B_12,

with U_1 ∈ R^(n1 × k1), U_2 ∈ R^(n2 × k2), and transfer matrix B_12 ∈ R^(k1 k2 × k12) (similarly for U_34).
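The two-level factorization above can be composed in a few lines of NumPy. This is only an illustrative sketch: the sizes n, k and the random factors are hypothetical, chosen to make the shapes visible.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 10, 3  # hypothetical dimension size and hierarchical rank

# Leaf factors U_t and transfer matrices B_t of the dimension tree
U1, U2, U3, U4 = (rng.standard_normal((n, k)) for _ in range(4))
B12 = rng.standard_normal((k * k, k))    # couples {1} and {2} into {1,2}
B34 = rng.standard_normal((k * k, k))    # couples {3} and {4} into {3,4}
B1234 = rng.standard_normal((k, k))      # root transfer matrix

# Intermediate matrices, formed only transiently (never stored in practice)
U12 = np.kron(U2, U1) @ B12              # (n*n, k): spans the {1,2} subspace
U34 = np.kron(U4, U3) @ B34

# Matricization X^(1,2) and the full tensor
X12 = U12 @ B1234 @ U34.T                # (n^2, n^2), rank at most k
X = X12.reshape(n, n, n, n, order="F")   # column-major, matching the Kronecker order
```

Note the (U_2 ⊗ U_1) ordering: with column-major (Fortran-order) vectorization, vec(U_1 C U_2^T) = (U_2 ⊗ U_1) vec(C).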
Hierarchical Tucker format

Intermediate matrices don't need to be stored

Small parameter matrices U_t, B_t
• specify the tensor completely

Separating groups of dimensions from each other
• dimension tree
Example

A. Uschmajew, B. Vandereycken. The geometry of algorithms using hierarchical tensors. Linear Algebra and its Applications, 2013.

[Fig. 3.1: A dimension tree of {1, 2, 3, 4, 5}. The root t_r = {1, 2, 3, 4, 5} has children t = {1, 2, 3} and {4, 5}; {1, 2, 3} splits into {1} and {2, 3}; {2, 3} splits into t_1 = {2} and t_2 = {3}; {4, 5} splits into {4} and {5}.]
3 The hierarchical Tucker decomposition

In this section, we define the set of HT tensors of fixed rank using the HT decomposition and the hierarchical rank. Much of this and related concepts were already introduced in [24,22,23], but we establish in addition some new relations on the parametrization of this set. In order to better facilitate the derivation of the smooth manifold structure in the next section, we have adopted a slightly different presentation compared to [24,22].

3.1 The hierarchical Tucker format

Definition 3.1 Given the order d, a dimension tree T is a non-trivial, rooted binary tree whose nodes t can be labeled (and hence identified) by elements of the power set P({1, 2, . . . , d}) such that

(i) the root has the label t_r = {1, 2, . . . , d}; and,
(ii) every node t ∈ T which is not a leaf has two sons t_1 and t_2 that form an ordered partition of t, that is,

  t_1 ∪ t_2 = t and μ < ν for all μ ∈ t_1, ν ∈ t_2.  (3.1)

The set of leaves is denoted by L. An example of a dimension tree for t_r = {1, 2, 3, 4, 5} is depicted in Figure 3.1.

The idea of the HT format is to recursively factorize subspaces of R^(n1 × n2 × ··· × nd) into tensor products of lower-dimensional spaces according to the index splittings in the tree T. If X is contained in such subspaces that allow for preferably low-dimensional factorizations, then X can be efficiently stored based on the next definition. Given dimensions n1, n2, . . . , nd, called spatial dimensions, and a node t ⊆ {1, 2, . . . , d}, we define the dimension of t as n_t = ∏_{μ ∈ t} n_μ.

Definition 3.2 Let T be a dimension tree and k = (k_t)_{t ∈ T} a set of positive integers with k_{t_r} = 1. The hierarchical Tucker (HT) format for tensors X ∈ R^(n1 × n2 × ··· × nd) is defined as follows.

(i) To each node t ∈ T, we associate a matrix U_t ∈ R^(n_t × k_t).
(ii) For the root t_r, we define U_{t_r} = vec(X).

[Diagram: HT parameters on the dimension tree: B_12345 at the root; U_123, B_123 and U_45, B_45 at the interior nodes; U_1, U_23, B_23, U_4, U_5 and U_2, U_3 below.]
Hierarchical Tucker format

Storage: dNK + (d − 2)K^3 + K^2 parameters, compared to N^d entries for the full tensor.

Effectively breaks the curse of dimensionality when K ≪ N and d ≥ 4.

Low-frequency data compresses well in HT.
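A quick sanity check of the storage count; the values of N, K, d below are hypothetical, picked only to show the order of magnitude.

```python
def ht_storage(N, K, d):
    # dNK for the leaf matrices, (d-2)K^3 for the interior transfer
    # tensors, K^2 for the root matrix
    return d * N * K + (d - 2) * K**3 + K**2

# Hypothetical sizes: an N^4 frequency slice with hierarchical rank K
N, K, d = 100, 20, 4
print(ht_storage(N, K, d), N**d)  # 24400 vs. 100000000, a ~4000x saving
```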
Seismic Hierarchical Tucker
We consider a 3D seismic survey with coordinates (src x, src y, rec x, rec y, time)
We take a Fourier transform in time and restrict ourselves to a single frequency slice
Seismic Hierarchical Tucker
For a frequency slice with coordinates (src x, src y, rec x, rec y),
there are essentially two choices of dimension splitting (by reciprocity).

First explored in [1]: low-rank solution operators for wave equations.

Canonical decomposition:
{src x, src y, rec x, rec y}
{src x, src y} {rec x, rec y}
{src x} {src y} {rec x} {rec y}

Non-canonical decomposition:
{src x, rec x, src y, rec y}
{src x, rec x} {src y, rec y}
{src x} {rec x} {src y} {rec y}

[1] L. Demanet. Curvelets, Wave Atoms, and Wave Equations. PhD Thesis, MIT.
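The two splittings correspond to two matricizations of the same 4D frequency slice. A small NumPy sketch (sizes are hypothetical) of how each is formed from a volume stored in the canonical (src x, src y, rec x, rec y) order:

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical small frequency slice, ordered (src_x, src_y, rec_x, rec_y)
nsx, nsy, nrx, nry = 6, 6, 8, 8
D = rng.standard_normal((nsx, nsy, nrx, nry))

# Canonical ordering: group sources against receivers
M_canon = D.reshape(nsx * nsy, nrx * nry)

# Non-canonical ordering: permute to (src_x, rec_x, src_y, rec_y) first,
# then group the first and last pairs of axes
M_noncanon = D.transpose(0, 2, 1, 3).reshape(nsx * nrx, nsy * nry)
```

Both matrices hold the same entries, only rearranged, which is why their singular value behaviour can differ so sharply.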
Matricizations

[Figure: (Rec x, Rec y) matricization, canonical ordering. Full data shown as a matrix with rows (x_rec, y_rec) and columns (x_src, y_src), alongside its normalized singular values.]
Matricizations

[Figure: (Src x, Rec x) matricization, non-canonical ordering. Full data shown as a matrix with rows (x_src, x_rec) and columns (y_src, y_rec), alongside normalized singular values for the (Rx, Ry) and (Sx, Rx) matricizations.]
Multidimensional interpolation with Hierarchical Tucker

Successful reconstruction scheme

Signal structure
• Hierarchical Tucker

Sampling
• subsampling and noise increase the hierarchical rank

Optimization
• fit the data in the Hierarchical Tucker format
Matrix Completion

Fully sampled matrix X:

  [ ∗ ∗ ∗ ∗ ∗ ]
  [ ∗ ∗ ∗ ∗ ∗ ]
  [ ∗ ∗ ∗ ∗ ∗ ]
  [ ∗ ∗ ∗ ∗ ∗ ]
  [ ∗ ∗ ∗ ∗ ∗ ]

[Plot: normalized singular values of X.]
Matrix Completion

Subsampled matrix A(X):

  [ ∗ ∗ ∗ 0 ∗ ]
  [ ∗ 0 0 ∗ 0 ]
  [ ∗ ∗ ∗ ∗ ∗ ]
  [ ∗ ∗ 0 ∗ ∗ ]
  [ 0 ∗ ∗ ∗ 0 ]

[Plot: normalized singular values of A(X).]
Sampling

Sampling individual points (x_src, y_src, x_rec, y_rec) from the data
• idealized recovery
• impossible to physically implement

Sampling receiver points (x_rec, y_rec) from the data
• less idealized
• possible to acquire data, e.g. ocean-bottom nodes
[Figure: Realistic recovery, 50% random receivers removed. (Rec x, Rec y) matricization, canonical ordering: subsampled data matrix (x_src, y_src vs. x_rec, y_rec) and normalized singular values, compared with the no-subsampling and 50%-missing-sources cases.]
[Figure: Realistic recovery, 50% random receivers removed. (Src x, Rec x) matricization, non-canonical ordering: subsampled data matrix (x_src, x_rec vs. y_src, y_rec) and normalized singular values, compared with the no-subsampling and 50%-missing-sources cases.]
Data organization

(rec x, rec y) organization
• high rank
• missing-receivers operator removes rows
• poor recovery scenario

(src x, rec x) organization
• low rank
• missing-receivers operator removes blocks
• closer to the ideal recovery scenario
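A small NumPy sketch (hypothetical sizes, random data) of why the two organizations interact so differently with the missing-receivers operator:

```python
import numpy as np

rng = np.random.default_rng(2)
ns, nr = 6, 8                               # hypothetical grid sizes per axis
D = rng.standard_normal((ns, ns, nr, nr))   # (src_x, src_y, rec_x, rec_y)
missing = rng.random((nr, nr)) < 0.5        # removed (rec_x, rec_y) pairs

# (rec x, rec y) organization: a missing receiver zeroes an entire row
M_rec = D.reshape(ns * ns, nr * nr).T.copy()
M_rec[missing.ravel(), :] = 0

# (src x, rec x) organization: the same receivers now hit scattered
# entries instead of full rows, closer to idealized random sampling
D2 = D.copy()
D2[:, :, missing] = 0
M_srcrec = D2.transpose(0, 2, 1, 3).reshape(ns * nr, ns * nr)
```

Deleting whole rows leaves no information at all about those receivers in the (rec x, rec y) matrix, whereas the scattered pattern in the (src x, rec x) matrix retains partial coverage everywhere.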
Nonuniform sampling

BG data set
• 68 × 68 sources, 150 m spacing
• 401 × 401 receivers, 25 m spacing

Consider two sampling situations
• regular 201 × 201 grid, 50 m spacing
• irregular 201 × 201 grid: random 25 m perturbations from the regular grid
• load true data from the irregular grid, to avoid generating inverse-crime data
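One plausible way to generate such an irregular grid along one axis (the 2D grid applies this per coordinate); the seed and jitter construction here are illustrative, not the exact procedure used for the BG data.

```python
import numpy as np

rng = np.random.default_rng(3)
n, dx = 201, 50.0                 # 201 points at 50 m nominal spacing
x_reg = np.arange(n) * dx

# Perturb each point by up to 25 m (one fine-grid spacing), so samples
# fall off the regular grid but keep their left-to-right ordering
x_irr = x_reg + rng.uniform(-25.0, 25.0, size=n)
```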
[Figure: regular receiver grid.]

[Figure: unstructured receiver grid.]
Regular vs irregular grid - singular values

[Figure: normalized singular values for the (source x, receiver x), source x, and receiver x matricizations. Black: regular grid; blue: irregular grid.]
Off the grid tensor interpolation

The data volume is no longer low rank when irregularly sampled
• the standard tensor completion framework won't work well

Solution
• construct a domain where the data is low rank
• choose an appropriate transform: low-rank domain -> sampling domain
• incorporate the transform into the optimization problem
Multidimensional interpolation with Hierarchical Tucker

Successful reconstruction scheme

Signal structure
• Hierarchical Tucker

Sampling
• subsampling and noise increase the hierarchical rank

Optimization
• fit the data in the Hierarchical Tucker format
Optimization program

Φ(x) : parameter space -> full-tensor space

Optimization problem

The standard problem we solve is

  min_x ‖A Φ(x) − b‖²₂

Our sampling operator is typically A = RP, where

  R : regular full grid -> subsampled grid
  P : (src x, rec x, src y, rec y) -> (src x, src y, rec x, rec y)
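On a dense array, R and P amount to a transpose followed by a column restriction. A minimal sketch with hypothetical small sizes:

```python
import numpy as np

rng = np.random.default_rng(4)
ns, nr = 4, 6
x = rng.standard_normal((ns, nr, ns, nr))   # (src_x, rec_x, src_y, rec_y)

# P : (src x, rec x, src y, rec y) -> (src x, src y, rec x, rec y)
Px = x.transpose(0, 2, 1, 3)

# R : restrict the regular full grid to the observed (rec x, rec y) pairs
obs = np.flatnonzero(rng.random(nr * nr) < 0.5)
b = Px.reshape(ns * ns, nr * nr)[:, obs]    # b = A(x) = R P x
```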
Optimization problem

In the irregular grid case, the subsampling operator is in fact

  R : irregular full grid -> subsampled grid

To take this discrepancy into account, we introduce an operator

  F : regular full grid -> irregular full grid

This is an extension of [1] to the tensor case.

[1] R. Kumar, O. Lopez, E. Esser, and F. J. Herrmann. Matrix completion on unstructured grids: 2-D seismic data regularization and interpolation. EAGE 2015.
Optimization problem

The sequence of operators is then

  R : irregular full grid -> subsampled grid
  F : regular full grid -> irregular full grid
  P : (src x, rec x, src y, rec y) -> (src x, src y, rec x, rec y)

We set A = RFP and use the same optimization code as previously in [1].

In our examples, we use the non-uniform fast Fourier transform [2] for F.

[1] C. Da Silva and F. J. Herrmann. Optimization on the hierarchical Tucker manifold – applications to tensor completion. Linear Algebra and its Applications.
[2] L. Greengard and J.-Y. Lee. Accelerating the nonuniform fast Fourier transform. SIAM Review.
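For intuition, F evaluates the Fourier series of the regularly gridded data at the irregular points. A direct, non-fast stand-in for the type-2 transform (a real code would use the fast algorithm of [2]; this dense version only illustrates what is being computed):

```python
import numpy as np

def nudft_type2(c, x):
    """Evaluate f(x_j) = sum_k c_k exp(i k x_j) at irregular points x.
    Direct O(N*M) evaluation; a NUFFT computes the same result fast."""
    n = len(c)
    k = np.arange(-(n // 2), (n + 1) // 2)   # integer frequencies
    return np.exp(1j * np.outer(x, k)) @ c

# On a regular grid this reduces to an ordinary (inverse) DFT
n = 8
c = np.random.default_rng(5).standard_normal(n) + 0j
x_reg = 2 * np.pi * np.arange(n) / n
f_reg = nudft_type2(c, x_reg)
```

Evaluating instead at jittered points x_reg + δ produces the off-the-grid samples that R then restricts to the acquired receivers.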
Optimization problem

Main computational costs
• Gauss-Newton method -> convergence in ~15 iterations
• ~4 objective evaluations per iteration
• ~60 applications of the interpolation operator, the main source of computational cost
Results
Synthetic BG Group data
Unknown model
• 68 × 68 sources with 401 × 401 receivers, data at 7.34 Hz and 12.3 Hz

Receivers sampled on a randomly perturbed 201 × 201 grid

We compare to 'vanilla' tensor completion, where we simply bin the data onto a regular grid
[Figure: 7.34 Hz, 75% missing receivers. Common source gather: true data vs. subsampled data (receiver x vs. receiver y).]
[Figure: 7.34 Hz, 75% missing receivers. Common source gather: true data vs. vanilla tensor completion, SNR 9.46 dB.]
[Figure: 7.34 Hz, 75% missing receivers. Common source gather: true data vs. regularized tensor completion, SNR 15.7 dB.]
[Figure: 7.34 Hz, 75% missing receivers. Common source gather: vanilla vs. regularized tensor completion difference plots.]
[Figure: 12.3 Hz, 75% missing receivers. Common source gather: true data vs. subsampled data.]
[Figure: 12.3 Hz, 75% missing receivers. Common source gather: true data vs. vanilla tensor completion, SNR 4.45 dB.]
[Figure: 12.3 Hz, 75% missing receivers. Common source gather: true data vs. regularized tensor completion, SNR 11.4 dB.]
[Figure: 12.3 Hz, 75% missing receivers. Common source gather: vanilla vs. regularized tensor completion difference plots.]
[Figure: 7.34 Hz, 90% missing receivers. Common source gather: true data vs. subsampled data.]
[Figure: 7.34 Hz, 90% missing receivers. Common source gather: true data vs. vanilla tensor completion, SNR 6.95 dB.]
[Figure: 7.34 Hz, 90% missing receivers. Common source gather: true data vs. regularized tensor completion, SNR 11 dB.]
Conclusion
3D seismic data has an underlying structure, captured by the Hierarchical Tucker format, that we can exploit for interpolation

Different schemes for organizing the data matter greatly for recovery
Conclusion
We can interpolate HT tensors with missing entries using the Riemannian manifold structure of the HT format

It is important and worthwhile to account for the off-grid nature of sampling when interpolating data
• binning just doesn't cut it
Acknowledgements
This work was in part financially supported by the Natural Sciences and Engineering Research Council of Canada Discovery Grant (22R81254) and the Collaborative Research and Development Grant DNOISE II (375142-08). This research was carried out as part of the SINBAD II project with support from the following organizations: BG Group, BGP, BP, CGG, Chevron, ConocoPhillips, ION, Petrobras, PGS, Total SA, WesternGeco, Statoil, and Woodside.
Thank you for your attention