Discussion led by Chunping Wang ECE, Duke University July 10, 2009
Simulation of the matrix Bingham-von Mises-Fisher distribution, with applications to multivariate and relational data
Discussion led by Chunping Wang
ECE, Duke University
July 10, 2009
Peter D. Hoff, to appear in Journal of Computational and Graphical Statistics
Outline
• Introduction and Motivations
• Sampling from the Vector von Mises-Fisher (vMF) Distribution (existing method)
• Sampling from the Matrix von Mises-Fisher (mMF) Distribution
• Sampling from the Bingham-von Mises-Fisher (BMF) Distribution
• One Example
• Conclusions
Introduction
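The matrix von Mises-Fisher distribution (linear term):
$p_{MF}(X \mid C) \propto \mathrm{etr}\{C^T X\}$
The matrix Bingham distribution (quadratic term):
$p_B(X \mid A, B) \propto \mathrm{etr}\{B X^T A X\}$
The matrix Bingham-von Mises-Fisher distribution:
$p_{BMF}(X \mid A, B, C) \propto \mathrm{etr}\{C^T X + B X^T A X\}$
Special cases: $A = 0$, $B = 0$ gives the matrix von Mises-Fisher distribution; $C = 0$ gives the matrix Bingham distribution.
Stiefel manifold: the set of rank-$R$, $m \times R$ orthonormal matrices, denoted $V_{R,m}$; $X \in V_{R,m}$.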
Motivations
Sampling orthonormal matrices from distributions is useful for many applications.
Examples:
• Factor analysis
$Y \in \mathbb{R}^{n \times p}$: observed matrix
$U$, $V$: latent
iid $N(0, \sigma^2)$ noise
Given uniform priors over the Stiefel manifold,
$p(X \mid C) \propto \mathrm{etr}\{C^T X\}$
Motivations
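• Principal components
$Y \in \mathbb{R}^{n \times p}$: observed matrix, with each row $y_i \sim$ iid $N_p(0, \Sigma)$
Eigenvalue decomposition: $\Sigma = U \Lambda U^T$
Likelihood: $p(Y \mid U, \Lambda) \propto \mathrm{etr}\{-(Y U \Lambda^{-1} U^T Y^T)/2\}$
Posterior with respect to uniform prior: $p(U \mid Y, \Lambda) \propto \mathrm{etr}\{-(\Lambda^{-1} U^T Y^T Y U)/2\}$
This has the matrix Bingham form $p(X \mid A, B) \propto \mathrm{etr}\{B X^T A X\}$ with $X = U$.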
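The step from the likelihood to the Bingham form rests on the cyclic property of the trace. The following is a minimal numerical check, not part of the original slides: the data are random, and the pairing $A = Y^T Y / 2$, $B = -\Lambda^{-1}$ is one assumed choice that reproduces the exponent (the slide does not state $A$ and $B$ for this example).

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 50, 4

# Hypothetical inputs, made up for this check only.
U, _ = np.linalg.qr(rng.standard_normal((p, p)))   # orthonormal U
lam_inv = np.diag(1.0 / rng.uniform(0.5, 2.0, p))  # Lambda^{-1}, diagonal
Y = rng.standard_normal((n, p))                    # observed n x p matrix

# Exponent of the likelihood: -tr(Y U Lambda^{-1} U^T Y^T) / 2
log_lik = -np.trace(Y @ U @ lam_inv @ U.T @ Y.T) / 2

# Exponent of the Bingham form etr{B X^T A X} with X = U and the
# assumed (not stated on the slide) choice A = Y^T Y / 2, B = -Lambda^{-1}.
A = Y.T @ Y / 2
B = -lam_inv
log_bingham = np.trace(B @ U.T @ A @ U)

print(np.isclose(log_lik, log_bingham))
```

Both exponents agree up to floating-point error, which is all the Bingham identification requires.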
Motivations
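• Network data
$Y$: $m \times m$ symmetric binary observed matrix, with $y_{ij}$ the 0-1 indicator of a link between nodes $i$ and $j$.
$E$: symmetric matrix of independent standard normal noise
Posterior with respect to uniform prior: $p(U \mid Z, \Lambda) \propto \mathrm{etr}\{\Lambda U^T Z U/2\}$
This has the matrix Bingham form $p(X \mid A, B) \propto \mathrm{etr}\{B X^T A X\}$ with $X = U$.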
Sampling from the vMF Distribution (Wood, 1994)
$p_{MF}(x \mid \kappa, \xi) \propto \exp\{\kappa \xi^T x\}$, $x \in S_m$
$\xi \in S_m$, the modal vector; $\kappa$, the concentration parameter
A distribution on the $(m-1)$-sphere in $\mathbb{R}^m$
$\xi^T x = \|\xi\| \|x\| \cos(\theta) = \cos(\theta)$: constant density for any given angle $\theta$ between $\xi$ and $x$
$\xi$ defines the modal direction.
Sampling from the vMF Distribution (Wood, 1994)
(1) A simple direction: $\xi = [0, \ldots, 0, 1]^T$, so $\xi^T x = x_m$
For uniformly distributed $v \in S_{m-1}$, $x = [\sqrt{1-w^2}\, v;\ w]$ is proved to be a sample from $MF(\kappa, \xi)$ when $w$ is a sample from $f(w) \propto (1-w^2)^{(m-3)/2} \exp(\kappa w)$.
Proposal envelope: $g(w) \propto (1-w^2)^{(m-3)/2} \{(1+b) - (1-b)w\}^{-(m-1)}$
(2) An arbitrary direction
For a fixed $m \times m$ orthogonal matrix $P$, $Px \sim MF(\kappa, P\xi)$.
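The two steps above can be sketched in code. This is a minimal sketch of the rejection sampler as commonly stated for Wood (1994), not code from the slides: the function name `sample_vmf`, the Beta-based proposal for $w$, and the use of a Householder reflection as the orthogonal matrix $P$ are choices made here for illustration.

```python
import numpy as np

def sample_vmf(xi, kappa, rng):
    """Draw one x ~ MF(kappa, xi) on the unit sphere in R^m (m >= 2)."""
    xi = np.asarray(xi, dtype=float)
    xi = xi / np.linalg.norm(xi)
    m = xi.size
    # Envelope parameter b and the constants of the acceptance test.
    b = (-2.0 * kappa + np.sqrt(4.0 * kappa**2 + (m - 1) ** 2)) / (m - 1)
    x0 = (1.0 - b) / (1.0 + b)
    c = kappa * x0 + (m - 1) * np.log(1.0 - x0**2)
    while True:
        z = rng.beta((m - 1) / 2.0, (m - 1) / 2.0)
        w = (1.0 - (1.0 + b) * z) / (1.0 - (1.0 - b) * z)
        u = 1.0 - rng.random()  # uniform on (0, 1]
        if kappa * w + (m - 1) * np.log(1.0 - x0 * w) - c >= np.log(u):
            break
    # (1) Simple direction: mode at e_m = [0, ..., 0, 1].
    v = rng.standard_normal(m - 1)
    v /= np.linalg.norm(v)  # uniform on the unit sphere in R^(m-1)
    x = np.concatenate([np.sqrt(1.0 - w**2) * v, [w]])
    # (2) Arbitrary direction: Householder reflection P with P e_m = xi.
    h = -xi.copy()
    h[-1] += 1.0  # h = e_m - xi
    nh = np.linalg.norm(h)
    if nh > 1e-12:
        h /= nh
        x = x - 2.0 * h * (h @ x)
    return x

rng = np.random.default_rng(0)
xi = np.array([1.0, 0.0, 0.0])
draws = np.array([sample_vmf(xi, 20.0, rng) for _ in range(2000)])
print(draws.mean(axis=0))  # should point roughly along xi
```

A reflection is used instead of a rotation only because it is short to write; any orthogonal $P$ that maps the simple direction to $\xi$ serves the same purpose.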
Sampling from the mMF Distribution
Rejection sampling scheme 1: uniform envelope
$p_{MF}(X \mid C) \propto \mathrm{etr}\{C^T X\}$, $X \in V_{R,m}$
Sample $X \sim g(X)$, $u \sim U(0, 1)$
Accept $X$ when $u$ is below $\dfrac{p_{MF}(X \mid C)}{M g(X)}$, where $M$ is a bound
[Figure: acceptance region and rejection region for $u$ between 0 and 1, as a function of $X$]
Extremely inefficient
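A minimal sketch of this scheme follows, under assumptions not stated on the slide: uniform draws on the Stiefel manifold are generated by a sign-corrected QR decomposition of a Gaussian matrix, and the bound uses the fact that the maximum of $\mathrm{tr}(C^T X)$ over orthonormal $X$ is the sum of the singular values of $C$. The function names are hypothetical.

```python
import numpy as np

def runif_stiefel(m, R, rng):
    """Uniform draw of an m x R orthonormal matrix via QR of a Gaussian matrix."""
    Q, T = np.linalg.qr(rng.standard_normal((m, R)))
    # Fix the column signs so the draw is uniform on the Stiefel manifold.
    return Q * np.sign(np.diag(T))

def rmf_reject_uniform(C, rng, max_tries=100_000):
    """Rejection sampler for p(X | C) prop. to etr{C^T X} with a uniform envelope."""
    m, R = C.shape
    # Log of the bound: max of tr(C^T X) is the sum of singular values of C.
    log_bound = np.linalg.svd(C, compute_uv=False).sum()
    for tries in range(1, max_tries + 1):
        X = runif_stiefel(m, R, rng)
        log_u = np.log(1.0 - rng.random())  # log of a uniform on (0, 1]
        if np.trace(C.T @ X) - log_bound >= log_u:
            return X, tries
    raise RuntimeError("no acceptance within max_tries")

rng = np.random.default_rng(0)
C = 0.5 * rng.standard_normal((4, 2))  # hypothetical, weakly concentrated
X, tries = rmf_reject_uniform(C, rng)
print(tries)
```

The acceptance probability decays like the exponential of minus the sum of singular values of $C$, which illustrates why the slide calls this scheme extremely inefficient for concentrated distributions.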
Sampling from the mMF Distribution
Rejection sampling scheme 2: based on sampling from vMF
Proposal samples $Y_{[,r]}$ are drawn from vMF density functions with parameter $H_{[,r]}$, constrained to be orthogonal to the other columns of $Y$.
• $N_r$: $N_r^T Y_{[,(1,\ldots,r-1)]} = 0$
• Rotate the modal direction
• Rotate the sample to be orthogonal to the previous columns
Proposal distribution:
$g(Y) = \prod_{r=1}^{R} p(Y_{[,r]} \mid Y_{[,(1,\ldots,r-1)]}) = \prod_{r=1}^{R} p_{MF}(N_r^T Y_{[,r]} \mid N_r^T H_{[,r]}) \propto \left\{\prod_{r=1}^{R} F(N_r^T H_{[,r]})\right\} \mathrm{etr}\{H^T Y\}$
Sampling from the mMF Distribution
Rejection sampling scheme 2: based on sampling from vMF
Sample scheme:
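A minimal sketch of the proposal draw only, assuming $m > R$. The accept/reject step is not reconstructed here because its acceptance ratio is not recoverable from the transcript. The helper names `rvmf`, `null_basis` and `propose` are hypothetical; `rvmf` repeats a Wood-style vMF sampler so that the block is self-contained.

```python
import numpy as np

def rvmf(c, rng):
    """Draw z with density proportional to exp(c^T z) on the unit sphere (dim >= 2)."""
    d, kappa = c.size, np.linalg.norm(c)
    b = (-2 * kappa + np.sqrt(4 * kappa**2 + (d - 1) ** 2)) / (d - 1)
    x0 = (1 - b) / (1 + b)
    const = kappa * x0 + (d - 1) * np.log(1 - x0**2)
    while True:
        beta = rng.beta((d - 1) / 2, (d - 1) / 2)
        w = (1 - (1 + b) * beta) / (1 - (1 - b) * beta)
        if kappa * w + (d - 1) * np.log(1 - x0 * w) - const >= np.log(1 - rng.random()):
            break
    v = rng.standard_normal(d - 1)
    v /= np.linalg.norm(v)
    z = np.concatenate([np.sqrt(1 - w**2) * v, [w]])
    if kappa == 0:
        return z
    h = -c / kappa
    h[-1] += 1.0  # h = e_d - c / ||c||
    nh = np.linalg.norm(h)
    if nh > 1e-12:
        h /= nh
        z = z - 2 * h * (h @ z)  # reflect the mode from e_d onto c / ||c||
    return z

def null_basis(Yprev, m):
    """Orthonormal N with N^T Yprev = 0 (identity when there are no previous columns)."""
    k = Yprev.shape[1]
    if k == 0:
        return np.eye(m)
    return np.linalg.svd(Yprev, full_matrices=True)[0][:, k:]

def propose(H, rng):
    """Column-by-column proposal: each column is vMF within the null space of earlier ones."""
    m, R = H.shape
    Y = np.zeros((m, R))
    for r in range(R):
        N = null_basis(Y[:, :r], m)
        z = rvmf(N.T @ H[:, r], rng)  # rotate the modal direction
        Y[:, r] = N @ z               # rotate the sample back, orthogonal to earlier columns
    return Y

rng = np.random.default_rng(0)
H = 2.0 * rng.standard_normal((5, 3))  # hypothetical parameter matrix
Y = propose(H, rng)
print(np.round(Y.T @ Y, 6))
```

The printed Gram matrix should be the identity, confirming that the proposal lies on the Stiefel manifold by construction.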
Sampling from the mMF Distribution
A Gibbs sampling scheme
Sample iteratively $X_{[,r]} \sim p(X_{[,r]} \mid X_{[,-r]})$
• Note that $z \in S_{m-R+1}$. When $m = R$, $z \in \{-1, 1\}$.
  Remedy: sampling two columns at a time
• Non-orthogonality among the columns of $C$ adds to the autocorrelation in the Gibbs sampler.
  Remedy: performing the Gibbs sampler on $Y \sim MF(UD)$, with $X = Y V^T$, where $C = U D V^T$ is the singular value decomposition of $C$
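The two points above can be written out as a short derivation. This is a sketch under assumed notation: $N$ is an orthonormal basis of the null space of $X_{[,-r]}$, so that $X_{[,r]} = N z$ with $z$ on the unit sphere $S_{m-R+1}$, and $C = U D V^T$ is the singular value decomposition of $C$.

```latex
% Full conditional of one column: only the r-th term depends on z.
\begin{align*}
\operatorname{etr}\{C^T X\}
  &= \exp\Big\{\sum_{k=1}^{R} C_{[,k]}^T X_{[,k]}\Big\}
   \;\propto\; \exp\{C_{[,r]}^T N z\}
   \quad\text{as a function of } z, \\
\text{so}\quad z \mid X_{[,-r]} &\sim \text{vMF with parameter } N^T C_{[,r]} .
\end{align*}
% Rotation remedy: with Y = X V (so X = Y V^T),
\begin{align*}
\operatorname{tr}(C^T X)
  = \operatorname{tr}(V D U^T Y V^T)
  = \operatorname{tr}(D U^T Y)
  = \operatorname{tr}\{(U D)^T Y\}.
\end{align*}
```

The columns of $UD$ are orthogonal, which is why running the sampler on $Y$ rather than $X$ is expected to reduce the autocorrelation.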
Sampling from the BMF Distribution
The vector Bingham distribution
$A$ symmetric
$p(y \mid E, \Lambda) \propto \exp\left(\sum_{i=1}^{m} \lambda_i y_i^2\right)$
• $y_i^2$: from variable substitution, rejection sampling or grid sampling
• Better mixing
• The density is symmetric about zero
Sampling from the BMF Distribution
The vector Bingham-von Mises-Fisher distribution
$p(y \mid E, \Lambda, d) \propto \exp\left(d^T y + \sum_{i=1}^{m} \lambda_i y_i^2\right)$
The density is no longer symmetric about zero, so the sign $s_i$ is no longer uniformly distributed on $\{-1, 1\}$. The update of $y_i^2$ and $s_i$ should be done jointly. The modified steps 2(b) and 2(c) draw from the corresponding conditionals, including $p(s_i \mid y_i^2, q_{-i})$.
Sampling from the BMF Distribution
The matrix Bingham-von Mises-Fisher distribution ($m \times R$)
$p_{BMF}(X \mid A, B, C) \propto \mathrm{etr}\{C^T X + B X^T A X\}$
Rewrite $X = \{N z, X_{[,-1]}\}$
$p(z \mid X_{[,-1]}) \propto \exp(C_{[,1]}^T N z + b_{1,1} z^T N^T A N z)$
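The conditional for $z$ can be checked numerically. The sketch below is not from the slides: it assumes $B$ is diagonal (which the appearance of the single entry $b_{1,1}$ suggests), uses random hypothetical parameters, and compares log-density differences so that the terms not involving $z$ cancel.

```python
import numpy as np

rng = np.random.default_rng(1)
m, R = 5, 3

# Hypothetical parameters for the check: symmetric A, diagonal B, arbitrary C.
A = rng.standard_normal((m, m))
A = (A + A.T) / 2
B = np.diag(rng.standard_normal(R))
C = rng.standard_normal((m, R))

# Fix the other columns X[,-1] and build N with N^T X[,-1] = 0.
X0, _ = np.linalg.qr(rng.standard_normal((m, R)))
rest = X0[:, 1:]
N = np.linalg.svd(rest, full_matrices=True)[0][:, R - 1:]

def log_bmf(X):
    """Log of etr{C^T X + B X^T A X}, up to the normalizing constant."""
    return np.trace(C.T @ X + B @ X.T @ A @ X)

def log_cond(z):
    """Log of exp(C[,1]^T N z + b_11 z^T N^T A N z)."""
    return C[:, 0] @ N @ z + B[0, 0] * (z @ N.T @ A @ N @ z)

def unit(v):
    return v / np.linalg.norm(v)

z1 = unit(rng.standard_normal(m - R + 1))
z2 = unit(rng.standard_normal(m - R + 1))
X1 = np.column_stack([N @ z1, rest])
X2 = np.column_stack([N @ z2, rest])

diff_full = log_bmf(X1) - log_bmf(X2)
diff_cond = log_cond(z1) - log_cond(z2)
print(np.isclose(diff_full, diff_cond))
```

Agreement of the two differences shows that, as a function of $z$, the matrix BMF density reduces to a vector BMF density.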
Sampling from the BMF Distribution
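The matrix Bingham-von Mises-Fisher distribution ($m \times R$)
Sample two columns at a time: $X = \{N z, X_{[,-(1,2)]}\}$
Parameterize 2-dimensional orthonormal matrices as $Z(\phi, s)$, $s \in \{1, -1\}$
[Figure: the columns $Z(\phi)_{[,1]}$ and $Z(\phi)_{[,2]}$ on the circle, for $s = 1$ and $s = -1$]
Uniform pairs on the circle: $\phi \sim$ Uniform$(0, 2\pi)$
$p(\phi, s) \propto p(Z(\phi, s))$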
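One way to make the parameterization concrete is sketched below. The explicit form of $Z(\phi, s)$ is an assumption (the matrix itself did not survive in the transcript): the first column is a point on the circle and the second is one of the two unit vectors orthogonal to it. Drawing $(\phi, s)$ on a grid is a simple stand-in used here, not necessarily the method of the paper, and the toy density is hypothetical.

```python
import numpy as np

def Z(phi, s):
    """Assumed form of a 2 x 2 orthonormal matrix with angle phi and sign s in {1, -1}."""
    return np.array([[np.cos(phi), s * np.sin(phi)],
                     [np.sin(phi), -s * np.cos(phi)]])

def sample_phi_s(log_p, rng, n_grid=720):
    """Draw (phi, s) with probability proportional to p(Z(phi, s)) on a grid of angles."""
    phis = (np.arange(n_grid) + 0.5) * 2 * np.pi / n_grid
    logw = np.array([[log_p(Z(phi, s)) for phi in phis] for s in (1, -1)])
    w = np.exp(logw - logw.max()).ravel()
    k = rng.choice(w.size, p=w / w.sum())
    return phis[k % n_grid], (1, -1)[k // n_grid]

# Toy 2 x 2 BMF log-density with hypothetical parameters.
rng = np.random.default_rng(0)
A2 = np.array([[1.0, 0.3], [0.3, -0.5]])
B2 = np.diag([2.0, 1.0])
C2 = np.array([[0.5, 0.0], [0.0, -0.5]])

def log_p(Zm):
    return np.trace(C2.T @ Zm + B2 @ Zm.T @ A2 @ Zm)

phi, s = sample_phi_s(log_p, rng)
print(np.round(Z(phi, s).T @ Z(phi, s), 6))
```

With this form, $s = -1$ gives rotations and $s = 1$ gives reflections, so the two signs together cover all 2-dimensional orthonormal matrices.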
Example: Eigenmodel estimation for network data
$Y$: $m \times m$ symmetric binary observed matrix, with $y_{ij}$ the 0-1 indicator of a link between nodes $i$ and $j$.
$E$: symmetric matrix of independent standard normal noise
Posterior with respect to uniform prior: $p(U \mid Z, \Lambda) \propto \mathrm{etr}\{\Lambda U^T Z U/2\}$
BMF distribution for $U$ with $A = Z/2$, $B = \Lambda$, $C = 0$
$R = 3$, $m = 270$
[Figure: samples from two independent Markov chains with different starting values]
Conclusions
• A sampling scheme for a family of exponential family distributions over the Stiefel manifold was developed;
• This enables Bayesian inference for these orthonormal matrices and allows prior information to be incorporated during the inference;
• The author described several applications and implemented the sampling scheme on a network data set.
References
• Andrew T. A. Wood. Simulation of the von Mises Fisher distribution. Comm. Statist. Simulation Comput., 23, 157-164, 1994
• G. Ulrich. Computer generation of distributions on the m-sphere. Appl. Statist., 33, 158-163, 1984
• J. G. Saw. A family of distributions on the m-sphere and some hypothesis tests. Biometrika, 65, 69-74, 1978