Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x...
Transcript of Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x...
![Page 1: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/1.jpg)
Proteins, Particles, and Pseudo-Max-
Marginals: A Submodular Approach
Jason Pacheco
Erik Sudderth
Department of Computer Science
Brown University, Providence RI
![Page 2: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/2.jpg)
x5
x6
x7
x8
x4
x3
x1
x2
Protein Side Chain Prediction
[ Image: Harder et al., BMC Informatics 2010 ]
1D – 4D Continuous state.
Estimate side chains from backbone.
![Page 3: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/3.jpg)
Reweighted Max-Product (RMP)
Message passing on discrete side chains
Pseudo-max-marginal:
Max-marginal:
Edge Appearance Probability
![Page 4: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/4.jpg)
Rotamer discretization
Penicillin Acylase Complex, Trp154 [Shapovalov & Dunbrack 2007]
Truth Rotamers
Fit to side chain marginal statistics
Fails to capture side chain placement…
300o
180o
60o
Rotamers
![Page 5: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/5.jpg)
Particle Max-Product (PMP)
Latent space is continuous…
…particle approximation of continuous
RMP messages.
![Page 6: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/6.jpg)
Particle Max-Product (PMP)
Augment
Particles
1
Sample new particles from proposals:
( Random Walk, Likelihood, Neighbor, … )
![Page 7: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/7.jpg)
Particle Max-Product (PMP)
Update RMP messages
on augmented particles.
Augment
Particles
1RMP
Update
2
Edge Appearance Probability
![Page 8: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/8.jpg)
Particle Max-Product (PMP)
Select subset of good particles…
Augment
Particles
1RMP
Update
2Select
Particles
3
…Need particle
selection method.
![Page 9: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/9.jpg)
Greedy PMP (G-PMP)
Select best particle, sample from random
walk Gaussian. [ Trinh ‘09, Peng ‘11 ]
Truth
Particles
Wrong
Naïve proposals do not exploit model.
![Page 10: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/10.jpg)
Top-N PMP (T-PMP)
Truth
Particles
Wrong
Select N-best particles ranked by pseudo-
max-marginal values. [ Besse ’12, Pacheco ‘14 ]
Particles collapse to single solution.
![Page 11: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/11.jpg)
Diverse PMP (D-PMP)
Diverse
States
Truth
Particles
Select particles to preserve messages.
Encourages particle diversity
Robust to initialization
Diverse
States
![Page 12: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/12.jpg)
Diverse Particle SelectionAt node select particles to minimize
maximum outgoing message error:
RMP message over subset:
Binary Selection Vector
Approximate IP with greedy algorithm.
![Page 13: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/13.jpg)
Diverse Particle Selection
Pacheco et al. ICML 2014Pose Estimation
Good empirical results
Difficult to analyze
Limited to tree-structured MRFs
![Page 14: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/14.jpg)
Diverse Particle Selection
Equivalent to minimizing norm.
Consider other norms, e.g. :
Property 1: Message error upper bounds
pseudo-max-marginal error:
Easier to analyze…
![Page 15: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/15.jpg)
Submodular Particle Selection
Set function is submodular
iff diminishing marginal gains.
Property 2: Selection IP equivalent to
submodular maximization.
P. 3: Efficient LAZYGREEDY selection within
factor of optimal value.
Margin
![Page 16: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/16.jpg)
LAZYGREEDY Selection
MessageJoint Probability Margin
Selection Objective:
![Page 17: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/17.jpg)
LAZYGREEDY Selection
Joint Probability Message Margin
Selection Objective:
![Page 18: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/18.jpg)
LAZYGREEDY Selection
Joint Probability Message Margin
Selection Objective:
![Page 19: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/19.jpg)
LAZYGREEDY Selection
Joint Probability Message Margin
Selection Objective:
![Page 20: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/20.jpg)
LAZYGREEDY Selection
Joint Probability Message Margin
Selection Objective:
![Page 21: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/21.jpg)
LAZYGREEDY Selection
Joint Probability Message Margin
Selection Objective:
![Page 22: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/22.jpg)
Protein Side Chain Prediction
Pairwise Markov random field (MRF):
[ Image: Harder et al., BMC Informatics 2010 ]
Estimate side chain for fixed backbone
Gaussian Mixture Lennard-Jones
1D to 4D continuous states.
![Page 23: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/23.jpg)
Protein Side Chain Prediction
![Page 24: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/24.jpg)
Protein Side Chain Prediction
![Page 25: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/25.jpg)
Protein Side Chain Prediction
Rosetta simulated annealing [Rohl et al., 2004]
20 Proteins
(11 Runs) 370 Proteins
G-PMP, T-PMP, D-PMP , D-PMP
Log-probability of MAP estimate for…
![Page 26: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/26.jpg)
Rosetta
G-PMPT-PMP
D-PMPD-PMP
Protein Side Chain Prediction
Root mean square deviation (RMSD)
from x-ray structure.
Oracle selects best configuration in
current particle set.
![Page 27: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/27.jpg)
Optical Flow
Estimate 2D motion for every superpixel.
Middlebury optical flow benchmark[Baker et al. 2011]
![Page 28: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/28.jpg)
Optical Flow
Estimate 2D motion for every superpixel.
Middlebury optical flow benchmark[Baker et al. 2011]
![Page 29: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/29.jpg)
Optical Flow
Estimate 2D motion for every superpixel.
Middlebury optical flow benchmark[Baker et al. 2011]
![Page 30: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/30.jpg)
Optical Flow
Flow ambiguity near object boundaries…
D-PMP particles reflect this.
D-PMP Particles D-PMP Estimate
![Page 31: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/31.jpg)
Optical Flow
D-PMP accuracy equivalent to Classic-C [Sun et al. 2014]
Training Test
![Page 32: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/32.jpg)
Summary
General purpose particle-based max-
product for continuous graphical
models with cycles.
Code Available: cs.brown.edu/~pachecoj
![Page 33: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/33.jpg)
![Page 34: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/34.jpg)
Diverse Particle Selection
Minimize sum of errors ( ):
Augmented
Messages
Subset
Messages
Selection
Vector
Easier to analyze than selection…
P1: Message error upper bounds pseudo-
max-marginal error:
![Page 35: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/35.jpg)
A function is submodular iff diminishingmarginal gains:
Diverse particle selection is submodular maximization with cardinality constraint
Efficient greedy approximation algorithm
Submodularity
![Page 36: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/36.jpg)
Resolving Ties
Particle diversity leads to more conflicts:Side Chain Particles
T-PMP D-PMP
![Page 37: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/37.jpg)
Submodular Particle SelectionAugmented
Messages
Property 2: IP is equivalent to submodular
maximization subject to cardinality constraints
Property 1: Message reconstruction error
bounds pseudo-max-marginal error:
Subset
Messages
Selection
Vector
![Page 38: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/38.jpg)
Submodular Particle Selection
Select particles to minimize sum of errors:
Augmented
Messages
Subset
Messages
Selection
Vector
Property 1: Message error
bounds pseudo-max-marginal:
Property 2: Equivalent to
submodular maximization
subject to cardinality constraints
Good empirical results and we can analyze!
![Page 39: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/39.jpg)
Reweighted Max-Product (RMP)
Message passing on discrete
side chain states.
But latent space
is continuous…
Edge Appearance Probability
RMP Messages:
Pseudo-max-marginal:
![Page 40: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/40.jpg)
Optical Flow
Estimate motion vector for every pixel.
Select Top-N Particles (T-PMP)Diverse Particle Selection (D-PMP)
![Page 41: Proteins, Particles, and Pseudo-Max- Marginals: A ...pachecoj.com/pubs/icml15.pdf · x 5 x 6 x 7 x 8 x 4 x 3 x 1 x 2 Protein Side Chain Prediction [ Image: Harder et al., BMC Informatics](https://reader033.fdocuments.in/reader033/viewer/2022051913/600400930bd31f01f0073360/html5/thumbnails/41.jpg)
Diverse Particle Selection
Minimize maximum
message error ( ):Pose Estimation
Augmented
Messages
Subset
Messages
Selection Vector
Good empirical results
No analysis/guarantees
Limited to tree MRFs
[ Pacheco et al., ICML 2014 ]