Clustering the Temporal Sequences of 3D Protein Structure
Mayumi Kamada +*, Sachi Kimura, Mikito Toda, Masami Takata +, Kazuki Joe +
+ Graduate School of Humanities and Science, Information and Computer Sciences, Nara Women's University
Department of Physics, Nara Women's University

Transcript

  • Slide 1
  • Clustering the Temporal Sequences of 3D Protein Structure. Mayumi Kamada +*, Sachi Kimura, Mikito Toda, Masami Takata +, Kazuki Joe +. (+ Graduate School of Humanities and Science, Information and Computer Sciences, Nara Women's University; Department of Physics, Nara Women's University)
  • Slide 2
  • Outline: Motivation; Flexibility Docking; Feature Extraction using Motion Analysis; Conclusions and Future Work
  • Slide 3
  • Motivation: Proteins are biological molecules. Docking: a protein changes its own shape and combines with other materials. Predicting docking makes it possible to predict the resulting functions.
  • Slide 4
  • Existing Docking Simulation: Structure A and structure B are taken from the PDB* as rigid structures and passed to a docking simulation, which outputs the predicted docked structures. Proteins fluctuate in living cells, so prediction accuracy is low. Goal: a docking simulation that considers fluctuations. (* PDB: Protein Data Bank)
  • Slide 5
  • Flexibility Docking: Flexibility handling is placed between the PDB structures (structure A, structure B) and the docking simulation so that the fluctuation of proteins in living cells is considered. Flexibility handling extracts fluctuated structures, and the docking simulation then predicts docked structures with the structural fluctuation of the proteins taken into account.
  • Slide 6
  • Flexibility Handling: Flexibility handling consists of MD followed by a filter. Molecular dynamics simulation (MD) simulates the motion of molecules in a polyatomic system and produces many output files. Filtering selects representative structures from among similar structures. The filters are created using RMSD.
  • Slide 7
  • Filters using RMSD: RMSD (Root Mean Square Deviation) compares the similarity of two structures. Two filtering algorithms are proposed: the maximum RMSD selection filter and the below RMSD 1 deletion filter. Result: useful for the heat fluctuation condition. Limitation of RMSD: topology information is unified into a single value, so information is lost. Hence feature extraction that focuses on protein motion, not structure.
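To make the RMSD comparison on this slide concrete, here is a minimal sketch in plain Python. It assumes the two structures are already superimposed and list the same atoms in the same order. The function names `rmsd` and `below_threshold_filter` are hypothetical, and the greedy filter is only one possible reading of the "below RMSD 1 deletion filter", not the authors' algorithm.

```python
import math


def rmsd(a, b):
    """RMSD between two structures given as equal-length lists of (x, y, z).

    Assumes the structures are already aligned (no superposition is done).
    """
    if len(a) != len(b) or not a:
        raise ValueError("structures must be non-empty and of equal length")
    squared = sum(
        (p - q) ** 2
        for atom_a, atom_b in zip(a, b)
        for p, q in zip(atom_a, atom_b)
    )
    return math.sqrt(squared / len(a))


def below_threshold_filter(structures, threshold=1.0):
    """Hypothetical greedy filter: keep a structure only if its RMSD to
    every structure kept so far is at least `threshold`."""
    kept = []
    for idx, candidate in enumerate(structures):
        if all(rmsd(candidate, structures[j]) >= threshold for j in kept):
            kept.append(idx)
    return kept


# Toy coordinates, invented for illustration only.
base = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
near = [(x, y, z + 0.5) for x, y, z in base]  # RMSD 0.5 from base
far = [(x, y, z + 2.0) for x, y, z in base]   # RMSD 2.0 from base
kept_indices = below_threshold_filter([base, near, far], threshold=1.0)
```

Because every atom contributes to one averaged number, two quite different deformations can give the same RMSD, which is the information loss the slide points to.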
  • Slide 8
  • Capture Protein Motion: The pipeline is MD, then wavelet transform, then clustering. Feature extraction uses the continuous wavelet transform with the Morlet wavelet, because the frequency may change momentarily. Clustering uses Affinity Propagation to select representative motions.
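The wavelet step can be illustrated with a direct, unoptimized continuous wavelet transform. This is a sketch under stated assumptions, not the authors' implementation: the Morlet parameter `w0 = 6`, the approximate scale-to-frequency relation `s = w0 / (2*pi*f)`, the 1/sqrt(s) normalization, and the test signal are all choices made here for illustration.

```python
import cmath
import math


def morlet(t, w0=6.0):
    """Morlet mother wavelet (without the small correction term)."""
    return math.pi ** -0.25 * cmath.exp(1j * w0 * t) * math.exp(-0.5 * t * t)


def cwt_morlet(signal, dt, frequencies, w0=6.0):
    """Direct CWT by summation: one row of complex coefficients per frequency.

    Cost is O(len(frequencies) * len(signal)**2), fine for short series.
    """
    n = len(signal)
    rows = []
    for f in frequencies:
        scale = w0 / (2.0 * math.pi * f)  # approximate scale for frequency f
        row = []
        for b in range(n):  # b is the time shift, in samples
            acc = 0j
            for k in range(n):
                acc += signal[k] * morlet((k - b) * dt / scale, w0).conjugate()
            row.append(acc * dt / math.sqrt(scale))
        rows.append(row)
    return rows


# Hypothetical test signal: a 5 Hz sine sampled at 50 Hz for 4 seconds.
dt = 0.02
signal = [math.sin(2.0 * math.pi * 5.0 * k * dt) for k in range(200)]
freqs = [2.0, 5.0, 10.0]
coeffs = cwt_morlet(signal, dt, freqs)
power_mid = [abs(row[100]) ** 2 for row in coeffs]  # power at the middle sample
```

Each row gives the strength of one frequency at every time shift, so a frequency that changes over time shows up as power moving between rows, which a single Fourier spectrum of the whole series would not show.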
  • Slide 9
  • Target Protein: 1TIB; residue length: 269. MD simulation: software: AMBER; simulation run time: 2 nsec; result data files: 200; data: space coordinates of Cα atoms.
  • Slide 10
  • Singular Value Decomposition: SVD (singular value decomposition). Definition: A = UΣV*, where Σ is the diagonal matrix of singular values. Unitary matrix U: left singular vectors (spatial motion). Unitary matrix V: right singular vectors (frequency fluctuation). Matrix A is built at each time step i (t_i); one axis indexes the Cα coordinate components and the other indexes frequency. Matrix size of A: 807 × 199.
  • Slide 12
  • Verification of Reproducibility: singular values and principal components. [Figure panels: left singular vectors (spatial motion) for N = 1, 4, 6, 8; right singular vectors (frequency fluctuation) for M = 1, 4, 6, 8]
  • Slide 13
  • Reproducibility: Using eight principal components, the motion expressed by all 199 components can be reproduced. The two agree almost exactly.
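Reproducing the motion from a few principal components corresponds to a truncated SVD. As a sketch in standard notation (the symbols \sigma_k, u_k, v_k, r and A_N are introduced here for illustration and do not appear in the slides), with singular values ordered \sigma_1 \ge \sigma_2 \ge \dots and r the rank of A:

```latex
A = U \Sigma V^{*} = \sum_{k=1}^{r} \sigma_k \, u_k v_k^{*},
\qquad
A_N = \sum_{k=1}^{N} \sigma_k \, u_k v_k^{*},
\qquad
\lVert A - A_N \rVert_F^{2} = \sum_{k=N+1}^{r} \sigma_k^{2}.
```

The last identity says the squared reconstruction error of the rank-N approximation is the sum of the squared discarded singular values, so the approximation is good when the singular values decay quickly.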
  • Slide 14
  • Examination: plots of (1) each singular value and (2) the first singular value. The first singular value accounts for more than about 30%. The original motion can be expressed with six singular values. The first singular value is useful.
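The share that each singular value accounts for can be computed as below. This is a sketch under assumptions: the slides do not say whether the share is taken over the singular values or over their squares (the plain values are used here), the function names are hypothetical, and the numbers are invented for illustration rather than taken from the 1TIB data.

```python
def contribution_ratios(singular_values):
    """Share of each singular value in the total (assumed definition)."""
    total = sum(singular_values)
    return [s / total for s in singular_values]


def components_needed(singular_values, threshold):
    """Smallest N whose leading N singular values reach `threshold` of the total."""
    cumulative = 0.0
    for n, ratio in enumerate(contribution_ratios(singular_values), start=1):
        cumulative += ratio
        if cumulative >= threshold:
            return n
    return len(singular_values)


# Invented singular values, sorted in decreasing order.
example_values = [6.0, 4.0, 3.0, 2.0, 2.0, 1.0, 1.0, 1.0]
ratios = contribution_ratios(example_values)
n_for_80_percent = components_needed(example_values, 0.8)
```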
  • Slide 15
  • Clustering Analysis: Focus on the first principal component. Define the similarities and the preference, then cluster using these values.
  • Slide 16
  • Similarities (1): For left singular vectors, the difference in spatial direction is measured with inner products. Similarity: [equation missing in transcript; it distinguishes the same direction from a different direction]. K_ij: Value 10 C
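Since the similarity formula itself is missing from the transcript, the following only illustrates the general idea of comparing spatial directions by inner products. It uses the normalized inner product (cosine) as an assumed stand-in; the function names are hypothetical and the authors' actual definition may differ.

```python
import math


def direction_similarity(u, v):
    """Normalized inner product: 1 for the same direction, -1 for opposite."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("zero vector has no direction")
    return dot / (norm_u * norm_v)


def similarity_matrix(vectors):
    """Pairwise similarities between all vectors (e.g. one per time step)."""
    return [[direction_similarity(u, v) for v in vectors] for u in vectors]


# Toy vectors, invented for illustration.
same = direction_similarity([1.0, 0.0, 0.0], [2.0, 0.0, 0.0])
opposite = direction_similarity([1.0, 0.0, 0.0], [-3.0, 0.0, 0.0])
orthogonal = direction_similarity([1.0, 0.0, 0.0], [0.0, 5.0, 0.0])
```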
  • Slide 17
  • Similarities (2): For right singular vectors, the difference between spectrum distributions is measured with the Hellinger distance. Similarity: [equation missing in transcript]
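The Hellinger distance between two discrete distributions has a standard definition, sketched below. How the authors turn a right singular vector into a distribution and a distance into a similarity is not recoverable from the transcript; normalizing non-negative spectrum values to sum to one and using the negative distance as the similarity are assumptions made here, and the function names are hypothetical.

```python
import math


def hellinger_distance(p, q):
    """Hellinger distance between two non-negative spectra.

    Both inputs are normalized to sum to 1 first; the result lies in [0, 1].
    """
    if len(p) != len(q):
        raise ValueError("spectra must have the same length")
    sum_p, sum_q = sum(p), sum(q)
    if sum_p <= 0.0 or sum_q <= 0.0:
        raise ValueError("spectra must have positive total weight")
    squared = sum(
        (math.sqrt(a / sum_p) - math.sqrt(b / sum_q)) ** 2 for a, b in zip(p, q)
    )
    return math.sqrt(squared) / math.sqrt(2.0)


def spectrum_similarity(p, q):
    """Assumed similarity: the negative distance, so identical spectra score 0."""
    return -hellinger_distance(p, q)


# Toy spectra, invented for illustration.
identical = hellinger_distance([1.0, 2.0, 1.0], [2.0, 4.0, 2.0])
disjoint = hellinger_distance([1.0, 0.0], [0.0, 1.0])
```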
  • Slide 18
  • Clustering Method: Affinity propagation (AP). Brendan J. Frey and Delbert Dueck, "Clustering by Passing Messages Between Data Points", Science 315, 972-976, 2007. AP obtains exemplars (cluster centers). Preference: for left singular vectors, the average of the similarities; for right singular vectors, an expression in the minimum and maximum of the similarities (garbled in the transcript: "minimum of similarities maximum of similarities minimum").
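A compact sketch of affinity propagation, following the responsibility and availability message updates of Frey and Dueck, may help show how exemplars emerge from a similarity matrix and a preference. The damping factor, the fixed iteration count, the function name and the toy data are assumptions made for illustration; the preference is set to the average off-diagonal similarity, as the slide describes for left singular vectors.

```python
def affinity_propagation(S, damping=0.9, iterations=1000):
    """Cluster from a square similarity matrix S with preferences on its diagonal.

    Returns (exemplars, labels); labels[i] is the exemplar index of point i.
    Requires at least two points.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)
    for _ in range(iterations):
        for i in range(n):
            vals = [A[i][k] + S[i][k] for k in range(n)]
            first = max(range(n), key=vals.__getitem__)
            best = vals[first]
            second = max(v for k, v in enumerate(vals) if k != first)
            for k in range(n):
                rival = second if k == first else best
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - rival)
        for k in range(n):
            pos = [max(0.0, R[i][k]) for i in range(n)]
            pos[k] = R[k][k]  # self-responsibility enters unclipped
            total = sum(pos)
            for i in range(n):
                if i == k:
                    new = total - R[k][k]
                else:
                    new = min(0.0, total - pos[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new
    exemplars = [k for k in range(n) if R[k][k] + A[k][k] > 0]
    if not exemplars:
        raise RuntimeError("no exemplars found; try another preference")
    labels = [
        i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
        for i in range(n)
    ]
    return exemplars, labels


# Toy 1D data (invented): two well-separated groups.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 12.0]
n = len(points)
S = [[-(points[i] - points[j]) ** 2 for j in range(n)] for i in range(n)]
off_diag = [S[i][j] for i in range(n) for j in range(n) if i != j]
preference = sum(off_diag) / len(off_diag)  # average similarity
for i in range(n):
    S[i][i] = preference
exemplars, labels = affinity_propagation(S)
```

A lower (more negative) preference tends to yield fewer exemplars and a higher one more, which is why the choice of preference is stated separately for the two kinds of singular vectors.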
  • Slide 19
  • Similarities between Left Singular Vectors
  • Slide 20
  • Clustering of Left Singular Vectors
  • Slide 21
  • Similarities between Right Singular Vectors
  • Slide 22
  • Clustering of Right Singular Vectors
  • Slide 23
  • Discussions: Each motion: spatial motion repeats several similar spatial motions over time; frequency fluctuation repeats similar frequency patterns over time. Relationship: characteristic frequency fluctuation and group transition in spatial motion.
  • Slide 24
  • Conclusions and Future Work: Flexibility docking: flexibility handling with MD and a filter. Feature extraction based on motion: wavelet analysis; analysis of motions; clustering. Future work: collective motion; relationship; perform the docking simulation.