Subtrees Comparison of Phylogenetic Trees with Applications to Two Component Systems Sequence Classifications in Bacterial Genome
Yaw-Ling Lin 1, Ming-Tat Ko 2
1 Dept. of Computer Science & Information Management, Providence University, Taichung, Taiwan
2 Institute of Information Science, Academia Sinica, Taipei, Taiwan
Yaw-Ling Lin, Providence, Taiwan 2
Motivation – Where do the problems come from?
Two-Component System
• Two-component systems (2CS):
  – Sensor histidine kinase
  – Response regulator
• The major control machinery that lets bacteria cope with a diverse and often hostile environment
2CS in Pseudomonas aeruginosa PAO1
http://www.pseudomonas.com/
“Complete genome sequence of Pseudomonas aeruginosa PAO1, an opportunistic pathogen.” Nature. 2000 Aug 31;406(6799):947-8. by Stover CK, Pham XQ, Erwin AL, et al.
• Genome: 6.3M bp
• Predicted genes: 5570
• 123 genes were classified as 2CSs.
2CS in PAO1
• There are 123 annotated 2CS genes in PAO1.
• Systematically analyze the evolutionary relationships between the sensor kinase and the response regulator of a 2CS.
• Construct phylogenetic trees using ClustalW for 54 sensor kinases and 59 response regulators.
2CS in PAO1 -- Sensor Tree
2CS: Regulator Tree
Subtrees Analysis of 2CS
Co-evolution subtree Analysis
Sensor Tree versus Regulator Tree
Problem Definition
• A phylogenetic tree with n leaves is a (rooted binary) tree such that all the leaf nodes are uniquely labelled from 1 to n.
• Given two n-leaf phylogenetic trees, we wish to explore the relationships between the subtrees of the two trees.
Normalized cluster distance between two sets
• Symmetric set difference:
• Normalized cluster distance:
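The two measures can be illustrated with a minimal sketch. The symmetric set difference is the standard one, (A minus B) together with (B minus A). The normalization used here, dividing by the size of the union, is an assumption chosen for illustration, and the function names are hypothetical.

```python
def symmetric_difference_size(a, b):
    """Size of the symmetric set difference: |A minus B| + |B minus A|."""
    return len(set(a) ^ set(b))


def normalized_cluster_distance(a, b):
    """Symmetric difference scaled into [0, 1].

    ASSUMPTION: the normalizer is |A union B|; this is an illustrative
    choice, not a formula taken from the slides.
    """
    a, b = set(a), set(b)
    union = a | b
    if not union:
        # Two empty clusters are treated as identical.
        return 0.0
    return len(a ^ b) / len(union)
```

With this choice, identical leaf sets have distance 0 and disjoint leaf sets have distance 1, so subtrees of very different sizes can be compared on one scale.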
All Pairs Subtrees Comparison – A naïve O(n^3) algorithm
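A naive cubic-time comparison might look like the sketch below. It assumes, purely for illustration, that a tree is a nested tuple whose leaves are integer labels and whose internal nodes are `(left, right)` pairs; `clusters` and `naive_all_pairs` are hypothetical names. Each tree has O(n) subtrees, so there are O(n^2) pairs, and each set comparison costs O(n).

```python
from itertools import product


def clusters(tree):
    """Return the leaf set (cluster) of every subtree, in post-order."""
    out = []

    def visit(node):
        if isinstance(node, int):          # leaf: a single label
            leaves = frozenset([node])
        else:                              # internal node: union of both children
            left, right = node
            leaves = visit(left) | visit(right)
        out.append(leaves)
        return leaves

    visit(tree)
    return out


def naive_all_pairs(t1, t2):
    """Map every (cluster of t1, cluster of t2) pair to its symmetric difference size."""
    c1, c2 = clusters(t1), clusters(t2)
    # O(n^2) pairs, each compared in O(n) time.
    return {(a, b): len(a ^ b) for a, b in product(c1, c2)}
```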
All Pairs Subtrees Comparison – Property
All Pairs Subtrees Comparison – an O(n^2) algorithm
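One way to reach quadratic time is sketched below; it is an illustration under stated assumptions, not necessarily the algorithm presented in the talk. It relies on the fact that the two children of a node have disjoint leaf sets, so the intersection size for a node is the sum of the intersection sizes for its children, and on the identity |A Δ B| = |A| + |B| - 2|A ∩ B|. Trees are assumed to be nested tuples with integer leaves, and all function names are hypothetical.

```python
def postorder(tree):
    """Flatten a nested-tuple tree into post-order records.

    Each record is (label, left_index, right_index, size). Leaves carry
    their integer label and None children; internal nodes carry label None.
    """
    nodes = []

    def visit(node):
        if isinstance(node, int):
            nodes.append((node, None, None, 1))
        else:
            left = visit(node[0])
            right = visit(node[1])
            nodes.append((None, left, right, nodes[left][3] + nodes[right][3]))
        return len(nodes) - 1

    visit(tree)
    return nodes


def all_pairs_distances(t1, t2):
    """Return d[i][j] = |L(i) symmetric-difference L(j)| for all subtree pairs."""
    n1, n2 = postorder(t1), postorder(t2)
    inter = [[0] * len(n2) for _ in n1]
    for i, (lab_i, li, ri, _) in enumerate(n1):
        for j, (lab_j, lj, rj, _) in enumerate(n2):
            if lj is not None:             # split on the children of j
                inter[i][j] = inter[i][lj] + inter[i][rj]
            elif li is not None:           # j is a leaf: split on the children of i
                inter[i][j] = inter[li][j] + inter[ri][j]
            else:                          # both are leaves
                inter[i][j] = 1 if lab_i == lab_j else 0
    return [[n1[i][3] + n2[j][3] - 2 * inter[i][j] for j in range(len(n2))]
            for i in range(len(n1))]
```

Because children always precede their parents in post-order, every table entry is filled in constant time from entries that are already known, giving O(n^2) time over the (2n-1) × (2n-1) table.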
Lowest Common Ancestor
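As a reminder of the concept, the lowest common ancestor of two nodes is the deepest node that has both as descendants. The sketch below uses simple parent pointers and depths, which costs O(height) per query; constant-time schemes after linear preprocessing exist but are not shown. The nested-tuple tree encoding and the function names are assumptions made for illustration.

```python
def build_parents(tree):
    """Number the nodes in pre-order (root is 0).

    Returns (parent, depth, leaf_index), where leaf_index maps a leaf label
    to its node number.
    """
    parent, depth, leaf_index = [], [], {}

    def visit(node, par, d):
        idx = len(parent)
        parent.append(par)
        depth.append(d)
        if isinstance(node, int):
            leaf_index[node] = idx
        else:
            for child in node:
                visit(child, idx, d + 1)
        return idx

    visit(tree, None, 0)
    return parent, depth, leaf_index


def lca(parent, depth, u, v):
    """Climb from the deeper node until both nodes meet."""
    while depth[u] > depth[v]:
        u = parent[u]
    while depth[v] > depth[u]:
        v = parent[v]
    while u != v:
        u, v = parent[u], parent[v]
    return u
```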
Confluent subtree
Confluent subtree – Illustration
Constructing confluent subtree
Nearest subtree
Nearest subtree: reasoning
Nearest subtree: Algorithm
k-agreement Problem
Correlation analysis
• Does gene duplication tend to occur within a relatively short distance on a bacterial genome?
• Idea: create a dot-matrix plot with the X-axis being the physical distance and the Y-axis being the evolutionary distance between the two 2CS genes being compared.
• Some subset of 2CS, presumably functionally related, could show a correlation between their physical and evolutionary distances.
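The points of such a dot-matrix plot could be computed roughly as follows. This is a sketch under several assumptions: the chromosome is treated as circular, so physical distance is the shorter way around; evolutionary distances are supplied as a dictionary keyed by gene pair; and the Pearson coefficient is used as one possible summary of correlation, which the slides do not specify. All names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def circular_distance(p, q, genome_length):
    """Shorter of the two arcs between positions p and q (ASSUMES a circular genome)."""
    d = abs(p - q) % genome_length
    return min(d, genome_length - d)


def distance_points(positions, evo_dist, genome_length):
    """One (physical, evolutionary) point per gene pair.

    positions: dict gene -> coordinate in bp
    evo_dist:  dict frozenset({gene_a, gene_b}) -> evolutionary distance
    """
    return [
        (circular_distance(positions[g], positions[h], genome_length),
         evo_dist[frozenset((g, h))])
        for g, h in combinations(sorted(positions), 2)
    ]


def pearson(points):
    """Pearson correlation of the x and y coordinates.

    Undefined (division by zero) when either coordinate is constant.
    """
    n = len(points)
    mx = sum(x for x, _ in points) / n
    my = sum(y for _, y in points) / n
    sxy = sum((x - mx) * (y - my) for x, y in points)
    sxx = sum((x - mx) ** 2 for x, _ in points)
    syy = sum((y - my) ** 2 for _, y in points)
    return sxy / sqrt(sxx * syy)
```

The list returned by `distance_points` can be passed directly to a scatter-plot routine, with the first coordinate on the X-axis and the second on the Y-axis.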
k-correlation Problem
k-correlation is NP-complete
• Let M1 be the adjacency matrix of a graph G, and M2 be a zero matrix.
• If the k-correlation problem could be solved in polynomial time, then the maximum independent set problem would also be solvable in polynomial time.
Conclusion
• Identifying novel 2CS in other bacterial genomes as well as in eucaryotic genomes
• Clustering analysis of 2CS for functional prediction of uncharacterized genes
• Co-evolutionary analysis of 2CS
Future Research
• Identifying novel 2CS in other bacterial genomes as well as in eucaryotic genomes
• Clustering analysis of 2CS for functional prediction of uncharacterized genes
• Co-evolutionary analysis of 2CS