Page 1: Protein Functional Site Prediction The identification of protein regions responsible for stability and function is an especially important post-genomic.

Protein Functional Site Prediction

• The identification of protein regions responsible for stability and function is an especially important post-genomic problem

• With the explosion of genomic data from recent sequencing efforts, predicting protein functional sites from sequence alone is an increasingly important bioinformatics endeavor.

Page 2

What is a “Functional Site”?

• Defining what constitutes a “functional site” is not trivial

• Residues that make up, or cluster around, positions of known function are clear candidates for functional sites

• We define a functional site as catalytic residues, binding sites, and the regions that cluster around them.

Page 3

Protein

Page 4

Protein + Ligand

Page 5

Functional Sites (FS)

Page 6

Regions that Cluster Around FS

Page 7

Phylogenetic motifs

• Phylogenetic motifs (PMs) are short sequence fragments that conserve the overall familial phylogeny

• Are they functional?

• How do we detect them?

Page 8

Phylogenetic motifs

• PMs are short sequence fragments that conserve the overall familial phylogeny

• Are they functional?

• How do we detect them?

• First we design a simple heuristic to find them

• Then we see if the detected sites are functional

Page 9

Phylogenetic Motif Identification

• Compare all windowed trees (one tree per sliding window of the alignment) with the whole tree (built from the full alignment) and keep track of the partition metric scores

• Normalize all partition metric scores by calculating z-scores

• Call these normalized scores Phylogenetic Similarity Z-scores (PSZ)

• Set a PSZ threshold for identifying windows that represent phylogenetic motifs
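The steps above can be sketched in a few lines of Python. This is only an illustrative sketch, not the authors' implementation. It assumes each tree is already reduced to a set of bipartitions (splits), that the partition metric is the count of splits found in one tree but not the other, and that a lower distance means a window's tree is closer to the whole tree, so motif windows have strongly negative z-scores. The function names and the threshold of -1.5 are hypothetical choices for the example.

```python
from statistics import mean, pstdev


def partition_metric(splits_a, splits_b):
    """Count the bipartitions present in exactly one of the two trees.

    Each argument is a set of frozensets; every frozenset holds the taxa on
    one (canonically chosen) side of an internal branch. This representation
    is an assumption made for the sketch.
    """
    return len(splits_a ^ splits_b)


def psz_scores(distances):
    """Normalize per-window partition metric scores to z-scores (PSZ)."""
    mu = mean(distances)
    sigma = pstdev(distances)
    if sigma == 0:
        # All windows are equally distant from the whole tree.
        return [0.0 for _ in distances]
    return [(d - mu) / sigma for d in distances]


def motif_windows(distances, threshold=-1.5):
    """Return indices of windows whose PSZ is at or below the threshold.

    The default threshold is an illustrative value, not one from the slides.
    """
    return [i for i, z in enumerate(psz_scores(distances)) if z <= threshold]


# Toy data: window 3 yields a tree much closer to the whole tree than the rest.
window_distances = [10, 10, 10, 2, 10, 10, 10, 10]
candidate_windows = motif_windows(window_distances)
```

In this toy example only the window at index 3 passes the threshold. In practice the per-window distances would come from comparing each windowed tree with the whole tree using a phylogenetics package.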

Page 10

Set PSZ Threshold

Page 11

Regions of PMs
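One plausible way to turn windows that pass the PSZ threshold into contiguous regions is to merge windows that overlap or touch. The sketch below assumes fixed-width windows identified by their start position in the alignment. The function name and the merging rule are assumptions for illustration and are not taken from the slides.

```python
def merge_windows(starts, width):
    """Merge fixed-width windows into (start, end) regions, inclusive.

    Windows that overlap or are directly adjacent are joined into one region.
    """
    regions = []
    for start in sorted(starts):
        end = start + width - 1
        if regions and start <= regions[-1][1] + 1:
            # Extend the previous region instead of opening a new one.
            previous_start, previous_end = regions[-1]
            regions[-1] = (previous_start, max(previous_end, end))
        else:
            regions.append((start, end))
    return regions


# Three overlapping windows and one isolated window, each 5 columns wide.
regions = merge_windows([3, 4, 5, 20], width=5)
```

Here the three overlapping windows collapse into a single region spanning columns 3 to 9, while the isolated window stays separate as columns 20 to 24.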

Page 12

TIM

[Plot: Phylogenetic Similarity; False Positive Expectation]

Page 13

TIM

[Plot: Phylogenetic Similarity; False Positive Expectation]

Page 14

Cytochrome P450

[Plot: Phylogenetic Similarity; False Positive Expectation]

Page 15

Cytochrome P450

[Plot: Phylogenetic Similarity; False Positive Expectation]

Page 16

Enolase

[Plot: Phylogenetic Similarity; False Positive Expectation]

Page 17

Glycerol Kinase

[Plot: Phylogenetic Similarity; False Positive Expectation]

Page 18

Glycerol Kinase

[Plot: Phylogenetic Similarity; False Positive Expectation]

Page 19

Myoglobin

[Plot: Phylogenetic Similarity; False Positive Expectation]

Page 20

Myoglobin

[Plot: Phylogenetic Similarity; False Positive Expectation]