Protein Functional Site Prediction

The identification of protein regions responsible for stability and function is an especially important post-genomic problem. With the explosion of genomic data from recent sequencing efforts, protein functional site prediction from sequence alone is an increasingly important bioinformatic endeavor.
Date posted: 19-Dec-2015

Transcript of Protein Functional Site Prediction

Page 1:

Protein Functional Site Prediction

• The identification of protein regions responsible for stability and function is an especially important post-genomic problem

• With the explosion of genomic data from recent sequencing efforts, protein functional site prediction from sequence alone is an increasingly important bioinformatic endeavor.

Page 2:

What is a “Functional Site”?

• Defining what constitutes a “functional site” is not trivial

• Residues that make up known functional regions, or that cluster around them, are clear candidates for functional sites

• We define a functional site as catalytic residues, binding sites, and the regions that cluster around them.

Page 3:

Protein

Page 4:

Protein + Ligand

Page 5:

Functional Sites (FS)

Page 6:

Regions that Cluster Around FS

Page 7:

Phylogenetic motifs

• PMs are short sequence fragments that conserve the overall familial phylogeny

• Are they functional?

• How do we detect them?

Page 8:

Phylogenetic motifs

• PMs are short sequence fragments that conserve the overall familial phylogeny

• Are they functional?

• How do we detect them?

• First we design a simple heuristic to find them

• Then we see if the detected sites are functional

Page 9:

Phylogenetic Motif Identification

• Compare every windowed tree with the whole tree and record the partition metric score for each window

• Normalize all partition metric scores by calculating z-scores

• Call these normalized scores Phylogenetic Similarity Z-scores (PSZ)

• Set a PSZ threshold for identifying windows that represent phylogenetic motifs
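
The scoring steps above can be sketched in Python. This is only an illustration under stated assumptions, not the authors' implementation: tree building is left out entirely, each tree is assumed to arrive as a set of bipartitions (each split stored as the `frozenset` of taxa on the side that excludes a fixed reference taxon), the partition metric is taken to be the count of splits found in only one of the two trees, and z-scores use the population standard deviation. The function names, the toy trees, and the threshold of -1.0 are all hypothetical. Because a smaller partition metric means a windowed tree is closer to the whole tree, the sketch treats windows with PSZ at or below the threshold as candidate motifs.

```python
from statistics import mean, pstdev


def partition_metric(splits_a, splits_b):
    """Count the bipartitions present in exactly one of the two trees."""
    return len(set(splits_a) ^ set(splits_b))


def psz_scores(distances):
    """Normalize raw partition metric scores to z-scores (PSZ)."""
    mu = mean(distances)
    sigma = pstdev(distances)  # assumption: population standard deviation
    if sigma == 0:
        # All windows are equally distant from the whole tree.
        return [0.0 for _ in distances]
    return [(d - mu) / sigma for d in distances]


def motif_windows(distances, threshold):
    """Return indices of windows whose PSZ is at or below the threshold."""
    scores = psz_scores(distances)
    return [i for i, z in enumerate(scores) if z <= threshold]


# Hypothetical five-taxon example (taxa A-E, reference taxon E).
whole_tree = {frozenset("AB"), frozenset("ABC")}
window_trees = [
    {frozenset("AB"), frozenset("ABC")},  # identical to the whole tree
    {frozenset("AC"), frozenset("ABC")},  # one split differs
    {frozenset("AD"), frozenset("ACD")},  # both splits differ
    {frozenset("BC"), frozenset("BCD")},  # both splits differ
]

distances = [partition_metric(whole_tree, t) for t in window_trees]
candidates = motif_windows(distances, threshold=-1.0)  # illustrative threshold
print(distances, candidates)
```

In this toy input only the first window reproduces the whole-tree topology, so it is the only one that falls below the illustrative threshold. A real analysis would have one window per alignment position and a threshold chosen from the score distribution.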

Page 10:

Set PSZ Threshold

Page 11:

Regions of PMs

Page 12:

TIM

[Figure panels: Phylogenetic Similarity; False Positive Expectation]

Page 13:

TIM

[Figure panels: Phylogenetic Similarity; False Positive Expectation]

Page 14:

Cytochrome P450

[Figure panels: Phylogenetic Similarity; False Positive Expectation]

Page 15:

Cytochrome P450

[Figure panels: Phylogenetic Similarity; False Positive Expectation]

Page 16:

Enolase

[Figure panels: Phylogenetic Similarity; False Positive Expectation]

Page 17:

Glycerol Kinase

[Figure panels: Phylogenetic Similarity; False Positive Expectation]

Page 18:

Glycerol Kinase

[Figure panels: Phylogenetic Similarity; False Positive Expectation]

Page 19:

Myoglobin

[Figure panels: Phylogenetic Similarity; False Positive Expectation]

Page 20:

Myoglobin

[Figure panels: Phylogenetic Similarity; False Positive Expectation]