Predicting Protein Function Annotation using Protein-Protein Interaction Networks

Page 1: Predicting Protein Function Annotation using Protein-Protein Interaction Networks

By Tamar Eldad

Advisor: Dr. Yanay Ofran

89-385 Computational Biology - Projects Workshop

Bar-Ilan University, the Mina and Everard Goodman Faculty of Life Sciences

Page 2: Protein Function Prediction

The number of proteins identified by genome sequencing projects is increasing exponentially.

It is impossible to perform a functional assay for every uncharacterized gene.

We turn to sophisticated computational methods for help in annotating the huge volume of sequence and structure data being produced:

• Homology-based annotation transfer
• Sequence patterns
• Structure similarity
• Structure patterns
• Genomic context
• Microarray data

Page 3: What is Function?

Biological function has more than one aspect:

• Sub-cellular to whole-organism context
• Physiological aspect
• Phenotype

Hence the need for a well-defined vocabulary.

Page 4: Protein Sequence and Protein Structure

Protein Sequence: (figure not preserved)

Protein Structure: (figure not preserved)

Page 5: The Gene Ontology

The Gene Ontology project is a major bioinformatics initiative that aims to standardize the representation of gene and gene product attributes across species and databases.

The project provides a controlled vocabulary of terms for describing gene product characteristics and gene product annotation data.

Page 6: The Gene Ontology

Three ontologies: Cellular component, Molecular function, Biological process

• Structured as a DAG (each term has 1…N parent nodes)
• Terms range from general to specific
• A term is assigned to a gene product
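
The DAG structure can be illustrated with a minimal sketch. The term names and the `PARENTS` table below are a hypothetical toy ontology, not real GO data; the point is only that a term may have several parents, and that annotating a gene product with a specific term also implies all of its more general ancestor terms.

```python
from typing import Dict, Set

# Hypothetical toy ontology: each term maps to the set of its parent terms.
# "kinase binding" has two parents, which is what makes this a DAG, not a tree.
PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "binding": {"root"},
    "catalytic activity": {"root"},
    "protein binding": {"binding"},
    "enzyme binding": {"protein binding"},
    "kinase binding": {"protein binding", "enzyme binding"},
}


def ancestors(term: str, parents: Dict[str, Set[str]]) -> Set[str]:
    """Return every more general term reachable from `term` via parent links."""
    seen: Set[str] = set()
    stack = [term]
    while stack:
        current = stack.pop()
        for parent in parents[current]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def implied_annotations(direct_terms: Set[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Expand directly assigned terms with all of their ancestors."""
    result = set(direct_terms)
    for term in direct_terms:
        result |= ancestors(term, parents)
    return result


if __name__ == "__main__":
    print(sorted(implied_annotations({"kinase binding"}, PARENTS)))
```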

Page 7: The Gene Ontology

Page 8: A New Approach

Classical Biology: collect a set of features for each protein

Systems Biology: study protein function in the context of a network

Assemblies represent more than the sum of their parts

Page 9: Protein Interactions

Data on thousands of interactions in humans and most model species have become available, from:

• Mass spectrometry
• Genome-wide chromatin immunoprecipitation
• Yeast two-hybrid assays
• Combinatorial reverse genetic screens
• Rapid literature mining techniques

Page 10: PPI Networks

Data are represented as networks, with nodes representing proteins and edges representing the detected PPIs.
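
As a rough sketch of this representation, a PPI network can be stored as an undirected adjacency map. The protein identifiers and the helper names `build_network` and `interacts` are illustrative assumptions, and dropping self-interactions is a simplification made only for this example.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def build_network(interactions: Iterable[Tuple[str, str]]) -> Dict[str, Set[str]]:
    """Build an undirected graph: protein -> set of interaction partners."""
    network: Dict[str, Set[str]] = defaultdict(set)
    for a, b in interactions:
        if a == b:
            continue  # assumption: self-interactions are ignored in this sketch
        network[a].add(b)
        network[b].add(a)
    return dict(network)


def interacts(network: Dict[str, Set[str]], a: str, b: str) -> bool:
    """True if an edge (detected PPI) exists between proteins a and b."""
    return b in network.get(a, set())


# Hypothetical edge list: each tuple is one detected interaction.
edges = [("P1", "P2"), ("P1", "P4"), ("P2", "P3")]
net = build_network(edges)
```

An adjacency map keeps neighbor lookups cheap, which matters when sub-graphs are grown one protein at a time.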

Page 11: Existing Methods

Alignment: align sequence-matching proteins between species and check whether they also align at the network level; this can reveal pathways conserved between species.

Integration: combine data from different types of networks (e.g. protein, genetic, and transcriptional interaction networks) to get a better picture of the whole biological system.

Querying: find sub-networks similar to known functional units (by comparing the interactions and the proteins themselves); these are likely to be functional units too.

Page 12: New Method

Network motifs conserved between two species provide evidence for functional similarity of the individual proteins that make up these motifs.

(Figure: HUMAN and YEAST networks, with e-value labels 2e-10, 8e-13, 1e-09, 5e-15)

Page 13: New Method

What do we need?

1. List of proteins in the human cell
2. List of proteins in the yeast cell
3. Interactions in each cell
4. Sequence similarity grades
5. Known GO annotations
6. Function distance calculation

Page 14: Protein Lists - UniProt DB

Page 15: Interaction Databases

HPRD - Human Protein Reference Database

DIP - Database of Interacting Proteins

MIPS - Munich Information Center for Protein Sequences

IntAct - molecular interaction database

An interaction is considered reliable if it meets one of these conditions:

1. It was observed in at least 2 different experiments, OR
2. It was reported in 3 different articles.
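
The reliability rule above could be applied as a simple filter. This is a sketch only: the `InteractionRecord` fields are hypothetical (the slides do not describe the record format of any of the databases), and "3 different articles" is read here as "at least 3".

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, List


@dataclass(frozen=True)
class InteractionRecord:
    """Hypothetical merged record for one protein pair."""
    protein_a: str
    protein_b: str
    experiments: FrozenSet[str]  # distinct experiments that observed the pair
    articles: FrozenSet[str]     # distinct articles that reported the pair


def is_reliable(rec: InteractionRecord, min_experiments: int = 2, min_articles: int = 3) -> bool:
    """Reliable if seen in >= 2 experiments OR reported in >= 3 articles (assumed reading)."""
    return len(rec.experiments) >= min_experiments or len(rec.articles) >= min_articles


def reliable_only(records: Iterable[InteractionRecord]) -> List[InteractionRecord]:
    """Keep only the interactions that pass the reliability rule."""
    return [rec for rec in records if is_reliable(rec)]


records = [
    InteractionRecord("P1", "P2", frozenset({"exp1", "exp2"}), frozenset({"art1"})),
    InteractionRecord("P1", "P3", frozenset({"exp1"}), frozenset({"art1", "art2"})),
    InteractionRecord("P2", "P4", frozenset(), frozenset({"art1", "art2", "art3"})),
]
kept = reliable_only(records)
```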

Page 16: Sequence Similarity Grades

BLAST - bl2seq

E-values (rows: YEAST, columns: HUMAN):

       1      2      3      4
1      -      0.008  3e-18  X
2      10     -      0.02   3.6

Page 17: GO Annotations - UniProt DB

Page 18: Evidence Codes

Page 19: Function Distance Calculation
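
The content of this slide is not preserved, so the distance actually used is unknown. One common choice, shown here purely as an assumption, is the number of edges on the shortest path between two terms in the GO graph, ignoring edge direction. The `TOY_PARENTS` table and the function name `go_distance` are hypothetical.

```python
from collections import deque
from typing import Dict, Optional, Set

# Hypothetical toy ontology: term -> set of parent terms.
TOY_PARENTS: Dict[str, Set[str]] = {
    "root": set(),
    "A": {"root"},
    "B": {"root"},
    "A1": {"A"},
    "A2": {"A"},
}


def go_distance(term_a: str, term_b: str, parents: Dict[str, Set[str]]) -> Optional[int]:
    """Shortest path length (in edges) between two terms, treating edges as undirected."""
    # Build an undirected neighbor map from the child -> parents table.
    neighbors: Dict[str, Set[str]] = {term: set(ps) for term, ps in parents.items()}
    for child, ps in parents.items():
        for parent in ps:
            neighbors.setdefault(parent, set()).add(child)

    # Breadth-first search guarantees the first hit is a shortest path.
    queue = deque([(term_a, 0)])
    seen = {term_a}
    while queue:
        term, dist = queue.popleft()
        if term == term_b:
            return dist
        for nxt in neighbors.get(term, ()):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # terms are not connected, e.g. they belong to different ontologies
```

Under this definition, identical terms have distance 0 and sibling terms have distance 2.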

Page 20: Implementation

1. Prepare the similarity matrix for the cutoff e-value
2. Find all connected sub-graphs (components) of size N - 1 using a DFS search
3. Compare the sub-graphs found, using the similarity matrix
4. Extend each pair of matching sub-graphs with an N-th, non-similar component (protein)
5. Get the GO function annotations of the N-th components
6. Calculate the average distance between the functions of the N-th components
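
Steps 2 and 3 might look roughly like the sketch below. It is not the author's code: the function names are hypothetical, the network is an adjacency map of protein to partners, and "matching" is assumed to mean that the proteins of the two sub-graphs can be paired one-to-one with every pair marked as sequence-similar. Edge conservation between the two sub-graphs is not checked here.

```python
from itertools import permutations
from typing import Dict, FrozenSet, Iterable, Set, Tuple


def connected_subgraphs(network: Dict[str, Set[str]], size: int) -> Set[FrozenSet[str]]:
    """Enumerate all connected node sets of the given size by depth-first extension."""
    found: Set[FrozenSet[str]] = set()
    if size < 1:
        return found

    def extend(nodes: Set[str], frontier: Set[str]) -> None:
        if len(nodes) == size:
            found.add(frozenset(nodes))  # the set removes duplicates found via other orders
            return
        for candidate in frontier:
            new_nodes = nodes | {candidate}
            # Only neighbors of the current node set may be added, so it stays connected.
            new_frontier = (frontier | network[candidate]) - new_nodes
            extend(new_nodes, new_frontier)

    for start in network:
        extend({start}, set(network[start]))
    return found


def subgraphs_match(human_nodes: Iterable[str], yeast_nodes: Iterable[str],
                    similar: Set[Tuple[str, str]]) -> bool:
    """True if the proteins can be paired one-to-one so that every pair is similar."""
    human = sorted(human_nodes)
    yeast = sorted(yeast_nodes)
    if len(human) != len(yeast):
        return False
    return any(all((h, y) in similar for h, y in zip(human, perm))
               for perm in permutations(yeast))


# Hypothetical toy network: a triangle A-B-C with D attached to C.
toy = {"A": {"B", "C"}, "B": {"A", "C"}, "C": {"A", "B", "D"}, "D": {"C"}}
triples = connected_subgraphs(toy, 3)
```

This brute-force enumeration grows exponentially with the sub-graph size, so it is only practical for small N.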

Page 21: Quality Assurance

1. Compare to random-pair annotation
   • No sequence similarity

2. Compare to sequence-similar annotation
   • BLAST
   • Only proteins under the cut-off value
   • Human genes only
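
The random-pair comparison could be set up along these lines. Everything here is an assumption made for illustration: the function name, the single-term-per-protein input format, the sample size, and the toy distance function. The slides do not say how the baseline was actually computed.

```python
import random
from statistics import mean
from typing import Callable, Dict, Optional


def random_pair_baseline(
    human_terms: Dict[str, str],
    yeast_terms: Dict[str, str],
    distance: Callable[[str, str], Optional[float]],
    n_pairs: int = 1000,
    seed: int = 0,
) -> Optional[float]:
    """Average function distance over randomly drawn human-yeast protein pairs."""
    rng = random.Random(seed)  # fixed seed so the baseline is reproducible
    humans = sorted(human_terms)
    yeasts = sorted(yeast_terms)
    distances = []
    for _ in range(n_pairs):
        h = rng.choice(humans)
        y = rng.choice(yeasts)
        d = distance(human_terms[h], yeast_terms[y])
        if d is not None:  # skip pairs whose terms cannot be compared
            distances.append(d)
    return mean(distances) if distances else None


def toy_distance(term_a: str, term_b: str) -> float:
    """Placeholder distance: 0 for identical terms, 1 otherwise."""
    return 0.0 if term_a == term_b else 1.0


# Hypothetical annotations: protein -> one GO term label.
human = {"H1": "T1", "H2": "T2"}
yeast = {"Y1": "T1", "Y2": "T3"}
baseline = random_pair_baseline(human, yeast, toy_distance, n_pairs=200)
```

A predicted pair set is only informative if its average distance is clearly below this baseline.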

Page 22: Detailed Results

graph1              new comp  go func     graph2               new comp  go func     term type          dist  Eval average
4814,4256,591,1584  Q12495    GO:0005515  4253,1335,2447,2353  Q9UHD2    GO:0005515  MolecularFunction  4     0.079
4814,4256,591,1584  Q12495    GO:0030528  4253,1335,2447,2353  Q9UHD2    GO:0030528  MolecularFunction  3     0.079
4814,4256,591,1584  Q12495    GO:0006334  4253,1335,2447,2353  Q9UHD2    GO:0006334  BiologicalProcess  0     0.079
4814,4256,591,1584  Q12495    GO:0005515  4253,1335,2447,2353  O15111    GO:0005515  MolecularFunction  1     0.079
4814,4256,591,1584  Q12495    GO:0005515  4253,1335,2447,2353  O15111    GO:0005515  MolecularFunction  12    0.079
4819,2,236,234      P16649    GO:0016584  4354,2303,2890,3693  P55060    GO:0016584  BiologicalProcess  1     0.062
4819,2,236,234      P16649    GO:0016565  4354,2303,2890,3693  Q96KB5    GO:0016565  MolecularFunction  1     0.062
4819,2,236,234      P16649    GO:0016584  4354,2303,2890,3693  Q15699    GO:0016584  BiologicalProcess  8     0.062
4819,2,236,234      P16649    GO:0016584  4354,2303,2890,3693  Q15699    GO:0016584  BiologicalProcess  5     0.062
4867,2966,168,1224  P13393    GO:0000120  4387,1383,1452,2289  P63279    GO:0000120  CellularComponent  4     0.041
4867,2966,168,1224  P13393    GO:0000120  4387,1383,1452,2289  P63279    GO:0000120  CellularComponent  3     0.041
4867,2966,168,1224  P13393    GO:0000126  4387,1383,1452,2289  P63279    GO:0000126  CellularComponent  7     0.041

Page 23: Results

E-value 5e-05

Page 24: Play with Parameters

• Change the graph size
• Lower the e-value
• Start with a larger number of connected components
• Use only graphs with higher connectivity
• Allow the non-similar protein to be any protein in the graph
• Try a different network topology
• Limit the number of paired proteins

Page 25: Results

Page 26: Conclusions

Most results are no better than random

Significant improvement only for Biological Process prediction

Still far behind homology-based transfer

Page 27: Summary

Functional annotation is one of the greatest challenges of the post-genomic era.

Using PPI data for functional annotation is a new approach to advancing this field.

The method tried here was unsuccessful.

Other ideas:

• Find a more specific search pattern
• Start from the best results: what sets them apart?

Page 29: Thanks

Advisor: Dr. Yanay Ofran

Guys at the lab: Rotem, Vered, Sivan

Roi Adadi & Omer Erel

Page 30: Alignment

Page 31: Querying

Page 32: Integration

Page 33: Similarity Matrix

E-values (rows: YEAST, columns: HUMAN):

       1      2      3      4
1      -      0.008  3e-18  X
2      10     -      0.02   3.6

E-value = 0.0005

TRUE/FALSE entries (cell positions not preserved): TRUE, FALSE, FALSE, FALSE, FALSE, TRUE
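
Turning e-values into a boolean similarity matrix can be sketched as a simple threshold. The protein identifiers and e-values below are made-up toy data, not the values from the slide, and treating a missing BLAST hit as "not similar" is an assumption.

```python
from typing import Dict, Optional, Tuple

Pair = Tuple[str, str]  # (human protein, yeast protein)


def similarity_matrix(evalues: Dict[Pair, Optional[float]], cutoff: float) -> Dict[Pair, bool]:
    """Mark a pair as similar when its BLAST e-value is below the cutoff."""
    return {
        pair: (evalue is not None and evalue < cutoff)
        for pair, evalue in evalues.items()
    }


# Hypothetical e-values; None stands for "no hit reported".
toy_evalues: Dict[Pair, Optional[float]] = {
    ("H1", "Y1"): 3e-18,
    ("H1", "Y2"): 0.02,
    ("H2", "Y1"): None,
    ("H2", "Y2"): 1e-06,
}
similar = similarity_matrix(toy_evalues, cutoff=0.0005)
```

Lowering the cutoff makes the matrix sparser, which reduces the number of matching sub-graph pairs.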

Page 34: Neighboring Matrix

HUMAN CELL INTERACTIONS

       1      2      3      4
1      -      TRUE   FALSE  TRUE
2      TRUE   -      FALSE  FALSE