BCB 570 Spring 2008

Protein-Protein Interaction Networks & Methods

Julie Dickerson, Electrical and Computer Engineering

Outline

Data for Protein-protein interaction networks

Brief review of network concepts for network analysis

Effect of different data sets

Biological network comparison

Two hybrid system

A protein of interest, referred to as the "bait," is fused to a DNA Binding Domain (DBD).

A separate protein, called the "prey," is fused to a transcriptional activation domain (AD).

If these two proteins (the bait and prey) interact, the DBD and AD are brought together and a reporter gene is transcribed.

In general, the system is used for initial identification of interacting proteins, not for detailed characterization of the interaction.

Image from http://www.biochem.arizona.edu/classes/bioc568/two-hybrid_system.htm.

Domain Belief

Assumptions:

A domain is a discrete functional and structural unit, such that it folds as a unit and carries out a particular function.

Proteins consist of a number of these domains, laid out in a linear array along the polypeptide chain.

The properties of a domain are basically the same when this unit is put into a different context (such as in a hybrid protein, for instance in the two-hybrid system).

Limitations:

Not all proteins have a domain structure.

In many proteins, domains exist but they include portions of the polypeptide from different parts of the chain; for example, a domain might be composed of residues 1-100 and 250-350.

Properties of a domain may change when it is taken out of the context of the intact protein. E.g., some proteins contain "autoinhibitory" regions.

Co-Immunoprecipitation (co-IP)

Used to find out what binds to a protein: the protein itself serves as an affinity reagent to isolate its binding partners.

Compared with two-hybrid and chip-based approaches, this strategy has the advantage that the fully processed and modified protein serves as the bait.

Proteome Mass Spectrometry

Problems

Noisy data

Many weak associations

Self-activators

Contaminants

Molecules are highly connected

Approach

Get more evidence:

Physical interactions

Synthetic lethality

Co-citation

Co-expression

Literature

MIPS Database GDA1p

PIR Database

DIP

[Figure: DIP interaction graph; node labels: GDA1p, YEL017W, YBR161W, YJL152W, ALD5p, Ssp120p, HPA2p]

Biogrid.org

Analyzing P-P interaction networks

Create networks

Find structure in networks, search for modules or motifs

Analyze results using known databases, functional enrichment, expression data, organelle information, etc.

Giot, et al. A protein interaction map of Drosophila melanogaster. Science 302(5651):1727-1736 (2003). Epub 2003 Nov 6.

Jonsson, P. F. et al. Bioinformatics 2006 22:2291-2297; doi:10.1093/bioinformatics/btl390

A description of the protein communities identified by k-clique cluster analysis (k = 6)

Find structure

Use cliques or highly connected regions in a network.

Clique Percolation Method (CPM; see Derényi et al., 2005): locates the k-clique percolation clusters of the network.

MCL (Markov Cluster Algorithm): based on simulation of (stochastic) flow in graphs. Enright A.J., Van Dongen S., Ouzounis C.A. An efficient algorithm for large-scale detection of protein families. Nucleic Acids Research 30(7):1575-1584 (2002).
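As a rough illustration of the clique percolation idea, the sketch below enumerates all k-cliques by brute force and merges cliques that share k-1 nodes into one community. It is a teaching sketch, not the CPM implementation cited above; the function names and the toy edge list are made up for this example, and brute-force enumeration is only practical for very small graphs.

```python
from itertools import combinations


def k_cliques(adj, k):
    """Return every k-node subset in which all pairs are connected (brute force)."""
    nodes = sorted(adj)
    return [frozenset(c) for c in combinations(nodes, k)
            if all(v in adj[u] for u, v in combinations(c, 2))]


def k_clique_communities(edges, k):
    """Union k-cliques that share k-1 nodes; each union is one community."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)

    cliques = k_cliques(adj, k)
    parent = list(range(len(cliques)))  # union-find over clique indices

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(len(cliques)), 2):
        if len(cliques[i] & cliques[j]) == k - 1:  # adjacent k-cliques
            parent[find(i)] = find(j)

    communities = {}
    for i, clique in enumerate(cliques):
        communities.setdefault(find(i), set()).update(clique)
    return sorted(sorted(c) for c in communities.values())


# Hypothetical toy network: triangles ABC and BCD share an edge,
# triangle DEF touches them only at node D.
TOY_EDGES = [("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D"),
             ("D", "E"), ("D", "F"), ("E", "F")]
toy_communities = k_clique_communities(TOY_EDGES, k=3)
```

In the toy network, node D belongs to two communities, which shows that k-clique communities are allowed to overlap.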

Method: MCL

Cluster definition: natural clusters in a graph are characterised by the presence of many edges between the members of that cluster, so one expects the number of longer ("higher-length") paths between two arbitrary nodes in the cluster to be high. Random walks on the graph rarely go from one natural cluster to another.

The MCL algorithm finds cluster structure in graphs by deterministically computing (the probabilities of) random walks through the similarity graph, using two operators that transform one set of probabilities into another. It uses the language of stochastic matrices (also called Markov matrices) to capture the mathematical concept of random walks on a graph.

Expansion coincides with taking the power of a stochastic matrix using the normal matrix product, which gives the probabilities of random walks between nodes.

Inflation corresponds to taking the Hadamard (entrywise) power of the matrix with exponent r and then rescaling each column so that it sums to 1 again. For a k x k matrix M:

$$(\Gamma_r M)_{pq} = \frac{(M_{pq})^r}{\sum_{i=1}^{k} (M_{iq})^r}$$
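The alternation of expansion and inflation can be sketched in a few lines of plain Python. This is a minimal illustration, not the reference MCL program: the added self-loops, the default inflation exponent r = 2, the stopping tolerance, the rule for reading clusters off the final matrix, and all function names are assumptions made for this example.

```python
def normalize_columns(m):
    """Rescale every column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] if sums[j] else 0.0 for j in range(n)]
            for i in range(n)]


def expand(m):
    """Expansion: square the matrix with the ordinary matrix product."""
    n = len(m)
    return [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def inflate(m, r):
    """Inflation: entrywise power r followed by column rescaling."""
    return normalize_columns([[x ** r for x in row] for row in m])


def mcl(adjacency, r=2.0, max_iter=50, tol=1e-9):
    """Alternate expansion and inflation until the matrix stops changing."""
    n = len(adjacency)
    # Assumption: add self-loops before normalizing, a common convention.
    m = normalize_columns([[adjacency[i][j] + (1.0 if i == j else 0.0)
                            for j in range(n)] for i in range(n)])
    for _ in range(max_iter):
        new = inflate(expand(m), r)
        delta = max(abs(new[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = new
        if delta < tol:
            break
    # Each row that keeps non-zero mass belongs to an attractor node;
    # the columns it covers form one cluster.
    clusters = {frozenset(j for j in range(n) if m[i][j] > 1e-6)
                for i in range(n)}
    return sorted(sorted(c) for c in clusters if c)


def adjacency_from_edges(n, edges):
    """Build a symmetric 0/1 adjacency matrix from an undirected edge list."""
    a = [[0.0] * n for _ in range(n)]
    for u, v in edges:
        a[u][v] = a[v][u] = 1.0
    return a


# Hypothetical toy graph: two triangles joined by the single edge (2, 3).
TOY_GRAPH = adjacency_from_edges(
    6, [(0, 1), (0, 2), (1, 2), (2, 3), (3, 4), (3, 5), (4, 5)])
toy_clusters = mcl(TOY_GRAPH)
```

Raising r makes inflation more aggressive and tends to give smaller clusters; lowering it toward 1 gives coarser ones.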

Example

Adding in Transcriptional Interactions

ChIP-chip with whole genome microarrays determines the range of in vivo DNA binding sites for any given protein

Map protein complexes (interacting proteins and their

Map co-regulated complexes within and across species.

Approach Cross Species

Roded Sharan & Trey Ideker. Modeling cellular machinery through biological network comparison. Nature Biotechnology 24, 427-433 (2006).

Network Alignment

Why is this hard?

PathBLAST

Identifies pairs of interaction paths, drawn from the networks of different species or from different processes within a species, in which proteins at equivalent path positions share strong sequence homology.

The score of a path alignment is a sum of the sequence-alignment scores plus the probabilities of the interactions, ideally compared to a null model.

Algorithms for Network Alignment

Scoring: measure similarity of each subnetwork to a predefined structure of interest and the level of conservation of the subnetwork across networks being compared.

Search procedures: find conserved subnetworks of interest.

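Edit-Distance Methods (Evolution-based)

Define M to be the set of matches, determined by orthology relationships between pairs of proteins.

N: set of mismatched interactions, i.e. matched sets of proteins in which only one pair interacts.

D: union of the sets of duplicated protein pairs within each network.

With scoring functions m, n and d for matches, mismatches and duplications, the alignment score is:

$$S = \sum_{a \in M} m(a) + \sum_{a \in N} n(a) + \sum_{a \in D} d(a)$$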
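To make the bookkeeping concrete, the score can be evaluated as three separate sums. The sketch below is only an illustration: the constant reward and penalty values, the function names, and the placeholder protein labels are invented for this example and are not taken from any published scoring scheme.

```python
def alignment_score(matches, mismatches, duplications, m, n, d):
    """S = sum of m(a) over M + sum of n(a) over N + sum of d(a) over D."""
    return (sum(m(a) for a in matches)
            + sum(n(a) for a in mismatches)
            + sum(d(a) for a in duplications))


# Hypothetical constant scoring functions: reward matches,
# penalize mismatched interactions and duplications.
def match_reward(a):
    return 1.0


def mismatch_penalty(a):
    return -0.5


def duplication_penalty(a):
    return -0.2


# Placeholder elements; in practice each element would describe
# matched, mismatched, or duplicated protein pairs.
toy_M = ["match1", "match2", "match3"]
toy_N = ["mismatch1"]
toy_D = ["dup1", "dup2"]

toy_score = alignment_score(toy_M, toy_N, toy_D,
                            match_reward, mismatch_penalty,
                            duplication_penalty)
```

Passing the scoring functions as arguments keeps the sum itself fixed while allowing different evolutionary models to be plugged in.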

Fit to a desired structure

Maximum likelihood: compute a log-likelihood ratio that measures fit to an ideal structure vs. the chance that the subnetwork is observed at random (null hypothesis).

Ratios are summed over aligned subnetworks to give an overall score.

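Model of Protein Complex

Complex model: every pair of proteins in the complex interacts with high probability β, independently of other protein pairs.

Null model: every two proteins interact with a probability p(u,v) that depends on their node degrees.

The log-likelihood ratio that a set of proteins, C, with interactions E(C), forms a complex is:

$$L(C) = \sum_{(u,v) \in E(C)} \log \frac{\beta}{p(u,v)} + \sum_{(u,v) \notin E(C)} \log \frac{1-\beta}{1-p(u,v)}$$

The second sum runs over the pairs of proteins in C that do not interact.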
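The log-likelihood ratio for a candidate complex can be computed pair by pair. The sketch below assumes a value of 0.9 for the in-complex interaction probability and, for the toy example only, a constant null probability of 0.1 in place of a degree-dependent p(u,v); both values and all names are assumptions for illustration.

```python
import math
from itertools import combinations


def complex_log_likelihood(proteins, interactions, null_prob, beta=0.9):
    """Sum log(beta/p) over interacting pairs in C and
    log((1-beta)/(1-p)) over non-interacting pairs in C."""
    edges = {frozenset(e) for e in interactions}
    score = 0.0
    for u, v in combinations(sorted(proteins), 2):
        p = null_prob(u, v)  # probability of this edge under the null model
        if frozenset((u, v)) in edges:
            score += math.log(beta / p)
        else:
            score += math.log((1.0 - beta) / (1.0 - p))
    return score


def constant_null(u, v):
    """Hypothetical null model: every pair interacts with probability 0.1."""
    return 0.1


toy_proteins = {"A", "B", "C"}
# All three pairs interact: strong evidence for a complex.
dense_score = complex_log_likelihood(
    toy_proteins, [("A", "B"), ("A", "C"), ("B", "C")], constant_null)
# One interaction is missing: the score drops.
sparse_score = complex_log_likelihood(
    toy_proteins, [("A", "B"), ("A", "C")], constant_null)
```

Each observed interaction adds a positive term whenever β exceeds p(u,v), and each missing interaction adds a negative term, so dense protein sets score higher than sparse ones.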

Network Queries

Searching

Greedy search: start from a promising seed network, then refine it by local search using an editing approach (adding or deleting a protein).

Works well for defined graph structures such as paths or trees.
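A greedy refinement loop of this kind can be sketched as follows. The scoring function used here (interacting pairs minus non-interacting pairs inside the set) is a stand-in chosen for the toy example, not the likelihood score of any particular tool, and all names are hypothetical.

```python
from itertools import combinations


def greedy_refine(seed, candidates, score, max_steps=100):
    """Repeatedly apply the single add/delete edit that most improves the score."""
    current = set(seed)
    best = score(current)
    for _ in range(max_steps):
        # Candidate edits: add one protein or delete one protein.
        moves = [current | {p} for p in candidates if p not in current]
        if len(current) > 1:
            moves += [current - {p} for p in current]
        if not moves:
            break
        top = max(moves, key=score)
        if score(top) <= best:
            break  # local optimum: no edit improves the score
        current, best = top, score(top)
    return current, best


# Hypothetical toy network: a 4-clique A-B-C-D with a pendant protein E.
TOY_EDGES = {frozenset(e) for e in [("A", "B"), ("A", "C"), ("A", "D"),
                                    ("B", "C"), ("B", "D"), ("C", "D"),
                                    ("D", "E")]}


def toy_score(proteins):
    """Stand-in score: +1 per interacting pair, -1 per non-interacting pair."""
    total = 0
    for u, v in combinations(sorted(proteins), 2):
        total += 1 if frozenset((u, v)) in TOY_EDGES else -1
    return total


refined, refined_score = greedy_refine({"A", "B"}, ["A", "B", "C", "D", "E"],
                                       toy_score)
```

Because only single-protein edits are considered, the search can stop at a local optimum, so the choice of seed matters.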

Network Evolution
