Networks and Interactions Boo Virk [email protected] v1.0.

Networks and Interactions

Boo [email protected]

v1.0

I have my gene set, what next?

Interaction Network:

Nodes = genes

Edges = strength of evidence for co-functionality between the connected genes
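
As a minimal sketch of the nodes-and-edges idea above, a network can be held as a map from each gene to its neighbours, with a score on every edge. The gene names, scores and the `min_score` cut-off below are made up for illustration and are not taken from any database.

```python
from collections import defaultdict

# Hypothetical edges: (gene, gene, strength of evidence between 0 and 1).
edges = [
    ("geneA", "geneB", 0.99),
    ("geneA", "geneC", 0.95),
    ("geneD", "geneC", 0.90),
    ("geneD", "geneE", 0.35),
]

def build_network(edge_list, min_score=0.0):
    """Return an undirected adjacency map {gene: {neighbour: score}}."""
    network = defaultdict(dict)
    for gene_a, gene_b, score in edge_list:
        if score >= min_score:
            # Store the edge in both directions because it has no direction.
            network[gene_a][gene_b] = score
            network[gene_b][gene_a] = score
    return dict(network)

# Keep only edges with reasonably strong evidence.
network = build_network(edges, min_score=0.5)
for gene in sorted(network):
    print(gene, "->", sorted(network[gene]))
```

Raising `min_score` drops weakly supported edges, which is one simple way to keep a network small enough to read.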

Why make a network?
• Networks:
– give insight into new interactions
– generate new hypotheses to test
– draw on knowledge obtained from the published literature

• What types of network are there?
• How to construct a network?
– How to link my genes?

Gene Co-expression
• Gene expression data
• Two genes are linked if their expression levels were similar across conditions in a gene expression study
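
One common way to measure "similar across conditions" is the Pearson correlation between two expression profiles. The sketch below assumes that measure and an arbitrary threshold of 0.9. The expression values and gene names are invented, and real co-expression resources may use different statistics.

```python
from itertools import combinations
from math import sqrt

# Made-up expression levels for three hypothetical genes across five conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.1, 3.9, 6.2, 8.0, 9.8],
    "geneC": [5.0, 1.0, 4.0, 2.0, 3.0],
}

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def coexpression_edges(profiles, threshold=0.9):
    """Link every pair of genes whose correlation reaches the threshold."""
    edges = []
    for gene_1, gene_2 in combinations(sorted(profiles), 2):
        r = pearson(profiles[gene_1], profiles[gene_2])
        if r >= threshold:
            edges.append((gene_1, gene_2, r))
    return edges

for gene_1, gene_2, r in coexpression_edges(expression):
    print(f"{gene_1} - {gene_2}: r = {r:.3f}")
```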

Physical Interaction

• Protein-protein interaction (PPI) data
• Two gene products are linked if they were found to interact in a protein-protein interaction study

Genetic Interaction
• Genetic interaction data
• Two genes are linked if the effects of perturbing one gene were modified by perturbations to a second gene

Same Pathway

• Pathway data
• Two gene products are linked if they participate in the same reaction within a pathway
• Unlike in physical interaction networks, the gene products don’t have to interact directly; they only need to be part of the same pathway

Other Types of Network
• Shared protein domain
– Two gene products are linked if they have the same protein domain
• Co-localisation
– Two genes are linked if they are both expressed in the same tissue, or if their gene products are both identified in the same cellular location
• Predicted
– Predicted functional relationships between genes, often protein interactions
• Other
– phenotype correlations from Ensembl
– disease information from OMIM
– chemical genomics data

Multiple Networks

Type of network depends on the question you are asking
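
A rough sketch of working with several network types at once: keep one edge set per type, then combine only the types that suit the question. All edge sets and gene names here are hypothetical placeholders.

```python
# Hypothetical edge sets, one per network type.
networks = {
    "co-expression": {("geneA", "geneB"), ("geneB", "geneC")},
    "physical": {("geneB", "geneA")},
    "pathway": {("geneC", "geneD")},
}

def combine(networks_by_type, wanted_types):
    """Map each edge to the sorted list of selected network types that support it."""
    support = {}
    for net_type in wanted_types:
        for gene_a, gene_b in networks_by_type[net_type]:
            # Sort the pair so (A, B) and (B, A) count as the same undirected edge.
            edge = tuple(sorted((gene_a, gene_b)))
            support.setdefault(edge, set()).add(net_type)
    return {edge: sorted(types) for edge, types in support.items()}

# A question about direct contacts and shared regulation might use these two types only.
combined = combine(networks, ["co-expression", "physical"])
for edge in sorted(combined):
    print(edge, combined[edge])
```

Edges supported by more than one type of evidence are often the first ones worth a closer look.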

Constructing Networks

• Many tools are available for reconstructing networks
• In this course:

• Generally, networks should be constructed from 50-100 genes if you want to be able to interpret and analyse them easily

STRING
• Free-to-use web resource: http://string-db.org/
• Direct and indirect associations from four sources:
– co-expression
– genomic context
– high-throughput experiments
– published/previous knowledge

• Over 2000 organisms (some more studied than others though)
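
The slides use the STRING website, but STRING also offers a REST API. The sketch below only builds a request URL and does not contact the server. It assumes the `network` method with the `identifiers`, `species` and `required_score` parameters as described in the STRING API documentation; check the current documentation before relying on it. The gene names are placeholders, and 9606 is the NCBI taxonomy ID for human.

```python
from urllib.parse import urlencode

STRING_API = "https://string-db.org/api"

def string_network_url(genes, species=9606, required_score=400, fmt="tsv"):
    """Build a URL asking STRING for the interactions among the given genes."""
    params = {
        # Multiple identifiers are separated by a carriage return (%0D once encoded).
        "identifiers": "\r".join(genes),
        "species": species,
        # Minimum combined score on a 0-1000 scale (assumed from the API docs).
        "required_score": required_score,
    }
    return f"{STRING_API}/{fmt}/network?{urlencode(params)}"

url = string_network_url(["geneA", "geneB"])
print(url)
```

The resulting URL could be opened in a browser or fetched with any HTTP client.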

Under Options:
• Change the background
• Hide unconnected nodes
• Remove selected nodes

Changing parameters in STRING

Clustering

Can cluster in STRING by:
• K-means
• MCL

Customisable cluster level

IMPORTANT NOTE: We will soon improve our clustering tools adding other algorithms suitable to cluster protein-protein interactions. By now, we suggest you to use the MCL algorithm (see review paper below). Please, note that if your network is small (i.e. it contains few nodes) the MCL algorithm will produce only a single large cluster. In that case, you could simply enlarge the network (e.g. clicking the '+' button or modifying the String parameters at the bottom of the page) and relaunch the clustering.
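
To give a feel for what MCL (Markov clustering) does, here is a simplified sketch of its two alternating steps: expansion (squaring the flow matrix) and inflation (raising entries to a power and re-normalising columns). This is not STRING's implementation. The inflation value, iteration count, tolerance and the example graph are all assumptions chosen for illustration.

```python
def normalise_columns(matrix):
    """Scale every column so that its entries sum to 1."""
    n = len(matrix)
    for j in range(n):
        total = sum(matrix[i][j] for i in range(n))
        if total > 0:
            for i in range(n):
                matrix[i][j] /= total
    return matrix

def mcl(adjacency, inflation=2.0, iterations=50, tol=1e-6):
    """Simplified Markov clustering on a square 0/1 adjacency matrix."""
    n = len(adjacency)
    # Add self-loops, then turn the matrix into column-wise flow probabilities.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalise_columns(m)
    for _ in range(iterations):
        # Expansion: flow spreads along paths of length two.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: strong flows are boosted, weak flows fade away.
        m = normalise_columns([[value ** inflation for value in row] for row in m])
    # Each row that still carries flow defines a cluster made of its non-zero columns.
    clusters = set()
    for i in range(n):
        members = frozenset(j for j in range(n) if m[i][j] > tol)
        if members:
            clusters.add(members)
    return sorted(sorted(cluster) for cluster in clusters)

# Hypothetical network: two triangles (nodes 0-2 and 3-5) joined by one edge (2-3).
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
print(mcl(adjacency))
```

A larger `inflation` value tends to give more, smaller clusters. A small, densely connected network has little structure for the flow to split along, which fits the note above about small networks ending up as a single cluster.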

Enrichment
Can test for enrichment in STRING:
• GO Biological Process
• GO Molecular Function
• GO Cellular Component
• KEGG Pathway
• Domains
• Protein-protein Interaction
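
Enrichment asks whether a term (for example a GO term) appears in your gene set more often than expected by chance. A common way to test this is the hypergeometric test, sketched below. The slides do not say which test STRING applies, so treat this as a generic illustration. All counts are invented, and in practice the p-values would also need correcting for multiple testing.

```python
from math import comb

def enrichment_p_value(background, annotated, selected, overlap):
    """Hypergeometric probability of seeing at least `overlap` annotated genes.

    background: number of genes in the background (e.g. the genome)
    annotated:  background genes that carry the term
    selected:   size of your gene set
    overlap:    genes in your set that carry the term
    """
    total = comb(background, selected)
    upper = min(annotated, selected)
    tail = sum(
        comb(annotated, k) * comb(background - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Hypothetical numbers: 20000 background genes, 200 carry the term,
# the gene set has 100 genes and 8 of them carry the term.
p = enrichment_p_value(background=20000, annotated=200, selected=100, overlap=8)
print(f"p = {p:.3g}")
```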

Saving networks in STRING

Save low resolution and high resolution images of your network (PNG, SVG)

Can also save .txt files containing the interactions of your network
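
A saved interaction file can be read back for further analysis. The sketch below assumes a tab-separated file with a header row and columns named `node1`, `node2` and `combined_score`. These column names are an assumption, so check them against the header of your own export. The sample text stands in for a real file.

```python
import csv
import io

# Stand-in for a saved .txt export (hypothetical genes and scores).
sample_export = (
    "node1\tnode2\tcombined_score\n"
    "geneA\tgeneB\t0.92\n"
    "geneA\tgeneC\t0.45\n"
    "geneB\tgeneC\t0.71\n"
)

def read_interactions(handle, min_score=0.0):
    """Read (gene, gene, score) tuples from a tab-separated interaction table."""
    reader = csv.DictReader(handle, delimiter="\t")
    edges = []
    for row in reader:
        score = float(row["combined_score"])
        if score >= min_score:
            edges.append((row["node1"], row["node2"], score))
    return edges

# With a real file, replace io.StringIO(...) with open("my_network.txt", newline="").
edges = read_interactions(io.StringIO(sample_export), min_score=0.7)
for gene_a, gene_b, score in edges:
    print(gene_a, gene_b, score)
```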

Limitations of STRING

• Reproducibility of networks
– STRING assigns some random variables when constructing networks, so your network may be slightly different each time you construct it

GeneMANIA
• Free-to-use web resource: http://www.genemania.org/
• Association data include:
– Protein interactions
– Genetic interactions
– Pathways
– Co-expression
– Co-localization
– Protein domain similarity
• 9 organisms
– human, mouse, rat, worm, fly, zebrafish, E. coli, Arabidopsis, baker’s yeast

Limitations of GeneMANIA

• Can’t remove nodes
– Can move nodes around, but not remove nodes to perturb the network

• No automatic layout ‘relaxation’

• Only 9 organisms covered

• No clustering available

Cytoscape is an open source software platform for:
• visualizing molecular interaction networks and biological pathways
• integrating networks with annotations, gene expression profiles and other state data
• importing, editing and analysing networks
It is compatible with GeneMANIA, STRING and many others

http://www.cytoscape.org/
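
One simple route for getting your own edge list into Cytoscape is its SIF (simple interaction format), where each line reads "source, interaction type, target". The sketch below writes tab-separated SIF text. The interaction label `pp` and the gene names are arbitrary choices made for illustration.

```python
def to_sif(edges, interaction="pp"):
    """Format (gene, gene) pairs as tab-separated SIF lines."""
    lines = [f"{gene_a}\t{interaction}\t{gene_b}" for gene_a, gene_b in edges]
    return "\n".join(lines) + "\n"

# Hypothetical edges.
sif_text = to_sif([("geneA", "geneB"), ("geneB", "geneC")])
print(sif_text, end="")

# To save for import into Cytoscape:
# with open("my_network.sif", "w") as handle:
#     handle.write(sif_text)
```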

Cytoscape Apps

• Cytoscape has a plethora of apps to help with network analysis

• Some of the commonly used apps include:

iRegulon
• Identifies transcription factors (green) and the target interactions of TFs (purple) within networks

BiNGO
• Graphical Gene Ontology analysis
• The node size is proportional to the number of proteins represented by a functional category
• Color denotes the p-value for each enriched GO term

ClueGO
• ClueGO integrates GO terms as well as pathways
• Creates a functionally organized GO/pathway term network

Why make a network?
• Networks:
– give insight into new interactions
– generate new hypotheses to test
– draw on knowledge obtained from the published literature
• Limitations of Networks:
– Based on published literature, so if you are working on something very novel, information may be limited
– When the number of genes gets too large, networks become difficult to interpret
– Be careful when choosing types of network: “everything is linked to everything else”