Networks and Interactions
Boo Virk, v1.0
I have my gene set, what next?
Interaction Network:
Nodes = genes
Edges = strength of evidence for co-functionality between the connected genes
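A minimal sketch of this representation, assuming made-up gene pairs and evidence scores (none of the values come from a real database): the network is stored as an undirected adjacency map, and edges can then be filtered by strength of evidence.

```python
from collections import defaultdict

# Hypothetical edges: (gene A, gene B, strength of evidence in [0, 1]).
edges = [
    ("TP53", "MDM2", 0.99),
    ("TP53", "ATM", 0.95),
    ("MDM2", "ATM", 0.40),
]


def build_network(edge_list):
    """Return an undirected adjacency map: gene -> {neighbour: weight}."""
    network = defaultdict(dict)
    for a, b, weight in edge_list:
        network[a][b] = weight
        network[b][a] = weight
    return dict(network)


network = build_network(edges)

# Keep only neighbours supported by strong evidence (cut-off chosen arbitrarily).
strong = {
    gene: [n for n, w in neighbours.items() if w >= 0.9]
    for gene, neighbours in network.items()
}
```

Here `strong["ATM"]` keeps only TP53, because the MDM2 to ATM edge falls below the chosen cut-off.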
Why make a network?
• Networks:
– give insight into new interactions
– generate new hypotheses to test
– draw on knowledge obtained from the published literature
• What types of network are there?
• How to construct a network?
– How to link my genes?
Gene Co-expression
• Gene expression data
• Two genes are linked if their expression levels were similar across conditions in a gene expression study
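One common way to quantify "similar across conditions" is the Pearson correlation between two expression profiles. The sketch below assumes that measure and an arbitrary cut-off of 0.9; the expression values are invented, and real tools may use a different similarity measure or threshold.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (assumes non-zero variance)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def coexpression_edges(expression, threshold=0.9):
    """Link every pair of genes whose correlation reaches the threshold."""
    linked = []
    for g1, g2 in combinations(sorted(expression), 2):
        r = pearson(expression[g1], expression[g2])
        if r >= threshold:
            linked.append((g1, g2, r))
    return linked


# Hypothetical expression levels across four conditions.
expression = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.1, 4.0, 6.2, 7.9],
    "geneC": [4.0, 1.0, 3.5, 0.5],
}
coexpr = coexpression_edges(expression)
```

Only geneA and geneB are linked here: their profiles rise together, while geneC is negatively correlated with both.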
Physical Interaction
• Protein-protein interaction (PPI) data
• Two gene products are linked if they were found to interact in a protein-protein interaction study
Genetic Interaction
• Genetic interaction data
• Two genes are linked if the effects of perturbing one gene were modified by perturbations to a second gene
Same Pathway
• Pathway data
• Two gene products are linked if they participate in the same reaction within a pathway
• Unlike in physical interaction networks, the gene products do not have to interact directly; they only have to be part of the same pathway
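As a rough sketch of how shared membership turns into edges, the example below links every pair of genes that appear in the same group. It simplifies the rule above by grouping at the level of whole pathways rather than individual reactions, and the pathway names and members are hypothetical.

```python
from itertools import combinations

# Hypothetical pathway membership (not taken from any real pathway database).
pathways = {
    "pathway_1": {"geneA", "geneB", "geneC"},
    "pathway_2": {"geneC", "geneD"},
}


def pathway_edges(pathway_members):
    """Map each linked gene pair to the set of pathways the two genes share."""
    shared = {}
    for name, members in pathway_members.items():
        for g1, g2 in combinations(sorted(members), 2):
            shared.setdefault((g1, g2), set()).add(name)
    return shared


links = pathway_edges(pathways)
```

geneA and geneD are not linked because they never occur in the same pathway, even though both are linked to geneC.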
Other Types of Network
• Shared protein domain
– Two gene products are linked if they have the same protein domain
• Co-localisation
– Two genes are linked if they are both expressed in the same tissue, or if their gene products are both identified in the same cellular location
• Predicted
– Predicted functional relationships between genes, often protein interactions
• Other
– phenotype correlations from Ensembl
– disease information from OMIM
– chemical genomics data
Constructing Networks
• Many tools are available for reconstructing networks
• In this course: STRING, GeneMANIA and Cytoscape
• Networks should generally be constructed from 50-100 genes if you want to be able to interpret and analyse them easily
STRING
• Free-to-use web resource: http://string-db.org/
• Direct and indirect associations from four sources:
– co-expression
– genomic context
– high-throughput experiments
– published/previous knowledge
• Over 2000 organisms (some more studied than others though)
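Besides the web page, STRING can be queried programmatically. The sketch below only builds a request URL and does not send it. The endpoint layout and the `identifiers`, `species` and `required_score` parameters reflect my understanding of STRING's REST interface and should be checked against the current API documentation; the gene list is an arbitrary example.

```python
from urllib.parse import urlencode

# Assumed base address of the STRING REST interface.
STRING_API = "https://string-db.org/api"


def string_network_url(genes, species=9606, required_score=400, fmt="tsv"):
    """Build (but do not send) a network request for a list of gene names.

    species is an NCBI taxonomy ID (9606 = human); required_score is the
    minimum combined score on a 0-1000 scale.
    """
    params = {
        "identifiers": "\r".join(genes),  # carriage return separates identifiers
        "species": species,
        "required_score": required_score,
    }
    return f"{STRING_API}/{fmt}/network?{urlencode(params)}"


url = string_network_url(["TP53", "MDM2", "ATM"])
```

The resulting URL can be pasted into a browser or fetched with any HTTP client to retrieve the interactions as tab-separated text.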
Clustering
Can cluster in STRING by:
• K-means
• MCL
Customisable cluster level
IMPORTANT NOTE: We will soon improve our clustering tools by adding other algorithms suitable for clustering protein-protein interactions. For now, we suggest using the MCL algorithm (see review paper below). Please note that if your network is small (i.e. it contains few nodes), the MCL algorithm will produce only a single large cluster. In that case, you could simply enlarge the network (e.g. by clicking the '+' button or modifying the STRING parameters at the bottom of the page) and relaunch the clustering.
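To give a feel for what MCL (Markov clustering) does, here is a small pure-Python sketch of the textbook procedure: add self-loops, normalise columns, then alternate expansion (matrix squaring) and inflation (element-wise powering) until the matrix stops changing. It is not STRING's implementation, and the inflation value, tolerance and toy adjacency matrix are assumptions chosen for illustration.

```python
def normalise_columns(m):
    """Scale every column so that it sums to 1."""
    n = len(m)
    sums = [sum(m[i][j] for i in range(n)) for j in range(n)]
    return [[m[i][j] / sums[j] for j in range(n)] for i in range(n)]


def mcl(adjacency, inflation=2.0, max_iter=100, tol=1e-9):
    """Cluster an undirected graph given as a square adjacency matrix."""
    n = len(adjacency)
    # Self-loops keep every column non-empty and stabilise the iteration.
    m = [[float(adjacency[i][j]) + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalise_columns(m)
    for _ in range(max_iter):
        # Expansion: one step of flow spreading (matrix squared).
        expanded = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
                    for i in range(n)]
        # Inflation: strengthen strong flows, weaken weak ones.
        inflated = normalise_columns([[v ** inflation for v in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j]) for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Rows with a non-zero diagonal are attractors; their non-zero columns form a cluster.
    clusters = set()
    for i in range(n):
        if m[i][i] > tol:
            clusters.add(frozenset(j for j in range(n) if m[i][j] > tol))
    return sorted(sorted(c) for c in clusters)


# Toy network: two separate triangles (nodes 0-2 and nodes 3-5).
two_triangles = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(two_triangles)
```

Running the same function on a single triangle returns one cluster containing all three nodes, which mirrors the note above about small networks.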
Enrichment
Can test for enrichment in STRING:
• GO Biological Process
• GO Molecular Function
• GO Cellular Component
• KEGG Pathway
• Domains
• Protein-protein Interaction
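Enrichment for a term such as a GO category is often assessed with a hypergeometric (one-sided Fisher) test: how likely is it to see at least this many annotated genes in the gene set by chance? The slides do not say which statistic STRING uses, so the sketch below is a generic over-representation test with invented numbers, and it leaves out multiple-testing correction.

```python
from math import comb


def enrichment_p_value(overlap, set_size, annotated, background):
    """P(X >= overlap) under the hypergeometric distribution.

    overlap    -- genes in the set that carry the annotation
    set_size   -- genes in the set
    annotated  -- genes in the background that carry the annotation
    background -- all genes considered
    """
    upper = min(set_size, annotated)
    total = comb(background, set_size)
    tail = sum(
        comb(annotated, k) * comb(background - annotated, set_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical example: 10 background genes, 3 annotated, and a 3-gene set
# in which all 3 genes carry the annotation.
p = enrichment_p_value(overlap=3, set_size=3, annotated=3, background=10)
```

In this toy case only one of the 120 possible 3-gene sets contains all three annotated genes, so `p` is 1/120.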
Saving networks in STRING
Save low resolution and high resolution images of your network (PNG, SVG)
Can also save .txt files containing the interactions of your network
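A saved interaction file can be read back for further analysis. The sketch below assumes a tab-separated file with a header row and columns named `node1`, `node2` and `combined_score`; these names are an assumption, so check the header of your own export and adjust them. The example rows are made up.

```python
import csv
import io


def read_interactions(handle, min_score=0.0):
    """Read (gene, gene, score) edges from a tab-separated file with a header row."""
    reader = csv.DictReader(handle, delimiter="\t")
    kept = []
    for row in reader:
        score = float(row["combined_score"])  # assumed column name
        if score >= min_score:
            kept.append((row["node1"], row["node2"], score))
    return kept


# Stand-in for an exported .txt file (hypothetical content).
example = io.StringIO(
    "node1\tnode2\tcombined_score\n"
    "TP53\tMDM2\t0.999\n"
    "MDM2\tATM\t0.412\n"
)
interactions = read_interactions(example, min_score=0.7)
```

With a real file, pass `open("my_network.txt", newline="")` instead of the `StringIO` stand-in; the file name is a placeholder.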
Limitations of STRING
• Reproducibility of networks
– STRING assigns some random variables when constructing networks, so your network may be slightly different each time you construct it
GeneMANIA
• Free-to-use web resource: http://www.genemania.org/
• Association data include:
– Protein interactions
– Genetic interactions
– Pathways
– Co-expression
– Co-localization
– Protein domain similarity
• 9 organisms
– human, mouse, rat, worm, fly, zebrafish, E. coli, arabidopsis, baker’s yeast
Limitations of GeneMANIA
• Can’t remove nodes
– Can move nodes around, but not remove nodes to perturb the network
• No automatic layout ‘relaxation’
• Only 9 organisms covered
• No clustering available
Cytoscape is an open source software platform
• For visualizing molecular interaction networks and biological pathways
• For integrating networks with annotations, gene expression profiles and other state data
• Can import, edit and analyse networks
• Compatible with GeneMANIA, STRING and many others
http://www.cytoscape.org/
Cytoscape Apps
• Cytoscape has a plethora of apps to help with network analysis
• Some of the commonly used apps include:
iRegulon
• Identifies transcription factors (green) and target interactions of TFs (purple) within networks
BiNGO
• Graphical Gene Ontology analysis
• The node size is proportional to the number of proteins represented by the functional category
• Color denotes the p-value for each enriched GO term
ClueGO
• ClueGO integrates GO terms as well as pathways
• Creates a functionally organized GO/pathway term network
Why make a network?
• Networks:
– give insight into new interactions
– generate new hypotheses to test
– draw on knowledge obtained from the published literature
• Limitations of networks:
– Based on published literature, so if you are working on something very novel, information may be limited
– When the number of genes gets too large, networks become difficult to interpret
– Be careful when choosing types of network: “everything is linked to everything else”