VL Network Analysis (19401701) SS2016 Week...
Tim Conrad, VL Network Analysis, SS16 1
Based on slides by J Ruan (U Texas) and U von Luxburg (U Tübingen)
VL Network Analysis (19401701)
SS2016, Week 5
Tim Conrad, AG Medical Bioinformatics, Institut für Mathematik & Informatik, Freie Universität Berlin
Community structure
Source: M. E. J. Newman and M. Girvan, Finding and evaluating community structure in networks, Physical Review E 69, 026113 (2004).
Consider edges that fall within a community or between a community and the rest of the network
Define modularity Q:
A: adjacency matrix
L: total number of links
ki: degree of i-th node
ci: label of module to which i-th node belongs
D: indicator function, 1 if both nodes are in the same cluster
Null model: the probability of an edge between two vertices is proportional to their degrees
Modularity function (Q)
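A minimal sketch of how Q could be computed from the quantities listed on this slide. It assumes the standard Newman-Girvan form for an undirected, unweighted graph, Q = (1/2L) Σ_ij [A_ij − k_i k_j / (2L)] D(c_i, c_j). The function name `modularity` and the toy graph are illustrative assumptions, not part of the lecture.

```python
def modularity(adj, labels):
    """Newman-Girvan modularity of a partition (undirected, unweighted graph).

    adj    -- symmetric 0/1 adjacency matrix as a list of lists (A)
    labels -- labels[i] is the module of node i (c_i)
    """
    n = len(adj)
    degree = [sum(row) for row in adj]      # k_i
    two_l = sum(degree)                     # 2L: every link is counted twice
    if two_l == 0:
        return 0.0
    q = 0.0
    for i in range(n):
        for j in range(n):
            if labels[i] == labels[j]:      # indicator D(c_i, c_j) = 1
                q += adj[i][j] - degree[i] * degree[j] / two_l
    return q / two_l


# Toy graph (assumption): two triangles joined by the single edge 2-3.
edges = [(0, 1), (0, 2), (1, 2), (3, 4), (3, 5), (4, 5), (2, 3)]
A = [[0] * 6 for _ in range(6)]
for u, v in edges:
    A[u][v] = A[v][u] = 1

q_split = modularity(A, [0, 0, 0, 1, 1, 1])   # one module per triangle: 5/14
q_single = modularity(A, [0] * 6)             # everything in one module: 0
```

Splitting the toy graph into its two triangles gives Q = 5/14, while putting all nodes in one module gives Q = 0, which is why a larger Q is read as stronger community structure.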
HQcut
• Ruan & Zhang, Physical Review E 2008
• Apply Qcut to get communities with largest Q
• Recursively search for sub-communities within each community
• When to stop?
– Q value of sub-network is small, or
– Q is not statistically significant
• Estimated by Monte-Carlo method
Applications to a PPI network
• Protein-protein interaction (PPI) network
– Vertices: proteins
– Edges: interactions detected by experiments
• Motivation:
– Community = protein complex?
• Protein complex
– Group of proteins associated via interactions
– Elementary functional unit in the cell
– Prediction from PPI network is important
Experiments
• Data set
– A yeast protein-protein interaction network
• Krogan et al., Nature, 2006
– 2708 proteins, 7123 interactions
• Algorithms:
– Qcut, HQcut, Newman
• Evaluation
– ~300 known protein complexes in MIPS
– How well does a community match a known protein complex?
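The slides do not say how the matching score between a community and a known complex is defined. Purely as an illustration, the sketch below assumes a Jaccard-style overlap (shared proteins divided by all proteins in either set), which equals 1 only for a perfect match. The function names and protein identifiers are hypothetical.

```python
def jaccard(a, b):
    """Overlap of two protein sets: |a & b| / |a | b| (1.0 means identical)."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_match(community, complexes):
    """Return (name, score) of the known complex that best overlaps a community."""
    best_name, best_score = None, 0.0
    for name, members in complexes.items():
        score = jaccard(community, members)
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score


# Hypothetical protein identifiers, for illustration only.
known = {
    "complex_A": {"P1", "P2", "P3", "P4"},
    "complex_B": {"P5", "P6"},
}
match = best_match({"P1", "P2", "P3"}, known)   # 3 of 4 proteins of complex_A
unmatched = best_match({"P9"}, known)           # no overlap with any known complex
```

Under this assumed score, a community with no overlapping complex returns `(None, 0.0)` and would count as a candidate novel prediction.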
Results
| | Newman | Qcut | HQcut |
|---|---|---|---|
| # of communities | 56 | 93 | 316 |
| Max community size | 312 | 264 | 60 |
| # of matched communities | 53 | 52 | 216 |
| Communities with matching score = 1 | 5 (9%) | 7 (13%) | 43 (20%) |
| Average matching score | 0.56 | 0.55 | 0.70 |
| # of novel predictions | 3 | 41 | 100 |
Communities found by HQcut
Small ribosomal subunit (90%)
RNA poly II mediator (83%)
Proteasome core (90%)
Exosome (94%)
gamma-tubulin (77%)
respiratory chain complex IV (82%)
Lecture outline
• Gene expression analysis
• Converting data to networks
• Applications of network clustering methods
Gene Expression Analysis
The early steps of a microarray study
• Scientific Question (biological)
• Study design (biological/statistical)
• Conducting Experiment (biological)
• Preprocessing/Normalising Data (statistical)
• Finding differentially expressed genes (statistical)

| Generation | Approach | Example methods | Period |
|---|---|---|---|
| 1st | Classical statistics | t-tests, ANOVA | Since 1950s |
| 2nd | High-dimensional feature selection; machine learning | SAM, Limma; SVM, neural networks | Since 1990s |
| 3rd | Group-based enrichment analysis | GSEA, GSA, Globaltest | Since 2003 |
| 4th | Pathway analysis | SPIA, TopoGSA | Since 2007 |
Specific Filtering
• t-statistic (one-way ANOVA F-statistic if > 2 samples)
– problem: there often isn’t enough data to estimate variances reliably
• Fold change: simplest method; ratio of expression levels (but as microarray data is typically log-transformed, it is calculated as a difference of means)
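The two filters can be sketched for a single gene as follows. The expression values are hypothetical log2 intensities, and the unequal-variance (Welch) form of the t-statistic is an assumption, since the slide does not say which variant is meant.

```python
import math
from statistics import mean, variance


def log_fold_change(group_a, group_b):
    """Fold change on log-transformed data: difference of the group means."""
    return mean(group_a) - mean(group_b)


def welch_t(group_a, group_b):
    """Two-sample t-statistic with unequal variances (Welch form, an assumption)."""
    se = math.sqrt(variance(group_a) / len(group_a)
                   + variance(group_b) / len(group_b))
    return (mean(group_a) - mean(group_b)) / se


# Hypothetical log2 expression values of one gene in two conditions.
obese = [8.1, 8.4, 7.9, 8.6]
lean = [6.9, 7.2, 7.0, 7.3]

lfc = log_fold_change(obese, lean)   # 1.15 on the log2 scale
ratio = 2 ** lfc                     # back-transformed ratio of expression levels
t = welch_t(obese, lean)
```

With only four values per group the variance estimates in `welch_t` are themselves noisy, which is the problem the first bullet points at.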
A data example
• Lee et al. (2005) compared adipose tissue (abdominal subcutaneous adipocytes) between obese and lean Pima Indians
• Samples were hybridised on HGu95e-Affymetrix arrays (12639 genes/probe sets)
• Available as GDS1498 on the GEO database
The “Result”

| Probe Set ID | log.ratio | pvalue | adj.p |
|---|---|---|---|
| 73554_at | 1.4971 | 0.0000 | 0.0004 |
| 91279_at | 0.8667 | 0.0000 | 0.0017 |
| 74099_at | 1.0787 | 0.0000 | 0.0104 |
| 83118_at | -1.2142 | 0.0000 | 0.0139 |
| 81647_at | 1.0362 | 0.0000 | 0.0139 |
| 84412_at | 1.3124 | 0.0000 | 0.0222 |
| 90585_at | 1.9859 | 0.0000 | 0.0258 |
| 84618_at | -1.6713 | 0.0000 | 0.0258 |
| 91790_at | 1.7293 | 0.0000 | 0.0350 |
| 80755_at | 1.5238 | 0.0000 | 0.0351 |
| 85539_at | 0.9303 | 0.0000 | 0.0351 |
| 90749_at | 1.7093 | 0.0000 | 0.0351 |
| 74038_at | -1.6451 | 0.0000 | 0.0351 |
| 79299_at | 1.7156 | 0.0000 | 0.0351 |
| 72962_at | 2.1059 | 0.0000 | 0.0351 |
| 88719_at | -3.1829 | 0.0000 | 0.0351 |
| 72943_at | -2.0520 | 0.0000 | 0.0351 |
| 91797_at | 1.4676 | 0.0000 | 0.0351 |
| 78356_at | 2.1140 | 0.0001 | 0.0359 |
| 90268_at | 1.6552 | 0.0001 | 0.0421 |
What happened to the biology???
Naive functional analyses
• Manually annotate list of differentially expressed (DE) genes
• Extremely time-consuming, not systematic, user-dependent
• Group together genes with similar function
• Conclude functional categories with most DE genes important in disease/condition under study
• BUT may not be the right conclusion
The Gene Ontology Consortium
GO Consortium
• Developed three structured and controlled vocabularies (ontologies)
• Describe gene products in terms of their
– associated biological processes,
– cellular components and
– molecular functions
in a species-independent manner
• Has become a major resource for microarray data interpretation
The Gene Ontology
• Molecular Function: basic activity or task
– e.g. catalytic activity, calcium ion binding
• Biological Process: broad objective or goal
– e.g. signal transduction, immune response
• Cellular Component: location or complex
– e.g. nucleus, mitochondrion
Slightly more informative results

| Probe Set ID | Gene Symbol | Gene Title | GO biological process term | GO molecular function term | log.ratio | pvalue | adj.p |
|---|---|---|---|---|---|---|---|
| 73554_at | CCDC80 | coiled-coil domain contain | --- | --- | 1.4971 | 0.0000 | 0.0004 |
| 91279_at | C1QTNF5 /// | C1q and tumor necrosis fa | visual perception /// embry | --- | 0.8667 | 0.0000 | 0.0017 |
| 74099_at | --- | --- | --- | --- | 1.0787 | 0.0000 | 0.0104 |
| 83118_at | RNF125 | ring finger protein 125 | immune response /// mod | protein binding /// zinc ion | -1.2142 | 0.0000 | 0.0139 |
| 81647_at | --- | --- | --- | --- | 1.0362 | 0.0000 | 0.0139 |
| 84412_at | SYNPO2 | synaptopodin 2 | --- | actin binding /// protein bin | 1.3124 | 0.0000 | 0.0222 |
| 90585_at | C15orf59 | chromosome 15 open rea | --- | --- | 1.9859 | 0.0000 | 0.0258 |
| 84618_at | C12orf39 | chromosome 12 open rea | --- | --- | -1.6713 | 0.0000 | 0.0258 |
| 91790_at | MYEOV | myeloma overexpressed | --- | --- | 1.7293 | 0.0000 | 0.0350 |
| 80755_at | MYOF | myoferlin | muscle contraction /// bloo | protein binding | 1.5238 | 0.0000 | 0.0351 |
| 85539_at | PLEKHH1 | pleckstrin homology doma | --- | binding | 0.9303 | 0.0000 | 0.0351 |
| 90749_at | SERPINB9 | serpin peptidase inhibitor, | anti-apoptosis /// signal tra | endopeptidase inhibitor ac | 1.7093 | 0.0000 | 0.0351 |
| 74038_at | --- | --- | --- | --- | -1.6451 | 0.0000 | 0.0351 |
| 79299_at | --- | --- | --- | --- | 1.7156 | 0.0000 | 0.0351 |
| 72962_at | BCAT1 | branched chain aminotran | G1/S transition of mitotic c | catalytic activity /// branch | 2.1059 | 0.0000 | 0.0351 |
| 88719_at | C12orf39 | chromosome 12 open rea | --- | --- | -3.1829 | 0.0000 | 0.0351 |
| 72943_at | --- | --- | --- | --- | -2.0520 | 0.0000 | 0.0351 |
| 91797_at | LRRC16A | leucine rich repeat contain | --- | --- | 1.4676 | 0.0000 | 0.0351 |
| 78356_at | TRDN | triadin | muscle contraction | receptor binding | 2.1140 | 0.0001 | 0.0359 |
| 90268_at | C5orf23 | chromosome 5 open read | --- | --- | 1.6552 | 0.0001 | 0.0421 |
• If we are lucky, some of the top genes mean something to us
• But what if they don’t?
• And what are the results for other genes with similar biological functions?
Major bioinformatic developments
• Requires annotating entire set of genes
• The Gene Ontology Consortium (www.geneontology.org)
• Automated, statistical approaches for annotating gene lists and performing functional profiling
Functional profiling tools
Identify GO categories with significantly more DE genes than expected by chance (i.e. over-represented among DE genes relative to representation on the array as a whole)
Correct for testing multiple GO categories
Hypergeometric Distribution or Fisher’s Exact Test
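A minimal sketch of the over-representation test, assuming the one-sided hypergeometric tail probability (equivalent to a one-sided Fisher's exact test). The slides do not name the multiple-testing correction; Bonferroni is used here only because it is the simplest choice. All counts and function names are hypothetical.

```python
from math import comb


def hypergeom_pvalue(n_array, n_category, n_de, n_overlap):
    """P(X >= n_overlap) when n_de genes are drawn without replacement from
    n_array genes, n_category of which belong to the GO category."""
    upper = min(n_category, n_de)
    total = comb(n_array, n_de)
    tail = sum(
        comb(n_category, x) * comb(n_array - n_category, n_de - x)
        for x in range(n_overlap, upper + 1)
    )
    return tail / total


def bonferroni(pvalues):
    """Bonferroni adjustment: multiply each p-value by the number of tests."""
    m = len(pvalues)
    return [min(1.0, p * m) for p in pvalues]


# Hypothetical counts: 1000 genes on the array, 50 of them DE, and three
# GO categories given as (category size, DE genes in the category).
categories = [(40, 10), (100, 5), (20, 1)]
raw = [hypergeom_pvalue(1000, size, 50, hits) for size, hits in categories]
adjusted = bonferroni(raw)
```

In this made-up example the first category holds 10 DE genes where about 2 would be expected by chance, so it gets a very small p-value. The second category holds exactly its expected share and does not stand out.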
Biological Interpretation
• Interpretation still requires substantial work
– search literature and public databases
– likely functional consequences of the changes
– are the genes identified as significant within each GO category up- or down-regulated?
– genes within a category can have opposite effects, e.g. apoptosis would include genes that induce or repress apoptosis
More than GO…
• Methods of how to incorporate biological knowledge into microarray analysis
• The type of knowledge we deal with is rather simple: we know groups/sets of genes that, for example,
– have a similar function (e.g. GO)
– belong to the same pathway
– are located on the same chromosome, etc.
• We will assume these groupings to be given
– i.e. we will not discuss methods for detecting pathways, networks, or gene clusters
What is a pathway?
• No clear definition
• Wikipedia: “In biochemistry, metabolic pathways are series of chemical reactions occurring within a cell. In each pathway, a principal chemical is modified by chemical reactions.”
• These pathways describe enzymes and metabolites
• But often the word “pathway” is also used to describe gene regulatory networks or protein interaction networks
• In all cases a pathway describes a biological function very specifically
What is a Gene Set?
• Just what it says: a set of genes!
– All genes involved in a pathway are an example of a Gene Set
– All genes corresponding to a Gene Ontology term are a Gene Set
– All genes mentioned in a paper of Smith et al might form a Gene Set
• A Gene Set is a much more general and less specific concept than a pathway
• Still: we will sometimes use the two terms interchangeably, as the analysis methods are mainly the same
What is Gene Set/Pathway analysis?
• The aim is to give one number (score, p-value) to a Gene Set/Pathway
• Are many genes in the pathway differentially expressed (up-regulated/down-regulated)?
• Can we give a number (p-value) to the probability of observing these changes just by chance?
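One way to attach such a number to a gene set is a permutation test, sketched below under stated assumptions. The set score is taken to be the mean absolute gene-level statistic, and the null distribution comes from random gene sets of the same size. This is not any specific published method, and all data and names are hypothetical.

```python
import random


def gene_set_pvalue(stats, gene_set, n_perm=2000, seed=0):
    """Permutation p-value for the mean |statistic| of a gene set.

    stats    -- dict mapping gene -> gene-level statistic (e.g. log ratio)
    gene_set -- genes in the set / pathway
    The null distribution uses random gene sets of the same size.
    """
    rng = random.Random(seed)
    genes = list(stats)
    members = [g for g in gene_set if g in stats]
    k = len(members)
    observed = sum(abs(stats[g]) for g in members) / k
    hits = 0
    for _ in range(n_perm):
        sample = rng.sample(genes, k)
        score = sum(abs(stats[g]) for g in sample) / k
        if score >= observed:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical data: 200 genes, the first 10 shifted away from zero.
data_rng = random.Random(1)
stats = {}
for i in range(200):
    stats[f"g{i}"] = data_rng.gauss(0, 1) + (3.0 if i < 10 else 0.0)

p_shifted = gene_set_pvalue(stats, [f"g{i}" for i in range(10)])
p_random = gene_set_pvalue(stats, [f"g{i}" for i in range(100, 110)])
```

The set of shifted genes receives a small p-value because random sets of ten genes almost never reach the same score. The unshifted set is compared against the same null distribution and is not expected to stand out.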
Classes of Gene Set Analysis
Khatri et al. PLOS Comp Bio. 8:1 2012
Example tools shown in the figure: DAVID, GSEA, Reactome FI network, PARADIGM
Limitations of Gene Set Enrichment Analysis
• Many possible gene sets – diseases, molecular function, biological process, cellular compartment, pathways...
• Gene sets are heavily overlapping; need to sort through lists of enriched gene sets!
• “Bags of genes” obscure regulatory relationships among them.
Pathway Analysis
Pathway Databases
• Advantages:
– Usually curated.
– Biochemical view of biological processes.
– Cause and effect captured.
– Human-interpretable visualizations.
• Disadvantages:
– Sparse coverage of genome.
– Different databases disagree on boundaries of pathways.
KEGG
Reactome
• Hand-curated pathways in human.
• Rigorous curation standards: every reaction traceable to primary literature.
• Automatically-projected pathways to non-human species.
• 22 species; 1112 human pathways; 5078 proteins.
• Features:
– Google-map style reaction diagrams with overlays;
– Find pathways containing your gene list;
– Calculate gene overrepresentation in pathways;
– Find corresponding pathways in other species.
• Open access.
Pathway Commons
Pathway Colorization
• Main feature offered by all pathway databases.
• Upload a gene list.
• Database calculates an enrichment score on each pathway and displays ranked list.
• Browse into pathways of interest; download colorized pictures.
Example from Reactome
Curated Human Data – Version 35
5078 proteins, 4166 reactions, 3870 complexes, 1112 pathways
Only ~25% of genome!
Goal: add a “corona” of uncurated interaction data around scaffold of curated pathway data.
Example: Reactome FI Network
More than pathways
Networks
• Pathways capture only the “well understood” portion of biology.
• Networks cover less well understood relationships:
– Genetic interactions
– Physical interaction
– Coexpression
– GO term sharing
– Adjacency in pathways
Gene Expression Networks
Microarray data
• Data organized into a matrix
– Rows are genes
– Columns are samples representing different time points, conditions, tissues, etc.
• Analysis techniques
– Differential expression analysis
– Classification and clustering
– Regulatory network construction
– Enrichment analysis
• Characteristics of microarray data
– High dimensionality and noise
– Underlying topology unknown, often irregular shape
[Figure: expression heat map; rows: genes, columns: samples; red: high activity, green: low activity]
Microarray data clustering
• Many clustering algorithms available
– K-means
– Hierarchical
– Self-organizing maps
– Parameters are hard to tune
– Do not consider network topology
[Figure: expression heat map; rows: genes, columns: samples; red: high activity, green: low activity]
Analyze genes in each cluster
• Common functions?
• Common regulation?
• Predict functions for unknown genes?
From Data to Networks
Network-based data analysis
Distances & Similarity
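The slide content for this heading is an image, so the measures below are an assumption about what is typically meant. This is a small sketch of a distance (Euclidean) and a similarity (Pearson correlation) between two expression profiles, plus the usual conversion of the similarity into a dissimilarity. The profiles are hypothetical.

```python
import math


def euclidean(x, y):
    """Euclidean distance between two expression profiles."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))


def pearson(x, y):
    """Pearson correlation coefficient, a similarity in [-1, 1]."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def correlation_distance(x, y):
    """Dissimilarity derived from the correlation: 0 for perfectly correlated profiles."""
    return 1.0 - pearson(x, y)


# Hypothetical profiles of three genes over four samples.
g1 = [1.0, 2.0, 3.0, 4.0]
g2 = [2.0, 4.0, 6.0, 8.0]    # same shape as g1, larger amplitude
g3 = [4.0, 3.0, 2.0, 1.0]    # opposite trend

d_euclid = euclidean(g1, g2)                      # sqrt(30): far apart in absolute terms
d_corr_same = correlation_distance(g1, g2)        # 0: identical shape
d_corr_opposite = correlation_distance(g1, g3)    # 2: perfectly anti-correlated
```

The example shows why the choice matters for expression data: `g1` and `g2` are far apart in Euclidean terms but have identical shape, so a correlation-based measure treats them as equivalent.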
Directed k-nearest neighbor graph
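A minimal sketch of the construction, assuming Euclidean distance: every point gets an outgoing edge to each of its k nearest neighbors. The function name and the toy data are illustrative assumptions.

```python
import math


def knn_directed(points, k):
    """Directed k-nearest-neighbor graph as a dict: i -> list of its k nearest points."""
    graph = {}
    for i, p in enumerate(points):
        others = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        others.sort()                      # nearest first
        graph[i] = [j for _, j in others[:k]]
    return graph


# Hypothetical 1-D data: a tight pair, a nearby point, and an outlier.
points = [(0.0,), (1.0,), (3.0,), (10.0,)]
graph = knn_directed(points, k=1)
# graph == {0: [1], 1: [0], 2: [1], 3: [2]}
```

The relation is not symmetric: point 2 has point 1 as its nearest neighbor, but point 1 points to point 0 instead.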
Undirected k-nearest neighbor graph
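Two common ways to make a kNN graph undirected are to keep an edge if either endpoint lists the other among its k nearest neighbors, or only if both do (the "mutual" variant). Which of the two the slides illustrate is an assumption here; the sketch below supports both, using Euclidean distance and hypothetical data.

```python
import math


def knn_sets(points, k):
    """For each point, the set of indices of its k nearest neighbors."""
    result = []
    for i, p in enumerate(points):
        order = sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)
        result.append({j for _, j in order[:k]})
    return result


def knn_undirected(points, k, mutual=False):
    """Undirected kNN graph as a set of edges (i, j) with i < j.

    mutual=False: keep an edge if i is among j's neighbors OR j among i's.
    mutual=True : keep an edge only if both directions hold.
    """
    nn = knn_sets(points, k)
    edges = set()
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            i_to_j, j_to_i = j in nn[i], i in nn[j]
            keep = (i_to_j and j_to_i) if mutual else (i_to_j or j_to_i)
            if keep:
                edges.add((i, j))
    return edges


# Hypothetical 1-D data: a tight pair, a nearby point, and an outlier.
points = [(0.0,), (1.0,), (3.0,), (10.0,)]
symmetric = knn_undirected(points, k=1)             # {(0, 1), (1, 2), (2, 3)}
mutual = knn_undirected(points, k=1, mutual=True)   # {(0, 1)}
```

The "either" rule keeps the outlier connected to the rest of the data, while the mutual rule leaves only the edge between the two points that choose each other.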
epsilon-neighborhood Graph
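A minimal sketch of this construction, assuming Euclidean distance and the "at most epsilon" convention (some definitions use a strict inequality). The function name and the points are hypothetical.

```python
import math


def epsilon_graph(points, eps):
    """Connect every pair of points whose distance is at most eps."""
    edges = set()
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= eps:
                edges.add((i, j))
    return edges


# Hypothetical 2-D data: three nearby points and one far away.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
edges = epsilon_graph(points, eps=1.2)   # {(0, 1), (0, 2)}
```

Unlike the kNN constructions, the number of neighbors per point is not fixed: the distant point stays isolated here, and raising `eps` to 1.5 additionally connects points 1 and 2.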