From gene expression profile to network analysis
4th IPM-NUS workshop
Pegah Khosravi
03/10/2015
We put math and biology into a blender and drink the resulting smoothie
Outline
• Introduction to gene expression and network analysis
• Network-based approach reveals Y chromosome influences prostate cancer susceptibility
• Computational hands-on
Expression profile
• Gene expression profiling is the measurement of the expression of thousands of genes at once, to create a global picture of cellular function.
• Quantitative PCR
• Next-generation sequencing (NGS)
• Microarray: microarrays are far more common, accounting for 65858 PubMed articles by March 2015
Microarray
• The core principle behind microarrays is hybridization between two DNA strands, the property of complementary nucleic acid sequences to specifically pair with each other by forming hydrogen bonds between complementary nucleotide base pairs.
• Single-channel: Affymetrix, Illumina
• Two-channel: Eppendorf, TeleChem
Affymetrix versus Illumina
• Affymetrix
• 25mer
• Probe synthesized on chips
• Multiple probes/probeset
• May have multiple probes/transcript
• .dat, .cel, .cdf, .chp file types
• Normalization methods such as quantile
• Txt output can be used for downstream data analysis
• Annotations can be updated
• Illumina
• Longer oligo
• Bead technology
• Single probe
• May have multiple probes/transcript
• Image file processed by Bead Studio
• Several normalization methods
• Txt output can be used for downstream data analysis
• Annotations can be updated
From experiment to data
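Figure: workflow from microarray chips, through images scanned by laser and datasets, to data mining and analysis and prediction for a new sample.

| Gene | Value |
|---|---|
| D26528_at | 193 |
| D26561_cds1_at | -70 |
| D26561_cds2_at | 144 |
| D26561_cds3_at | 33 |
| D26579_at | 318 |
| D26598_at | 1764 |
| D26599_at | 1537 |
| D26600_at | 1204 |
| D28114_at | 707 |

| Class | Sno | D26528 | D63874 | D63880 | … |
|---|---|---|---|---|---|
| ALL | 2 | 193 | 4157 | 556 | |
| ALL | 3 | 129 | 11557 | 476 | |
| ALL | 4 | 44 | 12125 | 498 | |
| ALL | 5 | 218 | 8484 | 1211 | |
| AML | 51 | 109 | 3537 | 131 | |
| AML | 52 | 106 | 4578 | 94 | |
| AML | 53 | 211 | 2431 | 209 | |
| … | | | | | |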
GEO-Affymetrix data
Figure: GEO record for an Affymetrix dataset, with callouts for the probe set ID, signal value, total probesets, and raw files.
Gene expression data
Figure panels: Data; Data (log scale)
Always log-transform your data
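As a minimal sketch of the log step (not the workshop's own code), the snippet below applies a log2 transform to raw intensities. The function name `log2_transform` and the pseudo-count `offset` are assumptions made for illustration; values that remain non-positive after the offset (such as the negative signals some pipelines report) would need flooring first.

```python
import math


def log2_transform(values, offset=1.0):
    """Return log2(value + offset) for each raw intensity.

    `offset` is an assumed pseudo-count that keeps zero intensities finite.
    math.log2 raises ValueError if value + offset is not positive.
    """
    return [math.log2(v + offset) for v in values]


# Hypothetical raw intensities spanning three orders of magnitude.
raw = [0.0, 1.0, 3.0, 1023.0]
logged = log2_transform(raw)
```

On the log scale a doubling and a halving are the same distance from no change, which is why fold changes are usually read from log-transformed data.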
Quantilenorm
Normalize your data to avoid systematic (non-biological) effects.
Quantile normalization is a technique for making two or more distributions identical in statistical properties.
Is that already a result? No! It’s just data, not knowledge.
We need to use this data to answer a scientific question.
• NormData = quantilenorm(data)
• boxplot(data)
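To make the idea concrete, here is a rough Python sketch of quantile normalization. It is an illustration of the technique, not a reimplementation of MATLAB's `quantilenorm`, and it may treat ties differently. The function name and the toy data are assumptions.

```python
def quantile_normalize(columns):
    """Quantile-normalize equal-length columns (one list per array or sample)."""
    n = len(columns[0])
    if any(len(col) != n for col in columns):
        raise ValueError("all columns must have the same length")

    # Mean of the k-th smallest value across all columns.
    sorted_cols = [sorted(col) for col in columns]
    rank_means = [sum(col[k] for col in sorted_cols) / len(columns) for k in range(n)]

    normalized = []
    for col in columns:
        # Positions of the column ordered by value; ties are broken by position.
        order = sorted(range(n), key=lambda i: col[i])
        out = [0.0] * n
        for rank, idx in enumerate(order):
            out[idx] = rank_means[rank]
        normalized.append(out)
    return normalized


# Two hypothetical samples measured on the same three probes.
samples = [[5.0, 2.0, 3.0], [4.0, 1.0, 6.0]]
norm = quantile_normalize(samples)
```

After normalization every sample holds the same set of values, so side-by-side box plots of the normalized samples should look identical.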
Fold change and Hierarchical clustering
Figure: two log-log scatter plots of 72 (raw) against 72 (control), with axis ticks from 0.01 to 10000.
• pvalues = mattest(dependentData, independentData);
• mavolcanoplot(dependentData, independentData, pvalues,'Labels', probesetIDs)
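The two quantities behind a volcano plot can be sketched for a single gene as follows. This assumes the data are already on a log2 scale and uses Welch's form of the two-sample t statistic as an assumption; it does not claim to match `mattest` exactly. Converting the statistic into a p-value needs a t-distribution CDF (for example `scipy.stats.ttest_ind`), which is left out to keep the sketch dependency-free.

```python
import math
from statistics import mean, variance


def log2_fold_change(treated, control):
    """Difference of mean log2 expression, treated minus control."""
    return mean(treated) - mean(control)


def welch_t(treated, control):
    """Welch's t statistic for two samples with possibly unequal variances."""
    se = math.sqrt(variance(treated) / len(treated) + variance(control) / len(control))
    return (mean(treated) - mean(control)) / se


# Hypothetical log2 expression values of one gene in two groups.
tumor = [8.0, 9.0, 10.0]
normal = [6.0, 7.0, 8.0]
lfc = log2_fold_change(tumor, normal)  # 2.0 on the log2 scale, a 4-fold change
t_stat = welch_t(tumor, normal)
```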
• A heat map is a graphical representation of data where the individual values contained in a matrix are represented as colors.
• prostate = clustergram(data(1:40,:),'Standardize','Row')
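Row standardization, which the `'Standardize','Row'` option requests, can be sketched as a per-gene z-score. This is an illustrative Python version; whether `clustergram` uses the sample or the population standard deviation is not stated in the slides, so the use of the sample form here is an assumption.

```python
from statistics import mean, stdev


def standardize_rows(matrix):
    """Z-score each row (gene) so that rows are comparable in a heat map."""
    out = []
    for row in matrix:
        mu = mean(row)
        sd = stdev(row)  # sample standard deviation (assumed)
        # A constant row carries no pattern; map it to zeros instead of dividing by 0.
        out.append([(v - mu) / sd if sd > 0 else 0.0 for v in row])
    return out


# Hypothetical expression matrix: two genes across three samples.
heat = standardize_rows([[1.0, 2.0, 3.0], [10.0, 10.0, 10.0]])
```

Without this step, highly expressed genes would dominate the color scale and hide the pattern of lowly expressed ones.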
Microarray Applications and limitations
• Biological discovery
• new and better molecular diagnostics
• new molecular targets for therapy
• finding and refining biological pathways
• Mutation and polymorphism detection
• Limitations
• Chip to chip variation
• What fold change has biological relevance?
• Expensive! Not every lab can afford to repeat experiments.
• The real limitation is Bioinformatics
Why do we need network analysis?
• Data normalization
• Selecting significantly changed genes (criteria: FC ≥ 2, p-value ≤ 0.05)
• 1157 probe IDs for 978 unique genes (FC ≥ 2 in at least one stage)
• Identifying up- and down-regulated genes for each stage
• Adjacent: 65 genes up-regulated, 13 genes down-regulated
• Tumor: 178 genes up-regulated, 137 genes down-regulated
• Metastatic: 418 genes up-regulated, 634 genes down-regulated
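The selection criteria above can be sketched as a small classifier. The gene names and values are hypothetical, and the function assumes fold changes are given on a log2 scale, so FC ≥ 2 becomes |log2 FC| ≥ 1.

```python
import math


def classify(log2_fc, p_value, fc_cutoff=2.0, p_cutoff=0.05):
    """Label a gene 'up', 'down', or 'unchanged' from its log2 fold change and p-value."""
    if p_value > p_cutoff or abs(log2_fc) < math.log2(fc_cutoff):
        return "unchanged"
    return "up" if log2_fc > 0 else "down"


# Hypothetical genes: (log2 fold change, p-value).
genes = {
    "geneA": (1.6, 0.01),
    "geneB": (-2.3, 0.03),
    "geneC": (3.0, 0.20),   # large change but not significant
    "geneD": (0.4, 0.001),  # significant but small change
}
labels = {name: classify(fc, p) for name, (fc, p) in genes.items()}
```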
Biological networks
• Biological networks
• PPI, GRN, metabolic networks, etc.
• Essential nodes
• Hubs
• Bottlenecks
• Driver genes
• Association
• Correlation coefficient
• Mutual information (MI)
• Maximal information coefficient (MIC)
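One simple way to look for hubs is to rank nodes by degree. The sketch below assumes an undirected edge list with hypothetical gene names; bottlenecks and driver genes need other measures (for example betweenness) that are not shown here.

```python
from collections import Counter


def degrees(edges):
    """Count how many edges touch each node of an undirected network."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


def top_hubs(edges, k=1):
    """Return the k nodes with the highest degree."""
    return [node for node, _ in degrees(edges).most_common(k)]


# Hypothetical co-expression edges.
edges = [("g1", "g2"), ("g1", "g3"), ("g1", "g4"), ("g2", "g3")]
hubs = top_hubs(edges, k=1)
```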
DREAM competition
http://dreamchallenges.org/
MI-based algorithms
• $\mathrm{MI}(X;Y) = \sum_{x,y} P(x,y)\,\log \dfrac{P(x,y)}{P(x)\,P(y)}$
• RELNET algorithm (RELevance NETworks)
• ARACNe (Algorithm for the Reconstruction of Accurate Cellular Networks)
• CLR (Context Likelihood of Relatedness)
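The MI formula can be evaluated directly from paired observations with a plug-in estimate, as in the sketch below. It assumes the expression values have already been discretized into bins (expression data are continuous, and the binning choice is not covered here). It illustrates the formula only and is not any of the three algorithms listed.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Plug-in MI estimate in bits from paired discrete observations."""
    n = len(xs)
    pxy = Counter(zip(xs, ys))
    px = Counter(xs)
    py = Counter(ys)
    mi = 0.0
    for (x, y), count in pxy.items():
        p_joint = count / n
        # One term of the sum: P(x,y) * log( P(x,y) / (P(x) * P(y)) ).
        mi += p_joint * math.log2(p_joint / ((px[x] / n) * (py[y] / n)))
    return mi


# Hypothetical binned expression of gene pairs across four samples.
same = mutual_information([0, 0, 1, 1], [0, 0, 1, 1])   # perfectly dependent
indep = mutual_information([0, 0, 1, 1], [0, 1, 0, 1])  # independent
```

Perfectly dependent binary profiles give 1 bit and independent ones give 0. Unlike a correlation coefficient, MI also captures non-linear dependence.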
ROC and PR
• Recall that PR_AUC is based on precision and recall (= TPR = sensitivity):
• Precision = TP / (TP + FP)
• Recall = Sensitivity = TPR = TP / (TP + FN)
• And recall that ROC_AUC is based on TPR (= recall = sensitivity) and FPR:
• TPR = TP / (TP + FN)
• FPR = FP / (FP + TN)
• Receiver Operating Characteristic (ROC) curves are commonly used to present results for binary decision problems in machine learning. However, when dealing with highly skewed datasets, Precision-Recall (PR) curves give a more informative picture of an algorithm's performance.
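For a predicted network, these quantities can be computed by comparing edge sets, as in the sketch below. The gene names and edges are hypothetical, and the function assumes both edge sets are subsets of `all_pairs`, with each pair written in the same order.

```python
from itertools import combinations


def edge_metrics(predicted, true_edges, all_pairs):
    """Precision, recall (TPR) and FPR of a predicted edge set against a reference."""
    tp = len(predicted & true_edges)
    fp = len(predicted - true_edges)
    fn = len(true_edges - predicted)
    tn = len(all_pairs) - tp - fp - fn
    return {
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "recall": tp / (tp + fn) if tp + fn else 0.0,
        "fpr": fp / (fp + tn) if fp + tn else 0.0,
    }


# Five hypothetical genes give 10 candidate undirected edges.
nodes = ["g1", "g2", "g3", "g4", "g5"]
all_pairs = set(combinations(nodes, 2))
true_edges = {("g1", "g2"), ("g2", "g3")}
predicted = {("g1", "g2"), ("g1", "g3"), ("g3", "g4")}
metrics = edge_metrics(predicted, true_edges, all_pairs)
```

Real networks are sparse, so true negatives vastly outnumber true edges. The FPR then stays small even when most predictions are wrong, while precision drops, which is the skew the last bullet refers to.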
Network reconstruction
Other databases
Prostate metastasis network
Figure: network with nodes STAT1, AR, HLF, TCF21, ISL1, GATA3, KLF6, SMAD3, NHLH2, EGR3, FOS, NKX2-2, FOXF1, ATF6, PBX1, HOXC6, FOXA1, ELK4, and VDR. Legend: up-regulated, not changed, down-regulated.
Cytoscape
• Cytoscape supports many use cases in molecular and systems biology, genomics, and proteomics:
• Load molecular and genetic interaction data sets in many standard formats
• Project and integrate global datasets and functional annotations
• Establish powerful visual mappings across these data
• Perform advanced analysis and modeling using Cytoscape Apps
• Visualize and analyze human-curated pathway datasets such as WikiPathways, Reactome, and KEGG
Programming
Google is my best friend
Network-based approach
Y-Chromosome
• 60 genes (loci) on the human Y chromosome
• All genes that interact with Y-chromosome genes, taken from the iHOP database (471 genes)
• GDS2545 from GEO (310 genes with FC ≥ 1.5 and p-value ≤ 0.05)
• Grouped Normal and Adjacent samples as normal prostate tissue, and Tumor and Metastasis samples as cancerous prostate tissue
• The resulting normal and cancerous networks contain 1973 and 1831 interactions, respectively
• 80 genes with high bottleneck (BN) and hub scores
• Extracted a sub-network from the normal and cancer co-expression networks using the Y-chromosome gene list
• We detected 22 genes related to the Y chromosome in our sub-network
Expression alteration
• The roles of some genes, such as PRKY, PCDH11Y, PRY2, USP9Y, EIF1AY, NLGN4Y, ZFY, DDX3Y, BPY2, SRY, UTY, KDM5D, and TMSB4Y, are well known in prostate cancer
• AMELY, DAZ4, RBMY1J, RBMY1E, VCY1B, RPS4Y1, CDY1B, XKRY2 and CYORF15B may have unknown roles in prostate cancer.
• CYORF15B, RPS4Y1, PRY2, RBMY1E, and DAZ4 are up-regulated in the cancerous stage
• KDM5D, USP9Y, RBMY1J, and DDX3Y are down-regulated in the cancerous stage
New modulation score
Pathways and GOs
• We identified 18 distinct biological processes (BPs) and pathways whose constituent genes are significantly co-expressed with each other in either the normal or the cancerous state
• Y-chromosome genes such as PRKY, RPS4Y1, and USP9Y are involved in protein phosphorylation, cellular protein metabolic process, and the transforming growth factor beta receptor signaling pathway, respectively.
Network rewiring
• The positive regulation of the phosphatidylinositol 3-kinase cascade is activated only in the cancerous stage
• Genes collaborating in the TNF signaling pathway are intra-connected in the normal stage.
• This novel network-based analysis suggests that significant biases exist between the two stage-specific co-expression networks when their constituent genes are classified by Gene Ontology (GO) terms or KEGG pathways.
Computational hands-on
• Go to http://cytoscape.org/ and download the last version, 2.8.3
• Import a network
• Analyze it with different apps such as cytoHubba, MCODE, and BiNGO
What we have covered today
• Analysis of gene expression
• Reconstruction of Gene Networks
• Recommending candidate genes, processes, and pathways for future research in the field of cancer studies.
• Computational hands-on with Cytoscape
Thanks