Signed weighted gene co-expression network analysis of transcriptional regulation in murine embryonic stem cells
Steve Horvath
University of California, Los Angeles
Acknowledgement: Dissertation work of Mike J Mason
Guoping Fan, Kathrin Plath, Qing Zhou
Figure: ES cell culture (self-renewing; ectoderm, mesoderm, endoderm)
Contents
• Weighted Gene Co-Expression Network Analysis
• Application to stem cell data
How to construct a weighted gene co-expression network?
Bin Zhang and Steve Horvath (2005) "A General Framework for Weighted Gene Co-Expression Network Analysis", Statistical Applications in Genetics and Molecular Biology: Vol. 4: No. 1
Undirected Network = Adjacency Matrix
• A network can be represented by an adjacency matrix, A = [a_ij], that encodes whether/how a pair of nodes is connected.
– A is a symmetric matrix with entries in [0,1]
– For an unweighted network, entries are 1 or 0 depending on whether or not 2 nodes are adjacent (connected)
– For weighted networks, the adjacency matrix reports the connection strength between gene pairs
Steps for constructing a co-expression network
A) Gene expression data
B) Measure concordance of gene expression with a Pearson correlation
C) The Pearson correlation matrix is either dichotomized to arrive at an unweighted adjacency matrix → unweighted network
Or transformed continuously with the power adjacency function → weighted network
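
Steps A to C can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the expression values are made up, the hard threshold `tau=0.9` is a hypothetical choice, and the helper names (`pearson`, `hard_threshold`, `soft_threshold`) are not taken from the WGCNA R package.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlation_matrix(expr):
    """expr holds one expression profile (list of sample values) per gene."""
    return [[pearson(gi, gj) for gj in expr] for gi in expr]

def hard_threshold(cor, tau):
    """Unweighted network: entry is 1 if |cor| >= tau, else 0."""
    return [[1 if abs(c) >= tau else 0 for c in row] for row in cor]

def soft_threshold(cor, beta):
    """Weighted (unsigned) network: entry is |cor| raised to the power beta."""
    return [[abs(c) ** beta for c in row] for row in cor]

# A) Toy expression data: 3 genes x 5 samples (made-up values).
expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.1, 5.9, 8.2, 9.8],
    [5.0, 3.0, 4.0, 1.0, 2.0],
]
# B) Pearson correlation between every pair of genes.
cor = correlation_matrix(expr)
# C) Dichotomize (unweighted) or transform continuously (weighted).
unweighted = hard_threshold(cor, tau=0.9)   # tau is a hypothetical cutoff
weighted = soft_threshold(cor, beta=6)
```

In this toy example genes 1 and 3 have a correlation of -0.8: the hard threshold drops the pair entirely, while the soft threshold keeps a reduced connection strength.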
Power adjacency function for constructing unsigned and signed weighted gene co-expression networks
Unsigned network, absolute value:
$$a_{ij} = |\mathrm{cor}(x_i, x_j)|^{\beta}$$
Signed network preserves sign info:
$$a_{ij} = |0.5 + 0.5 \times \mathrm{cor}(x_i, x_j)|^{\beta}$$
Default values: beta=6 for unsigned and beta=12 for signed networks. Alternatively, use the “scale free topology criterion” described in Zhang and Horvath 2005.
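
To see what the two adjacency functions do to a negative correlation, the following sketch evaluates both with the default powers quoted on the slide. The function names are illustrative and are not the WGCNA R API.

```python
def unsigned_adjacency(cor, beta=6):
    """Unsigned network: a_ij = |cor|^beta, so the sign of cor is discarded."""
    return abs(cor) ** beta

def signed_adjacency(cor, beta=12):
    """Signed network: a_ij = |0.5 + 0.5 * cor|^beta, so negative cor maps near 0."""
    return abs(0.5 + 0.5 * cor) ** beta

for r in (-0.8, 0.0, 0.8):
    print(r, unsigned_adjacency(r), signed_adjacency(r))
```

With cor = -0.8 the unsigned adjacency is about 0.26, the same as for cor = +0.8, whereas the signed adjacency is about 1e-12. Strongly negatively correlated genes are therefore treated as connected in an unsigned network and as unconnected in a signed one.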
Comparing adjacency functions for transforming the correlation into a measure of connection strength
Figure panels: Unsigned Network; Signed Network
Why soft thresholding as opposed to hard thresholding?
1. Preserves the continuous nature of the co-expression information
2. Results tend to be more robust with regard to different threshold choices
But hard thresholding has its own advantages: in particular, graph theoretic algorithms from the computer science community can be applied to the resulting networks
Question: Are signed correlation networks superior to unsigned networks?
Answer: Overall, recent applications have convinced me that signed networks are preferable.
• For example, signed networks were critical in a recent stem cell application
• Michael J Mason, Kathrin Plath, Qing Zhou, SH (2009) Signed Gene Co-expression Networks for Analyzing Transcriptional Regulation in Murine Embryonic Stem Cells. BMC Genomics 2009, 10:327
Re-analysis of published microarray data sets
• Ivanova N, Dobrin R, Lu R, Kotenko L, Levorse J, DeCoste C, Schafer X, Lun Y, Lemischka I: Dissecting self-renewal in stem cells with RNA interference. Nature 2006, 442:533-538
• Zhou Q, Chipperfield H, Melton DA, Wong WH: A gene regulatory network in mouse embryonic stem cells. Proc Natl Acad Sci 2007, 104(42):16438-16443.
ES Cell Datasets Used
• Ivanova et al.: RNAi knockdown of 8 TFs thought to play a role in pluripotency
• Zhou et al.: ES cell samples and differentiated cell samples sorted into Oct4 positive and negative groups
Figure column labels: ES / Oct4+, Oct4-
How to detect network modules?
As default, we define modules as branches of a cluster tree
• We use average linkage hierarchical clustering, which takes a measure of interconnectedness as input
– often the topological overlap measure
• Once a dendrogram is obtained from a hierarchical clustering method, we define modules as branches using a branch cutting method
– dynamicTreeCut R package (Peter Langfelder et al 2007)
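
The slides name the topological overlap measure without defining it. The sketch below uses the standard definition from Zhang and Horvath (2005), TOM_ij = (l_ij + a_ij) / (min(k_i, k_j) + 1 - a_ij), where l_ij sums a_iu * a_uj over all other nodes u and k_i is the connectivity of node i. The function name and the toy adjacency matrix are assumptions made for illustration; the clustering and branch cutting steps are not shown.

```python
def topological_overlap(adj):
    """Topological overlap matrix of a symmetric adjacency matrix with entries in [0, 1]."""
    n = len(adj)
    # Connectivity k_i: sum of connection strengths to all other nodes.
    k = [sum(adj[i][u] for u in range(n) if u != i) for i in range(n)]
    tom = [[1.0] * n for _ in range(n)]  # diagonal is set to 1 by convention
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # l_ij: strength of connections shared through third nodes u.
            shared = sum(adj[i][u] * adj[u][j]
                         for u in range(n) if u != i and u != j)
            tom[i][j] = (shared + adj[i][j]) / (min(k[i], k[j]) + 1.0 - adj[i][j])
    return tom

# Toy weighted network: genes 0-2 are tightly connected, gene 3 is peripheral.
adj = [
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.7, 0.1],
    [0.8, 0.7, 1.0, 0.2],
    [0.1, 0.1, 0.2, 1.0],
]
tom = topological_overlap(adj)
# 1 - TOM is the dissimilarity that would be passed to hierarchical clustering.
dissimilarity = [[1.0 - t for t in row] for row in tom]
```

Genes that share many strong neighbours end up with a small dissimilarity and therefore fall on the same branch of the cluster tree.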
How to cut branches off a tree?
Bioinformatics 2008 24(5):719-720
Module=branch of a cluster tree
Module genes are assigned the same color
Signed WGCNA finds a pluripotency related module, which cannot be found in an unsigned network analysis
Figure label: Pluripotency module
Question: How does one summarize the expression profiles in a module?
Math answer: module eigengene = first principal component
Network answer: the most highly connected intramodular hub gene
Both turn out to be equivalent
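
A minimal sketch of the "math answer": the module eigengene computed as the first principal component of the standardized module expression matrix. Power iteration is used here only to keep the example free of dependencies; it is not claimed to be how the WGCNA package computes it. The sign convention (positive correlation with the average profile) and the toy data are assumptions.

```python
import math

def standardize(profile):
    """Center a gene profile and scale it to unit sample variance."""
    n = len(profile)
    m = sum(profile) / n
    sd = math.sqrt(sum((v - m) ** 2 for v in profile) / (n - 1))
    return [(v - m) / sd for v in profile]

def module_eigengene(expr, n_iter=200):
    """First right singular vector of the standardized genes x samples matrix."""
    x = [standardize(g) for g in expr]
    n_genes, n_samples = len(x), len(x[0])
    v = [float(s + 1) for s in range(n_samples)]  # arbitrary non-constant start
    for _ in range(n_iter):
        # u = X v gives one score per gene; v = X^T u gives one value per sample.
        u = [sum(x[g][s] * v[s] for s in range(n_samples)) for g in range(n_genes)]
        v = [sum(x[g][s] * u[g] for g in range(n_genes)) for s in range(n_samples)]
        norm = math.sqrt(sum(c * c for c in v))
        v = [c / norm for c in v]
    # Assumed convention: orient the eigengene along the average profile.
    avg = [sum(x[g][s] for g in range(n_genes)) / n_genes for s in range(n_samples)]
    if sum(a * b for a, b in zip(avg, v)) < 0:
        v = [-c for c in v]
    return v

# Toy module: 3 co-expressed genes x 5 samples (made-up values).
module_expr = [
    [1.0, 2.0, 3.0, 4.0, 5.0],
    [2.0, 4.1, 5.9, 8.2, 9.8],
    [1.5, 1.9, 3.2, 3.8, 5.1],
]
eigengene = module_eigengene(module_expr)
```

The result has one value per sample and unit length, so it can be treated like the expression profile of a single representative gene.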
Figure: heat map of the brown module (rows = genes, columns = microarrays) above a bar plot of the brown module eigengene across samples.
Module eigengene = measure of over-expression = average redness
Eigengene-based connectivity, also known as kME or module membership measure
$$k_{ME}(i) = \mathrm{ModuleMembership}(i) = \mathrm{cor}(x_i, ME)$$
kME(i) is simply the correlation between the i-th gene expression profile and the module eigengene.
Very useful measure for annotating genes with regard to modules.
Module eigengene turns out to be the most highly connected gene
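
The kME definition translates directly into code. In this sketch the eigengene vector and the three gene profiles are hypothetical values chosen so that the correlations are easy to check by hand.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical module eigengene across 5 samples.
module_eigengene = [-0.6, -0.3, 0.0, 0.3, 0.6]

# Hypothetical gene expression profiles across the same 5 samples.
profiles = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],   # follows the eigengene
    "geneB": [5.0, 4.0, 3.0, 2.0, 1.0],   # mirrors the eigengene
    "geneC": [2.0, 1.0, 3.0, 1.0, 2.0],   # unrelated to the eigengene
}

# kME(i) = cor(x_i, ME) for every gene.
kme = {gene: pearson(x, module_eigengene) for gene, x in profiles.items()}

# Genes ranked by module membership; the top of the list are hub candidates.
ranked = sorted(kme, key=kme.get, reverse=True)
```

Here geneA has kME close to 1, geneC close to 0 and geneB close to -1, so only geneA would be annotated as a strong member of this module in a signed analysis.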
What is weighted gene co-expression network analysis?
• Construct a network
– Rationale: make use of interaction patterns between genes
• Identify modules
– Rationale: module (pathway) based analysis
• Relate modules to external information
– Array Information: RNAi knock-out
– Gene Information: gene ontology, DNA binding data, epigenetic
– Rationale: find biologically interesting modules
• Find the key drivers in interesting modules
– Tools: intramodular connectivity kME
– Rationale: experimental validation, novel genes
• Study Module Preservation across different data
– Rationale: same data: to check robustness of module definition
– Example: Ivanova versus Zhou data
What is different from other analyses?
• Emphasis on modules (pathways) instead of individual genes
– Greatly alleviates the problem of multiple comparisons
– Fewer than 20 comparisons versus 20000 comparisons
• Use of intramodular connectivity kME to find key drivers
– Quantifies module membership (centrality)
– If the module is preserved, intramodular hub genes are preserved as well
• Module definition is based on gene expression data only
– No prior pathway information is used for module definition
– Two modules (eigengenes) can be highly correlated
– Typically defined by cutting branches of a cluster tree
• Emphasis on a unified approach for relating variables
– Default: power of a correlation
• Technical Details: soft thresholding with the power adjacency function, topological overlap matrix to measure interconnectedness
How to relate modules to external data?
Oct4 RNAi knock out status gives rise to a gene significance measure
Possible definitions
• We defined a measure of gene significance (GS) as the t-statistic from the paired Student's t-test of expression in control RNAi samples and ES cell samples with RNAi knock down of Oct4 (paired by day of treatment)
• GS could also be a fold change
• GS(i)=|T-test(i)| of differential expression
• GS(i)=-log(p-value)
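
As an illustration of the first definition, the paired t-statistic for one gene can be computed as below. The expression values and the number of time points are made up; each position in the two lists is assumed to be one day of treatment, so that control and knock-down samples are paired by day.

```python
import math

def paired_t_statistic(control, knockdown):
    """t-statistic of the paired differences (knockdown minus control)."""
    diffs = [k - c for c, k in zip(control, knockdown)]
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)  # sample variance
    return mean / math.sqrt(var / n)

# Hypothetical log-expression of one gene on 4 days of treatment.
control_rnai = [8.0, 8.2, 7.9, 8.1]
oct4_rnai = [6.9, 7.4, 6.8, 7.3]

gs_signed = paired_t_statistic(control_rnai, oct4_rnai)
gs_absolute = abs(gs_signed)  # the GS(i) = |T-test(i)| variant
```

The signed statistic is negative for a gene that is down-regulated upon Oct4 knock-down, which is the direction of interest for pluripotency-related genes; the absolute value ignores the direction.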
A gene significance naturally gives rise to a module significance measure
• Define module significance as mean gene significance
• Often highly related to the correlation between module eigengene and trait
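
Module significance as the mean gene significance is a simple group-wise average. The sketch assumes GS values are already non-negative (for example |t|); the gene names, GS values and module assignments are hypothetical.

```python
from collections import defaultdict

def module_significance(gene_significance, module_of):
    """Mean gene significance per module."""
    by_module = defaultdict(list)
    for gene, gs in gene_significance.items():
        by_module[module_of[gene]].append(gs)
    return {module: sum(values) / len(values)
            for module, values in by_module.items()}

# Hypothetical |t| values and module colors for four genes.
gene_significance = {"g1": 9.0, "g2": 7.0, "g3": 1.0, "g4": 3.0}
module_of = {"g1": "black", "g2": "black", "g3": "blue", "g4": "blue"}

ms = module_significance(gene_significance, module_of)
# ms == {"black": 8.0, "blue": 2.0}
```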
The Black Module Contains Genes Involved in Pluripotency
• The genes of this module are significantly more likely to be bound by key regulators of pluripotency and self-renewal
The blue module contains transcription factors involved in differentiation
Module Membership and Binding Information in the Signed Ivanova et al (2006) Network. This file contains module membership, kME, and binding data from Loh et al (2006), Boyer et al (2007), and Chen et al (2008) for each gene on the microarray.
Signed WGCNA finds Novel Pathways Involved in Pluripotency in Zhou dataset
• Nup133 is ranked 29th by connectivity and 777th by fold change
Epigenetic Regulation and Module Membership
• Recent studies suggest that chromatin structure and epigenetic modifications, like histone modification and DNA methylation, play a role in controlling gene expression during ES cell self-renewal and differentiation.
– For example, gene repression by the PcG protein complex via histone H3 lysine 27 trimethylation (H3K27me3) is required for ES cell self-renewal and pluripotency.
• To understand how epigenetic variables contribute to the regulation of ES cells we studied the relationship of the pluripotency and differentiation modules with ES cell H3K4 and H3K27 trimethylation, DNA methylation, and CpG promoter content from previously published data sets.
• Data from Guenther et al. Cell 2007
• Relating Module Membership to Epigenetic Regulation.
• The y-axis reports the proportion of top 1000 genes that are known to belong to the group of genes defined on the x-axis.
• Histone H3K4 trimethylation (H3K4me3) status is abbreviated K4; H3K27 trimethylation (H3K27me3) status is abbreviated K27.
• Note that genes with promoter CpG methylation are significantly (p = 2.0 × 10^-14) under-enriched with respect to the top 1000 black module genes.
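
The slides do not say which test produced the enrichment p-value. One common choice, assumed here purely for illustration, is the hypergeometric distribution: given N genes on the array, K of them in an annotation group, and a top list of n genes containing k group members, the lower tail P(X <= k) measures under-enrichment. All counts below are toy numbers, not values from the study.

```python
from math import comb

def hypergeom_lower_tail(k, N, K, n):
    """P(X <= k) when n genes are drawn without replacement from N genes,
    K of which belong to the annotation group."""
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k + 1))
    return favourable / total

# Toy numbers: 20 genes on the array, 10 in the group, top list of 5 genes.
N, K, n = 20, 10, 5
k_observed = 0  # none of the top genes belongs to the group

proportion = k_observed / n                       # what the y-axis would report
p_under = hypergeom_lower_tail(k_observed, N, K, n)
```

A small `p_under` indicates that the group is represented in the top list less often than expected by chance.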
Analysis of variance: module membership (kME) versus epigenetic variables
Source of Variation in kME
kMEblack, total proportion of variance explained = 8.3%; kMEblue, total proportion of variance explained = 4.2%

| Source | kMEblack: Prop. of Total Var | kMEblack: p-value | kMEblue: Prop. of Total Var | kMEblue: p-value |
|---|---|---|---|---|
| Histone Trimethylation (K4, K27, K4&K27) | 0.067 | < 2.2E-16 | 0.034 | < 2.2E-16 |
| cMyc Complex | 0.015 | < 2.2E-16 | 0.002 | 2.6E-04 |
| Oct4 Complex | 0.003 | 8.0E-08 | 0.001 | 7.5E-03 |
| CpG class (HCP, ICP, LCP) | 0.002 | 4.9E-04 | 0.005 | 6.0E-10 |
| PcG Bound | 0.000 | 8.7E-02 | 0.000 | 1.9E-01 |
| CpG Methylated | 0.000 | 7.1E-01 | 0.001 | 2.2E-02 |
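
The "proportion of total variance" column can be illustrated with a one-factor sketch: the between-group sum of squares divided by the total sum of squares. This is a simplification, since the analysis on the slide involves several factors at once, and the kME values and group labels below are invented.

```python
def proportion_variance_explained(values, groups):
    """Between-group sum of squares divided by total sum of squares."""
    n = len(values)
    grand_mean = sum(values) / n
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    by_group = {}
    for v, g in zip(values, groups):
        by_group.setdefault(g, []).append(v)
    ss_between = sum(
        len(vs) * (sum(vs) / len(vs) - grand_mean) ** 2
        for vs in by_group.values()
    )
    return ss_between / ss_total

# Hypothetical kME values of four genes and their histone mark class.
kme_values = [0.9, 0.7, 0.2, 0.0]
histone_class = ["K4", "K4", "K27", "K27"]

prop = proportion_variance_explained(kme_values, histone_class)
```

In real data the proportions are far smaller than in this toy case, as the table shows: even the strongest factor explains only a few percent of the variation in kME.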
Comparison of gene screening based on kME versus screening based on differential expression
• Venn diagrams show the gene overlap between the top 1000 black (pluripotency) module genes and the 1000 genes most significantly down-regulated upon Oct4 RNAi (left), and between the top 1000 blue (differentiation) module genes and the 1000 genes most significantly up-regulated upon Oct4 RNAi (right).
• Ivanova et al. data set.
Figure legend: green = genes with highest module membership kME (black module genes); grey = standard differential expression analysis
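
The overlap shown in such a Venn diagram is a set intersection of two top-n gene lists. The sketch below uses five hypothetical genes and n = 3 in place of the top 1000; the scores are invented.

```python
def top_n(scores, n):
    """Names of the n genes with the highest score."""
    return set(sorted(scores, key=scores.get, reverse=True)[:n])

# Hypothetical kME values and down-regulation statistics for five genes.
kme_black = {"g1": 0.95, "g2": 0.90, "g3": 0.85, "g4": 0.10, "g5": 0.05}
down_regulation = {"g1": 12.0, "g2": 2.0, "g3": 9.0, "g4": 8.0, "g5": 0.5}

top_kme = top_n(kme_black, 3)
top_de = top_n(down_regulation, 3)

shared = top_kme & top_de     # found by both screens
kme_only = top_kme - top_de   # found only by module membership
de_only = top_de - top_kme    # found only by differential expression
```

Genes in `kme_only` are the candidates that a standard differential expression screen would have missed.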
Module genes (green) have more significant enrichment than those found by a standard differential expression analysis
Conclusion
• Signed WGCNA
– has more consistent gene rankings between data sets,
– is better able to identify functionally enriched groups of genes
• Focus on module eigengenes circumvents the multiple testing problems that plague standard gene-based expression analysis.
• kME =module membership is very useful
– kME based gene screening identifies several novel stem cell related genes that would not have been found using a standard differential expression analysis
– kME is valuable for annotating genes with regard to module membership and for identifying genes related to pluripotency and differentiation
– Can be used as input of analysis of variance to dissect which factors contribute to module membership
Software and Data Availability
• R software tutorials etc can be found online
• Google search
– weighted co-expression network
– “WGCNA”
– “co-expression network”
• http://www.genetics.ucla.edu/labs/horvath/CoexpressionNetwork
Acknowledgement
• Dissertation work of Mike J Mason
• Collaborators:
Guoping Fan, Kathrin Plath, Qing Zhou
• WGCNA R package: Peter Langfelder