Gene regulatory network

Gene regulatory network Jin Chen CSE891-001 2012 Fall 1


Page 1: Gene regulatory network

Gene regulatory network

Jin Chen, CSE891-001

2012 Fall

Page 2: Gene regulatory network

Outline

• Transcriptional regulation

• Co-expression & co-regulation

• Bio-techniques
  – ChIP-seq
  – Bacterial one-hybrid system

• Computational models for gene regulation network construction
  – Binding + expression with TF knockout
  – Binding + time-series gene expression

Page 3: Gene regulatory network


David J.C. MacKay, Information Theory, Inference & Learning Algorithms, 2003

Page 4: Gene regulatory network

Transcriptional regulation

• Regulation of transcription controls when transcription occurs and how much RNA is created

• Transcription factors often must bind a regulatory binding site to switch a gene on (activator) or to shut a gene off (repressor)

• Generally, the more sophisticated the organism, the more complicated its cellular protein regulation

Page 5: Gene regulatory network

Transcription control

• Transcription control is directed primarily by two elements
  – Transcription factors (TFs)
  – DNA sequences that facilitate the binding of these TFs (cis-regulatory elements)

Page 6: Gene regulatory network

Transcription control

• First, there needs to be an initiating signal

• This signal activates a TF and recruits other members of the "transcription machine". TFs generally bind DNA at the same time. TFs and their cofactors can be regulated through reversible structural alterations

• Transcription is initiated at the promoter site as increasing amounts of an active TF bind the target DNA sequence. Other proteins, known as "scaffolding proteins", bind other cofactors and hold them in place

• Frequently, extracellular signals induce the expression of immediate early genes. These are themselves TFs or components of TFs, and can further influence gene expression

Page 7: Gene regulatory network

Gene regulation network

• Node
  – TF or target gene

• Edge
  – Regulation relation
  – Directed
  – Activation “--->”
  – Inhibition “---|”

Mata et al. Genome Biol. 2007;8(10):R217

Yeast
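The node and edge definitions above can be sketched as a small signed, directed graph. This is only an illustration: the gene names, the `build_grn` and `targets_of` helpers, and the +1/-1 sign encoding are assumptions, not part of the slides or of Mata et al.

```python
from collections import defaultdict

# Hypothetical toy network; names are placeholders, not data from the cited paper.
EDGES = [
    ("TF_A", "gene_1", "activation"),  # TF_A ---> gene_1
    ("TF_A", "TF_B", "inhibition"),    # TF_A ---| TF_B
    ("TF_B", "gene_2", "activation"),  # TF_B ---> gene_2
]


def build_grn(edges):
    """Return {regulator: {target: sign}}, sign +1 for activation, -1 for inhibition."""
    sign = {"activation": +1, "inhibition": -1}
    grn = defaultdict(dict)
    for regulator, target, kind in edges:
        grn[regulator][target] = sign[kind]
    return dict(grn)


def targets_of(grn, regulator):
    """Direct targets of a regulator, sorted for stable output."""
    return sorted(grn.get(regulator, {}))


grn = build_grn(EDGES)
```

Storing the sign on the edge keeps direction (regulator to target) and effect (activation or inhibition) in one lookup, `grn[regulator][target]`.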

Page 8: Gene regulatory network


Co-expression & co-regulation

• Genes belonging to the same cluster are often called co-expressed

• Genes with similar expression patterns might share transcription factors and functional regulatory binding sites
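One common way to make "similar expression patterns" concrete is Pearson correlation between expression profiles. The sketch below assumes that choice, plus a hypothetical threshold of 0.9 and made-up profiles; the slides do not specify a similarity measure or a clustering method.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles (undefined for constant profiles)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def coexpressed_pairs(profiles, threshold=0.9):
    """List gene pairs whose profiles correlate at or above the threshold."""
    genes = sorted(profiles)
    pairs = []
    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            if pearson(profiles[g], profiles[h]) >= threshold:
                pairs.append((g, h))
    return pairs


# Hypothetical expression values across four conditions.
profiles = {
    "gene_1": [1.0, 2.0, 3.0, 4.0],
    "gene_2": [2.1, 3.9, 6.2, 8.0],
    "gene_3": [4.0, 3.0, 2.0, 1.0],
}
pairs = coexpressed_pairs(profiles)
```

Pairs returned this way are only candidates for co-regulation: correlated expression suggests, but does not establish, a shared transcription factor.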

Page 9: Gene regulatory network

Topic 1. GRN Reconstruction

Background

Timeline: 2002, 2007, 2008, 2010

Algorithms for GRN reconstruction based on gene expression & motif data (2002-2008)

• 2 input data: time-series microarray, TF binding motifs

• 3 models: time shift matching, mutual information, Granger causality & DBN

• 2 limitations: high false positive rate, small scale (# genes ~100)

Algorithms focusing on integrating binding data with existing data (2007-now)

• 3 input data: time-series microarray, TF binding motifs, binding data

• 2 models: time shift matching and binding, protein expression approximation from binding data

• 2 limitations: lack of binding data at systems level, combinatorial TF studies
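Time shift matching, one of the models listed on this slide, can be illustrated as a lagged correlation: the target profile is compared with the TF profile delayed by 0, 1, 2, ... time points, and the best-matching delay is kept. This is a minimal sketch under that assumption; the `best_time_shift` helper, the `max_lag` parameter and the data are hypothetical, and published methods differ in detail.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def best_time_shift(tf, target, max_lag=3):
    """Return (lag, correlation) for the delay at which target best follows tf."""
    n = len(tf)
    scores = {}
    for lag in range(max_lag + 1):
        # Compare tf at time t with target at time t + lag.
        scores[lag] = pearson(tf[: n - lag], target[lag:])
    lag = max(scores, key=scores.get)
    return lag, scores[lag]


# Hypothetical time series: the target repeats the TF profile two time points later.
tf = [0.0, 1.0, 3.0, 2.0, 0.5, 0.2, 0.1, 0.0]
target = [0.1, 0.0, 0.0, 1.0, 3.0, 2.0, 0.5, 0.2]
lag, score = best_time_shift(tf, target)
```

With only about ten time points, many unrelated gene pairs reach a high lagged correlation by chance, which is consistent with the high false positive rate listed as a limitation.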

Page 10: Gene regulatory network

ChIP-seq

• ChIP-seq is the sequencing of the genomic DNA fragments that co-precipitate with a DNA-binding protein under study

• The DNA-binding proteins most frequently investigated in this way are transcription factors, among others

• ChIP-seq can identify the DNA segments across the genome that are physically associated with a specific DNA-binding protein

• It does not rely on prior knowledge of precise DNA binding sites

Liu et al, BMC Biology 2010, 8:56

Page 11: Gene regulatory network

Liu et al. BMC Biology 2010, 8:56

Flow scheme of the central steps in the ChIP-seq procedure

Page 12: Gene regulatory network

Example

A ChIP DNA sample from a stem cell population and the corresponding input DNA sample were both processed without amplification

Algorithmic analysis for mapping and peak-calling

http://www.helicosbio.com/Applications/ChIPSeq/tabid/69/Default.aspx
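The peak-calling step can be sketched as a comparison of binned read counts between the ChIP sample and the input sample. This is a toy illustration only: the `call_peaks` function, its thresholds and the counts are assumptions, and real peak callers use statistical models of the background rather than a fixed fold cutoff.

```python
def call_peaks(chip, control, min_fold=4.0, min_reads=10, pseudocount=1.0):
    """Return inclusive (start_bin, end_bin) runs of bins enriched in ChIP over input."""
    peaks = []
    start = None
    for i, (c, k) in enumerate(zip(chip, control)):
        fold = (c + pseudocount) / (k + pseudocount)
        enriched = c >= min_reads and fold >= min_fold
        if enriched and start is None:
            start = i  # open a new peak
        elif not enriched and start is not None:
            peaks.append((start, i - 1))  # close the current peak
            start = None
    if start is not None:
        peaks.append((start, len(chip) - 1))
    return peaks


# Hypothetical mapped-read counts per genomic bin.
chip = [2, 3, 40, 55, 30, 4, 2, 25, 3]
control = [3, 2, 4, 5, 6, 3, 2, 20, 2]
peaks = call_peaks(chip, control)
```

Bin 7 has many ChIP reads but is not called, because the input sample is equally high there; this is the reason the input DNA sample is sequenced alongside the ChIP sample.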

Page 13: Gene regulatory network

Binding network

• The first action of a TF is to find and bind DNA segments, and ChIP-seq allows the binding sites of TFs to be identified across entire genomes

• Protein-DNA binding network
  – Direct downstream targets of any transcription factor can be determined
  – The DNA sequence motif that is recognized by the binding protein can be computed

Page 14: Gene regulatory network

Example

ChIP-seq profiling of 13 TFs in embryonic stem (ES) cell development revealed the organization of regulatory elements. This provided insights into the integration of TF-mediated signaling pathways in ES cell differentiation

Chen et al. Cell 2008, 133:1106-1117

Page 15: Gene regulatory network

What ChIP-seq cannot do

• Many observed binding events are neutral and do not regulate transcription

• Regulatory binding events often occur at enhancers that are not proximal to the target gene that they control

• Identifying transcriptional targets therefore requires integrating ChIP-seq with evidence from expression data to help associate binding events with target gene regulation

Honkela et al. PNAS 2010 vol. 107 no. 17 pp 7793–7798

Page 16: Gene regulatory network


Bacterial one-hybrid system

wikipedia

Page 17: Gene regulatory network

Gene expression data

• TF knock-out gene expression

• TF over-expression gene expression

• Differential expression of genes between wild type and mutant/over-expression strains is indicative of a potential regulatory interaction, e.g. the yeast GRN

Reimand et al, Nucleic Acids Research, 2010, Vol. 38, No. 14 pp 4768–4777
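Differential expression between a TF mutant and the wild type is often summarized as a log2 fold change per gene. The sketch below assumes that summary, a pseudocount of 1 and a cutoff of |log2 FC| >= 1; the function names and expression values are hypothetical, and a real analysis would also test statistical significance across replicates.

```python
import math


def log2_fold_change(mutant, wild_type, pseudocount=1.0):
    """log2 of mean mutant expression over mean wild-type expression."""
    m = sum(mutant) / len(mutant)
    w = sum(wild_type) / len(wild_type)
    return math.log2((m + pseudocount) / (w + pseudocount))


def candidate_targets(expr_mutant, expr_wt, min_abs_lfc=1.0):
    """Genes whose expression changes at least 2-fold in the mutant (either direction)."""
    hits = {}
    for gene in expr_wt:
        lfc = log2_fold_change(expr_mutant[gene], expr_wt[gene])
        if abs(lfc) >= min_abs_lfc:
            hits[gene] = lfc
    return hits


# Hypothetical replicate measurements for a TF knockout and the wild type.
expr_wt = {"gene_1": [99.0, 101.0], "gene_2": [50.0, 52.0], "gene_3": [7.0, 7.0]}
expr_ko = {"gene_1": [24.0, 24.0], "gene_2": [49.0, 51.0], "gene_3": [31.0, 31.0]}
hits = candidate_targets(expr_ko, expr_wt)
```

A negative value (gene_1) suggests the deleted TF activates the gene, a positive value (gene_3) suggests repression; either effect may be indirect.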

Page 18: Gene regulatory network


Comprehensive analysis of TF knockout expression data in Yeast

• 269 TF knockout microarrays, covering almost all yeast TFs

• DNA–protein interactions derived from ChIP-chip experiments

• Predicted TF binding sites with position weight matrices

Hu et al. Nat. Genet., 39, 683–687, 2007; Harbison et al. Nature, 431, 99–104. 2004

Page 19: Gene regulatory network

Comprehensive analysis of TF knockout expression data in Yeast

• Checked the expression levels of the TFs
  – Intuitively, one expects the TF under consideration to have lower expression in the mutant strain than in the wild type strain
  – The data confirm this for 155 TFs
  – 78 TFs display a negative fold change at statistically non-significant levels
  – 36 TFs are lethal

Reimand et al, Nucleic Acids Research, 2010, Vol. 38, No. 14 pp 4768–4777

Page 20: Gene regulatory network

Reimand et al, Nucleic Acids Research, 2010, Vol. 38, No. 14 pp 4768–4777

Page 21: Gene regulatory network

Comprehensive analysis of TF knockout expression data in Yeast

• Examine functional annotations of differentially expressed genes
  – As most TFs are considered to regulate distinct cellular processes, their target genes should be associated with a coherent set of molecular and biological functions
  – Used g:Profiler to identify GO, KEGG and Reactome pathway annotations

Reimand et al, Nucleic Acids Research, 2010, Vol. 38, No. 14 pp 4768–4777
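Whether a set of differentially expressed genes is associated with a coherent function is commonly assessed with a one-sided hypergeometric test per annotation term. The sketch below is a generic version of that test, not the g:Profiler implementation; the function names and the gene sets are hypothetical, and multiple-testing correction is omitted.

```python
from math import comb


def hypergeom_pvalue(k, K, n, N):
    """P(X >= k) when n genes are drawn from N, of which K carry the annotation."""
    upper = min(K, n)
    total = comb(N, n)
    return sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1)) / total


def enrichment(targets, annotated, background):
    """Enrichment p-value of one annotation term among a TF's target genes."""
    background = set(background)
    targets = set(targets) & background
    annotated = set(annotated) & background
    k = len(targets & annotated)
    return hypergeom_pvalue(k, len(annotated), len(targets), len(background))


# Hypothetical example: 20 genes, 5 annotated with a term, 5 targets, 3 in common.
background = [f"gene_{i}" for i in range(20)]
annotated = [f"gene_{i}" for i in range(5)]
targets = ["gene_2", "gene_3", "gene_4", "gene_10", "gene_11"]
p_value = enrichment(targets, annotated, background)
```

A small p-value indicates that the TF's targets share the annotation more often than a random gene set of the same size would.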

Page 22: Gene regulatory network

Reimand et al, Nucleic Acids Research, 2010, Vol. 38, No. 14 pp 4768–4777

Page 23: Gene regulatory network

Comprehensive analysis of TF knockout expression data in Yeast

• Overlap between TF-binding and TF knockout data
  – Collect binding sites for 142 TFs, comprising 5,188 ChIP-chip interactions and 17,091 motif predictions
  – Calculate the intersection between the list of differentially expressed genes from the TF knockout and targets identified by ChIP-chip or binding-site predictions
  – 2,230 regulation relations

Reimand et al, Nucleic Acids Research, 2010, Vol. 38, No. 14 pp 4768–4777
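The intersection step described above can be sketched with plain set operations: an edge TF to gene is kept only when the gene is both differentially expressed in that TF's knockout and a binding target of the TF. The `supported_edges` helper and the toy data are assumptions for illustration.

```python
def supported_edges(de_genes, bound_targets):
    """Return sorted (tf, gene) edges supported by both knockout and binding evidence.

    de_genes:      {tf: set of genes differentially expressed in the tf knockout}
    bound_targets: {tf: set of genes bound by tf (ChIP-chip or motif prediction)}
    """
    edges = []
    for tf in sorted(de_genes.keys() & bound_targets.keys()):
        for gene in sorted(de_genes[tf] & bound_targets[tf]):
            edges.append((tf, gene))
    return edges


# Hypothetical evidence for three TFs.
de = {"TF_A": {"gene_1", "gene_2", "gene_5"}, "TF_B": {"gene_3"}}
bound = {"TF_A": {"gene_2", "gene_5", "gene_9"}, "TF_C": {"gene_3"}}
edges = supported_edges(de, bound)
```

TF_B and TF_C contribute no edges because each has only one kind of evidence.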

Page 24: Gene regulatory network

Comprehensive analysis of TF knockout expression data in Yeast

• Include protein-protein interaction information as an additional perspective on the assessment of the GRN
  – TFs that function together may show significant overlap in their target genes
    • Out of 115 pairs of physically interacting TFs in the dataset, 92 display such an overlap
  – TFs tend to regulate genes that interact with each other
    • Out of 110,487 differentially expressed genes, there are 3,846 pair-wise interactions between co-regulated genes, covering 2,262 genes in total
    • Most TFs target at least one pair of interacting genes

Reimand et al, Nucleic Acids Research, 2010, Vol. 38, No. 14 pp 4768–4777

Page 25: Gene regulatory network

Temporal gene expression data

• A problem with the above approach is that the creation of mutant strains is challenging or impossible for many TFs of interest

• Even when available, mutants may provide very limited information because of redundancy or because indirect regulatory feedback confounds the signal

• Temporal dynamics: use time-series wild-type gene expression, e.g. the Drosophila GRN

Honkela et al. PNAS 2010 vol. 107 no. 17 pp 7793–7798

Page 26: Gene regulatory network

Other models for gene regulation network construction

• Expression based study
  – Dynamic Bayesian Network
  – Granger causality

• TF binding motif based study
  – Weeder
  – AlignACE

Page 27: Gene regulatory network

Gene regulation network analysis

• Transcriptional regulation is mediated by the combinatorial interplay between cis-regulatory DNA elements and trans-acting transcription factors, and is perhaps the most important mechanism for controlling gene expression

• A transcriptional regulatory network that integrates such information can lead to a systems-level understanding of regulatory mechanisms

Kim et al. Wiley Interdisciplinary Reviews: Systems Biology and Medicine. Vol. 3, Iss 1, pp 21–35 2011

Page 28: Gene regulatory network

Discovery of motifs and regulatory modules

• Binding motif prediction based only on the knowledge afforded by PWMs often suffers from high false-positive rates

• To reduce the false-positive rate, a number of biological insights have been used
  – evolutionary pressure placed on these important cis-regulatory elements
  – co-expression with genes that have well-documented functions and expression patterns
  – clustering of cis-regulatory features into cis-regulatory modules

Page 29: Gene regulatory network

Discovery of motifs and regulatory modules

• Collect the binding sequences for known TFs and identify potential binding sites in unannotated genome sequence

• Position frequency matrix

• Cluster of individual TF binding sites

• Regulatory relations among 4 genes

Modeling of individual transcription factor binding sites and cis-regulatory modules

Kim et al. Wiley Interdisciplinary Reviews: Systems Biology and Medicine. Vol. 3, Iss 1, pp 21–35 2011
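The step from collected binding sequences to a position frequency matrix, and from there to scanning unannotated sequence, can be sketched as follows. The binding sites, the pseudocount of 1 and the uniform background of 0.25 are assumptions chosen for illustration; they are not taken from Kim et al.

```python
import math

BASES = "ACGT"


def position_frequency_matrix(sites):
    """Count each base at each position of equal-length aligned binding sites."""
    pfm = [{b: 0 for b in BASES} for _ in range(len(sites[0]))]
    for site in sites:
        for i, base in enumerate(site):
            pfm[i][base] += 1
    return pfm


def position_weight_matrix(pfm, pseudocount=1.0, background=0.25):
    """Convert counts to log2 odds against a uniform background."""
    pwm = []
    for col in pfm:
        total = sum(col.values()) + len(BASES) * pseudocount
        pwm.append({b: math.log2((col[b] + pseudocount) / total / background) for b in BASES})
    return pwm


def best_hit(pwm, sequence):
    """Return (start, score) of the highest-scoring window in the sequence."""
    width = len(pwm)
    scores = [
        (sum(pwm[i][sequence[s + i]] for i in range(width)), s)
        for s in range(len(sequence) - width + 1)
    ]
    score, start = max(scores)
    return start, score


# Hypothetical aligned binding sites for one TF.
sites = ["TGACTC", "TGACTA", "TGAGTC", "TGACTC"]
pfm = position_frequency_matrix(sites)
pwm = position_weight_matrix(pfm)
start, score = best_hit(pwm, "AAGCTGACTCATTG")
```

Because a short motif matches many places in a genome by chance, hits found this way are the high false-positive predictions that the biological filters on the previous slide aim to reduce.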

Page 30: Gene regulatory network

Co-regulation

• A density-based subspace clustering algorithm for coherent clustering of gene expression data

• The model allows
  – Expression profiles of genes in a cluster to follow any shifting-and-scaling patterns in subspace
  – Expression value changes across any two conditions of the cluster to be significant

• Experimental results show that the algorithm is able to detect a significant number of biologically significant clusters missed by previous models
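A shifting-and-scaling pattern means that, over the chosen conditions, one gene's profile is approximately a scaled and shifted copy of another's: y = scale * x + shift. The sketch below only checks that relation for a pair of profiles with a least-squares fit; it is not the density-based clustering algorithm itself, and the helper names, tolerance and data are assumptions.

```python
def fit_shift_scale(x, y):
    """Least-squares fit of y = scale * x + shift; also return the largest residual."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    scale = sxy / sxx
    shift = my - scale * mx
    residual = max(abs(b - (scale * a + shift)) for a, b in zip(x, y))
    return scale, shift, residual


def is_coherent(x, y, tolerance=0.1):
    """True if y is a shifted-and-scaled copy of x within the tolerance."""
    _, _, residual = fit_shift_scale(x, y)
    return residual <= tolerance


# Hypothetical profiles over the same four conditions.
g1 = [1.0, 3.0, 2.0, 5.0]
g2 = [2.0 * v + 1.0 for v in g1]  # scaled by 2, shifted by 1
g3 = [1.0, 1.5, 4.0, 2.0]         # unrelated pattern
scale, shift, _ = fit_shift_scale(g1, g2)
```

Here `is_coherent(g1, g2)` holds even though the two profiles differ in absolute level and amplitude, which is the kind of relationship a plain distance-based clustering would miss.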