New paradigms and resources for promoter studies

Albin Sandelin (University of Copenhagen)
Jasmina Ponjavic (Oxford University)
Boris Lenhard (Bergen University)
David Hume (Queensland University)
Piero Carninci (RIKEN Wako)
Martin Frith (RIKEN Yokohama)
Hideya Kawaji (NTT Software)
Yoshihide Hayashizaki (RIKEN Yokohama)
>100 Japanese technicians



Aims

• Introduction of Cap Analysis of Gene Expression (CAGE) data and resources
• Insights on core promoter structure and transcriptional landscapes using CAGE
• (The JASPAR database)

Main references:
Carninci et al, Nat Genet 2006 Jun;38(6):626-35
Carninci et al, Science 2005 Sep 2;309(5740):1559-63
Katayama et al, Science 2005 Sep 2;309(5740):1564-6
Frith et al, Genome Res 2006 Jun;16(6):713-22
Ponjavic et al, Genome Biol 2006 Aug 17;7(8):R78


What is CAGE?

CAGE tags are the first 20 nucleotides of a full-length cDNA from a non-normalized cDNA library
– Shiraki et al, PNAS 100:15776-81 (2003)

Sequencing and mapping to the genome
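
As a rough illustration of the tag definition above, the following Python sketch takes the first 20 nucleotides of each cDNA sequence and counts identical tags. The names `cage_tag` and `count_tags` and the toy sequences are hypothetical and not part of the CAGE protocol or its software; real tags are produced experimentally and then mapped to the genome with an aligner.

```python
from collections import Counter

TAG_LENGTH = 20  # tag length stated on the slide


def cage_tag(cdna: str, tag_length: int = TAG_LENGTH) -> str:
    """Return the 5' end (first `tag_length` nucleotides) of a cDNA sequence."""
    if len(cdna) < tag_length:
        raise ValueError("cDNA is shorter than the requested tag length")
    return cdna[:tag_length].upper()


def count_tags(cdnas):
    """Count how often each tag occurs in a collection of cDNA sequences.

    Because the library is not normalized, these counts can be read as a
    rough measure of how often each 5' end was observed.
    """
    return Counter(cage_tag(seq) for seq in cdnas)


if __name__ == "__main__":
    # Toy sequences, purely illustrative.
    example = ["ACGT" * 10, "ACGT" * 6, "TTTT" * 10]
    for tag, n in count_tags(example).items():
        print(tag, n)
```

Counting identical tags is only the first step; the mapping step that follows assigns each tag a chromosome, strand, and start coordinate.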


Advantages

• Large-scale sequencing with no cDNA normalization:
– Enables localization AND quantification of transcripts/promoters
– Enables promoter localization with unprecedented sampling depth (sequence >1 million transcripts in one experiment…)
• Base-pair resolution, with strand information
– Quite impressive validation rates even for single tags (86% true positives by RACE)
• Unbiased in terms of location: genome-wide
• Different RNA populations can be sequenced and compared


Initial analyses

Sets:
7 million tags (mouse), 145 libraries
5 million tags (human), 40 libraries


CAGE resources

• Genomic element viewer (very similar to the UCSC browser)
– CAGE tags and cDNA landscapes


CAGE resources

• Basic CAGE viewer
– Comprehensive browser of CAGE tags and CAGE tag clusters, and library information


CAGE resources

• CAGE analysis viewer
– Browse tissue specificity in core promoters


Biological insights from CAGE data analysis


…if this is true, we would expect all CAGE tags in known promoters to cluster like this

[Figure: histogram of the % of tags within a cluster (minimum 100 tags) at positions -30 to +30 around +1]
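
The histogram described on this slide can be sketched as follows: for a cluster with at least 100 tags, compute the percentage of tags starting at each position relative to the most-used start site. This is a minimal sketch, assuming a cluster is given as a mapping from genomic coordinate to tag count on the plus strand. The name `tag_profile` and its parameters are hypothetical, and offset 0 in the code corresponds to the +1 position on the slide.

```python
def tag_profile(tss_counts, min_tags=100, window=30):
    """Percentage of a cluster's tags at each offset from its dominant start site.

    tss_counts: dict mapping genomic coordinate -> number of tags starting there
                (plus strand assumed; this input format is an assumption).
    Returns None when the cluster has fewer than `min_tags` tags, otherwise a
    dict mapping offset (-window..+window) -> percent of all tags in the cluster.
    """
    total = sum(tss_counts.values())
    if total < min_tags:
        return None

    # Dominant start site: highest count, ties broken by the smaller coordinate.
    peak = max(tss_counts, key=lambda pos: (tss_counts[pos], -pos))

    profile = {offset: 0.0 for offset in range(-window, window + 1)}
    for pos, n in tss_counts.items():
        offset = pos - peak
        if -window <= offset <= window:
            profile[offset] += 100.0 * n / total
    return profile


if __name__ == "__main__":
    # Toy cluster, purely illustrative: most tags start at one position.
    sharp = {1000: 90, 1001: 6, 998: 4}
    profile = tag_profile(sharp)
    print({k: v for k, v in profile.items() if v > 0})
```

In such a profile, a cluster where nearly all tags share one start position concentrates its percentage at offset 0, while a cluster with many start positions spreads it across the window.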


[Figure: panels labeled Mouse and Human, each with an mRNA track]


[Figure: panels labeled Mouse and Human]


Evolutionary advantages of having broad promoters?


Take-home message I

• At least two major types of TSS selection exist
– This is correlated with both sequence content and tissue specificity
– The majority of promoters are NOT the textbook type


What about the genome landscape?

• Many more core promoters than previously seen (factor 5-10), even though many tissues have not been sampled

• What are they up to?


58% of genes have more than one promoter, many of which are tissue-specific

UDP-glucuronyl transferase gene: >= 7 promoters

Take-home message: Do not talk about tissue-specific genes!
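
A figure such as the share of genes with more than one promoter comes down to a count over a promoter-to-gene assignment. The following is a minimal sketch, assuming each promoter (tag cluster) has already been assigned to exactly one gene; the function name and the toy identifiers are hypothetical and do not reflect how the original analysis was implemented.

```python
from collections import defaultdict


def multi_promoter_fraction(promoter_to_gene):
    """Fraction of genes that have more than one assigned promoter.

    promoter_to_gene: dict mapping promoter ID -> gene ID (assumed input format).
    """
    promoters_per_gene = defaultdict(set)
    for promoter, gene in promoter_to_gene.items():
        promoters_per_gene[gene].add(promoter)

    if not promoters_per_gene:
        return 0.0

    multi = sum(1 for promoters in promoters_per_gene.values() if len(promoters) > 1)
    return multi / len(promoters_per_gene)


if __name__ == "__main__":
    # Toy assignment, purely illustrative.
    toy = {"p1": "geneA", "p2": "geneA", "p3": "geneB"}
    print(multi_promoter_fraction(toy))
```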


Promoters within 3’ UTRs

The largest number of CAGE tags map to 5’ ends of genes

However, there are many clear cases of significant start sites in 3’ UTRs!


Complex loci


Take-home message II

• 1 gene – many promoters (what is a gene, anyway?)

• Many uncharacterized promoters await deeper study

• Many promoters and transcripts are at unexpected locations

• The genome has become a messy place to work in – transcripts everywhere


Brief examples of more detailed analyses using the same dataset:

• Evolutionary turnover of TSS
– Frith et al 2006, Genome Res
• Dissection of TATA-containing core promoters
– Ponjavic et al 2006, Genome Biol
• (There are some 10 more)


TSS turnover (Frith et al): No turnover


TSS turnover (Frith et al): Total turnover


TSS turnover (Frith et al): Partial turnover


TSS turnover does exist

• …although this is not the default situation (we find about 1000 cases)

• When TSS turnover does occur, “phylogenetic footprinting” type TFBS search is problematic

• Can all functional elements that are active at the genome level undergo turnover?


http://jaspar.genereg.net.