Recent discoveries in molecular biology

Instructor: Henry Levin [email protected]

Website: https://science.nichd.nih.gov/confluence/display/biochem539/Home

Exam

Class Schedule

Exam

Career path

The function and impact of transposable elements

A transposable element is any genomic sequence that possesses an intrinsic ability to mobilize.

TE

Target

What are some ways transposable elements can impact the host?

Transposable elements can disrupt coding sequences

Huang, 2012, Annual review of genetics.

Transposable elements can reduce protein activity or mRNA splicing

Gret1

VvmybA1

Integration of Gret1 reduces expression of anthocyanin producing genes in grapes

Transposon insertion can change patterns of tissue specific expression

B-Bolivia causes expression of the b1 transcription factor to change from plant to seed.

HLH transcription factor

Transposon insertion can change patterns of tissue specific expression

Insertion of Rider causes cold dependent expression of Ruby in fruit of blood oranges.

Ruby (Myb transcription factor)

Transposons can alter gene function in three key ways

1. Disrupt coding sequence.

2. Change transcription level or mRNA stability.

3. Change tissue specificity of gene expression.

DNA “cut and paste” transposons rely on a single enzyme

Autonomous examples: Tn7 in E. coli; P in Drosophila; Ac in maize; Sleeping Beauty in salmon

Non-autonomous examples: Ds in maize

Long terminal repeat retrotransposons have an RNA intermediate

Autonomous examples: Ty1 in S. cerevisiae; Gypsy in Drosophila; IAP in mouse

Non-autonomous examples: ETn in mouse; PbTRIM in harvester ant

Non-Long terminal repeat retrotransposons have an RNA intermediate and rely on target primed reverse transcription

Autonomous examples: L1 in human; I factor in Drosophila; R1 and R2 in insects

Non-autonomous examples: Alu in human; SVA in human

Review of the three types of transposons

Type: Mechanism

DNA transposons: Cut and paste

LTR retrotransposons: Reverse transcription followed by integration

Non-LTR retrotransposons: Target-primed reverse transcription

The evolution of transposons

[Figure labels: RT (reverse transcriptase) and IN (integrase) domains; DNA transposons; retrotransposons]

Transposable elements have been extremely successful throughout evolution

[Figure: percentage values shown for different genomes: 45%, 80%, 30%, 50%, 0.3%, 12%, 10%. Huang et al., 2012, Annu. Rev. Genet.]

A burst in LTR-retrotransposons

Maize diverged from sorghum about 16 million years ago. Since then the maize genome has increased threefold, and the increase is entirely due to greater numbers of retrotransposons.

How are new transposons discovered?

1. Analyze spontaneous changes in phenotypes for their genetic origin. This is how transposons were discovered by Barbara McClintock.

2. Compare multiple genomes for polymorphic insertions or deletions.

3. Analyze genome sequences for repeats.

4. Search genomes for sequences similar to known transposons
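Approach 3 in the list above (analyzing genome sequences for repeats) can be illustrated with a toy k-mer count. This is only a minimal sketch: the function name, the toy sequence, and the choice of k are assumptions made for illustration, and real repeat finders are far more tolerant of mismatches and truncations.

```python
from collections import Counter


def repeated_kmers(sequence, k, min_count=2):
    """Return {kmer: count} for every k-mer seen at least min_count times."""
    sequence = sequence.upper()
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return {kmer: n for kmer, n in counts.items() if n >= min_count}


# Toy genome (hypothetical): the 8-nt "element" ACGTTGCA is present three times.
toy_genome = "GG" + "ACGTTGCA" + "TTTT" + "ACGTTGCA" + "CCCC" + "ACGTTGCA" + "AA"
repeats = repeated_kmers(toy_genome, k=8, min_count=3)
print(repeats)
```

Exact k-mer matching only finds identical copies, so it would miss old, diverged elements; it is shown here because it makes the idea of "repeats mark candidate transposons" concrete.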

Before the genome of S. pombe was sequenced we used DNA blots to look for repeat elements

Tf1 probe Tf2 probe

Tf1 has sequence similar to other transposon proteins

Gag PR RT IN

LTR LTR

[Figure: Tf1 transposition assay. A Tf1 plasmid carries URA3 and Tf1 marked with neo (Tf1-neo). Steps shown: induce expression of Tf1-neo; kick out the plasmid with 5-FOA; select for integration into the chromosome with G418. Strains compared: WT, IN fs, PR fs.]

How would you test Tf1 for transposition activity?

Tf1 had no homology to any tRNA.

What entity could prime Tf1 reverse transcription?

LTR retrotransposons and retroviruses use a tRNA to prime reverse transcription


The priming of Tf1 reverse transcription

[Figure: 5' to 3' diagram of the transcript. Strains compared: WT; IN-; 7th base mutants (5' end, PBS, double mutant); 5th base mutants (5' end, PBS, double mutant).]

Assays with cloned copies of transposable elements

•Can measure transposition activity.

•Can reveal mechanistic details.

•Can generate strains with new copies of the element.

How would you identify active L1 elements given that:

•There are 880,000 copies of L1 in the human genome.

•Approximately 99.7% are 5’ truncated.

•Of the few full-length L1s, most had obvious mutations.

L1 insertions were found to cause several cases of disease

Method for studying L1 transposition in human cells

The human genome sequence yielded only 90 L1s with intact ORFs; most of the activity comes from 6 of them

L1 activity in human cells

•There are about 880,000 copies of L1 in human cells.

•99.7% have deletions in their 5’ sequence.

•Over 80% of the L1 activity is derived from 6 copies of L1.
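The numbers on these slides can be combined in a short calculation to show how small the active fraction of L1 is. All four inputs come from the slides; the variable names are illustrative only.

```python
total_l1 = 880_000          # copies of L1 in the human genome (from the slides)
fraction_truncated = 0.997  # fraction that are 5' truncated (from the slides)
intact_orfs = 90            # L1s with intact ORFs (from the slides)
hot_l1s = 6                 # L1s responsible for most of the activity (from the slides)

# Copies that are not 5' truncated, i.e. candidate full-length elements.
full_length = total_l1 * (1 - fraction_truncated)

print(f"Full-length L1s: about {full_length:.0f}")
print(f"Intact-ORF L1s as a share of all copies: {intact_orfs / total_l1:.4%}")
print(f"Highly active L1s as a share of intact-ORF L1s: {hot_l1s / intact_orfs:.1%}")
```

So of roughly 2,640 full-length copies, only 90 have intact ORFs, and 6 of those account for over 80% of the activity.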

Why have transposons been so successful throughout evolution?

•Transposons are nothing more than molecular parasites that can efficiently propagate without destroying the host.

•Barbara McClintock thought transposons are maintained because they are beneficial and can reorganize the host genome to increase survival.

Vs.

How would you determine whether transposons are parasites that don’t benefit the cell or whether they provide important contributions to the cell?

•Look for examples where transposons clearly provide an essential function.

Retrotransposons HeT-A and TART of Drosophila play an essential role in protecting the ends of chromosomes

HeT-A and TART substitute for telomerase by inserting into chromosome ends

But, this is the only known case of a transposon with an essential function.

Evidence that transposons are parasites includes the defense mechanisms that cells use to degrade transposon mRNA

Another reason transposons are thought to be parasites is that most insertions disrupt genes or are neutral.

•In humans, there are at least 65 documented cases of diseases resulting from de novo TE insertions.

•In a limited study of 42 de novo insertions of L1 in human cells 43% occurred in transposon repeats and 50% occurred in introns. The impact of these on expression was not determined.

•More can be learned about the impact of integration from the study of model organisms.

Target site preferences of the LTR-retrotransposons in S. cerevisiae and S. pombe

Element | Target | Impact of insertions | Mechanism

Ty1 | 200 to 600 nt 5’ of Pol III genes | Neutral | Unknown

Ty3 | 2 nt 5’ of Pol III genes | Neutral | Tethering: integrase binds to Pol III transcription factor Brf1

Ty5 | Silent chromatin | Neutral | Tethering: integrase binds heterochromatin factor Sir4

Tf1 | Promoters of Pol II transcribed genes | Not enough sites determined | Unknown

Deep sequencing of DNA now determines sequence of 250 million DNAs per lane

[Figure: Tf1 transposition assay. A Tf1 plasmid carries URA3 and Tf1 marked with neo (Tf1-neo). Steps shown: induce expression of Tf1-neo; kick out the plasmid with 5-FOA; select for integration into the chromosome with G418. Strains compared: WT, IN fs, PR fs.]

Large numbers of insertions were generated

[Figure: preparation of insertion sites for sequencing. The S. pombe genome carrying Tf1 insertions is cut with Mse I; adapters (ends labeled NH2) are added; the junction between Tf1 and the flanking genome is PCR amplified and gel purified. Restriction sites shown: Mse I, Spe I.]

Send sample to 454 Life Sciences with $$$. Get 500,000 high quality sequence reads.
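The digestion step can be mimicked in silico. The sketch below assumes the standard Mse I recognition sequence TTAA (cut after the first T) and returns the stretch of genome between the end of an insertion and the next Mse I cut, which is the part that would end up in the sequenced junction fragment. The function names and the toy sequence are hypothetical, and adapter ligation and PCR are not modeled.

```python
MSE1_SITE = "TTAA"  # Mse I recognition sequence; the enzyme cuts T^TAA


def mse1_sites(sequence):
    """Return the 0-based start position of every Mse I site in sequence."""
    sites = []
    start = sequence.find(MSE1_SITE)
    while start != -1:
        sites.append(start)
        start = sequence.find(MSE1_SITE, start + 1)
    return sites


def flanking_fragment(sequence, insertion_end):
    """Genomic sequence from the end of the insertion up to the next Mse I cut.

    insertion_end is the 0-based index of the first genomic base after the
    element. Returns None if no Mse I site lies downstream.
    """
    for site in mse1_sites(sequence):
        if site >= insertion_end:
            return sequence[insertion_end:site + 1]  # cut falls after the first T
    return None


# Toy example: "X" stands in for Tf1 bases; the genome resumes at index 10.
toy = "GGCC" + "XXXXXX" + "ACGGCTTAAGG"
print(mse1_sites(toy))
print(flanking_fragment(toy, 10))
```

Because Mse I sites are frequent in an AT-rich genome, most insertions have a site close enough to give a fragment short enough to amplify and sequence.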

High throughput sequencing of insertion sites

454 sequencing of four independent experiments revealed 73,125 insertions

Guo and Levin, 2010, Genome Research

21,848 inserts clustered in promoter sequences upstream of ORFs

[Figure: number of insertions plotted against nucleotide position; label: 3.5% ORF]

Deep sequencing of integration sites:

•Positioned 78,000 independent insertions.

•The insertion sites matched the positions from the target plasmid assays.

•A reproducible measure of the integration in each of the 5,000 promoters of S. pombe.

•Tf1 integration is high in approximately 1,000 promoters.

•Stress response genes are favored targets.

•These data suggested Tf1 integration could have a significant impact on the fitness of cells subjected to stress.

Guo and Levin, 2010, Genome Research
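Once insertion sites are mapped, assigning them to promoters or ORFs is a matter of comparing coordinates with the gene annotation. The following is a minimal sketch of that bookkeeping; the 500-nt promoter window, the gene names, and the coordinates are all hypothetical and are not taken from Guo and Levin. The last lines use the two counts from the slides.

```python
from collections import Counter


def classify_insertions(orfs, insertions, upstream=500):
    """Label each insertion as 'ORF', 'promoter:<gene>', or 'other'.

    orfs: list of (name, start, end, strand) with start < end.
    upstream: assumed promoter window in nt, measured from the ORF's 5' end.
    """
    labels = Counter()
    for pos in insertions:
        label = "other"
        for name, start, end, strand in orfs:
            if start <= pos <= end:
                label = "ORF"  # an ORF hit takes priority over a promoter hit
                break
            if strand == "+":
                in_promoter = start - upstream <= pos < start
            else:
                in_promoter = end < pos <= end + upstream
            if in_promoter:
                label = f"promoter:{name}"
        labels[label] += 1
    return labels


toy_orfs = [("geneA", 1000, 2000, "+"), ("geneB", 3000, 4000, "-")]
toy_insertions = [900, 950, 1500, 4100, 2500]
profile = classify_insertions(toy_orfs, toy_insertions)
print(profile)

# Counts from the slides: inserts clustered in promoters / total insertions.
promoter_fraction = 21_848 / 73_125
print(f"Fraction clustered in promoters: {promoter_fraction:.1%}")
```

With the slide values, a little under 30% of the mapped insertions fall in the promoter clusters.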

Tf1 insertions increased expression of adjacent genes

Gang et al, 2013, NAR.

Tf1 increased expression of adjacent genes by providing enhancer activity

The expression of Tf1 itself is induced by stress

The impact of Tf1 integration

1. Tf1 integrates into the promoters of stress response genes.

2. Tf1 does not reduce the expression of adjacent genes.

3. Tf1 can increase the expression of adjacent genes.

4. Tf1 transcription is increased by heat shock and oxidative stress.

Break


How can we test in a systematic way whether Tf1 integration benefits cells exposed to stress?

Analyze cultures of cells with insertions to determine whether cells with specific insertions grow faster than others.

Cobalt restricts the growth of S. pombe cells and has an impact on the viability

[Figure: cumulative number of cell generations, and viability (relative to 0 mM cobalt), plotted against cycles of 1-day growth in 0 mM, 0.2 mM, and 1.2 mM cobalt]

Cobalt:

Transition metal

Cobalt ions induce DNA damage, DNA-protein crosslinking, and sister chromatid exchange, and interfere with DNA repair

Experimental design

Library of 1 × 10^8 preexisting integration events: each cell has an integration of Tf1

T = 0 generations: extract genomic DNA

Competitive growth (32 °C) for 80 generations in 0 mM, 0.2 mM, or 1.2 mM cobalt

Deep sequencing of the Tf1 integration profile

Determination of the relative predominance of the integration sites

17 intergenic regions are reproducibly enriched after culture in 0.2 mM cobalt

Flanking genes (size of intergenic region):

sat1 / ssn6 (5.2 kb)

nth1 / SPAC30D11.06c (0.7 kb)

mip1 / sec21 (0.4 kb)

SPCC364.01 / wtf9 (1.2 kb)

sap1 / SPCC1672.03c (2.8 kb)

ubp22 / SPCC188.09c (0.5 kb)

sec8 / SPCC970.08 (0.9 kb)

SPAC27E2.14 / cdc1 (2.4 kb)

atp15 / SPBC31F10.16 (1.4 kb)

SPCC162.01c / adh1 (2.7 kb)

SPCC285.03 / SPCC285.04 (1.2 kb)

rpl3601 / mob2 (1.2 kb)

vps902 / SPBC29A10.12 (1.2 kb)

rpl302 / SPAPB8E5.07c (0.8 kb)

nat10 / SPAC20G8.10c (0.3 kb)

mok13 / zrt1 (3.2 kb)

SPCC18B5.02c / wee1 (3.4 kb)

Tf1 integration in > 99% of intergenic regions appears to have no impact on the fitness of the cell.

Cultures grown for 80 generations in 0.2mM cobalt were reproducibly enriched with cells containing integration in 17 specific intergenic sequences.

The reproducible enrichment of specific intergenic sequences indicates that Tf1 integration does provide a competitive advantage to specific cells.

Intergenic sequences enriched are adjacent to genes that include a zinc transporter.
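One way to express "relative predominance" is the log2 ratio of each intergenic region's share of reads after competitive growth to its share at T = 0, keeping only regions that pass a cutoff in every replicate. This is a sketch under stated assumptions: the pseudocount, the cutoff of 1.0, the region names, and the read counts are invented for illustration and are not the published analysis.

```python
import math


def log2_enrichment(counts_before, counts_after, pseudocount=1):
    """log2 of (read fraction after selection / read fraction before)."""
    total_before = sum(counts_before.values())
    total_after = sum(counts_after.values())
    enrichment = {}
    for region, n_before in counts_before.items():
        frac_before = (n_before + pseudocount) / total_before
        frac_after = (counts_after.get(region, 0) + pseudocount) / total_after
        enrichment[region] = math.log2(frac_after / frac_before)
    return enrichment


def reproducibly_enriched(replicates, min_log2=1.0):
    """Regions whose log2 enrichment is at least min_log2 in every replicate."""
    passing = [{r for r, v in rep.items() if v >= min_log2} for rep in replicates]
    return set.intersection(*passing) if passing else set()


# Hypothetical read counts per intergenic region.
before = {"regionA": 100, "regionB": 450, "regionC": 450}
rep1 = log2_enrichment(before, {"regionA": 800, "regionB": 100, "regionC": 100})
rep2 = log2_enrichment(before, {"regionA": 700, "regionB": 250, "regionC": 50})
print(reproducibly_enriched([rep1, rep2]))
```

Requiring enrichment in every replicate is what separates a reproducible fitness advantage from a lineage that simply drifted to high frequency in one culture.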

Results from monitoring populations of cells with Tf1 insertion grown in cobalt

Are transposons friends or foes?

•Transposon insertions disrupt gene function and cause diseases.

•There are few cases where transposons provide an ongoing essential function.

•Host cells evolved potent mechanisms to inhibit transposons.

•Vs.

•Transposons are induced by conditions of stress.

•Transposons can increase gene expression.

•Tf1 integration generates cells that have improved fitness in cobalt.

Applications of transposon technology

• All of the known DNA transposons of vertebrates have accumulated mutations and are no longer active.

Tools for gene therapy: The hunt for DNA transposons active in human cells

Molecular Reconstruction of a Salmonid Tc1-like Transposase Gene

Sequence alignment of 12 partial salmonid-type TcE sequences found in 8 fish species allowed Ivics and colleagues in 1997 to derive a consensus sequence.
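Deriving a consensus from aligned sequences can be sketched as a majority vote at each alignment column. Simple majority rule is an assumption made here for illustration; it is not a description of the procedure Ivics and colleagues actually used, and the toy sequences are invented.

```python
from collections import Counter


def consensus(aligned_sequences, gap="-"):
    """Majority-rule consensus of aligned sequences of equal length."""
    length = len(aligned_sequences[0])
    if any(len(s) != length for s in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    for column in zip(*aligned_sequences):
        residues = [c for c in column if c != gap]  # ignore gaps when voting
        result.append(Counter(residues).most_common(1)[0][0] if residues else gap)
    return "".join(result)


# Four hypothetical inactive copies, each with a different mutation or gap.
copies = ["ATGCA", "ATGAA", "TTGCA", "ATG-A"]
print(consensus(copies))
```

The logic is that independent copies accumulate different inactivating mutations, so the most common residue at each position is the best guess at the ancestral, active sequence.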

The Sleeping Beauty clone #10 is an active transposase in HeLa cells

Two plasmid system

Sleeping Beauty: A tool for the discovery of novel cancer genes

• The most common animal model used to study cancer is the mouse.

• Conventional methods for the identification of cancer genes relied on transgenic/knockout alleles or retrovirus insertion. Both of these approaches have serious limitations.

• Sleeping beauty (SB) was the first cut and paste transposon to function in mice.

• The labs of Nancy Jenkins and David Largaespada have made modifications of SB that allowed it to be more mutagenic in order to identify cancer genes (Dupuy et al., 2005, Nature vol 436:221). Both tumor suppressors and oncogenes were identified.

The modified SB carried a promoter and splice donor so that it can activate genes

Promoter

Splice donor

SB transposon

The SB also carried splice acceptors and a polyA signal so that it can disrupt genes regardless of SB orientation

Splice acceptors polyA sequences

SB transposon

polyA sequence

To activate the transposon the donor mouse was mated with RosaSB

120

The numbers of progeny carrying T2/Onc2 and RosaSB were significantly lower than expected based on Mendelian inheritance.

The 24 double transgenic mice were tumor-prone

Pathology

• Tumor cells were frequently found in all tissues of the animal.

• Some mice developed two or three different cancer types.

• Haematopoietic tumors predominated, possibly reflecting the large pool of haematopoietic stem cells in mice.

Analysis of SB integration sites in tumor DNA

• Using ligation mediated PCR they cloned 781 insertions from 16 tumors.

• Multiple genes were mutated by SB integration in two or more tumors.

• Seven of these genes were known cancer genes.
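Finding genes mutated in two or more tumors is a grouping problem: collect the set of tumors in which each gene was hit and keep genes hit in at least two. The sketch below uses hypothetical tumor and gene identifiers; only the totals in the last lines (781 insertions, 16 tumors) come from the slides.

```python
from collections import defaultdict


def common_insertion_genes(insertions, min_tumors=2):
    """Return {gene: number of tumors} for genes hit in >= min_tumors tumors.

    insertions: iterable of (tumor_id, gene) pairs, one per cloned insertion.
    """
    tumors_by_gene = defaultdict(set)
    for tumor_id, gene in insertions:
        tumors_by_gene[gene].add(tumor_id)
    return {g: len(t) for g, t in tumors_by_gene.items() if len(t) >= min_tumors}


# Hypothetical data: geneX is hit in three tumors, geneY twice in one tumor.
toy = [("t1", "geneX"), ("t2", "geneX"), ("t3", "geneX"),
       ("t1", "geneY"), ("t1", "geneY"), ("t2", "geneZ")]
print(common_insertion_genes(toy))

# From the slides: 781 insertions were cloned from 16 tumors.
print(f"Average insertions per tumor: {781 / 16:.1f}")
```

Counting tumors rather than insertions matters: two insertions in the same gene within one tumor may be clonal, whereas hits in independent tumors suggest selection.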

Sleeping beauty summary

• SB is a non-viral insertional mutagen that efficiently induces tumors in wild-type mice.

• With tissue specific expression of SB, genes associated with colon and liver cancer have now been identified.

• This SB system may someday allow mutagenesis of the mouse germline at frequencies that could be used for forward genetics.

Integration profiles are capable of testing gene function.

1. Generate high density of transposon insertions.

2. Grow cells under condition of interest and sequence millions of insertions.

3. Genes required to survive the condition of interest will lack insertions because those cells will not multiply in the growth condition.

[Figure: insertions mapped across four ORFs; the ORFs that lack insertions are labeled essential]
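The three steps above reduce to computing an insertion density for each ORF and flagging ORFs whose density is unusually low. This is a minimal sketch: the ORF names, coordinates, and threshold are hypothetical, and a real analysis would also account for ORF length, sequence bias of the transposon, and sequencing depth.

```python
def insertion_density(orfs, insertions):
    """Insertions per kb for each ORF.

    orfs: {name: (start, end)} with 1-based inclusive coordinates.
    insertions: iterable of insertion positions on the same coordinate system.
    """
    positions = list(insertions)
    density = {}
    for name, (start, end) in orfs.items():
        hits = sum(1 for pos in positions if start <= pos <= end)
        density[name] = hits / ((end - start + 1) / 1000)
    return density


def candidate_required_genes(density, threshold):
    """ORFs whose insertion density falls below threshold (insertions per kb)."""
    return sorted(name for name, d in density.items() if d < threshold)


toy_orfs = {"orf1": (1, 1000), "orf2": (2001, 3000)}
toy_insertions = [10, 200, 450, 700, 990, 2500]
density = insertion_density(toy_orfs, toy_insertions)
print(density)
print(candidate_required_genes(density, threshold=2))
```

ORFs with intermediate densities are also informative, since they point to genes that are not strictly required but still contribute to growth.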

Hermes transposition in pombe

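[Figure: Hermes transposition in S. pombe. Components: transposase and a DNA donor carrying kanMX6 (kan donor); integration into the genome gives G418 resistance. Conditions compared: transposase + kan donor, kan donor alone, transposase alone.]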

33% of the insertions occurred in ORFs

[Figure: integration events plotted against position (bp); 33% in ORFs]

Results of Illumina sequence reactions

• High quality raw sequences: 45,918,527

• Unique matches in genome: 27,502,999

• Independent integrations: 360,513

• Integration density: 1 insertion/29 nt
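A few derived quantities help put these sequencing numbers in context. The inputs are the four values on the slide; the derived names are illustrative, and the "implied span" is simply the product of the slide's own numbers, not an independent statement about genome size.

```python
raw_reads = 45_918_527               # high quality raw sequences (from the slide)
unique_matches = 27_502_999          # reads matching a unique genome position
independent_integrations = 360_513   # distinct insertion sites
nt_per_insertion = 29                # reported density: 1 insertion per 29 nt

mapped_fraction = unique_matches / raw_reads
reads_per_site = unique_matches / independent_integrations
implied_span_nt = independent_integrations * nt_per_insertion

print(f"Uniquely mapped fraction: {mapped_fraction:.1%}")
print(f"Average reads per independent integration: {reads_per_site:.1f}")
print(f"Sequence span implied by the density: {implied_span_nt:,} nt")
```

Roughly 60% of the reads map uniquely, and each independent integration is supported by many reads on average, which is why the site counts are treated as reliable.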

Cdc25 is an essential cyclin phosphatase and SPAC24H6.04 is a nonessential hexokinase

1 kb

cdc25 SPAC24H6.04

Cdc15 is an essential gene for cytokinesis and SPAC20G8.04c is a nonessential gene for electron transfer

1 kb

The essential genes had significantly lower densities of integration than the nonessential genes

10% of the consortium deletion strains still retain ORFs said to be deleted

Integration profiling identified genes required for cell division

• Integration profiling allowed genome-wide identification of genes required for cell division.

• Intermediate levels of integration identified nonessential genes that nevertheless contribute to growth.

• Hermes integration data revealed that approximately 10% of the consortium deletion strains still retain ORFs said to be deleted.

• Integration profiling can identify many types of gene functions as well as interactions with specific mutant alleles or pharmaceutical agents.

Section on Eukaryotic Transposable Elements

Lab members

Young-Eun Leem: Target recognition of ade6

Anasuya Majumdar: Target recognition of fbp1

Atreyi Chatterjee: Serial Number system

Yabin Guo: Genome-wide insertions

Caroline Esnault: Selection under stress

Sudhir Rae: Host factors for Tf1

Parmit Singh: HIV-1 integration

Adam Evertts: Hermes transposition

Jung Min Park: Hermes Integration profiling

Elizabeth Humes: Hermes Integration profiling

Stevephen Hung: Bioinformatic analysis

Collaborators

Nancy Craig: Hermes system

Sunil Gangadharan: Hermes in vitro