Flanking Sequence Tags identified by 3’-RACE · 2011-09-28
Flanking Sequence Tags identified by 3’-RACE
Laurence Meslet-Cladière and Olivier Vallon
UMR 7141 CNRS/UPMC,
Institut de Biologie Physico-Chimique
13 rue Pierre et Marie Curie, 75005 Paris, France
Tel: +33 1 5841 5058 · Fax: +33 1 5841 5022 · E-mail: ovallon@ibpc.fr
Summary
• Chlamydomonas reinhardtii is a well-established model organism for studying photosynthesis, chloroplast biology, etc., and for developing biodiesel technologies.
• Random insertional mutagenesis is a powerful way to study gene function, including at the genome scale. But once a library has been established, identifying the locus where each insertion occurred is not trivial. PCR-based methods for identifying Flanking Sequence Tags (FSTs), such as TAIL-, RESDA-, GenomeWalker- and SiteFinder-PCR, are sometimes difficult to apply to the 65% GC DNA of Chlamydomonas. We have sought to develop a method for FST determination that does not require the annealing of degenerate oligonucleotides.
• Using a novel spectinomycin resistance marker (a recoded version of the bacterial AadA gene), we show that a high number of transformants can be obtained with a truncated version of the cassette lacking a 3’-UTR. Antibiotic resistance then depends on the expression of a chimeric RNA whose 3’-UTR is provided by the flanking Chlamydomonas DNA.
• We have used classic 3’-RACE to amplify the chimeric AadA transcript from randomly selected transformants. Out of ... strains analyzed, we were able to extract FSTs from ... strains, and ... of them could be uniquely mapped to the genome.
• We propose this methodology as an alternative to DNA-based PCR methods for the large-scale sequencing of insertional mutant libraries.
[Figure: map of plasmid pALM32 (AR_CrAadA_3’-RBCS2). Features: HSP70A-Pro, RBCS2-Pro, 5' UTR, CrAadA, RBCS2-3' polyA, f1 (+) ori, bla, pUC ori. Restriction sites labelled on the map: BstAPI, BsgI, MunI, NdeI, BbvCI, BmtI, NheI, SmaI, XmaI, AscI, SphI, MauBI, AatII, SfiI, MunI, FspAI, FspI, PpuMI, PfoI, AatII, BamHI, BclI, XcmI, Acc65I, KpnI, FspI, PsiI, XmnI, ScaI, FspI, NmeAIII, PciI, SapI, AleI, SacII, NotI. A, B, C, D: primers for PCR.]
[Figure: bar chart. Y axis: colonies / transfo (0 to 900). X axis: constructs, labelled vector-AR-AadA-3', AR-AadA-3', R-AadA-3', AadA-3', AR-AadA and AadA. Series: Spec 75 µg/ml, Spec 100 µg/ml, Spec 200 µg/ml.]
Transformation by cut plasmid
The CrAadA marker can transform even when deprived of its promoter or 3’-UTR
Removal of the 3’-UTR only marginally reduces the transformation rate
Transformation by PCR products
[Figure: bar chart. Y axis: transformation efficiency (norm.), 0 to 2000. X axis: AR_CrAadA, AR_CrAadA_3’UTR and AR_CrAadA_3’UTR_T7pro, each with bars labelled blunt, 5' and uncut. Series: 75 µg/ml, 200 µg/ml.]
Generating FSTs by 3’-RACE
Template: marker DNA followed by the FST
mRNA: 5’ [marker][FST] AAAAAA 3’
Reverse transcription, primed by TTTT-primerA_B
1st-strand cDNA: 3’ [marker][FST] TTTT-primerA_B 5’
1st amplification: GSP1 + primer B
2nd, nested, amplification: GSP2 + primer A
Sequencing
3’-RACE method: Scotto-Lavino et al, Nat Protoc. 2006;1:2742-5
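Once a nested PCR product has been sequenced, the FST is the part of the read lying between the end of the marker cassette and the poly(A) tail. The Python function below is a minimal sketch of that trimming step, not the procedure actually used for this poster. The function name, the `cassette_end` argument (the last bases of the cassette expected in the read) and the toy sequences are all hypothetical.

```python
def extract_fst(read, cassette_end, min_poly_a=10):
    """Return the sequence between the cassette end and the poly(A) tail.

    read         -- sequence of the nested PCR product, in mRNA sense
    cassette_end -- last bases of the marker cassette (assumed to be known)
    min_poly_a   -- number of consecutive A's taken as the start of the tail
    Returns None when the cassette end is not found in the read.
    """
    read = read.upper()
    cassette_end = cassette_end.upper()
    start = read.find(cassette_end)
    if start < 0:
        return None  # junction not present in this read
    flank = read[start + len(cassette_end):]
    # Cut at the first run of at least min_poly_a consecutive A's, if any.
    tail = flank.find("A" * min_poly_a)
    if tail >= 0:
        flank = flank[:tail]
    return flank


# Toy example: every sequence here is invented for illustration.
toy_cassette_end = "TTCACGT"
toy_read = "GGCC" + toy_cassette_end + "ACGTGCATGCT" + "A" * 15 + "GATC"
toy_fst = extract_fst(toy_read, toy_cassette_end)
print(toy_fst)
```

An FST that ends in A cannot be told apart from the start of the tail, so the returned sequence may be short by one or more A's at its 3’ end. The "end modified" values in the summary table suggest the junction is not always the exact cassette end, so in practice `cassette_end` would probably need to sit a few bases inside the cassette.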
10 randomly picked strains, reverse-transcribed with primer QT
[Figure: gel images of PCR 1 and PCR 2 products. Panel labels: CrAadA; control: endogenous PETC. Lanes: C1, C2, #2.2, #2.4, #2.6, #3.1, #3.3, #4.1, #4.2, #4.3, #14.2, #14.3. * : bands that gave an FST.]
[Figure: gel images of PCR 1 and PCR 2 products. Lanes: #2.1, #2.3, #2.5, #3.2, #3.4, #3.5, #3.6, #4.4, #4.5, #4.6, #14.1, #14.4, C1, C2, C3.]
Sequencing of whole PCR reaction
>#4.4_CrAadA_F4 -- unclipped
GGGRTKAGCAGTATCTAGACGTCRACCCACTCTAKAGGATCCCCGCTCCGTGTWAATGGAGGCGTACGTASACACTGGGGGAGACTTTCCGTCACGGCCCCRWCCCARCGTC
GCTGGCAACGTCCACAGMTGTGCRCACCACGCGGCGCTGCTCACTCGCTGCCRACACGACSGCTCCCCGGCCCTGCCGCGGMCMTGCAGGTGRTCAAGGTGTTTGTAAGCKT
ATACAGTGACRACTACGGCAAGCGAGTGGCCATGGAGAACCTGCAGCGCCTGGAGCCSTRAGTGTCCRCRGGSRCCGGGGGGCATSKGACGAKGCATCKKGGCGGGGATGGA
AARGYSRGGGATGRCATCCSGGWGCRGGAGGGGTCSAGGAKTGAGGTGSGGCTGCGGGCCCACTTGRWGGMTAGTCTGTGSCMGCMGMTTGRCGTTTTCAGGGCCGCCACGG
CGCGTGTGACGGTCGASGACAGCGGACTCTCGCCACATCACACCGCRATCTGCTGCAGCTCACATGTAACCGTACCATACACRAAAAAAAAAAAAAAAAAARRWKKKKSKCY
KYYYCACAYAAAACGARKACTTTGAGGACCCARACGAGSGGCCGCAGGAGTACCCCAACCCCTTTGGCGACCTGTTMAWMRAMRACMRCGAKTACCGGRC
ARTGGCGGKCAAKCGCGTGSAGGAGCGGASGCGTAGCCAGGGCCKCCCGCASCCRCAAGGYCGCGGGCAKGGGCKGCRTGTGYAAKGGCAGCAGCGGGAGGMKGCGGCGGCT
GAGACTGGTGATGASTAGGTATAATGTCTGTTTGTCAGTGTATACTAACGAGCACGTGCGGGYRCGTGCAGGAACRGTGGATSGKCYGCWGYATGCAGGTTCATTGATAKCG
CAGTGCSACSGGAGCACGGRGCCTCMGGCACGCAAGAGCTACTKGWCCTACTGCTAGAGTMCGTCCTMCGGYGCTAGTCAGATCSCRCCTGGGAATCATCTTCTYGCCTKGT
GCGATGGWACGRGGTAAGGGGCAAGGATGCYGWATYCTGGMATCRYCTCSMCCGGTGCATCTTCYCCCCATTWACGTARCAGTTKCAYKSKTG
Au9.Cre02.g145000 Ribosome-binding factor A
NB: several polyadenylation sites; an intron is spliced out in the newly generated 3’-UTR
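The read above contains many IUPAC ambiguity codes (R, Y, K, S, ...) and an internal run of A's, as expected when a whole PCR reaction containing several products is sequenced directly. The sketch below is one way such a read could be inspected: it locates the first poly(A) run and reports the fraction of ambiguous calls on either side. The function names and the 12-base threshold are assumptions made for this illustration.

```python
import re

# IUPAC codes that stand for more than one base.
AMBIGUITY_CODES = set("RYSWKMBDHVN")


def ambiguity_fraction(seq):
    """Fraction of positions called with an IUPAC ambiguity code."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return sum(base in AMBIGUITY_CODES for base in seq) / len(seq)


def first_poly_a(seq, min_len=12):
    """Return (start, end) of the first run of >= min_len A's, or None."""
    match = re.search("A{%d,}" % min_len, seq.upper())
    return match.span() if match else None


def split_at_poly_a(seq, min_len=12):
    """Split a read into the parts upstream and downstream of the first poly(A) run."""
    span = first_poly_a(seq, min_len)
    if span is None:
        return seq, ""
    return seq[:span[0]], seq[span[1]:]


# Toy read, invented for illustration.
toy = "ACGTRACGT" + "A" * 14 + "KKSYC"
upstream, downstream = split_at_poly_a(toy)
print(first_poly_a(toy), ambiguity_fraction(upstream), ambiguity_fraction(downstream))
```

A sharp rise in the ambiguous fraction after a poly(A) run would be consistent with a mixture of transcripts polyadenylated at different sites, though it does not prove it.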
24 "difficult" strains, reverse-transcribed with another primer, QS
>QS
CGAGATCTACACTCTTTCCCTACACTAGACGACGCTCTTCCGATCTTTTTTTTTTTTTTTTTTT
>QU Tm=63.2°C
CGAGATCTACACTCTTTCCCTACACT
>QD Tm=64.4°C
CACTAGACGACGCTCTTCCGATCT
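The three primer sequences are related: QU and QD both lie within the adapter part of QS. The short Python check below makes the layout explicit by locating each primer inside QS. Reading QU as the outer primer and QD as the nested one is an assumption based on their positions, and the helper name `locate` is hypothetical.

```python
# Primer sequences copied from the list above.
QS = "CGAGATCTACACTCTTTCCCTACACTAGACGACGCTCTTCCGATCTTTTTTTTTTTTTTTTTTT"
QU = "CGAGATCTACACTCTTTCCCTACACT"
QD = "CACTAGACGACGCTCTTCCGATCT"


def locate(primer, template):
    """Return the (start, end) span of primer within template, or None."""
    start = template.find(primer)
    return None if start < 0 else (start, start + len(primer))


qu_span = locate(QU, QS)          # 5' part of the adapter
qd_span = locate(QD, QS)          # 3' part of the adapter
overlap = qu_span[1] - qd_span[0]  # bases shared by QU and QD
dt_stretch = QS[qd_span[1]:]       # what follows QD in QS

print("QU span:", qu_span)
print("QD span:", qd_span)
print("QU/QD overlap:", overlap, "nt")
print("after QD:", dt_stretch)
```

QU sits at the 5’ end of QS and QD immediately upstream of the oligo(dT) stretch, with the two primers sharing 4 nt (CACT).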
[Figure: gel images of PCR1 and PCR2 products. Lanes (same order on both gels): #2.6, 2.11, 3.4, 3.7, 3.8, 3.12, 3.11, 4.3, 4.5, 4.6, 4.9, 4.12, 11.3, 11.5, 11.6, 11.8, 11.9, 11.10, 11.11, 11.12, 14.2, 14.3, 14.4, con.]
Summary of FSTs
strain | cassette end | end modified | longest FST on genome | gene | annotation | position in gene | cassette/gene | polyadenylation signal
#2.1 | uncut | -9 | chromosome_16:304054-303727 | Cre16.g649800 | PPR3; Pentatricopeptide | intergenic (downstream) | + | not reached
#2.2 | uncut | 0 | chromosome_1:8429714-8429732, spliced to following exons | Cre01.g060850 | PSBS3, Chloroplast Photosystem II- | 3rd exon | + | not reached
#2.3 | uncut | 0 | chromosome_12:589974-589850 | Cre12.g488050 | FFT5, Fructan fructosyltransferase | intron 2 | - | not reached
#2.4 | uncut | 0 | chromosome_14:3326333-3326254 | Cre14.g630200 | no predicted function | 3' UTR | + | TGTAA (-17), that of host gene
#2.5 | uncut | 0 | chromosome_12:3269654-3269776 | Cre12.g512250 | HRDC domain, POLYMYOSITIS/SCLE | 4th exon | + | TGTTC ? (-37)
#2.6 | uncut | 0 | many locations | | TEs: Toc1 and DNA-2-7_CR | | | not reached
#2.7 | uncut | ? | chromosome_10:1630531-1630475 spliced to 1630157-1630059 | Cre10.g429850 | no predicted function | 7th exon | + | not recognizable
#2.8 | uncut | ? | scaffold_27:75383-75179 | Cre27.g774700 | SGNH hydrolase (GDSL hydrolase); | 5'UTR | + | not recognizable
#2.9 | uncut | -3 | chromosome_9:2341565-2341083 (NB: same as #4.3 !) | Cre09.g400950 | Pfam:07690 Major Facilitator Superfamily | 3'UTR (endogenous) | + | TGTAA (-20)
#2.11 | uncut | 0 | chromosome_17:1621018-1620917 | Cre17.g708000 | PAS domain protein | very end of the 3'-UTR | + | TGTGA (-33)
#3.1 | BsiWI | +4 | chromosome_3:1474847-1474820... and chromosome_3:1475170-1475092… | Cre03.g159300 | SMALL MEMBRANE PROTEIN-RELATED | | | not reached
#3.2 | BsiWI | +3 | chromosome_12:9166015-9166885 | Cre12.g560350 | CNK2; NimA-related protein kinase 2 | intron 1 | + | TGTAA (-19)
#3.3 | BsiWI | +4 | chromosome_7:1015855-1015936 | Cre07.g319550 | conserved hypothetical protein with FIST_C | intron 1 | + | TGTCC (-6) or TGTCC (-11) / TGGAA
#3.5 | BsiWI | -5 | a sequence similar to chromosome_7:2564723-2565098 and other locations | exact location unknown, similar to | unknown function (a sequence repeated >12 | ? | ? | TGTAA (-18)
#3.6 | BsiWI | +1 | chromosome_3:2440121-2440156 | Cre03.g166950 | PGM5; phosphoglycerate | intron 6 | + | TGTAA (-7)
#3.7 | BsiWI | ? | chromosome_14:3746617-3744745 | Cre14.g632700 | protein kinase | intron 19 on Au10.2 model | + | not reached
#3.11 | BsiWI | +4 | chromosome_17:2083314-2083570 with junction, but also chromosome_17:2083567- | Cre17.g712100 | MDAR1 | intron 7 | + | TGTGA (-4)
#4.1 | (SnaBI) | 0 | chromosome_5:377208-377154 and -377020 | Cre05.g231500 | Zn-finger protein | intron 6 | + | not recognizable
#4.2 | (SnaBI) | 0 | not in v4; in finishing draft: 8022_2:24991-25170 | ? | ? | ? | ? | not reached
#4.3 | SnaBI | 0 | chromosome_9:2341490-2341445 (NB: same as #2.9 !) | Cre09.g400950 | membrane protein of Major facilitator | 3' UTR | + | not reached
#4.4 | (SnaBI) | 0 | chromosome_2:9598215-9597761 | Cre02.g115000 | Ribosome-binding factor A | intron 4 | + | TGTAA (-18) / not reached
#4.8 | (SnaBI) | | chromosome_7:698506-698252 | | | Intergenic | | TGGTAC ? (-40)
#4.10 | (SnaBI) | ? | chromosome_7:4591244-4590811 | Cre07.g346000 | unknown function | end of 3'UTR | + | not recognizable
#11.1 | AatII | ? | chromosome_16:2391524-2391584 | Cre16.g666300 | protein kinase | upstream, intergenic | + | TGTAA (-19)
#11.4 | AatII | 2 | chromosome_4:704462-705281 | Cre04.g215800 | no annotation | last exon | + | not reached
#14.1 | uncut | ? | chromosome_14:2909515-2909394 | Cre14.g627600 | Dynein heavy chain | intron 6 and exon 8 | + | not reached
#14.2 | uncut | 0 | ambiguous: chromosome_12:4869726-4870332 or 4925725-4926331 or 4944276-4944882 (all with 1 intron) | Cre12.g526150 or Cre12.g526450 or Cre12.g526650 | Protein kinases | intron 3 | + | TGTGA (-4)
Total tested: 35; Failures: 8; Mappable FSTs: 27; Genome verified (thus far): 6
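The polyadenylation signal column of the summary reports a pentamer, most often TGTAA, with a negative offset. The Python sketch below shows how such a signal could be searched for in the bases just upstream of the poly(A) tail. It assumes the offset is counted from the first base of the signal to the poly(A) addition site, which may differ from the convention used in the table. The function name, the 40-base window and the toy sequences are hypothetical.

```python
def find_polya_signal(utr, signals=("TGTAA",), window=40):
    """Look for a polyadenylation signal near the 3' end of a sequence.

    utr     -- sequence ending at the last base before the poly(A) tail
    signals -- candidate signal motifs
    window  -- number of 3'-terminal bases searched
    Returns (motif, offset) for the occurrence closest to the 3' end, where
    offset is the negative distance from the first base of the motif to the
    end of the sequence, or None if no motif lies within the window.
    """
    utr = utr.upper()
    region_start = max(0, len(utr) - window)
    best = None
    for signal in signals:
        # rfind returns the last occurrence at or after region_start.
        pos = utr.rfind(signal, region_start)
        if pos >= 0 and (best is None or pos > best[1]):
            best = (signal, pos)
    if best is None:
        return None
    return best[0], best[1] - len(utr)


# Toy 3' end, invented for illustration.
toy_utr = "GCGC" + "TGTAA" + "CGCGCGCGCGCG"
print(find_polya_signal(toy_utr))
```

Variant motifs such as the TGTGA listed for some strains could be passed through `signals`, for example `signals=("TGTAA", "TGTGA")`.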