BioSci 203 Blumberg lecture 2 page 1 ©copyright Bruce Blumberg 2001-2005. All rights reserved

Bio Sci 203 Lecture 2 – genomic and cDNA libraries

• Bruce Blumberg ([email protected])
  – office – 2113E McGaugh Hall
  – 824-8573
  – lab x46873, x43116
  – office hours MWF 11-12 or by appointment
• http://blumberg-serv.bio.uci.edu/bio203-2004/index.htm
• http://blumberg.bio.uci.edu/bio203-w2004/index.htm
• Link is also on main class web site
• Today
  – genomic libraries
  – factors affecting sequence clonability in E. coli
  – cDNA library theory and construction

Genomic libraries

• What do we commonly use genomic libraries for?
  – Genome sequencing
  – gene cloning prior to targeted disruption or promoter analysis
  – positional cloning
    • genetic mapping
      – Radiation hybrid, STS (sequence tagged sites)
    • chromosome walking
    • gene identification from large insert clones
    • disease locus isolation and characterization
• Considerations before making a genomic library
  – what will you use it for, i.e., what size inserts are required?
    • Walking to a clone
    • isolation of genes for knockouts
  – Are high quality validated libraries available?
    • Caveat emptor
      – Drosophila ~50% of clones are not traceable to original plates
      – Research Genetics Xenopus tropicalis BAC library is really Xenopus laevis
    • apply stringent standards, your time is valuable

Genomic libraries (contd.)

• Considerations before making a genomic library (contd)
  – availability of equipment?
    • PFGE
    • laboratory automation
    • if not available locally, it may be better to use a commercial library when available
• Goals for a genomic library
  – Faithful representation of genome
    • clonability and stability of fragments essential
    • >5 fold coverage is desirable (i.e., base library should have a complexity of five times the estimated genome size to have a 95% probability of identifying a clone)
  – easy to screen
    • plaques much easier to deal with than colonies UNLESS you are dealing with libraries spotted in high density on filter supports
  – easy to produce quantities of DNA for further analysis
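The coverage goal above can be checked with the usual random-sampling estimate, N = ln(1 - P) / ln(1 - f), where f is insert size divided by genome size. This is a minimal sketch, assuming clones sample the genome uniformly at random; the function names and the example genome and insert sizes are illustrative and not taken from the lecture. Under this idealized model about 3-fold coverage already gives P ≈ 0.95 and 5-fold gives P ≈ 0.99, so the 5-fold figure leaves a margin for the non-random representation that real libraries show.

```python
import math


def clones_needed(genome_bp, insert_bp, probability):
    """Number of independent clones needed so that a given sequence is
    present with the requested probability (idealized random sampling)."""
    f = insert_bp / genome_bp  # fraction of the genome carried by one clone
    return math.ceil(math.log(1.0 - probability) / math.log(1.0 - f))


def probability_at_coverage(fold_coverage, genome_bp, insert_bp):
    """Probability that a given sequence is in a library whose total insert
    length equals fold_coverage times the genome size."""
    f = insert_bp / genome_bp
    n_clones = fold_coverage * genome_bp / insert_bp
    return 1.0 - (1.0 - f) ** n_clones


if __name__ == "__main__":
    genome = 3_000_000_000  # assumed example: a 3 Gb genome
    insert = 150_000        # assumed example: 150 kb inserts
    n = clones_needed(genome, insert, 0.95)
    print("clones for P = 0.95:", n)
    print("fold coverage at that size:", n * insert / genome)
    print("P at 5-fold coverage:", probability_at_coverage(5, genome, insert))
```

Smaller inserts raise the clone count proportionally, which is one reason the required insert size is worth settling before the library is made.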

Construction of a genomic library

• Prepare HMW DNA
  – bacteriophage λ, cosmids or fosmids
    • partial digest with frequent (4-base) cutter followed by sucrose gradient fractionation or gel electrophoresis
      – Sau3A (^GATC) most frequently used, compatible with BamHI (G^GATCC)
    • why can’t we use rare cutters?
    • Ligate to phage or cosmid arms then package in vitro
      – Stratagene >>> better than competition
  – Vectors that accept larger inserts
    • prepare DNA by enzyme digestion in agarose blocks
      – why?
    • Partial digest with frequent cutter
    • Separate size range of interest by PFGE (pulsed field gel electrophoresis)
    • ligate to vector and transform by electroporation
• What is the potential flaw for all these methods?
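The frequent-cutter choice above comes down to simple arithmetic: in a random sequence with equal base frequencies, an n-base site is expected once every 4^n bp. The sketch below is a hedged illustration under that assumption (real genomes deviate from it); the helper names are hypothetical. It also shows why Sau3A fragments can be ligated into a BamHI-cut vector: both enzymes leave the same GATC 5' overhang.

```python
def expected_spacing(site_length, alphabet_size=4):
    """Expected distance in bp between sites of the given length,
    assuming random sequence with equal base frequencies."""
    return alphabet_size ** site_length


def find_sites(seq, site):
    """Return 0-based start positions of every occurrence of site in seq,
    including overlapping occurrences."""
    seq, site = seq.upper(), site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start)
        start = seq.find(site, start + 1)
    return positions


def five_prime_overhang(site_with_cut):
    """Single-stranded 5' overhang left by a palindromic site written with
    '^' marking the top-strand cut, e.g. 'G^GATCC'."""
    cut = site_with_cut.index("^")
    site = site_with_cut.replace("^", "")
    # The bottom strand is cut at the mirror-image position.
    return site[cut:len(site) - cut]


if __name__ == "__main__":
    for n in (4, 6, 8):
        print(f"{n}-base cutter: one site per ~{expected_spacing(n)} bp")
    print(five_prime_overhang("^GATC"), five_prime_overhang("G^GATCC"))
```

With a site roughly every 256 bp, a partial digest can yield nearly any fragment in the desired size range, whereas sites tens of kilobases apart leave little choice about where fragments begin and end.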

Construction of a genomic library (contd)

• What is the potential flaw for all these methods?
  – Unequal representation of restriction sites, even 4 cutters, in genome
  – large regions may exist devoid of any restriction sites
    • tend not to be in genes
• Solution?
  – Shear DNA or cut with several 4 cutters, then methylate and attach linkers for cloning
  – benefits
    • should get accurate representation of genome
    • can select restriction sites for particular vector (i.e., not limited to BamHI)
  – pitfalls
    • quality of methylases
    • more steps
    • potential for artefactual ligation of fragments
      – molar excess of linkers

YACs, BACs and PACs

• Three complementary approaches, each with its own strengths and weaknesses

• YACs - Yeast artificial chromosomes
  – requires two vector arms, one with an ARS, one with a centromere
    • both fragments have selective markers
      – trp and ura are commonly used
    • background reduction is by dephosphorylation
    • ligation is transformed into spheroplasts
    • colonies picked into microtiter dishes containing media with cryoprotectant

YAC cloning

• YAC cloning (contd)
  – advantages
    • can propagate extremely large fragments
    • may propagate sequences unclonable in E. coli
  – disadvantages
    • tedious to purify away from yeast chromosomes by PFGE
    • grow slowly
    • insert instability
    • generally difficult to handle

BAC cloning

• Based on the E. coli F’ plasmid
  – partial digests are cloned into dephosphorylated vector
  – ligation is transformed into E. coli by electroporation
  – advantages
    • large plasmids - handle with usual methods
    • Stable - stringently controlled at 1 copy/cell
    • Vectors are small ~7 kb
      – good for shotgun cloning strategies
  – disadvantages
    • low yield
    • no selection against nonrecombinant clones (blue/white only)
    • apparent size limitation

P1 cloning

• P1 cloning systems
  – derived from bacteriophage P1
    • one of the primary tools of E. coli geneticists for many years
  – like cosmids, infect cells with packaged DNA then recover as a plasmid
  – useful, but size limited to 95 kb by “headful” packaging mechanism similar to bacteriophage λ

PAC cloning

• PAC - P1 artificial chromosome
  – combines best features of P1 and BAC cloning
  – size selected partial digests are ligated to dephosphorylated vector and electrotransformed into E. coli
    • Stored as colonies in microtiter plates
  – Selection against non-recombinants via SacBII selection (nonrecombinant cells convert sucrose into a toxic product)
  – inducible P1 lytic replicon allows amplification of plasmid copy number

PAC cloning (contd)

• PAC
  – advantages
    • all the advantages of BACs
      – stability
      – replication as plasmids
      – stringent copy control
    • selection against nonrecombinant clones
    • inducible P1 lytic replicon
      – addition of IPTG causes loss of copy control and larger yields
  – disadvantages
    • effective size limitation (~300 kb)
    • Vector is large – lots of vector fragments from shotgun cloning PACs

Comparison of cloning systems

| Feature | YAC | BAC | PAC |
|---|---|---|---|
| Host cells | S. cerevisiae AB1380, J57D | E. coli DH10B | E. coli DH10B |
| Transformation method | Spheroplast transformation | Electroporation | Electroporation |
| DNA topology of recombinants | Linear | Circular supercoiled | Circular supercoiled |
| Maximum insert size | >>1 Mb | ~300 kb | ~300 kb |
| Selection for recombinants | Ade2 supF red-white color selection | lacZ blue-white | SacBII selective growth |
| Selection for vector | Dropout medium (lacking trp and ura) | Chloramphenicol | Kanamycin |
| Enzyme for partial digests | EcoRI | HindIII | MboI or Sau3AI |
| Stability | Variable but can be very unstable | Very stable | Very stable |
| Degree of chimerism | Varies but can be >50% | Very low | Very low |
| Degree of co-cloning | Occasional | Undetectable | Undetectable |
| Purification of intact inserts | Difficult | Easy | Easy |
| Direct sequencing of insert | Difficult | Relatively easy | Relatively easy |
| Clone mating | Yes | No | No |

Which type of library to make

• Do I need to make a new library at all?
  – Is the library I need available? http://bacpac.chori.org/home.htm
    • PAC libraries are suitable for most purposes
    • BAC libraries are most widely available
    • If your organism only has YAC libraries available you may wish to make PAC or BACs
    • Much easier to buy pools or gridded libraries for screening
      – doesn’t always work
  – What is the intended use?
    • Will this library be used many times?
      – e.g. for isolation of clones for knockouts
      – if so, it pays to do it right
  – who should make the library?
    • Going rate for custom PAC or BAC library is 50K. Most labs do not have these resources
    • if care is taken, construction is not so difficult

Screening of genomic libraries

• What types of probes are suitable for screening genomic libraries?
  – suitable
    • cDNAs (or mRNAs)
    • genomic fragments
    • longer oligonucleotides (> 30 mers)
  – Not suitable
    • antibodies (no protein expression)
    • degenerate (mixed) oligonucleotides (genome complexity)
    • DNA binding proteins (genome complexity)

Sequence stability in E. coli

• What are the sorts of factors that might modulate whether a sequence can be stably propagated in E. coli?

  – 1 toxicity
  – 2 restriction
  – 3 recombination

Sequence stability in E. coli

• toxicity
  – sequence may lead to the production of a toxic product or toxic levels of an otherwise innocuous product
  – more problematic with cDNA than genomic clones
• restriction - Raleigh 1987 Meth. Enzymol. 152, 130-141
  – virtually all microorganisms have systems to destroy non-endogenous DNA (host range restriction)
    • four classes of restriction endonucleases
      – very important for cloning purposes are recently discovered systems that degrade DNA containing 5-methyl cytosine or 6-methyl adenine
      – If you are cloning genomic DNA, or hemimethylated cDNA, these are very important!
    • virtually all eukaryotic DNA contains 5-methyl cytosine and/or 6-methyl adenine
      – mcrA,B,C - methylcytosine
      – mrr - methyl adenine

Sequence stability in E. coli (contd)

• Restriction (contd)
  – foreign DNA escapes restriction 1/10^5 for EcoK and EcoB, 1/10 for mcrA
  – one needs to be conscious of the mcr and mrr restriction status of strains and packaging extracts to be used
• Recombination - Wyman and Wertman (1987) Meth Enzymol 152, 173-180
  – genomic DNA contains lots of repeated sequences
    • direct repeats
    • inverted repeats
    • interspersed repeats (e.g. Alu)
  – repeated sequences unstable in recombination proficient E. coli if cloned in:
    • lambda
    • plasmid
    • cosmid
  – seems not to apply to single copy vectors such as BAC and PAC
• What does this imply?
  – ~30% of the human genome is unstable in plasmid or phage clones
    • phages with such sequences either don’t grow at all or get shorter with time

Sequence stability in E. coli (contd)

• Recombination (contd)
  – E. coli has a variety of recombination pathways. These are the major players in causing sequence underrepresentation
    • recA required for all pathways
    • recBCD - major recombination pathway
    • sbcB,C - suppressor of B,C
    • minor pathways
      – recE
      – recF
      – recJ
    • rule of thumb - the more recombination pathways mutated, the sicker the cells and the slower they grow
  – major players for inverted repeats are recBCD and sbc
  – recA is most important for stabilizing direct repeats and preventing plasmid concatamerization

Sequence stability in E. coli (contd)

• Plating a genomic library
  – whenever possible, select a cell type that is recA, recD, sbcB and deficient in all restriction systems
    • Conveniently, EcoK, mcrB,C and mrr are all linked and often deleted together in strains
    • can get more than 100 fold difference in numbers of phage between wild type and recombination deficient
  – recD is preferred over recB,C because recD promotes rolling circle replication in lambda which improves yields

What do I need to know about E. coli genetics?

• You look in a supplier’s catalog and see lots of E. coli with different genotypes of the following general form:

– F’{lacIq Tn10 (TetR)} mcrA, Δ(mrr-hsdRMS-mcrBC), Φ80lacZΔM15, ΔlacX74, deoR, recA1, araD139, Δ(ara-leu)7697, galU, galK, rpsL(StrR), endA1, nupG

• Does this make any difference for your experiments?
  – Or should you simply follow the supplier’s instructions?
  – Or just use whatever people in the next lab are using without thinking about it?

What do I need to know about E. coli genetics?

• F’{lacIq Tn10 (TetR)} mcrA, Δ(mrr-hsdRMS-mcrBC), Φ80lacZΔM15, ΔlacX74, deoR, recA1, araD139, Δ(ara-leu)7697, galU, galK, rpsL(StrR), endA1, nupG

• restriction systems
  – mcrA - cuts Cm5CGG
  – mcrB,C - complex cuts at Gm5C
  – mrr - restricts 6-methyl adenine containing DNA
  – Why are these important?
  – hsdRMS - EcoK restriction system
    • R cuts 5'-AAC(N)6GTGC-3'
    • M/S methylates A residues in this sequence
• for stability of long repeated sequences
  – recA1 - deficient in general recombination
  – recD - deficiency in Exonuclease V
  – sbcB,C - Exonuclease I
  – deoR - allows uptake of large DNA
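The EcoK recognition sequence listed above, 5'-AAC(N)6GTGC-3', is asymmetric, so a scan for it has to look at both strands. The following is a minimal sketch using a regular expression; the function name is hypothetical. It only locates recognition sites: EcoK is a Type I system, and such enzymes cleave at a variable distance from the site, so the positions reported are not cut positions.

```python
import re

# Recognition sequence from the slide, written for the top strand, and its
# reverse complement so that sites on the bottom strand are also found.
ECOK_TOP = "AAC[ACGT]{6}GTGC"
ECOK_BOTTOM = "GCAC[ACGT]{6}GTT"


def ecok_sites(seq):
    """Return sorted (position, strand) tuples for every EcoK recognition
    site in seq. Positions are 0-based starts on the top strand."""
    seq = seq.upper()
    hits = []
    for pattern, strand in ((ECOK_TOP, "+"), (ECOK_BOTTOM, "-")):
        # A lookahead keeps the match zero-width so overlapping sites count.
        for match in re.finditer(f"(?={pattern})", seq):
            hits.append((match.start(), strand))
    return sorted(hits)


if __name__ == "__main__":
    print(ecok_sites("TTAACAAAAAAGTGCTT"))
```

An insert that carries such a site and lacks the protective adenine methylation is a target in an hsdR+ host, which is why the hsdRMS status of the plating strain matters.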

What do I need to know about E. coli genetics? (contd)

• for lac color selection
  – lacZ ΔM15 either on F’ or on Φ80 prophage
  – lacIq - overproduction of lac repressor. Prevents leaky expression of promoters containing lac operator
• for high quality DNA preps
  – recA1 - deficient in general recombination
  – endA1 - deficient in endonuclease I
• if you buy ESTs from Research Genetics (InVitrogen) or OpenBiosystems
  – tonA - resistant to bacteriophage T1
• for recombinant protein expression
  – lon - protease deficiency
  – OmpT - protease found in periplasmic space
  – most important protease inhibitor for E. coli protein preps is pepstatin A
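One way to avoid choosing a strain without thinking about it is to check a catalog genotype against the markers the slides associate with each purpose. The sketch below is only an illustration: the `REQUIREMENTS` table is a hypothetical checklist distilled from the slides, not a supplier's scheme, and the check is a plain substring match that relies on the convention that E. coli genotypes list only mutant alleles.

```python
# Example genotype copied from the slide.
SLIDE_GENOTYPE = (
    "F'{lacIq Tn10 (TetR)} mcrA, Δ(mrr-hsdRMS-mcrBC), Φ80lacZΔM15, "
    "ΔlacX74, deoR, recA1, araD139, Δ(ara-leu)7697, galU, galK, "
    "rpsL(StrR), endA1, nupG"
)

# Hypothetical checklist: purpose -> loci that should appear as mutant.
REQUIREMENTS = {
    "methylated genomic DNA": ["mcrA", "mcrBC", "mrr", "hsdR"],
    "high quality DNA preps": ["recA", "endA"],
    "plating a genomic phage library": ["recA", "recD", "sbcB"],
}


def missing_markers(genotype, purpose):
    """Return the required loci that are not listed in the genotype string.
    An empty list means the strain carries every marker on the checklist."""
    return [locus for locus in REQUIREMENTS[purpose] if locus not in genotype]


if __name__ == "__main__":
    for purpose in REQUIREMENTS:
        print(purpose, "-> missing:", missing_markers(SLIDE_GENOTYPE, purpose))
```

A substring match is crude (it cannot tell a marker on the F’ episome from a chromosomal one), so treat the result as a prompt to read the genotype carefully rather than as a verdict.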

What do I need to know about E. coli genetics? (contd)

• suppressors
  – supE - inserts glutamine at UAG (amber) codons
  – supF - inserts tyrosine at UAG (amber) codons
    • many older phages have S100am which can only be suppressed by supF
      – λZAP, λgt11, λZipLOX
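The effect of the two suppressors above can be illustrated with a toy translation routine that reads through UAG (TAG in DNA) when a suppressor is present. This is an idealized sketch: it assumes every amber codon is suppressed, whereas real suppression is only partial, and the function and table names are illustrative.

```python
BASES = "TCAG"
# Standard genetic code in TCAG order; '*' marks stop codons.
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Amino acid inserted at amber codons, as stated on the slide.
SUPPRESSOR_AA = {"supE": "Q", "supF": "Y"}


def translate(dna, suppressor=None):
    """Translate a coding sequence from its first codon. Stops at the first
    stop codon, except that TAG is read through if a suppressor is given."""
    dna = dna.upper()
    protein = []
    for pos in range(0, len(dna) - 2, 3):
        codon = dna[pos:pos + 3]
        if codon == "TAG" and suppressor is not None:
            protein.append(SUPPRESSOR_AA[suppressor])
            continue
        residue = CODON_TABLE[codon]
        if residue == "*":
            break
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    gene = "ATGTAGAAATAA"  # made-up gene with an amber mutation at codon 2
    for host in (None, "supE", "supF"):
        print(host, translate(gene, host))
```

The two hosts place different residues at the amber position, which is consistent with the slide's note that some amber alleles are rescued by only one of the suppressors.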

Construction of cDNA libraries

• What is a cDNA library?
  – Collection of DNA copies representing the expressed mRNA population of a cell, tissue, organ or embryo
• What are they good for?
  – Identifying and isolating expressed mRNAs
  – functional identification of gene products
  – cataloging expression patterns for a particular tissue
    • EST sequencing and microarray analysis

Determinants of library quality

• What constitutes a full-length cDNA?
  – Strictly it is an exact copy of the mRNA
  – full-length protein coding sequence considered acceptable for most purposes
• mRNA
  – full-length, capped mRNAs are critical to making full-length libraries
  – cytoplasmic mRNAs are best – WHY?
• 1st strand synthesis
  – complete first strand needs to be synthesized
  – issues about enzymes
• 2nd strand synthesis
  – thought to be less important than 1st strand (probably not)
• choice of vector
  – plasmids are best for EST sequencing
  – phages are best for manual screening
• how will library quality be evaluated
  – test with 2, 4, 6, 8 kb probes to ensure that these are well represented

cDNA synthesis

• Scheme
  – mRNA is isolated from source of interest
  – 1-2 µg is denatured and annealed to primer containing d(T)n
  – reverse transcriptase copies mRNA into cDNA
  – DNA polymerase I and RNase H convert remaining mRNA into DNA
  – cDNA is rendered blunt ended
  – linkers or adapters are added for cloning
  – cDNA is ligated into a suitable vector
  – vector is introduced into bacteria
• Caveats
  – there is lots of bad information out there
    • much is derived from vendors who want to increase sales of their enzymes or kits
  – not all manufacturers make equal quality enzymes
  – most kits are optimized for speed at the expense of quality
  – small points can make a big difference in the final outcome
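The strand relationships in the scheme above can be sketched at the sequence level: the oligo d(T) primer anneals to the poly(A) tail, the first strand is the DNA reverse complement of the mRNA, and the second strand restores the mRNA sequence as DNA. This is a toy model, not a protocol; the function names are hypothetical and the 18-nt primer length is an assumption, since the slide gives only d(T)n.

```python
RNA_TO_CDNA = str.maketrans("ACGU", "TGCA")     # RNA base -> pairing DNA base
DNA_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def first_strand_cdna(mrna, primer_len=18):
    """Return first-strand cDNA (5'->3') primed by oligo d(T) on the
    poly(A) tail. primer_len is an assumed value for d(T)n."""
    mrna = mrna.upper()
    if not mrna.endswith("A" * primer_len):
        raise ValueError("poly(A) tail too short for the oligo d(T) primer")
    # Complement each base, then reverse so the result reads 5'->3'.
    return mrna.translate(RNA_TO_CDNA)[::-1]


def second_strand_cdna(first_strand):
    """Return the second strand (5'->3'): the reverse complement of the
    first strand, i.e. the mRNA sequence written as DNA."""
    return first_strand.upper().translate(DNA_COMPLEMENT)[::-1]


if __name__ == "__main__":
    mrna = "AUGGCC" + "A" * 18  # made-up transcript with a poly(A) tail
    first = first_strand_cdna(mrna)
    print("first strand :", first)
    print("second strand:", second_strand_cdna(first))
```

Because synthesis starts at the 3' end of the message, an incomplete first strand loses 5' sequence, which is the reason the determinants-of-quality slide stresses complete first-strand synthesis.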