Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)


Page 1: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Page 2: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

The function of a gene can often be deduced from its nucleotide sequence.

A presumptive amino acid sequence, determined from the nucleotide sequence, can be compared with the proteins encoded by known genes. A significant similarity suggests a protein with an equivalent function.

DNA binding sites, receptor recognition sites, and transmembrane domains can be ascertained.

The non-coding regions may provide information about the regulation of a gene.

The sequence information is essential for molecular cloning studies and characterizing gene activity.

DNA Sequencing

Page 3: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

The DNA sequencing technique uses modified nucleotides (ddNTPs) with fluorescent “tags”.

ddNTPs lack a hydroxyl group at the 3’ position, however, so no new nucleotide can be added after one is incorporated.

Dideoxynucleotide Procedure

Page 4: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)
Page 5: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Normal DNA synthesis

An incoming dNTP base pairs with the complementary nucleotide of the template strand.

The internucleotide linkage occurs between the 3’ hydroxyl group of the last nucleotide of the growing strand and the α-phosphate group of the incoming nucleotide.

Page 6: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Blocked DNA synthesis

Chain growth is stopped by the addition of a dideoxynucleotide to the end of the growing strand.

The internucleotide linkage between the last nucleotide, which is a dideoxynucleotide, and the next incoming nucleotide cannot be formed because there is no 3’ OH group on the dideoxynucleotide sugar.

Page 7: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Primer extension

Page 8: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)
Page 9: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Simulated autoradiograph

Each lane of the gel was loaded with the contents of one of the four reaction tubes.

By convention, the bands of the autoradiograph are read from the bottom to the top.

Page 10: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Single-stranded DNA is mixed with DNA polymerase, short primer strands, the four normal dNTPs, and small amounts of fluorescently “tagged” ddNTPs.

When a ddNTP is incorporated, growth stops.

The result is a solution containing polynucleotides of different lengths, each ending with a tagged ddNTP.

DNA Sequencing

Page 11: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)
Page 12: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Electrophoresis separates the strands by length.

The color of the fluorescent tag indicates the type of ddNTP at the end of the strand.

By checking the end color of successive strand lengths, the sequence is revealed.
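The read-out described above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the primer is ignored, every possible termination product is assumed to be present, and the template string and function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def terminated_fragments(template_3to5):
    """Return every ddNTP-terminated product as (length, terminal base).

    The new strand grows 5'->3' as the complement of the template read 3'->5'.
    """
    new_strand = "".join(COMPLEMENT[base] for base in template_3to5)
    return [(i + 1, new_strand[i]) for i in range(len(new_strand))]

def read_sequence(fragments):
    """Order fragments by length, as electrophoresis does, and read the end tags."""
    return "".join(base for _, base in sorted(fragments))

template = "TACGGTCA"  # hypothetical template, written 3'->5'
fragments = terminated_fragments(template)
print(read_sequence(fragments))  # ATGCCAGT
```

Sorting by length stands in for electrophoresis, and the terminal base of each fragment stands in for the color of its fluorescent tag.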

DNA Sequencing

Page 13: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)
Page 14: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Electrophoresis Gel Images from an Automated DNA Sequencer

Page 15: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

The emission data are recorded, stored in a computer, and converted to nucleotide sequence information.

Page 16: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

To produce large amounts of dideoxynucleotide-terminated fragments from small amounts of template DNA, PCR-based cycle sequencing is commonly used.

The setup and components for this method are the same as for the standard dideoxynucleotide procedure, except that a thermostable DNA polymerase is required.

Since there is only a single primer in each reaction, the amplification of the fragments is linear.

The high temperature prevents both mismatching and secondary structure, which blocks elongation.

Cycle sequencing resolves between 600 and 800 nucleotides at a time.
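The difference between linear and exponential amplification can be made concrete with a small calculation. The sketch below assumes ideal, 100% efficient cycles, and the function names are hypothetical.

```python
def cycle_sequencing_products(templates, cycles):
    # One primer: only the original templates are copied in each cycle,
    # so the number of terminated fragments grows linearly.
    return templates * cycles

def pcr_products(templates, cycles):
    # Two primers: newly made strands also act as templates,
    # so the copy number doubles in every ideal cycle.
    return templates * 2 ** cycles

print(cycle_sequencing_products(1, 25))  # 25
print(pcr_products(1, 25))               # 33554432
```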

DNA Sequencing

Page 17: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Cycle sequencing

Page 18: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Primer walking

Page 19: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Primer walking

Both strands of the DNA must be sequenced.

False priming could give erroneous and ambiguous results.

The primers are generally at least 24 nucleotides long.

Highly stringent annealing conditions do not permit spurious binding of the primer to similar but not identical sequences.

Primer walking has been used to sequence pieces of DNA that have been cloned into bacteriophage λ or cosmid vectors (~20 and ~45 kb, respectively).

Page 20: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Pyrosequencing

The first of the second-generation sequencing technologies.

The basis of the technique is the detection of pyrophosphate that is released during DNA synthesis.

The α-phosphate of each incoming complementary dNTP is joined to the 3’ OH group of the last nucleotide of the growing strand.

The β- and γ-phosphates are cleaved as a unit that is called pyrophosphate.

Pyrophosphate is formed only when the complementary nucleotide is incorporated at the end of the growing strand.
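A minimal Python sketch of this principle follows. It assumes an idealized reaction in which one nucleotide type is dispensed per flow and the signal equals the number of pyrophosphates released, so a run of identical bases gives a proportionally larger signal. The flow order, template, and function names are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def flowgram(template_3to5, flow_order="ACGT", max_flows=40):
    """Return (nucleotide, pyrophosphates released) for each flow."""
    to_synthesize = [COMPLEMENT[base] for base in template_3to5]
    position = 0
    flows = []
    for i in range(max_flows):
        if position >= len(to_synthesize):
            break
        nucleotide = flow_order[i % len(flow_order)]
        released = 0
        # Pyrophosphate is released only while the dispensed nucleotide
        # is complementary to the next template position.
        while position < len(to_synthesize) and to_synthesize[position] == nucleotide:
            released += 1
            position += 1
        flows.append((nucleotide, released))
    return flows

def call_bases(flows):
    """Convert flow signals back into the synthesized sequence."""
    return "".join(nucleotide * released for nucleotide, released in flows)

flows = flowgram("TTGCA")  # hypothetical template, written 3'->5'
print(flows)               # [('A', 2), ('C', 1), ('G', 1), ('T', 1)]
print(call_bases(flows))   # AACGT
```

A flow with a signal of zero means that the dispensed nucleotide was not complementary and no pyrophosphate was formed.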

Page 21: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Pyrosequencing

Page 22: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Pyrosequencing

Page 23: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Pyrosequencing

Page 24: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Reversible chain terminators

In pyrosequencing, DNA is sequenced by synthesis.

Each of the four nucleotides must be added to the reaction sequentially in separate cycles.

This process would be faster if all the nucleotides were added together for each cycle.

To do this, it is necessary to ensure that the growing DNA strands are extended by only a single nucleotide during each cycle.

The incorporated nucleotides must also be recognized individually.

These requirements can be met with reversible chain terminators and four-color fluorescence.

Page 25: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Reversible chain terminators

The 3’ carbon of the deoxyribose sugar is capped with a chemical group that blocks subsequent addition of nucleotides.

A different fluorophore is attached to each nucleotide at positions that do not interfere with either base pairing or phosphodiester bond formation.

After incorporation, the emissions are recorded, and the 3’ blocking group and the fluorescent dye are then quickly removed.

The decapping step must restore a hydroxyl group at the 3’ position.
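The cycle of incorporation, detection, and decapping can be sketched as a loop that adds exactly one base per cycle. This is an idealized illustration; the dye colors, template, and function name are hypothetical and are not taken from any particular platform.

```python
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}
# Hypothetical dye colors, chosen only for illustration.
DYE = {"A": "green", "C": "blue", "G": "yellow", "T": "red"}

def sequence_by_synthesis(template_3to5, cycles):
    """Return one (base, color) call per cycle."""
    calls = []
    for position in range(min(cycles, len(template_3to5))):
        # All four capped, labelled nucleotides are offered together.
        # Only the complementary one is incorporated, and its 3' cap
        # prevents a second addition within the same cycle.
        base = COMPLEMENT[template_3to5[position]]
        calls.append((base, DYE[base]))  # emission recorded for this cycle
        # Decapping is implied here: dye and cap are removed and the
        # 3' OH is restored, so the next cycle can extend the strand.
    return calls

calls = sequence_by_synthesis("TTGCA", cycles=5)  # template written 3'->5'
print("".join(base for base, _ in calls))  # AACGT
```

Because each cycle extends the strand by a single capped nucleotide, a run of identical bases is read one base per cycle rather than as one larger signal.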

Page 26: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Reversible chain terminators

Page 27: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Sequencing by ligation

Sets of nonamer sequences, in which one position is fixed and the other eight sites are filled by any of the four nucleotides, are added to the reaction.

In the first cycle, the anchor primer anneals to the adaptor sequence at the 3’ end of the template sequence. Nonamers with A, T, C, and G in the first query position are added. The complementary nonamer that hybridizes to the template is ligated to the primer.

The fluorescent signal is recorded, and the ligated primer-nonamer strand is released by melting.

The cycle is repeated using another pool of nonamers with fixed nucleotides in query position 2 to identify the nucleotide in the second position.

Page 28: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Sequencing by ligation

Page 29: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

There are two categories of DNA sequencing projects: de novo genome sequencing and resequencing.

De novo genome sequencing is the sequencing of entire genomes that have not been sequenced before.

Resequencing entails comparing a newly determined sequence with a known reference sequence.

Applications include the identification of pathogenic strains, drug discovery, tests for disease-related mutations, forensic analyses, and development of biological products.

Large-scale DNA Sequencing

Page 30: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Determining the sequence of a long contiguous genome is more complex than sequencing a single piece of DNA.

First, the DNA fragments needed for Sanger sequencing have to be limited to a few hundred nucleotides in length.

A uniform concentration of each of the terminated fragments is required for reading the maximum number of bases.

However, this is difficult to achieve, since different concentrations of ddNTPs are needed for chains of different lengths.

Genomic DNA Sequencing

Page 31: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Second, the polyacrylamide gel electrophoresis used in sequencing cannot discriminate between fragments longer than 800 bases.

The number of bases that can be determined accurately in a single lane of a gel is about 750 bases.

Determining the sequence of any substantial segment of DNA involves generating many short sequencing reads from overlapping sections of DNA.

Combining these overlapping reads into one continuous sequence is called “assembling”.

Genomic DNA Sequencing

Page 32: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)
Page 33: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Whole genome sequencing is still difficult.

There were questions about how to collect, catalog, and assemble very large numbers of sequencing reads, especially with the repetitive sequences throughout the genome.

The computer programs for sequence assembly were not capable of handling the extremely large number of sequencing reads.

In the early 1990s, the “map-based” strategy was employed to sequence S. cerevisiae and C. elegans.

Genomic DNA Sequencing

Page 34: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)
Page 35: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Building maps is both time-consuming and expensive, so the “whole-genome shotgun” approach was considered.

It would be possible to sequence a genome by cloning it into many thousands of small plasmids, sequencing these at random, and assembling the reads without knowing the locations of the clones in the genome.

This method was used to sequence the genome of H. influenzae.

Genomic DNA Sequencing

Page 36: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

The DNA fragments were prepared by mechanically breaking genomic DNA into pieces of suitable size, which were cloned into vectors to make subclone libraries.

It is important to cover the whole genome; the coverage determines the number of independent subclones that will be needed to ensure a complete sequence.

To ensure that most of a genome is represented in the sequence data, typically a level of 6X – 10X coverage is needed.
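The relationship between coverage and the number of reads can be estimated with a short calculation. The sketch below uses the standard Poisson (Lander-Waterman) approximation, which assumes that reads land uniformly at random. The genome size and read length are hypothetical round numbers, not values from these slides.

```python
import math

def reads_needed(genome_size, read_length, coverage):
    """Number of reads giving the requested average coverage (coverage = N * L / G)."""
    return math.ceil(coverage * genome_size / read_length)

def expected_uncovered_fraction(coverage):
    """Poisson estimate of the fraction of bases hit by no read: e^(-coverage)."""
    return math.exp(-coverage)

genome_size = 2_000_000  # hypothetical genome size, in bases
read_length = 500        # hypothetical average read length, in bases

for coverage in (1, 6, 10):
    n = reads_needed(genome_size, read_length, coverage)
    missed = expected_uncovered_fraction(coverage) * genome_size
    print(f"{coverage}X: {n} reads, about {missed:.0f} bases expected to be missed")
```

Under this model the uncovered fraction falls off exponentially with coverage, which is one way to see why several-fold redundancy is needed before most of a genome is represented.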

Shotgun cloning strategy

Page 37: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Shotgun cloning strategy

Genomic DNA is isolated and randomly fragmented by sonication, nebulization, or hydrodynamic shearing.

The frayed DNA fragments are repaired and phosphorylated.

Fragments are separated into small, medium, and large fractions and cloned into plasmid and fosmid vectors.

After transformation of E. coli with the library, colonies with cloned DNA are picked and grown.

Vector DNA is purified from each library and sequenced.

Page 38: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Shotgun cloning strategy

Repair of the ends of frayed DNA and phosphorylation

T4 DNA polymerase or 3’-5’ exonuclease creates blunt ends.

T4 polynucleotide kinase phosphorylates the 5’ ends of the blunt-ended fragments.

Page 39: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Shotgun cloning strategy

- Genomic DNA
- Size fractionate
- End repair and phosphorylate
- Ligate, clone, and transform E. coli
- Extract DNA and sequence
- Assemble contigs
- Scaffolding and gap closure
- Finished sequence

Page 40: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Assembling the genome from a large number of sequence reads requires an extremely large number of pairwise comparisons to identify which sequencing reads overlap with which.

However, sequence reads are occasionally connected incorrectly.

Paired-end sequencing can minimize this type of error.
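The pairwise comparison step can be illustrated with a naive overlap search. This sketch assumes error-free reads and exact suffix-to-prefix matches; real assemblers use far more efficient and error-tolerant methods. The reads, the minimum overlap, and the function names are hypothetical.

```python
from itertools import permutations

def overlap(a, b, min_len=3):
    """Length of the longest suffix of a that equals a prefix of b (0 if below min_len)."""
    for n in range(min(len(a), len(b)), min_len - 1, -1):
        if a.endswith(b[:n]):
            return n
    return 0

def all_overlaps(reads, min_len=3):
    """Compare every ordered pair of reads: n * (n - 1) comparisons for n reads."""
    found = {}
    for a, b in permutations(reads, 2):
        n = overlap(a, b, min_len)
        if n:
            found[(a, b)] = n
    return found

reads = ["ATGCCGT", "CCGTAAG", "TAAGGCT"]  # hypothetical error-free reads
print(all_overlaps(reads))
# {('ATGCCGT', 'CCGTAAG'): 4, ('CCGTAAG', 'TAAGGCT'): 4}
```

Three reads need 6 ordered comparisons, and the count grows with the square of the number of reads, which is why very large read sets strained early assembly programs.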

Paired-end Sequencing

Page 41: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

The subclone plasmid library must carry inserts that are approximately the same length, 3 kb.

Both ends of the cloned insert of each plasmid are sequenced in the shotgun-sequencing stage.

When these reads are assembled, the computer checks whether any pair of reads from the same plasmid appears in the assembly at positions farther apart or closer together than about 3,000 bp.

If so, this is an indication of an assembly error.

Paired-end sequencing can also help to sequence across repetitive regions.
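The distance check on read pairs can be sketched as follows. The 3,000 bp expected distance comes from the insert size above; the tolerance, the clone names, and the positions are hypothetical values chosen for illustration.

```python
def flag_suspect_pairs(pair_positions, expected=3000, tolerance=500):
    """Return clones whose paired reads sit too far apart or too close together.

    pair_positions maps a clone name to the assembly positions of its two end reads.
    The tolerance is an assumed value, not one given in the source.
    """
    suspects = []
    for clone, (left, right) in pair_positions.items():
        distance = abs(right - left)
        if abs(distance - expected) > tolerance:
            suspects.append(clone)
    return suspects

pairs = {
    "plasmid_1": (1000, 4050),  # 3,050 bp apart: consistent with a 3 kb insert
    "plasmid_2": (2000, 9500),  # 7,500 bp apart: too far
    "plasmid_3": (7000, 7400),  # 400 bp apart: too close
}
print(flag_suspect_pairs(pairs))  # ['plasmid_2', 'plasmid_3']
```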

Paired-end Sequencing

Page 42: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)
Page 43: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)
Page 44: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)
Page 45: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Several types of problems with assembled sequences are identified.

- Low quality sequence
- Uncertain orientation
- Sequence gaps
- Clone gap

Sequencing Longer Sequence

Page 46: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Shotgun cloning strategy is very time-consuming and expensive, even though many of these steps have been automated.

Cyclic array sequencing has been developed.

In comparison to the 8 months required to sequence a human genome, cyclic array sequencing can provide the sequence of a human genome in 2 months.

The strategy: prepare libraries of DNA fragments for sequencing, immobilize the sequencing templates in a dense array on a surface, and use a sequence-by-synthesis approach.

Cyclic array Sequencing

Page 47: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

Features of adaptors A and B that are used for template preparation, PCR amplification, and sequencing using a cyclic array-sequencing strategy with the 454 sequencing platform.

Page 48: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

The adaptor A-genomic DNA-adaptor B strands without a biotin tag are released by melting, concentrated, and retained for sequencing.

Page 49: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

DNA capture bead. Oligomers that are complementary to the PCR amplification sequence of adaptor B are attached at their 5’ ends to a bead. Each DNA capture bead hybridizes with only one adaptor A-genomic DNA-adaptor B strand.

Page 50: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

The beads and PCR reagents, including PCR primers that anneal to sequences that are part of adaptors A and B, are stirred vigorously with oil to create a water-in-oil emulsion “microreactor”.

During the PCR cycles, strands with the same sequences as the isolated A-DNA-B molecules are synthesized.

Following the PCR, the emulsion is broken, the beads are collected, and all the free DNA molecules are washed away.

Pyrosequencing is used to determine the nucleotide sequence. The flow signals from each well are captured and stored in a computer.

Emulsion PCR

Page 51: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)
Page 52: Chemical Synthesis, Amplification, and Sequencing of DNA (Part II)

454 sequencing