Ultra-High Throughput DNA Sequencing on the 454/Roche GS-FLX: Methods, Automation, Applications. Graham Wiley, Roe Lab

1

Ultra-High Throughput DNA Sequencing on the 454/Roche GS-FLX

Methods, Automation, Applications

Graham Wiley

Roe Lab

2

A Brief History of Automated DNA Sequencing Instruments

[Figure: number of bases per run (in millions) versus date of introduction, 1994 to 2007, for the ABI 370/377, ABI 3700, ABI 3730, 454-GS20 (64,000,000 bases/run) and 454/Roche GS-FLX (100,000,000 bases/run).]

3

454 GSFLX Sequencer

• Pico-scale sequencing reactions

• 2 Core Techniques:

– Emulsion PCR

– Pyrosequencing

4

Emulsion PCR

• Micro-reactors

– Water-in-oil emulsion generates millions of micelles.

– Each micelle contains all reagents/templates for a PCR reaction.

– ~10 Million individual PCR reactions in a single tube.

5

Emulsion PCR

6

Load Beads into 454 Plate

Centrifugation

Load Enzyme Beads

44 μm

Load beads into PicoTiterPlate

7

Pyrosequencing

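[Figure: pyrosequencing reaction cascade on a DNA bead carrying the strand A A T C G G C A T G C T A A A A G T C A with an annealed primer. Labeled steps (1) to (5): polymerase incorporates dTTP (T), releasing PPi; sulfurylase converts PPi and APS to ATP; luciferase converts ATP and luciferin to light and oxyluciferin.]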

• Polymerase adds nucleotide (dNTP)

• Pyrophosphate is released (PPi)

• Sulfurylase creates ATP from PPi and APS

• Luciferase uses ATP to oxidize luciferin and produce light

Enzyme Bead

CCD camera detects bursts of light
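The reaction cascade above can be sketched as a simple model: each nucleotide flow produces light in proportion to the number of bases the polymerase incorporates in that flow. This is only an illustration, not the instrument's software. The cyclic flow order TACG, the function name `ideal_flowgram`, and the example read are assumptions made for this sketch.

```python
FLOW_ORDER = "TACG"  # assumed cyclic nucleotide flow order (illustrative)


def ideal_flowgram(read, n_flows, flow_order=FLOW_ORDER):
    """Return the noise-free signal expected for each nucleotide flow.

    `read` is the sequence of bases incorporated by the polymerase.
    In each flow, every consecutive base that matches the flowed
    nucleotide is incorporated, so the signal equals the homopolymer
    length at the current position (0 if the next base does not match).
    """
    signals = []
    pos = 0
    for i in range(n_flows):
        base = flow_order[i % len(flow_order)]
        run = 0
        while pos < len(read) and read[pos] == base:
            run += 1
            pos += 1
        signals.append(run)
    return signals


if __name__ == "__main__":
    # Hypothetical 5-base read, 8 flows (two cycles of T, A, C, G)
    print(ideal_flowgram("TTCTG", 8))
```

A real flowgram is noisy, so the measured values scatter around these integers rather than matching them exactly.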

8

Pyrosequencing Output

9

Base Calling via Flowgram

TTCTGCGAA
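A minimal sketch of how a flowgram could be turned into the base call shown above: each flow's signal is rounded to the nearest whole number of incorporated bases. The flow order TACG, the function name `call_bases`, and the signal values are assumptions chosen for illustration. They are not data from the slide, and the instrument's actual base caller is more sophisticated.

```python
FLOW_ORDER = "TACG"  # assumed cyclic nucleotide flow order (illustrative)


def call_bases(signals, flow_order=FLOW_ORDER):
    """Convert per-flow signal intensities into a base sequence.

    Each signal is rounded to the nearest non-negative integer, which is
    taken as the homopolymer length for the nucleotide of that flow.
    """
    called = []
    for i, signal in enumerate(signals):
        n = max(0, int(signal + 0.5))  # round to nearest integer
        called.append(flow_order[i % len(flow_order)] * n)
    return "".join(called)


if __name__ == "__main__":
    # Made-up intensities for 14 flows: T A C G T A C G T A C G T A
    signals = [2.04, 0.05, 1.02, 0.08, 0.97, 0.03, 0.10,
               1.05, 0.04, 0.06, 0.98, 1.01, 0.07, 1.96]
    print(call_bases(signals))
```

With these made-up intensities the call reproduces the sequence on the slide. Long homopolymers are the hard case for this scheme, because the gap between n and n+1 bases shrinks relative to the noise.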

10

Types of Libraries

• 454/Roche

– Shotgun

• Random 250+bp reads

– Paired-End

• 25-50bp ends of a circularized DNA molecule

– Amplicon

• PCR product for SNP discovery

• Roe Lab

– Paired-End/Shotgun

• Best of both worlds

11

454 Shotgun Library Preparation Protocol Overview

Nebulization

DNA End Repair

Adaptor Ligation (A&B)

DNA End Repair

Library Quantification on Caliper

12

454 Paired End/Shotgun DNA Preparation Protocol Overview

Shear to 2-4 Kbp fragments on the Hydroshear

DNA End Repair & Linker Ligation

Cleave the Terminal Linkers with EcoRI

Quantitate on Caliper AMS-90

Ligate to Circularize the DNA

Shear to ~500 bp fragments in the Nebulizer

13

454 Paired End/Shotgun DNA Preparation Protocol Overview (cont)

DNA End Repair, Adaptor Ligation, Adaptor End Repair

Quantitate on Caliper AMS-90

Amplification (emPCR)

Pyrosequencing on 454/Roche GS-FLX

14

454 Paired-End/Shotgun Assembly Process

• Separate based on inclusion or exclusion of middle linker

– Those sequences containing a middle linker are further separated based on the length of the read to either end of the linker sequence

– ~3-5% of the total reads contain the middle linker sequence

• Assembly of the reads by Newbler

• Convert paired ends for Exgap ordering and orienting

– *.454f and *.454r
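The read-separation step can be sketched as follows. This is a hedged illustration, not the Roe Lab pipeline: the linker sequence, the minimum flank length, and the rule for handling short flanks are all hypothetical, since the slide gives none of them. Only exact linker matches are found here, whereas a real implementation would need to tolerate sequencing errors.

```python
# Hypothetical values: the slide does not give the linker sequence or cutoff.
LINKER = "ACGTTGCAACGT"
MIN_FLANK = 15


def classify_read(read, linker=LINKER, min_flank=MIN_FLANK):
    """Split a read on the middle linker.

    Returns a tuple (kind, first, second):
      ("paired", left, right)   linker present, both flanks long enough
      ("shotgun", seq, None)    no linker, or only one usable flank
    """
    idx = read.find(linker)
    if idx == -1:
        return ("shotgun", read, None)
    left = read[:idx]
    right = read[idx + len(linker):]
    if len(left) >= min_flank and len(right) >= min_flank:
        return ("paired", left, right)
    # Assumption: when one flank is too short, keep the longer flank
    # and treat it as an ordinary shotgun read.
    longer = left if len(left) >= len(right) else right
    return ("shotgun", longer, None)


if __name__ == "__main__":
    example = "T" * 20 + LINKER + "C" * 20
    print(classify_read(example))
```

In this sketch the two flanks of a "paired" read would be the sequences written to the forward and reverse files (the slide names *.454f and *.454r) before ordering and orienting contigs.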

15

Automation of the Shotgun Library Preparation Steps

• Why automate?

– Time

– Reproducibility

• What are the obstacles?

– Reaction Cleanup

• Qiagen MinElute centrifuge columns are difficult to automate, so replace those steps with Agencourt SPRI magnetic beads and add a magnetic station to the Zymark SciClone bed

– Enzyme Stability and Storage

• Build an enzyme cooling station on the Zymark SciClone bed

16

SPRI Bead Technology

• Solid Phase Reversible Immobilization

• Carboxyl coated magnetic particles suspended in a solution of 10% PEG and 1.25M NaCl

• Reversibly binds DNA

– Hawkins, et al. (1994) DNA purification and isolation using a solid-phase. Nucleic Acids Research, 22(21):4543-4544

http://www.agencourt.com/products/spri_reagents/ampure/

17

DNA Purification through the Qiagen Minelute Columns vs Agencourt SPRI Magnetic Beads

Qiagen Minelute centrifuge column Agencourt SPRI magnetic beads

The two procedures give similar yields, with a slightly better yield from the SPRI beads, and the SPRI bead prep is somewhat easier to automate

18

96 well Magnetic Plate for Purification of the SPRI Beads

19

Enzyme Chilling Station

20

Zymark SciClone Deck Arrangement

Shaker

EtOH

Enzyme Mixes

Shaker

Shaker

Magnet

SPRI Beads

Sample

Buffers

Waste

21

Adding SPRI Beads on the SciClone

22

Magnetically Separating SPRI Beads on the SciClone

23

Washing SPRI Beads on the SciClone

24

Applications

• Whole Genome Sequencing

• Sample Pools

– BACs

– Viruses

• EST Libraries

• Bacterial Communities

25

Plant Viruses of the Tallgrass Prairie

• Single or double stranded RNA

• Typically <10,000bp, ~12,000bp max.

• 4-12 encoded genes

• Inherent instability of RNA leads to a large number of mutations and, hence, large species variation

26

cDNA pooling strategy

• Tags on PCR primers allow for deconvolution of viral sequences post sequencing

• cDNA samples are pooled in sets of 24-96 at the Noble Foundation and sent to OU for sequencing

27

Strategy for preparing cDNA ready for 454 sequencing from dsRNA

[Diagram: steps for preparing tagged cDNA from the dsRNA template, in the order they appear on the slide]

• Anneal with Random Hexamer Primers followed by Reverse Transcriptase PCR Reaction (primer drawn as the common sequence CCTTCGGATCCTCC followed by NNNNNN)

• RNAse Treatment to Remove any Excess Random Hexamer Primers followed by a Taq Polymerase PCR with one of the 20 Tagged Primers (tagged primer drawn as AGAGCCTTCGGATCCTCC)

• Additional Rounds of RT PCR with Random Hexamer Primers

• Amplified Product Ready for Ligating 454 A and B Primers

28

Uniquely Tagged cDNA Sample from the TGP on the 454

454 tag (TCAG)

TGP Unique tag (GACA)

TGP common primer (CCTTCGGATCCTCC)

RT-PCR Sequence
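The read layout above (454 tag, unique sample tag, common primer, then the RT-PCR sequence) suggests how pooled reads could be assigned back to samples. The sketch below assumes the read still begins with the 454 tag as drawn on the slide and matches every element exactly. The sample names and the second tag in the table are hypothetical; only TCAG, GACA and the common primer come from the slide.

```python
KEY = "TCAG"                      # 454 tag, from the slide
COMMON_PRIMER = "CCTTCGGATCCTCC"  # TGP common primer, from the slide
TAG_LENGTH = 4

# Hypothetical tag-to-sample table; only the GACA tag appears on the slide.
TAG_TO_SAMPLE = {
    "GACA": "sample_GACA",
    "AGAG": "sample_AGAG",
}


def demultiplex(read, tag_to_sample=TAG_TO_SAMPLE):
    """Return (sample, insert) for a tagged read, or None if it does not
    match the layout KEY + tag + COMMON_PRIMER + insert exactly."""
    if not read.startswith(KEY):
        return None
    tag_start = len(KEY)
    tag_end = tag_start + TAG_LENGTH
    primer_end = tag_end + len(COMMON_PRIMER)
    tag = read[tag_start:tag_end]
    if read[tag_end:primer_end] != COMMON_PRIMER:
        return None
    sample = tag_to_sample.get(tag)
    if sample is None:
        return None
    return (sample, read[primer_end:])


if __name__ == "__main__":
    read = KEY + "GACA" + COMMON_PRIMER + "ACGTAC"
    print(demultiplex(read))
```

Exact matching is the simplest choice and discards any read with an error in the tag or primer. A tolerant matcher would recover more reads at the risk of assigning some to the wrong sample.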

29

Putative New Allexivirus

• BlastX shows a large number of contigs have homology to viruses of the genus Allexivirus

• Contig sequence lengths cover ~66% of a typical Allexivirus genome of ~8.5 kb

• 5 of the 6 genes encoded by Allexivirus species are represented in the sequenced contigs

Contig         Contig Length  E-Value  Reference Coordinates  Reference Sequence [Species]
05TGP120_65    638            6.0E-16  1-108                  TGB 2 [Lily virus X]
05TGP120_69    826            1.0E-31  8-188                  coat protein [Garlic virus B]
05TGP120_74    512            1.0E-10  27-150                 40kDa protein [Garlic virus D]
05TGP120_83    190            3.0E-10  66-126                 helicase [Shallot virus X]
05TGP120_13    224            1.0E-27  125-197                replicase [Garlic virus A]
05TGP120_105   463            2.0E-52  246-381                replicase [Garlic virus A]
05TGP120_81    576            5.0E-51  768-887                replicase [Garlic virus A]
05TGP120_76    468            1.0E-61  854-1006               replicase [Garlic virus X]
05TGP120_75    661            6.0E-90  1102-1319              replicase [Garlic virus A]
05TGP120_95    1103           8.0E-96  1375-1561              replicase [Garlic virus A]
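The ~66% figure can be checked directly from the contig lengths in the table. The sketch below simply sums the lengths and divides by the ~8.5 kb genome size quoted on the slide. It assumes the contigs do not overlap, so the result is an upper bound: some reference coordinates in the table do overlap (for example 768-887 and 854-1006).

```python
# Contig lengths (bp) copied from the table above.
CONTIG_LENGTHS = {
    "05TGP120_65": 638,
    "05TGP120_69": 826,
    "05TGP120_74": 512,
    "05TGP120_83": 190,
    "05TGP120_13": 224,
    "05TGP120_105": 463,
    "05TGP120_81": 576,
    "05TGP120_76": 468,
    "05TGP120_75": 661,
    "05TGP120_95": 1103,
}

GENOME_BP = 8500  # "~8.5 kb" typical Allexivirus genome, from the slide


def summed_coverage(lengths=CONTIG_LENGTHS, genome_bp=GENOME_BP):
    """Return (total contig bp, fraction of genome), ignoring overlaps."""
    total = sum(lengths.values())
    return total, total / genome_bp


if __name__ == "__main__":
    total, fraction = summed_coverage()
    print(f"{total} bp in contigs, {fraction:.1%} of {GENOME_BP} bp")
```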

[Figure: map of the ~8.5 kb (+)ssRNA genome (replicase, helicase, membrane protein, hypothetical protein, coat protein, nucleic acid binding protein) showing the positions of the 05TGP00120 contigs 13, 105, 81, 76, 75, 95, 83, 65, 74 and 69.]

30

Current BAC Pooling Strategy

• 10x10 Grid of 100 BAC clones

• 1-fold coverage of each pool of 10 150 Kb BACs is 1.5 Mb

• 1 quarter of a 454/Roche GS-FLX picotiter plate gives ~13 Mb or ~10-fold coverage

• 5 full picotiter plate runs are required for 20-fold coverage of each individual BAC at the horizontal/vertical intersect.

• $12k/run = ~$600/BAC

• Additional ABI 3730 runs are needed for each pool to aid in deconvolution, at ~$1000 for each of the 20 pools, adding ~$800/BAC for a total cost of ~$1400 per BAC

Pool B

Pool A

X
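The arithmetic behind the grid-pooling figures can be laid out as a short calculation. All inputs are the values quoted on the slide. The function name and the returned keys are made up for this sketch, and the ~$800/BAC for ABI 3730 work is taken as given rather than derived.

```python
def grid_pooling_estimate(grid=10, bac_kb=150, quarter_plate_mb=13.0,
                          run_cost=12000, extra_3730_per_bac=800):
    """Reproduce the coverage and cost estimates for row/column pooling."""
    n_bacs = grid * grid                  # 10x10 grid -> 100 BACs
    n_pools = 2 * grid                    # 10 row pools + 10 column pools
    pool_mb = grid * bac_kb / 1000        # 10 BACs x 150 kb = 1.5 Mb
    fold_per_pool = quarter_plate_mb / pool_mb
    full_plates = n_pools / 4             # one quarter plate per pool
    # Each BAC sits in one row pool and one column pool.
    fold_per_bac = 2 * fold_per_pool
    cost_454_per_bac = full_plates * run_cost / n_bacs
    return {
        "n_pools": n_pools,
        "pool_mb": pool_mb,
        "fold_per_pool": fold_per_pool,
        "full_plates": full_plates,
        "fold_per_bac": fold_per_bac,
        "cost_454_per_bac": cost_454_per_bac,
        "total_per_bac": cost_454_per_bac + extra_3730_per_bac,
    }


if __name__ == "__main__":
    for key, value in grid_pooling_estimate().items():
        print(f"{key}: {value:g}")
```

With 13 Mb per quarter plate the per-pool coverage comes out slightly under 9-fold, which the slide rounds to 10-fold per pool and 20-fold per BAC.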

31

Future Tagged BAC Pooling Strategy

• 24 uniquely tagged individual shotgun libraries would be pooled and sequenced on one full 454/Roche GS-FLX picotiter plate

• 24 BACs of 150 Kb each would require 3.6 Mb for 1x sequence coverage

• With >75 Mb of DNA sequence obtained per full plate, >20x coverage is obtained for each of the 24 pooled BACs

• 96 BACs would therefore require 4 full plate runs on the 454/Roche GS-FLX

• At $12k/run, the cost is ~$500 per BAC for >20-fold shotgun coverage, and no ABI 3730 runs are needed to deconvolute the individual BACs because each BAC is individually tagged
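The tagged-pooling estimates follow from the same kind of arithmetic. The sketch below uses only the numbers on the slide and treats the ">75 Mb" per plate as exactly 75 Mb, so the coverage it reports is a lower bound. The function name and returned keys are assumptions for illustration.

```python
import math


def tagged_pooling_estimate(bacs_per_plate=24, bac_kb=150, plate_mb=75.0,
                            run_cost=12000, total_bacs=96):
    """Coverage and cost when individually tagged BAC libraries share a plate."""
    one_fold_mb = bacs_per_plate * bac_kb / 1000   # 24 x 150 kb = 3.6 Mb
    fold_coverage = plate_mb / one_fold_mb         # lower bound (>75 Mb/plate)
    plates_needed = math.ceil(total_bacs / bacs_per_plate)
    cost_per_bac = run_cost / bacs_per_plate
    return {
        "one_fold_mb": one_fold_mb,
        "fold_coverage": fold_coverage,
        "plates_needed": plates_needed,
        "cost_per_bac": cost_per_bac,
    }


if __name__ == "__main__":
    for key, value in tagged_pooling_estimate().items():
        print(f"{key}: {value:g}")
```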

32

Conclusions

• It is possible to incorporate both shotgun and paired end reads in the same library

• Qiagen MinElute centrifuge columns may be replaced by Agencourt SPRI beads after enzymatic steps in 454 library preparation.

• The replacement of centrifuge columns with magnetic beads as well as the manufacture of an enzyme chilling station allows for the automation of the library making process

• Through the use of tagged RT-PCR samples it is possible to sequence putatively novel plant viruses

33

Acknowledgments

• Dr. Roe

• Loaders & Data Analyzers: Simone Macmil, Doug White, Steve Kenton

• Makers & Breakers: Chunmei Qu, Ping Wang, Yanbo Xing, Baifeng Qin, Keqin Wang

• All other members of the Roe lab

• Collaborators

– OSU: Ulrich Melcher, Vijay Muthamukar

– Noble Foundation: Marilyn Roossinck, Guoan Shen, Byoung Min, Rick Nelson, Tracy Feldman

34