NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR,...


Course: Bioinformatics for Biomedical Research (2014). Session 2.1.1: Next Generation Sequencing. Technologies and Applications. Part I: NGS Introduction and Technology Overview. Statistics and Bioinformatics Unit (UEB) & High Technology Unit (UAT) from Vall d'Hebron Research Institute (www.vhir.org), Barcelona.

Transcript of NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR,...

Page 1: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)


Vall d’Hebron Institut de Recerca (VHIR)

Rosa Prieto, Head of the High Tech Unit

[email protected]

15/05/2014

Institut d’Investigació Sanitària acreditat per l’Instituto de Salud Carlos III (ISCIII)

NEXT GENERATION SEQUENCING TECHNOLOGIES AND APPLICATIONS

COURSE: BIOINFORMATICS FOR BIOMEDICAL RESEARCH

Page 2: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)

Index

1. INTRODUCTION TO NGS
2. NGS TECHNOLOGY OVERVIEW
3. NGS APPLICATIONS OVERVIEW
4. WHAT IS NEXT IN SEQUENCING TECHNOLOGIES?

Page 3: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)


Introduction

Personalized medicine era

Biomarker identification:
•Diagnostic
•Susceptibility/risk (prevention)
•Prognostic (indolent vs. aggressive)
•Predictive (response)

-The right therapeutic strategy for the right person at the right time
-Predisposition to disease
-Early and targeted prevention

Page 4: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)


Introduction: “omics”

“Omics”

Omics aims at the collective characterization and quantification of pools of biological molecules that translate into the structure, function, and dynamics of an organism or organisms (Wikipedia).

http://www.genomicglossaries.com/content/omes.asp

Genomics

High-throughput technologies

Epigenomics

Metagenomics

Transcriptomics

Proteomics

Metabolomics

Lipidomics

Page 5: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)

Next generation sequencing

The future is here, now?

Everything can be sequenced…

Page 6: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)


Introduction to NGS technologies

1st generation 2nd generation 3rd generation

http://www.ipc.nxgenomics.org/newsletter/no11.htm

3,234.83 Mb (haploid); $2.7 billion

Automatic sequencer ABI (1987)

(GS20)

Page 7: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)

Sequencing technology milestones

First generation sequencing Second generation sequencing

Page 8: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)

NGS increases capacity and reduces costs

Moore’s Law: the number of transistors in an integrated circuit doubles roughly every 2 years (1965).

Source - NHGRI : http://www.genome.gov/sequencingcosts/

Date | Cost per Mb | Cost per genome | % of Sep-01 cost
Sep-01 | $5,292.39 | $95,263,072 | 100%
Sep-02 | $3,413.80 | $61,448,422 | 64.5039%
Oct-03 | $2,230.98 | $40,157,554 | 42.1544%
Oct-04 | $1,028.85 | $18,519,312 | 19.4402%
Oct-05 | $766.73 | $13,801,124 | 14.4874%
Oct-06 | $581.92 | $10,474,556 | 10.9954%
Oct-07 | $397.09 | $7,147,571 | 7.5030%
Oct-08 | $3.81 | $342,502 | 0.3595%
Oct-09 | $0.78 | $70,333 | 0.0738%
Oct-10 | $0.32 | $29,092 | 0.0305%
Oct-11 | $0.086 | $7,743 | 0.0081%
Oct-12 | $0.074 | $6,618 | 0.0069%
Oct-13 | $0.057 | $5,096 | 0.0053%
Jan-14 | $0.045 | $4,008 | 0.0042%
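To make the comparison with Moore's Law concrete, the short sketch below uses only the first and last rows of the NHGRI table (cost per genome in Sep-01 and Jan-14). It estimates the observed fold reduction and the average cost-halving time, and contrasts them with what a 2-year doubling period would give over the same interval. The function names are illustrative, and fitting a single exponential between two points is a simplifying assumption: the table shows the decline was far from uniform, with the steepest drop between Oct-07 and Oct-08.

```python
from math import log2

# Cost per genome (US dollars), first and last rows of the NHGRI table.
COST_SEP_2001 = 95_263_072
COST_JAN_2014 = 4_008

# Elapsed time between the two data points, in years (Sep 2001 to Jan 2014).
YEARS_ELAPSED = (2014 - 2001) + (1 - 9) / 12


def fold_reduction(start_cost: float, end_cost: float) -> float:
    """How many times cheaper the end point is than the start point."""
    return start_cost / end_cost


def halving_time_years(start_cost: float, end_cost: float, years: float) -> float:
    """Average time for the cost to halve, assuming a single exponential decline."""
    return years / log2(start_cost / end_cost)


def moore_fold_change(years: float, doubling_period_years: float = 2.0) -> float:
    """Fold change expected from a quantity that doubles every `doubling_period_years`."""
    return 2 ** (years / doubling_period_years)


observed_fold = fold_reduction(COST_SEP_2001, COST_JAN_2014)
halving = halving_time_years(COST_SEP_2001, COST_JAN_2014, YEARS_ELAPSED)
moore_fold = moore_fold_change(YEARS_ELAPSED)

print(f"Observed fold reduction in cost per genome: {observed_fold:,.0f}x")
print(f"Average halving time: {halving * 12:.1f} months")
print(f"Fold change from a 2-year doubling over the same period: {moore_fold:.0f}x")
```

Under these assumptions the cost per genome halved, on average, in less than a year, which is well ahead of a 2-year doubling pace.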

Page 9: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)

Sanger sequencing vs. NGS (2nd and 3rd generation)

Sanger:
1. DNA fragmentation
2. Cloning into vectors (vector + DNA); bacterial transformation; growth and isolation
3. Cycle sequencing (primer, polymerase, dNTPs, labelled ddNTPs; sequence: CTATGCTCG), followed by electrophoresis (1 sequence per capillary)
4. Image processing

2nd generation NGS:
1. DNA fragmentation
2. In vitro adaptor ligation and clonal amplification
3. Massively parallel sequencing
4. Image processing and data analysis

3rd generation NGS:
1. DNA fragmentation
2. and 3. In vitro adaptor ligation and massively parallel sequencing WITHOUT amplification
4. Image processing and data analysis

Page 10: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)

Comparison of different NGS platforms

-Similarities (and differences vs. Sanger):

•library preparation: starting material is short fragments of nucleic acids; adapter ligation; multiplexing (MID tags)
•clonal amplification (not for 3rd generation sequencing)
•massively parallel sequencing
•the use of physical location to identify unique reads is a critical concept for all next generation sequencing systems. The density of the reads and the ability to record them without interfering noise is vital to the throughput of a given instrument.
•signal needs to be processed and post-treated to get the individual sequences
•complex data analysis due to the large amount of data

-Differences:

•Clonal amplification method/sequencing technology/signal detection
•Throughput
•Read length
•Run time
•Cost per base
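As an illustration of the multiplexing (MID tags) step listed among the similarities, here is a minimal sketch of how pooled reads could be assigned back to their samples by the tag at the start of each read. The tag sequences, sample names and the `demultiplex` function are hypothetical, and only exact tag matches are accepted; real pipelines usually tolerate sequencing errors in the tag.

```python
from collections import defaultdict

# Hypothetical MID tags (sample barcodes). Real kits define their own sequences.
MID_TAGS = {
    "AACCGG": "sample_A",
    "TTGGCC": "sample_B",
}


def demultiplex(reads, mid_tags):
    """Group reads by the MID tag found at their 5' end (exact match only).

    Assumes every tag has the same length. Matched reads are stored with
    the tag trimmed off; unmatched reads are kept whole under 'unassigned'.
    """
    tag_len = len(next(iter(mid_tags)))
    bins = defaultdict(list)
    for read in reads:
        tag, insert = read[:tag_len], read[tag_len:]
        if tag in mid_tags:
            bins[mid_tags[tag]].append(insert)
        else:
            bins["unassigned"].append(read)
    return dict(bins)


pooled_reads = ["AACCGGCTATGCTCG", "TTGGCCGGGTTTAAA", "GGGGGGACGT"]
by_sample = demultiplex(pooled_reads, MID_TAGS)
for sample, inserts in by_sample.items():
    print(sample, inserts)
```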

Page 11: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)

2nd generation NGS platforms

Illumina: HiSeq 2500, MiSeq, NextSeq 500, HiSeq X Ten (exp. 2014)

Life Technologies: SOLiD 5500xl, Ion PGM, Ion Proton

Roche: GS Junior 454, GS FLX+ 454

Benchtop instruments

Page 12: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)

NGS general workflow

1. Library preparation: DNA fragmentation and in vitro adaptor ligation. Different kinds of libraries (amplicons, shotgun, cDNA…)
2. Clonal amplification: emulsion PCR (454, Ion Proton/PGM) or bridge PCR (Illumina)
3. Cyclic array sequencing: pyrosequencing (454), semiconductor sequencing (Ion Proton/PGM), 4-colour fluorescent nucleotides (Illumina)

Page 13: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)

- 1 starting effective fragment per microreactor
- ~10^6 microreactors per ml
- All processed in parallel (clonal amplification)

High-speed shaker

Clonal amplification by emPCR (454, Ion)

emPCR based systems (Roche, SOLiD, Ion)

Page 14: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)


Clonal amplification by emPCR (454, Ion)

Clonal amplification??

No empty beads

No beads containing more than one amplified fragment

1) Bead vs. starting DNA quantity titration

2) Optimal enrichment:

Melt

dsDNA

Binding of biotin-labelled primer to capture beads carrying ssDNA

Addition of streptavidin-coated magnetic beads

Melt

5-20% OK
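The trade-off behind the bead vs. DNA titration can be sketched with a Poisson model. This is an assumption made for illustration and is not stated on the slide: if fragments are distributed at random over beads with an average of `copies_per_bead` fragments each, the fractions of empty beads, clonal beads (exactly one fragment) and mixed beads (more than one) follow directly. The function name is illustrative.

```python
from math import exp


def bead_fractions(copies_per_bead: float) -> dict:
    """Expected bead fractions under a Poisson loading model (an assumption)."""
    lam = copies_per_bead
    empty = exp(-lam)           # P(0 fragments)
    clonal = lam * exp(-lam)    # P(exactly 1 fragment)
    mixed = 1.0 - empty - clonal  # P(2 or more fragments)
    return {"empty": empty, "clonal": clonal, "mixed": mixed}


for lam in (0.05, 0.1, 0.2, 0.5, 1.0, 2.0):
    f = bead_fractions(lam)
    loaded = 1.0 - f["empty"]
    print(
        f"copies/bead={lam:<4} "
        f"DNA-carrying beads={loaded:6.1%}  "
        f"mixed among DNA-carrying={f['mixed'] / loaded:6.1%}"
    )
```

In this model, low DNA-to-bead ratios leave most beads empty but keep almost all DNA-carrying beads clonal, while higher ratios waste fewer beads at the price of more mixed beads. If the "5-20%" figure is read as the fraction of DNA-carrying beads recovered after enrichment (an interpretation, not something the slide states), it would correspond to the low-ratio end of this range.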

Page 15: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)

Cluster generation: bridge PCR, 100-200 million clusters

HiSeq 2500: 2 flow cells, 8 lanes per flow cell

Binding of single strands to the adaptors

Removal of the reverse strands

Blocking and addition of the sequencing primer

Double-stranded clonal clusters

Bridge amplification (Illumina)

Page 16: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)

Metal-coated PTP reduces crosstalk

29 μm well diameter (20/bead)

3,400,000 wells per PTP

GS FLX 454 sequencing

Page 17: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)


Pyrosequencing (sequencing by synthesis)

CCD Camera

“flowgram” (signal intensity is proportional to the number of nucleotides incorporated in the sequence)

- throughput limited by the number of wells in the PTP
- errors in homopolymers (454)
- long sequences (up to 1000 bp) are achieved
- low throughput, very expensive reagents

GS FLX 454 sequencing
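The flowgram idea can be sketched in a few lines: each nucleotide is flowed in turn, and the signal from a flow is rounded to the number of bases incorporated. The flow order, the example signal values and the `call_bases` function are assumptions for illustration, not the instrument's actual base-calling algorithm.

```python
# Cyclic nucleotide flow order, assumed here for illustration.
FLOW_ORDER = "TACG"


def call_bases(flowgram, flow_order=FLOW_ORDER):
    """Turn per-flow signal intensities into a sequence.

    Each signal is rounded to the nearest integer, taken as the number
    of identical bases incorporated in that flow (0 means no incorporation).
    """
    bases = []
    for i, signal in enumerate(flowgram):
        n_incorporated = max(0, round(signal))
        bases.append(flow_order[i % len(flow_order)] * n_incorporated)
    return "".join(bases)


# Flows:   T     A     C     G     T     A    C    G
signals = [1.02, 0.05, 2.10, 0.96, 0.03, 1.1, 0.0, 3.4]
print(call_bases(signals))

# A long homopolymer: the same absolute noise flips the call between 5 and 6 bases.
print(len(call_bases([5.4])), len(call_bases([5.6])))
```

Because the number of bases is inferred from signal intensity, a small amount of noise on a large signal is enough to change the call, which is one way to picture why homopolymers are error-prone in this chemistry.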

Page 18: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)


Illumina sequencing

- Limited by the fragment length that can “bridge”
- Labelled nucleotides are not incorporated as efficiently as native ones
- Short sequences
- Strand-specific errors, substitutions towards the end of the read, base substitution errors (systematic error GGT > GGG)
- High throughput, expensive machines, cost per Mb OK

Sequential release of 4 fluorescent nucleotides

Incorporation

Image capture

Removal of the 3’ terminator

Reversible dye terminator nucleotides (sequencing by synthesis)

Page 19: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)

Fragmentation & adaptor sequences

1. Sequential release of unmodified nucleotides
2. Incorporation of a nucleotide by the polymerase releases an H+
3. Direct and simultaneous detection of a pH change in all wells.

ION TORRENT (Life Techn.)

Clonal amplification (emPCR on beads)

Deposition of the beads + DNA into the wells of the chip

Ion Torrent sequencing

•pH meter, no optical system: rapid output improvement based on chips
•Fast runs (native nucleotides)
•Inexpensive machine and reagents
•Fails in homopolymer detection

Page 20: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)


NGS data analysis

454 sequencing

Pyrosequencing

Page 21: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)

NGS platforms comparison

PLATFORM | ROCHE GS FLX+ 454 | ILLUMINA HISEQ 2500 | ION PROTON
Library preparation | emPCR | Bridge amplification | emPCR
Sequencing chemistry | Pyrosequencing | Reversible dye terminators | pH change
Read length | Up to 1000 bp | From 2x125 bp to 2x300 bp | Up to 200 bp
Run time | 22 hrs | 7 hrs-6 days | From 2 to 4 hrs
Throughput/run | Up to 700 Mb | 500-1000 Gb (1 Tb) | 10 Gb (PI), 100 Gb (PII)
Equipment cost | $500,000 | $750,000 | $250,000
Reagents cost/run | $8,000 | $5,500 | $1,000
GOOD! | Longest read length | High throughput/low cost per base/ease of use | Quick, easy to use and cheap
BAD! | High error rate in homopolymers (>6); very expensive; low throughput; not automated at all | Short sequences; strand-specific errors, substitutions towards the end of the read, base substitution errors (systematic error GGT > GGG) | Errors in homopolymers; higher bias than Illumina
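The "cost per base" row of the comparison can be made concrete with a rough calculation from the reagent cost and throughput figures above. The sketch divides reagents cost per run by the maximum throughput per run, so it gives a best-case reagent cost per Gb; it ignores equipment, labour and failed runs, and the choice of the upper throughput value for each platform is an assumption.

```python
# Figures taken from the comparison table; throughput is the upper bound quoted.
PLATFORMS = {
    "Roche GS FLX+ 454": {"reagents_usd_per_run": 8_000, "max_gb_per_run": 0.7},
    "Illumina HiSeq 2500": {"reagents_usd_per_run": 5_500, "max_gb_per_run": 1_000},
    "Ion Proton (PI chip)": {"reagents_usd_per_run": 1_000, "max_gb_per_run": 10},
}


def reagent_cost_per_gb(reagents_usd_per_run: float, max_gb_per_run: float) -> float:
    """Best-case reagent cost per gigabase for a single run."""
    return reagents_usd_per_run / max_gb_per_run


cost_per_gb = {
    name: reagent_cost_per_gb(**specs) for name, specs in PLATFORMS.items()
}

for name, cost in sorted(cost_per_gb.items(), key=lambda item: item[1]):
    print(f"{name:<22} ~${cost:,.2f} per Gb (reagents only)")
```

Even as a rough estimate, the three platforms differ by orders of magnitude in reagent cost per Gb, which is consistent with the "low cost per base" and "very expensive" entries in the table.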

Page 22: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)


NGS High-Throughput Platforms comparison

HiSeq 2500
- Two modes: Rapid Run and High Output
- Single/Dual Flow Cells
- PE 2 x 125 bp
- 120 Gb in 27 hours (Rapid)
- 1 Tb in 6 days (High)
- 20 exomes in a day
- 1 human genome in a day
- 30 RNAseq samples in 5 hours
- Human exome, 30x: approx. 800-1000 €
- Human RNAseq (30 M reads, 100 bp PE, strand-specific): approx. 800-1000 €
- Human whole genome, 30x: 4000 €

HiSeq X Ten (10 HiSeq X)
- Only High Output mode
- Single/Dual Flow Cells
- PE 2 x 150 bp
- 600 Gb in a day (dual flow cell)
- 1.8 Tb in 3 days (4x faster than HiSeq 2500)
- HiSeq X Ten: 10,000 genomes at 30x per year

Ion Proton
- Ion PI chip: up to 20 Gb output (specification: 10 Gb); read length: up to 200 bp; run time: 2-4 hrs; 1 human exome (approx. 1000 €)
- Ion PII chip: up to 100 Gb output (expected 2014), now reduced to 20-30 Gb at launch; run time: 2-4 hrs; read length: 100 bp; human whole genome (10x, ?)
- Ion PIII chip (???): 200 Gb output per run

Source: Nextgenseek.com & Allseq.com. All these costs are indicative as of May 2014 and are in no way binding for the UAT.
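The throughput figures above translate into sequencing depth with simple arithmetic: mean coverage is the run output divided by the genome size. The sketch below uses the haploid human genome size quoted earlier in the slides (3,234.83 Mb); the function names are illustrative, and the result is an idealised upper bound that ignores duplicates, low-quality and unmapped reads, and uneven coverage.

```python
# Haploid human genome size in megabases, as quoted earlier in the slides.
HUMAN_GENOME_MB = 3234.83


def mean_coverage(output_gb: float, genome_mb: float = HUMAN_GENOME_MB) -> float:
    """Idealised mean depth if all output bases fall on one genome."""
    return output_gb * 1000 / genome_mb


def genomes_per_run(
    output_gb: float,
    target_coverage: float = 30,
    genome_mb: float = HUMAN_GENOME_MB,
) -> float:
    """Idealised number of genomes at `target_coverage` from one run's output."""
    return output_gb * 1000 / (genome_mb * target_coverage)


# Run outputs in Gb, taken from the platform figures above.
runs = {
    "HiSeq 2500 Rapid Run (120 Gb)": 120,
    "HiSeq 2500 High Output (1 Tb)": 1_000,
    "HiSeq X, one instrument (1.8 Tb)": 1_800,
}

for label, output_gb in runs.items():
    print(
        f"{label:<34} coverage of one genome: {mean_coverage(output_gb):7.1f}x, "
        f"30x genomes per run: {genomes_per_run(output_gb):5.1f}"
    )
```

For example, 120 Gb spread over one human genome gives a little over 30x, which matches the "1 human genome in a day" entry for Rapid Run mode.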

Page 23: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)


NGS Platforms specifications and applications

Ion PGM/Ion Proton

Illumina

Page 24: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)


Roche 454

NGS Platforms specifications and applications

PacBio RSII (3rd generation)

Page 25: NGS Introduction and Technology Overview (UEB-UAT Bioinformatics Course - Session 2.1.1 - VHIR, Barcelona)


NGS advantages and limitations

Journal of Investigative Dermatology (2013) 133