Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part...

46
Sample to Insight Next Generation Sequencing: An Introduction to Applications and Technologies Quan Peng, Senior Scientist, Genomics Assay Development, Research Foundation Eric Lader, Senior Director, Research and Development 1 Part 1 NGS Intro

Transcript of Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part...

Page 1: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Next Generation Sequencing: An Introduction to Applications and Technologies

Quan Peng, Senior Scientist, Genomics Assay Development, Research FoundationEric Lader, Senior Director, Research and Development

1

Part 1NGS Intro

Page 2: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

NGS Part 1: Introduction 2

Legal disclaimer

• QIAGEN products shown here are intended for molecular biology

applications. These products are not intended for the diagnosis,

prevention or treatment of a disease.

• For up-to-date licensing information and product-specific

disclaimers, see the respective QIAGEN kit handbook or user

manual. QIAGEN kit handbooks and user manuals are available

at www.QIAGEN.com or can be requested from QIAGEN

Technical Services or your local distributor.

Page 3: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Agenda

Next Generation Sequencing• Background• Technologies• Applications• Workflow

Targeted Enrichment• Methodologies• Data analysis

NGS Part 1: Introduction 3

1

2

Page 4: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Agenda

Next Generation Sequencing• Background• Technologies• Applications• Workflow

Targeted Enrichment• Methodologies• Data analysis

NGS Part 1: Introduction 4

1

2

Page 5: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

DNA Sequencing – Machine Output

10E+2

10E+4

10E+6

10E+8

Out

put (

Kb)

Adapted from ER Mardis. Nature 470, 198-203 (2011)

NGS Part 1: Introduction 5

Page 6: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Rapid decrease in cost

NGS Part 1: Introduction 6

Page 7: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Almost the $1000 genome….

NGS Part 1: Introduction 7

Page 8: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

What is Next-Generation Sequencing?

DNA is fragmented

Cloned to a plasmid vector

Or single PCR fragment

Cyclic sequencing reaction

Separation by electrophoresis

Readout with fluorescent tags

Automated Sanger Sequencingone residue at a time

NGS Part 1: Introduction 8

Page 9: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

What is Next-Generation Sequencing?

DNA is fragmented

Cloned to a plasmid vector

Or single PCR fragment

Cyclic sequencing reaction

Separation by electrophoresis

Readout with fluorescent tags

Automated Sanger Sequencingone residue at a time

NGS: Massive Parallel Sequencing

. DNA is fragmented

Adaptors ligated to fragments(Library construction)

Clonal amplification of fragments on a solid surface(Bridge PCR or Emulsion PCR)

. Direct step-by-step detection of each nucleotide base incorporated during the sequencing reaction

NGS Part 1: Introduction 9

Page 10: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Bridge PCR

• DNA fragments are flanked with adaptors (Library)

• A solid surface is coated with primers complementary to the two adaptor sequences

• PCR Amplification, with one end of each ‘bridge’ tethered to the surface

• Clusters of DNA molecules are generated on the chip. Each cluster is originated from a single DNA fragment, and is thus a clonal population.

• Used by Illumina

NGS Part 1: Introduction 10

Page 11: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Illumina HiSeq/MiSeq

• Run time 1- 10 days

• Produces 2 - 600 Gb of sequence

• Read length 2X100 bp – 2X250bp (paired-end)

• Cost: $0.05 - $0.4/Mb

NGS Part 1: Introduction 11

Page 12: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Illumina Single-End vs. Paired-End

• Single-end reading (SE): o Sequencer reads a fragment from only one primer binding site

• Paired-end reading (PE): o Sequencer reads both ends of the same fragment

o More sequencing information, reads can be more accurately placed (“mapped”)o May not be required for all experiments, more expensive and time-consumingo Required for high-order multiplexing of samples (indexes on both sides)

Single-end reading

2nd strandsynthesis

Pair-end reading

NGS Part 1: Introduction 12

Page 13: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Emulsion PCR: Ion, Solid, 454

• Fragments with adaptors (the library) are PCR amplified within a water drop in oil

• One PCR primer is attached to the surface of a bead

• DNA molecules are synthesized on the beads in the water droplet. Each bead bears clonal DNA originated from a single DNA fragment

• Beads (with attached DNA) are then deposit into the wells of sequencing chips, one well one bead

• Used by Roche 454, IonTorrent and SOLiDNGS Part 1: Introduction 13

Page 14: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Ion Torrent PGM/ Proton

• Run time 3 hrs – no termination and deprotection steps

• Read length 100‐300 bp; homopolymer can be an issue

• Throughput determined by chip size (pH meter array): 10Mb – 5 Gb

• Cost: $1 - $20/Mb

NGS Part 1: Introduction 14

Page 15: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Multiplexing samples for sequencing: Sample indexing

• Multiple samples with different indices can be combined and sequenced together

• Depending on the application, may not need to generate many reads per sample (e.g. SNP)

• Save money on sequencing costs (more samples per run), optimize use of read budget

NGS Part 1: Introduction 15

Page 16: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

NGS applications

Next Generation Sequencing

Genomics Transcript-omics

Epi-genomics

Meta-genomics

NGS Part 1: Introduction 16

Page 17: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Next Generation Sequencing

Genomics Transcript-omics

Epi-genomics

Meta-genomics

DNA-Seq

Mutation, SNVs, Indels,CNVs,

Translocation

NGS applications

NGS Part 1: Introduction 17

Page 18: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Next Generation Sequencing

Genomics Transcript-omics

Epi-genomics

Mutation, SNVs, Indels,CNVs,

Translocation

Expression level,Novel transcripts, Fusion transcripts,

Splice variants

Meta-genomics

DNA-Seq RNA-Seq

NGS applications

NGS Part 1: Introduction 18

Page 19: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Next Generation Sequencing

Genomics Transcript-omics

Epi-genomics

Mutation, SNVs, Indels,CNVs,

Translocation

Expression level,Novel transcripts, Fusion transcript,

Splice variants

Global mapping of DNA-protein

interactions, DNA methylation,

histone modification

Meta-genomics

DNA-Seq RNA-Seq ChIP-Seq, Methyl-Seq

NGS applications

NGS Part 1: Introduction 19

Page 20: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Next Generation Sequencing

Genomics Transcript-omics

Epi-genomics

Mutation, SNVs, Indels,CNVs,

Translocation

Expression level,Novel transcripts, Fusion transcript,

Splice variants

Global mapping of DNA-protein

interactions, DNA methylation,

histone modification

Meta-genomics

Microbial genomeSequence,

Microbial ID,Microbiome Sequencing

DNA-Seq RNA-Seq ChIP-Seq, Methyl-Seq

Microbial-Seq

NGS applications

NGS Part 1: Introduction 20

Page 21: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Considerations of NGS experiments: Read Budget

Read Budget is platform specific and determines multiplexing capacity

Read budget requirements differ for different applications

Genome

Methylome

Whole transcriptome

Exome

Targeted gene expression

Fusion transcripts

Somatic mutation detection

Structural variants

Germline SNP

Rea

ds n

eede

dM

ultiplexing possibility

NGS Part 1: Introduction 21

Page 22: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

NGS workflow

Sample preparation•Isolate samples (DNA/RNA)•Qualify and quantify samples•Several hours to days

Library construction•Prepare NGS library•Qualify and quantify library•~4-8 hours

Sequencing •Perform sequencing run•8 hours to several days

Data analysis •Primary, secondary data analysis•Several hours to days to…

NGS Part 1: Introduction 22

Page 23: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

QIAGEN solution for NGS workflow

Sample preparation

Library construction

Sequencing

Data analysis

• GeneRead® DNAseq NGS Panel System V2• MagAttract® HMW DNA Kit• REPLI-g® Single Cell Kit

• GeneRead™ rRNA Depletion Kit• GeneRead™ DNA QuantiMIZE

• GeneRead™ DNA Library Prep• GeneRead Targeted Gene Panels

• GeneRead™ Size Selection Kit• GeneRead™ Library Quant Kits

Result validation

• GeneRead DNAseq data analysis• Ingenuity Variant Analysis

• RT2 Profiler PCR Arrays• Somatic Mutation PCR Arrays• PyroMark Pyrosequencing

• CNA/CNV PCR Arrays• EpiTect ChIP PCR Arrays

NGS Part 1: Introduction 23

Page 24: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Agenda

Next Generation Sequencing• Background• Technologies• Applications• Workflow

Targeted Enrichment• Methodology• Data analysis• ???

NGS Part 1: Introduction 24

1

2

Page 25: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

The problem with patient samples

• Low purity o Cancerous cells may be a minor fraction of total sample

• Heterogeneityo Multiple sub-clones of cancer may be present in one tumor sample

• Deep sequencing o 1000X coverage is required to get >90% sensitivity to detect ~5% mutation frequency

Whole genome / exome sequencing is expensive and may not yield sufficient coverage

NGS Part 1: Introduction 25

Page 26: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

What is targeted sequencing?• Sequencing a region or subset of the genome or transcriptome

Why targeted sequencing?• Not all regions of the genome or transcripts are of interest or relevant to a specific study

o Exome Sequencing: sequencing most of the coding regions of the genome (exome). The protein-coding region constitutes less than 2% of the entire genome

o Focused panel/hot spot sequencing: focused on the genes or regions of interest e.g. Clinical relevance – tumor supressor genes, inherited mutations

What are the advantages of targeted panel sequencing?• More coverage per sample, more sensitive mutation detection

o 1 gene copy ~ 3 pg, 3000 copies in 10 ngo Heterogeneous sample 1% tumor cell = 30 copies in 10ngo Every base not covered equally in typical NGS experimento Typically read 1000 reads per locus for somatic mutationo More samples per run, lower cost per sample

Targeted sequencing

NGS Part 1: Introduction 26

Page 27: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Target enrichment: Methodology

Hybridization capture

• High input requirement (1 ug)

• Long processing time (2-3 days)

• Heterogeneous, lower specificity

• Cover large regions (exome)

NGS Part 1: Introduction 27

Data analysisSequencing

Hybridization capture

(24-72 hrs)

Library construction

Sample preparation

(DNA isolation)

Page 28: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Target enrichment: Methodology

Multiplex PCR

• Small DNA input (< 100 ng)

• No processing prior to enrichment

• Short library prep time (<8 hrs)

• Relatively small target region(KB - MB region)

NGS Part 1: Introduction 28

Data analysisSequencing

Hybridization capture

(24-72 hrs)

Library construction

Sample preparation

(DNA isolation)

Page 29: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

QIAGEN GeneRead DNAseq Panel System V2

FOCUS ON YOUR RELEVANT GENES • Focused:

o Biologically relevant content selection enables deep sequencing on relevant genes and identification of rare mutations

• Flexible: o Mix and match any gene of interest

oro Fully customizable panels available

• NGS platform agnostic: o Functionally validated for Ion Torrent,

MiSeq/HiSeq

• Integrated controls:o Enabling quality control of prepared

library before sequencing

• Free, complete and easy of use data analysis tool

NGS Part 1: Introduction 29

Page 30: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

GeneRead DNAseq V2 panel specifications

Application Panel name #genes

Target region(bases)

Coverage(%)

Specificity(%)

Uniformity(%)

Solid tumorsTumor Actionable Mutations 8 7,104 100.0 98.2 91Clinically Relevant Tumor 24 39,603 98.1 95.3 90

Hematologic malignancies Myeloid Neoplasms 50 236,319 98.1 97.4 94

Disease-specific

Breast Cancer 44 268,621 98.2 96.8 91Colorectal Cancer 38 182,851 98.7 98.3 95

Liver Cancer 33 191,170 99.0 96.4 96Lung Cancer 45 332,999 97.5 98.1 90

Ovarian Cancer 32 189,058 98.9 96.6 96Prostate Cancer 32 167,195 98.4 97.3 94Gastric Cancer 29 222,333 98.1 98.5 93

Cardiomyopathy 58 249,727 96.3 96.7 87

ComprehensiveCarrier Testing 157 664,735 97.5 97.9 91

Cancer Predisposition 143 620,318 98.3 96.8 93Comprehensive Cancer 160 744,835 98.0 97.7 92

Panel optimization results in outstanding experimental performance metrics

NGS Part 1: Introduction 30

Page 31: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

GeneRead BRCA1 and BRCA2 custom panel

Design and specifications

Overlapping amplicon design

100% coverage

Experimentally-verified 100% coverage

Regions targeted Coding regions + 20 bp intron-exon junctions

# of bases targeted 21,472

Average amplicon length 150 bp

Total input DNA 40 ng

Number of amplicons 250

Specificity 99%

Uniformity (0.2x mean) 97%

% of bases callable 100%

NGS Part 1: Introduction 31

Page 32: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

GeneRead DNAseq Custom Panel Builder

Build customized NGS panels in three simple steps

List targets

Customize

View & Order

NGS Part 1: Introduction 32

Page 33: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

GeneRead DNAseq Custom Panel Builder

Quick start guide

Enter targets

Choose specs

View design & Order

Gene namesAmplicon size: 150/225Flanking bases: 10 (default)

Chr Start Stop

Enter name

NGS Part 1: Introduction 33

Page 34: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

NGS data analysis

• Base callingo From raw data to DNA sequences: generate sequencing reads

• Mapping reads to a referenceo Align the reads to reference sequence, e.g. GRCh37 o Similar to a BLAST search: compares millions of reads against a reference

database

• Variants identificationo Identify the differences between sample DNA and reference DNA

• Variant prioritization/filtering/validation/interpretation

• Downstream validationo Typically PCR based, e.g. Somatic Mutation PCR analysis, methylation analysis,

allele specific amplification.

NGS Part 1: Introduction 34

Page 35: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

NGS data analysis: Alignment

GRCh37 reference genome

Sequencing reads

A

CC

NGS Part 1: Introduction 35

Page 36: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

NGS data analysis: Sequencing depth

• Coverage depth (or depth of coverage): how many times each base has been sequenced Unlike Sanger sequencing, in which each sample is sequenced 1-3 times to be confident of its nucleotide identity, NGS generally needs to cover each position many times to make a confident base call, due to relatively high error rate (0.1 - 1% vs 0.001 – 0.01%)

• Increasing coverage depth is also helpful to identify low frequent mutation in heterogenous samples such as identifying a somatic mutation in a heterogeneous cancer sample

GRCh37 reference genome

NGS reads

coverage depth = 4 coverage depth = 2coverage depth = 3NGS Part 1: Introduction 36

Page 37: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

NGS data analysis: Specificity

• Specificity: the percentage of sequences that map to the intended targets region of interest

number of on-target reads / total number of reads

ROI 1 ROI 2GRCh37 reference

NGS reads

NGS Part 1: Introduction 37

Page 38: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

NGS data analysis: Specificity

38

• Specificity: the percentage of sequences that map to the intended targets region of interest

number of on-target reads / total number of reads

ROI 1 ROI 2

Off-target reads

On-target reads On-target reads

GRCh37 reference

NGS reads

Page 39: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

NGS data analysis: Uniformity

• Coverage uniformity: measure the evenness of the coverage depth across target regiono Calculate coverage depth of each positiono Calculate the median coverage depth o Set the lower boundary of the coverage depth relative to median depth

o (eg. 0.2 X median coverage depth in PCR panels)o Calculate the percentage of the target region covered to the depth of or deeper

than the lower boundary GRCh37 reference

NGS reads

NGS Part 1: Introduction 39

Page 40: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

NGS data analysis: Uniformity

• Coverage uniformity: measure the evenness of the coverage depth across target regiono Calculate coverage depth of each positiono Calculate the median coverage depth o Set the lower boundary of the coverage depth relative to median depth

– (eg. 0.2 X median coverage depth in PCR panels)o Calculate the percentage of the target region covered to the depth of or deeper

than the lower boundary

coverage depth = 10 coverage depth = 2coverage depth = 3

GRCh37 reference

NGS reads

NGS Part 1: Introduction 40

Page 41: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

GeneRead data analysis solution

• Data analysis for the non-bioinformaticians among us

Input

• FASTQ file• Panel ID• Job type

o Singleo Matched

Tumor/Normal• Analysis mode

o Somatico Germline

Output

• Bam (sequence) and VCF (variant) fileso Sequencing metricso Variants detected, confidenceo Copy number alterations

• QIAGEN Advanced Bioinformatics (webinar IV)

Comprehensive and easy-to-use data analysis

NGS Part 1: Introduction 41

Page 42: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

NGS data analysis metrics

Run Summary• Specificity• Coverage• Uniformity• Numbers of SNPs and indels

Summary By Gene• Specificity• Coverage• Uniformity• # of SNPs and indels

NGS Part 1: Introduction 42

Page 43: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Features of Variant Report (QIAGEN)

SNP detection Indel detection

snpeff.sourceforge.net/

NGS Part 1: Introduction 43

Page 44: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Confirm variants of your NGS runs

Greek Symbol = Gene# = Mutation

• For Normalization• Assays detect non-

variable region (not ARMS-based design)

qBiomarker Somatic Mutation ARMS-PCR Assays and Arrays

NGS Part 1: Introduction 44

Page 45: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

GeneRead DNA NGS solutions from Sample to Insight

NGS Part 1: Introduction 45

Page 46: Next-Generation Sequencing an Intro to Tech and Applications: NGS Tech Overview Webinar Series Part 1

Sample to Insight

Thank you for attending today’s webinar!

Contact QIAGENCall: 1-800-426-8157

Email: [email protected]

[email protected]

Questions?

Thank you for attending

NGS Part 1: Introduction 46