NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By:...

45
NGS 101 for Pathology: NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist Agilent Genomics PR7000-2015

Transcript of NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By:...

Page 1: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

NGS 101 for Pathology: NGS Technology Simplified

Presented By:Brigette Kipphut

Field Application Scientist

Agilent Genomics

PR7000-2015

Page 2: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

• NGS can benefit turnaround times and

augment the current testing algorithms

Emerging Role of Next Generation Sequencing in Pathology

For Research Use Only. Not for use in diagnostic procedures.

PR7000-1944

PR7000-2015

Page 3: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

• NGS can benefit turnaround times and

augment the current testing algorithms

• NGS information can give insights into

clonal fractions and therapy selection

Emerging Role of Next Generation Sequencing in Pathology

For Research Use Only. Not for use in diagnostic procedures.

PR7000-1944

PR7000-2015

Page 4: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

• NGS can benefit turnaround times and

augment the current testing algorithms

• NGS information can give insights into

clonal fractions and therapy selection

• NGS technologies are now being seen

as options for companion diagnostics

Emerging Role of Next Generation Sequencing in Pathology

For Research Use Only. Not for use in diagnostic procedures.

PR7000-1944

PR7000-2015

Page 5: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Topics for Today’s Presentation

The NGS Library Prep Workflow 2

1

3 Whole Genome vs Targeted NGS

4

What is Next-Gen Sequencing?

NGS Analysis

5 Reviewing NGS Terminology

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 6: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Topics for Today’s Presentation

The NGS Library Prep Workflow 2

1

3 Whole Genome vs Targeted NGS

4

What is Next-Gen Sequencing?

NGS Analysis

5 Reviewing NGS Terminology

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 7: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

The Cornerstone Driving Next-Gen Sequencing Technology

For Research Use Only. Not for use in diagnostic procedures.

Research: 10 years

Cost: ~ $3 Billion

Completing The Human Genome…

…PricelessPR7000-2015

Page 8: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

What is Next-Gen Sequencing? A Brief History

For Research Use Only. Not for use in diagnostic procedures.

Massively Parallel Sequencing:- “Next-Generation Sequencing” (NGS)

• Does not use Sanger method• Different Platforms = Different Chemistries• Very High throughput instruments

- >100 gigabases of DNA sequence/day

Desktop Sized Sequencing Instruments & Beyond!:smaller but still as precise!!!

- “Next-Next Generation Sequencing”• Scaled down• Medium throughput• Individual Labs vs Core Facilities• Single molecule sequencers

Some food for thought:- What will sequencing be like 5, 10, 15 years from

now?

PR7000-2015

Page 9: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Next-Gen Sequencing Cost & Technology Timeline…

2003 2005 2006 2007 2008 2010 2011 2012 2013 2015

~$3B $1.5M

year

cost per genome

Sequencing technology

454(Roche)

$100M

Genome Analyzer(Solexa/Illumina)

SOLiD(Applied Bio)

$40K

HiSeq

SOLiD 5500

$4K $2K

Ion Proton

$10K $5K

Ion Torrent

MiSeq

PacBio

2017 2018

$1K <$1K?

MinION

HiSeqX10NextSeq

IonS5

For Research Use Only. Not for use in diagnostic procedures.

Completion of Human

Genome ProjectLaunch of 1000

Genomes Project

Whole Genomes

Sequenced in a day!

Single Cell Sequencing

2014

NovaSeq

GridION X5

10X Chromium

10X Single Cell Controller

PR7000-2015

Page 10: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Next-Gen Sequencing Platforms: System Overview

Next-generation platforms have some common elements and workflow

For Research Use Only. Not for use in diagnostic procedures.

IlluminaHiSeq/MiSeq

Ion TorrentIon Proton(Life Tech)

Sequencing Library

Genome sample

Array-based “Bridge-PCR”

(Clonal)

On-bead emulsion

PCR(Clonal)

Target Preparation Target Amplification

Random clusters on glass surface (flow cells)

Beads in defined microwells

(chip)

Sequencing Format

A,G,C,T cycle controlled fluidics Sequential nucleotide extension 2 Channel or 4 Channel imaging Performed on Flow Cells

A,G,C,T, cycle controlled fluidics Sequential nucleotide extension Detects H+ Ions Performed on a Semiconductor Chip

Sequencing Chemistry & Imaging

PR7000-2015

Page 11: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

“Next-Next” Gen Sequencing Platforms: System Overview

For Research Use Only. Not for use in diagnostic procedures.

PacBio

Nanopore

SequencingLibrary

Genome sample

Target Preparation

None

None

Target Amplification

Protein nanopores on electrically resistant polymer membrane

Nanophotonic confinementstructures on a surface

Sequencing Format

A,G,C,T cycle controlled fluidics Sequential nucleotide extension

- Single DNA polymerase and single stranded DNA molecule

- “Real Time PCR” 4-color fluorescent imaging (movie)

Nucleotides move through the nanopore and are “read” by changes in electric voltage associated with A, C, T, or G

Sequencing Chemistry & Imaging

PR7000-2015

Page 12: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

What can you do using NGS Technology:Applications for Basic and Clinical Research

Whole Genome

Exome & Targeted DNA-seq

TranscriptomeTargetedRNA-seq

miRNA-seq

Whole genome & Targeted Bi-Sulfite

Seq

• Exome: Most popular targeted sequencing method • Provides more reads in regions of interest• More efficient than WGS• Medium to very high coverage

• Sequence cDNA instead of DNA• Monitor gene expression and identify fusion and/or splicing events.• Targeted RNA-seq can add sensitivity to identify rare/low expressing transcripts

• Sequence DNA & identify methylated cytosines.• Can complement both DNA and RNA-seq

• WGS drives initial discovery• Low throughput• Medium to low coverage

For Research Use Only. Not for use in diagnostic procedures.

Large amplifications

Large deletions

Point mutations (SNP)

Insertions/Deletions

Rearrangements

Copy number (CNV)

Fusions/splice variants

Gene expression data

Methylation status

3D Genomic Interaction

Types of Variants Detectable using NGS

PR7000-2015

Page 13: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Topics for Today’s Presentation

The NGS Library Prep Workflow 2

1

3 Whole Genome vs Targeted NGS

4

What is Next-Gen Sequencing?

NGS Analysis

5 Reviewing NGS Terminology

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 14: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Learning the NGS Workflow:Generating a Sequencing Library

Library - A collection of DNA or cDNA fragments prepared for sequencing by performing a series of enzymatic steps. These steps are commonly referred to as the Library Prep.

Isolate gDNA

Shearing/Fragmentation1. Sonication (Covaris)2. Restriction Enzymes3. Transposases4. Chemical/Heat (RNA-seq)

End Repair• Generate blunt

end fragments

A-Tailing• Add an “A” base to

3’ end of each strandA

A

Adapter Ligation• Adapters are short DNA oligos

that contain the primer sites used by the sequencer to generate the sequencing read

• Adapters can also contain short sequences called sample indexes or barcodes

• Incorporating barcodes allows different samples to be combined in the same sequencing run (multiplexing)

A

ATT

T

Y-ShapedSequencing Adapters

PCR• Using PCR primers

complementary to the adapters, DNA fragments with properly ligated adapters are selected for and amplified

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 15: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Library complexity• A sequencing library comprised of many unique molecules (e.g.

fragments with a unique start and stop point) is considered to have a high degree of library complexity.

• Each step of the workflow can contribute to the level of complexity in the final sequencing library. adapter ligation and PCR amplification steps often have the greatest impact

• Greater library complexity typically provides greater confidence that a varient detected in the sample is real. Exception are libraries that have molecular barcodes

High complexity Low complexity

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 16: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Molecular Barcoding: Monitoring with more precision

For Research Use Only. Not for use in diagnostic procedures.

What is molecular barcoding or unique molecular indexing?

• Unique DNA sequence tagged to DNA fragment

• Range between 3-5 (dual) and 10-15 (single) bases

• One or two MBCs per template DNA molecule

• Added before any PCR is performed

• Sequences originating from the same template DNA fragment can be pooled together to create a consensus read.

• Reduces false-positives from PCR & instrument induced errors

• Increases fidelity and sensitivity of variant calling, allowing low allele frequency variant detection (< 1%).

• Current limit of detection without a molecular barcode is ~3%

PR7000-2015

Page 17: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Molecular Barcoding: How does it work?

1. Align reads to reference genome

2. Identify reads with same start and stop point (PCR duplicate reads)

3. Group those duplicate reads by their molecular barcode sequence (barcode family)

4. Consolidate read information to create a consensus read (condense PCR duplicates)

5. Filter out errors (false positives) from true variants

T

T

T

A

A

C

G

Group reads with the same molecular

barcode

Barcode family

TConsensus read

True variant

Random error

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 18: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Six Things to Consider Regarding NGS Library Prep Workflow

1. What kind of sample am I using and how much do I have?- High quality gDNA from cells or fresh/frozen tissue?- Degraded gDNA from Formalin Fixed Paraffin Embedded Blocks (FFPE)?- Do you have micrograms, nanograms, picograms?

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 19: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Six Things to Consider Regarding NGS Library Prep Workflow

1. What kind of sample am I using and how much do I have?- High quality gDNA from cells or fresh/frozen tissue?- Degraded gDNA from Formalin Fixed Paraffin Embedded Blocks (FFPE)?- Do you have micrograms, nanograms, picograms?

2. What do I want to learn from the samples I prepare?- Identify single nucleotide polymorphisms/variants (SNPs/SNVs)- Insertions and/or deletions (InDels)- More complex rearrangements: Translocations, Inversions, Copy #

Variations (CNVs)

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 20: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Six Things to Consider Regarding NGS Library Prep Workflow

1. What kind of sample am I using and how much do I have?- High quality gDNA from cells or fresh/frozen tissue?- Degraded gDNA from Formalin Fixed Paraffin Embedded Blocks (FFPE)?- Do you have micrograms, nanograms, picograms?

2. What do I want to learn from the samples I prepare?- Identify single nucleotide polymorphisms/variants (SNPs/SNVs)- Insertions and/or deletions (InDels)- More complex rearrangements: Translocations, Inversions, Copy # Variations

(CNVs)

3. One-size doesn’t fit all- Library preps & Target Enrichment technologies come in several flavors with

their own sets of advantages & disadvantages- Know what to look for during the QC steps of a given library prep workflow- Poor quality and very low input starting materials may require special handling- More input may be required, Whole Genome Amplification- Results from high quality gDNA ≠ Results from FFPE gDNA

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 21: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

4. Higher yields aren’t necessarily better- Especially when performing non-Amplicon based library preps…- The less PCR, the better the results (higher Library Complexity)- Over-amplification can lead to poor performance

Six Things to consider beforehandReviewing the NGS Library Prep Workflow

For Research Use Only. Not for use in diagnostic procedures.PR7000-0197

Page 22: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

4. Higher yields aren’t necessarily better- Especially when performing non-Amplicon based library preps…- The less PCR, the better the results (higher Library Complexity)- Over-amplification can lead to poor performance

5. Set your expectations accordingly- Poor quality and very low input starting materials may require special

handling- More input required, Whole Genome Amplification- Results from high quality gDNA ≠ Results from FFPE gDNA

Six Things to consider beforehandReviewing the NGS Library Prep Workflow

For Research Use Only. Not for use in diagnostic procedures.PR7000-0197

Page 23: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

4. Higher yields aren’t necessarily better- Especially when performing non-Amplicon based library preps…- The less PCR, the better the results (higher Library Complexity)- Over-amplification can lead to poor performance

5. Set your expectations accordingly- Poor quality and very low input starting materials may require special

handling- More input required, Whole Genome Amplification- Results from high quality gDNA ≠ Results from FFPE gDNA

6. Don’t be afraid to ask for help!- While sequencing costs have come down, it’s still not cheap!- Reach out to your sequencing cores, other labs, or vendors for guidance

Six Things to consider beforehandReviewing the NGS Library Prep Workflow

For Research Use Only. Not for use in diagnostic procedures.PR7000-0197

Page 24: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Topics for Today’s Presentation

The NGS Library Prep Workflow 2

1

3 Whole Genome vs Targeted NGS

4

What is Next-Gen Sequencing?

NGS Analysis

5 Reviewing NGS Terminology

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 25: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Overview of the NGS WorkflowCan be a 2 step or 3 step process…

Library Prep

Sequencing

Library Prep

Target Enrichment(subset of the initial library)

Sequencing

Whole GenomeWhole Transcriptome

All ExonsSmaller # of Genes/Regions

(i.e. Sequencing Panels)

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 26: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

So you’ve made a library….now what?

Sequence It! Perform Target Enrichment

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 27: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Target Enrichment:It’s just like fishing… more like precision fishing

Why perform target enrichment?

1. Sequence only your desired regions of interest (Exons, gene panels, intergenic regions, etc.)!

2. Sequence more samples per lane/run (i.e. Multiplex)

3. Save time and money

4. Faster time to results = Smaller datasets

5. Identify variants in samples with increased reliability and accuracy: More Reads in regions of interest = Higher Depth of Coverage

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 28: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

General Methods of Target Enrichment: What is the basic concept?1. Pull out the genes/regions of interest that you care about sequencing

A. Capture the regions using biotinylated baits: In-solution hybrid capture

B. Use primers to selectively amplify the genes/regions you want to sequence: Amplicon sequencing

2. Regions that are captured/amplified from initial library (i.e. pre-capture library) undergo additional amplification and processing creating a post-capture library

3. Off to sequencing!

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 29: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Agilent NGS Solutions to Meet Your Needs & Specimen Type

MASTR

HaloPlexHS

SureSelect

Key Applications Advantages• Inherited cancer targeting• Somatic cancer targeting• Inherited disease targeting

• Easy to use• Robust catalog designs• Fastest workflow

• MRD (hematological samples)• Inherited cancer• Somatic cancer• Inherited disease

• Molecular barcodes for highestsensitivity for low allelic frequency detection

• Fully customizable

• Size 0.1 - 35 Kb

• Size 6 Kb - 2.5 Mb

• Complex cancer profiling • Exome sequencing• Database building

• Profiling complex events (CNV, TRX, SNVs)

• Market leader for large panels• Fully customizable • Size 100 Kb and up

For Research Use Only. Not for use in diagnostic procedures.

PR7000-1944

PR7000-2015

Page 30: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Topics for Today’s Presentation

The NGS Library Prep Workflow 2

1

3 Whole Genome vs Targeted NGS

4

What is Next-Gen Sequencing?

NGS Analysis

5 Reviewing NGS Terminology

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 31: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

NGS Analysis: What happens after sequencing?

For Research Use Only. Not for use in diagnostic procedures.

Sequencer

Control SoftwareDe-multiplex

FASTQs“Raw” Data Files

Primary

FASTQs(Reads + Quality)

Trimming & Tools

BAM/SAM FilesReads aligned to genome

Secondary

BAM/SAM Files

GeneSpring SureCall,Partek, Open source tools

Mutations/Variants Identified (VCF File)

Tertiary

PR7000-2015

Page 32: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Primary Analysis Overview• Sole responsibility of the sequencing platform vendor

• Converts physical signals to base calls as well as quality score

• Quality score: measure of confidence in the base call

FASTQ file

For Research Use Only. Not for use in diagnostic procedures.

SequencerData File (Reads + Quality)

Phred quality scores are logarithmically linked to error probabilities

Phred Quality Score Probability of Incorrect Base Call Base Call Accuracy

10 1 in 10 90%

20 1 in 100 99%

30 1 in 1000 99.9%

40 1 in 10000 99.99%

50 1 in 100000 99.999%

PR7000-2015

Page 33: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Secondary AnalysisAlignment

• Takes the sequences and quality scores from primary analysis and matches the reference sequence to return location of read(s)

• Most resource-intensive step of NGS analysis—requiring RAM, CPU, and disk

• De facto standard output file format now SAM/BAM (BAM is binary)

• Freeware or commercial products

SAM/BAM file FASTQ file

For Research Use Only. Not for use in diagnostic procedures.

Data File(Reads + Quality)

Reads aligned to genome

PR7000-2015

Page 34: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Tertiary Analysis

• Tertiary analysis starts with aligned data (SAM or BAM format)

• Analysis diverges depending on NGS data analysis type: ChIP-Seq, Methyl-Seq, whole Genome sequencing, amplicon sequencing, RNA-Seq, small RNA-Seq, etc.

• Freeware and commercial software

SAM/BAM file

SNPs, indelsStructural rearrangementsGene Expression profilesNovel Genes/exons/transcriptsTranscription Factor binding sitesMethylated regions/basesEtc.

For Research Use Only. Not for use in diagnostic procedures.

Reads aligned to genome

PR7000-2015

Page 35: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Variant Call File (one output file format)• Tab-delimited format with each data line including the following fields:

Chromosome, position, variant id, reference/alternative alleles, quality,information (read depth), event, sample Id (optional), format (optional)

• May be shown as a Vcard (electronic business card) file in Windows OS

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 36: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Copy Number Variation (CNV)

• CNV is a type of aberration found throughout the human genome• Duplication or Deletion

• CNVs can range in size (up to a full chromosome) and many are linked to diseases• Down syndrome (Duplication of chromosome 21)

• With advances in NGS analysis algorithms it is now possible to make CNV calls more accurately from sequencing data

• Historically/currently done by karyotyping, FISH, CGH arrays

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 37: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Topics for Today’s Presentation

The NGS Library Prep Workflow 2

1

3 Whole Genome vs Targeted NGS

4

What is Next-Gen Sequencing?

NGS Analysis

5 Reviewing NGS Terminology

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 38: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Understanding Reads…Types of Reads, Lengths of Reads, Depth of Reads• Single-End Reads: Provide sequence from one end of a DNA insert

• Paired-End Reads: Provide sequence from both ends of a DNA insert.- Provides improved alignment of sequencing data- Better detection of chromosomal rearrangements: insertions/deletions/translocations and fusions.

• Mate Pair Reads: Similar to paired-end but both reads come from a single strand of the DNA insert and the distance between the reads is often much greater.

Figure Adapted from: GeneSpring NGS User Manual

For Research Use Only. Not for use in diagnostic procedures.

DNA Insert of Unknown Sequence

Figure Adapted from: tucf-genomics.tufts.edu

Flow CellBinding Region

Read Primer 1 Index Primer Flow CellBinding Region

Read Primer 2

PR7000-2015

Page 39: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Understanding Reads…Types of Reads, Lengths of Reads, Depth of Reads

• Short reads – Illumina• <100bp (1 x 36bp, 1 x 50bp, 2 x 25bp, 2 x 75bp, 2 x 50bp)

• Medium reads – Illumina, Ion Torrent/Proton• >100bp but <1,000bp (ex. 2 x 75bp, 1 x 200bp, 2 x 100bp, 2 x 125bp,

2 x 150bp, 1 x 300bp, 2 x 250bp, 2x 300bp)

• Long Reads – Pacific Biosciences (PacBio). Oxford Nanopore• >1,000bp (ex. 1 x 1,000bp, Avg >10,000bp (Range 1-60,000bp))

Read lengths vary across sequencing platforms:

Fragment: GCCATATACGCTCATGGGTTAAAGCTACAGGATACAGAT

Read: GCCATATACGC

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 40: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Understanding Reads…Types of Reads, Lengths of Reads, Depth of Reads

ReferenceGenome

AlignedSequencing

Reads

7x Depth 10x Depth 12x Depth

Figure Adapted from Ambry Genetics

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 41: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

A Beginner’s Next-Gen Glossary:Walk the walk, talk the talk

1. Library Preparation (Library Prep) – The method(s) used to prepare DNA/RNA for next-generation sequencing.

2. Sequencing Library (Library) – A collection of DNA or cDNA fragments of a given size range with adapters ligated to each end that can be run through a sequencer. Libraries can be DNA or cDNA (cDNA libraries prepared when performing RNA-seq).

• Pre-Capture Library – Sequencing library that is created before that library undergoes some form of target-enrichment. • Post-Capture Library – Sequencing library after it has completed some form of target-enrichment.

3. Library Complexity – The number of unique DNA fragments contained in a sequencing library.

4. Target Enrichment (Capture) – Methods to allow one to isolate and/or increase the frequency of specific genes or other regions of interest from a DNA or cDNA library prior to being sequenced. The regions of interest are retained for sequencing and the remaining material is washed away.

5. Baits – Common name given to the oligonucelotide sequences (i.e. probes) that are responsible for identifying and binding to a given region of interest for performing target-enrichment.

6. In-Solution Capture – A method of performing target enrichment that requires samples to be hybridized to baits to select and enrich the sample for the desired regions of interest.

7. Amplicon Sequencing – A method of performing target enrichment that utilizes one or more pairs of PCR primers to increase the number of copies of the genes or other regions of interest that will ultimately be sequenced.

8. Gene Panels – Name frequently given to the selected regions of interest (this can genes or intergenic regions) that will be captured using some form of target-enrichment technology.

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 42: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

A Beginner’s Next-Gen Glossary:Walk the walk, talk the talk

9. Adapters – Oligonucleotides of a known sequence that are ligated to each end of a DNA/cDNA fragment (i.e. insert). They provide the primer sites used for sequencing the insert.

10. Sample Index/Barcode - Short sequences typically 6-8 nucleotides that serve as a way to identify/label individual samples when they are sequenced together in a single sequencing lane/chip. Sample Indexes/Barcodes are typically located within the sequencing adapters.

11. Multiplexing – Mixing two or more different samples together so they can be sequenced in a single sequencing lane or chip. Samples that are to be combined, need to be barcoded/indexed prior to being mixed together.

12. Molecular Barcode – Short sequences typically 10-15 nucleotides that serve as a way to identify/group PCR products amplified from the same original fragment/template. These sequences are typically incorporated into the sequencing adapters and are added onto each unique fragment in the library before any PCR is performed. They help to reduce error rates in sequencing data allowing for more precise calling of low-frequency variants in a sample.

13. Read – Base pair information of a given length from a DNA or cDNA fragment contained in a sequencing library. Different sequencing platforms are capable of generating different read lengths.

14. Single End Read – The sequence of the DNA is obtained from the 5’ end of only one strand of the insert. These reads are typically expressed as 1x “y”, where “y” is the length of the read in base pairs (ex. 1x50bp, 1x75bp).

15. Paired End Read – The sequence of the DNA is obtained from the 5’ ends of both strand of the insert. These reads are typically expressed as 2x “y”, where “y” is the length of the read in base pairs (ex. 2x100bp, 2x150bp).

16. Mate Pair Read – The sequence of the DNA is obtained similar to paired-end reads, however the size of the DNA insert is often much greater in size (2-10kb in length) and the paired reads originate from a single strand of the DNA insert.

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 43: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

17. Depth of Coverage – The number of reads that spans a given DNA sequence of interest. This is commonly expressed in terms of “Yx” where “Y” is the number of reads and “x” is the unit reflecting the depth of coverage metric (5x, 10x, 20x, 100x).

18. Sequencing Depth – The amount of sequencing a given sample requires to achieve a certain depth of coverage. This is frequently expressed as the number of reads a sample requires (ex. 40 million reads, 80 million reads) or the number of bases of sequencing a sample requires (ex. 4 gigabases, 100 megabases).

19. Electropherogram – A graphical representation of the size and quantity of a DNA or RNA sample run through a BioAnalyzer, TapeStation or other instrument used for performing quality control.

20. FFPE DNA/RNA – Formalin Fixed Paraffin Embedded DNA or RNA. When attempting to prepare sequencing libraries from these sample types, modifications are often required to standard library preparation protocols to accommodate the level of DNA/RNA degradation commonly found from samples stored using this technique.

21. Call - Referring to the identification of a given aberration detected in the sequenced sample when compared to the reference/normal genome.

22. SNP/SNV – Referring to a Single Nucleotide Polymorphism or Single Nucleotide Variant detected in a sample.

23. CNVs – Referring to Copy Number Variation that is detected in sample.

24. InDels – One or more Insertion or Deletion event that is detected in a sample.

A Beginner’s Next-Gen Glossary:Walk the walk, talk the talk

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 44: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Topics Covered in Today’s Presentation

The NGS Library Prep Workflow 2

1

3 Whole Genome vs Targeted NGS

4

What is Next-Gen Sequencing?

NGS Analysis

5 Reviewing NGS Terminology

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015

Page 45: NGS 101 in Pathology: Technology and Terminology€¦ · NGS Technology Simplified Presented By: Brigette Kipphut Field Application Scientist. Agilent Genomics. PR7000-2015 • NGS

Questions?

Contact Us

[email protected]

www.agilent.com/genomics

For Research Use Only. Not for use in diagnostic procedures.PR7000-2015