NGS: The Disruption of Genomics and Medicine
Elaine R. Mardis, Ph.D.
Co-director, The Genome Institute
Robert E. and Louise F. Dunn Distinguished Professor of Medicine
9th Annual Texas Conference on Health Disparities
DNA Sequencing: Basic Principles
• DNA sequencing has a brief history (<40 yrs)
• The scientific discipline known as “Genomics” has DNA sequencing as its core technology
• What used to take years now takes weeks (or less)
• Computational methods and computing technology underlie nearly every aspect of modern genomic approaches
• Genomics is most impactful if practiced in a team science environment
DNA Sequencing c. 1985: radiolabeling
Klenow DNA polymerase
32P-dATP + dC,dG,dTTP
6 M13 templates
Universal primer
One hand-poured urea/polyacrylamide gel
Power supply
Fan
X-ray film
Yield: 6 x 500 bp/ ??
DNA Sequencing c. 1987: fluorescent labeling
6 M13 templates
USB T7 Polymerase kit
Custom mixes (no 32P)
Hand-poured urea/PA gel
Yield: 6 x 500 bp/24 hr
$100K/instrument
1990-1998: C. elegans Genome Sequence
• Most sequence data for the worm were generated after 1994
• Development of methods, technology, algorithms and infrastructure
• First complete animal genome sequenced (100 Mb)
ABI 3730xl Sequencer
• Each 3730xl sequencer cost $350K
• One 3730xl processed 20 x 96-well plates per 24-hour day and provided ~650 bp/sample (~1.25 Mb/day)
• By 2005, we had 135 of these sequencers and produced ~10M sequencing reactions/month (~650 Mb)
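The per-instrument figure above can be checked with a short calculation. This is only a sketch of the arithmetic using the numbers on the slide (20 plates, 96 wells, ~650 bp per sample); the function and constant names are illustrative.

```python
# Per-instrument daily yield for a capillary sequencer, using the slide's figures.
PLATES_PER_DAY = 20      # 96-well plates processed per 24-hour day
WELLS_PER_PLATE = 96     # one sequencing reaction per well
READ_LENGTH_BP = 650     # approximate bases obtained per sample


def daily_yield_bp(plates: int, wells: int, read_length_bp: int) -> int:
    """Return the number of bases one instrument produces per day."""
    return plates * wells * read_length_bp


yield_bp = daily_yield_bp(PLATES_PER_DAY, WELLS_PER_PLATE, READ_LENGTH_BP)
print(f"{yield_bp:,} bp/day (~{yield_bp / 1e6:.2f} Mb/day)")
```

The result, 1,248,000 bp/day, rounds to the ~1.25 Mb/day quoted for one 3730xl.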
The Trajectory of Throughput: 10 years
E.R. Mardis, Nature (2011) E.R. Mardis, Ann. Rev. Analyt. Chem. (2013)
Capillary technology
Applied Biosystems 3730xl (2004)
$15,000,000, ~5 years, international project
Next-gen technology
Illumina HiSeq 2500 (2014)
$7,000, ~28 hours, one instrument
Whole human genome sequencing: 10 years
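The size of the change between the two technologies can be expressed as fold differences. The sketch below uses only the cost and time figures quoted above; converting "~5 years" to hours with 365-day years is an assumption made for illustration.

```python
# Fold change in cost and turnaround for a whole human genome,
# capillary (2004) versus next-gen (2014), from the figures quoted above.
CAPILLARY_COST_USD = 15_000_000
CAPILLARY_HOURS = 5 * 365 * 24   # "~5 years"; assumes 365-day years
NEXTGEN_COST_USD = 7_000
NEXTGEN_HOURS = 28

cost_fold = CAPILLARY_COST_USD / NEXTGEN_COST_USD
time_fold = CAPILLARY_HOURS / NEXTGEN_HOURS

print(f"Cost reduced roughly {cost_fold:,.0f}-fold")
print(f"Time reduced roughly {time_fold:,.0f}-fold")
```

Under these assumptions both cost and elapsed time fall by roughly three orders of magnitude.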
What has the 1000 Genomes Project produced?
• Essentially all SNPs (MAF >1%) in multiple population groups from a worldwide collection of blood samples
• Many variants within 0.2-1% MAF
• Highly complete catalogue of CNVs
• Information required for imputation of lower-frequency alleles into existing GWAS samples
• A set of validated computational methods for use of next-generation sequencing in disease samples
• Important reference genomes for clinical sequencing
Next-generation DNA sequencing instruments
• All NGS platforms require a library obtained either by amplification or ligation with custom linkers (adapters)
• Each library fragment is amplified on a solid surface (either a bead or a flat Si-derived surface) bearing covalently attached adapters that hybridize to the library adapters
• Direct step-by-step detection of the nucleotide base incorporated by each amplified library fragment set
• Hundreds of thousands to hundreds of millions of reactions detected per instrument run = “massively parallel sequencing”
• A “digital” read type that enables direct quantitative comparisons
• Shorter read lengths than capillary sequencers
Library Construction for NGS
• Shear high molecular weight DNA with sonication
• Enzymatic treatments to blunt ends
• Ligate synthetic DNA adapters
• Produce size fractions
• Quantitate
• Reduced input amount and time to library vs. conventional sequencing methods
• Read lengths much shorter than conventional sequencers
Hybrid Capture
• Hybrid capture - fragments from a whole genome library are selected by combining with probes that correspond to most (not all) human exons or gene targets.
• The probe DNAs are biotinylated, making selection from solution with streptavidin magnetic beads an effective means of purification.
• An “exome,” by definition, is the set of exons of all genes annotated in the reference genome.
• Custom capture reagents can be synthesized to target specific loci that may be of clinical interest.
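The selection step in hybrid capture has a simple computational analogue: keep the library fragments that overlap a probe target. The sketch below is an in-silico illustration only, not a model of the wet-lab chemistry; the coordinates, the half-open interval convention, and the function names are all hypothetical.

```python
from typing import List, Tuple

# (chromosome, start, end) with half-open coordinates: start <= x < end
Interval = Tuple[str, int, int]


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two intervals share at least one base on the same chromosome."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def capture(fragments: List[Interval], probes: List[Interval]) -> List[Interval]:
    """Return the fragments that overlap any probe target."""
    return [f for f in fragments if any(overlaps(f, p) for p in probes)]


# Hypothetical probe targets and library fragments.
probes = [("chr7", 1000, 1200), ("chr13", 5000, 5150)]
fragments = [
    ("chr7", 900, 1100),     # overlaps the chr7 probe
    ("chr7", 1300, 1500),    # off target
    ("chr13", 5100, 5400),   # overlaps the chr13 probe
    ("chr1", 1000, 1200),    # no probe on this chromosome
]

captured = capture(fragments, probes)
print(captured)
```

The linear scan over probes is fine for a toy example; a real target set would call for an interval index.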
Multiplex PCR Amplification of Targets
1. Design amplification primer pairs for exons of genes of interest; tile primers to overlap fragments in larger exons
2. Group primer pairs according to G+C content, Tm and reaction condition specifics
3. Amplify genomic DNA to generate multiple products from each primer set; pool products from each set
4. Create the library by ligating platform adapters, or by tailing them onto the primer ends
5. Sequence
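Step 2 above (grouping primer pairs by G+C content and Tm) can be sketched in a few lines. This is a minimal illustration, not the design procedure used for the slide: it substitutes the simple Wallace rule, Tm = 2(A+T) + 4(G+C), which is only a rough estimate for short oligos, and the primer sequences, pair names, and 8-degree bin width are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def gc_fraction(seq: str) -> float:
    """Fraction of bases in the primer that are G or C."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)


def wallace_tm(seq: str) -> int:
    """Rough melting temperature (deg C) by the Wallace rule: 2(A+T) + 4(G+C)."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc


def group_pairs(pairs: List[Tuple[str, str, str]], bin_width: int = 8) -> Dict[int, List[str]]:
    """Group primer pairs into pools by the mean Tm of forward and reverse primers."""
    groups: Dict[int, List[str]] = defaultdict(list)
    for name, fwd, rev in pairs:
        mean_tm = (wallace_tm(fwd) + wallace_tm(rev)) / 2
        groups[int(mean_tm // bin_width)].append(name)
    return dict(groups)


# Hypothetical primer pairs: (name, forward primer, reverse primer).
pairs = [
    ("exon1", "ATGCATGCATGCATGCAT", "TACGTACGTACGTACGTA"),
    ("exon2", "GCGCGCATGCGCGCATGC", "CGCGGCATCGCGGCATCG"),
    ("exon3", "ATATGCATATGCATATGC", "TATACGTATACGTATACG"),
]

groups = group_pairs(pairs)
print(groups)
```

Pairs that land in the same bin would be amplified together in one multiplex reaction; a production design would use a nearest-neighbor Tm model and also check for primer-primer interactions.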
Illumina: Massively Parallel Sequencing by Synthesis
[Figure: sequencing-by-synthesis cycle with the labels Incorporate, Excitation, Emission, Detect, Cleave fluor, De-block]
Platforms: Illumina
• High accuracy, range of capacity and throughput
• Longer read lengths on some platforms (MiSeq)
• Improved kits, improved software pipeline and capabilities, cloud compute
Platforms: Ion Torrent
• Low substitution error rate, indels problematic, no paired-end reads
• Inexpensive and fast turnaround for data production
• Improved computational workflows for analysis
PGM
• Three sequencing chips available: 314 (up to 100 Mb), 316 (up to 1 Gb), 318 (up to 2 Gb)
• 2-7 hours/run
• Up to 400 bp read length
• 400 k reads up to 5 M reads
Proton
• Two human exomes (Proton 1 chip) or one genome (at 20X, Proton 2 chip) per run
• Ion One Touch or Ion Chef preparatory modules
• 2-4 hours/run
• ~200 bp average read length
• Proton 1 produces 60-80 M reads >50 bp
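Figures such as "one genome at 20X" relate read count, read length, and target size through mean coverage, C = N x L / G. The sketch below applies that relation to the Proton numbers quoted above; the 3.1 Gb human genome size is an assumption, and treating every read as the ~200 bp average makes the yield an approximation.

```python
import math


def mean_coverage(n_reads: int, read_length_bp: int, target_size_bp: int) -> float:
    """Mean depth of coverage: C = N * L / G."""
    return n_reads * read_length_bp / target_size_bp


def reads_for_coverage(coverage: float, read_length_bp: int, target_size_bp: int) -> int:
    """Number of reads needed to reach a given mean coverage."""
    return math.ceil(coverage * target_size_bp / read_length_bp)


HUMAN_GENOME_BP = 3_100_000_000   # assumption: approximate human genome size
READ_LENGTH_BP = 200              # ~200 bp average read length from the slide

# Reads needed for a 20X genome at ~200 bp per read.
reads_needed = reads_for_coverage(20, READ_LENGTH_BP, HUMAN_GENOME_BP)
print(f"{reads_needed:,} reads for 20X")

# Approximate yield of a Proton 1 run at 60-80 million reads.
proton1_yield_gb = [n * READ_LENGTH_BP / 1e9 for n in (60_000_000, 80_000_000)]
print(f"Proton 1 yield: {proton1_yield_gb[0]:.0f}-{proton1_yield_gb[1]:.0f} Gb")
```

Mean coverage says nothing about evenness; capture efficiency and duplicate reads reduce the usable depth in practice.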
Cancer is a Disease of the Genome
In the early 1970s, Janet Rowley’s microscopy studies of leukemia cell chromosomes suggested that specific alterations led to cancer, laying the foundation for cancer genomics.
Cancer genomics
[Figure: EGFR domain diagram with labels for EGF ligand binding, TM, tyrosine kinase, and autophos regions; motifs GXGXXG, K, DFG; residues 718, 745, 776, 835, 858, 869, 947, 964]
~80% of Iressa responders have EGFR mutations
W. Pao et al., PNAS 2004
Cancer Genomics
R.K.Wilson 2011
AML1: First tumor:normal comparison by NGS
Ley et al., Nature 2008
• Caucasian female, mid-50s at diagnosis
• De novo M1 AML
• Family history of AML and lymphoma
• 100% blasts in initial BM sample
• Relapsed and died at 23 months
• Informed consent for whole genome sequencing
• Solexa sequencer, 32 bp unpaired reads
Funding: A. Siteman
TGI: Cancer Cases by WGS (March 2014)
WGS: 2,700 from ~1,100 cancer patients
Exomes: 6,200 from ~3,000 cancer patients
WU-SJ PCGP: 750 patients
WUMS Genomics Tumor Board
• The Genomics Tumor Board (GTB) serves as a vehicle for education, decision-making, and patient monitoring
• Physicians work with junior faculty to develop and present case reports of each patient’s clinical history
• Oversight board of GTB reviews cases and determines 1-2 per month that are most likely to benefit from genomic diagnosis (difficult diagnoses, late stage metastatic patients)
• Appropriate tumor and normal samples are obtained and studied by genome, exome and transcriptome sequencing and analysis
• Results of genomic diagnosis are communicated to the physician lead, then to GTB participants
• Physician lead presents their decision to treat, outcomes if available, difficulties encountered
• Costs for the GTB work will be absorbed by the Division of Oncology and The Genome Institute
Integrated Discovery: DNA and RNA data
Whole genome sequencing (tumor + normal)
Exome sequencing (tumor + normal)
Whole transcriptome sequencing (tumor)
Linking Somatic Variants to Therapies
Obi Griffith, Ph.D. and Malachi Griffith, Ph.D.
2013 Wired UK Genius List
Somatic/Germline Cancer Events (DNA+RNA)
Drug Gene Interaction database
(>50 database sources)
Filtered (activating/drivers)
Candidate genes/pathways
Clinically actionable events
(aka “The Report”)
Functional annotation
DrugBank
TTD
clinicaltrials.gov
PharmGKB
TEND
Literature
dGene
TALC
Clinical prioritization and reporting
Malachi & Obi Griffith
Single Nucleotide Variants
Insertion/deletions
Structural Variants
Copy Number Variations
SV-predicted gene fusions
Differentially Expressed Genes
Differentially Expressed Isoforms
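The flow sketched in this diagram (events, then a driver filter, then a drug-gene lookup, then a report) can be illustrated with a toy example. This is not the Drug Gene Interaction database or its API: the lookup table is a hand-written stand-in holding only the two gene-drug pairs mentioned elsewhere in this talk (EGFR with gefitinib/Iressa, FLT3 with sunitinib), and the event records and `GENE_A`/`GENE_B` names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    gene: str
    kind: str         # e.g. "SNV", "indel", "overexpression"
    is_driver: bool   # outcome of the filtering step (activating/driver)


# Hypothetical stand-in for a drug-gene interaction resource.
DRUG_GENE_TABLE: Dict[str, List[str]] = {
    "EGFR": ["gefitinib"],
    "FLT3": ["sunitinib"],
}


def actionable_events(
    events: List[Event], table: Dict[str, List[str]]
) -> List[Tuple[str, str, List[str]]]:
    """Keep driver events whose gene has at least one known drug interaction."""
    report = []
    for ev in events:
        if not ev.is_driver:
            continue                      # filtered out as a likely passenger
        drugs = table.get(ev.gene, [])
        if drugs:
            report.append((ev.gene, ev.kind, drugs))
    return report


# Hypothetical events for one tumor.
events = [
    Event("EGFR", "SNV", True),
    Event("FLT3", "overexpression", True),
    Event("GENE_A", "SNV", False),       # passenger, dropped by the filter
    Event("GENE_B", "indel", True),      # driver with no drug in the table
]

report = actionable_events(events, DRUG_GENE_TABLE)
for gene, kind, drugs in report:
    print(f"{gene} ({kind}): {', '.join(drugs)}")
```

A real report would also carry evidence sources and clinical prioritization, which this sketch omits.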
Therapeutic Interpretation of Variants
MyCancerGenome
Expressed variants
• Highly curated database of mutations having a demonstrated association with cancer
• General information about each somatic variant: chromosomal location, strand, gene, protein impact of variant (annotation), and PubMed ID evidence (cited and linked)
• Easy to access from the web and programmatically through an API
DoCM: A Database of Canonical Cancer Mutations
Marston E, et al. Blood 2009 Jan 1;113(1):117-26.
FLT3 Over-expression in ALL1
• FLT3 was within the top 1% of all expressed genes.
• No normal comparator was available; the literature report from Marston et al. identified FLT3 over-expression in pre-B-ALL
• Based on over-expression of wild-type (wt) FLT3 by the tumor cells, we predicted the cancer would be sensitive to the FLT3 inhibitor sunitinib (Sutent) [DrugBank].
Patient biopsied metastatic melanoma lesions
Tumor and germline DNA sequenced, somatic mutations identified; RNA capture verifies expressed mutations and expression level; netMHC algorithm identifies immunoepitopes
Apheresis samples from patient used to verify the algorithmically-identified immunoepitopes that elicit T cell memory
Sequencing to identify tumor-specific immunoepitopes:
Mardis, Schreiber et al., Nature 2012
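Before an epitope predictor such as netMHC can be run, each expressed missense mutation has to be turned into candidate peptides that contain the altered residue. The sketch below shows that windowing step only and does not call netMHC; the 9-residue peptide length is an assumption (a common choice for MHC class I), and the protein sequence and mutation are hypothetical.

```python
from typing import List


def mutant_peptides(protein: str, pos: int, alt: str, k: int = 9) -> List[str]:
    """Return every k-mer of the mutant protein that spans the mutated residue.

    pos is the 1-based position of the substituted amino acid; alt is the
    new residue. Returns an empty list if the protein is shorter than k.
    """
    i = pos - 1                                   # convert to 0-based index
    mutant = protein[:i] + alt + protein[i + 1:]  # apply the substitution
    first_start = max(0, i - k + 1)               # leftmost window containing i
    last_start = min(i, len(mutant) - k)          # rightmost window containing i
    return [mutant[s:s + k] for s in range(first_start, last_start + 1)]


# Hypothetical 20-residue protein with a substitution to R at position 10.
protein = "ACDEFGHIKLMNPQRSTVWY"
peptides = mutant_peptides(protein, pos=10, alt="R")

for p in peptides:
    print(p)
```

Each peptide would then be scored for predicted MHC binding, and the matching wild-type peptides are usually scored alongside for comparison.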
Genome-driven cancer immunotherapy
Dendritic Cell Vaccine Platform
SNV-specific T-cell immunity: ex vivo monitoring to evaluate generation of cytotoxic effectors
DC generation: ex vivo differentiation of monocytes into DC
DC maturation: ex vivo instruction to generate appropriate cytotoxic effectors
Infusion of DC vaccine
DC targeting: loading with selected SNV-derived peptides
A dendritic cell-based approach is currently being tested in an FDA-approved protocol for metastatic melanoma patients:
• Patient 1 has received all three doses of vaccine and is being monitored
• Patient 2 has received three doses of vaccine; this patient has measurable disease and will be monitored for progression, stability or regression
• Patient 3 has measurable disease and completed her vaccine infusions in early March
• Patients 4 and 5 have genomic analysis completed, in vitro assays completed, and GMP peptides underway
Acknowledgements
WUSM/Siteman Cancer Center: Timothy J. Ley, M.D.; Lukas Wartman, M.D.; Peter Westervelt, M.D.; John DiPersio, M.D.; Gerry Linette, M.D.; Beatriz Carreno, M.D., Ph.D.
Thanks also to: Aaron Quinlan; Gabor Marth; Michael Zody
Our patients and their families
The Genome Institute: Malachi Griffith, Ph.D.; Obi Griffith, Ph.D.; Ben Ainscough; Zach Skidmore; Avinash Ramu; Allison Regier; Lee Trani; Nick Spies; Vincent Magrini, Ph.D.; Sean McGrath; Ryan Demeter; Jasreet Hundal, M.S.; Jason Walker; David Larson, Ph.D.; Lucinda Fulton; Robert Fulton; Brad Ozenberger
Richard K. Wilson, Ph.D.