Aug2015 steve lincoln analytical validation
© 2015 Invitae Corporation. All Rights Reserved. | CONFIDENTIAL
Analytic Validation and Performance Monitoring of Clinical NGS Assays
Germline
A systematic comparison of traditional and multi-gene panel testing for hereditary breast and ovarian cancer in more than 1000 patients
Stephen E. Lincoln (1), Yuya Kobayashi (1), Michael J. Anderson (1), Shan Yang (1), Andrea J. Desmond (2), Meredith A. Mills (3), Geoffrey B. Nilsen (1), Kevin B. Jacobs (1), Federico A. Monzon (1), Allison W. Kurian (3), James M. Ford (3), Leif W. Ellisen (2,4)
1. Invitae, San Francisco, CA
2. Massachusetts General Hospital Cancer Center, Boston, MA
3. Stanford University School of Medicine, Stanford, CA
4. Harvard Medical School, Boston, MA
Lincoln et al., J Mol Diag 2015
Companion Clinical Actionability Research Study
Desmond et al., JAMA Oncol. 2015; Swisher, JAMA Oncol. 2015
WCGs Contribution to JMD Study
NA12878 and 6 other Well-Characterized Genomes (WCGs) were used in a 1105-sample study to evaluate a 29-gene hereditary cancer panel test
The 7 WCGs contributed 310 of 750 comparable variants to both the sensitivity and specificity analyses
But… the 77% coverage of GIAB data was a substantial limitation
– No exonic variants in 5 of 29 panel genes in any of 7 samples
  • Only 1 coding variant each in 2 other genes
  • Reason: (a) missing 23% of GIAB and (b) population genetics
– Almost all GIAB variants are SNVs
  • Only 6 of 310 were very small deletions (max 4bp)
  • 0 insertions, 0 other variant types
  • No GIAB CNV data yet (but we’d expect 0 CNVs in these 29 genes)
– The 77% is biased to the “easy” subset of the genome
Lincoln et al., J Mol Diag 2015; Lincoln GIAB Spring 2015
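As a rough illustration of why so few deletions limit what a validation can claim, the sketch below computes the one-sided exact (Clopper-Pearson) lower confidence bound on sensitivity for the case where every known variant in a class is detected. The counts come from the slide. The assumption that all variants in each class were detected is hypothetical and used only to make the arithmetic concrete, and the function name is ours.

```python
def lower_bound_all_detected(n: int, alpha: float = 0.05) -> float:
    """One-sided exact (Clopper-Pearson) lower bound on sensitivity
    when all n of n known variants are detected: solve p**n = alpha."""
    if n <= 0:
        raise ValueError("n must be positive")
    return alpha ** (1.0 / n)


# Counts taken from the slide.
comparable_variants = 750
wcg_variants = 310
wcg_deletions = 6
# The slide reports 0 insertions and 0 other variant types, so the rest are SNVs.
wcg_snvs = wcg_variants - wcg_deletions

print(f"WCG share of comparable variants: {wcg_variants / comparable_variants:.1%}")
print(f"Deletions among WCG variants:     {wcg_deletions / wcg_variants:.1%}")

# HYPOTHETICAL assumption: every variant in each class was detected.
for label, n in [("SNVs", wcg_snvs), ("deletions", wcg_deletions)]:
    bound = lower_bound_all_detected(n)
    print(f"{label}: n={n}, 95% lower bound on sensitivity = {bound:.3f}")
```

Under that assumption, 304 SNVs support a lower bound of about 0.99, while 6 deletions support only about 0.61, which is one way to quantify how little the WCGs say about indel sensitivity.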
A Significant Fraction of Pathogenic Variants in Clinical Cases are Technically Challenging
Pathogenic and likely pathogenic variants (n=260) among the clinical cases (n=1062) by variant type:
Lincoln et al., J Mol Diag 2015
[Figure: chart of variant types; surviving labels: "Small Indel", "i.e. CNVs"]
BRCA2 c.9203del126
[Figure: read alignments; labels: split-read signal at 5' end of deletion, split-read signal at 3' end of deletion, exon target]
BRCA2 c.156_insAlu
[Figure: read alignments; label: split-read signal of Alu sequence]
Get IGV
MSH2 c.943+3T>C
[Figure: read alignments; labels: Homopolymer-A, Alignment and Biochemical Artifacts]
CDKN2A c.9_32dup24
Insertion of repeat in correctly mapped NGS reads
[Figure: read alignments; labels: split-read signal (twice), Repeat Copy 1, Repeat Copy 2, Translation, 5' Met]
Idealized Workflow
David Litwack, FDA, Feb 2015
Idealized Workflow
[Workflow diagram]
Steps: DNA Prep → Targeting & Library → Sequencing → Bioinformatics → Interpretation and Reporting
Intermediates: Specimen → DNA → Library → FASTQ → VCF → Report
GIAB elements: GIAB Sample(s); GIAB Data Comparison
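The comparison step in this idealized workflow can be sketched as follows: restrict both the assay's calls and the GIAB calls to the GIAB high-confidence regions, then count matches and mismatches. This is a minimal sketch under stated assumptions, not the method used in the study. It matches variants exactly on (chrom, pos, ref, alt), ignores genotype and differences in variant representation, and uses made-up coordinates. All names (`Variant`, `in_regions`, `compare_calls`) are hypothetical.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Variant(NamedTuple):
    chrom: str
    pos: int   # 1-based position
    ref: str
    alt: str


# (chrom, start, end), 1-based and inclusive
Region = Tuple[str, int, int]


def in_regions(v: Variant, regions: Iterable[Region]) -> bool:
    """True if the variant position falls inside any region."""
    return any(c == v.chrom and s <= v.pos <= e for c, s, e in regions)


def compare_calls(calls: List[Variant], truth: List[Variant],
                  confident: List[Region]) -> Dict[str, int]:
    """Count TP/FN/FP inside confident regions; report calls outside them separately."""
    calls_in = {v for v in calls if in_regions(v, confident)}
    truth_in = {v for v in truth if in_regions(v, confident)}
    return {
        "TP": len(calls_in & truth_in),
        "FN": len(truth_in - calls_in),
        "FP": len(calls_in - truth_in),
        "not_assessable": len(set(calls) - calls_in),
    }


# Toy data with made-up coordinates, for illustration only.
confident = [("13", 1000, 2000)]
truth = [Variant("13", 1100, "A", "G"), Variant("13", 1500, "CT", "C")]
calls = [Variant("13", 1100, "A", "G"), Variant("13", 1800, "G", "T"),
         Variant("13", 5000, "T", "C")]
result = compare_calls(calls, truth, confident)
print(result)
```

Calls outside the confident regions are counted as not assessable rather than as errors, which is why incomplete coverage of the reference data limits the comparison.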
Idealized Workflow
Benefits:
1. Easy (in principle; software tools still need improvement)
2. GIAB gives much broader coverage compared to traditional reference samples
3. Virtually unlimited sample supply
Idealized Workflow
Challenges:
1. GIAB data are NOT representative of clinical practice
2. The VCF is NOT the report, and there are substantial differences even in genotypes
3. Does not evaluate specimen collection, storage, transfer, or DNA prep
   – By far our greatest source of problems
   – In fact, pre-prepared gDNA is not in spec for our Dx test
Idealized Workflow
A complex, multi-step process which can and does change analytic results
Actual Interpretation/Reporting Workflow: VCF Genotypes Are NOT the Reported Analytical Data
[Workflow diagram: VCF to Report; not a 1-1 process. Step labels as extracted, order not recoverable]
– More QC and Filtering
– Convert into Transcript Variants
– Lab Director Review of Raw Data
– Lab Director Interpretation
– Orthogonal Confirmation
– Re-“spelling” QC
– Confirmation Failures Removed
– Common Polymorphisms and Wild-type Calls Removed
– Benigns Removed
– Known Artifacts Removed
– Gap Filling: Variants may be Added
Many steps after the VCF and before the Dx report can and do add, remove, edit, or change genotypes before reporting
This happens in transcript-variant space (HGVS), which is not convertible back to VCF
Benign variants with no clinical significance do NOT get the same quality control as reported variants
Many of these steps involve human medical experts, not algorithms
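To make "not a 1-1 process" concrete, here is a toy sketch of how the set of reported variants can differ from the set of VCF calls once removals and gap filling are applied. It is an illustration only. The `Call` fields, the function names, and the example variants are all hypothetical, and the sketch leaves out the conversion to transcript variants and the human review steps entirely.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class Call:
    name: str                      # hypothetical variant label
    confirmed: bool = True         # passed orthogonal confirmation
    common: bool = False           # common polymorphism or wild-type call
    benign: bool = False
    known_artifact: bool = False


def to_report(vcf_calls: Iterable[Call], gap_filled: Iterable[Call]) -> List[Call]:
    """Apply the removal steps, then append variants added by gap filling."""
    kept = [c for c in vcf_calls
            if c.confirmed and not (c.common or c.benign or c.known_artifact)]
    return kept + list(gap_filled)


def diff(vcf_calls: Iterable[Call], reported: Iterable[Call]) -> Dict[str, List[str]]:
    """Names present in the VCF but not the report, and vice versa."""
    vcf_names = {c.name for c in vcf_calls}
    rep_names = {c.name for c in reported}
    return {"removed": sorted(vcf_names - rep_names),
            "added": sorted(rep_names - vcf_names)}


# Hypothetical example calls.
vcf_calls = [Call("v1"), Call("v2", benign=True),
             Call("v3", confirmed=False), Call("v4", known_artifact=True)]
gap_filled = [Call("v5")]
reported = to_report(vcf_calls, gap_filled)
changes = diff(vcf_calls, reported)
print(changes)
```

Any accuracy figure computed on the VCF alone would miss both the removed and the added entries, which is the point the slide makes about the report being a different data set.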
Challenges in NGS Validation with GIAB
1. 77% completeness of GIAB vs. hg19/build37 is a very substantial limitation
   – The other 23% includes all or part of many commonly tested genes
   – The 77% is biased toward easy stuff
   – Reminder: this 23/77 issue is different from the “dark matter” issue (regions not in the reference genome)
2. The very limited number of more challenging variants in coding regions of commonly tested genes is an even more substantial limitation
   – Indels, complex sequence changes, CNVs
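One way to check how the completeness limitation affects a particular panel is to compute what fraction of each gene's targeted bases falls inside the GIAB high-confidence regions. The sketch below does this with plain interval arithmetic. It assumes half-open (start, end) intervals on a single chromosome, and the function names and coordinates are hypothetical.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open: start inclusive, end exclusive


def merge(intervals: List[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(targets: List[Interval], confident: List[Interval]) -> float:
    """Fraction of target bases that lie inside the confident regions."""
    targets = merge(targets)
    confident = merge(confident)
    total = sum(end - start for start, end in targets)
    if total == 0:
        return 0.0
    covered = sum(max(0, min(t_end, c_end) - max(t_start, c_start))
                  for t_start, t_end in targets
                  for c_start, c_end in confident)
    return covered / total


# Made-up exon targets and confident regions, for illustration only.
exons = [(100, 200), (300, 400)]
confident_regions = [(150, 350)]
fraction = covered_fraction(exons, confident_regions)
print(f"{fraction:.0%} of targeted bases are in confident regions")
```

Running this per gene would show which panel genes fall wholly or partly in the uncovered 23%, rather than relying on the genome-wide figure.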
Other Challenges in NGS Validation with GIAB
3. The actual references used in (most) reporting are transcripts, not the genome sequence
   – They can be different in significant ways
   – We use our own curated set of transcript sequences and alignments (Hart et al., Bioinformatics 2014)
4. Sample type (pre-prepared gDNA) is not in spec for our validated assay
Philosophical Digression
What is the purpose of analytic validation of an NGS germline assay?
1. Essentially impossible to capture real world variability and challenges
2. In practice, online QC/QA plays a much more important role than validation