Using Public Access Clinical Databases to Interpret NGS Variants
February 19, 2014
Gabe Rudy, VP Product Development
Golden Helix
Questions during the presentation
Use the Questions pane in your GoToWebinar window
My Background
Golden Helix
- Founded in 1998
- Genetic association software
- Analytic services
- Hundreds of users worldwide
- Over 800 customer citations in scientific journals
Products I Build with My Team
- SNP & Variation Suite (SVS)
  - SNP, CNV, NGS tertiary analysis
  - Import and deal with all flavors of upstream data
- GenomeBrowse
  - Visualization of everything with genomic coordinates. All standardized file formats.
- RNA-Seq Pipeline
  - Expression profiling bioinformatics
Agenda
1. How I Met My Exomes
2. Getting High Quality Variant Calls
3. Clinical Grade Candidate Variant Identification
4. Data Sharing and the Maturing of Public Resources
5. NGS Clinical Utopia: Are We There Yet?
Exome Sequencing in Consumer Genomics
Exomes done as part of Pilot Program
80x coverage
Raw data with no interpretation
ErinJIA
Gabe (me)
Ethan
Research or clinical grade?
Total Reads 140M
Unique Align 87%
Mean Target 105x
% Target at 2x 97%
% Target at 10x 94%
% Target at 20x 89%
% Target at 30x 83%
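The "% Target at Nx" rows above are the share of targeted bases covered at or above a given depth. A minimal sketch of that calculation, assuming the per-base depths over the target are already available as a list of integers; the function name and the toy depths are hypothetical and are not the data from the talk:

```python
def percent_at_depth(depths, thresholds=(2, 10, 20, 30)):
    """Return {threshold: percent of target bases with depth >= threshold}."""
    if not depths:
        raise ValueError("depths must not be empty")
    total = len(depths)
    return {
        t: 100.0 * sum(1 for d in depths if d >= t) / total
        for t in thresholds
    }


# Toy per-base depths for a 10-base target (illustrative only).
toy_depths = [0, 3, 12, 25, 40, 105, 88, 31, 9, 22]
summary = percent_at_depth(toy_depths)
mean_depth = sum(toy_depths) / len(toy_depths)

for threshold, pct in summary.items():
    print(f"% Target at {threshold}x: {pct:.0f}%")
print(f"Mean Target: {mean_depth:.1f}x")
```

A high mean depth can hide poorly covered regions, which is why the table reports several thresholds rather than the mean alone.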
Agenda: 2. Getting High Quality Variant Calls
Alignment and Variant Calling Broken Down
2012: 2 VCFs from 23andMe
- BWA 0.6.1
- GATK (early & late 2012)
2013: Real Time Genomics
- v3.1.2 2013-05-02
- Called on Trio
2014: Rerun
- BWA 0.7.6 (2014-01-31)
- FreeBayes
2014 - BWA/FreeBayes
PSPH mis-alignment
Splice Mutation
GRCh38 – Here Now, but still Waiting
A better human reference
- Revised Cambridge Reference Sequence (rCRS) MT
- Has centromere models
- ~2000 incorrect alleles fixed
- ~100 assembly gaps updated
No Gene Annotations
- RefSeqGene - Feb 2014
- Ensembl Q4 2014
No Variant Annotations
- Re-align 1000 Genomes and NHLBI 6500?
- dbSNP?
Ts/Tv: GRCh37 2.06558, GRCh38 2.10171
[Figure: My Exome. Stacked bar chart of variant counts by type (snps, mnps, indels, complex) for GRCh37 and GRCh38; y-axis from 270,000 to 340,000; bar totals labeled 331,824 and 319,442]
Blog Post
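Ts/Tv is the ratio of transitions (A<->G, C<->T) to transversions (all other single-base changes) among the called SNVs, and it is often used as a rough sanity check on a call set. A minimal sketch of the calculation, assuming the calls are available as (ref, alt) pairs; the function name and the toy calls are hypothetical:

```python
# Transitions swap purine for purine (A<->G) or pyrimidine for pyrimidine (C<->T).
TRANSITIONS = {("A", "G"), ("G", "A"), ("C", "T"), ("T", "C")}


def ts_tv_ratio(calls):
    """Compute Ts/Tv over single-base substitutions in (ref, alt) pairs."""
    ts = tv = 0
    for ref, alt in calls:
        ref, alt = ref.upper(), alt.upper()
        if len(ref) != 1 or len(alt) != 1 or ref == alt:
            continue  # skip indels, MNPs, and non-variant records
        if (ref, alt) in TRANSITIONS:
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions found; ratio is undefined")
    return ts / tv


# Toy call set: 4 transitions, 2 transversions, 1 deletion (ignored).
toy_calls = [("A", "G"), ("C", "T"), ("G", "A"), ("T", "C"),
             ("A", "C"), ("G", "T"), ("AT", "A")]
ratio = ts_tv_ratio(toy_calls)
print(f"Ts/Tv = {ratio:.2f}")
```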
Agenda: 3. Clinical Grade Candidate Variant Identification
Baylor Workflow - Clinical Exomes Paper
Disease gene related
Medically actionable deleterious variants
Deleterious variants in ACMG gene list
Deleterious variants
VUS in dominant gene or homozygous in recessive gene
Deleterious variant in gene with no known disease
Data Sources to Replicate Workflow
1000 Genomes (Phase 1)
“ESP” (NHLBI 6500 Exomes v2)
HGMD (Public vs Professional)
Variant’s Protein Coding Effect
RNA Splicing Effect
Gene Lists:
- Single-Gene Disorder (OMIM with Inheritance)
- Medically Actionable (114 genes NHLBI study)
- Dominant Inheritance (MedGen)
- ACMG Carrier Panel (ACMG Incidental Findings guidelines)
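One way to picture how the workflow categories and the gene lists fit together is a function that assigns each variant to the first category it matches. The sketch below is an illustration only and is not the Baylor pipeline: the field names (`gene`, `deleterious`, `zygosity`), the order of the checks, and the placeholder gene names are all assumptions made for the example.

```python
def categorize(variant, gene_lists):
    """Assign a variant to the first matching report category, or None."""
    gene = variant["gene"]
    if variant["deleterious"]:
        if gene in gene_lists["medically_actionable"]:
            return "Medically actionable deleterious variant"
        if gene in gene_lists["acmg"]:
            return "Deleterious variant in ACMG gene list"
        if gene in gene_lists["disease"]:
            return "Deleterious variant"
        return "Deleterious variant in gene with no known disease"
    # Everything not called deleterious is treated here as a VUS (assumption).
    if gene in gene_lists["dominant"]:
        return "VUS in dominant gene"
    if gene in gene_lists["recessive"] and variant["zygosity"] == "hom":
        return "VUS homozygous in recessive gene"
    return None


# Placeholder gene names, not real list contents.
toy_lists = {
    "medically_actionable": {"GENE_A"},
    "acmg": {"GENE_B"},
    "disease": {"GENE_A", "GENE_B", "GENE_C", "GENE_D", "GENE_E"},
    "dominant": {"GENE_D"},
    "recessive": {"GENE_E"},
}
toy_variants = [
    {"gene": "GENE_A", "deleterious": True, "zygosity": "het"},
    {"gene": "GENE_E", "deleterious": False, "zygosity": "hom"},
    {"gene": "GENE_E", "deleterious": False, "zygosity": "het"},
    {"gene": "GENE_X", "deleterious": True, "zygosity": "het"},
]
categories = [categorize(v, toy_lists) for v in toy_variants]
for v, c in zip(toy_variants, categories):
    print(v["gene"], "->", c)
```

In practice the population frequency sources (1000 Genomes, ESP) would be applied as a filter before a step like this; that is left out to keep the sketch short.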
My Exome Analyzed
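[Figure: variant-filtering flowchart. Start: 235,689. Node counts as extracted, in order: 847; 234,842; 224,914; 9,928; 9,069; 807; 859; 40; 242; 13; 59; 565; 0; 624; 624; 255; 20; 20; 20; 0; 0; 598; 644]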
Pathogenic by RSID match
Agenda: 4. Data Sharing and the Maturing of Public Resources
Applications of NGS Data in the Clinic
Carrier screening – prenatal and standard
Lifetime risk prediction
Genetic disorder diagnostics
Oncology care
PGx – dosage and care
ClinVar
Submitters:
- OMIM: Johns Hopkins
- Samuels
- Lab for Molecular Medicine
- Invitae
- Emory Genetics Lab
Star rating system
- 0-4 stars – level of review
ClinVar is designed to provide a freely accessible, public archive of reports of the relationships among human variations and phenotypes, with supporting evidence.
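Since the star rating reflects level of review, a simple way to use it is as a minimum threshold when pulling pathogenic assertions. A minimal sketch, assuming each record has already been parsed into a dict with an integer `stars` field and a `significance` string; these field names, the identifiers, and the default cutoff of 2 are hypothetical choices for the example, not ClinVar's file format or a recommended policy:

```python
def reviewed_pathogenic(records, min_stars=2):
    """Keep records marked pathogenic with at least `min_stars` review stars."""
    if not 0 <= min_stars <= 4:
        raise ValueError("min_stars must be between 0 and 4")
    return [
        r for r in records
        if r["significance"].strip().lower() == "pathogenic"
        and r["stars"] >= min_stars
    ]


# Toy records with made-up identifiers.
toy_records = [
    {"id": "rec1", "significance": "Pathogenic", "stars": 3},
    {"id": "rec2", "significance": "Pathogenic", "stars": 0},
    {"id": "rec3", "significance": "Benign", "stars": 4},
    {"id": "rec4", "significance": "pathogenic", "stars": 2},
]
kept_ids = [r["id"] for r in reviewed_pathogenic(toy_records)]
print(kept_ids)
```

Lowering `min_stars` to 0 keeps every pathogenic assertion regardless of review level, which is closer to a research setting than a clinical one.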
HGMD
Data mines academic papers for reported functional variants
Also takes submissions, corrections reviewed by team
First available in 1996
- Originally 10k variants
- 105k in Public (2014)
- 148k in “Pro” (2014)
Example: CFTR
Different Variant Sources
- CFTR2 (Johns Hopkins)
- UMD-CFTR
- ACMG
ClinVar
- 1632 Variants
- 442 Marked Pathogenic
ClinVitae
- 446 Variants
- 325 Marked Pathogenic
Caution Needed – Delta F508 Alignment
CFTR delta F508
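The caution here comes down to a deletion in a repetitive stretch having more than one valid placement, so a database entry and a variant call can describe the same change at different positions. One common remedy is to left-align deletions before comparing them. A minimal sketch, assuming a 0-based start into a reference string; the function names are hypothetical and the short sequence is a toy example, not a verified copy of the CFTR reference:

```python
def apply_deletion(ref, start, length):
    """Return the sequence after deleting ref[start:start + length]."""
    return ref[:start] + ref[start + length:]


def left_align_deletion(ref, start, length):
    """Shift a deletion as far left as possible without changing its result.

    Moving the deleted window one base left gives the same sequence exactly
    when the base entering on the left equals the base leaving on the right.
    """
    if length <= 0 or start < 0 or start + length > len(ref):
        raise ValueError("deletion lies outside the reference")
    while start > 0 and ref[start - 1] == ref[start + length - 1]:
        start -= 1
    return start


toy_ref = "ATCATCTTTGGT"
call_a = (5, 3)  # deletes "CTT"
call_b = (4, 3)  # deletes "TCT"

same_result = apply_deletion(toy_ref, *call_a) == apply_deletion(toy_ref, *call_b)
normalized = {left_align_deletion(toy_ref, s, n) for s, n in (call_a, call_b)}
print(same_result, normalized)
```

The two calls name different deleted bases at different positions, yet they produce the same sequence and collapse to a single start once left-aligned. A naive position-and-allele match would have missed the pair.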
BRCA: The back door to Myriad’s database
1995 – Patent issued to Myriad Genetics
June 2013 – Patents invalidated by ruling
A lab setting up Dx has a lot of catching up to do
“Free the Data” and other ways in which Myriad’s data is in ClinVar, etc.
Sharing Clinical Reports Project
ClinVitae: ClinVar and Friends by Invitae
Sources:
- ClinVar (62,913)
- Emory (13,365)
- ARUP (2,850)
- Carver Mut (199)
- K Cunningham (581)
79,907 variants, 9,189 genes
- 32,523 Pathogenic
- 38,796 Likely Pathogenic
Provided in HGVS
- 59,878 after mapping to genomic space
BRCA: In my wife
Agenda: 5. NGS Clinical Utopia: Are We There Yet?
Training
Most variants are rare or novel
- Training to interpret these is extensive
MD/Pathology background is insufficient
Need a PhD in molecular genetics
Only 500 Clinical Molecular Geneticists have been board certified since certification started
Let’s share in the learning process
Baylor Exome Sign-Out
Thank you
Heidi Rehm – Chief Laboratory Director at Laboratory for Molecular Medicine, PCPGM
Joel Parker – Cancer Genetics, UNC Chapel Hill
Gerry Higgins – VP, Pharmacogenomic Science, Assure Rx Health
Frank Schacherer – Chief Technical Officer, BIOBASE
Reece Hart – Computational Biologist, Invitae
Greta Linse Peterson – Director of Product Management and Quality, Golden Helix
Use the Questions pane in your GoToWebinar window
Questions?
[cut slides after this]
Phenotyping and Matchmaking Portals
PhenoDB
PhenomeCentral.org
Orphanet – Resources on over 6000 rare diseases and orphan drugs.
European centric:
- GEN2PHEN (G2P)
Updated VCF and report at end of October
GATK is a Research Tool. Clinics Beware.
Found 8,000 “phantom” variants
Rare Disease Resources
Rare defined as affecting fewer than 200k people.
- Most affect fewer than 6000
- 25M Americans have a rare disease
NIH Genetic and Rare Diseases Information Center (GARD)
ClinicalTrials.gov
Orphanet – Resources on over 6000 rare diseases and orphan drugs.
Cancer Resources
Behind germline because:
- Sharing cancer data is more wholesale. You don’t just post a variant + a phenotype, you have to have whole variant sets
- Cohorts are not covering enough ethnic groups. African Americans are under-represented
- Not a lot of incentive for large cancer centers to share their internal databases
What do we do with the data?
- Driver genes can be found in 70% of tumors. But not many have actionable drugs.
- Need many more evidence-based trials to find more examples like BRAF V600E
Pic of BRAF V600E and drug