Using Public Access Clinical Databases to Interpret NGS Variants

Using Public Access Clinical Databases to Interpret NGS Variants February 19, 2014 Gabe Rudy VP Product Development Golden Helix

description

In this webcast on February 19th, Gabe Rudy, Vice President of Product Development, will showcase publicly available databases and resources for interpreting rare and novel mutations in the context of his own personal exome, obtained through a limited 23andMe pilot in 2012. The last couple of years have seen many changes in well-established resources such as OMIM and dbSNP, and have motivated new efforts such as ClinVar and PhenoDB to bring NGS interpretation to clinical grade through a global data sharing effort.

In this webcast, Gabe will cover:

- The changing landscape of public annotations: Then, Now, and Soon. Will the new human reference (GRCh38) released in December be a game changer?
- Specific examples of improvements in annotation and algorithms that result in more accurate analysis of his own exome.
- The utility and progress of NGS in different clinical applications in terms of public resources: carrier screening, hereditary cancer risk, pharmacogenomics, oncology care, and genetic disorder diagnosis.
- Sharing of new clinical data: how both variation-level and phenotype-level data are currently being shared, and what will be the way forward to match rare and undiagnosed cases at a global scale.

Transcript of Using Public Access Clinical Databases to Interpret NGS Variants

Page 1: Using Public Access Clinical Databases to Interpret NGS Variants

Using Public Access Clinical Databases to Interpret NGS Variants

February 19, 2014

Gabe Rudy, VP Product Development

Golden Helix

Page 2: Using Public Access Clinical Databases to Interpret NGS Variants

Use the Questions pane in your GoToWebinar window

Questions during the presentation

Page 3: Using Public Access Clinical Databases to Interpret NGS Variants

My Background

Golden Helix
- Founded in 1998
- Genetic association software
- Analytic services
- Hundreds of users worldwide
- Over 800 customer citations in scientific journals

Products I Build with My Team
- SNP & Variation Suite (SVS)
  - SNP, CNV, NGS tertiary analysis
  - Import and deal with all flavors of upstream data
- GenomeBrowse
  - Visualization of everything with genomic coordinates. All standardized file formats.
- RNA-Seq Pipeline
  - Expression profiling bioinformatics

Page 4: Using Public Access Clinical Databases to Interpret NGS Variants

Agenda

1. How I Met My Exomes
2. Getting High Quality Variant Calls
3. Clinical Grade Candidate Variant Identification
4. Data Sharing and the Maturing of Public Resources
5. NGS Clinical Utopia: Are We There Yet?

Page 5: Using Public Access Clinical Databases to Interpret NGS Variants

Exome Sequencing in Consumer Genomics

Exomes done as part of Pilot Program

80x coverage

Raw data with no interpretation

ErinJIA

Gabe (me)

Ethan

Page 6: Using Public Access Clinical Databases to Interpret NGS Variants

Research or clinical grade?

Metric            Value
Total Reads       140M
Unique Align      87%
Mean Target       105x
% Target at 2x    97%
% Target at 10x   94%
% Target at 20x   89%
% Target at 30x   83%
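Coverage metrics like the ones on this slide can be derived from per-base depth over the target regions. The following is a minimal sketch, assuming the depths are already available as a list of integers (for example, parsed from a depth report). The function name `coverage_summary` and the toy data are illustrative, not part of the original analysis.

```python
from typing import Dict, Iterable, Sequence


def coverage_summary(
    depths: Sequence[int], thresholds: Iterable[int] = (2, 10, 20, 30)
) -> Dict[str, float]:
    """Return mean depth and the percent of target bases at or above each threshold."""
    n = len(depths)
    if n == 0:
        raise ValueError("no target bases supplied")
    summary = {"mean": sum(depths) / n}
    for t in thresholds:
        # Count bases whose depth meets the threshold.
        covered = sum(1 for d in depths if d >= t)
        summary[f"pct_at_{t}x"] = 100.0 * covered / n
    return summary


# Toy per-base depths (hypothetical values, not from the exome in the talk).
toy_depths = [0, 1, 5, 12, 25, 40, 80, 120, 150, 200]
print(coverage_summary(toy_depths))
```

A steep drop between the 10x and 30x rows, as in the table above, indicates that a sizable share of the target is covered too thinly for confident heterozygous calls even when the mean looks high.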

Page 7: Using Public Access Clinical Databases to Interpret NGS Variants

Agenda

1. How I Met My Exomes
2. Getting High Quality Variant Calls
3. Clinical Grade Candidate Variant Identification
4. Data Sharing and the Maturing of Public Resources
5. NGS Clinical Utopia: Are We There Yet?

Page 8: Using Public Access Clinical Databases to Interpret NGS Variants

Alignment and Variant Calling Broken Down

2012: 2 VCFs from 23andMe
- BWA 0.6.1
- GATK (early & late 2012)

2013: Real Time Genomics
- v3.1.2 2013-05-02
- Called on Trio

2014: Rerun
- BWA 0.7.6 (2014-01-31)
- FreeBayes

2014 - BWA/FreeBayes

Page 9: Using Public Access Clinical Databases to Interpret NGS Variants

PSPH mis-alignment

Page 10: Using Public Access Clinical Databases to Interpret NGS Variants

Splice Mutation

Page 11: Using Public Access Clinical Databases to Interpret NGS Variants
Page 12: Using Public Access Clinical Databases to Interpret NGS Variants
Page 13: Using Public Access Clinical Databases to Interpret NGS Variants
Page 14: Using Public Access Clinical Databases to Interpret NGS Variants

GRCh38 – Here Now, but still Waiting

A better human reference
- Revised Cambridge Reference Sequence (rCRS) MT
- Has centromere models
- ~2000 incorrect alleles fixed
- ~100 assembly gaps updated

No Gene Annotations
- RefSeqGene - Feb 2014
- Ensembl Q4 2014

No Variant Annotations
- Re-align 1000 Genomes and NHLBI 6500?
- dbSNP?

         GRCh37    GRCh38
Ts/Tv    2.06558   2.10171

[Chart: variant counts in My Exome by type (snps, mnps, indels, complex), GRCh37 vs GRCh38; y-axis from 270,000 to 340,000; totals shown: 331,824 and 319,442]
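The Ts/Tv figures on this slide are transition/transversion ratios, a common sanity check on a SNP call set. The following is a minimal sketch of how such a ratio can be computed from (ref, alt) allele pairs. The function names and toy input are assumptions for illustration, and this is not the tool used to produce the numbers above.

```python
from typing import Iterable, Tuple

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}
BASES = PURINES | PYRIMIDINES


def is_transition(ref: str, alt: str) -> bool:
    """A transition swaps purine for purine (A<->G) or pyrimidine for pyrimidine (C<->T)."""
    pair = {ref, alt}
    return ref != alt and (pair <= PURINES or pair <= PYRIMIDINES)


def ts_tv_ratio(snvs: Iterable[Tuple[str, str]]) -> float:
    """Compute Ts/Tv over single-base substitutions, skipping indels and other records."""
    ts = tv = 0
    for ref, alt in snvs:
        ref, alt = ref.upper(), alt.upper()
        if ref not in BASES or alt not in BASES or ref == alt:
            continue  # not a simple SNV
        if is_transition(ref, alt):
            ts += 1
        else:
            tv += 1
    if tv == 0:
        raise ValueError("no transversions observed; ratio undefined")
    return ts / tv


# Toy call set: three transitions, one transversion, one deletion (skipped).
toy_calls = [("A", "G"), ("C", "T"), ("G", "A"), ("A", "C"), ("AT", "A")]
print(ts_tv_ratio(toy_calls))
```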

Page 15: Using Public Access Clinical Databases to Interpret NGS Variants

Blog Post

Page 16: Using Public Access Clinical Databases to Interpret NGS Variants

Agenda

1. How I Met My Exomes
2. Getting High Quality Variant Calls
3. Clinical Grade Candidate Variant Identification
4. Data Sharing and the Maturing of Public Resources
5. NGS Clinical Utopia: Are We There Yet?

Page 17: Using Public Access Clinical Databases to Interpret NGS Variants

Baylor Workflow - Clinical Exomes Paper

Disease gene related

Medically actionable deleterious variants

Deleterious variants in ACMG gene list

Deleterious variants

VUS in dominant gene or homozygous in recessive gene

Deleterious variant in gene with no known disease

Page 18: Using Public Access Clinical Databases to Interpret NGS Variants

Data Sources to Replicate Workflow

1000 Genomes (Phase 1)

“ESP” (NHLBI 6500 Exomes v2)

HGMD (Public vs Professional)

Variant’s Protein Coding Effect

RNA Splicing Effect

Gene Lists:
- Single-Gene Disorder (OMIM with Inheritance)
- Medically Actionable (114 genes NHLBI study)
- Dominant Inheritance (MedGen)
- ACMG Carrier Panel (ACMG Incidental Findings guidelines)
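The sources listed on this slide can be combined into a simple candidate filter: keep variants that are rare in the population catalogs and are either listed in HGMD or predicted to affect the protein or splicing, then tag each with the gene lists it falls in. The following is a minimal sketch under stated assumptions. The 1% frequency cutoff, the field names, the effect labels, and the toy gene lists are all hypothetical and are not taken from the Baylor workflow or from the talk.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Set

# Illustrative effect labels; real annotation tools use their own vocabularies.
DAMAGING_EFFECTS = {"loss_of_function", "missense", "splice"}


@dataclass
class Variant:
    gene: str
    effect: str
    af_1kg: Optional[float] = None  # 1000 Genomes allele frequency, None if unobserved
    af_esp: Optional[float] = None  # ESP allele frequency, None if unobserved
    in_hgmd: bool = False


def is_rare(v: Variant, max_af: float = 0.01) -> bool:
    """Rare if absent from, or at or below max_af in, every population catalog."""
    return all(af is None or af <= max_af for af in (v.af_1kg, v.af_esp))


def is_candidate(v: Variant, max_af: float = 0.01) -> bool:
    """Rare and either reported in HGMD or predicted damaging."""
    return is_rare(v, max_af) and (v.in_hgmd or v.effect in DAMAGING_EFFECTS)


def gene_list_tags(v: Variant, gene_lists: Dict[str, Set[str]]) -> List[str]:
    """Names of the gene lists that contain this variant's gene."""
    return sorted(name for name, genes in gene_lists.items() if v.gene in genes)


# Toy gene lists and variants (hypothetical contents for illustration only).
toy_gene_lists = {
    "single_gene_disorder": {"CFTR"},
    "medically_actionable": {"BRCA1", "BRCA2"},
}
toy_variants = [
    Variant("CFTR", "missense", af_1kg=0.002, af_esp=0.001),
    Variant("BRCA2", "synonymous", af_1kg=0.20, af_esp=0.18),
    Variant("GENE_X", "loss_of_function"),
]

for v in toy_variants:
    if is_candidate(v):
        print(v.gene, v.effect, gene_list_tags(v, toy_gene_lists))
```

Treating an unobserved frequency as rare is a deliberate choice here: novel variants are exactly the ones a clinical workflow cannot afford to drop.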

Page 19: Using Public Access Clinical Databases to Interpret NGS Variants

My Exome Analyzed

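[Flowchart: variant counts at each filtering step of the workflow for my exome. Start: 235,689. Counts in slide order: 847; 234,842; 224,914; 9,928; 9,069; 807; 859; 40; 242; 13; 59; 565; 0; 624; 624; 255; 20; 20; 20; 0; 0; 598; 644]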

Page 20: Using Public Access Clinical Databases to Interpret NGS Variants

Pathogenic by RSID match

Page 21: Using Public Access Clinical Databases to Interpret NGS Variants

Agenda

1. How I Met My Exomes
2. Getting High Quality Variant Calls
3. Clinical Grade Candidate Variant Identification
4. Data Sharing and the Maturing of Public Resources
5. NGS Clinical Utopia: Are We There Yet?

Page 22: Using Public Access Clinical Databases to Interpret NGS Variants

Applications of NGS Data in the Clinic

Carrier screening – prenatal and standard

Lifetime risk prediction

Genetic disorder diagnostics

Oncology care

PGx – dosage and care

Page 23: Using Public Access Clinical Databases to Interpret NGS Variants

ClinVar

Submitters:
- OMIM: Johns Hopkins
- Samuels
- Lab for Molecular Medicine
- Invitae
- Emory Genetics Lab

Star rating system
- 0-4 stars – level of review

ClinVar is designed to provide a freely accessible, public archive of reports of the relationships among human variations and phenotypes, with supporting evidence.

Page 24: Using Public Access Clinical Databases to Interpret NGS Variants

HGMD

Data mines academic papers for reported functional variants

Also takes submissions, corrections reviewed by team

First available in 1996
- Originally 10k variants
- 105k in Public (2014)
- 148k in “Pro” (2014)

Page 25: Using Public Access Clinical Databases to Interpret NGS Variants

Example: CFTR

Different Variant Sources
- CFTR2 (Johns Hopkins)
- UMD-CFTR
- ACMG

ClinVar
- 1632 Variants
- 442 Marked Pathogenic

ClinVitae
- 446 Variants
- 325 Marked Pathogenic

Caution Needed – Delta F508 Alignment
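The slide does not spell out the Delta F508 alignment problem. One common reason for this kind of caution is that a deletion inside a short repeat can be written at more than one position, so the same variant may fail to match a database entry unless both are normalized. The following is a minimal sketch of left-aligning a deletion, using a short toy sequence rather than real CFTR reference coordinates. The function names and the example calls are illustrative assumptions.

```python
def left_align_deletion(ref: str, start: int, length: int) -> int:
    """Shift a deletion (0-based start, given length) as far left as possible
    without changing the resulting sequence, and return the new start."""
    if length <= 0 or start < 0 or start + length > len(ref):
        raise ValueError("deletion falls outside the reference")
    # The deletion can move one base left while the base entering the window
    # on the left equals the base leaving it on the right.
    while start > 0 and ref[start - 1] == ref[start + length - 1]:
        start -= 1
    return start


def apply_deletion(ref: str, start: int, length: int) -> str:
    """Return the sequence with the deleted bases removed."""
    return ref[:start] + ref[start + length:]


toy_ref = "ATCATCTTTGGT"
call_a = (5, 3)  # deletes "CTT"
call_b = (4, 3)  # deletes "TCT"

# Both calls describe the same resulting sequence...
print(apply_deletion(toy_ref, *call_a) == apply_deletion(toy_ref, *call_b))
# ...and normalize to the same start position.
print(left_align_deletion(toy_ref, *call_a), left_align_deletion(toy_ref, *call_b))
```

Comparing variants only after normalization avoids missing a pathogenic deletion because the caller and the database chose different but equivalent representations.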

Page 26: Using Public Access Clinical Databases to Interpret NGS Variants
Page 27: Using Public Access Clinical Databases to Interpret NGS Variants

CFTR delta F508

Page 28: Using Public Access Clinical Databases to Interpret NGS Variants

BRCA: The back door to Myriad’s database

1995 – Patent issued to Myriad Genetics

June 2013 – Patents invalidated by ruling

A lab setting up Dx has a lot of catching up to do

“Free the Data” and other ways in which Myriad’s data is in ClinVar, etc.

Sharing Clinical Reports Project

Page 29: Using Public Access Clinical Databases to Interpret NGS Variants

ClinVitae: ClinVar and Friends by Invitae

Sources:
- ClinVar (62,913)
- Emory (13,365)
- ARUP (2,850)
- Carver Mut (199)
- K Cunningham (581)

79,907 variants, 9,189 genes
- 32,523 Pathogenic
- 38,796 Likely Pathogenic

Provided in HGVS
- 59,878 after mapping to genomic space

Page 30: Using Public Access Clinical Databases to Interpret NGS Variants

BRCA: In my wife

Page 31: Using Public Access Clinical Databases to Interpret NGS Variants

Agenda

1. How I Met My Exomes
2. Getting High Quality Variant Calls
3. Clinical Grade Candidate Variant Identification
4. Data Sharing and the Maturing of Public Resources
5. NGS Clinical Utopia: Are We There Yet?

Page 32: Using Public Access Clinical Databases to Interpret NGS Variants

Training

Most variants are rare or novel
- Training to interpret these is extensive

MD/Pathology background is insufficient

Need a PhD in molecular genetics

There are only 500 board-certified Clinical Molecular Geneticists since certification started

Let’s share in the learning process

Baylor Exome Sign-Out

Page 33: Using Public Access Clinical Databases to Interpret NGS Variants

Thank you

Heidi Rehm – Chief Laboratory Director at Laboratory for Molecular Medicine, PCPGM

Joel Parker – Cancer Genetics, UNC Chapel Hill

Gerry Higgins – VP, Pharmacogenomic Science, Assure Rx Health

Frank Schacherer – Chief Technical Officer, BIOBASE

Reece Hart – Computational Biologist, Invitae

Greta Linse Peterson – Director of Product Management and Quality, Golden Helix

Page 34: Using Public Access Clinical Databases to Interpret NGS Variants

Use the Questions pane in your GoToWebinar window

Questions?

Page 35: Using Public Access Clinical Databases to Interpret NGS Variants

[cut slides after this]

Page 36: Using Public Access Clinical Databases to Interpret NGS Variants

Phenotyping and Matchmaking Portals

PhenoDB

PhenomeCentral.org

Orphanet – Resources on over 6000 rare diseases and orphan drugs.

European centric:
- GEN2PHEN (G2P)

Page 37: Using Public Access Clinical Databases to Interpret NGS Variants

Updated VCF and report at end of October

GATK is a Research Tool. Clinics Beware.

Found 8,000 “phantom” variants

Page 38: Using Public Access Clinical Databases to Interpret NGS Variants

Rare Disease Resources

Rare defined as affecting fewer than 200k people.
- Most affect fewer than 6000
- 25M Americans have a rare disease

NIH Genetic and Rare Diseases Information Center (GARD)

ClinicalTrials.gov

Orphanet – Resources on over 6000 rare diseases and orphan drugs.

Page 39: Using Public Access Clinical Databases to Interpret NGS Variants

Cancer Resources

Behind germline because:
- Sharing cancer data is more wholesale. You don’t just post a variant + a phenotype, you have to have whole variant sets
- Cohorts are not covering enough ethnic groups. African Americans are under-represented
- Not a lot of incentive for large cancer centers to share their internal databases

What do we do with the data?
- Driver genes can be found in 70% of tumors, but not many have actionable drugs.
- Need many more evidence-based trials to find more examples like BRAF V600E

[Image: BRAF V600E and drug]