Aug2015 analysis team 12 bina

Post on 17-Jan-2017

Bina Technologies, now part of Roche Sequencing

Updates on Trio Analysis on High-Confidence SVs

Bina Technologies, Roche Sequencing

Marghoob Mohiyuddin, Jian Li, Hugo Lam

For Research Use Only. Not for use in diagnostic procedures.

Background

• Recently published MetaSV tool does integrative SV-calling

• http://bioinform.github.io/metasv

• Original work integrates across BreakDancer, BreakSeq2, CNVnator and Pindel

• Brad Chapman recently added support for CNVkit, LUMPY, Manta

• GiaB recently released high-confidence SVs for NA12878 for validating SV methods

• We performed validation of GiaB SV goldset by using trio analysis with MetaSV and published variants as a quality check

• More confirmation that GiaB calls are of high-quality

• But potentially missing a significant number of true positives

An Ensemble Approach

MetaSV Workflow

• Ensemble SV calling

• Merge SVs from multiple methods and tools

• SVs detected by multiple methods are high-confidence

• Enhanced insertion detection

• Existing tools weak in detecting insertions

• Use a combination of soft-clip analysis and assembly

• Assembly and alignment to refine breakpoints
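The merging idea in the bullets above can be sketched roughly as follows. This is only an illustration and not MetaSV's actual implementation: the `SV` record, the `merge_calls` helper, the `min_tools` threshold and the simple "any overlap" clustering rule are all assumptions made for the example.

```python
from collections import namedtuple

# Hypothetical record for one SV call; not MetaSV's data model.
SV = namedtuple("SV", "chrom start end svtype tool")


def merge_calls(calls, min_tools=2):
    """Cluster overlapping same-type calls and flag multi-tool clusters.

    Assumption: two calls belong together if they share chromosome and
    SV type and their intervals overlap at all.
    """
    merged = []
    ordered = sorted(calls, key=lambda c: (c.chrom, c.svtype, c.start))
    for call in ordered:
        last = merged[-1] if merged else None
        if (last and last["chrom"] == call.chrom
                and last["svtype"] == call.svtype
                and call.start <= last["end"]):
            # Extend the current cluster and record the supporting tool.
            last["end"] = max(last["end"], call.end)
            last["tools"].add(call.tool)
        else:
            merged.append({"chrom": call.chrom, "start": call.start,
                           "end": call.end, "svtype": call.svtype,
                           "tools": {call.tool}})
    for m in merged:
        # A cluster supported by several tools is treated as high-confidence.
        m["high_confidence"] = len(m["tools"]) >= min_tools
    return merged


calls = [
    SV("chr1", 1000, 2000, "DEL", "BreakDancer"),
    SV("chr1", 1010, 1990, "DEL", "Pindel"),
    SV("chr1", 5000, 5600, "DEL", "CNVnator"),
]
for m in merge_calls(calls):
    print(m["chrom"], m["start"], m["end"], sorted(m["tools"]), m["high_confidence"])
```

In this toy input the first two deletions collapse into one cluster supported by two tools, while the third remains a single-tool call.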

Trio Analysis Methodology

• Process the parents and child data using MetaSV (or any tool of choice) to identify SVs

• Identify high-confidence SVs for the child by checking for their presence in the parents’ SV sets

• Check for Mendelian errors

• Published variants also used for validation
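A minimal sketch of the presence check described above, assuming each call is a `(chrom, start, end)` tuple and that a child call counts as "present" in a parent when the two intervals reciprocally overlap by at least 50%. The helper names and the 50% threshold are assumptions for illustration; the slides do not state the matching rule used.

```python
def overlaps(a, b, min_reciprocal=0.5):
    """True if intervals a and b (chrom, start, end) reciprocally overlap."""
    if a[0] != b[0]:
        return False
    shared = min(a[2], b[2]) - max(a[1], b[1])
    if shared <= 0:
        return False
    return (shared / (a[2] - a[1]) >= min_reciprocal
            and shared / (b[2] - b[1]) >= min_reciprocal)


def validate_child_calls(child, father, mother):
    """Split child calls into parent-supported and unsupported (private)."""
    parents = father + mother
    supported, private = [], []
    for call in child:
        if any(overlaps(call, p) for p in parents):
            supported.append(call)
        else:
            private.append(call)
    return supported, private


child = [("chr1", 1000, 2000), ("chr2", 500, 900)]
father = [("chr1", 1020, 1980)]
mother = [("chr3", 100, 400)]
supported, private = validate_child_calls(child, father, mother)
print(len(supported), len(private))
```

Calls left in `private` are candidates for Mendelian errors: they may be false positives in the child, missed calls in the parents, or genuine de novo events.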

GiaB Deletion Validation

Call set                                    Total Validated   Total not validated   Additionally Validated
GiaB HC                                     0 (0%)            2,348 (100%)          0 (0%)
GiaB HC Validated by Parents (MetaSV ALL)   2,302 (98.0%)     46 (2.0%)             2,302 (98.0%)
GiaB HC Validated by Child (MetaSV PASS)    2,306 (98.2%)     42 (1.8%)             4 (0.2%)
GiaB HC Validated by Child (curated)        2,342 (99.7%)     6 (0.3%)              36 (1.5%)

Comparison with Complete Genomics Deletion SVs for AJ Trio

Sample   High-confidence MetaSV   High-confidence CG   Common (% CG)   Common with all-MetaSV (% CG)
HG002    2809                     1864                 1426 (76.5%)    1607 (86.2%)
HG003    2891                     1802                 1448 (80.4%)    1611 (89.4%)
HG004    2959                     1847                 1398 (75.7%)    1554 (84.1%)

• Only SVs >= 100bp considered

• Good fraction of CG calls also made by MetaSV

• Affirmation for both since different sequencing technologies

• All-MetaSV provides even higher validation
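As a quick check of how the "% CG" columns are derived, the snippet below recomputes them as the number of common calls divided by the number of high-confidence CG calls. The counts are copied from the table above; the helper name `pct_of_cg` is just for illustration.

```python
# (high-confidence CG, common with high-confidence MetaSV, common with all-MetaSV)
counts = {
    "HG002": (1864, 1426, 1607),
    "HG003": (1802, 1448, 1611),
    "HG004": (1847, 1398, 1554),
}


def pct_of_cg(common, cg_total):
    """Percentage of CG calls that are also found by MetaSV."""
    return round(100.0 * common / cg_total, 1)


for sample, (cg, common, common_all) in counts.items():
    print(sample, pct_of_cg(common, cg), pct_of_cg(common_all, cg))
```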

Analysis of HG002 deletions using Mendelian consistency

Calls in child (HG002)   Pass/All in parent   Consider genotypes   Validated calls
2809                     All                  No                   2756 (98.1%)
2809                     All                  Yes                  2723 (96.9%)
2809                     Pass                 No                   2625 (93.4%)
2809                     Pass                 Yes                  2571 (91.5%)

• Analysis performed on ~50x data

• Child SVs are highly Mendelian consistent affirming high-confidence (barring systematic errors)

• Mendelian consistency analysis on CG results without genotypes gave 86.3% validation rate for HG002
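The difference between the "Consider genotypes" Yes and No rows can be illustrated with a small sketch. It assumes diploid autosomal genotypes encoded as allele tuples such as `(0, 1)`, and treats a missing call (`None`) as homozygous reference; both function names are hypothetical and this is not the code used for the analysis.

```python
from itertools import product


def mendelian_consistent(child_gt, father_gt, mother_gt):
    """True if the child's genotype can be formed from one allele per parent.

    Genotypes are tuples of allele codes, e.g. (0, 1) for a heterozygous
    deletion. A missing call (None) is treated as homozygous reference.
    """
    ref = (0, 0)
    child = sorted(child_gt or ref)
    father = father_gt or ref
    mother = mother_gt or ref
    return any(sorted(pair) == child for pair in product(father, mother))


def presence_consistent(child_gt, father_gt, mother_gt):
    """Genotype-free check: a child variant must appear in some parent."""
    def has_alt(gt):
        return gt is not None and any(a != 0 for a in gt)
    return (not has_alt(child_gt)) or has_alt(father_gt) or has_alt(mother_gt)


# A homozygous deletion in the child with only one carrier parent passes
# the presence check but fails the genotype-aware check.
print(presence_consistent((1, 1), (0, 1), None))
print(mendelian_consistent((1, 1), (0, 1), None))
```

Because the genotype-aware test is strictly harder to pass, it is expected to validate fewer calls than the presence-only test, which matches the direction of the numbers in the table.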

More work needs to be done

• Improving MetaSV performance with more recent callers (working with Brad Chapman)

• Work in progress on improving inversion sensitivity for MetaSV

• Assessment of insertion SVs

• Integrating with other integrative SV-callers (Parliament, svclassify)

• Meta-MetaSV?

• In general, more assessment needs to be done for constructing a variant set which is not only precise but comprehensive

Acknowledgement

• Genome in a Bottle Consortium

• Hemang Parikh

• Justin Zook

• Brad Chapman

• Rebecca Truty

• The SV Team

• Bina Technologies

• Jian Li

• Hugo Lam

• The Science Team

Backup slides

MetaSV Accuracy

• VarSim simulation of 50x Illumina 2x100bp reads for NA12878

• Reciprocal overlap of 90% and wiggle of 100bp to assess both breakpoint precision and accuracy

• Performance varies for tools/methods across sizes

• MetaSV has best stable accuracy across all SV sizes

• Achieved 90.2% sensitivity against Complete Genomics high-confidence SVs for NA12878
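The two matching criteria named above can be sketched as follows. The slides do not say how reciprocal overlap and wiggle are combined, so requiring both at once in `is_match` is an assumption, as are the function names; intervals are `(start, end)` pairs assumed to lie on the same chromosome and to share the same SV type.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end):
    """Smaller of the two overlap fractions; 0.0 when intervals are disjoint."""
    shared = min(a_end, b_end) - max(a_start, b_start)
    if shared <= 0:
        return 0.0
    return min(shared / (a_end - a_start), shared / (b_end - b_start))


def within_wiggle(a_start, a_end, b_start, b_end, wiggle=100):
    """True if both breakpoints differ by at most `wiggle` bases."""
    return abs(a_start - b_start) <= wiggle and abs(a_end - b_end) <= wiggle


def is_match(call, truth, min_ro=0.9, wiggle=100):
    """Assumed rule: a call matches only if it satisfies both criteria."""
    return (reciprocal_overlap(*call, *truth) >= min_ro
            and within_wiggle(*call, *truth, wiggle=wiggle))


def sensitivity(calls, truth_set, **kwargs):
    """Fraction of truth SVs matched by at least one call."""
    hit = sum(any(is_match(c, t, **kwargs) for c in calls) for t in truth_set)
    return hit / len(truth_set)


truth = [(1000, 2000), (5000, 9000)]
calls = [(1010, 1990), (5000, 8000)]
print(sensitivity(calls, truth))
```

Here the first truth deletion is matched, while the second is not because the call covers only 75% of it, so the toy sensitivity is 1 out of 2.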

Trio analysis on AJ family using Mendelian consistency

● MetaSV High Quality

○ Mendelian Inheritance Consistency without Genotypes

○ PASS in Child and PASS in Parents

○ Considering no call as reference call

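[Venn diagram: MetaSV child deletions vs. parent deletions (MetaSV high-quality)]

MetaSV Child Dels: 2,809

Shared with Parent Dels (MetaSV high-quality): 2,625 (93.4%)

MetaSV Private: 184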