Aug2015 analysis team 12 bina

Bina Technologies, now part of Roche Sequencing

Page 1: Aug2015 analysis team 12 bina

Bina Technologies, now part of Roche Sequencing

Page 2: Aug2015 analysis team 12 bina

Updates on Trio Analysis on High-Confidence SVs

Bina Technologies, Roche Sequencing

Marghoob Mohiyuddin, Jian Li, Hugo Lam

Page 3: Aug2015 analysis team 12 bina

For Research Use Only. Not for use in diagnostic procedures.

Background

• Recently published MetaSV tool does integrative SV-calling
  • http://bioinform.github.io/metasv
  • Original work integrates across BreakDancer, BreakSeq2, CNVnator and Pindel
  • Brad Chapman recently added support for CNVkit, LUMPY, Manta
• GiaB recently released high-confidence SVs for NA12878 for validating SV methods
• We performed validation of GiaB SV goldset by using trio analysis with MetaSV and published variants as a quality check
  • More confirmation that GiaB calls are of high quality
  • But potentially missing a significant number of true positives

Page 4: Aug2015 analysis team 12 bina

An Ensemble Approach

Page 5: Aug2015 analysis team 12 bina

MetaSV Workflow

• Ensemble SV calling
  • Merge SVs from multiple methods and tools
  • SVs detected by multiple methods are high-confidence
• Enhanced insertion detection
  • Existing tools weak in detecting insertions
  • Use a combination of soft-clip analysis and assembly
• Assembly and alignment to refine breakpoints
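The ensemble idea can be illustrated with a minimal Python sketch. This is not MetaSV's actual implementation: the `Call` record, the rule that two calls describe the same event when both breakpoints lie within `max_dist` bp, the two-tool threshold, and the example coordinates are all assumptions made for illustration.

```python
from collections import namedtuple

# Hypothetical record for one SV call; not MetaSV's internal data structure.
Call = namedtuple("Call", "chrom start end svtype tool")


def same_event(a, b, max_dist=100):
    """Assumed merge rule: same chromosome and type, both breakpoints close."""
    return (a.chrom == b.chrom and a.svtype == b.svtype
            and abs(a.start - b.start) <= max_dist
            and abs(a.end - b.end) <= max_dist)


def merge_calls(calls, max_dist=100, min_tools=2):
    """Greedily cluster calls against each cluster's first member, then flag
    clusters reported by at least `min_tools` distinct tools."""
    clusters = []
    for call in sorted(calls, key=lambda c: (c.chrom, c.start, c.end)):
        for cluster in clusters:
            if same_event(cluster[0], call, max_dist):
                cluster.append(call)
                break
        else:
            clusters.append([call])

    merged = []
    for cluster in clusters:
        tools = sorted({c.tool for c in cluster})
        merged.append({
            "chrom": cluster[0].chrom,
            "start": min(c.start for c in cluster),
            "end": max(c.end for c in cluster),
            "svtype": cluster[0].svtype,
            "tools": tools,
            "high_confidence": len(tools) >= min_tools,
        })
    return merged


# Made-up coordinates; only the tool names come from the slides.
calls = [
    Call("1", 1000, 2000, "DEL", "BreakDancer"),
    Call("1", 1010, 1990, "DEL", "Pindel"),
    Call("1", 5000, 5600, "DEL", "CNVnator"),
]
merged = merge_calls(calls)
```

Here the first two calls collapse into one deletion supported by two tools and flagged high-confidence, while the CNVnator-only call is kept but not flagged.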

Page 6: Aug2015 analysis team 12 bina

Trio Analysis Methodology

• Process the parents and child data using MetaSV (or any tool of choice) to identify SVs
• Identify high-confidence SVs for the child by checking for their presence in the parents’ SV sets
• Check for Mendelian errors
• Published variants also used for validation
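The presence check in the second step can be sketched as follows. The tuple layout, the `overlaps` matching rule (any positional overlap on the same chromosome), and the toy coordinates are assumptions for illustration; the slides do not state which matching criterion was used for the trio analysis.

```python
def overlaps(a, b):
    """Assumed matching rule: (chrom, start, end) intervals share any base."""
    return a[0] == b[0] and min(a[2], b[2]) > max(a[1], b[1])


def split_by_parental_support(child, father, mother, matches=overlaps):
    """Return (supported, private): child calls found in at least one
    parent's call set, and child calls found in neither."""
    parents = list(father) + list(mother)
    supported, private = [], []
    for call in child:
        if any(matches(call, p) for p in parents):
            supported.append(call)
        else:
            private.append(call)
    return supported, private


# Toy call sets with made-up coordinates.
child = [("1", 1000, 2000), ("2", 300, 900), ("3", 100, 400)]
father = [("1", 1005, 1995)]
mother = [("2", 310, 880)]
supported, private = split_by_parental_support(child, father, mother)
validation_rate = len(supported) / len(child)
```

Passing a stricter `matches` function (for example one based on reciprocal overlap) changes the validation rate without touching the rest of the logic.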

Page 7: Aug2015 analysis team 12 bina

GiaB Deletion Validation

Stage | Total validated | Total not validated | Additionally validated
GiaB HC | 0 (0%) | 2,348 (100%) | 0 (0%)
GiaB HC Validated by Parents (MetaSV ALL) | 2,302 (98.0%) | 46 (2.0%) | 2,302 (98.0%)
GiaB HC Validated by Child (MetaSV PASS) | 2,306 (98.2%) | 42 (1.8%) | 4 (0.2%)
GiaB HC Validated by Child (curated) | 2,342 (99.7%) | 6 (0.3%) | 36 (1.5%)

Page 8: Aug2015 analysis team 12 bina

Comparison with Complete Genomics Deletion SVs for AJ Trio

Sample | High-confidence MetaSV | High-confidence CG | Common (% CG) | Common with all-MetaSV (% CG)
HG002 | 2809 | 1864 | 1426 (76.5%) | 1607 (86.2%)
HG003 | 2891 | 1802 | 1448 (80.4%) | 1611 (89.4%)
HG004 | 2959 | 1847 | 1398 (75.7%) | 1554 (84.1%)

• Only SVs >= 100bp considered
• Good fraction of CG calls also made by MetaSV
  • Affirmation for both since different sequencing technologies
• All-MetaSV provides even higher validation
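The "% CG" figures are the shared calls divided by the CG high-confidence count for each sample. A short Python check of that arithmetic, using only the counts reported on this slide (the variable and function names are illustrative):

```python
# sample: (CG high-confidence, common with HC MetaSV, common with all-MetaSV)
counts = {
    "HG002": (1864, 1426, 1607),
    "HG003": (1802, 1448, 1611),
    "HG004": (1847, 1398, 1554),
}


def percent_of(count, total):
    """Percentage of `total` represented by `count`, to one decimal place."""
    return round(100.0 * count / total, 1)


for sample, (cg, common_hc, common_all) in counts.items():
    print(f"{sample}: {percent_of(common_hc, cg):.1f}% of CG calls in "
          f"high-confidence MetaSV, {percent_of(common_all, cg):.1f}% in all-MetaSV")
```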

Page 9: Aug2015 analysis team 12 bina

Analysis of HG002 deletions using Mendelian consistency

Calls in child (HG002) | Pass/All in parent | Consider genotypes | Validated calls
2809 | All | No | 2756 (98.1%)
2809 | All | Yes | 2723 (96.9%)
2809 | Pass | No | 2625 (93.4%)
2809 | Pass | Yes | 2571 (91.5%)

• Analysis performed on ~50x data
• Child SVs are highly Mendelian consistent affirming high-confidence (barring systematic errors)
• Mendelian consistency analysis on CG results without genotypes gave 86.3% validation rate for HG002
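The "Consider genotypes" rows use a stricter test than presence alone. A minimal sketch of a genotype-aware Mendelian check, assuming autosomal, biallelic calls with genotypes encoded as deleted-allele counts (0, 1 or 2) and ignoring de novo events; the encoding and function names are illustrative, and treating a missing genotype as reference is applied here as an assumption:

```python
def transmissible(gt):
    """Deleted-allele counts (0 or 1) a parent with genotype `gt` can pass on.
    A missing genotype (None) is treated as homozygous reference (assumption)."""
    if gt is None:
        gt = 0
    return {0: {0}, 1: {0, 1}, 2: {1}}[gt]


def mendelian_consistent(child_gt, father_gt, mother_gt):
    """True if the child's genotype can be formed from one allele per parent."""
    if child_gt is None:
        child_gt = 0
    return any(f + m == child_gt
               for f in transmissible(father_gt)
               for m in transmissible(mother_gt))


# Heterozygous child with one heterozygous parent: consistent.
het_ok = mendelian_consistent(1, 1, 0)
# Homozygous-deleted child with only one carrier parent: a Mendelian error,
# even though a presence-only check would accept it.
hom_one_carrier = mendelian_consistent(2, 1, 0)
```

Cases like the second one are one reason the genotype-aware rates can fall below the presence-only rates.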

Page 10: Aug2015 analysis team 12 bina

More work needs to be done

• Improving MetaSV performance with more recent callers (working with Brad Chapman)
• Work in progress on improving inversion sensitivity for MetaSV
• Assessment of insertion SVs
• Integrating with other integrative SV-callers (Parliament, svclassify)
  • Meta-MetaSV?
• In general, more assessment needs to be done for constructing a variant set which is not only precise but comprehensive

Page 11: Aug2015 analysis team 12 bina

Acknowledgement

• Genome in a Bottle Consortium
  • Hemang Parikh
  • Justin Zook
  • Brad Chapman
  • Rebecca Truty
  • The SV Team
• Bina Technologies
  • Jian Li
  • Hugo Lam
  • The Science Team

Page 12: Aug2015 analysis team 12 bina

Bina Technologies, now part of Roche Sequencing

Page 13: Aug2015 analysis team 12 bina

Backup slides


Page 15: Aug2015 analysis team 12 bina

MetaSV Accuracy

• VarSim simulation of 50x Illumina 2x100bp reads for NA12878

• Reciprocal overlap of 90% and wiggle of 100bp to assess both breakpoint precision and accuracy

• Performance varies for tools/methods across sizes

• MetaSV has best stable accuracy across all SV sizes

• Achieved 90.2% sensitivity against Complete Genomics high-confidence SVs for NA12878
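The matching criterion and the sensitivity figure can be sketched in Python. How the evaluation combined the two thresholds is not stated on the slide; this sketch assumes a call matches a truth event only when the reciprocal overlap is at least 90% and both breakpoints are within the 100bp wiggle. The function names and example coordinates are hypothetical.

```python
def is_match(call, truth, min_ro=0.9, wiggle=100):
    """call and truth are (start, end) intervals on the same chromosome."""
    overlap = min(call[1], truth[1]) - max(call[0], truth[0])
    if overlap <= 0:
        return False
    # Reciprocal overlap: the intersection must cover min_ro of BOTH
    # intervals, i.e. min_ro of the longer one.
    ro = overlap / max(call[1] - call[0], truth[1] - truth[0])
    breakpoints_close = (abs(call[0] - truth[0]) <= wiggle
                         and abs(call[1] - truth[1]) <= wiggle)
    return ro >= min_ro and breakpoints_close


def sensitivity(calls, truths, **kwargs):
    """Fraction of truth events matched by at least one call."""
    found = sum(any(is_match(c, t, **kwargs) for c in calls) for t in truths)
    return found / len(truths)


# Made-up events: a 10 kb deletion called precisely, and a 200 bp deletion
# called with a 50 bp shift (inside the wiggle, but only 75% overlap).
truths = [(10000, 20000), (1000, 1200)]
calls = [(10050, 19950), (1050, 1250)]
result = sensitivity(calls, truths)
```

Under this reading, the overlap threshold dominates for small events and the wiggle dominates for large ones, which is consistent with the slide's remark that performance varies across sizes.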

Page 16: Aug2015 analysis team 12 bina

Trio analysis on AJ family using Mendelian consistency

● MetaSV High Quality
  ○ Mendelian Inheritance Consistency without Genotypes
  ○ PASS in Child and PASS in Parents
  ○ Considering no call as reference call

[Venn diagram: MetaSV child deletions (2,809) vs. parent deletions, MetaSV high-quality. Shared: 2,625 (93.4%). MetaSV private: 184.]