Post on 17-Jan-2017
Phasing NA12878 by segregation in children
● Joint calling of 17 member CEPH pedigree.
● Benefits:
  ○ High Mendelian consistency across all members.
  ○ (Near) full phasing of NA12878 (and NA12877) according to segregation in the 11 children.
● Latest run incorporates 300x Illumina reads for NA12878 RM8398 sample (other members ~30x).
● Calls that segregate well are more likely to be correct.
● Could look at phasing inconsistent calls in more detail.
  ○ Structural variants
  ○ Somatic variants
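As a rough illustration of what Mendelian consistency means at a single site, the sketch below checks one parent-child trio; for a pedigree the same check would be applied to each trio. The function names are hypothetical, and treating a missing call (".") as homozygous reference is an assumption of this sketch, not something the slides specify.

```python
from itertools import product


def parse_gt(gt):
    """Turn a diploid VCF genotype such as '0/1' or '0|1' into allele indices.

    Assumption of this sketch: a missing call ('.' or './.') counts as
    homozygous reference.
    """
    if "." in gt:
        return (0, 0)
    return tuple(int(a) for a in gt.replace("|", "/").split("/"))


def is_mendelian(child, mother, father):
    """True if the child's genotype can be formed from one maternal allele
    and one paternal allele."""
    target = sorted(parse_gt(child))
    m, f = parse_gt(mother), parse_gt(father)
    return any(sorted((a, b)) == target for a, b in product(m, f))


print(is_mendelian("0/1", ".", "."))      # het child, both parents missing
print(is_mendelian("0/1", "0/1", "."))    # het child, het mother
```

Under the missing-as-reference assumption, the first call is inconsistent and the second is consistent.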
Unifying call sets
Different callers, different representations.
Different samples, different representations.
Given some number of call sets, represent the calls in as consistent a manner as possible.
● Incrementally accumulate alleles from call sets.
● Recode call sets using accumulated alleles.
● Harmonization rather than Canonicalization (chosen representation comes from within rather than externally specified).
Example from v3.3 AJ trio
3 non-Mendelian calls become consistent on recoding.
12 original alleles recoded into 6 alleles.
Original
Position     REF     ALT  child  mother  father
1:73974514   GAACCC  G    .      0|1     .
1:73974515   A       T    0/1    .       .
1:73974516   ACCC    A    0/1    .       .
1:73974520   TC      T    .      0|1     .
1:73974521   CATA    C    0/1    .       .
1:73974524   A       C    .      0|1     .

Recoded
Position     REF     ALT  child  mother  father
1:73974515   A       T    0/1    0/1     .
1:73974516   ACCC    A    0/1    0/1     .
1:73974521   CATA    C    0/1    0/1     .
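One way to see why the recoding is legitimate is to replay each sample's original calls onto the reference and compare the resulting haplotypes. The sketch below does this for the child and mother calls in the example. The reference string is pieced together from the REF alleles in the table (positions 73974514 to 73974524), the helper name is hypothetical, and it assumes the child's three unphased calls sit on the same haplotype.

```python
REF_START = 73974514
# Reference bases 73974514..73974524, reconstructed from the REF alleles above.
REF = "GAACCCTCATA"


def apply_variants(variants, ref=REF, start=REF_START):
    """Replay non-overlapping (pos, ref_allele, alt_allele) calls onto the
    reference window and return the resulting haplotype sequence."""
    out = []
    pos = start
    for vpos, vref, valt in sorted(variants):
        if vpos < pos:
            raise ValueError("overlapping variants")
        offset = vpos - start
        if ref[offset:offset + len(vref)] != vref:
            raise ValueError(f"REF allele mismatch at {vpos}")
        out.append(ref[pos - start:offset])  # unchanged bases before the call
        out.append(valt)                     # substitute the ALT allele
        pos = vpos + len(vref)
    out.append(ref[pos - start:])            # unchanged tail
    return "".join(out)


child = [(73974515, "A", "T"), (73974516, "ACCC", "A"), (73974521, "CATA", "C")]
mother = [(73974514, "GAACCC", "G"), (73974520, "TC", "T"), (73974524, "A", "C")]

print(apply_variants(child))   # GTATC
print(apply_variants(mother))  # GTATC
```

Both call sets yield the same haplotype, so the mother's three calls can be rewritten using the child's three alleles without changing the sequence they describe.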
Notes and Limitations
● Recoding loses existing annotations. Could recover in simple cases, but not clear what to do when calls are moved, split, or combined as a result of the recoding.
● If a new call set needs to be added, its alleles can be accumulated incrementally, but existing call sets will need to be recoded.
● Final result is dependent on the order in which call sets are accumulated.
● Minimizes number of alleles (can in rare cases introduce Mendelian violations).
Phase Transfer
Another mode of operation for vcfeval. The phasing in one call set can be lifted over to another call set without losing annotations or changing the representation of calls.
v3.3 HG002/NA24385    9.7%
RTG AJ trio 300x      88.1%
phase-transferred     90.2%

chr20 NA12878 GATK    0%
RTG CEPH SP 37.7.0    99.9%
                      89.0%
Illumina PG 8.0.1     99.9%
phase-transferred     90.8%
Phase Transfer
During normal operation vcfeval ignores phasing information and tries each allele on each haplotype.
During phase transfer vcfeval will obey the phasing of one (or both) of the samples. Effectively restricts the matches that can be made. Ideally want at least one sample to be fully phased.
A special output mode is used to report the phasing found during the matching. Apart from the phasing, the calls are not changed and all the original annotations are retained.
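To illustrate how obeying phasing restricts the matches, the toy sketch below enumerates the ways a run of heterozygous calls can be placed on two haplotypes. It is not vcfeval's actual search, and the function names are hypothetical; it only shows that unphased calls allow both orientations while phased calls allow one.

```python
from itertools import product


def orientations(gt):
    """Allowed (haplotype 1, haplotype 2) allele placements for one genotype."""
    if "|" in gt:
        # Phased: the written order is obeyed.
        return [tuple(gt.split("|"))]
    a, b = gt.split("/")
    # Unphased: try each allele on each haplotype (one way if homozygous).
    return [(a, b)] if a == b else [(a, b), (b, a)]


def haplotype_pairs(genotypes):
    """All haplotype pairs consistent with a list of genotype strings."""
    pairs = []
    for combo in product(*(orientations(g) for g in genotypes)):
        hap1 = tuple(a for a, _ in combo)
        hap2 = tuple(b for _, b in combo)
        pairs.append((hap1, hap2))
    return pairs


print(len(haplotype_pairs(["0/1", "0/1", "0/1"])))  # 8
print(len(haplotype_pairs(["0|1", "0|1", "0|1"])))  # 1
```

Three unphased heterozygous calls admit eight placements, while the same calls fully phased admit exactly one, which is why at least one fully phased sample is desirable.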