Considerations for Analyzing Targeted NGS Data BRCA

Considerations for Analyzing Targeted NGS Data BRCA, by Tim Hague, CTO


Transcript of Considerations for Analyzing Targeted NGS Data BRCA

Page 1: Considerations for Analyzing Targeted NGS Data BRCA

Considerations for Analyzing Targeted NGS Data

BRCA

Tim Hague, CTO

Page 2: Considerations for Analyzing Targeted NGS Data BRCA

Introduction

BRCA1 and BRCA2 are best known as 'cancer susceptibility' genes

In fact, the proteins they encode repair damage in DNA

Large number of known deleterious mutations

Disproportionate number of indels

Page 3: Considerations for Analyzing Targeted NGS Data BRCA

History

Mary-Claire King discovered BRCA1 and BRCA2, published the function

Myriad Genetics won the patent

Page 4: Considerations for Analyzing Targeted NGS Data BRCA

Figure: Distribution of known BRCA1 deletions >3 bp (axis: indel size, nt)

Page 5: Considerations for Analyzing Targeted NGS Data BRCA

Dominique Stoppa-Lyonnet at the Curie Institute

"Large-scale deletions could account for as many as one-third of all BRCA1 mutations in some populations"

Page 6: Considerations for Analyzing Targeted NGS Data BRCA

BRCA1 and BRCA2 are tumor suppressor genes. Mutation carriers have an 82% lifetime chance of developing breast/ovarian cancer.

Science 2004, 306:2187-2191

>1,500 deleterious BRCA mutations

17 kbp coding region with mutation rate of 1/2000

NGS-based BRCA screening: Leeds (UK), Newgene (UK), Ghent (Belgium)

DIY genetic test published by Salzberg

Page 7: Considerations for Analyzing Targeted NGS Data BRCA

82% chance of cancer

>90% chance of being false positive/negative

Page 8: Considerations for Analyzing Targeted NGS Data BRCA
Page 9: Considerations for Analyzing Targeted NGS Data BRCA

What kind of NGS data?

False negatives must be avoided

Precision of both the sequencing data and the data analysis is key

Looking for indels: indel detection ability is a key criterion

Repeats are also an issue in the BRCA region

Page 10: Considerations for Analyzing Targeted NGS Data BRCA

BRCA Repeats

Page 11: Considerations for Analyzing Targeted NGS Data BRCA

Homopolymer Errors

Homopolymer errors look like small indels and can cause noise

Problem for: Roche 454, Ion Torrent
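
One way to make the homopolymer issue concrete is to locate homopolymer runs in the reference and flag any candidate indel that falls inside one, so that it can be reviewed with more suspicion. The following is a minimal sketch under stated assumptions: the function names, the toy sequence, and the `min_run=5` threshold are illustrative choices and do not come from the slides.

```python
from itertools import groupby


def homopolymer_runs(seq, min_run=5):
    """Return (start, end, base) for each run of a single base >= min_run long.

    Coordinates are 0-based and end-exclusive. min_run=5 is an illustrative
    threshold, not a value taken from the presentation.
    """
    runs = []
    pos = 0
    for base, group in groupby(seq.upper()):
        length = sum(1 for _ in group)
        if length >= min_run:
            runs.append((pos, pos + length, base))
        pos += length
    return runs


def in_homopolymer(indel_pos, runs):
    """True if a 0-based indel position lies inside, or at the edge of, a run."""
    return any(start <= indel_pos <= end for start, end, _ in runs)


# Toy reference fragment (hypothetical, not real BRCA sequence).
reference = "ACGTAAAAAAGGCTTTTTTTCA"
runs = homopolymer_runs(reference)
print(runs)                      # [(4, 10, 'A'), (13, 20, 'T')]
print(in_homopolymer(7, runs))   # True
print(in_homopolymer(2, runs))   # False
```

A flag like this does not decide whether an indel is real; it only marks calls in a context where 454 or Ion Torrent data is more likely to produce noise.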

Page 12: Considerations for Analyzing Targeted NGS Data BRCA

Long Reads

Read length is a limiting factor for insertion detection.

When searching for indels, long reads can help. Long reads can also help with repeats.

Roche 454 has the longest reads.

Page 13: Considerations for Analyzing Targeted NGS Data BRCA

Real examples with Roche 454 data

Page 14: Considerations for Analyzing Targeted NGS Data BRCA

Real examples with Roche 454 data

Page 15: Considerations for Analyzing Targeted NGS Data BRCA

Paired Reads

Paired reads can also help to increase effective 'read length'

Illumina MiSeq now has a 2x250 bp protocol

Page 16: Considerations for Analyzing Targeted NGS Data BRCA
Page 17: Considerations for Analyzing Targeted NGS Data BRCA

Comparison of 9 open-source and commercial NGS analysis software tools

In silico test with mutated reference BRCA gene

2211 known BRCA variants: 1341 SNPs, 320 insertions and 551 deletions

Full GATK pipeline used for variant calling, including quality recalibration and indel realignment
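
The slides do not describe how the mutated reference was built. As a rough illustration of the idea, the sketch below applies a list of known variants to a reference string; reads simulated from the result would then carry variants whose positions are known in advance. The function name, the toy sequence, and the variant tuples are all hypothetical.

```python
def apply_variants(reference, variants):
    """Return a copy of reference with the given variants applied.

    Each variant is (pos, ref, alt) with a 0-based position, in the spirit
    of VCF records: SNP ("C" -> "T"), insertion ("A" -> "ATT"),
    deletion ("ACG" -> "A"). Variants are assumed not to overlap.
    """
    seq = reference
    # Apply right to left so earlier coordinates stay valid after indels.
    for pos, ref, alt in sorted(variants, reverse=True):
        if seq[pos:pos + len(ref)] != ref:
            raise ValueError(f"reference mismatch at {pos}: expected {ref}")
        seq = seq[:pos] + alt + seq[pos + len(ref):]
    return seq


# Toy data (hypothetical, not real BRCA sequence or variants).
toy_reference = "ACGTACGTACGT"
toy_variants = [
    (1, "C", "T"),      # SNP
    (4, "A", "ATT"),    # 2 bp insertion
    (8, "ACG", "A"),    # 2 bp deletion
]
mutated = apply_variants(toy_reference, toy_variants)
print(mutated)  # ATGTATTCGTAT
```

Because the inserted variants are known, every call made by an aligner and variant caller on the simulated reads can be scored as a true positive, false positive, or false negative.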

Page 18: Considerations for Analyzing Targeted NGS Data BRCA
Page 19: Considerations for Analyzing Targeted NGS Data BRCA
Page 20: Considerations for Analyzing Targeted NGS Data BRCA
Page 21: Considerations for Analyzing Targeted NGS Data BRCA

BWA

Overall sensitivity: 99.2% Paired End (PE), 94.5% Single End (SE)

SNPs found: 99.5% PE, 99.5% SE

Deletions found: 98.5% PE, 85.5% SE

Insertions found: 99.4% PE, 89.4% SE

Page 22: Considerations for Analyzing Targeted NGS Data BRCA

BWA

False negatives: 17 Paired End, 121 Single End

False positives: 23 PE, 168 SE

The longest deletions (60 bp and over) were not found with either PE or SE data
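
The false negative counts can be tied back to the overall sensitivity figures with a short calculation. This is a sketch that assumes the denominator is the 2211 known variants stated for the in silico test; the precision values it prints are derived here from the false positive counts and are not figures given in the slides.

```python
def sensitivity(total_variants, false_negatives):
    """Fraction of known variants that were detected: TP / (TP + FN)."""
    true_positives = total_variants - false_negatives
    return true_positives / total_variants


def precision(total_variants, false_negatives, false_positives):
    """Fraction of reported variants that are real: TP / (TP + FP)."""
    true_positives = total_variants - false_negatives
    return true_positives / (true_positives + false_positives)


TOTAL_VARIANTS = 2211  # assumption: total from the in silico test description

for label, fn, fp in [("Paired End", 17, 23), ("Single End", 121, 168)]:
    sens = sensitivity(TOTAL_VARIANTS, fn)
    prec = precision(TOTAL_VARIANTS, fn, fp)
    print(f"{label}: sensitivity {sens:.1%}, precision {prec:.1%}")
```

With these inputs the sensitivity comes out at 99.2% for paired-end and 94.5% for single-end data, matching the overall BWA figures reported earlier.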

Page 23: Considerations for Analyzing Targeted NGS Data BRCA

Indel sizes - BWA Single End

Page 24: Considerations for Analyzing Targeted NGS Data BRCA

Indel sizes - BWA Paired End

Page 25: Considerations for Analyzing Targeted NGS Data BRCA

Other Tools

Most other alignment tools showed a similar trend: much better results overall with paired data

Only two of the tools tested found the longest deletions, even with paired data

Page 26: Considerations for Analyzing Targeted NGS Data BRCA

Paired Reads - Conclusions

Much better for reliable variant detection than single reads of equivalent length

Provided much better coverage in the BRCA region (spanning small repeats)

If available, paired reads should be preferred

Page 27: Considerations for Analyzing Targeted NGS Data BRCA

Indel Detection - Conclusions

Not all tools are good at finding indels.

Burrows-Wheeler-based aligners can't find indels beyond a few base pairs in single reads, but can make better use of paired data, provided indel realignment is also used.

They still can't detect the longest indels (there is just a gap in coverage).

If indel detection is required, an indel-sensitive tool should be used
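
Since a missed long deletion shows up only as a gap in coverage, one simple cross-check is to scan per-base depth for long uncovered stretches and treat them as candidates for manual review. The sketch below assumes per-base depths are already available as a list; the function name and toy depths are hypothetical, and the 60 bp default simply mirrors the size of the deletions that BWA missed. A gap can also come from failed capture or amplification, and a heterozygous deletion would be expected to reduce depth rather than remove it, so this is a prompt for inspection and not a variant call.

```python
def coverage_gaps(depths, min_length=60, max_depth=0):
    """Return (start, end) for runs with depth <= max_depth of >= min_length bases.

    Coordinates are 0-based and end-exclusive.
    """
    gaps = []
    start = None
    for i, depth in enumerate(depths):
        if depth <= max_depth:
            if start is None:
                start = i  # a low-coverage run begins here
        else:
            if start is not None and i - start >= min_length:
                gaps.append((start, i))
            start = None
    # Handle a run that extends to the end of the region.
    if start is not None and len(depths) - start >= min_length:
        gaps.append((start, len(depths)))
    return gaps


# Toy depth profile: a 75 bp gap and a 10 bp gap (hypothetical values).
depths = [30] * 100 + [0] * 75 + [28] * 50 + [0] * 10 + [31] * 40
print(coverage_gaps(depths))  # [(100, 175)]
```

Raising `max_depth` above zero would also catch regions where only a few stray reads align across the deleted stretch.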

Page 28: Considerations for Analyzing Targeted NGS Data BRCA

Overall - Conclusions

None of the alignment tools found all the variants

The same data will almost certainly need to be analyzed with more than one tool to get sufficiently accurate results
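
Combining results from more than one tool can be as simple as taking the union of the call sets and recording which tools support each variant. The following is a minimal sketch, assuming each tool's calls have already been reduced to (chromosome, position, ref, alt) tuples; the tool names and coordinates are made up for illustration.

```python
from collections import defaultdict


def merge_calls(calls_by_tool):
    """Map each variant to the sorted list of tools that reported it."""
    support = defaultdict(set)
    for tool, calls in calls_by_tool.items():
        for variant in calls:
            support[variant].add(tool)
    return {variant: sorted(tools) for variant, tools in support.items()}


# Hypothetical call sets: (chromosome, position, ref, alt).
calls_by_tool = {
    "tool_a": {("chr17", 100, "A", "G"), ("chr17", 250, "CT", "C")},
    "tool_b": {("chr17", 100, "A", "G"), ("chr17", 900, "G", "GTTA")},
}

merged = merge_calls(calls_by_tool)
for variant in sorted(merged):
    print(variant, merged[variant])

# Variants seen by only one tool are the ones a single-tool analysis would miss.
single_tool_only = sorted(v for v, tools in merged.items() if len(tools) == 1)
print(single_tool_only)
```

Taking the union favours sensitivity, which fits the goal of avoiding false negatives, at the cost of more calls to verify. In practice indels would need to be normalized to a common representation before comparison, because different tools can report the same indel at different positions within a repeat.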