The use of BEDTools to analyze CNV regions

Leandro Lima

CAG analytical meeting - Dec 10, 2014

Center for Applied Genomics

The Children’s Hospital of Philadelphia


Motivation

• In genetics, many analyses are applications of set theory or interval arithmetic

• Examples:

1. Check coverage in a genome / exome

2. Find de novo CNVs

3. Find novel CNVs by checking them against a database

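The BEDTools suite (some examples)

• intersect: report the intervals in one file that overlap (or, with -v, do not overlap) the intervals in another file

• cluster: label overlapping or nearby intervals with a shared cluster ID, without merging them

• merge: combine overlapping or book-ended intervals into a single interval

• genomecov: compute how much of each chromosome is covered by a set of intervals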

Evolution of DGV (Database of Genomic Variants)

• DGV is a curated catalogue of human genomic structural variation

• The database contains only structural variation identified in healthy control samples

• The Database of Genomic Variants provides a useful catalog of control data for studies aiming to correlate genomic variation with phenotypic data

Increase in Variation Data

Source: http://dgv.tcag.ca/dgv/app/statistics

Example 1 – Evolution of DGV per year, by chromosome

• First, select the variant regions from references published in or before a given year

• Then, use bedtools genomecov to get the percentage of each chromosome covered by those variant regions
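
The two steps above can be sketched in plain Python to show the arithmetic involved. This is an illustration only, not how bedtools genomecov is implemented. The record layout (chrom, start, end, year), the toy coordinates and chromosome lengths, and the function name covered_fraction are all assumptions made for the example.

```python
from collections import defaultdict


def covered_fraction(variants, chrom_sizes, max_year):
    """Fraction of each chromosome covered by variants published up to max_year.

    variants: iterable of (chrom, start, end, year), 0-based half-open as in BED.
    chrom_sizes: dict mapping chromosome name to its length in bases.
    """
    by_chrom = defaultdict(list)
    for chrom, start, end, year in variants:
        if year <= max_year:  # step 1: keep references up to the chosen year
            by_chrom[chrom].append((start, end))

    fractions = {}
    for chrom, size in chrom_sizes.items():
        covered = 0
        cur_start = cur_end = None
        # step 2: sweep the sorted intervals so each covered base is counted once
        for start, end in sorted(by_chrom.get(chrom, [])):
            if cur_end is None or start > cur_end:
                if cur_end is not None:
                    covered += cur_end - cur_start
                cur_start, cur_end = start, end
            else:
                cur_end = max(cur_end, end)
        if cur_end is not None:
            covered += cur_end - cur_start
        fractions[chrom] = covered / size
    return fractions


# Toy data (hypothetical, not taken from DGV)
variants = [
    ("chr1", 100, 300, 2004),
    ("chr1", 200, 500, 2006),
    ("chr1", 800, 900, 2010),
    ("chr2", 0, 250, 2008),
]
chrom_sizes = {"chr1": 1000, "chr2": 500}

for year in (2004, 2006, 2010):
    print(year, covered_fraction(variants, chrom_sizes, year))
```

With real data, the same result would come from filtering the DGV file by publication year and passing the filtered regions to bedtools genomecov together with a genome file of chromosome sizes.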

Increase in Variation Data (by chromosome)

Example 2: find de novo CNVs

• Step 1: merge the CNVs of both parents

• Step 2: keep the child’s CNVs that do not overlap any parental CNV

• merge: combine the CNV calls of both parents into a single set of non-overlapping regions
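
As a rough illustration of what merging the parental calls means, here is a minimal pure-Python sketch. It is not the bedtools implementation. The function name merge_intervals and the toy coordinates are assumptions, and intervals are treated as (chrom, start, end) tuples in BED-style coordinates.

```python
def merge_intervals(intervals):
    """Merge overlapping or book-ended (chrom, start, end) intervals."""
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Same chromosome and touching the previous region: extend it
            merged[-1] = (chrom, merged[-1][1], max(merged[-1][2], end))
        else:
            merged.append((chrom, start, end))
    return merged


# Toy CNV calls (hypothetical coordinates)
father = [("chr1", 100, 200), ("chr2", 50, 80)]
mother = [("chr1", 150, 300), ("chr1", 500, 600)]

parents = merge_intervals(father + mother)
print(parents)
# [('chr1', 100, 300), ('chr1', 500, 600), ('chr2', 50, 80)]
```

On the command line, the equivalent would be to concatenate and sort the two parental files before running bedtools merge, since merge expects sorted input.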

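• intersect -v: compare the child’s CNVs with the merged parents’ CNVs. Child CNVs that overlap a parental CNV are inherited; child CNVs with no overlap (the ones reported by -v) are de novo.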
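
The inherited versus de novo split can be sketched in a few lines of Python. This is a simplified illustration that treats any overlap of at least one base as a match, which is the default behaviour of bedtools intersect. The function names and toy coordinates are assumptions made for the example.

```python
def overlaps(a, b):
    """True if two (chrom, start, end) intervals share at least one base."""
    return a[0] == b[0] and a[1] < b[2] and b[1] < a[2]


def split_by_overlap(child, parents):
    """Split child CNVs into (inherited, de_novo) relative to parental CNVs."""
    inherited, de_novo = [], []
    for cnv in child:
        if any(overlaps(cnv, p) for p in parents):
            inherited.append(cnv)
        else:
            de_novo.append(cnv)  # what `intersect -v` would report
    return inherited, de_novo


# Toy data (hypothetical coordinates)
parents = [("chr1", 100, 300), ("chr1", 500, 600), ("chr2", 50, 80)]
child = [("chr1", 250, 350), ("chr1", 700, 800), ("chr2", 10, 40)]

inherited, de_novo = split_by_overlap(child, parents)
print("inherited:", inherited)
print("de novo:  ", de_novo)
```

The nested loop is quadratic and is only meant to make the logic visible. bedtools handles large files far more efficiently.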

• cluster (to find de novo CNVs that occur in more than one family)
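
A minimal sketch of the clustering idea follows, assuming each de novo call carries a family identifier as a fourth field. That layout, the function name recurrent_clusters, and the toy data are assumptions for illustration, not part of the original analysis. Overlapping calls are grouped, and only the groups containing more than one family are kept.

```python
def recurrent_clusters(de_novo_calls):
    """Group overlapping de novo CNVs and keep clusters seen in more than one family.

    de_novo_calls: list of (chrom, start, end, family_id) tuples.
    """
    clusters = []
    current = []
    cur_chrom, cur_end = None, None
    for call in sorted(de_novo_calls):
        chrom, start, end, _family = call
        if current and chrom == cur_chrom and start <= cur_end:
            # Overlaps (or touches) the running cluster: add it and extend the end
            current.append(call)
            cur_end = max(cur_end, end)
        else:
            if current:
                clusters.append(current)
            current = [call]
            cur_chrom, cur_end = chrom, end
    if current:
        clusters.append(current)
    # Keep only clusters whose calls come from at least two distinct families
    return [c for c in clusters if len({call[3] for call in c}) > 1]


# Toy de novo calls pooled across families (hypothetical)
calls = [
    ("chr1", 700, 800, "fam1"),
    ("chr1", 750, 900, "fam2"),
    ("chr2", 10, 40, "fam1"),
    ("chr3", 5, 50, "fam3"),
    ("chr3", 60, 90, "fam3"),
]

for cluster in recurrent_clusters(calls):
    print(cluster)
```

With bedtools, the pooled and sorted de novo calls would be passed to bedtools cluster, which appends a cluster ID column. Counting distinct families per cluster ID is then a separate step.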

Example 3: find novel CNVs

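• intersect -v: compare the de novo CNVs with the DGV regions. De novo CNVs that overlap a DGV region are previously reported; those with no overlap are novel (never reported).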

Questions?