The use of BEDTools to analyze CNV regions
Leandro Lima
CAG analytical meeting - Dec 10, 2014
Center for Applied Genomics
The Children’s Hospital of Philadelphia
Motivation
• In genetics, many analyses are applications of set theory or interval arithmetic
• Examples:
1. Check coverage in a genome / exome
2. Find de novo CNVs
3. Find novel CNVs in a database
Evolution of DGV (Database of Genomic Variants)
• DGV is a curated catalogue of human genomic structural variation
• The database contains only structural variation identified in healthy control samples
• The Database of Genomic Variants provides a useful catalog of control data for studies aiming to correlate genomic variation with phenotypic data
Increase in Variation Data
Source: http://dgv.tcag.ca/dgv/app/statistics
Example 1: Evolution of DGV per year, by chromosome
• First, select the variant regions whose reference has a year of publication earlier than or equal to a specific year
• Then, use bedtools genomecov to get the percentage of each chromosome covered by the variant regions
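The interval arithmetic behind these two steps can be sketched in plain Python. This is only an illustration of the idea, not the commands used in the talk: the record layout (chromosome, start, end, publication year), the toy coordinates, and the chromosome sizes are all made-up assumptions, and the function names are hypothetical. In practice the coverage step would be done with bedtools genomecov on the filtered regions.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extends the previous interval instead of starting a new one.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def coverage_by_chrom(variants, chrom_sizes, max_year):
    """Percentage of each chromosome covered by variants published up to max_year."""
    by_chrom = defaultdict(list)
    for chrom, start, end, year in variants:
        if year <= max_year:  # step 1: keep regions published on or before max_year
            by_chrom[chrom].append((start, end))
    result = {}
    for chrom, size in chrom_sizes.items():
        # Step 2: count each covered base once, then divide by chromosome length.
        covered = sum(e - s for s, e in merge_intervals(by_chrom[chrom]))
        result[chrom] = 100.0 * covered / size
    return result


# Hypothetical toy data (not real DGV records or real chromosome lengths).
variants = [
    ("chr1", 100, 300, 2004),
    ("chr1", 250, 400, 2006),
    ("chr1", 600, 700, 2010),
    ("chr2", 0, 50, 2006),
]
chrom_sizes = {"chr1": 1000, "chr2": 500}

for year in (2004, 2006, 2010):
    print(year, coverage_by_chrom(variants, chrom_sizes, year))
```

Merging before summing matters: overlapping variant regions would otherwise be counted twice and inflate the percentage.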
Example 2: find de novo CNVs
• Step 1: merge the parents' CNVs
• Step 2: get the child's CNV regions that do not overlap the merged parental CNVs
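A minimal sketch of this trio logic in plain Python, reading the two steps as: merge the father's and mother's CNVs, then keep the child's CNVs that overlap none of them. The coordinates are invented toy data and the function names are hypothetical. With BEDTools, the same idea would presumably map to bedtools merge on the combined parental calls followed by bedtools intersect with the -v option, but the exact commands used in the talk are not shown here.

```python
from collections import defaultdict


def merge_intervals(intervals):
    """Merge overlapping or touching half-open (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [(s, e) for s, e in merged]


def overlaps(a, b):
    """True if half-open intervals a and b share at least one base."""
    return a[0] < b[1] and b[0] < a[1]


def de_novo_cnvs(child, father, mother):
    """Child CNVs that overlap no CNV from either parent."""
    # Step 1: pool and merge the parental CNVs per chromosome.
    parents = defaultdict(list)
    for chrom, start, end in father + mother:
        parents[chrom].append((start, end))
    merged = {chrom: merge_intervals(iv) for chrom, iv in parents.items()}
    # Step 2: keep child CNVs with no overlap against the merged set.
    return [
        (chrom, start, end)
        for chrom, start, end in child
        if not any(overlaps((start, end), p) for p in merged.get(chrom, []))
    ]


# Hypothetical toy trio (not real calls).
father = [("chr1", 100, 200), ("chr2", 500, 800)]
mother = [("chr1", 150, 300)]
child = [("chr1", 120, 180), ("chr1", 400, 450), ("chr2", 900, 950), ("chr3", 10, 20)]

candidates = de_novo_cnvs(child, father, mother)
print(candidates)
```

This sketch treats any overlap of one base or more as inherited. A real analysis might instead require a minimum overlap fraction, which is a design choice left open here.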