Assignment 9 - Washington University...
![Page 1: Assignment 9 - Washington University Geneticsgenetics.wustl.edu/bio5488/files/2017/03/Assignment-9.pdf · Records in a VCF file 5 SNV INS DEL DEL Profiling counts Class of genome](https://reader034.fdocuments.in/reader034/viewer/2022042417/5f33313912babd37020d6eeb/html5/thumbnails/1.jpg)
Assignment 9
Modified from Mayank’s notes from 2016
Extremely (re)productive F1s!
Anatomy of a VCF file
Source: http://gatkforums.broadinstitute.org/gatk/discussion/1268/what-is-a-vcf-and-how-should-i-interpret-it
Header of a VCF file
Source: http://gatkforums.broadinstitute.org/gatk/discussion/1268/what-is-a-vcf-and-how-should-i-interpret-it
Records in a VCF file
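Record types shown in the example: SNV, INS, DEL, DEL

Profiling counts

| Class of genome variation | Count |
|---|---|
| SNVs | … |
| indels | … |
| DEL | … |
| DUP | … |
| INV | … |
| MEIs | … |
| BNDs | … |
| Total GV | … |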
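
The counting table above can be approached with a small record classifier. The following is only a sketch under stated assumptions, not the assignment's reference solution: it assumes structural variants carry an `SVTYPE=<class>` entry in the INFO column, and that small variants can be told apart by comparing REF and ALT lengths. The helper names `classify_record` and `count_classes` and the example records are hypothetical.

```python
from collections import Counter


def classify_record(line):
    """Return a variation class for one tab-delimited VCF data line."""
    fields = line.rstrip("\n").split("\t")
    ref, alt, info = fields[3], fields[4], fields[7]
    # Assumption: structural variants carry an SVTYPE=<class> entry in INFO.
    for entry in info.split(";"):
        if entry.startswith("SVTYPE="):
            return entry.split("=", 1)[1]
    # Otherwise compare allele lengths (first ALT allele only, for simplicity).
    first_alt = alt.split(",")[0]
    if len(ref) == 1 and len(first_alt) == 1:
        return "SNV"
    if len(ref) != len(first_alt):
        return "indel"
    return "other"  # e.g. equal-length multi-base substitutions


def count_classes(lines):
    """Count variation classes, skipping header lines (which start with '#')."""
    counts = Counter()
    for line in lines:
        if line.startswith("#") or not line.strip():
            continue
        counts[classify_record(line)] += 1
    return counts


# Made-up records for illustration only.
example = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "1\t100\t.\tA\tG\t50\tPASS\t.",
    "1\t200\t.\tAT\tA\t50\tPASS\t.",
    "1\t300\t.\tN\t<DEL>\t50\tPASS\tSVTYPE=DEL;END=900",
]
counts = count_classes(example)
print(dict(counts))
```

The total is then `sum(counts.values())`. Whether MEIs and BNDs appear as `SVTYPE` values in the actual input file should be checked against its header.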
Plot the size distributions
Histogram panels: indels, DEL, MEIs
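
One possible way to obtain the sizes for these histograms is sketched below. It assumes structural variants report their length as `SVLEN=<int>` in the INFO column and that indel size is the difference between ALT and REF lengths; both are assumptions to verify against the input file. The function names are hypothetical, and the plotting helper relies on matplotlib being installed.

```python
def variant_size(line):
    """Return the size in bp of one tab-delimited VCF data line, or None."""
    fields = line.rstrip("\n").split("\t")
    ref, alt, info = fields[3], fields[4].split(",")[0], fields[7]
    # Assumption: structural variants report their length as SVLEN=<int> in INFO.
    for entry in info.split(";"):
        if entry.startswith("SVLEN="):
            return abs(int(entry.split("=", 1)[1].split(",")[0]))
    # Symbolic or breakend alleles carry no sequence, so no size can be derived.
    if alt.startswith("<") or "[" in alt or "]" in alt:
        return None
    # Sequence alleles: length difference between ALT and REF (0 for SNVs).
    return abs(len(alt) - len(ref))


def plot_histogram(sizes, title, out_png):
    """Save a histogram with labelled axes and a title (requires matplotlib)."""
    import matplotlib
    matplotlib.use("Agg")  # render to file without a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.hist(sizes, bins=50)
    ax.set_xlabel("Variant size (bp)")
    ax.set_ylabel("Number of variants")
    ax.set_title(title)
    fig.savefig(out_png, dpi=150)
    plt.close(fig)


# Made-up records for illustration only.
example = [
    "1\t200\t.\tATT\tA\t50\tPASS\t.",
    "1\t300\t.\tN\t<DEL>\t50\tPASS\tSVTYPE=DEL;SVLEN=-800;END=1100",
]
sizes = [variant_size(record) for record in example]
print(sizes)
```

A call such as `plot_histogram(deletion_sizes, "Deletion size distribution", "histogram_deletions.png")` would then write one of the required figures. A log-scaled x-axis may be worth trying if sizes span several orders of magnitude.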
Zygosity explained
Figure: one pair of homologous chromosomes (haplotypes ha and hb) compared against the reference genome.

| GT field | Zygosity |
|---|---|
| 0/0 | Homozygous reference |
| 1/1 | Homozygous alternate |
| 0/1 | Heterozygous |
| ./. | Missing |
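
The mapping from GT values to zygosity can be written as a short function. This is a minimal sketch: the name `zygosity` is hypothetical, and it assumes the GT string has already been extracted from the sample column (GT is the first colon-separated subfield when the FORMAT column lists it first).

```python
import re


def zygosity(gt):
    """Classify a GT string such as '0/1' or '1|1'."""
    # Alleles are separated by '/' (unphased) or '|' (phased).
    alleles = re.split(r"[/|]", gt)
    if "." in alleles:
        return "missing"
    if len(set(alleles)) > 1:
        return "heterozygous"
    if alleles[0] == "0":
        return "homozygous reference"
    return "homozygous alternate"


for gt in ["0/0", "1/1", "0/1", "./."]:
    print(gt, zygosity(gt))
```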
Trio analysis to look for violations of Mendel’s Law of Segregation
Image source: http://commons.wikimedia.org/wiki/File:Autorecessive.svg
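
One way to express the trio check is sketched below, assuming diploid genotypes: a child's genotype is consistent with segregation if one of its alleles could have come from the father and the other from the mother. The function names are hypothetical, and this is not necessarily how violate_MS.py is expected to be structured.

```python
import re


def split_alleles(gt):
    """Split a GT string such as '0/1' or '1|0' into its alleles."""
    return re.split(r"[/|]", gt)


def is_mendelian_violation(father_gt, mother_gt, child_gt):
    """Return True for a violation, False if consistent, None if any call is missing."""
    father = split_alleles(father_gt)
    mother = split_alleles(mother_gt)
    child = split_alleles(child_gt)
    if "." in father + mother + child:
        return None  # cannot be evaluated with a missing genotype
    first, second = child  # assumption: the child genotype is diploid
    consistent = (first in father and second in mother) or (
        first in mother and second in father
    )
    return not consistent


print(is_mendelian_violation("0/0", "0/0", "0/1"))
print(is_mendelian_violation("0/1", "0/1", "1/1"))
```

In practice, it may be sensible to apply a genotype-quality threshold to all three calls before counting a site as a violation, since an apparent violation can also be a genotyping error.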
GQ—Genotype Quality
Image source: http://riddhitubes.com/images/quality-stamp.png
The following formula relates a given GQ value X to the probability that the genotype call is INCORRECT:
$X = -10 \log_{10} P(\text{genotype call is incorrect})$, or equivalently $P(\text{genotype call is incorrect}) = 10^{-X/10}$
For instance, a GQ value of 20 means that you are 99% sure your genotype call is correct, or there is a 1% chance your genotype call is incorrect.
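
The GQ formula can be checked numerically with a few lines of Python. The function names here are hypothetical and simply apply the formula in both directions.

```python
import math


def gq_to_error_probability(gq):
    """Probability that the genotype call is incorrect, given GQ."""
    return 10 ** (-gq / 10)


def error_probability_to_gq(p_incorrect):
    """GQ corresponding to a probability that the call is incorrect."""
    return -10 * math.log10(p_incorrect)


for gq in (10, 20, 30):
    p = gq_to_error_probability(gq)
    print(f"GQ={gq}: P(incorrect)={p:.4f}, P(correct)={1 - p:.4f}")
```

Each increase of 10 in GQ lowers the error probability by a factor of ten.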
Assignment 9 requirements
• Input files located in /home/assignments/assignment9/
• Important: DO NOT copy the input data files to /work/. Instead, reference them by their full path, e.g. python3 count_gv.py /home/assignments/assignment9/sv.reclassed.filtered.vcf
• Your submission folder should contain:
  • A completed README.txt
  • Commented scripts:
    • count_gv.py
    • quantify_genotype.py
    • violate_MS.py
  • Figures appropriately scaled with labelled axes and informative titles:
    • histogram_indels.png
    • histogram_deletions.png
    • histogram_meis.png
• Due Wednesday (29th March ‘17) at 10:00 AM