myCoGe
Comparing our genomes
Background and Introduction
Decreases in sequencing costs, coupled with increases in speed, have paved the way for “Personal Genomics”.
Companies now providing sequencing include:
23andMe ($99)
AncestryDNA ($99)
CompleteGenomics ($5000)
Counsyl ($1000)
Ubiome ($89-$400)
Genelex
…and more!
Huge set of data provides lots of promise for researchers.
600k of 23andMe’s 800k customers have consented to using data for research.
Multiple sources now provide means for individuals to share their genetics and health histories with researchers, e.g. Personal Genome Project, OpenHuman.
Unfortunately, data from different sources cannot be directly compared.
What is myCoGe?
Goal of myCoGe Data Integration Pipeline
Provide a mechanism for automated retrieval of publicly available genomic experiment datasets for import into CoGe.
Provide the necessary tools for converting raw experiment files to formats accepted by CoGe.
Provide tools for converting experiments to use the same reference genome.
Ultimate Goal of myCoGe
Provide a powerful framework of tools and datasets to allow for analyses into how variation affects function in human genomes.
Provide a useful toolbox for individuals to investigate their own personal genetic data.
myCoGe Data Integration Conceptual Pipeline
Figure labels: Review, Download, Convert, Load, Identify
Operational File Structure
myCoGe Data Integration Full Pipeline
Fun Facts
Speed Benchmarks
Slowest process: loading 20 GB reference SNP file, ~4 min
Convert 900,000 SNPs from reference file: 5-30 seconds
Lines of Code
Initiate: 123 lines
myCoGe: 692 lines
Finalize: 11 lines
Execute_myCoGe: 3 lines
SNPScraper: 59 lines
Total: 888 lines
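The SNP conversion step timed above could work roughly as follows. This is a minimal sketch, not the myCoGe code: it assumes the reference file is a tab-separated table of rsID, chromosome, and position, and that experiments use 23andMe-style rows (rsID, chromosome, position, genotype). The function names and the in-memory sample data are hypothetical.

```python
import csv
import io


def load_reference(handle):
    """Build an rsID -> (chromosome, position) lookup from a tab-separated table."""
    lookup = {}
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment/header lines
        rsid, chrom, pos = row[0], row[1], int(row[2])
        lookup[rsid] = (chrom, pos)
    return lookup


def convert_experiment(handle, lookup):
    """Re-position each SNP of an experiment using the reference lookup.

    Returns (converted, skipped): converted rows carry the reference
    coordinates, skipped lists rsIDs that are missing from the reference.
    """
    converted, skipped = [], []
    for row in csv.reader(handle, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue
        rsid, genotype = row[0], row[3]
        if rsid in lookup:
            chrom, pos = lookup[rsid]
            converted.append((rsid, chrom, pos, genotype))
        else:
            skipped.append(rsid)
    return converted, skipped


# Hypothetical sample data standing in for the real files.
reference = io.StringIO("#rsid\tchrom\tpos\nrs1\t1\t1000\nrs2\t2\t2500\n")
experiment = io.StringIO(
    "# rsid\tchromosome\tposition\tgenotype\nrs1\t1\t990\tAG\nrs9\t3\t42\tCC\n"
)

lookup = load_reference(reference)
converted, skipped = convert_experiment(experiment, lookup)
print(converted)
print(skipped)
```

Loading the lookup once and reusing it for every experiment is consistent with the benchmark pattern above: one slow load, then fast per-experiment conversions.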
Initial Execution
Complete pipeline was executed Friday, May 1st.
Initial query of PGP obtained 579 potential experiments and associated metadata.
Complications
PGP servers slow, largely unresponsive.
Through the weekend, just under 100 experiments could be downloaded.
Of these, 79 yielded good results.
CoGe API Load Experiment not functional.
Code for loading is complete, but CoGe returns an authentication error.
Reference genome chromosome names are NCBI IDs instead of numbers.
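The chromosome naming mismatch could be handled with a small translation step. A minimal sketch, assuming the NCBI IDs are RefSeq accessions with an optional version suffix (NC_000001 to NC_000022 for the autosomes, NC_000023 for X, NC_000024 for Y, NC_012920 for the mitochondrial genome). The function name is hypothetical, and the identifiers actually used in the CoGe reference genome are not shown in the slides.

```python
import re

# Assumed RefSeq numbering for the human sex chromosomes and mitochondrion.
SEX_CHROMOSOMES = {23: "X", 24: "Y"}
MITOCHONDRION = "NC_012920"


def refseq_to_chrom(accession):
    """Map a RefSeq chromosome accession to a plain chromosome name.

    Returns None for accessions that are not recognized (e.g. unplaced scaffolds).
    """
    base = accession.split(".", 1)[0]  # drop the version suffix, if any
    if base == MITOCHONDRION:
        return "MT"
    match = re.fullmatch(r"NC_0000(\d{2})", base)
    if match is None:
        return None
    number = int(match.group(1))
    if 1 <= number <= 22:
        return str(number)
    return SEX_CHROMOSOMES.get(number)


for name in ["NC_000001.10", "NC_000023.11", "NC_012920.1", "NT_187361.1"]:
    print(name, refseq_to_chrom(name))
```

Returning None for unrecognized accessions lets the caller decide whether to skip or report those records instead of silently mislabeling them.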
Future Directions
myCoGe Data-Integration Pipeline
Functional CoGe API loading.
Increased stability in face of poor connections.
Expanded file types.
Expanded experiment sources.
Automated execution.
myCoGe
Web-based personal data integration
Integrated comparison tools
Gene model annotations
Functional and expression experiments
Full-genome sequencing
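One common way to get more stability over poor connections is to retry failed downloads with exponential backoff. The sketch below illustrates that general technique and is not part of the existing pipeline; the function name, its parameters, and the default values are assumptions. The `opener` and `sleep` arguments exist only so the behavior can be exercised without a network.

```python
import time
import urllib.request


def download_with_retry(url, attempts=5, base_delay=1.0, timeout=30,
                        opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch a URL, retrying on network errors with exponential backoff.

    Waits base_delay, 2*base_delay, 4*base_delay, ... seconds between tries
    and raises RuntimeError once all attempts have failed.
    """
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(url, timeout=timeout) as response:
                return response.read()
        except OSError as error:  # covers URLError, timeouts, connection resets
            last_error = error
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    raise RuntimeError(f"gave up on {url} after {attempts} attempts") from last_error
```

A caller would wrap each experiment download in `download_with_retry(url)` and log the `RuntimeError` for any file that still fails, so one unresponsive server does not stop the whole run.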