H Mishima - Biogem, Ruby UCSC API, and BioRuby
![Page 1: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/1.jpg)
Biogem, Ruby UCSC API, and BioRuby
Hiroyuki Mishima (Nagasaki University), Raoul J.P. Bonnal, Naohisa Goto, Francesco Strozzi, Toshiaki Katayama, Pjotr Prins
![Page 2: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/2.jpg)
BioRuby
• a bioinformatics library for the Ruby language
• >11 years - project since Nov. 21, 2000
![Page 3: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/3.jpg)
BioRuby
is an open-source project
BUT, I HAVE A QUESTION...
![Page 4: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/4.jpg)
Are open source projects truly open?
![Page 5: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/5.jpg)
Aspects of the word ‘OPEN’
•OPEN for redistribution
•OPEN for source code access
•OPEN for contribution
![Page 6: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/6.jpg)
CENTRALIZED APPROACH
• Pros
– QC for stability and consistency
– easy to apply coding standards
– enables extensive tests and documentation
• Cons
– heavy burden on release managers
– longer process, sparser releases
– lack of cutting-edge features
![Page 7: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/7.jpg)
Two ways to participate in BioRuby development
1. Be a committer
   1. be a trusted contributor in the community
   2. get an open-bio.org account
   3. be a CVS/SVN committer
2. Send patches to (busy) core members
   1. wait for patch evaluation
   2. wait for the next release of BioRuby
![Page 8: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/8.jpg)
BARRIERS TO ENTRY
![Page 9: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/9.jpg)
Lower the barrier to entry!
![Page 10: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/10.jpg)
Actions of BioRuby
• more OPEN for source code access
•more OPEN for contribution
![Page 11: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/11.jpg)
ACTION 1
Social Coding Using GitHub
In 2010, the BioRuby project source repository moved to GitHub.
![Page 12: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/12.jpg)
• Users can fork the code freely.
• Users still have to wait for acceptance of pull requests to get their code incorporated into the official repository.
![Page 13: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/13.jpg)
ACTION 2
Plug-in system - BioGem
![Page 14: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/14.jpg)
DECENTRALIZED APPROACH
• enables expanding BioRuby without tweaking its stable core
• plug-ins are maintained by their authors
• encourages ‘best practice’ using a tool (biogem command)
– standard directory structure
– version control using Git
– using the RubyGems packaging system
– testing and documentation
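A minimal sketch of the decentralized idea, not taken from the slides: because Ruby modules can be reopened, a plug-in gem can add its own namespace under `Bio` while the core stays untouched. `Bio::GcContent`, its `fraction` method, and the file name mentioned in the comment are hypothetical, and the tiny `Bio` module at the top only stands in for the real core library.

```ruby
# Stand-in for the stable core; a real plug-in would load BioRuby instead.
module Bio
  CORE_VERSION = "stand-in"
end

# Hypothetical plug-in, e.g. lib/bio-gc-content.rb shipped in its own gem.
# It reopens the Bio namespace and adds a new module; nothing in the core
# definition above is modified.
module Bio
  module GcContent
    VERSION = "0.0.1"

    # Fraction of G and C bases in a nucleotide string (case-insensitive).
    def self.fraction(sequence)
      return 0.0 if sequence.empty?

      sequence.upcase.count("GC").fdiv(sequence.length)
    end
  end
end

puts Bio::GcContent.fraction("ATGC") # prints 0.5
```

Keeping the plug-in in its own namespace means it can be versioned, tested, and released by its author independently of the core release cycle.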
![Page 15: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/15.jpg)
The Biogems workflow
![Page 16: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/16.jpg)
Biogems.info – a portal site for Biogem users
rank in total downloads (rank up & down), citation, current version, day of final release, links to source code, status of Travis continuous integration
highly motivating (me)
![Page 17: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/17.jpg)
Database / web-service API: bio ucsc api, intermine, eutils, sequenceserver, goruby, bio ensembl
Wrapper: bio samtools, bio logger, bio bwa, bio signalp, bio sge, bio exportpred, bio tabix
Application: scaffolder, genfrag, bio isoelectric point, bio phyta, bio tm hmm, dna sequence aligner, bio gag, bio kmer counter
File Parser: bio gff3, bio assembly, bio blastxmlparser, bio faster, bio alignment, bio nexml, bio kb illumina, bio octopus, bio affy, bio dbsnp, bio rdf, bio hmmer model, bio hmmer3 report, bio pileup iterator, bio phyloxml
Visualization: bio graphics
Framework: bio ngs
Toolbox: bio genomic interval, bio bigbio, bio hello, bio plasmoap, bio cnls screenscraper, bio data, bio aliphatic index, bio hydropathy, bio gngm
Biogem Example: bio hello
Biogem Collection: bio core
more than 60 Biogems...
![Page 18: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/18.jpg)
Database Access-related
Next Generation Sequencing-related
![Page 19: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/19.jpg)
Hiro Mishima
• NOT a core developer of BioRuby
• not a computer scientist but a dentist
• semi-dry biologist
• human geneticist
BioGem is lowering barriers to entry
![Page 20: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/20.jpg)
Ruby UCSC API
![Page 21: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/21.jpg)
>40,000 tables!
![Page 22: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/22.jpg)
$ gem install bio-ucsc-api
How to get started
EASY!
![Page 23: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/23.jpg)
require 'bio-ucsc'
# connect to the hg19 database
Bio::Ucsc::Hg19.connect
# look up a single record in the snp131 table by its name
result = Bio::Ucsc::Hg19::Snp131.find_by_name("rs56289060")
puts result.chrom # => "chr1"
A query written in a fluent interface.
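To illustrate what "fluent interface" means here, the sketch below builds a query by chaining calls, each of which returns a new query object. `TableQuery` and its methods are hypothetical stand-ins written for this example; they are not the Ruby UCSC API implementation.

```ruby
# Hypothetical stand-in for a table class; not the bio-ucsc-api implementation.
class TableQuery
  def initialize(table, columns: ["*"], conditions: [])
    @table = table
    @columns = columns
    @conditions = conditions
  end

  # Each step returns a new query object, so calls can be chained.
  def select(*columns)
    TableQuery.new(@table, columns: columns.map(&:to_s), conditions: @conditions)
  end

  def where(condition)
    TableQuery.new(@table, columns: @columns, conditions: @conditions + [condition])
  end

  # Render the accumulated state as a SQL string without running it.
  def to_sql
    sql = "SELECT #{@columns.join(', ')} FROM `#{@table}`"
    sql += " WHERE #{@conditions.join(' AND ')}" unless @conditions.empty?
    sql
  end
end

query = TableQuery.new("snp131").where("chrom = 'chr17'").select(:name)
puts query.to_sql
# prints: SELECT name FROM `snp131` WHERE chrom = 'chr17'
```

Because every call returns a fresh object, a partially built query can be reused or inspected with `to_sql` before anything is sent to the database.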
![Page 24: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/24.jpg)
region = "chr17:7,579,614-7,579,700"
condition = Bio::Ucsc::Hg19::Snp131.with_interval(region).select(:name)
puts condition.to_sql
SQL made easy
SELECT name FROM `snp131`
WHERE (chrom = 'chr17'
  AND bin in (642,80,9,1,0)
  AND ( (chromStart BETWEEN 7579613 AND 7579700)
     OR (chromEnd BETWEEN 7579613 AND 7579700)
     OR (chromStart <= 7579613 AND chromEnd >= 7579700) ));
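The `bin in (642,80,9,1,0)` clause reflects the UCSC binning index, and the start value 7579613 reflects conversion of the 1-based region string to a 0-based start. The sketch below reproduces both numbers, assuming the standard UCSC binning scheme applies (finest bins of 128 kb, each coarser level 8 times wider). `parse_region` and `overlapping_bins` are hypothetical helper names; how the gem computes these values internally is not shown in the slides.

```ruby
# Standard UCSC binning scheme (assumed here), finest level first.
BIN_OFFSETS = [512 + 64 + 8 + 1, 64 + 8 + 1, 8 + 1, 1, 0].freeze
BIN_FIRST_SHIFT = 17 # finest bins span 2**17 bases (128 kb)
BIN_NEXT_SHIFT = 3   # each coarser level is 8 times wider

# Hypothetical helper: turn "chr17:7,579,614-7,579,700" (1-based, closed)
# into [chrom, zero_based_start, end].
def parse_region(region)
  chrom, range = region.delete(",").split(":")
  first, last = range.split("-").map { |s| Integer(s) }
  [chrom, first - 1, last]
end

# Hypothetical helper: every bin, at every level, that can hold a feature
# overlapping the half-open interval [chrom_start, chrom_end).
def overlapping_bins(chrom_start, chrom_end)
  start_bin = chrom_start >> BIN_FIRST_SHIFT
  end_bin = (chrom_end - 1) >> BIN_FIRST_SHIFT
  BIN_OFFSETS.flat_map do |offset|
    bins = (start_bin..end_bin).map { |b| offset + b }
    start_bin >>= BIN_NEXT_SHIFT
    end_bin >>= BIN_NEXT_SHIFT
    bins
  end
end

chrom, chrom_start, chrom_end = parse_region("chr17:7,579,614-7,579,700")
p [chrom, chrom_start, chrom_end]             # prints ["chr17", 7579613, 7579700]
p overlapping_bins(chrom_start, chrom_end)    # prints [642, 80, 9, 1, 0]
```

Restricting the query to these few bins lets the database use the index on `bin` instead of scanning every row on the chromosome.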
![Page 25: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/25.jpg)
Details of Ruby UCSC API:
Please find the poster presentations
BOSC2012 #15
ISMB2012 #I06
![Page 26: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/26.jpg)
FUTURE DIRECTION of BioGem
• QC by peer review is still important.
– ensures stability and quality of code and documents
– educates plug-in authors
• R/Bioconductor has an excellent peer-review system
– good coding style and well-formatted documents
– requires huge human resources and effort
![Page 27: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/27.jpg)
Solutions would be…
• recommended collections
– Bio-Core (Raoul J.P. Bonnal)
• loose/casual peer review
• need to draw up guidelines for designing “good” biogems
![Page 28: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/28.jpg)
Common challenge among Bio* projects:
Balance between lowering barrier to entry and keeping higher quality
![Page 29: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/29.jpg)
ACKNOWLEDGMENTS
• All BioRuby contributors
• Ruby UCSC API
– Jan Aerts
• The BioRuby Panel
– Raoul Bonnal
– Naohisa Goto
– Francesco Strozzi
– Toshiaki Katayama
– Pjotr Prins
• Dept. of Human Genetics, Nagasaki Univ.
– Koh-ichiro Yoshiura
• Google Summer of Code students
• O|B|F – Open Bioinformatics Foundation
![Page 30: H Mishima - Biogem, Ruby UCSC API, and BioRuby](https://reader035.fdocuments.in/reader035/viewer/2022062313/55799139d8b42ae72b8b4bba/html5/thumbnails/30.jpg)
QUESTION?
or mishima_eng