Post on 02-Jul-2015
What can Bioinformaticians learn from YouTube?
Data
New project. New schema.
EMBL: 20 different data formats
“A biologist would rather share their toothbrush than their (gene) names”
Mike Ashburner
YouTube loves to share
100 million views per day
114 apps
<rdf:RDF xmlns="http://www.affymetrix.com/community/publications/affymetrix/tmsplice#"
<Gene rdf:about="#1110002A21Rik">
  <chr>chr1</chr>
  <hasVariant rdf:parseType="Resource">
    <representedBy rdf:resource="#gi13385627"/>
  </hasVariant>
  <hasVariant rdf:parseType="Resource">
    <representedBy rdf:resource="#gi18043402"/>
  </hasVariant>
  <strand>+</strand>
</Gene>
RDF, OWL, SPARQL, GRDDL, WTF?
The semantic web, not The Semantic Web
Lower case ‘s’, lower case ‘w’
<tr><th class="two-column">Gene</th><td class="two-column"><table width="100%" cellpadding="4"><tr><td><strong><a href="http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/get_data.pl?hgnc_id=1101">BRCA2</a></div></strong> (HGNC Symbol)</td><td><span class="small"> To view all Ensembl genes linked to the name <a href="/Homo_sapiens/featureview?type=Gene;id=BRCA2">click here</a>.</span></td></tr></table><p>This gene is a member of the Human CCDS set: <a href="http://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi?REQUEST=CCDS&DATA=CCDS9344">CCDS9344</a> </p></td></tr>
<tr class="hgene"><th class="two-column">Gene</th><td class="two-column"><table width="100%" cellpadding="4"><tr><td><strong><a href="http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/get_data.pl?hgnc_id=1101" rel="hgnc_name">BRCA2</a></div></strong> (HGNC Symbol)</td><td><span class="small"> To view all Ensembl genes linked to the name <a href="/Homo_sapiens/featureview?type=Gene;id=BRCA2" rel="gene_list" >click here</a>.</span></td></tr></table><p>This gene is a member of the Human CCDS set: <a href="http://www.ncbi.nlm.nih.gov/CCDS/CcdsBrowse.cgi?REQUEST=CCDS&DATA=CCDS9344" rel="ccds">CCDS9344</a> </p></td></tr>
Can our web site be our API?
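To make the question concrete, here is a minimal Python sketch of a client treating the annotated page as an API: it collects every link that carries a rel attribute, keyed by that rel value. The snippet is a trimmed copy of the gene row above; the class and function names (RelLinkParser, extract_rel_links) are hypothetical, not part of any Ensembl tooling.

```python
from html.parser import HTMLParser

# Trimmed copy of the annotated gene row from the slide (two of its three links).
SNIPPET = """
<tr class="hgene"><th>Gene</th><td>
<a href="http://www.gene.ucl.ac.uk/cgi-bin/nomenclature/get_data.pl?hgnc_id=1101" rel="hgnc_name">BRCA2</a> (HGNC Symbol)
<a href="/Homo_sapiens/featureview?type=Gene;id=BRCA2" rel="gene_list">click here</a>
</td></tr>
"""


class RelLinkParser(HTMLParser):
    """Collect <a> elements that carry a rel attribute, keyed by rel value."""

    def __init__(self):
        super().__init__()
        self.links = {}
        self._current_rel = None  # rel of the <a> we are currently inside, if any

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attr_map = dict(attrs)
        rel = attr_map.get("rel")
        if rel:
            self._current_rel = rel
            self.links[rel] = {"href": attr_map.get("href"), "text": ""}

    def handle_data(self, data):
        # Accumulate the link text while inside an annotated <a>.
        if self._current_rel:
            self.links[self._current_rel]["text"] += data

    def handle_endtag(self, tag):
        if tag == "a":
            self._current_rel = None


def extract_rel_links(html):
    """Return {rel: {"href": ..., "text": ...}} for every annotated link."""
    parser = RelLinkParser()
    parser.feed(html)
    parser.close()
    return parser.links


links = extract_rel_links(SNIPPET)
print(links["hgnc_name"]["text"])
print(links["gene_list"]["href"])
```

The same page serves humans and scripts: the rel values act as stable field names, so a layout change does not have to break the client.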
http://www.ensembl.org/Homo_sapiens/geneview?gene=ENSG00000139618
# Assumes $registry is a Bio::EnsEMBL::Registry already connected to an Ensembl database
my $gene_adaptor = $registry->get_adaptor( 'Human', 'Core', 'Gene' );
my $gene = $gene_adaptor->fetch_by_stable_id( 'ENSG00000139618' );
More data on our sites than through the API
(we’re not the only ones)
RSS
iCal
XML
RESTful service
Representational State Transfer
psd-production/projects
GET
RETRIEVE
<projects>
<project> <id type="integer">8</id> <created-at type="datetime">2007-10-22T09:43:30+01:00</created-at> <family-id type="integer">3</family-id> <name>Test BAC</name> <updated-at type="datetime">2007-10-22T09:43:30+01:00</updated-at> <user-id type="integer">1</user-id> <workspace-id type="integer"/> </project>
</projects>
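Because the response is plain XML with type hints on each field, a client needs very little code to turn it into native values. A minimal Python sketch, using only the standard library and the response shown above; the helper names (convert, parse_projects) are hypothetical, and treating an empty element as None is an assumption about how the service represents missing values.

```python
import xml.etree.ElementTree as ET
from datetime import datetime

# The response from the slide, reproduced verbatim apart from whitespace.
RESPONSE = """<projects>
  <project>
    <id type="integer">8</id>
    <created-at type="datetime">2007-10-22T09:43:30+01:00</created-at>
    <family-id type="integer">3</family-id>
    <name>Test BAC</name>
    <updated-at type="datetime">2007-10-22T09:43:30+01:00</updated-at>
    <user-id type="integer">1</user-id>
    <workspace-id type="integer"/>
  </project>
</projects>"""


def convert(element):
    """Convert one field using its type attribute; empty elements become None."""
    text = element.text
    if text is None or not text.strip():
        return None
    kind = element.get("type")
    if kind == "integer":
        return int(text)
    if kind == "datetime":
        return datetime.fromisoformat(text)
    return text  # untyped fields such as <name> stay as strings


def parse_projects(xml_text):
    """Return one dict per <project>, keyed by the element names."""
    root = ET.fromstring(xml_text)
    return [
        {field.tag: convert(field) for field in project}
        for project in root.findall("project")
    ]


for project in parse_projects(RESPONSE):
    print(project["id"], project["name"], project["created-at"].isoformat())
```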
psd-production/projects
POST
CREATE
http://psd-production/projects/67
http://psd-production/projects/67
PUT
UPDATE
http://psd-production/projects/67
DELETE
DESTROY
http://psd-production/projects/67
No installation
No setup
No fancy protocols
All you need is curl
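The same four operations can be expressed from any language with an HTTP client. The Python sketch below only builds the requests and does not send them, since psd-production is the internal host named on the slides. It assumes the conventional mapping in which updates use PUT; a server that expects a different verb for updates would need only the table changed. The XML body format and the names METHODS and build_request are illustrative assumptions.

```python
import urllib.request

BASE = "http://psd-production/projects"  # host taken from the slides

# CRUD action -> HTTP method. PUT for update is an assumption (see lead-in).
METHODS = {
    "retrieve": "GET",
    "create": "POST",
    "update": "PUT",
    "destroy": "DELETE",
}


def build_request(action, project_id=None, body=None):
    """Build (but do not send) the HTTP request for one CRUD action."""
    url = BASE if project_id is None else f"{BASE}/{project_id}"
    data = body.encode("utf-8") if body is not None else None
    request = urllib.request.Request(url, data=data, method=METHODS[action])
    if data is not None:
        # Assumed payload format: XML, mirroring the response format.
        request.add_header("Content-Type", "application/xml")
    return request


for action, project_id in [("retrieve", None), ("create", None),
                           ("update", 67), ("destroy", 67)]:
    req = build_request(action, project_id)
    print(req.get_method(), req.full_url)
```

Sending a request would be a call to urllib.request.urlopen(request), which requires the host to be reachable.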
Perl API
Ruby API
Tools
Workflows
It’s all about the workflow
Trace archive vs SSAHA
Workflows are memes
Users add value
YouTube knows memes
Not invented here!
Reproducibility
Go with the flow
Quickly define workflows
Quickly reuse services
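One way to picture "define quickly, reuse quickly" is to treat each service as a function and a workflow as a chain of them. This is a minimal Python sketch, not a description of any particular workflow tool; the two sequence functions are hypothetical local stand-ins for remote services.

```python
from functools import reduce


def workflow(*steps):
    """Chain steps left to right into a single callable."""
    def run(data):
        return reduce(lambda value, step: step(value), steps, data)
    return run


# Hypothetical stand-ins for remote services.
def reverse_complement(seq):
    pairs = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(pairs[base] for base in reversed(seq))


def gc_content(seq):
    return (seq.count("G") + seq.count("C")) / len(seq)


# Define a workflow in one line...
revcomp_gc = workflow(str.upper, reverse_complement, gc_content)

# ...and reuse it as a step inside another workflow.
rounded_gc = workflow(revcomp_gc, lambda value: round(value, 2))

print(rounded_gc("aacg"))
```

Because a workflow has the same shape as a single service (data in, data out), sharing one is no harder than sharing a service.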
Data
Explore
Service Service
Data
Reuse workflows
YouTube for workflows
+ Yahoo! Pipes for biological data
= Never having to write another BLAST parser
Design
Stop hacking
Program to interfaces
“The interface is a contract between data provider and data consumer”
Lincoln Stein
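The contract idea can be sketched in a few lines of Python: the consumer depends only on an interface, so any provider that honours it can be swapped in. All names here (Gene, GeneSource, InMemoryGeneSource, describe) are hypothetical and chosen for illustration; the gene used is the BRCA2 example from earlier in the talk.

```python
from dataclasses import dataclass
from typing import Protocol


@dataclass(frozen=True)
class Gene:
    stable_id: str
    symbol: str


class GeneSource(Protocol):
    """The contract: any provider must offer fetch_gene with this shape."""

    def fetch_gene(self, stable_id: str) -> Gene: ...


class InMemoryGeneSource:
    """One provider honouring the contract; a web-backed one could replace it."""

    def __init__(self, genes):
        self._genes = {gene.stable_id: gene for gene in genes}

    def fetch_gene(self, stable_id: str) -> Gene:
        return self._genes[stable_id]


def describe(source: GeneSource, stable_id: str) -> str:
    """A consumer written against the interface, not a concrete provider."""
    gene = source.fetch_gene(stable_id)
    return f"{gene.symbol} ({gene.stable_id})"


source = InMemoryGeneSource([Gene("ENSG00000139618", "BRCA2")])
print(describe(source, "ENSG00000139618"))
```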
Design for reuse
Code for maintenance
Foster “accidental development”
114 YouTube apps
However...
Designing for reuse is hard
With great power comes great responsibility
Available
Accessible
Reliable
Discoverable: where is your web site?
Design is for humans
YouTube is ‘only’ an online video site
A good UI outweighs smart features
“Monolithic solutions always fail”
Graham Cameron
Loose coupling rules
Don’t reinvent Eclipse
Thank you
GREENISGOOD.CO.UK