1. DATABASE IDENTIFICATION
Database name / Namespace
Main resource URL
Contact information
Date resource established
License
Taxonomic coverage (NCBI Taxid)
User support options
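As a rough illustration, the identification fields could be held in a small record type. This is only a sketch: the class name, attribute names, and the example values ("ExampleDB", the example.org addresses) are assumptions made here, not part of the proposal.

```python
from dataclasses import dataclass, field, fields
from datetime import date
from typing import List, Optional


@dataclass
class DatabaseIdentification:
    """Section 1 of the proposed form; attribute names are illustrative."""
    name: str
    namespace: str
    main_url: str
    contact: str
    date_established: Optional[date] = None
    license: str = ""
    taxonomic_coverage: List[int] = field(default_factory=list)  # NCBI Taxids
    user_support: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of fields the provider left empty."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


record = DatabaseIdentification(
    name="ExampleDB",                       # hypothetical database
    namespace="exampledb",                  # hypothetical namespace
    main_url="https://example.org/exampledb",
    contact="curators@example.org",
    taxonomic_coverage=[9606],              # 9606 = Homo sapiens
)
print(record.missing_fields())
# prints ['date_established', 'license', 'user_support']
```

A completeness check like `missing_fields` is one way a registry could prompt providers for the entries still to be filled in.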
2. DATABASE SCOPE
Database providers would fill in a form stating the scope of their database (all datatypes it holds).
Those datatypes could be drawn from an ontology (see EDAM, next slide), maintained at the EBI (Jon Ison).
2. DATABASE SCOPE (here selection is from the EDAM ontology):
http://sourceforge.net/projects/edamontology/
http://www.ebi.ac.uk/ontology-lookup/browse.do?ontName=EDAM
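Restricting the scope to ontology terms amounts to checking each selection against a controlled vocabulary. A minimal sketch follows; the labels in `ALLOWED_DATATYPES` are placeholders borrowed from the two examples in these slides and are not verified EDAM terms, and the function name is hypothetical.

```python
from typing import Iterable, List, Set, Tuple

# Placeholder labels standing in for ontology terms; NOT verified EDAM terms.
ALLOWED_DATATYPES: Set[str] = {
    "protein-protein interaction data",
    "genome sequence data",
}


def validate_scope(
    selected: Iterable[str],
    allowed: Set[str] = ALLOWED_DATATYPES,
) -> Tuple[List[str], List[str]]:
    """Split the provider's selections into accepted and rejected labels."""
    accepted: List[str] = []
    rejected: List[str] = []
    for label in selected:
        key = label.strip().lower()  # normalise case and whitespace
        if key in allowed:
            accepted.append(key)
        else:
            rejected.append(key)
    return accepted, rejected


accepted, rejected = validate_scope(["Genome sequence data", "microarray data"])
print(accepted)  # ['genome sequence data']
print(rejected)  # ['microarray data']
```

In a real form the allowed set would be loaded from the ontology itself, so that the selectable terms stay in step with it.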
SPECS AND FORMATS
• From MIBBI / BioSharing
• Each datatype would have options for MI* standards and data formats
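The idea that each datatype carries its own list of standards and formats can be pictured as a simple lookup table. The sketch below uses only the values named in the two examples in these slides; the dictionary layout, key names, and function name are assumptions.

```python
from typing import Dict, List

# Values come from the two examples in the slides; the layout is illustrative.
DATATYPE_OPTIONS: Dict[str, Dict[str, List[str]]] = {
    "protein-protein interaction data": {
        "standards": ["MIMIx"],
        "formats": [],  # no specific format is named in the slides
    },
    "genome sequence data": {
        "standards": ["MIGS"],
        "formats": ["GFF3", "FASTA"],
    },
}


def options_for(datatype: str) -> Dict[str, List[str]]:
    """Return the standards and formats offered for a datatype (empty if unknown)."""
    return DATATYPE_OPTIONS.get(datatype, {"standards": [], "formats": []})


print(options_for("genome sequence data")["formats"])  # ['GFF3', 'FASTA']
```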
EXAMPLE 1: protein-protein interaction data
• Date the data was first integrated
• Curation Policy - Check all that apply
[ ] Data integrated without curation from users
[ ] Data obtained from an external source
[ ] Data manually curated at database
• Standards: MIs, Data formats, Terminologies - For PPI data: MIMIx compliant?
• Data formats: MIMIx compliant?
• Data accessibility/output options
• Data release frequency
• Versioning policy / access to historical files (only if data is not from secondary parties?)
• Documentation available
• Data submission policy (only if data is not from secondary parties)
• Tools available (querying, analysis, etc.)
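The per-datatype checklist could be captured as a record along these lines. This is a sketch under assumptions: the class and attribute names are invented here, only a subset of the checklist is modelled, and reading "not from secondary parties" as "the external-source box is unticked" is one possible interpretation.

```python
from dataclasses import dataclass, field
from enum import Flag, auto
from typing import List


class CurationPolicy(Flag):
    """One flag per checkbox; several may be set at once."""
    NONE = 0
    UNCURATED_FROM_USERS = auto()
    FROM_EXTERNAL_SOURCE = auto()
    MANUALLY_CURATED = auto()


@dataclass
class DatatypeDescription:
    """Subset of the per-datatype checklist; attribute names are illustrative."""
    datatype: str
    curation: CurationPolicy = CurationPolicy.NONE
    standards: List[str] = field(default_factory=list)
    data_formats: List[str] = field(default_factory=list)
    release_frequency: str = ""
    tools: List[str] = field(default_factory=list)

    def needs_submission_policy(self) -> bool:
        """Assumed rule: ask for a submission policy unless data is external."""
        return not (self.curation & CurationPolicy.FROM_EXTERNAL_SOURCE)


ppi = DatatypeDescription(
    datatype="protein-protein interaction data",
    curation=CurationPolicy.MANUALLY_CURATED | CurationPolicy.UNCURATED_FROM_USERS,
    standards=["MIMIx"],
)
genome = DatatypeDescription(
    datatype="genome sequence data",
    curation=CurationPolicy.FROM_EXTERNAL_SOURCE,
    standards=["MIGS"],
    data_formats=["GFF3", "FASTA"],
)
print(ppi.needs_submission_policy())     # True
print(genome.needs_submission_policy())  # False
```

A flag enumeration fits "check all that apply" because any combination of boxes is a single value that can be tested with `&`.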
EXAMPLE 2: genome sequence data
• Date the data was first integrated
• Curation Policy - Check all that apply
[ ] Data integrated without curation from users
[ ] Data obtained from an external source
[ ] Data manually curated at database
• Standards: MIs, Data formats, Terminologies - MIGS
• Data formats: GFF3, FASTA, etc.
• Data accessibility/output options: GFF3, FASTA, etc.
• Data release frequency (only if data is not from secondary parties?)
• Versioning policy / access to historical files
• Documentation available
• Data submission policy (only if data is not from secondary parties)
• Tools available (BLAST, etc.)
POTENTIAL USERS
• Database providers: to be sure we know about all available resources
• Users: so they can search for specific resources and quickly assess their quality
• Publishers: data transfer would be easier (see Adriaan’s message on the Google group)
• Funders: standardized descriptions of databases, including who accepts which data, should help funding agencies enforce good data-sharing policies