Bio db core-mockup-v1

1. DATABASE IDENTIFICATION

Database name / Namespace

Main resource URL

Contact information

Date resource established

License

Taxonomic coverage (NCBI Taxid)

User support options
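The identification fields above could be captured as a simple record. The following is a minimal sketch in Python, assuming hypothetical field names and types (this mockup does not prescribe any); the class name, the ISO date format, and the example values are placeholders for illustration only.

```python
from dataclasses import dataclass, field, asdict
from typing import List


@dataclass
class DatabaseIdentification:
    """Hypothetical record for the 'Database identification' fields."""
    name: str                # Database name / Namespace
    main_url: str            # Main resource URL
    contact: str             # Contact information
    date_established: str    # Date resource established (ISO 8601 is an assumption)
    license: str             # License
    ncbi_taxids: List[int] = field(default_factory=list)      # Taxonomic coverage
    support_options: List[str] = field(default_factory=list)  # User support options

    def missing_fields(self) -> List[str]:
        """Return the names of required text fields that were left empty."""
        required = ["name", "main_url", "contact", "date_established", "license"]
        return [f for f in required if not getattr(self, f).strip()]


# Placeholder values, not a real resource.
record = DatabaseIdentification(
    name="ExampleDB",
    main_url="http://example.org/",
    contact="curators@example.org",
    date_established="2010-01-01",
    license="",
    ncbi_taxids=[9606],
    support_options=["email"],
)
print(record.missing_fields())
```

A form built on such a record could refuse submission until `missing_fields()` returns an empty list; which fields are actually mandatory is left open here.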

2. DATABASE SCOPE

There would be a form where database providers would enter the scope of their database (all datatypes)

Those datatypes could come from an ontology (see EDAM, next slide), maintained by the EBI (Jon Ison)

2. DATABASE SCOPE (here the selection is from the EDAM ontology):

http://sourceforge.net/projects/edamontology/

http://www.ebi.ac.uk/ontology-lookup/browse.do?ontName=EDAM
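One way the scope form could restrict entries to ontology terms is to check each selected datatype against a controlled list. This is a minimal sketch, assuming a small hard-coded set of placeholder labels standing in for terms that would be loaded from EDAM; the labels and the function name `check_scope` are illustrative and are not actual EDAM identifiers.

```python
from typing import Iterable, List, Set, Tuple

# Placeholder labels; a real form would load its terms from the EDAM ontology.
ALLOWED_DATATYPES: Set[str] = {
    "protein-protein interaction",
    "genome sequence",
}


def check_scope(selected: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Split the selected datatypes into (accepted, rejected) lists."""
    accepted: List[str] = []
    rejected: List[str] = []
    for term in selected:
        normalized = term.strip().lower()
        if normalized in ALLOWED_DATATYPES:
            accepted.append(normalized)
        else:
            rejected.append(term)
    return accepted, rejected


accepted, rejected = check_scope(["Genome sequence", "microarray"])
print(accepted, rejected)
```

Rejected entries could be shown back to the database provider so that they pick the closest ontology term instead of free text.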



SPECS AND FORMATS

• From MIBBI / BioSharing

• Each datatype would have options for MI* standards and data formats

EXAMPLE 1: protein-protein interaction data

• Date the data was first integrated

• Curation Policy - Check all that apply

[ ] Data integrated without curation from users

[ ] Data obtained from an external source

[ ] Data manually curated at database

• Standards: MIs, Data formats, Terminologies - For PPI data: MIMIx compliant?

• Data formats: MIMIx compliant?

• Data accessibility/output options

• Data release frequency

• Versioning policy / access to historical files (only if data is not from secondary parties?)

• Documentation available

• Data submission policy (only if data is not from secondary parties)

• Tools available (querying, analysis, etc.)

EXAMPLE 2: genome sequence data

• Date the data was first integrated

• Curation Policy - Check all that apply

[ ] Data integrated without curation from users

[ ] Data obtained from an external source

[ ] Data manually curated at database

• Standards: MIs, Data formats, Terminologies - MIGS

• Data formats: GFF3, FASTA, etc.

• Data accessibility/output options: GFF3, FASTA, etc.

• Data release frequency (only if data is not from secondary parties?)

• Versioning policy / access to historical files

• Documentation available

• Data submission policy (only if data is not from secondary parties)

• Tools available (BLAST, etc.)
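The per-datatype fields in the two examples could share one record type. The sketch below is a minimal illustration, assuming hypothetical class and field names; the curation choices assigned to each example are invented for demonstration, and only the submission-policy condition (the one stated without a question mark) is modeled.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Set


class Curation(Enum):
    """The three curation-policy checkboxes; more than one may apply."""
    USER_SUBMITTED_UNCURATED = "Data integrated without curation from users"
    EXTERNAL_SOURCE = "Data obtained from an external source"
    MANUALLY_CURATED = "Data manually curated at database"


@dataclass
class DatatypeSpec:
    """Hypothetical per-datatype description record."""
    datatype: str
    curation: Set[Curation] = field(default_factory=set)
    standards: List[str] = field(default_factory=list)
    data_formats: List[str] = field(default_factory=list)
    tools: List[str] = field(default_factory=list)
    from_secondary_parties: bool = False

    def applicable_policy_fields(self) -> List[str]:
        """Policy fields the form would show for this datatype."""
        fields = ["release_frequency", "versioning_policy", "documentation"]
        # The submission policy is only asked when data is not from secondary parties.
        if not self.from_secondary_parties:
            fields.append("submission_policy")
        return fields


# Illustrative values only; the curation flags are assumptions.
ppi = DatatypeSpec(
    datatype="protein-protein interaction",
    curation={Curation.MANUALLY_CURATED},
    standards=["MIMIx"],
)
genome = DatatypeSpec(
    datatype="genome sequence",
    curation={Curation.EXTERNAL_SOURCE},
    standards=["MIGS"],
    data_formats=["GFF3", "FASTA"],
    tools=["BLAST"],
    from_secondary_parties=True,
)
print(ppi.applicable_policy_fields())
print(genome.applicable_policy_fields())
```

Using a set of enum members for the curation policy mirrors the "check all that apply" checkboxes, since any combination of the three options can be recorded.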

POTENTIAL USERS

• Database providers : to be sure we know about all available resources

• Users: so they can search for specific resources and quickly assess the quality

• Publishers: It would be easier for data transfer (see Adriaan’s message on the Google group)

• Funders: Having standardized descriptions of databases, and of who accepts which data, should help funding agencies enforce good data sharing policies.