CASIMIR Networking Meeting, Heathrow, July 2007
CASIMIR WP4: Data Representation
John Hancock
Duncan Davidson
Objectives
• Assessment of technical aspects of database interoperability as
a barrier to scientific and financial sustainability
• Assessment of the variability of practice in the semantics of
biological data representation, e.g. genotype, gene expression
• Assessment of emerging standards and current practice for data
representation, annotation and ontologies
• 4.1 - D9 - Classified list of data representations in European mouse-centric and related databases
• 4.4 - Network meeting 1 - June-Sep 07 - Bring together bioinformatics reps from (EU-funded) mouse projects to discuss data representation
• 4.4 - Joint work package meeting to discuss results (4-5 Oct 07)
• 4.5 - Sep - Dec 07 - Report of network meeting
• 4.6 - Present conclusions at meetings
Discussion Points
• What do we understand by “data representation” - is it just CVs/Ontologies?
– Interaction with other work packages
• What kinds of data?
• What ontologies? How many on the PRIME list do you use? Do you use others? Do you use OBO ontologies by default?
• What processes are they involved in elsewhere to discuss/unify data representation?
Future: Cross-Species Interactions
• Mouse-Human must be a priority because of the disease angle
• Mouse-Rat - already quite well integrated (to what extent?) because of MGI-RGD-OBO interactions
• Other important models
– Chick (ChickEST (UK), ChickVD (CN), Ensembl, others?)
– Xenopus
– Zebrafish
– Drosophila
– C. elegans
– Yeast, E. coli
• In the longer term, get together with community reps to discuss similarities & differences
Extant Resources
• PRIME Expert Group Report and Outcomes
• Euromouse
• Interphenome discussion group & pilots
• EUMORPHIA/EUMODIC bioinformaticians
PRIME Expert Group
• Draft lists of:
– Databases
– Ontologies
Interphenome
• Phenotype data:
– Common data description
– Common protocol description
– Standard for data exchange
Interphenome - Current Status
• Ontologies
– Investigate cross-mapping of current approaches and eventual possible convergence (?)
• Protocols
– Work on developing a format that can accommodate all information needed for a protocol
– Encode this as an XML schema
– PPML?
• Data Exchange
– Work on an XML schema that will allow structured exchange of phenotype data and metadata - started work on this in EUMODIC
Publication in Mammalian Genome 18, 157-163 (March 2007):
“Integration of Mouse Phenome Data Resources” by The Mouse Phenotype Database Integration Consortium
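As a rough illustration of what structured exchange of phenotype data and metadata could look like, the sketch below builds and re-reads one observation as XML using Python's standard library. It is not the EUMODIC or Interphenome schema: every element name, attribute name, identifier and value here (phenotypeRecord, protocol, ontologyTerm, measurement, PROTO-001, TERM:0000001, the body-weight reading) is a hypothetical placeholder chosen only to show a data description, a protocol reference and a measurement travelling together.

```python
import xml.etree.ElementTree as ET


def build_phenotype_record(strain, protocol_id, ontology_term, parameter, value, unit):
    """Build one phenotype observation as an XML element (hypothetical layout)."""
    record = ET.Element("phenotypeRecord")
    ET.SubElement(record, "strain").text = strain
    # Metadata: reference to the protocol that produced the measurement.
    ET.SubElement(record, "protocol", {"id": protocol_id})
    # Common data description: an ontology term identifier (made up here).
    ET.SubElement(record, "ontologyTerm", {"id": ontology_term})
    measurement = ET.SubElement(
        record, "measurement", {"parameter": parameter, "unit": unit}
    )
    measurement.text = str(value)
    return record


def parse_phenotype_record(xml_text):
    """Read the hypothetical record back into a plain dictionary."""
    root = ET.fromstring(xml_text)
    measurement = root.find("measurement")
    return {
        "strain": root.findtext("strain"),
        "protocol_id": root.find("protocol").get("id"),
        "ontology_term": root.find("ontologyTerm").get("id"),
        "parameter": measurement.get("parameter"),
        "value": float(measurement.text),
        "unit": measurement.get("unit"),
    }


# Placeholder values for illustration only.
record = build_phenotype_record(
    "C57BL/6J", "PROTO-001", "TERM:0000001", "body_weight", 24.3, "g"
)
xml_text = ET.tostring(record, encoding="unicode")
parsed = parse_phenotype_record(xml_text)
```

A real exchange format would add a schema for validation and agreed identifiers for protocols and ontology terms; the point here is only that one record carries both the data and the metadata needed to interpret it.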
WP4 - 1st Actions
Update the PRIME list of European mouse projects
Also identify “mouse-related” projects
Identify contacts
• To hold a meaningful dialogue, get as many as possible to a networking meeting
Ontologies - So Far
• We have a little list
• Test how many of these are actually in use - Questionnaire
• Check how up to date it is, and track developments (e.g. Relationships Ontology, potential Synapse Ontology)
The CASIMIR Questionnaire
• http://www.casimir.org.uk/questionnaire.php
• 1a. Are you using a relational database, object database or flat files?
• 1b. If relational, what is your chosen RDBMS (Relational Database Management System)?
• 2a. Is your database providing external links to other on-line resources; possibly via URL/HTTP (if yes please name them)?
• 2b. Supported/Installed Web Services (if yes please name them)? Do you plan to install or develop web services in the near future?
The CASIMIR Questionnaire
• 3a. Please list the sorts of data entities you store (e.g. protein sequence data, mouse strain information etc...)
• 4a. Can you provide a brief explanatory description/schema of your data/data structure?
• 4b. Are you willing to provide an entity-relationship diagram and would you be willing to provide it under an open source license?
The CASIMIR Questionnaire
• 5a. Are you currently using or do you intend to use any ontologies or controlled vocabularies to describe your data?
• 5b. Do you plan to expand your use of ontologies in future?
• 5c. Do you use OBO ontologies?
• 5d. Do you perceive the need for additional ontologies to serve your domain of knowledge?
The CASIMIR Questionnaire
• 6. Do you make use of Minimum Information standards (such as MIAME for microarray experiments) to describe any data? If so, which ones? If you do not make use of these standards, are you likely to do so in future?
Minimum Standards
• MIAME - Brazma et al. (2001) Nat. Genet. 29, 365-71
The CASIMIR Questionnaire
• 7. What do you perceive as the main limiting factor in data representation/interoperability etc. in European bioinformatics databases?
• 8. Do you have any comments/thoughts on standards for data representation that need to be developed or that you might like discussed in CASIMIR?
The CASIMIR Questionnaire
Please fill it in as soon as humanly possible!
We will be chasing around database coordinators over the next few months to
make sure we have as much information as possible
Agenda for Today
• Reports from some databases:
– MUGEN - Christina Chandras
– EMMA - Glenn Proctor
– EUMODIC - Niels Adams
– EUCLIS - Eduardo Mendoza
• Discussion, e.g.
– Comments on the questionnaire/CASIMIR’s aims
– How to get widest possible participation
– What do people see as the main obstacles to the aim of integrating all this data?
Mouse to Human
[Diagram; labels: Human, Mouse, DISEASE, PHENOTYPING, Phenotypic Attributes, Phenotypic Measures]