Developing an application ontology for biomedical resource annotation and retrieval: challenges and...

20
Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky, E. Segerdell, M. Wilson, T. Johnson, K. Corday, C. Shaffer and M. Haendel

Transcript of Developing an application ontology for biomedical resource annotation and retrieval: challenges and...

Page 1: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

Developing an application ontology for biomedical resource annotation

and retrieval:challenges and lessons learned

C. Torniai, M. Brush, N. Vasilevsky, E. Segerdell, M. Wilson, T. Johnson, K. Corday, C. Shaffer and M. Haendel

Page 2: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

eagle-i project Aims Ontology role

eagle-i ontology Requirements Implementation

Implementation choices Challenges

Outline

c o n s o r t i u m

Page 3: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

eagle-i• NIH funded pilot project working to make

scientific resources more visible via a federated network of nine institutional repositories

Index invisible resources• reagents, protocols, techniques,

instruments, expertise, organisms, software, training, human studies, biological specimens, etc.

Ontology-driven approach to research resource annotation and discovery

Facilitate development of shared semantic entities that can be referenced in publications, databases, experiments, etc.

c o n s o r t i u m

Page 4: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

c o n s o r t i u m

1) Represent collected resource information

2) Use the set of ontologies to control the data collection and search applications user-interface (UI) and logic

3) Build a set of ontologies that are reusable and interoperable with other ontologies and existing efforts for representing biomedical entities

Ontology development drivers

Page 5: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

c o n s o r t i u m

Ontology role in eagle-i architecture

eagle-i ontologies

Federated Network

Repositories (RDF)

NIF, PubMed Entrez Gene

Search Application

Data Collection Application

Resource informationcollection

Page 6: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

Ontology/Method Scope/Purpose

Basic Formal Ontology (BFO) Upper ontology

Information Artifact Ontology (IAO) Ontology metadata

Relation Ontology (RO) Common properties

Minimum Information to Represent External Ontology

Terms (MIREOT)

Reuse classes and properties from external ontologies

Implementation

c o n s o r t i u m

Page 7: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

Ontology layersGoal: to decouple research resources representation from information used for application appearance and behavior

• Application specific module– Classes, annotation properties

and individuals required to drive the UIs

• eagle-i core ontology– Classes and properties used to

represent information about biomedical research resources

• MIREOT files– Externally sourced classes and

properties

Page 8: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

c o n s o r t i u m

eagle-i core and MIREOTed sources

eagle-i core ontology: 1283 classes, 56 object properties, and 61 data properties.

External Ontologies Purpose/subsets Classes

Ontology of Biomedical

Investigations (OBI)research material entities, processes, devices, roles 509

NCBI Taxonomy Organisms taxa 192

VIVO ontology people, organization, publications 20

Ontology of Clinical Research (OCRe)

human study designs and facets 19

Biomdedical Resource Ontology (BRO) instruments 13

Page 9: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

Application-specific moduleContains properties and classes required to drive the UIs of the data collection and search applications

– UI Annotation Definition file– Definition of UI annotation

properties and sets of values for these properties

– UI Annotations file – Holds annotations made on

eagle-I core and MIREOTed classes and properties

Page 10: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

c o n s o r t i u m

Examples of annotation values and use

Label Description Example

Primary Resource TypeDenotes classes for which instances are

collected ‘instrument’,

‘biospecimen’, ‘protocol’

Data Model Exclude

Denotes classes or properties that are not included in the model

used for the data tool or the search tool UIs

BFO classes such ‘continuant’ or

‘occurrent’ or RO relations such ‘precedes’

Embedded Class

Denotes a class for which instances can only

be created in the context of an

embedding class

‘antibody immunogen’ created within

‘antibody’, ‘construct insert’ created within

‘plasmid’

Page 11: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

Additional application-specific properties

PropertyLabel

Description Example Property Type

eagle-i domain constraint

Used to specify the domain of an imported property.

Each annotation will contain the URI of one

class

Value set to “OBI_0000245”

(‘organization’) for RO property

‘location_of’’

Data Property

eagle-i range constraint

Used to specify the range of an imported property.

Each annotation will contain the URI of one

class

Value set to “ERO_0000004”

(‘instrument’) for RO property

‘located_in’

Data Property

eagle-i preferred label

Defines the value of preferred label to display in the data collection tool

and search UIs

Capitalized ‘Organization’ for

OBI_0000245 (‘organization’)

Annotation Property

Page 12: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

Classes annotated with ‘primary resource type’

Construct insert is an example of a resource annotated as an ‘embedded class’,

‘eagle-i preferred definition’ is used for tooltips

‘eagle-i preferred label’ is used for the display nameProperty annotated as ‘’primary property’

Technique is annotated as ‘referenced taxonomy’

Data Collection Application

Page 13: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

c o n s o r t i u m

Reuse of existent ontologies Ontology Layers

Application-specific module Community coordination and alignment Best practices and tools

Challenges and benefits

Page 14: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

c o n s o r t i u m

BFO and the relation ontology (RO) OBO Foundry orthogonality principle

Advantages – Integration with other ontologies– Ease the design process– Data integration and publication (Linked Open Data)

Challenges– Need to exclude some classes (continuant, occurrent) from UI

visualization after the inferred module has computed– Domain and Range in RO not specified or not specific enough for an

application– Not all relevant ontologies are built using BFO and RO

Reuse of existent ontologies

Page 15: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

c o n s o r t i u m

Advantages– Effective means to drive an application UI while

maintaining interoperability with external ontologies and data sources

– Facilitate parallel concurrent development Challenges– Keeping the annotations current with the core module– Risk of excessive proliferation of annotation

properties as quick way to simplify application development complexity

Ontology layers

Page 16: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

c o n s o r t i u m

Requirements for bridging the gap between an application and domain-specific ontologies – Application-specific labels and definitions– Exclusion of sets of classes and properties from

the model used by the application– Restriction of domain and range for some

imported properties – Definition of display order of object and data

properties at class level

Application-specific module

Page 17: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

c o n s o r t i u m

Commitment to collaboration with similar efforts aimed at resource modeling – Aligned high level models with NIF, RDS, VIVO– Service, instrument (device) implemented in OBI and reused by NIF

and eagle-i– Coordinated representation of reagents, biospecimens, and genotype

information (in progress)

Challenges– Process is time consuming and it requires extra implementation efforts

• Implement and import back from reference ontologies

– Application ontologies have peculiar requirements • Example: Service hierarchy in eagle-i based on type of process rather than

input and output of the process (OBI)

Community coordination

Page 18: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

c o n s o r t i u m

Reusing/referencing existent ontologies– Ontofox, OWL module extractor, NCBO extractor service

Have tools integrated in ontology editors (Protégé)– Effective methods for managing and syncing MIREOTed

terms

Have several “community views” or ‘slims’ that could be directly imported with different level of complexity

Best practices and tools

Page 19: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

c o n s o r t i u m

Developing an ontology-driven application has been an important benchmark for usage of biomedical ontologies

We have designed a layered set of ontologies, consisting of a broadly applicable core ontology and application-specific module– Requirements and principles to inform a general design

pattern

Future steps Refining, documenting and sharing requirements and lessons

learned Engage in efforts addressing the issues we have experienced

Conclusion

Page 20: Developing an application ontology for biomedical resource annotation and retrieval: challenges and lessons learned C. Torniai, M. Brush, N. Vasilevsky,

c o n s o r t i u m

Thank you

eagle-i core module: http://code.google.com/p/eagle-i/eagle-i search: http://eagle-i.net

Carlo [email protected]

Acknowledgments: Ted Bashor, Rob Frost, Larry Stone and Daniela Bourges

Project funded through NIH/NCRR ARRA award #U24RR029825