Complex Data Modeling for Simpler Data Access

21
Complex Data Modeling for Simpler Data Access TDWG 2014, Jönköping, Sweden Ramona Walls Robert Guralnick

Transcript of Complex Data Modeling for Simpler Data Access

Page 1: Complex Data Modeling for Simpler Data Access

Complex Data Modeling for Simpler Data Access

TDWG 2014, Jönköping, Sweden

Ramona Walls

Robert Guralnick

Page 2: Complex Data Modeling for Simpler Data Access

A Canonical Example of “Opportunistic Collecting” typical in biocollections

Page 3: Complex Data Modeling for Simpler Data Access

plot

sub-plot

transect

(within plot)

individual

(within plot)individual

(within sub-plot)

Page 4: Complex Data Modeling for Simpler Data Access
Page 5: Complex Data Modeling for Simpler Data Access

transect

depth

* *

*

*

*

*sample collection point

water sample at

depth X

aliquot

*

metagenome

Page 6: Complex Data Modeling for Simpler Data Access

DwC

Bag ofterms

Page 7: Complex Data Modeling for Simpler Data Access

http://vegbank.org/vegbank/general/faq.html#datamodel

Page 8: Complex Data Modeling for Simpler Data Access

http://vegbank.org/vegbank/general/faq.html#datamodel

?

Page 9: Complex Data Modeling for Simpler Data Access

Madin et al. 2007 Ecol. Informatics doi: 10.1016/j.ecoinf.2007.05.004

OBO-E:

O&M:

Page 10: Complex Data Modeling for Simpler Data Access

Most biology requires work at the intersection of disciplines

MUSEUM COLLECTIONS

ECOLOGYGENOMICS

Page 11: Complex Data Modeling for Simpler Data Access

Material entities, information entities, and processes in the Basic Formal Ontology

Page 12: Complex Data Modeling for Simpler Data Access

observations versus specimens

Page 13: Complex Data Modeling for Simpler Data Access

Specimen data from a Darwin Core Archive: VertNet

Page 14: Complex Data Modeling for Simpler Data Access

specimencollection

process

sampling process

material sampling process

sampling process logical definition:assay and (achieves_planned_objective some ‘biological feature identification objective’)

has_specified_input some‘sampling feature’has_specified_output some‘sample data item’

specimen collection process logical definition:'planned process' and(achieves_planned_objectivesome 'specimen collection objective')

has_specified_input some‘material entity’has_specified_output some‘specimen’

material sampling process logical definition:'planned process' and(achieves_planned_objectivesome ’material sampling objective')

has_specified_input some‘material sampling feature’has_specified_output some‘material sample’

Page 15: Complex Data Modeling for Simpler Data Access

ROB

BCO Taxonomic Inventory Process Class and Sub-classes of different kinds of processes

Page 16: Complex Data Modeling for Simpler Data Access
Page 17: Complex Data Modeling for Simpler Data Access
Page 18: Complex Data Modeling for Simpler Data Access
Page 19: Complex Data Modeling for Simpler Data Access
Page 20: Complex Data Modeling for Simpler Data Access

Conclusions

• BCO splits the middle ground between the high level OBO-E world view and the flat way of representing a process that has a single output to allow us to represent all kinds of different content.

• BCO can serve as a sandbox to test out new models and terms for describing sampling processes and data, to inform standards like DwC.

Page 21: Complex Data Modeling for Simpler Data Access

Acknowledgments

• Dozens of participants at BCO workshops and hackathons over the past two years

• NSF-EAGER: An Interoperable Information Infrastructure for Biodiversity Research (I3BR)

• NSF: Research Coordination Network for GSC (RCN4GSC)

• VertNet and University of Kansas Biodiversity Institute