Reproducibility of model-based results: standards, infrastructure, and recognition.
http://sems.uni-rostock.de
Dagmar Waltemath
September 2015, Rostock-Warnemünde | dcite
Reproducibility of model-based results: Standards, infrastructure and recognition
What is a model?
Fig.: Modeling Cellular Reprogramming Using Network-based Models. Courtesy Antonio del Sol Mesa, LCSB Luxembourg
Fig.: Modeling the cell cycle using ODE systems. Goldbeter (1991), http://www.ncbi.nlm.nih.gov/pubmed/1833774
Fig.: Modeling large-scale networks. Lee et al (2013), http://www.nature.com/articles/srep02197.
In systems biology, a computational model represents biological facts in
the computer. Often, the representation is simulated to help understand
the system's dynamic behavior.
Re[usea|produci]bility challenge (2003)
Slide courtesy Mike Hucka @ 2012 Computational Cell Biology Summer School
“With greater interaction between tools, and a common format for publications and databases, users would be better able to spend more time on actual research rather than on struggling with data format issues.” (SBML L1)
→ Standardised model representation
Ron Henkel et al. Database 2015;2015:bau130
Re[usea|produci]bility challenge (2010)
Fig.: Nature Blogs: Of Schemes and Dreams (2014)
Nine Worrying Stats on the Effect of Poor Scientific Data Management
Vijayalakshmi Chelliah et al. Nucl. Acids Res. 2015;43:D542-D548
Finding relevant models.
→ Strategies for model similarity, ranking, clustering, filtering
Fig.: Henkel et al 2010 http://www.biomedcentral.com/1471-2105/11/423/
Fig.: Schulz et al 2011 DOI: 10.1038/msb.2011.41
CellCycle Models
Fig.: Alm et al (2014) doi:10.1186/s13326-015-0014-4
Re[usea|produci]bility challenge (2012)
Reproducing published models.
→ Standardised simulation descriptions
Fig.: Waltemath et al (2011) doi:10.1186/1752-0509-5-198
Re[usea|produci]bility challenge (2014)
Model-related data in the systems biology workflow
Linking the relevant files.
→ Retrieval and archiving of simulation studies and associated files
Model-related data in the systems biology workflow
Linking model-related data
Give me all the files I need to run this simulation study.
Which are the most frequently used GO annotations in my model set?
Which models contain reactions with 'ATP' as reactant and 'ADP' as product?
Find good candidates for features describing my set of models.
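The ATP/ADP question above can be sketched against a single SBML file. This is a minimal illustration only: the toy model, the function name `reactions_converting`, and the matching on species names are assumptions made for this note. A real retrieval system would more likely match on semantic annotations than on names.

```python
import xml.etree.ElementTree as ET

# Namespace of SBML Level 3 Version 1 core; other levels and versions use other URIs.
NS = {"sbml": "http://www.sbml.org/sbml/level3/version1/core"}

# Hypothetical toy model, inlined so the sketch is self-contained.
# Required attributes such as compartment are omitted for brevity.
TOY_MODEL = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy">
    <listOfSpecies>
      <species id="s1" name="ATP"/>
      <species id="s2" name="ADP"/>
      <species id="s3" name="glucose"/>
      <species id="s4" name="glucose-6-phosphate"/>
    </listOfSpecies>
    <listOfReactions>
      <reaction id="r1">
        <listOfReactants>
          <speciesReference species="s1"/>
          <speciesReference species="s3"/>
        </listOfReactants>
        <listOfProducts>
          <speciesReference species="s2"/>
          <speciesReference species="s4"/>
        </listOfProducts>
      </reaction>
      <reaction id="r2">
        <listOfReactants><speciesReference species="s2"/></listOfReactants>
        <listOfProducts><speciesReference species="s1"/></listOfProducts>
      </reaction>
    </listOfReactions>
  </model>
</sbml>"""


def reactions_converting(sbml_text, reactant_name, product_name):
    """Return ids of reactions that consume reactant_name and produce product_name."""
    root = ET.fromstring(sbml_text)
    # Map species ids to their human-readable names.
    names = {s.get("id"): s.get("name") for s in root.iterfind(".//sbml:species", NS)}
    hits = []
    for reaction in root.iterfind(".//sbml:reaction", NS):
        reactants = {
            names.get(ref.get("species"))
            for ref in reaction.iterfind("sbml:listOfReactants/sbml:speciesReference", NS)
        }
        products = {
            names.get(ref.get("species"))
            for ref in reaction.iterfind("sbml:listOfProducts/sbml:speciesReference", NS)
        }
        if reactant_name in reactants and product_name in products:
            hits.append(reaction.get("id"))
    return hits


print(reactions_converting(TOY_MODEL, "ATP", "ADP"))
```

Running the same function over every file in a model collection would answer the question for a whole model set.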
State of affairs in 2015
● Standards:
– support for all steps of the modeling cycle
– support of various modeling techniques
– Still: some modeling concepts not yet covered (→ Report of the whole-cell modeling workshop, Waltemath et al 2015 (under review))
● Infrastructures:
– Software tools export/import standards
– Open model repositories and management systems
– Education
● Recognition
COMBINE Standards
● COmputational Modeling in BIology NEtwork
● Goals:
– Avoid overlap of standardisation efforts
– Coordinate standard developments
– Coordinate meetings
– Coordinate development of procedures & tools
– Common infrastructure for specification development, semantic annotation, and dissemination
● All specifications now citable and accessible in one place: Schreiber et al. (2015) http://journal.imbio.de/articles/pdf/jib-258.pdf
COMBINE Standards
Fig. : COMBINE standards today. Slide courtesy M. Hucka. http://www.slideshare.net/thehuck/a-summary-of-various-combine-standardization-activities
COMBINE Standards
● Data formats
– Community-developed representation formats for models and related data
– Format: XML, OWL, RDF/XML
● Minimum Information/Reporting guidelines:
– Minimum amount of data and information required to reproduce and interpret an experiment
– Format: human-readable specification documents
● Basis for the specification of data models and metadata
● Bio-ontologies
SBML
Fig.: SBML Level 3 Packages. Slide courtesy M. Hucka (ICSB 2014).
Lucky modelers: you should not need to worry about the details of these (XML) formats; the tools should handle import and export. (Tool developers should, though.)
Minimum Information Guidelines
● Reporting guidelines and checklists
● Narrative description of the information necessary to reproduce a model-based result
● MIRIAM: Minimum Information Requested In the Annotation of Models
● MIASE: Minimum Information About a Simulation Experiment
● MIAPE, MIAME… for experimental setups
MIRIAM – information to provide about a model
● Models must
– be encoded in a public machine-readable format
– be clearly linked to a single publication
– reflect the structure of the biological processes described in the reference paper (list of reactions, …)
– be instantiable in a simulation (possess initial conditions, …)
– be able to reproduce the results given in the reference paper
– contain creator’s contact details
– unambiguously identify each model constituent through annotation
You should worry about the details of the guidelines, as they help you to check whether you provide all necessary information.
Bio-ontologies for model annotation
● Major ontologies
● Linking framework: RDF/XML
● Annotation scheme: used to semantically enrich model files with detailed descriptions of the underlying biological entities, mathematical concepts or algorithms used during analysis
● De facto standard: SBML annotation scheme
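To make the annotation scheme concrete, here is a minimal sketch of reading RDF/XML annotations out of an SBML file and flagging constituents that carry none. The toy model, the species ids, the function name `species_annotations`, and the pairing of Phenylalanine-4-hydroxylase with `urn:miriam:uniprot:P00439` are assumptions made for this note. Only the Python standard library is used.

```python
import xml.etree.ElementTree as ET

NS = {
    "sbml": "http://www.sbml.org/sbml/level3/version1/core",
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
}
# ElementTree exposes namespaced attributes in {uri}name form.
RDF_RESOURCE = "{http://www.w3.org/1999/02/22-rdf-syntax-ns#}resource"

# Hypothetical toy model: one annotated species, one without annotation.
ANNOTATED_MODEL = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="toy">
    <listOfSpecies>
      <species id="pah" name="Phenylalanine-4-hydroxylase" metaid="meta_pah">
        <annotation>
          <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
                   xmlns:bqbiol="http://biomodels.net/biology-qualifiers/">
            <rdf:Description rdf:about="#meta_pah">
              <bqbiol:is>
                <rdf:Bag>
                  <rdf:li rdf:resource="urn:miriam:uniprot:P00439"/>
                </rdf:Bag>
              </bqbiol:is>
            </rdf:Description>
          </rdf:RDF>
        </annotation>
      </species>
      <species id="tyr" name="Tyrosine"/>
    </listOfSpecies>
  </model>
</sbml>"""


def species_annotations(sbml_text):
    """Map each species id to the list of resource URIs found in its annotation."""
    root = ET.fromstring(sbml_text)
    result = {}
    for species in root.iterfind(".//sbml:species", NS):
        uris = [li.get(RDF_RESOURCE) for li in species.iterfind(".//rdf:li", NS)]
        result[species.get("id")] = uris
    return result


annotations = species_annotations(ANNOTATED_MODEL)
# Species without any annotation cannot be identified unambiguously.
unannotated = [sid for sid, uris in annotations.items() if not uris]
print(annotations)
print("missing annotation:", unannotated)
```

A check like this covers only one MIRIAM item (annotation of constituents); the other requirements still need to be checked against the guideline itself.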
Bio-ontologies for model annotation
Figure labels with their SBO annotations:
● enzyme: urn:miriam:SBO:0000014
● substrate: urn:miriam:SBO:0000015
● product: urn:miriam:SBO:0000011
● catalytic rate constant: urn:miriam:SBO:0000025
● enzymatic rate law
Bio-ontologies for model annotation
Figure labels: Tyrosine, Phenylalanine-4-hydroxylase, Tetrahydrobiopterin
Annotations shown: urn:miriam:uniprot:P00439, urn:miriam:uniprot:Q03393, urn:miriam:uniprot:P07101
Levels of standardisation
Fig.: COMBINE standards that are relevant to this workshop; adapted from (Chelliah et al., 2009, DILS)
State of affairs in 2015
● Standards:
– support for all steps of the modeling cycle
– support of various modeling techniques
– Still: some modeling concepts not yet covered (→ Report of the whole-cell modeling workshop, Waltemath et al 2015 (under review))
● Infrastructures:
– Software tools export/import standards
– Open model repositories and management systems
– Education
● Recognition
Software tool support
● Standard converters (SBML ↔ SBGN; SBML ↔ CellML...)
● Standard support in software
● Interoperability tools
– Cytoscape for network analysis and visualization (SBML, SBGN, BioPAX)
– The Virtual Cell for modeling (SBML, BioPAX)
– VANTED for network analysis, visualization and manipulation (SBML, SBGN)
Check COMBINE Website for details
Software tool support in SBML
Fig.: Software supporting SBML. Slide courtesy M. Hucka (ICSB 2014).
Also check the SBML Software Matrix
Open model repositories
● Structured, type-specific archives
● Offer download of curated, annotated, published models and associated files (visual representations, simulation descriptions, publication…)
CCDB
Model management systems
Fig.: The SEEK. Wolstencroft et al (2015). doi:10.1186/s12918-015-0174-y
Model management tasks:
● Storage & Integration of data
● Search & Retrieval
● Version Control
● Provenance
Getting involved
● COMBINE user meeting → next: COMBINE 2015, Oct 11-16, Salt Lake City
● COMBINE developers meeting → next: HARMONY 2016, June 7-11, Auckland
● FAIR-DOM activities: webinars, blogs, foundries
● COMBINE activities: workshops, presentations, tutorials
● Help through specification documents, showcases, mailing lists, ...
http://co.mbine.org/ http://fair-dom.org/
State of affairs in 2015
● Standards:
– support for all steps of the modeling cycle
– support of various modeling techniques
– Still: some modeling concepts not yet covered (→ Report of the whole-cell modeling workshop, Waltemath et al 2015 (under review))
● Infrastructures:
– Open model repositories
– Software tools export/import standards
– Model management systems
– Education
● Recognition
Recognition
1) Higher visibility of research
2) Long-term availability
3) Link to other resources
4) Quality-checks
Fig.: Piwowar and Vision (2013) Data reuse and the open data citation advantage. PeerJ
Model curation and publication in BioModels Database
Fig.: Li et al (2010)
Functional curation of models through virtual experiments
Fig.: Functional curation of models in the Web Lab. Cooper et al (2015) https://peerj.com/preprints/1338/ ; Cooper et al (2014) doi:10.1016/j.pbiomolbio.2014.10.001
Try out the Cardiac physiology Web Lab
Enabling model version control
Fig.: courtesy Martin Scharm, BudHat
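As a rough illustration of what version control for models has to detect, the sketch below compares two versions of an SBML file by element id. It is a naive, id-based comparison written for this note and is not the algorithm behind BudHat. The function names and both toy model versions are hypothetical.

```python
import xml.etree.ElementTree as ET


def element_ids(sbml_text, tag):
    """Collect the id attribute of every element whose local name equals tag."""
    root = ET.fromstring(sbml_text)
    # Strip the '{namespace}' prefix that ElementTree puts in front of tag names.
    return {el.get("id") for el in root.iter() if el.tag.rsplit("}", 1)[-1] == tag}


def diff_models(old_text, new_text, tags=("species", "reaction", "parameter")):
    """Report ids added and removed between two model versions, per element type."""
    changes = {}
    for tag in tags:
        old_ids = element_ids(old_text, tag)
        new_ids = element_ids(new_text, tag)
        changes[tag] = {
            "added": sorted(new_ids - old_ids),
            "removed": sorted(old_ids - new_ids),
        }
    return changes


# Two hypothetical versions of the same toy model.
VERSION_1 = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="m">
    <listOfSpecies><species id="A"/><species id="B"/></listOfSpecies>
    <listOfReactions><reaction id="r1"/></listOfReactions>
  </model>
</sbml>"""

VERSION_2 = """<sbml xmlns="http://www.sbml.org/sbml/level3/version1/core" level="3" version="1">
  <model id="m">
    <listOfSpecies><species id="A"/><species id="C"/></listOfSpecies>
    <listOfReactions><reaction id="r1"/><reaction id="r2"/></listOfReactions>
  </model>
</sbml>"""

print(diff_models(VERSION_1, VERSION_2))
```

An id-based diff misses renamed elements and changes inside an element (for example, a modified rate law), which is one reason dedicated model-aware difference detection is useful.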
Enabling on-the-fly reproduction of the model-based results
Fig.: Software supporting SBML and SED-ML. Waltemath et al (2011). doi:10.1186/1752-0509-5-198
So much for the theory… and in practice?
● Check for existing standards and specifications thereof: http://co.mbine.org
● Get involved in standard development → through the relevant mailing lists
● Problems with getting your model into the right format?
– Is it a problem with finding the appropriate format or tool? → Ask on the relevant mailing list... people are friendly and happy to help.
– Is it a tool problem? → Complain to the tool developers... who will hopefully change it.
– Is it a problem with the lack of a standard? → Feed back into the standards community… people are friendly and happy to improve the standard.
● Follow best practices when aiming at publishing a result.
Best practices for publishing reproducible modeling results
1) Encode the model in a standard format, e.g. SBML.
2) Annotate the SBML model, following MIRIAM.
3) Publish the simulation experiment descriptions in standard format, e.g. SED-ML. If unsure what to include, consult the MIASE guidelines.
4) Try to reproduce the results *yourself*.
5) Ask a colleague to reproduce the results.
6) If successful: Archive all steps that led to your results.
7) Disseminate model code and simulation description through an open repository.
Adapted from: Waltemath et al (2013), doi:10.1007/978-94-007-6803-1_10
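Steps 6 and 7 can be sketched as bundling the model and the simulation description into a single file. The slides do not name a container format. The sketch below assumes the COMBINE Archive (OMEX) layout, a ZIP file with a `manifest.xml`, and writes the manifest from memory of that specification, so the namespace and format URIs should be checked against the specification before use. The function name `build_archive` and the placeholder file contents are hypothetical.

```python
import io
import zipfile
from xml.sax.saxutils import quoteattr

# Assumed OMEX manifest namespace and format identifiers (verify against the spec).
OMEX_NS = "http://identifiers.org/combine.specifications/omex-manifest"
FORMATS = {
    "omex": "http://identifiers.org/combine.specifications/omex",
    "manifest": "http://identifiers.org/combine.specifications/omex-manifest",
    "sbml": "http://identifiers.org/combine.specifications/sbml",
    "sedml": "http://identifiers.org/combine.specifications/sed-ml",
}


def build_archive(files):
    """Return the bytes of a ZIP archive holding the given files plus a manifest.

    files maps an archive-relative file name to a (format key, bytes content) pair.
    """
    entries = [(".", FORMATS["omex"]), ("./manifest.xml", FORMATS["manifest"])]
    entries += [("./" + name, FORMATS[fmt]) for name, (fmt, _) in files.items()]

    lines = ['<?xml version="1.0" encoding="UTF-8"?>', '<omexManifest xmlns="%s">' % OMEX_NS]
    lines += [
        "  <content location=%s format=%s/>" % (quoteattr(loc), quoteattr(fmt))
        for loc, fmt in entries
    ]
    lines.append("</omexManifest>")
    manifest = "\n".join(lines)

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        archive.writestr("manifest.xml", manifest)
        for name, (_, content) in files.items():
            archive.writestr(name, content)
    return buffer.getvalue()


# Placeholder contents stand in for a real SBML model and SED-ML description.
data = build_archive({
    "model.xml": ("sbml", b"<sbml/>"),
    "experiment.sedml": ("sedml", b"<sedML/>"),
})
with zipfile.ZipFile(io.BytesIO(data)) as archive:
    print(sorted(archive.namelist()))
```

Writing `data` to a file gives a single artifact that a colleague can unpack to attempt step 5, and that can be uploaded for step 7.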