eScience-School-Oct2012-Campinas-Brazil
Transcript of eScience-School-Oct2012-Campinas-Brazil
![Page 1: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/1.jpg)
The buzz around reproducible bioscience data:
the policies, the communities and the standards
Susanna-Assunta Sansone, PhD Principal Investigator and Team Leader,
University of Oxford e-Research Centre, Oxford, UK
SPSAS e-SciBioEnergy São Paulo School of Advanced Science on e-Science for Bioenergy Research, 22-26 Oct, 2012, Campinas, Brazil
Slides at: http://www.slideshare.net/SusannaSansone
![Page 2: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/2.jpg)
Lab scientist
Data scientist
Consultant
Team Leader
![Page 3: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/3.jpg)
Oxford e-Research Centre
![Page 4: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/4.jpg)
Oxford e-Research Centre
![Page 5: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/5.jpg)
Providing research computing, high-performance computing
Integrating with national and international infrastructure
Supporting leading edge facilities through education and training
Oxford e-Research Centre
![Page 6: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/6.jpg)
Oxford e-Research Centre
Collaborating with European and wider international groups in, e.g.:
energy, radio astronomy, biological data federation, life sciences simulation, biodiversity, computational chemistry, neuroscience, digital humanities tools, digital music analysis
Research in: computation, data infrastructure and analysis, visualisation
![Page 7: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/7.jpg)
tox/pharma
env
health
agro
My team’s activities and the groups we work with: data management and biocuration; collaborative development of software, databases, standards and ontologies
• environmental genomics • metabolomics • metagenomics • nanotechnology • proteomics
• stem cell discovery • systems biology • transcriptomics • toxicogenomics • environmental health
![Page 8: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/8.jpg)
http://www.flickr.com/photos/12308429@N03/4957994485/ CC BY
![Page 9: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/9.jpg)
“The buzz around reproducible bioscience data:
the policies, the communities and the standards”
“The reality from the buzz:
how to deliver reproducible bioscience data”
Outline
![Page 10: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/10.jpg)
Harmonize collection across sites
Find matching studies
Data dissemination
Long-term data stewardship
Preserve institutional / corporate memory
![Page 11: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/11.jpg)
Utilize public data
Identify suitable data
Retrieve
Curate and harmonize
Re-analyze
![Page 12: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/12.jpg)
Address reproducibility /
reuse of public data
![Page 13: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/13.jpg)
Address reproducibility /
reuse of public data
![Page 14: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/14.jpg)
Ioannidis et al., Repeatability of published microarray gene expression analyses. Nature Genetics 41(2), 149-55 (2009) doi:10.1038/ng.295
Address reproducibility /
reuse of public data
![Page 15: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/15.jpg)
Address reproducibility /
reuse of public data
![Page 16: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/16.jpg)
Address reproducibility /
reuse of public data
![Page 17: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/17.jpg)
Address reproducibility /
reuse of public data
![Page 18: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/18.jpg)
http://www.flickr.com/photos/notbrucelee/8016189356/ CC BY
![Page 19: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/19.jpg)
http://www.flickr.com/photos/notbrucelee/8016189356/ CC BY
COMPREHENSIBLE, INTEROPERABLE, REPRODUCIBLE, REUSABLE
![Page 20: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/20.jpg)
The International Conference on Systems Biology (ICSB), 22-28 August, 2008 Susanna-Assunta Sansone www.ebi.ac.uk/net-project
Growing, worldwide movement for reproducible research
“Publicly-funded research data are a public good, produced in the public interest”
“Publicly-funded research data should be openly available to the maximum extent possible”
Shared, annotated research data and methods offer new discovery opportunities and prevent unnecessary repetition of work.
Improved data sharing underpins science of the future
![Page 21: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/21.jpg)
§ Researchers and bioinformaticians in both academic and commercial science, along with funding agencies and publishers, embrace the concept that community-developed standards are pivotal to structure and enrich the annotation of
• entities of interest (e.g., genes, metabolites, phenotypes) and
• experimental steps (e.g., provenance of study materials, technology and measurement types)
esoteric formats
ad hoc or proprietary terminology
lack of sufficient contextual information
comprehensible?
interoperable?
reusable?
reproducible?
Growing, worldwide movement for reproducible research
![Page 22: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/22.jpg)
Seven week old C57BL/6N mice were treated with low-fat diet.
Liver was dissected out, RNA prepared…etc.
Type of protocol - sample treatment
Type of protocol - nucleic acid extraction
Age value
Unit
Strain name
Subject of the experiment
Type of diet and experimental condition
Anatomy part
§ Describe and communicate the information in an unambiguous, human and machine readable manner
Structure and enrich description of the experiments
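As a rough illustration of what "unambiguous, human and machine readable" can mean in practice, the free-text sentence on this slide can be broken into labelled fields. The sketch below is plain Python. The field names follow the slide's callout labels, and the pairing of each label with a phrase of the sentence is inferred from the wording. The dictionary layout and the `describe` helper are assumptions made for illustration and are not taken from any standard named in the talk.

```python
# Hypothetical structured form of the slide's free-text sample description.
# Keys mirror the slide's callout labels; the layout itself is an assumption.
sample_annotation = {
    "subject_of_experiment": "mice",
    "strain_name": "C57BL/6N",
    "age": {"value": 7, "unit": "week"},
    "diet_and_experimental_condition": "low-fat diet",
    "anatomy_part": "liver",
    "protocols": [
        {"type": "sample treatment"},
        {"type": "nucleic acid extraction"},
    ],
}


def describe(annotation):
    """Render the structured record back into a one-line, human-readable summary."""
    age = annotation["age"]
    protocol_types = ", ".join(p["type"] for p in annotation["protocols"])
    return (
        f"{age['value']}-{age['unit']}-old {annotation['strain_name']} "
        f"{annotation['subject_of_experiment']}; "
        f"condition: {annotation['diet_and_experimental_condition']}; "
        f"tissue: {annotation['anatomy_part']}; "
        f"protocols: {protocol_types}"
    )


print(describe(sample_annotation))
```

Once the age is stored as a value plus a unit, and the strain and tissue as separate fields, a program can filter or compare studies on those fields without parsing prose.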
![Page 23: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/23.jpg)
§ Describe and communicate the information in an unambiguous, human and machine readable manner
Figure: credit to OBI consortium
Structure and enrich description of the experiments
![Page 24: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/24.jpg)
Reproducible & Reusable
Bioscience Research
![Page 25: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/25.jpg)
Reproducible & Reusable
Bioscience Research
Well-annotated & Structured Data
reasoning
analysis
exchange
integration
visualization
browsing retrieval
![Page 26: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/26.jpg)
Reproducible & Reusable
Bioscience Research
Well-annotated & Structured Data
Community Standards
Software Tools
reasoning
analysis
exchange
integration
visualization
browsing retrieval
![Page 27: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/27.jpg)
http://www.flickr.com/photos/lamerentertainment/1581770980/sizes/m/in/photostream/
![Page 28: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/28.jpg)
Source of the figure: EBI website
§ Is interdisciplinary and integrative in character
• need to deal with new and existing datasets
• deal with a variety of data types
§ ‘How the organism works’ is the focus
• Twenty years ago data was the center
Experimental and
computational data
Publications
Today’s bioscience research
![Page 29: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/29.jpg)
Source: http://ebbailey.wordpress.com
![Page 30: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/30.jpg)
Example from the toxicogenomics domain
Study looking at the effect of a compound inducing liver damage by characterizing/measuring
- the metabolic profile by MS and NMR
- protein expression in liver by MS
- gene expression by DNA microarray
- and by conducting genetic and phenotypic analyses
Information contributing to the construction and validation of systems biology models
![Page 31: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/31.jpg)
Example of experiments by InnoMed PredTox, an FP6 public-private consortium
![Page 32: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/32.jpg)
§ Capture all salient features of the experimental workflow
§ Make annotation explicit and discoverable
§ Structure the descriptions for consistency, tracking
• independent variables
• dependent variables
using cross-references and resolvable identifiers
Structured description of datasets
![Page 33: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/33.jpg)
§ We must strike a balance between
• depth and breadth of information; and
• sufficient information required to reuse the data
Not too much, not too little, just ‘right’
![Page 34: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/34.jpg)
Information intensive experiments
![Page 35: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/35.jpg)
To make the experiments comprehensible and reusable,
underpinning future investigations, we need
common ways to report and share the experimental details and the associated data.
Consistent reporting will have a positive and long-lasting impact
on the value of collective scientific outputs.
Information intensive experiments
![Page 36: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/36.jpg)
§ The challenges we face
• Large in volume: lots of data types and metadata!
• Lots of free text descriptions: hard to mine, subject to mistakes!
• Babel of terminologies: lack of definitions, hard to map!
• Heterogeneous file formats: software lock-in!
§ Need for reporting standards
• Minimal reporting descriptors
- Report the same ‘core essentials’
• Controlled vocabularies or ontology
- Use the same word and mean the same thing
• Common exchange formats
- Make tools interoperable, allow data exchange and integration
Common ways to report and share
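Two of the three kinds of reporting standard on this slide can be sketched in a few lines of Python: a minimal checklist (are the "core essentials" reported?) and a controlled vocabulary (does the same word mean the same thing?). Everything named in the block is a hypothetical stand-in chosen for illustration: the required fields, the synonym table, and the helpers `missing_fields` and `normalize` are assumptions, not the contents of any real checklist or ontology.

```python
# Hypothetical minimal checklist: the 'core essentials' every record must report.
REQUIRED_FIELDS = {"organism", "organism_part", "assay_technology"}

# Hypothetical controlled vocabulary: synonyms mapped to one agreed term.
CONTROLLED_TERMS = {
    "mouse": "Mus musculus",
    "mice": "Mus musculus",
    "mus musculus": "Mus musculus",
    "liver": "liver",
    "hepatic tissue": "liver",
}


def missing_fields(record):
    """Return the required checklist fields that the record does not report."""
    return sorted(f for f in REQUIRED_FIELDS if not record.get(f))


def normalize(value):
    """Map a free-text value to its controlled term, or None if unknown."""
    return CONTROLLED_TERMS.get(value.strip().lower())


record = {"organism": "Mice", "organism_part": "Hepatic tissue"}
print(missing_fields(record))          # checklist fields still to be reported
print(normalize(record["organism"]))   # agreed term for the free-text value
```

A common exchange format would then be the agreed serialization of such checked and normalized records, so that different tools can read them without custom converters.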
![Page 37: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/37.jpg)
§ Describe and communicate the information to others, in an unambiguous manner
§ To unlock the value in the data
• Compare, query and evaluate data
- Facilitate scientific validation of the findings
• Understand variability within/between different technologies and protocols
- Facilitate technical validation
- Enable optimization of the experimental designs
- Identify critical checkpoints and develop quality metrics
§ To define submission and/or publication requirements
• Journals
• Databases
§ To ensure data integrity, reproducibility and (re)use
Reporting standards – the benefits
![Page 38: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/38.jpg)
• Genome annotation: www.geneontology.org
• Functional Genomics Data Society (FGED): www.fged.org
• HUPO Proteomics Standards Initiative (PSI): http://www.psidev.info
• Cheminformatics: www.ebi.ac.uk/chebi
• Pathways: www.biopax.org
• Systems modelling standards: www.co.mbine.org
• Metabolomics Standards Initiative (MSI): http://www.metabolomicssociety.org
• Genomic Standards Consortium (GSC): gensc.org
• Enzymology data standards: www.strenda.org
Escalating number of standardization efforts in bioscience, e.g.:
![Page 39: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/39.jpg)
Different communities, different norms and standards, e.g.:
• report the same core, essential information
• use the same word and refer to the same ‘thing’
• allow data to flow from one system to another
Challenges: lack of coordination, fragmentation and uneven coverage
![Page 40: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/40.jpg)
• report the same core, essential information
• use the same word and refer to the same ‘thing’
• allow data to flow from one system to another
Is this ‘general mobilization’ good or bad?
§ Difference in structures and processes:
• organization types (open, closed to members, society, WG…)
• standards development (how to design, develop, evaluate, maintain…)
• adoption, uptake, outreach (link to journals, funders, commercial sector…)
• funds (sponsors, memberships, grants, volunteering…)
![Page 41: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/41.jpg)
• report the same core, essential information
• use the same word and refer to the same ‘thing’
• allow data to flow from one system to another
§ Fragmentation of the standards is a major issue
• Being focused on particular communities’ interests, be they individual technologies or biological/biomedical disciplines, leads to duplication of effort and, more seriously, the development of (largely arbitrarily) different standards
• This severely hinders the interoperability of databases and tools and ultimately the integration of datasets
Is this ‘general mobilization’ good or bad?
![Page 42: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/42.jpg)
Three EBI omics systems
Submission
Access
Storage
Fragmentation of the databases and data, e.g.
![Page 43: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/43.jpg)
Three EBI omics systems
Submission
Access
Storage
Fragmentation of the databases and data, e.g.
![Page 44: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/44.jpg)
Three EBI omics systems
Submission
Access
Storage
Fragmentation of the databases and data, e.g.
![Page 45: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/45.jpg)
Three EBI omics systems
Submission: DIFFERENT formats, terminologies and tools
Access: DIFFERENT download formats
Storage: DIFFERENT
- Core requirements represented
- Representation of the studies and related samples
- Curation practices
Fragmentation of the databases and data, e.g.
![Page 46: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/46.jpg)
Technologically-delineated views of the world
Biologically-delineated views of the world
Generic features (‘common core’)
- description of source biomaterial
- experimental design components
[Figure labels: arrays and scanning, columns, gels, MS, FTIR, NMR; transcriptomics, metabolomics; plant biology, epidemiology, microbiology]
To integrate data we need interoperable standards
![Page 47: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/47.jpg)
§ Promote synergies
• Among basic academic (omics) research but also regulatory- or healthcare-driven initiatives
§ Much could be learned from exchange of ideas and practices
• Although regulatory- or healthcare-driven initiatives have far stricter guidelines
• Although SDOs often have ‘closed’ discussions and require membership
§ Create interoperable standards
• Fit neatly into a jigsaw, resolving inconsistency and filling gaps
§ Overcome several barriers
• Technical
• Funding issues
• Sociological…
Need to address the fragmentation
![Page 48: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/48.jpg)
“Any customer can have a car painted any colour that he wants so long as it is black” Henry Ford, you know who he is…
“Biologists would rather share their toothbrush than their gene name” Michael Ashburner, Professor of Genetics, University of Cambridge, UK
Eloquent quotes
![Page 49: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/49.jpg)
§ Buying nuts and bolts is easy today
• But in the 19th century it was very complicated!
Standards – an old issue, e.g. engineering in 1850
![Page 50: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/50.jpg)
§ Buying nuts and bolts is easy today
• But in the 19th century it was very complicated!
§ Nuts and bolts were custom made
• Products from different shops were incompatible
• Craftsmen liked the monopoly
- Customers were ‘locked in’!
§ In 1864 William Sellers initiated the standardization
• Mass production
• Get interchangeable parts
• Standardized way to make nuts and bolts
§ Generally adopted only after WWII, though…!
Standards – an old issue, e.g. engineering in 1850
![Page 51: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/51.jpg)
Social engineering
![Page 52: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/52.jpg)
Ownership of open standards can be problematic in broad, grass-roots collaborations; it requires improved models to encourage maintenance of and contributions to these efforts, supporting their evolution
![Page 53: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/53.jpg)
The extensive community liaison needs to be managed
and funded; rewards and incentives need to be identified
for all contributors
![Page 54: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/54.jpg)
The cost of implementing a standards-supported data
sharing vision is as large as the number of stakeholders that must operate synchronously
![Page 55: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/55.jpg)
§ Several data preservation, management and sharing policies have emerged in response to increased funding for omics domains
§ Even if in general terms, standards are recognized as necessary ‘tools’ to unambiguously represent, describe and communicate research data
1. Funders actively developing data policies
![Page 56: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/56.jpg)
![Page 57: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/57.jpg)
§ “… lack of standardized data affects CDER’s review processes by curtailing a reviewer’s ability to perform integral tasks such as rapid acquisition, storage, analysis......efficient management of a portfolio of standards projects will require coordinated efforts and clear roles for multiple participants within/outside FDA”
2. Similar trend in the regulatory arena
![Page 58: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/58.jpg)
![Page 59: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/59.jpg)
§ Continue to support the development of open standards and tools
• to support sharing of sufficiently well annotated datasets
• to enable comprehensible, reusable, reproducible research
3. Publishers have become strong advocates
![Page 60: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/60.jpg)
….the rise of data-driven journals, e.g.:
partnering with:
![Page 61: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/61.jpg)
The rise of data-driven journals, e.g.:
partnering with:
![Page 62: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/62.jpg)
§ R&D has invested heavily in procedures and tools that integrate external information with their own data to enhance the decision-making process
• Now joining forces to streamline non-competitive elements of the life science workflow by the specification of common standards, business terms, relationships and processes
4. Similar trend in the commercial sector
![Page 63: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/63.jpg)
Big Life Science Company

| | Yesterday | Today | Tomorrow |
|---|---|---|---|
| Innovation model | Innovation inside | Searching for innovation | Heterogeneity of collaborations; part of the wider ecosystem |
| IT | Internal apps & data | Struggling with change, security and trust | Cloud, services |
| Data | Mostly inside | In and out | Distributed |
| Portfolio | Internally driven and owned | Partially shared | Shared portfolio |

Credit to: Pistoia Alliance
Big Life Science
Company
Proprietary content provider
Public content provider
Academic group
Software vendor
CRO
Service provider
Regulatory authorities
....their information landscape is evolving
![Page 64: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/64.jpg)
CC BY
http://www.flickr.com/photos/idiolector/289490834/
![Page 65: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/65.jpg)
![Page 66: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/66.jpg)
![Page 67: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/67.jpg)
“The buzz around reproducible bioscience data:
the policies, the communities and the standards”
• Contribute to the reproducible research movement
• Learn about open community-standards in your area
• Consider data science as a career path
Take home messages
![Page 68: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/68.jpg)
“The buzz around reproducible bioscience data:
the policies, the communities and the standards”
“The reality from the buzz:
how to deliver reproducible bioscience data”
Outline
![Page 69: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/69.jpg)
“The buzz around reproducible bioscience data:
the policies, the communities and the standards”
“The reality from the buzz:
how to deliver reproducible bioscience data”
How do we achieve this? Is it possible to achieve a common,
structured representation of diverse bioscience experiments
that:
• follows the appropriate community standards and
• delivers COMPREHENSIBLE, INTEROPERABLE, REPRODUCIBLE, REUSABLE research?
![Page 70: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/70.jpg)
MIAME, MIAPA, MIRIAM, MIQAS, MIX, MIGEN, CIMR, MIAPE, MIASE, MIQE, MISFISHIE…, REMARK, CONSORT
MAGE-Tab, GCDML, SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML…, GELML, ISA-Tab, CML, MITAB
AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO…, TEDDY, PRO, XAO, DO, VO
Growing number of reporting standards
![Page 71: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/71.jpg)
Growing number of reporting standards
+ 130 (estimated)
+ 150 (source: MIBBI, EQUATOR)
+ 303 (source: BioPortal)
Databases, annotation, curation tools
MIAME, MIAPA, MIRIAM, MIQAS, MIX, MIGEN, CIMR, MIAPE, MIASE, MIQE, MISFISHIE…, REMARK, CONSORT
MAGE-Tab, GCDML, SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML…, GELML, ISA-Tab, CML, MITAB
AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO…, TEDDY, PRO, XAO, DO, VO
![Page 72: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/72.jpg)
But how much do we know about these standards?
MIAME, MIAPA, MIRIAM, MIQAS, MIX, MIGEN, CIMR, MIAPE, MIASE, MIQE, MISFISHIE…, REMARK, CONSORT
MAGE-Tab, GCDML, SRAxml, SOFT, FASTA, DICOM, MzML, SBRML, SEDML…, GELML, ISA-Tab, CML, MITAB
AAO, CHEBI, OBI, PATO, ENVO, MOD, BTO, IDO…, TEDDY, PRO, XAO, DO, VO
![Page 73: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/73.jpg)
Which ones are mature enough for me to use or recommend?
I work on plants; are these just for biomedical applications?
What are the criteria to evaluate their status and value?
How can I get involved to propose extensions or modifications?
Which tools and databases implement which standards?
I use high-throughput sequencing technologies; which ones are applicable to me?
But how much do we know about these standards?
![Page 74: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/74.jpg)
§ A bewildering array of standards is available, but
• these are hard to find and at different levels of maturity; in some areas duplications or gaps in coverage also exist
§ Standards are just a ‘means to an end’, therefore
• we want to make them discoverable and accessible, maximizing their use to assist the virtuous data cycle, from generation to standardization through publication to subsequent sharing and reuse
But how much do we know about these standards?
![Page 75: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/75.jpg)
(2007) Vol 25 No 11
obofoundry.org
![Page 76: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/76.jpg)
§ Compound terms should be formed out of simpler constituents:
• Body weight: weight (quality ontology, PATO) that inheres_in (relation ontology, RO) whole_organism (anatomy ontology, CARO)
• Xylene contaminated soil: soil (environmental ontology, EnvO) that has_contaminated (relation ontology, RO) xylene (chemical ontology, ChEBI)
Towards Lego-like ontologies
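The "Lego-like" idea can be sketched as a small data structure: a compound term is not defined from scratch but assembled from a base term, a relation and a target term, each borrowed from its own ontology. The Python below is only an illustration of that composition pattern. The `Term` and `CompoundTerm` classes are hypothetical, and real ontologies express this with formal logical definitions rather than with strings.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Term:
    """A term taken from a single source ontology."""
    label: str
    ontology: str


@dataclass(frozen=True)
class CompoundTerm:
    """A compound term built as: base term + relation + target term."""
    name: str
    base: Term
    relation: Term
    target: Term

    def definition(self):
        """Spell out the compound term in terms of its constituents."""
        return (
            f"{self.base.label} ({self.base.ontology}) that "
            f"{self.relation.label} ({self.relation.ontology}) "
            f"{self.target.label} ({self.target.ontology})"
        )


# The slide's first example, rebuilt from its three constituents.
body_weight = CompoundTerm(
    name="body weight",
    base=Term("weight", "PATO"),
    relation=Term("inheres_in", "RO"),
    target=Term("whole_organism", "CARO"),
)

print(f"{body_weight.name} = {body_weight.definition()}")
```

Because each constituent keeps its source ontology, a tool that already understands PATO, RO and CARO terms can interpret the compound without a new, separately maintained definition.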
![Page 77: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/77.jpg)
(2008) Vol 26 No 8
mibbi.org
![Page 78: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/78.jpg)
§ Serves researchers, biocurators, journal editors and reviewers, and funders to
• discover checklists for a particular domain
• monitor progress of extant efforts
• facilitate collaborations
![Page 79: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/79.jpg)
Science (2009), Vol 326, 234-236
http://biosharing.org
![Page 80: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/80.jpg)
A catalogue to map the landscape of standards and the systems implementing them: Over 400 bio-standards (public and in curation)
Field*, Sansone* et al., Omics data sharing. Science 326, 234-36 (2009) doi:10.1126/science.1180598
![Page 81: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/81.jpg)
• A coherent, curated and searchable catalogue of data sharing resources
• Bioscience standards and associated data-sharing policies, publications, tools and databases
• Assessment criteria for usability and popularity of standards
• Relationships among standards
• Encouragement for communication & interaction among groups
• Promoting interoperability & informed decisions about standards
![Page 82: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/82.jpg)
Smith et al, 2007
![Page 83: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/83.jpg)
Smith et al, 2007
Taylor, Field, Sansone et al, 2008
![Page 84: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/84.jpg)
List of databases, linked to standards a collaboration with Database Issue
![Page 85: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/85.jpg)
![Page 86: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/86.jpg)
![Page 87: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/87.jpg)
The relationship among popular standard formats for pathway information. BioPAX and PSI-MI are designed for data exchange to and from databases and for pathway and network data integration. SBML and CellML are designed to support mathematical simulations of biological systems, and SBGN represents pathway diagrams.
CREDIT: Demir, et al., The BioPAX community standard for pathway data sharing, 2010.
Major challenge: define ‘relations’ among standards
![Page 88: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/88.jpg)
![Page 89: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/89.jpg)
Example of a multi-assay study – how many ‘standards’ are applicable to this?
![Page 90: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/90.jpg)
![Page 91: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/91.jpg)
![Page 92: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/92.jpg)
![Page 93: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/93.jpg)
§ A grassroots collaborative that works to facilitate collection, curation and sharing of experiments using a common, structured representation of the experiments that:
• transcends individual biological and technological domains, and
• can be ‘configured’ to implement (several of) the community standards
An exemplar approach to the status quo
www.biosharing.org
www.isacommons.org
TOWARDS INTEROPERABLE BIOSCIENCE DATA
Sansone SA, Rocca-Serra P, Field D, Maguire E, Taylor C, Hofmann O, Fang H, Neumann S, Tong W, Amaral-Zettler L, Begley K, Booth T, Bougueleret L, Burns G, Chapman B, Clark T, Coleman LA, Copeland J, Das S, de Daruvar A, de Matos P, Dix I, Edmunds S, Evelo C, Forster M, Gaudet P, Gilbert J, Goble C, Griffin J, Jacob D, Kleinjans J, Harland L, Haug K, Hermjakob H, Sui S, Laederach A, Liang S, Marshall S, Merrill E, McGrath A, Reilly D, Roux M, Shamu C, Shang C, Steinbeck C, Trefethen A, Williams-Jones B, Wolstencroft K, Xenarios J, Hide W.
Feb 2012
doi:10.1038/ng.1054
![Page 94: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/94.jpg)
![Page 95: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/95.jpg)
user community
metadata tracking framework
![Page 96: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/96.jpg)
General-purpose, configurable format, designed to support the use of several standards checklists and terminologies, and conversions to (a growing number of) other metadata formats used by public repositories, e.g.:
• MAGE-Tab
• SRA-xml
• SOFT
• Pride-xml
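Because the format is tabular, a description written in it can be read with ordinary tools. The sketch below uses only the Python standard library; the column headers and values are illustrative assumptions modelled on a study-sample table, not content taken from these slides, and `read_table` is a hypothetical helper rather than part of the ISA software suite.

```python
import csv
import io

# Illustrative tab-delimited table (assumed headers and values).
STUDY_TABLE = (
    "Source Name\tCharacteristics[organism]\tProtocol REF\tSample Name\n"
    "source1\tHomo sapiens\tsample collection\tsample1\n"
    "source2\tHomo sapiens\tsample collection\tsample2\n"
)


def read_table(text):
    """Parse tab-delimited text into a list of dicts, one per row."""
    reader = csv.DictReader(io.StringIO(text), delimiter="\t")
    return list(reader)


rows = read_table(STUDY_TABLE)

# Trace each sample back to the source material it was derived from.
sample_to_source = {row["Sample Name"]: row["Source Name"] for row in rows}

print(sample_to_source)
```

The same rows could then be handed to a converter targeting one of the repository formats listed above; that step is not shown here.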
![Page 97: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/97.jpg)
(Rocca-Serra et al, 2010)
A collaborative effort of international research/service groups: University of Oxford, EBI, Harvard School of Public Health, NERC Environmental Bioinformatics Centre, Genomic Standards Consortium, US FDA Center for Bioinformatics, Leibniz Institute of Plant Biochemistry and more.
ISA software suite: supporting standards-compliant experimental annotation and enabling curation at the community level
![Page 98: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/98.jpg)
![Page 99: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/99.jpg)
1
Create template(s) to fit the type of experiments to be described
Create templates detailing the steps to be reported for different investigations, complying with community standards, e.g. configuring the value(s) allowed for each field to be:
• text (with/without regular expression testing),
• ontology terms,
• numbers etc.
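The idea of a template that constrains each field can be sketched as follows. This is a minimal illustration, not the ISA configurator's own format: the field names, the regular expression, the allowed term list and all function names are assumptions made up for the example.

```python
import re


def make_text_rule(pattern=None):
    """Free text, optionally checked against a regular expression."""
    compiled = re.compile(pattern) if pattern else None

    def check(value):
        return compiled is None or compiled.fullmatch(value) is not None

    return check


def make_ontology_rule(allowed_terms):
    """Value must be one of a fixed set of terms (stand-in for an ontology lookup)."""
    allowed = frozenset(allowed_terms)

    def check(value):
        return value in allowed

    return check


def number_rule(value):
    """Value must parse as a number."""
    try:
        float(value)
    except ValueError:
        return False
    return True


# Hypothetical template: one rule per field to be reported.
TEMPLATE = {
    "Sample ID": make_text_rule(r"S\d{3}"),
    "Organism part": make_ontology_rule({"liver", "kidney"}),
    "Dose": number_rule,
}


def validate(record, template):
    """Return a list of problems found in one record; empty means compliant."""
    errors = []
    for field, rule in template.items():
        value = record.get(field)
        if value is None:
            errors.append(field + ": missing")
        elif not rule(value):
            errors.append(field + ": invalid value " + repr(value))
    return errors


print(validate({"Sample ID": "S001", "Organism part": "liver", "Dose": "2.5"}, TEMPLATE))
print(validate({"Sample ID": "X1", "Organism part": "liver"}, TEMPLATE))
```

A real configuration would resolve ontology terms against a terminology service instead of a hard-coded set; the set is used here only to keep the sketch self-contained.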
![Page 100: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/100.jpg)
Describe, curate your experiment with geographically distributed collaborators
Report and edit the description of the investigation using customized Google Spreadsheets (importing the ‘template’ created by the ISA configurator) enabled with ontology search and term-tagging features.
2a
![Page 101: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/101.jpg)
Or describe, curate your experiment using a desktop-based tool
Report and edit the description using this tool (also customized using the templates), with a spreadsheet-like look and feel, packed with functionalities such as:
• ontology search (access via )
• term-tagging features
• import from spreadsheets etc.
2b
![Page 102: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/102.jpg)
empowering researchers to use standards
To mint DOIs
ISMB tag: #PP44
![Page 103: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/103.jpg)
Perform data analysis
We are building relevant ISA modules for GenomeSpace, R-based BioConductor and Galaxy tools
3
![Page 104: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/104.jpg)
Share your experiments with the world as Linked Open Data
Through conversion to RDF; work in collaboration with the W3C HCLSIG
4
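To make the conversion step concrete, here is a small sketch that turns one row of a tabular description into RDF statements serialized as N-Triples. It is an assumption-laden illustration, not the project's converter: the `http://example.org/` namespace, the `sample` and `property` path segments and the function names are all hypothetical.

```python
from urllib.parse import quote

BASE = "http://example.org/"  # hypothetical namespace, for illustration only


def iri(*parts):
    """Build an IRI under BASE, percent-encoding each path segment."""
    path = "/".join(quote(part, safe="") for part in parts)
    return "<" + BASE + path + ">"


def literal(value):
    """Serialize a plain string literal with basic N-Triples escaping."""
    escaped = (
        value.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )
    return '"' + escaped + '"'


def row_to_triples(row, id_column):
    """Emit one triple per non-identifier column of a table row."""
    subject = iri("sample", row[id_column])
    triples = []
    for column, value in row.items():
        if column == id_column:
            continue
        triples.append(f"{subject} {iri('property', column)} {literal(value)} .")
    return triples


row = {"Sample Name": "sample1", "organism": "Homo sapiens"}
for line in row_to_triples(row, "Sample Name"):
    print(line)
```

In practice the predicates would come from shared vocabularies rather than from column names, which is where community-developed terminologies make the resulting statements interoperable.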
![Page 105: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/105.jpg)
Tim Berners-Lee’s 5-star deployment scheme for Linked Open Data
![Page 106: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/106.jpg)
5
Submit your experiments to public repositories
Directly in ISA-Tab or reformatting using the ISAconverter
![Page 107: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/107.jpg)
6
Create your own repository
Store the investigations in the database, assign access rights and conduct maintenance tasks. Share, browse, query and view investigations, their descriptions and access associated data files.
![Page 108: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/108.jpg)
Maguire E, Rocca-Serra P, Sansone SA, Davies J and Chen M. Taxonomy-based Glyph Design -- with a Case Study on Visualizing Workflows of Biological Experiments, IEEE Transactions on Visualization and Computer Graphics, volume 18, 2012
(in press)
![Page 109: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/109.jpg)
A growing ecosystem of over 30 public and internal resources using the ISA metadata tracking framework (ISA-Tab and/or format) to facilitate standards-compliant collection, curation, management and reuse of investigations in an increasingly diverse set of life science domains, including:
• environmental health
• environmental genomics
• metabolomics
• metagenomics
• nanotechnology
• proteomics
• stem cell discovery
• systems biology
• transcriptomics
• toxicogenomics
• also by communities working to build a library of cellular signatures
![Page 110: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/110.jpg)
Importance of a local community
Implementations at Harvard
![Page 111: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/111.jpg)
Importance of a local community
Implementations at Harvard
data sharing in ISA-Tab
![Page 112: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/112.jpg)
![Page 113: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/113.jpg)
Implementation at the EBI
![Page 114: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/114.jpg)
Data papers
![Page 115: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/115.jpg)
Nanotechnology Informatics Working Group
Extensions of the
![Page 116: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/116.jpg)
Development timeline, 2007 to 2012
Tracks: Community involvement and uptake; Core developments; Publications
Milestones shown:
• Strawman ISA-Tab spec
• 1st, 2nd and 3rd ISA-Tab workshops
• Final ISA-Tab spec
• ISA software v1
• Database instance at EBI
• 1st public instance: Harvard Stem Cell Discovery Engine
• Conversions to Pride-XML/SRA-XML/MAGE-Tab and more
• RDF format starts
• Links to analysis tools starts
• Other tools implement ISA-Tab
• User workshops/visits start
• Growing number of systems starts to adopt ISA framework
• Open source code
Publications shown:
• Workshop reports
• ‘Omics data sharing (Science)
• ISA-Tab and ISA software suite (Bioinformatics)
• Stem Cell Discovery Engine (NAR)
• ISA Commons (Nature Genetics)
![Page 117: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/117.jpg)
“The buzz around reproducible bioscience data:
the policies, the communities and the standards”
“The reality from the buzz:
how to deliver reproducible bioscience data”
Final remarks
![Page 118: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/118.jpg)
http://www.flickr.com/photos/equinoxefr/2620239993/ CC BY
Your research and all (publicly funded) research should make an … impact
![Page 119: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/119.jpg)
http://www.flickr.com/photos/webhamster/2582189977/ CC BY
… the biggest possible impact!
![Page 120: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/120.jpg)
http://www.flickr.com/photos/andrevanbortel/3745527869/sizes/m/in/photostream/
![Page 121: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/121.jpg)
Notes in Lab Books (information for humans)
Spreadsheets and Tables (the compromise)
Facts as RDF statements (information for machines)
We must increase the level of annotation
• Invest in curating and managing data at the source using:
  • a common metadata tracking framework, such as ISA
  • publicly available and community-developed terminologies
  • recording sufficient contextual information about the experimental steps
§ Progressively, datasets will become more comprehensible, interoperable, reproducible and (re)usable, underpinning future investigations
![Page 122: eScience-School-Oct2012-Campinas-Brazil](https://reader033.fdocuments.in/reader033/viewer/2022061300/54c69f854a795921758b458e/html5/thumbnails/122.jpg)