Post on 24-May-2015
04/12/23
PPP
DISC: the connected data departments of DTL research Hotels
DTL
DISC*
*) DISC = DTL Data Integration & Stewardship Centre
technology research
education & training
technology facilities
What is bioinformatics?
• The science of storing, retrieving and analysing large amounts of biological information
• An interdisciplinary science involving biologists, biochemists, computer scientists and mathematicians
• At the heart of modern biology
1 Genomes contain genes
2 Genes are transcribed
3 Transcripts translate to protein sequences
4 Proteins form three-dimensional structures
5 Proteins interact with each other and with small molecules to form pathways
6 Pathways combine to build systems
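Steps 2 and 3 above (transcription and translation) can be illustrated with a minimal Python sketch. It is a simplification under stated assumptions: the input is treated as a coding-strand DNA sequence read in frame from its first base, splicing and other processing are ignored, and the standard genetic code is used. The function names `transcribe` and `translate` are illustrative and do not come from the slides.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G codon order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

def transcribe(dna: str) -> str:
    """Step 2: turn a coding-strand DNA sequence into its mRNA transcript."""
    return dna.upper().replace("T", "U")

def translate(rna: str) -> str:
    """Step 3: read the transcript codon by codon until a stop codon ('*')."""
    dna_like = rna.upper().replace("U", "T")  # the table is keyed on DNA letters
    protein = []
    for i in range(0, len(dna_like) - 2, 3):
        amino_acid = CODON_TABLE[dna_like[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)

# Hypothetical toy sequence, invented for illustration only.
transcript = transcribe("ATGGCCTAA")
protein = translate(transcript)
print(transcript, "->", protein)
```

The toy sequence encodes methionine, alanine and a stop codon, so the sketch yields a two-residue peptide.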
Bioinformatics underpins life-science research
Life Science data: multi-omics, multi-technology, multi-organism, multi-dimensional
From molecules to medicine
Molecular components Integration Translation
Genomes
Nucleotides
Transcripts
Proteins
Complexes
Pathways
Small molecules
Structures
Domains
Cells
Biobanks
Tissues and organs
Human populations
Therapies
Disease prevention
Early diagnosis
Human individuals
The challenge
• Computer speed and storage capacity are doubling every 18 months, and this rate is steady
• DNA sequence data has been doubling every 6-8 months over the last 3 years and looks set to continue doing so for this decade
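To make the gap concrete, the following minimal sketch compounds the two doubling periods quoted above over a three-year window. The function name `growth_factor` and the 36-month horizon are illustrative assumptions, not part of the original slides.

```python
def growth_factor(months: float, doubling_period_months: float) -> float:
    """Return how many times a quantity multiplies after `months`,
    given that it doubles every `doubling_period_months`."""
    return 2 ** (months / doubling_period_months)

HORIZON_MONTHS = 36  # assumed window: the "last 3 years" mentioned above

compute = growth_factor(HORIZON_MONTHS, 18)   # computer speed and storage
seq_slow = growth_factor(HORIZON_MONTHS, 8)   # sequence data, slower bound
seq_fast = growth_factor(HORIZON_MONTHS, 6)   # sequence data, faster bound

print(f"compute/storage: x{compute:.0f}")
print(f"sequence data:   x{seq_slow:.1f} to x{seq_fast:.0f}")
print(f"gap widens by:   x{seq_slow / compute:.1f} to x{seq_fast / compute:.0f}")
```

Under these assumptions, computing capacity grows 4-fold in three years while sequence data grows roughly 23- to 64-fold, so the data outpaces the hardware by a factor of about 6 to 16 over that period.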
Guy Cochrane, ENA, EMBL-EBI
Europe has already paid for the science
Annual cost of generating new protein structure data in labs around the world
Annual cost of maintaining the data in a central database
ELIXIR’s mission
To build a sustainable European infrastructure for biological information, supporting life science research and its translation to:
• medicine
• environment
• bioindustries
• society
13 ELIXIR Countries
Part two >>>> eScience in LS
• The way we discover knowledge has changed fundamentally over just a decade.
BIGNORANCE
The general challenge: Data has far outgrown institutional handling capacity
… The amount of digital data is exploding, with a staggering 1.8 zettabytes in 2011
The Issue: The Data Deluge is everywhere, but Life Sciences is particularly challenged and complex.
More and more, we write ‘about datasets’ that are too large to publish in narrative
Cardinal Assertion
1 identical assertion
‘n’ different provenances
Nanopublications & Cardinal Assertions
A Cardinal Assertion aggregates all ‘n’ Nanopublications making the same assertion. It therefore has 1 assertion and ‘n’ provenances, eliminating redundancy.
A Nanopublication is the smallest unit of publishable information containing:
1. Assertion: a statement of concepts in terms of one or more ‘subject -> predicate -> object’ (triple) relationships.
2. Provenance
   a) Attribution: who made this assertion, when and where?
   b) Supporting information: any other information which is relevant to the assertion (e.g. this assertion is only valid in humans under 18).
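The two definitions above can be sketched as a small data model. This is a hedged illustration, not the actual Nanopublication schema: the class and field names (`Provenance`, `Nanopublication`, `CardinalAssertion`, `aggregate`) are hypothetical, each assertion is simplified to a single triple even though the definition allows several, and the example data is invented.

```python
from collections import defaultdict
from dataclasses import dataclass, field

Triple = tuple[str, str, str]  # (subject, predicate, object)

@dataclass(frozen=True)
class Provenance:
    attribution: str           # who made the assertion, when and where
    supporting_info: str = ""  # e.g. conditions under which the assertion holds

@dataclass(frozen=True)
class Nanopublication:
    assertion: Triple          # simplification: exactly one triple
    provenance: Provenance

@dataclass
class CardinalAssertion:
    assertion: Triple                                   # 1 identical assertion
    provenances: list[Provenance] = field(default_factory=list)  # 'n' provenances

def aggregate(nanopubs: list[Nanopublication]) -> list[CardinalAssertion]:
    """Group all nanopublications that make the same assertion."""
    grouped: dict[Triple, list[Provenance]] = defaultdict(list)
    for nanopub in nanopubs:
        grouped[nanopub.assertion].append(nanopub.provenance)
    return [CardinalAssertion(a, p) for a, p in grouped.items()]

# Hypothetical example data, invented purely for illustration.
nanopubs = [
    Nanopublication(("gene:A", "associated_with", "disease:X"), Provenance("lab 1, 2010")),
    Nanopublication(("gene:A", "associated_with", "disease:X"), Provenance("lab 2, 2011")),
    Nanopublication(("gene:B", "interacts_with", "gene:A"), Provenance("lab 3, 2011")),
]
cardinal = aggregate(nanopubs)
for ca in cardinal:
    print(ca.assertion, "supported by", len(ca.provenances), "provenance(s)")
```

Keying the grouping on the assertion itself is what removes the redundancy: the triple is stored once, and only the provenance records accumulate.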
Nanopublication
Under the hood……
Managing volume & complexity
Individual Nanopublications: > 10^14
Individual Cardinal Assertions: > 10^11
Individual Concept Profiles: ≈ 4 × 10^6
Combining Cardinal Assertions with Concept Profiles reduces the amount of data by ≈99.999996%
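The quoted reduction can be checked with a short calculation. The sketch below assumes the figures read as 10^14 individual Nanopublications and about 4 × 10^6 Concept Profiles, and it treats the "greater than" bound on Nanopublications as an equality, so the result is a lower bound on the reduction.

```python
nanopublications = 1e14  # assumed: lower bound quoted for individual Nanopublications
concept_profiles = 4e6   # assumed: approximate number of Concept Profiles

# Fraction of items that no longer need to be handled individually.
reduction = 1 - concept_profiles / nanopublications
print(f"reduction: {reduction * 100:.6f}%")
```

With these inputs the remaining fraction is 4 × 10^-8, which corresponds to the ≈99.999996% stated above.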
The LS concept web: 2 × 2 × 10^6 concepts (profiles)
A dynamic Concept Web versus a static Ontology
More mutual information
No increase in concept overlap
Including manual curation
More concepts in common
Removal of low info paths
Legend: known reference pairs; non-co-occurrence pairs
eScience: in silico reasoning and in cerebro validation
Expert Skype calls
Reading up
Organisation of the ecosystem
CA Space (OCS & ICS)
Providers
Original Data Owners
Global Authority
Nanopublishers
App & Service Providers
Users
Endorse
Assist & Certify
Application development
Reasoning services
technical and process consultancy
project delivery capacity
ONS/INSs Academic & Commercial
Users
Knowledge Management
Knowledge Discovery
Best Practices
Acceptance of Semantic Web Approach
• Over the last decade, academic research organisations have developed new methodologies and tools to address the Big Data problem.
• Leading scientists have reached global agreement on a unique Nanopublication solution.
• Hundreds of millions have already been invested in the underlying technology.
• Applicable as a technology across (STM) domains and industries.
• Pharmaceutical companies are early adopters (Innovative Medicines Initiative).
Acknowledging…

The ‘Dutch Team’
• Herman van Haagen, MSc (LUMC)
• Dr. Peter Bram ‘t Hoen (LUMC)
• Dr. Marco Roos (LUMC)
• Dr. Erik Schultes (LUMC)
• Prof. Johan den Dunnen (LUMC)
• Prof. Gertjan van Ommen (LUMC)
• Dr. Erik van Mulligen (EMC)
• Dr. Jan Kors (EMC)
• Dr. Martijn Schuemie (EMC)
• Prof. Johan van der Lei (EMC)
• Dr. Rob Hooft (NBIC)
• Dr. Christine Chichester (NBIC)
• Dr. Leon Mei (NBIC)
• Kees Burger (NBIC)
• Bharat Singh (NBIC/EMC)
• Dr. Marc van Driel (NBIC)
• Dr. Ruben Kok (NBIC)
• Prof. Marcel Reinders (NBIC)
• Prof. Jaap Heringa (NBIC)
• Prof. Gert Vriend (NBIC)
• Dr. Morris Schwertz (BBMRI, CWA)
• Dr. Andra Waagmeester (NBIC)
• Dr. Kristina Hettne (LUMC)
• Dr. Rene van Schaik (eScience Centre)
• Drs. Albert Mons (PHORTOS consultants)
• Mr. Drs. Arie Baak (PHORTOS consultants)

CWA - Open PHACTS
• Prof. Amos Bairoch (SIB, Switzerland, CWA)
• Prof. Carole Goble (Manchester, CWA, OPS)
• Prof. Katy Borner (Indiana University, CWA)
• Prof. Mark Musen (NCBO, Stanford, CWA, OPS)
• Dr. Pascale Gaudet (UniProt, ISB, CWA)
• Dr. Mike Colon (VIVO, UF, CWA)
• Prof. Maryann Martone (Force 11, USC, CWA)
• Dr. Nigam Shah (NCBO, Stanford, CWA, OPS)
• Dr. Mark Wilkinson (Canada, CWA)
• Abel Packer (Brazil, Scielo, CWA, OPS)
• Jan Velterop (ACKnowledge, CWA, OPS)
• Albert Mons (CWA, NBIC)
• Prof. Frank van Harmelen (FUA/LARKC, CWA, OPS)
• Dr. Chris Evelo (Maastricht, CWA, OPS)
• Dr. Antony Williams (RSC/ChemSpider, CWA, OPS)
• Dr. Richard Kidd (RSC, OPS)
• Dr. Paul Groth (FUA, CWA, OPS)
• Dr. Michel Dumontier (Canada, CWA, OPS)
• Dr. Andrew Gibson (UA, CWA, OPS)
• Dr. Bryn Williams-Jones (Pfizer, OPS)
• Dr. Ian Dix (Astra Zeneca, OPS)
• Dr. Niklas Blomberg (Astra Zeneca, OPS)
• Dr. Mike Barnes (GSK, OPS)
• Prof. Jan-erik Litton (CWA, BBMRI)