European Molecular Biology Laboratory (EMBL)- European Bioinformatics Institute (EBI) Building the...

Post on 11-Apr-2017


EMBL-EBI Building the Database with International Isolates

Guy Cochrane, PhD

Requirements: Structured, Neutral, Sustainable, Rapid

Data & Analysis

COMPARE

COMPARE: the enabling system for rapid identification, containment and mitigation of emerging infectious diseases and foodborne outbreaks by generation and comparison of genomic information on samples and pathogens across sectors, time and locations, with additional contextual data.

A global platform for the sequence-based rapid identification of pathogens

EMBL European Bioinformatics Institute

• Genes, genomes & variation: European Nucleotide Archive, 1000 Genomes, Ensembl, Ensembl Genomes, European Genome-phenome Archive, Metagenomics portal

• Gene, protein & metabolite expression: ArrayExpress, Expression Atlas, MetaboLights, PRIDE

• Protein sequences, families & motifs: InterPro, Pfam, UniProt

• Molecular structures: Protein Data Bank in Europe, Electron Microscopy Data Bank

• Chemical biology: ChEMBL, ChEBI

• Reactions, interactions & pathways: IntAct, Reactome, MetaboLights

• Systems: BioModels, Enzyme Portal, BioSamples

• Literature & ontologies: Europe PubMed Central, Gene Ontology, Experimental Factor Ontology

European Nucleotide Archive (ENA)

http://www.ebi.ac.uk/ena/

• Globally comprehensive scientific record and European node of INSDC

• A broad platform for the management, sharing, integration and dissemination of sequence data

• Established in the early 1980s, extended for new technologies and applications

• Connectivity with broader EMBL-EBI resources

• Sequence data foundation

• Sustained within EMBL-EBI under EMBL funding with additional support from EC, UK Research Councils, Wellcome Trust, etc.

• Substantial scale: 1.3 petabase pairs across >1 million taxa, 2,000-5,000 active data providers, global consumer userbase

• Rich submission, discovery and retrieval software, tools and services

This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 643476.

 

Structured data sharing (diagram): Data (primary & derived), Analysis (routine & ad hoc), Storage, Compute


Data Hubs (diagram): Data (primary & derived), Analysis (routine & ad hoc), Storage, Compute, Data Hubs, COMPARE-VM, Notebooks

 

Data Hubs

• Reporting and sharing system

– Quarantined pre-publication confidential and public data

– Set up for data providers and data consumers

– Data / metadata:

    • reported through the systematic reporting system Webin, with interactive and programmatic interfaces

    • structured and validated

    • upon release, embargoed data pass to INSDC
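The quarantine-then-release behaviour described in these bullets can be illustrated with a small sketch. It is not the Webin or ENA implementation: the `HubRecord` class, its fields, the `visible_records` helper and the identifiers and hub names used with them are all hypothetical, chosen only to show how validated records might stay confidential inside a hub until their release date and become visible to everyone afterwards.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional, Set


@dataclass
class HubRecord:
    """One reported data set in a hypothetical data hub (illustrative only)."""
    identifier: str                       # placeholder ID, not a real accession
    hub: str                              # hub the record was reported to
    release_date: Optional[date] = None   # None: no release date agreed yet
    validated: bool = False               # True once structure/metadata checks pass

    def is_public(self, today: date) -> bool:
        """Public only if validated and the release date has been reached."""
        return (
            self.validated
            and self.release_date is not None
            and self.release_date <= today
        )


def visible_records(
    records: List[HubRecord], member_of: Set[str], today: date
) -> List[HubRecord]:
    """Hub members see all records in their hubs; others see public records only."""
    return [r for r in records if r.is_public(today) or r.hub in member_of]
```

For example, with one record released in the past and one still under embargo in the same hub, a caller with `member_of=set()` would see only the released record, while a caller with that hub in `member_of` would see both.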

Data reporting

CGE batch upload

dcc_sibelius

Data access

Systematic analysis (diagram): COMPARE Data Reporting, COMPARE Data Archive, COMPARE selected workflow, status of workflows

COMPARE reference genomes

GUI: http://www.ebi.ac.uk/ena/data/xref/search

Data Portal (diagram): Data (primary & derived), Analysis (routine & ad hoc), Storage, Compute, Data Portal

People

DTU: Jose Luis Bellod Cisneros, Martin Christen Frølund Thomsen, Johanne Ahrenfeldt, Rolf Sommer Kaas, Lukasz Dariusz Dynowski, Ole Lund, Frank Aarestrup, Jeffrey Skiby

WIGNER: János Márk Szalai-Gindl, László Oroszlány, Dávid Visontai, Dezso Ribli, Istvan Csabai

EMBL-EBI: Nima Pakseresht, Clara Amid, Nicole Silvester, Marc Rossello, Neil Goodgame, Suran Jayathilaka, Ana Luisa Toribio, Ana Cerdeño Tarraga, Petra ten Hoopen, Rasko Leinonen, Guy Cochrane

EMC: Ron Fouchier, Marion Koopmans, Saskia Smits, David van de Vijver, Marjolein Poen

FLI: Martin Beer, Anne Pohlmann, Dirk Hoeper, Claudia Wylezich, Ariane Belka

APHA: Sharon Brooks, Amanda Seekings, Jill Banks, Javier Nunez, Richard Ellis, Ian Brown

SSI: Eva Litrup, Eva Møller Nielsen