Prototype germplasm data portal (2006)

Exchange of germplasm datasets with PyWrapper/BioCASE. GAIN, Global Accession Information Network. NGB Board Meeting, December 4, 2006, Alnarp, Sweden. Dag Endresen, Nordic Gene Bank.

description

Prototype Germplasm Data Portal, predecessor for the ALIS-Global of the GIGA project. Presentation for the Nordic Gene Bank board meeting on 4th December 2006.

Transcript of Prototype germplasm data portal (2006)

Page 1: Prototype germplasm data portal (2006)

NGB Board Meeting, December 4, 2006

Exchange of germplasm datasets with PyWrapper/BioCASE

GAIN: Global Accession Information Network

December 4, 2006, NGB Board Meeting 2006, Alnarp, Sweden

Dag Endresen, Nordic Gene Bank

Page 2: Prototype germplasm data portal (2006)

Exchange of germplasm datasets with PyWrapper/BioCASE, December 4, 2006, NGB Board Meeting

TOPICS

Genetic resources:

Data exchange

Information network

Outlook

Page 3: Prototype germplasm data portal (2006)

Germplasm data, seed genebanks

Germplasm genebanks are biodiversity collections.

Collection level data: Metadata about genebank institutes and the germplasm collections they hold.

Unit level data: The unit level data for germplasm collections are the accessions. Genebank accessions share many properties and attributes with other biodiversity specimens.

Descriptive data (phenotype): The germplasm accessions are further described by descriptive characterization and evaluation data.
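
The three data levels above can be pictured as nested records. The following Python sketch is illustrative only: the class names, field names and sample values are assumptions chosen for readability and are not taken from any genebank schema.

```python
from dataclasses import dataclass, field


@dataclass
class Observation:
    """Descriptive data: one characterization or evaluation score."""
    trait: str
    value: str


@dataclass
class Accession:
    """Unit level data: one genebank accession."""
    accession_number: str
    scientific_name: str
    observations: list[Observation] = field(default_factory=list)


@dataclass
class Collection:
    """Collection level data: metadata about a holding institute."""
    institute_code: str
    institute_name: str
    accessions: list[Accession] = field(default_factory=list)


# Hypothetical example values, for illustration only.
barley = Accession("EX 0001", "Hordeum vulgare")
barley.observations.append(Observation("plant height (cm)", "85"))
collection = Collection("XXX000", "Example Gene Bank", [barley])
```

Each level nests inside the one above it, so descriptive data always stays attached to the accession it describes.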

Page 4: Prototype germplasm data portal (2006)

Germplasm catalogues

Most genebank datasets are indexed by three major germplasm catalogues:

EURISCO is the data catalogue of the European genebanks (997 631 accessions)

SINGER is the portal to the international CGIAR collections (689 349 accessions)

USDA-GRIN is the portal to the USDA ARS National Germplasm Repositories of the USA (475 178 accessions)

All three catalogues are now published in GBIF
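
For a sense of scale, the three accession counts quoted above can be added up. This is a simple sum; the slide does not say whether the catalogues list any of the same material, so the total is best read as an upper bound on distinct accessions.

```python
# Accession counts as quoted above for each catalogue.
catalogue_counts = {
    "EURISCO": 997_631,
    "SINGER": 689_349,
    "USDA-GRIN": 475_178,
}

# Plain sum; an upper bound if any material is listed in more than one catalogue.
total_records = sum(catalogue_counts.values())
print(f"{total_records:,} accession records in total")
```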

Page 5: Prototype germplasm data portal (2006)

Data warehouse model

Page 6: Prototype germplasm data portal (2006)

The present data flow from genebanks to EURISCO and ECCDBs

Page 7: Prototype germplasm data portal (2006)

Decentralized model

Page 8: Prototype germplasm data portal (2006)

Decentralized model

EURISCO (Data Portal Europe)

Nordic Gene Bank (Northern Europe)

IPK Gatersleben (Germany)

IHAR (Poland)

(Other European gene banks...)

SINGER (Data Portal for CGIAR)

(CGIAR International Future Harvest gene banks...)

USDA GRIN (Data Portal USA)

(USDA ARS National Germplasm Repositories...)

WUR CGN (Netherlands)

GBIF (Global Biodiversity Data Portal)

USER

GAIN (Global germplasm Data Portal) [chm.grinfo.net]

Internet

MCPD

Svalbard International Seed Vault (Safe Backup)
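
The decentralized model can be sketched as a portal that forwards one query to every provider and merges the answers. This is a minimal in-memory illustration and not the GAIN or GBIF implementation: the `Provider` class, the `search` function and the sample records are hypothetical, and a real provider would be queried over the Internet rather than from a Python list. The record keys reuse MCPD descriptor names (INSTCODE, ACCENUMB, GENUS).

```python
from dataclasses import dataclass


@dataclass
class Provider:
    """Stand-in for one gene bank's web service (hypothetical)."""
    name: str
    records: list[dict]

    def query(self, genus: str) -> list[dict]:
        # A real provider would answer over the network; here we filter locally.
        return [r for r in self.records if r["GENUS"] == genus]


def search(providers: list[Provider], genus: str) -> list[dict]:
    """Send the same query to every provider and merge the results."""
    hits = []
    for provider in providers:
        for record in provider.query(genus):
            # Tag each hit with its source so the user can trace it back.
            hits.append({**record, "provider": provider.name})
    return hits


# Made-up sample data, for illustration only.
providers = [
    Provider("Gene bank A", [
        {"INSTCODE": "AAA001", "ACCENUMB": "A 1", "GENUS": "Hordeum"},
    ]),
    Provider("Gene bank B", [
        {"INSTCODE": "BBB002", "ACCENUMB": "B 7", "GENUS": "Triticum"},
        {"INSTCODE": "BBB002", "ACCENUMB": "B 9", "GENUS": "Hordeum"},
    ]),
]
results = search(providers, "Hordeum")
```

The data stay with each provider; the portal only holds the merged answer for the duration of one search.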

Page 9: Prototype germplasm data portal (2006)

Germplasm data indexing tools

We have recently built data indexing methodologies for access to germplasm data with BioCASE/PyWrapper.

The plan is to use these methodologies to build a Germplasm Clearing House Mechanism (GAIN).

Development is carried out in close cooperation with GBIF, which itself indexes basic biodiversity data using a similar approach.

[http://chm.grinfo.net/index.php]
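
The indexing idea can be illustrated with a toy harvester that copies a few key fields from each provider into one central lookup table. This is only a sketch under stated assumptions: `build_index` and the sample records are hypothetical, and the code does not use PyWrapper or the BioCASE protocol.

```python
from collections import defaultdict


def build_index(harvested: dict[str, list[dict]]) -> dict[str, list[tuple[str, str]]]:
    """Map each scientific name to (provider, accession number) pairs."""
    index = defaultdict(list)
    for provider, records in harvested.items():
        for record in records:
            name = f'{record["GENUS"]} {record["SPECIES"]}'
            index[name].append((provider, record["ACCENUMB"]))
    return dict(index)


# Made-up harvest results keyed by provider name, for illustration only.
harvested = {
    "Provider A": [
        {"ACCENUMB": "A 1", "GENUS": "Hordeum", "SPECIES": "vulgare"},
    ],
    "Provider B": [
        {"ACCENUMB": "B 9", "GENUS": "Hordeum", "SPECIES": "vulgare"},
        {"ACCENUMB": "B 7", "GENUS": "Avena", "SPECIES": "sativa"},
    ],
}
index = build_index(harvested)
```

A lookup such as `index["Hordeum vulgare"]` then answers from the central table without contacting the providers again.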

Page 10: Prototype germplasm data portal (2006)

Page 11: Prototype germplasm data portal (2006)

Page 12: Prototype germplasm data portal (2006)

Page 13: Prototype germplasm data portal (2006)

Page 14: Prototype germplasm data portal (2006)

Decentralized data network with web services

Page 15: Prototype germplasm data portal (2006)

Germplasm data exchange with PyWrapper/BioCASE

GBIF technology was demonstrated to IPGRI, FAO, CGIAR centres and genebanks (2005) and has been widely adopted for PGR information networks.

In the spring of 2004 the first European genebanks joined GBIF as data providers.

In 2005 USDA-GRIN joined GBIF. In 2006 both SINGER and EURISCO joined GBIF.

The germplasm datasets worldwide are compatible with the MCPD data standard.

Sharing germplasm datasets with GBIF was rather straightforward once the MCPD data standard had been mapped to ABCD 2.06.
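
The mapping step can be pictured as a lookup table from MCPD descriptors to ABCD concept paths. The sketch below is illustrative: the MCPD descriptor names are the standard ones, but the ABCD paths are assumed targets that should be checked against the ABCD 2.06 schema, and this is not the mapping actually used for the portal. `to_abcd` is a hypothetical helper.

```python
UNIT = "/DataSets/DataSet/Units/Unit"

# MCPD descriptor -> ABCD concept path (assumed; verify against the schema).
MCPD_TO_ABCD = {
    "INSTCODE": f"{UNIT}/SourceInstitutionID",
    "ACCENUMB": f"{UNIT}/UnitID",
    "ORIGCTY": f"{UNIT}/Gathering/Country/ISO3166Code",
}


def to_abcd(mcpd_record: dict[str, str]) -> dict[str, str]:
    """Re-key one MCPD record by ABCD path, skipping unmapped descriptors."""
    return {
        MCPD_TO_ABCD[key]: value
        for key, value in mcpd_record.items()
        if key in MCPD_TO_ABCD
    }


# Made-up record, for illustration only.
record = {
    "INSTCODE": "XXX000",
    "ACCENUMB": "EX 0001",
    "ORIGCTY": "SWE",
    "REMARKS": "n/a",
}
abcd_record = to_abcd(record)
```

Descriptors without an entry in the table (here `REMARKS`) are dropped rather than guessed, which keeps the mapping explicit.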

Page 16: Prototype germplasm data portal (2006)

Germplasm BioCASE entry points

[http://chm.grinfo.net/index.php?app=data_providers]

Page 17: Prototype germplasm data portal (2006)

Taxonomic Databases Working Group

Darwin Core 2 - "Element definitions designed to support the sharing and integration of primary biodiversity data". [http://darwincore.calacademy.org/]

Access to Biological Collection Data (ABCD) 2.06 - "An evolving comprehensive standard for the access to and exchange of data about specimens and observations (a.k.a. primary biodiversity data)". [http://www.bgbm.org/TDWG/CODATA/Schema/]

Page 18: Prototype germplasm data portal (2006)

PGR sub-unit of ABCD 2.06

Page 19: Prototype germplasm data portal (2006)

Generation Challenge Programme, GCP_Passport_1.03

The Generation Challenge Programme is a research and capacity building network that uses plant genetic diversity to produce better crop varieties for resource-poor farmers.

In the context of the GCP (Generation Challenge Programme), the GCP Passport data exchange schema was developed.

Page 20: Prototype germplasm data portal (2006)

GCP_Passport Upgrade to ABCD

Page 21: Prototype germplasm data portal (2006)

Work in progress

Global Unique Identifiers, GUID (LSID, Life Science Identifiers) [http://lsid.sourceforge.net/]

Biodiversity informatics workflow tools (BioMOBY and Taverna)
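
An LSID is a URN of the form `urn:lsid:authority:namespace:object[:revision]`. The short parser below shows how such an identifier breaks into its parts. It is a sketch, not a full validator: `parse_lsid` and the example identifier are hypothetical, and no genebank is implied to have issued LSIDs in this form.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    # The "urn:lsid" prefix is matched case-insensitively.
    if len(parts) not in (5, 6) or [p.lower() for p in parts[:2]] != ["urn", "lsid"]:
        raise ValueError(f"not an LSID: {text!r}")
    if not all(parts[2:]):
        raise ValueError(f"empty LSID component in {text!r}")
    return LSID(*parts[2:])


# Hypothetical identifier; the authority and namespace are made up.
lsid = parse_lsid("urn:lsid:example.org:accessions:EX0001")
```

The optional revision is `None` when the identifier has only five components.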

Page 22: Prototype germplasm data portal (2006)

Outlook

The compatibility of data standards between PGR and biodiversity collections made it possible to integrate the worldwide germplasm collections into the biodiversity community.

Using GBIF technology (and contributing to its development), the PGR community can easily establish specific PGR networks without duplicating GBIF's work.

Use of GBIF technology and integration of PGR collection data into GBIF allows PGR users to simultaneously search PGR collections and other biodiversity collections, and to get access to the data (and possibly the material) of relevant biodiversity collections.

Users from the biodiversity community (who may not be aware of the existence of relevant material in genebanks) will find in GBIF genebank material of, e.g. crop wild relatives, along with data of the same species from herbaria, botanical gardens and floristic observations.

Page 23: Prototype germplasm data portal (2006)

Special thanks to

IPGRI (Bioversity International) [http://www.bioversityinternational.org]

GCP, The Generation Challenge Programme [http://www.generationcp.org/]

GBIF, Global Biodiversity Information Facility [http://www.gbif.org]

BioCASE, The Biological Collection Access Service for Europe. [http://www.biocase.org]

TDWG, Taxonomic Databases Working Group [http://www.tdwg.org]

Page 24: Prototype germplasm data portal (2006)

Thank you for listening!