Global Information Systems for Plant Genetic Resources, SeedNet training course (2008)
Cover slide
Documentation of Genetic Resources
Global Information Systems
SEEDNet Training Course, May 28, 2008, NordGen, Alnarp
Dag Terje Filip Endresen, Nordic Genetic Resource Center / Bioversity International
Global PGR Information Systems for the SEEDNet course at NordGen, May 28, 2008
TOPICS
Documentation of genetic resources:
Information Systems
Data standards
Data exchange
Distributed data network
Global PGR Information Systems
SEEDNet data portal
SEEDNet, the South East European Development Network on Plant Genetic Resources, was established in 2004. [http://seednet.geminova.net/]
Albania, Bulgaria, Croatia, Federation of Bosnia and Herzegovina, Kosovo, Macedonia, Moldova, Montenegro, Republika Srpska, Romania, Serbia, Slovenia
ECPGR (AEGIS, ECCDB, EURISCO)
A European Genebank Integrated System (AEGIS)
Sharing of responsibilities (Most Appropriate Accession; common agreed quality standards for ex situ conservation).
Conservation of the genetically unique and important accessions for Europe and making them available for breeding and research.
Four model crops: Allium, Avena, Brassica and Prunus species.
Membership in AEGIS is open to all European countries (ECPGR).
EURISCO and the Central Crop Databases play a key role in the information management.
ECPGR Central Crop Databases
EURISCO [http://eurisco.ecpgr.org/]
EURISCO is the data catalogue of the European genebanks (more than 1 000 000 accessions from 35 European countries).
EURISCO holds accession-level data on 1 300 genera and 8 500 species.
EURISCO was released in September 2003 as a result of the EU-funded EPGRIS project.
EURISCO is hosted by Bioversity International on behalf of the ECPGR.
EURISCO (draft, new layout)
Data flow from genebanks to EURISCO and ECCDBs
EPGRIS3 [http://www.epgris3.eu/]
EPGRIS3 is a volunteer, self-funded follow-up to the EU-funded EPGRIS project.
EPGRIS3 aims to improve the exchange of European genebank datasets and to further develop the IT infrastructure for genetic resources in Europe.
EPGRIS3 Wiki Environment
An EPGRIS3 wiki environment is hosted by NordGen. Please register and contribute to the discussions. [http://wwwdev.ngb.se/epgris3/]
Please get in touch with one of the EPGRIS3 contact persons if you want to contribute to the EPGRIS3 project. [http://www.epgris3.eu/EPGRIS3contacts.htm]
SINGER [http://singer.grinfo.net/]
The System-wide Information Network for Genetic Resources (SINGER).
More than 650 000 accessions from the 12 international CGIAR organizations.
SINGER is hosted by Bioversity International on behalf of the CGIAR.
CGIAR [http://www.cgiar.org/]
AVRDC - The World Vegetable Center
Bioversity - Bioversity International
CIAT - Centro Internacional de Agricultura Tropical
CIMMYT - Centro Internacional de Mejoramiento de Maíz y Trigo
CIP - Centro Internacional de la Papa
ICARDA - International Center for Agricultural Research in the Dry Areas
ICRAF - The World Agroforestry Centre
ICRISAT - International Crops Research Institute for the Semi-Arid Tropics
IITA - International Institute of Tropical Agriculture
ILRI - International Livestock Research Institute
IRRI - International Rice Research Institute
WARDA - The Africa Rice Center
GCP [http://www.generationcp.org/]
Generation Challenge Programme.
The GCP Mission: To use advanced genomics science and plant genetic diversity to overcome complex agricultural bottlenecks that condemn millions of the world’s neediest people to a future of poverty and hunger.
The GCP Vision: A future where plant breeders have the tools to breed crops in marginal environments with greater efficiency and accuracy for the benefit of the resource-poor farmers and their families.
NordGen [http://www.nordgen.org/]
The Nordic Genetic Resource Center (NordGen) was established in January 2008.
NordGen replaces the former Nordic Gene Bank (NGB), established in 1979.
NordGen is the joint regional genetic resource center for the five Nordic countries: Denmark, Finland, Iceland, Norway and Sweden.
NordGen reports to the Nordic Council of Ministers [http://www.norden.org].
The mandate of NordGen is the conservation and utilization of Nordic genetic resources.
Regional Programs on Genetic Resources
SEEDNet, the South East European Development Network on Plant Genetic Resources, was established in 2004. [http://seednet.geminova.net/]
SADC, the Southern African Development Community program on genetic resources, was started in 1989. [http://www.spgrc.org/]
USDA GRIN, the Germplasm Resources Information Network of the US. [http://www.ars-grin.gov/]
… and more
GBIF, Global Biodiversity Information Facility
GBIF Data Portal [http://data.gbif.org/]
GBIF PGR Network 2 [http://data.gbif.org/datasets/network/2]
GBIF NordGen [http://data.gbif.org/]
GBIF SINGER [http://data.gbif.org/]
GBIF USDA
Germplasm catalogues
The three large germplasm catalogues are indexed by the GBIF data portal:
EURISCO is the data catalogue of the European genebanks (more than 1 000 000 accessions).
SINGER is the portal to the international CGIAR collections (more than 650 000 accessions).
USDA-GRIN is the portal to the USDA ARS National Germplasm Repositories of the USA (more than 400 000 accessions).
FAO WIEWS
FAO WIEWS [http://apps3.fao.org/wiews/]
FAO WIEWS, GPA [http://www.pgrfa.org/gpa/]
Leipzig Declaration 1996, 150 countries [http://www.globalplanofaction.org/]
Data Standards
Crop Descriptors
The crop descriptor lists from Bioversity International provide global standards for characterization and evaluation data on crop genetic resources.
The MCPD (Multi Crop Passport Descriptor List) provides a global standard for "passport data" across the crops.
The MCPD descriptor list is compatible with the TDWG standard: ABCD 2.06.
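To make the idea of a shared passport standard concrete, the following is a minimal sketch of a passport record keyed by MCPD descriptor names. INSTCODE, ACCENUMB, GENUS, SPECIES and ORIGCTY are descriptor names from the MCPD list; the values are made-up placeholders, and the choice of which descriptors to treat as required is an assumption made only for this illustration, not a rule taken from the standard.

```python
# Descriptor names follow the MCPD list; which ones are "required" here is
# an assumption for illustration only.
REQUIRED_DESCRIPTORS = ("INSTCODE", "ACCENUMB", "GENUS")


def missing_descriptors(record, required=REQUIRED_DESCRIPTORS):
    """Return the required descriptor names that are absent or empty."""
    return [name for name in required if not str(record.get(name, "")).strip()]


# Hypothetical passport records (placeholder values, not real accessions).
complete_record = {
    "INSTCODE": "XXX001",    # holding institute code (placeholder)
    "ACCENUMB": "ACC 0001",  # accession number (placeholder)
    "GENUS": "Avena",
    "SPECIES": "sativa",
    "ORIGCTY": "SWE",        # country of origin, three-letter code
}
incomplete_record = {"INSTCODE": "XXX001", "GENUS": "Avena"}

print(missing_descriptors(complete_record))
print(missing_descriptors(incomplete_record))
```

Because every genebank uses the same descriptor names, a check like this can be applied unchanged to records from any data provider.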
Accession level, Data Standards
Multi Crop Passport (MCPD) [http://www.bioversityinternational.org/publications/pubfile.asp?id_pub=124]
Darwin Core (DwC v2) [http://wiki.tdwg.org/twiki/bin/view/DarwinCore/]
Access to Biological Collection Data (ABCD 2.06) [http://wiki.tdwg.org/twiki/bin/view/ABCD]
Generation Challenge Programme (GCP Passport v1.05) [http://gcpcr.grinfo.net/include/webservices/schema-documentation.php]
W3C :: RDF
Resource Description Framework
Scenario: You have a dataset of genebank accessions with pointers to the source datasets of the holding genebanks. You produce phenotypic evaluation data on accessions in this dataset. You find evaluation data from other sources on some of the accessions in your dataset. Some of the evaluation data are produced in areas of different day length, rainfall, soils… Some of the accessions in your dataset originate from areas of higher population density; other accessions originate from more natural habitats. Unfortunately, most of the different sources of information are located on different web sites, and it is difficult to bring the information together.
You would need to go through more or less the same process as researchers in many other domains: gathering heterogeneous data from multiple sources, then combining and analysing it. This is the challenge that faces the web as a whole and is being addressed by the Semantic Web project.
RDF can help you relate information from different sources.
An RDF triple has the form: subject-predicate-object
<rdf:Description rdf:about="http://www.example.org/index.html">
  <dc:creator>John Smith</dc:creator>
</rdf:Description>
Here the subject is http://www.example.org/index.html, the predicate is dc:creator and the object is "John Smith".
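As a minimal sketch of how such a description breaks down into subject-predicate-object triples, the snippet below wraps the example in an rdf:RDF root element with namespace declarations (added here so that the XML is well-formed; they are not on the slide) and reads it with Python's standard library. The function name extract_triples is hypothetical, and the sketch handles only simple literal properties; a real application would more likely use a dedicated RDF library.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# The slide's example, wrapped in an rdf:RDF root with namespace declarations.
RDF_XML = f"""<rdf:RDF xmlns:rdf="{RDF_NS}" xmlns:dc="{DC_NS}">
  <rdf:Description rdf:about="http://www.example.org/index.html">
    <dc:creator>John Smith</dc:creator>
  </rdf:Description>
</rdf:RDF>"""


def extract_triples(rdf_xml):
    """Return (subject, predicate, object) tuples for literal properties."""
    triples = []
    root = ET.fromstring(rdf_xml)
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            # ElementTree stores qualified names as "{namespace}local";
            # dropping the braces yields the full predicate URI.
            predicate = prop.tag.replace("{", "").replace("}", "")
            triples.append((subject, predicate, (prop.text or "").strip()))
    return triples


triples = extract_triples(RDF_XML)
for subject, predicate, obj in triples:
    print(subject, predicate, obj)
```

Because subjects and predicates are URIs, triples harvested from different web sites can be combined whenever they refer to the same resource.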
[Tag cloud of related terms, including: LSID, accession number, GUID, unitID, ontology, OWL, Full Scientific Name, landrace, traditional cultivar, RDF, ABCD, SDD, semantic web, knowledge representation, INSTCODE, plant genetic resources, germplasm, agricultural traits, Aegilops, banana, stem color]
W3C :: LSID
Life Science IDentifiers
LSID is a digital name tag. LSIDs are GUIDs, Globally Unique Identifiers. [http://lsid.sourceforge.net/]
Structure: urn:lsid:authority:namespace:object:revision
Example (fictitious): urn:lsid:eurisco.org:accession:H451269
The LSID concept introduces a straightforward approach to naming and identifying data resources stored in multiple, distributed data stores.
LSID defines a simple, common way to identify and access biologically significant data.
LSID provides a naming standard to support interoperability. Developed by OMG-LSR and W3C, implemented by IBM.
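The LSID structure can be illustrated with a short parsing sketch. It follows only the colon-separated layout given on the slide (urn:lsid:authority:namespace:object, with an optional revision); the names LSID and parse_lsid are hypothetical, and the identifier is the fictitious example from the slide.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str] = None


def parse_lsid(text: str) -> LSID:
    """Split an LSID of the form urn:lsid:authority:namespace:object[:revision]."""
    parts = text.split(":")
    has_prefix = len(parts) >= 2 and parts[0].lower() == "urn" and parts[1].lower() == "lsid"
    if len(parts) not in (5, 6) or not has_prefix:
        raise ValueError(f"not an LSID: {text!r}")
    authority, namespace, object_id = parts[2:5]
    if not (authority and namespace and object_id):
        raise ValueError(f"LSID has an empty component: {text!r}")
    # The revision is optional; treat a missing or empty one as None.
    revision = parts[5] if len(parts) == 6 and parts[5] else None
    return LSID(authority, namespace, object_id, revision)


lsid = parse_lsid("urn:lsid:eurisco.org:accession:H451269")
print(lsid.authority, lsid.namespace, lsid.object_id, lsid.revision)
```

The authority identifies who issued the name, so the same accession number used by two different genebanks still yields two distinct identifiers.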
Biodiversity data exchange tools
Data Provider Software
PyWrapper v3, based on the BioCASE Python software.
[http://www.pywrapper.org/]
[http://www.biocase.org/]
DiGIR, Distributed Generic Information Retrieval. [http://digir.net]
TapirLink [http://wiki.tdwg.org/twiki/bin/view/TAPIR/TapirLink]
TapirDotNet [http://wiki.tdwg.org/twiki/bin/view/TAPIR/TapirDotNET]
Distributed BioCASE/PyWrapper network
Example of a service request
All exchanged data is formatted with XML tags.
Example of a service response
Data portal and decentralized data networks with web services
Data warehouse model
Decentralized network
[Diagram labels:]
EURISCO (Europe)
NordGen (Northern Europe)
IPK Gatersleben (Germany)
IHAR (Poland)
WUR CGN (Netherlands)
SEEDNet countries
(Other European gene banks...)
SINGER (CGIAR)
(CGIAR International Future Harvest gene banks...)
USDA GRIN (USA)
(USDA ARS National Germplasm Repositories...)
GBIF (Global Biodiversity Information Facility)
ALIS (Accession Level Information System)
Svalbard Global Seed Vault (Safe Backup)
Web Services
MCPD
USER
Germplasm data indexing tools
We have recently built data indexing tools for access to gene bank datasets provided with the BioCASE/PyWrapper.
These tools are planned as the basis for a global Accession Level Information System (ALIS).
The work is done in cooperation with GBIF, which indexes basic biodiversity data using a similar approach.
[http://chm.grinfo.net/]
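The indexing approach can be sketched roughly as follows: a central index harvests passport records from each data provider and keeps, for every record, a pointer back to its source. This is only an illustration of the idea, not the actual ALIS or GBIF software. The provider names, the records and the functions build_index and search are all hypothetical, and in-memory functions stand in for the providers' web services.

```python
from collections import defaultdict


def build_index(providers):
    """Harvest records from every provider into one index keyed by genus."""
    index = defaultdict(list)
    for provider_name, fetch_records in providers.items():
        for record in fetch_records():
            # Keep a pointer to the source provider rather than a full copy.
            index[record["GENUS"].lower()].append((provider_name, record["ACCENUMB"]))
    return dict(index)


def search(index, genus):
    """Return (provider, accession number) pairs for a genus, sorted."""
    return sorted(index.get(genus.lower(), []))


# Hypothetical providers; each callable stands in for a web service request.
providers = {
    "provider_a": lambda: [
        {"GENUS": "Avena", "ACCENUMB": "A1"},
        {"GENUS": "Allium", "ACCENUMB": "A2"},
    ],
    "provider_b": lambda: [{"GENUS": "avena", "ACCENUMB": "B7"}],
}

index = build_index(providers)
print(search(index, "Avena"))
```

A user searching the index finds matching accessions across all providers at once and can then follow the pointer to the holding genebank for the full record.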
[http://wwwdev.ngb.se/portal/]
Crop Wild Relatives
ARM, LKA, BOL, MDG, UZB
National datasets are shared with the central CWR data index.
The national datasets, as well as access to other international datasets, are provided from the CWR data portal.
EURISCO, SINGER
[http://www.cropwildrelatives.org]
The Taxon and Country pages provide access to the relevant external datasets.
Taxonomy level metadata
Country level metadata
Outlook
The compatibility of data standards between PGR and biodiversity collections made it possible to integrate the worldwide germplasm collections into the biodiversity community.
Using GBIF technology (and contributing to its development), the PGR community can easily establish specific PGR networks without duplicating GBIF's work.
Use of GBIF technology and integration of PGR collection data into GBIF allows PGR users to simultaneously search PGR collections and other biodiversity collections, and to get access to the data (and possibly the material) of relevant biodiversity collections.
New data portals for a specific crop, a regional thematic network, or a similar subset of the global biodiversity datasets can be established with rather little effort. This requires only that all the relevant datasets are provided through GBIF-compatible web services (like the BioCASE PyWrapper).
Special thanks to:
Bioversity International [http://www.bioversityinternational.org]
GBIF, Global Biodiversity Information Facility [http://www.gbif.org]
BioCASE, The Biological Collection Access Service for Europe. [http://www.biocase.org]
TDWG, Taxonomic Databases Working Group [http://www.tdwg.org]
Thank you for listening!
The data portal application
[http://wwwdev.ngb.se/portal/]
The data portal application has been tested and works well on MS Windows, Mac OS X, Linux, FreeBSD...
The data portal is developed for the PostgreSQL database, but works well with many other database systems through the ADODB database abstraction library.
The data portal is developed for Unicode: ששچپچ אבדו ضاإطقكغب
The data portal is open source and licensed under GPL 2.
The data portal is developed in the PHP5 programming language, with some maintenance scripts written in Perl.
TDWG :: SDD
Structured Descriptive Data
In taxonomy, descriptive data takes a number of very different forms. Natural-language descriptions are semi-structured, semi-formalised descriptions of a taxon (or occasionally of an individual specimen). They may be simple, short and written in plain language (if used for a popular field guide), or long, highly formal and using specialised terminology when used in a taxonomic monograph or other treatment.
The goal of the SDD standard is to allow capture, transport, caching and archiving of descriptive data in all the forms shown above, using a platform- and application-independent, international standard. Such a standard is crucial to enabling lossless porting of data between existing and future software platforms including identification, data-mining and analysis tools, and federated databases.
Hagedorn, G.; Thiele, K.; Morris, R. & Heidorn, P. B. 2005. The Structured Descriptive Data (SDD) w3c-xml-schema, version 1.0. http://www.tdwg.org/standards/116/ [Last retrieved 05-May-2007]
[http://www.tdwg.org/standards/116/]