GLOBAL BIODIVERSITY INFORMATION FACILITY

The Global Biodiversity Information Facility (GBIF): The distributed architecture

Samy Gaiji, Head of Informatics, GBIF

Biodiversity Information Standards (TDWG) 2009 Conference, 9-13 November 2009

WWW.GBIF.ORG


Objectives of this presentation


Expose the challenges faced by GBIF in building a global information network;

Present GBIF's distributed architecture strategy;

Introduce the key building components of the GBIF Informatics suite;

Call for community participation.

A growing global network…


53 country participants

43 associated participants

A growing network…


189.4 million records

5% increase per month

8,186 data resources

306 data publishers

[Figure axes: millions of primary biodiversity records; data publishers]

Architecture


Publishing, Indexing, Discovering

<1% IPT, 3% TAPIR, 16% BioCASE, 80% DiGIR

80% DwC, 18% ABCD, 2% others

189 M records, 8-9 M/month, >300 publishers

A one-stop entry point to data discovery


http://data.gbif.org

What are the challenges today?


More data types

Richer user interface

Better management

Richer content

Better synchronisation

Improved discovery

Decentralisation is therefore aimed at empowering GBIF Nodes and Participants

What are the key processes?


[Diagram labels: Node; Data Publishers; Service Publishers; Registry; Registering; Discovering; Harvesting; Indexing; Access]

What are the key components?


[Diagram labels: Publishing toolkit; Harvesting toolkit; Portal toolkit; Registry; Registration & Discovery; Data flow]

The GBIF Informatics Suite for Participants

Publishing Component


Data Publishers

Provide a robust and user-friendly publishing tool (TAPIR compliant, WFS-WMS, EML, etc.);

Improve the existing standards (DwC, DwC Archive) and enable the provision of richer content through extensions for specialised communities;

Support the publishing of more data types, such as metadata and names.

The Integrated Publishing Toolkit (IPT)
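
As a rough illustration of how extensions add richer content to a core record, the sketch below models a star layout of the kind used by a Darwin Core Archive: one core table plus extension rows that point back to a core record by identifier. The Darwin Core term names (occurrenceID, scientificName, measurementType, measurementValue) are real terms; the sample values, the `coreid` column used here as the link, and the `attach_extensions` helper are assumptions made for this example and are not taken from the IPT.

```python
import csv
import io
from collections import defaultdict

# Illustrative core file: one row per occurrence record.
CORE_CSV = """occurrenceID,scientificName,decimalLatitude,decimalLongitude
occ-1,Puma concolor,-12.05,-77.04
occ-2,Quercus robur,55.68,12.57
"""

# Illustrative extension file: zero or more rows per core record.
EXTENSION_CSV = """coreid,measurementType,measurementValue
occ-1,body length,1.9
occ-1,body mass,62
"""


def read_rows(text):
    """Parse CSV text into a list of dictionaries keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def attach_extensions(core_rows, extension_rows, core_key="occurrenceID"):
    """Group extension rows under the core record they reference."""
    by_core = defaultdict(list)
    for row in extension_rows:
        by_core[row["coreid"]].append(
            {k: v for k, v in row.items() if k != "coreid"}
        )
    return [
        {**core, "extensions": by_core.get(core[core_key], [])}
        for core in core_rows
    ]


records = attach_extensions(read_rows(CORE_CSV), read_rows(EXTENSION_CSV))
for record in records:
    print(record["occurrenceID"], len(record["extensions"]))
```

Keeping the core table unchanged and hanging optional rows off it is what lets a specialised community add its own fields without altering the shared standard.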

Harvesting/Indexing component


Provide a tool that will:

harvest distributed data publishers using multiple protocols and schemas;

harvest multiple data types (Primary Biodiversity Data, Metadata, Names);

synchronise with the GBIF Registry (part of the GBRDS);

index into a central database.

Harvesting Indexing

The Harvesting and Indexing Toolkit (HIT)
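
The harvesting and indexing steps listed for this component can be sketched as a small dispatch loop: look up each registered resource, pick an adapter for its protocol, and upsert the harvested records into one central index. This is a minimal sketch and not the HIT itself. The `Resource` shape, the adapter functions, the `REGISTRY` list and the record fields are all hypothetical stand-ins, and the network requests and XML parsing that real DiGIR or TAPIR harvesting requires are omitted.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Resource:
    """A registered data resource (hypothetical shape, for illustration)."""
    key: str
    protocol: str
    payload: tuple  # stands in for what a remote endpoint would return


def _to_records(resource):
    """Turn the stand-in payload into records with stable identifiers."""
    return [{"id": f"{resource.key}:{i}", "name": name}
            for i, name in enumerate(resource.payload)]


def harvest_digir(resource):
    # Placeholder adapter: a real one would query a DiGIR endpoint.
    return _to_records(resource)


def harvest_tapir(resource):
    # Placeholder adapter: a real one would issue TAPIR requests.
    return _to_records(resource)


ADAPTERS = {"DiGIR": harvest_digir, "TAPIR": harvest_tapir}


def harvest_and_index(registry, index):
    """Harvest every registered resource and upsert records into the index."""
    skipped = []
    for resource in registry:
        adapter = ADAPTERS.get(resource.protocol)
        if adapter is None:
            skipped.append(resource.key)  # no adapter for this protocol
            continue
        for record in adapter(resource):
            # Keyed by id, so a repeat harvest overwrites instead of duplicating.
            index[record["id"]] = record
    return skipped


REGISTRY = [
    Resource("herbarium", "DiGIR", ("Quercus robur", "Fagus sylvatica")),
    Resource("museum", "TAPIR", ("Puma concolor",)),
    Resource("legacy", "FTP", ()),
]

central_index = {}
skipped = harvest_and_index(REGISTRY, central_index)
print(len(central_index), skipped)
```

Putting each protocol behind its own adapter keeps the indexing loop unchanged when a new protocol is added.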

Registry component


Provide a mechanism that will:

provide a registry of organisations and resources (collections);

provide a registry of schemas and extensions;

provide a registry of services and tools.

A compass for all the information networks.

Registry

The Global Biodiversity Resources Discovery System (GBRDS)
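
To make the three kinds of registry entry more concrete, here is a minimal in-memory sketch with organisations and their resources, extensions, and services that can be discovered by protocol. It is an assumption-laden toy and not the GBRDS data model or API: the class names, method names, the "Example Herbarium" entry and the example.org URLs are all invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Organisation:
    name: str
    resources: list = field(default_factory=list)


@dataclass
class Service:
    resource: str
    protocol: str
    url: str


class Registry:
    """Hypothetical in-memory registry of organisations, extensions and services."""

    def __init__(self):
        self.organisations = {}
        self.services = []
        self.extensions = set()

    def register_organisation(self, name):
        # Return the existing entry if the organisation is already known.
        return self.organisations.setdefault(name, Organisation(name))

    def register_resource(self, organisation, resource):
        org = self.register_organisation(organisation)
        if resource not in org.resources:
            org.resources.append(resource)

    def register_service(self, resource, protocol, url):
        self.services.append(Service(resource, protocol, url))

    def register_extension(self, name):
        self.extensions.add(name)

    def discover(self, protocol):
        """Return the services that speak the given protocol."""
        return [s for s in self.services if s.protocol == protocol]


registry = Registry()
registry.register_resource("Example Herbarium", "vascular-plants")
registry.register_service("vascular-plants", "TAPIR", "http://example.org/tapir")
registry.register_service("vascular-plants", "DiGIR", "http://example.org/digir")
registry.register_extension("germplasm")

print([s.url for s in registry.discover("TAPIR")])
```

A harvester or portal could use a lookup such as `discover` to find endpoints without holding its own list of publishers.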

Portal component


Provide a platform that will publish: Primary Biodiversity Data, Names, Metadata.

Design it as a flexible and customisable platform to meet the needs of a variety of communities.

Node Access

The Nodes Portal Toolkit

Where are we today?


Harvesting Indexing Toolkit (HIT)

Global Biodiversity Resources Discovery System (GBRDS)

Development/Testing phase

Integrated Publishing Toolkit (IPT)

Production phase

Planning phase

Node Portal Toolkit (NPT)

Some successful examples…


The DarwinCore Germplasm Extension

Broadening standards


DarwinCore

Sample acquisition

Collecting event

Breeding event

‘IPR’

Trait experiment

Trait measurement

Some successful examples…


The DarwinCore Germplasm Extension

Publishing richer content.

Towards decentralisation


Global Register of Migratory Species

World Database on Protected Areas

More data types, increased content, better data quality, more participants.

Better discovery, improved integration.

Species richness changes…

A complex challenge…


A call for community participation

1. Improve standards (within and across domains);

2. Evaluate/Contribute to the GBIF Informatics Suite;

3. Develop specific use cases (assessing threats to biodiversity, monitoring impacts of invasive species, agro-biodiversity…);

4. Actively engage in the decentralisation of the GBIF architecture to meet YOUR needs;

5. Address challenges in data quality and completeness;

6. Constantly monitor data usage and review/prioritise the Informatics developments.

Ask the GBIF Team!

Nick King, GBIF Executive Secretary

Samy Gaiji, Head of Informatics

David Remsen, Senior Programme Officer for ECAT

Vishwas Chavan, Senior Programme Officer for DIGIT

Éamonn Ó Tuama, Senior Programme Officer for IDA

Andrea Hahn, Data Portal Manager

José Miguel Cuadra Morales, Programmer

Kyle Braak, Programmer

Markus Döring, Senior Programmer

Challenges: broadening data types!
