TDWG Infrastructure Project (TIP) Technical Architecture Group (TAG) Roger Hyam TDWG Executive Meeting June 1-2, 2006 - Madrid, Spain

TDWG Infrastructure Project (TIP)

Technical Architecture Group (TAG)

Roger Hyam

TDWG Executive Meeting

June 1-2, 2006 - Madrid, Spain

TAG – Proposed Role

• Maintain an account of the current situation.

• Maintain a vision of how things could be.

• Provide formal advice to the Executive Committee on new subgroups and standards.

• Provide advice to TDWG members on how their work can integrate with others.

(all from a purely technical perspective)

Paradigm

• Starting assumption is that standards are about sharing data.

• Sharing data also implies sharing data through time.

Archive

What is Shared?

• Sharing raw literals isn’t much use.

• They need to be gathered together into ‘semantic’ units or objects.

[Figure: the raw literals "1234", "Bellis" and "perennis" gathered into one object, TaxonName 1234: Bellis perennis]
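As a minimal sketch of the idea on this slide, the three raw literals can be gathered into one typed object. The class and field names (`TaxonName`, `identifier`, `genus`, `specific_epithet`) are assumptions made for illustration, not TDWG definitions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TaxonName:
    """Hypothetical semantic unit built from raw literals."""
    identifier: str        # e.g. "1234"
    genus: str             # e.g. "Bellis"
    specific_epithet: str  # e.g. "perennis"

    @property
    def full_name(self) -> str:
        return f"{self.genus} {self.specific_epithet}"


# On their own, the literals say nothing about what they are.
raw_literals = ["1234", "Bellis", "perennis"]

# Gathered into an object, each literal has a named role.
name = TaxonName(*raw_literals)
print(name.full_name)
```

The point of the sketch is only that a receiver can ask for `name.genus` rather than guess which string is which.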

Semantics of Objects

• Objects need to be based on some shared semantics.

• There needs to be somewhere to look up what they mean – an ontology.

[Figure: a TaxonName object (Bellis perennis) whose type, "TaxonName?", is looked up in the Ontology]

Identity of Objects

• How do I refer to this object?

• Who should I credit?

• Who should I send corrections to?

• Is it the same record as I already have or is it a new one?

• What is the official version of this data - has someone altered it before I received it?

TAG-1 Meeting

• There was consensus on:

– Architecture is concerned with shared data

– Biodiversity data will be modeled as a graph of identifiable objects

– The semantics of these objects will be encoded in a series of shared ontologies

– Ontologies will be related to each other on the basis of shared Base and Core ontologies as a minimum

• Discussion continues on how this is done
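One way to picture "a graph of identifiable objects" is the sketch below. Every identifier, type and property name in it (`spec:42`, `identifiedAs`, `heldBy` and so on) is hypothetical, and the slides do not say how the graph would actually be encoded.

```python
from collections import defaultdict

# Hypothetical identifiers and property names, for illustration only.
objects = {
    "name:1234": {"type": "TaxonName", "label": "Bellis perennis"},
    "inst:1": {"type": "Institution", "label": "Example Herbarium"},
    "spec:42": {"type": "Specimen", "label": "Sheet 42"},
}

# Edges link objects by identifier: (subject, predicate, target).
edges = [
    ("spec:42", "identifiedAs", "name:1234"),
    ("spec:42", "heldBy", "inst:1"),
]

# Index the edges by subject so links can be followed from any object.
outgoing = defaultdict(list)
for subject, predicate, target in edges:
    outgoing[subject].append((predicate, target))


def describe(identifier: str) -> list[str]:
    """List the labelled links leaving one object."""
    return [
        f"{objects[identifier]['label']} {predicate} {objects[target]['label']}"
        for predicate, target in outgoing[identifier]
    ]


print(describe("spec:42"))
```

Because objects refer to each other only by identifier, each one could in principle be held by a different provider.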

Implications

• We need an ontology to define and relate the objects we exchange.

• Ontology governance/management is paramount.

• We need a system of GUIDs to identify the objects.

• We need a roadmap for the protocols to exchange these objects.

Structure of the Ontology

• Base Ontology: BaseThing, BaseActor

• Core Ontology: CoreTaxonName, CoreInstitution

• Domain Ontology: TaxonName, NomenclaturalType, NomenclaturalNote, Herbarium

• Application Ontologies: ABCD, DarwinCore, ???
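The layering on this slide might be sketched as below. The concept names come from the slide, but the parent links between them (for example `Herbarium` extending `CoreInstitution`) are assumptions for illustration, since the slide text does not state them.

```python
LAYERS = ["Base", "Core", "Domain", "Application"]

# concept -> (layer, parent). Parent links are assumed, not from the slide.
CONCEPTS = {
    "BaseThing": ("Base", None),
    "BaseActor": ("Base", "BaseThing"),
    "CoreTaxonName": ("Core", "BaseThing"),
    "CoreInstitution": ("Core", "BaseActor"),
    "TaxonName": ("Domain", "CoreTaxonName"),
    "Herbarium": ("Domain", "CoreInstitution"),
}


def ancestry(concept: str) -> list[str]:
    """Follow parent links from a concept up to its root."""
    chain = [concept]
    while CONCEPTS[chain[-1]][1] is not None:
        chain.append(CONCEPTS[chain[-1]][1])
    return chain


def respects_layering(concept: str) -> bool:
    """True if each concept extends one in the same or a more basic layer."""
    ranks = [LAYERS.index(CONCEPTS[c][0]) for c in ancestry(concept)]
    return all(a >= b for a, b in zip(ranks, ranks[1:]))


print(ancestry("Herbarium"))
print(respects_layering("Herbarium"))
```

A check like `respects_layering` is one possible way to keep Domain constructs anchored to the shared Base and Core layers.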

Ontology Governance

• Allow people to create Domain sub-ontologies easily – prevent alienation.

• Each ontology construct (concept) has a status.

• Status is increased by passing through explicit gates defined by actual usage.

Experimental → Shared → Recommended
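The status progression could be sketched as a simple gate check. The slide says only that gates are "defined by actual usage", so the measure used here (a count of independent users) and the threshold values are hypothetical.

```python
from enum import IntEnum


class Status(IntEnum):
    EXPERIMENTAL = 0
    SHARED = 1
    RECOMMENDED = 2


# Hypothetical gates: minimum number of independent users of a concept.
GATES = {Status.SHARED: 2, Status.RECOMMENDED: 5}


def status_for(usage_count: int) -> Status:
    """Return the highest status whose gate the usage count passes."""
    status = Status.EXPERIMENTAL
    for candidate, threshold in sorted(GATES.items()):
        if usage_count >= threshold:
            status = candidate
    return status


for count in (0, 2, 5):
    print(count, status_for(count).name)
```

Anyone can create a concept at the experimental level; it only gains status once usage shows that others rely on it.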

Recommendations

• We need to develop a TDWG standard that specifies how we manage the TDWG ontology.

• We need a technology-independent way of working with the ontology that can be understood and manipulated by biologists – some form of web-based application.

Protocols

• Resolution – LSID, URL etc.

• Harvest – OAI, RSS, other?

• Search/Query – BioCASe, DiGIR, TAPIR, SPARQL, other?
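For the resolution case, an LSID has the general form `urn:lsid:<authority>:<namespace>:<object>[:<revision>]`. The sketch below only splits such an identifier into its parts and does not resolve it. The function name `parse_lsid` and the example identifiers are hypothetical.

```python
from typing import NamedTuple, Optional


class LSID(NamedTuple):
    authority: str
    namespace: str
    object_id: str
    revision: Optional[str]


def parse_lsid(text: str) -> LSID:
    """Split an LSID string into its components (no resolution is done)."""
    parts = text.split(":")
    has_prefix = [p.lower() for p in parts[:2]] == ["urn", "lsid"]
    if len(parts) not in (5, 6) or not has_prefix:
        raise ValueError(f"not an LSID: {text!r}")
    if not all(parts[2:5]):
        raise ValueError(f"empty component in LSID: {text!r}")
    revision = parts[5] if len(parts) == 6 else None
    return LSID(parts[2], parts[3], parts[4], revision)


# Hypothetical identifier; example.org is a placeholder authority.
print(parse_lsid("urn:lsid:example.org:names:1234"))
```

The optional revision component is what would let a receiver tell whether a record has changed since it was last seen.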

Challenges

Protocol    Number at end 2007
DiGIR       200+
BioCASe     ~100
TAPIR       10, possibly 40+
SPARQL      30+
LSID        10?

Solutions

• Resolution and harvest protocols are relatively easy to plug into or wrap round existing service providers.

• Implementers of the most widely used protocols are ‘on board’.

• …We have a clear, agreed direction.