Registry webinar

Introductory training course on the Linked Data registry and its application to Environmental reference data.

Transcript of Registry webinar

Page 1: Registry webinar

Registry

1

Environment registry service

Introductory webinar

Page 2: Registry webinar

Outline

• reference data and role of the registry
• environment registry service
  – demonstration
  – information model
  – services
  – status
• processes and usage
  – data flows
  – preparation and access
  – governance
  – implications
• Q & A

Page 3: Registry webinar

Goals of the webinar

• understand the registry service
  – what it is
  – what it’s for
• stimulate thinking on
  – how to make use of it
  – processes and governance
• feedback
  – future training needs (technical details course)
  – service requirements

Page 4: Registry webinar

Motivation

Multiple forces
– deliver government policy
  • ICT Strategy
  • Digital Policy
  • Open Data
  • Open Standards
– INSPIRE
– Defra move to shared services (ICT Futures, Knowledge Strategy and Open Data Strategy)

“Government IT must be open - open to the people and organisations that use our services and open to any provider, regardless of their size.”

Page 5: Registry webinar

Motivation

• key to making data sharable, reusable and open is accessible reference data

Picture CC-BY-2.0 © [email protected]

Page 6: Registry webinar

Reference data

Coded record:  EG13987 | P816L | H | 0.x | u35 | 2012
Meaning:       Grafham water | phosphate level | High | 0.x | mg/l | 2012

• standardized terms to identify things in data
• normally represented by coded identifiers
• key to meaning of the data

Kinds of reference data shown:
• entities, places, objects
• substances, determinands
• units
• classifications, codes
• assessment methodology
• sampling methodology

Page 7: Registry webinar

Reference data identifiers

• to enable data reuse and access we need reference data that is:
  – independent of a particular data system
    • global identifiers
  – stable
    • persistent identifiers
  – interpretable
    • identifiers you can look up (“resolvable”)

Page 8: Registry webinar

Reference data identifiers

Open standards process
• persistent resolvable identifiers challenge
• adopt HTTP 1.1 and URLs for this
  – board resolution 24 Sep 2013
• obvious fit
  – global (DNS)
  – resolvable (HTTP)

Page 9: Registry webinar

Reference data identifiers

• => requirement to create and maintain URIs for identifiers in reference data
• challenges
  – how to share an authoritative namespace?
  – what do they dereference to?
  – how to manage authoritative lists of terms?

Page 10: Registry webinar

UKGovLD Registry

Registry – tool to manage URIs for reference data

Services:
– manage controlled lists of identifiers as URIs
– store core data so the identifiers resolve
– namespace management
– other: validation, discovery, version and history

Design and open source implementation
Defra instance at: http://environment.data.gov.uk/registry/

Page 11: Registry webinar

Picture CC-BY-2.0 © Annie Roi @flickr.com

Page 12: Registry webinar

Demo

• front page – list of all registers in registry

Page 13: Registry webinar

Demo

• filter – just registers from Environment Agency

Page 14: Registry webinar

Demo

• collection of code lists on some theme

Page 15: Registry webinar

Demo

• a flat code list

Page 16: Registry webinar

Demo

• an individual code in the code list

Page 17: Registry webinar

Demo

• a hierarchical code list

Page 18: Registry webinar

Demo

• a hierarchical code list

Page 19: Registry webinar

Machine accessible

• each collection of entries or individual entry has a URI
• accessed through a browser, you see a web page
• a machine can request a specific format like JSON directly (content negotiation)
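
Content negotiation can be sketched in a few lines of Python. This is a minimal illustration, not the registry's documented client: the media type strings are assumptions (the slide only names JSON), and the request is built but not sent, so the example runs without network access.

```python
from urllib.request import Request

# Code list URI taken from the information model slides.
REGISTER_URI = (
    "http://environment.data.gov.uk/registry/"
    "def/catchment-planning/RiverBasinDistrict"
)


def build_request(uri: str, media_type: str) -> Request:
    """Build (but do not send) a GET request asking for one representation."""
    return Request(uri, headers={"Accept": media_type})


# Same URI, two different Accept headers: that is content negotiation.
json_request = build_request(REGISTER_URI, "application/json")  # assumed media type
html_request = build_request(REGISTER_URI, "text/html")

print(json_request.get_method(), json_request.full_url)
print("Accept:", json_request.get_header("Accept"))
```

Passing either request to `urllib.request.urlopen` would send it; the server then chooses the representation from the `Accept` header.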

Page 20: Registry webinar

Picture CC-BY-2.0 © Annie Roi @flickr.com

Page 21: Registry webinar

Information model

Register - a controlled list of items

Page 22: Registry webinar

Information model

• Registers can contain registers or simple items

root register:
  http://environment.data.gov.uk/registry/
top level of definitions:
  http://environment.data.gov.uk/registry/def/
a themed collection:
  http://environment.data.gov.uk/registry/def/catchment-planning/
a code list:
  http://environment.data.gov.uk/registry/def/catchment-planning/RiverBasinDistrict/
codes:
  • http://environment.data.gov.uk/registry/def/catchment-planning/RiverBasinDistrict/UK01
  • http://environment.data.gov.uk/registry/def/catchment-planning/RiverBasinDistrict/UK02
  • ...
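
Because registers nest like folders, the chain of containing registers can be read straight off a code's URI. The sketch below assumes that every path segment below the root corresponds to a register, which holds for the example on this slide but is not guaranteed by the slides in general; the function name `containing_registers` is made up for illustration.

```python
from typing import List

ROOT = "http://environment.data.gov.uk/registry/"


def containing_registers(uri: str, root: str = ROOT) -> List[str]:
    """Return the registers that contain `uri`, from the root register down."""
    if not uri.startswith(root):
        raise ValueError(f"{uri} is not inside {root}")
    segments = [s for s in uri[len(root):].split("/") if s]
    chain = [root]
    # Every segment except the last names a nested register.
    for segment in segments[:-1]:
        chain.append(chain[-1] + segment + "/")
    return chain


code_uri = ROOT + "def/catchment-planning/RiverBasinDistrict/UK01"
for register in containing_registers(code_uri):
    print(register)
```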

Page 23: Registry webinar

Information model

Register - a controlled list of items
– manager and governance policy
– lifecycle and status for items in the register

Page 24: Registry webinar

Information model

Register item
– a definition of something
  • concept, organization, geographic area, substance ...
– represented as a set of property values
– type and label required
– values can be simple (strings, numbers) or URIs

http://environment.data.gov.uk/registry/def/catchment-planning/RiverBasinDistrict/UK05
  type      http://location.data.gov.uk/def/am/wfd/RiverBasinDistrict
  label     Anglian
  notation  UK05
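
As a rough illustration, the item above can be held as a plain dictionary of property values, with a check for the two required properties. The dictionary layout and the `missing_required` helper are assumptions made for this sketch, not the registry's own data format.

```python
REQUIRED_PROPERTIES = ("type", "label")

# The Anglian river basin district item from the slide.
item = {
    "uri": "http://environment.data.gov.uk/registry/def/catchment-planning/RiverBasinDistrict/UK05",
    "type": "http://location.data.gov.uk/def/am/wfd/RiverBasinDistrict",  # a URI value
    "label": "Anglian",                                                   # a simple string value
    "notation": "UK05",
}


def missing_required(properties: dict) -> list:
    """List the required properties that are absent or empty."""
    return [name for name in REQUIRED_PROPERTIES if not properties.get(name)]


print(missing_required(item))
print(missing_required({"label": "No type given"}))
```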

Page 25: Registry webinar

Information model

• Register item
  – no constraints on what properties you use to describe the item (“schema-less”)
  – open ended, can add richer descriptions later
  – gives a lot of flexibility for what you can register
  – tame this by adopting a few standard patterns
    • but can add more as needs change

Page 26: Registry webinar

Information model

Standard patterns
• collections of codes
  – SKOS (Simple Knowledge Organization System)
  – Concepts grouped into ConceptSchemes
• organizations
  – ORG (Organization Ontology)

Page 27: Registry webinar

Aside on RDF and linked data

• there’s a standard for how to represent such descriptions of things identified by URI
  – Resource Description Framework (RDF)
• the registry design and implementation are built on this standards stack
• the standard patterns are RDF vocabularies
• but you don’t need to use RDF in your information systems in order to use the registry

Page 28: Registry webinar

Information model

Linking
– value of a property can be a URI
  • any URI – same register, other register, external
– use for hierarchical structure within a register
  • concept schemes with broader/narrower links
  • organizational structure with sub-organizational links

Page 29: Registry webinar

Information model

Hierarchy within a register

[Diagram: register ea-areas; items Anglia, Central, Eastern, Northern, Midlands, ...; link labels hasSubOrganization and topConceptOf]
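
A hierarchy of this kind is just a set of parent/child links between items, and a client can rebuild the tree from them. The sketch below assumes one reading of the diagram (Central, Eastern and Northern as sub-organizations of Anglia, with Anglia and Midlands at the top level); the original picture is not recoverable from the transcript, so treat the data as illustrative.

```python
from collections import defaultdict

# (child, parent) pairs standing in for sub-organization or broader links.
# Assumed reading of the slide diagram, for illustration only.
links = [("Central", "Anglia"), ("Eastern", "Anglia"), ("Northern", "Anglia")]
top_level = ["Anglia", "Midlands"]

children = defaultdict(list)
for child, parent in links:
    children[parent].append(child)


def render(node: str, depth: int = 0) -> list:
    """Return the subtree under `node` as indented lines."""
    lines = ["  " * depth + node]
    for child in children.get(node, []):
        lines.extend(render(child, depth + 1))
    return lines


for top in top_level:
    print("\n".join(render(top)))
```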

Page 30: Registry webinar

Information model

Linking
– value of a property can be a URI
  • any URI – same register, other register, external
– use for hierarchical structure within a register
  • concept schemes with broader/narrower links
  • organizational structure with sub-organizational links
– use for cross matching between code lists
  • exact match within MMO experimental data

Page 31: Registry webinar

Information model

Cross-links between registers

[Diagram: MMO/data-theme items (Addresses, Admin areas, ...) and MMO/topic-category items (Address, Administrative Units, ...); link label exactMatch]

Page 32: Registry webinar

Information model

Linking
– value of a property can be a URI
  • any URI – same register, other register, external
– use for hierarchical structure within a register
  • concept schemes with broader/narrower links
  • organizational structure with sub-organizational links
– use for cross matching between code lists
  • exact match within MMO experimental data
– use to relate registered URIs with other URIs
  • same as links from organizations to organogram IDs

Page 33: Registry webinar

Information model

External items
– can register URIs outside the registry service
– register holds
  • definitive list of items
  • metadata about the registered items (status etc)
  • copy of minimal description information
– up to the URI owner to maintain the URI

Managed items
– URI is within the register namespace
– maintenance done within the registry service

Page 34: Registry webinar

Information model

Navigation
– browse hierarchy of registers
– text search for register or item
– filter set of registers

Page 35: Registry webinar

Information model

Filters
– category
  • classification of subject that the register is about (e.g. Water)
– entity
  • the type of thing in the register (e.g. Regions and Habitats)
– owner
  • the organization which owns and manages the register

Extensible
– this is just metadata associated with the registers
– could extend the category schemes or add others

Page 36: Registry webinar

Information model

• Summary
  – Register
    • controlled list
    • arranged in a hierarchy like folders in a file system
    • can be annotated and classified to help with navigation
  – Item
    • entry in register
    • can be external to the registry or internal (managed)
    • has a URI and extensible set of descriptive properties
    • optional standard patterns for the descriptions
    • properties enable links between items

Page 37: Registry webinar

Picture CC-BY-2.0 © Annie Roi @flickr.com

Page 38: Registry webinar

Registry services

• Outline of services
  – manage controlled lists of identifiers as URIs
  – serve data for the registers and managed items
  – namespace management
  – validation
  – discovery
  – version and history management

Page 39: Registry webinar

Registry services

An API for everything
– REST API for each of the services
– user interface is layered on top
– can build external tools which provide other interfaces but work via the API
– interface itself is template-driven and easy to modify

Page 40: Registry webinar

Registry services

• Outline of services
  – manage controlled lists of identifiers as URIs
    • create register
    • register item(s)
    • update items
    • change status

Page 41: Registry webinar

Registry services

• Outline of services
  – manage controlled lists of identifiers as URIs
  – serve data for the registers and managed items
    • return as RDF, JSON, CSV (TBD)
    • view as web page in browser
    • control over how lists (registers) are returned
      – see metadata as well as the items
      – filter by status
      – page through long lists

Page 42: Registry webinar

Registry services

• Outline of services
  – manage controlled lists of identifiers as URIs
  – serve data for the registers and managed items
  – namespace management
    • requests to parts of URI space can be forwarded to other services
    • some support for federation

Page 43: Registry webinar

Registry services

• Outline of services
  – manage controlled lists of identifiers as URIs
  – serve data for the registers and managed items
  – namespace management
  – validation
    • test whether a set of URIs is valid

Page 44: Registry webinar

Registry services

• Outline of services
  – manage controlled lists of identifiers as URIs
  – serve data for the registers and managed items
  – namespace management
  – validation
  – discovery
    • supports text search
    • user interface navigation support

Page 45: Registry webinar

Registry services

• Outline of services
  – manage controlled lists of identifiers as URIs
  – serve data for the registers and managed items
  – namespace management
  – validation
  – discovery
  – version and history management
    • stores history of item versions
    • versioned URIs
    • see item or register at point in time

Page 46: Registry webinar

Registry services

• Outline of services
  – manage controlled lists of identifiers as URIs
  – serve data for the registers and managed items
  – namespace management
  – validation
  – discovery
  – version and history management

Page 47: Registry webinar

Status

• open design and open source implementation
  – managed by UKGovLD
• proof of concept deployment
  – proved principle, and running stably for 8 months
• pilot deployment for environment.data.gov.uk
  – supported for one year
  – alpha but robust so far
  – security model for update based on OpenID
  – intention to continue service if pilot successful
  – may require enhancements e.g. server replication

Page 48: Registry webinar

Picture CC-BY-2.0 © Annie Roi @flickr.com

Page 49: Registry webinar

Processes and usage

• how do you use the registry?
• how do you get data into it?
• what data should go into it?
• how will that be managed and governed?

Page 50: Registry webinar

Using the reference data

Manual consultation
– e.g. developer looking up meaning of term in data

[Diagram: Registry service; lookup URI via browser]

Page 51: Registry webinar

Using the reference data

Use code list in IT application
– e.g. data entry dialog, export mapping ...

[Diagram: Registry service and IT Application; export code list via web API or manual download as RDF, [CSV] or JSON (support other formats?); use directly, or map to application specific local format]

Page 52: Registry webinar

Using the reference data

• Publish data using the references
  – use the URIs instead of free text or opaque codes
  – can use URIs in CSV or JSON, doesn’t have to be RDF linked data

Page 53: Registry webinar

Using the reference data

Two data sets with local codes:

Site  Det  Measurement
1     A    0.1
1     B    50
2     A    0.5
2     B    10

Site  Determinand  Value
A     X            100
B     X            10
C     X            20
D     X            100

The same data sets using URIs:

Site  Det       Measurement
1     http://…  10
1     http://…  50
2     http://…  50
2     http://…  10

Site  Determinand  Value
A     http://…     100
B     http://…     10
C     http://…     20
D     http://…     100

Combined:

Site  Det       Measurement
1     http://…  10
1     http://…  50
2     http://…  50
2     http://…  10
A     http://…  100
B     http://…  10
C     http://…  20
D     http://…  100
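
Swapping local codes for registered URIs in a CSV file needs nothing more than a lookup table. The sketch below uses the first table from this slide; the `example.org` URIs are placeholders invented for the example (the slide elides the real ones), and `publish` is a hypothetical helper name.

```python
import csv
import io

# Hypothetical mapping from local determinand codes to registered URIs.
CODE_TO_URI = {
    "A": "http://example.org/registry/def/determinand/A",
    "B": "http://example.org/registry/def/determinand/B",
}

LOCAL_CSV = "Site,Det,Measurement\n1,A,0.1\n1,B,50\n2,A,0.5\n2,B,10\n"


def publish(local_csv: str, mapping: dict) -> str:
    """Rewrite the Det column so each local code becomes its URI."""
    reader = csv.DictReader(io.StringIO(local_csv))
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=reader.fieldnames, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        row["Det"] = mapping[row["Det"]]  # a KeyError flags an unmapped code
        writer.writerow(row)
    return out.getvalue()


print(publish(LOCAL_CSV, CODE_TO_URI))
```

The output is still ordinary CSV; only the identifier column changes, which is the point made on the previous slide.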

Page 54: Registry webinar

Publishing reference data

[Diagram: routes into the Registry service
– manual registration and update
– existing local code lists: ETL or custom extraction to JSON, [CSV] or RDF, then the registry-util converter (data preparation service takes simple CSV input)
– direct federation to RDF source (SPARQL)
– code list server (proxy requests)]

Page 55: Registry webinar

Publishing reference data

• which data?
  – reference data that enables data reuse or sharing
  – between organizations or as part of open data
  – connective reference data
  – judgement required

Page 56: Registry webinar

Governance

[Diagram: register columns labelled Local, Experimental, Final, Network, Standard; roles shown: Business owners, Data board, Registry administration, Organization SRO (twice), publishers]

Page 57: Registry webinar

Processes and usage

• Registry is just a tool
  – manage a set of global, persistent identifiers
  – to enable data to be reused and integrated
    • whether across organizations or as open data
  – but up to you
    • which reference data should be managed this way
    • how the maintenance process should work in practice
    • when to map data to the shared identifiers
  – it’s there to reduce the cost of such management
    • to gain the benefits of reusable data
    • not to add additional processes for the sake of it

Page 58: Registry webinar

Next steps

Technical “how to” training planned

Coverage:
• preparing data for registration
• registration and managing entries
• accessing data

Suitable for
• potential registry administrators or publishers

Page 59: Registry webinar

Links

• Design and API details
  https://github.com/UKGovLD/ukl-registry-poc/wiki
• Alpha site
  http://environment.data.gov.uk/registry

Page 60: Registry webinar

Picture CC-BY-2.0 © Annie Roi @flickr.com

Page 61: Registry webinar

Spares

Page 62: Registry webinar

Full information model

reg:Register
  rdfs:label [1..*]
  dct:description [1..*]
  reg:owner [1] (foaf:Agent)
  reg:manager [1] (foaf:Agent)
  dct:license [0..*]
  reg:containedItemClass [0..*]
  reg:operatingLanguage [0..*]
  reg:governancePolicy [0..*] (rdfs:Resource)
  reg:validationQuery [0..*]
  dct:modified [0..1] (inferred)
  void:uriLookupEndpoint [0..*]
  void:uriSpace [0..1]
  void:exampleResource [0..*]
  void:openSearchDescription [0..*]

reg:RegisterItem
  rdfs:label [1..*]
  dct:description [0..*]
  dct:dateSubmitted [1] (automatic)
  dct:dateAccepted [0..1]
  dct:modified [0..1] (inferred)
  reg:itemClass [1..*]
  reg:submitter [1] (foaf:Agent)
  dct:license [0..*]
  reg:status [1..*]
  reg:category [0..*] (skos:Concept)
  reg:notation [0..1]
  reg:alias [0..*]
  reg:hasView [0..*]
  reg:representationOf [0..*]

version:Version
  owl:versionInfo [1]

reg:EntityReference
  reg:entity [1]
  reg:sourceGraph [0..1]

ldp:Container
  ldp:membershipPredicate
  reg:inverseMembershipPredicate

reg:Status values: reg:statusNotAccepted, reg:statusSubmitted, reg:statusInvalid, reg:statusAccepted, reg:statusValid, reg:statusExperimental, reg:statusStable, reg:statusDeprecated, reg:statusSuperseded, reg:statusRetired

Other classes shown: time:Interval, version:VersionedThing, void:Dataset

Link labels shown: reg:subregister, reg:register, reg:predecessor, version:interval, dct:replaces, dct:replacedBy, version:currentVersion, dct:isVersionOf, reg:status, reg:definition

Page 63: Registry webinar

Status lifecycle

Page 64: Registry webinar

Convenient views

• full RegisterItem/Register structure is complex
• versioning makes that a lot worse

Resource                Type
//registry              Register, VersionedThing
//registry:1            RegisterVersion
//registry:2            RegisterVersion
//registry/_reg         RegisterItem, VersionedThing
//registry/_reg:1       RegisterItemVersion
//registry/_reg:2       RegisterItemVersion
//registry/reg          Register, VersionedThing
//registry/reg:1        RegisterVersion
//registry/reg:2        RegisterVersion
//registry/reg/_foo     RegisterItem, VersionedThing
//registry/reg/_foo:1   RegisterItemVersion
//registry/reg/_foo:2   RegisterItemVersion
//registry/reg/foo      (entity)

Link labels shown: dct:versionOf, reg:register, reg:definition
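
The versioned URIs on this slide follow a visible pattern: a `:N` suffix on the last path segment. A small sketch of splitting such a URI into its base and version number is given below. It relies only on that visible pattern, not on any documented parsing rule, and `split_version` is a made-up helper name.

```python
from urllib.parse import urlsplit


def split_version(uri: str):
    """Split '.../_foo:2' into ('.../_foo', 2); unversioned URIs give (uri, None)."""
    # Look only at the last path segment so a port number is never mistaken
    # for a version.
    last_segment = urlsplit(uri).path.rsplit("/", 1)[-1]
    name, separator, version = last_segment.rpartition(":")
    if separator and name and version.isdecimal():
        return uri[: -(len(version) + 1)], int(version)
    return uri, None


print(split_version("//registry/reg/_foo:2"))
print(split_version("//registry/reg/foo"))
```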

Page 65: Registry webinar

Conceptual architecture

[Diagram components: nginx proxy, router, renderer, request processor, auth (user credentials, roles and bindings), registry core logic, store API, Registry RDF store, text index, style and templates, external UI, admin UI, conf API, log / audit trail]

Page 66: Registry webinar

Aside on details

• internally a register item is more complex
  – metadata about status, when submitted etc
  – the description of the thing (“entity”)

[Diagram: a Register and two RegisterItems (label, description, status, submitter, item class, date submitted etc); link labels register, definition and entity; each RegisterItem has an EntityReference]

Page 67: Registry webinar

Structure – information model

• managed entity
  – URL in registry namespace
  – registry holds master copy of the entity data

Register       http://.../def/catchment-planning/RiverBasinDistrict/
Register Item  http://.../def/catchment-planning/RiverBasinDistrict/_UK05
Entity         http://.../def/catchment-planning/RiverBasinDistrict/UK05
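
For a managed entity, the slide shows the register item URI as the entity URI with an underscore prefixed to the last segment. A minimal sketch of that naming convention follows; `register_item_uri` is an illustrative helper, and the convention is assumed to apply only to managed entities, since referenced entities live outside the register namespace.

```python
def register_item_uri(entity_uri: str) -> str:
    """Derive '.../_UK05' from '.../UK05' for an entity managed in a register."""
    register, _, local_name = entity_uri.rpartition("/")
    if not register or not local_name:
        raise ValueError(f"cannot split {entity_uri!r} into register and local name")
    return f"{register}/_{local_name}"


entity = (
    "http://environment.data.gov.uk/registry/"
    "def/catchment-planning/RiverBasinDistrict/UK05"
)
print(register_item_uri(entity))
```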

Page 68: Registry webinar

Structure – information model

• referenced entity
  – URL external to registry (well, register)
  – registry holds minimal copy of data

Register       http://.../def/catchment-planning/RiverBasinDistrict/
Register Item  http://.../def/catchment-planning/RiverBasinDistrict/_UK05
Entity         http://agency.gov.uk/RDB/Anglia

Page 69: Registry webinar

Information model

complicated by:
– item v. entity
– versioning

[Diagram: reg:Register, two reg:RegisterItem nodes, two reg:EntityReference nodes and the entity; link labels reg:register, reg:definition and reg:entity]

Page 70: Registry webinar

Information model

• default linked data view of Register is simplified
• configurable
  – alternative membership property or inverse property
  – so can make a register look like a skos:Collection, skos:ConceptScheme, owl:Ontology ...
  – also acts as an LDP container
• but can request full view (?_view=withMetadata)

[Diagram: full view with reg:Register, reg:RegisterItem, reg:EntityReference and entity, linked by reg:register, reg:definition and reg:entity; container view with an induced membership relation, default is rdfs:member]
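
Requesting the full view is a matter of adding the `_view=withMetadata` query parameter quoted on this slide to the register URI. The helper below is a small sketch of doing that safely when the URI may already carry a query string; the function name `with_view` is an assumption.

```python
from urllib.parse import urlencode, urlsplit, urlunsplit

REGISTER = (
    "http://environment.data.gov.uk/registry/"
    "def/catchment-planning/RiverBasinDistrict/"
)


def with_view(uri: str, view: str = "withMetadata") -> str:
    """Append the _view parameter, keeping any query string already present."""
    parts = urlsplit(uri)
    extra = urlencode({"_view": view})
    query = f"{parts.query}&{extra}" if parts.query else extra
    return urlunsplit(parts._replace(query=query))


print(with_view(REGISTER))
```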

Page 71: Registry webinar

Federation, delegation and namespaces

reg:Delegated
  reg:delegationTarget

reg:NamespaceForward
  reg:forwardingCode [0..1]

reg:DelegatedRegister
  reg:enumerationSubject [0..1]
  reg:enumerationPredicate [0..1]
  reg:enumerationObject [0..1]

reg:Register

reg:FederatedRegister
  reg:forwardingCode [0..1]

Page 72: Registry webinar

Federation, delegation and namespaces

Case 1: External entities
– identifier published in different namespace
– want to include it in authoritative list

Solution:
– just register as a referenced entity
– already seen this
– authoritative because it’s on the list
– can record properties of the entity, and maintain history
– no namespace management involved

Page 73: Registry webinar

Referenced entities

[Diagram: Registry with paths /local, /id, /local-authority; referenced entities held by an external service (e.g. opencommunities.org) or hosted by the LA directly]

Page 74: Registry webinar

Federation, delegation and namespaces

Case 2: Namespace allocation
– want someone else to serve part of the registry namespace
– might be a single item or a complete register sub tree
– e.g. allocating namespace in location.data.gov.uk for serving INSPIRE spatial object identifiers

Solution:
– reg:NamespaceForward
– can be a redirect (30X) or proxy (200)
– no constraints on whether target acts like a Registry
– target ought to serve linked data with URIs in the right namespace, but not required

Page 75: Registry webinar

Namespace forward

[Diagram: Registry with paths /local, /id, /local-authority; requests forwarded to an external web site, which could be anything]

Page 76: Registry webinar

Federation, delegation and namespaces

Case 3: Federated register
– want someone else to run part of the registry infrastructure but act like one big registry
– integrated search, validation etc

Solution:
– reg:FederatedRegister
– can be a redirect (30X) or proxy (200)
– target endpoint must comply with Registry API at least for search, validation and entity lookup

Page 77: Registry webinar

Federated register

[Diagram: Registry with paths /local, /id, /local-authority; Federated registry with paths /local-authority, /id]

Page 78: Registry webinar

Federation, delegation and namespaces

Case 4: Delegating a register
– want someone else to serve the list of contents of the register
– but they only have a triple store, not a full registry implementation

Solution:
– reg:DelegatedRegister
– specify SPARQL endpoint and triple pattern to enumerate members

reg:DelegatedRegister
  reg:delegationTarget [1]
  reg:enumerationSubject [0..1]
  reg:enumerationPredicate [0..1]
  reg:enumerationObject [0..1]
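
One plausible way to read the three enumeration properties is as a triple pattern with exactly one position left open, the open position being the register member. The sketch below builds a SPARQL query on that assumption; it is not taken from the registry implementation, and the function name and the example pattern are illustrative.

```python
from typing import Optional

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def enumeration_query(
    subject: Optional[str] = None,
    predicate: Optional[str] = None,
    obj: Optional[str] = None,
) -> str:
    """Build a SELECT query; the one unset position becomes ?member."""
    terms = [subject, predicate, obj]
    if terms.count(None) != 1:
        raise ValueError("leave exactly one position open for the members")
    pattern = " ".join("?member" if term is None else f"<{term}>" for term in terms)
    return f"SELECT ?member WHERE {{ {pattern} }}"


# Hypothetical delegation: members are all resources typed as river basin districts.
print(
    enumeration_query(
        predicate=RDF_TYPE,
        obj="http://location.data.gov.uk/def/am/wfd/RiverBasinDistrict",
    )
)
```

The resulting query would be sent to the SPARQL endpoint named by reg:delegationTarget.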

Page 79: Registry webinar

Delegated register

[Diagram: Registry with paths /local, /id, /local-authority; External SPARQL service]

Page 80: Registry webinar

Security model

• authentication
  – OpenID (e.g. Google, Google profile)
• authorization
  – permissions
    • Register, Update, StatusUpdate, Force, Grant, GrantAdmin
    • inherit down the tree
    • e.g.: Register,Update:/example/local
  – can grant to known user or anyone authenticated
  – bundled into roles
    • Maintainer – Update, Grant
    • Manager – Register, StatusUpdate, Update, Grant
    • Authorized – Register, Update, StatusUpdate - for experimental areas
    • Administrator - anything
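
The permission model can be illustrated with a short check that a grant on a path also covers everything beneath it. This is a sketch of the idea only: the grant notation is the one shown on the slide, the role table copies the slide, and the functions `parse_grant` and `is_allowed` are hypothetical, not the registry's own code.

```python
ALL_PERMISSIONS = {"Register", "Update", "StatusUpdate", "Force", "Grant", "GrantAdmin"}

# Roles as bundles of permissions, copied from the slide.
ROLES = {
    "Maintainer": {"Update", "Grant"},
    "Manager": {"Register", "StatusUpdate", "Update", "Grant"},
    "Authorized": {"Register", "Update", "StatusUpdate"},
    "Administrator": set(ALL_PERMISSIONS),  # "anything"
}


def parse_grant(text: str):
    """Parse 'Register,Update:/example/local' into (permissions, path)."""
    permissions, _, path = text.partition(":")
    return set(permissions.split(",")), path


def is_allowed(grants, action: str, path: str) -> bool:
    """True if some grant covers `path` (or an ancestor of it) for `action`."""
    for permissions, granted_path in grants:
        prefix = granted_path.rstrip("/")
        inside = path == prefix or path.startswith(prefix + "/")
        if inside and action in permissions:
            return True
    return False


grants = [parse_grant("Register,Update:/example/local")]
print(is_allowed(grants, "Update", "/example/local/codes"))   # inherited down the tree
print(is_allowed(grants, "Update", "/example/other"))         # outside the granted subtree
```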