Digital Object Identifiers for EOSDIS data

Digital Object Identifiers for EOSDIS data. HDF Workshop, April 17, 2012. John Moses, ESDIS


Page 1: Digital Object Identifiers for EOSDIS data

Digital Object Identifiers for EOSDIS data

HDF Workshop, April 17, 2012

John Moses, ESDIS

Page 2: Digital Object Identifiers for EOSDIS data

Assessment of identification schemes

Study by ESIP Cluster on Preservation and Stewardship in 2009

[Table: each ID scheme rated against four use cases, at both the data set level and the item level. The ratings are not recoverable from this transcript.]

ID schemes: URL/N/I, PURL, XRI, Handle, DOI, ARK, LSID, OID, UUID

Use cases: Unique Identifier, Unique Locator, Citable Locator, Scientifically Unique ID

Adapted from Duerr, R. E., et al. 2011 (submitted). On the utility of identification schemes for digital Earth science data: An assessment and recommendations. Earth Science Informatics.

Page 3: Digital Object Identifiers for EOSDIS data

Digital Object Identifier for EOS products

• The DOI® system and the Handle System provide an Internet resolution service for unique and persistent identifiers of digital objects
– Internet infrastructure components owned by International DOI Foundation (IDF)
– www.doi.org

• A DOI consists of a two-part alphanumeric string
– doi:[prefix]/[suffix]; for example doi:10.5067/123
– Prefix: 10 identifies the DOI registry; 5067 identifies the Registrant Agent
– Suffix: the alphanumeric string 123 uniquely identifies the data item

• The purpose in assigning DOIs to EOSDIS products is to provide a permanent data identifier for citation in publications
– ESIP citation guideline using doi:
– Doe, J. and R. Roe. 2001. The FOO Data Set. Version 2.3. The FOO Data Center. http://dx.doi.org/10.xxxx/notfoo.547983. Accessed 1 May 2011.
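The two-part structure described on this slide can be sketched in a few lines of Python. This is only an illustration: the `Doi` type and the `parse_doi` and `resolver_url` helpers are hypothetical names, not part of any DOI tooling, and the link form simply mirrors the `http://dx.doi.org/` pattern in the citation example.

```python
from typing import NamedTuple


class Doi(NamedTuple):
    registry: str    # "10" in the slide's example
    registrant: str  # "5067" in the slide's example
    suffix: str      # "123" in the slide's example


def parse_doi(text: str) -> Doi:
    """Split 'doi:10.5067/123' (or '10.5067/123') into its parts."""
    s = text.strip()
    if s.lower().startswith("doi:"):
        s = s[4:].strip()
    # The first "/" separates prefix from suffix; the suffix may contain "/".
    prefix, slash, suffix = s.partition("/")
    registry, dot, registrant = prefix.partition(".")
    if not (slash and suffix and dot and registrant) or registry != "10":
        raise ValueError(f"not a DOI: {text!r}")
    return Doi(registry, registrant, suffix)


def resolver_url(doi: Doi) -> str:
    """Build the dx.doi.org link form used in the citation example."""
    return f"http://dx.doi.org/{doi.registry}.{doi.registrant}/{doi.suffix}"


doi = parse_doi("doi: 10.5067/123")
print(doi)
print(resolver_url(doi))
```

The sketch does not percent-encode the suffix, so it is only suitable for suffixes made of URL-safe characters.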


Page 4: Digital Object Identifiers for EOSDIS data

Implementing DOIs for EOSDIS

– Develop ops concept through pilot processes
• Guidelines for DOI suffix, location & citation information
• Request, assign, monitor DOIs, location & citation metadata
• Add DOIs to DAAC product citation web pages
• Embed DOIs into product metadata at next reprocessing
– HIRDLS, GLAS, AMSR-E data providers are in final reprocessing
• Add DOIs to GCMD and ECHO through metadata updates
• Add DOI metadata to NTRS for searchable documentation
• Set up metrics collection from journal citation reports


Page 5: Digital Object Identifiers for EOSDIS data

Implementation in Interoperable Architectures

[Figure: Metadata flows in NASA Earth Science Data Systems. Recoverable labels: Provenance collection, DOI Provenance Services, DOI tools, NASA Technical Reports Server.]

Page 6: Digital Object Identifiers for EOSDIS data

Attributes for embedding DOIs

• Framework structures in HDF and netCDF
– HDF global attribute name and value versus naming an identifier group (which would allow discovery of identifier types)
– ECS CoreMetadata Product Specific Attributes in the AdditionalAttributes group section
– netCDF file-level attribute name: “Id” and “naming authority”

• Consider attribute names for DOI value:
– Advantage to having two parts: a key code to indicate that this is an identifier, and a namespace that indicates the type/application of the DOI; e.g., that it applies to the data product level (i.e., has the same value for all granules/files of the series, making it a series identifier).

• Hypothetical DOI example
– Attribute name: identifier_product_DOI
– Attribute value: 10.5067/Aura/HIRDLS/data1
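One way to read the two-part naming idea is that a tool could discover identifier attributes by their key code without knowing every identifier type in advance. The sketch below assumes the name is laid out as key code, scope, and type joined by underscores, as in the hypothetical `identifier_product_DOI`. A plain dictionary stands in for a file's global attributes, and `KEY_CODE`, `IdentifierAttr`, and `find_identifiers` are illustrative names only.

```python
from typing import Dict, List, NamedTuple

KEY_CODE = "identifier"  # assumed key code marking an attribute as an identifier


class IdentifierAttr(NamedTuple):
    scope: str    # what the identifier applies to, e.g. "product"
    id_type: str  # the identifier scheme, e.g. "DOI"
    value: str


def find_identifiers(global_attrs: Dict[str, str]) -> List[IdentifierAttr]:
    """Return every attribute named <KEY_CODE>_<scope>_<type>."""
    found = []
    for name, value in global_attrs.items():
        parts = name.split("_")
        if len(parts) == 3 and parts[0] == KEY_CODE:
            found.append(IdentifierAttr(parts[1], parts[2], str(value)))
    return found


# A dictionary standing in for a file's global attributes (values are made up,
# apart from the hypothetical DOI attribute from the slide).
attrs = {
    "title": "example granule",
    "identifier_product_DOI": "10.5067/Aura/HIRDLS/data1",
}
ids = find_identifiers(attrs)
print(ids)
```

Because the DOI applies at the product level, every granule of the series would carry the same attribute value.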


Page 7: Digital Object Identifiers for EOSDIS data

MORE BACKGROUND


Page 8: Digital Object Identifiers for EOSDIS data

DOI Examples for Pilot Projects

Suffix Model                                  String Example

[mission]/[instrument]/data[1-n]              doi:10.5067/Aura/HIRDLS/data1234
                                              doi:10.5067/ICESat/GLAS/data1234
                                              doi:10.5067/Aqua/AMSR-E/data1234

[campaign]/[measurement group]/data[1-n]      doi:10.5067/BOREAS/Airborne/data1234
[campaign]/[platform group]/data[1-n]

[program]/[measurement group]/data[1-n]       doi:10.5067/MEaSUREs/OceanFluxes/data1234
[measurement group]/data[1-n]                 doi:10.5067/MEaSUREs/SnowExtent/data1234
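The three-part suffix models above can be expressed as a small builder and parser. This is a sketch under two assumptions: that `data[1-n]` means the literal word `data` followed by a positive counter, and that neither group name contains a slash or whitespace. The function names are hypothetical, and the two-part `[measurement group]/data[1-n]` model is not covered.

```python
import re
from typing import Tuple

REGISTRANT_PREFIX = "10.5067"  # prefix used in the pilot examples

# <group>/<subgroup>/data<n>, e.g. Aura/HIRDLS/data1234
SUFFIX_PATTERN = re.compile(r"^([^/\s]+)/([^/\s]+)/data([1-9][0-9]*)$")


def make_doi(group: str, subgroup: str, n: int) -> str:
    """Build a DOI such as 10.5067/Aura/HIRDLS/data1234."""
    suffix = f"{group}/{subgroup}/data{n}"
    if SUFFIX_PATTERN.match(suffix) is None:
        raise ValueError(f"suffix does not fit the model: {suffix!r}")
    return f"{REGISTRANT_PREFIX}/{suffix}"


def split_doi(doi: str) -> Tuple[str, str, int]:
    """Recover (group, subgroup, n) from a DOI built by make_doi."""
    prefix, _, suffix = doi.partition("/")
    match = SUFFIX_PATTERN.match(suffix)
    if prefix != REGISTRANT_PREFIX or match is None:
        raise ValueError(f"DOI does not fit the model: {doi!r}")
    return match.group(1), match.group(2), int(match.group(3))


print(make_doi("Aura", "HIRDLS", 1234))
print(split_doi("10.5067/Aqua/AMSR-E/data1234"))
```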


Page 9: Digital Object Identifiers for EOSDIS data

DOI Registration and Guidelines

• A DOI will be assigned for each EOSDIS standard data product
• The DOI subscription holder (ESDIS) will provide location & citation metadata to the DOI subscription provider (CDL EZID) and will be notified when the DOI has been registered
– Ideally we want one DOI per data item, but the registry does not preclude multiple registrations of similar data
• New DOI metadata can be uploaded as frequently as desired
– Typically when location or citation information changes
• A major new version of the data product would be assigned a new DOI. DOIs of old versions that are no longer available would have updated locators that point to the new version (with explanation)
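The versioning rule in the last bullet can be modelled with an in-memory registry. The sketch below is a toy model, not the registration service: `DoiRecord`, `register_new_version`, and the `example.org` locations are all hypothetical, and the real update would go through the subscription provider.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class DoiRecord:
    location: str   # where the DOI currently resolves
    note: str = ""  # explanation shown when an old DOI is redirected


def register_new_version(
    registry: Dict[str, DoiRecord],
    old_doi: str,
    new_doi: str,
    new_location: str,
    old_still_available: bool,
) -> None:
    """Give a major new version its own DOI and, if the old data are gone,
    repoint the old DOI at the new version with an explanation."""
    if old_doi not in registry:
        raise KeyError(old_doi)
    if new_doi in registry:
        raise ValueError(f"{new_doi} is already registered")
    registry[new_doi] = DoiRecord(location=new_location)
    if not old_still_available:
        old = registry[old_doi]
        old.location = new_location
        old.note = f"Superseded by doi:{new_doi}"


# Hypothetical locations, for illustration only.
registry = {"10.5067/Aura/HIRDLS/data1": DoiRecord("https://example.org/hirdls/v1")}
register_new_version(
    registry,
    old_doi="10.5067/Aura/HIRDLS/data1",
    new_doi="10.5067/Aura/HIRDLS/data2",
    new_location="https://example.org/hirdls/v2",
    old_still_available=False,
)
print(registry)
```

The old DOI is never deleted, which is what keeps citations in published papers resolvable.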


Page 10: Digital Object Identifiers for EOSDIS data

Guidelines for DOI suffix

• The DOI itself should be a relatively short string so that users can read it from printed material or a display and key it into a browser with minimum error.
• The DOI suffix (ASCII characters with no spaces):
– Would be a descriptive name of domain-specific structure that reflects the science data product contents
– Should have some recognition by the research community, such as a semantic name or acronym, e.g., instrument/platform/campaign/investigation name or measurement parameter
– Should help readers distinguish between published paper and dataset
– Should not have organizational reference subject to change (i.e., publisher, archive, owner)
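The mechanical parts of these guidelines (ASCII only, no spaces, relatively short) can be checked automatically, while the semantic ones still need human review. The sketch below is a minimal check. The 40-character limit is an assumed threshold, since the slide gives no number, and `suffix_problems` is a hypothetical helper name.

```python
from typing import List

MAX_SUFFIX_LENGTH = 40  # assumed threshold; the guideline only says "relatively short"


def suffix_problems(suffix: str) -> List[str]:
    """Return a list of guideline violations; an empty list means none found."""
    problems = []
    if not suffix:
        problems.append("suffix is empty")
    if not suffix.isascii():
        problems.append("suffix contains non-ASCII characters")
    if any(ch.isspace() for ch in suffix):
        problems.append("suffix contains whitespace")
    if len(suffix) > MAX_SUFFIX_LENGTH:
        problems.append(f"suffix is longer than {MAX_SUFFIX_LENGTH} characters")
    return problems


print(suffix_problems("Aura/HIRDLS/data1"))
print(suffix_problems("Aura HIRDLS"))
```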


Page 11: Digital Object Identifiers for EOSDIS data

Member Institute using DataCite (RA): California Digital Library and EZID

• EZID is a service providing researchers a way to manage identifiers persistently for datasets, files, and resources of all types.
• The service is available via a machine-to-machine programming interface (an API) and as a web user interface.
• Core functions:
– Create a persistent identifier: DOI
– Add object location (URL landing page, separate from citation)
– Add citation metadata (DataCite repository, mandatory shown below)
  • Creator (person or organization)
  • Title (long name of dataset)
  • Publisher (holder of the data; the organization making it available)
  • Publication Year (year when data was, or will be, first available)
– Update object location
– Update object metadata
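Before a registration request is sent, the four mandatory citation fields can be checked locally. The sketch below only assembles and validates a record. It does not call EZID, and the field names, the `build_registration` helper, and the `example.org` landing page are assumptions for illustration rather than the service's actual request format.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict


@dataclass(frozen=True)
class CitationMetadata:
    creator: str           # person or organization
    title: str             # long name of the dataset
    publisher: str         # organization making the data available
    publication_year: int  # year the data was, or will be, first available


def build_registration(
    doi: str, landing_page: str, citation: CitationMetadata
) -> Dict[str, Any]:
    """Combine identifier, location, and citation, rejecting blank fields."""
    fields = asdict(citation)
    missing = [
        name
        for name, value in fields.items()
        if value is None or (isinstance(value, str) and not value.strip())
    ]
    if missing:
        raise ValueError(f"missing mandatory metadata: {', '.join(missing)}")
    return {"doi": doi, "location": landing_page, "citation": fields}


record = build_registration(
    doi="10.5067/Aura/HIRDLS/data1",
    landing_page="https://example.org/hirdls",  # hypothetical landing page
    citation=CitationMetadata(
        creator="Doe, J. and R. Roe",
        title="The FOO Data Set",
        publisher="The FOO Data Center",
        publication_year=2001,
    ),
)
print(record)
```

Keeping location separate from citation mirrors the slide: the landing page can change without touching the citation metadata, and vice versa.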


Page 12: Digital Object Identifiers for EOSDIS data

DOI Persistence


Page 13: Digital Object Identifiers for EOSDIS data

Registration Agent: DataCite

• DataCite established a scientific data application with IDF.
• Service is run by an open membership organization of gov and edu libraries, focused on improving the scholarly infrastructure around datasets.
• Most appropriate RA because of their focus on working with data centers to assign persistent identifiers to datasets, leveraging the Digital Object Identifier (DOI) infrastructure.
• United States Member Institutes
– California Digital Library (Founding Member)
  • Recommended subscription provider because of bulk pricing and EZID Web/API services
– Office of Scientific and Technical Information, US Department of Energy (new Member Dec 2010)
– Purdue University Libraries (Member)
– Interuniversity Consortium for Political and Social Research - ICPSR (Associate Member)
– Microsoft Research (Associate Member)

TIB: German National Library of Science and Technology
