What Makes a Data Archive Tick: Marrying Content and User Support
Steven Worley
National Center for Atmospheric Research
Computational and Information Systems Laboratory
May 17-21, 2010
Summer Institute for Data Curation for Earth and Environmental Science
Graduate School of Library and Information Science
University of Illinois, Urbana-Champaign
① How to make and keep the archive content relevant to the users?
② How to engage the users?

How to make and keep the archive content relevant to the users?
Know your users
• Define your focus community
  • Cannot serve everyone
  • Design service not to limit others
• At decision points (e.g. changes in service) ask:
  • “Is this a significant benefit for my users?”
• The case @ NCAR
  • Atmospheric, oceanic, and some related geo-science research
  • Graduate students and higher education
  • NCAR scientists, researchers @ universities with graduate degree programs in meteorology and oceanography
  • Over 50% of 6000+ unique users, annually, are outside focus group

Understand their science, currently, and trends
• Attend seminars, symposia, meetings where they present their work
• Corollary: Have science-educated staff
• The case @ NCAR – Research Data Archive
  • All have MS degrees, or greater
    • meteorology (6)
    • oceanography (2)
    • computing science (1)
    • exception – admin. (1)

Understand their science, currently, and trends
• Routinely review journals, bulletins, and relevant newsletters
  • Search for science strongly dependent on your data focus
  • Contact authors, offer data sharing service
• @ NCAR

Understand their science, currently, and trends
• Develop close contacts with a few key users
  • Seek ‘honest’ opinions about your service
• Make your service known – presentations, publications
• @ NCAR

Know how your users work
• How do they prefer to handle data?
  • Digital files – write and run program codes to evaluate content
  • Digital files – specific formats that are application friendly
    • E.g. netCDF, GIS, WMO ASCII text convenient for worksheets
  • Images of analyses (charts, line graphs, 2D/3D contoured plots)
• @ NCAR
  • Digital files are key
  • Some images for discovery, but not critical
• Design the systems to deliver what users want

Choosing the content
• At decision points (e.g. adding a new dataset) ask:
  • “Can we handle this efficiently?”
  • Does it supplement or extend the central data foci?
  • Does it address a new need or trend?
  • Are the formats aligned with user preferences?
    • If not, can we make a cost-effective conversion?
  • Do you have staff (data scientists / stewards) that can understand the scientific content?
• @ NCAR
  • Atmospheric, oceanic, related geo-sciences observations or analyses derived from observations to support climate and weather research.

Choosing the content
• Evaluate user metrics
  • What datasets are most popular?
  • Who is using what – can you distinguish your focus group?
  • Are there any trends?
  • Caution: this is only part of the story
• @ NCAR
  • Our user registration allows us to track this
  • Examples
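As a minimal sketch of how registration-linked download records could answer the metric questions above: the record layout, the user ids, and the `in focus group` flag are all assumptions made for illustration, not a description of the NCAR system.

```python
from collections import defaultdict

# Hypothetical download records: (registered user id, dataset id, in focus group?).
# The ids and flags are invented for illustration only.
records = [
    ("u1", "ds083.2", True),
    ("u2", "ds083.2", False),
    ("u1", "ds083.2", True),   # repeat download by the same user
    ("u3", "ds090.0", True),
    ("u2", "ds090.0", False),
    ("u4", "ds277.0", False),
]

def unique_users_by_dataset(rows):
    """Map each dataset id to the set of distinct users who accessed it."""
    users = defaultdict(set)
    for user, dataset, _in_focus in rows:
        users[dataset].add(user)
    return users

def focus_share(rows):
    """Fraction of distinct users that belong to the focus group."""
    focus = {}
    for user, _dataset, in_focus in rows:
        focus[user] = in_focus
    return sum(focus.values()) / len(focus)

by_dataset = unique_users_by_dataset(records)
# Most popular first; ties are broken by dataset id so the order is stable.
ranking = sorted(by_dataset, key=lambda d: (-len(by_dataset[d]), d))
for dataset in ranking:
    print(dataset, len(by_dataset[dataset]))
print("focus group share:", focus_share(records))
```

Counting distinct users rather than raw downloads keeps one heavy user from dominating the ranking, which is one reason registration matters for this kind of metric.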

Unique Users by service path
• Users in four service categories
  • MSS to CISL HPC environment
  • Web to world-wide community
  • Orders – one off consulting assisted data preparation
  • TIGGE
• 6 thousand users annually
  • FY09: MSS=266, Web=5649, Orders=196, TIGGE=44

Amount of data by service path
• Users in four service categories
  • MSS to CISL HPC environment
  • Web to world-wide community
  • Orders – one off consulting assisted data preparation
  • TIGGE
• 162 TB in FY09
  • FY09 (TB): MSS=31, Web=120, Orders=9, TIGGE=2
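The FY09 user counts and data volumes per service path can be combined into a rough per-path comparison. The sketch below is only arithmetic on the figures quoted on the slides. It assumes 1 TB = 1000 GB, and it cannot tell whether a person who used two paths was counted twice, since the slides do not say.

```python
# FY09 figures from the slides: unique users and terabytes delivered per service path.
users = {"MSS": 266, "Web": 5649, "Orders": 196, "TIGGE": 44}
terabytes = {"MSS": 31, "Web": 120, "Orders": 9, "TIGGE": 2}

def per_path_summary(users, terabytes):
    """Return {path: (share of users, share of volume, GB per user)}."""
    total_users = sum(users.values())
    total_tb = sum(terabytes.values())
    summary = {}
    for path in users:
        user_share = users[path] / total_users
        volume_share = terabytes[path] / total_tb
        gb_per_user = terabytes[path] * 1000 / users[path]  # assumes 1 TB = 1000 GB
        summary[path] = (user_share, volume_share, gb_per_user)
    return summary

summary = per_path_summary(users, terabytes)
for path, (user_share, volume_share, gb) in summary.items():
    print(f"{path:7s} users {user_share:6.1%}  volume {volume_share:6.1%}  ~{gb:6.1f} GB/user")
```

On these numbers the MSS path serves roughly 4% of the users but carries roughly 19% of the volume, so user counts and data volume rank the service paths differently.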

User ranked popular datasets
Top 10 datasets/groups FY09 (~6000 unique users annually)

| Unique users (FY09) | Datasets | Title |
|---|---|---|
| 2878 | ds082.0, ds083.2, ds083.0 | NCEP FNL Operational Model Global Tropospheric Analyses |
| 924 | ds090.0 | NCEP/NCAR Global Reanalysis Products |
| 510 | ds758.0, ds759.3, ds759.2 | NGDC Global 2' and 5' Elevations, USGS 30 ARC-second |
| 477 | ds461.0, ds351.0, ds337.0, ds464.0, ds353.4 | NCEP ADP/PREPBUFR Global Surface and Upper Air Observations |
| 358 | ds608.0 | NCEP North American Regional Reanalysis (NARR) |
| 264 | ds609.2 | GCIP NCEP ETA model output |
| 262 | ds540.1, ds540.0 | International Comprehensive Ocean-Atmosphere Data Set (ICOADS) |
| 190 | ds744.4 | QSCAT/NCEP Blended Ocean Winds |
| 173 | ds277.0 | NCEP V2.0 OI Global SST, V3.0 Extended Reconstructed Analyses |
| 153 | ds335.0, ds336.0 | Unidata (IDD) Observations and Model Data |
| 5921 | All Datasets | All DSS datasets |

NCAR-CSM Symposium on Climate and Energy, 7 May 2010

Remain flexible – expect constant change
• Be ready to take opportunities when they come along
  • Re-adjust priorities
  • Resist ‘tight’ mission control
• Take advice from advisory groups, but don’t depend on them exclusively
  • Use holistic approach
• @ NCAR, an unplanned-for example
  • Arctic System Reanalysis – NSF sponsored research critical to assess the changes happening in the Arctic
  • Need controlled access to first prototype data – We do this!

Sustaining for the long-term
• Richness and data value grow over time
  • Data assets tend to complement each other – add value to many different research questions
  • Scientific publications lead to broader and increased interest
  • Definitive data citation is a work in progress
• Staffing needs to be base/core funded
  • Grant directed funding can lead to a fractured, ad hoc, incomplete archive
  • Can be a major frustration for users
• @ NCAR – the Research Data Archive
  • Began 40+ years ago
  • Today sustained by 9 persons

Collaborations
• Participate/volunteer for committees and panels that tackle data issues (all sorts)
  • Learn from others, share knowledge
• Share efforts and data with other organizations
  • No one group can do it all (don’t have resources and all expertise required)
• @ NCAR (conf. like SIDC for EES)
  • Volunteerism: NAS, AMS, NOAA, WMO, NASA
  • National and International data agreements with:
    • European Centre for Medium-Range Weather Forecasts
    • Japan Meteorological Agency
    • U.S. National Weather Service, National Centers for Environmental Prediction

How to Engage the Users?
Data Discovery – how can people find you?
• All 600+ RDA Datasets have metadata in GCMD
  • Automatically, exported via OAI-PMH
• Similarly: RDA > CDP@NCAR > BADC in UK
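For readers unfamiliar with OAI-PMH, a harvester pulls metadata by issuing plain HTTP requests with a `verb` argument. The sketch below only builds such request URLs and does not contact any server. The base URL is a placeholder, and `oai_dc` is used because the protocol requires every repository to support it; which metadata format the GCMD export actually uses is not stated in the slides.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; the archive's real OAI-PMH base URL is not given in the slides.
BASE_URL = "https://archive.example.org/oai"

def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None, set_spec=None):
    """Build an OAI-PMH ListRecords request URL."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        params["from"] = from_date   # selective harvesting by datestamp
    if set_spec:
        params["set"] = set_spec     # restrict the harvest to one set
    return f"{base_url}?{urlencode(params)}"

def resume_url(base_url, token):
    """Follow-up request for the next batch: resumptionToken is an exclusive argument."""
    return f"{base_url}?{urlencode({'verb': 'ListRecords', 'resumptionToken': token})}"

print(list_records_url(BASE_URL, from_date="2010-05-01"))
```

Passing a `from` date lets a harvester pick up only records changed since its last visit, which is what makes a routine, automatic export practical.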

Design your portal to evolve – it will/should
• 2002
  • Search
  • Navigation
  • List of menus
  • Unique layout of links
  • Picture of people
• 2008
  • Search
    • Two ways
  • Navigation
  • Links
  • News
  • Text
  • People

Primary design feature for web portal
• Data Discovery – Find Data!
• 2010
  • All about search
  • Gone from top
    • people
    • text
    • news

Navigation once they arrive
• Working principles
  • Uniform across web portal
  • Keep organizational elements out of prime visual territory
• @ NCAR
  • Have user registration – only required to get data
  • All discovery metadata open – unlimited searching

The complete data knowledge package, and data cycle
• What is a complete data knowledge package?
  • Rich metadata plus the data files!
  • One example http://dss.ucar.edu/datasets/ds277.0/

The pieces that make rich metadata
• Dataset navigation (Access, Documentation, Software)
• Title
• Summary
• Period of data record
• Update cycle
• Scientific parameters (Variables)
• Earth reference levels
• Times – temporal increment
• Data types – points or grids
• Geo-spatial coverage
• Source organizations
• Related Internet sites
• Publications
• Acknowledgement statement
• Volume – size of the dataset
• Data formats
• Related datasets in the NCAR collection
• Consulting contact (email and phone)
• A 2nd pointer to Data Access
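The metadata pieces listed above map naturally onto a single record per dataset. The sketch below is one possible shape for such a record, with a helper that reports which pieces are still empty. All field names are illustrative choices, and the sample values are placeholders that do not describe a real dataset.

```python
from dataclasses import dataclass, field, fields

@dataclass
class DatasetMetadata:
    """Discovery metadata for one dataset; field names are illustrative."""
    dataset_id: str
    title: str
    summary: str = ""
    period_of_record: str = ""
    update_cycle: str = ""
    variables: list = field(default_factory=list)
    reference_levels: list = field(default_factory=list)
    temporal_increment: str = ""
    data_types: list = field(default_factory=list)        # e.g. points or grids
    spatial_coverage: str = ""
    source_organizations: list = field(default_factory=list)
    related_sites: list = field(default_factory=list)
    publications: list = field(default_factory=list)
    acknowledgement: str = ""
    volume: str = ""
    data_formats: list = field(default_factory=list)
    related_datasets: list = field(default_factory=list)
    consulting_contact: str = ""

def missing_fields(record):
    """Names of fields that are still empty, i.e. gaps in the metadata."""
    return [f.name for f in fields(record) if not getattr(record, f.name)]

# Placeholder values only; they do not describe a real dataset.
draft = DatasetMetadata(
    dataset_id="ds000.0",
    title="Example gridded analysis",
    variables=["sea surface temperature"],
    data_formats=["netCDF"],
)
print(missing_fields(draft))
```

A completeness check like `missing_fields` could be run before a dataset is published, so that gaps are caught by the steward rather than discovered by a user.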

The complete data knowledge package, and data cycle
• Data Cycle Facts
  • Datasets are re-published – new versions.
  • Datasets are corrected and extended in time or space.
  • Scientific analysis and publication will occur randomly along the data cycle.
• Data referencing is more challenging than traditional publication referencing because of the data cycle.
  • How can you accurately trace/recover what has been used for publication?

The complete data knowledge package, and data cycle @ NCAR
• Don’t have a systematic (organization-wide) way to handle the data cycle
  • We do not discard/delete old versions of data
  • Ad hoc approach
• Currently building version tracking software
  • Versioning will be included in DOI implementation
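To make the tracing question concrete, here is a minimal sketch of an append-only version history that can report which version of a dataset was current on a given date, for example the date a paper's analysis was run. It is not the version tracking software mentioned above, whose design the slides do not describe; the class, the labels, and the dates are all hypothetical.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from datetime import date

@dataclass
class VersionHistory:
    """Append-only list of published versions for one dataset."""
    dataset_id: str
    _dates: list = field(default_factory=list)
    _labels: list = field(default_factory=list)

    def publish(self, label, released):
        # Old versions are never removed; each new one must be later in time.
        if self._dates and released <= self._dates[-1]:
            raise ValueError("versions must be published in date order")
        self._dates.append(released)
        self._labels.append(label)

    def version_on(self, day):
        """Label of the version that was current on the given day, or None."""
        i = bisect_right(self._dates, day)
        return self._labels[i - 1] if i else None

history = VersionHistory("ds000.0")
history.publish("v1", date(2008, 1, 15))
history.publish("v2", date(2009, 6, 1))

print(history.version_on(date(2009, 3, 10)))   # a date between the two releases
print(history.version_on(date(2007, 12, 31)))  # a date before the first release
```

Because nothing is ever deleted, a lookup by date stays answerable for as long as the history is kept, which is the property a versioned citation would rely on.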

Consultation
• Critical two-way communication
1. Benefits for the user
  • Guidance to best available datasets
  • Consolidate research ideas into required data sources
  • Software assistance
  • Customized data preparation if necessary
2. Benefits to the archive stewardship
  • Detect ways to improve our search process
  • Learn about data requirement trends
  • Occasionally, acquire new data resources from scientific efforts
  • Learn about data problems we might have

Provide research tool support and documentation
• Provide users a starting point for data evaluation
  • Simple access programs – the languages used by the focus community
  • Pointers to applications (IDL, MATLAB, NCL, NCO, etc.)
  • Specific examples are VERY helpful!
• Must maintain software/applications and documentation for the long-term.
  • Guarantee users will understand the meaning and have access.

Provide research tool support and documentation @ NCAR
• Remain aware of proprietary software traps
  • E.g. for documents: will .xls be viable 50 years from now, when .xlsx is now standard? Is .pdf any better?
• Prefer data file formats that define everything to the byte/bit level
  • Computer code could always be written to access these.
• All kinds of reports, project descriptions, and documents that explain the intent of the data are vital for the long-term.
  • Use dedicated document directories for each dataset
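The point about byte-level format definitions can be illustrated with a short sketch: when the layout of every record is documented, a reader can be written from the documentation alone. The record layout below is invented for this example and is not an actual archive format.

```python
import struct

# Hypothetical fixed-length record, documented down to the byte:
#   bytes 0-3   station id    unsigned 32-bit integer, big-endian
#   bytes 4-5   year          unsigned 16-bit integer
#   byte  6     month         unsigned 8-bit integer
#   byte  7     day           unsigned 8-bit integer
#   bytes 8-11  temperature   32-bit IEEE float, kelvin
RECORD = struct.Struct(">IHBBf")

def write_record(station, year, month, day, temp_k):
    """Encode one observation as a 12-byte record."""
    return RECORD.pack(station, year, month, day, temp_k)

def read_records(blob):
    """Decode a byte string holding zero or more whole records."""
    if len(blob) % RECORD.size:
        raise ValueError("truncated file: length is not a multiple of the record size")
    return [RECORD.unpack_from(blob, offset)
            for offset in range(0, len(blob), RECORD.size)]

# Made-up observations, used only to exercise the round trip.
blob = write_record(12345, 2010, 5, 17, 288.5) + write_record(12345, 2010, 5, 18, 290.25)
for rec in read_records(blob):
    print(rec)
```

The comment block at the top is the important part: it is the kind of documentation that lets someone rebuild the reader decades later in whatever language is then in use.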

Follow-up aid
• Notification service for significant dataset changes
  • If an error is corrected – should notify all users of the data
• Subscription service
  • Inform users when new data is available
  • Prepare special products based on user determined template – e.g. past requests
• @ NCAR
  • We have automated notification service
    • Provided users register accurately
  • We do not have subscription service - yet
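A notification service of the kind described above needs little more than a record of who downloaded which dataset. The sketch below is a hypothetical illustration, not the NCAR implementation: it only assembles the notices and leaves actual delivery out, and the addresses and dataset ids are invented.

```python
from collections import defaultdict

class NotificationService:
    """Remembers who downloaded what, so corrections can reach those users."""

    def __init__(self):
        self._users_by_dataset = defaultdict(set)

    def record_download(self, email, dataset_id):
        # Normalize the address so one person is not stored twice.
        self._users_by_dataset[dataset_id].add(email.strip().lower())

    def notices_for(self, dataset_id, message):
        """One (recipient, text) pair per distinct user of the dataset."""
        recipients = sorted(self._users_by_dataset.get(dataset_id, ()))
        return [(email, f"[{dataset_id}] {message}") for email in recipients]

service = NotificationService()
service.record_download("a.researcher@example.edu", "ds000.0")
service.record_download("A.Researcher@example.edu ", "ds000.0")  # same person, sloppy entry
service.record_download("student@example.edu", "ds000.1")

for recipient, text in service.notices_for("ds000.0", "Corrected values for 2009; please re-download."):
    print(recipient, "->", text)
```

Normalizing addresses helps with casing and stray spaces, but it cannot repair a wrong address, which is why the service depends on users registering accurately.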
① How to make and keep the archive content relevant to the users?
② How to engage the users?
http://dss.ucar.edu/