Successful Data Sharing
Part I
(Publishing Great Metadata)
Tanya Haddad, Oregon Coastal Management Program
Anna Verrill, GISP, NOAA Office for Coastal Management
Prepared for: West Coast Governor’s Alliance Network Meeting
November 3, 2014
Overview
• Data Sharing
• Metadata
– Types
– Training & Resources
– Tools
• Metadata Workflows
– Metadata best practices
– Editing Tools and Tips
• Sharing and publishing
• Implementing at your organization
– Catalog overview
– Available software
– Levels of sharing
– Connecting to communities
Helpful Stuff
For useful resources available externally, pay attention to the links in blue boxes, e.g.:
http://www.coastalmarinedata.net/resources/
Data Sharing
So you want to share your geospatial data:
• Do you know all your customers?
• Can you predict what they will all want, now and in the future?
• Will you always be around to answer questions?
If you can’t answer yes to all of these questions, then your sharing system needs to be flexible and reusable, and you should take steps to make it easy for users to find what is relevant to them (when you are not around)
Metadata
• How is metadata relevant to Data Sharing?
• Most flexible data sharing systems are built upon some form of metadata
• Metadata contains information that helps a user understand the contents of a data set, compare similar data sets, decide which data fit their needs, etc.
Metadata helps users “discover” your data
Types of Metadata
• Human readable
– MS Word, Adobe PDF, HTML pages
– Project reports, Grant reports, Journal articles
• Machine readable
– Software generated
– XML or JSON
Guess which of these is easiest to search and compare across many projects?
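To make the point concrete, here is a minimal sketch of searching machine readable metadata. The records, field names, and the `find_by_keyword` helper are all hypothetical and only illustrate the idea; doing the same across a folder of PDFs or Word reports would need far more effort.

```python
import json

# Hypothetical metadata records; titles and field names are illustrative only.
RECORDS_JSON = """
[
  {"title": "Oregon Estuary Habitat", "keywords": ["estuary", "habitat"], "year": 2012},
  {"title": "Coastal Erosion Hazard Zones", "keywords": ["erosion", "hazards"], "year": 2009},
  {"title": "Eelgrass Survey", "keywords": ["habitat", "eelgrass"], "year": 2014}
]
"""


def find_by_keyword(records, keyword):
    """Return the titles of records tagged with the keyword (case-insensitive)."""
    wanted = keyword.lower()
    return [r["title"] for r in records
            if wanted in (k.lower() for k in r["keywords"])]


records = json.loads(RECORDS_JSON)
habitat_titles = find_by_keyword(records, "Habitat")
print(habitat_titles)
```

Because every record uses the same structure, one short function can compare any number of projects.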
cc‐by‐sa Aikzhobi
Machine Readable?
[Figure: T-shirt graphic labeled "Encodings" and "Content". T-shirt template: http://alymunibari.deviantart.com]
Metadata Complexity
• Metadata can be brief, or very verbose
• Different metadata content standards:
– Dublin Core Schema
– FGDC Content Standard for Digital Geospatial Metadata
– International Organization for Standardization 19139, 19115, 19115-2: Geographic information - Metadata
• Most geospatial metadata you encounter will be encoded as XML
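As a rough illustration of what XML-encoded geospatial metadata looks like to software, the sketch below reads a few values from a small CSDGM-style fragment using Python's standard library. The fragment is made up for this example and is far from a complete record; the element names (`idinfo`, `citeinfo`, `spdom`, `westbc`, and so on) follow the CSDGM short names.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily trimmed CSDGM-style record (not a complete or valid record).
CSDGM_XML = """<metadata>
  <idinfo>
    <citation><citeinfo>
      <title>Example Shoreline Dataset</title>
      <pubdate>20140101</pubdate>
    </citeinfo></citation>
    <descript><abstract>Illustrative abstract text.</abstract></descript>
    <spdom><bounding>
      <westbc>-124.6</westbc><eastbc>-123.4</eastbc>
      <northbc>46.3</northbc><southbc>42.0</southbc>
    </bounding></spdom>
  </idinfo>
</metadata>"""

root = ET.fromstring(CSDGM_XML)

# Paths are relative to the <metadata> root element.
title = root.findtext("idinfo/citation/citeinfo/title")
abstract = root.findtext("idinfo/descript/abstract")
bbox = {
    tag: float(root.findtext(f"idinfo/spdom/bounding/{tag}"))
    for tag in ("westbc", "eastbc", "northbc", "southbc")
}

print(title)
print(bbox)
```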
Metadata Basics
Christine White provides a nice overview in her Metadata Basics unit:
http://www.coastalmarinedata.net/meetings/ocmdnV/OCMDN_V_Part_I_Intro_to_Data_Catalogs_Slides.pdf
http://vimeo.com/72984018
Dublin Core (1995)
A small set of vocabulary terms that can be used to describe web resources (video, images, web pages, etc.), as well as physical resources such as books or CDs, and objects like artworks
• Title
• Creator
• Subject
• Description
• Publisher
• Contributor
• Date
• Type
• Format
• Identifier
• Source
• Language
• Relation
• Coverage
• Rights
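The fifteen Dublin Core elements are simple enough to write out by hand. The sketch below builds a small record with Python's standard library, using the Dublin Core element namespace `http://purl.org/dc/elements/1.1/`. The `record` wrapper element, the `build_dc_record` helper, and the sample values are assumptions made for illustration, not part of the standard.

```python
import xml.etree.ElementTree as ET

DC_NS = "http://purl.org/dc/elements/1.1/"
ET.register_namespace("dc", DC_NS)

# The fifteen Dublin Core elements, as lowercase element names.
DC_ELEMENTS = [
    "title", "creator", "subject", "description", "publisher",
    "contributor", "date", "type", "format", "identifier",
    "source", "language", "relation", "coverage", "rights",
]


def build_dc_record(values):
    """Build an XML element holding the supplied Dublin Core values."""
    unknown = set(values) - set(DC_ELEMENTS)
    if unknown:
        raise ValueError(f"not Dublin Core elements: {sorted(unknown)}")
    root = ET.Element("record")  # hypothetical wrapper element
    for name in DC_ELEMENTS:
        if name in values:
            child = ET.SubElement(root, f"{{{DC_NS}}}{name}")
            child.text = values[name]
    return root


record = build_dc_record({
    "title": "Example Shoreline Dataset",
    "creator": "Example Coastal Program",
    "date": "2014-11-03",
    "format": "Shapefile",
})
xml_text = ET.tostring(record, encoding="unicode")
print(xml_text)
```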
FGDC CSDGM (1998)
This standard was developed from the perspective of defining the information required by a prospective user to determine the availability of a set of geospatial data; to determine the fitness of the set of geospatial data for an intended use; to determine the means of accessing the set of geospatial data; and to successfully transfer the set of geospatial data
ISO – 19139, 19115, 19115‐2 (2005)
• Modular, flexible system
• Very customizable
• Depicts relationships between datasets and collection level (parent/child relationships)
• Standardizes descriptors through the use of code lists
• Accommodates new technologies (such as documenting web services)
• Accommodates International scope
• Undergoes revision/review in 5 year cycles
[Diagram: ISO geographic metadata standards and their class prefixes. Credit: J. Mize]
• ISO 19139: XML schema implementation
• ISO 19115 (MD_): Geographic information – Metadata → Core Information
• ISO 19115-2 (MI_): Part 2: gridded data and imagery → Extensions for Instrumentation and Gridded Data
• ISO 19110 (FC_) → Entities and Attributes
• ISO 19119 (SV_): Services
• ISO 19157 → Data Quality
• ISO 19111
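Class prefixes such as MD_ show up directly as element names in ISO 19139 XML. The sketch below reads the title from a trimmed, made-up fragment using the `gmd` and `gco` namespaces of the 2005 schemas. The fragment omits many required elements and would not validate; it only illustrates the nesting.

```python
import xml.etree.ElementTree as ET

# Namespaces used by the ISO 19139 (2005) XML encoding.
NS = {
    "gmd": "http://www.isotc211.org/2005/gmd",
    "gco": "http://www.isotc211.org/2005/gco",
}

# Hypothetical fragment: only the path down to the title is included.
ISO_XML = """<gmd:MD_Metadata
    xmlns:gmd="http://www.isotc211.org/2005/gmd"
    xmlns:gco="http://www.isotc211.org/2005/gco">
  <gmd:identificationInfo>
    <gmd:MD_DataIdentification>
      <gmd:citation>
        <gmd:CI_Citation>
          <gmd:title>
            <gco:CharacterString>Example Shoreline Dataset</gco:CharacterString>
          </gmd:title>
        </gmd:CI_Citation>
      </gmd:citation>
    </gmd:MD_DataIdentification>
  </gmd:identificationInfo>
</gmd:MD_Metadata>"""

root = ET.fromstring(ISO_XML)
title = root.findtext(
    "gmd:identificationInfo/gmd:MD_DataIdentification/gmd:citation/"
    "gmd:CI_Citation/gmd:title/gco:CharacterString",
    namespaces=NS,
)
print(root.tag)
print(title)
```

Compared with flatter formats, the same title sits several levels deeper here, which is one reason editing tools and templates matter for ISO records.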
Metadata Resources & Training
• FGDC has a great “Metadata Quick Guide” for CSDGM:
– http://www.fgdc.gov/metadata/documents/MetadataQuickGuide.pdf
• Try googling “Metadata Bob”
• NOAA’s Jaci Mize provides regular online training for both CSDGM and ISO metadata. Recordings and materials are available online here:
– ftp://ftp.ncddc.noaa.gov/pub/Metadata/Online_ISO_Training/Intro_to_CSDGM/
– ftp://ftp.ncddc.noaa.gov/pub/Metadata/Online_ISO_Training/Intro_to_ISO/
Metadata Resources & Training
• NOAA has created workbooks for learning the ISO formats. Each:
– parallels the standard,
– provides FAQs,
– implementation guide
http://service.ncddc.noaa.gov/rdn/www/metadata-standards/documents/MD-Metadata.pdf
http://service.ncddc.noaa.gov/rdn/www/metadata-standards/documents/MI-Metadata.pdf
J. Mize
Metadata Tools
• ArcMap users all have access to ArcCatalog
• There are also many other stand alone tools for generating metadata:
– EPA Metadata Editor
– CatMDEdit
– ISOMorph
– MERMAid
– GeoNetwork
– GeoPortal
• For the XML geeks:
– XMLSpy
– oXygen
Metadata Tool Pros and Cons
Jaci Mize reviews these tools in her metadata training:
ftp://ftp.ncddc.noaa.gov/pub/Metadata/Online_ISO_Training/Intro_to_ISO/presentations/5_ToolsforISOMetadata.pptx
Making a plan for Metadata
• Plan to document the data that you use most and that is most important to your organization
• Use common sense for guidelines – follow standards as appropriate, but you only need to be as complete as is necessary for the intended purpose:
– Discovery, or Documentation, or both, or other?
• Use templates when possible!
• Review your work after creating a few records, and adjust your processes accordingly
• Submit some records to a search system and see how your records look
Complete metadata = Good Discovery Experience
Best Practices – Use Templates!
• When you need to create metadata for many items, it helps to streamline the task by creating a metadata template. Like an MS Word document template, a metadata template contains information that will be used again and again
• Consider creating a template for your organization to use, and then make your organization template more specific for individual projects
• ArcGIS can automatically update properties of an item and any connected metadata template, resulting in much less effort to complete an item's metadata
• With metadata templates, you can focus on documenting important information like the sources and quality of your data, and any special processes you performed
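The layering idea (organization template, then project template, then item-specific details) can be sketched in a few lines. This is a generic illustration and not how ArcGIS implements templates; the field names, sample values, and the `apply_templates` helper are hypothetical.

```python
# Hypothetical organization-wide defaults.
ORG_TEMPLATE = {
    "publisher": "Example Coastal Program",
    "contact_email": "gis@example.org",
    "constraints": "None. Please credit the publisher.",
}

# Hypothetical project template that refines the organization defaults.
PROJECT_TEMPLATE = {
    "theme_keywords": ["oceans", "shoreline"],
    "constraints": "Not for navigation. Please credit the publisher.",
}


def apply_templates(item_fields, *templates):
    """Merge templates in order, then let item-specific fields override them."""
    merged = {}
    for template in templates:
        merged.update(template)
    merged.update(item_fields)
    return merged


record = apply_templates(
    {"title": "Example Shoreline Dataset", "abstract": "Digitized from 2012 imagery."},
    ORG_TEMPLATE,
    PROJECT_TEMPLATE,
)
print(record)
```

Only the title and abstract had to be typed for this item; everything else came from the templates, with the project template overriding the organization's default constraints.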
Best Practices – For Good Discovery
Certain metadata items are critical for discovery to work well (or at all):
• Identification Information:
– Title
– Abstract (Description)
– Publication date
– Point of Contact Info
– Resource URL (if data is downloadable or available as a service)
– Website URL
– Constraints
• Location Information:
– West Bounding Longitude
– East Bounding Longitude
– North Bounding Latitude
– South Bounding Latitude
– Browse Graphic URL
• Descriptor Information:
– Theme Keywords
– Resource Description
If you do nothing else, try to do these items well!
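The discovery-critical items named on this slide lend themselves to an automated check before records are published. The sketch below is one possible approach; the dictionary keys and the `check_discovery` helper are assumptions for illustration, and the Resource URL is left out because it only applies to downloadable data or services.

```python
# Hypothetical key names for the discovery-critical items.
DISCOVERY_FIELDS = [
    "title", "abstract", "publication_date", "point_of_contact",
    "website_url", "constraints", "browse_graphic_url",
    "theme_keywords", "resource_description",
]
BBOX_FIELDS = ["west", "east", "north", "south"]


def check_discovery(record):
    """Return a list of problems; an empty list means the record passes."""
    problems = []
    for name in DISCOVERY_FIELDS + BBOX_FIELDS:
        # A coordinate of 0.0 is a real value, so only flag truly empty entries.
        if record.get(name) in (None, "", []):
            problems.append(f"missing: {name}")

    bbox = [record.get(name) for name in BBOX_FIELDS]
    if all(isinstance(value, (int, float)) for value in bbox):
        west, east, north, south = bbox
        if not (-180 <= west <= 180 and -180 <= east <= 180):
            problems.append("longitude out of range")
        if not (-90 <= south <= north <= 90):
            problems.append("latitude out of range or south > north")
    return problems
```

West is not required to be less than east here, since a bounding box that crosses the antimeridian can legitimately have west greater than east.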
ArcCatalog – Identification Info
• Title
• Abstract
• Publication date
• Resource URL
• Website URL
• Point of Contact Info
Rempel, McCune, OSDL
ArcCatalog – Location Info
• West Bounding Longitude
• East Bounding Longitude
• North Bounding Latitude
• South Bounding Latitude
• Browse Graphic URL (you can make a browse graphic however you like, and store it in any web accessible location referenced by URL)
Rempel, McCune, OSDL
ArcCatalog – Descriptor Info
• Theme Keywords
– Theme Reference: ISO 19115
– Theme Topics
• Distribution Information
– Resource Description: Select “Downloadable Data” if downloadable and add Resource URL
Rempel, McCune, OSDL
Sharing & Publishing
OK, so you know how to make data and metadata. Now what?
• Just like any other product, if you want people to use it, you have to share it, and they have to know about it
• This means licensing!
• It also means advertising!
Sharing Best Practices
• Decide you want to share your data (license)
• Document it with:
– Great Titles
– Informative Abstracts
– Credit to your Organization
– Resource URLs
– Any Caveats
• Partner with a friendly existing catalog in your network, or
• Host your own
Levels of Sharing (Data)
• Available on the web, in whatever format (e.g. image scan or PDF), but with an open license
• Available as machine readable structured data (e.g. Excel instead of image scan of a table)
• Available as above, plus in a non‐proprietary format (e.g. CSV instead of Excel)
• All the above, plus using open standards from W3C to identify things with URIs so that people can link to your stuff
• All the above, plus linked. Link your data to other people’s data to provide context
Tim Berners-Lee, W3C
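The levels are cumulative: each one assumes every level before it has been met. A minimal sketch of that logic follows, with a hypothetical `sharing_level` function and made-up flag names.

```python
def sharing_level(open_license, machine_readable, non_proprietary,
                  uses_uris, linked):
    """Return the highest level (0-5) reached without skipping a lower one."""
    criteria = [open_license, machine_readable, non_proprietary,
                uses_uris, linked]
    level = 0
    for met in criteria:
        if not met:
            break  # a missed criterion caps the level, whatever comes later
        level += 1
    return level


# A scanned table published as a PDF under an open license.
scan_level = sharing_level(True, False, False, False, False)

# The same table as an Excel file, then as CSV.
excel_level = sharing_level(True, True, False, False, False)
csv_level = sharing_level(True, True, True, False, False)

print(scan_level, excel_level, csv_level)
```

Note that data without an open license scores 0 in this sketch, no matter how well structured it is.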
Building your own Catalog
• What are my sharing options?
• How do different catalog options compare?
• How do I pick a path?
• What are the other questions I should be asking?
• Are you cataloging one source of data or multiple?
• Will other catalogs want to harvest from you?
• Do you need to harvest? Do you need to add additional attributes to the resources you harvest?
• Do you need to customize your catalog, or are out-of-the-box features good enough?
T. Welch
Catalog Options
• 'Simple' Catalog
• ArcGIS Online
• Geoportal Server (ESRI)
• GeoNetwork (OSGeo)
• OpenGeoPortal
• CKAN
Catalog Options Pros and Cons
Tim Welch reviews these catalog options in his OCMDN overview:
http://www.coastalmarinedata.net/meetings/ocmdnIII/Welch_Catalog_Tech_10312012.pdf
Connecting to Communities
• The first step is to know your audience(s)
• Try to anticipate needs, but be open to access options and applications you may not be aware of
• Build good documentation habits into all your processes – your future self will be grateful!
http://xkcd.com/1421