UMassD Data Workshop Series Class 2 – Types and Formats of Data, Contextual Details 2015 - October...
UMassD Data Workshop Series
Class 2 – Types and Formats of Data, Contextual Details
2015 - October - 13
Dawn Gross, Zac Painter, Liz Winiarz
Module 2: Types, Formats, and Stages of Data
Editor: Lamar Soutter Library, University of Massachusetts Medical School
Title of the work: New England Collaborative Data Management Curriculum
the URL where the original work can be found: http://library.umassmed.edu/necdmc
Research Data Associated with Most Disciplines
• Images
• Video
• Mapping/GIS data
• Numerical measurements
Module 2: Data Types, Stages & Formats
Research Data Associated with Social Sciences
• survey responses
• focus group and individual interviews
• economic indicators
• demographics
• opinion polling
Research Data Associated with Hard Sciences
• measurements generated by sensors/laboratory instruments
• computer modeling
• simulations
• observations and/or field studies
• specimens
Stages of Data Related to Research Data Life Cycle
• Raw Data
• Processed Data
• Analyzed Data
• Finalized/Published Data
• Existing Data across Different Sources
Sample hypothesis:
Water temperatures in Lake Superior are now significantly warmer than in previous years. The evidence lends support to global warming.
Using our sample hypothesis:
• Raw Data = daily lake temperatures
• Processed Data = ‘cleaned’ temperature data in a spreadsheet
• Analyzed Data = average temperatures, graphs of changes
• Finalized Data = does the data support the hypothesis?
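The stages above can be sketched in a few lines of Python. This is only an illustration: the readings are made-up values, and the names `raw_readings`, `clean`, `processed`, and `average_temp` are assumptions introduced here, not part of the curriculum.

```python
from statistics import mean

# Raw data: daily lake temperature readings in degrees Celsius, as they might
# come off a sensor. All values are made up for illustration.
raw_readings = ["4.1", "4.3", "", "error", "4.6", "4.2"]


def clean(readings):
    """Processed data: drop blank or non-numeric entries and convert to float."""
    cleaned = []
    for value in readings:
        try:
            cleaned.append(float(value))
        except ValueError:
            continue  # skip blanks and sensor error codes
    return cleaned


processed = clean(raw_readings)

# Analyzed data: a summary statistic computed from the processed values.
average_temp = round(mean(processed), 2)

print(processed)
print(average_temp)
```

Finalized data would be the version of these results that accompanies the published answer to the hypothesis.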
Preferable Format Types for Long-Term Access to Data
Data formats that offer the best chance for long-term access are both:
• Non-proprietary (also known as open), and
• Unencrypted and uncompressed
Preferred Formats
Examples of preferred formats for various data types include:
Moving Images: MOV, MPEG
Audio: WAVE, MP3
Numbers/statistics: ASCII, SAS
Images: TIFF, JPEG 2000
Text: PDF/A, ASCII
Converting to Preferable Formats
Information can be lost when converting file formats.
To mitigate the risk of lost information:
• Note conversion steps taken
• If possible, keep the original file as well as the converted one
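One possible way to follow both tips is sketched below, assuming the data is a tab-delimited text file being converted to CSV. The function name `convert_tsv_to_csv`, the log file name `conversion_log.txt`, and the sample file contents are hypothetical choices made for this example.

```python
import csv
import shutil
import tempfile
from datetime import date
from pathlib import Path


def convert_tsv_to_csv(source: Path, out_dir: Path) -> Path:
    """Convert a tab-delimited file to CSV, keeping the original and a log."""
    out_dir.mkdir(parents=True, exist_ok=True)

    # Keep an untouched copy of the original next to the converted file.
    original_copy = out_dir / source.name
    if original_copy.resolve() != source.resolve():
        shutil.copy2(source, original_copy)

    converted = out_dir / (source.stem + ".csv")
    with source.open(newline="") as src, converted.open("w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src, delimiter="\t"):
            writer.writerow(row)

    # Note the conversion steps taken in a plain-text log.
    log = out_dir / "conversion_log.txt"
    with log.open("a") as fh:
        fh.write(
            f"{date.today().isoformat()}: {source.name} -> {converted.name} "
            "(tab-delimited to comma-delimited, Python csv module)\n"
        )
    return converted


# Demonstration in a temporary folder with made-up sample content.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "lake_temps.tsv"
    src.write_text("date\ttemp_c\n2015-10-13\t9.8\n")
    result = convert_tsv_to_csv(src, Path(tmp) / "converted")
    converted_text = result.read_text()
    kept_files = sorted(p.name for p in result.parent.iterdir())

print(kept_files)
```

The output folder ends up holding the original file, the converted file, and the log that records what was done.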
Describing Data, Documenting Reliability & Collection Techniques
Data documentation explains the:
• Who
• What
• Where
• When
• And why of data.
Who:
• Who collected this data?
• Who or what were the subjects under study?
What:
• What data was collected, and for what purpose?
• What is the content and structure of the data?
Where:
• Where was this data collected?
• What were the experimental conditions that produced it?
When:
• When was the data collected?
• Is the data part of a series, or ongoing experiment?
Why:
• Why was this experiment performed?
• How does it relate to your research question?
Cross Discipline Concerns
No matter what, you need to have:
• File naming conventions
• Version control
Why Do I Need to Worry about That?
Consider this:
If you unexpectedly have to leave your research project for a few months, could a colleague easily make sense of your data files?
Why Use File Naming Conventions?
Naming conventions make life easier!
• Help you find your data
• Help others find your data
• Help track which version of a file is most current
What File Naming Convention Should I Use?
Has your research group established a convention?
If not, general guidelines include:
• Meaningful file names that aren’t too long
• Avoid certain characters
• Dates can help with sorting and version control
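As a rough sketch of these guidelines, the helper below builds a file name from a project, a short description, a date, and a version number. The pattern `project_description_YYYYMMDD_vNN.ext` and the function name `make_file_name` are assumptions chosen for illustration; a research group's own convention would take priority.

```python
import re
from datetime import date


def make_file_name(project: str, description: str, when: date, version: int, ext: str) -> str:
    """Build a name like project_description_YYYYMMDD_v01.ext."""

    def tidy(text: str) -> str:
        # Replace runs of anything other than letters and digits with a hyphen,
        # which avoids spaces and special characters in the file name.
        return re.sub(r"[^A-Za-z0-9]+", "-", text).strip("-").lower()

    return f"{tidy(project)}_{tidy(description)}_{when:%Y%m%d}_v{version:02d}.{ext.lstrip('.')}"


name = make_file_name("Lake Superior", "daily temps (raw)", date(2015, 10, 13), 1, "csv")
print(name)
```

Because the date is written year first and the version is zero-padded, an alphabetical listing of such names is also a chronological one.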
Module 3: Contextual details needed to make data meaningful to others
Editor: Lamar Soutter Library, University of Massachusetts Medical School
Title of the work: New England Collaborative Data Management Curriculum
the URL where the original work can be found: http://library.umassmed.edu/necdmc
CC BY-NC
Module 3: Metadata
What is Metadata?
“Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information” (NISO, Understanding Metadata 2004;1).
You Must Have Metadata to:
• find data from other researchers to support your research;
• use the data that you do find;
• help other professionals to find and use data from your research; and
• use your own data in the future when you may have forgotten details of the research.
Basic Types of Metadata
• Descriptive metadata
• Structural metadata
• Administrative metadata
How Metadata Facilitates Discoverability and Reuse
• Discoverability
• Accessibility
Some Sample Metadata Standards
• Darwin Core
• Ecological Metadata Language (EML)
• Climate and Forecast (CF)
Collecting and Sharing Metadata
• Controlled vocabularies
• Technical standards
Controlled Vocabularies
• Help take the guesswork out of choosing a preferred spelling, choosing between a scientific and a popular term, and determining which synonym to use.
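A controlled vocabulary can be as simple as a lookup table that maps every accepted variant to one preferred term. The sketch below assumes a tiny, hypothetical vocabulary; the names `VOCABULARY`, `LOOKUP`, and `normalize` are invented for this example, and real projects would normally draw terms from a published vocabulary for their discipline.

```python
# Hypothetical vocabulary: each preferred term lists the variants it replaces.
VOCABULARY = {
    "Salvelinus namaycush": ["lake trout"],  # scientific name preferred over popular name
    "color": ["colour"],                     # one preferred spelling
}

# Map every variant, and each preferred term itself, to the preferred term.
LOOKUP = {}
for preferred, variants in VOCABULARY.items():
    LOOKUP[preferred.lower()] = preferred
    for variant in variants:
        LOOKUP[variant.lower()] = preferred


def normalize(term: str) -> str:
    """Return the preferred term, or raise KeyError for unknown terms."""
    key = term.strip().lower()
    if key not in LOOKUP:
        raise KeyError(f"{term!r} is not in the controlled vocabulary")
    return LOOKUP[key]


print(normalize("Lake Trout"))
print(normalize("colour"))
```

Rejecting unknown terms, rather than passing them through, is one design choice among several; it keeps data entry consistent at the cost of maintaining the list.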
Technical Standards
ISO 8601 technical standard:
• Year: YYYY (e.g. 1997)
• Year and month: YYYY-MM (e.g. 1997-07)
• Complete date: YYYY-MM-DD (e.g. 1997-07-16)
• Media types can be problematic as well
• The MIME media types help you choose among the following: application, audio, example, image, message, model, multipart, text, video
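The three ISO 8601 date forms above can be produced with Python's standard `datetime` module. This is a minimal sketch using the example date from the list; the variable names are chosen here for illustration.

```python
from datetime import date

collected = date(1997, 7, 16)

year_only = f"{collected:%Y}"       # YYYY
year_month = f"{collected:%Y-%m}"   # YYYY-MM
complete = collected.isoformat()    # YYYY-MM-DD

# A complete ISO 8601 date string can be parsed back into a date object.
parsed = date.fromisoformat("1997-07-16")

print(year_only, year_month, complete)
```

A practical benefit of this format is that sorting the strings alphabetically also sorts them chronologically.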
Media Types
The MIME media types:
• Application
• Audio
• Image
• Model
• Multipart
• Message
• Text
• Video
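Python's standard `mimetypes` module guesses a media type from a file name, which can help when filling in a format field. In the sketch below the helper name `top_level_type` and the file names are hypothetical; the guess depends on the file extension only and can vary with system configuration.

```python
import mimetypes


def top_level_type(file_name):
    """Return the top-level MIME type (e.g. 'image'), or None if unknown."""
    media_type, _encoding = mimetypes.guess_type(file_name)
    if media_type is None:
        return None
    return media_type.split("/")[0]


for example in ["scan.tiff", "notes.txt", "interview.mp3"]:
    print(example, mimetypes.guess_type(example)[0], top_level_type(example))
```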
Approaches to Creating Metadata
First, identify your elements:
• Title
• Creator
• Identifier
• Subject
• Dates
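Once the elements are identified, a record can be kept in a simple structured form. The sketch below stores the five elements in a Python dictionary, checks that none is empty, and serializes the record to JSON. Every value is a placeholder, and the names `REQUIRED_ELEMENTS` and `missing_elements` are assumptions made for this example rather than part of any metadata standard.

```python
import json

REQUIRED_ELEMENTS = ("title", "creator", "identifier", "subject", "dates")


def missing_elements(record):
    """List the required elements that are absent or empty in a record."""
    return [name for name in REQUIRED_ELEMENTS if not record.get(name)]


# Placeholder values for illustration only.
record = {
    "title": "Daily water temperatures, Lake Superior",
    "creator": "Example Researcher",
    "identifier": "example-id-0001",
    "subject": ["water temperature", "lakes"],
    "dates": {"collected": "2015-10-13"},
}

record_json = json.dumps(record, indent=2)
print(missing_elements(record))
print(record_json)
```

In practice the element names and allowed values would come from an established metadata standard rather than an ad hoc dictionary.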
Best Practices
• Consult a metadata librarian!
• Consistent data entry is important
• Avoid extraneous punctuation
• Avoid most abbreviations
• Use templates and macros when possible
• Extract pre-existing metadata
• Keep a data dictionary
• Always use an established metadata standard
Sources for this Unit
What is metadata:
National Information Standards Organization (NISO). 2004. Understanding Metadata. http://www.niso.org/publications/press/UnderstandingMetadata.pdf
Neiswender, C. 2010. "Introduction to Metadata." In The MMI Guides: Navigating the World of Marine Metadata. http://marinemetadata.org/guides/mdataintro. Accessed April 1, 2013.
Reuse and discoverability:
National Information Standards Organization (NISO). 2004. Understanding Metadata. http://www.niso.org/publications/press/UnderstandingMetadata.pdf
Miller, Steven J. 2011. Metadata Resources: Selected Reference Documents, Web Sites, and Readings: https://pantherfile.uwm.edu/mll/www/resource.html
Wikipedia page on “Metadata”: http://en.wikipedia.org/wiki/Metadata
http://www.library.illinois.edu/dcc/bestpractices/chapter_11_structuralmetadata.html
Sources for this Unit (cont’d)
Metadata standards:
Digital Curation Centre’s Disciplinary Metadata resource. http://www.dcc.ac.uk/resources/metadata-standards.
Hogrefe, K., Stocks, K. 2011. "The Importance of Metadata Standards." In The MMI Guides: Navigating the World of Marine Metadata. http://marinemetadata.org/guides/mdatastandards/stdimportance. Accessed March 22, 2013.
Other suggested readings
Introduction to Metadata: Setting the Stage (Getty Research Institute) http://www.getty.edu/research/publications/electronic_publications/intrometadata/setting.html
Documentation and Metadata (MIT Libraries): http://libraries.mit.edu/guides/subjects/data-management/metadata.html
Version control and authenticity
http://data-archive.ac.uk/create-manage/format/versions
Other Suggested Readings (cont’d)
What is Metadata?
http://vimeo.com/3161893
Controlled vocabularies and technical standards
http://en.wikipedia.org/wiki/Controlled_vocabulary
http://www.ieee.org/education_careers/education/standards/standards_glossary.html
Metadata elements
http://libraries.mit.edu/guides/subjects/data-management/metadata.html
Creating metadata
http://uwdcc.library.wisc.edu/documents/DC_companionv1.3.pdf