UMassD Data Workshop Series Class 2 – Types and Formats of Data, Contextual Details 2015 - October - 13 Dawn Gross, Zac Painter, Liz Winiarz

Module 2: Types, Formats, and Stages of Data

Editor: Lamar Soutter Library, University of Massachusetts Medical School

Title of the work: New England Collaborative Data Management Curriculum

the URL where the original work can be found: http://library.umassmed.edu/necdmc

Research Data Associated with Most Disciplines

• Images

• Video

• Mapping/GIS data

• Numerical measurements

Module 2: Data Types, Stages & Formats

Research Data Associated with Social Sciences

• survey responses

• focus group and individual interviews

• economic indicators

• demographics

• opinion polling

Research Data Associated with Hard Sciences

• measurements generated by sensors/laboratory instruments

• computer modeling

• simulations

• observations and/or field studies

• specimens

Stages of Data Related to Research Data Life Cycle

• Raw Data

• Processed Data

• Analyzed Data

• Finalized/Published Data

• Existing Data across Different Sources

Sample hypothesis:

Water temperatures in Lake Superior are now significantly warmer than in previous years. The evidence lends support to global warming.

Using our sample hypothesis:

• Raw Data = daily lake temperatures

• Processed Data = ‘cleaned’ temp. data in spreadsheet

• Analyzed Data = average temps., graphing changes

• Finalized Data = does data support the hypothesis?
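
The raw, processed, and analyzed stages listed above can be sketched in a few lines of Python. This is only an illustration: the temperature values, the `99.9` sensor-error code, and the function names are assumptions made up for the example, not data or methods from the workshop.

```python
from statistics import mean

# Raw data: daily lake temperatures in degrees Celsius, exactly as logged.
# All values are invented for illustration. None marks a missed reading
# and 99.9 is a hypothetical sensor-error code.
raw_readings = {
    2014: [4.1, 4.3, None, 4.0, 99.9],
    2015: [4.9, 5.2, 5.0, None, 5.1],
}

SENSOR_ERROR = 99.9


def clean(readings):
    """Processed data: drop missing values and sensor-error codes."""
    return [r for r in readings if r is not None and r != SENSOR_ERROR]


def yearly_averages(raw):
    """Analyzed data: one average temperature per year."""
    return {year: round(mean(clean(values)), 2) for year, values in raw.items()}


averages = yearly_averages(raw_readings)
print(averages)
```

Whether the averages actually support the hypothesis is the finalized-data question, and it needs far more than two years of a handful of readings.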

Preferable Format Types for Long-Term Access to Data

Data formats that offer the best chance for long-term access are both:

• Non-proprietary (also known as open), and

• Unencrypted and uncompressed

Preferred Formats

Examples of preferred formats for various data types include:

Moving Images: MOV, MPEG

Audio: WAVE, MP3

Numbers/statistics: ASCII, SAS

Images: TIFF, JPEG 2000

Text: PDF/A, ASCII

Converting to Preferable Formats

Information can be lost when converting file formats.

To mitigate the risk of lost information:

• Note conversion steps taken

• If possible, keep the original file as well as the converted one
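
Both mitigation steps can be built into the conversion itself. The sketch below assumes a tab-delimited source file being converted to CSV; the function name `convert_tsv_to_csv` and the log layout are hypothetical choices for this example.

```python
import csv
from datetime import date
from pathlib import Path


def convert_tsv_to_csv(source, log_path):
    """Convert a tab-delimited file to CSV and log the step.

    The original file is never modified or deleted, so it stays
    available next to the converted copy.
    """
    source = Path(source)
    converted = source.with_suffix(".csv")

    rows = 0
    with source.open(newline="", encoding="utf-8") as src, \
            converted.open("w", newline="", encoding="utf-8") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src, delimiter="\t"):
            writer.writerow(row)
            rows += 1

    # Note the conversion step so it can be reviewed or repeated later.
    with Path(log_path).open("a", encoding="utf-8") as log:
        log.write(
            f"{date.today().isoformat()}\t{source.name} -> {converted.name}"
            f"\ttab-delimited to CSV\t{rows} rows\n"
        )
    return converted
```

Comparing the logged row count against the original file is one quick check that nothing was lost in conversion.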

Describing Data, Documenting Reliability & Collection Techniques

Data documentation explains the:

• Who

• What

• Where

• When

• And why of data.

Who:

• Who collected this data?

• Who or what were the subjects under study?

What:

• What data was collected, and for what purpose?

• What is the content and structure of the data?

Where:

• Where was this data collected?

• What were the experimental conditions that produced it?

When:

• When was the data collected?

• Is the data part of a series, or ongoing experiment?

Why:

• Why was this experiment performed?

• How does it relate to your research question?
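
One lightweight way to capture the who, what, where, when, and why is a plain-text README stored next to the data. The sketch below is an assumption about how that might look; the `render_readme` function and the sample answers are hypothetical.

```python
QUESTIONS = ("who", "what", "where", "when", "why")


def render_readme(answers):
    """Return README text, refusing to continue if a question is unanswered."""
    missing = [q for q in QUESTIONS if not answers.get(q, "").strip()]
    if missing:
        raise ValueError("unanswered: " + ", ".join(missing))
    lines = [f"{q.upper()}: {answers[q].strip()}" for q in QUESTIONS]
    return "\n".join(lines) + "\n"


# Placeholder answers, invented for illustration.
example = {
    "who": "Collected by the project field team; subject is Lake Superior",
    "what": "Daily surface water temperatures, one row per day",
    "where": "A single fixed buoy; sensor at 1 m depth",
    "when": "Ongoing daily series",
    "why": "To test whether the lake is warming over time",
}

print(render_readme(example))
```

Raising an error on a blank answer keeps incomplete documentation from slipping through unnoticed.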

Cross Discipline Concerns

No matter what, you need to have:

• File naming conventions

• Version control

Why Do I Need to Worry about That?

Consider this:

If you unexpectedly have to leave your research project for a few months, could a colleague easily make sense of your data files?

Why Use File Naming Conventions?

Naming conventions make life easier!

• Help you find your data

• Help others find your data

• Help track which version of a file is most current

What File Naming Convention Should I Use?

Has your research group established a convention?

If not, general guidelines include:

• Use meaningful file names that aren’t too long

• Avoid certain characters, such as spaces and symbols like / \ : * ? " < > | that some operating systems reject

• Dates can help with sorting and version control
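
If your group has no convention yet, the guidelines above can be encoded in a small helper so every file is named the same way. The pattern used here (project, description, date, version) and the function name `build_filename` are assumptions for illustration, not a convention prescribed by the curriculum.

```python
import re
from datetime import date


def build_filename(project, description, day, version, extension):
    """Build a name from project, description, date, and version number."""

    def tidy(text):
        # Lower-case, turn spaces into hyphens, then drop anything that
        # is not a letter, digit, or hyphen.
        text = text.strip().lower().replace(" ", "-")
        return re.sub(r"[^a-z0-9-]", "", text)

    stamp = day.strftime("%Y%m%d")  # year first, so names sort by date
    return f"{tidy(project)}_{tidy(description)}_{stamp}_v{version:02d}.{extension}"


print(build_filename("Lake Superior", "daily temps!", date(2015, 10, 13), 2, "csv"))
# prints: lake-superior_daily-temps_20151013_v02.csv
```

Zero-padding the version number keeps v02 sorting before v10 in a file listing.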

Module 3: Contextual details needed to make data meaningful to others

Editor: Lamar Soutter Library, University of Massachusetts Medical School

Title of the work: New England Collaborative Data Management Curriculum

the URL where the original work can be found: http://library.umassmed.edu/necdmc

CC BY-NC

Module 3: Metadata

What is Metadata?

“Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information” (NISO, Understanding Metadata 2004;1).

You Must Have Metadata to:

• find data from other researchers to support your research;

• use the data that you do find;

• help other professionals to find and use data from your research; and

• use your own data in the future when you may have forgotten details of the research.

Basic Types of Metadata

• Descriptive metadata

• Structural metadata

• Administrative metadata

How Metadata Facilitates Discoverability and Reuse

• Discoverability

• Accessibility

Some Sample Metadata Standards

• Darwin Core

• Ecological Metadata Language (EML)

• Climate and Forecast (CF)

Collecting and Sharing Metadata

• Controlled vocabularies

• Technical standards

Controlled Vocabularies

• Help take the guesswork out of:

  • choosing a preferred spelling;

  • choosing between a scientific and a popular term; and

  • determining which synonym to use.
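
In practice a controlled vocabulary behaves like a lookup table from every accepted variant to one preferred term. The tiny vocabulary below is invented for illustration; a real project would take its terms from an established list for its discipline.

```python
# Hypothetical mini-vocabulary: each variant maps to one preferred term.
PREFERRED_TERMS = {
    "grey": "gray",                          # preferred spelling
    "gray": "gray",
    "lake trout": "Salvelinus namaycush",    # popular -> scientific term
    "salvelinus namaycush": "Salvelinus namaycush",
    "temp": "temperature",                   # synonym / abbreviation
    "temperature": "temperature",
}


def normalize(term):
    """Return the preferred form of a term, or fail if it is not listed."""
    key = " ".join(term.lower().split())  # ignore case and extra spaces
    if key not in PREFERRED_TERMS:
        raise ValueError(f"{term!r} is not in the controlled vocabulary")
    return PREFERRED_TERMS[key]


print(normalize("Grey"))
print(normalize("  Lake   Trout "))
```

Rejecting unknown terms, rather than passing them through, is what keeps data entry consistent.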

Technical Standards

ISO 8601 technical standard:

• Year: YYYY (e.g. 1997)

• Year and month: YYYY-MM (e.g. 1997-07)

• Complete date: YYYY-MM-DD (e.g. 1997-07-16)

Media types can be problematic as well:

• The MIME media types help you choose among the following: application, audio, example, image, message, model, multipart, text, video
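
The ISO 8601 date patterns listed above map directly onto Python's standard `datetime` module. This is a minimal sketch; the variable names are arbitrary and the date is the one used in the examples above.

```python
from datetime import date

collected = date(1997, 7, 16)

year_only = collected.strftime("%Y")        # YYYY
year_month = collected.strftime("%Y-%m")    # YYYY-MM
complete = collected.isoformat()            # YYYY-MM-DD

print(year_only, year_month, complete)
# prints: 1997 1997-07 1997-07-16

# Parsing goes the other way and rejects dates that do not exist.
parsed = date.fromisoformat("1997-07-16")
print(parsed == collected)
# prints: True
```

Because the year comes first and every field is zero-padded, ISO 8601 dates sort correctly even when treated as plain text.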

Media Types

The MIME media types:

• Application

• Audio

• Image

• Model

• Multipart

• Message

• Text

• Video
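
A full MIME type has the form top-level type/subtype, for example text/plain. Python's standard `mimetypes` module can guess one from a file extension, as sketched below; the file names and the helper `top_level_type` are hypothetical, and the exact subtype returned can vary between systems.

```python
import mimetypes


def top_level_type(filename):
    """Return the top-level MIME type guessed from the file extension."""
    guessed, _encoding = mimetypes.guess_type(filename)
    if guessed is None:
        return None  # extension not recognized
    return guessed.split("/")[0]


for name in ["notes.txt", "site_photo.png", "interview.mp3"]:
    print(name, "->", top_level_type(name))
```

The guess is based on the extension only, so it is a starting point for a metadata field rather than proof of what the file contains.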

Approaches to Creating Metadata

First, identify your elements:

• Title

• Creator

• Identifier

• Subject

• Dates
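
Once the elements are identified, a metadata record can be as simple as a structured file saved alongside the data. The sketch below uses JSON and checks that no element is left empty; every value in the record is a placeholder, and the key names are assumptions rather than part of any particular standard.

```python
import json

REQUIRED_ELEMENTS = ("title", "creator", "identifier", "subject", "dates")

# Placeholder record, invented for illustration.
record = {
    "title": "Lake Superior daily water temperatures",
    "creator": "Example Researcher",
    "identifier": "example-dataset-001",
    "subject": ["water temperature", "lakes"],
    "dates": {"collected_start": "2015-01-01", "collected_end": "2015-10-13"},
}


def missing_elements(metadata):
    """List required elements that are absent or empty."""
    return [name for name in REQUIRED_ELEMENTS if not metadata.get(name)]


print(missing_elements(record))
# prints: []
print(json.dumps(record, indent=2))
```

For sharing beyond your own group, the same elements would normally be expressed in an established metadata standard instead of ad hoc keys.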

Best Practices

• Consult a metadata librarian!

• Consistent data entry is important

• Avoid extraneous punctuation

• Avoid most abbreviations

• Use templates and macros when possible

• Extract pre-existing metadata

• Keep a data dictionary

• Always use an established metadata standard

Sources for this Unit

What is metadata:

National Information Standards Organization (NISO). 2004. Understanding Metadata. http://www.niso.org/publications/press/UnderstandingMetadata.pdf

Neiswender, C. 2010. "Introduction to Metadata." In The MMI Guides: Navigating the World of Marine Metadata. http://marinemetadata.org/guides/mdataintro. Accessed April 1, 2013.

Reuse and discoverability:

National Information Standards Organization (NISO). 2004. Understanding Metadata. http://www.niso.org/publications/press/UnderstandingMetadata.pdf

Miller, Steven J. 2011. Metadata Resources: Selected Reference Documents, Web Sites, and Readings: https://pantherfile.uwm.edu/mll/www/resource.html

Wikipedia page on “Metadata”: http://en.wikipedia.org/wiki/Metadata

http://www.library.illinois.edu/dcc/bestpractices/chapter_11_structuralmetadata.html

Sources for this Unit (cont’d)

Metadata standards:

Digital Curation Centre’s Disciplinary Metadata resource. http://www.dcc.ac.uk/resources/metadata-standards.

Hogrefe, K., Stocks, K. 2011. "The Importance of Metadata Standards." In The MMI Guides: Navigating the World of Marine Metadata. http://marinemetadata.org/guides/mdatastandards/stdimportance. Accessed March 22, 2013.

Other suggested readings

Introduction to Metadata: Setting the Stage (Getty Research Institute) http://www.getty.edu/research/publications/electronic_publications/intrometadata/setting.html

Documentation and Metadata (MIT Libraries): http://libraries.mit.edu/guides/subjects/data-management/metadata.html

Version control and authenticity: http://data-archive.ac.uk/create-manage/format/versions

Other Suggested Readings (cont’d)

What is Metadata?

http://vimeo.com/3161893

Controlled vocabularies and technical standards

http://en.wikipedia.org/wiki/Controlled_vocabulary

http://www.ieee.org/education_careers/education/standards/standards_glossary.html

Metadata elements

http://libraries.mit.edu/guides/subjects/data-management/metadata.html

Creating metadata

http://uwdcc.library.wisc.edu/documents/DC_companionv1.3.pdf