Cal Poly - Data Management for Researchers
Data Management for Researchers
Carly Strasser, PhD California Digital Library
@carlystrasser [email protected]
Cal Poly Oct 2013
From Calisphere, courtesy of UC Riverside, California Museum of Photography
Tips, Tools, & Why You Should Care
From Calisphere, courtesy of Thousand Oaks Library
Roadmap
1. Background
2. Why you should care
3. Best practices
4. Toolbox
Why don’t people share data?
Is data management being taught?
Do attitudes about sharing differ among disciplines?
What role can libraries play in data education?
How can we promote storing data in repositories?
What barriers to sharing can we eliminate?
NSF funded DataNet Project Office of Cyberinfrastructure
A Brief History of Data Collection
Or… how scientists came to be so bad at data management
From Calisphere via Santa Clara University, ark:/13030/kt696nc7j2
Back in the day…
Da Vinci
Curie
Newton
classicalschool.blogspot.com
Darwin
Digital data
From Flickr by Flickmor
From Flickr by US Army Environmental Command
From Flickr by DW0825
C. Strasser
Courtesy of WHOI
From Flickr by deltaMike
Digital data +
Complex workflows
From Flickr by ~Minnea~
Data management Documentation Reproducibility
From Flickr by iowa_spirit_walker
• Cost
• Confusion about standards
• Lack of training
• Fear of lost rights or benefits
• No incentives
the Truth
From sandierpastures.com
You need to know about
• Data management
• Metadata
• Data repositories
• Data sharing
From Flickr by johntrainor
Why you should care
From Flickr by Redden-McAllister
Because they care:
All data must be in a public archive.
You can’t hoard it. If it’s not available you can’t cite it.
Include a data section with how to find datasets.
Data Management: Who Knew It Could Be a Hot Topic?
From Flickr by Velo Steve
Later!
What should you be doing?
From Flickr by whatthefeed
NOT V
C:\Documents and Settings\hampton\My Documents\NCEAS Distributed Graduate Seminars\[Wash Cres Lake Dec 15 Dont_Use.xls]Sheet1
Stable Isotope Data Sheet
Sampling Site / Identifier: Wash Cresc Lake
Sample Type: Algal Washed Rocks
Date: Dec. 16
Tray ID and Sequence: Tray 004
Notes typed into the sheet: Peter's lab; Don't use - old data
Reference statistics: SD for delta 13C = 0.07; SD for delta 15N = 0.15

Position  SampleID       Weight (mg)  %C     delta 13C  delta 13C_ca  %N    delta 15N  delta 15N_ca  Spec. No.  (other cells)
A1        ref            0.98         38.27  -25.05     -24.59        1.96  4.12       3.47          25354
A2        ref            0.98         39.78  -25.00     -24.54        2.03  4.01       3.36          25356
A3        ref            0.98         40.37  -24.99     -24.53        2.04  4.09       3.44          25358
A4        ref            1.01         42.23  -25.06     -24.60        2.17  4.20       3.55          25360      Shore Avg Con
A5        ALG01          3.05         1.88   -24.34     -23.88        0.17  -1.65      -2.30         25362      c -1.26 -27.22
A6        Lk Outlet Alg  3.06         31.55  -30.17     -29.71        0.92  0.87       0.22          25364      1.26 0.32
A7        ALG03          2.91         6.85   -21.11     -20.65        0.48  -0.97      -1.62         25366      c
A8        ALG05          2.91         35.56  -28.05     -27.59        2.30  0.59       -0.06         25368
A9        ALG07          3.04         33.49  -29.56     -29.10        1.68  0.79       0.14          25370
A10       ALG06          2.95         41.17  -27.32     -26.86        1.97  2.71       2.06          25372
B1        ALG04          3.01         43.74  -27.50     -27.04        1.36  0.99       0.34          25374      c
B2        ALG02          3            4.51   -22.68     -22.22        0.34  4.31       3.66          25376
B3        ALG01          2.99         1.59   -24.58     -24.12        0.15  -1.69      -2.34         25378      c
B4        ALG03          2.92         4.37   -21.06     -20.60        0.34  -1.52      -2.17         25380      c
B5        ALG07          2.9          33.58  -29.44     -28.98        1.74  0.62       -0.03         25382
B6        ref            1.01         44.94  -25.00     -24.54        2.59  3.96       3.31          25384
B7        ref            0.99         42.28  -24.87     -24.41        2.37  4.33       3.68          25386
B8        Lk Outlet Alg  3.04         31.43  -29.69     -29.23        1.07  0.95       0.30          25388
B9        ALG06          3.09         35.57  -27.26     -26.80        1.96  2.79       2.14          25390
B10       ALG02          3.05         5.52   -22.31     -21.85        0.45  4.72       4.07          25392
C1        ALG04          2.98         37.90  -27.42     -26.96        1.36  1.21       0.56          25394      c
C2        ALG05          3.04         31.74  -27.93     -27.47        2.40  0.73       0.08          25396
C3        ref            0.99         38.46  -25.09     -24.63        2.40  4.37       3.72          25398
Unlabeled values below the table: 23.78 1.17
From Stephanie Hampton (2010) ESA Workshop on Best Practices
2 tables Random notes
From Stephanie Hampton
Wash Cres Lake Dec 15 Dont_Use.xls
From Stephanie Hampton
Same data sheet, with regression output pasted beside the data:
SUMMARY OUTPUT
Regression Statistics
Multiple R         0.283158
R Square           0.080178
Adjusted R Square  -0.022024
Standard Error     1.906378
Observations       11
ANOVA
            df  SS        MS        F         Significance F
Regression  1   2.851116  2.851116  0.784507  0.398813
Residual    9   32.7085   3.634278
Total       10  35.55962
              Coefficients  Standard Error  t Stat     P-value   Lower 95%  Upper 95%  Lower 95.0%  Upper 95.0%
Intercept     -4.297428     4.671099        -0.920003  0.381568  -14.8642   6.269341   -14.8642     6.269341
X Variable 1  -0.158022     0.17841         -0.885724  0.398813  -0.561612  0.245569   -0.561612    0.245569
Random stats output
From Stephanie Hampton
Second table on the same sheet (transposed):
SampleID      ALG03   ALG05   ALG07   ALG06   ALG04   ALG02   ALG01   ALG03   ALG07
Weight (mg)   2.91    2.91    3.04    2.95    3.01    3       2.99    2.92    2.9
%C            6.85    35.56   33.49   41.17   43.74   4.51    1.59    4.37    33.58
delta 13C     -21.11  -28.05  -29.56  -27.32  -27.50  -22.68  -24.58  -21.06  -29.44
delta 13C_ca  -20.65  -27.59  -29.10  -26.86  -27.04  -22.22  -24.12  -20.60  -28.98
%N            0.48    2.30    1.68    1.97    1.36    0.34    0.15    0.34    1.74
delta 15N     -0.97   0.59    0.79    2.71    0.99    4.31    -1.69   -1.52   0.62
delta 15N_ca  -1.62   -0.06   0.14    2.06    0.34    3.66    -2.34   -2.17   -0.03
[Embedded chart: one series, "Series1"; x axis -35.00 to 0.00, y axis -3.00 to 4.00]
From Stephanie Hampton
Best Practices
data management
From Flickr by Big Swede Guy
1. Planning
2. Data collection & organization
3. Quality control & assurance
4. Metadata
5. Workflows
6. Data stewardship & reuse
Create unique identifiers
• Decide on naming scheme early
• Create a key
• Different for each sample
2. Data collection & organization
From Flickr by sjbresnahan
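The identifier advice on this slide can be sketched in a few lines of Python. The scheme used here (site code, year, zero-padded running number) and the names `make_sample_ids`, `find_duplicates`, and `id_key` are assumptions made for illustration; the talk does not prescribe a particular scheme.

```python
from collections import Counter


def make_sample_ids(site_code, year, n_samples):
    """Return n_samples unique IDs such as 'WCL-2010-001' (hypothetical scheme)."""
    return [f"{site_code}-{year}-{i:03d}" for i in range(1, n_samples + 1)]


def find_duplicates(ids):
    """Return IDs that occur more than once, sorted, so they can be fixed early."""
    return sorted(i for i, count in Counter(ids).items() if count > 1)


# The key documents what each part of an identifier means.
id_key = {
    "site_code": "three-letter site abbreviation",
    "year": "four-digit collection year",
    "number": "running sample number within site and year",
}

sample_ids = make_sample_ids("WCL", 2010, 3)
print(sample_ids)
print(find_duplicates(sample_ids + ["WCL-2010-002"]))
```

Keeping `id_key` next to the data means a later reader does not have to guess what each part of an identifier encodes.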
From Flickr by zebbie
Standardize
• Consistent within columns – only numbers, dates, or text
• Consistent names, codes, formats
Modified from K. Vanderbilt
From Pink Floyd, The Wall (themurkyfringe.com)
2. Data collection & organization
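One way to check "consistent within columns" is to classify every cell and flag those that differ from the rest of the column. This is a rough Python sketch under simple assumptions: cells arrive as strings, only numbers and free text are distinguished, and the function names are hypothetical.

```python
def classify(value):
    """Classify one cell as 'missing', 'number', or 'text'."""
    text = value.strip()
    if text == "":
        return "missing"
    try:
        float(text)
        return "number"
    except ValueError:
        return "text"


def inconsistent_cells(column):
    """Return (row_index, value) pairs whose kind differs from the column's most common kind."""
    kinds = [classify(v) for v in column]
    present = [k for k in kinds if k != "missing"]
    if not present:
        return []
    majority = max(set(present), key=present.count)
    return [(i, v) for i, (v, k) in enumerate(zip(column, kinds))
            if k not in ("missing", majority)]


weights = ["3.05", "3.06", "about 3", "2.91", ""]
print(inconsistent_cells(weights))
```

Here the free-text entry "about 3" is reported, while the empty cell is left for a separate missing-value check.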
Google Docs Forms
Standardize
• Reduce possibility of manual error by constraining entry choices
Modified from K. Vanderbilt
2. Data collection & organization
Excel lists
Data validation
2. Data collection & organization
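The same idea as form choices and Excel validation lists can be applied in a script: accept only values from a controlled list and numbers within a sensible range. The vocabulary, the weight limits, and the field names below are hypothetical placeholders, not values from the talk.

```python
ALLOWED_SAMPLE_TYPES = {"algae", "sediment", "water"}  # hypothetical controlled vocabulary


def validate_entry(record):
    """Return a list of problems found in one data-entry record (empty if none)."""
    problems = []
    if record.get("sample_type") not in ALLOWED_SAMPLE_TYPES:
        problems.append(f"sample_type must be one of {sorted(ALLOWED_SAMPLE_TYPES)}")
    weight = record.get("weight_mg")
    # The 0-100 mg range is an assumed limit for illustration only.
    if not isinstance(weight, (int, float)) or not 0 < weight < 100:
        problems.append("weight_mg must be a number between 0 and 100")
    return problems


print(validate_entry({"sample_type": "algae", "weight_mg": 3.05}))
print(validate_entry({"sample_type": "Algea", "weight_mg": -1}))
```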
Create a parameter table
Create a site table
From doi:10.3334/ORNLDAAC/777
From R Cook, ESA Best Practices Workshop 2010
A relational database is
• A set of tables
• Relationships among the tables
• A language to specify & query the tables
An RDB provides
• Scalability: millions+ records
• Features for sub-setting, querying, sorting
• Reduced redundancy & entry errors
2. Data collection & organization
From Mark Schildhauer
What about databases?
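A small illustration of "tables, relationships, and a query language", using Python's built-in `sqlite3` module with an in-memory database. The table names, columns, and rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE site (
    site_id   TEXT PRIMARY KEY,
    site_name TEXT NOT NULL
);
CREATE TABLE measurement (
    sample_id TEXT PRIMARY KEY,
    site_id   TEXT NOT NULL REFERENCES site(site_id),
    weight_mg REAL
);
""")

# Each site is described once; measurements refer to it by site_id.
conn.executemany("INSERT INTO site VALUES (?, ?)",
                 [("S1", "Lake outlet"), ("S2", "North shore")])
conn.executemany("INSERT INTO measurement VALUES (?, ?, ?)",
                 [("A1", "S1", 3.06), ("A2", "S1", 3.04), ("A3", "S2", 2.91)])

# Query across both tables: sample count and mean weight per site.
rows = conn.execute("""
    SELECT site.site_name, COUNT(*), ROUND(AVG(measurement.weight_mg), 2)
    FROM measurement JOIN site ON site.site_id = measurement.site_id
    GROUP BY site.site_name
    ORDER BY site.site_name
""").fetchall()
print(rows)
```

Because the site name is stored once in `site`, it cannot be misspelled differently in different measurement rows, which is the "reduced redundancy & entry errors" point.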
2. Data collection & organization
From Mark Schildhauer
You should invest time in learning databases if
• your data sets are large or complex
Consider investing time in learning databases if
• your data are small and humble
• you ever intend to share your data
• you are < 30 years old
Use descriptive file names
• Unique
• Reflect contents
From R Cook, ESA Best Practices Workshop 2010
Bad: Mydata.xls, 2001_data.csv, best version.txt
Better: Eaffinis_nanaimo_2010_counts.xls
  Eaffinis: study organism
  nanaimo: site name
  2010: year
  counts: what was measured
2. Data collection & organization
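A file-naming convention is easier to keep if names are generated rather than typed. The helper below is a hypothetical sketch that builds names in the organism_site_year_measurement pattern of the "Better" example and rejects parts containing spaces or punctuation.

```python
import re


def build_filename(organism, site, year, measured, extension="csv"):
    """Join the descriptive parts with underscores, e.g. Eaffinis_nanaimo_2010_counts.csv."""
    parts = [organism, site, str(year), measured]
    for part in parts:
        # Keep names portable: letters and digits only, no spaces or punctuation.
        if not re.fullmatch(r"[A-Za-z0-9]+", part):
            raise ValueError(f"invalid file name part: {part!r}")
    return "_".join(parts) + "." + extension


print(build_filename("Eaffinis", "nanaimo", 2010, "counts", "xls"))
```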
*Not for everyone
*
Organize files logically
Biodiversity
  Lake
    Experiments
      Biodiv_H20_heatExp_2005to2008.csv
      Biodiv_H20_predatorExp_2001to2003.csv
      …
    Field work
      Biodiv_H20_PlanktonCount_2001toActive.csv
      Biodiv_H20_ChlAprofiles_2003.csv
      …
  Grassland
From S. Hampton
2. Data collection & organization
Preserve information
• Keep raw data raw
• Use scripts to process data & save them with data
Raw data as .csv
R script for processing & analysis
2. Data collection & organization
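The slide pairs a raw .csv with an R script. The sketch below shows the same pattern in Python (a substitution made for this example): the raw file is only read, and cleaned rows go to a separate file. The column names and the cleaning rules are assumptions for illustration.

```python
import csv
import tempfile
from pathlib import Path


def process_raw(raw_path, out_path):
    """Read the raw CSV (never modified) and write cleaned rows to a separate file."""
    kept = 0
    with open(raw_path, newline="") as src, open(out_path, "w", newline="") as dst:
        reader = csv.DictReader(src)
        writer = csv.DictWriter(dst, fieldnames=reader.fieldnames)
        writer.writeheader()
        for row in reader:
            if row["weight_mg"].strip() == "":
                continue  # drop rows with a missing weight; the raw file keeps them
            row["sample_id"] = row["sample_id"].strip().upper()
            writer.writerow(row)
            kept += 1
    return kept


# Demonstration in a temporary directory with made-up rows.
workdir = Path(tempfile.mkdtemp())
raw_file = workdir / "raw_counts.csv"
raw_text = "sample_id,weight_mg\nalg01,3.05\nalg02,\n alg03,2.91\n"
raw_file.write_text(raw_text)
clean_file = workdir / "clean_counts.csv"

rows_kept = process_raw(raw_file, clean_file)
print(rows_kept, raw_file.read_text() == raw_text)
```

Because every change lives in the script, the cleaned file can be regenerated from the raw file at any time, and the script itself documents what was done.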
Before data collection
• Define & enforce standards
• Assign responsibility for data quality
3. Quality control and quality assurance
From Flickr by StacieBee
After data entry
• Check for missing, impossible, anomalous values
• Perform statistical summaries
• Look for outliers
3. Quality control and quality assurance
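The after-entry checks can be sketched as one small function. The limits, the z-score cutoff of 2.0, and the example values below are assumptions for illustration; appropriate thresholds depend on the data, and a z-score is only one of several ways to look for outliers.

```python
import statistics


def qc_report(values, lower, upper, z_cutoff=2.0):
    """Flag missing, impossible (outside [lower, upper]) and outlying values by index.

    Needs at least two plausible values to compute a standard deviation.
    """
    missing = [i for i, v in enumerate(values) if v is None]
    present = [(i, v) for i, v in enumerate(values) if v is not None]
    impossible = [i for i, v in present if not lower <= v <= upper]
    plausible = [(i, v) for i, v in present if lower <= v <= upper]
    nums = [v for _, v in plausible]
    mean = statistics.mean(nums)
    sd = statistics.stdev(nums)
    outliers = [i for i, v in plausible if sd > 0 and abs(v - mean) / sd > z_cutoff]
    return {"missing": missing, "impossible": impossible, "outliers": outliers,
            "mean": mean, "sd": sd}


# Made-up percent-carbon readings: one blank, one above 100 %, one suspiciously low.
percent_c = [38.3, 39.8, 40.4, 42.2, None, 41.2, 43.7, 37.9, 144.9, 38.5, 39.1, 40.0, 1.6]
report = qc_report(percent_c, lower=0, upper=100)
print(report["missing"], report["impossible"], report["outliers"])
```

Flagged values are reported by position rather than deleted, so a person can decide what to do with each one.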
4. Metadata basics
Why are you promoting Excel?
What is metadata?
4. Metadata basics
Metadata = Data reporting
WHO created the data?
WHAT is the content of the data set?
WHEN was it created?
WHERE was it collected?
HOW was it developed?
WHY was it developed?
From Flickr by /\/\ichael Patric|{
• Digital context
  • Name of the data set
  • The name(s) of the data file(s) in the data set
  • Date the data set was last modified
  • Example data file records for each data type file
  • Pertinent companion files
  • List of related or ancillary data sets
  • Software (including version number) used to prepare/read the data set
  • Data processing that was performed
• Personnel & stakeholders
  • Who collected
  • Who to contact with questions
  • Funders
• Scientific context
  • Scientific reason why the data were collected
  • What data were collected
  • What instruments (including model & serial number) were used
  • Environmental conditions during collection
  • Where collected & spatial resolution
  • When collected & temporal resolution
  • Standards or calibrations used
• Information about parameters
  • How each was measured or produced
  • Units of measure
  • Format used in the data set
  • Precision & accuracy if known
• Information about data
  • Definitions of codes used
  • Quality assurance & control measures
  • Known problems that limit data use (e.g. uncertainty, sampling problems)
  • How to cite the data set
4. Metadata basics
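As a minimal illustration of the who/what/when/where/how/why questions, metadata can be kept in a small machine-readable file next to the data. This sketch uses plain JSON rather than a formal standard such as EML or FGDC, and every field value is a hypothetical placeholder.

```python
import json

REQUIRED_FIELDS = ["who", "what", "when", "where", "how", "why"]


def missing_fields(metadata):
    """Return the required fields that are absent or empty."""
    return [f for f in REQUIRED_FIELDS if not str(metadata.get(f, "")).strip()]


# Placeholder record; "why" is left blank on purpose to show the check.
metadata = {
    "who": "Example Researcher (hypothetical)",
    "what": "Stable isotope measurements of algal samples",
    "when": "2010-12-16",
    "where": "Example Lake (hypothetical site)",
    "how": "Mass spectrometry; units and codes defined in a parameter table",
    "why": "",
}

print(missing_fields(metadata))
metadata_text = json.dumps(metadata, indent=2)  # text to save alongside the data file
```

A check like `missing_fields` can run before data are archived, so gaps are caught while the people who know the answers are still available.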
• Provides structure to describe data
Common terms | definitions | language | structure
4. Metadata basics
• Lots of different standards: EML, FGDC, ISO 19115, Darwin Core, …
• Tools for creating metadata files: Morpho (EML), Metavist (FGDC), NOAA MERMaid (CSDGM)
What is metadata?
Select the appropriate standard
Flow chart: Temperature data, Salinity data → Data import into R → Data in R format → Quality control & data cleaning → “Clean” T & S data → Analysis: mean, SD → Summary statistics → Graph production
5. Workflows
Workflow: how you get from the raw data to the final products of your research
Simple workflows: flow charts
• R, SAS, MATLAB
• Well-documented code is…
  Easier to review
  Easier to share
  Easier to repeat analysis
5. Workflows
Simple workflows: commented scripts
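A commented script that follows the temperature and salinity flow chart might look like the sketch below. It is written in Python rather than the R, SAS, or MATLAB named on the slide, and the readings are made-up values standing in for real input files.

```python
import statistics

# Step 1: import the raw data (inline here; hypothetical values standing in for files).
raw = {
    "temperature_c": [12.1, 12.4, None, 12.9, 13.2],
    "salinity_psu": [33.1, 33.4, 33.2, None, 33.6],
}

# Step 2: quality control and cleaning. Missing readings are dropped.
clean = {name: [v for v in values if v is not None] for name, values in raw.items()}

# Step 3: analysis. Mean and standard deviation per variable.
summary = {
    name: {"n": len(v), "mean": statistics.mean(v), "sd": statistics.stdev(v)}
    for name, v in clean.items()
}

# Step 4: report summary statistics (a plotting step would follow here).
for name, stats in summary.items():
    print(f"{name}: n={stats['n']} mean={stats['mean']:.2f} sd={stats['sd']:.2f}")
```

Each comment corresponds to one box of the flow chart, so the script doubles as documentation of the workflow.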
Fancy Schmancy workflows: Kepler
Resulting output
5. Workflows
https://kepler-project.org
Workflows enable
Reproducibility
Transparency
Executability
5. Workflows
From Flickr by merlinprincesse
Minimally: document your analysis (commented code; simple flow-chart)
Emerging workflow applications will…
− Link software for executable end-to-end analysis
− Provide detailed info about data & analysis
− Facilitate re-use & refinement of complex, multi-step analyses
− Enable efficient swapping of alternative models & algorithms
− Help automate tedious tasks
5. Workflows
www.littlebytesoflife.com
Coming Soon: workflow sharing requirements!
The 20-Year Rule
The metadata accompanying a data set should be written for a user 20 years into the future
6. Data stewardship & reuse
(National Research Council 1991)
From Flickr by greensambaman
RULE
Use stable formats: csv, txt, tiff
Create back-up copies: original, near, far
Periodically test ability to restore information
6. Data stewardship & reuse
Modified from R. Cook
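One simple way to test a back-up is to compare checksums of the original and the copy. The sketch below uses Python's standard `hashlib` and a temporary directory; the file names and contents are hypothetical, and a real check would compare the original against the "near" and "far" copies.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path


def sha256_of(path):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def backup_is_intact(original, backup):
    """True if the backup's contents match the original byte for byte."""
    return sha256_of(original) == sha256_of(backup)


# Demonstration with a made-up file and a copy of it.
workdir = Path(tempfile.mkdtemp())
original = workdir / "counts.csv"
original.write_text("sample_id,count\nA1,12\n")
backup = workdir / "counts_backup.csv"
shutil.copy2(original, backup)
print(backup_is_intact(original, backup))
```

Recording the checksum when the data are first archived also lets a later restore be verified even if the original is gone.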
Store your data in a repository
Institutional archive
Discipline/specialty archive
6. Data stewardship & reuse
From Flickr by torkildr
Ask a librarian
Repos of repos:
databib.org
re3data.org
Allows readers to find data products
Get credit for data and publications
Promotes reproducibility
Better measure of research impact
Example: Sidlauskas, B. 2007. Data from: Testing for unequal rates of morphological diversification in the absence of a detailed phylogeny: a case study from characiform fishes. Dryad Digital Repository. doi:10.5061/dryad.20
Persistent Unique Identifier
6. Data stewardship & reuse
Practice Data Citation
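If data set details are kept as structured fields, a citation in the style of the Dryad example can be assembled consistently. The function name and field order below are an assumption modeled on that one example, not a formal citation standard.

```python
def format_data_citation(author, year, title, repository, doi):
    """Assemble 'Author Year. Title. Repository. doi:...' from its parts."""
    return f"{author} {year}. {title}. {repository}. doi:{doi}"


citation = format_data_citation(
    author="Sidlauskas, B.",
    year=2007,
    title=("Data from: Testing for unequal rates of morphological diversification "
           "in the absence of a detailed phylogeny: a case study from characiform fishes"),
    repository="Dryad Digital Repository",
    doi="10.5061/dryad.20",
)
print(citation)
```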
A document that describes what you will do with your data throughout the research project
From Flickr by Barbies Land
What is a data management plan?
DMP for funders: A short plan submitted alongside grant applications
But they all have different requirements and express them in different ways
From Flickr by 401(K) 2013
An outline of
– what will be collected
– methods
– standards
– metadata
– sharing/access
– long-term storage
Includes how and why
NSF DMP Requirements
From Grant Proposal Guidelines:
DMP supplement may include:
1. the types of data, samples, physical collections, software, curriculum materials, and other materials to be produced in the course of the project
2. the standards to be used for data and metadata format and content (where existing standards are absent or deemed inadequate, this should be documented along with any proposed solutions or remedies)
3. policies for access and sharing including provisions for appropriate protection of privacy, confidentiality, security, intellectual property, or other rights or requirements
4. policies and provisions for re-use, re-distribution, and the production of derivatives
5. plans for archiving data, samples, and other research products, and for preservation of access to them
Carly Strasser | @carlystrasser California Digital Library
5 August 2013 ESA 2013 SS 2
From Flickr by OZinOH
DMPTool: The Data Management Planning Tool
From Flickr by dipster1
Toolbox
Step-by-step wizard for generating DMP
create | edit | re-use | share
Free & open to community
Write a DMP: dmptool.org
databib.org
Where should I put my data?
Find a repository
Get help
From Flickr by thewmatt
Get help from your library
From Flickr by North Carolina Digital Heritage Center
From Flickr by Madison Guy
DCXL blog: dcxl.cdlib.org
Toolbox:
Get help
From Flickr by dotpolka
Doing science is a privilege – not a right
From Flickr by Michael Tinkler
There is a social contract of science: we have an obligation to ensure dissemination, validation, & advancement.
To not do so is science malpractice.
– Brian Hole, Ubiquity Press at UCL
From Flickr by mikerosebery
My website: carlystrasser.net
Email me: [email protected]
Tweet me: @carlystrasser
My slides: slideshare.net/carlystrasser