QA/QC for ecological data: tips & cheat codes
Transcript of QA/QC for ecological data: tips & cheat codes
dead data tell no tales
@cjlortie
you will have to reuse your data
planning promotes reproducibility
https://dmptool.org to begin your game/journey
try a data management planning tool
Michener & Jones 2012
there is no perfect experiment
Ruxton & Colegrave 2016
there are no perfect data
data vary in class and structure
QA/QC
Cai & Zhu 2015
no one set of criteria need fit all ecological data
but practical principles can be used as a guide
Pipino et al. 2002
a practical guide to QA/QC for ecological data
increasingly adopt #rstats & #tidyverse workflows
Tip #1. Pilot data & meta-data
build tidy data & do data by design
rnorm(n = 10, mean = 39.74, sd = 25.09)
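A minimal sketch of what piloting data and meta-data could look like in base R before any fieldwork. The mean and sd are taken from the slide; the column names (plot_id, treatment, response) and the two treatment levels are hypothetical and only there for illustration.

```r
# Simulate a small pilot dataset to test the planned data structure.
# Column names and treatment levels are assumptions, not from the slides.
set.seed(42)  # fixed seed so the simulated pilot data are reproducible

pilot <- data.frame(
  plot_id   = rep(1:5, times = 2),
  treatment = rep(c("control", "treated"), each = 5),
  response  = rnorm(n = 10, mean = 39.74, sd = 25.09),
  stringsAsFactors = FALSE
)

# Tidy layout: one row per observation, one column per variable.
str(pilot)

# Pilot meta-data: describe every column before the real data exist.
pilot_meta <- data.frame(
  variable    = names(pilot),
  class       = vapply(pilot, function(x) class(x)[1], character(1)),
  description = c("plot identifier", "treatment level", "simulated response"),
  stringsAsFactors = FALSE
)
print(pilot_meta)
```

With an sd this large relative to the mean, some simulated values may be negative, which is a useful prompt to decide on plausible bounds for the real variable ahead of time.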
Tip #2. Use social coding for QA/QC
(at least) two-player mode
Tip #3. Check #rstats for data tools
there is a package for that (at least two)
i.e. like ‘cheat codes’ to get you there sooner
Maia et al. 2013
pavo
biogeo for occurrence data
Robertson et al. 2016
codyn for community dynamic metrics with taxize to check names
codyn::check_multispp(), check_names(), check_sppvar()
taxize::gnr_resolve()
use R Markdown + GitHub for versioned reviews & data cleaning
Tip #4. Version & annotate your data cleaning
Tip #5. Check classes of vectors/variables
str(), unique(), nrow(), tibble()
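A short sketch of how the checks named above might catch common class problems, using base R only. The data frame `raw` and its contents are invented for illustration; printing the same data as a tibble would also show each column's class.

```r
# Hypothetical raw data: a stray text entry turns a numeric column into
# character, and site labels are inconsistently capitalised.
raw <- data.frame(
  site  = c("A", "A", "B", "b", "C"),
  count = c("12", "7", "n/a", "3", "0"),
  stringsAsFactors = FALSE
)

str(raw)          # count is chr rather than num: a red flag
unique(raw$site)  # reveals both "B" and "b"
nrow(raw)         # confirm the expected number of rows

# One possible fix: standardise labels, then convert to numeric.
# Entries that are not numbers (here "n/a") become NA.
clean <- raw
clean$site  <- toupper(clean$site)
clean$count <- suppressWarnings(as.numeric(clean$count))

str(clean)
```

Coercion to NA is suppressed silently here, so counting the NAs before and after conversion is a sensible follow-up check.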
Tip #6. Decide what is a true zero
Martin et al. 2005
is.na(), data[!is.na(data$x), ]
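A small sketch of why the zero versus NA decision matters, assuming a convention where NA means a plot was not surveyed and 0 means it was surveyed and nothing was found. The data frame and that convention are hypothetical; the slide only names is.na() and the subsetting idiom.

```r
# Hypothetical survey data: NA = not surveyed, 0 = surveyed, none found.
data <- data.frame(
  plot = 1:6,
  x    = c(4, 0, NA, 2, NA, 0)
)

sum(is.na(data$x))              # number of plots with missing observations
sum(data$x == 0, na.rm = TRUE)  # number of true zeros

# Drop missing observations but keep the true zeros.
surveyed <- data[!is.na(data$x), ]
mean(surveyed$x)                # mean over surveyed plots, zeros included

# Recoding NA as 0 treats unsurveyed plots as absences and lowers the mean.
as_zero <- data
as_zero$x[is.na(as_zero$x)] <- 0
mean(as_zero$x)
```

The two means differ, so the meaning of a blank cell should be recorded in the meta-data rather than decided during analysis.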
Tip #7. Pre-print your data
publish sooner
the reproducibility crisis in science needs to end. today.
avoid a ‘game-over’ effect before the reuse even begins.
better data. better reproducibility.
nom nom