QA/QC for ecological data: tips & cheat codes

Transcript of QA/QC for ecological data: tips & cheat codes

  1. dead data tell no tales. @cjlortie
  2. you will have to reuse your data: planning promotes reproducibility.
  3. to begin your game/journey, try a data management planning tool: https://dmptool.org (Michener & Jones 2012)
  4. there is no perfect experiment (Ruxton & Colegrave 2016)
  5. there are no perfect data: data vary in class and structure.
  6. QA/QC (Cai & Zhu 2015)
  7. QA/QC: no one set of criteria need fit all ecological data, but practical principles can be used as a guide (Pipino et al. 2002)
  8. a practical guide to QA/QC for ecological data: increasingly adopt #rstats & #tidyverse workflows.
  9. Tip #1. Pilot data & meta-data: build tidy data & do data by design. rnorm(n = 10, mean = 39.74, sd = 25.09)
  10. Tip #2. Use social coding for QA/QC: (at least) two-player mode.
  11. Tip #3. Check #rstats for data tools: there is a package for that (at least two), i.e. cheat codes to get you there sooner.
  12. pavo (Maia et al. 2013)
  13. biogeo for occurrence data (Robertson et al. 2016)
  14. codyn for community dynamic metrics, with taxize to check names: codyn::check_multispp(), check_names(), check_sppvar(); taxize::gnr_resolve()
  15. Tip #4. Version & annotate your data cleaning: use R Markdown + GitHub for versioned reviews & data cleaning.
  16. Tip #5. Check classes of vectors/variables: str(), unique(), nrow(), tibble()
  17. Tip #6. Decide what is a true zero (Martin et al. 2005): is.na(), data[!is.na(data$x), ]
  18. Tip #7. Pre-print your data: publish sooner.
  19. the reproducibility crisis in science needs to end. today. avoid a game-over effect before the reuse even begins.
  20. better data. better reproducibility. nom nom
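
Tips #1, #5 and #6 can be strung together in a few lines of base R. The sketch below is only an illustration, not the workflow from the talk: the data frame `pilot`, the columns `site` and `cover`, the seed, and the injected NA and zero are all hypothetical. Only the rnorm() call and the str(), unique(), nrow() and is.na() checks come from the slides.

```r
# Tip #1: simulate pilot data before sampling, using the slide's rnorm() call.
set.seed(42)  # assumption: a fixed seed so the sketch is repeatable
pilot <- data.frame(
  site  = rep(c("a", "b"), each = 5),  # hypothetical grouping variable
  cover = rnorm(n = 10, mean = 39.74, sd = 25.09),
  stringsAsFactors = FALSE             # keep 'site' as character on any R version
)

# Mimic two common field-data situations (both hypothetical):
pilot$cover[3] <- NA  # a plot that was never measured
pilot$cover[7] <- 0   # a plot that was measured and nothing was found

# Tip #5: check classes and structure before any analysis.
str(pilot)
unique(pilot$site)
nrow(pilot)
stopifnot(is.character(pilot$site), is.numeric(pilot$cover))

# Tip #6: a missing value and a true zero are different things.
n_missing <- sum(is.na(pilot$cover))
n_zero    <- sum(pilot$cover == 0, na.rm = TRUE)

# Drop only the unmeasured rows; the true zero stays in the data.
clean <- pilot[!is.na(pilot$cover), ]
```

With these simulation parameters rnorm() can also return negative values, which a real percent-cover variable could not take; catching that kind of mismatch before fieldwork is one reason to pilot the data structure first.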