Making sense of citizen science data: A review of methods

Olivier Gimenez https://oliviergimenez.wordpress.com/

A review in 15 minutes?!!

Motivation

•  Recent interest in large terrestrial and marine mammals

•  Hardly amenable to standard field protocols

•  Growing curiosity about citizen science data (CSD), but where to start?

What are the biases in CSD?

•  Observer bias

•  Spatial bias

•  Detection bias

[Image captions: You see me / You don’t see me]

Review of the literature

•  List all papers with ‘Citizen Science’ in them

•  Scan and check those actually analysing CSD

•  Add papers found randomly (ignoring observer bias…)

•  Can we build a taxonomy of methods?

•  It’s going to be clumsy and non-exhaustive

And boring…

1 - the ‘comparative’ approach

•  Comparison of results from (classic) analyses of CSD vs. standardized protocols
   -  Deemed to be study/species specific
   -  Results are often convergent

•  My review stops here then…

2 - ‘filtering’ and ‘correction’ approaches

•  Methods to filter, select data

•  Correction methods: List Length Analysis, Ball’s approach, Telfer’s approach, the Frescalo method, …

[Figure: Sample Completed Least Bittern Survey Data Sheet]

2 - ‘filtering’ and ‘correction’ approaches

•  These methods are not robust to bias in CSD, except the Frescalo method

Check out our paper, it’s awesome!

3 - the ‘simulation’ approach (Virtual Ecologist)

•  Simulate the bias, and check how your favorite method behaves

•  Case study with wolverine in Scandinavia

•  Counts on den sites to infer abundance

•  Accumulation of knowledge about the sites falsely increases observed counts

V. Gervasi

3 - the ‘simulation’ approach (Virtual Ecologist)

[Figure: plot with axes Year and Log(N)]

•  Tool to design protocols adequately and explore potential bias

•  Convincing way to prove that raw indices are biased
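A minimal sketch of the den-site idea, in Python with NumPy. It is not the simulation from the wolverine case study: the number of dens, the schedule for how knowledge accumulates (`p_known_start`, `p_known_end`) and the function name `simulate_den_counts` are assumptions made for illustration. The true number of active dens is held constant while the share of dens known to observers grows, so any trend in the raw counts is pure bias.

```python
import numpy as np

def simulate_den_counts(n_years=20, n_dens=200, p_known_start=0.3,
                        p_known_end=0.9, seed=1):
    """Simulate raw den counts when the true number of dens never changes."""
    rng = np.random.default_rng(seed)
    years = np.arange(n_years)
    # Truth (assumed): the same number of active dens every year.
    true_n = np.full(n_years, n_dens)
    # Assumed bias: observers know a growing share of the dens over time.
    p_known = np.linspace(p_known_start, p_known_end, n_years)
    # A den is counted only if it is active and known to the observers.
    observed = rng.binomial(true_n, p_known)
    return years, true_n, observed

years, true_n, observed = simulate_den_counts()

# Slope of log(count) against year, i.e. the apparent growth rate.
true_slope = np.polyfit(years, np.log(true_n), 1)[0]
raw_slope = np.polyfit(years, np.log(observed), 1)[0]
print(f"true trend: {true_slope:.3f}, trend in raw counts: {raw_slope:.3f}")
```

With these made-up settings the true trend is flat, while the raw counts suggest a growing population.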

4 - the ‘regression’ approach

•  Use relevant variables to account for biases

Ian Renner & David Warton

4 - the ‘regression’ approach

•  Ecological variables
   -  Affect species’ presence
   -  Used for building models and predicting

•  Observer bias variables
   -  Affect species detection
   -  Used only for building models
   -  Prediction with common level of bias
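The two roles of the covariates can be sketched in Python with NumPy. This is a simplified stand-in, not the point process models of the authors named on the slide: it uses a Poisson regression on simulated grid cells, and the covariates `env` and `access`, the coefficients in `true_beta` and the helper `fit_poisson` are all hypothetical. Both covariates enter the fit, but the prediction sets the bias covariate to one common value for every cell.

```python
import numpy as np

def fit_poisson(X, y, n_iter=50):
    """Poisson log-linear regression fitted by Newton-Raphson."""
    beta = np.zeros(X.shape[1])
    beta[0] = np.log(y.mean())  # start from the intercept-only fit
    for _ in range(n_iter):
        mu = np.exp(X @ beta)
        grad = X.T @ (y - mu)            # score vector
        hess = X.T @ (X * mu[:, None])   # Fisher information
        beta = beta + np.linalg.solve(hess, grad)
    return beta

rng = np.random.default_rng(42)
n_cells = 5000
env = rng.normal(size=n_cells)     # hypothetical ecological covariate
access = rng.normal(size=n_cells)  # hypothetical observer-bias covariate

# Simulated records: driven by ecology AND by where observers go.
true_beta = np.array([0.5, 0.8, 1.0])
X = np.column_stack([np.ones(n_cells), env, access])
y = rng.poisson(np.exp(X @ true_beta))

# Build the model with both kinds of covariates.
beta_hat = fit_poisson(X, y)

# Predict with the bias covariate fixed at a common level (here 0).
X_common = X.copy()
X_common[:, 2] = 0.0
corrected = np.exp(X_common @ beta_hat)
```

The predicted map `corrected` then varies with `env` only, whatever the spatial pattern in `access`.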

4 - the ‘regression’ approach

[Figure: maps of estimated intensity of Eucalyptus apiculata in Australia (# detections / km²). Panels: ecological variables only; ecological + observer bias variables, conditioning on a common level of bias. Map labels: Sydney, Wollemi Nat Park.]

5 - the ‘combination’ approach

•  Combine CSD with data collected via standard protocols (detection/non-detection)
   -  DND data allow correcting for bias in opportunistic data
   -  If no DND for one species, share information with other species assuming similar bias

[Diagram labels: opportunistic data; detection/non-detection data; actual presence-absence of the species]

Will Fithian

•  Several clever people are on it: Pagel, Giraud, Dorazio, Fithian, O’Hara, …

6 - the ‘occupancy’ approach

•  Correct for false-negatives, and time/spatial variation in detection
   -  Account for false-positives
   -  Extension to multiple species

•  How to get the non-detections?
   -  Relatively easy for checklist data
   -  But otherwise? You need to know something about the observer effort…
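To make the false-negative correction concrete, here is a hedged Python sketch of the most basic occupancy model: one season, constant occupancy `psi` and detection `p`, no false positives. The simulated data, the parameter values and the crude grid-search fit are assumptions for illustration only; real analyses use dedicated software and richer models.

```python
import numpy as np

def occupancy_loglik(psi, p, detections, n_visits):
    """Log-likelihood of a basic single-season occupancy model.

    detections: number of visits with a detection, one entry per site.
    """
    detected = detections > 0
    # At least one detection: the site is occupied (no false positives assumed).
    ll_det = (np.log(psi) + detections * np.log(p)
              + (n_visits - detections) * np.log(1 - p))
    # Never detected: occupied but missed on every visit, or truly unoccupied.
    ll_zero = np.log(psi * (1 - p) ** n_visits + (1 - psi))
    return np.sum(np.where(detected, ll_det, ll_zero))

# Simulated data under assumed values of psi and p.
rng = np.random.default_rng(7)
n_sites, n_visits = 2000, 4
psi_true, p_true = 0.6, 0.4
occupied = rng.random(n_sites) < psi_true
detections = rng.binomial(n_visits, p_true * occupied)

# Naive estimate: share of sites with at least one detection.
naive = np.mean(detections > 0)

# Maximum likelihood by brute-force grid search (for transparency, not speed).
grid = np.linspace(0.01, 0.99, 99)
_, psi_hat, p_hat = max(
    (occupancy_loglik(a, b, detections, n_visits), a, b)
    for a in grid for b in grid
)
print(f"naive: {naive:.2f}, psi_hat: {psi_hat:.2f}, p_hat: {p_hat:.2f}")
```

The naive proportion of sites with a detection underestimates occupancy, because occupied sites where the species was missed on every visit are counted as empty. The model needs repeated visits with recorded non-detections, which is exactly what opportunistic data lack.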

Wolf range dynamics in France

•  Typical example of human-wildlife conflict

•  Network of observers all over the country

•  Map the wolf’s range, and assess its dynamics

•  Reconstruct observation effort a posteriori

•  Use space-time information on the observers
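One way such a reconstruction could look is sketched below in Python. This is not the procedure used in the wolf study: the observer records (`cells`, `first_year`, `last_year`), the grid and the function `build_detection_matrix` are hypothetical. The idea is that a cell-year counts as sampled if at least one observer was active there, so a sampled cell-year without a record becomes a non-detection, and an unsampled one stays missing.

```python
import numpy as np

def build_detection_matrix(n_cells, n_years, observers, records):
    """Turn opportunistic records into detection/non-detection data.

    observers: dicts with the cells an observer covers and the years
               they were active (hypothetical structure).
    records:   (cell, year) pairs with at least one reported sign.
    """
    effort = np.zeros((n_cells, n_years), dtype=int)
    for obs in observers:
        for year in range(obs["first_year"], obs["last_year"] + 1):
            effort[obs["cells"], year] += 1  # one more active observer

    y = np.full((n_cells, n_years), np.nan)  # NaN = not sampled
    y[effort > 0] = 0.0                      # sampled, nothing reported
    for cell, year in records:
        # A record implies some effort, even if none was reconstructed.
        y[cell, year] = 1.0
    return effort, y

observers = [
    {"cells": [0, 1], "first_year": 0, "last_year": 2},
    {"cells": [1, 2], "first_year": 1, "last_year": 3},
]
records = [(1, 1), (2, 3)]
effort, y = build_detection_matrix(4, 4, observers, records)
print(effort)
print(y)
```

The matrix `y` has the shape an occupancy model expects, and `effort` can enter as a covariate on detection.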

Conclusions  

•  CSD are great!

•  But, we need to deal with bias if we want to extract meaningful ecological signal

Recommendations (at your own risk)

•  A myriad of approaches; no decision tree

•  Use simulations to explore effect of bias

•  If possible, incorporate detectability via occupancy / capture-recapture models

•  If not, the regression approach, with covariates to correct for observer bias, is an avenue to explore

Perspectives

•  The combination approach holds great promise

•  The (inhomogeneous) Poisson point process modeling framework seems to be a unifying framework
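As a rough illustration of why this framework can unify the approaches, one common formulation treats opportunistic records as a thinned Poisson point process. The notation below is an assumption of this sketch, not taken from the slides: $x(s)$ are ecological covariates at location $s$, $z(s)$ are observer-bias covariates, and $D$ is the study region.

```latex
% Species intensity (ecology) and thinning by observer bias
\lambda(s) = \exp\{\alpha + \beta^\top x(s)\}, \qquad
b(s) = \exp\{\gamma + \delta^\top z(s)\}, \qquad
\lambda_{\mathrm{obs}}(s) = \lambda(s)\, b(s)

% Log-likelihood of the observed record locations s_1, \dots, s_n
\ell(\alpha, \beta, \gamma, \delta)
  = \sum_{i=1}^{n} \log \lambda_{\mathrm{obs}}(s_i)
  - \int_{D} \lambda_{\mathrm{obs}}(s)\, \mathrm{d}s
```

Under this reading, the regression approach amounts to estimating $\delta$ and predicting with $z(s)$ held at a common level, while the combination approach brings in detection/non-detection data that depend on $\lambda(s)$ but not on $b(s)$.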

Perspectives

•  We should focus more on the citizens
   -  Fieldwork sheet for recording data on observers too?
   -  A protocol to collect/store data on both species and citizens

•  Technology will help

•  As well as social sciences

Thank you!

… and Barney Stinson from How I met your mother, Tom from the Minions, a random cute cat, Boromir from Lord of the Rings, James Montgomery Flagg (Uncle Sam), Karine and Wesley, Anne-Sophie and Julie from our boulet team, and the meme generators