Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Transcript of Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Page 1: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

A beginner’s view on mixed modelling

#crapstats

carly.lynsdale@sheffield.ac.uk

CarlyLynsdale

MyanmarElephant

Using R to investigate parasite infection in Asian Elephants

Page 2: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale
Page 3: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Photo: Hannah Mumby,

Page 4: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

EPG (eggs per gram of faeces)

Page 5: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale
Page 6: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Are parasite eggs evenly distributed throughout elephant faeces?

Page 7: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

WITHIN BOLUS

119 hosts, 4 samples per host

1 elephant -> 2 samples (C + E) -> 1 repeat of each

[Diagram: one bolus, sampled at the centre (C) and at the edge (E)]

Page 8: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

BETWEEN BOLUS

20 hosts, 6 samples per host

1 elephant -> 3 boluses sampled from each -> centre and edge from each bolus

[Diagram: three boluses, each sampled at the centre (C) and at the edge (E)]

Page 9: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Variables

Between

• Elephant (ID)

• Bolus (1st, 3rd, Last)

• Sample (C + E)

• EPG

Within

• Elephant (ID)

• Sample (C + E)

• EPG

Key: Continuous, Categorical (Factor)
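The two sampling designs and the variable list above can be laid out as data-frame skeletons. This is only a sketch in base R: the `rep` column, the level labels and the empty `epg` column are assumptions made for illustration, not the author's actual data.

```r
# Within-bolus design: 119 hosts, centre (C) and edge (E), each taken twice.
within_design <- expand.grid(
  rep    = 1:2,              # hypothetical repeat index
  sample = c("C", "E"),      # sampling location within the bolus
  id     = seq_len(119)      # elephant ID
)

# Between-bolus design: 20 hosts, 3 boluses each, centre and edge from each bolus.
between_design <- expand.grid(
  sample  = c("C", "E"),
  bolusno = c("1st", "3rd", "Last"),
  id      = seq_len(20)
)

# Categorical variables are stored as factors; EPG is numeric.
within_design$id   <- factor(within_design$id)
between_design$id  <- factor(between_design$id)
within_design$epg  <- NA_real_   # placeholder, to be filled with real counts
between_design$epg <- NA_real_

# Check the number of rows contributed by each host.
table(table(within_design$id))
table(table(between_design$id))
```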

Page 10: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale
Page 11: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Why Generalised Linear Mixed Effects Models (GLMMs)?

The effect of sample location (independent) on EPG (dependent), with non-normal data and accounting for the effect of elephant age, sex and camp.

Page 12: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

GLMMs deal with potential pseudoreplication by including (fixed and) random effects.

Page 13: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

i.e. they account for repeated measures.
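To see why repeated samples from one elephant cannot be treated as independent, the sketch below simulates counts with a host-level effect and checks how closely two samples from the same host agree. Every number in it (baseline, spread of host effects, seed) is invented for illustration.

```r
set.seed(42)
n_hosts <- 119

# Each simulated host gets its own baseline on the log scale.
host_effect <- rnorm(n_hosts, mean = 0, sd = 1)

# Four samples per host: centre/edge, two repeats of each.
sim <- expand.grid(rep = 1:2, sample = c("C", "E"), id = seq_len(n_hosts))
sim$count <- rpois(nrow(sim), lambda = exp(2 + host_effect[sim$id]))

# Pair up the first centre and the first edge sample of every host.
centre <- sim$count[sim$sample == "C" & sim$rep == 1]
edge   <- sim$count[sim$sample == "E" & sim$rep == 1]

# A strong positive correlation means the rows are not independent:
# the data hold 119 independent units (hosts), not 476.
same_host_cor <- cor(centre, edge)
same_host_cor
```

Treating all 476 rows as independent would overstate the sample size; a random intercept per host is one way to account for this.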

Page 14: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Package for GLMMs in R

library(lme4) # Linear Mixed Effects v4

model1 <- glmer(y ~ xf + (1|xr), family = poisson(link = "log"), data = dframe1)
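A sketch of how the template on this slide might be run end to end. Here `y` is the response, `xf` a fixed effect and `xr` the grouping factor for the random intercept; the data frame, its values and the effect sizes are simulated and purely hypothetical.

```r
library(lme4)

set.seed(1)
n_groups <- 30

dframe1 <- data.frame(
  xr = factor(rep(seq_len(n_groups), each = 4)),        # grouping factor (random)
  xf = factor(rep(c("a", "b"), times = n_groups * 2))   # two-level fixed effect
)

# Simulated group-level deviations and Poisson counts.
group_effect <- rnorm(n_groups, sd = 0.5)
dframe1$y <- rpois(
  nrow(dframe1),
  lambda = exp(1 + 0.3 * (dframe1$xf == "b") + group_effect[as.integer(dframe1$xr)])
)

# Random-intercept Poisson GLMM with a log link.
model1 <- glmer(y ~ xf + (1 | xr), family = poisson(link = "log"), data = dframe1)

fixef(model1)    # fixed-effect estimates on the log scale
VarCorr(model1)  # estimated variance of the random intercept
```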

Page 15: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Others…

library(MCMCglmm) = Bayesian GLMMs fitted by Markov chain Monte Carlo

library(nlme) = Linear and Nonlinear Mixed Effects

library(asreml) = ASReml-R. ASReml-R is available on request for research/teaching, for users with an academic email address. Register online at:

http://www.vsni.co.uk/software/free-to-use/teaching/asreml-teaching/registration

Page 16: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

So what does my code look like?...

Within:

modelwithin <- glmer.nb(sqrt(epg) ~ sample1 + ageclass + camp + (1|id), control = glmerControl(optimizer = "bobyqa"), data = within)

Between:

modelbetween <- glmer.nb(sqrt(epg) ~ bolusno1 + sample + ageclass + sex + mothcol + (1|id1), data = bween, control = glmerControl(optimizer = "bobyqa"))

Page 17: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Page 18: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

A typical Negative Binomial Distribution…

[Figure: No. of Hosts (y-axis) against EPG (x-axis)]
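The shape on this slide can be reproduced with simulated counts. The sketch below compares Poisson and negative binomial draws with the same mean; the mean and dispersion values are arbitrary choices, not estimates from the elephant data.

```r
set.seed(7)
n_hosts  <- 1000
mean_epg <- 50

poisson_counts <- rpois(n_hosts, lambda = mean_epg)

# `size` is the dispersion parameter: a smaller size means more aggregation.
negbin_counts <- rnbinom(n_hosts, mu = mean_epg, size = 0.8)

# Poisson: the variance is close to the mean.
# Negative binomial: variance = mu + mu^2 / size, far larger than the mean.
c(mean = mean(poisson_counts), variance = var(poisson_counts))
c(mean = mean(negbin_counts),  variance = var(negbin_counts))

# Most hosts have low counts and a few have very high ones.
hist(negbin_counts, breaks = 40, xlab = "EPG", ylab = "No. of hosts", main = "")
```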

Page 19: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Page 20: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale
Page 21: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

So what does my code look like?...

Within:

modelwithin <- glmer.nb(sqrt(epg) ~ sample1 + ageclass + sex + camp + (1|id), control = glmerControl(optimizer = "bobyqa"), data = within)

Between:

modelbetween <- glmer.nb(sqrt(epg) ~ bolusno1 + sample + ageclass + sex + camp + (1|id1), control = glmerControl(optimizer = "bobyqa"), data = bween)

Page 22: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Page 23: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

http://stackoverflow.com/questions/21344555/convergence-error-for-development-version-of-lme4

control=glmerControl(optimizer = "bobyqa")
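In `glmerControl()` the `optimizer` argument selects the optimiser, while `optCtrl` carries options that are handed on to that optimiser. A short sketch of the distinction; the `maxfun` value is an arbitrary illustration and does not come from the talk.

```r
library(lme4)

# Choose the optimiser with `optimizer`; tune it through `optCtrl`.
ctrl <- glmerControl(
  optimizer = "bobyqa",
  optCtrl   = list(maxfun = 100000)  # allow more function evaluations
)

ctrl$optimizer  # which optimiser will be used
ctrl$optCtrl    # options passed on to that optimiser

# The control object is then supplied to the fitting function, e.g.
# glmer(y ~ xf + (1 | xr), family = poisson, data = dframe1, control = ctrl)
```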

Page 24: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Page 25: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

model2 <- glmer.nb(sqrt(epg) ~ relevel(bolusno1, 2) + sample + ageclass + sex + mothcol + (1|id1), data = bween, control = glmerControl(optimizer = "bobyqa"))

Reference level <- levels compared against it

Bolus 1 <- Bolus 3 & Bolus 5

Bolus 3 <- Bolus 1 & Bolus 5

Bolus 5 <- Bolus 1 & Bolus 3
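What `relevel()` does can be checked on a small factor in base R. The level labels "1", "3" and "5" are assumed from the slide; the real coding of `bolusno1` is not shown in the talk.

```r
# A toy bolus factor; the first level is the reference by default.
bolus <- factor(c("1", "3", "5", "1", "3", "5"))
levels(bolus)

# Move another level to the front, either by name or by position.
bolus_ref3 <- relevel(bolus, "3")  # by level name
bolus_ref2 <- relevel(bolus, 2)    # by position, as on the slide
levels(bolus_ref3)

# In the dummy coding, the reference level is the one without its own column,
# so the remaining coefficients are contrasts against it.
colnames(model.matrix(~ bolus_ref3))
```

Refitting the model once per reference level gives each pairwise comparison between boluses.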

Page 26: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Within (Centre v Edge)

modelwithin <- glmer.nb(sqrt(epg) ~ sample1 + ageclass + sex + camp + (1|id), control = glmerControl(optimizer = "bobyqa"), data = within)

modelwithin2 <- glmer.nb(sqrt(epg) ~ ageclass + sex + camp + (1|id), control = glmerControl(optimizer = "bobyqa"), data = within)

anova(modelwithin, modelwithin2)
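The comparison on this slide is a likelihood ratio test: the two models differ only in the sample-location term, so `anova()` tests whether centre and edge differ. Below is a self-contained sketch on simulated data. The data and effect sizes are invented, the ageclass, sex and camp terms are left out for brevity, and raw counts are used as the response, so this is not the author's analysis.

```r
library(lme4)

set.seed(2015)
n_hosts <- 60

# Two repeats of a centre and an edge sample for every simulated host.
sim <- expand.grid(rep = 1:2, sample1 = c("C", "E"), id = seq_len(n_hosts))
sim$id <- factor(sim$id)
host_effect <- rnorm(n_hosts, sd = 0.7)

# Overdispersed counts with no true centre/edge difference.
sim$epg <- rnbinom(
  nrow(sim),
  mu   = exp(3 + host_effect[as.integer(sim$id)]),
  size = 2
)

ctrl <- glmerControl(optimizer = "bobyqa")

# Full model includes sample location; the reduced model drops it.
full    <- glmer.nb(epg ~ sample1 + (1 | id), data = sim, control = ctrl)
reduced <- glmer.nb(epg ~ 1 + (1 | id),       data = sim, control = ctrl)

# Likelihood ratio test of the dropped term.
comparison <- anova(full, reduced)
comparison
```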

Page 27: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Between (1st Bolus v 3rd Bolus v Last Bolus)

modelbween <- glmer.nb(sqrt(epg) ~ bolusno1 + sample + ageclass + sex + camp + (1|id1), data = bween, control = glmerControl(optimizer = "bobyqa"))

modelbween2 <- glmer.nb(sqrt(epg) ~ sample + ageclass + sex + camp + (1|id1), data = bween, control = glmerControl(optimizer = "bobyqa"))

Page 28: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

My Final Tips…

• Sniff out free courses – NERC funded students / Sheffield APS R course…

• Keep your data simple!

• Triple check everything (on different days), at least in the beginning…

• Use Word / Excel / PowerPoint to keep a summary of your outputs / graphs for when you write up.

• Name models, exports and files properly from the start.

Page 29: Sheffield R Jan 2015 - Using R to investigate parasite infections in Asian elephants, Carly Lynsdale

Web Links

• http://www.r-bloggers.com/ - blog site

• Facebook R-space group

• http://www.nerc.ac.uk/skills/postgrad/currentstudents/latestopportunities/ - NERC free courses

• https://www.coursera.org/ - free online courses

• http://cran.r-project.org/doc/manuals/r-release/R-intro.html - R’s own introductory manual

• http://www.statmethods.net/interface/help.html - nice stats

• https://github.com/ - code hosting and sharing site