Post on 07-Aug-2015
A beginner’s view on mixed modelling
#crapstats
carly.lynsdale@sheffield.ac.uk
CarlyLynsdale
MyanmarElephant
Using R to investigate parasite infection in Asian Elephants
Photo: Hannah Mumby,
EPG = eggs per gram (of faeces)
Are parasite eggs evenly distributed throughout elephant faeces?
WITHIN BOLUS
119 hosts, 4 samples per host
1 elephant -> 2 samples, centre (C) + edge (E) -> 1 repeat of each
BETWEEN BOLUS
20 hosts, 6 samples per host
1 elephant -> 3 boluses sampled from each -> centre (C) and edge (E) from each bolus
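Variables
Between
• Elephant (ID): categorical (factor)
• Bolus (1st, 3rd, Last): categorical (factor)
• Sample (C + E): categorical (factor)
• EPG: continuous
Within
• Elephant (ID): categorical (factor)
• Sample (C + E): categorical (factor)
• EPG: continuous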
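To make the between-bolus design concrete, here is a minimal sketch of how such a data frame could be laid out in R, one row per sample. The column names (id, bolus, sample, epg) and the elephant labels are illustrative assumptions, not the names used in the real data set.

```r
# Hypothetical layout of the between-bolus data: one row per sample.
# Column names and elephant labels are illustrative, not the author's.
between_design <- expand.grid(
  sample = factor(c("C", "E")),                      # centre or edge of the bolus
  bolus  = factor(c("1st", "3rd", "Last"),
                  levels = c("1st", "3rd", "Last")), # keep the sampling order
  id     = factor(sprintf("elephant_%02d", 1:20))    # 20 hosts
)
between_design$epg <- NA_real_   # continuous response, to be filled from lab counts

nrow(between_design)             # 20 hosts x 3 boluses x 2 locations = 120 rows
table(between_design$id)[1:3]    # 6 samples for each elephant
str(between_design)              # three factors plus one numeric column
```

Storing ID, bolus and sample as factors from the start means R treats them as categorical when they go into a model formula.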
Why Generalised Linear Mixed Effects Models (GLMMs)?
The effect of sample location (independent variable) on EPG (dependent variable), with non-normally distributed data and accounting for the effects of elephant age, sex and camp.
GLMMs deal with potential pseudoreplication by including random effects alongside fixed effects,
i.e. they account for repeated measures (here, several samples from the same elephant).
Packages for GLMMs in R
library(lme4) # linear mixed-effects models, built on S4 classes
# xf = fixed effect, xr = random effect (grouping factor)
model1 <- glmer(y ~ xf + (1|xr), family = poisson(link = "log"), data = dframe1)
Others…
library(MCMCglmm) # Bayesian GLMMs fitted by Markov chain Monte Carlo
library(asreml) # ASREML-R: available on request for research/teaching, for users with an academic email address. Register online at:
http://www.vsni.co.uk/software/free-to-use/teaching/asreml-teaching/registration
library(nlme) # Linear and Nonlinear Mixed Effects models
So what does my code look like?...
Within:
modelwithin <- glmer.nb(sqrt(epg) ~ sample1 + ageclass + camp + (1|id), control = glmerControl(optimizer = "bobyqa"), data = within)
Between:
modelbetween <- glmer.nb(sqrt(epg) ~ bolusno1 + sample + ageclass + sex + mothcol + (1|id1), control = glmerControl(optimizer = "bobyqa"), data = bween)
A typical Negative Binomial Distribution….
[Figure: number of hosts (y-axis) against EPG (x-axis)]
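For anyone who wants to see that shape for themselves, the sketch below simulates counts from a negative binomial distribution in base R. The values of mu and size are made-up assumptions chosen for illustration; they are not estimates from the elephant data.

```r
# Simulated counts only: mu and size are illustrative assumptions,
# not estimates from the elephant data.
set.seed(1)
n_hosts <- 119
epg_sim <- rnbinom(n_hosts, mu = 200, size = 0.5)  # small 'size' = strong aggregation

mean(epg_sim)
var(epg_sim)   # much larger than the mean: overdispersed relative to a Poisson

# Theoretical variance of a negative binomial is mu + mu^2 / size
200 + 200^2 / 0.5

# Most hosts have low counts and a few have very high counts
hist(epg_sim, breaks = 30, xlab = "EPG", ylab = "No. of hosts",
     main = "Simulated negative binomial counts")
```

A variance far above the mean is the usual reason for reaching for a negative binomial model such as glmer.nb rather than a Poisson one.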
So what does my code look like?...
Within:
modelwithin <- glmer.nb(sqrt(epg) ~ sample1 + ageclass + sex + camp + (1|id), control = glmerControl(optimizer = "bobyqa"), data = within)
Between:
modelbetween <- glmer.nb(sqrt(epg) ~ bolusno1 + sample + ageclass + sex + camp + (1|id1), control = glmerControl(optimizer = "bobyqa"), data = bween)
Convergence errors? Try the bobyqa optimizer (see http://stackoverflow.com/questions/21344555/convergence-error-for-development-version-of-lme4):
control = glmerControl(optimizer = "bobyqa")
model2 <- glmer.nb(sqrt(epg) ~ relevel(bolusno1, 2) + sample + ageclass + sex + mothcol + (1|id1), control = glmerControl(optimizer = "bobyqa"), data = bween)
Reference bolus <- boluses it is compared against:
Bolus 1 <- Bolus 3 & Bolus 5
Bolus 3 <- Bolus 1 & Bolus 5
Bolus 5 <- Bolus 1 & Bolus 3
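What relevel() does is easiest to see on a toy factor. The sketch below uses a small made-up factor standing in for bolusno1; the level names "1", "3" and "5" are assumed for illustration.

```r
# Toy factor standing in for bolusno1; the level names are illustrative.
bolusno1 <- factor(c("1", "3", "5", "1", "3", "5"))
levels(bolusno1)                            # "1" "3" "5": bolus 1 is the reference

bolus_ref3 <- relevel(bolusno1, 2)          # by position, as in model2
levels(bolus_ref3)                          # "3" "1" "5": bolus 3 is now the reference

bolus_ref5 <- relevel(bolusno1, ref = "5")  # by name, which is easier to read back
levels(bolus_ref5)                          # "5" "1" "3"
```

Only the order of the levels changes, not the data. The model output then reports each remaining bolus as a contrast against whichever level comes first.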
Within (Centre v Edge)
modelwithin <- glmer.nb(sqrt(epg) ~ sample1 + ageclass + sex + camp + (1|id), control = glmerControl(optimizer = "bobyqa"), data = within)
modelwithin2 <- glmer.nb(sqrt(epg) ~ ageclass + sex + camp + (1|id), control = glmerControl(optimizer = "bobyqa"), data = within)
anova(modelwithin, modelwithin2)
Between (1st Bolus v 3rd Bolus v Last Bolus)
modelbween <- glmer.nb(sqrt(epg) ~ bolusno1 + sample + ageclass + sex + camp + (1|id1), control = glmerControl(optimizer = "bobyqa"), data = bween)
modelbween2 <- glmer.nb(sqrt(epg) ~ sample + ageclass + sex + camp + (1|id1), control = glmerControl(optimizer = "bobyqa"), data = bween)
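Comparing a model with and without one term using anova() amounts to a likelihood ratio test for that term. The sketch below shows the idea on simulated data only. To keep it quick it swaps in a Poisson glmer() for glmer.nb(), and the names d, count, m_full and m_reduced are hypothetical.

```r
library(lme4)

# Simulated data only: 20 elephants, centre and edge samples, 3 repeats each.
set.seed(42)
d <- expand.grid(rep = 1:3, sample = factor(c("C", "E")), id = factor(1:20))
elephant_effect <- rnorm(20, sd = 0.5)   # one random intercept per elephant
d$count <- rpois(nrow(d), lambda = exp(3 + elephant_effect[as.integer(d$id)]))

# Full model, and a reduced model without the term being tested.
# Poisson glmer() is a stand-in here for the glmer.nb() models in the slides.
m_full    <- glmer(count ~ sample + (1 | id), family = poisson, data = d)
m_reduced <- glmer(count ~ 1 + (1 | id), family = poisson, data = d)

# Likelihood ratio test: does dropping 'sample' make the fit significantly worse?
lrt <- anova(m_full, m_reduced)
lrt

# The same statistic by hand, from the two log-likelihoods (1 parameter dropped).
chisq <- as.numeric(2 * (logLik(m_full) - logLik(m_reduced)))
p_val <- pchisq(chisq, df = 1, lower.tail = FALSE)
```

In this simulation sample location has no true effect, so a large p-value is the expected outcome on most runs. The same pattern would apply to the between-bolus pair, comparing modelbween with modelbween2.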
My Final Tips……
• Sniff out free courses – NERC funded students / Sheffield APS R course…
• Keep your data simple!
• Triple check everything (on different days), at least in the beginning…
• Use Word / Excel / PowerPoint to keep a summary of your outputs / graphs for when you write up.
• Start naming models, exports, files properly from the start.
Web Links
• http://www.r-bloggers.com/ - blog site
• Facebook R-space group
• http://www.nerc.ac.uk/skills/postgrad/currentstudents/latestopportunities/ - NERC free courses
• https://www.coursera.org/ - free online courses
• http://cran.r-project.org/doc/manuals/r-release/R-intro.html - R’s own introductory manual
• http://www.statmethods.net/interface/help.html - nice stats
• https://github.com/ - online code hosting site