forked from glmmTMB/glmmTMB
-
Notifications
You must be signed in to change notification settings - Fork 0
/
owls.rmd
262 lines (215 loc) · 9.79 KB
/
owls.rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
---
title: "Appendix C: Compare zero-inflated mixed models across R packages"
author: "Mollie Brooks and Ben Bolker"
date: "`r Sys.Date()`"
bibliography: zi.bib
bst: model5-names.bst
output: pdf_document
---
```{r setup, include=FALSE, message=FALSE}
library(knitr)
knitr::opts_chunk$set(echo = TRUE, cache=TRUE)
```
In this appendix, we analyze counts of begging behavior by owl nestlings. This example previously appeared in @zuur_mixed_2009 and @bolker_strategies_2013 and was originally published by @roulin_nestling_2007. The response variable is the number of calls from chicks (`NCalls`) in a nest. Since this should directly scale with the number of chicks (i.e. brood size), `logBroodSize` is used as an offset term. Since nests were repeatedly measured, `Nest` is included as a random effect. Covariates of interest include the sex of the parent visiting the nest (`SexParent`), whether the chicks were satiatited or not (`FoodTreatment`), and the timing of the parent's arrival (`ArrivalTime`).
#Preliminaries
##Load packages
```{r libs,warning=FALSE, message=FALSE,cache=FALSE}
library(glmmTMB)
library(glmmADMB)
library(MCMCglmm)
library(brms)
library(INLA)
library(broom) #for tidy
library(plyr)
library(dplyr) #tidyverse
library(ggplot2); theme_set(theme_bw())
library(ggstance)#for position_dodgev
```
## Data organization and helper functions (hidden)
```{r data}
data(Owls)
Owls = plyr::rename(Owls, c(SiblingNegotiation="NCalls"))
Owls = transform(Owls, ArrivalTime=scale(ArrivalTime, center=TRUE, scale=FALSE))
```
```{r helper,echo=FALSE}
# time
tfun = function(...) unname(system.time(capture.output(...))["elapsed"])
abbfun = function(x) {
gsub("[()]","", # (Intercept) -> Intercept
gsub("(Sol\\.)*(trait|at.level\\(trait, 1\\):)*","", # simplify MCMCglmm labs
gsub("FoodTreatment","FT",x))) # shorten
}
tidy.brmsfit = function(x, effects=c("fixed"),...) {
ss <- summary(x)
tidy(ss$fixed) %>%
mutate(effect="fixed",group="fixed") %>%
select(effect,.rownames,Estimate,Est.Error,l.95..CI,u.95..CI,group) %>%
rename(term=.rownames,estimate=Estimate,std.error=Est.Error,
conf.low=l.95..CI,conf.high=u.95..CI)
}
# broom tidy method for class "inla"
tidy.inla <- function(x, effects=c("fixed"),...) {
ss <- summary(x)
tidy(ss$fixed) %>%
mutate(effect="fixed",group="fixed") %>%
select(effect,.rownames,mean,sd,X0.025quant,X0.975quant,group) %>%
rename(term=.rownames,estimate=mean,std.error=sd,
conf.low=X0.025quant,conf.high=X0.975quant)
}
```
# Constant zero-inflation
Here we fit the model with zero-inflation assumed to be constant across the data set, i.e. zero-inflation is independent of the predictor variables.
## glmmTMB
```{r tmbfit}
form = NCalls~(FoodTreatment + ArrivalTime) * SexParent +
offset(logBroodSize) + (1|Nest)
time.tmb = tfun(m1.tmb <<- glmmTMB(form,
ziformula=~1, data = Owls, family=poisson))
```
## glmmADMB
```{r admbfit}
time.admb = tfun(m1.admb <<- glmmadmb(form,
zeroInflation=TRUE, data = Owls, family="poisson"))
```
##MCMCglmm
Code for this example was copied from @bolker_strategies_2013; a more complete description appears in the [supplementary material](https://groups.nceas.ucsb.edu/non-linear-modeling/projects/owls/WRITEUP/owls.pdf) for that paper.
```{r mcmcglmmfit}
offvec = c(1,1,2,rep(1,5)) # 1=non-offset; 2=offset
fixef2 = NCalls~trait-1+ # intercept terms for both count and binary terms
# other fixed-effect terms only apply to count term
at.level(trait,1):logBroodSize+
at.level(trait,1):((FoodTreatment+ArrivalTime)*SexParent)
# residual variances independent for count and binary terms;
# fixed to 1 for binary term
# random-effects variances independent for count and binary terms;
# fixed very small (1e-6) for binary term
prior_overdisp = list(R=list(V=diag(c(1,1)),nu=0.002,fix=2),
G=list(list(V=diag(c(1,1e-6)),nu=0.002,fix=2)))
prior_overdisp_broodoff = c(prior_overdisp,
list(B=list(mu=c(0,1)[offvec],
V=diag(c(1e8,1e-6)[offvec]))))
time.mcmc=tfun(m1.mcmc <<- MCMCglmm(fixef2,
rcov=~idh(trait):units,
random=~idh(trait):Nest,
prior=prior_overdisp_broodoff,
data=Owls,
family="zipoisson",
verbose=FALSE))
```
##brms
```{r brmsfit}
time.brms = tfun(m1.brms <<- brm(form, data = Owls,
family="zero_inflated_poisson",
save_dso=TRUE))
```
```{r brmsfit2}
time.brms2 = tfun(m1.brms2 <<- update(m1.brms))
```
##INLA
```{r inlafit}
time.inla = tfun(m1.inla <<- inla(NCalls~(FoodTreatment + ArrivalTime) * SexParent +
f(Nest, model="iid"),
offset = logBroodSize,
family= "zeroinflatedpoisson1",
data=Owls))
```
#Comparing the results
## Timings
```{r times}
sort(c(glmmTMB=time.tmb,glmmADMB=time.admb,MCMCglmm=time.mcmc,brms=time.brms,
brms2=time.brms2,INLA=time.inla))
```
(Time is recorded in seconds.)
`glmmTMB` fit the model in less than 5 seconds. Other methods were slower, but `MCMCglmm` was in the same order of magnitude (`brms` and `brms2` are times including and excluding compilation time, respectively).
## Estimated fixed-effect coefficients
```{r arrange_coefs,echo=FALSE, cache=FALSE}
modList = list(glmmTMB=m1.tmb,
glmmADMB=m1.admb,
MCMCglmm=m1.mcmc,
brms=m1.brms,
INLA=m1.inla)
coefs = ldply(modList,tidy,effect="fixed",conf.int=TRUE,
.id="model") %>%
mutate(term=abbfun(term)) %>%
select(model,term,estimate,conf.low,conf.high) %>%
filter(!term %in% c("Intercept","NCalls","zi_NCalls","logBroodSize"))
```
```{r coefplot, echo=FALSE, fig.width=7, cache=FALSE}
(gg0 <- ggplot(coefs, aes(color=model,shape=model,y=term, x=estimate))+
labs(x="Regression estimate",y="")+
geom_pointrangeh(aes(xmax=conf.high, xmin=conf.low),
position = position_dodgev(height = 0.6))+
scale_colour_brewer(palette="Set1")+
scale_shape_manual(values=15:19))
```
*Figure C.1 -- Estimated fixed-effect coefficients* Estimates are from the same zero-inflated Poisson model fit using functions `glmmTMB`, `glmmadmb`, `MCMCglmm`, `brm`, and `inla`.
Because we ran `brms` with flat priors, the estimates are very close to the maximum likelihood estimates of `glmmTMB`. Maximum likelihood estimates from `glmmTMB` and `glmmADMB` differ slightly because `glmmADMB` uses some numerical tricks to increase robustness and these change the objective function by a small amount.
# Complex zero-inflation
Here we fit the model with zero-inflation depending on some of the predictor variables. We can no longer use `glmmADMB` and `INLA`.
## glmmTMB
```{r tmbfit_czi}
ziform = ~FoodTreatment+(1|Nest)
time.tmb_czi = tfun(m1.tmb_czi <<- glmmTMB(form,
ziformula=ziform, data = Owls, family=poisson))
```
## MCMCglmm
```{r MCMCglmm_czi}
offvec_czi = c(1,1,2,rep(1,6)) # 1=non-offset; 2=offset
fixef3 = NCalls~trait-1+ # intercept terms for both count and binary terms
# fixed-effect terms for count term
at.level(trait,1):logBroodSize+
at.level(trait,1):((FoodTreatment+ArrivalTime)*SexParent)+
# fixed-effect terms for binary term
at.level(trait,2):FoodTreatment
# residual variances independent for count and binary terms;
# fixed to 1 for binary term
# random-effects variances now allow estimated variance for binary term
# as well
prior_overdisp_czi = list(R=list(V=diag(c(1,1)),nu=0.002,fix=2),
G=list(list(V=diag(c(1,1)),nu=0.002)))
prior_overdisp_broodoff_czi = c(prior_overdisp_czi,
list(B=list(mu=c(0,1)[offvec_czi],
V=diag(c(1e8,1e-6)[offvec_czi]))))
time.mcmc_czi=tfun(m1.mcmc_czi <<- MCMCglmm(fixef3,
rcov=~idh(trait):units,
random=~idh(trait):Nest,
prior=prior_overdisp_broodoff_czi,
data=Owls,
family="zipoisson",
verbose=FALSE))
```
<!-- **to do**: don't know why we're getting the warning, we seem to be getting all of the fixed effects parameters we asked for? -->
## brms
```{r brmszifit}
time.brms_czi = tfun(m1.brms_czi <<- brm(brmsformula(form,zi=ziform),
data = Owls,
family="zero_inflated_poisson",
save_dso=TRUE))
```
```{r brmszifit2}
time.brms_czi2 = tfun(m1.brms_czi2 <<- update(m1.brms_czi))
```
## Comparison
Timings:
```{r times_czi}
sort(c(TMB=time.tmb_czi,MCMCglmm=time.mcmc_czi,brms=time.brms_czi,
brms2=time.brms2))
```
Coefficients:
```{r arrange_coefs_czi,echo=FALSE}
modList_czi = list(glmmTMB=m1.tmb_czi,
MCMCglmm=m1.mcmc_czi,
brms=m1.brms_czi)
coefs_czi = ldply(modList_czi,tidy,effect="fixed",conf.int=TRUE,
.id="model") %>%
mutate(term=abbfun(term)) %>%
select(model,term,estimate,conf.low,conf.high) %>%
filter(!term %in% c("Intercept","NCalls","zi_NCalls","logBroodSize",
"zi_Intercept","zi_FTSatiated",
"FTDeprived:at.level, 2"))
```
```{r coefplot_czi, fig.width=7, cache=FALSE, echo=FALSE}
gg0 %+% coefs_czi
```
*Figure C.2 -- Estimated fixed-effect coefficients* Estimates are from the same zero-inflated Poisson model with predictors on zero-inflation fit using functions `glmmTMB`, `MCMCglmm`, and `brm`.
# References