-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathdodoextinctiontime.Rmd
142 lines (106 loc) · 4.81 KB
/
dodoextinctiontime.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
---
title: "Estimating time of extinction using an optimal linear estimator"
author: "C. George Glen"
date: '2022-04-09'
output:
html_document: default
pdf_document: default
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
This analysis is recreating that made by [Roberts and Solow (2003)](https://www.nature.com/articles/426245a).
## Theory
Going off [Roberts and Solow (2003)](https://www.nature.com/articles/426245a), if we let $T_1 > T_2 > ... > T_n$ be the $k$ most recent sightings of a species, ordered from the most to least recent, then we can use these observations to estimate, $\hat\theta$, or the estimate time of extinction.
An important result is that of [Cooke (1980)](https://academic.oup.com/biomet/article-abstract/67/1/257/276536), who showed the joint distribution of the $k$ most recent sightings has an approximate "Weibull form", regardless of the distribution of the (unknown) complete sightings record. The parameter, $\theta$, is then optimal linear estimator for the extinction time.
The parameter, $\theta$, is computed as the weighted sum of the sightings
$$
\hat\theta = \sum^k_{i=1} a_i T_i
$$
Where $a_i$ is a vector of weights, computed as
$$
a_i = (e^T \Lambda^{-1} e)^{-1} \Lambda^{-1} e
$$
Where $e$ is a vector of $k$ 1's, $\Lambda$ is the symmetric $k \cdot k$ matrix with typical element equal to
$$
\hat\lambda_{ij} = \frac{\Gamma(2 \hat\upsilon + i) \Gamma(\hat\upsilon + j) }{\Gamma(\hat\upsilon + i)\Gamma(j)}, j \leq i
$$
And $\Gamma()$ is the standard Gamma function.
Finally, $\hat\upsilon$ is an estimate of the shape parameter for the joint Weibull distribution of the k most recent sighting times, and is computed as
$$
\hat\upsilon = \frac{1}{k-1}\sum_{i-1}^{k-2} log \frac{T_1 - T_k}{T_1 - T_{i+1}}
$$
We an also compute 95% confidence intervals as
$$
\hat\theta_{95\% \;CI}= (T_1 + \frac{T_1 - T_k}{S_L - 1},T_1 \frac{T_1 - T_k}{S_U - 1})
\\
S_L = \Big(\frac{-log ( 1-\frac{\alpha}{2}) }{k}\Big)^{-\hat\upsilon}
\\
S_U = \Big(\frac{-log (\frac{\alpha}{2}) }{k}\Big)^{-\hat\upsilon}
$$
## Example: Dodo
Let's now apply this to the Dodo.
The last observations for the dodo, given in [Roberts and Solow (2003)](https://www.nature.com/articles/426245a), are
```{r echo=T}
dodosightingtimes <- c(1662, 1638, 1631, 1628, 1628, 1611, 1607, 1602, 1601, 1598)
```
The optimal linear estimator, $\hat\theta$, and the associated 95% CI are computed as
```{r echo=T}
OLE <- function(data, alpha){
## sort the data and define a parameter for length (here k)
obs <- rev(sort(data))
k <- length(obs)
## ---------------------------------------------------------------
## compute v, e, lambda, and a
## ---------------------------------------------------------------
## estimate the shape parameter of the joint Weibull distribution
## 4th equation in Roberts and Solow 2003
v <- (1/(k-1)) * sum(log((obs[1] - obs[k])/(obs[1] - obs[2:(k-1)])))
## define a vector of k 1’s
e <- matrix(rep(1,k), ncol=1)
## Λ is the symmetric k x k matrix with typical element
lambda <- compute.lambda(obs, v)
## make the Λ matrix symmetric
lambda <- ifelse(lower.tri(lambda), lambda, t(lambda))
## vector of weights is given by: a = (e^t * Λ^-1 * e)^-1 * Λ^-1*e
a <- as.vector(solve(t(e) %*% solve(lambda) %*% e)) * solve(lambda) %*% e
## ---------------------------------------------------------------
## calculate confidence intervals
## ---------------------------------------------------------------
SL <- (-log(1-alpha/2)/length(obs))^-v
SU <- (-log(alpha/2)/length(obs))^-v
lowerCI <- max(obs) + ((max(obs)-min(obs))/(SL-1)) ## lower confidence interval
upperCI <- max(obs) + ((max(obs)-min(obs))/(SU-1)) ## upper confidence interval
## ---------------------------------------------------------------
## compute time of extinction
## ---------------------------------------------------------------
extincttime <- sum(t(a)%*%obs)
## ---------------------------------------------------------------
## return a dataframe
## ---------------------------------------------------------------
res <- data.frame(Estimate=extincttime, lowerCI=lowerCI, upperCI=upperCI)
##return the results
return(res)
}
```
And $\Lambda$ is computed using the function
```{r echo=T}
compute.lambda <- function(dates, v){
n <- length(dates)
lambda <- matrix(data=NA, nrow=n, ncol=n)
for(i in 1:n){
for(j in 1:n){
lambda[i,j] <- (gamma(2*v + i) * gamma(v + j))/(gamma(v + i) * gamma(j))
}
}
return(lambda)
}
```
```{r echo=T}
OLE(data=dodosightingtimes, alpha=0.05)
```
This is equal to the estimates of [Roberts and Solow (2003)](https://www.nature.com/articles/426245a): $$
Estimate = 1690\\
lowerCI = 1669\\
upperCI = 1799\\
$$