\name{influence.cca}
\alias{hatvalues.cca}
\alias{hatvalues.rda}
\alias{sigma.cca}
\alias{rstandard.cca}
\alias{rstudent.cca}
\alias{cooks.distance.cca}
\alias{SSD.cca}
\alias{vcov.cca}
\alias{qr.cca}
\alias{df.residual.cca}
\title{Linear Model Diagnostics for Constrained Ordination}
\description{
This set of functions extracts influence statistics and some other
linear model statistics directly from a constrained ordination result
object from \code{\link{cca}}, \code{\link{rda}},
\code{\link{capscale}} or \code{\link{dbrda}}. The constraints are
fitted as linear models, and these support functions return results
identical to those of the corresponding linear model
(\code{\link{lm}}) methods, so the documentation of those methods also
applies here. The main functions for normal usage are
leverage values (\code{\link{hatvalues}}), standardized residuals
(\code{\link{rstandard}}), studentized or leave-one-out residuals
(\code{\link{rstudent}}), and Cook's distance
(\code{\link{cooks.distance}}). In addition, \code{\link{vcov}}
returns the variance-covariance matrix of coefficients, and its
diagonal values are the variances of coefficients. Other functions are
mainly support functions for these, but they can be used directly.
}
\usage{
\method{hatvalues}{cca}(model, ...)
\method{rstandard}{cca}(model, type = c("response", "canoco"), ...)
\method{rstudent}{cca}(model, type = c("response", "canoco"), ...)
\method{cooks.distance}{cca}(model, type = c("response", "canoco"), ...)
\method{sigma}{cca}(object, type = c("response", "canoco"), ...)
\method{vcov}{cca}(object, type = "canoco", ...)
\method{SSD}{cca}(object, type = "canoco", ...)
\method{qr}{cca}(x, ...)
\method{df.residual}{cca}(object, ...)
}
\arguments{
\item{model, object, x}{A constrained ordination result object.}
\item{type}{Type of statistics used for extracting raw residuals and
residual standard deviation (\code{sigma}). Either
\code{"response"} for species data, or \code{"canoco"} for the
differences of WA and LC scores.}
\item{\dots}{Other arguments to functions (ignored).}
}
\details{
The \pkg{vegan} algorithm for constrained ordination uses a linear model
(or a weighted linear model in \code{\link{cca}}) to find the fitted
values of dependent community data, and constrained ordination is
based on this fitted response (Legendre & Legendre 2012). The
\code{\link{hatvalues}} give the leverage values of these constraints,
and the leverage is independent of the response data. Other influence
statistics (\code{\link{rstandard}}, \code{\link{rstudent}},
\code{\link{cooks.distance}}) are based on leverage, and on the raw
residuals and residual standard deviation (\code{\link{sigma}}). With
\code{type = "response"} the raw residuals are given by the
unconstrained component of the constrained ordination, and influence
statistics are a matrix with dimensions no. of observations times
no. of species. For \code{\link{cca}} the statistics are the same as
obtained from the \code{\link{lm}} model using Chi-square standardized
species data (see \code{\link{decostand}}) as dependent variable, and
row sums of community data as weights, and for \code{\link{rda}} the
\code{\link{lm}} model uses non-modified community data and no
weights.

The algorithm in the CANOCO software constrains the results during
iteration by performing a linear regression of weighted averages (WA)
scores on constraints and taking the fitted values of this regression
as linear combination (LC) scores (ter Braak 1984). The WA scores are
directly found from species scores, but LC scores are linear
combinations of constraints in the regression. With \code{type =
"canoco"} the raw residuals are the differences of WA and LC scores,
and the residual standard deviation (\code{\link{sigma}}) is taken to
be the axis sum of squared WA scores minus one. These quantities have
no relationship to the residual component of ordination, but they are
rather methodological artefacts of an algorithm that is not used in
\pkg{vegan}. The result is a matrix with dimensions no. of
observations times no. of constrained axes.

Function \code{\link{vcov}} returns the matrix of variances and
covariances of regression coefficients. The diagonal values of this
matrix are the variances, and their square roots give the standard
errors of regression coefficients. The function is based on
\code{\link{SSD}} that extracts the sum of squares and crossproducts
of residuals. The residuals are defined in the same way as in the
influence measures: with each \code{type} they have similar properties
and limitations, and they define the dimensions of the result matrix.
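
As a reference sketch for the unweighted case (as in
\code{\link{rda}}), the relations between these statistics can be
written with the usual textbook definitions of linear model
diagnostics. The notation is an assumption introduced here and is not
taken from the \pkg{vegan} code: \eqn{e_i} is the raw residual of
observation \eqn{i}, \eqn{h_i} its leverage, \eqn{s} the residual
standard deviation, and \eqn{p} the number of estimated coefficients
including the intercept. The standardized residual \eqn{r_i} and
Cook's distance \eqn{D_i} are then
\deqn{r_i = \frac{e_i}{s \sqrt{1 - h_i}}, \qquad D_i = \frac{r_i^2}{p} \cdot \frac{h_i}{1 - h_i}.}{r[i] = e[i] / (s * sqrt(1 - h[i])),   D[i] = (r[i]^2 / p) * h[i] / (1 - h[i]).}
The studentized residual has the same form as \eqn{r_i}, but replaces
\eqn{s} with the estimate obtained when observation \eqn{i} is left
out.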
}
\references{
Legendre, P. and Legendre, L. (2012) \emph{Numerical Ecology}. 3rd
English ed. Elsevier.

ter Braak, C.J.F. (1984--): CANOCO -- a FORTRAN program for
\emph{cano}nical \emph{c}ommunity \emph{o}rdination by [partial]
[detrended] [canonical] correspondence analysis, principal components
analysis and redundancy analysis. \emph{TNO Inst. of Applied Computer
Sci., Stat. Dept. Wageningen, The Netherlands}.
}
\note{
Function \code{\link{as.mlm}} casts an ordination object to a multiple
linear model of class \code{"mlm"} (see \code{\link{lm}}), and similar
statistics can be derived from that modified object as with this set
of functions. However, there are some problems in the \R{}
implementation of the further analysis of multiple linear model
objects. When the results differ, the current set of functions is more
likely to be correct. The use of \code{as.mlm} objects should be
avoided.
}
\author{Jari Oksanen}
\seealso{Corresponding \code{\link{lm}} methods and
\code{\link{as.mlm.cca}}. Function \code{\link{ordiresids}} provides
lattice graphics for residuals.}
\examples{
data(varespec, varechem)
mod <- cca(varespec ~ Al + P + K, varechem)
## leverage
hatvalues(mod)
plot(hatvalues(mod), type = "h")
## ordination plot with leverages: points with high leverage have
## similar LC and WA scores
plot(mod, type = "n")
ordispider(mod) # segment from LC to WA scores
points(mod, dis="si", cex=5*hatvalues(mod), pch=21, bg=2) # WA scores
text(mod, dis="bp", col=4)
## deviation and influence
head(rstandard(mod))
head(cooks.distance(mod))
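## a small sketch of how 'type' changes the shape of the result (see
## Details): observations times species with "response", observations
## times constrained axes with "canoco"
dim(rstandard(mod, type = "response"))
dim(rstandard(mod, type = "canoco"))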
## Influence measures from lm
y <- decostand(varespec, "chi.square") # needed in cca
y1 <- with(y, Cladstel) # take one species for lm
lmod1 <- lm(y1 ~ Al + P + K, varechem, weights = rowSums(varespec))
## numerically identical within 2e-15
all(abs(cooks.distance(lmod1) - cooks.distance(mod)[, "Cladstel"]) < 1e-8)
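## a sketch of the same comparison for leverage and residuals; using
## all.equal() with its default tolerance is a choice made here for
## illustration, and unname() drops the differing observation names
all.equal(unname(hatvalues(lmod1)), unname(hatvalues(mod)))
all.equal(unname(rstandard(lmod1)), unname(rstandard(mod)[, "Cladstel"]))
all.equal(unname(rstudent(lmod1)), unname(rstudent(mod)[, "Cladstel"]))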
## t-values of regression coefficients based on type = "canoco"
## residuals
coef(mod)
coef(mod)/sqrt(diag(vcov(mod, type = "canoco")))
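## a sketch of the corresponding comparison for rda, which Details
## describe as an unweighted lm on non-modified community data; the
## object names rmod, y2 and lmod2 are illustrative only
rmod <- rda(varespec ~ Al + P + K, varechem)
y2 <- varespec$Cladstel # take one species for lm
lmod2 <- lm(y2 ~ Al + P + K, varechem)
all.equal(unname(cooks.distance(lmod2)),
          unname(cooks.distance(rmod)[, "Cladstel"]))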
}
\keyword{ models }
\keyword{ multivariate }