How are envfit results created?
Question
I have a question regarding how to recreate the results from the envfit() function in the vegan package.
Here is an example of envfit() being used with an ordination and a set of environmental variables.
library(vegan)
data(varespec)
data(varechem)
ord <- metaMDS(varespec)
chem.envfit <- envfit(ord, varechem, choices = c(1,2), permutations = 999)
chem.scores.envfit <- as.data.frame(scores(chem.envfit, display = "vectors"))
chem.scores.envfit
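"The values you see in the table are the standardised coefficients of the linear regression used to project the vectors into the ordination. These are directions for arrows of unit length." (comment from "Plotted envfit vectors not matching NMDS scores")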
Also, from ?envfit:
The printed output of continuous variables (vectors) gives the direction cosines which are the coordinates of the heads of unit length vectors. In plot these are scaled by their correlation (square root of the column r2) so that weak predictors have shorter arrows than strong predictors. You can see the scaled relative lengths using command scores.
Could someone please show me explicitly what linear model is being run, what standardized coefficients are being used, and where cosine is being applied to create these values?
Recommended answer
I probably shouldn't have said "standardised" in that answer.
For each column (variable) in varechem and the first two axes of the ordination (choices = 1:2), the linear model is:
$$\widehat{\mathrm{env}}_j = \beta_1 \cdot \mathrm{scr1} + \beta_2 \cdot \mathrm{scr2}$$
where env_j is the $j$th variable in varechem, scr1 and scr2 are the axis scores on the first and second axes being considered (i.e. the plane defined by choices = 1:2, but this extends to higher dimensions), and the \beta are the regression coefficients for the pair of axis scores.
There's no intercept in this model because we (weighted) centre all the variables in varechem and the axis scores. The weights really only matter for CCA, capscale(), and DCA methods, as those are weighted models themselves.
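To see why centring makes the intercept unnecessary, here is a minimal base R sketch on made-up data. The names x1, x2 and y are hypothetical stand-ins for two axis scores and one environmental variable; nothing here comes from vegan.

```r
set.seed(42)

## hypothetical data: two "axis scores" and one "environmental variable"
x1 <- rnorm(20)
x2 <- rnorm(20)
y  <- 3 + 1.5 * x1 - 0.5 * x2 + rnorm(20, sd = 0.3)

## slopes from a model with an intercept, fitted to the raw data
with_int <- coef(lm(y ~ x1 + x2))[c("x1", "x2")]

## centre response and predictors, then fit without an intercept
xc1 <- x1 - mean(x1)
xc2 <- x2 - mean(x2)
yc  <- y - mean(y)
no_int <- coef(lm(yc ~ xc1 + xc2 - 1))

## compare the two sets of slopes
all.equal(unname(with_int), unname(no_int))
```

The slopes agree because centring absorbs the intercept, so only the direction information in the slopes is left to fit.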
The heads of the arrows in the space spanned by the axis scores are the coefficients of that model. We actually normalise them (which I misrepresented as "standardised" in that other reply) so that the arrows have unit length. These values (the NMDS1 and NMDS2 columns in the envfit output) are direction cosines in the sense of https://en.wikipedia.org/wiki/Direction_cosine.
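Written out for the two-axis case, the normalisation amounts to the following. The angle symbols $\theta_1$ and $\theta_2$ are notation introduced here for illustration, not taken from the envfit documentation.

$$
\left(\cos\theta_1,\ \cos\theta_2\right) = \frac{\left(\beta_1,\ \beta_2\right)}{\sqrt{\beta_1^2 + \beta_2^2}}
$$

Here $\theta_k$ is the angle between the arrow for a given variable and axis $k$, so the two reported values for each variable have squares that sum to 1.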
Here's a simplified walk-through of what we do when there are no weights involved and all the variables in env are numeric, as in your example. (Note that we don't actually do it this way, for efficiency reasons: see the code behind vectorfit() for the QR decomposition used if you really want the details.)
## extract the site scores for the axes we want, 1 and 2
## (display = "sites" asks explicitly for a plain matrix of site scores)
scrs <- scores(ord, display = "sites", choices = c(1,2))
## centre the scores (note not standardising them)
scrs <- as.data.frame(scale(scrs, scale = FALSE, center = TRUE))
## centre the environmental variables - keep as matrix
env <- scale(varechem, scale = FALSE, center = TRUE)
## fit the linear models with no intercept
mod <- lm(env ~ NMDS1 + NMDS2 - 1, data = scrs)
## extract the coefficients from the models
betas <- coef(mod)
## normalize coefs to unit length
## i.e. betas for a particular env var have sum of squares = 1
t(sweep(betas, 2L, sqrt(colSums(betas^2)), "/"))
The last line gives:
> t(sweep(betas, 2L, sqrt(colSums(betas^2)), "/"))
NMDS1 NMDS2
N -0.05731557 -0.9983561
P 0.61972792 0.7848167
K 0.76646744 0.6422832
Ca 0.68520442 0.7283508
Mg 0.63252973 0.7745361
S 0.19139498 0.9815131
Al -0.87159427 0.4902279
Fe -0.93600826 0.3519780
Mn 0.79870870 -0.6017179
Zn 0.61755690 0.7865262
Mo -0.90308490 0.4294621
Baresoil 0.92487118 -0.3802806
Humdepth 0.93282052 -0.3603413
pH -0.64797447 0.7616621
which replicates (except for showing more significant figures) the values returned by envfit() in this case:
> chem.envfit
***VECTORS
NMDS1 NMDS2 r2 Pr(>r)
N -0.05732 -0.99836 0.2536 0.045 *
P 0.61973 0.78482 0.1938 0.099 .
K 0.76647 0.64228 0.1809 0.095 .
Ca 0.68520 0.72835 0.4119 0.006 **
Mg 0.63253 0.77454 0.4270 0.003 **
S 0.19139 0.98151 0.1752 0.109
Al -0.87159 0.49023 0.5269 0.002 **
Fe -0.93601 0.35198 0.4450 0.002 **
Mn 0.79871 -0.60172 0.5231 0.002 **
Zn 0.61756 0.78653 0.1879 0.100 .
Mo -0.90308 0.42946 0.0609 0.545
Baresoil 0.92487 -0.38028 0.2508 0.061 .
Humdepth 0.93282 -0.36034 0.5201 0.001 ***
pH -0.64797 0.76166 0.2308 0.067 .
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Permutation: free
Number of permutations: 999
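The r2 column, and the scaled arrows that scores() returns, can be sketched in the same spirit. What follows is an illustrative sketch and not vegan's actual code: it assumes equal weights, uses base R's qr() for the least-squares fit, and takes r2 as the squared correlation between fitted and observed values. The object names X, P, fitted_env and scaled are made up for this example, and the comparison relies on the envfit object storing its results in fit$vectors$arrows and fit$vectors$r.

```r
library(vegan)

data(varespec)
data(varechem)

set.seed(1)                              # metaMDS uses random starts
ord <- metaMDS(varespec, trace = FALSE)
fit <- envfit(ord, varechem, choices = c(1, 2), permutations = 0)

## centred site scores (X) and centred environmental variables (P)
X <- scale(scores(ord, display = "sites", choices = c(1, 2)), scale = FALSE)
P <- scale(as.matrix(varechem), scale = FALSE)

## least-squares fit of every column of P on X via a QR decomposition
Q <- qr(X)
betas <- qr.coef(Q, P)                   # one column of coefficients per variable
fitted_env <- qr.fitted(Q, P)

## direction cosines: each variable's coefficient pair scaled to unit length
arrows <- t(sweep(betas, 2L, sqrt(colSums(betas^2)), "/"))

## r2: squared correlation between fitted and observed values
r2 <- diag(cor(fitted_env, P))^2

## arrows as used for plotting: unit vectors shrunk by sqrt(r2)
scaled <- arrows * sqrt(r2)

## compare against what envfit() stores and what scores() returns
all.equal(arrows, fit$vectors$arrows, check.attributes = FALSE)
all.equal(r2, fit$vectors$r, check.attributes = FALSE)
all.equal(scaled, scores(fit, display = "vectors"), check.attributes = FALSE)
```

Multiplying by sqrt(r2) is what makes weak predictors draw as shorter arrows than strong ones, as the ?envfit text quoted in the question describes.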