Interpretation of ordered and non-ordered factors vs. numerical predictors in model summary
Problem description
I have fitted a model where:
Y ~ A + A^2 + B + mixed.effect(C)
Y is continuous, A is continuous, and B actually refers to a DAY. B is currently stored as an ordered factor and looks like this:
Levels: 1 < 2 < 3 < 4 < 5 < 6 < 7 < 8 < 9 < 11 < 12
I can easily change the data type, but I'm not sure whether it is more appropriate to treat B as numeric, as a factor, or as an ordered factor. And when it is treated as numeric or as an ordered factor, I'm not quite sure how to interpret the output.
When treated as an ordered factor, summary(my.model) outputs something like this:
Linear mixed model fit by REML ['lmerMod']
Formula: Y ~ A + I(A^2) + B + (1 | mixed.effect.C)
Fixed effects:
Estimate Std. Error t value
(Intercept) 19.04821 0.40926 46.54
A -151.01643 7.19035 -21.00
I(A^2) 457.19856 31.77830 14.39
B.L -3.00811 0.29688 -10.13
B.Q -0.12105 0.24561 -0.49
B.C 0.35457 0.24650 1.44
B^4 0.09743 0.24111 0.40
B^5 -0.08119 0.22810 -0.36
B^6 0.19640 0.22377 0.88
B^7 0.02043 0.21016 0.10
B^8 -0.48931 0.20232 -2.42
B^9 -0.43027 0.17798 -2.42
B^10 -0.13234 0.15379 -0.86
What are L, Q, and C? I need to know the effect of each additional day (B) on the response (Y). How do I get this information from the output?
When I treat B as numeric (via as.numeric), I get something like this as output:
Fixed effects:
Estimate Std. Error t value
(Intercept) 20.79679 0.39906 52.11
A -152.29941 7.17939 -21.21
I(A^2) 461.89157 31.79899 14.53
B -0.27321 0.02391 -11.42
To get the effect of each additional day (B) on the response (Y), am I supposed to multiply the coefficient of B times B (the day number)? Not sure what to do with this output...
Recommended answer
This is not really a mixed-model specific question, but rather a general question about model parameterization in R.
Let's try a simple example.
set.seed(101)
d <- data.frame(x=sample(1:4,size=30,replace=TRUE))
d$y <- rnorm(30,1+2*d$x,sd=0.01)
x as numeric
This just does a linear regression: the x parameter denotes the change in y per unit of change in x; the intercept specifies the expected value of y at x=0.
coef(lm(y~x,d))
## (Intercept) x
## 0.9973078 2.0001922
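To connect this to the question of whether the coefficient needs to be multiplied by the day number: the slope already is the change in y for one additional unit of x. A minimal sketch, reusing the example data from above (the names m_num, p, slope_from_pred and slope_from_coef are illustrative; exact numbers may differ between R versions because the default sampling algorithm changed in R 3.6.0):

```r
# Recreate the example data
set.seed(101)
d <- data.frame(x = sample(1:4, size = 30, replace = TRUE))
d$y <- rnorm(30, 1 + 2 * d$x, sd = 0.01)

m_num <- lm(y ~ x, data = d)

# Predicted y at two x values that are one unit apart
p <- predict(m_num, newdata = data.frame(x = c(2, 3)))

# The difference between the two predictions equals the slope coefficient
slope_from_pred <- unname(diff(p))
slope_from_coef <- unname(coef(m_num)["x"])
all.equal(slope_from_pred, slope_from_coef)
```

Multiplying the coefficient by a number of days k gives the expected change in y over k days, not the per-day effect.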
x as (unordered/regular) factor
coef(lm(y~factor(x),d))
## (Intercept) factor(x)2 factor(x)3 factor(x)4
## 3.001627 1.991260 3.995619 5.999098
The intercept specifies the expected value of y in the baseline level of the factor (x=1); the other parameters specify the difference between the expected value of y at each other level of x and its expected value at the baseline level.
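As a quick check of this reading, the coefficients can be compared with the raw group means. This is only a sketch on the same simulated data; m_fac and group_means are illustrative names, and it assumes all four levels of x occur in the sample.

```r
# Recreate the example data
set.seed(101)
d <- data.frame(x = sample(1:4, size = 30, replace = TRUE))
d$y <- rnorm(30, 1 + 2 * d$x, sd = 0.01)

m_fac <- lm(y ~ factor(x), data = d)

# Mean of y within each level of x, as a plain numeric vector (levels 1..4)
group_means <- as.vector(tapply(d$y, d$x, mean))

# Intercept = mean of y in the baseline level (x = 1)
all.equal(unname(coef(m_fac)[1]), group_means[1])

# Remaining coefficients = each level's mean minus the baseline mean
all.equal(unname(coef(m_fac)[-1]), group_means[-1] - group_means[1])
```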
x as ordered factor
coef(lm(y~ordered(x),d))
## (Intercept) ordered(x).L ordered(x).Q ordered(x).C
## 5.998121421 4.472505514 0.006109021 -0.003125958
Now the intercept specifies the value of y at the mean factor level (halfway between 2 and 3); the L (linear) parameter gives a measure of the linear trend (its particular value depends on how R scales the polynomial contrasts, so it is not the change in y per level); Q and C specify quadratic and cubic terms (which are close to zero in this case because the pattern is linear). If there were more levels, the higher-order contrasts would be labeled ^4, ^5, ...
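The particular value of the L parameter follows from the scaling used by contr.poly(), which builds orthonormal polynomial contrasts. A hedged sketch of how the linear coefficient can be converted back to a change per level (m_ord, cp, step_L, slope_per_level and level_means are illustrative names; this conversion captures only the linear term's contribution, so it is meaningful mainly when the higher-order terms are near zero, as they are here):

```r
# Recreate the example data
set.seed(101)
d <- data.frame(x = sample(1:4, size = 30, replace = TRUE))
d$y <- rnorm(30, 1 + 2 * d$x, sd = 0.01)

m_ord <- lm(y ~ ordered(x), data = d)

# Orthonormal polynomial contrasts for 4 levels; columns are .L, .Q, .C
cp <- contr.poly(4)

# Spacing between adjacent levels in the linear column: 2 / sqrt(20)
step_L <- diff(cp[, ".L"])[1]

# Contribution of the linear term to the change in y per level
slope_per_level <- unname(coef(m_ord)["ordered(x).L"]) * step_L
slope_per_level

# The fitted mean of each level can be rebuilt from the contrast matrix
level_means <- drop(cbind(1, cp) %*% coef(m_ord))
level_means
```

With the coefficients shown above, 4.4725 times 2/sqrt(20) is close to 2, which matches the slope from the numeric fit. Note that polynomial contrasts treat the levels as equally spaced.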
coef(lm(y~factor(x),d,contrasts=list(`factor(x)`=MASS::contr.sdif)))
## (Intercept) factor(x)2-1 factor(x)3-2 factor(x)4-3
## 5.998121 1.991260 2.004359 2.003478
This contrast specifies the parameters as the differences between successive levels, which are all a constant value of (approximately) 2.
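Since the question asks for the effect of each additional day, the successive-difference parameters can be checked against the differences between adjacent group means. This is a sketch on the simulated data, assuming the MASS package is installed; m_sdif and group_means are illustrative names.

```r
# Recreate the example data
set.seed(101)
d <- data.frame(x = sample(1:4, size = 30, replace = TRUE))
d$y <- rnorm(30, 1 + 2 * d$x, sd = 0.01)

m_sdif <- lm(y ~ factor(x), data = d,
             contrasts = list(`factor(x)` = MASS::contr.sdif))

# Mean of y within each level of x, as a plain numeric vector (levels 1..4)
group_means <- as.vector(tapply(d$y, d$x, mean))

# Each non-intercept coefficient is the difference between adjacent level means
all.equal(unname(coef(m_sdif)[-1]), diff(group_means))

# The intercept is the unweighted mean of the level means
all.equal(unname(coef(m_sdif)[1]), mean(group_means))
```

Unlike the numeric fit, this parameterization does not force the day-to-day change to be constant; each step gets its own estimate.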