按组拟合线性模型/方差分析 [英] Fitting linear model / ANOVA by group
问题描述
我试图在R中运行anova()
并遇到一些困难.到目前为止,这是我为帮助阐明我的问题所做的工作.
这是我到目前为止的数据str()
.
str(mhw)
'data.frame': 500 obs. of 5 variables:
$ r : int 1 2 3 4 5 6 7 8 9 10 ...
$ c : int 1 1 1 1 1 1 1 1 1 1 ...
$ grain: num 3.63 4.07 4.51 3.9 3.63 3.16 3.18 3.42 3.97 3.4 ...
$ straw: num 6.37 6.24 7.05 6.91 5.93 5.59 5.32 5.52 6.03 5.66 ...
$ Quad : Factor w/ 4 levels "NE","NW","SE",..: 2 2 2 2 2 2 2 2 2 2 ...
r列是一个数字值,指示单个图位于该字段的哪一行
c列是一个数值,指示单个图位于哪一列
Quad列对应于每个地块所在字段中的地理位置
Quad <- ifelse(mhw$c > 13 & mhw$r < 11, "NE",ifelse(mhw$c < 13 & mhw$r < 11,"NW", ifelse(mhw$c < 13 & mhw$r >= 11, "SW","SE")))
mhw <- cbind(mhw, Quad)
我符合以下条件lm()
nov.model <-lm(mhw$grain ~ mhw$straw)
anova(nov.model)
这是整个字段的anova()
,用于测试数据集中每个图的谷物产量与稻草产量之间的关系.
我的麻烦是我想对数据的Quad列运行一个单独的anova()
,以测试每个象限中的谷物产量和稻草产量.
也许with()
可能会解决该问题.我以前从未使用过它,目前正在学习R.任何帮助将不胜感激.
我认为您正在R中寻找by
工具.
fit <- with(mhw, by(mhw, Quad, function (dat) lm(grain ~ straw, data = dat)))
由于在Quad
中有4个级别,因此在fit
中最终得到4个线性模型,即fit
是长度为4的"by"类对象(列表"的一种). /p>
要获取每种模型的系数,可以使用
sapply(fit, coef)
要生成模型摘要,请使用
lapply(fit, summary)
要导出方差分析表,请使用
lapply(fit, anova)
作为可复制的示例,我以?by
为例:
tmp <- with(warpbreaks,
by(warpbreaks, tension,
function(x) lm(breaks ~ wool, data = x)))
class(tmp)
# [1] "by"
mode(tmp)
# [1] "list"
sapply(tmp, coef)
# L M H
#(Intercept) 44.55556 24.000000 24.555556
#woolB -16.33333 4.777778 -5.777778
lapply(tmp, anova)
#$L
#Analysis of Variance Table
#
#Response: breaks
# Df Sum Sq Mean Sq F value Pr(>F)
#wool 1 1200.5 1200.50 5.6531 0.03023 *
#Residuals 16 3397.8 212.36
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
#$M
#Analysis of Variance Table
#
#Response: breaks
# Df Sum Sq Mean Sq F value Pr(>F)
#wool 1 102.72 102.722 1.2531 0.2795
#Residuals 16 1311.56 81.972
#
#$H
#Analysis of Variance Table
#
#Response: breaks
# Df Sum Sq Mean Sq F value Pr(>F)
#wool 1 150.22 150.222 2.3205 0.1472
#Residuals 16 1035.78 64.736
我知道此选项,但并不熟悉.感谢 @Roland 为上述可重现示例提供代码:
library(nlme)
lapply(lmList(breaks ~ wool | tension, data = warpbreaks), anova)
对于您的数据,我认为应该是
fit <- lmList(grain ~ straw | Quad, data = mhw)
lapply(fit, anova)
您不需要安装nlme
;它带有R作为推荐软件包之一.
I'm trying to run anova()
in R and running into some difficulty. This is what I've done up to now to help shed some light on my question.
Here is the str()
of my data to this point.
str(mhw)
'data.frame': 500 obs. of 5 variables:
$ r : int 1 2 3 4 5 6 7 8 9 10 ...
$ c : int 1 1 1 1 1 1 1 1 1 1 ...
$ grain: num 3.63 4.07 4.51 3.9 3.63 3.16 3.18 3.42 3.97 3.4 ...
$ straw: num 6.37 6.24 7.05 6.91 5.93 5.59 5.32 5.52 6.03 5.66 ...
$ Quad : Factor w/ 4 levels "NE","NW","SE",..: 2 2 2 2 2 2 2 2 2 2 ...
Column r is a numerical value indicating which row in the field an individual plot resides
Column c is a numerical value indicating which column an individual plot resides
Column Quad corresponds to the geographical location in the field to which each plot resides
Quad <- ifelse(mhw$c > 13 & mhw$r < 11, "NE",ifelse(mhw$c < 13 & mhw$r < 11,"NW", ifelse(mhw$c < 13 & mhw$r >= 11, "SW","SE")))
mhw <- cbind(mhw, Quad)
I have fit a lm()
as follows
nov.model <-lm(mhw$grain ~ mhw$straw)
anova(nov.model)
This is an anova()
for the entire field, which is testing grain yield against straw yield for each plot in the dataset.
My trouble is that I want to run an individual anova()
for the Quad column of my data to test grain yield and straw yield in each quadrant.
perhaps a with()
might fix that. I have never used it before and I am in the process of learning R currently. Any help would be greatly appreciated.
I think you are looking for by
facility in R.
fit <- with(mhw, by(mhw, Quad, function (dat) lm(grain ~ straw, data = dat)))
Since you have 4 levels in Quad
, you end up with 4 linear models in fit
, i.e., fit
is a "by" class object (a type of "list") of length 4.
To get coefficient for each model, you can use
sapply(fit, coef)
To produce model summary, use
lapply(fit, summary)
To export ANOVA table, use
lapply(fit, anova)
As a reproducible example, I am taking the example from ?by
:
tmp <- with(warpbreaks,
by(warpbreaks, tension,
function(x) lm(breaks ~ wool, data = x)))
class(tmp)
# [1] "by"
mode(tmp)
# [1] "list"
sapply(tmp, coef)
# L M H
#(Intercept) 44.55556 24.000000 24.555556
#woolB -16.33333 4.777778 -5.777778
lapply(tmp, anova)
#$L
#Analysis of Variance Table
#
#Response: breaks
# Df Sum Sq Mean Sq F value Pr(>F)
#wool 1 1200.5 1200.50 5.6531 0.03023 *
#Residuals 16 3397.8 212.36
#---
#Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
#$M
#Analysis of Variance Table
#
#Response: breaks
# Df Sum Sq Mean Sq F value Pr(>F)
#wool 1 102.72 102.722 1.2531 0.2795
#Residuals 16 1311.56 81.972
#
#$H
#Analysis of Variance Table
#
#Response: breaks
# Df Sum Sq Mean Sq F value Pr(>F)
#wool 1 150.22 150.222 2.3205 0.1472
#Residuals 16 1035.78 64.736
I was aware of this option, but not familiar with it. Thanks to @Roland for providing code for the above reproducible example:
library(nlme)
lapply(lmList(breaks ~ wool | tension, data = warpbreaks), anova)
For your data I think it would be
fit <- lmList(grain ~ straw | Quad, data = mhw)
lapply(fit, anova)
You don't need to install nlme
; it comes with R as one of recommended packages.
这篇关于按组拟合线性模型/方差分析的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!