按组拟合线性模型/方差分析 [英] Fitting linear model / ANOVA by group

查看:26
本文介绍了按组拟合线性模型/方差分析的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试在 R 中运行 anova() 并且遇到了一些困难.这是我迄今为止所做的,以帮助阐明我的问题.

I'm trying to run anova() in R and running into some difficulty. This is what I've done up to now to help shed some light on my question.

这是到目前为止我的数据的 str().

Here is the str() of my data to this point.

 str(mhw)
'data.frame':   500 obs. of  5 variables:
$ r    : int  1 2 3 4 5 6 7 8 9 10 ...
$ c    : int  1 1 1 1 1 1 1 1 1 1 ...
$ grain: num  3.63 4.07 4.51 3.9 3.63 3.16 3.18 3.42 3.97 3.4 ...
$ straw: num  6.37 6.24 7.05 6.91 5.93 5.59 5.32 5.52 6.03 5.66 ...
$ Quad : Factor w/ 4 levels "NE","NW","SE",..: 2 2 2 2 2 2 2 2 2 2 ...

r 列是一个数值,表示单个图位于该字段中的哪一行列 c 是一个数值,指示单个绘图所在的列
Column Quad 对应每个地块所在的田地中的地理位置

Column r is a numerical value indicating which row in the field an individual plot resides Column c is a numerical value indicating which column an individual plot resides
Column Quad corresponds to the geographical location in the field to which each plot resides

Quad <- ifelse(mhw$c > 13 & mhw$r < 11, "NE",ifelse(mhw$c < 13 & mhw$r < 11,"NW", ifelse(mhw$c < 13 & mhw$r >= 11, "SW","SE")))
mhw <- cbind(mhw, Quad)

我有一个 lm() 如下

nov.model <-lm(mhw$grain ~ mhw$straw)
anova(nov.model)

这是整个田地的 anova(),它正在测试数据集中每个地块的谷物产量与秸秆产量.

This is an anova() for the entire field, which is testing grain yield against straw yield for each plot in the dataset.

我的问题是我想为我的数据的 Quad 列运行单独的 anova() 以测试每个象限中的谷物产量和秸秆产量.

My trouble is that I want to run an individual anova() for the Quad column of my data to test grain yield and straw yield in each quadrant.

也许 with() 可能会解决这个问题.我以前从未使用过它,目前正在学习 R.任何帮助将不胜感激.

perhaps a with() might fix that. I have never used it before and I am in the process of learning R currently. Any help would be greatly appreciated.

推荐答案

我认为您正在寻找 R 中的 by 工具.

I think you are looking for by facility in R.

fit <- with(mhw, by(mhw, Quad, function (dat) lm(grain ~ straw, data = dat)))

因为 Quad 中有 4 个级别,所以 fit 中有 4 个线性模型,即 fit 是一个by"长度为 4 的类对象(一种列表"类型).

Since you have 4 levels in Quad, you end up with 4 linear models in fit, i.e., fit is a "by" class object (a type of "list") of length 4.

要获得每个模型的系数,您可以使用

To get coefficient for each model, you can use

sapply(fit, coef)

要生成模型摘要,请使用

To produce model summary, use

lapply(fit, summary)

要导出方差分析表,请使用

To export ANOVA table, use

lapply(fit, anova)

<小时>

作为一个可重复的例子,我从 ?by 中取例子:

tmp <- with(warpbreaks,
            by(warpbreaks, tension,
               function(x) lm(breaks ~ wool, data = x)))

class(tmp)
# [1] "by"

mode(tmp)
# [1] "list"

sapply(tmp, coef)

#                    L         M         H
#(Intercept)  44.55556 24.000000 24.555556
#woolB       -16.33333  4.777778 -5.777778

lapply(tmp, anova)

#$L
#Analysis of Variance Table
#
#Response: breaks
#          Df Sum Sq Mean Sq F value  Pr(>F)  
#wool       1 1200.5 1200.50  5.6531 0.03023 *
#Residuals 16 3397.8  212.36                  
#---
#Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#
#$M
#Analysis of Variance Table
#
#Response: breaks
#          Df  Sum Sq Mean Sq F value Pr(>F)
#wool       1  102.72 102.722  1.2531 0.2795
#Residuals 16 1311.56  81.972               
#
#$H
#Analysis of Variance Table
#
#Response: breaks
#          Df  Sum Sq Mean Sq F value Pr(>F)
#wool       1  150.22 150.222  2.3205 0.1472
#Residuals 16 1035.78  64.736

<小时>

我知道这个选项,但不熟悉它.感谢 @Roland 为上述可重现示例提供代码:


I was aware of this option, but not familiar with it. Thanks to @Roland for providing code for the above reproducible example:

library(nlme)
lapply(lmList(breaks ~ wool | tension, data = warpbreaks), anova)

对于您的数据,我认为是

For your data I think it would be

fit <- lmList(grain ~ straw | Quad, data = mhw)
lapply(fit, anova)

您不需要安装nlme;它带有 R 作为推荐的软件包之一.

You don't need to install nlme; it comes with R as one of recommended packages.

这篇关于按组拟合线性模型/方差分析的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆