How to loop a linear regression over multiple subsets of a factor variable

Question

I'm trying to write a for loop that runs the same regression (same dependent and independent variables) 4 times, once for each of the 4 levels of a factor variable. I then want to save the output of each linear regression. Each level has approximately 500 rows of data.

My initial thought was to do something like this, but I am new to R and its different methods of iteration.

Regressionresults <- list()

for (i in levels(mydata$factorvariable)) {
  Regressionresults[[i]] <- lm(dependent ~ ., data = mydata)
}

I suspect that this is quite easy to do, but I don't know how.

If you could also direct me to any help documentation or other resource where I can learn how to write these types of loops, so I don't have to ask similar questions again, I'd be grateful.

Many thanks!

Recommended answer

The problems with the code in the question are:

  1. In R it is normally better not to use loops in the first place.
  2. Conventionally, i is used for a sequential index, so it is not a good name for a variable that iterates over factor levels.
  3. The body of the loop does no subsetting, so it assigns the same result on every iteration.
  4. Posts to SO should include reproducible data; the question did not, and instead referred to objects without defining their contents. Please read the instructions at the top of the r tag page. Below we use the built-in iris data set for reproducibility.
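
To illustrate point 3, here is a minimal sketch using iris. The formula Petal.Width ~ Sepal.Width and the names same and fits are chosen purely for illustration. The first loop mirrors the pattern in the question and refits on all rows every time; the second restricts the data to the current level before fitting.

```r
# Pattern from the question: no subsetting, so every fit uses all rows
same <- list()
for (lev in levels(iris$Species)) {
  same[[lev]] <- lm(Petal.Width ~ Sepal.Width, data = iris)
}

# One possible fix: keep only the rows for the current level
fits <- list()
for (lev in levels(iris$Species)) {
  fits[[lev]] <- lm(Petal.Width ~ Sepal.Width,
                    data = iris[iris$Species == lev, ])
}

sapply(same, coef)  # the same coefficients repeated for every level
sapply(fits, coef)  # coefficients estimated separately per level
```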

Here are some approaches using the built-in iris data frame for reproducibility. Each produces a named list whose names are the levels of Species.

1) lm subset argument. Map over the levels, giving a list:

sublm <- function(x) lm(Petal.Width ~ Sepal.Width, iris, subset = Species == x)
levs <- levels(iris$Species)
Map(sublm, levs)
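
Because Map returns a named list of lm objects, per-level output can be collected afterwards with sapply. A short sketch, repeating sublm and levs so that it runs on its own; fits, coefs and r2 are illustrative names, not part of the answer:

```r
sublm <- function(x) lm(Petal.Width ~ Sepal.Width, iris, subset = Species == x)
levs <- levels(iris$Species)
fits <- Map(sublm, levs)

names(fits)                    # the levels of Species

# Coefficient matrix: one column per level, one row per term
coefs <- sapply(fits, coef)

# R-squared of each per-level fit, as a named numeric vector
r2 <- sapply(fits, function(m) summary(m)$r.squared)
```

A single fit can still be inspected in full with, for example, summary(fits[["setosa"]]).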

2) loop. sublm and levs are from (1):

L <- list()
for(s in levs) L[[s]] <- sublm(s)

3) nlme. Or use lmList from the nlme package:

library(nlme)
L3 <- lmList(Petal.Width ~ Sepal.Width | Species, iris)
coef(L3)
summary(L3)
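
As a rough check, the per-group coefficients from lmList should agree with those from separate lm fits. The sketch below assumes nlme is installed (it ships with standard R distributions); fits, by_lmList and by_lm are illustrative names.

```r
library(nlme)

L3 <- lmList(Petal.Width ~ Sepal.Width | Species, iris)

# Separate lm fits, one per species, for comparison
levs <- levels(iris$Species)
fits <- lapply(setNames(levs, levs), function(s) {
  lm(Petal.Width ~ Sepal.Width, data = iris[iris$Species == s, ])
})

# Put both sets of coefficients in the same shape (one row per species)
by_lmList <- unname(as.matrix(coef(L3)))
by_lm <- unname(t(sapply(fits, coef)))

all.equal(by_lmList, by_lm)
```

Note that summary(L3) uses a pooled residual standard error by default, so its standard errors can differ from those of the separate fits even when the coefficients match.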
