How does the subset argument work in the lm() function?


Question

I have been trying to figure out how the subset argument in R's lm() function works. In particular, the following code looks dubious to me:

 data(mtcars)
 summary(lm(mpg ~ wt,  data=mtcars))
 summary(lm(mpg ~ wt, cyl, data=mtcars))

In both cases the regression has 32 observations:

 dim(lm(mpg ~ wt, cyl, data=mtcars)$model)
 [1] 32  2
 dim(lm(mpg ~ wt, data=mtcars)$model)
 [1] 32  2

Yet the coefficients change (along with the R²). The help page doesn't say much about this:

subset: an optional vector specifying a subset of observations to be used in the fitting process

Recommended answer

In lm(mpg ~ wt, cyl, data=mtcars), data is supplied by name, so the unnamed second argument cyl is matched to subset and evaluated inside mtcars. As a general principle, vectors used in subsetting can be either logical (a TRUE or FALSE for every element) or numeric (a vector of positions). As a feature that helps with sampling, a numeric index may contain the same position more than once, and R then includes that element once for each time it appears.
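The two indexing modes can be sketched on a small vector. The vector x and its values are made up for illustration; only mtcars comes from the question.

```r
# Hypothetical named vector, used only to illustrate indexing
x <- c(a = 10, b = 20, c = 30)

# Logical index: keeps the elements where the index is TRUE
x[c(TRUE, FALSE, TRUE)]

# Numeric index: picks elements by position;
# a repeated position repeats the element
x[c(2, 2, 1)]

# Same idea with data frame rows: row 1 is included twice
picked <- mtcars[c(1, 1, 3), ]
nrow(picked)
```

Row selection on a data frame follows the same rules, which is why a numeric subset can return duplicated rows.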

Let's take a look at cyl:

> mtcars$cyl
 [1] 6 6 4 6 8 6 8 4 4 6 6 8 8 8 8 8 8 4 4 4 4 8 8 8 8 4 4 4 8 6 8 4

So you're getting a data.frame with the same number of rows, but it is made up of row 6, row 6, row 4, row 6, etc.

You can see this if you do the subsetting yourself:

> head(mtcars[mtcars$cyl,])
                mpg cyl  disp  hp drat    wt  qsec vs am gear carb
Valiant        18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Valiant.1      18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Hornet 4 Drive 21.4   6 258.0 110 3.08 3.215 19.44  1  0    3    1
Valiant.2      18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
Merc 240D      24.4   4 146.7  62 3.69 3.190 20.00  1  0    4    2
Valiant.3      18.1   6 225.0 105 2.76 3.460 20.22  1  0    3    1
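As a rough check of this explanation, the fit with cyl as the subset can be compared with a fit on the hand-subsetted data frame. The object names fit_subset and fit_manual are illustrative, not from the original post.

```r
# Fit with cyl passed as the subset argument (evaluated inside mtcars)
fit_subset <- lm(mpg ~ wt, data = mtcars, subset = cyl)

# Fit on a data frame whose rows were picked by hand with the same index
fit_manual <- lm(mpg ~ wt, data = mtcars[mtcars$cyl, ])

# Both fits use 32 rows ...
nobs(fit_subset)
nobs(fit_manual)

# ... but only these row positions of mtcars ever appear, repeated
sort(unique(mtcars$cyl))

# The coefficients of the two fits agree
all.equal(coef(fit_subset), coef(fit_manual))
```

If the two sets of coefficients match, the changed estimates in the question come from regressing on repeated copies of rows 4, 6 and 8 rather than on the full data set.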

Did you mean to do something like this?

summary(lm(mpg ~ wt, cyl==6, data=mtcars))
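A short sketch of what the logical version does, with the subset argument named explicitly to avoid relying on argument position. The names fit_six and fit_filtered are illustrative.

```r
# Logical subset: keep only the rows where cyl equals 6
fit_six <- lm(mpg ~ wt, data = mtcars, subset = cyl == 6)

# Observations used in the fit vs. number of 6-cylinder cars in mtcars
nobs(fit_six)
sum(mtcars$cyl == 6)

# Filtering the data frame first gives the same coefficients
fit_filtered <- lm(mpg ~ wt, data = mtcars[mtcars$cyl == 6, ])
all.equal(coef(fit_six), coef(fit_filtered))
```

Here the model frame shrinks to the 6-cylinder cars only, which is the behaviour most people expect from subset.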
