What does the capital letter "I" in an R linear regression formula mean?

Question

I haven't been able to find an answer to this question, largely because googling anything with a standalone letter (like "I") causes issues.

What does the "I" do in a model like this?

data(rock)
lm(area~I(peri - mean(peri)), data = rock)

Considering that the following does NOT work:

lm(area ~ (peri - mean(peri)), data = rock)

and this does work:

rock$peri - mean(rock$peri)

Any keywords that would help me research this myself would also be very helpful.

Recommended answer

I isolates or insulates the contents of I( ... ) from the gaze of R's formula parsing code. It allows the standard R operators to work as they would if you used them outside of a formula, rather than being treated as special formula operators.

For example, the formula:

y ~ x + x^2

would, to R, mean "give me:

  1. x = the main effect of x, and
  2. x^2 = the main effect and the second-order interaction of x",

not the intended x plus x-squared:

> model.frame( y ~ x + x^2, data = data.frame(x = rnorm(5), y = rnorm(5)))
           y           x
1 -1.4355144 -1.85374045
2  0.3620872 -0.07794607
3 -1.7590868  0.96856634
4 -0.3245440  0.18492596
5 -0.6515630 -1.37994358

This is because ^ is a special operator in a formula, as described in ?formula. You end up only including x in the model frame because the main effect of x is already included from the x term in the formula, and there is nothing to cross x with to get the second-order interactions in the x^2 term.
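One way to see this expansion without fitting anything is to inspect the term labels that terms() derives from a formula. The sketch below is only illustrative: labels_of is a hypothetical helper name, and a, b, x, y are placeholder variables (terms() on a formula does not need them to exist).

```r
# Hypothetical helper: list the model terms R derives from a formula.
# terms() parses the formula symbolically, so no data is required.
labels_of <- function(f) attr(terms(f), "term.labels")

# ^ is the formula "crossing" operator: main effects plus their interaction.
labels_of(y ~ (a + b)^2)    # "a" "b" "a:b"

# With a single variable there is nothing to cross, so x^2 collapses to x.
labels_of(y ~ x + x^2)      # "x"

# Wrapping in I() keeps ^ as ordinary arithmetic, giving a separate term.
labels_of(y ~ x + I(x^2))   # "x" "I(x^2)"
```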

To get the usual operator, you need to use I() to isolate the call from the formula code:

> model.frame( y ~ x + I(x^2), data = data.frame(x = rnorm(5), y = rnorm(5)))
            y          x       I(x^2)
1 -0.02881534  1.0865514 1.180593....
2  0.23252515 -0.7625449 0.581474....
3 -0.30120868 -0.8286625 0.686681....
4 -0.67761458  0.8344739 0.696346....
5  0.65522764 -0.9676520 0.936350....

(That last column is correct; it just looks odd because it is of class AsIs.)
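As a quick check that the AsIs column really holds the squared values, the following sketch calls I() outside a formula. The vector x here is made-up illustrative data, not taken from the output above.

```r
# Illustrative data (an assumption, not from the example above).
x <- c(-1, 0.5, 2)

# Outside a formula, I() just tags its argument with class "AsIs".
sq <- I(x^2)
class(sq)                      # "AsIs"

# Stripping the class shows the underlying values are plain x^2.
identical(unclass(sq), x^2)    # TRUE
```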

In your example, - when used in a formula would indicate removal of a term from the model, whereas you wanted - to have its usual binary operator meaning of subtraction:

> model.frame( y ~ x - mean(x), data = data.frame(x = rnorm(5), y = rnorm(5)))
Error in model.frame.default(y ~ x - mean(x), data = data.frame(x = rnorm(5),  : 
  variable lengths differ (found for 'mean(x)')

This fails because mean(x) is a length 1 vector and model.frame() quite rightly tells you this doesn't match the length of the other variables. A way round this is I():

> model.frame( y ~ I(x - mean(x)), data = data.frame(x = rnorm(5), y = rnorm(5)))
           y I(x - mean(x))
1  1.1727063   1.142200....
2 -1.4798270   -0.66914....
3 -0.4303878   -0.28716....
4 -1.0516386   0.542774....
5  1.5225863   -0.72865....

Hence, where you want to use an operator that has special meaning in a formula, but you need its non-formula meaning, you need to wrap the elements of the operation in I( ).
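Applied to the model from the question, a minimal sketch might look like the following. The object names fit_raw and fit_centered are illustrative. Centering a predictor should leave the slope unchanged and move the intercept to the mean of the response, which gives a simple sanity check that I() performed ordinary subtraction.

```r
# rock ships with base R's datasets package.
data(rock)

# Uncentered fit, and the centered fit from the question using I().
fit_raw      <- lm(area ~ peri, data = rock)
fit_centered <- lm(area ~ I(peri - mean(peri)), data = rock)

# Centering the predictor does not change the slope ...
slope_same <- isTRUE(all.equal(unname(coef(fit_raw)[2]),
                               unname(coef(fit_centered)[2])))

# ... and the intercept becomes the mean of the response.
intercept_is_mean <- isTRUE(all.equal(unname(coef(fit_centered)[1]),
                                      mean(rock$area)))

slope_same
intercept_is_mean
```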

Read ?formula for more on the special operators, and ?I for more details on the function itself and its other main use-case within data frames (which is where the AsIs bit originates from, if you are interested).
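For the data frame use-case, here is a small sketch, assuming the common pattern of storing a list as a single column. The column names id and vals are made up for illustration.

```r
# Wrapping the list in I() asks data.frame() to keep it "as is",
# as one list-column with one element per row, rather than
# treating each list element as a separate column.
df <- data.frame(id = 1:2, vals = I(list(1:3, c("a", "b"))))

nrow(df)                   # one row per list element
inherits(df$vals, "AsIs")  # the column carries the AsIs class
is.list(df$vals)           # and is still a list underneath
```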
