Right way to use lm in R
Question
I do not have a very clear idea of how to use functions like lm() that ask for a formula and a data.frame. On the web I have read about different approaches, but sometimes R gives warnings and other messages.
Suppose, for example, a linear model where the output vector y is explained by the matrix X.
I read that the best way is to use a data.frame (especially if we are going to use the predict function later).
In a situation where X is a matrix, is this the best way to use lm?
n=100
p=20
n_new=50
X=matrix(rnorm(n*p),n,p)
Y=rnorm(n)
data=list("x"=X,"y"=Y)
l=lm(y~x,data)
X_new=matrix(rnorm(n_new*p),n_new,p)
pred=predict(l,as.data.frame(X_new))
Recommended answer
How about:
l <- lm(y~.,data=data.frame(X,y=Y))
pred <- predict(l,data.frame(X_new))
In this case R constructs the column names (X1 ... X20) automatically, but when you use the y~. syntax you don't need to know them.
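The naming behaviour described above can be checked with a small sketch. It assumes freshly simulated X, Y and X_new like those in the question; the seed, the smaller p, and the name train are illustrative choices, not part of the original post. The point is that data.frame() gives an unnamed matrix the column names X1, X2, ... for both the training matrix and the new matrix, which is what lets predict() line them up.

```r
set.seed(1)                      # illustrative seed, not from the original post
n <- 100; p <- 5; n_new <- 50    # smaller p than in the question, for readability
X <- matrix(rnorm(n * p), n, p)
Y <- rnorm(n)
X_new <- matrix(rnorm(n_new * p), n_new, p)

# data.frame() auto-names the columns of an unnamed matrix X1, X2, ...
train <- data.frame(X, y = Y)
names(train)
names(data.frame(X_new))

# y ~ . regresses y on every other column of the data frame
l <- lm(y ~ ., data = train)
names(coef(l))

# the new data frame gets the same auto-generated names, so predict() can match them
pred <- predict(l, data.frame(X_new))
length(pred)
```

If X already has column names, data.frame() keeps them, so X_new would then need matching column names as well.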
Alternatively, if you are always going to fit linear regressions based on a matrix, you can use lm.fit() and compute the predictions yourself using matrix multiplication: you have to use cbind(1,.) to add an intercept column.
fit <- lm.fit(cbind(1,X),Y)
all(coef(l)==fit$coefficients) ## TRUE
pred <- cbind(1,X_new) %*% fit$coefficients
(You could also use cbind(1,X_new) %*% coef(l).) This is efficient, but it skips a lot of the error-checking steps, so use it with caution ...
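One way to apply that caution is to check the manual predictions against predict() on the same data. The following is a minimal sketch under the same simulated setup as the question (the seed, the smaller p, and the names pred_lm, pred_manual and ok are illustrative assumptions). It compares with all.equal() rather than ==, since agreement up to floating-point tolerance is the safer thing to test.

```r
set.seed(1)                      # illustrative seed, not from the original post
n <- 100; p <- 5; n_new <- 50
X <- matrix(rnorm(n * p), n, p)
Y <- rnorm(n)
X_new <- matrix(rnorm(n_new * p), n_new, p)

# formula interface: the intercept is added automatically
l <- lm(y ~ ., data = data.frame(X, y = Y))

# matrix interface: the intercept column must be added by hand
fit <- lm.fit(cbind(1, X), Y)

pred_lm <- predict(l, data.frame(X_new))
pred_manual <- drop(cbind(1, X_new) %*% fit$coefficients)

# compare up to numerical tolerance, ignoring names;
# this should hold when cbind(1, X) has full column rank
ok <- isTRUE(all.equal(unname(pred_lm), unname(pred_manual)))
ok
```

If some columns of X are collinear, lm.fit() returns NA for the aliased coefficients and the matrix product would propagate NA, which is one of the cases the checks in lm() and predict() handle for you.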