R-squared in lm() for zero-intercept model
Question
I ran lm() in R, and this is the output of the summary:
Multiple R-squared: 0.8918, Adjusted R-squared: 0.8917
F-statistic: 9416 on 9 and 10283 DF, p-value: < 2.2e-16
It looks like a good model, but if I calculate R^2 manually I obtain this:
model=lm(S~0+C+HA+L1+L2,data=train)
pred=predict(model,train)
rss <- sum((model$fitted.values - train$S) ^ 2)
tss <- sum((train$S - mean(train$S)) ^ 2)
1 - rss/tss
##[1] 0.247238
rSquared(train$S,(train$S-model$fitted.values))
## [,1]
## [1,] 0.247238
What's wrong?
str(train[,c('S','Campionato','HA','L1','L2')])
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 10292 obs. of 5 variables:
$ S : num 19 18 9 12 12 8 21 24 9 8 ...
$ C : Factor w/ 6 levels "D","E","F","I",..: 4 4 4 4 4 4 4 4 4 4 ...
$ HA : Factor w/ 2 levels "A","H": 1 2 1 1 2 1 2 2 1 2 ...
$ L1 : num 0.99 1.41 1.46 1.43 1.12 1.08 1.4 1.45 0.85 1.44 ...
$ L2 : num 1.31 0.63 1.16 1.15 1.29 1.31 0.7 0.65 1.35 0.59 ...
Answer
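You are fitting a model without an intercept (the ~0 on the right-hand side of your formula). For such models, summary() computes R^2 against the uncentered total sum of squares, sum(S^2), rather than the centered sum((S - mean(S))^2) used in your manual calculation. The reported value is therefore not comparable to the usual R^2 and can look misleadingly high. This post explains it very well: https://stats.stackexchange.com/a/26205/99681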
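To make the difference concrete, here is a minimal sketch on simulated data. The data frame `d` and its columns `g`, `x`, `y` are made up for illustration and are not the asker's `train` data. It fits the same model with and without an intercept term in the formula and compares both versions of R^2 with what summary() reports. Because a factor gets a full set of indicator columns when the intercept is dropped, the two fits should have identical residuals, and only the reported R^2 differs.

```r
# Simulated data (hypothetical; stands in for the asker's `train`).
set.seed(1)
n <- 200
d <- data.frame(
  g = factor(sample(c("a", "b", "c"), n, replace = TRUE)),
  x = rnorm(n)
)
d$y <- 10 + 0.5 * d$x + as.integer(d$g) + rnorm(n, sd = 3)

fit0 <- lm(y ~ 0 + g + x, data = d)  # intercept removed from the formula
fit1 <- lm(y ~ g + x, data = d)      # same column space, with an intercept

rss <- sum(residuals(fit0)^2)

# Two candidate denominators for R^2
tss_centered   <- sum((d$y - mean(d$y))^2)  # usual definition
tss_uncentered <- sum(d$y^2)                # used when there is no intercept

r2_centered   <- 1 - rss / tss_centered
r2_uncentered <- 1 - rss / tss_uncentered

# summary() of the no-intercept fit matches the uncentered version
print(all.equal(summary(fit0)$r.squared, r2_uncentered))

# summary() of the fit with an intercept matches the centered version
print(all.equal(summary(fit1)$r.squared, r2_centered))
```

Both comparisons should print TRUE. If the goal is the conventional R^2, refitting with an intercept (or computing 1 - rss/tss by hand, as in the question) gives the comparable figure.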