R2 values - dplyr and broom


Question

I am using the dplyr and broom combination (see below), following "Fitting several regression models with dplyr", to extract the regression coefficients by group. However, I am also interested in the R2 value of each individual regression (not only for the total model). I have tried the augment and glance functions but have not been able to extract the R2 values. Is there an easy way to do this?

Many thanks!

library(dplyr)
library(broom)

df.h = data.frame( 
  hour     = factor(rep(1:24, each = 21)),
  price    = runif(504, min = -10, max = 125),
  wind     = runif(504, min = 0, max = 2500),
  temp     = runif(504, min = - 10, max = 25)  
)

dfHour = df.h %>% group_by(hour) %>%
  do(fitHour = lm(price ~ wind + temp, data = .))

# get the coefficients by group in a tidy data_frame
dfHourCoef = tidy(dfHour, fitHour)
dfHourCoef

   hour        term     estimate   std.error  statistic      p.value
1     1 (Intercept) 92.173945687 21.82132710  4.2240302 5.102470e-04
2     1        wind -0.020840948  0.01335945 -1.5600151 1.361653e-01
3     1        temp -0.162495052  0.90573269 -0.1794073 8.596220e-01
4     2 (Intercept) 53.569821889 20.90439474  2.5626105 1.957638e-02
5     2        wind  0.006492773  0.01273038  0.5100220 6.162329e-01
6     2        temp -0.493028932  0.78353239 -0.6292387 5.370978e-01
7     3 (Intercept) 93.949047453 14.55042590  6.4567902 4.483106e-06
8     3        wind -0.010084298  0.01179878 -0.8546902 4.039553e-01
9     3        temp -0.096177966  0.68416185 -0.1405778 8.897647e-01
10    4 (Intercept) 68.429142611 20.37382251  3.3586796 3.497149e-03


Recommended answer

broom::glance works for me:

dfHourCoef = glance(dfHour, fitHour)
dfHourCoef

Source: local data frame [24 x 12]
Groups: hour

   hour  r.squared adj.r.squared    sigma  statistic   p.value df    logLik      AIC      BIC deviance df.residual
1     1 0.08223448   -0.01973947 34.02159 0.80642638 0.4619401  3 -102.2460 212.4921 216.6701 20834.44          18
2     2 0.07546305   -0.02726328 36.19379 0.73460277 0.4935356  3 -103.5458 215.0915 219.2696 23579.83          18
3     3 0.02395679   -0.08449245 37.17711 0.22090326 0.8039358  3 -104.1087 216.2174 220.3955 24878.47          18
4     4 0.04916169   -0.05648701 40.38246 0.46533173 0.6352725  3 -105.8454 219.6909 223.8690 29353.38          18
5     5 0.16704225    0.07449138 34.47921 1.80486969 0.1930220  3 -102.5266 213.0532 217.2313 21398.69          18
6     6 0.13615197    0.04016886 41.64294 1.41849921 0.2678774  3 -106.4909 220.9818 225.1599 31214.42          18
7     7 0.01979010   -0.08912211 39.22426 0.18170693 0.8353563  3 -105.2343 218.4687 222.6467 27693.76          18
8     8 0.00171480   -0.10920578 31.29634 0.01545971 0.9846722  3 -100.4927 208.9853 213.1634 17630.30          18
9     9 0.19935534    0.11039483 36.02611 2.24094183 0.1351943  3 -103.4482 214.8965 219.0746 23361.85          18
10   10 0.16950148    0.07722387 36.99197 1.83686463 0.1879532  3 -104.0038 216.0077 220.1858 24631.31          18
..  ...        ...           ...      ...        ...       ... ..       ...      ...      ...      ...         ...
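
The glance(dfHour, fitHour) call above relies on broom's handling of do() output, which may not be available in more recent broom releases. As a hedged alternative, here is a minimal sketch that assumes a dplyr version providing group_modify() and calls glance() on each fitted model directly. The names dfHourR2 and dfHourAll are illustrative and not part of the original post, and set.seed(1) is added only to make the random data reproducible.

```r
library(dplyr)
library(broom)

set.seed(1)  # assumption: added only so the simulated data is reproducible

# Same simulated data as in the question
df.h <- data.frame(
  hour  = factor(rep(1:24, each = 21)),
  price = runif(504, min = -10, max = 125),
  wind  = runif(504, min = 0, max = 2500),
  temp  = runif(504, min = -10, max = 25)
)

# One row of model-level statistics (including r.squared) per hour
dfHourR2 <- df.h %>%
  group_by(hour) %>%
  group_modify(~ glance(lm(price ~ wind + temp, data = .x))) %>%
  ungroup()

# One row per coefficient per hour, with that hour's R2 attached
dfHourAll <- df.h %>%
  group_by(hour) %>%
  group_modify(~ tidy(lm(price ~ wind + temp, data = .x))) %>%
  ungroup() %>%
  left_join(select(dfHourR2, hour, r.squared, adj.r.squared), by = "hour")

dfHourR2 %>% select(hour, r.squared, adj.r.squared)
```

Joining on hour keeps the coefficient table from the question and adds the per-regression R2 as extra columns, so both can be inspected in one data frame.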
