Rolling regression with confidence interval (tidyverse)


Question

This is related to an earlier question. Consider this simple example again:

library(dplyr)
library(purrr)
library(broom)
library(zoo)
library(lubridate)

# tibble() replaces the deprecated data_frame()
mydata = tibble('group' = c('a','a', 'a','a','b', 'b', 'b', 'b'),
                'y' = c(1,2,3,4,2,3,4,5),
                'x' = c(2,4,6,8,6,9,12,15),
                'date' = c(ymd('2016-06-01', '2016-06-02', '2016-06-03', '2016-06-04',
                               '2016-06-03', '2016-06-04', '2016-06-05','2016-06-06')))

  group     y     x date      
  <chr> <dbl> <dbl> <date>    
1 a      1.00  2.00 2016-06-01
2 a      2.00  4.00 2016-06-02
3 a      3.00  6.00 2016-06-03
4 a      4.00  8.00 2016-06-04
5 b      2.00  6.00 2016-06-03
6 b      3.00  9.00 2016-06-04
7 b      4.00 12.0  2016-06-05
8 b      5.00 15.0  2016-06-06

What I am trying to do here is pretty simple.

For each group (in this example, a or b):

  • compute the rolling regression of y on x over the last 2 observations.
  • store the coefficient of that rolling regression AND its confidence interval in a column of the dataframe.

I tried to modify the existing solution above, but adding the confidence interval proves to be difficult. This works (without the confidence interval):

Coef <- . %>% as.data.frame %>% lm %>% coef

mydata %>% 
  group_by(group) %>% 
  do(cbind(reg_col = select(., y, x) %>% rollapplyr(2, Coef, by.column = FALSE, fill = NA),
           date_col = select(., date))) %>%
  ungroup

# A tibble: 8 x 4
  group `reg_col.(Intercept)` reg_col.x date      
  <chr>                 <dbl>     <dbl> <date>    
1 a      NA                      NA     2016-06-01
2 a       0                       0.5   2016-06-02
3 a       0                       0.5   2016-06-03
4 a       0                       0.5   2016-06-04
5 b      NA                      NA     2016-06-03
6 b       0.00000000000000126     0.333 2016-06-04
7 b      -0.00000000000000251     0.333 2016-06-05
8 b       0                       0.333 2016-06-06
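
As a minimal sketch of why this version behaves well: with by.column = FALSE, rollapplyr hands each window to the function as a whole, and when the function returns a named numeric vector the results are stacked into one row per window and one column per name. The toy matrix m and the helper rng below are made up for illustration and are not part of the original example.

```r
library(zoo)

# Toy data (hypothetical): two columns, four rows.
m <- cbind(y = c(1, 2, 4, 7), x = c(1, 2, 3, 4))

# Hypothetical window summary: returns a named numeric vector of length 2,
# the smallest and largest value of the first column in the window.
rng <- function(w) c(lo = min(w[, 1]), hi = max(w[, 1]))

# Width-2, right-aligned windows; incomplete windows are filled with NA.
out <- rollapplyr(m, 2, rng, by.column = FALSE, fill = NA)
out
```

A function that returns something other than a plain numeric vector (a data frame or tibble, say) does not fit this stacking scheme, which is a likely source of trouble when the per-window result gets richer.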

However, THIS does not work (WITH the confidence interval) :-(

Coef <- . %>% as.data.frame %>% lm  %>% tidy(., conf.int = TRUE) %>% as_tibble()

> mydata %>% 
+   group_by(group) %>% 
+   do(reg_col = select(., y, x) %>% rollapplyr(2, Coef, by.column = FALSE, fill = NA)) %>%
+   ungroup()
# A tibble: 2 x 2
  group reg_col      
* <chr> <list>       
1 a     <dbl [4 x 2]>
2 b     <dbl [4 x 2]>

The resulting list-column is super weird. Any ideas what is missing here?

Thanks!

Recommended answer

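Try this. Note that it uses a window of 3 observations and jitters y: a 2-point window is fitted exactly by a straight line, which leaves zero residual degrees of freedom, so no confidence interval can be computed for it.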

library(dplyr)
library(zoo)

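# use better example: jitter y so the windows are not fitted exactly
set.seed(123)
mydata2 <- mydata %>% mutate(y = jitter(y))

# for one window: regress the first column (y) on x and return
# the slope together with its 95% confidence interval
stats <- function(x) {
  fm <- lm(as.data.frame(x))
  slope <- coef(fm)[[2]]
  ci <- confint(fm)[2, ]
  c(slope = slope, conf.lower = ci[[1]], conf.upper = ci[[2]])
}

# width-3, right-aligned rolling window; NA until the window is full
roll <- function(x) rollapplyr(x, 3, stats, by.column = FALSE, fill = NA)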

mydata2 %>%
  group_by(group) %>%
  do(cbind(., select(., y, x) %>% roll)) %>%
  ungroup

giving:

# A tibble: 8 x 7
  group     y     x date        slope conf.lower conf.upper
  <chr> <dbl> <dbl> <date>      <dbl>      <dbl>      <dbl>
1 a     0.915     2 2016-06-01 NA         NA         NA    
2 a     2.12      4 2016-06-02 NA         NA         NA    
3 a     2.96      6 2016-06-03  0.512     -0.133      1.16 
4 a     4.15      8 2016-06-04  0.509     -0.117      1.14 
5 b     2.18      6 2016-06-03 NA         NA         NA    
6 b     2.82      9 2016-06-04 NA         NA         NA    
7 b     4.01     12 2016-06-05  0.306     -0.368      0.980
8 b     5.16     15 2016-06-06  0.390      0.332      0.448
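
The choice of a 3-observation window can be checked directly. The sketch below uses made-up numbers (not the data above) and base R only. It assumes the default 95% level of confint. With 2 points the line fits exactly, so there are no residual degrees of freedom and the bounds come back as NaN; with 3 points the interval for the slope equals the estimate plus or minus a t-quantile times the standard error.

```r
# Made-up points for illustration only.
d <- data.frame(y = c(1.1, 1.9, 3.2), x = c(2, 4, 6))

# Two observations, two parameters: zero residual degrees of freedom.
fm2 <- lm(y ~ x, data = d[1:2, ])
df.residual(fm2)
ci2 <- suppressWarnings(confint(fm2))  # no usable bounds here

# Three observations leave one residual degree of freedom.
fm3 <- lm(y ~ x, data = d)
ci3 <- confint(fm3)["x", ]

# Rebuild the 95% interval by hand: estimate +/- t-quantile * standard error.
est <- coef(fm3)[["x"]]
se  <- sqrt(diag(vcov(fm3)))[["x"]]
manual <- est + c(-1, 1) * qt(0.975, df.residual(fm3)) * se

# The hand-built interval should agree with confint().
all.equal(unname(ci3), manual)
```

With only one residual degree of freedom the t-quantile is large, which is why intervals from 3-point windows tend to be wide.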
