Rolling regression with confidence interval (tidyverse)
Question
Consider this simple example again:
library(dplyr)
library(purrr)
library(broom)
library(zoo)
library(lubridate)
# data_frame() is deprecated; tibble() (re-exported by dplyr) replaces it
mydata = tibble('group' = c('a','a', 'a','a','b', 'b', 'b', 'b'),
                'y' = c(1,2,3,4,2,3,4,5),
                'x' = c(2,4,6,8,6,9,12,15),
                'date' = c(ymd('2016-06-01', '2016-06-02', '2016-06-03', '2016-06-04',
                               '2016-06-03', '2016-06-04', '2016-06-05','2016-06-06')))
  group     y     x date
  <chr> <dbl> <dbl> <date>
1 a      1.00  2.00 2016-06-01
2 a      2.00  4.00 2016-06-02
3 a      3.00  6.00 2016-06-03
4 a      4.00  8.00 2016-06-04
5 b      2.00  6.00 2016-06-03
6 b      3.00  9.00 2016-06-04
7 b      4.00  12.0 2016-06-05
8 b      5.00  15.0 2016-06-06
What I am trying to do here is pretty simple.
For each group (in this example, a or b):
- compute the rolling regression of y on x over the last 2 observations.
- store the coefficient of that rolling regression AND its confidence interval in columns of the dataframe.
I tried to modify the existing solution above, but adding the confidence interval proved difficult. This works (without the confidence interval):
Coef <- . %>% as.data.frame %>% lm %>% coef
mydata %>%
group_by(group) %>%
do(cbind(reg_col = select(., y, x) %>% rollapplyr(2, Coef, by.column = FALSE, fill = NA),
date_col = select(., date))) %>%
ungroup
# A tibble: 8 x 4
group `reg_col.(Intercept)` reg_col.x date
<chr> <dbl> <dbl> <date>
1 a NA NA 2016-06-01
2 a 0 0.5 2016-06-02
3 a 0 0.5 2016-06-03
4 a 0 0.5 2016-06-04
5 b NA NA 2016-06-03
6 b 0.00000000000000126 0.333 2016-06-04
7 b -0.00000000000000251 0.333 2016-06-05
8 b 0 0.333 2016-06-06
However, THIS does not work (WITH the confidence interval) :-(
Coef <- . %>% as.data.frame %>% lm %>% tidy(., conf.int = TRUE) %>% as_tibble()
mydata %>%
group_by(group) %>%
do(reg_col = select(., y, x) %>% rollapplyr(2, Coef, by.column = FALSE, fill = NA)) %>%
ungroup()
# A tibble: 2 x 2
group reg_col
* <chr> <list>
1 a <dbl [4 x 2]>
2 b <dbl [4 x 2]>
The resulting list-column is super weird. Any ideas what is missing here?
Thanks!
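A note on the window size, as a minimal base-R sketch: a straight line through two points fits exactly, so a regression of y on x over only the last 2 observations leaves zero residual degrees of freedom, and no finite confidence interval can be computed from it. The data frame `two_points` below is a hypothetical example introduced for illustration; it mirrors the first two rows of group a.

```r
# Hypothetical two-row window, mirroring the first two rows of group "a"
two_points <- data.frame(y = c(1, 2), x = c(2, 4))

# Ordinary least squares fit of y on x using only those two rows
fm <- lm(y ~ x, data = two_points)

# Two observations minus two estimated parameters leaves no residual df
df_left <- df.residual(fm)

# confint() may warn here; the bounds for the slope are not finite numbers
ci <- suppressWarnings(confint(fm))["x", ]

print(df_left)
print(ci)
```

This suggests that a window of at least 3 observations is needed before a confidence interval for the slope is meaningful.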
Recommended answer
Try this:
library(dplyr)
library(zoo)
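# use a better example: jitter y so the points no longer lie exactly on a line
set.seed(123)
mydata2 <- mydata %>% mutate(y = jitter(y))
# fit y ~ x on one window and return the slope with its 95% confidence
# interval as a plain named numeric vector, which rollapplyr binds into columns
stats <- function(x) {
  fm <- lm(as.data.frame(x))  # the first column (y) is used as the response
  slope <- coef(fm)[[2]]
  ci <- confint(fm)[2, ]
  c(slope = slope, conf.lower = ci[[1]], conf.upper = ci[[2]])
}
# right-aligned window of 3 rows; incomplete windows are filled with NA
roll <- function(x) rollapplyr(x, 3, stats, by.column = FALSE, fill = NA)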
mydata2 %>%
group_by(group) %>%
do(cbind(., select(., y, x) %>% roll)) %>%
ungroup
giving:
# A tibble: 8 x 7
group y x date slope conf.lower conf.upper
<chr> <dbl> <dbl> <date> <dbl> <dbl> <dbl>
1 a 0.915 2 2016-06-01 NA NA NA
2 a 2.12 4 2016-06-02 NA NA NA
3 a 2.96 6 2016-06-03 0.512 -0.133 1.16
4 a 4.15 8 2016-06-04 0.509 -0.117 1.14
5 b 2.18 6 2016-06-03 NA NA NA
6 b 2.82 9 2016-06-04 NA NA NA
7 b 4.01 12 2016-06-05 0.306 -0.368 0.980
8 b 5.16 15 2016-06-06 0.390 0.332 0.448
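To make explicit what the right-aligned window of 3 is doing, here is a minimal base-R sketch of the same computation for a single group, written as a plain loop. The helper `roll_slope_ci` and the data frame `toy` are hypothetical names introduced for illustration, and the y values are made up (slightly noisy so that no window is an exact fit); they are not the jittered values shown above.

```r
# Hypothetical single-group data; y is deliberately not an exact line in x
toy <- data.frame(y = c(0.9, 2.1, 3.0, 4.2, 4.9),
                  x = c(2, 4, 6, 8, 10))

# Hypothetical helper: slope of y on x and its 95% confidence interval over
# a right-aligned window of `width` rows; rows with an incomplete window stay NA
roll_slope_ci <- function(d, width = 3) {
  out <- matrix(NA_real_, nrow = nrow(d), ncol = 3,
                dimnames = list(NULL, c("slope", "conf.lower", "conf.upper")))
  if (nrow(d) < width) return(out)
  for (i in seq(width, nrow(d))) {
    # Fit on the current row and the (width - 1) rows before it
    fm <- lm(y ~ x, data = d[(i - width + 1):i, ])
    out[i, ] <- c(coef(fm)[["x"]], confint(fm)["x", ])
  }
  out
}

res <- cbind(toy, roll_slope_ci(toy))
print(res)
```

For several groups, the same helper could be applied per group, for example with `split()` and `lapply()`, or inside `group_by() %>% do()` as in the code above.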