Using Prophet Package to Predict By Group in Dataframe in R


Problem description

I am using Prophet, the new package released by Facebook. It does time series forecasting, and I want to apply it by group.

See the R section of the quick start:

https://facebookincubator.github.io/prophet/docs/quick_start.html

This is my attempt:

grouped_output = df %>% group_by(group) %>%
  do(m = prophet(df[,c(1,3)])) %>%
  do(future = make_future_dataframe(m, period = 7)) %>%
  do(forecast = prophet:::predict.prophet(m, future))

grouped_output[[1]]

I then need to extract the results for each group from the resulting list, which I am having trouble doing.

Below is my original dataframe without the groups:

ds <- as.Date(c('2016-11-01','2016-11-02','2016-11-03','2016-11-04',
                   '2016-11-05','2016-11-06','2016-11-07','2016-11-08',
                   '2016-11-09','2016-11-10','2016-11-11','2016-11-12',
                   '2016-11-13','2016-11-14','2016-11-15','2016-11-16',
                   '2016-11-17','2016-11-18','2016-11-19','2016-11-20',
                   '2016-11-21','2016-11-22','2016-11-23','2016-11-24',
                   '2016-11-25','2016-11-26','2016-11-27','2016-11-28',
                   '2016-11-29','2016-11-30'))
y <- c(15,17,18,19,20,54,67,23,12,34,12,78,34,12,3,45,67,89,12,111,123,112,14,566,345,123,567,56,87,90)
y<-as.numeric(y)
df <- data.frame(ds, y)

df

           ds   y
1  2016-11-01  15
2  2016-11-02  17
3  2016-11-03  18
4  2016-11-04  19
5  2016-11-05  20
6  2016-11-06  54
7  2016-11-07  67
8  2016-11-08  23
9  2016-11-09  12
10 2016-11-10  34
11 2016-11-11  12
12 2016-11-12  78
13 2016-11-13  34
14 2016-11-14  12
15 2016-11-15   3
16 2016-11-16  45
17 2016-11-17  67
18 2016-11-18  89
19 2016-11-19  12
20 2016-11-20 111
21 2016-11-21 123
22 2016-11-22 112
23 2016-11-23  14
24 2016-11-24 566
25 2016-11-25 345
26 2016-11-26 123
27 2016-11-27 567
28 2016-11-28  56
29 2016-11-29  87
30 2016-11-30  90

The forecast works when I run it on a single group, as follows:

#install.packages('prophet')
library(prophet)
m<-prophet(df)
future <- make_future_dataframe(m, period = 7)
forecast <- prophet:::predict.prophet(m, future)

forecast$yhat
 [1]  -2.649032 -29.762095 128.169781  59.573684 -11.623727 107.473617 -29.949730 -42.862455 -62.378408 104.797639  46.868610
[12] -12.502864 119.282058  -4.914921  -4.402638 -10.643570 169.309505 123.321261  74.734746 215.856347  99.290218 105.508059
[23] 102.882915 284.245984 237.401258 185.688202 321.466962 197.451536 194.280518 180.535663 349.304365 288.684031 222.337210
[34] 342.968499 203.648851 185.377165

I now want to change this so that it applies the prophet:::predict.prophet function to each group. The new data frame, by group, looks like this:

ds <- as.Date(c('2016-11-01','2016-11-02','2016-11-03','2016-11-04',
            '2016-11-05','2016-11-06','2016-11-07','2016-11-08',
            '2016-11-09','2016-11-10','2016-11-11','2016-11-12',
            '2016-11-13','2016-11-14','2016-11-15','2016-11-16',
            '2016-11-17','2016-11-18','2016-11-19','2016-11-20',
            '2016-11-21','2016-11-22','2016-11-23','2016-11-24',
            '2016-11-25','2016-11-26','2016-11-27','2016-11-28',
            '2016-11-29','2016-11-30',


            '2016-11-01','2016-11-02','2016-11-03','2016-11-04',
            '2016-11-05','2016-11-06','2016-11-07','2016-11-08',
            '2016-11-09','2016-11-10','2016-11-11','2016-11-12',
            '2016-11-13','2016-11-14','2016-11-15','2016-11-16',
            '2016-11-17','2016-11-18','2016-11-19','2016-11-20',
            '2016-11-21','2016-11-22','2016-11-23','2016-11-24',
            '2016-11-25','2016-11-26','2016-11-27','2016-11-28',
            '2016-11-29','2016-11-30'))
y <- c(15,17,18,19,20,54,67,23,12,34,12,78,34,12,3,45,67,89,12,111,123,112,14,566,345,123,567,56,87,90,
   45,23,12,10,21,34,12,45,12,44,87,45,32,67,1,57,87,99,33,234,456,123,89,333,411,232,455,55,90,21)
y<-as.numeric(y)

group<-c("A","A","A","A","A","A","A","A","A","A","A","A","A","A","A",
     "A","A","A","A","A","A","A","A","A","A","A","A","A","A","A",
     "B","B","B","B","B","B","B","B","B","B","B","B","B","B","B",
     "B","B","B","B","B","B","B","B","B","B","B","B","B","B","B")
df <- data.frame(ds,group, y)

df

           ds group   y
1  2016-11-01     A  15
2  2016-11-02     A  17
3  2016-11-03     A  18
4  2016-11-04     A  19
5  2016-11-05     A  20
6  2016-11-06     A  54
7  2016-11-07     A  67
8  2016-11-08     A  23
9  2016-11-09     A  12
10 2016-11-10     A  34
11 2016-11-11     A  12
12 2016-11-12     A  78
13 2016-11-13     A  34
14 2016-11-14     A  12
15 2016-11-15     A   3
16 2016-11-16     A  45
17 2016-11-17     A  67
18 2016-11-18     A  89
19 2016-11-19     A  12
20 2016-11-20     A 111
21 2016-11-21     A 123
22 2016-11-22     A 112
23 2016-11-23     A  14
24 2016-11-24     A 566
25 2016-11-25     A 345
26 2016-11-26     A 123
27 2016-11-27     A 567
28 2016-11-28     A  56
29 2016-11-29     A  87
30 2016-11-30     A  90
31 2016-11-01     B  45
32 2016-11-02     B  23
33 2016-11-03     B  12
34 2016-11-04     B  10
35 2016-11-05     B  21
36 2016-11-06     B  34
37 2016-11-07     B  12
38 2016-11-08     B  45
39 2016-11-09     B  12
40 2016-11-10     B  44
41 2016-11-11     B  87
42 2016-11-12     B  45
43 2016-11-13     B  32
44 2016-11-14     B  67
45 2016-11-15     B   1
46 2016-11-16     B  57
47 2016-11-17     B  87
48 2016-11-18     B  99
49 2016-11-19     B  33
50 2016-11-20     B 234
51 2016-11-21     B 456
52 2016-11-22     B 123
53 2016-11-23     B  89
54 2016-11-24     B 333
55 2016-11-25     B 411
56 2016-11-26     B 232
57 2016-11-27     B 455
58 2016-11-28     B  55
59 2016-11-29     B  90
60 2016-11-30     B  21

How do I use the prophet package to predict y-hat by group rather than in total?

Solution

Here is a solution that uses tidyr::nest to nest the data by group, fits a model to each group with purrr::map, and then retrieves the y-hat as requested. I took your code but incorporated it into mutate calls that compute new columns using purrr::map.

library(prophet)
library(dplyr)
library(purrr)
library(tidyr)

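d1 <- df %>% 
  nest(-group) %>%   # one row per group; tidyr >= 1.0.0 prefers nest(data = -group)
  mutate(m = map(data, prophet)) %>%   # fit one model per group
  mutate(future = map(m, make_future_dataframe, periods = 7)) %>%   # extend 7 days ahead
  mutate(forecast = map2(m, future, predict))   # forecast each group with its own model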

Here is the output at this point:

d1
# A tibble: 2 × 5
   group              data          m                future
  <fctr>            <list>     <list>                <list>
1      A <tibble [30 × 2]> <S3: list> <data.frame [36 × 1]>
2      B <tibble [30 × 2]> <S3: list> <data.frame [36 × 1]>
# ... with 1 more variables: forecast <list>
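
Each list column above holds one object per group, so a single group's model or forecast can be pulled out with `[[`. To make that pattern easier to see without fitting prophet, here is a small sketch of the same nest/map/unnest steps with `lm()` swapped in as a cheap stand-in for `prophet()`. The `toy` data and the column names `fit`, `slope` and `pred` are made up for illustration and are not part of the original answer.

```r
library(dplyr)
library(tidyr)
library(purrr)

# Toy data: two groups with an exact linear relationship each.
toy <- data.frame(
  group = rep(c("A", "B"), each = 5),
  x = rep(1:5, times = 2),
  y = c(1, 2, 3, 4, 5, 10, 8, 6, 4, 2)
)

# One row per group; `data` is a list column holding that group's rows.
nested <- toy %>%
  group_by(group) %>%
  nest() %>%
  ungroup() %>%
  mutate(fit = map(data, ~ lm(y ~ x, data = .x)),        # one model per group
         slope = map_dbl(fit, ~ coef(.x)[["x"]]))        # one number per model

# Pull out a single group's model from the list column.
fit_a <- nested$fit[[1]]

# Turn per-group predictions back into one long data frame.
fitted_long <- nested %>%
  mutate(pred = map2(fit, data,
                     ~ data.frame(x = .y$x,
                                  yhat = unname(predict(.x, newdata = .y))))) %>%
  select(group, pred) %>%
  unnest(pred)
```

With the `d1` object from the answer, the analogous calls would be `d1$m[[1]]` for group A's fitted model and `d1$forecast[[1]]` for its forecast data frame.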

Then I use unnest() to retrieve the data from the forecast column and select the y-hat value as requested.

d <- d1 %>% 
  unnest(forecast) %>% 
  select(ds, group, yhat)

And here is the output for the newly forecasted values:

d %>% group_by(group) %>% 
  top_n(7, ds)
Source: local data frame [14 x 3]
Groups: group [2]

           ds  group      yhat
       <date> <fctr>     <dbl>
1  2016-11-30      A 180.53422
2  2016-12-01      A 349.30277
3  2016-12-02      A 288.68215
4  2016-12-03      A 222.33501
5  2016-12-04      A 342.96654
6  2016-12-05      A 203.64625
7  2016-12-06      A 185.37395
8  2016-11-30      B 131.07827
9  2016-12-01      B 222.83703
10 2016-12-02      B 236.33555
11 2016-12-03      B 145.41001
12 2016-12-04      B 228.59687
13 2016-12-05      B 162.49244
14 2016-12-06      B  68.44477
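
An alternative that avoids list columns, offered as a minimal sketch and not as part of the original answer: split the data frame by group in base R, forecast each piece, and bind the results. The helper name `forecast_one_group` and its `horizon` argument are hypothetical. The prophet calls are the same ones used above, and the data frame is rebuilt compactly with `seq()` and `rep()` so the block runs on its own.

```r
library(prophet)

# Rebuild the grouped data frame from the question (30 daily values per group).
ds <- rep(seq(as.Date("2016-11-01"), by = "day", length.out = 30), times = 2)
y <- as.numeric(c(
  15,17,18,19,20,54,67,23,12,34,12,78,34,12,3,45,67,89,12,111,123,112,14,566,345,123,567,56,87,90,
  45,23,12,10,21,34,12,45,12,44,87,45,32,67,1,57,87,99,33,234,456,123,89,333,411,232,455,55,90,21))
group <- rep(c("A", "B"), each = 30)
df <- data.frame(ds, group, y)

# Hypothetical helper: fit prophet on one group's rows and return ds, group, yhat.
forecast_one_group <- function(group_df, horizon = 7) {
  m <- prophet(group_df[, c("ds", "y")])
  future <- make_future_dataframe(m, periods = horizon)
  fc <- predict(m, future)
  data.frame(ds = fc$ds,
             group = as.character(group_df$group[1]),
             yhat = fc$yhat)
}

# split() yields one data frame per group; rbind stacks the per-group forecasts.
by_group <- lapply(split(df, df$group), forecast_one_group)
result <- do.call(rbind, by_group)
```

`result` then has one row per date and group (history plus the 7 forecast days), in the same shape as `d` above. Because each group is fitted independently, the same helper can be reused for a single group by calling it on a subset of `df`.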
