Transforming a dataframe into a TS in R
Problem description
I've been trying to transform a dataframe I put together into a time series, but for some reason it doesn't work. I am very new to R.
> x <- Sales_AEMBG %>%
+ select(Ecriture.DatEcr, Crédit, Mapping)
> names(x)<-c("Dates","Revenue","Mapping")
> str(x)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 15167 obs. of 3 variables:
$ Dates : POSIXct, format: "2016-01-02" "2016-01-02" "2016-01-02" "2016-01-02" ...
$ Revenue: num 124065 214631 135810 225293 57804 ...
$ Mapping: chr "E.M 1.5 L" "E.M 1.5 L" "E.M 1.5 L" "E.M 1.5 L" ...
When I try to look at the data, here's what I have:
> head(x)
# A tibble: 6 x 3
Dates Revenue Mapping
<dttm> <dbl> <chr>
1 2016-01-02 00:00:00 124065. E.M 1.5 L
2 2016-01-02 00:00:00 214631. E.M 1.5 L
3 2016-01-02 00:00:00 135810. E.M 1.5 L
4 2016-01-02 00:00:00 225293. E.M 1.5 L
5 2016-01-02 00:00:00 57804. E.M 1.5 L
6 2016-01-02 00:00:00 124065. E.M 1.5 L
Of course, I tried the as.ts function:
> x_xts <- as.ts(x)
Warning message:
In data.matrix(data) : NAs introduced by coercion
> is.ts(x)
[1] FALSE
But it keeps telling me that my dataframe is still not recognized as a TS.
What would you suggest?
Thanks
Recommended answer
I've added a few more observations to your data.
# A tibble: 12 x 3
Dates Revenue Mapping
<dttm> <dbl> <chr>
1 2016-01-02 00:00:00 124065 E.M 1.5 L
2 2016-01-02 00:00:00 214631 E.M 1.5 L
3 2016-01-03 00:00:00 135810 E.M 1.5 L
4 2016-01-03 00:00:00 225293 E.M 1.5 L
5 2016-01-05 00:00:00 57804 E.M 1.5 L
6 2016-01-05 00:00:00 124065 E.M 1.5 L
7 2016-01-02 00:00:00 24065 E.M 1.5 M
8 2016-01-02 00:00:00 14631 E.M 1.5 M
9 2016-01-03 00:00:00 35810 E.M 1.5 M
10 2016-01-03 00:00:00 25293 E.M 1.5 M
11 2016-01-05 00:00:00 7804 E.M 1.5 M
12 2016-01-05 00:00:00 24065 E.M 1.5 M
First you need to sum the sales by day (Dates) and product type (your Mapping variable?), and pivot into a wider data format:
library(dplyr)
library(tidyr)
# Sum revenue per product and day, then spread the products into columns
x.sum <- x %>%
  group_by(Mapping, Dates) %>%
  summarise(Revenue = sum(Revenue)) %>%
  ungroup() %>%
  pivot_wider(id_cols = Dates, names_from = Mapping, values_from = Revenue)
# A tibble: 3 x 3
Dates `E.M 1.5 L` `E.M 1.5 M`
<dttm> <dbl> <dbl>
1 2016-01-02 00:00:00 338696 38696
2 2016-01-03 00:00:00 361103 61103
3 2016-01-05 00:00:00 181869 31869
Note that I've deliberately omitted Jan 4.
If your time series data has missing days, such as stock prices (financial markets are closed on weekends), then the as.ts (or ts) function will not give a correct result, because it assumes the observations are equally spaced. If there are no missing days, then the correct way to convert the data into a time series object ("ts") is to specify the column(s) to convert (x.sum[,2:3]), the start of the series (January 2, 2016) and its frequency (daily):
x.ts <- ts(x.sum[,2:3], start=c(2016, 2), frequency=365)
Be careful with start, because the meaning of its second element depends on the specified frequency. Here, 365 means daily, so the "2" means day 2 of year 2016. If the frequency were monthly (12), the "2" would mean month 2 of year 2016.
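As a minimal sketch of how start and frequency interact, the following uses base R only. The three values are the daily totals of the E.M 1.5 L column from the example above; the variable names (revenue, daily, monthly) are made up for illustration.

```r
# Three daily totals taken from the example above (illustrative only)
revenue <- c(338696, 361103, 181869)

# frequency = 365: the second element of start counts days within the year
daily <- ts(revenue, start = c(2016, 2), frequency = 365)
start(daily)      # 2016 2, i.e. day 2 of 2016

# frequency = 12: the same start now means month 2 (February) of 2016
monthly <- ts(revenue, start = c(2016, 2), frequency = 12)
start(monthly)    # 2016 2, i.e. February 2016
cycle(monthly)    # position of each observation within the year: 2 3 4
```

The data are identical in both objects; only the interpretation of the time index changes.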
But as I mentioned, ts does not account for missing days: it treats the rows as consecutive observations. So for this made-up data, a plot of the time series would show the wrong information, with the January 5 values placed on January 4.
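To make the problem concrete, here is a small base-R sketch under the same assumptions (three daily totals, January 4 missing). The NA-filling step is one possible workaround and is not part of the original answer; all variable names are hypothetical.

```r
# Daily totals with January 4 missing, as in the example above
dates   <- as.Date(c("2016-01-02", "2016-01-03", "2016-01-05"))
revenue <- c(338696, 361103, 181869)

# ts() only sees three consecutive values, so the third lands on day 4
bad <- ts(revenue, start = c(2016, 2), frequency = 365)
length(bad)   # 3: there is no slot for the missing day

# Possible workaround: build the full daily calendar and fill gaps with NA
all_days <- seq(min(dates), max(dates), by = "day")
filled   <- revenue[match(all_days, dates)]   # NA where a date has no data
good <- ts(filled, start = c(2016, 2), frequency = 365)
length(good)  # 4: January 4 is now an explicit NA
```

With the gap made explicit, the January 5 value sits in the fourth position instead of the third.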
In this case, other packages such as xts and zoo can be used to simplify the work.
library(xts)
x.xts <- xts(x.sum[,2:3], order.by=x.sum$Dates)
plot(x.xts) # Correct results.
Other answers about time series can be found here and here.