Transforming a dataframe into a TS in R


Question

I've been trying to transform a dataframe I put together into a Time Series, but for some reason it doesn't work. I am very new to R.

> x <- Sales_AEMBG %>%
+   select(Ecriture.DatEcr, Crédit, Mapping)
> names(x) <- c("Dates", "Revenue", "Mapping")
> str(x)
Classes ‘tbl_df’, ‘tbl’ and 'data.frame':   15167 obs. of  3 variables:
 $ Dates  : POSIXct, format: "2016-01-02" "2016-01-02" "2016-01-02" "2016-01-02" ...
 $ Revenue: num  124065 214631 135810 225293 57804 ...
 $ Mapping: chr  "E.M 1.5 L" "E.M 1.5 L" "E.M 1.5 L" "E.M 1.5 L" ...

When I look at the data, here is what I get:

> head(x)
# A tibble: 6 x 3
  Dates               Revenue Mapping  
  <dttm>                <dbl> <chr>    
1 2016-01-02 00:00:00 124065. E.M 1.5 L
2 2016-01-02 00:00:00 214631. E.M 1.5 L
3 2016-01-02 00:00:00 135810. E.M 1.5 L
4 2016-01-02 00:00:00 225293. E.M 1.5 L
5 2016-01-02 00:00:00  57804. E.M 1.5 L
6 2016-01-02 00:00:00 124065. E.M 1.5 L

Of course, I tried the as.ts function:

> x_xts <- as.ts(x)
Warning message:
In data.matrix(data) : NAs introduced by coercion
> is.ts(x)
[1] FALSE

But it keeps telling me that my data frame is still not recognized as a TS.

Do you have any suggestions?

Thanks

Recommended answer

I've added a few more observations to your data.

# A tibble: 12 x 3
   Dates               Revenue Mapping  
   <dttm>                <dbl> <chr>    
 1 2016-01-02 00:00:00  124065 E.M 1.5 L
 2 2016-01-02 00:00:00  214631 E.M 1.5 L
 3 2016-01-03 00:00:00  135810 E.M 1.5 L
 4 2016-01-03 00:00:00  225293 E.M 1.5 L
 5 2016-01-05 00:00:00   57804 E.M 1.5 L
 6 2016-01-05 00:00:00  124065 E.M 1.5 L
 7 2016-01-02 00:00:00   24065 E.M 1.5 M
 8 2016-01-02 00:00:00   14631 E.M 1.5 M
 9 2016-01-03 00:00:00   35810 E.M 1.5 M
10 2016-01-03 00:00:00   25293 E.M 1.5 M
11 2016-01-05 00:00:00    7804 E.M 1.5 M
12 2016-01-05 00:00:00   24065 E.M 1.5 M

First you need to sum the sales by day (Dates) and product type (your Mapping variable?), and then pivot the result into a wider data format:

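library(dplyr)
library(tidyr)

# Total revenue per product (Mapping) and day, then one column per product
x.sum <- x %>%
  group_by(Mapping, Dates) %>%
  summarise(Revenue=sum(Revenue)) %>%
  ungroup() %>%  # drop the leftover grouping before reshaping
  pivot_wider(id_cols=Dates, names_from=Mapping, values_from=Revenue)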

# A tibble: 3 x 3
  Dates               `E.M 1.5 L` `E.M 1.5 M`
  <dttm>                    <dbl>       <dbl>
1 2016-01-02 00:00:00      338696       38696
2 2016-01-03 00:00:00      361103       61103
3 2016-01-05 00:00:00      181869       31869
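
For anyone who wants to reproduce the aggregation step, here is a minimal self-contained sketch that rebuilds the 12 sample rows by hand and runs the same pipeline. The UTC time zone and the explicit ungroup() call are assumptions added here so the example behaves the same across machines and package versions; they are not part of the original data.

```r
library(dplyr)
library(tidyr)

# Rebuild the 12 sample rows (time zone UTC is an assumption).
day_labels <- c("2016-01-02", "2016-01-03", "2016-01-05")
x <- data.frame(
  Dates   = as.POSIXct(rep(rep(day_labels, each = 2), times = 2), tz = "UTC"),
  Revenue = c(124065, 214631, 135810, 225293, 57804, 124065,
              24065, 14631, 35810, 25293, 7804, 24065),
  Mapping = rep(c("E.M 1.5 L", "E.M 1.5 M"), each = 6),
  stringsAsFactors = FALSE
)

# Sum revenue per product and day, then spread products into columns.
x.sum <- x %>%
  group_by(Mapping, Dates) %>%
  summarise(Revenue = sum(Revenue)) %>%
  ungroup() %>%
  pivot_wider(id_cols = Dates, names_from = Mapping, values_from = Revenue)

print(x.sum)
```

The result should have one row per observed day and one revenue column per Mapping value, matching the table above.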

Note that I've deliberately omitted Jan 4.

If your time series has missing days, such as stock prices when financial markets are closed on weekends, then the as.ts (or ts) function won't give correct results, because it assumes regularly spaced observations. If there are no missing days, then the correct way to convert the data into a time series object ("ts") is to specify the column(s) to convert (x.sum[,2:3]) along with the start (January 2, 2016) and frequency (daily) of the series.

x.ts <- ts(x.sum[,2:3], start=c(2016, 2), frequency=365)

Be careful with start: the meaning of its second element depends on the specified frequency. Here, 365 means daily, so the "2" means day 2 of 2016. If the frequency were monthly (12), the "2" would mean month 2 of 2016.
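
To make the point about start concrete, here is a small base R sketch that passes the same start vector with two different frequencies. The values 1:3 are placeholder data chosen only for illustration.

```r
# Same start vector, two different frequencies (base R only).
daily   <- ts(1:3, start = c(2016, 2), frequency = 365)
monthly <- ts(1:3, start = c(2016, 2), frequency = 12)

# time() gives the position of each observation in fractional years.
first_daily   <- as.numeric(time(daily))[1]    # 2016 + 1/365: day 2 of 2016
first_monthly <- as.numeric(time(monthly))[1]  # 2016 + 1/12: month 2 of 2016

print(start(daily))    # c(2016, 2), read as (year, day)
print(start(monthly))  # c(2016, 2), read as (year, month)
```

Both series report the same start vector, but the underlying time stamps differ, which is why the frequency has to be chosen before deciding what to put in start.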

But as I mentioned, ts does not account for missing days: it treats the rows as consecutive. So for this made-up data, if you plotted the time series, you would get misleading results.
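
The following base R sketch illustrates the problem using the three daily totals for "E.M 1.5 L". It also shows one possible workaround, which is not part of the original answer: padding the missing day with NA before calling ts. The object names (obs, padded, labelled_day) are hypothetical.

```r
# Daily totals for Jan 2, Jan 3 and Jan 5 (Jan 4 is missing).
obs <- data.frame(
  Dates   = as.Date(c("2016-01-02", "2016-01-03", "2016-01-05")),
  Revenue = c(338696, 361103, 181869)
)

# ts() assumes consecutive days, so the third value lands on day 4.
x.ts <- ts(obs$Revenue, start = c(2016, 2), frequency = 365)
labelled_day <- as.Date("2016-01-01") + (end(x.ts)[2] - 1)
print(labelled_day)  # the Jan 5 value is labelled as Jan 4

# Possible workaround: insert the missing day as NA so spacing is regular.
all_days <- data.frame(Dates = seq(min(obs$Dates), max(obs$Dates), by = "day"))
padded <- merge(all_days, obs, by = "Dates", all.x = TRUE)
padded.ts <- ts(padded$Revenue, start = c(2016, 2), frequency = 365)
print(end(padded.ts))  # now ends on day 5 of 2016
```

Padding keeps the data in a plain ts object, at the cost of carrying NA values that later functions may or may not accept.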

In this case, other packages such as xts and zoo can be used to simplify the work.

library(xts)
x.xts <- xts(x.sum[,2:3], order.by=x.sum$Dates)

plot(x.xts) # Correct results.
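
As a rough check that xts keeps the real dates, here is a self-contained sketch using the aggregated totals from above. It assumes the xts package is installed, and it uses a Date index instead of the POSIXct column purely to avoid time zone differences between machines.

```r
library(xts)  # also attaches zoo, which provides index()

# Aggregated totals typed in by hand; Jan 4 is absent on purpose.
dates <- as.Date(c("2016-01-02", "2016-01-03", "2016-01-05"))
revenue <- cbind(
  "E.M 1.5 L" = c(338696, 361103, 181869),
  "E.M 1.5 M" = c(38696, 61103, 31869)
)

# Each row is tied to its own date rather than to a start and frequency.
x.xts <- xts(revenue, order.by = dates)

# Spacing between consecutive observations, in days.
gaps <- as.numeric(diff(index(x.xts)))
print(gaps)  # gaps of 1 and 2 days, so the missing Jan 4 is preserved
```

Because the index stores the actual dates, plots and date-based subsetting line up with the calendar even when days are missing.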

Other answers about time series can be found here and here.
