Having a lot of issues with time series objects in R

Question

I am having an extraordinarily difficult time dealing with any time series object built from some budget data.

The original data is 14,460 rows of payments on ~1800 contracts, where each row has a date (DD/MM/YYYY) and an Amount. There are 5296 days between 7/1/2000 and 12/31/2014, but only 3133 of those days actually had payments. The days are therefore irregularly spaced, with more than one contract payment on some days and none on others.

The main issue I'm having is the brutal stubbornness these time series objects exhibit when fed daily data that occurs at irregular intervals. I've even merged the payments onto a continuous date vector and still have the same issues, namely with frequency, periodicity, or order.by.

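# Continuous daily calendar covering the whole period
CTS_date_V <- data.frame(Date = seq(as.Date("2000/07/01"), as.Date("2014/12/31"), "days"))
# Keep every calendar day; days without payments get NA, then 0
exp_d <- merge(exp, CTS_date_V, by="Date", all.y = TRUE)
exp_d$Amount[is.na(exp_d$Amount)] <- 0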

head(exp_d[,c("Amount","Date")],20)
      Amount       Date
1        0.0 2000-07-01
2        0.0 2000-07-02
3        0.0 2000-07-03
4        0.0 2000-07-04
5   269909.4 2000-07-05
6   130021.9 2000-07-06
7  1454135.3 2000-07-06
8   140065.5 2000-07-07
9        0.0 2000-07-08
10       0.0 2000-07-09
11       0.0 2000-07-10
12  274147.2 2000-07-11
13  106959.2 2000-07-11
14  119208.6 2000-07-12
15       0.0 2000-07-13
16       0.0 2000-07-14
17       0.0 2000-07-15
18  125402.5 2000-07-16
19 1170603.1 2000-07-16
20 1908463.3 2000-07-16
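
Note that some dates in this output appear more than once (2000-07-06, for example), so the index is still not one row per day. A minimal sketch of collapsing the payments to a single total per day before joining them onto the calendar, assuming a data frame with `Date` and `Amount` columns like the one above. The toy `payments` data and the names `daily`, `calendar`, and `daily_full` are made up for illustration.

```r
# Toy stand-in for the payment data (values invented for illustration)
payments <- data.frame(
  Date   = as.Date(c("2000-07-05", "2000-07-06", "2000-07-06", "2000-07-11")),
  Amount = c(100, 50, 25, 10)
)

# Sum all payments that fall on the same day
daily <- aggregate(Amount ~ Date, data = payments, FUN = sum)

# Full daily calendar covering the observed range
calendar <- data.frame(Date = seq(min(daily$Date), max(daily$Date), by = "day"))

# Join the totals onto the calendar; days without payments become 0
daily_full <- merge(calendar, daily, by = "Date", all.x = TRUE)
daily_full$Amount[is.na(daily_full$Amount)] <- 0

# Each date now appears exactly once
stopifnot(!anyDuplicated(daily_full$Date))
```

Whether summing is the right way to combine same-day payments depends on the analysis; it is only one plausible choice.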

Most of the forecasting packages I am familiar with (and the questions I have found so far on SO), such as fpp, forecast, timeSeries, tseries, and xts, require a much more orderly Date column to pass as order.by, or have some other such requirement.

My concern is over the appropriateness of the R package, not the statistical method. For example, I've tried a few different ways of building the time series objects the forecasting packages need, including xts and ts, and all of them have issues with the frequency or the periodicity, or ask for order.by.

Update:

I created the object with

exp_xts <- xts(exp_d$Amount, start = min(exp$Date), end = max(exp$Date), order.by=exp_d$Date, colnames = "Amount", frequency = "") 

head(exp_xts,15)
                [,1]
2000-07-01       0.0
2000-07-02       0.0
2000-07-03       0.0
2000-07-04       0.0
2000-07-05  269909.4
2000-07-06  130021.9
2000-07-06 1454135.3
2000-07-07  140065.5
2000-07-08       0.0
2000-07-09       0.0
2000-07-10       0.0
2000-07-11  274147.2
2000-07-11  106959.2
2000-07-12  119208.6
2000-07-13       0.0

without an issue, and that object can be plotted with plot.xts(), but when I try

fit_xts <- stl(exp_xts, s.window="periodic",robust = T) 

I get:

Error in if (frequency > 1 && abs(frequency - round(frequency)) < ts.eps) frequency <- round(frequency) : missing value where TRUE/FALSE needed
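
The condition quoted in this error is the frequency check inside `ts()`, and `stl()` needs a regular series with a numeric frequency greater than 1. A minimal sketch of a call that runs, assuming the data has already been reduced to one value per day and that a weekly cycle (frequency = 7) is a reasonable seasonal period. Both are assumptions, and the synthetic `amount` values are invented for illustration.

```r
# Synthetic daily totals with one value per day (invented for illustration)
set.seed(1)
dates  <- seq(as.Date("2000-07-01"), by = "day", length.out = 70)
amount <- 100 + 20 * (as.POSIXlt(dates)$wday == 1) + rnorm(length(dates))

# Assumption: a weekly cycle, so 7 observations per period
amount_ts <- ts(amount, frequency = 7)

# With a numeric frequency > 1 and more than two full periods, stl() can run
fit <- stl(amount_ts, s.window = "periodic", robust = TRUE)

# The decomposition has seasonal, trend, and remainder columns
head(fit$time.series)
```

A `ts` object carries no calendar dates, only a start and a frequency, so the dates have to be kept alongside it if they are needed later.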

Recommended answer

I tried using time series objects in R for a Kaggle competition. What I found was that the various time series forecasting methods didn't work well for me. What did work was to create a normal R data frame and train a neural network on contextual data, such as temperature, day of the week, day of the year, whether the day is a holiday, and so on.
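
As a rough illustration of what such a data frame might look like for daily payment data, the sketch below derives a few calendar features in base R. The `features` data frame, its values, and the `holidays` vector are hypothetical; the answer does not say how its features were actually built.

```r
# Hypothetical daily data frame; Amount values are invented
features <- data.frame(
  Date   = seq(as.Date("2000-07-01"), by = "day", length.out = 10),
  Amount = c(0, 0, 0, 0, 100, 75, 40, 0, 0, 0)
)

lt <- as.POSIXlt(features$Date)
features$day_of_week <- lt$wday        # 0 = Sunday ... 6 = Saturday
features$day_of_year <- lt$yday + 1    # 1-based day of the year
features$is_weekend  <- lt$wday %in% c(0, 6)

# Hypothetical holiday list; replace with the real calendar for the data
holidays <- as.Date(c("2000-07-04"))
features$is_holiday <- features$Date %in% holidays

str(features)
```

Columns like these can be passed to any regression or neural network routine that accepts an ordinary data frame.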

What this could mean for you: since you're not doing prediction but simple statistical analysis, maybe you don't need the time series functionality at all and could simply use a standard R data frame?
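
For instance, summaries by calendar month need nothing beyond a data frame and a grouping column. A minimal sketch under the assumption that the data has `Date` and `Amount` columns; the toy values and the `Month` and `n_payments` column names are invented for illustration.

```r
# Toy payment data (values invented for illustration)
payments <- data.frame(
  Date   = as.Date(c("2000-07-05", "2000-07-06", "2000-07-06", "2000-08-02")),
  Amount = c(100, 50, 25, 10)
)

# Group key: calendar month as "YYYY-MM"
payments$Month <- format(payments$Date, "%Y-%m")

# Total paid per month
monthly <- aggregate(Amount ~ Month, data = payments, FUN = sum)

# Number of payments per month, aligned to the same month order
monthly$n_payments <- as.vector(table(payments$Month)[monthly$Month])

print(monthly)
```

Irregular spacing and repeated dates are not a problem here, because nothing in the data frame assumes a fixed sampling interval.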

I came 9th in the end, using a standard data frame and a neural net, no time series stuff :-)
