R: time series with duplicate time index entries

Problem description

I am a n00b at R and a n00b at Stack Overflow (just joined), so forgive me if I have failed to use markup (which I don't know) or missed something in the readme.

If you don't mind, I will go through my full problem here, as perhaps you might be kind enough to shed some insight into how I should best go about it!

Stage 1

Construction of individual time-series objects for each TS. Please find a data example below. Essentially, I am loading a csv file with multiple, irregular time series in it (TS1 and TS2 in the example), so in an ideal world I would split these into individual, irregular time-series objects (e.g. zoo), i.e. TS1, TS2, ... This problem was discussed here (R/zoo: handle non-unique index entries but not lose data?), but I have tried repeatedly to use that approach and failed.

 Date TS Data 
 21/05/2014 TS1 0.95  
 17/04/2014 TS1 1.02   
 27/03/2014 TS1 0.90   
 30/01/2014 TS1 0.80   
 12/12/2013 TS1 0.70  
 18/09/2013 TS1 0.67  
 01/11/2012 TS1 0.71  
 01/11/2012 TS1 0.70  
 21/05/2014 TS2 0.47  
 20/05/2014 TS2 0.51  
 16/05/2014 TS2 0.49  
 15/05/2014 TS2 0.55  
 10/05/2014 TS2 0.63  
 07/05/2014 TS2 0.77  

As can be seen, the problem arises from the duplicate date index of 01/11/2012 for TS1, which causes read.zoo not to create my split data object.
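As a diagnostic only (my addition, assuming the csv has been read into a data frame DF with columns Date, TS and Data), one way to locate the offending rows before calling read.zoo is to flag duplicated (Date, TS) pairs:

# flag rows whose (Date, TS) combination occurs more than once
dups <- duplicated(DF[c("Date", "TS")]) | duplicated(DF[c("Date", "TS")], fromLast = TRUE)
DF[dups, ]   # in the example data this shows the two 01/11/2012 TS1 rows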

Stage 2

What I would then like to do is, on every irregular date, add together all the data as of that date. Since all the time series are irregular, and with different regularity, I would like to use the prior value for a TS. E.g. for 21/05/2014 the calculation in the example is straightforward, as both TS1 and TS2 have an entry, so the answer is 0.47 + 0.95. But for 20/05/2014 only TS2 has an entry, so the value that should be used for TS1 is the most recent one as of that date, i.e. the 17/04/2014 value of 1.02, and the calculation for 20/05/2014 should therefore be 0.51 + 1.02. The simplest way of achieving this might be to convert each TS into a daily series, so that the previous value is carried forward until a new data point arrives, but this is wasteful/unnecessary for Stage 3 below.
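A minimal illustration of the carry-forward idea on two toy zoo series (not the real data): merge aligns the irregular dates, na.locf fills each series with its last observation, and rowSums adds them.

library(zoo)
ts1 <- zoo(c(1.02, 0.95), as.Date(c("2014-04-17", "2014-05-21")))
ts2 <- zoo(c(0.51, 0.47), as.Date(c("2014-05-20", "2014-05-21")))
m <- merge(ts1, ts2)                # union of dates, NA where a series has no entry
rowSums(na.locf(m), na.rm = TRUE)   # 20/05: 1.02 + 0.51 = 1.53; 21/05: 0.95 + 0.47 = 1.42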

Stage 3

Having created this aggregated data sum of all the TS's, I want to do a polynomial curve fit. I also want to differentiate this curve fit to find the rate of change, as of today's date, predicted by the fitted curve.
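A minimal sketch of one possible polynomial approach (my assumption, not necessarily the right degree or method): fit a raw cubic in numeric time with lm, then differentiate the fitted coefficients analytically at today's date. zsum is assumed to be the aggregated series from Stage 2.

x <- as.numeric(time(zsum) - start(zsum))        # days since the first observation
y <- coredata(zsum)
fit <- lm(y ~ poly(x, 3, raw = TRUE))            # cubic: y = b0 + b1*x + b2*x^2 + b3*x^3
b <- coef(fit)
x0 <- as.numeric(Sys.Date() - start(zsum))       # today, on the same day scale
rate <- b[2] + 2 * b[3] * x0 + 3 * b[4] * x0^2   # dy/dx of the cubic at today's date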

Any help would be much appreciated! I feel that repeatedly hitting my head against a wall would be more fun than doing anything more at this stage!!

Thanks

Update: thanks to Grothendieck, I now have code as follows.

library(scales)
library(zoo)
library(ggplot2)

f <- function(z) {
  # read one series, averaging values that share the same date
  zz <- read.zoo(z, header = TRUE, split = 2, format = "%d/%m/%Y", aggregate = mean)
  z.fill <- na.locf(zz)                 # carry the last observation forward
  z.fill <- (z.fill >= 0.5) * z.fill    # zero out values below 0.5
  z.fill <- na.fill(z.fill, 0)          # replace any remaining NAs with 0
  zfill.mat <- matrix(z.fill, NROW(z.fill))
  z.sum <- rowSums(zfill.mat)           # add the columns together, row by row
  zsum <- zoo(z.sum, time(z.fill))
  return(zsum)
}

DF <- read.csv(file.choose(), header = TRUE, as.is = TRUE)
DF.S <- split(DF[-2], DF[[2]])          # one data frame per value of the TS column
user <- DF[1, 2]                        # name of the first TS in the file
Ret <- lapply(DF.S, f)

One problem remains:

Ret contains a list of data frames. I can access an element by typing Ret$user, but since user varies, I need to make this dynamic. I have tried to construct a dynamic expression, e.g.:

x <- paste("Ret$'", user, "'", sep = "")
plot(x)

but could not get this to evaluate.
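When the element name is held in a variable, [[ indexing avoids building an expression string altogether; a minimal sketch of what should work here:

x <- Ret[[user]]   # look up the element of Ret named by the string in 'user'
plot(x)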

Recommended answer

read.zoo has an aggregate= argument which takes a function used to aggregate the values at duplicate times within the same series. Here we take the mean of duplicate days within a series, but you could use sum or any other function. (If the data were coming from a file, we would replace the text = Lines argument in read.zoo with something like "myfile.dat".) Then we use na.locf to fill in the NAs, sum the rows, and use na.omit to drop any leading NAs, giving zsum. Next we compute a regularly spaced time grid g and a spline function splfun, evaluating that function and its derivative on the grid, which, after converting back to zoo, gives zspl and zder. Finally we plot them.

Lines <- "Date TS Data 
 21/05/2014 TS1 0.95  
 17/04/2014 TS1 1.02   
 27/03/2014 TS1 0.90   
 30/01/2014 TS1 0.80   
 12/12/2013 TS1 0.70  
 18/09/2013 TS1 0.67  
 01/11/2012 TS1 0.71  
 01/11/2012 TS1 0.70  
 21/05/2014 TS2 0.47  
 20/05/2014 TS2 0.51  
 16/05/2014 TS2 0.49  
 15/05/2014 TS2 0.55  
 10/05/2014 TS2 0.63  
 07/05/2014 TS2 0.77"

library(zoo)

z <- read.zoo(text = Lines, header = TRUE, split = 2, format = "%d/%m/%Y",
       aggregate = mean)
zsum <- na.omit(zoo(rowSums(na.locf(z)), time(z)))

g <- seq(start(zsum), end(zsum), "day")
splfun <- splinefun(time(zsum), coredata(zsum))
zspl <- zoo(splfun(g), g)
zder <- zoo(splfun(g, deriv = 1), g)

plot(merge(zspl, zder))
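If the rate of change at today's date specifically is wanted, one possible follow-up (my addition, assuming it is acceptable to extrapolate the spline beyond the last observation of 21/05/2014) is to evaluate the spline's derivative directly at the numeric value of today's date:

rate_today <- splfun(as.numeric(Sys.Date()), deriv = 1)  # extrapolated slope at today's date
rate_today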
