Excel or R: Merge time series with missing values
Problem description
I have multiple somewhat irregular time series (each in a CSV file) like so:
X.csv
date,time,value
01/01/04,00:15:00,4.98
01/01/04,00:25:00,4.981
01/01/04,00:35:00,4.983
01/01/04,00:55:00,4.986
and so:
Y.csv
date,time,value
01/01/04,00:05:00,9.023
01/01/04,00:15:00,9.022
01/01/04,00:35:00,9.02
01/01/04,00:45:00,9.02
01/01/04,00:55:00,9.019
Notice how there's basically a granularity of 10 mins in both files, but each has some missing entries.
I would now like to merge these two time series to achieve the following:
date,time,X,Y
01/01/04,00:05:00,NA,9.023
01/01/04,00:15:00,4.98,9.022
01/01/04,00:25:00,4.981,NA
01/01/04,00:35:00,4.983,9.02
01/01/04,00:45:00,NA,9.02
01/01/04,00:55:00,4.986,9.019
Is there an easy way of achieving this? Since I have multiple files (not just two), is there a way of doing this for a batch of files?
Solution
Getting your data:
X <- read.table(pipe("pbpaste"), sep=",", header=TRUE)  # pbpaste reads the macOS clipboard
X$date <- as.POSIXct(paste(as.Date(X$date, format='%m/%d/%y'), X$time))
gets us
> X
                 date     time value
1 2004-01-01 00:15:00 00:15:00 4.980
2 2004-01-01 00:25:00 00:25:00 4.981
3 2004-01-01 00:35:00 00:35:00 4.983
4 2004-01-01 00:55:00 00:55:00 4.986
same with Y:
> Y
                 date     time value
1 2004-01-01 00:05:00 00:05:00 9.023
2 2004-01-01 00:15:00 00:15:00 9.022
3 2004-01-01 00:35:00 00:35:00 9.020
4 2004-01-01 00:45:00 00:45:00 9.020
5 2004-01-01 00:55:00 00:55:00 9.019
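Since pbpaste is only available on macOS, here is a minimal sketch for reproducing X and Y without the clipboard. It assumes the CSV text from the question is pasted inline; the helper name read_series_text and the choice of tz = "UTC" are illustrative assumptions, not part of the original answer.

```r
# Inline copies of X.csv and Y.csv from the question (no clipboard needed).
x_csv <- "date,time,value
01/01/04,00:15:00,4.98
01/01/04,00:25:00,4.981
01/01/04,00:35:00,4.983
01/01/04,00:55:00,4.986"

y_csv <- "date,time,value
01/01/04,00:05:00,9.023
01/01/04,00:15:00,9.022
01/01/04,00:35:00,9.02
01/01/04,00:45:00,9.02
01/01/04,00:55:00,9.019"

# Hypothetical helper: parse CSV text and replace the date column with a
# full POSIXct timestamp. tz = "UTC" is an assumption made for this sketch.
read_series_text <- function(txt) {
  df <- read.csv(text = txt, stringsAsFactors = FALSE)
  df$date <- as.POSIXct(paste(df$date, df$time),
                        format = "%m/%d/%y %H:%M:%S", tz = "UTC")
  df
}

X <- read_series_text(x_csv)
Y <- read_series_text(y_csv)
```

The resulting data frames keep the value in the third column, so the X[,3] and Y[,3] indexing used in the merge step still applies.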
Now convert X and Y to xts objects and merge the two with an outer join to get all the data points:
library(xts)
result <- merge(as.xts(X[,3], order.by = X$date), as.xts(Y[,3], order.by = Y$date), join = 'outer')
names(result) <- c('x','y')
The last step is to sum the values by rows:
result$bothXY <- rowSums(result,na.rm=T)
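Note that rowSums with na.rm = TRUE treats a missing value as zero, so the summed column no longer shows which series was missing at a given time. A small sketch on a plain matrix (the values are the first three merged rows from the question; the matrix m is only for illustration):

```r
# First three merged rows: x is missing at 00:05, y is missing at 00:25.
m <- cbind(x = c(NA, 4.98, 4.981),
           y = c(9.023, 9.022, NA))

with_na_removed <- rowSums(m, na.rm = TRUE)  # NA counts as 0
with_na_kept    <- rowSums(m)                # NA propagates to the row sum
```

If the goal is the date,time,X,Y layout from the question rather than a total, the two-column merged object from the previous step already holds it, with NA marking the gaps.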
If you don’t need the x,y columns anymore:
result <- result[,3]
and you get:
> result
                    bothXY
2004-01-01 00:05:00  9.023
2004-01-01 00:15:00 14.002
2004-01-01 00:25:00  4.981
2004-01-01 00:35:00 14.003
2004-01-01 00:45:00  9.020
2004-01-01 00:55:00 14.005
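The question also asks about a batch of files. One possible approach, sketched here with base R's merge(..., all = TRUE) folded over a list with Reduce rather than with xts, is to read every CSV in a directory and outer-join them on the timestamp. The helper read_series, the column name stamp, the temporary directory, and tz = "UTC" are all assumptions made for this illustration.

```r
# Hypothetical helper: read one file, build a POSIXct timestamp, and name
# the value column after the file (X.csv -> column "X").
read_series <- function(path) {
  df <- read.csv(path, stringsAsFactors = FALSE)
  stamp <- as.POSIXct(paste(df$date, df$time),
                      format = "%m/%d/%y %H:%M:%S", tz = "UTC")
  out <- data.frame(stamp = stamp, value = df$value)
  names(out)[2] <- tools::file_path_sans_ext(basename(path))
  out
}

# Demo input: write the two files from the question to a temporary directory.
csv_dir <- tempfile("series")
dir.create(csv_dir)
writeLines(c("date,time,value",
             "01/01/04,00:15:00,4.98", "01/01/04,00:25:00,4.981",
             "01/01/04,00:35:00,4.983", "01/01/04,00:55:00,4.986"),
           file.path(csv_dir, "X.csv"))
writeLines(c("date,time,value",
             "01/01/04,00:05:00,9.023", "01/01/04,00:15:00,9.022",
             "01/01/04,00:35:00,9.02", "01/01/04,00:45:00,9.02",
             "01/01/04,00:55:00,9.019"),
           file.path(csv_dir, "Y.csv"))

# Outer-join every file on the timestamp; all = TRUE keeps unmatched rows as NA.
files  <- sort(list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE))
merged <- Reduce(function(a, b) merge(a, b, by = "stamp", all = TRUE),
                 lapply(files, read_series))
merged <- merged[order(merged$stamp), ]  # ensure chronological order
```

Adding more files to the directory adds one column per file. The result is an ordinary data frame, so write.csv(merged, ...) can export it if the combined table is needed outside R.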