Time Series Data Missing Time Values and Data Values


Problem description

I have the following sample from a time-series dataset:

ymd      rf
19820103  3
19820104  9
19820118  4
19820119  2
19820122  0
19820218  5

The dataset is supposed to be a daily time series. More specifically, ymd should run continuously from 19820101 through the end of February 1982 (19820228). However, as the sample above shows, the dataset is not continuous: days such as "19820101" and "19820102" are absent. For every date on which no data is available, I'd like to add the missing day and enter a value of "0" for rf.

What would be the best way to write a script that automates this? I'll have to do it for daily time-series datasets covering 1979 through 2016.

Recommended answer

Let's assume your data is in a data frame named "mydata". Then you could do the following:

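# Parse ymd (yyyymmdd numbers or strings) into Date objects first;
# a plain numeric seq() would also generate invalid values such as 19820132
mydata$ymd <- as.Date(as.character(mydata$ymd), format = "%Y%m%d")

# Create the full daily sequence covering the observed date range
ymd.full <- data.frame(ymd = seq(min(mydata$ymd), max(mydata$ymd), by = "day"))

# Merge both datasets, keeping every day of the full sequence
mydata <- merge(ymd.full, mydata, by = "ymd", all.x = TRUE)

# Replace the missing rf values with 0
mydata$rf[is.na(mydata$rf)] <- 0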
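
The question asks for a fixed range that starts at 19820101, whereas a range derived from the minimum and maximum of the data starts at the first observed date. The following is a minimal sketch of one way to handle that, assuming ymd holds yyyymmdd values and rf is the only data column. The helper name fill_daily and its start and end arguments are hypothetical and not part of the original answer.

```r
# Hypothetical helper (illustrative name, not from the original answer):
# fills every missing day between start and end with rf = 0.
fill_daily <- function(df, start, end) {
  # Parse yyyymmdd values (numeric or character) into Date objects
  df$ymd <- as.Date(as.character(df$ymd), format = "%Y%m%d")

  # Build every calendar day in the requested range
  full <- data.frame(ymd = seq(as.Date(start), as.Date(end), by = "day"))

  # Keep every day of the range; days without observations get NA for rf
  out <- merge(full, df, by = "ymd", all.x = TRUE)

  # Replace the missing rf values with 0
  out$rf[is.na(out$rf)] <- 0

  # Convert the dates back to the original yyyymmdd integer layout
  out$ymd <- as.integer(format(out$ymd, "%Y%m%d"))
  out
}

# Sample data from the question
mydata <- data.frame(
  ymd = c(19820103, 19820104, 19820118, 19820119, 19820122, 19820218),
  rf  = c(3, 9, 4, 2, 0, 5)
)

filled <- fill_daily(mydata, "1982-01-01", "1982-02-28")
head(filled)
```

For the 1979 through 2016 datasets, the same function could be called with start = "1979-01-01" and end = "2016-12-31". Because the sequence is built from Date objects, month lengths and leap years are handled by R rather than by arithmetic on the yyyymmdd numbers.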
