Insert rows for missing dates/times


Problem description

I am new to R but have turned to it to solve a problem with a large data set I am trying to process. Currently I have 4 columns of data (Y values) set against minute-interval timestamps (month/day/year hour:min) (X values), as below:

    timestamp          tr            tt         sr         st  
1   9/1/01 0:00   1.018269e+02   -312.8622   -1959.393   4959.828  
2   9/1/01 0:01   1.023567e+02   -313.0002   -1957.755   4958.935  
3   9/1/01 0:02   1.018857e+02   -313.9406   -1956.799   4959.938  
4   9/1/01 0:03   1.025463e+02   -310.9261   -1957.347   4961.095  
5   9/1/01 0:04   1.010228e+02   -311.5469   -1957.786   4959.078

The problem I have is that some timestamp values are missing. For example, there may be a gap between 9/1/01 0:13 and 9/1/01 0:27, and such gaps occur irregularly throughout the data set. I need to put several of these series into the same database, and because the missing values are different for each series, the dates do not currently align on each row.

I would like to generate rows for these missing timestamps and fill the Y columns with blank values (no data, not zero), so that I have a continuous time series.

I'm honestly not quite sure where to start (I have not really used R before, so I am learning as I go along!) but any help would be much appreciated. So far I have installed chron and zoo, since it seems they might be useful.

Thanks!

Recommended answer

I think the easiest approach is to convert the timestamp column to a date-time class first, as already described, then convert the data to a zoo object, and finally merge it with a complete sequence of timestamps:

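library(zoo)

df$timestamp <- as.POSIXct(df$timestamp, format = "%m/%d/%y %H:%M")  # parse the timestamp strings

df1.zoo <- zoo(df[, -1], df[, 1])  # value columns as data, timestamps as the index

df2 <- merge(df1.zoo, zoo(, seq(start(df1.zoo), end(df1.zoo), by = "min")), all = TRUE)  # pad to every minute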

start() and end() take the first and last timestamps from df1.zoo (your original data), and by sets the step of the sequence, "min" in this example. With all=TRUE, the merge keeps every timestamp in that sequence, so the timestamps missing from your data appear as rows filled with NA.
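To check the behaviour on something small, here is a minimal sketch that applies the same steps to made-up data. The toy values, the two-column layout, the variable names (z, grid, filled, out), and tz = "UTC" are assumptions for illustration only; the explicit as.matrix() call and the conversion back to a data frame are additions that are not part of the answer above.

```r
library(zoo)

# Toy data in the question's layout; 0:02 and 0:03 are deliberately missing.
df <- data.frame(
  timestamp = c("9/1/01 0:00", "9/1/01 0:01", "9/1/01 0:04"),
  tr = c(101.8269, 102.3567, 101.0228),
  tt = c(-312.8622, -313.0002, -311.5469),
  stringsAsFactors = FALSE
)

# Parse the timestamps. tz = "UTC" is an assumption that keeps the result
# independent of the local time zone and of daylight-saving changes.
df$timestamp <- as.POSIXct(df$timestamp, format = "%m/%d/%y %H:%M", tz = "UTC")

# Value columns as a numeric matrix, indexed by timestamp.
z <- zoo(as.matrix(df[, -1]), order.by = df$timestamp)

# Every minute between the first and last observation.
grid <- seq(start(z), end(z), by = "min")

# Merging with an empty zoo series on the full grid adds the missing rows as NA.
filled <- merge(z, zoo(, grid), all = TRUE)

# Optional: convert back to a plain data frame.
out <- data.frame(timestamp = index(filled), coredata(filled))
print(out)
```

With this toy input, out should have five rows, one per minute from 0:00 to 0:04, with NA in the value columns for 0:02 and 0:03. Running the same steps on each series gives them a shared, gap-free minute index, which is what lets the rows align across series.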
