R/zoo: index entries in ‘order.by’ are not unique


Problem description


I have a .csv file containing 4 columns of data against a column of dates/times at one-minute intervals. Some timestamps are missing, so I'm trying to generate the missing dates/times and assign them NA values in the Y columns. I have previously done this with other .csv files with exactly the same formatting, with no issues. The code is:

# load zoo and read the csv file (fpath holds the path to the .csv)
library(zoo)
har10 <- read.csv(fpath, header = TRUE)

# set date
har10$HAR.TS<-as.POSIXct(har10$HAR.TS,format="%y/%m/%d %H:%M")

# convert to zoo
df1.zoo<-zoo(har10[,-1],har10[,1]) #set date to Index

# merge and generate NAs
df2 <- merge(df1.zoo,zoo(,seq(start(df1.zoo),end(df1.zoo),by="min")), all=TRUE)

# write zoo object to .csv file in Home directory
write.zoo(df2, file = "har10fixed.csv", sep = ",")


My data looks like this (for an entire year, more or less) after conversion to POSIXct, which seems to go fine:

                    HAR.TS        C1       C2         C3        C4
1      2010-01-01 00:00:00 -4390.659 5042.423 -2241.6344 -2368.762
2      2010-01-01 00:01:00 -4391.711 5042.056 -2241.1796 -2366.725
3      2010-01-01 00:02:00 -4390.354 5043.003 -2242.5493 -2368.786
4      2010-01-01 00:03:00 -4390.337 5038.570 -2242.7653 -2371.289


When I run the "convert to zoo" step, I get the following warning:

 Warning message:
 In zoo(har10[, -1], har10[, 1]) :
   some methods for "zoo" objects do not work if the index entries in ‘order.by’ are not unique


I have checked for duplicated entries but get no results:

> anyDuplicated(har10)
[1] 0


Any ideas? I have no idea why I'm getting this error on this file, but it has worked for previous ones. Thanks!

Reproducible form:


EDIT 2: Have to remove the data/code, sorry!

Recommended answer


anyDuplicated(har10) only tells you whether any complete rows are duplicated. The zoo warning is about the index, so check the timestamp column instead: anyDuplicated(har10$HAR.TS). Running sum(duplicated(har10$HAR.TS)) shows there are almost 9,000 duplicate datetimes. The first duplicate is around row 311811, where 10/08/19 13:10 appears twice.
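To make the difference between the row-level and index-level checks concrete, here is a minimal sketch on a made-up four-row stand-in for har10. The values, the tz = "UTC" setting, and the choice to keep the first row per timestamp are all assumptions for illustration; the answer itself only diagnoses the duplicates and does not say how to resolve them. Averaging the duplicated rows would be another reasonable option.

```r
# Toy stand-in for har10; the numbers are invented for illustration.
har10 <- data.frame(
  HAR.TS = c("10/08/19 13:09", "10/08/19 13:10", "10/08/19 13:10", "10/08/19 13:12"),
  C1 = c(1.0, 2.0, 2.5, 4.0),
  stringsAsFactors = FALSE
)
# tz = "UTC" is an assumption here, used only to keep the example deterministic.
har10$HAR.TS <- as.POSIXct(har10$HAR.TS, format = "%y/%m/%d %H:%M", tz = "UTC")

# Row-level check: rows 2 and 3 share a timestamp but differ in C1,
# so no complete row is duplicated.
row_dup <- anyDuplicated(har10)            # 0

# Index-level check: the timestamp column alone does repeat.
first_dup <- anyDuplicated(har10$HAR.TS)   # 3 (position of the first repeat)
n_dup <- sum(duplicated(har10$HAR.TS))     # 1

# One possible fix (an assumption, not part of the answer):
# keep only the first row seen for each timestamp.
har10_unique <- har10[!duplicated(har10$HAR.TS), ]

# With a unique index, the original zoo/merge steps can be retried.
if (requireNamespace("zoo", quietly = TRUE)) {
  df1.zoo <- zoo::zoo(har10_unique[, -1], har10_unique[, 1])
  grid <- zoo::zoo(, seq(start(df1.zoo), end(df1.zoo), by = "min"))
  df2 <- merge(df1.zoo, grid, all = TRUE)  # missing minutes become NA
}
```

Before dropping rows on the real data, it is worth inspecting the duplicated timestamps, for example with har10[duplicated(har10$HAR.TS) | duplicated(har10$HAR.TS, fromLast = TRUE), ], to decide which of the conflicting readings should be kept.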
