R/zoo: index entries in ‘order.by’ are not unique
Question
I have a .csv file containing 4 columns of data against a column of dates/times at one-minute intervals. Some timestamps are missing, so I'm trying to generate the missing dates/times and assign them NA values in the Y columns. I have previously done this with other .csv files with exactly the same formatting, with no issues. The code is:
# load zoo for zoo(), merge() and write.zoo()
library(zoo)
# read the csv file
har10 <- read.csv(fpath, header = TRUE)
# set date
har10$HAR.TS <- as.POSIXct(har10$HAR.TS, format = "%y/%m/%d %H:%M")
# convert to zoo, using the date column as the index
df1.zoo <- zoo(har10[, -1], har10[, 1])
# merge with a complete one-minute sequence so missing timestamps become NA rows
df2 <- merge(df1.zoo, zoo(, seq(start(df1.zoo), end(df1.zoo), by = "min")), all = TRUE)
# write zoo object to .csv file in Home directory
write.zoo(df2, file = "har10fixed.csv", sep = ",")
My data looks like this (for an entire year, more or less) after conversion to POSIXct, which seems to go fine:
HAR.TS C1 C2 C3 C4
1 2010-01-01 00:00:00 -4390.659 5042.423 -2241.6344 -2368.762
2 2010-01-01 00:01:00 -4391.711 5042.056 -2241.1796 -2366.725
3 2010-01-01 00:02:00 -4390.354 5043.003 -2242.5493 -2368.786
4 2010-01-01 00:03:00 -4390.337 5038.570 -2242.7653 -2371.289
When I run the "convert to zoo" step, I get the following message:
Warning message:
In zoo(har10[, -1], har10[, 1]) :
some methods for "zoo" objects do not work if the index entries in ‘order.by’ are not unique
I have checked for duplicated entries but get no results:
> anyDuplicated(har10)
[1] 0
Any ideas? I have no idea why I'm getting this error on this file, but it has worked for previous ones. Thanks!
EDIT: In reproducible form:
EDIT 2: Have to remove the data/code, sorry!
Recommended answer
anyDuplicated(har10) tells you whether any complete rows are duplicated. zoo is warning about the index, so you should run anyDuplicated(har10$HAR.TS) instead. sum(duplicated(har10$HAR.TS)) will show there are almost 9,000 duplicate datetimes. The first duplicate is around row 311811, where 10/08/19 13:10 appears twice.
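To make the difference between the two checks concrete, here is a minimal sketch in base R. The data frame har and its values are made up for illustration (only the column names HAR.TS and C1 and the timestamp format come from the question), and tz = "UTC" is an assumption added so the result does not depend on the local time zone. Dropping the repeated timestamps at the end is just one possible way to get a unique index; whether keeping the first reading is appropriate depends on why the duplicates are there.

```r
# Hypothetical toy data: 10/08/19 13:10 appears twice, with different readings.
har <- data.frame(
  HAR.TS = c("10/08/19 13:09", "10/08/19 13:10", "10/08/19 13:10", "10/08/19 13:11"),
  C1     = c(1.0, 2.0, 2.5, 3.0),
  stringsAsFactors = FALSE
)
# tz = "UTC" is an assumption made for reproducibility.
har$HAR.TS <- as.POSIXct(har$HAR.TS, format = "%y/%m/%d %H:%M", tz = "UTC")

# Whole-row check: no two rows are identical in every column, so this is 0.
row_dups <- anyDuplicated(har)

# Index-only check: position of the first repeated timestamp (0 if none).
first_dup <- anyDuplicated(har$HAR.TS)

# Number of timestamps that repeat an earlier one.
n_dups <- sum(duplicated(har$HAR.TS))

# All rows that share a timestamp with another row, for inspection.
dup_times <- har$HAR.TS[duplicated(har$HAR.TS)]
dup_rows  <- har[har$HAR.TS %in% dup_times, ]

# One possible fix (an assumption, not part of the original answer):
# keep only the first row for each timestamp.
har_unique <- har[!duplicated(har$HAR.TS), ]

print(c(row_dups = row_dups, first_dup = first_dup, n_dups = n_dups))
print(dup_rows)
```

Here the whole-row check reports 0 even though the index check finds a repeat at row 3, which is the situation described in the question. Once the index is unique, as in har_unique, building the zoo object should no longer trigger the warning.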