Grouped mean of difftime fails in data.table
Problem description
I have a column in a data.table of difftime values with units set to days. I am trying to create another data.table summarizing the values with
dt2 <- dt[, .(AvgTime = mean(DiffTime)), by = Group]
When printing the new data.table, I see values such as
1.925988e+00 days
1.143287e+00 days
1.453975e+01 days
I would like to limit the number of decimal places for this column only (i.e. without setting options(), unless I can do that specifically for difftime values). When I try to do this by modifying the call above, e.g.
dt2 <- dt[, .(AvgTime = round(mean(DiffTime), 2)), by = Group]
I am left with NA values, and both the base round() and format() functions return the warning:
In mean(DiffTime) : argument is not numeric or logical.
Oddly enough, if I perform the same operation on a numeric field, it runs with no problems. Also, if I split the operation into two separate lines of code, I can accomplish what I am looking to do:
dt2 <- dt[, .(AvgTime = mean(DiffTime)), by = Group]
dt2[, AvgTime := round(AvgTime, 2)]
Reproducible Example:
library(data.table)
set.seed(1)
dt <- data.table(
  Date1 =
    sample(seq(as.Date('2017/10/01'),
               as.Date('2017/10/31'),
               by = "days"), 24, replace = FALSE) +
    abs(rnorm(24)) / 10,
  Date2 =
    sample(seq(as.Date('2017/10/01'),
               as.Date('2017/10/31'),
               by = "days"), 24, replace = FALSE) +
    abs(rnorm(24)) / 10,
  Num1 =
    abs(rnorm(24)) * 10,
  Group =
    rep(LETTERS[1:4], each = 6)
)
dt[, DiffTime := abs(difftime(Date1, Date2, units = 'days'))]
# Warnings/NA:
class(dt$DiffTime) # "difftime"
dt2 <- dt[, .(AvgTime = round(mean(DiffTime), 2)), by = .(Group)]
# Works when numeric/not difftime:
class(dt$Num1) # "numeric"
dt2 <- dt[, .(AvgNum = round(mean(Num1), 2)), by = .(Group)]
# Works, but takes an additional step:
dt2 <- dt[, .(AvgTime = mean(DiffTime)), by = .(Group)]
dt2[, AvgTime := round(AvgTime, 2)]
# Works with base::mean:
class(dt$DiffTime) # "difftime"
dt2 <- dt[, .(AvgTime = round(base::mean(DiffTime), 2)), by = .(Group)]
Question:
Why am I not able to complete this conversion (rounding of the mean) in one step when the class is difftime? Am I missing something in my execution? Is this some sort of bug in data.table where it can't properly handle difftime?
Issue filed on GitHub.
Update: Issue appears to be cleared after updating from data.table version 1.10.4 to 1.12.8.
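To tell whether a given installation is likely affected, one option is to compare the installed data.table version against the release reported as containing the fix. This is only a sketch: the threshold 1.12.4 comes from the answer to this question, and the variable names are illustrative.

```r
# Compare the installed data.table version with the release that
# reportedly contains the fix (1.12.4, per the answer to this question).
fixed_in  <- package_version("1.12.4")
installed <- packageVersion("data.table")

if (installed < fixed_in) {
  message("data.table ", installed, " predates ", fixed_in,
          "; round(mean(<difftime>), 2) by group may return NA.")
} else {
  message("data.table ", installed, " should include the fix.")
}
```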
推荐答案
This was fixed by update #3567 on 2019/05/15 and is included in data.table version 1.12.4, released 2019/10/03.
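If upgrading is not an option, a version-independent workaround is to average plain numbers and restore the difftime class afterwards. The sketch below uses a small made-up table (not the data from the question). It also shows that base R's is.numeric() returns FALSE for difftime objects, which is at least consistent with the wording of the warning, although that is an observation rather than a confirmed diagnosis of the old behaviour.

```r
library(data.table)

# Toy data, invented for illustration only
dt <- data.table(
  Group    = rep(c("A", "B"), each = 3),
  DiffTime = as.difftime(c(1.5, 2.25, 3.126, 10, 11, 12.5), units = "days")
)

# difftime is not treated as numeric by base R's is.numeric()
is.numeric(dt$DiffTime)

# Average the underlying numbers, round, then restore the difftime class
dt2 <- dt[, .(AvgTime = as.difftime(
                round(mean(as.numeric(DiffTime, units = "days")), 2),
                units = "days")),
          by = Group]
print(dt2)
```

Calling base::mean explicitly, as in the reproducible example in the question, is a shorter alternative; the conversion above just makes the units explicit at each step.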