Is this a bug I found in data.table and integer64?

Question

I am having a lot of difficulties with data.table and integer64 (package bit64). My understanding is that integer64 cannot yet be used in a by clause, but I think I may also have found a bug in sorting: setkey() does not order an integer64 key column correctly.

library(data.table)
library(bit64)

test4 <- structure(list(IDFD = c("360627720722618433", "360627720722618433"
), CDVCA = c("2013-03-13T09:36:07.795", "2013-03-13T09:36:07.795"
), NUMSEQ = structure(c(1.05397451390436e-309, 1.05397443975625e-309
), class = "integer64")), .Names = c("IDFD", "CDVCA", "NUMSEQ"
), row.names = c(NA, -2L), class = "data.frame")

str(test4)
'data.frame':   2 obs. of  3 variables:
 $ IDFD  : chr  "360627720722618433" "360627720722618433"
 $ CDVCA : chr  "2013-03-13T09:36:07.795" "2013-03-13T09:36:07.795"
 $ NUMSEQ:Class 'integer64'  num [1:2] 1.05e-309 1.05e-309

test4 <- as.data.table(test4)

str(test4)
Classes ‘data.table’ and 'data.frame':  2 obs. of  3 variables:
 $ IDFD  : chr  "360627720722618433" "360627720722618433"
 $ CDVCA : chr  "2013-03-13T09:36:07.795" "2013-03-13T09:36:07.795"
 $ NUMSEQ:Class 'integer64'  num [1:2] 1.05e-309 1.05e-309
 - attr(*, ".internal.selfref")=<externalptr> 

setkey(test4,IDFD,CDVCA,NUMSEQ)
test4
                 IDFD                   CDVCA          NUMSEQ
1: 360627720722618433 2013-03-13T09:36:07.795 213326816542720
2: 360627720722618433 2013-03-13T09:36:07.795 213326801534975 #THIS IS NOT SORTED !!

Am I right?

Solution

Update: This is now implemented in v1.9.3 (available from R-Forge); see NEWS:

o bit64::integer64 now works in grouping and joins, #5369. Thanks to James Sams for highlighting UPCs and Clayton Stanley.
Reminder: fread() has been able to detect and read integer64 for a while.

On OP's example above:

test4
#                  IDFD                   CDVCA          NUMSEQ
# 1: 360627720722618433 2013-03-13T09:36:07.795 213326801534975 ## sorted right
# 2: 360627720722618433 2013-03-13T09:36:07.795 213326816542720
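
For anyone who wants to check this on their own installation, here is a minimal sketch, assuming a data.table version with integer64 support (v1.9.3 or later per the NEWS item above; the `text=` argument of fread() needs a more recent release still). The names `dt` and `csv` and the column `seq` are made up for illustration; the values are taken from the OP's example. The integer64 column is built from strings so that no precision is lost on the way in.

```r
library(data.table)
library(bit64)

# Illustrative table: integer64 values created from character strings.
dt <- data.table(
  IDFD   = rep("360627720722618433", 2),
  NUMSEQ = as.integer64(c("213326816542720", "213326801534975"))
)

# setkey() sorts by reference; with integer64 support the smaller
# NUMSEQ value should end up in the first row.
setkey(dt, IDFD, NUMSEQ)
print(dt)

# Grouping on an integer64 column, as described in the NEWS item.
print(dt[, .N, by = NUMSEQ])

# fread() can read large integers as integer64 (hypothetical inline data).
csv <- fread(
  text = "id,seq\n360627720722618433,213326816542720\n360627720722618433,213326801534975\n",
  integer64 = "integer64"
)
print(class(csv$seq))
```

If the first row of `dt` after setkey() holds 213326801534975, the key is being sorted as a 64-bit integer rather than by its underlying double representation.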
