R data.table select rows (integer comparison)

Views: 144 Published: 2017/3/12 13:08:36 Tags: r data.table


Question

When I try to select rows in a data.table (a package for R) by specifying the value of a field that consists of large integers, I get strange results: rows with similar integers are selected too.

require(data.table)
options(digits=15)
data <- data.table(A=c(1000200030001,1000200030002,1000200030003))

Try to access the first row by checking the value of A:

data[A==1000200030001]
               A
1: 1000200030001
2: 1000200030002
3: 1000200030003

All rows are returned.

The problem goes away when using as.numeric:

data[as.numeric(A)==1000200030001]
               A
1: 1000200030001

The problem is not present in the j part of data.table:

data[,A == 1000200030001]
[1]  TRUE FALSE FALSE

This seems to be a precision problem when comparing large numbers. I am confused that using as.numeric solves the issue, since str(data) shows that A is already of type numeric:

str(data)
Classes ‘data.table’ and 'data.frame':  3 obs. of  1 variable:
 $ A: num  1e+12 1e+12 1e+12
 - attr(*, ".internal.selfref")=<externalptr> 
 - attr(*, "index")= atomic  
  ..- attr(*, "A")= int
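The precision concern can be checked in base R, independently of data.table. The following is a minimal sketch; the variable names x, exceeds_int, below_2_53 and cmp are illustrative and do not come from the question. It shows that the three values are too large for R's 32-bit integer type but are still stored exactly as doubles, so a plain vector comparison tells them apart.

```r
# The three values from the question, stored as doubles (R's "numeric").
x <- c(1000200030001, 1000200030002, 1000200030003)

# They do not fit into R's 32-bit integer type ...
exceeds_int <- x > .Machine$integer.max

# ... but they are far below 2^53, so each one is stored exactly as a double.
below_2_53 <- x < 2^53

# A plain vector comparison therefore separates them, as in the j example.
cmp <- x == 1000200030001

print(exceeds_int)
print(below_2_53)
print(cmp)
```

This suggests the doubles themselves are not the problem, and that the unexpected matches arise from how the i expression is evaluated inside data.table.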

Any hints on how to ensure this problem does not appear in production code are appreciated!

UPDATE: The problem described above is not present when auto-indexing is disabled:

options(datatable.auto.index=FALSE)

However, disabling auto-indexing does not solve the problems with aggregation and merging/joining:

data[,.(B=sum(A)),A]
               A             B
1: 1000200030001 1000200030001

The correct output would be:

               A             B
1: 1000200030001 1000200030001
2: 1000200030002 1000200030002
3: 1000200030003 1000200030003
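One possible explanation, which the question does not state and which should be treated as an assumption here, is data.table's numeric rounding: when ordering, grouping or joining on double columns, some older versions ignored the trailing bytes of each value by default, which would make nearby large numbers fall into the same group. A hedged sketch of checking and disabling this with getNumericRounding() and setNumericRounding() follows; the name res is illustrative.

```r
library(data.table)

# Show how many trailing bytes of a double are ignored when ordering/grouping.
print(getNumericRounding())

# Request exact comparison of doubles (no bytes rounded away).
setNumericRounding(0L)

data <- data.table(A = c(1000200030001, 1000200030002, 1000200030003))

# Group by A; with rounding disabled each value should form its own group.
res <- data[, .(B = sum(A)), by = A]
print(res)
```

Whether this setting is needed depends on the installed data.table version, so it is worth checking the value that getNumericRounding() reports before relying on it.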

I found that the best solution to all of these problems is to use the bit64 package, as described in the selected answer. Thanks, everybody!

Recommended answer

Use bit64::integer64:

require(data.table)
options(digits=15)
library(bit64)
data <- fread("A
              1000200030001
              1000200030002
              1000200030003", colClasses = "integer64")


data[A == as.integer64("1000200030001")]
#A
#1: 1000200030001
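The comparison used in the answer can also be tried on a bare integer64 vector, without fread. This is a small sketch under the assumption that the values arrive as character strings; ids and hit are illustrative names, not part of the answer.

```r
library(bit64)

# Build 64-bit integers from strings so no double conversion is involved.
ids <- as.integer64(c("1000200030001", "1000200030002", "1000200030003"))

# Compare against a single integer64 value; the result is a logical vector.
hit <- ids == as.integer64("1000200030001")

print(hit)
```

Creating the comparison value with as.integer64 from a string, as the answer does, keeps both sides of == in the same 64-bit integer type.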

Alternatively, deactivate auto-indexing (and lose the performance advantage it provides):

options(datatable.auto.index=FALSE)
data <- data.table(A=c(1000200030001,1000200030002,1000200030003))
data[(A==1000200030001)]
#               A
#1: 1000200030001
