Weird error in R when importing (64-bit) integer with many digits

Problem description

I am importing a csv that has a single column which contains very long integers (for example: 2121020101132507598)

a <- read.csv('temp.csv', as.is = TRUE)

When I import these integers as strings they come through correctly, but when imported as integers the last few digits are changed. I have no idea what is going on...

1 "4031320121153001444" 4031320121153001472
2 "4113020071082679601" 4113020071082679808
3 "4073020091116779570" 4073020091116779520
4 "2081720101128577687" 2081720101128577792
5 "4041720081087539887" 4041720081087539712
6 "4011120071074301496" 4011120071074301440
7 "4021520051054304372" 4021520051054304256
8 "4082520061068996911" 4082520061068997120
9 "4082620101129165548" 4082620101129165312

Recommended answer

As others have noted, you can't represent integers that large in R's integer type, which is 32-bit. But R isn't reading those values into integers; it's reading them into double-precision numerics.
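To see which type R actually picks, here is a minimal sketch in base R. The temporary file and the column name `id` are made up for illustration and are not from the question.

```r
# Build a tiny sample file so the example is self-contained.
# The column name "id" is a hypothetical stand-in for the real header.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id", "4031320121153001444"), tmp)

a <- read.csv(tmp)

# The value does not fit in R's 32-bit integer type, so the column is
# stored as a double ("numeric"), not as an integer.
print(class(a$id))            # "numeric"
print(.Machine$integer.max)   # 2147483647
```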

Double precision carries only about 15 to 17 significant decimal digits (a 53-bit significand), which is why your 19-digit numbers come back rounded after roughly the 16th digit. See the gmp, Rmpfr, and int64 packages for potential solutions. I don't see a function in any of them that reads directly from a file, though; maybe you could cook something up by looking at their sources.
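The rounding can be reproduced without any file at all. This is a small base R sketch; the variable names are illustrative only.

```r
# A double has a 53-bit significand, so above 2^53 not every integer
# can be represented exactly.
limit <- 2^53
print(sprintf("%.0f", limit))   # "9007199254740992"
print(limit + 1 == limit)       # TRUE: the added 1 is lost to rounding

# Parsing one of the question's values as a double lands on the nearest
# representable number, not on the digits in the file.
s <- "4031320121153001444"
x <- as.numeric(s)
print(sprintf("%.0f", x))       # the question reports 4031320121153001472
print(sprintf("%.0f", x) == s)  # FALSE
```

At this magnitude (between 2^61 and 2^62) adjacent doubles are 512 apart, which is why only the last few digits change.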

UPDATE: Here's how you can get your file into an int64 object:

# as.int64() comes from the int64 package, so load it first
library(int64)
# This assumes your numbers are the only column in the file.
# Read them in however you like; just make sure they come in as character.
a <- scan("temp.csv", what="")
ia <- as.int64(a)
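If the int64 package is not available in your R installation, the same idea can be sketched with base R plus the bit64 package. This is an alternative, not the method from the answer: the sample file, the column name `id`, and the choice of `bit64::as.integer64()` are assumptions for illustration, and the bit64 step only runs if that package is installed.

```r
# Build a small sample file; the header "id" is a hypothetical column name.
tmp <- tempfile(fileext = ".csv")
writeLines(c("id", "4031320121153001444", "4113020071082679601"), tmp)

# Force the column to character so no digits are lost on import.
a <- read.csv(tmp, colClasses = "character")
print(a$id)

# Optional: convert the strings to 64-bit integers when bit64 is installed.
if (requireNamespace("bit64", quietly = TRUE)) {
  ia <- bit64::as.integer64(a$id)
  print(as.character(ia))
}
```

Keeping the values as character is enough when they are only identifiers; a 64-bit integer type only matters if you need arithmetic or numeric comparisons on them.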
