Weird error in R when importing (64-bit) integer with many digits
Problem description
I am importing a csv that has a single column which contains very long integers (for example: 2121020101132507598)
a <- read.csv('temp.csv', as.is = TRUE)
When I import these integers as strings they come through correctly, but when imported as integers the last few digits are changed. I have no idea what is going on...
1 "4031320121153001444" 4031320121153001472
2 "4113020071082679601" 4113020071082679808
3 "4073020091116779570" 4073020091116779520
4 "2081720101128577687" 2081720101128577792
5 "4041720081087539887" 4041720081087539712
6 "4011120071074301496" 4011120071074301440
7 "4021520051054304372" 4021520051054304256
8 "4082520061068996911" 4082520061068997120
9 "4082620101129165548" 4082620101129165312
Recommended answer
As others have noted, you can't represent integers that large. But R isn't reading those values into integers, it's reading them into double precision numerics.
Double precision has a 53-bit significand, so it can only represent about 15 to 16 significant decimal digits exactly, which is why you see your numbers rounded after roughly 16 places. See the gmp, Rmpfr, and int64 packages for potential solutions. Though I don't see a function to read from a file in any of them, maybe you could cook something up by looking at their sources.
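To make the rounding concrete, here is a minimal sketch in base R using the first value from the question. The variable names (x_chr, x_dbl, spacing) are only illustrative, and the exact digits printed by sprintf may vary by platform, so no output is claimed for that line.

```r
x_chr <- "4031320121153001444"   # first value from the question, kept as text
x_dbl <- as.numeric(x_chr)       # the same value forced into a double

# A double has a 53-bit significand, so not every integer above 2^53 is representable
.Machine$double.digits           # 53 on IEEE 754 platforms
2^53 == 2^53 + 1                 # TRUE: 2^53 + 1 rounds back to 2^53

# The value lies between 2^61 and 2^62, where adjacent doubles are 2^(61 - 52) apart
spacing <- 2^(61 - 52)           # 512

# Printing every digit shows the stored value no longer matches the original text
sprintf("%.0f", x_dbl)
```

Because neighbouring doubles in this range are 512 apart, the last two or three digits of a 19-digit number cannot survive the conversion, which matches the differences in the table above.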
UPDATE:
Here's how you can get your file into an int64 object:
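# as.int64() comes from the int64 package
library(int64)
# This assumes your numbers are the only column in the file
# Read them in however you like, just ensure they're read in as character
a <- scan("temp.csv", what = "")
# Convert the character vector to 64-bit integers
ia <- as.int64(a)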
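If the int64 package is not available, the same idea (read as character, then convert) can be sketched with the bit64 package instead. This is an alternative, not what the answer above used. The temporary file, the id column name, and the assumption that bit64 is installed are all mine, for illustration only.

```r
# Write a small stand-in for temp.csv (hypothetical file with a header named "id")
tmp <- tempfile(fileext = ".csv")
writeLines(c("id", "4031320121153001444", "4113020071082679601"), tmp)

# Force the column to character so no digits are lost on import
a <- read.csv(tmp, colClasses = "character")

# Convert to 64-bit integers only if bit64 is installed (assumption, not a base R feature)
if (requireNamespace("bit64", quietly = TRUE)) {
  ia <- bit64::as.integer64(a$id)
  print(ia)
}
```

Passing colClasses = "character" keeps the digits intact whichever 64-bit package is used afterwards, so the conversion step can be swapped without touching the import.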