R: How to convert long number to string to save precision

Published: 2021/5/30 21:01:22. Tags: r, precision, tostring, long-integer


Question

I have a problem converting a long number to a string in R. How can I easily convert a number to a string while preserving precision? I have a simple example below.

a = -8664354335142704128
toString(a)

[1] "-8664354335142704128"

b = -8664354335142703762
toString(b)

[1] "-8664354335142704128"

a == b

[1] TRUE

I expected toString(a) and toString(b) to differ, but I got the same string for both. I suppose toString() converts the number to a float or something like that before converting it to a string.

Thank you for your help.

> -8664354335142704128 == -8664354335142703762

[1] TRUE

> along = bit64::as.integer64(-8664354335142704128)
> blong = bit64::as.integer64(-8664354335142703762)
> along == blong

[1] TRUE

> blong

integer64
[1] -8664354335142704128

I also tried:

> as.character(blong)

[1] "-8664354335142704128"

> sprintf("%f", -8664354335142703762)

[1] "-8664354335142704128.000000"

> sprintf("%f", blong)

[1] "-0.000000"

My question at first was whether I can convert a long number to a string without loss. Then I realized that in R it is impossible to get the real value of a long number passed into a function, because R has already read the value with the loss.

For example, I have this function:

> my_function <- function(long_number){
+ string_number <- toString(long_number)
+ print(string_number)
+ }

If someone uses it and passes a long number, I cannot find out exactly which number was passed.

> my_function(-8664354335142703762)
[1] "-8664354335142704128"

If I read the numbers from a file, for example, it is easy. But that is not my case. I just need to use whatever a user passed.

I am not an R expert, so I was just curious why it works in another language and not in R. For example, in Python:

>>> def my_function(long_number):
...     string_number = str(long_number)
...     print(string_number)
... 
>>> my_function(-8664354335142703762)
-8664354335142703762

Now I know the problem is how R reads and stores numbers. Every language can do it differently. I have to change how numbers are passed to the R function, and that solves my problem.
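A minimal sketch of that change, assuming callers can be asked to pass the number as a string. The name my_function_chr and its checks are hypothetical, not from any package, and the digits-only pattern is an assumption about what counts as valid input.

```r
# Hypothetical variant of my_function: it refuses numeric input, because a
# numeric argument has already been rounded to a double before the call.
my_function_chr <- function(long_number) {
  if (!is.character(long_number) || length(long_number) != 1L) {
    stop("pass the number as a single string, e.g. \"-8664354335142703762\"")
  }
  # Assumption: valid input is an optional minus sign followed by digits only.
  if (!grepl("^-?[0-9]+$", long_number)) {
    stop("not an integer written in plain digits: ", long_number)
  }
  long_number  # every digit is still exactly as the caller typed it
}

my_function_chr("-8664354335142703762")
# [1] "-8664354335142703762"
```

Rejecting numeric input outright is a design choice: once the value arrives as a double there is nothing left to recover, so failing loudly seems safer than printing a rounded number.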

So the right answer to my question is:

To "I suppose toString() converts the number to float", the answer says "nope, you did it yourself (even if unintentionally)." Nope, R did it itself; that is simply how R reads numbers.

So I marked r2evans's answer as the best answer, because this user helped me find the right solution. Thank you!

Recommended answer

Bottom line up front: you must (in this case) read in your large numbers as strings before converting them to 64-bit integers:

bit64::as.integer64("-8664354335142704128") == bit64::as.integer64("-8664354335142703762")
# [1] FALSE

Some points about what you've tried:

"I suppose toString() converts the number to float": nope, you did it yourself (even if unintentionally). In R, when you create a number, 5 is a float and 5L is an integer. Even if you had tried to create it as an integer, R would have complained and lost precision anyway:

class(5)
# [1] "numeric"
class(5L)
# [1] "integer"
class(-8664354335142703762)
# [1] "numeric"
class(-8664354335142703762L)
# Warning: non-integer value 8664354335142703762L qualified with L; using numeric value
# [1] "numeric"
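The precision limit itself can be checked in base R. This is only a small sketch; it relies on R's numeric being an IEEE 754 double, whose significand holds 53 bits.

```r
# Number of significand bits in an R numeric (double)
.Machine$double.digits
# [1] 53

# Up to 2^53 every whole number has its own double; above it they collide.
2^53 == 2^53 + 1
# [1] TRUE
2^53 - 1 == 2^53
# [1] FALSE

# The two literals from the question are far above 2^53, so the parser
# stores both as the same double.
identical(-8664354335142704128, -8664354335142703762)
# [1] TRUE
```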

More to the point, when you type it in as a number and then try to convert it, R processes the inside of the parentheses first. That is, with

bit64::as.integer64(-8664354335142704128)

R first has to parse and "understand" everything inside the parentheses before it can be passed to the function. (This is typically a compiler/language-parsing thing, not just an R thing.) In this case, it sees what appears to be a (large) negative float, so it creates an object of class numeric (a float). Only then does it send this numeric to the function, but by this point the precision has already been lost. Hence the otherwise-illogical

bit64::as.integer64(-8664354335142704128) == bit64::as.integer64(-8664354335142703762)
# [1] TRUE
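One way to see that the rounding happens in the parser, before as.integer64 or any other function runs, is to inspect an unevaluated call. This is a small base-R sketch; f is only a placeholder name and is never called.

```r
# quote() returns the parsed expression without evaluating it.
expr <- quote(f(-8664354335142703762))

arg <- expr[[2]]   # the call `-`(8664354335142703762)
num <- arg[[2]]    # the numeric constant the parser created

typeof(num)
# [1] "double"

# The constant already equals the rounded value, although f never ran.
num == 8664354335142704128
# [1] TRUE
```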

With your first number, it just happens that the 64-bit version of that number is equal to what you intended.

bit64::as.integer64(-8664254335142704128)  # ends in 4128
# integer64
# [1] -8664254335142704128                 # ends in 4128, yay! (coincidence?)

If you reduce the magnitude by one, you get the same effective integer64:

bit64::as.integer64(-8664354335142704127)  # ends in 4127
# integer64
# [1] -8664354335142704128                 # ends in 4128 ?

This continues for quite a while, until it finally shifts to the next rounding point:

bit64::as.integer64(-8664254335142703617)
# integer64
# [1] -8664254335142704128
bit64::as.integer64(-8664254335142703616)
# integer64
# [1] -8664254335142703104

It is no coincidence that the difference is 1024, or 2^10. This is double-precision floating point at work: a double stores a 53-bit significand, so for magnitudes between 2^62 and 2^63 adjacent representable values are 2^(62-52) = 2^10 apart.
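The 1024 gap can be checked in base R. A rough sketch, assuming R's numeric is a standard IEEE 754 double with round-to-nearest arithmetic:

```r
x <- 8664354335142704128

# Exponent of the power of two just below x
e <- floor(log2(x))
e
# [1] 62

# Gap between neighbouring doubles at this magnitude:
# 2^(e - (significand bits - 1))
2^(e - (.Machine$double.digits - 1))
# [1] 1024

# Anything closer than half a gap rounds back to x itself.
x + 511 == x
# [1] TRUE
x + 513 == x
# [1] FALSE
```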

Fortunately, bit64::as.integer64 has several S3 methods, useful for converting different formats/classes to an integer64:

library(bit64)
methods(as.integer64)
# [1] as.integer64.character as.integer64.double    as.integer64.factor   
# [4] as.integer64.integer   as.integer64.integer64 as.integer64.logical  
# [7] as.integer64.NULL

So bit64::as.integer64.character can be useful, since precision is not lost when you type the number or read it in as a string:

bit64::as.integer64("-8664354335142704128")
# integer64
# [1] -8664354335142704128
bit64::as.integer64("-8664354335142704128") == bit64::as.integer64("-8664354335142703762")
# [1] FALSE
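For the original goal of getting the number back out as text, a short round-trip sketch, assuming the bit64 package is installed:

```r
library(bit64)

x <- as.integer64("-8664354335142703762")

# Converting back to text gives the original digits.
as.character(x)
# [1] "-8664354335142703762"

# Arithmetic stays exact as well: adding 1 changes only the last digit.
as.character(x + 1L)
# [1] "-8664354335142703761"
```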

FYI, your number is already near the 64-bit boundary:

-.Machine$integer.max
# [1] -2147483647
-(2^31-1)
# [1] -2147483647
log(8664354335142704128, 2)
# [1] 62.9098
-2^63 # the approximate +/- range of 64-bit integers
# [1] -9.223372e+18
-8664354335142704128
# [1] -8.664354e+18
