Precise floating-point<->string conversion


Question

I am looking for a library function to convert floating-point numbers to strings, and back again, in C++. The properties I want are that str2num(num2str(x)) == x and that num2str(str2num(x)) == x (as far as possible). The general property is that num2str should produce the simplest rational number that, when rounded to the nearest representable floating-point number, gives you back the original number.

So far I've tried boost::lexical_cast:

double d = 1.34;
string_t s = boost::lexical_cast<string_t>(d);
printf("%s\n", s.c_str());
// outputs 1.3400000000000001

And I've tried std::ostringstream, which seems to work for most values if I do stream.precision(16). However, at precision 15 it truncates, and at precision 17 it gives ugly output for things like 1.34. I don't think precision 16 is guaranteed to have any of the properties I require, and I suspect it breaks down for many numbers.
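The suspicion about a fixed precision can be tested directly. Below is a minimal sketch in C that uses snprintf's "%.*g" as a stand-in for the stream precision setting; roundtrips_at is a hypothetical helper name introduced here for illustration, not something from Boost or the standard library.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: returns 1 if x survives being printed with `prec`
   significant digits ("%.*g") and parsed back with strtod(), else 0. */
int roundtrips_at(double x, int prec)
{
  char buf[64];
  snprintf(buf, sizeof(buf), "%.*g", prec, x);
  return strtod(buf, 0) == x;
}
```

For example, assuming IEEE 754 doubles, roundtrips_at(1.34, 16) holds, but the double written 0.30000000000000004 prints as 0.3 at 16 digits and does not read back as itself; it needs 17.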

Is there a C++ library that has such a conversion? Or is such a conversion function already buried somewhere in the standard libraries or Boost?

The reason for wanting these functions is to save floating-point values to CSV files, and then read them back correctly. In addition, I'd like the CSV files to contain simple numbers as far as possible so they can be consumed by humans.

I know that the Haskell read/show functions already have the properties I am after, as do the BSD C libraries. The standard references for string<->double conversions are a pair of papers from PLDI 1990:


  • How to Read Floating Point Numbers Accurately, William Clinger

  • How to Print Floating-Point Numbers Accurately, Guy Steele and Jon White

Any C++ library or function based on these would be suitable.

EDIT: I am fully aware that floating-point numbers are inexact representations of decimal numbers, and that 1.34 == 1.3400000000000001. However, as the papers referenced above point out, that's no excuse for choosing to display the value as "1.3400000000000001".

EDIT2: This article explains exactly what I'm looking for: http://drj11.wordpress.com/2007/07/03/python-poor-printing-of-floating-point/

Recommended answer

I think this does what you want, in combination with the standard library's strtod():

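#include <stdio.h>
#include <stdlib.h>

/* Format n using the lowest precision in 15..18 whose output strtod()
   reads back as exactly n. Returns the result of the last snprintf(). */
int dtostr(char* buf, size_t size, double n)
{
  int prec = 15;
  while(1)
  {
    int ret = snprintf(buf, size, "%.*g", prec, n);
    /* stop once the text round-trips, or after precision 18 has been tried */
    if(prec++ == 18 || n == strtod(buf, 0)) return ret;
  }
}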

A simple demo, which doesn't bother to check input words for trailing garbage:

int main(int argc, char** argv)
{
  int i;
  for(i = 1; i < argc; i++)
  {
    char buf[32];
    dtostr(buf, sizeof(buf), strtod(argv[i], 0));
    printf("%s\n", buf);
  }
  return 0;
}
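If trailing garbage does matter (for instance when reading CSV fields), strtod()'s second argument reports where parsing stopped. The following is a minimal sketch of such a check; parse_double_strict is a hypothetical name, not part of the answer's code.

```c
#include <stdlib.h>

/* Hypothetical helper: parse s as a double, rejecting empty input and
   trailing characters. Returns 1 and stores the value on success, else 0. */
int parse_double_strict(const char *s, double *out)
{
  char *end;
  double v = strtod(s, &end);
  if (end == s) return 0;      /* nothing was converted */
  if (*end != '\0') return 0;  /* something follows the number */
  *out = v;
  return 1;
}
```

Note that strtod() skips leading whitespace, so this sketch accepts " 1.34" but rejects "1.34 ".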

Some sample inputs and the resulting output:

% ./a.out 0.1 1234567890.1234567890 17 1e99 1.34 0.000001 0 -0 +INF NaN
0.1
1234567890.1234567
17
1e+99
1.34
1e-06
0
-0
inf
nan

I imagine your C library needs to conform to some sufficiently recent version of the standard in order to guarantee correct rounding.

I'm not sure I chose the ideal bounds on prec, but I imagine they must be close. Maybe they could be tighter? Similarly, I think 32 characters for buf are always sufficient but never necessary. Obviously this all assumes 64-bit IEEE doubles. It might be worth checking that assumption with some kind of clever preprocessor directive; sizeof(double) == 8 would be a good start.
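One way to check the 64-bit IEEE assumption at compile time is sketched below. The preprocessor cannot evaluate sizeof, so the size check uses C11's _Static_assert, while the format check uses the <float.h> macros. DTOSTR_MIN_PREC and DTOSTR_MAX_PREC are hypothetical names introduced here; they are not used by the dtostr() above, and DBL_DECIMAL_DIG requires C11.

```c
#include <float.h>
#include <limits.h>

/* Refuse to compile unless double looks like IEEE 754 binary64. */
#if FLT_RADIX != 2 || DBL_MANT_DIG != 53 || DBL_MAX_EXP != 1024
#error "dtostr() assumes 64-bit IEEE 754 doubles"
#endif

/* sizeof is not available in #if, so check the width with a C11 static assert. */
_Static_assert(sizeof(double) * CHAR_BIT == 64, "double is not 64 bits wide");

/* Hypothetical constants: DBL_DIG digits always survive text -> double -> text,
   and DBL_DECIMAL_DIG digits always survive double -> text -> double. */
enum { DTOSTR_MIN_PREC = DBL_DIG, DTOSTR_MAX_PREC = DBL_DECIMAL_DIG };
```

For binary64 these constants are 15 and 17, which suggests the loop's upper bound of 18 could be tightened to 17, assuming the C library's printf and strtod round correctly.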

The exponent is a bit messy, but it wouldn't be difficult to fix after breaking out of the loop and before returning, perhaps using memmove() or suchlike to shift things leftwards. I'm pretty sure there's guaranteed to be at most one + and at most one leading 0, and I don't think they can even both occur at the same time for prec >= 10 or so.
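The cleanup described above might look like the following sketch. tidy_exponent is a hypothetical helper, and it assumes the buffer holds lowercase "%g" output, so that an 'e' can only introduce an exponent.

```c
#include <string.h>

/* Hypothetical helper: rewrite "1e+99" as "1e99" and "1e-06" as "1e-6",
   in place. Strings without an exponent are left untouched. */
void tidy_exponent(char *buf)
{
  char *e = strchr(buf, 'e');
  char *p;
  if (!e) return;                          /* no exponent ("inf", "nan", "1.34") */
  p = e + 1;
  if (*p == '+')
    memmove(p, p + 1, strlen(p + 1) + 1);  /* drop '+', copying the NUL too */
  else if (*p == '-')
    p++;                                   /* keep the minus sign */
  while (p[0] == '0' && p[1] != '\0')      /* drop leading zeros, keep one digit */
    memmove(p, p + 1, strlen(p + 1) + 1);
}
```

Because the string only gets shorter, the value snprintf() returned would need adjusting if the caller relies on it as a length.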

Likewise, if you'd rather ignore signed zero, as JavaScript does, you can easily handle it up front, e.g.:

if(n == 0) return snprintf(buf, size, "0");

I'd be curious to see a detailed comparison with that 3000-line monstrosity you dug up in the Python codebase. Presumably the short version is slower, or less correct, or something? It would be disappointing if it were neither....
