Floating point calculation gives different results with float than with double

Views: 218 Posted: 2016/10/27 3:56:21 Tags: c++ floating-point double floating-accuracy


Problem description

I have the following line of code.
hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent()));



The void onBeingHit(int decHP) method accepts an integer and updates the hero's health points.
The float getDefensePercent() method is a getter that returns the hero's defense percentage.
ENEMY_ATTACK_POINT is a macro constant defined as #define ENEMY_ATTACK_POINT 20.


Let's say hero->getDefensePercent() returns 0.1. So the calculation is
20 * (1.0 - 0.1)  =  20 * (0.9)  =  18
Whenever I tried it with the following code (no f appended to 1.0)
hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent()));
I got 17.

But for the following code (f appended after 1.0)
hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0f - hero->getDefensePercent()));
I got 18.

What's going on? Does the f matter at all, given that hero->getDefensePercent() already returns a float?
Solution
What's going on? Why isn't the integer result 18 in both cases?

The problem is that the result of the floating point expression is rounded towards zero when being converted to an integer value (in both cases).

0.1 can't be represented exactly as a binary floating point value (neither as a float nor as a double). The compiler converts the decimal value to a binary IEEE754 floating point number and rounds it up or down to the nearest representable value. The processor then multiplies this value at runtime, and the result is truncated to get an integer value.
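To make the representation error and the truncation visible, here is a minimal sketch. The helper names showDigits and truncateToInt are hypothetical and not part of the original code; they only illustrate the two effects described above.

```cpp
#include <iomanip>
#include <sstream>
#include <string>

// Hypothetical helper: formats a value with 20 significant digits so the
// binary approximation of a decimal literal becomes visible.
std::string showDigits(double value) {
    std::ostringstream out;
    out << std::setprecision(20) << value;
    return out.str();
}

// Converting a floating point value to int discards the fractional part,
// i.e. it rounds toward zero. This is what happens implicitly when a
// floating point result is passed to a parameter of type int.
int truncateToInt(double value) {
    return static_cast<int>(value);
}
```

Printing showDigits(0.1f) and showDigits(0.1) shows two different values, both close to but not equal to one tenth (the float is roughly 0.1000000015). truncateToInt(17.99) gives 17, not 18.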

Ok, but since both double and float behave like that, why do I get 18 in one of the two cases, but 17 in the other case? I'm confused.

Your code takes the result of the function, 0.1f (a float), and then calculates 20 * (1.0 - 0.1f), which is a double expression, while 20 * (1.0f - 0.1f) is a float expression. On a typical IEEE754 implementation, the float version happens to round to 18.0 (the exact product lies just below 18, but the nearest representable float is 18.0 itself) and is converted to 18, while the double expression is slightly less than 18.0 (about 17.99999997) and is truncated to 17.
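The two variants can be compared side by side with a small sketch. The function names damageViaDouble and damageViaFloat are hypothetical stand-ins for the call to onBeingHit, and the explicit static_cast spells out the conversion that the int parameter performs implicitly. The results in the note below assume a platform that evaluates float expressions in IEEE754 single precision, which is the common case on x86-64.

```cpp
// Stand-in for the macro from the question.
#define ENEMY_ATTACK_POINT 20

// Mixed expression: the literal 1.0 is a double, so the float argument is
// converted to double and the whole product is computed in double precision.
int damageViaDouble(float defensePercent) {
    return static_cast<int>(ENEMY_ATTACK_POINT * (1.0 - defensePercent));
}

// Pure float expression: the literal 1.0f keeps the whole computation in
// single precision.
int damageViaFloat(float defensePercent) {
    return static_cast<int>(ENEMY_ATTACK_POINT * (1.0f - defensePercent));
}
```

Under that assumption, damageViaDouble(0.1f) returns 17 and damageViaFloat(0.1f) returns 18, matching the behavior reported in the question.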

Unless you know exactly how IEEE754 binary floating point numbers are constructed from decimal numbers, it is practically unpredictable whether the stored value will be slightly less or slightly greater than the decimal number you've entered in your code. So you shouldn't count on this. Don't try to fix such an issue by appending f to one of the numbers and saying "now it works, so I'll leave this f there", because another value will behave differently again.

Why does the type of the expression depend on the presence of this f?

This is because a floating point literal in C and C++ is of type double by default. If you add the f, it's a float. The result of a floating point expression has the "greater" of its operand types. Combining a double with an integer still gives a double, and combining an int with a float gives a float. So the result of your expression is either a float or a double, depending on the literal.
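These typing rules can be checked at compile time. The following sketch uses only standard C++11 facilities (decltype, std::is_same, static_assert); the operand values are arbitrary examples.

```cpp
#include <type_traits>

// Literal types: no suffix means double, the f suffix means float.
static_assert(std::is_same<decltype(1.0), double>::value, "1.0 is a double");
static_assert(std::is_same<decltype(1.0f), float>::value, "1.0f is a float");

// Mixed operands: the result has the "greater" of the two operand types.
static_assert(std::is_same<decltype(1.0 - 0.1f), double>::value,
              "double - float is a double");
static_assert(std::is_same<decltype(1.0f - 0.1f), float>::value,
              "float - float is a float");
static_assert(std::is_same<decltype(20 * 0.9f), float>::value,
              "int * float is a float");
static_assert(std::is_same<decltype(20 * 0.9), double>::value,
              "int * double is a double");
```

If any of these assumptions about the types were wrong, the file would fail to compile.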

Ok, but I don't want to round to zero. I want to round to the nearest number.

To fix this issue, add one half to the result before converting it to an integer:
hero->onBeingHit(ENEMY_ATTACK_POINT * (1.0 - hero->getDefensePercent()) + 0.5);
In C++11, there is std::round() for that (and std::lround(), which returns an integer type). In previous versions of the standard, there was no such function to round to the nearest integer.
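With C++11 available, the rounding can be written as in the following sketch. The function name roundedDamage is hypothetical; std::lround is the standard library function that rounds to the nearest integer, with halfway cases rounded away from zero.

```cpp
#include <cmath>

// Stand-in for the macro from the question.
#define ENEMY_ATTACK_POINT 20

// Rounds the damage to the nearest integer instead of truncating it.
int roundedDamage(float defensePercent) {
    return static_cast<int>(
        std::lround(ENEMY_ATTACK_POINT * (1.0 - defensePercent)));
}
```

roundedDamage(0.1f) returns 18 whether the expression is evaluated as a float or as a double, because both intermediate results are far closer to 18 than to 17.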

If you don't have std::round, you can write it yourself. Take care when dealing with negative numbers. When converting to an integer, the number will be truncated (rounded towards zero), which means that negative values will be rounded up, not down. So we have to subtract one half if the number is negative:
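// Named roundToInt rather than round to avoid clashing with the round()
// declared by <cmath> / <math.h>.
int roundToInt(double x) {
    // The conversion to int truncates toward zero, so move the value away
    // from zero by one half first.
    return static_cast<int>((x < 0.0) ? (x - 0.5) : (x + 0.5));
}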
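The difference for negative values can be seen by comparing a version that always adds one half with the sign-aware version. Both function names below are hypothetical and chosen only for this comparison.

```cpp
// Always adds one half, then truncates toward zero.
// Correct for positive values, wrong for negative ones.
int roundNaive(double x) {
    return static_cast<int>(x + 0.5);
}

// Adds or subtracts one half depending on the sign, then truncates.
int roundSignAware(double x) {
    return static_cast<int>((x < 0.0) ? (x - 0.5) : (x + 0.5));
}
```

For 2.7 both functions return 3. For -2.7, roundNaive returns -2 because -2.2 is truncated toward zero, while roundSignAware returns -3, the nearest integer.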


                        