Dealing with very small numbers in C++
I'm dealing with code that uses very small numbers, of the order of 10^-15 to 10^-25. I tried using double and long double, but I get wrong answers: either 0.000000000000000000001 is rounded off to 0, or a number like 0.00000000000000002 is represented as 0.00000000000000001999999999999. Since even a small fraction of 1/1000000 makes a significant difference in my final answers, please suggest an appropriate fix. Thank you.
What I have tried:
#include <iostream>
#include <math.h>
#include <stdlib.h>
#include <iomanip>
using namespace std;

int main()
{
    double sum, a, b, c, d;
    a = 1;
    b = 1 * pow(10, -15);
    c = 2 * pow(10, -14);
    d = 3 * pow(10, -14);
    sum = a + b + c + d;

    // Print 30 digits after the decimal point.
    cout << fixed;
    cout << setprecision(30);
    cout << " a : " << a << endl << " b : " << b << endl << " c : " << c << endl
         << " d : " << d << endl;
    cout << " sum : " << sum << endl << endl;

    // Normalize each term by the sum, then add the terms up again.
    a = a / sum;
    b = b / sum;
    c = c / sum;
    d = d / sum;
    sum = a + b + c + d;
    cout << " a : " << a << endl << " b : " << b << endl << " c : " << c << endl
         << " d : " << d << endl;
    cout << " sum2: " << sum << endl;
    return 0;
}
The expected output should be
a : 1.000000000000000000000000000000
b : 0.000000000000001000000000000000
c : 0.000000000000020000000000000000
d : 0.000000000000030000000000000000
sum : 1.000000000000051000000000000000
a : 1.000000000000000000000000000000
b : 0.000000000000001000000000000000
c : 0.000000000000020000000000000000
d : 0.000000000000030000000000000000
sum1: 1.000000000000051000000000000000
But the output I get is
a : 1.000000000000000000000000000000
b : 0.000000000000001000000000000000
c : 0.000000000000020000000000000000
d : 0.000000000000029999999999999998
sum : 1.000000000000051100000000000000
a : 0.999999999999998787999878998887
b : 0.000000000000000999999997897899
c : 0.000000000000019999999999999458
d : 0.000000000000029999999999996589
sum1: 0.999999999999989000000000000000
I tried double, long double and even boost_dec_float, but the output I get is similar.
Most floating-point values cannot be represented exactly. As a result, the stored values differ slightly from the real values. With double precision, about 15 to 17 significant decimal digits can be represented. That means that, counting from the first non-zero digit, all digits more than about 16 positions to the right are not relevant (when printed, they are artifacts of the binary representation rather than meaningful digits).
Performing operations on floating-point values introduces larger errors. If, for example, you add a smaller value to a larger one, the precision of the result is determined by the larger one:
  1.000 000 000 000 000 000
+ 0.000 000 000 000 001 xxx yyy
= 1.000 000 000 000 001 zzz
The digits marked x and y in the example above are lost, and the digits marked z carry no information (they are zero in this example).
Performing more operations on the result increases the errors. That is what you are seeing.
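The loss described above can be checked directly. The following is only a minimal sketch, assuming IEEE 754 double precision; the helper names is_absorbed and spacing_at_one are made up for this illustration and are not part of any library.

```cpp
#include <limits>

// Hypothetical helper: returns true when adding `small` to `large` leaves
// `large` unchanged, i.e. `small` lies entirely below the precision of `large`.
bool is_absorbed(double large, double small) {
    // volatile forces the sum to be stored as a real double, so extended
    // precision registers on some platforms do not hide the rounding.
    volatile double sum = large + small;
    return sum == large;
}

// Gap between 1.0 and the next representable double (about 2.2e-16
// under the IEEE 754 assumption).
double spacing_at_one() {
    return std::numeric_limits<double>::epsilon();
}
```

Under that assumption, is_absorbed(1.0, 1e-17) is true while is_absorbed(1.0, 1e-15) is false: a term of 1e-15 survives the addition to 1, but only with a few bits of accuracy left.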
The only solution when the errors are too large is to use a more precise number format. While long double might be used, you should check whether it actually offers more precision on your platform. Microsoft Visual Studio, for example, does not provide an extended long double (the long double type has in fact the same precision as double).
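One way to run that check is to compare the mantissa widths reported by std::numeric_limits. This is a small sketch; the function names extra_long_double_bits and long_double_is_wider are invented here for illustration.

```cpp
#include <limits>

// Hypothetical helper: number of mantissa bits long double has beyond double
// on the current platform. A result of 0 means long double adds no precision.
int extra_long_double_bits() {
    return std::numeric_limits<long double>::digits
         - std::numeric_limits<double>::digits;
}

// Hypothetical helper: true only if long double is really more precise.
bool long_double_is_wider() {
    return extra_long_double_bits() > 0;
}
```

If long_double_is_wider() returns false, switching the variables to long double cannot change the results.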
You should also know about (and probably use) scientific notation for floating-point numbers. You can use it, for example, to replace the pow() calls:
//b=1*pow(10,-15);
//c=2*pow(10,-14);
//d=3*pow(10,-14);
b = 1e-15;
c = 2e-14;
d = 3e-14;
It can also be used when printing values with the printf function. Such output is often more readable than a long run of trailing or leading zeroes. The G format is especially useful (it uses scientific notation only for very small and very large numbers):
printf("d: %.16G\n", d);
In the above example the precision is limited to 16 digits so that non-relevant digits won't be printed.
So you should try the above formatting in your program. If the results then look as expected (because non-relevant digits are no longer printed and the output is rounded instead), all is OK.
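As a rough sketch of that suggestion, the formatting can be wrapped in a small function that returns the text instead of printing it. The name format_g16 is made up for this example, and the outputs mentioned below assume IEEE 754 doubles.

```cpp
#include <cstdio>
#include <string>

// Hypothetical helper: formats a double with at most 16 significant digits
// using the G conversion, which drops trailing zeroes and switches to
// scientific notation only for very small or very large magnitudes.
std::string format_g16(double value) {
    char buffer[64];
    std::snprintf(buffer, sizeof buffer, "%.16G", value);
    return std::string(buffer);
}
```

With this, format_g16(3e-14) gives "3E-14" rather than 0.000000000000029999999999999998, and format_g16(1.0 + 1e-15 + 2e-14 + 3e-14) gives "1.000000000000051", which is the sum the question expected.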
Have a look at Boost's float128 (Boost 1.63.0).