Numerical inconsistency between loop and builtin function


Question


I'm trying to compute the sum of an array of random numbers. But there seems to be an inconsistency between the results when I do it one element at a time and when I use the built-in function. Furthermore, the error seems to increase when I decrease the data precision.

import torch

columns = 43 * 22
rows    = 44
torch.manual_seed(0)
array = torch.rand([rows, columns], dtype=torch.float64)

# accumulate the sum one element at a time
array_sum = 0
for i in range(rows):
    for j in range(columns):
        array_sum += array[i, j]

# compare against the built-in reduction
torch.abs(array_sum - array.sum())

results in:

tensor(3.6380e-10, dtype=torch.float64)

using dtype = torch.float32 results in:

tensor(0.1426)

using dtype = torch.float16 results in (a whopping!):

tensor(18784., dtype=torch.float16)

I find it hard to believe no one has ever asked about this. Yet, I haven't found a similar question on SO.

Can anyone please help me find some explanation or the source of this error?

Solution

The first mistake is this: you should change the summation line to

array_sum += float(array[i, j])

For float64 this causes no problems; for the other dtypes it does, as explained below.
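For reference, here is the loop from the question with that one-line fix applied (a minimal sketch: float() converts each element to a Python float, which is a float64, so the running total is accumulated at full precision regardless of the tensor's dtype):

import torch

torch.manual_seed(0)
array = torch.rand([44, 43 * 22], dtype=torch.float16)

array_sum = 0.0
for i in range(array.shape[0]):
    for j in range(array.shape[1]):
        # float() pulls the element out as a Python float (float64),
        # so the accumulator itself never rounds at float16 precision
        array_sum += float(array[i, j])

print(abs(array_sum - float(array.sum())))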

To start with: when doing floating-point arithmetic, you should always keep in mind that there are small errors due to rounding. The simplest way to see this is in a Python shell:

>>> .1+.1+.1-.3
5.551115123125783e-17

But how do you take these errors into account? When summing n positive numbers to a total tot, the analysis is fairly simple, and the rule is:

error(tot) < tot * n * machine_epsilon

Where the factor n is usually a gross over-estimation, and the machine_epsilon depends on the type (representation size) of floating-point number. It is approximately:

float64: 2*10^-16
float32: 1*10^-7
float16: 1*10^-3
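These values can be checked directly in PyTorch via torch.finfo (a quick check, not part of the original answer):

import torch

for dtype in (torch.float64, torch.float32, torch.float16):
    print(dtype, torch.finfo(dtype).eps)
# torch.float64 2.220446049250313e-16
# torch.float32 1.1920928955078125e-07
# torch.float16 0.0009765625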

One would generally expect the error to stay within a reasonable factor of tot * machine_epsilon.

And for my tests we get (in each case roughly 40000 values summing to a total of roughly 20000):

error(float64) = 3*10^-10 ≈ 80* 20000 * 2*10^-16
error(float32) = 1*10^-1  ≈ 50* 20000 * 1*10^-7

which is acceptable.
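The bound itself is easy to check empirically. The sketch below (my own illustration, using the float() fix from above) measures the loop-versus-builtin discrepancy for each dtype and compares it against tot * n * machine_epsilon:

import torch

torch.manual_seed(0)
rows, columns = 44, 43 * 22
n = rows * columns

for dtype in (torch.float64, torch.float32, torch.float16):
    array = torch.rand([rows, columns], dtype=dtype)
    total = 0.0                          # accumulated as a Python float (float64)
    for i in range(rows):
        for j in range(columns):
            total += float(array[i, j])
    error = abs(total - float(array.sum()))
    bound = total * n * torch.finfo(dtype).eps
    print(dtype, error, error < bound)   # the error stays well under the bound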

Then there is another problem with float16. Its machine epsilon is roughly 1e-3, and you can see the problem with

>>> import torch
>>> ar = torch.tensor([2048.], dtype=torch.float16)
>>> ar
tensor([2048.], dtype=torch.float16)
>>> ar[0] += .5
>>> ar
tensor([2048.], dtype=torch.float16)

Here the problem is that once the value 2048 is reached, float16 is no longer precise enough to add a value of 1 or less. More specifically: with a float16 you can 'represent' the value 2048, and you can represent the value 2050, but nothing in between, because it has too few bits for that precision. By keeping the sum in a float64 variable, you overcome this problem. Fixing this, we get for float16:

error(float16) = 16  ≈ 0.8* 20000 * 1*10^-3

Which is large, but acceptable as a value relative to the total of 20000 represented in float16.
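The gap around 2048 can also be verified directly (my own quick check, not from the original answer): the spacing between adjacent float16 values there is 2048 * machine_epsilon = 2, so 2049 simply cannot be stored:

import torch

print(torch.tensor(2048., dtype=torch.float16))   # tensor(2048., dtype=torch.float16)
print(torch.tensor(2049., dtype=torch.float16))   # rounds back to tensor(2048., ...)
print(torch.tensor(2050., dtype=torch.float16))   # tensor(2050., dtype=torch.float16)
print(2048 * torch.finfo(torch.float16).eps)      # 2.0 -- the spacing at 2048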

If you ask yourself which of the two methods is 'right', the answer is: neither. Both are approximations with the same precision but different errors. As you probably guessed, though, using the sum() method is faster, better and more reliable.
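As a rough illustration of the speed difference (a sketch only; absolute timings depend on your machine):

import timeit
import torch

torch.manual_seed(0)
array = torch.rand([44, 43 * 22], dtype=torch.float64)

def loop_sum():
    total = 0.0
    for i in range(array.shape[0]):
        for j in range(array.shape[1]):
            total += float(array[i, j])
    return total

print(timeit.timeit(loop_sum, number=1))             # Python loop: orders of magnitude slower
print(timeit.timeit(lambda: array.sum(), number=1))  # vectorized built-in reduction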
