How can floating-point calculations be made deterministic?


Question

Floating-point arithmetic on processors is neither associative nor distributive. So:

(A + B) + C is not equal to A + (B + C)

A * (B + C) is not equal to A * B + A * C
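
The two inequalities can be checked directly. The following is a minimal sketch using IEEE 754 double precision (Python's `float`); the particular operand values are chosen for illustration and are not from the original question.

```python
# Associativity: the grouping changes which intermediate result gets rounded.
a, b, c = 0.1, 0.2, 0.3
left = (a + b) + c    # 0.1 + 0.2 rounds to 0.30000000000000004 first
right = a + (b + c)   # 0.2 + 0.3 is exactly 0.5
print(left, right)    # 0.6000000000000001 0.6

# Distributivity: above 2**53, adding 1.0 is lost to rounding, so the
# factored form and the expanded form round differently.
x, y, z = 3.0, 1e16, 1.0
factored = x * (y + z)
expanded = x * y + x * z
print(factored == expanded)  # False
```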

Is there any way to perform floating-point calculations deterministically, so that they do not give different results from run to run? It would be deterministic on a uniprocessor, of course, but it would not be deterministic in a multithreaded program if, for example, several threads add to a shared sum, because the threads may interleave differently each time.

So my question is, how can one achieve deterministic results for floating point calculations in multithreaded programs?

Recommended answer

Floating-point is deterministic. The same floating-point operations, run on the same hardware, always produce the same result. There is no black magic, noise, randomness, fuzzing, or any of the other things that people commonly attribute to floating-point. The tooth fairy does not show up, take the low bits of your result, and leave a quarter under your pillow.

Now, that said, certain blocked algorithms that are commonly used for large-scale parallel computations are non-deterministic in terms of the order in which floating-point computations are performed, which can result in non-bit-exact results across runs.

What can you do about it?

First, make sure that you actually can't live with the situation. Many things that you might try to enforce ordering in a parallel computation will hurt performance. That's just how it is.

I would also note that although blocked algorithms may introduce some amount of non-determinism, they frequently deliver results with smaller rounding errors than do naive unblocked serial algorithms (surprising but true!). If you can live with the errors produced by a naive serial algorithm, you can probably live with the errors of a parallel blocked algorithm.
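
To make the rounding-error claim concrete, here is a small sketch that compares a naive left-to-right sum with pairwise (recursive halving) summation. Pairwise summation is used here only as a simple stand-in for a blocked algorithm, and the input (65536 copies of 0.1) is a deliberately favorable case chosen for illustration; `math.fsum` serves as the correctly rounded reference.

```python
import math

def naive_sum(values):
    """Left-to-right accumulation; rounding error can build up with length."""
    acc = 0.0
    for v in values:
        acc += v
    return acc

def pairwise_sum(values, lo=0, hi=None):
    """Sum values[lo:hi] by splitting the range in half recursively."""
    if hi is None:
        hi = len(values)
    if hi - lo == 0:
        return 0.0
    if hi - lo == 1:
        return values[lo]
    mid = (lo + hi) // 2
    return pairwise_sum(values, lo, mid) + pairwise_sum(values, mid, hi)

data = [0.1] * (2 ** 16)
exact = math.fsum(data)  # correctly rounded reference sum
naive_err = abs(naive_sum(data) - exact)
pairwise_err = abs(pairwise_sum(data) - exact)
print(naive_err, pairwise_err)
```

On this input every pairwise addition combines two equal values, so each step is an exact doubling and the pairwise result matches the reference. Real data will not be this tidy, but the error of pairwise summation generally grows much more slowly with input length than that of the naive loop.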

Now, if you really, truly, need exact reproducibility across runs, here are a few suggestions that tend not to adversely affect performance too much:


  1. Don't use multithreaded algorithms that can reorder floating-point computations. Problem solved. This doesn't mean you can't use multithreaded algorithms at all, merely that you need to ensure that each individual result is only touched by a single thread between synchronization points. Note that, done properly, this can actually improve performance on some architectures by reducing data-cache (D$) contention between cores.

  2. In reduction operations, you can have each thread store its result to an indexed location in an array, wait for all threads to finish, then accumulate the elements of the array in order. This adds a small amount of memory overhead, but is generally pretty tolerable, especially when the number of threads is "small".

  3. Find ways to hoist the parallelism. Instead of computing 24 matrix multiplications, each one of which uses parallel algorithms, compute 24 matrix products in parallel, each one of which uses a serial algorithm. This, too, can be beneficial for performance (sometimes enormously so).
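
The reduction suggestion in the list above might look roughly like the following sketch. The helper names (`chunk_bounds`, `deterministic_sum`) and the default worker count are assumptions made for illustration; the idea is only that each worker writes to its own slot and the slots are combined in index order after all workers have finished.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk_bounds(n, num_workers):
    """Split range(n) into num_workers contiguous [start, stop) slices."""
    step = (n + num_workers - 1) // num_workers
    return [(i * step, min((i + 1) * step, n)) for i in range(num_workers)]

def deterministic_sum(values, num_workers=4):
    """Sum values with one partial result per worker, combined in slot order."""
    bounds = chunk_bounds(len(values), num_workers)
    partials = [0.0] * num_workers  # one indexed slot per worker

    def work(slot):
        start, stop = bounds[slot]
        acc = 0.0
        for v in values[start:stop]:  # serial, fixed order within the chunk
            acc += v
        partials[slot] = acc  # only this worker ever writes this slot

    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        list(pool.map(work, range(num_workers)))  # wait for every worker

    total = 0.0
    for p in partials:  # combine in index order, not completion order
        total += p
    return total

data = [0.1 * i for i in range(1000)]
first = deterministic_sum(data)
```

The result is reproducible only while the chunking stays the same: changing `num_workers` changes where the partial sums are split, and so may change the low bits of the total.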

There are lots of other ways to handle this. They all require thought and care. Parallel programming usually does.
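
As one illustration of that care, the suggestion to hoist the parallelism can be sketched as follows: every matrix product is computed by a plain serial loop with a fixed summation order, and only the independent products are distributed across workers. The function names are hypothetical, and in standard CPython the global interpreter lock keeps these pure-Python loops from running simultaneously, so the sketch shows the structure rather than a speedup.

```python
from concurrent.futures import ThreadPoolExecutor

def matmul_serial(a, b):
    """Triple-loop product; the order of additions is fixed by the loops."""
    n, m, p = len(a), len(b), len(b[0])
    out = [[0.0] * p for _ in range(n)]
    for i in range(n):
        for j in range(p):
            acc = 0.0
            for k in range(m):
                acc += a[i][k] * b[k][j]
            out[i][j] = acc
    return out

def products_in_parallel(pairs, num_workers=4):
    """Run independent serial products concurrently; results keep input order."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(lambda ab: matmul_serial(ab[0], ab[1]), pairs))

# 24 independent 2x2 products, echoing the count used in the answer.
pairs = [([[0.1 * i, 0.2], [0.3, 0.4]], [[1.5, 0.7], [0.9, 0.1 * i]])
         for i in range(24)]
results = products_in_parallel(pairs)
```

Because no single product is split across threads, each result is bit-identical to what the serial routine produces on its own.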
