Cross Platform Floating Point Consistency


Problem Description

I'm developing a cross-platform game which plays over a network using a lockstep model. As a brief overview, this means that only inputs are communicated, and all game logic is simulated on each client's computer. Therefore, consistency and determinism are very important.
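To make the model concrete, here is a minimal sketch of a lockstep tick loop in C++. The type and function names (Input, collectLocalInput, exchangeInputs, stepSimulation) are hypothetical placeholders of mine, not names from the question's codebase:

    #include <cstdint>
    #include <vector>

    struct Input { std::uint32_t buttons = 0; };          // hypothetical per-player input record

    Input collectLocalInput();                            // poll local devices (assumed to exist)
    std::vector<Input> exchangeInputs(const Input& mine); // blocks until every peer's input arrives
    void stepSimulation(const std::vector<Input>& all);   // deterministic game logic

    // Only inputs cross the network. Every peer feeds the same inputs to the
    // same deterministic simulation, so all peers must compute bit-identical
    // state -- which is exactly why floating-point consistency matters here.
    void runLockstep() {
        for (;;) {
            Input mine = collectLocalInput();
            std::vector<Input> all = exchangeInputs(mine);
            stepSimulation(all);
        }
    }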

I'm compiling the Windows version on MinGW32, which uses GCC 4.8.1, and on Linux I'm compiling using GCC 4.8.2.

What struck me recently was that, when my Linux version connected to my Windows version, the program would diverge, or de-sync, instantly, even though the same code was compiled on both machines! It turned out the problem was that the Linux build was compiled as 64-bit, whereas the Windows version was 32-bit.

After compiling a 32-bit Linux version, I was thankfully relieved that the problem was resolved. However, it got me thinking and researching floating-point determinism.



This is what I've gathered:

A program will generally be consistent if it's:

• run on the same architecture
• compiled using the same compiler

So if I assume, targeting the PC market, that everyone has an x86 processor, then that solves requirement one. However, the second requirement seems a little silly.

MinGW, GCC, and Clang (Windows, Linux, and Mac, respectively) are all different compilers based on, or compatible with, GCC. Does this mean it's impossible to achieve cross-platform determinism? Or is it only applicable to Visual C++ vs GCC?

As well, do the optimization flags -O1 or -O2 affect this determinism? Would it be safer to leave them off?

In the end, I have three questions to ask:



• 1) Is cross-platform determinism possible when using MinGW, GCC, and Clang as compilers?
• 2) What flags should be set across these compilers to ensure the most consistency between operating systems / CPUs?
• 3) Floating-point accuracy isn't that important to me -- what's important is that the values are consistent. Is there any method of reducing floating-point numbers to a lower precision (like 3-4 decimal places) to ensure that the little rounding errors across systems become non-existent? (Every implementation I've tried to write so far has failed.)

Edit: I've done some cross-platform experiments.

Using floating points for velocity and position, I kept a Linux Intel laptop and a Windows AMD desktop computer in sync for up to 15 decimal places of the float values. Both systems are, however, x86_64. The test was simple, though -- it was just moving entities around over a network, trying to detect any visible error.



Would it make sense to assume that the same results would hold if an x86 computer were to connect to an x86_64 computer? (32-bit vs 64-bit operating system)

Solution

Cross-platform and cross-compiler consistency is certainly possible. Anything is possible given enough knowledge and time! But it might be very hard, very time-consuming, or indeed impractical.

Here are the problems I can foresee, in no particular order:


1. Remember that even an extremely small error of plus-or-minus 1/10^15 can blow up to become significant (multiply a number carrying that error margin by one billion, and you now have a plus-or-minus 0.000001 error, which might be significant.) These errors can accumulate over time, over many frames, until you have a desynchronized simulation. Or they can manifest when you compare values (even naively using "epsilons" in floating-point comparisons might not help; that only displaces or delays the manifestation.)
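A toy illustration of this accumulation effect (my example, not the answer's): 0.1 has no exact binary representation, so repeatedly adding it drifts further and further from the mathematically exact result.

    #include <cstdio>

    int main() {
        float sum = 0.0f;
        for (int frame = 0; frame < 1000000; ++frame)
            sum += 0.1f;              // exact answer would be 100000.0
        std::printf("%f\n", sum);     // prints roughly 100958.34 on IEEE-754 hardware
        return 0;
    }

Two machines whose per-step rounding differs even occasionally will drift apart in just this way.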

2. The above problem is not unique to distributed deterministic simulations (like yours.) It touches on the issue of "numerical stability", which is a difficult and often neglected subject.


3. Different compiler optimization switches, and different floating-point behavior switches, might lead the compiler to generate slightly different sequences of CPU instructions for the same statements. Obviously these must be the same across compilations, using the same exact compilers, or the generated code must be rigorously compared and verified.
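One concrete way to act on this point (my suggestion, not from the original answer) is to emit and diff the generated assembly for the simulation code under the flag sets you intend to ship:

    # Emit assembly under both optimization levels and compare; any difference
    # in the floating-point instruction sequences is a red flag for lockstep.
    g++ -O1 -ffp-contract=off -S simulation.cpp -o sim_O1.s
    g++ -O2 -ffp-contract=off -S simulation.cpp -o sim_O2.s
    diff sim_O1.s sim_O2.s

(-ffp-contract=off disables fused multiply-add contraction, one common source of cross-flag differences; it is a real GCC/Clang switch, but check your compiler version's manual.)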

4. 32-bit and 64-bit programs (note: I'm saying programs and not CPUs) will probably exhibit slightly different floating-point behaviors. By default, 32-bit programs cannot rely on anything more advanced than the x87 instruction set from the CPU (no SSE, SSE2, AVX, etc.) unless you specify otherwise on the compiler command line (or use intrinsics/inline assembly instructions in your code.) On the other hand, a 64-bit program is guaranteed to run on a CPU with SSE2 support, so the compiler will use those instructions by default (again, unless overridden by the user.) While x87 and SSE2 float datatypes and the operations on them are similar, they are - AFAIK - not identical. This will lead to inconsistencies in the simulation if one program uses one instruction set and another program uses the other.
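A common mitigation (again my suggestion, not the answer's) is to take x87 out of the picture by forcing the 32-bit build onto the same SSE2 scalar path that 64-bit builds use by default:

    # 32-bit build that uses SSE2 scalar math instead of x87. The resulting
    # binary then requires an SSE2-capable CPU at runtime.
    g++ -m32 -msse2 -mfpmath=sse game.cpp -o game32

-msse2 and -mfpmath=sse are real GCC flags (Clang accepts them too); they make the 32-bit and 64-bit builds use the same floating-point instruction family.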

5. The x87 instruction set includes a "control word" register, which contains flags that control some aspects of floating-point operations (e.g. exact rounding behavior, etc.) This is a runtime thing: your program can do one set of calculations, then change this register, then do the exact same calculations and get a different result. Obviously, this register must be checked and handled and kept identical on the different machines. It is possible for the compiler (or the libraries you use in your program) to generate code that changes these flags at runtime, inconsistently across the programs.
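The portable part of this state can be checked from standard C++ via <cfenv>; a minimal sketch (my illustration, not code from the answer):

    #include <cfenv>
    #include <cstdio>

    // Pin the rounding mode at startup, and re-check it periodically, since a
    // third-party library call may silently change it at runtime.
    void pinRoundingMode() {
        std::fesetround(FE_TONEAREST);        // the usual IEEE 754 default
    }

    void assertRoundingUnchanged() {
        if (std::fegetround() != FE_TONEAREST)
            std::fprintf(stderr, "rounding mode changed at runtime!\n");
    }

Note that <cfenv> covers the rounding direction only; the x87 precision-control bits need platform-specific calls (for example _controlfp on Windows, or the _FPU_SETCW macro from <fpu_control.h> on glibc).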

6. Again, in the case of the x87 instruction set, Intel and AMD have historically implemented things a little differently. For example, one vendor's CPU might internally do some calculations using more bits (and therefore arrive at a more accurate result) than the other, which means that if you happen to run on two different CPUs (both x86) from two different vendors, the results of simple calculations might not be the same. I don't know how and under what circumstances these higher-accuracy calculations are enabled, and whether they happen under normal operating conditions or you have to ask for them specifically, but I do know these discrepancies exist.


7. Random numbers, and generating them consistently and deterministically across programs, have nothing to do with floating-point consistency. They're important and the source of many bugs, but in the end they're just a few more bits of state that you have to keep synched.
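For instance, a tiny self-contained generator whose sequence depends only on the seed keeps every peer in sync, unlike rand(), whose algorithm varies between C library implementations. The xorshift32 variant below is my choice of illustration, not the answer's:

    #include <cstdint>

    // xorshift32: produces an identical sequence on every platform for the
    // same seed. The seed (and thus the state) must be synchronized once.
    struct Xorshift32 {
        std::uint32_t state;                  // must be nonzero
        explicit Xorshift32(std::uint32_t seed) : state(seed) {}
        std::uint32_t next() {
            state ^= state << 13;
            state ^= state >> 17;
            state ^= state << 5;
            return state;
        }
    };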


And here are a couple of techniques that might help:


1. Some projects use "fixed-point" numbers and fixed-point arithmetic to avoid rounding errors and the general unpredictability of floating-point numbers. Read the Wikipedia article for more information and external links.
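A minimal sketch of the idea (mine, not from the answer): store values as integer multiples of 1/65536, so all arithmetic is exact integer math and therefore bit-identical on every conforming platform.

    #include <cstdint>

    // 16.16 fixed point: upper 16 bits integer part, lower 16 bits fraction.
    struct Fixed {
        std::int32_t raw;                                  // value * 65536
        static Fixed fromInt(int v) { return { v * 65536 }; }
        Fixed operator+(Fixed o) const { return { raw + o.raw }; }
        Fixed operator-(Fixed o) const { return { raw - o.raw }; }
        Fixed operator*(Fixed o) const {                   // widen to avoid overflow
            return { static_cast<std::int32_t>(
                (static_cast<std::int64_t>(raw) * o.raw) >> 16) };
        }
        double toDouble() const { return raw / 65536.0; }  // for display only
    };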


2. In one of my own projects, during development, I used to hash all the relevant state (including a lot of floating-point numbers) in all the instances of the game and send the hash across the network each frame, to make sure even one bit of that state wasn't different on different machines. This also helped with debugging: instead of trusting my eyes to see when and where inconsistencies existed (which wouldn't tell me where they originated, anyway), I would know the instant some part of the game state on one machine started diverging from the others, and know exactly what it was (if the hash check failed, I would stop the simulation and start comparing the whole state.)

This feature was implemented in that codebase from the beginning, and was used only during the development process to help with debugging (because it had performance and memory costs.)
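A minimal sketch of such a per-frame state hash, using FNV-1a over the raw bytes of the state (the EntityState struct is a hypothetical placeholder, and the original answer does not specify which hash it used):

    #include <cstddef>
    #include <cstdint>

    // FNV-1a over raw bytes: cheap, and a single flipped bit anywhere in the
    // state -- including inside a float -- changes the hash.
    std::uint64_t fnv1a(const void* data, std::size_t len) {
        const unsigned char* p = static_cast<const unsigned char*>(data);
        std::uint64_t h = 0xcbf29ce484222325ull;           // FNV offset basis
        for (std::size_t i = 0; i < len; ++i) {
            h ^= p[i];
            h *= 0x100000001b3ull;                         // FNV prime
        }
        return h;
    }

    struct EntityState { float x, y, vx, vy; };            // hypothetical state

    // Hash the state each frame and send the result to the other peers;
    // any mismatch pinpoints the exact frame where divergence began.
    std::uint64_t hashState(const EntityState* entities, std::size_t count) {
        return fnv1a(entities, count * sizeof(EntityState));
    }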

Update (in answer to the first comment below): As I said in point 1, and as others have said in other answers, that doesn't guarantee anything. If you do that, you might decrease the probability and frequency of an inconsistency occurring, but the likelihood doesn't become zero. If you don't analyze what's happening in your code and the possible sources of problems carefully and systematically, it is still possible to run into errors no matter how much you "round off" your numbers.



For example, if you have two numbers (e.g. as results of two calculations that were supposed to produce identical results) that are 1.111499999 and 1.111500001, and you round them to three decimal places, they become 1.111 and 1.112 respectively. The original numbers' difference was only 2E-9, but it has now become 1E-3. In fact, you have increased your error 500,000 times. And they are still not equal, even with the rounding. You've exacerbated the problem.
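This failure mode takes only a few lines to demonstrate (my illustration of the numbers above):

    #include <cmath>
    #include <cstdio>

    int main() {
        double a = 1.111499999, b = 1.111500001;       // differ by only 2e-9
        double ra = std::round(a * 1000.0) / 1000.0;   // rounds to 1.111
        double rb = std::round(b * 1000.0) / 1000.0;   // rounds to 1.112
        std::printf("%.3f vs %.3f\n", ra, rb);         // still not equal
        return 0;
    }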



True, this doesn't happen much, and the example I gave involves two unlucky numbers chosen to run into this situation, but it is still possible to find yourself with these kinds of numbers. And when you do, you're in trouble. The only sure-fire solution, even if you use fixed-point arithmetic or whatever, is to do a rigorous and systematic mathematical analysis of all your possible problem areas and prove that they will remain consistent across programs.

Short of that, for us mere mortals, you need to have a water-tight way to monitor the situation and find exactly when and how the slightest discrepancies occur, to be able to solve the problem after the fact (instead of relying on your eyes to see problems in game animation, object movement, or physical behavior.)



