Android NDK: ARMv6 + VFP devices. Wrong calculations, NaN, denormal numbers, VFP11 bug


Question

I want to target ARMv6 Android devices that have VFP.

I have the following line in my Android.mk file to enable VFP:

LOCAL_CFLAGS    := -marm -mfloat-abi=softfp -mfpu=vfp -Wmultichar

I believe I am targeting ARMv5 with VFP.

I edited android-ndk-r8b\toolchains\arm-linux-androideabi-4.6\setup.mk to remove -msoft-float. I also tried with the original setup.mk.

My code works fine 99.99% of the time, but sometimes it goes crazy on ARMv6 devices. I have special code to detect when that happens.

Code

glm::vec3 D = P1 - P2;
float f1 = sqrtf(D.x*D.x + D.y*D.y + D.z*D.z);
if(!(f1 < 5)){
    // f1 is 5 or greater, or NaN
    mylog_fmt("Crazy %f %f %f %f", P1.x, P1.y, P1.z, f1);
    mylog_fmt("%f %f %f", P2.x, P2.y, P2.z);
}

LogCat:

12-14 00:59:08.214: I/APP(17091): Crazy -20.000031 0.000000 0.000000 20.000000
12-14 00:59:08.214: I/APP(17091): -20.000000 0.000000 0.000000

It calculates the distance between two points. Usually the result is 0.000031, but when "crazy mode" is on it is 20.0.

The problem does not exist when I run the code on an ARMv7 CPU. It exists on ARMv6 CPUs only.

I believe it must be some commonly known bug related to compiler settings or compiler version. Maybe the code is missing a memory barrier.

I would like to see references to similar bugs, a way to solve this one, or an explanation of its nature.

I also often get NaN values on ARMv6 where the same code on ARMv7 does not produce NaN.

I have been debugging the code and searching the web for 2 weeks already. If someone could share a link to a similar problem, it would be a great help!

PS. Here is an example of one of the compile commands. I have already tried many different settings.

Compiler settings

c:/soft/Android/android-ndk-r8b/toolchains/arm-linux-androideabi-4.6/prebuilt/windows/bin/arm-linux-androideabi-g++
-MMD -MP -MF ./obj/local/armeabi/objs/main/sys/base.o.d -fpic -ffunction-sections -funwind-tables -fstack-protector 
-D__ARM_ARCH_5__ -D__ARM_ARCH_5T__ -D__ARM_ARCH_5E__
-D__ARM_ARCH_5TE__  
-march=armv5te -mtune=arm6 
-mfloat-abi=softfp -mfpu=vfp
-fno-exceptions -fno-rtti -mthumb -Os -fomit-frame-pointer -fno-strict-aliasing -finline-limit=64 
-Ijni/main/ -Ijni/main/sys -Ijni/main/bullet/src -Ijni/main/bullet/src/LinearMath -Ijni/main/bullet/src/BulletCollision/BroadphaseCollision 
-Ijni/main/bullet/src/BulletCollision/CollisionDispatch -Ijni/main/bullet/src/BulletCollision/CollisionShapes -Ijni/main/bullet/src/BulletCollision/NarrowPhaseCollision 
-Ijni/main/bullet/src/BulletDynamics/ConstraintSolver -Ijni/main/bullet/src/BulletDynamics/Dynamics -Ijni/main/../libzip/ -Ic:/soft/Android/android-ndk-r8b/sources/cxx-stl/stlport/stlport 
-Ic:/soft/Android/android-ndk-r8b/sources/cxx-stl//gabi++/include -Ijni/main 
-DANDROID

-marm -march=armv6 -mfloat-abi=softfp -mfpu=vfp -Wmultichar

-Wa,--noexecstack  -frtti  -O2 -DNDEBUG -g   -Ic:/soft/Android/android-ndk-r8b/platforms/android-5/arch-arm/usr/include -c  jni/main/sys/base.cpp
-o ./obj/local/armeabi/objs/main/sys/base.o

Update 2

All these devices have a Qualcomm MSM7227A, which contains an ARM1136JF-S core.

What I have learnt so far is that the bug could be related to denormals. I read somewhere that one difference between ARMv7 and ARMv6 is that ARMv7 flushes denormals to zero by default, while on the ARM1136JF-S this is optional. http://infocenter.arm.com/help/topic/com.arm.doc.ddi0211k/DDI0211K_arm1136_r1p5_trm.pdf

I am not yet sure how to check the flush-to-zero flag on ARM.
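One possible way to check the flag is sketched below. It assumes a 32-bit ARM build with VFP enabled and ARM mode, and relies on the FZ control bit being bit 24 of FPSCR (the same bit the answer sets with the constant 16777216). The names kFpscrFlushToZero, isFlushToZeroEnabled and readFpscr are made up for this illustration and are not part of the NDK.

```cpp
#include <cstdint>

// FPSCR bit 24 is the FZ (flush-to-zero) control bit on ARM VFP.
constexpr std::uint32_t kFpscrFlushToZero = 1u << 24;

// Pure helper (hypothetical name): tests the FZ bit in an FPSCR value
// that has already been read from the coprocessor.
inline bool isFlushToZeroEnabled(std::uint32_t fpscr) {
    return (fpscr & kFpscrFlushToZero) != 0;
}

#if defined(__arm__)
// Reads FPSCR. Assumption: the file is built with VFP enabled
// (for example -mfpu=vfp) and in ARM mode, as in the question.
inline std::uint32_t readFpscr() {
    std::uint32_t value;
    asm volatile("vmrs %0, FPSCR" : "=r"(value));
    return value;
}
#endif
```

On the device, logging isFlushToZeroEnabled(readFpscr()) at the start of the physics thread would show whether flush-to-zero is actually active there. FPSCR is per-thread state, so the check should run on the same thread as the math.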

Update 3

This CPU's VFP unit is called VFP11. I found the --vfp11-denorm-fix option; there is also --vfp-denorm-fix. They work around an erratum in VFP11 CPUs, which looks like the problem I am hitting. I found a few posts about the VFP11 erratum. I hope this will fix the code.

Answer

It looks like I found the bug.

It is the denormal bug in VFP11 (the ARMv6 floating-point coprocessor). Denormal numbers are numbers very close to zero.

I get these numbers in physics code that implements a spring with damping:

force1 = (Center - P1) * k1         // force1 directed to center 
force2 = - Velocity * k2            // force2 directed against velocity
Object->applyForce(force1)
Object->applyForce(force2)

Both forces become very small when the object reaches Center, and I end up with denormal values.
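The decay into the denormal range can be illustrated with a small sketch. It assumes IEEE 754 single precision with gradual underflow (flush-to-zero off); stepsUntilDenormal is a hypothetical helper written for this illustration, not code from the project.

```cpp
#include <cmath>

// Hypothetical helper: repeatedly applies a damping factor to `value` and
// reports after how many multiplications the value becomes subnormal
// (denormal). Returns -1 if that does not happen within maxSteps.
inline int stepsUntilDenormal(float value, float damping, int maxSteps = 10000) {
    for (int step = 0; step <= maxSteps; ++step) {
        if (std::fpclassify(value) == FP_SUBNORMAL) {
            return step;
        }
        value *= damping;
    }
    return -1;
}
```

Any quantity that is multiplied by a factor below 1 every frame, such as a damped velocity, eventually reaches this range unless something clamps it to zero first.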

I can rewrite the spring and damping code, but I can't rewrite the whole of Bullet Physics or all the math code and predict every (even internal) occurrence of a denormal number.

The linker has options that patch the code: --vfp11-denorm-fix and --vfp-denorm-fix. http://sourceware.org/binutils/docs-2.19/ld/ARM.html

The NDK linker supports --vfp11-denorm-fix. This option helps: the code looks more reliable, but it does not fix the problem 100%.

I see fewer bugs now.

But if I wait for the spring to stabilize the object, I eventually get denormal -> NaN again.

I have to wait longer, but the same problems appear.

If you know a solution that fixes the code the way --vfp11-denorm-fix should, I will give you the bounty.

I tried both --vfp11-denorm-fix=scalar and --vfp11-denorm-fix=vector.

Flush-to-zero bit

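      int x;
      // Compiles in ARM mode only.
      // Sets FPSCR bit 24 (FZ, flush-to-zero): 16777216 == 1 << 24.
      // "volatile" keeps the compiler from dropping the asm when x is unused.
      asm volatile(
              "vmrs %[result],FPSCR \r\n"
              "orr %[result],%[result],#16777216 \r\n"
              "vmsr FPSCR,%[result]"
              :[result] "=r" (x) : :
      );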

I am not sure why, but it requires LOCAL_ARM_MODE := arm in Android.mk. Maybe -mfpu=vfp-d16 instead of just vfp is required.

Clearing denormal numbers manually

I have the spring code described above. I improved it by clearing denormal numbers manually, without using the FPU, with the following function.

inline void fixDenorm(float & f){
    union FloatInt32 {
        unsigned int u32;
        float f32;
    };
    FloatInt32 fi;
    fi.f32 = f;

    // Bits 23..30 hold the biased exponent; 0 means zero or denormal.
    unsigned int exponent = (fi.u32 >> 23) & ((1 << 8) - 1);
    if(exponent == 0)
        f = 0.f;
}
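A variant of the same idea is sketched here under two assumptions: float is a 32-bit IEEE 754 value, and returning the result is acceptable instead of modifying the argument in place. It swaps the union for std::memcpy, which avoids relying on union type punning in C++. flushDenormal is a hypothetical name used only for this sketch.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t), "expects 32-bit float");

// Hypothetical variant of fixDenorm: returns 0 for zero or denormal inputs
// and the input unchanged otherwise. The value is inspected with integer
// operations only, so no floating-point arithmetic touches the denormal.
inline float flushDenormal(float f) {
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);             // copy the raw bit pattern
    const std::uint32_t exponent = (bits >> 23) & 0xFFu;
    return exponent == 0 ? 0.0f : f;
}
```

A call such as velocity.x = flushDenormal(velocity.x) after each integration step would play the same role as fixDenorm(velocity.x).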


The original code was failing in many places within 15-90 seconds of starting.

The current code showed an issue possibly related to this bug in only one place, after 10 minutes of physics simulation.

Reference for the bug and the fix: http://sourceware.org/ml/binutils/2006-12/msg00196.html

They say that GCC only generates scalar VFP code, so --vfp11-denorm-fix=scalar should be enough. It adds one extra instruction to slow execution down. But even --vfp11-denorm-fix=vector, which adds two extra instructions, is not enough.

The problem is not easy to reproduce. I see it more often on faster 800 MHz phones than on slower 600 MHz ones. It is possible that the fix was made when there were no fast CPUs on the market.

We have many files in the project, and compiling each configuration takes around 10 minutes. Testing the current state of the fix requires about 10 minutes of play on the phone. We also heat the phone under a lamp, because a hot phone shows errors faster.

I would like to test different configurations and report which fix is the most effective. But right now we have to add a hack to kill the last bug that is possibly related to denormals.

I expected to find a silver bullet, but the only things that fix it completely are -msoft-float (with a 10x performance degradation) or running the app on ARMv7.

After I replaced the previous fixDenorm function with the new fixDenormE in the spring/damping code, and also applied the new function to the ViewMatrix, I got rid of the last bug.

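#include <math.h>  // fabsf

inline void fixDenormE(float & f, float epsilon = 1e-8f){
    union Data32 {
        unsigned int u32;
        float f32;
    };
    Data32 d;
    d.f32 = f;

    // Integer check first, so the FPU comparison below never sees a denormal.
    unsigned int exponent = (d.u32 >> 23) & ((1 << 8) - 1);
    if(exponent == 0)
        f = 0.f;
    // Also clear values that are still normal but negligibly small.
    if(fabsf(f) < epsilon){
        f = 0.f;
    }
}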
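Applying the same clean-up to a whole matrix or vector could look like the sketch below. It assumes the data is available as a contiguous array of 32-bit IEEE 754 floats; flushSmallValues is a hypothetical helper, and the 1e-8 default simply mirrors the epsilon used in fixDenormE.

```cpp
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper: zeroes every element that is denormal or smaller in
// magnitude than epsilon. The integer exponent test comes first, so the
// floating-point comparison is only evaluated for zero or normal values.
inline void flushSmallValues(float* values, std::size_t count,
                             float epsilon = 1e-8f) {
    for (std::size_t i = 0; i < count; ++i) {
        std::uint32_t bits;
        std::memcpy(&bits, &values[i], sizeof bits);  // raw bit pattern
        const bool denormalOrZero = ((bits >> 23) & 0xFFu) == 0;
        if (denormalOrZero || std::fabs(values[i]) < epsilon) {
            values[i] = 0.0f;
        }
    }
}
```

For a 4x4 view matrix stored as 16 consecutive floats, the call would be flushSmallValues(matrixData, 16), where matrixData stands for a pointer to the first element.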
