Denormalized floating point in Objective-C?

Question

What is the relevance of the Stack Overflow question/answer "Why does changing 0.1f to 0 slow down performance by 10x?" for Objective-C? If there is any relevance, how should this change my coding habits? Is there some way to shut off denormalized floating points on Mac OS X?

It seems like this is completely irrelevant to iOS. Is that correct?

Recommended answer

As I said in response to your comment there:

it is more of a CPU than a language issue, so it probably has relevance for Objective-C on x86. (iPhone's ARMv7 doesn't seem to support denormalized floats, at least with the default runtime/build settings)

Update

I just tested. On Mac OS X on x86 the slowdown is observed; on iOS on ARMv7 it is not (default build settings).

And, as expected, when running in the iOS Simulator (on x86) denormalized floats appear again.

Interestingly, FLT_MIN and DBL_MIN are defined as the smallest non-denormalized float and double, respectively (on iOS, Mac OS X, and Linux). Strange things happen when you use

DBL_MIN/2.0

in your code; the compiler happily emits a denormalized constant, but as soon as the (ARM) CPU touches it, it is flushed to zero:

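double test = DBL_MIN/2.0;                          // denormalized constant stored in a variable
printf("test      == 0.0 %d\n",test==0.0);          // compared by the CPU at run time
printf("DBL_MIN/2 == 0.0 %d\n",DBL_MIN/2.0==0.0);   // compared by the compiler at build time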

Output:

test      == 0.0 1  // computer says YES
DBL_MIN/2 == 0.0 0  // compiler says NO

So a quick runtime check for whether denormalization is supported can be:

#define SUPPORT_DENORMALIZATION ({volatile double t=DBL_MIN/2.0;t!=0.0;})
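The macro above relies on the GCC/Clang statement-expression extension. As a sketch of the same idea in plain C, the check can be written as a function. The name supports_denormals is hypothetical, and unlike the macro it performs the division at run time rather than loading a precomputed constant, which is an assumption about what is most robust against constant folding.

```c
#include <float.h>   /* DBL_MIN */

/* Hypothetical helper: returns 1 if halving DBL_MIN at run time yields a
   nonzero (denormalized) value, 0 if the result was flushed to zero. */
static int supports_denormals(void) {
    volatile double t = DBL_MIN;     /* volatile keeps the compiler from folding */
    volatile double half = t / 2.0;  /* evaluated by the CPU, not the compiler */
    return half != 0.0;
}
```

A caller can branch on supports_denormals() once at startup, for example to decide whether a filter needs an explicit guard against very small values.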

This is what ARM has to say on flush to zero mode: http://infocenter.arm.com/help/index.jsp?topic=/com.arm.doc.dui0204h/Bcfheche.html

Update << 1

This is how you disable flush to zero mode on ARMv7:

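int x;
asm(
    "vmrs %[result],FPSCR \r\n"               // read the FP status/control register
    "bic %[result],%[result],#16777216 \r\n"  // clear bit 24 (1 << 24), the flush-to-zero bit
    "vmsr FPSCR,%[result]"                    // write the modified value back
    :[result] "=r" (x) : :
);
printf("ARM FPSCR: %08x\n",x);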

with the following surprising result.


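  • Column 1: a float, divided by 2 on every iteration

  • Column 2: the binary representation of this float (as printed in the table, least significant bit first)

  • Column 3: the time taken to sum this float 1e7 times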

You can clearly see that the denormalization comes at zero cost. (For an iPad 2. On iPhone 4, it comes at a small cost of a 10% slowdown.)

0.000000000000000000000000000000000100000004670110: 10111100001101110010000011100000 110 ms
0.000000000000000000000000000000000050000002335055: 10111100001101110010000101100000 110 ms
0.000000000000000000000000000000000025000001167528: 10111100001101110010000001100000 110 ms
0.000000000000000000000000000000000012500000583764: 10111100001101110010000110100000 110 ms
0.000000000000000000000000000000000006250000291882: 10111100001101110010000010100000 111 ms
0.000000000000000000000000000000000003125000145941: 10111100001101110010000100100000 110 ms
0.000000000000000000000000000000000001562500072970: 10111100001101110010000000100000 110 ms
0.000000000000000000000000000000000000781250036485: 10111100001101110010000111000000 110 ms
0.000000000000000000000000000000000000390625018243: 10111100001101110010000011000000 110 ms
0.000000000000000000000000000000000000195312509121: 10111100001101110010000101000000 110 ms
0.000000000000000000000000000000000000097656254561: 10111100001101110010000001000000 110 ms
0.000000000000000000000000000000000000048828127280: 10111100001101110010000110000000 110 ms
0.000000000000000000000000000000000000024414063640: 10111100001101110010000010000000 110 ms
0.000000000000000000000000000000000000012207031820: 10111100001101110010000100000000 111 ms
0.000000000000000000000000000000000000006103515209: 01111000011011100100001000000000 110 ms
0.000000000000000000000000000000000000003051757605: 11110000110111001000010000000000 110 ms
0.000000000000000000000000000000000000001525879503: 00010001101110010000100000000000 110 ms
0.000000000000000000000000000000000000000762939751: 00100011011100100001000000000000 110 ms
0.000000000000000000000000000000000000000381469876: 01000110111001000010000000000000 112 ms
0.000000000000000000000000000000000000000190734938: 10001101110010000100000000000000 110 ms
0.000000000000000000000000000000000000000095366768: 00011011100100001000000000000000 110 ms
0.000000000000000000000000000000000000000047683384: 00110111001000010000000000000000 110 ms
0.000000000000000000000000000000000000000023841692: 01101110010000100000000000000000 111 ms
0.000000000000000000000000000000000000000011920846: 11011100100001000000000000000000 110 ms
0.000000000000000000000000000000000000000005961124: 01111001000010000000000000000000 110 ms
0.000000000000000000000000000000000000000002980562: 11110010000100000000000000000000 110 ms
0.000000000000000000000000000000000000000001490982: 00010100001000000000000000000000 110 ms
0.000000000000000000000000000000000000000000745491: 00101000010000000000000000000000 110 ms
0.000000000000000000000000000000000000000000372745: 01010000100000000000000000000000 110 ms
0.000000000000000000000000000000000000000000186373: 10100001000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000092486: 01000010000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000046243: 10000100000000000000000000000000 111 ms
0.000000000000000000000000000000000000000000022421: 00001000000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000011210: 00010000000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000005605: 00100000000000000000000000000000 111 ms
0.000000000000000000000000000000000000000000002803: 01000000000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000001401: 10000000000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000000000: 00000000000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000000000: 00000000000000000000000000000000 110 ms
0.000000000000000000000000000000000000000000000000: 00000000000000000000000000000000 110 ms
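The code that produced this table is not shown, so the following is only a rough sketch of how such a measurement could be made. All function names are hypothetical, the timing uses clock() rather than whatever timer was used originally, and the bits are written most significant first, whereas the table above appears to list them in the opposite order.

```c
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <time.h>

/* Write the 32 bits of f into out as '0'/'1' characters, MSB first. */
static void float_bits(float f, char out[33]) {
    uint32_t u;
    memcpy(&u, &f, sizeof u);   /* well-defined way to read the bit pattern */
    for (int i = 0; i < 32; ++i)
        out[i] = ((u >> (31 - i)) & 1u) ? '1' : '0';
    out[32] = '\0';
}

/* Add f to an accumulator n times and return the elapsed CPU time in ms.
   volatile forces a real load, add and store on every iteration. */
static double time_sum_ms(float f, long n) {
    volatile float x = f;
    volatile float sum = 0.0f;
    clock_t start = clock();
    for (long i = 0; i < n; ++i)
        sum += x;
    return 1000.0 * (double)(clock() - start) / CLOCKS_PER_SEC;
}

/* Print `rows` lines in the three-column layout, halving f each time. */
static void print_table(float f, int rows, long n) {
    char bits[33];
    for (int r = 0; r < rows; ++r, f /= 2.0f) {
        float_bits(f, bits);
        printf("%.48f: %s %.0f ms\n", f, bits, time_sum_ms(f, n));
    }
}
```

Calling print_table(1e-34f, 40, 10000000L) walks from normal floats through the denormalized range down to zero, which is the range the table covers. On hardware where denormals are slow, the third column should rise once the exponent bits are all zero.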
