Fast inverse square root on the iPhone
Question
The fast inverse square root function used by SGI/3dfx, and most notably in Quake, is often cited as being faster than the equivalent assembly instruction, but the posts claiming that seem quite dated. I was curious about its performance on more modern hardware, and particularly on mobile devices like the iPhone. I wouldn't be surprised if the Quake inverse square root is no longer a worthwhile optimization on desktop systems, but how about for an iPhone project involving a lot of 3D math? Is it something that would be worthwhile to include?
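For reference, the routine the question describes is commonly written along the following lines. This is a minimal C sketch rather than the original Quake source: the name q_rsqrt is illustrative, and memcpy stands in for the pointer cast of the well-known version so that the bit reinterpretation is well defined.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for positive, finite x.
 * q_rsqrt is an illustrative name, not taken from any particular code base. */
static float q_rsqrt(float x)
{
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);        /* view the float's bit pattern as an integer */
    bits = 0x5f3759dfu - (bits >> 1);      /* magic-constant initial guess */
    memcpy(&y, &bits, sizeof y);           /* back to float */

    return y * (1.5f - 0.5f * x * y * y);  /* one Newton-Raphson refinement step */
}
```

With a single refinement step the result is only approximate; q_rsqrt(4.0f), for example, lands close to, but not exactly on, 0.5.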
No.
The NEON instruction set (like every other vector ISA*) has a hardware approximate reciprocal square root instruction that is much faster than that oft-cited "trick". Use it instead if reciprocal square root is actually a performance bottleneck in your code (as always, benchmark first; don't spend time optimizing something if you have no hard evidence that its performance matters).
You can get at it by writing your own assembly (inline or otherwise) with the vrsqrte.f32 instruction, or from C, Objective-C, or C++ by including the <arm_neon.h> header and using the vrsqrte_f32() intrinsic.
[*] On SSE it's rsqrtss/rsqrtps; on AltiVec it's frsqrte/vrsqrte.
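A minimal sketch of how the estimate instructions might be used from C follows. The helper name rsqrt2 and the refine_steps parameter are made up for illustration. The Newton-Raphson refinement is an addition not mentioned in the answer; it is included on the assumption that the raw hardware estimate alone may be too coarse for some uses. The NEON branch uses vrsqrte_f32 together with vrsqrts_f32, the x86 branch uses rsqrtss through _mm_rsqrt_ss, and any other target falls back to plain 1.0f / sqrtf(x).

```c
#include <math.h>

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#include <arm_neon.h>
#elif defined(__SSE__)
#include <xmmintrin.h>
#endif

/* Approximate 1/sqrt(x) for two positive floats.
 * refine_steps is the number of Newton-Raphson steps applied to the
 * hardware estimate (ignored by the portable fallback). */
static void rsqrt2(const float in[2], float out[2], int refine_steps)
{
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    float32x2_t x = vld1_f32(in);
    float32x2_t y = vrsqrte_f32(x);            /* hardware estimate, both lanes */
    for (int i = 0; i < refine_steps; ++i) {
        /* vrsqrts_f32(a, b) returns (3 - a*b) / 2, so this computes
         * y = y * (3 - x*y*y) / 2 */
        y = vmul_f32(y, vrsqrts_f32(vmul_f32(x, y), y));
    }
    vst1_f32(out, y);
#elif defined(__SSE__)
    for (int lane = 0; lane < 2; ++lane) {
        float x = in[lane];
        float y = _mm_cvtss_f32(_mm_rsqrt_ss(_mm_set_ss(x)));  /* rsqrtss estimate */
        for (int i = 0; i < refine_steps; ++i) {
            y = y * (1.5f - 0.5f * x * y * y);                 /* Newton-Raphson step */
        }
        out[lane] = y;
    }
#else
    (void)refine_steps;                        /* no estimate instruction assumed here */
    out[0] = 1.0f / sqrtf(in[0]);
    out[1] = 1.0f / sqrtf(in[1]);
#endif
}
```

Calling rsqrt2 with the inputs 4.0f and 16.0f and two refinement steps should give values close to 0.5 and 0.25. Whether the refinement is needed, and whether any of this is worth doing at all, is something to settle by benchmarking the real workload, as the answer advises.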