Detect ARM NEON availability in the preprocessor?
Problem description
According to the ARM ARM, __ARM_NEON__ is defined when Neon SIMD instructions are available. I'm having trouble getting GCC to provide it.
Neon is available on this BananaPi Pro dev board running Debian 8.2:
$ cat /proc/cpuinfo | grep neon
Features : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt
I'm using GCC 4.9:
$ gcc --version
gcc (Debian 4.9.2-10) 4.9.2
Try GCC and -march=native:
$ g++ -march=native -dM -E - </dev/null | grep -i neon
#define __ARM_NEON_FP 4
OK, try what Google uses for Android when building for Neon:
$ g++ -march=armv7-a -mfpu=vfpv3-d16 -mfloat-abi=softfp -dM -E - </dev/null | grep -i neon
#define __ARM_NEON_FP 4
Maybe ARMv7-A with hard float:
$ g++ -march=armv7-a -mfloat-abi=hard -dM -E - </dev/null | grep -i neon
#define __ARM_NEON_FP 4
My questions are:
- why am I not seeing __ARM_NEON__?
- how do I detect Neon availability in the preprocessor?
And maybe:
- what GCC switches should I use to enable Neon SIMD instructions?
Related, on a LeMaker HiKey, which is AARCH64/ARM64 running Linaro with GCC 4.9.2, here's the output from the preprocessor:
$ cpp -dM </dev/null | grep -i neon
#define __ARM_NEON 1
According to ARM, this board does have Advanced SIMD instructions even though:
$ cat /proc/cpuinfo
Processor : AArch64 Processor rev 3 (aarch64)
...
Features : fp asimd evtstrm aes pmull sha1 sha2 crc32
Solution
There are a number of questions hidden in here; I'll try to extract them in turn.
> According to the ARM ARM, __ARM_NEON__ is defined when Neon SIMD instructions are available. I'm having trouble getting GCC to provide it.
That is compiler documentation for [an old version of] the ARM Compiler rather than the ARM Architecture Reference Manual. A better macro to check for the presence of the Advanced SIMD instructions would be __ARM_NEON, which is defined in the ARM C Language Extensions.
> Try GCC and -march=native:
As you may have found, GCC for the ARM target separates out -march (the architecture revision for which GCC should generate code), -mfpu (the floating-point/Advanced SIMD unit available) and -mfloat-abi (how floating-point arguments should be passed, and the presence or absence of a floating-point unit). Finally there are -mtune (which asks GCC to try to optimise for a particular processor) and -mcpu (which acts as a combination of -mtune and -march).
By asking for -march=native you're asking GCC to generate code appropriate for the detected architecture of the processor on which you are running. This has no impact on the -mfpu setting, and so does not necessarily enable Advanced SIMD instruction generation.
Note that the above only applies to a compiler targeting AArch32. The AArch64 GCC does not support -mfpu and will detect the presence of Advanced SIMD support through -march=native.
> OK, try what Google uses for Android when building for Neon:
> $ g++ -march=armv7-a -mfpu=vfpv3-d16 -mfloat-abi=softfp -dM -E
These build flags are not sufficient to enable support for Advanced SIMD instructions; your notes may be incomplete. Of the -mfpu values supported by GCC 4.9.2, I'd expect any of neon, neon-fp16, neon-vfpv4, neon-fp-armv8 or crypto-neon-fp-armv8 to give you what you want.
> According to ARM, this board does have Advanced SIMD instructions even though:
Looks like you're running on an AArch64 kernel, which exposes support for Advanced SIMD through the asimd feature, as in your example output.