Detect ARM NEON availability in the preprocessor?

Problem description

According to the ARM ARM, __ARM_NEON__ is defined when Neon SIMD instructions are available. I'm having trouble getting GCC to provide it.

Neon is available on this BananaPi Pro dev board running Debian 8.2:

$ cat /proc/cpuinfo | grep neon
Features    : swp half thumb fastmult vfp edsp neon vfpv3 tls vfpv4 idiva idivt 

I'm using GCC 4.9:

$ gcc --version
gcc (Debian 4.9.2-10) 4.9.2

Try GCC and -march=native:

$ g++ -march=native -dM -E - </dev/null | grep -i neon
#define __ARM_NEON_FP 4

OK, try what Google uses for Android when building for Neon:

$ g++ -march=armv7-a -mfpu=vfpv3-d16 -mfloat-abi=softfp -dM -E - </dev/null | grep -i neon
#define __ARM_NEON_FP 4

Maybe ARMv7-A with hard float:

$ g++ -march=armv7-a -mfloat-abi=hard -dM -E - </dev/null | grep -i neon
#define __ARM_NEON_FP 4

My questions are:

  • why am I not seeing __ARM_NEON__?
  • how do I detect Neon availability in the preprocessor?

And maybe:

  • what GCC switches should I use to enable Neon SIMD instructions?

Related, on a LeMaker HiKey, which is AARCH64/ARM64 running Linaro with GCC 4.9.2, here's the output from the preprocessor:

$ cpp -dM </dev/null | grep -i neon
#define __ARM_NEON 1

According to ARM, this board does have Advanced SIMD instructions even though:

$ cat /proc/cpuinfo 
Processor   : AArch64 Processor rev 3 (aarch64)
...
Features    : fp asimd evtstrm aes pmull sha1 sha2 crc32

Solution

There are a number of questions hidden in here; I'll try to address them in turn.

> According to the ARM ARM, __ARM_NEON__ is defined when Neon SIMD instructions are available. I'm having trouble getting GCC to provide it.

That is compiler documentation for [an old version of] the ARM Compiler rather than the ARM Architecture Reference Manual. A better macro to check for the presence of the Advanced SIMD instructions is __ARM_NEON, which is defined in the ARM C Language Extensions.
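As a minimal sketch of what such a preprocessor check could look like, the snippet below tests both the ACLE spelling (__ARM_NEON) and the older spelling from the question (__ARM_NEON__). HAVE_NEON and add4 are hypothetical names chosen for illustration; they are not provided by GCC or by arm_neon.h.

```c
#include <assert.h>
#include <stddef.h>

/* HAVE_NEON is a hypothetical project-local macro, not a compiler builtin.
   __ARM_NEON is the ACLE spelling; __ARM_NEON__ is the older spelling
   mentioned in the question. */
#if defined(__ARM_NEON) || defined(__ARM_NEON__)
#  include <arm_neon.h>
#  define HAVE_NEON 1
#else
#  define HAVE_NEON 0
#endif

/* Adds four floats element-wise: out[i] = a[i] + b[i]. */
void add4(const float *a, const float *b, float *out)
{
    assert(a != NULL && b != NULL && out != NULL);
#if HAVE_NEON
    /* Advanced SIMD path: load two 128-bit vectors, add, store. */
    vst1q_f32(out, vaddq_f32(vld1q_f32(a), vld1q_f32(b)));
#else
    /* Portable fallback, used when the compiler was not told NEON exists. */
    for (size_t i = 0; i < 4; ++i)
        out[i] = a[i] + b[i];
#endif
}
```

Because the check happens at compile time, it reflects the flags passed to the compiler, not what the CPU running the binary actually supports.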

> Try GCC and -march=native:

As you may have found, GCC for the ARM target separates out -march (the architecture revision for which GCC should generate code), -mfpu (the floating-point/Advanced SIMD unit available) and -mfloat-abi (how floating-point arguments should be passed, and whether a floating-point unit is present at all). Finally, there are -mtune (which asks GCC to try to optimise for a particular processor) and -mcpu (which acts as a combination of -mtune and -march).

By asking for -march=native, you're asking GCC to generate code appropriate for the detected architecture of the processor on which you are running. This has no impact on the -mfpu setting, and so does not necessarily enable Advanced SIMD instruction generation.

Note that the above only applies to a compiler targeting AArch32. The AArch64 GCC does not support -mfpu and will detect presence of Advanced SIMD support through -march=native.

> OK, try what Google uses for Android when building for Neon:
>
> $ g++ -march=armv7-a -mfpu=vfpv3-d16 -mfloat-abi=softfp -dM -E

These build flags are not sufficient to enable support for Advanced SIMD instructions; your notes may be incomplete. Of the -mfpu values supported by GCC 4.9.2, I'd expect any of the following to give you what you want:

neon, neon-fp16, neon-vfpv4, neon-fp-armv8, crypto-neon-fp-armv8
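One way to see which configuration a given set of flags actually produced is to have the translation unit report its own predefined macros. The following is only a sketch: describe_simd_target is a hypothetical helper name, and it assumes a GCC-compatible compiler that predefines __aarch64__ or __arm__ on ARM targets.

```c
#include <assert.h>
#include <stdio.h>

/* Writes a one-line summary of the compile-time target into buf.
   Returns what snprintf returns: the length the full summary needs. */
int describe_simd_target(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

#if defined(__aarch64__)
    const char *state = "AArch64";
#elif defined(__arm__)
    const char *state = "AArch32";
#else
    const char *state = "non-ARM";
#endif

#if defined(__ARM_NEON) || defined(__ARM_NEON__)
    const char *neon = "yes";
#else
    const char *neon = "no";
#endif

#if defined(__ARM_NEON_FP)
    unsigned neon_fp = __ARM_NEON_FP;   /* bitmask, e.g. 4 in the question */
#else
    unsigned neon_fp = 0;               /* macro not defined at all */
#endif

    return snprintf(buf, len, "target=%s neon=%s neon_fp=0x%x",
                    state, neon, neon_fp);
}
```

Building this once with -mfpu=vfpv3-d16 and once with -mfpu=neon (keeping -march and -mfloat-abi the same) should, going by the explanation above, change the neon= field from no to yes on an AArch32 GCC.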

> According to ARM, this board does have Advanced SIMD instructions even though:

It looks like you're running on an AArch64 kernel, which exposes support for Advanced SIMD through the asimd feature, as in your example output.
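If you want to check the Features line programmatically, a small token matcher covers both spellings seen in the two /proc/cpuinfo outputs in the question (neon on the AArch32 kernel, asimd on the AArch64 kernel). This is an illustrative sketch; has_cpu_feature and has_advanced_simd are hypothetical helper names, and reading the line out of /proc/cpuinfo is left to the caller.

```c
#include <assert.h>
#include <string.h>

/* Returns 1 if `features` (the text after "Features :") contains `name`
   as a whole whitespace-separated token, 0 otherwise. */
int has_cpu_feature(const char *features, const char *name)
{
    assert(features != NULL && name != NULL);
    size_t n = strlen(name);
    if (n == 0)
        return 0;

    const char *p = features;
    while ((p = strstr(p, name)) != NULL) {
        /* Require a token boundary on both sides so "simd" does not
           match inside "asimd" and "vfp" does not match "vfpv3". */
        int starts = (p == features) || p[-1] == ' ' || p[-1] == '\t';
        int ends = p[n] == '\0' || p[n] == ' ' || p[n] == '\t' || p[n] == '\n';
        if (starts && ends)
            return 1;
        p += 1;
    }
    return 0;
}

/* Advanced SIMD shows up as "neon" or "asimd" in the outputs above. */
int has_advanced_simd(const char *features)
{
    return has_cpu_feature(features, "neon") ||
           has_cpu_feature(features, "asimd");
}
```

Note that this is a runtime property of the machine and kernel; it is independent of whether the compiler defined __ARM_NEON for a particular build.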
