GCC to emit ARM idiv instructions (continued)


Question

I am wondering if this is possible for a Krait 400 CPU. I followed some of the suggestions here

When I compile with -mcpu=cortex-a15, the code compiles and I do see udiv instructions in the assembly dump.

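However, I would like to know:

  1. Is it possible to make this work with -march=armv7-a? (No CPU specified; this is how I had it originally.)
  2. I tried -mcpu=krait2, but since I am not using the Snapdragon LLVM (I do not yet know how much effort that would take), it is not recognized. Is it possible to take the CPU definition from LLVM and somehow make it available to my compiler?
  3. Any other method/patch/trick?

My compiler options are as follows: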

 /development/android-ndk-r8e/toolchains/arm-linux-androideabi-4.7/prebuilt/linux-x86_64/bin/arm-linux-androideabi-gcc  -DANDROID -DNEON -fexceptions -Wno-psabi --sysroot=/development/android-ndk-r8e/platforms/android-14/arch-arm -fpic -funwind-tables -funswitch-loops -finline-limit=300 -fsigned-char -no-canonical-prefixes -march=armv7-a -mfloat-abi=softfp -mfpu=neon -fdata-sections -ffunction-sections -Wa,--noexecstack  -marm -fomit-frame-pointer -fstrict-aliasing -O3 -DNDEBUG

The error I get is:

Error: selected processor does not support ARM mode `udiv r1,r1,r3'

As a side note, I have to say that I am just beginning to understand the whole scheme, so I want to take small steps and understand what I am doing.

Thanks in advance.

Edit 1:

I tried compiling a separate module that contains only the udiv instruction. That module is compiled with the -mcpu=cortex-a15 parameter, while the rest of the application is compiled with the -march=armv7-a parameter. The result was (somewhat as expected) that the function call overhead hurt the application's run time. I could not get inline code, since trying to inline it produced the same error I had originally. I will switch to the Snapdragon toolchain to see whether performance is better before trying to reinvent the wheel. Thanks, everybody, for the answers and tips.

Recommended answer

idiv (an amalgam meaning that both sdiv and udiv are supported) is an optional instruction on Cortex-A. Whether a given Cortex-A supports it can be queried via the ID_ISAR0 cp15 register, in bits [27:24].

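  /* Get idiv support. Reading ID_ISAR0 is a privileged cp15 access,
     so this must run in a privileged mode (e.g. kernel code). */
  unsigned int ISAR0;
  int idiv;
  __asm__ ("mrc p15, 0, %0, c0, c2, 0" : "=r" (ISAR0));
#ifdef __thumb2__
  /* Thumb-2 code: any non-zero field means sdiv/udiv exist. */
  idiv = (ISAR0 & 0xf000000UL) ? 1 : 0;
#else
  /* ARM code: the field must be 0010. */
  idiv = (ISAR0 & 0xf000000UL) == 0x2000000UL ? 1 : 0;
#endif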

If bits [27:24] are 0001, only Thumb-2 supports the udiv and sdiv instructions. If bits [27:24] are 0010, both ARM and Thumb modes support them.
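
The field decoding described above can be sketched as a pair of pure C helpers that operate on an already-read ID_ISAR0 value. This is only an illustration: the names `isar0_divide_field` and `idiv_usable` are hypothetical and do not come from the answer, and the sketch assumes only the two encodings mentioned there (0001 and 0010).

```c
#include <stdint.h>

/* Extract bits [27:24] of ID_ISAR0 (the divide-instruction field). */
static unsigned isar0_divide_field(uint32_t isar0)
{
    return (isar0 >> 24) & 0xfu;
}

/*
 * Return 1 if sdiv/udiv may be used by code of the given kind.
 * thumb_code != 0: Thumb-2 code, any non-zero field is enough.
 * thumb_code == 0: ARM code, the field must be 0010.
 */
static int idiv_usable(uint32_t isar0, int thumb_code)
{
    unsigned field = isar0_divide_field(isar0);

    if (thumb_code)
        return field != 0;
    return field == 2;
}
```

Separating the decoding from the `mrc` read keeps the logic testable on any host; only the register read itself has to run on the target. Since the question builds with -marm, the ARM-mode branch is the one that matters there.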

Since gcc flags such as -march=armv7-a mean that the code should work on ALL CPUs of this type, and this instruction is optional, it would be an error for gcc to emit it.

You may compile different modules with different flags, for example:

gcc -march=armv7-a -o general.o -c general.c
gcc -mcpu=cortex-a15 -D_USE_IDIV_=1 -o fast_idiv.o -c fast_idiv.c

These modules can be linked together, and the code above can be used to select an appropriate routine at run time. For example, both files may contain:

  #include "fir_template.def"

That file may contain:

#ifdef _USE_IDIV_
  #define _FUNC(x) idiv_ ## x
#else
  #define _FUNC(x) x
#endif

int _FUNC(fir8)(FILTER8 *filter, SAMPLE *data)
{
   ....
}

If you know your code will only run on a Cortex-A15, use the -mcpu option. If you want it to run faster when it can while remaining generic (supporting all ARMv7-A CPUs), you must identify the CPU as outlined above and select the code dynamically.
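
One way to picture the dynamic selection is a function pointer that is set once at start-up. The following is a minimal sketch, not the answer's actual code: `div_u32`, `idiv_div_u32`, `div_fn` and `select_div` are hypothetical names, and both implementations are shown in one file only for brevity. In a real build they would live in separate translation units (such as general.c and fast_idiv.c) compiled with the different flags shown earlier.

```c
#include <stdint.h>

typedef uint32_t (*div_fn)(uint32_t, uint32_t);

/* Generic version: would be built with -march=armv7-a, so the
   compiler falls back to a library division routine. */
uint32_t div_u32(uint32_t n, uint32_t d)
{
    return n / d;
}

/* Fast version: would be built with -mcpu=cortex-a15, so the
   compiler is free to emit udiv for the same source. */
uint32_t idiv_div_u32(uint32_t n, uint32_t d)
{
    return n / d;
}

/* Pick the routine once, based on a previously detected capability. */
div_fn select_div(int have_idiv)
{
    return have_idiv ? idiv_div_u32 : div_u32;
}
```

A caller would store the result of `select_div()` once and call through the pointer afterwards. As Edit 1 in the question notes, per-call overhead matters, so it is usually better to dispatch at the level of a whole routine (such as the fir8 filter) than around a single division.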

Addendum: The files above (general.c and fast_idiv.c) could be put in separate shared libraries with the same API. Then interrogate /proc/cpuinfo to see whether idiv is supported, and set LD_LIBRARY_PATH (or use dlopen()) to pick the appropriate version. The choice will depend on how much code is involved.
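
As a rough sketch of the /proc/cpuinfo check, the helpers below look for a feature word on the "Features" line. This assumes an ARM Linux kernel that reports hardware divide as the words `idiva` (ARM mode) and `idivt` (Thumb mode); the helper names `has_flag` and `cpuinfo_has` are hypothetical.

```c
#include <stdio.h>
#include <string.h>

/* Return 1 if `flag` occurs as a whole whitespace-separated word in `list`. */
static int has_flag(const char *list, const char *flag)
{
    size_t n = strlen(flag);
    const char *p = list;

    if (n == 0)
        return 0;
    while ((p = strstr(p, flag)) != NULL) {
        int start_ok = (p == list) || p[-1] == ' ' || p[-1] == '\t';
        char end = p[n];
        int end_ok = end == '\0' || end == ' ' || end == '\t' || end == '\n';

        if (start_ok && end_ok)
            return 1;
        p += 1; /* partial match only, keep searching */
    }
    return 0;
}

/* Scan a cpuinfo-style file for a "Features" line that lists `flag`.
   Returns 0 if the file cannot be opened or the flag is absent. */
static int cpuinfo_has(const char *path, const char *flag)
{
    char line[1024];
    int found = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return 0;
    while (!found && fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "Features", 8) == 0) {
            const char *colon = strchr(line, ':');
            if (colon != NULL)
                found = has_flag(colon + 1, flag);
        }
    }
    fclose(f);
    return found;
}
```

A launcher could call `cpuinfo_has("/proc/cpuinfo", "idiva")` and then choose which shared library to load. Code built with -marm, as in the question, needs the ARM-mode flag rather than the Thumb one.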
