GCC to emit ARM idiv instructions (continued)


Problem description

I am wondering if this is possible for a Krait 400 CPU. I followed some of the suggestions here

When I compile with -mcpu=cortex-a15, the code compiles and I do see udiv instructions in the assembly dump.

However, I would like to know:

  1. Is it possible to get it to work with -march=armv7-a? (Not specifying a CPU; this is how I had it originally.)
  2. I tried to use -mcpu=krait2, but since I am not using the Snapdragon LLVM (I don't know yet how much effort that would be), it is not recognized. Is it possible to get the CPU definition from the LLVM and somehow make it available to my compiler?
  3. Is there any other method/patch/trick?

My compiler options are as follows:

 /development/android-ndk-r8e/toolchains/arm-linux-androideabi-4.7/prebuilt/linux-x86_64/bin/arm-linux-androideabi-gcc  -DANDROID -DNEON -fexceptions -Wno-psabi --sysroot=/development/android-ndk-r8e/platforms/android-14/arch-arm -fpic -funwind-tables -funswitch-loops -finline-limit=300 -fsigned-char -no-canonical-prefixes -march=armv7-a -mfloat-abi=softfp -mfpu=neon -fdata-sections -ffunction-sections -Wa,--noexecstack  -marm -fomit-frame-pointer -fstrict-aliasing -O3 -DNDEBUG

The error that I get is:

Error: selected processor does not support ARM mode `udiv r1,r1,r3'

As a side note, I have to say that I am just beginning to understand the whole scheme, so I want to take small steps and understand what I am doing.

Thanks in advance.

EDIT 1:

I tried compiling a separate module that contains only the udiv instruction. That module is compiled with the -mcpu=cortex-a15 parameter, while the rest of the application is compiled with the -march=armv7-a parameter. The result (somewhat expected) was that the function call overhead hurt the run-time performance of the application. I could not get the code inlined, since trying to inline it resulted in the same error that I originally had. I will switch to the Snapdragon LLVM to see whether performance is better before trying to reinvent the wheel. Thanks everybody for the answers and tips.

Solution

idiv (shorthand here for both sdiv and udiv) is an optional instruction on Cortex-A cores. Whether a given Cortex-A supports it can be queried via the ID_ISAR0 cp15 register, in bits [27:24].

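  /* Get idiv support: read ID_ISAR0 (cp15 c0, c2, 0) and test bits [27:24]. */
  unsigned int ISAR0;
  int idiv;
  __asm ("mrc p15, 0, %0, c0, c2, 0" : "=r" (ISAR0));
#ifdef __thumb2__
  /* Thumb-2 build: any non-zero field value means sdiv/udiv can be used. */
  idiv = (ISAR0 & 0xf000000UL) ? 1 : 0;
#else
  /* ARM build: the field must be 0010. */
  idiv = (ISAR0 & 0xf000000UL) == 0x2000000UL ? 1 : 0;
#endif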

Bits [27:24] are 0001 if the udiv and sdiv instructions are supported only in Thumb-2 mode. If bits [27:24] are 0010, the instructions are supported in both ARM and Thumb-2 modes.
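To make the bit test concrete, here is a minimal sketch that decodes an ID_ISAR0 value passed in as a plain integer, so it can be tried on any machine. The names idiv_field, idiv_usable and idiv_examples are made up for this illustration. Note that, as far as I know, the mrc read itself is only permitted in privileged mode, so user-space code would normally have to obtain this information some other way.

```c
#include <assert.h>
#include <stdint.h>

/* Extract bits [27:24] of an ID_ISAR0 value. */
static unsigned idiv_field(uint32_t isar0)
{
    return (isar0 >> 24) & 0xFu;
}

/*
 * Return 1 if code built for the given instruction set may use sdiv/udiv.
 * thumb_build: non-zero for a Thumb-2 build, zero for an ARM build.
 * (Hypothetical helper; mirrors the #ifdef __thumb2__ test above.)
 */
static int idiv_usable(uint32_t isar0, int thumb_build)
{
    unsigned field = idiv_field(isar0);
    return thumb_build ? (field != 0) : (field == 2);
}

/* Worked examples for the field values 0000, 0001 and 0010. */
static void idiv_examples(void)
{
    assert(!idiv_usable(0x00000000u, 1));  /* 0000: no hardware divide */
    assert( idiv_usable(0x01000000u, 1));  /* 0001: Thumb-2 only       */
    assert(!idiv_usable(0x01000000u, 0));  /* ...so not in ARM mode    */
    assert( idiv_usable(0x02000000u, 0));  /* 0010: ARM and Thumb-2    */
}
```

Keeping the decoding separate from the register read means the same logic can be reused whatever the source of the value is.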

The gcc flag -march=armv7-a (and similar) means that the code should work on ALL CPUs of this type. Since this instruction is optional, it would be an error for gcc to emit it.

You may compile different modules with different flags such as,

gcc -march=armv7-a -o general.o -c general.c
gcc -mcpu=cortex-a15 -D_USE_IDIV_=1 -o fast_idiv.o -c fast_idiv.c

These modules can be linked together and the above code can be used to select at run time an appropriate routine. For example, both files may have,

  #include "fir_template.def"

and fir_template.def might contain,

#ifdef _USE_IDIV_
  #define _FUNC(x) idiv_ ## x
#else
  #define _FUNC(x) x
#endif

int _FUNC(fir8)(FILTER8 *filter, SAMPLE *data,)
{
   ....
}

If you know your code will only run on a Cortex-A15, then use the -mcpu option. If you want it to run faster where it can while staying generic (supporting all armv7-a CPUs), then you must identify the CPU as outlined above and select the code dynamically.
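The dynamic selection could look roughly like the sketch below. It is only an illustration under several assumptions: the DEFINE_FIR8 macro stands in for compiling the template twice with different flags, the fir8 signature is simplified (the real FILTER8 and SAMPLE types are not shown in this answer), and select_fir8 is a made-up helper name.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical simplified signature; the real one takes FILTER8 and SAMPLE. */
typedef int (*fir8_fn)(const int *coeffs, const int *data, size_t n);

/*
 * Stand-in for the template file: in a real build the same body would be
 * compiled twice, once per object file, with different -march/-mcpu flags.
 */
#define DEFINE_FIR8(name)                                          \
static int name(const int *coeffs, const int *data, size_t n)     \
{                                                                  \
    int acc = 0;                                                   \
    assert(n > 0);                                                 \
    for (size_t i = 0; i < n; i++)                                 \
        acc += coeffs[i] * data[i];                                \
    return acc / (int)n; /* the divide is where sdiv would help */ \
}

DEFINE_FIR8(fir8)       /* generic build (-march=armv7-a)            */
DEFINE_FIR8(idiv_fir8)  /* fast build (-mcpu=cortex-a15, _USE_IDIV_) */

/* Pick the routine once, based on the result of the CPU check. */
static fir8_fn select_fir8(int have_idiv)
{
    return have_idiv ? idiv_fir8 : fir8;
}
```

A caller would run the CPU check once at start-up, store the returned pointer, and call through it from then on. The indirect call still costs something per invocation, so the split pays off mainly when each routine does a reasonable amount of work.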

Addendum: The files above (general.c and fast_idiv.c) could be put in separate shared libraries with the same API. Then interrogate /proc/cpuinfo to see whether idiv is supported, and either set LD_LIBRARY_PATH or use dlopen() to load the appropriate version. The choice will depend on how much code is involved.
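For the /proc/cpuinfo route, a minimal sketch is shown below. It assumes a Linux kernel that lists the flags idiva (ARM mode) and idivt (Thumb mode) on the Features line; older kernels may not report them even when the hardware has the instructions. The function names line_has_token and cpuinfo_has_feature are invented for this example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Return 1 if `token` appears in `line` as a whole whitespace-delimited word. */
static int line_has_token(const char *line, const char *token)
{
    size_t len = strlen(token);
    const char *p = line;

    assert(len > 0);  /* an empty token would never advance the search */
    while ((p = strstr(p, token)) != NULL) {
        int starts = (p == line) || isspace((unsigned char)p[-1]);
        int ends = (p[len] == '\0') || isspace((unsigned char)p[len]);
        if (starts && ends)
            return 1;
        p += len;
    }
    return 0;
}

/*
 * Scan a cpuinfo-style file for a "Features" line containing `flag`.
 * Returns 0 if the file cannot be opened or the flag is not listed.
 */
static int cpuinfo_has_feature(const char *path, const char *flag)
{
    char line[1024];
    int found = 0;
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return 0;
    while (!found && fgets(line, sizeof line, f) != NULL) {
        if (strncmp(line, "Features", 8) == 0)
            found = line_has_token(line, flag);
    }
    fclose(f);
    return found;
}
```

An ARM-mode build would call cpuinfo_has_feature("/proc/cpuinfo", "idiva") and load the fast library only when it returns 1. Treating a missing file or missing flag as "not supported" keeps the generic version as the safe default.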
