What is the right way to create a constant pool for inline assembly?


Question

The problem is that inside a C function I have an inline assembly. Something like

  ldr r7, =0xdeadbeef
  svc 0

If a literal pool isn't created explicitly (which is the case here), the assembler creates one at the end of the translation unit. Usually this is fine, but if the translation unit turns out to be really huge, it doesn't work, because the literal pool ends up too far from the ldr instruction (a PC-relative ldr can only reach about 4 KB in ARM state).

So, I wonder what is the best way to handle the problem. The most obvious way is to create a literal pool manually inside the inline assembly:

  ldr r7, =0xdeadbeef
  svc 0
  b 1f
  .ltorg
1:

Or

  ldr r7, 1f
  svc 0
  b 2f
1:
  .word 0xdeadbeef
2:

Unfortunately, this leads to suboptimal code because of the redundant branch instruction. I don't expect the assembler to be clever enough to find an appropriate place for the constant pool inside the function. What I would like to do is create the constant pool at the end of the function. Is there any way to tell the compiler (gcc) to create a literal pool at the end of the function?

PS: I ended up using a movw/movt pair instead of a constant pool. Still, firstly, the movw/movt solution is slightly less portable than literal pools and, secondly, I simply wonder whether it is possible to use constant pools in inline assembly both reliably and efficiently.


Update: So, what is the best way to handle the problem?

To force the toolchain to create a constant pool right after the function, one can put the function in a separate code section. This works because, at the end of a translation unit, the assembler generates a separate constant pool for each section.
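As a rough sketch of the separate-section approach, assuming GCC targeting ARM (ARM or Thumb-2 state): the section name `.text.load_magic` and the function `load_magic` are made up for illustration, and the `svc 0` is left out so that only the literal load remains. The `#else` branch is a plain C fallback so the file also builds on non-ARM hosts.

```c
#include <stdint.h>

#if defined(__arm__)
/* Hypothetical section name: the function gets a section of its own. */
#define OWN_SECTION __attribute__((noinline, section(".text.load_magic")))
#else
#define OWN_SECTION
#endif

OWN_SECTION uint32_t load_magic(void)
{
    uint32_t v;
#if defined(__arm__)
    /* "ldr rX, =const" asks the assembler to place the constant in the
       literal pool of the current section; with no explicit .ltorg the
       pool is dumped at the end of that section. */
    __asm__("ldr %0, =0xdeadbeef" : "=r"(v));
#else
    /* Fallback for other hosts: no literal pool involved. */
    v = 0xdeadbeefu;
#endif
    return v;
}
```

Disassembling the ARM object file (for example with `objdump -d`) should show the `0xdeadbeef` word directly after the function body. The `noinline` is there because an inlined copy would land in the caller's section and use that section's pool instead.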

In fact, though, the best way is to avoid loading constants into registers in inline assembly at all. It's better to let the compiler do that. In my case I eventually wrote code similar to

register int var asm("r7") = 0xdeadbeef;
asm volatile("svc 0\n" :: "r" (var));

Solution

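You can use -ffunction-sections and, as per the query on -ffunction-sections (http://stackoverflow.com/questions/4274804/query-on-ffunction-section-fdata-sections-options-of-gcc), use ld --gc-sections to remove unused code. With each function in its own section, each function also gets its own literal pool.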

There is also the obvious option of splitting up the file.

A solution that should work is to use a naked function with an unused annotation, as it is never called. Place a single .ltorg in it and put both functions in a special section, .text.ltorg_kludge for instance. The linker script should match .text*, and functions in identical sub-sections are placed together. In some ways this is like splitting up the file, as the compiler will otherwise try to inline static functions.
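The naked-function idea might look roughly like the sketch below, assuming GCC targeting ARM. The function names `load_magic_kludge` and `ltorg_holder` are hypothetical. The sketch marks the holder `used` rather than `unused`, on the assumption that a static function nobody calls would otherwise be discarded by the optimizer; that is a choice made here, not something the answer prescribes. The `#else` branch is only a fallback so the file builds on non-ARM hosts.

```c
#include <stdint.h>

#if defined(__arm__)
/* Both functions share one section, so they share one literal pool. */
__attribute__((noinline, section(".text.ltorg_kludge")))
uint32_t load_magic_kludge(void)
{
    uint32_t v;
    /* The constant goes into the pending pool of .text.ltorg_kludge. */
    __asm__("ldr %0, =0xdeadbeef" : "=r"(v));
    return v;
}

/* Never called: it only carries the pool. "naked" suppresses the
   prologue and epilogue, so the body is nothing but the pool data. */
static void __attribute__((naked, used, section(".text.ltorg_kludge")))
ltorg_holder(void)
{
    __asm__(".ltorg");
}
#else
/* Fallback for other hosts: no literal pool involved. */
uint32_t load_magic_kludge(void)
{
    return 0xdeadbeefu;
}
#endif
```

If the compiler happens to emit `ltorg_holder` before `load_magic_kludge`, the `.ltorg` has nothing to dump and the pool simply falls back to the end of the section, which is still close to the load as long as the section stays small.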

Without a special section, you may rely on the compiler emitting functions in the order they are encountered in the source. However, I am not sure whether this is standard behaviour or happenstance. Compilers may optimize better by emitting functions in some DAG ordering of the call hierarchy.


Aside: movw/movt is more efficient due to cache effects (the constant is encoded in the instructions themselves, so no separate data load from the literal pool is needed). It also works with Thumb2 code on ARMv6 and above. I don't see portability as a big deal (inline assembler is non-portable anyway, and you probably prefer performance over portability), but the question is relevant to ARMv4/5 users.
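For comparison, a minimal sketch of the movw/movt variant, assuming GCC and a target where both instructions exist. The guard on `__ARM_ARCH >= 7` is a deliberately conservative assumption, the function name `load_magic_movw_movt` is hypothetical, and the `#else` branch is a plain C fallback for every other target.

```c
#include <stdint.h>

uint32_t load_magic_movw_movt(void)
{
    uint32_t v;
#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
    /* movw writes the low 16 bits and clears the high 16 bits;
       movt then writes the high 16 bits and keeps the low half. */
    __asm__("movw %0, #0xbeef\n\t"
            "movt %0, #0xdead"
            : "=r"(v));
#else
    /* Fallback for targets without movw/movt. */
    v = 0xdeadbeefu;
#endif
    return v;
}
```

No literal pool is involved here, so the distance problem from the question does not arise; the cost is two instructions instead of one load plus a pool word.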


I investigated the use of the R constraint from gcc machine constraints,

R
      An item in the constant pool

However, a sample with gcc-4.8 gives the error "impossible constraint". Using alternative letters like C also gives the same error message. Inspection of the source file constraints.md seems to indicate that the R constraint is a documentation-only feature. Unfortunate, as it sounds purpose-built to solve this issue.

It is possible to have the compiler load the value, but this may be suboptimal depending on the inline assembler. For example,

  int x = 0xdeadbeef;  /* a "+r" operand must be an lvalue, not a constant */
  asm(" add %0, %0, %1\n" : "+r" (x) : "r" (0xbaddeed0));
