What is the right way to create a constant pool for inline assembly?

Question

The problem is that I have inline assembly inside a C function, something like:

  ldr r7, =0xdeadbeef
  svc 0

If no literal pool is created explicitly (as is the case here), the assembler emits one at the end of the translation unit. Usually this is fine, but if the translation unit turns out to be really huge it does not work, because the literal pool ends up too far from the ldr instruction, whose PC-relative offset has a limited range.
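To make "too far" concrete, here is a minimal sketch of the reach check, assuming ARM (A32) state, where a PC-relative ldr carries a 12-bit byte offset and the PC reads as the instruction address plus 8. The function name is made up for illustration, and Thumb encodings have different (for the 16-bit form, shorter and forward-only) limits that this model does not cover.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: models the reach of an A32 "ldr Rt, [pc, #+/-imm12]".
   In ARM state the PC reads as the instruction address + 8, and the offset
   magnitude is at most 4095 bytes. */
static bool a32_literal_in_range(uint32_t insn_addr, uint32_t literal_addr)
{
    int64_t delta = (int64_t)literal_addr - ((int64_t)insn_addr + 8);
    return delta >= -4095 && delta <= 4095;
}
```

A pool that the assembler flushes at the end of a multi-megabyte section fails this check for any ldr near the start of the section, which is the situation the assembler reports as an error.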

So I wonder what the best way to handle this is. The most obvious way is to create a literal pool manually inside the inline assembly:

  ldr r7, =0xdeadbeef
  svc 0
  b 1f
  .ltorg
1:

or, spelling the literal out by hand:

  ldr r7, 1f
  svc 0
  b 2f
1:
  .word 0xdeadbeef
2:

Unfortunately, this leads to suboptimal code because of the redundant branch instruction. I don't expect the assembler to be clever enough to find an appropriate place for the constant pool inside the function. What I would like to do is create the constant pool at the end of the function. Is there any way to tell the compiler (gcc) to create a literal pool at the end of the function?

PS: I ended up using a movw/movt pair instead of a constant pool. However, firstly, the movw/movt solution is slightly less portable than literal pools and, secondly, I still wonder whether constant pools can be used in inline assembly both reliably and efficiently.
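For reference, a rough sketch of what the movw/movt pair does: movw writes a 16-bit immediate and zeroes the upper half of the register, then movt replaces only the upper half. The helper names are hypothetical, the `__ARM_ARCH >= 7` guard is a conservative assumption (the instructions need ARMv6T2 or later), and on other targets the function falls back to the plain-C model so the snippet still builds.

```c
#include <stdint.h>

/* C model of the pair (names are illustrative only). */
static uint32_t model_movw(uint16_t imm16)
{
    return imm16;                       /* upper halfword becomes zero */
}

static uint32_t model_movt(uint32_t reg, uint16_t imm16)
{
    return (reg & 0x0000ffffu) | ((uint32_t)imm16 << 16);
}

static uint32_t load_deadbeef(void)
{
#if defined(__arm__) && defined(__ARM_ARCH) && (__ARM_ARCH >= 7)
    uint32_t value;
    /* No literal pool involved: both halves are encoded in the instructions. */
    __asm__("movw %0, #0xbeef\n\t"
            "movt %0, #0xdead"
            : "=r"(value));
    return value;
#else
    /* Fallback for targets without movw/movt: use the model above. */
    return model_movt(model_movw(0xbeefu), 0xdeadu);
#endif
}
```

Because the constant lives in the instruction stream, the distance problem of the literal pool does not arise at all; the cost is two instructions and the architecture requirement.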

Update: So, what is the best way to handle the problem?

To force the toolchain to create a constant pool right after the function, one can put the function in a separate code section. This works because, at the end of a translation unit, the assembler generates a separate constant pool for each section.
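A minimal sketch of the separate-section idea, assuming GCC on an ELF ARM target. The section name `.text.load_magic` and the function name are made up, and the example loads the constant into a compiler-chosen register instead of issuing the svc so that it is harmless to run. On non-ARM targets the asm is compiled out and the function just returns the constant.

```c
#include <stdint.h>

#if defined(__arm__) && defined(__ELF__)
/* noinline keeps the asm (and therefore its pool) in this section. */
#define IN_OWN_SECTION __attribute__((noinline, section(".text.load_magic")))
#else
#define IN_OWN_SECTION  /* nothing to demonstrate on other targets */
#endif

IN_OWN_SECTION uint32_t load_magic_own_section(void)
{
#if defined(__arm__)
    uint32_t value;
    /* The literal for this ldr is flushed at the end of .text.load_magic,
       i.e. right behind this function, not at the end of all of .text. */
    __asm__("ldr %0, =0xdeadbeef" : "=r"(value));
    return value;
#else
    return 0xdeadbeefu;
#endif
}
```

The pool ends up a few instructions away from the ldr regardless of how large the rest of the translation unit is, and no branch around the pool is needed.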

In fact, though, the best way is to avoid loading constants into registers in inline assembly at all. It is better to let the compiler do that. In my case I eventually wrote code similar to:

register int var asm("r7") = 0xdeadbeef;
asm volatile("svc 0\n" :: "r" (var));

Recommended answer

You can use -ffunction-sections so that each function goes into its own section and, as per the query on -ffunction-sections, link with ld --gc-sections to remove unused code.

Splitting up the file is the other obvious option.

A solution that should work is to use a naked function with an unused annotation, as it is never called. Place a single .ltorg in it and put both functions in a special section, .text.ltorg_kludge for instance. The linker script should match .text*, and functions in identical sub-sections are placed together. In some ways this is like splitting up the file, as the compiler will try to inline static functions.
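The kludge might look roughly like the sketch below, assuming GCC on an ELF ARM target. The function names are hypothetical; only the section name comes from the text above. The example loads the constant into a compiler-chosen register instead of issuing an svc, and on non-ARM targets everything ARM-specific is compiled out.

```c
#include <stdint.h>

#if defined(__arm__) && defined(__ELF__)
#define KLUDGE_SECTION __attribute__((section(".text.ltorg_kludge")))
#else
#define KLUDGE_SECTION
#endif

KLUDGE_SECTION __attribute__((noinline)) uint32_t load_magic_kludge(void)
{
#if defined(__arm__)
    uint32_t value;
    __asm__("ldr %0, =0xdeadbeef" : "=r"(value));
    return value;
#else
    return 0xdeadbeefu;
#endif
}

#if defined(__arm__)
/* Never called: its only job is to make the assembler dump the pending
   literal pool here. naked suppresses prologue and epilogue, so the
   function body is nothing but the pool. */
KLUDGE_SECTION __attribute__((naked, unused)) void ltorg_kludge(void)
{
    __asm__(".ltorg");
}
#endif
```

This relies on the compiler emitting ltorg_kludge() after load_magic_kludge(), which is not guaranteed. If the order is reversed, the pool simply falls back to the end of the small section, which is still close to the ldr.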

You may be able to rely on the compiler emitting functions in the order they are encountered in the source, without a special section. However, I am not sure whether this is standard behaviour or happenstance. Compilers may optimize better by emitting functions in some DAG ordering of the call hierarchy.

Aside: movw/movt is more efficient due to cache effects, since the constant is encoded in the instructions and no separate data load from a literal pool is needed. It also works in Thumb-2 code on ARMv6T2 and above. I don't see portability as a big deal (inline assembler is non-portable anyway, and you probably prefer performance over portability), but the question is relevant to ARMv4/5 users.

I investigated the use of the R constraint from the gcc machine constraints,

R
      An item in the constant pool

However, a sample with gcc-4.8 gives the error "impossible constraint". Using alternative letters like C also gives the same error message. Inspection of the source file constraints.md seems to indicate that the R constraint is a documentation-only feature. That is unfortunate, as it sounds purpose-built to solve this issue.

It is possible to have the compiler load the value, but this may be sub-optimal depending on the inline assembler. For example,

  unsigned int acc = 0xdeadbeef;  /* "+r" needs an lvalue, so use a variable */
  asm(" add %0, %0, %1\n" : "+r" (acc) : "r" (0xbaddeed0));
