Getting GCC to optimize hand assembly

Question

In an attempt to make GCC not generate a load-modify-store operation every time I do |= or &=, I have defined the following macros:

#define bset(base, offset, mask) bmanip(set, base, offset, mask)

#define bclr(base, offset, mask) bmanip(clr, base, offset, mask)

#define bmanip(op, base, offset, mask) \
asm("pshx");\
asm("ldx " #base);\
asm("b" #op " " #offset ",x " #mask);\
asm("pulx")

And they work great; the disassembled binary is perfect.

The problem comes when I use more than one in sequence:

inline void spi_init()
{
  bset(_io_ports, M6811_DDRD, 0x38);
  bset(_io_ports, M6811_PORTD, 0x20);
  bset(_io_ports, M6811_SPCR, (M6811_SPE | M6811_DWOM | M6811_MSTR));
}

The result is:

00002227 <spi_init>:
    2227:       3c              pshx
    2228:       fe 10 00        ldx     0x1000 <_io_ports>
    222b:       1c 09 38        bset    0x9,x, #0x38
    222e:       38              pulx
    222f:       3c              pshx
    2230:       fe 10 00        ldx     0x1000 <_io_ports>
    2233:       1c 08 20        bset    0x8,x, #0x20
    2236:       38              pulx
    2237:       3c              pshx
    2238:       fe 10 00        ldx     0x1000 <_io_ports>
    223b:       1c 28 70        bset    0x28,x, #0x70
    223e:       38              pulx
    223f:       39              rts

Is there any way to get GCC (3.3.6-m68hc1x-20060122) to automatically optimize out the redundant stack operations?

Recommended answer

gcc will always emit the assembly instructions you tell it to emit. So instead of explicitly writing code to load registers with the value you want to manipulate, you instead want to tell gcc to do this on your behalf. You can do this with register constraints.

Unfortunately the 6811 code generator doesn't seem to be a standard part of gcc --- I don't spot the documentation in the manual. So I can't point you at the platform-specific bit of the docs. But the generic bit you need to read is here: http://gcc.gnu.org/onlinedocs/gcc-4.8.1/gcc/Extended-Asm.html#Extended-Asm

The syntax is weird, but in summary:

asm("instructions" : outputs : inputs);

...where inputs and outputs are lists of constraints, which tell gcc what value to put where. The classic example is:

asm("fsinx %1,%0" : "=f" (result) : "f" (angle));

f indicates that the named value needs to go into a floating point register; = indicates it's an output; then the names of the registers are substituted into the instruction.
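To make the constraint syntax concrete, here is a minimal sketch that can be compiled on a desktop machine. It uses x86 as a stand-in target, since I can't vouch for the 6811 constraint letters; the function name or_bits is made up for illustration, and on any other target the code falls back to plain C.

```c
#include <stdint.h>

/* Hypothetical helper: OR `mask` into `value`.
 * Uses GCC extended asm on x86 (an assumed stand-in target, not the 6811)
 * and plain C everywhere else. */
static inline uint32_t or_bits(uint32_t value, uint32_t mask)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* "+r" : value is both read and written, in a register gcc picks.
     * "r"  : mask is an input, in any general register.
     * "cc" : the instruction modifies the condition flags.
     * AT&T operand order is "source, destination", so %0 receives the result. */
    __asm__("orl %1, %0" : "+r"(value) : "r"(mask) : "cc");
#else
    value |= mask;
#endif
    return value;
}
```

Calling or_bits(0x01, 0x38) returns 0x39. Note that no load or store is written by hand: gcc decides which registers hold value and mask and emits whatever moves are needed, if any.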

So, you'll probably want something like this:

asm("b" #op " " #offset ",%0 " #mask : "=Z" (i) : "0" (i));

...where i is a variable containing the value you want to modify. Z you'll need to look up in the 6811 gcc docs --- it's a constraint which represents a register which is valid for the asm instruction which is being generated. The 0 indicates that the input shares a register with output 0, and is used for read/write values.

Because you've told gcc what register you want i to be, it can integrate this knowledge into its register allocator and find the least-cost way to get i where you need it with the least amount of code. (Sometimes no additional code.)

gcc inline assembly is deeply contorted and weird, but pretty powerful. It's worth spending some time to thoroughly understand the constraint system to get the best use out of it.

(Incidentally, I don't know 6811 code, but have you forgotten to put the result of the op somewhere? I'd expect to see an stx to match the ldx.)

Update: Oh, I see what bset is doing now --- it's writing the result back to a memory location, right? That's still doable but it's a bit more painful. You need to tell gcc that you're modifying that memory location, so that it knows not to rely on any cached value. You'll need to have an output parameter with constraint m which represents that location. Check the docs.
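As a rough illustration of the memory-operand approach, the sketch below again uses x86 as an assumed stand-in, because I haven't checked the 6811 constraint letters. The names set_bits and clear_bits are hypothetical. The "+m" constraint tells gcc that the byte at *reg is read and rewritten by the instruction, so it won't rely on a cached copy.

```c
#include <stdint.h>

/* Hypothetical helpers: set or clear bits directly in a memory-mapped byte.
 * x86 is an assumed stand-in target; other targets use plain C. */
static inline void set_bits(volatile uint8_t *reg, uint8_t mask)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    /* "+m" : *reg is a memory operand that is read and written in place.
     * "iq" : mask may be an immediate or a byte-addressable register. */
    __asm__ volatile("orb %1, %0" : "+m"(*reg) : "iq"(mask) : "cc");
#else
    *reg |= mask;
#endif
}

static inline void clear_bits(volatile uint8_t *reg, uint8_t mask)
{
    uint8_t keep = (uint8_t)~mask;   /* bits that must survive */
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    __asm__ volatile("andb %1, %0" : "+m"(*reg) : "iq"(keep) : "cc");
#else
    *reg &= keep;
#endif
}
```

Because the address is supplied through a constraint rather than loaded by hand, the compiler is free to keep the base address in a register across several consecutive calls, which is the effect the question was after.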
