ARM: Why do I need to push/pop two registers at function calls?
Question
I understand that I need to push the Link Register at the beginning of a function call, and pop that value to the Program Counter before returning, so that execution can carry on from where it was before the function call.
What I don't understand is why most people do this by adding an extra register to the push/pop. For instance:
push {ip, lr}
...
pop {ip, pc}
For instance, here's a Hello World in ARM, provided by the official ARM blog:
.syntax unified
@ --------------------------------
.global main
main:
@ Stack the return address (lr) in addition to a dummy register (ip) to
@ keep the stack 8-byte aligned.
push {ip, lr}
@ Load the argument and perform the call. This is like 'printf("...")' in C.
ldr r0, =message
bl printf
@ Exit from 'main'. This is like 'return 0' in C.
mov r0, #0 @ Return 0.
@ Pop the dummy ip to reverse our alignment fix, and pop the original lr
@ value directly into pc — the Program Counter — to return.
pop {ip, pc}
@ --------------------------------
@ Data for the printf calls. The GNU assembler's ".asciz" directive
@ automatically adds a NULL character termination.
message:
.asciz "Hello, world.\n"
Question 1: what's the reason for the "dummy register" as they call it? Why not simply push{lr} and pop{pc}? They say it's to keep the stack 8-byte aligned, but ain't the stack 4-byte aligned?
Question 2: what register is "ip" (i.e., r7 or what?)
Recommended answer
what's the reason for the "dummy register" as they call it? Why not simply push{lr} and pop{pc}? They say it's to keep the stack 8-byte aligned, but ain't the stack 4-byte aligned?
The stack only requires 4-byte alignment; but if the data bus is 64 bits wide (as it is on many modern ARMs), it's more efficient to keep it at an 8-byte alignment. Then, for example, if you call a function that needs to stack two registers, that can be done in a single 64-bit write rather than two 32-bit writes.
UPDATE: Apparently it's not just for efficiency; as noted in the comments, it's a requirement of the official procedure call standard (the AAPCS), which requires the stack pointer to be 8-byte aligned at public interfaces such as the call to printf.
If you're targeting older 32-bit ARMs, then the extra stacked register might degrade performance slightly.
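The alignment arithmetic behind the dummy register can be sketched in a few lines of Python. This is only an illustration, not ARM code: the helper names and the entry value of the stack pointer are hypothetical, and it assumes a full-descending stack with 4-byte registers and a stack pointer that is 8-byte aligned on entry to the function.

```python
WORD = 4  # bytes per 32-bit core register


def sp_after_push(sp, num_regs):
    """Stack pointer after pushing num_regs registers (the stack grows downward)."""
    return sp - WORD * num_regs


def is_8_byte_aligned(sp):
    """True if sp is a multiple of 8."""
    return sp % 8 == 0


# Hypothetical entry value, chosen only because it is 8-byte aligned.
entry_sp = 0x7FFF0000

only_lr = sp_after_push(entry_sp, 1)     # push {lr}
with_dummy = sp_after_push(entry_sp, 2)  # push {ip, lr}

print(hex(only_lr), is_8_byte_aligned(only_lr))        # 0x7ffefffc False
print(hex(with_dummy), is_8_byte_aligned(with_dummy))  # 0x7ffefff8 True
```

Pushing an odd number of registers leaves the stack pointer 4 bytes off an 8-byte boundary at the point where printf is called; pushing one more register restores the alignment.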
what register is "ip" (i.e., r7 or what?)
r12. See the procedure call standard for the full set of register aliases it uses.
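As a quick reference, the special-purpose aliases can be written out as a lookup table. The sketch below is illustrative Python, not assembler syntax; the table covers only a subset of the aliases and the helper name is hypothetical.

```python
# Special-purpose aliases for the 32-bit ARM core registers (subset).
ALIASES = {"ip": "r12", "sp": "r13", "lr": "r14", "pc": "r15"}


def expand_aliases(reg_list):
    """Replace alias names in a register list with their rN names."""
    return [ALIASES.get(name, name) for name in reg_list]


print(expand_aliases(["ip", "lr"]))  # the push in the question: ['r12', 'r14']
print(expand_aliases(["ip", "pc"]))  # the matching pop: ['r12', 'r15']
```

ip is the intra-procedure-call scratch register, which a called function is not required to preserve, so using it as the padding register costs nothing: no one depends on the value that gets saved and restored.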