About the ARM PC value in a mixed 16/32-bit Thumb instruction stream


Question

I read a couple of articles, including the SO question "Why does the ARM PC register point to the instruction after the next one to be executed?", which explain that the value read from the PC register is the address of the currently executing instruction plus two instructions ahead, so in ARM state it is +8 bytes (2 × 32 bits).

My question is about Thumb state, where instructions can be 16 or 32 bits wide: does that mean the PC value read could be at an offset of +4 bytes or +8 bytes, for 16-bit and 32-bit instructions respectively?

For example:

279ae6: f8df 9338   ldr.w   r9, [pc, #824] --> pc value= 279aea or 279aee
279aea: f44f 7380   mov.w   r3, #256
279aee: 48cd        ldr r0, [pc, #820]

I did more tests with the following code:

1598:   467b        mov r3, pc
159a:   f8bf 4000   ldrh.w  r4, [pc]    ; 159c
159e:   46f9        mov r9, pc
15a0:   f83f 5001   ldrh.w  r5, [pc, #-1]   ; 15a3
15a4:   f83f 6002   ldrh.w  r6, [pc, #-2]   ; 15a6
15a8:   f83f 7003   ldrh.w  r7, [pc, #-3]   ; 15a9
15ac:   f83f 8004   ldrh.w  r8, [pc, #-4]   ; 15ac
15b0:   f04f 0908   mov.w   r9, #8
15b4:   f8d9 0008   ldr.w   r0, [r9, #8]    ; Trigger crash to check registers

Upon the crash, the registers are:

I/DEBUG   ( 2632): signal 11 (SIGSEGV), code 1 (SEGV_MAPERR), fault addr 0x10
I/DEBUG   ( 2632):     r0 b8ef4fc0  r1 aca6bb6c  r2 00000000  r3 aca7c59c
I/DEBUG   ( 2632):     r4 00004000  r5 00003f50  r6 00006002  r7 000003f8
I/DEBUG   ( 2632):     r8 0000f83f  r9 00000008  sl 00000000  fp aca6bbc0
I/DEBUG   ( 2632):     ip aca7c591  sp aca6bb40  lr acab722d  pc aca7c5b4  cpsr 60070030

The addresses shown in the code comments above (159c/15a3/15a6/15a9/15ac) were generated by objdump. I checked the contents at these positions against the registers, and they all seem right.

For a 16-bit instruction:

1598:   467b        mov r3, pc  ;    // r3 = 1598 + 4 = 159c, correct for +4 theory

While for a 32-bit Thumb instruction:

159a:   f8bf 4000   ldrh.w  r4, [pc]    ; // ld addr = 159a + 2 = 159c, where the content is 4000(hw), exactly r4 shows
                                        ; // Inconsistent with +4 theory

So for a 32-bit instruction it looks like PC read = address of the executing instruction + 2. Am I missing anything? Now I'm really confused about the PC offset.

BTW, this is an ARMv7-A platform using Thumb-2.

Thank you guys.

Solution

The PC offset is always 4 bytes in Thumb state. The reason is that the PC is two instruction fetches ahead, and a Thumb instruction fetch is (conceptually) always a halfword. That is also why 32-bit encodings still have the funny byte order of two little-endian halfwords, rather than one little-endian word.
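
To make the halfword point concrete, here is a minimal Python sketch (the helper names `thumb32_bytes` and `word_bytes` are made up for illustration) showing how the encoding `f8bf 4000` from the question would sit in little-endian memory as two halfwords, compared with how the same 32 bits would be laid out as a single little-endian word:

```python
import struct

def thumb32_bytes(hw1: int, hw2: int) -> bytes:
    """Memory image of a 32-bit Thumb encoding: two little-endian halfwords."""
    return struct.pack("<HH", hw1, hw2)

def word_bytes(value: int) -> bytes:
    """Memory image of the same 32 bits stored as one little-endian word."""
    return struct.pack("<I", value)

# ldrh.w r4, [pc] from the question, shown by objdump as "f8bf 4000".
as_halfwords = thumb32_bytes(0xF8BF, 0x4000)
as_word = word_bytes(0xF8BF4000)

print(as_halfwords.hex(" "))  # bf f8 00 40
print(as_word.hex(" "))       # 00 40 bf f8
```

The first layout keeps the leading halfword at the lower address, which fits the idea of fetching one halfword at a time.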

The one "32-bit" encoding in the original Thumb instruction set, bl, had the operation of each halfword defined separately as "prefix" and "suffix" instructions. The neat trick is that whilst executing the first part, the return address is taken directly from the PC, since at that stage it's pointing to the instruction after the second part. By the time Thumb-2 technology came along and made 32-bit encodings a formal thing (including bl retroactively), the PC offset had borne no relation to the actual microarchitecture for several generations*, so making its defined behaviour vary with the instruction stream would have had virtually no benefit and would have introduced massive compatibility problems.

To further complicate matters, when the PC is used as a base register for addressing operations (e.g. adr/ldr/str/etc.) it is always the word-aligned value that is used, even in Thumb state. So, whilst executing a load instruction at 0x159a, the PC register will read as 0x159e (0x159a + 4), but the base address of ldr...[pc] is Align(0x159e, 4), i.e. 0x159c. Since PC-relative addressing is normally written by specifying a label rather than calculating offsets manually, this detail can be easy to miss.
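
As a sanity check, the rule can be replayed against the listing from the question. The following is only an arithmetic sketch in Python, not an emulator: the function names are hypothetical, the memory image is rebuilt from the objdump halfwords, and it assumes unaligned halfword loads are permitted (which the register dump suggests was the case on that device).

```python
def thumb_pc_read(insn_addr: int) -> int:
    """Value read from PC by a Thumb instruction: its own address + 4."""
    return insn_addr + 4

def literal_address(insn_addr: int, offset: int) -> int:
    """Address used by a PC-relative load: Align(PC, 4) + offset."""
    return (thumb_pc_read(insn_addr) & ~3) + offset

# Halfwords from the objdump listing in the question, starting at 0x1598.
BASE = 0x1598
HALFWORDS = [0x467B, 0xF8BF, 0x4000, 0x46F9, 0xF83F, 0x5001,
             0xF83F, 0x6002, 0xF83F, 0x7003, 0xF83F, 0x8004]
MEMORY = b"".join(hw.to_bytes(2, "little") for hw in HALFWORDS)

def load_halfword(addr: int) -> int:
    """Little-endian halfword load from the reconstructed memory image."""
    off = addr - BASE
    return int.from_bytes(MEMORY[off:off + 2], "little")

# (instruction address, immediate offset) for each ldrh.w in the listing.
LOADS = [(0x159A, 0), (0x15A0, -1), (0x15A4, -2), (0x15A8, -3), (0x15AC, -4)]

results = []
for insn_addr, offset in LOADS:
    addr = literal_address(insn_addr, offset)
    results.append((addr, load_halfword(addr)))
    print(f"{insn_addr:#06x}: loads from {addr:#06x} -> {results[-1][1]:#06x}")
```

The computed addresses come out as 159c/15a3/15a6/15a9/15ac, the same as the objdump comments, and the loaded values 0x4000, 0x3f50, 0x6002, 0x03f8 and 0xf83f match r4 to r8 in the crash dump. Likewise `thumb_pc_read(0x1598)` gives 0x159c, consistent with the low bits of r3 (aca7c59c).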

* In terms of ARM's own designs, ARM7 was the last microarchitecture based around the original 3-stage pipeline.
