Strange content when debugging some Armv5 assembly code

Question

I am trying to learn ARM by debugging a simple piece of ARM assembly:

    .global start, stack_top
start:
    ldr sp, =stack_top
    bl main
    b .

The linker script looks like this:

ENTRY(start)
SECTIONS
{
    . = 0x10000;
    .text : {*(.text)}
    .data : {*(.data)}
    .bss : {*(.bss)}
    . = ALIGN(8);
    . = . +0x1000;
    stack_top = .;
}

I run this on the QEMU ARM emulator. The binary is loaded at 0x10000, so I put a breakpoint there. When the breakpoint is hit, I check the pc register; its value is 0x10000. Then I disassemble the instruction at 0x10000.

I see a strange comment: ; 0x1000c <start+12>. What does it mean? Where does it come from?

Breakpoint 1, 0x00010000 in start ()
(gdb) i r pc
pc             0x10000  0x10000 <start>
(gdb) x /i 0x10000
=> 0x10000 <start>:     ldr     sp, [pc, #4]    ; 0x1000c <start+12> <========= HERE
(gdb) x /i 0x10004
   0x10004 <start+4>:   bl      0x102b0 <main>

Then I continued debugging. I wanted to see the effect of the ldr sp, [pc, #4] at 0x10000 on the sp register, so I stepped as shown below.

From the disassembly above, I expected the value of sp to be [pc + 4], which should be the content located at 0x10000 + 4 = 0x10004. But sp turns out to be 0x11520.

(gdb) i r sp
sp             0x0      0x0
(gdb) si
0x00010004 in start ()
(gdb) x /i $pc
=> 0x10004 <start+4>:   bl      0x102b0 <main>
(gdb) i r sp
sp             0x11520  0x11520 <=================== HERE
(gdb) x /x &stack_top  
0x11520:        0x00000000

So the value 0x11520 does come from the linker script symbol stack_top. But how is it related to the ldr sp, [pc, #4] instruction at 0x10000?

Many thanks to @old_timer for the detailed answer.

I was reading the book Embedded and Real-Time Operating Systems by K. C. Wang, and I learned about the pipeline explanation from this book. Quoted as below:

So, if the pipeline explanation is no longer relevant today, what makes the pc value two instructions ahead of the currently executing instruction?

I just found the thread below addressing this issue:

Basically, it is just another case of people creating mistakes/flaws/pitfalls for themselves as they advance the technology.

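Back to this question:

  • My assembly uses PC-relative addressing.
  • ARM's pc reads two instructions ahead of the currently executing instruction. (Just deal with it!)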

Recommended answer

    .global start, stack_top
start:
    ldr sp, =stack_top
    bl main
    b .

Assuming ARM mode, you have three instructions there, so the first possible pool where the stack_top value can live is right after the b .

_start: ( 0x00000000 )
0x00000000  ldr sp,=stack_top
0x00000004  bl main
0x00000008  b .
0x0000000c  stack_top

From what you have shown, this is where the assembler allocated that space.

So _start + 12 is the location of the stack_top VALUE. The pseudo-instruction ldr sp,=stack_top gets turned into either a mov or a pc-relative load. The pc is two instructions ahead for historical reasons that have zero relevance today. On some architectures the pc is the address of the current instruction, on others it is the address of the next instruction (variable length or not), and in the case of ARM (aarch32) and Thumb it is "two ahead", which in ARM mode means 8 bytes. So for an instruction at address 0x00000000 to reach 0x0000000C, the pc-relative offset is 0xC - 8 = 4, hence ldr sp,[pc,#4].
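The offset arithmetic above can be checked with a small sketch. It assumes ARM (A32) state, where reading pc yields the instruction address plus 8; the function names are made up for illustration and are not part of any tool.

```python
# Assumption: ARM (A32) state, where reading pc gives instruction address + 8.
ARM_PC_READ_AHEAD = 8


def pc_relative_offset(instr_addr: int, target_addr: int) -> int:
    """Immediate needed so that [pc, #imm] at instr_addr reaches target_addr."""
    return target_addr - (instr_addr + ARM_PC_READ_AHEAD)


def pc_relative_target(instr_addr: int, imm: int) -> int:
    """Address accessed by 'ldr rX, [pc, #imm]' located at instr_addr."""
    return instr_addr + ARM_PC_READ_AHEAD + imm


# The unlinked case from the answer: instruction at 0x0, literal at 0xC.
print(pc_relative_offset(0x0, 0xC))
# The linked case from the question: ldr sp, [pc, #4] at 0x10000.
print(hex(pc_relative_target(0x10000, 4)))
```

The second call reproduces the address GDB prints in the disassembly comment, 0x1000c.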

Now the CONTENTS at that address are, as you asked in the linker script, computed by the linker at link time. You put some code in there and then added some padding. You didn't show the rest of your code, which would have made this a complete example, but either way, from your post, the linker ended up computing 0x11520.
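As a rough sketch of what the linker does with the last three lines of the script, the location counter is rounded up to a multiple of 8 and then advanced by 0x1000. The end-of-.bss address used below is hypothetical, since the post does not show it; it was chosen only because it is consistent with the observed 0x11520.

```python
def align_up(addr: int, alignment: int) -> int:
    """Round addr up to the next multiple of alignment (a power of two)."""
    return (addr + alignment - 1) & ~(alignment - 1)


def stack_top_address(end_of_bss: int, stack_size: int = 0x1000) -> int:
    """Mirror '. = ALIGN(8); . = . + 0x1000; stack_top = .;' from the script."""
    return align_up(end_of_bss, 8) + stack_size


# 0x1051c is a hypothetical end of .bss, not a value taken from the post.
print(hex(stack_top_address(0x1051C)))
```

Any end-of-.bss address in the range 0x10519 to 0x10520 would give the same result, so the real value cannot be recovered from stack_top alone.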

So, reverse engineering your question and comments, we see that the binary (once linked) starts with:

_start: ( 0x00010000 )
0x00010000  ldr sp,[pc, #4]
0x00010004  bl main
0x00010008  b .
0x0001000c  0x11520

This is ARM mode, so the first instruction loads the value 0x11520 into the stack pointer, as you asked. Nothing strange or wrong here.

The 0x1000c <start+12> annotation simply states that address 0x1000c is at an offset of 12 from the nearest preceding label, start (written _start in the listings here). Sometimes that is useful information.
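The annotation can be imitated with a nearest-preceding-symbol lookup. This is only a sketch of the idea, not how GDB is implemented (a real debugger also considers symbol sizes and sections). The symbol addresses come from the GDB session in the question, and the function name is hypothetical.

```python
import bisect

# Symbol addresses as seen in the question's GDB session.
SYMBOLS = sorted([(0x10000, "start"), (0x102B0, "main"), (0x11520, "stack_top")])


def annotate(addr: int) -> str:
    """Return '<symbol+offset>' for the closest symbol at or below addr."""
    addrs = [a for a, _ in SYMBOLS]
    i = bisect.bisect_right(addrs, addr) - 1
    if i < 0:
        return ""  # no symbol at or below this address
    base, name = SYMBOLS[i]
    offset = addr - base
    return f"<{name}+{offset}>" if offset else f"<{name}>"


print(hex(0x1000C), annotate(0x1000C))
print(hex(0x102B0), annotate(0x102B0))
```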

Because you use the pseudo-instruction without defining a literal pool, the assembler has to find a home for the value on its own. Suppose you added a nop or some other code:

    .global start, stack_top
start:
    ldr sp, =stack_top
    bl main
    nop
    b .

Then it is likely the assembler would now put the value at pc + 8, which after linking would be 0x10010. If nothing else changes, the stack pointer MIGHT end up with the same value or one that is 4 (or more) bytes further along, depending on the alignment and padding applied by the tools along the way.
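A minimal sketch of that layout, assuming ARM mode (4-byte instructions), that the ldr is the first instruction, and that the assembler places the literal directly after the last instruction. The function name is invented for illustration; a real assembler may place the pool elsewhere.

```python
def literal_pool_layout(base: int, n_instructions: int) -> tuple[int, int]:
    """Return (pool_address, ldr_immediate) for an ldr at 'base'.

    Assumes 4-byte ARM instructions and a literal placed immediately
    after the last of the n_instructions.
    """
    pool_addr = base + 4 * n_instructions
    imm = pool_addr - (base + 8)  # pc reads as ldr address + 8 in ARM state
    return pool_addr, imm


# Original program: ldr, bl, b .
print([hex(v) for v in literal_pool_layout(0x10000, 3)])
# With the extra nop: ldr, bl, nop, b .
print([hex(v) for v in literal_pool_layout(0x10000, 4)])
```

Under these assumptions the extra nop moves the literal from 0x1000c to 0x10010 and the encoded offset from 4 to 8.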

The point is that the pipe no longer works that way, if it ever did in real products, so don't think of this as a pipeline thing any more than the branch shadow (delay slot) instructions in MIPS mean anything relevant today (when enabled). For every instruction set that has pc-relative addressing you need to know the rule: is it the address of the instruction (less common), the address of the next instruction (most common), two ahead, or something else? Likewise, for a while folks hardcoded "8 bytes ahead" in their brains rather than "two ahead", and had issues when they switched to Thumb. Now of course there are the Thumb-2 extensions, which mix 16-bit and 32-bit instructions and so muddle the "two ahead" way of thinking. I don't know the aarch64 rule offhand; I would hope it is the next instruction and not infected with the two-ahead rule from aarch32. But as with ARM (A32) and Thumb (T16 and T32), this information is easy to find in the ARM documentation (which, as a rule for any architecture, you should have handy when writing or analyzing machine/assembly language).
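To make the ARM versus Thumb difference concrete, here is a small sketch of the rule as I understand it from the ARM architecture documentation: in ARM state pc reads as the instruction address plus 8, in Thumb state as the instruction address plus 4, and literal loads use that value rounded down to a multiple of 4 as their base. The function names are hypothetical, and the manual for your specific core remains the authority.

```python
def pc_read_value(instr_addr: int, state: str) -> int:
    """Value observed when an instruction at instr_addr reads pc."""
    if state == "arm":
        return instr_addr + 8  # two 4-byte instructions ahead
    if state == "thumb":
        return instr_addr + 4  # two 2-byte instructions ahead
    raise ValueError(f"unknown state: {state}")


def literal_base(instr_addr: int, state: str) -> int:
    """Base address for a pc-relative literal load: pc aligned down to 4."""
    return pc_read_value(instr_addr, state) & ~3


print(hex(literal_base(0x10000, "arm")))
print(hex(literal_base(0x10002, "thumb")))
```

Hardcoding "8 bytes ahead" gives the wrong base in Thumb state, which is the pitfall described above.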
