How can this assembly program print "Hello World"?

Question

.data
.global _start
_start:
    mov r7, #4
    mov r0, #1
    mov r2, #12
    ldr r4, =#0x6c6c6548
    str r4, [pc, #4]
    mov r1, pc
    add pc, pc, #8
    strbt r6, [ip], -r8, asr #10
    svcvs 0x0057206f
    beq 0x193b248
    swi #0
    mov r7, #1
    mov r0, #0
    swi #0

I stumbled upon this little ARM assembly program, which prints "Hello World". Save it as test.s to test:

$ as -o test.o test.s
$ ld -o test test.o
$ ./test
Hello World
$

How does this work? I cannot see a single string in the entire program. It also doesn't read the string from anywhere else; it looks like this code is all that's needed to print the string. Where does the string come from?

Recommended answer

Here's an annotation of the interesting bit:

  mov r7, #4
  mov r0, #1
  mov r2, #12
  ldr r4, =#0x6c6c6548
A str r4, [pc, #4]
B mov r1, pc
C add pc, pc, #8
D strbt r6, [ip], -r8, asr #10
E svcvs 0x0057206f
F beq 0x193b248
G swi #0
  mov r7, #1
  mov r0, #0
  swi #0

The store at A is targeting location D - as pointed out in the comments, that word (in little endian order) creates the 4 ASCII bytes "Hell" - that gets stored over the top of the nonsensical instruction there (the machine code of which is 0xe66c6548 - close, but not good enough). That's presumably also why this is in the data section, to ensure that it is writeable*. Meanwhile, the machine code of the instruction at E is 0x6f57206f, which makes "o Wo". Instruction F is particularly tricksy, as that address must result in the relative branch offset, once encoded, looking like "rld"** - the beq encoding is 0x0annnnnn, where nnnnnn is the top 24 bits of a 26-bit two's complement offset value - note also that the condition code and opcode in the top byte there make up the final newline.
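
As a quick check of the byte layout described above, here is a minimal Python sketch that packs the three words little-endian and looks at the resulting bytes. The names are illustrative only. The word for F is not taken from a disassembly; it is reconstructed from the description above (top byte 0x0a, followed by an offset field that spells "rld").

```python
import struct

# Words as they sit in memory at D, E and F once the store at A has run.
WORD_D = 0x6C6C6548  # constant loaded into r4 and stored over D by instruction A
WORD_E = 0x6F57206F  # svcvs 0x0057206f: cond 0x6, opcode bits 0xf, imm24 0x57206f
WORD_F = 0x0A646C72  # beq: cond 0x0, opcode bits 0xa, imm24 0x646c72 (reconstructed)


def words_to_bytes(words):
    """Pack 32-bit words in little-endian order, as a little-endian ARM core stores them."""
    return b"".join(struct.pack("<I", w) for w in words)


# The 12 bytes that r1 points at when the write syscall runs.
message = words_to_bytes([WORD_D, WORD_E, WORD_F])

# What the assembler originally emits for D, before A overwrites it.
original_d = words_to_bytes([0xE66C6548])

if __name__ == "__main__":
    print(message)
    print(original_d)
```

The last byte of the original D word is 0xe6 rather than 0x6c ('l'), which is why the program has to patch that word at run time.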

Instruction B puts the address of D into r1, i.e. a pointer to the start of the string. r0 and r2 are obviously the other necessary syscall arguments, and r7 is the syscall number itself (I'm too lazy to look it up, but I assume the 1 in r0 is for stdout, the 12 in r2 is the number of characters, and syscall 4 is write).
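
The reason both A and B end up referring to D is that, in ARM state, reading pc yields the address of the current instruction plus 8. A small sketch of that arithmetic follows; the base address 0x1000 is an arbitrary, hypothetical value, since only the distances between the labels matter.

```python
PC_READ_AHEAD = 8  # in ARM state, pc reads as the current instruction's address + 8
INSN_SIZE = 4      # every ARM-state instruction is 4 bytes


def label_addresses(addr_a):
    """Addresses of the consecutive instructions A..G, given an address for A."""
    return {name: addr_a + i * INSN_SIZE for i, name in enumerate("ABCDEFG")}


def pc_value(addr):
    """Value an instruction at addr sees when it reads pc."""
    return addr + PC_READ_AHEAD


def store_target(addr, offset):
    """Effective address of `str rX, [pc, #offset]` located at addr."""
    return pc_value(addr) + offset


addrs = label_addresses(0x1000)  # 0x1000 is a made-up address for illustration

if __name__ == "__main__":
    print(hex(store_target(addrs["A"], 4)), hex(pc_value(addrs["B"])), hex(addrs["D"]))
```

Under these assumptions the store at A and the value copied into r1 at B both come out as the address of D.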

Finally, instruction C is a jump to the syscall at G, so none of the "instructions" at D, E, and F are actually executed (the rest after that is just making an exit syscall).
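
The jump at C relies on the same pc read-ahead: `add pc, pc, #8` at C computes (C + 8) + 8, which is four instructions further on. A short sketch, again with a hypothetical address for C:

```python
def add_pc_target(addr, imm):
    """Destination of `add pc, pc, #imm` located at addr (ARM state, pc reads as addr + 8)."""
    return addr + 8 + imm


addr_c = 0x1008  # hypothetical address of instruction C
addr_d, addr_e, addr_f = (addr_c + 4 * k for k in (1, 2, 3))  # the three data words
addr_g = addr_c + 16  # the first `swi #0`

target = add_pc_target(addr_c, 8)

if __name__ == "__main__":
    print(hex(target))
```

The destination lands on G and skips the 12 bytes at D, E and F, so those words are only ever read as data.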

Pretty neat, for trick code.

* And presumably it also relies on some backwards-compatibility behaviour in the loader to make the data section executable.

** Incidentally, it does not come out that way with my binutils 2.26 linker, possibly because the default section alignment has changed in recent versions.
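
To illustrate why the linker's layout matters for instruction F, here is a rough Python sketch of the ARM-state `beq` encoding. The address 0x20078 for F is not stated anywhere in the question or answer; it is simply the value that makes the offset field spell "rld" for the target 0x193b248, so treat it as an inferred assumption. The second address is a made-up alternative layout.

```python
import struct


def encode_beq(addr, target):
    """Encode an ARM-state `beq target` located at addr (cond 0x0, opcode bits 0xa)."""
    offset = target - (addr + 8)      # branch offsets are relative to pc = addr + 8
    if offset % 4:
        raise ValueError("branch target must be word-aligned relative to pc")
    imm24 = (offset >> 2) & 0xFFFFFF  # top 24 bits of the 26-bit signed offset
    return (0x0 << 28) | (0xA << 24) | imm24


TARGET = 0x193B248           # branch target written in the source
ADDR_F_INFERRED = 0x20078    # assumption: address of F that yields "rld\n"
ADDR_F_OTHER = 0x10078       # hypothetical layout with F placed elsewhere

good = struct.pack("<I", encode_beq(ADDR_F_INFERRED, TARGET))
other = struct.pack("<I", encode_beq(ADDR_F_OTHER, TARGET))

if __name__ == "__main__":
    print(good, other)
```

With F at the inferred address the four bytes are "rld" plus a newline; moving F changes the encoded offset, and the tail of the string changes with it.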
