How can this assembly program print "Hello World"?

Question

.data
.global _start
_start:
    mov r7, #4
    mov r0, #1
    mov r2, #12
    ldr r4, =#0x6c6c6548
    str r4, [pc, #4]
    mov r1, pc
    add pc, pc, #8
    strbt r6, [ip], -r8, asr #10
    svcvs 0x0057206f
    beq 0x193b248
    swi #0
    mov r7, #1
    mov r0, #0
    swi #0

I stumbled upon this little ARM assembly program, which prints "Hello World". Save it as test.s to test:

$ as -o test.o test.s
$ ld -o test test.o
$ ./test
Hello World
$

How does this work? I cannot see a single string in the entire program. It also doesn't read the string from anywhere else; it looks like this code is all that's needed to print the string. Where does the string come from?

Recommended answer

Here's an annotation of the interesting bit:

  mov r7, #4
  mov r0, #1
  mov r2, #12
  ldr r4, =#0x6c6c6548
A str r4, [pc, #4]
B mov r1, pc
C add pc, pc, #8
D strbt r6, [ip], -r8, asr #10
E svcvs 0x0057206f
F beq 0x193b248
G swi #0
  mov r7, #1
  mov r0, #0
  swi #0

The store at A targets location D. As pointed out in the comments, that word (in little-endian order) makes up the 4 ASCII bytes "Hell", and it gets stored over the top of the nonsensical instruction there (whose machine code is 0xe66c6548: close, but not good enough). That's presumably also why this is in the data section, to ensure that it is writeable*. Meanwhile, the machine code of the instruction at E is 0x6f57206f, which makes "o Wo". Instruction F is particularly tricksy, as that address must result in a relative branch offset that, once encoded, looks like "rld"**. The beq encoding is 0x0annnnnn, where nnnnnn is the top 24 bits of a 26-bit two's complement offset value; note also that the condition code and opcode in the top byte there make up the final newline.
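To make the byte layout concrete, here is a small Python sketch that lays the three words at D, E, and F out in little-endian order, before and after the store at A. The words for D and E are the ones quoted above; the word for F (0x0a646c72) is an assumption worked out from the "rld" plus newline bytes it has to produce, not a value taken from the original post.

```python
import struct

# Words at D and E are quoted in the answer. WORD_F is an assumption:
# the 0x0a opcode byte plus the 24-bit field that spells "rld".
WORD_D_ASSEMBLED = 0xE66C6548  # strbt r6, [ip], -r8, asr #10
WORD_D_PATCHED = 0x6C6C6548    # value that instruction A stores over D
WORD_E = 0x6F57206F            # svcvs 0x0057206f
WORD_F = 0x0A646C72            # beq with offset field 0x646c72


def to_memory(*words):
    """Return the bytes of 32-bit words as stored in little-endian memory."""
    return b"".join(struct.pack("<I", word) for word in words)


before = to_memory(WORD_D_ASSEMBLED, WORD_E, WORD_F)
after = to_memory(WORD_D_PATCHED, WORD_E, WORD_F)

print(before)  # fourth byte is 0xe6 instead of "l"
print(after)   # the full 12-byte string
```

Only the fourth byte differs between the two layouts, which is why the assembled instruction at D is "close, but not good enough" and has to be patched at run time.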

Instruction B puts the address of D into r1, i.e. a pointer to the start of the string. r0 and r2 are the other syscall arguments, and r7 is the syscall number itself: the 1 in r0 is the stdout file descriptor, the 12 in r2 is the number of bytes to write, and syscall 4 is write on ARM Linux (EABI).
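As a rough analogy for the register setup, the sketch below performs the same write from Python with os.write. The function name write_syscall and the MESSAGE constant are hypothetical, introduced only for illustration; the program itself uses swi #0 directly rather than any library wrapper.

```python
import os

STDOUT_FD = 1                  # value placed in r0
MESSAGE = b"Hello World\n"     # bytes that r1 points at once D is patched
LENGTH = 12                    # value placed in r2


def write_syscall(fd, buffer, count):
    """Hypothetical stand-in for syscall 4: write count bytes of buffer to fd."""
    return os.write(fd, buffer[:count])


if __name__ == "__main__":
    write_syscall(STDOUT_FD, MESSAGE, LENGTH)
```

The count of 12 covers the eleven visible characters plus the trailing newline.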

Finally, instruction C is a jump to the syscall at G, so none of the "instructions" at D, E, and F are actually executed (the rest after that is just making an exit syscall).
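The targets of A, B, and C all follow from one rule: in ARM state, reading pc yields the address of the current instruction plus 8. The sketch below checks the arithmetic using offsets from _start, assuming each source line assembles to a single 4-byte instruction so that A sits at offset 0x10. The helper names are hypothetical.

```python
PC_READ_AHEAD = 8   # in ARM state, pc reads as current instruction + 8
INSN_SIZE = 4

# Offsets from _start, assuming four 4-byte instructions precede A.
offset_of = {label: 0x10 + INSN_SIZE * i for i, label in enumerate("ABCDEFG")}
label_at = {offset: label for label, offset in offset_of.items()}


def pc_seen_by(label):
    """Value of pc as read by the instruction at the given label."""
    return offset_of[label] + PC_READ_AHEAD


store_target = pc_seen_by("A") + 4    # str r4, [pc, #4]
string_pointer = pc_seen_by("B")      # mov r1, pc
jump_target = pc_seen_by("C") + 8     # add pc, pc, #8

print(label_at[store_target], label_at[string_pointer], label_at[jump_target])
# prints: D D G
```

So the store lands on D, r1 points at D, and execution resumes at G, skipping the three words that hold the string.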

Pretty neat, for trick code.

* And presumably it also relies on some backwards-compatible behaviour in the loader to keep the data segment executable.

** Which doesn't happen to work out with my binutils 2.26 linker, possibly because the default section alignment has changed in more recent versions.
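The dependence on the link address can be sketched in Python. The branch offset is relative to the address of F plus 8, so the encoded word changes whenever the linker places the section somewhere else. The address 0x20078 below is an assumption, inferred by working backwards from the target 0x193b248 and the required "rld" bytes; it does not appear in the original post, and encode_beq is a hypothetical helper.

```python
import struct


def encode_beq(insn_addr, target):
    """Encode an ARM-state beq as 0x0a in the top byte plus a 24-bit word offset."""
    offset = target - (insn_addr + 8)     # branch offsets are relative to pc
    imm24 = (offset >> 2) & 0xFFFFFF      # top 24 bits of the 26-bit offset
    return 0x0A000000 | imm24


TARGET = 0x193B248

# Assumed address of instruction F (inferred, not from the post).
working = struct.pack("<I", encode_beq(0x20078, TARGET))
# The same instruction placed 0x10000 lower no longer spells the text.
shifted = struct.pack("<I", encode_beq(0x10078, TARGET))

print(working)
print(shifted)
```

Under these assumptions only the first placement yields the bytes "rld" followed by a newline, which is consistent with the trick breaking when the default section alignment changes.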
