Why does this simple assembly program work in AT&T syntax but not Intel syntax?

Question

What's wrong with this code (running on x86_64 Linux)?

.intel_syntax
.text
.globl _start

_start:
    mov rax, 1
    mov rdi, 1
    mov rsi, msg
    mov rdx, 14
    syscall

    mov rax, 60
    mov rdi, 0
    syscall

.data
msg:
    .ascii "Hello, world!\n"

Running it:

$ clang -o hello_intel hello_intel.s  -nostdlib  && ./hello_intel

No output. Let's strace it:

$ strace ./hello_intel
execve("./hello_intel", ["./hello_intel"], [/* 96 vars */]) = 0
write(1, 0x77202c6f6c6c6548, 14)        = -1 EFAULT (Bad address)
exit(0)                                 = ?
+++ exited with 0 +++

It's dereferencing msg instead of using its location. Why?
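One quick way to confirm the dereference is to decode the bogus pointer that strace printed. The sketch below is only an illustration in Python (not part of the original question): it assumes the value was loaded from memory on a little-endian machine and reinterprets it as 8 raw bytes.

```python
# The "address" strace reported for the failed write() call.
bad_pointer = 0x77202C6F6C6C6548

# x86-64 is little-endian, so the lowest byte of the register
# is the first byte that was read from memory.
raw = bad_pointer.to_bytes(8, byteorder="little")

print(raw)  # b'Hello, w'
```

The register holds the first eight bytes of the string itself, so rsi was loaded from msg rather than with the address of msg.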

If I use AT&T syntax instead...

.text
.globl _start

_start:
    mov $1, %rax
    mov $1, %rdi
    mov $msg, %rsi
    mov $14, %rdx
    syscall

    mov $60, %rax
    mov $0, %rdi
    syscall

.data
msg:
    .ascii "Hello, world!\n"

...it works fine:

$ clang -o hello_att hello_att.s  -nostdlib && ./hello_att
Hello, world!

What's the difference between these two?

Here's the working one:

$ objdump -d hello_att -s -M intel

hello_att:     file format elf64-x86-64

Contents of section .text:
 4000e8 48c7c001 00000048 c7c70100 000048c7  H......H......H.
 4000f8 c6160160 0048c7c2 0e000000 0f0548c7  ...`.H........H.
 400108 c03c0000 0048c7c7 00000000 0f05      .<...H........  
Contents of section .data:
 600116 48656c6c 6f2c2077 6f726c64 210a      Hello, world!.  

Disassembly of section .text:

00000000004000e8 <_start>:
  4000e8:   48 c7 c0 01 00 00 00    mov    rax,0x1
  4000ef:   48 c7 c7 01 00 00 00    mov    rdi,0x1
  4000f6:   48 c7 c6 16 01 60 00    mov    rsi,0x600116
  4000fd:   48 c7 c2 0e 00 00 00    mov    rdx,0xe
  400104:   0f 05                   syscall 
  400106:   48 c7 c0 3c 00 00 00    mov    rax,0x3c
  40010d:   48 c7 c7 00 00 00 00    mov    rdi,0x0
  400114:   0f 05                   syscall 

Here's the broken one:

$ objdump -d hello_intel -s -M intel

hello_intel:     file format elf64-x86-64

Contents of section .text:
 4000e8 48c7c001 00000048 c7c70100 0000488b  H......H......H.
 4000f8 34251701 600048c7 c20e0000 000f0548  4%..`.H........H
 400108 c7c03c00 000048c7 c7000000 000f05    ..<...H........ 
Contents of section .data:
 600117 48656c6c 6f2c2077 6f726c64 210a      Hello, world!.  

Disassembly of section .text:

00000000004000e8 <_start>:
  4000e8:   48 c7 c0 01 00 00 00    mov    rax,0x1
  4000ef:   48 c7 c7 01 00 00 00    mov    rdi,0x1
  4000f6:   48 8b 34 25 17 01 60    mov    rsi,QWORD PTR ds:0x600117
  4000fd:   00 
  4000fe:   48 c7 c2 0e 00 00 00    mov    rdx,0xe
  400105:   0f 05                   syscall 
  400107:   48 c7 c0 3c 00 00 00    mov    rax,0x3c
  40010e:   48 c7 c7 00 00 00 00    mov    rdi,0x0
  400115:   0f 05                   syscall 

So the important difference here is 0x600116 vs QWORD PTR ds:0x600117, which indeed looks like the difference between a pointer and dereferencing a pointer.
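The difference is also visible in the raw instruction bytes. The following Python sketch is an illustration added here, not part of the original question; the helper name split_fields is made up for this example. It splits the ModRM and SIB bytes copied from the two objdump listings into their bit fields.

```python
def split_fields(byte):
    """Split a ModRM or SIB byte into its 2-bit, 3-bit and 3-bit fields."""
    return (byte >> 6) & 0b11, (byte >> 3) & 0b111, byte & 0b111

# Bytes of the instruction at 0x4000f6, copied from the two listings.
att_insn = bytes.fromhex("48 c7 c6 16 01 60 00")       # working build
intel_insn = bytes.fromhex("48 8b 34 25 17 01 60 00")  # broken build

# Working build: opcode C7 /0 is "mov r/m64, imm32".
# ModRM mod=3 means the destination is a register (rm=6 is rsi),
# and the trailing 4 bytes are an immediate value.
att_modrm = split_fields(att_insn[2])
att_imm = int.from_bytes(att_insn[3:7], "little")

# Broken build: opcode 8B is "mov r64, r/m64", a load.
# ModRM mod=0, rm=4 means a SIB byte follows; SIB base=5 with mod=0
# and index=4 means "no base, no index, just a 32-bit absolute address".
intel_modrm = split_fields(intel_insn[2])
intel_sib = split_fields(intel_insn[3])
intel_disp = int.from_bytes(intel_insn[4:8], "little")

print(att_modrm, hex(att_imm))
print(intel_modrm, intel_sib, hex(intel_disp))
print(len(intel_insn) - len(att_insn))
```

The load form is one byte longer than the immediate form, which is consistent with msg sitting at 0x600117 in the broken binary and at 0x600116 in the working one.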

So how do you not dereference the pointer in the Intel syntax code?

Recommended answer

Here's code that works in GCC:

.intel_syntax noprefix
.text
.globl _start

_start:
    mov rax, 1
    mov rdi, 1
    mov rsi, offset msg
    mov rdx, 14
    syscall

    mov rax, 60
    mov rdi, 0
    syscall

.data
msg:
    .ascii "Hello, world!\n"

Both the noprefix and the offset had to be added. Sadly this does not work with clang:

hello_intel.s:8:24: error: unknown token in expression
    mov rsi, offset msg
                       ^

However, you can work around the issue by using lea instead of mov:

lea rsi, [msg+rip]

This works in both clang and gcc, and it also works in position-independent code. It's the standard way to put static addresses in registers.

mov esi, imm32 is a minor optimization over RIP-relative LEA for position-dependent code, but mov rsi, sign_extended_imm32 is the same code size as LEA. The mov-immediate form is apparently not possible in Clang's .intel_syntax, even though clang emits offset msg when compiling. See the related question: How to get `mov rdx, symbol` to move symbol value and not value at symbol's address in clang intel-syntax?
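The code-size comparison can be checked by building the three encodings by hand. This is a rough Python sketch added for illustration; the function names are invented here, and the addresses are simply the ones from the working objdump listing (instruction at 0x4000f6, msg at 0x600116).

```python
import struct

def mov_esi_imm32(addr):
    # Opcode B8+r with r=6 (esi), followed by imm32.
    # Writing esi zero-extends into rsi.
    return bytes([0xBE]) + struct.pack("<I", addr)

def mov_rsi_simm32(addr):
    # REX.W + C7 /0 with ModRM 0xC6 (register rsi), imm32 sign-extended.
    return bytes([0x48, 0xC7, 0xC6]) + struct.pack("<i", addr)

def lea_rsi_rip(insn_addr, target):
    # REX.W + 8D /r with ModRM 0x35 (mod=00, reg=rsi, rm=101: RIP + disp32).
    # The displacement is relative to the end of this 7-byte instruction.
    length = 7
    disp = target - (insn_addr + length)
    return bytes([0x48, 0x8D, 0x35]) + struct.pack("<i", disp)

insn_addr, msg_addr = 0x4000F6, 0x600116

for name, code in [
    ("mov esi, imm32", mov_esi_imm32(msg_addr)),
    ("mov rsi, simm32", mov_rsi_simm32(msg_addr)),
    ("lea rsi, [rip+disp32]", lea_rsi_rip(insn_addr, msg_addr)),
]:
    print(f"{name:24} {len(code)} bytes  {code.hex(' ')}")
```

Only the 32-bit mov saves bytes, and it only works when the address fits in 32 bits, which is the position-dependent case.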
