Assembling 32-bit binaries on a 64-bit system (GNU toolchain)


Question


I wrote some assembly code that assembles without errors:

as power.s -o power.o

Linking the power.o object file also succeeds:

ld power.o -o power

To run it on a 64-bit OS (Ubuntu 14.04), I added .code32 at the beginning of the power.s file, but I still get this error when I run it:

Segmentation fault (core dumped)

power.s:

.code32
.section .data
.section .text
.global _start
_start:
pushl $3
pushl $2 
call power 
addl $8, %esp
pushl %eax 

pushl $2
pushl $5
call power
addl $8, %esp

popl %ebx
addl %eax, %ebx

movl $1, %eax
int $0x80



.type power, @function
power:
pushl %ebp  
movl %esp, %ebp 
subl $4, %esp 
movl 8(%ebp), %ebx 
movl 12(%ebp), %ecx 
movl %ebx, -4(%ebp) 

power_loop_start:
cmpl $1, %ecx 
je end_power
movl -4(%ebp), %eax
imull %ebx, %eax
movl %eax, -4(%ebp)

decl %ecx
jmp power_loop_start

end_power:
movl -4(%ebp), %eax 
movl %ebp, %esp
popl %ebp
ret
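
For reference, the program above is meant to compute 2^3 + 5^2 and return the sum as its exit status. The following is only a rough shell rendering of the same loop, not part of the original question; the function name and layout are illustrative, and like the assembly it assumes an exponent of at least 1.

```shell
#!/bin/sh
# power BASE EXP: same loop as the assembly routine; EXP must be >= 1.
power() {
    base=$1
    count=$2
    result=$base                   # movl %ebx, -4(%ebp)
    while [ "$count" -ne 1 ]; do   # cmpl $1, %ecx / je end_power
        result=$((result * base))  # imull %ebx, %eax
        count=$((count - 1))       # decl %ecx
    done
    echo "$result"
}

# _start pushes the exponent first and the base last,
# so the two calls are power(2, 3) and power(5, 2).
sum=$(( $(power 2 3) + $(power 5 2) ))
echo "$sum"    # 33; the asm version passes this value to exit in %ebx
```

Once the binary is built correctly as 32-bit, running it and then `echo $?` should show the same value.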

Solution

.code32 does not change the output file format. It just lets you choose what mode to generate machine code for. It's up to you to not try to run 32bit code in 64bit mode.
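
One way to see this for yourself is to look at the class byte in the ELF header of the object file or executable. The sketch below is not from the original answer: `elf_class` is a hypothetical helper name, and it relies only on the ELF layout (magic bytes at offset 0, `EI_CLASS` at offset 4, where 1 means 32-bit and 2 means 64-bit).

```shell
#!/bin/sh
# elf_class FILE: print 32 or 64 according to the EI_CLASS byte of an ELF file.
elf_class() {
    # The first four bytes must be the ELF magic: 0x7f 'E' 'L' 'F'
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    if [ "$magic" != "7f454c46" ]; then
        echo "not-elf"
        return 1
    fi
    # Byte at offset 4: 1 = ELFCLASS32, 2 = ELFCLASS64
    class=$(od -An -tu1 -j4 -N1 "$1" | tr -d ' \n')
    case "$class" in
        1) echo 32 ;;
        2) echo 64 ;;
        *) echo "unknown"; return 1 ;;
    esac
}
```

With the commands from the question, `elf_class power.o` would be expected to report 64 even though the file starts with .code32, which is why the program crashes: 32-bit machine code ends up being executed in 64-bit mode.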

The .S extension is for asm that should be run through the C preprocessor before the assembler. It's also a useful way to distinguish hand-written asm from .s compiler output (from gcc -O3 -S).

To build 32-bit binaries, use one of these commands:

gcc -g foo.S -o foo -m32 -nostdlib -static  # static binary with absolutely no libraries or startup code
                       # -nostdlib by itself makes static executables on Linux, but not OS X.

gcc -g foo.S -o foo -m32                  # dynamic binary including the startup boilerplate code.  Use with code that defines a main() but not a _start

Documentation for -nostdlib, -nostartfiles, and -static.


Using libc functions from _start (see the end of this answer for an example)

Some functions, like malloc(3), or stdio functions including printf(3), depend on some global data being initialized (e.g. FILE *stdout and the object it actually points to).

gcc -nostartfiles leaves out the CRT _start boilerplate code, but still links libc (dynamically, by default). On Linux, shared libraries can have initializer sections that are run by the dynamic linker when it loads them, before jumping to your _start entry point. So gcc -nostartfiles hello.S still lets you call printf. For a dynamic executable, the kernel runs /lib/ld-linux.so.2 on it instead of running it directly (use readelf -a to see the "ELF interpreter" string in your binary). When your _start eventually runs, not all the registers will be zeroed, because the dynamic linker ran code in your process.
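
As a small sketch of the readelf check mentioned above: the program-header listing from `readelf -l` includes a "Requesting program interpreter" line for dynamic executables. The helper name `elf_interp` is made up for this example; it only filters readelf output on stdin and assumes binutils' usual output format.

```shell
#!/bin/sh
# elf_interp: read `readelf -l` output on stdin and print the ELF interpreter path.
# Prints nothing for a static executable, which has no PT_INTERP entry.
elf_interp() {
    sed -n 's/.*\[Requesting program interpreter: \(.*\)].*/\1/p'
}

# Typical use (assumes readelf is installed):
#   readelf -l ./a.out | elf_interp
```

For a 32-bit dynamic executable on Linux this should print /lib/ld-linux.so.2, the same path that `file` reports.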

However, gcc -nostartfiles -static hello.S will link, but crash at runtime if you call printf or something without calling glibc's internal init functions. (see Michael Petch's comment).


Of course you can put any combination of .c, .S, and .o files on the same command line to link them all into one executable. If you have any C, don't forget -Og -Wall -Wextra: you don't want to be debugging your asm when the problem was something simple in the C that calls it that the compiler could have warned you about.

Use -v to have gcc show you the commands it runs to assemble and link. To do it "manually":

as foo.S -o foo.o -g --32 &&      # skips the preprocessor
ld -o foo foo.o  -m elf_i386

file foo
foo: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), statically linked, not stripped

gcc -nostdlib -m32 is easier to remember and type than the two different options for as and ld (--32 and -m elf_i386). Also, it works on all platforms, including ones where executable format isn't ELF. (But Linux examples won't work on OS X, because the system call numbers are different, or on Windows because it doesn't even use the int 0x80 ABI.)
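
If you do want to drive as and ld by hand, the pairing of options can be written down once. This is a minimal sketch rather than anything from the original answer: `gas_build_cmds` is a hypothetical helper that only prints the two commands, using the `--32`/`--64` options of GNU as and the `elf_i386`/`elf_x86_64` emulations of GNU ld.

```shell
#!/bin/sh
# gas_build_cmds BITS SRC: print (not run) the as and ld commands
# for a 32- or 64-bit x86 static build with the GNU toolchain.
gas_build_cmds() {
    bits=$1
    src=$2
    base=${src%.*}                 # strip the extension: power.s -> power
    case "$bits" in
        32) asopt=--32; ldopt=elf_i386 ;;
        64) asopt=--64; ldopt=elf_x86_64 ;;
        *)  echo "usage: gas_build_cmds 32|64 file.s" >&2; return 2 ;;
    esac
    echo "as $asopt -g $src -o $base.o"
    echo "ld -m $ldopt -o $base $base.o"
}
```

For example, `gas_build_cmds 32 power.s` prints the two commands needed for the question's file; piping the output to `sh` would run them. Printing instead of executing keeps the helper easy to inspect, at the cost of not handling filenames with spaces.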


NASM/YASM

gcc can't handle NASM syntax. (-masm=intel is more like MASM than NASM syntax, where you need offset symbol to get the address as an immediate). And of course the directives are different (e.g. .globl vs global).

You can build with nasm or yasm, then link the .o with gcc as above, or ld directly.

I use a wrapper script to avoid the repetitive typing of the same filename with three different extensions. (nasm and yasm default to file.asm -> file.o, unlike GNU as's default output of a.out.) Use this with -m32 to assemble and link 32-bit ELF executables. Not all OSes use ELF, so this script is less portable than using gcc -nostdlib -m32 to link would be.

#!/bin/sh
# usage: asm-link [-q] [-m32] foo.asm  [assembler options ...]
# Just use a Makefile for anything non-trivial.  This script is intentionally minimal and doesn't handle multiple source files

verbose=1                       # defaults
fmt=-felf64
#ldopt=-melf_i386

while getopts 'm:vq' opt; do
    case "$opt" in
        m)  if [ "x$OPTARG" = "x32" ]; then
                fmt=-felf32
                ldopt=-melf_i386
            fi
            ;;
        q)  verbose=0 ;;
        v)  verbose=1 ;;
    esac
done
shift "$((OPTIND-1))"   # Shift off the options and optional --

src=$1
base=${src%.*}
shift

[ "$verbose" = 1 ] && set -x    # print commands as they're run, like make

#yasm "$fmt" -Worphan-labels -gdwarf2 "$src" "$@" &&
nasm "$fmt" -Worphan-labels -g -Fdwarf "$src" "$@" &&
    ld $ldopt -o "$base" "$base.o"

# yasm -gdwarf2 includes even .local labels so they show up in objdump output
# nasm defaults to that behaviour of including even .local labels

# nasm defaults to STABS debugging format, but -g is not the default

I prefer yasm for a few reasons, including that it defaults to making long NOPs instead of padding with many single-byte NOPs. Padding with single-byte NOPs makes for messy disassembly output, and is also slower if the NOPs ever run. (In NASM, you have to use the smartalign macro package.)


Example: a program using libc functions from _start

# hello32.S

#include <asm/unistd_32.h>   // syscall numbers.  only #defines, no C declarations left after CPP to cause asm syntax errors

.text
#.global main   # uncomment these to let this code work as _start, or as main called by glibc _start
#main:
#.weak _start

.global _start
_start:
        mov     $__NR_gettimeofday, %eax  # make a syscall that we can see in strace output so we know when we get here
        int     $0x80

        push    %esp
        push    $print_fmt
        call   printf

        #xor    %ebx,%ebx                 # _exit(0)
        #mov    $__NR_exit_group, %eax    # same as glibc's _exit(2) wrapper
        #int    $0x80                     # won't flush the stdio buffer

        movl    $0, (%esp)   # reuse the stack slots we set up for printf, instead of popping
        call    exit         # exit(3) does an fflush and other cleanup

        #add    $8, %esp     # pop the space reserved by the two pushes
        #ret                 # only works in main, not _start

.section .rodata
print_fmt: .asciz "Hello, World!\n%%esp at startup = %#lx\n"


$ gcc -m32 -nostdlib hello32.S
/tmp/ccHNGx24.o: In function `_start':
(.text+0x7): undefined reference to `printf'
...
$ gcc -m32 hello32.S
/tmp/ccQ4SOR8.o: In function `_start':
(.text+0x0): multiple definition of `_start'
...


The static build below links, but fails at run-time because nothing calls the glibc init functions (__libc_init_first, __dl_tls_setup, and __libc_csu_init, in that order, according to Michael Petch's comment). Other libc implementations exist, including musl, which is designed for static linking and works without initialization calls.

$ gcc -m32 -nostartfiles -static hello32.S     # fails at run-time
$ file a.out
a.out: ELF 32-bit LSB executable, Intel 80386, version 1 (GNU/Linux), statically linked, BuildID[sha1]=ef4b74b1c29618d89ad60dbc6f9517d7cdec3236, not stripped
$ strace -s128 ./a.out
execve("./a.out", ["./a.out"], [/* 70 vars */]) = 0
[ Process PID=29681 runs in 32 bit mode. ]
gettimeofday(NULL, NULL)                = 0
--- SIGSEGV {si_signo=SIGSEGV, si_code=SI_KERNEL, si_addr=0} ---
+++ killed by SIGSEGV (core dumped) +++
Segmentation fault (core dumped)

You could also gdb ./a.out, and run b _start, layout reg, run, and see what happens.


$ gcc -m32 -nostartfiles hello32.S             # Correct command line
$ file a.out
a.out: ELF 32-bit LSB executable, Intel 80386, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux.so.2, BuildID[sha1]=7b0a731f9b24a77bee41c13ec562ba2a459d91c7, not stripped

$ ./a.out
Hello, World!
%esp at startup = 0xffdf7460

$ ltrace -s128 ./a.out > /dev/null
printf("Hello, World!\n%%esp at startup = %#lx\n", 0xff937510)      = 43    # note the different address: Address-space layout randomization at work
exit(0 <no return ...>
+++ exited (status 0) +++

$ strace -s128 ./a.out > /dev/null        # redirect stdout so we don't see a mix of normal output and trace output
execve("./a.out", ["./a.out"], [/* 70 vars */]) = 0
[ Process PID=29729 runs in 32 bit mode. ]
brk(0)                                  = 0x834e000
access("/etc/ld.so.nohwcap", F_OK)      = -1 ENOENT (No such file or directory)
....   more syscalls from dynamic linker code
open("/lib/i386-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
mmap2(NULL, 1814236, PROT_READ|PROT_EXEC, MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0xfffffffff7556000    # map the executable text section of the library
... more stuff
# end of dynamic linker's code, finally jumps to our _start

gettimeofday({1461874556, 431117}, NULL) = 0
fstat64(1, {st_mode=S_IFCHR|0666, st_rdev=makedev(1, 3), ...}) = 0  # stdio is figuring out whether stdout is a terminal or not
ioctl(1, SNDCTL_TMR_TIMEBASE or SNDRV_TIMER_IOCTL_NEXT_DEVICE or TCGETS, 0xff938870) = -1 ENOTTY (Inappropriate ioctl for device)
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xfffffffff7743000      # 4k buffer for stdout
write(1, "Hello, World!\n%esp at startup = 0xff938fb0\n", 43) = 43
exit_group(0)                           = ?
+++ exited with 0 +++

If we'd used _exit(0), or made the system call ourselves with int 0x80, the write(2) wouldn't have happened. With stdout redirected to a non-tty, it defaults to full-buffered (not line-buffered), so the write(2) is only triggered by the fflush(3) as part of exit(3). Without redirection, calling printf(3) with a string containing newlines will flush immediately.

Behaving differently depending on whether stdout is a terminal can be desirable, but only if you do it on purpose, not by mistake.
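
The decision stdio makes here can be modelled in a few lines of shell. This is only an illustrative sketch of the rule described above, not glibc's actual code; `stdout_buffering` is a made-up name, and `[ -t FD ]` is the standard shell test for "is this file descriptor a terminal".

```shell
#!/bin/sh
# stdout_buffering FD: report which buffering mode stdio would pick
# for stdout if it were attached to the given file descriptor.
stdout_buffering() {
    if [ -t "$1" ]; then
        echo "line-buffered"      # terminal: flushed at each newline
    else
        echo "fully-buffered"     # pipe or file: flushed when full, on fflush, or at exit
    fi
}
```

Inside a command substitution such as `$(stdout_buffering 1)`, file descriptor 1 is a pipe, so the function reports fully-buffered, which is the same situation as running ./a.out > /dev/null in the strace example.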
