Linking a program using printf with ld?

Question

I'm getting an undefined reference to _printf when building an assembly program that defines its own _start instead of main, using NASM on x86-64 Ubuntu.

Build commands:

   nasm -f elf64 hello.asm
   ld -s -o hello hello.o
   hello.o: In function `_start':
   hello.asm:(.text+0x1a): undefined reference to `_printf'
   MakeFile:4: recipe for target 'compile' failed
   make: *** [compile] Error 1

NASM source:

extern _printf

section .text
    global _start
_start:
    mov rdi, format     ; argument #1
    mov rsi, message    ; argument #2
    mov rax, 0
  call _printf            ; call printf

    mov rax, 0
    ret                 ; return 0

section .data

    message:    db "Hello, world!", 0
    format:   db "%s", 0xa, 0

Hello, World! should be the output

Recommended answer

3 problems:

  • GNU/Linux using ELF object files does not decorate / mangle C names with a leading underscore. Use call printf, not _printf (Unlike MacOS X, which does decorate symbols with an _; keep that in mind if you're looking at tutorials for other OSes. Windows also uses a different calling convention, but only 32-bit Windows mangles names with _ or other decorations that encode the choice of calling convention.)

  • You didn't tell ld to link libc, and you didn't define printf yourself, so you didn't give the linker any input files that contain a definition for that symbol. printf is a library function defined in libc.so, and unlike the GCC front-end, ld doesn't include it automatically.

  • _start is not a function, so you can't ret from it. On entry, RSP points to argc, not a return address. Define main instead if you want it to be a normal function.
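To illustrate the last point, here is a minimal sketch (not part of the original answer) of a libc-free _start that reads argc from the top of the stack and hands it to the Linux exit system call as the exit status. The file name argc.asm is hypothetical.

```nasm
;;; Sketch: at _start, [rsp] holds argc; there is no return address.
;;; Hypothetical file name argc.asm. Build without libc:
;;;   nasm -felf64 argc.asm && ld -o argc argc.o
;;;   ./argc a b ; echo $?        (argc is 3 here: program name + 2 args)

section .text
    global _start
_start:
    mov    edi, [rsp]           ; low 32 bits of argc, used as the exit status
    mov    eax, 60              ; __NR_exit on x86-64 Linux
    syscall                     ; exit(argc); never returns, so no ret is needed
```

Because nothing here uses stdio, a raw exit system call is fine. Once printf is involved, exiting through libc's exit matters for flushing buffers.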

Link with gcc -no-pie -nostartfiles hello.o -o hello if you want a dynamic executable that provides its own _start instead of main, but still uses libc.

This is safe for dynamic executables on GNU/Linux, because glibc can run its init functions via dynamic linker hooks. It's not safe on Cygwin, where its libc is only initialized by calls from its CRT start file (which do that before calling main).

Use call exit to exit, instead of making an _exit system call directly if you use printf; that lets libc flush any buffered output. (If you redirect output to a file, stdout will be full-buffered, vs. line buffered on a terminal.)

-static would not be safe; in a static executable no dynamic-linker code runs before your _start, so there's no way for libc to get itself initialized unless you call the functions manually. That's possible, but generally not recommended.

There are other libc implementations that don't need any init functions called before printf / malloc / other functions work. In glibc, stuff like the stdio buffers are allocated at runtime. (This used to be the case for MUSL libc, but that's apparently not the case anymore, according to Florian's comment on this answer.)

Normally if you want to use libc functions, it's a good idea to define a main function instead of your own _start entry point. Then you can just link with gcc normally, with no special options.

See What parts of this HelloWorld assembly code are essential if I were to write the program in assembly? for that and a version that uses Linux system calls directly, without libc.

If you wanted your code to work in a PIE executable like gcc makes by default (without -no-pie) on recent distros, you'd need call printf wrt ..plt.

Either way, you should use lea rsi, [rel message] because RIP-relative LEA is more efficient than mov r64, imm64 with a 64-bit absolute address. (In position-dependent code, the best option for putting a static address in a 64-bit register is 5-byte mov esi, message, because static addresses in non-PIE executables are known to be in the low 2GiB of virtual address space, and thus work as 32-bit sign- or zero-extended immediates. But RIP-relative LEA is not much worse and works everywhere.)
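The three ways of loading a static address discussed above can be put side by side. This is only an illustrative sketch, assuming a non-PIE link and a hypothetical file name addr.asm. The byte counts in the comments are the standard encoding sizes of those instruction forms.

```nasm
;;; Sketch: three ways to put the address of message into RDI.
;;; Hypothetical file name addr.asm. Build as non-PIE:
;;;   nasm -felf64 addr.asm && gcc -no-pie addr.o

default rel                ; [symbol] means RIP-relative
extern puts

section .text
    global main
main:
    sub    rsp, 8               ; align the stack before calling puts

    mov    rdi, message         ; 10 bytes: mov r64, imm64, 64-bit absolute address
    lea    rdi, [message]       ; 7 bytes: RIP-relative LEA, also works in a PIE
    mov    edi, message         ; 5 bytes: mov r32, imm32, non-PIE only

    call   puts                 ; all three left the same address in RDI

    add    rsp, 8
    xor    eax, eax             ; return 0
    ret

section .rodata
    message:    db "Hello, world!", 0
```

Only one of the three loads is needed in real code. They are repeated here so the encodings can be compared in a listing, for example with nasm -felf64 -l addr.lst addr.asm.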

;;; Defining your own _start but using libc
;;; works on Linux for non-PIE executables

default rel                ; Use RIP-relative for [symbol] addressing modes
extern printf
extern exit                ; unlike _exit, exit flushes stdio buffers

section .text
    global _start
_start:
    ;; RSP is already aligned by 16 on entry at _start, unlike in functions

    lea    rdi, [format]        ; argument #1   or better  mov edi, format
    lea    rsi, [message]       ; argument #2
    xor    eax, eax             ; no FP args to the variadic function
    call   printf               ; for a PIE executable:  call printf wrt ..plt

    xor    edi, edi             ; arg #1 = 0
    call   exit                 ; exit(0)
    ; exit definitely does not return

section .rodata        ;; read-only data can go in .rodata instead of read-write .data

    message:    db "Hello, world!", 0
    format:   db "%s", 0xa, 0

Assemble normally, link with gcc -no-pie -nostartfiles hello.o. This omits the CRT startup files that would normally define a _start that does some stuff before calling main. Libc init functions are called from dynamic linker hooks so printf is usable.

This would not be the case with gcc -static -nostartfiles hello.o. I included examples of what happens if you use the wrong options:

peter@volta:/tmp$ nasm -felf64 nopie-start.asm 
peter@volta:/tmp$ gcc -no-pie -nostartfiles nopie-start.o 
peter@volta:/tmp$ ./a.out 
Hello, world!
peter@volta:/tmp$ file a.out 
a.out: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=0cd1cd111ba0c6926d5d69f9191bdf136e098e62, not stripped

# link error without -no-pie because it doesn't automatically make PLT stubs
peter@volta:/tmp$ gcc -nostartfiles nopie-start.o 
/usr/bin/ld: nopie-start.o: relocation R_X86_64_PC32 against symbol `printf@@GLIBC_2.2.5' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status


# runtime error with -static
peter@volta:/tmp$ gcc -static -no-pie -nostartfiles nopie-start.o -o static_start-hello
peter@volta:/tmp$ ./static_start-hello 
Segmentation fault (core dumped)


Alternative version, defining main instead of _start

(And simplifying by using puts instead of printf.)

default rel                ; Use RIP-relative for [symbol] addressing modes
extern puts

section .text
    global main
main:
    sub    rsp, 8    ;; RSP was 16-byte aligned *before* a call pushed a return address
                     ;; RSP is now 16-byte aligned, ready for another call

    mov    edi, message         ; argument #1, optimized to use non-PIE-only move imm32
    call   puts

    add    rsp, 8               ; restore the stack
    xor    eax, eax             ; return 0
    ret

section .rodata
    message:    db "Hello, world!", 0     ; puts appends a newline

puts pretty much exactly implements printf("%s\n", string); C compilers will make this optimization for you, but in asm you should do it yourself.

Link with gcc -no-pie hello.o, or even statically link using gcc -no-pie -static hello.o. The CRT startup code will call glibc init functions.

peter@volta:/tmp$ nasm -felf64 nopie-main.asm 
peter@volta:/tmp$ gcc -no-pie nopie-main.o 
peter@volta:/tmp$ ./a.out 
Hello, world!

# link error if you leave out -no-pie  because of the imm32 absolute address
peter@volta:/tmp$ gcc nopie-main.o 
/usr/bin/ld: nopie-main.o: relocation R_X86_64_32 against `.rodata' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: final link failed: nonrepresentable section on output
collect2: error: ld returned 1 exit status

main is a function, so you need to re-align the stack before making another function call. A dummy push is also a valid way to align the stack on function entry, but add/sub rsp, 8 is clearer.
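As a sketch of the dummy-push variant mentioned above (an illustration, not code from the original answer; the file name push-main.asm is hypothetical), the same non-PIE program could look like this:

```nasm
;;; Sketch: aligning the stack with a dummy push instead of sub rsp, 8.
;;; Hypothetical file name push-main.asm. Build as non-PIE:
;;;   nasm -felf64 push-main.asm && gcc -no-pie push-main.o

extern puts

section .text
    global main
main:
    push   rax                  ; dummy push: RSP -= 8, the value pushed is irrelevant

    mov    edi, message         ; argument #1 (imm32 absolute address, non-PIE only)
    call   puts

    pop    rcx                  ; undo the dummy push into a call-clobbered register
    xor    eax, eax             ; return 0
    ret

section .rodata
    message:    db "Hello, world!", 0     ; puts appends a newline
```

The push and pop are each a single byte, which is why compilers sometimes prefer them, but the intent is less obvious than an explicit sub/add pair.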

An alternative is jmp puts to tailcall it, so main's return value will be whatever puts returns. In this case, you must not modify rsp first: you just jump to puts with your return address still on the stack, exactly as if your caller had called puts.
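A minimal sketch of that tail call, again assuming a non-PIE link and a hypothetical file name tail-main.asm:

```nasm
;;; Sketch: main tail-calls puts instead of calling it and returning.
;;; Hypothetical file name tail-main.asm. Build as non-PIE:
;;;   nasm -felf64 tail-main.asm && gcc -no-pie tail-main.o

extern puts

section .text
    global main
main:
    ;; RSP is left untouched: the return address pushed by main's caller
    ;; is exactly what puts expects to find on entry.
    mov    edi, message         ; argument #1 (imm32 absolute address, non-PIE only)
    jmp    puts                 ; puts returns directly to main's caller

section .rodata
    message:    db "Hello, world!", 0     ; puts appends a newline
```

Since main now returns whatever puts returns, and puts returns a non-negative value on success, the exit status of the process is not guaranteed to be 0 with this version.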

PIE version, also defining main

(You can make a PIE that defines its own _start. That's left as an exercise for the reader.)

default rel                ; Use RIP-relative for [symbol] addressing modes
extern puts

section .text
    global main
main:
    sub    rsp, 8    ;; RSP was 16-byte aligned *before* a call pushed a return address

    lea    rdi, [message]         ; argument #1
    call   puts  wrt ..plt

    add    rsp, 8
    xor    eax, eax               ; return 0
    ret

section .rodata
    message:    db "Hello, world!", 0     ; puts appends a newline

peter@volta:/tmp$ nasm -felf64 pie.asm
peter@volta:/tmp$ gcc pie.o
peter@volta:/tmp$ ./a.out 
Hello, world!
peter@volta:/tmp$ file a.out
a.out: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=b27e6032f955d628a542f6391b50805c68541fb9, not stripped
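
For the exercise mentioned earlier, one possible sketch of a PIE that defines its own _start combines the two techniques already shown: RIP-relative LEA for the addresses and wrt ..plt for the libc calls. This is not from the original answer. The file name pie-start.asm is hypothetical, and the sketch assumes a glibc-based distro where gcc builds PIEs by default.

```nasm
;;; Sketch: PIE executable with its own _start, still using libc.
;;; Hypothetical file name pie-start.asm. Build as a dynamic PIE:
;;;   nasm -felf64 pie-start.asm && gcc -pie -nostartfiles pie-start.o

default rel                ; Use RIP-relative for [symbol] addressing modes
extern printf
extern exit

section .text
    global _start
_start:
    ;; RSP is already aligned by 16 on entry at _start

    lea    rdi, [format]        ; argument #1, position-independent
    lea    rsi, [message]       ; argument #2
    xor    eax, eax             ; no FP args to the variadic function
    call   printf wrt ..plt     ; call through the PLT, as a PIE requires

    xor    edi, edi             ; arg #1 = 0
    call   exit wrt ..plt       ; exit(0) flushes stdio buffers and does not return

section .rodata
    message:    db "Hello, world!", 0
    format:     db "%s", 0xa, 0
```

As with the non-PIE _start version, this relies on dynamic linking so that glibc can initialize itself before _start runs. It should not be combined with -static.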
