Linking a program using printf with ld?

Question

I'm getting an undefined reference to _printf when building an assembly program that defines its own _start instead of main, using NASM on x86-64 Ubuntu.

Build commands:

   nasm -f elf64 hello.asm
   ld -s -o hello hello.o
   hello.o: In function `_start':
   hello.asm:(.text+0x1a): undefined reference to `_printf'
   MakeFile:4: recipe for target 'compile' failed
   make: *** [compile] Error 1

Source:

extern _printf

section .text
    global _start
_start:
    mov rdi, format     ; argument #1
    mov rsi, message    ; argument #2
    mov rax, 0
  call _printf            ; call printf

    mov rax, 0
    ret                 ; return 0

section .data

    message:    db "Hello, world!", 0
    format:   db "%s", 0xa, 0

Hello, world! should be the output.

Recommended answer

3 problems:

  • GNU/Linux using ELF object files does not decorate / mangle C names with a leading underscore. Use call printf, not _printf (Unlike MacOS X, which does decorate symbols with an _; keep that in mind if you're looking at tutorials for other OSes. Windows also uses a different calling convention, but only 32-bit Windows mangles names with _ or other decorations that encode the choice of calling convention.)

  • You didn't tell ld to link libc, and you didn't define printf yourself, so you didn't give the linker any input file that contains a definition for that symbol. printf is a library function defined in libc.so, and unlike the GCC front-end, ld doesn't link it automatically.

  • _start is not a function, so you can't ret from it. On entry, RSP points to argc, not a return address. Define main instead if you want it to be a normal function.
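The third point can be checked without libc at all. The following is a minimal sketch, assuming x86-64 Linux; the file name argc.asm is hypothetical, and the exit goes through a raw exit_group system call (number 231) rather than anything from the original code. It reads the value at [rsp] on entry to _start and uses it as the exit status:

```nasm
; argc.asm (hypothetical file name): show that [rsp] holds argc at _start
; build: nasm -felf64 argc.asm && ld -o argc argc.o
; try:   ./argc a b c ; echo $?      (argc counts the program name too)

section .text
    global _start
_start:
    mov    edi, [rsp]       ; argc sits where a function would find its return address
    mov    eax, 231         ; __NR_exit_group on x86-64 Linux
    syscall                 ; exit_group(argc): does not return, so no ret is needed
```

Run with three arguments, the exit status should be 4. A ret here would instead pop argc into RIP and jump to that small integer as if it were an address, which is why the original program crashes after printing.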

Link with gcc -no-pie -nostartfiles hello.o -o hello if you want a dynamic executable that provides its own _start instead of main, but still uses libc.

This is safe for dynamic executables on GNU/Linux, because glibc can run its init functions via dynamic linker hooks. It's not safe on Cygwin, where its libc is only initialized by calls from its CRT start file (which do that before calling main).

Use call exit to exit, instead of making an _exit system call directly if you use printf; that lets libc flush any buffered output. (If you redirect output to a file, stdout will be full-buffered, vs. line buffered on a terminal.)

-static would not be safe; in a static executable no dynamic-linker code runs before your _start, so there's no way for libc to get itself initialized unless you call the functions manually. That's possible, but generally not recommended.

There are other libc implementations that don't need any init functions called before printf / malloc / other functions work. In glibc, stuff like the stdio buffers are allocated at runtime. (This used to be the case for MUSL libc, but that's apparently not the case anymore, according to Florian's comment on this answer.)

Normally if you want to use libc functions, it's a good idea to define a main function instead of your own _start entry point. Then you can just link with gcc normally, with no special options.

See "What parts of this HelloWorld assembly code are essential if I were to write the program in assembly?" for that, and for a version that uses Linux system calls directly, without libc.
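For comparison, a libc-free version might look like the sketch below. It is not taken from the linked question; it assumes x86-64 Linux system call numbers (1 for write, 231 for exit_group), and the file name hello-sys.asm and the message_len symbol are hypothetical. Because nothing from libc is used, plain ld is enough to link it:

```nasm
; hello-sys.asm (hypothetical file name): Hello World with raw system calls
; build: nasm -felf64 hello-sys.asm && ld -o hello-sys hello-sys.o

section .text
    global _start
_start:
    mov    eax, 1                  ; __NR_write
    mov    edi, 1                  ; fd 1 = stdout
    lea    rsi, [rel message]      ; buffer address, RIP-relative
    mov    edx, message_len        ; byte count, an assemble-time constant
    syscall                        ; write(1, message, message_len)

    mov    eax, 231                ; __NR_exit_group
    xor    edi, edi                ; status 0
    syscall                        ; does not return

section .rodata
    message:     db "Hello, world!", 0xa
    message_len  equ $ - message   ; length in bytes, including the newline
```

There is no stdio buffering involved here, so exiting with a direct system call loses no output.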

If you want your code to work in a PIE executable, which gcc makes by default (without -no-pie) on recent distros, you need call printf wrt ..plt.

Either way, you should use lea rsi, [rel message], because RIP-relative LEA is more efficient than mov r64, imm64 with a 64-bit absolute address. (In position-dependent code, the best option for putting a static address in a 64-bit register is the 5-byte mov esi, message, because static addresses in non-PIE executables are known to be in the low 2GiB of virtual address space, and thus work as 32-bit sign- or zero-extended immediates. But RIP-relative LEA is not much worse and works everywhere.)
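The size difference between the three options can be seen by assembling them side by side and disassembling the object file. This is a small sketch for inspection only; the file name sizes.asm and the load_addr label are hypothetical, and the byte counts in the comments follow from the standard x86-64 encodings:

```nasm
; sizes.asm (hypothetical file name): compare ways to get a static address
; inspect: nasm -felf64 sizes.asm && objdump -drwC -Mintel sizes.o

default rel                      ; make [message] RIP-relative

section .text
load_addr:
    mov    rdi, message          ; 10 bytes: mov r64, imm64 with an absolute address
    mov    edi, message          ;  5 bytes: mov r32, imm32, zero-extended; non-PIE only
    lea    rdi, [message]        ;  7 bytes: RIP-relative, fine in PIE and non-PIE
    ret

section .rodata
    message:    db "Hello, world!", 0
```

Only the lea form links into a PIE without relocation complaints like the R_X86_64_32 error shown further down for the mov edi form.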

;;; Defining your own _start but using libc
;;; works on Linux for non-PIE executables

default rel                ; Use RIP-relative for [symbol] addressing modes
extern printf
extern exit                ; unlike _exit, exit flushes stdio buffers

section .text
    global _start
_start:
    ;; RSP is already aligned by 16 on entry at _start, unlike in functions

    lea    rdi, [format]        ; argument #1   or better  mov edi, format
    lea    rsi, [message]       ; argument #2
    xor    eax, eax             ; no FP args to the variadic function
    call   printf               ; for a PIE executable:  call printf wrt ..plt

    xor    edi, edi             ; arg #1 = 0
    call   exit                 ; exit(0)
    ; exit definitely does not return

section .rodata        ;; read-only data can go in .rodata instead of read-write .data

    message:    db "Hello, world!", 0
    format:   db "%s", 0xa, 0

Assemble normally, link with gcc -no-pie -nostartfiles hello.o. This omits the CRT startup files that would normally define a _start that does some stuff before calling main. Libc init functions are called from dynamic linker hooks so printf is usable.

This would not be the case with gcc -static -nostartfiles hello.o. I included examples of what happens if you use the wrong options:

peter@volta:/tmp$ nasm -felf64 nopie-start.asm 
peter@volta:/tmp$ gcc -no-pie -nostartfiles nopie-start.o 
peter@volta:/tmp$ ./a.out 
Hello, world!
peter@volta:/tmp$ file a.out 
a.out: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=0cd1cd111ba0c6926d5d69f9191bdf136e098e62, not stripped

# link error without -no-pie because it doesn't automatically make PLT stubs
peter@volta:/tmp$ gcc -nostartfiles nopie-start.o 
/usr/bin/ld: nopie-start.o: relocation R_X86_64_PC32 against symbol `printf@@GLIBC_2.2.5' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: final link failed: bad value
collect2: error: ld returned 1 exit status


# runtime error with -static
peter@volta:/tmp$ gcc -static -no-pie -nostartfiles nopie-start.o -o static_start-hello
peter@volta:/tmp$ ./static_start-hello 
Segmentation fault (core dumped)

Alternative version, defining main instead of _start

(And simplifying by using puts instead of printf.)

default rel                ; Use RIP-relative for [symbol] addressing modes
extern puts

section .text
    global main
main:
    sub    rsp, 8    ;; RSP was 16-byte aligned *before* a call pushed a return address
                     ;; RSP is now 16-byte aligned, ready for another call

    mov    edi, message         ; argument #1, optimized to use non-PIE-only move imm32
    call   puts

    add    rsp, 8               ; restore the stack
    xor    eax, eax             ; return 0
    ret

section .rodata
    message:    db "Hello, world!", 0     ; puts appends a newline

puts does pretty much exactly what printf("%s\n", string) does; C compilers will make this optimization for you, but in asm you should do it yourself.

Link with gcc -no-pie hello.o, or even statically link using gcc -no-pie -static hello.o. The CRT startup code will call glibc init functions.

peter@volta:/tmp$ nasm -felf64 nopie-main.asm 
peter@volta:/tmp$ gcc -no-pie nopie-main.o 
peter@volta:/tmp$ ./a.out 
Hello, world!

# link error if you leave out -no-pie  because of the imm32 absolute address
peter@volta:/tmp$ gcc nopie-main.o 
/usr/bin/ld: nopie-main.o: relocation R_X86_64_32 against `.rodata' can not be used when making a PIE object; recompile with -fPIC
/usr/bin/ld: final link failed: nonrepresentable section on output
collect2: error: ld returned 1 exit status

main is a function, so you need to re-align the stack before making another function call. A dummy push is also a valid way to align the stack on function entry, but add/sub rsp, 8 is clearer.
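The dummy-push variant might look like the sketch below. It is the same program as above with only the alignment instructions swapped; the choice of rax and rcx as throwaway registers is an assumption (any register the function doesn't need preserved would do), and the call uses wrt ..plt so the file links with or without -no-pie:

```nasm
default rel                ; Use RIP-relative for [symbol] addressing modes
extern puts

section .text
    global main
main:
    push   rax                  ; dummy push: RSP -= 8, so RSP is 16-byte aligned again

    lea    rdi, [message]       ; argument #1
    call   puts wrt ..plt

    pop    rcx                  ; undo the dummy push; the popped value is discarded
    xor    eax, eax             ; return 0
    ret

section .rodata
    message:    db "Hello, world!", 0     ; puts appends a newline
```

push and pop are one byte each, shorter than sub/add rsp, 8, at the cost of being less obvious to a reader.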

An alternative is jmp puts to tailcall it, so main's return value will be whatever puts returns. In this case, you must not modify rsp first: you just jump to puts with your return address still on the stack, exactly as if your caller had called puts.
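A sketch of the tailcall version, under the same assumptions as the examples above (NASM, x86-64 Linux, linked with gcc); the wrt ..plt is an addition here so that it also links as a PIE:

```nasm
default rel                ; Use RIP-relative for [symbol] addressing modes
extern puts

section .text
    global main
main:
    lea    rdi, [message]       ; argument #1
    jmp    puts wrt ..plt       ; tailcall: puts returns straight to main's caller
                                ; RSP is untouched, so puts sees the same stack
                                ; alignment it would after a normal call

section .rodata
    message:    db "Hello, world!", 0     ; puts appends a newline
```

Since main now returns whatever puts returns (a non-negative value on success, not necessarily 0), the process exit status is no longer guaranteed to be 0.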

PIE-compatible version, defining main

(You can also make a PIE that defines its own _start. That's left as an exercise for the reader.)

default rel                ; Use RIP-relative for [symbol] addressing modes
extern puts

section .text
    global main
main:
    sub    rsp, 8    ;; RSP was 16-byte aligned *before* a call pushed a return address

    lea    rdi, [message]         ; argument #1
    call   puts  wrt ..plt

    add    rsp, 8
    xor    eax, eax               ; return 0
    ret

section .rodata
    message:    db "Hello, world!", 0     ; puts appends a newline

peter@volta:/tmp$ nasm -felf64 pie.asm
peter@volta:/tmp$ gcc pie.o
peter@volta:/tmp$ ./a.out 
Hello, world!
peter@volta:/tmp$ file a.out
a.out: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=b27e6032f955d628a542f6391b50805c68541fb9, not stripped
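One possible take on the exercise mentioned earlier (a PIE that defines its own _start but still uses libc) is sketched below. It is not from the original answer: it simply combines the earlier _start example with RIP-relative addressing and wrt ..plt calls. The file name pie-start.asm is hypothetical, and the build line assumes a gcc that produces PIE executables by default:

```nasm
;;; pie-start.asm (hypothetical file name): own _start, libc, PIE-compatible
;;; build: nasm -felf64 pie-start.asm && gcc -nostartfiles pie-start.o

default rel                ; Use RIP-relative for [symbol] addressing modes
extern printf
extern exit                ; unlike _exit, exit flushes stdio buffers

section .text
    global _start
_start:
    ;; RSP is already aligned by 16 on entry at _start, unlike in functions

    lea    rdi, [format]        ; argument #1, no absolute addresses anywhere
    lea    rsi, [message]       ; argument #2
    xor    eax, eax             ; no FP args to the variadic function
    call   printf wrt ..plt     ; call through the PLT, as a PIE requires

    xor    edi, edi             ; arg #1 = 0
    call   exit wrt ..plt       ; exit(0), does not return

section .rodata
    message:    db "Hello, world!", 0
    format:     db "%s", 0xa, 0
```

As with the non-PIE _start version, this relies on dynamic linking so that glibc can initialize itself through dynamic linker hooks; it should not be combined with -static.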
