Can I do `ret` instruction from code at _start in MacOS? Linux?


Question

I am wondering if it is legal to return with ret from a program's entry point.

Example with NASM:

section .text
global _start
_start:
ret

; Linux: nasm -f elf64 foo.asm -o foo.o && ld foo.o
; OS X:  nasm -f macho64 foo.asm -o foo.o && ld foo.o -lc -macosx_version_min 10.12.0 -e _start -o foo

ret pops a return address from the stack and jumps to it.

But are the top bytes of the stack a valid return address at the program entry point, or do I have to call exit?

Also, the program above does not segfault on OS X. Where does it return to?

Solution

MacOS Dynamic Executables

When you are using MacOS and link with:

ld foo.o -lc -macosx_version_min 10.12.0 -e _start -o foo

you get a dynamically linked version of your code. _start isn't the true entry point; the dynamic loader is. As one of its last steps, the dynamic loader performs C/C++/Objective-C runtime initialization and then calls the entry point you specified with the -e option. The Apple documentation about Forking and Executing the Process has these paragraphs:

A Mach-O executable file contains a header consisting of a set of load commands. For programs that use shared libraries or frameworks, one of these commands specifies the location of the linker to be used to load the program. If you use Xcode, this is always /usr/lib/dyld, the standard OS X dynamic linker.

When you call the execve routine, the kernel first loads the specified program file and examines the mach_header structure at the start of the file. The kernel verifies that the file appears to be a valid Mach-O file and interprets the load commands stored in the header. The kernel then loads the dynamic linker specified by the load commands into memory and executes the dynamic linker on the program file.

The dynamic linker loads all the shared libraries that the main program links against (the dependent libraries) and binds enough of the symbols to start the program. It then calls the entry point function. At build time, the static linker adds the standard entry point function to the main executable file from the object file /usr/lib/crt1.o. This function sets up the runtime environment state for the kernel and calls static initializers for C++ objects, initializes the Objective-C runtime, and then calls the program’s main function

In your case that is _start. In this environment, where you are creating a dynamically linked executable, you can execute ret and return to the code that called _start, which then makes the exit system call for you. This is why it doesn't crash. If you review the generated executable with gobjdump -Dx foo you should get:

start address 0x0000000000000000

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000001  0000000000001fff  0000000000001fff  00000fff  2**0
                  CONTENTS, ALLOC, LOAD, CODE
SYMBOL TABLE:
0000000000001000 g       03 ABS    01 0010 __mh_execute_header
0000000000001fff g       0f SECT   01 0000 [.text] _start
0000000000000000 g       01 UND    00 0100 dyld_stub_binder

Disassembly of section .text:

0000000000001fff <_start>:
    1fff:       c3                      retq

Notice that the start address is 0, and that the symbol at 0 is dyld_stub_binder. This is the dynamic loader stub that eventually sets up a C runtime environment and then calls your entry point _start. If you don't override the entry point, it defaults to main.


MacOS Static Executables

If, however, you build a static executable, no code is executed before your entry point, and ret should crash since there is no valid return address on the stack. The documentation quoted above says:

For programs that use shared libraries or frameworks, one of these commands specifies the location of the linker to be used to load the program.

A statically built executable doesn't use the dynamic loader dyld with crt1.o embedded in it (CRT = C runtime, which on MacOS covers C++/Objective-C as well). No dynamic loading takes place, the C/C++/Objective-C initialization code is not executed, and control is transferred directly to your entry point.

To build statically, drop -lc (or -lSystem) from the linker command and add the -static option:

ld foo.o -macosx_version_min 10.12.0 -e _start -o foo -static

If you run this version it should produce a segmentation fault. gobjdump -Dx foo produces:

start address 0x0000000000001fff

Sections:
Idx Name          Size      VMA               LMA               File off  Algn
  0 .text         00000001  0000000000001fff  0000000000001fff  00000fff  2**0
                  CONTENTS, ALLOC, LOAD, CODE
  1 LC_THREAD.x86_THREAD_STATE64.0 000000a8  0000000000000000  0000000000000000  00000198  2**0
                  CONTENTS
SYMBOL TABLE:
0000000000001000 g       03 ABS    01 0010 __mh_execute_header
0000000000001fff g       0f SECT   01 0000 [.text] _start

Disassembly of section .text:

0000000000001fff <_start>:
    1fff:       c3                      retq

Notice that the start address is now 0x1fff, which is the entry point you specified (_start). There is no dynamic loader stub as an intermediary.
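
Since ret has nowhere valid to return to in the static build, the entry point has to terminate the process itself. A minimal sketch, assuming x86-64 MacOS, the same static link command as above, and the BSD system call numbering (class 0x2000000 plus exit = 1); the file name exit.asm is only illustrative, and static executables may not be supported on every MacOS version:

```nasm
; MacOS x86-64, static build: leave _start with an exit system call.
; Assumed build line (same flags as the static example above):
;   nasm -f macho64 exit.asm -o exit.o
;   ld exit.o -macosx_version_min 10.12.0 -e _start -o exit -static
section .text
global _start
_start:
    mov eax, 0x2000001   ; BSD syscall class (0x2000000) + exit (1)
    xor edi, edi         ; first argument: exit status 0
    syscall              ; terminates the process, never returns
```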


Linux

Under Linux, when you specify your own entry point, executing ret there will cause a segmentation fault whether you build a static or a shared executable. There is good information on how ELF executables are run on Linux in this article and the dynamic linker documentation. The key point is that the Linux documentation makes no mention of doing C/C++/Objective-C runtime initialization, unlike the MacOS dynamic linker documentation.

The key difference between the Linux dynamic loader (ld.so) and the MacOS one (dyld) is that the MacOS dynamic loader performs C/C++/Objective-C startup initialization by including the entry point from crt1.o. The code in crt1.o then transfers control to the entry point you specified with -e (the default is main). In Linux the dynamic loader makes no assumption about the type of code that will be run. After the shared objects are processed and initialized, control is transferred directly to the entry point.
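
Because nothing on Linux calls _start, the usual way to leave it is an explicit exit system call rather than ret. A minimal sketch for x86-64 Linux, built with the same commands as in the question; the file name exit.asm is only illustrative:

```nasm
; Linux x86-64: exit explicitly instead of executing ret at _start.
; Build: nasm -f elf64 exit.asm -o exit.o && ld exit.o -o exit
section .text
global _start
_start:
    mov eax, 60          ; system call number of exit on x86-64 Linux
    xor edi, edi         ; first argument: exit status 0
    syscall              ; terminates the process, never returns
```

Running the result and then `echo $?` should print the status passed in edi.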


Stack Layout at Process Creation

FreeBSD (on which MacOS is based) and Linux share one thing in common. When loading 64-bit executables the layout of the user stack when a process is created is the same. The stack for 32-bit processes is similar but pointers and data are 4 bytes wide, not 8.

Although there isn't a return address on the stack, there is other data representing the number of arguments, the arguments, environment variables, and other information. This layout is not the same as what the main function in C/C++ expects. It is part of the C startup code to convert the stack at process creation to something compatible with the C calling convention and the expectations of the function main (argc, argv, envp).
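
To make the layout concrete: on entry, the 8 bytes at the top of the stack hold the argument count, not a return address, which is why ret jumps to a bogus location. A small sketch for x86-64 Linux that reads that value and uses it as the exit status; the file name argc.asm is only illustrative:

```nasm
; Linux x86-64: the top of the stack at _start is argc, not a return address.
; Build: nasm -f elf64 argc.asm -o argc.o && ld argc.o -o argc
section .text
global _start
_start:
    mov rdi, [rsp]       ; argc (argv[0] pointer follows at [rsp+8])
    mov eax, 60          ; system call number of exit on x86-64 Linux
    syscall              ; exit(argc)
```

Invoked as `./argc a b`, the exit status reported by `echo $?` is the argument count including the program name.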

I wrote more information on this subject in this Stackoverflow answer that shows how a statically linked MacOS executable can traverse through the program arguments passed by the kernel at process creation.
