How to disassemble, modify and then reassemble a Linux executable?


Question

Is there any way this can be done? I've used objdump, but it doesn't produce assembly output that any assembler I know of will accept. I'd like to be able to change instructions within an executable and then test it afterwards.

Recommended answer

I don't think there is any reliable way to do this. Machine code formats are very complicated, more complicated than assembly files. It isn't really possible to take a compiled binary (say, in ELF format) and produce a source assembly program that will assemble to the same (or a similar enough) binary. To get a feel for the differences, compare the output of GCC compiling directly to assembly (gcc -S) with the output of objdump on the executable (objdump -D).

There are two major complications I can think of. Firstly, the machine code does not correspond one-to-one with the assembly code, because of things like pointer offsets.

For example, consider the C code for Hello world:

#include <stdio.h>  /* declares printf */

int main()
{
    printf("Hello, world!\n");
    return 0;
}

This compiles to the following x86 assembly code:

.LC0:
    .string "hello"
    .text
<snip>
    movl    $.LC0, %eax
    movl    %eax, (%esp)
    call    printf

Here .LC0 is a named constant, and printf is a symbol in a shared library's symbol table. Compare this with the output of objdump:

80483cd:       b8 b0 84 04 08          mov    $0x80484b0,%eax
80483d2:       89 04 24                mov    %eax,(%esp)
80483d5:       e8 1a ff ff ff          call   80482f4 <printf@plt>

Firstly, the constant .LC0 is now just an absolute address (0x80484b0) somewhere in memory. It would be difficult to create an assembly source file that places this constant at exactly that address, since the assembler and linker are free to choose the locations of such constants.

Secondly, I'm not entirely sure about this (and it depends on things like position-independent code), but I believe the address of printf is not encoded in that call instruction at all. The call goes to a printf@plt stub inside the executable, and the ELF file contains a lookup table that the dynamic linker fills in with the real address at runtime. Therefore, the disassembled code doesn't quite correspond to the source assembly code.
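To make the point about offsets concrete, here is a minimal Python sketch that decodes the call instruction from the objdump listing above. It assumes the instruction is the 5-byte x86 `call rel32` form (opcode 0xE8), whose operand is a signed displacement measured from the address of the next instruction. The function name `call_target` is made up for this illustration.

```python
import struct


def call_target(address: int, code: bytes) -> int:
    """Return the absolute target of a 5-byte x86 `call rel32` instruction.

    `address` is the address of the call instruction itself.
    """
    if len(code) != 5 or code[0] != 0xE8:
        raise ValueError("expected a 5-byte call rel32 instruction (opcode 0xE8)")
    # The operand is a signed 32-bit little-endian displacement.
    (displacement,) = struct.unpack("<i", code[1:])
    # The displacement is relative to the address of the NEXT instruction.
    # Mask to 32 bits because the listing comes from 32-bit x86 code.
    return (address + len(code) + displacement) & 0xFFFFFFFF


# Bytes and address taken from the objdump line for the call to printf@plt.
target = call_target(0x80483D5, bytes.fromhex("e81affffff"))
print(hex(target))  # prints 0x80482f4
```

The encoded bytes contain neither the name printf nor the address 0x80482f4, only the distance to the stub. If the instruction is moved, or code of a different length is inserted before it, the same bytes point somewhere else.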

In summary, source assembly has symbols, while compiled machine code has addresses, and mapping those addresses back to symbols is difficult.

The second major complication is that an assembly source file can't contain all of the information that was present in the original ELF file headers, such as which libraries to link against dynamically, along with other metadata placed there by the original compiler. It would be difficult to reconstruct this.
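As a rough illustration of the kind of metadata involved, the sketch below reads a few fixed fields from the start of an ELF header using only Python's standard library. It is a partial reader under stated assumptions: it looks only at the identification bytes, the file type, the machine and the entry point, and the helper name `describe_elf_header` and the sample header are invented for this example. The list of required shared libraries lives further inside the file, in the dynamic section, which this sketch does not parse.

```python
import struct

# A few well-known values from the ELF specification.
ELF_TYPES = {1: "relocatable", 2: "executable", 3: "shared object", 4: "core"}
ELF_MACHINES = {3: "x86", 62: "x86-64"}


def describe_elf_header(data: bytes) -> dict:
    """Decode a handful of fields from the start of an ELF file."""
    if len(data) < 16 or data[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    ei_class = data[4]  # 1 = 32-bit, 2 = 64-bit
    ei_data = data[5]   # 1 = little-endian, 2 = big-endian
    if ei_class not in (1, 2) or ei_data not in (1, 2):
        raise ValueError("unsupported ELF class or byte order")
    endian = "<" if ei_data == 1 else ">"
    entry_format = "I" if ei_class == 1 else "Q"
    if len(data) < 24 + struct.calcsize(entry_format):
        raise ValueError("header is truncated")
    # e_type and e_machine follow the 16 identification bytes.
    e_type, e_machine = struct.unpack_from(endian + "HH", data, 16)
    # e_entry follows the 4-byte e_version field, so it starts at offset 24.
    (e_entry,) = struct.unpack_from(endian + entry_format, data, 24)
    return {
        "bits": 32 if ei_class == 1 else 64,
        "type": ELF_TYPES.get(e_type, "unknown"),
        "machine": ELF_MACHINES.get(e_machine, "unknown"),
        "entry": e_entry,
    }


# Hypothetical 32-bit little-endian header, built by hand for the demo.
sample = b"\x7fELF" + bytes([1, 1, 1]) + bytes(9)
sample += struct.pack("<HHII", 2, 3, 1, 0x8048300)
print(describe_elf_header(sample))
```

On a real binary, the same function could be given the first 64 bytes of the file, read in binary mode. None of these fields has an obvious counterpart in a plain disassembly listing, which is part of why the listing alone is not enough to rebuild the file.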

Like I said, it's possible that a special tool could handle all of this information, but it is unlikely that one can simply produce assembly code that reassembles back into the executable.

If you are interested in modifying just a small section of the executable, I recommend a much more subtle approach than recompiling the whole application. Use objdump to get the assembly code for the function(s) you are interested in. Convert it to "source assembly syntax" by hand (and here, I wish there were a tool that actually produced disassembly in the same syntax as the input), and modify it as you wish. When you are done, reassemble just those function(s) and use objdump to find the machine code for your modified version. Then use a hex editor to paste the new machine code over the corresponding part of the original program, taking care that the new code is precisely the same number of bytes as the old code (otherwise all the offsets would be wrong). If the new code is shorter, you can pad it out with NOP instructions. If it is longer, you may be in trouble, and might have to create new functions and call them instead.
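The paste-and-pad step can also be scripted instead of done in a hex editor. The following Python sketch shows one possible way, assuming 32-bit x86 code where 0x90 is the single-byte NOP. The function name `patch_code` is hypothetical, and the replacement instruction (`xor %eax,%eax`, encoded as 31 c0) is chosen only to show the byte mechanics, not because it is a sensible change to the Hello world program.

```python
NOP = 0x90  # single-byte x86 NOP instruction


def patch_code(image: bytes, offset: int, old_code: bytes, new_code: bytes) -> bytes:
    """Return a copy of `image` with `old_code` at `offset` replaced by `new_code`.

    The result always has the same length as `image`: shorter replacements
    are padded with NOPs, and longer ones are rejected.
    """
    if offset < 0:
        raise ValueError("offset must not be negative")
    end = offset + len(old_code)
    # Refuse to patch unless the bytes on disk are the ones we expect.
    if image[offset:end] != old_code:
        raise ValueError("bytes at offset do not match the expected old code")
    if len(new_code) > len(old_code):
        raise ValueError("new code is longer than the code it replaces")
    padding = bytes([NOP]) * (len(old_code) - len(new_code))
    return image[:offset] + new_code + padding + image[end:]


# The three instructions from the objdump listing, used as a tiny stand-in image.
image = bytes.fromhex("b8b0840408" "890424" "e81affffff")
# Replace the 5-byte mov with a 2-byte instruction; 3 NOPs fill the gap.
patched = patch_code(image, 0, bytes.fromhex("b8b0840408"), bytes.fromhex("31c0"))
print(patched.hex())  # prints 31c0909090890424e81affffff
```

When working on a real file, note that `offset` must be the position of the bytes within the file, which is generally not the same number as the address shown in the objdump listing. Checking the old bytes before writing is a cheap guard against patching the wrong place.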
