How to disassemble, modify and then reassemble a Linux executable?

Question

Is there any way this can be done? I've used objdump, but it doesn't produce assembly output that any assembler I know of will accept. I'd like to be able to change instructions within an executable and then test it afterwards.

Recommended answer

I don't think there is any reliable way to do this. Machine code formats are very complicated, more so than assembly files. It isn't really possible to take a compiled binary (say, in ELF format) and produce an assembly source program that will assemble back into the same (or a similar enough) binary. To get a feel for the differences, compare the output of GCC compiling straight to assembly (gcc -S) with the output of objdump on the executable (objdump -D).

There are two major complications I can think of. Firstly, the machine code does not correspond one-to-one with the assembly source, because of things like pointer offsets.

For example, consider the C code for Hello world:

#include <stdio.h>  /* declares printf */

int main(void)
{
    printf("Hello, world!\n");
    return 0;
}

This compiles to x86 assembly code along these lines (abridged):

.LC0:
    .string "hello"
    .text
<snip>
    movl    $.LC0, %eax
    movl    %eax, (%esp)
    call    printf

Here .LC0 is a named constant and printf is a symbol in a shared library's symbol table. Compare this with the output of objdump:

80483cd:       b8 b0 84 04 08          mov    $0x80484b0,%eax
80483d2:       89 04 24                mov    %eax,(%esp)
80483d5:       e8 1a ff ff ff          call   80482f4 <printf@plt>

Firstly, the constant .LC0 is now just a fixed address somewhere in memory (0x80484b0 here). It would be difficult to create an assembly source file that places this constant at exactly that address, since the assembler and linker are free to choose where such constants go.

Secondly, and this depends on things like position-independent code, the call does not encode the address of printf itself. As the printf@plt annotation shows, it targets a small stub in the procedure linkage table (PLT), and the real address of printf is filled in at runtime by the dynamic linker through a lookup table stored in the ELF file. Therefore, the disassembled code doesn't quite correspond to the source assembly code.
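To make the address encoding concrete, here is a small C sketch that decodes the two objdump lines above by hand. The function names read_le32 and call_rel32_target are made up for illustration, and the sketch assumes 32-bit x86 with the "mov imm32" (opcode b8) and "call rel32" (opcode e8) encodings shown in that listing.

```c
#include <stdint.h>

/* Read a 32-bit little-endian value from a byte buffer. */
uint32_t read_le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Target of an x86 "call rel32" instruction (opcode e8 plus 4 bytes).
   The displacement is signed and relative to the address of the NEXT
   instruction, which is insn_addr + 5. Unsigned arithmetic wraps modulo
   2^32, so a negative displacement works without any special casing. */
uint32_t call_rel32_target(uint32_t insn_addr, const uint8_t *insn)
{
    return insn_addr + 5u + read_le32(insn + 1);
}
```

Feeding in the bytes b8 b0 84 04 08, read_le32 on the four bytes after the opcode gives 0x080484b0, the absolute address that replaced .LC0. For e8 1a ff ff ff at address 0x80483d5, call_rel32_target gives 0x080482f4, the printf@plt stub. Neither value carries a symbol name, and the call displacement changes whenever the distance between the call site and the stub changes.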

In summary, source assembly has symbols, while compiled machine code has addresses, and mapping those addresses back to symbols is difficult.

The second major complication is that an assembly source file can't contain all of the information that was present in the original ELF file headers, such as which libraries to dynamically link against and other metadata placed there by the original compiler. It would be difficult to reconstruct this.
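As a rough illustration of the kind of metadata involved, the C sketch below pulls a few fields out of the first bytes of an ELF file. The struct and function names are hypothetical, and only the file header is covered. The list of required shared libraries is stored further in, as DT_NEEDED entries in the dynamic section, which this sketch does not parse.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* A few fields from the start of an ELF file header (illustrative only). */
struct elf_summary {
    int      is_64bit;       /* e_ident[EI_CLASS]: 1 = 32-bit, 2 = 64-bit  */
    int      little_endian;  /* e_ident[EI_DATA]:  1 = little, 2 = big     */
    uint16_t type;           /* e_type, e.g. 2 = ET_EXEC, 3 = ET_DYN       */
    uint16_t machine;        /* e_machine, e.g. 3 = EM_386, 62 = EM_X86_64 */
};

/* Read a 16-bit value in the byte order declared by the file. */
static uint16_t read_u16(const uint8_t *p, int little_endian)
{
    return little_endian ? (uint16_t)(p[0] | (p[1] << 8))
                         : (uint16_t)(p[1] | (p[0] << 8));
}

/* Returns 0 on success, -1 if buf does not look like an ELF header. */
int summarize_elf_header(const uint8_t *buf, size_t len,
                         struct elf_summary *out)
{
    /* Magic is 0x7f 'E' 'L' 'F'; the literal is split so that "E" is not
       swallowed by the \x escape. */
    if (len < 20 || memcmp(buf, "\x7f" "ELF", 4) != 0)
        return -1;
    if ((buf[4] != 1 && buf[4] != 2) || (buf[5] != 1 && buf[5] != 2))
        return -1;
    out->is_64bit = (buf[4] == 2);
    out->little_endian = (buf[5] == 1);
    out->type = read_u16(buf + 16, out->little_endian);    /* e_type    */
    out->machine = read_u16(buf + 18, out->little_endian); /* e_machine */
    return 0;
}
```

To try it, read the first 20 bytes of an executable into a buffer and pass them to summarize_elf_header. None of these values appear anywhere in the output of gcc -S; the assembler and linker add them later.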

Like I said, it's possible that a specialized tool could manipulate all of this information, but it is unlikely that you can simply produce assembly code that reassembles back into the executable.

If you are interested in modifying just a small section of the executable, I recommend a much more subtle approach than rebuilding the whole application. Use objdump to get the assembly code for the function(s) you are interested in. Convert it to "source assembly syntax" by hand (and here, I wish there were a tool that actually produced disassembly in the same syntax as the input), and modify it as you wish. When you are done, reassemble just those function(s) and use objdump to find the machine code for your modified version. Then use a hex editor to paste the new machine code over the corresponding part of the original program, taking care that the new code is exactly the same number of bytes as the old code (otherwise all the offsets would be wrong). If the new code is shorter, you can pad it out with NOP instructions. If it is longer, you may be in trouble, and might have to create new functions and call them instead.
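The paste-and-pad step can also be done programmatically instead of in a hex editor. The following C sketch is one way to do it, under a few assumptions: the executable has already been read into memory, you already know the file offset of the code (which is generally not the same as the virtual address objdump prints), and the single-byte x86 NOP (0x90) is an acceptable filler. The name patch_in_place is hypothetical.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define X86_NOP 0x90  /* single-byte x86 NOP */

/* Overwrite old_len bytes at image[offset] with new_code, padding the
   remaining bytes with NOPs so that nothing after the patch moves.
   Returns 0 on success, -1 if the patch would not fit. */
int patch_in_place(uint8_t *image, size_t image_len, size_t offset,
                   size_t old_len, const uint8_t *new_code, size_t new_len)
{
    if (offset > image_len || old_len > image_len - offset)
        return -1;  /* patch region lies outside the image */
    if (new_len > old_len)
        return -1;  /* longer code would shift every later offset */
    memcpy(image + offset, new_code, new_len);
    memset(image + offset + new_len, X86_NOP, old_len - new_len);
    return 0;
}
```

The function refuses a longer replacement instead of growing the file, which matches the constraint described above. Trailing NOP padding only helps if execution falls through the end of the new code. If the new code ends in a jump or return, the padding is never reached and is harmless filler.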
