How to read binary executable by instructions?


Problem description

Is there a way to programmatically read a given number of instructions from a binary executable file on the x86 architecture?

Suppose I have a binary of a simple C program, hello.c:

#include <stdio.h>

int main(){
    printf("Hello world\n");
    return 0;
}

After compilation with gcc, the disassembled function main looks like this:

000000000000063a <main>:
 63a:   55                      push   %rbp
 63b:   48 89 e5                mov    %rsp,%rbp
 63e:   48 8d 3d 9f 00 00 00    lea    0x9f(%rip),%rdi        # 6e4 <_IO_stdin_used+0x4>
 645:   e8 c6 fe ff ff          callq  510 <puts@plt>
 64a:   b8 00 00 00 00          mov    $0x0,%eax
 64f:   5d                      pop    %rbp
 650:   c3                      retq   
 651:   66 2e 0f 1f 84 00 00    nopw   %cs:0x0(%rax,%rax,1)
 658:   00 00 00 
 65b:   0f 1f 44 00 00          nopl   0x0(%rax,%rax,1)

Is there an easy way in C to read, for example, the first three instructions of main (meaning the bytes 55, 48, 89, e5, 48, 8d, 3d, 9f, 00, 00, 00)? The function is not guaranteed to look like this: the first instructions may have entirely different opcodes and sizes.

Recommended answer

The modified hello.c shown further down prints the first 10 bytes of the main function: it takes the address of the function, converts it to a pointer to unsigned char, and prints each byte in hex.

This small snippet doesn't count instructions. For that you would need an instruction size table (not very difficult, just tedious unless you find a table that is already done; see "What is the size of each asm instruction?") to determine the size of each instruction from its leading bytes. On x86 the length depends on any prefixes, the opcode, and the ModRM/SIB bytes that follow, not on the first byte alone.

(Unless, of course, the processor you're targeting has a fixed instruction size, which makes the problem trivial to solve.)

Debuggers have to decode operands as well, but in some cases, such as step or trace, I suspect they keep a table handy to compute the next breakpoint address.
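To make the size-table idea concrete, here is a minimal sketch of a length decoder. It is not part of the original answer and is not a general x86 decoder: it assumes 64-bit mode and recognises only the opcode forms that appear in the question's listing (push/pop, ret, call rel32, mov imm, and the ModRM forms of mov and lea). The names insn_len and first_insns_len are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes occupied by a ModRM byte plus any SIB byte and displacement
   (64-bit addressing). Returns 0 if the needed bytes are not available. */
static size_t modrm_len(const unsigned char *p, size_t avail)
{
    if (avail < 1) return 0;
    unsigned mod = p[0] >> 6, rm = p[0] & 7;
    size_t len = 1;
    if (mod == 3) return len;              /* register operand, nothing more */
    if (rm == 4) {                         /* a SIB byte follows */
        if (avail < 2) return 0;
        len += 1;
        if (mod == 0 && (p[1] & 7) == 5) len += 4;  /* disp32, no base */
    } else if (mod == 0 && rm == 5) {
        len += 4;                          /* RIP-relative disp32 */
    }
    if (mod == 1) len += 1;                /* disp8 */
    if (mod == 2) len += 4;                /* disp32 */
    return len;
}

/* Length of the instruction at code, or 0 if it is truncated or not one
   of the few opcodes this sketch knows about. */
size_t insn_len(const unsigned char *code, size_t avail)
{
    assert(code != NULL);
    size_t i = 0, len;
    int rex_w = 0;
    if (i < avail && (code[i] & 0xf0) == 0x40) {   /* optional REX prefix */
        rex_w = (code[i] >> 3) & 1;
        i++;
    }
    if (i >= avail) return 0;
    unsigned char op = code[i++];
    if ((op >= 0x50 && op <= 0x5f) || op == 0xc3) {
        len = i;                           /* push/pop reg, ret */
    } else if (op == 0xe8) {
        len = i + 4;                       /* call rel32 */
    } else if (op >= 0xb8 && op <= 0xbf) {
        len = i + (rex_w ? 8 : 4);         /* mov reg, imm32 or imm64 */
    } else if (op == 0x89 || op == 0x8b || op == 0x8d) {
        size_t m = modrm_len(code + i, avail - i);  /* mov / lea with ModRM */
        if (m == 0) return 0;
        len = i + m;
    } else {
        return 0;                          /* opcode not in this tiny table */
    }
    return len <= avail ? len : 0;
}

/* Total size in bytes of the first count instructions, or 0 on failure. */
size_t first_insns_len(const unsigned char *code, size_t avail, unsigned count)
{
    size_t total = 0;
    while (count--) {
        size_t n = insn_len(code + total, avail - total);
        if (n == 0) return 0;
        total += n;
    }
    return total;
}
```

Given the 23 bytes of main from the question's listing, first_insns_len(code, 23, 3) returns 11, which matches the eleven bytes listed in the question. Anything outside this small opcode set makes the function return 0, so real use would call for a complete table or an existing disassembler library.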

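#include <stdio.h>

int main(){
    printf("Hello world\n");
    /* Converting a function pointer to an object pointer is not strictly
       portable ISO C, but it works with gcc on x86. */
    const unsigned char *start = (const unsigned char *)&main;
    int i;
    /* Print the first 10 bytes of main's machine code, one per line. */
    for (i=0;i<10;i++)
    {
       printf("%02x\n",start[i]);
    }
    return 0;
}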

Output:

Hello world
55
89
e5
83
e4
f0
83
ec
20
e8

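This seems to match the disassembly :) (The output above comes from a 32-bit build, so the bytes differ from the 64-bit listing in the question.)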

00401630 <_main>:
  401630:   55                      push   %ebp
  401631:   89 e5                   mov    %esp,%ebp
  401633:   83 e4 f0                and    $0xfffffff0,%esp
  401636:   83 ec 20                sub    $0x20,%esp
  401639:   e8 a2 01 00 00          call   4017e0 <___main>
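The program above inspects the running process. Since the question asks about the executable file itself, the following sketch reads raw bytes from a file on disk at a given offset. It is an illustration only: read_bytes_at is a made-up helper, and it does not locate main for you. For the listing in the question, the offset 0x63a would work only under the assumption that the containing segment is stored at a file offset equal to its address, which is common for the first loadable segment of a position-independent ELF but is not guaranteed. In general the offset has to be derived from the file's headers.

```c
#include <assert.h>
#include <stdio.h>
#include <stddef.h>

/* Reads up to n bytes starting at byte `offset` of the file at `path`
   into buf. Returns the number of bytes actually read, which may be
   smaller than n near the end of the file, or 0 on failure. */
size_t read_bytes_at(const char *path, long offset, unsigned char *buf, size_t n)
{
    assert(path != NULL && buf != NULL);
    FILE *f = fopen(path, "rb");          /* binary mode: no translation */
    if (f == NULL) return 0;
    size_t got = 0;
    if (fseek(f, offset, SEEK_SET) == 0)
        got = fread(buf, 1, n, f);
    fclose(f);
    return got;
}
```

A call such as read_bytes_at("hello", 0x63a, buf, 11) (with "hello" standing in for the compiled binary's path) would then fetch the candidate bytes, and a length table is still needed to know how many bytes the first three instructions actually occupy.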
