Interesting binary dump of executable file


Question

For some reason I wrote a simple program in C that outputs the binary representation of its input:

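#include <unistd.h>   /* read, write */

int main(void)
{
  char c;
  /* Read standard input one byte at a time. */
  while(read(0,&c,1) > 0)
    {
      /* Walk a single-bit mask from the most significant bit down. */
      unsigned char cmp = 128;
      while(cmp)
        {
          if(c & cmp)
            write(1,"1",1);
          else
            write(1,"0",1);
          cmp >>= 1;
        }
    }

  return 0;
}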

After compiling:

$ gcc bindump.c -o bindump

I ran a simple test to check whether the program can print a binary file:

$ cat bindump | ./bindump | fold -b100 | nl

The output is here: http://pastebin.com/u7SasKDJ

I expected the output to look like a random series of ones and zeros. However, parts of the output look more interesting. For example, take a look at the output between lines 171 and 357. Why are there so many zeros there compared to other sections of the executable?

My architecture is:

$ lscpu

Architecture:          i686
CPU op-mode(s):        32-bit, 64-bit
Byte Order:            Little Endian
CPU(s):                4
On-line CPU(s) list:   0-3
Thread(s) per core:    2
Core(s) per socket:    2
Socket(s):             1
Vendor ID:             GenuineIntel
CPU family:            6
Model:                 28
Stepping:              10
CPU MHz:               1000.000
BogoMIPS:              3325.21
Virtualization:        VT-x
L1d cache:             24K
L1i cache:             32K
L2 cache:              512K

Recommended answer

When you compile a program into an executable on Linux (and a number of other Unix systems), it is written in the ELF format. An ELF file consists of a number of sections, which you can examine with readelf or objdump:

readelf -a bindump | less

For example, the .text section contains CPU instructions, .data holds initialized global variables, and .bss holds uninitialized global variables (it takes up no space in the ELF file itself, but is created in main memory when the program is executed). There are also .plt and .got, which are jump tables, sections with debugging information, and so on.
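The list of sections that readelf prints comes from the section header table, whose location is recorded in the ELF header at the start of the file. The following is a minimal sketch, not how readelf itself is implemented: it assumes a little-endian 32-bit ELF file (as on the i686 system in the question) and reads three header fields by their fixed offsets in the Elf32_Ehdr layout. The names rd_le16, rd_le32, shdr_table_info and elf32_shdr_table are made up for this illustration.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Read little-endian integers from a byte buffer. */
uint16_t rd_le16(const unsigned char *p)
{
    return (uint16_t)(p[0] | (p[1] << 8));
}

uint32_t rd_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Hypothetical helper type: where the section header table lives. */
struct shdr_table_info {
    uint32_t offset;   /* e_shoff: file offset of the table */
    uint16_t entsize;  /* e_shentsize: size of one section header */
    uint16_t count;    /* e_shnum: number of section headers */
};

/* Returns 0 on success, -1 if buf does not start with a
 * little-endian 32-bit ELF header. */
int elf32_shdr_table(const unsigned char *buf, size_t len,
                     struct shdr_table_info *out)
{
    if (len < 52)                          /* size of Elf32_Ehdr */
        return -1;
    if (memcmp(buf, "\x7f" "ELF", 4) != 0) /* magic number */
        return -1;
    if (buf[4] != 1)                       /* EI_CLASS: ELFCLASS32 */
        return -1;
    if (buf[5] != 1)                       /* EI_DATA: ELFDATA2LSB */
        return -1;

    out->offset  = rd_le32(buf + 32);      /* e_shoff */
    out->entsize = rd_le16(buf + 46);      /* e_shentsize */
    out->count   = rd_le16(buf + 48);      /* e_shnum */
    return 0;
}
```

Passing the first 52 bytes of the executable to elf32_shdr_table should yield the same table offset and section count that readelf -h reports as "Start of section headers" and "Number of section headers".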

By the way, it is much more convenient to examine the binary content of files with hexdump:

hexdump -C bindump | less

There you can see that, starting at offset 0x850 (approximately line 171 in your dump), there are a lot of zeros; you can also see the ASCII representation on the right.
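The correspondence between line numbers in the dump and file offsets is plain arithmetic: fold -b100 puts 100 characters on each line, and every character stands for one bit. A small sketch of the conversion follows; the function names are invented for this example.

```c
#include <stddef.h>

/* File offset (in bytes) of the byte holding the first bit of a
 * 1-based dump line, where each line shows bits_per_line bits. */
size_t dump_line_start_offset(size_t line, size_t bits_per_line)
{
    return (line - 1) * bits_per_line / 8;
}

/* File offset of the byte holding the first bit after the given line. */
size_t dump_line_end_offset(size_t line, size_t bits_per_line)
{
    return line * bits_per_line / 8;
}
```

For the dump in the question, dump_line_start_offset(171, 100) gives 2125 (0x84D) and dump_line_end_offset(357, 100) gives 4462 (0x116E), so lines 171 to 357 cover roughly the file range 0x850 to 0x1160.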

Let us look at which sections correspond to the block you are interested in, between 0x850 and 0x1160 (the Off field, the offset within the file, is what matters here):

> readelf -a bindump
...
Section Headers:
[Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
...
[28] .shstrtab         STRTAB          00000000 00074c 000106 00      0   0  1
[29] .symtab           SYMTAB          00000000 000d2c 000440 10     30  45  4
...

You can examine the content of an individual section with -x:

> readelf -x .symtab bindump | less
0x00000000 00000000 00000000 00000000 00000000 ................
0x00000010 00000000 34810408 00000000 03000100 ....4...........
0x00000020 00000000 48810408 00000000 03000200 ....H...........
0x00000030 00000000 68810408 00000000 03000300 ....h...........
0x00000040 00000000 8c810408 00000000 03000400 ................
0x00000050 00000000 b8810408 00000000 03000500 ................
0x00000060 00000000 d8810408 00000000 03000600 ................

You will see that there are many zeros. The section is composed of 16-byte entries (one line of the -x output each) that define symbols. From readelf -a you can see that it has 68 entries, and the first 27 of them (not counting the very first one) are of type SECTION:

Symbol table '.symtab' contains 68 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND 
     1: 08048134     0 SECTION LOCAL  DEFAULT    1 
     2: 08048148     0 SECTION LOCAL  DEFAULT    2 
     3: 08048168     0 SECTION LOCAL  DEFAULT    3 
     4: 0804818c     0 SECTION LOCAL  DEFAULT    4 
     ...

According to the specification (page 1-18), each entry has the following format:

typedef struct {
    Elf32_Word st_name;
    Elf32_Addr st_value;
    Elf32_Word st_size;
    unsigned char st_info;
    unsigned char st_other;
    Elf32_Half st_shndx;
} Elf32_Sym;

Without going into too much detail, what matters here is that st_name and st_size are both zero for these SECTION entries. Both are 32-bit numbers, which means lots of zeros in this particular section.
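To make this concrete, here is a small sketch that decodes one 16-byte entry as it appears in the readelf -x output and counts its zero bytes. It assumes little-endian byte order, as on the machine in the question; struct sym32, decode_sym32 and count_zero_bytes are names invented for this illustration, with the struct mirroring the Elf32_Sym fields listed earlier.

```c
#include <stddef.h>
#include <stdint.h>

/* Same fields as Elf32_Sym, with fixed-width types. */
struct sym32 {
    uint32_t st_name;
    uint32_t st_value;
    uint32_t st_size;
    unsigned char st_info;
    unsigned char st_other;
    uint16_t st_shndx;
};

static uint32_t le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Decode one 16-byte little-endian symbol table entry. */
struct sym32 decode_sym32(const unsigned char *p)
{
    struct sym32 s;
    s.st_name  = le32(p);
    s.st_value = le32(p + 4);
    s.st_size  = le32(p + 8);
    s.st_info  = p[12];
    s.st_other = p[13];
    s.st_shndx = (uint16_t)(p[14] | (p[15] << 8));
    return s;
}

/* Count how many of the n bytes starting at p are zero. */
size_t count_zero_bytes(const unsigned char *p, size_t n)
{
    size_t zeros = 0;
    for (size_t i = 0; i < n; i++)
        if (p[i] == 0)
            zeros++;
    return zeros;
}
```

Feeding it the bytes of the second line of the -x output (00000000 34810408 00000000 03000100) yields st_name 0, st_value 0x08048134, st_size 0, st_info 3 (the low four bits, 3, are the SECTION type) and st_shndx 1, which matches entry 1 of the symbol table listing. Ten of those 16 bytes are zero.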
