Do zero-initialized variables in the .bss section occupy space in the ELF file?


Question

If I understand correctly, the .bss section in ELF files is used to allocate space for zero-initialized variables. Our tool chain produces ELF files, hence my question: does the .bss section actually have to contain all those zeros? It seems an awful waste of space if, say, allocating a ten-megabyte global array results in ten megabytes of zeros in the ELF file. What am I getting wrong here?

Answer

It has been some time since I worked with ELF, but I think I still remember this. No, the file does not physically contain those zeros. If you look at the program headers of an ELF file, you will see that each header carries two sizes: the size of the segment in the file, and the size the segment has when it is allocated in virtual memory (readelf -l ./a.out):

Program Headers:
  Type           Offset   VirtAddr   PhysAddr   FileSiz MemSiz  Flg Align
  PHDR           0x000034 0x08048034 0x08048034 0x000e0 0x000e0 R E 0x4
  INTERP         0x000114 0x08048114 0x08048114 0x00013 0x00013 R   0x1
      [Requesting program interpreter: /lib/ld-linux.so.2]
  LOAD           0x000000 0x08048000 0x08048000 0x00454 0x00454 R E 0x1000
  LOAD           0x000454 0x08049454 0x08049454 0x00104 0x61bac RW  0x1000
  DYNAMIC        0x000468 0x08049468 0x08049468 0x000d0 0x000d0 RW  0x4
  NOTE           0x000128 0x08048128 0x08048128 0x00020 0x00020 R   0x4
  GNU_STACK      0x000000 0x00000000 0x00000000 0x00000 0x00000 RW  0x4

Headers of type LOAD describe the segments that are copied into virtual memory when the file is loaded for execution. The other headers carry other information, such as the dynamic-linking data that names the needed shared libraries. As you can see, FileSiz and MemSiz differ significantly for the header that contains the .bss section (the second LOAD):

0x00104 (file-size) 0x61bac (mem-size)

For this example code:

int a[100000];
int main() { }

The ELF specification says that the part of a segment by which the memory size exceeds the file size is filled with zeros in virtual memory. Here that part is 0x61bac - 0x00104 = 0x61aa8 bytes, slightly more than the 0x61a80 (400,000) bytes that a occupies. The section-to-segment mapping for the second LOAD header looks like this:

03     .ctors .dtors .jcr .dynamic .got .got.plt .data .bss

So there are some other sections in that segment too: .ctors and .dtors for C++ constructors and destructors, and .jcr, which serves a similar purpose for Java. Then come the .dynamic section and other data used for dynamic linking (I believe this is where the needed shared libraries are recorded, among other things). After that comes the .data section, which holds initialized globals and local static variables. At the end comes the .bss section, which is filled with zeros at load time because the file size does not cover it.

By the way, you can see which output section a particular symbol is placed in by using the -M linker option, which prints a link map. With gcc, use -Wl,-M to pass the option through to the linker. For the example above, the map shows that a is allocated within .bss. This can help you verify that your uninitialized objects really end up in .bss and not somewhere else:

.bss            0x08049560    0x61aa0
 [many input .o files...]
 *(COMMON) 
 *fill*         0x08049568       0x18 00
 COMMON         0x08049580    0x61a80 /tmp/cc2GT6nS.o
                0x08049580                a
                0x080ab000                . = ALIGN ((. != 0x0)?0x4:0x1) 
                0x080ab000                . = ALIGN (0x4) 
                0x080ab000                . = ALIGN (0x4) 
                0x080ab000                _end = .

By default, GCC versions before 10 keep uninitialized globals in a COMMON section, for compatibility with old compilers that allow a global to be defined twice in a program without a multiple-definition error; since GCC 10, -fno-common is the default. Use -fno-common to make GCC place such globals in the .bss section of the object file instead. This makes no difference for the final linked executable because, as you can see, the symbol ends up in the .bss output section anyway. That placement is controlled by the linker script, which you can display with ld -verbose. None of this should scare you, it is just an internal detail. See the gcc man page.
